Introduction to Selenium and Its Key APIs

Introduction to Selenium

Selenium is an open-source browser automation tool, widely used for testing web applications and for scripting repetitive browser tasks. It has bindings for several programming languages, including Python, Java, C#, Ruby, and JavaScript, and it works with the major web browsers: Google Chrome, Mozilla Firefox, Safari, and Microsoft Edge.

Selenium has evolved over the years into a suite of three components: Selenium WebDriver, Selenium IDE, and Selenium Grid. WebDriver is the most widely used of these; it provides object-oriented API bindings that let your code drive a web browser.

Key Features of Selenium:

  • Cross-browser compatibility.
  • Multi-language support.
  • Open-source and widely supported.
  • Rich APIs for interacting with web elements.
  • Supports parallel execution of test cases using Selenium Grid.
  • Lightweight and integrates well with CI/CD pipelines.

In this post, you’ll learn about the key APIs provided by Selenium WebDriver, followed by a sample application built with them.


Detailed Explanation of Selenium APIs with Code Snippets

Below is a list of frequently used Selenium APIs, along with their explanations and practical examples in Python (you can translate them easily to other supported languages).

1. webdriver.Chrome()

Creates a new instance of the Chrome WebDriver.

  from selenium import webdriver

  driver = webdriver.Chrome()  # Launches a new Chrome browser window
  driver.get("https://example.com")  # Navigate to a URL
  driver.quit()  # Close the browser

2. driver.get(url)

Navigates the browser to the specified URL.

  driver.get("https://example.com")

3. driver.close()

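Closes the current browser tab or window. The WebDriver session keeps running as long as other windows remain open, so switch to one of them before issuing further commands. To end the session entirely, use driver.quit().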

  driver.close()

4. driver.quit()

Closes all browser windows/tabs and ends the WebDriver session.

  driver.quit()
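
To make the difference between close() and quit() concrete, here is a small sketch that closes every window except one and leaves the session usable. close_other_windows is a hypothetical helper of ours; it relies only on window_handles, switch_to.window(), and close().

```python
def close_other_windows(driver, keep_handle):
    """Close every window except keep_handle, then return focus to it.

    Hypothetical helper, not part of Selenium. Returns the number of
    windows that were closed.
    """
    closed = 0
    # Copy the handle list, since closing windows changes it.
    for handle in list(driver.window_handles):
        if handle != keep_handle:
            driver.switch_to.window(handle)  # focus the window to close
            driver.close()                   # closes only the focused window
            closed += 1
    # The session is still alive; point it back at the window we kept.
    driver.switch_to.window(keep_handle)
    return closed


# Usage (with a live driver):
#   main = driver.current_window_handle
#   close_other_windows(driver, main)
#   driver.quit()  # ends the session when you are completely done
```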

5. driver.find_element()

Finds the first web element that matches the given locator. If nothing matches, it raises NoSuchElementException.

  from selenium.webdriver.common.by import By

  element = driver.find_element(By.ID, "username")  # Locates element by ID
  element.send_keys("my_username")  # Inputs text value

6. driver.find_elements()

Finds all matching web elements and returns them as a list. If nothing matches, the list is empty and no exception is raised.

  elements = driver.find_elements(By.CLASS_NAME, "menu-item")
  for element in elements:
      print(element.text)

7. element.send_keys()

Simulates typing into an input field.

  element = driver.find_element(By.NAME, "password")
  element.send_keys("my_password")

8. element.click()

Simulates a click action on a web element like a button or link.

  login_button = driver.find_element(By.CSS_SELECTOR, ".submit-btn")
  login_button.click()

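9. driver.title

Returns the title of the current webpage. In the Python bindings this is a property, not a method, so it takes no parentheses (the Java binding's equivalent is getTitle()).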

  title = driver.title
  print("Page Title:", title)

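10. driver.current_url

Gets the URL of the currently loaded page. Like title, this is a property in the Python bindings (the Java binding's equivalent is getCurrentUrl()).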

  current_url = driver.current_url
  print("Current URL:", current_url)
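
The title and current URL are handy for checking that a navigation landed where you expected. Here is a minimal sketch; is_on_page is a hypothetical helper, and the path and title values in the usage note are made up for illustration.

```python
from urllib.parse import urlparse


def is_on_page(driver, expected_path, title_fragment):
    """Check the current URL path and page title against expectations."""
    # Compare only the path, so query strings and fragments are ignored.
    path = urlparse(driver.current_url).path
    return path == expected_path and title_fragment in driver.title


# Usage (with a live driver):
#   assert is_on_page(driver, "/dashboard", "Dashboard")
```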

11. By.XPATH

Identifies web elements using XPath expressions.

  search_bar = driver.find_element(By.XPATH, "//input[@name='q']")
  search_bar.send_keys("Selenium WebDriver")
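
When the attribute value comes from a variable, the XPath string has to be quoted carefully, because XPath 1.0 string literals have no escape character. The sketch below builds simple attribute-based expressions. Both helper names are hypothetical, and values containing both quote types are deliberately left unsupported.

```python
def xpath_literal(text):
    """Quote text as an XPath string literal."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Supporting both quote types would need an XPath concat() expression.
    raise ValueError("text contains both single and double quotes")


def xpath_for_attribute(tag, attribute, value):
    """Build an XPath such as //input[@name='q'] from its parts."""
    return f"//{tag}[@{attribute}={xpath_literal(value)}]"


# Usage (with a live driver and the By import shown earlier):
#   driver.find_element(By.XPATH, xpath_for_attribute("input", "name", "q"))
```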

A Generic Application Using Selenium APIs

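The following example combines the APIs discussed above into an automated login flow on a sample webpage. It also uses three APIs not covered above: WebDriverWait with expected_conditions for an explicit wait, execute_script to run JavaScript in the page, and save_screenshot to capture the result.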

Automated Login Script

  from selenium import webdriver
  from selenium.webdriver.common.by import By
  from selenium.webdriver.support.ui import WebDriverWait
  from selenium.webdriver.support import expected_conditions as EC

  # Initialize the WebDriver (Chrome in this case)
  driver = webdriver.Chrome()

  try:
      # Step 1: Navigate to the login page
      driver.get("https://example.com/login")
      
      # Step 2: Find and fill in the username field
      username_field = driver.find_element(By.NAME, "username")
      username_field.send_keys("test_user")

      # Step 3: Find and fill in the password field
      password_field = driver.find_element(By.NAME, "password")
      password_field.send_keys("secure_password")

      # Step 4: Click on the login button
      login_button = driver.find_element(By.ID, "loginButton")
      login_button.click()

      # Step 5: Wait up to 10 seconds for the success message to become visible
      wait = WebDriverWait(driver, 10)
      success_message = wait.until(
          EC.visibility_of_element_located((By.CLASS_NAME, "success-message"))
      )
      print("Login Success:", success_message.text)

      # Step 6: Execute JavaScript to highlight an element
      driver.execute_script("arguments[0].style.border='3px solid red'", success_message)

      # Step 7: Take a screenshot after login
      driver.save_screenshot("logged_in.png")
      
  finally:
      # Close the browser
      driver.quit()
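
The try/finally pattern above guarantees that the browser is closed even if a step fails. If you write several scripts like this, one option is to wrap that pattern in a context manager. This is only a sketch; managed_driver is a hypothetical helper, not something Selenium provides.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always quit it afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on success and on error, just like the finally block above.
        driver.quit()


# Usage (needs a local Chrome installation):
#   from selenium import webdriver
#   with managed_driver(webdriver.Chrome) as driver:
#       driver.get("https://example.com/login")
```

Passing a factory rather than a ready-made driver keeps the browser launch inside the helper, so creation and cleanup stay in one place.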

This comprehensive guide should help you understand Selenium better and deploy its capabilities effectively in your projects. Let us know if you have other ideas you’d like us to explore!
