Introduction to Selenium
Selenium is an open-source browser automation tool, widely used for testing web applications and for scripting repetitive browser tasks. It offers bindings for several programming languages, including Python, Java, C#, Ruby, and JavaScript, and works across browsers such as Google Chrome, Mozilla Firefox, Safari, and Microsoft Edge.
The Selenium project has three main components: Selenium WebDriver, Selenium IDE, and Selenium Grid. WebDriver is the most widely used of these, providing object-oriented API bindings that let programs drive a web browser.
Key Features of Selenium:
- Cross-browser compatibility.
- Multi-language support.
- Open-source and widely supported.
- Rich APIs for interacting with web elements.
- Supports parallel execution of test cases using Selenium Grid.
- Lightweight and integrates well with CI/CD pipelines.
In this post, you'll learn about the key APIs provided by Selenium WebDriver, followed by a sample application built with them.
Detailed Explanation of Selenium APIs with Code Snippets
Below is a list of frequently used Selenium APIs, along with their explanations and practical examples in Python (you can translate them easily to other supported languages).
1. webdriver.Chrome()
Creates a new instance of the Chrome WebDriver.
from selenium import webdriver

driver = webdriver.Chrome()  # Launches a new Chrome browser window
driver.get("https://example.com")  # Navigate to a URL
driver.quit()  # Close the browser
2. driver.get(url)
Navigates the browser to the specified URL.
driver.get("https://example.com")
3. driver.close()
Closes the current browser tab/window. The WebDriver session keeps running as long as other windows remain open; closing the last window typically ends the session as well.
driver.close()
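After close(), the driver no longer has a focused window, so a script that wants to keep working usually has to switch to one of the remaining tabs. A minimal sketch of that pattern, assuming a hypothetical helper named close_current_window (not part of Selenium) that uses the driver's window_handles, current_window_handle, and switch_to.window() members:

```python
def close_current_window(driver):
    """Close the focused window and switch to another one if any remain.

    Hypothetical helper: `driver` is any Selenium WebDriver instance.
    Returns the handle now in focus, or None if no window is left.
    """
    current = driver.current_window_handle
    # Collect the handles that will survive the close() call.
    remaining = [h for h in driver.window_handles if h != current]
    driver.close()
    if not remaining:
        return None
    # After close(), the driver has no valid focus until we switch explicitly.
    driver.switch_to.window(remaining[0])
    return remaining[0]
```

Which remaining window to pick is a design choice; this sketch simply takes the first handle in the list.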
4. driver.quit()
Closes all browser windows/tabs and ends the WebDriver session.
driver.quit()
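A script that fails before reaching quit() can leave a browser process running. One way to guard against that is to wrap the driver's lifetime in a context manager. The sketch below assumes a hypothetical helper named managed_driver; it is an illustration, not a Selenium API:

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Yield a driver built by `factory` and always quit it afterwards.

    Hypothetical helper: `factory` is any zero-argument callable that
    returns a WebDriver, for example `webdriver.Chrome`.
    """
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on normal exit and on exceptions, so no browser is left behind.
        driver.quit()
```

With Selenium installed, usage would look like: with managed_driver(webdriver.Chrome) as driver: driver.get("https://example.com").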
5. driver.find_element()
Finds the first web element matching the given locator and raises NoSuchElementException if nothing matches.
from selenium.webdriver.common.by import By

element = driver.find_element(By.ID, "username")  # Locates element by ID
element.send_keys("my_username")  # Inputs text value
6. driver.find_elements()
Finds all matching web elements and returns them as a list, which is empty if nothing matches.
elements = driver.find_elements(By.CLASS_NAME, "menu-item")
for element in elements:
    print(element.text)
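Because find_elements() returns an empty list rather than raising, it is handy for checking whether an optional element exists. A small sketch, assuming a hypothetical helper named first_or_none that accepts any WebDriver plus a locator pair:

```python
def first_or_none(driver, by, value):
    """Return the first element matching the locator, or None if there is none.

    Hypothetical helper built on find_elements(), which returns an empty
    list instead of raising when nothing matches.
    """
    matches = driver.find_elements(by, value)
    return matches[0] if matches else None
```

For example, first_or_none(driver, By.CLASS_NAME, "cookie-banner") would return None on pages without such a banner (the class name here is made up for illustration).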
7. element.send_keys()
Simulates typing into an input field.
element = driver.find_element(By.NAME, "password")
element.send_keys("my_password")
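Note that send_keys() types into the field without removing text that is already there. When a field may be pre-filled, clearing it first avoids appending to the old value. A minimal sketch, assuming a hypothetical helper named fill_field that uses the element's clear() and send_keys() methods:

```python
def fill_field(element, text):
    """Replace the contents of an input element with `text`.

    Hypothetical helper: send_keys() types into the field without removing
    what is already there, so clear() is called first.
    """
    element.clear()
    element.send_keys(text)
```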
8. element.click()
Simulates a click action on a web element like a button or link.
login_button = driver.find_element(By.CSS_SELECTOR, ".submit-btn")
login_button.click()
9. driver.title
Returns the title of the current webpage. In the Python bindings this is a property, not a method (the Java equivalent is getTitle()).
title = driver.title
print("Page Title:", title)
10. driver.current_url
Gets the URL of the currently loaded page. Like driver.title, this is a property in the Python bindings (the Java equivalent is getCurrentUrl()).
current_url = driver.current_url
print("Current URL:", current_url)
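The title and URL are often used together to confirm that a navigation landed where expected. The sketch below assumes a hypothetical helper named on_expected_page; the path and title fragment are values the caller supplies:

```python
from urllib.parse import urlsplit


def on_expected_page(driver, expected_path, title_fragment):
    """Check the current page by URL path and (case-insensitive) title text.

    Hypothetical helper: reads the driver.current_url and driver.title
    properties; both arguments are values the caller expects to see.
    """
    # Compare only the path so query strings and fragments do not matter.
    path = urlsplit(driver.current_url).path
    return path == expected_path and title_fragment.lower() in driver.title.lower()
```

Comparing the parsed path instead of the whole URL keeps the check stable when the site appends query parameters.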
11. By.XPATH
Identifies web elements using XPath expressions.
search_bar = driver.find_element(By.XPATH, "//input[@name='q']")
search_bar.send_keys("Selenium WebDriver")
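When the attribute value in an XPath comes from a variable, quotes inside it can break the expression. XPath 1.0 has no escape character, so a common workaround is to build the literal with concat(). A sketch, assuming two hypothetical helpers named xpath_literal and input_by_name:

```python
def xpath_literal(text):
    """Quote `text` as an XPath 1.0 string literal.

    XPath 1.0 has no escape character, so a value containing both quote
    types has to be assembled with concat().
    """
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Split on single quotes and re-join the pieces with a quoted "'" between them.
    pieces = [f"'{part}'" for part in text.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


def input_by_name(name):
    """Build the XPath for an <input> whose name attribute equals `name`."""
    return f"//input[@name={xpath_literal(name)}]"
```

The result can be passed straight to the locator call, for example driver.find_element(By.XPATH, input_by_name("q")).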
A Generic Application Using Selenium APIs
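The following example application automates a login flow on a sample webpage. It uses the APIs discussed above, plus three additional ones: WebDriverWait for explicit waits, execute_script() for running JavaScript, and save_screenshot() for capturing the page.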
Automated Login Script
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Initialize the WebDriver (Chrome in this case)
driver = webdriver.Chrome()

try:
    # Step 1: Navigate to the login page
    driver.get("https://example.com/login")

    # Step 2: Find and fill in the username field
    username_field = driver.find_element(By.NAME, "username")
    username_field.send_keys("test_user")

    # Step 3: Find and fill in the password field
    password_field = driver.find_element(By.NAME, "password")
    password_field.send_keys("secure_password")

    # Step 4: Click on the login button
    login_button = driver.find_element(By.ID, "loginButton")
    login_button.click()

    # Step 5: Wait up to 10 seconds for a successful login message
    wait = WebDriverWait(driver, 10)
    success_message = wait.until(
        EC.presence_of_element_located((By.CLASS_NAME, "success-message"))
    )
    print("Login Success:", success_message.text)

    # Step 6: Execute JavaScript to highlight an element
    driver.execute_script("arguments[0].style.border='3px solid red'", success_message)

    # Step 7: Take a screenshot after login
    driver.save_screenshot("logged_in.png")
finally:
    # Close the browser, even if a step above failed
    driver.quit()
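Step 5 relies on an explicit wait: the condition is checked repeatedly until it succeeds or the timeout expires. To make that idea concrete, here is a simplified stand-in written in plain Python. The function name wait_until and its defaults are assumptions for illustration; this is not how Selenium itself is implemented.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Simplified stand-in for the polling idea behind an explicit wait; it is
    not Selenium's implementation. Raises TimeoutError when time runs out.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        # Pause briefly so the loop does not spin at full speed.
        time.sleep(poll)
```

In real scripts, WebDriverWait together with expected_conditions remains the better choice, since it is designed to work with the driver and its exceptions.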
This comprehensive guide should help you understand Selenium better and deploy its capabilities effectively in your projects. Let us know if you have other ideas you’d like us to explore!