What is Selenium?
Selenium is an open-source test automation tool for web-based applications. It cannot be used to test desktop applications, such as native Windows applications. It runs on multiple browsers and multiple operating systems.
History of Selenium
- Currently, Selenium 3.0 is in use, comprising Selenium IDE, Selenium WebDriver, and Selenium Grid.
- Selenium 3.0 is a suite of tools. Selenium was originally created by Jason Huggins in 2004 as an internal tool at ThoughtWorks.
- Later, Paul Hammant joined the team at ThoughtWorks and started the second mode of development, Selenium RC (Remote Control).
- In 2008, Philippe Hanrigou developed Selenium Grid, which provides a hub for running multiple Selenium tests in parallel, reducing the execution time of test scripts.
- The first released version was Selenium 1.0, a suite of tools comprising Selenium IDE, Selenium RC, and Selenium Grid.
On the official Selenium website (http://www.seleniumhq.org/download/), the latest version listed at the time of writing is 3.14. Initially, in 2007, the suite consisted of Selenium RC, IDE, and Grid.
Selenium Grid is a tool for running Selenium scripts in parallel. For example, several machines with different operating systems can be connected to a single machine, and the test cases are then run in parallel across those machines, which saves time.
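To make the idea of parallel execution concrete, here is a minimal sketch in plain Java. It does not use Selenium Grid itself: the class `GridSketch`, the methods `runInParallel` and `runOnNode`, and the test names are all hypothetical, and a thread pool merely stands in for the hub handing tests to nodes. It only illustrates that independent test cases can be dispatched to several workers at once instead of running one after another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class GridSketch {

    /** Dispatches every test to a pool with one worker per (simulated) node. */
    static List<String> runInParallel(List<String> tests, int nodeCount) {
        // Assumption: the thread pool plays the role of the hub and its nodes.
        ExecutorService hub = Executors.newFixedThreadPool(nodeCount);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String test : tests) {
                pending.add(hub.submit(() -> runOnNode(test)));
            }
            // Collect results in submission order once each task finishes.
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tests", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a test task failed", e);
        } finally {
            hub.shutdown();
        }
    }

    /** Hypothetical stand-in for executing one test case on a node. */
    static String runOnNode(String test) {
        return test + " passed";
    }

    public static void main(String[] args) {
        for (String line : runInParallel(List.of("login", "search", "checkout"), 2)) {
            System.out.println(line);
        }
    }
}
```

In a real Grid setup the workers would be separate machines with their own browsers, so the saving comes from running browser sessions side by side rather than from threads on one machine.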
Selenium IDE is a tool that runs only in the Chrome and Firefox browsers. It does not generate reports or logs, and it is not robust enough to execute large test suites. With 5000 test cases, for example, the IDE is not a practical choice.
Selenium RC, which is now deprecated, could be used to write dynamic scripts that work on multiple browsers. To use Selenium RC, we had to learn a programming language such as Python, C#, Ruby, or Java. It can generate reports and logs.
As time progressed, the Selenium team released WebDriver 2.0 in 2011. It was not a migration of RC: WebDriver was an entirely different tool, and each has its own set of commands.
Like Selenium RC, WebDriver can be used to write dynamic scripts that work on multiple browsers, and it can generate reports and logs. The current release is WebDriver 3.0, which can do everything WebDriver 2.0 could, and Grid 2 has evolved into Grid 3.
Selenium WebDriver Architecture
Before looking at how Selenium WebDriver works, we need to know a few concepts. The Selenium architecture has the following components:
- Language Binding or Selenium Client Library: These are the libraries (JAR files, in the case of Java) for the language in which we write our Selenium framework. The language used to write the script may be Java, C#, Ruby, Python, or Perl.
- Selenium API: API stands for Application Programming Interface. An API is a set of rules and specifications that software programs follow to communicate with each other. It serves as an interface between programs, letting applications talk to each other without the user being aware of it.
- Remote WebDriver: It is an implementation class of the WebDriver interface that a test script developer can use to execute test scripts through a WebDriver server on a remote machine.
- WebDriver: WebDriver is a tool for automating web applications and verifying that they work as expected.
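The relationship between these components can be sketched in plain Java. The sketch below does not use the real Selenium library: `MiniDriver`, `RemoteMiniDriver`, `Command`, and the session id `session-1` are hypothetical names invented for illustration. It only shows the general idea that test code programs against an interface, while a remote implementation turns each call into a command for a server. The paths follow the JSON Wire Protocol conventions for navigating (`POST /session/{sessionId}/url`) and ending a session (`DELETE /session/{sessionId}`).

```java
import java.util.ArrayList;
import java.util.List;

class ArchitectureSketch {
    public static void main(String[] args) {
        RemoteMiniDriver remote = new RemoteMiniDriver("session-1");
        MiniDriver driver = remote;          // test code only sees the interface
        driver.get("http://www.seleniumhq.org");
        driver.quit();
        for (Command c : remote.sentCommands()) {
            System.out.println(c.method + " " + c.path + " " + c.jsonBody);
        }
    }
}

/** Hypothetical, much smaller stand-in for the WebDriver interface. */
interface MiniDriver {
    void get(String url);
    void quit();
}

/** One command as it would travel to the server: HTTP method, path, JSON body. */
final class Command {
    final String method;
    final String path;
    final String jsonBody;

    Command(String method, String path, String jsonBody) {
        this.method = method;
        this.path = path;
        this.jsonBody = jsonBody;
    }
}

/** Hypothetical remote implementation: records commands instead of sending them. */
final class RemoteMiniDriver implements MiniDriver {
    private final String sessionId;
    private final List<Command> sent = new ArrayList<>();

    RemoteMiniDriver(String sessionId) {
        this.sessionId = sessionId;
    }

    @Override
    public void get(String url) {
        // Naive JSON building; a real client would escape the URL properly.
        sent.add(new Command("POST", "/session/" + sessionId + "/url",
                "{\"url\":\"" + url + "\"}"));
    }

    @Override
    public void quit() {
        sent.add(new Command("DELETE", "/session/" + sessionId, ""));
    }

    List<Command> sentCommands() {
        return sent;
    }
}
```

Because the test code depends only on the interface, the same script could in principle run against a different implementation without changes, which is the design benefit the component list describes.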
How Does Selenium WebDriver Work Internally?
In practice, we write a script in a language such as Java, C#, Python, Ruby, or Perl. Typically, you write the code in an IDE (Integrated Development Environment) such as Eclipse, using one of the supported Selenium client libraries (Java in our case).
WebDriver driver = new ChromeDriver();
driver.get("http://www.seleniumhq.org");
Once your script is ready, you click Run to execute the program. The statements above launch the Chrome browser and navigate to the SeleniumHQ website.
- Once you click Run, the Selenium client library communicates with the Selenium API.
- The Selenium API sends the command received from the language binding to the browser driver using the JSON Wire Protocol.
- The browser driver that receives the request may be the Firefox driver, the IE driver, or the Chrome driver.
- The browser driver uses an HTTP server to receive the HTTP request, and the HTTP server determines which commands need to be executed.
- The commands in your Selenium script are then executed in the browser.
- Finally, the HTTP server sends the response back to the automation test script.
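The request and response cycle in the steps above can be sketched with nothing but the JDK. This is not how Selenium is implemented. It is a small model in which a local HTTP server stands in for the browser driver and a plain HTTP client stands in for the client library. The names `RoundTripSketch`, `startFakeDriver`, `sendNavigate`, and `roundTrip`, the session id `session-1`, and the simplified reply body are assumptions made for illustration. The request path follows the JSON Wire Protocol convention `POST /session/{sessionId}/url`, and a `status` of 0 in the reply indicates success.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class RoundTripSketch {

    /** Starts a stand-in "browser driver" HTTP server on a free local port. */
    static HttpServer startFakeDriver() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/session", exchange -> {
            // A real driver would parse the command and act on the browser here.
            exchange.getRequestBody().readAllBytes();
            byte[] reply = "{\"status\":0,\"value\":null}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(reply);
            }
        });
        server.start();
        return server;
    }

    /** Sends a "navigate to URL" command and returns the raw JSON reply. */
    static String sendNavigate(int port, String sessionId, String url) throws IOException {
        URL endpoint = new URL("http://127.0.0.1:" + port + "/session/" + sessionId + "/url");
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        // Naive JSON building; a real client would escape the URL properly.
        byte[] body = ("{\"url\":\"" + url + "\"}").getBytes(StandardCharsets.UTF_8);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Runs one full cycle: start server, send command, read reply, stop server. */
    static String roundTrip(String url) {
        try {
            HttpServer fakeDriver = startFakeDriver();
            try {
                return sendNavigate(fakeDriver.getAddress().getPort(), "session-1", url);
            } finally {
                fakeDriver.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("http://www.seleniumhq.org"));
    }
}
```

The sketch needs Java 9 or later because of `readAllBytes`. The point to take away is that every command in a test script becomes an HTTP request with a JSON body, and the script only continues once the reply has come back.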
That’s all for a quick overview of Selenium WebDriver architecture and how it works internally.