barancev October 1, 2012 at 13:40

What is Selenium WebDriver?

This article continues the more general article “What is Selenium?”, which explains where Selenium WebDriver stands among other web application automation tools.

Here I will explain in more detail what Selenium WebDriver is and why it makes no sense to compare it with TestComplete, QuickTest Pro and other test automation tools. And the point is not only that Selenium WebDriver is free and open source - it is just as pointless to compare it with other free tools like Sahi or Robot Framework.

Why?

Because Selenium WebDriver is not a test automation tool.

But what is it?

Several different answers can be given to this question. First I will give the short ones, and then the more detailed ones.

In addition, I will explain why Selenium WebDriver has such a meager and inconvenient interface (set of commands), why it does not generate beautiful reports and why, despite all this, it is so popular :)

Just in case, a caveat: although this article is about WebDriver, many of the arguments also hold for Selenium RC. I won’t say anything specifically about that outdated version, though, because its place is in the dustbin of history.

So what is Selenium WebDriver?

By purpose, Selenium WebDriver is a browser driver, that is, a software library that lets you write programs that control the behavior of the browser.

In essence, Selenium WebDriver is:

a specification of a programming interface for controlling the browser,
reference implementations of this interface for several browsers,
a set of client libraries for this interface in several programming languages.

Is it clear now why it makes no sense to compare Selenium WebDriver with “other testing tools”? Not yet? Then let me add some details.

Selenium WebDriver is a browser driver

Surely everyone who has dealt with computers, even without being an IT specialist, knows the word “driver”. It is a small program, or rather a software library, that allows other programs to interact with some device. The printer driver lets you print on the printer. The disk driver lets you read and write data. The network card driver lets you exchange data with other computers over the network.

Users do not work with the driver directly. They work with application programs that interact with various devices through drivers. The driver has no user interface. Wait, but doesn't a driver sometimes come with a user interface for configuring it? It does. But that is the interface of a configuration program, not of the driver itself. The driver has only a programming interface; its purpose is to let application programs interact with the device.

So Selenium WebDriver, or just WebDriver, is a browser driver: a software library with no user interface that lets other programs interact with the browser, control its behavior, receive data from it and make it execute commands.

Based on this definition, it is clear that WebDriver is not directly related to testing. It just gives automated tests access to the browser. That is where its job ends.

Structuring, grouping and running tests, as well as generating test reports, is the job of a testing framework such as JUnit or TestNG for Java, NUnit or Gallio for .NET, RSpec or Cucumber for Ruby, and so on. Tests are developed in an IDE such as Eclipse, IntelliJ IDEA, Visual Studio or RubyMine. Builds are done with Maven, Gradle, Ant, NAnt, Rake, etc. Scheduled test runs and publication of reports are handled by a continuous integration server - Jenkins, CruiseControl, Bamboo, TeamCity and so on. All of these are independent tools that are not part of the Selenium project.
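To make this division of labor concrete, here is a minimal sketch in Python. The test framework is the standard unittest module; the driver is a hypothetical stand-in called FakeDriver, invented here so the example runs without a browser. Its get, title and quit names mirror the Python client of Selenium, but the URL and page title are made up.

```python
import unittest


class FakeDriver:
    """Hypothetical stand-in for a browser driver; it only 'opens' canned pages."""

    PAGES = {"http://example.test/": "Example Home"}  # made-up URL and title

    def __init__(self):
        self.title = ""

    def get(self, url):
        # A real driver would navigate a browser; here we just look up a title.
        self.title = self.PAGES.get(url, "Not Found")

    def quit(self):
        # Nothing to close in the stand-in.
        pass


class HomePageTest(unittest.TestCase):
    """The framework groups the tests, runs them and reports the results."""

    def setUp(self):
        self.driver = FakeDriver()

    def tearDown(self):
        self.driver.quit()

    def test_title(self):
        # The driver's only role: give the test access to the "browser".
        self.driver.get("http://example.test/")
        self.assertEqual(self.driver.title, "Example Home")


def run_suite():
    """Run the test case and return the framework's result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HomePageTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Everything about grouping, running and reporting lives in unittest; the driver object could be swapped for a real one without touching that part.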

However, the Selenium project develops not only the driver but also several related products: Selenium Server lets you launch browsers remotely, and with Selenium Grid you can build a cluster of Selenium servers. They stand alongside the tools and frameworks listed above, because they too are part of the test run infrastructure. In addition, there is a “recorder” called Selenium IDE, which can record user actions and generate code that uses the WebDriver interface to replay them.

But the main thing in the Selenium project is WebDriver; it is the key element of the Selenium ecosystem.

Are there other drivers? Of course.

Every commercial “integrated” tool contains browser drivers, but as a rule they cannot be used on their own outside that tool. There are also free open-source drivers - Watir provides access to the main browsers, WatiN has a good driver for Internet Explorer, and Sahi can work with the “big five” browsers.

How to compare Selenium WebDriver with other tools?

From all of the above, we can conclude that comparing WebDriver with a testing tool like TestComplete or Sahi is pointless. They are in different weight classes. It is like comparing a printer driver with a text editor.

So what can it be compared with?

You can compare WebDriver with the drivers that are built into various tools. For example, you can compare:

which browsers and browser versions are supported, including mobile ones,
which operating systems are supported, including mobile ones,
whether several browsers can be controlled on the same machine at the same time without conflicts,
whether a browser on a remote machine can be controlled,
which actions can be performed in the browser,
which data can be obtained from the browser,
how accurately the driver emulates user actions, that is, whether it generates all the same browser events that occur when a real user is working,
whether it can work with dialog boxes (alert, prompt),
whether it can work with native windows (the file upload dialog),
whether it can work with HTTPS and certificates,
etc.

And here WebDriver is the undisputed leader. However, comparing WebDriver with anything is beyond the scope of this article.

As for a comparison with “integrated” tools like TestComplete or Sahi, for that you need to take not WebDriver alone but a full stack.

For example, a stack for Java might be: Jenkins + Maven + Thucydides + JUnit + WebDriver. Add to this all the features of the Java programming language plus a lot of plug-ins for Maven and Jenkins, and to top it off you can run tests in the cloud using a service like Sauce Labs.

Then the comparison becomes interesting. But then the credit goes not only to WebDriver; the whole stack matters, not just the browser driver. As for WebDriver, it is worth noting only that it fits well into almost any stack, which is one of its advantages as an “independent” driver.

Of course, WebDriver can be used not only for testing. It does not care who wants to control the browser or why. You can automate routine tasks. You can make bots that flood forums. You can make a script that automatically takes screenshots for documentation. Anything. The driver doesn't care. It only provides access to the browser.

In addition, whatever tool you use, chances are you can plug WebDriver into it, since it has implementations in a variety of languages - Java, C#, Ruby, Python. Then, on top of all the features of your favorite tool, you get all the advantages of WebDriver. It is worth the effort, because at the moment it is the best of the drivers.

Well, yes, I have now said several times that “it is the best” without giving any comparison with other drivers. And I won't. Because there is an argument that in the long run matters more than any comparison.

Selenium WebDriver is a browser management interface specification

The most important difference between WebDriver and all other drivers is that it is a “standard” driver, and all the rest are “non-standard”.

And this is not a simple figure of speech.

The W3C really has taken WebDriver as the basis for a browser control interface standard. It is currently under public review.

In a year or a year and a half, this standard is expected to be approved. Then implementing the WebDriver interface will be up to the browser vendors, and WebDriver as a standalone driver may eventually disappear altogether, because it will be built directly into browsers.

Thus, we can say that Selenium WebDriver is not a tool at all but a specification, a document, a standard that describes the interface browsers must expose so that they can be controlled through it.
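To give a feel for what “an interface described by a specification” can look like, here is a rough sketch in Python. It assumes the command shapes of the W3C WebDriver specification, where each command is an HTTP request with a JSON body; treat the exact paths and fields as an assumption to check against the specification text. The session id, element id, URL and helper function names are made up, and nothing is actually sent over the network.

```python
import json


def new_session(browser_name):
    """Command that asks a driver to start a browser session."""
    body = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return ("POST", "/session", body)


def navigate(session_id, url):
    """Command that opens a URL in an existing session."""
    return ("POST", f"/session/{session_id}/url", {"url": url})


def find_element(session_id, css):
    """Command that looks up one element by CSS selector."""
    body = {"using": "css selector", "value": css}
    return ("POST", f"/session/{session_id}/element", body)


def click(session_id, element_id):
    """Command that clicks a previously found element."""
    return ("POST", f"/session/{session_id}/element/{element_id}/click", {})


def render(command):
    """Format a command as one readable line instead of sending it."""
    method, path, body = command
    return f"{method} {path} {json.dumps(body, sort_keys=True)}"
```

For example, render(navigate("abc", "http://example.test/")) gives the line POST /session/abc/url {"url": "http://example.test/"}. Any client library, in any language, only has to produce requests of this kind.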

While the standard is being discussed, browser vendors are already acting. Several reference implementations for various browsers were developed within the Selenium project, but this work is gradually being handed over to the browser vendors. The driver for Chrome is developed as part of the Chromium project, by the same team that develops the browser itself. The driver for Opera is developed by Opera Software. The driver for Firefox is still developed by Selenium project members, but a replacement code-named Marionette is already being prepared inside Mozilla. This new Firefox driver is already available in development builds of the browser. Next in line are Internet Explorer and Safari: employees of the respective companies have not yet joined the development of their drivers, but there is some movement in that direction, because a standard, even a future one, creates obligations.

In general, we can say that Selenium is the only browser automation project in which the companies that develop browsers are directly involved. This is one of the key reasons for its success.

And what will happen after all browsers implement this standard?

It would be logical to expect that the manufacturers of testing tools will not reinvent the wheel, but will control the browser through a standard interface. We can say that all the tools will use WebDriver to interact with the browser. But this will not be Selenium WebDriver as an independent driver, but Selenium WebDriver as an interface specification.

So why does it have such a primitive interface?

Precisely because WebDriver is:

a browser driver, that is, a library at a rather low level of abstraction,
a standard for the browser control interface, that is, the minimum set of commands that must be implemented in every browser.

When developing Selenium WebDriver, the goal was originally set to not include anything superfluous in it. The standard browser management interface should be simple and stable.

The set of commands was gradually reduced: “usability enhancing” commands such as check and uncheck (for checkboxes) and select (for drop-down lists) were thrown out. They all boil down to the simpler click command and are therefore superfluous. Only one redundant command remains in the WebDriver interface now, submit, and it may be eliminated someday.

In addition, the interface was designed so that it could be described in IDL (which is exactly what was done in the W3C standard) and implemented in various programming languages. Hence a minimum of language idioms, a minimum of “hidden” variables, and a “dumb and straightforward” interface.

But thanks to this primitive interface, client libraries for WebDriver now exist in Java, C#, Ruby, Python, JavaScript, PHP, Perl, and even Haskell!
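Here is a small sketch in Python of why check and uncheck are redundant once click exists. FakeCheckbox is a hypothetical in-memory element invented for this example; it exposes only click and is_selected, the same two method names a WebElement has in the Python client of Selenium.

```python
class FakeCheckbox:
    """Hypothetical element exposing only two primitives: click and is_selected."""

    def __init__(self, selected=False):
        self._selected = selected

    def click(self):
        # Clicking a checkbox toggles its state, as it does in a browser.
        self._selected = not self._selected

    def is_selected(self):
        return self._selected


def check(element):
    """Convenience command: click only if the box is not ticked yet."""
    if not element.is_selected():
        element.click()


def uncheck(element):
    """Convenience command: click only if the box is currently ticked."""
    if element.is_selected():
        element.click()
```

Both helpers are a state query plus a conditional click, so the driver itself loses nothing by not providing them.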

And thanks to the same simplicity, WebDriver integrates perfectly with any other tools, integrates into any stack. This is the secret of its popularity and rapid spread - it does not try to “defeat” other tools, instead it integrates with them.

But what about usability?

Extensions based on Selenium WebDriver should solve this problem. They should provide an expanded set of commands, implementing these commands through the primitive WebDriver interface. The Selenium distribution has a Select class designed to work with drop-down lists, which is a clear demonstration of how extensions should be built.
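As an illustration of that idea, here is a simplified sketch in Python of a helper for drop-down lists. It is not the code of the Select class from the Selenium distribution; SimpleSelect, FakeSelect and FakeOption are hypothetical names invented here, and the fake elements exist only so the example runs without a browser. The point is that the helper uses nothing but primitives: find_elements, text, is_selected and click.

```python
class FakeOption:
    """Hypothetical option element inside a single-choice drop-down list."""

    def __init__(self, text, owner):
        self.text = text
        self._owner = owner

    def click(self):
        # Clicking an option makes it the selected one.
        self._owner.selected = self

    def is_selected(self):
        return self._owner.selected is self


class FakeSelect:
    """Hypothetical drop-down element; the first option starts out selected."""

    def __init__(self, texts):
        self.options = [FakeOption(text, self) for text in texts]
        self.selected = self.options[0]

    def find_elements(self, by, value):
        # Only the one lookup this sketch needs is supported.
        if (by, value) == ("tag name", "option"):
            return list(self.options)
        return []


class SimpleSelect:
    """Higher-level helper built only on the primitive element commands."""

    def __init__(self, element):
        self._element = element

    def _options(self):
        return self._element.find_elements("tag name", "option")

    def select_by_visible_text(self, text):
        for option in self._options():
            if option.text == text:
                if not option.is_selected():
                    option.click()
                return
        raise ValueError(f"no option with text {text!r}")

    def first_selected_text(self):
        for option in self._options():
            if option.is_selected():
                return option.text
        return None
```

A call such as SimpleSelect(dropdown).select_by_visible_text("Blue") reads like a richer command set, yet underneath it is still a search followed by a click.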

Libraries that are built on top of Selenium WebDriver and provide a higher level of abstraction are gradually appearing: Selenide, fluent-selenium, watir-webdriver, Thucydides. Popular test design frameworks let you use WebDriver alongside other drivers. Among such frameworks are Robot Framework, Capybara and the same Thucydides.

Sooner or later, auxiliary libraries should appear that make it easier to work with one or another set of widgets - jQuery, Prototype, ExtJS, GWT and others.

The number of such extensions and tools will grow, and so will their complexity. So it may soon happen that you run tests with some tool without even suspecting that it talks to the browser through Selenium WebDriver.

Is it worth it then to study Selenium?

Maybe it is better to study these libraries and tools of a higher level?

To answer this question, I will put it differently: who should study Selenium and why, and who would do better with higher-level libraries and tools?

Whatever tool you use, you need to choose the driver that controls the browser. To choose it, you must know the capabilities of the driver - what it can and cannot do. At this level, every automation specialist must know Selenium. The WebDriver interface itself, however, need not be studied if you do not work with it directly.
A simple set of commands is easier to learn than an “advanced” one, that is, Selenium is easier to learn than its extensions. There is a flip side to this: once you have learned an extended set of commands, it turns out that you have also mastered the WebDriver command set.
Extensions are, as a rule, language-dependent, because adding convenience means using language idioms and the typical ways of organizing code in a particular programming language. The basic WebDriver interface is simple, so once you have mastered it you can use it in any language, and it will look almost the same.
Most libraries that aim to make the interface more convenient improve the means of finding elements - additional types of locators, a more convenient way to describe locators, and so on. The WebDriver primitives that correspond to user actions are already good enough. Although, of course, libraries will implement typical “bundles”, that is, sequences of these actions, much like the Select class does for drop-down lists.
If you describe tests in tables (as in Robot Framework) or in a domain-specific language (DSL), you do not need to know the WebDriver primitives. But if you implement the “fixtures” for the tests, describe the actions that can be used in the tables, or implement the DSL, you will have to work directly with WebDriver, or with some extension that is not too high-level.
And the very last argument, which I hope will become less relevant over time: alas, good extensions are still sorely lacking. They will certainly appear. Maybe you are the one who will implement one of them. To do that, you need to learn the WebDriver interface. And those who enjoy the fruits of your labor will be able to work with a higher-level library. In the meantime, you have to use WebDriver directly with small add-ons on top of it.
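The split between table-style tests and the fixtures behind them can be sketched like this in Python. Everything here is hypothetical and invented for illustration: the keyword names, the locators, the URL, and RecordingDriver, a stand-in that logs primitive commands instead of driving a browser (its methods are simplified compared with the real WebDriver interface).

```python
class RecordingDriver:
    """Hypothetical driver stand-in that logs primitive commands."""

    def __init__(self):
        self.log = []

    def get(self, url):
        self.log.append(("get", url))

    def click(self, locator):
        self.log.append(("click", locator))

    def type_text(self, locator, text):
        self.log.append(("type", locator, text))


# Fixtures: written by someone who works with the driver directly.
def open_login_page(driver):
    driver.get("http://example.test/login")


def log_in_as(driver, user, password):
    driver.type_text("#user", user)
    driver.type_text("#password", password)
    driver.click("#submit")


# Keyword names as they appear in the test tables.
KEYWORDS = {
    "Open login page": open_login_page,
    "Log in as": log_in_as,
}


def run_table(driver, table):
    """Execute each row: the first cell is the keyword, the rest are arguments."""
    for keyword, *args in table:
        KEYWORDS[keyword](driver, *args)
```

The author of a table such as [("Open login page",), ("Log in as", "alice", "secret")] never sees a driver command; the author of the fixtures works with nothing else.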

I hope that all of the above helps you better understand where Selenium WebDriver fits in the overall picture and how it relates to other tools. If anything is still unclear, ask questions in the comments and I will try to clarify.
