Selenium 2.0 - WebDriver. Impressions, problems and tips for use


For the last three months I have been working with Selenium 2.0 (WebDriver).
In this article I describe the impressions, thoughts and experience I gained along the way.
I also describe the actions that most often cause problems and show the most successful solutions I managed to implement for them. There may be more correct approaches - I will be glad if you leave them in the comments.

Briefly about Selenium

The Selenium library lets you test web user interfaces. Its principle is to simulate user actions as accurately as possible. In essence, you write a bot that walks through the pages of a site, performs actions and checks the expected result. Selenium 2.0 communicates with browsers through special drivers. Unlike Selenium 1.0, it does not drive the page through JavaScript but talks directly to the browser APIs.

What I managed to implement

I managed to write tests based on JUnit and Selenium 2.0, combined into a single application. This application can run on a Selenium Grid: a network headed by a Selenium Hub, which accepts incoming test jobs and distributes them among its Selenium Nodes. Different browsers can be configured on different Selenium Nodes. The drivers used are the native drivers for each browser.

Part one. Impressions

Different behavior in different browsers

By browsers I mean the main ones: Firefox, Google Chrome, Opera, Safari, IE8, IE9.
Making the same code work equally well in different browsers takes a huge amount of time. Sometimes it takes an iron will not to give up on this thankless job. The most obedient browsers in this respect are Firefox and Google Chrome. In my experience, it is vital that a test can change its behavior in the right places depending on the browser currently in use. That is, the test must have information about the environment it runs in.

Try not to use the webDriver object directly in tests! Create wrapper methods around the basic methods you need. It is easier to change behavior in one place than throughout the code of every test.
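
One way to give tests this knowledge is to keep every browser-dependent decision in a single small class. The sketch below is only an illustration: BrowserKind, TestEnvironment and the delay values are hypothetical names and numbers, and the browser name is assumed to arrive as a plain string (for example from a system property set by the build).

```java
import java.util.Locale;

/** Hypothetical enum of the browsers a test run may target. */
enum BrowserKind {
    FIREFOX, CHROME, OPERA, SAFARI, IE;

    /** Maps a browser name such as "internet explorer" to a constant. */
    static BrowserKind fromName(String name) {
        String n = name.toLowerCase(Locale.ROOT);
        if (n.contains("firefox")) return FIREFOX;
        if (n.contains("chrome")) return CHROME;
        if (n.contains("opera")) return OPERA;
        if (n.contains("safari")) return SAFARI;
        if (n.contains("explorer") || n.equals("ie")) return IE;
        throw new IllegalArgumentException("Unknown browser: " + name);
    }
}

/** Hypothetical holder for every browser-dependent decision the tests make. */
class TestEnvironment {
    private final BrowserKind browser;

    TestEnvironment(String browserName) {
        this.browser = BrowserKind.fromName(browserName);
    }

    boolean isIE() { return browser == BrowserKind.IE; }
    boolean isChrome() { return browser == BrowserKind.CHROME; }
    boolean isSafari() { return browser == BrowserKind.SAFARI; }

    /** Pause before an element lookup; the numbers are illustrative only. */
    long lookupDelayMillis() {
        return (isIE() || isChrome() || isSafari()) ? 1000L : 0L;
    }
}
```

Wrapper methods in a base test class can then ask this object instead of repeating browser checks in every test.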

Selenium 2.0 - Raw Product

While reading the many posts on Stack Overflow in search of best practices or a fix for a specific problem, you constantly run into workarounds. There are several reasons: browser drivers behave differently, drivers fail to honor the API contract and do not deliver the required functionality, individual versions contain bugs, and the browser version depends directly on the driver version. Sometimes WebDriver can fail a test for no apparent reason (from the API user's point of view): the element is there, but the driver does not see it. In my experience, there are many intermittent errors that show up only once in a while under absolutely identical conditions and actions. Firefox sometimes goes haywire and the browser may simply close with a message like this: Error communicating with the remote browser. It may have died. Finding the cause is extremely difficult, if it is accessible to the Selenium user at all. So sometimes the situation is hopeless - the functionality simply does not work.

Such an unfortunate state of affairs forces you to change the behavior of the test case. Fortunately, the same thing in a GUI client can often be done in a different order or in a different way. If googling did not yield a solution, try another course of action that does work. Do not fixate on one specific action unless there is a special need for it.

Selenium Tests - Dependent Tests

This means that unless you take extra care, the actions of one test can affect the result of another. This is fairly obvious, since user activity includes changing data. When testing such functionality, you are forced to change the initial data. If other tests depend on that data and you did not return it to its original state - or could not, because the test was interrupted by an error - another test may break as well. A kind of domino effect. When you first realize this, it is very painful. You just want to give up...

If you can reproduce the test conditions yourself, i.e. you have direct access to the application under test and nothing prevents you from deploying the initial test data, you are lucky: isolate your tests this way, preparing the data before each test and cleaning it up afterwards. For example, Liquibase can help restore data in a database.
Most likely you will have no such option. Then there is only one way out: besides the actions under test, describe the actions for rolling them back with Selenium as well. That is, if the user deleted an entity, it must be re-created or reloaded at the end of the test.
This is a bad path, since such rollback actions are just as fragile and can themselves end in an error without achieving their purpose.
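
To make such rollbacks at least a little more robust, the undo steps can be registered as the test goes and executed in reverse order at the end, so that one failing step does not prevent the others from running. The following is a minimal plain-Java sketch of that idea; RollbackStack and its methods are hypothetical names, not part of Selenium or JUnit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper: collects undo steps and runs them newest first. */
class RollbackStack {
    private final Deque<Runnable> undoSteps = new ArrayDeque<>();

    /** Register the undo step right after the action it compensates for. */
    void onRollback(Runnable undoStep) {
        undoSteps.push(undoStep);
    }

    /**
     * Runs every registered step in reverse order of registration.
     * A failing step does not stop the remaining ones; the failures
     * are returned so the caller can log or report them.
     */
    List<RuntimeException> rollbackAll() {
        List<RuntimeException> failures = new ArrayList<>();
        while (!undoSteps.isEmpty()) {
            try {
                undoSteps.pop().run();
            } catch (RuntimeException e) {
                failures.add(e);
            }
        }
        return failures;
    }
}
```

In a JUnit 4 test, rollbackAll() could be called from an @After method, which runs even when the test itself fails.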

Selenium Tests - Slow Tests

Be prepared for a sequential run of a large test suite across all browsers to take a long time, measured in hours (from 30 minutes to 2-3 hours). This turns everything described above into a tragedy and sometimes feels like mockery. The reason is that the tests are full of waits, element lookups and other slow operations.

Test only what you really need to test. Of all the working ways to perform the same action, choose the fastest.

Selenium IDE is no help

Selenium IDE is a plug-in for Firefox that records the user's actions as scripts. The recorded scripts can be exported to various languages and in two formats: Selenium 1.0 (RC) and Selenium 2.0 (WebDriver).
In most cases it is useless.

  • the generated code is unreadable
  • the generated code does not work with a complex interface, because of all the peculiarities described above
  • if element ids (div, table, span, input) are generated automatically, none of the XPath options the IDE offers is suitable
  • a large number of tests (5 is enough) will push you onto the right path of the Jedi: you write your own implementations of frequently performed actions and then use them as inherited methods, described and polished once. As soon as such a set of methods takes shape, the benefit of the IDE drops sharply. You cannot tell it to use your methods; it keeps generating its own non-working, imperfect templates. Reviewing the generated code and replacing everything that needs replacing eventually amounts to a complete rewrite. The same applies to a single “reference book” - a list of the XPath locators of all key elements. Once all such locators are kept in constants or in a separate catalog, using them is easier than checking yet again what the IDE has generated

The only benefit I constantly feel is the ability to check an XPath locator. A very convenient feature: if the locator is correct and the element exists, it is highlighted with a frame.

Play with the IDE, get the gist of Selenium; you can even write tests with it. But as soon as you feel that the benefits are smaller than the costs, start building your own reusable pieces. Collect them in a common abstract ancestor class or a utility class. From some point on, your tests may turn into a mere sequence of calls to such methods, interspersed with checks of the result and the current state.

Part two. Practical Solutions to Emerging Issues (Java)

The solutions described below are neither beautiful nor ideal, and they may put you off, but they work. In my experience, they eliminate the problems. I hope they will be useful and spare you lost hours and days.

Getting an element (findElement)

WebDriver provides the findElement method for finding and retrieving a WebElement.

In theory, the behavior of this method is affected by the 'implicit wait' parameter, which can be set when the webdriver is created. For example, like this:
webDriver.manage().timeouts().implicitlyWait(5, TimeUnit.SECONDS);

Again in theory, this should make the webdriver keep searching for the element for the specified time, waiting either until the element appears or until the timeout expires. By the way, this timeout seems to be set only once.
In practice, something strange happens. The pause is observed, but I have a feeling that the search, if it goes through the DOM, does not refresh that DOM. In some browsers the situation is different: the element is already in the DOM but has not been rendered yet, or has been rendered only partially (Google Chrome). WebDriver returns the half-drawn element, and the click lands on coordinates that have not been drawn yet. The isDisplayed() method does not help in such cases. Either way, the result for me is always the same: the element has definitely appeared on screen, but webDriver still does not detect it.

Insert a crude pause. To avoid doubling the number of lines of code, I recommend writing your own implementation of the findElement() method.
As I wrote above, to work effectively the test should know which browser it is currently running in. For Firefox, in my experience, no such delay is needed.
You can also use WebDriverWait. I do not describe that option here: I settled on putting the thread to sleep, which is enough for me, so I have no proven variant to show. But everything there is fairly simple.
From then on, use only this method in all tests and do not call webDriver.findElement() directly.

Code example:
protected WebElement findElement(By elementLocatorToFind) {
    if (isSafari() || isChrome() || isIE()) {
        // crude pause; sleep() stands for your own helper around Thread.sleep(1000)
        sleep(1000);
    }
    return webDriver.findElement(elementLocatorToFind);
}
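
A fixed pause always costs its full duration, even when the element is ready sooner. As a possible alternative, the lookup can be repeated at short intervals until it succeeds or a timeout expires. The sketch below is not Selenium's WebDriverWait, only a plain-Java illustration of the same polling idea; Poller and waitFor are hypothetical names.

```java
import java.util.function.Supplier;

/** Hypothetical helper: polls a lookup until it yields a result or time runs out. */
class Poller {
    /**
     * Calls lookup every intervalMillis until it returns a non-null value.
     * Returns null if nothing was found within timeoutMillis or if the
     * waiting thread was interrupted.
     */
    static <T> T waitFor(Supplier<T> lookup, long timeoutMillis, long intervalMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            T result = lookup.get();
            if (result != null) {
                return result;
            }
            if (System.currentTimeMillis() >= deadline) {
                return null;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
    }
}
```

Inside a findElement() wrapper, the lookup could call webDriver.findElements(locator) and return the first element of the list, or null while the list is still empty.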

Getting elements (findElements)

The problem and the solution are the same as for finding a single element.

Check for the existence of an element

If you need to verify that an element is absent, call findElements and check that the returned list is empty.

Here is the recommendation from the JavaDocs:
findElement should not be used to look for non-present elements, use findElements(By) and assert zero length response instead.

Downloading a picture or file

You may want to test file downloading: that the downloaded file is the expected one and, if it is a picture, that it is really available at the given link.

In 99% of cases you do not need this. Ask yourself again: what do you want to test? I am fairly sure you only need to know that the download is available: that the link is active, the download button is enabled and the response status after the download starts is 200. Testing the browser and its download process is not your job.
Also, if the tests run on a Selenium Grid, you will not be able to download the file and then check where it landed. The file is downloaded on the Selenium Node, while you would be checking for it on the Selenium Hub. These are different hosts, at least in normal practice.

The solution is to make an ordinary HTTP request to the link that leads to the file on the server, or to the link at which the server should return the file. If the server responds with status 200, the file exists. I treat every other outcome as the file being unavailable for download. Since such requests often must carry the cookies of an authorized session, these cookies have to be imported from webDriver.
If the status alone is not enough, nothing prevents you from reading the whole InputStream from the HttpEntity and comparing its contents with a reference, be it by MD5 sum or in some other way.

Code example:
    // Uses Apache HttpClient 4.x (org.apache.http.*) and org.openqa.selenium.Cookie.
    // Look at your cookie's content (e.g. in the browser) and copy these settings from it.
    private static final String SESSION_COOKIE_NAME = "JSESSIONID";
    private static final String DOMAIN = "";
    private static final String COOKIE_PATH = "/cookie/path/here";

    protected boolean isResourceAvailableByUrl(String resourceUrl) {
        HttpClient httpClient = new DefaultHttpClient();
        HttpContext localContext = new BasicHttpContext();
        BasicCookieStore cookieStore = new BasicCookieStore();
        // pass the browser's session cookie along so the request is authorized
        BasicClientCookie sessionCookie = getSessionCookie();
        if (sessionCookie != null) {
            cookieStore.addCookie(sessionCookie);
        }
        localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
        // resourceUrl is the URL that leads to the image or file
        HttpGet httpGet = new HttpGet(resourceUrl);
        try {
            HttpResponse httpResponse = httpClient.execute(httpGet, localContext);
            return httpResponse.getStatusLine().getStatusCode() == HttpStatus.SC_OK;
        } catch (IOException e) {
            return false;
        }
    }

    protected BasicClientCookie getSessionCookie() {
        Cookie originalCookie = webDriver.manage().getCookieNamed(SESSION_COOKIE_NAME);
        if (originalCookie == null) {
            return null;
        }
        String cookieName = originalCookie.getName();
        String cookieValue = originalCookie.getValue();
        BasicClientCookie resultCookie = new BasicClientCookie(cookieName, cookieValue);
        resultCookie.setDomain(DOMAIN);
        resultCookie.setPath(COOKIE_PATH);
        return resultCookie;
    }
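
For the case where the status code alone is not enough, the content comparison mentioned above could look roughly like this. It is a sketch using only the JDK; ContentChecksum and md5Hex are hypothetical names, and the reference checksum is assumed to be known in advance.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for comparing downloaded content with a reference checksum. */
class ContentChecksum {
    /** Reads the stream to the end and returns its MD5 sum as lowercase hex. */
    static String md5Hex(InputStream in) {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
        byte[] buffer = new byte[8192];
        int read;
        try {
            while ((read = in.read(buffer)) != -1) {
                md5.update(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            // two lowercase hex digits per byte
            hex.append(Integer.toHexString((b & 0xff) | 0x100).substring(1));
        }
        return hex.toString();
    }
}
```

The stream would come from httpResponse.getEntity().getContent(), and the result would be compared with the checksum of the reference file.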

Clearing input field value

UPD: see the discussion in the comments below. It turned out that the method apparently works correctly, and the cause lay in a difference in browser versions and in a related conflict. Still, I decided not to delete the description of this problem, because these techniques may be as useful to someone else as they once were to me.

Sometimes you need to clear the value of an input field, for example to replace the old value with a new one.
WebDriver provides a dedicated method for this, WebElement.clear().

In my experience, this method does not work; moreover, it throws an error and breaks the test.
You need to find another way to clear the field value.

There are several basic ways.
The first is to simulate the “select all” action and send the new value right after it:
inputElement.sendKeys(Keys.chord(Keys.CONTROL, "a") + Keys.DELETE + newValue);

But for me this solution does not work in every browser, nor every time.

The second way is to send as many backspace characters as the old value is long. This solution is ugly, but it is effective and has worked for me in all browsers.
Below is the variant I use myself. It treats separately the case where the browser is IE and the input is of type file.
This is a special case. When sendKeys is executed on such an element, IE replaces the old value with the new one instead of appending to it. So there is no point in clearing such a field. Moreover, an attempt to do so ends in an error: either because of a non-existent file (IE tries to find a file at an empty path) or because it tries to find a file at a path whose string value is the backspace character.

Code example:
protected void clearInput(WebElement webElement) {
        // isIE() just checks whether the browser is IE - use your own implementation
        if (isIE() && "file".equals(webElement.getAttribute("type"))) {
            // workaround
            // if IE and the input's type is file - do not try to clear it.
            // If you send:
            // - an empty string - it will look for a file at an empty path
            // - a backspace char - it will be treated as a non-visible char
            // In both cases an error is thrown.
            // Just replace the value with the new one when you need to.
        } else {
            // if you have no StringUtils in the project, check for null or "" yourself
            while (!StringUtils.isEmpty(webElement.getAttribute("value"))) {
                // "\u0008" is the backspace char
                webElement.sendKeys("\u0008");
            }
        }
}
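
The loop in this example reads the attribute and sends a key once per character. Given how slow every WebDriver call is, a possible variant is to build the whole key sequence first and pass it to sendKeys in a single call. The sketch below only builds the string; InputKeys and replaceSequence are hypothetical names, it assumes the caret is at the end of the field, and I have not verified this variant in every browser.

```java
/** Hypothetical helper: builds the key sequence that replaces an input's value. */
class InputKeys {
    private static final char BACKSPACE = '\b';

    /** Returns one backspace per character of oldValue, followed by newValue. */
    static String replaceSequence(String oldValue, String newValue) {
        StringBuilder keys = new StringBuilder();
        int oldLength = (oldValue == null) ? 0 : oldValue.length();
        for (int i = 0; i < oldLength; i++) {
            keys.append(BACKSPACE);
        }
        return keys.append(newValue).toString();
    }
}
```

The result would be sent as webElement.sendKeys(InputKeys.replaceSequence(webElement.getAttribute("value"), newValue)).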

Upload file to server

Problem: you need to upload a file to the server using standard HTML elements (a file input and a submit button).

I recommend putting this into a separate universal method and using it every time you need to upload something through such a form.

The Safari driver does not fully support file upload because, as I understand it, it is JavaScript-based. The file selection window that pops up leaves it stuck. Such scenarios must either be avoided or handled another way: by making your own HTTP request or by attaching the data directly on the server side, if possible.

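Code example:
protected void uploadFile(By uploadInput, By uploadButton, String filePath) {
    // type the file path into the file input, then submit the form
    findElement(uploadInput).sendKeys(filePath);
    findElement(uploadButton).click();
}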

Actions with elements inside iframe

If the required element is inside an iframe, it is not accessible from the default context. It cannot be found in the DOM, and webDriver throws a NoSuchElementException.

Before interacting with such an element, you must switch webDriver to the context of its iframe. As I understand it, this is because the page and the iframe on it have two different DOMs.

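Code example:
// "frameNameOrId" is a placeholder for the name or id of your iframe
webDriver.switchTo().frame("frameNameOrId");
// do actions against inner web element, located in iframe
webDriver.switchTo().defaultContent();
// continue to do actions in default content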

IE8 XPath issue

IE8, in its peculiar eccentric manner, sometimes misinterprets element locators (By.xpath and others).
I had cases where it ignored a condition on the class attribute that narrowed the search down to the required element.
For example, IEDriver refused to distinguish between two different elements addressed by such locators and returned the elements matching both of them.

I could not work out in which situations it has these problems.
Exactly the same thing happened when the id of the element was specified directly: WebDriver pretends the element does not exist.

If IEDriver starts hallucinating when looking for an element (while other drivers and browsers do not), the best way out is to change the XPath. Thanks to the flexibility of XPath, there are always plenty of options.

IE8 element not clickable

IE8, unlike other browsers, cannot always scroll to an element by itself when you click an element located outside the visible part of its container (layer, table, etc.). As a result, the click fails with an error.

Solution: you need to scroll. The only working method I found was to use JavaScript. WebDriver does have a special mechanism meant to help with scrolling to the required element:

new Actions(webDriver).moveToElement(elementToScrollTo).perform();

But it does not work in IE8.

Code example:
((JavascriptExecutor) webDriver).executeScript("container.scrollLeft=1000;");

Here container is the id of the element to scroll, i.e. in our case the div or table that contains the target element. As you can see, this script scrolls horizontally.
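
The one-line script relies on the browser exposing the element as a global variable named after its id. A slightly more explicit variant could look the element up with document.getElementById and keep the script building in one helper. This is only a sketch: ScrollScripts and its methods are hypothetical names, and the id is assumed to contain no quote characters.

```java
/** Hypothetical helper: builds JavaScript snippets that scroll a container by id. */
class ScrollScripts {
    /** Script that sets the horizontal scroll offset of the element with the given id. */
    static String scrollLeft(String containerId, int pixels) {
        return "document.getElementById('" + containerId + "').scrollLeft = " + pixels + ";";
    }

    /** Script that sets the vertical scroll offset of the element with the given id. */
    static String scrollTop(String containerId, int pixels) {
        return "document.getElementById('" + containerId + "').scrollTop = " + pixels + ";";
    }
}
```

The resulting string would be passed to executeScript in the same way as the script above.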

Firefox may die

The Firefox driver could be a model for the other drivers, but it has one very unpleasant flaw. Judging by comments on different WebDriver versions, this flaw disappears and reappears from version to version.
The gist is that sometimes Firefox seems possessed: without any apparent external influence or change, it suddenly starts crashing out of the blue.
It looks something like this: you watch a test that has been polished to a shine run successfully in the browser window. Then, at some utterly trivial step or action, the browser window simply disappears. In the logs you find this entry: Error communicating with the remote browser. It may have died. That is all; you will find no further information.

This is a regression bug, and there can be no guaranteed cure. Its essence is that the browser and the driver stop understanding each other. For example, your browser was updated, you did not notice, and you keep using the old WebDriver. As far as I could tell, there is a dependency between Firefox and its driver. It is not absolute, i.e. you do not have to rush to update the webdriver every time Firefox is updated. But the first thing I would advise is to google which webdriver version best matches your version of Firefox.
In the case of Firefox 19, updating the Selenium stand-alone server to version 2.30.0 helped me.


I am grateful for this experience and for the opportunity to work with this framework. Over the past months XPath has become almost a native language for me; soon I will probably be able to write letters in it. I believe I have learned a lot about how to use Selenium and how to do it effectively.

But still... I would not want to face such tasks again. It is extremely tiring, debugging is torment, it sometimes forces you to write bad code, and scariest of all is that the web client under test will be modified. I know for certain that it will be. And that will be yet another painful moment.

Therefore, if you decide to write serious tests on this platform, brace yourself.
