How we test CSS regressions with Gemini. A talk from BEMup at Yandex

    Hello! My name is Sergey Tatarintsev. At Yandex, I work in the shared interface development group. Our group creates interface libraries used by many services, including Search. We maintain four libraries, which together contain 62 blocks.

    Counting every desktop and mobile browser version we support, there are more than 15 of them. About a year ago, we tested all of this manually: a tester clicked through every block in every browser and checked whether anything had broken and whether everything worked as intended. This made the release process very long, to the point that testing took roughly as much time as development. Many bugs escaped the tester's eye or were detected only after a fairly long time.

    We decided that we could not go on like this and had to automate testing somehow. We started with static analysis tools. To check code style, we use jscs, a tool written by our colleague Marat Dulin. For static code analysis, we use the well-known JSHint. And to catch regressions in JS, we write unit tests. To some extent this helped: the analyzers caught the most obvious errors, and the tests let us check the functionality of a block. But for CSS regressions there was a gap. Appearance was still checked by the hands and eyes of a tester, so we began looking for tools that would help us automate this.

    Consider an example. The picture below shows a typical block from our standard library. It has various fonts, highlights, and shadows.

    We could probably write a unit test that checks for certain CSS properties on certain elements of the block. But that is not a particularly interesting activity, and such tests would not be particularly stable. The best way to test a picture is to compare it with a reference.
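
    To make "compare it with a reference" concrete, here is a minimal sketch of a pixel-by-pixel comparison. It is not gemini's actual algorithm, which the talk does not describe. It assumes both images are already decoded into flat RGBA arrays of equal size, and the function name and tolerance parameter are illustrative.

```javascript
// Compare two images given as flat RGBA arrays of the same dimensions.
// Returns the number of pixels in which at least one channel differs
// by more than `tolerance`. Illustrative only, not gemini's comparison code.
function countDifferentPixels(actual, reference, tolerance) {
    if (actual.length !== reference.length) {
        throw new Error('Images must have the same dimensions');
    }
    var different = 0;
    for (var i = 0; i < actual.length; i += 4) {
        for (var c = 0; c < 4; c++) {
            if (Math.abs(actual[i + c] - reference[i + c]) > tolerance) {
                different++;
                break; // count each pixel at most once
            }
        }
    }
    return different;
}

// Two 2x1 images: the second pixel differs in the red channel by 40.
var reference = [255, 255, 255, 255,   0, 0, 0, 255];
var actual    = [255, 255, 255, 255,  40, 0, 0, 255];

console.log(countDifferentPixels(actual, reference, 0));  // 1
console.log(countDifferentPixels(actual, reference, 50)); // 0
```

    A non-zero tolerance is one way to ignore tiny rendering differences, at the cost of missing subtle color regressions.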


    Several further conditions must be met. First, it must be possible to test in multiple browsers at once: a picture that matches the reference in Firefox does not guarantee an equally correct picture in IE. Second, screenshots of blocks need to be taken in different states, because the same button may look very different when pressed. It should also be possible to take screenshots of individual blocks rather than the entire page: a page may contain dynamic content that causes false positives, and with per-block screenshots you can immediately see which block on the page has the problem. Third, we would like to store reference screenshots in the repository. This lets us version them together with the code and support several versions of a library with different designs, and local storage improves testing performance, since there is no need to open a URL and capture the reference screenshot again. Finally, we would like to write the tests themselves in JavaScript, since we are all web developers and know and love this language.

    Before writing our own tool, we looked at existing ones. The first is Depicted by Google. Its main advantage is that it needs no test code. You simply give it two URLs, the reference site and the one under test, and it follows all the links, takes screenshots, and prepares a report. Unfortunately, it cannot keep references in a repository or take screenshots of individual blocks.

    The second is casper.js + phantom.css: a framework for integration testing with the headless browser phantomjs, and an add-on for screenshot testing, respectively. This combination can capture fragments of a page and test them in various states. However, it is tightly bound to phantomjs, so testing in other browsers is not possible.

    The last tool we studied is Huxley from Instagram. It also needs no test code: it records all your actions with a special plug-in and lets you replay them later. But it can only take screenshots of the whole page, and it runs tests in only one browser at a time.

    It is already clear that none of these tools meets all our requirements. In this table you can see exactly what each of them lacked.

    In the end, we decided to develop our own tool. We named it gemini, "the twins".

    How it works

    Here is a brief overview of how it works. You describe several states of a block. For each state, you can specify a list of actions that must be performed to reach it.

    Let's look at a specific example: a button from the bem-components library. It has four states: initial, hovered, pressed, and focused:

    This is how the abstract scheme above looks for this particular button:

    The button has an initial state that requires no actions. From it you can move to the hovered state by moving the cursor over the button. From there, pressing the left mouse button moves it to the pressed state. When we release the left mouse button, the button receives focus.
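
    The chain of states can also be written down as data. The sketch below is only an illustration of the scheme, not gemini code: the state names match the ones used in the test later in the talk, while the structure and the helper function are assumptions made for this example.

```javascript
// Each state names the state it is reached from and the action that leads to it.
// The action strings are plain descriptions, not gemini commands.
var states = {
    plain:   {from: null,      action: null},
    hovered: {from: 'plain',   action: 'move the cursor over the button'},
    pressed: {from: 'hovered', action: 'press the left mouse button'},
    clicked: {from: 'pressed', action: 'release the left mouse button'}
};

// Collect, in order, every action needed to reach `name` from the initial state.
function actionsToReach(name) {
    var actions = [];
    for (var s = states[name]; s && s.action; s = states[s.from]) {
        actions.unshift(s.action);
    }
    return actions;
}

console.log(actionsToReach('plain'));   // []
console.log(actionsToReach('pressed'));
// [ 'move the cursor over the button', 'press the left mouse button' ]
```

    Because each state builds on the previous one, a test only has to describe the one extra action that separates a state from its predecessor.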

    Let's move from diagrams straight to code. A gemini test is a regular node.js module, and we import gemini with the usual node require. First we need to create a test suite. This is done with the gemini.suite command. We pass it the name of the suite and a function in which we will configure the suite. All the code in the examples that follow goes inside this function.

    var gemini = require('gemini');
    gemini.suite('button', function(suite) {
        // suite configuration goes here
    });

    The first step is to set the URL from which we will take screenshots.


    Next we need to set the capture region. This is done with a list of CSS selectors. In the example there is only one element, but there can be any number of them. The capture region is defined as the smallest rectangle that contains all of the listed elements.
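
    The "smallest rectangle" rule can be sketched in a few lines. This is an illustration of the geometry only, not gemini's implementation. It assumes each element is described by a {left, top, width, height} rectangle, such as the one getBoundingClientRect() reports in a browser, and the sample coordinates are made up.

```javascript
// Return the smallest rectangle that contains every rectangle in `rects`.
// Each rectangle is {left, top, width, height}.
function boundingRect(rects) {
    if (rects.length === 0) {
        throw new Error('At least one rectangle is required');
    }
    var left = Infinity, top = Infinity, right = -Infinity, bottom = -Infinity;
    rects.forEach(function(r) {
        left = Math.min(left, r.left);
        top = Math.min(top, r.top);
        right = Math.max(right, r.left + r.width);
        bottom = Math.max(bottom, r.top + r.height);
    });
    return {left: left, top: top, width: right - left, height: bottom - top};
}

var area = boundingRect([
    {left: 10, top: 20, width: 100, height: 30},  // e.g. a button
    {left: 50, top: 60, width: 120, height: 40}   // e.g. a second element below it
]);
console.log(area); // { left: 10, top: 20, width: 160, height: 80 }
```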


    With setup finished, we can start capturing screenshots. Our first state is the initial one (plain). We don't have to perform any actions; we just take a screenshot with the capture command, passing it the state name.
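
    The three setup steps described so far (URL, capture region, first state) might look roughly like the sketch below. The method names setUrl, setCaptureElements and capture are gemini's suite methods; the page path '/button.html' is a hypothetical example. To keep the snippet runnable without gemini and a Selenium Grid, a small recording stand-in plays the role of the suite object here; in a real test the suite is the argument that gemini.suite passes to your function.

```javascript
// Stand-in for the suite object that gemini passes to the gemini.suite callback.
// It only records calls; it is NOT part of gemini.
function createRecordingSuite() {
    var calls = [];
    var suite = {
        calls: calls,
        setUrl: function(url) {
            calls.push(['setUrl', url]);
            return suite;
        },
        setCaptureElements: function() {
            calls.push(['setCaptureElements', Array.prototype.slice.call(arguments)]);
            return suite;
        },
        capture: function(name) {
            calls.push(['capture', name]);
            return suite;
        }
    };
    return suite;
}

// The body of this function is what would go inside gemini.suite('button', ...).
function describeButton(suite) {
    suite.setUrl('/button.html');         // hypothetical path, relative to the root URL
    suite.setCaptureElements('.button');  // one selector here; several are allowed
    suite.capture('plain');               // initial state: no actions needed
}

var suite = createRecordingSuite();
describeButton(suite);
console.log(suite.calls);
```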


    For the second state, hovered, we need to perform an action: moving the cursor. This is done in the function passed as the second argument to capture.

    suite.capture('hovered', function(actions) { actions.mouseMove('.button'); });

    The next state is the pressed button. Pressing is done with the mouseDown command. This example also shows an alternative way of specifying an element: instead of passing the CSS selector directly, you wrap it in the find function. I will explain why this is useful a little further below.

    suite.capture('pressed', function(actions, find) { actions.mouseDown(find('.button')); });

    The last state is the focused button: the button pressed in the previous example must be released with the mouseUp command.

    suite.capture('clicked', function(actions, find) { actions.mouseUp(find('.button')); });

    In principle, the test could end here, but we can still optimize it a little. Every example interacts with the same button, and every one of them searches for the element again. This can be simplified by performing the search once, in the before function, using find and saving the result in a variable. Later, the saved result can be used instead of searching again. The final version of the test looks something like this:

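    var gemini = require('gemini');
    gemini.suite('button', function(suite) {
        suite
            // the URL and the capture elements are set here, as described above
            .before(function(actions, find) {
                this.button = find('.button'); // search once, reuse below
            })
            .capture('plain')
            .capture('hovered', function(actions, find) { actions.mouseMove(this.button); })
            .capture('pressed', function(actions, find) { actions.mouseDown(this.button); })
            .capture('clicked', function(actions, find) { actions.mouseUp(this.button); });
    });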

    We also need to create a configuration file. In it, we set the root URL against which the relative URLs specified in the tests are resolved. The second parameter is the URL of the Selenium Grid (gemini is based on Selenium, so using a Grid is mandatory). Finally, there is the list of browsers in which we will test. The config looks something like this:

    rootUrl: http://localhost:8000
    gridUrl: http://localhost:4444/wd/hub
    browsers:
        firefox:
            browserName: firefox
            version: 30
        opera:
            browserName: opera
            version: 12
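
    As an illustration of how a relative test URL combines with rootUrl, the sketch below uses the standard URL class available in node.js and browsers. How gemini joins the two internally is not covered in the talk, so treat this as an assumption about the general idea; the path '/button.html' is hypothetical.

```javascript
// rootUrl as in the config; the relative path is a hypothetical test page.
var rootUrl = 'http://localhost:8000';
var pageUrl = new URL('/button.html', rootUrl).href;

console.log(pageUrl); // http://localhost:8000/button.html
```

    Keeping only relative URLs in the tests means the same suite can run against a local server or any other host by changing a single config value.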

    You can view the test report directly in the console or generate an HTML report.

    Useful Tips

    Screenshots are not a substitute for unit tests. You should not use them to test complex logic.
    Use only static data. If a dynamic backend returns different data on each test run and that data affects the layout, screenshots will be useless.

    Integration with third-party services

    The first such service is Sauce Labs, something like a cloud-based Selenium Grid. It is free for open source projects. To integrate it with gemini, we need to set two environment variables: the username and the access key, which are issued after registration.


    In the config, the gridUrl must then point to Sauce Labs instead of your own Selenium Grid.

    If you do not have a dedicated server for the pages under test, you will need the Sauce Connect utility, which opens a tunnel between your localhost and the Sauce Labs servers.

    Another service is the well-known Travis, commonly used for continuous integration. To work with it, you need to install several native dependencies; in particular, gemini needs graphicsmagick. To run the gemini tests, set gemini test as the test script in the project's package.json:

    "scripts": {
        "test": "gemini test"
    }

    If you want to use both services together, Sauce Labs has fairly detailed instructions on the topic. For an example, see this demo.

    The documentation for gemini is available at this link, and the tool itself is available on GitHub. You can also send your suggestions and pull requests there; we are always glad to see them.

    The tool continues to evolve. Most recently, we taught it to calculate how much of your CSS the tests cover. To enable this, set the coverage: true parameter in the config. After the tests run, the report will be in the gemini-coverage folder. The feature is still experimental, and we welcome your feedback.

    Also, a version with a programmatic API and a graphical interface will be released in the near future.
