How to implement back-to-back testing: the Yandex experience

    When development accelerates, writing autotests has to speed up as well. One approach that lets you cover significant pieces of functionality with tests in a short time is back-to-back testing. For web services, one of its most common variations is screenshot comparison; we have already described how we use it to test Yandex search. If you have a tested version of the product, creating a set of autotests for the next versions is quite simple and does not take much time. The main difficulty is reproducing identical situations in different versions of the service: to do this, you often have to maintain a large amount of test data in several environments.

    When you think about a back-to-back approach, the first idea that comes to mind is to compare against a stable environment. Out of the box, though, this works for a very limited range of products, because data in the stable and test environments often diverges. Having seen a comparison with the stable environment fail, testers often give up on back-to-back testing altogether. Below we describe a couple of standard ways to implement this approach that we use for Yandex services; they solve many of the problems that arise when comparing against a stable environment. We will also discuss the advantages and disadvantages we discovered along the way.

    As an example, consider a web service that can be divided into a frontend and a backend. Its pages are built from data received from the backend, so if the same data reaches two independent versions of the frontend, they should render the same pages, which can then be compared. We did this in two ways:
    • pointing both versions of the frontend at the same backend;
    • passing the required backend response directly to the frontend.

    Consider each of these methods in more detail.

    Shared backend

    Service Description

    We used this method to test the Yandex.Market partner interface (documentation): a web service that allows Yandex partners to manage the placement of their goods on Yandex.Market, monitor the status of their stores, and view statistics. In essence, all the functionality of the service is the presentation of large amounts of varied data. Some of the data is confidential, and most pages sit behind authorization. Since the same page may look different for different types of users and stores, full testing requires maintaining a large set of test data that covers all cases.

    If we were to compare the test and stable environments, the service would be completely unsuitable for screenshot checks. Maintaining a large set of identical data in both a test and a stable environment is very expensive, and for some pages simply impossible, because some store data is confidential and cannot be duplicated in a test environment.


    For manual checks, the test environment already maintains data for reproducing all the necessary cases. So by deploying an additional environment, installing stable packages on it, and pointing it at the test database, we eliminate data discrepancies and get two instances of the service that let us test it by comparing screenshots.

    To start testing, you need physical or virtual machines on which the service will be deployed. Since they are only needed for a short time, while the tests run, we decided not to keep permanently running instances of the service, but to obtain virtual machines from the cloud and install the necessary packages on them. For this we used a Jenkins plugin that obtains the required virtual machines through OpenStack. Then, using a set of scripts, we install all the necessary packages on them and start the service. Since we deploy stable packages in the test environment, the existing test environment configuration files suit them. The whole process takes about 10 minutes.
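    The provisioning flow above can be sketched as a small script that composes the deployment steps. The CLI flags, package names, and the service name here are illustrative assumptions, not the actual Yandex tooling:

```python
def build_deploy_plan(vm_name, packages, service="market-partner"):
    """Compose the shell commands that obtain a short-lived VM from the
    cloud and deploy stable packages against the test-environment config.
    All command details below are hypothetical placeholders."""
    plan = [
        # Request a temporary VM through OpenStack (done by a Jenkins
        # plugin in our setup).
        f"openstack server create --flavor m1.medium --image ubuntu-lts {vm_name}",
    ]
    # Install the stable packages; the existing test-environment
    # configuration files already suit them.
    plan += [f"ssh {vm_name} apt-get install -y {pkg}" for pkg in packages]
    # Start the service once everything is in place.
    plan.append(f"ssh {vm_name} service {service} start")
    return plan

plan = build_deploy_plan("b2b-stable-1", ["frontend-stable", "frontend-configs"])
```

    Injecting the package list as a parameter keeps the same script usable for both the stable and the tested builds.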

    The tests themselves were written so that the user (in our case, a manual tester) specifies the set of pages to test plus some parameters, such as authorization data, role, etc. The tests read these input conditions, perform a set of actions, and then compare screenshots of the page or some part of it. To compare screenshots and build the report, we use the same tool described in our earlier post. It presents screenshot differences in a human-friendly form, which makes it easy to analyze the results of a large number of test runs.
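    The core of the comparison step can be illustrated with a toy pixel diff. This is only a sketch of the idea, not the actual tool: it treats each screenshot as a grid of RGB tuples and finds the bounding box of the differing region, which is what a report would highlight.

```python
def diff_bounding_box(img_a, img_b):
    """Return (top, left, bottom, right) of the area where two
    equal-sized images differ, or None if they are pixel-identical."""
    assert len(img_a) == len(img_b) and len(img_a[0]) == len(img_b[0])
    diff_points = [
        (y, x)
        for y, (row_a, row_b) in enumerate(zip(img_a, img_b))
        for x, (pa, pb) in enumerate(zip(row_a, row_b))
        if pa != pb
    ]
    if not diff_points:
        return None  # the page renders identically in both frontends
    ys = [p[0] for p in diff_points]
    xs = [p[1] for p in diff_points]
    return (min(ys), min(xs), max(ys), max(xs))

WHITE, RED = (255, 255, 255), (255, 0, 0)
stable = [[WHITE] * 4 for _ in range(4)]
tested = [[WHITE] * 4 for _ in range(4)]
tested[1][2] = RED  # one element recolored in the new frontend version
box = diff_bounding_box(stable, tested)  # → (1, 2, 1, 2)
```

    A real tool additionally ignores known-volatile regions (timestamps, counters) and renders the diff visually.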


    With this approach, the stability of the frontend checks depends on the backend. This can cause problems, because the backend in the test environment breaks from time to time, for example due to failed builds or environment issues. In such cases testing fails even if the frontends themselves work. On the other hand, integration is checked, because the pages are built from current backend responses. The interaction format may change: for example, if a new data block needed to build new parts of the page appears in the backend response, the old version of the frontend will not be able to process it, and some of the tests will break. But if we choose not the entire backend but only the data store as the shared part, the problem is solved: in that case we check two frontend-backend bundles with compatible versions.

    The tests themselves perform few actions on the page; which case gets checked is often determined by the data in the test environment. That data must therefore be stable, and if something changes, the user of the automated tests should always know about it. For example, data can be regularly copied from the stable environment. This is useful, since the tester stays in the same context as real users, but not all cases can be reproduced on production data. Part of the functionality may exist only in the test environment, and testing it requires additional data that the stable environment simply does not have yet. If that data is erased during copying, tests of new functionality may break or stop checking the necessary cases.

    For a test to start checking a new page, it is enough to add it to the list and set the parameters, so covering new cases or maintaining data does not require the autotest developer's involvement. Test development is needed only for more complex scenarios of interacting with the page (entering values, activating pop-ups, prompts, etc.).
    All the data needed for verification is already in the test environment, so no additional tools are required to reproduce test results manually. If an element is missing or misplaced, just open the page being checked in the test frontend.
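    The user-maintained page list can be as simple as a list of records that the runner expands into concrete checks. Field names here are illustrative assumptions:

```python
# Each entry names a page, the roles to check it under, and an optional
# CSS selector when only part of the page should be screenshotted.
PAGES = [
    {"path": "/stats", "roles": ["owner", "manager"], "selector": ".stats-table"},
    {"path": "/settings", "roles": ["owner"], "selector": None},  # whole page
]

def expand_cases(pages):
    """Turn the page list into (url, role, selector) cases, one per role,
    ready for the screenshot-comparison runner."""
    return [
        (page["path"], role, page["selector"])
        for page in pages
        for role in page["roles"]
    ]

cases = expand_cases(PAGES)
```

    With this shape, a manual tester adds coverage by editing `PAGES` alone; no test code changes are needed.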

    Advantages and disadvantages of the method

    Advantages:
    • low implementation costs;
    • low support costs.

    Disadvantages:
    • test execution time depends not only on the service being tested;
    • stability depends on the backend;
    • some cases may be difficult to reproduce.

    Backend response emulation

    Service Description

    We used this approach to test the Yandex.Direct interfaces through which partners manage their advertising campaigns. As in the previous case, the main functionality of the pages is presenting large amounts of data received from the backend. The display depends on many factors, such as the type of advertising campaign, the user, etc., and these factors are difficult to control. For example, user information comes to Yandex.Direct from other services, and changing it means contacting them. Parameters such as the campaign type can only be changed a limited number of times per day. All this makes testing very difficult, so to make the tests more stable we decided to remove the dependence on the backend, and with it the dependence on data and on integration with other Yandex services.


    On the test environments, the frontends gained the ability to set the backend response when loading a page, that is, to build a page from arbitrary data specified by the test. We already had an environment with the stable package deployed, so we did not need to set up new ones. But to make full use of screenshot comparison, we had to write an additional test module to manage the data passed to the page. To store the canned backend responses, we chose the Elliptics distributed storage; the data in it is grouped by case and test type. For ease of management, we keep general information about the data in our internal wiki, so that a user of the autotests can easily find out what is checked and how, and change something if necessary. And so that a service developer could easily reproduce and fix the bugs found, we had to implement the ability to replay the tests manually.
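    A minimal sketch of such a data-management module, with a plain dict standing in for Elliptics; the class, method names, and response shape are assumptions for illustration:

```python
import json

class ResponseStore:
    """Stores canned backend responses, grouped by test type and case
    name, for injection into the frontend under test."""

    def __init__(self):
        self._data = {}

    def save(self, test_type, case, response):
        # Serialize so stored data stays independent of live objects.
        self._data[(test_type, case)] = json.dumps(response)

    def load(self, test_type, case):
        return json.loads(self._data[(test_type, case)])

store = ResponseStore()
store.save("campaign-page", "archived-campaign",
           {"campaign": {"id": 1, "status": "archived", "clicks": 0}})

# In a test: fetch the canned response and hand it to the frontend when
# the page loads (e.g. via a stubbed endpoint on the test environment).
canned = store.load("campaign-page", "archived-campaign")
```

    Keying by (test type, case) mirrors how the data is grouped in the storage and documented in the wiki.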


    With this approach, the tests become completely independent of the backend, which makes them faster and more stable. To reproduce a specific situation, it is enough to pass the desired response to the frontend; the response can be saved in advance or generated from a template. However, integration with the current version of the backend is not checked, so it has to be tested separately, for example using the approach described above.

    There is no need for many preliminary steps before the tests. To check a case, you do not have to create a new campaign, take it through a series of checks, give it a positive balance, and so on; it is enough to form a suitable backend response. It also becomes possible to check cases that simply cannot be reproduced otherwise, such as time-dependent behavior.
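    Forming a response from a template can look like this (field names are hypothetical): take a base response and override only the fields that define the case, such as an end date in the past to exercise time-dependent display logic.

```python
import copy

# A base response that renders a normal, active campaign.
TEMPLATE = {"campaign": {"id": 1, "status": "active", "ends": "2030-01-01"}}

def make_response(**overrides):
    """Copy the template and apply per-case field overrides, leaving the
    template itself untouched for other tests."""
    response = copy.deepcopy(TEMPLATE)
    response["campaign"].update(overrides)
    return response

# A case that would otherwise require waiting for a campaign to expire.
expired = make_response(status="finished", ends="2000-01-01")
```

    The deep copy matters: without it, one test's overrides would leak into every later test that uses the template.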

    The backend response format can change, which makes the tests potentially fragile. To address this, we made it possible to take an actual backend response and record it in the test data storage; the recording is then used in subsequent test runs. But for some cases this does not work: if a backend response was hand-edited to reproduce a tricky case, a live response cannot simply be recaptured. When the response format changes, the data for such cases has to be updated by hand, and that takes time. Editing stored backend responses requires knowing the service's implementation details and certain technical skills, which makes data support more expensive.
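    The record-and-replay safeguard can be sketched as follows; the fetcher is injected so the example stays self-contained, and all names are illustrative:

```python
def get_response(store, case, fetch_live, refresh=False):
    """Return the canned response for a case, recording a fresh live
    backend response first when a refresh is requested or nothing has
    been stored yet; otherwise replay the stored one."""
    if refresh or case not in store:
        store[case] = fetch_live(case)  # record the actual backend answer
    return store[case]  # replayed on subsequent runs

store = {}
live_calls = []

def fake_backend(case):
    # Stand-in for a real backend request; counts how often it is hit.
    live_calls.append(case)
    return {"case": case, "version": 2}

first = get_response(store, "new-campaign", fake_backend)   # records
second = get_response(store, "new-campaign", fake_backend)  # replays
```

    Running the suite once with `refresh=True` refreshes every recordable case after a format change; only the hand-edited responses still need manual updates.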

    Advantages and disadvantages of the method

    Advantages:
    • speed: the tests do not wait for a backend response, so they run faster;
    • stability: the tests depend only on the frontend working;
    • flexibility: complex cases become easy to reproduce.

    Disadvantages:
    • high implementation costs: you need to change the service code or emulate a backend, and build tools for managing data and reproducing errors.


    It is easy to see that the shortcomings of one approach are the advantages of the other, so it is reasonable to combine them. The first method is simpler: you can start with it, and it requires no additional development or support. The second method is more flexible and makes it possible to reproduce complex situations. By running two frontend versions on top of one backend, you can estimate the number of complex cases, the test execution time, and their stability; this will help you decide whether the more advanced approach is worth implementing.

    In both cases we used the back-to-back approach to test the frontend, but there is no need to limit yourself to the frontend. The main difficulty in back-to-back testing is ensuring that data matches across different versions of the packages. For most products the data lives in some kind of storage, so by sharing only the storage you can test the backend logic without writing additional tests. In exactly the same way you can test various APIs or other services that have no frontend at all.

    The approach is especially convenient at the very beginning of product development, when there are no autotests yet, or too few. By quickly covering the logic of computing and displaying results, you can focus on more complex functionality that cannot be verified this way, for example saving and changing data. Because the product changes actively, the results of different versions often diverge, which can lead to false failures of the back-to-back tests. So the first priority is a good report that makes it easy to analyze the results; this is especially relevant early in development, when a lot of new functionality keeps appearing and the look of the service changes greatly.
    Notably, problems with screenshot tests can often be solved by improving the environment rather than by complicating the tests, so before developing tests it is worth weighing the benefits of improving the environment; you may not have to write complex tests at all.
