Evolution of testing strategies - stop being a monkey
In this series of articles, I want to describe our experience building a fully automated testing strategy (with no manual QA) for the Nielsen Marketing Cloud web application, developed over the past few years.
At the Nielsen Marketing Cloud development center, we work without any manual testing of either new functionality or regressions. This gives us a number of advantages:
- manual testing is expensive; an automated test is written once and can be run millions of times
- each developer feels more responsible, since no one else will check their new features
- release time is reduced; you can ship new functionality every day without waiting for other people or teams
- confidence in the code
- tests serve as living documentation of the code
In the long run, good test automation gives better quality and reduces the number of regressions for old functionality.
But here the million-dollar question arises: how do you automate testing effectively?
Part 1 - How not to do it
When most developers hear "automated testing" they think "automation of user behavior" (end-to-end tests using Selenium), and we were no exception to this rule.
Of course, we knew about the importance of unit and integration tests, but "let's just go into the system, click the button, and thus check all parts of the system together" seemed like a pretty logical first step in automation.
In the end, the decision was made: let's write a bunch of end-to-end tests with Selenium!
That seemed like a good idea, but the result we ultimately arrived at was not so joyful. Here are the problems we encountered:
1. Unstable tests
At some point, when we had written a fairly large number of end-to-end tests, they started to fail. And the failures were intermittent.
What do developers do when they don't understand why a test fails randomly? That's right: they add a sleep.
It starts with adding a 300 ms sleep in one place, then one second in another, then three seconds.
At some point, we find in tests something like this:
sleep(60000) // wait for action to be completed
I think it's obvious enough why adding sleeps is a bad idea, right?*
* In case it isn't: sleeps slow the tests down, and because they are fixed delays, any operation that takes longer than planned will still make the test fail. In other words, the tests remain unstable and non-deterministic.
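The alternative to fixed sleeps is a polling wait with an explicit timeout: it returns as soon as the condition holds and fails with a clear error when it never does. A minimal sketch in Python (the `condition` callable is whatever check your test exposes; the names here are illustrative, not any specific framework's API):

```python
import time

def wait_for(condition, timeout=10.0, interval=0.2):
    """Poll `condition` until it returns a truthy value or `timeout` elapses.

    Unlike sleep(60000), this proceeds the moment the condition holds,
    and raises a descriptive TimeoutError instead of failing silently later.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(interval)
    raise TimeoutError(f"condition not met within {timeout}s")
```

A test would then write something like `wait_for(lambda: page.row_count() == 5)` instead of sleeping and hoping the rows have loaded.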
2. Tests are slow
Because automated end-to-end tests exercise a real system with a real server and database, running them is a rather slow process.
When something is slow, developers tend to avoid it, so no one runs the tests locally and code gets pushed to version control in the hope that CI will be green. But it is not. CI is red. And the developer has already gone home. And…
3. Test failures are not informative
One of the biggest problems with our end-to-end tests is that when one fails, we cannot tell what the problem is.
And the problem can be ANYWHERE: the environment (remember that ALL parts of the system are part of the equation), configuration, the server, the front end, the data, or simply a flaky test (not enough sleeps).
And the only thing you will see is a generic, unhelpful failure.
As a result, determining what really went wrong takes a lot of time and quickly becomes frustrating.
4. Data for tests
Since we test the entire system with all its layers together, it is sometimes very difficult to simulate or set up a complex scenario. For example, if we want to check how the user interface reacts when the server is unavailable or the data in the database is corrupted, that is not a trivial task at all, so most developers take the path of least resistance and test only the positive scenarios. And no one knows how the system behaves when problems arise (and they certainly will).
But this is just one example. Often you need to test a specific scenario, say, one in which a button should be disabled, and building a whole universe for that one scenario takes real effort. And that is not even counting the cleanup of data after the tests, so as not to leave garbage behind for the next ones.
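One way to keep scenario setup manageable and to guarantee cleanup is to wrap test data in a fixture that deletes whatever it created, even when the test fails midway. A sketch in Python, where the `db` object and its `insert`/`delete` methods are hypothetical stand-ins for whatever data-access layer the tests use:

```python
import contextlib

@contextlib.contextmanager
def seeded_campaign(db, **fields):
    """Create a campaign row for one test and delete it afterwards,
    so the scenario never leaves garbage for the next test.

    The `finally` block runs whether the test passes or raises,
    which is what fixed sleeps and ad-hoc cleanup scripts never guaranteed.
    """
    campaign = db.insert("campaigns", {"name": "test-campaign", **fields})
    try:
        yield campaign
    finally:
        db.delete("campaigns", campaign["id"])
```

A test then reads `with seeded_campaign(db, status="paused") as c: ...`, and the "universe" for the scenario lives and dies inside that block.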
5. Dynamic user interface
In fashionable single-page applications (SPAs), the user interface is very dynamic: elements appear and disappear, animations hide elements, pop-ups show up, parts of the screen change asynchronously depending on data in other parts, and so on.
This makes automating such scenarios much harder, since in most cases it is very difficult to define a condition that means the action succeeded.
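One pragmatic heuristic for "the action succeeded" in an asynchronously updating UI is to wait until the observed value stops changing. A sketch of that idea (the `read` callable is an assumed hook that samples the UI, e.g. a visible row count; the helper itself is illustrative, not from any real library):

```python
import time

def settled(read, samples=3, interval=0.2, timeout=10.0):
    """Return read()'s value once it has been identical `samples` times
    in a row, i.e. once the asynchronous region has stopped changing.

    This is a heuristic, not a proof of completion, but it is far more
    deterministic than a fixed sleep of arbitrary length.
    """
    deadline = time.monotonic() + timeout
    last, streak = object(), 0  # sentinel never equals a real reading
    while time.monotonic() < deadline:
        value = read()
        if value == last:
            streak += 1
            if streak >= samples - 1:
                return value
        else:
            last, streak = value, 0
        time.sleep(interval)
    raise TimeoutError(f"value never settled within {timeout}s")
```

For example, `settled(lambda: page.row_count())` waits out the intermediate states a table passes through while data streams in.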
6. Let's build our own framework
At some point, we began to realize that our tests repeated each other quite a lot, since different parts of the user interface use the same elements (search, filtering, tables, etc.).
We didn't want to copy-paste test code, so we started building our own test framework with generalized helpers that could be reused across scenarios.
Then we realized that those elements can be used in different ways, hold different data, and so on. So our generalized parts grew configurations, inheritance, factories, and other design patterns.
In the end, our test code became genuinely complex, and only a few people on the team understood how the magic worked.
Of course, some of these problems can be partially solved (running tests in parallel, taking screenshots on failure, etc.), but in the end this suite of end-to-end tests became very problematic, complicated, and expensive to maintain.
Moreover, at some point we had so many false failures and stability problems that developers simply stopped taking red tests seriously. And that is even worse than having no tests at all!
As a result, we have:
- thousands of end-to-end tests
- 1.5 hours per run (20 minutes when run in parallel)
- failures almost every day, and when the tests are red we have to test manually
- a lot of developer time spent reproducing and fixing test problems
- 30 thousand lines of end-to-end test code
The final result we arrived at is known as the "ice cream cone anti-pattern": the testing pyramid turned upside down, with many slow end-to-end tests on top and few fast unit tests at the base.
In the end, we decided to significantly reduce the number of our end-to-end tests and use something much better instead. But more about that in the next part of our story.