pushtaev October 2, 2014 at 10:13

Randomness in autotests

Introduction

When I wrote my first autotest a few years ago, it looked like this: in a loop, I fetched a random user from the database 100 times, performed the required operation on it, and checked that the result was what I expected. This seemed logical enough: testing on a single user is not enough and proves nothing.

A good deal of time has passed since then. I have worked on several projects in different languages and even changed teams. Today I can say with confidence: you should not use randomness in your autotests, except in a few cases that are discussed separately. Here is why.

Example

For the examples, I will use a very simple function that squares a number but keeps its sign. In Ruby, it would look like this:

def smart_sqr(x)
  # Square x but keep its sign, e.g. smart_sqr(-3) == -9
  x > 0 ? x * x : -(x * x)
end

It is easy to imagine what a test for such a function looks like. I just take some input values and compare the result of smart_sqr() on them with the expected values:

assert_equal(16, smart_sqr(4))

The question is how to choose those values.

"Benefits" of random values

Why did I choose random values when I wrote my first autotest? Why do programmers continue to use random values in their tests? Their (and my) logic is easy to understand: one experiment proves nothing, so the approach is purely probabilistic: the more different inputs are tested, the better.

That is not quite right. As a rule, in modern systems a formal proof of program correctness is (a) practically impossible and (b) not required. The whole program rests on the programmer's own hypothesis that it does what it should. This hypothesis cannot be proven, but with the help of tests I can reduce it to a set of simpler hypotheses that are easier to grasp informally.

What do I mean? For the function written above, it is in some sense obvious to me that it behaves the same way on all positive numbers. By "obvious" I mean exactly the kind of hypothesis on which my belief that the program works as it should is based (this is a common problem of all engineering disciplines: some things have to be done by eye). Without any such hypotheses, testing would be futile; only a formal proof (which, I repeat, is close to impossible) would help me.

Given this hypothesis, it is enough to check the function on a single positive number to be confident that it works correctly on all of them.

So I do not need random values to test my function. I just use all the boundary values (or rather, those that seem to me to be boundary values) and one value for each class of values that, in my opinion, behave the same way. In practice, for our function I would use the values -7, 0, and 13. Your opinion about the boundary conditions may differ from mine, and that is fine. For example, the number one behaves slightly differently: its square is equal to the original value.
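
As a rough sketch, a test built on these three values could look like the following. It assumes Minitest (bundled with standard Ruby installations) and repeats smart_sqr so that the snippet is self-contained; the class and method names are only illustrative.

```ruby
require "minitest/autorun"

# The function from the example, repeated so the sketch is self-contained.
def smart_sqr(x)
  x > 0 ? x * x : -(x * x)
end

# One test per class of values, plus the boundary between them.
# Class and test names are illustrative, not taken from a real project.
class SmartSqrTest < Minitest::Test
  def test_negative_number_keeps_its_sign
    assert_equal(-49, smart_sqr(-7))
  end

  def test_zero_is_the_boundary
    assert_equal(0, smart_sqr(0))
  end

  def test_positive_number_is_squared
    assert_equal(169, smart_sqr(13))
  end
end
```

Each run checks exactly the same three inputs, so a red test can only mean that the code changed.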

It may also seem to many programmers that it is pointless to run a test on the same values again and again, because the result cannot change. That is true, but the task of autotests is not to look for errors in code that already works; their task is to react to code changes. If you do not change the code, you do not need to rerun the tests at all.

Disadvantages of random values

If you use random values in tests, you may encounter a number of problems.

First, the test may behave erratically. This is unacceptable in theory and can also cause a lot of trouble in practice (for example, the system that runs your tests may decide that you broke everything and then fixed it, just because a test blinked red, and take some undesirable action). A test should respond to code changes and only to them. Test failures caused by a flaky environment are already a problem, so there is no need to aggravate it by adding tests that depend on the state of the random number generator.

Second, debugging such tests can be a serious problem. If the values on which the test failed were not saved, the result may be entirely useless.

Third, the test code may lose its clarity once random numbers are added to it. What should the expected result be for a random number in our example? The signed square of that random number? With this approach, the test code will exactly repeat the function code (oh, by the way, what a great idea: let's use it for verification!).
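
To make the third point concrete, here is a small sketch of what a random check tends to turn into. The helper names random_check and fixed_check are made up for illustration.

```ruby
# The function under test, repeated so the sketch is self-contained.
def smart_sqr(x)
  x > 0 ? x * x : -(x * x)
end

# Anti-pattern: the expected value must be computed from the random input,
# so the check ends up restating the implementation line for line.
def random_check(rng)
  x = rng.rand(-100..100)
  expected = x > 0 ? x * x : -(x * x) # same expression as in smart_sqr
  smart_sqr(x) == expected
end

# A fixed-value check states the expected result directly.
def fixed_check
  smart_sqr(13) == 169
end
```

If smart_sqr contained a mistake, the copy inside random_check would most likely contain the same one, and the check would still pass.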

But in my case ...

Yes, in some cases using random values can be useful. But you should be extremely wary of it. Some languages let you call private methods of a class, and that, too, is sometimes useful. But that is no reason not to think seven times, and then twice more, before using this opportunity.

I will give a couple of cases in which, in my opinion, you can turn a blind eye to the use of random values. This is not a complete list. If common sense tells you that you can, or even should, break the rule I stated above, break it.

You are looking for a bug. You know that there is an error in your code, and it sometimes shows up in production. You cannot find it by reasoning about the code. You can try to find it by brute force. To do this, you can use your automated testing system, which will check a certain number of random values on each run. Everything about this solution is fine, except that it is not really an autotest. It is just a bug-hunting script that you have integrated into your automated testing system for convenience. Think twice: you may not need this integration at all.

The source data is too large. It may happen that you do not much care what exactly the source data is, but its volume makes it hard to store. In this case you can generate it on the fly, although storing pre-generated data is still preferable. Automatic non-random generation is also an option.
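
Non-random generation can be as simple as deriving every record from its index. The sketch below is only an illustration: the generated_users helper and the record fields are invented for the example.

```ruby
# Hypothetical generator: builds n user-like records from the index alone,
# so every run produces exactly the same data without storing it anywhere.
def generated_users(n)
  (1..n).map do |i|
    { id: i, name: "user#{i}", age: 18 + ((i * 7) % 60) }
  end
end

users = generated_users(100_000)
puts users.size
```

Because nothing depends on a random number generator, a failure on, say, record 4217 can be reproduced by simply generating the data again.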

If you nevertheless decide to use random values in your tests, you need to save the seed with which the random number generator was initialized. If randomness is used in third-party modules or systems (for example, inside your database), this can cause serious technical difficulties.
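
A minimal sketch of saving the seed, using Ruby's standard Random class. The environment variable name TEST_SEED is an assumption made for this example, not a convention of any particular test framework.

```ruby
# Take the seed from the environment if one was given (TEST_SEED is a
# hypothetical variable name), otherwise pick a fresh one.
seed = (ENV["TEST_SEED"] || Random.new_seed).to_i

# Log the seed so that a failed run can be repeated later.
puts "random seed: #{seed}"

rng = Random.new(seed)
values = Array.new(5) { rng.rand(-1000..1000) }

# Replaying with the same seed yields the same sequence of values.
replay = Random.new(seed)
replayed = Array.new(5) { replay.rand(-1000..1000) }
puts values == replayed
```

With the seed in the log, a red run can be reproduced by starting the tests again with TEST_SEED set to the logged number.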

Finally

I would like to add that automated testing is, in my opinion, one of the most poorly researched and least formalized areas of programming. You can get diametrically opposite answers to any question and hear mutually exclusive opinions on any subject. Even the views of respected and recognized specialists can differ significantly. If you search right now for an answer to the question discussed in this article, you will find a wide range of opinions, from "randomness is necessary" to "randomness is unacceptable." I have tried to explain my ideas as clearly as possible, because a simple statement of principles for automated testing is long overdue.
