Old Dog Teaches New Tricks: Code Kata Using QuickCheck

    When I urge fellow programmers to write more self-tests for their code, they often complain that it is a complicated and dull job. And in some ways they are right: with classic unit tests you often have to write a lot of code to check each individual case of behavior. The quality of such testing also sometimes raises questions, especially in complex systems, where the trivial use cases pass with flying colors while the trickier scenarios that no one thought to write tests for blow up with unpleasant problems.

    I had heard about the testing method used in QuickCheck for a long time, but the final push to try it properly was missing. That push turned out to be a presentation by John Hughes, the author of this wonderful library.

    What Is the QuickCheck Approach?


    The essence of the approach is quite simple: instead of writing test examples, we define rules that describe the behavior of the system on arbitrary input data. The library itself generates a large amount of random input and checks whether the behavior of the code complies with the established rules. If it does not, the library shows us the example on which the test fails.
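
    To make this concrete, here is a minimal sketch of what such a rule might look like in Java (an illustrative example of my own, not part of the kata below; it uses the same JUnit-QuickCheck annotations this article relies on later):

    import static org.hamcrest.CoreMatchers.is;
    import static org.hamcrest.MatcherAssert.assertThat;
    import com.pholser.junit.quickcheck.ForAll;
    import org.junit.contrib.theories.Theories;
    import org.junit.contrib.theories.Theory;
    import org.junit.runner.RunWith;

    @RunWith(Theories.class)
    public class ReverseTest
    {
        // Instead of asserting reverse("abc") == "cba" for one hand-picked
        // example, we state a rule that must hold for ANY string: reversing
        // it twice gives back the original. The library generates the inputs.
        @Theory public void reverseTwiceIsIdentity(@ForAll String s) {
            String reversed = new StringBuilder(s).reverse().toString();
            String twice = new StringBuilder(reversed).reverse().toString();
            assertThat(twice, is(s));
        }
    }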

    Sounds promising? Quite.


    But how is an ordinary code monkey supposed to approach this miracle, someone who writes not in Haskell or Erlang but in more mainstream languages? I, for example, feel more comfortable programming in Java. No problem! Google almost immediately suggests that there is a corresponding JUnit plugin called JUnit-QuickCheck.

    The best way to try a new programming approach is to write something already familiar. So I took the classic Prime Factors Kata from Robert Martin. I recommend skimming it before delving into my article, because otherwise some points may not be clear.

    Let's Go


    First, create an empty project. In order not to bore the reader with a wall of XML, I will use Gradle for this. With it, the entire project description fits in a few lines:

    apply plugin: 'java'
    repositories {
        mavenCentral()
    }
    dependencies {
        testCompile (
            "junit:junit:4.11",
            "org.hamcrest:hamcrest-all:1.3",
            "org.junit.contrib:junit-theories:4.11",
            "com.pholser:junit-quickcheck-core:0.4-beta-1",
            "com.pholser:junit-quickcheck-generators:0.4-beta-1"
        )
    }
    


    None of the dependencies here is accidental. There is no need to explain why JUnit is present, but I will say a couple of words about the rest.

    • We will use Hamcrest to write beautiful, easy-to-read asserts.
    • JUnit-Theories is necessary because our plugin only works with it (an unpleasant bug has not yet been fixed in the version of theories built into JUnit).
    • The junit-quickcheck-core and junit-quickcheck-generators artifacts contain the classes we will use directly to generate test values.


    Following the principles of TDD, we start with the simplest possible theory, which lets us quickly make sure the feedback loop is functioning.

    import static org.hamcrest.CoreMatchers.is;
    import static org.hamcrest.MatcherAssert.assertThat;
    import com.pholser.junit.quickcheck.ForAll;
    import org.junit.contrib.theories.Theories;
    import org.junit.contrib.theories.Theory;
    import org.junit.runner.RunWith;
    @RunWith(Theories.class)
    public class PrimeFactorsTest
    {
        @Theory public void allOk(@ForAll Integer number) {
            assertThat(true, is(true));
        }
    }
    


    This simple trick can save you a lot of time. I have often seen people using TDD spend a long time creating a complex test, and when they finally launch it, discover that it cannot work because of some completely extraneous problem (dependencies were not downloaded or registered, the JDK is not installed, the project is misconfigured, the test code is miswritten, and a host of other ridiculous errors). This is always frustrating and breaks the working rhythm. It is especially hard on newcomers who are just getting a taste of TDD.

    Therefore, I always start with the simplest, most trivial, even moronic test, and I advise you to do the same. Just run it and check what you see when it passes and what you see when it fails. Then the system is ready for combat, and nothing will distract from the Red-Green-Refactor cycle.

    First working theory


    In order not to bother with the question of how to detect prime numbers (my code should do that!), I simply hard-code the known primes into an array. Obviously, because the list is limited, we are also forced to limit the range of numbers for testing. We will fix this later, when the time comes. And so as not to be distracted from the main point, I will no longer show imports in the code, only the code itself.

    @Theory public void primeNumberIsItsOwnFactor(@ForAll @InRange(minInt = 1, maxInt = 50) Integer number) {
        List<Integer> firstPrimeNumbers = Arrays.asList(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47);
        assumeThat(number, isIn(firstPrimeNumbers));
        List<Integer> factors = PrimeFactors.extract(number);
        assertThat(factors, hasItem(number));
    }
    


    We use the @ForAll and @InRange annotations from the JUnit-QuickCheck project to automatically generate random numbers in the specified interval. Then we additionally filter them with assumeThat, so that the subsequent code works only with the numbers I listed in the array. The difference between assumeThat and assertThat is that the first simply stops the test for the current example (and proceeds to the next one) if the number fails the check, while the second signals an error (and stops the test for all subsequent examples). Using assume in tests is more idiomatic than filtering values with conditional expressions, as the sketch below shows.
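
    For contrast, here is a sketch of the non-idiomatic variant (my own illustration, not code from the kata): filtering with a plain conditional. A silent early return skips the unwanted values just as well, but it hides the intent, and the runner gets no signal about how many examples were actually checked.

    @Theory public void primeNumberIsItsOwnFactor(@ForAll @InRange(minInt = 1, maxInt = 50) Integer number) {
        List<Integer> firstPrimeNumbers = Arrays.asList(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47);
        // Non-idiomatic: a conditional early return instead of assumeThat.
        // The test quietly does nothing for filtered-out values.
        if (!firstPrimeNumbers.contains(number))
            return;
        assertThat(PrimeFactors.extract(number), hasItem(number));
    }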

    This test fails at first (because we have no implementation of the extract method), but it is easy to fix. The solution that passes all the tests is trivial.

    public class PrimeFactors
    {
        public static List<Integer> extract(Integer number)
        {
            return Arrays.asList(number);
        }
    }
    


    Do not be surprised or indignant ahead of time. This code works completely in accordance with the specification, decomposing into prime factors any prime number not exceeding 50. To teach the code to work with other numbers, we just write a new theory.

    Putting Meat on the Skeleton


    What properties does the set of a number's factors have? Obviously, their product must be equal to the number itself.

    @Theory public void productOfFactorsShouldBeEqualToNumber(@ForAll @InRange(minInt = 2, maxInt = 50) Integer number) {
        List<Integer> factors = PrimeFactors.extract(number);
        Integer product = 1;
        for (Integer factor: factors)
            product = product * factor;
        assertThat(product, is(number));
    }
    


    This theory... doesn't fail! Indeed, as long as our code returns the number itself, it never will. Heck.

    Okay, another theory, this time more successful.

    @Theory public void everyFactorShouldBeSimple(@ForAll @InRange(minInt = 2, maxInt = 50) Integer number) {
        List<Integer> factors = PrimeFactors.extract(number);
        assertThat(factors, everyItem(isIn(firstPrimeNumbers)));
    }
    


    Every factor should be prime (that is, it should appear in our list of primes), so this theory starts failing reliably and regularly. And that is exactly what we need. Suppose, for example, the following error occurs:

    org.junit.contrib.theories.internal.ParameterizedAssertionError: everyFactorShouldBeSimple("10" )
        at org.junit.contrib.theories.Theories$TheoryAnchor.reportParameterizedError(Theories.java:215)
        at org.junit.contrib.theories.Theories$TheoryAnchor$1$1.evaluate(Theories.java:169)
        at org.junit.contrib.theories.Theories$TheoryAnchor.runWithCompleteAssignment(Theories.java:153)
        at org.junit.contrib.theories.Theories$TheoryAnchor.runWithAssignment(Theories.java:142)
        ...
    


    Let's write the simplest code that finds the divisors for this number:

    public class PrimeFactors
    {
        public static List<Integer> extract(Integer number)
        {
            if (number % 2 == 0)
                return Arrays.asList(2, number / 2);
            return Arrays.asList(number);
        }
    }
    


    Run the tests again. They automatically fail on a newly found value:

    org.junit.contrib.theories.internal.ParameterizedAssertionError: everyFactorShouldBeSimple("15" )
        at org.junit.contrib.theories.Theories$TheoryAnchor.reportParameterizedError(Theories.java:215)
        at org.junit.contrib.theories.Theories$TheoryAnchor$1$1.evaluate(Theories.java:169)
        at org.junit.contrib.theories.Theories$TheoryAnchor.runWithCompleteAssignment(Theories.java:153)
        ...
    


    Let's hack around that one too!

    public class PrimeFactors
    {
        public static List<Integer> extract(Integer number)
        {
            if (number % 2 == 0)
                return Arrays.asList(2, number / 2);
            if (number % 3 == 0)
                return Arrays.asList(3, number / 3);
            return Arrays.asList(number);
        }
    }
    


    We run the tests again and again, and each time they find a new value on which the implementation fails. At the same time, we cannot just return some prime number to make the test pass: if we did, the previous theory (which checks the product of the factors) would start to break. So we are forced to implement the correct algorithm step by step.

    Gradually (in fact, quite quickly) this series of hacks leads us to the first correct solution.

    public class PrimeFactors
    {
        public static List<Integer> extract(Integer number)
        {
            List<Integer> factors = new ArrayList<>();
            for (int divisor = 2; divisor <= 7; divisor++) {
                while ((number > divisor) && (number % divisor == 0)) {
                    factors.add(divisor);
                    number = number / divisor;
                }
            }
            factors.add(number);
            return factors;
        }
    }
    


    Of course, the words "correct solution" only mean that it stably passes all the tests at this stage. Obviously, it is not suitable for the general case.

    It is worth taking a break here and reflecting a little. A theory that selects counterexamples against the current code by itself turns out to be a very convenient thing. Working on the code turns into ping-pong with a robot that delivers fast, accurate, and tricky shots. There is no need to spend time inventing new code-breaking examples, because they are born on their own. Instead, you can concentrate entirely on the algorithm itself and polish it in flow mode. This partly explains why there was such a big leap between commits: the code was simply born too fast for the intermediate steps to harden into full-fledged commits.

    So far everything seems very cool. We wrote just a couple of theories, and they semi-automatically grew our algorithm. Isn't that beautiful? However, let's see what happens next.

    It's time to grow out of short pants


    The euphoria gradually passes, and the eye starts to notice the sharp corners we carefully skirted in the early stages. Our code works according to the specification, of course, but that specification is defined only for numbers from 2 to 50. Not much! On that interval you could do without a program and just work everything out in your head.

    Let's move on and raise the upper bound tenfold in all the theories!

    @Theory public void primeNumberIsItsOwnFactor(@ForAll @InRange(minInt = 2, maxInt = 500) Integer number) {
        ...
    }
    @Theory public void productOfFactorsShouldBeEqualToNumber(@ForAll @InRange(minInt = 2, maxInt = 500) Integer number) {
        ...
    }
    @Theory public void everyFactorShouldBeSimple(@ForAll @InRange(minInt = 2, maxInt = 500) Integer number) {
        ...
    }
    


    Suddenly a new problem arises: our theories are not aware that prime numbers greater than 47 exist (oops, no one introduced them to Euclid). We need to come up with a new way to recognize primes.

    We cheat a little (or is it fair play?) and use the ready-made primality test from the standard Java library. In order not to spoil the beauty and uniformity of the code, we wrap it in a corresponding matcher.

    @Theory public void primeNumberIsItsOwnFactor(@ForAll @InRange(minInt = 2, maxInt = 500) Integer number) {
        assumeThat(number, isProbablySimple());
        List<Integer> factors = PrimeFactors.extract(number);
        assertThat(factors, hasItem(number));
    }
    @Theory public void productOfFactorsShouldBeEqualToNumber(@ForAll @InRange(minInt = 2, maxInt = 500) Integer number) {
        ...
    }
    @Theory public void everyFactorShouldBeSimple(@ForAll @InRange(minInt = 2, maxInt = 500) Integer number) {
        List<Integer> factors = PrimeFactors.extract(number);
        assertThat(factors, everyItem(isProbablySimple()));
    }
    private Matcher<Integer> isProbablySimple()
    {
        return new BaseMatcher<Integer>()
        {
            @Override
            public boolean matches(Object item)
            {
                return (item instanceof Integer) &&
                    (BigInteger.valueOf((Integer) item).isProbablePrime(5));
            }
            @Override
            public void describeTo(Description description)
            {
                description.appendText("prime number");
            }
        };
    }
    


    Now our code fails on the factorization of large numbers. Time to fix it!

    public class PrimeFactors
    {
        public static List<Integer> extract(Integer number)
        {
            List<Integer> factors = new ArrayList<>();
            for (int divisor = 2; divisor <= number; divisor++) {
                ...
    


    We replace the old loop bound (7) with number, and everything seems to work again.

    Only a little remains: push the boundaries of the tests even wider and enjoy the result. And then a sudden surprise awaits us...

    Facing harsh reality


    The reality looks like this:

    @Theory public void primeNumberIsItsOwnFactor(@ForAll @InRange(minInt = 2, maxInt = Integer.MAX_VALUE) Integer number) {
        ...
    }
    @Theory public void productOfFactorsShouldBeEqualToNumber(@ForAll @InRange(minInt = 2, maxInt = Integer.MAX_VALUE) Integer number) {
        ...
    }
    @Theory public void everyFactorShouldBeSimple(@ForAll @InRange(minInt = 2, maxInt = Integer.MAX_VALUE) Integer number) {
        ...
    }
    


    As soon as we increased the upper bound of the tests from 500 to Integer.MAX_VALUE (2^31 - 1), the tests started taking unreasonably long: about a minute each. What is the matter? What's wrong?

    An unexpected side effect of QuickCheck-style tests is their sensitivity to the speed of the code under test. Although, if you think about it, this is quite logical: if our code is not optimized and runs slowly, a hundred calls to it make the non-optimality a hundred times more visible. In "classic" unit tests this slowdown would be barely noticeable, but here it shows itself in all its glory.

    What do we do when we need to find a bottleneck in the code? There are two options: either we pick up a profiler and start taking measurements, or we hunt for the problem by careful inspection.

    However, there is nothing special to profile in our code; everything is in plain sight. The problem is that we run the loop idle far too long, burning electricity in vain. Everyone who is familiar with the factorization algorithm remembers that it is enough to check factors not exceeding the square root of the number: if number = a * b with a <= b, then a <= sqrt(number), so any composite number has a divisor no larger than its square root. And whoever does not remember can go ask Uncle Bob.

    Let's roll out the fix. Again we change the upper bound of the loop, this time to Math.sqrt(number).

    public class PrimeFactors
    {
        public static List<Integer> extract(Integer number)
        {
            List<Integer> factors = new ArrayList<>();
            for (int divisor = 2; divisor <= Math.sqrt(number) + 1; divisor++) {
                ...
    


    How did this affect the run time? The tests are quick again, and the difference is truly impressive: for a prime near Integer.MAX_VALUE the loop now makes at most about 46,000 iterations instead of roughly two billion.

    Now everything is all right! All tests pass, the code looks neat, interesting experience has been gained; time to write an article for Habr? And then another thought creeps into my head...

    Test your tests


    Stop, my friend, I tell myself: have you written the loop boundary condition correctly? Is it necessary to add one to the root of the number, or is that superfluous?

    It seems a trifling question. But we have tests that run on a hundred values each! They will show who is wrong here.

    Subtract "+1" at the top of the loop ( divisor <= Math.sqrt(number);) and run the tests.

    Great, they pass!

    We subtract one more, just like that, just in case (divisor < Math.sqrt(number);).

    Tests pass again!

    What?

    And here I had to think again. Let's aggravate the situation even further.

    public class PrimeFactors
    {
        public static List<Integer> extract(Integer number)
        {
            List<Integer> factors = new ArrayList<>();
            for (int divisor = 2; divisor < Math.sqrt(number) - 2; divisor++) {
                ...
    


    I wrote obviously wrong code (it will not even find the factors of 9: the loop condition divisor < sqrt(9) - 2 = 1 is false from the very start), but the tests say that everything is fine. I run them again, and again they say everything is fine. I run them again, and time after time they pass. Failures occur very rarely, and the counterexamples to my broken algorithm that the tests occasionally do find are not saved for future runs.

    What is the reason for this test behavior?

    By increasing the test bounds to Integer.MAX_VALUE, we were able to find and fix the performance problem, but we fell into a new trap. The catch is that with these range settings the tests mostly generate large numbers (a uniform distribution is used for generation), while the defect we introduced into the code only manifests itself on squares of primes (I hope this needs no explanation), which are vanishingly rare among large numbers.
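
    A quick back-of-the-envelope estimate (my own, with approximate numbers) shows just how rare these counterexamples are under a uniform distribution:

    public class CounterexampleRarity
    {
        public static void main(String[] args)
        {
            // The defect is caught (at least) on squares of primes p * p <= Integer.MAX_VALUE,
            // i.e. p <= sqrt(2^31 - 1), which is about 46340. By the prime-counting
            // approximation pi(n) ~ n / ln(n), there are roughly 4300 such primes.
            double range = Integer.MAX_VALUE;                      // ~2.1e9 candidate values
            double maxPrime = Math.sqrt(range);                    // ~46340
            double primeSquares = maxPrime / Math.log(maxPrime);   // ~4300 dangerous squares
            double perValue = primeSquares / range;                // ~2e-6 per generated value
            // With 100 random values per run, one run catches the bug with probability ~2e-4.
            double perRun = 1 - Math.pow(1 - perValue, 100);
            System.out.printf("per value: %.1e, per run of 100: %.1e%n", perValue, perRun);
        }
    }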

    Unfortunately, I could not come up with anything more successful than to cheat a little again and make a copy of the existing specification, once more with narrow bounds.

    @Theory public void everyFactorShouldBeSimple(@ForAll @InRange(minInt = 2, maxInt = Integer.MAX_VALUE) Integer number) {
        List<Integer> factors = PrimeFactors.extract(number);
        assertThat(factors, everyItem(isProbablySimple()));
    }
    @Theory public void everyFactorShouldBeSimpleEspeciallyForSmallNumbers(@ForAll @InRange(minInt = 2, maxInt = 200) Integer number) {
        everyFactorShouldBeSimple(number);
    }
    


    It looks clumsy, but at least it lets us find the exact upper bound the loop must run to (divisor <= Math.sqrt(number)).

    It's time to take stock and bring together all the discoveries this (seemingly) simple example handed us.

    What did we get as a result


    Even a single experiment in an unfamiliar area can bring many discoveries. I will try to collect all the notable features of the QuickCheck approach in one place and evaluate them.

    Laconic specification



    Indeed, there is such a thing. I had to write only three theories, each of which tested one property of the algorithm. This is noticeably less than the dozen or so conventional unit tests that appear in the classic version of the kata. Let's chalk this up as a definite plus of the technique.

    The need to carefully formulate verifiable properties



    For theories to work well, you have to come up with good verifiable properties that are invariant with respect to the input parameters. Sometimes this can be genuinely hard. It may even seem that you need to fully implement the algorithm under test inside the test code.

    In the example above we were lucky to have the isProbablePrime method, which uses a fast heuristic algorithm for approximate primality checking. But if such an algorithm did not exist, what would our options be? After all, by definition a prime number is a number with no divisors, and to check a number for primality you have to try to decompose it into divisors, as the sketch below illustrates.
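
    Without isProbablePrime, the fallback would look something like this sketch: a trial-division check written right in the test, uncomfortably close to re-implementing the very code under test.

    // A naive primality check by trial division. Note that it is essentially
    // a slice of the algorithm we are trying to test, which is exactly the
    // dilemma described above.
    private static boolean isPrime(int n)
    {
        if (n < 2)
            return false;
        for (int divisor = 2; divisor <= Math.sqrt(n); divisor++)
            if (n % divisor == 0)
                return false;
        return true;
    }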

    This is probably the hardest part of QuickCheck-style testing. Further research will be needed to understand how difficult it is to create good invariants for use in theories.

    Slow Code Sensitivity



    On the one hand, this is good, because it can immediately hint that our code is suboptimal. On the other hand, if the code fundamentally cannot be made much faster, you will either have to put up with slow tests or reduce the number of random values used as test parameters. And if we reduce the number of random values, our confidence that the tests will find potential defects drops correspondingly.
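
    If I read the library's API correctly, in JUnit-QuickCheck this trade-off is controlled by the sampleSize attribute of @ForAll (100 by default), along these lines:

    // Fewer generated values means faster tests, but a lower chance of
    // stumbling onto a defect.
    @Theory public void productOfFactorsShouldBeEqualToNumber(
            @ForAll(sampleSize = 20) @InRange(minInt = 2, maxInt = Integer.MAX_VALUE) Integer number) {
        ...
    }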

    I think you have already guessed that, for this reason, using QuickCheck for end-to-end testing might not be the best idea. Although, if you really want to, you can try.

    Insensitivity to boundary conditions



    Perhaps this is a specific quirk of the JUnit-QuickCheck library and the situation is better in other languages. Or perhaps it is a quirk of the particular example we chose for the exercise. Either way, it shows that you should not blithely rely on the random values the library helpfully picks for you. You still have to think hard and double-check that the code you have written is correct.

    QuickCheck can also be used for TDD!



    It seems quite feasible, although it feels different. Because there are fewer theories (and each of them covers more cases), it is easier to build the chain of test methods that leads to working code. On the other hand, this can turn into trouble if you have to take too big a step to make the code pass a freshly added theory. But people run into the same problem in classic TDD (and either find ways to solve it or start fearing TDD on principle).

    It is quite possible that a combination of classic test cases and QuickCheck-style parameterized theories will work well for testing code. I will definitely continue my research in this area and share any interesting findings.
