ReactJS Testing: How Deep the Rabbit Hole Goes

    Hello everyone, my name is Yaroslav Astafiev, and today I would like to take you on a guided tour of testing in ReactJS. I will not dwell on the difficulty of testing web applications with specific libraries (following the principle "only bad code is hard to test"); instead I will try to broaden your horizons. So in this article React is more of an excuse to survey testing approaches, a starting point that brings hipsters and technology together. It would be more accurate to say that we will talk about the principles of testing in general, with illustrations in ReactJS (and not only).

    If you consider yourself a testing guru, skip the first half of the article — it covers the basic principles of testing. If the second part reveals nothing new for you either, come work with us and teach us how it's done.



    If the introduction did not cause an attack of synesthesia, welcome under the cut.

    Unit tests


    Jest is a library for testing JavaScript. Don't like it? Take another one — Ava, say. Everything is simple here: press the magic button and make sure that a certain value has changed from 0 to 1:

    import React from "react"
    import { MyButton } from "../src/components/dummy/myButton"
    import renderer from "react-test-renderer"
    test("MyButton has onPress fn", () => {
      let x = 0
      const instance = renderer
        .create(<MyButton handlePress={() => x++} />)
        .getInstance()
      expect(instance.handlePress).toBeDefined()
      expect(x).toBe(0)
      instance.props.handlePress()
      expect(x).toBe(1)
    })
    

    Now you have all the skills needed to test the magic button. Unfortunately, these skills have little to do with real life. A React component can rarely be isolated that well, and isolation is one of the main principles of unit testing. Somehow we need to strip out every component that participates in the render method, except the one under test. And there is a solution: smart people came up with a mock API for this.

      // initJest.jsx file
      global.fetch = require('jest-fetch-mock')  // global mock for fetch
      const API = require('mockAPI')             // custom mock
      const MockDate = require('mockdate')       // static mock for Date

      describe("Date() Tests", () => {
        beforeEach(() => {
          MockDate.set("2011-09-11T00:00:00.000Z")
        })
        afterEach(() => {
          MockDate.reset()
        })
        // smth ...
      })
    

    The essence of a mock is simple: everything that is not ours becomes a Mock / Stub / Fake / Dummy / Spy, etc. We "emulate" the real behavior of a component, which may have complex logic, with pre-prepared test data, and take it on faith that all emulated components work perfectly as long as they are given correct parameters.

    There is a jest-fetch-mock library for Jest that lets you define mocks globally. If you don't like this option, you can mock each component you need in each test separately.

    A pure function always returns the same answer for the same input. Accordingly, if the components in our business logic contain "impure" functions, in unit tests they will also need to be "purified" (though for unit tests this rule is not absolute). The classic example is a React component that displays the current date and time in the format you need: the date will be different on each test run, and you won't be able to write correct unit tests. For anyone who disagrees, complicate the example: your component must display the date in relative format and highlight dates older than a year in red.

    Accordingly, if you have dynamic things that depend on time / weather / pressure, a mock redefines the call you need so that there is no dependence on third-party factors. That way you don't have to wait for February 29th to catch a failing test.

    Unit Test Rules


    The problems described above and the methods for solving them show what happens with an informal approach to testing: each test runs however it wants. In my opinion, it is enough to follow a few important rules of unit testing:

    • Determinism
    • Isolation
    • Independence from external factors
    • Common sense

    Rule one: all tests must be deterministic. If I wrote a test on Windows, it should also run on a Mac and produce the same result. Windows developers love to forget that file names on *nix systems are case sensitive. And you are lucky if the tests fail in CI, and not the application in production.

    The next rule is isolation: we mock every component that is not under test. If that is hard to do, it's time to refactor.

    Last but not least: if there is data your application receives at runtime, it also needs to be pinned down. This can be the locale, window size, date format, floating point number format, etc.

    Integration tests


    When to start writing integration tests is, in my opinion, an open question, and each team or product should make that decision based on its own internal factors.

    You can take a formal approach: reach 80% unit test coverage (without revising poorly written tests — only requiring new or changed code to be covered), then conduct a full audit and refactoring of all written tests with an analysis of typical errors, formalize internal rules for writing tests, and run such raids once a year. If after all the actions described above your unit test coverage is still 80%+, then you have a mature team — or you are simply not critical enough of your code and tests. If coverage dropped, get it back to 80% and move on to writing integration tests. Or you can be less formal and simply follow common sense: for example, write a test for every bug that has recurred n times, or invent something else — say, toss a coin.

    The second open question: which tests count as integration tests? Perhaps we'll leave it unanswered.



    In integration tests, we test the work of not one component, but several components together. There are no rules, but common sense tells you:

    • do not test how it is rendered, where it is called, and when it will all end;
    • do not test the work of ReactJS itself — if it doesn't work, nothing will help you;
    • do not test how React's state machine works;
    • do test business logic / the data model / borderline situations / whatever breaks often.

    In such tests, you should not go into details. They run noticeably longer and are harder to write, so don't get carried away covering every minor case in the application logic. It is expensive in terms of infrastructure rental, and long in terms of development and execution time. Someone will spend their life on this routine while the manager and the users sadly wait for new features, and...

    Another reason why you should not try to test everything is false security (I tried to cover the most important points above). Every team should read about Type I and Type II errors and the Neyman–Pearson lemma, and assess their risks in terms of money, parrots, or whatever measure of truth the team accepts.

    But there are exceptions to this rule, as to any other. Forget everything said above:

    • When testing dynamic dependencies: you don't know which component will arrive at runtime. You render it, but it may not arrive at all — nobody has canceled the circuit breaker — or the wrong component, or a broken one, may arrive. You need a test for that too. So in this case we write an integration test, render, and check that everything works and nothing falls over.
    • With pixel perfect (well, you understand): development will have to render and diff screenshots, and every time the component library is bumped to a new version, update the reference screenshots. Because it's easier to hire a new designer for the reconciliation than to fix it.

    Snapshot tests


    The simplest integration test is a snapshot:

    1. We take a component, render it
    2. In the render we write console.log (this)
    3. Copy the reference data from the console
    4. Compare
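    The mechanics of those four steps boil down to a few lines of plain JS. A toy illustration of the idea only — this is not how Jest or Storybook actually store snapshots (they serialize to files and have review/update workflows):

```javascript
// Toy snapshot store: the first run records the serialized render,
// later runs compare against the stored reference.
const snapshots = {}

function matchSnapshot(name, renderedTree) {
  const serialized = JSON.stringify(renderedTree, null, 2)
  if (!(name in snapshots)) {
    snapshots[name] = serialized // first run: record the reference
    return true
  }
  return snapshots[name] === serialized // later runs: compare
}

// First call records; an identical tree passes; a changed tree fails
// and the snapshot must be reviewed (or consciously updated).
console.log(matchSnapshot("button", { type: "button", label: "OK" }))     // true
console.log(matchSnapshot("button", { type: "button", label: "OK" }))     // true
console.log(matchSnapshot("button", { type: "button", label: "Cancel" })) // false
```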

    If you want to get a little fancier, I advise playing with the Storybook library. It is a library for snapshot tests which, incidentally, absorbed the idea of Styleguidist — building your own design system out of React components.

    The first rule of snapshot tests: test the data. It should always be static, pinned down and independent. The second rule: a broken snapshot test does not mean everything is bad. If it is red, it is not a fact that everything is broken — there are many ways for the layout to stay the same while the DOM tree differs. So we ignore whitespace, attributes and keys, or we simply don't test what requires too much maintenance time. Or we mark by hand what is broken and what is not, fix the genuinely broken tests, and rerun Storybook in snapshot-update mode — the mode in which the run renders the components and inserts the snapshot as the reference value in the expect condition.

    xState and React Automata


    ReactJS is a tricky thing. At first the library seems cool: you make three components and a class — the state machine seems to work and the code is beautiful. Then you write ReactJS for six months, look at the code — and it's some kind of nonsense. You can't tell where the crutches are, where the routes, where the state, where the caches... Then you think: fine, I'll do as Facebook advises — bolt on HOCs, hooks, something else — and suddenly catch yourself trawling hh.ru for a project with React development from scratch, where this time everything will surely be done beautifully...

    Everything gets so complicated that it is impossible to understand how it all works. And it works — until someone complains. We fix it here, and it breaks over there... One way out is a state machine: a deterministic set of application states and the allowed transitions between them. And, as they say in narrow circles, you haven't really written React until you've rolled your own state machine.

    It is worth mentioning xState, a deterministic state machine for JavaScript. You can build a very cool UI on xState — a link to the corresponding talk can be found in the documentation of the React Automata library. React Automata, in turn, is the library that adapts xState to ReactJS. On top of that, it can generate tests for the states of a state machine.

    If the first checkbox is true, the green light is on. If the second is false, a gray dog is drawn — and React Automata generates tests for all four combinations of these parameters and validates the dog and the lights. True, at some point you will want to cut half of these tests, though at first you will be delighted... In any case, it is a useful outside view of your tests, and it reminds me a lot of the idea of testing with deterministic chaos.
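    The dog-and-lights idea can be sketched in plain JS — this is my own illustration of exhaustive state validation, not React Automata's actual API, and the UI strings are invented:

```javascript
// Two boolean props -> four deterministic UI states, written as an
// explicit table instead of conditionals scattered across components.
function render(firstTick, secondTick) {
  if (firstTick && secondTick) return "green light"
  if (firstTick && !secondTick) return "green light + gray dog"
  if (!firstTick && secondTick) return "gray dog"
  return "nothing"
}

// Exhaustively check every combination, the way React Automata derives
// one test per reachable state of the machine.
for (const first of [true, false]) {
  for (const second of [true, false]) {
    if (typeof render(first, second) !== "string") {
      throw new Error(`unhandled state: ${first}, ${second}`)
    }
  }
}
console.log("all 4 states validated")
```

With two booleans this is trivial; the point is that a generator keeps the exhaustiveness for you as the state space grows.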

    Cypress


    With snapshots more or less figured out, we can move towards end-to-end. All our products are internal, so on-premises solutions are our only option. I hope you do have the opportunity to use cloud solutions — then a thing like Cypress will come in handy.


    Previously, you chose a testing framework, took libraries for assertions and libraries for comparing complex XML. Then you selected a driver and a browser to run it all, launched it and wrote a bunch of tests. All of this eats infrastructure: you need to stuff everything into Docker, then bolt on some tool that watches the tests over time, analyzes them, shows what's wrong...

    The guys from Cypress did all of this for you. They solved several problems: setting up the working environment, writing code, running and recording tests. If a test is broken, it can show a screenshot highlighting what broke. True, it does not work for mobile — but there is, for example, Detox. That one, admittedly, has a steep entry threshold: you will have to tailor your application to it, rewrite a bunch of files, etc. But if you want to, it's possible.

    Soft tests


    There are alternative kinds of tests that cannot quite be called good tests either. I call them soft tests. Linters, for example. They are mainly used by front-enders (though sometimes the inspiration even reaches the Java folks). There are many linters: ESLint, JSHint, Prettier, Standard, Clinton. I advise Prettier: fast, cheap and cheerful, easy to configure, works out of the box.

    If you want to go deeper, you can configure ESLint. A classic example of a plugin for it: when a customer finds comments with obscene expressions in your code, he usually swears. Tricky developers write comments in Russian so the customer won't guess. But the customer guessed, used Google Translate, and found out everything the developers think about him. The way out of that situation is unpleasant, possibly with a loss of money or customers. For such cases you can always develop an ESLint plugin that finds the "native Russian" words in your source code and says: "Oh, sorry — commit rejected."

    The beauty of linters in JavaScript is that they can be put on a pre-commit hook. Personally, I don't like that Prettier doesn't store history (although, on the other hand, it doesn't accumulate technical debt either). From the point of view of static analysis, such tests are poor, because you cannot see the project's dynamics — how many errors there were yesterday, or the day before. This problem is solved in SonarQube, which also has a cloud offering. It is a static code analyzer that stores run history and works with a couple of dozen languages, even including PHP (who needs the iron hand of static analysis more? :)). In it you can watch the dynamics of your vulnerabilities, bugs, technical debt and more.

    Complexity tests


    Front-enders use linters because they want beautiful indentation. Complexity is another soft test that can help check the quality of your code.


    The picture showed how I tried to follow the thought of a junior who made a pull request. In such cases I suggest demolishing everything and building a straight road.

    Complexity tests follow a very simple principle: they compute the cyclomatic complexity of an algorithm. For example, they read a function and find 10 variables in it — ten is probably difficult, so score complexity 1 for each. For each loop give 3 points, for a nested loop 9, for a loop nested two deep 27. Add it all up and say: cyclomatic complexity is 120, and a human can grasp only 6. The point of the score is to say, however subjectively, when you need to refactor your source code: break it into pieces, extract new functions, and the like. And yes, SonarQube can do this too.

    Alternative tests


    In my world, alternative tests also count as soft tests. Solidarity is a very useful thing for onboarding — and not only for front-end, though it is written in JavaScript. It lets you test the working environment. Previously you had to write huge instructions listing the programming language versions, the libraries and the required software just to get started, and then keep it all up to date. Now you can say: "Here is your computer, here is the source code. Until Solidarity passes — don't come back." At the same time the entry threshold is low. Solidarity can fingerprint a configured working environment and lets you add very simple rules to validate more than just installed software. Because it is infuriating when people come to you with the words: "Oh, I'm sorry, something doesn't work for me there, can you help?.."

    The second use case (actually the main one) is testing the production environment, against the words: "The unit tests passed in CI, of course, but the CI and PROD configurations differ significantly, so no guarantees...". The goal of the library is very simple: to fulfill the first rule of continuous integration — everyone should have the same environment. Clean, isolated, with no side effects... or at least fewer of them. Whom am I trying to fool?

    API Call


    It happens that developers are split into several teams: some write the frontend, others the backend. A purely fictional situation that could never happen in a real team: everything worked yesterday, and today, after two releases — front and back — everything broke. Who is guilty? I, as a person with a backend background, always say: the front-enders. It's simple — they messed up somewhere, as always. At some point the front-enders come and say: "We read the post at the link, went through the guide, and learned how to mock your REST API. And you won't believe it — it has changed..." In general, if your backend folks are not friends with Swagger, OpenAPI or other similar solutions, it's worth taking note.

    Performance js


    And finally, JS performance tests. No one tests JS performance except browser makers. As a rule, they all use Benchmark.js. "Oh, we've been sharpening our Explorer for 18 years so that it renders a billion-by-billion table faster than Chrome." Who needs such a table anyway?

    If you want to do performance tests, it is better to go the other way: test end-to-end and watch how it behaves. The user forms a perception of how the application works as a whole; the user does not care that the problems are on the backend side.

    War story # 1


    Now an example from life. One day the bosses come to us and say: "Your front works very poorly, it barely loads. Something needs to be done about performance, people are complaining." We think: now we'll spend two weeks on performance tuning, digging through why-did-you-update logs, picking at tree shaking, cutting everything into chunks with dynamic loading... And what if it doesn't pan out — we break it all, or just make it worse? An alternative solution is needed. We open the browser and look: 7.5 MB, 2 seconds. Right.



    Let's turn on Nginx gzip:



    Nginx lets you adjust the compression level, so let's try:
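    For reference, a typical snippet might look like this (the values are illustrative, not the ones from our project — higher `gzip_comp_level` values cost more CPU for diminishing size gains):

```nginx
gzip on;
gzip_comp_level 5;      # 1..9; around 4-6 is the usual sweet spot
gzip_min_length 1024;   # don't bother compressing tiny responses
gzip_types text/css application/javascript application/json image/svg+xml;
```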



    A 25% performance gain. But it's too early to stop. Take a look at the little designer logo in the corner. It remains very beautiful even stretched — but why do we need it at that size?



    Here is what we got after optimizing a single picture. You can judge the weight of the logo yourself. Finally, we come to the customer and say: "The first load is not as important as the second one. Let's enable forced caching:
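    In Nginx, forced caching for static assets can look like this (again an illustrative sketch; the one-year lifetime assumes fingerprinted file names, so a new build means a new URL):

```nginx
# Long-lived caching for fingerprinted static assets: the bundle name
# changes on every build, so caching "forever" is safe.
location ~* \.(js|css|png|woff2)$ {
    expires 1y;
    add_header Cache-Control "public, immutable";
}
```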



    ...Everyone is happy, everyone rejoices!" Except the user, of course.

    As a result, we decided to audit size more often. Gzip, fonts, pictures, styles — the places where hardly anyone looks, but where there is much to gain.

    Madge and updtrJS


    Next step: dependency auditing. Madge is a tool that analyzes your code and says: this class is connected to that one, and so on. If everything goes through one component and that component breaks, nothing good will come of it. Madge is a great visualization tool, but suitable only for manual exploration. It has a circular option that searches for all the circular dependencies in your project. If there are any, that's bad; if there are none — they just haven't been written yet.

    The pain of legacy frameworks and libraries is almost solved by updtrJS.
    Do you have 70 thousand lines of code? Trying to move from React 13 to React 16? Updtr will not do the move for you, but it will tell you which library versions you can move to without serious consequences. It also helps developers stay in trend and keeps dependencies up to date. If you have good test coverage, I recommend it.

    Static types


    Use static typing in JS, because dynamic typing in JS is not a feature at all — it's an abyss. Flow, TypeScript, ReasonML — take your pick.

    Chaos testing


    The essence of chaos testing is simple: launch the browser and start poking everything that can be poked, typing everything that can be typed into every field, and so on — until it breaks. At one time such a tool was written at Netflix to get as many exceptions on the backend as possible. The idea: "go on, hammer at the front, and we'll watch the errors on the back." If there are errors, we fix them. Implementations: Gremlin.com and Chaos Monkey.

    In React this acquired a new meaning starting with version 16, which added componentDidCatch. If your front has crashed and thrown an exception, you can catch it and bail out gracefully. Although there are more elegant ways to solve this problem.

    As part of the Big List of Naughty Strings project, volunteers collect all the bad strings (and not only strings) that can lead to breakage or other unexpected results — they can significantly enrich your test suite. For example, if you post a "blank of zero length" on Twitter, it responds with an internal server error; what, then, can we say about our homegrown crafts?
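    The way such a list is used is simple enough to sketch. Below, `naughty` is a tiny hand-picked sample (the real project has thousands of entries), and `parse` is a deliberately naive stand-in for your own function under test:

```javascript
// Feed "naughty" inputs to a function and collect the ones that throw,
// instead of asserting a specific result for each input.
const naughty = ["", "\u0000", "﷽", "'; DROP TABLE users;--", "𠜎", "NaN", " "]

function survives(fn, inputs) {
  const failures = []
  for (const input of inputs) {
    try {
      fn(input)
    } catch (e) {
      failures.push({ input, error: String(e) })
    }
  }
  return failures
}

// Example target: a naive parser that chokes on the zero-length case.
const parse = (s) => {
  if (s.length === 0) throw new Error("empty input")
  return s.trim()
}

console.log(survives(parse, naughty).length) // 1: only the empty string throws
```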



    War story # 2



    The probability of one of my tests failing was one in ten billion. And that event happened.

    def generateRandomPostfix(String prefix) {
        return prefix + "-" + Math.abs(new Random().nextInt()).toString()
    }
    def "testCorrectRandomPostfix"(){
        given:
            def prefix = "ASD"
        when:
            def result = generateRandomPostfix(prefix)
        then:
            result?.matches(/[a-zA-Z]++-[0-9]++/)
    }
    

    It was a big and complicated test; in the code above I tried to leave only the main idea. It was not easy to figure out, but we found the problem.

    Let's go through it step by step. We have the prefix ASD and a function that generates a random postfix and appends it to the prefix with a hyphen; the postfix is strictly numeric. Then a regular expression checks the correctness of the generated result (I forgot to mention: the test is written in Groovy). Now, in Java the smallest integer is larger in absolute value than the largest integer by exactly one. Therefore the absolute value of the smallest integer is... the smallest integer itself — nobody has canceled integer overflow.
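    JS numbers are doubles, so `Math.abs` alone won't show this, but the Java behavior can be emulated by forcing 32-bit arithmetic with `| 0` (the helper name is mine):

```javascript
// Java's Math.abs(Integer.MIN_VALUE) is still negative, because
// +2147483648 does not fit in a 32-bit signed int and wraps around.
const INT_MIN = -2147483648

function abs32(n) {
  return Math.abs(n) | 0 // "| 0" truncates to 32-bit signed, like Java's int
}

console.log(abs32(-5))      // 5
console.log(abs32(INT_MIN)) // -2147483648: the "impossible" negative abs
// So the generated id is "ASD--2147483648" and the [0-9]+ postfix
// regex fails — roughly once in four billion runs.
```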





    Above is the code that killed the browser. Below is the fix. Can you spot even one difference? It's the "+" before fromX. The bottom line: at some point the backend format changed, and the XML began sending strings.

    No one is safe from this.

    Parameterized Chaos


    Sometimes parameterized chaos can save you. TestCheck.js is one of the coolest libraries in terms of chaos; it supports probably every test framework. And there is a bonus, Flow-To-Gen: if you described your data types in Flow, it can generate tests for you.

    check(
      property(
        gen.int, gen.int,
        (a, b) => a + b >= a && a + b >= b
      )
    )
    

    { result: false,
      failingSize: 2,
      numTests: 3,
      fail: [ 2, -1 ],
      shrunk:
       { totalNodesVisited: 2,
         depth: 1,
         result: false,
         smallest: [ 0, -1 ] 
       } 
    }
    

    TestCheck has many out-of-the-box generators, but you can write your own. The property above checks that the sum of two numbers is always greater than or equal to each of them. Naturally, this is not true — a simple counterexample is 0 and -1. The library detected the error on the values 2 and -1, but did not stop there: it shrank the case down to the most primitive values that still reproduce the error. Very cool!
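    The core loop is small enough to sketch by hand. This is a toy version with a fixed-seed generator so it stays deterministic, and a naive "step toward zero" shrinker — TestCheck's real shrinking is tree-based and much smarter:

```javascript
// Minimal property-based check: generate random int pairs, look for a
// counterexample, then "shrink" it toward zero while it still fails.
function lcg(seed) {
  // Park-Miller linear congruential generator: deterministic "randomness".
  let s = seed
  return () => (s = (s * 48271) % 2147483647)
}

const property = (a, b) => a + b >= a && a + b >= b

function check(prop, numTests = 100, seed = 42) {
  const rand = lcg(seed)
  for (let i = 0; i < numTests; i++) {
    const a = (rand() % 100) - 50
    const b = (rand() % 100) - 50
    if (!prop(a, b)) return shrink(prop, [a, b])
  }
  return null // no counterexample found
}

function shrink(prop, failing) {
  let smallest = failing.slice()
  let changed = true
  while (changed) {
    changed = false
    for (let i = 0; i < smallest.length; i++) {
      if (smallest[i] === 0) continue
      const candidate = smallest.slice()
      candidate[i] -= Math.sign(candidate[i]) // step one value toward 0
      if (!prop(...candidate)) {
        smallest = candidate
        changed = true
        break
      }
    }
  }
  return smallest
}

console.log(check(property)) // a minimal failing pair, e.g. [ 0, -1 ]
```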

    Other testing


    Some things still need to be poked by hand. For example, state. There is more to state than the happy path, and people often don't think about it. What happens when you have old data and a new state? When there is no internet connection? No permissions? Locale problems? You can test different devices, platforms, phones, sticky keys, screen reader support and much more.

    Do not forget about testing 3rd-party failures. "It's that third-party library that broke" — that is also your problem.

    Production tests


    For production on React 16 you can use two simple solutions: Errorception and Honeybadger (we actually use Sentry). You plug in the library, and it starts collecting statistics on production errors.

    Optimizely does A/B testing. It is organized very well: it can target delivery to specific people, a specific time, content and test sample, while balancing load and keeping statistics.

    Out of the box


    JS is not a panacea. Many things can break, and testing does not have to be done in JavaScript. A very simple tool is validator.w3.org/checklink — a crawler that goes to your site, looks at the links on the pages and checks that they work.

    If your site is reachable from India, that does not mean it opens in Russia — achecker.ca/checker should help here. Webpagetest.org is a project I would find it hard to live without. Tools.pingdom.com is another interesting project. And at www.w3.org/WAI/ER/tools you can find thousands of solutions for various cases.

    At my last job we ran tests on every commit — and there were a lot of commits. Thank God the Jenkins Multibranch Plugin did not exist back then, the one that rebuilds all pull requests after a pull request is merged. Use test suites, test chains ("why run it if it obviously won't work"), nightly builds, regression packs — full regress, smart regress, etc.

    Programmers are probably the only people who are paid to fix their own mistakes — for not doing the job well enough last time. There is a wagonload of options and approaches, so I will be very grateful if this post turns into a discussion listing useful resources that will make our lives easier.

    That is probably where I'll end this short overview. I'm ready to answer questions in the comments.
