A/B tests don't work. Check what you're doing wrong

Probably only the greenest marketers and product managers have never heard of A/B tests, yet even experienced specialists sometimes don't know how to run them or what to do with the results. As a result, you often hear that A/B testing doesn't work and is generally useless.

To dispel these rumors, we talked with Sergey Filatov, an A/B analyst at the Agima agency. He told us about A/B testing methodologies that actually work, tools that help run tests in a mobile application, and the career prospects that mastering these skills opens up.

Broadly speaking, an A/B test is any study devoted to choosing the best of several options. The term is very wide: it covers both marketers' tests and a type of analysis of digital products. This often causes confusion: when you read in a case study that “the company conducted A/B testing,” you have to figure out which kind is meant, marketing or product. We will talk about A/B tests that assess the functionality of mobile applications. (That said, this knowledge transfers easily to marketing research.)

This material is part of a series prepared for the launch of Fullstack Mobile Developer, a joint course of the online university Skillbox and the agency Agima. We have already written about how to get into the App Store on the first try and about how to develop application interfaces, giving away several 10% discounts and four 20% discounts along the way.

Those who have already solved the two previous puzzles and want more (to increase their combined discount) will find a puzzle about a testing tool in today's text. Look for it! All other Habr readers can still order any course for 10,000 rubles less using the promo code “Habr” (remember that this does not stack with the discounts from solved puzzles).

A/B testing is usually seen as an analytical tool for evaluating how product changes affect conversion, that is, the share of leads that turn into orders. Conversion here is not necessarily a purchase: it is any transition of the user from one stage to the next on the way through the order funnel, and any interaction with the forms and elements of the service along the way.

An A/B test is needed in order to:

choose among several screen or page options;
assess whether certain indicators of your product can be changed;
measure the effect of replacing particular elements on a page or screen;
understand how to increase conversion at each stage of the sales funnel, and therefore the number of sales.

Inside a mobile application, A/B tests help improve the user experience: they show how to arrange elements more conveniently and make the content more interesting and useful for the user.

Formulation of the problem

Any A/B test begins with a hypothesis. Hypotheses come in two kinds. The first kind is marketing-oriented: it is aimed at increasing traffic and the number of people performing a given action, and at clarifying which audience the application targets. In this case it is not the functionality of the application that is tested, but the marketing channels and the conversion from each advertising tool. We will focus on the second kind.

The second kind assumes that by changing some internal functionality (an element or a block, a connection between them, or the logic of their interaction) we can change certain indicators of the application. All of this applies to websites as well.

These hypotheses can relate either to elements located on the screens of the service or to the connections between screens. Unfortunately, testing connections between screens is technically problematic because such a test is hard to set up, so the analyst is usually limited to working with specific blocks and individual screens.

The essence of A/B testing here is that one variant of the interface layout or configuration is shown to one group of users, and another variant is shown to a second group.
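
Splitting users into groups can be sketched as follows. This is a minimal illustration rather than the way any particular tool does it: the function name `assign_variant`, the experiment name and the user identifiers are all hypothetical. The idea is to hash the user ID so that the same user always sees the same variant.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically map a user to one variant of an experiment."""
    # Hash the experiment name together with the user id, so the same user
    # can land in different groups in different experiments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Turn the hash into a bucket index: 0 -> first variant, 1 -> second, ...
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


if __name__ == "__main__":
    # Hypothetical user ids and experiment name, for illustration only.
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_variant(uid, "bottom_tab_bar"))
```

Because the assignment depends only on the ID and the experiment name, no table of "who saw what" has to be stored, and the groups come out roughly equal in size on a large audience.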

And here is the rebus! Remember that English may get mixed up with Russian here, and that the subject of the riddle is mobile. And don't forget that we will carefully monitor the comments and remove hints and answers from them! Give the answer encrypted in the rebus to our manager when they contact you after you submit an application for the course. Discounts for solved puzzles add up (counting this article, there are already three of them), but they do not stack with the discounts on the site. Don't wait too long: the promotion runs until August 30, 2018.

From the desired result to the search for solutions

Hypotheses of this type share one general rule: at the start we fix a final indicator that we want to increase or decrease. Hypotheses can be formulated from reports and similar analytical data, but often they are put forward without special preparation, based on the developers' heuristic assumptions.

We start by formulating the problem we want to solve: low conversion, too few clicks on a particular element, a lack of swipes or scrolls.

Then we choose specific actions that could potentially lead to the desired result. This could be adding new buttons, rearranging blocks on the screen, or, for example, replacing the “burger” menu on the left with a bottom bar, as Instagram does.

An example of how Optimizely reports the effectiveness of tested changes.

In other words, we start coming up with different ways to influence the key indicator. This is how the hypothesis becomes complete.

Mandatory components of a hypothesis:

the “if-then” formula;
a verb that describes the action we perform on the selected element;
a description of the expected result.

“If we increase the font size and make the button green, conversion will increase by 15%.”

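The three components can also be written down as a small record, which makes it harder to forget one of them. This is only a sketch: the `Hypothesis` class and its fields are hypothetical, the 4% baseline conversion is invented for the example, and the 15% is treated as a relative increase, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    change: str           # the "if" part: the action performed on the element
    metric: str           # the indicator we expect to move
    expected_lift: float  # assumed to be relative: 0.15 means +15%

    def target(self, baseline: float) -> float:
        """Metric value the variant must reach for the hypothesis to hold."""
        return baseline * (1 + self.expected_lift)

    def statement(self) -> str:
        """Render the hypothesis in the "if-then" form."""
        return (f"If we {self.change}, then {self.metric} "
                f"will change by {self.expected_lift:+.0%}.")


h = Hypothesis(
    change="increase the font size and make the button green",
    metric="conversion",
    expected_lift=0.15,
)
print(h.statement())
# With an assumed baseline conversion of 4%, this is the target conversion:
print(round(h.target(baseline=0.04), 4))
```
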
Quality turns into quantity

A/B tests can be used for two types of research: qualitative and quantitative.

Qualitative research deals with a person's emotional experience: it finds out whether users like the solution, whether it is easy to perceive, and whether it affects interaction time. Such tests focus on what the user feels when working with an application or service.

Quantitative research is aimed at increasing a specific number in the target indicator: the number of clicks on a button or a hint, the probability of a sale, and so on. This is a dry count of conversions, traffic, sales and movement through the funnel.

Everything you want to find out must be translated into numerical metrics. For example, the question “is the user interested in the content?” turns into the time spent on the screen, the scroll depth, and clicks on a certain key element.

Important! Follow the rule: one screen, one experiment. Do not test two hypotheses about elements on the same screen at the same time, and certainly not two hypotheses about the same element, otherwise you will not be able to make sense of the results. (If the hypothesis says “swap two elements,” that counts as one action.)
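
As a rough illustration of translating “is the content interesting?” into numbers, the sketch below aggregates an event log into three metrics. The event names, the log format and the values are all invented for the example; a real analytics SDK will have its own schema.

```python
from collections import defaultdict

# Hypothetical event log: (user_id, event_name, value).
events = [
    ("u1", "screen_time_sec", 42.0),
    ("u1", "scroll_depth", 0.80),
    ("u1", "cta_click", 1),
    ("u2", "screen_time_sec", 5.0),
    ("u2", "scroll_depth", 0.10),
    ("u3", "screen_time_sec", 31.0),
    ("u3", "scroll_depth", 0.65),
    ("u3", "cta_click", 1),
]


def engagement_metrics(log):
    """Collapse raw events into per-screen numerical metrics."""
    per_user = defaultdict(dict)
    for user_id, name, value in log:
        per_user[user_id][name] = value
    n = len(per_user)
    if n == 0:
        raise ValueError("no events to aggregate")
    users = list(per_user.values())
    return {
        # Average time spent on the screen, in seconds.
        "avg_screen_time_sec": sum(u.get("screen_time_sec", 0.0) for u in users) / n,
        # Average share of the screen scrolled (0.0 to 1.0).
        "avg_scroll_depth": sum(u.get("scroll_depth", 0.0) for u in users) / n,
        # Share of users who clicked the key element at least once.
        "cta_click_rate": sum(1 for u in users if u.get("cta_click")) / n,
    }


metrics = engagement_metrics(events)
for name, value in metrics.items():
    print(name, round(value, 3))
```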

Types of A/B tests and depth of study

Multivariate tests combine several options at once. For example, we have a block that consists of a button and a call to action. In this case you can generate every possible version of the button with every call to action. Keep in mind that such tests only suit large applications with a lot of traffic.

Split tests compare entire screens to understand which one gets the best response. For example, you can compare different versions of the start-screen tutorial to see whether users read the tips you have prepared or skip them and go straight to the application's functionality.

A regular element-by-element A/B test can assess headings, links, menu layout, the quality of calls to action, the presence and effectiveness of particular functional or text blocks and illustrations, and how users interact with the application depending on their device and on which version of the adaptive layout they received during the test.

There are also A/B/C/N tests, in which we choose among more than two options. They do not suit every service either: they need a lot of traffic, otherwise the test simply will not reach the threshold of statistical significance. To be sure that a change in the key indicator is not accidental, enough users must pass through the screen.
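
A short sketch shows why multivariate tests need so much traffic: the number of variants is the product of the number of options for each part of the block. The colors, sizes, call-to-action texts and the daily audience figure below are invented for the example.

```python
from itertools import product

# Hypothetical options for each part of the block.
button_colors = ["green", "blue"]
button_sizes = ["regular", "large"]
calls_to_action = ["Buy now", "Add to cart", "Order in one tap"]

# Every combination of color, size and text is a separate variant.
variants = [
    {"color": color, "size": size, "cta": text}
    for color, size, text in product(button_colors, button_sizes, calls_to_action)
]

daily_users = 6000  # assumed daily audience of the screen
users_per_variant = daily_users // len(variants)

print(len(variants), "variants,", users_per_variant, "users per variant per day")
```

Each extra option multiplies the number of groups, so the traffic that reaches any single variant shrinks quickly.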
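
One common way to check that a difference in conversion is not accidental is a two-proportion z-test. The sketch below is one option among several, not necessarily the method any of the tools discussed here uses, and the visitor and conversion counts are invented.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the assumption that A and B do not differ.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("conversion is 0% or 100% in both groups")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical data: 5,000 users per group, 200 vs. 250 conversions.
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
print("significant at 5%" if p < 0.05 else "not significant at 5%")
```

The 5% threshold is a convention, not a rule from the interview. With the same conversion rates and ten times fewer users, the same difference would no longer pass it.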

For a small project that tests only variants A and B, thousands of users performing the action may well be enough. For larger projects the number can be much higher.

An experiment usually runs from two weeks to a month and a half. This is needed to make sure that external factors such as advertising campaigns or the weather have not affected its course. (The weather matters not only for users' mood: for delivery applications, for example, whether it is raining directly affects conversion.)

If your product (or the specific element being tested) does not depend on the weather, fashion or competitors' marketing activity, you can judge whether the changes are worthwhile from the actions of the first thousand users. After collecting the data, you can move on to interpreting it and rolling out the changes that proved justified.
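
How many users are “enough” can be estimated in advance. The sketch below uses the standard normal-approximation formula for comparing two proportions. The 5% significance level and 80% power are conventional defaults, not values from the interview, and the baseline conversion is invented.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.8):
    """Approximate number of users needed in EACH group of an A/B test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


# Assumed baseline conversion of 4%; compare a small and a large expected lift.
for lift in (0.15, 0.50):
    n = sample_size_per_variant(baseline=0.04, relative_lift=lift)
    print(f"lift {lift:+.0%}: about {n} users per variant")
```

The smaller the expected effect and the lower the baseline conversion, the more users are needed, which is why subtle changes are hard to test on a small audience.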
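
The duration can be estimated from the required sample and the daily audience. The sketch below takes the two-week minimum and the month-and-a-half ceiling from the text (as 14 and 45 days). Rounding up to whole weeks, so that every day of the week is represented equally, is an added assumption, and the traffic figures are invented.

```python
import math


def estimate_duration_days(required_per_variant, variants, daily_users,
                           min_days=14, max_days=45):
    """Return (days, fits), where fits says whether days <= max_days."""
    if daily_users <= 0:
        raise ValueError("daily_users must be positive")
    total_users = required_per_variant * variants
    days = math.ceil(total_users / daily_users)
    # Assumption: run whole weeks so weekdays and weekends are balanced.
    days = math.ceil(days / 7) * 7
    # Never stop earlier than the minimum, even if the sample is collected.
    days = max(days, min_days)
    return days, days <= max_days


# Hypothetical project: 18,000 users per variant, two variants.
for traffic in (1500, 300):
    days, fits = estimate_duration_days(18000, 2, traffic)
    print(f"{traffic} users/day -> {days} days, fits in the window: {fits}")
```

If the estimate does not fit into the window, the usual options are to test a bolder change with a larger expected effect or to send more traffic to the screen.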

A/B testing tools

Experiments are much easier to run on websites, thanks to the flexible settings in their control panels, but fortunately there are several well-established solutions for mobile too.

Optimizely is one of the most popular tools. It has an intuitive, pleasant interface, a visual editor and broad integration with classes, as well as built-in features for editing how elements behave and attaching new events to them. However, because of its high price the service is not affordable for every developer.

Five Second Test is better suited to usability research and to studying how effective and clear the design of specific blocks and elements is.

Convert Experiments is the most affordable of the platforms: a subscription starts at $9 per month. It still has a visual editor that lets the tester work with elements without programming skills. It offers fewer metrics and less advanced built-in analytics, but it is quite suitable for quickly setting up and running an A/B test.

Apptimize has more advanced built-in analytics and an SDK that is quite easy to learn. It also has a visual editor.

Google Analytics Experiments is focused on mobile applications based on web technologies and hybrid applications.

A/B tests and application updates

Just a few years ago, launching an A/B test did not require publishing an updated version of the application: changes were made on the fly by embedding snippets in the code. However, because this approach made it possible to bypass the security policies and restrictions of Apple and Google, it was closed to developers. Today, to run an A/B test you need to roll out an updated version of your application.

What to learn and where to grow

You don't have to be a top-notch analyst to run A/B tests: it is enough to understand the indicators and draw the right conclusions from them.

One of the main skills of an A/B testing specialist is the ability to interpret quantitative indicators as qualitative ones and, conversely, to break qualitative hypotheses down into numbers that can be analyzed.

Beginners should study the rules of product analytics in depth, since its practices are closer to A/B testing than those used in web analytics and e-commerce.

It is useful to study agile methodologies, Lean Startup in particular. For the tester, the product becomes an “internal startup,” so these approaches suit the work well. You can learn a lot about research by visiting business incubators and their events, which are also a strong source of inspiration. There you can see many variants of A/B tests in real life, both automated and traditional, such as surveys and in-depth interviews.

Of course, you also need skills in working with numbers, from conducting sociological surveys to experience in applied mathematics and computer science. Without them you will have trouble processing test results.

Over time, and if you want to, all these skills will let you move toward marketing as a strategist, into UI/UX analytics, to the position of product owner, or even to creating your own project. Wherever doubts arise, wherever it is unclear which way to go, wherever you need to test the ground and probe the audience and its moods, the knowledge gained from A/B testing will find an application.

So by learning how to go from collecting preliminary data to a hypothesis, then to developing solutions and checking them with subsequent analysis (which is what really lies behind the short term “A/B testing”), you can open up far more prospects than just growing in the role of QA engineer or analyst.

Skillbox recommends thematic courses:

Mobile developer from scratch
Mobile app design (feat. Redmadrobot!)
UX design

A reminder: all Habr readers get a discount of 10,000 rubles when enrolling in any Skillbox course with the promo code “Habr”.

A few more articles from our mobile development series are still ahead, so it's time to ask: what would you like to read about? Tell us in the comments which mobile-related topics seem important to you but are not covered well enough, and we will try to satisfy your interest.
