Relz November 25, 2016 at 10:57

Badoo's worldwide billing through the eyes of QA

Hello, Habr! For more than four years I have been doing manual and automated testing of Badoo's billing systems. Badoo's billing is one of the most developed (and complex) in the world, and testing it is often an interesting and unusual task. Today I want to tell you why these systems are so interesting and powerful, what I have learned over the years, and why testing billing is not (very) scary. Along the way I will share another batch of interesting stories (yes, I really do love this business). Most of this applies not only to our specific case but to any other complex payment system (and, to be honest, not only to payment systems).

What is our billing? It is the payment processing system of a social network with more than 330 million registered users. We accept payments in every country in the world, support more than thirty active payment methods (about a hundred have been implemented over the years) and process about 1,500 requests per second. Badoo billing is an independent, dedicated service that works with a dozen different clients (different platforms, different applications). A pretty interesting foundation for building up testing, isn't it?

Test object

So, for starters, I'll briefly describe what exactly we have to test. All our clients (web, mobile applications and some back-end services) communicate with billing through an API. Billing itself runs on a separate cluster in each of our data centers and communicates with various payment systems (it sends payment requests, receives notifications with the results of processing them, and so on). The cluster contains machines that process requests from clients and payment systems, machines that run CLI scripts (for example, to renew expiring subscriptions), our own bank card payment processing server, and databases.

Billing developers deal with several types of tasks:

development of new functionality: new paid services, promotional campaigns, various features for subscribers;
development of new integrations with payment services (you can always find someone with lower fees or higher conversion);
updating existing integrations (our partners keep developing too);
fixing bugs (let's admit it, everyone has them!);
optimization tasks and paying down technical debt (the service can always be made a little better);
solving technical support problems (we love the most cunning users, who manage to create a dozen subscriptions through different payment services in different countries and then get confused about how to cancel the ones they don't need).

And all of this “goodness” eventually lands on our small team for testing. Besides tasks that come directly from the billing developers, we also receive tasks from other teams if they relate to payments in any way: for example, changes and new features on the clients or on the mobile application server.

What exactly are we testing? You can break it all down into three categories:

user interfaces: all kinds of payment windows (we call them “wizards”) on different platforms, settings windows, advertising banners, promo windows, etc.;
the “admin panel” and configuration tools: setting up prices, promotional campaigns and experiments, plus technical support tools (which we also use very actively during testing);
the billing back end: processing payments, the queues for providing services, and various deferred operations (the most difficult and “juicy” part).

I will try to tell you about all this in order.

User interfaces

So, the most important part here is the payment wizards. While they all perform the same function (finding out from the user how much of a service they want to buy and how they want to pay for it), wizards look different on different platforms. This depends primarily on the features of the platform, but also on the requirements of various regulators and on the countless A/B tests run in our applications.

What can be tested here? A whole sea of things! Each payment method should be displayed correctly for any selected service option. The list of options itself should match the expected one, each option should show the price specified in the billing settings, and the format of the price and currency should follow the convention adopted in the country: for example, $6.49, 125.00 MXN or 17.64 BYN.
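A check like the price format one can be sketched in a few lines of Python. This is only an illustration: the `PRICE_FORMATS` table and both function names are hypothetical, and the three formats are simply the examples from the paragraph above, not Badoo's actual settings.

```python
from decimal import Decimal

# Hypothetical display rules per currency, taken from the examples in the text.
# In a real system these would come from the billing settings.
PRICE_FORMATS = {
    "USD": "${amount}",
    "MXN": "{amount} MXN",
    "BYN": "{amount} BYN",
}


def format_price(amount: Decimal, currency: str) -> str:
    """Render a price with two decimal places in the currency's display format."""
    return PRICE_FORMATS[currency].format(amount=f"{amount:.2f}")


def check_wizard_price(displayed: str, amount: Decimal, currency: str) -> bool:
    """Compare the string shown in a wizard with the expected rendering."""
    return displayed == format_price(amount, currency)
```

With this table, `check_wizard_price("17.64BYN", Decimal("17.64"), "BYN")` fails because of the missing space, which is exactly the kind of small formatting slip such a check is meant to catch.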

Each payment window must be accompanied by detailed terms of service. Each promo window must also contain everything required, or lead to a next step with a full description of the terms (this, by the way, is one of the most common problems, and one that is very easy to forget about).

Any user action in such windows should be accompanied by correct messages, not only about successful payments but also about errors (we need to be able to distinguish a user canceling the payment on the partner's side from a user actually entering incorrect information).

It would seem that we could draw up a set of basic checks for all this and stop there. No such luck! Each country has its own mandatory requirements that must be met in order to do business there. For example, when paying via SMS in Belgium, the short code must be displayed in large white digits on a black rectangle (I'm not joking). In France, at one time EVERY page of the site had to have a button for canceling an existing subscription, and cancellation itself must still work in one click, without any confirmation steps. In some countries it is mandatory to state that the price includes taxes, and in others you even have to show the cost of the service and the additional taxes separately (and never show the total anywhere).

How do you check such a “zoo” of payment methods? Traveling to every country in the world for the most honest testing is not an option (and I would so like to test payments in Brazil, sob), and neither is creating accounts in every existing payment system. So we have to make do with various “sandboxes”. Some partners, such as bank card aggregators or PayPal, provide us with very convenient sandboxes of their own. Others are not so functional: one partner's sandbox is a screenshot of their usual payment window with a “Pay” button superimposed on it.

In other cases we have to build sandboxes ourselves, emulating various responses and notifications. But even that is not possible everywhere, and then we have to assemble notifications by hand, patch something in the code and send them to ourselves as HTTPS requests.
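Assembling a notification by hand might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the URL, the `X-Signature` header, the HMAC-SHA256 signing scheme and the payload fields are hypothetical, since every partner has its own format and protocol.

```python
import hashlib
import hmac
import json
from urllib import request


def build_notification(secret: bytes, payload: dict) -> request.Request:
    """Build a signed POST request that imitates a partner's payment notification."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    # Hypothetical scheme: the partner signs the raw body with a shared secret.
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return request.Request(
        url="https://billing.test.example/notify/partner_x",  # hypothetical test endpoint
        data=body,
        headers={"Content-Type": "application/json", "X-Signature": signature},
        method="POST",
    )


req = build_notification(
    b"test-secret",
    {"transaction_id": "t-1001", "user_id": 42, "status": "success"},
)
# request.urlopen(req)  # uncomment to actually send it to a test environment
```

Keeping the request construction separate from sending it makes it easy to replay the same notification with a different status, for example a cancellation followed by a success.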

Wizards in mobile applications are a separate headache altogether. Here the user communicates with billing even more indirectly. The application sends a request to the payment system (AppleStore or GoogleWallet, for example) and immediately forwards the response to the mobile application server, which in turn processes the information and sends a new request to the billing cluster, and the billing response travels all the way back to the payment system. The user experience can break anywhere along this chain! An error popup may mean that the request never reached billing and the payment was not completed, but it may also mean that everything went fine and the mobile application server simply did not answer the payment system in the exact format it expected. What a mess!

And let's not even talk about the uncomfortable sandboxes of Apple and Google, especially when trying to test subscriptions.

By the way, working with external partners brings plenty of problems of its own. Their payment windows can take a long time to open and slow down testing, and they can contain perfectly ordinary bugs (which you, as a self-respecting tester, blame on your own developers first). Anything that requires cooperation on their part (fixing those same bugs, extending the protocol) is often delayed as well. They can also make changes on their side without informing us (which we learn about only from rising error graphs) and give us incomplete or even incorrect documentation.

Admin panel

An equally important component of billing is completely hidden from our users' eyes. It is everything that lets our management regulate prices and service availability and launch promotional campaigns, and lets technical support staff identify the causes of users' problems (and make sure they are not simply trying to get a service “for free”) and solve them in the simplest and safest way possible. In addition, all these tools help us in testing (many cases are quite difficult, or very slow, to reproduce through the user interface alone).

Besides directly checking that things work, we have to spend a lot of time making sure that developers and managers understand the implemented functionality in the same way (often we are the only ones who manage to establish contact and full mutual understanding between them). All the configuration systems are quite complicated because of their wide range of capabilities (at any given time we run dozens of A/B tests of designs, payment methods and promotional campaigns), and even the smallest detail can make the system behave quite differently from what management expects. Our responsibility is to make sure that the developer understood the task correctly and that the manager was able to make sense of the documentation provided (if there is any). And of course, after every change to the configurators it is a very good idea to watch the results and to double-check several times that this is what everyone really wanted.

And here I have to brag: one of the tools we developed (and, of course, tested!), which routes card payments to the right banks and accounts, won us the prestigious Merchant Spotlight Award.

Back end

And here the fun begins. What is hidden from the eyes of ordinary users; what managers don't even want to know about; the place where our developers' most monstrous fantasies come to life: the internal logic of processing payments and providing services.

There is an enormous variety of things to test here.

Requests to partner systems: checking the status of subscriptions (sometimes we cannot manage them from our side and can only verify that they are still active), requests to renew and cancel subscriptions, and much more.
Processing notifications from partners: we must process every notification correctly (and each partner has its own format and protocol!), identifying the user, the service and every other parameter so that nothing gets mixed up. Sometimes notifications mean nothing at all: “Look, we still couldn't charge the user!”; sometimes they contradict themselves: “The user canceled the payment :( Oh no, wait, the money came through!”; sometimes they are completely irrelevant: “Remember that subscription from three years ago? Well, it is still expired!” And we must come up with the right flow for every possible case.
Provision of services: so that users' orders are not lost when something goes wrong, services are provided through a queue. If something fails, the event is postponed, and every service must reach the user no matter what. It is this “no matter what” that we have to guarantee during testing.
Subscription renewal: if a user is subscribed to certain services, they should receive them on time. We must not charge them earlier (or later) than the due time, and the charge must always be exactly the amount they subscribed for. In addition, we have a lot of different logic for choosing when to renew subscriptions in different countries (some of it our own experiments, some of it regulators' requirements). For example, in some places we charge users only during business hours, in others only on certain days of the week.
Payments with saved details: as in any self-respecting payment system, our users can save the details of their payment method in order to pay faster next time. We must check that the details are stored securely (for bank cards, for example, we need to comply with PCI DSS), that payments go through, and that cases where the details are no longer valid (for example, the user's card is blocked) are handled correctly.
And so on and so forth.
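The “no matter what” guarantee of the service queue can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the real implementation: the `Delivery` class and the `drain` function are hypothetical, and a real queue would persist events and postpone them by time rather than retry them in the same loop.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Delivery:
    user_id: int
    service: str
    attempts: int = 0


def drain(queue: deque, deliver, max_rounds: int = 10) -> list:
    """Try every queued delivery; a failed one is postponed, never dropped."""
    delivered = []
    for _ in range(max_rounds):
        if not queue:
            break
        # Process only the items present at the start of this round.
        for _ in range(len(queue)):
            item = queue.popleft()
            item.attempts += 1
            try:
                deliver(item)
            except Exception:
                queue.append(item)  # postpone to a later round
            else:
                delivered.append(item)
    return delivered


def flaky_deliver(item: Delivery) -> None:
    """Simulated provider that fails on the first attempt for every item."""
    if item.attempts < 2:
        raise RuntimeError("temporary failure")


pending = deque([Delivery(1, "premium"), Delivery(2, "credits")])
done = drain(pending, flaky_deliver)
```

The property worth asserting in a test is that after a temporary failure the queue ends up empty and every item has been delivered exactly once.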

The amount of logic in the server code is practically boundless. Each new task turns into an entertaining quest along the lines of “Understand how it works => Understand how it MUST work => Understand how to make the system work that way”. How do we get there?

First, you need to read the code. Testing billing as a black box is almost impossible: only with an idea of how the system works can you understand which cases there are to test. In addition, successful testing very often requires changes to the code: removing calls to aggregators (so that we don't ask them for the status of a non-existent test subscription), replacing signature verification for notifications (so that a signature doesn't have to be generated every time) or hardcoding the choice of a specific variant in an A/B test (so as not to register dozens of users until one lands in the desired group). Fortunately, we are doing our best to develop testing utilities that simplify these processes.
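In Python, temporary substitutions of this kind are often made with `unittest.mock` instead of editing the code itself. The sketch below only illustrates the idea: the `Billing` class and its methods are a hypothetical stand-in, not the real billing code.

```python
from unittest import mock


class Billing:
    """Hypothetical stand-in for the code under test."""

    def ab_variant(self, user_id: int) -> str:
        # Pretend users are split into test groups by id.
        return "A" if user_id % 2 == 0 else "B"

    def verify_signature(self, body: bytes, signature: str) -> bool:
        # Pretend the real check rejects our hand-made notification.
        return False

    def handle_notification(self, user_id: int, body: bytes, signature: str) -> str:
        if not self.verify_signature(body, signature):
            return "rejected"
        return f"processed:{self.ab_variant(user_id)}"


billing = Billing()

# Inside the block the signature check always passes and the variant is pinned to "B".
with mock.patch.object(Billing, "verify_signature", return_value=True), \
        mock.patch.object(Billing, "ab_variant", return_value="B"):
    patched_result = billing.handle_notification(2, b"{}", "not-a-real-signature")

# Outside the block the original behavior is restored automatically.
unpatched_result = billing.handle_notification(2, b"{}", "not-a-real-signature")
```

The advantage over editing the code by hand is that the substitution cannot accidentally survive the test.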

Second, don't be afraid to test things in non-obvious ways. Can't reliably reproduce a case through the interface? You can write a functional test! You can go into the test database by hand and fill in the data you need! You can assemble a partner notification manually and send it to yourself! The main thing is not to be afraid of venturing into the wilds.

Third, the developer is your friend. A joint debugging session is a fascinating (not always) experience that brings the team together (except when you want to strangle the developer). Dealing with unexpected behavior is much easier together. You will either understand your own mistake and clear the task, or find a real problem and let the developer get back to finishing it with some idea of what is going on (or be able to explain the situation to a new developer if the old one has gone on vacation).

Automated Testing

Autotests are a very cool thing. And in fact, testing here is much better than not testing. In our case, all autotests can be divided into four groups:

unit tests: written by developers while working on a task. In our process a task is not considered done until it is covered by tests;
integration tests: written by developers (and sometimes testers) at the testing stage to check places that are hard to reproduce. Like unit tests, they still substitute part of the code, but they exercise a much wider layer of entities at once;
Selenium and Calabash system tests: test the client as the user sees it. Not perfectly stable and rather slow, but very useful, since they also catch problems caused by other departments' tasks;
system curl tests: a fairly new direction. They check the overall health of the system on thousands of different cases: we fetch the payment wizards for all services, all their options, in every country in the world, for every payment method. Retesting in its purest form.
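The idea behind the curl tests, walking through every combination of service, country and payment method, can be sketched as follows. The lists and the `fetch_wizard` function are hypothetical placeholders: a real test would request the wizard over HTTP, whereas here one failure is simulated so that the example is self-contained.

```python
from itertools import product

# Hypothetical, heavily shortened lists; the real ones cover every country and method.
SERVICES = ["premium", "credits"]
COUNTRIES = ["PL", "BR", "FR"]
METHODS = ["card", "sms", "paypal"]


def fetch_wizard(service: str, country: str, method: str):
    """Stand-in for an HTTP request to the wizard endpoint."""
    if country == "BR" and method == "sms":
        return None  # simulated failure: the wizard did not come back
    return {"service": service, "options": [{"price": "1.00"}]}


def find_broken(fetch) -> list:
    """Return every combination whose wizard is missing or has no options."""
    broken = []
    for combo in product(SERVICES, COUNTRIES, METHODS):
        wizard = fetch(*combo)
        if not wizard or not wizard.get("options"):
            broken.append(combo)
    return broken


broken_combos = find_broken(fetch_wizard)
```

Even these toy lists produce 18 combinations, which hints at how the real matrix grows into thousands of cases.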

When do these tests run? In various combinations, all the time:

developers run tests manually while working on a task;
they are launched automatically when a task moves to the Finish status;
QA engineers run them manually during testing;
they are launched every time a new build is assembled;
finally, they run constantly and regularly on pre-production.

Of course, all these autotests take a significant amount of time, so we always try to optimize the process as much as possible. For integration and unit tests (and more recently for curl tests) we use a cloud-based “test launcher” (more than 73 thousand tests in 4 minutes already!). For Selenium tests we have a large SeleniumGrid cluster “farm”. And in general, the work on improving and optimizing the tests never stops.

Monitoring

The tester's work on a task does not end the moment the task goes to production. The only way to make sure it withstands the stress of a live environment is careful monitoring. Have new unexpected errors appeared in the logs (yes, there are expected errors, and that is normal)? Has the load on the billing cluster increased? Has profit in some country or for some payment method started to fall (or to rise sharply, which is usually just as strange)? Badoo has a wonderful monitoring department that watches all metrics around the clock, both manually and automatically. Even so, they will need some time to work out the causes of an anomaly on their own. So the QA engineer must carefully escort their task into its (final) battle.

For these purposes, we use several different systems, the most important of which are three:

RRD Tool: in RRD we store error and debug logs and graphs of a huge number of basic metrics (profit, number of payments, number of services provided, queue sizes);
Splunk: a delightful system with which we analyze all billing events in real time, build various graphs of the number of different billing requests over time, and much, much more;
Anomaly Detection: our own anomaly detection system, which automatically reports unexpected behavior of a metric. Unlike the first two systems, this one works in fully automatic mode.

What counts as an anomaly on billing charts? Let's look at this chart for Poland. Each point shows the total profit over the preceding day, and the chart itself also covers one day.

A nightmare, a terrible drop in profit, time to sound every alarm! But what is this? We open the chart for the month ...

What is going on? It turns out that this is how mobile aggregators work in Poland. They process all subscription renewals only on one specific day of the week, for example on Tuesday. If a user subscribed for a week on Monday, then ... they will be charged on Tuesday all the same! Such are the rules in Poland. And each peak is the “cherished” day of the week of one aggregator or another.
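One way to avoid false alarms on such a weekly pattern is to compare a day not with the previous day but with the same weekday in earlier weeks. The sketch below is a deliberately naive illustration of that idea, not a description of how Badoo's Anomaly Detection works; the function name, the four-week window and the 50% tolerance are all assumptions.

```python
from datetime import date, timedelta
from statistics import mean


def is_anomalous(profit_by_day: dict, day: date, weeks: int = 4,
                 tolerance: float = 0.5) -> bool:
    """Flag a day whose profit deviates too far from the same weekday's average."""
    history = [
        profit_by_day[day - timedelta(weeks=w)]
        for w in range(1, weeks + 1)
        if day - timedelta(weeks=w) in profit_by_day
    ]
    if not history:
        return False  # nothing to compare with
    baseline = mean(history)
    if baseline == 0:
        return profit_by_day[day] != 0
    return abs(profit_by_day[day] - baseline) > tolerance * baseline


# Synthetic data: five weeks in which renewals land only on Tuesdays.
start = date(2016, 10, 3)
profits = {}
for offset in range(35):
    current = start + timedelta(days=offset)
    profits[current] = 1000 if current.weekday() == 1 else 100

last_wednesday = max(d for d in profits if d.weekday() == 2)
last_tuesday = max(d for d in profits if d.weekday() == 1)
```

On this data the Wednesday after a peak is ten times lower than the day before, yet it matches its own weekday baseline, so it is not flagged. A Tuesday without a peak, on the other hand, would be.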

We look further. A similar graph of AppleStore’s profit for the week from the 24th to the 1st of the next month:

Scared already? Of course! Such a fall and no growth for a whole day: clearly a problem! While we rush headlong around the office screaming, a day goes by. And what do we see?

The graph recovered by itself! Magic? Catastrophic errors? A reptilian conspiracy? Not at all, this is Apple's policy. They always renew a subscription on the same day of the month on which it was started. But what happens in February to those who subscribed on the 30th or 31st: do they happily get a month for free? Of course not, they are charged on February 28th. And from then on they are charged only on the 28th. That is why we see these two peaks at the end of the month (the 28th because of February and the 30th because of all the other “short” months), while on the 31st no subscription older than a month is renewed.
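The rule as described in this paragraph can be modeled in a few lines. This is a sketch of the behavior the article observes, not Apple's documented algorithm; `next_renewal` and `schedule` are hypothetical helper names.

```python
import calendar
from datetime import date


def next_renewal(current: date, anchor_day: int) -> tuple[date, int]:
    """Return the next monthly charge date and the (possibly lowered) anchor day."""
    year = current.year + current.month // 12
    month = current.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    # If the month is too short, charge on its last day and keep that day from now on.
    day = min(anchor_day, last_day)
    return date(year, month, day), day


def schedule(start: date, count: int) -> list:
    """List the first `count` renewal dates of a subscription started on `start`."""
    dates, current, anchor = [], start, start.day
    for _ in range(count):
        current, anchor = next_renewal(current, anchor)
        dates.append(current)
    return dates


from_january = schedule(date(2017, 1, 31), 3)
from_march = schedule(date(2017, 3, 31), 2)
```

Under this model a subscription started on January 31, 2017 renews on February 28 and then stays on the 28th, while one started on March 31 settles on the 30th, which reproduces the two end-of-month peaks.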

As you can see, monitoring also has to be done wisely. As I already said, testing is no bad thing, but you can also catch flak from developers for excessive alarmism.

Instead of a conclusion

Testing billing is an interesting and entertaining business. It is full of unusual things, pitfalls and back alleys that nobody really knows, and solving almost every task is a real quest that ends with an absolute sense of triumph. It is a pity that so little is said about this area of testing (and at conferences I have more than once heard things like “Oh, we test billing in production”). I hope my article helps someone take a fresh look at the testing processes in their company and, perhaps, decide to test their payment systems a little more thoroughly. Or any low-level things, for that matter. Believe me, it is really not boring!

Kudinov Ilya, Sr. QA Engineer
