“Breaking Bugs” at Sberbank: how to fix a week's worth of bugs in one day

    Bug fixing is a tedious but mandatory part of any development, and not everyone wants to do it. How do you turn it into something exciting? Hold a competition! In this post we describe our 24-hour “bugfix marathon” in detail, from the preparation to sorting out the last commits after the winners were awarded.



    Infected with the idea


    The scale of development of our Sberbank Online application has grown significantly over the past year. Along with it, minor bugs began to accumulate that had no visible effect on key metrics. But we understood that this was a time bomb and that something had to be done about it.

    We were inspired by how our colleagues at Avito solve such problems, and we decided to organize a mass attack on bugs in the bugathon format, taking into account our development structure, culture and the specifics of our flow.



    We had to arrange everything so that people would want to take part in the bugathon and prove how good they are on their own, without directives from above. For that, the competition needed a great atmosphere. We decided to come up with a special style, something recognizable about bugs. Bugs are insects. Who destroys insects in ordinary life? Exterminators, the people in yellow chemical protection suits. Where have those suits shown up in recent years? In a popular series about a chemistry teacher. With the theme settled, we added activities: a video game tournament, a quiz with prizes, fun individual nominations... and, of course, a lot of delicious food.

    But the main thing, whatever one may say, was the competition to eliminate bugs. A dashboard with a web interface kept it in view, showing the teams' progress, their current positions, number of points and so on. We discussed everything with the team leads, and they approved our plans.

    Android vs iOS: not a fair fight


    At first we wanted to pit Sberbank Online’s Android developers against their iOS counterparts and play on the rivalry between platforms. But while organizing the event we realized that this was not the best solution, because technically the platforms work under unequal conditions. It so happens that on iOS builds are assembled and autotests are run faster.

    So we changed the format and made mixed teams: five Android and five iOS developers each. Beforehand, we chose captains from among proactive developers to help form the teams. We ended up with nine teams. And although this settled the hardware question from the point of view of fair play, we still had to make sure that other restrictions would not get in the way of our army of bug fixers.

    The next quest was choosing the bugathon date. Release dates differ between the platforms, so we picked a date that was comfortable for everyone. We tried to get as close as possible to the date when the release candidate is designated.

    In addition, a bugathon puts a heavy load on the platforms' infrastructure. When people compete over who fixes bugs faster, the number of pull requests soars. A month and a half before the bugathon there was still a risk that our equipment would not cope with the predicted peaks. But at that moment we were expecting new hardware, and it arrived just in time. We managed to connect it, configure it and increase the throughput of both platforms' infrastructure several-fold.



    Pipeline - how not to send everything down the drain


    Here everything worked as follows: immediately before the start of the bugathon we cut a branch from our develop, and the teams had to work in it. During the bugathon a lot of pull requests with fixed bugs were merged into it. Autotests were run on each of them, developers reviewed the pull requests, and testers checked the new builds to confirm that the bug was fixed. And so on for all 24 hours of the competition.

    We also had to distribute the load on the testers. We made an hourly chart of the predicted number of pull requests over the 24 hours of the bugathon, depending on the number of participants, server load, side activities and so on. We compared it with the average productivity of the testers and the number of effective hours of each tester supporting the bugathon. We scheduled the "duty shifts" so that by Saturday morning the queues were as short as possible. In short, we went to a lot of trouble.
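
    The comparison of predicted pull requests with tester capacity can be sketched as a simple hour-by-hour queue model. This is only an illustration, not the model we actually used: the function name `simulate_queue`, its parameters and all numbers in the usage example are assumptions.

```python
from typing import List


def simulate_queue(
    predicted_prs: List[int],
    testers_on_duty: List[int],
    checks_per_tester_hour: float,
) -> List[float]:
    """Return the backlog of unchecked pull requests at the end of each hour.

    predicted_prs[i]      - pull requests expected to arrive during hour i
    testers_on_duty[i]    - testers scheduled for hour i
    checks_per_tester_hour - assumed average productivity of one tester
    """
    if len(predicted_prs) != len(testers_on_duty):
        raise ValueError("both schedules must cover the same hours")

    backlog = 0.0
    history: List[float] = []
    for arriving, testers in zip(predicted_prs, testers_on_duty):
        capacity = testers * checks_per_tester_hour
        # Whatever cannot be checked this hour waits for the next one.
        backlog = max(0.0, backlog + arriving - capacity)
        history.append(backlog)
    return history


if __name__ == "__main__":
    # Hypothetical three-hour slice: a spike in the second hour.
    print(simulate_queue([10, 20, 5], [4, 4, 4], 2.0))
```

    Trying different duty schedules against the same forecast shows at which hours the queue grows and where an extra tester helps most.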

    At the same time, we kept in mind that regression testing had to start immediately after the bugathon, in order to evaluate the quality of the branch as soon as possible and decide whether to merge it into develop. This is an additional burden on the testers.



    Review specifics


    It was very important for us not just to fix bugs, but to fix them well. Three procedures verify the code that developers send in pull requests. For the code to be merged, all of them must pass:

    • three experienced developers have reviewed and approved the code;
    • the code has built normally and has not failed the autotests;
    • after the build and merge, the bug no longer reproduces in the build under the described conditions.
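
    The three checks above can be expressed as a single acceptance rule. A minimal sketch, assuming a hypothetical `PullRequest` record; the field names are illustrative and do not come from our tooling.

```python
from dataclasses import dataclass

# From the list above: three experienced developers must approve.
REQUIRED_APPROVALS = 3


@dataclass
class PullRequest:
    """Hypothetical summary of one bugathon pull request."""
    approvals: int          # number of reviewers who approved the code
    build_ok: bool          # the build was assembled without errors
    autotests_ok: bool      # the autotests did not fail
    bug_reproduces: bool    # tester's verdict on the resulting build


def fix_accepted(pr: PullRequest) -> bool:
    """A fix counts only if all three checks pass."""
    reviewed = pr.approvals >= REQUIRED_APPROVALS
    built_and_tested = pr.build_ok and pr.autotests_ok
    verified = not pr.bug_reproduces
    return reviewed and built_and_tested and verified


if __name__ == "__main__":
    print(fix_accepted(PullRequest(3, True, True, False)))  # all checks pass
    print(fix_accepted(PullRequest(2, True, True, False)))  # one approval short
```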

    We feared that in competitive mode nobody would review anybody else's code. And the review could not be kept inside the team. So we decided not to invent anything and to follow the standard flow, as in normal work: arbitrary cross-review, where whoever is free picks up the review.

    We also had to make sure that reviews did not pile up in a queue. To play it safe, we brought senior developers into the review (even those who did not take part in the bugathon itself) and actively reminded the participants to focus on quality. One senior iOS developer, in addition to fixing his team’s bugs, reviewed 80 pull requests during the day, actually reading and understanding them. That really is a lot!


    Selecting and scoring bugs


    We took low-priority bugs and weeded out obvious garbage by labels and dates. In total we ended up with 490 bugs, mostly small and medium ones that nobody had gotten around to because of more important tasks. These are all reasonable trivial and minor bugs:

    • bugs that have repeatedly moved from version to version
    • bugs reported through user requests
    • the most recent crashes
    • regression bugs
    • bugs that affect UX

    All bugs were divided into three waves according to closing priority:

    • The first wave - about 230 bugs
    • The second wave - about 150 bugs
    • The third wave (reserve) - about 110 bugs

    Defects were scored not by complexity, but by how critical they are for the business. The most critical ones were “artificially” and temporarily moved to the “blocker” and “critical” priorities. The higher the priority of a bug, the more points it was worth. Complexity was not taken into account: it happened that a blocker was closed in 20 minutes and a trivial bug in 4 hours. One bug could earn from 1 to 7 points.

    We kept each team’s score for closed bugs according to their value under the bugathon rules. If a team had time left, it took the next defect. Motivating by value made it possible to close the more critical bugs first.
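
    The scoring rule can be sketched as a lookup from priority to points. The post only fixes the range of 1 to 7 points per bug; the exact mapping in `POINTS_BY_PRIORITY` and the team names below are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical mapping: only the 1..7 range comes from the rules described above.
POINTS_BY_PRIORITY: Dict[str, int] = {
    "blocker": 7,
    "critical": 5,
    "major": 3,
    "minor": 2,
    "trivial": 1,
}


def team_scores(closed_bugs: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Sum points per team from (team, priority) pairs of closed bugs.

    Complexity is deliberately ignored: only the priority decides the value.
    """
    scores: Dict[str, int] = defaultdict(int)
    for team, priority in closed_bugs:
        scores[team] += POINTS_BY_PRIORITY[priority]
    return dict(scores)


if __name__ == "__main__":
    closed = [("Team A", "blocker"), ("Team B", "trivial"), ("Team A", "minor")]
    print(team_scores(closed))
```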



    How the bugs were closed


    We divided the first wave of bugs into 11 groups (with a margin), equal in the number of points and in the ratio of Android to iOS bugs. The first wave consisted of the “expensive” bugs: high-priority ones with an increased value. To make them easy to find in Jira, we assigned them the corresponding labels. Each group ended up with about 20 bugs.
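
    One way to build such groups is a greedy heuristic: handle each platform separately, take its bugs from the most to the least valuable, and always put the next bug into the group that currently has the fewest points for that platform. This is a sketch of one possible approach, not a description of how we actually split the bugs; the `Bug` tuple layout and the sample keys are assumptions.

```python
from typing import List, Tuple

Bug = Tuple[str, str, int]  # (issue key, platform, points) - hypothetical layout


def split_into_groups(bugs: List[Bug], group_count: int) -> List[List[Bug]]:
    """Greedily split bugs into groups balanced by points within each platform."""
    groups: List[List[Bug]] = [[] for _ in range(group_count)]
    overall = [0] * group_count  # total points per group across all platforms

    for platform in sorted({bug[1] for bug in bugs}):
        per_platform = [0] * group_count  # points of this platform per group
        platform_bugs = sorted(
            (bug for bug in bugs if bug[1] == platform),
            key=lambda bug: bug[2],
            reverse=True,
        )
        for bug in platform_bugs:
            # Lightest group for this platform; ties go to the lightest overall.
            target = min(
                range(group_count),
                key=lambda i: (per_platform[i], overall[i]),
            )
            groups[target].append(bug)
            per_platform[target] += bug[2]
            overall[target] += bug[2]
    return groups


if __name__ == "__main__":
    sample = [
        ("A-1", "android", 7), ("A-2", "android", 1),
        ("A-3", "android", 7), ("A-4", "android", 1),
        ("I-1", "ios", 5), ("I-2", "ios", 5),
    ]
    for number, group in enumerate(split_into_groups(sample, 2), start=1):
        print(number, sum(points for _, _, points in group), group)
```

    A greedy split does not guarantee perfectly equal groups, so the result is worth checking by eye before the labels are assigned.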

    At the beginning of the bugathon we gathered the team captains and drew the labels by lot. Then each captain put the drawn label into their Jira filter and distributed the corresponding bugs within the team. This way we avoided chaotic bug fixing, where people would simply take whatever was easiest for them to understand.

    To set a rhythm, during the first four hours teams were awarded points only for bugs carrying the label of the group they had drawn. When that time was up, the bugs still open moved into the second wave, joining the others that made sense to close within the bugathon.

    At 19:00 all bugs still open moved into the third wave, in addition to the bugs already planned for it. As a result, for the evening we had “fast” bugs that would have been closed in the usual flow anyway: crashes and current bugs exported literally a day before the bugathon, as well as bugs with the lowest priority. All three waves were put to work. In the end, 286 of the 493 selected bugs were closed during the bugathon.



    The bugathon unites


    The bugathon headquarters was located in our conference room, which also hosted the quizzes and the video game tournament. The teams were not restricted and settled wherever was convenient for them. As a result, the whole bank found out about the bugathon. One product owner from the fourth floor said: “I'm heading to a meeting on the 14th floor, looking for the right meeting room. Suddenly I realize that I have just seen familiar faces. I go back, and there are my developers, hammering away at full tilt and paying zero attention to me. Ha, I think, they won't hide from their product owner even 10 floors up. Okay, stay where you are, bug fixing is a worthy cause.”

    In one team, only one Android developer and six strong iOS developers showed up for the bugathon. As an exception, we secured an extra batch of iOS bugs for this team.

    In addition, seven developers from the regions came to the bugathon. Some of them met their teammates in person for the first time, having previously seen them only on video calls. It was great to watch how actively they joined the process.



    How were the results evaluated?


    For almost a hundred developers we had only 15 testers, and only four at night. That was not enough to check everything, so testing continued the next day. It was the testers who awarded points to the teams, so we took them out of the teams to eliminate bias. In the normal workflow a tester can call the developer and say: “Listen, there is this problem...”. At the bugathon the rule was strict: testers had to send back everything that did not pass cleanly.

    This let us see that some developers do not follow the accepted flow. The bugathon became a kind of catalyst that brought every deviation to the surface. Those who follow the flow precisely managed to pass testing in the first wave and get their points. Everything that did not quite comply went into a queue that was sorted out after the bugathon. There were 60 bugs in it.



    Incidents


    On the whole, everything went as usual: the incidents were typical and were resolved as part of normal work. When something broke, some of the senior developers immediately switched from bug fixing to eliminating the incident.

    There was one funny incident. When preparing the dashboard, we listed the possible risks: access to Jira, rolling updates and so on. We notified all administrators that for the duration of the bugathon all maintenance work and updates of Jira and the servers had to be suspended. We created backup accounts for access to Jira. And suddenly, around 18:00, we realized that the dashboard had stopped collecting data. There were different guesses. Maybe we had overlooked some security protocol?

    The reason was unexpected. Our organization is very large, and it is not always possible to get complete information about all planned processes. Our dashboard was deployed on a virtual machine on one of the secondary servers. It turned out that on that very day, Friday evening, this server was, under some plan unknown to us, physically unplugged, loaded into a car and sent off to its permanent home in our new data center.

    Merging branches and other results


    In normal operation the whole branch is manually run through 800+ test cases. Full regression testing takes a full team of testers two weeks. We could not afford to keep develop frozen for that long. To reduce the testing time, we selected the main test cases that cover the application’s health, about 107 of them. By the end of Monday the testers had run 80% of them on iOS and 50% on Android and had not found a single critical bug. We decided that the branches could be merged.

    Of the 286 bugs closed during the bugathon, 182 were actually fixed. The rest were rejected as no longer relevant for various reasons (in some places the design or functionality had already changed). These bugs were not critical, but now nobody needs to be distracted by them and we can calmly focus on important tasks.

    After the bugathon many people also asked: how many new bugs did we introduce? Only eight on iOS and seven on Android.
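
    For reference, the totals above can be put side by side with a few lines of arithmetic. The input numbers are taken from this post; the variable names and the derived ratios are only an illustration.

```python
# Figures quoted in the post.
selected = 493        # bugs selected for the bugathon
closed = 286          # bugs closed during the bugathon
fixed = 182           # closed bugs that were actually fixed
new_ios, new_android = 8, 7  # new bugs introduced on each platform

# Derived values.
rejected = closed - fixed            # closed as no longer relevant
new_bugs = new_ios + new_android
closed_share = closed / selected     # share of selected bugs that were closed
fixed_share = fixed / closed         # share of closed bugs that were real fixes
new_per_100_fixes = 100 * new_bugs / fixed

if __name__ == "__main__":
    print(f"rejected: {rejected}")
    print(f"new bugs: {new_bugs}")
    print(f"closed share: {closed_share:.0%}")
    print(f"fixed share of closed: {fixed_share:.0%}")
    print(f"new bugs per 100 fixes: {new_per_100_fixes:.1f}")
```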

    It is important for us that developers feel responsible for the product code on an equal footing with the other team members. This matters in any development, but in distributed development it becomes a prerequisite for successful work. And in our opinion we managed to raise the level of that very ownership and team spirit. The result was a story with plenty of benefits: in a short time we fixed a bunch of bugs, unloaded the backlogs, leveled up the teams' skills and had a lot of fun.

    The material was prepared by the Sberbank Digital Business Platform team
