“It makes no sense for us to use Retrofit”: Android development at Sberbank Online
How many Russian apps on Google Play show “50,000,000+ installs”? Each such case is clearly a unique story with its own specifics, so it is interesting to talk to the developers. And when the app also has a rating of 4.6, the interest only grows.
Vladimir Tebloev is one of the people working on the Sberbank Online Android application. In the spring, when Sberbank Technologies took part in our Mobius conference, he gave a talk there, and now we have decided to ask Vladimir about the specifics of his work.
- First, tell us what exactly you do.
- In the Sberbank Online application, I work on the Dialogs service, which lets users transfer money in one click and see their entire transfer history at a glance. The service is available to all users of the application, currently 37 million people.
I have been working at SberTech since the summer of 2016. Back then the application was not yet split between separate teams. Later, as part of the transition to agile, separate application modules were assigned to different teams. Dialogs was one of the first such teams, and I have been on it ever since.
- Everyone associates the words “Sberbank” and “mobile development” with Sberbank Online. But such a large company probably has internal mobile development too? Does it differ from the public-facing kind?
- Yes, there are also applications for internal use. I am not involved with them, but I know that React Native is actively used there. Internal development has its own requirements: there is no strict design review or sophisticated animation, so development goes faster with a cross-platform solution.
As the expertise grows, it may become possible to apply it to a production application. Still, I doubt that Sberbank Online will be able to use cross-platform development extensively. There are many difficulties, and when you have tens of millions of users, even a rare problem can affect a great many people.
- And how does this “even a rare problem affects many” shape your work? Do you have to deal with exotic problems that would pass under the radar in smaller applications?
- Sometimes there are problems on certain “special” devices. On one custom but widespread firmware we hit a severe problem, and it took us a long time to work out why. The cause turned out to be in the device's own motherboard driver: it tried to emulate libraries for ARMv5, although the project was built only for ARMv7.
When there are many users and the cost of an error is high, everything has to be rolled out gradually, with a close eye on the reports. If something starts to climb there, we stop the rollout immediately and make a hotfix. Besides rolling out to a given percentage of users, we also roll out geographically in parts: Sberbank has the concept of a “territorial bank”, so features can be rolled out region by region.
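As an editorial illustration of the two rollout dimensions described above (percentage and region), here is a minimal sketch in Java. It is not Sberbank's code: the class `RolloutGate`, the region names and the CRC32-based bucketing are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import java.util.zip.CRC32;

/**
 * Hypothetical rollout gate: a feature is on for a user only if the user's
 * region is enabled AND the user falls into the enabled percentage.
 */
final class RolloutGate {
    private final Set<String> enabledRegions;
    private final int percent; // share of users in an enabled region, 0..100

    RolloutGate(Set<String> enabledRegions, int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in 0..100");
        }
        this.enabledRegions = new HashSet<>(enabledRegions);
        this.percent = percent;
    }

    /** Maps a user ID to a stable bucket in [0, 100); same ID, same bucket. */
    static int bucketOf(String userId) {
        CRC32 crc = new CRC32();
        crc.update(userId.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    boolean isEnabledFor(String userId, String region) {
        return enabledRegions.contains(region) && bucketOf(userId) < percent;
    }
}
```

Because the bucket is derived from the user ID rather than drawn at random, raising the percentage only adds users and never switches the feature off for someone who already had it. Stopping the rollout would then mean setting the percentage to 0 or removing the region.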
- Some hipster startup can set a high minSdkVersion and say “everyone else is not our audience”, but your situation is different: you can’t just wave people off. What is your current minSdkVersion?
- Right now it is 16; we raised it from 14 only last year. We look at the share of customers on a specific SDK level and can raise the minimum once that share drops below 5%. We still have many users on Android 4.4 KitKat, about 16%, so we need to support them.
- Is there anything in the newer versions that you look at and think, “As soon as we raise minSdkVersion, we will start using this right away”?
- Of course, I would like to raise our minimum API level to Android 5.0 in order to make full use of innovations such as transition animations, which would then work properly everywhere. But this does not really affect writing functionality or business logic, so it is not critical. Our designers work out the animations anyway, so they can be implemented by hand. It is a matter of developer comfort and peace of mind.
There are some cases where we check the version, for example SSL pinning. It works differently on different versions of Android, so we maintain two versions of the code, for Android devices “up to 4.4” and “from 4.4”.
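A version-dependent code path of this kind can be sketched as follows. This is an illustration only, not the bank's implementation: the class `PinCheck` and its methods are hypothetical, the example assumes that “from 4.4” means API level 19 and higher, and it compares a Base64-encoded SHA-256 hash of the server's public key against a known set of pins. It is written in plain Java so that it runs off-device; on Android the `sdkInt` argument would come from `Build.VERSION.SDK_INT`.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Set;

/** Hypothetical pin check with a version switch; not Sberbank's actual code. */
final class PinCheck {
    /** API level of Android 4.4 (KitKat). */
    static final int KITKAT = 19;

    /** Which of the two implementations to use on a given device. */
    enum Path { LEGACY, MODERN }

    /** Assumption: "from 4.4" includes 4.4 itself, i.e. API level >= 19. */
    static Path pathFor(int sdkInt) {
        return sdkInt >= KITKAT ? Path.MODERN : Path.LEGACY;
    }

    /** Base64-encoded SHA-256 hash of an encoded public key. */
    static String pinOf(byte[] encodedPublicKey) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(encodedPublicKey);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** The shared part of both paths: the key must match one of the known pins. */
    static boolean isTrusted(byte[] encodedPublicKey, Set<String> knownPins) {
        return knownPins.contains(pinOf(encodedPublicKey));
    }
}
```

The point of the split is that only the code that obtains the certificate chain differs between the two paths, while the comparison against the pinned values can stay common.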
Of course, I would like to “just develop for Android P and not think about anything”, but that is always the case, and there is no getting away from it.
- On the SSL pinning you mentioned: obviously, security is very important for a bank. How does this affect you, and how does your work differ from working on a non-banking application?
- It is distinguished by a very strict approach to users' personal data. Any leak is a huge risk. We have a security department that tests our application before each release. If there are findings, the work on them goes to the team responsible for the functionality in which the vulnerability was found.
Small companies, I think, often have no security department to pentest the application. If there are any flaws, they may surface on w3bsit3-dns.com or a similar resource.
Another security-related point is that our application includes an antivirus. Some users are unhappy about it, but introducing the antivirus gave us a very noticeable reduction in fraud. For example, in our SMS bank, where you can send an SMS like “transfer amount to card X”, the antivirus reduced fraud through this channel to a minimum.
- For security reasons, banks limit their functionality on rooted smartphones. What does Sberbank Online prohibit on rooted devices?
- In June, we dropped the functionality restrictions for owners of rooted devices. Now all users of Sberbank Online on Android have full functionality. At the same time, protection remains at the same level thanks to the fraud monitoring system.
- And in the name of security, have you had to restrict not users but yourselves as developers, giving up something you would otherwise use?
- When we wanted to introduce Retrofit in 2015, it had problems with obfuscation: it worked poorly with the standard obfuscator. Our security department flagged this as a vulnerability that carried the risk of a cyberattack on the bank and of the code being cracked, since the API was exposed. So we dropped Retrofit and still do not use it. As far as I know, the problems with the standard obfuscator have since been fixed. But in the meantime we wrote our own HTTP client; it works, it suits everyone, and many wrappers have been written on top of it for the different teams working with different servers. There is no point in replacing it with Retrofit.
- The inevitable question: how are things with Kotlin?
- We are moving in that direction, but without haste. The difficulty is that many Android developers of different levels work on the application at once: some know Kotlin very well, some do not. There are no insurmountable obstacles to adoption, but right now we are short of reviewers for Kotlin code. If everyone abruptly started writing Kotlin tomorrow, the reviewers would be torn apart by pull requests. Besides, Kotlin has problems with the static code analyzer used in our pipeline.
So we are adopting Kotlin in small steps: for example, we write tests in Kotlin and use data classes (this saves us from writing tests for getters, setters, equals(), hashCode() and so on).
It is being broken in gradually, and as the next step we want to write our own testing DSL in Kotlin. In parallel, we want to raise the level of Kotlin knowledge in the company, for example through meetups.
- With Dialogs you are doing messaging, though not a direct analogue of WhatsApp. So I wonder: how useful are other people's solutions to you? Do you look into the code of open source messengers?
- It was useful when we wanted to add emoji. We were wondering how to build a panel for them, and in open source we saw an approach where everything is solved simply with a popup over the keyboard. The heights then match, and the result looks seamless to the user.
But in general, looking at other people's solutions is not always a good idea; it is more efficient to develop your own while taking the experience of others into account. For example, it is better not to look at Telegram at all: the classes in the source code of their Android application are so huge that it is hard to find your way around. We try to go our own way, especially since the interaction with the server can differ: Telegram uses MTProto, while we use plain WebSockets.
- I am a lazy interviewer, so I decided to just take a list of the things you deal with at work and ask about each item: “tell us exactly how things are going with this.”
The first item is “application module architecture”. We have already mentioned that the application is divided into modules; what else can you say about the architecture?
- Our architecture evolves iteratively and is versioned; we have now reached version 17.
Version 16 introduced Clean Architecture. We agreed on which layer is responsible for what (presentation, domain, data), which entities should be used where, and where the converters should live. In short, we spelled out all the architectural questions and implemented the answers.
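To make the layer split more concrete, here is a small editorial sketch of a data-layer object, a converter on the layer boundary, a domain entity and a use case. All names (`TransferDto`, `Transfer`, `TransferConverter`, `TransferRepository`, `GetTransferTotal`) are invented for the example and do not come from the Sberbank Online code base.

```java
import java.util.List;

/** Data layer: the transfer as it might arrive from the server (all strings). */
final class TransferDto {
    final String amountKopecks;
    final String recipient;

    TransferDto(String amountKopecks, String recipient) {
        this.amountKopecks = amountKopecks;
        this.recipient = recipient;
    }
}

/** Domain layer: the entity the business logic works with. */
final class Transfer {
    final long amountKopecks;
    final String recipient;

    Transfer(long amountKopecks, String recipient) {
        this.amountKopecks = amountKopecks;
        this.recipient = recipient;
    }
}

/** Converter on the data/domain boundary, so DTOs never leak into the domain. */
final class TransferConverter {
    Transfer toDomain(TransferDto dto) {
        return new Transfer(Long.parseLong(dto.amountKopecks), dto.recipient);
    }
}

/** Declared in the domain layer, implemented by the data layer. */
interface TransferRepository {
    List<Transfer> history();
}

/** Domain use case; the presentation layer would call it from a presenter. */
final class GetTransferTotal {
    private final TransferRepository repository;

    GetTransferTotal(TransferRepository repository) {
        this.repository = repository;
    }

    long execute() {
        long total = 0;
        for (Transfer transfer : repository.history()) {
            total += transfer.amountKopecks;
        }
        return total;
    }
}
```

The use case depends only on the `TransferRepository` interface, so it can be unit-tested with a stub and the network or database implementation can change without touching the domain code.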
We rolled it out as follows: all new features had to be written on the new architecture. If something in a pull request deviates from the agreed norm, the pull request is sent back for rework. At the same time, we did not rush to rewrite all the old functionality at once, because that can cause many problems.
For the presentation layer we chose MVP as the standard, but some of our teams use MVVM. We are not restricted in the presentation layer. For example, we built our chat on MVI, or more precisely on our own rather interesting implementation of MVI, which differs fundamentally from what the developer of Mosby wrote.
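For readers unfamiliar with MVI, the core idea is that the screen has one immutable state and every user action or server event (an “intent”) produces a new state through a pure function. The sketch below shows only that general idea; it says nothing about how the Dialogs implementation differs from Mosby, and the names `ChatState`, `SendMessage`, `MessageDelivered` and `ChatReducer` are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Immutable view state of a hypothetical chat screen. */
final class ChatState {
    final List<String> messages;
    final boolean sending;

    ChatState(List<String> messages, boolean sending) {
        this.messages = Collections.unmodifiableList(new ArrayList<>(messages));
        this.sending = sending;
    }
}

/** Marker for everything that can change the state. */
interface ChatIntent {}

/** The user pressed "send". */
final class SendMessage implements ChatIntent {
    final String text;

    SendMessage(String text) {
        this.text = text;
    }
}

/** The server confirmed delivery. */
final class MessageDelivered implements ChatIntent {}

/** Pure function: old state + intent -> new state. The old state is never modified. */
final class ChatReducer {
    static ChatState reduce(ChatState state, ChatIntent intent) {
        if (intent instanceof SendMessage) {
            List<String> messages = new ArrayList<>(state.messages);
            messages.add(((SendMessage) intent).text);
            return new ChatState(messages, true);
        }
        if (intent instanceof MessageDelivered) {
            return new ChatState(state.messages, false);
        }
        return state; // unknown intents leave the state as it is
    }
}
```

Since the reducer has no side effects, each state transition can be covered by a plain unit test without starting an activity.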
Then we moved to version 17 of the architecture and adopted RxJava, which entailed architectural changes. Strictly speaking, our architecture is now hexagonal; we have “forked” from Clean. But the two are similar in that both follow SOLID principles, so one flows into the other quite smoothly. That is what we are working on now.
In future versions of the architecture, we want to drop the Moxy framework that we use to implement MVP, because it causes some difficulties. The project is large, the framework uses annotation processing, and changes to lower-level modules lead to long build times. And we want to make life easier for our developers.
- The second item is “optimizing performance and memory consumption”. How pressing is this? Do you have to think about it constantly for the sake of users with older devices?
- This is the focus of the platform teams, who develop the tools that feature teams use. For a feature team, the need usually arises from a specific problem. For example, on the Dialogs team the chat was very slow in the early stages of development. We had to roll up our sleeves, start up the profiler, see where the bottlenecks in the application were, and work out what caused them.
In terms of optimization, for example, we have dropped PNGs and are gradually removing them from the project so as to use only vector graphics. Optimizing the Dagger dependency graph is planned for this year to speed up the application's cold start.
- Let's move on to testing: how is it organized?
- I can only talk about our team; in other teams the process may be set up differently.
Our team initially had one tester. Eventually he got bored with just testing and asked us to help him learn to write unit tests. We showed him how to write tests for the database and, in essence, for parsing, and that way he took some of the work off us. Everyone wins: it is interesting for him and helpful for us.
Over time, we came to the conclusion that we needed to automate regression testing and write UI tests. At first a teammate and I worked on the UI tests, and later the quality department joined us: our testers, who used to test the backend. They know Java, and now they have been brought onto our project to automate the entire regression suite. We sat down and looked at the available options: Appium, Espresso, Selenium.
We settled on Espresso and began developing approaches together. To make testing easier, we built our own framework, something like Kakao. We started this work at the beginning of 2017, and now we have a large framework in which most tests are assembled like a construction kit, because many matchers and actions for various situations have already been written.
Now our testers are actively asking us to teach them to write UI tests, because it is easier to write a test once than to click through the same actions by hand on five devices. But, of course, you cannot automate everything, and some cases still need to be checked manually.
As for the developers, our team holds a retrospective every two weeks. At one of them, we concluded that developers should do at least alpha testing after writing a feature, so that completely basic bugs like “the application crashes at startup” do not slip through. That is how the developers got involved in testing too. When we are preparing a major release and need to test a feature quickly, everyone sits down and goes through the regression tests together. When a bug is found, the developers drop out of the regression run, fix it quickly, and then rejoin.
- Next item: code review. Is there anything specific about yours, or is it “like everyone else's”?
- There is a specificity caused by the number of developers. When a company has ten mobile developers, two or three people can review everything. But how do you review the code of hundreds of people? We developed a “review matrix”. We selected 20-30 people who we know for sure can review well, leave feedback, and resolve contentious points in the comments, and then divided all the teams among them.
Why a matrix? To make sure all reviewers carry the same load. How does a review go? Our team requires at least three approvals. The first is from someone on the team. The second is from an outsider, from a team that does not deal with this functionality. The third is from someone on an adjacent team; in our case there are several adjacent teams, and they all look at our code. And, of course, all builds must succeed: unit tests and UI tests must pass without problems. That is how our code review works.
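The merge rule described above (one approval from the own team, one from an outside team, one from an adjacent team, plus green builds) can be written down as a small check. This is an editorial sketch with invented names (`ReviewerKind`, `Approval`, `MergeGate`); how the rule is actually enforced in Sberbank's tooling is not stated in the interview.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Where a reviewer sits relative to the team that owns the code. */
enum ReviewerKind { OWN_TEAM, OUTSIDE_TEAM, ADJACENT_TEAM }

/** One approval on a pull request (hypothetical model). */
final class Approval {
    final String reviewer;
    final ReviewerKind kind;

    Approval(String reviewer, ReviewerKind kind) {
        this.reviewer = reviewer;
        this.kind = kind;
    }
}

final class MergeGate {
    /**
     * A pull request may be merged only if every kind of reviewer has
     * approved at least once and all builds and tests are green.
     */
    static boolean canMerge(List<Approval> approvals, boolean buildsGreen) {
        Set<ReviewerKind> seen = EnumSet.noneOf(ReviewerKind.class);
        for (Approval approval : approvals) {
            seen.add(approval.kind);
        }
        return buildsGreen && seen.containsAll(EnumSet.allOf(ReviewerKind.class));
    }
}
```

Note that three approvals from the same kind of reviewer would not pass this check, which matches the intent of getting an outside view on every change.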
- The next item is refactoring legacy code. How systematic is it: strictly planned tasks, or “you needed to change old code, so you refactor it while you are there”?
- In general, we follow a kind of “Boy Scout rule”: if you touched something old, be kind enough to do it properly, because you are now a co-author. But there is planned refactoring too. For example, Dialogs needed refactoring in two areas: the contact book that we use, and transfers. We extracted the contact book, cleaned it up, rewrote its entire database on Room, and moved it into a separate module. Our payments had been written long ago with RoboSpice, if anyone still remembers it, and that hurt us. Cutting it out was an unpleasant task, because a lot was tied to it, and it had to be removed delicately so as not to break the rest of the functionality.
- At SberTech you are also involved in training programmers. What does training inside the company look like?
- We now hold meetups regularly. The choice of topics is not the same as at conferences, where everything has to be new and hyped. For example, if we know that developers have trouble with something, it is important to talk about that. Recently one of our developers gave a talk on vector graphics: not just about a specific library that renders vectors nicely on Android, but starting with how vector graphics work in general and then moving on to the specifics. There have also been talks on Room, on Java concurrency, which many developers struggle with, and on Dagger 2.
Last year we ran an Android development school and hired those who completed it successfully. Such people should not be thrown straight onto a project and left to stew on their own. So every newcomer, and every junior too, is assigned a mentor who guides them, develops them, and helps them. That is what our internal training looks like.
- Interviews: are yours “like everyone else's”, or is there something specific?
- I used to think they were “like everyone else's”, but it turns out they are a little special after all. In my experience, there are three common approaches on the market. In the first, the candidate is asked about three or four topics and evaluated solely on those. For example, I come to a company as an Android developer and am judged as someone who must have excellent knowledge of algorithms and of synchronization in Java, while the fact that I am very good with libraries is not taken into account. This may be because the company needs someone who knows one narrow part of the framework or language perfectly. In the second, the interview is conducted carelessly, almost as a chat about life for 30-40 minutes. Here it comes down to the competence and experience of the interviewer. In the third, the interviewers describe the company's problems and try to get a solution on the spot. The drawback of this approach is that the candidate's solution may not match the opinion of the person asking the question. In my estimate, such approaches are found in about half of all cases.
As for us, we have worked out a methodology for assessing a candidate in four broad areas: OOP, OOD (object-oriented design, architecture), Java Core, and the Android SDK. We go through all the topics methodically, question by question. If the candidate answers confidently on a topic overall, we gradually go deeper and ask more specific questions. Figuratively, it looks like a tree: there is a root from which we branch into each topic, and we can go five to seven levels deep. The candidate is then evaluated on all the questions covered, taken together. If the interview goes quickly, we move on to libraries, for example Dagger 2 and RxJava, and if there is still time, to Kotlin. This way the candidate is evaluated as a whole. If a person does not understand one topic but knows another well, that does not make them a bad programmer. It means they need to spend some time brushing up on that topic.
- Another of your work tasks is “research and review of new technologies”. Here I would like a concrete example of a technology that was reviewed.
- The most recent major library was RxJava: we examined how it might affect our project. We tried it out in local branches, then introduced it in one non-critical module to see how it behaved in production. After all that, we adopted it as a standard and required everyone to write new functionality with it.
Among the unsuccessful examples is Retrofit, which I already mentioned: a good library that does its job, but for our project its time has passed. Adding it and ending up with several different ways of accessing the network would be bad practice.
We also considered the TinyMachine library for implementing a state machine. The library is simple and not extensible, so it suits one team but not the others. We turned it down, because if you are going to pull in a library, it should be one that suits everyone. In the end we decided to write our own state machine; fortunately, this is not rocket science.
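To show why a hand-written state machine is indeed a small task, here is a minimal table-driven sketch. It is an illustration only and does not reproduce Sberbank's implementation or the TinyMachine API; the class `StateMachine` and its methods `permit`, `fire` and `state` are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal table-driven finite state machine.
 * S is the state type and E the event type; enums or strings both work.
 */
final class StateMachine<S, E> {
    private final Map<S, Map<E, S>> transitions = new HashMap<>();
    private S current;

    StateMachine(S initial) {
        this.current = initial;
    }

    /** Registers an allowed transition; returns this for chaining. */
    StateMachine<S, E> permit(S from, E event, S to) {
        transitions.computeIfAbsent(from, key -> new HashMap<>()).put(event, to);
        return this;
    }

    /** Applies an event, or fails loudly if it is not allowed in the current state. */
    S fire(E event) {
        Map<E, S> row = transitions.get(current);
        S next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException(
                    "No transition from " + current + " on " + event);
        }
        current = next;
        return current;
    }

    S state() {
        return current;
    }
}
```

For example, `new StateMachine<String, String>("IDLE").permit("IDLE", "send", "SENDING").permit("SENDING", "ok", "IDLE")` describes a message that can be sent and then confirmed. Keeping the transition table as data is one way to make such a machine reusable across teams, since each team registers its own states and events.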
- And the last item: documentation. In your case, with so many developers that nobody can keep everything in their head, you presumably cannot get anywhere without thorough documentation?
- Yes. The first kind of documentation is Javadoc. We have a local meme, “the Prilutsky check”: no Javadoc, no merged pull request. (It is named after one of the leads of the Android development team, who wrote on pull requests so often that the code must be documented, and would not reach the main branch otherwise, that the meme was born.) By now developers understand that every public method, every class, every constructor must be described with Javadoc. All code should be documented, even tests, so that it is clear what a test was written for and no questions arise such as “What payment is this? A payment from the messenger or a payment from the payments section? What is paymentTest?”
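As an illustration of what “everything is described, even tests” can look like, here is a small documented class and a documented check for it. The names `AmountFormatter` and `MessengerPaymentFormatTest` are invented for the example, and the check is written as a plain static method rather than with a test framework to keep the sketch self-contained.

```java
/**
 * Hypothetical formatter for the transfer amount shown in a chat bubble.
 */
final class AmountFormatter {

    /**
     * Formats an amount given in kopecks as rubles with two decimal places.
     *
     * @param kopecks non-negative amount in kopecks
     * @return the amount in rubles, for example "12.05" for 1205
     * @throws IllegalArgumentException if {@code kopecks} is negative
     */
    static String format(long kopecks) {
        if (kopecks < 0) {
            throw new IllegalArgumentException("kopecks must not be negative");
        }
        long rest = kopecks % 100;
        return (kopecks / 100) + "." + (rest < 10 ? "0" : "") + rest;
    }
}

/**
 * Checks how a payment sent from the messenger is rendered in the chat.
 * This is about the messenger, not about the separate payments section.
 */
final class MessengerPaymentFormatTest {

    /** A payment of 1205 kopecks must be shown in the chat bubble as "12.05". */
    static void formatsKopecksAsRubles() {
        if (!"12.05".equals(AmountFormatter.format(1205))) {
            throw new AssertionError("expected 12.05");
        }
    }
}
```

With the class-level comment in place, a reader no longer has to guess which kind of payment the test covers, which is exactly the question the unexplained name `paymentTest` would raise.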
In addition, we have documentation in Confluence. When I joined, our material design guidelines were described in the cloud, and there were a couple of articles about how we work. Now every global matter that affects everyone must be described in Confluence. For example, we need to install certificates to access the repository, and whoever set this up writes an article so that people do not ask a million times in the chat what to do about a broken certificate. Another example: when it was decided to adopt RxJava, best practices were written up in Confluence: how to do it well, how not to do it, plus a link to a sample. The simplest example: how to order methods in a class so that everything is uniform.
Articles like these get written gradually but regularly. Our Confluence has now grown to 200 articles on various topics. It helps newcomers too: they study Confluence, get an idea of how development works internally, and when questions come up they can work things out and make a decision on their own, without always involving their mentor.