How we check the security of mobile applications, and why it is not easy. Yandex Security

    My name is Yuri Leonovich. I work in the Yandex information security service, where I develop interesting services that combine machine learning methods with big data analysis. As you know, Yandex has a large number of mobile applications. While we have been working on the security of our web applications for a long time, mobile applications often received insufficient attention. This was partly because mobile applications were seen as extensions of their "big" brothers, thin add-ons over web APIs.

    But with the rise of the iOS and Android mobile platforms, the situation changed dramatically. The number of applications we develop grew, their complexity increased, and some of them became large independent projects in their own right. In addition, we launched Yandex.Store, where we also had to ensure the security of third-party applications.

    We have learned to look for and prevent vulnerabilities in both Yandex applications and third-party ones in various ways, including machine learning. In this post I will describe how we work in this area. I'll start with how we test our own applications.

    Finding mistakes is not enough. It is just as important to make sure that they do not appear in applications in the first place. We decided to use a well-known methodology from Microsoft, the Security Development Lifecycle (SDL). Of course, we would like to implement everything SDL offers right away, but that is too complicated. The full set of SDL controls is more of an ideal end state.

    Mobile application security at Yandex has several important features. We actively develop many mobile applications for different platforms (iOS, Android, Windows Phone). Obviously, it would be very difficult for the Yandex product security team alone to check every application on every platform. We therefore try to follow the principle of "divide and conquer." To do this, we stay in constant contact with key mobile developers and try to raise their awareness of security issues. For example, last year we held special training sessions in which mobile application security experts from IOActive demonstrated practical hacking and code analysis techniques. Our developers were able to see in practice how their applications would be attacked and how to protect them against many types of attacks. As a rule, developers contact us themselves whenever new features or changes in their applications affect security.

    Most Yandex mobile applications use common components and libraries. We check these parts of the code regularly, since a single fix in a shared library carries over to all new versions of the applications that use it. For example, we found an SQL injection vulnerability in a read-only content provider used in our applications. Although different researchers reported it against different applications, the error itself was in one shared library, so it was not difficult to fix.
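
    The class of bug described above can be illustrated with a minimal sketch. It is not the code of the affected library: Android content providers are normally written in Java and backed by SQLite, and here Python's standard sqlite3 module stands in for that backend. The table, column names and payload are hypothetical.

```python
import sqlite3


def make_db():
    # Hypothetical table standing in for the data behind a content provider.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notes (id INTEGER, owner TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO notes VALUES (?, ?, ?)",
        [(1, "alice", "public note"), (2, "bob", "private note")],
    )
    return conn


def query_unsafe(conn, owner):
    # Vulnerable: caller-controlled text is spliced into the SQL string,
    # so a crafted value can change the structure of the query.
    sql = "SELECT body FROM notes WHERE owner = '" + owner + "'"
    return [row[0] for row in conn.execute(sql)]


def query_safe(conn, owner):
    # Parameter binding keeps the input as data, never as SQL syntax.
    sql = "SELECT body FROM notes WHERE owner = ?"
    return [row[0] for row in conn.execute(sql, (owner,))]


conn = make_db()
payload = "alice' OR '1'='1"
unsafe_rows = query_unsafe(conn, payload)  # the condition becomes always true
safe_rows = query_safe(conn, payload)      # no owner has this literal name
```

    Even a provider that only allows reads can leak rows it was never meant to expose this way, which is why a fix in the shared code matters for every application that ships it.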

    The most popular mobile applications undergo regular static code analysis, which runs as one of the build steps. We use a static code analyzer from Coverity. Our version of the analyzer has been slightly extended with additional checks for Android-specific vulnerabilities. With each commit, developers receive a detailed report on the bugs found and their severity. Static analysis works well here because the code base of a mobile application is not too large, so developers manage to fix all the errors found. Developers also see their mistakes immediately, before the application is published. In addition, static analysis finds, in a semi-automatic mode, errors that are difficult to find manually.

    Many of our mobile applications interact with backends on Yandex servers. Such applications are tested in two steps: we look at both the mobile side and the server side. All API calls that the application makes are checked. Since these APIs most often work over HTTP/HTTPS, the usual vulnerabilities from the OWASP Top 10 can appear here.

    An important feature of our approach to mobile application security is that we try to use crowdsourcing when searching for vulnerabilities. We were the first Russian company to launch a full-fledged reward program for found vulnerabilities, and we included mobile applications in it from the start. During the Bug Hunt, about 10% of reports concerned bugs in our mobile applications. Some of them were very interesting. For example, one of the participants in our “Hunt” spoke about the vulnerabilities found in Yandex.Browser for iOS at the HITB conference in Amsterdam. We welcome such findings, as they show us potentially weak points in the code that deserve close attention. And when reviewing old legacy code, application developers often fix other issues as well.

    All new applications must be checked by our product security team before launch. We check not only compliance with the secure development recommendations for the relevant platform, but also the presence of hidden algorithmic errors. We try not only to correct errors, but also to explain to developers why they appeared. To search for some vulnerabilities, we wrote special programs that clearly demonstrate the presence of defects.

    Third-Party Mobile Application Security

    When we designed our Android application store, we were concerned about what third-party developers would upload to it. We could not tighten the screws right away, because that would scare off developers, and we needed to fill the store. Asking everyone to prove their identity at the start would not have been a good idea. On the other hand, if we did not control the situation at all, we would end up with a huge repository of malware in which users would be wary of looking for anything useful. We therefore decided to integrate anti-virus checks into the store from the start, and then conducted several studies.

    We had a large number of malware samples that were actively circulating among Android users at the time. We selected a number of anti-virus engines and measured their effectiveness on these test sets. Kaspersky Anti-Virus showed the best results at the time, but even those did not satisfy us, so we had to come up with our own solution.

    Security Model for Android

    From the start, the Android operating system included many mechanisms for protecting information and restricting access to the resources of a mobile device. Since Android is based on the Linux kernel, it inherited the mechanisms that restrict access to the resources of processes belonging to different users. But because Android was designed for mobile devices and applications had to run in the special Dalvik virtual machine, additional levels of abstraction and protection appeared in it. The structure of Android is usually shown in the classic architecture diagram.


    Android is characterized by strictly defined file system permissions and by running user applications in separate processes, in a kind of "sandbox".


    Importantly, to access most resources an application must request special permissions in its manifest. When an application is installed, the user is warned about what capabilities it will have and what user data it will be able to access. For example, an application with the READ_SMS and INTERNET permissions may well pass one-time passwords that arrive on the user's smartphone to an unauthorized party. Despite the installation screen, which was redesigned to show the requested permissions more informatively, and other efforts by the Android developers, most users pay little attention to what they are about to run on their device. Early on, when the platform documentation was rather scarce, many developers also did not understand which permissions they needed for particular actions. As a result, they violated the principle of least privilege and requested the maximum set of permissions in the manifest. This led users to perceive huge sets of permissions as the norm.
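
    The READ_SMS plus INTERNET example can be turned into a simple check. The sketch below is only an illustration, not our analyzer: it assumes the manifest has already been decoded into plain-text XML (inside an APK it is stored in a binary format), and the list of risky combinations and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes such as android:name live in the Android XML namespace.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical rule list: sets of permissions that are risky together.
RISKY_COMBINATIONS = [
    (
        {"android.permission.READ_SMS", "android.permission.INTERNET"},
        "can read incoming SMS and send data over the network",
    ),
]


def requested_permissions(manifest_xml):
    """Return the set of permission names declared via <uses-permission>."""
    root = ET.fromstring(manifest_xml)
    names = (el.get(ANDROID_NS + "name") for el in root.iter("uses-permission"))
    return {name for name in names if name}


def risky_findings(manifest_xml):
    """Return the reasons for every risky combination fully present."""
    perms = requested_permissions(manifest_xml)
    return [reason for combo, reason in RISKY_COMBINATIONS if combo <= perms]


SAMPLE_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.app">
  <uses-permission android:name="android.permission.READ_SMS"/>
  <uses-permission android:name="android.permission.INTERNET"/>
</manifest>"""

findings = risky_findings(SAMPLE_MANIFEST)
```

    Each permission on its own is common in legitimate applications; it is the combination that deserves a closer look.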

    Common malware

    Like any operating system in a fast-growing consumer market, Android has attracted the attention of attackers of all kinds. An additional source of interest is that a mobile device is now not only a valuable source of personal data, but also, in effect, an easily accessible wallet.

    As early as 2010, anti-virus companies reported the first infections of mobile devices with malicious applications that send SMS messages to premium-rate short numbers. The functionality of such applications was very simple. During installation they requested permission to send SMS, and they did so at first launch. Bugs were not uncommon in those applications, so that SMS messages were not sent at all or went to the wrong numbers. Over the past four years, malicious applications have evolved significantly, but most of them are still simple programs aimed at quick profit. According to Kaspersky Lab, in 2013 36% of Android Trojans sent paid SMS messages. Successful new malicious applications are copied many times, so Android Trojans are not very diverse.

    Detection Methods: Static and Dynamic

    Malicious applications are detected by combining several techniques: signatures, heuristics, and application emulation. But the growing number of such programs created a need to automate tasks that anti-virus analysts often have to do by hand. On Android, some of these methods were not effective enough at first. Emulation was hampered by the huge fragmentation of the platform, the unstable operation of the emulator, and the simple ways in which an application could detect it. Many researchers have proposed using machine learning to find malware. We decided to follow this path as well: a classifier for uploaded programs, built on our machine learning algorithms, was launched a year ago together with Yandex.Store (presentation at YaC'13).

    Classifier construction

    Selection of classification factors

    Two approaches were used to choose the right classification factors. The first was the analysis of existing malicious mobile applications. It was carried out by hand and revealed some characteristic features. For example, at the time of this analysis malicious applications were the leaders in the use of obfuscation. This signal is no longer as relevant, since developers of popular applications now at least enable ProGuard in their builds. Another important factor is the use of permissions. It has not lost its relevance over time, since simple malicious applications still prefer to request permission to send SMS.

    The second approach to building a set of factors was even more trivial. Some time was spent searching scientific articles for classification factors that had already been used. This did not yield many results, but it sparked our imagination and led to a lot of the wildest factors that could be computed by looking at the input file. Of course, it was not entirely clear what payoff there would be from factors such as the size of the input file or the number of URLs in variables and resources, but at the first stage we wanted to use every feature available.
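
    To make the idea of factors more concrete, here is a minimal sketch of how a few of the factors mentioned above (file size, number of URLs, use of the SMS permission) could be computed. It is not our actual factor set: the function name, the dictionary keys and the assumption that permissions and strings have already been extracted from the package are all hypothetical.

```python
import re

# Rough pattern for URLs embedded in string constants and resources.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def extract_factors(file_size_bytes, permissions, strings):
    """Build a flat dictionary of numeric factors for one application.

    file_size_bytes: size of the input file
    permissions:     set of permission names requested in the manifest
    strings:         iterable of strings taken from variables and resources
    """
    url_count = sum(len(URL_RE.findall(s)) for s in strings)
    return {
        "file_size_bytes": file_size_bytes,
        "url_count": url_count,
        "permission_count": len(permissions),
        "requests_send_sms": int("android.permission.SEND_SMS" in permissions),
    }


factors = extract_factors(
    120000,
    {"android.permission.SEND_SMS", "android.permission.INTERNET"},
    ["see http://example.com/a and https://example.org", "no links here"],
)
```

    A dictionary like this is what a learning algorithm consumes; which of the values actually help is decided later, during training.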

    Assessment of detection efficiency and speed

    After several rounds of classifier training, MatrixNet allowed us to discard many factors that were ineffective at the time. This does not mean that we forgot about them completely. Since the launch, some factors have lost relevance, and some have become effective again. For example, file size, which initially contributed little to detection, gained some weight over time, since the file size of many games has increased significantly. Some factors could be used only after the analyzer had been running for a while. One of them is the level of trust in a developer, calculated from all the applications that the developer has uploaded to our store. Of course, when Yandex.Store started, such calculations were impossible, since most developers had few applications in the store, most often a single version of one.

    Evolution of factors and retraining of the system

    Changes in application build processes, the emergence of new types of malware, and other factors force us to retrain the analyzer constantly. This is a normal process, repeated every time detection results begin to degrade. In many disputed cases, where an application is detected as malicious but other methods do not confirm it, we add the application to the training set. Retraining can now be done semi-automatically. When the first training was conducted, the sample sizes were about 250 files. Now they exceed one thousand.
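
    One possible way to notice that detection results have begun to degrade is to track quality over recent verified batches. The sketch below is an assumption for illustration, not a description of our system: the class name, the window size and the recall threshold are hypothetical.

```python
from collections import deque


class RetrainMonitor:
    """Signals retraining when average recall over recent batches drops."""

    def __init__(self, window=3, min_recall=0.9):
        # Keep only the most recent `window` batch results.
        self.recalls = deque(maxlen=window)
        self.min_recall = min_recall

    def record_batch(self, detected, total_malicious):
        # Recall of one manually verified batch of known-malicious files.
        if total_malicious <= 0:
            raise ValueError("batch must contain at least one malicious file")
        self.recalls.append(detected / total_malicious)

    def needs_retraining(self):
        # Wait until the window is full to avoid reacting to a single batch.
        if len(self.recalls) < self.recalls.maxlen:
            return False
        average = sum(self.recalls) / len(self.recalls)
        return average < self.min_recall


monitor = RetrainMonitor(window=3, min_recall=0.9)
for detected in (95, 94, 96):
    monitor.record_batch(detected, 100)
stable = monitor.needs_retraining()

for detected in (70, 72):
    monitor.record_batch(detected, 100)
degraded = monitor.needs_retraining()
```

    Averaging over a window rather than reacting to one bad batch keeps a single unusual upload from triggering a retraining run.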

    Future system

    Support for new executable file engines

    The Android platform is developing very dynamically. Over the past two years ART has appeared, some permissions have changed, and the devices themselves have changed. Accordingly, the application analyzer needs to be developed further. There are currently ideas for developing the project in two directions: the first is improving the quality of static analysis by strengthening native code checks; the second is introducing dynamic analysis to test applications.

    Factor changes

    Improving static analysis with native code checks is necessary because malware developers have begun to use JNI more actively. It is already possible to write full-fledged Android applications in C++ without any problems.

    A lot of refactoring and code optimization is also needed, since we want to further reduce file scan time and increase analyzer performance. However, this is more of a task for the near future.
