How Project Infer from Facebook helps to find bugs in mobile apps before deploying

Published on February 16, 2019

How Project Infer from Facebook helps to find bugs in mobile apps before deploying

https://towardsdatascience.com/machine-learning-for-detecting-code-bugs-a79f37f144b7
  • Transfer


A few days ago, the Facebook engineering team distinguished itself - it was awarded the Most Influential POPL Paper Award . Among machine learning specialists, this is very honorable. The award was presented for the work of Compositional Shape Analysis by Means of Bi-abduction , which reveals the nuances of Project Infer. The project itself is designed to detect and eliminate bugs in the code of the mobile application before it deploem.

Bugs in mobile software are very expensive for developers and users alike. As for the former, detecting a problem in an application already placed in directories is a nightmare for any specialist. Of course, the software is being tested, the work of the programs is checked against certain templates. But more often, developers can not foresee everything, and bugs in the application are detected after deployment.

We remind: for all readers of "Habr" - a discount of 10,000 rubles when recording for any Skillbox course on the promotional code "Habr".

Skillbox recommends: Online course "Data Analyst in Python" .
Project Infer scans the code of mobile applications and allows you to find all sorts of problems, the patterns of which are stored in the database (and it is updated all the time). The project itself was submitted three years ago. Almost immediately after the announcement, Facebook opened the code, after which it was used in companies such as Amazon Web Services, Spotify and Uber.

How it works?


Project Infer uses a specialized set of algorithms for analyzing code performance. In the source code of any large application, there may be millions of combinations that lead to errors. Traditional code analysis procedures cannot help discover everything. Infer from Facebook, self-learning, expands its base, so the project allows you to detect a lot of problems in the code.

In a general sense, the Facebook Infer process can be divided into two stages: data collection and analysis. The life cycle (lifecycle) is also divided into two parts: global and differential.

During the data collection phase, Infer translates the source code into its own language. The analysis phase is devoted to the study of the smallest details of the structure of a code that can potentially lead to an error. If Infer encounters a familiar combination of factors, identified as an error pattern, the analysis is stopped for a particular method or function, but other methods and functions continue to be analyzed. Here is an example of how Infer works.



From the point of view of execution, Infer can work in two modalities - Global and Differential, as mentioned above. In the first case, Infer analyzes all files. For a project that is compiled using Gradle, the launch of Infer is done by the command

infer run -- gradle build

The differential process is used in incremental build systems specific to mobile applications. In this case, you must first start collecting Infer data to get all the compilation commands, and then analyze only the changes. To do this, use the following set of commands:

gradle clean
infer capture -- gradle build
edit some/File.java
# make some changes to some/File.java
infer run --reactive -- gradle build

Infer reports can be viewed using the InferTraceBugs command.

infer run -- gradle build
inferTraceBugs

Basis of Project Infer


Facebook's Infer is based on two new mathematical methods: separation logic and Bi-abduction .



A key feature of the separation logic is the possibility of local reasoning. It appeared due to the presence of spatial connectives in the assertions (spatial connectives) between the parts of the heap. In this case, there is no need to take into account the entire amount of memory at each of the stages.

The main element of the separation logic is the * operator (and separately), which is called a separating connection. The formula X↦Y ∗ Y↦X can be read as “X points to Y, and separately Y points to X”, which is very similar to how memory pointers work.

In the context, Infer Bi-abduction can be viewed as an inference method that allows the platform to detect properties related to the behavior of independent parts of the application code. Bi-abduction jointly displays anti-frames (missing parts of the state) and frames (those parts that are not affected by the operation). Mathematically, the problem of Bi-abduction is expressed using the following syntax: A ∗? Antiframe⊢B ∗? Frame.

In Facebook's Infer method, it is possible to deduce pre / post specifications from clean code, provided we know the specifications for primitives at the basic code level.

The creation of FI was made possible by analyzing the work of machine learning specialists, which was carried out for many years. In the course of work on Infer, the following key articles were published for the whole sphere:

  • Compositional Shape Analysis by Bi-abduction . It was for this work that the prize mentioned above was received. The work introduces the reader to the compositional analysis of the form. This is an addition to traditional shape analysis (shape analysis), which makes it possible to apply the method to analyze the source code of applications.
  • A Local Shape Analysis Based on Separation Logic : This article describes the separation logic as a mechanism for analyzing the source code of applications. The authors show the possibility of studying individual cells in a heap of memory, without studying the entire heap entirely. Thus, certain cells make a linked list without complete analysis.
  • Smallfoot: Modular Automatic Assertion Checking with Separation Logic : this paper describes the predecessor of Facebook Infer, which is called Smallfoot.
  • AL: A new declarative language for detecting bugs with Infer : AL allows any developer to design new checkers without a full understanding of the internal Infer kitchen. AL is a declarative language.
  • Moving Fast with Software Verification : Finally, an article that reveals how Facebook uses Project Infer for its own needs. The document describes how Facebook developers have integrated Infer into their development process to provide static analysis for mobile applications such as Instagram, Facebook Messenger, and Facebook applications for Android and iOS.

So far, Infer can only be used for mobile applications. But some of its principles apply to general-purpose applications. Perhaps in the future, Infer’s capabilities will become wider, and with its help, developers will be able to analyze desktop or server applications.

Skillbox recommends: