Anti-spoofing: how do facial recognition systems resist scammers?

In this article I will try to summarize what is known about existing liveness detection methods, which are used to protect facial recognition systems against attacks.

What are we protecting from?

With the development of cloud technologies and web services, more and more transactions are moving into the online environment. At the same time, more than 50% of online transactions (retail) are made from mobile devices.

The growing popularity of mobile transactions cannot but be accompanied by an active growth in cybercrime.

Online fraud is 81% more likely than fraud at the point of sale.

In 2017 alone, the personal data of 16.7 million Americans was stolen (Javelin Strategy and Research). Losses from account takeover fraud amounted to $5.1 billion.

In Russia, according to Group-IB, hackers stole more than a billion rubles from owners of Android smartphones in 2017, which is 136% more than a year earlier.

Traditional ways of securing remote authentication, such as security questions or SMS codes, are no longer as reliable because fraud and social engineering techniques have improved. Here biometrics, and facial recognition in particular, is increasingly coming to the rescue.

According to Acuity Market Intelligence, by 2020 the total volume of biometric transactions, payment and non-payment, will exceed 800 million per year.

Face recognition is usually preferred because it is contactless and requires minimal user interaction, yet it is perhaps the biometric most vulnerable to fraud. An image of a person's face is much easier to obtain than other biometric identifiers, such as a fingerprint or iris. Any photograph of the user (taken close up without the user's consent, or found on the Internet) can be used to deceive the system. This kind of attack, in which a fraudster presents a fake identifier in place of the real user, is called spoofing.

Liveness detection methods

Reports of yet another successful attempt to deceive a facial recognition system appear on the Internet from time to time. But are developers and researchers really doing nothing to improve the security of these systems? Of course they are. This is how liveness detection technologies appeared: their task is to check that the identifier belongs to a live user.

There are several classifications of liveness detection methods. First of all, they can be divided into hardware and software.

Hardware methods involve additional equipment, such as infrared cameras, thermal cameras and 3D cameras. Because they are less sensitive to lighting conditions and can capture specific differences in images, these methods are considered the most reliable. In particular, according to the results of recent tests, the iPhone X, which is equipped with an infrared camera, turned out to be the only smartphone that successfully withstood attacks using a 3D face model. The disadvantages of such methods include the high cost of additional sensors and the complexity of integrating them into existing face recognition systems.

Hardware techniques are the perfect solution for mobile device manufacturers.

Unlike hardware methods, software methods need no additional equipment (a standard camera is enough), which makes them more accessible. At the same time, they are more vulnerable to spoofing, since the result of the check depends on factors such as lighting and camera resolution.

So is it enough to buy a modern smartphone with biometrics and an infrared sensor on board, and the problem is solved? That would be a logical conclusion, if not for one "but". According to forecasts, by 2020 only 35% of authentications will be performed via biometrics “embedded” in mobile devices, while biometric mobile applications will be used in 65% of cases. There is one reason for this: such devices are much more expensive and therefore will not be widely used. This means that the focus is still shifting towards software methods that can work effectively on billions of devices with conventional cameras. Let us look at them in detail.

There are two types of software methods: active (dynamic) and passive (static).

Active methods require the user's cooperation. The system prompts the user to perform certain actions on instruction, for example to blink, turn the head in a certain way or smile (a challenge-response protocol). This is also the source of their drawbacks. First, the need for cooperation negates the advantage of face recognition as a non-cooperative type of biometric authentication: users do not like to spend time on unnecessary “gestures”. Second, if the required actions are known in advance, the protection can be circumvented by playing a video or presenting a 3D replica with simulated facial expressions and movements.
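One way to reduce the risk of actions being known in advance is to draw the requested sequence at random for every session. The following is a minimal sketch of that idea only: the action names, `make_challenge` and `verify_response` are hypothetical, and the step that actually recognizes each action in the video is assumed to exist elsewhere.

```python
import secrets

# Hypothetical action names; a real system would define its own set.
ACTIONS = ("blink", "smile", "turn_left", "turn_right", "nod")


def make_challenge(length: int = 3) -> list[str]:
    """Return an unpredictable sequence of actions with no immediate repeats."""
    if length < 1:
        raise ValueError("length must be at least 1")
    challenge: list[str] = []
    while len(challenge) < length:
        action = secrets.choice(ACTIONS)  # unpredictable choice
        if not challenge or action != challenge[-1]:
            challenge.append(action)
    return challenge


def verify_response(challenge: list[str], detected: list[str]) -> bool:
    """Accept only if the detected actions match the challenge in order."""
    return detected == challenge


if __name__ == "__main__":
    session_challenge = make_challenge()
    print("Please perform:", ", ".join(session_challenge))
```

A pre-recorded video can only pass such a check if it happens to contain the same actions in the same order, which becomes less likely as the sequence grows.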

The essence of such methods is to detect movement in a sequence of input frames and extract dynamic features that distinguish real faces from fake ones. The analysis relies on the fact that flat 2D objects move very differently from a real human face, which is a 3D object. Since active methods use more than one frame, they need more time to reach a decision. The frequency of facial movements typically ranges from 0.2 to 0.5 Hz, so collecting data for spoofing detection takes more than 3 seconds, whereas human vision, whose ability these methods essentially mimic, detects movement and builds a map of the structure of the environment much faster.
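The timing figure follows from simple arithmetic: a movement at frequency f completes one cycle in 1/f seconds. The sketch below computes that window for both ends of the quoted range. The function name is illustrative, and the requirement to observe one full cycle is an assumption made here for the example.

```python
def observation_window(freq_hz: float, cycles: float = 1.0) -> float:
    """Seconds needed to observe `cycles` full periods of a movement."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return cycles / freq_hz


if __name__ == "__main__":
    for freq in (0.2, 0.5):
        print(f"{freq} Hz -> {observation_window(freq):.1f} s per cycle")
```

Under this assumption one cycle takes between 2 and 5 seconds, which is consistent with a collection time of more than 3 seconds.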

Unlike active methods, passive methods do not require user participation and rely on the analysis of a single 2D image, which makes them fast and convenient for the user. The most widely used are methods based on the Fourier spectrum (which look for differences in how 2D and 3D objects reflect light) and methods that extract image texture properties. Their effectiveness decreases when the direction and brightness of the illumination change. In addition, modern devices can display images in high resolution and natural color, which makes it possible to fool the system.
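To give a feel for the Fourier-spectrum idea, here is a minimal sketch using NumPy. It measures what share of an image's spectral energy lies at high spatial frequencies, on the simplifying assumption that a re-captured flat photo loses fine detail compared with a live face. The function name and the cutoff value are illustrative, not taken from any specific product, and a real detector would need a threshold tuned on data.

```python
import numpy as np


def high_freq_ratio(gray: np.ndarray, cutoff: float = 0.25) -> float:
    """Share of spectral energy above `cutoff` (in cycles per pixel)."""
    gray = np.asarray(gray, dtype=np.float64)
    gray = gray - gray.mean()  # remove the DC component
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = gray.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)  # distance from zero frequency
    total = spectrum.sum()
    if total == 0:
        return 0.0  # a perfectly flat image carries no detail at all
    return float(spectrum[radius > cutoff].sum() / total)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sharp = rng.random((64, 64))  # stand-in for a detailed image
    # 5x5 box blur built from shifted copies: stand-in for a soft re-capture
    blurred = sum(
        np.roll(np.roll(sharp, dy, axis=0), dx, axis=1)
        for dy in range(-2, 3)
        for dx in range(-2, 3)
    ) / 25.0
    print(high_freq_ratio(sharp), high_freq_ratio(blurred))
```

The blurred image yields a much lower ratio than the sharp one. The same property explains the weakness noted above: a high-resolution display preserves fine detail, so this kind of descriptor alone cannot tell it from a live face.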

What's better?

Method category	Principle of operation	Benefits	Limitations
Methods based on movements (facial expressions) or temporal methods (dynamic, less often static)	Detecting involuntary muscle movements or actions performed on request	Good generalization ability *	- Low reliability; - slow response (> 3 sec.); - high computational complexity; - effective only against photos and 2D masks.
Texture analysis methods (static)	Searching for texture features characteristic of a printed face (blurring, printing defects, etc.)	- Fast response (< 1 sec.); - only one image is required; - low computational complexity; - low cost; - non-invasive method.	- Low generalization ability; - vulnerable to attacks with high-resolution video.
Methods based on image quality analysis (static)	Comparing the image quality of a real face and a fake 2D image (distortion analysis, analysis of specular reflection distribution)	- Good generalization ability; - fast response (< 1 sec.); - low computational complexity.	- Different types of spoofing attacks require different classifiers; - vulnerable to modern devices.
Methods based on 3D facial structure (dynamic)	Detecting differences in the properties of the optical flow generated by three-dimensional objects and two-dimensional planes (motion trajectory analysis, building a depth map)	High reliability (against both 2D and 3D attacks)	- Slow response (> 3 sec.); - sensitivity to lighting and image quality.
Multimodal methods (static and dynamic)	A combination of two or more biometric methods	- High reliability; - versatility (choice of modality).	- Slow response (> 3 sec.); - the choice of modality lets the attacker pick the simplest method of attack; - the complexity of combining features extracted by different methods.
Methods using inertial sensors (dynamic)	Checking that facial movements correspond to camera movement, using the built-in sensors of a mobile device (accelerometer and gyroscope)	- High reliability (against 2D attacks); - the necessary sensors are already built into smartphones.	- Slow response (> 3 sec.); - the result depends on the measurement accuracy of the sensors; - sensitivity to lighting, occlusion and facial expressions.

The table briefly presents the key characteristics of the main categories of methods. I will not describe the individual methods in each category: there are many of them, and they differ depending on the algorithms used and their combinations.

* The ability of a model to work effectively on cases beyond its training examples (for example, when the conditions under which the template was registered change: lighting, noise, image quality).

Different types of methods can be combined, but because of the time needed to process the various parameters, the detection efficiency of such hybrid methods leaves much to be desired.
The picture of their use in modern face recognition systems is approximately as follows *:

* According to the results of the analysis of systems from more than 20 vendors

As the graph shows, dynamic methods prevail, and the bet is placed on actions performed on request. This choice most likely rests on the assumption that typical attackers have limited technical skills and simple tools. In practice, as technologies develop and become more available, more sophisticated spoofing methods emerge.

An example is the report by researchers from the University of North Carolina, who managed to fool five face recognition algorithms using textured 3D models of volunteers' heads. The models were created on a smartphone from studio photos and photos from social networks, and virtual reality technology was used to simulate movements and facial expressions. The “deceived” systems relied precisely on the analysis of user actions (building a structure or simply checking for movement); at least, vendors had not announced any other methods at that time.

The FaceLive method, by contrast, which was not used in face recognition systems at that time, missed attacks in only 50% of cases. Its liveness detection mechanism compares the changes in the direction of the phone's movement, as measured by the accelerometer, with the changes in facial landmarks (nose, eyes, etc.) observed in the camera video. A live user is detected if the changes in head position in the face video are consistent with the movements of the device. The disadvantages of the method include dependence on the accuracy of the device's inertial sensors, on the level of illumination and on the user's facial expressions, as well as the long duration of the procedure.
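The consistency check can be illustrated with a simple correlation between two signals. This is not the FaceLive algorithm itself, only a sketch of the general idea: both inputs are assumed to be already extracted and aligned 1-D series (device motion from the accelerometer, head motion from facial landmarks) with the same sign convention, and the function names and the 0.8 threshold are hypothetical.

```python
import numpy as np


def motion_consistency(device_motion, head_motion) -> float:
    """Pearson correlation between two equally sampled 1-D motion signals."""
    a = np.asarray(device_motion, dtype=np.float64)
    b = np.asarray(head_motion, dtype=np.float64)
    if a.ndim != 1 or a.shape != b.shape or a.size < 2:
        raise ValueError("signals must be 1-D, equal length, 2+ samples")
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0  # a flat signal gives no evidence of consistent motion
    return float(np.dot(a, b) / denom)


def looks_live(device_motion, head_motion, threshold: float = 0.8) -> bool:
    """Hypothetical decision rule: strong correlation suggests a live face."""
    return motion_consistency(device_motion, head_motion) >= threshold


if __name__ == "__main__":
    t = np.linspace(0.0, 2.0 * np.pi, 50)
    device = np.sin(t)               # phone swept left and right
    live_head = 0.5 * device + 0.1   # landmarks follow the device motion
    static_head = np.zeros_like(t)   # landmarks do not react at all
    print(looks_live(device, live_head), looks_live(device, static_head))
```

The sketch also makes the listed disadvantages concrete: noisy sensor readings or landmarks disturbed by facial expressions lower the correlation even for a live user.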

According to the authors of the report, blood flow analysis, light projection and the use of an infrared camera can successfully resist attacks that use a 3D model imitating facial expressions and movements.

Blood flow analysis is based on identifying differences in the reproduction of periodic changes in skin color due to heart contractions. Fake images reproduce color worse.
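A rough sketch of how such a periodic color change might be looked for is shown below. It assumes that the mean green-channel value of the face region has already been measured for every frame, and it searches for a dominant frequency in a plausible heart-rate band. The function name, the 0.7 to 4 Hz band and the use of the green channel are assumptions made for illustration, not details from the report.

```python
import numpy as np


def dominant_pulse(green_means, fps: float, band=(0.7, 4.0)) -> tuple[float, float]:
    """Return (peak frequency in Hz, share of signal energy at that peak)."""
    x = np.asarray(green_means, dtype=np.float64)
    x = x - x.mean()  # remove the constant skin tone
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    total = power[1:].sum()  # all energy except the zero frequency
    if not in_band.any() or total == 0:
        return 0.0, 0.0  # no periodic component found
    idx = np.flatnonzero(in_band)
    peak = idx[np.argmax(power[idx])]
    return float(freqs[peak]), float(power[peak] / total)


if __name__ == "__main__":
    fps = 30.0
    t = np.arange(300) / fps  # ten seconds of video
    live = 120.0 + 0.5 * np.sin(2.0 * np.pi * 1.2 * t)  # 72 beats per minute
    fake = np.full(300, 120.0)  # no periodic color change
    print(dominant_pulse(live, fps), dominant_pulse(fake, fps))
```

A clear, strong peak in the band suggests a live pulse, while a flat or noisy spectrum suggests a reproduction. Real footage would first need face tracking and filtering of lighting and motion noise.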

With light projection, a light source built into the device or an external one emits flashes at random intervals. To cheat, the 3D rendering system must be able to reproduce the projected lighting patterns on the model quickly and accurately. The need for additional equipment is a significant limitation.
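The random flashes can be thought of as a per-session secret that the captured face must react to. The sketch below generates an unpredictable flash pattern and measures how much brighter the face is on flash frames than on dark ones. All names and numbers are hypothetical, and the per-frame face brightness is assumed to be measured elsewhere.

```python
import secrets

import numpy as np


def random_flash_pattern(n_frames: int, flash_percent: int = 30) -> list[bool]:
    """Unpredictable pattern: True means the light fires on that frame."""
    return [secrets.randbelow(100) < flash_percent for _ in range(n_frames)]


def flash_response(pattern, brightness) -> float:
    """Mean face brightness on flash frames minus mean on dark frames."""
    p = np.asarray(pattern, dtype=bool)
    b = np.asarray(brightness, dtype=np.float64)
    if p.shape != b.shape:
        raise ValueError("pattern and brightness must have the same length")
    if p.all() or not p.any():
        return 0.0  # nothing to compare against
    return float(b[p].mean() - b[~p].mean())


if __name__ == "__main__":
    pattern = [True, False, True, False, False, True]
    reacting = [120.0, 100.0, 120.0, 100.0, 100.0, 120.0]  # follows the flashes
    replayed = [100.0] * 6  # pre-rendered content ignores the flashes
    print(flash_response(pattern, reacting), flash_response(pattern, replayed))
```

A face that really is in front of the camera brightens in step with the flashes, whereas pre-rendered content can only match if the attacker's renderer reproduces the lighting in real time.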

The report was published in 2016, and some algorithms have been improved since then. Some vendors now claim that their systems can successfully resist attacks using 3D masks.

Apple and Microsoft are examples of a serious attitude to the reliability of the technology. Face ID in its time helped draw a wide audience's attention to face recognition by demonstrating what the future of personal data security might look like. But soon after the launch, dozens of videos (mostly fake) appeared that claimed to fool the technology. In 2017, Windows Hello facial recognition was fooled with a printed image. Returning to the results of the Forbes tests, it is fair to say that the companies have done a lot of work since then, and as a result their systems could not be hacked.

I personally have not seen any examples of real hacking of facial recognition systems (that is, for the purpose of committing a crime), in contrast to, say, systems based on fingerprint scanning. In other words, all hacking attempts were made either to test reliability or to discredit the technology. Of course, face recognition systems are not as common as fingerprint scanning systems, but they are still in use, including in banks, where security is given maximum attention.

Let's sum up

The developers of face recognition systems certainly care about security: all vendors offer protection from spoofing (or declare that they have it). The exception is some mobile device manufacturers, but they usually warn that face recognition can be fooled and offer it as an additional protection factor.
Traditional methods tend to be limited by dependence on lighting conditions, speed of response, interactivity or high cost. Algorithms therefore need to be improved to make recognition systems more usable.
Future protection mechanisms should anticipate the development of spoofing technologies and adapt quickly to new threats.
The introduction of modern algorithms will make fraud an "expensive pleasure" and therefore impractical for most attackers: the more technical tools and skills an attack requires, the more protected users can feel.
The presence of new algorithms in the graph of method usage, albeit in small proportions, shows that vendors are searching for more effective protection against spoofing. Companies are experimenting and often offer not one but several liveness detection methods, which cannot but inspire optimism about the future of face recognition systems.
