
DIY passport scanner

Hello, Habr! In previous articles we described how we turned entering passport data on mobile devices from a chore into a quick and simple procedure. As the next logical step, we turned our Smart PassportReader SDK into a server component, making it easier for large financial institutions to process documents in their back offices. Finally, with some ingenuity and an engineering approach, we built a hardware-software complex, or PAK for short (spoiler: it is called Smart PassportBox), that streamlines the work of front offices and access control systems. So if you want to know how many directors, programmers, soldering irons, jigsaws and screwdrivers it takes to build a full-fledged PAK, read on.
Let us make one thing clear right away: we do not claim to have invented the wheel. The first automated workstations for recognizing passports of the Russian Federation appeared more than 10 years ago. Software for this task is offered both by the well-known giants of document recognition, ABBYY and Cognitive, and by relatively small companies such as PassportVision.
At first glance, such a simple and well-understood task seems solved: just take a product and use it. A closer look, however, reveals a number of limitations that surface during integration. Let's look at the two main ones.
First, sad as it may sound, most recognition software runs only on PCs with MS Windows. Yet in recent years, thanks to the active development of open-source communities, Linux has become increasingly common on operator workstations, so passport recognition software should follow this trend.
Second, high-quality passport recognition requires a scanner, since an image of the document has to come from somewhere. Here the customer usually faces a choice: cheap and slow with average quality, or expensive and fast with a pile of extra bells and whistles. Let us explain in more detail. Today a document image is usually obtained in one of two ways: with a flatbed scanner or with a dedicated passport reader. Flatbed scanners are relatively cheap (compact A5 models start at about 12,000 rubles) and deliver an image suitable for recognition in 10-15 seconds. However, there is no guarantee that recognition will succeed on the first try and that no rescan will be needed, for example because the page was bent or the document was oriented incorrectly. In that time an experienced typist can enter the passport data by hand. Specialized devices speed the process up (besides instant scanning, they usually offer verification, reading of the built-in chip and so on), but their main drawback is the price: about 100,000 rubles.
After this analysis, we concluded that the market for inexpensive yet effective passport recognition solutions is far from saturated, and that there is still room for new engineering and high-tech solutions.
So we were inspired to create a hardware-software complex of our own...
Development of an optical scanning device
As you have probably gathered from the introduction, the optical device is the foundation of the whole hardware-software complex. We therefore generously allocated $100 to it, without lowering the requirements for image quality and acquisition speed. The first (and most obvious) idea was to use a good webcam, especially since we are good at recognizing documents in video (see our previous post). And yes, Habr, it worked! The combination of a webcam and Smart PassportReader does solve the problem of automated document input. Admittedly, an ordinary operator did not find such a PAK easy to master, for the following reasons:
- it takes some skill to quickly hold a document in front of the webcam at the right distance without significant geometric distortion;
- the webcam has to be placed with the specifics of each workplace in mind, since glare from external lighting can otherwise degrade recognition quality;
- the whole spread cannot be recognized in front of the camera, since a passport is by nature a booklet and always tries to fold in half.
After a little thought, we decided to mount the camera inside a closed box with a glass “lid”, where everything is known in advance: the distance from the camera to the document is constant (so the focal length can be fixed), the lighting is always the same and does not depend on external factors, and the passport is pressed against the glass, which eliminates the book effect.
During the long, cold winter holidays, armed with boards and screwdrivers instead of snowboards and skis, we set about building the device... Out of wood...

Despite its apparent simplicity, the resulting device has its subtleties:
- the camera and the LED backlight must be positioned relative to each other so as to minimize reflections from the glass work surface;
- the LED backlight must be bright enough for the camera to work normally, yet must not overexpose any area of the document;
- the distance from the camera to the work surface must be chosen so that the document is captured at the maximum possible resolution.
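The trade-off in the last item can be estimated with simple pinhole-camera geometry. The sketch below is only an illustration in Python, not the calculation we actually used: the spread size (176 x 125 mm, two ID-3 pages) and the field-of-view angles in the usage example are assumptions, and lens distortion is ignored.

```python
import math

# Assumed size of an open passport spread: two ID-3 pages (125 x 88 mm each).
SPREAD_WIDTH_MM = 176.0
SPREAD_HEIGHT_MM = 125.0


def visible_size_mm(distance_mm, fov_deg):
    """Length of the glass surface seen along one axis at a given distance."""
    return 2.0 * distance_mm * math.tan(math.radians(fov_deg) / 2.0)


def min_camera_distance_mm(hfov_deg, vfov_deg,
                           width_mm=SPREAD_WIDTH_MM,
                           height_mm=SPREAD_HEIGHT_MM):
    """Smallest camera-to-glass distance at which the whole spread is in frame."""
    by_width = width_mm / (2.0 * math.tan(math.radians(hfov_deg) / 2.0))
    by_height = height_mm / (2.0 * math.tan(math.radians(vfov_deg) / 2.0))
    # The tighter of the two axes determines the required distance.
    return max(by_width, by_height)


def pixels_per_mm(sensor_px, fov_deg, distance_mm):
    """Resolution on the document plane along one axis."""
    return sensor_px / visible_size_mm(distance_mm, fov_deg)


if __name__ == "__main__":
    # Hypothetical FullHD webcam with a 60 x 35 degree field of view.
    d = min_camera_distance_mm(60.0, 35.0)
    print("distance, mm:", round(d, 1))
    print("horizontal px/mm:", round(pixels_per_mm(1920, 60.0, d), 2))
```

Moving the camera closer than this distance crops the spread, while moving it farther away wastes sensor pixels on the empty margin, which is why a wider-angle camera allows a lower box.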
As a result, on the third attempt (thanks again to the country's leadership for such long holidays), we assembled a scanner box that produces video sequences that are projectively distorted but quite suitable for recognition with our SDK.
Below is a brief bill of the materials that went into production. As you can see, we came in under the cherished 100 dollars with room to spare.
No. | Component part | Cost, rub. |
---|---|---|
1 | Furniture panels (chipboard) 16 mm | 200.00 |
2 | Window glass 4 mm | 100.00 |
3 | Accessories (self-tapping screws + holders + bracket) | 200.00 |
4 | LEDs, batteries, wires | 500.00 |
5 | FullHD webcam | 5000.00 |
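For those who like to check the arithmetic, here is a small Python sketch that totals the table and converts the result to dollars. The exchange rate is not stated in this article, so it is left as a parameter; the rates in the usage lines are purely illustrative.

```python
# Costs in rubles, copied from the table above.
BILL_OF_MATERIALS_RUB = {
    "Furniture panels (chipboard) 16 mm": 200.00,
    "Window glass 4 mm": 100.00,
    "Accessories (self-tapping screws + holders + bracket)": 200.00,
    "LEDs, batteries, wires": 500.00,
    "FullHD webcam": 5000.00,
}


def total_rub(bom):
    """Sum of all component costs in rubles."""
    return sum(bom.values())


def total_usd(bom, rub_per_usd):
    """Total cost in dollars at a given (assumed) exchange rate."""
    return total_rub(bom) / rub_per_usd


def fits_budget(bom, rub_per_usd, budget_usd=100.0):
    """True if the total does not exceed the budget at the given rate."""
    return total_usd(bom, rub_per_usd) <= budget_usd


if __name__ == "__main__":
    print("total, rub:", total_rub(BILL_OF_MATERIALS_RUB))
    # Hypothetical exchange rates, for illustration only.
    for rate in (55.0, 60.0, 65.0):
        print(rate, round(total_usd(BILL_OF_MATERIALS_RUB, rate), 2))
```

The webcam accounts for most of the total, so the choice of camera is what decides whether the budget holds.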
Software development
While building a box is something of a challenge for our company (whose business is software development), writing a recognition program on top of our own SDK is practically a holiday. In line with our goal, the recognition software must meet the following requirements:
- cross-platform operation (in particular, support for MS Windows and Linux);
- seamless integration with the optical scanning device;
- high recognition speed.
Accordingly, we made the following choices during development:
- all programs are written in C++, using cross-platform third-party libraries where necessary (for example, Qt for components with a user interface);
- integration with the optical device essentially comes down to writing a webcam control module (which, unfortunately, had to be written separately for each operating system) and correcting the resulting geometric distortions;
- high recognition speed is provided by the core itself: on a medium-performance PC, the Smart PassportReader SDK recognizes both pages of the passport in about 200 ms.
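To give an idea of what correcting geometric distortions involves, here is a minimal Python sketch of the standard four-point technique: estimating a projective transform (homography) that maps the four corners of the document in the camera frame onto an upright rectangle. It is not the code of the Smart PassportReader SDK, whose internals are not described here; all function names and the corner coordinates in the usage example are hypothetical.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """3x3 matrix mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(hm, point):
    """Map a single (x, y) point through the homography."""
    x, y = point
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    return ((hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w,
            (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w)


if __name__ == "__main__":
    # Hypothetical corners of the spread in a distorted frame (pixels),
    # listed clockwise from the top-left corner.
    corners = [(412, 180), (1510, 210), (1720, 980), (230, 1010)]
    upright = [(0, 0), (1760, 0), (1760, 1250), (0, 1250)]
    hm = homography(corners, upright)
    print(apply_homography(hm, (965, 595)))
```

Because the camera, the glass and the lighting are fixed inside the box, such a transform could in principle be computed once during calibration and reused for every frame.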
What we ended up with
Following the saying “It’s better to see once...”, we shot a short video of the resulting hardware-software complex.
Questions and answers
In this section we take the liberty of predicting your questions and answering them in advance. That said, we look forward to further questions in the comments.
1. So, after all, how many directors, programmers and tools did it take to build such a PAK?
All carpentry materials can be bought in a large home-improvement store (Leroy Merlin in our case), where the boards can also be neatly cut to size (so you don't have to fiddle with a jigsaw at home). The webcam comes from an electronics store such as Yulmart, and the wires and LEDs from Chip and Dip. In addition, it took 1 director (directors, it turns out, are naturally handy with a screwdriver and a soldering iron) and 1 C++ programmer (to write the software link between the webcam, the Smart PassportReader SDK and the output interface).
2. Judging by the images, the PAK is rather bulky. Is it convenient to work with?
At numerous demonstrations, potential customers are indeed somewhat put off by the height of the scanner (currently 276 mm). But keep in mind that this is just a prototype built entirely from household parts. With a plastic case and a small wide-angle camera, the height of the device can be reduced to 120 mm, which is quite comparable to existing passport scanners. And even at its current "enormous" size the device can be used effectively: nothing prevents you from hiding the "box" inside the desk or mounting it on the side (as shown in the figure below).

3. Still, even with such a compact design, there is a lot of free space inside the “box”. Have you thought about how to fill it?
We have not only thought about it but have already started filling the “box”. The remaining space can house a microcomputer that performs the document recognition itself (our SDK is well optimized for the ARM platform). In this form, the passport recognition PAK is contained in a single device and needs no external computing power.

Conclusion
Of course, our wooden box can hardly be called a finished industrial design. Nevertheless, even at this experimental stage we can list the main advantages of the resulting hardware-software complex for recognizing passports of citizens of the Russian Federation:
- high passport recognition speed (less than 1 second per spread, scanning included, on a computer with an Intel Core i5 processor);
- high recognition quality (balance of hardware and software components);
- simplicity of the device (case + camera + software, does not require special drivers and other system software);
- ease of operation (device design prevents typical operator errors);
- not a single moving part (in fact, nothing to break);
- simple API with various interfaces (C++, ActiveX, Java, C#);
- domestic development.
P.S. The concept of the hardware-software complex described in this article is protected by the patent legislation of the Russian Federation.