A long-awaited step toward documents with complex structure (+ video)
With this article on Habr, we are pleased to announce that we have moved from rigidly structured template documents to recognizing a wide variety of documents with complex structure. And that, believe us, is a whole different story! Details are below the cut.
In every one of our articles on Habr, we keep repeating that our primary goal is to automate data entry from any document under natural, uncontrolled conditions, without the need for special equipment. In just a few years, we brought our ID document recognition system to an industrial level, and most financial applications (including some of national importance) now use our technology to speed up and simplify the user's work with the application.
Our global goal for this year is to recognize any document, with no additional requirements for templates or forms. As always, recognition must run directly on the device, be it a mobile device or a powerful server. After spending most of our time on internal review and redesigning our core Hieroglyph technology almost completely, we built the first version of a universal document recognition program: Smart DocumentReader.
What documents are recognized by Smart DocumentReader
Architecturally, Smart DocumentReader places no restrictions on the types of supported documents and can be configured to recognize any document with a complex structure. Documents can contain various semantic elements: tables, checkboxes, handwritten areas, and so on. There is one limitation, caused mostly by the hardware of mobile devices: the maximum physical size of a recognized document is A4. But, as you will agree, this is not a serious limitation given how paperwork is done in the Russian Federation. All the main financial documents are printed on A4 pages: the 2-NDFL certificate, invoice, certificate, consignment note (TTN), TORG12 waybill, universal transfer document (UPD), charter, contract, questionnaire, and so on.
Recognizing 2-NDFL certificates
As a first example, we configured Smart DocumentReader to recognize 2-NDFL certificates (personal income statements). In practice this is a very widely used document: banks require it when you apply for a large loan, and the state requires it when you claim a tax deduction.
In terms of internal structure, the 2-NDFL certificate is an excellent example of a document with complex structure: it contains mandatory and optional fields and several tables, its individual attributes are logically related to one another, and the number of fields to recognize is large.
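To make the idea of mandatory fields, optional fields, and logical relations between attributes more concrete, here is a minimal Python sketch of a consistency check on a recognition result. Everything in it is an assumption made for illustration: the field names, the function name check_certificate, and the rule that the total must equal the sum of the table rows are hypothetical and do not describe the actual Smart DocumentReader interface or the full 2-NDFL field set.

```python
from decimal import Decimal

# Hypothetical field names, chosen for illustration only.
REQUIRED_FIELDS = {"employer_name", "employee_name", "year", "total_income"}
OPTIONAL_FIELDS = {"employer_phone"}


def check_certificate(fields, income_rows):
    """Return a list of problems found in a recognized certificate.

    fields      -- dict mapping field name to recognized text
    income_rows -- amounts (as strings) read from the income table
    """
    problems = []
    present = set(fields)

    # Mandatory fields must all be present.
    for name in sorted(REQUIRED_FIELDS - present):
        problems.append(f"missing required field: {name}")

    # Anything outside the known field set is suspicious.
    for name in sorted(present - REQUIRED_FIELDS - OPTIONAL_FIELDS):
        problems.append(f"unexpected field: {name}")

    # Assumed logical relation: the total equals the sum of the table rows.
    if "total_income" in fields:
        rows_sum = sum((Decimal(r) for r in income_rows), Decimal("0"))
        if rows_sum != Decimal(fields["total_income"]):
            problems.append(
                f"total_income {fields['total_income']} "
                f"does not match sum of rows {rows_sum}"
            )
    return problems
```

A check of this kind is useful because a relation between attributes gives the system a way to catch a misread digit that would otherwise look like a perfectly valid value.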
Smart DocumentReader supports recognition of multi-page documents. To use it, show all pages of the document to the program one after another. As each new page appears, the overall recognition result is updated with the new data.
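The page-by-page update described above can be pictured with a short Python sketch. It is not the product's implementation: the classes FieldValue and DocumentResult, the confidence score, and the rule of keeping the more confident value when two pages report the same field are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FieldValue:
    text: str
    confidence: float  # hypothetical score in [0, 1]


@dataclass
class DocumentResult:
    fields: dict = field(default_factory=dict)
    pages_seen: int = 0

    def update(self, page_fields):
        """Merge the fields recognized on one page into the overall result."""
        self.pages_seen += 1
        for name, value in page_fields.items():
            current = self.fields.get(name)
            # Assumed rule: a new value replaces the old one only if it is
            # more confident; fields seen for the first time are added.
            if current is None or value.confidence > current.confidence:
                self.fields[name] = value


result = DocumentResult()
result.update({"employee_name": FieldValue("Ivanov I. I.", 0.90)})
result.update({
    "employee_name": FieldValue("Ivanov I. 1.", 0.40),
    "total_income": FieldValue("1500.00", 0.95),
})
```

After the second call, the result holds both fields, and the employee name keeps the more confident reading from the first page.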
Like all of our previous products, Smart DocumentReader runs on a wide range of processor architectures and operating systems. Today we support the Elbrus, Comdiv, SPARC, MIPS, ARM, and x86 architectures, and the Sailfish Mobile OS RUS (Aurora), iOS, Android, Elbrus, Linux, Windows, macOS, and Solaris operating systems. As for recognition speed, a one-page 2-NDFL certificate is recognized on a mobile phone in 3-5 seconds.
P.S. We have skipped almost all of the technical part in this article, because in the near future we plan a series of in-depth publications on the most important details behind the functionality presented here.