Comparative testing of Smart IDReader on 5 computing systems with Elbrus processors

    Smart IDReader is an application that recognizes identity documents on a variety of platforms. Its recognition modes let you extract the document holder's data from a video stream, photos, or scans of documents.



    Today we want to tell you how we tested Smart IDReader on the Elbrus family of Russian-made computing systems. What machines did we test on? How does document recognition perform on the new Elbrus-8.4? Read on to find out.


    Our previous articles on recognition on Elbrus:
    https://habrahabr.ru/company/smartengines/blog/304750/
    https://habrahabr.ru/company/smartengines/blog/317672/
    https://habrahabr.ru/company/smartengines/blog/329858/
    https://habrahabr.ru/company/smartengines/blog/340918/


    What will we test on?



     
    The review involves 5 different devices based on Elbrus processors:


    Elbrus 101-PC


    Elbrus 101-PC is a compact, low-noise workstation based on the Elbrus-1C+ microprocessor, with a system unit in the nettop form factor. The Elbrus-1C+ processor has a built-in 3D graphics accelerator that supports OpenGL 2.1 and OpenCL 1.2, and it consumes no more than 25 W, which makes it well suited for embedded systems and portable terminals.



     


    Elbrus 401-PC


    Elbrus 401-PC is a personal computer based on the Elbrus-4C microprocessor, which has repeatedly been featured in our reviews.



     


    Server Elbrus-4.4


    The Elbrus-4.4 server is a 4-processor server based on the Elbrus-4C, equipped with 96 GB of RAM. Such a powerful server can solve complex computational problems, host various server applications, or simply store data.



     


    Elbrus 801-PC


    Elbrus 801-PC is a workstation based on the Elbrus-8C microprocessor, released last year. The Elbrus-8C has an improved architecture: it executes up to 25 operations per clock cycle and runs at up to 1300 MHz. In our sample, the clock frequency was reduced to 1200 MHz at the time of the experiments.



     


    Elbrus-8.4


    And finally, the new development from MCST and the Bruk INEUM: a 4-processor server module based on the Elbrus-8C with a 10 Gigabit Ethernet interface (M10GE/E) developed by MCST, which provides a trusted connection between nodes. It is intended for use as a node for storing, processing, and transmitting data, or for any other task you can think of.


    Characteristics of Elbrus-8.4


    Parameter | Value
    --- | ---
    Microprocessor | Elbrus-8C (1891VM10AYA)
    Cores per microprocessor | 8
    Maximum microprocessor clock frequency, GHz | up to 1.3
    Microprocessors in the computing device | 4
    RAM | up to 256 GB with error correction (ECC)
    Cooling system | built-in, air
    Input/output channels | 3 Gigabit Ethernet network connectors; 3 PCI Express bus connectors; 1 RS-232 connector; 4 USB bus connectors; 1 VGA video connector
    Power supply | 220 ± 22 V, 50 ± 1 Hz
    Power consumption, W, max | 500
    Operating temperature range | −10 °C to +50 °C




    In our Elbrus-8.4 sample, the clock frequency was likewise reduced to 1200 MHz at the time of the experiments.


    Now we present the characteristics of all tested machines together:


    Machine | Elbrus 101-PC | Elbrus 401-PC | Elbrus-4.4 | Elbrus 801-PC | Elbrus-8.4
    --- | --- | --- | --- | --- | ---
    CPU | Elbrus-1C+ | Elbrus-4C | Elbrus-4C | Elbrus-8C | Elbrus-8C
    General-purpose cores | 1 | 4 | 16 | 8 | 32
    Clock frequency, MHz | 985 | 800 | 750 | 1200 | 1200
    Operations per cycle (per core) | up to 25 (8 integer, 12 floating-point) | up to 23 | up to 23 | up to 25 (8 integer, 12 floating-point) | up to 25 (8 integer, 12 floating-point)
    Process technology | 40 nm | 65 nm | 65 nm | 28 nm | 28 nm
    Data storage device | 120 GB SSD mSATA 3.0 | 120 GB SSD mSATA 2.0 | 500 GB HDD 3.5'' SATA 2.0 | 120 GB SSD mSATA 3.0 | 2 TB HDD 3.5'' SATA 3.0
    RAM with error correction (ECC) | 16 GB | 24 GB | 96 GB | 32 GB | 128 GB
    Transistors (per processor) | 375 million | 986 million | ~986 million | 2.73 billion | ~2.73 billion
    L1 cache (per core) | 64 KB data + 128 KB instructions | 64 KB data + 128 KB instructions | 64 KB data + 128 KB instructions | 64 KB data + 128 KB instructions | 64 KB data + 128 KB instructions
    L2 cache (per core) | 2 MB | 2 MB | 2 MB | 512 KB | 512 KB
    L3 cache (shared) | - | - | - | 16 MB | 16 MB

    The width of the SIMD instructions for all processors was 64 bits.


    Tested Documents


    We decided to look at the recognition of 6 quite different types of documents:


    Russian passport



     
    Biometric passport of the Russian Federation



     
    Driving license of the Russian Federation



     
    UK Driving License



     
    German ID cards



     
    Disability certificate (sick leave)



     
    As you can see, there are several samples for the driver's licenses and ID cards, and they differ from one another. This is in fact a fairly typical situation: for some time after a new standard is introduced, both new-style and old-style documents remain in circulation. In addition, documents issued in different regions, or documents for different categories of citizens (for example, adults and minors), may differ. Therefore, before recognizing a driver's license or ID card, Smart IDReader first determines which specific type the document belongs to.


    Performance evaluation


    To evaluate the performance of Smart IDReader, we measured the net recognition time for a single scan or photo, excluding both the time to load the image from a file and the time to load configuration files. The document in the image may be arbitrarily rotated. Recognition time was averaged over 100 images of each document.
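
    The measurement scheme described above can be sketched roughly as follows. This is only an illustration of the idea of timing recognition alone: load_image and recognize are hypothetical stand-ins, not the Smart IDReader API, and the file names are made up.

```python
import time
from statistics import mean

def load_image(path):
    # Hypothetical loader: stands in for reading and decoding an image file.
    return {"path": path, "pixels": [0] * 1000}

def recognize(image):
    # Hypothetical recognizer: stands in for the real engine call.
    return {"fields": len(image["pixels"])}

def mean_recognition_time(paths, load=load_image, run=recognize):
    """Average duration of `run` alone; loading stays outside the timed region."""
    durations = []
    for path in paths:
        image = load(path)                  # not timed
        start = time.perf_counter()
        run(image)                          # timed: recognition only
        durations.append(time.perf_counter() - start)
    return mean(durations)

# 100 images per document type, as in the experiment (names are illustrative).
paths = [f"passport_{i:03d}.png" for i in range(100)]
avg = mean_recognition_time(paths)
print(f"mean recognition time: {avg:.6f} s over {len(paths)} images")
```

    Starting the timer only after the image is in memory keeps disk speed out of the comparison, which matters here because the tested machines use quite different storage devices.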


    Our application was compiled from source for the Elbrus architecture with the lcc 1.21.19 compiler and run in native mode. Parallelization used the maximum available number of threads via the TBB library.


    First, we ran recognition sequentially, one image at a time (time per image):


    Document | Elbrus 101-PC | Elbrus 401-PC | Elbrus-4.4 | Elbrus 801-PC | Elbrus-8.4
    --- | --- | --- | --- | --- | ---
    Russian passport | 3.87 s | 1.90 s | 1.80 s | 1.21 s | 1.09 s
    Biometric passport of the Russian Federation | 3.33 s | 1.85 s | 1.80 s | 1.10 s | 1.05 s
    Driving license of the Russian Federation | 4.24 s | 2.12 s | 1.81 s | 1.24 s | 1.09 s
    UK Driving License | 2.26 s | 1.08 s | 1.03 s | 0.69 s | 0.66 s
    German ID cards | 2.32 s | 1.22 s | 1.13 s | 0.77 s | 0.72 s
    Sick leave | 7.59 s | 3.40 s | 2.65 s | 1.97 s | 1.49 s

    In a more visual form:



    You can see that we have not been idle: since our last article, the recognition time for the Russian passport has decreased by a factor of 1.5 on both the 401-PC and the 801-PC and is now under 2 seconds. However, running recognition on more than 4 threads gives no significant speedup for any document except the sick leave certificate: only 12 text fields are recognized in the passport, and even fewer, 7, in driver's licenses and ID cards. As a result, the parallelized part of the algorithm is relatively small, which is of course a drawback when recognizing individual document images on multicore systems. The sick leave certificate contains significantly more fields, so the speedup from the 401-PC to the Elbrus-4.4 and from the 801-PC to the Elbrus-8.4 is more noticeable. It is also worth noting that the single-core 101-PC is only about twice as slow as the 401-PC. This is because the 101-PC uses the new revision of the Elbrus-1C+ processor.
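
    The limited gain from extra cores can be checked with a few lines of Python. The times below are copied from the sequential-recognition table above. Amdahl's law is a standard textbook model that we add here purely for intuition; the parallel fractions passed to it are illustrative assumptions, not values measured for Smart IDReader.

```python
# Sequential recognition times in seconds, copied from the table above.
times = {
    "Russian passport": {"401-PC": 1.90, "4.4": 1.80, "801-PC": 1.21, "8.4": 1.09},
    "Sick leave":       {"401-PC": 3.40, "4.4": 2.65, "801-PC": 1.97, "8.4": 1.49},
}

def speedup(doc, slower, faster):
    """How many times faster `faster` is than `slower` on one document type."""
    return times[doc][slower] / times[doc][faster]

def amdahl(parallel_fraction, cores):
    """Ideal speedup on `cores` cores when only `parallel_fraction` of the work scales."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

for doc in times:
    print(f"{doc}: 401-PC -> Elbrus-4.4 x{speedup(doc, '401-PC', '4.4'):.2f}, "
          f"801-PC -> Elbrus-8.4 x{speedup(doc, '801-PC', '8.4'):.2f}")

# Illustrative only: expected gain from 4 to 16 cores for hypothetical parallel fractions.
for p in (0.3, 0.6, 0.9):
    print(f"parallel fraction {p:.1f}: x{amdahl(p, 16) / amdahl(p, 4):.2f}")
```

    Keep in mind that the Elbrus-4.4 runs at 750 MHz against 800 MHz for the 401-PC, so the measured ratios slightly understate the effect of the additional cores.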


    However, on multicore systems you can run Smart IDReader in server mode, with several document recognition processes running in parallel. In this mode we can fully load all processor cores and evaluate the performance of each device more realistically.


    Each recognition call was parallelized in the same way as in the previous experiment; here, however, the processing time included loading the image from the file.
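
    The "average time per image" metric for this mode can be sketched as the wall-clock time of the whole batch divided by the number of images. The example below is a simplified illustration only: it uses a thread pool and time.sleep as a stand-in for separate recognition processes so that it stays self-contained, and process_image is a hypothetical placeholder rather than a Smart IDReader call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def process_image(path, work_seconds=0.02):
    # Hypothetical stand-in for "load the file, then recognize it".
    time.sleep(work_seconds)
    return path

def average_time_per_image(paths, workers):
    """Wall-clock time for the whole batch divided by the number of images."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process_image, paths))
    elapsed = time.perf_counter() - start
    return elapsed / len(results)

paths = [f"scan_{i:03d}.png" for i in range(16)]   # illustrative file names
one = average_time_per_image(paths, workers=1)
four = average_time_per_image(paths, workers=4)
print(f"1 worker: {one:.3f} s/image, 4 workers: {four:.3f} s/image")
```

    Under this definition, the time per image can drop well below the latency of any single recognition call, which is why the full-load figures are much smaller than the sequential ones.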


    Results with the Elbrus machines fully loaded (average time per image):


    Document | Elbrus 401-PC | Elbrus-4.4 | Elbrus 801-PC | Elbrus-8.4
    --- | --- | --- | --- | ---
    Russian passport | 1.27 s | 0.36 s | 0.43 s | 0.11 s
    Biometric passport of the Russian Federation | 1.13 s | 0.36 s | 0.42 s | 0.11 s
    Driving license of the Russian Federation | 1.79 s | 0.47 s | 0.64 s | 0.16 s
    UK Driving License | 0.93 s | 0.26 s | 0.32 s | 0.08 s
    German ID cards | 0.99 s | 0.26 s | 0.37 s | 0.10 s
    Sick leave | 2.22 s | 0.66 s | 0.86 s | 0.22 s

    Chart Results:



    These results show that the Elbrus-based server modules fully live up to their declared characteristics and deliver a 3-4x speedup over the corresponding single-processor workstations on highly parallel workloads. At the same time, the Elbrus-4.4 server is still 20-30% faster than the Elbrus 801-PC workstation. The comparison of the 401-PC and the 801-PC also brought no surprises: the 801-PC is almost 3 times faster than its predecessor thanks to its higher clock frequency and significantly improved architecture. The same ratio holds between the Elbrus-4.4 and the Elbrus-8.4.
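
    For readers who want to check these ratios themselves, here is a small sketch that derives them from the full-load table above. The numbers are copied from that table; the helper name ratio_range is our own.

```python
# Average time per image under full load, in seconds, copied from the table above.
# Column order: 401-PC, Elbrus-4.4, 801-PC, Elbrus-8.4.
full_load = {
    "Russian passport":        (1.27, 0.36, 0.43, 0.11),
    "Biometric passport (RF)": (1.13, 0.36, 0.42, 0.11),
    "Driving license (RF)":    (1.79, 0.47, 0.64, 0.16),
    "UK Driving License":      (0.93, 0.26, 0.32, 0.08),
    "German ID cards":         (0.99, 0.26, 0.37, 0.10),
    "Sick leave":              (2.22, 0.66, 0.86, 0.22),
}

def ratio_range(slow_col, fast_col):
    """Smallest and largest time ratio between two machines across all documents."""
    ratios = [row[slow_col] / row[fast_col] for row in full_load.values()]
    return min(ratios), max(ratios)

comparisons = {
    "401-PC vs Elbrus-4.4": (0, 1),
    "801-PC vs Elbrus-8.4": (2, 3),
    "801-PC vs Elbrus-4.4": (2, 1),
    "401-PC vs 801-PC": (0, 2),
    "Elbrus-4.4 vs Elbrus-8.4": (1, 3),
}
for name, (slow, fast) in comparisons.items():
    lo, hi = ratio_range(slow, fast)
    print(f"{name}: {lo:.2f}x to {hi:.2f}x")
```

    Reporting the minimum and maximum across document types, rather than a single average, shows how much the ratio depends on the document being recognized.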


     
    We are very grateful to MCST and the Bruk INEUM and to their employees for the opportunity to test the new Elbrus-8.4 server, and we wish them continued success with worthy new developments!


    Happy upcoming New Year to everyone! We wish you professional success, health, and happiness in 2018!

