Optimizing PhonoPaper with Intel Tools

    Some time ago I wrote about one of my projects: the PhonoPaper technology and the application of the same name, which plays back sound printed as a spectrogram on paper or any other surface. The process looks like this: a 10-second sound (a voice, a fragment of a song) is converted into an image in a special format. The image is printed and, for example, stuck on a wall. A passerby who notices the code launches the PhonoPaper scanner on a phone, points the camera at the image and immediately hears the sound encoded in it. The user is fully involved in the process: the direction and speed of playback depend on the movement of the hand (although there is also an automatic mode). All the necessary information is stored in the image, so Internet access is not required.



    PhonoPaper has attracted keen interest from musicians, artists and fans of unusual experiments. In the third quarter of last year the application took first place in the "Intel Rating for Developers" project on Apps4All.ru. As a result, Intel kindly provided me with an Android x86 tablet for further improvement and optimization of PhonoPaper. I was quick to take advantage of it, and below I describe the work done and the results.

    Video capture


    The first thing I did was integrate the Intel INDE Media for Mobile libraries, specifically the GLCapture class, which captures video from an OpenGL ES surface in real time (in HD quality and with sound). Why is this needed? First, finding and playing PhonoPaper codes is an entertaining spectacle, reminiscent of playing an unusual musical instrument. Second, PhonoPaper can work in free mode, in which everything is converted to sound indiscriminately, including your carpet and your cat. Both would be great to record and share on YouTube.


    PhonoPaper hand-drawn codes


    Free mode - any image from the camera is perceived as a spectrum of sound.

    Integrating GLCapture has been described many times in various articles. I will mention only a few points worth knowing before you start.
    • The Android version must be at least 4.3. For older devices I wrote a cross-platform MJPEG recorder; its speed and quality are, of course, much lower than those of the hardware-accelerated GLCapture, which writes MP4.
    • The application must be built on OpenGL ES 2.0. My programs have historically used version 1.1, so I had to rewrite the code. In the end, the move to GLES 2.0 had a positive effect on performance, because it became possible to tune the shaders by hand.
    • GLCapture can record sound from the microphone. This is good if you want to accompany the video with your own commentary. If you need high-quality sound directly from the application, you have to write it to a separate file and then merge it with the MP4. To merge, you can use the MediaComposer class with the SubstituteAudioEffect effect from the Media for Mobile bundle. Another way is to write WAV, encode the WAV to AAC, and add the AAC track to the MP4 file using the mp4parser library.

    Since PhonoPaper is written in the Pixilang programming language, the video capture function will later be extended to other Pixilang-based applications (PixiTracker, PixiVisor, Nature - Oscillator, Virtual ANS) and, most importantly, it will be available to all developers who use Pixilang. Access is very simple: just a few functions to start capture, stop it, and save the result.

    Intel C++ and optimization


    The next step was to build the x86 Android version of PhonoPaper with the Intel compiler (version 15.0.0) and compare the results with GCC 4.8. I use Debian Linux, and a rather old version of it, so the first problem was finding a suitable release of Intel C++. For some reason most links led to the Intel INDE project, which does include the necessary compiler, but only for Windows and OS X. This seemed strange. Fortunately, the right distribution turned up: Intel System Studio 2015. Despite the warnings during installation, everything worked and the first build succeeded.

    Compilation was performed with the following options: -xATOM_SSSE3 -ipo -fomit-frame-pointer -fstrict-aliasing -finline-limit=300 -ffunction-sections -restrict. To test the performance of the Pixilang virtual machine (the basis of all my applications), I wrote small tests; their source and results can be found in this archive. Even without any preparation, some pieces of code ran 5 (!) times faster. Pretty impressive results!

    In PhonoPaper, most of the load falls on the spectral synthesizer function (wavetable-based, not FFT), wavetable_generator(). I wrote a separate test for it, which renders a stream of sound with a random spectrum for four seconds and then reports the highest achievable sampling rate. Alas, here ICC did worse: 105 kHz versus 100 kHz on GCC. Adding the -qopt-report=2 option during compilation produces a report with the following message:

    loop was not vectorized: vector dependence prevents vectorization.

    The main loop inside our function could not be vectorized, because input data pointers can point to overlapping sections of memory:

    int* amp = (int*)amp_cont->data;
    int* amp_delta = (int*)amp_delta_cont->data;
    

    As the developer, I know that overlap is impossible here, so the compiler just needs to be told. C99 has a special restrict keyword, which declares that the memory accessed through the pointer is not accessed through any other pointer; for C++ code the Intel compiler accepts it when the -restrict option is given, as in the options listed above. So we replace the code above with this:

    int* restrict amp = (int*)amp_cont->data;
    int* restrict amp_delta = (int*)amp_delta_cont->data;
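
    To illustrate what restrict tells the compiler, here is a minimal sketch in C. It is not the actual wavetable_generator() code: the function advance_amplitudes() and its loop body are hypothetical, chosen only to show a loop over two int arrays like the ones declared above. Whether a given compiler actually vectorizes it depends on the target and options.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inner loop of a wavetable-style generator:
   every amplitude is advanced by its per-sample delta.
   The restrict qualifiers promise that amp and amp_delta never overlap,
   so the compiler does not have to assume that a store to amp[i]
   could change a later amp_delta[j]. */
void advance_amplitudes(int* restrict amp,
                        const int* restrict amp_delta,
                        size_t n)
{
    assert(amp != NULL && amp_delta != NULL);
    for (size_t i = 0; i < n; i++)
        amp[i] += amp_delta[i];
}
```

    Note that restrict is a promise, not a check: if the two arrays did overlap, the behavior would be undefined, so it should be added only where overlap is ruled out by design.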
    

    We rebuild the application and see that the loop is now vectorized. Together with a few additional changes (along the way it turned out that several bit operations could be eliminated), the result is 190 kHz. GCC, with the same changes, produced 130 kHz. That is a 1.46-fold performance advantage for ICC!
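
    A throughput test of the kind described above (render for a fixed time, then report the achievable sampling rate) could be sketched as follows. This is an assumption about the general approach, not the author's test: render_fn, dummy_render(), measure_sample_rate() and BLOCK_SIZE are all hypothetical names, and dummy_render() is only a placeholder for a real synthesizer function.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define BLOCK_SIZE 4096  /* samples rendered per call (arbitrary choice) */

/* Hypothetical renderer signature: fill out[0..n-1] with samples. */
typedef void (*render_fn)(int* out, size_t n);

/* Placeholder renderer: pseudo-random samples from a simple LCG.
   A real test would call the synthesizer under measurement instead. */
void dummy_render(int* out, size_t n)
{
    static unsigned int state = 1u;
    for (size_t i = 0; i < n; i++) {
        state = state * 1664525u + 1013904223u;
        out[i] = (int)(state >> 16);
    }
}

/* Render blocks until `seconds` of CPU time have elapsed and
   return the number of samples produced per second.
   clock() measures processor time, which is adequate for a
   single-threaded, CPU-bound loop like this one. */
double measure_sample_rate(render_fn render, double seconds)
{
    assert(render != NULL && seconds > 0.0);
    int buf[BLOCK_SIZE];
    volatile int sink = 0;   /* keeps the rendered data observable */
    size_t total = 0;
    clock_t start = clock();
    clock_t now = start;
    while ((double)(now - start) / CLOCKS_PER_SEC < seconds) {
        render(buf, BLOCK_SIZE);
        sink = buf[0];
        total += BLOCK_SIZE;
        now = clock();
    }
    (void)sink;
    return (double)total / ((double)(now - start) / CLOCKS_PER_SEC);
}
```

    Calling measure_sample_rate(dummy_render, 4.0) would correspond to the four-second run; dividing the result by 1000 gives a figure in kHz comparable in form to the numbers quoted here.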

    What's next


    As you can see, the results are very positive! PhonoPaper has become faster (thanks in large part to the Intel C++ compiler) and has gained video capture. In addition, video recording will be exposed as a few simple functions in the upcoming Pixilang 3.6 update.
    For those who don't know, Pixilang is an open, cross-platform programming language focused on sound and graphics. Its syntax is very minimalistic, a kind of hybrid of BASIC and C, which, together with other features (the ability to write code without functions, universal containers for storing any data), lowers the barrier to entry.
