0xdde June 26, 2019 at 15:40

OpenCV on STM32F7-Discovery

I am one of the developers of the Embox operating system, and in this article I will talk about how I managed to run OpenCV on the STM32F746G Discovery board.

If you type something like "OpenCV on STM32 board" into a search engine, you will find quite a few people interested in using this library on STM32 boards or other microcontrollers.
There are several videos that, judging by their titles, should demonstrate exactly that, but in all the videos I have seen the STM32 board only captured the image from the camera and displayed the result on the screen, while the image processing itself was done either on a regular computer or on a more powerful board (for example, a Raspberry Pi).

Why is it difficult?

The popularity of these search queries is explained by the fact that OpenCV is the most popular computer vision library, so many developers are familiar with it, and being able to run desktop-ready code on a microcontroller greatly simplifies development. But why are there still no popular ready-made recipes for this problem?

Using OpenCV on small boards is difficult for two reasons:

If you compile the library even with a minimal set of modules, it simply won't fit into the flash memory of the STM32F7Discovery (even without counting the OS) because the code is very large (several megabytes of instructions)
The library itself is written in C++, which means
- You need C++ runtime support (exceptions, etc.)
- The LibC/POSIX support usually found in operating systems for embedded systems is not enough: you also need the C++ standard library and the STL (vector, etc.)

Porting to Embox

As usual, before porting any program to an operating system, it's a good idea to try building it the way its developers intended. In our case this is no problem: the sources can be found on GitHub, and the library builds under GNU/Linux with the usual cmake.

The good news is that OpenCV can be built as a static library out of the box, which makes porting easier. We build the library with the standard config and see how much space it takes. Each module is built as a separate library.

> size lib/*so --totals
   text    data     bss     dec     hex filename
1945822   15431     960 1962213  1df0e5 lib/libopencv_calib3d.so
17081885     170312   25640 17277837    107a38d lib/libopencv_core.so
10928229     137640   20192 11086061     a928ed lib/libopencv_dnn.so
 842311   25680    1968  869959   d4647 lib/libopencv_features2d.so
 423660    8552     184  432396   6990c lib/libopencv_flann.so
8034733   54872    1416 8091021  7b758d lib/libopencv_gapi.so
  90741    3452     304   94497   17121 lib/libopencv_highgui.so
6338414   53152     968 6392534  618ad6 lib/libopencv_imgcodecs.so
21323564     155912  652056 22131532    151b34c lib/libopencv_imgproc.so
 724323   12176     376  736875   b3e6b lib/libopencv_ml.so
 429036    6864     464  436364   6a88c lib/libopencv_objdetect.so
6866973   50176    1064 6918213  699045 lib/libopencv_photo.so
 698531   13640     160  712331   ade8b lib/libopencv_stitching.so
 466295    6688     168  473151   7383f lib/libopencv_video.so
 315858    6972   11576  334406   51a46 lib/libopencv_videoio.so
76510375     721519  717496 77949390    4a569ce (TOTALS)

As you can see from the last line, .bss and .data do not take up much space, but the code is more than 70 MiB. Of course, when this is linked statically with a specific application, the code will become smaller.

Let's try to throw out as many modules as possible while still being able to build a minimal example (one that, say, just prints the OpenCV version). To do that, we look at the output of cmake .. -LA and turn off everything that can be turned off in the options.

        -DBUILD_opencv_java_bindings_generator=OFF \
        -DBUILD_opencv_stitching=OFF \
        -DWITH_PROTOBUF=OFF \
        -DWITH_PTHREADS_PF=OFF \
        -DWITH_QUIRC=OFF \
        -DWITH_TIFF=OFF \
        -DWITH_V4L=OFF \
        -DWITH_VTK=OFF \
        -DWITH_WEBP=OFF \
        <...>

> size lib/libopencv_core.a --totals
   text    data     bss     dec     hex filename
3317069   36425   17987 3371481  3371d9 (TOTALS)

On the one hand, this is only one module of the library; on the other hand, it was built without compiler optimization for code size (-Os). About 3 MiB of code is still quite a lot, but it already gives hope for success.

Run in emulator

Debugging on an emulator is much easier, so first let's make sure that the library runs on QEMU. As the emulated platform I chose Integrator/CP because, first, it is also ARM, and second, Embox supports graphics output for this platform.

Embox has a mechanism for building external libraries; using it, we add OpenCV as a module (passing all the same options as for the "minimal" static-library build). After that I add the simplest possible application, which looks like this:

version.cpp:
#include <stdio.h>
#include <opencv2/core/utility.hpp>
int main() {
    printf("OpenCV: %s", cv::getBuildInformation().c_str());
    return 0;
}

We build the system, run it, and get the expected output.

root@embox:/#opencv_version                                                     
OpenCV: 
General configuration for OpenCV 4.0.1 =====================================
  Version control:               bd6927bdf-dirty
  Platform:
    Timestamp:                   2019-06-21T10:02:18Z
    Host:                        Linux 5.1.7-arch1-1-ARCH x86_64
    Target:                      Generic arm-unknown-none
    CMake:                       3.14.5
    CMake generator:             Unix Makefiles
    CMake build tool:            /usr/bin/make
    Configuration:               Debug
  CPU/HW features:
    Baseline:
      requested:                 DETECT
      disabled:                  VFPV3 NEON
  C/C++:
    Built as dynamic libs?:      NO
< Other build parameters follow: which flags were used for compilation,
  which OpenCV modules are included in the build, etc. >

The next step is to run some example, preferably one of the standard ones that the developers themselves offer on their website. I chose the Canny edge detector.

The example had to be rewritten a bit so that it draws the resulting image directly into the framebuffer. I had to do this because the imshow() function can draw images through the Qt, GTK and Windows interfaces, which, of course, will not be in the config for STM32. In fact, Qt can also be run on STM32F7Discovery, but that will be covered in another article :)

After briefly working out in which format the edge detector stores its result, we get an image.
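As a rough illustration of that conversion: cv::Canny produces an 8-bit single-channel edge map (0 for background, 255 for edge pixels), while the framebuffer discussed in this article is 32 bits per pixel. The sketch below is not the code from the Embox example. It assumes an ARGB8888 pixel layout and a tightly packed source image, and the helper names gray_to_argb8888 and blit_gray are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: expand one 8-bit gray value into an opaque
// ARGB8888 pixel (assumed layout: 0xAARRGGBB).
constexpr std::uint32_t gray_to_argb8888(std::uint8_t gray) {
    return 0xFF000000u
         | (static_cast<std::uint32_t>(gray) << 16)
         | (static_cast<std::uint32_t>(gray) << 8)
         |  static_cast<std::uint32_t>(gray);
}

// Hypothetical helper: copy a width x height 8-bit image (no row padding)
// into a 32-bit framebuffer. fb_stride is the framebuffer line length
// in pixels and is assumed to be >= width.
inline void blit_gray(const std::uint8_t *src, std::size_t width, std::size_t height,
                      std::uint32_t *fb, std::size_t fb_stride) {
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            fb[y * fb_stride + x] = gray_to_argb8888(src[y * width + x]);
        }
    }
}
```

With a cv::Mat holding the edge map, src would be its data pointer (provided the matrix is continuous), and fb would be whatever base address the platform's framebuffer driver reports.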

Original picture

Result

Running on STM32F7Discovery

The 32F746GDISCOVERY has several hardware resources that we can use one way or another:

320KiB RAM
1MiB flash for image
8MiB SDRAM
16MiB QSPI NOR flash
MicroSD card slot

An SD card can be used to store images, but for running a minimal example this is not very useful.
The display has a resolution of 480x272, which means that the framebuffer will take 522,240 bytes at a depth of 32 bits. That is more than the size of RAM, so we will place the framebuffer and the heap (which OpenCV needs to store image data and auxiliary structures) in SDRAM, and everything else (memory for stacks and other system needs) will go to RAM.
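The framebuffer arithmetic can be checked with a few lines of C++. This is only a sketch of the calculation using the numbers from the text (480x272, 32 bits per pixel, 320KiB of RAM); the function and constant names are made up for illustration.

```cpp
#include <cstdint>

constexpr std::uint32_t kBytesPerKiB = 1024;

// Bytes needed for a framebuffer of the given geometry
// (bits_per_pixel is assumed to be a multiple of 8).
constexpr std::uint32_t framebuffer_bytes(std::uint32_t width, std::uint32_t height,
                                          std::uint32_t bits_per_pixel) {
    return width * height * (bits_per_pixel / 8);
}

constexpr std::uint32_t kFbBytes  = framebuffer_bytes(480, 272, 32); // 522,240 bytes
constexpr std::uint32_t kRamBytes = 320 * kBytesPerKiB;              // 327,680 bytes

// false: the framebuffer alone is larger than the on-chip RAM
constexpr bool kFbFitsInRam = kFbBytes <= kRamBytes;
```

Even at 16 bits per pixel the framebuffer would take 261,120 bytes, which would leave very little of the 320KiB for everything else.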

If we take the minimal config for STM32F7Discovery (throw out all networking and all commands, make the stacks as small as possible, etc.) and add OpenCV with the examples to it, the memory requirements are as follows:

   text    data     bss     dec     hex filename
2876890  459208  312736 3648834  37ad42 build/base/bin/embox

For those who are not very familiar with what goes into which section, I will explain: .text and .rodata hold instructions and constants (roughly speaking, read-only data), .data holds data that can change, and .bss holds zero-initialized variables, which nevertheless need space (this section will go to RAM).

The good news is that .data/.bss should fit, but .text is a problem: there is only 1MiB of flash for the image. You could throw the example's picture out of .text and read it into memory at startup, for example from the SD card, but fruits.png weighs about 330KiB, so this will not solve the problem: most of .text is OpenCV code.
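To make the size budget concrete, here is a small sketch that plugs in the numbers quoted above (the text column of the size output, 1MiB of internal flash, 16MiB of QSPI flash, and roughly 330KiB for fruits.png). The constant and function names are made up for illustration, and the picture size is the approximate figure from the text.

```cpp
#include <cstdint>

constexpr std::uint32_t kKiB = 1024;
constexpr std::uint32_t kMiB = 1024 * kKiB;

// Numbers taken from the size output and the board's memory list above.
constexpr std::uint32_t kTextBytes  = 2876890;    // "text" column for build/base/bin/embox
constexpr std::uint32_t kFlashBytes = 1 * kMiB;   // internal flash for the image
constexpr std::uint32_t kQspiBytes  = 16 * kMiB;  // QSPI flash
constexpr std::uint32_t kPngBytes   = 330 * kKiB; // approximate size of fruits.png

// True when `size` bytes fit into a region of `capacity` bytes.
constexpr bool fits(std::uint32_t size, std::uint32_t capacity) {
    return size <= capacity;
}

constexpr bool kTextFitsFlash       = fits(kTextBytes, kFlashBytes);             // false
constexpr bool kTextNoPngFitsFlash  = fits(kTextBytes - kPngBytes, kFlashBytes); // still false
constexpr bool kTextFitsQspi        = fits(kTextBytes, kQspiBytes);              // true
```

Removing the picture leaves about 2.4MiB of code and constants, which is still well over the 1MiB of internal flash but comfortably within the 16MiB of QSPI flash.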

By and large, there is only one option left: loading part of the code into the QSPI flash (it has a special memory-mapped mode that maps it onto the system bus so that the processor can access the data directly). This raises two problems: first, the QSPI flash memory is not available immediately after the device reboots (the memory-mapped mode has to be initialized separately), and second, this memory cannot be flashed with the usual bootloader.

As a result, it was decided to link all the code into QSPI and to flash it with a bootloader that receives the required binary via TFTP.

Result

The idea of porting this library to Embox came up about a year ago, but it kept being postponed for various reasons. One of them was support for libstdc++ and the Standard Template Library. C++ support in Embox is beyond the scope of this article, so here I'll just say that we managed to get enough of it for this library to work :)

In the end, these problems were overcome (at least enough for the OpenCV example to work), and the example ran. It takes the board 40 long seconds to find edges with the Canny filter. This is, of course, too long (there are ideas on how to optimize it, and if they work out, that may become a separate article).

Nevertheless, the intermediate goal was to create a prototype demonstrating that running OpenCV on STM32 is possible in principle, and that goal has been achieved. Cheers!

tl;dr: step-by-step instructions

0: Download the sources of Embox, for example like this:

    git clone https://github.com/embox/embox && cd ./embox

1: Let's start by building a bootloader that will “flash” the QSPI flash drive.

    make confload-arm/stm32f7cube

Now you need to configure the network, because we will upload the image via TFTP. To set the IP addresses of the board and the host, edit the conf/rootfs/network file.

Configuration example:

iface eth0 inet static
    address 192.168.2.2
    netmask 255.255.255.0
    gateway 192.168.2.1
    hwaddress aa:bb:cc:dd:ee:02

gateway is the address of the host from which the image will be loaded; address is the address of the board.
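In the example configuration the board and the host sit in the same /24 network, so they can reach each other directly. If you pick your own addresses, a quick sanity check along these lines may help. This is just an illustrative sketch that is not part of Embox, and the function names are made up.

```cpp
#include <cstdint>

// Pack four octets into a 32-bit IPv4 value (most significant octet first).
constexpr std::uint32_t ipv4(std::uint32_t a, std::uint32_t b,
                             std::uint32_t c, std::uint32_t d) {
    return (a << 24) | (b << 16) | (c << 8) | d;
}

// Two addresses are on the same subnet when their network parts match.
constexpr bool same_subnet(std::uint32_t x, std::uint32_t y, std::uint32_t mask) {
    return (x & mask) == (y & mask);
}

// Values from the configuration example above.
constexpr std::uint32_t kBoard   = ipv4(192, 168, 2, 2);   // address
constexpr std::uint32_t kHost    = ipv4(192, 168, 2, 1);   // gateway
constexpr std::uint32_t kNetmask = ipv4(255, 255, 255, 0); // netmask

constexpr bool kReachable = same_subnet(kBoard, kHost, kNetmask); // true
```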

After that, build the bootloader:

    make

2: Load the bootloader onto the board in the usual way (pardon the pun). There is nothing specific here: do it the same way as for any other STM32F7Discovery application. If you don't know how, you can read about it here.
3: Build the image with the OpenCV config.

    make confload-platform/opencv/stm32f7discovery
    make

4: Extract the ELF sections that need to be written to QSPI into qspi.bin

    arm-none-eabi-objcopy -O binary build/base/bin/embox build/base/bin/qspi.bin \
        --only-section=.text --only-section=.rodata \
        --only-section='.ARM.ex*' \
        --only-section=.data

The conf directory contains a script that does this, so you can simply run it

    ./conf/qspi_objcopy.sh # The resulting binary is build/base/bin/qspi.bin

5: Use tftp to load qspi.bin onto the QSPI flash. To do this, on the host copy qspi.bin into the root directory of the tftp server (usually /srv/tftp/ or /var/lib/tftpboot/; packages for a suitable server are available in most popular distributions, usually named tftpd or tftp-hpa, and sometimes you need to run systemctl start tftpd.service to start it).

    # for tftpd
    sudo cp build/base/bin/qspi.bin /srv/tftp
    # for tftp-hpa
    sudo cp build/base/bin/qspi.bin /var/lib/tftpboot

On Embox (i.e., in the bootloader), you need to run the following command (we assume that the server has the address 192.168.2.1):

    embox> qspi_loader qspi.bin 192.168.2.1

6: Use the goto command to "jump" into the QSPI memory. The exact address depends on how the image was linked; you can see it with the command mem 0x90000000 (the start address is in the second 32-bit word of the image). You also need to pass the stack address with the -s flag; it is stored at 0x90000000. For example:

    embox>mem 0x90000000
    0x90000000:     0x20023200  0x9000c27f  0x9000c275  0x9000c275
                      ↑           ↑
              stack        address of
              address      the first
                           instruction
    embox>goto -i 0x9000c27f -s 0x20023200 # The -i flag disables interrupts during system initialization
    < From here on, the output comes from the OpenCV image rather than the bootloader >
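The two words read above follow the standard Cortex-M vector table layout: the first word is the initial stack pointer and the second is the reset handler address, with bit 0 set to indicate Thumb code. The sketch below shows how the goto arguments are derived from them. The type and function names are made up for illustration; the 0x90000000 base and the 16MiB size are the QSPI values mentioned in this article.

```cpp
#include <cstdint>

constexpr std::uint32_t kQspiBase = 0x90000000u;         // memory-mapped QSPI start
constexpr std::uint32_t kQspiSize = 16u * 1024u * 1024u; // 16 MiB

// Hypothetical holder for the two values passed to goto.
struct BootInfo {
    std::uint32_t stack; // value for the -s flag
    std::uint32_t entry; // address argument of goto
};

// word0 and word1 are the first two 32-bit words printed by `mem 0x90000000`.
constexpr BootInfo parse_vector_table(std::uint32_t word0, std::uint32_t word1) {
    return BootInfo{word0, word1};
}

// On Cortex-M, code addresses in the vector table have bit 0 set (Thumb state).
constexpr bool is_thumb(std::uint32_t entry) {
    return (entry & 1u) != 0;
}

// True when the address lies inside the memory-mapped QSPI window.
constexpr bool in_qspi(std::uint32_t addr) {
    return addr >= kQspiBase && addr - kQspiBase < kQspiSize;
}

constexpr BootInfo kBoot = parse_vector_table(0x20023200u, 0x9000c27fu);
```

For the dump shown above, the entry point is odd (Thumb) and lies inside the QSPI window, while the stack pointer points outside it, into on-chip memory. If the second word you read does not satisfy both conditions, the image was probably not written to QSPI correctly.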

7: Run

    embox> edges 20

and enjoy a 40-second edge search :)

If something goes wrong, open an issue in our repository, write to the mailing list embox-devel@googlegroups.com, or leave a comment here.
