Cats versus a neural network, or choosing and running a neural network for object recognition on a Raspberry Pi Zero

    Good day to all.

    The tiny Raspberry computer is a wonderful thing. I have used the Raspberry Pi Zero W in a couple of projects over the last six months, and it won me over with how easy it makes prototyping and trying out ideas. Now an idle question has caught my interest: can this device handle a full convolutional network? [Spoiler: it can, but there are fun nuances.] If the topic interests you, read on. Be warned, there will be many cats!


    Why put a neural network on the Raspberry?

    I once built a simple camera trap on a Raspberry Pi Zero W to monitor the nightlife of animals (mostly cats) at my country house. The code was simple and worked well. For capturing photos and video I used an infrared camera along the lines of the “Raspberry Pi Night Version Camera”.


    The essence of the code is to take two consecutive frames, compare them pixel by pixel and, if the number of changed pixels exceeds a threshold, start recording a 10-second video. I won't give the code in this post; if anyone is interested, write in the comments and I can include it in the next one. The main trick is to capture the two frames being compared within roughly 0.2 seconds, so that fast events are caught. And, of course, to compare those frames quickly.
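    The comparison described above can be sketched roughly as follows. This is not the author's code, which the post does not show: the function names, the per-pixel tolerance `pixel_delta` and the threshold `min_changed` are assumptions made for illustration, and frames are modelled as flat sequences of grayscale values. On the device itself a vectorised comparison would probably be needed to keep it fast.

```python
# Hypothetical sketch of frame differencing; all names and thresholds are assumptions.

def count_changed_pixels(prev_frame, curr_frame, pixel_delta=25):
    """Count pixels whose grayscale value changed by more than pixel_delta."""
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same size")
    return sum(1 for a, b in zip(prev_frame, curr_frame) if abs(a - b) > pixel_delta)


def motion_detected(prev_frame, curr_frame, pixel_delta=25, min_changed=50):
    """Return True when more than min_changed pixels differ between the frames."""
    return count_changed_pixels(prev_frame, curr_frame, pixel_delta) > min_changed


# Two fake 10x10 grayscale frames stored as flat byte strings.
dark = bytes([10] * 100)
with_cat = bytes([10] * 40 + [200] * 60)  # 60 pixels became brighter

print(motion_detected(dark, dark))      # identical frames
print(motion_detected(dark, with_cat))  # 60 changed pixels against a threshold of 50
```

    The per-pixel tolerance is there so that sensor noise does not count as a change; only the number of clearly changed pixels is compared against the threshold.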

    Then the idea arose of bolting a simple neural network onto the algorithm, so that an object in a captured frame could be identified and video recorded only if the object's class was reliably determined. This could eliminate the camera trap's false positives. Those are caused by moving objects (grass or branches, for example) or by abrupt changes in scene lighting (a light in a window switching on, or a lamp going out, for example).
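    One way to express "record only if the class is reliably determined" is a small gate placed after the classifier. The sketch below is an assumption about how such a gate might look, not the author's implementation: `should_record`, the label set and the 0.5 confidence threshold are all hypothetical.

```python
# Hypothetical gate between the classifier and the video recorder.

def should_record(predictions, wanted_labels, min_prob=0.5):
    """Decide whether to start recording.

    predictions   -- list of (label, probability) pairs, best guess first
    wanted_labels -- set of class names that are worth a video
    min_prob      -- assumed confidence threshold for a "reliable" result
    """
    if not predictions:
        return False
    label, prob = predictions[0]
    return label in wanted_labels and prob >= min_prob


cats = {"Persian cat", "Siamese cat"}  # illustrative label names

print(should_record([("Persian cat", 0.99), ("Angora", 0.01)], cats))
print(should_record([("soccer ball", 0.40), ("Persian cat", 0.30)], cats))
```

    Swaying grass or a sudden change in lighting would then still trigger the frame comparison, but not the recording, as long as the classifier does not confidently report one of the wanted classes.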

    Which network to put on the Raspberry?

    Fortunately, with the pre-installed Python (3.5.3 in my case) and the widely available OpenCV (I use 3.4.3), you can run almost any network. Unfortunately, because of the device's limited computing power, the list of practical options is short. In fact, you can only choose among the "light" ones:

    1. SqueezeNet (sample code here).
    2. YOLO Tiny (here).
    3. MobileNet-SSD (here).
    4. MobileNet_v1_224 (there is a fantastic video of an object detector running on this network).

    In all the cases listed above, the appeal is the chance to use a model pre-trained on reputable datasets, sparing yourself the pain of collecting your own dataset and then training a neural network on it.

    Candidate No. 1 attracted me with its claimed high recognition accuracy at a modest weight size. In addition, a brief Internet search led me to Adrian Rosebrock’s great blog, which comments on the code in detail and outlines several ways of implementing deep learning on the Raspberry.

    The code from here is used to test SqueezeNet. The author sends the weights and the text description of the model by email after you fill in a form on the website. By the way, if you do not have OpenCV installed, you can find installation steps on the same blog. There are also examples of “overclocking” the code to cut the model's run time, and much more. Respect to Adrian, it is a really cool resource.

    Well then, let's run the code, and on the very first picture we get a stunning result!


    The cat in the picture is identified as Persian with a probability of 99%. In fact he is not a Persian but a British Longhair, or Highlander. Still, for a model with 1000 classes, you could call that a bull's eye. For convenience, I put the network's main results directly on the photo: the 5 most probable classes, listed in descending order of probability.
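    Picking the five most probable classes out of the network's output comes down to sorting the probabilities. Here is a minimal sketch, assuming the output is a plain list of probabilities aligned with a list of class names; `top_k` is a hypothetical helper, and every number below except the 99% for Persian is invented for the example.

```python
# Hypothetical helper: rank classes by probability and keep the best k.

def top_k(probabilities, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Illustrative values only; a real model would return 1000 probabilities.
labels = ["tabby", "Persian cat", "Angora", "lynx", "Siamese cat", "beagle"]
probabilities = [0.002, 0.99, 0.004, 0.001, 0.0025, 0.0005]

for rank, (label, prob) in enumerate(top_k(probabilities, labels), start=1):
    print("{}. {}: {:.2%}".format(rank, label, prob))
```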

    By the way, the model takes 6.5 seconds to classify an image on my Zero. According to Adrian, the computation on a Raspberry Pi B+ for the pictures in his post (photos of a barbershop, a cobra and a jellyfish) takes about 0.92 seconds. I readily believe it: the full-size Raspberry has 4 cores in its processor, after all. I suppose we all know that the Zero has only one (((

    It seems we will have to forget about classifying objects in real time on the Zero. By the way, I must admit that roughly a second per image on a full-fledged Pi is no dream result either.
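    To put those timings in perspective, here is a small back-of-the-envelope calculation. The inputs are the figures quoted in the text (6.5 s on the Zero, about 0.92 s on the larger board, and the 0.2 s frame interval of the camera trap); the variable names are mine.

```python
# Rough arithmetic on the timings quoted in the post.
zero_seconds = 6.5      # one classification on the Pi Zero
pi_seconds = 0.92       # one classification on the larger Pi, per Adrian's post
capture_interval = 0.2  # time budget for grabbing the two compared frames

slowdown = zero_seconds / pi_seconds
zero_fps = 1.0 / zero_seconds
intervals_per_inference = zero_seconds / capture_interval

print("Zero is about {:.1f}x slower".format(slowdown))                     # about 7.1x
print("Zero throughput: {:.2f} frames per second".format(zero_fps))        # 0.15
print("One inference spans {:.1f} capture intervals".format(intervals_per_inference))  # 32.5
```

    In other words, a single classification on the Zero takes as long as dozens of frame comparisons, so the network can only be run on occasional frames rather than on every one.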

    But we will continue testing the model.


    The cat changed its position and lost as much as 7% of its former “Persianness”). But that is a joke, of course; overall, the model performs very well. I could have stopped right here, but I wanted to make the model's task a little harder. Let's keep practicing on... cats. But this time let's pick frames where the cat is not sitting in a classic pose but, for example, sleeping. So, let's go.


    In this picture the cat is identified as an Angora, but that's not certain. Apparently it is because she was irritated by the persistent request to leave the sink. Well, the neural network made a mistake; who doesn't, after all?


    It turns out that a fluffy soccer ball lives in my house) Yes, sometimes people are not at all what they seem at first glance. The fight between cats and the neural network is taking a serious turn.


    Wow. Now she is a Siberian Husky. Something tells me the cat is still ahead on points)


    It seems that one of these two has clearly been knocked down, and it is clearly not the cat. Now the neural network identifies her as a spindle (albeit with only 8.5%); the other options are a beagle, a killer whale, a rock python or a skunk. Not a cat, but a woman of mystery!


    Come on! It's a killer whale after all! Yes, yes, a marine mammal of the order Cetacea. For some reason, I recalled lines from my distant childhood:
    “There is no order in this fairy tale,
    There is a mistake, a typo! Someone,
    Against all the rules,
    Swapped the letters in the fairy tale,
    Changed "KIT" (whale) to "KOT" (cat),
    And "KOT" to "KIT", the other way round!”
    The gong rings, and the referee stops the fight)


    In the second round, the cat, having slyly put on glasses, passed for a Boston bulldog with a probability of 34%. Or for a French one. It seems the neural network has not fully recovered from its defeat in the first round)


    Well, finally! The cat is identified as Siamese with a probability of as much as 66%! Bravo, SqueezeNet! Seriously, it seems that in the original dataset most cat photos showed cats that were not lying down. The ones lying down were mostly dogs)


    The ability of cats to take the shape of a box confuses even people, never mind a neural network. Climbing into the box reduced recognition accuracy by as much as 40%.


    So, so... And this, it seems, is an outright illegal move. A computer mouse lying next to the cat finally confuses the neural network. Now our cat is a mouse! )

    In total, the neural network was shown 11 photos of cats, of which only 5 were identified correctly, and with a probability above 50% in only three cases. This in no way detracts from the work of SqueezeNet's authors. It is a solid network with a very wide range of object classes and relatively low resource requirements.

    The article is, of course, humorous in nature, but quite pragmatic conclusions can be drawn from the data. Pre-trained neural networks must be used with great care and checked on real images from the task for which you plan to use them.

    As for the choice of the optimal neural network for the Raspberry, the question remains open. I am continuing to experiment and, if readers are interested in the topic, I will share the results of further research. The results of the first step were simply so funny that I really wanted to share them.
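    As a quick check of the score, the counts from the paragraph above can be turned into percentages. Only the three counts come from the post; the variable names are mine.

```python
# Tally of the experiment, using only the counts given in the post.
photos_shown = 11
correct = 5            # photos where the top class was a cat breed
correct_over_half = 3  # of those, cases with probability above 50%

accuracy = correct / photos_shown
confident_accuracy = correct_over_half / photos_shown

print("Correctly identified: {:.0%}".format(accuracy))               # 45%
print("Correct and above 50%: {:.0%}".format(confident_accuracy))    # 27%
```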

    Thank you for reading to the end. Good luck and a good work week)

    UPD: The working code for running a neural network on the Raspberry Pi Zero W is in the second part of this post.