Autonomous sidewalk driving with OpenCV and TensorFlow

    Building self-driving vehicles is a popular topic today, and a lot of interesting work is happening at the hobbyist level.

    The oldest and best-known online course on the subject is the one from Udacity.

    One very fashionable approach to autonomous driving is behavioral cloning: the computer learns to behave like a person behind the wheel, relying only on recorded inputs and outputs. Roughly speaking, there is a database of camera frames and the steering wheel angle that goes with each one.

    In theory, once a neural network has been trained on this data, we can let it steer the car.
    This approach is based on a paper from Nvidia.
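
    To make the idea concrete, here is a minimal sketch of the data layout behind behavioral cloning: frames paired with recorded steering angles. The least-squares fit is only a stand-in for the neural network, and the function names are made up for this illustration; none of this comes from the Nvidia paper or the Udacity projects.

```python
import numpy as np

def make_dataset(frames, angles):
    """Stack recorded frames and steering angles into training arrays."""
    X = np.stack([f.astype(np.float64).ravel() / 255.0 for f in frames])
    y = np.asarray(angles, dtype=np.float64)
    return X, y

def fit_linear_policy(X, y):
    """Least-squares fit: a toy stand-in for the neural network (assumption)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    weights, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return weights

def predict_angle(weights, frame):
    """Predict a steering angle for a single frame."""
    x = np.append(frame.astype(np.float64).ravel() / 255.0, 1.0)
    return float(x @ weights)
```

    The point is only the shape of the problem: pixels in, steering angle out, learned purely from recorded examples.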

    There are many implementations, made mainly by Udacity students.

    Even more interesting is its use in real projects. For example, the Donkey Car is controlled by a specially trained neural network.

    Such a rich information space practically pushes you into action, all the more so because my robot tank had reached something of a dead end since the previous article and urgently needed fresh ideas. The bold dream was to walk through the park with my tank, which, after all, is no worse than a pet dog. Only one small thing remained: teaching the tank to drive along the sidewalk in the park.

    So, what is a sidewalk from a computer's point of view?

    Some area in the picture that is different in color from other areas.

    As it happens, in the parks available to me the pavement turned out to be the grayest object in the picture.

    ("Grayest" here means the smallest difference between the R, G, and B values.) This grayness will be the key to recognizing the sidewalk.
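
    One way to put a number on this, as a sketch only (the function name is invented for this example), is the per-pixel spread between the largest and smallest color channel. A perfectly gray pixel scores 0.

```python
import numpy as np

def grayness_spread(image_rgb):
    """Per-pixel difference between the largest and smallest channel.

    0 means pure gray; larger values mean a more saturated color.
    Expects an H x W x 3 uint8 array.
    """
    img = image_rgb.astype(np.int16)  # avoid uint8 wrap-around on subtraction
    return (img.max(axis=2) - img.min(axis=2)).astype(np.uint8)
```

    For example, a pixel of (120, 120, 120) gives 0, while a reddish (200, 50, 50) gives 150.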

    Another important parameter of gray is brightness. Autumn photos are almost entirely gray, so the road differs from the roadside only in shade.

    tank in a park

    The most obvious approaches rely on pre-calibration: position the robot so that the road fills most of the frame, and then either

    • take the average brightness (the V channel in HSV), or
    • take the average RGB of a patch guaranteed to contain only road (in this case, the lower left corner).
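
    Both calibration options can be sketched with plain NumPy. The patch size (20% of the frame) and the function names are assumptions made for this example. The brightness helper relies on the fact that the HSV value channel is the per-pixel maximum of R, G, and B.

```python
import numpy as np

def reference_patch(image_rgb, fraction=0.2):
    """Lower left corner of the frame, assumed to contain only road."""
    h, w = image_rgb.shape[:2]
    ph, pw = max(1, int(h * fraction)), max(1, int(w * fraction))
    return image_rgb[h - ph:, :pw]

def mean_rgb(patch):
    """Average color of a patch as a length-3 array."""
    return patch.reshape(-1, 3).mean(axis=0)

def mean_brightness(image_rgb):
    """Average HSV value (V) over the image; V = max(R, G, B) per pixel."""
    return float(image_rgb.max(axis=2).mean())
```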

    With these criteria for recognizing the pavement in place, we scan the picture and get a rough outline of the road.
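
    A minimal sketch of that scan, assuming the calibration produced a reference RGB color: mark every pixel whose channels all lie within some tolerance of the reference. The tolerance of 25 and the function name are placeholders for this example, not values from the tank.

```python
import numpy as np

def road_mask(image_rgb, reference_rgb, tolerance=25):
    """Return 255 where a pixel is close to the reference color, else 0."""
    ref = np.rint(np.asarray(reference_rgb, dtype=np.float64)).astype(np.int16)
    diff = np.abs(image_rgb.astype(np.int16) - ref)
    close = (diff <= tolerance).all(axis=2)  # every channel within tolerance
    return np.where(close, 255, 0).astype(np.uint8)
```

    The result is a black and white image in which the white blob is, hopefully, the road.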

    The next step is to turn this rough blob into an action: go straight, turn right, or turn left.

    The rules are simple:

    • go straight if the right edge is visible and its angle is less than 45 degrees from the vertical;
    • turn left if the right edge is visible and its angle leans further away from the vertical;
    • turn right if the right edge is not visible.
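
    Written as code, the rules look roughly like this. It is a sketch under one assumption: "leans further away from the vertical" is read here as 45 degrees or more. The function and argument names are invented for the example.

```python
def choose_action(right_edge_visible, angle_from_vertical_deg=0.0):
    """Map the right road edge to a driving command.

    Assumption: any edge tilted 45 degrees or more from the vertical
    means the road bends left.
    """
    if not right_edge_visible:
        return "right"      # lost the edge: steer back toward it
    if abs(angle_from_vertical_deg) < 45:
        return "straight"   # edge runs roughly ahead of us
    return "left"           # edge cuts across our path
```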

    The right edge of such a ragged blob is too messy to handle with geometry alone. It is better to let artificial intelligence look for slope patterns in these fragments.

    This is where neural networks come to the rescue.

    The original pictures are blurred, downscaled, and cropped; then the gray pavement is selected and turned into 64x64 black-and-white masks.
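
    The pipeline can be sketched in NumPy alone. This is an illustration, not the tank's code: block averaging stands in for both the blur and the downscale, the crop takes the bottom center of the frame, and the scale factor and grayness threshold are placeholder values.

```python
import numpy as np

def block_mean_downscale(image, factor):
    """Shrink by an integer factor, averaging each factor x factor block."""
    h, w = image.shape[:2]
    h, w = h - h % factor, w - w % factor  # drop rows/cols that do not fit
    blocks = image[:h, :w].astype(np.float32).reshape(
        h // factor, factor, w // factor, factor, -1)
    return blocks.mean(axis=(1, 3))

def crop_bottom_square(image, size=64):
    """Keep a size x size window from the bottom center of the frame."""
    h, w = image.shape[:2]
    left = (w - size) // 2
    return image[h - size:, left:left + size]

def to_mask(image_rgb, max_spread=20):
    """White where the pixel is nearly gray (small channel spread)."""
    spread = image_rgb.max(axis=2) - image_rgb.min(axis=2)
    return np.where(spread <= max_spread, 255, 0).astype(np.uint8)

def preprocess(frame_rgb, factor=4, size=64):
    """Full sketch: smooth and shrink, crop, then threshold to a mask."""
    small = block_mean_downscale(frame_rgb, factor)
    return to_mask(crop_bottom_square(small, size))
```

    With a 640x480 frame and a factor of 4, the downscaled image is 160x120, which leaves room for the 64x64 crop.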

    We sort these masks into three piles (Left, Right, and Straight) and train a neural network classifier on them.
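
    Before training, the piles have to become arrays. Here is a sketch of that step, assuming the class order left, right, straight (the order and the helper names are choices made for this example): masks are scaled to the 0..1 range and the labels are one-hot encoded.

```python
import numpy as np

CLASSES = ("left", "right", "straight")  # assumed order of the three piles

def one_hot(labels, classes=CLASSES):
    """Encode label names as one-hot rows, one column per class."""
    index = {name: i for i, name in enumerate(classes)}
    encoded = np.zeros((len(labels), len(classes)), dtype=np.float32)
    for row, name in enumerate(labels):
        encoded[row, index[name]] = 1.0
    return encoded

def stack_masks(masks):
    """Scale 0/255 masks to 0..1 and add a trailing channel axis."""
    return np.stack(masks).astype(np.float32)[..., np.newaxis] / 255.0
```

    The resulting arrays have shapes (N, 64, 64, 1) and (N, 3), which is the layout a Keras image classifier with a categorical cross-entropy loss expects.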

    Collecting and preparing the data is tedious work; it took a couple of months.

    Here are samples of masks:




    To work with the neural network, I used Keras + Tensorflow.

    At first I thought of taking the network architecture from Nvidia, but it is clearly designed for a somewhat different task and does not handle classification very well. In the end, it turned out that the simplest neural network from any tutorial on multi-class classification gives quite acceptable results.

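    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
    from keras.optimizers import SGD

    input_shape = (64, 64, 1)  # assumed: 64x64 single-channel mask
    model = Sequential()
    activation = "relu"
    model.add(Conv2D(20, 5, padding="same", activation=activation, input_shape=input_shape))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(Conv2D(50, 5, padding="same", activation=activation))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    # Assumed head (missing from the listing): flatten, then one softmax output per class
    model.add(Flatten())
    model.add(Dense(3, activation="softmax"))  # Left, Right, Straight
    opt = SGD(lr=0.01)  # newer Keras versions name this argument learning_rate
    model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])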
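
    At run time, the softmax output still has to be turned into a command. A sketch, assuming the probabilities come from something like `model.predict` on a single mask and that the classes are ordered left, right, straight (the order depends on how the training labels were encoded):

```python
import numpy as np

CLASSES = ("left", "right", "straight")  # assumed to match the training labels

def decode_prediction(probabilities, classes=CLASSES):
    """Return the most likely class name and its probability."""
    probs = np.asarray(probabilities, dtype=np.float64).ravel()
    best = int(np.argmax(probs))
    return classes[best], float(probs[best])
```

    Returning the probability alongside the label makes it possible to ignore low-confidence frames, if that proves useful.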

    After training the first version of the network, I ran into its incompatibility with the Raspberry Pi. Until then I had been using TensorFlow 1.1, built for the Pi with a fair amount of black magic by one very clever person.

    Unfortunately, that version is outdated and could not read models saved by Keras.

    Recently, however, the people at Google finally relented and built TF for the Raspberry Pi, albeit only for the newer Raspbian release, Stretch. Stretch was fine in every respect, except that a year ago I could not get OpenCV to build on it, so the tank ended up on Jessie.

    Now, under the pressure of change, I had to switch to Stretch. TensorFlow installed without any problems (although it took several hours). OpenCV had not stood still over the year either, and version 4.0 had already been released. I managed to build it on Stretch, so nothing stood in the way of the migration any more.

    I had doubts about whether the Raspberry Pi could handle a monster like TensorFlow in real time, but everything turned out acceptable overall: although loading the network takes a few seconds at startup, classification itself runs several times per second without excessive memory or CPU use.

    As a result, most of the problems and mistakes now occur at the road recognition stage.
    The neural network itself rarely misses, despite its simple structure.

    With the updated firmware, the tank cruises through the park.

    Because of old injuries, the robot constantly drifts to the right, so without artificial intelligence it quickly ends up on the lawn.

    You can now walk it in the morning and detect oncoming dogs.

