An Introduction to OpenCV for Road Marking Recognition
Hello, Habr! We are publishing a piece by our Deep Learning program graduate and Big Data program coordinator, Cyril Danilyuk, about his experience using the OpenCV computer vision framework to detect road marking lines.
Some time ago, I started Udacity's “Self-Driving Car Engineer Nanodegree” program. It consists of many projects covering various aspects of building a self-driving system. Here is my solution to the first project: a simple linear road marking detector. To see the end result, first watch the video:
The goal of this project is to build a simple linear model for frame-by-frame lane recognition: we take a frame as input, process it with a series of transformations (discussed below), and obtain a filtered image that can be vectorized and used to fit two independent linear regressions, one for each lane line. The project is intentionally simple: only a linear model, only good weather and visibility, only two marking lines. Naturally, this is not a production solution, but even such a project lets you play with OpenCV and filters and, more generally, gives a feel for the difficulties self-driving car developers face.
Detector operation principle
The detector construction process consists of three main steps:
- Data preprocessing, noise filtering and image vectorization.
- Updating the status of road marking lines according to the data from the first step.
- Drawing updated lines and other objects in the original image.
First, the `image_pipeline` function receives a 3-channel RGB image as input, which is then filtered, transformed, and used to update the `Line` and `Lane` objects. Finally, all the necessary elements are drawn on top of the image itself, as shown below. I tried to approach the task in an OOP style (unlike most analytical tasks), so that each of the steps is isolated from the others.
Step 1: Preprocessing and Vectoring
The first stage of our work is familiar to data scientists and anyone who works with raw data: first we need to preprocess the data, and then vectorize it in a form the algorithms can understand. The general pipeline for preprocessing and vectorizing the original image is as follows. Our project uses OpenCV, one of the most popular frameworks for working with images at the pixel level using matrix operations. First, we convert the original RGB image to HSV: in this color model it is convenient to select ranges of specific colors (and we are interested in shades of yellow and white to find the lane lines). Look at the screenshot below: selecting “everything yellow” in RGB is much harder than in HSV.
blank_image = np.zeros_like(image)
hsv_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
binary_mask = get_lane_lines_mask(hsv_image, [WHITE_LINES, YELLOW_LINES])
masked_image = draw_binary_mask(binary_mask, hsv_image)
edges_mask = canny(masked_image, 280, 360)
# Correct initialization is important, we cheat only once here!
if not Lane.lines_exist():
    edges_mask = region_of_interest(edges_mask, ROI_VERTICES)
segments = hough_line_transform(edges_mask, 1, math.pi / 180, 5, 5,
After converting the image to HSV, some recommend applying Gaussian blur, but in my case it reduced the quality of recognition. The next stage is binarization (converting the image into a binary mask with the colors we are interested in: shades of yellow and white).
Finally, we are ready to vectorize our image. We apply two transformations:
- Canny edge detector: an optimal edge detection algorithm that computes image intensity gradients and then uses two thresholds to discard weak edges, keeping the desired ones (we pass `(280, 360)` as the threshold values to the `canny` function).
- Hough transform: having obtained the edges with the Canny algorithm, we can connect them with lines. I won't go into the mathematics of the algorithm here (it deserves a separate post); this link or the link above will help if you are interested in the method. The main thing is that, by applying this transformation, we get a set of lines, each of which, after a little extra processing and filtering, becomes an instance of the `Line` class with a known slope and intercept.
Obviously, the upper part of the image is unlikely to contain marking lines, so it can be ignored. There are two options: either immediately paint the top of our binary mask black, or come up with smarter line filtering. I chose the second option: I assumed that anything above the horizon cannot be a marking line.
The skyline (vanishing point) can be determined by the point at which the right and left lanes converge.
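With both lane lines written as `y = k*x + b`, the vanishing point is just their intersection; a minimal sketch (the `vanishing_point` helper is my illustration, not the project's function):

```python
def vanishing_point(k_left, b_left, k_right, b_right):
    """Intersection of two lane lines y = k*x + b; None if (nearly) parallel."""
    if abs(k_left - k_right) < 1e-6:
        return None
    # Solve k_left*x + b_left == k_right*x + b_right for x.
    x = (b_right - b_left) / (k_left - k_right)
    return x, k_left * x + b_left
```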
Step 2: Update Road Marking Lines
The road marking lines are updated by the `update_lane(segments)` function in `image_pipeline`, which receives as input the `segments` from the previous step (which are, in fact, `Line` objects produced by the Hough transform). To simplify the process, I decided to use OOP and represent the road marking lines as instances of the `Lane` class: `Lane.left_line` and `Lane.right_line`. Some students limited themselves to adding a `lane` object to the global namespace, but I'm not a fan of global variables in code.
Let's take a closer look at the `Lane` and `Line` classes and their instances. Each instance of the `Line` class represents a separate line: a piece of road marking or just any line detected by the Hough transform, while the main goal of the `Lane` class is to identify whether a given line is a segment of the road marking. To do this, we are guided by the following logic:
- The line cannot be horizontal and should have a moderate slope.
- The difference between the slopes of the road marking line and the candidate line cannot be too high.
- The candidate line should not be far from the road markings to which it belongs.
- The candidate line should be below the horizon.
Thus, to decide whether a line belongs to the marking, we use fairly trivial logic: decisions are based on the line's slope and its distance to the marking. The method is not perfect, but it worked under my simple conditions.
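The four checks above can be sketched as a single predicate; all names and thresholds here are illustrative assumptions rather than the project's actual values (remember that in image coordinates y grows downward):

```python
def is_lane_candidate(segment, lane_k, lane_b, y_horizon,
                      k_min=0.3, k_max=3.0, max_dk=0.3, max_db=100.0):
    """Can a Hough segment (x1, y1, x2, y2) belong to the lane y = lane_k*x + lane_b?"""
    x1, y1, x2, y2 = segment
    if x1 == x2:                        # vertical segment: slope undefined, skip
        return False
    k = (y2 - y1) / (x2 - x1)
    b = y1 - k * x1
    if not (k_min <= abs(k) <= k_max):  # no horizontal or extreme slopes
        return False
    if abs(k - lane_k) > max_dk:        # slope must be close to the lane's slope
        return False
    if abs(b - lane_b) > max_db:        # intercept close => segment near the lane
        return False
    if min(y1, y2) < y_horizon:         # reject anything above the horizon
        return False
    return True
```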
The `Lane` class is a container for the left and right marking lines (refactoring is overdue). The class also provides several methods for working with the marking lines, the most important of which is `fit_lane_line`. To create a new marking line, I represent the suitable marking segments as points and then approximate them with a first-order polynomial (that is, a line) using the usual `numpy.polyfit` function.
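A minimal illustration of that fitting step with `numpy.polyfit` (the `fit_lane_line` signature here is my assumption about the method's shape):

```python
import numpy as np

def fit_lane_line(points):
    """Fit y = k*x + b through the lane's points with a 1st-order polynomial."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    k, b = np.polyfit(xs, ys, 1)
    return k, b
```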
Stabilizing the resulting road marking lines is very important: the original image is very noisy, and lane detection happens frame by frame. Any shadow or irregularity in the road surface immediately changes the marking color to one our detector cannot handle. In the process, I used several stabilization methods:
- Buffers. Each marking line remembers its N previous states and sequentially adds its state on the current frame to the buffer.
- Additional line filtering based on the data in the buffer. If after transformation and cleaning we still couldn't get rid of noise in the data, there is a chance our line will turn out to be an outlier, and, as we know, a linear model is sensitive to outliers. Therefore high precision is fundamentally important for us, even at the cost of a significant loss of recall. Simply put, it's better to filter out a correct line than to add an outlier to the model. Especially for such cases, I created `DECISION_MAT`, a “decision-making” matrix that decides how to combine the current line's slope with the average over all lines in the buffer.
For example, for `DECISION_MAT = [[0.1, 0.9], [1, 0]]` we choose between two decisions: consider the line unstable (i.e. a potential outlier) or stable (its slope matches the average slope of that lane's lines in the buffer, plus or minus a threshold). If the line is unstable, we still don't want to lose it: it may carry information about an actual turn in the road. We simply take it into account with a small coefficient (here, 0.1). For a stable line, we just use its current parameters without any weighting from previous data.
The stability of a marking line in the current frame is described by the Boolean attributes `Lane.right_lane.stable` and `Lane.left_lane.stable`. If at least one of these variables is `False`, I visualize this as a red polygon between the two lines (you can see what it looks like below). As a result, we get fairly stable lines:
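Putting the buffer and the decision matrix together, a stabilization step might look roughly like this; the `LineBuffer` class and its thresholds are my sketch of the idea, not the project's code:

```python
from collections import deque

DECISION_MAT = [[0.1, 0.9],   # unstable line: mostly trust the buffer average
                [1.0, 0.0]]   # stable line: use the new fit as-is

class LineBuffer:
    """Remembers the last N slopes and blends each new fit with their average."""
    def __init__(self, maxlen=10, stability_threshold=0.05):
        self.slopes = deque(maxlen=maxlen)
        self.threshold = stability_threshold
        self.stable = True

    def update(self, k_new):
        if self.slopes:
            k_avg = sum(self.slopes) / len(self.slopes)
            self.stable = abs(k_new - k_avg) <= self.threshold
            # Row 1 of DECISION_MAT for a stable line, row 0 for an unstable one.
            w_new, w_avg = DECISION_MAT[int(self.stable)]
            k_new = w_new * k_new + w_avg * k_avg
        self.slopes.append(k_new)
        return k_new
```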
Step 3: Drawing and updating the source image
For the lines to be drawn correctly, I wrote a fairly simple algorithm that computes the coordinates of the horizon point we already talked about. In my project, this point is needed for two things:
- Limit extrapolation of marking lines to this point.
- Filter out all Hough lines above the horizon.
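For instance, extrapolation up to the horizon boils down to solving `y = k*x + b` for x at two fixed heights; a hypothetical helper:

```python
def extrapolate_to_horizon(k, b, y_horizon, y_bottom):
    """Endpoints of y = k*x + b stretched from the image bottom up to the horizon."""
    x_top = (y_horizon - b) / k
    x_bottom = (y_bottom - b) / k
    return (round(x_bottom), int(y_bottom)), (round(x_top), int(y_horizon))
```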
To visualize the entire lane detection process, I did a little image augmentation:
As you can see from the code, I overlay two images onto the original video: one with the binary mask, the other with the Hough lines (converted to points) that passed all our filters. Onto the original video itself I overlay the two lane lines (a linear regression over the points from the previous image). The green rectangle indicates the presence of “unstable” lines: when they appear, it turns red. This architecture makes it easy to change and combine the frames displayed on the dashboard, letting you visualize many components at once without significant changes to the source code.
def draw_some_object(what_to_draw, background_image_to_draw_on, **kwargs):
    # do_stuff_and_return_image
# Snapshot 1
out_snap1 = np.zeros_like(image)
out_snap1 = draw_binary_mask(binary_mask, out_snap1)
out_snap1 = draw_filtered_lines(segments, out_snap1)
snapshot1 = cv2.resize(deepcopy(out_snap1), (240,135))
# Snapshot 2
out_snap2 = np.zeros_like(image)
out_snap2 = draw_canny_edges(edges_mask, out_snap2)
out_snap2 = draw_points(Lane.left_line.points, out_snap2, Lane.COLORS['left_line'])
out_snap2 = draw_points(Lane.right_line.points, out_snap2, Lane.COLORS['right_line'])
out_snap2 = draw_lane_polygon(out_snap2)
snapshot2 = cv2.resize(deepcopy(out_snap2), (240,135))
# Augmented image
output = deepcopy(image)
output = draw_lane_lines([Lane.left_line, Lane.right_line], output, shade_background=True)
output = draw_lane_polygon(output)
output = draw_dashboard(output, snapshot1, snapshot2)
return output
What's next?
This project is still far from complete: the more I work on it, the more things I find that need improvement:
- Make the detector non-linear so that it works successfully, for example, in the mountains, where the road turns at every step.
- Project the road into a “top view” (bird's-eye view): this would greatly simplify lane detection.
- Road recognition. It would be great to recognize not only the markings but the road itself, which would make the detector's job much easier.
All of the project's source code is available on GitHub at the link.
P.S. And now, let's break everything!
Of course, this post should have a fun part too. Let's see how the detector falls apart on a mountain road with frequent changes in direction and lighting. At first everything seems fine, but then the lane detection error accumulates, and the detector can no longer keep up:
And in the forest, where the lighting changes very quickly, our detector failed the task completely:
By the way, one of the upcoming projects is a non-linear detector that can handle exactly this “forest” task. Stay tuned for new posts!
→ The original Medium post in English .