Handwriting Recognition
Introduction
This article will discuss the method of handwriting recognition by analyzing all points on the plane and sorting out all sorts of combinations in order to find the best superposition of control points on the previously described shapes. I will explain.
Handwriting is drawing with a conceivable “pen” a certain figure. Drawing in computer systems is the storage in the graphic memory of information about all the pixels of the graphic context. “A point on a plane” in mathematics is an abstract concept. In computer graphics, a “pixel" is hidden behind this concept. This recognition algorithm will analyze the set of points (pixels) provided to it and try to find the most possible and similar figure in it. The figure, in turn, is a frame containing only the main (control) points that make the figure unique.
Materiel
Generally speaking, the heart of the algorithm is the Cosine Theorem , well-known from the time of school , which is a generalized Pythagorean theorem . Knowing the coordinates of the three points of the plane and their order of “appearance” on it, we can easily determine the angle described by these points (The vertex of the angle is the second point in a row):
A (x1; y1)
B (x2; y2)
C (x3; y3) the
distances between the points are found by the Pythagorean theorem
a ^ 2 = b ^ 2 + c ^ 2 - 2 * b * c * cos (ALPHA)
cos (ALPHA ) = (b ^ + c ^ -a ^) / 2 * b * c
Knowing the cosine, the angle can easily be calculated.
Among the set of points that are fed to the input of the algorithm, it is necessary to "substitute" the points in all kinds of skeletons of figures (about them above) and choose the best solution among those found. This is done as follows:
- We take the first and last points of the wireframes of the figures. Already there are two, it remains to find the third (to find the value of the angle).
- The search for the third is carried out by enumerating all subsequent points after the first. The decision to include a point in the proposed frame of the figure is made on the basis of two analyzes:
- An attempt to substitute a point in a corner (as a third, final) and check it for compliance with the value of the same angle in the frame of a real figure.
- Check the aspect ratio of the resulting angle with the same aspect ratio of the angle in the frame of the real figure.
If these two conditions are met, then the algorithm decides to include a point from a set of points in a conceivable frame (in this case, we increase the similarity to the current analyzed figure).
If, for example, we have several analyzed frames, for example, “8” and “6”. And the result of the recognition algorithm: “8” - 80%, “6” - 90%, then the decision is made in favor of the figure in whose frame there are more control points, that is, in favor of the figure eight.
The percentage of similarity of a set of points with points in the wireframe is calculated simply: all points that coincide with the same points in the wireframe are summed up and the relation is found. Suppose if there are N control points in the frame, and we have M converging, then the percentage of similarity isM / N * 100
In words, something may be incomprehensible. Therefore, everything is the same, but clearly (using the example of the figure “6”):
Black denotes sets of points, red denotes a wireframe, in accordance with which the analysis is performed.
The numbers indicate the points of the angles (starting from the last), if we assume that we draw the six from the point “2” and end with the point “1”, then the first two points from which the analysis of the frame begins - “1” and “2”, then search for the point "3", so that its parameters relative to the angle formed by it coincide with the topic with the same parameters in the frame. Further, as we found the point “3”, we look for the point “4” (already relying on the points “2” and “3”) again, in accordance with the real frame, etc.
Letters markedside corners. That is, following the rules of the algorithm, a point from the set (points of the plane) can be included in the proposed frame if (examples):
(angle 2 ~ = angle 2 in the frame) And (a / b ~ = a1 / b1) then the point "3" will be turned on
(angle 3 ~ = angle 3 in the frame) And (b / c ~ = b1 / c1) then point "4" will be included
, etc.
The description of the algorithm ends here.
The code
It is easy to say, but before you say you need to think carefully about how to do what is said ...
Well, I already thought and implemented the invented algorithm using C ++ and the OpenGL graphics library (+ GLUT add-in). A graphics library is used to draw a set of points in two-dimensional space. The code turned out not so little, but not so much. Almost all of the code is spread across C ++ header files. The project is made publicly available to anyone who is at least a little interested. Sources are located on Bitbucket . The project uses the GIT version control system, so those who wish to deflate the source code of the project should have no difficulties with this.
To enter the figure programming modeYou need to click in the main window with the right mouse button. Then draw the frame (using the points connected to each other in order by segments) and click the middle mouse button. “Done!” Appears on the screen. After that restart the application.
Underwater rocks
Almost everywhere they are ... and this algorithm is no exception. Frankly, the algorithm allows periodic misfires in the correct recognition.
Take, for example, the characters "S" and "5", where misfires are almost inevitable. Although, if all control points are qualitatively designated, then, most likely, misfires can be avoided. Misfires can also occur when analyzing shapes that have complex circularities. If misfires occur on dissimilar characters (for example, I misfired with "6" and "8": 6 - 100%, 8 - 83%), then you can program the skeleton of each of the figures again (the number of repetitions is not limited). This way you can avoid recognition errors. And the last thing to note is the angle formed by the last, first and second points + aspect ratio should be remembered. To do this, you can “align” this angle to 90 degrees, as shown in the demo video below.
In the article I mentioned only numbers, as you may have noticed. But, in fact, recognition is applicable to absolutely any figures of two-dimensional space. I also made a small application for the article - a video demonstrating the operation of the algorithm.
If you have any questions, then ask - I will be happy to answer them.