nenuacho October 9, 2014 at 12:18

Optical Character Recognition with .NET


As an example, take an ad on a popular site that displays phone numbers as images.

Here is the number itself:

First of all, I need a dictionary of every character that can appear in such images, so I'll start not with this number but with training. To do this, I found two phone numbers on the same ad site that together contain all 10 possible digits and glued them into one image:

Each character stands out clearly from the background, and identical characters are always drawn in exactly the same way. First, let's remove transparency:

void RemoveAlphaChannel(Bitmap src)
        {
            for (int y = 0; y < src.Height; y++)
                for (int x = 0; x < src.Width; x++)
                {
                    var pxl = src.GetPixel(x, y);
                    if (pxl.A == 0) src.SetPixel(x, y, Color.FromArgb(255, 255, 255, 255));
                }
        }

Then crop away the empty margins:

private Bitmap CropImage(Bitmap sourceBitmap)
        {
            var upperLeft = GetCorner(sourceBitmap, true);
            var lowerRight = GetCorner(sourceBitmap, false);
            var width = lowerRight.X - upperLeft.X;
            var height = lowerRight.Y - upperLeft.Y;
            Bitmap target = new Bitmap(width, height);
            using (Graphics g = Graphics.FromImage(target))
            {
                g.DrawImage(sourceBitmap, new Rectangle(0, 0, target.Width, target.Height), new Rectangle(upperLeft, new Size(width, height)), GraphicsUnit.Pixel);
            }
            return target;
        }

I won't describe the GetCorner method in detail. In short, it compares colors pixel by pixel and returns the upper-left or lower-right corner of the rectangle that bounds the useful area.
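Since GetCorner is not shown, here is a minimal sketch of the same idea: scan every pixel and track the extreme coordinates of the non-background ones. It is not the article's implementation. The class name BoundingBoxSketch and the bool mask (true = ink pixel) are assumptions chosen to keep the example free of System.Drawing.

```csharp
using System;

// Hypothetical helper, not the article's GetCorner: it works on a plain
// bool mask (true = ink pixel) instead of a Bitmap to stay self-contained.
public static class BoundingBoxSketch
{
    // Returns the inclusive corners of the smallest rectangle containing
    // all ink pixels, or null when the mask contains no ink at all.
    public static (int Left, int Top, int Right, int Bottom)? Find(bool[,] ink)
    {
        int height = ink.GetLength(0), width = ink.GetLength(1);
        int left = width, top = height, right = -1, bottom = -1;
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
            {
                if (!ink[y, x]) continue;
                left = Math.Min(left, x);
                right = Math.Max(right, x);
                top = Math.Min(top, y);
                bottom = Math.Max(bottom, y);
            }
        if (right < 0) return null; // no ink found
        return (left, top, right, bottom);
    }
}
```

With inclusive corners like these, the cropped width is Right - Left + 1 and the height is Bottom - Top + 1.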

Next, we split the resulting image into characters and add them to the collection. I used an algorithm that, on each iteration, cuts off the leftmost character:


    private void CropChars(Bitmap bitmapPattern, string stringPattern)
        {
            var croped = CropImage(bitmapPattern);
            RemoveAlphaChannel(croped);
            int cntr = 0;
            for (int x = 0; x < croped.Width; x++)
            {
                for (int y = 0; y < croped.Height; y++)
                {
                    if (
                       (y == croped.Height - 1 && x > 0)
                       || (x == croped.Width - 1 && x > 0)
                       )
                    {
                        var rect = new Rectangle(0, 0, x, croped.Height);
                        // Skip duplicates
                        if (_charInfoDictionary.FirstOrDefault(c => c.Char == stringPattern[cntr]) == null)
                            _charInfoDictionary.Add(new CharInfo(CropImage(croped, rect), stringPattern[cntr]));
                        ++cntr;
                        if (croped.Width - x <= 1) return;
                        croped = CropImage(croped, new Rectangle(x, 0, croped.Width - x, croped.Height));
                        x = 0;
                    }
                    if (!IsEmptyPixel(croped.GetPixel(x, y)))
                    {
                        break;
                    }
                }
            }
        }
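The same splitting idea can also be written as a single left-to-right pass that treats every fully empty column as a gap between characters. The sketch below is a different formulation from the repeated-crop loop above, not a replacement for it. The class name ColumnSplitSketch and the bool mask input (true = ink pixel) are assumptions made for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical single-pass variant of the character splitter.
public static class ColumnSplitSketch
{
    // Returns half-open column ranges [Start, End), one per run of
    // consecutive columns that contain at least one ink pixel.
    public static List<(int Start, int End)> Split(bool[,] ink)
    {
        int height = ink.GetLength(0), width = ink.GetLength(1);
        var ranges = new List<(int Start, int End)>();
        int start = -1; // -1 means we are currently inside a gap
        for (int x = 0; x < width; x++)
        {
            bool columnHasInk = false;
            for (int y = 0; y < height && !columnHasInk; y++)
                columnHasInk = ink[y, x];

            if (columnHasInk && start < 0)
            {
                start = x; // a new character begins here
            }
            else if (!columnHasInk && start >= 0)
            {
                ranges.Add((start, x)); // the character ended at the previous column
                start = -1;
            }
        }
        if (start >= 0) ranges.Add((start, width)); // character touching the right edge
        return ranges;
    }
}
```

Like the original loop, this only works when characters are separated by at least one empty column and each character is a single connected run of columns.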

There are two key points here:

1. stringPattern is the string "8929520-51-488926959-74-93" (the two training numbers written one after the other); each of its characters corresponds to the image of that character in the training picture.

2. The class that describes a character:

public class CharInfo
    {
        // Brightness sequence
        public int[] _hsbSequence;
        // Number of regions the character is divided into (horizontally and vertically) to build the brightness sequence
        private const int XPoints = 4;
        private const int YPoints = 4;
        // The character this entity represents
        public char Char { get; set; }
        // The image of this character
        public Bitmap CharBitmap { get; private set; }
        public CharInfo(Bitmap charBitmap, char letter)
        {
            Char = letter;
            CharBitmap = charBitmap;
            // Shrink the character to match the number of regions
            Bitmap resized = new Bitmap(charBitmap, XPoints, YPoints);
            _hsbSequence = new int[XPoints * YPoints];
            int i = 0;
            // Fill the sequence with brightness * 10. Brightness itself is a float from 0.0 (black) to 1.0 (white)
            for (int y = 0; y < YPoints; y++)
                for (int x = 0; x < XPoints; x++)
                    _hsbSequence[i++] = (int)(resized.GetPixel(x, y).GetBrightness()*10);
        }
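        /// <summary>
        /// Compares this character with another one by comparing their brightness sequences
        /// </summary>
        /// <param name="charInfo">The character to compare against</param>
        /// <returns>The number of matching positions</returns>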
        public int Compare(CharInfo charInfo)
        {
            int matches = 0;
            for (int i = 0; i < _hsbSequence.Length; i++)
            {
                if (_hsbSequence[i] == charInfo._hsbSequence[i]) ++matches;
            }
            return matches;
        }
    }
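To make the brightness sequence more concrete, here is a minimal sketch that builds a comparable signature from a plain array of brightness values and counts matching positions. It is an approximation under stated assumptions: it averages each grid cell by hand instead of relying on the Bitmap resize used in CharInfo, so the exact numbers may differ from what GDI+ produces. The names SignatureSketch, Build and CountMatches are hypothetical.

```csharp
using System;

// Hypothetical stand-in for the signature logic in CharInfo.
// Input: brightness[y, x] in the range 0.0 (black) to 1.0 (white),
// with at least one row and one column.
public static class SignatureSketch
{
    // Splits the image into xPoints * yPoints cells, averages the brightness
    // of each cell and stores it as an integer from 0 to 10.
    public static int[] Build(double[,] brightness, int xPoints = 4, int yPoints = 4)
    {
        int height = brightness.GetLength(0), width = brightness.GetLength(1);
        var signature = new int[xPoints * yPoints];
        for (int gy = 0; gy < yPoints; gy++)
            for (int gx = 0; gx < xPoints; gx++)
            {
                // Pixel bounds of the cell; each cell covers at least one pixel.
                int x0 = gx * width / xPoints;
                int x1 = Math.Max(x0 + 1, (gx + 1) * width / xPoints);
                int y0 = gy * height / yPoints;
                int y1 = Math.Max(y0 + 1, (gy + 1) * height / yPoints);

                double sum = 0;
                int count = 0;
                for (int y = y0; y < y1; y++)
                    for (int x = x0; x < x1; x++)
                    {
                        sum += brightness[y, x];
                        count++;
                    }
                signature[gy * xPoints + gx] = (int)(sum / count * 10);
            }
        return signature;
    }

    // Counts the positions where both signatures hold the same value,
    // mirroring what CharInfo.Compare does.
    public static int CountMatches(int[] a, int[] b)
    {
        int matches = 0;
        for (int i = 0; i < a.Length; i++)
            if (a[i] == b[i]) matches++;
        return matches;
    }
}
```

Because the comparison is exact equality per cell, even a small brightness shift in one cell costs a full match; that works here only because identical characters are rendered identically.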

Returning to the number in the ad, all that remains is to build a similar collection (with one difference: the character of each element is just a space placeholder) and compare each element against the dictionary.

public IEnumerable<CharInfo> Recognize(Bitmap bitmap)
        {
            RemoveAlphaChannel(bitmap);
            var charsToRecognize = CropChars(bitmap);
            List<CharInfo> result = new List<CharInfo>();
            foreach (var charInfo in charsToRecognize)
            {
                CharInfo closestChar = null;
                int maxMatches = 0;
                foreach (var dictItem in _charInfoDictionary)
                {
                    var matches = dictItem.Compare(charInfo);
                    if (matches > maxMatches)
                    {
                        maxMatches = matches;
                        closestChar = dictItem;
                    }
                }
                result.Add(closestChar);
            }
            return result;
        }
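One detail of the loop above: if no dictionary entry matches in even a single position, closestChar stays null and ends up in the result. A minimal sketch of the same best-match selection with an explicit fallback is shown below. It works on raw brightness sequences rather than CharInfo objects, and the names NearestMatchSketch, Recognize and the fallback character '?' are assumptions, not part of the article's code.

```csharp
using System.Collections.Generic;

// Hypothetical best-match lookup over brightness sequences.
public static class NearestMatchSketch
{
    // Returns the dictionary character whose sequence agrees with `unknown`
    // in the most positions, or `fallback` when nothing matches at all.
    public static char Recognize(int[] unknown,
                                 IReadOnlyDictionary<char, int[]> dictionary,
                                 char fallback = '?')
    {
        char best = fallback;
        int bestMatches = 0;
        foreach (var entry in dictionary)
        {
            int matches = 0;
            for (int i = 0; i < unknown.Length; i++)
                if (entry.Value[i] == unknown[i]) matches++;

            if (matches > bestMatches)
            {
                bestMatches = matches;
                best = entry.Key;
            }
        }
        return best;
    }
}
```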

As a result, we get a collection of characters in which the machine has matched and correctly filled in every digit.

            StringBuilder sb = new StringBuilder();
            foreach (var charInfo in charsToRecognize)
                sb.Append(charInfo.Char);
            textBox1.Text = sb.ToString();

Recognizing letters of the alphabet is somewhat more complicated, because some letters consist of several separate components and require a more complex algorithm for finding the bounding rectangle, but the comparison algorithm itself stays the same.

P.S. As for third-party libraries: at the time I found several (I no longer remember the names of the others), and for my purposes I chose MODI (Microsoft Office Document Imaging), a library from Microsoft that shipped with MS Office. It recognized text perfectly. The downside: only one recognition procedure could run within a single process, so it could not be parallelized across several threads.
