
Computer vision and machine learning in PHP using the opencv library
- Tutorial
Hello. This is my anniversary article on Habr: in almost 7 years I have written 10 articles (including this one), 8 of them technical. Together they have been viewed about half a million times.
Most of my contributions went to two hubs: PHP and Server Administration. I like working at the junction of these two areas, but my interests are much wider.
Like many developers, I often use the results of other people's work (articles on Habr, code on GitHub, ...), so I am always happy to share my own results with the community in return. Writing articles is not only a way of repaying that debt: it also lets you find like-minded people, get comments from specialists in a narrow field, and deepen your own knowledge of the subject.
This article is about one such case. In it I describe what took up almost all of my free time over the past six months, apart from the moments when I went swimming in the sea across the road, watched TV shows, or played games.

Machine learning is developing rapidly. Many articles have been written about it, including on Habr, and almost every developer would like to start using it in work tasks and home projects, but it is not always clear where to start and what to apply. Most articles for beginners offer a reading list that a lifetime is not enough to get through, English-language courses (and not all of us can absorb material in English as effectively as in Russian), “inexpensive” Russian-language courses, and so on.
New articles describing new approaches to one problem or another appear regularly, and implementations of those approaches can be found on GitHub. The most commonly used programming languages are C/C++, Python 2/3, Lua, and MATLAB, and the most common frameworks are Caffe, TensorFlow, and Torch. Everyone writes in whatever they like. This fragmentation of languages and frameworks makes it much harder to find what you need and integrate it into a project. In addition, a lot of source code with comments in Chinese has appeared recently.
To reduce this chaos somewhat, the dnn module was added to OpenCV; it lets you use models trained in the main frameworks. For my part, I will show how this module can be used from PHP.
[Image: How_standards are multiplying.jpg]
Perhaps the attentive reader immediately thought of this picture, and would be partly right.


Jeremy Howard (creator of the free practical course “machine learning for coders”) believes there is currently a high threshold between studying machine learning and applying it in practice.

Howard says one year of programming experience is enough to start learning machine learning. I completely agree with him, and I hope this article lowers the entry threshold to OpenCV for PHP developers who are new to machine learning and still not sure whether they want to get into it at all. I will also try to describe everything that cost me hours and days, so that it takes you no more than a minute.
So what did I do besides the logo?

(I hope that opencv does not condemn me for plagiarism)
I considered writing the php-opencv module myself using SWIG and spent a lot of time on it, but got nowhere. It did not help that I did not know C/C++ and had never written extensions for PHP 7. Unfortunately, most of the material on the Internet about PHP extensions was written for PHP 5, so I had to collect information bit by bit and solve the problems on my own.
Then I found the php-opencv library on GitHub: a module for PHP 7 that wraps calls to OpenCV methods. It took me several evenings to compile it, install it, and run the examples. I started trying out the module's features, but some methods were missing, so I added them myself and created a pull request, which the author of the library accepted. Later I added even more features.
At this point the reader may ask: why did the author need all this trouble, and why not simply start using Python and TensorFlow?
Answer (caution: tedium and excuses!)
The fact is that I am not a professional machine learning specialist. At this stage I cannot develop my own approach to some narrow task, achieve results a couple of percent better than other researchers, and then patent it on top of that. That is what, for example, five Chinese guys with academic degrees did: they developed mtcnn and wrote an implementation in MATLAB and Caffe. Then three other Chinese guys ported this code to C++ & Caffe, Python & MXNet, and Python & Caffe. As you have probably guessed, knowing only Python and TensorFlow will not get you far. You will constantly have to deal with code in different languages, using different frameworks, with comments in Chinese.
Another example: I wanted to use facemark from OpenCV, but unfortunately its authors did not add support for this module in the Python bindings. At the same time, it took me one evening to add facemark bindings to PHP.
I also tried to compile OpenCV for use with Node.js, following several sets of instructions, but I got various errors and never achieved a result.
Above all, I simply found it interesting to do this despite all the difficulties.
In general, so far I am satisfied with working with opencv in php.
This is how image loading looks:
$image = cv\imread("images/faces.jpg");
For comparison, in python, it looks like this:
image = cv2.imread("images/faces.jpg")
When an image is read in PHP (as in C++), the data is stored in a Mat object (a matrix). Its closest PHP analog is a multidimensional array, but unlike an array, a Mat supports various fast operations, such as dividing all elements by a number. In Python, loading an image returns a NumPy array.
Caution, legacy! For historical reasons imread (in PHP, C++, and Python) loads the image in BGR rather than RGB channel order. That is why OpenCV examples so often contain a BGR -> RGB conversion and back.
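To make the BGR/RGB point concrete, here is a minimal sketch in plain PHP that works on nested arrays instead of a Mat. The function name swapChannels is made up for this illustration; with a real Mat you would use OpenCV's own colour conversion rather than a loop like this.

```php
<?php
// Plain-PHP illustration only: swapChannels is a hypothetical helper, not part of php-opencv.

/**
 * Reverse the channel order of every pixel (BGR <-> RGB).
 *
 * @param array $pixels rows of pixels, each pixel is [c0, c1, c2]
 */
function swapChannels(array $pixels): array
{
    $out = [];
    foreach ($pixels as $y => $row) {
        foreach ($row as $x => $pixel) {
            $out[$y][$x] = array_reverse($pixel);
        }
    }
    return $out;
}

// A 1x2 "image": pure blue and pure red, stored in BGR order.
$bgr = [[[255, 0, 0], [0, 0, 255]]];
$rgb = swapChannels($bgr);

echo json_encode($rgb), PHP_EOL; // [[[0,0,255],[255,0,0]]]
```

Applying the same function twice returns the original order, which is why the conversion "and back" costs nothing but time.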
Search for faces in the photo
This was the first feature I tried. OpenCV has the CascadeClassifier class for it, which can use a pre-trained model in XML format. Before searching for faces, it is recommended to convert the image to grayscale.
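use CV\CascadeClassifier;
use function CV\{imread, cvtColor};
use const CV\COLOR_BGR2GRAY;
$src = imread("images/faces.jpg");
$gray = cvtColor($src, COLOR_BGR2GRAY); // the cascade works on a grayscale image
$faceClassifier = new CascadeClassifier();
$faceClassifier->load('models/lbpcascades/lbpcascade_frontalface.xml'); // pre-trained model
$faceClassifier->detectMultiScale($gray, $faces); // $faces receives the found face rectangles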
→ Full example code
Result:

As the example shows, finding a face is not a problem even in a photo with zombie makeup. Glasses do not get in the way either.
Recognition (identification) of faces in the photo
For this, OpenCV has the LBPHFaceRecognizer class with the train and predict methods.
To find out who is in a photo, we first need to train the model with the train method. It takes two parameters: an array of face images and an array of numeric labels for those images. After that, you can call the predict method on a test image (a face) and get the numeric label it corresponds to.
use CV\Face\LBPHFaceRecognizer;
$faceRecognizer = LBPHFaceRecognizer::create();
$faceRecognizer->train($myFaces, $myLabels = [1,1,1,1]); // 4 of my faces
$faceRecognizer->update($angelinaFaces, $angelinaLabels = [2,2,2,2]); // 4 faces of Angelina
$label = $faceRecognizer->predict($faceImage, $confidence);
// we get the label (1 or 2) and $confidence
→ Full code for the example
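The label returned by predict is just a number, so some mapping back to a person is needed. Below is a small plain-PHP sketch of that step. The helper name, the names array, and the threshold of 80 are all assumptions for illustration. As far as I know, for LBPH the value returned in $confidence is a distance, so a smaller number means a closer match.

```php
<?php
// Hypothetical helper: the names and the threshold value are assumptions for illustration.

/**
 * Turn a predicted label and its distance into a readable name.
 * For LBPH the "confidence" is a distance: smaller means a closer match.
 */
function describePrediction(int $label, float $distance, array $names, float $maxDistance): string
{
    if ($distance > $maxDistance || !isset($names[$label])) {
        return 'unknown';
    }
    return $names[$label];
}

$names = [1 => 'me', 2 => 'Angelina'];

echo describePrediction(2, 42.0, $names, 80.0), PHP_EOL;  // Angelina
echo describePrediction(1, 130.5, $names, 80.0), PHP_EOL; // unknown
```

A suitable threshold depends on the photos, so it is worth printing the distances for a few known faces before picking one.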
Face Sets:


Result:

When I started working with LBPHFaceRecognizer in php-opencv, there was no way to save, load, or further train a finished model. My first pull request added exactly these methods: write, read, and update.
Face tagging
When I was getting acquainted with OpenCV, I often came across photos of faces with the eyes, nose, lips, and so on marked with dots. I wanted to repeat this experiment myself, but it was not implemented in the OpenCV version for Python. It took me an evening to add FacemarkLBF support to PHP and send a second pull request. It works simply: load the pre-trained model, pass an array of faces as input, and get back an array of points for each face.
use CV\Face\FacemarkLBF;
$facemark = FacemarkLBF::create();
$facemark->loadModel('models/opencv-facemark-lbf/lbfmodel.yaml'); // pre-trained model
$facemark->fit($src, $faces, $landmarks); // $landmarks receives an array of points per face
→ Full code for the example
Result:

As the example shows, zombie makeup can degrade the placement of the landmark points. Glasses can interfere as well, and so can glare. Foreign objects in the mouth (strawberries, cigarettes, etc.), on the other hand, may not interfere at all.
After my first pull request I was inspired and started looking at what else could be done with OpenCV, and came across the article “Deep Learning, now in OpenCV”. Without hesitation, I decided to add to php-opencv the ability to use pre-trained models, of which the Internet is full. Loading Caffe models turned out to be not very difficult, though later it took me a lot of time to learn how to work with multidimensional matrices: half of that time went into C++ and studying the internals of OpenCV, and the other half into Python and working with Caffe, Torch, and TensorFlow models without OpenCV.
Search for faces in a photo using the dnn module
OpenCV lets you load models pre-trained in Caffe using the readNetFromCaffe function. It takes two parameters: the paths to the .prototxt and .caffemodel files. The prototxt file contains the description of the model, and the caffemodel file contains the weights computed during training.
Here is an example of the beginning of a prototxt file:
input: "data"
input_shape {
dim: 1
dim: 3
dim: 300
dim: 300
}
This fragment says that the network expects a 4-dimensional 1x3x300x300 matrix as input. Model descriptions usually state the expected input in this format; most often it means that an RGB image (3 channels) of 300x300 pixels is expected.
Loading a 300x300 RGB image with the imread function gives us a 300x300x3 matrix.
To convert the 300x300x3 matrix to 1x3x300x300, OpenCV has the blobFromImage function.
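The layout change itself can be sketched in plain PHP on nested arrays. This is only an illustration of the reordering from [height][width][channel] to [1][channel][height][width], with the mean subtracted first and the scale applied afterwards; it does no resizing, the function name imageToBlob is made up, and real code should call blobFromImage.

```php
<?php
// Plain-PHP sketch of the layout change only; imageToBlob is a hypothetical name.

/**
 * Convert an image stored as [height][width][channel] into a blob
 * stored as [1][channel][height][width].
 *
 * @param array $image  pixels in BGR order, as imread returns them
 * @param array $mean   per-channel values to subtract, in output channel order
 * @param float $scale  multiplier applied after mean subtraction
 * @param bool  $swapRB swap the first and last channel before subtracting the mean
 */
function imageToBlob(array $image, array $mean, float $scale = 1.0, bool $swapRB = false): array
{
    $blob = [[]];
    foreach ($image as $y => $row) {
        foreach ($row as $x => $pixel) {
            if ($swapRB) {
                $pixel = array_reverse($pixel);
            }
            foreach ($pixel as $c => $value) {
                $blob[0][$c][$y][$x] = ($value - $mean[$c]) * $scale;
            }
        }
    }
    return $blob;
}

// A tiny 2x2 image with 3 channels per pixel.
$image = [
    [[10, 20, 30], [40, 50, 60]],
    [[70, 80, 90], [100, 110, 120]],
];
$blob = imageToBlob($image, [0, 0, 0]);

printf(
    "%dx%dx%dx%d\n",
    count($blob),
    count($blob[0]),
    count($blob[0][0]),
    count($blob[0][0][0])
); // 1x3x2x2
```

For the 300x300 image from the text the same reordering yields the 1x3x300x300 shape that the prototxt asks for.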
After that, all that remains is to pass the blob to the network input with the setInput method and call the forward method, which returns the finished result.
use CV\Scalar;
use CV\Size;
use function CV\imread;
$src = imread("images/faces.jpg");
// model description (.prototxt) and trained weights (.caffemodel)
$net = \CV\DNN\readNetFromCaffe('models/ssd/res10_300x300_ssd_deploy.prototxt', 'models/ssd/res10_300x300_ssd_iter_140000.caffemodel');
// resize to 300x300, subtract the mean, and reorder to 1x3x300x300
$blob = \CV\DNN\blobFromImage($src, $scalefactor = 1.0, $size = new Size(300, 300), $mean = new Scalar(104, 177, 123), $swapRB = true, $crop = false);
$net->setInput($blob, "");
$result = $net->forward();
In this case the result is a 1x1x200x7 matrix, i.e. 200 arrays of 7 elements each. For a photo with four faces, the network returned 200 candidates. Each of them looks like this: [, , $confidence, $startX, $startY, $endX, $endY]. The $confidence element is the "confidence", i.e. the probability that the prediction is correct, for example 0.75. The remaining elements are the coordinates of the rectangle containing the face. In this example only 3 faces were found with a confidence above 50%, and the remaining 197 candidates have a confidence below 15%.
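Filtering the candidates described above can be sketched in plain PHP. The sketch assumes the rows have already been read out of the result matrix into ordinary arrays (that step is omitted), and that the coordinates are relative values between 0 and 1, as is usual for this kind of SSD output. The function name and the sample rows are made up.

```php
<?php
// Sketch on plain arrays; filterDetections is a hypothetical helper.
// Assumed row layout: [imageId, classId, confidence, left, top, right, bottom],
// with coordinates given relative to the image size (0..1).

/**
 * Keep detections above a confidence threshold and convert
 * their relative coordinates to pixels.
 */
function filterDetections(array $rows, int $width, int $height, float $minConfidence = 0.5): array
{
    $boxes = [];
    foreach ($rows as $row) {
        [, , $confidence, $startX, $startY, $endX, $endY] = $row;
        if ($confidence < $minConfidence) {
            continue;
        }
        $boxes[] = [
            'confidence' => $confidence,
            'x' => (int) round($startX * $width),
            'y' => (int) round($startY * $height),
            'width' => (int) round(($endX - $startX) * $width),
            'height' => (int) round(($endY - $startY) * $height),
        ];
    }
    return $boxes;
}

// Two made-up rows: one confident detection and one weak candidate.
$rows = [
    [0, 1, 0.75, 0.25, 0.5, 0.5, 0.75],
    [0, 1, 0.10, 0.0, 0.0, 0.1, 0.1],
];
print_r(filterDetections($rows, 400, 200));
```

With a 400x200 image, the first row becomes a 100x50 rectangle at (100, 100), and the second row is dropped.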
Model size: 10 MB. Full example code.
Result:

As the example shows, a neural network does not always give good results when used naively, head-on. The fourth face was not found, although if the fourth photo is cropped out and sent to the network separately, the face is found.
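One way to act on this observation is to run the network on crops of the picture and shift the boxes back afterwards. The sketch below only does the rectangle arithmetic in plain PHP; both helpers are hypothetical and not taken from the example repository, and a plain grid like this will cut through faces that straddle a crop border.

```php
<?php
// Hypothetical helpers, not taken from the example repository.

/**
 * Split an image into a grid of crop rectangles [x, y, width, height].
 */
function gridCrops(int $width, int $height, int $cols, int $rows): array
{
    $crops = [];
    $w = intdiv($width, $cols);
    $h = intdiv($height, $rows);
    for ($r = 0; $r < $rows; $r++) {
        for ($c = 0; $c < $cols; $c++) {
            $crops[] = [$c * $w, $r * $h, $w, $h];
        }
    }
    return $crops;
}

/** Shift a box [x, y, width, height] found inside a crop back into full-image coordinates. */
function toFullImage(array $box, array $crop): array
{
    return [$box[0] + $crop[0], $box[1] + $crop[1], $box[2], $box[3]];
}

// Four photos side by side in a 1200x300 collage.
$crops = gridCrops(1200, 300, 4, 1);
print_r(toFullImage([50, 60, 100, 120], $crops[3])); // box becomes [950, 60, 100, 120]
```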
Improving image quality using a neural network
I had long heard about the waifu2x library, which removes noise and enlarges icons and photos. The library itself is written in Lua, and under the hood it uses several models (for enlarging icons, removing noise from photos, etc.) trained in Torch. The author of the library exported these models to Caffe and helped me use them from OpenCV. As a result, I wrote a PHP example that increases the resolution of icons.
Model size: 2 MB. Full example code.
Original:

Result:

Enlarging a picture without using a neural network:

Image classification
The MobileNet neural network, trained on the ImageNet dataset, classifies an image. In total it can distinguish 1000 classes, which in my opinion is quite enough.
Model size: 16 MB. Full example code.
Original:

Result:
87% - Egyptian cat, 4% - tabby, tabby cat, 2% - tiger cat
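A result line like the one above is just the few most probable classes out of the full list. Here is a plain-PHP sketch of that selection step; the probabilities and the four labels are made up to mirror the result above, and topClasses is a hypothetical helper name.

```php
<?php
// Sketch on plain arrays: the probabilities and labels below are made up for illustration.

/**
 * Return the $k most probable classes as label => probability.
 */
function topClasses(array $probabilities, array $labels, int $k = 3): array
{
    arsort($probabilities); // sort descending, keeping class indexes as keys
    $top = [];
    foreach (array_slice($probabilities, 0, $k, true) as $index => $p) {
        $top[$labels[$index]] = $p;
    }
    return $top;
}

$labels = ['tabby, tabby cat', 'tiger cat', 'Egyptian cat', 'lynx'];
$probabilities = [0.04, 0.02, 0.87, 0.01];

foreach (topClasses($probabilities, $labels) as $label => $p) {
    printf("%d%% - %s\n", round($p * 100), $label);
}
```

The same function works unchanged for the full list of 1000 class probabilities, as long as the labels array is in the order the model was trained with.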
Tensorflow Object Detection API
The MobileNet SSD (Single Shot MultiBox Detector) neural network, trained in TensorFlow on the COCO dataset, can not only classify an image but also return the regions of the objects, although it can recognize only 182 classes.
Model size: 19 MB. Full example code.
Original:

Result:

Syntax highlighting and code completion
I also added the phpdoc.php file to the example repository. Thanks to it, PhpStorm highlights functions, classes, and their methods and offers code completion. This file must not be included in your code (otherwise there will be an error); it is enough to place it in your project. Personally, it makes my life easier. The file describes most OpenCV functions, but not all of them, so pull requests are welcome.
Installation
The dnn module appeared in opencv only in version 3.4 (before that it was in opencv-contrib).
In Ubuntu 18.04 the latest packaged version of OpenCV is 3.2. Building OpenCV from source takes about half an hour, so I built an OpenCV package for Ubuntu 18.04 (it also works on 17.10; size 25 MB), as well as php-opencv packages for PHP 7.2 (Ubuntu 18.04) and PHP 7.1 (Ubuntu 17.10) (size 100 KB).
I registered ppa:php-opencv, but have not yet figured out how to upload packages there, and found nothing better than simply uploading the packages to GitHub. I also submitted a request for a PECL account, but after several months I have not received a response.
So now the installation under ubuntu 18.04 looks like this:
apt update && apt install -y wget && \
wget https://raw.githubusercontent.com/php-opencv/php-opencv-packages/master/opencv_3.4_amd64.deb && dpkg -i opencv_3.4_amd64.deb && rm opencv_3.4_amd64.deb && \
wget https://raw.githubusercontent.com/php-opencv/php-opencv-packages/master/php-opencv_7.2-3.4_amd64.deb && dpkg -i php-opencv_7.2-3.4_amd64.deb && rm php-opencv_7.2-3.4_amd64.deb && \
echo "extension=opencv.so" > /etc/php/7.2/cli/conf.d/opencv.ini
Installation this way takes about 1 minute. All installation options for Ubuntu.
I also put together a 168 MB Docker image.
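After installing, it can be worth checking that PHP actually sees the extension. The sketch below assumes the extension registers itself under the name "opencv", matching the opencv.so file above; the function name is made up. Note that the ini file in the commands above is written to the cli directory only, so other SAPIs would need their own entry.

```php
<?php
// Assumption: the extension registers itself under the name "opencv", matching opencv.so above.

function opencvStatus(): string
{
    return extension_loaded('opencv')
        ? 'opencv extension is loaded'
        : 'opencv extension is NOT loaded, check the ini file for your SAPI (cli, fpm, ...)';
}

echo opencvStatus(), PHP_EOL;
```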
Using examples
Download:
git clone https://github.com/php-opencv/php-opencv-examples.git && cd php-opencv-examples
Launch:
php detect_face_by_dnn_ssd.php
P.S.
I ask everyone who is interested to answer the polls after the article, subscribe so as not to miss my next articles, like them to motivate me to write more, ask questions in the comments, and suggest ideas for new experiments and articles.
As usual, I warn you that I do not give advice or help via private messages on Habr or social networks.
You can always ask questions by creating an issue on GitHub (Russian is fine).
Acknowledgments:
dkurt for quick answers on GitHub.
arrybn for the article “Deep Learning, now in OpenCV”
Links:
→ php-opencv-examples - all examples from the article
→ php-opencv / php-opencv - my fork with support for the dnn module
→ hihozhou / php-opencv - the original repository, without support for the dnn module (I created a pull request, but it has not been accepted yet).
→ Translation of the article into English - I have heard that the British and Americans are very patient with those who make mistakes in English, but it seems to me that there is a limit to everything and I have crossed that line :) Anyway, give it a like if you can spare one. The same goes for reddit.
Can calling OpenCV methods from PHP be called “working with computer vision in PHP”?
- 73.7% (101 votes): yes, because I only write PHP code, without using C++
- 4.3% (6 votes): no, working with the C++ OpenCV library from Python is “real” computer vision
- 0% (0 votes): no, working with the C++ OpenCV library from MATLAB is “real” computer vision
- 0% (0 votes): no, working with the C++ OpenCV library from Lua is “real” computer vision
- 21.8% (30 votes): no, working with OpenCV in C++ is “real” computer vision
A convenient way to install php-opencv
- 43.1% (57 votes): installing from PECL
- 43.9% (58 votes): downloading and installing a package
- 25.7% (34 votes): installing from a PPA repository
- 2.2% (3 votes): installing from a snap repository
- 21.9% (29 votes): installing from Docker Hub
- 19.6% (26 votes): installing from source
Are more articles on this topic needed?
- 59.1% (107 votes): yes, more theory is needed
- 79.5% (144 votes): yes, more practice is needed
- 62.4% (113 votes): yes, useful examples of using pre-trained models are needed
- 60.2% (109 votes): yes, interesting examples of working with images are needed
- 40.8% (74 votes): yes, examples of working with video are needed
- 61.8% (112 votes): yes, examples of training networks are needed
- 4.9% (9 votes): no, they are not needed
PHP and Python
- 21.4% (50 votes): I program in PHP and Python
- 65.2% (152 votes): I program only in PHP
- 13.3% (31 votes): I program only in Python
Machine learning and PHP developers
- 1.6% (3 votes): I am a PHP developer and work with machine learning
- 85.4% (159 votes): I am a PHP developer and am interested in machine learning
- 12.9% (24 votes): I am a PHP developer and am not interested in machine learning