Finding people in photos on Android using OpenCV

Recently, I came across an interesting problem for the Android mobile OS: detecting the outlines of people in photographs (if there are any, of course). After searching the Internet, I decided to use the open-source OpenCV project, which can run on the Android platform.

Much has already been written about OpenCV, but I could not find anything on this particular subject, so this article was put together from several sources and my own observations.



Setup


A short description of how to include the library in an Android Studio project (using Gradle):
To get started, download the latest version of the library from the site and copy the contents of the OpenCV-2.4.8-android-sdk/sdk/java folder from the archive into the libs/OpenCV folder of your project (creating it if necessary).
Next, wire this module into the Gradle files:
In the project root folder, edit settings.gradle and add our module:
include ':app',':app:libs:OpenCV'

In the build.gradle file of our application (not the root one, but app/build.gradle), add the line compile project(':app:libs:OpenCV') to the dependencies section, so that it looks like this:
dependencies {
    compile 'com.android.support:appcompat-v7:+'
    compile project(':app:libs:OpenCV')
}

And create a build.gradle file in the OpenCV folder with the following code:
buildscript {
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath 'com.android.tools.build:gradle:0.6.+'
    }
}
apply plugin: 'android-library'
repositories {
    mavenCentral()
}
android {
    compileSdkVersion 19
    buildToolsVersion "19"
    defaultConfig {
        minSdkVersion 8
        targetSdkVersion 19
    }
    sourceSets {
        main {
            manifest.srcFile 'AndroidManifest.xml'
            java.srcDirs = ['src']
            resources.srcDirs = ['src']
            aidl.srcDirs = ['src']
            renderscript.srcDirs = ['src']
            res.srcDirs = ['res']
            assets.srcDirs = ['assets']
        }
    }
}

Well, that's it: the OpenCV Android SDK is connected, and we can move on to the implementation.

Installing the libraries on a device


At the first stage of getting acquainted with OpenCV, I was confused by one peculiarity of working under Android: the need to separately install the OpenCV Manager application, which your creation will interact with directly. A rather strange decision, because you will have to explain to the end user that, in order to use your application, they will need to install another program from the market (fortunately, your application will redirect the user there directly, so they will not get lost, but it can still scare them away).

There is another way to connect: static initialization. However, the developers state that it exists only for development purposes and, it seems to me, it may be removed in future versions ("It is designed mainly for development purposes. This approach is deprecated for the production code, release package is recommended to communicate with OpenCV Manager via the async initialization described above.").
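
For reference, a minimal sketch of what static initialization looks like, assuming the native OpenCV libraries are bundled with the application (the log tag is my own choice):

    // Static initialization: load the bundled OpenCV native libraries directly,
    // bypassing OpenCV Manager (development use only, per the docs)
    static {
        if (!OpenCVLoader.initDebug()) {
            Log.e("OpenCV", "Static initialization of OpenCV failed");
        }
    }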

But in the context of this article this is actually convenient for us, since there is no need to bother with hooking up the NDK and building/linking native libraries into the project. So let's continue.

To install OpenCV Manager on the emulator, we will use the adb utility from the Android SDK. Start the virtual device, wait for it to load, and run:
/PATH_TO_ANDROID_SDK/platform-tools/adb install /PATH_TO_OPENCV/OpenCV-2.4.8-android-sdk/apk/OpenCV_2.4.8_Manager_2.16_armv7a-neon.apk
(choosing the apk that matches the ABI of your device or emulator).

Working with images


Initializing OpenCV in the application boils down to two things: implementing the BaseLoaderCallback interface, whose single method onManagerConnected is where we can start working with OpenCV, and calling the static method OpenCVLoader.initAsync with the necessary parameters (including the callback). If your application does not find OpenCV Manager, it will prompt the user to install it. The connection code:
    @Override
    public void onResume()
    {
        super.onResume();
        // Invoke the asynchronous library loader
        OpenCVLoader.initAsync(OpenCVLoader.OPENCV_VERSION_2_4_8, this, mLoaderCallback);
    }
    private BaseLoaderCallback mLoaderCallback = new BaseLoaderCallback(this) {
        @Override
        public void onManagerConnected(int status) {
            switch (status) {
                case LoaderCallbackInterface.SUCCESS:
                {
                    // We are ready to use OpenCV
                } break;
                default:
                {
                    super.onManagerConnected(status);
                } break;
            }
        }
    };

Now we can safely work with our library.

In this example, we create a Bitmap from the URL of a photo, convert it into an OpenCV Mat object (an image matrix), convert it from color to grayscale (the analyzer requires this), and call the detectMultiScale method of a HOGDescriptor object (to which we first assign the standard people detector obtained from HOGDescriptor.getDefaultPeopleDetector). After the call, the locations variable contains the rectangular regions where people were found (x, y, width, height), and weights contains the relevance of each match (though, as practice showed, with images like these it does not quite correspond to reality).

For simplicity, I uploaded the photos to Facebook and combined downloading and processing into a single method. The method code itself:
public Bitmap peopleDetect(String path) {
        Bitmap bitmap = null;
        float execTime;
        try {
            // Download the photo
            URL url = new URL(path);
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setDoInput(true);
            connection.connect();
            InputStream input = connection.getInputStream();
            BitmapFactory.Options opts = new BitmapFactory.Options();
            opts.inPreferredConfig = Bitmap.Config.ARGB_8888;
            bitmap = BitmapFactory.decodeStream(input, null, opts);
            long time = System.currentTimeMillis();
            // Create an OpenCV image matrix and put our photo into it
            Mat mat = new Mat();
            Utils.bitmapToMat(bitmap, mat);
            // Convert the matrix from RGB to grayscale
            Imgproc.cvtColor(mat, mat, Imgproc.COLOR_RGB2GRAY, 4);
            HOGDescriptor hog = new HOGDescriptor();
            // Get the standard people detector and assign it to our descriptor
            MatOfFloat descriptors = HOGDescriptor.getDefaultPeopleDetector();
            hog.setSVMDetector(descriptors);
            // Variables that will receive the search results (locations - the rectangular regions, weights - the weight, or relevance, of each location)
            MatOfRect locations = new MatOfRect();
            MatOfDouble weights = new MatOfDouble();
            // The actual analysis of the photo; the results go into locations and weights
            hog.detectMultiScale(mat, locations, weights);
            execTime = ((float) (System.currentTimeMillis() - time)) / 1000f;
            // Variables for drawing the regions on the photo
            Point rectPoint1 = new Point();
            Point rectPoint2 = new Point();
            Scalar fontColor = new Scalar(0, 0, 0);
            Point fontPoint = new Point();
            // If there are results, draw each region and its weight onto the photo
            if (locations.rows() > 0) {
                List<Rect> rectangles = locations.toList();
                int i = 0;
                List<Double> weightList = weights.toList();
                for (Rect rect : rectangles) {
                    float weigh = weightList.get(i++).floatValue();
                    rectPoint1.x = rect.x;
                    rectPoint1.y = rect.y;
                    fontPoint.x  = rect.x;
                    fontPoint.y  = rect.y - 4;
                    rectPoint2.x = rect.x + rect.width;
                    rectPoint2.y = rect.y + rect.height;
                    final Scalar rectColor = new Scalar(0, 0, 0);
                    // Draw the found information onto the image
                    Core.rectangle(mat, rectPoint1, rectPoint2, rectColor, 2);
                    Core.putText(mat,
                            String.format("%1.2f", weigh),
                            fontPoint, Core.FONT_HERSHEY_PLAIN, 1.5, fontColor,
                            2, Core.LINE_AA, false);
                }
            }
            fontPoint.x = 15;
            fontPoint.y = bitmap.getHeight() - 20;
            // Add some extra debug information
            Core.putText(mat,
                    "Processing time:" + execTime + " width:" + bitmap.getWidth() + " height:" + bitmap.getHeight(),
                    fontPoint, Core.FONT_HERSHEY_PLAIN, 1.5, fontColor,
                    2, Core.LINE_AA, false);
            Utils.matToBitmap(mat, bitmap);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return bitmap;
    }
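
Since the method downloads the photo over the network, it must not be called on the UI thread. Here is a minimal usage sketch; imageView and photoUrl are hypothetical names from your own activity:

    // Hypothetical usage: run peopleDetect() off the main thread,
    // then show the resulting bitmap (imageView and photoUrl are assumed names)
    new AsyncTask<String, Void, Bitmap>() {
        @Override
        protected Bitmap doInBackground(String... params) {
            return peopleDetect(params[0]);
        }

        @Override
        protected void onPostExecute(Bitmap result) {
            if (result != null) {
                imageView.setImageBitmap(result);
            }
        }
    }.execute(photoUrl);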

At the output, we get a Bitmap with the regions where people were presumably found overlaid on it, the weight of each match, and some additional debug information. Processing one photo (up to a thousand pixels in width and height) on a Samsung Galaxy S3 takes about 1-6 seconds. Below are the search results with run times.

In the first photo the analyzer did not find a single person, much as we would have liked it to :(
Image-1.jpg width: 488 height: 420 executionTime: 1.085


The next result is better, but still not quite there
Image-2.jpg width: 575 height: 400 executionTime: 1.226


The third one let us down as well
Image-3.jpg width: 618 height: 920 executionTime: 6.459


Now we are getting somewhere
Image-4.jpg width: 590 height: 505 executionTime: 3.084


Moving on to a livelier photo, the result was somewhat unexpected for me
Image-5.jpg width: 604 height: 453 executionTime: 1.913


The monument was partially recognized
Image-6.jpg width: 960 height: 643 executionTime: 4.106


And here is the first photo that really shows what the library is designed for
Image-7.jpg width: 960 height: 643 executionTime: 2.638


Despite the obvious contrast, I did not get the desired effect.
Image-8.jpg width: 960 height: 857 executionTime: 3.293


Nothing was detected here
Image-9.jpg width: 960 height: 642 executionTime: 2.264


A photo with people in the background
Image-10.jpg width: 960 height: 643 executionTime: 2.188


A close-up, but without success
Image-11.jpg width: 960 height: 639 executionTime: 2.273


Instead of four people, a few more
Image-12.jpg width: 960 height: 640 executionTime: 2.669


As the results show, the library is better suited to detection in photos and video from surveillance cameras, where an object can be selected and then confirmed by subsequent frames; for photos with previously unknown compositions it gives a fairly large recognition error (you could filter by weight, but then you risk losing many valid matches). The analysis speed does not yet allow using OpenCV on a large number of photos, and when working in real time on this kind of hardware, these algorithms may not keep up with the stream of frames.
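
For illustration, a minimal sketch of such weight filtering, reusing the locations and weights variables from peopleDetect above; the threshold value is an assumption that would need tuning on real photos:

    // Keep only the detections whose weight passes a hand-picked threshold
    // (0.5 is an assumed value - tune it on your own photos)
    double weightThreshold = 0.5;
    List<Rect> rects = locations.toList();
    List<Double> ws = weights.toList();
    List<Rect> filtered = new ArrayList<Rect>(); // java.util.ArrayList
    for (int i = 0; i < rects.size(); i++) {
        if (ws.get(i) >= weightThreshold) {
            filtered.add(rects.get(i));
        }
    }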

For my purposes the library turned out not to be a good fit, but perhaps my little investigation will be useful to you.

Thanks for reading!

Project on GitHub.
