Licenzero: looking for porn by skin color

    Mask for skin color
    We continue our description of the pornographic video classifier developed by Inventos. (Licenzero, which appears in the title, is not a separate company but a division of Inventos.)

    The skin color detector is one of the detectors we use to classify videos. It is not as complex as the motion detector or the fragment detector; one might even call it very simple. At the start we had a lot of ideas about using skin color in video, but after trying the simplest approach to classification we decided (perhaps temporarily) to stop there, since we were quite satisfied with the results.

    Skin color determination


    We had two tasks:
    • determine which colors can be called “skin color”,
    • classify videos as pornographic or not by their skin color content.

    Let's start with skin color. First, we collected a few thousand images: both ordinary pictures and video frames, including pornographic ones, since that was the kind of video we were primarily interested in.

    Then, using a simple home-made program:
    Skin selector
    we marked the areas with and without skin in the pictures. This gave us the RGB coordinates of several million points, each labeled as skin or non-skin.

    Next we had to choose the color model in which to work with these points. We chose among RGB, LAB, HSV and YCbCr. After several tests we settled on YCbCr, not least because, since we were classifying by color, we could discard the brightness (“gray”) component Y.
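    As a rough illustration of why dropping Y is convenient, here is a minimal sketch of an RGB to YCbCr conversion. It assumes the full-range BT.601 (JPEG-style) coefficients; the article does not say which YCbCr variant was used, and the function names are made up for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range BT.601 YCbCr (an assumed variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def chroma(r, g, b):
    """Keep only the color part (Cb, Cr) and discard the brightness Y."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return cb, cr


if __name__ == "__main__":
    # Dark gray and light gray differ only in Y; their chroma is the same.
    print(chroma(50, 50, 50))
    print(chroma(200, 200, 200))
```

    Any gray pixel lands at the same (Cb, Cr) point regardless of how bright it is, so the classifier only has to look at two coordinates.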

    Here are the points from our pictures on the Cb scale:
    Histograms Cb
    And on the Cr scale:
    Histograms Cr
    These histograms show that skin pixels can be distinguished by these two coordinates. Here is the probability (according to our data) that a point with given Cb and Cr coordinates is skin:
    Probability CbCr
    Blue means a probability of 0%, red means 100%. This plot tells us that if we choose, for example, 50% as the threshold for classifying skin, we can easily separate skin color (in Cb and Cr coordinates) from all other colors.
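    A probability map of this kind can be estimated by simple counting. The sketch below is one possible way to do it, not necessarily how the plot was produced: it bins labeled (Cb, Cr) points into square cells and takes the share of skin points in each cell. The bin size and the function names are assumptions for the example.

```python
from collections import Counter


def skin_probability_map(samples, bin_size=4):
    """Estimate P(skin | Cb, Cr) on a grid.

    samples: iterable of (cb, cr, is_skin) tuples from labeled pictures.
    Returns a dict mapping a grid cell (cb_bin, cr_bin) to the share of
    skin points among all points that fell into that cell.
    """
    skin = Counter()
    total = Counter()
    for cb, cr, is_skin in samples:
        cell = (int(cb) // bin_size, int(cr) // bin_size)
        total[cell] += 1
        if is_skin:
            skin[cell] += 1
    return {cell: skin[cell] / n for cell, n in total.items()}


def lookup(prob_map, cb, cr, bin_size=4):
    """Return the estimated probability, or None for a cell with no data."""
    return prob_map.get((int(cb) // bin_size, int(cr) // bin_size))


if __name__ == "__main__":
    labeled = [(110, 150, True), (111, 151, True), (110, 150, False), (60, 60, False)]
    pmap = skin_probability_map(labeled)
    print(lookup(pmap, 110, 150))
```

    Thresholding such a map at 50% gives the kind of boundary discussed above.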

    We decided not to use an SVM for classification, but simply to define a rectangular region that optimally classifies skin pixels: a kind of pseudo-SVM with four support vectors. Here is the rectangle:
    Rectangle
    The black line is our rectangle. On the green line, the probability that a point is skin is 50% (red: 90%, blue: 10%). All points whose Cb and Cr coordinates fall inside the black rectangle are treated as skin pixels.
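    In code, the rectangle rule is just four comparisons. The sketch below shows the test together with a helper that scores a candidate rectangle on labeled points. The bounds in EXAMPLE_RECT are illustrative placeholders only; they are not the values of the rectangle in the figure, which the article does not give.

```python
from typing import NamedTuple


class SkinRect(NamedTuple):
    """Axis-aligned rectangle in the (Cb, Cr) plane."""
    cb_min: float
    cb_max: float
    cr_min: float
    cr_max: float

    def contains(self, cb, cr):
        """True if the chroma point lies inside the rectangle."""
        return (self.cb_min <= cb <= self.cb_max
                and self.cr_min <= cr <= self.cr_max)


# Placeholder bounds for illustration, NOT the rectangle from the article.
EXAMPLE_RECT = SkinRect(cb_min=77, cb_max=127, cr_min=133, cr_max=173)


def accuracy(rect, samples):
    """Share of labeled (cb, cr, is_skin) points the rectangle gets right."""
    samples = list(samples)
    correct = sum(1 for cb, cr, is_skin in samples
                  if rect.contains(cb, cr) == bool(is_skin))
    return correct / len(samples)


if __name__ == "__main__":
    labeled = [(110, 150, True), (128, 128, False), (100, 140, False)]
    print(EXAMPLE_RECT.contains(110, 150))
    print(accuracy(EXAMPLE_RECT, labeled))
```

    Comparing accuracy for different candidate bounds is one simple way to search for a rectangle that fits the labeled data well.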
    Here is an example of skin detection by our system:

    We stopped our research at this rectangle: it is all very interesting, of course, but we needed to move on to classifying pornography by skin color.

    Skin color classification


    So, we have decided what we will consider skin. Since we work with video data, we switched from the YCbCr color space to YUV (for digital video these are in practice the same thing; the Russian Wikipedia even redirects from the YCbCr page to the YUV page). Working with YUV is extremely convenient here. Not only do we avoid converting the raw video, but we also get only a quarter as many (U, V) pairs as there are pixels in the frame (if the video is in the yuv420p format, where both chroma planes are subsampled by two horizontally and vertically): a solid saving overall.
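    To make the yuv420p point concrete, here is a minimal sketch that splits one raw frame into its planes and counts skin-colored chroma samples. It assumes a tightly packed planar frame (Y plane, then U, then V, with no row padding), in which each chroma plane has half the width and half the height of the luma plane. The function names and the is_skin callback are hypothetical.

```python
def split_yuv420p(frame, width, height):
    """Split a tightly packed yuv420p frame (bytes) into Y, U and V planes."""
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    y_size = width * height
    c_size = chroma_w * chroma_h
    if len(frame) < y_size + 2 * c_size:
        raise ValueError("frame is too short for the given dimensions")
    y = frame[:y_size]
    u = frame[y_size:y_size + c_size]
    v = frame[y_size + c_size:y_size + 2 * c_size]
    return y, u, v


def count_skin_samples(frame, width, height, is_skin):
    """Return (skin_samples, total_samples) over the (U, V) pairs of a frame.

    is_skin: callable taking (u, v) and returning True for skin color.
    The Y plane is never touched.
    """
    _, u, v = split_yuv420p(frame, width, height)
    skin = sum(1 for cb, cr in zip(u, v) if is_skin(cb, cr))
    return skin, len(u)


if __name__ == "__main__":
    # A 4x2 frame: 8 luma bytes, then 2 U bytes, then 2 V bytes.
    frame = bytes([0] * 8 + [110, 128] + [150, 128])
    rule = lambda cb, cr: 77 <= cb <= 127 and 133 <= cr <= 173  # placeholder rule
    print(count_skin_samples(frame, 4, 2, rule))
```

    Because the fraction of skin is computed over chroma samples, the subsampling does not change the ratio much, but it cuts the amount of work per frame.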

    But what about the classification? It turned out to be even simpler than determining skin color. We asked ourselves what would happen if we calculated the proportion of skin color in porn and non-porn videos (that is, the number of “skin” pixels divided by the total number of pixels in the video). The result is this picture:
    Skin color classification
    These are histograms of the distribution of videos. The y-axis is the number of videos, the x-axis is the proportion of skin pixels in the video. The dotted lines show the density curves we would get if the distribution were normal in both cases; they are there just for illustration.

    If we measure the fraction of skin pixels not in whole clips but in porn and non-porn fragments (we have such fragments: we cut them manually when we built the motion detector), the results are even better.

    Like our other detectors, the skin color detector returns the probability that a fragment is pornographic. This probability is simply a function of the fraction of skin pixels in the fragment. The function looks approximately like this:
    Probability function
    So, in order to classify a fragment of a clip, we:
    • count the number of skin pixels in all frames of the fragment;
    • divide it by the total number of pixels in all frames of the fragment to get the fraction of skin pixels;
    • map the fraction of skin pixels to the probability that the fragment is pornographic;
    • use this probability, along with the probabilities obtained from other detectors, for the final classification of the fragment and of the whole video.
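    The first three steps can be sketched as follows. The exact shape of the probability function is only shown as a picture, so the logistic curve here, along with its midpoint and steepness values, is an assumption chosen for illustration; combining the result with the other detectors is left out.

```python
import math


def skin_fraction(frame_counts):
    """Fraction of skin pixels over all frames of a fragment.

    frame_counts: iterable of (skin_pixels, total_pixels), one pair per frame.
    """
    skin = 0
    total = 0
    for skin_pixels, total_pixels in frame_counts:
        skin += skin_pixels
        total += total_pixels
    if total == 0:
        raise ValueError("fragment contains no pixels")
    return skin / total


def porn_probability(fraction, midpoint=0.3, steepness=15.0):
    """Map a skin fraction to a probability with a logistic curve.

    midpoint and steepness are hypothetical values, not taken from the article.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (fraction - midpoint)))


if __name__ == "__main__":
    fragment = [(10, 100), (30, 100)]  # two frames of 100 pixels each
    fraction = skin_fraction(fragment)
    print(fraction, porn_probability(fraction))
```

    In practice the curve would be fitted to the measured distributions of skin fractions in porn and non-porn fragments.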
