Evaluation of the possibility of post-processing video in a browser

    Recently, runtime post-processing of video has become increasingly relevant: thanks to the power of modern PCs, almost any user can pass a video stream through a complex chain of filters right during playback, eliminating the need for a full video re-encode, which is usually done with slow and complicated tools.

    This area is pretty well covered on the desktop: filters like the ffdshow raw video filter and madVR let you do almost anything you might need for pleasant viewing. Unfortunately, the web cannot boast a similar toolkit, so you either enjoy all the flaws of the next YouTube video or open it in an external application like MPC-BE, which is not very convenient. It would be nice to have a single magic button that activates filtering right where it belongs: in your browser.

    This post is a brief report on my research in this area; the ultimate goal was to assess the feasibility of real-time filtering at a resolution of at least 1920x1080.


    When reading this article, keep in mind:
    1. All of the demos are based on an html5 video element with the loop attribute set. In some browsers this video can jerk and lag terribly when looping back to the beginning, through the fault of those browsers. I did not try to rework the code to work around this problem.
    2. If the looping video annoys you, you can add loop=false to the GET parameters of the URL.
    3. The demos were tested only in Chrome, Firefox and IE11; they may not work in other browsers.
    4. The source code of all the demos is directly inside the corresponding html pages, with no dependencies.
    5. The text contains many anglicisms and clumsy translations; I am not well versed in the native terminology, so corrections are welcome.
    6. We turn a blind eye to possible problems with CORS, sites using Flash video, and so on. These are purely spherical tests in a vacuum.
    7. I am only a passer-by in JavaScript, so do not trust the text below too much. For safety, you can divide all the timings by 2. I hope to see corrections and tips in the comments.

    Implementation principles

    The only option that allows a single codebase for all target browsers (primarily Chrome and Firefox) is a browser extension. The alternative, Google Chrome Native Client, unsurprisingly works only in Chrome, and Mozilla currently has no plans to support NaCl in Firefox. Besides, I did not investigate whether NaCl can access elements on a page; it may well turn out to be unusable for our purposes.

    The basic algorithm of the (theoretical) extension is quite simple: find the video element on the page, hide it, and create a canvas on top of it onto which the filtered frames of the video stream are rendered. So far, so simple.
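    A minimal sketch of that setup in JavaScript (the helper name attachFilterCanvas is mine, not from any real extension; a real one would also have to track resizes, scrolling, and fullscreen):

```javascript
// Hide the original <video> and stack a same-sized <canvas> directly on
// top of it. Hypothetical helper: element lookup and the per-frame
// rendering loop are left out.
function attachFilterCanvas(video, doc) {
    var canvas = doc.createElement('canvas');
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    canvas.style.position = 'absolute';
    canvas.style.left = video.offsetLeft + 'px';
    canvas.style.top = video.offsetTop + 'px';
    video.style.visibility = 'hidden'; // keeps playing, just invisible
    video.parentNode.insertBefore(canvas, video);
    return canvas;
}
```

    From there the extension would redraw a filtered frame onto the canvas on every requestAnimationFrame tick.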

    The real problem with the extension is the implementation language: interpreted JavaScript, and as we all know, interpreted languages are not suitable for serious computation. But no matter! JavaScript has received a lot of love and optimization lately, and a fairly large number of programmers believe that JS is suitable for writing applications of any kind and that everything should move to the web. Moreover, many new technologies are available, such as asm.js, SIMD.js, WebGL and WebCL, which in theory let you implement whatever your heart desires at a speed only slightly below native. So we should not have any serious problems writing a set of filters in the browser, right?

    Not really.

    Pure JavaScript

    Filtering in pure JS works as follows:
    1. We get the two required elements: the hidden video and the canvas located on top of it.
    2. We draw a frame from the video onto the canvas via context.drawImage(video, 0, 0), where context is the 2d context obtained from the canvas.
    3. We get the frame buffer (an array of color bytes) via context.getImageData(0, 0, width, height).
    4. We process the buffer with the required filters.
    5. We put the processed array back via context.putImageData(imageData, 0, 0).

    This algorithm works and allows real-time video filtering in pure JavaScript with a minimal amount of code that looks very much like C. This is what a basic (unoptimized) implementation of an invert filter looks like; it inverts the RGB bytes of every pixel in the frame:
    outputContext.drawImage(video, 0, 0);
    var imageData = outputContext.getImageData(0, 0, width, height);
    var source = imageData.data;
    var length = source.length;
    for (var i = 0; i < length; i += 4) {
        source[i]   = 255 - source[i];
        source[i+1] = 255 - source[i+1];
        source[i+2] = 255 - source[i+2];
        // ignore the alpha channel
    }
    outputContext.putImageData(imageData, 0, 0);

    And although this method works for demos and simple pictures, it falls apart very quickly at high resolutions. While the drawImage call itself is pretty fast even at 1080p, adding getImageData and putImageData pushes the execution time to 20-30 milliseconds per iteration. The full code above already takes 35-40ms, which is the ceiling for PAL video (25 frames per second, 40ms per frame). All measurements were taken on a Core i7-4770K, one of the most powerful home processors at the moment. This means that executing any more or less complex filter on earlier generations of processors is impossible, regardless of JavaScript performance. Any code, however fast, will run into the terrible performance of the canvas itself.
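    For what it's worth, such per-pass timings are easy to reproduce with a small harness like the one below (a sketch; the names benchmark and invert are mine, and the absolute numbers depend entirely on your machine):

```javascript
// Run a filter repeatedly over a fake RGBA frame buffer and report the
// average time per pass in milliseconds. This measures only the filter
// itself, without the getImageData/putImageData overhead discussed above.
function benchmark(filter, width, height, passes) {
    var frame = new Uint8ClampedArray(width * height * 4);
    var start = Date.now();
    for (var p = 0; p < passes; ++p) {
        filter(frame, width, height);
    }
    return (Date.now() - start) / passes;
}

// The invert filter from above, as a standalone function.
function invert(frame) {
    for (var i = 0; i < frame.length; i += 4) {
        frame[i]   = 255 - frame[i];
        frame[i+1] = 255 - frame[i+1];
        frame[i+2] = 255 - frame[i+2];
    }
}

var msPerPass = benchmark(invert, 1920, 1080, 10);
```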

    But JavaScript is not very fast on its own either. While simple operations like inversion or a LUT pass can be performed in a reasonable amount of time, any more or less complex filter causes terrible lag. A naive implementation of a noise filter (adding Math.random() * 10 to each pixel) already takes 55 milliseconds, and the 3x3 blur kernel implemented in the code below takes 400ms, i.e. 2.5 frames per second.
    function blur(source, width, height) {
        function blur_core(ptr, offset, stride) {
            return (ptr[offset - stride - 4] +
                    ptr[offset - stride] +
                    ptr[offset - stride + 4] +
                    ptr[offset - 4] +
                    ptr[offset] +
                    ptr[offset + 4] +
                    ptr[offset + stride - 4] +
                    ptr[offset + stride] +
                    ptr[offset + stride + 4]
                    ) / 9;
        }
        var stride = width * 4;
        for (var y = 1; y < height - 1; ++y) {
            var offset = y * stride + 4; // skip the first pixel of the row
            for (var x = 1; x < width - 1; ++x) {
                source[offset]     = blur_core(source, offset, stride);
                source[offset + 1] = blur_core(source, offset + 1, stride);
                source[offset + 2] = blur_core(source, offset + 2, stride);
                offset += 4;
            }
        }
    }
    Firefox shows even more depressing results at 800ms per pass. Interestingly, IE11 is about twice as fast as Chrome here (but the canvas itself is slow, so this does not save it). In any case, it is clear that pure JavaScript is the wrong way to implement filters.
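    To illustrate the LUT point from above: with a lookup table the per-pixel work shrinks to a single array read, no matter how expensive the underlying curve is to compute. A sketch (the function names are mine) using gamma correction as the example:

```javascript
// Precompute all 256 output values once per filter change...
function makeGammaLut(gamma) {
    var lut = new Uint8ClampedArray(256);
    for (var i = 0; i < 256; ++i) {
        lut[i] = Math.round(255 * Math.pow(i / 255, 1 / gamma));
    }
    return lut;
}

// ...then the hot loop is just lookups, with no Math.pow per pixel.
function applyLut(frame, lut) {
    for (var i = 0; i < frame.length; i += 4) {
        frame[i]   = lut[frame[i]];
        frame[i+1] = lut[frame[i+1]];
        frame[i+2] = lut[frame[i+2]];
    }
}
```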


    asm.js

    The newfangled asm.js is a tool from Mozilla for optimizing the execution of JavaScript code. The generated code still runs in Chrome, but you should not hope for a serious performance gain there, since dedicated support for asm.js apparently has not been added yet.

    Unfortunately, I could not find a simple way to compile selected functions into asm.js-optimized code. Emscripten generates about 4.5 thousand lines of code when compiling a simple two-line function, and I could not figure out how to extract only the necessary parts from it in a reasonable amount of time. Writing asm.js by hand is a questionable pleasure. In any case, asm.js runs into the performance of the 2d canvas context, just like pure JavaScript.
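    To give a taste of why hand-writing asm.js is painful, here is a sketch of the invert filter as an asm.js-style module: every value has to be coerced with | 0, and the code below may still fail strict validation, in which case it simply runs as plain JavaScript.

```javascript
// asm.js-style module: operates on a heap passed in as an ArrayBuffer.
// A sketch, not verified against the asm.js validator.
function InvertModule(stdlib, foreign, heap) {
    "use asm";
    var pixels = new stdlib.Uint8Array(heap);
    function invert(length) {
        length = length | 0;
        var i = 0;
        for (i = 0; (i | 0) < (length | 0); i = (i + 4) | 0) {
            pixels[i]           = 255 - (pixels[i] | 0);
            pixels[(i + 1) | 0] = 255 - (pixels[(i + 1) | 0] | 0);
            pixels[(i + 2) | 0] = 255 - (pixels[(i + 2) | 0] | 0);
        }
    }
    return { invert: invert };
}

// Real asm.js requires the heap size to be a power of two.
var heap = new ArrayBuffer(0x100000);
var invertModule = InvertModule({ Uint8Array: Uint8Array }, {}, heap);
```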


    SIMD.js

    SIMD.js is a very new technology for manual optimization of JS applications, currently "supported" only in Firefox Nightly, though it may gain support in all target browsers fairly soon. Unfortunately, the API currently works with only two data types, float32x4 and int32x4, which makes the whole idea useless for most real 8-bit filters. Moreover, the Int32x4Array type has not been implemented even in Nightly yet, so any reading and writing of data to memory will be slow and scary (when implemented in this way). Nevertheless, here is an implementation of the usual invert filter (this time working through XOR):
    function invert_frame_simd(source) {
        var fff = SIMD.int32x4.splat(0x00FFFFFF);
        var int32 = new Uint32Array(source.buffer);
        var length = int32.length;
        for (var i = 0; i < length; i += 4) {
            var src = SIMD.int32x4(int32[i], int32[i+1], int32[i+2], int32[i+3]);
            var dst = SIMD.int32x4.xor(src, fff);
            int32[i]   = dst.x;
            int32[i+1] = dst.y;
            int32[i+2] = dst.z;
            int32[i+3] = dst.w;
        }
    }
    At the moment, the code above runs much slower than pure JS, at 1600ms per pass (Nightly users can try another demo). It seems you will have to wait quite a while before anything useful can be done with this technology. Unfortunately, it is not clear how support for 256-bit YMM registers will be implemented (int32x4 is the usual 128-bit xmm register from SSE2), or whether instructions from newer extensions like SSSE3 will be available. And SIMD.js does not save you from the slow canvas either. But SIMD fans can already get some familiar bugs right in the browser!


    WebGL

    A completely different way to implement filters is WebGL. At its core, WebGL is a JS interface to the native OpenGL technology, which lets you execute a variety of code on the GPU. It is usually used for programming graphics in games and the like, but nothing stops you from processing pictures, or even video, with it. WebGL also does not require calls to getImageData, which in theory avoids the typical 20ms lag.

    But nothing comes for free: WebGL is not a general-purpose tool, and using this API for abstract non-graphics code is a terrible pain. You will need to define useless vertices (which will always cover the entire frame), correctly position a texture (which will also cover the entire frame), and then use the video as that texture. Fortunately, WebGL is smart enough to automatically fetch the right frames from a video. At least in Chrome and Firefox. IE11 will greet you with the error WEBGL11072: INVALID_VALUE: texImage2D: This texture source is not supported.

    Finally, to write filters you will have to use shaders written in the slightly flawed GLSL language, which (at least in its WebGL version) does not even support constant array initializers, so any arrays either have to be passed in via uniforms (a kind of global variable) or filled in the brute-force way:
    float core1[9];
    core1[0] = 1.0;
    core1[1] = 1.0;
    core1[2] = 0.0;
    core1[3] = 1.0;
    core1[4] = 0.0;
    core1[5] = -1.0;
    core1[6] = 0.0;
    core1[7] = -1.0;
    core1[8] = -1.0;

    It also requires the pixel shader to return a single value, the color of the current pixel, which rules out the typical implementation of filters that process several pixels per iteration (deblocking, for example). Such filters have to be rethought and implemented differently.
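    For comparison with the pure-JS versions, the invert filter boils down to a single line of actual logic as a fragment shader (a sketch kept as a JS string; the uniform and varying names are illustrative, not from the article's demos):

```javascript
// Fragment shader for the invert filter: sample the current frame
// texture and flip the RGB channels, leaving alpha untouched.
var invertShader = [
    'precision mediump float;',
    'uniform sampler2D u_frame;',
    'varying vec2 v_texCoord;',
    'void main() {',
    '    vec4 pixel = texture2D(u_frame, v_texCoord);',
    '    gl_FragColor = vec4(1.0 - pixel.rgb, pixel.a);',
    '}'
].join('\n');
```

    The rest of the work, compiling the shader, binding the video texture, and drawing the two triangles that cover the frame, is boilerplate.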

    In general, technologies like CUDA and OpenCL were not invented from a good life.

    To WebGL's credit, it has truly amazing performance (which you can't measure). At the very least, it handles the prewitt filter from masktools (taking the maximum of four 3x3 kernels) in real time at 1080p and above. If you hate yourself and are not afraid of slightly unsupported code, WebGL lets you do quite interesting things with video. It might be more reasonable to use the seriously.js library, which hides some of the WebGL boilerplate, but it may not be advanced enough to handle video resolution changes or to implement temporal filters.

    If you love yourself, then most likely you will want to use something like WebCL.


    WebCL

    But it won't work out. Wikipedia says that WebCL 1.0 was finalized on March 19 of this year, making it the youngest technology on this list, younger even than SIMD.js. And unlike SIMD.js, it will not be supported in Firefox in the near future. I read somewhere about a similar decision for Chrome, but lost the link. So WebCL is currently a dead technology with no clear future.


    Conclusion

    Real-time video processing in the browser is possible, but the only workable option is WebGL, and programming video filters with it is an occupation for true masochists. All other approaches run into the terrible performance of the 2d canvas context, and are not particularly fast themselves. Such sad things.
