Testing video codecs. Episode II: Attack of the Encoders

    We continue to learn the secrets of testing video codecs. This time, let's talk about encoders.
    Link to the first part.


    I should note right away that encoders are not decoders: there are no predefined sets of video sequences that everyone tests against, which complicates the whole process from the very beginning. There are, however, commonly used sequences; they are usually employed to prove the superiority of certain codecs (or their implementations) over others.
    Excellent comparisons are published by the team at Moscow State University; see compression.ru

    Speaking of comparisons, two criteria can be distinguished: quality (subjective or objective) and performance. Rarely can a codec win in both categories at once. But back to our topic: testing encoders.
    So, the input is an uncompressed video plus an implementation of some codec (the encoder itself), and the output is a compressed clip. We already said a bit about the input above, but I want to note that here we are limited only by our imagination and... licenses. That is, you can always shoot a video yourself and then use it to torment the decoder/encoder, whereas downloading a movie from the Internet for the same purpose is not always an option. In addition, it can be useful to create artificial sequences: noise, frames of a single color, and so on.
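    Such artificial sequences are easy to generate by hand. Below is a minimal sketch that builds raw I420 (YUV 4:2:0) frames — solid-color frames and noise frames — in pure Python; the frame size and the gray test color are arbitrary choices for illustration:

```python
import random

FRAME_W, FRAME_H = 64, 64  # small frames are enough for smoke tests

def solid_frame(width, height, y, u, v):
    """One I420 (YUV 4:2:0) frame filled with a single color."""
    return (bytes([y]) * (width * height)          # luma plane
            + bytes([u]) * (width * height // 4)   # chroma U plane
            + bytes([v]) * (width * height // 4))  # chroma V plane

def noise_frame(width, height, rng):
    """One I420 frame of uniform random noise -- a stress case for rate control."""
    return bytes(rng.randrange(256) for _ in range(width * height * 3 // 2))

# A tiny artificial clip: gray frames followed by noise frames.
rng = random.Random(42)  # fixed seed keeps the test clip reproducible
clip = b"".join([solid_frame(FRAME_W, FRAME_H, 128, 128, 128)] * 5
                + [noise_frame(FRAME_W, FRAME_H, rng) for _ in range(5)])
```

The two extremes are deliberately opposite: solid frames compress almost to nothing, while noise is nearly incompressible, so together they exercise both ends of the encoder's rate control.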
    After a lengthy compression process, we get the coveted compressed stream. It contains almost everything that was in the source data (we are talking about lossy compression). This is the file we have to examine. And since the stream must conform to a specification, one of the important checks is making sure that our clip complies with the standard.
    This is best done with third-party tools: that way, outside users have more trust in the results.

    But initiative is not always punished: if you feel like it, you can also parse the file into its components and double-check everything. This is worth doing anyway, to make sure that our encoder wrote all the parameters the way we wanted, and not just any old way. Of course, in this case we are limited to the “knobs” we can actually “turn” in the encoder settings.
    x264 fans will have to make do with fewer switches
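    Once the stream headers are parsed, checking them against the requested settings is a simple dictionary diff. A minimal sketch (the parameter names here — profile, level, ref_frames — are illustrative, not tied to any particular SDK):

```python
def params_mismatch(requested, written):
    """
    Compare the 'knobs' we set in the encoder against the parameters
    actually found in the parsed bitstream headers.
    Returns {name: (requested_value, written_value)} for every mismatch;
    a written value of None means the parameter is missing from the stream.
    """
    return {name: (value, written.get(name))
            for name, value in requested.items()
            if written.get(name) != value}

# Hypothetical usage: 'written' would come from your bitstream parser.
requested = {"profile": "high", "level": 41, "ref_frames": 4}
written = {"profile": "high", "level": 40, "ref_frames": 4}
assert params_mismatch(requested, written) == {"level": (41, 40)}
```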

    But back to our encoded file. How do we evaluate the quality of the image the user will see? Everything is very similar to the first part: we need a decoder,
    preferably a third-party, reference one, so that the encoder's and decoder's errors don't cancel each other out

    with whose help we obtain an uncompressed video sequence that we can compare with the original using the metrics we already know: PSNR and SSIM.
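    PSNR is straightforward to compute yourself from the raw planes. A self-contained sketch, operating on one plane (e.g. luma) of 8-bit samples:

```python
import math

def psnr(ref, dist, max_val=255):
    """PSNR in dB between two equally sized byte planes, e.g. the luma plane."""
    if len(ref) != len(dist):
        raise ValueError("planes must be the same size")
    # Mean squared error over all samples in the plane.
    mse = sum((a - b) ** 2 for a, b in zip(ref, dist)) / len(ref)
    if mse == 0:
        return math.inf  # planes are bit-identical
    return 10 * math.log10(max_val ** 2 / mse)
```

For example, a constant offset of 2 on every pixel gives an MSE of 4, i.e. roughly 42.1 dB — well above the 35–40 dB range usually considered good for lossy video.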
    However, here we come to an important question: what should the “similarity” thresholds for a video be? In other words, which PSNR/SSIM values satisfy us and which do not? In general, good is when it's not bad, i.e. there are no artifacts, distortions, or anything else the eye dislikes. And it's good if our metrics have caught that. Do not forget to write the resulting number down in a safe place and use it during the next check, in case something breaks. But things can “break” in different ways.
    So, let's say that yesterday our encoder gave X dB at bitrate Y (we encode with a constant bitrate)
    all other parameters are left fixed and out of consideration

    Today our encoder produces X1 dB at the same bitrate. What's going on? Do we sound the alarm? Wait a minute: is X1 greater than X? If so, everything is fine, the encoder got better! Or did it? It's not as simple as we would like: with a constant bitrate there is no guarantee that the bitrate really is constant; slight deviations (in either direction) are possible. Now imagine that the supposedly higher quality comes from an increased actual bitrate. Then there is no real improvement: the encoder “cheated” by taking more bits and spending them on quality. Likewise, a drop from X to X1 does not necessarily mean the encoder got worse: perhaps it saved us some bits. Moral: more does not mean better, and less does not mean worse.
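    This logic can be captured in a regression check that refuses to compare PSNR at all when the achieved bitrates have diverged. A sketch under assumed tolerances (2% on bitrate, 0.1 dB on PSNR — both are illustrative, not standard values):

```python
def compare_runs(baseline, current, rate_tol=0.02, psnr_margin=0.1):
    """
    Decide whether a quality change is meaningful, given that 'constant'
    bitrate is never exactly constant.  baseline/current are dicts with
    'psnr_db' (quality) and 'kbps' (the *achieved* bitrate, not the target).
    Returns 'incomparable', 'regressed', or 'ok'.
    """
    rate_ratio = current["kbps"] / baseline["kbps"]
    if abs(rate_ratio - 1.0) > rate_tol:
        # The encoder spent a noticeably different number of bits; a plain
        # PSNR comparison would be misleading -- compare full RD curves instead.
        return "incomparable"
    if current["psnr_db"] < baseline["psnr_db"] - psnr_margin:
        return "regressed"
    return "ok"
```

The “incomparable” verdict is the important one: it flags exactly the case where a naive threshold check would either cheer a cheating encoder or punish a thrifty one.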

    But why are we only talking about picture quality? It's time to think about decoders too: after all, they will have to take apart all this encoded muck. Did the encoder put all the necessary parameters into the stream? Did it set them correctly? Did it meet all the requirements and restrictions described in the standard?
    All this and much more must be checked, double-checked, and then checked again. And so every time, with every encoder. But we have already talked a bit about this, so we won't dwell on it.

    And it would seem we could finish here, but there is one “but”: we have an SDK! This means that all the options for working with memory, threads, and everything else the user can change must be checked. In theory, in most cases all the ways of using the SDK should produce the same result.
    And really, why should quality suffer if the user uses D3D memory instead of system memory? Or changes the asynchronous depth? Or decides to use more (or fewer) threads?

    So we may as well thoroughly test one of the most-used configurations and compare the rest against it. This is faster, because the decoding operations (to obtain the raw video) and the metric calculations (on those raw files) can hardlyly be called cheap.
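    When the SDK guarantees bit-identical output across configurations, the comparison can be a cheap hash check instead of full decoding and metric runs. A minimal sketch (the config names are made up for illustration; if outputs are only required to be perceptually equal, fall back to metrics instead):

```python
import hashlib

def stream_digest(raw_bytes):
    """Cheap equality proxy for decoded raw video: hash instead of a full diff."""
    return hashlib.sha256(raw_bytes).hexdigest()

def check_against_model(model_output, config_outputs):
    """
    model_output: raw bytes decoded from the thoroughly tested 'model' config.
    config_outputs: {config_name: raw bytes} for the remaining SDK configs
    (D3D memory, a different async depth, other thread counts, ...).
    Returns the names of configs whose output differs from the model's.
    """
    model = stream_digest(model_output)
    return sorted(name for name, data in config_outputs.items()
                  if stream_digest(data) != model)
```

This way the expensive decode-and-measure pass runs once, on the model configuration, and every other configuration costs only one hash.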

    Summing up, I want to note that while in decoder testing we are mostly interested in the input data (and the more diverse, the better), in encoder testing everything matters: both the input data and the encoding parameters, of which there can be a great many. And the checks themselves in the second case have to be more intelligent.

    UPD
    An article reviewing approaches to testing Video Pre-Processing can be read on ISN
