Upscaling 1080P video to 4K, or how I learned to stop worrying and love upscaling with neural networks

While reading a recent article about upscaling (scaling an image to a higher resolution), this time about the commercial product Topaz Gigapixel AI, I left the following comment:
It’s a pity that the post is just a translation; I would have liked to see a comparison with something free, such as waifu2x. I think the difference would be very hard to find, even though waifu2x is designed for animation.
Well, since that article was a translation, I decided to take matters into my own hands. So, without further ado, meet the contenders:


Under the cut: a longread, plus video guides on upscaling with Instant 4K, Waifu2x, Lanczos and Topaz Gigapixel AI.


We will compare them with our own eyes, because an image can be tuned to a particular analysis tool to win a few “extra” percentage points. We will not give up such tools either, though: the results of the MITSU and VMAF analysis are added alongside the screenshots, since both can be used without a real 4K sample. SSIM, PSNR and the like will not suit us in this case, because we have no real 4K with which the upscale results could be compared.




All the files from the article, including PNG frames without lossy compression, animated comparisons, 4K H.265 samples, tables and FFV1 files, are on Google Drive, with a mirror on Yandex. There are also three simple video guides on using Instant 4K in Adobe Premiere, Waifu2x and Topaz.

1. Preparation of material for testing

Testbed configuration:
CPU: Intel Core i7-4980HQ 4.2 GHz
Motherboard: MSI Z97 GAMING 5 (do not ask)
RAM: 32 GB DDR3 2400
GPU: NVIDIA GTX 1080ti FE 11GB, the core frequency is set manually to 1923, the memory frequency is 5602.
Storage: system and programs on SSD M.2 SATA 850 EVO 250GB, files on HDD 2 TB WDC WD40EZRZ.
Upscaling was performed according to the following scenario:

  • If the method could work directly inside the Adobe Premiere video editor, all the work was done in the editor, and the result was exported with the FFV1 codec and GOP 1, so that only lossless (reversible) compression was used and nothing was lost.
  • If the method could not work from the video editor, the video was split into individual PNG frames using a simple bat file:

    BAT file for saving individual frames to the frames folder
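    @echo off
    rem Lower the priority of cmd.exe to "below normal" (16384) so the PC stays responsive
    wmic process where name="cmd.exe" CALL setpriority 16384
    rem Process every file dragged onto this bat file, one at a time
    :next
    if "%~1" EQU "" goto done
    rem Write each frame as frames\image-NNN.png (%% is a literal % inside a bat file)
    ffmpeg -probesize 1000M -i "%~1" -vsync vfr frames\image-%%03d.png
    shift
    goto next
    :done
    pause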

    To split a video into frames, put this file in the folder that contains ffmpeg.exe and create a frames folder there, then simply drag the video file onto the bat file.

    After that, the processed frames were passed to FFmpeg to be assembled back into a video and exported with exactly the same settings as in the first case.

    BAT file for converting individual frames into a file, without sound
    ffmpeg -framerate 24 -i image-%%03d.png -vcodec ffv1 -pix_fmt yuv420p -level 3 -g 1 -r 24 photoshop.mkv

    The -framerate parameter sets the rate at which the input frames are read, and -r sets the frame rate of the output video; both should be specified.
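The two bat files above can also be driven from a script. Below is a minimal Python sketch that only builds the same ffmpeg argument lists; the function names and the `frames` folder default are hypothetical, and the flags are copied from the bat files rather than re-tuned. Note that inside Python the frame pattern is written with a single `%`, because the doubled `%%` is needed only inside a bat file.

```python
from fractions import Fraction


def extract_frames_cmd(video, out_dir="frames"):
    """Build the ffmpeg argument list that splits a video into PNG frames."""
    return [
        "ffmpeg", "-probesize", "1000M", "-i", video,
        "-vsync", "vfr",
        f"{out_dir}/image-%03d.png",  # single % here: no bat-file escaping
    ]


def assemble_frames_cmd(out_file, fps=Fraction(24), frames_dir="."):
    """Build the ffmpeg argument list that packs PNG frames into lossless FFV1."""
    rate = f"{fps.numerator}/{fps.denominator}"  # e.g. "24/1" or "24000/1001"
    return [
        "ffmpeg", "-framerate", rate,          # rate at which frames are read
        "-i", f"{frames_dir}/image-%03d.png",
        "-vcodec", "ffv1", "-pix_fmt", "yuv420p", "-level", "3", "-g", "1",
        "-r", rate,                            # rate of the output video
        out_file,
    ]


if __name__ == "__main__":
    print(" ".join(extract_frames_cmd("input.mkv")))
    print(" ".join(assemble_frames_cmd("photoshop.mkv", Fraction(24000, 1001))))
```

To actually run a command, pass the list to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is on the PATH.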

In both cases, the output was an MKV file with the FFV1 codec, approximately 700 megabytes in size for every 10 seconds of video.

The result of the analysis of one of the files using MediaInfo
General
Unique ID : 116184020412676472870756705294056286853 (0x57683A783D4732308C09451184B9EA85)
Complete name : D:\HABR\4K SOURCE\topaz.mkv
Format : Matroska
Format version : Version 4
File size : 1.21 GiB
Duration : 16 s 475 ms
Overall bit rate mode : Variable
Overall bit rate : 630 Mb/s
Writing application : Voukoder 1.2.1 (Premiere) - www.voukoder.org
Writing library : Lavf58.12.100
ErrorDetectionType : Per level 1

Video
ID : 1
Format : FFV1
Format version : Version 3.4
Format settings, GOP : N=1
Codec ID : V_MS/VFW/FOURCC / FFV1
Duration : 16 s 475 ms
Bit rate mode : Variable
Bit rate : 618 Mb/s
Width : 3 840 pixels
Height : 2 160 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 23.976 (24000/1001) FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Compression mode : Lossless
Bits/(Pixel*Frame) : 3.107
Stream size : 1.19 GiB (98%)
Default : Yes
Forced : No
Color range : Limited
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709
coder_type : Golomb Rice
MaxSlicesCount : 12
ErrorDetectionType : Per slice
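
As a sanity check, the bit rate and Bits/(Pixel*Frame) values in this MediaInfo report can be re-derived from the file size, duration and frame geometry. The Python sketch below is only that arithmetic; the function names are made up for illustration, and because MediaInfo prints rounded values the results agree only approximately.

```python
GIB = 1024 ** 3  # bytes in one GiB


def overall_bitrate_mbps(size_gib, duration_s):
    """Overall bit rate in Mb/s from file size (GiB) and duration (s)."""
    return size_gib * GIB * 8 / duration_s / 1e6


def bits_per_pixel_frame(bitrate_mbps, width, height, fps):
    """Average number of bits spent on one pixel of one frame."""
    return bitrate_mbps * 1e6 / (width * height * fps)


if __name__ == "__main__":
    # Values taken from the report: 1.21 GiB, 16 s 475 ms,
    # 618 Mb/s video, 3840x2160, 24000/1001 FPS.
    print(overall_bitrate_mbps(1.21, 16.475))
    print(bits_per_pixel_frame(618, 3840, 2160, 24000 / 1001))
```

Roughly 3 bits per pixel per frame is what lossless FFV1 cost on this material, which is why a few seconds of 4K already weigh in at more than a gigabyte.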


After that, each of the files was opened in MPC-HC to take screenshots, and then sent to ffmpeg for analysis using, again, a bat file:

For this script to work, ffmpeg must be built with libvmaf support!
ffmpeg.exe -i instant_4k.mkv -i A016_C001_02073O_001.mkv -lavfi libvmaf=model_path=vmaf_4k_v0.6.1.pkl:log_path=vmaf.log:log_fmt=json:psnr=1:ssim=1:ms_ssim=1 -f null -
An ffmpeg build with VMAF support can be downloaded from my Google and Yandex links above.
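
The command above writes per-frame scores to vmaf.log as JSON. Here is a small Python sketch for summarizing such a log before it goes into a spreadsheet. It assumes the log has a top-level `frames` list whose items carry a `metrics` object with a `vmaf` key; that layout may differ between libvmaf versions, so check your own file first. The function names and the threshold of 90 are hypothetical.

```python
import json


def load_vmaf_log(path="vmaf.log"):
    """Read the JSON log written by the libvmaf filter."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_vmaf(log, threshold=90.0):
    """Return mean and minimum VMAF plus the frames that fall below a threshold.

    Assumed layout: {"frames": [{"frameNum": 0, "metrics": {"vmaf": 99.1}}, ...]}
    """
    scores = [frame["metrics"]["vmaf"] for frame in log["frames"]]
    dips = [
        (frame.get("frameNum", index), frame["metrics"]["vmaf"])
        for index, frame in enumerate(log["frames"])
        if frame["metrics"]["vmaf"] < threshold
    ]
    return {"mean": sum(scores) / len(scores), "min": min(scores), "dips": dips}


if __name__ == "__main__":
    # Tiny synthetic log, for illustration only.
    demo = {"frames": [
        {"frameNum": 0, "metrics": {"vmaf": 100.0}},
        {"frameNum": 1, "metrics": {"vmaf": 77.0}},
    ]}
    print(summarize_vmaf(demo))
```

Listing the frames that dip makes it easy to open exactly those frames in a player and check whether the drop corresponds to anything visible.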

MITSU, in turn, needed no configuration, although the ready-made bat files required a little manual tweaking (I specified the full paths to ffprobe and ffmpeg, and also put cygwin1.dll in the folder with the executable file).

The resulting data was imported into Excel and turned into charts, some prettier than others.
The FFV1 files were also converted to MP4 with the H.265 codec (to minimize file size while preserving as much detail as possible) at VBR 25,000 kbps with SAO disabled, in the hope of achieving adequate quality.


2. The choice of material for testing

In this article we look at a variety of examples: fragments of a TV series in 1080P, hand-drawn animation, and even 4K video shot on a professional camera, to get objective results for the different use cases of the programs in question.

The following types of videos were processed:

  • A scene from the series "Person of Interest", S02E20, the intro: lots of computer graphics, lots of movement and shot changes.
  • A scene from the same episode: little movement, a few shot changes, and plenty of detail in the background and foreground, from tree branches to facial details.

    Both scenes are taken from one file with a resolution of 1920 × 1080 (Full HD), H.264 codec, bitrate 12,664 kbps. This is fairly good source quality for a TV series. The upscale results were compared with the file enlarged using simple bilinear interpolation (it appears in the list as the reference).
  • Sample video from RED.com: real 4K (3840x2160) shot at 120 FPS. The video was reduced to 1920x1080 using Lanczos and then enlarged back to 4K with the programs from the list above. The frame rate was reduced to ~24 frames per second. The result was compared with the source file converted from the RED format to the familiar FFV1 (ffmpeg refuses to work with RED files).
  • Hand-drawn animation "The Sacred Book of the Werewolf", downloaded from YouTube: artificial "camera" shake, many shot changes. The source file is 1280x720, WEBM container, VP9 codec, bitrate 1556 kbps. This is very low quality, but quite common on YouTube.


3. Selected programs for testing

We will explore each of the upscale methods in a little more detail:

  • Red Giant Shooter Instant 4K 13.1.5
    Paid: yes ($ 99).
    Integration with Adobe Premiere (does not require decomposition of video into frames): yes.

    Quality settings: Filter Type - Best, Sharpness 2, Quality 25 (maximum), Anti-aliasing 6. These are standard settings, with the exception of Quality - it was set to maximum manually.
    Processing time for 914 INTRO frames: 532 seconds (11 times longer than regular exports).

    Upscale method: unknown (“intelligent algorithms”)

    The interface of the Instant 4K plug-in in the Adobe Premiere 2019 program window, in the upper left corner.

    Select the target resolution (you can specify your own), select the filter type (I see no reason to choose anything other than Best), change Sharpness, Quality and Anti-Aliasing or leave them at their defaults, then export the video file and enjoy the result.

    It works fairly fast and puts practically no more load on the GPU and CPU than a regular export. Until a week ago it was my standard plug-in for upscaling video. It makes changes to picture details that are not typical of simple mathematical filters, so it is most likely AI-based. It takes second place in speed.

    It lets you see the result and adjust the settings before rendering the video. Smooth playback requires rendering; changes appear in the viewer a couple of seconds after the settings are changed.
  • Lanczos filter.

    Paid: no.

    Integration with Adobe Premiere (does not require decomposition of video into frames): yes, as part of the Voukoder plugin .

    Quality Settings: None.

    Processing time for 914 INTRO frames: 54 seconds (1.13 times longer than regular export), excluding the time it takes to split the video into separate frames and then assemble the frames back into a video.

    Upscale Method: Non-AI.

    Example command line for upscaling a video file on Windows:
    ffmpeg -i input.mp4 -vcodec ffv1 -pix_fmt yuv420p -level 3 -g 1 -vf scale=3840:2160:flags=lanczos+full_chroma_inp -r 23.976 lanczos.mkv


    There is no preview and there are no settings. It can work directly with video, without splitting it into frames. It works great from ffmpeg. It is faster than all the others and runs only on the CPU.
  • Adobe Photoshop, Preserve Details 2.0
    Paid: Yes.

    Integration with Adobe Premiere (does not require video decomposition into frames): no.

    Quality settings: there is a “noise reduction” parameter, set to 100%, as the author of that article did.

    Processing time for 914 INTRO frames: 3840 seconds (80 times longer than regular export), excluding the time it takes to split the video into individual frames and then assemble them back into a video.

    Upscale Method: Unknown.

    It requires initial setup in the form of creating an “open file, resize, save file” template, and it spends a lot of time just opening and saving files, which is a consequence of the chosen frame format, PNG. It runs on the CPU. Second slowest.

    It lets you see the result before enlarging the image and saving it.
    An inexperienced user may run into problems with color space changes: double-check the color profile after saving the image, and compare the colors of the original image and the upscale. Most likely, the sRGB IEC61966-2.1 color profile will suit you.
  • Topaz Gigapixel AI
    Paid: yes ($ 99).

    Integration with Adobe Premiere (does not require video decomposition into frames): no.

    Quality Settings: Suppress Noise and Remove Blur. In addition, the “maximum quality AI models” can be enabled or disabled. In our comparison these models are enabled.

    Processing time for 914 INTRO frames: 7680 seconds (160 times longer than regular export), excluding the time it takes to split the video into separate frames and then assemble the frames back into a video.

    Upscale Method: AI.

    A fairly pleasant interface with a preview of the result and a fully functional 30-day trial. It loads the video card heavily and in effect runs only on it; if desired, it can run on the CPU. The slowest of all.
  • Waifu2x with UpResNet10 profile
    Paid: no.

    Integration with Adobe Premiere (does not require video decomposition into frames): no.

    Quality settings: on Windows there is waifu2x-caffe, which lets you select a profile and adjust the strength of noise reduction (Off / 1 / 2 / 3). Experimentally, I chose the UpResNet10 profile as the one showing the best result. Noise reduction is set to AUTO 1.
    Processing time for 914 INTRO frames: 879 seconds (18 times longer than regular export), excluding the time it takes to split the video into separate frames and then assemble the frames back into a video.
    Upscale Method: AI.
    A simple interface. It runs on the GPU, which may cause extra difficulties during initial setup (I had to install cuDNN, which took 10-15 minutes); if desired, it can run on the CPU. Average speed, ahead of only Photoshop and Topaz.
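
The processing times listed above are easier to compare side by side. The Python sketch below recomputes the slowdown factors and the throughput in frames per second. The 48-second baseline for a regular export is not stated in the article; it is inferred here from 3840 seconds being quoted as 80 times longer, so treat it as an assumption.

```python
FRAMES = 914        # length of the INTRO fragment, in frames
BASELINE_S = 48     # assumed regular export time, inferred as 3840 s / 80

# Processing times in seconds, as listed for each method.
TIMES_S = {
    "Lanczos": 54,
    "Instant 4K": 532,
    "Waifu2x (UpResNet10)": 879,
    "Photoshop": 3840,
    "Topaz Gigapixel AI": 7680,
}


def slowdown(seconds, baseline=BASELINE_S):
    """How many times longer than a regular export the method takes."""
    return seconds / baseline


def frames_per_second(seconds, frames=FRAMES):
    """Average throughput of the method."""
    return frames / seconds


if __name__ == "__main__":
    for name, t in sorted(TIMES_S.items(), key=lambda item: item[1]):
        print(f"{name:22s} {slowdown(t):6.1f}x  {frames_per_second(t):6.2f} frames/s")
```
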

This is not a complete list of programs suitable for upscaling video, and in the next article I will try to add even more methods. Suggest your own options in the comments!

4. Finally, we move on to the video:


4.1 File 1: “INTRO”:

First fragment: computer graphics.
A frame from the series enlarged using bilinear interpolation and reduced to 720P for previews.
The main differences: the quality of the text and the marks on the map, including the grids throughout the frame, and how well the shape of the “sight” in the center of the frame is preserved.
JPEG results: Instant 4K, Lanczos, Photoshop, Topaz, Waifu2x with the UpResNet10 profile, and the original.
Animated comparison: MP4 H.264, WEBP. Both files and the original frames in PNG are also available here. I recommend viewing the videos and frames locally, not from the Google Drive page.
Full-size 4K samples in the H.265 codec are here (~100 MB each).

Animated comparison of the frame center: MP4 H.264, WEBP.
Let's start with the sight in the center. Red square: note how the two intersecting lines behave. Since we are (partly) dealing with AI, we look for unwanted distortions. Instant 4K changed this square quite heavily, for which it gets a minus from me. All the other methods behaved roughly the same; the clearest result is Photoshop's, with Topaz in second place. UpResNet10, unfortunately, noticed the compression artifacts and kindly decided to enlarge them. Lanczos is practically no different from plain enlargement.
Blue square: watch the shape of the circles, hoping for the smoothest circle possible. Instant 4K again made changes of its own and smoothed the circle, removing the gap at the bottom of it; this time, though, it gets a plus from me for that. Still, Photoshop has the best result. UpResNet10 brought out too much detail (the grid) over the circle, which can be perceived as artifacts.

Animated text comparison: MP4 H.264, WEBP.
On to the text: Lanczos made it bold, for which it gets a minus. UpResNet10 got carried away with the grid again, and again did worse than its competitors. I liked the text produced by Topaz most of all, although it is roughly on par with the results of the other methods, so this is a matter of taste. Those who are ready to forgive Instant 4K its “improvisation” may like its text best; the rest, I believe, will be split between Topaz and Photoshop. The latter is still “soapy” for my taste.

Animated comparison of the grid on the map: MP4 H.264, WEBP.
The last point is the grid on the map. Here everyone performed more or less the same, except for two that stood out: UpResNet10 and Topaz. Topaz killed a whole bunch of dots and almost all the detail. UpResNet10, on the contrary, brought out dots where they had been practically invisible. Neither option suits me personally, so I split the victory “on points” between Instant 4K and Photoshop.
Overall, Photoshop wins on the static computer-graphics map, Instant 4K takes second place, and UpResNet10 takes third (after all, we want more detail from 4K, and it gave us that, albeit of dubious quality). Topaz killed too much detail, while Lanczos simply differs too little from bilinear interpolation.

We look at the next frame: the middle of the animation, with movement.
A frame from the series enlarged using bilinear interpolation and reduced to 720P for previews.
JPEG results: Instant 4K, Lanczos, Photoshop, Topaz, Waifu2x with the UpResNet10 profile, and the original.
Animated comparison: MP4 H.264, WEBP. Both files and the original frames in PNG are also available here.
At first glance the picture is as expected: all options except Lanczos gave a sharper image. We will pay particular attention to the details in the Topaz output, because it changed the picture the most. We will study the real (not drawn) actor and the details of his clothes, as well as the text.

Animated actor comparison: MP4 H.264, WEBP.
As for the actor, the differences here are really minimal: although last time UpResNet10 brought out (sometimes superfluous) details that were previously hard to see, this time it only sharpened the diagonal grid at the bottom. The differences between Photoshop, Lanczos and Instant 4K have to be hunted for with a magnifying glass; even 400% magnification is not enough to notice them. On the whole, Photoshop and Instant 4K produced a slightly sharper picture. Instant 4K is improvising again: the actor's shirt collar has changed in the red square. A real difference from plain enlargement, however, is noticeable only with Topaz: the color noise around the stripes has decreased, and it is the only program that increased sharpness in the blue square.

Animated text comparison: MP4 H.264, WEBP.
Moving down to the text: here Photoshop lost to all the other enlargement methods and produced a picture identical to Lanczos. UpResNet10 and Instant 4K added sharpness, but also improvised a little with the shape of the letters. Topaz again gave the clearest picture, emphasizing the jaggedness of the letters (good or bad, you decide) and minor compression artifacts. The color noise is reduced again, and the capital letters at the top are beyond comparison with the other enlargement methods.

The last frame from the fragment: a frame from the series with computer graphics superimposed on it.
A frame from the series enlarged using bilinear interpolation and reduced to 720P for previews.
JPEG results: Instant 4K, Lanczos, Photoshop, Topaz, Waifu2x with the UpResNet10 profile, and the original.
Animated comparison: MP4 H.264, WEBP. Both files and the original frames in PNG are also available here.

Animated comparison of the actor's face: MP4 H.264, WEBP.
We will look under magnification only at the face of the actor on the right: Topaz added sharpness and emphasized the grid very well, so the actor's face came out very clear. Instant 4K and UpResNet10 did the same, but removed less blur. Photoshop lost to them as well, beating only Lanczos. On the whole, no one lost any detail, and Instant 4K made no unnecessary changes.

Across the three selected frames, the winner for me was Topaz. Despite a bunch of lost detail in the first frame, it made up for it in the last two. Second place goes to Instant 4K, for more or less consistent sharpening in all three frames, despite its minor changes. Third is UpResNet10: in the first frame it produced a picture I did not like, but in the last two it showed a good increase in sharpness. Photoshop started well in the first frame, but in the last two it hardly differed from plain enlargement. Lanczos without sharpening is almost no different from bilinear interpolation in all three frames.
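
To see why Lanczos lands so close to bilinear interpolation, it helps to remember that both are fixed convolution kernels: neither can invent detail, they only weight neighboring pixels differently. The sketch below is a textbook 1-D illustration in Python, not the code ffmpeg actually runs; the function names are made up, and edge pixels are simply repeated.

```python
import math


def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def triangle_kernel(x):
    """Kernel of linear (bilinear in 2-D) interpolation."""
    return max(0.0, 1.0 - abs(x))


def resample_1d(samples, factor, kernel, support):
    """Upscale a 1-D signal by an integer factor with the given kernel."""
    n = len(samples)
    out = []
    for j in range(n * factor):
        # Position of output sample j in input coordinates (centers aligned).
        x = (j + 0.5) / factor - 0.5
        first = math.floor(x) - support + 1
        acc = 0.0
        weight_sum = 0.0
        for i in range(first, first + 2 * support):
            w = kernel(x - i)
            clamped = min(max(i, 0), n - 1)  # repeat edge samples
            acc += w * samples[clamped]
            weight_sum += w
        out.append(acc / weight_sum)  # normalize so flat areas stay flat
    return out


if __name__ == "__main__":
    edge = [0] * 4 + [255] * 4  # a hard black-to-white edge
    print([round(v) for v in resample_1d(edge, 2, triangle_kernel, 1)])
    print([round(v) for v in resample_1d(edge, 2, lanczos_kernel, 3)])
```

On a hard edge the Lanczos output overshoots slightly on both sides, which reads as a touch of extra sharpness (or as ringing), while the linear kernel stays within the original range. On soft, compressed video there are few such edges, so the two results look nearly identical.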

Let's see what MITSU and VMAF have to say:
The result of the MITSU analysis (full size), Blur and Noise, smaller is better.
VMAF analysis result (full size), bigger is better.
MITSU reports that the frames processed by Topaz contain the least blur. However, all the other upscale methods, including Lanczos, also improved the Blur score and crossed the 5-point line; values below it tell us that the video does not have too much blur. That is, all upscale methods improved image clarity.
As for the noise, the picture is the opposite: the least noise is in the original picture, the most in Topaz. In principle, this is logical because:
  1. Part of the noise could indeed be incorrectly classified by all programs as details and “improved”.
  2. Some details of the image could be recognized by MITSU as noise.

In any case, I did not see any noticeable increase in the amount of noise in the picture, and all values are below the 3.5-point threshold, that is, according to the MITSU documentation, they have no visible noise.
Since the noise in all frames is far below the noticeable level, we will judge by the Blur score.
As for VMAF, all the curves sit mostly at the maximum value of 100. The chart does show dips, however: those of Instant 4K and Topaz, for example, have the same shape, except that Topaz drops lower. In the middle of the chart Instant 4K gives way to UpResNet10, which dips slightly together with Topaz. Here Topaz drops to a VMAF of 77, and at the end of the chart the value for Photoshop falls to 0. At the same time, there are no visible differences, artifacts or “glitches” in these frames.
So, the metrics ranked the programs as follows: UpResNet10 in first place, Instant 4K in second, Photoshop in third.
The "original" file FFV1 MKV 1080P, for those wishing to conduct their own experiments or repeat mine, can be downloaded here.

4.2 File 2: “Scene 1”

Second fragment: a scene without computer graphics, with details of faces and backgrounds.
The main differences: background details, face details, and artifacts on them.
A frame from the series enlarged using bilinear interpolation and reduced to 720P for previews.
JPEG results: Instant 4K, Lanczos, Photoshop, Topaz, Waifu2x with the UpResNet10 profile, and the original.
Animated comparison: MP4 H.264, WEBP. Both files and the original frames in PNG are also available here.
Full-size 4K samples in the H.265 codec are here (~97 MB each).

Animated tree comparison: MP4 H.264, WEBP.
First, consider the background, or rather the trees: Instant 4K and UpResNet10 performed roughly the same, adding a bit of sharpness, while Photoshop, by contrast, lost some detail by smoothing the picture. In addition, Instant 4K slightly distorted the trees (see the red squares). Topaz, however, added a lot of sharpness and turned the “soapy” background into very clear trees. Lanczos is again no different from bilinear interpolation.

A frame from the series enlarged using bilinear interpolation and reduced to 720P for previews.
JPEG results: Instant 4K, Lanczos, Photoshop, Topaz, Waifu2x with the UpResNet10 profile, and the original.
Animated comparison: MP4 H.264, WEBP. Both files and the original frames in PNG are also available here.
Animated comparison of the actor against the trees: MP4 H.264, WEBP.
Now another shot: the actor is facing us against a background of trees.
Spoiler: there is no need to look at the trees, they do not differ. So we study the glass in the actor's hand and his face.

Animated comparison of the actor's face: MP4 H.264, WEBP.
For the face, Instant 4K and UpResNet10 again showed roughly the same results, and Photoshop again lost some detail. Topaz added detail, though not as much as with the trees in the background. The artifacts characteristic of Instant 4K were not found this time. Lanczos turned out slightly sharper than bilinear interpolation.

Animated glass comparison: MP4 H.264, WEBP.
And in the last comparison, the drinking glass, the main differences showed up in its transparent part, above the drink. Topaz again added more detail than the rest, Photoshop again blurred the image (blue square), and UpResNet10 and Instant 4K again behaved roughly the same. The result is fully repeatable.
The winner in this shot is without a doubt Topaz, which added detail without artifacts (!). Second place goes to UpResNet10, which added a bit of sharpness and definitely fell short of Topaz, but added no artifacts, unlike third-placed Instant 4K. Photoshop comes last: it reduced the detail instead of increasing it.

A frame from the series enlarged using bilinear interpolation and reduced to 720P for previews.
JPEG results: Instant 4K, Lanczos, Photoshop, Topaz, Waifu2x with the UpResNet10 profile, and the original.
Animated comparison: MP4 H.264, WEBP. Both files and the original frames in PNG are also available here.
Animated comparison of the second actor's face: MP4 H.264, WEBP.
The last frame from the same fragment: the second actor.
Trees, backgrounds and glasses are already clear to us, so here we study the actor's face.
Here the result roughly matches the previous frames, with one very important caveat: Topaz painted something on the actor's cheek that I began to notice at 500% magnification and could make out properly only at 1000%: a very unpleasant "texture" that obviously should not be there. All the other methods performed roughly the same; only Photoshop lost detail again (but got rid of the noise on the white shirt). Last place, for such improvisation, goes to Topaz, second to last to Photoshop, and all the rest share an honorable “first place”.

Based on all three frames, Topaz takes first place, if reluctantly. Its behavior in the last frame can probably be fixed, but that will cost you extra time, and a lot of it. However, this flaw is hardly visible at 100% scale, and Topaz performed perfectly in the first comparison with the trees, so it is the winner. Second place goes to UpResNet10, for increasing clarity without artifacts; third to Instant 4K, for the same thing but with artifacts. Photoshop takes last place for degrading the clarity of the picture.

What about MITSU and VMAF?
The result of the MITSU analysis (full size), Blur and Noise, smaller is better.
VMAF analysis result (full size), bigger is better.
Here everything is about the same as with the first video: Topaz has the least Blur and the most Noise. The noise, once again, did not even reach 1 point, let alone 3.5, so again we look at the blur, where everyone exceeded the acceptable limit of 5 points. Topaz showed a noticeable improvement here, cutting Blur almost in half, to 6, while the rest are bunched together at 10-12 points.
The VMAF graph shows no dips, except that Topaz shows noticeably more variation between neighboring frames than the other programs. VMAF gave first place to Instant 4K, second to Topaz, and third to UpResNet10.
Thus, Topaz takes first place, Instant 4K second, and UpResNet10 third.
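
The way the MITSU numbers are read in this article (lower is better, Blur below 5 and Noise below 3.5 counting as "not noticeable") can be written down as a tiny helper. This Python sketch uses only the thresholds quoted in the text; the function names are hypothetical, and the sample values are rounded illustrations of the figures mentioned above (Blur 6 for Topaz, 10-12 for the rest, Noise under 1), not exact measurements.

```python
BLUR_LIMIT = 5.0    # Blur below this: no excessive blur
NOISE_LIMIT = 3.5   # Noise below this: no visible noise


def assess(blur, noise):
    """Check a pair of MITSU scores against the limits quoted in the text."""
    return {"blur_ok": blur < BLUR_LIMIT, "noise_ok": noise < NOISE_LIMIT}


def rank_by_blur(scores):
    """Order method names from least to most blur; scores maps name -> (blur, noise)."""
    return sorted(scores, key=lambda name: scores[name][0])


if __name__ == "__main__":
    # Illustrative values only, rounded from the description of Scene 1.
    scene1 = {"Topaz": (6.0, 0.9), "Other methods (typical)": (11.0, 0.5)}
    for name in rank_by_blur(scene1):
        print(name, assess(*scene1[name]))
```
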
The “original” file FFV1 MKV 1080P, for those who wish to conduct their own experiments or repeat mine, can be downloaded here .

4.3 File 3: “4K Source”

A frame from a RED file converted to FFV1 and reduced to 720P for previews.
JPEG results: Instant 4K, Lanczos, Photoshop, Topaz, Waifu2x with the UpResNet10 profile, and the original.
Animated comparison: MP4 H.264, WEBP. Both files and the original frames in PNG are also available here.
Full-size 4K samples in the H.265 codec are here (~49 MB each).
The third comparison is carried out as follows: the source file, 3840x2160 in R3D format, was exported from Adobe Premiere at the same resolution but with the frame rate reduced from 120 FPS to 23.98 FPS.
The resulting FFV1 file was then downscaled, this time using bilinear interpolation, to 1920x1080; that file was run through the upscale programs and compared with the 4K FFV1 file. In other words, the comparison is not with bilinear interpolation, as before, but with the original picture. The file is short, only 16 seconds, so we will compare a single frame.
The first difference that catches the eye is the color and brightness of the image. I rechecked the settings for splitting the video into frames and reassembling them with ffmpeg several times, and the result is the same: no algorithm was able to keep the colors and brightness unchanged, not even Lanczos. Brightness and color do not matter to us in this case, however, so we will look at image clarity, namely the details on the car, the images and the reflections.

Animated car comparison: MP4 H.264, WEBP.
Instant 4K slightly sharpened the entire image and drew a dark reflection above the wheel. Interestingly, all the other upscale methods drew this shadow too, whereas previously only Instant 4K “invented” details. Perhaps the algorithms picked out something that is hard for us to see: the transition from the area with the reflection on the car body. Topaz tried hardest of all, removing noise throughout the image (which is not always a good thing, since noise is often added or left in intentionally) and emphasizing the details on the car, including the images. However, that much blur cannot simply be removed, so in many places Topaz drew double contours, especially on the text (blue square). UpResNet10 disappointed this time by increasing the pixelation of some parts of the frame, probably mistaking noise for “features” (red square). Tellingly, similar behavior is seen in Lanczos, which has no AI at all. Photoshop landed solidly in the middle, reducing noise without adding much pixelation.
In this comparison the winner for me was once again Topaz, with Photoshop second and Instant 4K third.

Let's move on to the graphs: here we add SSIM and PSNR, since we have the original in 4K.
The result of the MITSU analysis (full size), Blur and Noise, smaller is better.
VMAF analysis result (full size), bigger is better.
SSIM analysis result (full size), bigger is better.
PSNR analysis result (full size), bigger is better.

PSNR put Photoshop first, SSIM chose Lanczos, and VMAF chose Topaz. Second place went to UpResNet10 according to both PSNR and SSIM, while VMAF preferred Instant 4K. Third place is again a mixed bag: PSNR gives it to Lanczos, SSIM to Instant 4K, and VMAF to Photoshop. At the same time, VMAF rates every Topaz frame at 100 points out of 100 possible; I checked twice.
MITSU again shows the familiar picture and tells us that noise is practically absent in all the results. Blur is not so good here: most of the upscalers retained the blurriness of the original, and only Topaz and UpResNet10 broke away from the group, by 4 and 2 points respectively. Neither could reach the coveted 5, but there is a result.
Of all four metrics, the VMAF and MITSU readings match my impressions most closely, while PSNR and SSIM prefer “math” to AI. Based on the analysis, it is difficult to name a clear winner.
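
It is worth recalling what PSNR actually measures, because that explains its taste: it is a purely pixel-wise error against the reference, so any detail an AI upscaler "invents" counts against it even when the picture looks sharper. Below is a minimal Python sketch of the standard definition for 8-bit samples, PSNR = 10 * log10(MAX^2 / MSE). It works on flat lists of pixel values for illustration and is not how ffmpeg computes its per-plane figures.

```python
import math


def mse(reference, test):
    """Mean squared error between two equally sized sequences of samples."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = mse(reference, test)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    smooth = [101, 99, 100, 100]    # small, evenly spread error
    detailed = [110, 90, 100, 100]  # stronger local changes ("added detail")
    print(psnr(original, smooth), psnr(original, detailed))
```

The second result is lower than the first even though nothing about the "detailed" version is necessarily worse to the eye, which is the behavior seen in the charts.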
The "original" FFV1 MKV 4K file, for those who wish to conduct their own experiments or repeat mine, can be downloaded here .

4.4 File 4: “Youtube”

A frame from a YouTube video, enlarged using bilinear interpolation and reduced to 720P for previews.
A YouTube video, taken from start to finish. The source quality is 720P, hand-drawn animation, with added camera shake. The quality is acceptable but leaves much to be desired, especially on a 4K monitor.
JPEG results: Instant 4K, Lanczos, Photoshop, Topaz, Waifu2x with the UpResNet10 profile, and the original.
Animated comparison: MP4 H.264, WEBP. Both files and the original frames in PNG are also available here.
Full-size 4K samples in the H.265 codec are here (~265 MB each).

In the first comparison, the contrast of the frame produced by Instant 4K is striking: it differs in both color and brightness. I exported this video several times from several versions of Adobe Premiere and checked the color; Instant 4K insisted on changing the color with any settings. The final result is from version 2019 with Display Color Management enabled.

Animated face comparison: MP4 H.264, WEBP.
Without zooming in, a difference is noticeable only for Instant 4K and Topaz, so let's take a closer look at the face.
Lanczos showed no improvement, and neither, oddly enough, did UpResNet10. Photoshop reduced the blockiness that too much compression had introduced in the original, without losing any detail. For animation this approach is quite acceptable: artifacts are removed without extra sharpening, and the picture becomes nicer overall. Instant 4K increased the number of artifacts and changed the colors; there is a slight gain in sharpness, but it is offset by even more artifacts. Topaz, on the other hand, demonstrated exactly what I would like to see from all the other methods: the artifacts and the blockiness disappeared almost completely, and the sharpness is beyond comparison with the original or any other upscale method. The winner, of course, is Topaz; second place goes to Photoshop.

image alt
A frame from a Youtube video, enlarged using bilinear interpolation, and reduced to 720P for previews.
JPEG results: Instant 4K, Lanczos, Photoshop, Topaz, Waifu2x with the UpResNet10 profile, and the original.
Animated comparison: MP4 H.264, WEBP. Both files and the original PNG frames are also available here.

We move on to the second frame: here we have many gradients with artifacts and, once again, facial details.
It will be interesting to see how each upscale method handled the artifacts on the gradients, and to look for unwanted changes. As for the artifacts on the gradient on the left (blue square), only Topaz and Instant 4K paid any attention to them. Topaz got rid of these blocks almost completely, while Instant 4K stumbled again, changing the colors and leaving the artifacts more noticeable than Topaz did. Even so, its result is still far ahead of the rest, where these blocks are alive and well.

Animated face comparison: MP4 H.264, WEBP.
In the original we again have a mess of macroblocks, and we expect roughly the same result as in the previous frame. Instant 4K slightly smoothed the transitions between blocks, while Photoshop, unlike in the first comparison, did worse. Still, the image became a little nicer after Photoshop. UpResNet10, unfortunately, fell flat on its face again and did not improve what is happening on screen; it is almost indistinguishable from Lanczos. Topaz, however, once again added clarity, removed artifacts and did not add new ones.
The winner here is Topaz again, second place goes to Instant 4K (I have already made peace with the color change), and third to Photoshop.

image alt
A frame from a Youtube video, enlarged using bilinear interpolation, and reduced to 720P for previews.
JPEG results: Instant 4K, Lanczos, Photoshop, Topaz, Waifu2x with the UpResNet10 profile, and the original.
Animated comparison: MP4 H.264, WEBP. Both files and the original PNG frames are also available here.

The third and last frame: here I deliberately picked a darker frame to see how each program handles artifacts in dark areas.

First of all, I want to draw attention to the green square: we expect every upscaler to reduce the blocking in this area. Unfortunately, only Topaz met our expectations, while the rest, including Photoshop, did not make the artifacts any less noticeable. Instant 4K shoots itself in the foot again by raising the brightness of the image and bringing out even artifacts that we simply could not see before. I should note that I did not really like the Topaz result on this frame either; I think it smeared the gradients too much. In return, however, it removed almost all visible flaws, so there is simply nothing to compare it with: all the others, as I said, left them practically untouched.

Animated clothing comparison: MP4 H.264, WEBP.
Since we have seen enough faces already, let's look at the larger details of the clothing: here we witness a real parade of macroblocks, especially on the gradients. Unfortunately, once again none of the upscalers improved the picture, with the exception of Topaz, which smoothed out all the gradients and removed the blocking almost completely, and therefore takes first place for this frame. Sadly, there is no one to award second and third place to.

To my surprise, waifu2x, which is seemingly tailored for upscaling anime and drawings, was completely unable to use that advantage.

Let's compare my impressions with the results of the VMAF and MITSU analysis, starting with the first:
image alt
MITSU analysis results (full size): Blur and Noise, lower is better.
image alt
VMAF analysis result (full size), higher is better.
VMAF ranked the upscale methods in almost the reverse of my order: Lanczos first, Topaz last. Every method shows frequent dips in quality. Topaz drops the lowest on the VMAF metric, and during the longest dips all the others follow it, except Lanczos. However, I did not see anything unusual in those frames.
MITSU, for its part, once again shows a big gap between Topaz and the rest on the Blur metric, as well as a slight increase in Noise. According to MITSU, absolutely all the upscale methods made the picture clearer, and Instant 4K even reduced the amount of noise, which is invisible to us anyway.
Thus, first place goes to Instant 4K, second to UpResNet10, and third to Topaz.
This time I disagree with VMAF.

The “original” FFV1 MKV 720P file, for those who want to conduct their own experiments or repeat mine, can be downloaded here.
↑ Back to contents

5. Analysis of the results

To determine the winner, I will use the results of the VMAF metric and my subjective conclusions about each file. For VMAF, I simply sum the average score of each file. For the subjective assessment, I go back to my conclusions about each file and award 3 points for first place, 2 for second, and 1 for third. I then add up these points and plot them on the vertical axis of the chart.
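The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample inputs (methods "A", "B", "C" and their scores) are hypothetical placeholders, not the measurements from this article.

```python
from collections import defaultdict

# Points awarded per place in the subjective ranking of one file.
PLACE_POINTS = {1: 3, 2: 2, 3: 1}


def subjective_totals(placings):
    """placings: one list per file, ordered from first place down (up to three names)."""
    totals = defaultdict(int)
    for ranking in placings:
        for place, method in enumerate(ranking, start=1):
            totals[method] += PLACE_POINTS.get(place, 0)
    return dict(totals)


def vmaf_totals(per_file_means):
    """per_file_means: one dict per file, mapping method -> mean VMAF of that file."""
    totals = defaultdict(float)
    for means in per_file_means:
        for method, score in means.items():
            totals[method] += score
    return dict(totals)


# Hypothetical placeholder inputs, NOT the article's data.
example_placings = [["A", "B", "C"], ["A", "C"], ["B", "A", "C"]]
example_vmaf = [{"A": 90.0, "B": 80.0}, {"A": 70.0, "B": 85.0}]

print(subjective_totals(example_placings))
print(vmaf_totals(example_vmaf))
```

A file with fewer than three placed methods (as happened with the dark frame above) simply contributes fewer points.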

image alt
The result of adding VMAF and subjective analysis (full size), higher is better.

According to the VMAF metric, Instant 4K turned out to be the winner, ahead of second-placed Topaz by only 0.58 points. In the subjective analysis, on the other hand, the gap between first and second place is much larger: Topaz took first place, 8 points ahead of second-placed Instant 4K, with 3 times as many points.

The winner of my comparison is Topaz, the second place goes to Instant 4K, and the third to UpResNet10.


However, the comparison would not be complete without comparing the speed of the programs and my comments on their work. Let's start with the first:
image alt
Time spent on processing and volumes of received files (full size)

The extra detail produced by Topaz affects even the file sizes, and significantly. As a rule, Topaz files are 2 times “heavier” than the files of other upscalers and 4 times “heavier” than the original files enlarged by bilinear interpolation. The only exception is the “4K Source” file, where the difference was less than 30%.

As for speed, all the programs except Topaz and Photoshop work at a perfectly acceptable pace. But acceptable for what?
What if we want to upscale a 2-hour movie at 25 frames per second through Topaz? My computer would need 421 hours, or 17 and a half days of continuous operation. What about a 40-minute series episode? 6 days.

With waifu2x the picture is somewhat different: 2 days for the film, 16 hours for the episode. Do you think the result shown by waifu2x is worth two days of continuous rendering? I leave it to each reader to answer that question.
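These estimates are easy to reproduce. The sketch below back-calculates the time per frame from the totals quoted above (421 hours for Topaz and 2 days for waifu2x on a 2-hour, 25 fps film) and applies it to a 40-minute episode. The per-frame values are derived from those totals rather than measured directly, and the function names are my own.

```python
def frames_in(duration_minutes, fps):
    """Number of frames in a clip of the given length."""
    return int(duration_minutes * 60 * fps)


def hours_needed(duration_minutes, fps, seconds_per_frame):
    """Total processing time in hours at a fixed cost per frame."""
    return frames_in(duration_minutes, fps) * seconds_per_frame / 3600


FILM_MIN, EPISODE_MIN, FPS = 120, 40, 25

# Seconds per frame, back-calculated from the totals quoted in the text.
topaz_spf = 421 * 3600 / frames_in(FILM_MIN, FPS)  # about 8.4 s per frame
waifu_spf = 48 * 3600 / frames_in(FILM_MIN, FPS)   # about 1 s per frame

for name, spf in (("Topaz", topaz_spf), ("waifu2x", waifu_spf)):
    hours = hours_needed(EPISODE_MIN, FPS, spf)
    print(f"{name}: {hours:.1f} h ({hours / 24:.2f} days) for a 40-minute episode")
```

At roughly 8.4 seconds per frame, even a 10-second clip at 25 fps takes about 35 minutes in Topaz.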

A month ago I used only Instant 4K for upscaling video and waifu2x for images (logos, sometimes photos). The result demonstrated by Topaz made me add it to my collection of upscale programs, at least for enlarging images. I often enlarge short clips of less than 10 seconds, and Topaz also does an excellent job on still images, both drawings and photographs.

Returning to the topic of films, in the case of Topaz, I see only one way to solve the problem: the distribution of frames for processing on several machines.
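A minimal sketch of that idea: since the video is already split into numbered frames, each machine can be given a contiguous range of frame numbers. The function below is hypothetical and assumes the machines are equally fast and that frames are numbered from 1; copying the files to each machine and collecting the results is left out.

```python
def split_frames(total_frames, machines):
    """Return (first, last) inclusive frame ranges, numbered from 1, one per machine."""
    if machines < 1 or total_frames < 0:
        raise ValueError("need at least one machine and a non-negative frame count")
    base, extra = divmod(total_frames, machines)
    ranges = []
    start = 1
    for index in range(machines):
        # The first `extra` machines take one additional frame each.
        size = base + (1 if index < extra else 0)
        if size == 0:
            continue  # more machines than frames
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# A 2-hour film at 25 fps (180000 frames) spread over 4 machines.
for first, last in split_frames(180000, 4):
    print(f"frames {first}-{last}")
```

Under the equal-speed assumption, four machines would bring the 421 hours down to roughly 105 hours each.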
image alt
Topaz AI can run faster. What's the catch?
Disabling "high-quality AI models" cuts the processing time of our theoretical film by 4 days, at the cost of less accurate color processing. You might be interested to know that many people on the Topaz forum consider this “lite” model to be of higher quality. PNG frames and animated comparisons can be viewed here.

Instant 4K works at a good speed and delivers a wonderful result (better than the other methods except Topaz, while also being faster and more convenient).

Lanczos, as expected, makes little sense without an additional sharpening filter.

Photoshop is definitely not worth the time it took, at least with the settings I used. Perhaps using JPEG and replacing all the disks with SSDs would speed up processing significantly.

UpResNet10 generally loses to Instant 4K. But if you dislike the liberties Instant 4K takes with the image, you can try replacing it with UpResNet10. Personally, I don't find the Instant 4K artifacts that noticeable. On the other hand, using Instant 4K to enlarge still images is strange at the very least, and usually not convenient at all. So replacing waifu2x with a video editor is possible, but not necessary.

Can any of the upscale methods be called universal? Perhaps yes. I think that a correctly configured Topaz can consistently produce a better picture than the other upscale methods. But every barrel of honey has its fly in the ointment: we must not forget about Topaz's performance.
↑ Back to contents

6. A beginner who wants to try out upscale programs in practice

... I can first recommend Instant 4K as the easiest to use.

You simply launch the program, import your file, create a sequence from it with “Sequence from Clip”, open the sequence settings (Sequence Settings) and change the resolution to the one you want. Then drag the Instant 4K effect onto your video track, adjust the resolution and any additional parameters, and start the export.

Unfortunately, to use this program you need to buy or find Adobe Premiere, as well as the plugin itself. Buying all of this is a very serious investment, and I cannot advise searching the Internet for “cracked” programs.

image alt
Adobe Premiere is unhappy with my taste
Note that Premiere does not work with every video format, and it will loudly refuse a considerable share of films from the Internet. For this case, I can only advise converting the movie to something like ProRes using ffmpeg:
ffmpeg -hide_banner -probesize 1000M -i file.mkv -pix_fmt yuv420p -c:v prores_ks -c:a aac -b:a 128k file.mov

Next in line is waifu2x: split the video into frames using the BAT file from the section "Preparing Material for Testing", open the folder in waifu2x, select the parameters (target resolution, noise reduction) and start the export. When it finishes, all that remains is to combine the frames and the sound into a finished video. You may run into some configuration problems: as I already wrote, I personally had to hunt for cuDNN, and waifu2x also requires a fairly powerful video card.

If for some reason you cannot configure waifu2x, or you do not have enough GPU power, turn to Photoshop. This article has reasonably good instructions for using it.

If experiments with still images are enough for you, or you have a lot of spare time, use the Topaz trial: it works with all functions for 30 days. You might just have time to upscale your favorite movie before the trial ends (but that is not certain). The workflow is the same as with waifu2x, and the same requirement for a powerful graphics card applies. In addition, when using Topaz or waifu2x you may have trouble working in other programs, even the simplest ones, since both try to load the GPU to 100%, which makes the interface of everything else very slow.

In the last three cases, I recommend extracting the audio track from the original video file in advance, in whatever way is convenient for you. If this is a simple video from YouTube or a movie with a single track, the simplest one-line BAT file will do. If you are “lucky” enough to be experimenting with a film that has 5.1 tracks and several translations, that goes far beyond the scope of this article. But he who seeks will always find.

Sample Sound Extraction Script
ffmpeg -i input-video.avi -vn -acodec copy output-audio.aac 
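One caveat about this one-liner: `-acodec copy` copies the audio stream without re-encoding, so writing it to a `.aac` file only works when the source audio really is AAC. The Python sketch below picks the output extension from the codec name instead. The mapping and function names are my own assumptions; the codec name itself would have to be read from the file first (for example with ffprobe), and anything not in the table falls back to a Matroska audio container (`.mka`).

```python
# Assumed mapping from audio codec name to a file extension that can hold
# that stream when it is copied without re-encoding.
COPY_EXTENSIONS = {
    "aac": ".aac",
    "mp3": ".mp3",
    "ac3": ".ac3",
    "flac": ".flac",
    "opus": ".opus",
    "vorbis": ".ogg",
}


def audio_output_name(codec_name, stem="output-audio"):
    """Choose an output file name for a stream-copied audio track."""
    return stem + COPY_EXTENSIONS.get(codec_name.lower(), ".mka")


def build_extract_command(video, codec_name):
    """Build the argument list for the extraction command shown above."""
    return ["ffmpeg", "-i", video, "-vn", "-acodec", "copy",
            audio_output_name(codec_name)]


print(" ".join(build_extract_command("input-video.avi", "aac")))
print(" ".join(build_extract_command("input-video.mkv", "ac3")))
```

The argument list can be passed directly to `subprocess.run` if you prefer to run ffmpeg from Python rather than from a BAT file.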

Once you have the audio in a single file and the upscaler has finished processing the frames, combine the sound and video into one file.

For beginners, I recommend using the script from the article about Photoshop:
ffmpeg -framerate 24 -i image-%%03d.png -i output-audio.aac -pix_fmt yuv420p -vcodec libx264 -preset veryslow -crf 15 -c:a aac -b:a 128k -r 24 test_4K.mp4

It has been adapted to my script for converting video into individual frames. Remember to replace the values of -r and -framerate with your own frame rate. Also, when working with PNG, it doesn't hurt to specify the required pixel format: -pix_fmt yuv420p
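To avoid editing the frame rate in two places by hand, the same command can be generated. The sketch below uses only the options from the command above; the function names are hypothetical. It also makes one BAT detail explicit: inside a BAT file a literal `%` has to be doubled (`%%03d`), while a command typed directly or passed as an argument list uses a single `%`.

```python
def build_mux_command(fps, frames_pattern="image-%03d.png",
                      audio="output-audio.aac", output="test_4K.mp4"):
    """Argument list for assembling PNG frames and audio into an H.264 MP4."""
    rate = str(fps)
    return [
        "ffmpeg",
        "-framerate", rate,        # rate at which the input frames are read
        "-i", frames_pattern,
        "-i", audio,
        "-pix_fmt", "yuv420p",
        "-vcodec", "libx264",
        "-preset", "veryslow",
        "-crf", "15",
        "-c:a", "aac",
        "-b:a", "128k",
        "-r", rate,                # output frame rate, kept equal to the input
        output,
    ]


def to_bat_line(args):
    """Render the command as a single BAT line, doubling every literal %."""
    return " ".join(args).replace("%", "%%")


print(to_bat_line(build_mux_command(25)))
```

Called with 24, `to_bat_line` reproduces the command above character for character.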

Everyone else should study the parameters and settings of H.265 (smaller size at higher quality) and H.264 (faster).
More script options for FFMPEG can be found in this upscale article.
↑ Back to contents

7. Plan for the next article and questions to readers

In the next article I plan to add an interactive element for subjective testing, so that readers can rate particular upscale methods directly; for this I am going to use the MSU Video Quality Measurement Tool. But I need to understand whether I can freely use H.265 and compress the files harder, or whether I will have to use H.264. Please take the survey below and note whether your PC can play H.265 without frame loss (“lags”).
In addition, I plan to consider the Super-Resolution Convolutional Neural Network model (SRCNN) and the Efficient Sub-Pixel Convolutional Neural Network model (ESPCN).

  • In the next article I would like to use Lanczos again, but with sharpening added. Which kind? Do you have any wishes or suggestions for configuring it? Perhaps Lanczos should be replaced with another upscale method, for example sinc?
  • Should I switch waifu2x to a different model? CUnet? The Y profile? Perhaps you have used another one to greater effect in your own upscales?
  • Should I keep Photoshop?
  • Should I use the “slow but high-quality” Topaz model or the “fast but lower-quality” one?
  • Perhaps you have suggestions for source videos to test?
  • Keep or drop MITSU? Perhaps change the metrics, for example replace Noise with Blockiness?

Bonus Frankenstein from Topaz:
image alt
One of the two frames ruined by Topaz while processing the source material.

↑ Back to contents


Can your PC play H.265 without frame loss (“lags”)?

  • 85.8% Yes (85)
  • 14.1% No (14)

Which program's results did you like the most?

  • 16.9% Adobe Premiere (Instant 4K) (11)
  • 3% Adobe Photoshop (2)
  • 9.2% FFMPEG (Lanczos) (6)
  • 52.3% Topaz Gigapixel AI (34)
  • 18.4% Waifu2x (UpResNet10) (12)

Would you like to take part in the subjective testing in the next article?

  • 69.8% Yes (44)
  • 30.1% No (19)
