larrabee March 17, 2018 at 12:08

Image Optimization for web

Plenty of articles and projects about resizing images already exist on the Internet. Why do we need one more? In this article I will explain why the existing solutions did not satisfy us and why we had to write our own.

Problem

First, let's figure out why we resize images at all. As a web service, we want pages to load as fast as possible for the user. Users like fast pages, and fast pages increase conversion. If the user is on a slow or mobile connection, it is especially important that pages are lightweight and do not waste the user's traffic or CPU time. Resizing images is one of the things that helps with this.

We solve two problems. The first is that images are often not scaled down to the resolution at which they are displayed, so the client not only downloads data it does not need but also spends CPU time resizing the pictures in the browser. Solution: serve pictures at the resolution at which the browser will display them.

The second problem is that images are usually not compressed well enough. They can be encoded more efficiently, which speeds up page loading without any subjective loss of image quality. Solution: optimize images before returning them to the client.

For an example of how not to do it, you need look no further than the main page of a site as well known as github.com. With a page weight of 2 MB, 1.2 MB of it is wasted image data: the images could be optimized and those bytes never downloaded.

The second example is our own Habr. I will not include a screenshot, so as not to stretch the article; the results are available at the link. On Habr, pictures are resized to the required resolution but are not optimized. Optimizing them would reduce their size by 650 KB (50%).

In many places on the site we need smaller versions of pictures, for example to show a reduced version of a news picture in the news feed. We implement this as follows: only the maximum-quality picture is stored on our server, and when a resized version is needed, we append the required resolution to the end of the URL after "@". The request then goes not to the file but to our resizing backend, which returns a resized and optimized version of the picture.
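The URL convention described above can be sketched in Go roughly as follows. This is only an illustration, not the project's actual code: the article does not specify the exact syntax after "@", so the "WxH" form, the `ResizeRequest` type, and the `ParseResizePath` and `FitInside` helpers are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// ResizeRequest is a hypothetical parsed form of a path such as
// "/img/bird.jpg@800x533". The "WxH" syntax after "@" is an assumption.
type ResizeRequest struct {
	Path   string // path of the stored full-quality original
	Width  int    // requested maximum width in pixels
	Height int    // requested maximum height in pixels
}

// ParseResizePath splits the path at the last "@" and parses the "WxH" suffix.
func ParseResizePath(p string) (ResizeRequest, error) {
	i := strings.LastIndex(p, "@")
	if i < 0 {
		return ResizeRequest{}, fmt.Errorf("no @ suffix in %q", p)
	}
	dims := strings.SplitN(p[i+1:], "x", 2)
	if len(dims) != 2 {
		return ResizeRequest{}, fmt.Errorf("expected WxH after @ in %q", p)
	}
	w, err := strconv.Atoi(dims[0])
	if err != nil {
		return ResizeRequest{}, fmt.Errorf("bad width in %q: %v", p, err)
	}
	h, err := strconv.Atoi(dims[1])
	if err != nil {
		return ResizeRequest{}, fmt.Errorf("bad height in %q: %v", p, err)
	}
	if w <= 0 || h <= 0 {
		return ResizeRequest{}, fmt.Errorf("non-positive size in %q", p)
	}
	return ResizeRequest{Path: p[:i], Width: w, Height: h}, nil
}

// FitInside scales a source size down so that it fits inside maxW x maxH
// while preserving the aspect ratio. It never upscales. All inputs are
// assumed to be positive.
func FitInside(srcW, srcH, maxW, maxH int) (int, int) {
	scale := math.Min(float64(maxW)/float64(srcW), float64(maxH)/float64(srcH))
	if scale >= 1 {
		return srcW, srcH
	}
	w := int(math.Round(float64(srcW) * scale))
	h := int(math.Round(float64(srcH) * scale))
	if w < 1 {
		w = 1
	}
	if h < 1 {
		h = 1
	}
	return w, h
}

func main() {
	req, err := ParseResizePath("/img/bird.jpg@800x800")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// 1920x1279 is the size of one of the test pictures used later on.
	w, h := FitInside(1920, 1279, req.Width, req.Height)
	fmt.Println(req.Path, w, h)
}
```

Splitting at the last "@" keeps the original path intact even if it contains "@" elsewhere, and rejecting non-positive sizes early means the resizing code never sees a meaningless request.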

Common solutions

Everything below applies to JPEG and PNG images, since these are the most popular formats on the Internet.

If you type something like “image resize backend” into Google, you will see that about half of the results suggest using Nginx, and the rest are various home-grown services, most often written in Node.js.

From nginx, or rather from libgd, which the nginx module uses, we were able to squeeze 63 RPS on the test picture. That is not bad, but we wanted more speed and more flexibility. GraphicsMagick is not suitable either, because it is too slow. In addition, both of these solutions produce non-optimized images. Most other solutions, for example those on Node, suggest using Sharp for resizing, MozJPEG for optimizing JPEG images, and pngquant for optimizing PNG.

For a long time we used our own home-grown combination of Node, Libvips, and MozJPEG with pngquant, but one day we asked ourselves: "Is it possible to make resizing faster and less demanding on resources?"

Spoiler: it is possible. ;)

First we needed to find out how to speed up our application. After examining its code, we found that imagemin, which was used for optimization, and in particular its MozJPEG and pngquant plugins, launches the command-line utilities of the same name as external processes on every run. We will definitely get rid of that and use only bindings to C libraries. For resizing, the application used the Sharp module, which is a binding to the Libvips C library.

Our implementation

Googling showed that Libvips is still the leader in speed and that only OpenCV can compete with it. So we use Libvips in our implementation: it is a proven solution and it has ready-made bindings for Go. It is time to write a prototype and see what happens.

A few words about why Go was chosen for this task. First, it is fast enough, and as you remember, we want a fast resizer. Second, Go code is easy to read and maintain. The last requirement was the ability to work with C libraries, which is exactly what we need here.

We quickly wrote a prototype, tested it, and realized that, although Libvips exposes more tuning knobs than Sharp, it still produces non-optimized images. Something had to be done about this. We turned to the almighty Google again and found that the best option is still MozJPEG. Here doubts began to creep in that we were about to write the same thing we had on Node, only in Go. But after carefully reading the description of MozJPEG, we learned that it is a fork of libjpeg-turbo and is compatible with it.

This looks very promising. All that remains is to build our own version of Libvips in which libjpeg-turbo is replaced by the version from Mozilla. For the build we chose Alpine Linux, because we planned to ship the application with Docker anyway, and Alpine has a very pleasant package build format, very similar to the one used in Arch Linux.

*Image optimization reduced its size by 4 times without visible loss of quality.*
Original JPEG 351x527 79 Kb	Optimized 351x527 17 Kb

Built, tested. Libvips now produces an optimized image immediately on resize. In the Node version we first resized the picture and then passed it through a decoder and encoder again. Now we only do the resize.

With JPEG sorted out, what about PNG? To solve this problem we found the libimagequant library. It is not very popular, even though the console utility pngquant, which is built on it, is used in many solutions. We also found a Go binding for it, somewhat abandoned and with a memory leak. I had to fix the leak, add documentation, and supply everything else that befits a decent project. We also packaged libimagequant as an Alpine package for easy installation.

Because the image no longer has to be saved to a file to be processed by pngquant, we can optimize the process a little. For example, we do not compress the image when resizing in Libvips, only after processing it in pngquant. This saves a little precious CPU time. Needless to say, we also save a lot because calling a C library is much faster than launching a console utility.

*The difference in size is 3 times, but artifacts may appear (depending on the picture).*
Original PNG 450x300 200 Kb	Optimized 450x300 61 Kb

*An example of a not-so-good picture in which artifacts appear during compression.*
Original PNG 351x527 270 Kb	Optimized 351x527 40 Kb

Once the prototype was written and tested on my PC, where it gave a decent 25 RPS on a dual-core mobile processor while using the entire CPU, I wanted to see how much I could squeeze out of it on proper hardware. We ran the code on a six-core machine, pointed JMeter at it, and... WTF??? We got 30 RPS. Time to figure out what was going on.

Libvips implements multithreading itself, so we only need to initialize the library and can then safely call it from any thread. But for some reason Libvips ran in a single thread for us, which limited us to one core. pngquant took one more core. In total, it turned out that our super fast resizer worked perfectly only on the developer's laptop, and on other machines it could not use all the available resources. ;)

We looked at the Libvips source code and saw that CONCURRENCY is set to 1 by default there because of data races in Libvips. Judging by the bug tracker, however, these problems were fixed long ago. We set CONCURRENCY back and tested. Nothing changed: Libvips still refused to resize images in multiple threads. All attempts to overcome this failed and, to tell the truth, I got tired of fighting it and decided to work around the problem at a different level.

All more or less modern Linux kernels (3.9+, and 2.6.32-417+ in CentOS 6) support the SO_REUSEPORT socket option, which allows multiple instances of an application to listen on the same port. This approach is more convenient than balancing with third-party software such as HAProxy, because it requires no configuration and lets you add and remove instances quickly.
Therefore, we used SO_REUSEPORT together with the "--scale" option of Docker Compose, which lets you specify the number of instances to run.
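As a rough illustration of the port-sharing approach described above, here is a minimal Go sketch of a listener that sets the Linux SO_REUSEPORT socket option before binding. It is not taken from the project: `ListenReusePort` is a hypothetical helper, and the numeric value of the option is spelled out by hand on the assumption of Linux on x86 or ARM.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"syscall"
)

// soReusePort is the value of SO_REUSEPORT on Linux for x86 and ARM.
// It is written out here as an assumption, because not every Go port
// exports syscall.SO_REUSEPORT.
const soReusePort = 0xf

// ListenReusePort opens a TCP listener with SO_REUSEPORT set, so that
// several processes (or several listeners) can bind the same address.
func ListenReusePort(addr string) (net.Listener, error) {
	lc := net.ListenConfig{
		// Control runs after the socket is created and before bind().
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			err := c.Control(func(fd uintptr) {
				sockErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, soReusePort, 1)
			})
			if err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), "tcp", addr)
}

func main() {
	// The first listener picks a free port; the second binds the same one.
	first, err := ListenReusePort("127.0.0.1:0")
	if err != nil {
		fmt.Println("first listen failed:", err)
		return
	}
	defer first.Close()

	second, err := ListenReusePort(first.Addr().String())
	if err != nil {
		fmt.Println("second listen failed:", err)
		return
	}
	defer second.Close()

	fmt.Println("both listeners share", second.Addr())
}
```

With the option set in every instance, the kernel distributes incoming connections among the listeners bound to the port, so each single-threaded instance can occupy its own core without an external balancer.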

Time to measure

The time has come to evaluate the result of our labors.

Configuration:

CPU: Intel Xeon E5-1650 v3 @ 3.50GHz 6 cores (12 vCPU)
RAM: 64 GB (about 1-2 GB used)
Number of Workers: 12

Results:

File	Output resolution	Node RPS	Go RPS
bird_1920x1279.jpg	800x533	34	73
clock_1280x853.jpg	400x267	69	206
clock_6000x4000.jpg	4000x2667	1.9	5.6
fireworks_640x426.jpg	100x67	114	532
cc_705x453.png	405x260	21	33
penguin_380x793.png	280x584	28	69
wine_800x800.png	600x600	27	49
wine_800x800.png	200x200	55	114
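The relative gain for each row of the table above can be checked with a few lines of Go. This is just arithmetic on the published numbers; the `Result` type and the helper names are made up for the example. For these eight rows the increase works out to roughly 57% at the low end (cc_705x453.png) and roughly 367% at the high end (fireworks_640x426.jpg).

```go
package main

import "fmt"

// Result holds one row of the benchmark table (requests per second).
type Result struct {
	File    string
	Output  string
	NodeRPS float64
	GoRPS   float64
}

// results copies the numbers from the table above.
var results = []Result{
	{"bird_1920x1279.jpg", "800x533", 34, 73},
	{"clock_1280x853.jpg", "400x267", 69, 206},
	{"clock_6000x4000.jpg", "4000x2667", 1.9, 5.6},
	{"fireworks_640x426.jpg", "100x67", 114, 532},
	{"cc_705x453.png", "405x260", 21, 33},
	{"penguin_380x793.png", "280x584", 28, 69},
	{"wine_800x800.png", "600x600", 27, 49},
	{"wine_800x800.png", "200x200", 55, 114},
}

// SpeedupPercent returns the throughput increase of Go over Node in percent.
func SpeedupPercent(nodeRPS, goRPS float64) float64 {
	return (goRPS/nodeRPS - 1) * 100
}

// SpeedupRange returns the smallest and largest increase over all rows.
func SpeedupRange(rs []Result) (lo, hi float64) {
	for i, r := range rs {
		s := SpeedupPercent(r.NodeRPS, r.GoRPS)
		if i == 0 || s < lo {
			lo = s
		}
		if i == 0 || s > hi {
			hi = s
		}
	}
	return lo, hi
}

func main() {
	for _, r := range results {
		fmt.Printf("%-24s %-10s +%.0f%%\n", r.File, r.Output, SpeedupPercent(r.NodeRPS, r.GoRPS))
	}
	lo, hi := SpeedupRange(results)
	fmt.Printf("range: +%.0f%% to +%.0f%%\n", lo, hi)
}
```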

More benchmarks (though without a comparison with the Node version) are on the wiki page.
As you can see, we did not rewrite the resizer in vain: throughput increased by 30% to 400% (in some cases). If you need resizing to be even faster, you can turn the “speed” and “quality” knobs in libimagequant. They let you further reduce the file size or increase the encoding speed at the cost of image quality.

The project code is on GitHub. The Go binding to libimagequant is also on GitHub.
