Neural network as a predictor for encoding PNG images
This is a translation of the article "Neural Network As Predictor For Image Coding (PNG)". The author's blog is here.
Research topic
The main goal of this work was to improve on the existing PNG pre-filters: to create a new filter that uses an artificial neural network to make a better prediction, leading to better file compression.
Compression
Classically, PNG compression is divided into two steps:
- Pre-filtering (using predictors);
- Compression (using DEFLATE).
Only the first step matters for this article. The table below lists the pre-filters that currently exist and shows how each one stores the difference between the actual and the predicted pixel.
Currently existing filters + new solution:
Type | Name | Filter function | Reconstruction function |
0 | None | Filt(x) = Orig(x) | Recon(x) = Filt(x) |
1 | Sub | Filt(x) = Orig(x) - Orig(a) | Recon(x) = Filt(x) + Recon(a) |
2 | Up | Filt(x) = Orig(x) - Orig(b) | Recon(x) = Filt(x) + Recon(b) |
3 | Average | Filt(x) = Orig(x) - floor((Orig(a) + Orig(b)) / 2) | Recon(x) = Filt(x) + floor((Recon(a) + Recon(b)) / 2) |
4 | Paeth | Filt(x) = Orig(x) - PaethPredictor(Orig(a), Orig(b), Orig(c)) | Recon(x) = Filt(x) + PaethPredictor(Recon(a), Recon(b), Recon(c)) |
5 | Neural network | Filt(x) = Orig(x) - NN(ArrayOfInputPixels) | Recon(x) = Filt(x) + NN(ArrayOfInputPixels) |
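The four standard predictors are small enough to write out. The sketch below is only an illustration and not the author's code; the class and method names are made up for this example. Here a is the left neighbour, b the neighbour above and the third argument the upper-left neighbour (called c in the PNG specification), and all arithmetic on stored values is modulo 256.

```java
/** Illustrative sketch of the standard PNG predictors on byte values 0..255. */
final class PngPredictors {

    // a = left neighbour, b = neighbour above, c = upper-left neighbour.
    static int sub(int a, int b, int c)     { return a; }
    static int up(int a, int b, int c)      { return b; }
    static int average(int a, int b, int c) { return (a + b) / 2; } // floor, since a and b are non-negative

    /** Paeth predictor: picks whichever neighbour is closest to a + b - c. */
    static int paeth(int a, int b, int c) {
        int p = a + b - c;
        int pa = Math.abs(p - a);
        int pb = Math.abs(p - b);
        int pc = Math.abs(p - c);
        if (pa <= pb && pa <= pc) return a;
        if (pb <= pc) return b;
        return c;
    }

    /** Filt(x) = Orig(x) - prediction, modulo 256. */
    static int filt(int orig, int predicted) { return (orig - predicted) & 0xFF; }

    /** Recon(x) = Filt(x) + prediction, modulo 256. */
    static int recon(int filt, int predicted) { return (filt + predicted) & 0xFF; }
}
```

Because the decoder computes the same prediction from already reconstructed pixels, `recon(filt(orig, p), p)` always returns `orig`, whichever predictor produced `p`.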
Neural network as a predictor
The last filter is the author's new implementation. Internally it passes an array of input pixels to a neural network, which returns the predicted pixel value. As with the other filters, what gets stored is the difference between the original and the predicted value. But what exactly are these input values? The figure below illustrates how the input values are fed to the neural network. First, the image is divided into three parts:
- Copied pixels (marked RED);
- Input pixels for the neural network (marked GREEN);
- The predicted pixel (marked BLUE).
Copied pixels
The entire red area is copied 1:1, because the neural filter needs source data to start from. That is why this frame of the image is copied unchanged. The network configuration was as follows:
- 28 input neurons (marked GREEN): (8 * 4 - 4) px.
- 1 output neuron (marked BLUE): the 29th px.
So pixels 1 through 28 are copied.
Input pixels
The first pixel processed by the filter is at position (5,4). It is predicted from the 28 input pixels by the neural network, as the illustration above shows.
Predicted Pixel
All green pixels are inputs to the neural network, which produces a predicted value for the BLUE pixel.
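To make the window concrete, here is a minimal sketch of how the 28 inputs could be gathered. The exact layout is an assumption inferred from the numbers given (8 * 4 - 4 inputs, first predicted pixel at (5,4)): three full rows of 8 pixels above the target, plus the 4 pixels to its left in the same row. The class `NeuroWindow` and its methods are hypothetical names, and coordinates are 0-based, so the article's (5,4) becomes x = 4, y = 3.

```java
/** Hypothetical helper that collects the 28 context pixels for one prediction. */
final class NeuroWindow {
    static final int LEFT = 4;        // columns to the left of the target
    static final int RIGHT = 3;       // columns to the right (rows above only)
    static final int ROWS_ABOVE = 3;  // full rows above the target
    static final int INPUTS = (LEFT + 1 + RIGHT) * (ROWS_ABOVE + 1) - (1 + RIGHT); // 8 * 4 - 4 = 28

    /** True if the full window around (x, y) lies inside an image of the given width. */
    static boolean canPredict(int x, int y, int width) {
        return x >= LEFT && y >= ROWS_ABOVE && x + RIGHT < width;
    }

    /** Returns the context pixels in raster order; img is indexed as img[row][column]. */
    static int[] inputs(int[][] img, int x, int y) {
        int[] in = new int[INPUTS];
        int k = 0;
        // Three complete rows above the target, 8 pixels each.
        for (int dy = -ROWS_ABOVE; dy < 0; dy++) {
            for (int dx = -LEFT; dx <= RIGHT; dx++) {
                in[k++] = img[y + dy][x + dx];
            }
        }
        // The four pixels to the left of the target in its own row.
        for (int dx = -LEFT; dx < 0; dx++) {
            in[k++] = img[y][x + dx];
        }
        return in;
    }
}
```

Under this assumed layout, pixels for which `canPredict` is false form the frame that has to be copied unchanged, and every input pixel precedes the target in raster order, so a decoder already has all of them when it reaches the target.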
Components
In this section, the author describes the components he developed and used. All code is written in Java.
The first stage is training the neural network. To speed this step up, the author developed Pattern Exporter, which creates the training patterns for the JavaNNS tool. For clarity, this step is shown in the figure below.
Once trained, the neural network is used in the encoder / decoder. This stage is shown in detail in the figure below.
- Input image: an ordinary image to be compressed with the neural network filter.
- PNG Encoder / Decoder: encodes and decodes the image using the neural network predictor.
- Neural Network: a neural network implemented in Java.
- JNNSParser
- Output image: the result should be a file smaller than the input.
For encoding and decoding, the author used the pngj library. You can find it here.
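The reason such a filter stays lossless is that the decoder can repeat every prediction exactly, using pixels it has already reconstructed. The sketch below illustrates that round trip on a plain 2D array of 8-bit samples. It is not the author's encoder and does not use the pngj API; `PredictiveCodec`, `Predictor` and the margin parameters are names invented for this example.

```java
/** Illustrative lossless predictive codec on 8-bit samples stored as img[row][column]. */
final class PredictiveCodec {

    /** A predictor may only read pixels that precede (x, y) in raster order. */
    interface Predictor {
        int predict(int[][] img, int x, int y);
    }

    /** True if (x, y) belongs to the frame that is copied unchanged. */
    static boolean copied(int x, int y, int width, int top, int left, int right) {
        return y < top || x < left || x >= width - right;
    }

    /** Stores Orig - prediction (mod 256) everywhere outside the copied frame. */
    static int[][] encode(int[][] orig, Predictor p, int top, int left, int right) {
        int h = orig.length, w = orig[0].length;
        int[][] filt = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                filt[y][x] = copied(x, y, w, top, left, right)
                        ? orig[y][x]
                        : (orig[y][x] - p.predict(orig, x, y)) & 0xFF;
            }
        }
        return filt;
    }

    /** Rebuilds the image in raster order, predicting from pixels already reconstructed. */
    static int[][] decode(int[][] filt, Predictor p, int top, int left, int right) {
        int h = filt.length, w = filt[0].length;
        int[][] recon = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                recon[y][x] = copied(x, y, w, top, left, right)
                        ? filt[y][x]
                        : (filt[y][x] + p.predict(recon, x, y)) & 0xFF;
            }
        }
        return recon;
    }
}
```

Any deterministic predictor fits this scheme, including a neural network, as long as the encoder and decoder compute bit-identical predictions from the same inputs.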
Results
There are many ways to choose a neural network configuration, for example:
- the number of input neurons;
- the layout of the input neurons;
- the number of hidden neurons;
- the number of hidden layers;
- the neuron activation functions;
- the learning algorithm;
- and so on.
Below are some of the network design options the author evaluated. He evaluated them simply by running them on a few sample images, calculating the BPP (bits per pixel) achieved with each configuration, and picking the best parameters. This led to the following results:
Evaluated configuration of the neural network:
- Number of input neurons: 28.
- Number of hidden neurons:
  - 9 neurons (3x3);
  - 25 neurons (5x5).
- Number of hidden layers: 1.
- Activation function: sigmoid, with the range limited to 0.2 to 0.8.
- Learning algorithm: backpropagation.
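A forward pass through a network of this shape (28 inputs, one hidden layer of 9 or 25 sigmoid neurons, one output) might look like the sketch below. It is not the author's implementation. In particular, reading "range from 0.2 to 0.8" as a linear mapping of pixel values 0..255 onto [0.2, 0.8] is an assumption, and `TinyMlp` with its weight arrays is a hypothetical stand-in for whatever the trained JavaNNS network provides.

```java
/** Hypothetical one-hidden-layer perceptron that predicts one 8-bit pixel. */
final class TinyMlp {
    private final double[][] wHidden; // [hiddenNeuron][input]
    private final double[] bHidden;   // one bias per hidden neuron
    private final double[] wOut;      // one weight per hidden neuron
    private final double bOut;        // output bias

    TinyMlp(double[][] wHidden, double[] bHidden, double[] wOut, double bOut) {
        this.wHidden = wHidden;
        this.bHidden = bHidden;
        this.wOut = wOut;
        this.bOut = bOut;
    }

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Assumed scaling: pixel 0..255 maps linearly onto 0.2..0.8. */
    static double toUnit(int pixel) {
        return 0.2 + 0.6 * (pixel / 255.0);
    }

    /** Inverse of toUnit, rounded and clamped to 0..255. */
    static int toPixel(double unit) {
        long v = Math.round((unit - 0.2) / 0.6 * 255.0);
        return (int) Math.max(0, Math.min(255, v));
    }

    /** Runs the forward pass on the context pixels and returns the predicted pixel. */
    int predict(int[] pixels) {
        double out = bOut;
        for (int h = 0; h < wHidden.length; h++) {
            double z = bHidden[h];
            for (int i = 0; i < pixels.length; i++) {
                z += wHidden[h][i] * toUnit(pixels[i]);
            }
            out += wOut[h] * sigmoid(z);
        }
        return toPixel(sigmoid(out));
    }
}
```

Keeping targets away from 0 and 1 is a common reason for such a scaling, since a sigmoid only approaches those values asymptotically; whether that was the author's motivation is not stated.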
Comparison with other PNG predictors
Next, the author compared his neural filter with the PNG filters currently in use. The test was run on several images.
The neural network compresses these images somewhat worse than the Paeth and Average filters, but much better than Sub and Up. A second test followed, using many more images (111), all of nature scenes. Its purpose was to find out which images the filter handles best and which it handles worst. Below are the images that the neural network handled much better than all the other filters:
In the author's words: "I wasn't sure what these pictures have in common. Well, there are many flowers. So possibly my neural network really likes flowers. But I wasn't very comfortable with that explanation."
The conclusion is that the neural network works well when the image contains:
- many textures;
- various textures;
- little noise.
Next, the author looked through his holiday photographs for one that satisfies the conditions above, and found one:
The following BPP values were calculated for the 6 filters:
Type | Name | BPP |
0 | None | 7.289 |
1 | Sub | 6.681 |
2 | Up | 6.667 |
3 | Average | 6.433 |
4 | Paeth | 6.486 |
5 | Neural network | 6.368 |
The neural filter achieved the lowest BPP of the six (6.368), which confirmed the hypothesis about which image features favour it.
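For reference, BPP is the compressed size in bits divided by the number of pixels. The sketch below shows one way such a figure could be computed, by running filtered bytes through the JDK's DEFLATE implementation. It is an assumption about the measurement and not the author's procedure: he may have measured complete PNG files instead, and the class `Bpp` is a hypothetical name.

```java
/** Hypothetical helper for measuring bits per pixel of filtered image data. */
final class Bpp {

    /** Returns the number of bytes produced by DEFLATE at the highest compression level. */
    static int deflatedSize(byte[] data) {
        // Fully qualified name is used so the sketch needs no import line.
        java.util.zip.Deflater deflater =
                new java.util.zip.Deflater(java.util.zip.Deflater.BEST_COMPRESSION);
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer); // only the byte count matters here
        }
        deflater.end();
        return total;
    }

    /** BPP = compressed bits / pixel count. */
    static double bitsPerPixel(long compressedBytes, long pixelCount) {
        return 8.0 * compressedBytes / pixelCount;
    }
}
```

A better predictor leaves residuals clustered around zero, which DEFLATE encodes in fewer bits, so the same image yields a lower BPP.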
Comparison of nature images with images of man-made objects
The author ran one more test to find out how the origin of the subject in the image affects compression. The following results were obtained:
Conclusion
- There is great potential. There was not enough time to find a suitable neural network setup. Perhaps, with more specialization in this field, the neural filter would have beaten the other filters on BPP.
- A different neural network topology might bring improvements. Recurrent neural networks were one idea considered.
- Another idea is to train a neural network for just one type of image.
- Speed was not a goal of this work. The other filters clearly process images much faster than the solution described here.
The project is available on GitHub for anyone interested.