
On the issue of cross-browser Data URIs
In pursuit of website optimization, I wanted to reduce the number of requests, without sacrificing the size of optimized files.
The goal is to transfer images of different formats in one file with different optimization settings.
As tools, I chose data URIs and a gzipped CSS file. However, IE handles data URIs very poorly, although it does support MHTML. The existing implementations did not meet my requirements, because every image had to be transferred twice: once for IE, in MHTML, and once for everyone else, as a data URI. In search of a solution, I came across an article by bolk, which described a solution for the JPEG format and some theoretical considerations for GIF and PNG. After almost three weeks of poring over manuals, I managed to implement a solution for GIF and PNG and to automate the process for all three formats.
BASE64
Since the images are transmitted in base64, it is worth highlighting a few points about this encoding:
- First: base64 only understands the characters [A-Za-z0-9+/].
When browsers or Ruby decode a string, any other characters are dropped.
The console base64 utility on Linux does not drop them and reports an error instead.
- Second: base64 converts every 3 source bytes (in our case, ASCII characters) into 4 ASCII characters.
Therefore the CSS string must be balanced, for example with zeros, so that once decoded it sits correctly inside the encoded file and does not shift the data that follows it.
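Both points can be checked with a minimal Ruby sketch. The helper name `base64_chars` is my own and is not part of the original scripts. It strips everything outside the base64 alphabet, roughly what a lenient decoder does, and checks that what remains is a multiple of 4 characters.

```ruby
require 'base64'

# Hypothetical helper: keep only characters from the base64 alphabet,
# which is what a lenient decoder effectively sees.
def base64_chars(str)
  str.gsub(%r{[^A-Za-z0-9+/]}, '')
end

css  = ';background-image:url(data:image/png;base64;pzd,'
kept = base64_chars(css)

puts kept             # backgroundimageurldataimage/pngbase64pzd
puts kept.length      # 40
puts kept.length % 4  # 0, so the data that follows stays aligned

# Ruby's lenient decoder silently skips the other characters...
puts Base64.decode64(css).bytesize  # 30

# ...while the strict variant raises instead.
begin
  Base64.strict_decode64(css)
rescue ArgumentError => e
  puts "strict decoding failed: #{e.message}"
end
```

The `pzd` suffix is what brings the count of surviving characters up to a multiple of 4 here.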
Further reading:
Base64 on the English Wikipedia
JPEG
Format of segments in JPEG: [header] [data]
With JPEG everything is simple and has already been described by bolk:
Open the file in a hex editor:
FF D8 - the JPEG header for IE
FF E0 - declares the APP0 segment, in which everything up to the image data is hidden
";background-image:url(data:image/jpeg;base64," - this is what the other browsers see.
When IE decodes this string, it gets garbage that does not affect anything
FF D8 - the start of the JPEG for the other browsers
"image data" - from this point on, all browsers see the same thing
The point is that the string in CSS looks like:

IE decodes it as:

And the other browsers see it as:

Because of the way base64 works, a certain number of extra characters must be transmitted so that the string is encoded and decoded correctly. They are inserted before and after the CSS. The amount was calculated and then adjusted empirically:
/9j/4AA0;background-image:url(data:image/jpeg;base64;00,
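Decoding this prefix shows what IE gets to see. The sketch below is my own illustration, not part of the original scripts, and `decode_prefix` is a hypothetical helper name. The interpretation of the length field in the closing note assumes that the embedded JPEG starts with a standard 18-byte JFIF APP0 segment.

```ruby
require 'base64'

# Decode the fixed prefix the way a lenient base64 decoder would:
# ';', ':', '-', '(' and ',' are dropped before decoding.
def decode_prefix(prefix)
  Base64.decode64(prefix).bytes
end

bytes = decode_prefix('/9j/4AA0;background-image:url(data:image/jpeg;base64;00,')

markers     = bytes[0, 4].map { |b| format('%02X', b) }.join(' ')
app0_length = (bytes[4] << 8) | bytes[5]

puts markers       # FF D8 FF E0
puts app0_length   # 52
puts bytes.length  # 36
```

Of the 52 bytes covered by the APP0 length field, 2 are the field itself and 30 are the decoded CSS garbage (36 - 6). Under the assumption above, the remaining 20 bytes cover the inner FF D8 and the inner APP0 segment, so IE resumes parsing at the first real segment of the image.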
My script that automates this process:
#!/usr/bin/ruby
require 'base64'
# This string ALWAYS has the same value:
a = "/9j/4AA0;background-image:url(data:image/jpeg;base64;00,"
# The source image
b = File.binread(ARGV[0])
# Write the decoded prefix followed by the image to a file
File.binwrite('temp', Base64.decode64(a) + b)
# Convert the file back to base64
#cat test | base64 | tr -d "\n" > jpeg64.txt
File.write('temp2', Base64.encode64(File.binread('temp')))
#File.delete('temp')
# Put back the characters that base64 decoding dropped from the prefix
c = File.read('temp2').gsub(/\/9j\/4AA0backgroundimageurldataimage\/jpegbase6400/, a).gsub(/\n/, "")
File.write('out_jpeg64', "#{c});")
File.delete('temp2')
# The result can now be pasted into the CSS
# cat output64 | tr -d "\n"
# and check the MHTML thoroughly!!!
Further reading:
JPEG in English Wikipedia
GIF
With this format, things are not so good.
- First: the size of a block is stored in 1 byte, so the maximum block length is FF, i.e. 255 bytes.
- Second: the Comment Extension is, for some reason, limited to 240 bytes, while the CSS string takes up 30 bytes and a few more are needed to 'balance' the base64.
- Third: there are only 2 blocks where the 'garbage' can be stashed, the Application Extension and the Comment Extension, and they cannot come before the Global Color Table. The color table can take up to 256 * 3 = 768 bytes.
What can be done:
- Leave the Global Color Table alone if the number of colors does not exceed ~70
- Move the contents of the Global Color Table into a Local Color Table
I chose the second option: it makes the base64 string more readable and allows any non-animated GIF to be converted.
The more or less standard layout of a GIF block: [header] [size] [data] [00]
Many GIFs have an incorrect field order. For example, a file produced with `convert jpeg gif` will not be processed correctly by the script. Use GIMP.
- The first 13 bytes are information that cannot be cut. The 11th byte is a packed field that describes the Global Color Table; change it to 00.
- Cut out the color table (from byte 14 up to the comment, 21 FE xx, where xx is the comment size).
- A comment containing the CSS and the first thirteen bytes.
- An 'inner comment' 1 character long.
- 2c 00 00 00 00 - the Image Descriptor. Its 10th byte is a packed field that describes the Local Color Table. Move into it everything that carries over from the 11th byte (Local Color Table flag, sorted or not, size of the Local Color Table); see the format specification for details.
- Insert the color table.
- The rest of the image data.
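The byte shuffling in the steps above can be sketched as follows. This is a simplified illustration with helper names of my own (`local_packed_byte`, `outer_comment_size`), and it uses bit operations instead of string manipulation. The first helper builds the packed byte for the Image Descriptor from the packed byte of the header. The second reproduces the size byte of the outer comment block, assuming a 30-byte decoded CSS string and two balancing characters.

```ruby
# Hypothetical helper: carry the sort flag and table size over from the
# header's packed byte and set the Local Color Table flag.
def local_packed_byte(global_packed)
  sort_flag  = (global_packed >> 3) & 0b1  # bit 3 of the header's packed byte
  table_size = global_packed & 0b111       # bits 0-2: table has 2**(n+1) colors
  0b1000_0000 | (sort_flag << 5) | table_size
end

# Hypothetical helper: balancing characters + decoded CSS + 13-byte inner
# header + 4 bytes that open the one-character inner comment (21 FE 01 30).
def outer_comment_size(css_bytes = 30, balance = 2)
  balance + css_bytes + 13 + 4
end

global = 0xF7                             # example: unsorted 256-color table
local  = local_packed_byte(global)
puts format('%02X', local)                # 87
puts 2**((local & 0b111) + 1)             # 256
puts format('%02X', outer_comment_size)   # 31
```

The value 31 matches the size byte that follows 21 FE in the script's output string.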
The result is that the string in the CSS looks like:

Whereas before all the edits the file looked like:

My script to automate the process:
#!/usr/bin/ruby
# convert WRITES THE DATA INCORRECTLY. USE GIMP INSTEAD
# USE: ./GIF_SCRIPT.RB [GIF_FILE]
require 'base64'
# OPEN GIF FILE AS A HEX STRING
orig = File.binread(ARGV[0]).unpack1("H*")
# FUTURE HEADER (FIRST 13 BYTES)
header = orig[0..25]
# GREP GLOBAL COLOR TABLE
# [26..1565] = 768 BYTES (MAX SIZE OF COLOR TABLE, 256 COLORS) + "21fe"
color_table = orig[26..1565][/(.*)21fe/, 1]
if color_table.nil?
  color_table = orig[26..1575][/(.*?)2c0000/, 1]
end
# FOR DEBUGGING
#puts color_table
#puts color_table.length
puts "COLORS IN PALETTE: #{color_table.length/6}"
# GIF IMAGE DATA (FROM THE IMAGE DESCRIPTOR ON)
data = orig[/2c0000.*/]
# SAVE BYTE 11'S INFO AND ADAPT IT FOR THE LOCAL COLOR TABLE
eleven = header[20..21].to_i(16).to_s(2).rjust(8, "0")
local_mix = "10#{eleven[4]}00#{eleven[5..7]}".to_i(2).to_s(16)
# SET BYTE 11 TO ZERO
header[20..21] = "00"
# DECLARE LOCAL COLOR TABLE
data[18..19] = local_mix
# MAGIC COMMENT
comment = Base64.decode64(";background-image:url(data:image/gif;base64;pzd,").unpack1("H*")
# WRITE EVERYTHING TO ONE FILE
var = header+"21fe313030"+comment+header+"21fe013000"+data[0..19]+color_table+data[20..-1]
File.binwrite('out.gif', [var].pack("H*"))
# ENCODE FILE TO BASE64, REMOVING "\n"
File.write('temp', Base64.encode64(File.binread('out.gif')).gsub(/\n/, ""))
# MAKE THE CSS STRING READABLE AGAIN
c = File.read('temp').gsub(/backgroundimageurldataimage\/gifbase64pzd/, ";background-image:url(data:image/gif;base64;pzd,").gsub(/\n/, "")
File.delete('temp')
# JUST PASTE THE TEXT FROM THIS FILE INTO THE CSS
File.write('out_gif64', "#{c});")
There is no script for animated gif. I find it better to use animated CSS sprites.
Theoretical considerations:
- Making a Local Color Table for every frame does not make sense, because it would increase the size.
- Animated GIFs with 64 colors can be processed by including the Global Color Table in the comment.
- An Application Extension and a Comment Extension can follow one another, which doubles the available space.
- On the Internet I came across information that the Application Extension actually has 2 fields that specify a size:
21 ff SizeSize 'NETSCAPE2.0' SizeSize 01 00 00, where SizeSize is 2 bytes for the size and 01 is the byte for an infinite loop.
- In theory, this could make it possible to 'squeeze in' a larger number of colors, though still fewer than 256 (about 230).
Further reading:
GIF color tables
GIF specification
PNG
After GIF, this is a quiet haven. The size of a chunk is not limited, chunks have 4-byte headers, and they are very easy to find. For comparison, with GIF I racked my brains and debugged the script for almost a whole day, while with PNG I did everything in an hour.
Format of chunks in PNG: [size (4 bytes)] [header (4 bytes)] [data] [CRC (4 bytes)]
There were some pitfalls here too. The CRC is very important for IE: if the CRC is wrong, IE will not display the image. The other browsers could not care less whether it is correct or not.
Many PNGs have an incorrect structure; in any case, my script will not work with them until they have been run through optipng. In addition to optimizing the image, this program puts the fields in the right order. I also noticed that Photoshop sometimes strips the sRGB chunk, so PNGs saved by it are not always processed by the script.
We will hide the CSS in the tEXt chunk.
The PNG must first be optimized with optipng and then cut so that tEXt comes immediately after IHDR.
The tEXt chunk must contain a keyword followed by a 00 byte; their length counts toward the total length of the chunk. I use 'Comment'.
General order:
IHDR
tEXt
Other auxiliary chunks
Image data
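The length that ends up in the outer tEXt chunk can be derived from this order. The sketch below is my own arithmetic with a hypothetical helper name (`outer_text_length`). It assumes the decoded CSS string is 30 bytes, that two balancing characters are used, and that everything before sRGB is the 8-byte signature plus a 25-byte IHDR chunk.

```ruby
# Hypothetical helper: number of data bytes the outer tEXt chunk must cover.
def outer_text_length(css_bytes = 30, balance = 2)
  keyword            = "Comment\0".bytesize  # keyword plus its 00 byte = 8
  signature_and_ihdr = 8 + 4 + 4 + 13 + 4    # signature; IHDR size, header, data, CRC
  inner_text_head    = 4 + 4 + keyword       # inner size + 'tEXt' + keyword
  keyword + balance + css_bytes + signature_and_ihdr + inner_text_head
end

puts outer_text_length                  # 89
puts format('%08x', outer_text_length)  # 00000059
```

The result, 00000059, is the length field used for the outer chunk in the script, and the 4-byte CRC that follows is shared by the outer and the inner tEXt.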
It was:

It became:

The script is well commented, and a lot can also be learned from the specification.
IE6 does not see transparency; sometimes this can be fixed with bKGD by setting the desired background color.
Then we run `optipng -fix FILE` to fix the CRC of the tEXt chunk.
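For reference, a PNG CRC is a CRC-32 computed over the chunk header and the chunk data, which Ruby's standard `Zlib.crc32` can produce. The sketch below is an alternative illustration, not what the author's script does (the script leaves the fixing to optipng), and `build_chunk` is a hypothetical helper.

```ruby
require 'zlib'

# Hypothetical helper: assemble one PNG chunk as
# [size (4 bytes)] [header (4 bytes)] [data] [CRC (4 bytes)].
def build_chunk(type, data)
  body = type.b + data.b
  [data.bytesize].pack('N') + body + [Zlib.crc32(body)].pack('N')
end

# A tEXt chunk with the 'Comment' keyword and some sample text.
puts build_chunk('tEXt', "Comment\0hello").unpack1('H*')

# Sanity check against the well-known empty IEND chunk.
puts build_chunk('IEND', '').unpack1('H*')  # 0000000049454e44ae426082
```

Computing the CRC directly would only replace the `optipng -fix` step; the reordering that optipng performs beforehand is still needed.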
My script to automate the process:
#!/usr/bin/ruby
#
#!!!! RUN optipng FIRST !!!!
#
# USE: ./PNG_SCRIPT.RB [PNG_FILE]
require 'base64'
# OPEN PNG FILE AS A HEX STRING
orig = File.binread(ARGV[0]).unpack1("H*")
#ihdr=orig[0..65]
# EVERYTHING BEFORE sRGB (73 52 47 42), MINUS ITS 4-BYTE SIZE (8 CHARACTERS)
ihdr = orig[/(.*?)73524742/, 1][0..-9]
#srgb_phys=orig[66..171]
# CHECK FOR tEXt EXISTENCE
if orig[/74455874/].nil?
  srgb_phys = orig[/(.{8}73524742.*?)49444154/, 1][0..-9]
else
  srgb_phys = orig[/(.{8}73524742.*?)74455874/, 1][0..-9]
end
#tEXt - 74 45 58 74; THE LAST 8 CHARACTERS MUST BE REPLACED WITH THE CRC 00000000
#text=orig[172..245]
#text=orig[/(.{8}74455874.*?)49444154/,1][0..-9]
#IDAT - 49444154
#data=orig[246..-1]
data = orig[/.{8}49444154.*/]
# MAGIC COMMENT
comment = Base64.decode64(";background-image:url(data:image/png;base64;pzd,").unpack1("H*")
###### OUTER PNG
# "00000059"+"74455874"+"436f6d6d656e7400"
# tEXt_length + 'tEXt' + 'Comment.'
# "3030" - two zeros for base64 balance
###### INNER PNG
# "00000008"+"74455874"+"436f6d6d656e7400"+"00000000"
# min_tEXt_length + 'tEXt' + 'Comment.' + blank CRC
#
# ONE CRC FIELD IS SHARED BY BOTH PNGs
# IE can't live without it, but the others are indifferent to it
var = ihdr+"00000059"+"74455874"+"436f6d6d656e7400"+"3030"+comment+ihdr+"00000008"+"74455874"+"436f6d6d656e7400"+"00000000"+srgb_phys+data
File.binwrite('out.png', [var].pack("H*"))
# CRC FIX
puts "optipng -fix started..."
`optipng -fix out.png`
puts "optipng -fix completed"
# ENCODE FILE TO BASE64, REMOVING "\n"
File.write('temp', Base64.encode64(File.binread('out.png')).gsub(/\n/, ""))
# MAKE THE CSS STRING READABLE AGAIN
c = File.read('temp').gsub(/backgroundimageurldataimage\/pngbase64pzd/, ";background-image:url(data:image/png;base64;pzd,").gsub(/\n/, "")
File.delete('temp')
# JUST PASTE THE TEXT FROM THIS FILE INTO THE CSS
File.write('out_png64', "#{c});")
Further reading:
PNG Basics
PNG Specification
MHTML
If MHTML is used, the whole CSS file must be adapted for it and split into sections (an example is in the archive):
/*
Content-Type: multipart/related; boundary="_"
--_
Content-Type: text/css;
*/
html, body {
margin: 0;
padding: 0;
width: 100%;
height: 100%;
}
#half_logo {
/*
--_
Content-Location:logo
Content-Transfer-Encoding:base64
Content-Type: image/png;*/
iVBORw0KGgoAAAANSUhEUgAAAT4AAAA3CAMAAACintZ+AAAAWXRFWHRDb21tZW50ADAw;background-image:url(data:image/png;base64;pzd,iVBORw0K...);
/*
--_
Content-Type: text/css;
*/
background-image: url(mhtml:http://192.168.1.2/test.css!logo) !ie;
}
/*
--_--
*/
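Writing these sections by hand for every image gets tedious, so here is a minimal sketch of how they might be generated. Both helpers (`mhtml_image_part`, `mhtml_reference`) are hypothetical and are not part of the author's archive. The payload is assumed to be the combined base64 string produced by one of the scripts above, and the sample call uses a placeholder in its place.

```ruby
# Hypothetical helper: wrap one combined base64 string in the
# comment-delimited MHTML part used in the example above.
def mhtml_image_part(location, mime, payload, boundary = '_')
  [
    '/*',
    "--#{boundary}",
    "Content-Location:#{location}",
    'Content-Transfer-Encoding:base64',
    "Content-Type: #{mime};*/",
    payload,
    '/*',
    "--#{boundary}",
    'Content-Type: text/css;',
    '*/'
  ].join("\n")
end

# Hypothetical helper: the IE-only rule that points back into the same file.
def mhtml_reference(css_url, location)
  "background-image: url(mhtml:#{css_url}!#{location}) !ie;"
end

# 'PAYLOAD' stands in for the contents of a file such as out_png64.
puts mhtml_image_part('logo', 'image/png', 'PAYLOAD')
puts mhtml_reference('http://192.168.1.2/test.css', 'logo')
```

The MHTML header lines are deliberately left unindented, matching the example.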
Archive with sources and scripts
Example of a working site
Tested in Firefox 3.6, Opera 10.10, Chromium, Chrome, IE6-8
PS: The author of this article is my good friend Banderlog. I am posting the article at his request, so I recommend sending questions to him directly via Jabber: banderlog@jabber.com.ua
PPS: Strangely, it was only on the second day that anyone noticed a terrible mistake made when the article was posted: all three scripts were identical.