How is Data Matrix created?

There are many sites for creating such codes, but I was always wondering how the text turns into a set of black and white squares. Should there be some kind of algorithm?
When creating the Data Matrix, we need to turn to the arithmetic of Galois fields and Reed-Solomon codes. Consider this process with a simple example.
First of all, let's look at the structure of the matrix:

Our matrix consists of two parts: a search pattern and encoded data. Of course, the size of the matrix is directly proportional to the size of the input data. Around our code, there must be a free zone separating the code from the rest of the image.
Take a short word, for example, “Habr” (without quotes) and create a Data Matrix for it. The process consists of two stages: at the high-level coding stage, you need to obtain a sequence of data codes and error correction codes, and at the low-level coding stage, you need to depict a binary representation of these codes in the matrix.
High level coding
In the Data Matrix, as in the QR code, Reed-Solomon codes are used above the Galois field



For calculations, we need a table of powers of two for each element of the field. This table is created quite simply: if the exponent





It is necessary to obtain a codeword

where





First, we create an information polynomial. To do this, we need to know what size the matrix should be so that all information codes can be placed:

From the table it can be seen that to encode a row of 4 elements, we need to take a 12x12 matrix (“useful” area is 10x10), which contains 5 codes data and 7 correction codes.
For ASCII table characters, the code is as follows: C = ASCII value + 1. For example, for the character 'H' C = 72 + 1 = 73.
The running numbers are combined in pairs, and for them C = N + 130, where N is the number obtained as a result of grouping. For example, if the numbers 2 and 5 are nearby, then C = 25 + 130 = 155.
Since we have fewer elements than it should be (instead of five, only four), you need to add special indent codes. The first such code is always 129. The subsequent indent codes, up to the first error correction code, are calculated as follows:,



For the word “Habr” we get the following sequence of codes: 73, 98, 99, 115, 129.
Now we can write the information polynomial:

and multiply it by



We now proceed to create the generating polynomial. It is calculated by the following formula:

We start to multiply the brackets:


Addition in our field is defined as bitwise addition modulo 2. First, raising to a power using a table is performed, then adding them and finding the “logarithm” of the resulting number to return to powers of two. If, after adding the degrees, a number greater than 254 is obtained, take its remainder from dividing by

After multiplying all the brackets and raising to the power we get:

The last operation that completes the high-level coding, and perhaps the most difficult, is finding the remainder of the division



The polynomials are divided into columns, but taking into account the fact that the subtraction defined in the same way as addition and multiplication are performed in the Galois field.
Now we can write the codeword


Low level coding


The remaining elements are placed in a similar way, but before drawing them, it is necessary to note several special cases associated with the corners of the matrix.
If

If


If



There are two more cases that arise only when constructing rectangular matrices, so we omit them.
Let's go back to our matrix and add all the other elements, and also indicate which codeword each element corresponds to. The arrows show how the numbering is done:

After transferring the non-placed elements, we get:

In the lower right corner there was an unoccupied square (



Carefully fill the matrix. Let's start with the search template and the bottom square, and then we add each code in turn:




So, our Data Matrix code is ready:
