CRC16 Checksum Reliability
Not so long ago, on duty, I ran into a rather interesting problem.
We have a device that carries out intensive exchange on the internal RS485 bus, the number of packets passing is about several thousand per second, each packet is 7 bytes long, two of which are designed to store the CRC16 checksum in its CMS version (polynomial = 0x8005, starting value = 0xFFFF). Reception is carried out in the FIFO buffer, which shifts upward with crowding out after reception of each subsequent byte. An indicator of receiving a real packet is the fact that its checksum matches the value transmitted in the packet itself. No headers or additional parameters.
The problem was as follows - periodically, about once every 5 minutes, when transmitting data, a packet slipped through, the data of which gave a data outlier for one of the channels, and most often the outlier occurred to the same value. At first we looked in the direction of physical collisions, but it turned out differently - from time to time in the buffer where the data was collected, a packet appeared that consisted of the end of the previous packet and the beginning of the next, and the checksum of such a combined packet turned out to be correct. That is, there is a collision of the checksum: the package does not make sense, but gives the correct checksum.
Naturally, the error was already at the system design level, since the packages did not have any headers, the introduction of an additional byte header reduced the number of errors to an undetectable level, but this seemed to me not enough. I decided to check how different types of 16-bit checksums differ from each other in real conditions. Actually, about this and the article.
For comparison, I selected some of the most commonly used 16-bit checksums with various polynomials, starting values, and the mechanism for receiving bits. The amounts I have selected are summarized in the following table:
In this case:
When emulating packet corruption, I implemented the following models:
Then, the program randomly generated 100,000,000 packets, each of them carried out the above operations, after which the checksums of the original and upgraded packages were compared. Packets that did not change during the conversion were discarded. If the checksum matches, an error is logged.
As a result, the following table was obtained with the number of errors:
Obviously, the starting value of the algorithm does not affect the result in any way, which is logical, the starting value gives us only a different value of the checksum, but the calculation mechanism itself does not change in any way. Therefore, I excluded these checksums from further consideration. In the same way, it makes no sense to consider errors in single bits, all the checksums coped with this without error, which, in fact, was required of them during creation.
Well, the final quality table of the checksum, already without taking into account duplicate algorithms:
I leave the remaining conclusions to readers. I only note from myself that the number of units in the checksum polynomial has a certain effect on the results. But this is just my personal subjective opinion. I will be glad to hear other explanations.
The source code of the program is given below.
As a result, in the next version of the product for the internal bus, the CCITT checksum was chosen, to a greater extent because its implementation was available in the hardware of the microcontroller used.
We have a device that carries out intensive exchange on the internal RS485 bus, the number of packets passing is about several thousand per second, each packet is 7 bytes long, two of which are designed to store the CRC16 checksum in its CMS version (polynomial = 0x8005, starting value = 0xFFFF). Reception is carried out in the FIFO buffer, which shifts upward with crowding out after reception of each subsequent byte. An indicator of receiving a real packet is the fact that its checksum matches the value transmitted in the packet itself. No headers or additional parameters.
The problem was as follows - periodically, about once every 5 minutes, when transmitting data, a packet slipped through, the data of which gave a data outlier for one of the channels, and most often the outlier occurred to the same value. At first we looked in the direction of physical collisions, but it turned out differently - from time to time in the buffer where the data was collected, a packet appeared that consisted of the end of the previous packet and the beginning of the next, and the checksum of such a combined packet turned out to be correct. That is, there is a collision of the checksum: the package does not make sense, but gives the correct checksum.
Naturally, the error was already at the system design level, since the packages did not have any headers, the introduction of an additional byte header reduced the number of errors to an undetectable level, but this seemed to me not enough. I decided to check how different types of 16-bit checksums differ from each other in real conditions. Actually, about this and the article.
For comparison, I selected some of the most commonly used 16-bit checksums with various polynomials, starting values, and the mechanism for receiving bits. The amounts I have selected are summarized in the following table:
Designation | Polynomial | Init | Refin | Refout | Xorout |
CMS | 0x8005 | 0xFFFF | false | false | 0x0000 |
CCITT | 0x1021 | 0xFFFF | false | false | 0x0000 |
AUG-CCITT | 0x1021 | 0x1D0F | false | false | 0x0000 |
BYPASS | 0x8005 | 0x0000 | false | false | 0x0000 |
CDMA2000 | 0xC867 | 0xFFFF | false | false | 0x0000 |
DDS-110 | 0x8005 | 0x800D | false | false | 0x0000 |
DECT-X | 0x0589 | 0x0000 | false | false | 0x0000 |
EN-13757 | 0x3D65 | 0x0000 | false | false | 0xFFFF |
Modbus | 0x8005 | 0xFFFF | true | true | 0x0000 |
T10-DIF | 0x8BB7 | 0x0000 | false | false | 0x0000 |
TELEDISK | 0xA097 | 0x0000 | false | false | 0x0000 |
XMODEM | 0x1021 | 0x0000 | false | false | 0x0000 |
In this case:
- RefIn - the order of arrival of bits from the data buffer: false - starting with the most significant bit (MSB first), true - LSB first;
- RefOut - flag of inverting the order of bits at the output: true - invert.
When emulating packet corruption, I implemented the following models:
- Shuffle: filling random number of bytes in a packet with random values
- Bit shift: shift random bytes in a packet to the left
- Roll packet: ring byte shift in the packet to the left
- Right shift: packet shift to the right by one byte, 0xFF is added to the left (transmission is via UART)
- Left shift: shift the packet to the left by one byte, 0xFF is added to the right
- Fill zeros: filling a random number of bytes in a packet with 0x00 bytes (all zeros)
- Fill ones: filling a random number of bytes in a packet with 0xFF bytes (all units)
- Byte injection: insert a random byte into a packet in a random place, the bytes after the inserted are shifted in the tail direction
- Single bit: damage to a single random bit
Then, the program randomly generated 100,000,000 packets, each of them carried out the above operations, after which the checksums of the original and upgraded packages were compared. Packets that did not change during the conversion were discarded. If the checksum matches, an error is logged.
As a result, the following table was obtained with the number of errors:
Designation | Shuffle | Bit shift | Roll packet | Right shift | Left shift | Fill zeros | Fill ones | Byte injection | Sum |
CMS | 5101 | 3874 | 2937 | 1439 | 1688 | 3970 | 4010 | 1080 | 24099 |
CCITT | 2012 | 1127 | 3320 | 1494 | 1486 | 1063 | 1096 | 1130 | 12728 |
AUG-CCITT | 2012 | 1127 | 3320 | 1494 | 1486 | 1063 | 1096 | 1130 | 12728 |
BYPASS | 5101 | 3874 | 2937 | 1439 | 1688 | 3970 | 4010 | 1080 | 24099 |
CDMA2000 | 1368 | 1025 | 1946 | 1462 | 1678 | 1043 | 1028 | 1112 | 10662 |
DDS-110 | 5101 | 3874 | 2937 | 1439 | 1688 | 3970 | 4010 | 1080 | 24099 |
DECT-X | 1432 | 1189 | 5915 | 1779 | 1580 | 1215 | 1209 | 1093 | 15412 |
EN-13757 | 1281 | 2209 | 3043 | 1520 | 1528 | 2193 | 2187 | 1039 | 15,000 |
Modbus | 5090 | 3888 | 3086 | 1282 | 1582 | 3947 | 3897 | 1073 | 23845 |
T10-DIF | 1390 | 922 | 1424 | 1421 | 1630 | 994 | 938 | 1093 | 9812 |
TELEDISK | 1394 | 1049 | 5398 | 1451 | 1512 | 1096 | 1066 | 1065 | 14031 |
XMODEM | 2012 | 1127 | 3320 | 1494 | 1486 | 1063 | 1096 | 1130 | 12728 |
Obviously, the starting value of the algorithm does not affect the result in any way, which is logical, the starting value gives us only a different value of the checksum, but the calculation mechanism itself does not change in any way. Therefore, I excluded these checksums from further consideration. In the same way, it makes no sense to consider errors in single bits, all the checksums coped with this without error, which, in fact, was required of them during creation.
Well, the final quality table of the checksum, already without taking into account duplicate algorithms:
Designation | Number of collisions | Place |
CMS | 24099 | 8 |
CCITT | 12728 | 3 |
CDMA2000 | 10662 | 2 |
DECT-X | 15412 | 6 |
EN-13757 | 15,000 | 5 |
Modbus | 23845 | 7 |
T10-DIF | 9812 | 1 |
TELEDISK | 14031 | 4 |
I leave the remaining conclusions to readers. I only note from myself that the number of units in the checksum polynomial has a certain effect on the results. But this is just my personal subjective opinion. I will be glad to hear other explanations.
The source code of the program is given below.
Source
#include
#include
#include
#include
#include
#include
#define PACKET_LEN (7)
#define NUM_OF_CYCLES (100000)
static unsigned char reverse_table[16] =
{
0x0, 0x8, 0x4, 0xC, 0x2, 0xA, 0x6, 0xE,
0x1, 0x9, 0x5, 0xD, 0x3, 0xB, 0x7, 0xF
};
uint8_t reverse_bits(uint8_t byte)
{
// Reverse the top and bottom nibble then swap them.
return (reverse_table[byte & 0b1111] << 4) | reverse_table[byte >> 4];
}
uint16_t reverse_word(uint16_t word)
{
return ((reverse_bits(word & 0xFF) << 8) | reverse_bits(word >> 8));
}
uint16_t crc16_common(uint8_t *data, uint8_t len, uint16_t poly, uint16_t init,
uint16_t doXor, bool refIn, bool refOut)
{
uint8_t y;
uint16_t crc;
crc = init;
while (len--)
{
if (refIn)
crc = ((uint16_t)reverse_bits(*data++) << 8) ^ crc;
else
crc = ((uint16_t)*data++ << 8) ^ crc;
for (y = 0; y < 8; y++)
{
if (crc & 0x8000)
crc = (crc << 1) ^ poly;
else
crc = crc << 1;
}
}
if (refOut)
crc = reverse_word(crc);
return (crc ^ doXor);
}
uint16_t crc16_ccitt(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0x1021, 0xFFFF, 0x0000, false, false);
}
uint16_t crc16_bypass(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0x8005, 0x0000, 0x0000, false, false);
}
uint16_t crc16_xmodem(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0x1021, 0x0000, 0x0000, false, false);
}
uint16_t crc16_teledisk(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0xA097, 0x0000, 0x0000, false, false);
}
uint16_t crc16_augccitt(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0x1021, 0x1d0f, 0x0000, false, false);
}
uint16_t crc16_cdma2000(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0xc867, 0xffff, 0x0000, false, false);
}
uint16_t crc16_dds110(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0x8005, 0x800d, 0x0000, false, false);
}
uint16_t crc16_dect(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0x0589, 0x0000, 0x0000, false, false);
}
uint16_t crc16_en13757(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0x3d65, 0x0000, 0xffff, false, false);
}
uint16_t crc16_t10dif(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0x8bb7, 0x0000, 0x0000, false, false);
}
uint16_t crc16_cms(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0x8005, 0xFFFF, 0x0000, false, false);
}
uint16_t crc16_modbus(uint8_t *data, uint8_t len)
{
return crc16_common(data, len, 0x8005, 0xFFFF, 0x0000, true, true);
}
bool compare_buf(uint8_t *buf1, uint8_t *buf2)
{
uint8_t x;
for (x = 0; x < PACKET_LEN; x++)
{
if (buf1[x] != buf2[x])
return true;
}
return false;
}
bool method_shuffle(uint8_t *buf)
{
uint8_t i, j;
uint16_t rnd;
uint8_t copy[PACKET_LEN];
memcpy(copy, buf, PACKET_LEN);
for (i = 0; i < PACKET_LEN; i++)
{
for (j = 0; j < 10; j++)
{
rnd = (uint16_t)rand();
if (rnd % 7 == 0)
buf[i] ^= (1 << (rnd % 8));
}
}
return compare_buf(buf, copy);
}
bool method_bitshift(uint8_t *buf)
{
uint8_t x, i, j;
uint8_t copy[PACKET_LEN];
memcpy(copy, buf, PACKET_LEN);
x = (uint8_t)(rand() % PACKET_LEN) + 1;
for (j = 0; j < x; j++)
{
i = (uint8_t)(rand() % PACKET_LEN);
if (buf[i] == 0)
buf[i] = 0x01;
else
buf[i] <<= 1;
}
return compare_buf(buf, copy);
}
bool method_packetroll(uint8_t *buf)
{
uint8_t x, i, j;
uint8_t temp;
uint8_t copy[PACKET_LEN];
memcpy(copy, buf, PACKET_LEN);
x = (uint8_t)(rand() % (PACKET_LEN - 1)) + 1;
for (j = 0; j < x; j++)
{
temp = buf[0];
for (i = 0; i < PACKET_LEN - 1; i++)
buf[i] = buf[i + 1];
buf[PACKET_LEN - 1] = temp;
}
return compare_buf(buf, copy);
}
bool method_shiftright(uint8_t *buf)
{
uint8_t i;
uint8_t copy[PACKET_LEN];
memcpy(copy, buf, PACKET_LEN);
for (i = 0; i < PACKET_LEN - 1; i++)
buf[i + 1] = buf[i];
buf[0] = 0xff;
return compare_buf(buf, copy);
}
bool method_shiftleft(uint8_t *buf)
{
uint8_t i;
uint8_t copy[PACKET_LEN];
memcpy(copy, buf, PACKET_LEN);
for (i = 0; i < PACKET_LEN - 1; i++)
buf[i] = buf[i + 1];
buf[PACKET_LEN - 1] = 0xff;
return compare_buf(buf, copy);
}
bool method_zero(uint8_t *buf)
{
uint8_t x, i, j;
uint8_t copy[PACKET_LEN];
memcpy(copy, buf, PACKET_LEN);
x = (uint8_t)(rand() % PACKET_LEN) + 1;
for (j = 0; j < x; j++)
{
i = (uint8_t)(rand() % PACKET_LEN);
if (buf[i] != 0x00)
buf[i] = 0x00;
else
buf[i] = 0xFF;
}
return compare_buf(buf, copy);
}
bool method_one(uint8_t *buf)
{
uint8_t x, i, j;
uint8_t copy[PACKET_LEN];
memcpy(copy, buf, PACKET_LEN);
x = (uint8_t)(rand() % PACKET_LEN) + 1;
for (j = 0; j < x; j++)
{
i = (uint8_t)(rand() % PACKET_LEN);
if (buf[i] != 0xFF)
buf[i] = 0xFF;
else
buf[i] = 0x00;
}
return compare_buf(buf, copy);
}
bool method_injection(uint8_t *buf)
{
uint8_t x, i;
uint8_t copy[PACKET_LEN];
memcpy(copy, buf, PACKET_LEN);
x = (uint8_t)(rand() % PACKET_LEN);
for (i = PACKET_LEN - 1; i > x; i--)
{
buf[i] = buf[i - 1];
}
buf[x] = (uint8_t)rand();
return compare_buf(buf, copy);
}
bool method_single(uint8_t *buf)
{
uint8_t x;
x = (uint8_t)(rand() % (PACKET_LEN * 8));
buf[(uint8_t)(x / 8)] ^= (1 << (x % 8));
return true;
}
typedef struct
{
uint16_t crc1;
uint16_t crc2;
uint32_t errors;
uint16_t (*fn)(uint8_t *data, uint8_t len);
char name[32];
} tCRC;
typedef struct
{
bool (*fn)(uint8_t *buf);
char name[32];
} tMethod;
static tMethod methods[] =
{
{method_shuffle, "Shuffle"},
{method_bitshift, "Bit shift"},
{method_packetroll, "Roll packet"},
{method_shiftright, "Right shift"},
{method_shiftleft, "Left shift"},
{method_zero, "Fill zeros"},
{method_one, "Fill ones"},
{method_injection, "Byte injection"},
{method_single, "Single bit"}
};
static tCRC crcs[] =
{
{0, 0, 0, crc16_cms, "CMS"},
{0, 0, 0, crc16_ccitt, "CCITT"},
{0, 0, 0, crc16_augccitt, "AUG-CCITT"},
{0, 0, 0, crc16_bypass, "BYPASS"},
{0, 0, 0, crc16_cdma2000, "CDMA2000"},
{0, 0, 0, crc16_dds110, "DDS-110"},
{0, 0, 0, crc16_dect, "DECT-X"},
{0, 0, 0, crc16_en13757, "EN-13757"},
{0, 0, 0, crc16_modbus, "Modbus"},
{0, 0, 0, crc16_t10dif, "T10-DIF"},
{0, 0, 0, crc16_teledisk, "TELEDISK"},
{0, 0, 0, crc16_xmodem, "XMODEM"}
};
int main(int argc, char * argv[])
{
uint32_t num_of_cycle;
uint32_t num_of_sums;
uint8_t packet[PACKET_LEN];
uint8_t i;
uint8_t m;
//uint8_t buf[8] = {0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17};
srand(time(NULL));
printf("------------------------- CRC16 comparison -------------------------\n");
num_of_sums = sizeof(crcs) / sizeof(tCRC);
for (m = 0; m < sizeof(methods) / sizeof(tMethod); m++)
{
printf("\r%s:\n", methods[m].name);
for (i = 0; i < num_of_sums; i++)
{
crcs[i].errors = 0;
}
for (num_of_cycle = 0; num_of_cycle < NUM_OF_CYCLES; num_of_cycle++)
{
for (i = 0; i < PACKET_LEN; i++)
packet[i] = (uint8_t)rand();
for (i = 0; i < num_of_sums; i++)
crcs[i].crc1 = crcs[i].fn(packet, PACKET_LEN);
if (!methods[m].fn(packet))
continue;
for (i = 0; i < num_of_sums; i++)
{
crcs[i].crc2 = crcs[i].fn(packet, PACKET_LEN);
if (crcs[i].crc1 == crcs[i].crc2)
crcs[i].errors++;
}
if (num_of_cycle % 1000 == 0)
printf("\r%.2f%%", (float)num_of_cycle / NUM_OF_CYCLES * 100);
}
for (i = 0; i < num_of_sums; i++)
printf("\r %20s: %10d\n", crcs[i].name, crcs[i].errors);
}
return 0;
}
As a result, in the next version of the product for the internal bus, the CCITT checksum was chosen, to a greater extent because its implementation was available in the hardware of the microcontroller used.