Why traffic graphs "lie"
Working in DDoS protection, I am often on a client's site to run load tests, test the protection, and help repel attacks. I regularly see graphs in different systems disagree about the same traffic. The brief explanation "they count differently" does not inspire confidence, so I have described the reasons in a separate article. It will be useful to junior network operations engineers and to anyone who has to work with traffic graphs.
I have divided the reasons for the discrepancies into three groups:
1. Counting
2. Collection and storage
3. Display
1. Counting
I'll start with the main reason for the discrepancy, which is also the one most often overlooked. It has two parts:
1. Engineers often assume that the minimum "packet" size is 64 bytes.
2. Network equipment counts the amount of transmitted data in different ways.
The sources of error and answers are in this picture.
1.1 RTFM
Let me remind you of the Ethernet header structure.
As an example, we will do the calculations for 10 GbE. A 10 GbE interface passes at most 10,000,000,000 bits per second (10^10).
Convert the header sizes from octets to bits:
Field | bytes | bits |
---|---|---|
L1 Header size | 20 | 160 |
L2 MAC Header size | 14 | 112 |
L2 FCS size | 4 | 32 |
L2 VLAN size | 4 | 32 |
Payload min | 46 | 368 |
Payload max | 1500 | 12000 |
Total | | |
Min payload w/o VLAN | 84 | 672 |
Min payload w VLAN | 88 | 704 |
Max payload w/o VLAN | 1538 | 12304 |
Max payload w VLAN | 1542 | 12336 |
* Overlay technologies on the transport network add to the initial PDU size, which reduces the maximum pps.
** VLAN is used here as an example. Network interfaces may process frames with VLAN IDs differently: some increase the MTU, others reduce the maximum allowed payload size.
We calculate the maximum and minimum rates in PDUs per second at full interface utilization (wire speed):
Rate | PDU per second |
---|---|
Max pps | 14880952.38 |
Max pps w VLAN | 14204545.45 |
Min pps | 812743.8231 |
Min pps w VLAN | 810635.5383 |
That is, a 10 GbE interface passes at most ~14.88 Mpps. To make it easier to remember, we call it a zigapack.
Note also that max pps and min pps differ by a factor of more than 18. For this reason, when evaluating anti-DDoS solutions, pay attention to performance in Mpps. Vendors often state performance in Gbps and stay silent about packets. Methods for evaluating the performance of protection systems are a topic for a separate, large article.
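The arithmetic behind these figures can be reproduced with a short sketch. The constants come from the header table above; the function names `wire_size` and `pps` are illustrative assumptions, not part of any tool mentioned in the article.

```python
# Sizes in bytes, taken from the header table in this section.
L1_OVERHEAD = 20   # "L1 Header size" row
MAC_HEADER = 14
FCS = 4
VLAN_TAG = 4
PAYLOAD_MIN = 46
PAYLOAD_MAX = 1500

LINE_RATE_BPS = 10**10  # 10 GbE, bits per second


def wire_size(payload: int, vlan: bool = False) -> int:
    """Bytes one frame occupies on the wire, L1 overhead included."""
    return L1_OVERHEAD + MAC_HEADER + FCS + payload + (VLAN_TAG if vlan else 0)


def pps(payload: int, vlan: bool = False, line_rate_bps: int = LINE_RATE_BPS) -> float:
    """PDUs per second at full interface utilization."""
    return line_rate_bps / (wire_size(payload, vlan) * 8)


if __name__ == "__main__":
    cases = [
        ("Max pps", PAYLOAD_MIN, False),
        ("Max pps w VLAN", PAYLOAD_MIN, True),
        ("Min pps", PAYLOAD_MAX, False),
        ("Min pps w VLAN", PAYLOAD_MAX, True),
    ]
    for label, payload, vlan in cases:
        print(f"{label}: {pps(payload, vlan):.2f}")
    print(f"max/min ratio: {pps(PAYLOAD_MIN) / pps(PAYLOAD_MAX):.2f}")
```

Changing `LINE_RATE_BPS` gives the same figures for other interface speeds.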
1.2 Features of PDU Size Counting
Network equipment can measure PDU size at different layers and exclude some fields from the count. Common sets of counted fields:
- L2 Data or IP packet len
- L2 Data + MAC Header
- L2 Data + MAC Header + FCS (CRC Checksum)
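Now let us calculate what the graph shows during a wire-speed TCP SYN flood, without and with a VLAN tag. The table assumes a 40-byte SYN packet (20 bytes of IP header plus 20 bytes of TCP header), which travels in a minimum-size frame.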
Counted fields | PDU size (bytes) | Gbps (giga = 10^9) | Gbps (giga = 2^30) |
---|---|---|---|
pps w/o VLAN = 14880952.38 | | | |
L1 | 84 | 10 | 9.313225746 |
L2 | 64 | 7.619047619 | 7.095791045 |
L2 w/o FCS | 60 | 7.142857143 | 6.652304104 |
L2 Data (IP + TCP) | 40 | 4.761904762 | 4.434869403 |
pps w VLAN = 14204545.45 | | | |
L1 | 88 | 10 | 9.313225746 |
L2 | 68 | 7.727272727 | 7.196583531 |
L2 w/o FCS | 64 | 7.272727273 | 6.773255088 |
L2 Data (IP + TCP) | 40 | 4.545454545 | 4.23328443 |
So at full utilization of a 10 GbE interface, the graph may show as little as 4.43 Gbps (4.23 Gbps with a VLAN tag).
Therefore, when comparing rates in different systems, you need to understand how each one counts the PDU size. To simplify the comparison, we made ourselves a calculator that shows the rate for different sets of counted headers.
The remaining two groups of causes smooth out peaks and apply to any rate graph.
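The author's calculator is not reproduced here. The following is only a minimal sketch of the same idea, assuming a fully loaded link and a hypothetical helper `observed_gbps` that scales the line rate by the share of each frame that the device actually counts.

```python
LINE_RATE_BPS = 10**10  # 10 GbE, bits per second


def observed_gbps(counted_bytes: int, wire_bytes: int,
                  giga: int = 10**9,
                  line_rate_bps: int = LINE_RATE_BPS) -> float:
    """Rate a device reports when it counts only counted_bytes of every
    frame that occupies wire_bytes on the wire at full utilization."""
    frames_per_second = line_rate_bps / (wire_bytes * 8)
    return frames_per_second * counted_bytes * 8 / giga


if __name__ == "__main__":
    # Minimum-size frames: 84 bytes on the wire, 88 with a VLAN tag.
    for wire, levels in [(84, {"L1": 84, "L2": 64, "L2 w/o FCS": 60, "L2 Data": 40}),
                         (88, {"L1": 88, "L2": 68, "L2 w/o FCS": 64, "L2 Data": 40})]:
        for name, counted in levels.items():
            dec = observed_gbps(counted, wire)
            binary = observed_gbps(counted, wire, giga=2**30)
            print(f"wire={wire} {name}: {dec:.3f} Gbps (10^9), {binary:.3f} Gbps (2^30)")
```

The `giga` parameter covers the second source of disagreement in the table: whether the plotting tool treats "G" as 10^9 or 2^30.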
2. Collection and storage
2.1 Counter polling frequency
Monitoring systems usually poll counters that hold the absolute number of processed bytes and packets. To display a rate, you need to calculate the derivative.
The accuracy of the graph depends heavily on how often the counter is polled: the less often, the greater the averaging. For example, operators customarily take values every five minutes. As a result, during Pulse Wave DDoS attacks the graph profiles in the monitoring system and in the filtering system will be very different.
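A small simulation makes the effect visible. All numbers in it (a 60-second pulse every 5 minutes, 8 Gbps during the pulse, 1 Gbps otherwise) are assumptions chosen for illustration, and counter wrap-around is ignored for brevity.

```python
def simulate_counter(duration_s: int, pulse_on_s: int, pulse_off_s: int,
                     attack_bps: int, base_bps: int) -> list[int]:
    """Cumulative byte counter sampled once per second during a pulse-wave attack."""
    counter = 0
    values = [counter]
    period = pulse_on_s + pulse_off_s
    for t in range(duration_s):
        bps = attack_bps if t % period < pulse_on_s else base_bps
        counter += bps // 8  # counters hold bytes, not bits
        values.append(counter)
    return values


def rates_from_counter(samples: list[int], interval_s: int) -> list[float]:
    """Rate in bits per second between consecutive counter samples."""
    return [(b - a) * 8 / interval_s for a, b in zip(samples, samples[1:])]


if __name__ == "__main__":
    values = simulate_counter(600, 60, 240, 8 * 10**9, 10**9)
    per_second = rates_from_counter(values, 1)
    five_minute = rates_from_counter(values[::300], 300)
    print(f"peak with 1 s polling:   {max(per_second) / 1e9:.1f} Gbps")
    print(f"peak with 300 s polling: {max(five_minute) / 1e9:.1f} Gbps")
```

With these assumed parameters the one-second view peaks at 8 Gbps, while the five-minute view never rises above the 2.4 Gbps average of each interval.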
2.2 Data Consolidation (Retention policy)
Counter values are often stored in a round-robin database (RRD). To save resources, data for different periods is stored at different resolutions: the further into the past, the sparser the values and the greater the averaging.
Different systems may have different retention policies, so when you look at charts retrospectively you may see different values.
3. Display
3.1 Number of points on the chart
A chart usually has a limit on the number of points it displays. If the requested period contains more points than that, they are consolidated for display. Most often, neighboring points are merged into one with their average value. This averaging smooths out the peaks.
Illustrative example:
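The smoothing can be reproduced numerically. The sketch below assumes a hypothetical `consolidate` helper that merges neighboring points into equal-sized buckets; the series (a flat 1 Gbps with one 9 Gbps spike) is invented for illustration. Consolidating by maximum is shown only as a contrast to averaging.

```python
from statistics import mean


def consolidate(points, max_points, func):
    """Reduce points to at most max_points values by applying func
    to consecutive buckets of neighboring points."""
    if len(points) <= max_points:
        return list(points)
    bucket = -(-len(points) // max_points)  # ceiling division
    return [func(points[i:i + bucket]) for i in range(0, len(points), bucket)]


if __name__ == "__main__":
    # 60 one-minute samples in Gbps with a single one-minute spike.
    series = [1.0] * 60
    series[30] = 9.0

    averaged = consolidate(series, 6, mean)
    peaks = consolidate(series, 6, max)
    print("average per bucket:", averaged)
    print("maximum per bucket:", peaks)
```

With averaging, the 9 Gbps spike appears as a 1.8 Gbps point; with maximum consolidation it stays at 9 Gbps.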
3.2 Binary prefixes
Chart drawing tools add a further discrepancy. For graphs in bits, they may use different powers for the same prefix: "giga" can mean 10^9 or 2^30. You can read more at en.wikipedia.org/wiki/
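The size of the difference is easy to check. A minimal sketch, with `scale` as a hypothetical helper name:

```python
PREFIXES = ["", "K", "M", "G", "T"]


def scale(bits_per_second: float, prefix: str = "G", binary: bool = False) -> float:
    """Express a rate with the given prefix, using powers of 1000 or of 1024."""
    base = 1024 if binary else 1000
    return bits_per_second / base ** PREFIXES.index(prefix)


if __name__ == "__main__":
    rate = 10**10  # a fully loaded 10 GbE interface, bits per second
    print(f"decimal: {scale(rate):.3f} Gbps")
    print(f"binary:  {scale(rate, binary=True):.3f} Gbps")
```

For the giga prefix the two conventions differ by about 7%, which matches the two Gbps columns in the table in section 1.2.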
3.3 Units
Counters on network equipment mostly report the amount of processed data in bytes. If the value is not converted, the graph shows the rate in Bps (bytes per second) rather than bps (bits per second), an eightfold difference.
Conclusion
Charts are a useful and informative tool. By looking at the right set of graphs, you can quickly find answers to many questions. But when working with charts, you need to understand the nuances, especially when correlating charts from different systems. Therefore, the first time you look at the chart, find out:
- what is counted and how;
- how it is stored;
- how it is displayed.