Why traffic graphs "lie"

Working in DDoS protection, I am often on site with clients to run load tests, test their protection, and help repel attacks. I often see graphs in different systems disagree about the same traffic. The brief explanation "they count differently" does not inspire confidence, so I described the reasons in a separate article. This article will be useful to junior network operations engineers and to anyone who has to work with graphs.

I divide the reasons for the discrepancy in readings into three groups:

1. Counting
2. Collection and storage
3. Display

1. Counting


I'll start with the main reason for the discrepancy, which is also the one most often overlooked.

1. Engineers often believe that the minimum "packet" size is 64 bytes.
2. Network equipment counts the amount of transmitted data in different ways.

The sources of error and answers are in this picture.

1.1 RTFM


Let me remind you of the Ethernet header structure.



As an example, we will do the calculations for 10 GbE. A 10 GbE interface passes at most 10,000,000,000 bits per second (10^10).

Convert the size of the headers from octets to bits:

Field                    bytes     bits
L1 Header size              20      160
L2 MAC Header size          14      112
L2 FCS size                  4       32
L2 VLAN size                 4       32
Payload min                 46      368
Payload max               1500    12000

Total                    bytes     bits
Min payload w/o VLAN        84      672
Min payload w VLAN          88      704
Max payload w/o VLAN      1538    12304
Max payload w VLAN        1542    12336

* Using overlay technologies on the transport network increases the initial size of the PDU, which reduces the maximum pps.
** VLAN is used here as an example. Network interfaces handle frames with VLAN IDs differently: some increase the MTU, others reduce the maximum allowed payload size.

We calculate the maximum and minimum rate in PDUs per second at full interface utilization (wire speed):

Max pps               14880952.38
Max pps w VLAN        14204545.45
Min pps                 812743.8231
Min pps w VLAN          810635.5383
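The wire-speed figures above can be reproduced with a few lines of Python. This is a minimal sketch: the byte sizes are taken from the header table, and the function and constant names are made up for illustration.

```python
LINE_RATE_BPS = 10**10  # 10 GbE: 10^10 bits per second

# Sizes in bytes, as in the header table.
L1_OVERHEAD = 20   # L1 header: preamble, start-of-frame delimiter, inter-frame gap
MAC_HEADER = 14
FCS = 4
VLAN_TAG = 4


def wire_size_bytes(payload: int, vlan: bool = False) -> int:
    """Bytes one PDU occupies on the wire, including L1 overhead."""
    return L1_OVERHEAD + MAC_HEADER + FCS + (VLAN_TAG if vlan else 0) + payload


def wirespeed_pps(payload: int, vlan: bool = False,
                  line_rate_bps: int = LINE_RATE_BPS) -> float:
    """PDUs per second at full interface utilization."""
    return line_rate_bps / (wire_size_bytes(payload, vlan) * 8)


for label, payload, vlan in [
    ("Max pps", 46, False),
    ("Max pps w VLAN", 46, True),
    ("Min pps", 1500, False),
    ("Min pps w VLAN", 1500, True),
]:
    print(f"{label:<16}{wirespeed_pps(payload, vlan):>16.4f}")
```

Changing `line_rate_bps` gives the same limits for other interface speeds.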


That is, a 10 GbE interface passes at most ~14.88 Mpps. To make it easier to remember, we call it a zigapack.

Note also that max pps and min pps differ by a factor of more than 18. For this reason, when evaluating anti-DDoS solutions, you need to pay attention to performance in Mpps. Vendors often state performance in Gbps and say nothing about packets per second. Methods for evaluating the performance of protection systems are a topic for a separate, larger article.

1.2 Features of PDU Size Counting



Network equipment can count PDU size at different layers and exclude some fields from the count. Common sets of counted fields:

  • L2 Data or IP packet len
  • L2 Data + MAC Header
  • L2 Data + MAC Header + FCS (CRC Checksum)

Now let us calculate what the graph shows during a TCP SYN flood at wire speed, with and without VLAN.

Counted fields          PDU size    Gbps,           Gbps,
                        (bytes)     multiplier      multiplier
                                    10^9            2^30

Pps w/o vlan = 14880952.38
L1                      84          10              9.313225746
L2                      64          7.619047619     7.095791045
L2 w/o FCS              60          7.142857143     6.652304104
L2 Data (IP + TCP)      40          4.761904762     4.434869403

Pps w vlan = 14204545.45
L1                      88          10              9.313225746
L2                      68          7.727272727     7.196583531
L2 w/o FCS              64          7.272727273     6.773255088
L2 Data (IP + TCP)      40          4.545454545     4.23328443

So at full utilization of a 10 GbE interface, the graph can show a rate of only 4.43 Gbps.
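The readings for a SYN flood at wire speed can be recomputed as follows. This is a sketch under the same assumptions as the table: a 10 GbE link and the per-layer byte counts listed there. The function name `reading` is hypothetical.

```python
LINE_RATE_BPS = 10**10  # 10 GbE


def reading(counted_bytes: int, wire_bytes: int, giga: int = 10**9) -> float:
    """Rate shown on a graph, in 'Gbps', when only `counted_bytes` of each
    PDU are counted while the PDU really occupies `wire_bytes` on the wire."""
    pps = LINE_RATE_BPS / (wire_bytes * 8)
    return pps * counted_bytes * 8 / giga


# Counted bytes per TCP SYN without VLAN; the PDU occupies 84 bytes on the wire.
for name, counted in [("L1", 84), ("L2", 64),
                      ("L2 w/o FCS", 60), ("L2 Data (IP + TCP)", 40)]:
    decimal = reading(counted, 84)          # "giga" = 10^9
    binary = reading(counted, 84, 2**30)    # "giga" = 2^30
    print(f"{name:<20}{decimal:>14.9f}{binary:>14.9f}")
```

Passing 88 as `wire_bytes`, with the counted sizes from the VLAN rows, gives the second half of the table.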

Therefore, when comparing rate values in different systems, you need to understand how each one counts PDU size. To simplify the comparison, we made ourselves a calculator that shows the rate for different sets of counted headers.

The other two groups of causes smooth out peaks and apply to all rate graphs.

2. Collection and storage


2.1 Counter polling frequency


Usually the counters that are polled show the absolute (cumulative) number of processed bytes and packets. To display a rate, you need to calculate the derivative.

The accuracy of the graph depends heavily on how often the counter is polled. The less often it is polled, the greater the averaging. For example, operators customarily take values every five minutes. Therefore, during Pulse Wave DDoS attacks, the graph profiles in the monitoring system and in the filtering system will be very different.
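The effect of the polling interval can be illustrated with a small simulation. This is a sketch with invented traffic: a pulse of 10 Gbps lasting 30 seconds out of every 300. Counter wrap-around is ignored and all names are hypothetical.

```python
def rates_from_counter(samples):
    """Turn (timestamp_s, cumulative_bytes) samples into bits per second
    for each interval between consecutive polls (a discrete derivative)."""
    return [(c1 - c0) * 8 / (t1 - t0)
            for (t0, c0), (t1, c1) in zip(samples, samples[1:])]


def pulse_bps(t: int) -> float:
    """Invented pulse-wave traffic: 10 Gbps for 30 s of every 300 s."""
    return 10e9 if t % 300 < 30 else 0.0


# Cumulative byte counter as the device would expose it, one entry per second.
counter = [0]
for t in range(600):
    counter.append(counter[-1] + int(pulse_bps(t) / 8))


def poll(interval_s: int):
    """Read the counter every `interval_s` seconds."""
    return [(t, counter[t]) for t in range(0, 601, interval_s)]


print(max(rates_from_counter(poll(10))) / 1e9)    # peak seen with 10 s polling
print(max(rates_from_counter(poll(300))) / 1e9)   # peak seen with 5 min polling
```

With these numbers, 10-second polling shows the full 10 Gbps peak, while five-minute polling shows a flat 1 Gbps, because each pulse is averaged over the whole interval.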

2.2 Data Consolidation (Retention policy)


Counter values are often stored in a round-robin database (RRD). To save resources, data for different periods is stored at different resolutions. The further into the past, the sparser the values and the greater the averaging.

Different systems may have different retention policies, so when you look at historical charts you can see different values for the same period.
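A sketch of how two retention policies can report different values for the same past event. The sample data and both policies are invented for illustration, and consolidation by plain averaging is an assumption.

```python
from statistics import mean


def consolidate(values, factor):
    """Average each run of `factor` consecutive samples into one stored value."""
    return [mean(values[i:i + factor]) for i in range(0, len(values), factor)]


# One hour of 1-minute samples in Gbps: 1 Gbps baseline with a 5-minute burst at 8 Gbps.
minute_gbps = [1.0] * 60
minute_gbps[22:27] = [8.0] * 5

system_a = consolidate(minute_gbps, 5)    # hypothetical policy: keep 5-minute averages
system_b = consolidate(minute_gbps, 60)   # hypothetical policy: keep 1-hour averages

print(max(minute_gbps))  # peak in the raw data
print(max(system_a))     # peak after 5-minute consolidation
print(max(system_b))     # peak after 1-hour consolidation
```

Here the raw peak of 8 Gbps becomes about 5.2 Gbps in the first system and about 1.58 Gbps in the second, although both stored the same traffic.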

3. Display


3.1 Number of points on the chart


A chart usually has a limit on the number of points it displays. If the requested period contains more points than that, the points are consolidated for display. Most often, neighboring points are merged into one with their average value. This averaging smooths out the peaks.
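A minimal sketch of this kind of display consolidation, assuming neighboring points are merged into equal-sized buckets and replaced by their average. The function name and the sample series are made up.

```python
def downsample_avg(points, max_points):
    """Merge neighboring points into buckets so that at most `max_points`
    remain, replacing each bucket with its average value."""
    if len(points) <= max_points:
        return list(points)
    bucket = -(-len(points) // max_points)  # ceiling division
    result = []
    for i in range(0, len(points), bucket):
        chunk = points[i:i + bucket]
        result.append(sum(chunk) / len(chunk))
    return result


series = [1, 1, 1, 9, 1, 1, 1, 1]      # a single peak of 9
print(downsample_avg(series, 8))        # all points fit, peak stays at 9
print(downsample_avg(series, 4))        # pairs are averaged, peak drops to 5.0
print(downsample_avg(series, 2))        # groups of four are averaged, peak drops to 3.0
```

The wider the requested period relative to the point limit, the lower the displayed peak.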

Illustrative example:



3.2 Binary prefixes


Chart drawing tools add a further discrepancy to the readings. For graphs in bits, they may use different powers for the same prefix: for example, "giga" can mean 10^9 or 2^30. You can read more at en.wikipedia.org/wiki/
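A short sketch of the difference, using the 10 GbE line rate as the value being displayed. The constant names are hypothetical.

```python
DECIMAL_GIGA = 10**9   # "giga" as a power of ten
BINARY_GIGA = 2**30    # "giga" as a power of two

rate_bps = 10**10      # 10 GbE at full utilization, in bits per second

decimal_reading = rate_bps / DECIMAL_GIGA
binary_reading = rate_bps / BINARY_GIGA

print(f"{decimal_reading:.3f} 'Gbps' with 10^9")
print(f"{binary_reading:.3f} 'Gbps' with 2^30")
print(f"difference: {(1 - binary_reading / decimal_reading) * 100:.1f}%")
```

For the "giga" prefix, the same traffic reads about 7% lower on a tool that divides by 2^30.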

3.3 Units


Counters on network equipment typically show the amount of processed data in bytes. If the value is not converted, the graph will show the rate in Bps (bytes per second) rather than in bps (bits per second), a difference of a factor of 8.

Conclusion


Charts are a useful and informative tool. By looking at the right set of graphs, you can quickly find answers to many questions. But when working with charts, you need to understand the nuances, especially when correlating charts from different systems. Therefore, the first time you look at the chart, find out:

  • what is counted and how;
  • how it is stored;
  • how it is displayed.
