Improving the speed and quality of a network suffering from packet loss

Good day. I would like to share my understanding of this problem and of the mechanism that Silver Peak WAN optimization devices use to solve it.

The vast majority of modern networks are packet-switched and built on IP. Data travels through the network in packets, and the payload of a packet usually ranges from 1 byte to 1,500 bytes. On its way across a WAN, a packet can pass through a great many routers and gateways. You can see some of these transit nodes with the Traceroute utility, but far from all of them: for example, you will not see the nodes that the traffic crossed inside a tunnel (MPLS VPN, GRE, etc.). With non-zero probability, one of the transit nodes will be heavily loaded at that moment and will drop your packet to relieve congestion. The more transit nodes there are, the higher the probability of packet loss in the network.
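
To make the last point concrete, here is a minimal sketch. It assumes, purely for illustration, that every transit node drops packets independently and with the same probability; the per-hop figure of 0.1% is made up and is not a measurement from a real network.

```python
def path_loss_probability(per_hop_loss: float, hops: int) -> float:
    """Probability that a packet is dropped somewhere along the path.

    Illustrative assumption: each hop drops packets independently
    with the same probability `per_hop_loss`.
    """
    if not 0.0 <= per_hop_loss <= 1.0:
        raise ValueError("per_hop_loss must be between 0 and 1")
    if hops < 0:
        raise ValueError("hops must be non-negative")
    # The packet survives only if it survives every single hop.
    return 1.0 - (1.0 - per_hop_loss) ** hops


if __name__ == "__main__":
    for hops in (5, 10, 20, 40):
        print(f"{hops:2d} hops: {path_loss_probability(0.001, hops):.2%}")
```

Under these assumptions the end-to-end loss grows almost linearly with the number of hops while the per-hop loss stays small.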

As an example, here is a picture showing how the percentage of lost packets can change over time on the same communication channel:

image

In theory, dropping a packet is perfectly safe and normal: the TCP protocol takes care of the integrity of the data transfer. But, as always, there are nuances. When a packet is lost, TCP has to send it over the network again, and before deciding to retransmit, the sender has to wait until the receiver signals that the expected packet did not arrive. This is where network latency comes to the fore. The higher it is, the longer the sender stays in the dark, and the slower the transfer.
Below are graphs of throughput as a function of channel latency and packet loss percentage. They show that on a channel with a typical WAN latency of 50-100 milliseconds, most of the throughput is already lost at a seemingly insignificant loss rate of 1-2%.

image
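
The general shape of such graphs can be approximated with the rule of thumb by Mathis et al. for the steady-state throughput of a single classic (Reno-style) TCP flow: throughput is roughly MSS / RTT multiplied by C / sqrt(p), where p is the loss rate and C is about sqrt(3/2). This formula is not taken from the graphs above; it is a rough model brought in here for illustration, and the MSS of 1460 bytes is an assumed typical value.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Rough upper bound on TCP throughput in bits per second.

    Uses the Mathis et al. approximation, which is only a model
    for Reno-style congestion control, not a measurement.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0.0 < loss <= 1.0:
        raise ValueError("loss must be in (0, 1]")
    c = math.sqrt(3.0 / 2.0)  # constant from the approximation
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss))


if __name__ == "__main__":
    for rtt_ms in (50, 100):
        for loss in (0.001, 0.01, 0.02):
            mbps = mathis_throughput_bps(1460, rtt_ms / 1000, loss) / 1e6
            print(f"RTT {rtt_ms:3d} ms, loss {loss:.1%}: ~{mbps:.2f} Mbit/s")
```

With these assumed inputs, a 100 ms round trip and 1% loss give roughly 1.4 Mbit/s per flow, regardless of how fast the underlying channel is.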

Real-time applications that run over UDP, such as telephony or video conferencing, do not provide a retransmission mechanism, nor would one make sense for them. If packets are lost, artifacts are inevitable: "croaking", "stuttering", and a picture that periodically breaks up.
It turns out that you can look at packet loss from a slightly different angle and solve it rather elegantly, as Silver Peak engineers did. Many of you have surely heard of coding methods that detect errors in data, some of which can also correct them: for example, ECC codes and Reed-Solomon codes, which reached the mass market with the arrival of CDs in the early 1980s. The general idea of such codes is that they introduce some redundancy, and the amount of redundancy can adapt to the current characteristics of the channel. Another, more illustrative example is RAID5 data protection on disk arrays, which sets aside one disk's worth of redundancy for every 3, 4, 5, or more data disks. In packet transmission, the packets play the role of the disks.
The trouble is that such technologies, collectively known as Forward Error Correction (FEC), are usually applied only at the physical layer of the data transmission channel. There they can do nothing about losses caused by network congestion, dynamic topology changes, etc. Silver Peak engineers implemented FEC at the level of the link between devices: any two Silver Peak devices set up their own “tunnel”, in which a number of redundant packets is maintained and adaptively tuned. A typical topology of a communication channel using this solution is shown in the following picture:

image

It shows how the device on the transmitting side generates a redundant packet, and how the device on the receiving side uses it to recreate another, lost packet.
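
The idea can be illustrated with the simplest possible scheme, XOR parity, which is the same trick RAID5 uses. This is only a sketch and not Silver Peak's actual algorithm, which is not described here; the function names and the 2-byte length prefix are hypothetical choices made for the example.

```python
from functools import reduce
from typing import List, Optional


def _frame(payload: bytes, size: int) -> bytes:
    """Prefix the payload with its 2-byte length and zero-pad to `size`."""
    return (len(payload).to_bytes(2, "big") + payload).ljust(size, b"\x00")


def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(payloads: List[bytes]) -> bytes:
    """Sender side: build one redundant packet for a group of packets."""
    size = 2 + max(len(p) for p in payloads)
    return reduce(_xor, (_frame(p, size) for p in payloads))


def recover(received: List[Optional[bytes]], parity: bytes) -> List[Optional[bytes]]:
    """Receiver side: rebuild at most one lost packet (marked as None)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity packet can repair only one loss")
    if not missing:
        return list(received)
    # XOR-ing the parity with every surviving frame leaves the lost frame.
    acc = parity
    for p in received:
        if p is not None:
            acc = _xor(acc, _frame(p, len(parity)))
    length = int.from_bytes(acc[:2], "big")
    restored = list(received)
    restored[missing[0]] = acc[2:2 + length]
    return restored


if __name__ == "__main__":
    group = [b"alpha", b"be", b"gamma-ray"]
    parity = make_parity(group)
    print(recover([b"alpha", None, b"gamma-ray"], parity))
```

One parity packet per group repairs any single loss inside that group without a retransmission; a real implementation would need stronger codes or smaller groups to survive bursts of several lost packets.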
To evaluate how effectively FEC eliminates packet loss, you can look at the time it takes to transfer a file over a network with a given percentage of packet loss. The following picture shows that even a small percentage of redundancy can speed up the file transfer several times over, and that stuttering and cubist-style pictures during a video conference can be forgotten:

image
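
Why a small amount of redundancy helps so much can be estimated with a back-of-the-envelope model. The sketch below assumes one parity packet per group of data packets and independent losses; both are simplifying assumptions for illustration (real losses tend to come in bursts), and the numbers are not taken from the picture above.

```python
def residual_loss(loss: float, group_size: int) -> float:
    """Loss rate left after single-parity FEC, under independent losses.

    A data packet stays lost only if it was dropped AND at least one of
    the other packets in its group was dropped too. The "others" are
    (group_size - 1) data packets plus 1 parity packet.
    """
    if not 0.0 <= loss <= 1.0:
        raise ValueError("loss must be between 0 and 1")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    others = group_size
    return loss * (1.0 - (1.0 - loss) ** others)


def overhead(group_size: int) -> float:
    """Extra traffic added by one parity packet per group."""
    return 1.0 / group_size


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(f"1 parity per {n:2d} packets: overhead {overhead(n):.0%}, "
              f"1% raw loss becomes {residual_loss(0.01, n):.3%}")
```

In this model, 10% of extra traffic brings a raw loss of 1% down to below 0.1%, which moves the connection back into the region where TCP throughput is barely affected.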
