The quality of data networks. Transport
In a previous article we discussed the basic quality metrics of networks and data transmission systems, and I promised to write about how it all works from the inside. The quality of the transmission medium and its characteristics were deliberately left out. I hope this article answers those questions.
I will start, perhaps, with the last point: the quality of the transmission medium. As mentioned above, the previous article said nothing about it, because transmission media and their characteristics differ greatly among themselves and depend on a huge number of factors; understanding all that diversity is a job for the relevant specialists. Radio as a data transmission medium is familiar to everyone. I remember that in the late 90s and early 00s such exotic transmission methods as atmospheric (free-space) laser links became especially popular with telecom operators. Depending on the manufacturer and configuration, they looked roughly as in the picture on the left (yes, almost like the light telephone from an amateur-radio childhood). Their advantages were that no permit from the GKRCh (the State Commission for Radio Frequencies) was required, the speeds were somewhat higher than those of a radio bridge, and there were modifications for carrying time-division channels (E1 and the like), while comparable radio equipment was prohibitively expensive. Why not optical cable? Because in those happy times of the wild ISP market fiber was still quite expensive, and a media converter or active equipment that could terminate an optical link directly cost a small (or, for some, a large) bar of gold. There were satellite channels, but those belonged to the realm of science fiction: only companies in the oil sector and similar pillars of national welfare could afford them. In any case, a satellite channel comes down to radio transmission, with all the ensuing consequences plus a huge added delay.
So, diving into the question, we end up with many media and no single generalized characteristic. Nevertheless, for us the medium is just a transport that carries information from point A to point B, and for a transport (even a public one) the quality characteristic is delivering all bits (or passengers) without distortion or loss (you would not want to lose a body or two in transit, agreed). Thus we arrive at a generalized transport-quality metric: the number of bit errors, or BER (bit error rate). In pure packet networks it is hardly used, because transmission errors are detected at the packet level by checksums: the FCS (Frame Check Sequence) at L2 or the IP header checksum at L3. If the checksum does not match, the entire packet is discarded as invalid. In heterogeneous networks, however, where the transport may be a non-packet network, for example one of the options described above, or transit through ATM, PDH, SDH and the like without direct packet transmission (but with frame recovery), bit errors in the transport can matter a great deal, depending on the technology. Consider the encapsulation and transmission of an Ethernet frame in HDLC; other technologies use much the same technique.
The diagram is read from left to right.
- Some node in network A sends a packet toward some node in network B
- The transport between the networks is built on a PDH network
- The node at the egress boundary of network A cuts the payload area out of the Ethernet frame (the fields from Destination Address to FCS inclusive), wraps it in HDLC headers, and sends it to the ingress boundary node of network B
- The ingress boundary node of network B identifies the payload area and restores the Ethernet frame
- The frame is sent from the boundary node to the recipient.
As you can see, in this case the checksum is carried through intact, and if the bit stream is damaged in transit, the restored packet with an incorrect FCS will be discarded by the receiver. The error detection mechanism is therefore preserved.
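This pass-through case can be sketched in a few lines of Python (using `zlib.crc32` as a stand-in for the Ethernet FCS, which is also a CRC-32; the payload bytes here are, of course, made up):

```python
import zlib

def fcs(frame_body: bytes) -> int:
    """Ethernet FCS is a CRC-32 computed over the frame
    from Destination Address through the payload."""
    return zlib.crc32(frame_body) & 0xFFFFFFFF

payload = b"payload carried intact through PDH/HDLC"
sent_fcs = fcs(payload)            # computed by the original sender

# A single bit flips somewhere inside the transport...
corrupted = bytearray(payload)
corrupted[5] ^= 0b00001000
received = bytes(corrupted)

# ...and the final receiver, recomputing the FCS over what actually
# arrived, sees a mismatch and discards the frame.
assert fcs(received) != sent_fcs
```

Because the original FCS travels end to end, any single-bit corruption along the way is guaranteed to be caught.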
But the encapsulation overhead is not always acceptable; sometimes not even a full frame is transmitted, only the payload field. That is, a region is cut out, wrapped in an internal protocol, and on the other side the missing data, including the missing L2 headers, is restored. The FCS accordingly disappears: it is simply computed anew. So if the data was damaged in transit and the FCS was computed over the already corrupted data, the receiver accepts a packet that is not the one that was sent to it. This approach is quite common in satellite communications, to raise the useful utilization of the channel by not transferring conditionally "unnecessary" information. Summarizing, the BER metric is interesting in cases where:
- the stability of the physical channel must be verified; for optics, for example, it is 10^-12 (mentioned in IEEE 802.3)
- Ethernet frames are packaged in SDH (GFP), PDH, ATM, and other transport networks.
- xDSL technologies are used, with PPP protocols carrying the IP packets
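The dangerous case described above, the FCS being recomputed over already-corrupted data, can be shown with the same stand-in CRC:

```python
import zlib

def fcs(data: bytes) -> int:
    # CRC-32 as a stand-in for the Ethernet FCS
    return zlib.crc32(data) & 0xFFFFFFFF

original = b"only the payload crosses the satellite hop"

# One bit is damaged in transit, *before* the frame is rebuilt...
damaged = bytearray(original)
damaged[0] ^= 0x01
damaged = bytes(damaged)

# ...so the far-side node restores the L2 headers and computes a
# fresh FCS over data that is already corrupted.
rebuilt_fcs = fcs(damaged)

# The receiver's FCS check passes, yet the delivered data is wrong.
assert rebuilt_fcs == fcs(damaged)
assert damaged != original
```

The frame looks perfectly valid to every packet-level check downstream; only a bit-level test can expose such a channel.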
The metric itself is simple: the ratio of the number of bit errors to the total number of transmitted bits. The measurement technique for TDM networks is defined in ITU-T G.821. Classically, a level-1 BERT (BER test) is used to check channels, but given the specifics of packet encapsulation protocols and the principles of packet networks, tests must be possible at L1-L4; this is covered in more detail a little further on. Now we should decide what to check and how. The question "what to check?" is answered by ITU-T O.150, whose clause 5 describes the types of PRBS (pseudo-random bit sequences) whose data is simply used to form the packet. That is, you take the selected PRBS and fill the corresponding packet layer with its data. Our devices use the following sequences:
- PRBS 2^9-1 (ITU-T O.150 clause 5.1)
- PRBS 2^11-1 (ITU-T O.150 clause 5.2)
- PRBS 2^15-1 (ITU-T O.150 clause 5.3)
- PRBS 2^23-1 (ITU-T O.150 clause 5.6)
- PRBS 2^31-1 (ITU-T O.150 clause 5.8)
- a user-defined sequence (32 bits)
- all zeros
- all ones
- alternating sequence (01010101)
The user-defined sequence exists for compatibility with devices already on the market: you can specify any sequence and run a joint test.
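As an illustration, the shortest of these sequences, PRBS 2^9-1 (polynomial x^9 + x^5 + 1 per ITU-T O.150 clause 5.1), can be modeled in software with a simple LFSR; the FPGA implements essentially the same shift register in hardware:

```python
def prbs9(nbits: int = 1022, seed: int = 0x1FF) -> list[int]:
    """PRBS 2^9-1 (ITU-T O.150 clause 5.1): a 9-stage LFSR with
    feedback taps at stages 9 and 5 (polynomial x^9 + x^5 + 1)."""
    state = seed & 0x1FF                        # any non-zero 9-bit value
    out = []
    for _ in range(nbits):
        out.append((state >> 8) & 1)            # output the last stage
        feedback = ((state >> 8) ^ (state >> 4)) & 1
        state = ((state << 1) | feedback) & 0x1FF
    return out

bits = prbs9()
assert bits[:511] == bits[511:]     # the period is 2^9 - 1 = 511
assert sum(bits[:511]) == 256       # an m-sequence has 2^8 ones per period
```

The longer sequences differ only in register length and tap positions; the hardware cost is a handful of flip-flops and one XOR gate.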
The question of how to check is still open, so let's figure it out. Suppose we can generate the required packets. If we send such a packet to the far end of the transport, how do we know it has not changed (forget the packet machinery for a moment, since we may have no FCS or other checks, as described earlier)? The simplest option is to turn the packet around (in TDM this is called "making a loop"; in Ethernet, installing a loopback). In many cases the loop can be made at the channel output without touching the transmission medium, i.e. you really can put a loop on the E1 output and everything will work. But since the data travels the path twice, the probability of an error also doubles, and channels can be asymmetric or unidirectional. Ideally, then, the receiver should know the correct sequence and compare incoming packets against it. The first and simplest option, applicable when both channel ends are side by side (possible with TDM switching, or when testing an optical "ring"), is for one port of the device to generate the test traffic and another port of the same device to receive and compare it; since comparison happens on the same node as generation, there is no problem obtaining the reference data. The second option is to restore the original sequence and compare it with the incoming data. With a completely random sequence this is impossible, but with a pseudo-random one it is entirely feasible: some time is needed to synchronize at the very start of the test, after which comparison is straightforward. Since the PRBS of the first device and the PRBS of the second are known and identical, synchronization reduces to finding the starting point of the comparison within the second device's PRBS. Hence the following topologies exist:
- "On itself" 1 - one device on one port, at the other end of the transport there is a loop
- "On itself" 2 - one device from one port of its port to another port
- from one device to another device, with synchronization
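The second option, synchronizing a local generator to the incoming sequence and then comparing, is easy to model for PRBS 2^9-1. Each output bit satisfies the recurrence b[n] = b[n-9] XOR b[n-5], so nine clean received bits are enough to lock on (a sketch; a real device also has to cope with errors during the sync phase itself):

```python
def prbs9_stream(seed: int = 0x1FF):
    """Endless PRBS 2^9-1 bit stream (x^9 + x^5 + 1)."""
    state = seed & 0x1FF
    while True:
        yield (state >> 8) & 1
        feedback = ((state >> 8) ^ (state >> 4)) & 1
        state = ((state << 1) | feedback) & 0x1FF

def count_bit_errors(received, nbits: int = 1000) -> int:
    """Sync to the incoming PRBS from its first 9 bits, then count
    mismatches over the next `nbits` bits."""
    history = [next(received) for _ in range(9)]   # sync phase
    errors = 0
    for _ in range(nbits):
        expected = history[-9] ^ history[-5]       # b[n] = b[n-9]^b[n-5]
        errors += expected ^ next(received)
        history = history[1:] + [expected]         # track the reference
    return errors

# A clean channel yields zero errors; three flipped bits yield three.
assert count_bit_errors(prbs9_stream()) == 0
noisy = (b ^ (i in (100, 200, 300)) for i, b in enumerate(prbs9_stream()))
assert count_bit_errors(noisy) == 3
```

Note that after synchronization the comparator tracks its own reference, not the received bits, so a single bit error counts exactly once and does not propagate.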
Now consider, step by step, what happens when a frame is corrupted in transit:
1. an Ethernet frame containing the PRBS data is formed
2. the FCS is computed for the frame and it is placed in the output buffer
3. the frame is sent over the network to the other device
4. for some reason, a single bit inside the packet is flipped
5. the recipient receives the packet
6. the FCS of the received packet does not match its contents
7. the packet is discarded (if there is, say, a switch between the sender and the receiver, the "broken" packet will not reach the receiver at all: it will be destroyed before it arrives)
8. the sender forms the next packet (everything starts again from step 1)
In the example above, after step 8 synchronization will fail on the receiver side. This happens because the sender takes the next block of the PRBS while the receiver compares against the block that was lost in the previous cycle (it knows nothing about the loss). The loss of synchronization leads to an unreasonably large jump in bit errors, because the newly arriving blocks no longer match at all: in a single packet the bit-error count can grow by the size of the frame. After some time an attempt to restore synchronization will be made, but the accumulated bit-error count will already be badly wrong.
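The effect is easy to reproduce in a toy model (a sketch under the assumption that the receiver's reference generator free-runs with no re-synchronization; the block and frame sizes are arbitrary):

```python
def prbs9_stream(seed: int = 0x1FF):
    """PRBS 2^9-1 bit stream (x^9 + x^5 + 1)."""
    state = seed & 0x1FF
    while True:
        yield (state >> 8) & 1
        feedback = ((state >> 8) ^ (state >> 4)) & 1
        state = ((state << 1) | feedback) & 0x1FF

def take(gen, n):
    return [next(gen) for _ in range(n)]

BLOCK = 64                            # bits of PRBS per frame
tx = prbs9_stream()
frames = [take(tx, BLOCK) for _ in range(10)]
del frames[3]                         # one frame is lost in transit

ref = prbs9_stream()                  # receiver's reference, no resync
errors = 0
for frame in frames:
    errors += sum(a ^ b for a, b in zip(frame, take(ref, BLOCK)))

# Not a single bit was corrupted, yet after the lost frame every
# comparison runs at the wrong offset in the sequence, so roughly
# half of the remaining 384 bits are counted as "errors".
assert errors > 100
```

This is exactly why a practical BERT must detect the loss of sync and re-synchronize rather than keep accumulating meaningless errors.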
And what about the hardware?
I can't speak for others, but on our Bercut devices (ET, ETX, ETL, B100, as well as the B5-GBE module for MMT) things are as follows. Following the principle from the first article of generating and analyzing traffic as close to the physical layer as possible, all such tasks are assigned to the FPGA. A simplified block diagram looks like this:
The MAC core is represented by two blocks, one for reception and one for transmission. This allows packets to be received and sent independently: the transmit queue and the receive queue do not interfere with each other. It also makes it possible to keep overall statistics on received and sent traffic in the two independent blocks, regardless of the test type. Data from the transmit block goes to the transceiver and on to the network, and incoming data from the transceiver goes to the receive block.
Since some test topologies require loop functionality (loopback), it is implemented as a separate block. A loop can be installed at any of levels L1-L4:
- L1 - simply turns the traffic around (this happens in the transceiver)
- L2 - swaps DstMAC <-> SrcMAC and recomputes the FCS
- L3 - swaps DstMAC <-> SrcMAC and DstIP <-> SrcIP and recomputes the FCS
- L4 - swaps DstMAC <-> SrcMAC, DstIP <-> SrcIP, and DstPort <-> SrcPort and recomputes the FCS
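In software terms, the L3 loop can be sketched as follows (assuming an untagged IPv4 frame; the offsets would shift with VLAN tags, and in the real device the FCS is recomputed by the MAC on transmission):

```python
def l3_loopback(frame: bytes) -> bytes:
    """Model of an L3 loop: swap DstMAC <-> SrcMAC (bytes 0-11) and
    DstIP <-> SrcIP of an untagged IPv4 Ethernet frame."""
    out = bytearray(frame)
    out[0:6], out[6:12] = frame[6:12], frame[0:6]        # MAC addresses
    out[26:30], out[30:34] = frame[30:34], frame[26:30]  # IPv4 addresses
    return bytes(out)

# DstMAC aa..aa, SrcMAC bb..bb, EtherType 0x0800, then an IPv4 header
# with source IP 1.2.3.4 and destination IP 5.6.7.8.
frame = (b"\xaa" * 6 + b"\xbb" * 6 + b"\x08\x00"
         + bytes(12) + bytes([1, 2, 3, 4]) + bytes([5, 6, 7, 8]))
looped = l3_loopback(frame)
assert looped[0:6] == b"\xbb" * 6            # MACs swapped
assert looped[26:30] == bytes([5, 6, 7, 8])  # IPs swapped
```

Conveniently, swapping the two IP addresses does not invalidate the IPv4 header checksum, since that checksum is a sum of 16-bit words whose order does not matter; only the FCS has to be recomputed.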
Packet statistics are also maintained for the loopback mode, which allows a rough estimate of the ratio of sent and received packets.
The generator module differs for each type of test; for BERT it contains a PRBS generator for all the sequence types listed above.
It works as follows. The PRBS generator feeds data to the multiplexer (in other words, a switch), which, unless some other channel is currently enabled, directs the stream to the MAC tx module. MAC tx, in accordance with the test settings (BERT level, packet size, field data), forms a valid Ethernet frame from the PRBS and hands it to the transceiver, which in turn sends it to the network. Depending on the test topology, the frame is either turned around by the remote side or analyzed; in either case, the initial processing of the packet is the same. The frame hits the MAC rx core, which passes it to the multiplexer. The multiplexer, depending on the device's operating mode, sends the packet either to the Loopback module, from where, after processing, it goes straight back to MAC tx for transmission, or to the test's processing and statistics module, where, if necessary, PRBS synchronization is attempted and the original sequence is compared with the received one. The processing results go to the statistics output module.
Using an FPGA or ASIC allows all these operations to run in parallel, which introduces no processing delays and eliminates interference between the processing modules.
Despite the apparent simplicity of the algorithms and techniques, years of serious research stand behind them. A huge number of factors affect both the accuracy of the measurements and the cost of the devices (precision components, high-speed FPGAs). The BER test described above, for example, is not especially complex algorithmically, but developing a viable model requires knowledge of mathematics, computer science, and information theory, and adapting the BER test to packet networks (L2-L4 support) requires a deep understanding of the principles of switching and routing. I hope articles like this are interesting and useful. In future publications I plan to write about certified tests, traffic generators, filters, and analytic systems.
“And we will do it. Not because it's easy, but because it's hard. ”
P.S. Ask questions and suggest topics; within our competence, we are ready for anything :)