High-Speed File Transfer Protocol - Aspera FASP



    Today, with the Internet everywhere and media content whose size in HD quality can reach several gigabytes, high-speed file transfer over the network has become a pressing problem. As an example, consider a news television studio: a reporter on another continent must quickly deliver a high-quality report to the central studio for processing and broadcasting. Transmission speed is key here, since news is no longer news if it arrives a couple of days late.

    Or take a floating drilling rig that has only a satellite link and must send a cube of geophysical well data to high-performance computing centers for interpretation; every day of delay can mean losses.

    The first thing that comes to mind is to use the FTP protocol. And indeed, FTP, which runs on top of the Transmission Control Protocol (TCP), will do just fine if you need to transfer files over short distances or over a “good” network. But in the examples above, FTP will be very slow because TCP performs poorly on networks with delays and packet loss. Simply widening the channel does not solve the problem either: the expensive bandwidth remains underused.

    The Aspera FASP protocol is free of these TCP drawbacks: it utilizes whatever channel is available as fully as possible, increasing transfer speed over FTP by hundreds of times. Before Aspera appeared, data was written to hard drives or tapes and shipped by courier, which is slow and carries the risk of data loss.

    FASP is a high-speed file transfer protocol with guaranteed delivery, developed and patented by Aspera (www.asperasoft.com). In 2014 IBM acquired Aspera, and the solution became part of IBM's cloud division.

    The abbreviation FASP stands for Fast, Adaptive, Secure Protocol. Fast, because it effectively uses all the available bandwidth; adaptive, because it adapts to the state of the network and is “friendly” to other traffic; and secure, because it uses encryption both in transit and at rest.

    To understand the benefits of FASP, let's recall how TCP works. In the 1970s, researchers were tasked with building a network that could survive a nuclear strike, i.e. a protocol that could transfer data reliably. So when TCP was created, the main effort went into a mechanism for reliable, rather than high-speed, transmission. There were no mobile or satellite networks in those years, and the only transatlantic channel from the USA to Europe ran at 64 Kbps, which says a lot about the state of the technology of that period.

    In practice, TCP's transmission rate turns out to be inversely proportional to the distance between the endpoints. In addition, when packets are lost, TCP treats the channel as congested and reduces the transmission rate on its own. TCP performance therefore drops as the transmission distance grows and as network quality deteriorates. The greater the distance, the greater the delay, and the lower the transmission speed. Delay is usually measured as Round-Trip Time (RTT): the time it takes to send a packet and receive an acknowledgment from the recipient. Delay is unavoidable, since the propagation of light or any electromagnetic signal is limited by the laws of physics; on satellite links, for example, it can reach 800 ms. Moreover, when data travels long distances across the global Internet (WAN), a packet passes through many routers before reaching the recipient. Each router needs time to process the packet, and if a router is misconfigured or overloaded, packets may be dropped. The more packets are lost, the slower the transfer becomes. Figure 1 shows that TCP performs well on a local area network (LAN) relative to the available bandwidth, but the higher the RTT and packet loss, the lower the transmission performance.

    Nor does TCP performance grow when the channel is widened. In other words, if a transfer is slow on a 10 Mbit/s channel, there is no guarantee that upgrading to 1 Gbit/s will make it faster. If you only need to move a file to the next street, the gain will be noticeable, but for transfers over long distances a 1 Gbit/s channel will most likely not help much.
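
    To see why, here is a rough, illustrative calculation (not from the article): with a fixed receive window, TCP throughput cannot exceed the window size divided by the RTT, no matter how wide the link is. The 64 KB window below is an assumed, common default without window scaling.

```python
# Illustrative only: with a fixed window, TCP throughput is capped at
# window / RTT, regardless of the link capacity underneath.
def tcp_window_ceiling_mbit(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on TCP throughput, in Mbit/s, for a given window and RTT."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6

# An assumed 64 KB receive window (no window scaling):
print(tcp_window_ceiling_mbit(64 * 1024, 150))  # ~3.5 Mbit/s over a 150 ms WAN path
print(tcp_window_ceiling_mbit(64 * 1024, 5))    # ~105 Mbit/s over a nearby 5 ms path
```

    The ceiling is the same whether the physical link is 100 Mbit/s or 1 Gbit/s, which is exactly why widening the channel alone does not speed up a long-distance TCP transfer.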


    Figure 1 TCP performance

    TCP is a connection-oriented (session) protocol: it establishes a connection before transmitting, as shown in Figure 2, and every packet sent must be confirmed with an ACK packet. The delay (RTT) is the time between sending a packet and receiving its acknowledgment (ACK).


    Figure 2 TCP packet exchange

    The number of packets TCP can have in flight at any moment is determined by a mechanism called the TCP sliding window. The window size is governed by the Additive Increase Multiplicative Decrease (AIMD) algorithm and by flow control, which limits the rate at which packets are sent. Thanks to this, TCP can “be sure” that packets are not sent faster than the receiver can accept them. As Figure 3 shows, the window lets the sender transmit several packets at once, but if acknowledgments are not received, TCP blocks the transmission of any further packets.


    Figure 3 TCP Sliding Window

    When a packet or its acknowledgment is lost, the sender, following the AIMD algorithm, cuts the TCP window in half (or collapses it almost entirely after a timeout). Not receiving acknowledgments in time, and therefore shrinking the window, happens most often on long-distance transfers, i.e. on networks with large RTT. If packet loss occurs on top of that, the transmission rate drops to almost zero. Plotted over time, TCP's transmission rate forms the well-known “sawtooth” pattern.
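
    A minimal sketch of this additive-increase/multiplicative-decrease behaviour; the parameters (grow by one segment per round trip, halve on loss, halve when the link limit is exceeded) are simplified assumptions for illustration, not TCP's exact implementation.

```python
import random

def aimd_sawtooth(rtts: int = 200, link_limit: float = 100.0,
                  loss_prob: float = 0.02, seed: int = 1) -> list[float]:
    """Toy AIMD trace: window size (in segments) per round trip.

    Assumed behaviour: +1 segment per RTT, halve on a random loss or when the
    window exceeds the link limit. A sketch, not TCP's exact algorithm.
    """
    random.seed(seed)
    window, trace = 1.0, []
    for _ in range(rtts):
        if random.random() < loss_prob or window > link_limit:
            window = max(window / 2, 1.0)   # multiplicative decrease
        else:
            window += 1.0                   # additive increase
        trace.append(window)
    return trace

print(aimd_sawtooth()[:20])  # the rise-then-halve "sawtooth" shows up in the numbers
```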


    Figure 4 TCP AIMD “Sawtooth” Pattern

    As Figure 4 shows, because TCP uses the AIMD congestion-avoidance algorithm, the transmission rate grows until “congestion” occurs, then drops sharply, and this “sawtooth” cycle repeats endlessly, never letting TCP fully utilize the available channel. There are online tools for estimating the effective TCP transfer rate (http://asperasoft.com/performance_calculator/). For example, on a network with 100 Mbit/s of bandwidth, RTT = 150 ms and 1.5% packet loss, the transfer rate will be under 1 Mbit/s, and it does not matter whether the channel is 100 Mbit/s or 1 Gbit/s, because the TCP transmission rate depends on RTT and packet loss.
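
    The calculator's exact model is not given in the article, but the classic Mathis approximation (throughput ≈ MSS / (RTT · √p)) yields a figure of the same order for this example; the 1460-byte segment size below is an assumption.

```python
from math import sqrt

def tcp_loss_limited_mbit(mss_bytes: int, rtt_ms: float, loss: float) -> float:
    """Mathis approximation: TCP throughput ~ MSS / (RTT * sqrt(p)), in Mbit/s."""
    return (mss_bytes * 8) / ((rtt_ms / 1000) * sqrt(loss)) / 1e6

# The article's example: RTT = 150 ms, 1.5% loss, assumed 1460-byte segments.
print(tcp_loss_limited_mbit(1460, 150, 0.015))  # ~0.64 Mbit/s, i.e. under 1 Mbit/s
```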

    FASP Protocol


    The Aspera FASP transport protocol, unlike TCP, performs well on any network while providing guaranteed delivery. It efficiently transfers data of unlimited volume over the global Internet, satellite and mobile links, and its efficiency does not degrade on networks with high RTT and packet loss. The protocol provides maximum speed and congestion avoidance, as well as control over transmission policies, security and reliability.


    Figure 5 FASP Performance

    On a local network the difference between TCP and FASP is insignificant, but as soon as RTT and packet loss between the two endpoints grow, FASP performs far better (Figure 5). FASP transfers data faster by making the most of the available channel. There is no restriction on the theoretical maximum transfer rate; in practice it is more likely to be limited by hardware resources, such as the performance of the disk subsystem. A TCP transfer shows the “sawtooth” graph, which you will never see with FASP: the transfer rate reaches the specified level (Target Rate) and holds it regardless of lost packets. The lost data is, of course, retransmitted, but this does not affect transfer performance.

    The FASP protocol is built on the User Datagram Protocol (UDP) at the transport layer of the OSI model, and since UDP does not guarantee delivery, congestion avoidance and transmission control are implemented at the application layer.

    FASP does not use the TCP window mechanism and does not depend on the transmission distance. The transfer starts at a given rate and then sends data at a rate computed by a mathematical algorithm. FASP uses negative acknowledgments (NACK): if a packet is lost, the recipient reports that it did not receive that packet, while the sender keeps transmitting. Unlike TCP, FASP does not wait for an acknowledgment before sending the next data. Since for file transfer the order in which packets arrive does not matter, the transfer does not stall on packet loss the way a TCP transfer does; data keeps flowing efficiently at high speed. The sender retransmits only the lost packet, not the entire packet window as in TCP, without stopping the current transmission.
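
    A minimal in-process sketch of this negative-acknowledgment idea, purely for illustration (it is not Aspera's wire protocol): the receiver reports which blocks are missing, and only those are re-sent while the stream keeps flowing.

```python
import random

def transfer_with_nack(num_blocks: int = 1000, loss_prob: float = 0.02,
                       seed: int = 7) -> int:
    """Simulate a NACK-driven transfer: only blocks the receiver reports as
    missing are re-sent; the stream never stops to wait for acknowledgments.
    Returns the total number of block transmissions (originals + retransmits)."""
    random.seed(seed)
    received, sent = set(), 0

    pending = list(range(num_blocks))   # first pass: stream every block once
    while pending:
        nacked = []
        for block in pending:
            sent += 1
            if random.random() < loss_prob:
                nacked.append(block)    # receiver will report this block as missing
            else:
                received.add(block)
        pending = nacked                # re-send only what was NACKed

    assert len(received) == num_blocks
    return sent

print(transfer_with_nack())  # roughly 1020 transmissions for 1000 blocks at 2% loss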


    Figure 6 Aspera and TCP performance comparison utility

    Figure 6 shows a comparison of transferring a 100 GB file over a network with 100 Mbit/s of bandwidth, an RTT of 150 ms and 2% packet loss. As you can see, the effective transfer rate is 98 Mbit/s with Aspera FASP versus 0.7 Mbit/s with TCP.
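
    A quick back-of-envelope check of what those rates mean for a 100 GB file (assuming decimal gigabytes and effective rates in megabits per second):

```python
def transfer_hours(size_gb: float, rate_mbit_s: float) -> float:
    """Hours needed to move size_gb gigabytes at rate_mbit_s megabits per second."""
    return (size_gb * 8 * 1000) / rate_mbit_s / 3600

print(round(transfer_hours(100, 98), 1))   # ~2.3 hours with FASP
print(round(transfer_hours(100, 0.7)))     # ~318 hours, about 13 days, with TCP
```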

    By discarding TCP's redundant rate-control and reliability algorithms, FASP does not have to slow down when packets are lost. Lost data is retransmitted at the channel rate or at a predefined rate, without redundant duplicate transmissions.

    The available channel bandwidth is determined by FASP's rate-control mechanism. The adaptive rate control in FASP constantly sends probe packets, which are used to measure the so-called queuing delay. When a packet arrives at a router, it has to be processed and forwarded; since the router handles one packet per unit of time, a packet that arrives before the router is ready is placed in a queue, and this creates queuing delay. FASP uses the measured queuing delay as the main indicator of congestion in the network (or in the disk subsystem). The network is probed continuously: the transfer rate is increased while the queuing delay is below the target value (a sign that the channel is not fully utilized and the rate can grow) and decreased when the queuing delay rises above the specified level (a sign that the channel is fully utilized and congestion is possible). By constantly sending probe packets, FASP obtains accurate queuing-delay measurements along the entire transmission path. When the queuing delay grows, the FASP session reduces its rate in proportion to the difference between the current and target delay, so the network is not overloaded. When the load drops, the FASP session quickly raises its rate in proportion to the target delay, utilizing almost 100% of the available channel bandwidth.
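
    A toy sketch of such proportional, delay-based rate adjustment; the target delay, gain and update rule below are assumptions for illustration, not Aspera's published algorithm.

```python
def adjust_rate(rate_mbit: float, queue_delay_ms: float,
                target_delay_ms: float = 10.0, gain: float = 0.05,
                min_rate: float = 1.0, max_rate: float = 1000.0) -> float:
    """One control step: speed up when the measured queuing delay is below the
    target, slow down in proportion to the overshoot when it is above."""
    error = target_delay_ms - queue_delay_ms            # positive => spare capacity
    new_rate = rate_mbit * (1 + gain * error / target_delay_ms)
    return max(min_rate, min(max_rate, new_rate))

rate = 100.0
for delay in [2, 2, 5, 12, 25, 8, 3]:                   # measured queuing delays, ms
    rate = adjust_rate(rate, delay)
    print(f"delay={delay:>2} ms -> rate={rate:6.1f} Mbit/s")
```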

    In addition to utilizing the channel efficiently, the delay-based rate control lets FASP share bandwidth fairly with other TCP traffic and assign different priorities to individual transfers.

    Alternative Technologies and FASP


    Today there are various technologies for accelerating data transfer over the WAN, for example compression (e.g. Riverbed, Silverpeak). The idea is to compress files on the sender's side and decompress them on the recipient's side. Strictly speaking, this does not accelerate the transfer itself; it simply sends less data over the WAN. And there are scenarios where files cannot be compressed, for instance when they are already archived or encrypted, in which case there is no performance gain at all.
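
    A quick way to see this effect, using zlib as a stand-in for whatever compression such an appliance applies: repetitive data shrinks dramatically, while random bytes (a proxy for encrypted or already-archived content) hardly shrink at all.

```python
import os
import zlib

text = b"log line: request ok\n" * 50_000   # highly repetitive payload
random_data = os.urandom(len(text))         # stands in for encrypted/archived data

for name, payload in [("repetitive text", text), ("random bytes", random_data)]:
    compressed = zlib.compress(payload, level=6)
    print(f"{name:>16}: {len(payload)} -> {len(compressed)} bytes "
          f"({len(compressed) / len(payload):.0%})")
```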

    Another technology is traffic shaping, which assigns different priorities to different types of traffic. For example, it makes sense to give high priority to the traffic of applications used by the sales team, since that traffic is business-critical; traffic from the SAP system or e-mail can get medium priority, and web browsing low priority. This lets you tune the quality of service for specific tasks, but it has nothing to do with speeding up data transfer.
    Yet another acceleration method is caching: data that is transferred frequently is kept in a cache. A hardware-software appliance on the sender's side asks the appliance on the recipient's side whether the transmitted file, or part of it, is already present, and if so the file is pulled from the cache instead of being transferred. The drawbacks are the cost of the appliances, which must be installed at every location, and the “all or nothing” approach: the method does not help if the object is not in the cache or has changed.

    FASP is positioned differently from the methods listed above: the protocol utilizes the available bandwidth as efficiently as possible and reliably transfers data at the highest achievable speed.

    To summarize, here are the main technological advantages of Aspera:
    • Ability to send very large files (from 500 GB to several TB). Aspera customers regularly send files larger than 1 TB to both cloud and on-premises storage, transferring more than 10 TB in a single session.
    • Transfers over long distances on networks with high latency (round-trip time) and packet loss. For example, BGI transferred genomic data between the United States and China at 10 Gbit/s (http://phys.org/news/2012-06-bgi-genomic-gigabits-china.html).
    • Guaranteed and predictable delivery of files of unlimited size over long distances at maximum speed.
