AVrublev October 20, 2014 at 14:59

Specifics of mitigating DDoS attacks and the story of attacks on one large bank

Previously, a DDoS attack could be fought off successfully on the side of the attacked company's data center. Quick, smart action by the administrator and good filtering hardware guaranteed sufficient protection. Today botnet services are rapidly getting cheaper, and DDoS is becoming affordable even for small businesses.

About half of all attacks target online stores or companies' commercial sites, in the spirit of “take down the competitor”. Media sites are attacked almost constantly, especially after “hot” publications, and government services are hit much more often than it seems. In Russia, the main targets are banks, retail, the media and tender platforms. One of the largest Russian banks, for example, periodically blocks traffic from China: attacks from there arrive with enviable regularity, and one of the most recent exceeded 100 Gbps.

Accordingly, once an attack exceeds, say, 10 Gbps, mitigating it on your own side becomes problematic simply because the link gets saturated. At that moment you need to switch over to a traffic scrubbing center, so that all the “bad” traffic is discarded elsewhere, close to the backbone links, and never reaches you. Below I describe how this works for one of our security equipment vendors, Arbor, which monitors about 90 Tbps (45% of global Internet traffic).

Attack scenario

First, the attacker selects targets within the infrastructure. Most “dumb” attacks go after HTTP and many go after DNS, but the main “smart” attacks are usually aimed at nodes inside the target infrastructure that were reconnoitered in advance. Often the goal of a DDoS is not so much denial of service as the opportunity to carry a “more serious threat” through the DMZ. This is frequently achieved by overloading perimeter protection systems such as firewalls, IPS/IDS and the like, which rely on stateful session inspection. For that reason, colleagues believe that any device with a state table (session table) should be treated as part of the infrastructure that itself needs protection.

Key points of attacks:

The attack mitigation scheme works as follows:

On the protected company's side there is a device that “closes off” the network. As soon as an attack begins, the device should recognize it as an anomaly using one of several mechanisms: built-in countermeasures, abnormal behavior of a traffic source, or a match against a known attack profile (a signature from the database).
If the device copes with the attack, work continues in a more or less normal mode: legitimate traffic is passed and illegitimate traffic is dropped. There are several “alarm levels”, which differ in the complexity of the attack they handle and in the possible loss of legitimate traffic.
If the communication link starts to saturate, traffic is redirected automatically using the standard Cloud Signaling protocol. Everything first arrives at the site of a large provider (for example Rostelecom, Orange, Transtelecom or Akado), where it is scrubbed by Peakflow SP equipment. The cleaned traffic then goes on to the end client. Meanwhile the client can see everything that is happening: they can log in to the telecom operator's customer portal and check the current scrubbing status, which countermeasures are active, how effectively the attack is being suppressed, and so on. The client device also shows the effectiveness of the ongoing scrubbing and a list of blocked hosts. If needed, a traffic dump in pcap format can easily be taken from both devices for a later post-mortem.
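
The three steps above boil down to a simple decision: pass clean traffic, mitigate locally while the link copes, and ask the provider for scrubbing once the link starts to saturate. The sketch below only illustrates that logic. The names `LinkSample`, `choose_action` and the 75% threshold are assumptions made for the example, not Arbor's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical threshold: the share of link capacity at which the on-premise
# device stops coping alone and asks the provider to scrub traffic.
ESCALATION_THRESHOLD = 0.75


@dataclass
class LinkSample:
    inbound_bps: float      # measured inbound traffic, bits per second
    capacity_bps: float     # capacity of the uplink, bits per second
    anomaly_detected: bool  # verdict of local detection (signature, behavior, ...)


def choose_action(sample: LinkSample) -> str:
    """Return 'pass', 'mitigate_locally' or 'request_cloud_scrubbing'."""
    if not sample.anomaly_detected:
        return "pass"
    utilization = sample.inbound_bps / sample.capacity_bps
    if utilization >= ESCALATION_THRESHOLD:
        # The link is close to saturation: local filtering can no longer help,
        # because the junk has already consumed the channel.
        return "request_cloud_scrubbing"
    return "mitigate_locally"
```

For example, a 2 Gbps anomaly on a 10 Gbps link would be handled locally in this model, while an 8 Gbps anomaly on the same link would be escalated to the provider.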

Realities that are not usually talked about

1. Loss of legitimate traffic.
Fighting a DDoS attack means dropping illegitimate packets and passing legitimate traffic, and the line between the two is sometimes very thin. Many vendors like to write in their brochures that such losses are close to zero. In my practice they can easily reach 2%, which means that, in theory, a director on a business trip may be unable to reach the corporate network from a hotel or a conference. It is therefore very important that the system lets you allow specific connections or protocols on the fly. With a number of vendors, once an attack has started the settings are effectively hardcoded into the hardware's firmware, and changing them is extremely difficult. With Arbor the situation is quite different. First, to control the false-positive rate there are three “alarm levels”, from “cut everything that is definitely not ours” to “there is room to dig into the details”. Second, there is a convenient search over blocked hosts and the ability to lift the block for a specific host, protocol or country in one click. Note that all prominent market players acknowledge that false positives are a reality in DDoS mitigation. Alexander Lyamin (Qrator) once noted: “anyone who says their false-positive rate is zero is a quack.”

2. Ability to signal an attack over a saturated communication link.
How can the client tell the provider about an attack if the link between them is saturated by the DDoS? Strictly speaking, even when utilization in the provider-to-client direction is close to 100%, there is a very good chance that the scrubbing request will reach the provider. This is because Cloud Signaling runs over UDP and, in addition, does not require a response from the provider. So it is enough to have at least a little capacity left in the client-to-operator direction. To be on the safe side, it is recommended to set up a separate channel for Cloud Signaling messages. In practice a backup link is usually provisioned anyway, and the alarm is raised at a threshold of about 70-80% link utilization.
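
The reason a one-way UDP message survives a saturated downlink can be shown with a minimal sketch: the sender emits a single datagram and never waits for a reply, so nothing depends on the congested provider-to-client direction. The message layout and the function name `send_scrubbing_request` are made up for illustration. This is not the real Cloud Signaling wire format.

```python
import json
import socket


def send_scrubbing_request(provider_addr, client_id, utilization):
    """Send a one-way UDP datagram asking for scrubbing; do not wait for a reply.

    provider_addr is a (host, port) tuple. The JSON payload is a hypothetical
    format used only for this example.
    """
    message = json.dumps({"client": client_id, "utilization": utilization}).encode()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # sendto() returns as soon as the datagram is handed to the network
        # stack: there is no handshake and no acknowledgement to wait for.
        sock.sendto(message, provider_addr)
    return message
```

The trade-off is that delivery is not guaranteed, which is one reason a separate or backup channel for such messages still makes sense.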

3. Time to switch to the scrubbing center.
Redirecting traffic to the scrubbing center of a DDoS protection service provider can take minutes or even tens of minutes. The delay depends mainly on the redirection mechanism (DNS records or BGP announcements) and on whether the decision to redirect is made manually or automatically. In any case, if the attack is suppressed in the “cloud”, a delay is unavoidable: at least a few minutes. That is why we follow the concept of multi-level protection, in which large attacks on communication links are suppressed by the operator, while slow, subtle application-level attacks are handled by equipment installed at the customer's site.

4. Use of SSL certificates.
Analyzing SSL/TLS traffic for application-level attacks is quite problematic, because every packet has to be decrypted. You face a difficult dilemma: either share your certificates with your service provider, or accept the risk that an attack at the HTTPS level will be missed. Vendors either try to find solutions that do not decrypt packets (which does not always work well), or use a dedicated SSL/TLS decryption module built into the client device.

In this case your certificates are loaded onto a device that is under your control, and the additional packet-processing delay is on the order of milliseconds.

5. Manual signature generation.
Most signatures are produced by the manufacturers of DDoS protection solutions. However, from time to time resources are hit by new types of attacks.

Depending on the vendor, there are two possible courses of action in this situation: request a new signature from the manufacturer, or create the signature yourself. In the first case you will most likely have to pay extra for the service and also spend a significant amount of time under attack while waiting for a solution.

In the second case, the principle “rescuing the drowning is the job of the drowning themselves” (or of their partners) applies. Here the capabilities of the equipment become critical: it must let you quickly identify, capture and analyze malicious traffic. The ability to generate a signature automatically from the captured information can be a lifesaver in a critical situation. For example, you can open a packet capture, select the relevant bit sequence (bit pattern) and, in a couple of clicks, create a signature that blocks traffic containing it.
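
The idea of turning a selected byte sequence into a blocking signature can be sketched in a few lines. This is a deliberately simplified model under stated assumptions: payloads are plain `bytes`, the signature is a substring match, and the names `make_signature` and `filter_packets` are hypothetical. Real equipment matches patterns at specific offsets and at line rate.

```python
def make_signature(pattern: bytes):
    """Build a predicate that flags any payload containing the byte pattern."""
    if not pattern:
        raise ValueError("pattern must not be empty")

    def matches(payload: bytes) -> bool:
        return pattern in payload

    return matches


def filter_packets(payloads, signature):
    """Split payloads into (passed, dropped) according to the signature."""
    passed, dropped = [], []
    for payload in payloads:
        (dropped if signature(payload) else passed).append(payload)
    return passed, dropped
```

A pattern that is too short or too generic will also match legitimate traffic, so a hand-made signature should be checked against normal traffic before it is enforced.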

Scale advantage

Arbor has a very interesting “trick” that lets it make very effective use of its market share. Its equipment is deployed at all Tier 1 telecom operators and providers and at most Tier 2 operators.

Arbor ranks first in the market for DDoS protection equipment across the Carrier, Enterprise and Mobile segments, with 65% of the total market (Infonetics Research, December 2013).

Information about roughly 90 Tbps of traffic passes through the ATLAS system. As soon as signs of an attack appear somewhere, devices start transmitting data about what is happening over their own communication network, and the information quickly spreads around the world. For example, if a small provider is being “knocked down”, its hardware signals one level up using a special protocol. If the higher-level operator has an agreement with the lower-level one, the signature is accepted and distributed to all subnets of the large provider.

Sensors (honeypots) of the ATLAS system are located at major nodes of the global Internet to detect and classify attacks, botnet activity and various malware. The information is sent to the ATLAS data center, where it is combined with data from Arbor Peakflow installations and other sources. The system automatically analyzes hundreds of thousands of code samples and allows a team of engineers to update signatures quickly for the company's customers around the world.

Connection Features

It is worth saying a few words about the hardware that is installed directly in your data center or at your site. It, too, comes with a number of realities that are not always mentioned.

1. Setup and configuration.
Typically a device needs time to profile traffic and evaluate network behavior. Some devices, however, are shipped in “combat” mode and work from the first second after being connected: they come with ready-made templates, and the device also receives data from its vendor's “cloud”. On the one hand, this is good for repelling an attack already in progress; on the other, if you are used to fine-tuning everything in your infrastructure, you will have to rely partly on the vendor's experience.

2. The protective hardware itself can become a point of failure.
Therefore, first, at critical sites it is duplicated, or several devices are assembled into a cluster (which in turn requires management devices). Second, such devices have hardware bypasses that connect the interfaces directly at the physical layer in an emergency such as a power outage or a software failure.

3. Protective hardware can also be the target of an attack.
This is especially true of a device with a state table (session table), for example one that combines DDoS protection, IPS, firewall and other functions. Every time a new session is opened, such a device allocates memory to track it, writes to the log and so on, and the smarter the device, the more work it does. Many sessions mean high CPU and memory utilization. This aspect deserves attention when choosing a solution.
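
Why a state table makes a device attackable can be seen in a toy model: every new session consumes a slot, and once a flood of bogus sessions has filled the table, a legitimate client can no longer be tracked. The `SessionTable` class, the table size and the addresses below are all assumptions chosen for the illustration; real devices also expire entries by timeout.

```python
class SessionTable:
    """Toy model of a fixed-size state table in a stateful device."""

    def __init__(self, max_sessions: int):
        self.max_sessions = max_sessions
        self.sessions = set()
        self.rejected = 0

    def open(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bool:
        """Track a new session; return False if the table is already full."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key in self.sessions:
            return True
        if len(self.sessions) >= self.max_sessions:
            self.rejected += 1
            return False
        self.sessions.add(key)
        return True


def flood(table: SessionTable, count: int) -> None:
    """Open `count` sessions from distinct (made-up) source addresses."""
    for i in range(count):
        src_ip = f"10.{(i >> 16) & 255}.{(i >> 8) & 255}.{i & 255}"
        table.open(src_ip, 40000, "192.0.2.10", 443)


table = SessionTable(max_sessions=1000)
flood(table, 5000)  # bogus sessions fill the table first
legit_ok = table.open("198.51.100.7", 51515, "192.0.2.10", 443)
print("legitimate session accepted:", legit_ok)
```

In this model the legitimate connection is refused even though the attack traffic itself is tiny, which is the point made above about stateful devices needing protection of their own.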

4. A network usually carries a lot of junk traffic, and large organizations also see regular spikes in activity that can pass for a “dumb” DDoS.
Clearly everything will eventually be profiled by geography, by the time of day the services are used and so on, but at the first stage it is important to be sure that connecting the device to the network will not kill the services. For this there is a mirror connection mode: the device is connected out of line and receives mirrored traffic, showing what it would reject and what it would pass. This lets you assess the impact before problems arise. I know customers whose DDoS protection has been running in this mode for a long time. Arbor believes that building a baseline is only one way of dealing with DDoS, and that specialized countermeasures work much better. Colleagues say the same about signatures: no matter how large the sensor network is and how much traffic it analyzes, you cannot rely on signatures alone.
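
Mirror mode is essentially a dry run: the filter produces verdicts but nothing is actually blocked, so you can count what would have been dropped and how much of it came from sources you know to be legitimate. The sketch below assumes packets are `(src_ip, payload)` tuples and that a list of known-good sources is available. The function `dry_run` and its arguments are hypothetical.

```python
from collections import Counter


def dry_run(packets, would_drop, known_good_sources):
    """Evaluate a filter on mirrored traffic without blocking anything.

    packets: iterable of (src_ip, payload) tuples.
    would_drop: callable (src_ip, payload) -> bool, the filter under test.
    known_good_sources: set of addresses known to be legitimate.
    """
    stats = Counter()
    for src_ip, payload in packets:
        verdict = "drop" if would_drop(src_ip, payload) else "pass"
        stats[verdict] += 1
        if verdict == "drop" and src_ip in known_good_sources:
            # A legitimate source would have been cut off: a false positive.
            stats["false_positive"] += 1
    return stats
```

Dividing the `false_positive` count by the total number of packets gives a rough estimate of the legitimate-traffic loss discussed earlier, before the device is ever placed in the traffic path.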

5. At the level of traffic exchange between providers, a rather unpleasant situation sometimes arises: a downstream network passes up an attack signature, and the higher-level provider says: “Yeah, interesting, but we won't scrub this on our side.”
The principle is very simple: as long as the link copes, all DDoS traffic is billed. It is not always in the upstream provider's interest to spend money on scrubbing when it can simply pass the traffic downstream, and at the victim's expense at that. Such situations are unpleasant for service providers, but they do not affect users of DDoS protection services in any way, because operators meet their obligations in full.

Reports

One of the most important things is understanding quickly what is happening. Let's look at the reports using two attacks on a large Russian bank as an example. The picture story below gives a sense of how the admins earned their gray hairs.

Attack of about 50 Gbps

This attack began with DNS amplification. Traffic recorded on the service provider's Arbor Peakflow TMS: up to 7.15 Gbps, 1.1 Mpps. Given that most of the attack was filtered with FlowSpec, the total attack traffic is estimated at 50 Gbps. The attack then continued as an HTTP flood.
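
The figures in this report allow a quick sanity check. Assuming that the peak bit rate and the peak packet rate were observed at the same moment, and taking the 50 Gbps estimate at face value, the sketch below derives the average packet size and the share of traffic that was removed before it reached the TMS. The helper names are made up for the example.

```python
def average_packet_size_bytes(rate_bps: float, packets_per_second: float) -> float:
    """Average packet size implied by a bit rate and a packet rate."""
    return rate_bps / 8 / packets_per_second


def share_filtered_upstream(total_bps: float, observed_bps: float) -> float:
    """Fraction of the estimated total that never reached the observation point."""
    return (total_bps - observed_bps) / total_bps


observed_bps = 7.15e9        # peak recorded on the provider's TMS (from the report)
observed_pps = 1.1e6         # peak packet rate (from the report)
estimated_total_bps = 50e9   # estimate of the whole attack (from the report)

avg_size = average_packet_size_bytes(observed_bps, observed_pps)
upstream_share = share_filtered_upstream(estimated_total_bps, observed_bps)
print(f"average packet size: about {avg_size:.0f} bytes")
print(f"share removed before the TMS: about {upstream_share:.0%}")
```

Under these assumptions the average packet is a little over 800 bytes, which fits an amplification attack built from large DNS responses, and roughly 86% of the estimated traffic was dropped upstream.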

Having detected abnormal traffic growth, the client device automatically asked the service provider for help in suppressing the attack.

The protection system installed at the operator relayed information about the progress of attack suppression to the Pravail APS in real time.

When the DNS amplification attack died down, a large HTTP flood followed.

In addition to the HTTP flood, a TCP SYN flood was also present.

As a result, most of the DNS amplification attack was suppressed with FlowSpec, the rest of it was suppressed by the operator's protection system, and the HTTP and TCP SYN floods were dropped on the Pravail APS.

Attack of about 125 Gbps

The first few minutes: an HTTP flood. A surge in the number of foreign hosts is visible (red graph): about 1,000 hosts hitting the domain of the “Large Russian Bank”. Main countries: USA (672), Germany (141), Great Britain (82), Italy (30), Netherlands (24), France (14).

The next surprise: NTP amplification, up to 125 Gbps.

After the NTP amplification failed: a SYN flood with spoofed IP addresses, up to 300 Mbps.

Summary

Attacks are ongoing, and if you haven't encountered one yet, it is only a matter of time. To illustrate, here are a few incidents in the Russian Federation:

The website of the CIS Interparliamentary Assembly was attacked
The CBR website was attacked
The Channel One website was attacked
The VGTRK website was attacked
The Kremlin website was attacked
VTB24 was attacked
The RIA website was attacked
The Lenta.ru website was attacked
The website of the newspaper Vedomosti was attacked

If you have questions, I will be happy to answer them here or by email at avrublevsky@croc.ru. Our division can also offer other solutions if you need a different vendor.

And tomorrow, October 21, we are holding a webinar on DDoS protection. Join us, and we will tell you what is happening on this front in Russia right now.
