Preventing participation in DNS amplification attacks, or an experience of writing kernel code

Introduction

In this article I want to talk about a fairly obvious, as it seems to me, method of filtering DNS amplification attacks, and about a small module that was written to implement the idea.

DNS amplification attacks have been written about more than once, for example here; many have faced them, many have fought them, some more successfully, some less. The attack is based on sending a DNS query to a DNS server with the source IP address set to the victim's address. The response from the DNS server is almost always larger than the request, especially considering that the attacker usually sends an ANY query. AAAA records are no longer a rarity, and SPF and other information sits in TXT records; all this makes it quite easy to get amplification of 5x or even more. For an attacker this looks very tempting: you can mount a decent DoS even without a large botnet. One can argue for a very long time about why IP address spoofing is still possible on the Internet, but the reality is that it is, so making it difficult to use your own DNS servers in such attacks seems an important task today. I also note that both authoritative DNS servers and public resolvers can be used in this attack; the proposed solution is applicable in both cases.
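A quick way to get a feel for the amplification factor of a particular server is dig, which prints the size of the received message at the end of its output (the ";; MSG SIZE  rcvd:" line); a typical query is only around 40-70 bytes on the wire. The zone and server address below are placeholders:

dig ANY example.com @192.0.2.1 +bufsize=4096

The +bufsize option advertises a large EDNS buffer, so the full response arrives over UDP, exactly as an attacker would want it to.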

The main mitigation methods applicable on DNS servers:

  • Block the unwanted.
    In principle, there is nothing tricky here: many firewalls can block traffic once the number of packets per second exceeds a limit. You can block by any rules, as was done, for example, in the article mentioned above. If you do not want to run a firewall on the DNS servers, you can run tcpdump periodically, parse its output, and route unwanted traffic to /dev/null via routing. In extreme cases, you can add the attacker's IP to the loopback interface (a technique recommended by I. Sysoev at one of his talks as a way to do without a firewall on FreeBSD). You can mirror traffic on the switch, analyze it somewhere separately, and then push the result to the edge router for blocking. There are many options, with one common minus: we lose some of the traffic. And do not forget that what we block is the spoofed IP, that is, the address of the victim (a firewall example is sketched after this list).
  • Ask for TCP.
    The DNS packet header has a TC flag. If the TC flag is set in a response, the client must retry the request over TCP, and the rest of the response data is ignored. The idea of this method is that the attacker will not switch to TCP, since that makes no sense for him, while an honest client will switch and receive the answer over TCP. Of course, TCP for DNS is slower, but, firstly, the answer should settle in the cache of the recursor or the client, and secondly, some extra latency is the lesser evil here. This approach is already implemented in some DNS servers: in PowerDNS, for example, you can set the TC flag in responses to ANY queries, which is already a good compromise (see the config example after this list). But this option is not ideal either. There are still servers on the big Internet that do not quite follow the RFC or are simply misconfigured, and, just setting TC on all answers, nobody guarantees that everyone will correctly retry over TCP. Also, do not forget that by setting the TC flag we will, of course, reduce the amount of outgoing traffic and, consequently, the load on network equipment, but the servers themselves will still have to process this gigantic incoming stream of requests, spending precious context switches and warming the data centers.
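For illustration, the firewall variant from the first item can be sketched with the iptables hashlimit module (the numbers are arbitrary and need tuning for real traffic):

iptables -A INPUT -p udp --dport 53 -m hashlimit --hashlimit-above 100/sec --hashlimit-mode srcip --hashlimit-name dnsflood -j DROP

And the PowerDNS behavior from the second item is, if I am not mistaken, a single line in the configuration:

any-to-tcp=yes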


So the idea came up to try something new. In DoS protection there are, as a rule, no ideal tools, and the proposed option is not without drawbacks either: for example, it does not protect against using the server to reflect DNS traffic without amplification. Still, the advantages of the proposed solution are as follows:
  • We do not block anyone: when the threshold is exceeded, we simply force the use of TCP for certain IP addresses, that is, we set the TC flag and respond with a truncated DNS response.
  • Forcing the use of TCP is performed in the kernel.

I also wanted to keep the solution as simple as possible to use: no iptables, no extra rules to configure, and so on. The module is Linux-specific, but I think nothing fundamentally prevents implementing the idea on FreeBSD.

How does it work?


We count the number of packets received from each IP address over a given period of time. If the number of incoming packets from some IP exceeds the threshold, we form a UDP response with the TC flag and discard the request. Thus we drastically reduce the number of context switches caused by the need to process this traffic in the DNS server application. A legitimate client, having received a UDP response with the TC flag, will be forced to repeat the request over TCP, and that traffic will already reach the DNS server.
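A minimal sketch of what such a hook could look like (this is not the module's actual code: the helper names kfdns_count_packet, kfdns_over_threshold and kfdns_send_tc_reply are illustrative and only sketched further below, and the registration API is the one from recent kernels):

#include <linux/module.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
#include <linux/ip.h>
#include <linux/udp.h>
#include <net/net_namespace.h>

/* Illustrative helpers, sketched further below or omitted. */
static void kfdns_count_packet(__be32 saddr);
static bool kfdns_over_threshold(__be32 saddr);
static void kfdns_send_tc_reply(struct sk_buff *skb);

static unsigned int kfdns_hook(void *priv, struct sk_buff *skb,
                               const struct nf_hook_state *state)
{
        struct iphdr *iph = ip_hdr(skb);
        struct udphdr *udph;

        if (iph->protocol != IPPROTO_UDP)
                return NF_ACCEPT;

        udph = (struct udphdr *)((u8 *)iph + iph->ihl * 4);
        if (udph->dest != htons(53))
                return NF_ACCEPT;

        /* the payload must carry at least a full 12-byte dns header */
        if (ntohs(udph->len) <= sizeof(struct udphdr) + 12)
                return NF_ACCEPT;

        /* remember the source ip for the statistics thread */
        kfdns_count_packet(iph->saddr);

        if (kfdns_over_threshold(iph->saddr)) {
                /* forge the truncated reply and drop the query:
                 * the dns server application never sees it */
                kfdns_send_tc_reply(skb);
                return NF_DROP;
        }
        return NF_ACCEPT;
}

static const struct nf_hook_ops kfdns_ops = {
        .hook     = kfdns_hook,
        .pf       = NFPROTO_IPV4,
        .hooknum  = NF_INET_LOCAL_IN,
        .priority = NF_IP_PRI_FIRST,
};

static int __init kfdns_init(void)
{
        return nf_register_net_hook(&init_net, &kfdns_ops);
}

static void __exit kfdns_exit(void)
{
        nf_unregister_net_hook(&init_net, &kfdns_ops);
}

module_init(kfdns_init);
module_exit(kfdns_exit);
MODULE_LICENSE("GPL");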

An effective implementation is greatly helped by the fact that the header format is the same for a DNS request and a DNS response; moreover, the header is the only part of the packet required for a DNS response to be considered valid. Let's look at the DNS header in more detail:
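(In place of the original illustration, here is the same 12-byte layout written out as a C structure; the field names follow RFC 1035, section 4.1.1, and the flag macros are my shorthand.)

struct dnshdr {
        __be16 id;      /* query id, echoed back in the response          */
        __be16 flags;   /* QR, Opcode, AA, TC, RD, RA, Z, RCODE bit fields */
        __be16 qdcount; /* entries in the question section                */
        __be16 ancount; /* resource records in the answer section         */
        __be16 nscount; /* records in the authority section               */
        __be16 arcount; /* records in the additional section              */
};

#define DNS_FLAG_QR htons(0x8000)  /* this packet is a response  */
#define DNS_FLAG_TC htons(0x0200)  /* truncated: retry over TCP  */

Turning a copied request header into a truncated response then comes down to two lines:

hdr->flags |= DNS_FLAG_QR | DNS_FLAG_TC;
hdr->qdcount = 0;  /* the question section was not copied */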


Another piece of good news: the DNS header has a fixed size of 12 bytes. This gives a very simple scheme; we do not even need to fully parse the DNS header. We check that a packet arriving on UDP port 53 carries more than 12 bytes of data, copy the first 12 bytes from the request into a new packet (while writing this, I kept in mind that it might become necessary to additionally check the remaining header fields), set the TC bit and the response bit in it, and send it back. Since we copied only the header, it is also advisable to zero the QDCOUNT field, otherwise we will get parser warnings on the client side. The request itself is then dropped. All of this can be done directly in the NF_INET_LOCAL_IN hook; in the same hook we put the source IP into a KFIFO queue for later statistics. The statistics of incoming packets are computed asynchronously, in a separate thread, using red-black trees. Thus we introduce a minimal delay into the packet path: KFIFO is a lock-free data structure, and in addition a queue is created per CPU. True, the interval needs to be tuned according to the expected pps. There is also a restriction on the memory allocated for per-cpu data: currently it is 32kB, which accommodates a queue of 4096 IP addresses per CPU. So, choosing an interval of 100ms, we can account for up to 40960 pps per CPU, which seems sufficient in most cases. On the other hand, overflowing the queue simply leads to the loss of some statistics data.
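A sketch of the per-cpu queue from the previous paragraph (sizes and names are illustrative; kfifo_put here is the value-based macro from recent kernels, and the per-cpu INIT_KFIFO calls at module init are omitted):

#include <linux/kfifo.h>
#include <linux/percpu.h>

#define KFDNS_FIFO_LEN 4096              /* ip addresses per cpu */

struct kfdns_queue {
        DECLARE_KFIFO(ips, u32, KFDNS_FIFO_LEN);
};
static DEFINE_PER_CPU(struct kfdns_queue, kfdns_queues);

/* Called from the hook: O(1), and lock-free with a single producer
 * per cpu and a single consumer, so the packet path is barely delayed. */
static void kfdns_count_packet(__be32 saddr)
{
        struct kfdns_queue *q = this_cpu_ptr(&kfdns_queues);

        /* On overflow kfifo_put() returns 0 and the sample is simply
         * lost: the statistics degrade, the packets do not. */
        kfifo_put(&q->ips, (u32)saddr);
}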

A logical question arises: why not just use a hash?

Unfortunately, careless use of hashes in such places opens the door to another type of attack: attacks aimed at causing collisions. Knowing that a hash table is used in some performance-critical piece of code, one can craft input data such that operations on the table take O(n) instead of O(1). Such attacks are also unpleasant because they can be hard to identify: outwardly nothing has happened, yet the server starts to struggle.
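For completeness, a sketch of how the statistics thread can keep per-ip counters in a kernel red-black tree: lookup and insert stay O(log n) whatever the input, which is exactly the property a hash table cannot guarantee here. The names are illustrative:

#include <linux/rbtree.h>
#include <linux/slab.h>

struct ip_node {
        struct rb_node node;
        u32 ip;
        u32 count;
};

static struct rb_root ip_tree = RB_ROOT;

/* Increment the counter for ip, inserting a node if needed. */
static void ip_tree_account(u32 ip)
{
        struct rb_node **link = &ip_tree.rb_node, *parent = NULL;
        struct ip_node *entry;

        while (*link) {
                parent = *link;
                entry = rb_entry(parent, struct ip_node, node);
                if (ip < entry->ip)
                        link = &(*link)->rb_left;
                else if (ip > entry->ip)
                        link = &(*link)->rb_right;
                else {
                        entry->count++;
                        return;
                }
        }
        entry = kmalloc(sizeof(*entry), GFP_KERNEL);
        if (!entry)
                return;  /* drop the sample on allocation failure */
        entry->ip = ip;
        entry->count = 1;
        rb_link_node(&entry->node, parent, link);
        rb_insert_color(&entry->node, &ip_tree);
}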

If the pps from a blocked IP drops below the threshold, the block is removed. A hysteresis can be configured, equal to 10% of the threshold by default.
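The block/release decision with hysteresis, as a small sketch (with the default hysteresis = threshold / 10):

/* pps: packets from this ip over the last period, scaled per second */
if (!blocked && pps > threshold)
        blocked = true;                       /* start forcing tcp  */
else if (blocked && pps < threshold - hysteresis)
        blocked = false;                      /* release the block  */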

At the end of the article there is a link to the project; any constructive comments, suggestions, and additions are welcome.

Usage example


In the directory with the built module, run:
insmod ./kfdns.ko threshold=100 period=100 hysteresis=10
threshold - the threshold above which we start setting the TC flag;
period - the counting period, in ms (that is, in this case the filter triggers if more than 100 packets arrive from one IP within 100ms);
hysteresis - the difference between the trigger threshold and the release threshold. Hint: if you set hysteresis=threshold, then once the block triggers it will never be released, which can be useful in some cases.

After the module is loaded,
cat /proc/net/kfdns
shows statistics on the IPs that have fallen under filtering.

Test results



To create the parasitic load, dnsperf was used (in two instances: one on a neighboring virtual machine, the other on a laptop, and unfortunately even that was not enough to load the system to failure). The DNS server ran in a KVM virtual machine under CentOS, with pdns-recursor as the DNS server itself.

The graphs show the counter values before the module was activated, after activation, and again with the module unloaded. The PPS throughout the experiment stayed at around 80kpps.


So, what we achieved is a reduction in outgoing traffic. It can be seen that after the module was turned on, outgoing traffic became even smaller than incoming traffic, which is logical in principle: let's not forget that we copy only the header.


The sharp decrease in the number of context switches is good news.


And here is what happened to the system: a noticeable reduction in system time and user time consumption is visible. The changes in steal time here are an effect of virtualization, which is also logical. But the barely noticeable increase in irq time is curious and may be a reason for further experiments.

What I would like to add in the future:

  • Support for working from the PREROUTING hook (or FORWARD, but this needs checking). This would allow using the module not only on DNS servers, but also, for example, on load balancers or border firewalls.
  • Packages for the major distributions.
  • Documentation and best practices.


The project itself:
github.com/dcherednik/kfdns4linux

The project is still quite young, but I hope there are interested people in the Habr community, and perhaps it will be useful to someone.

References and literature:
Linux netfilter Hacking HOWTO
Writing Netfilter Modules
Unreliable Guide To Locking
Robert Love, Linux Kernel Development (Russian edition: ISBN 978-5-8459-1779-9)
