
Clouds versus the cleaver, or a chronicle of the DDoS attacks on cvk2012.org

In previous entries of the Qrator traffic filtering network logbook (obligatory advertising link: http://qrator.net/), we established that ahead of major events in Runet it is better to arrange protection against attacks in advance. Today I will tell you how, even after meeting that requirement, you can still earn a couple of sleepless nights at the admin's workstation.
Background
The CVK staff approached the fault tolerance of the voting system's front end with full academic seriousness and did not put all their eggs in one basket. The front end of the voting system was hosted in the Microsoft Azure cloud, which is recommended for Windows-based applications. Meanwhile, the domain election.cvk2012.org had been registered with Qrator almost a month before the events described and had been under preventive protection all that time, which allowed the filter to accumulate the baseline it needs for training. All this gave reason to expect that, as far as the availability of the opposition's sites was concerned, the election weekend would be as calm and trouble-free as the several previous rallies.
October 18
The first alarm bell rang on Thursday. By then, a low-power attack on the voter registration site had been under way for three days, but this was the first time it caused problems. The attack scheme is a typical JS LOIC: a page on some supposedly stable hosting carries ECMAScript code that reloads a pseudo-image from a URL of the attacked service every 200 ms. In April, the Vkontakte service used a similar scheme to "warn" the antigate.com site. In this case the attacking script was initially hosted on hellotesak.narod.ru; the Yandex servers behind narod.ru took no part in the attack themselves, since the script ran in users' browsers. The site was later blocked.
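
A defensive aside: because a JS LOIC page fires on a fixed timer, its requests tend to reach the server at almost perfectly even intervals, which human browsing does not do. The sketch below is not how Qrator detects such traffic (the article does not say); it is only a minimal Python illustration of the idea, and the function name and both thresholds are assumptions picked for the example.

```python
from statistics import mean, pstdev


def looks_like_timer_loop(timestamps, min_requests=10, max_jitter_ratio=0.1):
    """Return True when request timestamps (in seconds) are spaced almost evenly.

    Hypothetical heuristic: a script on a fixed timer produces near-constant
    gaps between requests, while a human produces bursts and long pauses.
    """
    if len(timestamps) < min_requests:
        return False  # too little history to judge this client
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return False  # all requests carry the same timestamp
    # Coefficient of variation: spread of the gaps relative to their mean.
    return pstdev(gaps) / average_gap <= max_jitter_ratio
```

Twenty timestamps spaced 0.2 s apart are flagged by this check, while a handful of irregular clicks spread over a minute and a half are not. In real traffic, network jitter blurs the intervals, so the threshold would need tuning against recorded data.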

It turned out that under certain conditions the attacking code creates serious problems for the database. Since there was no time left to optimize how the front end talks to the DBMS, and the site had no headroom for analyzing the history of all incoming requests, the customer, after consulting Qrator technical support, decided to integrate the application tightly with the filtering system by feeding it information on how expensive each request was for the database.

With that, the site stayed out of trouble until Saturday.
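
The article does not describe how the application actually passed this feedback to the filter, so the following is only a guess at the general shape of such a loop: the application reports how much database time each request cost, and the filter stops admitting clients that have used up a budget within a sliding window. The class name `CostBudgetFilter`, its methods, and the budget and window values are all hypothetical.

```python
import time


class CostBudgetFilter:
    """Toy filter that charges each client for the database time it causes."""

    def __init__(self, budget_ms=500.0, window_s=10.0):
        self.budget_ms = budget_ms   # allowed DB time per client per window
        self.window_s = window_s     # length of the sliding window, seconds
        self._events = {}            # client -> list of (timestamp, cost_ms)

    def report_cost(self, client, cost_ms, now=None):
        """Called by the application once it has measured a request's DB time."""
        now = time.monotonic() if now is None else now
        self._events.setdefault(client, []).append((now, cost_ms))

    def spent(self, client, now=None):
        """DB time charged to the client inside the current window."""
        now = time.monotonic() if now is None else now
        recent = [(t, c) for t, c in self._events.get(client, [])
                  if now - t <= self.window_s]
        if recent:
            self._events[client] = recent
        else:
            self._events.pop(client, None)  # forget idle clients
        return sum(cost for _, cost in recent)

    def allow(self, client, now=None):
        """Admit the next request only while the client is under budget."""
        return self.spent(client, now) < self.budget_ms
```

The point of the design is that the filter does not need to understand the application: a client that sends few but expensive requests gets throttled just like one that sends many cheap ones.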
October 20
On Saturday night, the attackers turned their attention to the voting site hosted on Microsoft Azure, election.democratia2.ru. The cloud itself kept working, but a polling server located elsewhere was successfully taken down by a targeted LOIC attack, this time from cvkhello.do.am. The development team first tried to get out of trouble by changing the architecture, then introduced a captcha, but at midnight the voting site was put under Qrator protection... without HTTPS, which was forgotten in the rush. Once Qrator was given access to the SSL traffic, a further problem surfaced: there was no accumulated history of legitimate user behavior.
Operation of election.democratia2.ru was restored by one in the morning on Sunday, October 21, once the filtering system had been trained.
According to Meddy (a uCoz employee), the website cvkhello.do.am lasted two days. The site was registered from an IP address belonging to one of the largest Ukrainian cable providers.

October 21
Of course, the hasty abandonment of Azure and the large-scale code changes could not fail to hurt the site's overall performance. And on Sunday the attackers finally brought in, alongside LOIC, a full-fledged botnet with IP addresses in Asian and European countries. The total number of bots involved in the attack was a little over 130 thousand, not counting hosts hidden behind NAT. Together this made it necessary to proactively block a number of potentially suspicious IP addresses, then admit them to the site one at a time while analyzing the behavior history of each. Toward evening, once the performance bottlenecks of the protected site had been eliminated, access was granted to all legitimate users.

Dry technical details of the attack
- Attack type: combined, SYN flood + application-level attack (botnet + LOIC)
- Peak SYN flood rate: 150 thousand packets/s
- Peak application-level attack rate: 3.8 thousand HTTP requests/s
- Number of IP addresses involved in the attack: approximately 135 thousand
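
A back-of-the-envelope reading of these figures, as a sketch only. It assumes that every address was active at the moment of the peak, which the article does not state, and it takes the 200 ms refresh interval of the LOIC page (five requests per second per open tab) from the description earlier. The function names are made up for the example.

```python
PEAK_HTTP_RPS = 3800        # peak application-level rate from the list above
ATTACKING_IPS = 135_000     # approximate number of addresses from the list above
LOIC_RPS_PER_TAB = 5        # one request every 200 ms, per the LOIC description


def avg_rate_per_ip(total_rps, ip_count):
    """Average requests per second per address if all addresses are active."""
    return total_rps / ip_count


def loic_tabs_equivalent(total_rps, rps_per_tab):
    """Number of always-open LOIC tabs that alone would produce total_rps."""
    return total_rps / rps_per_tab


per_ip = avg_rate_per_ip(PEAK_HTTP_RPS, ATTACKING_IPS)
print(f"average per address: {per_ip:.3f} requests/s, "
      f"or one request every {1 / per_ip:.1f} s")
print(f"equivalent LOIC tabs: "
      f"{loic_tabs_equivalent(PEAK_HTTP_RPS, LOIC_RPS_PER_TAB):.0f}")
```

Under that assumption the average address sends a request roughly once every half minute, which is slower than many ordinary visitors. That is one way to see why a plain per-address rate limit would not have been enough and why the behavior history of each address had to be analyzed.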
The moral?
- A cloud application architecture is not a cure-all. Even in the best case, you face a bill with many digits for the resources the cloud spent on fully processing the junk traffic. In a less optimistic scenario, you get serious stability and performance problems and, as a result, downtime at the most crucial moment.
- No highly loaded application can skip load testing. Neither cloud hosting, nor the reputation of the server platform and framework, nor a pre-purchased gigabit channel guarantees the absence of bottlenecks in scripts, in the database schema, or in the query processing logic. A loaded application should have a performance margin and withstand at least 115% of the calculated or normal load; only then can filtering be high-quality with a minimum of false positives.
- By the way, about performance: if your server often takes more than a second to come up with a response, you very likely have big problems that must be resolved before the system goes to production!
- If the site is no longer reachable from outside, keep calm and soberly assess your options. Hot-patching the site right in production may close one hole, but it will open three new ones and cut off your way back. Consider all the available rescue options and choose the best one in terms of price and quality.
- I repeat: the sooner your application is put under protection, the more free time you and your system administrators will have to work, relax, and fight for your ideals.
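
Two of the rules of thumb above, the 115% margin and the one-second response time, are easy to turn into checks on load-test results. The sketch below assumes you already have a measured capacity and a list of response times from your own load-testing tool; the function names are hypothetical, and only the 1.15 factor and the one-second threshold come from the text.

```python
def has_headroom(measured_capacity_rps, expected_load_rps, margin=1.15):
    """True if the measured capacity covers the expected load plus the margin."""
    return measured_capacity_rps >= expected_load_rps * margin


def slow_share(response_times_s, threshold_s=1.0):
    """Fraction of responses that took longer than threshold_s seconds."""
    if not response_times_s:
        return 0.0
    slow = sum(1 for t in response_times_s if t > threshold_s)
    return slow / len(response_times_s)
```

For instance, a site expected to serve 1000 requests/s passes the first check with a measured capacity of 1200 requests/s and fails it with 1100. What share of slow responses counts as "often" is left to the reader; the article gives no number.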