Fight for resources, part 5: Starting from scratch

We continue our look at cgroups. In Red Hat Enterprise Linux 7 they are enabled by default, because that release uses systemd, which has cgroups built in. With Red Hat Enterprise Linux 6 things are a little different. The cgroup controllers were in fact available in both releases, and Red Hat Enterprise Linux 6 came out, let us recall, in November 2010, which is a couple of centuries ago in computer years.

Even so, cgroups in Red Hat Enterprise Linux 6 are still capable of a lot, as we will illustrate today.

Let us look at the cgroups features in Red Hat Enterprise Linux 6 using one purely hypothetical example that is entirely based on real events. But first, as tradition dictates, a small digression.

IT security has never had as many problems as it does now. That is no surprise: today not only computers and phones are connected to the network, but also refrigerators, vacuum cleaners and a pile of other things, so the scope for network threats is simply immense. The fight against these threats usually starts on all fronts at once. Prompt installation of security patches? Yes, sure! Hardening the system with firewalls, SELinux, proper authentication and the like? Of course! Antivirus scanners on Linux machines? Well, how shall we put it...

On Linux machines, antivirus scanners sometimes do more harm than good. The security folks have their reasons, though, and they often demand regular antivirus scans without giving much thought to whether they make technical sense. This is a reality you have to put up with, and almost every IT person runs into it sooner or later.

The second point is that Red Hat Enterprise Linux 7 is certainly fashionable, advanced and cool, but many people still use Red Hat Enterprise Linux 6 and have no intention of giving it up. In fact, this is why people choose Red Hat: you can stay on the same version for years and still get all the latest patches, updates and support.

Back to our example... Imagine a guy named Jerry. Jerry works in a large office and is responsible for the Red Hat Enterprise Linux 6 servers. He is completely happy with the way they run, and he does not need any new problems or bumps in the road.

But then the guys from the security department decide that a tool called ScanIT has to be installed on all of his servers. And since this tool will periodically scan disks and memory for viruses and other malware, it needs full root access.

Jerry sighs, puts down his guitar and goes to install ScanIT on a test machine. Pretty quickly it turns out that:

When an antivirus scan starts, scanit (the script that launches the process) eats up all the CPU time it can get. This hits the test machine very hard: at one point Jerry could not even reach it over ssh.
In addition, the scanit process occasionally devours memory like there is no tomorrow. As a result, the OOM killer wakes up and starts killing processes, any processes except scanit itself.

In general, something must be done about it.

Jerry picks up the guitar and, playing some Grateful Dead, starts thinking. Pretty quickly it occurs to him that those cgroups from Red Hat Enterprise Linux 7, which his buddy Alex keeps going on about, could probably help here. Jerry puts the guitar aside again and settles down to read the Red Hat Enterprise Linux 6 docs that Alex sent him. It turns out that the first thing he needs is libcgroup.

There is no libcgroup on the test machine, so Jerry starts to install it:

In addition, Jerry enables two services that are needed for persistent cgroups:

cgconfig - provides a more or less simple interface for working with cgroup trees. Of course, Jerry could mount and configure cgroups by hand, but why bother if you can save time?
cgred - the cgroup rules engine: when a process starts, this service places it into one cgroup or another according to the configured rules.

Having installed and configured all this, Jerry can finally get to the problem itself. After thinking it over, he decides the following:

scanit and its child processes should consume no more than 20% of CPU resources. In fact, even less: no more than 20% of a single processor core, even on a multi-core machine. In cgroups, this is done with CPU quotas.
As for memory, scanit and its child processes should consume no more than 512 MB of system memory. If they cross this line, the system should kill them and not any other processes.
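
To put the CPU target in perspective, here is a small sketch of the arithmetic. The function name and the core counts are illustrative assumptions, not taken from Jerry's setup; the point is only that a cap of 20% of one core becomes a smaller fraction of the whole machine as cores are added.

```python
def machine_share(core_fraction, n_cores):
    """Fraction of the whole machine's CPU capacity used by a group
    that is capped at `core_fraction` of a single core.

    Hypothetical helper for illustration only.
    """
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return core_fraction / n_cores


if __name__ == "__main__":
    # Illustrative core counts; a real server may differ.
    for cores in (1, 4, 8):
        share = machine_share(0.20, cores)
        print(f"{cores} core(s): {share:.1%} of total CPU capacity")
```

On an 8-core machine, for example, a cap of 20% of one core works out to 2.5% of the total CPU capacity.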

Don't tell me what to do!

Jerry will have to deal with two configuration files:

/etc/cgconfig.conf - generated automatically when libcgroup is installed.
/etc/cgrules.conf - contains the rule set that cgred uses to sort starting processes into cgroups.

Here is the default cgconfig.conf file:

Jerry could make the necessary changes directly in it, but it is better to use drop-in conf files for this. How does that work? If you drop any file with the .conf extension into the /etc/cgconfig.d directory, the system will process it and apply the corresponding changes to the configuration. This is convenient because you can create drop-ins for different tasks and add or remove them from the configuration using whichever tools you like best (say, Ansible; this is a Red Hat blog, after all).

First, Jerry creates a drop-in file for the CPU:

Let's see what we have here and how it works.

The group keyword simply sets the name of the new cgroup, in our case scanit. Inside the curly braces we specify the cgroup controller and the parameters we want to use. Here these are cpu.cfs_period_us and cpu.cfs_quota_us, which set the corresponding limits in the Completely Fair Scheduler (CFS), the default kernel scheduler in Red Hat Enterprise Linux 6. According to the Red Hat Enterprise Linux 6 Resource Management Guide, cpu.cfs_period_us is the period, in microseconds, over which a cgroup's access to CPU resources is reallocated, and cpu.cfs_quota_us is the total time, in microseconds, for which all tasks in the cgroup may run during one period.

In other words, Jerry wrote in his drop-in: "For every process in the cgroup called scanit, check the amount of CPU time allotted to it once per second. If the total CPU time of all processes in this group exceeds 200,000 microseconds, stop giving CPU time to these processes until the next period." That is, all processes in the scanit cgroup, along with their child processes, get no more than 20% of one core's CPU time in total.
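
A minimal sketch of this arithmetic, and of what such a drop-in could look like, assuming the standard cgconfig group syntax, a period of 1,000,000 microseconds (once per second) and a quota of 200,000 microseconds. The helper names cpu_share and render_group are hypothetical and exist only for illustration; this is not Jerry's actual file.

```python
def cpu_share(quota_us, period_us):
    """Fraction of one core a cgroup may use per period: quota / period.

    Returns None for a negative quota, which CFS treats as "no limit".
    """
    if period_us <= 0:
        raise ValueError("period_us must be positive")
    if quota_us < 0:
        return None
    return quota_us / period_us


def render_group(name, controller, params):
    """Render a cgconfig-style group block as text (illustrative only)."""
    lines = [f"group {name} {{", f"    {controller} {{"]
    for key, value in params.items():
        lines.append(f"        {key} = {value};")
    lines += ["    }", "}"]
    return "\n".join(lines)


# Assumed values: check once per second, allow 200,000 us of CPU time.
CPU_PARAMS = {
    "cpu.cfs_period_us": 1_000_000,
    "cpu.cfs_quota_us": 200_000,
}

if __name__ == "__main__":
    print(render_group("scanit", "cpu", CPU_PARAMS))
    share = cpu_share(CPU_PARAMS["cpu.cfs_quota_us"],
                      CPU_PARAMS["cpu.cfs_period_us"])
    print(f"share of one core: {share:.0%}")
```

The rendered text is the kind of content that would go into a .conf file under /etc/cgconfig.d. A quota larger than the period would allow the group more than one core's worth of CPU time.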

After cgconfig is restarted, the server picks up the new configuration, and if we look at the file system, we will see that scanit now has its own directory under the CPU controller:

This is all well and good, but we still need to get scanit itself into this cgroup somehow. This is where cgred comes in handy. By default, its rules file looks like this:

Using this file is fairly easy. True, we have to edit cgrules.conf directly, since the drop-in mechanism is not supported here. We specify the user or group that owns the process, optionally the name of a specific process, and then the controller to use and the destination cgroup.
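
To make the rule format concrete, here is a simplified sketch that parses one rule of the form "user[:process] controllers destination" and checks whether a process would match it. The sample rule line is an assumption made for illustration, not Jerry's actual entry, and the matching logic is deliberately reduced (no @group handling, exact process names only), so it should not be read as a description of how cgred itself is implemented.

```python
from collections import namedtuple

Rule = namedtuple("Rule", "user process controllers destination")


def parse_rule(line):
    """Parse one cgrules.conf-style line; return None for blanks/comments."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    who, controllers, destination = fields
    # "user:process" carries an optional process name after the colon.
    user, _, process = who.partition(":")
    return Rule(user, process or None, controllers.split(","), destination)


def matches(rule, user, process):
    """Simplified matching: '*' matches any user; no @group support."""
    user_ok = rule.user == "*" or rule.user == user
    process_ok = rule.process is None or rule.process == process
    return user_ok and process_ok


if __name__ == "__main__":
    # Hypothetical rule: any user's "scanit" goes to the cpu cgroup "scanit".
    rule = parse_rule("*:scanit    cpu    scanit")
    print(rule)
    print(matches(rule, "root", "scanit"))
    print(matches(rule, "root", "sshd"))
```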

In our example, instead of the real scanit antivirus we use a script that is also called scanit but actually just emulates the load. Without cgroups, it all looks like this:

The CPU is fully occupied, mostly in user space and a little in system space.

Jerry scratches his beard. He launches vi and, using strictly one index finger, makes a few changes and restarts the cgred daemon:

Then he launches scanit by hand:

And - hooray! Victory.

As you can see, our load-emulation processes (the child processes of scanit) now consume 20% of CPU resources in total, mainly in user space and a little in system space. So this damn antivirus will no longer load the machine into complete insanity.

Remember what next?

Delighted with his success, Jerry almost forgets about memory. But then he remembers and starts vi again to fix his config file.

Now he adds two memory-related settings:

memory.limit_in_bytes - the maximum amount of RAM that all processes in the scanit cgroup can use in total, not counting swap space. Jerry sets it to 256 MB.
memory.memsw.limit_in_bytes - the maximum amount of RAM plus swap space that can be allocated to all processes in the scanit cgroup in total. If this threshold is exceeded, the OOM killer kills the processes. Jerry sets it to 512 MB.
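
As a rough sketch of how these two limits relate, the following converts the sizes to bytes and checks that the RAM-plus-swap limit is not smaller than the RAM limit, which the memory controller requires. The helper names are hypothetical, and the size suffixes are interpreted as binary units (1 MB = 1024 * 1024 bytes), matching the k, m and g suffixes the memory controller accepts.

```python
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def to_bytes(size):
    """Convert a size such as '256M' or '1024' to a number of bytes."""
    text = size.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def check_memory_limits(limit, memsw_limit):
    """Return (ram_bytes, ram_plus_swap_bytes) after a sanity check.

    The RAM+swap limit must not be smaller than the RAM limit.
    """
    ram = to_bytes(limit)
    ram_swap = to_bytes(memsw_limit)
    if ram_swap < ram:
        raise ValueError(
            "memory.memsw.limit_in_bytes must be >= memory.limit_in_bytes"
        )
    return ram, ram_swap


if __name__ == "__main__":
    ram, ram_swap = check_memory_limits("256M", "512M")
    print(ram, ram_swap)                       # 268435456 536870912
    print("swap headroom:", ram_swap - ram)    # 268435456
```

With Jerry's values, the group may hold up to 256 MB in RAM and push up to another 256 MB into swap before the OOM killer steps in.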

Oh no! What is wrong?

Jerry looks at top and sees that the child scanit processes are still running. Since this cgroup is still in use, Jerry cannot restart the service. So he kills the child processes manually and restarts the services.

Now a bit of editing in cgrules.conf, the rules file used by cgred:

To test this, Jerry runs several scanit jobs at once so that the OOM killer is sure to kick in.

Then Jerry checks the system log and nods in satisfaction: scanit can no longer gobble up unlimited memory with impunity.

We hope our cgroups series has helped you understand what cgroups are, how to use them in Red Hat Enterprise Linux 7, how to create them in Red Hat Enterprise Linux 6, and how to put them to work in your environment.

Part 1 - habr.com/company/redhatrussia/blog/423051
Part 2 - habr.com/company/redhatrussia/blog/424367
Part 3 - habr.com/company/redhatrussia/blog/425803
Part 4 - habr.com/company/redhatrussia/blog/427413
Part 6 - habr.com/company/redhatrussia/blog/430748
