Linux optimization for desktop and games
In this article, I want to share almost 10 years of experience using Linux on a home computer. During this time, I conducted many experiments on the kernel, tried various configurations for different applications, and now I want to organize all this in a long post with recommendations on how to get the most out of linux and achieve excellent performance, without the need to buy powerful hardware.
Personally, I think the part about kernel tuning has become somewhat dated, and modern hardware already delivers the performance needed for everyday work. Still, as I have noticed recently, games remain a problem even now, even on powerful hardware.
Although I promised that after reading this article you will be able to play Metro 2033 on a calculator (a joke, that will not happen), I will nevertheless start with a recommendation to buy some hardware, if you do not have it already.
1. Buy an SSD if you don’t have one yet
For some reason many people are still skeptical of SSDs, although an SSD is the first and most important component of a computer that is supposed to be fast.
Seriously, everything described later in the article will give you some gain in performance and response time, but any SSD, even the cheapest one, will cut the startup time of most programs practically to zero, which is immediately noticeable. In almost any computer (and server) the main bottleneck is the disk subsystem, and no HDD will ever give you the seek time you want (for an SSD it tends toward 0 ms). In all my years of using and upgrading computers, only the move to an SSD gave a dramatic increase in speed and responsiveness. Remember how slowly floppy disks worked and what a huge seek time they had? That is roughly how a hard drive feels after an SSD.
So if you do not have an SSD yet, there is no point in reading further: your computer (even one with a 12-core Xeon) will still feel slow, so go shopping.
Regarding reliability: there is a myth that SSDs die after a year. We owe its birth to the first SSDs built on SandForce controllers. Any new SSD from the store is at least as reliable and durable as a modern hard drive, so do not worry about this at all. I bought a second-hand SSD two years ago; at that point it had already been in use for a year. It now has 11,681 hours of use and 10% of its resource consumed, so at the same usage rate it will last me another 27 years. I expect data storage technology to change several times before then. So once again, the reliability concerns are more than contrived.
Vadim Sterkin wrote about SSD myths in more detail on his blog. True, his blog is about Windows, but that does not change the essence. I strongly advise you to read it, it is very interesting.
In Ubuntu 14.04 SSDs work out of the box: the discard option is added to fstab automatically, so there is nothing more to do.
On other distributions, check whether the SSD partitions are mounted with this option. It is worth mentioning that ext4 supports this option; for other filesystems you will have to run fstrim from a scheduler (cron or a systemd timer).
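For reference, a minimal sketch of the check and of a cron-based fstrim fallback (not from the original setup; the schedule and mount point are just examples):
# See whether any fstab entry already carries the discard option
grep discard /etc/fstab
# Show the options the root filesystem is actually mounted with
findmnt -o TARGET,OPTIONS /
# For filesystems without online discard, trim weekly from cron instead
echo '0 3 * * 0 root /sbin/fstrim -v /' | sudo tee /etc/cron.d/fstrim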
2. Partition table
Do not split your drives into many partitions.
For a home computer this is pointless and harmful. On the SSD you should have a single root partition holding the system and all your data. On the HDD (if you have one) keep a single partition mounted under /mnt (mine is /mnt/data), where large, rarely used data will live (movies, music, games). Do NOT put /home on the HDD: 99% of programs store their data in /home and access it constantly, so /home must stay on the SSD.
To repeat briefly: everything the system constantly reads and writes should live on the SSD!
Do not listen to the bad advice about moving such data to the HDD: as already mentioned, SSD wear is a myth, and a large number of write cycles does not noticeably affect an SSD's lifespan. Once again, I refer you to Vadim Sterkin's article, where all of this is described in more detail and backed up with explanations.
As for a swap partition: you do not need one. If you run out of RAM, the OOM killer will kill the resource-hungry applications; if that actually happens, buy more RAM, it does not cost much. Using swap as a RAM extender slows the computer down significantly. There are many opinions that going without swap causes problems, but IMHO those conversations have their roots in the Win9x era and today they are myths; I personally have not noticed any problems from dropping swap. As further evidence: VPSes these days rarely have swap configured, and they work somehow!
You do not need suspend-to-disk either, because a cold start from an SSD is faster than resuming from hibernation on an HDD, so use suspend-to-RAM or shut the computer down completely. The only benefit of swap is hybrid sleep, where the system prepares for suspend-to-disk but actually performs suspend-to-RAM: if all goes well you simply resume from RAM, and if the power fails the system recovers from disk.
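If you do decide to drop swap, a rough sketch of doing so (not from the original article; the fstab line shown is illustrative, back the file up first):
# Show any active swap devices
swapon -s
# Turn them all off for the current session
sudo swapoff -a
# Then comment out (or delete) the swap line in /etc/fstab, e.g.:
# UUID=xxxx-xxxx  none  swap  sw  0  0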
I use the ext4 filesystem everywhere, because with the others I could not get a noticeable performance difference, ext4 is the most widespread, and there are data-recovery utilities for it (but do not rely on them, make backups). When creating the filesystem, use -T largefile or -T largefile4.
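For example, creating such a filesystem on the HDD data partition might look like this (a sketch; the device name and label are just examples, and largefile4 means one inode per 4 MiB, which suits partitions full of big files):
# One inode per 4 MiB: far fewer inodes, faster mkfs/fsck, fine for movies/music/games
sudo mkfs.ext4 -T largefile4 -L data /dev/sdb1
# Corresponding fstab entry for the bulk-storage HDD
# /dev/sdb1  /mnt/data  ext4  defaults,noatime  0  2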
3. Use a 64-bit kernel
Little depends on the amount of RAM: more of it will not raise FPS in games, and applications will not start faster. Using 64-bit applications likewise gives no gain for ordinary tasks, only for quite specific mathematical workloads and archiving. Nor are 64 bits required to address more than 4 GB of memory: PAE lets a 32-bit system address up to 64 GB.
With a 64-bit kernel, however, individual applications can address more than 4 GB of memory, which is quite useful: otherwise you can end up in a situation where the OOM killer kills a program even though there is still free RAM. Also, on a 64-bit system all physical memory can be addressed directly, while on a 32-bit system everything above roughly 800 MB has to be remapped constantly, which slightly slows page handling, although, as I said, this does not particularly affect overall speed.
I have also seen the OOM killer kill processes that did not seem to have reached 4 GB yet; this happened to me with some games, and switching to 64 bits solved it. So there is no way around a 64-bit kernel, even though it adds a small memory overhead.
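To check what you are running now, a trivial sketch:
# x86_64 means a 64-bit kernel; i686/i386 means 32-bit
uname -m
# Bitness of the userspace your shell was built for
getconf LONG_BIT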
4. Use the pf-kernel patchset
pf-kernel is a set of patches for the Linux kernel, maintained by the Ukrainian developer Alexander Natalenko (pfactum) and aimed at improving the desktop experience of Linux systems.
It consists of several patches. The most useful are BFS and BFQ, about which a lot has already been written. BFQ fights the problem of system stalls during heavy disk activity (the famous bug 12309, which is officially fixed but in practice keeps annoying people). BFS is a process scheduler better suited to desktop work than the one shipped in the mainline kernel. For example, CFS, which is used by default, allows a situation where two processes that need real-time-like priority end up on the same core while the other cores are busy with low-priority tasks; naturally, that behaviour leads to system-wide stutter. And yet CFS is the "completely fair scheduler". BFS is not as "fair", but it is much closer to the realities of desktop machines with a small number of cores (up to 4096 are supported).
To install it, I download the needed kernel version from kernel.org without the stable patches and apply pf-kernel on top. It looks roughly like this:
cd /usr/src
wget ftp://ftp.kernel.org/pub/linux/kernel/v3.x/linux-3.12.tar.xz
tar -xf linux-3.12.tar.xz
cd linux-3.12
wget https://pf.natalenko.name/sources/3.12/patch-3.12.4-pf.bz2
bunzip2 patch-3.12.4-pf.bz2
patch -p1 < patch-3.12.4-pf
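After patching you still have to configure and build the kernel; a typical continuation (not from the original article, just the usual routine, adjust to your distribution) would be something like:
# Start from the running kernel's configuration and answer questions for new options
cp /boot/config-$(uname -r) .config
make oldconfig
# ...or tune everything interactively, as described in the next section
make xconfig
# Build and install
make -j$(nproc) bzImage modules
sudo make modules_install install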
This is a very important patchset; it is what keeps the system responsive even under heavy load. As a result, for example, even at maximum load the application startup time stays the same as when the machine is idle!
Here, for example, is a screenshot of htop when Dota 2 + The Sims 3 (multiseat) is running:

With this load you can still work comfortably on the third screen, and 25% CPU oversubscription (over a 5-minute window, according to the load average) is not even felt. Although, of course, the CPU does need an upgrade :(
5. Tuning the kernel!
By default the kernel is built with not very optimal settings, a legacy of Linux's historical server focus and of debugging convenience.
So run make xconfig.
I will go over the options that matter most for optimization.
Turn off preemption, set a low timer frequency and turn off dynticks!
YES! Even contrary to the BFS documentation, we really do disable these "vital" responsiveness options. The reason is that they are outdated, they no longer help, and preemption actually hurts performance.
There was a time when I had a single-core processor and neither preemption nor the high-frequency timer was enabled in the stock kernels; back then, turning these options on had a huge effect. A heavyweight application eating 100% of the CPU, even combined with disk I/O and a shortage of RAM, no longer hurt interactivity and responsiveness. In those days the alternative was WinXP, and I hardly need to describe how badly XP behaves in such situations: it usually hangs solid, forcing you to reach for the reset button. So having a system that almost never stuttered or froze was nice.
But those days are gone: multi-core processors and huge amounts of memory solve the responsiveness-under-load problem by themselves, so solving it in software is not just useless but harmful.
So go to Processor type and features and set the Preemption Model parameter to No Forced Preemption (Server). Do not be scared by the phrase "occasional longer delays are possible": that problem is handled well enough by BFS and a multi-core CPU. As the help text says, we gain in "raw processing power".
Also, for optimization purposes, select your CPU under the Processor family parameter.
Next, set the Timer frequency parameter to 300 HZ. Going down to 100 Hz will not gain you much more, and there is not much point in it (the help explains why), but you can experiment. Also, 300 Hz divides evenly by 25 and 30, the typical video frame rates, which is said to help against tearing (that is from the help text; in practice only triple buffering + vsync really beats tearing).
There are a lot of interesting options in this section, so look around. For example, you can disable hot-plug support for CPUs and memory, since on a desktop it is physically impossible anyway (and hardly anyone needs to bring CPU cores on and off line on the fly).
Since I do not have a laptop, I turn off everything related to power saving, for example CPU Frequency scaling support altogether.
Now turn off the dynamic timer. I am not completely sure, since I did not test it specifically, but this option seems to cause constant "twitching" in some videos and especially in games. So go to General setup -> Timers subsystem and, for the Timer tick handling option, select Periodic timer ticks (constant rate, no dynticks).
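If you prefer the command line to xconfig, the same choices can be pre-set with the kernel's scripts/config helper and then reconciled with make oldconfig (a sketch; the symbol names below are what these menu entries mapped to in the 3.x kernels, so double-check them in your tree):
# Preemption Model: No Forced Preemption (Server)
./scripts/config --disable PREEMPT --disable PREEMPT_VOLUNTARY --enable PREEMPT_NONE
# Timer frequency: 300 Hz
./scripts/config --disable HZ_250 --enable HZ_300
# Timer tick handling: Periodic timer ticks (no dynticks)
./scripts/config --disable NO_HZ_IDLE --disable NO_HZ_FULL --enable HZ_PERIODIC
# Let Kconfig resolve the dependent values (CONFIG_HZ etc.)
make oldconfig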
Turn on BFQ
By default BFQ is disabled, so you need to enable it and make it the default scheduler.
Go to Enable the block layer -> IO Schedulers, enable the BFQ I/O scheduler and BFQ hierarchical scheduling support options, and for Default I/O scheduler choose, obviously, BFQ.
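In .config terms this corresponds roughly to the following (the BFQ symbols come from the out-of-tree patch, so verify their exact names against your patched tree; this is only a sketch):
# BFQ scheduler plus its cgroup-hierarchy support, and make it the default
./scripts/config --enable IOSCHED_BFQ --enable BFQ_GROUP_IOSCHED
./scripts/config --disable DEFAULT_CFQ --enable DEFAULT_BFQ
# oldconfig will then also set CONFIG_DEFAULT_IOSCHED="bfq"
make oldconfig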
6. Prelink
You can pre-link dynamic libraries with executables, which reduces application startup time even further. There is a separate article on this topic by peter23.
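In short, it usually comes down to installing the prelink package and running it over the whole system (a sketch, not from the original article; on Debian-based systems there is typically also a periodic job toggled via /etc/default/prelink that keeps re-running it for you):
sudo apt-get install prelink
# -a: process all binaries and libraries, -m: conserve memory, -R: randomize address ordering
sudo prelink -amR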
7. Conclusion
The most important thing I always notice is that after applying the patchset and tuning the kernel, the "twitching" in games goes away. The weaker the hardware, the more noticeable these stutters are, although I suspect this is still some problem in the nVidia drivers, because different driver versions behave differently.
For the sake of proof, I ran tests with Geekbench 3 from Steam and gputest; the results are a bit strange:
3.14-pf:
Single-Core Score 2421
Multi-Core Score 8209
gputest: 3720 pts, 62 FPS
3.13-generic:
Single-Core Score 2646
Multi-Core Score 8414
gputest: 3713 pts, 61 FPS
Windows:
Single-Core Score 2572
Multi-Core Score 8242
gputest: 3634 pts, 60 FPS
As you can see, for some reason the "optimized" kernel scores fewer points in the CPU test and more in the GPU test. Only now did I notice that I compared different kernel versions, which may explain the difference in results. When I have time I will run the same tests on 3.16 and hope to find the reason. The funny thing is that Windows comes out worse, especially, and noticeably so, in 3D.