Using, configuring, and testing distcc and ccache

Background:


At some point I decided to measure how effective tools such as distcc and ccache really are. I had been using them for a long time, but the question of whether they were actually worth it kept nagging at me. I finally got around to checking in practice. A word of warning: everything was done for my own needs, under real conditions, and the results of your tests may not match the ones below. We will install and configure both components, and then measure build speed with the time command.


Theory:


First, let's define what these tools are and what they are useful for, although if you are reading this you probably already know.
A few words about distcc from Wikipedia.
Distcc (distributed C/C++/ObjC compiler) is a tool that compiles source code with C/C++/ObjC compilers on remote machines, which speeds up the compilation process. It consists of two main parts: the server (distccd) and the client (distcc).
Ccache (compiler cache; pronounced “see-cash”) is a cache for C and C++ compilers on Linux and other Unix-like systems.
Using ccache can significantly speed up the build of packages or projects that are compiled repeatedly, since only the files that have changed since the last compilation are recompiled.

What do we do?


  1. Install and configure distcc and ccache.
  2. Check how effectively these tools work together and separately.

The following infrastructure is available: a 100 Mbit/s network with the following machines:

Role            CPU                                Memory   GCC     OS
Distcc client   Intel Pentium Dual E2160 1.80GHz   1 GB     4.4.5   gentoo x86
Distcc server   Intel Xeon 3.20GHz                 1 GB     4.4.5   gentoo x86
Distcc server   Intel Pentium D CPU 3.00GHz        1 GB     4.4.5   debian 5.0.7 x86
Distcc server   Intel Pentium 4 CPU 3.00GHz        1 GB     4.4.5   debian 5.0.7 x86

As you can see, the machines are rather weak, but they cope with their tasks. All third-party services were disabled for the duration of the tests. All machines also have the same version of GCC installed. Using the same version everywhere is recommended, although a difference in the last digit (4.4.*) is acceptable: GCC 4.4.3 was previously used on some of these machines and everything worked fine.

Installation and setup:


First you need to install and configure our tools.

Gentoo

The installation process for distcc and ccache is described in detail in the manual; I will only briefly go through the main stages and point out some nuances.
Install distcc:
emerge -avt distcc
Review the USE flags, keep the ones you need, drop the rest, and start the installation.
After distcc builds and installs, the distccd daemon settings are located in /etc/conf.d/distccd, and the list of hosts to use is in /etc/distcc/hosts. The distccd server settings are changed in /etc/conf.d/distccd.
Enable logging, which is needed for debugging and can be disabled later:
DISTCCD_OPTS="${DISTCCD_OPTS} --log-file /var/log/distcc.log --log-level info "
Define a list of hosts (networks) whose requests we will process:
DISTCCD_OPTS="${DISTCCD_OPTS} --allow 192.168.0.0/24"
Specify the address on which the daemon will listen:
DISTCCD_OPTS="${DISTCCD_OPTS} --listen 192.168.0.174"
You can also specify the priority (nice level) with which the service runs. This value should be chosen individually for each machine; the default is 15:
DISTCCD_OPTS="${DISTCCD_OPTS} -N 15"
You must also tell the system that distcc can be used by adding this to make.conf:
FEATURES="distcc"
And we will define a directory for temporary distcc files:
DISTCC_DIR="/tmp/.distcc"
If you can spare the memory and want more performance, you can place the temporary directory on tmpfs (a file system that lives in virtual memory, RAM + swap, and is well suited for temporary files):
mount -t tmpfs tmpfs -o size=850M /tmp/
Keep in mind that when tmpfs fills up, its data spills over into swap, which can adversely affect the performance of other programs.
For permanent use, add this line to /etc/fstab:
tmpfs /tmp tmpfs size=850M,mode=0777 0 0

Next, you need to start the service itself:
/etc/init.d/distccd start
You can also add it to the default runlevel so that it starts at boot:
rc-update add distccd default

As a reminder, the changes above were made for the server side of distcc.
Now let's see what needs to be done on the client side. Everything is simple here: add the hosts that act as distcc servers to the /etc/distcc/hosts file, on one line, separated by spaces (you can use either machine names or IP addresses). It is recommended to list the machines in order of decreasing power. A client can also act as a server, and vice versa.
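As an illustration, a hosts line for a client might look like the sketch below. The addresses are hypothetical (they are not the machines from the table above), and the optional /N suffix, which limits the number of simultaneous jobs sent to a host, comes from distcc's hosts syntax rather than from anything used in these tests. The snippet writes to a scratch file; on a real client the target would be /etc/distcc/hosts.

```shell
# Hypothetical server addresses, most powerful machine first.
# The optional /N suffix caps the number of jobs sent to that host at once.
hosts="192.168.0.10/4 192.168.0.11/2 192.168.0.12/2"

# Scratch location for the example; the real file is /etc/distcc/hosts.
hosts_file="${TMPDIR:-/tmp}/distcc-hosts.example"
printf '%s\n' "$hosts" > "$hosts_file"
cat "$hosts_file"
```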

Install ccache, following the manual:
emerge -avt ccache
Review the USE flags, keep the ones you need, drop the rest, and start the installation.
After ccache builds and installs, edit /etc/make.conf to tell the system to use ccache:
FEATURES="distcc ccache "
Set the cache size:
CCACHE_SIZE="2G"
Determine the location where the cache will be located:
CCACHE_DIR="/var/tmp/ccache"
You can also store the cache directory in tmpfs:
mount -t tmpfs tmpfs -o size=2G /var/tmp/ccache
or write it to fstab:
tmpfs /var/tmp/ccache tmpfs size=2G,mode=0777 0 0
However, if the system reboots, all data in tmpfs is lost, which throws away everything ccache has accumulated.

Debian

Install distcc. Since the client is a Gentoo machine, a few things have to be adjusted on the Debian machines.
Install distcc and ccache:
apt-get install distcc
apt-get install ccache
The parameters of the distccd daemon are in the /etc/default/distcc file, and the list of hosts to use is in /etc/distcc/hosts. We will make changes to /etc/default/distcc.
Enable the daemon to start when the machine boots:
STARTDISTCC="true"
Specify the subnet from which requests are accepted:
ALLOWEDNETS="192.168.3.0/24"
Specify the address on which the service will listen:
LISTENER="192.168.3.103"
Set the priority with which the process will run:
NICE="15"
Enable or disable discovery through mDNS/DNS-SD:
ZEROCONF="false"
If necessary, also fill in /etc/distcc/hosts as on Gentoo (my Debian machines are used only as distcc servers).

Now we need to add symbolic links to the version of gcc in use. Each link must have the name of the executable used on the client (for example, i686-pc-linux-gnu-c++):
cd /usr/bin/
ln -s ./g++-4.4 ./i686-pc-linux-gnu-c++
ln -s ./cpp-4.4 ./i686-pc-linux-gnu-cpp
ln -s ./gcc-4.4 ./i686-pc-linux-gnu-gcc
ln -s ./g++-4.4 ./i686-pc-linux-gnu-g++
Links such as i686-pc-linux-gnu-c++ are needed because the client system (Gentoo) invokes the compiler under these names when compiling; there they are hard links to the same gcc-4.4.

Start the distcc server:
/etc/init.d/distccd start
Also, for best results, I placed the build directory for temporary files on a tmpfs file system.

Testing


Now let's check the effectiveness in practice. For a clean experiment, each build was run 3 times. We will build mplayer, simply because it was the first package that came to hand. We will measure with the time utility, which reports the following values:
user - the total number of seconds of CPU time that the process spent in user mode.
sys - the total number of seconds of CPU time that the process spent in kernel mode.
real - the actual elapsed time (in seconds).
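Since each build is repeated three times, it can be handy to turn values such as 3m56.979s into plain seconds and average them. The helper below is only a sketch: the function names to_seconds and average_seconds are made up for this example, and it assumes the XmY.YYYs format shown in the results.

```shell
# Convert a value printed by time, such as 3m56.979s, into seconds.
to_seconds() {
  printf '%s\n' "$1" | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Average any number of such values (one per run).
average_seconds() {
  for t in "$@"; do to_seconds "$t"; done |
    LC_ALL=C awk '{ sum += $1 } END { if (NR > 0) printf "%.3f\n", sum / NR }'
}

# Example call with three placeholder values; prints 90.000
average_seconds 1m0.000s 1m30.000s 2m0.000s
```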

1 Build mplayer on the local host without using distcc and ccache (-distcc -ccache) with the -j2 option
time MAKEOPTS="-j2" FEATURES="-distcc -ccache" emerge -vt mplayer

        Run 1        Run 2        Run 3
real    3m56.979s    3m55.825s    3m55.745s
user    6m11.729s    6m12.151s    6m12.351s
sys     0m44.551s    0m43.660s    0m43.671s


2 Build mplayer on the local host without using distcc and ccache (-distcc -ccache) with the -j3 option
time MAKEOPTS="-j3" FEATURES="-distcc -ccache" emerge -vt mplayer

        Run 1        Run 2        Run 3
real    3m57.662s    3m58.108s    3m57.901s
user    6m15.214s    6m14.779s    6m14.894s
sys     0m43.392s    0m43.656s    0m43.464s


3 Build mplayer on the local host without using distcc, but with ccache enabled (-distcc ccache) with the -j2 option
time MAKEOPTS="-j2" FEATURES="-distcc ccache" emerge -vt mplayer

        Run 1        Run 2        Run 3
real    4m14.438s    0m58.587s    0m57.901s
user    6m28.199s    0m39.101s    0m38.940s
sys     1m0.705s     0m25.706s    0m25.564s


4 Build mplayer on the local host using distcc but without ccache (distcc -ccache) with the -j10 option
time MAKEOPTS="-j10" FEATURES="distcc -ccache" emerge -vt mplayer

        Run 1        Run 2        Run 3
real    2m26.516s    2m27.065s    2m26.965s
user    1m6.692s     1m6.643s     1m6.593s
sys     0m34.821s    0m34.856s    0m34.569s


5 Build mplayer on the local host using distcc and ccache (distcc ccache) with the -j10 option
time MAKEOPTS="-j10" FEATURES="distcc ccache" emerge -vt mplayer

        Run 1        Run 2        Run 3
real    2m28.590s    0m57.456s    0m57.411s
user    1m16.117s    0m39.892s    0m39.947s
sys     0m45.723s    0m24.781s    0m24.692s


I decided to try something heavier, for example, chromium.

6 First, build on the local machine without distcc and ccache.
time MAKEOPTS="-j2" FEATURES="-distcc -ccache" emerge -vt chromium

        Run 1         Run 2         Run 3
real    67m30.435s    66m50.345s    67m45.165s
user    116m58.378s   115m48.389s   116m59.548s
sys     11m57.523s    11m59.223s    11m57.530s


7 Now let's try building with distcc (distcc -ccache)
time MAKEOPTS="-j10" FEATURES="distcc -ccache" emerge -vt chromium

        Run 1         Run 2         Run 3
real    36m25.392s    35m15.291s    35m22.390s
user    16m2.063s     15m10.604s    15m1.423s
sys     8m8.307s      7m4.378s      7m5.156s


8 Already better. Now let's try the same thing, but using both distcc and ccache.
time MAKEOPTS="-j10" FEATURES="distcc ccache" emerge -vt chromium

        Run 1         Run 2         Run 3
real    32m2.656s     17m22.228s    16m59.284s
user    18m56.006s    18m1.757s     17m59.577s
sys     10m57.023s    9m36.169s     8m35.679s


Conclusion:

Now we can look at the whole picture and summarize the test results. The conclusion is this: using distcc cuts the build time roughly in half, and pairing it with ccache makes repeated builds about four times faster once the cache is warm (the first build with an empty cache gains little or nothing from ccache). For myself, I decided that I will continue to use these tools.
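To see where the "about half" and "about four times" figures come from, the speedups can be recomputed from the real times of the third run of each test. This is just a sketch of the arithmetic: the function name speedups is made up for the example, and only the measurements listed above are used.

```shell
# Speedup = baseline real time / accelerated real time (third run of each test).
speedups() {
  LC_ALL=C awk '
    # Convert a value such as 67m45.165s into seconds.
    function sec(t,   p) { split(t, p, "[ms]"); return p[1] * 60 + p[2] }
    function report(name, base, distcc, both) {
      printf "%s: distcc %.1fx, distcc+ccache %.1fx\n",
             name, sec(base) / sec(distcc), sec(base) / sec(both)
    }
    BEGIN {
      report("mplayer",  "3m55.745s",  "2m26.965s",  "0m57.411s")
      report("chromium", "67m45.165s", "35m22.390s", "16m59.284s")
    }'
}

speedups
# mplayer: distcc 1.6x, distcc+ccache 4.1x
# chromium: distcc 1.9x, distcc+ccache 4.0x
```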
