
How to quickly run volunteer distributed computing on hundreds of machines
Working in an IT department, I constantly see computers that sit idle for various organizational reasons. The golden days of mining bitcoins on a CPU are over, and in search of a new useful job for these machines I came to volunteer distributed computing, in particular the World Community Grid. The first machines tasked with finding a cure for cancer were a server from cold reserve and a low-priority virtual machine on a virtualization cluster. Workstations are harder: they constantly come and go, and installing, configuring, and later removing BOINC on each one takes a long time and is hardly an elegant approach.
I decided to build a live distribution with BOINC built in and serve it over the network. Turn on the computer, press F12, select the right menu item, and it is already benefiting humanity!
Debian was chosen as the platform, because a) I have long been familiar with it and b) it has a wonderful manual on the topic. Still, there were pitfalls, and here almost every new pitfall means a fairly long rebuild of the image. I hope this post saves some admin time and also reminds you that a wonderful project like WCG exists.
Note that everything was done in a very closed environment, so security needed very little attention. In your case, security may need additional work.
Preparation
The system consists of the following:
- Network boot server
- NFS server
- Assembly station
1. Network boot server. I already had everything ready: a configured TFTP and DHCP setup was left over from my thin-client project. If you do not have one, setting up a new one is easy. In a nutshell, install and run tftpd-hpa, and set options 66 and 67 in DHCP. Just do not let outsiders onto this network (in my case, cadets); that can be dangerous. In addition to a BIOS password, you can password-protect parts of the TFTP server's boot menu.
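As a rough illustration of "options 66 and 67", here is a minimal sketch that prints the corresponding lines for an ISC dhcpd configuration. The ISC dhcpd syntax, the loader name pxelinux.0, and the server address used in the demo call are assumptions; the article only says that the two options must be set.

```shell
#!/bin/sh
# Print an ISC dhcpd.conf fragment that sets DHCP options 66 and 67.
# Assumptions (not from the article): ISC dhcpd syntax, the default
# loader name pxelinux.0, and the TFTP server address given as $1.
pxe_dhcp_options() {
    tftp_server=$1
    boot_file=${2:-pxelinux.0}
    cat <<EOF
# Option 66: name or address of the TFTP server
option tftp-server-name "$tftp_server";
# Option 67: boot file requested from that server
option bootfile-name "$boot_file";
EOF
}

# Demo call with a hypothetical TFTP server address.
pxe_dhcp_options 192.168.15.20
```

The output is meant to be pasted into the relevant subnet or host block. Many PXE setups also set the next-server and filename directives there, which carry the same information in the BOOTP header fields.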
2. NFS server. First, BOINC needs somewhere to save its data while it works. We assume the local hard drive must not be touched, so we let clients write over NFS to a directory, for example /srv/boinc-nfs. Here each computer will create a subdirectory named after its MAC address. Second, the root FS for network boot will live in /srv/debian-live. So:
mkdir /srv/debian-live
mkdir /srv/boinc-nfs
chown nobody:nogroup /srv/boinc-nfs
chmod 755 /srv/boinc-nfs
In /etc/exports add:
/srv/boinc-nfs *(rw,sync,no_root_squash,no_subtree_check)
/srv/debian-live *(ro,async,no_root_squash,no_subtree_check)
after which we restart the service (for some reason the recommended exportfs -rv had no effect for me):
/etc/init.d/nfs-kernel-server restart
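Since a missing export only shows up much later as a failed boot, it can help to check the exports file before moving on. The following is a small sketch of such a check; the function name check_exports is hypothetical, and the demo runs on a temporary file rather than on /etc/exports.

```shell
#!/bin/sh
# check_exports FILE
# Report whether FILE exports both directories used in this article.
# Returns 0 only if both entries are present.
check_exports() {
    file=$1
    status=0
    for dir in /srv/boinc-nfs /srv/debian-live; do
        # An export line starts with the directory followed by whitespace.
        if grep -Eq "^${dir}[[:space:]]" "$file"; then
            echo "ok: $dir"
        else
            echo "missing: $dir"
            status=1
        fi
    done
    return $status
}

# Self-contained demo on a scratch copy of the two lines from the text.
demo=$(mktemp)
cat > "$demo" <<'EOF'
/srv/boinc-nfs *(rw,sync,no_root_squash,no_subtree_check)
/srv/debian-live *(ro,async,no_root_squash,no_subtree_check)
EOF
check_exports "$demo"
rm -f "$demo"
```

On the real server you would call check_exports /etc/exports, and after restarting the service exportfs -v should list what the kernel is actually exporting.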
3. Assembly station. This is just a virtual machine with plain Debian Wheezy and the live-build package installed, which will do the bulk of the work. This machine needs internet access.
Assembly process
We move to the assembly station.
mkdir /srv/live-default && cd /srv/live-default
We create the base config for our distribution, specifying the address of the NFS server:
lb config -b netboot --net-root-path "/srv/debian-live" --net-root-server "192.168.15.20"
This creates a directory tree; by changing its contents you can customize the build. We will add the following:
1. config/package-lists/boinc.list - a list of packages needed in our build. We write in it:
boinc-client
nfs-common
2. config/includes.chroot/etc/init.d/boinc-preps - an init script that mounts NFS, configures BOINC, and changes the hostname (identical hostnames may prevent WCG from telling the computers apart; with them, many tasks went into the detached state). In this script you need to put the address of your NFS server and the addresses of the hosts from which password-free management will be allowed. The contents of the script:
#!/bin/bash
### BEGIN INIT INFO
# Provides:          boinc-preps
# Required-Start:    nfs-common
# Required-Stop:
# Should-Start:
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Various stuff for BOINC
# Description:       Various stuff for BOINC
### END INIT INFO

PATH=/sbin:/usr/sbin:/bin:/usr/bin
. /lib/init/vars.sh

do_start () {
    # MAC address of eth0 with the colons removed; used as the machine ID.
    MYMAC=$(ifconfig eth0 | grep -o -E '([[:xdigit:]]{1,2}:){5}[[:xdigit:]]{1,2}' | sed 's/://g')
    ancien=$(hostname)        # old hostname
    nouveau=DYNWCG-$MYMAC     # new, unique hostname

    # Mount the shared NFS directory and create this machine's subdirectory.
    mkdir -p /mnt/boinc-nfs
    mount 192.168.15.20:/srv/boinc-nfs /mnt/boinc-nfs && mkdir -p "/mnt/boinc-nfs/$MYMAC"

    # Point BOINC at the NFS directory and run it as root.
    # On Debian the init script of the boinc-client package is named boinc-client.
    service boinc-client stop
    sed -i "s|^BOINC_DIR=.*|BOINC_DIR=/mnt/boinc-nfs/$MYMAC|;s|^BOINC_USER=.*|BOINC_USER=\"root\"|" /etc/default/boinc-client

    # Hosts allowed to manage this client remotely, and an empty GUI RPC password.
    echo "192.168.10.60" > "/mnt/boinc-nfs/$MYMAC/remote_hosts.cfg"
    echo "192.168.10.61" >> "/mnt/boinc-nfs/$MYMAC/remote_hosts.cfg"
    echo "" >> "/mnt/boinc-nfs/$MYMAC/gui_rpc_auth.cfg"

    # Replace the old hostname with the new one in the system files.
    for file in \
        /etc/hostname \
        /etc/hosts
        # you can add here
        #/etc/ssh/ssh_host_rsa_key.pub \
        #/etc/ssh/ssh_host_dsa_key.pub \
        # if SSH is needed
    do
        [ -f "$file" ] && sed -i.old -e "s:$ancien:$nouveau:g" "$file"
    done
    invoke-rc.d hostname.sh start
    invoke-rc.d networking force-reload
    service boinc-client start
}

case "$1" in
    start|"")
        do_start
        ;;
    restart|reload|force-reload|status)
        echo "Error: argument '$1' not supported" >&2
        exit 3
        ;;
    stop)
        # NOP
        exit 3
        ;;
    *)
        echo "Usage: ... [start|stop]" >&2
        exit 3
        ;;
esac
:
3. config/hooks/boinc-preps-init.chroot - a one-command script that runs during the build and adds boinc-preps from the previous item to startup:
#!/bin/sh
update-rc.d boinc-preps defaults
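Because every mistake in boinc-preps costs a full rebuild, it may be worth trying its two text transformations in isolation first. The sketch below is an illustration, not part of the image: the helper names mac_to_id and rewrite_defaults, the sample MAC address, and the sample contents of the defaults file are assumptions; only the variable names BOINC_DIR and BOINC_USER and the DYNWCG- prefix come from the script above.

```shell
#!/bin/sh
# mac_to_id MAC
# Strip the colons from a MAC address, as boinc-preps does.
mac_to_id() {
    printf '%s\n' "$1" | tr -d ':'
}

# rewrite_defaults FILE DATA_DIR
# Set BOINC_DIR to DATA_DIR and BOINC_USER to root in FILE.
# Uses a temporary file instead of sed -i to stay portable.
rewrite_defaults() {
    sed -e "s|^BOINC_DIR=.*|BOINC_DIR=\"$2\"|" \
        -e 's|^BOINC_USER=.*|BOINC_USER="root"|' "$1" > "$1.new" &&
        mv "$1.new" "$1"
}

# Demo on a scratch file, so nothing under /etc is touched.
mac=00:1a:2b:3c:4d:5e                      # hypothetical address
id=$(mac_to_id "$mac")
conf=$(mktemp)
# Imitation of /etc/default/boinc-client (assumed contents).
printf 'BOINC_DIR="/var/lib/boinc-client"\nBOINC_USER="boinc"\n' > "$conf"
rewrite_defaults "$conf" "/mnt/boinc-nfs/$id"
echo "hostname: DYNWCG-$id"
cat "$conf"
rm -f "$conf"
```

Running it on the assembly station shows the hostname and the two rewritten lines a client with that MAC address would end up with.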
After adding the necessary settings, we start the assembly itself:
lb build
What we need is the resulting file binary.netboot.tar. Unpack it in /srv:
cd /srv && tar -xvf live-default/binary.netboot.tar
It will be unpacked into /srv/debian-live (the root FS for network boot) and /srv/tftpboot (files for the TFTP server). In my case the assembly station and the NFS server are the same computer, so /srv/debian-live is already in its place. The contents of tftpboot are a ready-made boot menu that must be placed on the TFTP server. I did not test it, since I have a working TFTP server with my own menu and needed only part of the data from here. First, I copied all the files from tftpboot/live/ to images/debian-live/ on the TFTP server (relative to the TFTP server's root directory). Second, from tftpboot/live.cfg I borrowed the text that adds a new menu item, changing it to the following (here, too, you need to specify the address of your NFS server):
label live-686-pae
menu label BOINC-live (686-pae)
linux images/debian-live/vmlinuz1
initrd images/debian-live/initrd1.img
append boot=live config nosplash root=/dev/nfs nfsroot=192.168.15.20:/srv/debian-live
Now everything is in place.
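The NFS server address appears both in the lb config call and in this menu entry, so one way to keep them in sync is to generate the entry from a single value. A minimal sketch, assuming the menu text shown above is used unchanged; the function name pxe_entry is hypothetical.

```shell
#!/bin/sh
# pxe_entry NFS_SERVER [ROOT_PATH]
# Print the boot menu entry from the article for the given NFS server.
# ROOT_PATH defaults to /srv/debian-live, the path used in the article.
pxe_entry() {
    nfs_server=$1
    root_path=${2:-/srv/debian-live}
    cat <<EOF
label live-686-pae
menu label BOINC-live (686-pae)
linux images/debian-live/vmlinuz1
initrd images/debian-live/initrd1.img
append boot=live config nosplash root=/dev/nfs nfsroot=$nfs_server:$root_path
EOF
}

# Demo: the entry for the NFS server used throughout the article.
pxe_entry 192.168.15.20
```

The output can be redirected into your own menu file on the TFTP server; where that file lives depends on your existing setup.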
If you want to inspect the contents of the root FS after the build, you do not have to boot from it; you can simply mount it as a loop device (the mount point /mnt/squash/ must already exist):
mount -o loop,ro /srv/debian-live/live/filesystem.squashfs /mnt/squash/
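With the image mounted, a quick check can confirm that the init script made it into the root FS and was registered for startup. This is only a sketch under assumptions: the function name check_rootfs is hypothetical, and it assumes that update-rc.d created a start link of the form S??boinc-preps in etc/rc2.d (the exact number depends on dependency ordering). The demo builds a fake tree in a temporary directory instead of touching a real mount.

```shell
#!/bin/sh
# check_rootfs DIR
# DIR is a mounted or unpacked root filesystem. Returns 0 if
# etc/init.d/boinc-preps is executable and has a start link in rc2.d.
check_rootfs() {
    root=$1
    script=$root/etc/init.d/boinc-preps
    if [ ! -x "$script" ]; then
        echo "boinc-preps is missing or not executable"
        return 1
    fi
    # If the glob matches nothing, the pattern stays literal and fails both tests.
    for link in "$root"/etc/rc2.d/S[0-9][0-9]boinc-preps; do
        if [ -e "$link" ] || [ -L "$link" ]; then
            echo "boinc-preps is enabled in runlevel 2"
            return 0
        fi
    done
    echo "boinc-preps has no start link in rc2.d"
    return 1
}

# Demo on a fake tree that imitates the relevant part of the image.
demo=$(mktemp -d)
mkdir -p "$demo/etc/init.d" "$demo/etc/rc2.d"
printf '#!/bin/sh\n' > "$demo/etc/init.d/boinc-preps"
chmod +x "$demo/etc/init.d/boinc-preps"
ln -s ../init.d/boinc-preps "$demo/etc/rc2.d/S01boinc-preps"
check_rootfs "$demo"
rm -rf "$demo"
```

Against the real image the call would be check_rootfs /mnt/squash.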
If you need to rebuild the distribution with new parameters, first run either lb clean --binary or lb clean.
Usage
- Turn on the computer and select network boot (usually by pressing F12).
- Depending on the boot menu, either select the "BOINC-live" item or simply wait until it boots by timeout.
- At the command prompt that appears (if everything went as it should), type sudo ifconfig (no password is needed) and write down the IP address.
- On the management computer (one of those we listed in config/includes.chroot/etc/init.d/boinc-preps), run boinc-manager and click "Advanced - Change computer" (this item is available only in the "Full view"). BOINC should not ask for any password here.
- After connecting, a wizard appears in which you select a project (in my case, the World Community Grid) and enter the login and password.
That is all: in a few minutes new tasks will appear in the "Ready to run" and "Running" states.
This procedure needs to be performed only once per computer (more precisely, per MAC address). Even when a computer comes back to you after long service elsewhere, it will find its data on the NFS server by its MAC address and resume work immediately after power-on (some tasks will have expired by then, but that is a trifle; it will receive new ones).
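The same remote access that boinc-manager uses is also available from the command line through boinccmd, which may be handy once there are many clients. The sketch below is untested against this setup and makes several assumptions: that boinccmd is installed on the management computer, that the installed version supports --get_tasks, and that the client address shown is a placeholder. The wrapper name boinc_remote and the DRY_RUN switch are hypothetical.

```shell
#!/bin/sh
# boinc_remote HOST ARGS...
# Run boinccmd against a remote client. With DRY_RUN=1 the command is
# only printed, which makes it easy to review before touching a client.
boinc_remote() {
    host=$1
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "boinccmd --host $host $*"
    else
        boinccmd --host "$host" "$@"
    fi
}

# Demo: print the command that would list the tasks of one client.
DRY_RUN=1
boinc_remote 192.168.15.101 --get_tasks   # hypothetical client address
```

boinccmd also has a --project_attach command that takes a project URL and an account key, so the wizard step could possibly be scripted the same way, though that was not tried here.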
Summary
What remains unresolved:
- Autologin in WCG. It is probably possible to embed the project's config in the image so that nothing has to be entered by hand at all, but I could not find a solution in a reasonable amount of time.
- Email notifications. To avoid typing "sudo ifconfig", a computer could automatically send its address to the administrator when it starts up. However, I did not do this because it would have been a hack, and it is better to implement the previous item instead.
- The IP of the NFS server is specified twice; perhaps one of them can be removed.
- The MAC address of the eth0 interface is always used to name the computer's directory. Whether that is good or not, I cannot say for sure.
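For the last two items, one possible direction is to derive both values at boot time instead of hard-coding them. The sketch below is untested in the image and rests on assumptions: that the nfsroot= parameter from the boot menu is visible in /proc/cmdline on the client, and that /sys/class/net/*/address holds the MAC addresses. Both function names are hypothetical.

```shell
#!/bin/sh
# nfs_server_from_cmdline "KERNEL CMDLINE"
# Print the server part of nfsroot=SERVER:/path, or fail if absent.
nfs_server_from_cmdline() {
    # $1 is deliberately unquoted so it splits into words.
    for word in $1; do
        case $word in
            nfsroot=*:*)
                value=${word#nfsroot=}
                echo "${value%%:*}"
                return 0
                ;;
        esac
    done
    return 1
}

# first_mac_id [NET_DIR]
# Print the MAC (colons removed) of the alphabetically first interface
# other than lo under NET_DIR (default /sys/class/net).
first_mac_id() {
    net_dir=${1:-/sys/class/net}
    for iface in "$net_dir"/*; do
        [ -r "$iface/address" ] || continue
        [ "$(basename "$iface")" = lo ] && continue
        tr -d ':' < "$iface/address"
        return 0
    done
    return 1
}

# Demo with the append line from the boot menu entry above.
nfs_server_from_cmdline "boot=live config nosplash root=/dev/nfs nfsroot=192.168.15.20:/srv/debian-live"
first_mac_id || echo "no network interface found"
```

In the init script the first function would be called as nfs_server_from_cmdline "$(cat /proc/cmdline)". Note that a machine whose first interface changes would get a new directory on the NFS server.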
Do not let computers get bored! And write comments; I will gladly answer or extend the article.