How to quickly start volunteer distributed computing on hundreds of machines

    Working in an IT department, I constantly see computers sitting idle for various organizational reasons. The golden days of mining bitcoins on the CPU are over, and in search of a new useful occupation for them I came to volunteer distributed computing, in particular the World Community Grid. The first machines put to work looking for a cure for cancer were a server from cold reserve and a low-priority virtual machine on the virtualization cluster. Workstations are harder: they constantly come and go, and installing, configuring, and later removing BOINC on each one is slow and inelegant.

    So I decided to build a live distribution with BOINC baked in and serve it over the network. Turn on the computer, press F12, select the right menu item, and it is already benefiting humanity!

    I chose Debian as the platform because a) I have known it for a long time and b) it has a wonderful manual on the topic. Still, there were pitfalls, and here almost every new pitfall means a fairly long rebuild of the image. I hope this post saves some admin a bit of time and also reminds you that a project as wonderful as WCG exists.

    Note that all of this was done in a very closed environment, so security received very little attention. In your case, security may need additional work.

    Preparation


    The system consists of the following:
    1. Network boot server.
    2. NFS server.
    3. Build station.
    In my case, 2 and 3 are the same machine.

    1. Network boot server. I already had everything ready: a configured TFTP and DHCP setup left over from my thin-client project. If you do not have one, setting it up is easy. In a nutshell, install and start tftpd-hpa, and set options 66 and 67 in DHCP. Just do not let anyone who happens to be on the network boot from it (in my case, that means cadets); this can be dangerous. Besides setting a BIOS password, you can password-protect part of the TFTP server's boot menu.
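
    For reference, options 66 and 67 are the TFTP server name and the boot file name. Below is a minimal sketch that prints the matching lines for an ISC dhcpd configuration. The function name, the server address 192.168.15.10, and the boot file pxelinux.0 are assumptions for illustration and are not taken from my setup; other DHCP servers use different syntax.

```shell
#!/bin/sh
# Print an ISC dhcpd fragment that sets DHCP options 66 and 67.
# Both arguments are placeholders; substitute your own values.
pxe_dhcp_fragment() {
    tftp_server=$1   # option 66: TFTP server name or address
    boot_file=$2     # option 67: boot file name, relative to the TFTP root
    printf 'option tftp-server-name "%s";\n' "$tftp_server"
    printf 'option bootfile-name "%s";\n' "$boot_file"
}

# Hypothetical values, for demonstration only.
pxe_dhcp_fragment 192.168.15.10 pxelinux.0
```

    The printed lines belong inside the relevant subnet or host block of dhcpd.conf. Some PXE ROMs read the next-server and filename fields instead, so both may need to be set.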

    2. NFS server. First, BOINC has to be able to save its data as it works. The local hard drive is assumed to be off limits, so we allow NFS writes to a directory, for example /srv/boinc-nfs. Each computer will create a subdirectory there named after its MAC address. Second, the directory /srv/debian-live will hold the root FS for network boot. So:
    mkdir /srv/debian-live
    mkdir /srv/boinc-nfs
    chown nobody:nogroup /srv/boinc-nfs
    chmod 755 /srv/boinc-nfs

    Add to /etc/exports:
    /srv/boinc-nfs *(rw,sync,no_root_squash,no_subtree_check)
    /srv/debian-live *(ro,async,no_root_squash,no_subtree_check)

    then restart the service (for some reason the recommended exportfs -rv had no effect for me):
    /etc/init.d/nfs-kernel-server restart
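
    Since the image tends to get rebuilt many times, it helps if the export setup can be re-run safely. A minimal sketch of an idempotent append follows. The helper name add_export and the EXPORTS variable are made up for illustration, and the demo writes to a scratch file rather than to the real /etc/exports.

```shell
#!/bin/sh
# Append a line to an exports-style file only if that exact line is not there yet.
add_export() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demo against a scratch file; set EXPORTS=/etc/exports on the real server.
EXPORTS=${EXPORTS:-$(mktemp)}
add_export "$EXPORTS" '/srv/boinc-nfs *(rw,sync,no_root_squash,no_subtree_check)'
add_export "$EXPORTS" '/srv/debian-live *(ro,async,no_root_squash,no_subtree_check)'
# Repeating a line changes nothing.
add_export "$EXPORTS" '/srv/debian-live *(ro,async,no_root_squash,no_subtree_check)'
cat "$EXPORTS"
```

    After changing the real file, the NFS service still has to be restarted as described above.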

    3. Build station. This is just a virtual machine running plain Debian Wheezy. Install the live-build package on it; it will do the bulk of the work. This machine needs internet access.

    Build process


    Switch to the build station.
    mkdir /srv/live-default && cd /srv/live-default

    We create the base config for our distribution, specifying the address of the NFS server:
    lb config -b netboot --net-root-path "/srv/debian-live" --net-root-server "192.168.15.20"

    This creates a directory tree; by placing various content in it you can customize your build. We will add the following:
    1. config/package-lists/boinc.list: the list of packages our build needs. Put the following in it:
    boinc-client
    nfs-common

    2. config/includes.chroot/etc/init.d/boinc-preps: an init script that mounts NFS, configures BOINC, and changes the hostname (identical hostnames may prevent WCG from telling the computers apart; with them, many tasks went into the detached state). In this script you need to insert the address of your NFS server and the addresses of the hosts that will be allowed to manage BOINC without a password. The contents of the script:
    #!/bin/bash
    ### BEGIN INIT INFO
    # Provides:          boinc-preps
    # Required-Start:    nfs-common
    # Required-Stop:
    # Should-Start:
    # Default-Start:     2 3 4 5
    # Default-Stop:      0 1 6
    # Short-Description: Various stuff for BOINC
    # Description:       Various stuff for BOINC
    ### END INIT INFO
    PATH=/sbin:/usr/sbin:/bin:/usr/bin
    . /lib/init/vars.sh
    do_start () {
      MYMAC=`ifconfig eth0 | grep -o -E '([[:xdigit:]]{1,2}:){5}[[:xdigit:]]{1,2}' | sed s/://g`
      ancien=`hostname`
      nouveau=DYNWCG-$MYMAC
      mkdir -p /mnt/boinc-nfs
      mount 192.168.15.20:/srv/boinc-nfs /mnt/boinc-nfs && mkdir -p /mnt/boinc-nfs/$MYMAC
      service boinc stop
      sed -i "s/^BOINC_DIR=.*/BOINC_DIR=\/mnt\/boinc-nfs\/$MYMAC/;s/^BOINC_USER=.*/BOINC_USER=\"root\"/" /etc/default/boinc-client
      echo "192.168.10.60" > /mnt/boinc-nfs/$MYMAC/remote_hosts.cfg
      echo "192.168.10.61" >> /mnt/boinc-nfs/$MYMAC/remote_hosts.cfg
      echo "" >> /mnt/boinc-nfs/$MYMAC/gui_rpc_auth.cfg
      for file in \
        /etc/hostname \
        /etc/hosts
        # you can also add these here
        #/etc/ssh/ssh_host_rsa_key.pub \
        #/etc/ssh/ssh_host_dsa_key.pub \
        # if you need SSH
      do
        [ -f $file ] && sed -i.old -e "s:$ancien:$nouveau:g" $file
      done
      invoke-rc.d hostname.sh start
      invoke-rc.d networking force-reload
      service boinc start
    }
    case "$1" in
      start|"")
            do_start
            ;;
      restart|reload|force-reload|status)
            echo "Error: argument '$1' not supported" >&2
            exit 3
            ;;
      stop)
            # NOP
            exit 3
            ;;
      *)
            echo "Usage: ... [start|stop]" >&2
            exit 3
            ;;
    esac
    :

    3. config/hooks/boinc-preps-init.chroot: a one-command script that runs during the build and adds boinc-preps from the previous item to startup:
    #!/bin/sh
    update-rc.d boinc-preps defaults
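
    The MAC handling inside boinc-preps can also be pulled out into small helpers, which makes it easier to try on another interface. This is a minimal sketch, assuming a kernel that exposes /sys/class/net/<interface>/address; the function names are made up for illustration. Keep in mind that sysfs reports lowercase hex, which may differ in case from what ifconfig prints, so switching on machines that have already registered could change their directory names.

```shell
#!/bin/sh
# Turn a MAC address into the directory/hostname suffix by dropping the colons,
# the same transformation the init script applies with sed.
mac_to_dirname() {
    printf '%s\n' "$1" | tr -d ':'
}

# Read the MAC of the given interface from sysfs.
# Prints nothing if the interface does not exist.
iface_mac() {
    cat "/sys/class/net/$1/address" 2>/dev/null
}

# Example with a made-up address.
mac_to_dirname "00:1a:2b:3c:4d:5e"
```

    In the init script this would be used as MYMAC=$(mac_to_dirname "$(iface_mac eth0)").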

    After adding the necessary settings, we start the build itself:
    lb build

    What we need is the resulting file binary.netboot.tar. Unpack it in /srv:
    cd /srv && tar -xvf live-default/binary.netboot.tar

    It unpacks into /srv/debian-live (the root FS for network boot) and /srv/tftpboot (the files for the TFTP server). In my case the build station and the NFS server are the same computer, so /srv/debian-live is already where it belongs.
    The contents of tftpboot are a ready-made boot menu that has to be placed on the TFTP server. I did not test it, since I already have a working TFTP server with my own menu and needed only part of the data. First, I copied all the files from tftpboot/live/ to images/debian-live/ on the TFTP server (relative to the TFTP root directory). Second, I borrowed from tftpboot/live.cfg the text that adds a new menu item and changed it to the following (here, too, you need to specify the address of your NFS server):
    label live-686-pae
            menu label BOINC-live (686-pae)
            linux images/debian-live/vmlinuz1
            initrd images/debian-live/initrd1.img
            append boot=live config nosplash root=/dev/nfs nfsroot=192.168.15.20:/srv/debian-live

    Now everything is in place.

    If you want to inspect the root FS after the build, you do not have to boot from it; you can simply mount it as a loop device:
    mount -o loop,ro /srv/debian-live/live/filesystem.squashfs /mnt/squash/

    If you need to rebuild the distribution with new parameters, run either lb clean --binary or lb clean first.

    Usage


    1. Turn on the computer and choose network boot (usually by pressing F12).
    2. Depending on the boot menu, either select the “BOINC-live” item or simply wait for it to boot on timeout.
    3. At the command line that appears (if everything went as it should), run sudo ifconfig (no password is required) and note the IP address.
    4. On the management computer (one of those listed in config/includes.chroot/etc/init.d/boinc-preps), start boinc-manager and click "Advanced - Select computer" (this item exists only in the advanced view). BOINC should connect without asking for a password.
    5. After connecting, a wizard appears in which you select a project (in my case, World Community Grid) and enter the login and password.

    That's it: within a few minutes new tasks will appear in the "Ready to start" and "Running" states.
    This procedure has to be done only once per computer (more precisely, per MAC address). Even when a computer comes back to you after a long stint elsewhere, it will find its data on the NFS server by its MAC address and resume work right after power-on (some tasks will have expired by then, but that is a trifle; it will fetch new ones).
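
    Because every computer leaves a directory named after its MAC address on the share, the NFS server itself shows which machines have ever registered. A minimal sketch that lists them follows; the function name is made up for illustration, and it assumes directory names are exactly 12 hex digits, as the init script produces.

```shell
#!/bin/sh
# List subdirectories of the share whose names look like a colon-less MAC
# address (exactly 12 hex digits). Anything else is skipped.
list_known_machines() {
    share=$1
    for d in "$share"/*/; do
        [ -d "$d" ] || continue
        name=$(basename "$d")
        case $name in
            *[!0-9A-Fa-f]*) continue ;;   # contains a non-hex character
        esac
        if [ "${#name}" -eq 12 ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Prints nothing if the share does not exist on this machine.
list_known_machines /srv/boinc-nfs
```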

    Summary


    What remains unresolved:
    • Autologin to WCG. It is probably possible to embed the project config in the image so that nothing has to be typed by hand at all, but I could not find a solution in a reasonable time.
    • Email notifications. To avoid typing “sudo ifconfig”, the booted computer could automatically send its address to the administrator. I did not do this because what I came up with was a kludge, and it is better to implement the previous item instead.
    • The IP of the NFS server is specified twice; perhaps one of the two can be removed.
    • The MAC address of the eth0 interface is always used to name the computer's directory. Whether that is good or bad, I cannot say for sure.
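
    One direction for the autologin item that I have not tested is attaching from the management computer with boinccmd instead of clicking through the wizard. The sketch below only prints the command by default. The wrapper name, the client IP, the project URL, and the account key are all placeholders, and depending on your setup boinccmd may also need --passwd.

```shell
#!/bin/sh
# Build (and optionally run) a boinccmd call that attaches a remote client
# to a project. With DRY_RUN=1 (the default here) the command is only printed.
attach_client() {
    host=$1   # IP of the freshly booted client
    url=$2    # project URL
    key=$3    # account key from the project's account page
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf 'boinccmd --host %s --project_attach %s %s\n' "$host" "$url" "$key"
    else
        boinccmd --host "$host" --project_attach "$url" "$key"
    fi
}

# Placeholder values: verify the project URL and use your own account key.
attach_client 192.168.15.101 http://www.worldcommunitygrid.org/ YOUR_ACCOUNT_KEY
```

    Run it with DRY_RUN=0 only from a host listed in remote_hosts.cfg, since other hosts will be refused by the client.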

    Do not let computers get bored! And write comments; I will gladly answer or add to the article.
