Aecktann December 25, 2012 at 10:02

How to Become a Puppeteer or Puppet for Beginners

Tutorial

Hello.

This post opens a series of articles on using the Puppet configuration management system.

What is a configuration management system?

Suppose you have a fleet of servers that perform various tasks. As long as there are only a few servers and you are not growing, you can easily configure each one by hand: install the OS (maybe automatically), add users, install software by typing commands in the console, configure services, edit the configs of your favorite text editors (nanorc, vimrc), set the same DNS settings everywhere, install the monitoring agent, configure syslog for centralized log collection... In short, there is a lot of work and it is not particularly interesting.

I sincerely believe that a good admin is a lazy admin, one who does not like doing the same thing several times. The first thought is to write a couple of scripts containing something like:

servers.sh

servers="server00 server01 server02 server03 server04"
for server in $servers ; do
  scp /path/to/job/file/job.sh "$server":/tmp/job.sh
  ssh "$server" sh /tmp/job.sh
done

job.sh

#!/bin/bash
apt-get update
apt-get install -y nginx
service nginx start

Everything seems easy and good. Need to do something? Write a new script and run it. The changes reach all servers one after another. If the script is well debugged, everything will be fine. For the time being.

Now imagine there are more servers. A hundred, say. And the change takes a long time, for example building something big and scary (like the kernel) from source. The script will run for a hundred years, but that is not the worst part.

Imagine that you need to do this only on a specific group within those hundred servers. And two days later you need to run another big task on a different subset of servers. Each time you will have to rewrite the scripts and check again and again whether they contain errors or will cause problems when run.

The worst part is that such scripts describe the actions needed to bring the system into a specific state, not the state itself. So if the system was not initially in the state you expected, everything will definitely go wrong. Puppet manifests declaratively describe the desired state of the system, and working out how to get there from the current state is the job of the configuration management system itself.

For comparison, here is a Puppet manifest that does the same job as the pair of scripts at the beginning of this post:

nginx.pp

class nginx {
  package { 'nginx':
    ensure => latest
  }
  service { 'nginx':
    ensure => running,
    enable => true,
    require => Package['nginx']
  }
}
node /^server(\d+)$/ {
  include nginx
}

If you do things properly and spend some time on the initial setup of the configuration management system, you can bring your server fleet to a state where you no longer need to log in to the servers to get work done. All necessary changes will reach them automatically.

What is Puppet?

Puppet is a configuration management system. Its architecture is client-server: the configs (called manifests in Puppet terms) are stored on the server; clients contact the server, receive them and apply them. Puppet is written in Ruby, and manifests are written in a special DSL that looks a lot like Ruby.

First steps

Let's forget about clients, servers, their interaction, etc. Suppose we have only one server on which a bare OS is installed (hereinafter I work in Ubuntu 12.04, for other systems the actions will be slightly different).

First, install the latest version of Puppet.

wget http://apt.puppetlabs.com/puppetlabs-release-precise.deb
dpkg -i puppetlabs-release-precise.deb
apt-get update
apt-get install puppet puppetmaster

Wonderful. Now puppet is installed in our system and you can play with it.

Hello world!

Create the first manifest:

/tmp/helloworld.pp

file { '/tmp/helloworld':
  ensure => present,
  content => 'Hello, world!',
  mode => '0644',
  owner => 'root',
  group => 'root'
}

And apply it:

$ puppet apply /tmp/helloworld.pp
/Stage[main]//File[/tmp/helloworld]/ensure: created
Finished catalog run in 0.06 seconds

A little bit about launch

The manifests in this post can be applied manually with puppet apply. In the following posts, however, the master/agent configuration (the standard one for Puppet) will be used.

Now look at the contents of the /tmp/helloworld file. It will contain (surprise!) the line “Hello, world!” that we set in the manifest.

You might say that echo "Hello, world!" > /tmp/helloworld would have been faster and simpler, with no need to think or to write scary manifests, and that all of this is ~~of no use to anyone~~ somehow too complicated. But think about it more seriously. In fact, you would have to write

touch /tmp/helloworld && echo "Hello, world!" > /tmp/helloworld && chmod 644 /tmp/helloworld && chown root /tmp/helloworld && chgrp root /tmp/helloworld

in order to guarantee the same result.

Let's look line by line at what exactly our manifest contains:

/tmp/helloworld.pp

file { '/tmp/helloworld':
    ensure  => present,          # the file must exist
    content => 'Hello, world!',  # the file must contain the string "Hello, world!"
    mode    => '0644',           # file permissions: 0644
    owner   => 'root',           # file owner: root
    group   => 'root'            # file group: root
}

In Puppet terms, this describes a resource of type file with the name (title) /tmp/helloworld.
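
Every resource declaration has the same general shape: a type, a title and a list of attributes. As a minimal sketch, here are two more file resources; both paths are hypothetical and were chosen only for illustration:

```puppet
# General shape: type { 'title': attribute => value, ... }

# A directory instead of a regular file (hypothetical path).
file { '/tmp/helloworld.d':
  ensure => directory,
  mode   => '0755',
  owner  => 'root',
  group  => 'root',
}

# "ensure => absent" declares that the file must NOT exist (hypothetical path).
file { '/tmp/helloworld.old':
  ensure => absent,
}
```

Applying such a manifest a second time should change nothing, because the described state has already been reached.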

Resources

A resource is the smallest unit of abstraction in Puppet. Resources can be:

files;
packages (Puppet supports the package systems of many distributions);
services;
users;
groups;
cron jobs;
etc.

You can look up the syntax of resources in the documentation.
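
To give a feel for resource types other than file, here is a small sketch with a user and a cron job. The account name, the job name and the schedule are assumptions made up for this example, not part of the original setup:

```puppet
# Hypothetical service account; 'deploy' is an illustrative name.
user { 'deploy':
  ensure     => present,
  shell      => '/bin/bash',
  managehome => true,   # create the home directory together with the user
}

# Hypothetical nightly job run as root at 03:00.
cron { 'logrotate-nightly':
  ensure  => present,
  command => '/usr/sbin/logrotate /etc/logrotate.conf',
  user    => 'root',
  hour    => 3,
  minute  => 0,
}
```

A handy way to explore a type is to run puppet resource user root, which prints the current state of an existing resource in manifest syntax.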

Puppet lets you define your own resource types. So, if you put in the effort, you can end up with manifests like:

webserver.pp

include webserver
webserver::vhost { 'example.com':
    ensure => present,
    size   => '1G',
    php    => false,
    https  => true  
}

In this case, Puppet will create a logical volume of 1 GiB on the server, mount it where it should be (for example, in /var/www/example.com), add the necessary entries to fstab, create the necessary virtual hosts in nginx and apache, restart both daemons, add example.com to ftp and sftp with the password mySuperSecretPassWord with write access to this virtual host.
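
A resource like this is usually implemented as a defined type. The following is only a rough sketch of what a heavily reduced webserver::vhost might look like. It manages just the document root and an nginx config, and it leaves out the logical volume, fstab, apache and ftp parts entirely. The module layout, the paths and the assumption that class webserver declares Service['nginx'] are all hypothetical:

```puppet
# webserver/manifests/vhost.pp (hypothetical module layout)
define webserver::vhost (
  $ensure = 'present',
  $size   = '1G',     # accepted but unused: volume handling is omitted in this sketch
  $php    = false,    # accepted but unused in this sketch
  $https  = false     # accepted but unused in this sketch
) {
  # $name is the resource title, e.g. 'example.com'
  $docroot = "/var/www/${name}"

  # Map the vhost state onto a state the file type understands.
  $dir_ensure = $ensure ? {
    'absent' => 'absent',
    default  => 'directory',
  }

  file { $docroot:
    ensure => $dir_ensure,
    force  => true,   # allow removing a non-empty directory when ensure is absent
    owner  => 'www-data',
    group  => 'www-data',
    mode   => '0755',
  }

  file { "/etc/nginx/sites-enabled/${name}":
    ensure  => $ensure,
    content => "server {\n  listen 80;\n  server_name ${name};\n  root ${docroot};\n}\n",
    notify  => Service['nginx'],   # assumes class webserver declares this service
  }
}
```

The point of the sketch is the interface: whoever declares a vhost only states the desired properties, and all the details live inside the define.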

Tasty? That's putting it mildly!

Moreover, the tastiest part, in my opinion, is not the automation of routine. If, for example, you are an idiot and keep rebuilding your production servers, Puppet will let you bring back the old, lovingly assembled set of packages and configs from scratch, fully automatically. You simply install the Puppet agent, connect it to your Puppet master and wait. Everything arrives by itself. Packages magically (no, really magically!) appear on the server, your ssh keys are put in place, the firewall is set up, your personal bash and network settings arrive, and all the software you prudently installed through Puppet is installed and configured.
In addition, with some diligence Puppet gives you a self-documenting system, because the configuration (the manifests) is itself the backbone of the documentation. The manifests are always up to date (they are what is actually running), free of errors (you test your settings before rolling them out) and at least minimally detailed (it works, after all).

Some more magic

A bit about cross-distribution

Puppet can use cross-distribution manifests; this is one of the purposes it was created for. I have deliberately never used this and do not recommend it to you. The server fleet should be as homogeneous as possible in terms of system software. That saves you from thinking, at critical moments, “damn, it's rc.d here, not init.d” (a nod to ArchLinux) and generally lets you think less about routine tasks.

Many resources depend on other resources. For example, the “sshd service” resource requires the “sshd package” resource and, optionally, the “sshd config” resource.
Let's see how this is implemented:

file { 'sshd_config':
    path    => '/etc/ssh/sshd_config',
    ensure  => file,
    content => "Port 22
    Protocol 2
    HostKey /etc/ssh/ssh_host_rsa_key
    HostKey /etc/ssh/ssh_host_dsa_key
    HostKey /etc/ssh/ssh_host_ecdsa_key
    UsePrivilegeSeparation yes
    KeyRegenerationInterval 3600
    ServerKeyBits 768
    SyslogFacility AUTH
    LogLevel INFO
    LoginGraceTime 120
    PermitRootLogin yes
    StrictModes yes
    RSAAuthentication yes
    PubkeyAuthentication yes
    IgnoreRhosts yes
    RhostsRSAAuthentication no
    HostbasedAuthentication no
    PermitEmptyPasswords no
    ChallengeResponseAuthentication no
    X11Forwarding yes
    X11DisplayOffset 10
    PrintMotd no
    PrintLastLog yes
    TCPKeepAlive yes
    AcceptEnv LANG LC_*
    Subsystem sftp /usr/lib/openssh/sftp-server
    UsePAM yes",
    mode    => '0644',
    owner   => root,
    group   => root,
    require => Package['sshd']
}
package { 'sshd':
    ensure => latest,
    name   => 'openssh-server'
}
service { 'sshd':
    ensure    => running,
    enable    => true,
    name      => 'ssh',
    subscribe => File['sshd_config'],
    require   => Package['sshd']
}

This example uses an inline config, which makes the manifest ugly. In practice this is almost never done: there is an ERB-based template engine, and you can also simply use external files. We will not go into that here.
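
Just to show the shape of the alternative, here is a hedged sketch of the same file resource with the config kept outside the manifest. The module name ssh and the file locations are assumptions; the fragment is meant to replace only the file resource from the manifest above:

```puppet
# Fragment: replaces only the file resource from the sshd manifest.
# Package['sshd'] is still expected to be declared in the same catalog.
file { 'sshd_config':
  ensure  => file,
  path    => '/etc/ssh/sshd_config',
  # Variant 1: copy a static file shipped in a module named 'ssh' (hypothetical),
  # stored as <modulepath>/ssh/files/sshd_config.
  source  => 'puppet:///modules/ssh/sshd_config',
  # Variant 2 (use instead of source, never together with it): render the ERB
  # template <modulepath>/ssh/templates/sshd_config.erb.
  # content => template('ssh/sshd_config.erb'),
  mode    => '0644',
  owner   => 'root',
  group   => 'root',
  require => Package['sshd'],
}
```

Either way the manifest shrinks to a few lines, and the config itself can be edited and reviewed as an ordinary file.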

The tastiest lines here are the dependency lines: require and subscribe.

Puppet supports many ways of describing dependencies. Details, as always, can be found in the documentation.

require means exactly what you would expect. If resource A requires resource B, Puppet first processes resource B and then returns to resource A.
subscribe is a little trickier. If resource A subscribes to resource B, Puppet first processes B and then returns to A (just like require), and in addition, whenever B changes, A is refreshed. This is very convenient for services that depend on their configs (as in the example above): if the config changes, the service restarts, and you do not have to take care of it yourself.

There are also notify and before, but we will not cover them here. Those interested can find them in the documentation mentioned above.
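
For a quick picture before opening the documentation: before and notify express the same relationships as require and subscribe, only declared from the other side. The sketch below is a simplified variant of the sshd example that uses real package and service names as titles and manages only the permissions of the config file, not its contents:

```puppet
package { 'openssh-server':
  ensure => installed,
  # Same ordering as "require" on the file, declared on the package instead.
  before => File['/etc/ssh/sshd_config'],
}

file { '/etc/ssh/sshd_config':
  ensure => file,
  mode   => '0644',
  owner  => 'root',
  group  => 'root',
  # Same effect as "subscribe" on the service: refresh it when this file changes.
  notify => Service['ssh'],
}

service { 'ssh':
  ensure => running,
  enable => true,
}
```

Which side declares the relationship is mostly a matter of readability; the resulting order of application is the same.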

Summary

So far we have learned to write simple manifests that specify dependencies between resources. Many simple daemons fit the package-config-service model, so even in this form Puppet is already usable.
The following posts will describe how to use Puppet's more powerful features by building a spherical LAMP hosting in a vacuum (if you have other ideas for a spherical training project, you are welcome to send a PM or leave a comment).