Ansible is not so simple

I have three servers, but I am not a professional system administrator. This means that despite four databases and a hundred applications, nothing is backed up anywhere, I approach any server problem by sighing loudly and throwing a plate at the wall, and the operating systems there reached EOL two years ago. I would be happy to upgrade, but that probably means setting aside a week to save everything and set it up again. It's easier to forget about yum update and apt-get upgrade.

Of course, this is wrong. I had been eyeing Chef and Puppet for a long time, thinking they would solve all my problems. But I looked at the configs of projects I know and kept putting it off. You have to study them, deal with Ruby, and deal with what reviews say are numerous pitfalls and limitations. Two weeks ago an article by George amarao gave me a life-giving kick. Not even the article itself, but its list of configuration management systems. After reading the comments and a little googling, I decided: I'll take Ansible. Because it's Python, and no one complains about problems.



Well then, I'll be the first.

First, I dug through a pile of Ansible documentation and tutorials, starting with the useless Quick Start video on the official website. There are many of them, made for different tasks and written by different people, but they have one thing in common: the tutorials were written for people who already understand Ansible. For people with a spherical server in a vacuum, it is enough to hint that roles, modules and tasks exist. But I came with a blank slate and stepped on every rake I found. I hope this post helps you walk around them.

I expected miracles from configuration management systems, such as automatically updated applications from git. But it turned out that Ansible is just a way to preserve the sequence of actions when setting up a new server. You can do in Ansible only what you can do from the console yourself. There are no miracles.

Start. Vagrant


Task: I am not creating a new host, because I want to keep the IP address. That is, I will wipe the droplet through the control panel, then initialize it with Ansible. Plan: write a playbook and debug it in Vagrant.

Getting started is very hard. All Ansible tutorials begin with a description of the inventory, where you need to enter the server address. But what IP does a Vagrant box have? The devil knows. The Ansible documentation has instructions on how to run a playbook in Vagrant; the Vagrant documentation has instructions for connecting Ansible, and they are not quite identical. In the end I gave up on finding the IP and took what they have in common: a minimal Vagrantfile that launches the playbook.

Vagrant.configure(2) do |config|
  config.vm.box = "ubuntu/xenial64"
  config.vm.network "forwarded_port", guest: 80, host: 8080
  # I still don't know what this is for:
  config.ssh.insert_key = false
  config.vm.provision "ansible" do |ansible|
    ansible.verbose = "v"
    ansible.playbook = "playbook.yml"
  end
end

I sketched a draft of the playbook, created stub roles, and ran vagrant up. It didn't take off, because the official xenial image is only for VirtualBox, and on Fedora Linux virtualization goes through libvirt. It took me a long time to recall the right command: vagrant up --provider virtualbox. Then I fixed the YAML syntax errors (why are the three hyphens at the beginning mandatory?). Remember that once the box is running, you rerun Ansible with vagrant provision.

And the first surprise: there is no Python by default in the Ubuntu 16.04 box! Unthinkable on Fedora, where the package manager is written in Python. Ansible, as I found out, uploads its modules to the server and runs them there. Off to StackOverflow, where we find a magic task (more precisely, ten variations of one task, and it is unclear which is best):

- name: Install python for Ansible 
  become: yes
  raw: test -e /usr/bin/python || (apt -qy update && apt install -y python-minimal)
  register: output
  changed_when: output.stdout

Superuser, become!


Even with the documentation and examples, a lot is unclear. I don't understand, for example, why Vagrant overrides remote_user, and how it comes about that each box has its own superuser. I will run the playbook on a clean server where there is only root, and I will need to create my own superuser. But apparently under Vagrant this has to be done differently than on a clean server. It's not clear at all: will there be two playbooks, one for staging and one for production?

Or take become and become_user: one does not imply the other. Which of them should go into the root playbook if you always need root to configure the server? At first I put become: yes there and wrote become_user: root in every second task. Then it turned out that everything runs as root without become_user too! Because root is the default value, and I had in effect done sudo -i from the very beginning with no way to let go.
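A minimal sketch of that difference, assuming a play that targets all hosts; the package name is only an illustration and is not from my playbook:

```yaml
# Sketch only: "hosts: all" and the nginx package are illustrative assumptions.
- hosts: all
  become: yes              # turns privilege escalation on for every task in the play
  tasks:
    - name: Install a package (runs as root, because become_user defaults to root)
      apt:
        name: nginx
        state: present

    - name: Same effect, with the redundant line I used to write everywhere
      apt:
        name: nginx
        state: present
      become_user: root    # only names the target user; it does not enable become by itself
```

Without become: yes somewhere above it, become_user on its own changes nothing, which is why one does not imply the other.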

Somewhere around here I remembered that I had not updated the system on my laptop for a long time, and ran dnf update while continuing to fiddle with the playbook. Vagrant was running, and dnf in the next tab updated VirtualBox. Apparently that was a bad idea, because the next vagrant provision said: "everything is broken and it's not my fault." It was missing VirtualBox, which "terminated unexpectedly during startup with exit code 1 (0x1)", and that was that. The command vboxheadless -h (I'm not a real devops, I googled it) showed error -1912. On the Internet everyone answers as one: reinstall VirtualBox. Fat chance, it doesn't help. In despair I found a xenial box for libvirt and switched to it. It's good to have a choice.

I copied an apt task with a bunch of parameters from some example, and then learned that update_cache=yes is better done as a separate task. And that task, here's the trouble, always returns "changed". It turned out that you need to add cache_valid_time=3600 so that the cache is refreshed no more than once an hour. At first I wanted to write 86400 (a day), but I'm not going to run Ansible from cron, only about once a month, so let it be.
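Roughly, the split looks like this. It is a sketch, and the package list is an assumption made up for illustration:

```yaml
# Separate cache refresh: reports "changed" only when the cache is older than an hour.
- name: Update the apt cache at most once an hour
  apt:
    update_cache: yes
    cache_valid_time: 3600

# Package names below are placeholders, not the ones from my servers.
- name: Install packages
  apt:
    name:
      - git
      - nginx
    state: present
```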

Deploying the database


Installing PostgreSQL is five lines in the console or a whole epic in Ansible. At some point you need to do become_user: postgres. And then the box produced a strange error: "Failed to set permissions on the temporary files Ansible needs to create when becoming an unprivileged user". Remember how Ansible uploads modules to the server and runs them there? Well, it uploads them as root or another superuser, and then the postgres user has no access to them. Bad luck.

StackOverflow helps again: it turns out there are three ways out. One is to create ansible.cfg and set pipelining=True in it (and to solve some other problem that came up, I temporarily set pipelining=False). The second is, literally, "don't do that." And the third is the easiest: install the acl package and everything magically works. Or rather, it fails in a different way: "sudo: a password is required". What is going on, where do passwords come from, am I not logging in with a key?

It turned out that I log into the virtual machine without a key, as the user vagrant, which was created before us and for us. Ansible's become_user apparently does sudo -u postgres, but that requires the vagrant user's password. There is no password.

I start going through the options. become_method: su fails with a timeout, because the server asks for a password but Ansible does not understand that. What it does there is a mystery, because sudo su postgres does not ask me for a password. There is the option of writing "vagrant ALL=(ALL) ..." to /etc/sudoers.d/vagrant, because the word in parentheses allows sudo -u without a password. But then the playbook becomes tailored to Vagrant, and I still have to run it in production. Sloppy.

Out of desperation I try removing become altogether. Postgres predictably complains: “Peer authentication failed for user "postgres"”. Time to exhume an old idea. New plan: run the role as the user zverik, who has every right in the world. I split the playbook in two: the first installs Python and creates the user, the second installs and configures everything else with remote_user: zverik. I run it. And again, "sudo: a password is required". Why? Well, yes, Vagrant passes its own value of remote_user and does not let it be changed. Well, damn.

To get a break, I opened a text editor and began writing these notes. By this point I had been working with Ansible for a week, an hour and a half to two hours at a time, and I had not even created a database in Postgres. In the tutorials it all looks so simple... I counted the Ansible-related tabs in Firefox: 48. Forty-eight. About one sixth of the total.

Then I turned off ansible.force_remote_user in the Vagrantfile and reran provision. Hooray, a new error! It reminds me that the user zverik can log in only with a key. But I have a key, and vagrant ssh -p works and lets me in without a password. I googled the solution: specify the path to the key in ansible.cfg. That will not work, for the same reason as remote_user: Vagrant wins. This time it is easier to override the underlying variable: add ansible_ssh_private_key_file: "{{ lookup('env', 'HOME') }}/.ssh/id_rsa", and it works! Not very pretty, but hooray!

Once I had sorted out the users, writing the roles went like clockwork. One role out of six, sixty tasks, is already done. But getting started is harder than the tutorials make it seem.
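For reference, the third option can be sketched as two tasks. This only addresses the temporary-files error, not the sudo password one, and the psql check is my own illustrative assumption:

```yaml
- name: Install acl so Ansible can hand its temporary files to an unprivileged user
  apt:
    name: acl
    state: present
  become: yes

# Illustrative check only: any task with become_user: postgres would do here.
- name: Check that a command can run as postgres
  command: psql -c "SELECT 1"
  become: yes
  become_user: postgres
  changed_when: false
```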
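Where exactly the variable goes is a matter of taste. One possible placement, as a sketch, is the vars section of the second play; the ping task is just an assumed smoke test:

```yaml
# Sketch: the second play, run as zverik with an explicitly chosen private key.
- hosts: all
  remote_user: zverik
  vars:
    # Overrides the key that Vagrant passes to Ansible on the command line.
    ansible_ssh_private_key_file: "{{ lookup('env', 'HOME') }}/.ssh/id_rsa"
  tasks:
    - name: Check that the connection works
      ping:
```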

Useful stuff


When writing playbooks, you find or google a lot of useful little things. Some are described in the documentation, some in articles (search for "Ansible" on Habr). Here are a few of them.

To run commands, use only the command or shell modules. The latter, as the documentation says, is only for extreme cases, so forget about output redirection and &&. The result is always "changed", which is bad. Control the result either with the creates parameter (more conveniently in an args block, along with chdir), or with register and changed_when. It is useful to check conditions before running: first a reconnaissance command + register + changed_when: False, and then a when that checks the saved stdout to decide whether the command needs to run.

The fewer calls to the command module, the better. Google it: there is almost always a module. For example, I first did command: npm install -g {{ item }}, and then found out that npm: name={{ item }} global=yes is possible. A module is always better than a command, because you don't need to check the current state yourself and because the result comes back not as a stdout string but as a convenient structure.

Configuration files are almost always edited with lineinfile, which finds a line by regular expression and replaces it with another. The blockinfile module adds whole blocks of text. There is a nuance: if several tasks write to the same file, each needs its own marker: "# {mark} block name". Otherwise they will overwrite each other's blocks.
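The creates variant might look like this. The paths and the make command are hypothetical, chosen only to show the shape of the args block:

```yaml
# Hypothetical paths: the task is skipped once the file named in "creates" exists.
- name: Build the application only if the binary is missing
  command: make
  args:
    chdir: /opt/myapp
    creates: /opt/myapp/bin/myapp
```

With creates, the second run reports "ok" instead of "changed", because Ansible sees the file and does not run the command at all.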
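The same npm call in the multi-line form, as a sketch; the package names are placeholders:

```yaml
- name: Install global npm packages
  npm:
    name: "{{ item }}"
    global: yes
  with_items:
    # Placeholder package names, not a recommendation.
    - bower
    - gulp
```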
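A sketch of both modules, assuming made-up host names and addresses. Note that the marker has to be quoted in YAML, otherwise everything after # is read as a comment:

```yaml
- name: Forbid root login over ssh
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#?PermitRootLogin'
    line: 'PermitRootLogin no'

# Two tasks write to the same file, so each gets its own marker text.
- name: Add database hosts
  blockinfile:
    path: /etc/hosts
    marker: "# {mark} db hosts"
    block: |
      10.0.0.5 db.internal

- name: Add cache hosts
  blockinfile:
    path: /etc/hosts
    marker: "# {mark} cache hosts"
    block: |
      10.0.0.6 cache.internal
```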

Before modifying PostgreSQL tables, it is convenient to check their state with pg_tables. For instance:

command: psql -A -t -d {{ gisdb }} -c "SELECT tableowner FROM pg_tables WHERE schemaname = 'public' AND tablename = 'spatial_ref_sys'"
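Put together with the reconnaissance pattern, this could look roughly as follows. The gisuser variable and the ALTER TABLE step are my assumptions for illustration, and the sketch assumes the table already exists:

```yaml
- name: Find out who owns spatial_ref_sys
  command: psql -A -t -d {{ gisdb }} -c "SELECT tableowner FROM pg_tables WHERE schemaname = 'public' AND tablename = 'spatial_ref_sys'"
  become: yes
  become_user: postgres
  register: srs_owner
  changed_when: false      # a read-only query never changes anything

# gisuser is a hypothetical variable holding the desired owner.
- name: Hand the table over only when the owner differs
  command: psql -d {{ gisdb }} -c "ALTER TABLE spatial_ref_sys OWNER TO {{ gisuser }}"
  become: yes
  become_user: postgres
  when: srs_owner.stdout != gisuser
```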

Reuse is everything: if you can write one task with conditional expressions and with_items instead of two almost identical tasks, do it. Move a group of repeating tasks with similar parameters into a separate file and call it through include_role with vars. There should be something here about parameterizing roles, but I'm still learning and I have only one role.

In one of the articles I found the advice not to reinvent the wheel, but to look for suitable roles in the Ansible Galaxy catalog. Indeed, thousands of people have installed php-fpm and postfix before you, and there is often a well-written role with convenient defaults.

On the other hand, what's the point of downloading the geerlingguy.apache role when apt: pkg=apache2 solves all my problems? Or, say, I found a role for installing osm2pgsql from source, and it dates from 2014 and uses the outdated sudo: yes. So yes, I did write roles_path = roles.galaxy:roles in ansible.cfg and made a playbook that installs all the roles, but I haven't installed any yet. Here's what it looks like:
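As a sketch of the include_role idea: the role name website and its two variables are hypothetical, invented for this example:

```yaml
# Hypothetical role "website" that expects site_name and site_port.
- name: Set up every site with the same role
  include_role:
    name: website
  vars:
    site_name: "{{ item.name }}"
    site_port: "{{ item.port }}"
  with_items:
    - { name: blog, port: 8001 }
    - { name: wiki, port: 8002 }
```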

- hosts: localhost 
  vars: 
    galaxy_path: roles.galaxy 
  tasks: 
    - name: Remove old galaxy roles 
      file: path={{ galaxy_path }} state=absent 
    - name: Install Ansible Galaxy roles 
      local_action: command ansible-galaxy install -r requirements.yml --roles-path {{ galaxy_path }}

And in requirements.yml, write a line for each role from Galaxy:

- src: author.role

You wrote a playbook and it ran in Vagrant to the end? Great, now run vagrant destroy and recreate the box. You are guaranteed to find a few slip-ups: a forgotten sudo, a missing mode: 0755 on executable files, missing packages (dnf provides or apt-file will help you figure out which to install). Finally, the most important thing: on the second run, vagrant provision should report "changed: 0".

***


Moving servers to a configuration management system is hard no matter which system you choose. But once you are past the initial field of rakes, writing the playbook goes briskly. The main thing is to keep the goal in mind so as not to burn out: right now my target operating system is Ubuntu 16.04, and in a month I will move the server to 18.04 without much trouble. And the pleasure of bringing up a fully working server from scratch with a single console command will help along the way.
