Do YML programmers dream of testing ansible?
This is a text version of a talk given on 2018-04-25 at the Saint-Petersburg Linux User Group. Sample code is here: https://github.com/ultral/ansible-role-testing
I assume that you use configuration management rather than bash, i.e. your configuration is code. If we say that infrastructure is code, then the same practices should apply to building it as to software development. Have you thought about it? How do you do it? And how do others?
Prerequisites
In the case described here, there were many starting constraints:
- A lot of ansible roles.
- Hyper-V as the main hypervisor.
- A private cloud with limited ability to create virtual machines on the fly.
- Proxy for internet access.
- No way to test Ansible roles in Docker, because a role configures the whole VM, including, for example, network settings.
- The desire to enforce a green master policy for the repository with Ansible roles.
So what do we do? Let's compare the existing solutions.
Project | Test Kitchen | Molecule | Homegrown |
---|---|---|---|
Language | ruby | python | bash / ruby |
Watchers | 132 | 126 | 0 |
Stars | 1413 | 1154 | 1 |
Forks | 502 | 174 | 2 |
License | Apache 2.0 | MIT | Any |
Commits | 1929 | 1264 | 0 |
Releases | 101 | 121 | 0 |
Contributors | 109 | 82 | 5 |
Name | testinfra | serverspec | inspec | Goss |
---|---|---|---|---|
GitHub | philpep/testinfra | mizzy/serverspec | chef/inspec | aelsabbahy/goss |
Language | python | ruby | ruby | go |
Watchers | 93 | 145 | 165 | 67 |
Stars | 997 | 2105 | 1167 | 2170 |
Forks | 138 | 361 | 330 | 156 |
License | Apache 2.0 | MIT | Apache 2.0 | Apache 2.0 |
Commits | 380 | 1854 | 4609 | 309 |
Releases | 35 | 282 | 346 | 47 |
Contributors | 43 | 110 | 159 | 31 |
We decided not to reinvent the wheel and to take a ready-made solution. Our infrastructure team knows Ruby, so we chose Test Kitchen and InSpec.
Kitchen-ci
The idea is dead simple: create a new virtual machine, apply the role, run a smoke test.
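A minimal Ruby sketch of that cycle, assuming the `kitchen` CLI is on the PATH. `create`, `converge`, `verify` and `destroy` are Test Kitchen subcommands. The helper name `smoke_test`, the injectable `runner` and the instance name in the usage comment are hypothetical, added only so the control flow can be tried without a hypervisor.

```ruby
# Stages of one test run, in order (Test Kitchen subcommands).
STAGES = %w[create converge verify].freeze

# Default runner: shell out to the kitchen CLI; truthy only on exit code 0.
KITCHEN = ->(stage, instance) { system('kitchen', stage, instance) }

# Run the stages for one instance and stop at the first failure.
# The instance is destroyed afterwards either way, so a failed run
# does not leave a VM behind.
# Returns the name of the failed stage, or nil if every stage passed.
def smoke_test(instance, runner: KITCHEN)
  STAGES.find { |stage| !runner.call(stage, instance) }
ensure
  runner.call('destroy', instance)
end

# Usage (hypothetical instance name):
#   failed = smoke_test('default-centos-7')
#   abort "stage #{failed} failed" if failed
```

In day-to-day use `kitchen test` already wraps these stages. Spelling them out mostly helps when a run has to be debugged step by step.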
Green build policy
But we decided to go further and use something like GitHub flow: each role is developed in its own branch and merged into master after a review. If the tests pass, the role is rolled out to the infrastructure.
Nested Virtualization
As you remember, we were limited in creating virtual machines, so we had to settle for an unattractive solution: nested virtualization.
At first we tried 32-bit VirtualBox guests, so as not to enable nested virtualization support. That turned out to be a poor idea because of consistent kernel panics. The second important factor was that we run on x86_64, so the research continued (hello, libvirt), but in the end we settled on VirtualBox as the more widely available option on the operating systems we support.
Difficulties
Getting all of this running was not without difficulties.
Passing proxy settings from the host to the guest
Some test suites used proxy settings, while the host running Test Kitchen sat behind a transparent proxy. As a bonus, Ansible did not accept extra variables with empty values.
Solution: a trivial one, an ERB template.
<%= ENV['http_proxy'].to_s.empty? ? 'http://proxy.example.com:3128' : ENV['http_proxy'] %>
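The behaviour of the template line above can be checked in plain Ruby. This is a sketch: `render_proxy` is a hypothetical helper, and the fallback URL is the article's placeholder, not a real proxy.

```ruby
require 'erb'

# Same expression as the template line above.
TEMPLATE = "<%= ENV['http_proxy'].to_s.empty? ? " \
           "'http://proxy.example.com:3128' : ENV['http_proxy'] %>"

# Render the template with http_proxy temporarily set to `value`
# (nil means "variable not set"), then restore the old environment.
def render_proxy(value)
  saved = ENV['http_proxy']
  ENV['http_proxy'] = value # assigning nil removes the variable
  ERB.new(TEMPLATE).result
ensure
  ENV['http_proxy'] = saved
end

puts render_proxy(nil)                    # falls back to the placeholder URL
puts render_proxy('')                     # empty value also falls back
puts render_proxy('http://10.0.0.1:8080') # explicit value wins
```

Either way the rendered value is never empty, so Ansible always receives a non-empty extra variable.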
Manage network settings via ansible
Some roles configure the network. In the tests it went like this:
- The role sets up the network by copying a file.
- The network settings are applied.
- Everything breaks.
Solution: add an additional network interface to the virtual machine.
If the test suite name contains "_", everything fails
VirtualBox cannot use "_" in the name of a virtual machine, and the virtual machine was named after the test suite.
Solution: rename the test suites, "vm_" => "vm-".
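If renaming by hand is not an option, the same mapping can be applied wherever the VM name is derived. This is a sketch: `vm_name` is a hypothetical helper, not part of Test Kitchen or Vagrant.

```ruby
# Derive a VM name from a test suite name by replacing every "_" with "-".
def vm_name(suite)
  suite.tr('_', '-')
end

puts vm_name('vm_oracle') # prints "vm-oracle"
```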
Test suites that install Oracle fail unless the virtual machine name ends with "."
The role was already in use in what was effectively production when we decided to cover it with tests. Applied to a prepared VM, the role completes successfully. Run through Test Kitchen, it fails.
A small hint:
[root@vm-oracle vagrant]# getent ahosts vm-oracle
127.0.0.1 STREAM vm-oracle
127.0.0.1 DGRAM
127.0.0.1 RAW
[root@vm-oracle vagrant]# getent ahosts vm-oracle.
fe80::a00:27ff:febd:bd6a STREAM vm-oracle
fe80::a00:27ff:febd:bd6a DGRAM
fe80::a00:27ff:febd:bd6a RAW
10.0.2.15 STREAM
10.0.2.15 DGRAM
10.0.2.15 RAW
[root@oracle vagrant]# getent ahosts oracle.example.com.
192.168.128.182 STREAM oracle.example.local
192.168.128.182 DGRAM
192.168.128.182 RAW
Any idea what's going on?
It was a funny scenario:
- We had enabled IPv4-only binding in the Oracle listener settings.
- Oracle uses the FQDN.
- Linux has a special name-resolution source, "myhostname" (the nss-myhostname module), which was consulted after /etc/hosts and the DNS servers.
- Vagrant creates the VM and updates /etc/hosts.
Let me explain a little:
What happens in the case of vm-oracle?
- Vagrant creates the virtual machine.
- Vagrant updates /etc/hosts (vm-oracle, twice).
- The Oracle listener listens on IPv4.
- The Oracle listener resolves the domain name vm-oracle. and gets IPv6.
- FAILED
What happens in the case of vm-oracle.?
- Vagrant creates the virtual machine.
- Vagrant updates /etc/hosts (vm-oracle and vm-oracle.).
- The Oracle listener listens on IPv4.
- The Oracle listener resolves the domain name vm-oracle. and gets IPv4.
- OK
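The same check that `getent ahosts` performs can be scripted, which helps when comparing a prepared VM with one created by Vagrant. A minimal sketch: the helper names are hypothetical, and it assumes the client uses the first address returned by the resolver.

```ruby
require 'socket'

# Resolve a name through getaddrinfo, the same path `getent ahosts` takes.
# An unresolvable name yields an empty list instead of an exception.
def resolve(name)
  Addrinfo.getaddrinfo(name, nil, nil, :STREAM)
rescue SocketError
  []
end

# Reduce a list of Addrinfo entries to the address families it contains,
# in resolution order.
def families(addrinfos)
  addrinfos.map { |a| a.ipv6? ? :ipv6 : :ipv4 }.uniq
end

# True if the first answer is IPv4, i.e. a client that takes the first
# answer would reach a listener bound to IPv4 only.
def ipv4_first?(addrinfos)
  !addrinfos.empty? && addrinfos.first.ipv4?
end

# Usage inside the guest (names taken from the output above):
#   %w[vm-oracle vm-oracle.].each do |name|
#     puts "#{name}: #{families(resolve(name)).inspect}"
#   end
```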
OOM comes to visit us
The OOM killer randomly killed virtual machines. When that happened, Test Kitchen logged all sorts of strange messages.
Solution: Increase the amount of memory.
Slow builds
The whole setup was slow: a run took tens of minutes, sometimes more than an hour.
Solutions:
- Packer: pre-build the virtual machine images.
- Run some test suites in parallel.
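The parallel part can be sketched as a small worker pool in Ruby. `run_suites` and the suite names in the usage comment are hypothetical. The block decides what running a suite means, so the pool can be tried without any VMs.

```ruby
# Run each suite through the given block on a small pool of worker threads.
# Returns a hash of suite name => block result.
def run_suites(suites, workers: 2)
  queue = Queue.new
  suites.each { |suite| queue << suite }
  queue.close # pop returns nil once the closed queue is drained

  results = {}
  lock = Mutex.new
  threads = Array.new(workers) do
    Thread.new do
      while (suite = queue.pop)
        outcome = yield(suite)
        lock.synchronize { results[suite] = outcome }
      end
    end
  end
  threads.each(&:join)
  results
end

# Usage (hypothetical suite names), shelling out to Test Kitchen:
#   results = run_suites(%w[vm-oracle vm-web], workers: 2) do |suite|
#     system('kitchen', 'test', suite)
#   end
```

Test Kitchen also has its own `--concurrency` option. With nested virtualization the worker count is bounded by the memory of the outer VM, which ties back to the OOM problem above.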
Conclusion
If we say that infrastructure is code, then the same practices should apply to building it as to software development. We ended up with a working solution, but it has some unpleasant aspects:
- It all looks unfriendly.
- A mixture of ruby & python.
- There is a lack of validation and role implication.
- Works slowly.
- Complicated....
In the end, Molecule with Docker looks interesting and more native. We are thinking about it.