Do YML programmers dream of testing ansible?

    kitchen-ci schema

    This is a text version of a talk given on 2018-04-25 at the Saint-Petersburg Linux User Group. Sample code here:

    I assume that you use configuration management, not bash; that is, your configuration is code. If we say that infrastructure is code, then the same practices should apply to building it as to software development. Have you thought about it? How do you do it? How do others?


    In our case there were many starting constraints:

    • A lot of ansible roles.
    • Hyper-V as the main hypervisor.
    • Private cloud with limited ability to create virtual machines on the fly.
    • Proxy for internet access.
    • No way to test ansible roles in docker, because a role configures the whole VM, including network settings for example.
    • The desire to use a green master (green build) policy for the repository with ansible roles.

    What to do? Let's compare existing solutions.

    Project    Test Kitchen    Molecule    Your own
    Language   ruby            python      bash / ruby
    License    Apache 2.0      MIT         Any

    Github     philpep/testinfra    mizzy/serverspec    chef/inspec    aelsabbahy/goss
    License    Apache 2.0           MIT                 Apache 2.0     Apache 2.0

    We decided not to reinvent the wheel and to take a ready-made solution. Our infrastructure team knows ruby, so Test Kitchen & inspec were chosen.


    kitchen-ci schema

    The idea is dead simple: create a new virtual machine, apply the role, run a smoke test.
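    The create, apply, test cycle corresponds to the Test Kitchen subcommands create, converge, verify and destroy ("kitchen test" runs them all in one go). Below is a minimal Ruby sketch of that cycle. The helper names kitchen_commands and run_stages and the instance name in the example are assumptions made for illustration; they are not part of Test Kitchen.

```ruby
# Stages before cleanup: create the VM, apply the role, run the smoke test.
WORK_STAGES = %w[create converge verify].freeze

# Build the Test Kitchen command line for every stage of one instance.
def kitchen_commands(instance)
  (WORK_STAGES + ['destroy']).map { |stage| ['kitchen', stage, instance] }
end

# Run the stages in order and stop at the first failure, but always destroy
# the VM afterwards. `runner` is injectable so the flow can be dry-run.
def run_stages(instance, runner: ->(cmd) { system(*cmd) })
  *work, destroy = kitchen_commands(instance)
  passed = work.all? { |cmd| runner.call(cmd) }
  runner.call(destroy)
  passed
end

# Example (needs Test Kitchen installed and an instance with this name,
# which is hypothetical here):
#   run_stages('default-centos-7')
```

    Destroying the VM even after a failed stage keeps leftover machines from piling up on a host where memory is already tight.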

    Green build policy

    Green build policy schema

    But we decided to go further and use something like GitHub flow: each role is developed in its own branch and merged into master after a review. If the tests pass, the role is rolled out to the infrastructure.

    Nested Virtualization

    As you remember, we were limited in how we could create virtual machines, so we had to settle for an unattractive solution: nested virtualization.

    we need to go deeper

    At first we tried 32-bit VirtualBox guests, so as not to enable nested virtualization support. This turned out to be a bad idea because of consistent kernel panics. The second important factor is that we run on x86_64, so the research continued (hello, libvirt), but in the end we settled on VirtualBox as the option most widely available on the OSes we support.


    During the rollout we ran into all sorts of fun.

    Proxy settings

    proxy from guest host guest

    Some test suites used proxy settings, while the host running Test Kitchen sat behind a transparent proxy. As a bonus, ansible did not accept extra variables with empty values.

    Solution: a trivial one, create an ERB template.

    <%= ENV['http_proxy'].to_s.empty? ? '' : ENV['http_proxy'] %>
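    To see what this expression yields with and without a proxy configured, it can be rendered with Ruby's standard ERB library. The sketch below passes the environment in as a hash called env instead of reading ENV directly, so it can be tried without touching the real environment. The helper extra_vars and its list of variable names are assumptions for illustration: it shows one possible way to drop empty values altogether, since empty extra variables were what ansible rejected.

```ruby
require 'erb'

# Same expression as in the template above, reading from a local `env` hash:
# yields '' when the variable is unset or empty, otherwise its value.
PROXY_TEMPLATE = "<%= env['http_proxy'].to_s.empty? ? '' : env['http_proxy'] %>"

# Render the template against the given environment hash.
def render_proxy(env)
  ERB.new(PROXY_TEMPLATE).result_with_hash(env: env)
end

# Hypothetical helper: keep only the variables that have a non-empty value,
# so that no empty extra variable is ever handed to ansible.
def extra_vars(env, names = %w[http_proxy https_proxy no_proxy])
  names.each_with_object({}) do |name, vars|
    value = env[name].to_s
    vars[name] = value unless value.empty?
  end
end

# Example:
#   render_proxy(ENV.to_h)
#   extra_vars(ENV.to_h)
```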

    Manage network settings via ansible

    Some roles configure the network; in the tests it went like this:

    • We configure the network by copying a config file.
    • We apply the network settings.
    • Everything breaks.

    Solution: Add an interface to the virtual machine

    If a test suite name contains "_", everything fails

    VirtualBox cannot use "_" in the name of a virtual machine, and the virtual machine was named after the test suite.

    Solution: rename the test suites, "vm_" => "vm-"
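    A check like this is easy to automate so that a new suite with an underscore is caught before any VM is created. A minimal Ruby sketch follows; both helper names are hypothetical, and the suite names in the example are made up.

```ruby
# Hypothetical helper: turn a test-suite name into one that is safe to use
# as a VM name by replacing underscores with hyphens.
def vm_safe_name(suite_name)
  suite_name.tr('_', '-')
end

# Hypothetical helper: list the suite names that still need renaming.
def unsafe_suite_names(suite_names)
  suite_names.select { |name| name.include?('_') }
end

# Example:
#   unsafe_suite_names(%w[vm-web vm_oracle]).each do |name|
#     warn "rename #{name} to #{vm_safe_name(name)}"
#   end
```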

    Test suites that install Oracle fail without "." at the end of the virtual machine name

    The role was already in use in what passed for production when we decided to cover it with tests. Applied to a prepared VM, the role completes; run through Test Kitchen, it fails.

    Small hint

    [root@vm-oracle vagrant]# getent ahosts vm-oracle
    STREAM vm-oracle
    DGRAM
    RAW
    [root@vm-oracle vagrant]# getent ahosts vm-oracle.
    fe80::a00:27ff:febd:bd6a STREAM vm-oracle
    fe80::a00:27ff:febd:bd6a DGRAM
    fe80::a00:27ff:febd:bd6a RAW
    STREAM
    DGRAM
    RAW
    [root@oracle vagrant]# getent ahosts
    STREAM oracle.example.local
    DGRAM
    RAW

    Any idea what's going on?

    It was a funny scenario:

    1. The oracle listener is configured to bind to IPv4 only.
    2. oracle uses the FQDN.
    3. Linux has a special name-resolution source, "myhostname", for resolving the local hostname; it was consulted after /etc/hosts and the DNS servers.
    4. Vagrant creates the VM and updates /etc/hosts.

    I'll explain a little bit:
    What happens in the case of vm-oracle

    1. vagrant creates a virtual machine.
    2. vagrant updates /etc/hosts (vm-oracle x2).
    3. the oracle listener listens on IPv4.
    4. the oracle listener resolves the domain name vm-oracle. and gets IPv6.
    5. FAILED

    What happens in the case of vm-oracle.

    1. vagrant creates a virtual machine.
    2. vagrant updates /etc/hosts (vm-oracle & vm-oracle.).
    3. the oracle listener listens on IPv4.
    4. the oracle listener resolves the domain name vm-oracle. and gets IPv4.
    5. Ok
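    The difference between the two cases comes down to which address family a name resolves to. getent ahosts asks the system resolver through getaddrinfo, and the same question can be asked from Ruby with the standard socket library. The sketch below is one way to check a name before the listener ever starts; the helper names are assumptions for illustration, and the result depends entirely on the machine it runs on.

```ruby
require 'socket'

# Split resolved addresses into their IPv4 and IPv6 textual forms.
def split_by_family(addrinfos)
  {
    ipv4: addrinfos.select(&:ipv4?).map(&:ip_address).uniq,
    ipv6: addrinfos.select(&:ipv6?).map(&:ip_address).uniq
  }
end

# Resolve a name through the system resolver (getaddrinfo), which honours
# /etc/hosts, DNS and any other configured name-resolution sources.
def resolve(name)
  split_by_family(Addrinfo.getaddrinfo(name, nil, nil, :STREAM))
rescue SocketError
  { ipv4: [], ipv6: [] }
end

# True when an IPv4-only listener would have an address to bind to.
def resolves_to_ipv4?(name)
  !resolve(name)[:ipv4].empty?
end

# Example (run inside the VM):
#   p resolve('vm-oracle')
#   p resolve('vm-oracle.')
```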

    OOM comes to visit us

    The OOM killer randomly killed virtual machines. When that happened, Test Kitchen wrote all sorts of strange messages to its logs.

    Solution: Increase the amount of memory.

    Slow builds

    The whole setup ran slowly: tens of minutes, sometimes more than an hour.


    Solution:

    • Packer: prebuild the virtual machine images.
    • Run some of the test suites in parallel.
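    Test Kitchen has its own concurrency option for "kitchen test", but the idea of a bounded pool of parallel runs can also be sketched in plain Ruby. The helper run_in_parallel below is an assumption for illustration, and the job passed to it would be whatever starts one suite. The number of workers should stay small here: every worker means one more running VM, and memory was already a problem.

```ruby
# Run `job` once for every suite, using at most `workers` threads at a time.
# Returns a hash that maps each suite name to the job's result.
def run_in_parallel(suites, workers: 2, &job)
  queue = Queue.new
  suites.each { |suite| queue << suite }
  queue.close # pop returns nil once the closed queue is empty

  results = {}
  lock = Mutex.new
  threads = Array.new(workers) do
    Thread.new do
      while (suite = queue.pop)
        outcome = job.call(suite)
        lock.synchronize { results[suite] = outcome }
      end
    end
  end
  threads.each(&:join)
  results
end

# Example (suite names are made up; needs Test Kitchen installed):
#   run_in_parallel(%w[vm-web vm-db], workers: 2) do |suite|
#     system('kitchen', 'test', suite)
#   end
```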


    If we say that infrastructure is code, then the same practices should apply to building it as to software development. On the one hand, we ended up with a working solution, but there are some unpleasant aspects:

    • It all looks unfriendly.
    • A mixture of ruby & python.
    • There is a lack of validation and role implication.
    • Works slowly.
    • Complicated....

    In the end, Molecule with docker looks interesting and more native. We are thinking about it.

