Do YAML programmers dream of testing Ansible?

    [Image: kitchen-ci schema]


    This is a text version of a talk given on 2018-04-25 at the Saint Petersburg Linux User Group. Sample code is here: https://github.com/ultral/ansible-role-testing


    I assume that you use configuration management rather than bash scripts, i.e. your configuration is code. If we say that infrastructure is code, then the same practices we apply to software development should apply to it. Have you thought about that? How do you do it? How do others?


    Prerequisites


    In our case there were quite a few initial conditions:


    • A lot of Ansible roles.
    • Hyper-V as the main hypervisor.
    • A private cloud with limited ability to create virtual machines on the fly.
    • A proxy for internet access.
    • No way to test the Ansible roles in Docker, because a role configures the whole VM, including, for example, network settings.
    • The desire to use a green master policy for the repository with the Ansible roles.

    What to do? First, compare the existing solutions.


    Project       Test Kitchen  Molecule  Your own
    Language      ruby          python    bash / ruby
    Watchers      132           126       0
    Stars         1413          1154      1
    Forks         502           174       2
    License       Apache 2.0    MIT       Any
    Commits       1929          1264      0
    Releases      101           121       0
    Contributors  109           82        5
    Name          testinfra          serverspec        inspec       Goss
    Github        philpep/testinfra  mizzy/serverspec  chef/inspec  aelsabbahy/goss
    Language      python             ruby              ruby         go
    Watchers      93                 145               165          67
    Stars         997                2105              1167         2170
    Forks         138                361               330          156
    License       Apache 2.0         MIT               Apache 2.0   Apache 2.0
    Commits       380                1854              4609         309
    Releases      35                 282               346          47
    Contributors  43                 110               159          31

    We decided not to reinvent the wheel and to take a ready-made solution. Our infrastructure team knows Ruby, so we chose Test Kitchen and InSpec.


    Kitchen-ci


    [Image: kitchen-ci schema]


    The idea is dead simple: create a new virtual machine, apply the role, run a smoke test.
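
    The flow above can be sketched in a few lines of Ruby. This is only an illustration, not the code from the talk: it assumes the `kitchen` CLI is on the PATH and uses its `create`, `converge`, `verify` and `destroy` subcommands. The `run_suite` helper and its `runner` hook are hypothetical names introduced here so the flow can be tried without real VMs.

```ruby
# The three stages mirror the text: create the VM, apply the role, run the smoke tests.
STAGES = %w[create converge verify].freeze

# Runs the stages in order for one suite and stops at the first failure.
# The VM is destroyed in either case. Returns nil on success,
# or the name of the stage that failed.
# `runner` is a hypothetical hook; by default it shells out to the kitchen CLI.
def run_suite(suite, runner: ->(*cmd) { system(*cmd) })
  failed = STAGES.find { |stage| !runner.call('kitchen', stage, suite) }
  runner.call('kitchen', 'destroy', suite)
  failed
end

# Dry run with a fake runner that only prints the commands it would execute.
run_suite('default', runner: ->(*cmd) { puts cmd.join(' '); true })
```

    In practice `kitchen test` already chains these stages, so a wrapper like this is mainly useful when a CI job needs to know which stage failed.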


    Green build policy


    [Image: green build policy schema]


    But we decided to go further and use something like GitHub flow: each role change lives in its own branch and is merged into master after review. If the tests pass, the role is rolled out to the infrastructure.


    Nested Virtualization


    As you remember, we were limited in how we could create virtual machines, so we had to settle for an unattractive solution: nested virtualization.


    [Image: "we need to go deeper"]


    At first we tried 32-bit VirtualBox guests so that we would not have to enable nested virtualization. That turned out to be a poor idea because of consistent kernel panics. Another important factor is that we run x86_64 everywhere, so the research continued (hello, libvirt), but in the end we settled on VirtualBox because it is more widely available on the operating systems we support.


    Difficulties


    While getting this running, we hit all sorts of problems.


    Proxy settings on the host and in the guest


    Some test suites used proxy settings, while the host running Test Kitchen sat behind a transparent proxy. As a bonus, Ansible did not accept extra variables with empty values.


    Solution: trivial, use an ERB template with a default value.


    <%= ENV['http_proxy'].to_s.empty? ? 'http://proxy.example.com:3128' : ENV['http_proxy'] %>
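
    To see how this fallback behaves, the same expression can be rendered with Ruby's standard `erb` library. This is a sketch under assumptions: the template reads from a local `env` hash instead of the real `ENV` so that both cases can be shown side by side, and the `http_proxy:` key and the `render_proxy` helper are hypothetical names used only for illustration.

```ruby
require 'erb'

# Same ternary as in the template above, but reading from a local `env` hash
# (an assumption made so the example does not depend on the real environment).
TEMPLATE = "http_proxy: <%= env['http_proxy'].to_s.empty? ? " \
           "'http://proxy.example.com:3128' : env['http_proxy'] %>\n"

# Renders the template against the given hash of environment variables.
def render_proxy(env)
  ERB.new(TEMPLATE).result_with_hash(env: env)
end

# Unset or empty variable: the default proxy is used.
puts render_proxy({})
puts render_proxy({ 'http_proxy' => '' })
# Variable set on the host: its value is passed through.
puts render_proxy({ 'http_proxy' => 'http://10.0.0.1:8080' })
```

    The point of the default is that the extra variable passed to Ansible is never empty, which sidesteps the problem described above.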

    Manage network settings via ansible


    Some roles configure the network. In the tests it went like this:


    • The role sets up the network by copying a file.
    • The network settings are applied.
    • Everything breaks.

    Solution: add an extra network interface to the virtual machine.


    If the test suite name contains "_", everything fails


    VirtualBox cannot use "_" in the name of a virtual machine, and the virtual machine was named after the test suite.


    Solution: rename the test suites, "vm_" => "vm-".
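
    A check like the following could catch such names before a build starts. It is a minimal sketch, assuming the Test Kitchen config has the usual top-level `suites:` list with `name:` keys and has already been rendered to plain YAML (a real .kitchen.yml may contain ERB). The `suites_with_underscore` helper and the sample suite names are made up for illustration.

```ruby
require 'yaml'

# Returns the suite names that contain an underscore.
# Expects the config as a plain YAML string.
def suites_with_underscore(kitchen_yaml)
  config = YAML.safe_load(kitchen_yaml) || {}
  Array(config['suites']).map { |suite| suite['name'].to_s }
                         .select { |name| name.include?('_') }
end

# Hypothetical config fragment with one bad and one good suite name.
SAMPLE = <<~YAML
  suites:
    - name: vm_oracle
    - name: vm-nginx
YAML

# Prints the offending names so they can be renamed.
p suites_with_underscore(SAMPLE)
```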


    Test suites with an Oracle installation fail unless the virtual machine name ends with "."


    The role was already in use in our quasi-production environment when we decided to cover it with tests. Applied to a pre-provisioned VM, the role succeeded; under Test Kitchen it failed.


    A small hint:


    [root@vm-oracle vagrant]# getent ahosts vm-oracle
    127.0.0.1 STREAM vm-oracle
    127.0.0.1 DGRAM
    127.0.0.1 RAW
    [root@vm-oracle vagrant]# getent ahosts vm-oracle.
    fe80::a00:27ff:febd:bd6a STREAM vm-oracle
    fe80::a00:27ff:febd:bd6a DGRAM
    fe80::a00:27ff:febd:bd6a RAW
    10.0.2.15 STREAM
    10.0.2.15 DGRAM
    10.0.2.15 RAW
    [root@oracle vagrant]# getent ahosts oracle.example.com.
    192.168.128.182 STREAM oracle.example.local
    192.168.128.182 DGRAM
    192.168.128.182 RAW

    Any idea what's going on?


    It was a fun scenario:


    1. We had enabled IPv4-only binding in the Oracle listener settings.
    2. Oracle uses the FQDN.
    3. Linux has a special name service module, "myhostname", for resolving the local host name; it was consulted after /etc/hosts and the DNS servers.
    4. Vagrant creates the VM and updates /etc/hosts.

    Let me explain a little.
    What happens when the VM is named "vm-oracle":


    1. Vagrant creates the virtual machine.
    2. Vagrant updates /etc/hosts (vm-oracle, twice).
    3. The Oracle listener listens on IPv4.
    4. The Oracle listener resolves the name "vm-oracle." and gets an IPv6 address.
    5. FAILED

    What happens when the VM is named "vm-oracle." (with a trailing dot):


    1. Vagrant creates the virtual machine.
    2. Vagrant updates /etc/hosts (vm-oracle and vm-oracle.).
    3. The Oracle listener listens on IPv4.
    4. The Oracle listener resolves the name "vm-oracle." and gets an IPv4 address.
    5. OK
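
    The difference between the two cases comes down to which address family the resolver returns first. The sketch below checks that with Ruby's standard `ipaddr` and `socket` libraries. It assumes the listener takes the first address it gets, which is an interpretation of the behaviour described here, not something confirmed by Oracle documentation. `ipv4_first?` and `resolve` are hypothetical helper names.

```ruby
require 'ipaddr'
require 'socket'

# True when the first address in the list is IPv4.
# Assumption: the listener uses the first address the resolver returns.
def ipv4_first?(addresses)
  first = addresses.first
  return false if first.nil?

  # Drop an IPv6 zone suffix such as "%eth0" before parsing.
  IPAddr.new(first.sub(/%.*\z/, '')).ipv4?
end

# Live lookup through the system resolver, for use on a real VM.
# The ordering may differ from what `getent ahosts` prints.
def resolve(host)
  Addrinfo.getaddrinfo(host, nil, nil, :STREAM).map(&:ip_address)
end

# Address lists copied from the getent output above.
puts ipv4_first?(['127.0.0.1'])
puts ipv4_first?(['fe80::a00:27ff:febd:bd6a', '10.0.2.15'])
puts ipv4_first?(['192.168.128.182'])
```

    On a test VM, `ipv4_first?(resolve('vm-oracle.'))` would show whether an IPv4-only listener has a chance of binding to the resolved name.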

    The OOM killer pays us a visit


    The OOM killer randomly killed virtual machines. When that happened, Test Kitchen wrote all sorts of strange messages to its logs.


    Solution: Increase the amount of memory.


    Slow builds


    The whole setup was slow: builds took tens of minutes, sometimes more than an hour.


    Solutions:


    • Packer: pre-build the virtual machine images.
    • Run some test suites in parallel.
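
    Running suites in parallel can be sketched with a small worker pool. This is an illustration under assumptions, not the setup from the talk: `run_in_parallel` and its `runner` hook are hypothetical, the default runner shells out to `kitchen test <suite>`, and the worker count has to be chosen with the host's memory in mind, given the OOM problem above.

```ruby
# Runs every suite through `runner`, using at most `workers` threads.
# Returns a hash of suite name => result of the runner.
# By default the runner calls the kitchen CLI, which is assumed to be on the PATH.
def run_in_parallel(suites, workers: 2, runner: ->(suite) { system('kitchen', 'test', suite) })
  queue = Queue.new
  suites.each { |suite| queue << suite }
  queue.close # workers get nil from pop once the queue is drained

  results = {}
  lock = Mutex.new
  threads = Array.new(workers) do
    Thread.new do
      while (suite = queue.pop)
        ok = runner.call(suite)
        lock.synchronize { results[suite] = ok }
      end
    end
  end
  threads.each(&:join)
  results
end

# Dry run with a fake runner instead of real VMs.
p run_in_parallel(%w[vm-oracle vm-nginx], runner: ->(suite) { puts "would run: kitchen test #{suite}"; true })
```

    Test Kitchen also has a `--concurrency` option on `kitchen test`, which may be enough when all suites live in a single config.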

    Conclusion


    If we say that infrastructure is code, then the same practices we apply to software development should apply to it. On the one hand, we ended up with a working solution; on the other, there are some downsides:


    • It all looks unfriendly.
    • A mixture of Ruby and Python.
    • There are no checks for role validity and idempotence.
    • Works slowly.
    • Complicated....

    In the end, Molecule with Docker looks interesting and more native. We are considering it.

