Ansible and ChatOps, or how to manage 100+ servers from chat

Original author: Eugen C.
Updated: February 21, 2017

Ansible and ChatOps with StackStorm, Slack and Hubot

What is ChatOps?


ChatOps is still a new and uncommon practice in the DevOps world, in which work on infrastructure moves into a shared chat. You can run commands directly from the chat while developers and system administrators watch what is happening in real time, review the history of changes, run their own commands, keep the discussion around the work in one place, and even share experience. The information and the workflow thus belong to the whole team, which brings many advantages.

Think of things like deploying code or provisioning servers from chat, viewing monitoring graphs, sending SMS, managing clusters, or simply running shell commands. ChatOps can be a high-level front end to your genuinely sophisticated CI/CD system, letting you drive it with a simple chat command like: !deploy that thing. This approach does wonders for visibility and reduces the complexity around the deployment process.


Improved ChatOps


StackStorm is an open-source project with a special focus on automation and ChatOps. The platform ties together a huge number of existing DevOps tools (configuration management, monitoring, graphing, alerting, and so on), letting you control everything from a single point. This is ideal for ChatOps: you can build and automate almost any workflow and drive any tool directly from chat.

StackStorm has had Ansible integration since before version 1.0, and the 1.2 and 1.4 releases added more ChatOps features, which opens the way to real-world use of ChatOps rather than just posting funny cat pictures with a bot. In this article, we will show how to make ChatOps and Ansible work together using the StackStorm platform.
By the way, StackStorm, like Ansible, is declarative, written in Python, and uses YAML + Jinja, which should make it easier to pick up.


Plan


First, we will set up a control machine running Ubuntu 14. Then we will configure the StackStorm platform on it, including the Ansible pack and ChatOps with the Hubot framework. Finally, we will connect the whole system to Slack and show some simple but real-world examples of using Ansible interactively.

Let's get started, and along the way see how far we have come and whether the technological singularity has arrived, by giving root access to some chat bots and letting them manage our 100+ servers or even data centers (by the way, Rackspace works with ChatOps).

Step 0. Preparing Slack


As already mentioned, we will use Slack.com as the chat platform (although other integrations are available). Register a Slack account if you do not already have one, and enable the Hubot integration in the settings.
Hubot is GitHub's bot framework, designed specifically for ChatOps.

Enable Hubot Integration in Slack
As a result, Slack will give you an API token like:
HUBOT_SLACK_TOKEN=xoxb-5187818172-I7wLh4oqzhAScwXZtPcHyxCu

Next, we will configure the StackStorm platform, show real examples of use, and of course, tell you how to create your own ChatOps commands.
But wait, there is an easy way!

For the laziest


For those who are lazy (most DevOps engineers are), there is a specially prepared repository with Vagrant that installs everything you need using simple bash scripts. It takes you straight from the start line to the finish line: once the automatic installation completes, you can immediately run ChatOps commands from Slack. The repository is showcase-ansible-chatops:
# Replace with your token
export HUBOT_SLACK_TOKEN=xoxb-5187818172-I7wLh4oqzhAScwXZtPcHyxCu
git clone https://github.com/StackStorm/showcase-ansible-chatops.git
cd showcase-ansible-chatops
vagrant up

For those who are interested in the details, we will switch from automatic to manual mode and go through all the steps. Just keep in mind that if something doesn't work out, you can check the examples in the ansible & chatops demo repository.

Step 1. Install StackStorm


Installation is simple. Just one command:
curl -sSL https://stackstorm.com/packages/install.sh | sudo bash -- --user=demo --password=demo

Keep in mind that this is for demonstration purposes. For production deployments, use Ansible playbooks, verify signatures, and do not trust one-line install commands! Installation details are described in the documentation: docs.stackstorm.com/install/deb.html


Step 2. Installing the StackStorm Plugin: Ansible


Integration packs (plugins) in StackStorm connect the system to other tools and external services.
We need the Ansible pack, so install it:
st2 pack install ansible

Ansible itself will be available in a Python virtualenv: /opt/stackstorm/virtualenvs/ansible
The full list of integrations is at exchange.stackstorm.org. Among them: AWS, GitHub, RabbitMQ, Pagerduty, Jenkins, Nagios, Docker, and more than 100 in total!


Step 3. Configure ChatOps


Now you need to configure the /opt/stackstorm/chatops/st2chatops.env file with environment variables. This is how it looks for a Slack bot named stanley:
# Bot name
export HUBOT_NAME=stanley
export HUBOT_ALIAS='!'
# StackStorm API key
# Use: `st2 apikey create -k` to generate
# Replace with your key (!)
export ST2_API_KEY="123randomstring789"
# ST2 AUTH credentials
# Replace with your username/password (!)
export ST2_AUTH_USERNAME="demo"
export ST2_AUTH_PASSWORD="demo"
# Configure Hubot to use Slack
export HUBOT_ADAPTER="slack"
# Replace with your token (!)
export HUBOT_SLACK_TOKEN="xoxb-5187818172-I7wLh4oqzhAScwXZtPcHyxCu"


After the changes, do not forget to restart the service:
sudo service st2chatops restart


Step 4. First ChatOps Experience


At this point, the Stanley bot should be online in chat. To invite it to a specific Slack room:
/invite @stanley

Get a list of available commands:
!help

You will surely like shipit:
!ship it

Having played enough with the existing commands, let's do something really serious.

Step 5. Creating Your Own ChatOps Commands


One of the features of StackStorm is the ability to create simple aliases (wrappers) around commands, which makes ChatOps easier to work with. Instead of typing a long command, you can bind it to something friendlier and shorter: syntactic sugar.

So, create your own StackStorm pack that will contain the commands we need. Fork the StackStorm template pack on GitHub. Our first action alias is aliases/ansible.yaml:
---
name: "chatops.ansible_local"
action_ref: "ansible.command_local"
description: "Run Ansible command on local machine"
formats:
  - display: "ansible <args>"
    representation:
      - "ansible {{ args }}"
result:
  format: |
    Ansible command `{{ execution.parameters.args }}` result: {~}
    {% if execution.result.stderr %}*Stdout:* {% endif %}
    ```{{ execution.result.stdout }}```
    {% if execution.result.stderr %}*Stderr:* ```{{ execution.result.stderr }}```{% endif %}
  extra:
    slack:
      color: "{% if execution.result.succeeded %}good{% else %}danger{% endif %}"

For reference: the alias above uses the ansible integration pack for st2.

Push the changes to your newly created GitHub repository, and the pack is ready to install. There is already a ChatOps alias for this:
!pack install https://github.com/armab/st2_chatops_aliases


Now you can run simple Ansible ad-hoc commands directly from Slack chat:
!ansible "uname -a"

Running ansible commands - ChatOps
At a low level, this is the same as:
/opt/stackstorm/virtualenvs/ansible/bin/ansible all --connection=local --args='uname -a' --inventory-file='127.0.0.1,'

But let's look at more useful examples of interactive ChatOps.

Example 1. Getting server status


Ansible has a ping module that connects to hosts and returns pong if successful. It is a simple but powerful example that lets you check the status of servers directly from chat in seconds, without going to the terminal.

To do this, create two things in our pack: an action, which runs the actual command, and an action alias, which is syntactic sugar for the action and makes this ChatOps construct possible:
!status 'web'

Action actions/server-status.yaml:

---
name: server_status
description: Show server status by running ansible ping ad-hoc command
runner_type: local-shell-cmd
entry_point: ""
enabled: true
parameters:
  sudo:
    description: "Run command with sudo"
    type: boolean
    immutable: true
    default: true
  kwarg_op:
    immutable: true
  cmd:
    description: "Command to run"
    type: string
    immutable: true
    default: "/opt/stackstorm/virtualenvs/ansible/bin/ansible {{hosts}} --module-name=ping"
  hosts:
    description: "Ansible hosts to ping"
    type: string
    required: true

By the way, in addition to bash scripts, an action can use the Python runner or, more generally, any binary capable of returning JSON, so there is plenty of flexibility.
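
As a rough illustration of the Python runner mentioned above, here is a minimal sketch of the same ping check written as a Python action. The class name ServerStatusAction and the helper build_ping_command are hypothetical. The base class import and the (success, result) return convention reflect our understanding of StackStorm's Python runner, and the action metadata would presumably need runner_type: python-script with an entry_point pointing at this file. The fallback stub exists only so the sketch can be run outside a StackStorm installation.

```python
import subprocess

try:
    # Available only inside a StackStorm installation.
    from st2common.runners.base_action import Action
except ImportError:
    # Fallback stub (assumption) so the sketch can run outside StackStorm.
    class Action(object):
        def __init__(self, config=None):
            self.config = config or {}

ANSIBLE_BIN = "/opt/stackstorm/virtualenvs/ansible/bin/ansible"


def build_ping_command(hosts):
    """Return the argument list for an Ansible ping against `hosts`."""
    return [ANSIBLE_BIN, hosts, "--module-name=ping"]


class ServerStatusAction(Action):
    """Hypothetical Python counterpart of the server_status shell action."""

    def run(self, hosts):
        # Passing an argument list (no shell) means the chat input is never
        # interpreted by a shell.
        proc = subprocess.run(
            build_ping_command(hosts),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        result = {
            "return_code": proc.returncode,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
        }
        # Returning (success_flag, result) lets the runner mark failures.
        return (proc.returncode == 0, result)
```

The shell-based action is shorter, but a Python action like this one gives you a structured result and a natural place for validation of the hosts parameter.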


Action alias aliases/server_status.yaml:
---
name: chatops.ansible_server_status
action_ref: st2_chatops_aliases.server_status
description: Show status for hosts (ansible ping module)
formats:
  - display: "status <hosts>"
    representation:
      - "status {{ hosts }}"
      - "ping {{ hosts }}"
result:
  format: |
    Here is your status for `{{ execution.parameters.hosts }}` host(s): {~}
    ```{{ execution.result.stdout }}```
  extra:
    slack:
      color: "{% if execution.result.succeeded %}good{% else %}danger{% endif %}"
      fields:
        - title: Alive
          value: "{{ execution.result.stdout|regex_replace('(?!SUCCESS).', '')|wordcount }}"
          short: true
        - title: Dead
          value: "{{ execution.result.stdout|regex_replace('(?!UNREACHABLE).', '')|wordcount }}"
          short: true
      footer: "{{ execution.id }}"
      footer_icon: "https://stackstorm.com/wp/wp-content/uploads/2015/01/favicon.png"

Make sure you add the necessary hosts to the Ansible inventory file: /etc/ansible/hosts

After pushing the code to the repository, do not forget to reinstall your pack from the chat:
!pack install armab/st2_chatops_aliases

It is very convenient that we can keep all our ChatOps configuration in the form of an st2 pack and pull changes from the repository: infrastructure as code.

The result of the newly created command in Slack:
Show Server Status - ChatOps

This is really convenient: even your CEO can see the status without having access to the servers! With this approach, communication, deployment, and work on the infrastructure can happen directly in chat, whether you are in the office or working remotely (some of us can work right from the beach).

Example 2. Restarting services


Has it ever happened that simply restarting a service fixed the problem? It is not an ideal approach, but a quick fix is often a must. Let's create a ChatOps command that restarts a given service on specific servers.
The goal is this syntax:
!service restart "rabbitmq-server" on "mq-01"

To do this, in the existing st2 pack, create actions/service_restart.yaml:

---
name: service_restart
description: Restart service on remote hosts
runner_type: local-shell-cmd
entry_point: ""
enabled: true
parameters:
  sudo:
    description: "Run command with sudo"
    type: boolean
    immutable: true
    default: true
  kwarg_op:
    immutable: true
  cmd:
    description: "Command to run"
    type: string
    immutable: true
    default: "/opt/stackstorm/virtualenvs/ansible/bin/ansible {{hosts}} --become --module-name=service --args='name={{service_name}} state=restarted'"
  hosts:
    description: "Ansible hosts"
    type: string
    required: true
  service_name:
    description: "Service to restart"
    type: string
    required: true

ChatOps Alias aliases/service_restart.yaml:

---
name: chatops.ansible_service_restart
action_ref: st2_chatops_aliases.service_restart
description: Restart service on remote hosts
formats:
  - display: "service restart <service_name> on <hosts>"
    representation:
      - "service restart {{ service_name }} on {{ hosts }}"
result:
  format: |
    Service restart `{{ execution.parameters.service_name }}` on `{{ execution.parameters.hosts }}` host(s): {~}
    {% if execution.result.stderr %}
    *Exit Status*: `{{ execution.result.return_code }}`
    *Stderr:* ```{{ execution.result.stderr }}```
    *Stdout:*
    {% endif %}
    ```{{ execution.result.stdout }}```
  extra:
    slack:
      color: "{% if execution.result.succeeded %}good{% else %}danger{% endif %}"
      fields:
        - title: Restarted
          value: "{{ execution.result.stdout|regex_replace('(?!SUCCESS).', '')|wordcount }}"
          short: true
        - title: Failed
          value: "{{ execution.result.stdout|regex_replace('(?!(FAILED|UNREACHABLE)!).', '')|wordcount }}"
          short: true
      footer: "{{ execution.id }}"
      footer_icon: "https://stackstorm.com/wp/wp-content/uploads/2015/01/favicon.png"

Result:
Reloading Nginx Service on Remote Servers - ChatOps
And you know what? Thanks to the Slack mobile app, you can restart services directly from your phone!

Example 3. MySQL processlist


We want to create a simple Slack command that displays a list of SQL queries being executed on a MySQL server:
!show mysql processlist

Action actions/mysql_processlist.yaml:
---
name: mysql_processlist
description: Show MySQL processlist
runner_type: local-shell-cmd
entry_point: ""
enabled: true
parameters:
  sudo:
    immutable: true
    default: true
  kwarg_op:
    immutable: true
  cmd:
    description: "Command to run"
    type: string
    immutable: true
    default: "/opt/stackstorm/virtualenvs/ansible/bin/ansible {{ hosts }} --become --become-user=root -m shell -a \"mysql --execute='SHOW PROCESSLIST;' | expand -t 10\""
  hosts:
    description: "Ansible hosts"
    type: string
    default: db

Action alias for ChatOps, aliases/mysql_processlist.yaml:
---
name: chatops.mysql_processlist
action_ref: st2_chatops_aliases.mysql_processlist
description: Show MySQL processlist
formats:
  - display: "show mysql processlist <hosts=db>"
    representation:
      - "show mysql processlist {{ hosts=db }}"
      - "show mysql processlist on {{ hosts=db }}"
result:
  format: |
    {% if execution.status == 'succeeded' %}MySQL queries on `{{ execution.parameters.hosts }}`: ```{{ execution.result.stdout }}```{~}{% else %}
    *Exit Code:* `{{ execution.result.return_code }}`
    *Stderr:* ```{{ execution.result.stderr }}```
    *Stdout:* ```{{ execution.result.stdout }}```
    {% endif %}

Note that we made the hosts parameter optional (db by default), so these two commands are equivalent:
!show mysql processlist
!show mysql processlist 'db'
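
To make the default-value syntax more concrete, here is a simplified sketch of how a format string such as "show mysql processlist {{ hosts=db }}" could be matched against chat input. This is not StackStorm's actual parser; compile_format and parse_command are hypothetical helpers written only to illustrate the idea that a parameter with a default becomes an optional part of the pattern.

```python
import re

# Matches "{{ name }}" or "{{ name=default }}", plus any whitespace before it.
PLACEHOLDER = re.compile(r"\s*\{\{\s*(\w+)\s*(?:=\s*([^}\s]+))?\s*\}\}")


def compile_format(fmt):
    """Translate an alias format string into (compiled_regex, defaults)."""
    parts, defaults, pos = [], {}, 0
    for match in PLACEHOLDER.finditer(fmt):
        name, default = match.groups()
        parts.append(re.escape(fmt[pos:match.start()]))
        group = r"\s+(?P<%s>.+?)" % name
        if default is not None:
            defaults[name] = default
            group = "(?:%s)?" % group  # a parameter with a default may be omitted
        parts.append(group)
        pos = match.end()
    parts.append(re.escape(fmt[pos:]))
    return re.compile("^" + "".join(parts) + "$"), defaults


def parse_command(fmt, text):
    """Return the parameters extracted from `text`, or None if no match."""
    regex, defaults = compile_format(fmt)
    match = regex.match(text.strip())
    if match is None:
        return None
    params = dict(defaults)
    for name, value in match.groupdict().items():
        if value is not None:
            params[name] = value.strip("'\"")  # drop surrounding quotes
    return params


FORMAT = "show mysql processlist {{ hosts=db }}"
print(parse_command(FORMAT, "show mysql processlist"))
print(parse_command(FORMAT, "show mysql processlist 'db'"))
```

Both calls yield the same parameters, which is the equivalence described above; a format without a default, such as "status {{ hosts }}", simply fails to match when the parameter is missing.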

Show the list of running SQL queries - ChatOps
Your DBA will be happy!

Example 4. Getting HTTP statistics from nginx


We want to collect the HTTP status codes from the nginx log, sort them by count, and display them nicely in chat, to see how many 200 responses or 50x errors the web servers are producing and whether that is within normal limits:
!show nginx stats on 'web'

To do this, create an action that runs a shell command, actions/http_status_codes.yaml:
---
name: http_status_codes
description: Show sorted http status codes from nginx logs
runner_type: local-shell-cmd
entry_point: ""
enabled: true
parameters:
  sudo:
    immutable: true
    default: true
  kwarg_op:
    immutable: true
  cmd:
    description: "Command to run"
    type: string
    immutable: true
    default: "/opt/stackstorm/virtualenvs/ansible/bin/ansible {{ hosts }} --become -m shell -a \"awk '{print \\$9}' /var/log/nginx/access.log|sort |uniq -c |sort -k1,1nr 2>/dev/null|column -t\""
  hosts:
    description: "Ansible hosts"
    type: string
    required: true
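
For readers less familiar with awk, the pipeline in the cmd above can be sketched in Python as follows. It assumes the default nginx combined log format, where the status code is the ninth whitespace-separated field. The sample log lines are made up and the function name status_code_counts is hypothetical.

```python
from collections import Counter

# Made-up sample lines in nginx "combined" log format.
SAMPLE_LOG = """\
203.0.113.5 - - [21/Feb/2017:10:00:01 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.47.0"
203.0.113.5 - - [21/Feb/2017:10:00:02 +0000] "GET /api HTTP/1.1" 502 173 "-" "curl/7.47.0"
203.0.113.9 - - [21/Feb/2017:10:00:03 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.47.0"
203.0.113.9 - - [21/Feb/2017:10:00:04 +0000] "GET /nope HTTP/1.1" 404 169 "-" "curl/7.47.0"
203.0.113.7 - - [21/Feb/2017:10:00:05 +0000] "GET /api HTTP/1.1" 502 173 "-" "curl/7.47.0"
203.0.113.7 - - [21/Feb/2017:10:00:06 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.47.0"
"""


def status_code_counts(lines):
    """Return (status_code, count) pairs, most frequent first."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # awk's $9 is the ninth field; unlike awk, skip lines that are too short.
        if len(fields) >= 9:
            counts[fields[8]] += 1
    return counts.most_common()


for code, count in status_code_counts(SAMPLE_LOG.splitlines()):
    print("%d  %s" % (count, code))
```

The shell one-liner remains the practical choice inside the action, since it runs on the remote hosts through Ansible; the Python version is only meant to show what the pipeline computes.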

Alias aliases/http_status_codes.yaml:
---
name: chatops.http_status_codes
action_ref: st2_chatops_aliases.http_status_codes
description: Show sorted http status codes from nginx on hosts
formats:
  - display: "show nginx stats on "
    representation:
      - "show nginx stats on {{ hosts }}"
result:
  format: "```{{ execution.result.stdout }}```"

Thanks to Brian Coca, Ansible core developer, for the great idea!

Show a list of nginx status codes on servers - ChatOps
More and more, this looks like a flight control center. You can run whole chains of commands on servers directly from chat, and everyone can see the result in real time. Excellent!

Example 5. Security patching


Imagine that you urgently need to fix another critical vulnerability like Shellshock. To do this, you need to update bash on all servers. Ansible is perhaps the perfect tool for such atomic operations. But instead of running a one-line ansible command, let's create a proper playbook,
playbooks/update_package.yaml:

---
- name: Update package on remote hosts, run on 25% of servers at a time
  hosts: "{{ hosts }}"
  serial: "25%"
  become: True
  become_user: root
  tasks:
    - name: Check if Package is installed
      command: dpkg-query -l {{ package }}
      register: is_installed
      failed_when: is_installed.rc > 1
      changed_when: no
    - name: Update Package only if installed
      apt: name={{ package }}
        state=latest
        update_cache=yes
        cache_valid_time=600
      when: is_installed.rc == 0

The playbook updates the package only if it is already installed, and the operation runs on 25% of the hosts at a time, that is, in roughly four batches. This is useful when you need to update something more serious, like nginx, on a really large number of servers, because the whole web cluster is never taken down at once. You can additionally remove each group from the load balancer while it is being updated. This is a real-life example.
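
To see how a percentage turns into concrete batches, here is a small sketch of the arithmetic. It assumes that Ansible rounds the percentage down to a whole number of hosts and never uses fewer than one host per batch, which is our understanding of how serial behaves; percent_batches is a hypothetical helper, not part of Ansible.

```python
def percent_batches(hosts, percent):
    """Split `hosts` into batches of `percent` percent of the inventory."""
    # Round down, but never below one host per batch (assumed behavior).
    size = int(percent / 100.0 * len(hosts)) or 1
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]


eight = ["web-%02d" % n for n in range(1, 9)]
ten = ["web-%02d" % n for n in range(1, 11)]

print(len(percent_batches(eight, 25)))  # 4 batches of 2 hosts
print(len(percent_batches(ten, 25)))    # 5 batches of 2 hosts (2.5 rounds down)
```

Under these assumptions the number of batches is exactly four only when the host count divides evenly; with 10 hosts the batch size rounds down to 2 and the rollout takes five steps.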

Note that the playbook variables {{ hosts }} and {{ package }} come from outside, namely from the action in our StackStorm pack, actions/update_package.yaml:

---
name: update_package
description: Update package on remote hosts
runner_type: local-shell-cmd
entry_point: ""
enabled: true
parameters:
  sudo:
    immutable: true
    default: true
  kwarg_op:
    immutable: true
  timeout:
    default: 6000
  cmd:
    description: "Command to run"
    immutable: true
    # this is the line that passes the variables to the playbook
    default: "/opt/stackstorm/virtualenvs/ansible/bin/ansible-playbook /opt/stackstorm/packs/${ST2_ACTION_PACK_NAME}/playbooks/update_package.yaml --extra-vars='hosts={{ hosts }} package={{ package }}'"
  hosts:
    description: "Ansible hosts"
    type: string
    required: true
  package:
    description: "Package to upgrade"
    type: string
    required: true
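
Because hosts and package arrive from chat, it can be worth thinking about quoting. The sketch below shows one possible way to build the same ansible-playbook command with the variables passed as a JSON document, a form that --extra-vars accepts, and with every argument shell-quoted. The helper playbook_command is hypothetical, and the pack name in the path is an assumption taken from the action_ref used in the aliases.

```python
import json
import shlex

ANSIBLE_PLAYBOOK = "/opt/stackstorm/virtualenvs/ansible/bin/ansible-playbook"
# Assumed location of the playbook inside the pack.
PLAYBOOK = "/opt/stackstorm/packs/st2_chatops_aliases/playbooks/update_package.yaml"


def playbook_command(playbook, **extra_vars):
    """Build a shell-safe ansible-playbook command line."""
    # JSON keeps values with spaces or "=" intact; shlex.quote keeps the
    # shell from interpreting anything inside them.
    payload = json.dumps(extra_vars, sort_keys=True)
    return " ".join([
        ANSIBLE_PLAYBOOK,
        shlex.quote(playbook),
        "--extra-vars",
        shlex.quote(payload),
    ])


command = playbook_command(PLAYBOOK, hosts="all", package="bash")
print(command)
```

Splitting the resulting string with shlex.split gives back the original arguments unchanged, even when a value contains characters such as ; or quotes.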

The action alias, which lets you run the playbook as a simple ChatOps command, is
aliases/update_package.yaml:

---
name: chatops.ansible_package_update
action_ref: st2_chatops_aliases.update_package
description: Update package on remote hosts
formats:
  - display: "update <package> on <hosts>"
    representation:
      - "update {{ package }} on {{ hosts }}"
      - "upgrade {{ package }} on {{ hosts }}"
result:
  format: |
    Update package `{{ execution.parameters.package }}` on `{{ execution.parameters.hosts }}` host(s): {~}
    {% if execution.result.stderr %}
    *Exit Status*: `{{ execution.result.return_code }}`
    *Stderr:* ```{{ execution.result.stderr }}```
    *Stdout:*
    {% endif %}
    ```{{ execution.result.stdout }}```
  extra:
    slack:
      color: "{% if execution.result.succeeded %}good{% else %}danger{% endif %}"
      fields:
        - title: Updated nodes
          value: "{{ execution.result.stdout|regex_replace('(?!changed=1).', '')|wordcount }}"
          short: true
        - title: Executed in
          value: ":timer_clock: {{ execution.elapsed_seconds | to_human_time_from_seconds }}"
          short: true
      footer: "{{ execution.id }}"
      footer_icon: "https://stackstorm.com/wp/wp-content/uploads/2015/01/favicon.png"

Here is the command:
!update 'bash' on 'all'


An important part of a DevOps engineer's work is improving processes: making developers' work easier, improving team communication, and diagnosing problems faster through automation and the right tools, all in order to make the company more successful.
ChatOps helps solve these problems in a completely new and effective way!

In conclusion. Holy cow


As you know, Ansible has a well-known love for the cowsay utility. Let's bring it to ChatOps!

First, install the utility itself:
sudo apt-get install cowsay

Action actions/cowsay.yaml:

---
name: cowsay
description: Draws a cow that says what you want
runner_type: local-shell-cmd
entry_point: ""
enabled: true
parameters:
  sudo:
    immutable: true
  kwarg_op:
    immutable: true
  cmd:
    description: "Command to run"
    type: string
    immutable: true
    default: "/usr/games/cowsay {{message}}"
  message:
    description: "Message to say"
    type: string
    required: true

Alias aliases/cowsay.yaml:

---
name: chatops.cowsay
action_ref: st2_chatops_aliases.cowsay
description: Draws a cow that says what you want
formats:
  - display: "cowsay "
    representation:
      - "cowsay {{ message }}"
ack:
  enabled: false
result:
  format: |
    {% if execution.status == 'succeeded' %}Here is your cow: ```{{ execution.result.stdout }}``` {~}{% else %}
    Sorry, no cows this time {~}
    Exit Code: `{{ execution.result.return_code }}`
    Stderr: ```{{ execution.result.stderr }}```
    Hint: Make sure `cowsay` utility is installed.
    {% endif %}

Summon the holy ChatOps cow:
!cowsay 'Holy ChatOps Cow!'

Holy ChatOps Cow
For reference: all command execution results can be viewed in the StackStorm control panel at https://chatops/ (login: demo, password: demo). Replace the hostname with the IP address if you did not use the Vagrant demo.


Do not stop there!


These were simple but battle-tested use cases. More complex scenarios, in which several DevOps tools are combined into a dynamic workflow, will be shown in future articles. That is where StackStorm demonstrates its full power, making decisions depending on the situation: this is called event-driven architecture, as in systems that heal themselves after an incident.
If you did not find the functionality you need in StackStorm, suggest an idea or open a pull request on GitHub (Python is our main language). There is also a community where you can ask a question or share your experience: a public Slack channel (with a pre-installed demo bot) and IRC: #StackStorm on freenode.net.


Thank you for your attention. I hope this article managed to highlight the features of this fairly new approach in the DevOps world.
What would you use ChatOps for? Please share your ideas and stories (we love stories).
