Prometheus + Grafana + Node Exporter + Docker in Azure with notifications in Telegram
To begin, we prepare a virtual machine. The small script below uses the Azure CLI to deploy it and automate a few routine operations:
#!/bin/bash
echo "AZURE VM Create"
echo "Azure Account:"
echo "Azure name:"
read AZ_NAME
read -sp "Azure password: " AZ_PASS && echo && az login -u "$AZ_NAME" -p "$AZ_PASS"
echo "Name Group VM"
read GROUP_NAME
az group create --name "$GROUP_NAME" --location eastus
echo "VM name"
read VM
echo "Admin user name"
read ADMIN
# Create the VM; cloud-init.txt installs Docker and Docker Compose on first boot
az vm create --resource-group "$GROUP_NAME" --name "$VM" --image UbuntuLTS --admin-username "$ADMIN" --generate-ssh-keys --custom-data cloud-init.txt
# Open the ports used by the monitoring stack
az vm open-port --resource-group "$GROUP_NAME" --name "$VM" --port 8080 --priority 1001
az vm open-port --resource-group "$GROUP_NAME" --name "$VM" --port 8081 --priority 1002
az vm open-port --resource-group "$GROUP_NAME" --name "$VM" --port 9090 --priority 1003
az vm open-port --resource-group "$GROUP_NAME" --name "$VM" --port 9093 --priority 1004
az vm open-port --resource-group "$GROUP_NAME" --name "$VM" --port 9100 --priority 1005
az vm open-port --resource-group "$GROUP_NAME" --name "$VM" --port 3000 --priority 1006
# Look up the public IP address of the VM
RESULT=$(az vm show --resource-group "$GROUP_NAME" --name "$VM" -d --query "[publicIps]" -o tsv)
echo "$RESULT"
echo "Wait 5 min"
sleep 300
# Add the admin user to the docker group on the VM
ssh "$ADMIN@$RESULT" -y << EOF
sudo usermod -aG docker $ADMIN
EOF
sleep 10
echo "Connect to Azure..."
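The fixed five-minute sleep in the script is a guess at how long the VM needs to boot. As a rough sketch of an alternative, the hypothetical helper below (wait_for_port is not part of the original script) polls a TCP port until it accepts connections or a timeout expires. It assumes bash and the coreutils timeout command are available on the machine that runs the deployment script.

```shell
#!/bin/sh
# wait_for_port HOST PORT [TIMEOUT_SECONDS]
# Hypothetical helper: returns 0 as soon as HOST:PORT accepts a TCP
# connection, or 1 if it is still unreachable after the timeout.
wait_for_port() {
  host=$1
  port=$2
  timeout_s=${3:-300}
  deadline=$(( $(date +%s) + timeout_s ))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    # bash's /dev/tcp pseudo-device opens a TCP connection; "timeout 5"
    # keeps a single attempt from hanging on an unreachable host.
    if timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}
```

In the deployment script this could be called as wait_for_port "$RESULT" 22 600 before the ssh step. An open SSH port does not mean cloud-init has finished installing Docker, so some additional waiting may still be needed; running cloud-init status --wait on the VM is one way to block until it completes.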
In the script, we use the cloud-init.txt file that will automatically install Docker and Docker-Compose on the virtual machine.
#cloud-config
package_upgrade: true
write_files:
  - path: /etc/systemd/system/docker.service.d/docker.conf
    content: |
      [Service]
        ExecStart=
        ExecStart=/usr/bin/dockerd
  - path: /etc/docker/daemon.json
    content: |
      {
        "hosts": ["fd://","tcp://127.0.0.1:2375"]
      }
runcmd:
  - apt-get update && apt-get install mc -y
  - curl -sSL https://get.docker.com/ | sh
  - curl -L "https://github.com/docker/compose/releases/download/1.23.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
  - chmod +x /usr/local/bin/docker-compose
In the home directory, create a folder for the project and, inside it, the file docker-compose.yaml:
version: '3.2'
services:
  alertmanager-bot:
    image: metalmatze/alertmanager-bot:0.3.1
    environment:
      - ALERTMANAGER_URL=http://<alertmngerURL>:9093 # where the bot gets alerts from
      - LISTEN_ADDR=0.0.0.0:8080
      - BOLT_PATH=/data/bot.db
      - STORE=bolt
      - TELEGRAM_ADMIN=<TelegramAdminID> # your Telegram ID
      - TELEGRAM_TOKEN=<TelegramBotToken> # the bot token
      - TEMPLATE_PATHS=/templates/default.tmpl
    volumes:
      - /srv/monitoring/alertmanager-bot:/data
    ports:
      - 8080:8080
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus:/etc/prometheus/
    command:
      - --config.file=/etc/prometheus/prometheus.yml
    ports:
      - 9090:9090
    links:
      - cadvisor:cadvisor
    depends_on:
      - cadvisor
    restart: always
  node-exporter:
    image: prom/node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - --path.procfs=/host/proc
      - --path.sysfs=/host/sys
      - --collector.filesystem.ignored-mount-points
      - ^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)
    ports:
      - 9100:9100
    restart: always
    deploy:
      mode: global
  alertmanager:
    image: prom/alertmanager
    ports:
      - 9093:9093
    volumes:
      - ./alertmanager/:/etc/alertmanager/
    restart: always
    command:
      - --config.file=/etc/alertmanager/config.yml
      - --storage.path=/alertmanager
  cadvisor:
    image: google/cadvisor
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - 8081:8080
    restart: always
    deploy:
      mode: global
  grafana:
    image: grafana/grafana
    depends_on:
      - prometheus
    ports:
      - 3000:3000
    volumes:
      - ./grafana:/var/lib/grafana
      - ./grafana/provisioning/:/etc/grafana/provisioning/
    restart: always
Remember that YAML does not allow tab characters for indentation, only spaces, so check this carefully. Let's look at the docker-compose.yaml file in more detail:
image: the Docker container image that each service will run
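Since a stray tab is easy to miss by eye, a quick check can be scripted. The sketch below is only an illustration: has_tabs is a hypothetical helper name, and it simply asks grep for lines containing a tab character.

```shell
#!/bin/sh
# has_tabs FILE
# Hypothetical helper: prints every line of FILE that contains a tab
# (prefixed with its line number) and returns 0 if any were found,
# or returns 1 if the file is free of tabs.
has_tabs() {
  tab=$(printf '\t')
  grep -n "$tab" "$1"
}
```

For example, has_tabs docker-compose.yaml prints nothing when the file is clean. Running docker-compose config -q in the project folder is another option, since it validates the whole file rather than just the indentation characters.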
Now let's create the Telegram bot. We will not go into detail, since the Internet is full of descriptions; I will just say that the bot is created through the @BotFather bot.
We need the bot's token and your Telegram ID to control the bot. Substitute both values into the file docker-compose.yaml.
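Before putting the token into docker-compose.yaml, it can be worth confirming that Telegram accepts it. The sketch below assumes curl is installed and uses the Bot API getMe method; the function names are hypothetical and not part of the original setup.

```shell
#!/bin/sh
# telegram_api_url TOKEN METHOD
# Hypothetical helper: prints the Telegram Bot API URL for METHOD.
telegram_api_url() {
  printf 'https://api.telegram.org/bot%s/%s\n' "$1" "$2"
}

# check_bot_token TOKEN
# Returns 0 if Telegram answers getMe with "ok":true for this token.
# curl -f turns an HTTP error (for example a rejected token) into a failure.
check_bot_token() {
  curl -fsS "$(telegram_api_url "$1" getMe)" | grep -q '"ok":true'
}
```

A call such as check_bot_token "<TelegramBotToken>" && echo "token accepted" exercises it. One way to find your own Telegram ID is to send the bot a message and then request the getUpdates method with the same helper; the sender ID appears in the from field of the returned message.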
Create files:
prometheus.yml in the prometheus directory, which describes the targets to collect metrics from and where to send alerts.
scrape_configs:
  - job_name: node
    scrape_interval: 5s
    static_configs:
      - targets: ['ip_node_explorer:9100']
rule_files:
  - './con.yml'
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['ip_alertmanager:9093']
con.yml in the same directory for describing alerts. This file describes one alert that checks if our Node Exporter is alive.
groups:
  - name: ExporterDown
    rules:
      - alert: NodeDown
        expr: up{job='node'} == 0
        for: 1m
        labels:
          severity: Error
        annotations:
          summary: "Node Exporter instance ({{ $labels.instance }}) down"
          description: "NodeExporterDown"
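Both files can be checked for syntax errors before the stack is started. The sketch below assumes Docker is already installed and that the command is run from the project folder containing the prometheus directory; promtool_in_docker is a hypothetical wrapper around the promtool binary that ships in the prom/prometheus image.

```shell
#!/bin/sh
# promtool_in_docker ARGS...
# Hypothetical wrapper: runs promtool from the prom/prometheus image with
# ./prometheus mounted read-only at /etc/prometheus inside the container.
promtool_in_docker() {
  docker run --rm \
    -v "$PWD/prometheus:/etc/prometheus:ro" \
    --entrypoint promtool \
    prom/prometheus:latest "$@"
}

# Example calls (paths are as seen inside the container):
#   promtool_in_docker check config /etc/prometheus/prometheus.yml
#   promtool_in_docker check rules /etc/prometheus/con.yml
```

The check config subcommand also loads the rule files listed under rule_files, so a broken con.yml should be reported by the first call as well.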
config.yml in the alertmanager directory, which routes alerts to the Telegram bot through a webhook.
route:
  group_wait: 20s # frequency
  group_interval: 20s # of notifications
  repeat_interval: 60s # in Telegram
  group_by: ['alertname', 'cluster', 'service']
  receiver: alertmanager-bot
receivers:
  - name: alertmanager-bot
    webhook_configs:
      - send_resolved: true
        url: 'http://ip_telegram_bot:8080'
Start the containers and check the result:
docker-compose up -d
docker-compose ps
You should have something like this:
As we can see, the State of every container is Up. If one of the containers did not start for some reason, we can view its log with the command:
docker logs <container name>
for example:
docker logs project_alertmanager_1
will bring us this result:
Now create a test.sh script to check for notifications.
#!/bin/sh
curl \
--request POST \
--data '{"receiver":"telegram","status":"firing","alerts":[{"status":"firing","labels":{"alertname":"Fire","severity":"critical"},"annotations":{"message":"Something is on fire"},"startsAt":"2018-11-04T22:43:58.283995108+01:00","endsAt":"2018-11-04T22:46:58.283995108+01:00","generatorURL":"http://localhost:9090/graph?g0.expr=vector%28666%29\u0026g0.tab=1"}],"groupLabels":{"alertname":"Fire"},"commonLabels":{"alertname":"Fire","severity":"critical"},"commonAnnotations":{"message":"Something is on fire"},"externalURL":"http://localhost:9093","version":"4","groupKey":"{}:{alertname=\"Fire\"}"}' \
localhost:8080
After you run it, the bot should send a test message.
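The test.sh script posts directly to the bot, so it does not exercise the Alertmanager route. As a possible complement, the sketch below sends a test alert to Alertmanager's v2 HTTP API, which should then forward it to the bot through the webhook. The function names are hypothetical, the Alertmanager address is an assumption, and the payload builder does no JSON escaping, so the arguments must not contain quotes or backslashes.

```shell
#!/bin/sh
# alert_payload NAME SEVERITY SUMMARY
# Hypothetical helper: prints a one-element JSON array in the shape the
# Alertmanager v2 API accepts. No escaping is done on the arguments.
alert_payload() {
  printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"summary":"%s"}}]\n' "$1" "$2" "$3"
}

# send_test_alert BASE_URL NAME SEVERITY SUMMARY
# Posts the alert to BASE_URL/api/v2/alerts.
send_test_alert() {
  curl -fsS -X POST \
    -H 'Content-Type: application/json' \
    --data "$(alert_payload "$2" "$3" "$4")" \
    "$1/api/v2/alerts"
}
```

Assuming Alertmanager is reachable on localhost, send_test_alert http://localhost:9093 Fire critical "Something is on fire" would submit the alert; with the group_wait value from config.yml, the notification should arrive after roughly 20 seconds.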
We can also check the alert described in con.yml. To do this, stop the Node Exporter with the command
docker stop <Node Exporter container name>
Within about two minutes, the bot will notify you that the server is down. Then start the Node Exporter again with the command
docker start <Node Exporter container name>
After a short while, the bot will report that the server is back up.
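While waiting for the bot, the state of the scrape targets can also be read from Prometheus itself by querying the up metric over its HTTP API. The sketch below is a rough illustration: count_down_targets is a hypothetical helper that counts series with the value "0" using grep, and it assumes the compact JSON that the API returns. A JSON-aware tool such as jq would be more robust.

```shell
#!/bin/sh
# count_down_targets
# Hypothetical helper: reads a /api/v1/query?query=up response on stdin
# and prints how many series currently report the value "0" (target down).
count_down_targets() {
  grep -o '"value":\[[^]]*,"0"]' | wc -l
}

# Example call, assuming Prometheus listens on localhost:9090:
#   curl -fsS 'http://localhost:9090/api/v1/query?query=up' | count_down_targets
```

With the Node Exporter stopped, the count should be at least 1, and it should drop back to 0 shortly after the container is started again.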
That's all. In the next article I will show how to connect additional metrics and create notifications in Prometheus.