Consul: Service Discovery made simple, or say goodbye to config files

This is a review of Consul ( http://consul.io ), a system for service discovery and distributed key-value storage. Besides Consul itself, we will also look at Consul-Template, a tool for managing service configurations that automatically reflects changes in the topology. The article will be of interest to DevOps engineers, system architects, project team leads and anyone else interested in microservice architectures.

Naturally, I cannot cover every aspect of how Consul works and is used, but the article describes enough to get the inquisitive reader interested and encourage deeper study.

Consul: what is it and what is it for?

A lyrical digression, but on topic.
In today's world of huge data volumes, where distributed systems for processing them are no longer unattainable fiction but commonplace, designing and implementing them properly becomes a very important factor in their further development. Anyone who has ever taken part in designing architectures for automatically scalable, distributed systems knows that this process is very laborious and requires a fairly serious stack of knowledge about the systems from which such architectural solutions can be built. Given the rapid development of cloud computing and the emergence of IaaS platforms, deploying scalable systems has become quite simple. However, the interaction of the components of such systems (integrating new components, removing unused parts, etc.) is always a headache for architects, DevOps engineers and programmers. For these purposes you can reinvent the wheel (configuration file templates, self-registration support on the application side, etc.), you can use local or distributed key-value storage systems (redis, zookeeper, etcd, etc.), or you can use Service Discovery.

The term Service Discovery (hereafter abbreviated SD) often refers to network discovery systems (the SDP protocol, for example), but lately SD is also used in the software part of architectures for mutual discovery of related system components. This is especially true of the microservice approach to building software systems. MSA (Micro Services Architecture), one of whose pioneers and popularizers is Netflix, is increasingly becoming the standard for developing distributed, auto-scalable, highly loaded systems. And Consul is already used in many places to provide SD in systems of this kind. For example, Cisco uses it in its engine for MSA, Cisco MI.

Essentially, Consul is a successful combination of K/V storage and SD functionality. Now for the details.

Consul: why is it better?

A reasonable question: “Why do we need Consul if we have Zookeeper and it does an excellent job with SD?” The answer lies on the surface: Zookeeper and similar systems (etcd, doozerd, redis, etc.) do not provide SD functionality out of the box; their task is only to store data in one format or another and to guarantee its availability and consistency at any given moment (provided, of course, that they are configured and used correctly). Naturally, such a model can be enough to implement SD, but its ease of use (configuration, maintenance, etc.) often leaves much to be desired.

Take Zookeeper, for example: it means constant fuss with its cluster, starting with the initial setup (automated installation of a zk cluster with Ansible or SaltStack can be a hassle even for an advanced specialist) and ending with having to hand every piece of software that uses Zookeeper a link to the cluster of the form zk://10.10.1.2:2181,10.10.1.3:2181,10.10.1.5:2181/app (you have to know in advance where the cluster lives, i.e. all of its nodes). Moreover, if for some reason the Zookeeper cluster "moves" to other addresses (very relevant in cloud environments and MSA architectures), you will have to restart all applications and services that use it.
Consul is simpler: the folks at HashiCorp made "everything for people". Consul is distributed as a single binary (no need to track dependencies or use package managers), and any software that uses Consul always talks to it on localhost (no need to store a link to the cluster or to a master node of the service); Consul takes care of everything. Using gossip as the communication protocol makes Consul fast and fault-tolerant, and it does not require a dedicated master for normal operation. In fact, a master (more precisely, a quorum of masters) does formally exist, but it is needed mostly to survive a complete stop of all cluster nodes (the masters periodically save operational data to disk, thereby guaranteeing data persistence). As a result, for an application (microservice) using Consul, all SD work boils down to communicating with localhost:8500; wherever the application moves, there will always be a Consul agent there. Moreover, you do not need any client libraries to work with Consul (unlike Zookeeper): a simple and understandable HTTP REST API is used (simple data, no more than 20 different API endpoints).
More details can be found here .
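
As a small illustration (a hedged sketch: the key and value below are made up for the example), this is roughly what talking to the local agent over the HTTP API looks like:

# list all services known to the catalog
curl http://localhost:8500/v1/catalog/services

# write and read a value in the distributed K/V storage
curl -X PUT -d 'mongo.example.local' http://localhost:8500/v1/kv/app/config/db_host
curl http://localhost:8500/v1/kv/app/config/db_host?raw

All requests go to the local agent, which talks to the rest of the cluster on the application's behalf.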

Consul: how to install it and get started?

I will say right away that we will not dwell on installation and configuration in detail; for those reading this article it should be a fairly simple exercise. The only issue worth noting is that the installation documentation is not easy to find on the site, so here are the links: initial installation (as homework: write start/stop scripts for your favorite init.d/upstart/systemd, cross out whichever does not apply), launching agents and cluster initialization .
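
For orientation only, here is a hedged sketch of launching agents (paths and addresses are made up; see the official documentation for the authoritative set of flags):

# single-node development mode, everything kept in memory
consul agent -dev

# a server node of a real cluster, expecting two more servers for quorum
consul agent -server -bootstrap-expect 3 -data-dir /var/lib/consul -bind 10.0.0.1 -config-dir /etc/consul.d

# run on one node to join it to the others
consul join 10.0.0.2 10.0.0.3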

A couple of comments on choosing a cluster topology. It is worth noting that Consul does not have a dedicated master that single-handedly accepts and distributes service configurations and data between nodes: absolutely any agent can be used to register services and data. Strictly speaking, a master does exist (more precisely, a quorum of masters), but its main role is to ensure the persistence of data across cluster restarts.
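
To see the current topology from any node (a sketch; output omitted here), the built-in CLI is enough:

consul members   # all agents in the cluster and their roles (server / client)
consul info      # among other things, the raft section shows the current leader and quorum state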

Consul: Register a service, use requests

So, having a ready cluster (or a single node for testing), let's start registering services. First we will set up a hypothetical scenario that we will use to explore how Consul works: suppose we have a classic web application consisting of several frontend services, several backend services and a data store; let it be MongoDB. Let us say right away that this is a test infrastructure, and questions like "why is MongoDB not clustered?", "why HAProxy and not Nginx?" and so on are left to the inquisitive reader as homework.
When working with Consul we will distinguish two types of services: active (using the HTTP REST API for self-registration and for implementing availability checks) and passive (requiring pre-prepared configuration files for Consul). The first category covers services developed in-house (the company's product and its components); the second covers third-party applications that do not necessarily support working with Consul, or do not support it at all (MongoDB, for example).

So, let's register the MongoDB service by creating the file /etc/consul.d/mongodb.json :

{
  "service": {
    "name": "mongo-db",
    "tags": ["mongo"],
    "address": "123.23.34.56",
    "port": 27017,
    "checks": [
      {
        "name": "Checking MongoDB",
        "script": "/usr/bin/check_mongo.py --host 123.23.34.56 --port 27017",
        "interval": "5s"
      }
    ]
  }
}

The most important things here:
1. address / port: this is the data that Consul clients will receive in response to a request for information about the mongo-db service. The published address must be reachable.
2. The "checks" section: a list of checks that determine whether the service is alive. A check can be any script (returning 0 if the service is functioning normally, 1 if the service is in a warning state, and any other value if the service is unavailable) or an http check (a given URL is requested and the service status is derived from the response: HTTP/2XX means the service is alive, HTTP/4XX or HTTP/5XX means the service is unavailable).

More details on the site: description of services, description of checks.
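
For comparison, an http-type check would look something like this inside the same service definition (the URL is hypothetical):

"checks": [
  {
    "name": "Checking backend HTTP endpoint",
    "http": "http://localhost:9090/health",
    "interval": "10s"
  }
]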

A subsequent agent restart (with /etc/consul.d/ specified as the configuration directory) will pick up this file and register MongoDB as a service available for SD. The script specified in the checks section connects to MongoDB on the given host (testing the availability of the service) and, for example, queries some collection to verify that the data is reachable; a sketch of such a script is shown below.
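
A minimal sketch of such a check script (assuming the pymongo driver is installed; the database and collection names are made up):

#!/usr/bin/env python
# check_mongo.py: a hypothetical availability check for the mongo-db service
import argparse
import sys

from pymongo import MongoClient
from pymongo.errors import PyMongoError

parser = argparse.ArgumentParser()
parser.add_argument('--host', default='localhost')
parser.add_argument('--port', type=int, default=27017)
args = parser.parse_args()

try:
    client = MongoClient(args.host, args.port, serverSelectionTimeoutMS=2000)
    client.admin.command('ping')          # the server responds at all
    client.app_db.users.find_one()        # the data is actually readable
except PyMongoError:
    sys.exit(2)   # any value other than 0 or 1 means the service is unavailable
sys.exit(0)       # 0 means the service is healthy
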
Later, you can check the registration using curl:

~/WORK/consul-tests #curl -XGET http://localhost:8500/v1/catalog/service/mongo-db
[{"Node":"mpb.local","Address":"192.168.59.3","ServiceID":"mongo-db","ServiceName":"mongo-db","ServiceTags":["mongo"],"ServiceAddress":"123.23.34.56","ServicePort":27017}]

Or using the built-in Consul DNS server:

~/WORK/consul-tests #dig @127.0.0.1 -p 8600 mongo-db.service.consul SRV
; <<>> DiG 9.8.3-P1 <<>> @127.0.0.1 -p 8600 mongo-db.service.consul SRV
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50711
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;mongo-db.service.consul.	IN	SRV
;; ANSWER SECTION:
mongo-db.service.consul. 0	IN	SRV	1 1 27017 mbp.local.node.dc1.consul.
;; ADDITIONAL SECTION:
NEST.local.node.dc1.consul. 0	IN	A	123.23.34.56
;; Query time: 1 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Thu Sep 17 17:47:22 2015
;; MSG SIZE  rcvd: 152

Which method of obtaining data from Consul to use depends on the architecture of the requesting component: for scripts the DNS interface is more convenient; for components written in high-level languages, REST requests or specialized libraries are preferable.
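
One more REST endpoint worth mentioning: the health API lets you ask only for instances whose checks are passing (a sketch; output trimmed):

curl http://localhost:8500/v1/health/service/mongo-db?passing

Unlike /v1/catalog/service/..., this request will not return instances that are currently failing their checks.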

All services capable of self-registration should use libraries for the appropriate language: python, java, go, ruby, php. Besides actually registering services, do not forget to write correct availability checks for each service, so that you do not end up with a system full of registered but non-working services.
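
For an "active" service, self-registration boils down to a single call to the local agent. Here is a minimal sketch over the raw HTTP API, without a client library (the service name, address, port and health URL are made up; the requests package is assumed to be available):

import requests

# Describe this backend instance and its availability check.
service = {
    "ID": "backend-1",
    "Name": "backend",
    "Tags": ["python"],
    "Address": "10.0.0.5",
    "Port": 9090,
    "Check": {
        "HTTP": "http://10.0.0.5:9090/health",
        "Interval": "10s"
    }
}

# Register with the local agent; deregistration on shutdown is a PUT to
# /v1/agent/service/deregister/backend-1.
resp = requests.put("http://localhost:8500/v1/agent/service/register", json=service)
resp.raise_for_status()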

Consul: Goodbye configuration files.

Now we get to the heart of the story, the part the whole article was written for. So, at a certain point in time we have an environment in which services are registered (mongodb and backend, for example); what benefit can we get from it?
In traditional distributed systems (without built-in SD), roughly the following procedure is used to add a new component to the system (say, when the load grows and you need another backend):
1. A backend service instance is created (often using orchestration systems such as SaltStack/Puppet/Ansible/hand-made scripts/etc.)
2. The template engine of the orchestration system generates new configuration files for the services that use the backend (load balancers, frontends, etc.)
3. The same orchestration system generates a config file for the new backend service, putting into it contact information for mongodb and other dependent components
4. All dependent services re-read the configuration (or restart), re-establishing the connections between themselves
5. The system waits for convergence and switches to the working state.

Such an approach is very costly: you have to handle config file generation, distribution, restarting of services, and so on. On top of that, an orchestration system (a component external to the working system) is involved in the process, and its availability also has to be monitored.

SD allows this process to be simplified significantly (as the inquisitive reader has no doubt already guessed), but it requires a change in the behavior of the services that make up the system. And this is not only SD support (service registration and service discovery), but also fault tolerance (the ability of a service to painlessly survive changes in the topology of the services it depends on), active use of the KV storage to exchange configuration information, and so on.
The only external component that still has to be used in such configurations is Consul-Template, a tool for hooking up all kinds of systems that do not support SD themselves, for example HAProxy. Its job is to track the registration and deregistration of services and to update the configuration of the dependent services; i.e. when a new backend is registered, the HAProxy config is automatically rebuilt to include the new instance in the load balancing pool (a sketch of such a template is shown below). You can read more about this here .
In fact, using SD based on Consul, Consul-Template and Consul KV can, in principle, help you get rid of configuration files entirely and leave everything to the care of Consul.
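
To make this concrete, here is a rough sketch of an HAProxy backend section written as a Consul-Template template (the service name and ports come from our hypothetical setup; check the Consul-Template documentation for the exact flags):

# haproxy.ctmpl
backend app_backend
    balance roundrobin{{range service "backend"}}
    server {{.Node}} {{.Address}}:{{.Port}} check{{end}}

# render the template and reload HAProxy on every topology change
consul-template -consul localhost:8500 -template "haproxy.ctmpl:/etc/haproxy/haproxy.cfg:service haproxy reload"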

In conclusion.

In general, since Consul is still under active development, some problems are possible (from what I have noticed: the cluster can fall apart when all Consul nodes are restarted), but the basic SD functionality works fine and I have no complaints about it. Let me also remind you that Consul supports multiple data centers, to ensure distribution.
