Towards an Uninterrupted (HA) Open Cloud: An Introduction to Using OpenStack in Commercial Installations
Author: Oleg Gelbukh
There are several basic requirements for deploying the OpenStack platform for commercial use, whether as a small cluster for development environments at a startup or as a large-scale installation for a cloud service provider. The following requirements come up most frequently and are therefore the most important:
- High availability (HA) and redundancy of services
- Cluster scalability
- Automation of operations
Mirantis has developed an approach that satisfies all three of these requirements. This article, the first in a series describing that approach, gives an overview of the methods and tools we use.
High availability (HA) and redundancy
In general, OpenStack services can be divided into several groups, each with its own approach to high availability.
API Services
The first group includes API servers, namely:
- nova-api
- glance-api
- glance-registry
- keystone
Since these services use HTTP/REST protocols, redundancy is relatively easy to achieve with a load balancer added to the cluster. If the load balancer supports health checks, that is enough for basic high availability of the APIs. Note that in the 2012.1 (Essex) release of OpenStack, only the Swift API supports a dedicated “health check” call; making the other services testable in the same way requires extending their APIs.
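As a minimal illustration of this setup, a load balancer fragment along the following lines (HAProxy syntax; the addresses, backend names, and check parameters are assumptions for the example, not values from this article) spreads nova-api requests across two controllers and drops a backend that fails its check. The same pattern applies to glance-api, glance-registry, and keystone on their respective ports.

```
# Hypothetical HAProxy fragment for nova-api (default port 8774);
# IP addresses and server names are placeholders.
listen nova-api
    bind 10.0.0.10:8774
    mode http
    balance roundrobin
    # Until the services expose a dedicated health-check call, a plain
    # HTTP request against the API root serves as the check.
    option httpchk GET /
    server ctrl-01 10.0.0.11:8774 check inter 2000 rise 2 fall 3
    server ctrl-02 10.0.0.12:8774 check inter 2000 rise 2 fall 3
```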
Compute services
The second group includes services that actually manage virtual servers and provide resources for them:
- nova-compute
- nova-network
- nova-volume
These services do not require special redundancy measures in a production environment. The approach to this group follows the basic cloud computing paradigm: there are many interchangeable worker processes, and losing one of them causes only a temporary, local loss of manageability, not a failure of the service the cluster provides. It is therefore enough to track these services with an external monitoring system and to have the main recovery scenarios implemented as event handlers. The simplest scenario is to notify the administrator and try to restart the failed service.
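As a sketch of that simplest recovery scenario, the event handler below (Python; the service command, e-mail address, and local SMTP relay are assumptions for the example, not details from the article) tries to restart a failed service and mails the administrator the result:

```python
#!/usr/bin/env python
# Hypothetical recovery handler: restart a failed nova service and
# notify the administrator. Addresses and commands are placeholders.
import smtplib
import subprocess
from email.mime.text import MIMEText

ADMIN = "ops@example.com"

def handle_failure(service):
    """Try a restart, then report the result to the administrator."""
    failed = subprocess.call(["service", service, "restart"]) != 0
    status = "restart FAILED" if failed else "restarted"
    msg = MIMEText("%s was %s by the monitoring handler" % (service, status))
    msg["Subject"] = "[openstack] %s %s" % (service, status)
    msg["From"] = ADMIN
    msg["To"] = ADMIN
    smtplib.SMTP("localhost").sendmail(ADMIN, [ADMIN], msg.as_string())

if __name__ == "__main__":
    handle_failure("nova-compute")
```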
High availability of the network data path, provided by the multi-host feature of the nova-network service, is described in the official OpenStack documentation. In real production environments, however, this scheme is often replaced by routing traffic through an external hardware router, so that nova-network acts only as a DHCP server; multi-host support then ensures that the DHCP server is not a single point of failure.
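For reference, a hedged sketch of how the multi-host scheme is typically switched on; the flag and command names follow the Essex-era documentation and should be checked against your release, and the network parameters are placeholders:

```
# nova.conf on every compute node that runs nova-network
# (older releases prefix flags with "--"):
multi_host=True

# Networks also carry the multi_host attribute when they are created:
nova-manage network create --label=private --fixed_range_v4=10.0.1.0/24 \
    --num_networks=1 --network_size=256 --multi_host=T
```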
Scheduler
Redundancy is built into the nova-scheduler service. The first nova-scheduler instance started begins consuming messages from the scheduler queue on the RabbitMQ server. An additional queue, scheduler_fanout_<ID>, is also created, which nova-compute services use to publish status updates; the <ID> part of the name is the identifier of the new scheduler instance. Every subsequently launched nova-scheduler behaves the same way, so the instances can work in parallel without any additional effort.
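One way to observe this behaviour (the command is standard RabbitMQ tooling; the exact queue names depend on the scheduler instances that are running) is to list the queues on the broker:

```
# Each running nova-scheduler instance appears as its own fanout queue,
# e.g. scheduler_fanout_<instance-id>.
rabbitmqctl list_queues name messages | grep scheduler
```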
Queue server
The RabbitMQ queue server is the main communication channel for all nova services, so it must be reliable in any production configuration. Clustering and queue mirroring are supported natively by RabbitMQ, and a load balancer can distribute connections among RabbitMQ servers running in cluster mode. Mirantis has also developed a patch to the nova RPC library that lets it fail over to a standby RabbitMQ server when the primary server fails and stops accepting connections.
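For illustration, joining a second broker to an existing cluster uses RabbitMQ's own tooling along these lines (the host name is a placeholder, and older 2.x releases use the cluster command instead of join_cluster):

```
# On the node being added to the cluster:
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@ctrl-01
rabbitmqctl start_app
rabbitmqctl cluster_status   # verify that both nodes are listed
```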
Database
Most OpenStack deployments use a MySQL database, and it is the database Mirantis most often uses in its installations as well. Several solutions currently exist for MySQL continuity and scalability. The most commonly used replication management tool is MySQL-MMM (Multi-Master replication Manager). It is used in several installations performed by Mirantis and works reliably enough, apart from its well-known limitations.
Although we have had no serious problems with MMM, we are considering more modern open source solutions for keeping the database running smoothly, in particular Galera, a MySQL clustering engine based on WSREP. Galera Cluster provides a simple, transparent scalability mechanism and achieves fault tolerance through synchronous multi-master replication implemented at the WSREP level.
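As a rough sketch, a Galera node is configured with a handful of wsrep options in my.cnf; the library path, cluster name, and node addresses below are placeholders, not values from this article:

```
# Hypothetical my.cnf fragment for one Galera node.
[mysqld]
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name=openstack_db
wsrep_cluster_address=gcomm://10.0.0.21,10.0.0.22,10.0.0.23
wsrep_sst_method=rsync
# Settings Galera expects from MySQL itself:
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
```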
Scalability
Now that we know how to balance or parallelize the load, we need a mechanism for adding service processes to the cluster so it can handle more load, that is, for scaling it “horizontally.” For most OpenStack components it is enough to add another instance of the service and include it in the load balancer configuration, and the cluster scales out. In production installations, however, this raises two problems:
- Most clusters scale by nodes, not by individual service instances. This makes it necessary to define node roles so the cluster can be scaled in a “smart” way: a role essentially corresponds to the set of services running on a node, and the cluster scales by adding nodes with a given role.
- Horizontal scaling by adding a control node requires configuration changes in several places, in a specific order: the node must be deployed and its services started before the load balancer configuration is updated to include it. For compute nodes the procedure is simpler, but it still requires a high degree of automation at every level, from the hardware up to the service configuration.
Nodes and Roles
Although OpenStack services can be distributed across servers quite flexibly, the most common deployment layout has two types of nodes: a control node and compute nodes. A typical development installation of OpenStack has one control node running all services except the compute group, plus several compute nodes that run the compute services and host the virtual servers.
Clearly, such an architecture is not suitable for commercial installations. For small clusters we recommend making the cluster nodes as self-sufficient as possible by installing the API servers on the compute nodes, leaving only the database, the queue server, and the dashboard on the control node. The control node configuration must include redundancy. We define the following node roles for this architecture (a sketch of the resulting role-to-service mapping follows the list):
- End node. This node runs the load balancing and high availability services, which may include dedicated load balancing and clustering software. A hardware load-balancing appliance on the network can also act as an end node. We recommend at least two end nodes per cluster for redundancy.
- Control node. This node hosts the communication services the whole cloud depends on: the queue server, the database, the Horizon dashboard, and possibly a monitoring system. It may optionally also host the nova-scheduler service and the API servers, with the end node distributing load across the API instances. At least two control nodes must exist in the cluster for redundancy. A control node and an end node can be combined on one physical server, but the nova services must then be reconfigured to move them off the ports used by the load balancer.
- Compute node. This node hosts the hypervisor and the virtual instances that use its computing power. A compute node can also act as the network controller for the virtual instances it hosts if the multi-host scheme is used.
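To make the breakdown concrete, the role-to-service mapping for this architecture might look roughly like the sketch below (Python; the service lists and the choice of HAProxy and keepalived for the end node are illustrative assumptions, not a prescribed inventory):

```python
# Hypothetical role-to-service mapping for the layout described above;
# exact placement (e.g. API servers on control vs. compute nodes) varies.
ROLES = {
    "end": ["haproxy", "keepalived"],                # balancing and HA
    "control": ["rabbitmq-server", "mysql", "openstack-dashboard",
                "nova-scheduler",                    # optional
                "nova-api", "glance-api", "glance-registry", "keystone"],
    "compute": ["nova-compute", "nova-network", "nova-volume"],
}
```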
Configuration management
To implement the architecture proposed above, a certain sequence of steps must be performed on each physical server. Some of the steps are quite complex, and some involve several nodes at once; configuring the load balancer or setting up multi-master replication are examples. Because the current process of deploying OpenStack is so complex, scripting these operations is essential to doing it successfully, which has led to several projects, including the well-known Devstack and Crowbar.
Simply scripting the installation process, however, is not enough to deploy OpenStack successfully in production or to make the cluster scalable, and any change to the architecture or upgrade of component versions means writing new scripts. These tasks call for tools designed specifically for them: configuration management systems. The best known are Puppet and Chef, and there are products built on top of them (for example, the Crowbar mentioned above uses Chef as its engine).
We have used both Puppet and Chef to deploy OpenStack in various projects, and naturally each has its own limitations. Our experience shows that the best results are achieved when the configuration management system is backed by a centralized orchestration engine. Combined with an application that sets up the physical servers at the hardware level and a set of tests that confirm the quality of the installation, this gives an integrated approach that lets us install the OpenStack platform quickly on a wide range of hardware configurations and logical architectures.
Operations Automation
Using an orchestration engine together with a role-aware configuration system lets us automate the deployment process to a fairly high degree, and scaling can be automated as well; all of this reduces the cost of operating and maintaining OpenStack. Most modern orchestration engines provide an API that can be used to build command line or web interfaces for operators managing the whole cluster or its individual parts.
We will discuss this in more detail in the following articles.