How to save on EC2 spot instances with Scylla

Original author: Pavel Klyushin (translated post)
Spot instances can save you a lot of money. But what if you run stateful services, such as NoSQL databases? The main challenge is that each node in the cluster has to preserve certain parameters - its IP address, its data and other configuration. In this post we will talk about the open-source Scylla NoSQL database and how to run it on EC2 spot instances without interruption, using Spotinst's prediction technology together with its stateful (state-preserving) functionality.




What is Scylla?


Scylla is an open-source NoSQL database. It was designed to be compatible with Apache Cassandra while providing significantly higher throughput and lower latency, and it supports the same protocols and file formats as Apache Cassandra. However, Scylla is written entirely in C++ rather than Java like Apache Cassandra. In addition, Scylla is built on the Seastar framework, an asynchronous library that replaces threads, shared memory, mapped files and other classic Linux programming techniques. Scylla also has its own disk scheduler, which helps improve performance.

Tests conducted both by ScyllaDB engineers and by third-party companies have shown that Scylla performs up to 10 times better than Apache Cassandra.



How Scylla replicates its data between nodes


Scylla provides always-on availability: automatic failover and replication across multiple nodes and data centers provide fault tolerance.

Like Cassandra, Scylla uses the gossip protocol to exchange metadata, identify the nodes in the cluster and determine whether they are alive. There is no single point of failure: there is no central registry of node state, so the nodes have to exchange this information with each other.
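To see how a node currently views its peers via gossip, you can run nodetool status on any node. The output below is a representative sketch rather than the result of a real cluster:

$ nodetool status
Datacenter: us-east
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load    Tokens  Owns  Host ID  Rack
UN  192.168.1.1  112 GB  256     ...   ...      1a
UN  192.168.1.2  109 GB  256     ...   ...      1b
DN  192.168.1.3  110 GB  256     ...   ...      1c

Here UN means the node is up and in the normal state, while DN means gossip currently considers that node down.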


How to run Scylla on Spotinst


When creating a new Scylla cluster, you are unlikely to reach for spot instances right away because of their unpredictable behavior: a spot instance can be taken away with only a two-minute warning. This is where Elastigroup becomes the natural choice for such an environment.

Elastigroup keeps workloads on the spot market with what Spotinst claims is 100% availability. Picking the right bid for the right spot market and analyzing historical data in real time help it choose the spot instances with the lowest price and the longest expected lifetime. Changes in the spot market are predicted about 15 minutes in advance, which allows an instance to be replaced without interruption.

Now about preserving state. Elastigroup can persist data volumes: for every EBS volume attached to the instance, snapshots are taken continuously while it is running, and when the instance is replaced they are used to recreate the data on the replacement instance's volume.



For the node to keep working after such a replacement, there are a few things to keep in mind:

  1. Private IP - make sure the replacement instance gets the same private IP address so that the gossip protocol can keep communicating with the node.
  2. Volumes - the node must be attached to the same storage and have the same volume as before; otherwise the service will be unavailable.
  3. Configuration - the scylla.yaml config file is located at /etc/scylla/scylla.yaml by default. It must be edited so that each node knows its configuration. The key parameters are listed below, followed by a minimal example:

  • cluster_name - the name of the cluster. This parameter separates the nodes of different logical clusters; all nodes within one cluster must use the same value;
  • listen_interface - the interface Scylla binds to for connections to other nodes;
  • seeds - seed nodes used during startup to bootstrap the gossip process and join the cluster;
  • rpc_address - IP address of the interface for client connections (Thrift, CQL);
  • broadcast_address - IP address of the interface for node-to-node connections, i.e. how the node is seen inside the cluster.
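A minimal sketch of these settings in /etc/scylla/scylla.yaml could look like the following; the cluster name and all addresses here are placeholders for illustration:

cluster_name: 'ScyllaDB_Cluster'
seeds: "192.168.1.1,192.168.1.2"
listen_address: "192.168.1.3"
rpc_address: "192.168.1.3"
broadcast_address: "192.168.1.3"

Note that an explicit listen_address can be used instead of listen_interface, which is what the examples later in this post do.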

Rack Selection


To increase the availability of your data, it is recommended to distribute the nodes across availability zones. This is configured by setting endpoint_snitch to Ec2Snitch in scylla.yaml and by editing the cassandra-rackdc.properties file.

Suppose you have a cluster in the us-east-1 region. If node 1 is in us-east-1a and node 2 is in us-east-1b, Scylla will assume they are in two different racks within the same data center: node 1 will be considered rack 1a, and node 2 rack 1b.

Now let's show how to deploy a cluster of six nodes across two data centers. Each data center will consist of three nodes, two of which are seeds. The private IP addresses look like this:

US-DC1:

Node# Private IP
Node1 192.168.1.1 (seed)
Node2 192.168.1.2 (seed)
Node3 192.168.1.3



US-DC2:

Node# Private IP
Node4 192.168.1.4 (seed)
Node5 192.168.1.5 (seed)
Node6 192.168.1.6



On each Scylla node, you need to edit the scylla.yaml file. Here is an example for one node in each data center.

US Data center 1 - 192.168.1.1:

cluster_name: 'ScyllaDB_Cluster'
seeds: "192.168.1.1,192.168.1.2,192.168.1.4,192.168.1.5"
endpoint_snitch: Ec2Snitch
rpc_address: "192.168.1.201"
listen_address: "192.168.1.201"



US Data center 2 - 192.168.1.4:

cluster_name: 'ScyllaDB_Cluster'
seeds: "192.168.1.1,192.168.1.2,192.168.1.4,192.168.1.5"
endpoint_snitch: Ec2Snitch
rpc_address: "192.168.1.4"
listen_address: "192.168.1.4"





On each Scylla node, edit the cassandra-rackdc.properties file, indicating the relevant information about the rack and data center.

Nodes 1-3:

dc=us-east-1a
rack=RACK1



Nodes 4-6:

dc=us-east-1b
rack=RACK2
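Once the nodes are up, you can check how each node's placement was resolved, for example via CQL against the node's local metadata (a sketch; system.local is the standard table that stores the node's own data_center and rack):

cqlsh> SELECT data_center, rack FROM system.local;

The returned values should match the snitch's view of the node's data center and rack.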

Configure Spotinst Console


When configuring Elastigroup, it is important to enable the stateful feature - this preserves the data and network configuration when an instance is replaced after a spot interruption. Open the Compute tab, go to the stateful section and check the options as shown in the screenshot below.



We also recommend running the nodetool drain command in the shutdown script to flush the node's in-memory data to disk, clear the commit log and stop accepting new connections. The description is in the shutdown script section.
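A minimal shutdown script could look like this. This is a sketch: nodetool drain is the standard Scylla/Cassandra command, and the service name scylla-server assumes a default systemd-based installation:

#!/bin/bash
# Flush memtables to disk and stop accepting new client and inter-node connections
nodetool drain
# Stop the Scylla service cleanly before the spot instance is reclaimed
sudo systemctl stop scylla-server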

How does this work?


The animation below shows a Scylla cluster with three instances. All nodes run on spot instances with state preservation configured.

When one of the instances is interrupted, the stateful feature launches a replacement instance with the same private IP and the same root and data volumes. And, as you can see below, the instance returns to the cluster.



So with Scylla and Spotinst, you can increase performance while lowering costs.

If you want to see and test the solution, you can contact us through the form on the website, in the comments to this post, by email at ru@globaldots.com or by phone +7-495-762-45-85.
