Migrating from Tarantool 1.5 to 1.6



    Hello, Habr! I want to tell the story of migrating one of our projects from Tarantool 1.5 to 1.6. Do you even need to move to a new version if everything already works? How easy is it to do when you have already written a lot of code? How do you avoid affecting live users? What difficulties can you run into along the way? And what do you gain from the move in the end? You will find answers to all of these questions in this article.

    Service Architecture


    This story is about our push notification service. There is already an article on Habr about sending messages "across the entire user base"; that is one part of this service. All of the code was originally written in Python 2.7. We use the uwsgi and gevent stack and, of course, Tarantool. The service already sends out about one billion push notifications per day, and two clusters of eight servers each handle this load. The architecture of the service is shown in the figure below.


    Push notification service architecture

    Notification delivery is handled by several servers in the cluster, shown in the figure as Node 1, ..., Node N. The delivery servers talk to the Apple Push Notification service and Firebase Cloud Messaging cloud platforms. Each server also handles HTTP traffic from mobile applications. The delivery servers are completely identical: if one of them fails, its entire load is automatically redistributed to the other servers in the cluster.

    Our users, their settings, and their push tokens are stored in Tarantool, shown in the figure as Tarantool Storage. Two replicas are used to spread the read load. The service code is designed to tolerate temporary unavailability of the master or the replicas: if one replica does not respond, the select request goes to the next available replica or to the master. All write requests to the master are made through Tarantool Queue; if the master is unavailable, the queue accumulates requests until the master is back in service.
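
    For illustration, here is a minimal sketch of that read failover in Python (the host names and the helper are hypothetical; our production code also keeps persistent connections instead of reconnecting on every request):

    import tarantool

    REPLICAS = [("tnt-replica-1", 33013), ("tnt-replica-2", 33013)]  # hypothetical hosts
    MASTER = ("tnt-master", 33013)

    def failover_select(space, key):
        # try every replica in turn and fall back to the master
        for host, port in REPLICAS + [MASTER]:
            try:
                conn = tarantool.connect(host, port)
                return conn.select(space, key)
            except Exception:
                # real code would catch the connector's network errors specifically
                continue
        raise RuntimeError("no Tarantool instance is available")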

    For a long time we had a single Tarantool master that stored the push tokens. At 10,000 requests per second, one master is enough, and to spread 100,000 read requests we use several replicas. While a single master handles the writes, the read load can always be scaled further by adding new replicas.

    The architecture of the service was originally designed for load growth and horizontal scaling. For a while, we could easily grow horizontally, adding new servers for sending out notifications. But is it possible to grow endlessly with such an architecture?

    The problem is that there is only one instance of the Tarantool master. It runs on a dedicated server and has grown to 50 GB. The two replicas live on a second server and together occupy 50 * 2 = 100 GB. These are quite heavy Tarantool instances: after a restart they do not come up instantly. On top of that, free memory on the server with the replicas was reaching its limit, a clear sign that something had to change.

    Database sharding


    Database sharding suggests itself: take the one large Tarantool master instance and split it into several small ones. There are already a couple of ready-made sharding solutions, but they all work only with Tarantool 1.6.

    As always, there is one more "but". Mail.Ru Mail generates a huge stream of events about new messages, messages being read, messages being deleted, and so on. Only some of these events have to be delivered to mobile applications as push notifications, and processing the full stream is fairly resource-intensive. That is why the Mail.Ru Mail service filters out the unneeded events and sends only the useful part of the traffic to our service. To filter events, Mail.Ru Mail needs information about installed mobile applications, and it gets it by querying our Tarantool replicas. This means splitting one Tarantool into several becomes harder: we would have to tell every third-party service how our sharding works. That would complicate the system considerably, especially when resharding is required, and the development would take a long time, since several services would have to be modified.

    One possible solution is a proxy that pretends to be a single Tarantool and distributes requests across several shards. In that case the third-party services do not have to change at all.
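
    The routing idea itself is simple. Below is a rough Python sketch of it (the shard list and the hash-based routing function are assumptions for illustration; the real proxy, described later, is written in Go):

    import zlib

    # hypothetical list of small Tarantool 1.6 masters behind the proxy
    SHARDS = [("tnt16-1", 3301), ("tnt16-2", 3301), ("tnt16-3", 3301)]

    def pick_shard(key):
        # crc32 is stable across processes, unlike Python's built-in hash()
        crc = zlib.crc32(str(key).encode("utf-8")) & 0xffffffff
        return SHARDS[crc % len(SHARDS)]

    # the same key always lands on the same instance
    print(pick_shard("user:42"))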



    So, what do we get from sharding?

    • more room to scale: we can keep growing;
    • smaller Tarantool instances that start faster, which reduces downtime in case of failures;
    • better distribution of the Tarantool load across the CPU cores of a single server, so more cores can be used;
    • more efficient use of server memory (100 GB on the master server and 100 GB on the replica server).

    Our service uses Tarantool 1.5, and development of that version has stopped. So if we are going to do sharding and build a proxy anyway, why not replace the old Tarantool 1.5 with the new Tarantool 1.6?

    What other advantages does Tarantool 1.6 have


    Our service is written in Python, and to work with Tarantool we use the connector github.com/tarantool/tarantool-python. For Tarantool 1.5 the connector packs and unpacks iproto data in pure Python with struct.pack / struct.unpack calls. The Tarantool 1.6 connector uses the msgpack library instead. Preliminary benchmarks showed that packing and unpacking with msgpack consumes somewhat less CPU time than the struct-based code, so switching to 1.6 could free up CPU resources in the cluster.
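
    As a rough illustration of the kind of comparison we ran (a simplified sketch, not our actual benchmark and not the connectors' exact wire format):

    import struct
    import timeit

    import msgpack  # the serializer used by the 1.6 connector

    ROW = (12345, "a" * 64, 1)  # a tuple roughly the shape of a push-token record

    def pack_struct():
        # roughly what the 1.5 connector does in pure Python:
        # every field gets a length prefix built with struct.pack
        out = b""
        for field in ROW:
            data = str(field).encode("utf-8")
            out += struct.pack("<I", len(data)) + data
        return out

    def pack_msgpack():
        # the 1.6 connector hands the whole tuple to msgpack
        return msgpack.packb(ROW)

    print("struct: ", timeit.timeit(pack_struct, number=100000))
    print("msgpack:", timeit.timeit(pack_msgpack, number=100000))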

    A little bit about the future


    Apple has developed a new protocol for delivering push notifications to iOS devices. It differs from the previous version: it is based on HTTP/2 and adds support for hiding previously delivered push notifications. The maximum payload size for one push notification is 4 KB (versus 2 KB in the old protocol).

    To send notifications to Android devices we use the Google Firebase Cloud Messaging service. It has introduced support for encrypting push notification content for the Chrome browser.

    Unfortunately, Python still has no good libraries for working with HTTP/2, and none that support Google's notification encryption. Worse, the existing libraries have to be made to work with the gevent and asyncio frameworks. This got us thinking about how hard it will be to support our service in the future. We looked at Go as an option: Go has good support for all of the new goodies from Apple and Google. But again there was a problem: we could not find an official Go connector for Tarantool 1.5. Sadness, pain, despair? No, that is not about us. :)

    So, to keep developing our service, we need to solve the following tasks:

    • implement sharding;
    • upgrade to Tarantool 1.6;
    • build a proxy that handles client requests in the Tarantool 1.5 protocol format;
    • move the queuing system to Tarantool 1.6;
    • update the production cluster without affecting our users.

    Developing the proxy


    We chose Go as the language for the proxy; it is very well suited to this class of problems. After Python it feels unusual to write in a compiled language with type checking, and the lack of classes and exceptions raised some doubts at first. But several months of work showed that you can do just fine without them. Goroutines and channels in Go are very convenient from a development point of view. Tools such as benchmark tests, the powerful Go profiler, golint and gofmt help a lot and speed up development. The community support, conferences, blogs and articles about Go are simply admirable!

    So we got tarantool-proxy. It accepts client connections, speaks the Tarantool 1.5 protocol with them, and distributes requests across several Tarantool 1.6 instances. The read load, as before, can be scaled with replicas. While rolling out the new solution we kept the option to roll back: we modified our Python code so that every write request is duplicated both to tarantool-proxy and to the "old" Tarantool 1.5 instance. Effectively, our code barely changed but started working with Tarantool 1.6 through the proxy. You may ask: why so complicated, surely there will be no rollback? Oh yes, there was a rollback. More than one.
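
    A minimal sketch of that dual-write scheme (the addresses and the helper are hypothetical, not our actual code):

    import tarantool

    # hypothetical addresses: the proxy in front of the 1.6 shards
    # and the old 1.5 master that is kept in sync in case of a rollback
    proxy_16 = tarantool.connect("tarantool-proxy", 33013)
    legacy_15 = tarantool.connect("tnt15-master", 33013)

    def dual_insert(space, values):
        # the proxy speaks the 1.5 protocol, so both calls look identical
        proxy_16.insert(space, values)
        legacy_15.insert(space, values)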

    Even though we had done load testing, after the first launch tarantool-proxy consumed too much CPU. We rolled back, profiled, and fixed it. After the second launch tarantool-proxy consumed a lot of memory, as much as 3 GB. The Go profiler helped find the problem once again.

    Enabling the profiler is quite simple:

    import (
        _ "net/http/pprof" // registers the /debug/pprof handlers on the default mux
        "net/http" 
    )

    // netprofile holds the listen address, e.g. "127.0.0.1:8895"
    go http.ListenAndServe(netprofile, nil) 
    

    Capture a heap profile:

    go tool pprof -inuse_space tarantool-proxy http://127.0.0.1:8895/debug/pprof/heap

    Look at the profiling results:

    Entering interactive mode (type "help" for commands)
    (pprof) top20
    1.74GB of 1.75GB total (99.38%)
    Dropped 122 nodes (cum <= 0.01GB)
          flat  flat%   sum%        cum   cum%
        1.74GB 99.38% 99.38%     1.74GB 99.58%  main.tarantool_listen.func1
             0     0% 99.38%     1.75GB 99.89%  runtime.goexit
    (pprof) list main.tarantool_listen.func1
    Total: 1.75GB
    ROUTINE ======================== main.tarantool_listen.func1 in /home/work/src/tarantool-proxy/daemon.go
        1.74GB     1.74GB (flat, cum) 99.58% of Total
             .          .     37:
             .          .     38:           //run tarantool15 connection communicate
             .          .     39:           go func() {
             .          .     40:                   defer conn.Close()
             .          .     41:
        1.74GB     1.74GB     42:                   proxy := newProxyConnection(conn, listenNum, tntPool, schema)
             .     3.50MB     43:                   proxy.processIproto()
             .          .     44:           }()
             .          .     45:   }
             .          .     46:}
             .          .     47:
             .          .     48:func main() {
    (pprof)

    So there was a memory consumption problem that needed fixing, and we tracked it down by running all of this on a production server under load. By the way, there is an excellent article on profiling in Go; it really helped us find the problem spots in the code.

    After fixing all of the tarantool-proxy problems, we watched the service run under the new scheme for another week. Then we finally abandoned Tarantool 1.5 and removed all requests to it from the Python code.

    How do you migrate data from Tarantool 1.5 to 1.6?


    For migrating data from 1.5 to 1.6 everything is ready out of the box: github.com/tarantool/migrate. Take the 1.5 snapshot and load it into the 1.6 instances; for sharding, each instance then drops the data it does not need. A little patience, and we have a new Tarantool storage. All third-party services got access to the new cluster through tarantool-proxy.

    What difficulties did we encounter?


    We had to port the Lua procedure code to 1.6, but honestly, that did not take much effort. Another peculiarity of 1.6 is that there are no queries of the form

    select * from space where key in (1,2,3)

    They had to be rewritten as several queries in a loop:

    for key in (1,2,3):
        select * from space where key = ?
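
    In the Python code such a workaround boils down to something like this (a sketch; tnt is a connector instance as in the examples below, and the space name and keys are made up):

    # several single-key selects instead of one "key in (...)" query
    results = []
    for key in (1, 2, 3):
        results.extend(tnt.select("some_space", key))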

    We also switched to the new tarantool-queue, which is maintained by the developers and works with Tarantool 1.6. There were periods when the Python code had to work with both Tarantool 1.5 and Tarantool 1.6 at the same time.

    Install the connector for 1.5 using pip and rename it:

    pip install "tarantool<0.4"
    mv env/lib/python2.7/site-packages/tarantool env/lib/python2.7/site-packages/tarantool15

    Next, install the connector for 1.6:

    pip install tarantool

    In Python code, do the following:

    # todo: use only 1.6 once the migration is finished

    # old storage: the renamed 1.5 connector
    import tarantool15 as tarantool
    tnt = tarantool.connect(host, port)
    tnt.select(1, "foo", index=0)        # 1.5: index referenced by number

    # new storage: the regular 1.6 connector
    import tarantool
    tnt = tarantool.connect(host, port)
    tnt.select(1, "bar", index="baz")    # 1.6: index can be referenced by name

    So supporting two versions of Tarantool in the same Python code does not take much effort. We gradually retired the 1.5 instances with the queues and switched completely to Tarantool 1.6.

    Production Update


    If you think we updated production quickly and everything worked right away, that is far from the truth. For example, after the first attempt to switch to the new tarantool-queue on 1.6, the graphs of our services' average response time looked like this:


    Average service response time

    The green graphs clearly show the growth of the average HTTP request processing time in uwsgi, and not only there. After several iterations of hunting down the causes of this growth, the graphs returned to normal. And here are the final graphs of Load Average and CPU consumption on a production server:


    Load Average


    CPU usage

    The green graphs show that we freed up some hardware resources. Given that our load is constantly growing, we gained a small reserve of capacity on the current cluster configuration.

    To summarize


    The move to Tarantool 1.6 took us several months of work, plus about a month for sharding the storage and updating production. Modifying an existing high-load system is quite hard. Our service changes constantly: a live project always has bugs that require developer attention, and new product feature requests keep appearing that also demand changes to the existing code.

    Development cannot simply be paused, especially for such a long task. You always have to think about how to roll back to the previous state, and most importantly, the work has to be done in small iterations.

    What's next?


    Resharding is one of the remaining open questions. It is not urgent yet, so we have time to evaluate the work done and predict when we will need it. We also plan to rewrite part of the service in Go; perhaps the service will then consume even less CPU. That would make for another article. :)

    Thanks to all the developers, the Mail.Ru Mail operations engineers, the Tarantool team, and everyone who took part in the migration, helped us, and inspired us to maintain and develop our service. It was great!
