Configuring ClickHouse for integration testing in GitLab CI

We had a service in Go, a dedicated Kafka topic, ClickHouse, GitLab CI and a failing pipeline, a rotten SSH key, and that was it, along with vacation season, terrible rains in the city, a broken laptop, alerts at night, and a hot sale. Not that all of this was necessary for this article, but once you start showing the typical everyday life of a tester, you might as well see the intention through to the end. The only thing that worried me was the P0. There is nothing in the world more desperate, gloomy and depressed than a tester who let one slip through to production. But I knew that pretty soon I would plunge into that too.

Why all this?


There is a common bundle of services: the service itself, which does something, and a database into which the results are written. Sometimes this happens directly, that is, "service - database". In my case, writing goes through an intermediary, that is, "service - queue - database".

So there are several elements, and the boundary between them — the output of one and the input of another — is exactly where problems appear. They simply cannot not appear there.

A vivid example: in the service the price field is handled as float32, while in the database it is defined as Decimal(18, 5). We feed the maximum value of float32 from the service's output into the database as a test case — oops, the database stops responding. Or a sadder example: the database does not crash, and there is no error in its logs, it's just that nothing gets written anymore. Or the write goes through, but with loss or distortion of data: the field leaves the service as float64 and is stored as float32.
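To make that first boundary case concrete, here is a tiny sketch in Go — just an illustration of the arithmetic, not the project's actual test code. Decimal(18, 5) keeps 18 significant digits with 5 of them after the point, so anything at or above 10^13 no longer fits, while the maximum float32 is around 3.4·10^38.

package main

import (
	"fmt"
	"math"
)

func main() {
	// Decimal(18, 5): 18 significant digits, 5 of them after the point,
	// so the integer part can only hold values just under 10^13.
	maxDecimal := math.Pow10(13)

	// A perfectly legal value on the service side...
	price := float64(math.MaxFloat32) // ~3.4e38

	// ...that has no chance of fitting into the column.
	fmt.Printf("price = %g, fits into Decimal(18, 5): %v\n", price, price < maxDecimal)
}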

Or at some point in the service's life cycle somebody decides that the type of this or that field has to change. The field has long been live in production, but now it needs to be edited. And of course it gets changed in only one place. Bam, something goes wrong again.

Task


I do not want to keep track of all these changes by hand. I want nothing to fall over. I want the writes to go through correctly.

The way out: integration tests!

Implementation and difficulties 


Where to break?


There is a dev environment: terribly unstable, usually used by developers as a sandbox. Chaos and anarchy reign there, as is typical for a harsh backend.

There is a test environment, or QA stand: it is configured better, devops even keep an eye on it, but until you give them a kick, nothing will happen. This environment is updated often, and even more often something there is broken.

And there is production — the holy of holies: better not to run anything like this on it. Integration tests assume the possibility of a bug, which they must find before it reaches production.

So what do you do when every environment is either unstable or live? That's right, create your own!

What to do with the database?


The database can be launched in several ways.

As discussed above, we will not connect to the real database of any existing environment.

First, you can spin up a bare clickhouse-server with the necessary settings, apply the required SQL to it, and talk to it via clickhouse-client. On the first successful attempt to bring up such a database, CI got sad: the tests finished, but the server did not shut down and kept running. Let's just say it remains a mystery to me why it even started (it did that by itself, I had nothing to do with it). I do not recommend this option.

A convenient out-of-the-box option is to use a Docker image.
Download the desired version to your machine. ClickHouse needs a config.xml with settings in order to start. More details here.
For a reusable ClickHouse image, you need to write a proper Dockerfile. In it we state that we want to copy config.xml into the right folder, and we add the other required configs. Be sure to also copy the scripts that deploy your database.

Since we will be accessing the container from the outside, we need to expose the ports through which we will communicate with ClickHouse. ClickHouse serves HTTP on 8123 and the native TCP protocol on 9000.

We get the following Dockerfile:

FROM yandex/clickhouse-server
EXPOSE 8123
EXPOSE 9000
ADD config.xml /etc/clickhouse-server/config.xml
ADD my_init_script.sql /docker-entrypoint-initdb.d/
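Once the image is built and the container is running locally (with ports 8123 and 9000 published), it is easy to check that ClickHouse actually came up: its HTTP interface answers "Ok." on /ping. A minimal sketch in Go, assuming the container's port 8123 is published on localhost:

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// ClickHouse's HTTP interface responds with "Ok." on /ping when the server is up.
	resp, err := http.Get("http://localhost:8123/ping")
	if err != nil {
		log.Fatalf("clickhouse is not reachable: %v", err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("status=%d body=%q\n", resp.StatusCode, body)
}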

How to get the image into CI?


To work with the Docker image in CI, you first need to get it there somehow.

You could commit the image into your repository and call docker run with the necessary parameters as part of running the tests. But the ClickHouse Docker image weighs about 350 MB, and it is indecent to keep files like that in git.

Besides, if the same Docker image is needed in different projects (for example, different services write to the same database), that is all the more reason not to do this. Instead, you can use a Docker registry for image storage.
Let's assume it is already set up for our project. So we log in, build the Docker image, and push it there.

docker build -t my_registry_path.domain.com/my_clickhouse_image:latest .
docker login my_registry_path.domain.com
docker push my_registry_path.domain.com/my_clickhouse_image:latest

And off it goes — our image has flown into the registry. Be sure to specify the registry path and tag when building!

The database is ready.

Read more about the registry here

What to do with CI?


How to launch both your service and database within one step? 

It all depends on how the service is started and used. If you work with the service as a Docker image, and the whole .gitlab-ci.yml deals only with images, then everything is simple.
There is a thing called dind — docker-in-docker. You specify it as the main service the CI job works with, and it lets you use Docker to the full without straining at all.

We pull the latest image, add the required testing step to the stages, and describe our sequence of actions.

image: docker:stable
services:
- docker:dind
stages:
- build
  …
- test-click
...
- test
- release
  …
test-click:
  variables:
    VERY_IMPORTANT_VARIABLE: "its value"
  before_script:
  - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
  script:
  - docker pull My_Service_Image
  - docker pull My_ClickHouse_Image
  - docker run -FLAGS My_ClickHouse_Image
  - docker run My_Service_Image /path/to/tests

The official Docker documentation says that using dind is not recommended, but if you really need to...

In my project, the service has to be tested by launching its binary. This is where the magic begins.
For this, the database is used as a service. The official GitLab CI documentation cites a container with a database as the most common use case for a Docker container in CI. There are even example configurations for MySQL, Postgres, and Redis. But we are not looking for easy ways — we need ClickHouse.

Connect the database! Be sure to specify an alias. If it is not specified, the database will be assigned a generated name and a random IP, so it will not be clear how to reach it at all. With an alias there is no such problem — in the test code the call will simply go to, for example, http://my_alias_name:8123.
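For illustration, a hedged sketch of how the test code could talk to ClickHouse over its HTTP interface through that alias. The alias my_alias_name and the package name are placeholders; the project's real tests may well use a driver instead.

package tests

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// queryClickHouse sends a query to ClickHouse over its HTTP interface,
// addressing the database by the service alias from .gitlab-ci.yml,
// and returns the raw response body (one row per line by default).
func queryClickHouse(query string) (string, error) {
	resp, err := http.Post("http://my_alias_name:8123/", "text/plain", strings.NewReader(query))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("clickhouse returned %d: %s", resp.StatusCode, body)
	}
	return string(body), nil
}

A call like queryClickHouse("SELECT count() FROM my_db.my_table") is then enough to check what ended up in the database.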

The tests still need the database image that we carefully put into the registry. To download it, you need docker login and docker pull, except that the runner does not know what Docker is — you have to install it first.

The resulting step in .gitlab-ci.yml:

Integration tests:
  services:
    - name: my_clickhouse:latest
      alias: clicktest
  stage: tests
  variables:
    Variables_for_my_service: "value"
  before_script:
    - curl -sSL https://get.docker.com/ | sh
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN my_registry_path.domain.com
  script:
    - ./bin/my_service &
    - go test -v ./tests -tags=integration
  dependencies:
    - build
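One detail behind the go test -v ./tests -tags=integration line: for the integration tests not to run during an ordinary go test, the test files carry a build tag, roughly like this (the package name is an assumption):

//go:build integration
// +build integration

package tests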

Profit


  • I have a working service-plus-database bundle.
  • From the autotests it is easy to reach the database — simply by its alias.
  • In the test setup I reset the database's records and settings, then call the service, it writes to the database, I query the database, I see that it has not fallen over, I see what has arrived, I validate it, and I throw in more tests (a rough sketch follows right after this list).
  • No more testing by hand!
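A rough sketch of what such a test could look like, living in the same integration-tagged tests package and reusing the queryClickHouse helper sketched earlier. The table, column, and trigger names are made up for illustration, and producing the real input event for the service is left as a placeholder.

package tests

import (
	"strings"
	"testing"
)

// sendTestEvent stands in for whatever makes the service write a record:
// in this setup that would be producing a message to its Kafka topic.
func sendTestEvent(t *testing.T) {
	t.Helper()
	// Placeholder: depends on the service's input.
}

func TestPriceIsWritten(t *testing.T) {
	// Setup: wipe the table so the test starts from a known state.
	if _, err := queryClickHouse("TRUNCATE TABLE my_db.prices"); err != nil {
		t.Fatalf("failed to reset the table: %v", err)
	}

	// Act: make the service produce a record.
	sendTestEvent(t)

	// Assert: the database is alive and the record arrived.
	rows, err := queryClickHouse("SELECT price FROM my_db.prices")
	if err != nil {
		t.Fatalf("clickhouse query failed: %v", err)
	}
	if strings.TrimSpace(rows) == "" {
		t.Fatal("expected at least one row in my_db.prices, got none")
	}
}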

Results


It would seem like just a couple of lines of setup in gitlab-ci. Building a Docker image is easy. Running it locally is simple. I had the integration working, with the first tests finding problems, within a day. But getting it to run in CI turned into a week of pain and hopelessness. Which then turned into weeks of pain and hopelessness for the developers, who now have to fix everything they have programmed there.

What did we manage to do?


  • We set up a container with ClickHouse.
  • We put the container image into our own image storage (the registry).
  • We learned to pull this image in a CI step.
  • We launched it in the runner.

We easily sent data to the database and read it back from the tests.

Automation turned out to be a pretty simple way to get rid of the routine of manually poking at the integration.

What is important to pay attention to: make sure that the input types of the database match the output types of the previous link in the chain (and the documentation, if there is any).
