Tips for creating custom workflows in GitLab CI
Translator's note: the original article was written by Miłosz Smółka, one of the founders of the small Polish company Three Dots Labs, which specializes in "advanced backend solutions". The author draws on his experience of heavy GitLab CI use and shares the tips he has accumulated with other users of this Open Source product. After reading them, we realized how close the problems he describes are to our own, so we decided to share the proposed solutions with a wider audience.

This time I will cover more advanced topics in GitLab CI. A common task here is implementing non-standard behavior in the pipeline. Most of these tips are specific to GitLab, although some of them can be applied to other CI systems as well.
Running integration tests
As a rule, running unit tests is easy to wire into any CI system; it is usually no harder than invoking one of the commands that ship with your language's standard tooling. In such tests you will most likely rely on mocks and stubs to hide implementation details and focus on the specific logic under test. For example, you can use an in-memory database as storage, or write stubs for HTTP clients that always return prepared responses.
Sooner or later, however, you will need integration tests to cover the less ordinary scenarios. I will not get into a discussion of all the possible kinds of testing; by integration tests I simply mean tests that use some kind of external resource. That can be a real database server, an HTTP service, mounted storage, and so on.
In GitLab it is easy to run such dependencies as Docker containers linked to the container that runs your scripts. These dependencies are defined with `services`. They are reachable by image name, or by a name of your choice if you set it in the `alias` field. Here is a simple example of using MySQL as a service:
```yaml
integration_tests:
  stage: tests
  services:
    - name: mysql:8
      alias: db
  script:
    - ./run_tests.sh db:3306
```
In this case the test scripts need to connect to the host `db`. Using an alias is usually a good idea, because it lets you swap images without modifying the test code. For example, you can replace the `mysql` image with `mariadb`, and the script will keep working correctly.
Waiting for containers
Since service containers take some time to start, you may need to wait before sending them any requests. A simple way is the wait-for-it.sh script with a defined timeout.
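For illustration, here is a minimal sketch of gating the tests on the service becoming reachable; it assumes wait-for-it.sh is committed to the repository:

```yaml
integration_tests:
  stage: tests
  services:
    - name: mysql:8
      alias: db
  script:
    # Block until db:3306 accepts TCP connections; fail the job after 30 s
    - ./wait-for-it.sh db:3306 --timeout=30 --strict
    - ./run_tests.sh db:3306
```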
Using Docker Compose
For most cases `services` should be enough. Sometimes, however, you need a more complex setup, for example to run Kafka and ZooKeeper in two separate containers (which is how the official images are built). Another example is launching a dynamic number of nodes, as with Selenium. The best solution for running such environments is Docker Compose:

```yaml
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
    ports:
      - 9092:9092
```
If you run your own GitLab Runner installation on trusted servers, you can start Docker Compose through the shell executor. Another possible option is a Docker-in-Docker (`dind`) container, but in that case read this article first.
One way to use Compose is to set up the environment, run the tests, and then tear everything down. A simple bash script would look like this:
```bash
docker-compose up -d
./run_tests.sh localhost:9092
docker-compose down
```
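One caveat: as written, the job's exit status comes from `docker-compose down`, so a test failure would not fail the job. A slightly hardened variant (a sketch) captures the tests' exit code first:

```bash
#!/bin/bash
docker-compose up -d
./run_tests.sh localhost:9092
rc=$?                # remember the tests' exit code
docker-compose down  # always tear the environment down
exit $rc             # propagate the test result to the CI job
```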
As long as your tests run in a minimal environment, everything is fine. But situations arise in which you need to install some dependencies... There is another way to run tests with Docker Compose: build your own Docker image containing the test environment. One of the containers then runs the tests and exits with the appropriate return code:
```yaml
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
  tests:
    image: registry.example.com/some-image
    command: ./run_tests.sh kafka:9092
```
Notice that we got rid of the port mapping: in this example the tests can reach all services directly. Everything is launched with a single command:

```bash
docker-compose up --exit-code-from tests
```
The `--exit-code-from` option implies `--abort-on-container-exit`, which means that the whole environment started by `docker-compose up` will be stopped as soon as one of the containers exits. The exit code of the command is the exit code of the selected service (`tests` in the example above). So if the command that runs the tests finishes with a non-zero code, the whole `docker-compose up` invocation fails with it.
Using labels as CI tags
Warning: this is a rather unusual idea, but I have found it very useful and flexible.
As you may know, GitLab has a Labels feature available at the project and group levels. Labels can be attached to issues and merge requests. However, they have no relationship with pipelines.

With a minor workaround you can access merge request labels from job scripts. Since GitLab 11.6 this has become even easier, because there is now an environment variable `CI_MERGE_REQUEST_IID` (yes, `IID`, not `ID`), available if the pipeline uses `only: merge_requests`. If `only: merge_requests` is not used, or you are on an older version of GitLab, the MR can still be found by calling the API:

```bash
curl "$CI_API_V4_URL/projects/$CI_PROJECT_ID/repository/commits/$CI_COMMIT_SHA/merge_requests?private_token=$GITLAB_TOKEN"
```
The field we need is `iid`. Remember, however, that multiple MRs can be returned for a given commit. Once you have the MR IID, all that remains is to query the Merge Requests API and use the `labels` field from the response:

```bash
curl "$CI_API_V4_URL/projects/$CI_PROJECT_ID/merge_requests/$CI_MERGE_REQUEST_IID?private_token=$GITLAB_TOKEN"
```
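Putting the two calls together, a job script might extract the labels like this. This is only a sketch: it assumes `jq` is available in the job image, `GITLAB_TOKEN` is defined as a CI variable, and it naively takes the first MR if several are returned:

```bash
#!/bin/bash
set -e

# Find the IID of the merge request for the current commit
MR_IID=$(curl -s "$CI_API_V4_URL/projects/$CI_PROJECT_ID/repository/commits/$CI_COMMIT_SHA/merge_requests?private_token=$GITLAB_TOKEN" \
    | jq -r '.[0].iid')

# Fetch the MR and read its labels as a space-separated list
LABELS=$(curl -s "$CI_API_V4_URL/projects/$CI_PROJECT_ID/merge_requests/$MR_IID?private_token=$GITLAB_TOKEN" \
    | jq -r '.labels | join(" ")')

echo "MR labels: $LABELS"
```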
Authorization
Unfortunately, at the moment it is not possible to use `$CI_JOB_TOKEN` to access the project API (at least not unless the project is public). If the project has restricted visibility (internal or private), you will need to generate a personal API token to authorize against the GitLab API.
This is not the safest solution, so be careful. If the token falls into the wrong hands, it may grant write access to all of your projects. One way to reduce the risk is to create a separate account with read-only access to the repository and generate a personal token for that account.
How safe are your variables?
Just a few versions ago, the Variables section was called Secret Variables, which sounds as though it was designed for the reliable storage of credentials and other sensitive information. In fact, these variables are merely hidden from users without Maintainer rights. They are not encrypted on disk, and they can easily leak through environment variables in scripts.
Keep this in mind when adding any variables, and consider storing secrets in safer solutions (for example, HashiCorp Vault).
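For illustration, a job could fetch a secret at run time instead of keeping it in CI variables. This is a sketch only: it assumes the `vault` CLI is present in the job image and the runner can authenticate (for example via `VAULT_TOKEN`), and the secret path `secret/deploy` with its `key` field is hypothetical:

```yaml
deploy:
  stage: deploy
  script:
    # Read the deploy key from Vault at job run time
    - export DEPLOY_KEY=$(vault kv get -field=key secret/deploy)
    - ./deploy.sh production
```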
Use cases
What to do with the lists of labels is up to you. Here are some ideas (see the sketch after this list):
- Use them to select which tests to run.
- Use key-value semantics with a colon as a delimiter (for example, labels like `tests:auth`, `tests:user`).
- Enable certain features for specific jobs.
- Turn on debugging for certain jobs when the label is present.
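Building on the `LABELS` variable from the earlier sketch, a job could branch on a label like this (the `--suite` flag of `run_tests.sh` is hypothetical):

```bash
# Run the extended auth test suite only when the MR carries the "tests:auth" label
if echo "$LABELS" | grep -qw "tests:auth"; then
    ./run_tests.sh --suite auth
fi
```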
Calling external APIs
Although GitLab comes with a rich set of built-in features, you will very likely want to integrate other tools with your pipelines. The simplest way to do so is, of course, calling the good old `curl`. If you build your own tools, you can teach them to listen for GitLab webhooks (see the Integrations tab in the project settings). However, if you are going to use them with critical systems, make sure they meet your high-availability requirements.
Example: Grafana annotations
If you work with Grafana, annotations are a great way to mark events that occur over time on your graphs. They can be added not only manually in the GUI, but also by calling the Grafana REST API:

To access the API, you will need to generate an API Key. Consider creating a separate user with limited access:

Define two variables in the project settings:
- `GRAFANA_URL`: the URL of the Grafana installation (for example, `https://grafana.example.com`);
- `GRAFANA_APIKEY`: the generated API key.
To make it reusable, put the script in a repository of shared scripts:
```bash
#!/bin/bash
set -e

if [ $# -lt 2 ]; then
    echo "Usage: $0 <text> <tag>"
    exit 1
fi

readonly text="$1"
readonly tag="$2"
readonly time="$(date +%s)000"

cat > ./payload.json <<EOF
{
  "text": "$text",
  "tags": ["$tag"],
  "time": $time,
  "timeEnd": $time
}
EOF

curl -X POST "$GRAFANA_URL/api/annotations" \
    -H "Authorization: Bearer $GRAFANA_APIKEY" \
    -H "content-type: application/json" \
    -d @./payload.json
```
Now you can add a call to it, with the necessary parameters, to the CI configuration:
```yaml
deploy:
  stage: deploy
  script:
    - $SCRIPTS_DIR/deploy.sh production
    - $SCRIPTS_DIR/grafana-annotation.sh "$VERSION deployed to production" deploy-production
```
These calls could be moved into the `deploy.sh` script itself to keep the CI configuration simpler.
Bonus: quick tips
GitLab has excellent documentation for all the keywords that can be used to configure CI. I do not want to duplicate its contents here, but I will point out a few useful cases. Click the headings to read the documentation on each topic.
Advanced use of only/except
By matching CI variables against patterns, you can define special builds for certain branches. This can help, for example, to detect pushes of urgent hotfixes, but do not abuse it:
```yaml
only:
  refs:
    - branches
  variables:
    - $CI_COMMIT_REF_NAME =~ /^hotfix/
```
GitLab has many predefined variables available in every CI job; use them.
YAML anchors
Use them to avoid duplication.
From version 11.3, you can also use the extends keyword:
```yaml
.common_before_script: &common_before_script
  before_script:
    - ...
    - ...

deploy:
  <<: *common_before_script
```
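For comparison, here is a minimal sketch of the same kind of reuse expressed with `extends` (the job names and commands are illustrative):

```yaml
.common:
  before_script:
    - echo "preparing environment"

deploy:
  extends: .common
  script:
    - ./deploy.sh production
```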
Artifact dependencies
By default, all artifacts collected earlier in the pipeline are passed to every subsequent job. If you explicitly list the jobs whose artifacts a job depends on, you can save time and disk space:
```yaml
dependencies:
  - build
```
Or, conversely, skip them all if none are required:
```yaml
dependencies: []
```
Git strategy
Skip cloning the repository if the job does not use its files:
```yaml
variables:
  GIT_STRATEGY: none
```
That's all!
Thank you for reading! For feedback and questions, contact me on Twitter or Reddit.
More tips on GitLab can be found in previous publications:
P.S. from the translator
Read also in our blog:
- " GitLab CI for continuous integration and delivery in production. Part 1: our pipeline ";
- " GitLab CI for continuous integration and delivery in production. Part 2: overcoming difficulties ";
- " Building projects with GitLab CI: one .gitlab-ci.yml for hundreds of applications ";
- “ Build and heat applications in Kubernetes using dapp and GitLab CI ”;
- “ Best CI / CD practices with Kubernetes and GitLab (review and video of the report) ”.