Building Docker images for CI/CD quickly and conveniently with dapp (overview and video)
This is the second publication based on my conference talks. The first was a general overview of Continuous Delivery practices with Docker. This one is based on a more applied talk, “Building Docker images quickly and conveniently”, given on November 8 at the HighLoad++ 2016 conference in the “DevOps and operations” track.

As before, if you can spare about an hour, we recommend watching the video in full (see the end of the article). Otherwise, the key points are summarized below in text form.
What do we want from Docker images?
Our requirements in the context of CI/CD processes (Continuous Integration, Continuous Delivery and Continuous Deployment) are as follows:
- Compact size. With CD, images have to be built very often (on every commit), which can quickly lead to a need for huge storage. We set a target of <200 MB per image (the base Ubuntu image takes 130 MB).
- Incremental growth. For a commit of 10 KB, we want the image to grow by a comparable amount rather than by the full size of the image.
- Fast image builds: within 10 seconds.
- The ability to use the built images as the final product in different environments (from test to production).
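To see why the first two requirements matter, here is a back-of-the-envelope estimate. It is only a sketch: the 200 MB and 10 KB figures come from the list above, while the count of 1000 commits and the function name `registry_growth` are assumptions made for illustration.

```python
# Rough storage estimate. The commit count is an illustrative assumption,
# not a measurement from a real project.
MB = 1024 * 1024
KB = 1024


def registry_growth(image_bytes, commit_bytes, commits, incremental):
    """Return the total bytes stored after `commits` builds.

    incremental=False: every build stores a full copy of the image.
    incremental=True:  the image is stored once and each build adds only its delta.
    """
    if incremental:
        return image_bytes + commit_bytes * commits
    return (image_bytes + commit_bytes) * commits


full_copies = registry_growth(200 * MB, 10 * KB, commits=1000, incremental=False)
deltas_only = registry_growth(200 * MB, 10 * KB, commits=1000, incremental=True)
print(f"full copies: {full_copies / MB:.0f} MB, deltas only: {deltas_only / MB:.0f} MB")
```

Under these assumptions the difference is roughly three orders of magnitude, which is why the size of each increment matters more than the size of the base image.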

dapp instead of Dockerfile
The standard Docker mechanism, the Dockerfile, is not enough to meet these requirements. The authors of Docker have stated reasons for this, which the project treats as fundamental and quite logical principles, such as a focus on solving general (rather than specific) problems and a high level of portability.
Therefore, we wrote our own utility, dapp (written in Ruby and distributed under the MIT License). At this stage it can only build images; plans for its development include support for the full CI/CD cycle. In designing and implementing dapp, we prioritize ease of use and speed/efficiency.
Updated August 13, 2019: the dapp project has now been renamed werf, its code has been completely rewritten in Go, and the documentation has been greatly improved.
The configuration of images built with dapp is described in a Dappfile, following the principle One repository → One project → One Dappfile. The file format is currently a Ruby DSL, but we plan to switch to the simpler and more familiar YAML.
What features does dapp bring?
1. Stages and cache
dapp implements a four-stage pattern for building a Docker image:
- before_install: OS settings and the like, which (according to our analysis of dozens of different projects) change in <1% of commits;
- install: application dependencies, about 5% of commits;
- before_setup;
- setup: configs, about 2%.
The results of these stages are cached, which significantly speeds up rebuilding the image.
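One common way to implement this kind of stage cache is to chain a digest through the stages, so that a change in one stage invalidates it and everything after it while earlier stages are reused. The sketch below illustrates that idea only; it is not dapp's actual implementation, and the function names, the `ubuntu:16.04` base and the example commands are assumptions.

```python
import hashlib

# Stage names are taken from the list above; everything else is illustrative.
STAGES = ["before_install", "install", "before_setup", "setup"]


def stage_signatures(stage_inputs, base="ubuntu:16.04"):
    """Chain a digest through the stages.

    Each signature depends on the previous signature plus this stage's own
    inputs, so changing one stage changes it and every later stage.
    """
    signatures = {}
    previous = hashlib.sha256(base.encode()).hexdigest()
    for stage in STAGES:
        payload = previous + "\n" + stage_inputs.get(stage, "")
        previous = hashlib.sha256(payload.encode()).hexdigest()
        signatures[stage] = previous
    return signatures


def stages_to_rebuild(old, new):
    """Return the stages whose signature changed; the rest come from cache."""
    return [stage for stage in STAGES if old.get(stage) != new[stage]]


before = stage_signatures({"install": "bundle install", "setup": "cp app.conf /etc/app.conf"})
after = stage_signatures({"install": "bundle install", "setup": "cp app.conf /etc/app/app.conf"})
print(stages_to_rebuild(before, after))  # ['setup']
```

Because the rarely changing stages come first, most commits only invalidate the tail of the chain, which is what keeps rebuilds fast.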

2. External context
During the build, containers have access to a so-called “external context”: mounted directories that are used at build time but excluded from the final image. You can store data in these directories and reuse it in subsequent builds.
One example is the /var/lib/apt directory: its contents after apt-get update are also needed for subsequent builds, but not inside the image itself (it is auxiliary data).
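A minimal sketch of the idea, assuming the build step runs in a container with a host directory bind-mounted via `docker run -v`: data in a mounted volume is not included when the container is later committed to an image, so it persists between builds without ending up in the image. The helper name `build_stage_command` and the host path `/tmp/build-cache/apt` are hypothetical, and this is not necessarily how dapp wires it up internally.

```python
import shlex


def build_stage_command(image, script, build_mounts):
    """Assemble a `docker run` argument list for one build step.

    build_mounts maps a host cache directory to a path inside the build
    container. Data written to a mounted path stays on the host and is not
    included when the container is committed to an image.
    """
    command = ["docker", "run"]
    for host_dir, container_dir in sorted(build_mounts.items()):
        command += ["-v", f"{host_dir}:{container_dir}"]
    command += [image, "sh", "-c", script]
    return command


# Hypothetical host path; /var/lib/apt is the example from the text above.
command = build_stage_command(
    image="ubuntu:16.04",
    script="apt-get update",
    build_mounts={"/tmp/build-cache/apt": "/var/lib/apt"},
)
print(shlex.join(command))
```

The function only builds the argument list and does not call Docker, so it can be inspected or tested without a running daemon.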
3. Git
Support for changes from Git is likewise designed around optimization and flexibility. On the first build of an image, all of the application source code is added to it (git archive); on subsequent builds only the deltas are added, i.e. Git patches are applied. The contents of the patches are cached for better performance.

You can specify files/directories whose modification requires the install stage to be re-run.
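The two decisions described here can be sketched as follows. This is an illustration under assumptions rather than dapp's code: the function names, the commit ids and the watched patterns are hypothetical, and only the `git archive` and `git diff --binary` commands themselves are standard Git.

```python
from fnmatch import fnmatch


def plan_git_step(cached_commit, head_commit):
    """First build: export the full tree; later builds: only the diff."""
    if cached_commit is None:
        return ["git", "archive", "--format=tar", head_commit]
    # --binary keeps the patch applicable when binary files have changed.
    return ["git", "diff", "--binary", f"{cached_commit}..{head_commit}"]


def install_stage_needed(changed_paths, watched_patterns):
    """True if any changed path matches a pattern that triggers install."""
    return any(
        fnmatch(path, pattern)
        for path in changed_paths
        for pattern in watched_patterns
    )


# Hypothetical commit ids and patterns, for illustration only.
print(plan_git_step(None, "a1b2c3d"))
print(plan_git_step("a1b2c3d", "d4e5f6a"))
print(install_stage_needed(["app/models/user.rb"], ["Gemfile*", "package.json"]))  # False
print(install_stage_needed(["Gemfile.lock"], ["Gemfile*", "package.json"]))        # True
```

Note that `fnmatch` lets `*` match across `/`, so a pattern such as `config/*` also covers nested files.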

4. Artifacts
Sometimes building a project (“compiling” some of its components) requires large third-party tools that the final application does not use (so there is no need to keep them in the image). This applies not only to compiling sources in languages such as C, but also, for example, to generating assets with Node.js. This problem is solved with so-called “artifacts”. When the dapp build command is executed, an additional Docker image is created for the artifact (i.e. with the third-party tools required for the build). The artifact's files are then added from the additional image to the real (final) image via the external context. Caching is supported for artifacts as well.
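The essence of the artifact approach is that only the build outputs cross over into the final image, while the toolchain stays behind. The sketch below models the two images as plain directories to show that selection step. `export_artifact` and the file names are hypothetical, and real images would of course involve Docker rather than temporary directories.

```python
import shutil
import tempfile
from pathlib import Path


def export_artifact(builder_root, artifact_paths, final_root):
    """Copy only the listed paths from the builder tree into the final tree."""
    for relative in artifact_paths:
        source = Path(builder_root) / relative
        target = Path(final_root) / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        if source.is_dir():
            shutil.copytree(source, target)
        else:
            shutil.copy2(source, target)


with tempfile.TemporaryDirectory() as builder, tempfile.TemporaryDirectory() as final:
    # The "builder image": a build-time toolchain plus the generated assets.
    (Path(builder) / "node_modules").mkdir()
    (Path(builder) / "node_modules" / "toolchain.js").write_text("// build-time only")
    (Path(builder) / "public" / "assets").mkdir(parents=True)
    (Path(builder) / "public" / "assets" / "app.js").write_text("console.log('app')")

    export_artifact(builder, ["public/assets"], final)
    final_files = sorted(
        path.relative_to(final).as_posix()
        for path in Path(final).rglob("*")
        if path.is_file()
    )

print(final_files)  # ['public/assets/app.js']
```

The toolchain under `node_modules` never reaches the final tree, which is the property that keeps the final image small.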
5. Chef support
Modularity in building images brings significant benefits, but implementing it in shell (Bash) is a thankless task. It is, however, done very well in configuration management systems: Chef → Berkshelf, Puppet → Librarian... We use Chef, so we added support for it to dapp. This support means that recipes can be executed inside the Docker image being built.
Technically, a Chef cookbook is placed in a special directory inside the Git repository (.dapp_chef). When the dapp build command is executed, everything required is gathered into a directory that is mounted inside the Docker container. In addition, a fully installed Chef (chefdk) is mounted into the container. Chef then runs and configures the container according to the cookbook. The resulting image is configured according to the recipes but contains neither chefdk nor the cookbook.

6. Multiple images
The Dappfile format lets you describe multiple images in a single file.
dapp as open source
We have been using and developing dapp for almost 2 years and really want to turn it into a useful Open Source solution that will help you set up your CI/CD processes and solve related tasks. As a reminder, the source code is available on GitHub, where issues and pull requests are also welcome. Documentation is available here.
By the way, we are hiring!
For dapp, we are looking for a unique enthusiast who dreams of becoming a technology evangelist. If you have a genuine interest in such tools, experience in DevOps and project management, and can write solid technical texts, do not put it off: write to us at info@flant.ru.
Video and slides
The video of the talk (about an hour) is published on YouTube (the link starts at the 13th minute, after the technical problems with the microphone had stopped and the introductory part, which repeats the general talk on CI, had finished).
Slides from the talk:
P.S.
Other related talks on our blog:
- “Werf is our CI/CD tool in Kubernetes (review and video report)” (Dmitry Stolyarov; May 27, 2019 at DevOpsConf);
- “Databases and Kubernetes” (Dmitry Stolyarov; November 8, 2018 at HighLoad++);
- “Best CI/CD practices with Kubernetes and GitLab” (Dmitry Stolyarov; November 7, 2017 at HighLoad++);
- “Our experience with Kubernetes in small projects” (Dmitry Stolyarov; June 6, 2017 at RootConf).