Learn Docker, Part 3: Dockerfiles
- Translation
In this translation of the third part of a series on Docker, we continue to draw inspiration from baking, this time from bagels. Today's main topic is working with Dockerfiles. We will go through the instructions used in these files.
→ Part 1: Basics
→ Part 2: Terms and Concepts
→ Part 3: Dockerfile Files

Bagels are the instructions in a Dockerfile.
Docker Images
Recall that a Docker container is a Docker image brought to life. It is a self-contained, minimal operating system that holds only the essentials plus the application code.
Docker images are the result of the build process, and Docker containers are running images. Dockerfiles are at the heart of Docker: they tell Docker how to build the images from which containers are created.
Each Docker image corresponds to a file called Dockerfile. The name is written exactly that way, with no extension. When you run the docker build command to create a new image, Docker assumes that the Dockerfile is in the current working directory. If the file is located elsewhere, its location can be given with the -f flag.
Containers, as we saw in the first part of this series, consist of layers. Every layer except the topmost one is read-only. The Dockerfile tells Docker which layers to add to the image, and in what order.
Each layer is, in essence, just a file describing how the image changed relative to its state after the previous layer was added. In Unix, by the way, almost everything is a file.
The base image is the initial layer (or layers) of the image being built. The base image is also called the parent image.

The base image is where a Docker image begins.
When an image is pulled from a remote repository to a local machine, only the layers that are not already on that machine are actually downloaded. Docker saves space and time by reusing existing layers.
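As a rough illustration of the docker build behavior described above, here is a minimal sketch. The image tag my_image and the path docker/Dockerfile.dev are made up for the example; the build commands are shown as comments so that the block stays a valid Dockerfile.

```dockerfile
# A minimal Dockerfile used only to illustrate the build commands below.
FROM ubuntu:18.04

# Build from the directory that contains this file
# (my_image is a made-up tag):
#   docker build -t my_image .
#
# Build when the file lives elsewhere
# (docker/Dockerfile.dev is a hypothetical path):
#   docker build -f docker/Dockerfile.dev -t my_image .
```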
Dockerfiles
A Dockerfile contains the instructions for building an image. Each line of the file begins with an instruction, written in capital letters, followed by its arguments. During the build, instructions are processed from top to bottom. Here is what it looks like:
FROM ubuntu:18.04
COPY . /app
Only the FROM, RUN, COPY, and ADD instructions create layers in the final image. The other instructions configure something, describe metadata, or tell Docker that something needs to be done while the container is running, such as opening a port or executing a command.
Here we assume that an image based on a Unix-like OS is being used. You can use Windows-based images too, but that is less common and such images are harder to work with. So if you have the choice, use Unix.
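To make the distinction between layer-creating and configuration-only instructions more tangible, here is a small sketch. The label text and the my_image tag are placeholders chosen for the example. Note that Dockerfile comments must sit on their own lines.

```dockerfile
# Base image: contributes the initial layer(s).
FROM ubuntu:18.04

# Metadata only: no filesystem layer is added.
LABEL description="layer demo"

# Creates a layer containing the changes made by the command.
RUN mkdir /data

# Creates a layer containing the copied files.
COPY . /app

# Configuration for the running container: no filesystem layer is added.
CMD ["ls", "/app"]

# One way to inspect the result (my_image is a made-up tag):
#   docker build -t my_image .
#   docker history my_image
```

In the docker history output, the metadata and configuration steps should show up with a size of zero, while the RUN and COPY steps carry the actual file changes.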
To begin with, here is a list of Dockerfile instructions with brief comments.
A dozen Dockerfile instructions
FROM - sets the base (parent) image.
LABEL - describes metadata, for example who created and maintains the image.
ENV - sets persistent environment variables.
RUN - executes a command and creates an image layer. Used to install packages into the image.
COPY - copies files and folders into the image.
ADD - copies files and folders into the image; can unpack local .tar files.
CMD - describes a command with arguments to be executed when the container starts. The arguments can be overridden when the container is started. Only one CMD instruction per file takes effect.
WORKDIR - sets the working directory for the instructions that follow.
ARG - defines variables that can be passed to Docker at image build time.
ENTRYPOINT - provides a command with arguments to run when the container starts. The arguments are not overridden.
EXPOSE - indicates that a port needs to be opened.
VOLUME - creates a mount point for working with persistent storage.
Now let's talk about these instructions.
Instructions and examples of their use
▍Simple Dockerfile
A Dockerfile can be extremely short and simple. For example:
FROM ubuntu:18.04
▍FROM Instruction
A Dockerfile must begin with a FROM instruction, or with an ARG instruction followed by a FROM instruction.
The FROM keyword tells Docker to build the image on top of the base image that matches the given name and tag. The base image is also called the parent image.
In this example, the base image is stored in the ubuntu repository. ubuntu is the name of the official Docker repository that provides a basic version of the popular Ubuntu Linux distribution.
Note that the Dockerfile in question includes the tag 18.04, which specifies exactly which base image we need. That is the image that will be pulled when ours is built. If the tag is omitted, Docker falls back to the latest tag, which usually points to the most recent image in the repository. To make the intent explicit, the author of a Dockerfile is advised to state exactly which image is needed.
When the Dockerfile above is used to build an image on a local machine for the first time, Docker downloads the layers defined by the ubuntu image. They can be pictured as stacked on top of one another. Each subsequent layer is a file describing how the image differs from its state after the previous layer was added.
When a container is created, a writable layer is added on top of all the others. The data in the remaining layers is read-only.

The structure of the container (taken from the documentation)
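Returning to FROM for a moment: the two rules mentioned above (an optional ARG before FROM, and an explicit tag) could be combined roughly like this. UBUNTU_TAG and my_image are names invented for this sketch.

```dockerfile
# ARG is the only instruction allowed before FROM.
# UBUNTU_TAG is a hypothetical variable name chosen for this sketch.
ARG UBUNTU_TAG=18.04

# Pin the base image to an explicit tag instead of relying on the default.
FROM ubuntu:${UBUNTU_TAG}

# Building with a different tag (my_image is a made-up name):
#   docker build --build-arg UBUNTU_TAG=20.04 -t my_image .
```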
For the sake of efficiency, Docker uses a copy-on-write strategy. If a file exists in a lower layer of the image and another layer only needs to read it, Docker uses the existing file. Nothing needs to be downloaded.
When the container is running and a file from a lower layer needs to be modified, that file is first copied into the topmost, writable layer and changed there. To learn more about the copy-on-write strategy, take a look at this material from the Docker documentation.
We will continue discussing Dockerfile instructions with an example of a more complex file.
▍More complicated Dockerfile
Although the Dockerfile we just looked at is neat and easy to understand, it is too simple: it uses only one instruction. It also has no instructions that run when the container starts. Take a look at another file, which builds a small image and does define commands to be run when the container starts.
FROM python:3.7.2-alpine3.8
LABEL maintainer="jeffmshale@gmail.com"
ENV ADMIN="jeff"
RUN apk update && apk upgrade && apk add bash
COPY . ./app
ADD https://raw.githubusercontent.com/discdiver/pachy-vid/master/sample_vids/vid1.mp4 \
/my_app_directory
RUN ["mkdir", "/a_directory"]
CMD ["python", "./my_script.py"]
At first glance this file may seem rather complicated, so let's go through it.
The base of this image is the official Python image with the tag 3.7.2-alpine3.8. Looking at its source, you can see that this base image contains Linux, Python, and little else. Alpine images are very popular in the Docker world because they are small, fast, and secure. However, Alpine images lack many of the features typical of ordinary operating systems. So to build something useful on top of such an image, its creator has to install the necessary packages.
▍LABEL Instruction

Labels
The LABEL instruction lets you add metadata to an image. In the file under consideration, it holds the contact details of the image's creator. Declaring labels does not slow down the build and does not increase the image size. They simply carry useful information about the Docker image, so including them is recommended. Details about working with metadata in a Dockerfile can be found here.
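A short sketch of what several labels might look like in practice. All keys and values here are illustrative placeholders, and my_image is a made-up tag.

```dockerfile
FROM python:3.7.2-alpine3.8

# All keys and values below are illustrative placeholders.
LABEL maintainer="someone@example.com"
LABEL version="1.0" \
      description="Example image used to demonstrate LABEL"

# Reading the labels back after a build (my_image is a made-up tag):
#   docker inspect --format '{{json .Config.Labels}}' my_image
```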
▍ENV Instruction

Environment
The ENV instruction sets persistent environment variables that will be available in the container while it runs. In the previous example, once the container is created, the ADMIN variable can be used.
ENV is well suited for defining constants. If you use a value several times in a Dockerfile, say in commands that run in the container, and suspect you may have to change it later, it makes sense to store it in such a constant.
Note that a Dockerfile often offers several ways to accomplish the same task. Which one to choose depends on following accepted Docker practices, keeping the solution transparent, and keeping it fast. For example, RUN, CMD, and ENTRYPOINT serve different purposes, but all of them are used to execute commands.
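Coming back to ENV as a place for constants, here is a minimal sketch. APP_HOME is a hypothetical variable introduced only for this example; ADMIN comes from the Dockerfile discussed above.

```dockerfile
FROM python:3.7.2-alpine3.8

# APP_HOME is a hypothetical constant introduced for this sketch.
ENV ADMIN="jeff" \
    APP_HOME="/usr/src/app"

# The constant can be reused by later instructions...
WORKDIR ${APP_HOME}
COPY . ${APP_HOME}

# ...and is also visible to processes in the running container.
# Shell form is used so that the shell expands the variables at start-up.
CMD echo "admin is $ADMIN, app lives in $APP_HOME"
```

If the path ever changes, only the ENV line needs to be edited.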

▍RUN Instruction
The RUN instruction creates a layer during the image build. After it executes, a new layer is added to the image and its state is committed. RUN is often used to install additional packages into images. In the previous example, the instruction RUN apk update && apk upgrade tells Docker to update the packages from the base image. These two commands are followed by && apk add bash, which installs bash into the image.
apk is short for Alpine Linux package manager. If you are using a base image of some other Linux distribution, such as Ubuntu, you may need a command like RUN apt-get to install packages. We will talk about other ways to install packages later.
RUN and similar instructions, such as CMD and ENTRYPOINT, can be used in either exec form or shell form. Exec form uses a syntax that resembles a JSON array, for example: RUN ["my_executable", "my_first_param1", "my_second_param2"].
In the previous example we used RUN in shell form: RUN apk update && apk upgrade && apk add bash.
Later in the same Dockerfile, RUN is used in exec form, as RUN ["mkdir", "/a_directory"], to create a directory. When using this form, remember to wrap strings in double quotes, as the JSON format requires.
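The difference between the two forms of RUN can be sketched as follows. The last instruction, including the home.txt file name, is an addition made purely for illustration.

```dockerfile
FROM python:3.7.2-alpine3.8

# Shell form: the command line is run through /bin/sh -c,
# so operators such as && and variables such as $HOME are interpreted.
RUN apk update && apk upgrade && apk add bash

# Exec form: a JSON array that is executed directly, without a shell.
# Strings must use double quotes.
RUN ["mkdir", "/a_directory"]

# Exec form does no shell processing, so a shell has to be named
# explicitly when one is needed (home.txt is an arbitrary file name).
RUN ["sh", "-c", "echo \"home is $HOME\" > /a_directory/home.txt"]
```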

▍COPY Instruction
In our file the COPY instruction looks like this: COPY . ./app. It tells Docker to take the files and folders from the local build context and add them to the app directory under the image's current working directory. If the target directory does not exist, the instruction creates it.
▍ADD Instruction
The ADD instruction handles the same tasks as COPY, but it has a couple of additional use cases. With ADD you can add files downloaded from remote sources to the image, and unpack local .tar files.
In this example, ADD was used to copy a file available at a URL into the my_app_directory directory of the image. Note, however, that the Docker documentation advises against fetching files from URLs this way, because they cannot be deleted afterwards and so increase the size of the image.
The documentation also suggests using COPY instead of ADD wherever possible, to keep Dockerfiles clearer. I think the Docker team should merge ADD and COPY into a single instruction, so that those who build images do not have to remember so many instructions.
Note that the ADD instruction contains a line continuation character, \. It is used to improve the readability of long commands by splitting them across several lines.
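The choice between COPY and ADD described above might look roughly like this. The archive name vendor.tar.gz and the example.com URL are placeholders, and the download step is just one possible alternative to ADD with a URL, not the only one.

```dockerfile
FROM python:3.7.2-alpine3.8

# Plain copy from the build context: COPY is the clearer choice.
COPY . ./app

# ADD unpacks a local tar archive into the target directory.
# vendor.tar.gz is a hypothetical archive in the build context.
ADD vendor.tar.gz /opt/vendor/

# One way to fetch a remote file without ADD: download, unpack, and delete
# it in the same RUN so the archive does not stay behind in a layer.
# The URL and the tool.tar.gz name are placeholders.
RUN wget -O /tmp/tool.tar.gz https://example.com/tool.tar.gz \
    && tar -xzf /tmp/tool.tar.gz -C /opt \
    && rm /tmp/tool.tar.gz
```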

▍CMD Instruction
The CMD instruction gives Docker the command to execute when the container starts. The results of this command are not added to the image at build time. In our example, it runs the my_script.py script when the container starts.
Here is what else you need to know about CMD:
- Only one CMD instruction can take effect in a Dockerfile. If the file contains several, all but the last one are ignored.
- CMD can be written in exec form. If it does not name an executable, an ENTRYPOINT instruction must be present in the file, and both instructions should then be in JSON format.
- Command-line arguments passed to docker run override the arguments provided by CMD in the Dockerfile.
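The CMD rules listed above can be tried out with a sketch like the following. It assumes that my_script.py sits at the root of the build context; my_image is a made-up tag.

```dockerfile
FROM python:3.7.2-alpine3.8
COPY . ./app

# Ignored: only the last CMD in the file takes effect.
CMD ["python", "--version"]

# Default command for the container.
CMD ["python", "./app/my_script.py"]

# Running with the default, then overriding it (my_image is a made-up tag):
#   docker run my_image
#   docker run my_image python --version
```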
▍More complicated Dockerfile
Consider another Dockerfile that uses some new instructions.
FROM python:3.7.2-alpine3.8
LABEL maintainer="jeffmshale@gmail.com"
# Install dependencies
RUN apk add --update git
# Set the current working directory
WORKDIR /usr/src/my_app_directory
# Copy code from the local context into the image's working directory
COPY . .
# Set a default value for the variable
ARG my_var=my_default_value
# Configure the command to run in the container when it starts
ENTRYPOINT ["python", "./app/my_script.py", "my_var"]
# Open ports
EXPOSE 8000
# Create a volume for storing data
VOLUME /my_volume
In this example, among other things, you can see comments, which begin with the # symbol.
One of the main things a Dockerfile does is install packages. As already mentioned, there are various ways to install packages with RUN.
Packages in an Alpine-based image can be installed with apk. As we have seen, a command of the form RUN apk update && apk upgrade && apk add bash is used for this.
In addition, Python packages can be installed into an image using pip, wheel, and conda. For other programming languages, other package managers can be used to prepare the corresponding images.
For an installation to work, the underlying layers must provide a suitable package manager to the layer in which the packages are installed. So if you run into problems installing packages, make sure the package manager is installed before you try to use it.
For example, RUN can be used in a Dockerfile to install a list of packages with pip. If you do this, combine all the commands into one instruction and split it across lines with the \ character. This keeps the file neat and adds fewer layers to the image than several RUN instructions would.
You can also install multiple packages another way: list them in a file and pass that file to the package manager with RUN. Such files are usually named requirements.txt.
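Both ways of installing Python packages mentioned above can be sketched like this. The package names are arbitrary examples, and the second variant assumes a requirements.txt file exists in the build context.

```dockerfile
FROM python:3.7.2-alpine3.8

# Several packages in a single RUN, split across lines with \.
# The package names are arbitrary examples.
RUN pip install --no-cache-dir \
    requests \
    flask

# Alternative: list the packages in a file and hand it to pip.
# Assumes a requirements.txt exists in the build context.
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
```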

▍WORKDIR Instruction

Working directories
The WORKDIR instruction changes the working directory of the container. The COPY, ADD, RUN, CMD, and ENTRYPOINT instructions that follow WORKDIR operate from this directory. Here are some points to keep in mind:
- It is better to give WORKDIR absolute paths than to move around the file system with cd commands in the Dockerfile.
- WORKDIR automatically creates the directory if it does not exist.
- You can use several WORKDIR instructions. If they are given relative paths, each one changes the current working directory relative to the previous one.
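The points above can be seen in a small sketch. The logs subdirectory is a name chosen only for illustration.

```dockerfile
FROM python:3.7.2-alpine3.8

# Created automatically if it does not exist.
WORKDIR /usr/src/my_app_directory

# Relative paths in later instructions resolve against the working
# directory, so the files land in /usr/src/my_app_directory.
COPY . .

# A relative WORKDIR is resolved against the previous one, so the
# working directory is now /usr/src/my_app_directory/logs.
WORKDIR logs

# Prints /usr/src/my_app_directory/logs during the build.
RUN pwd
```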
▍ARG Instruction
The ARG instruction defines a variable whose value can be passed from the command line to the image at build time. A default value for the variable can be given in the Dockerfile, for example: ARG my_var=my_default_value.
Unlike ENV variables, ARG variables are not available while the container runs. However, ARG variables can be used to set default values for ENV variables from the command line during the build. ENV variables, in turn, will be available in the container at run time. Details about this way of working with variables can be found here.
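Passing a build-time ARG value on to a run-time ENV variable might look like this. MY_VAR and my_image are names chosen for the sketch.

```dockerfile
FROM python:3.7.2-alpine3.8

# Build-time variable with a default value.
ARG my_var=my_default_value

# Copy the build-time value into an environment variable so that it
# survives into the running container. MY_VAR is a name chosen
# for this sketch.
ENV MY_VAR=${my_var}

# Shell form, so the shell expands $MY_VAR when the container starts.
CMD echo "MY_VAR is $MY_VAR"

# Overriding the default at build time (my_image is a made-up tag):
#   docker build --build-arg my_var=custom_value -t my_image .
```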

▍ENTRYPOINT Instruction

An entry point to a certain place.
The ENTRYPOINT instruction specifies a command with arguments to be executed when the container starts. It is similar to CMD, but the parameters given in ENTRYPOINT are not overwritten when the container is started with command-line parameters. Instead, command-line arguments passed in a call of the form docker run my_image_name are appended to the arguments given by ENTRYPOINT. For example, after running docker run my_image bash, the argument bash is added to the end of the argument list specified in ENTRYPOINT. When writing a Dockerfile, do not forget to include a CMD or ENTRYPOINT instruction.
The Docker documentation offers several recommendations on whether to choose CMD or ENTRYPOINT for running commands at container start:
- If the same command must run every time the container starts, use ENTRYPOINT.
- If the container will be used as an application, use ENTRYPOINT.
- If you know that arguments will need to be passed at container start that can override the ones in the Dockerfile, use CMD.
In our example, the instruction ENTRYPOINT ["python", "my_script.py", "my_var"] makes the container, when launched, run the Python script my_script.py with the argument my_var. The script can then read this argument using argparse. Note that the Dockerfile assigns my_var a default value with ARG before it is used. Keep in mind, however, that ARG values exist only at build time and that exec form performs no variable substitution, so as written the script receives the literal string my_var. To hand an actual value to the script, it has to be passed some other way, for example through an ENV variable or as an argument to docker run.
The Docker documentation recommends the exec form of ENTRYPOINT: ENTRYPOINT ["executable", "param1", "param2"].
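One common way to combine the two instructions is to keep the fixed command in ENTRYPOINT and a default argument in CMD. This is a sketch under the assumption that the script lives at ./app/my_script.py as in the example above; my_image and custom_value are made up.

```dockerfile
FROM python:3.7.2-alpine3.8
WORKDIR /usr/src/my_app_directory
COPY . .

# Fixed part of the command: always runs, in exec form.
ENTRYPOINT ["python", "./app/my_script.py"]

# Default argument; replaced by whatever follows the image name
# in docker run.
CMD ["my_default_value"]

# my_image is a made-up tag:
#   docker run my_image               -> python ./app/my_script.py my_default_value
#   docker run my_image custom_value  -> python ./app/my_script.py custom_value
```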

▍EXPOSE Instruction
The EXPOSE instruction indicates which ports are intended to be opened so that a running container can be reached through them. The instruction itself does not open the ports. Rather, it serves as documentation for the image, a means of communication between whoever builds the image and whoever runs the container.
To actually open a port (or ports) and set up port forwarding, run docker run with the -p flag. If you use the flag in the form -P (with a capital P), all ports listed in EXPOSE instructions are opened.
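The interplay between EXPOSE and the docker run flags could be sketched like this. Python's built-in HTTP server stands in for a real application, and the host port 8080 and the my_image tag are arbitrary choices.

```dockerfile
FROM python:3.7.2-alpine3.8

# Documents that the application is expected to listen on port 8000.
EXPOSE 8000

# Python's built-in HTTP server stands in for a real application here.
CMD ["python", "-m", "http.server", "8000"]

# Publishing the port when the container starts (my_image is a made-up tag):
#   docker run -p 8080:8000 my_image
#     maps host port 8080 to container port 8000
#   docker run -P my_image
#     publishes every port listed in EXPOSE on a host port chosen by Docker
```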

▍VOLUME Instruction
The VOLUME instruction specifies a location that the container will use for persistent file storage and for working with those files. We will talk about this later.
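Until then, here is a very small sketch of the instruction in use. The volume name my_data and the tag my_image are invented for the example.

```dockerfile
FROM python:3.7.2-alpine3.8

# Declare a mount point for data that should outlive the container.
VOLUME /my_volume

# Attaching a named volume at run time
# (my_data and my_image are made-up names):
#   docker run -v my_data:/my_volume my_image
```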
Results
Now you know a dozen instructions used to build images with a Dockerfile. The list does not end there. In particular, we have not covered instructions such as USER, ONBUILD, STOPSIGNAL, SHELL, and HEALTHCHECK. Here is a quick reference to the Dockerfile instructions.
Dockerfiles are arguably a key component of the Docker ecosystem, and anyone who wants to feel confident in this environment needs to learn to work with them. We will return to them next time, when we discuss ways to reduce the size of images.
Dear readers! If you use Docker in practice, please tell us how you write your Dockerfiles.
