RHEL 8 Beta Workshop: Building Active Web Applications
RHEL 8 Beta offers developers many new features, and listing them all would take pages. New things are best learned in practice, though, so this workshop walks through building an application infrastructure on Red Hat Enterprise Linux 8 Beta.
We take Python with Django and PostgreSQL, a fairly common stack for building applications, and configure RHEL 8 Beta to work with it. Then we add a couple more (unclassified) ingredients.
The test environment will change over time, because it is interesting to explore automation, work with containers and try multi-server environments. When starting a new project, it helps to build a small, simple prototype by hand first: that way you can see exactly what needs to happen and how the pieces interact, and then move on to automation and more complex configurations. Today's story is about building such a prototype.
Let's start by deploying a RHEL 8 Beta virtual machine. You can install the virtual machine from scratch or use the KVM guest image available with a Beta subscription. If you use the guest image, you will need to set up a virtual CD containing the metadata and user data for cloud-init. Nothing special is required of the disk layout or the installed packages; any configuration will do.
Let's look at the whole process in more detail.
Django Installation
The latest version of Django requires a virtual environment (virtualenv) with Python 3.5 or later. The Beta release notes say that Python 3.6 is available. Let's check whether this is true:
[cloud-user@8beta1 ~]$ python
-bash: python: command not found
[cloud-user@8beta1 ~]$ python3
-bash: python3: command not found
Red Hat actively uses Python as a system toolkit in RHEL, so why do we get this result?
The fact is that many Python developers are still weighing the move from Python 2 to Python 3, while Python 3 itself is under active development and new versions keep appearing. So, to meet the need for stable system tools while still giving users access to various newer versions of Python, the system Python was moved into a separate package, and both Python 2.7 and 3.6 were made installable. More about the changes and the reasons behind them can be found in Langdon White's blog post.
So, to get a working Python, you only need to install two packages; python3-pip is pulled in as a dependency.
sudo yum install python36 python3-virtualenv
Why not use direct module calls, as Langdon suggests, and skip installing pip3? With the upcoming automation in mind: Ansible will require pip to be installed, because its pip module does not support virtual environments (virtualenvs) with a custom pip executable.
With a working python3 interpreter at your disposal, you can continue with the Django installation and get a working system together with our other components. Many ways of doing this are described online. One is presented here, but you are free to use your own process.
We will install the default versions of PostgreSQL and Nginx available in RHEL 8 using Yum.
sudo yum install nginx postgresql-server
Working with PostgreSQL requires psycopg2, but it only needs to be available inside the virtualenv, so we will install it with pip3 together with Django and Gunicorn. First, though, we need to set up the virtualenv.
There is always plenty of debate about the right place to install Django projects, but when in doubt you can turn to the Filesystem Hierarchy Standard (FHS). According to the FHS, /srv holds site-specific data served by the system, for example data and scripts for web servers, data offered by FTP servers, and version control system repositories (/srv appeared in FHS-2.3 in 2004).
That is exactly our case, so we put everything we need under /srv, in a directory owned by our application user (cloud-user).
sudo mkdir /srv/djangoapp
sudo chown cloud-user:cloud-user /srv/djangoapp
cd /srv/djangoapp
virtualenv django
source django/bin/activate
pip3 install django gunicorn psycopg2
django-admin startproject djangoapp /srv/djangoapp
Setting up PostgreSQL for Django is straightforward: create a database, create a user, configure permissions. One thing to keep in mind when installing PostgreSQL for the first time is the postgresql-setup script, which ships with the postgresql-server package. It helps with basic database cluster administration tasks, such as initializing a cluster or upgrading it. To set up a new PostgreSQL instance on a RHEL system, run:
sudo /usr/bin/postgresql-setup --initdb
After that, you can start PostgreSQL with systemd, create a database and set up the Django project. Remember to restart PostgreSQL after changing the client authentication configuration file (usually pg_hba.conf) to enable password authentication for the application user. If you run into connection problems, make sure that both the IPv4 and the IPv6 entries in pg_hba.conf have been changed.
sudo systemctl enable --now postgresql
sudo -u postgres psql
postgres=# create database djangoapp;
postgres=# create user djangouser with password 'qwer4321';
postgres=# alter role djangouser set client_encoding to 'utf8';
postgres=# alter role djangouser set default_transaction_isolation to 'read committed';
postgres=# alter role djangouser set timezone to 'utc';
postgres=# grant all on DATABASE djangoapp to djangouser;
postgres=# \q
In the file /var/lib/pgsql/data/pg_hba.conf:
# IPv4 local connections:
host all all 0.0.0.0/0 md5
# IPv6 local connections:
host all all ::1/128 md5
In the file /srv/djangoapp/djangoapp/settings.py:
# Database
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': '{{ db_name }}',
        'USER': '{{ db_user }}',
        'PASSWORD': '{{ db_password }}',
        'HOST': '{{ db_host }}',
    }
}
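The `{{ ... }}` placeholders above are meant to be filled in by a templating step. For a hand-built prototype, one possible alternative is to read the values from environment variables. The sketch below is only an illustration: the `DJANGO_DB_*` variable names and the `database_settings` helper are assumptions, not part of the original project, and the defaults simply reuse the database and user names created earlier. It uses the `django.db.backends.postgresql` engine name, which newer Django releases prefer over the `_psycopg2` alias.

```python
import os


def database_settings(environ=os.environ):
    """Build a Django DATABASES dict from environment variables.

    The DJANGO_DB_* names are hypothetical; pick whatever fits your setup.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": environ.get("DJANGO_DB_NAME", "djangoapp"),
            "USER": environ.get("DJANGO_DB_USER", "djangouser"),
            # No default on purpose: fail loudly if the password is missing.
            "PASSWORD": environ["DJANGO_DB_PASSWORD"],
            # A TCP address, so the "host" lines in pg_hba.conf apply.
            "HOST": environ.get("DJANGO_DB_HOST", "127.0.0.1"),
        }
    }


# In settings.py this would be used as:
#     DATABASES = database_settings()
```

This keeps the password out of the settings file, at the cost of having to export the variable for both the development server and, later, the Gunicorn service.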
Once settings.py and the database configuration are in place, you can start the development server to make sure everything works. It is also a good idea to create an admin user to test the connection to the database.
./manage.py runserver 0.0.0.0:8000
# createsuperuser needs the auth tables, so apply the migrations first
./manage.py migrate
./manage.py createsuperuser
WSGI? What?
The development server is useful for testing, but to run the application for real you need to configure a proper Web Server Gateway Interface (WSGI) server and a proxy in front of it. There are several common combinations, for example Apache HTTPD with uWSGI or Nginx with Gunicorn.
The job of the Web Server Gateway Interface is to pass requests from the web server to the Python web framework. WSGI is something of a legacy of the dark past when CGI was in use, and today it is the de facto standard regardless of the web server or Python framework involved. Despite its wide adoption, there are still many subtleties and many choices when working with these frameworks. Here we will make Gunicorn and Nginx talk to each other through a socket.
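To make the role of WSGI more concrete, here is a minimal sketch of a WSGI application, the kind of callable a server such as Gunicorn loads and calls for every request. The name `application` follows common convention; the greeting text and the `call_app` helper are invented for illustration and are not part of the Django project.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI app: receives the request environ, returns an iterable of bytes."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Hypothetical helper: invoke a WSGI app roughly the way a server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(response_headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call_app(application, "/admin/"))
```

Django generates an equivalent callable in the project's wsgi.py, which is what the `djangoapp.wsgi` argument to Gunicorn refers to.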
Since both components run on the same server, we will use a UNIX socket instead of a network socket. And since communication needs a socket anyway, let's go one step further and configure socket activation for Gunicorn through systemd.
Creating socket-activated services is simple enough. First, create a socket unit file with a ListenStream directive pointing to the path where the UNIX socket will be created, then a unit file for the service whose Requires directive points to the socket unit. After that, all that remains in the service unit file is to call Gunicorn from the virtual environment and bind the WSGI server to the UNIX socket and the Django application.
Here are example unit files that can be used as a starting point. First, the socket:
[Unit]
Description=Gunicorn WSGI socket
[Socket]
ListenStream=/run/gunicorn.sock
[Install]
WantedBy=sockets.target
Now you need to configure the Gunicorn daemon.
[Unit]
Description=Gunicorn daemon
Requires=gunicorn.socket
After=network.target
[Service]
User=cloud-user
Group=cloud-user
WorkingDirectory=/srv/djangoapp
ExecStart=/srv/djangoapp/django/bin/gunicorn \
          --access-logfile - \
          --workers 3 \
          --bind unix:/run/gunicorn.sock djangoapp.wsgi
[Install]
WantedBy=multi-user.target
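It may help to see what socket activation looks like from the service's side. With the two units above, systemd creates /run/gunicorn.sock itself and hands the already-listening socket to the service as an inherited file descriptor, announced through the LISTEN_PID and LISTEN_FDS environment variables. The sketch below is a simplified illustration of that convention as documented in sd_listen_fds(3); it is not Gunicorn's actual code, and the function names are made up.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per sd_listen_fds(3)


def listen_fds(environ=None, pid=None):
    """Return the file descriptors passed in by systemd, or [] if there are none."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # LISTEN_PID guards against descriptors meant for another process.
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # variables missing or malformed: not socket-activated
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def inherited_unix_sockets():
    """Wrap each inherited descriptor in a socket object (assumes UNIX stream sockets)."""
    return [
        socket.fromfd(fd, socket.AF_UNIX, socket.SOCK_STREAM)
        for fd in listen_fds()
    ]


if __name__ == "__main__":
    print("descriptors passed by systemd:", listen_fds())
```

Run outside systemd, the script prints an empty list; started through a socket unit, it would report descriptor 3.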
For Nginx, it is enough to create a proxy configuration file and, if you serve static content, set up a directory for it. On RHEL, Nginx configuration files live in /etc/nginx/conf.d. Copy the following example to /etc/nginx/conf.d/default.conf and start the service. Be sure to set server_name to match your host name.
server {
    listen 80;
    server_name 8beta1.example.com;

    location = /favicon.ico { access_log off; log_not_found off; }

    location /static/ {
        root /srv/djangoapp;
    }

    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_pass http://unix:/run/gunicorn.sock;
    }
}
Start the Gunicorn socket and Nginx with systemd, and you can begin testing.
Bad Gateway Error?
If you enter the address in a browser, you will most likely get a 502 Bad Gateway error. It can be caused by incorrect permissions on the UNIX socket or by more subtle access control issues in SELinux.
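Before digging into SELinux, the ordinary file permissions on the socket can be ruled out quickly. The following sketch is one way to do that; the `socket_info` helper is hypothetical, and the path in the usage example is the one from the socket unit above.

```python
import os
import stat


def socket_info(path):
    """Report the basic ownership and permission facts about a socket file."""
    st = os.stat(path)
    return {
        "is_socket": stat.S_ISSOCK(st.st_mode),
        "mode": stat.filemode(st.st_mode),  # ls-style string; sockets start with "s"
        "uid": st.st_uid,
        "gid": st.st_gid,
        # Connecting to a UNIX socket requires write permission on it.
        "writable_by_me": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    print(socket_info("/run/gunicorn.sock"))
```

If the mode and ownership look reasonable for the nginx user and the error persists, SELinux becomes the more likely suspect.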
In the nginx error log, you can see a line like this:
2018/12/18 15:38:03 [crit] 12734#0: *3 connect() to unix:/run/gunicorn.sock failed (13: Permission denied) while connecting to upstream, client: 192.168.122.1, server: 8beta1.example.com, request: "GET / HTTP/1.1", upstream: "http://unix:/run/gunicorn.sock:/", host: "8beta1.example.com"
If we test Gunicorn directly, we get an empty reply.
curl --unix-socket /run/gunicorn.sock 8beta1.example.com
Let's see why this happens. If you open the log, you will most likely see that the problem is related to SELinux. Because we are running a daemon for which no policy of its own has been created, it is labeled init_t. Let's test this theory in practice.
sudo setenforce 0
This may provoke criticism and tears of blood, but we are only debugging a prototype. We switch enforcement off just to confirm that this is the problem, and then put everything back in place.
Refresh the page in the browser or rerun the curl command, and you will see the Django test page.
Having confirmed that everything works and that there are no other permission problems, we re-enable SELinux.
sudo setenforce 1
We will not cover audit2allow or alert-based policy creation with sepolgen here: there is no real Django application yet, so there is no complete map of what Gunicorn may need to access and what should be denied. We therefore need to keep SELinux protecting the system while still letting the application start and leave messages in the audit log, so that a real policy can be built from them later.
Specifying Permissive Domains
Not everyone has heard of permissive domains in SELinux, but there is nothing new about them. Many people have even worked with them without realizing it: when a policy is built from audit messages, the domain it describes is effectively a permissive domain. Let's create the simplest possible permissive policy.
To create a dedicated permissive domain for Gunicorn, you need a policy, and you need to label the appropriate files. You also need the tools for building new policies.
sudo yum install selinux-policy-devel
The permissive domain mechanism is a great tool for identifying problems, especially with a custom application or with applications that ship without a ready-made policy. Here the permissive domain policy for Gunicorn is as simple as it gets: declare the main type (gunicorn_t), declare the type used to label a few executable files (gunicorn_exec_t), and then set up the transition so that the system labels the running processes correctly. The last line makes the domain permissive by default as soon as the policy is loaded.
gunicorn.te:
policy_module(gunicorn, 1.0)
type gunicorn_t;
type gunicorn_exec_t;
init_daemon_domain(gunicorn_t, gunicorn_exec_t)
permissive gunicorn_t;
You can compile this policy file and add it to the system.
make -f /usr/share/selinux/devel/Makefile
sudo semodule -i gunicorn.pp
sudo semanage permissive -a gunicorn_t
sudo semodule -l | grep permissive
Let's check if SELinux is blocking anything else besides what our unknown daemon is accessing.
sudo ausearch -m AVC
type=AVC msg=audit(1545315977.237:1273): avc: denied { write } for pid=19400 comm="nginx" name="gunicorn.sock" dev="tmpfs" ino=52977 scontext=system_u:system_r:httpd_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=sock_file permissive=0
SELinux is preventing Nginx from writing to the UNIX socket used by Gunicorn. Normally this is where the policy would be changed, but we have other tasks ahead. Instead, you can change the domain's settings, turning it from an enforcing domain into a permissive one. So let's make httpd_t permissive. This gives Nginx the access it needs and lets us carry on debugging.
sudo semanage permissive -a httpd_t
Now that SELinux protection has been preserved (you really should not leave a project with SELinux enforcement switched off) and the permissive domains are loaded, we need to find out exactly what must be labeled gunicorn_exec_t for everything to work as expected again. Let's try to access the website and look at the new access denial messages.
sudo ausearch -m AVC -c gunicorn
You will see many messages containing comm="gunicorn" that perform various actions on files in /srv/djangoapp, so this is clearly one of the commands that needs labeling.
In addition, a message like this appears:
type=AVC msg=audit(1545320700.070:1542): avc: denied { execute } for pid=20704 comm="(gunicorn)" name="python3.6" dev="vda3" ino=8515706 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:var_t:s0 tclass=file permissive=0
If you check the status of the gunicorn service or run ps, you will not find any running processes. It looks like gunicorn is trying to reach the Python interpreter in our virtualenv, possibly to start its workers. So let's label these two executables and see whether we can open our Django test page.
chcon -t gunicorn_exec_t /srv/djangoapp/django/bin/gunicorn /srv/djangoapp/django/bin/python3.6
You will need to restart the gunicorn service so that it picks up the new label. You can restart it right away, or stop the service and let the socket start it when you open the site in a browser. Use ps to make sure the processes got the correct labels.
ps -efZ | grep gunicorn
Remember to create a proper SELinux policy later!
If you look at the AVC messages now, the latest ones contain permissive=1 for everything related to the application and permissive=0 for the rest of the system. Once you understand what access the real application needs, you can quickly find the best way to resolve such denials. Until then, it is better for the system to stay protected while the Django project produces a clear, usable audit trail.
sudo ausearch -m AVC
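Reading long AVC records by eye gets tedious, so a small script can help confirm which domains are being denied and which are merely logged. The sketch below parses the key=value fields of an AVC line and counts records per source type and permissive flag. It is an illustration only: the function names are invented, the sample record is the Nginx denial shown earlier, and real audit logs may contain record layouts this simple parser does not handle.

```python
import re
import sys
from collections import Counter

# key=value pairs; a value is either "quoted" or a run of non-space characters
_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')
# the requested permissions sit between braces, e.g. "{ write }"
_PERMS = re.compile(r"\{\s*([^}]*?)\s*\}")

SAMPLE = (
    'type=AVC msg=audit(1545315977.237:1273): avc: denied { write } for '
    'pid=19400 comm="nginx" name="gunicorn.sock" dev="tmpfs" ino=52977 '
    'scontext=system_u:system_r:httpd_t:s0 '
    'tcontext=system_u:object_r:var_run_t:s0 tclass=sock_file permissive=0'
)


def _selinux_type(context):
    """Return the type, the third colon-separated part of user:role:type:level."""
    parts = context.split(":")
    return parts[2] if len(parts) > 2 else ""


def parse_avc(line):
    """Turn one AVC record into a dict of its fields."""
    fields = {key: value.strip('"') for key, value in _FIELD.findall(line)}
    perms = _PERMS.search(line)
    fields["perms"] = perms.group(1).split() if perms else []
    for key in ("scontext", "tcontext"):
        if key in fields:
            fields[key + "_type"] = _selinux_type(fields[key])
    return fields


def summarize(lines):
    """Count AVC records per (source type, permissive flag)."""
    counts = Counter()
    for line in lines:
        if "avc:" not in line:
            continue  # skip separators and other record types
        rec = parse_avc(line)
        counts[(rec.get("scontext_type", ""), rec.get("permissive", "?"))] += 1
    return counts


if __name__ == "__main__":
    for (domain, permissive), n in sorted(summarize(sys.stdin).items()):
        print(f"{domain:20} permissive={permissive} {n}")
```

Saved under a name of your choice, for example avc_summary.py (a hypothetical file name), it could be fed with `sudo ausearch -m AVC | python3 avc_summary.py`.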
It Worked!
We now have a working Django project with an Nginx frontend and the Gunicorn WSGI server. We configured Python 3 and PostgreSQL 10 from the RHEL 8 Beta repositories. From here you can go on to create (or simply deploy) Django applications, or explore other tools available in RHEL 8 Beta to automate the setup, improve performance, or even containerize this configuration.