Speeding up testing of Django projects
Testing Django applications has received a lot of attention in various articles, including on Habr. Almost every one of them devotes at least a couple of sentences to methods and hacks for speeding up tests, so it is not easy to say something fundamentally new.
The hosting control panel project that I have spent a significant part of my time at NetAngels developing has 120 tables, and about 500 objects are loaded from fixtures during testing. That is not a frightening amount, but creating all the tables, adding indexes and loading objects on every test run is quite annoying, especially when only one or two tests are run.
Below the cut, several previously proposed methods for speeding up testing are listed briefly, followed by a detailed description of another useful recipe which, I hope, has completely solved the problem of test execution speed for me.
In a typical application, most of the slowdown comes from working with the database, so it is not surprising that almost all proposals aim at optimizing this component. In turn, database work is slowed down mainly by the disk subsystem, so it is logical to assume that reducing this load will give a noticeable performance gain.
Unsafe transaction settings
Nobody cares if the power suddenly goes out during testing and the data is not completely written to disk. Database servers are usually configured to minimize the sad consequences of such events, at a cost in speed. If you use a separate MySQL or PostgreSQL server for tests, you can safely switch it to unsafe settings.
For MySQL, in particular, the following settings are suggested:
[mysqld]
default-table-type=innodb
transaction-isolation=READ-COMMITTED
innodb_flush_log_at_trx_commit = 0
skip-sync-frm=OFF
For PostgreSQL, it is recommended to add the following option to postgresql.conf for the same purpose:
fsync = off
Using a RAM disk
A more radical approach is to not use the disk at all and to keep the data in RAM instead. The general principle is that a new partition with the tmpfs file system is created and mounted in a separate directory, and all the database files are then created in this directory. An added bonus is that simply unmounting the partition deletes all the data.
SQLite for tests
In fact, the Django developers have already done a lot to make tests run as quickly as possible. In particular, if you use SQLite as the database engine, the test database is created in memory (the string ":memory:" is passed to the driver as the file name), and this method alone is enough to solve most speed problems.
People sometimes complain that Django's ORM does not hide the details of the database carefully enough, so code that worked with SQLite (that is, the tests passed) may suddenly break when rolled out to the production system running MySQL. This does happen occasionally, but as a rule it is the result of doing something "unusual", for example composing queries by hand with the QuerySet.extra method. Most likely, though, if you do such things, you already know the risks.
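One common way to keep MySQL for normal work while running tests on SQLite is to switch the engine only for test runs. The following is a minimal sketch and not part of the original setup: the helper name choose_databases, the database names and the check for "test" on the command line are all assumptions made for illustration.

```python
import sys


def choose_databases(argv):
    """Return a DATABASES dict: SQLite for test runs, MySQL otherwise.

    Hypothetical helper; the database names are placeholders.
    """
    if "test" in argv:
        # With the sqlite3 backend Django's test runner creates the test
        # database in memory by default, so NAME is only a placeholder here.
        return {
            "default": {
                "ENGINE": "django.db.backends.sqlite3",
                "NAME": "panel_dev.sqlite3",
            }
        }
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            "NAME": "panel",
        }
    }


# In settings.py this would be evaluated once, when the settings are imported.
DATABASES = choose_databases(sys.argv)
```

Keeping the choice in a small function makes it easy to check both branches without actually starting Django.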
Preparing the SQLite test database in advance
As you know, when running tests, Django performs the following sequence of actions:
1. With the permission of the user, clears the test database if there is something in it
2. Creates all the tables and indexes in the test database
3. Loads the fixture set named initial_data
4. Runs the tests, one after another
5. Deletes everything that was created during the test run
The first and last steps can be skipped if the data lives in memory, but steps 2 and 3 take a significant amount of time and break the fast "fix the code, run the test" iteration cycle. Obviously, the database schema and the fixture set change much less often than the code and the tests, so it makes sense to avoid creating the database over and over. By the way, steps 2 and 3 are just an ordinary run of the syncdb management command.
The general approach to speeding up test execution is to run syncdb manually, only when it is really needed, and to simply copy a pre-prepared database when the tests start. With SQLite, one could get by with simply copying files, but I did not want to lose the benefits of running tests in ":memory:".
A short search showed that a solution exists. It turns out that SQLite has an interface for "hot" backups (just like the grown-up databases), and if we run such a copy from a previously prepared database into a database named ":memory:", we get exactly what we need: an initialized database in memory.
The first difficulty is that the standard Python sqlite3 module did not support this API at the time of writing (a Connection.backup method was only added later, in Python 3.7), so the suggested way to perform such a copy from Python is a third-party module called APSW (Another Python SQLite Wrapper).
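For readers on Python 3.7 or later, the standard sqlite3 module exposes SQLite's backup API as Connection.backup, so the copy step itself can be tried without APSW. The sketch below only illustrates that copy step and is not the approach used in this article; the function, table and file names are made up, and wiring the result into Django is not covered.

```python
import os
import sqlite3
import tempfile


def load_into_memory(path):
    """Copy the SQLite database file at `path` into a new in-memory database."""
    source = sqlite3.connect(path)
    target = sqlite3.connect(":memory:")
    try:
        # Connection.backup (Python 3.7+) wraps SQLite's online backup API
        # and copies the contents of `source` into `target`.
        source.backup(target)
    finally:
        source.close()
    return target


# Usage: build a small "prepared" database on disk, then load it into memory.
fd, prepared_path = tempfile.mkstemp(suffix=".sqlite3")
os.close(fd)
prepared = sqlite3.connect(prepared_path)
prepared.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, login TEXT)")
prepared.execute("INSERT INTO account (login) VALUES ('alice')")
prepared.commit()
prepared.close()

memory_db = load_into_memory(prepared_path)
rows = memory_db.execute("SELECT login FROM account").fetchall()
os.remove(prepared_path)  # the in-memory copy no longer needs the file
```

After the copy, the in-memory connection can be queried even though the file on disk has been removed.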
The second difficulty is that each new connection to ":memory:" creates its own, obviously empty, database, so we need to somehow teach the cursor used by the ORM to use the connection initialized by APSW. Fortunately, there is a hack for this case: when creating a connection with sqlite3, you can pass an apsw.Connection object instead of a string with a file name, and it will proxy all requests itself.
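The isolation between ":memory:" connections is easy to see with the standard sqlite3 module alone. This is a small illustrative sketch; the table name and the helper table_names are hypothetical.

```python
import sqlite3

# Two connections to ":memory:" are two unrelated databases.
first = sqlite3.connect(":memory:")
second = sqlite3.connect(":memory:")

first.execute("CREATE TABLE account (id INTEGER PRIMARY KEY)")


def table_names(connection):
    """List the tables visible through the given connection."""
    query = "SELECT name FROM sqlite_master WHERE type = 'table'"
    return [row[0] for row in connection.execute(query)]


tables_in_first = table_names(first)
tables_in_second = table_names(second)
```

The table created through the first connection is visible only there; the second connection sees an empty database, which is why the ORM has to reuse the very connection that received the copied data.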
Thus, the solution looks very simple:
1. Create two APSW Connection objects, one referring to the previously prepared database and the other to a database in memory.
2. Copy the data from the file to memory.
3. Pass the APSW Connection that refers to memory as the NAME parameter of the alias named "default".
4. Initialize the cursor and run the tests.
Preparing the database is very simple: add another alias named "quickstart" to the DATABASES variable in settings.py and then run
./manage.py syncdb --database quickstart
All the code needed for these steps takes a little more than 20 lines and is given below. To make it work, just
1. Install APSW
2. Copy the code to a separate file and put it in the project
3. Add an alias named quickstart to settings.DATABASES
4. Create the database by running
./manage.py syncdb --database quickstart
5. Set the TEST_RUNNER variable so that it points to the class in the file you just saved
6. Try to run some simple test.
import apsw
from django.test.simple import DjangoTestSuiteRunner
from django.db import connections


class TestSuiteRunner(DjangoTestSuiteRunner):

    def setup_databases(self, **kwargs):
        # Path to the database file prepared in advance with syncdb.
        quickstart_connection = connections['quickstart']
        quickstart_dbname = quickstart_connection.settings_dict['NAME']

        # Copy the prepared database into a fresh in-memory database
        # using SQLite's online backup API.
        memory_connection = apsw.Connection(':memory:')
        quickstart_connection = apsw.Connection(quickstart_dbname)
        with memory_connection.backup('main', quickstart_connection, 'main') as backup:
            while not backup.done:
                backup.step(100)

        # Make the ORM use the already initialized in-memory connection.
        connection = connections['default']
        connection.settings_dict['NAME'] = memory_connection
        cursor = connection.cursor()

    def teardown_databases(self, old_config, **kwargs):
        # Nothing to clean up: the in-memory database is simply discarded.
        pass
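For steps 3 and 5 of the checklist, the relevant part of settings.py might look roughly like this. It is only a sketch: the file names dev.sqlite3 and quickstart.sqlite3 and the module path myproject.quickstart_runner are hypothetical and depend on where the runner file was saved.

```python
# settings.py fragment (sketch); file names and module path are hypothetical.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        # Replaced at test time by the in-memory APSW connection.
        "NAME": "dev.sqlite3",
    },
    "quickstart": {
        "ENGINE": "django.db.backends.sqlite3",
        # Prepared once with: ./manage.py syncdb --database quickstart
        "NAME": "quickstart.sqlite3",
    },
}

# Dotted path to the TestSuiteRunner class from the saved file.
TEST_RUNNER = "myproject.quickstart_runner.TestSuiteRunner"
```

The quickstart file only needs to be rebuilt when the schema or the initial_data fixtures change.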
As a result, the execution time of a single test dropped from 18 to 2 seconds, which makes it quite comfortable to run tests as often as you want.
The same code, but with comments and a bold warning "your tests can eat your data!" (use it only in a test environment), is available at gist.github.com/1044215.
I hope these simple recommendations help you write code faster, more efficiently and more reliably.
Sources
For details and other useful information, I recommend referring to the documentation for your database server, as well as the following sources:
- Speeding up Django unit test runs with MySQL
- Innodb Performance Optimization Basics
- Using the SQLite Online Backup API
- How to use SQLite's backup in Python