Django Asynchronous Jobs with Celery
Greetings!
I think most Django developers have heard of Celery, an asynchronous task execution system, and many use it actively.
About a year ago there was a good article on Habr that showed how to use Celery. However, as its conclusion mentioned, Celery 2.0 has since been released (the current stable version is 2.2.7), in which Django integration was moved into a separate package, among other changes.
This article is aimed primarily at beginners who are starting out with Django and need something that can run asynchronous and/or periodic tasks (for example, cleaning up expired sessions). I will show how to install and configure Celery to work with Django from start to finish, along with a few other useful settings and pitfalls.
First, make sure the python-setuptools package is installed on the system, and install it if it is missing:
aptitude install python-setuptools
Celery Installation
Celery itself is very easy to install:
easy_install Celery
More in the original: http://celeryq.org/docs/getting-started/introduction.html#installation
The article linked at the beginning used MongoDB as the backend; here I will show how to use the same database in which the rest of the Django application stores its data as both the backend and the message broker.
django-celery
Install the django-celery package :
easy_install django-celery
As already mentioned, django-celery provides convenient integration between Celery and Django. In particular, it uses the Django ORM as a backend for storing Celery tasks, and it automatically discovers and registers Celery tasks from the Django applications listed in INSTALLED_APPS.
After installing django-celery, you need to configure:
- add djcelery to the INSTALLED_APPS list:
INSTALLED_APPS += ("djcelery", )
- add the following lines to the Django settings file settings.py:
import djcelery
djcelery.setup_loader()
- Create the necessary tables in the database:
./manage.py syncdb
- Set the database as the storage for periodic tasks by adding to settings.py:
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
With this option we can add/remove/edit periodic tasks through the Django admin panel.
When using mod_wsgi, add the following lines to the WSGI configuration file :
import os
os.environ["CELERY_LOADER"] = "django"
django-kombu
Now all that remains is to find a suitable message broker for Celery. In this article I will use django-kombu, a package that lets you use the Django database as a message store for Kombu (a Python implementation of AMQP).
Install the package:
easy_install django-kombu
Configure it:
- add djkombu to the INSTALLED_APPS list:
INSTALLED_APPS += ("djkombu", )
- Set djkombu as a broker in settings.py :
BROKER_BACKEND = "djkombu.transport.DatabaseTransport"
- Create the necessary tables in the database:
./manage.py syncdb
Launch
Start the celeryd and celerybeat processes. (Without celerybeat you can still run regular asynchronous tasks; celerybeat is needed to run periodic scheduled tasks.)
- On Linux, both processes can be started at once using the -B switch:
# ./manage.py celeryd -B

 -------------- celery@test v2.2.7
---- **** -----
--- * ***  * -- [Configuration]
-- * - **** ---   . broker:      djkombu.transport.DatabaseTransport://guest@localhost0/
- ** ----------   . loader:      djcelery.loaders.DjangoLoader
- ** ----------   . logfile:     [stderr]@WARNING
- ** ----------   . concurrency: 16
- ** ----------   . events:      OFF
- *** --- * ---   . beat:        ON
-- ******* ----
--- ***** ----- [Queues]
 --------------   . celery:      exchange:celery (direct) binding:celery
- On Windows, celery and celerybeat must be run separately:
./manage.py celeryd --settings=settings
./manage.py celerybeat
The --settings option may be required if the following exception occurs:

ImportError: Could not import settings 'app_name.settings' (Is it on sys.path?): No module named app_name.settings
Details about the problem: http://groups.google.com/group/celery-users/browse_thread/thread/43a95be6865a636/d91ab2492885f3d4?lnk=gst&q=settings#d91ab2492885f3d4
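The usual root cause of that ImportError is that the project directory is missing from sys.path. Here is a dependency-free sketch of the fix; the path below is a hypothetical example, not taken from the article:

```python
import sys

# Hypothetical project directory; adjust to your own layout.
project_dir = "/var/www/myproject"

# Putting the project directory on sys.path lets Python resolve
# "app_name.settings" as an importable module.
if project_dir not in sys.path:
    sys.path.insert(0, project_dir)

print(project_dir in sys.path)  # → True
```

The --settings flag is just a workaround for the same lookup problem.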
A complete list of known problems with celery on Windows: http://celeryproject.org/docs/faq.html#windows
After starting, we can see what periodic tasks look like in the Django admin panel:
If you use something other than the Django ORM (RabbitMQ, for example) as the celery backend, the Django admin panel can also show the status of all other tasks; it looks something like this:
Details: http://stackoverflow.com/questions/5449163/django-celery-admin-interface-showing-zero-tasks-workers
UPDATE: I'm adding a bit about daemonization, as it may not work on the first try.
Run celery as a service
Download the celery init script from https://github.com/ask/celery/tree/master/contrib/generic-init.d/ and place it in the /etc/init.d directory with the appropriate permissions.

In the /etc/default directory, create a celeryd file from which the script will read its startup settings:
# Where the Django project is.
CELERYD_CHDIR="/var/www/myproject"
# Path to celeryd
CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"
CELERYD_OPTS="--time-limit=300 --concurrency=8 -B"
CELERYD_LOG_FILE=/var/log/celery/%n.log
# Path to celerybeat
CELERYBEAT="$CELERYD_CHDIR/manage.py celerybeat"
CELERYBEAT_LOG_FILE="/var/log/celery/beat.log"
CELERYBEAT_PID_FILE="/var/run/celery/beat.pid"
CELERY_CONFIG_MODULE="settings"
export DJANGO_SETTINGS_MODULE="settings"
The --concurrency option sets the number of celery worker processes (by default this equals the number of processors).
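That default comes from the machine's processor count; you can check the value celeryd would use from Python with the standard library:

```python
import multiprocessing

# This is the value celeryd falls back to for --concurrency
# when the option is not given explicitly.
print(multiprocessing.cpu_count())
```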
After that, you can start celery using service:
service celeryd start
More details: docs.celeryproject.org/en/latest/tutorials/daemonizing.html#daemonizing
Work with celery
After installing django-celery, celery tasks are automatically registered from the tasks.py modules of all applications listed in INSTALLED_APPS. In addition to the tasks modules, you can specify extra task modules using the CELERY_IMPORTS setting:

CELERY_IMPORTS = ('myapp.my_task_module',)
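Under the hood this registration boils down to plain module importing. A rough, dependency-free sketch of how a list like CELERY_IMPORTS can be resolved into modules (stdlib module names stand in for your own task modules):

```python
import importlib

# Stand-ins for real entries such as ('myapp.my_task_module',).
CELERY_IMPORTS = ("json", "collections")

# Importing each listed module triggers its task decorators,
# which is what registers the tasks with Celery.
modules = [importlib.import_module(name) for name in CELERY_IMPORTS]
print([m.__name__ for m in modules])  # → ['json', 'collections']
```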
It is also useful to enable the CELERY_SEND_TASK_ERROR_EMAILS option, which makes Celery e-mail all task errors to the addresses listed in the ADMINS setting.
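A sketch of the relevant settings.py lines; the name and address are placeholder values, not from the article:

```python
# settings.py (example values)
CELERY_SEND_TASK_ERROR_EMAILS = True

# Celery mails task error reports to everyone listed here.
ADMINS = (
    ("Admin", "admin@example.com"),  # hypothetical address
)
```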
Writing tasks for celery has not changed much since the previous article:
from datetime import datetime

from celery.task import periodic_task
from celery.schedules import crontab
from django.contrib.sessions.models import Session

@periodic_task(ignore_result=True, run_every=crontab(hour=0, minute=0))
def clean_sessions():
    Session.objects.filter(expire_date__lt=datetime.now()).delete()
The only difference is that decorators should now be imported from celery.task; the decorators module is deprecated.
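To make the ORM filter in clean_sessions concrete, here is a dependency-free sketch of the same expire_date__lt comparison over plain tuples (the session data is invented for the example):

```python
from datetime import datetime, timedelta

now = datetime(2011, 1, 1, 0, 0)  # fixed "current" time for the example

# (session_key, expire_date) pairs standing in for Session rows.
sessions = [
    ("a1", now - timedelta(days=1)),   # expired
    ("b2", now + timedelta(days=13)),  # still valid
    ("c3", now - timedelta(hours=2)),  # expired
]

# Equivalent of Session.objects.filter(expire_date__lt=now).delete():
# drop every session whose expiry is strictly in the past.
remaining = [(key, exp) for key, exp in sessions if not exp < now]
print([key for key, _ in remaining])  # → ['b2']
```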
A couple of performance notes:
- If a task does not return a result, set the ignore_result=True option
- Turn off rate limits if your tasks do not use them:
CELERY_DISABLE_RATE_LIMITS = True
More about these and other Celery tips: http://celeryproject.org/docs/userguide/tasks.html#tips-and-best-practices