Named tuples from querysets

    To speed up queries that return large amounts of data, Django's QuerySet provides the methods values() and values_list(). The first returns dictionaries instead of models, the second returns tuples. Neither is as convenient to work with as model instances; the idea is that you pay for speed with convenience. I don't want to pay, and thanks to named tuples from the standard collections module, I won't have to.

    For those who have not run into this, and for those who are not yet convinced, here is a piece of code illustrating the current state of things:
    qs = Item.objects.filter(...)
    for item in qs:
        print item.title, item.amount
        total += item.amount * item.price 
       
    # using dictionaries
    qs = Item.objects.filter(...).values('title', 'amount', 'price')
    for item in qs:
        print item['title'], item['amount']
        total += item['amount'] * item['price']
       
    # using tuples
    qs = Item.objects.filter(...).values_list('title', 'amount', 'price')
    for item in qs:
        print item[0], item[1]
        total += item[1] * item[2]  

    The dictionary and tuple variants are not as pretty as the model one, but they run faster and need less memory. Named tuples give access to fields by attribute, so with them our code will hardly differ from the code for models. Hence a side bonus: in many cases we can switch from models to tuples and back without changing the code, and where it does have to change, the changes will be minimal. In addition, named tuples store the field names once in the class rather than in every object, as dictionaries do, so they take up less memory. That is nice in itself and matters even more if you decide to put the query results in a cache.
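
    The memory claim is easy to check by hand. Below is a minimal sketch, written for current Python rather than the Python 2 used in the rest of the article. ItemRow and the sample values are made up for illustration, and the exact sizes depend on the interpreter version, so only the comparison matters:

```python
import sys
from collections import namedtuple

# Hypothetical row type standing in for the Item query results.
ItemRow = namedtuple('ItemRow', ['title', 'amount', 'price'])

as_tuple = ItemRow(title='Widget', amount=3, price=10)
as_dict = {'title': 'Widget', 'amount': 3, 'price': 10}

# Field names are stored once, on the class ...
print(ItemRow._fields)

# ... so each instance holds only its values, while every dict also
# carries its own key table. getsizeof measures the container only,
# not the objects it refers to.
tuple_size = sys.getsizeof(as_tuple)
dict_size = sys.getsizeof(as_dict)
print(tuple_size, dict_size)
```

    On CPython the named tuple comes out noticeably smaller than the dictionary holding the same three values.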

    What are named tuples?


    Named tuples appeared in Python 2.6. They are tuples whose elements can also be accessed through named attributes. A small piece of code to demonstrate:
    >>> from collections import namedtuple
    >>> Point = namedtuple('Point', 'x y'.split())
    >>> p = Point(x=11, y=22)   # create with keyword arguments
    >>> p2 = Point(11, 22)      # ... or positional ones
    >>> p == p2
    True
    >>> p[0] + p[1]             # indexing
    33
    >>> x, y = p                # ... and unpacking, as with regular tuples
    >>> x, y
    (11, 22)
    >>> p.x + p.y               # access to elements through attributes
    33
    >>> Point._make([11, 22])   # create from an iterable
    Point(x=11, y=22)
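
    A few more standard named tuple methods come in handy when moving between dictionaries and tuples. This is only a small sketch for current Python, reusing the same Point class as above:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(11, 22)

# Named tuples are immutable; _replace returns a modified copy.
moved = p._replace(x=100)

# Convert to a plain dict and back again.
as_dict = dict(p._asdict())
restored = Point(**as_dict)

print(moved, as_dict, restored)
```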

    NamedTuplesQuerySet


    It remains to create a QuerySet subclass that produces named tuples. Since all the logic for querying only the specified fields is already implemented in ValuesQuerySet (returned by QuerySet.values()), all we have to do is process the result before handing it out:
    from itertools import imap
    from collections import namedtuple

    from django.db.models.query import ValuesQuerySet

    class NamedTuplesQuerySet(ValuesQuerySet):
        def iterator(self):
            # collect the list of field names
            extra_names = self.query.extra_select.keys()
            field_names = self.field_names
            aggregate_names = self.query.aggregate_select.keys()
            names = extra_names + field_names + aggregate_names
           
            # create the tuple class
            tuple_cls = namedtuple('%sTuple' % self.model.__name__, names)

            results_iter = self.query.get_compiler(self.db).results_iter()
            # wrap each row in our named tuple
            return imap(tuple_cls._make, results_iter)
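
    The same idea can be tried without Django at all. Here is a minimal sketch for current Python that uses the standard sqlite3 module in place of the Django compiler; the item table and its rows are made up for illustration. It follows the same three steps: collect the column names, create the tuple class once, then lazily wrap each row:

```python
import sqlite3
from collections import namedtuple

# Hypothetical in-memory table standing in for the Item model.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE item (title TEXT, amount INTEGER, price INTEGER)')
conn.executemany(
    'INSERT INTO item VALUES (?, ?, ?)',
    [('bolt', 10, 2), ('nut', 5, 3)],
)

cursor = conn.execute('SELECT title, amount, price FROM item ORDER BY title')

# Collect the column names, here from the cursor description.
names = [column[0] for column in cursor.description]

# Create the tuple class once per query, not once per row.
ItemTuple = namedtuple('ItemTuple', names)

# Lazily wrap each raw row; map() in Python 3 plays the role of imap().
rows = map(ItemTuple._make, cursor)

total = 0
for item in rows:
    total += item.amount * item.price

print(total)  # 10*2 + 5*3 = 35
conn.close()
```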

    And, for convenience, we attach the method to QuerySet:
    from django.db.models.query import QuerySet

    def namedtuples(self, *fields):
        return self._clone(klass=NamedTuplesQuerySet, setup=True, _fields=fields)
    QuerySet.namedtuples = namedtuples

    After that, the code from the beginning of the article, whether written for models, dictionaries or tuples, can be rewritten as:
    qs = Item.objects.filter(...).namedtuples('title', 'amount', 'price')
    for item in qs:
        print item.title, item.amount
        total += item.amount * item.price
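
    To see why switching between models and named tuples needs no code changes, here is a small Django-free sketch for current Python. The Item class is a hypothetical stand-in for a model instance, and total_cost is a helper invented for this example:

```python
from collections import namedtuple


class Item(object):
    """Hypothetical stand-in for a model instance."""

    def __init__(self, title, amount, price):
        self.title = title
        self.amount = amount
        self.price = price


ItemTuple = namedtuple('ItemTuple', ['title', 'amount', 'price'])


def total_cost(items):
    # Uses attribute access only, so it does not care about the row type.
    return sum(item.amount * item.price for item in items)


data = [('bolt', 10, 2), ('nut', 5, 3)]
from_models = total_cost(Item(*row) for row in data)
from_tuples = total_cost(ItemTuple._make(row) for row in data)
print(from_models, from_tuples)
```

    Both calls go through exactly the same function body, which is the whole point: only the line that builds the queryset has to change.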

    Amusingly, the resulting class does the job of both existing accelerators, values() and values_list(). It does it either better or almost as well, and produces prettier code at the same time. Maybe someday it will replace them.

    P.S. The class code, along with the patch, can be taken from gist.github.com/876324
    P.P.S. The order of the fields in the tuple may not match the order of the arguments passed to namedtuples(); I skipped that for the sake of speed. A field-reordering implementation, if anyone needs it, can be borrowed from ValuesListQuerySet.iterator().
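
    For the curious, reordering could look roughly like the sketch below. This is not the code from ValuesListQuerySet.iterator(), just an illustration of the idea in current Python; reordering_factory and the sample row are made up. The positions are computed once per query, so the per-row cost is a single pass over the indexes:

```python
from collections import namedtuple


def reordering_factory(class_name, row_names, requested):
    """Return a function that turns a raw row (in row_names order)
    into a named tuple whose fields follow the requested order."""
    tuple_cls = namedtuple(class_name, requested)
    # Work out where each requested field sits in the raw row.
    positions = [row_names.index(name) for name in requested]

    def make(row):
        return tuple_cls._make(row[i] for i in positions)

    return make


# Hypothetical case: the database returns (price, title, amount),
# but the caller asked for title, amount, price.
row_names = ['price', 'title', 'amount']
make_item = reordering_factory('ItemTuple', row_names, ['title', 'amount', 'price'])
item = make_item((2, 'bolt', 10))
print(item)
```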
