Ruzin February 27, 2016 at 23:53

How I Reinvent Dictionaries in Python

In our Django application we needed to build a report that calculates bonuses.
The report has a nested structure, with summary totals per user, per department and for the whole company. Schematically, its logic looks like this:

print(total)
for department in departments:
    print(department.total)
    for user in department.users:
        print(user.total)
        for row in user.rows:
            print(row.data)

Two things complicated this report:

Different models (possibly unrelated to each other) can act as a "row", so simply iterating over QuerySets is not an option.
Building the report is slow: collecting the data takes several seconds. The data can also change. Strictly speaking, this is not a static report but a tool for monitoring and adjusting accrued bonuses, presented as a report. The data does not change often, though: say, one change per 100 views, after which the report has to be rebuilt. In other words, the data can be cached.

A structure of nested dictionaries solves both problems neatly: you can put all the required scalars (numbers, strings, dates) into it, then serialize and cache it.
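
As a rough illustration of the "serialize and cache" step, the sketch below round-trips a small nested dictionary through JSON. The choice of json and the sample values are assumptions made for the example; the post does not say which serializer or cache backend was used.

```python
import json

# A tiny report fragment with made-up values; 1 is a hypothetical department id.
report = {
    'total': {'income': 1234, 'bonus': 123},
    'departments': {
        1: {'total': {'income': 1234, 'bonus': 123}},
    },
}

# Only dicts and scalars are involved, so the structure serializes directly.
serialized = json.dumps(report)
restored = json.loads(serialized)

# JSON object keys are always strings, so the integer id 1 comes back as '1'.
print(restored['departments']['1']['total']['bonus'])
```

If integer keys have to survive the round trip unchanged, a serializer that preserves key types (pickle, for instance) may be a better fit.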

The data structure for the report took roughly this form (simplified):

{
    'total': {
        'income': 1234,
        'bonus': 123,
        'expense': 1234,
        'penalty': 123
    },
    'departments': {
        '{dept_id}': {
            'department': {
                'title': 'Mega Department'
            },
            'total': {
                'income': 1234,
                'bonus': 123,
                'expense': 1234,
                'penalty': 123
            },
            'users': {
                '{user_id}': {
                    'user': {
                        'name': 'John Smith'
                    },
                    'total': {
                        'income': 1234,
                        'bonus': 123,
                        'expense': 1234,
                        'penalty': 123
                    },
                    'rows': {
                        '{sale_id}': {        # one model
                            'type': 'sale',
                            'base_income': 1234,
                            'bonus': 123,
                            'comment': 'some description'
                        },
                        '{expense_id}': {     # a different model!
                            'type': 'expense',
                            'expense': 1234,
                            'penalty': 123,
                            'comment': 'some description'
                        },
                        ...
                    }
                },
                ...
            }
        },
        ...
    }
}

This is where I ran into a problem: filling such a structure with plain dictionaries is not as convenient as I would like. Checking each dictionary for keys, or using setdefault(key, {}), turns the code into an unreadable mess.
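
To make the complaint concrete, here is a minimal sketch of filling a single row with plain dictionaries. The ids and amounts are made up for the example.

```python
# Filling one row of the report with plain dicts: every level of nesting
# needs its own setdefault() call. The ids and amounts are made-up values.
data = {}
dept_id, user_id, sale_id = 1, 10, 100

row = (
    data.setdefault('departments', {})
        .setdefault(dept_id, {})
        .setdefault('users', {})
        .setdefault(user_id, {})
        .setdefault('rows', {})
        .setdefault(sale_id, {})
)
row['base_income'] = 1234
row['bonus'] = 123

print(data)
```

Six chained calls are needed before a single value can be stored, and the same chain has to be repeated for every model that contributes rows.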

The structure is somewhat reminiscent of XML, and I wanted to address its nodes in a way similar to how XPath expressions address the nodes of an XML tree:

/departments/{dept_id}/users/{user_id}/rows/{row_id}/base_income

or in Python something like this:

data.departments.{dept_id}.users.{user_id}.rows.{row_id}.base_income

Since {dept_id} and the other {id} values are integers, I allowed myself to use square brackets for them:

data.departments[{dept_id}].users[{user_id}].rows[{row_id}].base_income

What I actually needed was a class that behaves basically like a dictionary, but in which:

keys can be accessed as attributes, without square brackets
missing keys are created automatically on first access

This is how ElasticDict came about.

The result

The data preparation code looks something like this:

data = ElasticDict()
for sale in Sale.objects.filter(...).prefetch_related(...):
    data.departments[sale.user.department.pk].users[sale.user.pk].rows[sale.pk] = {
        'base_income': sale.amount,
        'bonus': sale.calc_bonus(),
    }
# or in a different form, whichever you prefer
for expense in Expense.objects.filter(...).prefetch_related(...):
    data.departments[expense.user.department.pk].users[expense.user.pk].rows[expense.pk].base_expense = expense.amount
    data.departments[expense.user.department.pk].users[expense.user.pk].rows[expense.pk].penalty = expense.calc_penalty()

The code in the template is:

{{ data.total }}
{% for dept_id, department in data.departments.items %}
    {{ department.total }}
    {% for user_id, user in department.users.items %}
        {{ user.total }}
        {% for row_id, row in user.rows.items %}
            {{ row.data }}
        {% endfor %}
    {% endfor %}
{% endfor %}

Conclusion

Note that ElasticDict is a subclass of the regular dict, so everything a regular dictionary offers is available in it. When the time comes to "freeze" the structure (that is, when we again want a KeyError on access to a non-existent key), an ElasticDict instance can be exported to a regular dict: the ElasticDict is traversed recursively and every instance of the class is replaced with an ordinary dictionary. The inverse transformation also exists: it takes a dictionary as input and returns an ElasticDict, again converting recursively.
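
The two recursive conversions might look roughly like the sketch below. The function names `to_dict` and `from_dict` are hypothetical, and the `ElasticDict` stand-in is reduced to the auto-creation behaviour only; none of this is taken from the author's code.

```python
class ElasticDict(dict):
    """Minimal stand-in for the class discussed in this post (an assumption)."""

    def __missing__(self, key):
        value = self[key] = type(self)()
        return value


def to_dict(node):
    """Recursively replace every dict (including subclasses) with a plain dict."""
    if isinstance(node, dict):
        return {key: to_dict(value) for key, value in node.items()}
    return node


def from_dict(node):
    """Recursively wrap dictionaries in ElasticDict."""
    if isinstance(node, dict):
        elastic = ElasticDict()
        for key, value in node.items():
            elastic[key] = from_dict(value)
        return elastic
    return node


data = ElasticDict()
data['departments'][1]['total']['bonus'] = 123

frozen = to_dict(data)         # plain dicts: missing keys raise KeyError again
restored = from_dict(frozen)   # elastic again: missing keys are auto-created
restored['departments'][2]['total']['bonus'] = 5
```

Both helpers only descend into dictionaries; lists or other containers holding dictionaries would need extra handling.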

Comments / suggestions are welcome!

UPDATE: English-speaking readers pointed out that an analogue already exists: addict. I think those who voted "I need it" should switch to it, since it is more stable (tested).

Do you need it?

14.7% Of course, I have long wanted! 14
36.8% Well, I don’t know ... 35
48.4% Nonsense 46
