Dictionary Generators

Original author: Szymon Guz
Some of Python's best features are undeservedly overlooked, and many programmers do not know about them. This time we will look at one that makes code clearer: dictionary generators (officially called dictionary comprehensions), single-line expressions that return a dictionary. But let's start with the more familiar list generators and the task of removing duplicate elements from collections.

This article will mainly interest Python beginners.

List generators


The easiest way to create a list is a single-line expression called a list generator (the official Python term is list comprehension). It is used quite often, and I have seen it in many examples and in the code of many libraries.
Suppose we have a function that returns a sequence of values. A good example is the range(start, end) function, which returns the numbers from start up to, but not including, end. Starting with Python 3.0, it no longer builds a complete list right away but returns a lazy range object that produces the numbers one by one as needed. In Python 2.x, the xrange() function was used for this. Getting a list of numbers from 1 to 10 using this function might look like this:
numbers = []
for i in range(1, 11):
    numbers.append(i)

If we only need even numbers, we could implement this as follows:
numbers = []
for i in range(1, 11):
    if i % 2 == 0:
        numbers.append(i)

List generators make this code much simpler. In general form, an expression that returns a list looks like this:
[ expression for item in list if conditional ]

Using it, the first example can be rewritten like this:
numbers = [i for i in range(1, 11)]

and the second like this:
numbers = [i for i in range(1, 11) if i % 2 == 0]

Of course, such a syntax may seem strange at first glance, but when you get used to it, the code will become simpler and more understandable.
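
To make the general form more concrete, here is a small sketch in which the expression part does some work instead of just repeating the loop variable. The names squares_loop and squares are illustrative and do not come from the original examples.

```python
# Loop version: collect the squares of the even numbers from 1 to 10.
squares_loop = []
for i in range(1, 11):
    if i % 2 == 0:
        squares_loop.append(i * i)

# Equivalent list generator (list comprehension):
# expression, loop and condition in a single line.
squares = [i * i for i in range(1, 11) if i % 2 == 0]

print(squares)  # [4, 16, 36, 64, 100]
```

Both versions build the same list; the single-line form simply puts the expression first, then the loop, then the optional condition.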

Duplicate Removal

Another common task when working with collections is removing duplicate elements. There are many ways to do it.
Suppose we are working with a list like this:
numbers = [i for i in range(1,11)] + [i for i in range(1,6)]

The most cumbersome way to remove duplicates that I have encountered looks like this:
unique_numbers = []
for n in numbers:
    if n not in unique_numbers:
        unique_numbers.append(n)

Of course this works, but there are simpler solutions. You can use the standard set type. By definition, a set cannot contain equal elements, so converting the list to a set drops the duplicates. But the result is a set and not a list, so if we want a list of unique values, we need to convert it back:
unique_numbers = list(set(numbers))
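
One caveat worth knowing: a set makes no promise about the order of its elements. If the original order matters, one possible alternative (not part of the original article) is dict.fromkeys(), which relies on dictionaries preserving insertion order in Python 3.7 and later. A minimal sketch, with illustrative variable names:

```python
numbers = list(range(1, 11)) + list(range(1, 6))

# set() removes duplicates but makes no promise about element order.
unique_unordered = list(set(numbers))

# dict.fromkeys() keeps one key per distinct value, and dictionaries
# preserve insertion order in Python 3.7+, so first occurrences stay in place.
unique_ordered = list(dict.fromkeys(numbers))

print(unique_ordered)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```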


Delete duplicate objects

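The situation is completely different with dictionaries, which are not hashable and so cannot be put into a set, and with objects that should be compared by one of their fields. For example, suppose we have a list of dictionaries in which one of the values is used as an identifier: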
data = [
  {'id': 10, 'data': '...'},
  {'id': 11, 'data': '...'},
  {'id': 12, 'data': '...'},
  {'id': 10, 'data': '...'},
  {'id': 11, 'data': '...'},
]
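
The set trick from the previous section does not carry over to such a list. A short sketch of what happens if we try it anyway (the shortened list and the variable error are illustrative only):

```python
data = [
    {'id': 10, 'data': '...'},
    {'id': 10, 'data': '...'},
]

try:
    set(data)
    error = None
except TypeError as exc:
    # Dictionaries are mutable and therefore not hashable,
    # so they cannot be stored in a set.
    error = str(exc)

print(error)
```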

Duplicates can be removed with more or less code. Of course, the less the better! A long version might look like this:
unique_data = []
for d in data:
    data_exists = False
    for ud in unique_data:
        if ud['id'] == d['id']:
            data_exists = True
            break
    if not data_exists:
        unique_data.append(d)


You can get the same result using a feature that I learned about a couple of days ago: dictionary generators. Their syntax is similar to that of list generators, but they return a dictionary:
{ key:value for item in list if conditional }

If we rewrite the code from the example above using this feature, only one line remains:
{ d['id']:d for d in data }.values()

This line creates a dictionary whose keys are the field we chose as the unique identifier, and then the values() method retrieves all the values from that dictionary. Because a dictionary can hold at most one entry per key, the result contains no duplicates, which is what we needed. If the same key occurs more than once, the last entry wins. In Python 3, values() returns a view rather than a list, so wrap it in list() if a real list is needed.
This feature was added in Python 3.0 and backported to Python 2.7. In earlier versions, a similar problem can be solved with a construction like this:
dict((key, value) for item in list if condition)

A generator expression produces tuples (pairs) and passes them to the dict() constructor, which takes the first element of each tuple as the key and the second as the value. With this approach, the solution to the same problem looks like this:
dict((d['id'], d) for d in data).values()
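
To see both forms side by side, here is a self-contained sketch. The 'data' values 'a' to 'e' are made up for this illustration so that it is visible which of the duplicate entries survives; the list() call is added because values() returns a view in Python 3.

```python
data = [
    {'id': 10, 'data': 'a'},
    {'id': 11, 'data': 'b'},
    {'id': 12, 'data': 'c'},
    {'id': 10, 'data': 'd'},
    {'id': 11, 'data': 'e'},
]

# Dictionary generator (dict comprehension), Python 2.7+ and 3.x.
unique_data = list({d['id']: d for d in data}.values())

# Older form: dict() fed with (key, value) pairs from a generator expression.
unique_data_old = list(dict((d['id'], d) for d in data).values())

# Both forms keep one entry per id; a later entry overwrites an earlier one.
for d in unique_data:
    print(d)
```

Note one difference from the long loop version: the loop keeps the first entry seen for each id, while both dictionary-based forms keep the last one.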
