Code Like a Pythonista: Idiomatic Python (part1)

Original author: David Goodger
  • Translation


This is a continuation of the translation of the article by David Goodger “Write the code like a real Pythonist: Python idiomat

The beginning and end of the translation.


Thanks to all Habr users for rating the first part and for the valuable and positive comments. I have tried to take the reported errors into account, and I again look forward to a constructive discussion.





Use in whenever possible (1)



Good:

for key in d:
    print key

  • in is usually faster.
  • this method also works for elements of arbitrary containers (such as lists, tuples, sets).
  • in is also an operator (as we will see).


Bad:
for key in d.keys():
    print key


This also applies to any object with a keys() method.


Use in whenever possible (2)



But .keys() is necessary when mutating the dictionary:

for key in d.keys():
    d[str(key)] = d[key]


d.keys() creates a static list of the dictionary keys. Otherwise, you would get the exception "RuntimeError: dictionary changed size during iteration".
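
A minimal sketch of this behaviour, written for Python 3 (the article's own snippets use Python 2). One difference to keep in mind: in Python 3, d.keys() returns a live view rather than a list, so the snapshot has to be made explicitly with list(d). The sample dictionary is made up for illustration.

```python
# Hypothetical sample data, for illustration only.
d = {1: 'one', 2: 'two'}

# Adding keys while iterating over the dictionary itself fails.
error = None
try:
    for key in d:
        d[str(key)] = d[key]
except RuntimeError as exc:
    error = str(exc)

# Iterating over a snapshot of the keys is safe, because the
# snapshot does not change when the dictionary grows.
d = {1: 'one', 2: 'two'}
for key in list(d):
    d[str(key)] = d[key]
```

After the second loop the dictionary holds both the original integer keys and their string counterparts.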

It is better to use key in dict than dict.has_key():

# do this:
if key in d:
    ...do something with d[key]
# not this:
if d.has_key(key):
    ...do something with d[key]

This code uses in as an operator.
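
The two roles of in (iteration and membership test) can be sketched as follows. This is Python 3 code with made-up sample values. Note that dict.has_key() no longer exists in Python 3, so key in d is the only spelling there.

```python
# Hypothetical sample data, for illustration only.
d = {'spam': 1, 'eggs': 2}

# "for ... in" iterates over the keys of a dictionary.
keys = [key for key in d]

# As an operator, "in" tests membership; for a dict it looks at keys only.
has_spam = 'spam' in d      # a key, so True
has_value = 1 in d          # 1 is a value, not a key, so False

# The same operator works on other containers.
in_list = 'two' in ['one', 'two']
in_tuple = 3 in (1, 2)
in_set = 'a' in {'a', 'b'}
in_string = 'yth' in 'Python'   # substring test for strings
```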

Dictionary get method



Often we need to populate the dictionary with data before use.

A naive way to do this:

navs = {}
for (portfolio, equity, position) in data:
    if portfolio not in navs:
        navs[portfolio] = 0
    navs[portfolio] += position * prices[equity]


dict.get(key, default) removes the need for the test:

navs = {}
for (portfolio, equity, position) in data:
    navs[portfolio] = (navs.get(portfolio, 0)
                       + position * prices[equity])


This is much more direct.
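
To make the loop concrete, here is a small runnable sketch in Python 3. The contents of data and prices are invented for illustration; the article does not show them.

```python
# Hypothetical inputs: the article does not define data or prices.
prices = {'AAA': 10.0, 'BBB': 2.5}
data = [
    ('p1', 'AAA', 3),
    ('p2', 'BBB', 4),
    ('p1', 'BBB', 2),
]

navs = {}
for portfolio, equity, position in data:
    # get() returns 0 the first time a portfolio is seen,
    # so no explicit "is the key there yet?" test is needed.
    navs[portfolio] = navs.get(portfolio, 0) + position * prices[equity]

# p1: 3 * 10.0 + 2 * 2.5 = 35.0
# p2: 4 * 2.5 = 10.0
```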

Dictionary method setdefault (1)



Now suppose we have to initialize the values of a mutable dictionary, where each value is a list. Here is another naive way:

Initializing mutable dictionary values:
equities = {}
for (portfolio, equity) in data:
    if portfolio in equities:
        equities[portfolio].append(equity)
    else:
        equities[portfolio] = [equity]

dict.setdefault(key, default) does the job much more efficiently:

equities = {}
for (portfolio, equity) in data:
    equities.setdefault(portfolio, []).append(equity)

dict.setdefault() is equivalent to "get, or set & get", or "set if necessary, then get". It is especially efficient if the dictionary key is expensive to compute or long to type.

The only problem with dict.setdefault() is that the default value is always evaluated, whether it is needed or not. This matters only if the default value is expensive to compute.

If the default value is expensive to compute, it may be more convenient to use the defaultdict class, which we will cover shortly.
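
The difference in evaluation can be sketched like this (Python 3). The helper names expensive_default and factory are hypothetical; they only count how often they are called.

```python
from collections import defaultdict

# Hypothetical helper that records every call, standing in for a
# default value that is costly to build.
default_calls = []

def expensive_default():
    default_calls.append(1)
    return []

cache = {'a': [1]}
# The argument is evaluated before setdefault() runs, so the helper
# is called even though the key 'a' already exists.
cache.setdefault('a', expensive_default()).append(2)

# With defaultdict the factory is only called for a missing key.
factory_calls = []

def factory():
    factory_calls.append(1)
    return []

lazy = defaultdict(factory)
lazy['a'].append(1)   # key missing: factory called once
lazy['a'].append(2)   # key present: factory not called
```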

Dictionary method setdefault (2)



Here we see that the setdefault method can also be used as a stand-alone statement:

navs = {}
for (portfolio, equity, position) in data:
    navs.setdefault(portfolio, 0)
    navs[portfolio] += position * prices[equity]


The setdefault dictionary method returns the value stored under the key (the default, if the key was missing), but we ignore it here. We are taking advantage of setdefault's side effect: it sets the dictionary value only if there is no value already.

defaultdict


New in Python 2.5.

defaultdict appeared in Python 2.5 as part of the collections module. defaultdict is identical to a regular dictionary, except for two things:

  • it takes a factory function as its first argument; and
  • when a dictionary key is encountered for the first time, the factory function is called and its result is used to initialize the value of the new dictionary element.

There are two ways to get defaultdict:
  • import the collections module and refer to it through the module name,
  • or import the defaultdict name directly:

import collections
d = collections.defaultdict(...)

from collections import defaultdict
d = defaultdict(...)


Here is the earlier example, where each dictionary value must be initialized to an empty list, rewritten with defaultdict:

from collections import defaultdict
equities = defaultdict(list)
for (portfolio, equity) in data:
    equities[portfolio].append(equity)

In this case, the factory function is list, which returns an empty list.

This example shows how to get a dictionary with a default value of 0, using int as the factory function:
navs = defaultdict(int)
for (portfolio, equity, position) in data:
    navs[portfolio] += position * prices[equity]


You should be careful with defaultdict, though. You cannot get a KeyError exception from a properly initialized defaultdict. Use the "key in dict" condition if you need to check whether a specific key exists.
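
A short sketch of why the caution matters (Python 3, invented sample keys): merely reading a missing key from a defaultdict inserts it, whereas the in test leaves the dictionary untouched.

```python
from collections import defaultdict

navs = defaultdict(int)
navs['p1'] += 5

# Membership test: does not create the key.
present_before = 'p2' in navs
size_before = len(navs)

# Plain lookup: no KeyError, but the key is silently created.
value = navs['p2']
present_after = 'p2' in navs
size_after = len(navs)
```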

Building and splitting dictionaries


Here is a useful technique for building a dictionary from two lists (or sequences): one is a list of keys, the other a list of values.

given = ['John', 'Eric', 'Terry', 'Michael']
family = ['Cleese', 'Idle', 'Gilliam', 'Palin']

pythons = dict(zip(given, family))

>>> import pprint
>>> pprint.pprint(pythons)
{'John': 'Cleese',
 'Michael': 'Palin',
 'Eric': 'Idle',
 'Terry': 'Gilliam'}


The reverse is, of course, trivial:

>>> pythons.keys()
['John', 'Michael', 'Eric', 'Terry']
>>> pythons.values()
['Cleese', 'Palin', 'Idle', 'Gilliam']


Note that the order of the results of .keys() and .values() differs from the order of the items when the dictionary was created: the order going in is different from the order coming out. This is because a dictionary is inherently unordered. However, the order is guaranteed to be consistent (the order of the keys corresponds to the order of the values), as long as the dictionary is not changed between the calls.
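
The consistency guarantee can be checked with a round trip, sketched here in Python 3. Two differences from the Python 2 session above: keys() and values() return views, so they are wrapped in list(), and since Python 3.7 dictionaries preserve insertion order, so the keys come back in the order they went in.

```python
given = ['John', 'Eric', 'Terry', 'Michael']
family = ['Cleese', 'Idle', 'Gilliam', 'Palin']

# Build: pair up keys and values, then hand the pairs to dict().
pythons = dict(zip(given, family))

# Split: keys and values come out in matching order.
keys = list(pythons.keys())
values = list(pythons.values())

# Because the two orders correspond, zipping them rebuilds the dict.
rebuilt = dict(zip(keys, values))
```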

Testing for truth values



# do this:        # not this:
if x:             if x == True:
    pass              pass


It is elegant and efficient to take advantage of the intrinsic truth values (or Boolean values) of Python objects.
Testing a list:

# do this:        # not this:
if items:         if len(items) != 0:
    pass              pass

                  # and definitely not this:
                  if items != []:
                      pass


The Meaning of Truth



The names True and False are built-in instances of the bool type. Like None, there is only one instance of each.

False                 True
False (== 0)          True (== 1)
"" (empty string)     any string but "" (" ", 'anything')
0, 0.0                any number but 0 (1, 0.1, -1, 3.14)
[], (), {}, set()     any non-empty container ([0], (None,), [''])
None                  almost any object that is not explicitly False
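
The table can be verified directly. This Python 3 sketch simply feeds the listed values to bool(); the variable names are chosen for illustration.

```python
# Values from the "False" column of the table.
falsy = [False, 0, 0.0, '', [], (), {}, set(), None]

# Values from the "True" column of the table.
truthy = [True, 1, 0.1, -1, 3.14, ' ', 'anything', [0], (None,), ['']]

all_falsy = all(not bool(value) for value in falsy)
all_truthy = all(bool(value) for value in truthy)

# This is why "if items:" is enough to test for a non-empty list.
items = []
empty_message = 'has items' if items else 'empty'
```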


Example of an object's truth value:

>>> class C:
...     pass
...
>>> o = C()
>>> bool(o)
True
>>> bool(C)
True

(Examples: run truth.py.)

To control the truth value of instances of a user-defined class, use the __nonzero__ or __len__ special methods. Use __len__ if your class is a container that has a length:
class MyContainer(object):
    def __init__(self, data):
        self.data = data
    def __len__(self):
        """Return my length."""
        return len(self.data)

If your class is not a container, use __nonzero__:

class MyClass(object):
    def __init__(self, value):
        self.value = value
    def __nonzero__(self):
        """Return my truth value (True or False)."""
        # This could be arbitrarily complex:
        return bool(self.value)


In Python 3.0, __nonzero__ has been renamed to __bool__ for consistency with the built-in bool type. For compatibility, add this to the class definition:

__bool__ = __nonzero__
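
Here is how the two classes might look when written for Python 3 first, as a sketch: __bool__ is the primary method and __nonzero__ is kept only as an alias for Python 2. The class names follow the article; the sample values are invented.

```python
class MyContainer(object):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        """Return my length; a length of 0 makes the instance false."""
        return len(self.data)


class MyClass(object):
    def __init__(self, value):
        self.value = value

    def __bool__(self):
        """Return my truth value (True or False)."""
        return bool(self.value)

    # Alias so that Python 2, which looks for __nonzero__, agrees.
    __nonzero__ = __bool__


empty_container = bool(MyContainer([]))
full_container = bool(MyContainer([1, 2]))
zero_object = bool(MyClass(0))
nonzero_object = bool(MyClass('x'))
```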


Index & Item (1)



Here is a neat way to save some typing when you need a list of words:

>>> items = 'zero one two three'.split()
>>> print items
['zero', 'one', 'two', 'three']


Say we want to iterate over the items, and we need both each item and its index:

                        - or -
i = 0
for item in items:      for i in range(len(items)):
    print i, item           print i, items[i]
    i += 1


Index & Item (2): enumerate



The enumerate function takes a list and returns (index, item) pairs:

>>> print list(enumerate(items))
[(0, 'zero'), (1, 'one'), (2, 'two'), (3, 'three')]


We need the list wrapper to print the whole result because enumerate is a lazy function: it generates one item (a pair) at a time, only when asked. A for loop is one place that asks for one result at a time. enumerate is an example of a generator, which we will discuss in more detail later. print does not take one result at a time; we want the entire result, so we have to convert the generator to a list explicitly when we print it.

Our loop becomes much simpler:

for (index, item) in enumerate(items):
    print index, item

# compare:              # compare:
index = 0               for i in range(len(items)):
for item in items:          print i, items[i]
    print index, item
    index += 1


The enumerate version is substantially shorter and simpler than the version on the left, and much easier to read and understand than either.

An example showing how the enumerate function actually returns an iterator (a generator is a kind of iterator):
>>> enumerate(items)
<enumerate object at 0x...>
>>> e = enumerate(items)
>>> e.next()
(0, 'zero')
>>> e.next()
(1, 'one')
>>> e.next()
(2, 'two')
>>> e.next()
(3, 'three')
>>> e.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
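
The same session can be reproduced in Python 3, where the e.next() method is spelled as the built-in next(e). This is a sketch of the laziness described above, not a replacement for the session.

```python
items = 'zero one two three'.split()

e = enumerate(items)

# Each call to next() produces exactly one (index, item) pair.
first = next(e)
second = next(e)

# list() drains whatever the iterator has not produced yet.
rest = list(e)

# Once exhausted, the iterator signals the end with StopIteration.
exhausted = False
try:
    next(e)
except StopIteration:
    exhausted = True
```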


Other languages have “variables”


In many other languages, assigning to a variable puts a value into a box.

int a = 1;

a1box.png


Box “a” now contains the integer 1.

Assigning another value to the same variable replaces the contents of the box:
a = 2;

a2box.png


Now box “a” contains the integer 2.

Assigning one variable to another makes a copy of the value and puts it in the new box:

int b = a;

b2box.png
a2box.png

“b” is a second box, with a copy of the integer 2. Box “a” has a separate copy.

Python has “names”


In Python, “names” or “identifiers” are like name tags (labels) attached to an object.
a = 1

a1tag.png

Here, an integer 1 object has a tag labelled “a”.

If we reassign “a”, we simply move the tag to another object:
a = 2

a2tag.png
1.png


Now the name “a” is attached to an integer 2 object.
The original integer 1 object no longer has the tag “a”. It may live on, but we cannot reach it through the name “a”. When an object has no more references or tags, it is removed from memory.

If we assign one name to another, we simply attach another name tag to an existing object:
b = a

ab2tag.png


The name “b” is just a second tag bound to the same object as “a”.

Although we commonly say “variables” in Python (because it is the generally accepted terminology), we really mean “names” or “identifiers”. In Python, “variables” are name tags for values, not labelled boxes.
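
The tag model is easiest to observe with a mutable object. The following Python 3 sketch uses a list instead of the integers from the pictures, which is an illustrative substitution: a change made through one name is visible through the other, and rebinding a name leaves the object alone.

```python
a = [1, 2]
b = a                       # a second tag on the same list object
same_object = a is b

b.append(3)                 # mutate the object through the tag "b"
seen_through_a = list(a)    # the change is visible through "a" as well

a = [9]                     # move the tag "a" to a new object
still_same = a is b         # "b" stays on the original list
```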

If you get nothing else out of this tutorial, I hope you understand how names work in Python. A good understanding is certain to pay off, helping you to avoid cases like this:
(for some reason, the example code is missing here; translator's note)

Default Parameter Values



This is a common mistake that beginners often make. Even more advanced programmers make it if they do not understand Python names well enough.

def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list


The problem here is that the default value of a_list, an empty list, is evaluated only once, at function definition time. So every time you call the function without a_list, you get the same list object. Try it several times:

>>> print bad_append('one')
['one']

>>> print bad_append('two')
['one', 'two']


Lists are mutable objects; you can change their contents. The correct way to get a default list (or dictionary, or set) is to create it at run time instead, inside the function:

def good_append(new_item, a_list=None):
    if a_list is None:
        a_list = []
    a_list.append(new_item)
    return a_list
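
To see where the shared list lives, one can inspect the function's __defaults__ attribute. The sketch below is Python 3 and reuses the two functions from the text; the variable names holding the results are invented.

```python
def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list

def good_append(new_item, a_list=None):
    if a_list is None:
        a_list = []          # a fresh list on every call
    a_list.append(new_item)
    return a_list

first = bad_append('one')
second = bad_append('two')

# Both calls returned the very same list object, which is stored
# on the function as its default value.
shared = first is second
stored_default = bad_append.__defaults__[0]

g1 = good_append('one')
g2 = good_append('two')
```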


% string formatting



In Python, the % operator works like the sprintf function from C.

Although if you don't know C, that is not very helpful. Basically, you provide a template or format and interpolation values.

In the example below, the template contains two conversion specifications: "%s" means "insert a string here", and "%i" means "convert an integer to a string and insert it here". "%s" is particularly useful because it uses Python's built-in str() function to convert any object to a string.

The interpolation values must match the template; here we have two values, packed into a tuple.
name = 'David'
messages = 3
text = ('Hello %s, you have %i messages'
        % (name, messages))
print text


Output:
Hello David, you have 3 messages
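
Two of the points above, sketched in Python 3: "%s" accepts any object because it goes through str(), and a tuple that does not match the template raises TypeError. The extra sample values are invented for illustration.

```python
name = 'David'
messages = 3
text = 'Hello %s, you have %i messages' % (name, messages)

# %s converts each value with str(), whatever its type.
anything = '%s, %s, %s' % ([1, 2], None, 3.5)

# Too few values for the template is an error, not a silent gap.
mismatch = None
try:
    'Hello %s, you have %i messages' % (name,)
except TypeError as exc:
    mismatch = str(exc)
```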


See the Python Library Reference, section 2.3.6.2, “String Formatting Operations”, for more information. Bookmark it!
If you have not done so already, go to python.org, download the HTML documentation (as a .zip or whichever format you prefer), and install it on your machine. There is nothing like having the definitive reference at your fingertips.

Advanced % string formatting



Many people do not realize that there are other, more flexible ways to do string formatting:

By name with a dictionary:

values = {'name': name, 'messages': messages}
print ('Hello %(name)s, you have %(messages)i '
       'messages' % values)


Here we specify the names of the interpolation values, which are looked up in the supplied dictionary.

Notice any redundancy? The names "name" and "messages" are already defined in the local namespace. We can take advantage of this.
By name using the local namespace:
print ('Hello %(name)s, you have %(messages)i '
       'messages' % locals())


The locals() function returns a dictionary of all identifiers available in the local namespace.

This is a very powerful tool. With it, you can do all the string formatting you want without having to worry about matching the interpolation values to the template.
But be careful. (“With great power comes great responsibility.”) If you use locals() with an externally supplied template string, you expose your entire local namespace to the caller. This is just something to keep in mind.
To examine your local namespace:
>>> from pprint import pprint
>>> pprint(locals())

pprint is also a very useful function. If you don't know it already, try playing with it. It makes debugging your data structures much easier!
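
Both the convenience and the risk of locals() can be sketched in a few lines of Python 3. The functions greet and render and the local api_key are hypothetical, made up to show what an externally supplied template can reach.

```python
def greet(name, messages):
    # The template picks the values it needs out of the local namespace.
    return 'Hello %(name)s, you have %(messages)i messages' % locals()

def render(template, name):
    # Hypothetical sensitive local that was never meant for display.
    api_key = 'not-for-display'
    return template % locals()

greeting = greet('David', 3)

# Whoever supplies the template decides which locals are shown.
harmless = render('Hello %(name)s', 'David')
leaked = render('%(api_key)s', 'David')
```

If the template comes from outside, passing an explicit dictionary with only the intended names avoids the exposure.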

Advanced % string formatting



The namespace of an object's instance attributes is just a dictionary, self.__dict__.

By name using the instance namespace:

print ("We found %(error_count)d errors"
       % self.__dict__)


Equivalent to, but more flexible than:

print ("We found %d errors"
       % self.error_count)


Note: class attributes are in the class __dict__. Namespace lookups are actually chained dictionary lookups.
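
The note about class attributes has a practical consequence, sketched here in Python 3 with a hypothetical Report class: a template can only name attributes stored on the instance, because class attributes are not in self.__dict__.

```python
class Report(object):
    kind = 'lint'                        # class attribute

    def __init__(self, error_count):
        self.error_count = error_count   # instance attribute

    def summary(self):
        return 'We found %(error_count)d errors' % self.__dict__


report = Report(3)
summary = report.summary()

in_instance = 'kind' in report.__dict__
in_class = 'kind' in Report.__dict__

# Attribute access follows the chain instance -> class ...
via_attribute = report.kind

# ... but formatting from the instance dictionary alone does not.
missing = None
try:
    '%(kind)s' % report.__dict__
except KeyError as exc:
    missing = exc.args[0]
```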

The final part of the translation.
