kossmak April 2, 2010 at 08:23

Code Like a Pythonista: Idiomatic Python (part1)

Translation

This is a continuation of the translation of David Goodger's article “Code Like a Pythonista: Idiomatic Python”.

The beginning and end of the translation.

Thanks to all Habr users for rating the first part and for the valuable and positive comments. I have tried to take the errors into account, and once again I look forward to a constructive discussion.

Use in whenever possible (1)

Good:

for key in d:
    print key

in is usually faster.
This pattern also works for the items of arbitrary containers (such as lists, tuples, and sets).
in is also an operator (as we will see).

Bad:

for key in d.keys():
    print key

This form is limited to objects that have a keys() method.
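To illustrate the points above, here is a minimal sketch, using made-up sample data, of in working both in a for loop and as a membership operator on several container types. It avoids the print statement so that it also runs unchanged on Python 3.

```python
# Hypothetical sample data, not taken from the original article.
d = {'a': 1, 'b': 2}
containers = [['a', 'b'], ('a', 'b'), set(['a', 'b']), d]

# "in" as an operator: the membership test works on every container type.
results = [('a' in c) for c in containers]

# "in" in a for loop: iterating over a dictionary yields its keys.
keys_seen = sorted(key for key in d)
```

For a dictionary, both forms look at the keys, never at the values.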

Use in whenever possible (2)

But .keys() is necessary when mutating the dictionary:

for key in d.keys():
    d[str(key)] = d[key]

d.keys() creates a static list of the dictionary keys. Otherwise, you would get the exception "RuntimeError: dictionary changed size during iteration".
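A minimal sketch of the failure and the fix, with a small made-up dictionary. It takes the snapshot with list(d) instead of d.keys(), on the assumption that the code may also be run on Python 3, where keys() no longer returns a list.

```python
unsafe = {1: 'one', 2: 'two'}   # hypothetical sample data

# Adding keys while iterating over the dictionary itself fails.
try:
    for key in unsafe:
        unsafe[str(key)] = unsafe[key]
    failed = False
except RuntimeError:
    failed = True

safe = {1: 'one', 2: 'two'}

# Iterating over a snapshot of the keys makes the mutation safe.
for key in list(safe):
    safe[str(key)] = safe[key]
```

The snapshot costs one extra list, so it is worth taking only when the loop body really adds or removes keys.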

Prefer key in dict to dict.has_key():

# do this:
if key in d:
    ...do something with d[key]

# not this:
if d.has_key(key):
    ...do something with d[key]

This code uses in as an operator.

Dictionary get Method

We often have to initialize dictionary entries before use.

This is the naive way to do it:

navs = {}
for (portfolio, equity, position) in data:
    if portfolio not in navs:
        navs[portfolio] = 0
    navs[portfolio] += position * prices[equity]

dict.get(key, default) removes the need for the test:

navs = {}
for (portfolio, equity, position) in data:
    navs[portfolio] = (navs.get(portfolio, 0)
                       + position * prices[equity])

This is much more direct.
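The article never shows what data and prices look like, so the following sketch assumes a list of (portfolio, equity, position) tuples and a price dictionary. All values are invented for illustration.

```python
# Hypothetical inputs: (portfolio, equity, position) tuples and a price table.
data = [('growth', 'AAA', 10), ('growth', 'BBB', 5), ('income', 'AAA', 2)]
prices = {'AAA': 2.0, 'BBB': 4.0}

navs = {}
for (portfolio, equity, position) in data:
    # get() returns 0 for a portfolio that has not been seen yet.
    navs[portfolio] = navs.get(portfolio, 0) + position * prices[equity]

# get() only reads: asking for a missing key does not add it.
missing = navs.get('bonds', 0)
```

Note that get() never modifies the dictionary; the assignment on the left-hand side is what stores the new total.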

Dictionary method setdefault (1)

Now suppose we have to initialize the values of a dictionary whose values are mutable: each value will be a list. This is the naive way:

equities = {}
for (portfolio, equity) in data:
    if portfolio in equities:
        equities[portfolio].append(equity)
    else:
        equities[portfolio] = [equity]

dict.setdefault(key, default) does the job much more efficiently:

equities = {}
for (portfolio, equity) in data:
    equities.setdefault(portfolio, []).append(
                                         equity)

dict.setdefault() is equivalent to "get, or set & get", or "set if necessary, then get". It is especially efficient if the dictionary key is expensive to compute or long to type.

The only problem with dict.setdefault() is that the default value is always evaluated, whether it is needed or not. That matters only if the default value is expensive to compute.

If the default value is expensive to compute, it may be more convenient to use the defaultdict class, which we will cover shortly.
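The eager evaluation is easy to observe. In this sketch, expensive_default is a hypothetical stand-in for a costly default; it records each call so we can count how often it runs.

```python
calls = []

def expensive_default():
    """Hypothetical stand-in for a default value that is costly to build."""
    calls.append(1)
    return []

equities = {}
equities.setdefault('growth', expensive_default()).append('AAA')
# The key exists now, yet the argument is still evaluated before the call.
equities.setdefault('growth', expensive_default()).append('BBB')

times_built = len(calls)
```

The second default list is built and immediately thrown away, because Python evaluates arguments before setdefault gets a chance to check the key.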

Dictionary method setdefault (2)

Here we see that the setdefault method can also be used as a stand-alone statement:

navs = {}
for (portfolio, equity, position) in data:
    navs.setdefault(portfolio, 0)
    navs[portfolio] += position * prices[equity]

The setdefault dictionary method returns the value stored under the key (the default, if the key was missing), but we ignore it here. We rely on the side effect of setdefault: it assigns the value only if the dictionary element has not been set yet.
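A short sketch of both behaviours, the return value and the side effect, using an invented key:

```python
navs = {}

first = navs.setdefault('growth', 0)    # key missing: stores 0 and returns it
navs['growth'] += 15
second = navs.setdefault('growth', 0)   # key present: returns the stored value
```

The second call leaves the dictionary untouched, which is why it is safe to call setdefault on every pass of the loop.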

defaultdict

New in Python 2.5.

defaultdict appeared in Python 2.5 as part of the collections module. A defaultdict is identical to a regular dictionary, except for two things:

it takes a factory function as its first argument; and
when a dictionary key is encountered for the first time, the factory function is called and its result is used to initialize the value of the new dictionary element.

There are two ways to get defaultdict:

import the collections module and refer to it through the module name,
or import the defaultdict name directly:

import collections
d = collections.defaultdict(...)

from collections import defaultdict
d = defaultdict(...)

Here is the earlier example, where each dictionary value must be initialized to an empty list, rewritten with defaultdict:

from collections import defaultdict
equities = defaultdict(list)
for (portfolio, equity) in data:
    equities[portfolio].append(equity)

In this case, the factory function is list, which returns an empty list.

And this is how to get a dictionary with a default value of 0: use int as the factory function:

navs = defaultdict(int)
for (portfolio, equity, position) in data:
    navs[portfolio] += position * prices[equity]

Be careful with defaultdict, though. You cannot get a KeyError exception from a properly initialized defaultdict. You have to use the "key in dict" test if you need to check whether a specific key exists.
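The warning is worth seeing in action: merely reading a missing key from a defaultdict creates it, while the in test does not. A small sketch with invented keys:

```python
from collections import defaultdict

navs = defaultdict(int)
navs['growth'] += 10

present_before = 'income' in navs   # False: "in" does not create the key
value = navs['income']              # 0: the lookup calls int() and stores the result
present_after = 'income' in navs    # True: the lookup above inserted the key
```

So a membership test has to come before any subscript access, otherwise it will always succeed.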

Building & Splitting Dictionaries

Here is a useful technique for building a dictionary from two lists (or sequences): one is a list of keys, the other a list of values.

given = ['John', 'Eric', 'Terry', 'Michael']
family = ['Cleese', 'Idle', 'Gilliam', 'Palin']

pythons = dict(zip(given, family))

>>> pprint.pprint(pythons)
{'John': 'Cleese',
 'Michael': 'Palin',
 'Eric': 'Idle',
 'Terry': 'Gilliam'}

The reverse, of course, is trivial:

>>> pythons.keys()
['John', 'Michael', 'Eric', 'Terry']
>>> pythons.values()
['Cleese', 'Palin', 'Idle', 'Gilliam']

Note that the order of the results of .keys() and .values() differs from the order of the items when the dictionary was created: the order going in is different from the order coming out. This is because a dictionary is inherently unordered. However, the output order is guaranteed to be consistent (the order of the keys corresponds to the order of the values), as long as the dictionary has not changed between the calls.
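The consistency guarantee means the two results can be zipped back together. A sketch that checks this without relying on any particular key order:

```python
given = ['John', 'Eric', 'Terry', 'Michael']
family = ['Cleese', 'Idle', 'Gilliam', 'Palin']
pythons = dict(zip(given, family))

# keys() and values() line up pairwise, so zipping them rebuilds the dictionary.
rebuilt = dict(zip(pythons.keys(), pythons.values()))

# The same pairs, in the same order, are what items() returns.
pairs_match = list(zip(pythons.keys(), pythons.values())) == list(pythons.items())
```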

Testing for Truth Values

# do this:        # not this:
if x:             if x == True:
    pass              pass

This is an elegant and efficient way to test the truth value of Python objects (or Boolean values).

Testing a list:

# do this:        # not this:
if items:         if len(items) != 0:
    pass              pass

                  # and definitely not this:
                  if items != []:
                      pass

The Meaning of Truth

The names True and False are instances of the built-in Boolean type. Like None, only one instance of each is created.

False                   True
False (== 0)            True (== 1)
"" (empty string)       any string except "" (" ", "anything")
0, 0.0                  any number except 0 (1, 0.1, -1, 3.14)
[], (), {}, set()       any non-empty container ([0], (None,), [''])
None                    almost any object that is not explicitly False
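The table can be checked directly with bool(). The sketch below also shows why comparing with == True is not the same thing as testing truth: a non-empty list is true, but it is not equal to True.

```python
falsy = [False, 0, 0.0, '', [], (), {}, set(), None]
truthy = [True, 1, 0.1, -1, 3.14, ' ', 'anything', [0], (None,), ['']]

all_falsy = not any(bool(v) for v in falsy)
all_truthy = all(bool(v) for v in truthy)

# Truth testing and equality with True are different questions.
items = ['x']
is_true = bool(items)            # True: the list is non-empty
equals_true = (items == True)    # False: a list never equals True
```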

An example of an object's truth value:

>>> class C:
...     pass
...
>>> o = C()
>>> bool(o)
True
>>> bool(C)
True

(Examples: execute truth.py .)

To control the truth value of instances of a user-defined class, use the __nonzero__ or __len__ special methods. Use __len__ if your class is a container that has a length:

class MyContainer(object):
    def __init__(self, data):
        self.data = data
    def __len__(self):
        """Return my length."""
        return len(self.data)

If your class is not a container, use __nonzero__:

class MyClass(object):
    def __init__(self, value):
        self.value = value
    def __nonzero__(self):
        """Return my truth value (True or False)."""
        # This could be arbitrarily complex:
        return bool(self.value)

In Python 3.0, __nonzero__ has been renamed to __bool__ for consistency with the built-in bool type. For compatibility, add this to the class definition:

__bool__ = __nonzero__
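Putting the pieces together, here is a sketch of two classes that report a sensible truth value on both Python 2 and Python 3. Account and Basket are hypothetical names invented for the illustration.

```python
class Account(object):
    """Hypothetical non-container class with an explicit truth value."""
    def __init__(self, balance):
        self.balance = balance
    def __nonzero__(self):       # consulted by Python 2
        return self.balance > 0
    __bool__ = __nonzero__       # consulted by Python 3

class Basket(object):
    """Hypothetical container: its truth value follows from its length."""
    def __init__(self, data):
        self.data = data
    def __len__(self):
        return len(self.data)

truths = [bool(Account(0)), bool(Account(5)),
          bool(Basket([])), bool(Basket(['x']))]
```

The alias line must come after the def it refers to, since the class body is executed top to bottom.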

Index & Item (1)

Here is a neat way to save some typing when you need a list of words:

>>> items = 'zero one two three'.split()
>>> print items
['zero', 'one', 'two', 'three']

Say we want to iterate over the items, and we need both each item and its index:

                   - or -
i = 0
for item in items:      for i in range(len(items)):
    print i, item           print i, items[i]
    i += 1

Index & Item (2): enumerate

The enumerate function takes a list and returns (index, item) pairs:

>>> print list(enumerate(items))
[(0, 'zero'), (1, 'one'), (2, 'two'), (3, 'three')]

We need to wrap the call in list to see the full result because enumerate is lazy: it produces one item, a pair, at a time, only when asked. A for loop is exactly such a consumer: it iterates over the result and requests one item per pass. enumerate is an example of a generator, which we discuss in more detail later. print does not take one result at a time; we want the entire result, so we have to convert the generator to a list explicitly when we print it.

Our loop becomes much simpler:

for (index, item) in enumerate(items):
    print index, item

# compare:              # compare:
index = 0               for i in range(len(items)):
for item in items:          print i, items[i]
    print index, item
    index += 1

The enumerate version is substantially shorter and simpler than the version on the left, and much easier to read and understand than either.

An example showing that the enumerate function actually returns an iterator (a generator is a kind of iterator):

>>> enumerate(items)
<enumerate object at 0x...>
>>> e = enumerate(items)
>>> e.next()
(0, 'zero')
>>> e.next()
(1, 'one')
>>> e.next()
(2, 'two')
>>> e.next()
(3, 'three')
>>> e.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
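The e.next() method shown above exists only in Python 2. The built-in next() function, available since Python 2.6, does the same job and also works on Python 3, so the session can be sketched portably like this:

```python
items = 'zero one two three'.split()

e = enumerate(items)
first = next(e)      # consumes one pair from the iterator
rest = list(e)       # consumes everything that is left

# Once exhausted, the iterator signals the end with StopIteration.
try:
    next(e)
    exhausted = False
except StopIteration:
    exhausted = True
```

A for loop performs exactly these steps behind the scenes and treats StopIteration as the signal to stop.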

Other languages have “variables”

In many other languages, variable assignment places a value in a cell.

int a = 1;

Cell "a" now contains the integer 1.

Assigning a different value to the same variable replaces the contents of the cell:

a = 2;

Now cell “a” contains the integer 2.

Assigning one variable to another creates a copy of the value and places it in a new cell:

int b = a;

“b” is a second cell, holding a copy of the integer 2. Cell “a” has a separate copy.

Python has “names”

In Python, “names” or “identifiers” are like tags (labels) attached to an object.

a = 1

Here the integer object 1 has the label “a”.

If we reassign “a”, we simply move the label to another object:

a = 2

Now the name “a” is attached to the integer object 2.
The original integer object 1 no longer has the label “a”. It may live on for a while, but we can no longer reach it through the name “a”. When an object has no more references or labels, it is removed from memory.

If we assign one name to another, we simply attach another label to an existing object:

b = a

The name “b” is simply the second label assigned to the same object as “a”.

Although we usually say “variables” in Python (because it’s generally accepted terminology), we really mean “names” or “identifiers”. In Python, “variables” are references to values, not named cells.
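With integers the difference between cells and labels is hard to observe, because integers cannot be changed in place. A mutable object makes it visible. This sketch uses a list, which is my choice for the illustration and not part of the original text:

```python
a = [1, 2]
b = a                  # a second label on the same list object
same_object = a is b   # True: no copy was made

b.append(3)            # mutating through one name...
seen_via_a = list(a)   # ...is visible through the other name

a = [9]                # rebinding moves the label "a" to a new object;
                       # "b" stays attached to the original list
```

The is operator compares identity (same object), whereas == compares values.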

If you get nothing else out of this tutorial, I hope you understand how names work in Python. A clear understanding will certainly pay off and help you avoid cases like this:
(For some reason, the example code is missing here. Translator's note.)

Default Parameter Values

This is a common mistake that beginners often make. Even more advanced programmers make it if they do not understand Python names well enough.

def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list

The problem here is that the default value of a_list, an empty list, is evaluated only once, at function definition time. So every time you call the function without that argument, you get the same default list object. Try it several times:

>>> print bad_append('one')
['one']

>>> print bad_append('two')
['one', 'two']

Lists are mutable objects; you can change their contents. The correct way to get a default list (or dictionary, or set) is to create it at run time, inside the function, rather than in the function declaration:

def good_append(new_item, a_list=None):
    if a_list is None:
        a_list = []
    a_list.append(new_item)
    return a_list
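The two versions can be compared side by side. The sketch repeats both definitions so that it runs on its own, and uses is to show that the bad version hands back the very same list object on every call.

```python
def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list

def good_append(new_item, a_list=None):
    if a_list is None:
        a_list = []          # a fresh list on every call
    a_list.append(new_item)
    return a_list

first_bad = bad_append('one')
second_bad = bad_append('two')
shared = first_bad is second_bad   # both names tag the single default list

first_good = good_append('one')
second_good = good_append('two')
```

Passing an explicit list still works with both versions; only the omitted-argument case differs.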

% string formatting

In Python, the % operator works like the sprintf function from C.

Although if you do not know C, that tells you little. Basically, you provide a template or format and the values to interpolate.

In this example, the template contains two conversion specifications: "%s" means "insert a string here", and "%i" means "convert an integer to a string and insert it here". "%s" is particularly useful because it uses Python's built-in str() function to convert any object to a string.

The interpolation values must match the template; here we have two values, packed into a tuple.

name = 'David'
messages = 3
text = ('Hello %s, you have %i messages'
        % (name, messages))
print text

Output:

Hello David, you have 3 messages

See the Python Library Reference, section 2.3.6.2, “String Formatting Operations”, for details. Bookmark it!
If you have not done so already, go to python.org, download the HTML documentation (as a .zip or in whatever format you prefer), and install it on your machine. There is nothing more useful than having the definitive reference at your fingertips.
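A few more cases of the same operator, as a sketch. The second and third lines are my own additions, intended to show that %s accepts any object and that a tuple value has to be wrapped in a one-element tuple.

```python
name = 'David'
messages = 3
text = 'Hello %s, you have %i messages' % (name, messages)

# %s calls str() on its value, so it is not limited to strings.
mixed = '%s and %s' % ([1, 2], None)

# To interpolate a tuple as one value, wrap it in a one-element tuple;
# otherwise its items would be taken as separate interpolation values.
single = 'point: %s' % ((1, 2),)
```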

Advanced % string formatting

Many people do not know that there are other, more flexible ways to do string formatting:

By name with a dictionary:

values = {'name': name, 'messages': messages}
print ('Hello %(name)s, you have %(messages)i '
       'messages' % values)

Here we specify the names of the interpolation values, which are looked up in the dictionary.

Notice any redundancy? The names "name" and "messages" are already defined in the local namespace. We can take advantage of this.

By name using the local namespace:

print ('Hello %(name)s, you have %(messages)i '
       'messages' % locals())

The locals() function returns a dictionary of all identifiers available in the local namespace.

This is a very powerful tool. With it, you can do all the string formatting you want without having to worry about matching the interpolation values to the template.
But be careful. (“With great power comes great responsibility.”) If you use locals() with externally supplied template strings, you expose your entire local namespace to the caller. This is just something to keep in mind.
To examine your local namespace:

>>> from pprint import pprint
>>> pprint(locals())

pprint is also a useful function. If you do not know it yet, try playing with it. It makes debugging your data structures much easier!
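Both the convenience and the risk fit in a few lines. In this sketch, greeting and render are hypothetical functions, and secret stands in for any local value you would not want a caller to read.

```python
def greeting(name, messages):
    # Inside this function, locals() holds exactly the two parameters.
    return 'Hello %(name)s, you have %(messages)i messages' % locals()

def render(template):
    secret = 'hunter2'   # hypothetical sensitive local variable
    # Whoever supplies the template can name any local, including secret.
    return template % locals()

text = greeting('David', 3)
leaked = render('%(secret)s')
```

Passing an explicit dictionary with only the intended names avoids the exposure when the template comes from outside.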

Advanced % string formatting

The namespace of an object's instance attributes is just a dictionary, self.__dict__.

By name using the instance namespace:

print ("We found %(error_count)d errors"
       % self.__dict__)

Equivalent to, but more flexible than:

print ("We found %d errors"
       % self.error_count)

Note: class attributes are in the class __dict__. Namespace lookups are actually chained dictionary lookups.
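A self-contained sketch of the same idea. Report is a hypothetical class invented for the example; the class attribute kind is there to show that class attributes live in the class __dict__ and not in the instance __dict__.

```python
class Report(object):
    """Hypothetical class used only for this illustration."""
    kind = 'report'      # class attribute: stored in Report.__dict__

    def __init__(self, error_count, warning_count):
        # Instance attributes: stored in self.__dict__
        self.error_count = error_count
        self.warning_count = warning_count

    def summary(self):
        return ('We found %(error_count)d errors and '
                '%(warning_count)d warnings' % self.__dict__)

report = Report(2, 5)
text = report.summary()
in_instance = 'kind' in report.__dict__
in_class = 'kind' in Report.__dict__
```

Attribute access such as report.kind still works, because the lookup falls through from the instance dictionary to the class dictionary.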

The final part of the translation.
