Code Like a Pythonista: Idiomatic Python (part0)
- Translation
From the translator:
I have only just started learning Python. From the very first encounter, the language pleased me with its attractive constructs and with a syntax that all but guarantees code is easy to read and understand.
While learning, and writing my own code, I sometimes doubt whether the approaches I choose are right from the standpoint of the Python way (PEP 8, Style Guide for Python Code, if you like). To help newcomers absorb the ideology of Python programming, the community has, besides exhaustive documentation and to everyone's delight, accumulated quite a few supporting materials, such as the article Python Tips, Tricks, and Hacks, a translation of which recently appeared on Habr.
I liked David Goodger's article "Code Like a Pythonista: Idiomatic Python". To absorb it as well as possible, I decided to produce a full translation (as far as my skills allow), and sharing it with Habr then seemed the natural thing to do.
While working on the translation, I realized that the article is much larger than it seemed when I read the original, so I will post it in parts to stay within the Habr article format.
Continuation and completion of the translation.
Code Like a Pythonista: Idiomatic Python
David Goodger
goodger@python.org
http://python.net/~goodger
In this interactive tutorial we will look at a lot of essential Python idioms and advanced techniques that will certainly expand your toolkit.
There are three versions of this presentation:
Creative Commons Attribution/Share-Alike (BY-SA) license.
About me:
- I live in Montreal
- father of two children, husband of one woman,
- a full-time Python programmer,
- author of the Docutils project and reStructuredText,
- an editor of the Python Enhancement Proposals (or PEPs),
- organizer of PyCon 2007, and chair of PyCon 2008,
- a member of the Python Software Foundation,
- a director of the Foundation last year, and its secretary this year.
I presented a tutorial at the PyCon 2006 conference (called Text & Data Processing), and I was surprised by the reaction to some techniques that I used and had considered common knowledge. Many attendees were unaware of tools that experienced Python programmers use without thinking.
Many of you will have seen some of these idioms and techniques before. I hope you will also learn a few tricks that you have not seen, and maybe something new about the ones you already know.
The Zen of Python (1)
These are the guiding principles of Python, but they are open to interpretation. A sense of humor is required to interpret them properly.
If you use a programming language named after a sketch comedy troupe, you had better have a sense of humor.
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
...
The Zen of Python (2)
This "poem" began as a joke, but it really contains a lot of truth about the philosophy of Python.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch (translator's note: this apparently refers to Guido van Rossum; thanks to sgtpep).
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
—Tim Peters
Long-time Pythoneer Tim Peters succinctly channeled the BDFL's guiding principles for Python's design into 20 aphorisms, only 19 of which have been written down.
http://www.python.org/dev/peps/pep-0020/
You can decide for yourself who you are: Pythoneer or Pythonista. These words have related meanings.
When in doubt:
import this
Try this in the interactive Python interpreter:
>>> import this
Here is another easter egg:
>>> from __future__ import braces
  File "<stdin>", line 1
SyntaxError: not a chance
What a bunch of comedians! :-)
Coding Style: Readability Counts
Programs must be written for people to read, and only incidentally for machines to execute.
—Abelson & Sussman, Structure and Interpretation of Computer Programs (“Structure and interpretation of computer programs”)
Try to make your programs readable and obvious.
PEP 8: Style Guide for Python Code
Worth reading:
http://www.python.org/dev/peps/pep-0008/
PEP = Python Enhancement Proposal
A PEP is a design document that provides information to the Python community, or describes a new feature for Python, its processes, or its environment.
The Python community has its own standards for what source code should look like, codified in PEP 8. These standards differ from those of other communities, such as C, C++, C#, Java, VisualBasic, etc.
Because indentation and whitespace are so important in Python, the Style Guide for Python Code comes close to being a standard. It would be wise to adhere to the guide! Most open-source projects and (hopefully) in-house projects follow the style guide quite closely.
Whitespace 1
- 4 spaces per indentation level.
- No hard tabs.
- Never mix tabs and spaces. This is exactly what IDLE and the Emacs Python mode support. Other editors may also provide this support.
- One blank line between functions.
- Two blank lines between classes.
Whitespace 2
- Add a space after "," in dictionaries, lists, tuples, and argument lists, and after ":" in dictionaries, but not before.
- Put spaces around assignments and comparisons (except in argument lists).
- No spaces just inside parentheses or just before an argument list.
- No spaces just inside docstrings.
def make_squares(key, value=0):
    """Return a dictionary and a list..."""
    d = {key: value}
    l = [key, value]
    return d, l
Naming
- joined_lower for functions, methods, attributes
- joined_lower or ALL_CAPS for constants
- StudlyCaps for classes
- camelCase only to conform to pre-existing conventions
- Attributes: interface, _internal, __private
Try to avoid the __private form. I never use it. Trust me. If you use it, you will regret it later.
Explanation:
People coming from a C++/Java background are especially prone to overusing or misusing this feature. __private names do not work the way they do in Java or C++. They just trigger name mangling, which helps prevent accidental namespace collisions in subclasses: MyClass.__private becomes MyClass._MyClass__private. (Note that even this breaks down for subclasses with the same name as the superclass, for example a subclass defined in another module.) Access to __private names from outside the class is therefore possible, just inconvenient and fragile (it adds a dependency on the exact name of the superclass).
The problem is that the author of a class may quite naturally think, "this attribute/method name should be private, accessible only from within this class definition", and use the __private convention. But later on, a user of that class may create a subclass that legitimately needs access to that name. Then either the superclass has to be modified (which may be difficult or impossible), or the subclass code has to use manually mangled names (which is ugly and fragile at best).
There is a concept in Python: "we're all consenting adults here."
If you use the __private form, who are you protecting the attribute from? It is the responsibility of subclasses to use the attributes of the superclass properly, and the responsibility of the superclass to document its attributes properly.
For these purposes it is better to use the single-leading-underscore convention, _internal. It is not name mangled at all; it just tells others to "be careful with this, it's an internal implementation detail; don't touch it unless you fully understand it". It is only a convention, though.
There are some good explanations in the answers here:
- http://stackoverflow.com/questions/70528/why-are-pythons-private-methods-not-actually-private
- http://stackoverflow.com/questions/1641219/does-python-have-private-variables-in-classes
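To make the name mangling described in this section concrete, here is a minimal sketch. The classes Base and Child and their attribute names are hypothetical, invented for this illustration only; the code is written so that it should run on both Python 2 and Python 3.

```python
class Base(object):
    def __init__(self):
        self.__token = 'base'    # mangled: stored as _Base__token
        self._cache = {}         # single underscore: a convention, no mangling


class Child(Base):
    def __init__(self):
        super(Child, self).__init__()
        self.__token = 'child'   # mangled separately: stored as _Child__token


child = Child()

# Both tokens coexist because each was mangled with its own class name.
attribute_names = sorted(vars(child))
print(attribute_names)

# The "private" value is still reachable from outside, just awkwardly.
print(child._Base__token)
```

Note that the subclass never overwrote the superclass attribute, which is the collision protection at work, and that nothing stops outside code from using the mangled name.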
Long Lines & Continuations
Keep lines of code under 80 characters.
Use implicit line continuation inside parentheses, brackets, and braces:
def __init__(self, first, second, third,
             fourth, fifth, sixth):
    output = (first + second + third
              + fourth + fifth + sixth)
Or use backslashes:
VeryLong.left_hand_side \
    = even_longer.right_hand_side()
Use backslashes carefully; they must be the last character on the line they are on. If you add a space after the backslash, it will no longer work. They also clutter the code.
Long strings
Adjacent literal strings are concatenated by the parser:
>>> print 'o' 'n' "e"
one
The spaces between literals are not required, but they help readability. Any type of quoting can be used:
>>> print 't' r'\/\/' """o"""
t\/\/o
Strings prefixed with "r" are raw strings. Backslashes are not treated as escape characters in raw strings. They are useful for regular expressions and Windows file system paths.
Note that named string objects are not concatenated:
>>> a = 'three'
>>> b = 'four'
>>> a b
  File "<stdin>", line 1
    a b
      ^
SyntaxError: invalid syntax
This is because automatic concatenation is a feature of the Python parser, not the interpreter. You must use the + operator to concatenate strings at run time.
text = ('Long strings can be made up '
        'of several shorter strings.')
The parentheses allow implicit line continuation.
To specify multi-line string values, use triple quotation marks:
"""Triple
double
quotes"""
'''\
Triple
single
quotes\
'''
In the last example (triple single quotes), note how the backslashes escape the newlines. This eliminates the extra newlines while keeping the text and the quotes aligned at the left margin. Each backslash must be the last character on its line.
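The effect of these string forms is easiest to see by inspecting the resulting values. The following is a small sketch that reuses the literals from this section; the variable names plain and trimmed are made up for the illustration.

```python
# Adjacent literals are joined at compile time; the parentheses let the
# expression span two lines.
text = ('Long strings can be made up '
        'of several shorter strings.')

# Newlines inside a triple-quoted string are part of the value.
plain = """Triple
double
quotes"""

# A backslash at the very end of a line inside the literal swallows
# the newline that follows it.
trimmed = '''\
Triple
single
quotes\
'''

print(repr(text))
print(repr(plain))
print(repr(trimmed))
```

Without the two backslashes, the last string would begin and end with a newline character.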
Compound statements
Good:
if foo == 'blah':
    do_something()
do_one()
do_two()
do_three()
Bad:
if foo == 'blah': do_something()
do_one(); do_two(); do_three()
Whitespace and indentation are useful visual indicators of program flow. The indentation of the second line of the "Good" code above shows the reader that something is done conditionally, while the lack of indentation in the "Bad" code hides the "if" condition.
Multiple statements on one line are a cardinal sin. In Python, readability counts.
Docstrings & Comments
Docstrings = How to use the code
Comments = Why (rationale) and how the code works
Docstrings explain how to use the code, and are for the users of your code. Uses of docstrings:
- Explain the purpose of the function even if it seems obvious to you, because it may not be obvious to someone else later on.
- Describe the expected parameters, the return values, and any exceptions raised.
- If the method is tightly coupled with a single caller, make some mention of the caller (though be careful, because the caller may change later).
Comments explain why, and are for the maintainers of your code. Examples include notes to yourself, like:
# !!! BUG: ...
# !!! FIX: This is a hack
# ??? Why is this here?
Both of these groups include you, so write good docstrings and comments!
Docstrings are useful in interactive use (help()) and for auto-documentation systems.
False comments and docstrings are worse than none at all. So keep them up to date! When you make changes, make sure that the comments and docstrings are consistent with the code and do not contradict it.
There is an entire PEP about docstrings, PEP 257, "Docstring Conventions":
http://www.python.org/dev/peps/pep-0257/
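As an illustration of the split between docstrings and comments, here is a small sketch. The function mean is a hypothetical example, not taken from the tutorial: the docstring tells a user how to call it, while the comment tells a maintainer why one line is written the way it is.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    Raises ValueError if `values` is empty.
    """
    if not values:
        raise ValueError('mean() requires at least one value')
    # float() forces true division, so the result is the same on
    # Python 2, where int / int would otherwise truncate.
    return sum(values) / float(len(values))


print(mean([1, 2, 3]))
```

In the interactive interpreter, help(mean) would display the docstring but not the comment.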
Practicality Beats Purity
A foolish consistency is the hobgoblin of little minds.
—Ralph Waldo Emerson
(hobgoblin: something that causes superstitious fear)
(translator's note: as I understand it, the meaning is close to the proverb "Make a fool pray to God and he will smash his forehead")
There are always exceptions. From PEP 8:
But most importantly: know when to be inconsistent; sometimes the style guide just doesn't apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don't hesitate to ask!
Two good reasons to break a particular rule:
- Applying the rule would make the code less readable, even for someone who is used to reading code that follows the rules.
- To be consistent with surrounding code that also breaks it (maybe for historical reasons), although this is also an opportunity to clean up someone else's mess (in true XP style; translator's note: eXtreme Programming).
... but practicality should not beat purity to a pulp!
Potpourri of Idioms
A selection of small, useful idioms.
Now we move on to the main part of the tutorial: a lot of idioms.
We will start with those that are easier and gradually raise the level.
Swap Values
In other languages:
temp = a
a = b
b = temp
In Python:
b, a = a, b
You may have seen this before. But do you know how it works?
- The comma is the tuple constructor syntax.
- A tuple is created on the right (tuple packing).
- A tuple is the assignment target on the left (tuple unpacking).
The right-hand side is unpacked into the names in the tuple on the left-hand side.
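The packing and unpacking steps can also be written out separately, which may make the swap less mysterious. This is a minimal sketch; the names packed, first, and second are made up for the illustration.

```python
a, b = 'left', 'right'

packed = a, b            # packing: the comma builds the tuple ('left', 'right')
first, second = packed   # unpacking: each item is assigned to one name

b, a = a, b              # both steps at once: pack (a, b), unpack into b, a
print('%s %s' % (a, b))
```

The right-hand side is fully evaluated before any name on the left is rebound, which is why no temporary variable is needed.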
More unpacking examples:
>>> l = ['David', 'Pythonista', '+1-514-555-1234']
>>> name, title, phone = l
>>> name
'David'
>>> title
'Pythonista'
>>> phone
'+1-514-555-1234'
Useful in loops over structured data:
Above we created a list, l (David's info). people is a list of people containing two items, each of which is a 3-item list.
>>> people = [l, ['Guido', 'BDFL', 'unlisted']]
>>> for (name, title, phone) in people:
...     print name, phone
...
David +1-514-555-1234
Guido unlisted
Each item in people is unpacked into the (name, title, phone) tuple.
Unpacking can be nested arbitrarily; just be sure to match the structure on the left and right!
>>> david, (gname, gtitle, gphone) = people
>>> gname
'Guido'
>>> gtitle
'BDFL'
>>> gphone
'unlisted'
>>> david
['David', 'Pythonista', '+1-514-555-1234']
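What happens when the structures do not match is worth seeing once. The sketch below reuses the people data from this section; the contacts list and the deliberately wrong unpacking are additions for illustration, and the code is written to run on Python 2.6+ and Python 3.

```python
people = [
    ['David', 'Pythonista', '+1-514-555-1234'],
    ['Guido', 'BDFL', 'unlisted'],
]

# Matching structure: three names for three items.
contacts = []
for name, title, phone in people:
    contacts.append('%s (%s): %s' % (name, title, phone))

# Mismatched structure: three items, but only two names.
try:
    name, title = people[0]
except ValueError as error:
    mismatch = str(error)

print(contacts)
print(mismatch)
```

The exact wording of the error message differs between Python versions, but in both cases a ValueError is raised at the point of assignment.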
More about tuples
We saw that the comma is the tuple constructor, not the parentheses. Example:
>>> 1,
(1,)
The Python interpreter shows the parentheses for clarity, and I recommend that you use them too:
>>> (1,)
(1,)
Do not forget the comma!
>>> (1)
1
In a one-tuple, the trailing comma is required; in tuples of two or more items, the trailing comma is optional. For 0-tuples, or empty tuples, a pair of parentheses is the shortcut syntax:
>>> ()
()
>>> tuple()
()
A common typo is to leave a trailing comma in the code even though you do not want a tuple. It is easy to miss:
>>> value = 1,
>>> value
(1,)
So if you see a tuple where you did not expect one, look for a comma!
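The tuple forms discussed in this section can be checked side by side. This is a small sketch; the variable names are made up for the illustration.

```python
single = (1,)      # one-tuple: the trailing comma is required
grouped = (1)      # no comma: just the integer 1 in parentheses
accidental = 1,    # a stray trailing comma still builds a tuple
empty = ()         # empty tuple: parentheses alone are enough

for value in (single, grouped, accidental, empty):
    print('%r is a %s' % (value, type(value).__name__))
```

Printing the type, as done here, is a quick way to track down an unexpected tuple while debugging.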
Interactive "_"
This is a really useful feature that surprisingly few people know about.
In the interactive interpreter, whenever you evaluate an expression or call a function, the result is bound to a temporary name, _ (an underscore):
>>> 1 + 1
2
>>> _
2
_ stores the value of the last expression that the interpreter displayed. It's convenient!
But it works only in the interactive mode of the interpreter, not in modules.
It is especially useful when you are working on a task interactively and want to save the result for the next step:
>>> import math
>>> math.pi / 3
1.0471975511965976
>>> angle = _
>>> math.cos(angle)
0.50000000000000011
>>> _
0.50000000000000011
Composing strings from substrings
Let's start with a list of strings:
colors = ['red', 'blue', 'green', 'yellow']
We want to join all the strings together into one large string. Especially when the number of substrings is large...
Don't do this:
result = ''
for s in colors:
    result += s
This is very inefficient.
It has terrible memory usage and performance. The "summation" computes, stores, and then throws away the object at each intermediate step.
Instead, do this:
result = ''.join(colors)
The join() string method does all the copying in one pass.
When you are dealing with only a few dozen or a few hundred strings, this will not make a noticeable difference. But get in the habit of building strings efficiently, because it pays off with thousands of strings or when working in loops.
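If you want to check the difference on your own machine, a rough timing sketch might look like the following. The helper names build_with_plus and build_with_join are hypothetical, and the numbers depend heavily on the interpreter: CPython, for instance, can sometimes optimize += on strings in place, so treat the output as a measurement to interpret rather than a guaranteed result.

```python
import timeit

parts = ['red', 'blue', 'green', 'yellow'] * 250   # 1000 short strings


def build_with_plus(strings):
    """Concatenate step by step, creating intermediate strings."""
    result = ''
    for s in strings:
        result += s
    return result


def build_with_join(strings):
    """Let str.join do all the copying in one pass."""
    return ''.join(strings)


plus_seconds = timeit.timeit(lambda: build_with_plus(parts), number=200)
join_seconds = timeit.timeit(lambda: build_with_join(parts), number=200)
print('+=   : %.4f s' % plus_seconds)
print('join : %.4f s' % join_seconds)
```

Both functions produce the same string; only the way the pieces are copied differs.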
Composing strings, variation 1
Here are some ways to use the join() method.
If you want to put spaces between your substrings:
result = ' '.join(colors)
Or commas and spaces:
result = ', '.join(colors)
Here is a common case:
colors = ['red', 'blue', 'green', 'yellow']
print 'Choose', ', '.join(colors[:-1]), \
      'or', colors[-1]
To make a grammatically correct sentence, we want commas between all the values except the last two, which are separated by the word "or". The slice syntax does the job: the "slice until -1" ([:-1]) gives every item except the last one, and we join those with a comma and a space.
Of course, this code does not handle the corner cases, lists of length 0 or 1.
Output:
Choose red, blue, green or yellow
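One way to cover the corner cases of lists with 0 or 1 items is to wrap the idiom in a small function. The helper join_with_or below is hypothetical, not part of the original tutorial, and returning an empty string for an empty list is just one possible design choice.

```python
def join_with_or(items):
    """Join items as 'a, b or c' (hypothetical helper for illustration)."""
    items = list(items)
    if not items:
        return ''
    if len(items) == 1:
        return items[0]
    # All items but the last are comma-separated; the last follows "or".
    return '%s or %s' % (', '.join(items[:-1]), items[-1])


colors = ['red', 'blue', 'green', 'yellow']
print('Choose ' + join_with_or(colors))
```

With the four colors this prints the same sentence as the one-line version, and it degrades sensibly for shorter lists.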
Composing strings, variation 2
If you need to apply a function to generate substrings:
result = ''.join(fn(i) for i in items)
This uses a generator expression, which we will discuss later.
If you need to compute the substrings incrementally, accumulate them in a list first:
items = []
...
items.append(item) # many times
...
# items is now complete
result = ''.join(fn(i) for i in items)
We accumulate the parts in a list so that we can then apply the join string method, for greater efficiency.
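A complete, runnable version of the accumulate-then-join pattern might look like this. The function fn and the filtering rule are hypothetical stand-ins for whatever your program actually computes.

```python
def fn(number):
    """Hypothetical formatter: render one number as a bracketed field."""
    return '<%d>' % number


items = []
for number in range(1, 6):
    if number % 2:              # keep only the odd numbers
        items.append(number)    # many times, in real code

# items is now complete
result = ''.join(fn(i) for i in items)
print(result)
```

The list holds the raw parts while they are being collected, and the formatting and joining happen once at the end.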
The second part of the translation