Python from the inside out. Objects. Tail

Original author: Yaniv Aknin
1. Introduction
2. Objects. Head
3. Objects. Tail
4. Structures of the process

In the previous part we began studying the Python object system: we worked out what exactly counts as an object and how objects do their job. Here we continue with that topic.

Welcome to the third part of our series of articles on Python's internals (I strongly recommend reading the second part if you have not done so already; otherwise you won't understand anything). In this episode we'll talk about an important concept that we haven't gotten to yet: attributes. If you have ever written anything in Python, you have used them. An object's attributes are other objects associated with it, accessible through the . (dot) operator, as in: >>> my_object.attribute_name. Let me briefly describe how Python behaves when attributes are accessed. This behavior depends on the type of the object whose attribute is being accessed (have you noticed yet that this is true of every operation on objects?).

A type can define special methods that customize access to the attributes of its instances. These methods are described in the Python data model documentation (as we already know, they are bound to the appropriate type slots by the function fixup_slot_dispatchers when the type is created... you did read the previous post, right?). These methods can do anything. Whether you define your type in C or in Python, you can write methods that store and fetch attributes from some incredible repository: if you like, you can send and receive attributes by radio from the ISS, or even keep them in a relational database. But under more or less ordinary conditions, these methods simply write the attribute as a key-value pair (attribute name / attribute value) into some dictionary of the object when the attribute is set, and return the attribute from that dictionary when it is requested (or raise AttributeError if the dictionary has no key matching the requested name). It's all so simple and beautiful. Thank you for your attention, I think we'll wrap up here.
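As a rough illustration of the "incredible repository" idea, here is a minimal sketch of a Python class whose attribute methods bypass the instance dictionary entirely. The names `_REPOSITORY` and `Exotic` are made up for this example, and a plain module-level dict stands in for the radio link or database.

```python
# Hypothetical external store: a module-level dict standing in for
# a database, a radio link to the ISS, or anything else.
_REPOSITORY = {}


class Exotic:
    def __getattr__(self, name):
        # Called only after the normal lookup has failed.
        try:
            return _REPOSITORY[(id(self), name)]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Called on every assignment; nothing reaches the instance __dict__.
        _REPOSITORY[(id(self), name)] = value

    def __delattr__(self, name):
        try:
            del _REPOSITORY[(id(self), name)]
        except KeyError:
            raise AttributeError(name) from None


e = Exotic()
e.answer = 42
print(e.answer)    # 42, fetched from _REPOSITORY
print(e.__dict__)  # {} because the instance dictionary was never touched
```

Note that `__getattr__` is only a fallback: it runs when the ordinary lookup finds nothing, which is why `e.__dict__` still resolves normally.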

Hold on! My friends, things have only just begun to hit the fan. If we go down, we all go down together. I suggest we study together what is going on inside the interpreter and ask, as we usually do, a few annoying questions.

Read the code carefully, or skip straight to the textual description below:

>>> print(object.__dict__)
{'__ne__': <slot wrapper '__ne__' of 'object' objects>, ... , '__ge__': <slot wrapper '__ge__' of 'object' objects>}
>>> object.__ne__ is object.__dict__['__ne__']
True
>>> o = object()
>>> o.__class__
<class 'object'>
>>> o.a = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute 'a'
>>> o.__dict__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute '__dict__'
>>> class C:
...     A = 1
... 
>>> C.__dict__['A']
1
>>> C.A
1
>>> o2 = C()
>>> o2.a = 1
>>> o2.__dict__
{'a': 1}
>>> o2.__dict__['a2'] = 2
>>> o2.a2
2
>>> C.__dict__['A2'] = 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'dict_proxy' object does not support item assignment
>>> C.A2 = 2
>>> C.__dict__['A2'] is C.A2
True
>>> type(C.__dict__) is type(o2.__dict__)
False
>>> type(C.__dict__)
<class 'dict_proxy'>
>>> type(o2.__dict__)
<class 'dict'>
Let's translate this into human language: object (the simplest built-in type, in case you forgot) has, as we can see, a dictionary, and everything we can reach through its attributes is identical to what we see in object.__dict__. It should surprise us that instances of the type object (for example, the object o) do not support setting additional attributes and have no __dict__ at all, yet they do support access to existing attributes (try o.__class__, o.__hash__ and so on; these expressions do return something). After that, we created a new class C inheriting from object, added an attribute A, and saw that it is available both as C.A and as C.__dict__['A'], as expected. Then we created o2, an instance of class C, and saw that setting an attribute changes __dict__ and, vice versa, changing __dict__ affects the attributes. Next, we were surprised to learn that a class's __dict__ is read-only, although setting attributes (C.A2) works just fine. Finally, we saw that the __dict__ of an instance and the __dict__ of a class have different types: the familiar dict and the mysterious dict_proxy, respectively (dict_proxy was renamed mappingproxy in Python 3.3). And if all that is not enough, recall the puzzle from the previous part: if instances of plain object (such as o) have no __dict__, and C extends object without adding anything significant, where do instances of C (such as o2) get their __dict__ from?

Yes, curiouser and curiouser! But don't worry, all in good time. First, let's look at how a type's __dict__ is implemented. If you look at the definition of PyTypeObject (I strongly recommend reading it!), you will see a slot called tp_dict, ready to hold a pointer to a dictionary. Every type must have this slot filled. The dictionary is put there by a call to ./Objects/typeobject.c: PyType_Ready, which happens either when the interpreter is initialized (remember Py_Initialize? It calls _Py_ReadyTypes, which calls PyType_Ready for all known types) or when the user creates a new type dynamically (type_new calls PyType_Ready on each newborn type before returning it). In fact, every name you define in a class statement ends up in the new type's __dict__ (see the line type->tp_dict = dict = PyDict_Copy(dict); in ./Objects/typeobject.c: type_new). Don't forget that types are objects too, so they also have a type, namely type, whose slots hold functions that provide attribute access in the appropriate way. These functions use the dictionary that every type has, pointed to by tp_dict, to store and retrieve attributes. Thus, accessing an attribute of a type really means accessing the private dictionary of an instance of type, pointed to by that type's structure.

class Foo:
    bar = "baz"
print(Foo.bar)

In this example, the last line accesses an attribute of a type. To find the attribute bar, the attribute access function of Foo's type, that is, of type (the function tp_getattro points to), is called. Roughly the same thing happens when attributes are set or deleted (for the interpreter, by the way, "deleting" is just setting the value to NULL). I hope everything has been clear so far; that covers attribute access on types.
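The behaviour described above can be observed from pure Python. The following is a small sketch, assuming CPython; building the class with a direct call to type() is just a convenient way to hold on to the namespace dictionary and see that the type keeps a copy of it.

```python
namespace = {"bar": "baz"}
# Roughly what a class statement does: type(name, bases, namespace).
Foo = type("Foo", (), namespace)

print(Foo.bar)               # baz, looked up in the type's own dictionary

# The type stores a copy of the namespace, so later changes to the
# original dictionary are invisible to it.
namespace["late"] = 1
print(hasattr(Foo, "late"))  # False

# Setting and deleting go through the attribute-setting slot of type
# and end up modifying the same dictionary.
Foo.qux = 2
print(vars(Foo)["qux"])      # 2
del Foo.qux
print("qux" in vars(Foo))    # False
```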

Before looking at access to instance attributes, let me tell you about a little-known (but very important!) concept: the descriptor. Descriptors play a special role in instance attribute access, so I should explain what they are. An object is considered a descriptor if one or both of the slots tp_descr_get and tp_descr_set of its type are filled with non-NULL values. These slots are tied to the special methods __get__, __set__ and __delete__ (for example, if you define a class with a __get__ method, that method will be bound to the tp_descr_get slot, and an instance of that class will be a descriptor). Finally, an object is considered a data descriptor if the tp_descr_set slot of its type is filled with a non-NULL value. As we will see, descriptors play an important role in attribute access, and I will give a few more explanations and links to the relevant documentation.
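To make the classification concrete, here is a minimal sketch that approximates the slot checks from Python. The helper `classify` and the classes `NonData` and `Data` are invented for this example; checking for `__set__` or `__delete__` on the type is only a Python-level stand-in for "tp_descr_set is non-NULL".

```python
class NonData:
    def __get__(self, instance, owner):
        return "non-data"


class Data:
    def __get__(self, instance, owner):
        return "data"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


def classify(obj):
    """Approximate the C-level slot checks by looking at the object's type."""
    tp = type(obj)
    if hasattr(tp, "__set__") or hasattr(tp, "__delete__"):
        return "data descriptor"
    if hasattr(tp, "__get__"):
        return "non-data descriptor"
    return "not a descriptor"


print(classify(NonData()))              # non-data descriptor
print(classify(Data()))                 # data descriptor
print(classify(42))                     # not a descriptor
print(classify(lambda: None))           # non-data descriptor (plain functions define __get__)
print(classify(property(lambda s: 1)))  # data descriptor
```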

So, we have figured out what descriptors are and how attribute access on types works. But most objects are not types; that is, their type is not type but something more prosaic, such as int, dict or a user-defined class. They all rely on generic attribute access functions that are either defined in the type or inherited from the type's parent when the type was created (we discussed this topic, slot inheritance, in "Head"). The algorithm of the generic attribute access function (PyObject_GenericGetAttr) looks like this:

  1. Search the dictionary of the instance's type and the dictionaries of all the type's parents. If a data descriptor is found, call its tp_descr_get function and return the result. If something else is found, remember it just in case (say, under the name X).
  2. Search the object's own dictionary and return the result if it is found.
  3. If nothing was found in the object's dictionary, check X, if it was set; if X is a descriptor, call its tp_descr_get function and return the result. If X is an ordinary object, return it.
  4. Finally, if nothing was found, raise AttributeError.
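The four steps above can be sketched in pure Python. This is only an approximation for illustration, not the actual C code of PyObject_GenericGetAttr: the function name `generic_getattr`, the `_MISSING` sentinel and the demo class `C` are all made up here, and the slot checks are emulated with `hasattr` on the found object's type.

```python
_MISSING = object()  # sentinel meaning "nothing found"


def generic_getattr(obj, name):
    tp = type(obj)

    # Step 1: search the type and all of its parents, in MRO order.
    found = _MISSING
    for klass in tp.__mro__:
        if name in vars(klass):
            found = vars(klass)[name]
            break
    if found is not _MISSING:
        ftype = type(found)
        is_data = hasattr(ftype, "__set__") or hasattr(ftype, "__delete__")
        if is_data and hasattr(ftype, "__get__"):
            return ftype.__get__(found, obj, tp)

    # Step 2: search the object's own dictionary, if it has one.
    try:
        instance_dict = vars(obj)
    except TypeError:
        instance_dict = {}
    if name in instance_dict:
        return instance_dict[name]

    # Step 3: fall back to whatever was remembered in step 1 (X).
    if found is not _MISSING:
        if hasattr(type(found), "__get__"):
            return type(found).__get__(found, obj, tp)
        return found

    # Step 4: nothing was found anywhere.
    raise AttributeError(
        "%r object has no attribute %r" % (tp.__name__, name))


class C:
    klass_attr = "from class"

    @property
    def prop(self):
        return "from property"

    def method(self):
        return "called"


o = C()
o.inst = "from instance"
o.__dict__["prop"] = "shadow attempt"   # loses to the data descriptor

print(generic_getattr(o, "prop"))        # from property
print(generic_getattr(o, "inst"))        # from instance
print(generic_getattr(o, "klass_attr"))  # from class
print(generic_getattr(o, "method")())    # called
```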

Now we understand that descriptors can execute code when they are accessed as attributes (that is, when you write foo = o.a or o.a = foo, a runs code). This is powerful functionality, used to implement some of Python's "magic" features. Data descriptors are even more powerful, because they take precedence over instance attributes (if you have an object o of class C, class C has a data descriptor foo, and o has an instance attribute foo, then evaluating o.foo returns the result of the descriptor). Read up on what descriptors are and how to use them. I especially recommend the first link ("what"): although the text looks daunting at first, after a careful and thoughtful reading you will find it much simpler and shorter than my ramblings. You should also read Raymond Hettinger's excellent article describing descriptors in Python 2.x; apart from the removal of unbound methods, the article is still relevant for 3.x and is recommended reading. Descriptors are a very important concept, and I advise you to spend some time studying the resources listed in order to understand them and absorb the idea. For brevity I won't go into more detail here, but I will give a (very simple) example of their behavior in the interpreter:

>>> class ShoutingInteger(int):
...     # __get__ implements the tp_descr_get slot
...     def __get__(self, instance, owner):
...             print('I was gotten from %s (instance of %s)'
...                   % (instance, owner))
...             return self
... 
>>> class Foo:
...     Shouting42 = ShoutingInteger(42)
... 
>>> foo = Foo()
>>> 100 - foo.Shouting42
I was gotten from <__main__.Foo object at 0xb7583c8c> (instance of <class '__main__.Foo'>)
58
# Remember: only descriptors found on types are used!
>>> foo.Silent666 = ShoutingInteger(666)
>>> 100 - foo.Silent666
-566
>>>
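The precedence rule for data descriptors can be checked with a minimal sketch such as the following. The class names `NonDataDescr` and `DataDescr` are invented for the example; the instance dictionary is written to directly so that both kinds of descriptor face a competing instance attribute of the same name.

```python
class NonDataDescr:
    def __get__(self, instance, owner):
        return "non-data descriptor on the type"


class DataDescr:
    def __get__(self, instance, owner):
        return "data descriptor on the type"

    def __set__(self, instance, value):
        # Store the value directly in the instance dictionary.
        instance.__dict__["data"] = value


class C:
    nondata = NonDataDescr()
    data = DataDescr()


o = C()
o.__dict__["nondata"] = "instance value"
o.__dict__["data"] = "instance value"

print(o.nondata)  # instance value: the instance dictionary wins
print(o.data)     # data descriptor on the type: the data descriptor wins
```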

Note that we have just gained a complete understanding of object-oriented inheritance in Python: since attribute lookup starts with the object's type and then continues through all of its parents, we understand that accessing attribute A of an object O of class C1, which inherits from C2, which in turn inherits from C3, can return A from O, C1, C2 or C3, as determined by the method resolution order, which is well described elsewhere. This way of resolving attributes, together with slot inheritance, is enough to explain most of Python's inheritance functionality (although the devil, as usual, is in the details).
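A small sketch of that lookup chain, using the same names as the paragraph above (the attribute values and the extra attribute `only_c3` are made up for the example):

```python
class C3:
    A = "from C3"
    only_c3 = "defined in C3 only"


class C2(C3):
    A = "from C2"


class C1(C2):
    pass


O = C1()

# The method resolution order lists the types that are searched, in order.
print([klass.__name__ for klass in C1.__mro__])  # ['C1', 'C2', 'C3', 'object']

print(O.A)        # from C2: the first type in the MRO that defines A
print(O.only_c3)  # defined in C3 only

O.A = "from O"    # now the instance dictionary also holds A
print(O.A)        # from O
```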

We have learned a lot today, but it is still unclear where references to instance dictionaries are stored. We have already seen the definition of PyObject, and it certainly contains no pointer to such a dictionary. If not there, then where? The answer is rather unexpected. If you look closely at PyTypeObject (a fine pastime! Read it daily!), you will notice a field called tp_dictoffset. This field gives the byte offset, within the C structure allocated for instances of the type, at which a pointer to a regular Python dictionary is stored. Under normal conditions, when a new type is created, the size of the memory block needed for its instances is computed, and that size is larger than that of a bare PyObject. The extra space is typically used (among other things) to store a pointer to the dictionary (all this happens in ./Objects/typeobject.c: type_new; read on from the line may_add_dict = base->tp_dictoffset == 0;). Using gdb, we can easily break into this space and look at the object's private dictionary:

>>> class C: pass
... 
>>> o = C()
>>> o.foo = 'bar'
>>> o
<__main__.C object at 0x846b06c>
>>>
# entering GDB
Program received signal SIGTRAP, Trace/breakpoint trap.
0x0012d422 in __kernel_vsyscall ()
(gdb) p ((PyObject *)(0x846b06c))->ob_type->tp_dictoffset
$1 = 16
(gdb) p *((PyObject **)(((char *)0x846b06c)+16))
$3 = {u'foo': u'bar'}
(gdb) 

We created a new class and an instance of it, set an attribute on the instance (o.foo = 'bar'), entered gdb, dereferenced the object's type (C) and found its tp_dictoffset (16), and then checked what sits at that offset in the object's C structure. Unsurprisingly, we found the object's dictionary there, with a single key foo pointing to the value bar. Naturally, if we check tp_dictoffset on a type whose instances have no __dict__, such as object, we find zero there. Gives you goosebumps, huh?
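You can peek at the same field without gdb: CPython exposes tp_dictoffset on type objects as the read-only attribute `__dictoffset__`. The sketch below assumes CPython; the exact nonzero value depends on the interpreter version and build, so it will not necessarily be the 16 seen in the gdb session.

```python
class C:
    pass


# Types whose instances carry no dictionary pointer report zero.
print(object.__dictoffset__)  # 0
print(int.__dictoffset__)     # 0

# For an ordinary user-defined class the field is nonzero; the exact
# value is version- and build-dependent.
print(C.__dictoffset__ != 0)  # True

print(hasattr(object(), "__dict__"))  # False
print(hasattr(C(), "__dict__"))       # True
```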

The fact that type dictionaries and instance dictionaries look alike but are implemented quite differently can be confusing. A few mysteries remain. Let's recap and work out what we have missed: we define an empty class C inheriting from object and create an object o of that class; extra memory is allocated for the dictionary pointer at offset tp_dictoffset (the space is reserved from the very start, but the dictionary itself is allocated only on the first access of any kind; sneaky...). Then we evaluate o.__dict__ in the interpreter; it compiles to bytecode with a LOAD_ATTR instruction, which calls PyObject_GetAttr, which dereferences the type of o and finds the tp_getattro slot, which starts the standard attribute lookup described above and implemented in PyObject_GenericGetAttr. So, after all this has happened, what actually returns our object's dictionary? We know where the dictionary is stored, but notice that it does not contain __dict__ itself, so a chicken-and-egg problem arises: what returns the dictionary to us when we access __dict__, if __dict__ is not in the dictionary?

Something that takes precedence over the object's dictionary: a descriptor. See:

>>> class C: pass
... 
>>> o = C()
>>> o.__dict__
{}
>>> C.__dict__['__dict__']
<attribute '__dict__' of 'C' objects>
>>> type(C.__dict__['__dict__'])
<class 'getset_descriptor'>
>>> C.__dict__['__dict__'].__get__(o, C)
{}
>>> C.__dict__['__dict__'].__get__(o, C) is o.__dict__
True
>>> 

Wow! It turns out there is something called getset_descriptor (file ./Objects/typeobject.c), a group of functions implementing the descriptor protocol, which must be present in the __dict__ of the type object. This descriptor intercepts every attempt to access o.__dict__ on instances of the type and returns whatever it likes; in our case, the pointer to the dictionary at offset tp_dictoffset within o. This also explains why we saw dict_proxy a little earlier. If tp_dict holds a pointer to a plain dictionary, why do we see it wrapped in an object that cannot be written to? That is the doing of the __dict__ descriptor of the type's own type, type.
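As a quick check that this really is a data descriptor taking precedence over the instance dictionary, consider the following sketch (CPython assumed; the "decoy" key is just a made-up way to put a competing entry named `__dict__` into the instance dictionary):

```python
class C:
    pass


o = C()
descr = vars(C)["__dict__"]

print(type(descr).__name__)               # getset_descriptor
print(hasattr(type(descr), "__set__"))    # True, so it is a data descriptor
print(descr.__get__(o, C) is o.__dict__)  # True

# A key named "__dict__" inside the instance dictionary cannot shadow
# the descriptor, because data descriptors are consulted first.
o.__dict__["__dict__"] = "decoy"
print(isinstance(o.__dict__, dict))       # True
print(o.__dict__["__dict__"])             # decoy
```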

>>> type(C)
<class 'type'>
>>> type(C).__dict__['__dict__']
<attribute '__dict__' of 'type' objects>
>>> type(C).__dict__['__dict__'].__get__(C, type)

This descriptor is a function that wraps the dictionary in a simple object that imitates the behavior of a regular dictionary, except that it is read-only. Why is it so important to keep users from meddling with a type's __dict__? Because a type's namespace may contain special methods, such as __sub__. When we create a type with special methods, or when we set them on the type through attribute assignment, the function update_one_slot is run to bind those methods to the type's slots, as happened with the subtraction operation in the previous post. If we could add such methods directly to the type's __dict__, they would not be bound to the slots, and we would end up with a type that looks like what we need (for example, it has __sub__ in its dictionary) but behaves differently.
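Both halves of that argument can be observed from Python. In this minimal sketch (the class `C` and the lambda bodies are made up), a direct write to the type's `__dict__` is refused, while ordinary attribute assignment both stores the method and makes the operator work:

```python
class C:
    pass


# Writing through the read-only proxy is refused.
try:
    C.__dict__["__sub__"] = lambda self, other: "direct write"
except TypeError as exc:
    print("refused:", exc)

# Without __sub__, subtraction is not supported.
try:
    C() - 1
except TypeError:
    print("no subtraction yet")

# Setting the attribute on the type updates the dictionary and the slot.
C.__sub__ = lambda self, other: "subtracted"
print(C() - 1)               # subtracted
print("__sub__" in vars(C))  # True
```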

We crossed the 2000-word line, beyond which the reader's attention fades rapidly, long ago, and I still haven't talked about __slots__. How about reading up on them on your own, daredevils? You have everything you need to tackle them alone! Read the linked documentation, play with __slots__ a bit in the interpreter, look at the sources and explore them with gdb. Enjoy. In the next episode I think we will leave objects for a while and talk about interpreter state and thread state. I hope it will be interesting. But even if it isn't, you still need to know it. What I can say for sure is that girls really dig guys who know about such things.

You know what? Not just girls. We like such guys too. Join us; it's more fun together.
