How many objects does Python allocate when executing scripts?

Original author: Artem Golubin
Some Python programmers are surprised to learn how many temporary objects the Python interpreter allocates while running even a simple script.

CPython can collect statistics on allocated objects; to enable this, you need to compile it with additional flags.

./configure CFLAGS='-DCOUNT_ALLOCS' --with-pydebug 
make -s -j2

After compiling, we can open an interactive REPL and check the statistics:

>>> import sys
>>> sys.getcounts()
[('iterator', 7, 7, 4), ('functools._lru_cache_wrapper', 1, 0, 1), ('re.Match', 2, 2, 1),
('re.Pattern', 3, 2, 1), ('SubPattern', 10, 10, 8), ('Pattern', 3, 3, 1),
('IndexError', 4, 4, 1), ('Tokenizer', 3, 3, 1), ('odict_keys', 1, 1, 1),
('odict_iterator', 18, 18, 1), ('odict_items', 17, 17, 1), ('RegexFlag', 18, 8, 10),
('operator.itemgetter', 4, 0, 4), ('PyCapsule', 1, 1, 1), ('Repr', 1, 0, 1),
('_NamedIntConstant', 74, 0, 74), ('collections.OrderedDict', 5, 0, 5),
('EnumMeta', 5, 0, 5), ('DynamicClassAttribute', 2, 0, 2), ('_EnumDict', 5, 5, 1),
('TypeError', 1, 1, 1), ('method-wrapper', 365, 365, 2), ('_C', 1, 1, 1),
('symtable entry', 5, 5, 2), ('OSError', 1, 1, 1), ('Completer', 1, 0, 1),
('ExtensionFileLoader', 2, 0, 2), ('ModuleNotFoundError', 2, 2, 1),
('_Helper', 1, 0, 1), ('_Printer', 3, 0, 3), ('Quitter', 2, 0, 2),
('enumerate', 5, 5, 1), ('_io.IncrementalNewlineDecoder', 1, 1, 1),
('map', 25, 25, 1), ('_Environ', 2, 0, 2), ('async_generator', 2, 1, 1),
('coroutine', 2, 2, 1), ('zip', 1, 1, 1), ('longrange_iterator', 1, 1, 1),
('range_iterator', 7, 7, 1), ('range', 14, 14, 2), ('list_reverseiterator', 2, 2, 1),
('dict_valueiterator', 1, 1, 1), ('dict_values', 2, 2, 1), ('dict_keyiterator', 25, 25, 1),
('dict_keys', 5, 5, 1), ('bytearray_iterator', 1, 1, 1), ('bytearray', 4, 4, 1),
('bytes_iterator', 2, 2, 1), ('IncrementalEncoder', 2, 0, 2), ('_io.BufferedWriter', 2, 0, 2),
('IncrementalDecoder', 2, 1, 2), ('_io.TextIOWrapper', 4, 1, 4), ('_io.BufferedReader', 2, 1, 2),
('_abc_data', 39, 0, 39), ('mappingproxy', 199, 199, 1), ('ABCMeta', 39, 0, 39),
('CodecInfo', 1, 0, 1), ('str_iterator', 7, 7, 1), ('memoryview', 60, 60, 2),
('managedbuffer', 31, 31, 1), ('slice', 589, 589, 1), ('_io.FileIO', 33, 30, 5),
('SourceFileLoader', 29, 0, 29), ('set', 166, 101, 80), ('StopIteration', 33, 33, 1),
('FileFinder', 11, 0, 11), ('os.stat_result', 145, 145, 1), ('ImportError', 2, 2, 1),
('FileNotFoundError', 10, 10, 1), ('ZipImportError', 12, 12, 1), ('zipimport.zipimporter', 12, 12, 1),
('NameError', 4, 4, 1), ('set_iterator', 46, 46, 1), ('frozenset', 50, 0, 50), ('_ImportLockContext', 113, 113, 1),
('list_iterator', 305, 305, 5), ('_thread.lock', 92, 92, 10), ('_ModuleLock', 46, 46, 5), ('KeyError', 67, 67, 2),
('_ModuleLockManager', 46, 46, 5), ('generator', 125, 125, 1), ('_installed_safely', 52, 52, 5),
('method', 1095, 1093, 14), ('ModuleSpec', 58, 4, 54), ('AttributeError', 22, 22, 1),
('traceback', 154, 154, 3), ('dict_itemiterator', 45, 45, 1), ('dict_items', 46, 46, 1),
('object', 8, 1, 7), ('tuple_iterator', 631, 631, 3), ('cell', 71, 31, 42),
('classmethod', 58, 0, 58), ('property', 18, 2, 16), ('super', 360, 360, 1),
('type', 78, 3, 75), ('function', 1705, 785, 922), ('frame', 5442, 5440, 36),
('code', 1280, 276, 1063), ('bytes', 2999, 965, 2154), ('Token.MISSING', 1, 0, 1),
('stderrprinter', 1, 1, 1), ('MemoryError', 16, 16, 16), ('sys.thread_info', 1, 0, 1),
('sys.flags', 2, 0, 2), ('types.SimpleNamespace', 1, 0, 1), ('sys.version_info', 1, 0, 1),
('sys.hash_info', 1, 0, 1), ('sys.int_info', 1, 0, 1), ('float', 584, 569, 20),
('sys.float_info', 1, 0, 1), ('module', 56, 0, 56), ('staticmethod', 16, 0, 16),
('weakref', 505, 82, 426), ('int', 3540, 2775, 766), ('member_descriptor', 246, 10, 239),
('list', 992, 919, 85), ('getset_descriptor', 240, 4, 240), ('classmethod_descriptor', 12, 0, 12),
('method_descriptor', 678, 0, 678), ('builtin_function_or_method', 1796, 1151, 651), ('wrapper_descriptor', 1031, 5, 1026),
('str', 16156, 9272, 6950), ('dict', 1696, 900, 810), ('tuple', 10367, 6110, 4337)]

Let's make the output more readable:

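def print_allocations(top_k=None):
    # Each entry is a tuple: (type name, allocs, deallocs, max)
    allocs = sys.getcounts()
    if top_k:
        # Keep only the top_k types with the most allocations
        allocs = sorted(allocs, key=lambda tup: tup[1], reverse=True)[:top_k]
    for obj in allocs:
        alive = obj[1] - obj[2]  # allocated but not yet deallocated
        print("Type {},  allocs: {}, deallocs: {}, max: {}, alive: {}".format(*obj, alive))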

>>> print_allocations(10)
Type str,  allocs: 17328, deallocs: 10312, max: 7016, alive: 7016
Type tuple,  allocs: 10550, deallocs: 6161, max: 4389, alive: 4389
Type frame,  allocs: 5445, deallocs: 5442, max: 36, alive: 3
Type int,  allocs: 3988, deallocs: 3175, max: 813, alive: 813
Type bytes,  allocs: 3031, deallocs: 1044, max: 2154, alive: 1987
Type builtin_function_or_method,  allocs: 1809, deallocs: 1164, max: 651, alive: 645
Type dict,  allocs: 1726, deallocs: 930, max: 815, alive: 796
Type function,  allocs: 1706, deallocs: 811, max: 922, alive: 895
Type code,  allocs: 1284, deallocs: 304, max: 1063, alive: 980
Type method,  allocs: 1095, deallocs: 1093, max: 14, alive: 2

Where:

  • allocs - how many objects were allocated since the start of the interpreter
  • deallocs - how many objects were deleted (manually or automatically)
  • alive - the number of live (current) objects (allocs - deallocs)
  • max - the maximum number of living objects since the start of the interpreter

As you can see, even an idle Python REPL managed to allocate 17,328 strings and 10,550 tuples. That is a huge number of objects! Keep in mind, though, that to run the REPL, Python automatically imports additional modules that are not imported when running an empty script.
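On a stock interpreter built without COUNT_ALLOCS, sys.getcounts() is not available. A rough approximation of the "alive" column can be sketched with the standard gc module. This is only a sketch under a strong assumption: gc.get_objects() reports just the objects tracked by the garbage collector (mostly containers), so types such as str, int, and bytes are missing, and nothing is known about deallocations. The names live_objects_by_type and AllocDemoMarker are hypothetical and exist only for this illustration.

```python
import gc
from collections import Counter


def live_objects_by_type(top_k=None):
    """Count live, gc-tracked objects grouped by type name."""
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    # most_common(None) returns every type, sorted by count
    return counts.most_common(top_k)


class AllocDemoMarker:
    """Hypothetical class, used only to show that new objects are counted."""


# Keep 100 instances alive so that they show up in the statistics
markers = [AllocDemoMarker() for _ in range(100)]
marker_count = dict(live_objects_by_type())["AllocDemoMarker"]

for name, alive in live_objects_by_type(5):
    print("Type {},  alive: {}".format(name, alive))
```

The exact numbers depend on the interpreter version and on the modules that have already been imported.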

Now let's test a “Hello, World” application in Flask:

import sys
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    # print_allocations() is the helper shown earlier; it must be defined in this module
    print_allocations(15)
    return 'Hello, World!'

./python -m flask run
ab -n 100 http://127.0.0.1:5000/

After sending 100 HTTP requests to our server, the statistics look like this:

Type str,  allocs: 192649, deallocs: 138892, max: 54320, alive: 53757
Type frame,  allocs: 191752, deallocs: 191714, max: 158, alive: 38
Type tuple,  allocs: 183474, deallocs: 150069, max: 33581, alive: 33405
Type int,  allocs: 85154, deallocs: 81100, max: 4115, alive: 4054
Type bytes,  allocs: 31671, deallocs: 14331, max: 17381, alive: 17340
Type list,  allocs: 29846, deallocs: 27541, max: 2415, alive: 2305
Type builtin_function_or_method,  allocs: 28525, deallocs: 27572, max: 957, alive: 953
Type dict,  allocs: 19900, deallocs: 14800, max: 5280, alive: 5100
Type method,  allocs: 15170, deallocs: 15105, max: 74, alive: 65
Type function,  allocs: 14761, deallocs: 7086, max: 7711, alive: 7675
Type slice,  allocs: 12521, deallocs: 12521, max: 1, alive: 0
Type list_iterator,  allocs: 10795, deallocs: 10795, max: 35, alive: 0
Type code,  allocs: 9849, deallocs: 1749, max: 8107, alive: 8100
Type tuple_iterator,  allocs: 8938, deallocs: 8938, max: 4, alive: 0
Type float,  allocs: 6033, deallocs: 5889, max: 152, alive: 144

As you can see, the Flask application allocated 847,261 objects since the start of the interpreter. Most of them (714,336) were temporary and were removed as soon as they were no longer needed. The remaining 132,925 objects are still in memory.
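The arithmetic behind these totals is simple: sum the allocs and deallocs columns, and the difference is the number of objects still alive. Below is a minimal sketch that works on tuples in the same (name, allocs, deallocs, max) layout that sys.getcounts() returns. The helper name summarize is hypothetical, and the sample contains only three rows copied from the output above, so its totals are smaller than the article's overall figures.

```python
def summarize(counts):
    """Return (allocated, deallocated, alive) totals for (name, allocs, deallocs, max) tuples."""
    allocated = sum(entry[1] for entry in counts)
    deallocated = sum(entry[2] for entry in counts)
    return allocated, deallocated, allocated - deallocated


# Three rows copied from the Flask statistics above
sample = [
    ("str", 192649, 138892, 54320),
    ("frame", 191752, 191714, 158),
    ("slice", 12521, 12521, 1),
]

allocated, deallocated, alive = summarize(sample)
print("allocated: {}, deallocated: {}, alive: {}".format(allocated, deallocated, alive))
```

On a COUNT_ALLOCS build, summarize(sys.getcounts()) would give the totals for every type at once.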

Frames and code objects


In the example above, you can see a lot of frame and code objects. What are they for?

In short, each code object stores a block of compiled code, while frame objects are used to execute these blocks and are organized as a call stack. In Python, the most common kind of block is a function. Each new function needs its own code object, and each call to that function needs a separate frame object, where Python stores the local variables. In addition to local variables, each frame object stores a lot of auxiliary data needed to execute the function.
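The relationship between code objects and frames can be observed directly. The sketch below relies on sys._getframe(), which is a CPython-specific function, and on a hypothetical recursive function named countdown. It keeps references to the frames so that all of them stay alive for inspection.

```python
import sys


def countdown(n, frames):
    # sys._getframe() returns the frame object that executes this very call
    frames.append(sys._getframe())
    if n > 0:
        countdown(n - 1, frames)


frames = []
countdown(2, frames)  # three nested calls: n = 2, 1, 0

# Every call received its own frame object...
distinct_frames = len({id(frame) for frame in frames})
# ...but all frames execute the same single code object of the function
shared_code = all(frame.f_code is countdown.__code__ for frame in frames)
# Each frame stores its own copy of the local variable n
local_n = [frame.f_locals["n"] for frame in frames]

print(distinct_frames, shared_code, local_n)
```

One function definition therefore produces one code object, while the number of frame objects grows with the number of calls, which matches the high allocs value for frame in the statistics.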

Where do all these objects come from?


Python is a very dynamic language, and that dynamism has a cost. To support its dynamic features, the interpreter creates a large number of temporary objects that play a supporting role.

For example, a simple function declaration creates at least 5 dictionaries, 5 tuples, and 4 lists. These objects live until the end of the script. In turn, all of these objects store other objects (their elements): dozens, sometimes hundreds, of additional objects used for the internal description of the compiled function. A typical class definition can allocate hundreds of container objects (dictionaries, tuples, lists). Unfortunately, the exact number of allocated objects cannot be calculated automatically here, so these figures are approximate.
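Some of the containers that describe a compiled function are visible from Python itself. The sketch below only lists a few well-known attributes of a hypothetical function named greet. It does not reproduce the counts quoted above, because many of the objects created during compilation are internal and temporary.

```python
def greet(name, punctuation="!"):
    message = "Hello, " + name
    return message + punctuation


code = greet.__code__

# Containers that stay attached to the function for its whole lifetime
containers = {
    "co_consts": code.co_consts,        # constants used by the bytecode
    "co_names": code.co_names,          # global and attribute names
    "co_varnames": code.co_varnames,    # arguments and local variables
    "__defaults__": greet.__defaults__, # default argument values
    "__dict__": greet.__dict__,         # arbitrary function attributes
}

for attr, value in containers.items():
    print("{}: {} with {} element(s)".format(attr, type(value).__name__, len(value)))
```

Each of these containers holds further objects (strings, numbers, nested code objects), which is where the additional allocations come from.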

To allocate such a large number of objects quickly, Python relies on a large, multi-layered memory management system that optimizes object allocation.
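The activity of this allocator can be observed on a regular build with sys.getallocatedblocks(), which returns the number of memory blocks currently allocated by the interpreter. This is a minimal sketch: the function is CPython-specific, it counts memory blocks rather than objects, and the exact numbers vary between versions and runs.

```python
import sys

baseline = sys.getallocatedblocks()

# Allocate 10,000 small objects and keep them alive
temporary = [[] for _ in range(10_000)]
held = sys.getallocatedblocks()

# Dropping the last reference frees the objects immediately
del temporary
released = sys.getallocatedblocks()

print("baseline: {}, while held: {}, after release: {}".format(baseline, held, released))
```

The block count rises while the lists are alive and drops again once they are released, which is the same allocate-and-free pattern that the temporary objects in the statistics go through.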

It is sometimes surprising how much detail interpreted languages hide from us. Python lets you write good code without thinking about many of these problems and details.

PS: I am the author of this article, so feel free to ask any questions.
