Python memory usage

How much memory does 1 million integers take?
I have often wondered how efficiently Python uses memory compared to other programming languages. For example, how much memory is needed to work with 1 million integers? Or with the same number of strings of arbitrary length?
As it turns out, in Python you can get the necessary information right from the interactive console, without digging through the C source code (although, to be thorough, we will still peek into it).
Having satisfied our curiosity, we will get inside the data types and find out what exactly the memory is spent on.
All examples were run in CPython 2.7.4 on a 32-bit machine. A table with the corresponding figures for a 64-bit machine is given at the end.
Necessary tools
sys.getsizeof() and the __sizeof__() method
The first tool we need is in the standard sys module. Quoting the official documentation:
sys.getsizeof(object[, default])
Returns the size of the object in bytes.
If a default is given, it is returned when the object does not provide a way to determine its size; otherwise a TypeError is raised.
getsizeof() calls the object's __sizeof__ method and adds the extra information stored for the garbage collector, if the object is managed by it.
The algorithm of getsizeof (), rewritten in Python, might look like this:
Py_TPFLAGS_HAVE_GC = 1 << 14  # constant; in binary it is 0b100000000000000

def sys_getsizeof(obj, default=None):
    if hasattr(obj, '__sizeof__'):
        size = obj.__sizeof__()
    elif default is not None:
        return default
    else:
        raise TypeError('Object does not have a __sizeof__ attribute')
    # If the object's type has the HAVE_GC flag set,
    # add the size of the PyGC_Head header
    if type(obj).__flags__ & Py_TPFLAGS_HAVE_GC:
        size = size + 12  # sizeof(PyGC_Head): 12 bytes on a 32-bit machine
    return size
Where PyGC_Head is a doubly linked list node used by the garbage collector to detect circular references. In the source code it is represented by the following structure:
typedef union _gc_head {
    struct {
        union _gc_head *gc_next;
        union _gc_head *gc_prev;
        Py_ssize_t gc_refs;
    } gc;
    long double dummy;  /* force worst-case alignment */
} PyGC_Head;
The size of PyGC_Head will be 12 bytes on a 32 bit machine and 24 bytes on a 64 bit machine.
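We can see this overhead directly from the console: for a GC-tracked object, the difference between sys.getsizeof() and __sizeof__() is exactly the size of PyGC_Head. A quick check (a sketch; the number assumes the 32-bit build used throughout this article):
>>> import sys
>>> # an empty tuple is GC-tracked, so getsizeof() adds the PyGC_Head size
>>> sys.getsizeof(()) - ().__sizeof__()
12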
Let's try getsizeof() in the console and see what happens:
>>> import sys
>>> GC_FLAG = 1 << 14
>>> sys.getsizeof(1)
12
>>> (1).__sizeof__()
12
>>> bool(type(1).__flags__ & GC_FLAG)
False
>>> sys.getsizeof(1.1)
16
>>> (1.1).__sizeof__()
16
>>> bool(type(1.1).__flags__ & GC_FLAG)
False
>>> sys.getsizeof('')
21
>>> ''.__sizeof__()
21
>>> bool(type('').__flags__ & GC_FLAG)
False
>>> sys.getsizeof('hello')
26
>>> sys.getsizeof(tuple())
24
>>> tuple().__sizeof__()
12
>>> bool(type(tuple()).__flags__ & GC_FLAG)
True
>>> sys.getsizeof(tuple((1, 2, 3)))
36
With the exception of the magic of checking flags, everything is very simple.
As you can see from the example, an int and a float occupy 12 and 16 bytes, respectively. A str takes 21 bytes plus one more byte for each character of its content. An empty tuple occupies 12 bytes, plus 4 bytes per element. For simple data types (those that do not contain references to other objects and therefore are not tracked by the garbage collector), the value of sys.getsizeof() is equal to the value returned by the __sizeof__() method.
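The per-item costs are easy to confirm by subtracting the size of an empty container (a small check using the numbers from this 32-bit session):
>>> sys.getsizeof('hello') - sys.getsizeof('')         # one byte per character
5
>>> sys.getsizeof((1, 2, 3)) - sys.getsizeof(tuple())  # 4 bytes (one pointer) per element
12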
id() and ctypes.string_at()
Now let's find out what exactly the memory is spent on.
For this we need two things: first, to find out where the object is stored, and second, to get direct read access to that memory. Although Python carefully shields us from direct memory access, it is still possible. You have to be careful, though: a mistake here can cause a segmentation fault.
The built-in id() function returns the memory address at which the object (itself a C structure) begins:
>>> obj = 1
>>> id(obj)
158020320
To read data at a memory address, we use the string_at function from the ctypes module. Its official description is not very detailed:
ctypes.string_at(address[, length])
This function returns the string starting at the given memory address. If length is not specified, the string is assumed to be zero-terminated.
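Before poking at live objects, we can try string_at() on a buffer we control. This little experiment (a sketch, not tied to the article's examples) shows both modes of the function:
>>> import ctypes
>>> buf = ctypes.create_string_buffer('abc')    # a zero-terminated C buffer
>>> ctypes.string_at(ctypes.addressof(buf))     # no length: read up to the null byte
'abc'
>>> ctypes.string_at(ctypes.addressof(buf), 2)  # explicit length: read exactly 2 bytes
'ab'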
Now let's try to read the data at the address that id() returned to us:
>>> import ctypes
>>> obj = 1
>>> sys.getsizeof(obj)
12
>>> ctypes.string_at(id(obj), 12)
'u\x01\x00\x00 \xf2&\x08\x01\x00\x00\x00'
The raw byte string does not look very impressive, but we are getting close to the truth.
The struct module
To present the output in a human-readable form, we will need one more module. The unpack() function from the struct module will help us here.
struct
This module performs conversions between Python values and C structures represented as Python strings.
struct.unpack(format, string)
Parses the string according to the given format. Always returns a tuple, even if it contains exactly one item. The string must contain exactly the amount of data required by the format.
The format codes we need:
symbol | C type | Python type | length on a 32-bit machine |
c | char | single-character string | 1 |
i | int | int | 4 |
l | long | int | 4 |
L | unsigned long | int | 4 |
d | double | float | 8 |
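As a warm-up, here is what unpack() does with a few hand-made byte strings (a sketch; the results assume a little-endian machine with native alignment, like the 32-bit x86 used in this article):
>>> import struct
>>> struct.unpack('i', '\x05\x00\x00\x00')    # four bytes -> one int
(5,)
>>> struct.unpack('ic', '\x05\x00\x00\x00Z')  # an int followed by a single character
(5, 'Z')
>>> struct.unpack('d', struct.pack('d', 1.1))
(1.1,)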
Now let's put everything together and look at the internal structure of some data types.
Int
>>> import struct
>>> obj = 1
>>> sys.getsizeof(obj), obj.__sizeof__()
(12, 12)
>>> struct.unpack('LLl', ctypes.string_at(id(obj), 12))
(373, 136770080, 1)
The meaning of the values is easy to guess.
The first number (373) is the object's reference count.
>>> obj2 = obj
>>> struct.unpack('LLl', ctypes.string_at(id(obj), 12))
(374, 136770080, 1)
As you can see, the count increased by one after we created another reference to the object.
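The effect is even clearer with an integer that is not in the interpreter's small-integer cache, because then we start from a single reference (a sketch: the type pointer shown here is taken from the session above and will differ on your machine):
>>> big = 10 ** 9   # created fresh, referenced only by the name 'big'
>>> struct.unpack('LLl', ctypes.string_at(id(big), 12))
(1, 136770080, 1000000000)
>>> big2 = big
>>> struct.unpack('LLl', ctypes.string_at(id(big), 12))
(2, 136770080, 1000000000)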
The second number (136770080) is a pointer to (the id of) the object's type:
>>> type(obj)
<type 'int'>
>>> id(type(obj))
136770080
The third number (1) is the value of the object itself.
>>> obj = 1234567
>>> struct.unpack('LLl', ctypes.string_at(id(obj), 12))
(1, 136770080, 1234567)
Our guesses can be confirmed by looking at the source code of CPython.
typedef struct {
    PyObject_HEAD
    long ob_ival;
} PyIntObject;
Here PyObject_HEAD is a macro common to all built-in objects, and ob_ival is the value as a C long. The PyObject_HEAD macro adds the object's reference count and a pointer to the object's type, which is exactly what we saw.
Float
A floating-point number is very similar to an int, but is represented in memory by a C value of type double.
typedef struct {
    PyObject_HEAD
    double ob_fval;
} PyFloatObject;
This is easy to verify:
>>> obj = 1.1
>>> sys.getsizeof(obj), obj.__sizeof__()
(16, 16)
>>> struct.unpack('LLd', ctypes.string_at(id(obj), 16))
(1, 136763968, 1.1)
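Since PyObject_HEAD occupies the first 8 bytes on a 32-bit machine (4 for the reference count and 4 for the type pointer), we can also read just the payload (a sketch under the 32-bit layout used throughout this article):
>>> struct.unpack('d', ctypes.string_at(id(obj) + 8, 8))
(1.1,)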
String (Str)
A string is represented as an array of characters terminated by a null byte. In addition, the string's structure stores its length, the hash of its contents, and a flag indicating whether the string is interned (stored in the internal cache).
typedef struct {
    PyObject_VAR_HEAD
    long ob_shash;    /* hash of the string */
    int ob_sstate;    /* is it interned? */
    char ob_sval[1];  /* contents of the string + null byte */
} PyStringObject;
The PyObject_VAR_HEAD macro includes PyObject_HEAD and adds an ob_size field that stores the length of the string.
>>> obj = 'hello world'
>>> sys.getsizeof(obj), obj.__sizeof__()
(32, 32)
>>> struct.unpack('LLLli' + 'c' * (len(obj) + 1), ctypes.string_at(id(obj), 4*5 + len(obj) + 1))
(1, 136790112, 11, -1500746465, 0, 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\x00')
The fourth value corresponds to the hash of the string, which is easy to verify:
>>> hash(obj)
-1500746465
As you can see, the sstate value is 0, so the string is not interned at the moment. Let's try to add it to the cache:
>>> intern(obj)
'hello world'
>>> struct.unpack('LLLli' + 'c' * (len(obj) + 1), ctypes.string_at(id(obj), 4*5 + len(obj) + 1))
(2, 136790112, 11, -1500746465, 1, 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\x00')
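Now that the string is interned, any newly created equal string resolves to this very object (a small check of the built-in intern() behaviour in CPython 2):
>>> intern('hello world') is obj
True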
Tuple
A tuple is represented as an array of pointers. Since its use can create circular references, it is tracked by the garbage collector, which requires additional memory (the sys.getsizeof() call reminds us of this).
The tuple structure is similar to that of a string, except that it has no special fields other than the length.
typedef struct {
    PyObject_VAR_HEAD
    PyObject *ob_item[1];
} PyTupleObject;
>>> obj = (1,2,3)
>>> sys.getsizeof(obj), obj.__sizeof__()
(36, 24)
>>> struct.unpack('LLL'+'L'*len(obj), ctypes.string_at(id(obj), 12+4*len(obj)))
(1, 136532800, 3, 146763112, 146763100, 146763088)
>>> for i in obj: print i, id(i)
1 146763112
2 146763100
3 146763088
As you can see from the example, the last three values of the unpacked structure are pointers to the tuple's contents.
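Note that sys.getsizeof() counts only the tuple's own structure, not the objects it points to; those have to be counted separately (numbers from this 32-bit session):
>>> sys.getsizeof(obj)                  # 24 + 4 * 3: the tuple header plus three pointers
36
>>> sum(sys.getsizeof(i) for i in obj)  # the three integers themselves
36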
The remaining basic data types (unicode, list, dict, set, frozenset) can be explored in a similar way.
What is the result?
Type | Name in CPython | Format | Format of the elements | 32-bit size | 64-bit size | Extra memory for GC (PyGC_Head) |
int | PyIntObject | LLl | | 12 | 24 | |
float | PyFloatObject | LLd | | 16 | 24 | |
str | PyStringObject | LLLli + c * (length + 1) | | 21 + length | 37 + length | |
unicode | PyUnicodeObject | Llllll | L * (length + 1) | 28 + 4 * length | 52 + 4 * length | |
tuple | PyTupleObject | LLL + L * length | | 12 + 4 * length | 24 + 8 * length | yes |
list | PyListObject | L * 5 | L * length | 20 + 4 * length | 40 + 8 * length | yes |
set / frozenset | PySetObject | L * 7 + (lL) * 8 + lL | lL * length | 100 (<= 5 elements), 100 + 8 * length (> 5 elements) | 200 (<= 5 elements), 200 + 16 * length (> 5 elements) | yes |
dict | PyDictObject | L * 7 + (lLL) * 8 | lLL * length | 124 (<= 5 elements), 124 + 12 * length (> 5 elements) | 248 (<= 5 elements), 248 + 24 * length (> 5 elements) | yes |
We see that simple data types in Python are two to three times larger than their C counterparts. The difference is due to the need to store the reference count and a pointer to the object's type (the contents of the PyObject_HEAD macro). This is partially offset by internal caching, which allows previously created objects to be reused (this is only possible for immutable types).
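The caching mentioned above is easy to observe for small integers, which CPython 2.x preallocates in the range -5..256 (a sketch; the second result holds when the lines are entered separately in the interactive console):
>>> a = 256
>>> b = 256
>>> a is b      # both names point to the same cached object
True
>>> a = 257
>>> b = 257
>>> a is b      # outside the cache a new object is created each time
False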
For strings and tuples the difference is not as significant: only some constant amount is added.
Lists, dictionaries and sets, on the other hand, typically occupy about a third more memory than strictly necessary. This is due to the way new elements are added: the algorithm over-allocates, sacrificing memory to save processor time.
So, to answer the question posed at the beginning of the article: to store 1 million integers we need about 11.4 megabytes (12 * 10^6 bytes) for the numbers themselves, plus about 3.8 megabytes (24 + 4 * 10^6 bytes) for a tuple that holds references to them.
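These figures are easy to reproduce with the tools from this article (a sketch; the exact numbers assume the same 32-bit CPython 2.7 build used throughout):
>>> numbers = tuple(xrange(10 ** 6))
>>> sys.getsizeof(numbers)                  # 24 + 4 * 10**6 bytes for the tuple of pointers
4000024
>>> sum(sys.getsizeof(n) for n in numbers)  # 12 bytes per integer object
12000000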