Shed Skin - an experimental Python-to-C++ compiler

Original authors: Mark Dufour, James Coughlan


Shed Skin is an experimental compiler from Python to C++, designed to speed up computation-intensive Python programs. It translates programs written in a restricted subset of Python into C++. The C++ code can then be compiled to executable code, either as a standalone program or as an extension module that can be imported and used in a regular Python program.

Shed Skin uses type inference techniques to determine the implicit types used in a Python program, in order to generate the explicit type declarations needed in the C++ version. Because C++ is statically typed, Shed Skin requires Python code to be written such that every variable has a single, static type.

Besides the typing and language subset restrictions, supported programs cannot freely use the Python standard library, although about 25 commonly used modules, such as random and re, are supported (see Library restrictions).

In addition, the type inference techniques used by Shed Skin currently do not scale well beyond programs of several thousand lines (the largest program translated so far is about 6,000 lines, as measured by sloccount). In practice, this means that Shed Skin is currently best suited to translating smaller programs and extension modules that do not make heavy use of Python's dynamic typing features or of the standard and external libraries. See below for a collection of 75 non-trivial example programs.

Because Shed Skin is still at an early stage of development, it can still improve dramatically. At this point you are likely to run into bugs while using it. Please report them so that we can fix them!

Shed Skin is currently compatible with Python versions 2.4 through 2.7, behaves like 2.6, and runs on Windows and most UNIX platforms, such as GNU/Linux and OSX.

Typing restrictions

Shed Skin translates ordinary, but implicitly statically typed, Python programs into C++. The static typing restriction means that a variable can have only one, static type. So, for example, the code

a = 1
a = '1' # error

is not allowed. However, as in C++, types can be abstract, so that, for example, the code

a = A()
a = B() # ok

is allowed when A and B have a common base class.

The typing restriction also means that the elements of a collection (list, set, etc.) cannot have different types, because the type of their members must also be static. So the code

a = ['apple', 'b', 'c'] # ok
b = (1, 2, 3) # ok
c = [[10.3, -2.0], [1.5, 2.3], []] # ok

is allowed, but the code

d = [1, 2.5, 'abc'] # error
e = [3, [1, 2]] # error
f = (0, 'abc', [1, 2, 3]) # error

is not. The keys of a dictionary may have a different type than its values, but all keys must share one type and all values another:

g = {'a': 1, 'b': 2, 'c': 3} # ok
h = {'a': 1, 'b': 'hello', 'c': [1, 2, 3]} # error

In the current version of Shed Skin, mixed types are also allowed in tuples of length two:

a = (1, [1]) # ok

In the future, mixed types may be allowed in longer tuples.

The None type can only be mixed with non-scalar types (that is, not with int, float, bool or complex):

l = [1]
l = None # ok

m = 1
m = None # error

def fun(x = None): # error: use a concrete value for x, for example x = -1
    pass

Integers and floats can usually be mixed (the integers are converted to floats). Where this is not possible, Shed Skin reports an error.
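
As a rough illustration of the rules above, the sketch below keeps every variable to a single type: an integer sentinel of -1 replaces None for an integer argument and return value, and a list that would otherwise mix numbers and strings is split in two. The function and variable names are made up for this example and are not part of Shed Skin.

```python
# Hypothetical example; the names are illustrative, not part of Shed Skin.

def first_index(items, target, start=-1):
    # start is always an int: -1 stands in for "not given" instead of None
    if start == -1:
        start = 0
    for i in range(start, len(items)):
        if items[i] == target:
            return i
    return -1  # an int sentinel rather than None, so the return type stays int


# Instead of one mixed list such as [1, 2.5, 'abc'], keep one list per type.
numbers = [1.0, 2.5]   # all floats
labels = ['abc']       # all strings

if __name__ == '__main__':
    print(first_index(labels, 'abc'))
    print(first_index(numbers, 3.0))
```

The single-argument print(...) form is used so that the sketch runs unchanged under both Python 2 and Python 3.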

Python subset restrictions

Shed Skin will always support only a subset of all the features of the Python language. Currently, the following features are not supported:

  • eval, getattr, hasattr, isinstance, and anything else truly dynamic
  • arbitrary-precision arithmetic (int becomes a 32-bit signed integer by default on most architectures; see Command line options)
  • argument packing (*args and **kwargs)
  • multiple inheritance
  • nested functions and classes
  • unicode
  • inheritance from built-in types (excluding Exception and object)
  • overloading __iter__, __call__, __del__
  • closures

Some other features are only partially supported:

  • class attributes can be accessed only through the class name:
            self.class_attr # error
            SomeClass.class_attr # ok
            SomeClass.some_static_method() # ok

  • function references can be passed around, but method references and class references cannot, and function references cannot be stored in a container:
            var = lambda x, y: x+y # ok
            var = some_func # ok
            var = self.some_method # error, method reference
            var = SomeClass # error
            [var] # error, stored in a container
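
A minimal sketch of code that stays within the two partially supported features above: the class attribute is always reached through the class name, and only a plain function reference is stored in a variable. Counter and double are hypothetical names chosen for this example.

```python
# Hypothetical example; Counter and double are not part of Shed Skin.

class Counter:
    total = 0  # class attribute

    def __init__(self, name):
        self.name = name

    def bump(self):
        # refer to the class attribute through the class name, not through self
        Counter.total += 1
        return Counter.total


def double(x):
    return 2 * x


if __name__ == '__main__':
    a = Counter('a')
    b = Counter('b')
    a.bump()
    b.bump()
    print(Counter.total)
    f = double  # a plain function reference can be stored in a variable
    print(f(21))
```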

Library restrictions

At the moment, the following 25 modules are largely supported. Several of them, such as os.path, were themselves translated to C++ using Shed Skin.

  • array
  • binascii
  • bisect
  • collections (defaultdict, deque)
  • colorsys
  • ConfigParser (without SafeConfigParser)
  • copy
  • csv (without Dialect, Sniffer)
  • datetime
  • fnmatch
  • getopt
  • glob
  • heapq
  • itertools (without starmap)
  • math
  • mmap
  • os (under Windows, some functionality is missing)
  • os.path
  • random
  • re
  • select (select function only, under UNIX)
  • socket
  • string
  • struct (without Struct, pack_into, unpack_from)
  • sys
  • time

Note that any other module, such as pygame, pyqt or pickle, can be used in conjunction with an extension module generated using Shed Skin. For examples of such use, see Shed Skin Examples.


Installation

There are two types of downloads: a self-extracting Windows installer and a UNIX archive. However, it is of course better if Shed Skin can be installed through the package manager of your GNU/Linux distribution (Shed Skin is available in at least Debian, Ubuntu, Fedora and Arch).


Windows

To install the Windows version, simply download and run the installer. If you use ActivePython or some other non-standard Python distribution, or MinGW, uninstall it first. Also note that the 64-bit version of Python appears to be missing a file, which makes it impossible to build extension modules. Use the 32-bit version of Python instead.


UNIX

Installation through the package manager

Example command for Ubuntu:

sudo apt-get install shedskin

Manual installation

To install the distribution package from the UNIX archive manually, do the following:

  • download and unzip the archive
  • run sudo python setup.py install


Dependencies

To compile and run programs generated by shedskin, the following are required:

  • g++, the C++ compiler (version 4.2 or higher)
  • pcre development files
  • Python development files
  • the Boehm garbage collector

To install these libraries under Ubuntu, type:

sudo apt-get install g++ libpcre++-dev python-all-dev libgc-dev

If the Boehm garbage collector is not available through your package manager, use this method. Download, for example, version 7.2alpha6 from the website, unzip it, and install it as follows:

./configure --prefix=/usr/local --enable-threads=posix --enable-cplusplus --enable-thread-local-alloc --enable-large-config
make check
sudo make install

If the PCRE library is not available through your package manager, use the following method. Download, for example, version 8.12 from the website, unzip it and install it as follows:

./configure --prefix=/usr/local
sudo make install

OSX

Manual installation

To install Shed Skin from a UNIX archive on an OSX system, do the following:

  • download and unzip the archive
  • run sudo python setup.py install


Dependencies

To compile and run programs generated by shedskin, the following are required:

  • g++, the C++ compiler (version 4.2 or higher; comes with the Apple Xcode development environment?)
  • pcre development files
  • Python development files
  • the Boehm garbage collector

If the Boehm garbage collector is not available through your package manager, use this method. Download, for example, version 7.2alpha6 from the website, unzip it, and install it as follows:

./configure --prefix=/usr/local --enable-threads=posix --enable-cplusplus --enable-thread-local-alloc --enable-large-config
make check
sudo make install

If the PCRE library is not available through your package manager, use the following method. Download, for example, version 8.12 from the website, unzip it and install it as follows:

./configure --prefix=/usr/local
sudo make install

Compiling a standalone program

On Windows, first run (double-click) the init.bat file in the directory where you installed Shed Skin.

To compile the following simple test program, called test.py:

print 'hello, world!'

type:

shedskin test

This creates two C++ files, test.cpp and test.hpp, as well as a Makefile. To create an executable file called test (or test.exe), type:

make


Creating an extension module

To compile the following program, called simple_module.py, as an extension module:

def func1(x):
    return x+1
def func2(n):
    d = dict([(i, i*i)  for i in range(n)])
    return d
if __name__ == '__main__':
    print func1(5)
    print func2(10)


type:

shedskin -e simple_module
make

For the 'make' command to succeed on a non-Windows system, make sure the Python development files are installed (on Debian, install python-dev; on Fedora, install python-devel).

Note that for type inference to be possible, the module must (indirectly) call its own functions. In the example, this is achieved by placing the calls under the if __name__ == '__main__' condition, so that they are not executed when the module is imported. Functions only need to be called indirectly, so if func2 calls func1, the explicit call to func1 can be omitted.

The extension module can now be simply imported and used as usual:

>>> from simple_module import func1, func2
>>> func1(5)
6
>>> func2(10)
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}


There are some important differences between using the compiled extension module and the original Python module:

  • Only built-in scalar and container types (int, float, complex, bool, str, list, tuple, dict, set), None, and instances of user-defined classes can be passed and returned. So, for example, anonymous functions and iterators are currently not supported.
  • Built-in objects, including their contents, are completely converted on every function call and return, from Shed Skin types to CPython types and back. This means that built-in CPython objects cannot be changed from the Shed Skin side and vice versa, and that the conversion can be slow. Instances of user-defined classes are passed and returned without any conversion, and can be changed from either side.
  • Global variables are converted only once, at initialization time, from Shed Skin to CPython. This means that the CPython version and the Shed Skin version of a variable can change independently of each other. This problem can be avoided by using only constant global variables, or by adding getter and setter functions.
  • Multiple (interacting) extension modules are currently not supported. Importing and using the Python version and the translated version of a module at the same time may also not work.
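
The getter and setter approach mentioned for global variables could look roughly like the sketch below. The module layout and the names threshold, get_threshold, set_threshold and above_threshold are assumptions made for this illustration. The idea is that callers never read the global directly from the imported module, but always go through functions that run on the Shed Skin side.

```python
# Hypothetical module, intended to be translated as an extension module.

threshold = 10  # module-level global


def get_threshold():
    return threshold


def set_threshold(value):
    global threshold
    threshold = value


def above_threshold(values):
    return [v for v in values if v > threshold]


if __name__ == '__main__':
    # calls under the main guard give type inference something to work with
    set_threshold(5)
    print(get_threshold())
    print(above_threshold([1, 7, 12]))
```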

Integration with Numpy

Shed Skin does not currently have direct support for Numpy. It is possible, however, to pass a Numpy array to a translated Shed Skin extension module as a list, using its tolist method. Note that this is very inefficient (see above), so it is only worthwhile if a relatively large amount of time is spent inside the extension module. Consider the following example:

def my_sum(a):
    """ compute sum of elements in list of lists (matrix) """
    h = len(a) # number of rows in matrix
    w = len(a[0]) # number of columns
    s = 0.0
    for i in range(h):
        for j in range(w):
            s += a[i][j]
    return s
if __name__ == '__main__':
    print my_sum([[1.0, 2.0], [3.0, 4.0]]) 

After translating this module as an extension module using Shed Skin, we can pass the Numpy array as follows:

>>> import numpy
>>> import simple_module2
>>> a = numpy.array(([1.0, 2.0], [3.0, 4.0]))
>>> simple_module2.my_sum(a.tolist())
10.0

Binary distribution


Windows

To use a generated Windows binary on another system, or to start it without having to run init.bat, place the following files in the same directory as the binary:

shedskin-0.9\shedskin\gc.dll
shedskin-0.9\shedskin-libpcre-0.dll
shedskin-0.9\bin\libgcc_s_dw-1.dll
shedskin-0.9\bin\libstdc++.dll


UNIX

To use a generated binary on another system, make sure libgc and libpcre3 are installed there. If they are not, and you cannot install them system-wide, you can place copies of these libraries in the same directory as the binary, using the following commands:

$ ldd test
 => /usr/lib/
 => /lib/x86_64-linux-gnu/
$ cp /usr/lib/ .
$ cp /lib/x86_64-linux-gnu/ .
$ LD_LIBRARY_PATH=. ./test

Note that both systems must be 32-bit, or both must be 64-bit. If this is not the case, Shed Skin must be installed on the other system in order to rebuild the binary there.


Multiprocessing

Suppose we have defined the following function in a file called meuk.py:

def part_sum(start, end):
    """ calculate partial sum """
    sum = 0
    for x in xrange(start, end):
        if x % 2 == 0:
            sum -= 1.0 / x
        else:
            sum += 1.0 / x
    return sum
if __name__ == '__main__':
    part_sum(1, 10)

To translate this file into an extension module, type:

    shedskin -e meuk
    make

To use the resulting extension module with the multiprocessing module of the standard library, just add a wrapper in pure Python:

from multiprocessing import Pool
def part_sum((start, end)):
    import meuk
    return meuk.part_sum(start, end)
pool = Pool(processes=2)
print sum(pool.map(part_sum, [(1,10000000), (10000001, 20000000)]))
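
The wrapper above hard-codes two ranges. As a minimal sketch, the ranges could instead be generated for any number of worker processes. split_range is a hypothetical helper that would live in the Python wrapper, not in the compiled module; it treats the end of each range as exclusive, in the same way xrange does.

```python
def split_range(start, end, parts):
    """Split [start, end) into `parts` consecutive (start, end) pairs."""
    size = (end - start) // parts
    chunks = []
    lo = start
    for i in range(parts):
        # the last chunk absorbs any remainder
        hi = end if i == parts - 1 else lo + size
        chunks.append((lo, hi))
        lo = hi
    return chunks
```

The resulting list could then be handed to pool.map in place of the literal list of tuples.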

Calling C/C++ code

To call manually written C/C++ code, proceed as follows:

Provide Shed Skin with type information, using a type model of the C/C++ code. Suppose we want to call a simple function that returns a list of the n smallest primes larger than a given number. The following type model, contained in a file called stuff.py, is sufficient for Shed Skin to perform type inference:

def more_primes(n, nr=10):
    return [1]

To actually perform type inference, write a test program, called test.py, that uses this type model, and then translate it:

import stuff
print stuff.more_primes(100)

shedskin test

Besides test.py, this also translates stuff.py to C++. You can now fill in manual C/C++ code in stuff.cpp. To avoid it being overwritten the next time test.py is translated, move stuff.* to the lib/ directory of Shed Skin.
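
While writing the C/C++ version by hand, it can help to have a plain Python reference to compare results against. The sketch below is one possible reference, assuming that n is the lower bound and nr is the number of primes wanted. That reading of the signature is an assumption, is_prime is a hypothetical helper, and this is not the implementation from the original example.

```python
def is_prime(k):
    # trial division; good enough for a reference implementation
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def more_primes(n, nr=10):
    """Return the nr smallest primes greater than n."""
    result = []
    candidate = n + 1
    while len(result) < nr:
        if is_prime(candidate):
            result.append(candidate)
        candidate += 1
    return result
```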

Standard library

By moving stuff.* to lib/, we have in fact added support for a custom library module to Shed Skin. Other programs translated by Shed Skin can now import stuff and use more_primes. In fact, the lib/ directory contains type models and implementations for all supported modules. As you may notice, some have been partially converted to C++ using Shed Skin.

Shed Skin types

Shed Skin reimplements the Python built-in types with its own set of C++ classes. These have a similar interface to their Python counterparts, so they should be easy to use (provided you have some basic C++ knowledge). See the class definitions in lib/builtin.hpp for details. If in doubt, convert some equivalent Python code to C++ and have a look at the result!

Command line options

The shedskin command supports the following options:

  • -a --ann Output annotated source code
  • -b --nobounds Disable bounds checking
  • -e --extmod Generate an extension module
  • -f --flags Set flags for Makefile
  • -g --nogcwarns Disable runtime GC warnings
  • -l --long Use long long ("64-bit") integers
  • -m --makefile Specify another Makefile name
  • -n --silent Silent mode, show warnings only
  • -o --noassert Disable assert statements
  • -r --random Use a fast random number generator (rand())
  • -s --strhash Use a fast string hashing algorithm (murmur)
  • -w --nowrap Disable wrap-around checking
  • -x --traceback Print traceback for uncaught exceptions
  • -L --lib Add a directory with libraries

For example, to translate the file test.py as an extension module, type shedskin -e test or shedskin --extmod test.

The -b or --nobounds option is used especially often, because it disables out-of-bounds exceptions (IndexError), and checking for these can have a large impact on performance.

    a = [1, 2, 3]
    print a[5] # invalid index: out of bounds
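
Since -b disables IndexError, code that is meant to be translated with this option should not depend on that exception being raised. One way to stay safe is to test the index explicitly. A minimal sketch, in which get_or_default is a hypothetical helper:

```python
def get_or_default(values, index, default):
    # explicit bounds test instead of "try: ... except IndexError:"
    if 0 <= index < len(values):
        return values[index]
    return default


if __name__ == '__main__':
    a = [1, 2, 3]
    print(get_or_default(a, 1, -1))
    print(get_or_default(a, 5, -1))
```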

Performance tips and tricks


Small memory allocations (for example, creating a new tuple, list, or class instance) usually do not slow down a Python program much. After translation to C++, however, they often become a bottleneck. This is because each allocation requests memory from the system, the memory must later be reclaimed by the garbage collector, and many consecutive allocations are likely to cause cache misses. The key to good performance is therefore often to reduce the number of small allocations, for example by replacing a small generator expression with a loop, or by avoiding intermediate tuples in a computation.

Note, however, that for the idiomatic for a, b in enumerate(..) and for a, b in somedict.iteritems(), the intermediate small objects are optimized away, and that strings of length 1 are cached.
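
As an illustration of avoiding intermediate tuples, the two functions below compute the same centroid of a list of (x, y) points. The first builds a new tuple on every iteration; the second accumulates in two plain floats. The function names are made up for this sketch, and no timings are claimed here; measure your own program before and after such a change.

```python
def centroid_with_tuples(points):
    # creates a fresh tuple on every iteration
    acc = (0.0, 0.0)
    for p in points:
        acc = (acc[0] + p[0], acc[1] + p[1])
    n = len(points)
    return (acc[0] / n, acc[1] / n)


def centroid_with_scalars(points):
    # same result, but no tuple is created inside the loop
    sx = 0.0
    sy = 0.0
    for p in points:
        sx += p[0]
        sy += p[1]
    n = len(points)
    return (sx / n, sy / n)
```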

Several Python features (which may slow down the generated code) are not always necessary, and can be turned off. See the Command line options section for details. Turning off bounds checking is usually a very safe optimization, and can help a lot with code that indexes heavily.

Attribute access is faster in the generated code than indexing. For example, v.x * v.y * v.z is faster than v[0] * v[1] * v[2].
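
A minimal sketch of the attribute-based form, using a small class in place of a three-element list. Vec3 and box_volume are hypothetical names chosen for this example.

```python
class Vec3:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z


def box_volume(v):
    # attribute access rather than v[0] * v[1] * v[2]
    return v.x * v.y * v.z


if __name__ == '__main__':
    print(box_volume(Vec3(2.0, 3.0, 4.0)))
```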

Shed Skin takes the flags for the C++ compiler from the FLAGS* files in the Shed Skin installation directory. These flags can be modified there, or overridden by creating a local file named FLAGS.

When doing a lot of floating-point calculations, it is not always necessary to follow the IEEE floating-point specification exactly. Adding the -ffast-math flag can significantly improve performance.

Profile-guided optimization can squeeze out even more performance. With a recent version of GCC, first compile and run the generated code with -fprofile-generate, then compile it again with -fprofile-use.

For best results, configure a recent version of the Boehm GC using CPPFLAGS="-O3 -march=native" ./configure --enable-cplusplus --enable-threads=pthreads --enable-thread-local-alloc --enable-large-config --enable-parallel-mark. The last option allows the GC to use multiple processor cores.

When optimizing, it is very useful to know how much time is spent in each part of your program. The Gprof2Dot program can be used to generate nice graphs for both the standalone program and the original Python code. The OProfile program can be used to profile an extension module.

To use Gprof2dot, download the file from the website and install Graphviz. Then:

shedskin program
make program_prof
gprof program_prof | gprof2dot.py | dot -Tpng -ooutput.png

To use OProfile, install it and use it as follows.

shedskin -e extmod
sudo opcontrol --start
python main_program_that_imports_extmod
sudo opcontrol --shutdown
opreport -l


The following two code snippets work the same, but only the second is supported:

statistics = {'nodes': 28, 'solutions': set()}

class statistics: pass
s = statistics(); s.nodes = 28; s.solutions = set()

The order in which the arguments to a function call or print statement are evaluated may change when translating to C++, so it is best not to rely on it:

print 'hoei', raw_input() # raw_input is called before 'hoei' is printed!
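
One way to avoid depending on evaluation order is to evaluate calls with side effects in separate statements first, and only then combine the results. A minimal sketch, in which next_id and two_ids are hypothetical names:

```python
def next_id(state):
    # has a side effect: bumps the counter stored in the list
    state[0] += 1
    return state[0]


def two_ids(state):
    # evaluate in separate statements so that the order is explicit
    first = next_id(state)
    second = next_id(state)
    return (first, second)
```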

Tuples with mixed element types and a length greater than 2 are currently not supported. They can, however, be emulated:

class mytuple:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

Block comments surrounded by #{ and #} are ignored by Shed Skin. This feature can be used to comment out code that cannot be compiled. For example, the following snippet will only produce a plot when run under CPython:

print "x =", x
print "y =", y
#{
import pylab as pl
pl.plot(x, y)
#}

Version 0.9.4, June 16, 2013, Mark Dufour and James Coughlan
