Python, scipy.weave and OpenMP: speeding up code
Hello, %username%! This article is about speeding up mathematical calculations written in Python by using scipy.weave and OpenMP.
Many may ask, "Why use Python for mathematical calculations at all?", but we will not answer such "eternal" questions here, nor will we consider the many other solutions to this problem, such as psyco.
Tools
As mentioned above, our tools are the scipy.weave library and OpenMP.
scipy is a set of libraries for computing in applied mathematics and science. OpenMP is an open standard for parallelizing C, C++, and Fortran programs.
Package Installation
On Debian-like Linux systems, install the packages:

    apt-get install python-scipy
    apt-get install libgomp1
Method
To speed up the calculations, implement the bottleneck of the Python code (usually a loop that performs some operations on a matrix) in C and add OpenMP directives to parallelize it.
Example
I think the best way to check this method is to solve the following problem as an example:
- there is a matrix of size n by n, a vector of size n, and an integer;
- from each row of the matrix, subtract the vector multiplied by the integer (a step taken from the simplex method).
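To make the task concrete, here is a tiny worked instance. It is only a sketch in plain Python 3 lists, so it runs without numpy or weave. The function name `subtract_scaled_row` and the sample values are illustrative assumptions, not part of the original code.

```python
def subtract_scaled_row(matrix, vector, c):
    """Return a new matrix with c * vector subtracted from every row."""
    return [[a - c * v for a, v in zip(row, vector)] for row in matrix]


# Hypothetical sample data: a 3x3 matrix, a vector of size 3, an integer.
matrix = [[10, 20, 30],
          [40, 50, 60],
          [70, 80, 90]]
vector = [1, 2, 3]
c = 2

result = subtract_scaled_row(matrix, vector, c)
for row in result:
    print(row)
```

Every row loses the same scaled vector (here `[2, 4, 6]`), which is why the operation is easy to express as a single loop later on.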
Python implementation
In Python with numpy, this task takes a couple of lines of code (not counting preparatory steps such as generating the matrix):

    # loop over the rows of the matrix, where i is the row index
    # c is an integer, randRow is a random vector
    for i in xrange(N):
        matrix[i, :] -= c * randRow

Generation of a random matrix x by y (in our case x = y):

    from numpy import array
    from numpy.random import randint  # upper bound is exclusive

    # generate a random x-by-y matrix
    # matrix elements are random numbers from 0 to 99 inclusive
    def randMat(x, y):
        randRaw = lambda a: [randint(0, 100) for i in xrange(0, a)]
        randConst = lambda x, y: [randRaw(x) for e in xrange(0, y)]
        return array(randConst(x, y))

scipy.weave implementation without OpenMP
scipy.weave is the part of the scipy library that lets you use C/C++ code inside Python code.
It works as follows:

    from scipy import weave

    # C code
    codeC = """
    int i = 0;
    for (i = 0; i < N * M; i++) {
        // matrix is a flat buffer here, so one index is enough
        matrix[i] = matrix[i] - (c * randRow[i % M]);
    }
    """
    weave.inline(codeC, ['matrix', 'c', 'randRow', 'N', 'M'], compiler='gcc')

That is, the C code itself is stored as a multiline string, and the Python variables are passed to C through a list whose elements are the variable names as strings. Also, numpy arrays arrive in C not as a matrix but as a flat vector, which is why the code has one loop instead of two.
By the way, the generated C code can be found in /tmp/%user%/python2x_intermediate/compiler_x
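The single-loop trick relies on row-major storage: element (row, col) of an N-by-M matrix sits at flat index row * M + col, so `i % M` recovers the column. The sketch below checks this with plain Python 3 lists instead of numpy buffers. The helper names `subtract_nested` and `subtract_flat` are made up for illustration, and it assumes the array is stored row by row (numpy's default C order).

```python
def subtract_nested(matrix, vector, c):
    """Reference version: two nested loops over rows and columns."""
    return [[matrix[r][col] - c * vector[col] for col in range(len(vector))]
            for r in range(len(matrix))]


def subtract_flat(flat, vector, c, n_rows, n_cols):
    """Single loop over a row-major buffer, mirroring the C loop."""
    out = list(flat)
    for i in range(n_rows * n_cols):
        out[i] = out[i] - c * vector[i % n_cols]  # i % n_cols is the column
    return out


N, M = 2, 3
matrix = [[10, 20, 30], [40, 50, 60]]
vector = [1, 2, 3]
c = 2

flat = [x for row in matrix for x in row]  # row-major flattening
flat_result = subtract_flat(flat, vector, c, N, M)
nested_result = subtract_nested(matrix, vector, c)

# Both versions must agree once the nested result is flattened the same way.
assert flat_result == [x for row in nested_result for x in row]
print(flat_result)
```

If the array were not contiguous in row-major order (for example a transposed view), this index arithmetic would no longer match the matrix layout.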
scipy.weave implementation with OpenMP
Now add OpenMP directives to the previous version and pass the missing parameters to the inline call, namely:

    # C and OpenMP code
    codeOpenMP = """
    int i = 0;
    omp_set_num_threads(2);
    #pragma omp parallel shared(matrix, randRow, c) private(i)
    {
        #pragma omp for
        for (i = 0; i < N * M; i++) {
            matrix[i] = matrix[i] - (c * randRow[i % M]);
        }
    }
    """
    ...
    weave.inline(codeOpenMP, ['matrix', 'c', 'randRow', 'N', 'M'],
                 extra_compile_args=['-O3', '-fopenmp'],
                 compiler='gcc',
                 libraries=['gomp'],
                 headers=['<omp.h>'])  # omp.h declares omp_set_num_threads

Full source code with all implementations can be downloaded here.
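The loop can be split between threads because each iteration writes only its own element and reads data that no iteration modifies. The sketch below is a rough Python 3 analogy of that idea, not a model of how OpenMP actually schedules work: it cuts the index range into contiguous chunks and hands them to two workers. The names `process_chunk` and `parallel_subtract` are hypothetical, and because of the GIL this version will not run faster; it only shows that the chunks are independent.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(flat, vector, c, n_cols, start, stop):
    """Update flat[start:stop] in place; no other indices are touched."""
    for i in range(start, stop):
        flat[i] -= c * vector[i % n_cols]


def parallel_subtract(flat, vector, c, n_cols, n_workers=2):
    """Split the index range into contiguous chunks, one per worker."""
    total = len(flat)
    chunk = max(1, (total + n_workers - 1) // n_workers)  # ceiling division
    bounds = [(s, min(s + chunk, total)) for s in range(0, total, chunk)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(process_chunk, flat, vector, c, n_cols, s, e)
                   for s, e in bounds]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return flat


flat = [10, 20, 30, 40, 50, 60]  # a 2x3 matrix stored row by row
print(parallel_subtract(flat, [1, 2, 3], 2, n_cols=3))
```

In the C version the loop counter is declared private so that each thread keeps its own copy, while the arrays stay shared; the chunk boundaries here play the same role as the per-thread iteration ranges.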
Comparison of Results
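Running the source code above shows that scipy.weave does speed things up for matrices of 1000x1000 and larger, and that OpenMP adds a further gain; at 100x100 the numpy version is still the fastest: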

    Test on size: 100x100
    Pure python: 0.0725984573364
    Pure C: 0.303888320923
    C plus OpenMP: 0.109100341797
    Test - ok

    Test on size: 1000x1000
    Pure python: 1.00839138031
    Pure C: 0.506997108459
    C plus OpenMP: 0.333213806152
    Test - ok

    Test on size: 2000x2000
    Pure python: 3.24151515961
    Pure C: 2.10800170898
    C plus OpenMP: 1.17690563202
    Test - ok

    Test on size: 3000x3000
    Pure python: 5.54490089417
    Pure C: 4.61800098419
    C plus OpenMP: 2.56960391998
    Test - ok

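To read these numbers as speedups, the timings can be divided by the "Pure python" baseline. The short Python 3 script below does that arithmetic on the figures reported above; the `timings` dictionary and the `speedups` helper are just a convenient way to organize them, not part of the original benchmark.

```python
# Timings copied from the benchmark output above.
timings = {
    "100x100": {"Pure python": 0.0725984573364, "Pure C": 0.303888320923,
                "C plus OpenMP": 0.109100341797},
    "1000x1000": {"Pure python": 1.00839138031, "Pure C": 0.506997108459,
                  "C plus OpenMP": 0.333213806152},
    "2000x2000": {"Pure python": 3.24151515961, "Pure C": 2.10800170898,
                  "C plus OpenMP": 1.17690563202},
    "3000x3000": {"Pure python": 5.54490089417, "Pure C": 4.61800098419,
                  "C plus OpenMP": 2.56960391998},
}


def speedups(row, baseline="Pure python"):
    """Return baseline_time / time for every non-baseline implementation."""
    base = row[baseline]
    return {name: base / t for name, t in row.items() if name != baseline}


for size, row in timings.items():
    ratios = speedups(row)
    print(size, ", ".join("%s: %.2fx" % (k, v) for k, v in ratios.items()))
```

A ratio below 1 means the implementation is slower than the numpy baseline, which is what happens for both C variants at 100x100.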