Parallel Python Start

    Disclaimer


    A geographic need arose: a friend had to transfer a piece of a map from one plot of land to another. Out of habit he did it in Delphi, but I wanted to try Python in action, a language I am no specialist in.

    Practice


    Porting the algorithm itself turned out to be quite simple, but its speed left much to be desired.
    The first step was Psyco, which sped up the processing about 6 times.
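    For reference, enabling Psyco is just two lines at the top of the script (a minimal sketch; Psyco targets 32-bit x86 builds of Python 2):

    import psyco
    psyco.full()  # JIT-compile all functions as they get executed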

    It was no longer possible to improve the result without changing the algorithm, so the brute-force method was used: parallelization of the tasks. The Parallel Python (pp) module was found. Connecting it turned out to be quite simple (first variant):

    import pp
    ppservers = ()
    job_server = pp.Server(ppservers=ppservers)
    job_server.set_ncpus(2)
    print "Starting pp with", job_server.get_ncpus(), "workers"
    jobs = [job_server.submit(tighina_check, (), (find_geo_coords, compare, get_dist_bearing,), ("math",)) for i in range(3)]
    for job in jobs:
        job()
    job_server.print_stats()

    The code basically speaks for itself: we use only the local server (in general, the module can also distribute jobs to network servers), ask for 2 processors, specify which function to call and which functions it depends on, import math, run 3 jobs, and print statistics at the end.
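    For clarity, here is the same submit() call with the arguments annotated (the roles follow pp's documented signature: func, args, depfuncs, modules):

    job = job_server.submit(
        tighina_check,                                   # function to execute in a worker
        (),                                              # positional arguments for it
        (find_geo_coords, compare, get_dist_bearing,),   # functions the job depends on
        ("math",))                                       # modules to import in the worker
    result = job()  # calling the job object blocks until it finishes and returns the result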

    The first snag: psyco was no longer in effect (the pp workers are separate processes), which threw us right back to the starting position.
    The solution was obvious: add the psyco import when creating the job
    jobs = [job_server.submit(tighina_check, (), (find_geo_coords, compare, get_dist_bearing,), ("math", "psyco",)) for i in range(3)]
    and call psyco.full() right in tighina_check:
    def tighina_check():
        psyco.full()
        # and here is a lot of math


    The second problem was quite unexpected.
    The code in tighina_check was originally written for an import of the form "from math import sin, pow, cos, asin, sqrt, fabs, acos". But that did not work under pp, because pp creates the function's runtime environment only from the modules specified when the job is created. The logical fix was to rewrite all the calls from sin to math.sin, and so on. Here a small puzzle arose: intensive, constant use of the math functions in this second form slowed things down by a factor of 1.3-1.4, since the attribute lookup on the module happens on every call.
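    The slowdown is easy to reproduce outside of pp with timeit (a hypothetical micro-benchmark, not one of the original measurements; absolute numbers vary by machine):

    import timeit
    # direct name lookup: sin is bound in the namespace
    print timeit.timeit("sin(1.23)", setup="from math import sin", number=1000000)
    # attribute lookup: math.sin is resolved on every call
    print timeit.timeit("math.sin(1.23)", setup="import math", number=1000000)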

    The solution was to manually import the necessary functions into the global scope at the beginning of each job:
    def tighina_check():
        psyco.full()
        math_func_imports = ['sin', 'cos', 'asin', 'acos', 'sqrt', 'pow', 'fabs']
        for func in math_func_imports:
            setattr(__builtins__, func, getattr(math, func))
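    An equivalent workaround, if patching __builtins__ feels too invasive, is to inject the functions into the job's global namespace instead (a sketch, assuming the worker executes the function in its own globals):

    def tighina_check():
        psyco.full()
        # bind the math functions as plain names, keeping the calls cheap
        for func in ('sin', 'cos', 'asin', 'acos', 'sqrt', 'pow', 'fabs'):
            globals()[func] = getattr(math, func)
        # ... and here is a lot of math ...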
    


    Then I thought it would be nice to speed up pp itself with psyco. To do this you need to patch pyworker.py from the distribution a little, adding at the beginning:
    import psyco
    psyco.full()


    and replacing
    eval(__fobj)
    with
    exec __fobj


    With that in place, there is no longer any need to import psyco when creating a job or, accordingly, to call psyco.full() inside the job.

    The rest is just a matter of picking the right number of workers; a sketch for that follows.
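    For instance, one could time the same batch of jobs under different ncpus values (a sketch; the benchmark helper and the timing loop are mine, not from the original code):

    import time
    import pp

    def benchmark(ncpus, n_jobs=100):
        job_server = pp.Server(ppservers=())
        job_server.set_ncpus(ncpus)
        start = time.time()
        jobs = [job_server.submit(tighina_check, (),
                                  (find_geo_coords, compare, get_dist_bearing,),
                                  ("math",)) for i in range(n_jobs)]
        for job in jobs:
            job()  # wait for the job and fetch its result
        job_server.destroy()  # shut down the worker processes
        return time.time() - start

    for n in (2, 4, 8, 16, 32):
        print "ncpus =", n, "->", benchmark(n), "seconds"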

    What is the result?


    100 jobs were launched.

    The original version (no parallelization, only psyco)
    100 consecutive jobs: 257 seconds

    2 processors (pp, psyco)
    Starting pp with 2 workers
    Job execution statistics:
     job count | % of all jobs | job time sum | time per job | job server
           100 | 100.00 | 389.8933 | 3.898933 | local
    Time elapsed since server creation 195.12789011


    4 processors (pp, psyco)
    Starting pp with 4 workers
    Job execution statistics:
     job count | % of all jobs | job time sum | time per job | job server
           100 | 100.00 | 592.9463 | 5.929463 | local
    Time elapsed since server creation 148.77167201


    I did not want to test further: it seemed that with 2 cores, each with hyperthreading, 4 workers should be the best option. But curiosity got the better of me (and, as it turned out, not in vain):
    8 processors (pp, psyco)
    Starting pp with 8 workers
    Job execution statistics:
     job count | % of all jobs | job time sum | time per job | job server
           100 | 100.00 | 1072.3920 | 10.723920 | local
    Time elapsed since server creation 137.681350946


    16 processors (pp, psyco)

    Starting pp with 16 workers
    Job execution statistics:
     job count | % of all jobs | job time sum | time per job | job server
           100 | 100.00 | 2050.8158 | 20.508158 | local
    Time elapsed since server creation 133.345046043
    


    32 processors (pp, psyco)

    Starting pp with 32 workers
    Job execution statistics:
     job count | % of all jobs | job time sum | time per job | job server
           100 | 100.00 | 4123.8550 | 41.238550 | local
    Time elapsed since server creation 136.022897005


    Thus, in the best case: 133 seconds versus 257 in the original version, a 1.93x speedup for our specific task from parallelization alone.

    It should be noted that all 100 jobs are independent of one another and do not need to "communicate" with each other, which simplifies the task and helps the speedup.

    Summary code:
    import pp

    ppservers = ()
    job_server = pp.Server(ppservers=ppservers)
    job_server.set_ncpus(16)
    print "Starting pp with", job_server.get_ncpus(), "workers"
    jobs = [job_server.submit(tighina_check, (), (find_geo_coords, compare, get_dist_bearing,), ("math",)) for i in range(100)]
    for job in jobs:
        job()
    job_server.print_stats()


    def tighina_check():
        math_func_imports = ['sin', 'cos', 'asin', 'acos', 'sqrt', 'pow', 'fabs']
        for func in math_func_imports:
            setattr(__builtins__, func, getattr(math, func))
        # and here is a lot of math
