Python Cookbook, 2nd Edition (Jun 2005) [Electronic resources]

David Ascher, Alex Martelli, Anna Ravenscroft



Recipe 9.5. Executing a Function in Parallel on Multiple Argument Sets


Credit: Guy Argo


Problem



You want to execute a function
simultaneously over multiple sets of arguments. (Presumably the
function is "I/O bound", meaning it
spends substantial time doing input/output operations; otherwise,
simultaneous execution would be useless.)


Solution


Use one thread for each set of arguments. For good performance,
it's best to limit our use of threads to a bounded
pool:

import threading, time, Queue
class MultiThread(object):
    def __init__(self, function, argsVector,
                 maxThreads=5, queue_results=False):
        self._function = function
        self._lock = threading.Lock()
        self._nextArgs = iter(argsVector).next
        self._threadPool = [ threading.Thread(target=self._doSome)
                             for i in range(maxThreads) ]
        if queue_results:
            self._queue = Queue.Queue()
        else:
            self._queue = None
    def _doSome(self):
        while True:
            self._lock.acquire()
            try:
                try:
                    args = self._nextArgs()
                except StopIteration:
                    break
            finally:
                self._lock.release()
            result = self._function(args)
            if self._queue is not None:
                self._queue.put((args, result))
    def get(self, *a, **kw):
        if self._queue is not None:
            return self._queue.get(*a, **kw)
        else:
            raise ValueError, 'Not queueing results'
    def start(self):
        for thread in self._threadPool:
            time.sleep(0)  # necessary to give other threads a chance to run
            thread.start()
    def join(self, timeout=None):
        for thread in self._threadPool:
            thread.join(timeout)

if __name__ == "__main__":
    import random
    def recite_n_times_table(n):
        for i in range(2, 11):
            print "%d * %d = %d" % (n, i, n * i)
            time.sleep(0.3 + 0.3*random.random())
    mt = MultiThread(recite_n_times_table, range(2, 11))
    mt.start()
    mt.join()
    print "Well done kids!"


Discussion


This recipe's MultiThread class
offers a simple way to execute a function in parallel, on many sets
of arguments, using a bounded pool of threads. Optionally, you can
ask for results of the calls to the function to be queued, so you can
retrieve them, but by default the results are just thrown away.

The MultiThread class takes as its arguments a
function, a sequence of argument tuples for said function, and
optionally a boundary on the number of threads to use in its pool and
an indicator that results should be queued. Beyond the constructor,
it exposes three methods: start, to start all the
threads in the pool and begin the parallel evaluation of the function
over all argument tuples; join, to perform a join on
all threads in the pool (meaning to wait for all the threads in the
pool to have terminated); and get, to get queued
results (if it was instantiated with the optional flag
queue_results set to True, to ask
for results to be queued). Internally, class
MultiThread uses its private method
_doSome as the target callable for all threads in the
pool. Each thread works on the next available tuple of arguments
(supplied by the next method of an iterator on the
iterable whose items are such tuples, with the call to
next being guarded by the usual locking idiom),
until all work has been completed.
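The recipe is written for Python 2 (the Queue module, the iterator's next method, print statements). A minimal Python 3 adaptation of the same bounded-pool pattern, exercising the queue_results path, might look like the following sketch; the class name MultiThread3 and the squaring workload are illustrative assumptions, not part of the original recipe:

```python
import threading, queue

class MultiThread3(object):
    """Python 3 sketch of the recipe's bounded thread pool (illustrative)."""
    def __init__(self, function, argsVector, maxThreads=5, queue_results=False):
        self._function = function
        self._lock = threading.Lock()
        self._argsIterator = iter(argsVector)   # next() calls guarded by the lock
        self._threadPool = [threading.Thread(target=self._doSome)
                            for _ in range(maxThreads)]
        self._queue = queue.Queue() if queue_results else None
    def _doSome(self):
        while True:
            with self._lock:                    # serialize access to the iterator
                try:
                    args = next(self._argsIterator)
                except StopIteration:
                    break                       # lock released on loop exit
            result = self._function(args)
            if self._queue is not None:
                self._queue.put((args, result))
    def get(self, *a, **kw):
        if self._queue is None:
            raise ValueError('Not queueing results')
        return self._queue.get(*a, **kw)
    def start(self):
        for thread in self._threadPool:
            thread.start()
    def join(self, timeout=None):
        for thread in self._threadPool:
            thread.join(timeout)

# Run a trivial function over five argument values on three threads,
# then drain the result queue once all workers have finished.
mt = MultiThread3(lambda n: n * n, range(5), maxThreads=3, queue_results=True)
mt.start()
mt.join()
results = dict(mt.get() for _ in range(5))
print(results[4])  # squares computed by the worker threads
```

Because join is called before draining, exactly five (args, result) pairs are known to be on the queue, so the blocking get calls cannot deadlock.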

As is usual in Python, the module can also be run as a free-standing
main script, in which case it runs a simple demonstration and
self-test. In this case, the demonstration simulates a class of
schoolchildren reciting multiplication tables as fast as they can.

Real use cases for this recipe mostly
involve functions that are I/O bound, meaning functions that spend
substantial time performing I/O. If a function is
"CPU bound", meaning the function
spends its time using the CPU, you get better overall performance by
performing the computations one after the other, rather than in
parallel. In Python, this observation tends to hold even on machines
that dedicate multiple CPUs to your program, because Python uses a
GIL (Global Interpreter Lock), so that pure Python code from a single
process does not run simultaneously on more than one CPU at a time.

Input/output operations release the GIL, and so can (and should) any
C-coded Python extension that performs substantial computations
without callbacks into Python. So, it is
possible that parallel execution may speed up your program, but only
if either I/O or a suitable C-coded extension is involved, rather
than pure computationally intensive Python code. (Implementations of
Python on different virtual machines, such as Jython, which runs on a
JVM [Java Virtual Machine], or IronPython, which runs on the
Microsoft .NET runtime, are of course not bound by these
observations: these observations apply only to the widespread
"classical Python", meaning
CPython, implementation.)
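In current Python, the standard library packages this same bounded-pool idiom as concurrent.futures.ThreadPoolExecutor; a sketch of the I/O-bound case using it follows, where the sleep stands in for an I/O wait (which, like real I/O, releases the GIL):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(n):
    time.sleep(0.05)   # stands in for an I/O operation that releases the GIL
    return n * n

start = time.time()
with ThreadPoolExecutor(max_workers=5) as pool:
    # map preserves argument order in its results, unlike the recipe's queue
    results = list(pool.map(fetch, range(10)))
elapsed = time.time() - start
# With 5 workers and 10 tasks of ~0.05s each, the pool needs roughly two
# "waves" (~0.1s), well under the ~0.5s that sequential execution would take.
print(results)
```

Exiting the with block implicitly joins all worker threads, so no explicit start/join bookkeeping is needed.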


See Also


Library Reference and Python in a
Nutshell
docs on modules threading and
Queue.
