Recipe 19.20. Running an Iterator in Another Thread
Credit: Garth Kidd
Problem
You want to run the code of a generator (or any other iterator) in
its own separate thread, so that the iterator's code
won't block your main thread even if it contains
time-consuming operations, such as blocking calls to the operating
system.
Solution
This task is best tackled by wrapping a subclass of
threading.Thread around the iterator:
import sys, threading, Queue
class SpawnedGenerator(threading.Thread):
    def __init__(self, iterable, queueSize=0):
        threading.Thread.__init__(self)
        self.iterable = iterable
        self.queueSize = queueSize
    def stop(self):
        "Ask iteration to stop as soon as feasible"
        self.stopRequested = True
    def run(self):
        "Thread.start runs this code in another, new thread"
        put = self.queue.put
        try:
            next = iter(self.iterable).next
            while True:
                # report each result, propagate StopIteration
                put((False, next()))
                if self.stopRequested:
                    raise StopIteration
        except:
            # report any exception back to main thread and finish
            put((True, sys.exc_info()))
    def execute(self):
        "Yield the results that the 'other', new thread is obtaining"
        self.queue = Queue.Queue(self.queueSize)
        get = self.queue.get
        self.stopRequested = False
        self.start()            # executes self.run() in other thread
        while True:
            iterationDone, item = get()
            if iterationDone: break
            yield item
        # propagate any exception (unless it's just a StopIteration)
        exc_type, exc_value, traceback = item
        if not issubclass(exc_type, StopIteration):
            raise exc_type, exc_value, traceback
    def __iter__(self):
        "Return an iterator for our executed self"
        return iter(self.execute())
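The Solution's listing is written for Python 2: it relies on the Queue module, on the iterator's next method, and on the three-argument raise statement. As a rough sketch only, and not part of the original recipe, here is how the same class might be adapted to Python 3, followed by a short usage example. The adaptation makes a few assumptions of its own: the worker reports an exception instance rather than a sys.exc_info() tuple, it signals normal exhaustion with None, and stopRequested is initialized in __init__.

```python
import itertools
import queue
import threading


class SpawnedGenerator(threading.Thread):
    """Python 3 adaptation of the recipe's class (an assumption, not the original)."""

    def __init__(self, iterable, queueSize=0):
        super().__init__()
        self.iterable = iterable
        self.queueSize = queueSize
        self.stopRequested = False

    def stop(self):
        "Ask iteration to stop as soon as feasible"
        self.stopRequested = True

    def run(self):
        "Runs in the new thread: feed (done, payload) pairs to the queue"
        put = self.queue.put
        try:
            for item in self.iterable:
                put((False, item))
                if self.stopRequested:
                    break
            put((True, None))        # normal end of iteration
        except Exception as exc:
            put((True, exc))         # report the failure to the caller

    def execute(self):
        "Runs in the calling thread: yield what the other thread obtains"
        self.queue = queue.Queue(self.queueSize)
        get = self.queue.get
        self.start()
        while True:
            done, payload = get()
            if done:
                break
            yield payload
        if payload is not None:
            raise payload            # re-raise the worker's exception here

    def __iter__(self):
        return self.execute()


def failing():
    "Hypothetical iterator that fails after its first item"
    yield 1
    raise ValueError("boom")


# Plain iteration; the worker may run at most two items ahead of this loop.
squares = list(SpawnedGenerator((n * n for n in range(5)), queueSize=2))

# Asking an endless iterator to stop once enough items have arrived.
spawned = SpawnedGenerator(itertools.count())
taken = []
for value in spawned:
    taken.append(value)
    if len(taken) == 3:
        spawned.stop()   # the worker notices the flag after its next put
        break
```

Here squares ends up as [0, 1, 4, 9, 16] and taken as [0, 1, 2]. Iterating over SpawnedGenerator(failing()) yields 1 and then raises the ValueError in the calling thread.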
Discussion
Generators (and other iterators) are a great way to package the logic
that controls an iteration and obtains the next value to feed into a
loop's body. The code of a generator (and,
equivalently, the code of the next method of
another kind of iterator) usually runs in the same thread as the code
that's iterating on it. The
"calling" code can therefore
block, each time around the loop, while waiting
for the generator's code to do its job.

Sometimes, you want to use a generator (or other kind of iterator) in
a "non-blocking" way, which means
you need to arrange things so that the generator's
body runs in a new, separate thread. This recipe's
SpawnedGenerator class supplies exactly this kind of
functionality: it subclasses threading.Thread and uses
Thread's start/run mechanism to ensure
the generator's body always executes in a separate
thread from that of the calling code.

All communication between the two threads occurs through a single
instance of the Queue.Queue class. Each of the two communicating
methods (the generator named execute, which runs in the calling
thread, and the method named run, which runs in a separate thread)
holds one of that queue's bound methods in a local variable:
get in execute and put in run. The "calling"
code may also call method stop on the
SpawnedGenerator instance to ask for the iteration
to stop as soon as feasible. Optionally, you may also specify a queue
size when you instantiate SpawnedGenerator, if you
want to limit how far ahead of the calling thread the spawned thread
can get.

The main use case for this recipe is for wrapping iterators that make
blocking calls to the operating system (e.g., walking a directory
tree), when you need to use such iterators in an application where
the "main" thread cannot be allowed
to block for a long time. The typical examples of applications whose
main thread must not block are event-driven applications, a
description that applies to applications with a GUI, as well as to
networking applications built on asynchronous frameworks, such as
Twisted or the asyncore module of the Python
Standard Library.
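To make the event-driven use case concrete, here is a minimal Python 3 sketch in which a directory walk runs in a worker thread and the main thread collects results only in short, non-blocking polls. It is a simplified variation on the recipe's idea, not the recipe's own code: the helper names spawn and poll and the _DONE sentinel are hypothetical, the loop in demo merely stands in for the timer or idle callback a real GUI or asynchronous framework would provide, and error reporting from the worker is omitted for brevity.

```python
import os
import queue
import tempfile
import threading

_DONE = object()  # hypothetical sentinel marking the end of iteration


def spawn(iterable, maxsize=0):
    "Consume `iterable` in a worker thread; return the queue it feeds"
    q = queue.Queue(maxsize)

    def worker():
        try:
            for item in iterable:
                q.put(item)
        finally:
            q.put(_DONE)  # always tell the consumer that we are finished

    threading.Thread(target=worker, daemon=True).start()
    return q


def poll(q, limit=10):
    "Fetch up to `limit` ready items without blocking: (items, finished)"
    items = []
    for _ in range(limit):
        try:
            item = q.get_nowait()
        except queue.Empty:
            break             # nothing ready yet; return to the event loop
        if item is _DONE:
            return items, True
        items.append(item)
    return items, False


def demo():
    "Walk a temporary directory tree while polling from the main thread"
    with tempfile.TemporaryDirectory() as root:
        for name in ("a.txt", "b.txt"):
            open(os.path.join(root, name), "w").close()
        q = spawn(os.walk(root))
        found, finished = [], False
        while not finished:   # each pass stands in for one event-loop tick
            items, finished = poll(q)
            for dirpath, dirnames, filenames in items:
                found.extend(filenames)
        return sorted(found)
```

Calling demo() returns ['a.txt', 'b.txt']. In a real application, each call to poll would be made from a periodic callback, so the main thread handles other events between polls.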
See Also
Library Reference and Python in a
Nutshell docs about modules threading
and asyncore; Twisted is at http://www.twistedmatrix.com/; Chapter 9 for general issues about threading; Chapter 11 for general issues about user interfaces;
Chapter 13 and Chapter 14
for general issues about network and web programming, including
asynchronous approaches to such programs.