Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] - نسخه متنی

David Ascher, Alex Martelli, Anna Ravenscroft

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید







Recipe 19.13. Fetching Large Record Sets from a Database with a Generator


Credit: Christopher Prinos


Problem


You want to fetch a result set
from a database (using the Python DB API) and easily iterate over
each record in the result set. However, you don't
want to use the DB cursor's method
fetchall: it could consume a lot of memory and
would force you to wait until the whole result set comes back before
you can start iterating.


Solution


A generator is the ideal solution to this problem:

def fetchsome(cursor, arraysize=1000):
''' A generator that simplifies the use of fetchmany '''
while True:
results = cursor.fetchmany(arraysize)
if not results: break
for result in results:
yield result


Discussion


In applications that use the Python DB API, you often see code that
goes somewhat like (where cursor is a DB API cursor
object):

cursor.execute('select * from HUGE_TABLE')
for result in cursor.fetchall( ):
doSomethingWith(result)

This simple approach is "just"
fine, as long as fetchall returns a small result
set, but it does not work very well if the query result is very
large. A large result set can take a long time to return. Also,
cursor.fetchall( ) needs to allocate enough memory
to store the entire result set in memory at once. Further, with this
simple approach, the doSomethingWith function
isn't going to get called until the entire
query's result finishes coming over from the
database to our program.

An alternative approach is to rely on the
cursor.fetchone method:

for result in iter(cursor.fetchone, None):
doSomethingWith(result)

However, this alternative approach does not allow the database to
optimize the fetching process: most databases can exhibit better
efficiency when returning multiple records for a single query, rather
than returning records one at a time as fetchone
requires.

To let your applications obtain greater efficiency than
fetchone allows, without the risks of unbounded
memory consumption and delay connected to the use of
fetchall, Python's DB
API's cursors also have a
fetchmany method. However, the direct use of
fetchmany makes your iterations somewhat more
complicated than the simple for statements such as
those just shown. For example:

while True:
results = cursor.fetchmany(1000)
if not results: break
for result in results:
doSomethingWith(result)

Python's generators are a great way to encapsulate
complicated iteration logic so that application code can just about
always loop with simple for statements. With this
recipe's fetchsome generator, you
get the same efficiencies and safeguards as with the native use of
the fetchmany method in the preceding snippet but
with the same crystal-clear simplicity as in the snippets that used
either fetchall or fetchone,
namely:

for result in fetchsome(cursor):
doSomethingWith(result)

By default, fetchsome fetches up to 1,000 records at
a time, but you can change that number, depending on your
requirements. Optimal values can depend on schema, database type,
choice of Python DB API module. In general, you're
best advised to experiment with a few different values in your
specific settings if you need to optimize this specific aspect. (Such
experimentation is often a good idea for any optimization task.)

This recipe is clearly an example of a more general case: a
subsequence unbuncher generator that you can use
when you have a sequence of subsequences (each subsequence being
obtained through some call, and the end of the whole sequence being
indicated by an empty subsequence) and want to flatten it into a
simple, linear sequence of items. You can think of this unbunching
task as the reverse of the sequence-bunching tasks covered earlier in
Recipe 19.10 and Recipe 19.11, or as a simpler
variant of the sequence-flattening task covered in Recipe 4.6. A generator for unbunching
might be:

def unbunch(next_subseq, *args):
''' un-bunch a sequence of subsequences into a linear sequence '''
while True:
subseq = next_subseq(*args)
if not subseq: break
for item in subseq:
yield item

As you can see, the structure of unbunch is
basically identical to that of the recipe's
fetchsome. Usage would also be just about the same:

for result in unbunch(cursor.fetchmany, 1000):
doSomethingWith(result)

However, while it is important and instructive to consider this kind
of generalization, when you're writing applications
you're often better off using specific generators
that directly deal with your application's specific
needs. In this case, for example, calling
fetchsome(cursor) is more obvious and direct than
calling unbunch(cursor.fetchmany, 1000), and
fetchsome usefully hides the usage of
fetchmany as well as the specific choice of
1,000 as the subsequence size to fetch at each
step.


See Also


Recipe 19.10; Recipe 19.11; Recipe 4.6; Python's
DB API is covered in Chapter 7 and in
Python in a Nutshell.


/ 394