Python Cookbook 2nd Edition Jun 1002005 [Electronic resources], text version


David Ascher, Alex Martelli, Anna Ravenscroft


Recipe 7.8. Using the Berkeley DB Database


Credit: Farhad Fouladi


Problem



You want to persist some
data, exploiting the simplicity and good performance of the Berkeley
DB database library.


Solution


If you have previously installed Berkeley DB on your machine, the
Python Standard Library comes with package bsddb (and optionally
bsddb3, to access Berkeley DB release 3.2 databases) to interface
your Python code with Berkeley DB. To get either bsddb or, lacking
it, bsddb3, use a try/except on import:

try:
    from bsddb import db                  # first try release 4
except ImportError:
    from bsddb3 import db                 # not there, try release 3 instead
print db.DB_VERSION_STRING
# emits, e.g.: Sleepycat Software: Berkeley DB 4.1.25: (December 19, 2002)

To create a database, instantiate a db.DB object,
then call its method open with appropriate
parameters, such as:

adb = db.DB( )
adb.open('db_filename', dbtype=db.DB_HASH, flags=db.DB_CREATE)

db.DB_HASH is just one of several access methods you may choose when
you create a database; a popular alternative is db.DB_BTREE, which
uses B+tree access (handy if you need to get records in sorted
order). You may make an in-memory database, without an underlying
file for persistence, by passing None instead of a filename as the
first argument to the open method.

Once you have an open instance of db.DB, you can
add records, each composed of two strings, key and
data:

for i, w in enumerate('some words for example'.split( )):
    adb.put(w, str(i))

You can access records via a cursor on the database:

def irecords(curs):
    record = curs.first( )
    while record:
        yield record
        record = curs.next( )
for key, data in irecords(adb.cursor( )):
    print 'key=%r, data=%r' % (key, data)
# emits (the order may vary):
# key='some', data='0'
# key='example', data='3'
# key='words', data='1'
# key='for', data='2'

When you're done, you close the database:

adb.close( )

At any future time, in the same or another Python program, you can
reopen the database by giving just its filename as the argument to
the open method of a newly created
db.DB instance:

the_same_db = db.DB( )
the_same_db.open('db_filename')

and work on it again in the same ways:

the_same_db.put('skidoo', '23')          # add a record
the_same_db.put('words', 'sweet')        # replace a record
for key, data in irecords(the_same_db.cursor( )):
    print 'key=%r, data=%r' % (key, data)
# emits (the order may vary):
# key='some', data='0'
# key='example', data='3'
# key='words', data='sweet'
# key='for', data='2'
# key='skidoo', data='23'

Again, remember to close the database when you're
done:

the_same_db.close( )


Discussion


The Berkeley DB is a popular open source database. It does not
support SQL, but it's simple to use, offers
excellent performance, and gives you a lot of control over exactly
what happens, if you care to exert it, through a huge array of
options, flags, and methods. Berkeley DB is just as accessible from
many other languages as from Python: for example, you can perform
some changes or queries with a Python program, and others with a
separate C program, on the same database file, using the same
underlying open source library that you can freely download from
Sleepycat.

The Python Standard Library shelve module can use
the Berkeley DB as its underlying database engine, just as it uses
cPickle for serialization. However,
shelve does not let you take advantage of the
ability to access a Berkeley DB database file from several different
languages, exactly because the records are strings produced by
pickle.dumps, and languages other than Python
can't easily deal with them. Accessing the Berkeley
DB directly with bsddb also gives you access to
many advanced functionalities of the database engine that
shelve simply doesn't support.
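
To make the point about pickled records concrete, here is a minimal sketch, not the recipe's own code. It assumes Python 3 rather than the Python 2 used in the recipe, and it uses the standard library's dbm.dumb module as a stand-in for Berkeley DB (bsddb is not part of the Python 3 standard library). The function name shelve_roundtrip and the file name shelf_demo are hypothetical. The sketch stores one object through shelve, then reopens the same files without shelve to look at the record that was actually written:

```python
import dbm.dumb
import os
import pickle
import shelve
import tempfile

def shelve_roundtrip(path):
    """Store one object via shelve; return (unpickled_object, raw_bytes)."""
    # shelve.Shelf wraps any dict-like database object.
    # dbm.dumb is only a stand-in for Berkeley DB in this sketch.
    shelf = shelve.Shelf(dbm.dumb.open(path, 'c'))
    try:
        shelf['point'] = {'x': 1, 'y': 2}
    finally:
        shelf.close()          # also closes the underlying database

    # Reopen the same files without shelve to see the stored record.
    raw_db = dbm.dumb.open(path, 'r')
    try:
        raw = raw_db[b'point']  # shelve encodes string keys as UTF-8 bytes
    finally:
        raw_db.close()
    return pickle.loads(raw), raw

workdir = tempfile.mkdtemp()
obj, raw = shelve_roundtrip(os.path.join(workdir, 'shelf_demo'))
print(obj)        # {'x': 1, 'y': 2}
print(raw[:1])    # b'\x80', the marker byte that opens a binary pickle
```

The raw record is a binary pickle rather than plain text, so a program written in another language would need its own pickle decoder to make sense of it.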


A Database, or pickle . . . or Both?


The use cases for pickle or
marshal, and those for databases such as Berkeley
DB or relational databases, are rather different, though they do
overlap somewhat.

pickle
(and marshal even more
so) is essentially about serialization: you turn Python objects into
BLOBs that you may transmit or store, and later receive or retrieve.
Data thus serialized is meant to be reloaded into Python objects,
basically only by Python applications. pickle has
nothing to say about searching or selecting specific objects or parts
of them.


Databases (Berkeley DB,
relational DBs, and other kinds yet) are essentially about data: you
save and retrieve groupings of elementary data (strings and numbers,
mostly), with a lot of support for selecting and searching (a
huge lot, for relational databases) and
cross-language support. Databases have nothing to say about
serializing Python objects into data, nor about deserializing Python
objects back from data.

The two approaches, databases and serialization, can even be used
together. You can serialize Python objects into strings of bytes with
pickle, and store those bytes using a database, and vice versa at
retrieval time. At a very elementary level, that's what the standard
Python library shelve module does, for example, with pickle to
serialize and deserialize and generally bsddb as the underlying
simple database engine. So, don't think of the two approaches as
being "in competition" with each other; rather, think of them as
completing and complementing each other!
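
As a rough illustration of combining the two approaches by hand, the following sketch pickles an object into bytes, stores the bytes under a key in a simple key/data database, and reverses the steps at retrieval time. It assumes Python 3 and uses the standard library's dbm.dumb module purely as a stand-in for Berkeley DB. The helpers save_object and load_object and the file name objects_demo are hypothetical names invented for this example:

```python
import dbm.dumb
import os
import pickle
import tempfile

def save_object(database, key, obj):
    """Serialize obj with pickle and store the bytes under key."""
    database[key] = pickle.dumps(obj)

def load_object(database, key):
    """Fetch the bytes stored under key and rebuild the Python object."""
    return pickle.loads(database[key])

path = os.path.join(tempfile.mkdtemp(), 'objects_demo')
database = dbm.dumb.open(path, 'c')   # stand-in for a Berkeley DB file
try:
    save_object(database, 'recipe', {'number': '7.8', 'topic': 'Berkeley DB'})
    database['plain'] = 'just text'   # elementary data needs no pickling
    restored = load_object(database, 'recipe')
    plain = database['plain']         # comes back as bytes
finally:
    database.close()

print(restored)   # {'number': '7.8', 'topic': 'Berkeley DB'}
print(plain)      # b'just text'
```

Records stored as plain strings stay readable from other languages, while the pickled record is meaningful mainly to Python; the same database file can hold both kinds.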

For example, creating a database with an access method of
db.DB_HASH, as shown in the recipe, may give
maximum performance, but, as you'll have noticed
when listing all records with the generator irecords
that is also presented in the recipe, hashing puts records in
apparently random, unpredictable order. If you need to access records
in sorted order, you can use an access method of
db.DB_BTREE instead. Berkeley DB also supports
more advanced functionality, such as transactions, which you can
enable through direct access but not via anydbm or
shelve.

For detailed documentation about all functionality of the Python
Standard Library bsddb package, see http://pybsddb.sourceforge.net/bsddb3l.
For documentation, downloads, and more of the Berkeley DB itself, see
http://www.sleepycat.com/.


See Also


Library Reference and Python in a Nutshell docs for modules anydbm,
shelve, and bsddb; http://pybsddb.sourceforge.net/bsddb3l for many
more details about bsddb and bsddb3; http://www.sleepycat.com/ for
downloads of, and very detailed documentation on, the Berkeley DB
itself.

