Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] - نسخه متنی

David Ascher, Alex Martelli, Anna Ravenscroft

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید


Recipe 7.1. Serializing Data Using the marshal Module


Credit: Luther Blissett


Problem




You want to serialize and reconstruct a
Python data structure whose items are fundamental Python objects
(e.g., lists, tuples, numbers, and strings but no classes, instances,
etc.) as fast as possible.


Solution


If you know that your data is composed entirely of fundamental Python
objects (and you only need to support one version of Python, though
possibly on several different platforms), the lowest-level, fastest
approach to serializing your data (i.e., turning it into a string of
bytes, and later reconstructing it from such a string) is via the
marshal module. Suppose that
data has only elementary Python data types as
items, for example:

data = {12:'twelve', 'feep':list('ciao'), 1.23:4+5j, (1,2,3):u'wer'}

You can serialize data to a bytestring at top
speed as follows:

import marshal
bytes = marshal.dumps(data)

You can now sling bytes around as you wish (e.g.,
send it across a network, put it as a BLOB in a database, etc.), as
long as you keep its arbitrary binary bytes intact. Then you can
reconstruct the data structure from the bytestring at any time:

redata = marshal.loads(bytes)

When you specifically want to write the data to a disk file (as long
as the latter is open for binarynot the default text
modeinput/output), you can also use the
dump function of the marshal
module, which lets you dump several data structures to the same file
one after the other:

ouf = open('datafile.dat', 'wb')
marshal.dump(data, ouf)
marshal.dump('some string', ouf)
marshal.dump(range(19), ouf)
ouf.close( )

You can later recover from datafile.dat the same
data structures you dumped into it, in the same sequence:

inf = open('datafile.dat', 'rb')
a = marshal.load(inf)
b = marshal.load(inf)
c = marshal.load(inf)
inf.close( )


Discussion


Python offers several ways to serialize data (meaning to turn the
data into a string of bytes that you can save on disk, put in a
database, send across the network, etc.) and corresponding ways to
reconstruct the data from such serialized forms. The lowest-level
approach is to use the marshal module, which
Python uses to write its bytecode files. marshal
supports only elementary data types (e.g., dictionaries, lists,
tuples, numbers, and strings) and combinations thereof.
marshal does not guarantee compatibility from one
Python release to another, so data serialized with
marshal may not be readable if you upgrade your
Python release. However, marshal does guarantee
independence from a specific machine's architecture,
so it is guaranteed to work if you're sending
serialized data between different machines, as long as they are all
running the same version of Pythonsimilar to how you can share
compiled Python bytecode files in such a distributed setting.

marshal's
dumps function accepts any suitable Python data
structure and returns a bytestring representing it. You can pass that
bytestring to the loads function, which will
return another Python data structure that compares equal
(==) to the one you originally dumped. In
particular, the order of keys in dictionaries is arbitrary in both
the original and reconstructed data structures, but order in any kind
of sequence is meaningful and is thus preserved. In between the
dumps and loads calls, you can
subject the bytestring to any procedure you wish, such as sending it
over the network, storing it into a database and retrieving it, or
encrypting and decrypting it. As long as the
string's binary structure is correctly restored,
loads will work fine on it (as stated previously,
this is guaranteed only if you use loads under the
same Python release with which you originally executed
dumps).

When you specifically need to save the data to a file, you can also
use marshal's
dump function, which takes two arguments: the data
structure you're dumping and the open file object.
Note that the file must be opened for binary I/O (not the default,
which is text I/O) and can't be a file-like object,
as marshal is quite picky about it being a true
file. The advantage of dump is that you can
perform several calls to dump with various data
structures and the same open file object: each data structure is then
dumped together with information about how long the dumped bytestring
is. As a consequence, when you later open the file for binary reading
and then call marshal.load, passing the file as
the argument, you can reload each previously dumped data structure
sequentially, one after the other, at each call to
load. The return value of load,
like that of loads, is a new data structure that
compares equal to the one you originally dumped. (Again,
dump and load work within one
Python releaseno guarantee across releases.)

Those accustomed to other languages and libraries offering
"serialization" facilities may be
wondering if marshal imposes substantial practical
limits on the size of objects you can serialize
or deserialize. Answer: Nope. Your machine's memory
might, but as long as everything fits comfortably in memory,
marshal imposes practically no further limit.


See Also


Recipe 7.2 for
cPickle, the big brother of
marshal; documentation on the
marshal standard library module in the
Library Reference and in Python in
a Nutshell
.

/ 394