Recipe 7.6. Pickling Code Objects
Credit: Andres Tremols, Peter Cogolo
Problem
You want to be able to pickle code objects, but this functionality is
not supported by the standard library's pickling
modules.
Solution
You can extend the abilities of the pickle (or
cPickle) module by using module
copy_reg. Just make sure the following module has
been imported before you pickle code objects, and has been imported,
or is available to be imported, when you're
unpickling them:
import new, types, copy_reg
def code_ctor(*args):
    # delegate to new.code the construction of a new code object
    return new.code(*args)
def reduce_code(co):
    # a reductor function must return a tuple with two items: first, the
    # constructor function to be called to rebuild the argument object
    # at a future de-serialization time; then, the tuple of arguments
    # that will need to be passed to the constructor function.
    if co.co_freevars or co.co_cellvars:
        raise ValueError("Sorry, cannot pickle code objects from closures")
    return code_ctor, (co.co_argcount, co.co_nlocals, co.co_stacksize,
        co.co_flags, co.co_code, co.co_consts, co.co_names,
        co.co_varnames, co.co_filename, co.co_name, co.co_firstlineno,
        co.co_lnotab)
# register the reductor to be used for pickling objects of type 'CodeType'
copy_reg.pickle(types.CodeType, reduce_code)
if __name__ == '__main__':
    # example usage of our new ability to pickle code objects
    import cPickle
    # a function (which, inside, has a code object, of course)
    def f(x): print 'Hello,', x
    # serialize the function's code object to a string of bytes
    pickled_code = cPickle.dumps(f.func_code)
    # recover an equal code object from the string of bytes
    recovered_code = cPickle.loads(pickled_code)
    # build a new function around the rebuilt code object
    g = new.function(recovered_code, globals())
    # check what happens when the new function gets called
    g('world')
Discussion
The Python Standard Library pickle module (just
like its faster equivalent cousin cPickle) pickles
functions and classes by name. There is no pickling of the
code objects containing the compiled bytecode
that, when run, determines almost every aspect of
functions' (and methods') behavior.
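A minimal sketch of what "by name" means in practice. It assumes Python 2.6 or later (or Python 3), so it spells the code-object attribute `__code__` rather than `func_code`, and it uses `json.dumps` purely as a convenient standard-library function to pickle; neither choice comes from the recipe.

```python
import json
import pickle

# Pickle a module-level function from the standard library.
data = pickle.dumps(json.dumps)

# The pickle records where to find the function (module and name),
# not the bytecode that makes the function do its work.
stores_name = b'json' in data and b'dumps' in data
stores_bytecode = json.dumps.__code__.co_code in data

# Unpickling imports the module again and looks the name up,
# so we get back the very same function object.
same_object = pickle.loads(data) is json.dumps

# With no reduction function registered for code objects,
# an attempt to pickle one is refused.
try:
    pickle.dumps(json.dumps.__code__)
    code_pickled = True
except (TypeError, pickle.PicklingError):
    code_pickled = False
```

This is why a by-name pickle is only usable where the defining module can be imported at unpickling time.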
In some situations, you'd rather pickle everything
by value, so that all the relevant stuff can later be retrieved from
the pickle, rather than having to have module files around for some
of it. Sometimes you can solve such problems by using marshaling
rather than pickling, since marshal
does let you serialize code objects, but
marshal has many other limitations. For
example, you cannot marshal instances of classes you have coded.
(Once you're serializing code objects, which are
specific to a given version of Python, pickle will
share one key limitation of marshal: no guaranteed
ability to save and later reload data across different versions of
Python.)
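The trade-off just described can be sketched as follows. This is only an illustration, assuming Python 2.6 or later (or Python 3): the function `greet` and the class `Point` are hypothetical names invented for the example, and `types.FunctionType` stands in for `new.function`.

```python
import marshal
import types

def greet(name):
    return 'Hello, ' + name

# marshal serializes the code object itself, by value...
blob = marshal.dumps(greet.__code__)
restored = types.FunctionType(marshal.loads(blob), globals())
result = restored('world')

# ...but it refuses instances of classes you have coded.
class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

try:
    marshal.dumps(Point(1, 2))
    instance_marshaled = True
except ValueError:
    instance_marshaled = False
```

The marshaled bytes are tied to the Python version that produced them, which is the same limitation the parenthetical above points out for pickled code objects.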
An alternative approach is to take advantage of the possibility,
which the Python Standard Library allows, to extend the set of types
known to pickle. Basically, you can
"teach" pickle
how to save and reload code objects; this, in turn, lets you pickle
by value, rather than "by name",
such objects as functions and classes. (The code in this
recipe's Solution under the if __name__
== '__main__' guard essentially shows
how to extend pickle for a function.)
To teach pickle about some new type, use module
copy_reg, which is also part of the Python
Standard Library. Through function
copy_reg.pickle, you register the reduction
function to use for instances of a given type. A reduction function
takes as its argument an instance to be pickled and returns a tuple
with two items: a constructor function, which will be called to
reconstruct the instance, and a tuple of arguments, which will be
passed to the constructor function. (A reduction function may also
return other kinds of results, but for this recipe's
purposes a two-item tuple suffices.)
The module in this recipe defines function
reduce_code, then registers it as the reduction
function for objects of type
types.CodeType, that is, code objects. When
reduce_code gets called, it first checks whether its
code object co comes from a
closure (functions nested inside each other),
because it just can't deal with this
eventuality. I've been unable to find a way
that works, so in this case, reduce_code just raises
an exception to let the user know about the problem.
In normal cases, reduce_code returns
code_ctor as the constructor and a tuple made up of
all of co's attributes as the
arguments tuple for the constructor. When a code object is reloaded
from a pickle, code_ctor gets called with those
arguments and simply passes the call on to the
new.code callable, which is the
true constructor for code objects.
Unfortunately, reduce_code cannot return
new.code itself as the first item in its result
tuple, because new.code is a built-in (a C-coded
callable) but is not available through a built-in
name. So, basically, the role of
code_ctor is to provide a name for the (by-name)
pickling of new.code.
The if __name__ == '__main__' part of the
recipe provides a typical toy usage example: it pickles a code
object to a string, recovers a copy of it from the pickle string, and
builds and calls a function around that code object. A more typical
use case for this recipe's functionality, of course,
will do the pickling in one script and the unpickling in another.
Assume that the module in this recipe has been saved as file
reco.py somewhere on Python's
sys.path, so that it can be imported by Python
scripts and other modules. You could then have a script that imports
reco and thus becomes able to pickle code objects,
such as:
import reco, pickle
def f(x):
    print 'Hello,', x
pickle.dump(f.func_code, open('saved.pickle','wb'))
To unpickle and use that code object, an example script might be:
import new, cPickle
c = cPickle.load(open('saved.pickle','rb'))
g = new.function(c, globals())
g('world')
Note that the second script does not need to import
reco; the import will happen
automatically when needed (part of the information that
pickle saves in saved.pickle
is that, in order to reconstruct the pickled object therein, it needs
to call reco.code_ctor; so, it also knows it needs
to import reco). I'm also showing
that you can use modules pickle and
cPickle interchangeably. cPickle
is faster, but there are no other differences, and in particular, you
can use one module to pickle objects and the other one to unpickle
them, if you wish.
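The reduction-function protocol discussed above can also be tried on interpreters where module new is not available. The following is a sketch of a different technique from the recipe's, not a port of it: instead of passing the code object's attributes to new.code, the reduction function delegates to marshal, whose loads function is already reachable by name. The name reduce_code_via_marshal is hypothetical, and the try/except import assumes only that Python 3 renamed copy_reg to copyreg.

```python
import marshal
import pickle
import types

try:
    import copy_reg              # Python 2 name, as used in this recipe
except ImportError:
    import copyreg as copy_reg   # Python 3 renamed the module

def reduce_code_via_marshal(co):
    # Two-item result: a constructor that pickle can save by name
    # (marshal.loads), and the tuple of arguments to call it with.
    return marshal.loads, (marshal.dumps(co),)

# Register the reduction function for code objects (process-wide).
copy_reg.pickle(types.CodeType, reduce_code_via_marshal)

def f(x):
    return 'Hello, ' + x

# Round-trip the code object and wrap it in a new function.
pickled_code = pickle.dumps(f.__code__)
recovered_code = pickle.loads(pickled_code)
g = types.FunctionType(recovered_code, globals())
result = g('world')

# The pickle names the constructor; that is how the unpickling side
# knows which module to import.
names_constructor = b'marshal' in pickled_code and b'loads' in pickled_code
```

Because the payload is marshal data, the resulting pickle is tied to the Python version that wrote it, and the registration affects every pickler in the process that uses the shared copy_reg table.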
See Also
Modules pickle, cPickle, and
copy_reg in the Library
Reference and Python in a
Nutshell.