Recipe 4.1. Copying an Object
Credit: Anna Martelli Ravenscroft, Peter Cogolo
Problem
You
want to copy an object. However, when you assign an object, pass it
as an argument, or return it as a result, Python uses a reference to
the original object, without making a copy.
Solution
Module
copy in the standard Python library offers two
functions to create copies. The one you should generally use is the
function named copy, which returns a new object
containing exactly the same items and attributes as the object
you're copying:
import copyOn the rare occasions when you also want every item and attribute in
new_list = copy.copy(existing_list)
the object to be separately copied, recursively, use
deepcopy:
import copy
new_list_of_dicts = copy.deepcopy(existing_list_of_dicts)
Discussion
When you assign an object (or pass it as an argument, or return it as
a result), Python (like Java) uses a reference to the original
object, not a copy. Some other programming languages make copies
every time you assign something. Python never makes copies
"implicitly" just because
you're assigning: to get a copy, you must
specifically request a copy.Python's behavior is simple, fast, and uniform.
However, if you do need a copy and do not ask for one, you may have
problems. For example:
>>> a = [1, 2, 3]Here, the names a and
>>> b = a
>>> b.append(5)
>>> print a, b
[1, 2, 3, 5] [1, 2, 3, 5]
b both refer to the same object (a list),
so once we alter the object through one of these names, we later see
the altered object no matter which name we use for it. No original,
unaltered copy is left lying about anywhere.
|
original object unaltered, you must make a copy. As this
recipe's solution explains, the module
copy from the Python Standard Library offers two
functions to make copies. Normally, you use
copy.copy, which makes a shallow
copyit copies an object, but for each attribute or
item of the object, it continues to share references, which is faster
and saves memory.Shallow copying, alas, isn't sufficient to entirely
"decouple" a copied object from the
original one, if you propose to alter the items or attributes of
either object, not just the object itself:
>>> list_of_lists = [ ['a'], [1, 2], ['z', 23] ]Here, the names list_of_lists and
>>> copy_lol = copy.copy(lists_of_lists)
>>> copy_lol[1].append('boo')
>>> print list_of_lists, copy_lol
[['a'], [1, 2, 'boo'], ['z', 23]] [['a'], [1, 2, 'boo'], ['z', 23]]
copy_lol refer to distinct objects (two lists), so
we could alter either of them without affecting the other. However,
each item of list_of_lists is
the same object as the corresponding item of
copy_lol, so once we alter an item reached by
indexing either of these names, we later see the altered item no
matter which object we're indexing to reach it.If you do need to copy some container object and
also recursively copy all objects it refers to (meaning all items,
all attributes, and also items of items, items of attributes, etc.),
use copy.deepcopysuch deep copying may cost
you substantial amounts of time and memory, but if you gotta, you
gotta. For deep copies, copy.deepcopy is the only
way to go.For normal shallow copies, you may have good alternatives to
copy.copy, if you know the type of the object you
want to copy. To copy a list L, call
list(L);
to copy a dict d, call dict(d);
to copy a set s (in Python 2.4, which
introduces the built-in type set), call
set(s). (Since list,
dict, and, in 2.4, set, are
built-in names, you do not need to perform any
"preparation" before you use any of
them.) You get the general pattern: to copy a copyable object
o, which belongs to some built-in Python
type t, you may generally just call
t(o). dicts also offer a
dedicated method to perform a shallow copy: d.copy(
) and dict(d) do the same thing. Of the
two, I suggest you use dict(d):
it's more uniform with respect to other types, and
it's even shorter by one character!To copy instances of arbitrary types or classes, whether you coded
them or got them from a library, just use
copy.copy. If you code your own classes,
it's generally not worth the bother to define your
own copy or clone method. If you
want to customize the way instances of your class get (shallowly)
copied, your class can supply a special method _ _copy_
_ (see Recipe 6.9 for a special technique
relating to the implementation of such a method), or special methods
_ _getstate_ _ and _ _setstate_
_. (See Recipe 7.4 for notes on these special
methods, which also help with deep copying and
serializationi.e., picklingof
instances of your class.) If you want to customize the way instances
of your class get deeply copied, your class can
supply a special method _ _deepcopy_ _ (see Recipe 6.9.)Note that you do not need to copy immutable objects (strings,
numbers, tuples, etc.) because you don't have to
worry about altering them. If you do try to perform such a copy,
you'll just get the original right back; no harm
done, but it's a waste of time and code. For
example:
>>> s = 'cat'The is operator checks whether two objects are not
>>> t = copy.copy(s)
>>> s is t
True
merely equal, but in fact the same object
(is checks for identity; for
checking mere equality, you use the
== operator). Checking object identity is not
particularly useful for immutable objects (we're
using it here just to show that the call to
copy.copy was useless, although innocuous).
However, checking object identity can sometimes be quite important
for mutable objects. For example, if you're not sure
whether two names a and
b refer to separate objects, or whether
both refer to the same object, a simple and very fast check
a is b lets you know how things stand. That way
you know whether you need to copy the object before altering it, in
case you want to keep the original object unaltered.
|
|
>>> d1 = { }
>>> for somekey in d:
... d1[somekey] = d[somekey]
|
See Also
Module copy in the Library
Reference and Python in a
Nutshell.