Recipe 4.13. Extracting a Subset of a Dictionary
Credit: David Benjamin
Problem
You want to extract from a larger
dictionary only that subset of it that corresponds to a certain set
of keys.
Solution
If you want to leave the original dictionary intact:
def sub_dict(somedict, somekeys, default=None):
    return dict([ (k, somedict.get(k, default)) for k in somekeys ])
If you want to remove from the original the items you're extracting:
def sub_dict_remove(somedict, somekeys, default=None):
    return dict([ (k, somedict.pop(k, default)) for k in somekeys ])
Two examples of these functions' use and effects:
>>> d = {'a': 5, 'b': 6, 'c': 7}
>>> print sub_dict(d, 'ab'), d
{'a': 5, 'b': 6} {'a': 5, 'b': 6, 'c': 7}
>>> print sub_dict_remove(d, 'ab'), d
{'a': 5, 'b': 6} {'c': 7}
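A further sketch of the same two functions shows what happens when one of the requested keys is absent from the dictionary. It avoids the print statement so that it also runs on Python 3; the key 'z' and the default value 0 are illustrative choices, not part of the original recipe.

```python
def sub_dict(somedict, somekeys, default=None):
    # Copy the requested items; the original dictionary is untouched.
    return dict([(k, somedict.get(k, default)) for k in somekeys])

def sub_dict_remove(somedict, somekeys, default=None):
    # Move the requested items out of the original dictionary.
    return dict([(k, somedict.pop(k, default)) for k in somekeys])

d = {'a': 5, 'b': 6, 'c': 7}

# 'z' is not a key of d, so it appears in the result with the default.
copied = sub_dict(d, ['a', 'z'])
copied_zero = sub_dict(d, ['a', 'z'], default=0)

# Removal behaves the same way; only keys actually present are removed.
moved = sub_dict_remove(d, ['a', 'z'])
```

In both cases the result contains an entry for 'z' even though the original dictionary never did.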
Discussion
In Python, I use dictionaries for many purposes: database rows,
primary and compound keys, variable namespaces for template parsing,
and so on. So, I often need to create a dictionary that is based on
another, larger dictionary, but only contains the subset of the
larger dictionary corresponding to some set of keys. In most use
cases, the larger dictionary must remain intact after the extraction;
sometimes, however, I need to remove from the larger dictionary the
subset that I'm extracting. This
recipe's solution shows both possibilities. The only
difference is that you use the get method when you
want to avoid affecting the dictionary that you are getting data
from, and the pop method when you want to remove the items
you're getting. If some item k of
somekeys is not in fact a key in
somedict, this recipe's
functions put k as a key in the result
anyway, with a default value (which I pass as an optional argument to
either function, with a default value of None).
So, the result is not necessarily a subset of
somedict. This behavior is the one
I've found most useful in my applications.
You might prefer to get an exception for "missing
keys"; that would help alert you to a bug in
your program, in cases in which you know all
ks in somekeys
should definitely also be keys in
somedict. Remember,
"errors should never pass silently. Unless
explicitly silenced," to quote The Zen of
Python, by Tim Peters (enter the statement
import this at an interactive Python prompt to
read or re-read this delightful summary of Python's
design principles). So, if a missing key is an error, from the point
of view of your application, then you do want to
get an exception that alerts you to that error at once, if it ever
occurs. If this is what you want, you can get it with minor
modifications to this recipe's functions:
def sub_dict_strict(somedict, somekeys):
    return dict([ (k, somedict[k]) for k in somekeys ])
def sub_dict_remove_strict(somedict, somekeys):
    return dict([ (k, somedict.pop(k)) for k in somekeys ])
As you can see, these strict variants are even simpler than the
originals, a good indication that Python
likes to raise exceptions when unexpected
behavior occurs!
Alternatively, you might prefer missing keys to be simply omitted
from the result. This, too, requires just minor modifications:
def sub_dict_select(somedict, somekeys):
    return dict([ (k, somedict[k]) for k in somekeys if k in somedict])
def sub_dict_remove_select(somedict, somekeys):
    return dict([ (k, somedict.pop(k)) for k in somekeys if k in somedict])
The if clause in each list comprehension does all
we need to distinguish these _select variants from
the _strict ones.
In Python 2.4, you can use
generator expressions, instead of list comprehensions, as the
arguments to dict in each of the functions shown
in this recipe. Just change the syntax of the calls to
dict, from dict([...]) to dict(...)
(removing the brackets adjacent to the
parentheses) and enjoy the resulting slight simplification and
acceleration. However, these variants would not work in Python 2.3,
which has list comprehensions but not generator expressions.
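As a minimal sketch of the variants discussed above, the strict and select forms can be written with generator expressions and exercised side by side. The syntax assumes Python 2.4 or later (it also runs on Python 3); the sample dictionary and the key 'z' are illustrative only.

```python
def sub_dict_strict(somedict, somekeys):
    # A missing key raises KeyError.
    return dict((k, somedict[k]) for k in somekeys)

def sub_dict_select(somedict, somekeys):
    # A missing key is silently omitted from the result.
    return dict((k, somedict[k]) for k in somekeys if k in somedict)

def sub_dict_remove_strict(somedict, somekeys):
    # Removes items as it goes; a missing key raises KeyError.
    return dict((k, somedict.pop(k)) for k in somekeys)

d = {'a': 5, 'b': 6, 'c': 7}

selected = sub_dict_select(d, ['a', 'z'])   # 'z' is dropped

try:
    sub_dict_strict(d, ['a', 'z'])
    strict_failed = False
except KeyError:
    strict_failed = True

# Caveat: the removing strict variant pops 'a' before it fails on 'z',
# so the original dictionary is left partially modified.
try:
    sub_dict_remove_strict(d, ['a', 'z'])
except KeyError:
    pass
remaining = sorted(d)
```

If a partially modified dictionary is unacceptable in your application, one option is to check that every key is present before removing anything.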
See Also
Library Reference and Python in a
Nutshell documentation on dict.