Recipe 4.11. Building a Dictionary Without Excessive Quoting
Credit: Brent Burley, Peter Cogolo
Problem
You want to construct a dictionary whose keys are literal strings,
without having to quote each key.
Solution
Once you get into the swing of Python, you'll find
yourself constructing a lot of dictionaries. When the keys are
identifiers, you can avoid quoting them by calling
dict with named-argument syntax:
data = dict(red=1, green=2, blue=3)
This is neater than the equivalent use of dictionary-display syntax:
data = {'red': 1, 'green': 2, 'blue': 3}
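As a quick check of the claim above, the following minimal sketch compares the two spellings and shows what happens when a key is not a valid identifier. The variable names (by_call, by_display, rejected) are illustrative only and not part of the recipe; the keyword case is probed with compile so that the example itself still runs.

```python
# Two spellings of the same dictionary: named arguments versus a display.
by_call = dict(red=1, green=2, blue=3)
by_display = {'red': 1, 'green': 2, 'blue': 3}

# Both produce equal dictionaries with plain string keys.
assert by_call == by_display

# A keyword such as 'for' cannot be used as an argument name, so the
# named-argument form is rejected when the source is compiled.
try:
    compile("dict(for=1)", "<example>", "eval")
except SyntaxError:
    rejected = True
else:
    rejected = False
assert rejected
```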
Discussion
One powerful way to build a dictionary is to call the built-in type
dict. It's often a good
alternative to the dictionary-display syntax with braces and colons.
This recipe shows that, by calling dict, you can
avoid having to quote keys, when the keys are literal strings that
happen to be syntactically valid for use as Python identifiers. You
cannot use this approach for keys such as the literal strings
'12ba' or 'for', because
'12ba' starts with a digit, and
for happens to be a Python keyword, not an
identifier.
Also, dictionary-display syntax is the only case in Python where you
need to use braces: if you dislike braces, or happen to work on a
keyboard that makes braces hard to reach (as all Italian layout
keyboards do!), you may be happier, for example, using dict() rather than {} to build an empty
dictionary.
Calling dict also gives you other possibilities.
dict(d) returns a new dictionary that is an
independent copy of existing dictionary d,
just like d.copy(), but
dict(d) works even when
d is a sequence of pairs (key,
value) instead of being a dictionary (when a
key occurs more than once in the sequence,
the last appearance of the key applies). A
common dictionary-building idiom is:
d = dict(zip(the_keys, the_values))
where the_keys is a sequence of keys and
the_values a
"parallel" sequence of
corresponding values. Built-in function zip builds
and returns a list of (key, value) pairs, and
built-in type dict accepts that list as its
argument and constructs a dictionary accordingly. If the sequences
are long, it's faster to use module
itertools from the standard Python
library:
import itertools
d = dict(itertools.izip(the_keys, the_values))
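The two idioms above, and the earlier remark that dict accepts any sequence of (key, value) pairs, can be sketched as follows. The sample data is hypothetical, and the try/except import is an addition so that the example also runs on Python 3, where itertools.izip no longer exists and the built-in zip is itself lazy.

```python
try:
    from itertools import izip  # Python 2, as used in this recipe
except ImportError:
    izip = zip  # Python 3: izip is gone and built-in zip is lazy

the_keys = ['red', 'green', 'blue']  # hypothetical sample data
the_values = [1, 2, 3]

# Both idioms build the same dictionary; only the intermediate differs.
d_zip = dict(zip(the_keys, the_values))
d_izip = dict(izip(the_keys, the_values))
assert d_zip == d_izip == {'red': 1, 'green': 2, 'blue': 3}

# dict also accepts a plain sequence of (key, value) pairs;
# when a key repeats, its last appearance applies.
pairs = [('red', 1), ('green', 2), ('red', 99)]
d_pairs = dict(pairs)
assert d_pairs == {'red': 99, 'green': 2}
```

The results are identical either way, so the choice between the two idioms is purely about speed and memory.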
Built-in function
zip constructs the whole list of pairs in memory,
while itertools.izip yields only one pair at a
time. On my machine, with sequences of 10,000 numbers, the latter
idiom is about twice as fast as the one using
zip: 18 versus 45 milliseconds with Python
2.3, 17 versus 32 with Python 2.4.
You can use both a positional argument and named arguments in the
same call to dict (if the named argument clashes
with a key specified in the positional argument, the named argument
applies). For example, here is a workaround for the previously
mentioned issue that Python keywords, and other nonidentifiers,
cannot be used as argument names:
d = dict({'12ba': 49, 'for': 23}, rof=41, fro=97, orf=42)
If you need to build a dictionary where the same value corresponds to
each key, call dict.fromkeys(keys_sequence, value)
(if you omit the value, it defaults to
None). For example, here is a neat way to
initialize a dictionary to be used for counting occurrences of
various lowercase ASCII letters:
import string
count_by_letter = dict.fromkeys(string.ascii_lowercase, 0)
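The last two techniques can be tried out with a small sketch like the one below. The sample text 'hello world' and the extra 'rof' entry in the positional dictionary are made up for illustration; they are there only to show a named argument overriding a clashing key and the counting dictionary in use.

```python
import string

# A named argument overrides the same key in the positional argument.
d = dict({'12ba': 49, 'for': 23, 'rof': 0}, rof=41)
assert d == {'12ba': 49, 'for': 23, 'rof': 41}

# Start every lowercase ASCII letter at zero, then count occurrences.
count_by_letter = dict.fromkeys(string.ascii_lowercase, 0)
for ch in 'hello world':       # hypothetical sample text
    if ch in count_by_letter:  # skips the space, which is not a key
        count_by_letter[ch] += 1

assert count_by_letter['l'] == 3
assert count_by_letter['o'] == 2
assert count_by_letter['z'] == 0
```

Because every letter is present from the start, the counting loop never needs to test for a missing key before incrementing.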
See Also
Library Reference and Python in a
Nutshell sections on built-ins dict
and zip, and on modules
itertools and string.