Recipe 18.5. Memoizing (Caching) the Return Values of Functions
Credit: Paul Moore, Mitch Chapman, Hannu Kankaanp
Problem
You
have a pure function that is often called with the same arguments
(particularly a recursive function) and is slow to compute its
results. You want to find a simple way to gain substantial
improvement in performance.
Solution
The key idea behind memoizing is to store a
function's results in a dictionary, keyed by the
arguments that produce each result. This approach makes sense only
for a pure function (i.e., one that yields the same result when
called more than once with the same arguments). It's
easy to memoize a function by hand. For example, using the recursive
Fibonacci function, here is a manually memoized function:
fib_memo = { }Having to code the memoization inside each function to be memoized is
def fib(n):
if n < 2: return 1
if n not in fib_memo:
fib_memo[n] = fib(n-1) + fib(n-2)
return fib_memo[n]
repetitive and degrades the function's readability.
An alternative is to encapsulate the memoization mechanics into a
closure:
def memoize(fn):Using this memoize closure to memoize
memo = { }
def memoizer(*param_tuple, **kwds_dict):
# can't memoize if there are any named arguments
if kwds_dict:
return fn(*param_tuple, **kwds_dict)
try:
# try using the memo dict, or else update it
try:
return memo[param_tuple]
except KeyError:
memo[param_tuple] = result = fn(*param_tuple)
return result
except TypeError:
# some mutable arguments, bypass memoizing
return fn(*param_tuple)
# 2.4 only: memoizer._ _name_ _ = fn._ _name_ _
return memoizer
fib, the function definition becomes obvious,
without caching boilerplate to obscure the algorithm. You must assign
the memoize result to the same name,
fib, as the recursive function; otherwise, the
recursive calls bypass the memoizing:
def fib(n):This latest snippet shows that memoize is meant to
if n < 2: return 1
return fib(n-1) + fib(n-2)
fib = memoize(fib)
be used exactly as a Python 2.4 decorator, so
in Python 2.4, you could use decorator syntax (instead of the
explicit call to memoize):
@memoizegiving exactly the same semantics as the previous snippet.
def fib(n):
if n < 2: return 1
return fib(n-1) + fib(n-2)
Discussion
The memoize function is called with just one
argument, a function f. It returns a
closure memoizer that acts just like
f but memoizes its arguments and result if
the actual arguments to a call are hashable and positional. Calls
with mutable or keyword arguments bypass the memoizing. If
you're worried that such bypassing happens too
often, making memoizing counterproductive, you should do a few dry
runs that are representative of your intended production usage, with
a closure that's modified to collect statistics.
Then you can decide whether memoization is worth using for your
specific application. Here's the modified
memoize for this purpose:
def memoize(fn):Functions to be memoized must be pure (i.e., they must have no side
memo = { }
def memoizer(*param_tuple, **kwds_dict):
if kwds_dict:
memoizer.namedargs += 1
return fn(*param_tuple, **kwds_dict)
try:
memoizer.cacheable += 1
try:
return memo[param_tuple]
except KeyError:
memoizer.misses += 1
memo[param_tuple] = result = fn(*param_tuple)
return result
except TypeError:
memoizer.cacheable -= 1
memoizer.noncacheable += 1
return fn(*param_tuple)
memoizer.namedargs = memoizer.cacheable = memoizer.noncacheable = 0
memoizer.misses = 0
return memoizer
effects and must always return the same value whenever they are
called with the same arguments). More significantly,
memoize returns a closure that does not
memoize calls that receive mutable arguments, such as
len on a list, nor functions that receive named
parameters.memoize cannot really check the semantics of the
functions you wrap. The notions of same value
and same arguments are vaguely defined in many
cases, so take care. memoize does try to field
occasional calls with keyword and mutable arguments (with an
interesting mix of checking and
try/except), but performance
will suffer unless such cases are rare. This is why
it's worth having around a version of
memoize that keeps counts of the various
possibilities, so that you can check their rarity.
See Also
Recipe 20.4 applies caching
to class instances' attributes.