Perl Best Practices [Electronic resources]

Damian Conway

19.10. Caching

Look for opportunities to use caches.

It makes sense not to do the same calculation twice, if the result is small enough that it can reasonably be stored for reuse. The simplest form of that is putting a result into an interim variable whenever it will be used more than once. That is, instead of calling the same functions twice on the same data:

print form(
    'hash alone: {>>>,>>>,>>} bytes', size(\%lookup),
    'data alone: {>>>,>>>,>>} bytes', total_size(\%lookup)-size(\%lookup),
    '==============================',
    'total:      {>>>,>>>,>>} bytes', total_size(\%lookup),
);

call them once, store the results temporarily, and retrieve them each time they're needed:

my $hash_mem  = size(\%lookup);
my $total_mem = total_size(\%lookup);
my $data_mem  = $total_mem - $hash_mem;

print form(
    'hash alone: {>>>,>>>,>>} bytes', $hash_mem,
    'data alone: {>>>,>>>,>>} bytes', $data_mem,
    '==============================',
    'total:      {>>>,>>>,>>} bytes', $total_mem,
);

This often has the additional benefit of allowing you to name the interim values in ways that make the code more comprehensible.

Subroutines like size( ) and total_size( ) and functions like rand( ) or readline( ) don't always return the same result when called with the same arguments. Such subroutines are good candidates for temporary and localized reuse of results, but not for longer-term caching.

On the other hand, pure functions like sqrt( ) and int( ) and crypt( ) do always return the same result for the same list of arguments, so their return values can be stored long-term and reused whenever they're needed again. For example, if you have a subroutine that returns a case-insensitive SHA-512 digest:

sub lc_digest {
    my ($text) = @_;

    use Digest::SHA qw( sha512 );

    return sha512(lc $text);
}

then you could (potentially) speed it up over many calls by giving it a private look-up table in which results can be cached as they're computed, as shown in Example 19-9.

Example 19-9. Adding a cache to a digest subroutine

{
    my %cache;

    sub lc_digest {
        my $text = lc shift;

        # Compute the answer only if it's not already known...
        if (!exists $cache{$text}) {
            use Digest::SHA qw( sha512 );
            $cache{$text} = sha512($text);
        }

        return $cache{$text};
    }
}
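To see the cache at work, here is a minimal sketch of the same technique with a call counter added. The counter $computations is hypothetical and exists only for illustration, and sha512_hex( ) is used in place of sha512( ) so that the digests are printable; neither is part of Example 19-9.

```perl
use strict;
use warnings;
use Digest::SHA qw( sha512_hex );

my $computations = 0;    # Hypothetical counter, for illustration only

{
    my %cache;

    sub lc_digest {
        my $text = lc shift;

        # Compute the digest only on a cache miss...
        if (!exists $cache{$text}) {
            $computations++;
            $cache{$text} = sha512_hex($text);
        }

        return $cache{$text};
    }
}

my $first  = lc_digest('Perl');
my $second = lc_digest('PERL');    # Same text after lc, so a cache hit

print 'same digest:  ', ($first eq $second ? 'yes' : 'no'), "\n";
print "computations: $computations\n";
```

Because both arguments lowercase to the same key, the second call should find its result already stored, and the counter should stay at 1. Note that the cache grows by one entry per distinct input, so this approach suits data with many repeats rather than an unbounded stream of unique strings.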

On the other hand, if the range of possible data for a computation is small and the number of computations is large, then it's often simpler and more efficient to pre-compute the entire look-up table and then access it directly, thereby eliminating the cost of a subroutine call. For example, suppose you were doing some kind of image processing and needed square roots for pixel intensity values in the range 0 to 255. You could write:

for my $row (@image_rows) {
    for my $pixel_value (@{$row}) {
        $pixel_value = sqrt($pixel_value);
    }
}

or you could dramatically reduce the number of sqrt operations by precomputing all possible values and creating a look-up table:

my @sqrt_of = map { sqrt $_ } 0..255;

for my $row (@image_rows) {
    for my $pixel_value (@{$row}) {
        $pixel_value = $sqrt_of[$pixel_value];
    }
}
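The look-up approach can be tried out with a minimal self-contained sketch. The sample @image_rows data is hypothetical, and the sketch assumes that every pixel value is an integer in the range 0 to 255; a fractional or out-of-range value would select the wrong table entry or none at all. Whether the table actually outperforms a direct call to sqrt( ) depends on the platform, so it is worth benchmarking before relying on it.

```perl
use strict;
use warnings;

# Hypothetical 2x3 image; each value is an integer intensity in 0..255
my @image_rows = (
    [   0,  16, 255 ],
    [ 100,   4,  16 ],
);

# Precompute every possible square root once...
my @sqrt_of = map { sqrt $_ } 0..255;

# The loop variable aliases each array element, so assigning to it
# updates the image data in place...
for my $row (@image_rows) {
    for my $pixel_value (@{$row}) {
        $pixel_value = $sqrt_of[$pixel_value];
    }
}

print join(', ', @{$_}), "\n" for @image_rows;
```

The table costs 256 calls to sqrt( ) up front, so it only pays off when the image contains many more pixels than that.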

For a thorough discussion of the many applications and advantages of caching, see Chapter 3 of Higher-Order Perl, by Mark Jason Dominus (Morgan Kaufmann, 2005).