32.2. Benchmark
use Benchmark qw(timethese cmpthese timeit countit timestr);
# You can always pass in code as strings:
timethese $count, {
    'Name1' => '...code1...',
    'Name2' => '...code2...',
};
# Or as subroutine references:
timethese $count, {
    'Name1' => sub { ...code1... },
    'Name2' => sub { ...code2... },
};
cmpthese $count, {
    'Name1' => '...code1...',
    'Name2' => '...code2...',
};
$t = timeit $count, '...code...';
print "$count loops of code took:", timestr($t), "\n";
$t = countit $time, '...code...';
$count = $t->iters;
print "$count loops of code took:", timestr($t), "\n";
The Benchmark module can help you determine which
of several possible choices executes the fastest. The
timethese function runs the specified code segments
the number of times requested and reports back how long each segment
took. You can get a nicely sorted comparison chart if you call
cmpthese the same way.

Code segments may be given as function references instead of strings
(in fact, they must be if you use lexical variables from the calling
scope), but call overhead can influence the timings. If you don't ask
for enough iterations to get a good timing, the function emits a
warning.

Lower-level interfaces are available that run just one piece of code
either for some number of iterations (timeit) or
for some number of seconds (countit). These
functions return Benchmark objects (see the online
documentation for a description). With countit,
you know it will run in enough time to avoid warnings, because you
specified a minimum run time.

To get the most out of the Benchmark module, you'll
need a good bit of practice. It isn't usually enough to run a couple
different algorithms on the same data set, because the timings only
reflect how well those algorithms did on that particular data set. To
get a better feel for the general case, you'll need to run several
sets of benchmarks, varying the data sets used.

For example, suppose you wanted to know the best way to get a copy
of a string without the last two characters. You think of four
ways to do so (there are, of course, several others): chop
twice, copy and substitute, or use substr on either the
left- or right-hand side of an assignment. You test these algorithms
on strings of length 2, 200, and 20_000:
use Benchmark qw/countit cmpthese/;
sub run($) { countit(5, @_) }    # run each snippet for at least 5 CPU seconds
for $size (2, 200, 20_000) {
    $s = "." x $size;            # test string of the current length
    print "\nDATASIZE = $size\n";
    cmpthese {
        chop2 => run q{
            $t = $s; chop $t; chop $t;
        },
        subs => run q{
            ($t = $s) =~ s/..\Z//s;
        },
        lsubstr => run q{
            $t = $s; substr($t, -2) = '';
        },
        rsubstr => run q{
            $t = substr($s, 0, length($s)-2);
        },
    };
}
which produces the following output:
DATASIZE = 2
            Rate    subs lsubstr   chop2 rsubstr
subs    181399/s      --    -15%    -46%    -53%
lsubstr 214655/s     18%      --    -37%    -44%
chop2   338477/s     87%     58%      --    -12%
rsubstr 384487/s    112%     79%     14%      --
DATASIZE = 200
            Rate    subs lsubstr rsubstr   chop2
subs    200967/s      --    -18%    -24%    -34%
lsubstr 246468/s     23%      --     -7%    -19%
rsubstr 264428/s     32%      7%      --    -13%
chop2   304818/s     52%     24%     15%      --
DATASIZE = 20000
            Rate rsubstr    subs lsubstr   chop2
rsubstr   5271/s      --    -42%    -43%    -45%
subs      9087/s     72%      --     -2%     -6%
lsubstr   9260/s     76%      2%      --     -4%
chop2     9660/s     83%      6%      4%      --
With small data sets, the "rsubstr" algorithm runs 14% faster than
the "chop2" algorithm, but in large data sets, it runs 45% slower.
On empty data sets (not shown here), the substitution mechanism is the
fastest. So there is often no best solution for all possible cases,
and even these timings don't tell the whole story, since you're still at
the mercy of your operating system and the C library Perl was built
with. What's good for you may be bad for someone else. It takes a
while to develop decent benchmarking skills. In the meantime, it helps
to be a good liar.
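
If your test data lives in lexical variables, string code can't see it, so you have to hand the lower-level functions a closure instead. Here's a minimal sketch of that, not a definitive recipe: the variable names, the 200-character test string, the one-second minimum, and the loop count of 10_000 are all arbitrary choices made for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(timeit countit timestr);

# Lexicals are invisible to string code evaluated inside Benchmark,
# so the code under test is passed as a closure.
my $s = "." x 200;    # illustrative test string (an assumption)
my $t;

# Run the closure for at least 1 CPU second, then ask how many
# iterations fit into that time.
my $bench = countit(1, sub { $t = substr($s, 0, length($s) - 2) });
my $count = $bench->iters;
print "$count loops of code took: ", timestr($bench), "\n";

# Run a different closure a fixed number of times.
my $loops = 10_000;
my $fixed = timeit($loops, sub { $t = $s; chop $t; chop $t });
print "$loops loops of code took: ", timestr($fixed), "\n";
```

Remember that the subroutine call itself adds overhead to each iteration, so numbers obtained this way aren't directly comparable with timings of string code.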