20.13. Processing Server Logs
20.13.1. Problem
You need to summarize your server logs, but you don't have a customizable program to do it.
20.13.2. Solution
Parse the access log yourself with regular expressions, or use the
Logfile modules from CPAN.
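As a minimal sketch of the regular-expression approach, the following parses a single Common Log Format line into named fields. The parse_clf helper and the sample line are invented for illustration, and the pattern assumes well-formed CLF entries; real logs may need a more forgiving pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_clf is a hypothetical helper, not part of any module.
# It returns a hash reference of fields, or nothing if the line doesn't match.
sub parse_clf {
    my ($line) = @_;
    my ($host, $ident, $user, $when, $method, $url, $proto, $status, $bytes) =
        $line =~ m{^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+)(?: (\S+))?" (\d{3}) (\d+|-)}
        or return;
    return {
        host   => $host,   user => $user, when   => $when,
        method => $method, url  => $url,  status => $status,
        bytes  => ($bytes eq "-" ? 0 : $bytes),   # "-" means no body was sent
    };
}

# A made-up sample line, for illustration only
my $sample = '192.0.2.7 - - [19/May/1998:10:21:04 -0600] "GET /index.html HTTP/1.0" 200 2326';
my $hit = parse_clf($sample) or die "can't parse sample line";
printf "%s fetched %s (%d bytes)\n", @{$hit}{qw(host url bytes)};
```

Returning a hash reference keeps the field names attached to the values, which makes later tallying by host, URL, or date a matter of picking the right key.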
20.13.3. Discussion
Example 20-9 is a sample report generator for an
Apache access log in Common Log Format (CLF).
Example 20-9. sumwww
#!/usr/bin/perl -w
# sumwww - summarize web server log activity
$lastdate = "";
daily_logs( );
summary( );
exit;
# read CLF files and tally hits from the host and to the URL
sub daily_logs {
    while (<>) {
        ($type, $what) = /"(GET|POST)\s+(\S+?) \S+"/ or next;
        ($host, undef, undef, $datetime) = split;
        ($bytes) = /\s(\d+)\s*$/ or next;
        ($date) = ($datetime =~ /\[([^:]*)/);
        if ($date ne $lastdate) {
            if ($lastdate) { write_report( )   }   # day changed: flush the previous day
            else           { $lastdate = $date }   # first line of the log
        }
        # tally after the date check so each hit is charged to its own day
        $posts += ($type eq "POST");
        $home++ if m, / ,;
        $count++;
        $hosts{$host}++;
        $what{$what}++;
        $bytesum += $bytes;
    }
    write_report( ) if $count;
}
# use *typeglob aliasing of global variables for cheap copy
sub summary {
    $lastdate = "Grand Total";
    *count   = *sumcount;
    *bytesum = *bytesumsum;
    *hosts   = *allhosts;
    *posts   = *allposts;
    *what    = *allwhat;
    *home    = *allhome;
    write;
}
# display the tallies of hosts and URLs, using formats
sub write_report {
    write;
    # add to summary data
    $lastdate    = $date;
    $sumcount   += $count;
    $bytesumsum += $bytesum;
    $allposts   += $posts;
    $allhome    += $home;
    # reset daily data
    $posts = $count = $bytesum = $home = 0;
    @allwhat{keys %what}   = keys %what;
    @allhosts{keys %hosts} = keys %hosts;
    %hosts = %what = ( );
}
format STDOUT_TOP =
@|||||||||| @|||||| @||||||| @||||||| @|||||| @|||||| @|||||||||||||
"Date", "Hosts", "Accesses", "Unidocs", "POST", "Home", "Bytes"
----------- ------- -------- -------- ------- ------- --------------
.
format STDOUT =
@>>>>>>>>>> @>>>>>> @>>>>>>> @>>>>>>> @>>>>>> @>>>>>> @>>>>>>>>>>>>>
$lastdate, scalar(keys %hosts),
$count, scalar(keys %what),
$posts, $home, $bytesum
.
Here's sample output from that program:
   Date      Hosts  Accesses Unidocs   POST    Home       Bytes
----------- ------- -------- -------- ------- ------- --------------
19/May/1998     353     6447     3074     352      51       16058246
20/May/1998    1938    23868     4288     972     350       61879643
21/May/1998    1775    27872     6596    1064     376       64613798
22/May/1998    1680    21402     4467     735     285       52437374
23/May/1998    1128    21260     4944     592     186       55623059
Grand Total    6050   100849    10090    3715    1248      250612120
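The summary routine in Example 20-9 relies on typeglob aliasing so that the same format can print the grand totals. The sketch below shows the mechanism in isolation. The variable values are invented for illustration, and the variables are declared with our because typeglobs affect package variables, not lexicals declared with my.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Package variables standing in for the daily and cumulative tallies
our ($count, $sumcount);
our (%hosts, %allhosts);

$count    = 5;                                    # pretend daily tally
$sumcount = 42;                                   # pretend running total
%hosts    = (alpha => 1);
%allhosts = (alpha => 1, beta => 1, gamma => 1);

# Reports whatever $count and %hosts currently refer to
sub show { return "count=$count hosts=" . scalar(keys %hosts) }

my $before = show();
*count = *sumcount;      # $count is now another name for $sumcount
*hosts = *allhosts;      # likewise %hosts for %allhosts
my $after  = show();

print "$before\n$after\n";
```

No data is copied by the glob assignments; the old names simply point at the summary variables, which is why the comment in the example calls it a cheap copy.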
Use the Logfile::Apache module from CPAN, shown in Example 20-10, to write a
similar, but less specific, program. This module is distributed with other
Logfile modules in a single Logfile distribution
(Logfile-0.115.tar.gz at the time of this writing).
Example 20-10. aprept
#!/usr/bin/perl -w
# aprept - report on Apache logs
use Logfile::Apache;
$l = Logfile::Apache->new(
    File  => "-",                      # STDIN
    Group => [ "Domain", "File" ]);    # fields to index
$l->report(Group => "Domain", Sort => "Records");
$l->report(Group => "File",   List => [ "Bytes", "Records" ]);
The new constructor reads a log file and builds
indices internally. Supply a filename with the parameter named
File and the fields to index in the
Group parameter. The possible fields are
Date (date of the request), Hour (time
of day the request was received), File (file
requested), User (username parsed from the request),
Host (hostname requesting the document), and
Domain (Host translated into
"France", "Germany", etc.).
To produce a report on
STDOUT, call the report method.
Give the index to use with the Group parameter,
and optionally say how to sort with Sort (Records is by
number of hits, Bytes by number of bytes
transferred) or how to break it down further with List (by number of bytes or
number of records). Here's some sample output:
Domain                  Records
===============================
US Commercial       222  38.47%
US Educational      115  19.93%
Network              93  16.12%
Unresolved           54   9.36%
Australia            48   8.32%
Canada               20   3.47%
Mexico                8   1.39%
United Kingdom        6   1.04%

File                           Bytes       Records
==================================================
/                      13008   0.89%     6   1.04%
/cgi-bin/MxScreen      11870   0.81%     2   0.35%
/cgi-bin/pickcards     39431   2.70%    48   8.32%
/deckmaster           143793   9.83%    21   3.64%
/deckmaster/admin      54447   3.72%     3   0.52%
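To make the second report concrete, here is a rough sketch in plain Perl of the kind of tally it displays: bytes and records per requested file, each with its share of the total. It does not use or reproduce Logfile::Apache; the tally_by_file helper and the log lines are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# tally_by_file is a hypothetical helper, not part of the Logfile modules.
# It returns a hash: file => { records => N, bytes => N }.
sub tally_by_file {
    my %by_file;
    for my $line (@_) {
        my ($file, $bytes) =
            $line =~ /"(?:GET|POST)\s+(\S+)[^"]*"\s+\d+\s+(\d+)/
            or next;                      # skip lines that don't parse
        $by_file{$file}{records}++;
        $by_file{$file}{bytes} += $bytes;
    }
    return %by_file;
}

# Made-up log lines, for illustration only
my @lines = (
    'h1.example.com - - [19/May/1998:10:00:00 -0600] "GET / HTTP/1.0" 200 1000',
    'h2.example.com - - [19/May/1998:10:00:05 -0600] "GET /deckmaster HTTP/1.0" 200 3000',
    'h1.example.com - - [19/May/1998:10:00:09 -0600] "GET / HTTP/1.0" 200 1000',
);

my %by_file = tally_by_file(@lines);

# Grand totals, needed for the percentage columns
my ($records, $bytes) = (0, 0);
for my $t (values %by_file) {
    $records += $t->{records};
    $bytes   += $t->{bytes};
}

# One row per file, busiest first
for my $file (sort { $by_file{$b}{records} <=> $by_file{$a}{records} } keys %by_file) {
    my $t = $by_file{$file};
    printf "%-20s %8d %6.2f%% %5d %6.2f%%\n",
        $file,
        $t->{bytes},   100 * $t->{bytes}   / $bytes,
        $t->{records}, 100 * $t->{records} / $records;
}
```

A hash of hashes keyed by the grouping field is the natural structure here; grouping by host or date instead only changes which captured field is used as the key.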
20.13.4. See Also
The documentation for the CPAN module Logfile::Apache;
perlform(1) and Chapter 7 of Programming Perl.