Standard Apache Access Logging
Using Apache's basic logging features, you can keep track of who visits your Web sites by logging accesses to the servers hosting them. You can log every aspect of the requests and responses, including the IP address of the client, user, and resource accessed. You need to take three steps to create a request log:
- Define what you want to log: your log format.
- Define where you want to log it: your log files, a database, an external program.
- Define whether or not to log: conditional logging rules.
Deciding What to Log
You can log nearly every aspect associated with the request. You can define how your log entries look by creating a log format. A log format is a string that contains text mixed with log formatting directives. Log formatting directives start with a % and are followed by a directive name or identifier, usually a letter indicating the piece of information to be logged. When Apache logs a request, it scans the string and substitutes the value for each directive. For example, if the log format is This is the client address %a, the log entry is something like This is the client address 10.0.0.2. That is, the logging directive %a is replaced by the IP address of the client making the request. Table 24.1 provides a comprehensive list of all formatting directives.
The Common Log Format (CLF) is a widely used log format, defined by the following format string:
"%h %l %u %t \"%r\" %>s %b"
That is, it includes the hostname or IP address of the client, remote user via identd, remote user via HTTP authentication, time when the request was served, text of the request, status code, and size in bytes of the content served.
You can read the Common Log Format documentation of the original W3C server at http://www.w3.org/Daemon/User/Config/Logging.html.
An entry in the Common Log Format looks like the following:
10.0.0.1 - - [26/Aug/2003:11:27:56 -0800] "GET / HTTP/1.1" 200 1456
You are now ready to learn how to define log formats using the LogFormat directive. This directive takes two arguments: The first argument is a logging string, and the second is a nickname that will be associated with that logging string. For example, the following directive from the default Apache configuration file defines the Common Log Format and assigns it the nickname common:
LogFormat "%h %l %u %t \"%r\" %>s %b" common
You can also use the LogFormat directive with only one argument, either a log format string or a nickname. This will have the effect of setting the default value for the logging format used by the TransferLog directive, explained in "Logging Accesses to Files" later in this chapter.
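Several nicknames can coexist in the same configuration. As a sketch, the following pairs the common nickname with combined, another nickname found in the default Apache configuration file, which appends the Referer and User-agent request headers to the Common Log Format:

```apache
# Common Log Format, as defined earlier
LogFormat "%h %l %u %t \"%r\" %>s %b" common
# CLF plus the Referer and User-agent request headers
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
```

Neither line writes anything by itself. A nickname takes effect only when a logging directive refers to it.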
The HostNameLookups Directive
When a client makes a request, Apache knows only the IP address of the client. Apache must perform what is called a reverse DNS lookup to find out the hostname associated with the IP address. This operation can be time-consuming and can introduce a noticeable lag in the request processing. The HostNameLookups directive allows you to control whether to perform the reverse DNS lookup. See also the "Managing Apache Logs" section later in this chapter. Additionally, when the lookup is performed, the result will be passed to CGI scripts via the environment variable REMOTE_HOST.
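As a minimal sketch, assuming the values On, Off, and Double documented for this directive, the following keeps lookups disabled server-wide and enables them for a single directory. The directory path is hypothetical:

```apache
# Log IP addresses only and skip the reverse DNS lookup (the default)
HostNameLookups Off

# Hypothetical directory where hostnames are worth the extra lookup time
<Directory "/usr/local/apache2/htdocs/private">
    HostNameLookups On
</Directory>
```

With the value Double, Apache also performs a forward lookup on the resulting hostname and checks that it maps back to the original address.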
The IdentityCheck Directive
At the beginning of the chapter, we explained how to log the remote username via the identd protocol using the %l log formatting directive. The IdentityCheck directive takes a value of on or off to enable or disable checking for that value and making it available for inclusion in the logs. Because the information is not reliable and takes a long time to check, it is switched off by default and should probably never be enabled. We mentioned %l only because it is part of the Common Log Format. For more information on the identd protocol, see RFC 1413 at http://www.rfc-editor.org/rfc/rfc1413.txt.
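A short sketch of the recommended setting follows. With the check disabled, the %l field of the Common Log Format is simply written as a dash:

```apache
# Do not query the client's identd server (this is the default)
IdentityCheck Off
# %l remains part of the format but is logged as "-"
LogFormat "%h %l %u %t \"%r\" %>s %b" common
```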
Environment Variables
The CustomLog directive accepts an environment variable as a third argument, written in the form env=variable. If the environment variable is present, the entry will be logged; otherwise, it will not. If the environment variable is negated by prefixing an ! to it, the entry will be logged if the variable is not present. The following example shows how to avoid logging images in GIF and JPEG format in your logs:
SetEnvIf Request_URI "(\.gif|\.jpg)$" image
CustomLog logs/access_log common env=!image
The regular expressions used for pattern matching in this and other areas of the httpd.conf file follow the same format as regular expressions in PHP and other programming languages.
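The same mechanism can route entries to different files or exclude other kinds of requests. The sketch below reuses the image variable from the previous example. The dontlog variable and the image_log and remote_log file names are arbitrary choices made for illustration:

```apache
# Mark image requests and requests coming from the server itself
SetEnvIf Request_URI "(\.gif|\.jpg)$" image
SetEnvIf Remote_Addr "^127\.0\.0\.1$" dontlog

# Image requests go to their own file
CustomLog logs/image_log common env=image
# The main log leaves them out
CustomLog logs/access_log common env=!image
# A separate log that skips requests from the local machine
CustomLog logs/remote_log common env=!dontlog
```

Each CustomLog directive is evaluated independently, so a single request can appear in more than one file.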
Status Code
You can specify whether to log specific elements in a log entry. At the beginning of the chapter, you learned that log directives start with a %, followed by a directive identifier. In between, you can insert a list of status codes, separated by commas. If the request status is one of the listed codes, the parameter will be logged; otherwise, a - will be logged. For example, the directive %400,501{User-agent}i logs the browser name and version for malformed requests (status code 400) and requests with methods not implemented (status code 501). This information can be useful for tracking which clients are causing problems. You can precede the status code list with an ! to invert the test, so that the parameter is logged only if the request status is not one of the listed codes:
%!400,501{User-agent}i
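Conditional directives are used inside an ordinary format string. As a sketch, the following format records the user agent only for 400 and 501 responses, and the Referer header only for responses other than 200, 302, and 304. The nickname problems and the file name problem_log are hypothetical:

```apache
# User-agent is logged for status 400 or 501, otherwise "-"
# Referer is logged unless the status is 200, 302, or 304
LogFormat "%h %t \"%r\" %>s %400,501{User-agent}i %!200,302,304{Referer}i" problems
CustomLog logs/problem_log problems
```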
Logging Accesses to Files
Logging to files is the default way of logging requests in Apache. You can define the name of the file using the TransferLog and CustomLog directives. The TransferLog directive takes a file argument and uses the latest log format defined by a LogFormat directive with a single argument (the nickname or the format string). If no log format is present, it defaults to the Common Log Format. The following example shows how to use the LogFormat and TransferLog directives to define a log format that is based on the CLF but that also includes the browser name:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{User-agent}i\""
TransferLog logs/access_log
The CustomLog directive enables you to specify the logging format explicitly. It takes at least two arguments: a destination file and a logging format. The logging format can be specified as a nickname or as a logging string directly. For example, the directives
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{User-agent}i\"" myformat
CustomLog logs/access_log myformat
and
CustomLog logs/access_log "%h %l %u %t \"%r\" %>s %b \"%{User-agent}i\""
are equivalent. The CustomLog directive can take an optional environment variable as a third argument, as explained in the "Environment Variables" section earlier in the chapter.
Logging Accesses to a Program
Both TransferLog and CustomLog directives can accept a program, prefixed by a pipe sign |, as an argument. Apache will write the log entries to the standard input of the program. The program will, in turn, process them by logging the entries to a database, transmitting them to another system, and so on.
If the program dies for some reason, the server makes sure that it is restarted. If the server stops, the program is stopped as well.
The rotatelogs utility, bundled with Apache and explained later in this chapter, is an example of a logging program.
As a general rule, unless you have a specific requirement for using a particular program, it is easier and more reliable to log to a file on disk and do the processing, merging, analysis of logs, and so on, at a later time, possibly on a different machine.
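As a sketch of the pipe syntax described in this section, the following sends entries to rotatelogs. The installation path, the log file location, and the rotation interval of 86400 seconds (one day) are assumptions to adapt to your system:

```apache
# The quotes are needed because the command contains spaces
CustomLog "|/usr/local/apache2/bin/rotatelogs /usr/local/apache2/logs/access_log 86400" common
```

The format argument works exactly as it does when logging to a file, so a nickname or a literal format string can follow the quoted command.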