Basic Apache Configuration
Even if you need to use advanced features on
your Web site, you should begin by getting Apache operating on a basic level. Once
Apache can serve static Web pages (that is,
those that don't use advanced features like scripting), you can begin to tweak
the configuration to do the more advanced things you need it to do. Basic
Apache configuration involves running the server and setting fundamental
options in the server's configuration files. You should also understand
something of Apache modules, which are
extensions that handle specific types of tasks. Fortunately, most distributions
ship with an Apache configuration that works with few or no changes, but you
may need to tweak some of these features to customize your server for your
particular needs.
Understanding Apache Configuration Files
The Apache configuration file is usually called httpd.conf. Different
distributions use different locations for the file, but the format is the
same. Caldera and SuSE store the file in /etc/httpd; Debian and Slackware use
/etc/apache (Slackware provides a sample file called
/etc/apache/httpd.conf.default that you must rename and modify); and Mandrake,
Red Hat, and TurboLinux use /etc/httpd/conf/.

Whatever the location, httpd.conf consists of comments, which begin with pound
signs (#), and configuration option lines, which take the following form:

    Directive Value

The Directive is the name of the configuration option you want to adjust, such
as Timeout or StartServers. The Value may be a number, a filename, or some
other arbitrary string. Some directives allow you to set several suboptions.
These are indicated by directive names enclosed in angle brackets (<>), as
follows:

    <Directory /home/httpd/html>
        Options FollowSymLinks
        AllowOverride None
    </Directory>

The final line uses the same directive name as the first, but
without any options, and preceded by a slash (/)
to indicate that this is the end of the directive block.

Some additional Apache configuration files may be important in some
situations. These are normally stored in the same directory as httpd.conf, and
they include the following:

access.conf
This is essentially a supplemental configuration file. Its location is set in
httpd.conf with the AccessConfig directive. The access.conf file has
traditionally been used for <Directory> directives, which determine how Apache
treats access to the specified directory. Many configurations today leave this
file empty, or use AccessConfig to point Apache to /dev/null for this file,
effectively disabling it.

mime.types
HTTP relies on a file type identification system known as the Multipurpose
Internet Mail Extensions (MIME) to allow a Web server to inform a Web browser
how to treat a file. For instance, text/plain identifies a file as containing
plain text, and image/jpeg identifies a Joint Photographic Experts Group (JPEG)
graphics file. The mime.types file contains a mapping of MIME types to filename
extensions. For instance, the .txt and .asc filename extensions are associated
with the text/plain MIME type. If these mappings aren't set appropriately, Web
browsers may become confused when confronted with certain file types. The
default file works well for most materials you're likely to place on a Web
page, but you may need to edit or add mappings if you want to serve unusual
file types.

magic
This file provides another way for Apache to determine a file's MIME type.
Apache can examine the file's contents to look for telltale signs of the file's
type. Many file types have certain key, or magic, byte sequences, and the magic
file lists these, converted to a plain-text format so that the file can be
edited with a text editor. It's best to leave this file alone unless you
understand its format, though, and that format is beyond the scope of this
chapter.
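
As a rough illustration of how these supplemental files might be tied into httpd.conf, the fragment below uses the Apache 1.3 directives AccessConfig, TypesConfig, MIMEMagicFile, and AddType. Only AccessConfig is named in the text above; the other three directives, the file paths, and the .tgz mapping are assumptions added for illustration, so check your distribution's own httpd.conf for the actual settings.

```apache
# Disable the supplemental access.conf file by pointing it at /dev/null.
AccessConfig /dev/null

# Location of the MIME-type-to-extension mappings.
# (Assumed path; it varies by distribution.)
TypesConfig /etc/httpd/conf/mime.types

# Location of the magic byte-sequence definitions.
# (Assumed path; only used if the mod_mime_magic module is available.)
<IfModule mod_mime_magic.c>
    MIMEMagicFile /etc/httpd/conf/magic
</IfModule>

# Add one extra extension mapping without editing mime.types itself.
# (Illustrative mapping only.)
AddType application/x-tar .tgz
```

Using AddType for a one-off mapping keeps local changes in httpd.conf, so a package upgrade that replaces mime.types is less likely to undo them.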
Standalone versus Super Server Configuration
Chapter 4, "Starting Servers," describes
different methods of running servers. Apache can be run in any of the ways
discussed in that chapter: through a super server, a SysV initialization script,
or a custom startup script. Most distributions use a SysV startup script or a custom
startup script, because these methods of running Apache cause the server to run
continuously, and therefore to respond quickly to incoming requests. You may
elect to run Apache from a super server if you like, though, and in fact Debian
gives you the option of running Apache either way when you install the package.
Running Apache from a super server results in slower responses to incoming Web
page requests, because the super server must launch Apache for each request.
The Apache developers also recommend against this configuration.

TIP: The delay caused by running a Web server from a super server can be
reduced or eliminated by using a slimmer Web server, such as thttpd, or a
kernel-based Web server. Therefore, if you want to use a super server for
security reasons, you might want to give serious consideration to a Web server
slimmer than Apache.

Although Chapter 4 covers running servers from a super server or standalone,
there is one Apache-specific option you must set: ServerType. This Apache
configuration file option can be set to standalone or inetd. If you don't set
this option correctly, Apache may behave erratically or fail to respond to
requests. If you want to change your configuration, be sure to adjust the
configuration file, disable the former startup method, and enable the new
startup method. For instance, to convert from a SysV startup to running Apache
from inetd, you should change the configuration file, use the SysV startup
script to shut down Apache, disable the SysV startup script, edit
/etc/inetd.conf to enable Apache, and restart inetd. If you forget one of these
steps, you may find that Apache doesn't work correctly, or continues to work
with the old configuration.

NOTE: Some distributions call the Apache executable apache, and others call it
httpd. If you change your startup script or want to shut down Apache directly,
you may need to check both names.

Setting Common Configuration Options
The default Apache configuration works on most systems. After you install
and start the server, Apache will serve files from its default
directory (usually /home/httpd/html;
consult the upcoming section, "Setting Server Directory Options," for
more details). This directory normally contains a default set of files that
announce that an Apache server is present but unconfigured. You'll almost
certainly want to replace these files with the files that make up your own Web
site, as described in the upcoming section, "Producing Something Worth Serving."

There are a few general-purpose Apache options you might want
to adjust to affect its overall behavior. These include the following:

ServerType
This directive has already been mentioned, but it deserves reiteration. If you
change how you run Apache, you must adjust this option to fit: either
standalone or inetd.

User and Group
Every Linux server runs as a particular user and group. You can tell Apache to
run as a particular user and group with these directives. Most distributions
set Apache to run as the user nobody or as a custom user with few privileges,
to reduce the potential for damage should a cracker find a way to get Apache to
do things you'd rather it not do. It's generally best to leave these options
alone.

NOTE: As a security measure, most Apache binaries are compiled so that they
can't be run as root; that is, they refuse a User setting of root.

ServerTokens
This directive controls how much information Apache gives callers about the
platform on which it runs. Most distributions set it to ProductOnly, which
provides no information about the OS on which Apache is running. You can set it
to Min, OS, or Full to provide increasing levels of information, but this is
usually best left at ProductOnly.

WARNING: Don't assume that setting ServerTokens to ProductOnly will keep your
OS choice hidden. Crackers can use traffic analysis tools to infer information
about your OS (mainly whether or not you're running Linux, and perhaps the
kernel version number). Other servers may also provide clues about what OS or
distribution you're running.

MinSpareServers and MaxSpareServers
When run in standalone mode, Apache starts up several instances of itself in
order to provide quick responses to incoming HTTP requests. Each instance can
handle a single request at a time. These directives set the minimum and maximum
number of these "spare" servers that run at any given time. If fewer than
MinSpareServers are running and unused, the master Apache process starts
another. If more than MaxSpareServers are running and unused, spares are killed
to bring the number in line. Setting these numbers too low can result in slow
responses when the load spikes on a heavily used server, while setting them too
high can result in reduced performance if the server lacks sufficient memory to
handle them all. Most distributions set defaults of about 5 and 10. You can
experiment with lower values if your server is used very lightly, or higher
values if your server is heavily used. Note that the total number of Apache
processes that run at any given moment may be higher than MaxSpareServers,
because some of these may be connected to clients, and so are not spares. A
busy Web site, or one whose traffic spikes periodically, may need a lot of swap
space to handle all the server instances. If the MaxSpareServers value is high,
this may increase the need for memory, and hence swap space.

MaxClients
This directive sets the total number of client connections Apache accepts at
any one time. The default is usually about 150, but you can adjust it up or
down to suit your hardware and traffic. Setting this value too high can cause
your system's performance to degrade if your site becomes very popular, but
setting it too low can keep clients from connecting to your site. As with
MaxSpareServers, a high MaxClients value may require you to have a lot of swap
space or memory, should your traffic level rise.

NOTE: The number of connections set in MaxClients is not the same as the number
of Web browsers Apache supports. Individual Web browsers can open multiple
connections (up to 8), and each consumes one of the connections allocated via
MaxClients.

Listen
By default, Apache binds to port 80 on all active network interfaces. You can
bind it to additional ports or interfaces with this directive. For instance,
Listen 192.168.34.98:8080 causes Apache to listen to port 8080 on the interface
associated with the 192.168.34.98 address. Listen 8000 binds Apache to port
8000 on all interfaces.

BindAddress
If your system has multiple network interfaces, you can bind Apache to just one
interface by using this directive. For instance, BindAddress 192.168.34.98
binds Apache to the interface associated with 192.168.34.98. BindAddress * is
the default, which binds Apache to all interfaces.

TIP: If you need to run Apache on a workstation for local use only, you can use
BindAddress 127.0.0.1 to keep it from being accessible to other computers.
You'll have to use http://127.0.0.1 or http://localhost as your URL when
accessing Apache locally, though.

Port
This directive tells Apache which port it should listen to. The default is 80.

ServerAdmin
You should specify the e-mail address at which you can be reached with this
directive. The default is usually webmaster, which you can alias to your
regular user account on the server using your mail server's alias feature, as
described in Chapter 19, "Push Mail Protocol: SMTP." This e-mail address isn't
normally apparent to users, but it's returned with some types of error
messages.

ServerName
You can set this directive to your computer's true DNS hostname, if that
differs from the hostname configured into the computer by default.

DefaultType
If Apache can't determine the MIME type of a file based on its extension or
magic sequence, as described earlier in "Understanding Apache Configuration
Files," it returns the MIME type specified by the DefaultType directive. This
is normally text/plain, but you might want to change it if your Web site hosts
many files of a particular type that might not always be properly identified.

HostnameLookups
This option can be set to On or Off, and it determines whether or not Apache
looks up and logs the hostnames of the clients that connect to it. Having
hostname information may be convenient when you're analyzing log files, as
described in the upcoming section, "Analyzing Server Log Files," but performing
the lookups takes some time and network resources, so you might prefer to forgo
using this feature.

LogLevel
Apache logs information on its activities. You can set the amount of
information it sends to its error log by setting this directive to debug, info,
notice, warn, error, crit, alert, or emerg, in decreasing order of the amount
of information logged. The default is usually warn. This setting does not
affect the access logs.

CustomLog
This directive takes two options: the name of a log file and the format of
information sent to that log file. The log file in question holds access logs:
information on what systems have requested Web pages. The format may be common,
agent, referer, or combined. For still more flexibility, the LogFormat
directive lets you create your own log file format. You can use multiple
CustomLog directives to create multiple log files.

These are the major general-purpose configuration options in httpd.conf.
Upcoming sections describe some additional options, and still more are esoteric
or specialized options that are beyond the scope of this chapter. You should
consult the Apache documentation or a book on Apache to learn more about such
directives.
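
To show how the general-purpose directives described in this section might look together, here is a minimal sketch of an Apache 1.3-style httpd.conf fragment. The numeric values echo the typical defaults mentioned in the text; the e-mail address, hostname, log file path, and the LogFormat string defining the common format are assumptions chosen for illustration, not settings taken from any particular distribution.

```apache
# How Apache is launched: standalone (startup script) or inetd.
ServerType standalone

# Low-privilege account under which the server handles requests.
User nobody
Group nobody

# Reveal only the product name to callers, not the OS.
ServerTokens ProductOnly

# Keep roughly 5 to 10 idle server processes ready for new requests.
MinSpareServers 5
MaxSpareServers 10

# Upper limit on simultaneous client connections.
MaxClients 150

# Listen on the standard port on all interfaces.
Port 80
BindAddress *
# Uncomment to bind an extra port on one interface (example from the text).
#Listen 192.168.34.98:8080

# Contact address shown in some error messages (assumed address).
ServerAdmin webmaster@threeroomco.com

# The server's true DNS hostname (assumed hostname).
ServerName www.threeroomco.com

# MIME type used when neither extension nor magic identifies a file.
DefaultType text/plain

# Skip reverse DNS lookups to save time and network traffic.
HostnameLookups Off

# Error log verbosity; does not affect access logs.
LogLevel warn

# Define the "common" access log format, then log with it.
# (Assumed log path; it varies by distribution.)
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog /var/log/httpd/access_log common
```

Most distributions already ship equivalents of these lines, so in practice you would edit the existing entries rather than paste in a block like this one.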
Setting Server Directory Options
URLs consist of two to four components:

The protocol
The http://, ftp://, or similar component of the URL specifies the protocol to
be used. This chapter discusses Web servers, which deal primarily with http://
URLs. (Secure sites use https://.)

The hostname
The hostname component of the URL is the same as the hostname for the computer
on which the Web server runs. For instance, if the URL is
http://www.threeroomco.com/thepage/index.html, the Web server's hostname is
www.threeroomco.com. (A single computer can have multiple hostnames by setting
up multiple DNS A address records or CNAME aliases, as described in Chapter 18.)

The filename
An HTTP request is, at its core, a request for a file transfer. Following the
hostname in the URL is a filename, often associated with a directory name. For
instance, in http://www.threeroomco.com/thepage/index.html, the file, including
its directory reference, is thepage/index.html. Note that, although there is a
slash (/) separating the hostname from the filename, that slash doesn't
indicate that the filename reference is relative to the root of the Linux
filesystem; it's relative to the root of the Web site's files directory. If the
filename is omitted, most Web servers return a default Web page, as specified
by the DirectoryIndex directive, described shortly.

Additional information
Some URLs include additional information specific to a URL type. For instance,
HTML Web pages can include position anchors, which are specified by a pound
sign and anchor name, and FTP URLs can include a username and password.

There are several Apache configuration options that let you
set the directories in which you can store files for the Web server. There are
also variant forms of addressing you can use in URLs to indicate which of
several alternate directories Apache is to use for retrieving files. If you
don't set these options correctly, some or all of your Web pages won't appear
in the way you expect. The relevant httpd.conf
options include the following:

ServerRoot
This directive sets the root of the directory tree in which Apache's own binary
files reside. On most Linux installations, this defaults to "/usr", and you
shouldn't change this setting.

DocumentRoot
Apache looks in the directory specified by this directive for static Web page
files. The default is usually "/home/httpd/html" or something similar. (The
directory name is normally enclosed in quote marks in the httpd.conf file.)

WARNING: Do not include a trailing slash (/) in your DocumentRoot directive.
Although this is a valid way to refer to directories, it can cause Apache to
misbehave.

UserDir
If the filename specified by a Web browser begins with a tilde (~), Apache
interprets the first component of the filename as a username and attempts to
locate the file in a subdirectory of the user's home directory. The UserDir
directive specifies the name of the subdirectory used for this access. For
instance, if UserDir is set to public_html, and if a remote user types
http://www.threeroomco.com/~abrown/photos.html into a Web browser, then Apache
attempts to return the public_html/photos.html file in abrown's home directory.
If this directive is set to disabled, user directories are disabled. You can
disable only some user directories by following disabled with a list of
usernames to be disabled. This directive is often enclosed in an <IfModule>
directive, which checks to see that the appropriate Apache modules for handling
user directories are loaded. (The next section, "Loading Apache Modules,"
describes modules.)

DirectoryIndex
Some URLs don't end in a filename; they end in a directory name (often followed
by a single slash). When Apache receives such a URL, it first tries to locate a
default index file, the name of which you specify with the DirectoryIndex
directive. Most distributions set this to index.html by default, but you can
change this if you like. For instance, with this setting, if a user enters a
URL of http://www.threeroomco.com/public/, Apache returns the public/index.html
file from the DocumentRoot directory. You can provide the names of several
index files, and Apache will search for them in the order listed, returning the
first one it finds. This is often done if Apache handles CGI forms or other
non-HTML files.

Most distributions' Apache packages create reasonable defaults for directory
and file handling. You may want to check your configuration files to learn
where you should place your Web site's files. If you prefer to place the files
elsewhere, you can of course change the default settings. You might also want
to change the index filename, particularly if you're setting up an Apache
server to replace another Web server that used a different index filename.
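
The directory-related directives above can be sketched together as follows. This is a minimal illustration that assumes the default paths mentioned in the text; the choice to disable user directories for root and the index.cgi fallback name are assumptions added for the example.

```apache
# Root of Apache's own directory tree; relative paths elsewhere in
# httpd.conf are resolved against it.
ServerRoot "/usr"

# Directory from which static Web pages are served (no trailing slash).
DocumentRoot "/home/httpd/html"

# Serve http://host/~user/ from the public_html subdirectory of the
# user's home directory, but never for root (assumed policy).
<IfModule mod_userdir.c>
    UserDir public_html
    UserDir disabled root
</IfModule>

# Index files to try, in order, when a URL names a directory.
# (index.cgi is an illustrative second choice.)
DirectoryIndex index.html index.cgi
```

With these settings, a request for http://www.threeroomco.com/public/ would be answered with public/index.html under /home/httpd/html if that file exists, and with public/index.cgi otherwise.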
Loading Apache Modules
One of Apache's strengths is that it's an extensible Web server. Programmers and administrators
can write modules that extend its capabilities,
without touching the Apache source code or recompiling Apache itself. These
modules can add features such as access control mechanisms, parsing extended
information provided by clients, and so on. In fact, a great deal of Apache's
standard functionality comes in the form of modules that come with the server.

If you check your httpd.conf file, chances are you'll see references to
modules. These use the LoadModule directive, and they look like this:

    LoadModule mime_module lib/apache/mod_mime.so

This directive gives the module's internal name (mime_module in this example)
and the filename of the external module file itself (lib/apache/mod_mime.so).
In this example, the module filename is referenced relative to the ServerRoot,
although you can also provide an absolute path if you prefer.

It's possible to build modules directly into the main Apache binary. To find
out what modules are permanently available in this way, type httpd -l or
apache -l, as appropriate. In some cases, modules built into the Apache binary
or loaded via LoadModule need to be activated in the Apache configuration file.
This is done with the AddModule directive, thus:

    AddModule mod_mime.c

You provide the module's source code filename as the value for this directive.
Some distributions' Apache configuration files include both LoadModule and
AddModule directives for important modules.

Frequently, you won't need to add to the
standard Apache module configuration; the default configuration file loads the
modules that are most commonly used. In fact, you might want to disable certain
modules to eliminate features that might be abused, such as the ability to
handle CGI. Unfortunately, it's not always easy to tell what modules can be
safely removed from a configuration.

If Apache doesn't do something you require of
it, you might want to investigate adding a module to do the trick. One Web site
you might want to visit in this case is the Apache Module Register, http://modules.apache.org. You
can search for modules others have written by typing in a key word; the site
returns a list of modules, including links to the module maintainers' Web sites.
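
As a rough sketch of how enabling and disabling modules might look in an Apache 1.3-style httpd.conf, the fragment below follows the LoadModule and AddModule pattern shown earlier in this section. The module file paths are assumptions modeled on the mod_mime example (relative to ServerRoot), and they vary by distribution.

```apache
# Enable the user-directory module: load the shared object, then
# activate it. (Assumed path, relative to ServerRoot.)
LoadModule userdir_module lib/apache/mod_userdir.so
AddModule mod_userdir.c

# Disable CGI handling by commenting out both lines for its module.
# (Assumed path; check your own httpd.conf for the real one.)
#LoadModule cgi_module lib/apache/mod_cgi.so
#AddModule mod_cgi.c
```

Commenting lines out rather than deleting them makes it easy to restore a module later. Apache must reread its configuration before a change like this takes effect.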