Mastering Red Hat Linux 9 [Electronic resources] (text version)


Michael Jang







Configuring Apache



Once you’ve installed the desired Apache packages, your server should be ready to serve web pages to the local computer. All you need to do is start the httpd service and direct your web browser to the localhost address.


But a web server doesn’t do you much good unless you can call its web pages from other computers. In this chapter, we’ll analyze the main Apache configuration file, httpd.conf, in some detail.


These settings are based on the specifications of the Hypertext Transfer Protocol (HTTP) standards version 1.1. We provide only a brief overview of Apache 2.0; for more information, see Linux Apache Web Server Administration, Second Edition (Sybex, 2002).






Tip


If you install the httpd-manual-* RPM, you’ll get a full Apache manual in HTML format in the /var/www/manual directory.





Starting Apache



Once you’ve installed the Apache packages that you need, starting Apache is easy. As with other services described throughout this book, all you need to do is start the applicable script from the /etc/rc.d/init.d directory. In this case, the following command should work nicely:


# service httpd start


If you still have the default Apache configuration file, you’ll probably see the following message:


Starting httpd: httpd: Could not determine the server’s fully qualified domain name, using 127.0.0.1 for ServerName


Now you can open the browser of your choice to the localhost address. This is also known as the loopback IP address, which as defined in Chapter 20 is 127.0.0.1. Figure 30.1 shows the result in the Mozilla web browser.




Figure 30.1: Apache is properly installed.



You’ll also want to use a command such as chkconfig, as described in Chapter 13, to make sure Apache starts the next time you start Linux at an appropriate runlevel. For example, the following command starts the Apache daemon, httpd, whenever you start Linux in runlevel 2, 3, or 5:


# chkconfig --level 235 httpd on


Now you’re ready to start customizing the Apache configuration.





Customizing Apache



The main Apache configuration file, httpd.conf, is located in the /etc/httpd/conf directory. It is split into three sections. In the global environment section, you can configure the basic settings for this web server. In the main server configuration section, you’ll set up the basic defaults for any websites on your server. The Virtual Hosts section allows you to set up several different websites on your Apache server, even if you have only one IP address.






Note


There were originally three main configuration files for Apache: access.conf, srm.conf, and httpd.conf, all located in the same directory. While later versions of Apache 1.3.x incorporated the information from access.conf and srm.conf in httpd.conf, at least blank versions of access.conf and srm.conf were still required by the server. Apache 2.0.x no longer needs these extra configuration files.




Commands in the Apache configuration file are known as directives. In the following sections, we’ll analyze the directives from the default Apache httpd.conf installed with Red Hat Linux 9 in some detail. You can read the file for yourself; it includes many other useful comments.


Commands with a pound sign (#) in front are commented out in the default Apache configuration file. If you’re learning about Apache for the first time, experiment a bit. Set up some website files on your computer. Use the directory specified by the DocumentRoot directive, which is by default /var/www/html. Try out some of these commands, restart the httpd daemon, and examine the changes for yourself. You might be surprised at what you can do.



Global Environment



We’ll look at each of the directives in the global environment section of the default version of the Apache httpd.conf configuration file. Variables in this section apply to all Virtual Hosts that you might configure on this server. There are basic parameters, detailed parameters related to different clients, port settings, pointers to other configuration files, and module locations.






Note


If a directive is set to 0, it normally means you’re setting no limit on that directive. For example, if you set Timeout to 0, connections from a client browser are kept open indefinitely.





Basic Global Environment Parameters



The following directive gives users of your website some basic information about your software. While the following command tells users that your web server is Apache on a Unix-style system, other commands are possible, as described in Table 30.3:


ServerTokens OS


Table 30.3: ServerTokens Directive Options

Directive            Description
-------------------  -----------
ServerTokens Prod    Identifies the web server as Apache
ServerTokens Min     Identifies Apache and its version number
ServerTokens OS      Identifies Apache, its version number, and the type of operating system
ServerTokens Full    Identifies Apache, its version number, the type of operating system, and compiled modules
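To make the ServerTokens options more concrete, here is a minimal sketch of what each setting might put in the Server response header. The version strings and module names in the comments are illustrative assumptions patterned on the Apache 2.0 documentation; your server will report its own values.

```apache
# Only one ServerTokens directive should be active at a time.
# The header values in the comments are illustrative, not captured output.

# ServerTokens Prod    ->  Server: Apache
# ServerTokens Min     ->  Server: Apache/2.0.41
# ServerTokens OS      ->  Server: Apache/2.0.41 (Unix)
# ServerTokens Full    ->  Server: Apache/2.0.41 (Unix) PHP/4.2.2 MyMod/1.2

# Reveal as little as possible about the server software:
ServerTokens Prod
```

If you would rather not advertise your version number to the outside world, Prod is the most conservative choice.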




The ServerRoot directive identifies the directory with configuration, error, and log files:


ServerRoot "/etc/httpd"


If you run ls -l /etc/httpd, you’ll see links to the real location of certain directories; for example, /etc/httpd/logs is linked to the /var/log/httpd directory.


Apache includes parent and child processes for different connections. The ScoreBoardFile parameter helps these processes communicate with each other. Otherwise, the communication is through active memory.


#ScoreBoardFile run/httpd.scoreboard





Tip


I normally avoid activating the ScoreBoardFile parameter; it’s required only on certain architectures, and Red Hat Linux 9 is not one of them.




You might note that run is a relative subdirectory. The full directory name is based on the ServerRoot directive—in other words, /etc/httpd/run.


The PidFile directive specifies the file where Apache records its process identifier (PID):


PidFile run/httpd.pid


If computers are having trouble communicating on your network, you need a Timeout value to keep Apache from hanging. The Timeout directive specifies a stop value in seconds.


Timeout 300


Normally, multiple requests are allowed through each connection. The following command disables this behavior:


KeepAlive Off


If the KeepAlive directive is on, you can regulate the number of requests per connection with the MaxKeepAliveRequests directive:


MaxKeepAliveRequests 100


Once a connection is made between Apache and someone’s web browser, the KeepAliveTimeout directive specifies the number of seconds to wait for the next client request:


KeepAliveTimeout 15
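The three KeepAlive-related directives work together. As a sketch, and assuming that you want persistent connections (the default file shown above turns them off), you might combine them as follows; the values simply reuse the defaults quoted in this section.

```apache
# Allow several requests over one TCP connection
KeepAlive On

# Close the connection after 100 requests (0 would mean no limit)
MaxKeepAliveRequests 100

# Wait at most 15 seconds for the next request on an open connection
KeepAliveTimeout 15
```

Persistent connections can speed up pages with many images, at the cost of keeping server processes tied up while they wait; a short KeepAliveTimeout limits that cost.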




Detailed Client Parameters



Apache includes a number of Multi-Processing Modules (MPMs). These MPMs fall into three categories:



Prefork MPMs are suited to process-based web servers; they imitate the behavior of Apache 1.3.x, and are appropriate if you have Apache modules that cannot safely run in separate threads.



Worker MPMs combine multiple processes with multiple threads per process; however, they should not be used with modules written for Apache 1.3, since threads can cause problems with those modules.



Per-child MPMs support websites for clients that need different user IDs.






Note


MPMs are flexible; specific modules are available for Windows NT (mpm_winnt) and Novell NetWare (mpm_netware) networks.






There are a number of common directives that you can specify in each of these MPM categories.


When Apache is started, the StartServers directive sets the number of available child server processes ready for users who want your web pages:


StartServers 8


Once Apache is started, requests from other users may come in. If the number of unused server processes falls below the MinSpareServers directive, additional httpd processes are started automatically:


MinSpareServers 5


When traffic goes down, the MaxSpareServers directive determines the maximum number of httpd processes that are allowed to run idle:


MaxSpareServers 20


You can regulate the number of clients requesting information from your web server with the MaxClients directive:


MaxClients 150


You can also regulate the number of requests that each child server process handles before it is replaced with the MaxRequestsPerChild directive:


MaxRequestsPerChild 1000


Apache 2.0 servers can start new threads for each request. The MinSpareThreads directive is similar to MinSpareServers; it allows Apache to handle a surge of additional requests:


MinSpareThreads 25


When the number of requests goes down, Apache monitors the number of spare threads; if they exceed the MaxSpareThreads directive, some are killed:


MaxSpareThreads 75


Every child process can create several threads to handle requests from each user of your website. The ThreadsPerChild directive sets the number of threads created when each child process starts:


ThreadsPerChild 25


For threaded servers, you can again limit the number of requests that each child process handles with the MaxRequestsPerChild directive; a value of 0, as set here in the default httpd.conf file, means there is no limit:


MaxRequestsPerChild 0


You can also limit the number of threads allowed for each child process with the MaxThreadsPerChild directive:


MaxThreadsPerChild 20
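Because the prefork and worker MPMs use different directives, it helps to group them in <IfModule> containers, so that only the block matching your MPM takes effect. The following is a minimal sketch that reuses the values quoted above; the StartServers value in the worker block is an illustrative assumption, not a value taken from this chapter.

```apache
# Applies only when Apache runs the process-based prefork MPM
<IfModule prefork.c>
    StartServers          8
    MinSpareServers       5
    MaxSpareServers      20
    MaxClients          150
    MaxRequestsPerChild 1000
</IfModule>

# Applies only when Apache runs the threaded worker MPM
<IfModule worker.c>
    # Illustrative value; tune for your own server
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    # 0 means a child process never expires on request count
    MaxRequestsPerChild   0
</IfModule>
```

With this layout, the same httpd.conf file can serve either MPM without further editing.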




Port Settings



You can set Apache to Listen for requests on only certain IP addresses and/or TCP/IP ports. The default httpd.conf file includes the following directives:


#Listen 12.34.56.78:80
Listen 80


If you have more than one network adapter, you can also limit Apache to certain networks; for example, the following directive only listens to the network adapter with an IP address of 192.168.13.64 on TCP/IP port 80:


Listen 192.168.13.64:80
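You can use more than one Listen directive at a time. As a sketch, the following combination accepts web requests on one network adapter and keeps a second port available only to the local computer; the port number 8080 is an arbitrary choice for illustration.

```apache
# Serve regular traffic only on the adapter with this address
Listen 192.168.13.64:80

# Also listen on the loopback address, on a hypothetical alternate port
Listen 127.0.0.1:8080
```

Remember that when you specify addresses this way, Apache no longer listens on any other adapter installed on the computer.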





Note


The Listen directive supersedes the BindAddress and Port directives from Apache version 1.3.x.






Pointers to Other Configuration Files



As we noted earlier, there are other configuration files associated with the Apache 2.0.x server. By default, they’re in the /etc/httpd/conf.d directory. Normally, file locations are determined by the ServerRoot directive, which is set to /etc/httpd, and the Include directive shown here:


Include conf.d/*.conf




Module Locations



When you need a module in Apache, it should be loaded in the httpd.conf configuration file. Normally, modules are listed in the following format:


LoadModule module_type location


For example, the following directive loads the module named access_module from the ServerRoot modules subdirectory, /etc/httpd/modules. You will find that this is linked to the actual directory with Apache modules: /usr/lib/httpd/modules.


LoadModule access_module modules/mod_access.so


Several modules are listed in the default httpd.conf file; Table 30.4 offers a brief description. The modules are listed in the same order as they appear in the file.


Table 30.4: Standard Apache Modules

Module                 Description
---------------------  -----------
access_module          Supports access control based on an identifier, such as a computer name or IP address.
auth_module            Allows authentication (usernames and passwords) with text files.
auth_anon_module       Lets users have anonymous access to areas that require authentication.
auth_dbm_module        Supports authentication with DBM (database management) files.
auth_digest_module     Sets authentication with MD5 digests.
include_module         Includes SSI (server-side includes) data for dynamic web pages.
log_config_module      Sets logging of requests to the server.
env_module             Allows control of the environment that is passed to CGI (Common Gateway Interface) scripts and SSI pages.
mime_magic_module      Sets Apache to define the file type from a look at the first few bytes of the contents.
cern_meta_module       Supports additional meta-information with a web page, per the standards of the W3C, which is housed at CERN (the French acronym for the European Laboratory for Particle Physics).
expires_module         Lets Apache set an expiration date for the page, to support a web browser refresh request.
headers_module         Allows control of HTTP request and response headers.
usertrack_module       Supports user tracking with cookies.
unique_id_module       Sets a unique identifier for each request.
setenvif_module        Allows Apache to set environment variables based on request characteristics, such as the type of web browser.
mime_module            Associates the filename extension, such as .txt, with specific applications.
dav_module             Supports web-based distributed authoring and versioning functionality.
status_module          Gives information on server performance and activity.
autoindex_module       Allows the listing of files in a web directory.
asis_module            Sends files without adding extra headers.
info_module            Supports user access to server configuration information.
dav_fs_module          Supports the dav_module.
vhost_alias_module     Allows dynamically configured Virtual Hosts.
negotiation_module     Sets Apache to match content, such as language, to the settings from the browser.
dir_module             Supports viewing of files in Apache directories.
imap_module            Configures imagemap file directives (not related to e-mail).
actions_module         Lets you run CGI scripts.
speling_module         Allows for small mistakes in requested document names (ironically, the module name is misspelled).
userdir_module         Supports access to user-specific directories.
alias_module           Sets up redirected URLs.
rewrite_module         Supports rewriting of URLs.
proxy_module           Sets up a proxy server for Apache.
proxy_ftp_module       Allows proxy server support for FTP data.
proxy_http_module      Allows proxy server support for HTTP data.
proxy_connect_module   Required for proxy server connect requests.
cgi_module             Configures running of CGI scripts.
cgid_module            Supports running of CGI scripts with an external daemon.




One of the more interesting modules is the info_module; as you’ll see toward the end of the next section, it supports a detailed view of your Apache server configuration in your browser at localhost/server-info.
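As a preview, here is a minimal sketch of how the server-info page might be enabled. It assumes that the info_module is loaded, and it restricts the page to the local computer, since the output reveals a great deal about your configuration.

```apache
# Requires: LoadModule info_module modules/mod_info.so
<Location /server-info>
    # Hand requests for this URL to the mod_info handler
    SetHandler server-info

    # Deny everyone, then make an exception for the loopback address
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```

After you restart the httpd daemon, a browser on the server itself should be able to open localhost/server-info; requests from other computers should be refused.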







Main Server Configuration



Before we move on to configuring Virtual Hosts, let’s take a look at the next section in the httpd.conf configuration file, which includes the default directives for Apache. While you can set different settings for many of these directives, you do need to know the defaults in this section. We analyze the basic settings in this part of the httpd.conf file in order.






Note


This is a very long section; you may want to take a break if you’re in the habit of reading through a complete section at a time.





System User



As determined by the User and Group directives, the Apache daemon, httpd, is assigned a specific user and group name here and in /etc/passwd and /etc/group:


User apache
Group apache




Administrative Contact



With web pages generated by Apache, there is a listing for an administrative contact, as determined by the ServerAdmin directive:


ServerAdmin root@localhost




Web Server Name



If you have an administrative website for your web server, you’ll want to set it with the ServerName directive. If you don’t have a fully qualified domain name in a DNS server, use the IP address.


#ServerName new.host.name:80


If you activate this directive, it will normally be superseded by the name you set for each Virtual Host.





Canonical Name



Technically, a URL that points to a directory, such as http://www.Sybex.com/, is supposed to have a trailing slash. But I never remember to put it in. When a request leaves out the slash, Apache redirects the browser to the corrected URL. The UseCanonicalName directive controls which hostname goes into that redirect: when it is On, Apache uses the name specified by the ServerName directive; when it is Off, as in the standard httpd.conf file, Apache uses the hostname and port supplied by the client.


UseCanonicalName Off




Document Root



The root directory for your web server is specified by the DocumentRoot directive:


DocumentRoot "/var/www/html"




Web Directory Permissions



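Next, we look at the default permissions for users within directories accessible through your server’s websites. They’re set up by the <Directory /> container, which defines restrictive defaults starting at the top-level root (/) directory; later containers loosen these defaults for specific directories such as the DocumentRoot: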


<Directory />
Options FollowSymLinks
AllowOverride None
</Directory>


The Options directive determines which server features are available to requests for files in that directory. It can be set to several different values, as described in Table 30.5. The AllowOverride directive determines which settings can be overridden by an .htaccess file, such as a list of users or computers allowed to see certain files; with the AllowOverride None setting, Apache doesn’t even look at the .htaccess file.


Table 30.5: Options Directive Values

Value                  Description
---------------------  -----------
All                    Supports all settings except MultiViews.
ExecCGI                Allows the running of CGI scripts.
FollowSymLinks         Lets requests follow symbolically linked files or directories.
Includes               Allows the use of server-side includes (SSI).
IncludesNOEXEC         Allows SSIs, but no CGIs.
Indexes                If there is no index.html type file, sets up Apache to return a list of files in that directory. Options for this file are specified by the DirectoryIndex directive.
MultiViews             Supports content negotiation, such as between web pages in different languages.
SymLinksIfOwnerMatch   Follows symbolic links if the target file or directory is owned by the same user.






.htaccess Files



An .htaccess file is a distributed configuration file that you can use to configure individual directories on a website. It is a common way to implement restricted access to a specific directory.


An .htaccess file isn’t necessary in most cases; you can configure access on a per-directory basis in the main Apache configuration file, httpd.conf. In the default version of the main Apache configuration file, look for <Directory> containers. Observe how the restrictions vary for different directories.


However, if you have a large number of websites on your server, such as the personal web pages associated with many ISPs, you may want to use .htaccess files to let individual users regulate access to web pages in their home directories. You can set up a standard scheme to read .htaccess files, as described later in the "User Directory Permissions" section.


If you want to implement distributed configuration files, you can make them a little harder for outsiders to guess. Look for the AccessFileName directive in httpd.conf, and assign a hidden filename other than .htaccess. Also see the "Access Control" section later in this chapter.
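To show what restricted access with a distributed configuration file might look like, here is a minimal sketch of an .htaccess file that asks for a username and password. The password file path and the realm name are hypothetical; the password file itself would be created separately with the htpasswd command. The sketch also assumes that httpd.conf allows AuthConfig overrides for this directory.

```apache
# Contents of a hypothetical .htaccess file in the directory to protect.
# Takes effect only if httpd.conf sets "AllowOverride AuthConfig" (or All)
# for this directory.

# Use standard username/password authentication
AuthType Basic

# Realm name shown in the browser's login prompt (illustrative)
AuthName "Restricted Area"

# Hypothetical password file, stored outside the DocumentRoot
AuthUserFile /etc/httpd/conf/htpasswd.users

# Any user listed in the password file may enter
Require valid-user
```

The same four directives would work just as well inside a <Directory> container in httpd.conf, which is why an .htaccess file isn’t strictly necessary.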






Specific Directory Permissions



Next, we’ll look at the default permissions in httpd.conf for the /var/www/html directory, as specified by the following container:


<Directory "/var/www/html">


The following Options directive supports redirection via symbolic links and the listing of files in the current directory if there is no index.html type file (look ahead to Figure 30.2 for an example):


Options Indexes FollowSymLinks


As we mentioned in the previous section, the AllowOverride directive specifies the types of directives in the .htaccess file; the following option doesn’t even look at .htaccess:


AllowOverride None


Finally, there are access control directives; with the following Order directive, Apache evaluates the Allow directives for this directory first and then the Deny directives, and refuses any request that matches neither:


Order allow,deny
Allow from all
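The order of evaluation matters more when you want to restrict a directory. As a sketch, the following container reverses the order, so that everyone is denied except computers on one local network; the internal subdirectory and the network address are assumptions for illustration.

```apache
# Hypothetical subdirectory that should be visible only on the LAN
<Directory "/var/www/html/internal">
    Options FollowSymLinks
    AllowOverride None

    # Evaluate Deny directives first, then Allow directives;
    # a request that matches both is allowed
    Order deny,allow
    Deny from all

    # A partial address matches every host on the 192.168.13.x network
    Allow from 192.168.13
</Directory>
```

With Order deny,allow, a request that matches neither directive is allowed by default, which is why the Deny from all line is needed here.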




Root Directory Permissions



Now the httpd.conf file adds a couple more directives for users that access the top directory of your website, also known as DocumentRoot:


<LocationMatch "^/$">
Options -Indexes
ErrorDocument 403 /error/noindex.html
</LocationMatch>


The <LocationMatch "^/$"> container looks a little strange; this specific directive applies the commands therein (Options and ErrorDocument) to the root (/) directory.


The Options -Indexes directive prohibits the listing of files, courtesy of the - in front of the Indexes setting. If no index.html page is available, the ErrorDocument directive returns the noted error web page to the user. This location is a URL path, not a path relative to the ServerRoot directive; the /error/ URL is normally aliased to the /var/www/error directory, so that is where you should find noindex.html.


Oddly enough, the noindex.html file is the "Test Page" that is shown when Apache starts without the pages associated with a real website. It’s shown back in Figure 30.1.





User Directory Permissions



You can set up web pages in your users’ home directories. They are disabled by default with the following command:


UserDir disable


You can replace that command with the following:


UserDir public_html


Assume you have a user named ez, and she has a set of web page files in the /home/ez/public_html directory. Also, assume that your website is named www.example.abc. You need to set appropriate permissions:


# chmod 711 /home/ez
# chmod 755 /home/ez/public_html
# chmod 744 /home/ez/public_html/*


Then when you direct your browser to www.example.abc/~ez, you will be able to see any index.html web page that you might have stored in the /home/ez/public_html directory.


You can further regulate access to web pages and files in users’ home directories. Look at the following sample commands from the default httpd.conf file:


#<Directory /home/*/public_html>
# AllowOverride FileInfo AuthConfig Limit
# Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
# <Limit GET POST OPTIONS>
# Order allow,deny
# Allow from all
# </Limit>
# <LimitExcept GET POST OPTIONS>
# Order deny,allow
# Deny from all
# </LimitExcept>
#</Directory>


If you activate these commands, Apache allows you to browse the files in the public_html subdirectory, as described later in the "Directory Listings" section.


As described earlier, the AllowOverride directive relates to the access information that Apache reads from an individual .htaccess file. The different parameters associated with this directive are shown in Table 30.6. All descriptions refer to the commands that you can use in an .htaccess file on a per-directory basis.


Table 30.6: AllowOverride Directive Parameters

Parameter    Description
-----------  -----------
AuthConfig   Supports the use of authorization directives
FileInfo     Lets you configure various document types
Indexes      Permits you to configure indexing of the directory
Limit        Supports access control restrictions, such as deny and allow




The Options directive in this container, with values as described in Table 30.5, supports content negotiation, file indexing, following symbolic links owned by the same user, and SSIs but not CGIs.


The Limit directive sets options for users who want to send (POST) and receive (GET) files from the user home directory; the LimitExcept directive denies the use of all other access commands.
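If your users only publish static pages, you might want something stricter than the sample shown earlier. The following sketch is one possible variation, not the default configuration: it permits only GET requests and lets users manage only authentication in their own .htaccess files.

```apache
# A stricter, hypothetical variation on the commented-out default
<Directory /home/*/public_html>
    # Users may add password protection, but nothing else
    AllowOverride AuthConfig

    # Allow file listings; follow links only to the user's own files
    Options Indexes SymLinksIfOwnerMatch

    # Everyone may retrieve pages
    <Limit GET>
        Order allow,deny
        Allow from all
    </Limit>

    # Every other method, including POST, is refused
    <LimitExcept GET>
        Order deny,allow
        Deny from all
    </LimitExcept>
</Directory>
```

Remember that this container has no effect unless the UserDir directive is set to something like public_html.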





Directory Index



When users navigate to your website, they’re actually looking in a directory. The DirectoryIndex directive tells Apache the types of web pages to send back to the website user:


DirectoryIndex index.html index.html.var


The index.html document is a standard home page file used by many websites; index.html.var is one way to set up a dynamic home page. You can look at an example of .var files in the /var/www/error directory. Open those files in the text editor of your choice. You’ll see standard error messages.





Access Control



As described in the sidebar ".htaccess Files," you can configure access control files on individual directories. By default, it’s the hidden file .htaccess; you can set a different filename with the AccessFileName directive:


AccessFileName .htaccess


The following Files directive ensures that any file that starts with .ht is not viewable by users who are browsing your website:


<Files ~ "^\.ht">
Order allow,deny
Deny from all
</Files>




MIME Types



While the MIME (Multipurpose Internet Mail Extensions) standard was originally created for sending binary files over e-mail, it works for web pages as well. For example, you can configure your browser to open the PDF reader of your choice if you navigate to a PDF file on the Internet. The standard translation between MIME types and file extensions is listed through the TypesConfig directive:


TypesConfig /etc/mime.types


Many files do not have extensions such as .pdf or .doc. You can set the DefaultType directive to specify display options on a browser. If you use text files, the following standard should work well:


DefaultType text/plain


Alternatively, if most of your files are in binary format, you could end up sending dozens of pages of gibberish to your users unless you changed this directive to something like:


DefaultType application/octet-stream


If the extension doesn’t provide a clue, you can use the MIMEMagicFile directive, which uses the mod_mime_magic module defined in Table 30.4:


<IfModule mod_mime_magic.c>
# MIMEMagicFile /usr/share/magic.mime
MIMEMagicFile conf/magic
</IfModule>


Remember, the location of a “relative” path such as conf/magic is based on the ServerRoot directive. In other words, this section points to the MIMEMagicFile at /etc/httpd/conf/magic.


There is one more related directive, toward the end of the httpd.conf file. The AddType directive allows you to override the configuration as defined by TypesConfig in /etc/mime.types:


AddType application/x-tar .tgz




Log Data



Apache logs can be very large. If you’re running a large commercial website, you could easily collect hundreds of megabytes of log data every day. The choices you make for log data could easily overload your system.


Normally, HostnameLookups is set to Off; otherwise, Apache will look for the fully qualified domain name of every requesting user. Don’t turn it on unless you have reliable access to a DNS server and the network capacity to handle that volume of information.


HostnameLookups Off


You can set the locations of different log files. The ErrorLog directive, as you’d expect, sets the location of the error_log file. With the given value of ServerRoot, the following log file is located in the /etc/httpd/logs directory:


ErrorLog logs/error_log


You can control the types of messages sent to the ErrorLog file; available values for the LogLevel directive (debug, info, notice, warn, error, crit, alert, emerg) are similar to the priorities used in the standard logging configuration file, /etc/syslog.conf, back in Chapter 13.


LogLevel warn


Information about each request is logged in a specific format, as defined by the following LogFormat directives:


LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent


These lines define four different formats for the collected data: combined, common, referer, and agent.


The variables associated with LogFormat are described in Table 30.7. A substantial number of additional variables are available, which you can review in the mod_log_config.html file in the /var/www/manual/mod directory. Other request fields are per the standards of the World Wide Web Consortium, at www.w3.org/Protocols/HTTP/HTRQ_Headers.html.


Table 30.7: LogFormat Directive Variables

Variable     Description
-----------  -----------
%a           Remote IP address.
%b           Bytes sent (not including HTTP headers).
%h           Remote host.
%l           Remote log name.
%r           First line of the client request.
%s           Request status.
%t           Time.
%u           Remote user.
referer      Notes the page where someone clicked on a link. (Yes, in Apache, referer is not spelled correctly.)
user-agent   Notes the client program, such as Mozilla.




You can set the location of several other types of logs, as defined through the CustomLog directive. You can set this up within one of your Virtual Hosts, so the owners of individual websites on your server can get their own log files:


# CustomLog logs/access_log common
CustomLog logs/access_log combined
#CustomLog logs/referer_log referer
#CustomLog logs/agent_log agent
#CustomLog logs/access_log combined
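As a sketch of the Virtual Host approach, the following container gives one website its own access and error logs. The DocumentRoot and log filenames are hypothetical, and www.example.abc is the sample site name used elsewhere in this chapter; the combined format must already be defined by a LogFormat directive.

```apache
# Name-based Virtual Hosts on every address, port 80
NameVirtualHost *:80

<VirtualHost *:80>
    ServerName www.example.abc

    # Hypothetical directory holding this site's web pages
    DocumentRoot /var/www/example

    # Relative paths are based on ServerRoot, as with the main server logs
    CustomLog logs/example_access_log combined
    ErrorLog logs/example_error_log
</VirtualHost>
```

Requests for other websites on the same server continue to be logged in the files named in the main server configuration.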


These lines specify the location of your log files. Based on the default ServerRoot, that’s /etc/httpd/logs. The actual information that’s sent to each log file is based on the referenced LogFormat. For example, the active CustomLog directive refers to the combined format, which you might recall is:


LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined





The Server Signs the Web Page



The httpd.conf file can add one element to dynamically generated web pages, depending on the ServerSignature directive. Normally it’s set as follows:


ServerSignature On


When ServerSignature is set to On, you might see a message similar to the following at the bottom of dynamically generated web pages:


Apache/2.0.40 Server at localhost Port 80


Alternatively, if you substitute EMail for On, you’ll get a hyperlink from the name of the computer, in this case, localhost, to the server administrator, as defined by the ServerAdmin directive.





Aliases



You can use the Alias directive to set up a link between a directory in the URL and a directory on your computer. For example, the first Alias directive in the default httpd.conf file links the /icons/ subdirectory from a URL:


Alias /icons/ "/var/www/icons/"


to the /var/www/icons/ directory on the web server. This is also a good place to specify the permissions associated with /var/www/icons/.


<Directory "/var/www/icons">
Options Indexes MultiViews
AllowOverride None
Order allow,deny
Allow from all
</Directory>


These permissions allow users to list the contents of the directory (unless it contains a DirectoryIndex file such as index.html) and support content negotiation, such as between different languages, via MultiViews.
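You can follow the same pattern for directories of your own. As a sketch, the following directives publish a directory outside the DocumentRoot under a /downloads/ URL; both the URL and the directory are assumptions chosen for illustration.

```apache
# Map the URL path /downloads/ to a hypothetical directory
Alias /downloads/ "/var/ftp/pub/"

# An aliased directory needs its own permissions
<Directory "/var/ftp/pub">
    # Show a file listing, since this directory has no index.html
    Options Indexes
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
```

Note the trailing slash on both sides of the Alias directive; if you include it in the URL path, users need to include it as well.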


If you’ve installed the httpd-manual-* RPM and want to include the Apache manual on your website, change the following default Alias directive from


Alias /manual "/var/www/manual"


to


Alias /etc/httpd/manual "/var/www/manual"


This assumes that your ServerRoot directive is set to /etc/httpd. The following lines set permissions for the noted directory, and include the Web-based Distributed Authoring and Versioning (WebDAV) database:


<Directory "/var/www/manual">
Options Indexes FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
Allow from all
</Directory>
<IfModule mod_dav_fs.c>
# Location of the WebDAV lock database.
DAVLockDB /var/lib/dav/lockdb
</IfModule>




Scripts



Scripts in httpd.conf refer to programs that are run through the web server. Apache starts in the default httpd.conf file with a ScriptAlias directive, which is a specialized Alias for scripts:


ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"


Some scripts require access to the CGI daemon, which is defined by the Scriptsock directive:


<IfModule mod_cgid.c>
Scriptsock run/httpd.cgid
</IfModule>


Once again, this is a good opportunity to define the permissions associated with the scripts associated with your websites:


<Directory "/var/www/cgi-bin">
AllowOverride None
Options None
Order allow,deny
Allow from all
</Directory>


Note how these permissions don’t allow the use of .htaccess but support script execution by all users.


If you change website names, you’ll want to redirect users. For example, the following default Redirect directive takes users who navigate to your /bears directory to www.mommabears.com:


# Redirect permanent /bears http://www.mommabears.com
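Here is a short sketch of two variations on the Redirect directive. The paths and hostnames are hypothetical; permanent and temp are the status keywords that tell the browser whether the move is final.

```apache
# The pages have moved for good (HTTP status 301)
Redirect permanent /oldsite http://www.example.abc/newsite

# The pages are elsewhere for now (HTTP status 302)
Redirect temp /sale http://shop.example.abc/
```

Anything after the matched path is carried over, so a request for /oldsite/page.html would be sent to http://www.example.abc/newsite/page.html.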




Directory Listings



Sometimes you want to see the files in a directory. For example, Figure 30.2 illustrates the files in the /home/mike/public_html directory, based on the UserDir directives described earlier, in the User Directory Permissions section.




Figure 30.2: Viewing home directory files



The IndexOptions directive determines how directory listings are shown in client web browsers. For example, the default IndexOptions line


IndexOptions FancyIndexing VersionSort NameWidth=*


configures FancyIndexing, for icons and file sizes; VersionSort, which sorts numbers such as RPM versions in a specific order; and a NameWidth as large as needed for the filenames in the directory.
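As a rough sketch, you could apply the same options to a single directory. The directory name below is a hypothetical example, and FoldersFirst is an extra IndexOptions keyword that is not part of the default line.

```apache
# Hypothetical directory; substitute one of your own
<Directory "/var/www/html/downloads">
    # Indexes permits an automatically generated file list
    Options Indexes
    # Same settings as the default line, plus FoldersFirst
    # to list subdirectories ahead of regular files
    IndexOptions FancyIndexing VersionSort NameWidth=* FoldersFirst
    Order allow,deny
    Allow from all
</Directory>
```

Remember that the file list appears only when the directory has no DirectoryIndex file of its own.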





Icons



Speaking of icons, a list of icons is available for different file types and extensions. These icons are shown with a file list, assuming you have set IndexOptions FancyIndexing as defined in the previous section. There are three basic AddIcon* directives:


AddIconByEncoding (CMP,/icons/compressed.gif)  x-compress x-gzip


The AddIconByEncoding directive shown here applies to compressed binary files. Several AddIconByType directives are also included for four different file types:


AddIconByType (TXT,/icons/text.gif) text/*
AddIconByType (IMG,/icons/image2.gif) image/*
AddIconByType (SND,/icons/sound2.gif) audio/*
AddIconByType (VID,/icons/movie.gif) video/*


Finally, there is a series of AddIcon directives that associate a specific icon with different filename extensions:


AddIcon /icons/binary.gif .bin .exe
AddIcon /icons/binhex.gif .hqx
AddIcon /icons/tar.gif .tar
AddIcon /icons/world2.gif .wrl .wrl.gz .vrml .vrm .iv
AddIcon /icons/compressed.gif .Z .z .tgz .gz .zip
AddIcon /icons/a.gif .ps .ai .eps
AddIcon /icons/layout.gif .html .shtml .htm .pdf
AddIcon /icons/text.gif .txt
AddIcon /icons/c.gif .c
AddIcon /icons/p.gif .pl .py
AddIcon /icons/f.gif .for
AddIcon /icons/dvi.gif .dvi
AddIcon /icons/uuencoded.gif .uu
AddIcon /icons/script.gif .conf .sh .shar .csh .ksh .tcl
AddIcon /icons/tex.gif .tex
AddIcon /icons/bomb.gif core
AddIcon /icons/back.gif ..
AddIcon /icons/hand.right.gif README
AddIcon /icons/folder.gif ^^DIRECTORY^^
AddIcon /icons/blank.gif ^^BLANKICON^^


These AddIcon directives are straightforward. For example, if Apache sees a file with an .exe extension, it adds the /icons/binary.gif icon as a label for that particular file. But this list is not comprehensive; there is a DefaultIcon directive for files with unknown extensions:


DefaultIcon /icons/unknown.gif


If you like, you can activate the following AddDescription directives to give users a bit more information about files with specific extensions:


#AddDescription "GZIP compressed document" .gz
#AddDescription "tar archive" .tar
#AddDescription "GZIP compressed tar archive" .tgz
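Activating these directives is just a matter of removing the comment marks. The sketch below shows the result; the final .rpm entry is a hypothetical addition, not part of the default file.

```apache
# The three stock descriptions with the comment mark removed
AddDescription "GZIP compressed document" .gz
AddDescription "tar archive" .tar
AddDescription "GZIP compressed tar archive" .tgz

# Hypothetical extra entry for RPM packages
AddDescription "RPM package" .rpm
```

The descriptions appear in the file list only when FancyIndexing is in effect.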


You can set up directories with various HTML files. For example, the HeaderName directive specifies a file to put before the file list; the ReadmeName directive specifies a file to put after the file list.


ReadmeName README.html
HeaderName HEADER.html


The IndexIgnore directive sets Apache to avoid listing the noted files in any directory list. Note how the default value covers the HEADER.html and README.html files.


IndexIgnore .??* *~ *# HEADER* README* RCS CVS *,v *,t




Decompression



Some browsers can read and automatically decompress certain files in your website directories. All you need to do is specify the encoding associated with certain filename extensions by using the AddEncoding directive:


AddEncoding x-compress Z
AddEncoding x-gzip gz tgz




Languages



Multilingual websites include web pages in multiple languages. The DefaultLanguage directive defines the language associated with all web pages that aren’t already labeled. The following inactive directive specifies the Dutch language:


# DefaultLanguage nl


You can set up web pages in different languages, as defined by the AddLanguage directive. For example, index.html.cz is a web page associated with the Czech language:


AddLanguage cz .cz


Other language codes are listed in Table 30.8.



























































































Table 30.8: AddLanguage Codes

Code     Language
ca       Catalan
cz       Czech
da       Danish
de       German
en       English
el       Modern Greek
es       Spanish
et       Estonian
fr       French
he       Hebrew
hr       Croatian
it       Italian
ja       Japanese
kr       Korean
ltz      Luxembourgeois
nl       Dutch (Netherlands)
nn       Norwegian Nynorsk
no       Norwegian
pl       Polish
pt       Portuguese
pt-br    Brazilian Portuguese
ru       Russian
sv       Swedish
tw       Chinese *
zh-tw    Chinese

* Anyone who follows the political situation in China in any depth will understand that the designation of tw as Chinese has caused some controversy. As I understand it, the people behind Apache are in the process of converting all Chinese AddLanguage codes to zh.




A web browser should tell the web server the preferred language. However, when this doesn’t work, the LanguagePriority directive sets the preferred language:


LanguagePriority en da nl et fr de el it ja kr no pl pt pt-br ltz ca es sv tw


This works hand in hand with the ForceLanguagePriority directive. As defined in the default httpd.conf file, it uses the LanguagePriority directive list to select from languages acceptable to the client web browser. If no acceptable language page is available, the first item on the LanguagePriority list (in this case, English) is used.
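The three language directives fit together roughly as in the following sketch. The short language list is an assumption chosen to keep the example small; the Prefer and Fallback keywords are the ones used with ForceLanguagePriority elsewhere in this chapter.

```apache
# Map language codes to filename extensions
AddLanguage en .en
AddLanguage fr .fr
AddLanguage de .de

# Order of preference when the browser expresses none
LanguagePriority en fr de

# Prefer: pick one page when several match equally well
# Fallback: serve the first LanguagePriority match when
# nothing acceptable to the browser is available
ForceLanguagePriority Prefer Fallback
```

With this in place, and with MultiViews active for the directory, files such as index.html.en and index.html.fr can sit side by side, and Apache chooses between them for each request.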


Many languages don’t work too well unless you have the right set of characters. Most language characters have been organized into different ISO character sets. The default, which works for English and a number of similar languages, is ISO-8859-1. It’s forced into the default websites for Apache with the following directive:


AddDefaultCharset ISO-8859-1


Several other character sets are available, as defined by the following AddCharset directives. For more information on these character sets, see www.iana.org/assignments/character-sets.


AddCharset ISO-8859-1  .iso8859-1  .latin1
AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
AddCharset ISO-8859-3 .iso8859-3 .latin3
AddCharset ISO-8859-4 .iso8859-4 .latin4
AddCharset ISO-8859-5 .iso8859-5 .latin5 .cyr .iso-ru
AddCharset ISO-8859-6 .iso8859-6 .latin6 .arb
AddCharset ISO-8859-7 .iso8859-7 .latin7 .grk
AddCharset ISO-8859-8 .iso8859-8 .latin8 .heb
AddCharset ISO-8859-9 .iso8859-9 .latin9 .trk
AddCharset ISO-2022-JP .iso2022-jp .jis
AddCharset ISO-2022-KR .iso2022-kr .kis
AddCharset ISO-2022-CN .iso2022-cn .cis
AddCharset Big5 .Big5 .big5
# For Russian, more than one charset is used (depends on client, mostly):
AddCharset WINDOWS-1251 .cp-1251 .win-1251
AddCharset CP866 .cp866
AddCharset KOI8-r .koi8-r .koi8-ru
AddCharset KOI8-ru .koi8-uk .ua
AddCharset ISO-10646-UCS-2 .ucs2
AddCharset ISO-10646-UCS-4 .ucs4
AddCharset UTF-8 .utf8
AddCharset GB2312 .gb2312 .gb
AddCharset utf-7 .utf7
AddCharset utf-8 .utf8
AddCharset big5 .big5 .b5
AddCharset EUC-TW .euc-tw
AddCharset EUC-JP .euc-jp
AddCharset EUC-KR .euc-kr
AddCharset shift_jis .sjis





Mapped Handlers



You can map filename extensions to a specific handler. For example, the following commented AddHandler directive activates CGI script handling for files with the .cgi extension, assuming you also have set the Options ExecCGI directive for the subject directory:


#AddHandler cgi-script .cgi
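Here is a minimal sketch of how the handler and the ExecCGI option might be combined for one directory. The directory name is a hypothetical example.

```apache
# Hypothetical directory that holds .cgi programs
<Directory "/var/www/html/tools">
    # +ExecCGI adds script execution to whatever options
    # the directory already inherits
    Options +ExecCGI
    # Treat files ending in .cgi as CGI scripts
    AddHandler cgi-script .cgi
</Directory>
```

Limiting the handler to a single directory, rather than setting it for the whole server, keeps stray .cgi files elsewhere from being executed.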


The following commented directive makes sure that files that already have HTTP headers don’t get processed:


#AddHandler send-as-is asis


To activate commented directives, remove the comment mark (#) in httpd.conf in the text editor of your choice.


This directive processes image map files:


AddHandler imap-file map


Finally, this directive supports .var files, which are associated with finding the language specified by a web browser client:


AddHandler type-map var


Part of the process includes output filters. For example, the following AddOutputFilter directive looks in web pages with .shtml extensions for Server Side Includes.
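The directive itself is not reproduced here. As a hedged sketch, the corresponding lines in a stock Apache 2.0 httpd.conf typically look like the following; check your own file, since the exact wording is an assumption rather than a quotation from it.

```apache
# Serve .shtml files with the HTML content type ...
AddType text/html .shtml
# ... after passing them through the Server Side Includes filter
AddOutputFilter INCLUDES .shtml
```

Server Side Includes also have to be permitted for the directory in question, for example with Options Includes or Options IncludesNoExec.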





Error Messages



On a web server, if you have an error, you get a message associated with a specific web page. Figure 30.3 illustrates the error message associated with the HTTP 404 error code, also known as the “file not found” error.




Figure 30.3: An HTTP 404 Error



The default error directory is /var/www/error; the following Alias directive maps the /error/ URL path to that directory:


Alias /error/ "/var/www/error/"


The following modules provide for content negotiation and SSIs in the web pages in the /var/www/error/ directory:


<IfModule mod_negotiation.c>
<IfModule mod_include.c>


The following permissions on the /var/www/error directory set the stage for error messages in English, Spanish, German, and French, in that order. You can read more about the other directives in the "Directory Index" section earlier in this chapter.


<Directory "/var/www/error">
AllowOverride None
Options IncludesNoExec
AddOutputFilter Includes html
AddHandler type-map var
Order allow,deny
Allow from all
LanguagePriority en es de fr
ForceLanguagePriority Prefer Fallback
</Directory>


This works hand in hand with HTTP error codes. The page a user sees depends on the error code and the web page defined by the following ErrorDocument directives:


ErrorDocument 400 /error/HTTP_BAD_REQUEST.html.var
ErrorDocument 401 /error/HTTP_UNAUTHORIZED.html.var
ErrorDocument 403 /error/HTTP_FORBIDDEN.html.var
ErrorDocument 404 /error/HTTP_NOT_FOUND.html.var
ErrorDocument 405 /error/HTTP_METHOD_NOT_ALLOWED.html.var
ErrorDocument 408 /error/HTTP_REQUEST_TIME_OUT.html.var
ErrorDocument 410 /error/HTTP_GONE.html.var
ErrorDocument 411 /error/HTTP_LENGTH_REQUIRED.html.var
ErrorDocument 412 /error/HTTP_PRECONDITION_FAILED.html.var
ErrorDocument 413 /error/HTTP_REQUEST_ENTITY_TOO_LARGE.html.var
ErrorDocument 414 /error/HTTP_REQUEST_URI_TOO_LARGE.html.var
ErrorDocument 415 /error/HTTP_SERVICE_UNAVAILABLE.html.var
ErrorDocument 500 /error/HTTP_INTERNAL_SERVER_ERROR.html.var
ErrorDocument 501 /error/HTTP_NOT_IMPLEMENTED.html.var
ErrorDocument 502 /error/HTTP_BAD_GATEWAY.html.var
ErrorDocument 503 /error/HTTP_SERVICE_UNAVAILABLE.html.var
ErrorDocument 506 /error/HTTP_VARIANT_ALSO_VARIES.html.var
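If you would rather supply your own pages, the same directive can point elsewhere. The following is only a sketch; the /missing.html page and the message text are hypothetical.

```apache
# Hypothetical custom page stored under the DocumentRoot
ErrorDocument 404 /missing.html

# A short text message also works; the quotation marks
# tell Apache that this is text rather than a path
ErrorDocument 403 "Sorry, access to this page is restricted."
```

A plain page like this gives up the language negotiation that the default .var files provide, so it is probably best suited to single-language sites.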





Browser Customization



When a web browser asks for a web page, it tells Apache what kind of browser it is. The BrowserMatch directive helps you customize the response to different web browsers:


BrowserMatch "Mozilla/2" nokeepalive
BrowserMatch "MSIE 4\.0b2;" nokeepalive downgrade-1.0 force-response-1.0
BrowserMatch "RealPlayer 4\.0" force-response-1.0
BrowserMatch "Java/1\.0" force-response-1.0
BrowserMatch "JDK/1\.0" force-response-1.0


The first two directives create special responses for older browsers; Mozilla/2 corresponds to Netscape 2.x, and MSIE 4\.0b2 corresponds to Microsoft Internet Explorer 4.0 beta 2. These browsers do not conform to the current HTTP 1.1 standard. The last three directives force HTTP 1.0–level responses to the specified clients.


There is a special issue with Microsoft WebFolders, which does not properly handle WebDAV databases. This issue is addressed with the following BrowserMatch directives:


BrowserMatch "Microsoft Data Access Internet Publishing Provider" redirect-carefully
BrowserMatch "^WebDrive" redirect-carefully





Server Reports



Apache can generate reports on its own status and configuration. For example, the following stanza, when activated, gives you the current status of Apache:


#<Location /server-status>
# SetHandler server-status
# Order deny,allow
# Deny from all
# Allow from .your-domain.com
#</Location>


I would activate it with the following directives. Simply removing the comment marks is not enough; the Deny from all directive would still stop all traffic to the http://servername/server-status address, except from the placeholder .your-domain.com domain. In this case, my LAN is on the 192.168.13.0/24 network:


<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
Allow from 192.168.13.0/24
</Location>


You can see the result from another computer on my LAN through a different web browser in Figure 30.4.


You can get similar reports on your Apache configuration when you properly activate the following commands:


#<Location /server-info>
# SetHandler server-info
# Order deny,allow
# Deny from all
# Allow from .your-domain.com
#</Location>




Figure 30.4: Checking server status remotely



These directives come directly from the default httpd.conf file; remember to set Allow from your_network_address, similar to what I did in the previous stanza. When you do, you can see the results remotely, as shown in Figure 30.5.
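For reference, an activated server-info stanza might look like the following sketch. It assumes the same 192.168.13.0/24 LAN used for server-status and assumes that the module providing the server-info handler is loaded.

```apache
<Location /server-info>
    SetHandler server-info
    # Deny rules are evaluated first, then Allow rules;
    # Deny from all closes the door to everyone else
    Order deny,allow
    Deny from all
    # The LAN used in the earlier server-status example
    Allow from 192.168.13.0/24
</Location>
```

Since this report reveals much of your configuration, it is wise to keep the Allow from range as narrow as you can.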




Figure 30.5: Checking server configuration remotely






Proxy Server



Apache includes its own proxy server. You can set Apache to cache and serve requested web pages for local networks or for all users. The basic directives are shown here; I’ve changed them a bit to apply the proxy server to my LAN with a network address of 192.168.13.0/24:


#<IfModule mod_proxy.c>
#ProxyRequests On
#
#<Proxy *>
# Order deny,allow
# Deny from all
# Allow from 192.168.13.0/24
#</Proxy>


If you have multiple proxy servers, you should activate the following ProxyVia directive, which supports searches through a chain of proxy servers using HTTP 1.1:


#ProxyVia On
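Put together and activated, the proxy directives shown so far might look like this sketch. The closing IfModule tag is an assumption about where the container ends, because the excerpts above do not show it.

```apache
<IfModule mod_proxy.c>
    # Act as a forward proxy
    ProxyRequests On

    <Proxy *>
        Order deny,allow
        Deny from all
        # Only the LAN from the example may use the proxy
        Allow from 192.168.13.0/24
    </Proxy>

    # Add a Via: header so requests can be traced through
    # a chain of proxy servers
    ProxyVia On
</IfModule>
```

Be careful with the Allow from line; a forward proxy that is open to everyone can be abused by outsiders to hide the source of their traffic.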


A proxy server has no purpose unless you configure a cache. Table 30.9 describes the series of special directives associated with caches. If you set up a proxy server, you may want to change some of these settings; for example, you may want a CacheSize larger than 5KB:


#CacheRoot "/etc/httpd/proxy"
#CacheSize 5
#CacheGcInterval 4
#CacheMaxExpire 24
#CacheLastModifiedFactor 0.1
#CacheDefaultExpire 1
#NoCache a-domain.com another-domain.edu joes.garage-sale.com































Table 30.9: Apache Cache Directives

Directive                  Description
CacheDefaultExpire         Sets the time to cache a document, in seconds.
CacheGcInterval            Configures the time between attempts to clear old data from a cache, in hours.
CacheLastModifiedFactor    Sets the expiration time for files in the cache. If there is no expiration date and time associated with a web page, Apache sets it relative to the amount of time since the last known change to that page.
CacheMaxExpire             Selects the maximum time in seconds to cache a document.
CacheRoot                  Configures the default directory with the proxy server cache.
CacheSize                  Sets the size of the cache, in kilobytes.








Virtual Hosts



One of the strengths of Apache 2.0.x is its ability to set up multiple websites on a single IP address. This is possible with the concept of Virtual Hosts.


Early versions of Apache supported only IP-based Virtual Hosts, which required a separate IP address for each website configured through your Apache server. Apache 2.0.x also supports name-based Virtual Hosts.


In this scheme, DNS servers map multiple domain names, such as www.mommabears.com and www.sybex.com, to the same IP address, such as 10.111.123.45. You can set up httpd.conf to recognize the different domain names and serve the appropriate website.






Note


You can’t always use the name-based scheme; it doesn’t work if you need a secure (SSL) part of your website, such as to support e-commerce. It also has problems with older clients, such as Netscape 2.0 and Internet Explorer 4.0 browsers. These browsers cannot handle a lot of information associated with the current HTTP 1.1 standard.




The following code is an example of how to configure two Virtual Hosts, in this case for www.sybex.com and www.mommabears.com:


NameVirtualHost *


This NameVirtualHost directive listens to requests to all IP addresses on the local computer. Alternatively, you can substitute the actual IP address for the * in this section:


<VirtualHost *>
ServerAdmin webmaster@sybex.com
DocumentRoot /www/site1/sybex.com
ServerName sybex.com
ErrorLog logs/sybex.com-error_log
CustomLog logs/sybex.com-access_log common
</VirtualHost>


The directives in the www.sybex.com <VirtualHost *> container supersede any settings made earlier in the httpd.conf file. You can customize each Virtual Host by adding the directives of your choice:


<VirtualHost *>
ServerAdmin webmaster@mommabears.com
DocumentRoot /www/site2/mommabears.com
ServerName mommabears.com
ErrorLog logs/mommabears.com-error_log
CustomLog logs/mommabears.com-access_log common
</VirtualHost>


As you can see, the settings for the mommabears.com website are similar; remember, relative directories depend on the ServerRoot directive.
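If you prefer to tie the Virtual Hosts to one address, the following sketch substitutes the 10.111.123.45 address from the DNS example for the *. The ServerAlias line is an addition that is not in the original stanza; it assumes you want the site to answer to both www.sybex.com and sybex.com.

```apache
# Use the address from the DNS example instead of *
NameVirtualHost 10.111.123.45

<VirtualHost 10.111.123.45>
    ServerAdmin webmaster@sybex.com
    DocumentRoot /www/site1/sybex.com
    ServerName www.sybex.com
    # Also answer requests for the bare domain name
    ServerAlias sybex.com
    ErrorLog logs/sybex.com-error_log
    CustomLog logs/sybex.com-access_log common
</VirtualHost>
```

The address in each VirtualHost container should match the one given to NameVirtualHost; otherwise, Apache does not treat the container as part of the name-based set.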







Customizing Apache Modules



There are a number of Apache module-specific configuration files in the /etc/httpd/conf.d directory, installed through some of the module RPMs described earlier in the "Packages" section. They are included in the basic Apache configuration courtesy of the Include conf.d/*.conf directive in the main httpd.conf file. These module files are summarized in Table 30.10.































Table 30.10: Apache Module Configuration Files

File               Description
auth_mysql.conf    Supports access to a MySQL database; the default version of this file includes various authentication commands.
auth_pgsql.conf    Supports access to a PostgreSQL database; the default version of this file includes various authentication commands.
perl.conf          Incorporates a Perl interpreter; supports the use of Perl commands and scripts.
php.conf           Incorporates a PHP scripting language interpreter.
python.conf        Configures a Python interpreter; allows the use of Python commands and scripts.
ssl.conf           Adds Secure Socket Layer (SSL) support; uses TCP/IP port 443 by default. Includes several directives for certificates and encryption methods.
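The mechanism that pulls in these files can be sketched as follows. The ServerRoot value is the /etc/httpd setting assumed earlier in this chapter.

```apache
# Relative paths are resolved against ServerRoot
ServerRoot "/etc/httpd"

# Read every file ending in .conf from /etc/httpd/conf.d
Include conf.d/*.conf
```

Any file that you add to that directory with a .conf extension should be read the next time Apache starts, which makes it a convenient place for your own settings.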






Troubleshooting Apache



If you’re unable to make a connection to a website configured on an Apache web server, you can check a number of things. Before you begin, check the network. The most common problem on any network is physical; for example, it’s good to inspect connectors and cables. Then, check connectivity using commands such as ping; for more information, see Chapter 21.



Checking Basic Operation



Once you’re sure that your network is operational, the next step is to see if Apache is running. Start with the following command:


# service httpd status


You should see a message such as:


httpd (pid 3464 3463 3462 3461 3460 3459 3458) is running


This tells you that a number of Apache (httpd) daemons are running; the number depends on httpd.conf directives such as StartServers. If you’re having a problem, there are three other fairly common messages:


httpd is stopped


This is fairly simple; try a service httpd start command. Rerun the service httpd status command. You might also see the following message:


httpd is dead but pid file exists


In this example, Apache can’t start, in part because there is an httpd.pid file in the /var/run directory. This can happen after a power failure (assuming you don’t have an uninterruptible power supply) where Linux never got a chance to erase the httpd.pid file. Try deleting the file and then run the service httpd start command. Rerun the service httpd status command. You might now see the following message:


httpd dead but subsys locked


That tells us something else is going wrong. It’s time to inspect the log files.





Checking Log Files



The default location for your Apache log files as defined in httpd.conf is /etc/httpd/logs; however, you’ll find this directory linked to a more standard location for log files, /var/log/httpd. Remember, you have the freedom to put log files in a different directory by using CustomLog directives in a Virtual Host container.


Read the log files in this directory for clues. The variety of errors that you might find is beyond the scope of this book; however, many of the log entries are self-explanatory.
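When the log entries are too sparse to be useful, one option is to raise the level of detail for a while. The following sketch assumes the usual error log location and the usual default level of warn.

```apache
# Default-style setting: log file relative to ServerRoot
ErrorLog logs/error_log

# Temporarily raise the detail level from the usual "warn"
# while troubleshooting; remember to set it back afterward
LogLevel debug
```

Debug-level logging can fill a log file quickly on a busy server, so treat it as a short-term measure.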





Checking Syntax



The Apache web server includes its own syntax checker for the main configuration file, httpd.conf. If there is a problem, the command


# httpd -t


often identifies the line number with the problem, such as a misspelled directive. Alternatively, the following command starts Apache in debug mode, as a single process that stays attached to the console, which can help you identify additional problems:


# httpd -X




Checking the Firewall



Sometimes messages just aren’t getting through to your web server. That may mean that you forgot to let in messages through the standard HTTP port (80) in the firewall. Run an iptables -L command to list current firewall rules. Refer to Chapter 22 for more information on this command.


As described with the various firewall utilities (Chapters 3, 4, and 19), you can set up firewalls that automatically allow data through the HTTP port. Remember, if you also serve secure web pages, you should also open up the associated port. In this case, for HTTPS, that is port 443. Standard TCP/IP port numbers are defined in /etc/services.







