Linux Server Security (2nd Edition)

10.4. Web Applications

The Web Application Security Consortium has
classified web threats and tried to standardize their descriptions
(http://www.webappsec.org/threatl). The
Open Web Application Security Project
(OWASP) describes the top 10 vulnerabilities (http://www.owasp.org/documentation/toptenl)
and how to secure web applications (http://www.owasp.org/documentation/guide/guide_aboutl).
All are well worth reading.

10.4.1. Processing Forms

The top risk in the OWASP list is currently
unvalidated input. This is most evident in the
workhorse of web applications, form processing.

In the previous section, I showed how to get and echo the value of
the form element named string.
I'll now show how to circumvent this simple code,
and how to protect against the circumvention.

Client-side form checking with JavaScript is a
convenience for the user, and it avoids a round-trip to the server to
load a new page with error messages. However, it does not protect you
from a handcrafted form submission with bad data.
Here's a simple form that lets the web user enter a
text string:

When submitted, we want to echo the string. Let's
look again at a naive stab at echo in PHP:

<? echo "string = ", $_REQUEST["string"], "\n"; ?>

And the same in Perl:

#!/usr/bin/perl -w
use strict;
use CGI qw(:standard);
print header;
print "string = ", param("string"), "\n";

This looks just ducky. In fact, if you type quack
into the string field, you see the output:

string = quack

But someone with an evil mind might enter this text into the
string field:

Submit this, and watch the JavaScript code bounce you right back to
your input form. If this form did something more serious than echo
its input (such as entering the contents of a literal tag into a
database), the results could be more serious.

Never trust user input. Validate everything on the server. Check for
commands within data.

This is an example of someone uploading code to your server without
your knowledge and then getting it to download and execute on any
browser. This cross-site scripting bug was fixed
within JavaScript itself some time ago, but that
doesn't help in this case, because JavaScript is
being injected into the data of a server-side script. HTML tags that
invoke active content are shown in Table 10-7.

Table 10-7. HTML active content tags
Tag	Use
<script>	Client-side script. Languages include JavaScript, Jscript, ECMAScript, and VBScript.
<embed>	Embedded object. Used with browser plug-ins.
<object>	Embedded object. Used with ActiveX/COM components in Windows.
<applet>	Java applet.

Each scripting language has the ability to
escape input data, removing any magic
characters, quotes, callouts, or anything else that would treat the
input as something other than plain text.

An even better approach is to specify what you
want, rather than escaping what you
don't want. You can match the data against a regular
expression specifying the legal input patterns. The complexity of the
regular expression depends on the type of data and the desired level
of validity checking. For example, you might want to ensure that a
U.S. phone number field has exactly 10 digits, or that an email
address follows RFC 822.
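
The "specify what you want" approach can be sketched in a few lines. The example below uses Python purely as a neutral illustration (the chapter's own examples are in PHP and Perl); the function names and both patterns are assumptions made for this sketch, not part of any library.

```python
import re

# Hypothetical patterns: exactly 10 ASCII digits, and a run of "word"
# characters (letters, digits, underscore).
PHONE_RE = re.compile(r"[0-9]{10}")
WORD_RE = re.compile(r"\w+", re.ASCII)


def valid_phone(text):
    """Accept only a string made of exactly 10 ASCII digits."""
    return PHONE_RE.fullmatch(text) is not None


def valid_word(text):
    """Accept only letters, digits, and underscores."""
    return WORD_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["4155551212", "415-555-1212", "<script>"]:
        print(repr(candidate), valid_phone(candidate), valid_word(candidate))
```

Matching the whole string (rather than searching for a valid substring) is the important design choice: anything outside the pattern causes rejection instead of being silently passed along.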

10.4.1.1 PHP

To
avoid interpreting a text-form variable as JavaScript or HTML, escape
the special characters with the PHP functions
htmlspecialchars
or htmlentities. Some
helper functions are available at http://www.owasp.org/software/labs/phpfiltersl.
As mentioned previously, it's even better to extract
the desired characters from the input first via a regular-expression
match. In the following section, there's an example
of how Perl can be used to untaint input data.
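
To show what escaping does to the echo example, here is a rough Python analogue. It uses the standard library's html.escape, which plays a role similar to the PHP functions named above; echo_string is a hypothetical helper written for this sketch.

```python
import html


def echo_string(value):
    """Build the echo line with HTML metacharacters escaped."""
    return "string = " + html.escape(value, quote=True)


if __name__ == "__main__":
    print(echo_string("quack"))
    print(echo_string("<script>alert(1)</script>"))
```

The escaped output is displayed by the browser as literal text instead of being interpreted as a tag.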

PHP has had another
security issue with global data. When
the PHP configuration variable register_globals
is enabled, PHP creates an automatic global variable to match each
variable in a submitted form. In the earlier example, a PHP variable
named $string winks into existence to match the
form variable string. This makes form processing
incredibly easy. The problem is that anyone can craft a URL with such
variables, forging a corresponding PHP variable. So any uninitialized
variable in your PHP script could be assigned from the outside.

The danger is not worth the convenience. Specify
register_globals off in your
php.ini file. Starting
with PHP 4.2.0, this is the default setting. PHP Versions 4.1.1 and
up also provide safer new autoglobal arrays.
These are automatically global within PHP functions (in
PHP, you need to say global var within a PHP function to access
the normal global variable named var; this quirk
always bites Perl developers). These arrays should be used instead of
the older arrays $HTTP_GET_VARS and
$HTTP_POST_VARS, and are listed in Table 10-8.

Table 10-8. PHP's old and new global arrays
Variable type	Old global array	New autoglobal array
Environment	$HTTP_ENV_VARS	$_ENV
Get	$HTTP_GET_VARS	$_GET
Post	$HTTP_POST_VARS	$_POST
Posted files	$HTTP_POST_FILES	$_FILES
Cookie	$HTTP_COOKIE_VARS	$_COOKIE
Server	$HTTP_SERVER_VARS	$_SERVER

Another new autoglobal array, $_REQUEST, is the
union of $_GET, $_POST, and
$_COOKIE. This is handy when you
don't care how the variable got to the server.

10.4.1.2 Perl

Perl
runs in taint mode
in the following situations:

Automatically, when the real and effective user ID and group ID differ

Explicitly, when invoked with the -T flag

This mode marks data originating outside the script as potentially
unsafe and forces you to do something about it. To untaint a
variable, run it through a regular expression, and grab it from one
of the positional match variables ($1,
$2, ...). Here's an example that
gets a sequence of "word"
characters (\w matches letters, digits, and
_ ):

#!/usr/bin/perl -wT
use strict;
use CGI qw(:standard);
my $user = param("user");
if ($user =~ /^(\w+)$/) { $user = $1; }

We'll see that taint mode applies to file I/O,
program execution, and other areas where Perl is reaching out into
the world.

10.4.2. Including Files

CGI scripts can include files inside or
outside of the document hierarchy. Try to move sensitive information
from your scripts to files located outside the document hierarchy.
This is one layer of protection if your CGI script somehow loses its
protective cloak and can be viewed as a simple file.

Use a special suffix for sensitive include files (a common choice is
.inc), and tell Apache not to serve files with
that suffix. This will protect you when you accidentally put an
include file somewhere in the document root. Add this to an Apache
configuration file:

<FilesMatch "\.inc$">
order allow,deny
deny from all
</FilesMatch>

Also, watch out for text editors that may leave copies of edited
scripts with suffixes like ~ or
.bak. The crafty snoop could just ask your web
server for files like program~ or
program.bak. Your access and error logs will
show if anyone has tried. To forbid serving them anywhere, add this
to your Apache configuration file:

<FilesMatch "(~|\.bak)$">
order allow,deny
deny from all
</FilesMatch>

When users are allowed to view or download files based on a submitted
form variable, guard against attempts to access sensitive data, such
as a password file. One exploit is to use relative paths (..):

../../../etc/passwd

Cures for this depend on the language and are described in the
following sections.

10.4.2.1 PHP

External files can be
included with the PHP include or
include_once commands. These may contain functions
for database access or other sensitive information. A mistake in your
Apache configuration could expose PHP files within normal document
directories as normal text files, and everyone could see your code.
For this reason, I recommend the following:

Include sensitive PHP scripts from a location outside of your
document root. Edit php.ini to specify:

include_path = ".:/usr/local/lib/php:/usr/local/my_php_lib"

Use the protected suffix for your included files:

<? include_once "db_login.inc"; ?>

Use the basename function to isolate the filename
from the directory and open_basedir to restrict
access to a certain directory. These will catch attempts to use
../ relative filenames.
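
The combination of "keep only the filename" and "stay inside one directory" might look like the following sketch. It is written in Python as a language-neutral illustration rather than as PHP; safe_join is a hypothetical helper, and the directory in the demonstration is only an example.

```python
import os


def safe_join(base_dir, requested):
    """Return a path for `requested` inside base_dir, or None if it escapes."""
    base = os.path.realpath(base_dir)
    # Keep only the final path component, much as PHP's basename would.
    name = os.path.basename(requested)
    if name in ("", ".", ".."):
        return None
    candidate = os.path.realpath(os.path.join(base, name))
    # Confirm the resolved path is still under the base directory
    # (this also catches symbolic links that point elsewhere).
    if os.path.commonpath([base, candidate]) != base:
        return None
    return candidate


if __name__ == "__main__":
    print(safe_join("/usr/local/apache/host/files", "../../../etc/passwd"))
    print(safe_join("/usr/local/apache/host/files", ".."))
```

A request for ../../../etc/passwd is reduced to a file named passwd inside the base directory, so the traversal never leaves the intended area.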

If you process forms where people request a file and get its
contents, you need to watch the PHP file-opening command
fopen and the file-reading commands
fpassthru and readfile.
fopen and readfile accept URLs
as well as filenames; disable this with
allow_url_fopen=false in
php.ini. You may also limit PHP file operations
to a specific directory with the open_basedir
directive. This can be set within Apache container directives to
limit virtual hosts to their backyards:

<VirtualHost 192.168.102.103>
ServerName a.test.com
DocumentRoot /usr/local/apache/hosts/a.test.com
php_admin_value open_basedir /usr/local/apache/hosts/a.test.com
</VirtualHost>

If safe_mode is enabled in php.ini
or an Apache configuration file, a file must be owned by
the owner of the PHP script to be processed. This is also useful for
virtual hosts.

Table 10-9 lists recommended safe settings for
PHP.

Table 10-9. Safer PHP settings
Option	Default value	Recommended value
register_globals	off	off
safe_mode	off	on
safe_mode_exec_dir	None	/usr/local/apache/`host`/bin
open_basedir	None	/usr/local/apache/`host`/files
display_errors	on	off
log_errors	off	on
allow_url_fopen	on	off
session.save_path	/tmp	/usr/local/apache/`host`/sessions

In Table 10-9, I'm assuming you
might set up a directory for each virtual host under
/usr/local/apache/host. You can specify multiple
directories with a colon (:) separator.

10.4.2.2 Perl

In taint mode, Perl
blocks tainted data from being passed to the functions eval,
require, open (except read-only
mode), chdir, chroot,
chmod, unlink,
mkdir, rmdir,
link, and symlink. You must
untaint filenames before using any of these. As in the PHP example,
watch for relative (../) names and other attempts to access files
outside the intended area.

10.4.3. Executing Programs

Most
scripting languages let you run external programs. This is a golden
opportunity for nasty tricks. Check the pathname of the external
program and remove any metacharacters that would allow multiple
commands. Avoid passing commands through a shell interpreter.
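
As a language-neutral illustration of avoiding the shell, the Python sketch below passes user data as a single argument-vector entry. The child program here is just a second Python interpreter that prints its argument, standing in for whatever external command you would really run; run_without_shell is a hypothetical name.

```python
import subprocess
import sys


def run_without_shell(argument):
    """Run a child program with `argument` passed as one literal argv entry."""
    # Stand-in for a real external command: a child Python that prints argv[1].
    command = [sys.executable, "-c", "import sys; print(sys.argv[1])", argument]
    # Passing a list (with the default shell=False) bypasses the shell,
    # so metacharacters such as ; | and ` have no special meaning.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\n")


if __name__ == "__main__":
    print(run_without_shell("photo.jpg; echo injected"))
```

Because no shell parses the string, the semicolon and everything after it arrive at the child as ordinary characters rather than as a second command.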

10.4.3.1 PHP

Escape any possible attempts to slip
in extra commands with this PHP function:

$safer_input = escapeshellarg($input);
system("some_command $safer_input");

or:

system(escapeshellcmd("some_command $input"));

These PHP functions invoke the shell and are vulnerable to misuse of
shell metacharacters: system,
passthru, exec,
popen, and the backtick
(`command`)
operator. preg_replace (with the
/e option) is similarly dangerous, because it
evaluates its replacement string as PHP code.

If safe_mode is set, only programs within
safe_mode_exec_dir can be executed, and only files
owned by the owner of the PHP script can be accessed.

The PHP function
eval($arg)
executes its argument $arg as PHP code.
There's no equivalent to
safe_mode for this, although the
disable_functions option lets you turn off
selected functions. Don't execute any command with
embedded user data.

10.4.3.2 Perl

Taint mode will not let you pass
unaltered user input to the functions system,
exec, eval, or the backtick
(`command`)
operator. Untaint them before executing, as described earlier.

10.4.4. Uploading Files from Forms

RFC 1867 documents
form-based file
uploads, a way of uploading files through HTML, HTTP,
and a web server. It uses an HTML form, a special form-encoding
method, and an INPUT tag of type FILE:

This is another golden opportunity for those with too much time and
too little conscience to upload huge files and fill up the available
space. A file upload is handled by a CGI file-upload script. There is
no standard script, since so many things can be done with an uploaded
file.

10.4.4.1 PHP

Uploaded files are saved as temporary
files in the directory specified by the PHP directive
upload_tmp_dir. The default value
(/tmp) leaves them visible to anyone, so you may
want to define upload_tmp_dir to some directory in
a virtual host's file hierarchy. To access uploaded
files, use the new autoglobal array $_FILES, each element of which
is itself an array. For the photo-uploading example,
let's say you want to move an uploaded image to the
photos directory of virtual host
host:

<?
// $name is the original file name from the client
$name = $_FILES['photo_file']['name'];
// $type is the MIME type reported by the browser (do not trust it)
$type = $_FILES['photo_file']['type'];
// $size is the size of the uploaded file (in bytes)
$size = $_FILES['photo_file']['size'];
// $tmpn is the name of the temporary uploaded file on the server
$tmpn = $_FILES['photo_file']['tmp_name'];
// If the size and type look okay, move the temporary file
// to its desired place. basename() strips any directory
// components from the client-supplied name.
if (is_uploaded_file($tmpn))
    move_uploaded_file($tmpn,
        "/usr/local/apache/host/photos/" . basename($name));
?>

You may check the file's type, name, and size before
deciding what to do with it. The PHP option
upload_max_filesize caps the size; if a
larger file is uploaded, the value of $tmpn is
none. When the PHP script finishes, any temporary
uploaded files are deleted.

10.4.4.2 Perl

The CGI.pm module
provides a file handle for each temporary file.

#!/usr/bin/perl -wT
use strict;
use CGI qw(:standard);
my $handle = param("photo_file");
my $tmp_file_name = tmpFileName($handle);
my $size = $ENV{CONTENT_LENGTH};
# If the size looks okay, copy or rename the file
# ...

The temporary file goes away when the CGI script completes.

10.4.5. Accessing Databases

Although relational databases have
standardized on SQL as a query language, many of their APIs and
interfaces, whether graphic or text based, have traditionally been
proprietary. When the Web came along, it provided a standard GUI and
API for static text and dynamic applications. The simplicity and
broad applicability of the web model led to the quick spread of the
Web as a database frontend. Although HTML does not offer the richness
and performance of other graphical user interfaces,
it's good enough for many applications.

Databases often contain sensitive information, such as
people's names, addresses, and financial data. How
can a porous medium like the Web be made safer for
database access? Here are some
guidelines for Web-MySQL access (some are also discussed in Chapter 8):

Don't have your database on the same machine as the
web server. It's best if your database is behind a
firewall that only passes queries from your web server. For example,
MySQL normally uses port 3306, so you might only permit access from
ports on the web server to port 3306 on the database server.

Check that all default database passwords have been changed. For
MySQL, ensure that the default user (called
root, but not related to the Unix
root user) has a password. You have a problem if
you can get into the database without a password by typing:

mysql -u root

Use the SQL GRANT and REVOKE statements to make sure access to tables
and other resources is allowed only for the desired MySQL IDs on the
desired servers. An example might follow this pattern:

GRANT SELECT ON sample_table
TO 'sample_user'@'sample_machine'
IDENTIFIED BY 'sample password';

Do not allow access to the MySQL user table by
anyone other than the MySQL root user, since it
contains the permissions and encrypted passwords.

Don't use form-variable values or names in SQL
statements. If the form variable user maps
directly to a user column or table, someone will
deduce the pattern and experiment.

Check user input before using it in SQL statements. This is similar
to checking user input before executing a shell command. Such
exploits have been called SQL injection. See
Chapter 8 for more details.

Any time information is exchanged, someone will be tempted to change
it, block it, or steal it. We'll quickly review
these issues in PHP and Perl database CGI scripts:

Which database APIs to use

Protecting database account names and passwords

Defending against SQL injection

10.4.5.1 PHP

PHP has many
specific and generic database APIs. There is not yet a clear leader
to match Perl's database-independent (DBI) module.

A PHP fragment to access a MySQL database might begin like this:

<?
$link = mysql_connect("db.test.com", "dbuser", "dbpassword");
if (!$link)
echo "Error: could not connect to database\n";
?>

If this fragment is within every script that accesses the database,
every instance will need to be changed if the database server, user,
or password changes. More importantly, a small error in
Apache's configuration could allow anyone to see the
raw PHP file, which includes seeing these connection parameters.
It's easier to write a tiny PHP library function to
make the connection, put it in a file outside the document root, and
include it where needed.

Here's the include file:

<?
// my_connect.inc
// PHP database connection function.
// Put this file outside the document root!
// Makes connection to database.
// Returns link id if successful, false if not.
function my_connect( )
{
    $database = "db.test.com";
    $user = "db_user";
    $password = "db_password";
    $link = mysql_connect($database, $user, $password);
    return $link;
}
?>

And this is a sample client:

<?
// client.php
// PHP client example.
// Include path is specified in include_path in php.ini.
// You can also specify a full pathname.
include_once "my_connect.inc";
$link = my_connect( );
// Do error checking in client or library function
if (!$link)
    echo "Error: could not connect to database\n";
// ...
?>

Now that the account name and password are better protected, you need
to guard against malicious SQL code. This is similar to protecting
against user input passing directly to a system command, for much the
same reasons. Even if the input string is harmless, you still need to
escape special characters.

The PHP addslashes function puts a backslash (\)
before these special SQL characters: single quote
('), double quote ("),
backslash (\), and NUL (ASCII 0). This will be called
automatically by PHP if the option
magic_quotes_gpc is on.
Depending on your database, this may not quote all the characters
correctly.

SQL injection is an attempt to use your database server to get access
to otherwise protected data (read, update, or delete) or to get to
the operating system. For an example of the first case, say you have
a login form with user and password fields. A PHP script would get
these form values (from $_GET,
$_POST, or $_REQUEST, if
it's being good), and then build a SQL string and
make its query like this:

$sql = "SELECT * FROM users WHERE\n" .
"user = '$user' AND\n".
"password = '$password'";
$result = mysql_query($sql);
if ($result && ($row = mysql_fetch_array($result)) && $row[0] == 1)
return true;
else
return false;

An exploiter could enter these into the input fields (see Table 10-10).

Table 10-10. SQL exploit values
Field	Value
user	' OR '' = '
password	' OR '' = '

The SQL string would become:

SELECT * FROM users WHERE
user = '' OR '' = '' AND
password = '' OR '' = ''

The door is now open. To guard against this, use the techniques
I've described for accessing other external
resources, such as files or programs: escape metacharacters and
perform regular-expression searches for valid matches. In this
example, a valid user and password might be a sequence of letters and
numbers. Extract user and password from the original strings and see
if they're legal.

In this example, if the PHP option
magic_quotes_gpc were enabled, this exploit would
not work, because all quote characters would be preceded by a
backslash. But other SQL tricks can be done without quotes.
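
To make the exploit concrete, the sketch below reproduces it against an in-memory SQLite database using Python's standard sqlite3 module. This is an illustration only: the table contents and function names are invented for the example, and the second function uses bound parameters (placeholders), a technique different from the escaping and pattern matching described in the text.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with one made-up account."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (user TEXT, password TEXT)")
    db.execute("INSERT INTO users VALUES ('raoul', 'secret')")
    return db


def login_vulnerable(db, user, password):
    """Build the SQL by string concatenation, as in the naive script."""
    sql = ("SELECT COUNT(*) FROM users WHERE "
           "user = '" + user + "' AND password = '" + password + "'")
    return db.execute(sql).fetchone()[0] > 0


def login_bound(db, user, password):
    """Pass the values as bound parameters; they are never parsed as SQL."""
    sql = "SELECT COUNT(*) FROM users WHERE user = ? AND password = ?"
    return db.execute(sql, (user, password)).fetchone()[0] > 0


if __name__ == "__main__":
    db = make_db()
    exploit = "' OR '' = '"
    print("vulnerable:", login_vulnerable(db, exploit, exploit))
    print("bound:     ", login_bound(db, exploit, exploit))
```

With bound parameters the database receives the quote characters as data, so the exploit string is simply a user name that does not exist.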

A poorly written script may run very slowly or even loop forever,
tying up an Apache instance and a database connection.
PHP's set_time_limit function
limits the number of seconds that a PHP script may execute. It does
not count time outside the script, such as a
database query, command execution, or file I/O. It also does not give
you more time than Apache's
Timeout variable.

10.4.5.2 Perl

Perl has the trusty
database-independent module DBI and its faithful
sidekicks, the database-dependent (DBD) family.
There are DBD modules for many popular databases, both open source
(MySQL, PostgreSQL) and commercial (Oracle, Informix, Sybase, and
others).

A MySQL connection function might resemble this:

# my_connect.pl
use strict;
use DBI;

sub my_connect
{
    my $server = "db.test.com";
    my $db = "db_name";
    my $user = "db_user";
    my $password = "db_password";
    my $dbh = DBI->connect(
        "DBI:mysql:$db:$server",
        $user,
        $password,
        { PrintError => 1, RaiseError => 1 })
        or die "Could not connect to database $db.\n";
    return $dbh;
}
1;

As in the PHP examples, you'd rather not have this
function everywhere. Perl has, characteristically, more than one way
to do it. Here is a simple way:

require "/usr/local/myperllib/my_connect.pl";

Keep the my_connect.pl script outside
Apache's DocumentRoot directory
to prevent its contents from being viewed. If your connection logic
is more complex, it could be written as a Perl package or a module.

Taint mode won't protect you from entering tainted
data into database queries. You'll need to check the
data yourself. Perl's outstanding regular-expression
support lets you specify patterns that input data must match before
going into a SQL statement.

10.4.6. Authentication

Your web
site may have some restricted content, such as premium pages for
registered customers or administrative functions for web site
maintainers. Use authentication to establish the
identity of the visitor. Broken authentication and session
management is number three in the OWASP top 10.

10.4.6.1 Basic authentication

The simplest authentication method in Apache is
basic
authentication. This requires a password file on the web
server and a require directive in a config file:

<Location /auth_demo_dir>
AuthName "My Authorization"
AuthType Basic
# Note: Keep the password files in their own directory
AuthUserFile /usr/local/apache/auth_dir/auth_demo_password
Order deny,allow
Require valid-user
</Location>

I suggest storing password files in their own directories, outside
the document root. You may use subdirectories to segregate files by
user or virtual host. This is more manageable than
.htaccess files all over the site, and it keeps
Apache running faster.

You can specify any matching user, a list of users, or a list of
groups:

require valid-user
require user user1 user2 ...
require group group1 group2 ...

Where are the names and passwords stored? The simplest solution,
specified by AuthUserFile in the example, is a
flat text file on the server. To create the password file with an
initial user named raoul, type the following:

htpasswd -c /usr/local/apache/auth_dir/auth_demo_password raoul

To add raoul to (or change his password in) an existing password
file, omit the -c flag:

htpasswd /usr/local/apache/auth_dir/auth_demo_password raoul
... (prompt for password for raoul) ...

When a visitor attempts to access /auth_demo_dir
on this site, a dialog box pops up and prompts him for his name and
password. These will be sent with the HTTP stream to the web server.
Apache will read the password file
/usr/local/apache/auth_dir/auth_demo_password, get the
encrypted password for the user raoul, and see
if they match.

Don't put the password file anywhere under your
DocumentRoot! Use one or more separate
directories, with read-write permissions for the Apache user and
group, and none for others.

An authentication method connects with a particular storage
implementation (file, DBM, DB, MySQL, LDAP) by matching Apache
modules and configuration directives. For example,
mod_auth_mysql is configured with the table and
column names in a customer table in a MySQL database. After the name
and password are sent to Apache from the browser,
mod_auth_mysql queries the database, and Apache
allows access if the query succeeds and the username and password
were found.

Browsers typically cache this authentication information and send it
to the web server as part of each HTTP request header for the same
realm (a string specified to identify this
resource). What if the user changes her password during her session?
Or what if the server wants to log the client off after some period
of inactivity? In either case, the cached credentials could become
invalid, but the browser still holds them tight. Further attempts by
the user to reach a web page in the realm will fail. Unfortunately,
HTTP has no way for a server to expire credentials in the client. It
may be necessary to clear all browser caches (memory and disk) to
clear the authentication data, forcing the server to request
reauthentication and causing the client to open a new dialog box.

Basic authentication is not encrypted, and credentials are sent to
the server with every request. A sniffer can and will pick up the
name and password. Use SSL (URLs starting with
https://) for privacy. Although the initial SSL
handshake is slow, the following content encryption is not so bad.
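
To see why a sniffer needs no special effort, consider how the credentials travel: the name and password are joined with a colon and base64-encoded, which is an encoding, not encryption. The Python sketch below illustrates this; both function names and the sample credentials are made up for the example.

```python
import base64


def basic_auth_header(user, password):
    """Build the Authorization header value a browser sends for Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def sniff_credentials(header_value):
    """Recover (user, password) from a captured header; no key is needed."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authentication header")
    decoded = base64.b64decode(token).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


if __name__ == "__main__":
    header = basic_auth_header("raoul", "not-a-real-password")
    print(header)
    print(sniff_credentials(header))
```

Anyone who can read the request can reverse the encoding, which is why the transport itself has to be protected.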

Direct authentication with a scripting language gives more
flexibility than the built-in browser dialog box. The script writes
an HTML form to the client, and it processes the reply as though it
came from the standard dialog box.

10.4.6.2 Digest authentication

The second HTTP client authentication method,
digest
authentication, is more secure, because it uses an MD5
hash of data rather than cleartext passwords. RFC 2617 documents
basic and digest authentication. The Apache server (in the module
mod_digest) and Mozilla implement the standard correctly.
Microsoft did not, so digest
authentication in IE 5 and IIS 5 does not currently interoperate with
other web servers and browsers. Another implementation has been
written by a security group at Microsoft, so in the future, this may
be resolved. For now, SSL is the only safe way to communicate
authentication data.

10.4.6.3 Safer authentication

It's surprisingly tricky to create secure client
authentication.
User input can be forged, HTTP referrals are unreliable, and even the
client's apparent IP address can change from one
access to the next if the user is behind a proxy farm. It would be
beneficial to have a method that's usable within and
across sites. For cross-site authentication, the authenticating
server must convey its approval or disapproval in a way that
can't be easily forged and that will work even if
the servers aren't homogeneous and local.

A simple adaptation of these ideas follows. It uses a public variable
with unique values to prevent a replay attack. A
timestamp is useful because it can also be used to expire old logins.
This value is combined with a constant string that is known only by
the cooperating web servers to produce another string. That string is
run through a one-way hash function. The timestamp and hashed string
are sent from the authenticating web server (A) to the target web
server (B).

Let's walk through the process. First, the client
form gets the username and password and submits them to Server A over
a secure SSL connection:

<!-- Client form -->
<form method="post" action="https://a.test.com/auth.php">
User: <input type="text" name="user">
Password: <input type="password" name="password">
<input type="submit">
</form>

On Server A, a PHP script gets the timestamp, combines it with the
secret string, hashes the result, and redirects to Server B:

<?
// a.test.com/auth.php
$time_arg = time( );
$secret_string = "babaloo";
$hash_arg = md5($time_arg . $secret_string);
$url = "http://b.test.com/login.php" .
"?" .
"t=" . urlencode($time_arg) .
"&h=" . urlencode($hash_arg);
header("Location: $url");
?>

On Server B, a script confirms the input from Server A:

<?
// b.test.com/login.php
// Get the CGI variables:
$time_arg = $_GET['t'];
$hash_arg = $_GET['h'];
// Servers A and B both know the secret string,
// the variable(s) it is combined with, and their
// order:
$secret_string = "babaloo";
$hash_calc = md5($time_arg . $secret_string);
if ($hash_calc == $hash_arg)
{
// Check $time_arg against the current time.
// If it's too old, this input may have come from a
// bookmarked URL, or may be a replay attack; reject it.
// If it's recent and the strings match, proceed with the login...
}
else
{
// Otherwise, reject with some error message.
}
?>

This is a better-than-nothing method, simplified beyond recognition
from the following sources, which should be consulted for greater
detail and security:

Example 16-2 in Web Security, Privacy, and
Commerce (O'Reilly).

Dos and Don'ts of Client Authentication on the
Web (http://www.lcs.mit.edu/publications/pubs/pdf/MIT-LCS-TR-818.pdf)
describes how a team at MIT cracked the authentication schemes of a
number of commercial sites, including the Wall Street Journal. Visit
http://cookies.lcs.mit.edu/
for links to the Perl source code of their Kooky Authentication
Scheme.
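
The exchange between Server A and Server B, including the timestamp check that the PHP example leaves as a comment, might be sketched as follows. This is a Python illustration of the same MD5-of-timestamp-plus-secret construction, not a hardened design; the function names and the five-minute expiry window are assumptions for the example.

```python
import hashlib
import hmac
import time

SECRET = "babaloo"       # shared secret from the example above
MAX_AGE_SECONDS = 300    # hypothetical expiry window


def make_token(now=None):
    """Server A: return (timestamp, hash) to pass to Server B."""
    t = str(int(time.time() if now is None else now))
    h = hashlib.md5((t + SECRET).encode("utf-8")).hexdigest()
    return t, h


def check_token(t, h, now=None):
    """Server B: accept only a matching hash with a recent timestamp."""
    expected = hashlib.md5((t + SECRET).encode("utf-8")).hexdigest()
    # Constant-time comparison of the two hex strings.
    if not hmac.compare_digest(expected.encode("utf-8"), h.encode("utf-8")):
        return False
    try:
        age = (time.time() if now is None else now) - int(t)
    except ValueError:
        return False
    # Reject stale tokens (bookmarks, replays) and timestamps from the future.
    return 0 <= age <= MAX_AGE_SECONDS


if __name__ == "__main__":
    t, h = make_token()
    print(t, h, check_token(t, h))
```

Changing either the timestamp or the hash in transit makes the recomputed hash disagree, and an unmodified pair stops working once the window has passed.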

10.4.7. Access Control and Authorization

Once authenticated, what is the
visitor allowed to do? This is the authorization
or access control step. You can control access
by a hostname or address, by the value of an environment variable, or
by a person's ID and password. Broken
access control is the second highest vulnerability in the
OWASP top 10 list.

10.4.7.1 Host-based access control

This
grants or blocks access based on a hostname or IP address. Here is a
sample directive to prevent everyone at evil.com
from viewing your site:

<Location />
order deny,allow
deny from .evil.com
allow from all
</Location>

The period before evil.com is necessary. If I said:

deny from evil.com

I would also be excluding anything that ends with
evil.com, such as devil.com or
www.bollweevil.com.

You may also specify addresses:

Type	Example
Full IP	200.201.202.203
Subnet	200.201.202.
Explicit netmask	200.201.202.203/255.255.255.0
CIDR	200.201.202.203/24
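
The explicit-netmask and CIDR rows describe the same network in two notations. The short Python sketch below checks this with the standard ipaddress module; it only illustrates the address arithmetic and is not how Apache itself performs the match. in_network is a hypothetical helper.

```python
import ipaddress


def in_network(client_ip, spec):
    """True if client_ip falls inside the network written as addr/mask."""
    # strict=False lets the address part carry host bits, as in the table.
    network = ipaddress.ip_network(spec, strict=False)
    return ipaddress.ip_address(client_ip) in network


if __name__ == "__main__":
    for spec in ("200.201.202.203/255.255.255.0", "200.201.202.203/24"):
        print(spec, in_network("200.201.202.7", spec))
```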

10.4.7.2 Environment-variable access control

This is a very flexible solution to
some tricky problems. Apache's configuration file
can set new environment variables based on patterns in the
information it receives in HTTP headers. For example,
here's how to serve images from
/image_dir on http://www.hackenbush.com, but keep people
from linking to the images from their own sites or stealing them:

SetEnvIf Referer "^http://www\.hackenbush\.com/" local
<Location /image_dir>
order deny,allow
deny from all
allow from env=local
</Location>

SetEnvIf defines the environment variable
local if the referring page was from the same
site.

10.4.7.3 User-based access control

If
you allow any
.htaccess
files in your Apache configuration, Apache must check for a possible
.htaccess file in every directory leading to
every file that it serves, on every access. This is slow: look at a
running httpd process sometime with
strace httpd to see the statistics from all these
look-ups. Also, .htaccess files can be anywhere,
modified by anyone, and very easy to overlook. You can get surprising
interactions between your directives and those in these far-flung
files. So let's consider them a hazard. We can still
selectively and carefully allow them.

Try to put your access-control directives directly in your Apache
configuration file (httpd.conf or
access.conf). Disallow overrides for your whole
site with the following:

<Directory />
AllowOverride None
</Directory>

Any exceptions must be made in httpd.conf or
access.conf, including
granting the ability to use .htaccess files
(only httpd.conf for Apache 2). You might do
this if you serve many independent virtual hosts and want to let them
specify their own access control and CGI scripts. But be aware that
you're increasing your server's
surface area.

10.4.7.4 Combined access control

Apache's configuration
mechanism is surprisingly flexible, allowing you to handle some
tricky requirements. For instance, to allow anyone from
good.com as well as a registered user:

<Location />
order deny,allow
deny from all
# Here's the required domain:
allow from .good.com
# Any user in the password file:
require valid-user
# This does an "or" instead of an "and":
satisfy any
</Location>

If you leave out satisfy any, the meaning changes
from or to and, a much more
restrictive setting.

10.4.8. SSL

SSL encrypts data between a web browser and web server.
It's used throughout the Web to protect login
names, passwords, personal information, and, of course, credit card
numbers. The initial SSL handshake is slow in software, and much
faster with a hardware SSL accelerator.

Until recently, people tended to buy a commercial server to offer
SSL. RSA Data Security owned a patent on a public-key encryption
method used by SSL, and they licensed it to companies. After the
patent expired in September 2000, free implementations of Apache+SSL
emerged. Two modules, Apache-SSL and
mod_ssl, have competed for the lead
position. mod_ssl is more popular and easier to
install, and it can be integrated as an Apache DSO.
It's included with Apache 2 as a standard module.
For Apache 1.x, you need to get mod_ssl from
http://www.modssl.org and OpenSSL
from http://www.openssl.org.

Early in the SSL process, Apache requires a server certificate to
authenticate its site's identity to the browser.
Browsers have built-in lists of CAs and their credentials. If your
server certificate was provided by one of these authorities, the
browser will silently accept it and establish an SSL connection. The
process of obtaining a server certificate involves proving your
identity to a CA and paying a license fee. If the server certificate
comes from an unrecognized CA or is self-signed,
the browser will prompt the user to confirm or reject it. Large
commercial sites pay annual fees to the CA to avoid this extra step,
as well as to avoid the appearance of being less trustworthy.
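
A script acting as a web client goes through the same check a browser does. As a small illustration, assuming Python's standard ssl module on the client side, the default context requires a certificate that chains to a trusted CA and matches the hostname:

```python
import ssl

# The default client context loads the system's trusted CA list.
context = ssl.create_default_context()

# Both settings are on by default: the server must present a certificate
# that chains to a known CA, and the name in it must match the host.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

With this context, a connection to a server that presents a self-signed certificate fails with a verification error instead of prompting, unless the certificate has been added to the trusted set.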

10.4.9. Sessions and Cookies

Once a customer has been
authenticated for your site, you want to keep track of him. You
don't want to force a login on every page, so you
need a way to maintain the state over time and multiple page visits.

Since HTTP is stateless, visits need to be threaded together. If a
person adds items to a shopping cart, they should stay there even if
the user takes side trips through the site. Scripting languages
address the problems of remembering information from page to page
through the concept of a session.

A session is a sequence of interactions. It has a session
ID (a unique identifier), data, and a time span. A good
session ID should be difficult to guess or reverse-engineer. A random
ID is best, but an ID may be calculated from some input variables,
such as the user's IP or the time. If the ID is not
random, it should be encrypted. PHP, Perl, and other languages have
code to create and manage web sessions.
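
A random ID of the kind described above can be sketched in a few lines of Python. The function name is hypothetical, and the 32-character length is chosen only to match the session ID size mentioned later for PHP.

```python
import secrets

def new_session_id():
    """Return a 32-character hex ID from the OS's secure random source."""
    return secrets.token_hex(16)  # 16 random bytes -> 32 hex characters

print(new_session_id())
```

Because nothing in the ID derives from the user's IP address or the clock, there is nothing for an attacker to reverse-engineer.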

If the web user allows cookies in her browser, the web script may
write the session ID as a variable in a cookie for your web site. If
cookies are not allowed, you need to propagate the session ID with
every URL. Every GET URL needs an extra variable, and every POST URL
needs some hidden field to house this ID.
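
Propagating the ID through GET URLs amounts to rewriting each link. The sketch below is a rough Python illustration: the helper name is made up, and the PHPSESSID parameter name is borrowed from PHP's default.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def add_session_id(url, session_id, name="PHPSESSID"):
    """Return url with the session ID added as a query variable."""
    parts = urlsplit(url)
    # Drop any stale copy of the variable, keep the other parameters.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != name]
    query.append((name, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))

print(add_session_id("page2.php?item=3", "379d65e3921501cc79df7d02cfbc24c3"))
```

Keep in mind that an ID carried in the URL also ends up in server logs, bookmarks, and the Referer header sent to other sites, which is one reason cookies are preferred when available.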

10.4.9.1 PHP

PHP can be configured to check every
URL on a page and tack on the session ID, if needed. In
php.ini, add the following:

session.use_trans_sid=1

This is a little slower, since PHP needs to examine every URL in the
page's HTML contents.

Without this, you need to track the sessions yourself. If cookies are
enabled in the browser, PHP defines the constant
SID to be an empty string. If cookies are
disabled, SID is defined as
PHPSESSID=id, where
id is the 32-character session ID string.
To handle either case in your script, append SID
to your links:

If cookies are enabled, the HTML created by the previous example
would be as follows:

If cookies are disabled, the session ID becomes part of the URL:

By default, session variables are written to
/tmp/sess_id. Anyone
who can list the contents of /tmp can hijack a
session ID, or possibly forge a new one. To avoid this, change the
session directory to a more secure location (outside of
DocumentRoot, of course).

In php.ini:

session.save_path=/usr/local/apache/sessions

Or, in Apache's httpd.conf:

php_admin_value session.save_path /usr/local/apache/sessions

The directory and files should be owned by the web-server user ID and
hidden from others:

chmod 700 /usr/local/apache/sessions

If there is more than one group of PHP developers, use virtual hosts
and a host-specific session directory (such as
/usr/local/apache/host/sessions) to prevent them
from hijacking each other's sessions.
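
The same permissions can be set and verified from a deployment script. This is a minimal Python sketch with hypothetical helper names; it assumes the script runs as the web-server user, so the directory ends up owned by that user.

```python
import os
import stat

def make_private_dir(path):
    """Create path if needed and restrict it to its owner (mode 700)."""
    os.makedirs(path, mode=0o700, exist_ok=True)
    os.chmod(path, 0o700)  # also tightens a directory that already existed
    return stat.S_IMODE(os.stat(path).st_mode)

def is_private(path):
    """True if neither group nor others have any access to path."""
    return stat.S_IMODE(os.stat(path).st_mode) & 0o077 == 0
```

For example, make_private_dir("/usr/local/apache/host/sessions") would create one host-specific session directory, and is_private can be run periodically to catch a directory whose permissions have drifted.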

You can also tell PHP to store session data in shared memory, a
database, LDAP, or some other storage method.

10.4.9.2 Perl

The Apache::Session module provides session
functions for mod_perl. The session ID can be saved in a cookie or
manually appended to URLs. Session storage may use the filesystem, a
database, or RAM. See the documentation at http://www.perldoc.com/cpan/Apache/Sessionl.

Apache provides its own language-independent session management with
mod_session. This works with or without cookies
(by appending the session ID to the URL in the
QUERY_STRING environment variable) and can exempt
certain URLs, file types, and clients from session control.

10.4.10. Site Management: Uploading Files

As you update your web site, you will be editing and copying files. You may
also allow customers to upload files for some purposes. How can you
do this securely?

Tim Berners-Lee originally envisioned the Web as a two-way medium, where
browsers could easily be authors. Unfortunately, as the Web
commercialized, the emphasis was placed on browsing. Even today, the
return path is somewhat awkward, and the issue of secure site
management is not often discussed.

10.4.10.1 Not-so-good ideas

I mentioned form-based file uploads earlier.
Although you can use this for site maintenance, it handles only one
file at a time and forces you to choose it from a list or type its
name.

Although FTP is readily available and simple
to use, it is not recommended for many reasons. It still seems too
difficult to secure FTP servers: account names and passwords are
passed in the clear.

Network filesystems such as NFS or Samba are appealing
for web-site developers, because they can develop content on their
client machines and then drag and drop files to network folders.
These filesystems are still too difficult to secure across the public
Internet and are not recommended. At one time, Sun was promoting
WebNFS as the
next-generation, Internet-ready filesystem, but there has been little
public discussion about this in the past few years.

The HTTP PUT method is usually not available in web browsers. HTML
authoring tools, such as Netscape Composer and AOLPress, use PUT to
upload or modify files. PUT has security implications similar to
form-based file uploads, and it now looks as if it's
being superseded by DAV.

Microsoft's FrontPage server extensions define web-server
extensions for file uploading and other tasks. The web server and
FrontPage client communicate with a proprietary RPC over HTTP. The
extensions are available for Apache and Linux (http://www.rtr.com/fpsupport/indexl),
but only as binaries.

FrontPage has had serious security problems in the past. The author
of the presentation Apache and FrontPage at
ApacheCon 2001 recommended: "If at all possible,
don't use FrontPage at all." There
seems to be a current mod_frontpage DSO for
Apache (http://www.rtr.com/fpsupport/whatsnew).
Microsoft appears to be moving toward DAV.

10.4.10.2 Better ideas: ssh, scp, sftp, rsync

scp and sftp are good
methods for encrypted file transfer. To copy
many files, rsync or Unison
over ssh provide an incremental, compressed,
encrypted data transfer. This is especially useful when mirroring or
backing up a web site. I do most of my day-to-day Linux work on live
systems with ssh, vi,
scp, and rsync. When
working from a Windows box, I use putty and
WinSCP. A true VPN would be even more
convenient.
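
For repeatable site updates, the rsync-over-ssh invocation can be wrapped in a small script. This is a rough sketch: the account, host, and paths are placeholders, and the helper only builds the command so it can be inspected before anything is transferred.

```python
import shlex

def rsync_command(source_dir, remote, dest_dir, mirror=False, dry_run=False):
    """Build an rsync-over-ssh command line as an argument list."""
    cmd = ["rsync", "-az", "-e", "ssh"]   # -a archive mode, -z compress
    if mirror:
        cmd.append("--delete")            # remove remote files gone locally
    if dry_run:
        cmd.append("--dry-run")           # report changes without making them
    # A trailing slash on the source copies its contents, not the directory.
    cmd += [source_dir.rstrip("/") + "/", f"{remote}:{dest_dir}"]
    return cmd

# Placeholder account, host, and paths.
print(shlex.join(rsync_command("htdocs", "web@www.example.com",
                               "/usr/local/apache/htdocs", dry_run=True)))
```

Passing the resulting list to subprocess.run(cmd, check=True) would perform the transfer. A dry run first is a sensible habit when mirror is enabled, since --delete removes files on the server.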

10.4.10.3 DAV

Distributed Authoring and Versioning (DAV or WebDAV)
is a recent standard for remote web-based file management. DAV lets
you upload, rename, delete, and modify files on a web server.
It's supported in Apache (as the mod_dav module) and by all the major
web authoring tools, including:

Microsoft web folders with IE 5 and Windows 95
and up. These look like local directories under Explorer, but are
actually directories on a web server under DAV management. This is
the simplest drag-and-drop solution I've seen for
authors on Windows machines to publish to Apache on Linux. See
http://www.mydocsonline.com/info_webfoldersl.

Microsoft FrontPage 2003
Macromedia Dreamweaver UltraDev
Adobe GoLive, InDesign, and FrameMaker
Apple Mac OS X iDisk
OpenOffice

To add DAV support to Apache, ensure that
mod_dav is included:

Download the source from http://www.moddav.org.

Build the module:

./configure --with-apxs=/usr/local/apache/bin/apxs

Add these lines to httpd.conf:

LoadModule dav_module libexec/libdav.so
AddModule mod_dav.c

Create a password file (htpasswd prompts for the password):

htpasswd -c -s /usr/local/apache/passwords/dav.htpasswd user

In httpd.conf, enable DAV for the directories
you want to make available. If you allow file upload, you should have
some access control as well:

# The directory part of this must be writeable
# by the user ID running apache:
DAVLockDB /usr/local/apache/davlock/DAVLock
DAVMinTimeout 600
# Use a Location or Directory for each DAV area.
# Here, let's try "/DAV":
<Location /DAV>
# Authentication:
AuthName "DAV"
AuthUserFile /usr/local/apache/passwords/dav.htpasswd
AuthType Basic
# Some extra protection
AllowOverride None
# Allow file listing
Options indexes
# Don't forget this one!:
DAV On
# Let anyone read, but
# require authentication to do anything dangerous:
<LimitExcept GET HEAD OPTIONS>
require valid-user
</LimitExcept>
</Location>

The security implications of DAV are the same as for basic
authentication: the name and password are passed as plain text, and
you need to protect the name/password files.
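
To see why basic authentication counts as plain text, consider what the client actually sends. In the Python sketch below the helper names are made up, but the header format is the standard one: the credentials are merely base64-encoded, and anyone who captures the header can reverse it.

```python
import base64

def basic_auth_header(user, password):
    """Build the Authorization header value a DAV client would send."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def decode_basic_auth(header):
    """Recover the user name and password from a captured header value."""
    _scheme, token = header.split(" ", 1)
    user, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return user, password

header = basic_auth_header("user", "password")
print(header)
print(decode_basic_auth(header))
```

Serving the DAV area over SSL keeps the header, and the uploaded files, out of sight of eavesdroppers.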

DAV is easy to use and quite flexible. A new extension called DELTA-V
will handle versioning, so DAV could eventually provide a web-based
source-control system.

10.4.11. XML, Web Services, and REST

XML started as a text-based markup language to preserve the structure
of data. It grew beyond file formats to RPC protocols such as XML-RPC
and SOAP. These protocols use HTTP because it usually passes through
corporate firewalls, and it would be difficult to establish a new
specialized protocol. With other proposed standards such as
Web Services Description Language (WSDL)
and Universal Description,
Discovery, and Integration (UDDI), a new field called web
services (http://www.w3.org/2002/ws/) is emerging.

There are some security concerns about this. You construct a firewall
based on your knowledge that server A at port B can do C and D. But
with SOAP and similar protocols, HTTP becomes a conduit for remote
procedure calls. Even a stateful firewall cannot interpret the
protocol to see which way the data flows or the implications of the
data. That would require a packet analyzer that knows the syntax and
semantics of the XML stream, which is a difficult and higher-level
function.
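
To make the problem concrete: two SOAP calls can share the same URL and HTTP method and differ only inside the XML body, so a filter has to parse the payload to tell them apart. The Python sketch below uses the SOAP 1.1 envelope namespace. The operation names and the urn:example:bank namespace are invented for illustration, and a production filter would need a hardened XML parser.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

def envelope(operation):
    """Build a minimal SOAP 1.1 request for a hypothetical operation."""
    return (
        f'<soap:Envelope xmlns:soap="{SOAP_ENV}">'
        f'<soap:Body><m:{operation} xmlns:m="urn:example:bank"/></soap:Body>'
        f'</soap:Envelope>'
    ).encode("utf-8")

def soap_operation(body_bytes):
    """Return the name of the first element inside the SOAP Body, or None."""
    root = ET.fromstring(body_bytes)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    if body is None or len(body) == 0:
        return None
    # Tags look like "{namespace}name"; keep only the local name.
    return body[0].tag.rsplit("}", 1)[-1]

# Both requests would be a POST to the same URL; only the body differs.
print(soap_operation(envelope("getBalance")))
print(soap_operation(envelope("deleteAccount")))
```

Even with the operation name in hand, the filter still cannot tell what the call does on the server, which is the semantic gap described above.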

IBM, Microsoft, and others founded the Web Services
Interoperability Group (http://www.ws-i.org) to create web-services
standards outside of the IETF and W3C. Security was not addressed
until the first draft of Web Services
Security (http://www-106.ibm.com/developerworks/webservices/library/ws-secure/)
appeared in April 2002. It describes an extensible XML format for
secure SOAP message exchanges. This addresses the integrity of the
message but still doesn't guarantee that the
message's contents are safe when handled by the
client or server. The Basic Security Profile (http://www.ws-i.org/Pro/image/library/english/10020_BasicSecurityProfile-1.0-2004-05-12l)
was approved in 2004. A separate group, OASIS, recently approved
three Web Services Security specifications (http://www.oasis-open.org/specs/index.php).

It's hard to be certain (the standards are heavy
sledding), but it doesn't look like we have
end-to-end security for web services yet.

An alternative to XML-based web services is
Representational State Transfer (REST),
which uses only traditional web components: HTTP and URIs. A
description is found in Second Generation Web
Services (http://www.xml.com/pub/a/2002/02/20/restl). Its proponents argue that REST can do
anything that SOAP can do, but more simply and securely. All the
techniques described in this chapter, as well as functions such as
caching and bookmarking, could be applied because current web
standards are well established. For instance, an HTTP GET is defined
as a safe method: a well-behaved server does not modify its state in
response to one. A SOAP method may
read or write, but this is due to a separate agreement between the
server and client, and cannot be determined from the syntax of the
SOAP message. See Some Thoughts About SOAP Versus REST on
Security (http://www.prescod.net/rest/securityl).

As these new web services roll out, the Law of Unintended
Consequences will get a good workout. Expect major surprises.

10.4.12. Detecting and Deflecting Attackers

The more attackers know about you, the more vulnerable you are. Some use
port 80 fingerprinting to determine what kind of server
you're running. They can also pass a HEAD request to
your web server to get its version number, modules, etc.
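
It takes very little code to collect that banner. The Python sketch below sends a HEAD request and reads the Server header. So that it runs anywhere, it starts a throwaway local server from Python's standard library rather than contacting a real site, and all names in it are illustrative.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class QuietHandler(BaseHTTPRequestHandler):
    """Stand-in server: answers HEAD requests and logs nothing."""
    def do_HEAD(self):
        self.send_response(200)   # also emits the Server and Date headers
        self.end_headers()
    def log_message(self, *args):
        pass

def server_banner(host, port):
    """Send HEAD / and return whatever the Server header advertises."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", "/")
        return conn.getresponse().getheader("Server")
    finally:
        conn.close()

def demo():
    """Start the stand-in server on a free local port and probe it."""
    server = HTTPServer(("127.0.0.1", 0), QuietHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return server_banner("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()

print(demo())
```

Pointed at your own Apache server, server_banner shows what an attacker sees. Apache's ServerTokens directive controls how much detail that header gives away.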

Script kiddies are not known for their precision, so they will often
fling IIS attacks such as Code Red and Nimda at your Apache server.
Look at your error_log to
see how often these turn up. You can exclude them from your logs with
Apache configuration tricks. A more active approach is to send email
to the administrator of the offending site, using a script like
NimdaNotifyer (see http://www.digitalcon.ca/nimda/).
You may even decide to exclude these visitors from your site. See
Chapter 13 or visit http://www.snort.org to
see how to integrate an IP blocker with the Snort intrusion detector.
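
A quick way to see how often these probes turn up is to scan the log for their telltale request paths. The Python sketch below is a rough illustration: the patterns cover the well-known default.ida (Code Red) and cmd.exe/root.exe (Nimda) requests, and the sample log lines are invented.

```python
import re
from collections import Counter

# Request fragments typical of each worm; not an exhaustive list.
SIGNATURES = {
    "Code Red": re.compile(r"/default\.ida\?", re.IGNORECASE),
    "Nimda": re.compile(r"(?:cmd|root)\.exe", re.IGNORECASE),
}

def count_iis_probes(lines):
    """Count log lines that match each worm's signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts

# Invented sample lines in the general shape of an Apache error_log.
sample = [
    "[error] [client 192.0.2.10] File does not exist: /htdocs/scripts/root.exe",
    "[error] [client 192.0.2.10] File does not exist: /htdocs/c/winnt/system32/cmd.exe",
    "[error] [client 192.0.2.77] File does not exist: /htdocs/default.ida?NNNNNNNN",
    "[error] [client 192.0.2.5] File does not exist: /htdocs/favicon.ico",
]
print(count_iis_probes(sample))
```

To scan a real log, pass an open file object instead of the sample list, for example count_iis_probes(open(path)) with the path to your error_log.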

A tarpit turns your network's unused IP addresses into a
TCP-connection black hole, holding on to attackers who try to connect
to them. Although an effective tool, a tarpit may actually be illegal
in some places. Read the La Brea story at http://www.hackbusters.net/.

10.4.13. Caches, Proxies, and Load Balancers

A proxy is a man in the middle. A caching proxy is a
man in the middle with a memory. All the security issues of email
apply to web pages as they stream about: they can be read, copied,
forged, stolen, etc. The usual answer is to apply end-to-end
cryptography.

If you use sessions that are linked to a specific server (stored in
temporary files or shared memory rather than a database), you must
somehow get every request with the same session ID directed to the
same server. Some load balancers offer session
affinity to do this. Without it, you'll
need to store the sessions in some shared medium, such as an
NFS-mounted filesystem or a database.
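
One simple way a balancer can provide affinity is to derive the backend from the session ID itself. The following Python sketch shows that idea. It is not how any particular product works, and the function and host names are hypothetical.

```python
import hashlib

def pick_backend(session_id, backends):
    """Map a session ID to one backend, the same one on every request."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

# Hypothetical backend pool.
servers = ["web1.example.com", "web2.example.com", "web3.example.com"]
sid = "379d65e3921501cc79df7d02cfbc24c3"
print(pick_backend(sid, servers) == pick_backend(sid, servers))
```

The weakness of this scheme is that adding or removing a server changes the mapping for most sessions, which is one more argument for keeping session data in a shared store.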