Producing Something Worth Serving
Although this chapter is concerned mainly with running and
maintaining a Web server, it's important that you understand something of how
the Web pages served by a Web server come into being. Some of the preceding sections
have already described some types of Web pages (particularly dynamic content).
The most common type of static content is HTML, which is a text-based format
with formatting extensions. There are several different tools available to help
you create HTML, as well as the file formats upon which HTML pages frequently
rely, such as graphics file formats. Understanding how to use these tools, and
how Web browsers interpret HTML, will help you create Web sites that can be
handled by most Web browsers available today.
HTML and Other Web File Formats
Although there are many tools for creating Web files, as
discussed in the next section, "Tools for Producing Web Pages," it helps
to understand something about the various file formats that are common on
static Web pages. File formats that are common on the Web include various text
file formats, graphics files, and assorted data files.

Most Web pages are built around an HTML text file. This file
is a plain text file that you can edit in an ordinary text editor. Listing 20.2 shows a simple HTML file as an
example. Most text in an HTML file is displayed in the Web browser's window,
but text enclosed in angle brackets ( <> )
is formatting information. Many of these codes come in pairs, with the second
bearing the same name as the first but using a slash ( / ) to indicate it's the end of the
formatted area. The opening code sometimes includes parameters that fine-tune
its behavior, such as setting the size and filename of a graphic or specifying
the color of text and background. Some of these codes reference other documents
on the Web (both on the main document's server and on other Web servers).
Listing 20.2 A sample HTML file
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>Sample Web Page</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<CENTER><H1 ALIGN="CENTER">Sample Web Page</H1></CENTER>
<IMG SRC="..." ALT="Logo" WIDTH="197" HEIGHT="279">
<P>This is a sample Web page, including
<A HREF="...html">a link.</A></P>
</BODY></HTML>
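The pairing rule described above lends itself to a mechanical check. What follows is a minimal sketch using Python's standard html.parser module. The sample page is modeled on Listing 20.2, but its image and link targets (logo.png, other.html) and the helper names (TagLister, list_tags) are assumptions made for illustration. Note that this parser reports code and parameter names in lowercase.

```python
from html.parser import HTMLParser

# Hypothetical page modeled on Listing 20.2; the image and link targets
# are made-up placeholders, not values from the original listing.
SAMPLE = """<HTML><HEAD>
<TITLE>Sample Web Page</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<CENTER><H1 ALIGN="CENTER">Sample Web Page</H1></CENTER>
<IMG SRC="logo.png" ALT="Logo" WIDTH="197" HEIGHT="279">
<P>This is a sample Web page, including
<A HREF="other.html">a link.</A></P>
</BODY></HTML>"""

# Codes that never take a closing partner (IMG is the only one used here).
VOID_TAGS = {"img", "br", "hr"}


class TagLister(HTMLParser):
    """Record every opening code and track which paired codes stay open."""

    def __init__(self):
        super().__init__()
        self.opened = []    # (tag, parameters) tuples in document order
        self.unclosed = []  # stack of paired codes still awaiting </tag>

    def handle_starttag(self, tag, attrs):
        self.opened.append((tag, dict(attrs)))
        if tag not in VOID_TAGS:
            self.unclosed.append(tag)

    def handle_endtag(self, tag):
        # Only pop when the closing code matches the most recent opener.
        if self.unclosed and self.unclosed[-1] == tag:
            self.unclosed.pop()


def list_tags(html_text):
    """Return (opening codes with parameters, codes left unclosed)."""
    parser = TagLister()
    parser.feed(html_text)
    parser.close()
    return parser.opened, parser.unclosed


if __name__ == "__main__":
    opened, unclosed = list_tags(SAMPLE)
    for tag, params in opened:
        print(tag.upper(), params)
    print("unclosed:", unclosed)
```

Calling list_tags(SAMPLE) returns each opening code with its parameters, plus a list of paired codes that were never closed, which is empty for this page.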
Some of the formatting codes in Listing 20.2 should be self-explanatory, but
others are more obscure. A handful of the more important codes include the
following:

<HTML>: This code identifies the page as being HTML. Most browsers don't actually
require this code, but it's best to use it in order to create proper HTML.

<HEAD>: HTML pages contain both a header and a body. The header provides information that's not
usually displayed in the Web page proper, such as the <TITLE> code in Listing 20.2. The body includes the bulk of
the text that appears in the Web browser's window. <HEAD> identifies the header.

<TITLE>: Most Web browsers display a Web page's title in the window's title bar, and
enter it as the default title if a user adds the Web page to a bookmark list.

<BODY>: This code denotes the main body of the Web page. It frequently includes
parameters to set text color, background color, and similar attributes.

<H1>: Headings break up a Web page into sections. They're usually displayed in a larger
font than is regular text. You can create several levels of headings, starting
with 1 (<H1>). <H2> is a heading below <H1>, and so on. Listing 20.2's <H1> code
includes an ALIGN option to tell the Web browser to center the text.
Unfortunately, not all Web browsers respond to the same alignment codes, so some
redundancy in this matter is required for consistent display.

<CENTER>: The <H1> heading in Listing 20.2 is centered by both an option
within the <H1> code and by a <CENTER> code that surrounds the <H1> code.
Some browsers respond to one or the other of these codes, but not both. The <CENTER> code is usually not needed
for modern browsers, but some older ones do need it.

<IMG>: You can include a graphic on a Web page by using the <IMG> code, as shown in
Listing 20.2. These codes usually include several parameters, including SRC
(which points to the source of the image: its filename on the server, or a
complete URL if it's stored on another server), ALT (text that's associated with the image for those who
have automatic image display disabled or for display when the user moves a
mouse over the image), and WIDTH and HEIGHT (the width and height
of the image, which allow the Web browser to display the text of the document
before the image has loaded, or to dynamically scale an image if it's some
other size).

<P>: This code denotes a paragraph. A Web browser will automatically rewrap text
within a paragraph to fit the window size, font size, and other characteristics
of the browser's window.

<A HREF>: This code denotes a link. The text or image enclosed by this code usually
appears on the Web browser underlined or in a different color, and users can
click the link to view the page specified by the URL within the code.

It's possible to use nothing but these codes to create a Web
page, but HTML supports many more options, including the ability to format
tables, specify fonts, display bulleted or numbered lists, break the document
into multiple independent frames, and so on.
It's possible to over-use advanced HTML features, though. The upcoming section,
"Web Page Design Tips," includes some
information on this matter.

In addition to HTML, Web servers can deliver other document
types to browsers. Indeed, HTML documents often refer to these documents
directly, as in the <IMG>
code in Listing 20.2. You can link to plain text
pages, graphics, downloadable program files, scripts, or any other type of
file. One important caveat is that your Web server should have an appropriate
MIME type set for your documents, usually in the mime.types file described in the
earlier section, "Understanding Apache Configuration Files."
If Apache can't determine the MIME type of the file, it usually sends it as
plain text, which can cause problems because the target OS may alter certain
characters in the file, thus corrupting it.

Because many Web pages incorporate extensive graphics, it's
important to understand something of the graphics file formats that are common
on the Web. The three most common formats are as follows:

GIF: The Graphics Interchange Format has been popular since the 1980s. It uses a lossless
compression scheme, which means that it compresses data, but in a way that
allows the exact input data to be displayed. GIF supports images with color
depths of up to 8 bits (256 colors). One unique drawback to this format is that
it uses a compression scheme that is covered by a patent (which is due to
expire in 2003). Some people object to using a graphics file format that's so
encumbered.

PNG: The Portable Network Graphics file format is another one that uses a lossless
compression scheme. It
also supports greater color depth (up to 64-bit, but 24-bit is a more common
depth), and hence many more colors than GIF. Its compression scheme isn't
covered by patents. On the down side, some older Web browsers don't support PNG
graphics. There's a Web site devoted to PNG at http://www.libpng.org/pub/png/.

JPEG: The Joint Photographic Experts Group format uses a lossy compression
scheme, meaning that it can attain greater compression than a lossless format,
but the compressed image may not exactly match the original when displayed.
JPEG supports true-color (up to 24-bit) images.

As a general rule, a lossless format is best for line art and
cartoon-like images that use just a handful of colors. These images tend to
acquire ugly-looking artifacts when converted to JPEG format. Digitized photos,
by contrast, usually look best in a true-color format (PNG or JPEG), and JPEG's
lossy compression scheme doesn't impact such images as much. Therefore, JPEGs
are common for digitized photos displayed on the Web.

When you use JPEG, your graphics package will give you an
option for a compression level. You can save your graphics file with little
compression, which produces a large but good-looking image, or use a great deal
of compression, which produces a much smaller file that degrades more in
quality. The exact scale used to describe the level of compression varies from
one package to another, but a 1-100 scale is not uncommon, with 100
representing the best quality. Most images you're likely to put on the Web look
acceptable at a fairly low setting on such a scale (say, around 50), and compressing
these images can help reduce the load on your Web server and cause the images
to appear more quickly in your users' Web browsers. You may want to experiment
with different types of graphics files to learn what compression level works
best for you.
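Each of the file formats covered in this section needs a MIME type mapping on the server, as noted in the discussion of mime.types. The following minimal sketch uses Python's standard mimetypes module, whose built-in extension table plays a role loosely comparable to Apache's mime.types file. The filenames, the content_type_for helper, and the text/plain fallback are assumptions for illustration; the fallback merely mirrors the plain-text behavior described earlier, and a real server's default depends on its configuration.

```python
import mimetypes

# Use only Python's built-in extension table, so results don't depend on
# whatever mime.types files happen to be installed on this machine.
TABLE = mimetypes.MimeTypes()

# Assumed fallback, mirroring the "send it as plain text" behavior
# described in the text; a real server's default is configurable.
FALLBACK = "text/plain"


def content_type_for(filename):
    """Return the MIME type a server might send for the given filename."""
    mime_type, _encoding = TABLE.guess_type(filename)
    return mime_type or FALLBACK


if __name__ == "__main__":
    # Hypothetical filenames covering the formats discussed above.
    for name in ("index.html", "logo.gif", "logo.png", "photo.jpg", "data.xyzzy"):
        print(name, "->", content_type_for(name))
```

The last filename has an extension the table doesn't know, so it falls through to the plain-text default. This is the situation in which a binary file could be corrupted in transit.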
Tools for Producing Web Pages
Although you can create Web pages by hand by editing the raw
HTML in a text editor and using separate tools like The GIMP (http://www.gimp.org)
to create or edit graphics, many Web page designers prefer to use GUI HTML
design tools. These tools let you type in and edit text much as you can in a
what-you-see-is-what-you-get (WYSIWYG) word processor, using buttons or special
keystrokes to indicate centering, bold text, new paragraphs, and so on. This
approach is certainly convenient, and Apache doesn't really care how you
generate your files, so from a server operation point of view, there's no
reason to avoid such tools. One exception is that Microsoft's FrontPage can
create Web pages that depend on special server extensions, so it's best to
avoid it when using Apache.

NOTE

Creating Web pages with a
design tool isn't normally a problem for Apache, but some creation tools
include interfaces to automatically upload a Web page to a Web server. These
upload features might not work with Apache, at least not directly. You may
need to save your Web pages in local files, then transfer them by floppy,
FTP, or some other means to the Web server computer.
Examples of Web page creation tools include the following:

Word processors: Many modern
GUI word processors include a feature to export documents as HTML, or special
HTML formatting modes. Such HTML exports may lose some formatting features if
the files were generated as normal word processor documents. This can be a
convenient way to generate HTML documents if you're already familiar with a
word processor that supports such a feature. Linux word processors with HTML
export capabilities include Applix Words, StarOffice, and WordPerfect.

Web browsers: Many Web
browsers, including Netscape for Linux, come with document-creation modules. As
a general rule, these are more finely tuned to the needs of Web page design
than are word processors, but if you're already familiar with a word processor,
the browser tools represent another program to master.

Standalone Web page creation tools: These tools are designed from
the ground up to do nothing but create Web pages. Examples in Linux include
ASHE (http://www.cs.rpi.edu/pub/puninj/ASHE/), August (http://www.lls.se/~johanb/august/),
Bluefish (http://bluefish.openoffice.nl), and WebSphere
(http://www-4.ibm.com/software/webservers/hpbuilder/). Some of these are very
basic tools, whereas others are extremely complex.

If you use a Web page development tool, you
should be aware of the limitations of these tools. Because of the nature of the
Web, no two browsers are likely to display the same page in precisely the same
way, but working with these tools makes it easy to overlook this fact. If the
tool creates HTML that's optimized for particular browsers, your Web site's
visitors may find your site difficult to read because of the assumptions your
HTML editor made.
Web Page Design Tips
Some Web designers like to use HTML features
to their fullest, thus creating a layout that can be almost as complex as
anything that could be created on a printed page. There are drawbacks to using
the more advanced HTML features, though. Specifically, it's impossible to
predict precisely how a given browser will handle a code. Indeed, even the
codes in Listing 20.2 aren't entirely consistent in their application; as noted in the preceding
descriptions, different browsers respond differently to the various codes used
to center text, for instance. Font specifications work only if the font is
installed on the client's computer; if it's not, the usual result is a
fallback to an ugly default, such as Courier. Color specifications may interact
poorly with a user's own color choices. (One particularly annoying error is
specifying a background color without specifying a text color. If you specify a
white background color but no text color, a user who has defaults set to white
text on black background will be unable to read your page. Listing 20.2 specifies
background and foreground colors, but it doesn't specify link colors,
which can also be important in this equation.)

Because Web browsers vary wildly, it's best
to test your Web pages on multiple browsers. At the very least, you should test
on both Netscape Navigator and Microsoft Internet Explorer. If possible, you
should test on multiple versions of these browsers. Other browsers that are
popular, particularly in the Linux community, include Mozilla (http://www.mozilla.org, an
open source cousin to Netscape Navigator), Opera (http://www.opera.com), Konqueror
(a part of the KDE project), and Lynx (http://lynx.browser.org, a
text-based Web browser). Lynx is particularly important if you want your site
to be accessible to all users. Because it's text-based, it will turn up
problems you might not notice in a GUI browser, but that might be important to
somebody who uses Lynx, or to a visually impaired person who uses a speech
synthesizer with a computer. Also, keep in mind that many (perhaps most) of
your Web server users won't be using Linux. On Windows, Internet Explorer is
the most popular browser, but others (including many of the preceding browsers)
are available. MacOS, BeOS, OS/2, and many other platforms all sport their own
browsers, some of which are shared with other platforms and some of which are
not.

TIP

You can examine your server log files, as
described shortly, to determine what types of browsers are most often used
with your Web site.
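As a rough sketch of the kind of tally this tip describes, the following Python fragment counts browser families from log entries. It assumes the server writes Apache's "combined" log format, in which the browser's identification string is the last quoted field on each line. The sample log lines are invented for illustration, and the substring rules in FAMILIES are a deliberate simplification rather than a complete browser-detection scheme.

```python
import re
from collections import Counter

# Hypothetical entries in Apache's "combined" log format; the browser's
# identification (User-Agent) string is the final quoted field.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2002:13:55:36 -0700] "GET / HTTP/1.0" 200 2326 '
    '"-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '192.0.2.2 - - [10/Oct/2002:13:56:01 -0700] "GET / HTTP/1.0" 200 2326 '
    '"-" "Mozilla/4.79 [en] (X11; U; Linux 2.4.18 i686)"',
    '192.0.2.3 - - [10/Oct/2002:13:57:12 -0700] "GET / HTTP/1.0" 200 2326 '
    '"-" "Lynx/2.8.4rel.1 libwww-FM/2.14"',
    '192.0.2.4 - - [10/Oct/2002:13:58:40 -0700] "GET / HTTP/1.0" 200 2326 '
    '"-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '192.0.2.5 - - [10/Oct/2002:13:59:03 -0700] "GET / HTTP/1.0" 200 2326 '
    '"-" "Mozilla/5.0 (compatible; Konqueror/3.0; Linux)"',
]

# Matches the last double-quoted field on a line.
LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')

# Checked in order: several browsers also announce themselves as
# "Mozilla", so the more specific markers must come first.
FAMILIES = [
    ("Opera", "Opera"),
    ("MSIE", "Internet Explorer"),
    ("Konqueror", "Konqueror"),
    ("Lynx", "Lynx"),
    ("Mozilla", "Netscape/Mozilla"),
]


def user_agent(line):
    """Extract the User-Agent string, or '' if the line has none."""
    match = LAST_QUOTED.search(line)
    return match.group(1) if match else ""


def browser_family(agent):
    """Map a User-Agent string to a coarse browser family name."""
    for marker, family in FAMILIES:
        if marker in agent:
            return family
    return "Other"


def count_browsers(lines):
    """Tally browser families across an iterable of log lines."""
    return Counter(browser_family(user_agent(line)) for line in lines)


if __name__ == "__main__":
    for family, hits in count_browsers(SAMPLE_LOG).most_common():
        print(f"{family}: {hits}")
```

To run this against a real log, pass an open file object to count_browsers in place of SAMPLE_LOG. The result suggests which browsers deserve the most testing attention.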