HTML Versus XHTML
It's not Latin, but HTML has reached middle age in
standard Version 4.01. The W3C has no plans to develop another
version and has officially said so. Rather, HTML is being subsumed
and modularized as an application of the Extensible Markup Language
(XML). Its new name is XHTML, Extensible Hypertext Markup Language.
The emergence of XHTML is just another chapter in the often
tumultuous history of HTML and the Web, where confusion for authors
is the norm, not the exception. At the worst point, the elders of the
World Wide Web Consortium (W3C), who are responsible for accepted and
acceptable uses of the language -- i.e., standards -- lost control of
the language in the browser "wars" between Netscape and Microsoft. The
abortive HTML+ standard never got off the ground, and
HTML 3.0 became so bogged down in debate that the W3C simply shelved
the entire draft standard. HTML 3.0 never happened, despite what some
opportunistic marketers claimed in their literature. Instead, by late
1996, the browser manufacturers convinced the W3C to release HTML
standard Version 3.2, which for all intents and purposes simply
standardized most of Netscape's HTML extensions.
Netscape's dominance, both as the leading browser and as a leader in
web technologies, faded by the end of the millennium. By then,
Microsoft had effectively bundled Internet Explorer into the Windows
operating system, not only as an installed application, but also as a
dominant feature of the GUI desktop. Internet Explorer also introduced
several features (albeit nonstandard at the time) that appealed
principally to the growing Internet business and marketing community.
Fortunately for those of us who appreciate and strongly support
standards, the W3C reclaimed its primacy with HTML 4.0, which
stands today as HTML Version 4.01, released in December 1999.
Absorbing many of the Netscape and Internet Explorer innovations, the
standard is clearer and cleaner than any of its predecessors,
establishes solid implementation models for consistency across
browsers and platforms, provides strong support and incentives for
the companion Cascading Style Sheets (CSS) standard for HTML-based
displays, and makes provisions for alternative (nonvisual) user
agents, as well as for more universal language support.
Cleaner and clearer aside, the W3C realized that HTML could never
keep up with the demands of the web community for more ways to
distribute, process, and display documents. HTML offers only a
limited set of document-creation primitives and is hopelessly
incapable of handling nontraditional content like chemical formulae,
musical notation, or mathematical expressions. Nor can it adequately
support alternative display media, such as handheld computers or
intelligent cellular phones.
To address these demands, the W3C developed the XML standard. XML
provides a way to create new, standards-based markup languages that
don't take an act of the W3C to implement.
XML-compliant languages deliver information that can be parsed,
processed, displayed, sliced, and diced by the many different
communication technologies that have emerged since the Web sparked
the digital communication revolution a decade ago. XHTML is HTML
reformulated to adhere to the XML standard. It is the foundation
language for the future of the Web.
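To make the reformulation concrete, here is a minimal sketch of the same fragment written first in a loose HTML 4.01 style and then as XHTML 1.0. The file name logo.gif and the field name gift are hypothetical, chosen only for illustration, and the fragment shows only the most visible syntax differences, not a complete document.

```html
<!-- HTML 4.01 style: uppercase tag names, an omitted </P> end tag,
     empty elements without a closing slash, an unquoted attribute
     value, and a minimized attribute (CHECKED) -->
<P>Order form
<BR>
<IMG SRC="logo.gif" ALT="Company logo">
<INPUT TYPE=checkbox NAME="gift" CHECKED>

<!-- The same markup as XHTML 1.0: lowercase names, every element
     explicitly closed, every attribute value quoted and written
     out in full -->
<p>Order form
<br />
<img src="logo.gif" alt="Company logo" />
<input type="checkbox" name="gift" checked="checked" /></p>
```

The XHTML form is stricter because it must be well-formed XML, but nothing about the vocabulary changes: the elements and attributes are the ones you already know from HTML 4.01.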
Why not just drop HTML for XHTML? For many reasons. First and
foremost, XHTML has not exactly taken the Web by storm.
There's just too much current investment in
HTML-based documentation and expertise for that to happen anytime
soon. Besides, XHTML is HTML 4.01 reformulated as an application of
XML. Know HTML 4 and you're all ready for the
future.[2]
[2] We plumb the depths of XML and XHTML in
Chapter 15 and Chapter 16.
Deprecated Features
One of the unpopular things standards-bearers have to do is choose
between the popular and the proper. The authors of the HTML and XHTML
standards exercise that responsibility by "deprecating" those features
of the language that interfere with the grand scheme of things.
For instance, the <center> tag tells the
browser to display the enclosed text centered in the display window.
But the CSS standard provides ways to center text, too. The W3C
chooses to support the CSS way and discourages the use of
<center> by deprecating the tag. The plan is to drop <center> and
other deprecated elements and attributes from some later version of
the standard.
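As a brief illustration of the two approaches, the fragment below centers the same line of text first with the deprecated tag and then with a style sheet property. The sample text is made up, and an inline style attribute is only one of several ways to attach the CSS rule; treat this as a sketch, not the only recommended form.

```html
<!-- Deprecated: centering expressed with the <center> tag -->
<center>Welcome to our site</center>

<!-- The CSS way: the same effect through the text-align property,
     here applied with an inline style attribute -->
<div style="text-align: center">Welcome to our site</div>
```

Both versions center the text in current browsers. The difference is where the presentation lives: in the markup itself, or in a style that can later be moved to a shared style sheet.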
Throughout the book, we specially note and continuously remind you
when an HTML tag or other component is deprecated in the current
standards. Should you stop using them now? Yes and no.
Yes, because there is a preferred and perhaps better way to
accomplish the same thing. By exercising that alternative, you ensure
that your documents will survive for many years to come on the Web.
And, yes, because the tools you may use to prepare HTML/XHTML
documents probably adhere to the preferred standard. You may not have
a choice, unless you disable your tools. In any event, unless you
hand-compose all your documents, you'll need to know
how the preferred way works so that you can identify the code and
modify it.
However compelling the reasons for avoiding them, deprecated elements
and attributes are still part of the standards. They remain
well supported by most browsers and aren't expected
to disappear any time soon. In fact, since there is no plan to change
the HTML standard, the "deprecated"
stamp is very misleading.
So, no, you don't have to worry about deprecated
HTML features. There is no reason to panic, certainly. We encourage
you to keep using them, since the deprecated features typically are
simpler and far more human-readable than their alternatives.
A Definitive Guide
The paradox in all this is that even the HTML 4.01 standard is not the definitive
resource. There are many more features of HTML in popular use and
supported by the popular browsers than are included in the latest
language standard. And there are many parts of the standards that are
ignored. We promise you, things can get downright confusing.
We've managed to sort things out for you, though, so you don't have to
sweat over what works and what doesn't in which browser. This book,
therefore, is the definitive guide to HTML and XHTML. We give details
for all the elements of the HTML 4.01 and XHTML 1.0 standards, plus
the variety of interesting and useful extensions to the
language -- some proposed standards -- that the popular browser
manufacturers have chosen to include in their products, such as:
Cascading Style Sheets
Java and JavaScript
Layers
Multiple columns
And while we tell you about each and every feature of the language,
standard or not, we also tell you which browsers or different
versions of the same browser implement a particular extension and
which don't. That's critical
knowledge when you want to create web pages that take advantage of
the latest version of Netscape versus pages that are accessible to
the larger number of people using Internet Explorer or even Lynx, a
once-popular text-only browser for Unix systems.
In addition, there are a few things that are closely related but not
directly part of HTML. For example, we touch on, but do not cover in
depth, JavaScript, CGI, and Java programming. They all work closely with
HTML documents and run with or alongside browsers, but they are not
part of the language itself, so we don't delve into
them. Besides, they are comprehensive topics that deserve their own
books, such as JavaScript: The Definitive Guide, by David Flanagan;
CGI Programming with Perl, by Scott Guelich, Shishir Gundavaram, and
Gunther Birznieks; Cascading Style Sheets: The Definitive Guide, by
Eric Meyer; and Learning Java, by Pat Niemeyer and Jonathan Knudsen
(all published by O'Reilly).
This is your definitive guide to HTML and XHTML as they are and
should be used, including every extension we could find. Some
extensions aren't documented anywhere, even in the
plethora of online guides. But if we've missed anything, please let us
know and we'll put it in the next edition.