Appendix A. HTML Grammar
For the most part, the exact syntax of an HTML or XHTML document is not rigidly
enforced by a browser. This gives authors wide latitude in creating
documents and gives rise to documents that work on most browsers but
actually are incompatible with the HTML and XHTML standards. Stick to
the standards, unless your documents are fly-by-night affairs.
The standards explicitly define the ordering
and nesting of tags and document elements. This syntax is embedded
within the appropriate Document Type Definition and is not readily
understood by those not versed in SGML (for the HTML 4.01 DTD, see
Appendix D) or XML (for the XHTML 1.0 DTD, see
Appendix E). Accordingly, we provide an alternative
definition of the allowable HTML and XHTML syntax, using a fairly
common tool called a "grammar."
A grammar, whether it defines English sentences or HTML documents, is
just a set of rules that indicates the order of language elements.
These language elements can be divided into two sets:
terminals (the actual words of the language) and
nonterminals (named rules that stand for sequences of other elements). In
HTML and XHTML, the terminals correspond to the embedded markup tags and
text in a document.
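As a rough illustration of the terminal/nonterminal split, the sketch below stores a toy grammar as a Python dictionary and sorts its symbols into the two sets. The rule names (html_document, head, title, body, plain_text) and the rules themselves are assumptions made up for this example; they are far simpler than the actual HTML and XHTML rules.

```python
# Toy grammar, invented for illustration only. Each key is a nonterminal;
# each value lists the symbols, in order, that the rule stands for.
TOY_GRAMMAR = {
    "html_document": ["<html>", "head", "body", "</html>"],
    "head": ["<head>", "title", "</head>"],
    "title": ["<title>", "plain_text", "</title>"],
    "body": ["<body>", "plain_text", "</body>"],
}


def classify_symbols(grammar):
    """Split every symbol in the grammar into nonterminals and terminals."""
    # A nonterminal is any symbol that has a rule of its own.
    nonterminals = set(grammar)
    # Collect every symbol that appears on the right-hand side of a rule.
    used = {symbol for parts in grammar.values() for symbol in parts}
    # Whatever has no rule must be an actual "word": a tag or text.
    terminals = used - nonterminals
    return nonterminals, terminals


nonterminals, terminals = classify_symbols(TOY_GRAMMAR)
print(sorted(nonterminals))
print(sorted(terminals))
```

In this sketch, tags such as `<head>` and the placeholder plain_text end up in the terminal set because no rule defines them further, while head and body are nonterminals because each has a rule of its own.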
To use the grammar to create a valid document, follow the order of
the rules to see where the tags and text may be placed.
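Following the rules in order can be sketched in code. The example below assumes a hypothetical toy rule set (not the real HTML grammar) and expands the starting rule, left to right, until only tags and text remain; the resulting list shows where each tag may be placed.

```python
# Hypothetical toy rules, invented for this sketch. A symbol with an entry
# here is a nonterminal; anything else is a terminal (a tag or text).
TOY_RULES = {
    "html_document": ["<html>", "head", "body", "</html>"],
    "head": ["<head>", "title", "</head>"],
    "title": ["<title>", "plain_text", "</title>"],
    "body": ["<body>", "plain_text", "</body>"],
}


def expand(symbol, rules):
    """Replace a symbol by its rule, recursively, keeping the rule order."""
    if symbol not in rules:
        # Terminal: it appears in the document as-is.
        return [symbol]
    result = []
    for part in rules[symbol]:
        result.extend(expand(part, rules))
    return result


tag_order = expand("html_document", TOY_RULES)
print(" ".join(tag_order))
```

Each occurrence of plain_text in the output marks a spot where document text may appear. Real grammars also allow alternatives and repetition within a rule, which this sketch leaves out to keep the expansion easy to follow.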