Here's a rundown of the tools you'll need to work with the Word XML shown throughout this chapter.
To run these hacks, you'll need an XSLT processor that runs from the command line, such as a DOS command prompt.
You can read about and download Microsoft's own command-line XSLT processor, msxsl.exe, from Microsoft's web site.
The libxml project (hosted at http://www.xmlsoft.org) houses some quite useful command-line utilities for XML processing. Native Windows binaries for each of the libxml tools are available at http://www.zlatkovic.com/libxml.enl. One particularly convenient tool in the libxml suite is the xmllint command. Its --format option, which takes an XML document as input and outputs a pretty-printed version of it (adding line breaks and indentation), is an excellent aid for learning WordprocessingML and for authoring stylesheets that create Word documents.
Figure 10-1 shows how a WordprocessingML document looks when opened in Notepad after just saving it from Word. The entire document is jammed onto four extremely long lines of text, making it a tad difficult to inspect.
Figure 10-2 shows a portion of the same document, after using the command xmllint --format. The indenting and line breaks make for a much more readable XML file.
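As a minimal sketch of that step, the commands below build a tiny one-line stand-in document and run it through xmllint --format. The filenames sample.xml and formatted.xml and the document content are hypothetical, chosen only for illustration; a real file saved from Word is far larger. The script is written for a POSIX shell, so at a DOS prompt only the xmllint line itself applies.

```shell
# Create a small one-line XML file to stand in for a document saved from Word.
# (sample.xml and its contents are hypothetical, for illustration only.)
printf '%s\n' '<?xml version="1.0"?><w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"><w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body></w:wordDocument>' > sample.xml

# Pretty-print the file if xmllint is installed; otherwise report and move on.
if command -v xmllint >/dev/null 2>&1; then
  # --format re-indents the document and adds line breaks.
  xmllint --format sample.xml > formatted.xml
  cat formatted.xml
else
  echo "xmllint not found; install the libxml tools first" >&2
fi
```

Writing the result to a second file, rather than over the original, keeps the copy that Word saved untouched while you inspect the readable version.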
The libxml project also contains its own XSLT processor, with a command-line tool called xsltproc. Other freely available XSLT processors you may want to try out include Saxon (http://saxon.sourceforge.net) and Xalan (http://xml.apache.org/xalan-j/), both of which are Java-based processors.
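To give a feel for how a command-line XSLT processor is invoked, here is a rough sketch using xsltproc. The stylesheet identity.xsl, the input file input.xml, and the output file output.xml are all hypothetical names; the stylesheet is a plain identity transform that copies its input unchanged, not one of the stylesheets from this chapter. The script assumes a POSIX shell.

```shell
# A hypothetical identity stylesheet: it copies every node and attribute as-is.
cat > identity.xsl <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
EOF

# A tiny hypothetical source document to transform.
printf '%s\n' '<?xml version="1.0"?><doc><para>Hello</para></doc>' > input.xml

# Run the transformation if xsltproc is installed.
if command -v xsltproc >/dev/null 2>&1; then
  # xsltproc takes the stylesheet first, then the source document;
  # -o names the file that receives the result.
  xsltproc -o output.xml identity.xsl input.xml
  cat output.xml
else
  echo "xsltproc not found; install the libxml tools first" >&2
fi
```

Argument order differs between processors (msxsl.exe, for instance, takes the source document before the stylesheet), so check each tool's usage message before adapting the command.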