19.2 Parsing and Manipulating with JAXP and DOM
Example 19-1 used the
SAX API for parsing XML documents. We now turn to another commonly
used parsing API: the DOM, or Document Object Model. The DOM API is a
standard defined by the World Wide Web Consortium (W3C); its Java
implementation consists of the org.w3c.dom package
and its subpackages. The current version of the DOM standard is Level
2. As of this writing, the DOM Level 3 API is making its way through
the standardization process at the W3C.

The Document Object Model defines the API
of a parse tree for XML documents. The
org.w3c.dom.Node interface specifies the basic
features of a node in this parse tree. Subinterfaces, such as
Document, Element,
Entity, and Comment, define the
features of specific types of nodes. A program that uses the DOM
parsing model is quite different from one that uses SAX. With the
DOM, you have the parser read your entire XML document and transform
it into a tree of Node objects. Once parsing is
complete, you can traverse the tree to find the information you need.
The DOM parsing model is useful if you need to make multiple passes
through the tree, if you want to modify the structure of the tree, or
if you need random access to an XML document, instead of the
sequential access provided by the SAX model.

Example 19-2 is a
listing of the program WebAppConfig.java. Like
Example 19-1, WebAppConfig reads a
web.xml web application deployment descriptor.
This example uses a DOM parser to build a parse tree, then performs
some operations on the tree to demonstrate how you can work with a
tree of DOM nodes.

The WebAppConfig( )
constructor uses the JAXP API to obtain a DOM parser and then uses
that parser to build a parse tree that represents the XML file. The
root node of this tree is of type Document. This
Document object is stored in an instance field of
the WebAppConfig object, so it is available for
traversal and modification by the other methods of the class. The
class also includes a main( ) method that invokes
these other methods.

The getServletClass( ) method looks for <servlet-name>
tags and returns the text of the associated
<servlet-class> tags. (These tags always
come in pairs in a web.xml file.) This method
demonstrates a number of features of the DOM parse tree, notably the
getElementsByTagName( ) method. The
addServlet( ) method inserts a new
<servlet> tag into the parse tree; it
demonstrates how to construct new DOM nodes and add them to an
existing parse tree. Finally, the output( ) method
converts the (possibly modified) tree back into an XML document. It
does this using the javax.xml.transform package
and subpackages to "transform" the
DOM tree into a stream. (Another approach is to visit each node of
the tree in order and output its corresponding XML text.)
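
Traversing a parsed tree, as described above, usually comes down to a short recursive walk over Node objects. The following is a minimal sketch of that idea and is not part of Example 19-2: the class name DomWalk, the helpers parse() and outline(), and the sample XML are all made up for illustration. It prints only element names, indented by depth, and uses nothing beyond the standard JAXP and org.w3c.dom calls.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomWalk {
    /** Parse an XML string into a DOM Document (hypothetical helper). */
    public static Document parse(String xml) {
        try {
            DocumentBuilder parser =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return parser.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Checked JAXP, SAX, and I/O exceptions are wrapped to keep
            // the sketch short; real code would handle them separately.
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    /** Append one line per Element node, indented by its depth. */
    public static void outline(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) out.append("  ");
            out.append(node.getNodeName()).append('\n');
            depth++; // children of an element are one level deeper
        }
        // Visit the children in document order; this works for any Node
        for (Node child = node.getFirstChild(); child != null;
             child = child.getNextSibling()) {
            outline(child, depth, out);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document shaped like a tiny web.xml file
        Document doc = parse(
            "<web-app><servlet><servlet-name>hello</servlet-name>"
            + "<servlet-class>HelloServlet</servlet-class></servlet></web-app>");
        StringBuilder out = new StringBuilder();
        outline(doc, 0, out);
        System.out.print(out);
    }
}
```

Because the whole document is already in memory, the walk can start at any node and can be repeated as often as needed, which is the random-access property that distinguishes the DOM model from SAX.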
Example 19-2. WebAppConfig.java
package je3.xml;
import java.io.*;              // For reading the input file
import org.w3c.dom.*;          // W3C DOM classes for traversing the document
import org.xml.sax.*;          // SAX classes used for error handling by JAXP
import javax.xml.parsers.*;    // JAXP classes for parsing
import javax.xml.transform.*;  // For transforming a DOM tree to an XML file.
/**
 * A WebAppConfig object is a wrapper around a DOM tree for a web.xml
 * file. The methods of the class use the DOM API to work with the
 * tree in various ways.
 **/
public class WebAppConfig {
    /** The main method creates and demonstrates a WebAppConfig object */
    public static void main(String[ ] args)
        throws IOException, SAXException, ParserConfigurationException,
               TransformerConfigurationException, TransformerException
    {
        // Create a new WebAppConfig object that represents the web.xml
        // file specified by the first command-line argument
        WebAppConfig config = new WebAppConfig(new File(args[0]));
        // Query the tree for the class name associated with the specified
        // servlet name
        System.out.println("Class for servlet " + args[1] + " is " +
                           config.getServletClass(args[1]));
        // Add a new servlet name-to-class mapping to the DOM tree
        config.addServlet("foo", "bar");
        // And write out an XML version of the DOM tree to standard out
        config.output(new PrintWriter(System.out));
    }
    org.w3c.dom.Document document; // This field holds the parsed DOM tree

    /**
     * This constructor method is passed an XML file. It uses the JAXP API to
     * obtain a DOM parser, and to parse the file into a DOM Document object,
     * which is used by the remaining methods of the class.
     **/
    public WebAppConfig(File configfile)
        throws IOException, SAXException, ParserConfigurationException
    {
        // Get a JAXP parser factory object
        javax.xml.parsers.DocumentBuilderFactory dbf =
            DocumentBuilderFactory.newInstance( );
        // Tell the factory what kind of parser we want
        dbf.setValidating(false);
        // Use the factory to get a JAXP parser object
        javax.xml.parsers.DocumentBuilder parser = dbf.newDocumentBuilder( );
        // Tell the parser how to handle errors. Note that in the JAXP API,
        // DOM parsers rely on the SAX API for error handling
        parser.setErrorHandler(new org.xml.sax.ErrorHandler( ) {
                public void warning(SAXParseException e) {
                    System.err.println("WARNING: " + e.getMessage( ));
                }
                public void error(SAXParseException e) {
                    System.err.println("ERROR: " + e.getMessage( ));
                }
                public void fatalError(SAXParseException e)
                    throws SAXException {
                    System.err.println("FATAL: " + e.getMessage( ));
                    throw e; // re-throw the error
                }
            });
        // Finally, use the JAXP parser to parse the file. This call returns
        // a Document object. Now that we have this object, the rest of this
        // class uses the DOM API to work with it; JAXP is no longer required.
        document = parser.parse(configfile);
    }

    /**
     * This method looks for specific Element nodes in the DOM tree in order
     * to figure out the classname associated with the specified servlet name
     **/
    public String getServletClass(String servletName) {
        // Find all <servlet> elements and loop through them.
        NodeList servletnodes = document.getElementsByTagName("servlet");
        int numservlets = servletnodes.getLength( );
        for(int i = 0; i < numservlets; i++) {
            Element servletTag = (Element)servletnodes.item(i);
            // Get the first <servlet-name> tag within the <servlet> tag
            Element nameTag = (Element)
                servletTag.getElementsByTagName("servlet-name").item(0);
            if (nameTag == null) continue;
            // The <servlet-name> tag should have a single child of type
            // Text. Get that child, and extract its text. Use trim( )
            // to strip whitespace from the beginning and end of it.
            String name = ((Text)nameTag.getFirstChild( )).getData( ).trim( );
            // If this <servlet-name> tag has the right name
            if (servletName.equals(name)) {
                // Get the matching <servlet-class> tag
                Element classTag = (Element)
                    servletTag.getElementsByTagName("servlet-class").item(0);
                if (classTag != null) {
                    // Extract the tag's text as above, and return it
                    Text classTagContent = (Text)classTag.getFirstChild( );
                    return classTagContent.getNodeValue( ).trim( );
                }
            }
        }
        // If we get here, no matching servlet name was found
        return null;
    }

    /**
     * This method adds a new name-to-class mapping in the form of
     * a <servlet> sub-tree to the document.
     **/
    public void addServlet(String servletName, String className) {
        // Create the <servlet> tag
        Element newNode = document.createElement("servlet");
        // Create the <servlet-name> and <servlet-class> tags
        Element nameNode = document.createElement("servlet-name");
        Element classNode = document.createElement("servlet-class");
        // Add the name and classname text to those tags
        nameNode.appendChild(document.createTextNode(servletName));
        classNode.appendChild(document.createTextNode(className));
        // And add those tags to the servlet tag
        newNode.appendChild(nameNode);
        newNode.appendChild(classNode);
        // Now that we've created the new sub-tree, figure out where to put
        // it. This code looks for another servlet tag and inserts the new
        // one right before it. Note that this code will fail if the document
        // does not already contain at least one <servlet> tag.
        NodeList servletnodes = document.getElementsByTagName("servlet");
        Element firstServlet = (Element)servletnodes.item(0);
        // Insert the new node before the first servlet node
        firstServlet.getParentNode( ).insertBefore(newNode, firstServlet);
    }

    /**
     * Output the DOM tree to the specified stream as an XML document.
     * This uses an identity Transformer from the javax.xml.transform
     * package to copy the DOM tree to the stream.
     **/
    public void output(PrintWriter out)
        throws TransformerConfigurationException, TransformerException
    {
        TransformerFactory factory = TransformerFactory.newInstance( );
        Transformer transformer = factory.newTransformer( );
        transformer.transform(new javax.xml.transform.dom.DOMSource(document),
                              new javax.xml.transform.stream.StreamResult(out));
        // Flush in case any output is still buffered in the writer
        out.flush( );
    }
}
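
As the comments in addServlet( ) point out, that method fails when the document does not yet contain a <servlet> tag, because item(0) returns null. One possible way to guard against this is sketched below. It is an illustration under stated assumptions, not the book's code: the class SafeInsert and the methods parse() and insertServlet() are hypothetical names, and the fallback of appending to the root element ignores any element ordering that a web.xml DTD or schema may require.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class SafeInsert {
    /** Parse an XML string into a DOM Document (hypothetical helper). */
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Checked exceptions are wrapped to keep the sketch short
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    /** Build a <servlet> sub-tree and insert it, even if none exists yet. */
    public static Element insertServlet(Document doc, String name, String cls) {
        Element servlet = doc.createElement("servlet");
        Element nameNode = doc.createElement("servlet-name");
        Element classNode = doc.createElement("servlet-class");
        nameNode.appendChild(doc.createTextNode(name));
        classNode.appendChild(doc.createTextNode(cls));
        servlet.appendChild(nameNode);
        servlet.appendChild(classNode);

        // The new sub-tree is not attached yet, so this finds only
        // <servlet> elements that were already in the document.
        Node first = doc.getElementsByTagName("servlet").item(0);
        if (first != null) {
            // Same placement as addServlet( ): before the first <servlet>
            first.getParentNode().insertBefore(servlet, first);
        } else {
            // Assumed fallback: append to the root element instead
            doc.getDocumentElement().appendChild(servlet);
        }
        return servlet;
    }

    public static void main(String[] args) {
        Document doc = parse("<web-app/>"); // no <servlet> tags at all
        insertServlet(doc, "foo", "bar");
        System.out.println(doc.getElementsByTagName("servlet").getLength());
    }
}
```

The null check is the only change in behavior relative to addServlet( ); when a <servlet> tag already exists, the new sub-tree still lands immediately before the first one.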