Perl Cd Bookshelf [Electronic resources] (text version)

Introducing the DOM APIs

The power of the DOM lies in its ability to provide access to an in-memory tree representation of the entire XML document. Using the DOM, applications can search for specific data in an XML document, add or delete elements and attributes, and transform the DOM into an entirely different document. Along with the org.w3c.dom interfaces provided by the W3C, the Oracle Java XML parser comes with a set of classes that implement the DOM APIs and extend them with other useful features, such as printing a document fragment or retrieving namespace information.
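The tasks named above (searching, deleting, and adding) can be sketched with the standard org.w3c.dom interfaces alone. This is a minimal illustration, not the Oracle parser's own API: it assumes the JAXP DocumentBuilder that ships with the JDK in place of Oracle's DOMParser, and the class name DomEditSketch, its helper methods, and the sample data are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch class; uses JAXP (an assumption), not oracle.xml.parser.v2.
class DomEditSketch {
    // Parses an XML string into an in-memory DOM tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Searches for <book> elements, deletes those lacking a title attribute,
    // appends one new <book>, and returns the resulting number of books.
    static int editBooks(Document doc) {
        Element root = doc.getDocumentElement();
        NodeList books = root.getElementsByTagName("book");
        // The NodeList is live, so walk it backward while removing nodes.
        for (int i = books.getLength() - 1; i >= 0; i--) {
            Element book = (Element) books.item(i);
            if (!book.hasAttribute("title")) {
                book.getParentNode().removeChild(book);
            }
        }
        Element added = doc.createElement("book");
        added.setAttribute("title", "Sample Title");
        root.appendChild(added);
        return root.getElementsByTagName("book").getLength();
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<booklist><book title='A'/><book/><book title='B'/></booklist>");
        System.out.println("Books after editing: " + editBooks(doc));
    }
}
```

Walking the list backward matters because the NodeList returned by getElementsByTagName reflects removals immediately; a forward loop would skip the element that slides into the removed slot.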

The following code demonstrates some of the DOM functionality in an XML parser:

// This example demonstrates a simple use of the DOMParser.
// An XML file is parsed and some information is printed out.
import java.io.*;
import java.net.*;
import oracle.xml.parser.v2.DOMParser;
import org.w3c.dom.*;
import org.w3c.dom.Node;
// Extensions to DOM interfaces for namespace support.
import oracle.xml.parser.v2.XMLElement;
import oracle.xml.parser.v2.XMLAttr;

public class DOMExample {
    public static void main(String[] argv) {
        try {
            // Generate a new input stream from the given file
            FileInputStream xmldoc = new FileInputStream(argv[0]);
            // Parse the document using DOMParser
            DOMParser parser = new DOMParser();
            parser.parse(xmldoc);
            // Obtain the document.
            Document doc = parser.getDocument();
            // Print some information regarding attributes of elements
            // in the document
            printElementAttributes(doc);
        }
        catch (Exception e) {
            System.out.println(e.toString());
        }
    }

    static void printElementAttributes(Document doc) {
        NodeList nl = doc.getElementsByTagName("*");
        Element e;
        XMLAttr nsAttr;
        String attrname, attrval, attrqname;
        NamedNodeMap nnm;
        for (int j = 0; j < nl.getLength(); j++) {
            e = (Element) nl.item(j);
            System.out.println(e.getTagName() + ":");
            nnm = e.getAttributes();
            if (nnm != null) {
                for (int i = 0; i < nnm.getLength(); i++) {
                    nsAttr = (XMLAttr) nnm.item(i);
                    // Use the methods getQualifiedName(), getLocalName(),
                    // getNamespace(), and getExpandedName() in NSName
                    // interface to get Namespace information.
                    attrname = nsAttr.getExpandedName();
                    attrqname = nsAttr.getQualifiedName();
                    attrval = nsAttr.getNodeValue();
                    System.out.println(" " + attrqname + "(" + attrname +
                        ")" + " = " + attrval);
                }
            }
            System.out.println();
        }
    }
}

The DOM APIs, unlike the SAX APIs, can be used only after the XML document is completely parsed. The downside is that large XML documents can occupy a lot of memory, which can degrade the performance of your application. In terms of pure functionality, however, the DOM APIs are more powerful. Before you can use any of the DOM APIs, you must parse your document using a new instance of DOMParser:

 // Parse the document using DOMParser
DOMParser parser = new DOMParser();
parser.parse(xmldoc);

Then, ask the parser to return a handle to the root of the Document Object Model that it has constructed in memory:

 // Obtain the document.
Document doc = parser.getDocument();

Using the preceding handle, you can access every part of the XML document you just parsed. The DOMExample class assumes you want to access the elements in the document and their attributes. To do this, you first need to obtain a list of all the elements in the document. A DOM method called getElementsByTagName retrieves, recursively, all elements beneath a given node that match a given tag name. It also supports the special tag name “*”, which matches any tag. You therefore invoke this method at the top level of the document, via the handle to the root you obtained earlier in this section:

 NodeList nl = doc.getElementsByTagName("*");

The preceding call generates a list of all the elements in the document. Each of these elements contains the information regarding its attributes. To access this information, you need to traverse this list:

int len = nl.getLength();
for (int j = 0; j < len; j++) {
    e = (Element) nl.item(j);
    ...
}

To obtain the attributes of each element in the loop, you can use a DOM method called getAttributes. This method generates a special kind of DOM list called NamedNodeMap. Once you obtain this list, traversing it to obtain information about the attributes themselves is straightforward.
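The traversal just described can be sketched with only the standard DOM interfaces. The version below is illustrative rather than a copy of DOMExample: it assumes the JDK's JAXP parser instead of Oracle's DOMParser, casts to the standard Attr interface instead of XMLAttr, and uses a hypothetical class name (AttributeListSketch) and output format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch class; uses JAXP (an assumption), not oracle.xml.parser.v2.
class AttributeListSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Collects one "tag@name=value" line per attribute of every element.
    static List<String> describeAttributes(Document doc) {
        List<String> lines = new ArrayList<>();
        NodeList nl = doc.getElementsByTagName("*");
        for (int j = 0; j < nl.getLength(); j++) {
            Element e = (Element) nl.item(j);
            NamedNodeMap nnm = e.getAttributes();
            // A NamedNodeMap is indexed like a list, but the DOM does not
            // guarantee any particular attribute order.
            for (int i = 0; i < nnm.getLength(); i++) {
                Attr a = (Attr) nnm.item(i);
                lines.add(e.getTagName() + "@" + a.getName() + "=" + a.getValue());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<booklist><book title='A' author='B'/></booklist>");
        for (String line : describeAttributes(doc)) {
            System.out.println(line);
        }
    }
}
```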

DOM Level 2

As DOM evolved into Level 2, it became a modular specification, meaning that some of the new APIs can be stand-alone modules. Though the specifications are “Level 2,” they are actually 1.0 versions, which can be confusing, especially when the same DOM Core names are reused. In addition to DOM Level 2 Core, there are Events, Style, HTML, Traversal and Range, and Views modules. References to these specifications can be found in the appendix of this book.

The introduction of XML namespaces was the primary force behind the development of the DOM Core Level 2 specification, because all the element and attribute functions now had to accept or retrieve namespaces. The following snippet uses the Oracle XML Parser’s DOM 2.0 XML Namespace support to retrieve additional information regarding the attributes of each element:

for (int i = 0; i < nnm.getLength(); i++) {
    nsAttr = (XMLAttr) nnm.item(i);
    // Use the methods getQualifiedName() and getExpandedName()
    // in NSName interface to get Namespace information.
    attrname = nsAttr.getExpandedName();
    attrqname = nsAttr.getQualifiedName();
    attrval = nsAttr.getNodeValue();
    System.out.println(" " + attrqname + "(" + attrname +
        ")" + " = " + attrval);
}

This kind of code is useful if the XML document you have to parse has elements with many attributes that belong to different namespaces. For example, suppose the booklist XML document from the preceding section looked like this:

<booklist xmlns:osborne="http://www.osborne.com"
          xmlns:bookguild="http://www.bookguild.com"
          xmlns:dollars="http://www.currency.org/dollars">
  <book osborne:isbn="0-07-213495-X" title="Oracle9i XML Handbook"
        author="Chang, Scardina, and Kiritzov" bookguild:publisher="Osborne"
        dollars:price="49.99"/>
  <book osborne:isbn="1230-23498-2349879" title="Emperor's New Mind"
        author="Roger Penrose" bookguild:publisher="Oxford Publishing Company"
        dollars:price="15.99"/>
</booklist>

The generated output with namespaces would look like this:

xmlns:osborne(http://www.w3.org/2000/xmlns/:osborne) = http://www.osborne.com
xmlns:bookguild(http://www.w3.org/2000/xmlns/:bookguild) = http://www.bookguild.com
xmlns:dollars(http://www.w3.org/2000/xmlns/:dollars) = http://www.currency.org/dollars
book:
osborne:isbn(http://www.osborne.com:isbn) = 0-07-213495-X
title(title) = Oracle9i XML Handbook
author(author) = Chang, Scardina, and Kiritzov
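For comparison, the same kind of namespace information is available through the standard DOM Level 2 Core methods getNamespaceURI() and getLocalName(), without the Oracle-specific getExpandedName(). The sketch below assumes the JDK's JAXP parser with namespace awareness switched on; the class name NamespaceAttrSketch and the output format are hypothetical and do not reproduce the Oracle expanded-name format shown above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical sketch class; uses JAXP (an assumption), not oracle.xml.parser.v2.
class NamespaceAttrSketch {
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Without this flag, JAXP parsers do not report namespace data.
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Formats an attribute as: qualifiedName {namespaceURI}localName = value
    static String describe(Attr attr) {
        return attr.getName() + " {" + attr.getNamespaceURI() + "}"
            + attr.getLocalName() + " = " + attr.getValue();
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<booklist xmlns:osborne='http://www.osborne.com'>"
            + "<book osborne:isbn='0-07-213495-X' title='Oracle9i XML Handbook'/>"
            + "</booklist>");
        Element book = (Element) doc.getElementsByTagName("book").item(0);
        NamedNodeMap attrs = book.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            System.out.println(describe((Attr) attrs.item(i)));
        }
    }
}
```

An unprefixed attribute such as title belongs to no namespace, so getNamespaceURI() returns null for it.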

The DOM Level 2 Traversal and Range functionality includes methods that create Iterators and TreeWalkers to traverse a node and its children in document order. Objects using a TreeWalker to navigate a document tree or subtree use the view of the document defined by their whatToShow flags and filters. An example of such stub code would be the following:

// This filter accepts everything
NodeFilter n1 = new nf1();
// Node iterator doesn't allow expansion of entity references
NodeIterator ni =
    doc.createNodeIterator(elems[0], NodeFilter.SHOW_ALL, n1, false);
// Move forward
XMLNode nn = (XMLNode) ni.nextNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.nextNode();
}
// Move backward
nn = (XMLNode) ni.previousNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.previousNode();
}
// Node iterator allows expansion of entity references
ni = doc.createNodeIterator(elems[0], NodeFilter.SHOW_ALL, n1, true);
// Move forward
nn = (XMLNode) ni.nextNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.nextNode();
}
// Move backward
nn = (XMLNode) ni.previousNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.previousNode();
}

// This filter doesn't accept expansion of entity references
NodeFilter n2 = new nf2();
// Node iterator allows expansion of entity references
ni = doc.createNodeIterator(elems[0], NodeFilter.SHOW_ALL, n2, true);
// Move forward
nn = (XMLNode) ni.nextNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.nextNode();
}
// Move backward
nn = (XMLNode) ni.previousNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.previousNode();
}
// After detaching, all node iterator methods throw an exception
ni.detach();
try {
    nn = (XMLNode) ni.nextNode();
}
catch (DOMException e) {
    System.out.println(e.getMessage());
}
try {
    nn = (XMLNode) ni.previousNode();
}
catch (DOMException e) {
    System.out.println(e.getMessage());
}
// TreeWalker allows expansion of entity references
TreeWalker tw =
    doc.createTreeWalker(elems[0], NodeFilter.SHOW_ALL, n1, true);
nn = (XMLNode) tw.getRoot();
// Traverse in document order
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) tw.nextNode();
}
tw = doc.createTreeWalker(elems[0], NodeFilter.SHOW_ALL, n1, true);
nn = (XMLNode) tw.getRoot();
// Traverse the depth left
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) tw.firstChild();
}
tw = doc.createTreeWalker(elems[0], NodeFilter.SHOW_ALL, n2, true);
nn = (XMLNode) tw.getRoot();
// Traverse in document order
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) tw.nextNode();
}
tw = doc.createTreeWalker(elems[0], NodeFilter.SHOW_ALL, n2, true);
nn = (XMLNode) tw.getRoot();
// Traverse the depth right
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) tw.lastChild();
}
...
class nf1 implements NodeFilter {
    public short acceptNode(Node node) {
        return FILTER_ACCEPT;
    }
}
class nf2 implements NodeFilter {
    public short acceptNode(Node node) {
        short type = node.getNodeType();
        if ((type == Node.ELEMENT_NODE) || (type == Node.ATTRIBUTE_NODE))
            return FILTER_ACCEPT;
        if (type == Node.ENTITY_REFERENCE_NODE)
            return FILTER_REJECT;
        return FILTER_SKIP;
    }
}
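Because the stub above depends on Oracle classes and an undefined elems array, a smaller self-contained variant may help show how filters affect the two traversal objects. It is a sketch under stated assumptions: it uses the JDK's bundled DOM implementation, whose Document objects can be cast to the standard org.w3c.dom.traversal.DocumentTraversal interface, and the class name TraversalSketch, its helper methods, and the sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.w3c.dom.traversal.TreeWalker;
import org.xml.sax.InputSource;

// Hypothetical sketch class; relies on the JDK's DOM implementing
// DocumentTraversal (an assumption about the parser in use).
class TraversalSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Filter that rejects every node with the given name, accepts the rest.
    static NodeFilter rejecting(String name) {
        return node -> node.getNodeName().equals(name)
            ? NodeFilter.FILTER_REJECT : NodeFilter.FILTER_ACCEPT;
    }

    // Element names in document order as seen by a NodeIterator.
    static List<String> iterate(Document doc, NodeFilter filter) {
        NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
            doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, filter, true);
        List<String> names = new ArrayList<>();
        for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
            names.add(n.getNodeName());
        }
        it.detach(); // release the iterator once finished
        return names;
    }

    // Element names in document order as seen by a TreeWalker,
    // starting from the walker's current node (the root).
    static List<String> walk(Document doc, NodeFilter filter) {
        TreeWalker tw = ((DocumentTraversal) doc).createTreeWalker(
            doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, filter, true);
        List<String> names = new ArrayList<>();
        for (Node n = tw.getCurrentNode(); n != null; n = tw.nextNode()) {
            names.add(n.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b><c/></b><d/></a>");
        System.out.println(iterate(doc, null));
        System.out.println(iterate(doc, rejecting("b")));
        System.out.println(walk(doc, rejecting("b")));
    }
}
```

The design point worth noticing is that FILTER_REJECT means different things to the two objects: a NodeIterator sees a flat list, so a rejected node is merely skipped and its children are still visited, whereas a TreeWalker drops the rejected node together with its whole subtree.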

DOM Level 3

As with Level 2, the DOM Level 3 W3C Working Draft is modular. It consists of the Core, Load and Save, Validation, Events, and XPath modules, which provide further functionality that DOM users identified as useful and necessary for their applications. References to these can be found in the Appendix.

A DOM application can use the hasFeature() method of the DOMImplementation object to determine whether a given module is supported. A DOMImplementation object can be retrieved from a Document using the getImplementation() method. The hasFeature() method takes a feature string that names the module, together with a version string; examples of feature strings are XML, HTML, Events, and Validation.
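A feature check of this kind might look like the following sketch. It assumes the JDK's bundled JAXP implementation rather than the Oracle parser, so the results it prints describe that implementation only; the class name FeatureCheckSketch and the feature/version pairs queried are illustrative choices.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;

// Hypothetical sketch class; queries whichever DOM implementation JAXP provides.
class FeatureCheckSketch {
    // Creates an empty document so that its implementation can be queried.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true if the document's DOM implementation reports support
    // for the given feature string and version.
    static boolean supports(Document doc, String feature, String version) {
        DOMImplementation impl = doc.getImplementation();
        return impl.hasFeature(feature, version);
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        String[][] queries = {
            {"XML", "1.0"}, {"Events", "2.0"}, {"Traversal", "2.0"}, {"HTML", "2.0"}
        };
        for (String[] q : queries) {
            System.out.println(q[0] + " " + q[1] + ": " + supports(doc, q[0], q[1]));
        }
    }
}
```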

The basis of the DOM, as previously stated, is a tree consisting of Node objects. Different kinds of Nodes are used to represent an XML document: Document, Element, Attr, Text, DocumentFragment, DocumentType, ProcessingInstruction, Comment, CDATASection, EntityReference, and Notation. The DOM also defines some other types that represent a list of nodes—NodeList and NamedNodeMap—and introduces a DOMString type, which is a string of UTF-16 encoded characters. Finally, DOM introduces an exception type, DOMException, which is raised by the various DOM interfaces if an erroneous operation is performed or if some other error occurred during execution.
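As a small illustration of DOMException, the sketch below triggers one deliberately by asking a Document to create an element whose name contains a space, which is not a legal XML name. It assumes the JDK's bundled DOM implementation with its default error checking; the class name DomExceptionSketch and the convention of returning 0 on success are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

// Hypothetical sketch class; uses JAXP (an assumption), not oracle.xml.parser.v2.
class DomExceptionSketch {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Tries to create an element; returns 0 on success, or the
    // DOMException error code if the DOM refuses the operation.
    static short tryCreateElement(Document doc, String name) {
        try {
            doc.createElement(name);
            return 0;
        } catch (DOMException e) {
            // The public 'code' field identifies the kind of error.
            return e.code;
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        System.out.println("book -> " + tryCreateElement(doc, "book"));
        short code = tryCreateElement(doc, "bad name");
        System.out.println("bad name -> " + code + " (INVALID_CHARACTER_ERR = "
            + DOMException.INVALID_CHARACTER_ERR + ")");
    }
}
```

DOMException is unchecked in the Java binding, so callers are not forced to catch it; inspecting the code field is how an application distinguishes one failure from another.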

Table 2-1 and Table 2-2 list the DOM types and the corresponding types supported by the Oracle XML parsers for Java, PL/SQL, C and C++.

Table 2-1: DOM Types with Corresponding Java and PL/SQL Oracle Types
DOM Type	Java	PL/SQL
Node	XMLNode	DOMNode
Document	XMLDocument	DOMDocument
Element	XMLElement	DOMElement
Attr	XMLAttr	DOMAttr
Text	XMLText	DOMText
DocumentFragment	XMLDocumentFragment	DOMDocumentFragment
ProcessingInstruction	XMLPI	DOMPI
DocumentType	DTD	XMLDTD
EntityReference	XMLEntityReference	DOMEntityReference
Comment	XMLComment	DOMComment
CDATASection	XMLCDATA	DOMCDataSection
NodeList	XMLNodeList	DOMNodeList
NamedNodeMap	N/A (private class)	DOMNamedNodeMap
Notation	XMLNotation	DOMNotation
DOMString	java.lang.String	VARCHAR2
DOMException	XMLDOMException	EXCEPTION

Table 2-2: DOM Types with Corresponding C and C++ Oracle Types
DOM Type	C	C++
Node	xmlnode	NodeRef
Document	xmldocnode	DocumentRef
Element	xmlelemnode	ElementRef
Attr	xmlattrnode	AttrRef
Text	xmltextnode	TextRef
DocumentFragment	xmlfragnode	DocumentFragmentRef
ProcessingInstruction	xmlpinode	ProcessingInstructionRef
DocumentType	xmldtdnode	DocumentTypeRef
EntityReference	xmlentrefnode	EntityReferenceRef
Comment	xmlcommentnode	CommentRef
CDATASection	xmlcdatanode	CDATASectionRef
NodeList	xmlnodelist	NodeListRef
NamedNodeMap	xmlnamedmap	NamedNodeMapRef
Notation	xmlnotenode	NotationRef
DOMString	oratext *	DOMString
DOMException	N/A	N/A

Oracle DOM APIs in C

Because the DOM is an object-oriented specification and the C language is not object oriented, some changes had to be made. In particular, the C function namespace is flat, so the names of DOM methods that are the same in several different classes have been changed to make them unique, as detailed in Table 2-3.

Table 2-3: Oracle DOM APIs in C
DOM Name	C Name
Attr::getName, ...	XmlDomGetAttrName, ...
CharacterData::getData, ...	XmlDomGetCharData, ...
DocumentType::getName, ...	XmlDomGetDocTypeName, ...
Entity::getPublicId, ...	XmlDomGetEntityPublicID, ...
NamedNodeMap::item	XmlDomGetChildNode
NamedNodeMap::getLength	XmlDomGetNodeMapLength
NodeList::item	XmlDomGetChildNode
NodeList::getLength	XmlDomGetNodeMapLength
Notation::getPublicId, ...	XmlDomGetNotationPubID, ...

The documentation included with the C XDK details each of these functions; they are also declared in the parser header file, xml.h.
