Perl Cd Bookshelf [Electronic resources] (text version)


Mark V. Scardina, Ben Chang, and Jinyu Wang

Introducing the DOM APIs


The power of the DOM lies in its ability to provide access to an in-memory, structured representation of the entire XML document. Using the DOM, applications can perform tasks such as searching for specific data in an XML document, adding or deleting elements and attributes, and transforming the DOM into an entirely different document. Along with the org.w3c.dom interfaces provided by the W3C, the Oracle Java XML parser comes with a set of classes that implement the DOM APIs and extend them to provide other useful features, such as printing a document fragment or retrieving namespace information.


The following code demonstrates some of the DOM functionality in an XML parser:


// This example demonstrates a simple use of the DOMParser.
// An XML file is parsed and some information is printed out.
import java.io.*;
import java.net.*;
import oracle.xml.parser.v2.DOMParser;
import org.w3c.dom.*;
import org.w3c.dom.Node;
// Extensions to DOM Interfaces for Namespace support.
import oracle.xml.parser.v2.XMLElement;
import oracle.xml.parser.v2.XMLAttr;

public class DOMExample {
    public static void main(String[] argv) {
        try {
            // Generate a new input stream from given file
            FileInputStream xmldoc = new FileInputStream(argv[0]);
            // Parse the document using DOMParser
            DOMParser parser = new DOMParser();
            parser.parse(xmldoc);
            // Obtain the document.
            Document doc = parser.getDocument();
            // Print some information regarding attributes of elements
            // in the document
            printElementAttributes(doc);
        }
        catch (Exception e) {
            System.out.println(e.toString());
        }
    }

    static void printElementAttributes(Document doc) {
        NodeList nl = doc.getElementsByTagName("*");
        Element e;
        XMLAttr nsAttr;
        String attrname, attrval, attrqname;
        NamedNodeMap nnm;
        for (int j = 0; j < nl.getLength(); j++) {
            e = (Element) nl.item(j);
            System.out.println(e.getTagName() + ":");
            nnm = e.getAttributes();
            if (nnm != null) {
                for (int i = 0; i < nnm.getLength(); i++) {
                    nsAttr = (XMLAttr) nnm.item(i);
                    // Use the methods getQualifiedName(), getLocalName(),
                    // getNamespace(), and getExpandedName() in NSName
                    // interface to get Namespace information.
                    attrname = nsAttr.getExpandedName();
                    attrqname = nsAttr.getQualifiedName();
                    attrval = nsAttr.getNodeValue();
                    System.out.println(" " + attrqname + "(" + attrname +
                                       ")" + " = " + attrval);
                }
            }
            System.out.println();
        }
    }
}


The DOM APIs, unlike the SAX APIs, can be used only after the XML document is completely parsed. The downside of this is that large XML documents can occupy a lot of memory, which could ultimately affect the performance of your application. In pure functionality terms, however, the DOM APIs are definitely more powerful. The first thing you need to do before you begin using any of the DOM APIs is to parse your document using a new instance of DOMParser:


 // Parse the document using DOMParser
DOMParser parser = new DOMParser();
parser.parse(xmldoc);


Then, you need to request the parser to return a handle to the root of the Document Object Model, which it has constructed in memory:


 // Obtain the document.
Document doc = parser.getDocument();


Using the preceding handle, you can access every part of the XML document you just parsed. The DOMExample class assumes you want to access the elements in the document and their attributes. To do this, you first need to obtain a list of all the elements in the document. A DOM method called getElementsByTagName enables you to retrieve, recursively, all elements that match a given tag name under a certain level. It also supports a special tag named “*”, which matches any tag. Given this information, you need to invoke this method at the top level of the document via the handle to the root you obtained earlier in this section:


 NodeList nl = doc.getElementsByTagName("*");


The preceding call generates a list of all the elements in the document. Each of these elements contains the information regarding its attributes. To access this information, you need to traverse this list:


 int len = nl.getLength();
for (int j=0; j < len; j++) {
    e = (Element) nl.item(j);
    ...
}
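
The same lookup and loop can be tried without the Oracle parser. The following is a minimal sketch that assumes the JDK's built-in JAXP parser (javax.xml.parsers) in place of DOMParser; the class name ElementLister and its helper methods are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class; uses the JDK's JAXP parser, not the Oracle DOMParser.
class ElementLister {
    // Parse an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Collect the tag names of all elements matching tagName, in document order.
    // Passing "*" matches every element.
    static List<String> tagNames(Document doc, String tagName) {
        NodeList nl = doc.getElementsByTagName(tagName);
        List<String> names = new ArrayList<>();
        int len = nl.getLength();
        for (int j = 0; j < len; j++) {
            Element e = (Element) nl.item(j);
            names.add(e.getTagName());
        }
        return names;
    }

    public static void main(String[] args) {
        Document doc = parse("<booklist><book title=\"A\"/><book title=\"B\"/></booklist>");
        System.out.println(tagNames(doc, "*"));    // every element, root included
        System.out.println(tagNames(doc, "book")); // only the book elements
    }
}
```

Calling getLength() once before the loop, as in the snippet above, avoids re-evaluating it on every iteration.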


To obtain the attributes of each element in the loop, you can use a DOM method called getAttributes. This method generates a special kind of DOM list called NamedNodeMap. Once you obtain this list, traversing it to obtain information about the attributes themselves is straightforward.
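
As a rough illustration of walking a NamedNodeMap, the sketch below copies an element's attributes into a sorted map. It assumes the JDK's JAXP parser rather than the Oracle DOMParser, and the class name AttributeDump and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

// Hypothetical helper class; uses the JDK's JAXP parser, not the Oracle DOMParser.
class AttributeDump {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Copy an element's attributes into a name -> value map.
    // A TreeMap is used because a NamedNodeMap is not kept in any guaranteed order.
    static Map<String, String> attributesOf(Element e) {
        Map<String, String> result = new TreeMap<>();
        NamedNodeMap nnm = e.getAttributes();
        for (int i = 0; i < nnm.getLength(); i++) {
            Attr a = (Attr) nnm.item(i);
            result.put(a.getName(), a.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = parse("<book title=\"Oracle9i XML Handbook\" author=\"Chang\"/>");
        System.out.println(attributesOf(doc.getDocumentElement()));
    }
}
```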



DOM Level 2



As DOM evolved into Level 2, it became a modular specification, meaning that some of the new APIs can be stand-alone modules. Though the specifications are “Level 2,” they are actually 1.0 versions, which can be confusing, especially when the same DOM Core names are reused. In addition to DOM Level 2 Core, there are Events, Style, HTML, Traversal and Range, and Views modules. References to these specifications can be found in the appendix of this book.


The introduction of XML namespaces was the primary force behind the development of the DOM Core Level 2 specification, because all the element and attribute functions now had to accept or retrieve namespaces. The following snippet uses the Oracle XML Parser’s DOM 2.0 XML Namespace support to retrieve additional information regarding the attributes of each element:


for (int i = 0; i < nnm.getLength(); i++) {
    nsAttr = (XMLAttr) nnm.item(i);
    // Use the methods getQualifiedName() and getExpandedName()
    // in NSName interface to get Namespace information.
    attrname = nsAttr.getExpandedName();
    attrqname = nsAttr.getQualifiedName();
    attrval = nsAttr.getNodeValue();
    System.out.println(" " + attrqname + "(" + attrname +
                       ")" + " = " + attrval);
}


This kind of code is useful if the XML document you have to parse has elements with many attributes that belong to different namespaces. For example, suppose the booklist XML document from the preceding section looked like this:


<booklist xmlns:osborne="http://www.osborne.com"
          xmlns:bookguild="http://www.bookguild.com"
          xmlns:dollars="http://www.currency.org/dollars">
  <book osborne:isbn="0-07-213495-X" title="Oracle9i XML Handbook"
        author="Chang, Scardina, and Kiritzov" bookguild:publisher="Osborne"
        dollars:price="49.99"/>
  <book osborne:isbn="1230-23498-2349879" title="Emperor's New Mind"
        author="Roger Penrose" bookguild:publisher="Oxford Publishing Company"
        dollars:price="15.99"/>
</booklist>


The generated output with namespaces would look like this:


xmlns:osborne(http://www.w3.org/2000/xmlns/:osborne) = http://www.osborne.com
xmlns:bookguild(http://www.w3.org/2000/xmlns/:bookguild) = http://www.bookguild.com
xmlns:dollars(http://www.w3.org/2000/xmlns/:dollars) = http://www.currency.org/dollars
book:
osborne:isbn(http://www.osborne.com:isbn) = 0-07-213495-X
title(title) = Oracle9i XML Handbook
author(author) = Chang, Scardina, and Kiritzov

The DOM Level 2 Traversal and Range functionality includes methods that create Iterators and TreeWalkers to traverse a node and its children in document order. Objects using a TreeWalker to navigate a document tree or subtree use the view of the document defined by their whatToShow flags and filters. An example of such stub code would be the following:


// This filter accepts everything
NodeFilter n1 = new nf1();
// Node iterator doesn't allow expansion of entity references
NodeIterator ni =
    doc.createNodeIterator(elems[0], NodeFilter.SHOW_ALL, n1, false);
// Move forward
XMLNode nn = (XMLNode) ni.nextNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.nextNode();
}
// Move backward
nn = (XMLNode) ni.previousNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.previousNode();
}
// Node iterator allows expansion of entity references
ni = doc.createNodeIterator(elems[0], NodeFilter.SHOW_ALL, n1, true);
// Move forward
nn = (XMLNode) ni.nextNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.nextNode();
}
// Move backward
nn = (XMLNode) ni.previousNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.previousNode();
}

// This filter doesn't accept expansion of entity references
NodeFilter n2 = new nf2();
// Node iterator allows expansion of entity references
ni = doc.createNodeIterator(elems[0], NodeFilter.SHOW_ALL, n2, true);
// Move forward
nn = (XMLNode) ni.nextNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.nextNode();
}
// Move backward
nn = (XMLNode) ni.previousNode();
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) ni.previousNode();
}
// After detaching, all node iterator methods throw an exception
ni.detach();
try {
    nn = (XMLNode) ni.nextNode();
}
catch (DOMException e) {
    System.out.println(e.getMessage());
}
try {
    nn = (XMLNode) ni.previousNode();
}
catch (DOMException e) {
    System.out.println(e.getMessage());
}
// TreeWalker allows expansion of entity references
TreeWalker tw =
    doc.createTreeWalker(elems[0], NodeFilter.SHOW_ALL, n1, true);
nn = (XMLNode) tw.getRoot();
// Traverse in document order
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) tw.nextNode();
}
tw = doc.createTreeWalker(elems[0], NodeFilter.SHOW_ALL, n1, true);
nn = (XMLNode) tw.getRoot();
// Traverse the depth left
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) tw.firstChild();
}
tw = doc.createTreeWalker(elems[0], NodeFilter.SHOW_ALL, n2, true);
nn = (XMLNode) tw.getRoot();
// Traverse in document order
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) tw.nextNode();
}
tw = doc.createTreeWalker(elems[0], NodeFilter.SHOW_ALL, n2, true);
nn = (XMLNode) tw.getRoot();
// Traverse the depth right
while (nn != null) {
    System.out.println(nn.getNodeName() + " " + nn.getNodeValue());
    nn = (XMLNode) tw.lastChild();
}
...
class nf1 implements NodeFilter {
    public short acceptNode(Node node) {
        return FILTER_ACCEPT;
    }
}
class nf2 implements NodeFilter {
    public short acceptNode(Node node) {
        short type = node.getNodeType();
        if ((type == Node.ELEMENT_NODE) || (type == Node.ATTRIBUTE_NODE))
            return FILTER_ACCEPT;
        if ((type == Node.ENTITY_REFERENCE_NODE))
            return FILTER_REJECT;
        return FILTER_SKIP;
    }
}
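
For readers who want a complete, compilable variant of the stub above, here is a smaller sketch built on the standard org.w3c.dom.traversal interfaces. It assumes the JDK's JAXP parser and assumes that the Document it returns can be cast to DocumentTraversal, which depends on the parser implementation; the class name TraversalSketch and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.w3c.dom.traversal.TreeWalker;

// Hypothetical helper class; uses the JDK's JAXP parser, not the Oracle DOMParser.
class TraversalSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Element names in document order, using a NodeIterator whose
    // whatToShow flag restricts the view to element nodes.
    static List<String> elementNames(Document doc) {
        // Assumption: the parser's Document also implements DocumentTraversal.
        DocumentTraversal traversal = (DocumentTraversal) doc;
        NodeIterator it = traversal.createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, false);
        List<String> names = new ArrayList<>();
        for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
            names.add(n.getNodeName());
        }
        it.detach(); // release the iterator once it is no longer needed
        return names;
    }

    // Follow first children from the root down (the "depth left" walk).
    static List<String> leftmostPath(Document doc) {
        DocumentTraversal traversal = (DocumentTraversal) doc;
        TreeWalker tw = traversal.createTreeWalker(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, false);
        List<String> names = new ArrayList<>();
        for (Node n = tw.getRoot(); n != null; n = tw.firstChild()) {
            names.add(n.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b><c/></b><d/></a>");
        System.out.println(elementNames(doc));
        System.out.println(leftmostPath(doc));
    }
}
```

Passing null as the filter accepts every node allowed by the whatToShow flags, which plays the same role as the accept-everything filter nf1 in the stub.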



DOM Level 3



Like Level 2, the DOM Level 3 W3C Working Draft is modular. It consists of Core, Load and Save, Validation, Events, and XPath modules, which provide further functionality that DOM users identified as useful and necessary for their applications. References to these can be found in the Appendix.


A DOM application can use the hasFeature() method of the DOMImplementation object to determine whether a given module is supported. A DOMImplementation object can be retrieved from a Document using the getImplementation() method. Examples of feature strings for their respective modules are XML, HTML, Events, and Validation.
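
A feature check along these lines might look like the following sketch. It assumes the JDK's JAXP implementation, so the answers reflect that implementation rather than the Oracle parser; the class name FeatureCheck and its methods are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;

// Hypothetical helper class; queries the JDK's DOM implementation.
class FeatureCheck {
    // Create an empty Document to obtain a DOMImplementation from.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM parser available", e);
        }
    }

    // Ask the document's DOMImplementation whether it supports a module.
    // A null version asks whether any version of the feature is supported.
    static boolean supports(Document doc, String feature, String version) {
        DOMImplementation impl = doc.getImplementation();
        return impl.hasFeature(feature, version);
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        String[] features = {"XML", "HTML", "Events", "Traversal", "Range", "Validation"};
        for (String feature : features) {
            System.out.println(feature + ": " + supports(doc, feature, null));
        }
    }
}
```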


The basis of the DOM, as previously stated, is a tree consisting of Node objects. Different kinds of Nodes are used to represent an XML document: Document, Element, Attr, Text, DocumentFragment, DocumentType, ProcessingInstruction, Comment, CDATASection, EntityReference, and Notation. The DOM also defines two types that represent collections of nodes, NodeList and NamedNodeMap, and introduces a DOMString type, which is a string of UTF-16 encoded characters. Finally, the DOM introduces an exception type, DOMException, which is raised by the various DOM interfaces if an erroneous operation is performed or if some other error occurs during execution.
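
To make the node kinds and DOMException more concrete, the sketch below lists the node types of an element's children and then provokes a DOMException by giving a document a second root element. It assumes the JDK's JAXP parser with its default settings; the class name DomErrors and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper class; uses the JDK's JAXP parser, not the Oracle DOMParser.
class DomErrors {
    static DocumentBuilder builder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException("no DOM parser available", e);
        }
    }

    // Return the node type codes (Node.TEXT_NODE, Node.COMMENT_NODE, ...)
    // of the root element's direct children.
    static List<Short> childNodeTypes(String xml) {
        Document doc;
        try {
            doc = builder().parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        List<Short> types = new ArrayList<>();
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            types.add(n.getNodeType());
        }
        return types;
    }

    // Try to give a document a second root element, which the DOM forbids.
    // Returns the DOMException code raised, or -1 if no exception was raised.
    static short appendSecondRoot() {
        Document doc = builder().newDocument();
        doc.appendChild(doc.createElement("booklist"));
        try {
            doc.appendChild(doc.createElement("extra"));
            return -1;
        } catch (DOMException e) {
            return e.code;
        }
    }

    public static void main(String[] args) {
        System.out.println(childNodeTypes("<a><!--note-->text<?target data?></a>"));
        System.out.println(appendSecondRoot() == DOMException.HIERARCHY_REQUEST_ERR);
    }
}
```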


Table 2-1 and Table 2-2 list the DOM types and the corresponding types supported by the Oracle XML parsers for Java, PL/SQL, C and C++.

Table 2-1: DOM Types with Corresponding Java and PL/SQL Oracle Types

DOM Type                Java                   PL/SQL
Node                    XMLNode                DOMNode
Document                XMLDocument            DOMDocument
Element                 XMLElement             DOMElement
Attr                    XMLAttr                DOMAttr
Text                    XMLText                DOMText
DocumentFragment        XMLDocumentFragment    DOMDocumentFragment
ProcessingInstruction   XMLPI                  DOMPI
DocumentType            DTD                    XMLDTD
EntityReference         XMLEntityReference     DOMEntityReference
Comment                 XMLComment             DOMComment
CDATASection            XMLCDATA               DOMCDataSection
NodeList                XMLNodeList            DOMNodeList
NamedNodeMap            N/A (private class)    DOMNamedNodeMap
Notation                XMLNotation            DOMNotation
DOMString               java.lang.String       VARCHAR2
DOMException            XMLDOMException        EXCEPTION

Table 2-2: DOM Types with Corresponding C and C++ Oracle Types

DOM Type                C                  C++
Node                    xmlnode            NodeRef
Document                xmldocnode         DocumentRef
Element                 xmlelemnode        ElementRef
Attr                    xmlattrnode        AttrRef
Text                    xmltextnode        TextRef
DocumentFragment        xmlfragnode        DocumentFragmentRef
ProcessingInstruction   xmlpinode          ProcessingInstructionRef
DocumentType            xmldtdnode         DocumentTypeRef
EntityReference         xmlentrefnode      EntityReferenceRef
Comment                 xmlcommentnode     CommentRef
CDATASection            xmlcdatanode       CDATASectionRef
NodeList                xmlnodelist        NodeListRef
NamedNodeMap            xmlnamedmap        NamedNodeMapRef
Notation                xmlnotenode        NotationRef
DOMString               oratext *          DOMString
DOMException            N/A                N/A

Oracle DOM APIs in C



Because the DOM is an object-oriented specification and the C language is not object oriented, some changes had to be made. In particular, the C function namespace is flat, so the names of DOM methods that are the same in several different classes have been changed to make them unique, as detailed in Table 2-3.

Table 2-3: Oracle DOM APIs in C

DOM Name                      C Name
Attr::getName, ...            XmlDomGetAttrName, ...
CharacterData::getData, ...   XmlDomGetCharData, ...
DocumentType::getName, ...    XmlDomGetDocTypeName, ...
Entity::getPublicId, ...      XmlDomGetEntityPublicID, ...
NamedNodeMap::item            XmlDomGetChildNode
NamedNodeMap::getLength       XmlDomGetNodeMapLength
NodeList::item                XmlDomGetChildNode
NodeList::getLength           XmlDomGetNodeMapLength
Notation::getPublicId, ...    XmlDomGetNotationPubID, ...

The documentation included with the C XDK describes each of these functions in detail. Their declarations can also be found in the parser header file, xml.h.

