Perl Cd Bookshelf [Electronic resources] (text version)

Building the cppextract Application

Since the Oracle XDK 10g C++ libraries have been converted to use C++ templates, you will use this design approach for the application called cppextract. It will be made up of the following three object files and their associated source files:

cppextractGen.cpp: The generic functions used within the application

cppextractForce.cpp: The specific instantiation of the generic functions in the application

cppextractMain.cpp: The user interface to the application

The following sections discuss these files and their usage within the application.

Creating the Generic Functions

The cppextractGen.cpp file contains the code of the templates that are used in the application. Its only dependencies are on the standard C I/O library, stdio.h, and the XDK C/C++ library, oraxml10, which is referenced by including xml.hpp:

#ifndef XML_CPP_ORACLE
#include <xml.hpp>
#endif

The processXPath() Template

The processXPath() template performs the actual XPath evaluation by wrapping the process() function with the necessary code to parse the input XML document and to handle the resulting output. Let’s examine the template in detail.

This first section handles the initialization of the XDK context and is a call that all C++ XDK applications need to make. It needs to be made only one time because it can be reused throughout the application.

template< typename TCtx, typename Tnode> unsigned processXPath(
    char* dname, char* xpath_exp);
extern "C" {
#include <stdio.h>
}
#include "cppextract.hpp"
// Forward declaration; the definition appears later in this file.
template< typename Tnode> void printSubTree( NodeRef< Tnode>& nrefp);
template< typename TCtx, typename Tnode> unsigned processXPath(
    char* dname, char* xpath_exp) {
    // Initialize the XDK context once; it is reused by the rest of the function.
    TCtx* ctxp = NULL;
    cout << "XML C++ XPath Extract\n";
    try
    {
        ctxp = new TCtx();
    }
    catch (XmlException& e)
    {
        unsigned ecode = e.getCode();
        cout << "Failed to initialize XML context, error " << ecode << "\n";
        return ecode;
    }

The next section creates an instance of the XPath processor by first creating a new Tools Factory instance, fp, and then using it to create the processor, prp, with createXPathProcessor():

    Factory< TCtx, Tnode>* fp = NULL;
    try
    {
        fp = new Factory< TCtx, Tnode>( ctxp);
    }
    catch (FactoryException& fe)
    {
        unsigned ecode = fe.getCode();
        cout << "Failed to create factory, error " << ecode << "\n";
        return ecode;
    }
    printf("Creating XPath processor\n");
    XPath::Processor< TCtx, Tnode>* prp = NULL;
    try
    {
        prp = fp->createXPathProcessor( XPathPrCXml, NULL);
    }
    catch (FactoryException& fe1)
    {
        unsigned ecode = fe1.getCode();
        cout << "Failed to create XPath processor, error " << ecode << "\n";
        return ecode;
    }
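
The error handlers above return early without releasing objects that were already created, such as ctxp. One way to make that cleanup automatic is to hold each object in a std::unique_ptr. The following is a minimal sketch of the idea only: DemoCtx, DemoFactory, and setup() are hypothetical stand-ins invented for illustration and are not XDK classes, and whether the XDK objects may be released with a plain delete is an assumption to check against the XDK documentation.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for the XDK context; it counts live instances
// so that leaks are easy to observe.
struct DemoCtx {
    static int live;
    DemoCtx()  { ++live; }
    ~DemoCtx() { --live; }
};
int DemoCtx::live = 0;

// Hypothetical stand-in for the factory; the constructor can be told to fail.
struct DemoFactory {
    DemoFactory(DemoCtx&, bool fail) {
        if (fail) throw std::runtime_error("factory creation failed");
    }
};

// Returns 0 on success and a nonzero code on failure, in the style of
// processXPath().
unsigned setup(bool failFactory) {
    std::unique_ptr<DemoCtx> ctx;
    try {
        ctx.reset(new DemoCtx());
    } catch (const std::exception&) {
        return 1;
    }
    std::unique_ptr<DemoFactory> factory;
    try {
        factory.reset(new DemoFactory(*ctx, failFactory));
    } catch (const std::exception&) {
        return 2;   // ctx is destroyed automatically on this early return
    }
    return 0;       // both objects are destroyed when the function exits
}
```

With this arrangement, calling setup(true) returns the error code 2 and still leaves DemoCtx::live at zero, because the smart pointer destroys the context on every exit path.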

Next, the input XML document is wrapped in an input source by constructing a FileSource from the file name:

    InputSource* isrcp = new FileSource( (oratext*)dname);

The following section invokes the XPath processor through its process() function, passing in the input source, isrcp, and the XPath expression, xpath_exp. The result is returned as objp, which is analyzed in the next section.

    cout << "Processing " << dname;
    // Print the expression in double quotes.
    cout << " using \"" << xpath_exp << "\"\n";
    XPathObject< Tnode>* objp = NULL;
    try
    {
        objp = prp->process( isrcp, (oratext*)xpath_exp);
    }
    catch (XPathException& xpe)
    {
        unsigned ecode = xpe.getCode();
        cout << "Failed to process the document, error " << ecode << "\n";
        return ecode;
    }

The next section queries the result object, objp, and a switch statement branches on the type of the result:

    NodeSet< Tnode>* np = NULL;
    boolean varb = FALSE;
    double num = 0.0;
    oratext* str = NULL;
    unsigned i = 0;
    switch (objp->getObjType())
    {
    case XPOBJ_TYPE_NDSET:
        np = objp->getNodeSet();
        cout << "NodeSet:\n";
        for (i = 0; i < np->getSize(); i++ )
        {
            NodeRef< Tnode>* nrefp = np->getNode( i);
            switch( nrefp->getNodeType())
            {
            case ELEMENT_NODE:
            {
                // Braces scope elref to this case so that the later case
                // labels do not jump past its initialization.
                NodeRef< Tnode> elref( (*np), nrefp);
                cout << "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
                printSubTree< Tnode>( *nrefp);
                break;
            }
            case ATTRIBUTE_NODE:
                cout << "Attribute Name :" << nrefp->getNodeName() << "\n";
                break;
            default:
                cout << "Node Value : " << nrefp->getNodeValue() << "\n";
                break;
            }
        }
        break;

    case XPOBJ_TYPE_BOOL:
        varb = objp->getObjBoolean();
        cout << "Boolean Value : " << varb << "\n";
        break;
    case XPOBJ_TYPE_NUM:
        num = objp->getObjNumber();
        cout << "Numeric Value : " << num << "\n";
        break;
    case XPOBJ_TYPE_STR:
        str = objp->getObjString();
        cout << "String Value : " << str << "\n";
        break;
    default:
        cout << "Failed to create valid object\n";
    }
    return 0;
}
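
The branching on the result type can be tried out without the XDK. The following is a minimal sketch under stated assumptions: ResultType, DemoResult, and describe() are hypothetical names invented for illustration, and a vector of strings stands in for the node set. It shows only the dispatch pattern, including the break that keeps the string case from falling through.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the four kinds of XPath result.
enum class ResultType { NodeSet, Boolean, Number, String };

// Hypothetical result object; only the field matching `type` is meaningful.
struct DemoResult {
    ResultType type;
    std::vector<std::string> nodes;  // used when type == NodeSet
    bool boolean;
    double number;
    std::string text;
    explicit DemoResult(ResultType t) : type(t), boolean(false), number(0.0) {}
};

// Builds the text that the corresponding branch would print.
std::string describe(const DemoResult& r) {
    std::ostringstream out;
    switch (r.type) {
    case ResultType::NodeSet:
        out << "NodeSet:\n";
        for (std::size_t i = 0; i < r.nodes.size(); ++i)
            out << "Node Value : " << r.nodes[i] << "\n";
        break;
    case ResultType::Boolean:
        out << "Boolean Value : " << (r.boolean ? "true" : "false") << "\n";
        break;
    case ResultType::Number:
        out << "Numeric Value : " << r.number << "\n";
        break;
    case ResultType::String:
        out << "String Value : " << r.text << "\n";
        break;   // without this break, control would run on past this case
    }
    return out.str();
}
```

Returning the text instead of writing to cout is a choice made for this sketch so that the result is easy to inspect; the listing above prints directly.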

Since objp may hold a node set, a Boolean, a number, or a string, the outer switch statement branches on its type. If the type is XPOBJ_TYPE_NDSET, the result is a node set, and the for loop visits each node in turn. A nested switch statement then checks the node type. If the node is an element, the code treats the request as one for splitting the document into subtrees: it first prints an XML declaration and then calls the printSubTree() function. If the node is an attribute, it prints the attribute's name. For any other node type, it prints the node's value. For example, consider the following document:

<?xml version="1.0" encoding="UTF-8"?>
<Bookcatalog>
  <book ISBN="7564">
    <title>The Adventures of Don Quixote</title>
    <author_lastname>Cervantes</author_lastname>
    <publisher>Oracle Press</publisher>
    <year>2000</year>
    <price>50.00</price>
  </book>
  <book ISBN="5354">
    <title>The Iliad</title>
    <author_lastname>Homer</author_lastname>
    <publisher>Oracle Press</publisher>
    <year>1000</year>
    <price>5.00</price>
  </book>
</Bookcatalog>

The XPath /Bookcatalog/book prints each book record. Note that XPath is case-sensitive, so the first step must match the Bookcatalog element name exactly. The XPath //book/@* prints the name of each attribute, which in this case is ISBN. The XPath string(//book/@*) returns a string rather than a node set, so the switch statement takes the XPOBJ_TYPE_STR branch. Because the XPath 1.0 string() function converts a node set to the string-value of its first node in document order, only 7564 is printed. If a node is neither an element nor an attribute, such as a text node, the nested switch statement executes its default section and prints the node's value, as in //book/title/text(), which prints “The Adventures of Don Quixote” and “The Iliad”.

This file also includes the cppextract.hpp header file that declares the main template, processXPath(), as follows:

#ifndef XML_CPP_ORACLE
#include <xml.hpp>
#endif
template< typename TCtx, typename Tnode> unsigned processXPath(
char* dname, char* xpath_exp);

The printSubTree() Template

The printSubTree() template serializes the subtree rooted at the specified element to standard output, so that each matching element is printed as its own XML document:

template< typename Tnode> void printSubTree( NodeRef< Tnode>& nref)
{
    oratext* tag = nref.getNodeName();
    if (tag == NULL)
    {
        cout << " Element has no name - error\n";
        return;
    }
    // Print opening tag
    cout << "<" << tag;
    // Get attributes on element
    NamedNodeMap< Tnode>* attrs = nref.getAttributes();
    NamedNodeMapRef< Tnode> attref( nref, attrs);
    ub4 n_attrs = attref.getLength();
    NodeRef< Tnode>* attrefp = NULL;
    for (unsigned a = 0; a < n_attrs; a++)
    {
        Tnode* ap = attref.item( a);
        if (a == 0)
            attrefp = new NodeRef< Tnode>( nref, ap);
        else
            attrefp->resetNode( ap);
        // Print attribute as name="value" so the output stays well formed
        cout << " " << attrefp->getNodeName() << "=\"" << attrefp->getNodeValue() << "\"";
    }
    delete attrefp;   // free the reference object allocated above (no-op when NULL)
    cout << ">";
    NodeRef< Tnode>* nrefp = NULL;
    if (nref.hasChildNodes())
    {
        NodeList< Tnode>* lp = nref.getChildNodes();
        NodeListRef< Tnode> lref( nref, lp);

        ub4 len = lref.getLength();
        for (unsigned i = 0; i < len; i++)
        {
            Tnode* np = lref.item( i);
            if (i == 0)
                nrefp = new NodeRef< Tnode>( nref, np);
            else
                nrefp->resetNode( np);
            if (nrefp->getNodeType() == ELEMENT_NODE)
            {
                // Recurse into the child element
                printSubTree< Tnode>( *nrefp);
            }
            else if (nrefp->getNodeType() == TEXT_NODE)
                // Print text value
                cout << nrefp->getNodeValue();
        }
    }
    delete nrefp;     // free the reference object allocated above (no-op when NULL)
    // Print closing tag
    cout << "</" << tag << ">";
}

This is a useful template for serializing an XML document or fragment, as there currently is no version of XmlSaveDOM() for C++. First it writes the opening angle bracket and the tag name. It then iterates over the element's attributes with a for loop, pointing the reference attrefp at each one in turn and serializing its name and value, and then writes the closing angle bracket of the start tag. Since an element can have element content, text content, or both, an if statement checks hasChildNodes(). If there are children, getChildNodes() returns the node list, lp, and a for loop over its length either calls printSubTree() recursively for element nodes or serializes the value for text nodes. Finally, once the node list has been exhausted, the end tag is serialized.
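
The same recursive shape can be sketched without the XDK. In the following minimal example, DemoElement and serialize() are hypothetical names invented for illustration, not XDK types. The simplified model stores an element's text before its child elements, so it does not preserve the order of mixed content, and it does not escape special characters such as & or <.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal element tree used only for this sketch.
struct DemoElement {
    std::string tag;
    std::vector<std::pair<std::string, std::string> > attrs;  // name, value
    std::string text;                    // text content, if any
    std::vector<DemoElement> children;   // child elements, if any
};

// Serializes an element and everything below it: start tag with
// attributes, then text, then each child recursively, then the end tag.
std::string serialize(const DemoElement& e) {
    std::string out = "<" + e.tag;
    for (std::size_t a = 0; a < e.attrs.size(); ++a)
        out += " " + e.attrs[a].first + "=\"" + e.attrs[a].second + "\"";
    out += ">";
    out += e.text;
    for (std::size_t i = 0; i < e.children.size(); ++i)
        out += serialize(e.children[i]);   // recurse into child elements
    out += "</" + e.tag + ">";
    return out;
}
```

Building a book element with an ISBN attribute and a single title child and passing it to serialize() yields one string holding the start tag, the nested title element, and the end tag.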

Instantiating the Generic Functions with cppextractForce

Once the generic functions are completed, you need to instantiate them with the concrete types your application uses, here CXmlCtx and xmlnode. This is done in cppextractForce.cpp:

#ifndef XMLCTX_CPP_ORACLE
#include <xmlctx.hpp>
#endif
#include "cppextractGen.cpp"
unsigned force( char* dname, char* xpath_exp)
{
    return processXPath< CXmlCtx, xmlnode>( dname, xpath_exp);
}

Note that this file includes the generic functions created in cppextractGen.cpp.
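
The force() wrapper makes the compiler instantiate the template because it calls it with concrete types. Standard C++ also provides an explicit instantiation definition that achieves this without a wrapper function. The following is a minimal sketch of that alternative, not the approach the application itself uses: DemoCtx, DemoNode, and processDemo() are hypothetical stand-ins for CXmlCtx, xmlnode, and processXPath(), and the function body is a placeholder.

```cpp
#include <cstring>

// Hypothetical stand-ins for the concrete context and node types.
struct DemoCtx {};
struct DemoNode {};

// Generic function, as it would appear in the "Gen" source file.
// Placeholder body: returns 1 when either argument is empty, else 0.
template <typename TCtx, typename TNode>
unsigned processDemo(const char* dname, const char* xpath_exp) {
    return (std::strlen(dname) == 0 || std::strlen(xpath_exp) == 0) ? 1u : 0u;
}

// Explicit instantiation definition, as it could appear in the "Force"
// source file. It tells the compiler to emit code for exactly this
// pair of types in this translation unit.
template unsigned processDemo<DemoCtx, DemoNode>(const char*, const char*);
```

Either technique keeps the template definitions out of the file that contains main(), which then needs only the declaration from the header.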

Creating the Main Program with cppextractMain

It is now time to create the user interface for the application. This is done in cppextractMain.cpp. Since this is a command-line application, you need to pass in the appropriate parameters, which in this case are the XML file to query and the XPath expression to use:

#include <iostream.h>
#include <string.h>
#ifndef XMLCTX_CPP_ORACLE
#include <xmlctx.hpp>
#endif
#include "cppextract.hpp"
int main( int argc, char* argv[])
{
    // Both the XML file and the XPath expression are required.
    if (argc < 3)
    {
        cout << "Usage is cppextract <xmlfile> <xpath>\n";
        return 1;
    }
    // processXPath() returns a nonzero error code on failure.
    if (processXPath< CXmlCtx, xmlnode>( argv[1], argv[2]))
        return 1;
    return 0;
}

Besides parsing the command line, the only thing the application needs to do is call your main processXPath() function as it encapsulates the application’s functionality. Obviously, if you were using this function from within a more sophisticated application, as we will discuss shortly, the processXPath() function could return the actual nodes or subtree DOMs for further processing, such as performing an XSLT transformation or inserting into a database.
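
One way to hand results on for further processing, rather than printing them, is to let the caller supply a handler that receives each match. The following is a minimal sketch under stated assumptions: forEachMatch() is a hypothetical function invented for illustration, and a vector of strings stands in for the node set that the XPath processor would return.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical: passes each matched fragment to `handler`.
// `matches` stands in for the node set returned by the XPath processor.
// Returns 0 on success and 1 when no handler was supplied.
unsigned forEachMatch(const std::vector<std::string>& matches,
                      const std::function<void(const std::string&)>& handler) {
    if (!handler)
        return 1;
    for (std::size_t i = 0; i < matches.size(); ++i)
        handler(matches[i]);
    return 0;
}
```

A caller could pass a lambda that collects the fragments, transforms them, or writes them to a database, while the extraction loop itself stays unchanged.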
