Chapter 24: Building an XML Data-Retrieval Application
Overview
As XML becomes the industry standard for exchanging business data, the ability to retrieve selected data from large XML documents becomes increasingly important. Because XPath is the XML navigational language, it lies at the foundation of standards-based data-retrieval solutions. XPath-based solutions such as XSLT require creating and traversing a Document Object Model (DOM) of the input XML. In practice, a DOM can require up to ten times the memory of the original document, because DOMs include traversal APIs that are difficult to optimize. Other XML processing methods, such as SAX and StAX, are event-based and require less memory; however, they lack the XPath support needed for retrieving XML data.

XPath allows you to retrieve XML data based not only on its content but also on the structure of the XML document. In Oracle XDK 10g, you have access to an XPath processor in C and C++ that can evaluate an XPath expression and return a set of nodes or values as appropriate.

In this chapter, we first examine typical requirements for data retrieval from XML documents and then discuss the design of an application that provides XPath-based XML data retrieval. You will then build a lightweight data-extraction engine that can efficiently match XPaths and retrieve the results. Finally, we describe how this engine can be integrated into a content-management application in an actual use case.
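To make concrete the point that XPath can select data by document structure as well as by content, here is a minimal sketch. It is not the Oracle XDK C/C++ XPath processor discussed in this chapter; it assumes Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath, and it uses a small sample document invented for illustration. The function names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration only.
SAMPLE = """
<orders>
  <order id="1001" status="shipped">
    <customer>Acme</customer>
    <item sku="A-1"><name>Widget</name></item>
  </order>
  <order id="1002" status="open">
    <customer>Globex</customer>
    <item sku="B-7"><name>Gadget</name></item>
  </order>
</orders>
"""


def select_by_structure(root):
    """Select on structure alone: every <name> under order/item."""
    return [e.text for e in root.findall("./order/item/name")]


def select_by_content(root):
    """Select on content: ids of orders whose status attribute is 'open'."""
    return [e.get("id") for e in root.findall("./order[@status='open']")]


def select_combined(root):
    """Combine both: item names in orders whose <customer> text is 'Acme'."""
    return [e.text for e in root.findall("./order[customer='Acme']/item/name")]


# Parsing builds the whole tree in memory, much like the DOM approach.
root = ET.fromstring(SAMPLE)

if __name__ == "__main__":
    print(select_by_structure(root))
    print(select_by_content(root))
    print(select_combined(root))
```

Note that this sketch parses the entire document into an in-memory tree before evaluating any path, which is the memory cost described above for DOM-based processing; it illustrates the selection semantics only, not a lightweight extraction engine.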