Perl Cd Bookshelf [Electronic resources] (text version)


Mark V. Scardina, Ben Chang, and Jinyu Wang


Database Design Decisions for XML

Storing and manipulating XML in a database poses many challenges, which can be daunting if you know it “has” to be done but aren’t sure how. Databases are relational and XML is hierarchical, so until recently there has been no simple, elegant way to integrate the two. Traditionally, developers have had two choices: either use a parser to deconstruct the document into relational data and store it as such in the database, or store the entire document as a text file, preserving its text-based structure.

The important thing to remember is that moving XML into the back end with the database is not a one-size-fits-all process. Each storage option has its advantages and disadvantages. Knowing which XML storage model is best for the purposes of your application, rather than modeling your application after an XML storage model, is critical. Let’s look at each XML storage model’s advantages and disadvantages to help you determine which type to use.


XMLType CLOB


In some sense, using an XMLType CLOB is the simplest way to store an XML file. It treats the XML document as exactly that: a document. The file is preserved as a complete text document in storage (with whitespace, comments, and so on intact), by storing it simply as a single, character-based entry in the database. Consequently, files of any size and depth can be stored as long as they are well-formed XML. However, although you may have defined data types in the document for validation against a schema, the data is not typed in the sense that it can be manipulated or retrieved using SQL queries.

Because of this limitation, the document must be searched using a text search engine rather than SQL queries, which can leverage query rewrites, functional indexes, and so on. Updates are also inefficient, because changing the document involves parsing the entire file into memory using DOM, making the change, and replacing the stored document. If, however, the primary purpose of your XML documents is to encapsulate content in a structure for transformation (as might be the focus in web publishing, content management, document archiving, and so on), and most changes to content are made at a document level, then the XMLType CLOB is your best choice for XML data storage. In these cases, you aren’t likely to need a SQL context for your data, and you get a guarantee of byte-for-byte fidelity.
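The trade-off described above can be sketched in a few lines. This is a minimal illustration only: it uses Python's built-in `sqlite3` and `xml.dom.minidom` modules as stand-ins, not Oracle's XMLType CLOB, and the table name `xml_docs`, the sample document, and the helper `set_status` are all hypothetical.

```python
import sqlite3
from xml.dom import minidom

# Hypothetical document; the comment and indentation are part of what is stored.
DOC = """<?xml version="1.0"?>
<!-- quarterly report -->
<report>
  <title>Q1 Results</title>
  <status>draft</status>
</report>
"""

conn = sqlite3.connect(":memory:")
# One character-based column holds the whole document (a stand-in for a CLOB).
conn.execute("CREATE TABLE xml_docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO xml_docs (id, body) VALUES (1, ?)", (DOC,))

# Retrieval returns the text exactly as it was stored.
stored = conn.execute("SELECT body FROM xml_docs WHERE id = 1").fetchone()[0]

# Searching is plain text matching; SQL knows nothing about the elements.
hits = [row[0] for row in
        conn.execute("SELECT id FROM xml_docs WHERE body LIKE ?", ("%Q1 Results%",))]

def set_status(conn, doc_id, new_status):
    """Change one element by parsing, editing, and replacing the whole document."""
    body = conn.execute(
        "SELECT body FROM xml_docs WHERE id = ?", (doc_id,)).fetchone()[0]
    dom = minidom.parseString(body)              # full in-memory DOM tree
    status = dom.getElementsByTagName("status")[0]
    status.firstChild.data = new_status          # edit the text node
    conn.execute("UPDATE xml_docs SET body = ? WHERE id = ?",
                 (dom.toxml(), doc_id))          # write the whole document back

set_status(conn, 1, "final")
updated = conn.execute("SELECT body FROM xml_docs WHERE id = 1").fetchone()[0]
```

Note that even a one-word change rewrites the entire stored document, which is why this model suits content that changes at the document level.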


XMLType Views


An alternative to using the XMLType CLOB is to create a virtual XML document on top of a set of relational tables as an XMLType view. This approach permits a user to insert, update, and delete data in the XML file just as though it were SQL data. Because you are defining a virtual XML document on top of the data store, you aren’t limited to just one representation of the data as with CLOB; rather, you can have multiple XML “documents.”

Storing data in relational tables also means that you can update individual elements without pulling the entire document. In general, with XMLType views, you get all the advantages and efficiency that come with SQL operations, because the relational database engine is optimized for these kinds of retrievals. Finally, you can use the SQL datatype operations on the XML data types, instead of treating them simply as text. (For example, a date can be treated as a true date from a SQL standpoint, rather than simply a string of characters.)
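As a rough sketch of these two points (element-level updates and typed operations), the following uses Python's `sqlite3` and `xml.etree.ElementTree` in place of Oracle's XMLType views. The `orders` table, its columns, and the function `orders_as_xml` are hypothetical, and SQLite stores dates as ISO text handled by its `date()` function rather than as a native DATE type.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, order_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "Acme", "2004-01-15"), (2, "Globex", "2004-03-02")])

def orders_as_xml(conn):
    """Build a 'virtual' XML document from the relational rows on request."""
    root = ET.Element("orders")
    rows = conn.execute("SELECT id, customer, order_date FROM orders ORDER BY id")
    for oid, customer, order_date in rows:
        order = ET.SubElement(root, "order", id=str(oid))
        ET.SubElement(order, "customer").text = customer
        ET.SubElement(order, "date").text = order_date
    return ET.tostring(root, encoding="unicode")

# Update a single "element" with ordinary DML; no document is parsed or rewritten.
conn.execute("UPDATE orders SET customer = ? WHERE id = ?", ("Acme Corp", 1))

# Treat the date as a date, not as a string of characters.
due = conn.execute(
    "SELECT date(order_date, '+30 days') FROM orders WHERE id = 1").fetchone()[0]
recent = conn.execute(
    "SELECT id FROM orders WHERE date(order_date) >= date('2004-02-01')").fetchall()

xml_text = orders_as_xml(conn)
```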

This approach has some disadvantages, however. Defining XMLType views where the structure is deep (that is, deeper than eight to ten tiers) can degrade performance significantly due to table joins. Inserting and updating views requires INSTEAD-OF triggers, which are more difficult to maintain because you need to include application code in the trigger. The great virtue of the CLOB approach is the complete preservation of structure and byte-for-byte fidelity. With the XMLType view, you lose the guarantee of strict document order; also, many items (such as whitespace, comments, and processing instructions) are lost when the document data is shredded into tables.
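The loss of fidelity is easy to see in a small round trip. The sketch below shreds a document into rows and rebuilds it, again with `sqlite3` and `ElementTree` standing in for the database; the `order_items` table and its explicit `position` column are assumptions made for illustration, not part of any Oracle API.

```python
import sqlite3
import xml.etree.ElementTree as ET

ORIGINAL = """<order id="7">
  <!-- rush delivery -->
  <item>widget</item>
  <item>gasket</item>
</order>"""

conn = sqlite3.connect(":memory:")
# 'position' records document order explicitly, since rows have no inherent order.
conn.execute(
    "CREATE TABLE order_items (order_id INTEGER, position INTEGER, name TEXT)")

# Shred: keep only the data values; the default parser discards comments.
root = ET.fromstring(ORIGINAL)
order_id = int(root.get("id"))
for pos, item in enumerate(root.findall("item")):
    conn.execute("INSERT INTO order_items VALUES (?, ?, ?)",
                 (order_id, pos, item.text))

# Rebuild the document from the rows alone.
rebuilt_root = ET.Element("order", id=str(order_id))
query = "SELECT name FROM order_items WHERE order_id = ? ORDER BY position"
for (name,) in conn.execute(query, (order_id,)):
    ET.SubElement(rebuilt_root, "item").text = name
rebuilt = ET.tostring(rebuilt_root, encoding="unicode")
```

The rebuilt text carries the same data but not the original comment or indentation, so it is no longer byte-for-byte identical to what was submitted.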

None of this matters, of course, if you are shredding your data for use by data-centric applications that don’t “care” about document structure. If your goal is to move data in and out of the database, and keep metadata intact only as context, and have all the advantages of DML operations, XMLType views are ideal. You can start from the database schema and generate the corresponding XML schema. Here there is no concept of document order; any required section ordering can be explicitly defined via ROWIDs or other application-specific methods. Functions are provided in the database and XDK to create XML schemas automatically from XMLType views.

This method is also especially useful if you are working with several XML schemas of differing tag names and structure and do not want them to define your underlying database schema (as when extending an existing legacy database application to support XML while preserving its legacy functionality). In a typical example, you can repurpose the same data store for a variety of customers who are dictating formats and templates to you. You simply define an XMLType view for each customer, and when you retrieve data from or insert data into that view, it will have the format appropriate to the corresponding customer.
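The per-customer idea can be approximated as follows. This is only a sketch under stated assumptions: the `products` table, the `TEMPLATES` mapping, the customer names, and the `render` function are all hypothetical, and plain Python functions play the role that XMLType views would play in the database.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO products VALUES ('A-100', 'Widget', 9.5)")

# Hypothetical per-customer templates: root tag, row tag, column-to-tag mapping.
TEMPLATES = {
    "customer_a": ("Catalog", "Item",
                   {"sku": "Code", "name": "Title", "price": "Price"}),
    "customer_b": ("products", "product",
                   {"sku": "id", "name": "label", "price": "cost"}),
}

def render(conn, customer):
    """Render the same rows using the tag names a given customer dictates."""
    root_tag, row_tag, tags = TEMPLATES[customer]
    root = ET.Element(root_tag)
    rows = conn.execute("SELECT sku, name, price FROM products ORDER BY sku")
    for sku, name, price in rows:
        row = ET.SubElement(root, row_tag)
        for column, value in (("sku", sku), ("name", name), ("price", price)):
            ET.SubElement(row, tags[column]).text = str(value)
    return ET.tostring(root, encoding="unicode")

doc_a = render(conn, "customer_a")
doc_b = render(conn, "customer_b")
```

Both documents come from the same single row; only the presentation layer differs, so the underlying schema never has to change to suit a customer's format.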


XML Stored in the Oracle XML DB Repository


With regard to storing XML, it is possible to have your cake and eat it too, at least to an extent: you can store your document as a native XMLType in the Oracle XML DB repository, which will preserve byte-for-byte document fidelity and also shred it into SQL tables. This approach gives you complete validation while allowing you to do all the DML operations on the document that you get with XMLType views. You still get fine-grained data management, and you can create multiple views and documents based on the SQL data. Once your XML schema is registered, you store your XML data in your database by simply inserting an XML document file using SQL, PL/SQL, Java, FTP, HTTP, or WebDAV. Getting XML data out of your database can be as simple as executing a SQL query or reading a file using one of those Internet-standard protocols. This functionality is made possible through the built-in query rewrite support, eliminating the need for the INSTEAD-OF triggers.

Besides the ease of working with XML in either its document or data form, you get an enhancement to the W3C-standard Document Object Model (DOM) APIs when programmatically using them. When parsing XML from a file, you can build an in-memory tree representation of the entire file in order to manipulate it. (This approach is shared by other XML processors such as XSLT.) With the “virtual DOM” feature of the native XMLType, you build the tree on demand—this preserves resources when using DOM APIs and XSLT, and, in cases of large documents or row sets, your application simply works instead of crashing.
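The general idea of building a tree only on demand can be illustrated with Python's standard `xml.dom.pulldom` module. This is an analogy, not Oracle's virtual DOM implementation: the parser streams through the input and materializes a DOM subtree only for the node the caller asks to expand. The sample data and the function `find_row_name` are hypothetical.

```python
from xml.dom import pulldom

# Hypothetical large row set: 1000 rows, of which only one is needed.
DOC = "<rows>" + "".join(
    f"<row id='{i}'><name>item {i}</name></row>" for i in range(1000)
) + "</rows>"

def find_row_name(xml_text, wanted_id):
    """Stream through the document and build a DOM subtree for one row only."""
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "row":
            if node.getAttribute("id") == wanted_id:
                events.expandNode(node)   # materialize just this subtree
                return node.getElementsByTagName("name")[0].firstChild.data
    return None

name = find_row_name(DOC, "500")
```

Rows that are never expanded do not get child nodes attached, so the full tree for the whole document is never held in memory at once.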

There are many advantages to using the Oracle XML DB repository, but its use isn’t right for every application. Overhead is involved in maintaining the relationship between the full document and its shredded data. However, the biggest problem comes with schema evolution. Because the document dictates the storing process (mapping which data to which tables), when you want to change the document schema, it is no longer a simple abstraction but is intimately bound to the database schema whose structure it dictated. That means you can’t make most nontrivial changes to either the database or the document schema without having to export all the data and reimport it into the database. Oracle Database 10g is better than 9.2 in this regard; however, it is still an expensive operation and you have to decide how to handle the existing data. If there is no need for document fidelity, these changes can be abstracted from your database schema using the XMLType view storage model described previously.

By comparing and contrasting the three different DB storage options for XML data, the preceding discussion should give you a better idea of which is most appropriate for your purposes. Choosing carefully according to your applications’ needs is very important. This choice can literally mean the difference between merely creating a successful prototype and bringing it successfully to production.
