Perl Cd Bookshelf [Electronic resources] (text version)

22.2. Parsing XML into a DOM Tree


22.2.1. Problem


You want to use the Document Object
Model (DOM) to access and perhaps change the parse tree of an XML
file.

22.2.2. Solution


Use the XML::LibXML module from
CPAN:

use XML::LibXML;
my $parser = XML::LibXML->new( );
# parse XML held in a string ...
my $dom = $parser->parse_string($XML);
# ... or parse it from a file instead
# my $dom = $parser->parse_file($FILENAME);
my $root = $dom->getDocumentElement;

22.2.3. Discussion


DOM is a framework of classes for representing XML parse trees. Each
element is a node in the tree, and you can operate on a node to
find its child nodes (the XML elements, in this case), add another
child node, or move the node somewhere else in the tree. The
parse_string, parse_file, and
parse_fh (filehandle) methods all return a
DOM document object that you can use to find nodes in the tree.
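
As a rough illustration of finding a node's children, the sketch below lists the element children of the root node. The inline XML is a hypothetical stand-in for a books file, and child_element_names is a helper name invented for this example. It skips the whitespace text nodes that sit between elements.

```perl
use strict;
use warnings;
use XML::LibXML qw(:libxml);    # :libxml exports the node type constants

# Hypothetical stand-in document (an assumption, not the book's own data)
my $xml = <<'XML';
<books>
  <book id="1"><title>First Title</title></book>
  <book id="2"><title>Second Title</title></book>
</books>
XML

# Return the names of a node's element children, ignoring text nodes
sub child_element_names {
    my ($node) = @_;
    return map  { $_->nodeName }
           grep { $_->nodeType == XML_ELEMENT_NODE } $node->childNodes;
}

my $dom  = XML::LibXML->new->parse_string($xml);
my $root = $dom->documentElement;
print join(", ", child_element_names($root)), "\n";    # prints: book, book
```

Without the nodeType filter, the list would also include the text nodes that hold the indentation between the book elements.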

For example, given the books XML from Example 22-1, Example 22-2 shows one way to
print the titles.

Example 22-2. dom-titledumper


#!/usr/bin/perl -w
# dom-titledumper -- display titles in books file using DOM
use XML::LibXML;
use Data::Dumper;
use strict;
my $parser = XML::LibXML->new;
my $dom = $parser->parse_file("books.xml") or die;
# get all the title elements
my @titles = $dom->getElementsByTagName("title");
foreach my $t (@titles) {
    # get the text node inside the <title> element, and print its value
    print $t->firstChild->data, "\n";
}

The getElementsByTagName method returns a list of the
element nodes in the document that have the given tag
name. Here we get a list of the title elements,
then go through each title to find its contents.
We know that each title contains only a single piece of
text, so we assume the first child node is a text node and print its
contents.

If we wanted to confirm that the node was a text node, we could say:

die "the title contained something other than text!"
    if $t->firstChild->nodeType != 3;

This ensures that the first node is of type 3 (text). Table 22-1 shows LibXML's numeric node types, which the
nodeType method returns.

Table 22-1. LibXML's numeric node types

Node type                 Number
Element                   1
Attribute                 2
Text                      3
CDATA Section             4
Entity Ref                5
Entity                    6
Processing Instruction    7
Comment                   8
Document                  9
Document Type             10
Document Fragment         11
Notation                  12
HTML Document             13
DTD Node                  14
Element Decl              15
Attribute Decl            16
Entity Decl               17
Namespace Decl            18
XInclude Start            19
XInclude End              20
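
Rather than hardcoding the numbers, you can compare against the named constants that XML::LibXML exports through its :libxml tag, such as XML_TEXT_NODE. The following is a minimal sketch, assuming a hypothetical helper named title_text and a made-up inline document. It also guards against an empty title, where firstChild returns undef.

```perl
use strict;
use warnings;
use XML::LibXML qw(:libxml);    # exports XML_TEXT_NODE and friends

# Return the text of a <title> element, or die if its first child is not text
sub title_text {
    my ($title) = @_;
    my $first = $title->firstChild;
    die "the title is empty!\n" unless defined $first;
    die "the title contained something other than text!\n"
        unless $first->nodeType == XML_TEXT_NODE;
    return $first->data;
}

# Hypothetical input: one plain title and one with nested markup
my $dom = XML::LibXML->new->parse_string(
    '<book><title>First Title</title><title><b>x</b></title></book>');
my ($good, $bad) = $dom->getElementsByTagName("title");

print title_text($good), "\n";
print "second title rejected\n" unless eval { title_text($bad); 1 };
```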

You can also create and insert new nodes, or move and delete existing
ones, to change the parse tree. Example 22-3 shows
how you would add a randomly generated price value
to each book element.

Example 22-3. dom-addprice


#!/usr/bin/perl -w
# dom-addprice -- add price element to books
use XML::LibXML;
use Data::Dumper;
use strict;
my $parser = XML::LibXML->new;
my $dom = $parser->parse_file("books.xml") or die;
my $root = $dom->documentElement;
# get list of all the "book" elements
my @books = $root->getElementsByTagName("book");
foreach my $book (@books) {
    my $price = sprintf("\$%d.95", 19 + 5 * int rand 5); # random price
    my $price_text_node = $dom->createTextNode($price);  # contents of <price>
    my $price_element = $dom->createElement("price");    # create <price>
    $price_element->appendChild($price_text_node);       # put contents into <price>
    $book->appendChild($price_element);                  # put <price> into <book>
}
print $dom->toString;




We use createTextNode and createElement to build the new
price element and its contents. Then we use
appendChild to add the element to the end of the
current book element's existing contents. The
toString method emits the document as XML, which
makes it easy to write XML filters like this one using DOM.
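
Moving and deleting nodes follows the same pattern. The sketch below is one possible approach rather than part of the original recipe: it deletes each book's edition element with removeChild, then detaches the price element with unbindNode and reinserts it at the front with insertBefore. The inline XML and the helper name price_first_without_edition are assumptions made for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical input, written without whitespace so the output is easy to read
my $xml = '<books><book><title>First Title</title>'
        . '<edition>1</edition><price>$19.95</price></book></books>';

# Delete <edition> from every <book> and move <price> to be the first child
sub price_first_without_edition {
    my ($dom) = @_;
    foreach my $book ($dom->getElementsByTagName("book")) {
        # delete: remove each direct <edition> child from its parent
        foreach my $edition ($book->getChildrenByTagName("edition")) {
            $book->removeChild($edition);
        }
        # move: detach <price>, then reinsert it before the first child
        my ($price) = $book->getChildrenByTagName("price");
        next unless $price;
        $price->unbindNode;
        $book->insertBefore($price, $book->firstChild);
    }
    return $dom;
}

my $dom = price_first_without_edition(XML::LibXML->new->parse_string($xml));
print $dom->documentElement->toString, "\n";
```

Calling toString on the root element rather than on the document omits the XML declaration from the output.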

The XML::LibXML::DOM manpage gives a quick introduction to the
features of XML::LibXML's DOM support and references the manpages for
the DOM classes (e.g., XML::LibXML::Node). Those manpages list the
methods for the objects.

22.2.4. See Also


The documentation for the XML::LibXML::DOM, XML::LibXML::Document,
XML::LibXML::Element, and XML::LibXML::Node modules
