Perl Cd Bookshelf [Electronic resources] (text version)


22.3. Parsing XML into SAX Events


22.3.1. Problem


You want to receive Simple API for XML (SAX) events from an XML parser
because event-based parsing is faster and uses less memory than
building a DOM tree.


22.3.2. Solution


Use the XML::SAX module from CPAN:

use XML::SAX::ParserFactory;
use MyHandler;
my $handler = MyHandler->new( );
my $parser = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_uri($FILENAME);
# or
$parser->parse_string($XML);

Logic for handling events goes into the handler class (MyHandler in
this example), which you write:

# in MyHandler.pm
package MyHandler;
use base qw(XML::SAX::Base);
sub start_element {    # method names are specified by SAX
    my ($self, $data) = @_;
    # $data is a hash reference with keys like Name and Attributes
    # ...
}
# other possible methods include end_element( ) and characters( )
1;

22.3.3. Discussion





An XML processor that uses SAX has three
parts: the XML parser that generates SAX events, the handler that
reacts to them, and the stub that connects the two. The XML parser
can be XML::Parser, XML::LibXML, or the pure Perl XML::SAX::PurePerl
that comes with XML::SAX. The XML::SAX::ParserFactory module selects
a parser for you and connects it to your handler. Your handler takes
the form of a class that inherits from XML::SAX::Base. The stub is
the program shown in the Solution.
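
As a minimal sketch of the three parts working together, the script below puts a handler and its stub in one file so it can be run directly. The handler name ElementCounter and the sample XML string are assumptions made for illustration; only the XML::SAX::ParserFactory and XML::SAX::Base calls come from the Solution.

```perl
use strict;
use warnings;
use XML::SAX::ParserFactory;

# The handler: reacts to SAX events. ElementCounter is a hypothetical
# name; it tallies how often each element name is opened.
{
    package ElementCounter;
    use base qw(XML::SAX::Base);

    sub start_element {
        my ($self, $data) = @_;
        $self->{count}{ $data->{Name} }++;
    }
}

# The stub: asks the factory for a parser (whichever SAX parser is
# installed) and connects it to the handler.
my $handler = ElementCounter->new( );
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);

# Sample input, invented for this sketch.
$parser->parse_string('<list><item>a</item><item>b</item></list>');

my %count = %{ $handler->{count} };
printf "%-5s %d\n", $_, $count{$_} for sort keys %count;
```

Keeping the tally in the handler object rather than in a global lets the stub read the result back after parsing.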

The XML::SAX::Base module provides stubs for the different methods
that the XML parser calls on your handler. Those methods are listed in
Table 22-2, and are the methods defined by the SAX1 and SAX2 standards
at http://www.saxproject.org/. The Perl implementation uses more
Perl-ish data structures and is described in the XML::SAX::Intro
manpage.

Table 22-2. XML::SAX::Base methods

start_document          end_document            characters
start_element           end_element             processing_instruction
ignorable_whitespace    set_document_locator    skipped_entity
start_prefix_mapping    end_prefix_mapping      comment
start_cdata             end_cdata               entity_reference
notation_decl           unparsed_entity_decl    element_decl
attlist_decl            doctype_decl            xml_decl
entity_decl             attribute_decl          internal_entity_decl
start_dtd               end_dtd                 external_entity_decl
resolve_entity          start_entity            end_entity
warning                 error                   fatal_error
The two data structures you need most often are those representing
elements and attributes. The $data parameter to
start_element and end_element
is a hash reference. The keys of the hash are given in
Table 22-3.

Table 22-3. An XML::SAX element hash

Key             Meaning
Prefix          XML namespace prefix (e.g., email)
LocalName       Element name without prefix (e.g., to)
Name            Fully qualified element name (e.g., email:to)
Attributes      Hash of attributes of the element
NamespaceURI    URI of the XML namespace for this element

An attribute hash has a key for each attribute. The key is structured
as "{namespaceURI}attrname". For example, if the attribute msgid is in
the namespace http://example.com/dtds/mailspec/, the key in the
attribute hash is:

{http://example.com/dtds/mailspec/}msgid

The attribute value is a hash; its keys are given in Table 22-4.

Table 22-4. An XML::SAX attribute hash

Key             Meaning
Prefix          XML namespace prefix (e.g., email)
LocalName       Attribute name without prefix (e.g., to)
Name            Fully qualified attribute name (e.g., email:to)
Value           Value of the attribute
NamespaceURI    URI of the XML namespace for this attribute
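
The element and attribute hashes can be explored with a handler that copies their fields as each element opens. This is a sketch under stated assumptions: the handler name HashInspector and the sample message are invented here. It walks the values of the attribute hash rather than looking up "{namespaceURI}attrname" keys, so it does not depend on how a particular parser reports namespaces.

```perl
use strict;
use warnings;
use XML::SAX::ParserFactory;

{
    package HashInspector;    # hypothetical handler name
    use base qw(XML::SAX::Base);

    sub start_element {
        my ($self, $data) = @_;

        # Each value in the attribute hash is itself a hash with
        # Name, Value, LocalName, Prefix, and NamespaceURI keys.
        my %attrs;
        for my $attr (values %{ $data->{Attributes} }) {
            $attrs{ $attr->{Name} } = $attr->{Value};
        }

        push @{ $self->{seen} }, {
            Name         => $data->{Name},
            LocalName    => $data->{LocalName},
            Prefix       => $data->{Prefix},
            NamespaceURI => $data->{NamespaceURI},
            attrs        => \%attrs,
        };
    }
}

# Sample input for the sketch, reusing the namespace URI from the text.
my $xml = '<email:to xmlns:email="http://example.com/dtds/mailspec/"'
        . ' msgid="42">nat</email:to>';

my $handler = HashInspector->new( );
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string($xml);

my @seen = @{ $handler->{seen} };
for my $el (@seen) {
    print "element $el->{Name}\n";
    for my $key (qw(LocalName Prefix NamespaceURI)) {
        printf "  %-12s %s\n", $key, $el->{$key} // '';
    }
    printf "  attribute %s = %s\n", $_, $el->{attrs}{$_}
        for sort keys %{ $el->{attrs} };
}
```

Depending on the parser, the xmlns:email declaration may or may not show up among the attributes, which is why the sketch prints whatever it finds instead of expecting a fixed set.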

Example 22-4 shows how to list the book titles using
SAX events. It's more complex than the DOM solution because with SAX
we must keep track of where we are in the XML document.

Example 22-4. sax-titledumper


# in TitleDumper.pm
# TitleDumper.pm -- SAX handler to display titles in books file
package TitleDumper;
use base qw(XML::SAX::Base);
my $in_title = 0;
# if we're entering a title, increase $in_title
sub start_element {
    my ($self, $data) = @_;
    if ($data->{Name} eq 'title') {
        $in_title++;
    }
}
# if we're leaving a title, decrease $in_title and print a newline
sub end_element {
    my ($self, $data) = @_;
    if ($data->{Name} eq 'title') {
        $in_title--;
        print "\n";
    }
}
# if we're in a title, print any text we get
sub characters {
    my ($self, $data) = @_;
    if ($in_title) {
        print $data->{Data};
    }
}
1;
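
The handler above still needs a stub to drive it. The sketch below is one possible arrangement, not the book's own driver: it uses a hypothetical variant named TitleCollector that keeps its state in the handler object instead of a file-scoped variable and gathers the titles into an array, so the caller decides what to do with them. The sample books document is invented for illustration.

```perl
use strict;
use warnings;
use XML::SAX::ParserFactory;

{
    package TitleCollector;    # hypothetical variant of TitleDumper
    use base qw(XML::SAX::Base);

    # Reset per-document state when parsing begins.
    sub start_document {
        my ($self) = @_;
        $self->{titles}   = [];
        $self->{in_title} = 0;
    }
    # Entering a title: start a fresh text buffer.
    sub start_element {
        my ($self, $data) = @_;
        if ($data->{Name} eq 'title') {
            $self->{in_title}++;
            $self->{buffer} = '';
        }
    }
    # Text may arrive in several pieces, so append rather than assign.
    sub characters {
        my ($self, $data) = @_;
        $self->{buffer} .= $data->{Data} if $self->{in_title};
    }
    # Leaving a title: save the accumulated text.
    sub end_element {
        my ($self, $data) = @_;
        if ($data->{Name} eq 'title') {
            $self->{in_title}--;
            push @{ $self->{titles} }, $self->{buffer};
        }
    }
}

# Sample input, invented for this sketch.
my $xml = <<'XML';
<books>
  <book id="1"><title>Programming Perl</title></book>
  <book id="2"><title>Perl &amp; LWP</title></book>
</books>
XML

my $handler = TitleCollector->new( );
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string($xml);

my @titles = @{ $handler->{titles} };
print "$_\n" for @titles;
```

To run the original TitleDumper instead, the same three stub lines apply with TitleDumper->new( ) as the handler and parse_uri on the books file.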

The XML::SAX::Intro manpage provides a gentle introduction to
XML::SAX parsing.

22.3.4. See Also


Chapter 5 of Perl & XML; the documentation
for the CPAN modules XML::SAX, XML::SAX::Base, and
XML::SAX::Intro







Copyright © 2003 O'Reilly & Associates. All rights reserved.