Perl CD Bookshelf [Electronic resources], text version

22.8. Processing Files Larger Than Available Memory


22.8.1. Problem




You want to work with a large XML
file, but you can't read it into memory to form a DOM or other kind
of tree because it's too big.

22.8.2. Solution


Use SAX (as described in Recipe 22.3) to
process events instead of building a tree.

Alternatively, use XML::Twig to build
trees only for the parts of the document you want to work with (as
specified by XPath expressions):

use XML::Twig;
my $twig = XML::Twig->new( twig_handlers => {
                             $XPATH_EXPRESSION => \&HANDLER,
                             # ...
                           });
$twig->parsefile($FILENAME);
$twig->flush( );

You can call many DOM-like methods from within a handler, but
only the elements matched by the XPath expression (and whatever
those elements enclose) are built into a tree.
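
As a rough illustration of the handler-plus-flush pattern in the Solution, the sketch below upper-cases each book title and streams the document back out one book at a time. The helper name upcase_titles and the sample data are hypothetical; the books/book/title element names are assumed from the example file used in the Discussion. It assumes that flush accepts an optional filehandle, as the XML::Twig documentation describes.

```perl
use strict;
use warnings;
use XML::Twig;

# Upper-case every book title in $xml and stream the result to $out.
# upcase_titles is a hypothetical helper, not part of XML::Twig.
sub upcase_titles {
    my ($xml, $out) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => {
            '/books/book' => sub {
                my ($t, $book) = @_;
                my $title = $book->first_child('title');
                $title->set_text(uc $title->text) if $title;
                $t->flush($out);   # print everything parsed so far, then free it
            },
        },
    );
    $twig->parse($xml);    # use parsefile($FILENAME) for a file on disk
    $twig->flush($out);    # print whatever remains, such as closing tags
}

# Hypothetical sample data using the books/book/title element names.
my $sample = '<books><book><title>Perl Cookbook</title></book></books>';
upcase_titles($sample, \*STDOUT);
```

Calling flush inside the handler is what keeps memory use bounded: each book is written out and discarded before the next one is parsed. If you only read from the document and do not need to reproduce it, purge frees the same memory without printing anything.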

22.8.3. Discussion


DOM modules turn the entire document into a tree, regardless of
whether you use all of it. With SAX modules, no tree is built at
all, so if your task depends on document structure, you must keep
track of that structure yourself. A happy middle ground is XML::Twig,
which creates DOM trees only for the bits of the file that you're
interested in. Because you work with files a piece at a time, you can
cope with very large files by processing pieces that fit in memory.
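
To make the SAX trade-off concrete, here is a minimal sketch of what "keeping track of structure yourself" can look like: a handler that maintains a stack of open element names so that it can tell a title inside a book from any other title. The class name TitleCollector, the helper collect_titles, and the sample data are hypothetical, and the sketch assumes a document without namespaces.

```perl
use strict;
use warnings;

# Hypothetical handler class: collects the text of every <title> that is a
# direct child of a <book>, using a stack to remember the open elements.
package TitleCollector;
use base 'XML::SAX::Base';

sub start_document {
    my ($self) = @_;
    $self->{stack}  = [];   # names of the currently open elements
    $self->{titles} = [];   # collected title strings
}

sub start_element {
    my ($self, $el) = @_;
    push @{ $self->{stack} }, $el->{LocalName};
    $self->{text} = '' if $self->_in_book_title;
}

sub characters {
    my ($self, $chars) = @_;
    # Text can arrive in several chunks, so append rather than assign.
    $self->{text} .= $chars->{Data} if $self->_in_book_title;
}

sub end_element {
    my ($self, $el) = @_;
    push @{ $self->{titles} }, $self->{text} if $self->_in_book_title;
    pop @{ $self->{stack} };
}

# True when the innermost open element is a title directly inside a book.
sub _in_book_title {
    my ($self) = @_;
    my @open = @{ $self->{stack} };
    return @open >= 2 && $open[-1] eq 'title' && $open[-2] eq 'book';
}

package main;
use XML::SAX::ParserFactory;

# Hypothetical helper: returns the book titles found in an XML string.
sub collect_titles {
    my ($xml) = @_;
    my $handler = TitleCollector->new;
    my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
    $parser->parse_string($xml);   # parse_uri($FILENAME) reads from a file
    return @{ $handler->{titles} };
}

# Hypothetical sample data using the books/book/title element names.
my $sample = '<books><book><title>Perl Cookbook</title></book></books>';
print "$_\n" for collect_titles($sample);
```

Memory use stays flat because only the element stack and the collected titles are retained, but all of the bookkeeping that a tree would do for you now lives in the handler.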

For example, to print the titles of books in
books.xml (Example 22-1), you
could write:

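use XML::Twig;

# Build a tree only for each /books/book element.
my $twig = XML::Twig->new( twig_roots => { '/books/book' => \&do_book });
$twig->parsefile("books.xml");
$twig->purge( );

# Handlers receive the twig and the matched element.
sub do_book {
  my ($twig, $book) = @_;
  my ($title) = $book->find_nodes("title");
  print $title->text, "\n" if $title;
  $twig->purge;    # free this book before the next one is parsed
}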

For each book element, XML::Twig calls
do_book on its contents. That subroutine finds the
title node and prints its text. Rather than having
the entire file parsed into a DOM structure, we keep only one
book element at a time.

Consult the XML::Twig manpages for details on how much DOM and XPath
the module supports. Its coverage is not complete, but it's growing all
the time. XML::Twig uses XML::Parser for its XML parsing, and as a result
the functions available on nodes are slightly different from those
provided by XML::LibXML's DOM parsing.

22.8.4. See Also


Recipe 22.6; the documentation for the module
XML::Twig
