XML and PHP [Electronic resources] - text version

Vikram Vaswani

A Few Examples

Now that you know the theory, let's see how it works in some real-life examples. The following sections illustrate how PHP's SAX parser can be used to "do something useful" with XML data.

Formatting an XML Invoice for Display in a Web Browser

Consider the XML document in Listing 2.21, which contains an invoice for material delivered by Sammy's Sports Store.

Listing 2.21 XML Invoice (invoice.xml)

<?xml version="1.0"?>
<!DOCTYPE invoice
[
<!ENTITY message "Thank you for your purchases!">
<!ENTITY terms SYSTEM "terms.xml">
]>
<invoice>
    <customer>
        <name>Joe Wannabe</name>
        <address>
            <line>23, Great Bridge Road</line>
            <line>Bombay, MH</line>
        </address>
    </customer>
    <items>
        <item cid="AS633225">
            <desc>Oversize tennis racquet</desc>
        </item>
        <item cid="GT645">
            <desc>Championship tennis balls (can)</desc>
        </item>
        <item cid="U73472">
            <desc>Designer gym bag</desc>
        </item>
        <item cid="AD848383">
            <desc>Custom-fitted sneakers</desc>
        </item>
    </items>
    <?php displayTotal(); ?>
    <delivery>Next-day air</delivery>
    &terms;
</invoice>

The entity &terms; references the file terms.xml, which is shown in Listing 2.22.

Listing 2.22 Payment Terms and Conditions in XML (terms.xml)

<?xml version="1.0"?>
<terms>
    <term>Visa, Mastercard, American Express
    accepted. Checks will be accepted
    for orders totalling more than USD 5000.00</term>
    <term>All payments must be made in US currency</term>
    <term>Returns within 15 days</term>
    <term>International orders may be
    subject to additional customs duties and</term>
</terms>

This invoice contains many of the constructs you've just studied: PIs, external entities, and plain-vanilla elements and data. It therefore serves as a good proving ground to demonstrate how PHP, combined with SAX, can be used to format XML data for greater readability. The script in Listing 2.23 parses the previous XML data to create an HTML page that is suitable for printing or viewing in a browser.

Listing 2.23 Generating HTML Output from XML Data with SAX

<html>
<head>
<basefont face="Arial">
</head>
<body bgcolor="white">
<font size="+3">Sammy's Sports Store</font>
<font size="-2">14, Ocean View, CA 12345, USA http://www.sammysportstore.com/</font>
<?php
// element handlers
// these look up the element in the associative arrays
// and print the equivalent HTML code
function startElementHandler($parser, $name, $attribs)
{
    global $startTagsArray;
    // expose element being processed
    global $currentTag;
    $currentTag = $name;
    // look up element in array and print corresponding HTML
    if (isset($startTagsArray[$name]))
    {
        echo $startTagsArray[$name];
    }
}

function endElementHandler($parser, $name)
{
    global $endTagsArray;
    if (isset($endTagsArray[$name]))
    {
        echo $endTagsArray[$name];
    }
}

// character data handler
// this prints CDATA as it is found
function characterDataHandler($parser, $data)
{
    global $currentTag;
    global $subTotals;
    echo $data;
    // record subtotals for calculation of grand total
    // (whitespace-only chunks reported by the parser are skipped)
    if ($currentTag == "SUBTOTAL" && is_numeric(trim($data)))
    {
        $subTotals[] = (float) trim($data);
    }
}

// external entity handler
// if SYSTEM-type entity,
// this function looks up the entity and parses it
function externalEntityHandler($parser, $name, $base, $systemId, $publicId)
{
    if ($systemId)
    {
        parse($systemId);
        // explicitly return true
        return true;
    }
    else
    {
        return false;
    }
}

// PI handler
// this function processes PHP code if it finds any
function PIHandler($parser, $target, $data)
{
    // if php code, execute it
    if (strtolower($target) == "php")
    {
        eval($data);
    }
}

// this function adds up all the subtotals
// and prints a grand total
function displayTotal()
{
    global $subTotals;
    $total = 0;
    foreach($subTotals as $element)
    {
        $total += $element;
    }
    echo "<p> <b>Total payable: </b> " . $total;
}

// function to actually perform parsing
function parse($xml_file)
{
    // initialize parser
    $xml_parser = xml_parser_create();
    // set callback functions
    xml_set_element_handler($xml_parser, "startElementHandler", "endElementHandler");
    xml_set_character_data_handler($xml_parser, "characterDataHandler");
    xml_set_processing_instruction_handler($xml_parser, "PIHandler");
    xml_set_external_entity_ref_handler($xml_parser, "externalEntityHandler");
    // read XML file
    if (!($fp = fopen($xml_file, "r")))
    {
        die("File I/O error: $xml_file");
    }
    // parse XML
    while ($data = fread($fp, 4096))
    {
        // error handler
        if (!xml_parse($xml_parser, $data, feof($fp)))
        {
            $ec = xml_get_error_code($xml_parser);
            die("XML parser error (error code " . $ec . "): " .
                xml_error_string($ec) . "<br>Error occurred at line " .
                xml_get_current_line_number($xml_parser));
        }
    }
    // all done, clean up!
    fclose($fp);
    xml_parser_free($xml_parser);
}

// arrays to associate XML elements with HTML output
$startTagsArray = array(
    'CUSTOMER' => '<p> <b>Customer: </b>',
    'ADDRESS' => '<p> <b>Billing address: </b>',
    'DATE' => '<p> <b>Invoice date: </b>',
    'REFERENCE' => '<p> <b>Invoice number: </b>',
    'ITEMS' => '<p> <b>Details: </b>
<table width="100%" border="1" cellspacing="0">',
    'ITEM' => '<tr>',
    'DESC' => '<td>',
    'PRICE' => '<td>',
    'QUANTITY' => '<td>',
    'SUBTOTAL' => '<td>',
    'DELIVERY' => '<p> <b>Shipping option:</b> ',
    'TERMS' => '<p> <b>Terms and conditions: </b> <ul>',
    'TERM' => '<li>'
);

$endTagsArray = array(
    'LINE' => ',',
    'ITEMS' => '</table>',
    'ITEM' => '</tr>',
    'DESC' => '</td>',
    'PRICE' => '</td>',
    'QUANTITY' => '</td>',
    'SUBTOTAL' => '</td>',
    'TERMS' => '</ul>',
    'TERM' => '</li>'
);

// create array to hold subtotals
$subTotals = array();
// begin parsing
$xml_file = "invoice.xml";
parse($xml_file);
?>
</body>
</html>

Figure 2.1 shows what the end result looks like.

Figure 2.1. Results of converting the XML invoice into HTML with SAX.

How did I accomplish this? Quite easily, by using the various event handlers exposed by SAX, as the script in Listing 2.23 shows. See also Chapter 4, "PHP and Extensible Stylesheet Language Transformations (XSLT)."
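The lookup-table idea behind Listing 2.23 can be tried in isolation. The following is a minimal sketch rather than the book's listing: the trimmed-down $startTags and $endTags maps and the inline XML string are assumptions made for illustration, and closures stand in for the named global functions so that the snippet carries its own state.

```php
<?php
// Hypothetical, trimmed-down maps (not the full arrays from Listing 2.23)
$startTags = array('TERMS' => '<ul>', 'TERM' => '<li>');
$endTags   = array('TERMS' => '</ul>', 'TERM' => '</li>');
$html = '';

$parser = xml_parser_create();

// element handlers: append the mapped HTML, or nothing if no mapping exists
xml_set_element_handler(
    $parser,
    function ($p, $name, $attribs) use (&$html, $startTags) {
        if (isset($startTags[$name])) {
            $html .= $startTags[$name];
        }
    },
    function ($p, $name) use (&$html, $endTags) {
        if (isset($endTags[$name])) {
            $html .= $endTags[$name];
        }
    }
);

// character data handler: pass text through unchanged
xml_set_character_data_handler(
    $parser,
    function ($p, $data) use (&$html) {
        $html .= $data;
    }
);

// element names arrive in uppercase because case folding is on by default
xml_parse($parser, '<terms><term>Returns within 15 days</term></terms>', true);

echo $html;
```

With this input the script prints `<ul><li>Returns within 15 days</li></ul>`. Elements without an entry in either map simply produce no markup, which is why Listing 2.23 can ignore tags it has no use for.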

Parsing and Displaying RSS Data on a Web Site

Another fairly common application of PHP's SAX parser involves using it to parse RDF Site Summary (RSS) documents and extract data from them for display on a web site.

In case you didn't already know, RSS 1.0 documents are well-formed XML documents that conform to the W3C's Resource Description Framework (RDF) specification. RSS 1.0 documents typically contain a description of the content on a web site. Many popular portals publish these documents as an easy way to allow other web sites to syndicate and link to their content.

A Rich Resource

For more information on RSS and RDF, take a look at http://purl.org/rss/1.0/ for the RSS 1.0 specification, and also visit the W3C's web site for RDF at http://www.w3.org/RDF/. And then drop by this book's companion web site (http://www.xmlphp.com), which has links to tutorials on how to integrate RSS 1.0 content feeds into your own web site.

Listing 2.24 demonstrates what an RSS 1.0 document looks like.

Listing 2.24 RSS 1.0 document (fm-releases.rdf)

<?xml version="1.0" encoding="ISO-8859-1"?>
<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns="http://purl.org/rss/1.0/">
    <channel rdf:about="http://freshmeat.net/">
        <description>freshmeat.net maintains the
        Web's largest index of Unix and
        cross-platform open source
        software. Thousands of
        applications are meticulously
        cataloged in the freshmeat.net database,
        and links to new code are added</description>
        <dc:creator>freshmeat.net contributors</dc:creator>
        <dc:rights>Copyright (c) 1997-2002 OSDN</dc:rights>
        <items>
            <rdf:Seq>
                <rdf:li rdf:resource="http://freshmeat.net/releases/69583/" />
                <rdf:li rdf:resource="http://freshmeat.net/releases/69581/" />
                <!-- remaining items deleted -->
            </rdf:Seq>
        </items>
        <image rdf:resource="http://freshmeat.net/img/fmII-button.gif" />
        <textinput rdf:resource="http://freshmeat.net/search/" />
    </channel>
    <image rdf:about="http://freshmeat.net/img/fmII-button.gif">
    </image>
    <item rdf:about="http://freshmeat.net/releases/69583/">
        <title>sloop.splitter 0.2.1</title>
        <description>A real time sound effects program.</description>
    </item>
    <item rdf:about="http://freshmeat.net/releases/69581/">
        <title>apacompile 1.9.9</title>
        <description>A full-featured Apache compilation HOWTO.</description>
    </item>
    <!-- remaining items deleted -->
</rdf:RDF>

The Scent of Fresh Meat

The RSS 1.0 document in Listing 2.24 describes the content appearing on the front page of the popular open-source software portal Freshmeat.net (http://www.freshmeat.net/).

Freshmeat.net's RSS content feed is updated frequently with a list of the latest software added to the site; visit the web site for a copy of the latest version.

Now, this is a well-formed XML document, with clearly defined blocks for <channel> and <item> information. All that's needed now is some code to parse this document and return a list of the <item>s within it, together with the title, URL, and description of each.

With PHP's SAX parser, this is easy to accomplish. Listing 2.25 contains the code for a PHP class designed to parse the RSS document in Listing 2.24 and return PHP arrays containing the information within it. This information can then be formatted and displayed on a web page.

Listing 2.25 A PHP class to parse an RSS 1.0 document (rssparser.class.inc)

<?php
class RSSParser
{
    // class variables
    // holds name of element currently being parsed
    var $tag = "";
    // location variable indicating whether parser is within
    // item or channel block
    var $location = 0;
    // array counter
    var $counter = 0;
    // name of RSS file
    var $file = "";
    // associative array for channel data
    var $channelData = array();
    // nested array of arrays for item data
    // every element of this array will represent
    // one item in the channel
    var $itemData = array();
    // XML parser handle
    var $xmlParser;

    // class methods
    // set the name of the RSS file to parse
    // this is usually a local file
    // set it to a remote file only
    // if your PHP build supports fopen() over HTTP
    function setRSS($file)
    {
        $this->file = $file;
    }

    // element handlers
    // these keep track of the element currently being parsed
    // and adjust $location and $tag accordingly
    function startElementHandler($parser, $name, $attributes)
    {
        $this->tag = $name;
        if ($name == "ITEM")
        {
            // if entering item block
            // set location variable to 1
            $this->location = 1;
        }
        else if ($name == "CHANNEL")
        {
            // if entering channel block
            // set location variable to 2
            $this->location = 2;
        }
    }

    function endElementHandler($parser, $name)
    {
        $this->tag = "";
        // if exiting channel or item block
        // reset location variable to 0
        if ($name == "ITEM")
        {
            // move to the next slot in the item array
            $this->counter++;
            $this->location = 0;
        }
        else if ($name == "CHANNEL")
        {
            $this->location = 0;
        }
    }

    // character data handler
    // this function checks to see whether the parser is
    // currently reading channel or item information
    // and appends the information to the appropriate array
    function characterDataHandler($parser, $data)
    {
        $data = trim(htmlspecialchars($data));
        // only interested in these three elements...
        if ($this->tag == "TITLE" || $this->tag == "LINK" || $this->tag == "DESCRIPTION")
        {
            $key = strtolower($this->tag);
            // if within an item block
            // add data to item array
            if ($this->location == 1)
            {
                if (!isset($this->itemData[$this->counter][$key]))
                {
                    $this->itemData[$this->counter][$key] = "";
                }
                $this->itemData[$this->counter][$key] .= $data;
            }
            else if ($this->location == 2)
            {
                // else add it to channel array
                if (!isset($this->channelData[$key]))
                {
                    $this->channelData[$key] = "";
                }
                $this->channelData[$key] .= $data;
            }
        }
    }

    // data retrieval methods
    // this returns the array with channel information
    function getChannelData()
    {
        return $this->channelData;
    }

    // this returns the array with item information
    function getItemData()
    {
        return $this->itemData;
    }

    // all the work happens here
    // parse the specified RSS file
    // this populates the $channelData and $itemData arrays
    function parseRSS()
    {
        // create parser
        $this->xmlParser = xml_parser_create();
        // set object reference
        xml_set_object($this->xmlParser, $this);
        // configure parser behaviour
        xml_parser_set_option($this->xmlParser, XML_OPTION_CASE_FOLDING, TRUE);
        xml_parser_set_option($this->xmlParser, XML_OPTION_SKIP_WHITE, TRUE);
        // set up handlers
        xml_set_element_handler($this->xmlParser, "startElementHandler", "endElementHandler");
        xml_set_character_data_handler($this->xmlParser, "characterDataHandler");
        // read RSS file
        if (!($fp = fopen($this->file, "r")))
        {
            die("Could not read $this->file");
        }
        // begin parsing...
        while ($data = fread($fp, 2048))
        {
            if (!xml_parse($this->xmlParser, $data, feof($fp)))
            {
                die("The following error occurred: " .
                    xml_error_string(xml_get_error_code($this->xmlParser)));
            }
        }
        // destroy parser
        fclose($fp);
        xml_parser_free($this->xmlParser);
    }
// end of class
}
?>

This might look complicated, but it's actually pretty simple. The class simplifies working with an RDF file by parsing it and extracting the information within it into the following two arrays:

The $channelData associative array, which contains information on the channel title, URL, and description

The $itemData array, which is a two-dimensional array containing information (title, URL, and description) on the individual items in the channel list. The total number of elements in the $itemData array corresponds to the total number of <item> elements in the RSS document.
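The shape of these two arrays can be sketched as PHP literals. This is an illustration only, under stated assumptions: the item titles and descriptions are taken from Listing 2.24, the item 'link' values are assumed to equal each item's rdf:about URL, and the channel values are placeholders rather than data from the feed.

```php
<?php
// Placeholder channel values (assumed, not taken from the feed)
$channelData = array(
    'title'       => 'Example channel title',
    'link'        => 'http://freshmeat.net/',
    'description' => 'Example channel description',
);

// One associative array per <item>; 'link' values are assumed
// to match the rdf:about attribute of each item in Listing 2.24
$itemData = array(
    array(
        'title'       => 'sloop.splitter 0.2.1',
        'link'        => 'http://freshmeat.net/releases/69583/',
        'description' => 'A real time sound effects program.',
    ),
    array(
        'title'       => 'apacompile 1.9.9',
        'link'        => 'http://freshmeat.net/releases/69581/',
        'description' => 'A full-featured Apache compilation HOWTO.',
    ),
);

// The number of entries matches the number of <item> elements
$itemCount   = count($itemData);
$secondTitle = $itemData[1]['title'];

echo $itemCount . " items, second is " . $secondTitle . "\n";
```

Because the outer array is numerically indexed and each inner array uses the same three keys, a plain foreach loop is enough to walk the item list.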

The class also exposes the following public methods:

setRSS(): Sets the name of the RSS file to parse

parseRSS(): Actually parses the specified RSS file and places the information extracted from it into the two arrays

getChannelData(): Retrieves the array containing channel information

getItemData(): Retrieves the array containing the item list

When using this class (look at Listing 2.26 for a usage example), the first step is, obviously, to specify the name of the RSS file to parse. Once this has been specified and stored in a class variable, the parseRSS() method is invoked to actually parse the document.

This parseRSS() method does all the things you've become familiar with in this chapter: Create an XML parser, configure it, set up callback functions, and sequentially iterate through the document, calling appropriate handlers for each XML construct encountered. As the parser moves through the document, it uses the $location variable to identify its current location, and the $tag variable to identify the name of the element currently being parsed. Based on these two pieces of data, the character data handler knows which array to place the descriptive channel/item information into.
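The interplay of the location and tag variables can be seen in a stripped-down form. The sketch below is not the RSSParser class itself: the inline XML string, the variable names, and the use of closures instead of class methods are assumptions made to keep the example self-contained, and only TITLE elements are collected.

```php
<?php
// Hypothetical input: one channel title and two item titles
$xml = '<rdf><channel><title>Example channel</title></channel>'
     . '<item><title>First item</title></item>'
     . '<item><title>Second item</title></item></rdf>';

$location = 0;   // 0 = elsewhere, 1 = inside <item>, 2 = inside <channel>
$tag = '';       // name of the element currently being parsed
$counter = 0;    // index of the current item
$channelTitle = '';
$itemTitles = array();

$parser = xml_parser_create();

xml_set_element_handler(
    $parser,
    // start handler: remember the element name and adjust the location
    function ($p, $name, $attribs) use (&$location, &$tag) {
        $tag = $name;
        if ($name == 'ITEM') {
            $location = 1;
        } else if ($name == 'CHANNEL') {
            $location = 2;
        }
    },
    // end handler: clear the element name; leaving a block resets the location
    function ($p, $name) use (&$location, &$tag, &$counter) {
        $tag = '';
        if ($name == 'ITEM') {
            $location = 0;
            $counter++;
        } else if ($name == 'CHANNEL') {
            $location = 0;
        }
    }
);

xml_set_character_data_handler(
    $parser,
    // the two state variables decide where the text ends up
    function ($p, $data) use (&$location, &$tag, &$counter, &$channelTitle, &$itemTitles) {
        if ($tag != 'TITLE') {
            return;
        }
        if ($location == 1) {
            if (!isset($itemTitles[$counter])) {
                $itemTitles[$counter] = '';
            }
            $itemTitles[$counter] .= $data;
        } else if ($location == 2) {
            $channelTitle .= $data;
        }
    }
);

xml_parse($parser, $xml, true);

echo $channelTitle . ": " . implode(", ", $itemTitles) . "\n";
```

The same element name, TITLE, is routed to different destinations purely on the basis of where the parser currently is, which is the reason a SAX handler has to carry this kind of state by hand.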

An Object Lesson

Special mention should be made of the xml_set_object() function used within the parseRSS() class method in Listing 2.25. You've probably not seen this function before, so I'll take the opportunity to explain it a little.

The xml_set_object() function is designed specifically to associate an XML parser with a class, and to link class methods and parser callback functions together. Callback functions defined for the parser are assumed to be methods of the enveloping class.

In order to better understand why xml_set_object() is necessary, try commenting out the call to the xml_set_object() function in Listing 2.25, and see what happens.
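The underlying issue is that a handler given as a plain string is looked up as an ordinary function unless the parser has been told which object it belongs to. As a point of comparison, and not as the approach used in Listing 2.25, the sketch below binds the methods explicitly by passing array($this, 'methodName') callables; the ElementCounter class and its method names are hypothetical.

```php
<?php
// Hypothetical class that counts the elements in an XML string
class ElementCounter
{
    public $count = 0;

    function startElementHandler($parser, $name, $attributes)
    {
        // called once for every opening tag
        $this->count++;
    }

    function endElementHandler($parser, $name)
    {
        // nothing to do when an element closes
    }

    function countElements($xml)
    {
        $parser = xml_parser_create();
        // each callable names both the object and the method,
        // so the parser does not need a separate object association
        xml_set_element_handler(
            $parser,
            array($this, 'startElementHandler'),
            array($this, 'endElementHandler')
        );
        xml_parse($parser, $xml, true);
        return $this->count;
    }
}

$ec = new ElementCounter();
$n = $ec->countElements('<a><b/><b/></a>');

echo $n . "\n";
```

For the three-element input above, the script prints 3. Either way, the point is the same: the parser must know that the callbacks live on an object rather than in the global function table.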

Listing 2.26 demonstrates how the class from Listing 2.25 can be combined with the RSS document in Listing 2.24 to generate PHP arrays representing the RSS content, and how those arrays can then be manipulated to display the information as browser-readable HTML.

Listing 2.26 Parsing an RDF File and Formatting the Result as an HTML Document

<?php
// include class
include("rssparser.class.inc");
// instantiate a new RSSParser
$rp = new RSSParser();
// define the RSS 1.0 file to parse
$rp->setRSS("fm-releases.rdf");
// parse the file
$rp->parseRSS();
// get channel information
$channel = $rp->getChannelData();
// retrieve item list (array)
// every element of this array is itself an associative array
// with keys ('title', 'link', 'description')
$items = $rp->getItemData();
// uncomment the next line to see a list of object properties
// print_r($rp);
?>
<html>
<head><basefont face="Arial"></head>
<body>
<h2><?php echo $channel['title']; ?></h2>
<ul>
<?php
// iterate through item list
// print each item as a list item
// with hyperlink, title and description
foreach($items as $item)
{
    echo "<li>";
    echo "<a href=\"" . $item['link'] . "\">" . $item['title'] . "</a>";
    echo "<br>" . $item['description'];
}
?>
</ul>
</body>
</html>

The script in Listing 2.26 creates an instance of the RSSParser class and parses the specified RSS file via the parseRSS() class method. It then iterates through the arrays returned by the class methods getChannelData() and getItemData(), and formats the elements of these arrays for display.

Figure 2.2 demonstrates what the output of Listing 2.26 looks like.

Figure 2.2. The results of converting an RDF file into HTML with SAX.
