Hack 74. Get a Taste of E4X Scripting


Learn the future of XML scripting techniques.
The JavaScript language is officially named ECMAScript. E4X (ECMAScript for XML) is the
new ECMA-357 standard (at http://www.ecma-international.org) that
extends Edition 3 of ECMAScript. It adds a drop of extra syntax that
makes it easy to manipulate XML. This hack presents a brief tour of
these features.
E4X is new, and it isn't available in Firefox 1.0.
It is being implemented rapidly, though, and is likely to be present
in the 1.1 release or thereabouts. It's fun to play
with, and it's the future of XML scripting.
6.18.1. Where E4X Fits In
JavaScript code is just one way of manipulating the content of an XML
document in the browser. If you want to do so, you have to embed or
attach a JavaScript script to that document. That remains the case
for E4X.
Once JavaScript is running inside a web document, there are four
established ways to interact with that document's
content. To briefly review, you can:
Use widely supported but nonstandard DOM 0 JavaScript host objects,
such as document.images[3]. Strengths: simple.
Weaknesses: limited to HTML.
Use W3C DOM 1, 2, or 3 interfaces, usually starting with
document.getElementById() or Microsoft's nonstandard document.all().
Strengths: generic to all XML. Weaknesses: verbose.
Use nonstandard .innerHTML features to turn a
string containing XML into DOM content, or more rarely use the DOM 3
Load & Save interfaces. Strengths: simple. Weaknesses:
nonstandard.
Use XPath query patterns [Hack #63], perhaps inside XSLT [Hack #64]. Strengths: powerful.
Weaknesses: limited coding features.
E4X uses syntax as simple as the first method to achieve all of the
most common uses of the other three. It consists of:
A starting point integrated with the familiar (to many) JavaScript
environment
New, simple syntax that makes HTML and XML element access easy
Four new native JavaScript objects, if you happen to like objects
A near-invisible connector to the non-E4X DOM objects in the XML
document
6.18.2. Setting Up a Playpen for E4X
You can play with E4X in any old web page, with one restriction. If
your script wants to take advantage of E4X, then this won't work for you:
<script type="text/javascript" src="/image/library/english/10055_/image/library/english/10055_/image/library/english/10055_test.js"></script>
This is the new way forward:
<script type="text/javascript;e4x=1" src="/image/library/english/10055_/image/library/english/10055_/image/library/english/10055_test.js"></script>
For this simple exploration, it doesn't matter too
much what your test page looks like. Since E4X is intended for XML
first and foremost, we're doing the right thing,
using XHTML 1.0 instead of HTML 4.01. Here's a
simple test document:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript;e4x=1" src="/image/library/english/10055_/image/library/english/10055_/image/library/english/10055_test.js" />
</head>
<body>
Test me! <p id="tag-with-no-content" />
</body>
</html>
Notice the special value for the type attribute.
Suppose we also have both the document and a specific tag available
in two convenient variables. These might be set up somehow by the
/image/library/english/10055_/image/library/english/10055_/image/library/english/10055_test.js script:
var doc = window.document;
var tag = document.getElementById("tag-with-no-content");
So far, this is all plain JavaScript 1.5 or ECMAScript Edition 3.
Remember, you have to have an E4X-enabled version of Firefox. Check
the release notes for 1.1 and later for details. At worst, you can
compile the SpiderMonkey test program called simply
js. But that's another story.
6.18.3. Experiment with E4X Features
So let's play. Let's first add two XML tags the old way. We'll add
non-XHTML tags, just for fun. First, let's use dirty nonstandard tricks:
tag.innerHTML = '<list><item type="round">ball</item></list>';
Note how two types of quotes are required, but at least
it's short. Next, let's use
existing DOM standards:
var text = doc.createTextNode("ball"); // create only; it is attached to <item> below
var list = doc.createElement("list");
var item = doc.createElement("item");
item.setAttribute("type", "round");
item.appendChild(text);
list.appendChild(item);
tag.appendChild(list);
That's quite verbose, but at least
it's portable. Now, use E4X standard object syntax:
// 'tag' is an E4X object as well as a DOM one.
tag.list.item = "ball"; // add the two tags, and innermost content.
tag.list.item.@type = "round"; // add an attribute, and give it a value.
The syntax on the left side is the E4X quick way of stepping down
through a tag or element hierarchy. That's
convenient by itself. Even better though, if the tags
don't exist, they're automatically
created as they're referenced (because the
tag object is an XML object). Best of all, the
right-hand side is automatically added as the content of the
specified left-hand side. Very simple.
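E4X itself needs an engine that implements ECMA-357, so as a rough illustration of the create-on-assignment behavior just described, here is a plain JavaScript sketch over an ordinary object tree. The node shape (name, attrs, children, text) and the helpers makeNode, childNamed, and assignPath are hypothetical names invented for this sketch; they are not part of E4X or the DOM.

```javascript
// Hypothetical tree node for illustration; not an E4X or DOM type.
function makeNode(name) {
  return { name: name, attrs: {}, children: [], text: "" };
}

// Return the first child with the given name, creating it if it is missing.
function childNamed(node, name) {
  for (var i = 0; i < node.children.length; i++) {
    if (node.children[i].name === name) {
      return node.children[i];
    }
  }
  var created = makeNode(name);
  node.children.push(created);
  return created;
}

// Walk down 'path', creating elements as needed, then set the text content.
// Loosely mirrors: tag.list.item = "ball";
function assignPath(root, path, value) {
  var node = root;
  for (var i = 0; i < path.length; i++) {
    node = childNamed(node, path[i]);
  }
  node.text = value;
  return node;
}

var p = makeNode("p");
var itemNode = assignPath(p, ["list", "item"], "ball");
itemNode.attrs.type = "round"; // loosely mirrors: tag.list.item.@type = "round";
```

Calling assignPath a second time with the same path reuses the existing elements rather than adding duplicates, which is the convenience the E4X dotted syntax provides without any helper functions.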
E4X provides an alternate, XML-based syntax. Here's
the same addition as the previous bit of code:
var list = <list>
<item type="round">ball</item>
</list>;
tag += list;
In this example, the XML is literally and
lexically part of the JavaScript syntax. No quotes or translation
functions are required; it's all automatic. Note how
the XML isn't trapped inside a string. The second
assignment adds special and convenient semantics to the
+= operator. It's equivalent to
this DOM 1 code:
tag.appendChild(list);
Suppose the data added isn't all static. E4X XML
content can also be constructed using variables or expressions, as
shown in this final alternative:
var shape = "round";
var type = 1;
var thing = "item";
var list = <list>
<{thing} type={shape}>
{ (type == 1) ? "ball" : "stick" }
</{thing}>
</list>;
tag += list;
Everywhere you see braces (curly brackets), a JavaScript expression
can be put in. To do the same thing using traditional notations, even
.innerHTML, used to require complicated string
concatenations. Not anymore.
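For comparison, the string-concatenation approach that the brace syntax replaces might look like the following plain JavaScript sketch. The helpers escapeXml and buildList are hypothetical names written for this example, and the escaping has to be done by hand.

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the same <list> markup as a string, piece by piece.
function buildList(thing, shape, type) {
  var content = (type == 1) ? "ball" : "stick";
  return "<list>" +
    "<" + thing + ' type="' + escapeXml(shape) + '">' +
    escapeXml(content) +
    "</" + thing + ">" +
    "</list>";
}

var markup = buildList("item", "round", 1);
// markup is '<list><item type="round">ball</item></list>'
```

The result is still only a string, so it would need a further step such as .innerHTML before it becomes part of the document.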
You can also query the XML content very simply using E4X; no
XPath is required. This line returns any and all tags named
<item> that are in the tag hierarchy held by
list, no matter how deeply nested they are:
var items = list..item;
A set of objects is returned if there's more than
one match. For our sample data, there's only one
match. This syntax is equivalent to the use of //
in XPath.
This further line returns all immediate child tags of the
<list> tag, which happens to be the same
result as the previous case:
var items = list.*;
This final line returns all the attributes of the
<item> tag as a list:
var atts = list.item.@*;
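To make the difference between these three queries concrete without an E4X engine, here is a plain JavaScript sketch that models them over an ordinary object tree. The node shape and the functions descendants, childrenOf, and attributesOf are hypothetical stand-ins invented for this example; they only approximate what the E4X operators return.

```javascript
// Hypothetical plain-object version of the sample data.
var listNode = {
  name: "list",
  attrs: {},
  children: [
    { name: "item", attrs: { type: "round" }, children: [] }
  ]
};

// Rough analogue of the .. operator: matches at any depth below 'node'.
function descendants(node, name) {
  var found = [];
  for (var i = 0; i < node.children.length; i++) {
    var child = node.children[i];
    if (child.name === name) {
      found.push(child);
    }
    found = found.concat(descendants(child, name));
  }
  return found;
}

// Rough analogue of .* : immediate children only.
function childrenOf(node) {
  return node.children.slice();
}

// Rough analogue of .@* : the attributes as name/value pairs.
function attributesOf(node) {
  var names = Object.keys(node.attrs);
  var pairs = [];
  for (var i = 0; i < names.length; i++) {
    pairs.push({ name: names[i], value: node.attrs[names[i]] });
  }
  return pairs;
}

var foundItems = descendants(listNode, "item");
var kids = childrenOf(listNode);
var itemAtts = attributesOf(foundItems[0]);
```

With this sample data the descendant search and the child listing find the same single element. They would differ as soon as an item were nested more deeply.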
E4X supports XML namespaces and a few
other goodies as well. Unlike the ECMAScript standard, the
E4X standard is easy to read. Download a
copy today from http://www.ecma-international.org/.