20.2. Using Cocoon
The central concept in Cocoon is that of an XML pipeline. A pipeline starts with a generator, a class that generates XML. Typically this is done by reading a file, but XML can also be obtained from a remote URL, read from a database, or generated from other data.
The XML is then passed through zero or more transformers, each of which modifies it in some way. An obvious candidate is a transformer that applies XSLT, as discussed in Chapter 18. Such a transformer is provided, along with many others that do less obvious things. Some of these will be covered in following sections.
After passing through transformers the XML is given to a serializer, which puts it in a final form that may not be XML. There are serializers that generate HTML and other standard Web formats, but there are also serializers that produce files such as zip archives, OpenOffice spreadsheets, and even images.
This provides a very flexible architecture for building applications. A complex report can be built to generate XML, and this report can then be deployed on the Web as HTML and to users' desktops as a spreadsheet by specifying two pipelines with different serializers. In addition, backend developers can concentrate on the model by building tools that create minimalistic XML, and page developers can concentrate on the view by writing XSLT transformers and style sheets. Cocoon calls this the "separation of concerns."
The set of available generators, transformers, and serializers is controlled by the sitemap.xmap file, as are the pipelines themselves. This file is large and complex and cannot be covered in detail; see the Cocoon documentation and the sample on the CD-ROM for details. An overview of the most important elements follows.
Generators, transformers, serializers, and other pluggable elements are called components and are defined in the map:components section of sitemap.xmap. Here is a minimal set of definitions:
<map:components>
  <map:generators default="file">
    <map:generator
      name="file"
      src="org.apache.cocoon.generation.FileGenerator"/>
  </map:generators>
  <map:transformers default="xslt">
    <map:transformer
      name="xslt"
      src="org.apache.cocoon.transformation.TraxTransformer"/>
  </map:transformers>
  <map:serializers default="html">
    <map:serializer
      mime-type="text/html"
      name="html"
      src="org.apache.cocoon.serialization.HTMLSerializer"/>
  </map:serializers>
</map:components>
Each of the three categories, map:generators, map:transformers, and map:serializers, contains individual entries of the appropriate type. Each may also specify a default, which can be used to abbreviate the definition of the pipelines.
Within each category individual entries are defined by a name and an implementing class. These classes must reside in the usual Tomcat directories to be available. Classes that rely on underlying Cocoon blocks may also require configuration in cocoon.xconf.
There are other elements that may be configured in this file, although they will not be covered here. The next piece of immediate interest is the pipelines. Each pipeline consists of a set of URL patterns to match against and a set of steps that should be performed in response to requests for a matching URL. For the purposes of this book, there is only a single pipeline defined in the global sitemap.xmap in the top-level directory, which is as follows:
<map:pipelines>
  <map:pipeline>
    <map:match pattern="*/**">
      <map:mount check-reload="yes"
        src="{1}/"
        uri-prefix="{1}"/>
    </map:match>
  </map:pipeline>
</map:pipelines>
This pipeline says that when a request comes in for a file in a subdirectory, as indicated by the presence of a slash, control is passed to the sitemap.xmap within that subdirectory. Note that even though web.xml specified that only requests for files under chapter 20 should be handled by Cocoon, Cocoon will still get its master configuration information from the global sitemap.xmap. The mount entry will pass control to a sitemap.xmap within the chapter 20 directory.
The pipeline entries in chapter20/sitemap.xmap are more interesting. Here is one that generates a basic page with some information about Cocoon:
<map:match pattern="sample1.html">
  <map:generate src="sample1.xml"/>
  <map:transform src="sample1.xslt"/>
  <map:serialize type="xhtml"/>
</map:match>
This entry specifies that requests for sample1.html should be handled by performing the following steps:
- Use the default generator (file) with a src of sample1.xml to obtain the initial XML.
- Use the default transformer (xslt) to transform this XML, using the sample1.xslt file.
- Use the xhtml serializer to generate XHTML to send to the user.
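The steps above lean on the defaults declared in map:components. As a rough sketch, the same entry can be written with each component named explicitly through a type attribute. The pattern sample1-explicit.html is a hypothetical name used only for illustration, and the entry assumes that components named file, xslt, and xhtml are declared in the sitemap.

```xml
<!-- Hypothetical entry: same steps, with no reliance on defaults -->
<map:match pattern="sample1-explicit.html">
  <!-- type="file" names the generator that would otherwise be the default -->
  <map:generate type="file" src="sample1.xml"/>
  <!-- type="xslt" names the transformer that would otherwise be the default -->
  <map:transform type="xslt" src="sample1.xslt"/>
  <!-- the serializer is not the default (html), so it is always named -->
  <map:serialize type="xhtml"/>
</map:match>
```

Spelling out the types is mostly useful when a sitemap declares several generators or transformers and the default is not the one wanted.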
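sample1.xml, which contains the data for this page, is shown in Listing 20.1, and the XSLT file that transforms it is shown in Listing 20.2; XSLT itself is covered in Chapter 18. The resulting page is shown in Figure 20.3.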
Figure 20.3. A simple page generated by Cocoon.

Listing 20.1. A sample XML page
<tools>
  <tool name="Cocoon" publisher="Apache">
    <description>
      XML publishing framework
    </description>
    <features>
      <feature>Generate XML from many sources</feature>
      <feature>Manipulate XML in many ways</feature>
    </features>
  </tool>
</tools>
Listing 20.2. An XSLT file to transform data into HTML
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns="http://www.w3.org/1999/xhtml">
  <xsl:param name="contextPath" select="'/cocoon'"/>
  <xsl:template match="tool">
    <html>
      <head>
        <title>
          Toolkit: <xsl:value-of select="@name"/>
        </title>
      </head>
      <body>
        <h1>
          <xsl:value-of select="@name"/>: Summary
        </h1>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="description">
    <p><xsl:apply-templates/></p>
  </xsl:template>
  <xsl:template match="features">
    <ul><xsl:apply-templates/></ul>
  </xsl:template>
  <xsl:template match="feature">
    <li><xsl:apply-templates/></li>
  </xsl:template>
</xsl:stylesheet>
To repurpose this page as plain text, it is just necessary to create another entry in the pipeline with a different serializer:
<map:match pattern="sample1.txt">
  <map:generate src="sample1.xml"/>
  <map:serialize type="text"/>
</map:match>
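The minimal component set shown earlier declares only an html serializer, so a serializer named text has to be declared before this entry can work. The fragment below is a sketch of such a declaration. It assumes the stock Cocoon 2.x class org.apache.cocoon.serialization.TextSerializer and a text/plain MIME type; both should be checked against the installed version.

```xml
<!-- Sketch: added inside the existing map:components section.
     The class name and MIME type are assumptions based on stock Cocoon 2.x. -->
<map:serializers default="html">
  <!-- the existing html entry stays here, unchanged -->
  <map:serializer
    name="text"
    mime-type="text/plain"
    src="org.apache.cocoon.serialization.TextSerializer"/>
</map:serializers>
```

The name attribute is what map:serialize type="text" refers to, so the two must match exactly.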
The result of this entry is shown in Figure 20.4. Note that the name of the toolkit does not appear in the text version of the page. This is due to the way the text serializer works; it emits all character data between tags and ignores all attributes. If this is not what is desired, it could be remedied by introducing a new XSLT file that would pull the name from the attribute and place it within the body of a tag.
An interesting and important feature of the file generator is that it can pull data from remote sources as well as from local files. Recall the discussion of the c:import tag from the standard tag library, which performed a similar function. This feature makes it easy to build pages that gather syndicated content. For example, http://www.slashdot.org offers its news stories as an XML stream that could be obtained with the following pipeline element:
<map:match pattern="slashfeed.xml">
  <map:generate src="http://slashdot.org/slashdot.xml"/>
  <map:serialize type="xml"/>
</map:match>
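Returning to the text version of the sample page: the text serializer drops attributes, so the toolkit name is lost. One possible remedy, sketched below, is a small XSLT file that copies the document unchanged except that it repeats each tool's name attribute as element content. The file name sample1-text.xslt and the name element are hypothetical and are not part of the book's samples.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sample1-text.xslt: hypothetical file, shown only as a sketch -->
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- By default, copy every node and attribute through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- For each tool, emit the name attribute as the text of a new
       name element, then process the tool's children as usual -->
  <xsl:template match="tool">
    <tool>
      <name><xsl:value-of select="@name"/></name>
      <xsl:apply-templates select="node()"/>
    </tool>
  </xsl:template>
</xsl:stylesheet>
```

To use it, a map:transform step with src set to this file would be placed between the map:generate and map:serialize steps of the text entry, so the name reaches the serializer as character data.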
Figure 20.4. An alternate text version of the simple page.

Figure 20.5. The slashdot feed.
Another important transformer, xinclude, allows one file to include another. A simple example of this reads one file, processes the includes, and sends the result as HTML:
<map:match pattern="includer.html">
  <map:generate src="includer.xml"/>
  <map:transform type="xinclude"/>
  <map:serialize type="xhtml"/>
</map:match>
The master file, includer.xml, is shown in Listing 20.3.
Listing 20.3. A file that includes another
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns:xi="http://www.w3.org/2001/XInclude">
  <head>
    <title>File that includes another</title>
  </head>
  <body>
    <p>This is from the master file</p>
    <xi:include href="included.xml"/>
  </body>
</html>
Note that the file to be included is specified in the including file, even though it is processed by the transformer specified in sitemap.xmap. Also note the import and use of the xi namespace, which is necessary for xinclude to function properly. This is a common theme in Cocoon; files to be transformed may need to import certain namespaces that the transformers will use to find the tags on which they act.
The included file contains the text "<p>This is some text from the included file</p>", and the result is a page with both paragraphs.
Like the file generator, xinclude can handle remote URLs as well as local files. Listing 20.4 shows a possible XML file that combines news from two sources.
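For reference, the included file in the simple example could be as small as the following sketch. It is reconstructed from the description in the text rather than copied from the book's samples, so the exact whitespace and any XML declaration are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- included.xml: a sketch based on the description in the text -->
<p>This is some text from the included file</p>
```

When the xinclude transformer runs, the xi:include element in the master file is replaced by this p element, which is why the resulting page contains both paragraphs.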
Listing 20.4. A news portal page
<?xml version="1.0" encoding="UTF-8"?>
<page
  xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Latest news</title>
  <content>
    <newsboxes>
      <newsbox title="Stories from Computerworld">
        <xi:include href=
          "http://www.computerworld.com/news/xml/6/0,5009,,00.xml"
        />
      </newsbox>
      <newsbox title="Stories from slashdot">
        <xi:include
        />
      </newsbox>
    </newsboxes>
  </content>
</page>
Combined with the proper XSLT file,