13.1 Creating a Report
After you've decided you need to produce printable output, the next
step is to decide what format to produce it in and what tools to use
to create it. As Andrew Tanenbaum said in his classic quote from the
first edition of his book, Computer Networks, "The nice thing about
standards is that you have so many to choose from; furthermore, if
you do not like any of them, you can just wait for next year's
model." He might have foreseen web reporting!

So, there are many things to think about in deciding on reporting
tools and formats:
Middle-tier platform
What platform is your middle tier installed on? If it's Microsoft
Windows, one choice for creating reports is Microsoft Word, and a
portable format to produce is RTF (Rich Text Format). If it's a Unix
environment, PostScript can be produced with several tools and is a
format that Unix users know well. However, for almost all platforms,
Adobe's PDF (Portable Document Format) has a wide range of tools and
libraries for production.
Client platform
What platform do your users use? The answer is most likely Microsoft
Windows, so a format that's friendly to those users is essential.
Importantly, reporting tools are similar to browsers: you are
unlikely to have control over the user's environment, and the best
approach is to choose a format that most users are likely to be able
to view.
Richness of content
What features do you need? Are you producing reports that contain
images, text, graphics, tables, forms, graphs, or a combination of
those? Do you only need to produce a printable copy of the web page?
The answers determine if you can use a simple library (or a template)
for text and tables, or whether you need the full power of tools that
can create pixels and lines.
Speed
How fast does reporting have to be? There are several easy-to-use
tools that are slow to create a report file, and several hard-to-use
tools that are fast. However, most tools allow you to save output in
a file or database so that it can be delivered to many clients
without recreating the report.
Price
Do you want to pay? Are you prepared to purchase tools for reporting,
or do you want free or open source software?
Flexibility
Do you need to be flexible? Do you want to offer more than one format
to minimize the chance that a user will need to install a third-party
tool?
We discuss these issues in the remainder of this section.
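As a rough illustration of the speed point above, the following
sketch saves a generated report in a file and reuses it for later
requests. The function name cachedReport(), its parameters, and the
closure standing in for a slow report generator are all hypothetical;
a real application might equally store the output in a database.

```php
<?php
// Hypothetical helper: return the saved report if it is recent enough,
// otherwise build it once and save it for later requests.
function cachedReport(string $cacheFile, int $maxAgeSeconds, callable $build): string
{
    // Serve the saved copy while it is still fresh.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAgeSeconds) {
        return file_get_contents($cacheFile);
    }

    // Otherwise run the (possibly slow) generator and store its output.
    $report = $build();
    file_put_contents($cacheFile, $report, LOCK_EX);
    return $report;
}

// Usage: the closure stands in for a slow report generator.
$calls = 0;
$cacheFile = tempnam(sys_get_temp_dir(), 'report');
unlink($cacheFile); // start with no saved copy

$build = function () use (&$calls): string {
    $calls++;
    return "Sales report\n";
};

$first  = cachedReport($cacheFile, 3600, $build); // builds and saves
$second = cachedReport($cacheFile, 3600, $build); // served from the file

unlink($cacheFile); // tidy up the example's temporary file
```

The report is built only on the first call; the second call reads the
saved file. How long a saved report stays valid depends on how often
the underlying data changes.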
13.1.1 Formats
There are many possible formats for
reports, and this section discusses most of the popular choices for
web reporting.
13.1.1.1 Portable Document Format (PDF)
Adobe's Portable Document Format (PDF) is a well-documented,
well-understood, and powerful format for reporting. It's now the
dominant reporting format on the Web and we use it in this chapter
because it meets most of our criteria in the previous section:

It's ideal for reporting because it supports a wide range of fonts,
colors, and graphics. Moreover, it doesn't matter what tools are used
to create or view a report: it'll produce the same high-quality
output.

It's portable. Adobe's free PDF viewer (known as Adobe Reader) is
available for almost all platforms, including Mac OS X, Linux,
FreeBSD, Solaris, all Microsoft Windows variants, Pocket PC, and
Palm. There are also open source viewers available such as xpdf and
ghostview.

It's full of features. It's simple to use, but it's also powerful:
fonts can be embedded in a document, it can be combined with XML
markup (which is discussed later in this section), embedded links can
be included, forms are easy to integrate, and multimedia can be
linked in. Adobe's Distiller (a commercial product) is a powerful
tool for creating PDFs, and it also allows you to create templates
that you can later populate with data.

It's used by very large organizations. For example, the U.S.
government (including the IRS) delivers most of its documents to its
users in PDF, as do newswire services such as Associated Press (AP).
This means most of your users will already be familiar with the
format.

It's flexible for the Web. You can deliver one page from a large
document and it can be rendered at the client without retrieving the
rest of the document. (However, this requires some configuration that
we don't discuss.)

There's a wide variety of tools to produce it. We discuss this next.
You can read the PDF specification at http://partners.adobe.com/asn/tech/pdf/specifications.jsp.

There are two major external libraries that can be used to create PDF
with PHP: PDFlib (available from http://www.pdflib.com/) and ClibPDF
(available from http://www.fastio.com/). Both are function libraries
that integrate into PHP, but both need to be downloaded, purchased
(if you're doing commercial work), and configured, and then PHP needs
to be recompiled to support them. The integration process is
sometimes tricky, but good notes on the process can be found in the
user-contributed comments in the online PHP manual. At the time of
writing, PDFlib was the more popular of the two.

Both PDFlib and ClibPDF allow creation of low- and high-level report
features. For example, you can create a text-only document using a
few lines of code, or you can draw lines and shapes by moving a
cursor with tens or hundreds of lines of code in a complex program.
Both libraries also allow you to include external graphics in
reports, and to use almost all of the features of PDF.

Because both function libraries are commercial products and require
integration, we favor other, free solutions that are now becoming
popular. Later in this chapter, we show you how to use the R&OS PDF
class library. It's almost as powerful as PDFlib, and we show you how
to use it to create and format documents that contain tables, images,
and reports.

There are also other, simpler libraries. For example, RustyParts'
HTML_ToPDF is a simple tool to turn your HTML page into a PDF
document for printing, and it makes use of freely available tools to
carry out the process. You can find out more from http://www.rustyparts.com/pdf.php.
13.1.1.2 Rich Text Format (RTF)
Microsoft's Rich Text Format (RTF) is an interchange format for
documents. Like PDF, it's an open standard that's implemented in a
wide range of tools on many platforms. For example, Microsoft Word
can save and read documents in RTF, as can tools such as the writers
in OpenOffice, StarOffice, and most commercial word processors.
However, much like HTML, there's no guarantee that an RTF document
will look the same in a different word processor or on a different
platform.

Reports in RTF are different from those in PDF. An RTF document is
designed to be opened, edited, and manipulated in the same way as any
other word processor document. It's therefore a good format for
reports that need to be edited or documents that need to be
exchanged, but it's not a good format when you want to produce a
report that's the same on all platforms. However, as a reporting
format, it's preferable to Microsoft Word's proprietary .doc binary
format.

You can find out more about the RTF specification from http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnrtfspec/html/rtfspec.asp.
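Because RTF is plain text with embedded control words, a very simple
report can be assembled with ordinary string functions. The following
is a minimal sketch rather than a complete RTF writer: the function
names rtfEscape() and rtfReport() are our own, the font choice is
arbitrary, and it assumes the report text is plain ASCII.

```php
<?php
// Escape the three characters that have special meaning in RTF.
function rtfEscape(string $text): string
{
    return strtr($text, ['\\' => '\\\\', '{' => '\\{', '}' => '\\}']);
}

// Build a small RTF document: a bold title followed by one paragraph
// per line of the report. Assumes ASCII-only input.
function rtfReport(string $title, array $lines): string
{
    // \b switches bold on, \b0 switches it off, \par ends a paragraph.
    $body = '\b ' . rtfEscape($title) . '\b0\par' . "\n";
    foreach ($lines as $line) {
        $body .= rtfEscape($line) . '\par' . "\n";
    }

    // Header: RTF version 1, ANSI character set, default font 0,
    // followed by a one-entry font table. \fs24 is 12 points
    // (font sizes are given in half-points).
    return '{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}' . "\n"
         . '\f0\fs24 ' . $body . '}';
}

$rtf = rtfReport('Q1 {draft}', ['Total: 42']);
```

To deliver such a report from a script, you would typically send a
header such as Content-Type: application/rtf before printing the
string, so that the browser passes it to the user's word processor.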
13.1.1.3 PostScript
Adobe's PostScript format is a printer language. Most laser printers
understand PostScript, and can convert a PostScript description into
a high-quality printout. PostScript has within it tools to control
whether printing is simplex or duplex, what paper to use, and even
whether to staple. It's not designed for users in the same way as
Adobe's PDF: for example, it doesn't support hypertext-style linking,
embedding of sounds and movies, or pages being downloaded
individually.

Despite its focus as a printer language, most Unix users are familiar
with PostScript and happy with it as a report format. Tools such as
GhostView (or GSView or ggv) are commonly installed on Unix
platforms, and do a good job of rendering PostScript documents on a
screen. Adobe's Reader and Mac OS X's Preview also display PostScript
documents.

You can find out more about the PostScript language from http://partners.adobe.com/asn/tech/ps/index.jsp.
13.1.1.4 HTML and XML
Perhaps the most obvious report type for a web database application
is the web page itself.

This works as follows: using PHP code in an application you produce
HTML, it's sent to the user, the user's browser renders the page, and
(in most browsers) the user can then print the page directly. But
despite its simplicity, this doesn't work well for most reporting:
different browsers render pages differently, window width and depth
don't usually align with paper width and depth, and there's no
guarantee that colors, fonts, and images will translate well to the
printed page. However, as discussed previously, there are some good
tools available to convert HTML to PDF for printing.

So far in this section, we've described several different formats in
which documents or reports are described using a language or markup.
The Extensible Markup Language, XML, is another markup language
designed to identify structure in text, and it is a sibling of HTML
(their parent is SGML). XML is conceptually simple, yet developers
have found uses for it in a wide range of applications:
Storing content in large and dynamic web sites
Storing content marked up with XML can make content reuse and
management much easier.
Standardizing data transport between applications
When applications are difficult to integrate, XML provides a common
protocol that allows data to be shared.
Defining new standards
Scalable Vector Graphics (SVG) and XSL Formatting Objects (XSL-FO)
are both examples of standards that are represented with XML. It's
also used in conjunction with PDF to, for example, mark up forms
within a document.
Serving as a component of other technologies
The Simple Object Access Protocol (SOAP) provides a mechanism for
manipulating objects over a wide area network (such as the Web),
using XML to encode the object messages.
Much like RTF, XML is a possible choice for a reporting format (and
for many other tasks): it's powerful and independent
of presentation, platform, and operating system. PHP has excellent
XML support, and this has been completely redeveloped in PHP5.
However, a detailed discussion of XML is outside the scope of this
book.
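Even so, a short sketch gives a feel for what an XML report might
look like. It uses the DOM extension introduced with PHP5; the
function name reportToXml(), the element and attribute names, and the
sample rows are invented for illustration and don't follow any
particular XML standard.

```php
<?php
// Hypothetical helper: turn rows of report data into an XML string.
function reportToXml(string $title, array $rows): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true; // indent the output for readability

    // Root element carrying the report title as an attribute.
    $report = $doc->createElement('report');
    $report->setAttribute('title', $title);
    $doc->appendChild($report);

    // One <item> element per row. setAttribute() and createTextNode()
    // escape characters such as & and < for us.
    foreach ($rows as $row) {
        $item = $doc->createElement('item');
        $item->setAttribute('name', $row['name']);
        $item->appendChild($doc->createTextNode((string) $row['total']));
        $report->appendChild($item);
    }

    return $doc->saveXML();
}

$xml = reportToXml('Monthly sales', [
    ['name' => 'Nuts & bolts', 'total' => 12],
    ['name' => 'Washers',      'total' => 7],
]);
```

The resulting string describes only the structure of the report. How
it's presented is left to whatever consumes it, which is what makes
XML independent of presentation and platform.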
13.1.1.5 Email and plain text
Plain text without markup is a simple
report format, as is a plain text email to a user.
What's more, text is compact, easy to format, and
fast to send by email or to a browser. However, you have even less
control than with HTML over presentation or printing, and
it's unlikely to be an effective way to lay out
information except for the shortest reports. Despite this, as we show
in Chapter 19, email receipts are still a useful
reporting tool to acknowledge actions in a web database
application.
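As a rough illustration of how little is needed for a plain text
report, the following sketch lines up item names and prices in fixed
columns using sprintf(). The function name textReceipt() and the
sample items are hypothetical, and the column widths are arbitrary.

```php
<?php
// Hypothetical helper: format name => price pairs as a fixed-width
// plain text receipt with a total line.
function textReceipt(array $items): string
{
    $lines = [];
    $total = 0.0;
    foreach ($items as $name => $price) {
        // Name left-aligned in 20 columns, price right-aligned in 8.
        $lines[] = sprintf('%-20s %8.2f', $name, $price);
        $total += $price;
    }
    $lines[] = str_repeat('-', 29); // 20 + 1 + 8 columns
    $lines[] = sprintf('%-20s %8.2f', 'Total', $total);

    return implode("\n", $lines) . "\n";
}

$body = textReceipt(['Shiraz' => 12.5, 'Merlot' => 9.0]);
```

The string can be printed to the browser with a text/plain content
type, or passed as the message argument of PHP's mail() function
along with a recipient address and a subject. The columns line up
only if the reader's mail client uses a fixed-width font, which is
one example of the limited control over presentation noted above.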