If someone sends you an XML file containing data that fits into tables, you don't need to read the text and all its angle brackets. You can load the document into Excel directly, tell Excel how you want to present it, and work with the data through maps.
Extensible Markup Language (XML) has become a common interchange format over the past few years, and it's not unusual for people and organizations to send each other XML files. XML's simple core structures make it easy to exchange information with much less concern that all parties are using the same software. Until recently, however, although generic XML tools were widely available, bridging the gap between XML documents and the user interface was difficult. Excel 2003 makes it much easier, at least for data that fits on a grid.
|
We'll start with a sample XML document, shown in Example 8-1.
<?xml version="1.0" encoding="UTF-8"?> <sales> <sale> <date>2003-10-05</date> <ISBN>0596005385</ISBN> <Title>Office 2003 XML Essentials</Title> <PriceUS>34.95</PriceUS> <quantity>200</quantity> <customer ID="1025">Zork's Books</customer> </sale> <sale> <date>2003-10-05</date> <ISBN>0596002920</ISBN> <Title>XML in a Nutshell, 2nd Edition</Title> <PriceUS>39.95</PriceUS> <quantity>90</quantity> <customer ID="1025">Zork's Books</customer> </sale> <sale> <date>2003-10-05</date> <ISBN>0596002378</ISBN> <Title>SAX2</Title> <PriceUS>29.95</PriceUS> <quantity>300</quantity> <customer ID="1025">Zork's Books</customer> </sale> <sale> <date>2003-10-05</date> <ISBN>0596005385</ISBN> <Title>Office 2003 XML Essentials</Title> <PriceUS>34.95</PriceUS> <quantity>10</quantity> <customer ID="1029">Books of Glory</customer> </sale> <sale> <date>2003-10-05</date> <ISBN>0596002920</ISBN> <Title>XML in a Nutshell, 2nd Edition</Title> <PriceUS>39.95</PriceUS> <quantity>25</quantity> <customer ID="1029">Books of Glory</customer> </sale> <sale> <date>2003-10-07</date> <ISBN>0596002378</ISBN> <Title>SAX2</Title> <PriceUS>29.95</PriceUS> <quantity>5</quantity> <customer ID="1029">Books of Glory</customer> </sale> <sale> <date>2003-10-18</date> <ISBN>0596002378</ISBN> <Title>SAX2</Title> <PriceUS>29.95</PriceUS> <quantity>15</quantity> <customer ID="2561">Title Wave</customer> </sale> <sale> <date>2003-10-21</date> <ISBN>0596002920</ISBN> <Title>XML in a Nutshell, 2nd Edition</Title> <PriceUS>39.95</PriceUS> <quantity>15</quantity> <customer ID="9021">Books for You</customer> </sale> </sales>
You can open this directly from Excel 2003
by selecting File
If you select "As an XML list," you'll first be warned that Excel will be creating its own schema for this schema-free document, as shown in Figure 8-2. After clicking OK, you'll be rewarded with Excel's best guess as to how to present the information in the document as a spreadsheet, as shown in Figure 8-3.
Note that Excel expected the date format used by the date element, so it is now displaying dates imported as 2003-10-05 as 10/5/2003.
Once you have the document loaded into
Excel, you can treat the data much like you would any other data in
Excelincorporating it into formulas, creating named ranges for
it, making charts based on its contents, etc. To help you, Excel
provides some built-in functionality for analyzing the data. The
drop-down boxes in the column headers enable you to choose how to
sort the data (the default is the order the document had originally).
You also can turn on a total row, either from the List toolbar or by
right-clicking anywhere on the list and selecting List
You also can refresh the data, updating
that area with information from an XML document with the same
structure. If you had another document with the same structure, you
could right-click the list, select XML
In addition, you can export the data back
to an XML file after editing it by right-clicking the list and
selecting XML
For simple data, you can usually trust Excel to guess what a file's contents are and use the default view it provides. As data gets more complicated, however, especially if it contains dates or text that looks like numbers (note the missing initial zeros on the ISBNs in the preceding figures!), you might want to use XML schemas to constrain how Excel reads your data and what kinds of data will fit in a given map. For this document, an XML schema might look like Example 8-2.
<?xml version="1.0" encoding="UTF-8"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"> <xs:element name="sales"> <xs:complexType> <xs:sequence> <xs:element maxOccurs="unbounded" ref="sale"/> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="sale"> <xs:complexType> <xs:sequence> <xs:element ref="date"/> <xs:element ref="ISBN"/> <xs:element ref="Title"/> <xs:element ref="PriceUS"/> <xs:element ref="quantity"/> <xs:element ref="customer"/> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="date" type="xs:date"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Title" type="xs:string"/> <xs:element name="PriceUS" type="xs:decimal"/> <xs:element name="quantity" type="xs:integer"/> <xs:element name="customer"> <xs:complexType mixed="true"> <xs:attribute name="ID" use="required" type="xs:integer"/> </xs:complexType> </xs:element> </xs:schema>
Note that the date element is defined as a date, and the ISBN element is defined as a string here, not an integer. If you start by opening this schema rather than the document, you can have Excel load the document while preserving the initial zero on the ISBNs. This time, you'll create a list before loading the XML document, starting from a blank worksheet.
You'll need to open
the XML Source task pane. If the task pane isn't
open already, open it by selecting View
To load the schema, click the XML Maps... button. This will bring up the XML Maps dialog box, shown in Figure 8-7.
Click the Add... button to open the schema, and select the schema as shown in Figure 8-8.
If your schema doesn't restrict documents to having only one possible starting element, Excel will ask you which element to use as the root element, as shown in Figure 8-9. Because the documents in this example start with the sales element, pick "sales."
After you click OK, Excel warns about possible difficulties it might have in interpreting schemas in the dialog box shown in Figure 8-10. XML Schema is an enormous specification that supports a wide variety of structures that don't fit well with Excel's perspective on information, so Excel has some limitations.
Excel will show that your schema has been added to the spreadsheet in the XML Maps dialog, which should look like that shown in Figure 8-11.
If you click OK, you'll be returned to the main Excel interface, and the XML Source task pane will be populated with a diagram of the structure the schema described, such as that shown at the right of Figure 8-12. Now that you have the structure, you can lay out the list. The easiest way to do thisespecially with a small document such as this oneis to drag the sales icon to cell A1, producing the result in Figure 8-12.
|
Now that you have a home for the data,
it's time to populate it. You can either click the
Import XML Data button on the list toolbar, or right-click the list
and select XML
Excel's support for XML maps and lists means you can create spreadsheets that work on data arriving in separate files with more flexibility than prior formats such as CSV or tab-delimited formats provided. Instead of having to be connected to a database to edit data interactively, a user can edit XML files while on an airplane and feed that XML to an appropriate consumer when he lands. Perhaps the best aspect of Excel's new XML features is their flexibility; so long as the data is in a structure that fits on a grid, Excel has very few rules about what kinds of XML it will accept. With a few mouse clicks and no programming, you can integrate XML data with your spreadsheets.
Simon St.Laurent