There are two basic approaches to parsing an XML document:
The event-based approach (SAX).
Under this approach, an XML parser reads an XML document one chunk at a time, processing each construct as it encounters it. Each time the parser finds an XML construct (an element start tag, a CDATA block, or a processing instruction, for example), it generates an event that can be intercepted and processed by the application layer. A Simple API for XML (SAX) parser uses this event-based approach to parse an XML document.
The tree-based approach (DOM).
Here, an XML parser reads the entire document into memory at once and builds a hierarchical tree representation of it. The parser also exposes a number of traversal methods, which developers can use to navigate between, and process, individual tree nodes. A Document Object Model (DOM) parser uses this tree-based approach to parse an XML document.
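The tree-based approach can be illustrated with a minimal sketch. It assumes Python's standard `xml.dom.minidom` module, chosen only for illustration; the sample document, the `XML` constant, and the `list_titles` function are hypothetical names invented for this example and do not come from the chapter.

```python
from xml.dom import minidom

# Hypothetical sample document, invented for this sketch.
XML = (
    '<library>'
    '<book id="1"><title>XML Basics</title></book>'
    '<book id="2"><title>Parsing in Practice</title></book>'
    '</library>'
)


def list_titles(xml_text):
    """Build the full tree in memory, then navigate it node by node."""
    # The whole document is parsed into a tree before any processing starts.
    doc = minidom.parseString(xml_text)
    root = doc.documentElement

    titles = []
    # Traversal methods let the application move between tree nodes.
    for book in root.getElementsByTagName("book"):
        title = book.getElementsByTagName("title")[0]
        # The text of <title> is stored in a child text node.
        titles.append((book.getAttribute("id"), title.firstChild.data))
    return root.tagName, titles


if __name__ == "__main__":
    print(list_titles(XML))
```

Because the complete tree exists before the loop runs, the application can visit nodes in any order, at the cost of holding the whole document in memory.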
This chapter focuses on the first approach. The second approach is dealt with in detail in Chapter 3, "PHP and the Document Object Model (DOM)."
In order to better understand the event-driven approach, consider Listing 2.1.
Now, if a SAX parser processed this document, it would generate an event trail that would look something like this:
Under the SAX approach, the parser has a clearly defined role: it parses the XML data being fed to it and calls the appropriate functions (also referred to as "handlers" or "callback functions") at the application layer to handle the different types of constructs it encounters. It's up to the developer to write the code necessary to handle each event, depending on the application requirements; the SAX parser itself is completely insulated from the internals of each handler function.
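The division of labor between parser and handlers might be sketched as follows. This is an illustration only, assuming Python's standard `xml.sax` module rather than any particular parser discussed in this book; the `SAMPLE` document is a hypothetical stand-in (it is not Listing 2.1), and `TrailHandler` and `build_trail` are names invented for the example.

```python
import xml.sax

# Hypothetical sample document, invented for this sketch (not Listing 2.1).
SAMPLE = b"<?xml version='1.0'?><note><?display bold?><to>World</to></note>"


class TrailHandler(xml.sax.ContentHandler):
    """Application-layer callbacks; the parser calls these as it reads."""

    def __init__(self):
        super().__init__()
        self.trail = []  # record of events, in the order they arrive

    def startDocument(self):
        self.trail.append("start document")

    def startElement(self, name, attrs):
        self.trail.append(f"start element: {name}")

    def characters(self, content):
        # A parser may deliver text in several pieces; skip whitespace-only ones.
        if content.strip():
            self.trail.append(f"character data: {content.strip()}")

    def endElement(self, name):
        self.trail.append(f"end element: {name}")

    def processingInstruction(self, target, data):
        self.trail.append(f"processing instruction: {target}")

    def endDocument(self):
        self.trail.append("end document")


def build_trail(xml_bytes):
    """Feed the document to the parser and return the recorded events."""
    handler = TrailHandler()
    # The parser knows nothing about what the handlers do with each event.
    xml.sax.parseString(xml_bytes, handler)
    return handler.trail


if __name__ == "__main__":
    for event in build_trail(SAMPLE):
        print(event)
```

The parser only decides when to call each method; what happens inside a handler (here, appending a line to a list) is entirely the application's choice, so the same parser can drive very different applications.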