Table of Contents
| Ch. 1 | XSLT in context |
| Ch. 2 | The XSLT processing model |
| Ch. 3 | Stylesheet structure |
| Ch. 4 | Stylesheets and schemas |
| Ch. 5 | XSLT elements |
| Ch. 6 | Patterns |
| Ch. 7 | XSLT functions |
| Ch. 8 | Extensibility |
| Ch. 9 | Stylesheet design patterns |
| Ch. 10 | Case study: XMLSpec |
| Ch. 11 | Case study: a family tree |
| Ch. 12 | Case study: knight's tour |
| App. A | XPath 2.0 syntax summary |
| App. B | XPath function library |
| App. C | Microsoft XSLT processors |
| App. D | JAXP: the Java API for transformation |
| App. E | Saxon |
| App. F | Backwards compatibility |
XSLT 2.0 Programmer's Reference
By Michael Kay
John Wiley & Sons
ISBN: 0-7645-6909-0
Chapter One
XSLT in Context
This chapter is designed to put XSLT in context. It's about the purpose of XSLT and the task it was designed to perform. It's about what kind of language it is, how it came to be that way, and how it has changed in version 2.0; and it's about how XSLT fits in with all the other technologies that you are likely to use in a typical Web-based application. I won't be saying very much in this chapter about what an XSLT stylesheet actually looks like or how it works: that will come later, in Chapters 2 and 3.
The chapter starts by describing the task that XSLT is designed to perform (transformation) and why there is a need to transform XML documents. I'll then present a trivial example of a transformation in order to explain what this means in practice.
Next, I cover the different ways of using XSLT within the overall architecture of an application, in which there will inevitably be many other technologies and components, each playing their own part. We then discuss the relationship of XSLT to other standards in the growing XML family, to put its function into context and explain how it complements the other standards.
I'll describe what kind of language XSLT is, and delve a little into the history of how it came to be like that. If you're impatient you may want to skip the history and get on with using the language, but sooner or later you will ask "why on earth did they design it like that?" and at that stage I hope you will go back and read about the process by which XSLT came into being.
What is XSLT?
XSLT (which stands for eXtensible Stylesheet Language: Transformations) is a language that, according to the very first sentence in the specification (found at w3.org/TR/xslt20/), is primarily designed for transforming one XML document into another. However, XSLT is also capable of transforming XML to HTML and many other text-based formats, so a more general definition might be as follows:
XSLT is a language for transforming the structure and content of an XML document.
Why should you want to do that? In order to answer this question properly, we first need to remind ourselves why XML has proved such a success and generated so much excitement.
XML is a simple, standard way to interchange structured textual data between computer programs. Part of its success comes from the fact that it is also readable and writable by humans, using nothing more complicated than a text editor, but this doesn't alter the fact that it is primarily intended for communication between software systems. As such, XML satisfies two compelling requirements:
Separating data from presentation: the need to separate information (such as a weather forecast) from details of the way it is to be presented on a particular device. The early motivation for this arose from the need to deliver information not only to the traditional PC-based Web browser (which itself comes in many flavors), but also to TV sets and WAP phones, not to mention the continuing need to produce print-on-paper. Today, for many information providers an even more important driver is the opportunity to syndicate content to other organizations that can republish it with their own look-and-feel.
Transmitting data between applications: the need to transmit information (such as orders and invoices) from one organization to another without investing in bespoke software integration projects. As electronic commerce gathers pace, the amount of data exchanged between enterprises increases daily, and this need becomes ever more urgent.
Of course, these two ways of using XML are not mutually exclusive. An invoice can be presented on the screen as well as being input to a financial application package, and weather forecasts can be summarized, indexed, and aggregated by the recipient instead of being displayed directly. Another of the key benefits of XML is that it unifies the worlds of documents and data, providing a single way of representing structure regardless of whether the information is intended for human or machine consumption. The main point is that, whether the XML data is ultimately used by people or by a software application, it will very rarely be used directly in the form it arrives: it first has to be transformed into something else.
In order to communicate with a human reader, this something else might be a document that can be displayed or printed: for example, an HTML file, a PDF file, or even audible sound. Converting XML to HTML for display is the most common application of XSLT today, and it is the one I will use in most of the examples in this book. Once you have the data in HTML format, it can be displayed on any browser.
In order to transfer data between different applications we need to be able to transform information from the data model used by one application to the model used by another. To load the data into an application, the required format might be a comma-separated-values file, a SQL script, an HTTP message, or a sequence of calls on a particular programming interface. Alternatively, it might be another XML file using a different vocabulary from the original. As XML-based electronic commerce becomes widespread, the role of XSLT in data conversion between applications also becomes ever more important. Just because everyone is using XML does not mean the need for data conversion will disappear.
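To make the idea of a non-XML target format concrete, here is a minimal sketch of the kind of conversion described above, flattening an order document into a comma-separated-values file. It is written in Python rather than XSLT purely for illustration, and it is not taken from the book: the `<order>` and `<item>` vocabulary, the `ORDER_XML` sample, and the `order_to_csv` function are all assumptions invented for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical order document: the element and attribute names are invented
# for illustration and do not belong to any real e-commerce vocabulary.
ORDER_XML = """
<order id="A-1001">
  <item sku="P-17" qty="2"><name>Widget, large</name><price>9.50</price></item>
  <item sku="P-42" qty="1"><name>Gasket</name><price>1.25</price></item>
</order>
"""


def order_to_csv(xml_text: str) -> str:
    """Flatten each <item> of the order into one CSV row."""
    order = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Header row describing the target application's (assumed) column layout.
    writer.writerow(["order_id", "sku", "name", "qty", "price"])
    for item in order.findall("item"):
        writer.writerow([
            order.get("id"),         # repeated on every row: CSV has no nesting
            item.get("sku"),
            item.findtext("name"),   # the csv module quotes embedded commas
            item.get("qty"),
            item.findtext("price"),
        ])
    return out.getvalue()


if __name__ == "__main__":
    print(order_to_csv(ORDER_XML), end="")
```

Notice that the hierarchy of the source (items nested inside an order) has to be flattened, with the order identifier repeated on every row. That restructuring, rather than the change of syntax, is the essence of the conversion.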
There will always be multiple standards in use. For example, the NewsML format for exchanging news stories (newsml.org/pages/index.php) has wide support among Western newspaper publishers and press agencies, but attracts little support from broadcasters. Meanwhile, broadcasters in Japan are concentrating their efforts on the Broadcast Markup Language (xml.coverpages.org/bml.html). This has a very different scope and purpose; but ultimately, it can handle the same content in a different form, and there is therefore a need for transformation when information is passed from one industry sector to the other.
Even within the domain of a single standard, there is a need to extract information from one kind of document and insert it into another. For example, a PC manufacturer who devises a solution to a customer problem will need to extract data from the problem reports and insert it into the documents issued to field engineers so they can recognize and fix the problem when other customers hit it. The field engineers, of course, are probably working for a different company, not for the original manufacturer. So, linking up enterprises to do e-commerce will increasingly become a case of defining how to extract and combine data from one set of XML documents to generate another set of XML documents, and XSLT is the ideal tool for the job.
At the end of this chapter we will come back to specific examples of when XSLT should be used to transform XML. For now, I just wanted to establish a feel for the importance and usefulness of transforming XML. If you are already using XSLT, of course, this may be stale news. So let's take a look now at what XSLT version 2.0 brings to the party.
Why Version 2.0?
XSLT 1.0 came out in November 1999 and has been highly successful. It was therefore almost inevitable that work would start on a version 2.0. As we will see later, the process of creating version 2.0 has been far from smooth and has taken rather longer than some people hoped.
It's easy to look at version 2.0 and see it as a collection of features bolted on to the language, patches to make up for the weaknesses of version 1.0. As with a new release of any other language or software package, most users will find some features here that they have been crying out for, and other additions that appear surplus to requirements.
But I think there is more to version 2.0 than just a bag of goodies; there are some underlying themes that have guided the design and the selection of features. I can identify three main themes:
Integration across the XML standards family: W3C working groups do not work in isolation from each other; they spend a lot of time trying to ensure that their efforts are coordinated. A great deal of what is in XSLT 2.0 is influenced by a wider agenda of doing what is right for the whole raft of XML standards, not just for XSLT considered in isolation.
Extending the scope of applicability: XSLT 1.0 is pretty good at rendering XML documents for display as HTML on screen, and for converting them to XSL Formatting Objects for print publishing. But there are many other transformation tasks for which it has proved less suitable. Compared with report writers (even those from the 1980s, let alone modern data visualization tools) its data handling capabilities are very weak. The language is quite good at doing conversions of XML documents if the original markup is well designed, but it's much weaker at recognizing patterns in the text or markup that represent hidden structure. An important aim of XSLT 2.0 is to increase the range of applications that you can tackle using XSLT.
Tactical usability improvements: Here we are into the realm of added goodies. The aim here is to achieve productivity benefits, making it easier to do things that are difficult or error-prone in version 1.0. These are probably the features that existing users will immediately recognize as the most beneficial, but in the long term the other two themes probably have more strategic significance for the future of the language.
Before we discuss XSLT in more detail and have a first look at how it works, let's study a scenario that clearly demonstrates the variety of formats to which we can transform XML, using XSLT.
A Scenario: Transforming Music
As an indication of how far XML has now penetrated, Robin Cover's index of XML-based application standards at xml.coverpages.org/xmlApplications.html today runs to over 580 entries. (The last one is entitled Mind Reading Markup Language, but as far as I can tell, all the other entries are serious.)
I'll follow just one of these 580 links, XML and Music, which takes us to xml.coverpages.org/xmlMusic.html. On this page we find a list of no fewer than 17 standards, proposals, or initiatives that use XML for marking up music.
Some of this diversity is unnecessary, and many of these initiatives will bear little fruit. Even the names of the standards are chaotic: there is a Music Markup Language, a MusicML, a MusicXML, and a MusiXML, all of which appear to be quite unrelated. There are at least two really serious contenders: the Music Encoding Initiative (MEI) and the Standard Music Description Language (SMDL). The MEI derives its inspiration from the Text Encoding Initiative, which is widely used by the library community for creating digital text archives, while SMDL is related to the HyTime hypermedia standards and takes into account requirements such as the need to synchronize music with video or with a lighting script.
The diversity of standards is inevitable before the industry can come up with a standard that works for everyone. Without variety, there can be no innovation or experimentation. In fact, the likely outcome is not a single standard, but a collection of three or four different standards that are optimized for different needs. The different notations were invented with different purposes in mind: a markup language used by a publisher for printing sheet music has different requirements from the one designed to let you listen to the music from a browser.
For most of us, music may be fun, a diversion from the world of work. But for others, it is a very serious billion-dollar business. Standards that make information interchange in this business easier have an enormous economic impact. Whether you're interested in the music or the money, we're not dealing here with something that's trivial. So it shouldn't be surprising that so much effort is going into the process of creating standards in this area.
In earlier editions of this book I introduced the idea of using XSLT to transform music as a theoretical possibility, something to make my readers think about the range of possibilities open for the language. Today, it is no longer just a theoretical possibility: people are actually doing it.
With 17 different schemas for music in existence, all with different strengths and weaknesses (and fan clubs), there is a big need to convert information from one of these formats to any of the others. There is also a need to convert information from any of these formats to a printable score or an audible performance of the music, as well as a need to create XML representations of music from non-XML sources such as MIDI files (Figure 1-1). XSLT has a role to play in all of these conversions.
So you could use XSLT to:
Convert music from one of these representations to another, for example from MEI to SMDL.
Convert music from any of these representations into visual music notation, by generating the XML-based vector graphics format SVG.
Play the music on a synthesizer, by generating a MIDI (Musical Instrument Digital Interface) file.
Perform a musical transformation, such as transposing the music into a different key or extracting parts for different instruments or voices.
Extract the lyrics, into HTML or into a text-only XML document.
Capture music from non-XML formats and translate it to XML (XSLT 2.0 is especially useful here).
As you can see, XSLT is not just for converting XML documents to HTML.
For some real examples of XSLT stylesheets used to transform music, take a look at a thesis written by Baron Schwartz at the University of Virginia (cs.virginia.edu/~bps7j/thesis/).
How Does XSLT Transform XML?
By now you are probably wondering exactly how XSLT goes about processing an XML document in order to convert it into the required output. There are usually two aspects to this process:
1. The first stage is a structural transformation, in which the data is converted from the structure of the incoming XML document to a structure that reflects the desired output.
2. The second stage is formatting, in which the new structure is output in the required format such as HTML or PDF.
The second stage covers the ground we discussed in the previous section; the data structure that results from the first stage can be output as HTML, as a text file, or as XML. HTML output allows the information to be viewed directly in a browser by a human user, or to be loaded into any modern word processor.
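The separation between the two stages can be sketched in a few lines of code. The example below is an illustration only, written in Python with its standard `xml.etree.ElementTree` module rather than in XSLT, and it is not how an XSLT processor is implemented. The `<forecast>` vocabulary and the function names `build_result_tree` and `serialize` are assumptions made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical source vocabulary, invented for this sketch.
SOURCE_XML = """
<forecast city="Reading">
  <day name="Mon" high="14" low="6"/>
  <day name="Tue" high="11" low="4"/>
</forecast>
"""


def build_result_tree(source: ET.Element) -> ET.Element:
    """Stage 1: structural transformation of the source tree into a new tree."""
    table = ET.Element("table")
    ET.SubElement(table, "caption").text = source.get("city")
    for day in source.findall("day"):
        row = ET.SubElement(table, "tr")
        # Attributes in the source become table cells in the result.
        for value in (day.get("name"), day.get("high"), day.get("low")):
            ET.SubElement(row, "td").text = value
    return table


def serialize(result: ET.Element, method: str) -> str:
    """Stage 2: write the same result tree in the requested output format.

    ElementTree accepts "xml", "html", and "text" as serialization methods.
    """
    return ET.tostring(result, encoding="unicode", method=method)


if __name__ == "__main__":
    tree = build_result_tree(ET.fromstring(SOURCE_XML))
    for method in ("html", "xml", "text"):
        print(method, "->", serialize(tree, method))
```

The point of the design is that `build_result_tree` knows nothing about angle brackets or file formats, and `serialize` knows nothing about weather forecasts: the same result tree can be written out in several formats without repeating the structural work.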
Excerpted from XSLT 2.0 Programmer's Reference by Michael Kay. Excerpted by permission.