Table of Contents
| | Preface |
| 1 | Getting Started |
| 2 | The Obligatory Hello World Example |
| 3 | XPath: A Syntax for Describing Needles and Haystacks |
| 4 | Branching and Control Elements |
| 5 | Creating Links and Cross-References |
| 6 | Sorting and Grouping Elements |
| 7 | Combining XML Documents |
| 8 | Extending XSLT |
| 9 | Case Study: The Toot-O-Matic |
| A | XSLT Reference |
| B | XPath Reference |
| C | XSLT and XPath Function Reference |
| D | XSLT Guide |
| | Glossary |
| | Index |
Read an Excerpt
Chapter 5: Creating Links and Cross-References
Contents:
Generating Links with the id() Function
Generating Links with the key() Function
Generating Links in Unstructured Documents
Summary
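If you're creating a web site, publishing a book, or creating an XML transaction, chances are many pieces of information will refer to other things. This chapter discusses several ways to link XML elements. It reviews three techniques: generating links with the id() function, generating links with the key() function, and generating links in unstructured documents.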
Generating Links with the id() Function
Our first attempt at linking will be with the XPath id() function.
The ID, IDREF, and IDREFS Datatypes
Three of the basic datatypes supported by XML Document Type Definitions (DTDs) are ID, IDREF, and IDREFS. Here's a simple DTD that illustrates these datatypes:
<!--glossary.dtd-->
<!--The containing tag for the entire glossary-->
<!ELEMENT glossary (glentry+) >
<!--A glossary entry-->
<!ELEMENT glentry (term,defn+) >
<!--The word being defined-->
<!ELEMENT term (#PCDATA) >
<!--The id is used for cross-referencing, and the
xreftext is the text used by cross-references.-->
<!ATTLIST term
id ID #REQUIRED
xreftext CDATA #IMPLIED >
<!--The definition of the term-->
<!ELEMENT defn (#PCDATA | xref | seealso)* >
<!--A cross-reference to another term-->
<!ELEMENT xref EMPTY >
<!--refid is the ID of the referenced term-->
<!ATTLIST xref
refid IDREF #REQUIRED >
<!--seealso refers to one or more other definitions-->
<!ELEMENT seealso EMPTY>
<!ATTLIST seealso
refids IDREFS #REQUIRED >
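In this DTD, each <term> element is required to have an id attribute, and each <xref> element must have a refid attribute. The ID and IDREF datatypes work according to two rules:
Each value of an attribute of type ID must be unique within the document.
Each value of an attribute of type IDREF must match the value of an ID attribute somewhere in the document.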
To round out our example, the <seealso> element contains an attribute of type IDREFS. This datatype contains one or more values, each of which must match a value of an ID elsewhere in the document. Multiple values, if present, are separated by whitespace.
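The constraints on ID, IDREF, and IDREFS values are easiest to see when they are broken. The fragment below is a deliberately invalid document written against glossary.dtd. Its entries and ID values (cache, buffer, bufer, queue) are invented for illustration and are not part of the sample glossary. A validating parser would be expected to report each line marked as an error.

```xml
<?xml version="1.0"?>
<!DOCTYPE glossary SYSTEM "glossary.dtd">
<!-- Hypothetical, deliberately INVALID document -->
<glossary>
  <glentry>
    <term id="cache">cache</term>
    <defn>Fast temporary storage. See <xref refid="buffer"/>.</defn>
  </glentry>
  <glentry>
    <!-- Error: the ID value "cache" is already used by the first term -->
    <term id="cache">cache memory</term>
    <!-- Error: no element in this document has the ID "bufer" -->
    <defn>See <xref refid="bufer"/>.</defn>
  </glentry>
  <glentry>
    <term id="buffer">buffer</term>
    <!-- Error: "queue" in the IDREFS list matches no ID in the document -->
    <defn>A staging area. <seealso refids="cache queue"/></defn>
  </glentry>
</glossary>
```

Note that these are validity errors, not well-formedness errors: a parser that does not validate will typically accept the document without complaint.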
There are some complications of ID and related datatypes, but we'll discuss them later. For now, we'll focus on how the id() function works.
An XML Document in Need of Links
To illustrate the value of linking, we'll use a small glossary written in XML. The glossary contains some <glentry> elements, each of which contains a single <term> and one or more <defn> elements. In addition, a definition is allowed to contain a cross-reference (<xref>) to another <term>. Here's a short sample document:
<?xml version="1.0" ?>
<!DOCTYPE glossary SYSTEM "glossary.dtd">
<glossary>
<glentry>
<term id="applet">applet</term>
<defn>
An application program,
written in the Java programming language, that can be
retrieved from a web server and executed by a web browser.
A reference to an applet appears in the markup for a web
page, in the same way that a reference to a graphics
file appears; a browser retrieves an applet in the same
way that it retrieves a graphics file.
For security reasons, an applet's access rights are limited
in two ways: the applet cannot access the file system of the
client upon which it is executing, and the applet's
communication across the network is limited to the server
from which it was downloaded.
Contrast with <xref refid="servlet"/>.
<seealso refids="wildcard-char DMZlong pattern-matching"/>
</defn>
</glentry>
<glentry>
<term id="DMZlong" xreftext="demilitarized zone">demilitarized
zone (DMZ)</term>
<defn>
In network security, a network that is isolated from, and
serves as a neutral zone between, a trusted network (for example,
a private intranet) and an untrusted network (for example, the
Internet). One or more secure gateways usually control access
to the DMZ from the trusted or the untrusted network.
</defn>
</glentry>
<glentry>
<term id="DMZ">DMZ</term>
<defn>
See <xref refid="DMZlong"/>.
</defn>
</glentry>
<glentry>
<term id="pattern-matching">pattern-matching character</term>
<defn>
A special character such as an asterisk (*) or a question mark
(?) that can be used to represent zero or more characters.
Any character or set of characters can replace a pattern-matching
character.
</defn>
</glentry>
<glentry>
<term id="servlet">servlet</term>
<defn>
An application program, written in the Java programming language,
that is executed on a web server. A reference to a servlet
appears in the markup for a web page, in the same way that a
reference to a graphics file appears. The web server executes
the servlet and sends the results of the execution (if there are
any) to the web browser. Contrast with <xref refid="applet" />.
</defn>
</glentry>
<glentry>
<term id="wildcard-char">wildcard character</term>
<defn>
See <xref refid="pattern-matching"/>.
</defn>
</glentry>
</glossary>
In this XML listing, each <term> element has an id attribute that identifies it uniquely. Many <xref> elements also refer to other terms in the listing. Notice that each time we refer to another term, we don't use the actual text of the referenced term. When we write our stylesheet, we'll use the XPath id() function to retrieve the text of the referenced term. If the name of a term changes (as buzzwords go in and out of fashion, some marketing genius might want to rename the "pattern-matching character," for example), we can rerun our stylesheet and be confident that all references to the renamed term contain the correct text.
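As a minimal sketch of that idea (not the stylesheet developed in this chapter), the template below replaces each <xref> with the text of the <term> it points to. It assumes the XSLT processor's parser reads glossary.dtd, because id() only recognizes attributes that the DTD declares as type ID; if the DTD is not processed, id() selects nothing. The choice of text output is also an assumption made to keep the example short.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: plain-text output keeps the sketch small -->
  <xsl:output method="text"/>

  <!-- Replace each cross-reference with the text of the referenced term -->
  <xsl:template match="xref">
    <!-- id(@refid) selects the element whose ID-typed attribute equals
         the value of refid; in this glossary that is a <term> element.
         normalize-space() collapses line breaks inside the term text. -->
    <xsl:value-of select="normalize-space(id(@refid))"/>
  </xsl:template>

</xsl:stylesheet>
```

Everything other than <xref> falls through to XSLT's built-in templates, which copy text nodes to the output, so the definitions come out with the term text substituted in place of each cross-reference.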
Finally, some <term> elements have an xreftext attribute because some of the actual terms are longer than we'd like to use in a cross-reference. When we have an <xref> to the term ASCII (American Standard Code for Information Interchange), it would get pretty tedious if the entire text of the term appeared throughout our document. For this term, we'll use the xreftext attribute's value, ensuring that the cross-reference contains the less-intimidating text ASCII....
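One way to express that preference, sketched here under the same assumption that the DTD is read so that id() can resolve the reference, is an <xsl:choose> that uses xreftext when the referenced term has one and falls back to the term's own text otherwise. The variable name target is hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: plain-text output keeps the sketch small -->
  <xsl:output method="text"/>

  <xsl:template match="xref">
    <!-- The <term> element this cross-reference points to -->
    <xsl:variable name="target" select="id(@refid)"/>
    <xsl:choose>
      <!-- Prefer the shorter cross-reference text when it is present -->
      <xsl:when test="$target/@xreftext">
        <xsl:value-of select="$target/@xreftext"/>
      </xsl:when>
      <!-- Otherwise fall back to the full text of the term -->
      <xsl:otherwise>
        <xsl:value-of select="normalize-space($target)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

In the sample glossary, only the term with the ID DMZlong carries an xreftext attribute, so a reference to it would take the first branch and every other reference would take the second.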