<TITLE>Canonical XML</TITLE>

<H1>Canonical XML</H1>

This document defines a subset of XML called canonical XML.
The intended use of canonical XML is in testing XML processors,
as a representation of the result of parsing an XML document.

Every well-formed XML document has a unique structurally equivalent
canonical XML document. Two structurally equivalent XML
documents have a byte-for-byte identical canonical XML document.
Canonicalizing an XML document requires only information that an XML
processor is required to make available to an application.

A canonical XML document conforms to the following grammar:

CanonXML    ::= Pi* element Pi*
element     ::= Stag (Datachar | Pi | element)* Etag
Stag        ::= '<' Name Atts '>'
Etag        ::= '</' Name '>'
Pi          ::= '<?' Name ' ' (((Char - S) Char*)? - (Char* '?>' Char*)) '?>'
Atts        ::= (' ' Name '=' '"' Datachar* '"')*
Datachar    ::= '&amp;' | '&lt;' | '&gt;' | '&quot;'
                | '&#9;' | '&#10;' | '&#13;'
                | (Char - ('&' | '<' | '>' | '"' | #x9 | #xA | #xD))
Name        ::= (see XML spec)
Char        ::= (see XML spec)
S           ::= (see XML spec)

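As an illustration of the Datachar production, the following Python sketch maps each character that may not appear literally to the single spelling the grammar permits for it. The name `escape_datachars` is hypothetical and not part of this document, and the sketch assumes its input already consists only of characters matching Char.

```python
# Characters that Datachar forbids in literal form, each mapped to the
# only spelling the grammar allows for it.
_DATACHAR_ESCAPES = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "\t": "&#9;",
    "\n": "&#10;",
    "\r": "&#13;",
})


def escape_datachars(text: str) -> str:
    """Return text spelled as a sequence of Datachar tokens.

    str.translate makes a single pass over the input, so the '&' that
    begins an inserted reference is never escaped a second time.
    """
    return text.translate(_DATACHAR_ESCAPES)
```

The same function would serve for both element content and attribute values, since the grammar uses Datachar in both places.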
Attributes are in lexicographical order (in Unicode bit order).

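The ordering rule can be sketched in Python as follows. This assumes that "Unicode bit order" means comparing attribute names character by character on their Unicode code values, which is how Python compares strings; the function name `order_attribute_names` is hypothetical.

```python
def order_attribute_names(names):
    """Return attribute names in the order a canonical start-tag lists them.

    Assumption: "Unicode bit order" is read as code point order. Python
    compares str values code point by code point, so a plain sort suffices.
    """
    return sorted(names)


names = ["id", "Id", "href", "xml:lang", "_x"]
ordered = order_attribute_names(names)

# Sorting the UTF-8 encodings as byte strings gives the same order,
# because UTF-8 preserves code point order.
same_by_bytes = sorted(names, key=lambda name: name.encode("utf-8"))
```

Under that reading, upper-case ASCII letters sort before `_`, which sorts before lower-case letters, so `ordered` is `["Id", "_x", "href", "id", "xml:lang"]`.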
A canonical XML document is encoded in UTF-8.

Ignorable white space is considered significant and is treated equivalently
to data.

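Putting the rules together, a canonicalizer can be sketched with Python's standard `xml.sax` module. This is a minimal sketch under stated assumptions, not a reference implementation: the names `CanonicalWriter` and `canonicalize` are hypothetical, comments and the document type declaration are dropped because the grammar has no production for them, and attribute order is taken to be code point order.

```python
import xml.sax
from xml.sax.saxutils import escape

# escape() already rewrites '&', '<' and '>'; these are the remaining
# characters that Datachar does not allow in literal form.
_EXTRA = {'"': "&quot;", "\t": "&#9;", "\n": "&#10;", "\r": "&#13;"}


class CanonicalWriter(xml.sax.ContentHandler):
    """Collects the canonical form of the events reported by the parser."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def startElement(self, name, attrs):
        self.parts.append("<" + name)
        # Attributes are written sorted by name (assumed code point order).
        for key in sorted(attrs.getNames()):
            value = escape(attrs.getValue(key), _EXTRA)
            self.parts.append(' %s="%s"' % (key, value))
        self.parts.append(">")

    def endElement(self, name):
        # Empty elements are written as a start-tag plus an end-tag.
        self.parts.append("</%s>" % name)

    def characters(self, content):
        self.parts.append(escape(content, _EXTRA))

    def ignorableWhitespace(self, whitespace):
        # Ignorable white space is handled exactly like character data.
        self.characters(whitespace)

    def processingInstruction(self, target, data):
        # Exactly one space separates the target from the data.
        self.parts.append("<?%s %s?>" % (target, data))


def canonicalize(document: bytes) -> bytes:
    """Parse a well-formed XML document and return its canonical form."""
    handler = CanonicalWriter()
    xml.sax.parseString(document, handler)
    # The canonical document is always encoded in UTF-8.
    return "".join(handler.parts).encode("utf-8")
```

For example, `<doc b='2' a="1">x &amp; y<e/></doc>` and `<doc a="1" b="2">x &#38; y<e></e></doc>` differ in quoting, attribute order, the spelling of the ampersand and the empty-element syntax, yet both should canonicalize to the same bytes: `<doc a="1" b="2">x &amp; y<e></e></doc>`.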
<A HREF="mailto:jjc@jclark.com">James Clark</A>