<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"> <head> <meta name="generator" content="HTML Tidy, see www.w3.org" /> <title>XHTML 1.0: The Extensible HyperText Markup Language (Second Edition)</title> <link rel="stylesheet" type="text/css" media="screen" href="xhtml.css" /> <link rel="stylesheet" type="text/css" media="screen" href="https://www.w3.org/StyleSheets/TR/W3C-REC.css" /> <link rel="stylesheet" type="text/css" media="screen" href="superseded.css" /> </head> <body> <div class="head"><a href="https://www.w3.org/"><img height="48" width="72" src="https://www.w3.org/Icons/w3c_home" alt="W3C" /></a> <h1><a name="title" id="title"></a> XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition)</h1> <h2><a name="title2" id="title2"></a> A Reformulation of HTML 4 in XML 1.0</h2> <h2><a name="subtitle" id="subtitle"></a> W3C Recommendation 26 January 2000, revised 1 August 2002<br /> superseded 27 March 2018</h2> <dl> <dt><a id="thisVersion" name="thisVersion">This version</a>:</dt> <dd><a href="https://www.w3.org/TR/2018/SPSD-xhtml1-20180327/">http://www.w3.org/TR/2018/SPSD-xhtml1-20180327/</a></dd> <dt>Latest version:</dt> <dd><a href="https://www.w3.org/TR/xhtml1">http://www.w3.org/TR/xhtml1</a></dd> <dt>Previous version:</dt> <dd><a href="https://www.w3.org/TR/2002/REC-xhtml1-20020801">http://www.w3.org/TR/2002/REC-xhtml1-20020801</a></dd> <dt>Authors:</dt> <dd>See <a href="#acks">acknowledgments</a>.</dd> </dl> <p>Please refer to the <a href="https://www.w3.org/2002/08/REC-xhtml1-20020801-errata"><strong>errata</strong></a> for this document, which may include some normative corrections. 
See also <a href="https://www.w3.org/MarkUp/translations"><strong>translations</strong></a>.</p> <p>This document is also available in these non-normative formats: <a href="Cover.html">Multi-part XHTML file</a>, <a href="xhtml1.ps">PostScript version</a>, <a href="xhtml1.pdf">PDF version</a>, <a href="xhtml1.zip">ZIP archive</a>, and <a href="xhtml1.tgz">Gzip'd TAR archive</a>.</p> <p class="copyright"><a href="https://www.w3.org/Consortium/Legal/ipr-notice-20000612#Copyright">Copyright</a> ©2002 <a href="https://www.w3.org/"><abbr title="World Wide Web Consortium"> W3C</abbr></a><sup>®</sup> (<a href="http://www.lcs.mit.edu/"><abbr title="Massachusetts Institute of Technology">MIT</abbr></a>, <a href="http://www.inria.fr/"><abbr lang="fr" xml:lang="fr" title="Institut National de Recherche en Informatique et Automatique">INRIA</abbr></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="https://www.w3.org/Consortium/Legal/ipr-notice-20000612#Legal_Disclaimer">liability</a>, <a href="https://www.w3.org/Consortium/Legal/ipr-notice-20000612#W3C_Trademarks">trademark</a>, <a href="https://www.w3.org/Consortium/Legal/copyright-documents-19990405">document use</a> and <a href="https://www.w3.org/Consortium/Legal/copyright-software-19980720">software licensing</a> rules apply.</p> <hr /> </div> <h2><a name="abstract" id="abstract"></a> Abstract</h2> <p>This specification defines the Second Edition of <abbr title="Extensible Hypertext Markup Language">XHTML</abbr> 1.0, a reformulation of HTML 4 as an <abbr title="Extensible Markup Language"> XML</abbr> 1.0 application, and three <abbr title="Document Type Definitions">DTDs</abbr> corresponding to the ones defined by HTML 4. The semantics of the elements and their attributes are defined in the W3C Recommendation for HTML 4. These semantics provide the foundation for future extensibility of XHTML. 
Compatibility with existing HTML user agents is possible by following a small set of guidelines.</p> <h2><a name="status" id="status"></a> Status of this document</h2> <p><em>This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.</em></p> <p>This specification is a <a href="https://www.w3.org/2018/Process-20180201/#rec-rescind">Superseded Recommendation</a>. A newer specification exists that is recommended for new adoption in place of this specification. New implementations should follow the <a href="https://www.w3.org/TR/html/">latest version</a> of the HTML specification.</p> <p>This document is the second edition of the XHTML 1.0 specification incorporating the errata changes as of 1 August 2002. Changes between this version and the previous Recommendation are illustrated in a <a href="xhtml1-diff.html">diff-marked version</a>.</p> <p>This second edition is <em>not</em> a new version of XHTML 1.0 (first published 26 January 2000). The changes in this document reflect corrections applied as a result of comments submitted by the community and as a result of ongoing work within the HTML Working Group. There are no substantive changes in this document - only the integration of various errata.</p> <p>This document has been produced as part of the <a href="https://www.w3.org/MarkUp/Activity">W3C HTML Activity</a>.</p> <p>At the time of publication, the working group believed there were zero patent disclosures relevant to this specification. A current list of patent disclosures relevant to this specification may be found on the Working Group's <a href="https://www.w3.org/2002/07/HTML-IPR">patent disclosure page</a>.</p> <p>A list of current W3C Recommendations and other technical documents can be found at <a href="https://www.w3.org/TR/">https://www.w3.org/TR/</a>.</p> 
<h1><a name="toc" id="toc"></a> Quick Table of Contents</h1> <div class="toc"> <ul class='toc'> <li class='tocline'>1. <a href="#xhtml" class="tocxref">What is XHTML?</a></li> <li class='tocline'>2. <a href="#defs" class="tocxref">Definitions</a></li> <li class='tocline'>3. <a href="#normative" class="tocxref">Normative Definition of XHTML 1.0</a></li> <li class='tocline'>4. <a href="#diffs" class="tocxref">Differences with HTML 4</a></li> <li class='tocline'>5. <a href="#issues" class="tocxref">Compatibility Issues</a></li> <li class='tocline'>A. <a href="#dtds" class="tocxref">DTDs</a></li> <li class='tocline'>B. <a href="#prohibitions" class="tocxref">Element Prohibitions</a></li> <li class='tocline'>C. <a href="#guidelines" class="tocxref">HTML Compatibility Guidelines</a></li> <li class='tocline'>D. <a href="#acks" class="tocxref">Acknowledgements</a></li> <li class='tocline'>E. <a href="#refs" class="tocxref">References</a></li> </ul> </div> <h1><a name="contents" id="contents"></a> Full Table of Contents</h1> <div class="toc"> <ul class='toc'> <li class='tocline'>1. <a href="#xhtml" class="tocxref">What is XHTML?</a> <ul class="toc"> <li class='tocline'>1.1. <a href="#html4" class="tocxref">What is HTML 4?</a></li> <li class='tocline'>1.2. <a href="#xml" class="tocxref">What is XML?</a></li> <li class='tocline'>1.3. <a href="#why" class="tocxref">Why the need for XHTML?</a></li> </ul> </li> <li class='tocline'>2. <a href="#defs" class="tocxref">Definitions</a> <ul class="toc"> <li class='tocline'>2.1. <a href="#terms" class="tocxref">Terminology</a></li> <li class='tocline'>2.2. <a href="#general" class="tocxref">General Terms</a></li> </ul> </li> <li class='tocline'>3. <a href="#normative" class="tocxref">Normative Definition of XHTML 1.0</a> <ul class="toc"> <li class='tocline'>3.1. <a href="#docconf" class="tocxref">Document Conformance</a> <ul class="toc"> <li class='tocline'>3.1.1. 
<a href="#strict" class="tocxref">Strictly Conforming Documents</a></li> <li class='tocline'>3.1.2. <a href="#well-formed" class="tocxref">Using XHTML with other namespaces</a></li> </ul> </li> <li class='tocline'>3.2. <a href="#uaconf" class="tocxref">User Agent Conformance</a></li> </ul> </li> <li class='tocline'>4. <a href="#diffs" class="tocxref">Differences with HTML 4</a> <ul class="toc"> <li class='tocline'>4.1. <a href="#h-4.1" class="tocxref">Documents must be well-formed</a></li> <li class='tocline'>4.2. <a href="#h-4.2" class="tocxref">Element and attribute names must be in lower case</a></li> <li class='tocline'>4.3. <a href="#h-4.3" class="tocxref">For non-empty elements, end tags are required</a></li> <li class='tocline'>4.4. <a href="#h-4.4" class="tocxref">Attribute values must always be quoted</a></li> <li class='tocline'>4.5. <a href="#h-4.5" class="tocxref">Attribute Minimization</a></li> <li class='tocline'>4.6. <a href="#h-4.6" class="tocxref">Empty Elements</a></li> <li class='tocline'>4.7. <a href="#h-4.7" class="tocxref">White Space handling in attribute values</a></li> <li class='tocline'>4.8. <a href="#h-4.8" class="tocxref">Script and Style elements</a></li> <li class='tocline'>4.9. <a href="#h-4.9" class="tocxref">SGML exclusions</a></li> <li class='tocline'>4.10. <a href="#h-4.10" class="tocxref">The elements with 'id' and 'name' attributes</a></li> <li class='tocline'>4.11. <a href="#h-4.11" class="tocxref">Attributes with pre-defined value sets</a></li> <li class='tocline'>4.12. <a href="#h-4.12" class="tocxref">Entity references as hex values</a></li> </ul> </li> <li class='tocline'>5. <a href="#issues" class="tocxref">Compatibility Issues</a> <ul class="toc"> <li class='tocline'>5.1. <a href="#media" class="tocxref">Internet Media Type</a></li> </ul> </li> <li class='tocline'>A. <a href="#dtds" class="tocxref">DTDs</a> <ul class="toc"> <li class='tocline'>A.1. 
<a href="#h-A1" class="tocxref">Document Type Definitions</a> <ul class="toc"> <li class='tocline'>A.1.1. <a href="#a_dtd_XHTML-1.0-Strict" class="tocxref">XHTML-1.0-Strict</a></li> <li class='tocline'>A.1.2. <a href="#a_dtd_XHTML-1.0-Transitional" class="tocxref">XHTML-1.0-Transitional</a></li> <li class='tocline'>A.1.3. <a href="#a_dtd_XHTML-1.0-Frameset" class="tocxref">XHTML-1.0-Frameset</a></li> </ul> </li> <li class='tocline'>A.2. <a href="#h-A2" class="tocxref">Entity Sets</a> <ul class="toc"> <li class='tocline'>A.2.1. <a href="#a_dtd_Latin-1_characters" class="tocxref">Latin-1 characters</a></li> <li class='tocline'>A.2.2. <a href="#a_dtd_Special_characters" class="tocxref">Special characters</a></li> <li class='tocline'>A.2.3. <a href="#a_dtd_Symbols" class="tocxref">Symbols</a></li> </ul> </li> </ul> </li> <li class='tocline'>B. <a href="#prohibitions" class="tocxref">Element Prohibitions</a></li> <li class='tocline'>C. <a href="#guidelines" class="tocxref">HTML Compatibility Guidelines</a> <ul class="toc"> <li class='tocline'>C.1. <a href="#C_1" class="tocxref">Processing Instructions and the XML Declaration</a></li> <li class='tocline'>C.2. <a href="#C_2" class="tocxref">Empty Elements</a></li> <li class='tocline'>C.3. <a href="#C_3" class="tocxref"> Element Minimization and Empty Element Content</a></li> <li class='tocline'>C.4. <a href="#C_4" class="tocxref">Embedded Style Sheets and Scripts</a></li> <li class='tocline'>C.5. <a href="#C_5" class="tocxref">Line Breaks within Attribute Values</a></li> <li class='tocline'>C.6. <a href="#C_6" class="tocxref">Isindex</a></li> <li class='tocline'>C.7. <a href="#C_7" class="tocxref">The <code>lang</code> and <code>xml:lang</code> Attributes</a></li> <li class='tocline'>C.8. <a href="#C_8" class="tocxref">Fragment Identifiers</a></li> <li class='tocline'>C.9. <a href="#C_9" class="tocxref">Character Encoding</a></li> <li class='tocline'>C.10. 
<a href="#C_10" class="tocxref">Boolean Attributes</a></li> <li class='tocline'>C.11. <a href="#C_11" class="tocxref">Document Object Model and XHTML</a></li> <li class='tocline'>C.12. <a href="#C_12" class="tocxref">Using Ampersands in Attribute Values (and Elsewhere)</a></li> <li class='tocline'>C.13. <a href="#C_13" class="tocxref">Cascading Style Sheets (CSS) and XHTML</a></li> <li class='tocline'>C.14. <a href="#C_14" class="tocxref">Referencing Style Elements when serving as XML</a></li> <li class='tocline'>C.15. <a href="#C_15" class="tocxref">White Space Characters in HTML vs. XML</a></li> <li class='tocline'>C.16. <a href="#C_16" class="tocxref">The Named Character Reference &amp;apos;</a></li> </ul> </li> <li class='tocline'>D. <a href="#acks" class="tocxref">Acknowledgements</a></li> <li class='tocline'>E. <a href="#refs" class="tocxref">References</a></li> </ul> </div> <!-- INCLUDING introduction.mhtml --><!--OddPage--> <h1><a name="xhtml" id="xhtml">1.</a> What is XHTML?</h1> <p><strong>This section is informative.</strong></p> <p>XHTML is a family of current and future document types and modules that reproduce, subset, and extend HTML 4 [<a class="nref" href="#ref-html4">HTML4</a>]. XHTML family document types are <abbr title="Extensible Markup Language">XML</abbr> based, and ultimately are designed to work in conjunction with XML-based user agents. This family and its evolution are discussed in more detail in [<a class="nref" href="#ref-xhtmlmod">XHTMLMOD</a>].</p> <p>XHTML 1.0 (this specification) is the first document type in the XHTML family. It is a reformulation of the three HTML 4 document types as applications of XML 1.0 [<a class="nref" href= "#ref-xml">XML</a>]. It is intended to be used as a language for content that is both XML-conforming and, if some simple <a href="#guidelines">guidelines</a> are followed, operates in HTML 4 conforming user agents. 
Developers who migrate their content to XHTML 1.0 will realize the following benefits:</p> <ul> <li>XHTML documents are XML conforming. As such, they are readily viewed, edited, and validated with standard XML tools.</li> <li>XHTML documents can be written to operate as well as or better than they did before in existing HTML 4-conforming user agents as well as in new, XHTML 1.0 conforming user agents.</li> <li>XHTML documents can utilize applications (e.g. scripts and applets) that rely upon either the HTML Document Object Model or the XML Document Object Model [<a class="nref" href= "#ref-dom">DOM</a>].</li> <li>As the XHTML family evolves, documents conforming to XHTML 1.0 will be more likely to interoperate within and among various XHTML environments.</li> </ul> <p>The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content's backward and future compatibility.</p> <h2><a name="html4" id="html4">1.1.</a> What is HTML 4?</h2> <p>HTML 4 [<a class="nref" href="#ref-html4">HTML4</a>] is an <abbr title="Standard Generalized Markup Language">SGML</abbr> (Standard Generalized Markup Language) application conforming to International Standard <abbr title="International Organization for Standardization">ISO</abbr> 8879, and is widely regarded as the standard publishing language of the World Wide Web.</p> <p>SGML is a language for describing markup languages, particularly those used in electronic document exchange, document management, and document publishing. HTML is an example of a language defined in SGML.</p> <p>SGML has been around since the mid-1980s and has remained quite stable. Much of this stability stems from the fact that the language is both feature-rich and flexible. 
This flexibility, however, comes at a price, and that price is a level of complexity that has inhibited its adoption in a diversity of environments, including the World Wide Web.</p> <p>HTML, as originally conceived, was to be a language for the exchange of scientific and other technical documents, suitable for use by non-document specialists. HTML addressed the problem of SGML complexity by specifying a small set of structural and semantic tags suitable for authoring relatively simple documents. In addition to simplifying the document structure, HTML added support for hypertext. Multimedia capabilities were added later.</p> <p>In a remarkably short space of time, HTML became wildly popular and rapidly outgrew its original purpose. Since HTML's inception, there has been rapid invention of new elements for use within HTML (as a standard) and for adapting HTML to vertical, highly specialized, markets. This plethora of new elements has led to interoperability problems for documents across different platforms.</p> <h2><a name="xml" id="xml">1.2.</a> What is XML?</h2> <p>XML™ is the shorthand name for Extensible Markup Language [<a class="nref" href="#ref-xml">XML</a>].</p> <p>XML was conceived as a means of regaining the power and flexibility of SGML without most of its complexity. Although a restricted form of SGML, XML nonetheless preserves most of SGML's power and richness, and yet still retains all of SGML's commonly used features.</p> <p>While retaining these beneficial features, XML removes many of the more complex features of SGML that make the authoring and design of suitable software both difficult and costly.</p> <h2><a name="why" id="why">1.3.</a> Why the need for XHTML?</h2> <p>The benefits of migrating to XHTML 1.0 are described above. Some of the benefits of migrating to XHTML in general are:</p> <ul> <li>Document developers and user agent designers are constantly discovering new ways to express their ideas through new markup. 
In XML, it is relatively easy to introduce new elements or additional element attributes. The XHTML family is designed to accommodate these extensions through XHTML modules and techniques for developing new XHTML-conforming modules (described in the XHTML Modularization specification). These modules will permit the combination of existing and new feature sets when developing content and when designing new user agents.</li> <li>Alternate ways of accessing the Internet are constantly being introduced. The XHTML family is designed with general user agent interoperability in mind. Through a new user agent and document profiling mechanism, servers, proxies, and user agents will be able to perform best effort content transformation. Ultimately, it will be possible to develop XHTML-conforming content that is usable by any XHTML-conforming user agent.</li> </ul> <!-- END OF FILE introduction.mhtml --><!-- INCLUDING definitions.mhtml --><!--OddPage--> <h1><a name="defs" id="defs">2.</a> Definitions</h1> <p><strong>This section is normative.</strong></p> <h2><a name="terms" id="terms">2.1.</a> Terminology</h2> <p>The following terms are used in this specification. These terms extend the definitions in [<a class="nref" href="#ref-rfc2119">RFC2119</a>] in ways based upon similar definitions in ISO/<abbr title="International Electro-technical Commission">IEC</abbr> 9945-1:1990 [<a class="nref" href="#ref-posix.1">POSIX.1</a>]:</p> <dl> <dt>May</dt> <dd>With respect to implementations, the word "may" is to be interpreted as an optional feature that is not required in this specification but can be provided. With respect to <a href= "#docconf">Document Conformance</a>, the word "may" means that the optional feature must not be used. 
The term "optional" has the same definition as "may".</dd> <dt>Must</dt> <dd>In this specification, the word "must" is to be interpreted as a mandatory requirement on the implementation or on Strictly Conforming XHTML Documents, depending upon the context. The term "shall" has the same definition as "must".</dd> <dt>Optional</dt> <dd>See "May".</dd> <dt>Reserved</dt> <dd>A value or behavior is unspecified, but it is not allowed to be used by Conforming Documents nor to be supported by Conforming User Agents.</dd> <dt>Shall</dt> <dd>See "Must".</dd> <dt>Should</dt> <dd>With respect to implementations, the word "should" is to be interpreted as an implementation recommendation, but not a requirement. With respect to documents, the word "should" is to be interpreted as recommended programming practice for documents and a requirement for Strictly Conforming XHTML Documents.</dd> <dt>Supported</dt> <dd>Certain facilities in this specification are optional. If a facility is supported, it behaves as specified by this specification.</dd> <dt>Unspecified</dt> <dd>When a value or behavior is unspecified, the specification defines no portability requirements for a facility on an implementation even when faced with a document that uses the facility. A document that requires specific behavior in such an instance, rather than tolerating any behavior when using that facility, is not a Strictly Conforming XHTML Document.</dd> </dl> <h2><a name="general" id="general">2.2.</a> General Terms</h2> <dl> <dt>Attribute</dt> <dd>An attribute is a parameter to an element declared in the DTD. 
An attribute's type and value range, including a possible default value, are defined in the DTD.</dd> <dt>DTD</dt> <dd>A DTD, or document type definition, is a collection of XML markup declarations that, as a collection, defines the legal structure, <span class="term">elements</span>, and <span class="term"> attributes</span> that are available for use in a document that complies to the DTD.</dd> <dt>Document</dt> <dd>A document is a stream of data that, after being combined with any other streams it references, is structured such that it holds information contained within <span class="term">elements</span> that are organized as defined in the associated <span class="term">DTD</span>. See <a href="#docconf">Document Conformance</a> for more information.</dd> <dt>Element</dt> <dd>An element is a document structuring unit declared in the <span class="term">DTD</span>. The element's content model is defined in the <span class="term">DTD</span>, and additional semantics may be defined in the prose description of the element.</dd> <dt><a name="facilities" id="facilities">Facilities</a></dt> <dd>Facilities are <span class="term">elements</span>, <span class="term">attributes</span>, and the semantics associated with those <span class="term">elements</span> and <span class="term"> attributes</span>.</dd> <dt>Implementation</dt> <dd>See User Agent.</dd> <dt>Parsing</dt> <dd>Parsing is the act whereby a <span class="term">document</span> is scanned, and the information contained within the <span class="term">document</span> is filtered into the context of the <span class="term">elements</span> in which the information is structured.</dd> <dt>Rendering</dt> <dd>Rendering is the act whereby the information in a <span class="term">document</span> is presented. This presentation is done in the form most appropriate to the environment (e.g. 
aurally, visually, in print).</dd> <dt>User Agent</dt> <dd>A user agent is a system that processes XHTML documents in accordance with this specification. See <a href="#uaconf">User Agent Conformance</a> for more information.</dd> <dt>Validation</dt> <dd>Validation is a process whereby <span class="term">documents</span> are verified against the associated <span class="term">DTD</span>, ensuring that the structure, use of <span class="term"> elements</span>, and use of <span class="term">attributes</span> are consistent with the definitions in the <span class="term">DTD</span>.</dd> <dt><a name="wellformed" id="wellformed">Well-formed</a></dt> <dd>A <span class="term">document</span> is well-formed when it is structured according to the rules defined in <a href="https://www.w3.org/TR/REC-xml#sec-well-formed">Section 2.1</a> of the XML 1.0 Recommendation [<a class="nref" href="#ref-xml">XML</a>].</dd> </dl> <!-- END OF FILE definitions.mhtml --><!-- INCLUDING normative.mhtml --><!--OddPage--> <h1><a name="normative" id="normative">3.</a> Normative Definition of XHTML 1.0</h1> <p><strong>This section is normative.</strong></p> <h2><a name="docconf" id="docconf">3.1.</a> Document Conformance</h2> <p>This version of XHTML provides a definition of strictly conforming XHTML 1.0 documents, which are restricted to elements and attributes from the XML and XHTML 1.0 namespaces. See <a href= "#well-formed">Section 3.1.2</a> for information on using XHTML with other namespaces, for instance, to include metadata expressed in <abbr title="Resource Description Format">RDF</abbr> within XHTML documents.</p> <h3><a name="strict" id="strict">3.1.1.</a> Strictly Conforming Documents</h3> <p>A Strictly Conforming XHTML Document is an XML document that requires only the facilities described as mandatory in this specification. 
Such a document must meet all of the following criteria:</p> <ol> <li> <p>It must conform to the constraints expressed in one of the three DTDs found in <a href="#dtds">DTDs</a> and in <a href="#prohibitions">Appendix B</a>.</p> </li> <li> <p>The root element of the document must be <code>html</code>.</p> </li> <li> <p>The root element of the document must contain an <code>xmlns</code> declaration for the XHTML namespace [<a class="nref" href="#ref-xmlns">XMLNS</a>]. The namespace for XHTML is defined to be <code>http://www.w3.org/1999/xhtml</code>. An example root element might look like:</p> <div class="good"> <pre>
&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"&gt;
</pre> </div> </li> <li> <p>There must be a DOCTYPE declaration in the document prior to the root element. The public identifier included in the DOCTYPE declaration must reference one of the three DTDs found in <a href= "#dtds">DTDs</a> using the respective Formal Public Identifier. The system identifier may be changed to reflect local system conventions.</p> <pre>
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"&gt;
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd"&gt;
</pre> </li> <li> <p>The DTD subset must not be used to override any parameter entities in the DTD.</p> </li> </ol> <p>An XML declaration is not required in all XML documents; however XHTML document authors are strongly encouraged to use XML declarations in all their documents. Such a declaration is required when the character encoding of the document is other than the default UTF-8 or UTF-16 and no encoding was determined by a higher-level protocol. Here is an example of an XHTML document. 
In this example, the XML declaration is included.</p> <div class="good"> <pre> <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title>Virtual Library</title> </head> <body> <p>Moved to <a href="http://example.org/">example.org</a>.</p> </body> </html> </pre> </div> <h3><a name="well-formed" id="well-formed">3.1.2.</a> Using XHTML with other namespaces</h3> <p>The XHTML namespace may be used with other XML namespaces as per [<a class="nref" href="#ref-xmlns">XMLNS</a>], although such documents are not strictly conforming XHTML 1.0 documents as defined above. Work by W3C is addressing ways to specify conformance for documents involving multiple namespaces. For an example, see [<a class="nref" href= "#ref-xhtml-mathml">XHTML+MathML</a>].</p> <p>The following example shows the way in which XHTML 1.0 could be used in conjunction with the MathML Recommendation:</p> <div class="good"> <pre> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title>A Math Example</title> </head> <body> <p>The following is MathML markup:</p> <math xmlns="http://www.w3.org/1998/Math/MathML"> <apply> <log/> <logbase> <cn> 3 </cn> </logbase> <ci> x </ci> </apply> </math> </body> </html> </pre> </div> <p>The following example shows the way in which XHTML 1.0 markup could be incorporated into another XML namespace:</p> <div class="good"> <pre> <?xml version="1.0" encoding="UTF-8"?> <!-- initially, the default namespace is "books" --> <book xmlns='urn:loc.gov:books' xmlns:isbn='urn:ISBN:0-395-36341-6' xml:lang="en" lang="en"> <title>Cheaper by the Dozen</title> <isbn:number>1568491379</isbn:number> <notes> <!-- make HTML the default namespace for a hypertext commentary --> <p xmlns='http://www.w3.org/1999/xhtml'> This is also available <a href="http://www.w3.org/">online</a>. 
</p> </notes> </book> </pre> </div> <h2><a name="uaconf" id="uaconf">3.2.</a> User Agent Conformance</h2> <p>A conforming user agent must meet all of the following criteria:</p> <ol> <li>In order to be consistent with the XML 1.0 Recommendation [<a class="nref" href="#ref-xml">XML</a>], the user agent must parse and evaluate an XHTML document for well-formedness. If the user agent claims to be a validating user agent, it must also validate documents against their referenced DTDs according to [<a class="nref" href="#ref-xml">XML</a>].</li> <li>When the user agent claims to support <a href="#facilities">facilities</a> defined within this specification or required by this specification through normative reference, it must do so in ways consistent with the facilities' definition.</li> <li>When a user agent processes an XHTML document as generic XML, it shall only recognize attributes of type <code>ID</code> (i.e. the <code>id</code> attribute on most XHTML elements) as fragment identifiers.</li> <li>If a user agent encounters an element it does not recognize, it must process the element's content.</li> <li>If a user agent encounters an attribute it does not recognize, it must ignore the entire attribute specification (i.e., the attribute and its value).</li> <li>If a user agent encounters an attribute value it does not recognize, it must use the default attribute value.</li> <li>If it encounters an entity reference (other than one of the entities defined in this recommendation or in the XML recommendation) for which the user agent has processed no declaration (which could happen if the declaration is in the external subset which the user agent hasn't read), the entity reference should be processed as the characters (starting with the ampersand and ending with the semi-colon) that make up the entity reference.</li> <li>When processing content, user agents that encounter characters or character entity references that are recognized but not renderable may substitute 
another rendering that gives the same meaning, or must display the document in such a way that it is obvious to the user that normal rendering has not taken place.</li> <li> <p>White space is handled according to the following rules. The following characters are defined in [<a class="nref" href="#ref-xml">XML</a>] as white space characters:</p> <ul> <li>SPACE (&amp;#x0020;)</li> <li>HORIZONTAL TABULATION (&amp;#x0009;)</li> <li>CARRIAGE RETURN (&amp;#x000D;)</li> <li>LINE FEED (&amp;#x000A;)</li> </ul> <p>The XML processor normalizes different systems' line end codes into one single LINE FEED character, that is passed up to the application.</p> <p>The user agent must use the definition from CSS for processing whitespace characters [<a class="nref" href="#ref-css2">CSS2</a>]. <em>Note that the CSS2 recommendation does not explicitly address the issue of whitespace handling in non-Latin character sets. This will be addressed in a future version of CSS, at which time this reference will be updated.</em></p> </li> </ol> <p>Note that in order to produce a Canonical XHTML document, the rules above must be applied and the rules in [<a class="nref" href="#ref-xmlc14n">XMLC14N</a>] must also be applied to the document.</p> <!-- END OF FILE normative.mhtml --><!-- INCLUDING diffs.mhtml --><!--OddPage--> <h1><a name="diffs" id="diffs">4.</a> Differences with HTML 4</h1> <p><strong>This section is informative.</strong></p> <p>Because XHTML is an XML application, certain practices that were perfectly legal in SGML-based HTML 4 [<a class="nref" href="#ref-html4">HTML4</a>] must be changed.</p> <h2><a name="h-4.1" id="h-4.1">4.1.</a> Documents must be well-formed</h2> <p><a href="#wellformed">Well-formedness</a> is a new concept introduced by [<a class="nref" href="#ref-xml">XML</a>]. 
Essentially this means that all elements must either have closing tags or be written in a special form (as described below), and that all the elements must nest properly.</p> <p>Although overlapping is illegal in SGML, it is widely tolerated in existing browsers.</p> <p><strong><em>CORRECT: nested elements</em></strong></p> <div class="good"> <p>&lt;p&gt;here is an emphasized &lt;em&gt;paragraph&lt;/em&gt;.&lt;/p&gt;</p> </div> <p><strong><em>INCORRECT: overlapping elements</em></strong></p> <div class="bad"> <p>&lt;p&gt;here is an emphasized &lt;em&gt;paragraph.&lt;/p&gt;&lt;/em&gt;</p> </div> <h2><a name="h-4.2" id="h-4.2">4.2.</a> Element and attribute names must be in lower case</h2> <p>XHTML documents must use lower case for all HTML element and attribute names. This difference is necessary because XML is case-sensitive, e.g. &lt;li&gt; and &lt;LI&gt; are different tags.</p> <h2><a name="h-4.3" id="h-4.3">4.3.</a> For non-empty elements, end tags are required</h2> <p>In SGML-based HTML 4 certain elements were permitted to omit the end tag, with the elements that followed implying closure. XML does not allow end tags to be omitted. All elements other than those declared in the DTD as <code>EMPTY</code> must have an end tag. 
Elements that are declared in the DTD as <code>EMPTY</code> can have an end tag <em>or</em> can use empty element shorthand (see <a href="#h-4.6">Empty Elements</a>).</p> <p><strong><em>CORRECT: terminated elements</em></strong></p> <div class="good"> <p>&lt;p&gt;here is a paragraph.&lt;/p&gt;&lt;p&gt;here is another paragraph.&lt;/p&gt;</p> </div> <p><strong><em>INCORRECT: unterminated elements</em></strong></p> <div class="bad"> <p>&lt;p&gt;here is a paragraph.&lt;p&gt;here is another paragraph.</p> </div> <h2><a name="h-4.4" id="h-4.4">4.4.</a> Attribute values must always be quoted</h2> <p>All attribute values must be quoted, even those which appear to be numeric.</p> <p><strong><em>CORRECT: quoted attribute values</em></strong></p> <div class="good"> <p>&lt;td rowspan="3"&gt;</p> </div> <p><strong><em>INCORRECT: unquoted attribute values</em></strong></p> <div class="bad"> <p>&lt;td rowspan=3&gt;</p> </div> <h2><a name="h-4.5" id="h-4.5">4.5.</a> Attribute Minimization</h2> <p>XML does not support attribute minimization. Attribute-value pairs must be written in full. Attribute names such as <code>compact</code> and <code>checked</code> cannot occur in elements without their value being specified.</p> <p><strong><em>CORRECT: unminimized attributes</em></strong></p> <div class="good"> <p>&lt;dl compact="compact"&gt;</p> </div> <p><strong><em>INCORRECT: minimized attributes</em></strong></p> <div class="bad"> <p>&lt;dl compact&gt;</p> </div> <h2><a name="h-4.6" id="h-4.6">4.6.</a> Empty Elements</h2> <p>Empty elements must either have an end tag or the start tag must end with <code>/&gt;</code>. For instance, <code>&lt;br/&gt;</code> or <code>&lt;hr&gt;&lt;/hr&gt;</code>. 
See <a href="#guidelines">HTML Compatibility Guidelines</a> for information on ways to ensure this is backward compatible with HTML 4 user agents.</p> <p><strong><em>CORRECT: terminated empty elements</em></strong></p> <div class="good"> <p>&lt;br/&gt;&lt;hr/&gt;</p> </div> <p><strong><em>INCORRECT: unterminated empty elements</em></strong></p> <div class="bad"> <p>&lt;br&gt;&lt;hr&gt;</p> </div> <h2><a name="h-4.7" id="h-4.7">4.7.</a> White Space handling in attribute values</h2> <p>When user agents process attributes, they do so according to <a href="https://www.w3.org/TR/REC-xml#AVNormalize">Section 3.3.3</a> of [<a class="nref" href="#ref-xml">XML</a>]:</p> <ul> <li>Strip leading and trailing white space.</li> <li>Map sequences of one or more white space characters (including line breaks) to a single inter-word space.</li> </ul> <h2><a name="h-4.8" id="h-4.8">4.8.</a> Script and Style elements</h2> <p>In XHTML, the script and style elements are declared as having <code>#PCDATA</code> content. As a result, <code>&lt;</code> and <code>&amp;</code> will be treated as the start of markup, and entities such as <code>&amp;lt;</code> and <code>&amp;amp;</code> will be recognized as entity references by the XML processor to <code>&lt;</code> and <code>&amp;</code> respectively. Wrapping the content of the script or style element within a <code>CDATA</code> marked section avoids the expansion of these entities.</p> <div class="good"> <pre>
&lt;script type="text/javascript"&gt;
&lt;![CDATA[
... unescaped script content ...
]]&gt;
&lt;/script&gt;
</pre> </div> <p><code>CDATA</code> sections are recognized by the XML processor and appear as nodes in the Document Object Model; see <a href="https://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-E067D597">Section 1.3</a> of the DOM Level 1 Recommendation [<a class="nref" href="#ref-dom">DOM</a>].</p> <p>An alternative is to use external script and style documents.</p> <h2><a name="h-4.9" id="h-4.9">4.9.</a> SGML exclusions</h2> <p>SGML gives the writer of a DTD the ability to exclude specific elements from being contained within an element. Such prohibitions (called "exclusions") are not possible in XML.</p> <p>For example, the HTML 4 Strict DTD forbids the nesting of an '<code>a</code>' element within another '<code>a</code>' element to any descendant depth. It is not possible to spell out such prohibitions in XML. Even though these prohibitions cannot be defined in the DTD, certain elements should not be nested. A summary of such elements and the elements that should not be nested in them is found in the normative <a href="#prohibitions">Element Prohibitions</a>.</p> <h2><a name="h-4.10" id="h-4.10">4.10.</a> The elements with 'id' and 'name' attributes</h2> <p>HTML 4 defined the <code>name</code> attribute for the elements <code>a</code>, <code>applet</code>, <code>form</code>, <code>frame</code>, <code>iframe</code>, <code>img</code>, and <code> map</code>. HTML 4 also introduced the <code>id</code> attribute. Both of these attributes are designed to be used as fragment identifiers.</p> <p>In XML, fragment identifiers are of type <code>ID</code>, and there can only be a single attribute of type <code>ID</code> per element. Therefore, in XHTML 1.0 the <code>id</code> attribute is defined to be of type <code>ID</code>. In order to ensure that XHTML 1.0 documents are well-structured XML documents, XHTML 1.0 documents MUST use the <code>id</code> attribute when defining fragment identifiers on the elements listed above. 
See the <a href="#guidelines">HTML Compatibility Guidelines</a> for information on ensuring such anchors are backward compatible when serving XHTML documents as media type <code>text/html</code>.</p> <p>Note that in XHTML 1.0, the <code>name</code> attribute of these elements is formally deprecated, and will be removed in a subsequent version of XHTML.</p> <h2><a name="h-4.11" id="h-4.11">4.11.</a> Attributes with pre-defined value sets</h2> <p>HTML 4 and XHTML both have some attributes that have pre-defined and limited sets of values (e.g. the <code>type</code> attribute of the <code>input</code> element). In SGML and XML, these are called <em>enumerated attributes</em>. Under HTML 4, the interpretation of these values was <em>case-insensitive</em>, so a value of <code>TEXT</code> was equivalent to a value of <code>text</code>. Under XML, the interpretation of these values is <em>case-sensitive</em>, and in XHTML 1 all of these values are defined in lower-case.</p> <h2><a name="h-4.12" id="h-4.12">4.12.</a> Entity references as hex values</h2> <p>SGML and XML both permit references to characters by using hexadecimal values. In SGML these references could be made using either &amp;#Xnn; or &amp;#xnn;. In XML documents, you must use the lower-case version (i.e. &amp;#xnn;).</p> <!-- END OF FILE diffs.mhtml --><!-- INCLUDING issues.mhtml --><!--OddPage--> <h1><a name="issues" id="issues">5.</a> Compatibility Issues</h1> <p><strong>This section is normative.</strong></p> <p>Although there is no requirement for XHTML 1.0 documents to be compatible with existing user agents, in practice this is easy to accomplish. 
Guidelines for creating compatible documents can be found in <a href="#guidelines">Appendix C</a>.</p> <h2><a name="media" id="media">5.1.</a> Internet Media Type</h2> <p>XHTML Documents which follow the guidelines set forth in <a href="#guidelines">Appendix C</a>, "HTML Compatibility Guidelines" may be labeled with the Internet Media Type "text/html" [<a class="nref" href="#ref-rfc2854">RFC2854</a>], as they are compatible with most HTML browsers. Those documents, and any other document conforming to this specification, may also be labeled with the Internet Media Type "application/xhtml+xml" as defined in [<a class="nref" href="#ref-rfc3236">RFC3236</a>]. For further information on using media types with XHTML, see the informative note [<a class="nref" href="#ref-xhtmlmime">XHTMLMIME</a>].</p> <!-- END OF FILE issues.mhtml --><!-- Appendices --><!-- INCLUDING dtds.mhtml --><!--OddPage--> <h1><a name="dtds" id="dtds">A.</a> DTDs</h1> <p><strong>This appendix is normative.</strong></p> <p>These DTDs and entity sets form a normative part of this specification. The complete set of DTD files together with an XML declaration and SGML Open Catalog is included in the <a href= "xhtml1.zip">zip file</a> and the <a href="xhtml1.tgz">gzip'd tar file</a> for this specification. Users looking for local copies of the DTDs to work with should download and use those archives rather than using the specific DTDs referenced below.</p> <h2><a name="h-A1" id="h-A1">A.1.</a> Document Type Definitions</h2> <p>These DTDs approximate the HTML 4 DTDs. The W3C recommends that you use the authoritative versions of these DTDs at their defined SYSTEM identifiers when validating content. If you need to use these DTDs locally you should download one of the archives of <a href="Overview.html#thisVersion">this version</a>. 
For completeness, the normative versions of the DTDs are included here:</p> <h3><a name="a_dtd_XHTML-1.0-Strict" id="a_dtd_XHTML-1.0-Strict">A.1.1.</a> XHTML-1.0-Strict</h3> <p>The file <a href="DTD/xhtml1-strict.dtd">DTD/xhtml1-strict.dtd</a> is a normative part of this specification. The annotated contents of this file are available in this <a href= "./dtds.html#a_dtd_XHTML-1.0-Strict">separate section</a> for completeness.</p> <h3><a name="a_dtd_XHTML-1.0-Transitional" id="a_dtd_XHTML-1.0-Transitional">A.1.2.</a> XHTML-1.0-Transitional</h3> <p>The file <a href="DTD/xhtml1-transitional.dtd">DTD/xhtml1-transitional.dtd</a> is a normative part of this specification. The annotated contents of this file are available in this <a href= "./dtds.html#a_dtd_XHTML-1.0-Transitional">separate section</a> for completeness.</p> <h3><a name="a_dtd_XHTML-1.0-Frameset" id="a_dtd_XHTML-1.0-Frameset">A.1.3.</a> XHTML-1.0-Frameset</h3> <p>The file <a href="DTD/xhtml1-frameset.dtd">DTD/xhtml1-frameset.dtd</a> is a normative part of this specification. The annotated contents of this file are available in this <a href= "./dtds.html#a_dtd_XHTML-1.0-Frameset">separate section</a> for completeness.</p> <h2><a name="h-A2" id="h-A2">A.2.</a> Entity Sets</h2> <p>The XHTML entity sets are the same as for HTML 4, but have been modified to be valid XML 1.0 entity declarations. Note the entity for the Euro currency sign (<code>&amp;euro;</code> or <code> &amp;#8364;</code> or <code>&amp;#x20AC;</code>) is defined as part of the special characters.</p> <h3><a name="a_dtd_Latin-1_characters" id="a_dtd_Latin-1_characters">A.2.1.</a> Latin-1 characters</h3> <p>The file <a href="DTD/xhtml-lat1.ent">DTD/xhtml-lat1.ent</a> is a normative part of this specification. 
The annotated contents of this file are available in this <a href= "./dtds.html#a_dtd_Latin-1_characters">separate section</a> for completeness.</p> <h3><a name="a_dtd_Special_characters" id="a_dtd_Special_characters">A.2.2.</a> Special characters</h3> <p>The file <a href="DTD/xhtml-special.ent">DTD/xhtml-special.ent</a> is a normative part of this specification. The annotated contents of this file are available in this <a href= "./dtds.html#a_dtd_Special_characters">separate section</a> for completeness.</p> <h3><a name="a_dtd_Symbols" id="a_dtd_Symbols">A.2.3.</a> Symbols</h3> <p>The file <a href="DTD/xhtml-symbol.ent">DTD/xhtml-symbol.ent</a> is a normative part of this specification. The annotated contents of this file are available in this <a href= "./dtds.html#a_dtd_Symbols">separate section</a> for completeness.</p> <!-- END OF FILE dtds.mhtml --><!-- INCLUDING prohibitions.mhtml --><!--OddPage--> <h1><a name="prohibitions" id="prohibitions">B.</a> Element Prohibitions</h1> <p><strong>This appendix is normative.</strong></p> <p>The following elements have prohibitions on which elements they can contain (see <a href="#h-4.9">SGML Exclusions</a>). This prohibition applies to all depths of nesting, i.e. 
it contains all the descendant elements.</p> <dl> <dt><code class="tag">a</code></dt> <dd>must not contain other <code>a</code> elements.</dd> <dt><code class="tag">pre</code></dt> <dd>must not contain the <code>img</code>, <code>object</code>, <code>big</code>, <code>small</code>, <code>sub</code>, or <code>sup</code> elements.</dd> <dt><code class="tag">button</code></dt> <dd>must not contain the <code>input</code>, <code>select</code>, <code>textarea</code>, <code>label</code>, <code>button</code>, <code>form</code>, <code>fieldset</code>, <code>iframe</code> or <code>isindex</code> elements.</dd> <dt><code class="tag">label</code></dt> <dd>must not contain other <code class="tag">label</code> elements.</dd> <dt><code class="tag">form</code></dt> <dd>must not contain other <code>form</code> elements.</dd> </dl> <!-- END OF FILE prohibitions.mhtml --><!-- INCLUDING guidelines.mhtml --><!--OddPage--> <h1><a name="guidelines" id="guidelines">C.</a> HTML Compatibility Guidelines</h1> <p><strong>This appendix is informative.</strong></p> <p>This appendix summarizes design guidelines for authors who wish their XHTML documents to render on existing HTML user agents. <em>Note that this recommendation does not define how HTML conforming user agents should process HTML documents. Nor does it define the meaning of the Internet Media Type <code>text/html</code>. For these definitions, see [<a class="nref" href= "#ref-html4">HTML4</a>] and [<a class="nref" href="#ref-rfc2854">RFC2854</a>] respectively.</em></p> <h2><a name="C_1" id="C_1">C.1.</a> Processing Instructions and the XML Declaration</h2> <p>Be aware that processing instructions are rendered on some user agents. Also, some user agents interpret the XML declaration to mean that the document is unrecognized XML rather than HTML, and therefore may not render the document as expected. For compatibility with these types of legacy browsers, you may want to avoid using processing instructions and XML declarations. 
Remember, however, that when the XML declaration is not included in a document, the document can only use the default character encodings UTF-8 or UTF-16.</p> <h2><a name="C_2" id="C_2">C.2.</a> Empty Elements</h2> <p>Include a space before the trailing <code>/</code> and <code>&gt;</code> of empty elements, e.g. <code class="greenmono">&lt;br /&gt;</code>, <code class="greenmono">&lt;hr /&gt;</code> and <code class="greenmono">&lt;img src="karen.jpg" alt="Karen" /&gt;</code>. Also, use the minimized tag syntax for empty elements, e.g. <code class="greenmono">&lt;br /&gt;</code>, as the alternative syntax <code class="greenmono">&lt;br&gt;&lt;/br&gt;</code> allowed by XML gives uncertain results in many existing user agents.</p> <h2><a name="C_3" id="C_3">C.3.</a> Element Minimization and Empty Element Content</h2> <p>Given an empty instance of an element whose content model is not <code>EMPTY</code> (for example, an empty title or paragraph) do not use the minimized form (e.g. use <code class="greenmono"> &lt;p&gt; &lt;/p&gt;</code> and not <code class="greenmono">&lt;p /&gt;</code>).</p> <h2><a name="C_4" id="C_4">C.4.</a> Embedded Style Sheets and Scripts</h2> <p>Use external style sheets if your style sheet uses <code>&lt;</code> or <code>&amp;</code> or <code>]]&gt;</code> or <code>--</code>. Use external scripts if your script uses <code>&lt;</code> or <code>&amp;</code> or <code>]]&gt;</code> or <code>--</code>. Note that XML parsers are permitted to silently remove the contents of comments. Therefore, the historical practice of "hiding" scripts and style sheets within "comments" to make the documents backward compatible is unlikely to work as expected in XML-based user agents.</p> <h2><a name="C_5" id="C_5">C.5.</a> Line Breaks within Attribute Values</h2> <p>Avoid line breaks and multiple white space characters within attribute values. These are handled inconsistently by user agents.</p> <h2><a name="C_6" id="C_6">C.6.</a> Isindex</h2> <p>Don't include more than one <code>isindex</code> element in the document <code>head</code>. 
The <code>isindex</code> element is deprecated in favor of the <code>input</code> element.</p> <h2><a name="C_7" id="C_7">C.7.</a> The <code>lang</code> and <code>xml:lang</code> Attributes</h2> <p>Use both the <code>lang</code> and <code>xml:lang</code> attributes when specifying the language of an element. The value of the <code>xml:lang</code> attribute takes precedence.</p> <h2><a name="C_8" id="C_8">C.8.</a> Fragment Identifiers</h2> <p>In XML, <abbr title="Uniform Resource Identifiers">URI</abbr>-references [<a class="nref" href="#ref-rfc2396">RFC2396</a>] that end with fragment identifiers of the form <code> "#foo"</code> do not refer to elements with an attribute <code>name="foo"</code>; rather, they refer to elements with an attribute defined to be of type <code>ID</code>, e.g., the <code>id</code> attribute in HTML 4. Many existing HTML clients don't support the use of <code>ID</code>-type attributes in this way, so identical values may be supplied for both of these attributes to ensure maximum forward and backward compatibility (e.g., <code class="greenmono">&lt;a id="foo" name="foo"&gt;...&lt;/a&gt;</code>).</p> <p>Further, since the set of legal values for attributes of type <code>ID</code> is much smaller than for those of type <code>CDATA</code>, the type of the <code>name</code> attribute has been changed to <code>NMTOKEN</code>. This attribute is constrained such that it can only have the same values as type <code>ID</code>, or as the <code>Name</code> production in XML 1.0 Section 2.3, production 5. Unfortunately, this constraint cannot be expressed in the XHTML 1.0 DTDs. Because of this change, care must be taken when converting existing HTML documents. 
The values of these attributes must be unique within the document and valid, and any references to these fragment identifiers (both internal and external) must be updated should the values be changed during conversion.</p> <p>Note that the collection of legal values in XML 1.0 Section 2.3, production 5 is much larger than that permitted to be used in the <code>ID</code> and <code>NAME</code> types defined in HTML 4. When defining fragment identifiers to be backward-compatible, only strings matching the pattern <code>[A-Za-z][A-Za-z0-9:_.-]*</code> should be used. See <a href="https://www.w3.org/TR/html4/types.html#h-6.2">Section 6.2</a> of [<a class="nref" href="#ref-html4">HTML4</a>] for more information.</p> <p>Finally, note that XHTML 1.0 has deprecated the <code>name</code> attribute of the <code>a</code>, <code>applet</code>, <code>form</code>, <code>frame</code>, <code>iframe</code>, <code> img</code>, and <code>map</code> elements, and it will be removed from XHTML in subsequent versions.</p> <h2><a name="C_9" id="C_9">C.9.</a> Character Encoding</h2> <p>Historically, the character encoding of an HTML document is either specified by a web server via the charset parameter of the HTTP Content-Type header, or via a <code>meta</code> element in the document itself. In an XML document, the character encoding of the document is specified on the XML declaration (e.g., <code class="greenmono">&lt;?xml version="1.0" encoding="EUC-JP"?&gt;</code>). In order to portably present documents with specific character encodings, the best approach is to ensure that the web server provides the correct headers. If this is not possible, a document that wants to set its character encoding explicitly must include both an XML declaration with an encoding declaration and a <code>meta</code> http-equiv statement (e.g., <code class="greenmono">&lt;meta http-equiv="Content-type" content="text/html; charset=EUC-JP" /&gt;</code>). 
In XHTML-conforming user agents, the value of the encoding declaration of the XML declaration takes precedence.</p> <p>Note: be aware that if a document must include the character encoding declaration in a meta http-equiv statement, that document may always be interpreted by HTTP servers and/or user agents as being of the internet media type defined in that statement. If a document is to be served as multiple media types, the HTTP server must be used to set the encoding of the document.</p> <h2><a name="C_10" id="C_10">C.10.</a> Boolean Attributes</h2> <p>Some HTML user agents are unable to interpret boolean attributes when these appear in their full (non-minimized) form, as required by XML 1.0. Note this problem doesn't affect user agents compliant with HTML 4. The following attributes are involved: <code>compact</code>, <code>nowrap</code>, <code>ismap</code>, <code>declare</code>, <code>noshade</code>, <code>checked</code>, <code> disabled</code>, <code>readonly</code>, <code>multiple</code>, <code>selected</code>, <code>noresize</code>, <code>defer</code>.</p> <h2><a name="C_11" id="C_11">C.11.</a> Document Object Model and XHTML</h2> <p>The Document Object Model level 1 Recommendation [<a class="nref" href="#ref-dom">DOM</a>] defines document object model interfaces for XML and HTML 4. The HTML 4 document object model specifies that HTML element and attribute names are returned in upper-case. The XML document object model specifies that element and attribute names are returned in the case they are specified. In XHTML 1.0, elements and attributes are specified in lower-case. 
This apparent difference can be addressed in two ways:</p> <ol> <li>User agents that access XHTML documents served as Internet media type <code>text/html</code> via the <abbr title="Document Object Model">DOM</abbr> can use the HTML DOM, and can rely upon element and attribute names being returned in upper-case from those interfaces.</li> <li>User agents that access XHTML documents served as Internet media types <code>text/xml</code>, <code>application/xml</code>, or <code>application/xhtml+xml</code> can also use the XML DOM. Elements and attributes will be returned in lower-case. Also, some XHTML elements may or may not appear in the object tree because they are optional in the content model (e.g. the <code>tbody</code> element within <code>table</code>). This occurs because in HTML 4 some elements were permitted to be minimized such that their start and end tags are both omitted (an SGML feature). This is not possible in XML. Rather than require document authors to insert extraneous elements, XHTML has made the elements optional. User agents need to adapt to this accordingly. For further information on this topic, see [<a class="nref" href="#ref-dom2">DOM2</a>].</li> </ol> <h2><a name="C_12" id="C_12">C.12.</a> Using Ampersands in Attribute Values (and Elsewhere)</h2> <p>In both SGML and XML, the ampersand character ("&amp;") declares the beginning of an entity reference (e.g., &amp;reg; for the registered trademark symbol "®"). Unfortunately, many HTML user agents have silently ignored incorrect usage of the ampersand character in HTML documents - treating ampersands that do not look like entity references as literal ampersands. XML-based user agents will not tolerate this incorrect usage, and any document that uses an ampersand incorrectly will not be "valid", and consequently will not conform to this specification. 
In order to ensure that documents are compatible with historical HTML user agents and XML-based user agents, ampersands used in a document that are to be treated as literal characters must be expressed themselves as an entity reference (e.g. "<code>&amp;amp;</code>"). For example, when the <code>href</code> attribute of the <code>a</code> element refers to a CGI script that takes parameters, it must be expressed as <code>http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;amp;name=user</code> rather than as <code>http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;name=user</code>.</p> <h2><a name="C_13" id="C_13">C.13.</a> Cascading Style Sheets (CSS) and XHTML</h2> <p>The Cascading Style Sheets level 2 Recommendation [<a class="nref" href="#ref-css2">CSS2</a>] defines style properties which are applied to the parse tree of the HTML or XML documents. Differences in parsing will produce different visual or aural results, depending on the selectors used. The following hints will reduce this effect for documents which are served without modification as both media types:</p> <ol> <li>CSS style sheets for XHTML should use lower case element and attribute names.</li> <li>In tables, the tbody element will be inferred by the parser of an HTML user agent, but not by the parser of an XML user agent. Therefore you should always explicitly add a tbody element if it is referred to in a CSS selector.</li> <li>Within the XHTML namespace, user agents are expected to recognize the "id" attribute as an attribute of type ID. Therefore, style sheets should be able to continue using the shorthand "#" selector syntax even if the user agent does not read the DTD.</li> <li>Within the XHTML namespace, user agents are expected to recognize the "class" attribute. Therefore, style sheets should be able to continue using the shorthand "." 
selector syntax.</li> <li>CSS defines different conformance rules for HTML and XML documents; be aware that the HTML rules apply to XHTML documents delivered as HTML and the XML rules apply to XHTML documents delivered as XML.</li> </ol> <h2><a name="C_14" id="C_14">C.14.</a> Referencing Style Elements when serving as XML</h2> <p>In HTML 4 and XHTML, the <code>style</code> element can be used to define document-internal style rules. In XML, an XML stylesheet declaration is used to define style rules. In order to be compatible with this convention, <code>style</code> elements should have their fragment identifier set using the <code>id</code> attribute, and an XML stylesheet declaration should reference this fragment. For example:</p> <div class="good"> <pre>
&lt;?xml-stylesheet href="http://www.w3.org/StyleSheets/TR/W3C-REC.css" type="text/css"?&gt;
&lt;?xml-stylesheet href="#internalStyle" type="text/css"?&gt;
&lt;!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"&gt;
&lt;head&gt;
&lt;title&gt;An internal stylesheet example&lt;/title&gt;
&lt;style type="text/css" id="internalStyle"&gt;
  code {
    color: green;
    font-family: monospace;
    font-weight: bold;
  }
&lt;/style&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;p&gt;
  This is text that uses our
  &lt;code&gt;internal stylesheet&lt;/code&gt;.
&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;
</pre> </div> <h2><a name="C_15" id="C_15">C.15.</a> White Space Characters in HTML vs. XML</h2> <p>Some characters that are legal in HTML documents are illegal in XML documents. For example, in HTML the Formfeed character (U+000C) is treated as white space, whereas in XHTML it is illegal because of XML's definition of characters.</p> <h2><a name="C_16" id="C_16">C.16.</a> The Named Character Reference &amp;apos;</h2> <p>The named character reference <code>&amp;apos;</code> (the apostrophe, U+0027) was introduced in XML 1.0 but does not appear in HTML. 
Authors should therefore use <code>&amp;#39;</code> instead of <code>&amp;apos;</code> so that the reference works as expected in HTML 4 user agents.</p> <!-- END OF FILE guidelines.mhtml --><!-- INCLUDING acks.mhtml --><!--OddPage--> <h1><a name="acks" id="acks">D.</a> Acknowledgements</h1> <p><strong>This appendix is informative.</strong></p> <p>This specification was written with the participation of the members of the W3C HTML Working Group.</p> <p>At publication of the second edition, the membership was:</p> <dl> <dd>Steven Pemberton, CWI/W3C (HTML Working Group Chair)<br /> Daniel Austin, Grainger<br /> Jonny Axelsson, Opera Software<br /> Tantek Çelik, Microsoft<br /> Doug Dominiak, Openwave Systems<br /> Herman Elenbaas, Philips Electronics<br /> Beth Epperson, Netscape/<acronym title="America Online">AOL</acronym><br /> Masayasu Ishikawa, W3C (HTML Activity Lead)<br /> Shin'ichi Matsui, Panasonic<br /> Shane McCarron, Applied Testing and Technology<br /> Ann Navarro, WebGeek, <abbr title="Incorporated">Inc.</abbr><br /> Subramanian Peruvemba, Oracle<br /> Rob Relyea, Microsoft<br /> Sebastian Schnitzenbaumer, SAP<br /> Peter Stark, Sony Ericsson<br /> </dd> </dl> <p>At publication of the first edition, the membership was:</p> <dl> <dd>Steven Pemberton, <acronym title="Centrum voor Wiskunde en Informatica" lang="nl" xml:lang="nl">CWI</acronym> (HTML Working Group Chair)<br /> Murray Altheim, Sun Microsystems<br /> Daniel Austin, AskJeeves (CNET: The Computer Network through July 1999)<br /> Frank Boumphrey, HTML Writers Guild<br /> John Burger, Mitre<br /> Andrew W. 
Donoho, IBM<br /> Sam Dooley, IBM<br /> Klaus Hofrichter, GMD<br /> Philipp Hoschka, W3C<br /> Masayasu Ishikawa, W3C<br /> Warner ten Kate, Philips Electronics<br /> Peter King, Phone.com<br /> Paula Klante, JetForm<br /> Shin'ichi Matsui, Panasonic (W3C visiting engineer through September 1999)<br /> Shane McCarron, Applied Testing and Technology (The Open Group through August 1999)<br /> Ann Navarro, HTML Writers Guild<br /> Zach Nies, Quark<br /> Dave Raggett, W3C/HP (HTML Activity Lead)<br /> Patrick Schmitz, Microsoft<br /> Sebastian Schnitzenbaumer, Mozquito Technologies<br /> Peter Stark, Phone.com<br /> Chris Wilson, Microsoft<br /> Ted Wugofski, Gateway 2000<br /> Dan Zigmond, WebTV Networks</dd> </dl> <!-- END OF FILE acks.mhtml --><!-- INCLUDING references.mhtml --><!--OddPage--> <h1><a name="refs" id="refs">E.</a> References</h1> <p><strong>This appendix is informative.</strong></p> <dl> <dt><a name="ref-css2" id="ref-css2"><strong>[CSS2]</strong></a></dt> <dd>"<cite><a href="https://www.w3.org/TR/1998/REC-CSS2-19980512">Cascading Style Sheets, level 2 (CSS2) Specification</a></cite>", B. Bos, H. W. Lie, C. Lilley, I. Jacobs, 12 May 1998.<br /> <a href="https://www.w3.org/TR/REC-CSS2">Latest version</a> available at: http://www.w3.org/TR/REC-CSS2</dd> <dt><a name="ref-dom" id="ref-dom"><strong>[DOM]</strong></a></dt> <dd>"<cite><a href="https://www.w3.org/TR/1998/REC-DOM-Level-1-19981001">Document Object Model (DOM) Level 1 Specification</a></cite>", Lauren Wood <em lang="lt" xml:lang="lt">et al.</em>, 1 October 1998.<br /> <a href="https://www.w3.org/TR/REC-DOM-Level-1">Latest version</a> available at: http://www.w3.org/TR/REC-DOM-Level-1</dd> <dt><a name="ref-dom2" id="ref-dom2"><strong>[DOM2]</strong></a></dt> <dd>"<cite><a href="https://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113">Document Object Model (DOM) Level 2 Core Specification</a></cite>", A. 
Le Hors, <em lang="lt" xml:lang="lt">et al.</em>, 13 November 2000.<br /> <a href="https://www.w3.org/TR/DOM-Level-2-Core">Latest version</a> available at: http://www.w3.org/TR/DOM-Level-2-Core</dd> <dt><a name="ref-html4" id="ref-html4"><strong>[HTML4]</strong></a></dt> <dd>"<cite><a href="https://www.w3.org/TR/1999/REC-html401-19991224">HTML 4.01 Specification</a></cite>", D. Raggett, A. Le Hors, I. Jacobs, 24 December 1999.<br /> <a href="https://www.w3.org/TR/html401">Latest version</a> available at: http://www.w3.org/TR/html401</dd> <dt><a name="ref-posix.1" id="ref-posix.1"><strong>[POSIX.1]</strong></a></dt> <dd>"<cite>ISO/IEC 9945-1:1990 Information Technology - Portable Operating System Interface (POSIX) - Part 1: System Application Program Interface (API) [C Language]</cite>", Institute of Electrical and Electronics Engineers, Inc, 1990.</dd> <dt><a id="ref-rfc2045" name="ref-rfc2045"><strong>[RFC2045]</strong></a></dt> <dd>"<cite><a href="http://www.ietf.org/rfc/rfc2045.txt">Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies</a></cite>", N. Freed and N. Borenstein, November 1996. Note that this RFC obsoletes RFC1521, RFC1522, and RFC1590.</dd> <dt><a name="ref-rfc2046" id="ref-rfc2046"><strong>[RFC2046]</strong></a></dt> <dd>"<cite><a href="http://www.ietf.org/rfc/rfc2046.txt">RFC2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types</a></cite>", N. Freed and N. Borenstein, November 1996.<br /> Available at <a href="http://www.ietf.org/rfc/rfc2046.txt">http://www.ietf.org/rfc/rfc2046.txt</a>. Note that this RFC obsoletes RFC1521, RFC1522, and RFC1590.</dd> <dt><a name="ref-rfc2119" id="ref-rfc2119"><strong>[RFC2119]</strong></a></dt> <dd>"<cite><a href="http://www.ietf.org/rfc/rfc2119.txt">RFC2119: Key words for use in RFCs to Indicate Requirement Levels</a></cite>", S. 
Bradner, March 1997.<br /> Available at: http://www.ietf.org/rfc/rfc2119.txt</dd> <dt><a name="ref-rfc2376" id="ref-rfc2376"><strong>[RFC2376]</strong></a></dt> <dd>"<cite><a href="http://www.ietf.org/rfc/rfc2376.txt">RFC2376: XML Media Types</a></cite>", E. Whitehead, M. Murata, July 1998.<br /> This document is obsoleted by [<a href="#ref-rfc3023">RFC3023</a>].<br /> Available at: http://www.ietf.org/rfc/rfc2376.txt</dd> <dt><a name="ref-rfc2396" id="ref-rfc2396"><strong>[RFC2396]</strong></a></dt> <dd>"<cite><a href="http://www.ietf.org/rfc/rfc2396.txt">RFC2396: Uniform Resource Identifiers (URI): Generic Syntax</a></cite>", T. Berners-Lee, R. Fielding, L. Masinter, August 1998.<br /> This document updates RFC1738 and RFC1808.<br /> Available at: http://www.ietf.org/rfc/rfc2396.txt</dd> <dt><a name="ref-rfc2854" id="ref-rfc2854"><strong>[RFC2854]</strong></a></dt> <dd>"<cite><a href="http://www.ietf.org/rfc/rfc2854.txt">RFC2854: The text/html Media Type</a></cite>", D. Connolly, L. Masinter, June 2000.<br /> Available at: http://www.ietf.org/rfc/rfc2854.txt</dd> <dt><a name="ref-rfc3023" id="ref-rfc3023"><strong>[RFC3023]</strong></a></dt> <dd>"<cite><a href="http://www.ietf.org/rfc/rfc3023.txt">RFC3023: XML Media Types</a></cite>", M. Murata, S. St.Laurent, D. Kohn, January 2001.<br /> This document obsoletes [<a href="#ref-rfc2376">RFC2376</a>].<br /> Available at: http://www.ietf.org/rfc/rfc3023.txt</dd> <dt><a id="ref-rfc3066" name="ref-rfc3066"><strong>[RFC3066]</strong></a></dt> <dd>"<a href="http://www.ietf.org/rfc/rfc3066.txt">Tags for the Identification of Languages</a>", H. Alvestrand, January 2001.<br /> Available at: http://www.ietf.org/rfc/rfc3066.txt</dd> <dt><a id="ref-rfc3236" name="ref-rfc3236"><strong>[RFC3236]</strong></a></dt> <dd>"<a href="http://www.ietf.org/rfc/rfc3236.txt">The 'application/xhtml+xml' Media Type</a>", M. Baker, P. 
Stark, January 2002.<br /> Available at: http://www.ietf.org/rfc/rfc3236.txt</dd> <dt><a id="ref-xhtml-mathml" name="ref-xhtml-mathml"><strong>[XHTML+MathML]</strong></a></dt> <dd>"<cite><a href="https://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd">XHTML plus Math 1.1 <abbr title="Document Type Definition">DTD</abbr></a></cite>", "A.2 MathML as a DTD Module", Mathematical Markup Language (MathML) Version 2.0. Available at: http://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd</dd> <dt><a id="ref-xhtmlmime" name="ref-xhtmlmime"><strong>[XHTMLMIME]</strong></a></dt> <dd>"<cite><a href="https://www.w3.org/TR/2002/NOTE-xhtml-media-types-20020801">XHTML Media Types</a></cite>", Masayasu Ishikawa, 1 August 2002.<br /> <a href="https://www.w3.org/TR/xhtml-media-types">Latest version</a> available at: http://www.w3.org/TR/xhtml-media-types</dd> <dt><a id="ref-xhtmlmod" name="ref-xhtmlmod"><strong>[XHTMLMOD]</strong></a></dt> <dd>"<cite><a href="https://www.w3.org/TR/2001/REC-xhtml-modularization-20010410">Modularization of XHTML</a></cite>", M. Altheim et al., 10 April 2001.<br /> <a href="https://www.w3.org/TR/xhtml-modularization">Latest version</a> available at: http://www.w3.org/TR/xhtml-modularization</dd> <dt><a name="ref-xml" id="ref-xml"><strong>[XML]</strong></a></dt> <dd>"<a href="https://www.w3.org/TR/2000/REC-xml-20001006">Extensible Markup Language (XML) 1.0 Specification (Second Edition)</a>", T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, 6 October 2000.<br /> <a href="https://www.w3.org/TR/REC-xml">Latest version</a> available at: http://www.w3.org/TR/REC-xml</dd> <dt><a name="ref-xmlns" id="ref-xmlns"><strong>[XMLNS]</strong></a></dt> <dd>"<a href="https://www.w3.org/TR/1999/REC-xml-names-19990114">Namespaces in XML</a>", T. Bray, D. Hollander, A. 
Layman, 14 January 1999.<br /> XML namespaces provide a simple method for qualifying names used in XML documents by associating them with namespaces identified by URI.<br /> <a href="https://www.w3.org/TR/REC-xml-names">Latest version</a> available at: http://www.w3.org/TR/REC-xml-names</dd> <dt><a name="ref-xmlc14n" id="ref-xmlc14n"><strong>[XMLC14N]</strong></a></dt> <dd>"<a href="https://www.w3.org/TR/2001/REC-xml-c14n-20010315">Canonical XML Version 1.0</a>", J. Boyer, 15 March 2001.<br /> This document describes a method for generating a physical representation, the canonical form, of an XML document.<br /> <a href="https://www.w3.org/TR/xml-c14n">Latest version</a> available at: http://www.w3.org/TR/xml-c14n</dd> </dl> <!-- END OF FILE references.mhtml --> <p><a href="https://www.w3.org/WAI/WCAG1AAA-Conformance" title="Explanation of Level Triple-A Conformance"><img height="32" width="88" src="https://www.w3.org/WAI/wcag1AAA.png" alt= "Level Triple-A conformance icon, W3C-WAI Web Content Accessibility Guidelines 1.0" /></a></p> <script type="application/javascript" src="https://www.w3.org/scripts/TR/fixup.js"></script></body> </html>