Network Working Group M. Baker Request for Comments: 3236 Planetfred, Inc. Category: Informational P. Stark Ericsson Mobile Communications January 2002 The 'application/xhtml+xml' Media Type Status of this Memo This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited. Copyright Notice Copyright (C) The Internet Society (2002). All Rights Reserved. Abstract This document defines the 'application/xhtml+xml' MIME media type for XHTML based markup languages; it is not intended to obsolete any previous IETF documents, in particular RFC 2854 which registers 'text/html'. 1. Introduction In 1998, the W3C HTML working group began work on reformulating HTML in terms of XML 1.0 [XML] and XML Namespaces [XMLNS]. The first part of that work concluded in January 2000 with the publication of the XHTML 1.0 Recommendation [XHTML1], the reformulation for HTML 4.01 [HTML401]. Work continues in the Modularization of XHTML Recommendation [XHTMLM12N], the decomposition of XHTML 1.0 into modules that can be used to compose new XHTML based languages, plus a framework for supporting this composition. This document only registers a new MIME media type, 'application/xhtml+xml'. It does not define anything more than is required to perform this registration. Baker & Stark Informational [Page 1] RFC 3236 The 'application/xhtml+xml' Media Type January 2002 This document follows the convention set out in [XMLMIME] for the MIME subtype name; attaching the suffix "+xml" to denote that the entity being described conforms to the XML syntax as defined in XML 1.0 [XML]. This document was prepared by members of the W3C HTML working group based on the structure, and some of the content, of RFC 2854, the registration of 'text/html'. Please send comments to www- html@w3.org, a public mailing list (requiring subscription) with archives at . 2. Registration of MIME media type application/xhtml+xml MIME media type name: application MIME subtype name: xhtml+xml Required parameters: none Optional parameters: charset This parameter has identical semantics to the charset parameter of the "application/xml" media type as specified in [XMLMIME]. profile See Section 8 of this document. Encoding considerations: See Section 4 of this document. Security considerations: See Section 7 of this document. Interoperability considerations: XHTML 1.0 [XHTML10] specifies user agent conformance rules that dictate behaviour that must be followed when dealing with, among other things, unrecognized elements. With respect to XHTML Modularization [XHTMLMOD] and the existence of XHTML based languages (referred to as XHTML family members) that are not XHTML 1.0 conformant languages, it is possible that 'application/xhtml+xml' may be used to describe some of these documents. However, it should suffice for now for the purposes of interoperability that user agents accepting 'application/xhtml+xml' content use the user agent conformance rules in [XHTML1]. Baker & Stark Informational [Page 2] RFC 3236 The 'application/xhtml+xml' Media Type January 2002 Although conformant 'application/xhtml+xml' interpreters can expect that content received is well-formed XML (as defined in [XML]), it cannot be guaranteed that the content is valid XHTML (as defined in [XHTML1]). This is in large part due to the reasons in the preceding paragraph. Published specification: XHTML 1.0 is now defined by W3C Recommendation; the latest published version is [XHTML1]. It provides for the description of some types of conformant content as "text/html", but also doesn't disallow the use with other content types (effectively allowing for the possibility of this new type). Applications which use this media type: Some content authors have already begun hand and tool authoring on the Web with XHTML 1.0. However that content is currently described as "text/html", allowing existing Web browsers to process it without reconfiguration for a new media type. There is no experimental, vendor specific, or personal tree predecessor to 'application/xhtml+xml'. This new type is being registered in order to allow for the expected deployment of XHTML on the World Wide Web, as a first class XML application where authors can expect that user agents are conformant XML 1.0 [XML] processors. Additional information: Magic number: There is no single initial byte sequence that is always present for XHTML files. However, Section 5 below gives some guidelines for recognizing XHTML files. See also section 3.1 in [XMLMIME]. File extension: There are three known file extensions that are currently in use for XHTML 1.0; ".xht", ".xhtml", and ".html". It is not recommended that the ".xml" extension (defined in [XMLMIME]) be used, as web servers may be configured to distribute such content as type "text/xml" or "application/xml". [XMLMIME] discusses the unreliability of this approach in section 3. Of course, should the author desire this behaviour, then the ".xml" extension can be used. Baker & Stark Informational [Page 3] RFC 3236 The 'application/xhtml+xml' Media Type January 2002 Macintosh File Type code: TEXT Person & email address to contact for further information: Mark Baker Intended usage: COMMON Author/Change controller: The XHTML specifications are a work product of the World Wide Web Consortium's HTML Working Group. The W3C has change control over these specifications. 3. Fragment identifiers URI references (Uniform Resource Identifiers, see [RFC2396] as updated by [RFC2732]) may contain additional reference information, identifying a certain portion of the resource. These URI references end with a number sign ("#") followed by an identifier for this portion (called the "fragment identifier"). Interpretation of fragment identifiers is dependent on the media type of the retrieval result. For documents labeled as 'text/html', [RFC2854] specified that the fragment identifier designates the correspondingly named element, these were identified by either a unique id attribute or a name attribute for some elements. For documents described with the application/xhtml+xml media type, fragment identifiers share the same syntax and semantics with other XML documents, see [XMLMIME], section 5. At the time of writing, [XMLMIME] does not define syntax and semantics of fragment identifiers, but refers to "XML Pointer Language (XPointer)" for a future XML fragment identification mechanism. The current specification for XPointer is available at http://www.w3.org/TR/xptr. Until [XMLMIME] gets updated, fragment identifiers for XHTML documents designate the element with the corresponding ID attribute value (see [XML] section 3.3.1); any XHTML element with the "id" attribute. 4. Encoding considerations By virtue of XHTML content being XML, it has the same considerations when sent as 'application/xhtml+xml' as does XML. See [XMLMIME], section 3.2. Baker & Stark Informational [Page 4] RFC 3236 The 'application/xhtml+xml' Media Type January 2002 5. Recognizing XHTML files All XHTML documents will have the string " (or ). [MIME] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996. [URI] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998. [XHTML1] "XHTML 1.0: The Extensible HyperText Markup Language: A Reformulation of HTML 4 in XML 1.0", W3C Recommendation. Available at . [XML] "Extensible Markup Language (XML) 1.0", W3C Recommendation. Available at (or ). [TEXTHTML] Connolly, D. and L. Masinter, "The 'text/html' Media Type", RFC 2854, June 2000. [XMLMIME] Murata, M., St.Laurent, S. and D. Kohn, "XML Media Types", RFC 3023, January 2001. [XHTMLM12N] "Modularization of XHTML", W3C Recommendation. Available at: Baker & Stark Informational [Page 7] RFC 3236 The 'application/xhtml+xml' Media Type January 2002 11. Full Copyright Statement Copyright (C) The Internet Society (2002). All Rights Reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English. The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns. This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Acknowledgement Funding for the RFC Editor function is currently provided by the Internet Society. Baker & Stark Informational [Page 8]