| Internet Engineering Task Force                              M. Davis |
| Internet-Draft                                                 Google |
| Intended status: BCP                                      A. Phillips |
| Expires: July 17, 2010                                         Lab126 |
|                                                             Y. Umaoka |
|                                                                   IBM |
|                                                      January 13, 2010 |
| |
| |
| BCP 47 Extension U |
| draft-davis-u-langtag-ext-00 |
| |
| Abstract |
| |
| This document specifies an Extension to BCP 47 which provides subtags |
| that specify language and/or locale-based behavior or refinements to |
| language tags, according to work done by the Unicode Consortium. |
| |
| Status of this Memo |
| |
| This Internet-Draft is submitted to IETF in full conformance with the |
| provisions of BCP 78 and BCP 79. |
| |
| Internet-Drafts are working documents of the Internet Engineering |
| Task Force (IETF), its areas, and its working groups. Note that |
| other groups may also distribute working documents as Internet- |
| Drafts. |
| |
| Internet-Drafts are draft documents valid for a maximum of six months |
| and may be updated, replaced, or obsoleted by other documents at any |
| time. It is inappropriate to use Internet-Drafts as reference |
| material or to cite them other than as "work in progress." |
| |
| The list of current Internet-Drafts can be accessed at |
| http://www.ietf.org/ietf/1id-abstracts.txt. |
| |
| The list of Internet-Draft Shadow Directories can be accessed at |
| http://www.ietf.org/shadow.html. |
| |
| This Internet-Draft will expire on July 17, 2010. |
| |
| Copyright Notice |
| |
| Copyright (c) 2010 IETF Trust and the persons identified as the |
| document authors. All rights reserved. |
| |
| This document is subject to BCP 78 and the IETF Trust's Legal |
| Provisions Relating to IETF Documents |
| |
| |
| |
| Davis, et al. Expires July 17, 2010 [Page 1] |
| |
| Internet-Draft BCP 47 Unicode Locale Extension January 2010 |
| |
| |
| (http://trustee.ietf.org/license-info) in effect on the date of |
| publication of this document. Please review these documents |
| carefully, as they describe your rights and restrictions with respect |
| to this document. Code Components extracted from this document must |
| include Simplified BSD License text as described in Section 4.e of |
| the Trust Legal Provisions and are provided without warranty as |
| described in the BSD License. |
| |
| |
| Table of Contents |
| |
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 |
| 1.1. Requirements Language . . . . . . . . . . . . . . . . . . . 3 |
| 2. BCP47 Required Information . . . . . . . . . . . . . . . . . . 3 |
| 2.1. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 4 |
| 2.1.1. Canonicalization . . . . . . . . . . . . . . . . . . . 5 |
| 2.2. Registration Form . . . . . . . . . . . . . . . . . . . . . 5 |
| 3. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 5 |
| 4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 6 |
| 5. Security Considerations . . . . . . . . . . . . . . . . . . . . 6 |
| 6. References . . . . . . . . . . . . . . . . . . . . . . . . . . 6 |
| 6.1. Normative References . . . . . . . . . . . . . . . . . . . 6 |
| 6.2. Informative References . . . . . . . . . . . . . . . . . . 6 |
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 6 |
| |
| |
| 1. Introduction |
| |
| [BCP47] permits the definition and registration of language tag |
| extensions "that contain a language component and are compatible with |
| applications that understand language tags". This document defines |
| an extension for identifying Unicode locale-based variations using |
| language tags. The "singleton" identifier for this extension is 'u'. |
| |
| 1.1. Requirements Language |
| |
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", |
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this |
| document are to be interpreted as described in RFC 2119. |
| |
| |
| 2. BCP47 Required Information |
| |
| Language tags, as defined by [BCP47], are useful for identifying the |
| language of content. They are also used as locale identifiers (or |
| can be mapped to locales) in many operating environments and APIs. |
| However, most such locale identifiers also provide additional |
| "tailorings" or options for specific values within a language, |
| culture, region, or other variation. This extension provides a |
| mechanism for using these additional tailorings within language tags |
| for general interchange. |
| |
| The maintaining authority for this extension's registry is the |
| Unicode Consortium. Unicode defines common locale data and |
| identifiers for this data: |
| |
| +---------------+---------------------------------------------------+ |
| | Item | Value | |
| +---------------+---------------------------------------------------+ |
| | Name | Unicode Consortium | |
| | Contact Email | cldr@unicode.org | |
| | Discussion | cldr-users@unicode.org | |
| | List Email | | |
| | URL Location | cldr.unicode.org | |
| | Specification | Unicode Technical Standard #35 Unicode Locale | |
| | | Data Markup Language (LDML), | |
| | | http://unicode.org/reports/tr35/ | |
| | Section | Section 3.2 BCP 47 Tag Conversion | |
| +---------------+---------------------------------------------------+ |
| |
| The specification of extension subtags is provided by Section 3 of |
| Unicode Technical Standard #35 Unicode Locale Data Markup Language |
| [LDML]. As required by BCP 47, subtags follow the language tag ABNF |
| and other rules for the formation of language tags and subtags, are |
| restricted to the ASCII letters and digits, are not case sensitive, |
| and do not exceed eight characters in length. |
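
The constraints above can be checked mechanically. The following is a minimal sketch, not part of this specification: the function name is hypothetical, and the lower bound of two characters is taken from the extension production in the BCP 47 ABNF (2*8alphanum) rather than from the text above.

```python
import re

# Assumption: an extension subtag is 2 to 8 ASCII letters or digits, as in
# the BCP 47 extension ABNF (2*8alphanum). The helper name is hypothetical.
_EXTENSION_SUBTAG = re.compile(r"[A-Za-z0-9]{2,8}")


def is_wellformed_extension_subtag(subtag: str) -> bool:
    """Return True if `subtag` has the shape required after a singleton.

    Case is not significant, so upper- and lowercase letters are both
    accepted. Only the shape is checked, not whether LDML defines the subtag.
    """
    return _EXTENSION_SUBTAG.fullmatch(subtag) is not None
```

A check of this kind says nothing about whether a subtag is defined by LDML; it only rejects subtags that could never be valid.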
| |
| [LDML] specifies a canonical representation. LDML is available over |
| the Internet at no cost, under the royalty-free license described at |
| http://unicode.org/copyright.html. LDML is versioned, and each |
| version of LDML is numbered, dated, and stable. Extension subtags, |
| once defined by LDML, are never retracted and never change in meaning |
| in a substantial way. |
| |
| 2.1. Summary |
| |
| The subtags available for use in the 'u' extension consist of a set |
| of attributes, keys, and types. Attributes, keys, types, and their |
| respective meanings are defined in Section 3 (Unicode Language and |
| Locale Identifiers) of [LDML]. The following is a summary of that |
| definition (for details, see Section 3 of [LDML]): |
| |
| o An 'attribute' is a subtag with a length of three or more |
| characters following the singleton and preceding any 'keyword' |
| sequences. No attributes were defined at the time of this |
| document's publication. |
| |
| o A 'keyword' is a sequence of subtags consisting of a 'key' subtag, |
| followed by zero or more 'type' subtags. Each 'key' MUST be |
| unique within the extension. The order of the 'type' subtags |
| within a 'keyword' is sometimes significant to their |
| interpretation. Note that 'keys' can appear without a subsequent |
| 'type' subtag. |
| |
| A. A 'key' is a subtag with a length of exactly two characters. |
| Each 'key' is followed by zero or more 'type' subtags. |
| |
| B. A 'type' is a subtag with a length of three or more characters |
| following a key. 'Type' subtags are specific to a particular |
| 'key' and the order of the 'type' subtags MAY be significant |
| to the interpretation of the 'keyword'. |
| |
| For example, the language tag "de-DE-u-attr-co-phonebk" consists of: |
| |
| o The base language tag "de-DE" (German as used in Germany), exactly |
| as defined by [BCP47] using subtags from the IANA Language Subtag |
| Registry. |
| |
| o The singleton 'u', identifying this extension. |
| |
| o The attribute 'attr', which is an example for illustration (no |
| attributes were defined at the time this document was published). |
| |
| o The keyword 'co-phonebk', consisting of the key 'co' (Collation) |
| and the type 'phonebk' (Phonebook collation order). |
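
The decomposition in this example can be sketched in code. The helper below is hypothetical and illustrative only: it assumes a well-formed tag, classifies subtags purely by length and position as summarized above, and does not consult the LDML data files to decide whether an attribute, key, or type is actually defined.

```python
from typing import Dict, List, Tuple


def parse_u_extension(tag: str) -> Tuple[str, List[str], Dict[str, List[str]]]:
    """Split `tag` into (base tag, attributes, keywords).

    Hypothetical helper: subtags are classified by length only and are
    not validated against the LDML data files.
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]  # case is not significant

    # Find the 'u' singleton. Stop at 'x', because everything after a
    # private-use singleton is private use, not an extension.
    start = None
    for i, s in enumerate(lowered):
        if s == "x":
            break
        if s == "u" and i > 0:
            start = i
            break
    if start is None:
        return tag, [], {}

    attributes: List[str] = []
    keywords: Dict[str, List[str]] = {}
    key = None
    for s in lowered[start + 1:]:
        if len(s) == 1:
            break  # next singleton: another extension or private use
        if len(s) == 2:
            if s in keywords:
                raise ValueError(f"duplicate key {s!r}")
            key = s
            keywords[key] = []
        elif key is None:
            attributes.append(s)  # before the first key: an attribute
        else:
            keywords[key].append(s)  # after a key: one of its types
    return "-".join(subtags[:start]), attributes, keywords
```

With this sketch, parse_u_extension("de-DE-u-attr-co-phonebk") returns ("de-DE", ["attr"], {"co": ["phonebk"]}). Subtags that follow a later singleton are ignored here; a fuller implementation would carry them along with the base tag.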
| |
| With successive versions of [LDML], additional attributes, keys, and |
| types MAY be defined. Once defined, attributes, keys, and types will |
| never be removed. Machine-readable files listing the valid |
| attributes, keys, and types are available in the CLDR repository for |
| each version. For example, for version 1.7.2, the files are located |
| at http://unicode.org/repos/cldr/tags/release-1-7-2/common/bcp47/. |
| These files can also contain aliases that were used in previous |
| versions of [LDML]. |
| |
| 2.1.1. Canonicalization |
| |
| As required by [BCP47], case is not significant. The canonical form |
| for all subtags in the extension is lowercase. The canonical order |
| of attributes is in [US-ASCII] order (that is, numbers before |
| letters, with letters sorted as lowercase US-ASCII code points). The |
| canonical order of keywords is in [US-ASCII] order by key. The order |
| of subtags within a keyword is significant; the meaning of this |
| extension is altered if those subtags are rearranged. Thus, the |
| canonical form of the extension never reorders the subtags within a |
| keyword. |
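
As an illustration of these rules, the following sketch builds the canonical form of the extension from attributes and keywords that have already been separated. The function name and the input shape (a list of attributes and a mapping from each key to its ordered types) are assumptions made for this example only.

```python
from typing import Dict, List


def canonical_u_extension(attributes: List[str],
                          keywords: Dict[str, List[str]]) -> str:
    """Return the canonical 'u' extension string, e.g. "u-co-phonebk".

    Hypothetical helper; inputs are assumed to be well-formed subtags.
    """
    parts = ["u"]

    # Attributes: lowercase, then US-ASCII code point order. For lowercase
    # ASCII letters and digits, Python's default string ordering is that
    # order (digits sort before letters).
    parts.extend(sorted(a.lower() for a in attributes))

    # Keywords: ordered by key. The types of each keyword are lowercased
    # but never reordered, because their order is significant.
    pairs = [(k.lower(), [t.lower() for t in types])
             for k, types in keywords.items()]
    for key, types in sorted(pairs, key=lambda pair: pair[0]):
        parts.append(key)
        parts.extend(types)

    return "-".join(parts)
```

For instance, canonical_u_extension(["attr"], {"CO": ["Phonebk"]}) returns "u-attr-co-phonebk".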
| |
| 2.2. Registration Form |
| |
| Per [RFC5646], Section 3.7: |
| %% |
| Identifier: u |
| Description: Unicode Locale |
| Comments: Subtags for the identification of language and cultural |
| variations. Used to set behavior in locale APIs. |
| Added: 2009-mm-dd |
| RFC: [TBD] |
| Authority: Unicode Consortium |
| Contact_Email: cldr@unicode.org |
| Mailing_List: cldr-users@unicode.org |
| URL: http://cldr.unicode.org |
| %% |
| |
| |
| 3. Acknowledgements |
| |
| Thanks to John Emmons and the rest of the Unicode CLDR Technical |
| Committee for their work in developing the BCP 47 subtags for LDML. |
| |
| |
| 4. IANA Considerations |
| |
| This document will require IANA to insert the record in Section 2.2 |
| into the Language Extensions Registry, according to Section 3.7 |
| ("Extensions and the Extensions Registry") of "Tags for Identifying |
| Languages" in [BCP47]. There might be occasional maintenance of this |
| record. This document does not require IANA to create or maintain a |
| new registry or otherwise impact IANA. |
| |
| |
| 5. Security Considerations |
| |
| The security considerations for this extension are the same as those |
| for [RFC5646] (or its successors). See Section 6 ("Security |
| Considerations") of [RFC5646]. |
| |
| |
| 6. References |
| |
| 6.1. Normative References |
| |
| [BCP47] Davis, M., Ed., "Tags for the Identification of Language |
| (BCP47)", September 2009. |
| |
| [LDML] Davis, M., "Unicode Technical Standard #35: Locale Data |
| Markup Language (LDML)", December 2007, |
| <http://www.unicode.org/reports/tr35/>. |
| |
| [RFC5646] Phillips, A. and M. Davis, "Tags for Identifying |
| Languages", BCP 47, RFC 5646, September 2009. |
| |
| [US-ASCII] |
| International Organization for Standardization, "ISO/IEC |
| 646:1991, Information technology -- ISO 7-bit coded |
| character set for information interchange.", 1991. |
| |
| 6.2. Informative References |
| |
| [ldml-registry] |
| "Registry for Common Locale Data Repository tag elements", |
| September 2009. |
| |
| |
| Authors' Addresses |
| |
| Mark Davis |
| Google |
| |
| Email: mark@macchiato.com |
| |
| |
| Addison Phillips |
| Lab126 |
| |
| Email: addison@inter-locale.com |
| |
| |
| Yoshito Umaoka |
| IBM |
| |
| Email: yoshito_umaoka@us.ibm.com |
| |