IETF language tag
An IETF language tag is a standardized identifier for a language, defined by BCP 47 (currently RFC 5646 and RFC 4647). It is used in many technical standards, such as HTTP [1], HTML [2], XML [3], and PNG [4].
Format
A language tag consists of one or more subtags separated by hyphens. In general, the subtags appear in the following order:
- language
- script (writing system)
- region
- variant
- extension
- private use
The format is therefore as follows (only the language subtag is required):
language-script-region-variant-extension-privateuse
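The subtag order described above can be illustrated with a small Python sketch. It uses a simplified pattern, not the full grammar of RFC 5646: it assumes a two- or three-letter language subtag and does not handle extended language subtags or grandfathered tags. The names LANGTAG and parse_language_tag are hypothetical and chosen only for this illustration.

```python
import re

# Simplified pattern for language-script-region-variant-extension-privateuse.
# Assumption: only the common subtag shapes are covered; this is a sketch,
# not a complete implementation of the RFC 5646 grammar.
LANGTAG = re.compile(
    r"""
    ^(?P<language>[a-z]{2,3})                               # required
    (?:-(?P<script>[a-z]{4}))?                              # four letters
    (?:-(?P<region>[a-z]{2}|[0-9]{3}))?                     # two letters or three digits
    (?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)  # zero or more variants
    (?P<extensions>(?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*)    # singleton plus subtags
    (?:-x(?P<privateuse>(?:-[a-z0-9]{1,8})+))?$             # introduced by "x"
    """,
    re.IGNORECASE | re.VERBOSE,
)


def parse_language_tag(tag):
    """Return a dict of subtags, or None if the tag does not fit the pattern."""
    match = LANGTAG.match(tag)
    if match is None:
        return None
    parts = match.groupdict()
    # These groups may hold several subtags; drop the leading hyphen
    # and turn empty matches into None.
    for key in ("variants", "extensions", "privateuse"):
        parts[key] = parts[key].lstrip("-") if parts[key] else None
    return parts


if __name__ == "__main__":
    for example in ("en", "en-CA", "zh-Hans-CN", "de-CH-1901"):
        print(example, parse_language_tag(example))
```

A tag that does not match the simplified pattern yields None, which should be read as "not covered by this sketch" rather than "invalid". Checking whether each subtag is actually registered would additionally require the IANA registry data.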
Each subtag is derived from the following standards:
- language: ISO 639-1, ISO 639-2, ISO 639-3, ISO 639-5 [5]
- script: ISO 15924
- region: ISO 3166-1 alpha-2, UN M.49
- variant: (no source standard; variant subtags are specific to BCP 47)
- extension: (no source standard; reserved for future extensions)
- privateuse: (no source standard; reserved for private use)
The list of currently valid subtags is maintained by IANA in the Language Subtag Registry.
Subtags are not case-sensitive, but the specification recommends following the conventions of the Language Subtag Registry: region subtags in uppercase, script subtags with only the initial letter capitalized, and all other subtags in lowercase.
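The recommended casing can be sketched in Python as follows. The helper name normalize_case is hypothetical. The sketch assumes that position and length are enough to recognize scripts (four letters) and regions (two letters), and that everything after a single-character subtag (an extension or private-use marker) stays in lowercase, which is one reading of RFC 5646, section 2.1.1.

```python
def normalize_case(tag):
    """Apply the recommended casing to a language tag (illustrative sketch)."""
    result = []
    after_singleton = False
    for index, subtag in enumerate(tag.split("-")):
        if index == 0 or after_singleton:
            # Language subtag, or anything following a singleton: lowercase.
            result.append(subtag.lower())
        elif len(subtag) == 4 and subtag.isalpha():
            # Script subtag: initial capital only, e.g. "Hant".
            result.append(subtag.title())
        elif len(subtag) == 2 and subtag.isalpha():
            # Region subtag: uppercase, e.g. "CA".
            result.append(subtag.upper())
        else:
            # Variants, numeric regions and all other subtags: lowercase.
            result.append(subtag.lower())
        if len(subtag) == 1:
            after_singleton = True
    return "-".join(result)


if __name__ == "__main__":
    for raw in ("EN-ca", "zh-hant-tw", "DE-ch-1901"):
        print(raw, "->", normalize_case(raw))
```

Because tags are not case-sensitive, this changes only the presentation of a tag, not its meaning.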
The most common forms of language tag are a language subtag alone, or a language subtag followed by a region subtag. For example, en consists of a single language subtag (from ISO 639-1) and denotes English. In contrast, en-CA appends the region subtag CA (from ISO 3166-1) to the language subtag and denotes Canadian English.
History
IETF language tags were first defined in RFC 1766, published in March 1995. This was revised by RFC 3066 in January 2001, which added ISO 639-2 codes (previously only ISO 639-1 codes were permitted) and allowed digits in subtags for the first time.
The next version of the specification was published in September 2006 as RFC 4646 (the main specification) and RFC 4647 (matching behavior). RFC 4646 introduced a structured format for language tags, added ISO 15924 and UN M.49 to the previously used ISO 639 (parts 1 and 2) and ISO 3166, and replaced the old registry of tags with a new registry of subtags. Tags defined earlier that do not fit the new structure were grandfathered to maintain compatibility with RFC 3066.
The IETF working group is preparing the next revision of the specification, which is currently in the approval process. The main purpose of this revision is to incorporate ISO 639-3 into the Language Subtag Registry [6].
Examples of language tags
The following examples are taken from BCP 47 [7].
- Language subtag only:
  - de (German)
  - ja (Japanese)
  - i-enochian (example of a grandfathered tag)
- language-script:
  - zh-Hant (Chinese written in Traditional Chinese characters)
  - zh-Hans (Chinese written in Simplified Chinese characters)
  - sr-Cyrl (Serbian written in Cyrillic script)
  - sr-Latn (Serbian written in Latin script)
- language-script-region:
  - zh-Hans-CN (Chinese written in Simplified Chinese characters, as used in mainland China)
  - sr-Latn-CS (Serbian written in Latin script, as used in Serbia and Montenegro)
- language-variant:
  - sl-nedis (Nadiza dialect of Slovene)
- language-region-variant:
  - de-CH-1901 (German as used in Switzerland, written in the 1901 German orthography)
  - sl-IT-nedis (Nadiza dialect of Slovene, as used in Italy)
- language-script-region-variant:
  - sl-Latn-IT-nedis (Nadiza dialect of Slovene written in Latin script, as used in Italy; note that this tag is not recommended because sl already implies Latn)
- language-region:
  - en-US (English as used in the United States)
  - es-419 (Spanish as used in Latin America and the Caribbean, using a UN M.49 region code)
Other combinations are also possible.
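The grouping used in the list above can be reproduced from the shape of each subtag alone. The following Python sketch labels every subtag after the first by its length and character class, then groups the example tags by the resulting structure. The names EXAMPLES, structure_of and groups are hypothetical, and the sketch assumes tags without extension or private-use subtags.

```python
from collections import defaultdict

# Example tags from the list above. The grandfathered tag i-enochian is
# left out because it does not follow the regular structure.
EXAMPLES = [
    "de", "ja", "zh-Hant", "zh-Hans", "sr-Cyrl", "sr-Latn",
    "zh-Hans-CN", "sr-Latn-CS", "sl-nedis", "de-CH-1901",
    "sl-IT-nedis", "sl-Latn-IT-nedis", "en-US", "es-419",
]


def structure_of(tag):
    """Return a label such as 'language-script-region' for a tag."""
    labels = ["language"]
    for subtag in tag.split("-")[1:]:
        if len(subtag) == 4 and subtag.isalpha():
            labels.append("script")    # four letters, e.g. Hant
        elif len(subtag) == 2 and subtag.isalpha():
            labels.append("region")    # ISO 3166-1 alpha-2, e.g. CN
        elif len(subtag) == 3 and subtag.isdigit():
            labels.append("region")    # UN M.49, e.g. 419
        else:
            labels.append("variant")   # e.g. nedis or 1901
    return "-".join(labels)


# Group the example tags by their structure.
groups = defaultdict(list)
for tag in EXAMPLES:
    groups[structure_of(tag)].append(tag)

if __name__ == "__main__":
    for structure, tags in groups.items():
        print(f"{structure}: {', '.join(tags)}")
```

This shape-based labeling works for the examples shown here; deciding whether a subtag is actually valid still requires a lookup in the Language Subtag Registry.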
Footnotes
- [1] RFC 2616, Hypertext Transfer Protocol -- HTTP/1.1, section 3.10
- [2] HTML 4.01 Specification, section 8.1
- [3] Extensible Markup Language (XML) 1.0 (Fourth Edition), section 2.12
- [4] Portable Network Graphics (PNG) Specification (Second Edition), section 11.3.4.5
- [5] RFC 5646, section 2.2.1
- [6] Language Tag Registry Update charter
- [7] Examples of Language Tags, Tags for Identifying Languages, 2006.
External links
- BCP 47 - the current specification
- Language Subtag Registry - maintained by IANA
- Search tool (in English)
- Language tags in HTML and XML - article by the W3C
- Language Tags (in English) - unofficial site with various tools
- IANA Language Subtag Registry Search - unofficial site for searching subtags
This article is taken from the Japanese Wikipedia article "IETF language tag".
This article is distributed under the CC BY-SA or GFDL license, in accordance with Wikipedia's terms.