Saturday, September 24, 2016

IETF language tag

The IETF language tag is a technical specification for identifying languages, defined by BCP 47 (currently RFC 5646 and RFC 4647). It is used in many technical standards, including HTTP [1], HTML [2], XML [3], and PNG [4].

Format

A language tag consists of one or more subtags separated by hyphens. In general, the subtags are written in the following order.

  • language
  • script (writing system)
  • region
  • variant
  • extension
  • privateuse (private use)

The general format is therefore as follows (only the language subtag is required):

language-script-region-variant-extension-privateuse
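The layout above can be sketched as a small parser. This is a simplified illustration, not a full BCP 47 implementation: the function name `parse_language_tag` and the dictionary keys are hypothetical, the shape rules (subtag lengths and character classes) are taken from the RFC 5646 grammar, and the sketch deliberately ignores extended language subtags, grandfathered tags, private-use-only tags, and any check against the IANA registry.

```python
import re

# Shape rules for each subtag kind, simplified from the RFC 5646 grammar.
LANGUAGE = re.compile(r"[a-z]{2,3}")                       # ISO 639 code
SCRIPT = re.compile(r"[a-z]{4}")                           # ISO 15924 code
REGION = re.compile(r"[a-z]{2}|[0-9]{3}")                  # ISO 3166-1 alpha-2 or UN M.49
VARIANT = re.compile(r"[a-z0-9]{5,8}|[0-9][a-z0-9]{3}")
SINGLETON = re.compile(r"[0-9a-wy-z]")                     # any single character except "x"
EXT_SUBTAG = re.compile(r"[a-z0-9]{2,8}")
PRIVATE_SUBTAG = re.compile(r"[a-z0-9]{1,8}")


def parse_language_tag(tag):
    """Split a tag into its components by subtag shape (hypothetical helper).

    Subtags are compared case-insensitively, so the result is lowercased.
    Raises ValueError if the tag does not fit the simplified layout.
    """
    parts = tag.lower().split("-")
    result = {"language": None, "script": None, "region": None,
              "variants": [], "extensions": {}, "privateuse": []}

    # The language subtag is the only required component.
    if not LANGUAGE.fullmatch(parts[0]):
        raise ValueError(f"unsupported language subtag in {tag!r}")
    result["language"] = parts[0]
    i = 1

    # Optional script, then optional region, then any number of variants.
    if i < len(parts) and SCRIPT.fullmatch(parts[i]):
        result["script"] = parts[i]
        i += 1
    if i < len(parts) and REGION.fullmatch(parts[i]):
        result["region"] = parts[i]
        i += 1
    while i < len(parts) and VARIANT.fullmatch(parts[i]):
        result["variants"].append(parts[i])
        i += 1

    # Extensions: a one-character singleton followed by one or more subtags.
    while i < len(parts) and SINGLETON.fullmatch(parts[i]):
        singleton = parts[i]
        i += 1
        subtags = []
        while i < len(parts) and EXT_SUBTAG.fullmatch(parts[i]):
            subtags.append(parts[i])
            i += 1
        if not subtags or singleton in result["extensions"]:
            raise ValueError(f"malformed extension in {tag!r}")
        result["extensions"][singleton] = subtags

    # Private use: "x" followed by one or more subtags, always last.
    if i < len(parts) and parts[i] == "x":
        i += 1
        while i < len(parts) and PRIVATE_SUBTAG.fullmatch(parts[i]):
            result["privateuse"].append(parts[i])
            i += 1
        if not result["privateuse"]:
            raise ValueError(f"empty private-use sequence in {tag!r}")

    # Anything left over does not fit the expected order.
    if i != len(parts):
        raise ValueError(f"unexpected subtag {parts[i]!r} in {tag!r}")
    return result
```

Under these assumptions, `parse_language_tag("sl-Latn-IT-nedis")` yields the language `sl`, the script `latn`, the region `it`, and the single variant `nedis`, while a grandfathered tag such as `i-enochian` is rejected with a `ValueError`.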

Each subtag is derived from the following standards.

  • language: ISO 639-1, ISO 639-2, ISO 639-3, ISO 639-5 [5]
  • script: ISO 15924
  • region: ISO 3166-1 alpha-2, UN M.49
  • variant: (no source standard, because variants are defined specifically for language tags)
  • extension: (no source standard, because this part is reserved for future extensions)
  • privateuse: (no source standard, because this part is for private use)

The list of currently valid subtags is published in the Language Subtag Registry maintained by IANA.

Subtags are not case-sensitive, but the specification recommends writing them the same way as the Language Subtag Registry does: region subtags in uppercase, script subtags with only the first letter capitalized, and all other subtags in lowercase.
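The recommended capitalization can be applied without consulting the registry. The sketch below follows the shortcut described in RFC 5646 section 2.1.1, which relies on subtag length alone: everything is lowercase, except that two-letter subtags become uppercase and four-letter subtags become title case when they are neither the first subtag nor placed after a singleton. The function name `format_language_tag` is hypothetical, and treating every subtag after the first singleton as lowercase is one reading of that rule. The function only adjusts case and does not check that the tag is valid.

```python
def format_language_tag(tag):
    """Return the tag in the recommended case (hypothetical helper).

    Uses subtag length only: 2 characters -> uppercase (region),
    4 characters -> title case (script), everything else lowercase.
    """
    formatted = []
    after_singleton = False
    for position, subtag in enumerate(tag.split("-")):
        subtag = subtag.lower()
        # The first subtag and anything after a singleton stay lowercase.
        if position > 0 and not after_singleton:
            if len(subtag) == 2:
                subtag = subtag.upper()
            elif len(subtag) == 4:
                subtag = subtag.capitalize()
        # A one-character subtag introduces an extension or private-use part.
        if len(subtag) == 1:
            after_singleton = True
        formatted.append(subtag)
    return "-".join(formatted)
```

For instance, `format_language_tag("ZH-HANS-cn")` gives `zh-Hans-CN`, and `format_language_tag("en-ca-x-ca")` gives `en-CA-x-ca`, because the final `ca` follows the private-use singleton `x`.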

The most common forms of language tag use either the language subtag alone or the language subtag together with a region subtag. For example, en consists of a single language subtag (from ISO 639-1) and denotes English. In contrast, en-CA appends the region subtag CA (from ISO 3166-1) to the language subtag and denotes Canadian English.

History

The IETF language tag was first defined in RFC 1766, published in March 1995. It was revised by RFC 3066 in January 2001, which added ISO 639-2 codes (previously only ISO 639-1 codes were permitted) and allowed digits in subtags for the first time.

The next version of the specification was published in September 2006 as RFC 4646 (the main body of the specification) and RFC 4647 (matching behavior). RFC 4646 introduced a structured format for language tags, added ISO 15924 and UN M.49 to the already used ISO 639 (parts 1 and 2) and ISO 3166, and replaced the old registry of tags with a new registry of subtags. Tags defined before this version that do not fit the new structure were carried over ("grandfathered") to maintain compatibility with RFC 3066.

The IETF working group is preparing the next revision of the specification, which is currently going through the approval process. The main purpose of this revision is to add ISO 639-3 to the Language Subtag Registry [6].

Examples of language tags

The following examples are taken from BCP 47 [7].

  • language only:
    • de (German)
    • ja (Japanese)
    • i-enochian (an example of a grandfathered tag)
  • language-script:
    • zh-Hant (Chinese written in Traditional Chinese characters)
    • zh-Hans (Chinese written in Simplified Chinese characters)
    • sr-Cyrl (Serbian written in the Cyrillic script)
    • sr-Latn (Serbian written in the Latin script)
  • language-script-region:
    • zh-Hans-CN (Chinese written in Simplified Chinese characters, as used in mainland China)
    • sr-Latn-CS (Serbian written in the Latin script, as used in Serbia and Montenegro)
  • language-variant:
    • sl-nedis (the Nadiza dialect of Slovenian)
  • language-region-variant:
    • de-CH-1901 (German as used in Switzerland, written in the 1901 orthography)
    • sl-IT-nedis (the Nadiza dialect of Slovenian, as used in Italy)
  • language-script-region-variant:
    • sl-Latn-IT-nedis (the Nadiza dialect of Slovenian written in the Latin script, as used in Italy; note that this tag is not recommended, because sl already implies Latn)
  • language-region:
    • en-US (English as used in the United States)
    • es-419 (Spanish as used in Latin America and the Caribbean, using a UN M.49 region code)

This article is taken from the Japanese Wikipedia article "IETF language tag".

This article is distributed under the CC BY-SA or GFDL license, in accordance with the terms of Wikipedia.

Wikipedia and Tranpedia do not guarantee the accuracy of this document. See our disclaimer for more information.

In addition, Tranpedia only translates foreign-language works published under licenses compatible with CC BY-SA and accepts no responsibility for their content.
