JIS encoding

Short description: Collection of Japanese standards for digital character encoding

In computing, JIS encoding refers to several Japanese Industrial Standards for encoding the Japanese language.[1] Strictly speaking, the term means either:

  • A set of standard coded character sets for Japanese, notably:
    • JIS X 0201, the Japanese version of ISO 646 (ASCII), containing the base 7-bit ASCII characters (with some modifications) and 63 half-width katakana and related punctuation characters.
    • JIS X 0208, the most common kanji character set, containing 6,879 characters: 6,355 kanji and 524 other characters (one 94 by 94 plane).
    • JIS X 0212, a supplement to JIS X 0208 that adds 5,801 kanji, for a total of 12,156 kanji (a second 94 by 94 plane).
    • JIS X 0213, which extends JIS X 0208 (two planes).
  • JIS X 0202 (the Japanese counterpart of ISO/IEC 2022, often loosely identified with its profile ISO-2022-JP), a set of encoding mechanisms for sending JIS character data over transmission media that support only 7-bit data.
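
The "94 by 94 plane" layout means that each JIS X 0208 character can be addressed by a row number and a cell number, each running from 1 to 94. The following is a minimal sketch of that addressing, assuming the usual convention that the two 7-bit code bytes are obtained by adding 0x20 to each number. The helper name `kuten_to_jis` is made up for illustration, and the row-cell position 38-92 for 日 is taken as an assumption that the code cross-checks against Python's built-in codec.

```python
PLANE_SIZE = 94  # rows and cells are both numbered 1..94


def kuten_to_jis(row: int, cell: int) -> bytes:
    """Map a row-cell position to its two 7-bit code bytes (0x21..0x7E)."""
    if not (1 <= row <= PLANE_SIZE and 1 <= cell <= PLANE_SIZE):
        raise ValueError("row and cell must both be in 1..94")
    return bytes([row + 0x20, cell + 0x20])


# Number of positions available in a single plane.
print(PLANE_SIZE * PLANE_SIZE)

# Cross-check against Python's iso2022_jp codec, which places a 3-byte
# escape sequence before the two code bytes of a JIS X 0208 character.
encoded = "日".encode("iso2022_jp")
print(encoded[3:5] == kuten_to_jis(38, 92))
```

A plane therefore has room for 8,836 positions, of which JIS X 0208 assigns 6,879.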

In practice, "JIS encoding" usually refers to JIS X 0208 character data encoded with JIS X 0202. For instance, the IANA uses the JIS_Encoding label to refer to JIS X 0202, and the ISO-2022-JP label to refer to the profile thereof defined by RFC 1468.[2]
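
As an illustration of what such 7-bit data looks like, the sketch below uses the `iso2022_jp` codec from Python's standard library, which implements the RFC 1468 profile. The sample string is arbitrary; the point is only that the output stays within 7 bits and that escape sequences mark the switches between ASCII and JIS X 0208.

```python
ESC = b"\x1b"

text = "JIS 日本語 ok"  # arbitrary sample mixing ASCII and kanji
encoded = text.encode("iso2022_jp")

# Every byte fits in 7 bits, so the data can pass through 7-bit channels.
print(all(b < 0x80 for b in encoded))

# ESC $ B switches to JIS X 0208; ESC ( B switches back to ASCII.
print(ESC + b"$B" in encoded)
print(ESC + b"(B" in encoded)

# Decoding restores the original text.
print(encoded.decode("iso2022_jp") == text)
```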

Other encoding mechanisms for JIS characters include Shift JIS and EUC-JP. Shift JIS adds the kanji, full-width hiragana and full-width katakana from JIS X 0208 to JIS X 0201 in a backward-compatible way.[3] Shift JIS is perhaps the most widely used encoding in Japan. Its compatibility with the single-byte JIS X 0201 character set allowed electronic equipment manufacturers (such as cash register manufacturers) to offer an upgrade from older, cheaper equipment that could not display kanji to newer equipment while retaining character-set compatibility.
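
The backward compatibility can be observed with Python's built-in `shift_jis` codec. This is a small sketch rather than a full description of the encoding: characters that JIS X 0201 already covers keep their single-byte codes, while characters drawn from JIS X 0208 take two bytes. The sample characters are chosen only for illustration.

```python
samples = {
    "ASCII letter": "A",
    "half-width katakana A": "\uff71",
    "full-width katakana A": "\u30a2",
    "kanji": "\u6f22",
}

for label, ch in samples.items():
    encoded = ch.encode("shift_jis")
    # Single-byte results are the JIS X 0201 codes; two-byte results
    # are characters added from JIS X 0208.
    print(f"{label}: {len(encoded)} byte(s), {encoded.hex()}")
```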

EUC-JP is used on UNIX systems, where the JIS encodings are incompatible with POSIX standards.
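
One byte-level difference often pointed to in this context is shown below. It is offered as an illustration of how the encodings behave, not as a statement of what POSIX requires. In EUC-JP, both bytes of a JIS X 0208 character have the high bit set, so they cannot be mistaken for ASCII characters, whereas the second byte of a Shift JIS character can fall in the ASCII range. The character 表 is used here because its Shift JIS form ends in 0x5C, the code of the backslash.

```python
ch = "表"
sjis = ch.encode("shift_jis")
euc = ch.encode("euc_jp")

# Shift JIS: the trailing byte here is 0x5C, which byte-oriented
# software may interpret as a backslash.
print(sjis.hex(), 0x5C in sjis)

# EUC-JP: both bytes are in the range 0xA1..0xFE, outside ASCII.
print(euc.hex(), all(0xA1 <= b <= 0xFE for b in euc))

# Plain ASCII text, such as a file path, is unchanged in EUC-JP.
print("path/to/file".encode("euc_jp") == b"path/to/file")
```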

A more recent alternative to JIS coded characters is Unicode (UCS coded characters), particularly in the UTF-8 encoding mechanism.
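
Moving existing data to UTF-8 amounts to decoding it with the legacy codec and re-encoding it. The sketch below assumes the input is known to be Shift JIS; the variable `legacy` is a stand-in for bytes that would normally be read from an older file or device.

```python
# Stand-in for bytes read from a legacy Shift JIS source.
legacy = "日本語".encode("shift_jis")

# Decode to Unicode text, then encode as UTF-8.
text = legacy.decode("shift_jis")
utf8 = text.encode("utf-8")

# Each of these kanji takes 2 bytes in Shift JIS and 3 bytes in UTF-8.
print(len(legacy), len(utf8))
```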

Encoding comparison

The following table compares the features of the three main encoding schemes for JIS X 0208.
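
Because all three schemes carry the same JIS X 0208 code positions, the bytes of one can be derived arithmetically from another. The sketch below assumes the commonly described conversion rules (set the high bit of both bytes for EUC-JP; fold two rows into one lead byte for Shift JIS). The function names are illustrative, and the results are compared against Python's built-in codecs rather than taken on trust.

```python
def jis_to_euc(j1: int, j2: int) -> bytes:
    """EUC-JP: the two 7-bit JIS bytes with the high bit set."""
    return bytes([j1 | 0x80, j2 | 0x80])


def jis_to_sjis(j1: int, j2: int) -> bytes:
    """Shift JIS: two JIS rows share one lead byte."""
    s1 = (j1 - 0x21) // 2 + 0x81
    if s1 > 0x9F:        # skip 0xA0..0xDF, used for half-width katakana
        s1 += 0x40
    if j1 % 2 == 1:      # odd row: trail bytes 0x40..0x9E, skipping 0x7F
        s2 = j2 + 0x1F
        if s2 >= 0x7F:
            s2 += 1
    else:                # even row: trail bytes 0x9F..0xFC
        s2 = j2 + 0x7E
    return bytes([s1, s2])


for ch in "日本語表":
    # Bytes 3 and 4 follow the 3-byte escape sequence in ISO-2022-JP.
    j1, j2 = ch.encode("iso2022_jp")[3:5]
    euc, sjis = jis_to_euc(j1, j2), jis_to_sjis(j1, j2)
    matches = euc == ch.encode("euc_jp") and sjis == ch.encode("shift_jis")
    print(ch, bytes([j1, j2]).hex(), euc.hex(), sjis.hex(), matches)
```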

See also

  • Japanese language and computers

References

  1. Haralambous, Yannis (2007). Fonts & Encodings. O'Reilly Media. pp. 42–44. ISBN 9780596102425. 
  2. "Character Sets". IANA. https://www.iana.org/assignments/character-sets/character-sets.xhtml. 
  3. Lunde, Ken (2009). CJKV Information Processing. O'Reilly Media. pp. 262–268. ISBN 9780596514471.