Hangul Syllables
| Hangul Syllables | |
|---|---|
| Range | U+AC00..U+D7AF (11,184 code points) |
| Plane | BMP |
| Scripts | Hangul |
| Major alphabets | Hangul |
| Assigned | 11,172 code points |
| Unused | 12 reserved code points |
| Source standards | KS C 5601-1992 |
| Unicode version history | |
| 2.0 | 11,172 (+11,172) |
| Note: [1][2] 6,656 characters were present at U+3400..U+4DFF in Unicode 1.1, but were moved to their current locations with Unicode version 2.0, along with 4,516 additional characters. | |
Hangul Syllables is a Unicode block containing precomposed Hangul syllabic blocks for modern Korean. The order of the characters in this Unicode block follows the Hangul alphabetical order of South Korea.
Algorithm for canonical decomposition mappings and character names
The canonical decomposition mappings and the character names of all characters in this Unicode block are algorithmically defined.
The following computation gives the index numbers of the initial consonant, the vowel, and the final consonant of a given Hangul syllabic block.
- Let S = (code point of a character from U+AC00 to U+D7A3, in decimal) − 44032
- The index number of its
  - initial consonant is S / 588
  - vowel is (S % 588) / 28
  - final consonant is S % 28
- (x / y is the integer quotient of x divided by y; x % y is the remainder of x / y)
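The index computation above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `jamo_indices` and the constant names are assumptions chosen for this example. The divisors 588 and 28 come from the formulas above (588 = 21 vowels × 28 final-consonant slots).

```python
S_BASE = 0xAC00   # 44032, first code point of the block
S_LAST = 0xD7A3   # last assigned syllable in the block
N_COUNT = 588     # 21 vowels * 28 final-consonant slots
T_COUNT = 28      # 27 final consonants plus "no final consonant"


def jamo_indices(syllable: str) -> tuple:
    """Return (initial, vowel, final) index numbers for one Hangul syllable.

    Hypothetical helper for illustration; it mirrors the formulas in the text.
    """
    code_point = ord(syllable)
    if not S_BASE <= code_point <= S_LAST:
        raise ValueError(f"U+{code_point:04X} is not a precomposed Hangul syllable")
    s = code_point - S_BASE
    initial = s // N_COUNT            # integer quotient
    vowel = (s % N_COUNT) // T_COUNT  # remainder, then integer quotient
    final = s % T_COUNT               # remainder
    return initial, vowel, final
```

For the first syllable of the block, U+AC00, all three index numbers are 0; for the last assigned syllable, U+D7A3, they are 18, 20, and 27, the largest values of each index.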
After getting the index numbers, use the following table to get the canonical decomposition mapping and the character name.
- For the canonical decomposition mapping, simply concatenate the Hangul jamo (element) characters in the "Decomposition" columns below, in the order "initial consonant, vowel, final consonant". The result should match the regular expression [ᄀ-ᄒ][ᅡ-ᅵ][ᆨ-ᇂ]? (one character from U+1100 to U+1112, then one character from U+1161 to U+1175, then optionally one character from U+11A8 to U+11C2).
- For the character name, write "HANGUL SYLLABLE " (with the trailing space) first and then concatenate the strings in the "Name" columns below, in the order "initial consonant, vowel, final consonant".
| Index number | Initial consonant (Decomposition) | Code point | Initial consonant (Name) | Vowel (Decomposition) | Code point | Vowel (Name) | Final consonant (Decomposition) | Code point | Final consonant (Name) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | ᄀ | U+1100 | G | ᅡ | U+1161 | A | (null) | | |
| 1 | ᄁ | U+1101 | GG | ᅢ | U+1162 | AE | ᆨ | U+11A8 | G |
| 2 | ᄂ | U+1102 | N | ᅣ | U+1163 | YA | ᆩ | U+11A9 | GG |
| 3 | ᄃ | U+1103 | D | ᅤ | U+1164 | YAE | ᆪ | U+11AA | GS |
| 4 | ᄄ | U+1104 | DD | ᅥ | U+1165 | EO | ᆫ | U+11AB | N |
| 5 | ᄅ | U+1105 | R | ᅦ | U+1166 | E | ᆬ | U+11AC | NJ |
| 6 | ᄆ | U+1106 | M | ᅧ | U+1167 | YEO | ᆭ | U+11AD | NH |
| 7 | ᄇ | U+1107 | B | ᅨ | U+1168 | YE | ᆮ | U+11AE | D |
| 8 | ᄈ | U+1108 | BB | ᅩ | U+1169 | O | ᆯ | U+11AF | L |
| 9 | ᄉ | U+1109 | S | ᅪ | U+116A | WA | ᆰ | U+11B0 | LG |
| 10 | ᄊ | U+110A | SS | ᅫ | U+116B | WAE | ᆱ | U+11B1 | LM |
| 11 | ᄋ | U+110B | (null) | ᅬ | U+116C | OE | ᆲ | U+11B2 | LB |
| 12 | ᄌ | U+110C | J | ᅭ | U+116D | YO | ᆳ | U+11B3 | LS |
| 13 | ᄍ | U+110D | JJ | ᅮ | U+116E | U | ᆴ | U+11B4 | LT |
| 14 | ᄎ | U+110E | C | ᅯ | U+116F | WEO | ᆵ | U+11B5 | LP |
| 15 | ᄏ | U+110F | K | ᅰ | U+1170 | WE | ᆶ | U+11B6 | LH |
| 16 | ᄐ | U+1110 | T | ᅱ | U+1171 | WI | ᆷ | U+11B7 | M |
| 17 | ᄑ | U+1111 | P | ᅲ | U+1172 | YU | ᆸ | U+11B8 | B |
| 18 | ᄒ | U+1112 | H | ᅳ | U+1173 | EU | ᆹ | U+11B9 | BS |
| 19 | | | | ᅴ | U+1174 | YI | ᆺ | U+11BA | S |
| 20 | | | | ᅵ | U+1175 | I | ᆻ | U+11BB | SS |
| 21 | | | | | | | ᆼ | U+11BC | NG |
| 22 | | | | | | | ᆽ | U+11BD | J |
| 23 | | | | | | | ᆾ | U+11BE | C |
| 24 | | | | | | | ᆿ | U+11BF | K |
| 25 | | | | | | | ᇀ | U+11C0 | T |
| 26 | | | | | | | ᇁ | U+11C1 | P |
| 27 | | | | | | | ᇂ | U+11C2 | H |
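The table lookup can be sketched in Python as below. This is an illustrative sketch, not a reference implementation: the names `decompose`, `syllable_name`, and the constants are assumptions made for this example. The name fragments are copied from the "Name" columns of the table, with "(null)" represented as an empty string. `T_BASE` is set to U+11A7, one below the first final consonant, so that final index 1 maps to U+11A8.

```python
S_BASE = 0xAC00   # first syllable of the block
S_COUNT = 11172   # 19 initials * 21 vowels * 28 final slots
L_BASE = 0x1100   # initial consonant with index 0
V_BASE = 0x1161   # vowel with index 0
T_BASE = 0x11A7   # assumption: one below U+11A8, so index 0 means "no final"

# Name fragments from the table; "" stands for "(null)".
L_NAMES = ["G", "GG", "N", "D", "DD", "R", "M", "B", "BB", "S",
           "SS", "", "J", "JJ", "C", "K", "T", "P", "H"]
V_NAMES = ["A", "AE", "YA", "YAE", "EO", "E", "YEO", "YE", "O", "WA", "WAE",
           "OE", "YO", "U", "WEO", "WE", "WI", "YU", "EU", "YI", "I"]
T_NAMES = ["", "G", "GG", "GS", "N", "NJ", "NH", "D", "L", "LG", "LM", "LB",
           "LS", "LT", "LP", "LH", "M", "B", "BS", "S", "SS", "NG", "J", "C",
           "K", "T", "P", "H"]


def _indices(syllable: str) -> tuple:
    """Split a syllable into (initial, vowel, final) index numbers."""
    s = ord(syllable) - S_BASE
    if not 0 <= s < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    return s // 588, (s % 588) // 28, s % 28


def decompose(syllable: str) -> str:
    """Concatenate the jamo for the initial, vowel, and optional final."""
    l, v, t = _indices(syllable)
    jamo = chr(L_BASE + l) + chr(V_BASE + v)
    if t != 0:  # index 0 means there is no final consonant
        jamo += chr(T_BASE + t)
    return jamo


def syllable_name(syllable: str) -> str:
    """Build the character name from the three name fragments."""
    l, v, t = _indices(syllable)
    return "HANGUL SYLLABLE " + L_NAMES[l] + V_NAMES[v] + T_NAMES[t]
```

Keeping the three name lists as plain arrays indexed by the computed index numbers mirrors the table directly, which makes the sketch easy to compare against it row by row.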
Example
Hangul syllabic block: 쇒 (U+C1D2, decimal 49618)
- S = 49618 − 44032 = 5586
- The index number of its
  - initial consonant is 5586 / 588 = 9
  - vowel is (5586 % 588) / 28 = 10
  - final consonant is 5586 % 28 = 14
Therefore, its
- canonical decomposition mapping is ᄉ+ᅫ+ᆵ (U+1109, U+116B, U+11B5)
- character name is "HANGUL SYLLABLE SWAELP"
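The worked example can be cross-checked against Python's standard `unicodedata` module, whose `normalize` and `name` functions cover Hangul syllables. The variable names below are illustrative only.

```python
import unicodedata

syllable = "\uC1D2"  # the syllable from the example above

# Canonical decomposition (NFD) yields the individual jamo.
decomposed = unicodedata.normalize("NFD", syllable)
code_points = [f"U+{ord(ch):04X}" for ch in decomposed]

# The character name is generated from the jamo name fragments.
name = unicodedata.name(syllable)

print(code_points)  # ['U+1109', 'U+116B', 'U+11B5']
print(name)         # HANGUL SYLLABLE SWAELP
```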
History
Encoding Hangul syllabic blocks in Unicode was complicated by a reorganization of the code points:
- Unicode version 1.0.0 encoded 2,350 modern Korean Hangul syllabic blocks from KS C 5601-1987 at U+3400–U+3D2D. This range is now part of CJK Unified Ideographs Extension A.
- Version 1.1 added 1,930 additional modern syllabic blocks from KS C 5657-1991 at U+3D2E–U+44B7, six modern syllabic blocks from GB 12052-89 at U+44B8–U+44BD, and the first 2,370 syllabic blocks that are not in the aforementioned three sets at U+44BE–U+4DFF. These collectively cover the remainder of what is now CJK Unified Ideographs Extension A and all of what is now Yijing Hexagram Symbols.
- In addition, there were three errors in Unicode 1.1:[3]
- U+384E: 삤 in the Unicode Character Database, but 삣 in the Unicode 1.0 and ISO/IEC 10646-1:1993 code charts and per the source standard mappings
- U+40BC: 삣 in the Unicode Character Database, but 삤 in the ISO/IEC 10646-1:1993 code charts and per the source standard mappings
- U+436C: 콫 in the Unicode Character Database, but 콪 in the ISO/IEC 10646-1:1993 code charts and per the source standard mappings
- Version 2.0 added the 4,516 remaining possible syllabic blocks from KS C 5601-1992 and rearranged[4][5] all of the encoded syllabic blocks into the current U+AC00–U+D7AF range, an arrangement that allows algorithmic decomposition into individual jamo.
RFC 2279 explains that this significant incompatible change was made on the assumption that no data or software using Unicode for Korean existed:
"The official justification for allowing such an incompatible change was that no implementations and no data containing Hangul existed, a statement that is likely to be true but remains unprovable. The incident has been dubbed the "Korean mess", and the relevant committees have pledged to never, ever again make such an incompatible change." — RFC 2279
Subsequently, Unicode adopted an encoding stability policy which states that "Once a character is encoded, it will not be moved or removed".[6]
Later, North Korea submitted a proposal to rearrange the characters to follow its own alphabetical order;[7][8] the proposal was rejected.[9]
The following Unicode-related documents record the purpose and process of defining specific characters in the Hangul Syllables block:
| Version | Final code points | Count | UTC ID | L2 ID | WG2 ID | Document |
|---|---|---|---|---|---|---|
| 2.0 | U+AC00..D7A3 | 11,172 | | | N767 | Ksar, Mike (1991-11-25), Unconfirmed minutes WG2-Paris meeting of October 1991 |
| | | | | X3L2/93-078 | N848 | Modified Korean Position, 1992-07-02 |
| | | | UTC/1994-xxx | | | Unicode Technical Committee Meeting #62: Discussion of Korean Hangul Proposal, 1994-09-30 |
| | | | | X3L2/95-031 | N1158 | Korean National Position for adding Hangul characters, 1995-03-08 |
| | | | | | N1170 | Canadian Position on Korean Proposal in N 1158 for adding Hangul characters, 1995-03-10 |
| | | | UTC/1995-021B | | | Aliprand, Joan (1995-03-10), Closed Caucus Minutes, UTC #64 |
| | | | | | N1198 | Working Draft for a pDAM to 10646 on Korean Hangul, 1995-04-05 |
| | | | | X3L2/95-053.1 | N1199 | Background on Korean Coding, 1995-04-30 |
| | | | | | N1203 | Umamaheswaran, V. S.; Ksar, Mike (1995-05-03), Unconfirmed minutes of SC2/WG2 Meeting 27, Geneva |
| | | | | X3L2/95-053 | N1209 | PDAM no. 5 to ISO/IEC 10646-1: Hangul Character Collections, 1995-05-09 |
| | | | UTC/1995-xxx | | | Unicode Technical Committee Meeting #65, Minutes, 1995-06-02 |
| | | | | X3L2/95-090 | N1253 (doc, txt) | Umamaheswaran, V. S.; Ksar, Mike (1995-09-09), Unconfirmed Minutes of WG 2 Meeting # 28 in Helsinki, Finland; 1995-06-26--27 |
| | | | | | N1285 | Hangul Syllable Character Name Generation Algorithm, 1995-11-08 |
| | | | | | N1303 (html, doc) | Umamaheswaran, V. S.; Ksar, Mike (1996-01-26), Minutes of Meeting 29, Tokyo |
| | | | | | N1331 | Paterson, Bruce (1996-03-14), DAM 5 (Korean Hangul) Submittal to JTC1 |
| | | | | | N1332 | Paterson, Bruce (1996-03-14), BMP Revised Layout (DAM 5 diagram attachment) |
| | | | | | N1391 | Paterson, Bruce (1996-05-18), Hangul syllable name algorithm, simplified |
| | | | | | N1353 | Umamaheswaran, V. S.; Ksar, Mike (1996-06-25), Draft minutes of WG2 Copenhagen Meeting # 30 |
| | | | | | N1537 | Table of Replies and Feedback on Amendment 5 – Hangul, 1997-01-29 |
| | | | | L2/97-125 | N1561 | Paterson, Bruce (1997-05-27), Draft Report on JTC1 letter ballot on DAM No. 5 to ISO/IEC 10646-1 (Hangul) |
| | | | | | N1570 | Paterson, Bruce (1997-06-23), Almost Final Text (pages 2-5 and 182 only) of DAM 5 – Hangul |
| | | | | L2/97-288 | N1603 | Umamaheswaran, V. S. (1997-10-24), Unconfirmed Meeting Minutes, WG 2 Meeting # 33, Heraklion, Crete, Greece, 20 June – 4 July 1997 |
| | | | | | N1806 (pdf, doc) | Kim, Kyongsok; Paterson, Bruce (1998-07-08), Defect Report on AMD 5 - Hangul Syllables with Editor's response |
| | | | | L2/99-022 | N1942 | Paterson, Bruce (1998-12-08), Hangul syllable name rules, proposed for ISO/IEC 10646 2nd Edition |
| | | | | L2/99-010 | N1903 (pdf, html, doc) | Umamaheswaran, V. S. (1998-12-30), Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25 |
| | | | | L2/99-114 | N2018 | Paterson, Bruce (1999-03-31), Draft Technical Technical Corrigendum No. 3 to ISO/IEC 10646-1: 1993 |
| | | | | L2/99-232 | N2003 | Umamaheswaran, V. S. (1999-08-03), Minutes of WG 2 meeting 36, Fukuoka, Japan, 1999-03-09--15 |
| | | | | L2/99-297 | N2119 | Disposition of Comments Report on SC 2 N 3306, Draft Technical Corrigendum No. 3 to ISO/IEC 10646-1: 1993, 1999-09-20 |
| | | | | L2/99-298 | N2120 | Paterson, Bruce (1999-09-21), Final Text for Technical Corrigendum No. 3 to ISO/IEC 10646-1: 1993 |
| | | | | L2/03-100 | | Edberg, Peter (2002-11-05), Hangul Mapping Errors |
| | | | | L2/02-463 | N2564 | Kim, Kyongsok (2002-11-30), 3-way cross-reference tables - KS X 1001, KPS 9566, and UCS |
| | | | | L2/04-361 | | Moore, Lisa (2004-11-23), UTC #101 Minutes |
| | | | | L2/17-080 | | Chung, Jaemin (2017-03-29), Informative document about three pre-Unicode-2.0 modern hangul syllables |
References
- [1] "Unicode character database". The Unicode Standard. https://www.unicode.org/ucd/. Retrieved 2023-07-26.
- [2] "Enumerated Versions of The Unicode Standard". The Unicode Standard. https://www.unicode.org/versions/enumeratedversions.html. Retrieved 2023-07-26.
- [3] Chung, Jaemin (2017-03-29). "Informative document about three pre-Unicode-2.0 modern hangul syllables". https://unicode.org/L2/L2017/17080-three-hangul-syl.pdf.
- [4] Chang, K. D.; Choi, In Sook; Kim, Jung Ho (1995-10-04). "Korean Hangul Encoding Conversion Table". https://unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/KSC/HANGUL.TXT.
- [5] "Notes and corrections for HANGUL.TXT". 2005-10-13. https://unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/KSC/HangulReadMe.html.
- [6] "Unicode Character Encoding Stability Policies". Unicode Consortium. 2016-11-14. https://www.unicode.org/policies/stability_policy.html.
- [7] Jo, Chun-Hui (1999-08-10). "Amendment of the part containing the Korean characters in ISO/IEC 10646-1:1998 amendment 5". https://unicode.org/wg2/docs/n2056.pdf.
- [8] "New Work item proposal (NP) for an amendment of the Korean part of ISO/IEC 10646-1:1993". 1999-12-07. https://unicode.org/L2/L1999/99380.htm.
- [9] "Resolutions of WG 2 meeting 37". 1999-09-16. Resolution M37.12. https://unicode.org/L2/L1999/99278-n2104.pdf#page=3.