Latin-1 Supplement (Unicode block)
C1 Controls and Latin-1 Supplement | |
---|---|
Range | U+0080..U+00FF (128 code points) |
Plane | BMP |
Scripts | Latin (64 char.) Common (64 char.) |
Major alphabets | French German Icelandic Spanish |
Symbol sets | Punctuation Mathematics Currency |
Assigned | 128 code points 33 Control or Format |
Unused | 0 reserved code points |
Source standards | ISO/IEC 8859-1 |
Unicode version history | |
1.0.0 | 128 (+128) |
Note: [1][2] |
The Latin-1 Supplement (also called C1 Controls and Latin-1 Supplement) is the second block in the Unicode Standard. It encodes the upper range of ISO 8859-1, bytes 80 to FF, at the code points U+0080 to U+00FF. The block contains 128 characters: the 32 C1 controls (U+0080–U+009F), which are not graphic characters; 32 Latin-1 punctuation marks and symbols; 30 pairs of majuscule and minuscule accented Latin letters, plus two further minuscule letters; and 2 mathematical operators.
The C1 Controls and Latin-1 Supplement block has been included in its present form, with the same character repertoire, since version 1.0 of the Unicode Standard.[3] Its block name in Unicode 1.0 was simply Latin1.[4]
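The correspondence with ISO 8859-1 can be checked with a short sketch. It assumes Python's built-in `latin-1` codec as a stand-in for ISO 8859-1; the names `BLOCK` and `mismatches` are illustrative only.

```python
# Sketch: every byte 0x80..0xFF, read as ISO 8859-1 ("latin-1" in Python),
# should decode to the Unicode code point with the same numeric value.
BLOCK = range(0x80, 0x100)  # U+0080..U+00FF

mismatches = [
    byte for byte in BLOCK
    if ord(bytes([byte]).decode("latin-1")) != byte
]

block_size = len(BLOCK)
e_acute_byte = "\u00E9".encode("latin-1")  # U+00E9, Latin Small Letter E with acute
```

Under these assumptions `mismatches` stays empty, and a character such as U+00E9 encodes to the single byte E9.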
Character table
Code | Result | Description | Acronym |
---|---|---|---|
C1 Controls | |||
U+0080 | | Padding Character | PAD |
U+0081 | | High Octet Preset | HOP |
U+0082 | | Break Permitted Here | BPH |
U+0083 | | No Break Here | NBH |
U+0084 | | Index | IND |
U+0085 | | Next Line | NEL |
U+0086 | | Start of Selected Area | SSA |
U+0087 | | End of Selected Area | ESA |
U+0088 | | Character (Horizontal) Tabulation Set | HTS |
U+0089 | | Character (Horizontal) Tabulation with Justification | HTJ |
U+008A | | Line (Vertical) Tabulation Set | VTS |
U+008B | | Partial Line Forward (Down) | PLD |
U+008C | | Partial Line Backward (Up) | PLU |
U+008D | | Reverse Line Feed (Index) | RI |
U+008E | | Single-Shift Two | SS2 |
U+008F | | Single-Shift Three | SS3 |
U+0090 | | Device Control String | DCS |
U+0091 | | Private Use One | PU1 |
U+0092 | | Private Use Two | PU2 |
U+0093 | | Set Transmit State | STS |
U+0094 | | Cancel Character | CCH |
U+0095 | | Message Waiting | MW |
U+0096 | | Start of Protected Area | SPA |
U+0097 | | End of Protected Area | EPA |
U+0098 | | Start of String | SOS |
U+0099 | | Single Graphic Character Introducer | SGCI |
U+009A | | Single Character Introducer | SCI |
U+009B | | Control Sequence Introducer | CSI |
U+009C | | String Terminator | ST |
U+009D | | Operating System Command | OSC |
U+009E | | Private Message | PM |
U+009F | | Application Program Command | APC |
Latin-1 Punctuation and Symbols | |||
U+00A0 | | Non-breaking space | NBSP |
U+00A1 | ¡ | Inverted exclamation mark | |
U+00A2 | ¢ | Cent sign | |
U+00A3 | £ | Pound sign | |
U+00A4 | ¤ | Currency sign | |
U+00A5 | ¥ | Yen sign | |
U+00A6 | ¦ | Broken bar | |
U+00A7 | § | Section sign | |
U+00A8 | ¨ | Diaeresis | |
U+00A9 | © | Copyright sign | |
U+00AA | ª | Feminine ordinal indicator | |
U+00AB | « | Left-pointing double angle quotation mark | |
U+00AC | ¬ | Not sign | |
U+00AD | | Soft hyphen | SHY |
U+00AE | ® | Registered sign | |
U+00AF | ¯ | Macron | |
U+00B0 | ° | Degree sign | |
U+00B1 | ± | Plus-minus sign | |
U+00B2 | ² | Superscript two | |
U+00B3 | ³ | Superscript three | |
U+00B4 | ´ | Acute accent | |
U+00B5 | µ | Micro sign | |
U+00B6 | ¶ | Pilcrow sign | |
U+00B7 | · | Middle dot | |
U+00B8 | ¸ | Cedilla | |
U+00B9 | ¹ | Superscript one | |
U+00BA | º | Masculine ordinal indicator | |
U+00BB | » | Right-pointing double angle quotation mark | |
U+00BC | ¼ | Vulgar fraction one quarter | |
U+00BD | ½ | Vulgar fraction one half | |
U+00BE | ¾ | Vulgar fraction three quarters | |
U+00BF | ¿ | Inverted question mark | |
Letters | |||
U+00C0 | À | Latin Capital Letter A with grave | |
U+00C1 | Á | Latin Capital Letter A with acute | |
U+00C2 | Â | Latin Capital Letter A with circumflex | |
U+00C3 | Ã | Latin Capital Letter A with tilde | |
U+00C4 | Ä | Latin Capital Letter A with diaeresis | |
U+00C5 | Å | Latin Capital Letter A with ring above | |
U+00C6 | Æ | Latin Capital Letter AE | |
U+00C7 | Ç | Latin Capital Letter C with cedilla | |
U+00C8 | È | Latin Capital Letter E with grave | |
U+00C9 | É | Latin Capital Letter E with acute | |
U+00CA | Ê | Latin Capital Letter E with circumflex | |
U+00CB | Ë | Latin Capital Letter E with diaeresis | |
U+00CC | Ì | Latin Capital Letter I with grave | |
U+00CD | Í | Latin Capital Letter I with acute | |
U+00CE | Î | Latin Capital Letter I with circumflex | |
U+00CF | Ï | Latin Capital Letter I with diaeresis | |
U+00D0 | Ð | Latin Capital Letter Eth | |
U+00D1 | Ñ | Latin Capital Letter N with tilde | |
U+00D2 | Ò | Latin Capital Letter O with grave | |
U+00D3 | Ó | Latin Capital Letter O with acute | |
U+00D4 | Ô | Latin Capital Letter O with circumflex | |
U+00D5 | Õ | Latin Capital Letter O with tilde | |
U+00D6 | Ö | Latin Capital Letter O with diaeresis | |
Mathematical operator | |||
U+00D7 | × | Multiplication sign | |
Letters | |||
U+00D8 | Ø | Latin Capital Letter O with stroke | |
U+00D9 | Ù | Latin Capital Letter U with grave | |
U+00DA | Ú | Latin Capital Letter U with acute | |
U+00DB | Û | Latin Capital Letter U with circumflex | |
U+00DC | Ü | Latin Capital Letter U with diaeresis | |
U+00DD | Ý | Latin Capital Letter Y with acute | |
U+00DE | Þ | Latin Capital Letter Thorn | |
U+00DF | ß | Latin Small Letter sharp S | |
U+00E0 | à | Latin Small Letter A with grave | |
U+00E1 | á | Latin Small Letter A with acute | |
U+00E2 | â | Latin Small Letter A with circumflex | |
U+00E3 | ã | Latin Small Letter A with tilde | |
U+00E4 | ä | Latin Small Letter A with diaeresis | |
U+00E5 | å | Latin Small Letter A with ring above | |
U+00E6 | æ | Latin Small Letter AE | |
U+00E7 | ç | Latin Small Letter C with cedilla | |
U+00E8 | è | Latin Small Letter E with grave | |
U+00E9 | é | Latin Small Letter E with acute | |
U+00EA | ê | Latin Small Letter E with circumflex | |
U+00EB | ë | Latin Small Letter E with diaeresis | |
U+00EC | ì | Latin Small Letter I with grave | |
U+00ED | í | Latin Small Letter I with acute | |
U+00EE | î | Latin Small Letter I with circumflex | |
U+00EF | ï | Latin Small Letter I with diaeresis | |
U+00F0 | ð | Latin Small Letter Eth | |
U+00F1 | ñ | Latin Small Letter N with tilde | |
U+00F2 | ò | Latin Small Letter O with grave | |
U+00F3 | ó | Latin Small Letter O with acute | |
U+00F4 | ô | Latin Small Letter O with circumflex | |
U+00F5 | õ | Latin Small Letter O with tilde | |
U+00F6 | ö | Latin Small Letter O with diaeresis | |
Mathematical operator | |||
U+00F7 | ÷ | Division sign | |
Letters | |||
U+00F8 | ø | Latin Small Letter O with stroke | |
U+00F9 | ù | Latin Small Letter U with grave | |
U+00FA | ú | Latin Small Letter U with acute | |
U+00FB | û | Latin Small Letter U with circumflex | |
U+00FC | ü | Latin Small Letter U with diaeresis | |
U+00FD | ý | Latin Small Letter Y with acute | |
U+00FE | þ | Latin Small Letter Thorn | |
U+00FF | ÿ | Latin Small Letter Y with diaeresis | |
Subheadings
The C1 Controls and Latin-1 Supplement block has four subheadings within its character collection: C1 controls, Latin-1 punctuation and symbols, Letters, and Mathematical operator.[5]
C1 controls
The C1 controls subheading contains 32 supplementary control codes inherited from ISO/IEC 8859-1 and many other 8-bit character standards. The alias names for the C0 and C1 control codes are taken from ISO/IEC 6429:1992.[5]
Latin-1 punctuation and symbols
The Latin-1 punctuation and symbols subheading contains 32 characters: common international punctuation, such as the inverted exclamation and question marks and the middle dot, and symbols such as currency signs, spacing diacritical marks, vulgar fractions, and superscript digits.[5]
Letters
The Letters subheading contains 30 pairs of majuscule and minuscule accented or novel Latin characters for western European languages, and two extra minuscule characters not commonly used word-initially (ß and ÿ), for a total of 62 letters.[5]
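The pairing of capitals and small letters can be sketched in code. In the character table each capital sits 0x20 below its small counterpart (for example U+00C0 and U+00E0); the snippet below checks that layout using Python's built-in case mapping, which is an assumption about tooling rather than part of the block definition. The variable names are illustrative.

```python
# Capitals run from U+00C0 to U+00DE, skipping U+00D7 (the multiplication sign).
capitals = [cp for cp in range(0xC0, 0xDF) if cp != 0xD7]

# Pair each capital with the code point 0x20 above it.
pairs = [(chr(cp), chr(cp + 0x20)) for cp in capitals]

# True if every pair is related by case mapping in both directions.
all_paired = all(
    upper.lower() == lower and lower.upper() == upper
    for upper, lower in pairs
)

# The two extra small letters have no capital inside this block.
sharp_s_upper = "\u00DF".upper()      # ß
y_diaeresis_upper = "\u00FF".upper()  # ÿ
```

With Python's case mapping, ß uppercases to the two-letter string "SS" and ÿ uppercases to U+0178, which lies outside the block; this is consistent with the two letters being counted separately from the 30 pairs.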
Mathematical operator
The Mathematical operator subheading appears twice in the block, once for the multiplication sign and once for the division sign.[5]
Number of symbols, letters and control codes
The table below shows the number of letters, symbols and control codes under each subheading of the C1 Controls and Latin-1 Supplement block.
Subheading | Number of characters | Range of characters |
---|---|---|
C1 controls | 32 control codes | U+0080 to U+009F |
Latin-1 punctuation and symbols | 32 punctuation and symbols | U+00A0 to U+00BF |
Letters | 62 letters: 30 pairs of majuscule and minuscule accented Latin characters, plus ß and ÿ | U+00C0 to U+00D6, U+00D8 to U+00F6 and U+00F8 to U+00FF |
Mathematical operators | 2 symbols: U+00D7 × MULTIPLICATION SIGN and U+00F7 ÷ DIVISION SIGN | U+00D7 and U+00F7 |
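The counts in the table can be reproduced with a small sketch. The `classify` helper and its labels are hypothetical, written only to mirror the ranges listed above; the general categories come from Python's `unicodedata` module, so the result assumes a reasonably recent Unicode database.

```python
import unicodedata
from collections import Counter

BLOCK = range(0x80, 0x100)  # U+0080..U+00FF


def classify(cp: int) -> str:
    """Map a code point in the block to its subheading (hypothetical helper)."""
    if 0x80 <= cp <= 0x9F:
        return "C1 controls"
    if 0xA0 <= cp <= 0xBF:
        return "Latin-1 punctuation and symbols"
    if cp in (0xD7, 0xF7):
        return "Mathematical operators"
    return "Letters"


# Number of code points under each subheading.
counts = Counter(classify(cp) for cp in BLOCK)

# Code points whose general category is Control (Cc) or Format (Cf).
control_or_format = sum(
    unicodedata.category(chr(cp)) in ("Cc", "Cf") for cp in BLOCK
)
```

The 33 "Control or Format" code points quoted in the infobox correspond to the 32 C1 controls plus the soft hyphen (U+00AD), whose general category is Cf.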
Emoji
The Latin-1 Supplement block contains two emoji: U+00A9 and U+00AE.[6][7]
The block has four standardized variants defined to specify emoji-style (U+FE0F VS16) or text presentation (U+FE0E VS15) for the two emoji, both of which default to a text presentation.[8]
U+ | 00A9 | 00AE |
---|---|---|
base code point | © | ® |
base+VS15 (text) | ©︎ | ®︎ |
base+VS16 (emoji) | ©️ | ®️ |
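The four variation sequences in the table can be built by appending a variation selector to each base character. The sketch below is illustrative only; the constant and dictionary names are assumptions, and how a sequence is actually rendered depends on the font and platform.

```python
VS15 = "\uFE0E"  # variation selector-15: request text presentation
VS16 = "\uFE0F"  # variation selector-16: request emoji presentation

BASES = ["\u00A9", "\u00AE"]  # copyright sign, registered sign

# Each base character yields one text-style and one emoji-style sequence.
sequences = {
    base: {"text": base + VS15, "emoji": base + VS16}
    for base in BASES
}

for base, forms in sequences.items():
    for style, seq in forms.items():
        code_points = " ".join(f"U+{ord(ch):04X}" for ch in seq)
        print(f"{style:5} {code_points}")
```

Each sequence is two code points long: the base character followed by the selector.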
History
The following Unicode-related documents record the purpose and process of defining specific characters in the Latin-1 Supplement block:
Version | Final code points | Count | L2 ID | WG2 ID | Document |
---|---|---|---|---|---|
1.0.0 | U+0080..009F | 32 | X3L2/95-002 | | PDAM No. 3 to ISO/IEC 10646-1 on coding of C1 controls, 1994-11-01 |
| | | X3L2/95-028 | N1148 | Nine tables of replies to repeated/extended votes, 1995-02-22 |
| | | | N1203 | Umamaheswaran, V. S.; Ksar, Mike (1995-05-03), Unconfirmed minutes of SC2/WG2 Meeting 27, Geneva |
| | | X3L2/95-061 | | DAM no.3 to ISO/IEC 10646-1 (Coding of C1 controls), 1995-06-01 |
| | | | N1307 | Table of replies to JTC1 letter ballot on 10646 DAM 3, Coding of C1 Controls, (SC2 N 2666), 1996-01-15 |
| | | | N1309 | Paterson, Bruce (1996-01-17), Report and Disposition of Comments on DAM 1, UTF 16 and DAM 2, UTF-8, DAM 3, Coding of C1 Controls, and DAM 4, Removal of Annex G: UTF1 |
| | | | N1312 | Paterson, Bruce (1996-01-17), Draft Final Text of 10646 AMD-3, Coding of C1 Controls |
| | | L2/99-048 | | Umamaheswaran, V. S. (1999-02-04), C1 controls in the code charts |
| | | L2/99-054R | | Aliprand, Joan (1999-06-21), Approved Minutes from the UTC/L2 meeting in Palo Alto, February 3-5, 1999 |
| | | | N3046 | Suignard, Michel (2006-02-22), Improving formal definition for control characters |
| | | | N3103 (pdf, doc) | Umamaheswaran, V. S. (2006-08-25), Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27 |
| U+00A0..00FF | 96 | | | (to be determined) |
| | | X3L2/94-098 | N1033 (pdf, doc) | Umamaheswaran, V. S.; Ksar, Mike (1994-06-01), Unconfirmed Minutes of ISO/IEC JTC 1/SC 2/WG 2 Meeting 25, Falez Hotel, Antalya, Turkey, 1994-04-18--22 |
| | | L2/11-016 | | Moore, Lisa (2011-02-15), UTC #126 / L2 #223 Minutes |
| | | L2/11-116 | | Moore, Lisa (2011-05-17), UTC #127 / L2 #224 Minutes, "Change the general category of to U+00AA FEMININE ORDINAL INDICATOR and U+00BA MASCULINE ORDINAL INDICATOR "Lo" for Unicode 6.1." |
| | | L2/11-261R2 | | Moore, Lisa (2011-08-16), UTC #128 / L2 #225 Minutes, "Change the general category from "So" to "Po" ... [U+00A7 and U+00B6]" |
| | | L2/15-050R | | Davis, Mark (2015-01-29), Additional variation selectors for emoji |
References
1. "Unicode character database". The Unicode Standard. https://www.unicode.org. Retrieved 2016-07-09.
2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. https://www.unicode.org/versions/enumeratedversions.html. Retrieved 2016-07-09.
3. The Unicode Standard Version 1.0, Volume 1. Addison-Wesley Publishing Company, Inc. 1991. ISBN 0-201-56788-1.
4. "3.8: Block-by-Block Charts". The Unicode Standard. Unicode Consortium. https://www.unicode.org/versions/Unicode1.0.0/CodeCharts2.pdf.
5. "Unicode 6.2 code charts". The Unicode Standard. https://www.unicode.org/Public/6.2.0/charts/CodeCharts.pdf. Retrieved 2013-04-01.
6. "UTR #51: Unicode Emoji". Unicode Consortium. 2020-02-11. https://unicode.org/reports/tr51/.
7. "UCD: Emoji Data for UTR #51". Unicode Consortium. 2021-08-26. https://unicode.org/Public/UNIDATA/emoji/emoji-data.txt.
8. "UTS #51 Emoji Variation Sequences". The Unicode Consortium. https://unicode.org/Public/UNIDATA/emoji/emoji-variation-sequences.txt.