Videotex character set


The character sets used by Videotex are based, to greater or lesser extents, on ISO/IEC 2022. Three Data Syntax systems are defined by ITU-T T.101, corresponding to the Videotex systems of different countries.

Data Syntax 1

Data Syntax 1 is defined in Annex B of T.101:1994. It is based on the CAPTAIN system used in Japan. Its graphical sets include JIS X 0201 and JIS X 0208.

The following G-sets are available through ISO/IEC 2022-based designation escapes:[1]:AnxB.2.3

Name G-set escape type F byte ISO-IR for F byte
Primary Character set Single byte 94-code 0x4A (J) ISO-IR-14 (JIS X 0201 Roman)
Katakana Character set Single byte 94-code 0x49 (I) ISO-IR-13 (JIS X 0201 Kana)
Mosaic I set Single byte 94-code 0x33 (3) (Occupies private-use F byte; also registered as ISO-IR-137 with F byte 0x79)[2]
Mosaic II set Single byte 94-code 0x63 (c) ISO-IR-71[3]
Display Control set Single byte 96-code 0x38 (8) (Occupies private-use F byte)
PDI set Single byte 96-code 0x57 (W) (F byte exceptionally reserved and not used in ISO-IR)[4]
MVI set Single byte 96-code 0x39 (9) (Occupies private-use F byte)
Kanji set Multiple byte 94n-code 0x42 (B) ISO-IR-87 (JIS X 0208:1983)
Macro set Single byte DRCS 96-code 0x40 (@) (Uses a DRCS escape syntax)
DRCS I set Single byte DRCS 94-code 0x41 (A) (Is a DRCS)
DRCS II set Multiple byte DRCS 94n-code 0x40 (@) (Is a DRCS)
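
As a rough illustration of how such designations might be encoded, the Python sketch below builds three-byte designation sequences for the single-byte, non-DRCS sets in the table above. The intermediate bytes are those that ISO/IEC 2022 assigns to 94-sets and 96-sets; the set names, the F_BYTES table and the designate function are hypothetical names chosen for this example, and the sketch does not check whether Data Syntax 1 actually permits a given set in a given G-element.

```python
ESC = 0x1B

# Intermediate bytes from ISO/IEC 2022, keyed by target element (G0..G3).
INTERMEDIATE_94 = {0: 0x28, 1: 0x29, 2: 0x2A, 3: 0x2B}
# ISO/IEC 2022 does not provide a designation of 96-sets to G0.
INTERMEDIATE_96 = {1: 0x2D, 2: 0x2E, 3: 0x2F}

# (set size, F byte) taken from the table above; the keys are hypothetical.
F_BYTES = {
    "primary": (94, 0x4A),
    "katakana": (94, 0x49),
    "mosaic1": (94, 0x33),
    "mosaic2": (94, 0x63),
    "display_control": (96, 0x38),
    "pdi": (96, 0x57),
    "mvi": (96, 0x39),
}


def designate(name, g):
    """Return the escape sequence designating set `name` to element G<g>."""
    size, f_byte = F_BYTES[name]
    table = INTERMEDIATE_94 if size == 94 else INTERMEDIATE_96
    if g not in table:
        raise ValueError(f"a {size}-set cannot be designated to G{g}")
    return bytes([ESC, table[g], f_byte])
```

For example, designate("katakana", 1) yields ESC 0x29 0x49, that is, the Katakana set designated to G1.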

Mosaic sets for Data Syntax 1

The mosaic sets supply characters for use in semigraphics.

Videotex Mosaic set: First Mosaic set for Data Syntax 1 (partial Unicode mapping)[2]
0 1 2 3 4 5 6 7 8 9 A B C D E F
🮛
🮚
🭒 🭓 🭔 🭕 🭖 🭗 🭘 🭙 🭚 🭛 🭜 🭬 🭭
🭝 🭞 🭟 🭠 🭡 🭢 🭣 🭤 🭥 🭦 🭧 🭮 🭯

� Not in Unicode

Videotex Mosaic set: Second Mosaic set for Data Syntax 1[3]
0 1 2 3 4 5 6 7 8 9 A B C D E F
🬀 🬁 🬂 🬃 🬄 🬅 🬆 🬇 🬈 🬉 🬊 🬋 🬌 🬍 🬎
🬏 🬐 🬑 🬒 🬓 🬔 🬕 🬖 🬗 🬘 🬙 🬚 🬛 🬜 🬝
🬼 🬽 🬾 🬿 🭀 🭁 🭂 🭃 🭄 🭅 🭆 🭨 🭩 🭰 🮕
🭇 🭈 🭉 🭊 🭋 🭌 🭍 🭎 🭏 🭐 🭑 🭪 🭫 🭵
🬞 🬟 🬠 🬡 🬢 🬣 🬤 🬥 🬦 🬧 🬨 🬩 🬪 🬫 🬬
🬭 🬮 🬯 🬰 🬱 🬲 🬳 🬴 🬵 🬶 🬷 🬸 🬹 🬺 🬻
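
The sextant rows of the table above (0x2x, 0x3x, 0x6x and 0x7x) appear to follow a regular pattern, which the Python sketch below tries to capture. It assumes that each code carries a six-bit cell pattern (code minus 0x20 in rows 2 and 3, code minus 0x40 in rows 6 and 7) and relies on the fact that the Unicode sextant characters starting at U+1FB00 are ordered by that pattern, with the left and right half blocks encoded elsewhere (U+258C and U+2590). The function names are hypothetical, and the diagonal characters in rows 4 and 5 are not covered.

```python
# Patterns that Unicode encodes outside the sextant run at U+1FB00.
HALF_BLOCKS = {21: "\u258C", 42: "\u2590"}  # left half, right half


def sextant_pattern(code):
    """Return the six-bit cell pattern assumed for a sextant-row code."""
    if 0x21 <= code <= 0x3F:
        return code - 0x20
    if 0x60 <= code <= 0x7E:
        return code - 0x40
    raise ValueError(f"0x{code:02X} is not in a sextant row")


def sextant_char(code):
    """Map a sextant-row code to a Unicode character."""
    pattern = sextant_pattern(code)
    if pattern in HALF_BLOCKS:
        return HALF_BLOCKS[pattern]
    # Skip the slots that the two half blocks would otherwise occupy.
    offset = pattern - 1 - (pattern > 21) - (pattern > 42)
    return chr(0x1FB00 + offset)
```

Under these assumptions, sextant_char(0x21) gives 🬀 and sextant_char(0x7E) gives 🬻, matching the first and last sextant cells shown in the table.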

Data Syntax 2

Data Syntax 2 is defined in Annex C of T.101:1994. It corresponds to some European Videotex systems such as CEPT T/CD 06-01. The graphical character coding of Data Syntax 2 is based on T.51.

The default G2 set of Data Syntax 2 is based on an older version of T.51. It lacks the non-breaking space, soft hyphen, not sign (¬) and broken bar (¦) present in the current version, but adds a dialytika tonos (΅; the combining form is U+0344) at the beginning of the row of diacritical marks, for combination with codes from a Greek primary set.[5] An umlaut diacritic code distinct from the diaeresis code, as included in some versions of T.61, is also sometimes included.[6]

The default G1 set is the second mosaic set, corresponding roughly to the second mosaic set of Data Syntax 1.[1]:AnxCpt1/TableC.11 The default G3 set is the third mosaic set, which matches the first mosaic set of Data Syntax 1 for 0x60 through 0x6D and 0x70 through 0x7D and differs elsewhere.[1]:AnxCpt1/TableC.12 The first mosaic set matches the second except for 0x40 through 0x5E: 0x40 through 0x5A follow ASCII (supplying uppercase letters), whereas the remainder (0x5B through 0x5E) are national variant characters; the displaced full block is placed at 0x7F.[1]:AnxCpt1/TableC.10

Videotex Mosaic set: First Mosaic set for Data Syntax 2[1]:AnxCpt1/TableC.10
0 1 2 3 4 5 6 7 8 9 A B C D E F
 SP  🬀 🬁 🬂 🬃 🬄 🬅 🬆 🬇 🬈 🬉 🬊 🬋 🬌 🬍 🬎
🬏 🬐 🬑 🬒 🬓 🬔 🬕 🬖 🬗 🬘 🬙 🬚 🬛 🬜 🬝
@ A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z ^a ½^a ^a ^a ⌗/_^b
🬞 🬟 🬠 🬡 🬢 🬣 🬤 🬥 🬦 🬧 🬨 🬩 🬪 🬫 🬬
🬭 🬮 🬯 🬰 🬱 🬲 🬳 🬴 🬵 🬶 🬷 🬸 🬹 🬺 🬻
  • ^a Representation of 0x5B-5E is not guaranteed in international communication and may be replaced by national application oriented variants.[1]
  • ^b 0x5F may be displayed either as ⌗ (square) or _ (lower bar) to represent the terminator function required by Videotex services.[1]

Data Syntax 3

Data Syntax 3 is defined in Annex D of T.101:1994. The graphical character coding of Data Syntax 3 is based on T.51.

The supplementary set for Data Syntax 3 is based on an older version of T.51. It lacks the non-breaking space, soft hyphen, not sign (¬) and broken bar (¦) present in the current version, and it uses space left unallocated in that set for two additional non-spacing marks (a "vector overbar" and a solidus) and several semigraphic characters.

See the comments in the T.51 article for caveats about the combining mark Unicode mappings shown below. Unlike Unicode combining characters, T.51 diacritic codes precede the base character.
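
The ordering difference can be sketched in Python as follows. This is a minimal illustration, not a full decoder: the DIACRITICS table covers only a few marks, their code positions (0x41 onward) are assumed from the order of the diacritic row in the table below, and the compose function is a hypothetical helper. The diacritic code arrives first, but the Unicode combining mark is emitted after the base letter.

```python
import unicodedata

# Partial, assumed mapping from diacritic code to Unicode combining mark.
DIACRITICS = {
    0x41: "\u0300",  # grave
    0x42: "\u0301",  # acute
    0x43: "\u0302",  # circumflex
    0x44: "\u0303",  # tilde
    0x48: "\u0308",  # diaeresis
    0x4B: "\u0327",  # cedilla
}


def compose(diacritic_code, base):
    """Combine a leading diacritic code with the base letter that follows it."""
    mark = DIACRITICS[diacritic_code]
    # Unicode order is base first, then mark; NFC folds the pair into a
    # precomposed character where one exists.
    return unicodedata.normalize("NFC", base + mark)
```

For instance, compose(0x42, "e") returns "é", although the acute code preceded the letter in the incoming data.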

Supplementary Set for Videotex Data Syntax 3[7]
0 1 2 3 4 5 6 7 8 9 A B C D E F
¡ ¢ £ $ ¥ # § ¤ «
° ± ² ³ × µ · ÷ » ¼ ½ ¾ ¿
◌⃑ ◌̀ ◌́ ◌̂ ◌̃ ◌̄ ◌̆ ◌̇ ◌̈ ◌̸ ◌̊ ◌̧ ◌̲ ◌̋ ◌̨ ◌̌
¹ ® ©
Æ Đ/Ð ª Ħ IJ Ŀ Ł Ø Œ º Þ Ŧ Ŋ ʼn
ĸ æ đ ð ħ ı ij ŀ ł ø œ ß þ ŧ ŋ
  Differences from T.51 (1988 edition, first supplementary set)

C0 control codes

C0 control codes for Videotex differ from ASCII as shown in the table below. The NUL, BEL, SO (LS1), SI (LS0) and ESC codes are also available in some or all data syntaxes, with the same names and semantics as in ASCII.[8][9][10]

Seq Dec Hex Replaced Syntaxes Acronym Name Description
^H 08 08 BS 1,[8] 2,[9] 3[10] APB Active Position Backward Moves cursor one position backward. If it is at the start of the line, moves it to the end of the line and back one line. This retains one possible semantic of the ASCII BS.
^I 09 09 HT 1,[8] 2,[9] 3[10] APF Active Position Forward Moves cursor one position forward. If it is at the end of the line, moves it to the start of the line and forward one line.
^J 10 0A LF 1,[8] 2,[9] 3[10] APD Active Position Down Moves cursor one line forward. If it is at the last line of the screen, moves it to the first line unless Data Syntax 3 scroll mode is active. This retains one possible semantic of the ASCII LF.
^K 11 0B VT 1,[8] 2,[9] 3[10] APU Active Position Up Moves cursor one line backward. If it is at the first line of the screen, moves it to the last line unless Data Syntax 3 scroll mode is active.
^L 12 0C FF 1,[8] 2,[9] 3[10] CS Clear Screen Resets entire display to spaces with default display attributes and returns the cursor to its initial position. In Data Syntax 1, also resets macros and DRCS. This retains one possible semantic of the ASCII FF.
^M 13 0D CR 1,[8] 2,[9] 3[10] APR Active Position Return Moves the cursor to the start of the line. In Data Syntax 3, may instead move it to the start of the active field if it is entirely within it. This retains one possible semantic of the ASCII CR.
^Q 17 11 DC1/XON 2[9] CON Cursor On Makes the cursor visible.
^R 18 12 DC2 2[9] RPT Repeat Repeats the immediately preceding graphic character a number of times indicated by the low six bits of the following byte (from 0x40 to 0x7F).
^T 20 14 DC4 1[1]:AnxB.3.1 KMC Key-In-Monitor Conceal Takes one parameter: 0x40 makes the key-in-monitor area unconcealed, 0x41 makes it concealed.
2[9] COF Cursor Off Makes the cursor invisible.
^X 24 18 CAN 1,[8] 2,[9] 3[10] CAN Cancel In Data Syntax 2, fill the rest of the current line (after the current position) with spaces (compare EL). In Data Syntax 1 and 3, immediately stop all running macros. Contrast the semantic of basic ASCII CAN.
^Y 25 19 EM 1,[8] 2,[9] 3[10] SS2 Single Shift Two Non-locking shift code for G2.
^Z 26 1A SUB 3[10] SDC Service Delimitor Character Implementation-defined but non-presentational.
^\ 28 1C FS 1,[8] 3[10] APS Active Position Set Followed by two bytes (from 0x40 to 0x7F; may also be from 0xA0 to 0xFF in Data Syntax 3) respectively giving a row and column address in their low six bits. Compare CUP and HVP.
^] 29 1D GS 1,[8] 2,[9] 3[10] SS3 Single Shift Three Non-locking shift code for G3.
^^ 30 1E RS 1,[8] 2,[9] 3[10] APH Active Position Home Returns cursor to the initial position.
^_ 31 1F US 1,[8] 3[10] NSR Non-Selective Reset Resets all display attributes (including ISO 2022 state, domain, text parameters, textures, colour mode but not macros, DRCS or programmable masks), then moves the cursor to a specified position. Followed by two bytes (from 0x40 to 0x7F; may also be from 0xA0 to 0xFF in Data Syntax 3) respectively giving a row and column address in their low six bits. Compare RIS.
2[9] APA Active Position Address Followed by two or four bytes (from 0x40 to 0x7F) giving a row and column address in their low six bits. Four bytes are used if there are more than 63 rows and columns, with the most significant six bits being first for each parameter. Compare CUP and HVP. If the following byte is not in the range of 0x40 to 0x7F, indicates a switch to another coding scheme (contrast DOCS).
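
Two of the parameterised controls in the table above lend themselves to a short sketch in Python. The first function decodes APA-style address parameters, taking the low six bits of each byte and, in the four-byte form, treating the first byte of each pair as the most significant six bits. The second expands the Data Syntax 2 RPT control. Both function names are hypothetical, and the sketch assumes that the repeat count gives the number of additional copies, which the description above does not settle.

```python
RPT = 0x12  # Data Syntax 2 repeat control from the table above


def decode_address(params):
    """Decode two or four parameter bytes into a (row, column) pair."""
    if any(not 0x40 <= b <= 0x7F for b in params):
        raise ValueError("parameter bytes must lie in 0x40-0x7F")
    values = [b & 0x3F for b in params]  # keep the low six bits
    if len(values) == 2:
        return values[0], values[1]
    if len(values) == 4:
        row = (values[0] << 6) | values[1]
        column = (values[2] << 6) | values[3]
        return row, column
    raise ValueError("expected two or four parameter bytes")


def expand_rpt(data):
    """Expand RPT controls that follow a graphic character."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        has_count = i + 1 < len(data) and 0x40 <= data[i + 1] <= 0x7F
        if byte == RPT and has_count and out and out[-1] >= 0x20:
            # Assumption: the count is the number of extra copies.
            out.extend(bytes([out[-1]]) * (data[i + 1] & 0x3F))
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)
```

With these definitions, decode_address(b"\x41\x42") gives row 1, column 2, and expand_rpt(b"A\x12\x43") gives b"AAAA".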

C1 control codes

The following specialised C1 control codes are used in Videotex. There are four registered sets, with some differences between them.

8-bit Escape Data Syntax 1[11] Data Syntax 2, "Parallel" C1 set[12][1]:AnxC.3.3.2 Data Syntax 2, "Serial" C1 set[13][1]:AnxC.3.3.1 Data Syntax 3[14]
0x80 ESC 0x40 (@) BKF, Black Foreground. ABK, Alpha Black. Switch to alphabetic, black foreground. DEFM, Define Macro. Next character (from 0x20 to 0x7F) gives macro name, rest is stored as part of macro until another DEF* or an END.
0x81 ESC 0x41 (A) RDF, Red Foreground. ANR, Alpha Red. Switch to alphabetic, red foreground. DEFP, Define P-Macro. Like DEFM, but simultaneously defines and executes the macro.
0x82 ESC 0x42 (B) GRF, Green Foreground. ANG, Alpha Green. Switch to alphabetic, green foreground. DEFT, Define Transmit-Macro. Like DEFM but defines a macro to be transmitted, not executed.
0x83 ESC 0x43 (C) YLF, Yellow Foreground. ANY, Alpha Yellow. Switch to alphabetic, yellow foreground. DEFD, Define DRCS. Defines a character in the Dynamically Redefinable Character Set. Expected to be followed by the character code defined (from 0x20 to 0x7F) unless it terminates a previous DEFD, in which case it defines the next code. Terminated by another DEF* or an END.
0x84 ESC 0x44 (D) BLF, Blue Foreground. ANB, Alpha Blue. Switch to alphabetic, blue foreground. DEFX, Define Texture. Defines a texture mask. Expected to be followed by the texture mask ID defined (from 0x40 to 0x44). Terminated by another DEF* or an END.
0x85 ESC 0x45 (E) MGF, Magenta Foreground. ANM, Alpha Magenta. Switch to alphabetic, magenta foreground. END, End. Terminates a macro, DRCS character or texture definition. Also used in unprotected fields.
0x86 ESC 0x46 (F) CNF, Cyan Foreground. ANC, Alpha Cyan. Switch to alphabetic, cyan foreground. REP, Repeat. Repeats preceding spacing graphical character a number of times specified by the following byte (from 0x40 to 0x7F).
0x87 ESC 0x47 (G) WHF, White Foreground. ANW, Alpha White. Switch to alphabetic, white foreground. REPE, Repeat to End of Line. Repeats preceding spacing graphical character until the end of the line is reached.
0x88 ESC 0x48 (H) SSZ, Small Size. Characters half normal width and height. FSH, Flashing. Characters displayed flashing between foreground and background. REVV, Reverse Video. Enables reverse video mode.
0x89 ESC 0x49 (I) MSZ, Medium Size. Characters normal height, half normal width. STD, Steady. Terminates flashing. NORV, Normal Video. Disables reverse video mode.
0x8A ESC 0x4A (J) NSZ, Normal Size. Characters normal width and height. EBX, End Box. Terminates SBX. SMTX, Small Text. Text size 1/80 of screen width and 5/128 of screen height.
0x8B ESC 0x4B (K) SZX, Size Control. Followed by a one-byte parameter. 0x41 means double height (DBH), 0x44 means double width (DBW), 0x45 means doubled width and height (DBS).[1]:AnxB.3.2.2 SBX, Start Box. Defines a non-alphanumeric area, with transparent background. Terminated by EBX. METX, Medium Text. Text size 1/32 of screen width and 3/64 of screen height.
0x8C ESC 0x4C (L) (not used) NSZ, Normal Size. Characters normal width and height. NOTX, Normal Text. Text size 1/40 of screen width and 5/128 of screen height.
0x8D ESC 0x4D (M) (not used) DBH, Double Height. Characters normal width and double normal height. Inactive on top line. DBH, Double Height. Characters normal width and double normal height. Inactive on bottom line. DBH, Double Height. Text size 1/40 of screen width and 5/64 of screen height.
0x8E ESC 0x4E (N) CON, Cursor On. Makes cursor visible. DBW, Double Width. Characters normal height and double normal width. Inactive in last position of line. BSTA, Blink Start.
0x8F ESC 0x4F (O) COF, Cursor Off. Makes cursor invisible. DBS, Double Size. Characters double normal height and double normal width. Inactive on top line or in last position of line. DBS, Double Size. Characters double normal height and double normal width. Inactive on bottom line or in last position of line. DBS, Double Size. Text size 1/20 of screen width and 5/64 of screen height.
0x90 ESC 0x50 (P) COL, Background or Foreground Colour. Takes a one-byte parameter. 0x48–0x4F sets a reduced intensity foreground. 0x50–0x57 sets background colour. 0x58–0x5F sets a reduced intensity background. Colour order is the same as that of the individual foreground colour controls (black, red, green, yellow, blue, magenta, cyan, white), but transparent takes the place of reduced intensity black.[1]:AnxB.3.2.1 BKB, Black Background. MBK, Mosaic Black. Switch to mosaic, black foreground. PRO, Protect. Makes all character fields within the active field protected.
0x91 ESC 0x51 (Q) FLC, Flashing Control. Takes one parameter: 0x40 for "normal" flashing, 0x41 through 0x47 for other flashing modes, 0x4F for steady (terminate flashing).[1]:AnxB.3.2.4 RDB, Red Background. MSR, Mosaic Red. Switch to mosaic, red foreground. (EDC1, not used)
0x92 ESC 0x52 (R) CDC, Conceal Display Control. Takes a one-byte parameter defining conceal display attributes, which can make text invisible until user interaction. 0x40 is used to start a concealed range (CDY), 0x4F is used to terminate it (SCD).[1]:AppB.3.2.7 GRB, Green Background. MSG, Mosaic Green. Switch to mosaic, green foreground. (EDC2, not used)
0x93 ESC 0x53 (S) (not used) YLB, Yellow Background. MSY, Mosaic Yellow. Switch to mosaic, yellow foreground. (EDC3, not used)
0x94 ESC 0x54 (T) (not used) BLB, Blue Background. MSB, Mosaic Blue. Switch to mosaic, blue foreground. (EDC4, not used)
0x95 ESC 0x55 (U) P-MACRO, Photo Macro. Followed by a single-byte parameter (0x40 for define, 0x41 for define and execute, 0x42 to define a transmit-macro, 0x4F to delimit the end of a macro definition).[1]:AppB.3.2.9 Second single-byte parameter (from 0x20 to 0x7F) identifies the photo macro being defined (from PM0 to PM95). MGB, Magenta Background. MSM, Mosaic Magenta. Switch to mosaic, magenta foreground. WWON, Word Wrap On.
0x96 ESC 0x56 (V) (not used) CNB, Cyan Background. MSC, Mosaic Cyan. Switch to mosaic, cyan foreground. WWOF, Word Wrap Off.
0x97 ESC 0x57 (W) (not used) WHB, White Background. MSW, Mosaic White. Switch to mosaic, white foreground. SCON, Scroll On. Next-lining off the bottom of the screen moves the rest of the screen up to make space.
0x98 ESC 0x58 (X) RPC, Repeat Control. Repeats preceding spacing graphical character a number of times specified by the low six bits of the following byte (from 0x40 to 0x7F). Repeats to end of line if byte is 0x40. Compare REP from Data Syntax 3. CDY, Conceal Display. Display characters as spaces (might be terminated by SCD). SCOF, Scroll Off. Next-lining off the bottom of the screen wraps around to the top of the screen.
0x99 ESC 0x59 (Y) SPL, Stop Lining. Terminates underlining. For mosaic characters, non-underlined font corresponds to contiguous display, with the blocks within a mosaic character joined together. USTA, Underline Start. Begins underlined letters, and switches to separated display for mosaics.
0x9A ESC 0x5A (Z) STL, Start Lining. Begins underlined letters. For mosaics, this corresponds to separated display, with the blocks within a mosaic character shown separated. USTO, Underline Stop. Terminates underlining, and switches to contiguous display for mosaics.
0x9B ESC 0x5B ([) (not used) CSI, Control Sequence Introducer. FLC, Flash Cursor. User input cursor turned on, flashing.
0x9C ESC 0x5C (\) (not used) NPO, Normal Polarity. Foreground in foreground colour, background in background colour. BBD, Black Background. STC, Steady Cursor. User input cursor turned on, always visible.
0x9D ESC 0x5D (]) (not used) IPO, Inverted Polarity. Foreground in background colour, background in foreground colour. NBD, New Background. Set background colour to previous foreground colour. The current foreground colour is not affected. COF, Cursor Off. User input cursor invisible, but still functional.
0x9E ESC 0x5E (^) UNP, Unprotected. Makes following characters unprotected from user input. TRB, Transparent Background. HMS, Hold Mosaic. Image subsequently stored control functions as the last received mosaic character. BSTO, Blink Stop.
0x9F ESC 0x5F (_) PRT, Protected. Makes following characters protected from user input. SCD, Stop Conceal. Terminate CDY. RMS, Release Mosaic. Terminate HMS. UNP, Unprotect. Makes a field unprotected (open to user input).
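
The first two columns of the table above pair each 8-bit C1 code with a two-byte escape sequence whose final byte is the C1 code minus 0x40. A minimal Python sketch of that conversion follows; the function name is hypothetical, and the sketch assumes that every byte from 0x80 to 0x9F in the input really is a C1 control rather than part of a multi-byte sequence or parameter.

```python
ESC = 0x1B


def c1_to_escape(data):
    """Rewrite 8-bit C1 controls as their 7-bit ESC-prefixed equivalents."""
    out = bytearray()
    for byte in data:
        if 0x80 <= byte <= 0x9F:
            # 0x80..0x9F becomes ESC followed by 0x40..0x5F.
            out.extend((ESC, byte - 0x40))
        else:
            out.append(byte)
    return bytes(out)
```

For example, c1_to_escape(b"\x81A") returns b"\x1bAA": the 0x81 control becomes ESC 0x41, followed by the unchanged letter A.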

References

  1. ITU-T (1994-11-11), International interworking for Videotex services, T.101:1994, https://www.itu.int/rec/T-REC-T.101-199411-I/en
  2. CCITT (1987-07-31), Mosaic-1 Set of Data Syntax 1 of CCITT Rec. T.101, ITSCJ/IPSJ, ISO-IR-137, https://www.itscj.ipsj.or.jp/iso-ir/137.pdf
  3. CCITT (1983-10-01), Second Supplementary Set of Mosaic Characters, ITSCJ/IPSJ, ISO-IR-71, https://www.itscj.ipsj.or.jp/iso-ir/071.pdf
  4. International Register of Coded Character Sets To Be Used With Escape Sequences, ITSCJ/IPSJ, p. 22, ISO-IR, https://itscj.ipsj.or.jp/english/vbcqpr00000004qn-att/ISO-IR.pdf
  5. CCITT (1988-11-01), Supplementary Set of Graphic Characters for Videotex, ITSCJ/IPSJ, ISO-IR-70, https://www.itscj.ipsj.or.jp/iso-ir/070.pdf
  6. See Table C.9 in Annex C part 1 of T.101.[1] Caveat: the table itself is displayed in the PDF with severe mojibake (hence why the displayed table does not appear to correspond to the associated notes), and is supposed to look like ISO-IR-70[5] (besides the additional highlighted umlaut code).
  7. CCITT (1986-11-30), Supplementary Set of Graphic Characters for CCITT Recommendation T.101, Data Syntax III, ITSCJ/IPSJ, ISO-IR-128, https://www.itscj.ipsj.or.jp/iso-ir/128.pdf
  8. CCITT (1987-07-31), Primary Control Set of Data Syntax I of CCITT Rec. T.101, ITSCJ/IPSJ, ISO-IR-132, https://www.itscj.ipsj.or.jp/iso-ir/132.pdf
  9. CCITT (1987-07-31), Primary Control Set of Data Syntax II of CCITT Rec. T.101, ITSCJ/IPSJ, ISO-IR-134, https://www.itscj.ipsj.or.jp/iso-ir/134.pdf
  10. CCITT (1987-07-31), Primary Control Set of Data Syntax III of CCITT Rec. T.101, ITSCJ/IPSJ, ISO-IR-135, https://www.itscj.ipsj.or.jp/iso-ir/135.pdf
  11. CCITT (1987-07-31), Supplementary Control Set of Data Syntax I of CCITT Rec. T.101, ITSCJ/IPSJ, ISO-IR-133, https://www.itscj.ipsj.or.jp/iso-ir/133.pdf
  12. CCITT (1983-10-01), Attribute Control Set for Videotex, ITSCJ/IPSJ, ISO-IR-73, https://www.itscj.ipsj.or.jp/iso-ir/073.pdf
  13. BSI (1982-06-01), Attribute Control Set for UK Videotex, ITSCJ/IPSJ, ISO-IR-56, https://www.itscj.ipsj.or.jp/iso-ir/056.pdf
  14. CCITT (1987-07-31), The Supplementary Control Set of Data Syntax III of CCITT Rec. T.101, ITSCJ/IPSJ, ISO-IR-136, https://www.itscj.ipsj.or.jp/iso-ir/136.pdf