DEC RADIX 50
RADIX 50[1][2][3] or RAD50[3] (also referred to as RADIX50,[4] RADIX-50[5] or RAD-50) is an uppercase-only character encoding created by Digital Equipment Corporation (DEC) for use on their DECsystem, PDP, and VAX computers.
RADIX 50's 40-character repertoire (40 decimal is 50 in octal, hence the name) can encode six characters plus four additional bits into one 36-bit machine word (PDP-6, PDP-10/DECsystem-10, DECSYSTEM-20), three characters plus two additional bits into one 18-bit word (PDP-9,[2] PDP-15),[6] or three characters into one 16-bit word (PDP-11, VAX).[3]
The actual encoding differs between the 36-bit and 16-bit systems.
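The word sizes above follow from simple arithmetic: a group of n characters from a 40-symbol alphabet has 40^n possible values, and the number of bits needed to hold the largest one determines how many bits are left over. A minimal Python check (the function name `bits_needed` is chosen here purely for illustration):

```python
def bits_needed(chars: int, alphabet: int = 40) -> int:
    """Bits required to hold `chars` symbols drawn from `alphabet` choices."""
    # The largest packed value is alphabet**chars - 1.
    return (alphabet ** chars - 1).bit_length()

# (word size in bits, characters packed per word), as listed in the text
for word_bits, chars in [(36, 6), (18, 3), (16, 3)]:
    used = bits_needed(chars)
    print(f"{chars} chars need {used} bits; "
          f"{word_bits - used} spare in a {word_bits}-bit word")
```

Six characters need 32 bits, leaving four spare in a 36-bit word; three characters need 16 bits, leaving two spare in an 18-bit word and none in a 16-bit word.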
36-bit systems
In 36-bit DEC systems RADIX 50 was commonly used in symbol tables for assemblers or compilers which supported six-character symbol names from a 40-character alphabet. This left four bits to encode properties of the symbol.
Because of its similarity to the SQUOZE encoding scheme used in IBM's SHARE Operating System to represent object code symbols, DEC's variant was also sometimes called DEC Squoze.[7] IBM SQUOZE, however, packed six characters of a 50-character alphabet plus two additional flag bits into one 36-bit word.[6]
RADIX 50 was not normally used in 36-bit systems for encoding ordinary character strings; file names were normally encoded as six six-bit characters, and full ASCII strings as five seven-bit characters and one unused bit per 36-bit word.
RADIX 50 character codes on 36-bit systems (rows: most significant three bits; columns: least significant three bits)

Most significant bits | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
---|---|---|---|---|---|---|---|---|
000 | space | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
001 | 7 | 8 | 9 | A | B | C | D | E |
010 | F | G | H | I | J | K | L | M |
011 | N | O | P | Q | R | S | T | U |
100 | V | W | X | Y | Z | . | $ | % |
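As a rough illustration of how six characters and four property bits can share one 36-bit word, the following Python sketch uses the code points from the 36-bit table above. The function names, the placement of the four flag bits in the top of the word, the first-character-most-significant order, and the trailing-space padding are all assumptions made for this sketch; the text above does not specify them.

```python
# Code points 0-39 in the order given by the 36-bit table.
CHARSET_36 = " 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%"

def pack_symbol_36(name: str, flags: int = 0) -> int:
    """Pack up to six characters and a 4-bit flag field into one 36-bit value.

    Assumptions of this sketch: the first character is most significant,
    short names are padded with trailing spaces, and the flags occupy the
    top four bits of the word.
    """
    if len(name) > 6:
        raise ValueError("at most six characters fit in one word")
    if not 0 <= flags < 16:
        raise ValueError("flags must fit in four bits")
    value = 0
    for ch in name.upper().ljust(6):
        # str.index raises ValueError for characters outside the repertoire.
        value = value * 40 + CHARSET_36.index(ch)
    return (flags << 32) | value

def unpack_symbol_36(word: int) -> tuple[str, int]:
    """Inverse of pack_symbol_36: return (name, flags)."""
    flags, value = word >> 32, word & 0xFFFFFFFF
    chars = []
    for _ in range(6):
        value, code = divmod(value, 40)
        chars.append(CHARSET_36[code])
    return "".join(reversed(chars)).rstrip(), flags

word = pack_symbol_36("START", flags=0b0101)
print(unpack_symbol_36(word))
```

The sketch works because 40^6 is smaller than 2^32, so the packed name never spills into the four bits reserved for flags.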
18-bit systems
RADIX 50 (also called Radix 50₈ format[2]) was used in Digital's 18-bit PDP-9 and PDP-15 computers to store symbols in symbol tables, leaving two extra bits per 18-bit word ("symbol classification bits").[2]
16-bit systems
Some strings in DEC's 16-bit systems were encoded as 8-bit bytes, while others used RADIX 50 (then also called MOD40).[3][8]
In RADIX 50, strings were encoded in successive words as needed, with the first character within each word located in the most significant position.
For example, using the PDP-11 encoding, the string "ABCDEF", with character values 1, 2, 3, 4, 5, and 6, would be encoded as a word containing the value 1×40² + 2×40¹ + 3×40⁰ = 1683, followed by a second word containing the value 4×40² + 5×40¹ + 6×40⁰ = 6606. Thus, 16-bit words encoded values ranging from 0 (three spaces) to 63999 ("999"). When the length of a string was not a multiple of three, its last word was padded with trailing spaces.[3]
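The worked example can be reproduced with a short Python sketch. The character order follows the 16-bit table and the assembler interpretation of code points 27 to 29 ($, ., %); the names `CHARSET_11` and `rad50_encode` are illustrative and do not come from any DEC software.

```python
# Code points 0-39 for the PDP-11 assembler variant ($ . % at 27-29).
CHARSET_11 = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"

def rad50_encode(text: str) -> list[int]:
    """Encode a string as 16-bit RADIX 50 words, three characters per word."""
    text = text.upper()
    # Round the length up to a multiple of three, padding with trailing spaces.
    text = text.ljust(-(-len(text) // 3) * 3)
    words = []
    for i in range(0, len(text), 3):
        # str.index raises ValueError for characters outside the repertoire.
        c1, c2, c3 = (CHARSET_11.index(ch) for ch in text[i:i + 3])
        words.append((c1 * 40 + c2) * 40 + c3)  # first character most significant
    return words

print(rad50_encode("ABCDEF"))  # [1683, 6606]
print(rad50_encode("999"))     # [63999]
```

The expression `(c1 * 40 + c2) * 40 + c3` is the same combination rule quoted from the PAL-11R manual in reference [3], written in decimal rather than octal.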
There were several minor variations of this encoding with differing interpretations of code points 27, 28, and 29. Where RADIX 50 was used for filenames stored on media, these code points represent the characters $, %, and *, and are shown as such when a directory is listed with utilities such as DIR.[9] When strings are encoded in the PDP-11 assembler and other PDP-11 programming languages, the code points represent the characters $, ., and %; they are encoded as such by the default RAD50 macro in the global macros file, and this encoding was used in symbol tables. Some early documentation for the RT-11 operating system considered code point 29 to be undefined.[3]
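The effect of the two interpretations can be seen by decoding the same word under each. This is a sketch only; the variant labels "filename" and "assembler" and the function name are chosen here for illustration.

```python
# Only code points 27-29 differ between the two variants described above.
VARIANTS = {
    "filename":  " ABCDEFGHIJKLMNOPQRSTUVWXYZ$%*0123456789",
    "assembler": " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789",
}

def rad50_decode_word(word: int, variant: str) -> str:
    """Decode one 16-bit RADIX 50 word into three characters."""
    if not 0 <= word < 40 ** 3:
        raise ValueError("not a valid RADIX 50 word")
    charset = VARIANTS[variant]
    rest, c3 = divmod(word, 40)   # least significant character
    c1, c2 = divmod(rest, 40)     # most significant and middle characters
    return charset[c1] + charset[c2] + charset[c3]

word = (27 * 40 + 28) * 40 + 29   # code points 27, 28, 29 packed into one word
print(rad50_decode_word(word, "filename"))   # $%*
print(rad50_decode_word(word, "assembler"))  # $.%
```

All other code points decode identically, so the variants only disagree on words that contain one of these three values.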
The use of RADIX 50 was the source of the filename size conventions used by Digital Equipment Corporation PDP-11 operating systems. Using RADIX 50 encoding, six characters of a filename could be stored in two 16-bit words, while three more extension (file type) characters could be stored in a third 16-bit word. The period that separated the filename and its extension was implied (i.e., was not stored and always assumed to be present).
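A minimal Python sketch of this three-word layout follows. The filename "HELLO.MAC" is a made-up example, the function names are illustrative, and the character set assumes the filename interpretation of code points 27 to 29 ($, %, *).

```python
# Filename variant of the 16-bit table: code points 27-29 are $ % *.
CHARSET_FILE = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$%*0123456789"

def pack3(chars: str) -> int:
    """Pack exactly three characters into one 16-bit RADIX 50 word."""
    c1, c2, c3 = (CHARSET_FILE.index(ch) for ch in chars)
    return (c1 * 40 + c2) * 40 + c3

def encode_filename(filename: str) -> tuple[int, int, int]:
    """Return (name word 1, name word 2, extension word).

    The period between name and extension is implied and not stored.
    """
    name, _, ext = filename.upper().partition(".")
    if len(name) > 6 or len(ext) > 3:
        raise ValueError("name is limited to 6 characters, extension to 3")
    name, ext = name.ljust(6), ext.ljust(3)   # pad with trailing spaces
    return pack3(name[:3]), pack3(name[3:]), pack3(ext)

print(encode_filename("HELLO.MAC"))  # (13012, 19800, 20843)
```

A complete name and extension therefore occupy six bytes, compared with nine bytes if each character were stored as an 8-bit byte.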
RADIX 50 character codes on 16-bit systems (rows: most significant three bits; columns: least significant three bits; code points 28 and 29 depend on the variant in use)

Most significant bits | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
---|---|---|---|---|---|---|---|---|
000 | space | A | B | C | D | E | F | G |
001 | H | I | J | K | L | M | N | O |
010 | P | Q | R | S | T | U | V | W |
011 | X | Y | Z | $ | % or . | * or % | 0 | 1 |
100 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
See also
- Base 40
- Base conversion
- Chen–Ho encoding
- Densely packed decimal (DPD)
- Hertz encoding
- Packed BCD
- Six-bit character code
- Split octal
References
- [1] "Chapter VI: The Loader - The Radix 50 Representation of Symbols". PDP-6 Multiprogramming System Manual. Maynard, Massachusetts, USA: Digital Equipment Corporation (DEC). 1965. p. 57. DEC-6-0-EX-SYS-UM-IP-PRE00. http://bitsavers.trailing-edge.com/pdf/dec/pdp6/DEC-6-0-EX-SYS-UM-IP-PRE00_Multiprogramming_System_Manual_1965.pdf. Retrieved 2014-07-10. (1+84+10 pages)
- [2] "Appendix 1". PDP-9 Utility Programs--Advanced Software System--Programmer's Reference Manual. Maynard, Massachusetts, USA: Digital Equipment Corporation. 1968. Order No. DEC-9A-GUAB-D. http://www.bitsavers.org/pdf/dec/pdp9/DEC-9A-GUAB-D_UTILITIES.pdf. Retrieved 2020-06-04.
- [3] "8.10 .RAD50". PAL-11R Assembler - Programmer's Manual - Program Assembly Language and Relocatable Assembler for the Disk Operating System (2nd revised printing ed.). Maynard, Massachusetts, USA: Digital Equipment Corporation. May 1971. p. 8-8. DEC-11-ASDB-D. https://archive.org/details/bitsavers_decpdp11do11RAssemblerProgrammersManualMay71_2572677. Retrieved 2020-06-18. "[…] PDP-11 systems programs often handle symbols in a specially coded form called RADIX 50 (this form is sometimes referred to as MOD40). This form allows 3 characters to be packed into 16 bits; therefore, any 6-character symbol can be held in two words. The single operand is of the form /CCC/ where the slash (the delimiter) can be any printable character except for = and : . The delimiters enclose the characters to be converted which may be A through Z, 0 through 9, dollar ($), dot (.) and space ( ). If there are fewer than 3 characters they are considered to be left justified and trailing spaces are assumed. […] The packing algorithm is as follows: […] A. Each character is translated into its RADIX 50 equivalent as indicated in the following table: Character - RADIX 50 Equivalent (octal): (space) - 0, A–Z - 1–32, $ - 33, . - 34, 0–9 - 36–47. Note that another character could be defined for code 35. […] B. The RADIX 50 equivalents for characters 1 through 3 (C1,C2,C3) are combined as follows: RESULT=((C1*50)+C2)*50+C3 […]"
- [4] "RADIX50 Character Code Reference". 2004. http://nemesis.lonestar.org/reference/telecom/codes/radix50.html.
- [5] "Appendix B.3: Radix-50 Constants and Character Set". Compaq Fortran 77 Language Reference Manual. Compaq Computer Corporation. 1999. http://www.helsinki.fi/atk/unix/dec_manuals/cf77au/olrm0398.htm. Retrieved 2012-10-14.
- [6] "Lecture 7, Object Codes, Loaders and Linkers - Final steps on the road to machine code". Operating Systems, Spring 2018. Department of Computer Science, The University of Iowa. 2018. http://homepage.divms.uiowa.edu/~jones/opsys/notes/07.shtml.
- [7] "DEC/PDP Character Codes". University of Miami. 2005. DEC Squoze Character Table. http://rabbit.eng.miami.edu/info/decchars.html.
- [8] PDP-11 Getting DOS on the Air (1 ed.). Maynard, Massachusetts, USA: Digital Equipment Corporation. August 1971. DEC-11-SYDC-D. https://archive.org/details/bitsavers_decpdp11dotingDOSontheAirAug71_3085688. Retrieved 2020-06-18.
- [9] "RT11 Radix50 Demo". https://commons.wikimedia.org/wiki/File:RT11_Radix50_Demo.gif.
Further reading
- {{cite web |title=Squoze your data |author-first=Al |author-last=Williams