Burroughs B6x00-7x00 instruction set

Short description: Syllable repertoire of B5900, B6500, B7500 and successors


The Burroughs B6x00-7x00 instruction set includes the set of valid operations for the Burroughs B6500,[1] B7500 and later Burroughs large systems, including the current (as of 2006) Unisys ClearPath/MCP systems; it does not include the instructions for other Burroughs large systems, including the B5000, B5500, B5700 and the B8500. These machines have a distinctive design and instruction set. Each word of data is associated with a type, and the effect of an operation on that word can depend on the type. Further, the machines are stack[lower-alpha 1] based to the point that they have no user-addressable registers.

Overview

As would be expected from the distinctive architecture of these systems, the instruction set is also unusual. Programs are made up of 8-bit syllables. Each syllable may belong to a Name Call, belong to a Value Call, or form part of an operator, which may be from one to twelve syllables in length. There are fewer than 200 operators, all of which fit into 8-bit syllables. If we ignore the powerful string scanning, transfer, and edit operators, the basic set is only about 120 operators. If we also remove the operators reserved for the operating system, such as MVST and HALT, the set of operators commonly used by user-level programs is fewer than 100. The Name Call and Value Call syllables contain address couples; the Operator syllables either use no addresses or use control words and descriptors on the stack.

Since there are no programmer-addressable registers, most of the register manipulating operations required in other architectures are not needed, nor are variants for performing operations between pairs of registers, since all operations are applied to the top of the stack. This also makes code files very compact, since operators are zero-address and do not need to include the address of registers or memory locations in the code stream. Some of the code density was due to moving vital operand information elsewhere, to 'tags' on every data word or into tables of pointers. Many of the operators are generic or polymorphic depending on the kind of data being acted on as given by the tag. The generic opcodes required fewer opcode bits but made the hardware more like an interpreter, with less opportunity to pipeline the common cases.

For example, the instruction set has only one ADD operator, which has to fetch its operands to discover whether to perform an integer add or a floating point add. Typical architectures require a separate operator for each data type, for example add.i, add.f, add.d, add.l for integer, float, double, and long data types. This architecture distinguishes only single and double precision numbers; integers are just reals with a zero exponent. When one or both of the operands has a tag of 2, a double precision add is performed; otherwise tag 0 indicates single precision. Thus the tag itself is the equivalent of the .i, .f, .d, and .l operator extensions. This also means that the code and data can never be mismatched.
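The tag-driven choice of precision can be sketched in Python. This is only an illustrative model: the `Word` class and `add` function are hypothetical names, Python numbers stand in for the machine's number formats, and rejecting non-numeric tags is an assumption of the sketch rather than a documented behaviour.

```python
from dataclasses import dataclass

TAG_SINGLE = 0  # single precision data word (tag 0)
TAG_DOUBLE = 2  # double precision data word (tag 2)


@dataclass(frozen=True)
class Word:
    """Hypothetical model of a tagged word: a tag plus a numeric value."""
    tag: int
    value: float


def add(b: Word, a: Word) -> Word:
    """Model of the single ADD operator: B + A, precision chosen by tag."""
    for operand in (b, a):
        if operand.tag not in (TAG_SINGLE, TAG_DOUBLE):
            # Assumption of this sketch: non-numeric operands are rejected.
            raise TypeError(f"ADD applied to a word with tag {operand.tag}")
    # A tag of 2 on either operand forces a double precision add.
    tag = TAG_DOUBLE if TAG_DOUBLE in (b.tag, a.tag) else TAG_SINGLE
    return Word(tag, b.value + a.value)
```

The same `add` call handles both cases; there is no separate opcode per data type, only a different tag on the operands.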

Two operators are important in the handling of on-stack data – Value Call (VALC) and Name Call (NAMC). These are two-bit operators, 00 being VALC and 01 being NAMC. The following six bits of the syllable, concatenated with the following syllable, provide the address couple. Thus VALC covers syllable values 0000 to 3FFF and NAMC 4000 to 7FFF.
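The encoding just described can be illustrated with a small decoder. The function name `decode_call` is hypothetical; the sketch only splits the two-bit operator from the 14-bit address-couple field and makes no attempt to interpret the field itself.

```python
def decode_call(syllable: int, next_syllable: int):
    """Return (mnemonic, 14-bit address-couple field) for a VALC/NAMC pair."""
    if not (0 <= syllable <= 0xFF and 0 <= next_syllable <= 0xFF):
        raise ValueError("syllables are 8-bit values")
    opcode = syllable >> 6  # the top two bits select the operator
    if opcode == 0b00:
        mnemonic = "VALC"
    elif opcode == 0b01:
        mnemonic = "NAMC"
    else:
        raise ValueError("not a VALC or NAMC syllable")
    # Remaining six bits, concatenated with the next syllable: 6 + 8 = 14 bits.
    field = ((syllable & 0x3F) << 8) | next_syllable
    return mnemonic, field
```

For instance, the syllable pair 0x40, 0x06 decodes to NAMC with a field value of 6, which lies in the 4000 to 7FFF range given above.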

VALC is another polymorphic operator. If it hits a data word, that word is loaded to the top of stack. If it hits an IRW, the reference is followed, possibly through a chain of IRWs, until a data word is found. If a PCW is found, then a function is entered to compute the value and the VALC does not complete until the function returns.

NAMC simply loads the address couple onto the top of the stack as an IRW (with the tag automatically set to 1).
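A minimal sketch of this behaviour follows. It is a simplification under stated assumptions: memory is modelled as a Python dictionary keyed by address couple, each word is a `(tag, payload)` tuple, an IRW's payload is another address couple, and a PCW's payload is a Python callable standing in for the function it enters. The names `valc` and `namc` mirror the operators, but the data layout is hypothetical.

```python
DATA_TAGS = (0, 2)      # single and double precision data words
TAG_DATA, TAG_IRW, TAG_PCW = 0, 1, 7


def namc(couple):
    """NAMC: build an IRW (tag 1) holding the address couple."""
    return (TAG_IRW, couple)


def valc(memory, couple):
    """VALC: follow IRWs, or call through a PCW, until a data word is found."""
    tag, payload = memory[couple]
    while tag not in DATA_TAGS:
        if tag == TAG_IRW:
            # Follow the reference; this may repeat along a chain of IRWs.
            tag, payload = memory[payload]
        elif tag == TAG_PCW:
            # Enter the function; VALC completes only when it has returned.
            tag, payload = TAG_DATA, payload()
        else:
            raise TypeError(f"VALC cannot load a word with tag {tag}")
    return payload


# Hypothetical memory image: a data word, two chained IRWs, and a PCW.
memory = {
    (2, 2): (0, 42),
    (3, 3): (1, (2, 2)),
    (3, 4): (1, (3, 3)),
    (3, 5): (7, lambda: 7),
}
```

With this image, `valc(memory, (3, 4))` follows two IRWs to reach the data word at (2, 2), while `valc(memory, (3, 5))` obtains its value by calling the function.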

Static branches (BRUN, BRFL, and BRTR) used two additional syllables of offset. Thus arithmetic operations occupied one syllable, addressing operations (NAMC and VALC) occupied two, branches three, and long literals (LT48) five. As a result, code was much denser than in a conventional RISC architecture in which each operation occupies four bytes. Better code density meant fewer instruction cache misses and hence better performance running large-scale code.
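The syllable counts above make it easy to estimate the size of a code sequence. The following sketch uses a hypothetical `SYLLABLES` table covering only the one-, two- and three-syllable operators named in the text, and an illustrative operator sequence for an expression of the form (G + H) * J.

```python
# Syllables per operator, taken from the counts given in the text.
SYLLABLES = {
    "ADD": 1, "SUBT": 1, "MULT": 1, "DIVD": 1,  # arithmetic: one syllable
    "VALC": 2, "NAMC": 2,                        # addressing: two syllables
    "BRUN": 3, "BRFL": 3, "BRTR": 3,             # static branches: three
}


def code_size(operators):
    """Total size in 8-bit syllables of a sequence of operator mnemonics."""
    return sum(SYLLABLES[op] for op in operators)


# Illustrative sequence: push G, push H, add, push J, multiply.
expression = ["VALC", "VALC", "ADD", "VALC", "MULT"]
```

Here `code_size(expression)` gives 8 syllables. As a rough comparison only, five instructions of four bytes each would occupy 20 bytes, assuming a fixed-width machine needed the same number of operations.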

In the following operator explanations, remember that A and B are the top two stack registers. Double precision extensions are provided by the X and Y registers; thus the top two double precision operands are given by AX and BY. (In most cases AX and BY are implied by just A and B.)

B6x00/7x00 Address Couple

Current LL   Lexical Level bits   Index bits
0-1          13                   12-0
2-3          13-12                11-0
4-7          13-11                10-0
8-15         13-10                9-0
16-31        13-9                 8-0
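The table can be read as a rule: the number of lexical-level bits is the number of bits needed to express the current lexical level (at least one), and the index takes the rest of the 14-bit field. The sketch below applies that rule. The function name is hypothetical, and it assumes both subfields are plain unsigned binary numbers with the most significant bit first; the table gives only bit positions, so the bit ordering within the lexical-level subfield is an assumption.

```python
def split_address_couple(field: int, current_ll: int):
    """Split a 14-bit address-couple field into (lexical level, index)."""
    if not 0 <= current_ll <= 31:
        raise ValueError("current lexical level must be in 0..31")
    if not 0 <= field < (1 << 14):
        raise ValueError("address-couple field must fit in 14 bits")
    # 0-1 -> 1 bit, 2-3 -> 2, 4-7 -> 3, 8-15 -> 4, 16-31 -> 5 bits.
    ll_bits = max(1, current_ll.bit_length())
    index_bits = 14 - ll_bits
    lexical_level = field >> index_bits
    index = field & ((1 << index_bits) - 1)
    return lexical_level, index
```

For code running at lexical level 5, three bits select the level and eleven bits give the index, so the field `(2 << 11) | 6` splits into the address couple (2, 6).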

Tags and control words

In the B6500, a word has 48 bits of data and three tag bits held outside the 48-bit word. The data bits are bits 0–47 and the tag is in bits 48–50. Bit 48 is the read-only bit, thus odd tags indicate control words that cannot be written by a user-level program. Code words are given tag 3. Here is a list of the tags and their function:

Tag   Word kind    Description
0     Data         All kinds of user and system data (text data and single precision numbers)
2     Double       Double Precision data
4     SIW          Step Index word (used in loops)
6                  Uninitialized data
6     SCW          Software Control Word (used to cut back the stack)
1     IRW          Indirect Reference Word
1     SIRW         Stuffed Indirect Reference Word
3     Code         Program code word
3     MSCW         Mark Stack Control Word
3     RCW          Return Control Word
3     TOSCW        Top of Stack Control Word
3     SD           Segment Descriptor
5     Descriptor   Data block descriptors
7     PCW          Program Control Word

The current incarnation of these machines, the Unisys ClearPath, has extended tags further into a four-bit tag. The microcode level that specified four-bit tags was referred to as level Gamma.

Even-tagged words are user data which can be modified by a user program as user state. Odd-tagged words are created and used directly by the hardware and represent a program's execution state. Since these words are created and consumed by specific instructions or the hardware, their exact format can change between hardware implementations without user programs needing to be recompiled: the same code stream will produce the same results even though the system word format may have changed.

Tag 1 words represent on-stack data addresses. The normal IRW simply stores an address couple to data on the current stack. The SIRW references data on any stack by including a stack number in the address.

Tag 5 words are descriptors, which are more fully described in the next section. Tag 5 words represent off-stack data addresses.

Tag 7 is the program control word which describes a procedure entry point. When operators hit a PCW, the procedure is entered. The ENTR operator explicitly enters a procedure (non-value-returning routine). Functions (value-returning routines) are implicitly entered by operators such as value call (VALC). Global routines are stored in the D[2] environment as SIRWs that point to a PCW stored in the code segment dictionary in the D[1] environment. The D[1] environment is not stored on the current stack because it can be referenced by all processes sharing this code. Thus code is reentrant and shared.

Tag 3 represents code words themselves, which won't occur on the stack. Tag 3 is also used for the stack control words MSCW, RCW, TOSCW.

Figure 9.2 from the ACM Monograph in the References (Elliott Organick, 1973).

Display registers

A stack hardware optimization is the provision of D (or "display") registers. These registers point to the start of each called stack frame. They are updated automatically as procedures are entered and exited and are not accessible by any software. There are 32 D registers, which limits lexical nesting to 32 levels.

Consider how we would access a lexical level 2 (D[2]) global variable from lexical level 5 (D[5]). Suppose the variable is 6 words away from the base of lexical level 2. It is thus represented by the address couple (2, 6). If we don't have D registers, we have to look at the control word at the base of the D[5] frame, which points to the frame containing the D[4] environment. We then look at the control word at the base of this environment to find the D[3] environment, and continue in this fashion until we have followed all the links back to the required lexical level. This is not the same path as the return path back through the procedures which have been called in order to get to this point. (The architecture keeps both the data stack and the call stack in the same structure, but uses control words to tell them apart.)

As you can see, this is quite inefficient just to access a variable. With D registers, the D[2] register points at the base of the lexical level 2 environment, and all we need to do to generate the address of the variable is to add its offset from the stack frame base to the frame base address in the D register. (There is an efficient linked list search operator LLLU, which could search the stack in the above fashion, but the D register approach is still going to be faster.) With D registers, access to entities in outer and global environments is just as efficient as local variable access.
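The two ways of resolving the address couple (2, 6) from lexical level 5 can be compared in a short sketch. Everything here is a simplified model: the frame base addresses (100, 120, 140, 160) are made up, the control word at each frame base is reduced to just the address of the enclosing level's frame base, and the function names are hypothetical.

```python
def address_via_display(d_registers, ll, index):
    """With D registers: one addition, whatever the nesting depth."""
    return d_registers[ll] + index


def address_via_links(memory, frame_base, current_ll, ll, index):
    """Without D registers: follow one link per lexical level crossed."""
    base = frame_base
    for _ in range(current_ll - ll):
        base = memory[base]  # control word at the frame base holds the link
    return base + index


# Hypothetical layout: frames for levels 2..5 and the links between them.
memory = {160: 140, 140: 120, 120: 100}
d_registers = {2: 100, 3: 120, 4: 140, 5: 160}
```

Both `address_via_display(d_registers, 2, 6)` and `address_via_links(memory, 160, 5, 2, 6)` yield 106, but the second needs three memory reads to get there.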

| D => Tag | Data       | Address couple, comments
| 0        | n          | (4, 1) The integer n (declared on entry to a block, not a procedure)
|-----------------------|
| D[4]==>3 | MSCW       | (4, 0) The Mark Stack Control Word containing the link to D[3].
|=======================|
| 0        | r2         | (3, 5) The real r2
|-----------------------|
| 0        | r1         | (3, 4) The real r1
|-----------------------|
| 1        | p2         | (3, 3) A SIRW reference to g at (2,6)
|-----------------------|
| 0        | p1         | (3, 2) The parameter p1 from value of f 
|-----------------------|
| 3        | RCW        | (3, 1) A return control word
|-----------------------|
| D[3]==>3 | MSCW       | (3, 0) The Mark Stack Control Word containing the link to D[2].
|=======================|
| 1        | a          | (2, 7) The array a  ======>[ten word memory block]
|-----------------------|
| 0        | g          | (2, 6) The real g 
|-----------------------|
| 0        | f          | (2, 5) The real f 
|-----------------------|
| 0        | k          | (2, 4) The integer k 
|-----------------------|
| 0        | j          | (2, 3) The integer j 
|-----------------------|
| 0        | i          | (2, 2) The integer i
|-----------------------|
| 3        | RCW        | (2, 1) A return control word
|-----------------------|
| D[2]==>3 | MSCW       | (2, 0) The Mark Stack Control Word containing the link to the previous stack frame.
|=======================| — Stack bottom

If we had invoked the procedure p as a coroutine, or with a process instruction, the D[3] environment would have become a separate D[3]-based stack. This means that asynchronous processes still have access to the D[2] environment as implied in ALGOL program code. Taking this one step further, a totally different program could call another program's code, creating a D[3] stack frame pointing to another process' D[2] environment on top of its own process stack. At that instant the whole address space of the code's execution environment changes: the D[2] environment on the caller's own process stack is no longer directly addressable, and the D[2] environment in the other process stack becomes directly addressable instead. This is how library calls are implemented. At such a cross-stack call, the calling code and called code could even originate from programs written in different source languages and be compiled by different compilers.

The D[1] and D[0] environments do not occur in the current process's stack. The D[1] environment is the code segment dictionary, which is shared by all processes running the same code. The D[0] environment represents entities exported by the operating system.

Stack frames do not even have to exist in a process stack. This feature was used early on for file I/O optimization: the FIB (file information block) was linked into the display registers at D[1] during I/O operations. In the early nineties, this ability was implemented as a language feature as STRUCTURE BLOCKs and, combined with library technology, as CONNECTION BLOCKs. The ability to link a data structure into the display register address scope implemented object orientation. Thus, the B6500 actually used a form of object orientation long before the term was ever used.

On other systems, the compiler might build its symbol table in a similar manner, but eventually the storage requirements would be collated and the machine code would be written to use flat memory addresses of 16, 32 or even 64 bits. These addresses might contain anything, so a write to the wrong address could damage anything. Instead, the two-part address scheme was implemented by the hardware. At each lexical level, variables were placed at displacements up from the base of the level's stack, typically occupying one word; double precision or complex variables would occupy two. Arrays were not stored in this area; only a one-word descriptor for the array was. Thus, at each lexical level the total storage requirement was not great: dozens, hundreds or a few thousand words in extreme cases, certainly not a count requiring 32 bits or more.

This was reflected in the form of the VALC instruction (value call) that loaded an operand onto the stack. This op-code was two bits long and the rest of the byte's bits were concatenated with the following byte to give a fourteen-bit addressing field. The code being executed would be at some lexical level, say six: this meant that only lexical levels zero to six were valid, and so just three bits were needed to specify the lexical level desired. The address part of the VALC operation thus reserved just three bits for that purpose, with the remainder being available for referring to entities at that and lower levels. A deeply nested procedure (thus at a high lexical level) would have fewer bits available to identify entities: for level sixteen upwards five bits would be needed to specify the choice of levels 0–31, thus leaving nine bits to identify no more than the first 512 entities of any lexical level. This is much more compact than addressing entities by their literal memory address in a 32-bit addressing space. Further, only the VALC opcode loaded data: opcodes for ADD, MULT and so forth did no addressing, working entirely on the top elements of the stack.

Much more important is that this method meant that many errors available to systems employing flat addressing could not occur because they were simply unspeakable even at the machine code level. A task had no way to corrupt memory in use by another task, because it had no way to develop its address. Offsets from a specified D-register would be checked by the hardware against the stack frame bound: rogue values would be trapped. Similarly, within a task, an array descriptor contained information on the array's bounds, and so any indexing operation was checked by the hardware: put another way, each array formed its own address space. In any case, the tagging of all memory words provided a second level of protection: a misdirected assignment of a value could only go to a data-holding location, not to one holding a pointer or an array descriptor, etc. and certainly not to a location holding machine code.
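The idea that each array forms its own address space can be illustrated with a bounds-checked descriptor. This is a minimal sketch, not the hardware's descriptor format: the `ArrayDescriptor` fields, the `BoundsError` exception and the `index_descriptor` function are hypothetical stand-ins for the descriptor, the hardware trap and the indexing operation.

```python
from dataclasses import dataclass


class BoundsError(Exception):
    """Stands in for the hardware trap on an out-of-bounds index."""


@dataclass(frozen=True)
class ArrayDescriptor:
    """Hypothetical descriptor: where the block starts and how long it is."""
    base: int
    length: int


def index_descriptor(desc: ArrayDescriptor, i: int) -> int:
    """Return the address of element i, trapping if i is out of bounds."""
    if not 0 <= i < desc.length:
        raise BoundsError(f"index {i} outside 0..{desc.length - 1}")
    return desc.base + i
```

A program holding only the descriptor can reach the ten words of a ten-word array and nothing else; any other index is trapped before an address is ever formed.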

Arithmetic operators

ADD
Add top two stack operands (B := B + A or BY := BY + AX if double precision)
SUBT
Subtract (B - A)
MULT
Multiply with single or double precision result
MULX
Extended multiply with forced double precision result
DIVD
Divide with real result
IDIV
Divide with integer result
RDIV
Return remainder after division
NTIA
Integerize truncated
NTGR
Integerize rounded
NTGD
Integerize rounded with double precision result
CHSN
Change sign
JOIN
Join two singles to form a double
SPLT
Split a double to form two singles
ICVD
Input convert destructive – convert BCD number to binary (for COBOL)
ICVU
Input convert update – convert BCD number to binary (for COBOL)
SNGL
Set to single precision rounded
SNGT
Set to single precision truncated
XTND
Set to double precision
PACD
Pack destructive
PACU
Pack update
USND
Unpack signed destructive
USNU
Unpack signed update
UABD
Unpack absolute destructive
UABU
Unpack absolute update
SXSN
Set external sign
ROFF
Read and clear overflow flip flop
RTFF
Read true/false flip flop

Comparison operators

LESS
Is B < A?
GREQ
Is B >= A?
GRTR
Is B > A?
LSEQ
Is B <= A?
EQUL
Is B = A?
NEQL
Is B <> A?
SAME
Does B have the same bit pattern as A, including the tag?

Logical operators

LAND
Logical bitwise and of all bits in operands
LOR
Logical bitwise or of all bits in operands
LNOT
Logical bitwise complement of all bits in operand
LEQV
Logical bitwise equivalence of all bits in operands

Branch and call operators

BRUN
Branch unconditional (offset given by following code syllables)
DBUN
Dynamic branch unconditional (offset given in top of stack)
BRFL
Branch if last result false (offset given by following code syllables)
DBFL
Dynamic branch if last result false (offset given in top of stack)
BRTR
Branch if last result true (offset given by following code syllables)
DBTR
Dynamic branch if last result true (offset given in top of stack)
EXIT
Exit current environment (terminate process)
STBR
Step and branch (used in loops; operand must be SIW)
ENTR
Execute a procedure call as given by a tag 7 PCW, resulting in an RCW at D[n] + 1
RETN
Return from current routine to place given by RCW at D[n] + 1 and remove the stack frame

Bit and field operators

BSET
Bit set (bit number given by syllable following instruction)
DBST
Dynamic bit set (bit number given by contents of B)
BRST
Bit reset (bit number given by syllable following instruction)
DBRS
Dynamic bit reset (bit number given by contents of B)
ISOL
Field isolate (field given in syllables following instruction)
DISO
Dynamic field isolate (field given in top of stack words)
FLTR
Field transfer (field given in syllables following instruction)
DFTR
Dynamic field transfer (field given in top of stack words)
INSR
Field insert (field given in syllables following instruction)
DINS
Dynamic field insert (field given in top of stack words)
CBON
Count binary ones in the top of stack word (A or AX)
SCLF
Scale left
DSLF
Dynamic scale left
SCRT
Scale right
DSRT
Dynamic scale right
SCRS
Scale right save
DSRS
Dynamic scale right save
SCRF
Scale right final
DSRF
Dynamic scale right final
SCRR
Scale right round
DSRR
Dynamic scale right round
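A few of the bit and field operators can be modelled on a 48-bit integer. This sketch makes several assumptions that the list above does not state: bit 0 is taken to be the least significant bit, a field is identified by its highest bit number and its length, and an isolated field is returned right-justified. The lowercase function names simply echo the mnemonics.

```python
WORD_BITS = 48
MASK48 = (1 << WORD_BITS) - 1


def _check_bit(bit):
    if not 0 <= bit < WORD_BITS:
        raise ValueError("bit number must be in 0..47")


def bset(word, bit):
    """BSET/DBST: set one bit of the word."""
    _check_bit(bit)
    return (word | (1 << bit)) & MASK48


def brst(word, bit):
    """BRST/DBRS: reset (clear) one bit of the word."""
    _check_bit(bit)
    return word & ~(1 << bit) & MASK48


def isol(word, start, length):
    """ISOL/DISO: isolate `length` bits whose highest bit is `start`."""
    _check_bit(start)
    low = start - length + 1
    if length < 1 or low < 0:
        raise ValueError("field does not fit in the word")
    return (word >> low) & ((1 << length) - 1)


def cbon(word):
    """CBON: count the one bits in the 48-bit word."""
    return bin(word & MASK48).count("1")
```

The static and dynamic forms differ only in where the bit number or field description comes from (the code stream or the stack), so one function serves for both in this model.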

Literal operators

LT48
Load following code word onto top of stack
LT16
Set top of stack to following 16 bits in code stream
LT8
Set top of stack to following code syllable
ZERO
Shortcut for LT48 0
ONE
Shortcut for LT48 1

Descriptor operators

INDX
Index create a pointer (copy descriptor) from a base (MOM) descriptor
NXLN
Index and load name (resulting in an indexed descriptor)
NXLV
Index and load value (resulting in a data value)
EVAL
Evaluate descriptor (follow address chain until data word or another descriptor found)

Stack operators

PUSH
Push down stack register
DLET
Pop top of stack
EXCH
Exchange top two words of stack
RSUP
Rotate stack up (top three words)
RSDN
Rotate stack down (top three words)
DUPL
Duplicate top of stack
MKST
Mark stack (build a new stack frame resulting in an MSCW on the top, followed by NAMC to load the PCW, then parameter pushes as needed, then ENTR)

IMKS
Insert an MSCW in the B register.
VALC
Fetch a value onto the stack as described above
NAMC
Place an address couple (IRW stack address) onto the stack as described above
STFF
Convert an IRW as placed by NAMC into an SIRW which references data in another stack.
MVST
Move to stack (process switch only done in one place in the MCP)

Store operators

STOD
Store destructive: if the target word has an odd tag, throw a memory protect interrupt; otherwise store the value in the B register at the memory addressed by the A register, then delete the value off the stack.

STON
Store non-destructive (Same as STOD but value is not deleted – handy for F := G := H := J expressions).
OVRD
Overwrite destructive, STOD ignoring read-only bit (for use in MCP only)
OVRN
Overwrite non-destructive, STON ignoring read-only bit (for use in MCP only)
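The four store operators differ only in whether the value stays on the stack and whether the read-only check applies, which the following sketch makes explicit. It is a model under assumptions: the stack is a Python list with its top at the end, memory is a dictionary of `(tag, data)` words keyed by made-up symbolic addresses, and `MemoryProtect` stands in for the memory protect interrupt.

```python
class MemoryProtect(Exception):
    """Stands in for the memory protect interrupt."""


def _store(stack, memory, keep_value, ignore_protect):
    address, value = stack[-1], stack[-2]  # A holds the address, B the value
    old_tag, _ = memory[address]
    if old_tag % 2 == 1 and not ignore_protect:
        # Odd tag: the target is a control word, so refuse to overwrite it.
        raise MemoryProtect(f"target at {address!r} has odd tag {old_tag}")
    del stack[-2:]
    memory[address] = value
    if keep_value:
        stack.append(value)  # non-destructive: leave the value for reuse


def stod(stack, memory): _store(stack, memory, False, False)
def ston(stack, memory): _store(stack, memory, True, False)
def ovrd(stack, memory): _store(stack, memory, False, True)
def ovrn(stack, memory): _store(stack, memory, True, True)
```

A chained assignment such as F := G := J then becomes: push the value, push G's address, `ston`, push F's address, `stod`.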

Load operators

The Load instruction could find itself tripping on an indirect address, or worse, a disguised call to a call-by-name thunk routine.

LOAD
Load the value given by the address (tag 5 or tag 1 word) on the top of stack, following an address chain if necessary.

LODT
Load transparent – load the word referenced by the address on the top of stack

Transfer operators

These were used for string transfers, usually until a certain character was detected in the source string. All these operators are protected from buffer overflows by being limited by the bounds in the descriptors.

TWFD
Transfer while false, destructive (forget pointer)
TWFU
Transfer while false, update (leave pointer at end of transfer for further transfers)
TWTD
Transfer while true, destructive
TWTU
Transfer while true, update
TWSD
Transfer words, destructive
TWSU
Transfer words, update
TWOD
Transfer words, overwrite destructive
TWOU
Transfer words, overwrite update
TRNS
Translate – transfer a source buffer into a destination converting characters as given in a translate table.
TLSD
Transfer while less, destructive
TLSU
Transfer while less, update
TGED
Transfer while greater or equal, destructive
TGEU
Transfer while greater or equal, update
TGTD
Transfer while greater, destructive
TGTU
Transfer while greater, update
TLED
Transfer while less or equal, destructive
TLEU
Transfer while less or equal, update
TEQD
Transfer while equal, destructive
TEQU
Transfer while equal, update
TNED
Transfer while not equal, destructive
TNEU
Transfer while not equal, update
TUND
Transfer unconditional, destructive
TUNU
Transfer unconditional, update
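The difference between the destructive and update forms, and the bounds limit, can be sketched for one representative case, transfer while not equal. This is a loose model rather than the operator's actual definition: the function name and parameters are hypothetical, the buffers are Python byte sequences whose lengths play the role of the descriptor bounds, and quietly stopping at a bound (instead of raising an interrupt) is an assumption.

```python
def transfer_while_not_equal(src, src_pos, dst, dst_pos, delimiter, update):
    """Copy characters from src to dst until the delimiter or a bound is hit.

    With update=True the advanced positions are returned so that a further
    transfer can continue from them; with update=False (the destructive
    form) the pointers are forgotten and None is returned.
    """
    while (src_pos < len(src) and dst_pos < len(dst)
           and src[src_pos] != delimiter):
        dst[dst_pos] = src[src_pos]
        src_pos += 1
        dst_pos += 1
    return (src_pos, dst_pos) if update else None
```

For example, transferring from `b"abc,def"` into an eight-byte `bytearray` with a comma as the delimiter copies `abc` and, in the update form, leaves both positions at 3.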

Scan operators

These were used for scanning strings, which is useful in writing compilers. All these operators are protected from buffer overflows by being limited by the bounds in the descriptors.

SWFD
Scan while false, destructive
SISO
String isolate
SWTD
Scan while true, destructive
SWTU
Scan while true, update
SLSD
Scan while less, destructive
SLSU
Scan while less, update
SGED
Scan while greater or equal, destructive
SGEU
Scan while greater or equal, update
SGTD
Scan while greater, destructive
SGTU
Scan while greater, update
SLED
Scan while less or equal, destructive
SLEU
Scan while less or equal, update
SEQD
Scan while equal, destructive
SEQU
Scan while equal, update
SNED
Scan while not equal, destructive
SNEU
Scan while not equal, update
CLSD
Compare characters less, destructive
CLSU
Compare characters less, update
CGED
Compare characters greater or equal, destructive
CGEU
Compare characters greater or equal, update
CGTD
Compare characters greater, destructive
CGTU
Compare characters greater, update
CLED
Compare characters less or equal, destructive
CLEU
Compare characters less or equal, update
CEQD
Compare characters equal, destructive
CEQU
Compare characters equal, update
CNED
Compare characters not equal, destructive
CNEU
Compare characters not equal, update

System

SINT
Set interval timer
EEXI
Enable external interrupts
DEXI
Disable external interrupts
SCNI
Scan in – initiate IO read (this changed on different architectures)
SCNO
Scan out – initiate IO write (this changed on different architectures)
STAG
Set tag (not allowed in user-level processes)
RTAG
Read tag
IRWL
Hardware pseudo operator
SPRR
Set processor register (highly implementation dependent, only used in lower levels of MCP)
RPRR
Read processor register (highly implementation dependent, only used in lower levels of MCP)
MPCW
Make PCW
HALT
Halt the processor (operator requested or some unrecoverable condition has occurred)

Other

VARI
Escape to extended (variable instructions which were less frequent)
OCRX
Occurs index – builds an occurs index word used in loops
LLLU
Linked list lookup – Follow a chain of linked words until a certain condition is met
SRCH
Masked search for equal – Similar to LLLU, but testing a mask in the examined words for an equal value
TEED
Table enter edit destructive
TEEU
Table enter edit, update
EXSD
Execute single micro destructive
EXSU
Execute single micro update
EXPU
Execute single micro, single pointer update
NOOP
No operation
NVLD
Invalid operator (hex code FF)
User operators
Unassigned operators could cause interrupts into the operating system, so that algorithms could be written to provide the required functionality

Edit operators

These were special operators for sophisticated string manipulation, particularly for business applications.

MINS
Move with insert – insert characters in a string
MFLT
Move with float
SFSC
Skip forward source character
SRSC
Skip reverse source characters
RSTF
Reset float
ENDF
End float
MVNU
Move numeric unconditional
MCHR
Move characters
INOP
Insert overpunch
INSG
Insert sign
SFDC
Skip forward destination character
SRDC
Skip reverse destination characters
INSU
Insert unconditional
INSC
Insert conditional
ENDE
End edit

Notes

  1. The lexical level in a syllable may refer either to a marked point in the stack of the current task or to a marked point in the stack of a parent task. The term "the stack" may refer to multiple related stacks, collectively known as a saguaro stack.

References