Belt machine

From HandWiki

In computer engineering and in programming language implementations, a belt machine is a real or emulated computer that uses a first in, first out (FIFO) queue rather than individual machine processor registers to evaluate each sub-expression in the program. A belt computer is programmed with an instruction set that specifies arguments explicitly but results implicitly.[1]

The common alternative to belt machines are register machines, in which each instruction explicitly names the specific registers to use for locations of operand arguments and results. Belt machines are related to stack machines, which specify both arguments and results implicitly using a pushdown stack. Other alternatives are accumulator machines, which have only one visible general-purpose temp register, and memory-to-memory machines, which have no visible temp registers.

A belt machine implements temporary storage with a fixed-length FIFO queue, or belt by analogy to a conveyor belt. The operands of the arithmetic logic units (ALUs) and other functional units may be taken from any position on the belt, and the result from the computation is dropped (stored) in the front position of the belt, advancing the belt to make room. As the belt is fixed length, drops in the front are matched by older operands falling off the back; pushed-off operands become inaccessible and must be explicitly saved if still needed for later work. Most operations of the instruction set work only with data on the belt, not on data registers or main memory cells.

For a typical instruction like add, both argument operands come from explicitly named positions on the belt, and the result is dropped on the front, ready for the next instruction. Operations with multiple results simply drop more values at the belt front. Most belt instructions are encoded as just an operation code (opcode) and two belt positions, with no added fields to specify a result register, memory address, or literal constant. This encoding is easily extended to richer operations with more than two inputs or more than one result. Constant operands are dropped by separate load immediate instructions. All access of program variables in main random-access memory (RAM) is segregated into separate load or store instructions containing one memory address, or some way to calculate that address from belt operands.

All belt machines have variants of the load/store opcodes to access local variables and the heap. This can be by offsets, from a pointer on the belt, or from various special-purpose base registers. Similarly, there will be instructions to branch to an address taken from the belt, along with branches relative to the program counter.

Temporal addressing

Because each drop of a result moves the prior belt content along to later positions in the queue, a given operand continually changes its position (and hence address) as a result of later execution. In effect, an access to the operand at position zero is a request for the most recent value dropped to the belt, while (for example) a reference to position five is to the sixth most recent drop. Thus the addresses of belt operands reflect the belt history over time. This is temporal addressing. It is hard for human programmers to keep track of belt contents, and hence operand addresses, when writing assembly code for a belt machine. However, it is easy for a compiler to track the changing contents and emit the correct position addresses in generated code.

Spill and fill

The belt is fixed length and may be too short to hold all live transient operands before they are pushed off the end. If an operand is needed for longer than its belt lifetime, it must be saved while still on the belt (spill) and later restored to the belt when needed again (fill). This situation is equivalent to the need to spill registers to memory when a program runs out of registers in a general-register machine. Spilled operands may be written to memory using normal store instructions, and restored using normal load instructions, or spill and fill may use special-purpose storage and associated operations that are faster or offer other advantages over load and store.

Freedom from hazard

The operands on the belt are read-only. New results do not overwrite prior values. The belt is thus a single-assignment structure, and is immune to the data hazards that must be dealt with by modern out-of-order general-register machines.

Compact object code

Dense machine code was very valuable in the 1960s, when main memory was very costly and limited, even on mainframe computers. It became important again on the initially-tiny memories of minicomputers, and then microprocessors. Density remains important today, for applications for smartphone, or downloaded into browsers over slow Internet connections, and in read-only memory (ROM) for embedded applications. A more general advantage of increased density is improved effectiveness of caches and instruction prefetch.

Belt machines have smaller instructions than register-based machines, due to not needing a destination address for results. This saving can make a significant difference for fixed-length instruction formats, which normally use power-of-two instruction widths. If there are thirty-two addressable elements (registers on a general-register machine, belt positions on a belt machine), then each element address occupies five bits in the instruction, needing 15 bits for the three-address format of a general-register machine, but only 10 bits using the two-address format of a belt machine. Because bits are also needed for opcode and other information in the instruction, the (power-of-two constrained) instruction width often determines the maximum number of addressable elements possible in a design. Typically a belt machine instruction can support the encoding of double the number of addressable elements compared to a general-register machine of the same instruction width. There are similar gains in variable-length instruction encodings.

Overall, belt machine code is less compact than for stack machines, which use no operand addresses, but often must introduce stack-manipulation instructions unneeded on a belt machine. The instructions for accumulator machines are not padded out with multiple register fields, instead, they use the return stack and need no extra memory reference instructions.

Implementation

While a belt machine presents an operand queue as the program model, there is not necessarily a physical queue (shift register) in the implemented hardware. Instead, a belt design may use an implementation analogous to the register renaming common in modern general-register machines. Live data values are kept in conveniently addressable physical resources (individual registers, register files, static random-access memory (SRAM), or operand forwarding from functional units) and generally not moved for the duration of their belt lifetime. Instruction decoder maps logical belt positions to physical locations. The mapping is updated to reflect the changes of logical position arising from newly dropped results.

The belt-machine architecture was created by startup Mill Computing, Inc., for use in their Mill architecture family of general-purpose CPUs.[2]

See also

References