XCore XS1-G4

XS1-G4
Image: An XMOS Xcore processor, 144 Ball grid array package, 12×12 mm.
General info
  Launched: 2008
Performance
  Max. CPU clock rate: 400 MHz
Architecture and classification
  Instruction set: XCore XS1
Physical specifications
  Cores: 4

The XS1-G4 is a processor designed by XMOS. It is a 32-bit quad-core processor in which each core runs up to eight concurrent threads. It became available in autumn 2008, running at 400 MHz. Each thread can run at up to 100 MHz: four threads follow each other through the pipeline, so each core peaks at 400 MIPS and the four cores together reach 1.6 GIPS when 16 threads are running. When more than four threads execute on a core, the 400 MIPS of that core is distributed equally over all active threads, which allows extra threads to be used to hide latency. The XS1-G4 is a distributed-memory multicore processor, so the end user and compiler must deal with data distribution.
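The thread-rate arithmetic above can be sketched in a few lines of Python. This is only an illustration of the figures quoted in the text (400 MIPS per core, a 100 MIPS ceiling per thread, eight threads per core, four cores); the function names are made up for this example and are not part of any XMOS tool.

```python
CORE_MIPS = 400          # per-core instruction rate quoted in the text
MAX_THREAD_MIPS = 100    # per-thread ceiling quoted in the text
MAX_THREADS = 8          # hardware threads per core
CORES = 4                # cores on one XS1-G4


def per_thread_mips(active_threads: int) -> float:
    """Instruction rate of each thread on one core, in MIPS."""
    if not 1 <= active_threads <= MAX_THREADS:
        raise ValueError("a core runs between 1 and 8 threads")
    # Up to four threads each get the 100 MIPS ceiling; beyond that the
    # core's 400 MIPS is shared equally among the active threads.
    return min(MAX_THREAD_MIPS, CORE_MIPS / active_threads)


def chip_mips(active_threads_per_core: int) -> float:
    """Total rate of the chip when every core runs the same number of threads."""
    return CORES * active_threads_per_core * per_thread_mips(active_threads_per_core)


if __name__ == "__main__":
    for n in range(1, MAX_THREADS + 1):
        print(f"{n} threads/core: {per_thread_mips(n):6.1f} MIPS per thread, "
              f"{chip_mips(n):7.1f} MIPS per chip")
```

Under these assumptions the chip total saturates at 1600 MIPS once each core runs four threads; adding threads beyond that lowers the per-thread rate without raising the total.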

Description

The XS1-G4 comprises four cores and a switch. Each core has a data path, a memory, and register banks for eight threads. Threads running on different cores can communicate with each other by exchanging messages through the switch. The switches of multiple G4s can be connected to form a larger system. The instruction set supports the notion of a channel, a virtual connection between two threads. Channels are supported between threads on a core, between cores on a single chip through an XSwitch, or between cores in the same system if the switches are connected by means of physical links.

Instruction set architecture

Main page: XCore Architecture

Each thread has access to 12 general-purpose registers, and a standard 3-operand instruction set is used for programming the thread.[1] The instruction set is encoded densely: most instructions are encoded in 16 bits, where 11 bits specify the 3 operands and 5 bits encode the opcode. Less frequently used instructions are encoded in 32 bits. The instruction set is a load-store instruction set. All instructions execute in a single cycle. If an instruction does not need data from memory (for example, an arithmetic operation), the instruction will prefetch a word of instructions. This acts like a very small instruction cache, but its behavior can be predicted at compile time, making timing behavior as predictable as functional behavior. The instruction set natively supports events, which enable the processor to stop a thread and restart it when an event is ready. In addition, a thread may be interrupted to deal with some external events.
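Three operands fit in 11 bits because each operand only needs to name one of 12 registers: 12 × 12 × 12 = 1728 combinations, which is below 2^11 = 2048. The Python sketch below shows one way such a packing could work, using a plain base-12 encoding. The bit layout and the function names are assumptions made for illustration and should not be read as the actual XS1 encoding.

```python
NUM_REGS = 12        # general-purpose registers per thread (from the text)
OPERAND_BITS = 11    # width of the combined operand field (from the text)
OPCODE_BITS = 5      # width of the opcode field (from the text)


def pack(opcode: int, a: int, b: int, c: int) -> int:
    """Pack an opcode and three register numbers into one 16-bit word.

    Hypothetical layout: opcode in the top 5 bits, operands as a
    base-12 number in the low 11 bits.
    """
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode must fit in 5 bits")
    for reg in (a, b, c):
        if not 0 <= reg < NUM_REGS:
            raise ValueError("register numbers run from 0 to 11")
    operands = (a * NUM_REGS + b) * NUM_REGS + c   # 0 .. 1727
    return (opcode << OPERAND_BITS) | operands


def unpack(word: int) -> tuple:
    """Inverse of pack(): recover (opcode, a, b, c) from a 16-bit word."""
    opcode = word >> OPERAND_BITS
    operands = word & ((1 << OPERAND_BITS) - 1)
    operands, c = divmod(operands, NUM_REGS)
    a, b = divmod(operands, NUM_REGS)
    return opcode, a, b, c


if __name__ == "__main__":
    word = pack(5, 1, 2, 3)
    print(f"{word:016b}", unpack(word))
```

In this sketch 320 of the 2048 operand codes stay unused; a real encoding could put such spare codes to other uses, but the source does not say how.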

Resources

Each core on the XS1-G4 has access to:

  • 64 KByte of RAM
  • 64 I/O pins that can be accessed using 28 ports: 16 ports each access a single pin, and the other 12 ports access nibbles, bytes, and words of pins.
  • 32 channel ends; each channel end can be connected to another channel end, setting up a unidirectional communication path. Two channel ends can be set up to point to each other, creating a bidirectional path.
  • 10 timers
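The channel-end behavior listed above can be modelled loosely in Python: each end has a destination, and pointing two ends at each other yields a bidirectional channel. The `ChannelEnd` class and `make_channel` helper are hypothetical names for this sketch; it models the concept with software queues and says nothing about how the hardware implements it.

```python
import queue
import threading


class ChannelEnd:
    """Hypothetical model of a channel end: an input buffer plus a destination."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._dest = None

    def connect(self, other: "ChannelEnd") -> None:
        # One-way link: words sent from this end arrive at `other`.
        self._dest = other

    def send(self, word: int) -> None:
        if self._dest is None:
            raise RuntimeError("channel end has no destination")
        self._dest._inbox.put(word)

    def receive(self, timeout: float = 1.0) -> int:
        return self._inbox.get(timeout=timeout)


def make_channel():
    """Point two ends at each other to obtain a bidirectional path."""
    a, b = ChannelEnd(), ChannelEnd()
    a.connect(b)
    b.connect(a)
    return a, b


def worker(end: ChannelEnd) -> None:
    # Receive one word, double it, and send it back.
    end.send(end.receive() * 2)


a, b = make_channel()
thread = threading.Thread(target=worker, args=(b,))
thread.start()
a.send(21)
result = a.receive()
thread.join()
print(result)  # prints 42
```

If only `a.connect(b)` were called, `b.send(...)` would raise an error, which mirrors the unidirectional case described in the list.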

The total number of data pins on an XS1-G4 is 256, requiring a 512-pin BGA to bring out all pins (including ground, power, and system pins). The 144-pin BGA only brings out 48 pins of two cores, effectively providing two cores for processing only, and two cores for both processing and I/O.

Communication network

The switch of the G4 comprises 16 internal links (four links to each core) and 16 external links. The internal links can transport up to 3.2 Gbit/s (bidirectional) each between core and switch. The external links can transport up to 400 Mbit/s (bidirectional) between the switch and an external unit (possibly the switch of a second node). The switch can route up to 57 Gbit/s.
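As a rough check on these figures, the link rates can be summed in Python. Adding all 16 internal and 16 external links gives 57.6 Gbit/s, which is in line with the quoted switch capacity of 57 Gbit/s; whether the quoted figure was actually derived this way is an assumption, not something the source states.

```python
INTERNAL_LINKS = 16        # four links to each of the four cores
EXTERNAL_LINKS = 16
INTERNAL_MBIT_S = 3200     # 3.2 Gbit/s per internal link
EXTERNAL_MBIT_S = 400      # 400 Mbit/s per external link


def aggregate_mbit_s() -> int:
    """Sum of the peak rates of all links attached to the switch, in Mbit/s."""
    internal = INTERNAL_LINKS * INTERNAL_MBIT_S
    external = EXTERNAL_LINKS * EXTERNAL_MBIT_S
    return internal + external


if __name__ == "__main__":
    total = aggregate_mbit_s()
    print(f"internal links: {INTERNAL_LINKS * INTERNAL_MBIT_S / 1000} Gbit/s")
    print(f"external links: {EXTERNAL_LINKS * EXTERNAL_MBIT_S / 1000} Gbit/s")
    print(f"all links:      {total / 1000} Gbit/s")
```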

References
