Teraflops Research Chip
General Info | |
---|---|
Launched | 2006 |
Designed by | Intel Tera-Scale Computing Research Program |
Performance | |
Max. CPU clock rate | 5.67 GHz |
Data width | 38-bit |
Architecture and classification | |
Instruction set | 96-bit VLIW |
Physical specifications | |
Transistors | 100 million |
Cores | 80 |
Socket(s) |
|
History | |
Successor | Xeon Phi |
The Intel Teraflops Research Chip (codenamed Polaris) is a research manycore processor containing 80 cores and using a network-on-chip architecture, developed by Intel's Tera-Scale Computing Research Program.[1] It was manufactured using a 65 nm CMOS process with eight layers of copper interconnect and contains 100 million transistors on a 275 mm² die.[2][3][4] Its design goal was to demonstrate a modular architecture capable of a sustained performance of 1.0 TFLOPS while dissipating less than 100 W.[3] Research from the project was later incorporated into Xeon Phi. The technical lead of the project was Sriram R. Vangal.[4]
The processor was initially presented at the Intel Developer Forum on September 26, 2006[5] and officially announced on February 11, 2007.[6] A working chip was presented at the 2007 IEEE International Solid-State Circuits Conference, alongside technical specifications.[2]
Architecture
The chip consists of a 10×8 2D mesh network of cores and nominally operates at 4 GHz.[nb 1] Each core, called a tile (3 mm²), contains a processing engine and a 5-port wormhole-switched router (0.34 mm²) with mesochronous interfaces, providing a bandwidth of 80 GB/s and a latency of 1.25 ns at 4 GHz.[2] The processing engine in each tile contains two independent, nine-stage pipelined, single-precision floating-point multiply-accumulator (FPMAC) units, 3 KB of single-cycle instruction memory and 2 KB of data memory.[3] Each FPMAC unit can perform 2 single-precision floating-point operations per cycle, so each tile has an estimated peak performance of 16 GFLOPS at the nominal 4 GHz clock. A 96-bit very long instruction word (VLIW) encodes up to eight operations per cycle.[3] The custom instruction set includes instructions to send packets into and receive packets from the chip's network, as well as instructions for putting a particular tile to sleep and waking it.[4] Underneath each tile, a 256 KB SRAM module (codenamed Freya) was 3D stacked, bringing memory nearer to the processor and increasing overall memory bandwidth to 1 TB/s, at the expense of higher cost, thermal stress and latency, and a small total capacity of 20 MB.[7] The network of Polaris was shown to have a bisection bandwidth of 1.6 Tbit/s at 3.16 GHz and 2.92 Tbit/s at 5.67 GHz.[8]
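The per-tile and chip-level figures in this section follow from simple arithmetic. The sketch below recomputes them from the numbers quoted above; the function and constant names are illustrative, and the per-port width and latency in cycles are values implied by the quoted bandwidth and latency rather than figures stated in the sources.

```python
# Recompute the headline architecture figures from the values quoted in the text.
# All names here are illustrative; only the numeric inputs come from the article.

TILES = 80                       # 10 x 8 mesh
FPMACS_PER_TILE = 2              # two FPMAC units per processing engine
FLOPS_PER_FPMAC_PER_CYCLE = 2    # one multiply + one accumulate per cycle
NOMINAL_CLOCK_HZ = 4e9           # nominal 4 GHz operating point


def tile_peak_flops(clock_hz):
    """Peak single-precision FLOPS of one tile at the given clock."""
    return FPMACS_PER_TILE * FLOPS_PER_FPMAC_PER_CYCLE * clock_hz


def chip_peak_flops(clock_hz, tiles=TILES):
    """Peak single-precision FLOPS of the whole chip at the given clock."""
    return tiles * tile_peak_flops(clock_hz)


# Router: 80 GB/s across 5 ports, 1.25 ns latency, both quoted at 4 GHz.
ROUTER_BANDWIDTH_BYTES_PER_S = 80e9
ROUTER_PORTS = 5
ROUTER_LATENCY_S = 1.25e-9

# Derived (implied) values, not quoted directly in the article.
bytes_per_port_per_cycle = ROUTER_BANDWIDTH_BYTES_PER_S / ROUTER_PORTS / NOMINAL_CLOCK_HZ
router_latency_cycles = ROUTER_LATENCY_S * NOMINAL_CLOCK_HZ

# Stacked SRAM: 256 KB under each of the 80 tiles.
SRAM_PER_TILE_KB = 256
stacked_sram_mb = TILES * SRAM_PER_TILE_KB / 1024

if __name__ == "__main__":
    print(f"tile peak : {tile_peak_flops(NOMINAL_CLOCK_HZ) / 1e9:.0f} GFLOPS")
    print(f"chip peak : {chip_peak_flops(NOMINAL_CLOCK_HZ) / 1e12:.2f} TFLOPS")
    print(f"port width: {bytes_per_port_per_cycle:.0f} bytes per cycle")
    print(f"latency   : {router_latency_cycles:.0f} cycles")
    print(f"SRAM total: {stacked_sram_mb:.0f} MB")
```

At 4 GHz this gives 16 GFLOPS per tile and 1.28 TFLOPS for the chip, matching the figure in the title of the ISSCC paper.[2]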
Other prominent features of the Teraflops Research Chip include fine-grained power management, with 21 independent sleep regions per tile and dynamic tile sleep, and very high energy efficiency, with a theoretical peak of 27 GFLOPS/W at 0.6 V and a measured 19.4 GFLOPS/W for the stencil application at 0.75 V.[4][9]
Application | FLOP count | Average TFLOPS | % of peak TFLOPS | Active tiles |
---|---|---|---|---|
Stencil | 358K | 1.00 | 73.3% | 80 |
SGEMM | 2.63M | 0.51 | 37.5% | 80 |
Spreadsheet | 64.2K | 0.45 | 33.2% | 80 |
2D FFT | 196K | 0.02 | 2.73% | 64 |
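The percentage column can be roughly reproduced from the average throughput. The sketch below assumes the application measurements were taken at the 4.27 GHz operating point mentioned in the notes; that pairing is an assumption, not something the table itself states. The 2D FFT row is left out because its average is given to only one significant digit. Names are illustrative.

```python
# Rough cross-check of the "% of peak" column, ASSUMING a 4.27 GHz clock
# (the operating point listed in the notes) and all 80 tiles active.

def peak_tflops(clock_ghz, tiles=80, fpmacs_per_tile=2, flops_per_fpmac_cycle=2):
    """Theoretical peak in TFLOPS for the given clock and tile count."""
    return clock_ghz * tiles * fpmacs_per_tile * flops_per_fpmac_cycle / 1000.0


# application -> (average TFLOPS, reported % of peak), copied from the table
rows = {
    "Stencil": (1.00, 73.3),
    "SGEMM": (0.51, 37.5),
    "Spreadsheet": (0.45, 33.2),
}

ASSUMED_CLOCK_GHZ = 4.27
peak = peak_tflops(ASSUMED_CLOCK_GHZ)

# computed[name] is the recomputed percentage of peak for each application
computed = {name: 100.0 * avg / peak for name, (avg, _) in rows.items()}

if __name__ == "__main__":
    for name, (avg, reported) in rows.items():
        print(f"{name:12s} computed {computed[name]:5.1f}%  reported {reported:5.1f}%")
```

Under that assumption the recomputed values agree with the reported percentages to within about half a percentage point, which is consistent with the averages being rounded to two decimal places.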
Supply voltage (VCC) | Max. frequency (fmax)[nb 4] | Peak TFLOPS[nb 5] | Power[nb 6] | Temperature | Source |
---|---|---|---|---|---|
0.60 V | 1.0 GHz | 0.32 TFLOPS | 11 W | 110 °C | [2] |
0.675 V | 1.0 GHz | 0.32 TFLOPS | 15.6 W | 80 °C | [4] |
0.70 V | 1.5 GHz | 0.48 TFLOPS | 25 W | 110 °C | [2] |
0.70 V | 1.35 GHz | 0.43 TFLOPS | 18 W | 80 °C | [4] |
0.75 V | 1.6 GHz | 0.51 TFLOPS | 21 W | 80 °C | [4] |
0.80 V | 2.1 GHz | 0.67 TFLOPS | 42 W | 110 °C | [2] |
0.80 V | 2.0 GHz | 0.64 TFLOPS | 26 W | 80 °C | [4] |
0.85 V | 2.4 GHz | 0.77 TFLOPS | 32 W | 80 °C | [4] |
0.90 V | 2.6 GHz | 0.83 TFLOPS | 70 W | 110 °C | [2] |
0.90 V | 2.85 GHz | 0.91 TFLOPS | 45 W | 80 °C | [4] |
0.95 V | 3.16 GHz | 1.0 TFLOPS | 62 W | 80 °C | [4] |
1.00 V | 3.13 GHz | 1.0 TFLOPS | 98 W | 110 °C | [2] |
1.00 V | 3.8 GHz | 1.22 TFLOPS | 78 W | 80 °C | [4] |
1.05 V | 4.2 GHz | 1.34 TFLOPS | 82 W | 80 °C | [4] |
1.10 V | 3.5 GHz | 1.12 TFLOPS | 135 W | 110 °C | [2] |
1.10 V | 4.5 GHz | 1.44 TFLOPS | 105 W | 80 °C | [4] |
1.15 V | 4.8 GHz | 1.54 TFLOPS | 128 W | 80 °C | [4] |
1.20 V | 4.0 GHz | 1.28 TFLOPS | 181 W | 110 °C | [2] |
1.20 V | 5.1 GHz | 1.63 TFLOPS | 152 W | 80 °C | [4] |
1.25 V | 5.3 GHz | 1.70 TFLOPS | 165 W | 80 °C | [4] |
1.30 V | 4.4 GHz | 1.39 TFLOPS | ? | 110 °C | [2] |
1.30 V | 5.5 GHz | 1.76 TFLOPS | 210 W | 80 °C | [4] |
1.35 V | 5.67 GHz | 1.81 TFLOPS | 230 W | 80 °C | [4] |
1.40 V | 4.8 GHz | 1.52 TFLOPS | ? | 110 °C | [2] |
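The peak-performance column follows the formula given in the notes (frequency × 80 tiles × 2 FPMACs per tile × 2 FLOPS per FPMAC per cycle). As a minimal sketch, the snippet below applies that formula to a few rows copied from the table and also derives an energy-efficiency figure from the power column. The helper names are illustrative, and because many table values were read off plots, the results are approximate.

```python
# Apply the peak-FLOPS formula from the notes to selected rows of the table
# and derive GFLOPS/W from the quoted power. Helper names are illustrative.

def peak_tflops(f_max_ghz, tiles=80, fpmacs_per_tile=2, flops_per_fpmac_cycle=2):
    """Peak TFLOPS = f_max * tiles * FPMACs/tile * FLOPS/(FPMAC * cycle)."""
    return f_max_ghz * tiles * fpmacs_per_tile * flops_per_fpmac_cycle / 1000.0


def gflops_per_watt(tflops, watts):
    """Energy efficiency in GFLOPS per watt."""
    return tflops * 1000.0 / watts


# (supply voltage in V, f_max in GHz, power in W), copied from the table
selected_rows = [
    (0.95, 3.16, 62.0),
    (1.20, 4.00, 181.0),
    (1.35, 5.67, 230.0),
]

if __name__ == "__main__":
    for vcc, f_max, power in selected_rows:
        peak = peak_tflops(f_max)
        eff = gflops_per_watt(peak, power)
        print(f"{vcc:.2f} V  {f_max:.2f} GHz  {peak:.2f} TFLOPS  {eff:.1f} GFLOPS/W")
```

The efficiency computed this way is a theoretical peak per watt; it falls as voltage and frequency rise, since power grows faster than clock rate across the table.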
Issues
Intel aimed to ease software development for the new, exotic architecture by creating a programming model called Ct specifically for the chip. The model never gained the following Intel had hoped for and was eventually incorporated into Intel Array Building Blocks, a now-defunct C++ library.
Notes
- [nb 1] Intel later showed the chip running at up to 5.67 GHz.
- [nb 2] At 1.07 V and 4.27 GHz.
- [nb 3] All measurements present performance with all 80 cores active.
- [nb 4] Substantially higher frequencies at the same voltages (compared to the initial ISSCC report) were attained in 2008 with the use of a custom cooling solution.
- [nb 5] Values in italic were extrapolated by [math]\displaystyle{ \text{FLOPS}_{peak} = f_{max} \cdot 80 \text{ tiles} \cdot 2 \tfrac{\text{FPMAC}}{\text{tile}} \cdot 2 \tfrac{\text{FLOPS}}{\text{FPMAC}\cdot\text{cycle}} }[/math], where the maximum frequency was manually extracted from plots and is thus only approximate.
- [nb 6] Values in italic were manually extracted from plots and are thus only approximate.
References
- [1] Intel Corporation. "Teraflops Research Chip". http://techresearch.intel.com/articles/Tera-Scale/1449.htm.
- [2] Vangal, Sriram; Howard, Jason; Ruhl, Gregory; Dighe, Saurabh; Wilson, Howard; Tschanz, James; Finan, David; Iyer, Priya et al. (2007). "An 80-Tile 1.28TFLOPS Network-on-Chip in 65nm CMOS". pp. 98–589. doi:10.1109/ISSCC.2007.373606. ISBN 978-1-4244-0852-8. https://ieeexplore.ieee.org/document/4242283.
- [3] Peh, Li-Shiuan; Keckler, Stephen W.; Vangal, Sriram (2009), Keckler, Stephen W.; Olukotun, Kunle; Hofstee, H. Peter, eds., "On-Chip Networks for Multicore Systems", Multicore Processors and Systems (Springer US): pp. 35–71, doi:10.1007/978-1-4419-0263-4_2, ISBN 978-1-4419-0262-7, Bibcode: 2009mps..book...35P, http://link.springer.com/10.1007/978-1-4419-0263-4_2, retrieved 2020-05-14
- [4] Vangal, S.R.; Howard, J.; Ruhl, G.; Dighe, S.; Wilson, H.; Tschanz, J.; Finan, D.; Singh, A. et al. (2008). "An 80-Tile Sub-100-W TeraFLOPS Processor in 65-nm CMOS". IEEE Journal of Solid-State Circuits 43 (1): 29–41. doi:10.1109/JSSC.2007.910957. ISSN 0018-9200. Bibcode: 2008IJSSC..43...29V. https://ieeexplore.ieee.org/document/4443212.
- [5] "Intel Develops Tera-Scale Research Chips". 2006. https://www.intel.com/pressroom/archive/releases/2006/20060926corp_b.htm.
- [6] Intel Corporation (February 11, 2007). "Intel Research Advances 'Era Of Tera'". http://www.intel.com/pressroom/archive/releases/20070204comp.htm.
- [7] Bautista, Jerry (2008). "Tera-scale computing and interconnect challenges - 3D stacking considerations". 2008 IEEE Hot Chips 20 Symposium (HCS). Stanford, CA, USA: IEEE. pp. 1–34. doi:10.1109/HOTCHIPS.2008.7476514. ISBN 978-1-4673-8871-9. https://ieeexplore.ieee.org/document/7476514.
- [8] Intel's Teraflops Research Chip. Intel Corporation. 2007. http://download.intel.com/pressroom/kits/Teraflops/Teraflops_Research_Chip_Overview.pdf.
- [9] Fossum, Tryggve (2007). "High End MPSOC - The Personal Super Computer". MPSoC Conference 2007. p. 6. https://en.wikichip.org/w/images/0/0b/intel_mpsoc_2007.pdf.
Original source: https://en.wikipedia.org/wiki/Teraflops Research Chip.