Engineering:AMD Instinct

AMD Instinct
Release date: June 20, 2017
Architecture: GCN 3, GCN 4, GCN 5, CDNA, CDNA 2, CDNA 3, CDNA 4
Models: MI Series
Cores: 36-304 Compute Units (CUs)
Transistors
  • 5.7B (Polaris10) 14 nm
  • 8.9B (Fiji) 28 nm
  • 12.5B (Vega10) 14 nm
  • 13.2B (Vega20) 7 nm
  • 25.6B (Arcturus) 7 nm
  • 58.2B (Aldebaran) 6 nm
  • 146B (Antares) 5 nm
  • 153B (Aqua Vanjaram) 5 nm
History
Predecessor: AMD FirePro S

AMD Instinct is AMD's brand of data center GPUs.[1][2] It replaced AMD's FirePro S brand in 2016. Whereas the Radeon brand covers mainstream consumer and gaming products, the Instinct product line is intended to accelerate deep learning, artificial neural network, and high-performance computing (HPC)/GPGPU applications.

The AMD Instinct product line directly competes with Nvidia's Tesla (Nvidia Data Center GPUs) and Intel's Xeon Phi and Data Center GPU lines of machine learning and GPGPU cards.

The brand was originally known as AMD Radeon Instinct, but AMD dropped the Radeon brand from the name before AMD Instinct MI100 was introduced in November 2020.

In June 2022, supercomputers based on AMD's Epyc CPUs and Instinct GPUs took the lead on the Green500 list of the most power-efficient supercomputers, ahead of every other system by more than 50% and holding the top four spots.[3] One of them, the AMD-based Frontier, has been the fastest supercomputer in the world on the TOP500 list since June 2022 and remained so as of 2023.[4][5]

Products

Top view of an AMD Radeon Instinct MI50 card.
AMD Instinct GPU generations
Accelerator, Launch date, Architecture, Lithography, Compute Units, Memory (size, type, bandwidth in GB/s), PCIe support, Form factor, Processing power (FP16, BF16, FP32, FP32 matrix, FP64, FP64 matrix, INT8, INT4), TBP
MI6 2016-12-12[6] GCN 4 14 nm 36 16 GB GDDR5 224 3.0 PCIe 5.7 TFLOPS N/A 5.7 TFLOPS N/A 358 GFLOPS N/A N/A N/A 150 W
MI8 GCN 3 28 nm 64 4 GB HBM 512 8.2 TFLOPS 8.2 TFLOPS 512 GFLOPS 175 W
MI25 GCN 5 14 nm 16 GB HBM2 484 24.6 TFLOPS 12.3 TFLOPS 768 GFLOPS 300 W
MI50 2018-11-06[7] 7 nm 60 1024 4.0 26.5 TFLOPS 13.3 TFLOPS 6.6 TFLOPS 53 TOPS 300 W
MI60 64 32 GB 29.5 TFLOPS 14.7 TFLOPS 7.4 TFLOPS 59 TOPS 300 W
MI100 2020-11-16 CDNA 120 1200 184.6 TFLOPS 92.3 TFLOPS 23.1 TFLOPS 46.1 TFLOPS 11.5 TFLOPS 184.6 TOPS 300 W
MI210 2022-03-22[8] CDNA 2 6 nm 104 64 GB HBM2E 1600 181 TFLOPS 22.6 TFLOPS 45.3 TFLOPS 22.6 TFLOPS 45.3 TFLOPS 181 TOPS 300 W
MI250 2021-11-08[9] 208 128 GB 3200 OAM 362.1 TFLOPS 45.3 TFLOPS 90.5 TFLOPS 45.3 TFLOPS 90.5 TFLOPS 362.1 TOPS 560 W
MI250X 220 383 TFLOPS 47.9 TFLOPS 95.7 TFLOPS 47.9 TFLOPS 95.7 TFLOPS 383 TOPS 560 W
MI300A 2023-12-06[10] CDNA 3 6 & 5 nm 228 128 GB HBM3 5300 5.0 APU SH5 socket 980.6 TFLOPS (1961.2 TFLOPS with sparsity) 122.6 TFLOPS 61.3 TFLOPS 122.6 TFLOPS 1961.2 TOPS (3922.3 TOPS with sparsity) N/A 550 W (760 W with liquid cooling)
MI300X 304 192 GB OAM 1307.4 TFLOPS (2614.9 TFLOPS with sparsity) 163.4 TFLOPS 81.7 TFLOPS 163.4 TFLOPS 2614.9 TOPS (5229.8 TOPS with sparsity) N/A 750 W
MI325X 2024-10-10[11] 256 GB HBM3E 6000
MI350X 2025-06-13[12] CDNA 4 3 nm 256 288 GB HBM3E 8000 5.0 OAM 2306.9 TFLOPS (4613.8 TFLOPS with sparsity) 144.2 TFLOPS 72.1 TFLOPS 4.6137 POPS (9.2274 POPS with sparsity) 1000 W
MI355X 2516.6 TFLOPS (5033.2 TFLOPS with sparsity) 157.3 TFLOPS 78.6 TFLOPS 5.0332 POPS (10.066 POPS with sparsity) 1400 W

The three initial Radeon Instinct products were announced on December 12, 2016, and released on June 20, 2017, with each based on a different architecture.[13][14]

MI6

The MI6 is a passively cooled, Polaris 10-based card with 16 GB of GDDR5 memory and a TDP below 150 W.[1][2] At 5.7 TFLOPS (FP16 and FP32), the MI6 is expected to be used primarily for inference rather than neural network training. It has a peak double precision (FP64) compute performance of 358 GFLOPS.[15]

MI8

The MI8 is a Fiji-based card, analogous to the R9 Nano, with a TDP below 175 W.[1] It has 4 GB of High Bandwidth Memory. At 8.2 TFLOPS (FP16 and FP32), the MI8 is marketed toward inference. It has a peak double precision (FP64) compute performance of 512 GFLOPS.[16]

MI25

The MI25 is a Vega-based card that uses HBM2 memory. Its performance is expected to be 12.3 TFLOPS with FP32 numbers. In contrast to the MI6 and MI8, the MI25 is able to increase performance when using lower precision numbers, and accordingly is expected to reach 24.6 TFLOPS with FP16 numbers. The MI25 is rated at a TDP below 300 W with passive cooling. It also provides 768 GFLOPS of peak double precision (FP64) performance, at 1/16th of the FP32 rate.[17]
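The precision scaling described for the MI25 can be checked with a little arithmetic. The following is a minimal sketch, not an AMD tool: the function name `scaled_peak` and the rate arguments are assumptions made for this illustration, and the input is the FP32 figure quoted above.

```python
# Illustrative arithmetic only; scaled_peak is a hypothetical helper.
MI25_FP32_TFLOPS = 12.3  # peak single precision figure quoted in the text


def scaled_peak(fp32_tflops: float, rate: float) -> float:
    """Scale an FP32 peak figure by a precision rate relative to FP32."""
    return fp32_tflops * rate


# FP16 runs at twice the FP32 rate on the MI25.
mi25_fp16_tflops = scaled_peak(MI25_FP32_TFLOPS, 2)

# FP64 runs at 1/16th of the FP32 rate.
mi25_fp64_tflops = scaled_peak(MI25_FP32_TFLOPS, 1 / 16)

print(f"FP16: {mi25_fp16_tflops:.1f} TFLOPS")
print(f"FP64: {mi25_fp64_tflops * 1000:.1f} GFLOPS")
```

The FP64 result comes out slightly above the quoted 768 GFLOPS because the 12.3 TFLOPS input is itself a rounded figure.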

MI50, MI60

MI50 and MI60 are based on the Vega20 variant of GCN 5. They support FP64 at half the FP32 rate, and they are the last Instinct cards to carry the Radeon branding and the last able to produce display output.

MI100 series (CDNA 1)

The CDNA 1 cards drop all rendering-related hardware and add matrix processing units.

MI300 series

The AMD Instinct MI325X without cooler

The MI300A and MI300X are data center accelerators that use the CDNA 3 architecture, which is optimized for high-performance computing (HPC) and generative artificial intelligence (AI) workloads. The CDNA 3 architecture features a scalable chiplet design that leverages TSMC’s advanced packaging technologies, such as CoWoS (chip-on-wafer-on-substrate) and InFO (integrated fan-out), to combine multiple chiplets on a single interposer. The chiplets are interconnected by AMD’s Infinity Fabric, which enables high-speed and low-latency data transfer between the chiplets and the host system.

The MI300A is an accelerated processing unit (APU) that integrates 24 Zen 4 CPU cores with six CDNA 3 GPU chiplets, for a total of 228 CUs in the GPU section, and 128 GB of HBM3 memory. The Zen 4 CPU cores are built on the 5 nm process node and support the x86-64 instruction set, as well as the AVX-512 and BFloat16 extensions. They can run general-purpose applications and provide host-side computation for the GPU cores. The MI300A has a peak performance of 61.3 TFLOPS of FP64 (122.6 TFLOPS FP64 matrix) and 980.6 TFLOPS of FP16 (1961.2 TFLOPS with sparsity), as well as 5.3 TB/s of memory bandwidth. It supports PCIe 5.0 and CXL 2.0 interfaces, which allow it to communicate with other devices and accelerators in a heterogeneous system.

The MI300X is a dedicated generative AI accelerator that replaces the CPU cores with additional GPU cores and HBM, resulting in a total of 304 CUs (64 cores per CU) and 192 GB of HBM3 memory. It is designed to accelerate generative AI applications, such as natural language processing, computer vision, and deep learning. The MI300X has a peak performance of 653.7 TFLOPS of TF32 (1307.4 TFLOPS with sparsity) and 1307.4 TFLOPS of FP16 (2614.9 TFLOPS with sparsity), as well as 5.3 TB/s of memory bandwidth. The MI300X also supports PCIe 5.0 and CXL 2.0 interfaces, as well as AMD’s ROCm software stack, which provides a unified programming model and tools for developing and deploying generative AI applications on AMD hardware.[18][19][20]
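The relationship between the dense and "with sparsity" figures quoted for the MI300 series, and the core count implied by "64 cores per CU", can be sketched as follows. This is an illustration under stated assumptions: the 2x sparsity factor is inferred from the quoted numbers rather than taken from an AMD formula, and the names `with_sparsity` and `stream_processors` are hypothetical.

```python
# Illustrative check of figures quoted in the text; helper names are hypothetical.
CORES_PER_CU = 64  # "64 cores per CU", as stated for the MI300X


def with_sparsity(dense_tflops: float) -> float:
    """Assumption: the sparsity figure is twice the dense figure."""
    return 2 * dense_tflops


def stream_processors(compute_units: int) -> int:
    """Total cores for a given number of compute units."""
    return compute_units * CORES_PER_CU


# (label, dense TFLOPS, quoted TFLOPS with sparsity)
QUOTED = [
    ("MI300A FP16", 980.6, 1961.2),
    ("MI300X 653.7 TFLOPS figure", 653.7, 1307.4),
    ("MI300X FP16", 1307.4, 2614.9),
]

for label, dense, quoted_sparse in QUOTED:
    # Differences of about 0.1 come from rounding in the published figures.
    diff = abs(with_sparsity(dense) - quoted_sparse)
    print(f"{label}: 2 x {dense} differs from the quoted value by {diff:.1f}")

mi300x_cores = stream_processors(304)
print(f"MI300X cores: {mi300x_cores}")
```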

MI350 series

The MI350X and MI355X are data center accelerators built on the CDNA 4 architecture, targeting advanced AI training and inference workloads. Manufactured on TSMC’s 3 nm (N3) process, they use a high-performance chiplet design and feature 288 GB of HBM3E memory with 8 TB/s of bandwidth.[21] CDNA 4 introduces native support for the low-precision formats FP4 and FP6, in addition to FP8 and FP16, boosting FP4 compute to up to 9.2 petaFLOPS on the MI355X.[22] The architecture retains AMD’s Infinity Fabric interconnect for high-speed, low-latency data transfer between GPU chiplets and the host system. The design builds on CDNA 3, improving both scalability and energy efficiency for large-scale AI deployments.

Software

ROCm

As of 2022, the following software is grouped under the Radeon Open Compute (ROCm) meta-project.

MxGPU

The MI6, MI8, and MI25 products all support AMD's MxGPU virtualization technology, enabling sharing of GPU resources across multiple users.[1][23]

MIOpen

MIOpen is AMD's library for GPU acceleration of deep learning.[1] Much of it extends GPUOpen's Boltzmann Initiative software.[23] It is intended to compete with the deep learning portions of Nvidia's CUDA library. It supports the deep learning frameworks Theano, Caffe, TensorFlow, MXNet, Microsoft Cognitive Toolkit, Torch, and Chainer. Programming is supported in OpenCL and Python, and CUDA code can be compiled through AMD's Heterogeneous-compute Interface for Portability (HIP) and Heterogeneous Compute Compiler (HCC).

Chipset table

The Vega 20 GPU on the Instinct MI50
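Radeon Instinct MI6 (Polaris 10)[1][24][25][26][27]
  • Release date: 2016
  • Architecture and fab: GCN 4th gen, 14 nm
  • Transistors and die size: 5.7×10⁹, 232 mm²
  • Core config[note 5]: 2304:144:32, 36 CU
  • Core clock[note 1]: 1120 MHz (boost: 1233 MHz)
  • Fillrate[notes 1, 2, 3]: 177.6 GT/s texture, 39.46 GP/s pixel
  • Processing power[notes 1, 4]: 5800 GFLOPS half, 5800 GFLOPS single, 358 GFLOPS double
  • Memory: GDDR5, 256-bit bus, 16 GiB, 7000 MT/s, 224 GB/s
  • TBP: 150 W
  • Bus interface: PCIe 3.0 x16

Radeon Instinct MI8 (Fiji XT)[1][24][25][28][29]
  • Release date: 2016
  • Architecture and fab: GCN 3rd gen, 28 nm
  • Transistors and die size: 8.9×10⁹, 596 mm²
  • Core config[note 5]: 4096:256:64, 64 CU
  • Core clock[note 1]: 1000 MHz
  • Fillrate[notes 1, 2, 3]: 256.0 GT/s texture, 64.0 GP/s pixel
  • Processing power[notes 1, 4]: 8200 GFLOPS half, 8200 GFLOPS single, 512 GFLOPS double
  • Memory: HBM, 4096-bit bus, 4 GiB, 1000 MT/s, 512 GB/s
  • TBP: 175 W
  • Bus interface: PCIe 3.0 x16

Radeon Instinct MI25 (Vega 10 XT)[1][24][25][30][31][32]
  • Release date: 2016
  • Architecture and fab: GCN 5th gen, 14 nm
  • Transistors and die size: 12.5×10⁹, 510 mm²
  • Core config[note 5]: 4096:256:64, 64 CU
  • Core clock[note 1]: 1400 MHz (boost: 1500 MHz)
  • Fillrate[notes 1, 2, 3]: 384 GT/s texture, 96.0 GP/s pixel
  • Processing power[notes 1, 4]: 24600 GFLOPS half, 12300 GFLOPS single, 768 GFLOPS double
  • Memory: HBM2, 2048-bit bus, 16 GiB, 1704 MT/s, 436.2 GB/s
  • TBP: 300 W
  • Bus interface: PCIe 3.0 x16

Radeon Instinct MI25 mxgpu (Prototype, 2017)[33]
  • Release date: Unreleased
  • Transistors and die size: 12.5×10⁹, 510 mm²
  • Core config[note 5]: 4096:256:64, 64 CU
  • Core clock[note 1]: 1400 MHz (boost: 1500 MHz)
  • Fillrate[notes 1, 2, 3]: 384 GT/s texture, 96.0 GP/s pixel
  • Processing power[notes 1, 4]: 24600 GFLOPS half, 12300 GFLOPS single, 768 GFLOPS double
  • Memory: 16 GiB, 436.2 GB/s
  • TBP: 300 W

Radeon Instinct MI50 (Vega 20 GL)[34][35][36][37]
  • Release date: 2018
  • Architecture and fab: GCN 5th gen, 7 nm
  • Transistors and die size: 13.2×10⁹, 331 mm²
  • Core config[note 5]: 3840:240:-, 60 CU
  • Core clock[note 1]: 1450 MHz (boost: 1746 MHz)
  • Fillrate[notes 1, 2, 3]: 419.04 GT/s texture, - pixel
  • Processing power[notes 1, 4]: 26800 GFLOPS half, 13400 GFLOPS single, 6700 GFLOPS double
  • Memory: HBM2, 4096-bit bus, 16 GiB, 2000 MT/s, 1024 GB/s
  • TBP: 300 W
  • Bus interface: PCIe 4.0 x16

Radeon Instinct MI60 (Vega 20 GL)[34][38][39]
  • Release date: 2018
  • Architecture and fab: GCN 5th gen, 7 nm
  • Transistors and die size: 13.2×10⁹, 331 mm²
  • Core config[note 5]: 4096:256:-, 64 CU
  • Core clock[note 1]: 1500 MHz (boost: 1800 MHz)
  • Fillrate[notes 1, 2, 3]: 460.8 GT/s texture, - pixel
  • Processing power[notes 1, 4]: 29450 GFLOPS half, 14725 GFLOPS single, 7362.5 GFLOPS double
  • Memory: HBM2, 4096-bit bus, 32 GiB, 2000 MT/s, 1024 GB/s
  • TBP: 300 W
  • Bus interface: PCIe 4.0 x16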


  1. Boost values (if available) are stated after the base value.
  2. Texture fillrate is calculated as the number of texture mapping units multiplied by the base (or boost) core clock speed.
  3. Pixel fillrate is calculated as the number of render output units multiplied by the base (or boost) core clock speed.
  4. Precision performance is calculated from the base (or boost) core clock speed based on a FMA operation.
  5. Unified Shaders : Texture Mapping Units : Render Output Units and Compute Units (CU)
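
The calculations described in these notes can be sketched in a few lines, here applied to the Radeon Instinct MI25 figures from the chipset table. This is an illustrative sketch: the function names are hypothetical, counting one FMA as two floating-point operations is the usual convention rather than something the notes spell out, and the memory bandwidth formula (bus width times transfer rate) is an assumption that does not appear in the notes.

```python
# Sketch of the calculations in the notes above; all function names are hypothetical.

def texture_fillrate(tmus: int, clock_mhz: float) -> float:
    """Texture fillrate in GT/s: texture mapping units x core clock."""
    return tmus * clock_mhz / 1000


def pixel_fillrate(rops: int, clock_mhz: float) -> float:
    """Pixel fillrate in GP/s: render output units x core clock."""
    return rops * clock_mhz / 1000


def fma_gflops(shaders: int, clock_mhz: float) -> float:
    """Single precision GFLOPS, counting one FMA per shader per clock as 2 FLOPs."""
    return shaders * 2 * clock_mhz / 1000


def memory_bandwidth(bus_width_bits: int, rate_mts: float) -> float:
    """Memory bandwidth in GB/s (assumed formula: bytes per transfer x MT/s)."""
    return bus_width_bits / 8 * rate_mts / 1000


# MI25: core config 4096:256:64 (shaders : TMUs : ROPs), boost clock 1500 MHz
SHADERS, TMUS, ROPS = 4096, 256, 64
BOOST_MHZ = 1500

mi25_texture = texture_fillrate(TMUS, BOOST_MHZ)   # 384.0 GT/s
mi25_pixel = pixel_fillrate(ROPS, BOOST_MHZ)       # 96.0 GP/s
mi25_single = fma_gflops(SHADERS, BOOST_MHZ)       # 12288.0, listed rounded as 12300
mi25_bandwidth = memory_bandwidth(2048, 1704)      # about 436.2 GB/s

print(f"texture {mi25_texture:.1f} GT/s, pixel {mi25_pixel:.1f} GP/s")
print(f"single {mi25_single:.0f} GFLOPS, bandwidth {mi25_bandwidth:.1f} GB/s")
```

The results line up with the MI25 entries when the boost clock is used, which suggests the listed fillrate and processing power values are boost-based.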


References

  1. Smith, Ryan (December 12, 2016). "AMD Announces Radeon Instinct: GPU Accelerators for Deep Learning, Coming in 2017". Anandtech. http://www.anandtech.com/show/10905/amd-announces-radeon-instinct-deep-learning-2017. 
  2. Shrout, Ryan (December 12, 2016). "Radeon Instinct Machine Learning GPUs include Vega, Preview Performance". PC Per. https://www.pcper.com/reviews/Graphics-Cards/Radeon-Instinct-Machine-Learning-GPUs-include-Vega-Preview-Performance. 
  3. "Green500 Release June 2022". TOP500. https://www.top500.org/lists/green500/2022/06/. 
  4. "Top500 Release June 2022". TOP500. https://www.top500.org/lists/top500/2022/06/. 
  5. "Top500 Release November 2023". TOP500. https://www.top500.org/lists/top500/2023/11/. 
  6. Smith, Ryan. "AMD Announces Radeon Instinct: GPU Accelerators for Deep Learning, Coming In 2017". https://www.anandtech.com/show/10905/amd-announces-radeon-instinct-deep-learning-2017. 
  7. Smith, Ryan. "AMD Announces Radeon Instinct MI60 & MI50 Accelerators: Powered By 7nm Vega". https://www.anandtech.com/show/13562/amd-announces-radeon-instinct-mi60-mi50-accelerators-powered-by-7nm-vega. 
  8. Smith, Ryan. "AMD Releases Instinct MI210 Accelerator: CDNA 2 On a PCIe Card". https://www.anandtech.com/show/17326/amd-releases-instinct-mi210-accelerator-cdna-2-on-a-pcie-card. 
  9. Smith, Ryan. "AMD Announces Instinct MI200 Accelerator Family: Taking Servers to Exascale and Beyond". https://www.anandtech.com/show/17054/amd-announces-instinct-mi200-accelerator-family-cdna2-exacale-servers. 
  10. Smith, Ryan; Bonshor, Gavin. "The AMD Advancing AI & Instinct MI300 Launch Live Blog (Starts at 10am PT/18:00 UTC)". https://www.anandtech.com/show/21181/the-amd-advancing-ai-live-blog-starts-at-10am-pt1800-utc. 
  11. Smith, Ryan. "AMD Plans Massive Memory Instinct MI325X for Q4'24, Lays Out Accelerator Roadmap to 2026". https://www.anandtech.com/show/21422/amd-instinct-mi325x-reveal-and-cdna-architecture-roadmap-computex. 
  12. Gulick, Josh. "AMD Launches Instinct MI350X and MI355X AI GPUs". https://www.extremetech.com/computing/amd-launches-instinct-mi350x-and-mi355x-ai-gpus. 
  13. WhyCry (December 12, 2016). "AMD announces first VEGA accelerator:RADEON INSTINCT MI25 for deep-learning". https://videocardz.com/64677/amd-announces-first-vega-accelerator-radeon-instinct-mi25-for-deep-learning. 
  14. Mujtaba, Hassan (June 21, 2017). "AMD Radeon Instinct MI25 Accelerator With 16 GB HBM2 Specifications Detailed – Launches Today Along With Instinct MI8 and Instinct MI6". https://wccftech.com/amd-radeon-instinct-mi25-mi8-mi6-graphics-accelerators/. 
  15. "Radeon Instinct MI6". AMD. http://instinct.radeon.com/product/mi/radeon-instinct-mi6/. 
  16. "Radeon Instinct MI8". AMD. http://instinct.radeon.com/product/mi/radeon-instinct-mi8/. 
  17. "Radeon Instinct MI25". AMD. http://instinct.radeon.com/product/mi/radeon-instinct-mi25/. 
  18. "AMD CDNA 3 Architecture". AMD. https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf. 
  19. "AMD INSTINCT MI300A APU". AMD. https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/data-sheets/amd-instinct-mi300a-data-sheet.pdf. 
  20. "AMD INSTINCT MI300X APU". AMD. https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/data-sheets/amd-instinct-mi300x-data-sheet.pdf. 
  21. "AMD INSTINCT™ MI350X GPU. LEADERSHIP AI AND HPC ACCELERATION". https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/product-briefs/amd-instinct-mi350x-gpu-brochure.pdf. 
  22. "AMD INSTINCT™ MI355X GPU.LEADERSHIP AI AND HPC ACCELERATION". https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/product-briefs/amd-instinct-mi355x-gpu-brochure.pdf. 
  23. Kampman, Jeff (December 12, 2016). "AMD opens up machine learning with Radeon Instinct". TechReport. https://techreport.com/review/31093/amd-opens-up-machine-learning-with-radeon-instinct. 
  24. Shrout, Ryan (12 December 2016). "Radeon Instinct Machine Learning GPUs include Vega, Preview Performance". PC Per. https://www.pcper.com/reviews/Graphics-Cards/Radeon-Instinct-Machine-Learning-GPUs-include-Vega-Preview-Performance. Retrieved 12 December 2016. 
  25. Kampman, Jeff (12 December 2016). "AMD opens up machine learning with Radeon Instinct". TechReport. https://techreport.com/review/31093/amd-opens-up-machine-learning-with-radeon-instinct. Retrieved 12 December 2016. 
  26. "Radeon Instinct MI6". AMD. http://instinct.radeon.com/product/mi/radeon-instinct-mi6/. Retrieved 22 June 2017. 
  27. "AMD Radeon Instinct MI6 Specs". https://www.techpowerup.com/gpu-specs/radeon-instinct-mi6.c2927. 
  28. "Radeon Instinct MI8". AMD. http://instinct.radeon.com/product/mi/radeon-Instinkt-mi8/. Retrieved 22 June 2017. 
  29. "AMD Radeon Instinct MI8 Specs". https://www.techpowerup.com/gpu-specs/radeon-instinct-mi8.c2928. 
  30. Smith, Ryan (5 January 2017). "The AMD Vega Architecture Teaser: Higher IPC, Tiling, & More, coming in H1'2017". Anandtech.com. http://www.anandtech.com/show/11002/the-amd-vega-gpu-architecture-teaser. Retrieved 10 January 2017. 
  31. "Radeon Instinct MI25". AMD. http://instinct.radeon.com/product/mi/radeon-instinct-mi25/. Retrieved 22 June 2017. 
  32. "AMD Radeon Instinct MI25 Specs". https://www.techpowerup.com/gpu-specs/radeon-instinct-mi25.c2983. 
  33. "AMD Radeon Instinct MI25 MxGPU Specs". https://www.techpowerup.com/gpu-specs/radeon-instinct-mi25-mxgpu.c3269. 
  34. "Next Horizon – David Wang Presentation". AMD. https://www.amd.com/system/files/documents/next_horizon_david_wang_presentation.pdf. 
  35. "Radeon Instinct MI50". AMD. https://www.amd.com/en/products/professional-graphics/instinct-mi50. 
  36. "Radeon Instinct MI50 Datasheet". AMD. https://www.amd.com/system/files/documents/radeon-instinct-mi50-datasheet.pdf. 
  37. "Hands on with the AMD Radeon VII". Jarred Walton. https://www.pcgamer.com/hands-on-with-the-amd-radeon-vii/. 
  38. "Radeon Instinct MI60". AMD. https://www.amd.com/en/products/professional-graphics/instinct-mi60. 
  39. "Radeon Instinct MI60 Datasheet". AMD. https://www.amd.com/system/files/documents/radeon-instinct-mi60-datasheet.pdf.