Engineering:ARM Cortex-A77

ARM Cortex-A77
General Info
Launched: 2019
Designed by: ARM Holdings
Max. CPU clock rate: up to 3.0 GHz in phones and 3.3 GHz in tablets/laptops
Cache
L1 cache: 128 KiB (64 KiB I-cache with parity, 64 KiB D-cache) per core
L2 cache: 256–512 KiB
L3 cache: 1–4 MiB
Architecture and classification
Architecture: ARMv8-A
Microarchitecture: ARM Cortex-A77
Instruction set: ARMv8-A
Extensions
  • ARMv8.1-A, ARMv8.2-A, Cryptography, RAS, ARMv8.3-A LDAPR instructions, ARMv8.4-A dot product
Physical specifications
Cores
  • 1–4 per cluster
Products, models, variants
Product code name(s)
  • Deimos
History
Predecessor: ARM Cortex-A76
Successor: ARM Cortex-A78, ARM Cortex-X1

The ARM Cortex-A77 is a central processing unit implementing the ARMv8.2-A 64-bit instruction set, designed by ARM Holdings' Austin, Texas design centre.[1] ARM announced an increase of 23% in integer performance and 35% in floating-point performance. Memory bandwidth increased by 15% relative to the A76.[1]

Design

The Cortex-A77 is the successor to the Cortex-A76. It is a 4-wide decode, out-of-order superscalar design with a new 1.5K-entry macro-op (MOP) cache. It can fetch 4 instructions and 6 MOPs per cycle, and rename and dispatch 6 MOPs and 13 µops per cycle. The out-of-order window has been increased to 160 entries. The backend has 12 execution ports, a 50% increase over the Cortex-A76. It has a pipeline depth of 13 stages and execution latencies of 10 stages.[1][2]

There are six pipelines in the integer cluster, two more than on the Cortex-A76. One of the changes from the Cortex-A76 is the unification of the issue queues: previously each pipeline had its own issue queue, whereas the Cortex-A77 has a single unified issue queue, which improves efficiency. The Cortex-A77 adds a fourth general math ALU, with a typical latency of 1 cycle for simple math operations and 2 cycles for some more complex operations. In total, there are three simple ALUs that perform arithmetic and logical data processing operations and a fourth port that supports complex arithmetic (e.g. MAC, DIV). The Cortex-A77 also adds a second branch ALU, doubling the throughput for branches.

There are two ASIMD/FP execution pipelines, unchanged from the Cortex-A76. What did change is the issue queues: as with the integer cluster, the ASIMD cluster now features a unified issue queue for both pipelines, improving efficiency. As on the Cortex-A76, both ASIMD pipelines are 128 bits wide, capable of 2 double-precision, 4 single-precision, 8 half-precision, or 16 8-bit integer operations. These pipelines can also execute the cryptographic instructions if the extension is supported (it is not offered by default and requires an additional license from Arm). The Cortex-A77 adds a second AES unit to improve the throughput of cryptographic operations.[3]
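
The operation counts quoted above follow directly from dividing the 128-bit vector width by the element width. The short sketch below reproduces that arithmetic; the names `VECTOR_BITS`, `ELEMENT_BITS` and `lanes_per_vector` are illustrative and not taken from any Arm documentation.

```python
# Illustrative arithmetic only: how many elements of a given width fit in
# one 128-bit ASIMD vector. All names here are hypothetical helpers.
VECTOR_BITS = 128

ELEMENT_BITS = {
    "double-precision": 64,
    "single-precision": 32,
    "half-precision": 16,
    "8-bit integer": 8,
}


def lanes_per_vector(element_bits: int, vector_bits: int = VECTOR_BITS) -> int:
    """Return the number of elements of element_bits that fit in one vector."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element width must be positive and divide the vector width")
    return vector_bits // element_bits


# Lane count for each element type listed in the text.
LANES = {name: lanes_per_vector(bits) for name, bits in ELEMENT_BITS.items()}

if __name__ == "__main__":
    for name, lanes in LANES.items():
        print(f"{name}: {lanes} elements per 128-bit vector")
```

This only models register layout; it says nothing about the latency or throughput of any particular instruction.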

The reorder buffer (ROB) is larger, with up to 160 entries (up from 128), and the new L0 MOP cache holds up to 1536 entries.[4]

The core supports unprivileged 32-bit applications, but privileged applications must utilize the 64-bit ARMv8-A ISA. It also supports Load acquire (LDAPR) instructions (ARMv8.3-A), Dot Product instructions (ARMv8.4-A), and PSTATE Speculative Store Bypass Safe (SSBS) bit instructions (ARMv8.5-A).
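
On a Linux arm64 system, one way software might check for the optional instructions listed above is to read the `Features` line of `/proc/cpuinfo`. The following is a minimal sketch under that assumption: the flag names `lrcpc`, `asimddp` and `ssbs` are the ones the Linux kernel reports for these features, while `REQUIRED_FLAGS`, `parse_features`, `missing_extensions` and the `SAMPLE` text are hypothetical and hand-written for illustration, not captured from a real Cortex-A77 device.

```python
# Sketch: check a /proc/cpuinfo-style text for the extensions named above.
# Mapping from the extension (as described in the text) to the Linux arm64
# feature flag. The dictionary itself is an illustrative helper.
REQUIRED_FLAGS = {
    "LDAPR (ARMv8.3-A)": "lrcpc",
    "Dot Product (ARMv8.4-A)": "asimddp",
    "SSBS (ARMv8.5-A)": "ssbs",
}


def parse_features(cpuinfo_text: str) -> set:
    """Return the flags on the first 'Features' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def missing_extensions(cpuinfo_text: str) -> list:
    """List the extensions from REQUIRED_FLAGS whose flag is absent."""
    flags = parse_features(cpuinfo_text)
    return [name for name, flag in REQUIRED_FLAGS.items() if flag not in flags]


# Hand-written sample for illustration only (not real device output).
SAMPLE = (
    "processor\t: 0\n"
    "Features\t: fp asimd aes pmull sha1 sha2 crc32 atomics fphp asimdhp "
    "lrcpc dcpop asimddp ssbs\n"
)

if __name__ == "__main__":
    print("missing:", missing_extensions(SAMPLE))
```

On a real device the text would come from `open("/proc/cpuinfo").read()`; keeping the parser separate from the file read makes it usable on non-Arm hosts as well.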

The Cortex-A77 supports ARM's DynamIQ technology, and is expected to be used as high-performance cores in combination with Cortex-A55 power-efficient cores.[1]

Architecture changes in comparison with ARM Cortex-A76

  • Front-end[5][6]
    • Branch-prediction
      • Better accuracy
      • Up to 64B runahead window (from 32B)
      • Increased L1 BTB capacity, up to 64 entries (from 16 entries)
      • Increased BTB capacity, up to 8K entries (from 6K entries)
    • Improved prefetcher
    • New L0 macro-op cache
    • Wider instruction fetch, up to 6 instructions/cycle (from 4 instructions/cycle)
  • Execution engine
    • Larger reorder buffer, up to 160 entries (from 128 entries)
    • Wider dispatch, up to 10-way (from 8-way)
    • Wider issue, up to 12-way (from 8-way)
      • Execution units
        • New integer ALU unit and port
        • New branch unit and port
        • New dedicated store data ports
        • New AES unit added
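
To put the figures in the list above on a common scale, the sketch below computes the relative increase for several of them. The `Change` class and the `CHANGES` table are illustrative constructs; the numbers are copied from the list, and nothing here is measured data.

```python
# Sketch: relative increase of selected Cortex-A77 structure sizes and widths
# over the Cortex-A76, using only the figures quoted in the list above.
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    name: str
    a76: int  # value on the Cortex-A76
    a77: int  # value on the Cortex-A77

    @property
    def increase_pct(self) -> float:
        """Percentage increase from the A76 value to the A77 value."""
        return (self.a77 - self.a76) / self.a76 * 100.0


CHANGES = [
    Change("Runahead window (bytes)", 32, 64),
    Change("BTB entries", 6 * 1024, 8 * 1024),
    Change("Instruction fetch (instructions/cycle)", 4, 6),
    Change("Reorder buffer entries", 128, 160),
    Change("Dispatch width", 8, 10),
    Change("Issue width", 8, 12),
]

if __name__ == "__main__":
    for c in CHANGES:
        print(f"{c.name}: {c.a76} -> {c.a77} (+{c.increase_pct:.1f}%)")
```

The issue width row (8 to 12) corresponds to the 50% increase in execution ports mentioned in the Design section.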

Licensing

The Cortex-A77 is available as a SIP core to licensees, and its design makes it suitable for integration with other SIP cores (e.g. GPU, display controller, DSP, image processor, etc.) into one die constituting a system on a chip (SoC).

Usage

The Samsung Exynos 980 was introduced in September 2019[7][8] as the first SoC to use the Cortex-A77 microarchitecture.[9] It was followed in May 2020 by a lower-end variant, the Exynos 880.[10] The MediaTek Dimensity 1000, 1000L and 1000+ SoCs also use the Cortex-A77 microarchitecture.[11] Derivatives named Kryo 585, Kryo 570 and Kryo 560 are used in the Snapdragon 865, 750G, and 690, respectively.[12][13][14]

References

  1. Frumusanu, Andrei. "Arm's New Cortex-A77 CPU Micro-architecture: Evolving Performance". https://www.anandtech.com/show/14384/arm-announces-cortexa77-cpu-ip.
  2. Schor, David (2019-05-26). "Arm Unveils Cortex-A77, Emphasizes Single-Thread Performance" (in en-US). https://fuse.wikichip.org/news/2339/arm-unveils-cortex-a77-emphasizes-single-thread-performance/. 
  3. "Arm Cortex-A77". https://en.wikichip.org/wiki/arm_holdings/microarchitectures/cortex-a77. 
  4. "Cortex-A77 - Microarchitectures - ARM - WikiChip" (in en). https://en.wikichip.org/wiki/arm_holdings/microarchitectures/cortex-a77. 
  5. "Arm Cortex-A77 - everything you need to know" (in en-US). 2019-05-27. https://www.androidauthority.com/arm-cortex-a77-cpu-990172/. 
  6. "Cortex-A77 - Microarchitectures - ARM - WikiChip" (in en). https://en.wikichip.org/wiki/arm_holdings/microarchitectures/cortex-a77. 
  7. "Samsung Introduces its First 5G-Integrated Mobile Processor, the Exynos 980" (in en). https://www.samsung.com/semiconductor/minisite/exynos/newsroom/pressrelease/samsung-introduces-its-first-5g-integrated-mobile-processor-the-exynos-980/. 
  8. "Exynos 980 5G Mobile Processor: Specs, Features | Samsung Exynos" (in en). https://www.samsung.com/semiconductor/minisite/exynos/products/mobileprocessor/exynos-980/. 
  9. Frumusanu, Andrei. "Samsung Announces Exynos 980 - Mid-Range With Integrated 5G Modem". https://www.anandtech.com/show/14829/samsung-announces-exynos-980-midrange-with-integrated-5g-modem. 
  10. "Exynos 880 5G Mobile Processor: Specs, Features | Samsung Exynos" (in en). https://www.samsung.com/semiconductor/minisite/exynos/products/mobileprocessor/exynos-880. 
  11. MediaTek (2020-06-18). "MediaTek Dimensity 1000 Series" (in en). https://www.mediatek.com/products/smartphones/dimensity-1000-series. 
  12. "Qualcomm Snapdragon 865 5G Mobile Platform | Latest Snapdragon Processor" (in en). 2019-11-19. https://www.qualcomm.com/products/snapdragon-865-5g-mobile-platform. 
  13. "Qualcomm Snapdragon 750G Mobile Platform | Qualcomm". https://www.qualcomm.com/products/snapdragon-750g-5g-mobile-platform. 
  14. "Snapdragon 690 Mobile Platform". https://www.qualcomm.com/products/snapdragon-690-mobile-platform.