New Armv9 CPU core

Arm processors touch 70% of the global population. From the edge to the cloud, Arm is on track to touch 100 percent of the world’s shared data. Arm has announced Armv9, its first new architecture in a decade. Announced on March 30, 2021, v9 is Arm’s answer to the future needs of AI, security and specialized computing. News highlights:

  • The new Armv9 architecture will form the leading edge of the next 300 billion Arm-based chips
  • Advances specialized processing built on the economics, design freedom and accessibility advantages of general-purpose compute
  • Delivers greater performance, enhanced security and DSP and ML capabilities

The latest Arm architecture update promises to deliver the power of specialized processing with the economics, design freedom, and accessibility of general-purpose computing. Armv9 is promised to be code-friendly and secure. Arm is rolling out Armv9 on an ongoing basis to enable partners to deliver best-in-class solutions for all workloads and applications across all markets.

The Explore the New Armv9 Decade web page lays out Arm’s vision for the next ten years as the company turns 30. Thirty years ago, Arm began with a vision of deploying its processor technology everywhere computing might happen, and 180 billion Arm-based chips have now shipped.

The article Armed for the Future: Armv9 Architecture Equips Chips with AI, Security, DSP Superpowers says that in 2021 the keywords are artificial intelligence, the cloud, and security. Arm’s new Armv9 architecture directly enhances all of these and more thanks to what might sound like a paradoxical focus on specialized computing. Focusing first on security, Arm builds on its hardware-enforced TrustZone technology with the new Confidential Compute Architecture (CCA), which provides an impenetrable firewall between apps running on the same OS, as well as ensuring the safety of data in transit from Arm-based server to Arm-based handheld. So we will see this new architecture in everything from smartphones to servers.

The updated architecture places a focus on secure execution and compartmentalisation. In addition to security features there are signal processing enhancements. Arm CPU and NPU (neural processing unit) devices are already replacing cloud-based AI infrastructure with low-power Edge compute, but enhancements to Matrix Multiplication (a key operation for machine learning) and new Advanced SIMD instructions and Scalable Vector Extension 2 (SVE2) technology boost not only ML but also DSP, VR and AR compute performance.

Arm is broadening its established footholds in the cloud through partnerships with Amazon (AWS Graviton processors), Microsoft (Azure Edge), VMware, and Ampere. In the blog post Armv9: The Future of Specialized Compute, Arm CEO Simon Segars explains why the launch of the Armv9 architecture signals a new era for Arm and its partners. We will have to wait some time for silicon based on the new architecture to become available.

More articles on ARMv9:

https://www.anandtech.com/show/16584/arm-announces-armv9-architecture

https://www.nextplatform.com/2021/03/30/arms-v9-architecture-explains-why-nvidia-needs-to-buy-it/

https://www.xda-developers.com/armv9-architecture/

https://www.androidauthority.com/armv9-explained-1213065/

https://arstechnica.com/gadgets/2021/03/arm-vision-day-outlined-upcoming-arm-v9-cpu-design/

50 Comments

  1. Tomi Engdahl says:

    Arm unveils v9 architecture
    Arm has introduced the Armv9 architecture in response to the global demand for ubiquitous specialized processing with increasingly capable security and artificial intelligence (AI).
    https://www.newelectronics.co.uk/electronics-news/arm-unveils-v9-architecture/235943/

  2. Tomi Engdahl says:

    https://semiengineering.com/week-in-review-auto-security-pervasive-computing-61/?cmid=f05c93b6-0f56-44a8-a16d-4505a667f76f
    Pervasive computing — IoT, edge, cloud, data center, and back
    Arm announced its Armv9 architecture, which is designed for secure, pervasive computing that can run in more types of AI systems. Because most data will be touching an Arm-based chip in the near future, whether on the edge, IoT, or data center, Arm enhanced the security, in addition to improving performance and AI/ML capabilities, according to a press release. Data is shielded through its Confidential Compute Architecture (CCA), which uses Realms: dynamically created shields that protect sensitive data and code while in storage, use, and transit. Microsoft Azure collaborated on Realms.

  3. Tomi Engdahl says:

    Killer Screwdriver
    https://www.youtube.com/watch?v=AGXQNLq19FQ

    Neon tester/terminal driver gets tested on a few thousand volts.
    Not a good prospect if you had it in your hand and inadvertently made contact with a few thousand volts.
    Strongly advise using a properly insulated screwdriver for terminals.
    Use a proper voltage tester, like a Fluke with 2 probes, to get a definitive answer as to whether there is voltage or not.
    Always ensure that your test equipment is rated for the voltage you want to test or work on.
    Wherever possible, do not work on live equipment moving wires about.
    Play safe !

  4. Tomi Engdahl says:

    Arm’s new chip architecture will power future devices, possibly including Apple’s
    https://appleinsider.com/articles/21/04/01/arms-new-chip-architecture-will-power-future-devices-possibly-including-apples
    https://forums.appleinsider.com/discussion/220932

    Chip designer Arm Ltd. has announced its new v9 architecture, a design that could eventually appear in the Apple Silicon powering future iPhones, iPads, and Macs.

    The new Arm v9 focuses on three areas: performance, security, and machine learning (ML) capabilities. Arm says the design will provide more than a 30% CPU performance boost over the next two generations of mobile and infrastructure CPUs.

    AI is another critical area Arm targeted with its v9 design. The architecture’s new Scalable Vector Extension 2 (SVE2) technology will enhance ML and digital signal processing (DSP) for future devices.

    Arm’s SVE2 can improve processing for 5G systems, ML, voice AI assistants, and virtual and augmented reality.

    Security is the third pillar of Arm’s new design. Its Confidential Compute Architecture (CCA) “shields portions of code and data from access or modification while in-use, even from privileged software, by performing computation in a hardware-based secure environment.”

    The CCA will use a concept called Realms, a “region that is separate from both the secure and non-secure worlds.” For example, Arm says a business application could use Realms to protect sensitive data from the rest of the system “while it is in-use, at rest, and in transit.”

  5. Tomi Engdahl says:

    Anandtech: With ARMv9 on the way next year, A14 and Mac ARM SoCs may be even more powerful than we expect. Notably adds Scalable Vector Extension 2
    https://www.reddit.com/r/apple/comments/hehztx/anandtech_with_armv9_on_the_way_next_year_a14_and/

  6. Tomi Engdahl says:

    ARMv9 gives 30%+ Performance Gains for future Apple Silicon & iPhones
    https://www.youtube.com/watch?v=hkioIjR9CtA

    ARM V9 coming to Apple A15 & M2? Siri gets new voices and Apple TV’s getting a new remote. Then I’m answering your iCaveAnswers about AMD going ARM, iPhone 13 Colours, iPad 10, Tim Cook retiring and what MacBook to buy right now. So let’s GO

    ARM has announced their V9 Architecture, and it’s the first major update in a decade. ARMv6 was the instruction set in the original iPhone, ARMv7 was the last 32-bit version that was abandoned after the iPhone 5, with the 64-bit ARMv8 taking over for the 5S onwards.
    In terms of what ARM v9 means in broad terms, it’s about faster performance, better AI performance and Security.
    ARMv9 is focused on machine learning but will also benefit general purpose computing, AR, VR and 5G. And we’re already aware of Apple’s interest in AR and VR, as well as how real 5G got last year.

    CPU performance could improve by over 30% on ARMv9 and all existing software should run perfectly without modification as the standard is fully backward compatible, which given the current performance of Apple’s processors sounds pretty awesome!

  7. Tomi Engdahl says:

    Arm’s new #processor architecture aims to meet the growing need for enhanced #security and #AI functionality at the #edge #Armv9 #IoT Arm

    New Arm architecture brings enhanced security and AI to IoT
    https://www.edn.com/new-arm-architecture-brings-enhanced-security-and-ai-to-iot/?utm_content=buffer9b3ab&utm_medium=social&utm_source=edn_facebook&utm_campaign=buffer

    The Armv9 architecture launched at the end of March, with the aim of enhancing AI processing in IoT devices. The need seems clear: the company estimates that 90% of new IoT applications will contain some kind of AI element. Among the applications expected to require AI are voice processing for device control, vision processing for industrial automation and consumer systems, and machine learning for robotics, autonomous mobile devices, and smart sensors. AI is also providing developers with an alternative to custom programming when adapting their device designs to specific use cases.

  8. Tomi Engdahl says:

    The ARMv9 ISA, And What It Can Do For You
    https://hackaday.com/2021/05/10/the-armv9-isa-and-what-it-can-do-for-you/

    The number of distinct ARM Instruction Set Architecture (ISA) versions has slowly increased, with Arm adding a new version every few years. The oldest ISA version in common use today is ARMv6, with the ARMv6 ISA (ARM11) found in the original Raspberry Pi SBC and Raspberry Pi Zero (BCM2835). The ARMv6 ISA was introduced in 2002, followed by ARMv7 in 2005 (start of Cortex-A series) and ARMv8 in 2011. The latter was notable for adding 64-bit support.

    With ARMv7 being the first of the Cortex cores, and ARMv8 adding 64-bit support in the form of AArch64, what notable features does ARMv9 bring to the table? As announced earlier this year, ARMv9’s focus appears to be on adding a whole host of features that should improve vector processing (vector extensions, or SVE) as well as digital signal processing (DSP) and security, with its Confidential Compute Architecture (CCA).

    In addition to this, ARMv9 also includes all of the features that were added with ARMv8.1, v8.2, v8.3 and so on. In essence, this makes an ARMv9-based processor theoretically capable of going toe-to-toe with the best that Intel and AMD have to offer.

    Welcome to the High-End

    The next three updates added and refined more functionality, creating an impressive list of required and optional updates to the base ARMv8 specification. Not surprisingly, this large number of ISA specifications is a bit messy, and one of the things that ARMv9 accomplishes is bringing all of these versions together in one specification.

    Another thing which ARMv9 adds over ARMv8 is Scalable Vector Extension version two (SVE2), the successor to SVE, and essentially the replacement of the NEON SIMD instructions. As Arm notes, the NEON instructions are still in ARMv9, but only for backwards compatibility. As the ‘Scalable’ part of SVE suggests, a major benefit of SVE over NEON is that it scales to the underlying hardware, allowing for even smaller, less powerful platforms to still handle the same SVE2-based code as a higher-end chip.
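The "scales to the underlying hardware" point can be illustrated with a small model. The sketch below is plain Python, not real SVE2 code or intrinsics; the helper names `whilelt` and `vla_add` are made up for illustration. It mimics a vector-length-agnostic loop: the same loop body runs unchanged whether the hardware vector is 128, 256 or 512 bits wide, and a per-lane predicate masks off the lanes that would run past the end of the data.

```python
def whilelt(start, n, lanes):
    """Model of a loop predicate: lane i is active while start + i < n."""
    return [start + i < n for i in range(lanes)]


def vla_add(a, b, vector_bits, element_bits=32):
    """Add two equal-length lists the way a vector-length-agnostic loop would.

    Returns the result and the number of vector iterations that were needed.
    """
    lanes = vector_bits // element_bits  # lanes offered by this "hardware"
    n = len(a)
    out = [0] * n
    iterations = 0
    i = 0
    while i < n:
        pred = whilelt(i, n, lanes)
        for lane, active in enumerate(pred):
            if active:  # inactive tail lanes are skipped, so no scalar cleanup loop
                out[i + lane] = a[i + lane] + b[i + lane]
        i += lanes
        iterations += 1
    return out, iterations


a = list(range(10))
b = [10 * x for x in a]
for bits in (128, 256, 512):
    result, iterations = vla_add(a, b, bits)
    print(bits, iterations, result == [x + y for x, y in zip(a, b)])
# 128 3 True
# 256 2 True
# 512 1 True
```

The calling code never mentions the vector width; only the iteration count changes. That is the property that lets a small core and a high-end chip run the same binary.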

    It’s telling that SVE has its roots in HPC (high-performance computing), with the Japanese Fugaku supercomputer being one of the first systems to make use of it upon its introduction last year. This means that ARMv9’s SVE2 will be very important for applications that process data which benefit from SIMD-based algorithms.

    Realms and Tagged Memory

    New in ARMv9 is the concept of ‘realms’, which can be considered as a kind of secure container in which code can execute without affecting the rest of the system. This works together with e.g. a hypervisor, with the latter handing a large part of the security-related operations to a new Realm Manager. The exact details aren’t known at this point, beyond the information which Arm made available at its recent announcement.

    Not new to v9, but available since v8.5 is Memory Tagging Extension (MTE). Memory tagging is a mechanism to track illegal memory operations on a hardware level. This is similar to what Valgrind’s Memcheck tool does when it keeps track of memory accesses in order to detect buffer overflows and out of bounds writes and reads, except with MTE this is supported on a hardware level.
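A toy model may make the tagging mechanism more concrete. The sketch below is a simplified Python simulation, not how MTE is programmed in practice; the `TaggedMemory` class and its `allocate` and `store` methods are hypothetical names. It assumes the commonly documented MTE parameters: a 4-bit tag per 16-byte granule of memory, with the pointer carrying its own tag in bits 59:56. An access faults when the two tags differ.

```python
GRANULE = 16      # bytes covered by one memory tag
TAG_SHIFT = 56    # the pointer's tag sits in bits 59:56
TAG_MASK = 0xF    # tags are 4 bits wide


class TaggedMemory:
    """Toy memory in which every 16-byte granule carries a 4-bit tag."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)

    def allocate(self, addr, length, tag):
        """Hypothetical allocator: tag the granules, return a tagged pointer."""
        first = addr // GRANULE
        last = (addr + length + GRANULE - 1) // GRANULE
        for granule in range(first, last):
            self.tags[granule] = tag
        return (tag << TAG_SHIFT) | addr

    def store(self, ptr, value):
        """Write one byte, but only if the pointer tag matches the memory tag."""
        addr = ptr & ((1 << TAG_SHIFT) - 1)
        ptr_tag = (ptr >> TAG_SHIFT) & TAG_MASK
        if self.tags[addr // GRANULE] != ptr_tag:
            raise MemoryError(f"tag check fault at {addr:#x}")
        self.data[addr] = value


mem = TaggedMemory(64)
p = mem.allocate(0, 16, tag=0x3)    # first 16-byte buffer
q = mem.allocate(16, 16, tag=0x5)   # neighbouring buffer gets a different tag

mem.store(p + 15, 1)                # last byte of the first buffer: allowed
try:
    mem.store(p + 16, 1)            # one byte past the end: tags differ
except MemoryError as fault:
    print("caught:", fault)
```

The overflow from the first buffer into its neighbour is caught at the moment of the access, which is the class of bug the comparison with Valgrind's Memcheck refers to.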

    Having these features in the ARMv9 specification means that upcoming ARM processors and SoCs are likely to offer virtualization and security options that make them interesting for data centers and other applications where virtualization and security are essential.

    Stay Tuned

    Just a few days ago, Arm revealed the first two CPU cores which are based on the ARMv9 specification. These are the Neoverse V1 (“Zeus”) and N2 (“Perseus”) cores. Both of these are targeting data centers and HPC applications, with Amazon AWS, Tencent, Oracle, and other cloud providers likely to use them.

    The low-down is that for the near term, ARMv9 is something that the average consumer will have little if anything to do with, as there’s not much incentive for many platforms to change even from baseline ARMv8-A to ARMv9-A. The need to license new cores and new IP is of course another factor here. All of this means that for the coming years, it’s not likely that we’ll see ARMv9-based silicon appear in mobile devices or new single-board computers. (Sounds kind of like a challenge for Hackaday readers, doesn’t it?)

    That is not to say that it isn’t an interesting development, especially once ARMv9 with SVE2 and CCA ultimately do appear on those platforms.

    Arm Puts Some Muscle Into Future Neoverse Server CPU Designs
    https://www.nextplatform.com/2021/04/27/arm-puts-some-muscle-into-future-neoverse-server-cpu-designs/

    Arm is hosting its annual Tech Day shindig, virtually (again) thanks to the coronavirus pandemic, and is providing a lot more insight into the future Neoverse core and processor designs that will be adopted and modified by those who have a hankering to take on the hegemony of the X86 processor – which now includes pretty solid CPUs from Intel and AMD – in the datacenter and at the edge.

    The revelation of the details of the future Neoverse server architectures, meshed with the future Armv9-A architecture that was unveiled a month ago and that will make its debut in the Neoverse “Perseus” N2 cores, comes at a pivotal time. Despite many Arm server chip suppliers leaving the field, Arm Holdings has hung in there and there appears to be momentum building for an Arm alternative from multiple chip designers and suppliers who will individually play to particular subsets of the datacenter and edge, but collectively give Intel – and therefore AMD, too – plenty of grief.

    Like the Neoverse N2 core, the design of which is done and licensable from Arm Holdings, the “Zeus” V1 core is also done and provides significant differentiation within the Neoverse family of designs and with all manner of CPUs in the datacenter and at the edge. In fact, while we don’t know it yet, at some point late this year and early next year, we should see more than a few processors that are based on the Zeus and Perseus platforms that Arm Holdings has created to demonstrate the totality of its technology and to give actual chip makers a head start on designing their own chippery. At Tech Day, the hardware engineers in charge of these Neoverse designs were able to talk more deeply about the architecture of V1 and N2 and also provide some insight into expected performance of chips based on the Zeus and Perseus cores and the technology that Arm wraps around them to create an SoC.

    Last September, when Arm trotted out the Neoverse V1 design and made it available, the N2 design was not yet available. But as of this announcement, today, it is. Both the Ampere Computing Altra and the Amazon Web Services Graviton2 processors, which are the two production-grade Arm server chips in the market today, are based on N1 cores and platform designs, with various customizations. So when this chart says production, that’s what it means – some licensee is using it in production and it is being manufactured and either sold or used, depending. The N1 designs supported regular DDR4 memory or HBM2 stacked memory as well as PCI-Express 4.0 peripheral controllers and CCIX 1.0 interconnects for accelerators and to provide NUMA shared memory across processors. CCIX is one of many interconnects, in this case launched by Xilinx in May 2016 to provide cache coherent memory sharing between CPUs and accelerators.

    With the V1 platform, Arm is designing cores and the uncore regions of a hypothetical processor using either 7 nanometer or 5 nanometer processes, presumably either at Taiwan Semiconductor Manufacturing Corp or Samsung Electronics, which have fabs that can handle either. Intel, which has merchant foundry aspirations, could eventually get there. As could SMIC, the Chinese foundry, which launched its 7 nanometer efforts last fall. No other foundries in the world are doing 7 nanometer research or manufacturing. And 5 nanometer is going to be for the elite only. We shall see just how real 3 nanometer processes are, and after that, well. . . .

    Drilling Into The V1

    The V1 core is going to push the limits of core counts, clock speeds, and operations per second as the throughput engine. Everything is turned up to 11, and not because Arm wants to show off, but because some customers running search engines, machine learning training and inference, HPC simulation and modeling, and data analytics workloads need a monster to chew on their data. Also, the big public clouds want to have a big instance that they can carve into little instances but, importantly, also sell as one big, wonking, expensive instance to those who need it to run, say, the SAP HANA in-memory database in the cloud.

    The Zeus V1 core is providing 50 percent better single-threaded performance on integer workloads compared to the Ares N1 core, which is better than the 30 percent average per generation that Arm was promising. The V1 design has the Armv8-A implementation of the SVE vector engine, and in this case will support a pair of 256-bit wide vectors that can do Bfloat16 as well as a mix of floating point and integer operations in parallel. This will essentially match the AVX-512 vector unit in each Intel Xeon SP core for years and the pair of 256-bit FMA units in the AMD “Milan” Epyc 7003 core.

    Those wide vectors in the V1 cores are not quite as parallel as a GPU accelerator, but they run considerably faster and the performance delta is not as small as you might think. Slap some HBM2 memory on a V1 chip and it should run like a bat out of hell – the A64FX Arm chip from Fujitsu used in the “Fugaku” supercomputer proves the point.

    The Zeus V1 platform will support DDR5 main memory or HBM2E stacked memory for those who need high bandwidth, and supports PCI-Express 5.0 peripherals and the CCIX 1.1 protocol for accelerator and NUMA interconnects. That will put it on par, more or less depending on the ratios of these technologies, with the future “Sapphire Rapids” Xeon SPs from Intel and the “Genoa” Epyc 7004s from AMD. Chip sellers – we really can’t call them chip makers, since they use foundries and outside packagers – will have to choose very carefully between 7 nanometer and 5 nanometer processes, and we would not be surprised to see some chiplet implementations that use CCIX for chiplet interconnect and allow for either 7 nanometer or 5 nanometer cores and maybe 14 nanometer or 7 nanometer etching for the uncore regions where dropping the transistor size hurts as much as it helps because of leakage issues. It will be tough to make these calls, given the huge demand for chips and the limited capacity for either 7 nanometer or 5 nanometer manufacturing.

    The Zeus V1 is technically hewing to the Armv8.4 ISA and AMBA CHI.D on-chip interconnect specs, which means it supports the SVE vectors. In fact, this is Arm’s first homegrown SVE implementation, and it supports running the pair of 256-bit SVE units as a quad of 128-bit NEON accelerators.
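The vector-width figures quoted for the V1 in this excerpt can be checked with a few lines of arithmetic. This is only a sketch comparing raw datapath width; it says nothing about how many vector units a given Intel or AMD core has, or about clock speed. The variable names are illustrative.

```python
def lanes(vector_bits, element_bits):
    """Number of elements of a given size that fit in one vector."""
    return vector_bits // element_bits


sve_bits = 2 * 256      # pair of 256-bit SVE units
neon_bits = 4 * 128     # the same hardware viewed as a quad of 128-bit NEON units
avx512_bits = 512       # width of one AVX-512 vector

print(sve_bits, neon_bits, avx512_bits)   # 512 512 512

# Bfloat16 values are 16 bits wide, so each 256-bit SVE vector holds 16 of them.
bf16_lanes_per_cycle = 2 * lanes(256, 16)
print(bf16_lanes_per_cycle)               # 32
```

Both ways of slicing the V1's vector hardware come to 512 bits, which is the sense in which the excerpt says it "essentially matches" a single AVX-512 unit.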

    The net result is that the V1 core has 50 percent higher instructions per clock (IPC) than the N1 core at the same frequency in a 70 percent larger chip area, and if customers want to sacrifice a little performance on the clock speed they can radically reduce the thermals. We don’t expect customers buying server CPUs based on the V1 cores to do that. This is a muscle car, and it will run fast and furious. As intended.

    Tearing The Hood Off The N2

    That brings us to the Perseus N2 core and CPU design, which optimizes for performance per dollar and performance per watt rather than just pushing the performance limits at any cost as the V1 core and CPU does. If the V1 is a muscle car, the N2 is a crossover sport utility vehicle.

    Abernathy says that the front end on the N2 core is similar to that on the V1 core, but the core will implement the Armv9-A architecture, which has all kinds of interesting security features that are, frankly, of less use for exascale computing facilities. While the V1 design is aimed at CPUs with 32 to 128 cores with a thermal envelope of between 80 watts and 350 watts, the N2 cores are aimed at mainstream infrastructure servers that might have 12 to 36 cores and run at between 30 watts and 80 watts. That is not to say that there will not be N2 chips that push the core limits up and down. There will be some, and we think Ampere Computing, AWS, and possibly Nvidia will use N2 cores in some devices. (It is very unlikely that Ampere and AWS will use a V1 core in their respective Altra or Graviton chips.)

    The N2 is really an upgrade to the N1, with 40 percent higher IPC at constant frequency, with around the same power draw and about the same area as the N1 but allowing a 10 percent increase in clock speed and presumably more cores and caches thanks to the shrink to 5 nanometers.

    The N2 core will take up 30 percent more area and burn 45 percent more power to deliver that 40 percent higher throughput, and importantly, the N2 core will be 25 percent smaller than the V1 core so you can cram more of them into a given die size. Those fat vectors and fat caches don’t come free. Nothing in CPU architecture does. And that – in addition to all of the security features in the Armv9-A architecture – is why we expect cloud builders to want N2 designs more than V1 designs. And we would not be surprised if they (or their chip partner if they are not designing their own chips as AWS and maybe Microsoft are doing) push the core limits above 128 cores using a chiplet design, again using CCIX as a chiplet interconnect and possibly breaking out an I/O and memory hub like AMD does with its Epyc X86 server CPUs.
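Taking the excerpt's figures at face value, the area trade-off between the two cores can be worked out directly. The sketch below normalizes everything to the N1 core (performance 1.0, area 1.0); the dictionaries and function name are illustrative only, and the input numbers are the ones quoted above, not measurements.

```python
# Performance and area relative to the N1 core, as quoted in the excerpt.
N1 = {"perf": 1.0, "area": 1.0}
V1 = {"perf": 1.5, "area": 1.7}   # 50% higher IPC in 70% more area
N2 = {"perf": 1.4, "area": 1.3}   # 40% higher throughput in 30% more area


def perf_per_area(core):
    """Throughput delivered per unit of die area, relative to N1."""
    return core["perf"] / core["area"]


size_ratio = N2["area"] / V1["area"]     # N2 area as a fraction of V1 area
cores_per_v1_slot = 1 / size_ratio       # N2 cores that fit where one V1 fits

print(round(perf_per_area(V1), 3))       # 0.882
print(round(perf_per_area(N2), 3))       # 1.077
print(round(size_ratio, 3))              # 0.765
print(round(cores_per_v1_slot, 2))       # 1.31
```

On these numbers the N2 comes out roughly 23.5 percent smaller than the V1, in line with the "25 percent smaller" claim, and it delivers more throughput per unit of area than either the N1 or the V1. That is the argument the excerpt makes for cloud builders preferring it.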

    That’s what we would do, and perhaps in a single socket design to really drive down the system cost and drive up the size of the cloud instance and the number of slices you can carve it into.

    That is for a 32-core monolithic chip, with four to eight DDR5 memory channels running at 5.6 GHz (yeah!) and twelve ports for NUMA expansion or to use as CXL ports. This reference is not pushing any limits.

  10. Tomi Engdahl says:

    Jon Fingas / Engadget:
    ARM unveils Cortex-X2 CPU, which it says offers 30% more performance than current high-end Android phones, and Mali-G710 for Chromebooks and high-end phones. Now that ARM has unveiled its first new chip architecture in a decade, it’s ready to show the CPU designs that will take advantage of those improvements.

    ARM’s first v9 CPUs are built for computers, not just phones
    The Cortex-X2 and Mali-G710 are ready for ARM-based PCs.
    https://www.engadget.com/armv9-cortex-x2-cpu-mali-g710-gpu-130055653.html?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cudGVjaG1lbWUuY29tLw&guce_referrer_sig=AQAAAByhNpY7Z9dr0E6Cc5Q72cmQTspEdwcFgRMdK3pZVztrTyRVR_js7gbefYDWfg-vwv5hC0mTIDBdsXqMp66e0do1EZ_Re_GlJEnpnGKpm-6ZMVHgPKcrSLcPHNxe_C9nFQ6LqmSj0S3yAGkdT3Az84pnIF5dHg_O1by4WhG2Lmr4

    Now that ARM has unveiled its first new chip architecture in a decade, it’s ready to show the CPU designs that will take advantage of those improvements. The company has unveiled a host of new Cortex CPUs (and companion Mali GPUs) that it hopes will power laptops, other computers and wearables in addition to the next wave of smartphones.

    The flagship is the ARM Cortex-X2, a CPU core meant to scale from “premium” smartphones to laptops. It reportedly offers a 30 percent performance boost over current high-end Android phones, although ARM didn’t provide more details.

    You’ll also see gains for more mainstream uses. The Cortex-A710 is the first ARMv9 “big” core (meant for big.LITTLE chips) and is about 10 percent faster than the Cortex-A78 while delivering 30 percent greater efficiency. Cortex-A510, meanwhile, is the first new “LITTLE” high-efficiency core in four years and should offer 35 percent better overall performance and triple the speed for machine learning. ARM claims the A510 is nearly as fast as high-performance chips from a few years ago, making it a viable option for watches and smart home tech in addition to lower-end phones.

    ARM is finally dragging the rest of the industry into the 64-bit era, too. It’s promising that all “big” and “LITTLE” cores will be 64-bit by 2023, and its partners are helping put an end to 32-bit apps before 2021 is over. There’s a good chance you’ve been using 64-bit phones and apps for a while, but this should push stragglers to catch up.

    Like the Cortex CPUs, the Mali GPUs are aimed at more than just phones. The flagship Mali-G710 is about 20 percent faster for intensive tasks (35 percent for machine learning) and is aimed at Chromebooks in addition to high-end phones.

  11. Tomi Engdahl says:

    Arm just unveiled its first CPUs based on the Armv9 architecture, but you’ll have to wait a while to see them.

    Arm unveils Armv9 Cortex CPUs that will power the next generation of smartphones and PCs
    Arm has big plans for its Armv9 architecture, and it all starts with the CPUs announced today.
    https://www.windowscentral.com/arm-first-armv9-cpus?utm_source=wc_fb&utm_medium=fb_link&utm_content=85287&utm_campaign=social

    What you need to know
    Arm announced new CPUs based on its Armv9 architecture.
    The company also announced several Mali GPUs.
    Arm also specified that all Cortex-A CPU mobile cores will be 64-bit by 2023.

  12. Tomi Engdahl says:

    https://etn.fi/index.php/13-news/12197-android-puhelimista-tulee-taysin-64-bittisia

    Arm has introduced its first processor cores based on the new v9 architecture. The Cortex name stays. The Cortex-X2, Cortex-A710 and Cortex-A510 are coming to the application processors of next-generation smartphones. Mali graphics also get more performance. The entire Arm architecture will move to 64-bit during next year at the latest.

    According to Arm, the new CPU cores are designed to bring ever more performance to different classes of devices. The Cortex-X2 is the successor to the X1 announced last year and delivers 16 percent more performance when the chip is made on the same process and runs at the same clock frequency.

    The Cortex-A710, in turn, is the first ”big” core of the v9 generation. Performance is 10 percent better than that of its A78 predecessor, but energy efficiency has improved by 30 percent. This bodes well for phone battery life.

    The Cortex-A510 is Arm’s first energy-efficient processor core in more than four years. Its role in the phones’ LITTLE+big architecture is to run tasks that do not need the heaviest computing power.

  13. Tomi Engdahl says:

    Arm introduces its first Armv9 architecture CPUs and GPUs, previewing 2022’s Android flagships
    The Cortex-X2, Cortex-A710, and Mali-G710 will power the smartphones of tomorrow
    https://www.theverge.com/2021/5/25/22448107/arm-armv9-architecure-cortex-x2-a710-cpu-mali-g710-gpu-android-2022-designs

    Arm has introduced its latest CPU and GPU designs, including its flagship Cortex-X2 and Cortex-A710 CPUs and Mali-G710 GPU. The new CPU and GPU designs aren’t just Arm’s latest chip blueprints, though; they’re also its first designs to utilize its new Armv9 architecture, its first in a decade, which means big jumps in performance, along with new security and AI features.

    Most consumers probably aren’t familiar with the exact Arm cores inside their phones or computers, but Arm’s designs — and particularly, its big.LITTLE configuration of combining powerful high-performance cores and battery-saving high-efficiency cores — are common to virtually every Android phone. That means the designs introduced here are effectively a preview of what the best Android phones of 2022 will look like.

    Arm is introducing three new CPU designs this year. First, the Cortex-X2: it’s part of Arm’s Cortex-X custom program that lets partners help design specialized cores for their specific use cases. A successor to last year’s Cortex-X1, it’s also the most powerful design in the lineup, promising up to 16 percent improved performance compared to last year’s model. There’s also the Cortex-A710, the new “big” core, promising up to 30 percent better power efficiency and 10 percent better performance than last year’s Cortex-A78.

    But Arm isn’t just upgrading the performance cores. For the first time in four years, it’s also introducing a new “LITTLE” high-efficiency core, the Cortex-A510, which replaces the Cortex-A55 design that’s been used for major phones since it was introduced in 2017. And it’s here that Arm’s promising the biggest jumps: up to 30 percent better performance, and 20 percent better power efficiency compared to the old model.

    The new Armv9 designs together should result in a big jump in performance once they make their way to chips, too. Qualcomm’s Snapdragon 888, for example, uses partially customized versions of last year’s flagship Arm Cortex-X1 and Cortex-A78 as its four “big” cores and the roughly four-year-old Cortex-A55 design for its “LITTLE” cores. Samsung’s flagship Exynos 2100 uses a similar configuration, too, along with Arm’s Mali-G78 GPU design.

    The new CPU core designs promise to blow both of those chips out of the water: Arm says that a CPU cluster made up of the Armv9 designs (a single Cortex-X2, three Cortex-A710 cores, and four Cortex-A510 cores) should offer up to 30 percent better peak performance (thanks to the Cortex-X2), 30 percent better overall efficiency (from the Cortex-A710), and 35 percent better “LITTLE” performance from the Cortex-A510 when compared to a comparable Armv8.2 cluster like the ones mentioned above.

    Arm is also introducing three new GPUs. There’s the flagship Mali-G710, which promises 20 percent better gaming performance and 20 percent better power efficiency; the Mali-G510, designed as a midrange option for more affordable devices; and the entry-level Mali-G310.

    So it’ll likely be the beginning of 2022 before the new Arm CPU and GPU designs appear in any phones — and that’s assuming the global semiconductor shortage doesn’t push out next year’s products even further.

  14. Tomi Engdahl says:

    Taiwan’s MediaTek to deliver its first Armv9 chip by year’s end
    MediaTek ships hundreds of millions of Arm-based chips per year: Arm CEO
    https://www.taiwannews.com.tw/en/news/4214841

    MediaTek is set to deliver its first commercially available Armv9 chip by the end of this year, according to Arm CEO Simon Segars.

    Speaking during a keynote speech at the virtual Computex 2021, Segars described MediaTek as a long-time partner. The Taiwanese company ships hundreds of millions of Arm-based chips a year in addition to using the British company’s Total Compute Solutions to expand the reach of mobile devices and enter new markets, Segars said.

    Armv9 focuses on security while offering improved performance, digital signal processing, and machine learning capabilities, according to Digitimes.

    New processor cores offered by Arm include the Cortex-X2, Cortex-A710, and Cortex-A510 CPU designs. According to the British semiconductor company, the Armv9 Cortex CPUs are meant for a wide range of consumer devices made for several different workloads and use cases.

    MediaTek said its first smartphone product using Armv9 technologies will be ready for the market by year’s end, according to the report. “The Armv9 architecture will play a role as we design next-generation Dimensity 5G products with new capabilities, features, and user experiences,” the company said.

  15. Tomi Engdahl says:

    Taiwan’s MediaTek opens up top 5G chip for customization
    Brands can now customize multimedia, multiprocessing, AI, cameras, connectivity
    https://www.taiwannews.com.tw/en/news/4236220

    MediaTek is opening up the architecture of its high-end 5G Dimensity 1200 chips, allowing phone makers to customize several elements for their smartphones.

    Under a program dubbed the Dimensity 5G Open Resource Architecture, MediaTek will give brands the ability to tweak five dimensions of the Dimensity 1200 to suit their preferences. The new initiative will allow phone manufacturers to customize multimedia, multiprocessing, AI, cameras, and connectivity.

  16. Tomi Engdahl says:

    Arm Introduces Its Confidential Compute Architecture
    https://fuse.wikichip.org/news/5699/arm-introduces-its-confidential-compute-architecture/

    When Arm announced the Armv9 architecture earlier this year, they revealed a new architecture for confidential compute that would be introduced. Today Arm is introducing the new architecture. The Arm Confidential Compute Architecture (CCA) is an isolation technology that builds on Arm’s existing TrustZone technology as its foundation. Within the new CCA architecture, application data is designed to be protected while in use. This includes the prevention of access to private data even from privileged software such as the hypervisor or the operating system. What Arm is introducing today is the first release of the behavior architecture specification for a compliant implementation of the architecture. Today’s spec introduction is designed to seed the software community and kickstart software enablement. Arm expects around 2-3 years until things reach production readiness and we can expect to see it shipping in silicon.

    Note that the Confidential Compute Architecture is part of the new Armv9 architecture. Specifically, it will be an optional feature of Armv9.2 and will likely eventually become a required extension in a future specification. Software developers will be able to check for CCA support by checking if the Realm Management Extension (RME) feature is present on the CPU (more on this later).
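
    As a rough illustration of the feature check mentioned above, the sketch below decodes an RME field from a raw ID register value. It assumes the field sits in bits [55:52] of ID_AA64PFR0_EL1 and that any non-zero value means RME is implemented; the quoted article does not say where the flag lives, so treat that layout as an assumption to verify against Arm's register documentation. The function names and sample values are made up, and obtaining the raw register value (which normally needs privileged code) is out of scope.

```python
# Hypothetical helper: decode the RME field from a raw ID_AA64PFR0_EL1 value.
# Assumption (not stated in the article): RME occupies bits [55:52] and a
# non-zero value means the Realm Management Extension is implemented.

RME_SHIFT = 52
RME_MASK = 0xF


def rme_version(id_aa64pfr0_el1: int) -> int:
    """Return the 4-bit RME field (0 means RME is not implemented)."""
    return (id_aa64pfr0_el1 >> RME_SHIFT) & RME_MASK


def supports_cca(id_aa64pfr0_el1: int) -> bool:
    """True when the RME field reports any implemented version."""
    return rme_version(id_aa64pfr0_el1) != 0


if __name__ == "__main__":
    # Made-up register values, for illustration only.
    without_rme = 0x0000_0000_1111_2222
    with_rme = without_rme | (1 << RME_SHIFT)
    print(supports_cca(without_rme))  # False
    print(supports_cca(with_rme))     # True
```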

    Realms

    Previously, a high-trust environment was only accessible to silicon vendors and OEMs through things such as TrustZone. With the new Arm CCA, high-trust environments are being extended to all developers with the hope that they will be used by mainstream workloads. While initially this will be implemented by OS-related software, Arm hopes that Realms will find their way into mainstream software such as smartphone apps (more on this later in this article). To that end, the new confidential compute architecture is designed to suit all markets from cloud to mobile, automotive, and IoT.

    At the heart of the Confidential Compute Architecture are “Realms”. Realms are effectively small, individual high-trust environments or enclaves. The Confidential Compute Architecture is designed to prevent private data access across the entire stack right from the silicon. To that end, the CCA provides support for strong protection between mutually distrusting workloads, strong protection against compromised rich operating systems (e.g., Linux, Windows), strong protection against compromised hypervisors, and new protection from applications running within the secure world. In other words, a Realm need not trust anyone. The CCA also supports attestation so that a Realm can verify trust in the device or platform and then present an attestation report to a relying party that can then independently verify trust.
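
    The attestation flow described above can be pictured as a simple challenge-response exchange. This is only a toy model: real CCA attestation relies on signed tokens rooted in hardware, whereas the sketch substitutes an HMAC over a shared key for the signature. All names (`measure`, `make_report`, `verify_report`) are hypothetical.

```python
import hashlib
import hmac
import os


def measure(image: bytes) -> str:
    """Hash the realm's initial contents to obtain its measurement."""
    return hashlib.sha256(image).hexdigest()


def make_report(image: bytes, nonce: bytes, key: bytes) -> dict:
    """Platform side: bind the measurement to the verifier's fresh nonce."""
    measurement = measure(image)
    payload = measurement.encode() + nonce
    # Stand-in for a hardware-backed signature (assumption: shared HMAC key).
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"measurement": measurement, "nonce": nonce, "tag": tag}


def verify_report(report: dict, expected: str, nonce: bytes, key: bytes) -> bool:
    """Relying party: check authenticity, freshness, then the measurement."""
    payload = report["measurement"].encode() + report["nonce"]
    good_tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(good_tag, report["tag"])
        and report["nonce"] == nonce
        and report["measurement"] == expected
    )


if __name__ == "__main__":
    key = os.urandom(32)      # placeholder for the platform's attestation key
    nonce = os.urandom(16)    # fresh challenge chosen by the relying party
    image = b"realm workload image"
    report = make_report(image, nonce, key)
    print(verify_report(report, measure(image), nonce, key))        # True
    print(verify_report(report, measure(b"tampered"), nonce, key))  # False
```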

    A slide in the source article (not reproduced here) shows a high-level overview of how Arm’s CCA is designed to work. It adds two new states beyond the usual ‘Normal’ and ‘Secure’ states, called ‘Realm’ and ‘Root’. The new states facilitate the new realm features.

    In that slide, Realms are drawn as yellow boxes and TrustZone applications as green ones. When realms are created, with the help of the hypervisor, they migrate to the Realm state. The new realm enclaves cannot be accessed by the hypervisor, by any of the TrustZone applications, or even by other realms.

    Realms can be created and destroyed dynamically on demand. In striking contrast to TrustZone, resources dedicated to realms, such as memory, can also be adjusted on demand (i.e., memory size can be increased or decreased as needed). The idea here is that by moving large software workloads into their own private realms, applications can have much greater confidence that the data they process and the algorithms they use cannot be accessed by other software and services running on the same hardware. A big part of the CCA is protecting realms from the underlying hypervisor. So while the hypervisor is still responsible for allocating resources and for things such as scheduling, it is no longer able to access the data within a realm.

    As for how many realms can be supported, a system can support any number of realms, limited only by the available system resources (e.g., memory, compute power). For example, small IoT devices might have just a handful of realms while a large server SoC might be running hundreds of realms. Creating and destroying realms is meant to be a light enough operation to be used by any application as desired.

    Realm Management Extension (RME)

    The architecture specification being released today includes both the hardware requirements and the enabling firmware and software. As we mentioned earlier, the Arm CCA feature is being introduced as an optional feature of the Armv9.2 ISA. Hardware that supports the confidential compute architecture will have the Realm Management Extension (RME) available. At its core, processors that implement the Realm Management Extension have two new hardware capabilities: the creation of realms and dynamic memory assignment.

    When looking at a typical Arm processor with TrustZone support, the system is partitioned into two regions: one is a secure world and the other is a non-secure world (or normal world). Memory mapping divides the secure world from the non-secure world. Historically, one of the limitations that can impact the use of TrustZone is that memory has to be allocated to the secure world by the Monitor at boot time. And this is often done with limited granularity. This imposes various artificial constraints on the kind/size of resources that may be allocated to the secure world. With the new Realm Management Extension, pages can now transition from the non-secure world to the secure world and back again. This allows TrustZone to be utilized for much more memory-intensive applications. As a side note, when RME is being used exclusively for the enhancement of TrustZone (i.e., no Realms) with dynamic memory capabilities, this specific feature is now being called “Arm Dynamic TrustZone Technology.”

    New RME States: Realm & Root

    Under the current architecture (Armv8.4-SecEL2), there are two worlds: Secure and Non-Secure (Normal). Each world is associated with its own security state (secure/non-secure) and a physical address space. The secure world is protected from the normal world at exception level 2 and above. Any attempt to access secure memory address space from the normal world will generate a hardware exception, halting execution. Software running within the secure world is able to access both secure and normal world memory.

    Under the new RME, two new security states have been added: Root and Realm. Additionally, two new physical address spaces (also called Root and Realm) have been added. The lowest level of the hardware stack – the Monitor – now gets its own private address space and state called ‘Root’. The new Root address space is protected from all other address spaces, even the secure world.
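
    The access rules for the four states can be summarized as a small lookup table. The Secure and Non-secure rows follow the text above. The Realm and Root rows are an assumption based on how Arm's public RME material is usually summarized (Realm sees Realm and Non-secure memory, Root sees everything) and should be checked against the specification. The names used here are illustrative only.

```python
# Which physical address spaces (PAS) each security state may access.
ACCESSIBLE_PAS = {
    "non-secure": {"non-secure"},
    "secure": {"secure", "non-secure"},
    "realm": {"realm", "non-secure"},                    # assumption
    "root": {"root", "realm", "secure", "non-secure"},   # assumption
}


def can_access(state: str, pas: str) -> bool:
    """Return True if code running in `state` may touch memory in `pas`."""
    return pas in ACCESSIBLE_PAS[state]


if __name__ == "__main__":
    for state in ACCESSIBLE_PAS:
        allowed = sorted(p for p in ACCESSIBLE_PAS if can_access(state, p))
        print(f"{state:>10}: {', '.join(allowed)}")
```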

    The new Realm Management Extension provides the ability to dynamically transition pages of memory between these physical address spaces. Arm calls those pages Granules. In order for the processor to support the new dynamic transitioning of memory pages between the various address spaces, Arm added a new table called the Granule Protection Table or GPT. The GPT is an extension of the MMU page tables that are controlled by the Monitor in EL3.
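
    A toy model may help picture the Granule Protection Table: a map from each granule to the physical address space it currently belongs to, which only the Monitor may change. The class and method names are hypothetical, the 4 KiB granule size is an assumption, and real hardware uses in-memory tables rather than a Python dictionary.

```python
GRANULE_SIZE = 4096  # assumption: 4 KiB granules


class GranuleProtectionFault(Exception):
    """Raised when an access targets a granule owned by another PAS."""


class GranuleProtectionTable:
    def __init__(self, num_granules: int):
        # In this model every granule starts out in the non-secure space.
        self._owner = {g: "non-secure" for g in range(num_granules)}

    def transition(self, caller: str, granule: int, new_pas: str) -> None:
        """Reassign a granule; here only the Monitor (root) may do so."""
        if caller != "root":
            raise PermissionError("only the Monitor may update the GPT")
        self._owner[granule] = new_pas

    def check(self, address: int, access_pas: str) -> None:
        """Fault unless the granule holding `address` belongs to `access_pas`."""
        granule = address // GRANULE_SIZE
        owner = self._owner[granule]
        if owner != access_pas:
            raise GranuleProtectionFault(
                f"granule {granule} belongs to {owner}, not {access_pas}"
            )


if __name__ == "__main__":
    gpt = GranuleProtectionTable(num_granules=4)
    gpt.check(0x1000, "non-secure")       # allowed: granule 1 is non-secure
    gpt.transition("root", 1, "realm")    # Monitor hands granule 1 to a realm
    try:
        gpt.check(0x1000, "non-secure")   # the normal world is now locked out
    except GranuleProtectionFault as fault:
        print("fault:", fault)
```

    Moving a granule back to the non-secure space is the same `transition` call with the arguments reversed, which mirrors the "and back again" behaviour described for Dynamic TrustZone.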

    Note that any memory assigned to the Normal, Secure, Root, and Realm worlds is encrypted by the hardware prior to being written to DRAM.

    Realm Management Monitor (RMM)

    As part of the Arm CCA, a new firmware/software architecture is also defined. A new Realm Management Monitor (RMM) is defined which provides services for the hypervisor as well as to the realms themselves. This is done via a new Realm Management Interface (RMI). Services for the hypervisor include the creation and destruction of realms as well as adding and removing memory.
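
    The hypervisor-facing services listed above (creating and destroying realms, adding and removing memory) can be modelled roughly as follows. The class and method names are invented for illustration and are not the actual RMI commands; memory is counted in abstract granules.

```python
class RealmManagementMonitor:
    """Toy stand-in for the RMM; the hypervisor calls these methods."""

    def __init__(self, free_granules: int):
        self.free_granules = free_granules
        self._realms = {}      # realm id -> number of granules assigned
        self._next_id = 1

    def create_realm(self) -> int:
        """Create an empty realm and return its identifier."""
        realm_id = self._next_id
        self._next_id += 1
        self._realms[realm_id] = 0
        return realm_id

    def add_memory(self, realm_id: int, granules: int) -> None:
        """Move granules from the free pool into the realm."""
        if granules > self.free_granules:
            raise MemoryError("not enough free granules")
        self.free_granules -= granules
        self._realms[realm_id] += granules

    def remove_memory(self, realm_id: int, granules: int) -> None:
        """Return granules from the realm to the free pool."""
        if granules > self._realms[realm_id]:
            raise ValueError("realm does not own that many granules")
        self._realms[realm_id] -= granules
        self.free_granules += granules

    def destroy_realm(self, realm_id: int) -> None:
        """Tear down the realm and reclaim all of its memory."""
        self.free_granules += self._realms.pop(realm_id)

    def realm_size(self, realm_id: int) -> int:
        return self._realms[realm_id]


if __name__ == "__main__":
    rmm = RealmManagementMonitor(free_granules=16)
    realm = rmm.create_realm()
    rmm.add_memory(realm, 8)       # grow the realm on demand
    rmm.remove_memory(realm, 3)    # shrink it again
    print(rmm.realm_size(realm), rmm.free_granules)  # 5 11
    rmm.destroy_realm(realm)
    print(rmm.free_granules)       # 16
```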

  17. Tomi Engdahl says:

    Arm Launches Its New Flagship Performance Armv9 Core: Cortex-X2
    https://fuse.wikichip.org/news/5269/arm-launches-its-new-flagship-performance-armv9-core-cortex-x2/

    Arm Launches The DSU-110 For New Armv9 CPU Clusters
    https://fuse.wikichip.org/news/5270/arm-launches-the-dsu-110-for-new-armv9-cpu-clusters/

    Today Arm is introducing a broad range of client IPs including a new little Armv9 CPU, a new big Armv9 CPU, a new flagship performance Armv9 CPU, and new Mali GPUs. But to fully take advantage of the new cores, a new cluster must be designed to better interface between the new IPs. Arm’s new DSU-110 was designed for this very purpose.

  18. Tomi Engdahl says:

    https://www.livemint.com/technology/tech-news/arm-unveils-new-chip-designs-that-will-power-mobile-processors-of-the-future-11622012502136.html

    The company announced the ARM Cortex-X2, Cortex-A710 and Cortex-A510 core designs, which will become the blueprint for processors that run on smartphones, smart home devices, and possibly even laptops that come in 2022.

    Qualcomm’s custom Kryo architecture (which runs on Snapdragon chips), Samsung’s Mongoose (running on Exynos chips), and even Apple’s Bionic chips (which are used for its iPhone processors and the new Apple M1 PC processor) are all based on ARM’s designs.

  19. Tomi Engdahl says:

    First Armv9 chip to debut by year-end
    https://www.digitimes.com/news/a20210601PD212.html

    The first chip based on Arm’s new v9 architecture will be rolled out by MediaTek at the end of this year, according to Arm CEO Simon Segars.

  20. Tomi Engdahl says:

    ARM’s powerful new cores for Samsung’s next flagship Exynos chip unveiled
    https://www.sammobile.com/news/arm-powerful-new-cores-announced-samsung-exynos

    Semiconductor company Arm has announced its new CPU and GPU designs that will power next year’s flagship Samsung Exynos chip. Arm’s core architecture is getting a major upgrade for the first time in a decade – the ARMv8 architecture that’s been seen in pretty much every Android chipset over the last ten years is being replaced by ARMv9, which brings with it more powerful and efficient processor cores.

    The Exynos 2100 and Snapdragon 888 chipsets use a big.LITTLE setup that includes a single ultra-high-performance Cortex-X1 core, three high-performance Cortex-A78 cores, and four power-efficient Cortex-A55 cores, and all of these cores are now being upgraded with new designs. There’s the Cortex-X2, which promises 16% higher performance compared to the X1 and twice the machine learning performance. The Cortex-A710, meanwhile, promises 30% higher efficiency and 10% better performance over the A78.

    Coming to a Samsung Exynos chip near you… with an AMD GPU?

    According to Arm, these new cores can be used in a similar configuration as before, so we should see one Cortex-X2, three Cortex-A710, and four Cortex-A510 cores in whatever new flagship chipsets Samsung and Qualcomm will come up with next year. Arm has also introduced three new GPUs, with the flagship Mali-G710 promising up to 20% better gaming performance over the Mali-G78 found inside the Galaxy S21’s Exynos variant.

    However, the way Samsung has been teasing things this year, the Korean giant could switch to an AMD GPU for its next flagship Exynos chip. Rumors suggest the Samsung and AMD partnership will first bear fruit with a laptop later this year, and while there is no guarantee that Samsung will have an AMD GPU ready for its phones by 2022, it is looking highly likely given how long the two companies seem to have been working together.

    Even if the AMD GPU isn’t ready by 2022, we should see faster performance and better efficiency on Samsung’s flagship phones (whichever form they might take), thanks to the new Arm CPU and GPU designs that are bound to be used in Samsung and Qualcomm’s next-generation chips.

  21. Tomi Engdahl says:

    Lauterbach Offers Full Debug Support for Armv9
    https://www.lauterbach.com/armv9-debugging-tools.html

    Lauterbach announces full debug and trace support for the Arm®v9 architecture.

    The Armv9 architecture is the successor to Armv8, which introduced 64-bit computing to the Arm platform. Armv9 will deliver greater performance, enhanced security and digital signal processing (DSP) and machine learning (ML) capabilities for the next generation of Arm-based chips.

    Says Norbert Weiss, Managing Director of Lauterbach GmbH, “Lauterbach have always worked very closely with Arm and this relationship has allowed us to consistently provide high quality tools, often before the first silicon is available. By adding support for the upcoming Armv9 architecture, Lauterbach is re-affirming their commitment to ongoing support for all Arm cores.”

  22. Tomi Engdahl says:

    The successor to Qualcomm’s Snapdragon 888 will have Arm’s new v9 CPU designs
    https://www.xda-developers.com/qualcomm-snapdragon-888-successor-arm-v9-cpu/

    Thanks to the ongoing global chip shortage, chip design firm Qualcomm is struggling to meet the demand for its premium-tier Snapdragon 888 chip, so they’ve resorted to launching products like the Snapdragon 860 and 778 in recent months. However, that doesn’t mean they aren’t still working on their next major chipset. We already know that Qualcomm is working on a new chipset aimed at high-performance laptops after they acquired Nuvia earlier this year, but we don’t expect products based on that design to launch until late next year at the earliest. Meanwhile, we now have the first details of Qualcomm’s next premium-tier chipset for mobile devices.

    Famed leaker Evan Blass took to Twitter today to share some details on “SM8450”, the presumed part number for Qualcomm’s “next-gen premium system-on-chip.” The Snapdragon 888’s part number was “SM8350”, which is why we expect “SM8450” to be its successor. Given the lack of consistency in Qualcomm’s chip naming process, we have no idea what “SM8450” will be marketed as. Nonetheless, we now know what to expect from it thanks to a list of “key components” that Blass shared on Twitter.

    According to Blass, SM8450 will integrate Qualcomm’s Snapdragon X65 5G modem-RF system. The Snapdragon X65 is the successor to the Snapdragon X60 modem integrated into the Snapdragon 888. The modem is built on a 4nm process just like the AP. Phones built on the SoC can support connecting to mmWave or sub-6GHz 5G frequencies on either non-standalone or standalone 5G networks.

    The CPU consists of Qualcomm Kryo 780 cores “built on Arm Cortex v9 technology.” The Armv9 architecture was announced earlier this year, and the first CPU designs to be announced using the new technology were the Cortex-X2, Cortex-A710, and Cortex-A510. Thus, we expect the Snapdragon 888’s successor to be using these three CPU core designs, likely in a 1 + 3 + 4 configuration (1x Cortex-X2, 3x Cortex-A710, 4x Cortex-A510).

  23. Tomi Engdahl says:

    https://www.anandtech.com/show/16693/arm-announces-mobile-armv9-cpu-microarchitectures-cortexx2-cortexa710-cortexa510

    ARM Announces First Armv9 Based CPUs and GPUs Which Will Power 2022 Android Phones
    https://www.mysmartprice.com/gear/arm-announces-first-armv9-based-cpus-and-gpus-which-will-power-2022-android-phones/

    Armv9 is the first new architecture the chip company has introduced in a decade, and the new chips set the stage for future smartphones.

  24. Tomi Engdahl says:

    ARM processors for servers: “Neoverse” computing cores vs. Intel Xeon and AMD Epyc
    April 28, 2021
    https://techbeezer.com/switzerlandeng/arm-processors-for-servers-neoverse-computing-cores-vs-intel-xeon-and-amd-epyc-2/

    According to ARM, the Neoverse V1 cores (Zeus) with SVE are primarily intended for supercomputers, i.e., for high performance computing (HPC). The SVE-equipped Fujitsu A64FX in Fugaku, the current leader of the TOP500 list, points the same way. The EU processor Rhea, developed by SiPearl for the European Processor Initiative (EPI), therefore uses Neoverse V1 cores (among other things), as does an upcoming HPC processor from the Indian Ministry of Electronics and Information Technology (MeitY).

    With Neoverse N2 (Perseus), HPC computing power is not the priority; the focus is on general-purpose servers. Neoverse N2 is the first ARM server core with the ARMv9 architecture, which also includes the slimmer SVE implementation SVE2. According to ARM, Marvell, for example, is currently developing a new generation of Octeon network processors with Neoverse N2.

    Bfloat16, CMN-700 and CXL

    The SVE(2) units of Neoverse V1 and N2 also process the AI data format Bfloat16 (BF16). In addition, ARM announces the improved Coherent Mesh Network CMN-700 for the internal interconnection of computing cores, memory controllers and interfaces in SoCs. Not only does the CMN-700 achieve significantly higher data rates than its predecessor, but it also connects the coherent CXL interface for compute accelerators.
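
    For readers unfamiliar with Bfloat16: it keeps the sign bit and 8-bit exponent of an IEEE 754 single-precision float but only 7 mantissa bits, so it is essentially the top half of a float32. The sketch below converts by rounding to nearest even. The function names are hypothetical, NaN handling is omitted for brevity, and this illustrates the number format only, not how Neoverse hardware implements it.

```python
import struct


def float_to_bf16_bits(value: float) -> int:
    """Round a float to bfloat16 and return its 16-bit pattern."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Round to nearest, ties to even, before dropping the low 16 bits.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(bits16: int) -> float:
    """Expand a bfloat16 bit pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits16 << 16))[0]


if __name__ == "__main__":
    print(hex(float_to_bf16_bits(1.0)))                     # 0x3f80
    print(bf16_bits_to_float(float_to_bf16_bits(3.14159)))  # 3.140625
```

    The second line of output shows the cost of the format: only two to three significant decimal digits survive, which is usually acceptable for neural-network weights and activations.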

  25. Tomi Engdahl says:

    Arm v9 promises ray tracing for smartphones and a big performance boost
    Applications will be protected in hardware, and you’ll be able to game, PC-style, on your smartphone.
    https://www.pcworld.com/article/3613514/arm-v9-promises-ray-tracing-for-smartphones-and-a-big-performance-boost.html

    Arm said Tuesday that ray tracing and variable rate shading will migrate from the PC to Arm-powered smartphones and tablets as part of Armv9, the next-generation CPU architecture that the company expects will power the next decade of Arm devices. Chips based upon the v9 architecture will be released in 2021, providing an estimated 30-percent improvement in performance over the next two Arm chip generations and the devices that run them.

    Arm’s v9 will also add SVE2, new AI-specific instructions that will probably be used for the AI image processing used on smartphones, such as portrait mode. Arm v9 will also include what Arm is calling Realms, a hardware container of sorts specifically designed to protect virtual machines and secure applications.

  26. Tomi Engdahl says:

    First Armv9 cores unveiled – Cortex-A510, Cortex-A710, Cortex-X2
    Posted on May 26, 2021 by Jean-Luc Aufranc (CNXSoft)
    https://www.cnx-software.com/2021/05/26/armv9-cores-cortex-a510-cortex-a710-cortex-x2/

  27. Tomi Engdahl says:

    compiler support for SVE2 for GCC and LLVM.

  28. Tomi Engdahl says:

    New ARM processor featuring CHERI architecture:

    “Several years ago, researchers at the University of Cambridge, in collaboration with Arm, developed an experimental architecture called [CHERI—capability hardware-enhanced RISC instructions](https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/)—which uses 64-bit Armv8-A to address memory safety, particularly in programming languages.”

    Arm Puts Security Architecture to the Test With New SoC and Demonstrator Board
    https://www.allaboutcircuits.com/news/arm-puts-security-architecture-to-test-with-new-soc-and-demonstrator-board/
