Today ARM announces the new Cortex A32, an ultra-low-power, high-efficiency processor IP. For some readers this might come as a surprise, as it has only been a few months since the announcement of the Cortex A35, which was presented as a replacement for the Cortex A7 and A5. This leaves us with the question of where the Cortex A32 positions itself, both against past IPs such as the A7 and A5 and against the A35.

The answer is rather simple: it's still a replacement for the A7 and A5, but it targets even lower-power use-cases than the A35 was designed for. While ARM sees the A35 as the core for the next billion low-end smartphones, the A32 seems to be more targeted at the embedded market. In particular, it's the "Rich Embedded" market that ARM seems to be excited about. The differentiation lies between use-cases that require a full-fledged MMU, and are thus able to run full operating systems based on Linux, and those that don't and could make do with a simpler microcontroller based on one of ARM's Cortex-M profile IPs. It's also worth mentioning that although last time we claimed that the A35 would serve the IoT market, ARM seems to see wearables and similar devices as part of the "Rich Embedded" umbrella term, and thus it now seems more likely that the A32 will be the core that powers such designs.

This leads us to the mystery of what exactly the A32 is. During the briefing, the only logical question that came to mind was: "Is this an A35 with 64-bit 'slashed off'?" While ARM chuckled at my oversimplification, they agreed that from a very high-level perspective it could be considered an accurate description of the A32.

In more technical terms, the A32 is a 32-bit ARMv8-A processor with largely the same microarchitectural characteristics as the Cortex A35. As a reminder to our readers: the ARMv8 ISA is not only a 64-bit instruction set, but also contains many improvements and additions to the 32-bit execution state, commonly referred to as AArch32. Among the larger differences between the A35 and A32 is that the latter's microarchitecture has been tuned and optimized to achieve the best performance and efficiency for 32-bit code.

Indeed, performance-wise, the A32 is advertised as being able to match the Cortex A35. The improvements lie in power efficiency: as a result of dropping 64-bit capability, the new core is able to achieve up to 10% better efficiency than the Cortex A35. Like the A35, the A32 promises vastly superior performance per clock versus the Cortex A5 and A7, ranging from a 31% increase in integer workloads to a massive 13x in crypto workloads. The A32 retains the cryptography instructions because they are included in ARMv8's AArch32 instruction set.

While only a few months ago the Cortex A35 was advertised as ARM's smallest Cortex-A core, this title has now been passed on to the A32. ARM claims the core is around 30% smaller than the A35. The decrease in size, mostly due to the slimming down of the microarchitecture following the removal of 64-bit capability, allows the Cortex A32 to scale down to <0.25mm² in its smallest configuration, a significant decrease compared to the A35's disclosed <0.4mm². The core remains as configurable as the Cortex A35, able to be implemented either as a single core or as a cluster of up to four cores. Vendors can also configure cache sizes, with L1 ranging from 8KB to 32KB and L2 ranging from completely absent up to 1MB.
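
To put the quoted area figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the two "<" figures can be treated as exact values, which they are not: both are upper bounds for the smallest configurations, so the result is only indicative and need not match ARM's "around 30%" claim, which may refer to a different configuration. The function names are illustrative only.

```python
import math

# Upper-bound areas quoted for the smallest configurations, in mm^2.
# Treating them as exact values is an assumption made for illustration.
A32_AREA_MM2 = 0.25
A35_AREA_MM2 = 0.40


def relative_reduction(new: float, old: float) -> float:
    """Return the fraction by which `new` is smaller than `old`."""
    return 1.0 - new / old


def square_side_um(area_mm2: float) -> float:
    """Return the side length, in micrometres, of a square of the given area."""
    return math.sqrt(area_mm2) * 1000.0


if __name__ == "__main__":
    reduction = relative_reduction(A32_AREA_MM2, A35_AREA_MM2)
    print(f"A32 vs A35 at the quoted bounds: {reduction:.1%} smaller")
    print(f"A 0.25 mm^2 square is {square_side_um(A32_AREA_MM2):.0f} um on a side")
```

At the quoted bounds the reduction works out to 37.5%, and a 0.25mm² square measures 0.5mm (500µm) on a side.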

ARM's philosophy of "having the right design for the job" now seems more apparent than ever as we see a steadily increasing portfolio of processor IPs specialized for different use-cases. The A32 seems to fit right in with this strategy, and we'll almost certainly see a large array of devices powered by the core in the future.

27 Comments

  • Kylinblue - Tuesday, February 23, 2016

    You really need 64bit for IoT?
  • Ryan Smith - Monday, February 22, 2016

    Heh, that's what happens when "low power" and "high efficiency" get merged with too few words. Thanks for pointing that out.
  • blaktron - Monday, February 22, 2016

    I'm in my early 30s and I have owned CPUs built with transistors larger than this core...
  • extide - Monday, February 22, 2016

    Ehh, close, but not quite! 0.25mm is 250um -- the biggest processes were about 10um, (which is, 10,000nm, we have come a long way!) For example, the Intel 4004, the first processor, was made on 10um.
  • jas90 - Monday, February 22, 2016

    So by that logic, extide, at 10nm a single core would be 61um if TSMC 10nm is 4.1 times smaller than the 28HPC given as the example in the slide, getting close ;)
  • ant6n - Tuesday, February 23, 2016

    Even worse: 0.25mm² = (0.5mm)²
  • r3loaded - Monday, February 22, 2016

    ""Is this an A35 with 64-bit 'slashed off'?" While ARM chuckled at my oversimplification, they agreed that from a very high-level perspective that it could be considered as an accurate description of the A32."

    That's literally what it is, when you absolutely must squeeze out every last drop out of your area and power budget.

    There will also shortly be a 64-bit only version of the A35 which has AArch32 support slashed off, again to save on power and area. Might be useful for someone...
  • name99 - Monday, February 22, 2016

    Are you claiming this (64-bit only version of the A35) as a joke, or do you have actual knowledge of this?
    I expect that Apple's watch CPU would take this path. They have already abstracted 3rd party code submissions to an IR form, so it's possible (I honestly don't know) that that IR is abstract enough that it could just be rendered down to 64-bits even though it was compiled as 32-bits. If so, there would really be no reason for Apple to ever provide a simultaneously 32 and 64-bit watch CPU.

    However, it is hard to see watches (or anything else in the very low power category) as really wanting 64-bits in the near (three-year or so) future. Is there reason to believe that using the 64-bit ISA would provide something more desirable (better power efficiency or smaller code) than using the 32-bit ISA and Thumb-2? Obviously the 64-bit ISA provides improved performance in various ways (especially if you use address bits for tagging) but for these devices right now, I think performance/watt is more important than just performance, and I don't know if the improved 64-bit ISA efficiency can make up for the work done on wider registers.
  • r3loaded - Tuesday, February 23, 2016

    It's not 100% confirmed for release but it is a real product. I believe that certain customers are interested in a low-power A-class system control processor that will form part of an existing 64-bit system while minimising area for the primary cores. Since such a processor would be guaranteed to only run 64-bit ARMv8-A code, AArch32 support would be unnecessary.

    It's a niche thing but it could happen. No idea what the final name will be.
  • RobATiOyP - Tuesday, February 23, 2016

    Would AArch64 only reduce area by much? 64bit CPU still needs to handle 8/16/32 bit quantities as well as 64bits, so seems like only instruction decode would be simplified.
