Intel's new Atom Microarchitecture: The Tremont Core in Lakefield
by Dr. Ian Cutress on October 24, 2019 1:30 PM EST
A Wider Back End
Moving beyond the micro-op queue, Tremont has 8 execution ports, filled from 7 reservation stations.
The only two ports that share a reservation station are the address generation units (AGUs). This is in stark contrast to the Core design, which in Sunny Cove uses one unified reservation station for all integer and floating point calculations and three for the AGUs. Tremont gives its two AGUs a unified reservation station, also backed by extra memory for queued micro-ops, in order to supply both AGUs with either 2x 16-byte stores, 2x 16-byte loads, or one of each. Intel clearly expects the AGUs on Tremont to be fairly active compared to other execution ports.
On the integer side, aside from the two AGUs, Tremont has 3 ALUs, a jump port, and a store data port. Each ALU supports different functions, with one handling shifts and another handling multiplication and division. Compared to Core, these ALUs are extremely lightweight, and Intel hasn't gone into specifics here.
On the floating point side, things are a little more varied: the three ports are split between two ALUs and a store port. Of the two ALUs, one focuses on fused additions (FADD), while the other focuses on fused multiplication and division (FMUL). Both ALUs support 128-bit SIMD and 128-bit AES instructions with a 4-cycle latency, as well as single-instruction SHA256 at 4 cycles. There is no 256-bit vector support here. To help with certain calculations, GFNI instruction support is included.
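To give a sense of the kind of calculation GFNI accelerates, here is a minimal Python sketch of a single-byte multiplication in GF(2^8), reduced by the polynomial x^8 + x^4 + x^3 + x + 1 (0x11B), which is the field used by AES and by the GFNI byte-multiply instruction. The function name `gf2p8_mul` is made up for this illustration, and the loop is a software model only; it says nothing about how Tremont implements the operation in hardware.

```python
def gf2p8_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B).

    Illustrative software model only; the name is hypothetical.
    """
    if not (0 <= a <= 0xFF and 0 <= b <= 0xFF):
        raise ValueError("operands must be in the range 0..255")
    result = 0
    while b:
        if b & 1:
            # Addition in GF(2^8) is XOR.
            result ^= a
        a <<= 1
        if a & 0x100:
            # Reduce by the field polynomial when the product overflows 8 bits.
            a ^= 0x11B
        b >>= 1
    return result


if __name__ == "__main__":
    # Worked example from the AES specification: {57} * {83} = {c1}.
    print(hex(gf2p8_mul(0x57, 0x83)))
```

In software this takes up to eight shift-and-XOR steps per byte, which is why a dedicated instruction working on a whole 128-bit vector of bytes at once is attractive for cryptography and error-correction code.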
There is also a larger 1024-entry L2 TLB, supporting 1024x 4K entries, 32x 2M entries, or 8x 1G entries. This is an upgrade from the 512-entry L2 TLB in Goldmont.
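As a rough guide to what those entry counts mean, the short Python sketch below computes the "reach" of the L2 TLB, that is, how much memory it can map at once, for each page size quoted above. The names `L2_TLB_CONFIGS` and `tlb_reach` are invented for this example, and it assumes every entry of a given type is in use.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# (number of entries, page size in bytes), taken from the figures in the text.
L2_TLB_CONFIGS = {
    "4K": (1024, 4 * KIB),
    "2M": (32, 2 * MIB),
    "1G": (8, 1 * GIB),
}


def tlb_reach(entries, page_size):
    """Return the bytes of memory covered when every entry is in use."""
    return entries * page_size


if __name__ == "__main__":
    for name, (entries, page_size) in L2_TLB_CONFIGS.items():
        reach_mib = tlb_reach(entries, page_size) / MIB
        print(f"{entries} x {name} pages cover {reach_mib:g} MiB")
```

With 4K pages the full 1024 entries cover only 4 MiB, while 32 entries of 2M pages cover 64 MiB and 8 entries of 1G pages cover 8 GiB, which is why large pages matter for workloads with big memory footprints.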
New Instructions
As with any generation, Intel adds new supported instructions to either accelerate common calculations that would traditionally require lots of instructions or to add new functionality. Tremont is no different.
AnandTech | Tremont | Goldmont Plus | Goldmont | Airmont | Silvermont |
Process | 10+ | 14 | 14 | 14 | 22 |
Release Year | 2019 | 2017 | 2016 | 2015 | 2013 |
New Instructions | CLWB GFNI ENCLV CLDEMOTE MOVDIR* TPAUSE UMONITOR UMWAIT | SGX1 UMIP PTWRITE RDPID | RDSEED SMAP MPX XSAVEC XSAVES CLFLUSHOPT SHA | SSE4.1 SSE4.2 MOVBE CRC32 POPCNT CLMUL AES RDRAND PREFETCHW |
(When asked what other new instructions are supported, Intel told us to look at its published documents about future instructions. When we pointed out that those documents weren't exactly clear, and that in the past Intel hasn't spoken about future designs, we were not offered any additional comment.)
When we get hold of a Tremont device, we’ll do a full instruction breakdown.
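In the meantime, readers with a Linux system can check which of the new extensions a given CPU reports. The Python sketch below parses the `flags` line of `/proc/cpuinfo` and reports which of a chosen set of flags are absent. The mapping from instruction names to Linux flag names is our assumption based on the kernel's usual naming (for example, the three user-level wait instructions are reported together as `waitpkg`), ENCLV is left out because we are not aware of a dedicated flag for it, and the helper names are invented for this example.

```python
# Assumed mapping from the Tremont additions listed above to Linux cpuinfo
# flag names. Treat this as illustrative, not as an authoritative list.
TREMONT_FLAGS = {
    "CLWB": "clwb",
    "GFNI": "gfni",
    "CLDEMOTE": "cldemote",
    "MOVDIRI": "movdiri",
    "MOVDIR64B": "movdir64b",
    "TPAUSE/UMONITOR/UMWAIT": "waitpkg",
}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line found."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(flags, wanted=TREMONT_FLAGS):
    """Return the sorted instruction names whose flag is not in 'flags'."""
    return sorted(name for name, flag in wanted.items() if flag not in flags)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # Not Linux, or the file is unreadable.
    print("Not reported:", missing_features(parse_cpu_flags(text)))
```

A missing flag can also mean the kernel is too old to know about the feature, so this is a quick sanity check and not a substitute for querying CPUID directly.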
101 Comments
29a - Friday, October 25, 2019
I second this: stay away from Atom.
PeachNCream - Friday, October 25, 2019
AMD is just as iffy about support for their low power cores. My A4-1250 is not supported either. Though that isn't a problem with it running Linux, it's just that, unlike my Bay Trail, it isn't fanless and ultra quiet. There is nothing quite like a fanless laptop with an SSD or eMMC, and getting that in Core is a challenge. Getting it with Core at less than $200 is not possible.
Jorgp2 - Friday, October 25, 2019
Lol, that's how Android works.
unclevagz - Thursday, October 24, 2019
Given that when Lakefield products come out they will in all likelihood be competing with ARM A77 products, I struggle to see how this architecture would be competitive.
vladx - Friday, October 25, 2019
If Tremont will be almost Core-class as Intel claims, it will very likely equal Cortex-A77 if not surpass it.
Wilco1 - Friday, October 25, 2019
It couldn't even get anywhere near a Cortex-A76. The fastest Goldmont+ gets 464 on Geekbench 5, so with 25% gain it would be ~600 at 2.5GHz. However SD855+ (Cortex-A76) gets 795...
vladx - Friday, October 25, 2019
That's probably because Geekbench tests both CPU and GPU, I don't think the GPU compute on Atoms is anything to be impressed by.
Wilco1 - Friday, October 25, 2019
No, these scores are not using the GPU. Atom just has poor integer performance (and FP is even worse, being just SSE, no AVX, no FMA). You need a 60-70% IPC improvement over Goldmont+ to match Cortex-A76.
Brunnis - Friday, October 25, 2019
If IPC is approximately 25% better than Goldmont Plus, it will be on Haswell level. Not sure how it will compete with A77 performance wise, but it should be competitive with A76. From a power consumption perspective? I wouldn't bet on it.
Jorgp2 - Friday, October 25, 2019
They'll be x86 and have PC IO.