AMD Threadripper Pro Review: An Upgrade Over Regular Threadripper?
by Dr. Ian Cutress on July 14, 2021 9:00 AM EST
Posted in: CPUs, AMD, ThreadRipper, Threadripper Pro, 3995WX
Since the launch of AMD’s Threadripper Pro platform, seeing what eight channels of memory bring to compute over the regular quad-channel Threadripper has been an intriguing prospect. Threadripper Pro is effectively a faster version of AMD’s EPYC, limited to single CPU workstation use, but it also carries a full 280 W TDP to match the frequencies of the standard Threadripper line. There is a 37% price premium from Threadripper to Threadripper Pro, which buys ECC memory support, double the PCIe lanes, and double the memory bandwidth. In this review, we’re comparing every member of both platforms that is commercially available.
Threadripper Pro: Born of Need
When AMD embarked upon its journey with the new Ryzen portfolio, the delineation of where each product sat in the traditional market was not always entirely clear. The first generation Ryzen was earmarked for standard consumers; however, the top of the line Ryzen 7 1800X, with eight cores, competed against Intel’s high-end desktop market. The Zen 2-based portfolio saw the mainstream Ryzen go to 16 cores, pushing past Intel’s best 18-core HEDT processor at the time in most tests. That Zen 2-based Ryzen 9 3950X was still classified as a ‘mainstream platform’ processor, as it only had 24 PCIe lanes and dual-channel memory, sufficient for mainstream users but not enough for workstation markets. These mainstream processors were also limited to 105 W TDP.
At the other end of the scale was AMD EPYC, with the first generation EPYC 7601 having 32 cores, and the second generation EPYC 7742 having 64 cores, up to 225 W TDP. These share the same LGA4094 socket, have eight channels of memory, full ECC support, and 128 PCIe lanes (first PCIe 3.0, then PCIe 4.0), with dual-socket support. For workstation users interested in EPYC, AMD launched single socket ‘P’ versions. These offered the same features, at around 200 W TDP, losing some performance to the regular non-P versions.
AMD then launched Threadripper, a high-end desktop version of EPYC that went all the way up to 280 W for peak frequency and performance. Threadripper sat above Ryzen with 64 PCIe lanes and quad channel memory, enabling mainstream users that wanted a bit more to get a bit more. However, workstation users noted that while 280 W was great, it lacked official ECC memory support, and its reduced memory channel count and PCIe lane count compared to EPYC sometimes stopped Threadripper from being adopted.
So enter Threadripper Pro, which sits between Threadripper and EPYC, and in this instance, very much more on the EPYC side. Threadripper Pro has almost all the features of AMD’s EPYC platform, but in a 280 W thermal envelope. It has eight channels of memory support, all 128 PCIe 4.0 lanes, and can support ECC. The only downsides compared to EPYC are that it can only be used in single socket systems, and that the peak memory support is halved (from 4 TB to 2 TB). Threadripper Pro also comes at a small price premium.
AMD Comparison
AnandTech | Ryzen | Threadripper | Threadripper Pro | Enterprise EPYC |
Cores | 6-16 | 32-64 | 12-64 | 16-64 |
Architecture | Zen 3 | Zen 2 | Zen 2 | Zen 3 |
1P Flagship | R9 5950X | TR 3990X | TR Pro 3995WX | EPYC 7713P |
MSRP | $799 | $3990 | $5490 | $5010 |
TDP | 105 W | 280 W | 280 W | 225 W |
Base Freq | 3400 MHz | 2900 MHz | 2700 MHz | 2000 MHz |
Turbo Freq | 4900 MHz | 4300 MHz | 4200 MHz | 3675 MHz |
Socket | AM4 | sTRX40 | sTRX4: WRX80 | SP3 |
L3 Cache | 64 MB | 256 MB | 256 MB | 256 MB |
DRAM | 2 x DDR4-3200 | 4 x DDR4-3200 | 8 x DDR4-3200 | 8 x DDR4-3200 |
DRAM Capacity | 128 GB | 256 GB | 2 TB, ECC | 4 TB, ECC |
PCIe | 4.0 x20 + chipset | 4.0 x56 + chipset | 4.0 x120 + chipset | 4.0 x128 |
Pro Features | No | No | Yes | Yes |
One of the biggest pulls for Threadripper and Threadripper Pro has been any market that typically uses high-speed workstations and can scale their workloads. Speaking to a local OEM, the demand for Threadripper and Threadripper Pro from the visual effects industry has been off the charts, where these companies are ripping out their old infrastructure and replacing it with new AMD hardware. This has also been spurred on by the recent pandemic, where these studios want to keep the expensive hardware onsite and allow their artists to work from home via remote access.
Threadripper Pro CPUs: Four Models, Three at Retail
When TR Pro launched in 2020, the processors were a Lenovo exclusive for the P620 workstation. The terms of the deal between Lenovo and AMD were not disclosed; however, it would appear that the exclusivity period ran for six months, from September to February, with the processors becoming available at retail on March 2nd.
During that time, we were sampled one of these workstations for review, and it still remains one of the best modular systems I’ve ever tested:
Lenovo ThinkStation P620 Review: A Vehicle for Threadripper Pro
AMD’s first Threadripper Pro platform has four processors in it, ranging from 12 cores to 64 cores, mimicking their equivalents in Threadripper 3000 and EPYC 77x2 but at 280W.
AMD Ryzen Threadripper Pro
AnandTech | Cores | Base Freq (MHz) | Turbo Freq (MHz) | Chiplets | L3 Cache | TDP | Price SEP |
3995WX | 64 / 128 | 2700 | 4200 | 8 + 1 | 256 MB | 280 W | $5490 |
3975WX | 32 / 64 | 3500 | 4200 | 4 + 1 | 128 MB | 280 W | $2750 |
3955WX | 16 / 32 | 3900 | 4300 | 2 + 1 | 64 MB | 280 W | $1150 |
3945WX | 12 / 24 | 4000 | 4300 | 2 + 1 | 64 MB | 280 W | OEM |
Sitting at the top is the 64-core Threadripper Pro 3995WX, with a 2.7 GHz base frequency and a 4.2 GHz turbo frequency. This processor is the only one in the family to have all 256 MB of L3 cache, as it has all eight chiplets fully active. The $5490 price is a full 37.5% increase over the Threadripper 3990X at $3990.
AMD 64-Core Zen 2 Comparison
AnandTech | Threadripper 3990X | Threadripper Pro 3995WX | EPYC 7702P |
MSRP | $3990 | $5490 | $4425 |
TDP | 280 W | 280 W | 200 W |
Base Freq | 2900 MHz | 2700 MHz | 2000 MHz |
Turbo Freq | 4300 MHz | 4200 MHz | 3350 MHz |
L3 Cache | 256 MB | 256 MB | 256 MB |
DRAM | 4 x DDR4-3200 | 8 x DDR4-3200 | 8 x DDR4-3200 |
DRAM Capacity | 256 GB | 2 TB, ECC | 4 TB, ECC |
PCIe | 4.0 x56 + chipset | 4.0 x120 + chipset | 4.0 x128 |
Pro Features | No | Yes | Yes |
Middle of the line is the 32-core Threadripper Pro 3975WX, with a 3.5 GHz base frequency and a 4.2 GHz turbo frequency. AMD decided to make this processor use four chiplets with all eight cores on each chiplet, leading to 128 MB of L3 cache total. At $2750, it is also 37.5% more expensive than the equivalent 32-core Threadripper 3970X.
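As a quick sanity check on the quoted premiums, the arithmetic can be sketched in a few lines of Python. The Threadripper Pro prices and the 3990X price are the ones given in this review; the $1999 figure for the Threadripper 3970X is an assumption on our part, chosen because it is the list price implied by the 37.5% figure above. The names `PRICES` and `premium_percent` are purely illustrative.

```python
# List prices in USD. The 3990X, 3995WX and 3975WX prices are quoted in this
# review; the 3970X price is an ASSUMPTION implied by the 37.5% premium.
PRICES = {
    "TR 3990X": 3990,
    "TR Pro 3995WX": 5490,
    "TR 3970X": 1999,  # assumed list price, not taken from the review text
    "TR Pro 3975WX": 2750,
}


def premium_percent(base: float, pro: float) -> float:
    """Return how much more expensive `pro` is than `base`, in percent."""
    return (pro - base) / base * 100.0


# Compare each Threadripper Pro part against its Threadripper equivalent.
for base, pro in [("TR 3990X", "TR Pro 3995WX"), ("TR 3970X", "TR Pro 3975WX")]:
    pct = premium_percent(PRICES[base], PRICES[pro])
    print(f"{pro} vs {base}: +{pct:.1f}%")
```

Under those assumptions both pairs land within a fraction of a percentage point of the 37.5% quoted, which suggests AMD applied the same multiplier across the stack rather than pricing each Pro part individually.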
AMD 32-Core Zen 2 Comparison
AnandTech | Threadripper 3970X | Threadripper Pro 3975WX | EPYC 7502P |
MSRP | $1999 | $2750 | $2300 |
TDP | 280 W | 280 W | 180 W |
Base Freq | 3700 MHz | 3500 MHz | 2500 MHz |
Turbo Freq | 4500 MHz | 4200 MHz | 3350 MHz |
L3 Cache | 128 MB | 128 MB | 128 MB |
DRAM | 4 x DDR4-3200 | 8 x DDR4-3200 | 8 x DDR4-3200 |
DRAM Capacity | 256 GB | 2 TB, ECC | 4 TB, ECC |
PCIe | 4.0 x56 + chipset | 4.0 x120 + chipset | 4.0 x128 |
Pro Features | No | Yes | Yes |
The following two processors have no Threadripper equivalents, but also represent a slightly different scenario that we’ll explore in this review. Both the 3955WX and 3945WX, despite being part of the big Threadripper Pro family, only use two chiplets in their design: eight cores per chiplet for the 3955WX and six cores per chiplet for the 3945WX. This means these processors only have 64 MB of L3 cache, making them broadly similar to the Ryzen 9 3950X and Ryzen 9 3900X, except the IO die means there are eight channels of memory and 128 PCIe lanes here.
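The relationship between chiplet count, cores per chiplet and L3 cache across the stack can be checked with a short sketch. It assumes 32 MB of L3 per Zen 2 compute chiplet; that figure is not stated outright in this review, but it is consistent with every row of the Threadripper Pro table above. `L3_PER_CHIPLET_MB` and `describe` are illustrative names only.

```python
# ASSUMPTION: each Zen 2 compute chiplet carries 32 MB of L3 cache.
L3_PER_CHIPLET_MB = 32

# (model, cores, compute chiplets, listed L3 in MB) from the TR Pro table.
MODELS = [
    ("3995WX", 64, 8, 256),
    ("3975WX", 32, 4, 128),
    ("3955WX", 16, 2, 64),
    ("3945WX", 12, 2, 64),
]


def describe(cores: int, chiplets: int) -> tuple[int, int]:
    """Return (active cores per chiplet, total L3 in MB) for a configuration."""
    return cores // chiplets, chiplets * L3_PER_CHIPLET_MB


for model, cores, chiplets, listed_l3 in MODELS:
    per_chiplet, l3 = describe(cores, chiplets)
    # The derived L3 should agree with the figure listed in the table.
    assert l3 == listed_l3, model
    print(f"{model}: {chiplets} chiplets x {per_chiplet} cores, {l3} MB L3")
```

The point of the exercise is that L3 capacity tracks chiplet count rather than core count, which is why the 12-core 3945WX and 16-core 3955WX end up with the same 64 MB.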
AMD 16-Core Zen 2/3 Comparison
AnandTech | Ryzen 9 3950X | Threadripper Pro 3955WX | Ryzen 9 5950X |
MSRP | $749 | $1150 | $799 |
TDP | 105 W | 280 W | 105 W |
Base Freq | 3500 MHz | 3900 MHz | 3400 MHz |
Turbo Freq | 4700 MHz | 4300 MHz | 4900 MHz |
L3 Cache | 64 MB | 64 MB | 64 MB |
DRAM | 2 x DDR4-3200 | 8 x DDR4-3200 | 2 x DDR4-3200 |
DRAM Capacity | 128 GB | 2 TB, ECC | 128 GB |
PCIe | 4.0 x20 + chipset | 4.0 x120 + chipset | 4.0 x20 + chipset |
Pro Features | No | Yes | No |
Motherboard Cost | -- | +++ | -- |
The 3955WX has a higher base frequency, but the 3950X has the higher turbo frequency. The 3950X is also cheaper, and motherboards are cheaper! It might be worth partitioning these out into a separate comparison review.
The final Threadripper Pro processor, the 3945WX, does not have a price, because AMD is not making it available at retail. This part is for selected OEM customers only, it seems; perhaps the limited substrate resources in the market right now make it unappealing to produce too many of these? Hard to say.
Motherboards: Beware!
Despite being based on the same LGA4094 socket as both Threadripper and EPYC, Threadripper Pro has its own unique WRX80 platform that has to be used instead. Only select vendors seem to have access/licenses to make WRX80 motherboards, and your main options are:
- ASUS Pro WS WRX80E-SAGE SE WiFi ($1000)
- Supermicro M12SWA-TF (~$750)
- GIGABYTE WRX80 SU8-IPMI ($790)
All three boards use a transposed LGA4094 socket, eight DDR4 memory slots, and 6-7 PCIe 4.0 slots.
Though beware! There is an option of finding an old/refurbished Lenovo P620 motherboard. It is worth noting that Lenovo is exercising an AMD feature for OEMs: processors used in that Lenovo motherboard will be locked to Lenovo forever. This is part of AMD’s guaranteed supply chain process, allowing OEMs to hard lock processors into certain vendors for supply chain end-to-end security that is requested by specific customers. In that instance, if you might ever want to break down your system to upgrade and sell off parts, it is not recommended you find a Lenovo TR Pro system unless you buy/sell it as a whole.
This Review
The main goal of this review is to test all of the Threadripper Pro 3000 hardware and compare against the equivalent Threadripper 3000 to get a sense of how much performance is gained by the increased memory bandwidth, or lost due to the slight core frequency differences. We are also including Intel’s best HEDT/workstation processor for comparison, the W-3175X, as well as the top consumer-grade processors on the market. All systems are tested at JEDEC specifications.
Test Setup
Platform | CPU | Motherboard | BIOS | Cooler | Memory |
AMD TR Pro | 3995WX, 3975WX, 3955WX | ASUS Pro WS WRX80E-SAGE SE WiFi | BIOS 0405 | IceGiant Thermosiphon | Kingston 8x16 GB DDR4-3200 ECC |
AMD TR | TR 3990X, TR 3970X, TR 3960X | ASRock TRX40 Taichi | BIOS P1.70 | IceGiant Thermosiphon | ADATA 4x32 GB DDR4-3200 |
AMD Ryzen | R9 5950X | GIGABYTE X570 I Aorus Pro | BIOS F31L | Noctua NH-U12S | ADATA 4x32 GB DDR4-3200 |
Intel Core | i9-11900K | ASUS Maximus XIII Hero | BIOS 0703 | Thermalright TRUE Copper* | ADATA 4x32 GB DDR4-3200 |
Intel Xeon | Xeon W-3175X | ASUS ROG Dominus Extreme | BIOS 0601 | Asetek 690LX-PN | DDR4-2666 ECC |
GPU | Sapphire RX 460 2GB (CPU Tests) |
PSU | Various (inc. Corsair AX860i) |
SSD | Crucial MX500 2TB |
*Silverstone SST-FHP141-VF 173 CFM fans also used. Nice and loud.
Many thanks to Kingston for supplying a full set of KSM32RD8/16MEI - 16x16 GB of DDR4-3200 ECC RDIMMs for enterprise testing in systems like Threadripper Pro.
As part of this review, we are also showcasing the 64 core processors in 128T mode as well as 64T mode. This is being done to showcase how some processors can get better performance by having better memory bandwidth per thread - one of the issues with these high core count processors is the limited amount of memory bandwidth each thread can access. Also, some operating systems (such as Windows) struggle above 64 threads due to the use of thread groups.
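To put rough numbers on the bandwidth-per-thread argument, here is a minimal sketch of the theoretical peak figures for both platforms at DDR4-3200. It assumes the standard 64-bit (8-byte) DDR4 channel and ignores everything that reduces sustained bandwidth in practice (refresh, bank conflicts, chiplet-to-IO-die link limits), so these are ceilings, not measurements from our test systems. The function name `peak_bandwidth_gbs` is illustrative.

```python
BYTES_PER_TRANSFER = 8  # a DDR4 channel is 64 bits wide


def peak_bandwidth_gbs(channels: int, mega_transfers: int = 3200) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal) across all channels."""
    return channels * mega_transfers * BYTES_PER_TRANSFER / 1000.0


# Quad-channel Threadripper versus eight-channel Threadripper Pro, each run
# with SMT off (64 threads) and SMT on (128 threads) on a 64-core part.
for name, channels in [("Threadripper, 4ch", 4), ("Threadripper Pro, 8ch", 8)]:
    total = peak_bandwidth_gbs(channels)
    for threads in (64, 128):
        per_thread = total / threads
        print(f"{name}, {threads}T: {total:.1f} GB/s total, "
              f"{per_thread:.2f} GB/s per thread")
```

On these assumptions, a 64-core Threadripper Pro running 128 threads has the same theoretical bandwidth per thread as a 64-core Threadripper running 64 threads, which is one way of framing why the 64T results are worth including alongside the 128T ones.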
98 Comments
Spunjji - Friday, July 16, 2021
Having seen how modern processors behave with insufficient cooling, Threska's right that it won't get "fried", but you're correct to infer that it would result in unpredictably sub-optimal performance.
Anecdotally, I had a friend with a Sandy Bridge system with a cooling issue that he only noticed when he bought a new GPU and ran 3DMark and got unexpectedly low results. The "cooling issue" was that the stock heatsink wasn't even making contact with the CPU heat-spreader; he'd been gaming with the system for 3 years by that point. 😬
serpretetsky - Friday, July 16, 2021
I had to do some thermal shutdown testing on some consumer intel cpu. I forgot which one. Maybe i5/i7 8000 series?
With server CPUs this was usually pretty easy, remove fan, and wait for shutdown. With the consumer CPU it kept running. So i completely removed the heatsink, the thing simply downclocked to 800 MHz, and continued running happily with no heatsink. Booted to linux, ran everything great, and no heatsink (actually once it booted to linux I think it even started clocking back up once in a while). I had get a hot-air soldering gun to heat it up till shutdown.
mode_13h - Saturday, July 17, 2021
5-10 years ago, there was a heatsink gasket where you have to get near 100 degrees C to melt the material so it fuses with the heatsink and CPU. I forget the name, but I'm wondering if it's even possible to do that any more.
skaurus - Wednesday, July 14, 2021
That's great analysis.
Threska - Wednesday, July 14, 2021
It would be nice to see how these MBs do with VFIO since that has considerations most users don't.
mode_13h - Wednesday, July 14, 2021
Ian, is the source code for your 3DPM benchmark published anywhere? If not, it would be nice if we could see it and compare the AVX2 path with the AVX-512 one. Also, maybe someone could add support for ARM NEON or SVE.
techguymaxc - Wednesday, July 14, 2021
I'm slightly confused by the concluding remarks.
"Performance between Threadripper Pro and Threadripper came in three stages. Either (a) the results between similar processors was practically identical, (b) Threadripper beat TR Pro by a small margin due to slightly higher frequencies, or (c) TR Pro thrashed Threadripper due to memory bandwidth availability. That last point, (c), only really kicks in for the 32c and 64c processors it should be noted. Our 16c TR Pro had the same memory bandwidth results as TR, most likely due to only having two chiplets in its design."
A and B are observable, but C only proves true in synthetic benchmarks (and Pi calculation). Is there a real-world use-case for the additional memory bandwidth, outside of calculating Pi?
Blastdoor - Wednesday, July 14, 2021
The advantage shows up with multi-threaded SPEC. SPEC is essentially a composite of a suite of real-world tasks. I guess you could call it 'synthetic' due to it being a composite, but the individual tasks don't strike me as 'synthetic.' For example, here's a description of namd: https://www.spec.org/cpu2017/Docs/benchmarks/508.n...
techguymaxc - Wednesday, July 14, 2021
Thanks for that info. It would be nice to see the breakdown of individual test results from the SPEC suite.
arashi - Saturday, July 17, 2021
Bench