AMD's Radeon HD 5870: Bringing About the Next Generation Of GPUs
by Ryan Smith on September 23, 2009 9:00 AM EST
Posted in GPUs
A Quick Refresher on the RV770
As Cypress is a direct evolution of the RV770 design, before we talk about what’s new with Cypress we are going to do a quick recap of RV770’s internal workings. Understanding how RV770 was built is necessary to understand what Cypress changes, so if you’re completely unfamiliar with RV770, please take a look at our expanded discussion of RV770 from last year. For the rest of you, let’s get started.
At the center of the RV770 is the Stream Processing Unit (SPU), a single arithmetic logic unit. The RV770 has 800 of these, packaged together in groups of 5 that we call Streaming Processors (SPs). An SP contains a register file, a branch predictor, and the aforementioned 5 SPUs, with the 5th SPU being a more complex unit capable of transcendental functions in addition to the base functions of an ALU. The SP is the smallest unit that can do individual work; every SPU in an SP must execute the same instruction.
AMD groups every 16 SPs together with texture units, L1 cache, shared memory, and controlling logic. This combined block is what AMD calls a SIMD, and RV770 has 10 of them. These 10 SIMDs form the core computational power of the RV770, and they work with various specialized units such as ROPs, rasterizers, L2 cache, and tessellators to form a complete chip.
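To make the hierarchy concrete, the counts quoted above can be multiplied out. This is only a quick arithmetic check in Python; the constant names are our own, and the figures are the ones given in the text.

```python
# Unit counts as described in the text; the names are illustrative only.
SPUS_PER_SP = 5     # 4 simple SPUs plus 1 transcendental-capable SPU
SPS_PER_SIMD = 16
SIMD_COUNT = 10

spus_per_simd = SPUS_PER_SP * SPS_PER_SIMD   # SPUs inside one SIMD
total_sps = SPS_PER_SIMD * SIMD_COUNT        # SPs on the whole chip
total_spus = spus_per_simd * SIMD_COUNT      # SPUs on the whole chip

# The hierarchy should multiply out to the 800 SPUs quoted for RV770.
assert total_spus == 800

print(f"SPUs per SIMD: {spus_per_simd}")
print(f"SPs on chip:   {total_sps}")
print(f"SPUs on chip:  {total_spus}")
```

In other words, the 800 SPUs break down into 160 SPs, which in turn break down into 10 SIMDs of 80 SPUs each.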
To utilize the computational power of the hardware, instruction threads are issued to the SPs. These threads are grouped into wavefronts, where there are 64 threads per wavefront. To maximize the utilization of the GPU, threads need to be organized so that they can feed all 5 SPUs in an SP an instruction every clock cycle. Doing this requires extracting instruction level parallelism (ILP) out of programs being passed to the GPU, which is the difficult task of AMD’s compiler.
If SPUs go unused, then the performance of the chip suffers due to underutilization. This design gives AMD a great deal of theoretical computational power, but it is always a challenge to fully exploit it.
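A toy model helps show why ILP matters so much for a 5-wide SP. The Python sketch below is not AMD's compiler and makes several simplifying assumptions: every operation takes one clock, any operation can go in any of the 5 slots (the special role of the 5th SPU is ignored), and an operation can only issue in a bundle after the ones it depends on. The names `pack_bundles` and `utilization` are hypothetical helpers invented for this illustration.

```python
SLOTS_PER_SP = 5  # five SPUs per SP, per the text


def pack_bundles(deps):
    """Greedily pack operations into 5-wide bundles (toy model).

    deps maps each operation name to the set of operations whose
    results it needs. An operation may only be placed in a bundle
    that comes after the bundles holding all of its dependencies.
    """
    placed = set()          # operations issued in earlier bundles
    bundles = []            # each bundle is a list of operation names
    remaining = dict(deps)  # operations not yet issued
    while remaining:
        # Ready operations depend only on already-issued operations.
        ready = [op for op, needs in remaining.items() if needs <= placed]
        if not ready:
            raise ValueError("dependency cycle or unknown operation")
        issued = ready[:SLOTS_PER_SP]
        bundles.append(issued)
        for op in issued:
            del remaining[op]
        placed.update(issued)
    return bundles


def utilization(bundles):
    """Fraction of SPU slots that did useful work."""
    used = sum(len(b) for b in bundles)
    return used / (len(bundles) * SLOTS_PER_SP)


# Ten independent operations: every slot can be filled.
independent = {f"op{i}": set() for i in range(10)}

# Five operations where each one needs the previous result:
# only one slot per bundle can be used.
chain = {"op0": set()}
for i in range(1, 5):
    chain[f"op{i}"] = {f"op{i - 1}"}

print("independent:", utilization(pack_bundles(independent)))
print("chain:      ", utilization(pack_bundles(chain)))
```

Under these assumptions the independent workload fills every slot, while the fully dependent chain leaves four of the five SPUs idle on every clock, which is the underutilization described above.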
327 Comments
Scali - Thursday, October 1, 2009 - link
Here's a screenshot of my 8800GTS320 getting almost 49 fps when I overclock it: http://bohemiq.scali.eu.org/OceanCS8800GTS.png
So you see why I think 47 fps for a GTX285 is weird. It should easily beat the 72 fps of the HD5870. Even an 8800Ultra might get close to that number.
mapesdhs - Tuesday, September 29, 2009 - link
I sincerely hope not as we need the competition. See:
http://www.marketwatch.com/story/does-amd-really-p...
Ian.
Johnwo - Monday, September 28, 2009 - link
so wait, can this card play Crysis?
vsl2020 - Sunday, September 27, 2009 - link
AMD only introducing new things which merely would make your Fraps fps go 1000 and that's it.....no new good or interesting features such as what nvidia did with physx/3d Stereoscopic or similar that would convince me that's the way to the future...why should I need to buy a new dx11 gpu only can do 1000fps...I would still love my 260+ and 60fps in Batman Arkham or other games which supported physx or similar...AMD just bring us back to the stone age race ..who has the higher fps race......
Jamahl - Tuesday, September 29, 2009 - link
did you even read the review? what about eyefinity, you know a good way to use up those 1000fps by adding more screens?
you can be stuck with your 260, you aren't really gaming unless you are gaming on eyefinity.
Zool - Monday, September 28, 2009 - link
Actually the delaying of nvidia dx11 card will make introducing new things harder. DX11 and OpenCL means enough that u can forget nvidias physx. At least with open platform developers could finally merge gpu and cpu code and make some more useful things than improved water splashing, unrealistic glass shattering and curtains which just run on top of the code and act as some kind of postprocessing + effects just to maintain compatibility. (miles away from the nvidia demos) And also dx11 compute shader can make these things.
RNViper - Sunday, September 27, 2009 - link
Hey Guys
Does Eyefinity need a native DisplayPort TFT?
pawaniitr - Sunday, September 27, 2009 - link
maybe a 2 GiB memory will help this card at highest resolutions
waiting for that version
Troll Trolling - Saturday, September 26, 2009 - link
I think you guys from anandtech could do an article explaining why the new Radeons don't double performance, even with doubled specs.
It happened too with the HD 4870, it had more than doubled everything (except bandwidth, that was 80% higher) and was not close to double performance.
SiliconDoc - Saturday, September 26, 2009 - link
PS - The bandwidth is not doubled.
124GB/sec to 153GB/sec, nowhere near an 80% increase, let alone virtually double.