A Quick Refresher on the RV770

As Cypress is a direct evolution of the RV770 design, we are going to start with a quick rehash of RV770’s internal workings before we talk about what’s new with Cypress. Understanding how RV770 was built is necessary to understand what Cypress changes, so if you’re completely unfamiliar with RV770, please take a look at our expanded discussion of RV770 from last year. For the rest of you, let’s get started.

At the center of the RV770 is the Stream Processing Unit (SPU), a single arithmetic logic unit (ALU). The RV770 has 800 of these, packaged together in groups of 5 to form what we call a Streaming Processor (SP). An SP contains a register file, a branch unit, and the aforementioned 5 SPUs, with the 5th SPU being a more complex unit capable of transcendental functions along with the base functions of an ALU. The SP is the smallest unit that can do individual work; every SPU in an SP must execute instructions from the same thread.

AMD groups every 16 SPs together with texture units, L1 cache, shared memory, and controlling logic. This combined block is what AMD calls a SIMD, and RV770 has 10 of them. These 10 SIMDs form the core computational power of the RV770, and they work with various specialized units such as ROPs, rasterizers, L2 cache, and tessellators to form a complete chip.
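The unit counts above follow from simple division, and a few lines of Python make the hierarchy explicit. This is only an illustrative sketch: the constant names are ours, not AMD's, and the figures are the ones quoted in the text.

```python
# Figures quoted in the text; the names are illustrative, not AMD terminology.
SPUS_TOTAL = 800      # Stream Processing Units on RV770
SPUS_PER_SP = 5       # SPUs grouped into one Streaming Processor
SPS_PER_SIMD = 16     # SPs grouped into one SIMD

# 800 SPUs in groups of 5 gives the number of SPs.
sp_count = SPUS_TOTAL // SPUS_PER_SP

# Every 16 SPs form one SIMD.
simd_count = sp_count // SPS_PER_SIMD

# Each SIMD therefore holds 16 * 5 SPUs.
spus_per_simd = SPS_PER_SIMD * SPUS_PER_SP

print(f"{sp_count} SPs, {simd_count} SIMDs, {spus_per_simd} SPUs per SIMD")
```

The result matches the 10 SIMDs stated above, with 160 SPs in total and 80 SPUs inside each SIMD.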

To utilize the computational power of the hardware, instruction threads are issued to the SPs. These threads are grouped into wavefronts, where there are 64 threads per wavefront. To maximize the utilization of the GPU, threads need to be organized so that they can feed all 5 SPUs in an SP an instruction every clock cycle. Doing this requires extracting instruction level parallelism (ILP) out of programs being passed to the GPU, which is the difficult task of AMD’s compiler.

If SPUs go unused, then the performance of the chip suffers due to underutilization. This design gives AMD a great deal of theoretical computational power, but it is always a challenge to fully exploit it.
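To make the utilization problem concrete, here is a toy sketch of packing one thread's operations into 5-wide bundles, one bundle per clock. It is not AMD's compiler: the greedy scheduler, the `pack_bundles` and `utilization` helpers, and the example programs are all hypothetical, and the sketch treats all 5 SPUs as identical, ignoring the extra capabilities of the 5th.

```python
def pack_bundles(ops, width=5):
    """Greedily pack operations into bundles of at most `width` slots.

    `ops` maps an operation name to the set of operation names it depends
    on. An operation may only be issued once every dependency has been
    issued in an earlier bundle.
    """
    done = set()
    remaining = dict(ops)
    bundles = []
    while remaining:
        # Operations whose dependencies were all satisfied by earlier bundles.
        ready = [name for name, deps in remaining.items() if deps <= done]
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        bundle = ready[:width]
        bundles.append(bundle)
        done.update(bundle)
        for name in bundle:
            del remaining[name]
    return bundles


def utilization(bundles, width=5):
    """Fraction of SPU slots that actually received an operation."""
    if not bundles:
        return 0.0
    return sum(len(b) for b in bundles) / (len(bundles) * width)


# Ten fully independent operations: every slot can be filled.
independent = {f"op{i}": set() for i in range(10)}

# Five operations where each one needs the previous result: no ILP at all.
chain = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"c"}, "e": {"d"}}

print(utilization(pack_bundles(independent)))
print(utilization(pack_bundles(chain)))
```

In this toy model the independent program fills both bundles completely, while the dependent chain issues one operation per bundle and leaves four of every five slots idle. That gap is the underutilization described above.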


  • Scali - Thursday, October 1, 2009 - link

    Here's a screenshot of my 8800GTS320 getting almost 49 fps when I overclock it:
    http://bohemiq.scali.eu.org/OceanCS8800GTS.png

    So you see why I think 47 fps for a GTX285 is weird. It should easily beat the 72 fps of the HD5870. Even an 8800Ultra might get close to that number.
  • mapesdhs - Tuesday, September 29, 2009 - link


    I sincerely hope not as we need the competition. See:

    http://www.marketwatch.com/story/does-amd-really-p...

    Ian.

  • Johnwo - Monday, September 28, 2009 - link

    so wait, can this card play Crysis?
  • vsl2020 - Sunday, September 27, 2009 - link

    AMD only introducing new things which merely would make yur frap fps go 1000 and thats it.....no new good or interesting features such as what nvidia did with physx/3d Stereoscopic or similar that would convince me thats the way to the future...

    why should I need to buy a new dx11gpu only can do 1000fps...I would still luv my 260+ and 60fps in batman arkhum or other games which supported phsyx or similar...AMD just bring us back to the stone age race ..who has the higher fps race......
  • Jamahl - Tuesday, September 29, 2009 - link

    did you even read the review? what about eyefinity, you know a good way to use up those 1000fps by adding more screens?

    you can be stuck with your 260, you aren't really gaming unless you are gaming on eyefinity.
  • Zool - Monday, September 28, 2009 - link

    Actualy the delaying of nvidia dx11 card will make introducing new things harder. DX11 and OpenCL means enough that u can forget nvidias physx. At least with open platform dewelopers could finaly merge gpu and cpu code and make some more usefull things than improwed water splashing,unrealistic glass shatering and curtains which just run on top of the code and act as some kind of postprocessing + efects just to maintain compatibility.(miles away from the nvidia demos)
    And also dx11 compute shader can make these things.
  • RNViper - Sunday, September 27, 2009 - link

    Hey Guys

    Need Eyefinity a Nativ DisplayPort TFT?
  • pawaniitr - Sunday, September 27, 2009 - link

    maybe a 2 GiB memory will help this card at highest resolutions
    waiting for that version
  • Troll Trolling - Saturday, September 26, 2009 - link

    I think you guys from anandtech could do an article explaining why the new Radeons don't double performance, even with doubled specs.
    It happened too with the HD 4870, it had more than doubled everything (except bandwidth, that was 80% higher) and was not close to double performance.
  • SiliconDoc - Saturday, September 26, 2009 - link

    PS - The bandwidth is not doubled.

    124GB/sec to 153GB/sec, nowhere near an 80% increasse, let alone, virtually double.
