The GF104/GF110 Refresher: Different Architecture & Different Transistors

For all practical purposes GF100 is the Fermi base design, but for sub high-end cards in particular NVIDIA has made a number of changes since we first saw the Fermi architecture a year and a half ago. For those of you reading this article who don’t regularly keep up with the latest NVIDIA hardware releases, we’re going to quickly recap what makes GF114 and GTX 560 Ti different from both the original GF100/GF110 Fermi architecture, and in turn what makes GF114 different from GF104 through NVIDIA’s transistor optimizations. If you’re already familiar with this, please feel free to skip ahead.

With that said, let’s start with architecture. The GF100/GF110 design is ultimately the compute and graphics monster that NVIDIA meant for Fermi to be. It has fantastic graphical performance, but it also has extremely solid GPU computing performance in the right scenarios, which is why GF100/GF110 is the backbone not just of NVIDIA’s high-end video cards, but also of their Tesla line of GPU computing cards.

But Fermi’s compute characteristics only make complete sense at the high-end, as large institutions utilizing GPU computing have no need for weaker GPUs in their servers, and in the meantime home users don’t need features like ECC or full speed FP64 (at least not at this time) so much as they need a more reasonably priced graphics card. As a result only the high-end GF100/GF110 GPUs feature Fermi’s base design, meanwhile GF104 and later use a tweaked design that stripped away some aspects of Fermi’s GPU compute design while leaving much of the graphics hardware intact.

NVIDIA GF104 SM

With GF104 we saw the first GPU to use NVIDIA’s streamlined Fermi architecture – the design that forms the basis of GF104/GF106/GF108/GF114 – and with it a number of firsts from the company. Chief among these was the use of a superscalar architecture, the first time we’ve seen such a design in an NVIDIA part. Superscalar execution allows NVIDIA to take advantage of Instruction Level Parallelism (ILP) – executing the next instruction in a thread when it doesn’t rely on the previous instruction – and it makes this streamlined design notably different from GF100/GF110. Ultimately the streamlined design is more efficient than GF100/GF110 on average, but with a wider spread between its best and worst case scenarios; a tradeoff that doesn’t necessarily make sense for GPU computing purposes, but does for mainstream graphics.
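To make the ILP point concrete, here is a toy model (our own illustration, not NVIDIA’s actual scheduler) of an in-order, dual-issue pipeline: a second instruction can be issued in the same cycle only when it doesn’t depend on the first, so a dependency chain gets no benefit while interleaved independent work nearly doubles throughput.

```python
def dual_issue_cycles(deps):
    """Count cycles for a toy in-order, dual-issue (superscalar) pipeline.

    deps[i] is the index of the instruction that instruction i depends on,
    or None if it is independent. Each cycle we issue up to two consecutive
    instructions, pairing them only when the second does not depend on the
    first. Simplification: every instruction has single-cycle latency.
    """
    cycles = 0
    i = 0
    n = len(deps)
    while i < n:
        if i + 1 < n and deps[i + 1] != i:
            i += 2  # independent pair: dual-issue both this cycle
        else:
            i += 1  # dependent (or final) instruction issues alone
        cycles += 1
    return cycles

# A chain where every instruction needs the previous result: no ILP to exploit.
serial = [None, 0, 1, 2]
# Two interleaved independent chains: adjacent instructions can pair up.
interleaved = [None, None, 0, 1]
print(dual_issue_cycles(serial), dual_issue_cycles(interleaved))  # 4 2
```

The same instruction count runs in half the cycles when the compiler (or programmer) exposes independent work back-to-back, which is exactly the best-case/worst-case spread described above.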

Meanwhile, starting with GF110 NVIDIA began revising the low-level design of their GPUs for production purposes. NVIDIA’s choice of transistors with GF10x was suboptimal: leaky transistors ended up in functional units (and parts thereof) where NVIDIA didn’t want them, limiting the number of functional units they could utilize and the overall performance they could achieve within the power envelopes they were targeting.

For GF110 NVIDIA focused on better matching the types of transistors they used with what a block needed, allowing them to reduce leakage on parts of the chip that didn’t require such fast & leaky transistors. This meant not only replacing fast leaky transistors with slower, less leaky transistors in parts of the chip that didn’t require such fast transistors, but also introducing a 3rd mid-grade transistor that could bridge the gap between fast/slow transistors. With 3 speed grades of transistors, NVIDIA was able to get away with only using the leakiest transistors where they needed to, and could conserve power elsewhere.
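A back-of-the-envelope sketch shows why the third speed grade matters. The numbers below are purely illustrative (NVIDIA does not publish per-grade leakage figures): each grade supports a maximum switching speed at a relative leakage cost, and each block is assigned the slowest grade that still meets its speed requirement.

```python
# Illustrative grades only, not NVIDIA's actual figures:
# (name, max supported switching speed, relative leakage cost)
GRADES = [
    ("slow", 1.0, 1.0),
    ("mid",  1.5, 2.5),
    ("fast", 2.0, 6.0),
]

def total_leakage(block_speeds, grades):
    """Assign each block the slowest (least leaky) grade that meets its
    required speed, and return the summed relative leakage."""
    leakage = 0.0
    for need in block_speeds:
        # grades are sorted slow -> fast, so take the first that suffices
        grade = next(g for g in grades if g[1] >= need)
        leakage += grade[2]
    return leakage

blocks = [1.0, 1.0, 1.4, 1.4, 2.0]            # required speed per block
all_fast = total_leakage(blocks, GRADES[-1:])  # only fast transistors available
three_grades = total_leakage(blocks, GRADES)   # slow/mid/fast mix
print(all_fast, three_grades)
```

With only fast transistors every block pays the full leakage cost; with three grades, only the one block that genuinely needs the fastest transistors does, which is the power NVIDIA can then conserve or spend on clocks.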


A typical CMOS transistor: thin gate dielectrics lead to leakage

GF110 wasn’t the only chip to see this kind of optimization however; the rest of the GF11x line is getting the same treatment. GF114 is in a particularly interesting position since, as a smaller GPU, its predecessor GF104 wasn’t as badly affected. Though we can’t speak to enabling additional functional units, at the clockspeeds and voltages NVIDIA was targeting we did not have any issues at stock voltage. In short, while GF100 suffered notably from leakage, GF104 either didn’t suffer from it or did a good job of hiding it. For this reason GF114 doesn’t necessarily stand to gain the same benefit.

As we touched on in our introduction, NVIDIA is putting their gains here into performance rather than power consumption. The official TDP is going up by 10W, while performance is going up anywhere between 10% and 40%. This is the only difference compared to GF104, as GF114 does not contain any architectural changes (GF110’s changes were backported from GF104). Everything we see today will be a result of a better built chip.
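Those two figures still imply a perf-per-watt improvement. Taking the GTX 460 1GB’s official 160W TDP as the baseline (so 170W after the quoted +10W), the arithmetic works out as follows:

```python
# Rough perf/W arithmetic for the quoted figures: TDP rises by 10W
# (assuming the GTX 460 1GB's 160W official TDP as the baseline),
# while performance rises 10-40% depending on the workload.
base_tdp, new_tdp = 160.0, 170.0

def perf_per_watt_gain(speedup):
    """Relative change in performance-per-watt for a given speedup."""
    return (speedup / new_tdp) / (1.0 / base_tdp) - 1.0

low = perf_per_watt_gain(1.10)   # +10% performance
high = perf_per_watt_gain(1.40)  # +40% performance
print(f"{low:+.1%} to {high:+.1%} perf/W")
```

So even in the worst case efficiency improves slightly (about +3.5%), and in the best case by roughly a third, despite the higher TDP.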

Comments

  • SolMiester - Wednesday, January 26, 2011 - link

Hi, can we please have some benchies on this feature please... While Surround is not an option for a single nVidia card, 3D is, and it is hard to judge performance with this enabled. Readers need to know if 3D is a viable option with this card at perhaps 16x10 or 19x10

    Ta
  • DarknRahl - Wednesday, January 26, 2011 - link

I agree with some of the other comments and find the conclusion rather odd. I didn't really get why all the comparisons were done with the 460 either; yes, this is the replacement card for the 460, but that isn't really that relevant as far as a purchasing decision goes at this moment in time. I found HardOCP's review far more informative, especially as they get down to brass tacks: the price to performance. In both instances the 560 doesn't make too much sense.

    I still enjoy reading your reviews of other products, particularly power supplies and CPUs.
  • kallogan - Thursday, January 27, 2011 - link

Put those damn power connectors at the top of the graphics card, think about mini-ITX users!!! Though this one is 9 inches, it should fit in my Sugo SG05 anyway.
  • neo_moco - Thursday, January 27, 2011 - link

    I really don't understand the conclusion:

    The Radeon 6950 wins hands down at 1920x and 2560x in almost all the important games: Crysis, Metro, Bad Company 2, Mass Effect 2 and Wolfenstein

    The GeForce only wins the games that not a lot of people play: Civ 5, BattleForge, HAWX, DiRT 2

    Add to that other tests: the 6950 wins in the popular Call of Duty, and in Vantage.
    In 3DMark 11 the GeForce is 15% weaker (Guru3D), so the conclusion as I see it: the Radeon 6950 1GB is approx. 5-10% better and not the other way around.
  • neo_moco - Thursday, January 27, 2011 - link

15 months after the Radeon 5850 launch we get a card 10% better for the same price; I don't get the enthusiasm of some people over these cards; it's kind of lame
  • HangFire - Thursday, January 27, 2011 - link

    "By using a center-mounted fan, NVIDIA went with a design that recirculates some air in the case of the computer instead of a blower that fully exhausts all hot air."

    I don't think I've ever seen a middle or high end NVIDIA card that fully exhausted all hot air. Maybe it exists, I certainly haven't owned them all, perhaps in the customized AIB vendor pantheon there have been a few.

    This is not just a nitpick. When swapping out an 8800GT to an 8800 Ultra a few years back, I thought I was taking a step forwards in case temperatures, because the single-slot GT was internally vented and the Ultra had that second slot vent. I didn't notice the little gap, or the little slots in the reference cooler shroud.

    That swap began a comical few months of debugging and massive case ventilation upgrades. Not just the Ultra got hot, everything in that case got hot. Adding a second 120mm input fan and another exhaust, an across-the-PCI-slots 92mm fan finally got things under control (no side panel vent). Dropping it into a real gamer case later was even better. (No, I didn't pay $800 for the Ultra, either).

    I'm not a fan of, um, graphics card fans that blow in two directions at once, I call them hard drive cookers, and they can introduce some real challenges in managing case flow. But I no longer run under the assumption that a rear-venting double slot cooler is necessarily better.

    I'd like to see some case-cooker and noise testing with the new GTX 560 Ti reference, some AIB variants, and similar wattage rear venting cards. In particular, I'd like to see what temps a hard drive forward of the card endures, above, along, and below.
  • WRH - Saturday, January 29, 2011 - link

    I know that comparing on-CPU graphics with GPU cards like the ones on this list would be like comparing a Prius to a Ferrari, but I would like to see the test results just the same. Sandy Bridge's on-CPU GPUs come in two models (2000 and 3000). It would be interesting to see just how far below the Radeon HD 4870 they are, and how the 2000 and 3000 compare.
