
Thread: Dual Graphics Cards -- Alternating frames

  1. #17
    Artic_Kid
    Quote Originally Posted by kalniel
    I omitted a rather more modern use of render to texture - rendering a cube reflection map.

    You create a cube with 6 sides made up of textures from a scene rendered in 6 orientations. This is 'placed' around an object and light reflected from it to give you the impression the object is reflecting the area it's in. The resolution of the texture obviously depends on the size of the object, but you also need 6 of them.
    That's interesting! Thanks kalniel.

    If I may add another question. First a quick review. I'm trying to get a feeling for the value (or not) of dual GPUs (SLI, Crossfire) -- and Alternating Frames (AFR mode) seems to be a fair solution for achieving a full doubling of processing power in a simple way. The down side of the AFR mode, it is said, is that it has trouble with the "render to texture function" because it cannot pass a texture from one frame to the next when the frames are processed on different boards. I'm questioning that, and looking for ways around that difficulty -- so as to save AFR mode, and hence save SLI as an efficient venture.

    So back to my question. The application that you described (rendering a cube reflection map) uses the "render to texture function", however, it doesn't require that texture to be passed to the next frame. Rather, the entire process can occur within one frame, on one GPU card, and hence there is no difficulty whatever for AFR mode in this particular application (rendering a cube reflection map). Do I have that right?

    In other words, it's not the "render to texture function" itself that causes difficulty, rather it is the passing of textures from one frame (or GPU board) to another that causes difficulty.
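
    To make the one-frame idea above concrete, here is a minimal sketch of such a cube-map pass. The types and the renderSceneToTexture helper are hypothetical stand-ins rather than a real graphics API; the point is only that all six faces are rendered and consumed inside a single frame:
    Code:
    #include <array>

    struct Vec3 { float x, y, z; };
    struct Texture { /* in a real engine the pixels live on the GPU */ };

    // Hypothetical stand-in for an engine's render-to-texture facility:
    // draw the scene from 'eye' looking along 'dir' into an offscreen texture.
    Texture renderSceneToTexture(Vec3 eye, Vec3 dir) { return Texture{}; }

    std::array<Texture, 6> buildReflectionCubeMap(Vec3 objectCentre) {
        // One face per axis direction: +X, -X, +Y, -Y, +Z, -Z.
        const std::array<Vec3, 6> dirs = {{
            {1,0,0}, {-1,0,0}, {0,1,0}, {0,-1,0}, {0,0,1}, {0,0,-1}
        }};
        std::array<Texture, 6> faces;
        for (int i = 0; i < 6; ++i)
            faces[i] = renderSceneToTexture(objectCentre, dirs[i]);
        // All six faces are built and consumed within the same frame, so
        // nothing is handed to the next frame (or to the other GPU).
        return faces;
    }

    int main() { buildReflectionCubeMap({0, 0, 0}); }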

  2. #18
    Artic_Kid
    Quote Originally Posted by Butcher
    Another thing you might end up doing is doing something like a cloth simulation using the GPU. This can output a relatively large amount of data which is needed on the next frame.
    Butcher, thanks for your insights, they are helpful.

    What is a "cloth simulation"? And why is it needed in the next frame? With careful use, can it perhaps be processed and used solely within one frame -- thereby avoiding any problem for AFR mode?

    Quote Originally Posted by Butcher
    The problem is you'd have to code the game with SLI in mind to do that. Most (currently) are not.
    Assuming that SLI can be saved as an efficient approach, game developers would have to start coding their games with SLI in mind. Just as most software developers will now have to code with dual-core processors in mind. It's where things are going. The motive is there (a full doubling of processing power in AFR mode), the question is: Can/will game developers make more use of these techniques?

    But I'm trying to figure out whether SLI can be saved as an efficient approach. If it gives an improvement of only eleven percent (on FarCry, 1600x1200, with 8xAA, on a 7800GT), then SLI scarcely seems worth it.

    Part of my question, therefore, is whether Alternating Frames (AFR mode) can be (or will be) used more often by game developers, rather than the split-screen mode. Are there viable techniques for reducing the number or size of textures passed from frame-to-frame (from GPU-to-GPU), and for improving the timing of passing those? Would such techniques be used by game developers as a means of getting their games to look/run better on SLI systems?

    Quote Originally Posted by Butcher
    Also it's not necessarily feasible to do this amount of rearrangement. There are certain things you have to do in certain orders for them to work correctly. Rendering last isn't usually an option. E.g. if the game is using HDR then it will have to perform tone mapping and such at the end of the frame; also, the alpha blended objects are generally drawn after the opaque ones, which adds more dependencies. And of course any FSAA is always done last. These sorts of dependencies will often push the use of the rendered texture forwards in the frame and make overlap impossible.
    Assuming you're right, you paint a bleak picture for AFR mode. Which again causes me to wonder whether SLI can be saved as an efficient approach. I'm not doubting the use of SLI in achieving marginal improvements and attaining bleeding-edge performance. Rather, I'm doubting its cost-effectiveness -- bang for the buck.

    If you were picking out a new system (which I am in the process of doing), why invest in an SLI board and a vain hope for the future, rather than invest instead in a better graphics card now? If SLI gives a performance improvement of eleven percent or so, then there is no likely future price reduction that can turn a future purchase of the second card into a better decision now. It's all a pipe dream -- marketing hype.

    I'm excited about SLI, I'm even leaning towards it (vain optimist that I am). But I'm also very doubtful about it. Can you lend some insight here?

  3. #19
    Butcher
    Cloth simulation is making a piece of geometry look like a piece of cloth. It's mainly concerned with the way it deforms (so it will drape across objects and fall into folds etc. like cloth does). It's computationally expensive, so offloading it from the CPU to the GPU is generally a good thing to do (since the GPU is much better at these sorts of massively parallel operations). A common way to do it is to calculate one frame ahead. So in a single frame the GPU is both drawing the current frame's cloth sim data and generating the next frame's data. Doing it this way means you can render cloth early without having to wait for the sim to complete. Also due to the way GPUs work it's often a hit to use a texture very soon after updating it, which this approach avoids.
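
    To illustrate that scheduling, here is a single-threaded toy model of the ping-pong (the buffer layout and the one-line 'simulation' are invented for the sketch; a real engine would run the sim pass on the GPU):
    Code:
    #include <cstdio>
    #include <vector>

    using ClothState = std::vector<float>; // stand-in for vertex positions

    ClothState simulate(const ClothState& prev) { // the "GPU" sim pass
        ClothState next = prev;
        for (float& v : next) v += 0.1f;          // toy integration step
        return next;
    }

    int main() {
        ClothState buffers[2] = {ClothState(4, 0.0f), ClothState(4, 0.0f)};
        int drawIdx = 0; // buffer finished during the PREVIOUS frame
        for (int frame = 0; frame < 3; ++frame) {
            // Draw using last frame's completed result...
            std::printf("frame %d draws cloth value %.1f\n",
                        frame, buffers[drawIdx][0]);
            // ...while the sim fills the OTHER buffer for the next frame,
            // so the draw never touches freshly written data.
            buffers[1 - drawIdx] = simulate(buffers[drawIdx]);
            drawIdx = 1 - drawIdx; // ping-pong
        }
    }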

    As for whether you need AFR to realise significant gains, I don't think that's necessarily the case. AFR does give the best improvement, but I think you're getting too hung up on that one FarCry result. Many games using SFR gain a lot more than 11% - the typical gain in SFR tends to be in the 50-60% range, which is certainly noticeable.

    Something to realise is that, while each card does have to transform all the geometry, thus doubling the effort there, generally the vertex processing side of things is idle for large amounts of the frame. A typical game could do 3-4 times more vertex processing without a hit, so having to do twice as much in SLI mode isn't that large a hit. GPUs tend to run out of pixel power well before vertex power, and the increasing use of per-pixel effects is only going to exacerbate this.
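
    A toy calculation of why that duplicated vertex work is cheap when a frame is pixel-bound. The millisecond figures are assumptions, and the model pretends the vertex and pixel stages run back-to-back, which real pipelines overlap:
    Code:
    #include <cstdio>

    int main() {
        const double vertexMs = 2.0;  // assumed vertex-processing time
        const double pixelMs  = 14.0; // assumed pixel-processing time
        double single  = vertexMs + pixelMs;
        // Each SFR card: the FULL vertex work, but half the pixel work.
        double sfrCard = vertexMs + pixelMs / 2.0;
        std::printf("single: %.1f ms/frame, SFR: %.1f ms/frame (+%.0f%%)\n",
                    single, sfrCard, (single / sfrCard - 1.0) * 100.0);
        // 16 ms vs 9 ms: ~78% faster despite duplicating the vertex
        // work; the more pixel-bound the frame, the closer to 2x.
    }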

    In terms of faster passing of textures and the like, it's not something a developer has control over. That sort of thing is handled by the driver, so it's entirely up to nvidia/ATI how things are handled.


    Just to add, I run SLI currently and it's definitely more than 11%, even on SFR games. One thing that SLI is very good at is allowing more FSAA and higher resolutions to be used, and that's not really affected by AFR/SFR modes.

  4. #20
    Artic_Kid
    Quote Originally Posted by Butcher
    Cloth simulation is making a piece of geometry look like a piece of cloth. It's mainly concerned with the way it deforms (so it will drape across objects and fall into folds etc. like cloth does). It's computationally expensive, so offloading it from the CPU to the GPU is generally a good thing to do (since the GPU is much better at these sorts of massively parallel operations).
    Thanks Butcher, that is helpful.

    Quote Originally Posted by Butcher
    A common way to do it is to calculate one frame ahead. So in a single frame the GPU is both drawing the current frame's cloth sim data and generating the next frame's data. Doing it this way means you can render cloth early without having to wait for the sim to complete. Also due to the way GPUs work it's often a hit to use a texture very soon after updating it, which this approach avoids.
    It seems self-evident that a cloth simulation takes about the same amount of time whether the processing occurs step-A then step-B, versus the other way around (step-B, then step-A, which is then passed into the next frame). But the advantage of the latter approach, if I understand you correctly, is that it allows better timing of when the texture can be used. Is it plausible that, with careful programming, a suitable timing can be reached for the former approach (step-A, followed by step-B, with no need to pass the texture into the next frame)?

    Quote Originally Posted by Butcher
    As for whether you need AFR to realise significant gains, I don't think that's necessarily the case. AFR does give the best improvement, but I think you're getting too hung up on that one FarCry result. Many games using SFR gain a lot more than 11% - the typical gain in SFR tends to be in the 50-60% range, which is certainly noticeable.
    I admit I'm not looking at the general case. Rather, in my case, I'm looking specifically at 1600x1200 res, and with 8xAA and AF. The 8xAA is especially where FarCry showed only eleven percent improvement in SLI. But many of the websites don't benchmark 8xAA (with and without SLI) for some reason. So I'm now casting about looking for such results. That is, I'm not yet chalking this up to FarCry specifically, rather my working hypothesis is that 8xAA (at 1600x1200) is difficult for SLI. Is that conceivable?

    Quote Originally Posted by Butcher
    Something to realise is that, while each card does have to transform all the geometry, thus doubling the effort there, ...
    That suggests that the division of labor is occurring at the wrong place, thus resulting in duplication of effort on the two graphics cards. Your statement suggests a more efficient way to do graphics. That is, the CPU does its work, then another card transforms all the geometry (once and for all), then that result is passed to two (or more) GPUs that finish the rendering. This would reduce the duplication of effort, and reduce costs. Perhaps the first card could be much like the existing cards today, while the additional cards are specialized and lack the part that "transforms all the geometry" (since they would get that information from the first card). Is that a plausible approach?

    Quote Originally Posted by Butcher
    generally the vertex processing side of things is idle for large amounts of the frame. A typical game could do 3-4 times more vertex processing without a hit, so having to do twice as much in SLI mode isn't that large a hit. GPUs tend to run out of pixel power well before vertex power, and the increasing use of per-pixel effects is only going to exacerbate this.
    (I presume the vertex processors do the "transform all the geometry" that you mentioned previously.)

    That suggests another drawback to the current method of doing SLI. That is, the game can be CPU limited (so SLI doesn't help) -- or the game can be vertex processing limited, in which case SLI doesn't help (because that work is needlessly duplicated on both cards, with no gain in performance to show for it). I understand that you feel that the vertex processing sits idle much of the time and is not near its limits -- but I imagine that depends on the game and the GPU.

    In other words, SLI helps only when the game-system is neither CPU-limited, nor vertex processor limited -- otherwise SLI does not help. Do I have that right?

    Quote Originally Posted by Butcher
    In terms of faster passing of textures and the like, it's not something a developer has control over. That sort of thing is handled by the driver, so it's entirely up to nvidia/ATI how things are handled.
    I understand. I was referring to the timing of the passing of textures. I believe the timing is controlled by the game developers, and that this could be done more efficiently -- and thereby allow for AFR mode (even when using the render to texture function).

    Quote Originally Posted by Butcher
    Just to add, I run SLI currently and it's definitely more than 11%, even on SFR games. One thing that SLI is very good at is allowing more FSAA and higher resolutions to be used, and that's not really affected by AFR/SFR modes.
    I'm going by intuition now, but it seems to me that filtering (such as AA and AF) requires information from the surrounding pixels, and that requires the two cards (in SFR mode) to overlap their boundaries across the split screen -- resulting in some duplication of effort in the mid-screen area. Is that plausible?

    Thanks again for your help Butcher.

  5. #21
    Butcher
    Quote Originally Posted by Artic_Kid
    It seems self-evident that a cloth simulation takes about the same amount of time whether the processing occurs step-A then step-B, versus the other way around (step-B, then step-A, which is then passed into the next frame). But the advantage of the latter approach, if I understand you correctly, is that it allows better timing of when the texture can be used. Is it plausible that, with careful programming, a suitable timing can be reached for the former approach (step-A, followed by step-B, with no need to pass the texture into the next frame)?
    It's plausible, though it would depend on the actual game in question whether it's possible to do that sort of rearrangement. And it may hurt performance on single cards, or even on SLI setups.


    Quote Originally Posted by Artic_Kid
    I admit I'm not looking at the general case. Rather, in my case, I'm looking specifically at 1600x1200 res, and with 8xAA and AF. The 8xAA is especially where FarCry showed only eleven percent improvement in SLI. But many of the websites don't benchmark 8xAA (with and without SLI) for some reason. So I'm now casting about looking for such results. That is, I'm not yet chalking this up to FarCry specifically, rather my working hypothesis is that 8xAA (at 1600x1200) is difficult for SLI. Is that conceivable?
    That's not really the problem. I think the main issue is that FarCry is poorly suited to SLI. To give a contrasting viewpoint, HL2 on my rig runs roughly twice as fast with SLI enabled, at 1920x1200, 8xAF, 2xAA. That's a large improvement.


    Quote Originally Posted by Artic_Kid
    That suggests that the division of labor is occurring at the wrong place, thus resulting in duplication of effort on the two graphics cards. Your statement suggests a more efficient way to do graphics. That is, the CPU does its work, then another card transforms all the geometry (once and for all), then that result is passed to two (or more) GPUs that finish the rendering. This would reduce the duplication of effort, and reduce costs. Perhaps the first card could be much like the existing cards today, while the additional cards are specialized and lack the part that "transforms all the geometry" (since they would get that information from the first card). Is that a plausible approach?
    It's possible, though adding an extra transform card in adds latency. Passing data from card to card is very slow and generally avoided. In a typical game 95% of the data sits on the GFX card all the time (video memory stores geometry as well as textures on modern cards). Vertex processing isn't slow enough that it's worth splitting it off in general.


    Quote Originally Posted by Artic_Kid
    That suggests another drawback to the current method of doing SLI. That is, the game can be CPU limited (so SLI doesn't help) -- or the game can be vertex processing limited, in which case SLI doesn't help (because that work is needlessly duplicated on both cards, with no gain in performance to show for it). I understand that you feel that the vertex processing sits idle much of the time and is not near its limits -- but I imagine that depends on the game and the GPU.
    I'm not aware of any game made since cards got hardware transform engines that's been vertex limited. It's very very hard to do. Generally you'll run out of CPU first, or if running sufficiently high res with lots of AA you'll run out of pixel processing. If you hit the vertex limit you can generally bump the res up and become pixel limited (same for cpu limited).

    Quote Originally Posted by Artic_Kid
    In other words, SLI helps only when the game-system is neither CPU-limited, nor vertex processor limited -- otherwise SLI does not help. Do I have that right?
    Yes you have that right. That's actually quite a common scenario when running with very high quality settings. Bigger screens and more AA mean you have to fill more pixels, which puts a large strain on the pixel processing. There's a reason there are 16 pixel pipes and 6 vertex pipes on current cards.


    Quote Originally Posted by Artic_Kid
    I understand. I was referring to the timing of the passing of textures. I believe the timing is controlled by the game developers, and that this could be done more efficiently -- and thereby allow for AFR mode (even when using the render to texture function).
    It's controlled by the driver. SLI looks like a single card to the game developer, you have no way to tell if there are two in SLI there or not really. Additionally, devs don't really have fine grained control of texture transfers in any event (unless you're coding for a console). The general method of writing a 3d game is to send a big bunch of work to the card and let it get on with it. When things actually occur is not generally something you care about (and it's a hit to check).


    Quote Originally Posted by Artic_Kid
    I'm going by intuition now, but it seems to me that filtering (such as AA and AF) requires information from the surrounding pixels, and that requires the two cards (in SFR mode) to overlap their boundaries across the split screen -- resulting in some duplication of effort in the mid-screen area. Is that plausible?
    For AA, yes that's correct. I'm not exactly sure how they do it, but I'd imagine it's at most one half of a filter kernel of overlap, which is about 4 pixels max (maybe as many as 8 with 16xAA). Overlapping 4 rows of pixels from 1200 is a minor hit.
    AF is purely a texture space filter, and both cards have the complete texture set, so no inter-card transfer is required.
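
    Putting rough numbers on that overlap cost (assumed values, following the 1200-line example above):
    Code:
    #include <cstdio>

    int main() {
        const int rowsPerCard = 1200 / 2; // SFR: each card owns half the rows
        const int overlapRows = 4;        // ~half an AA filter kernel (above)
        std::printf("duplicated work: %.2f%% of each card's half\n",
                    100.0 * overlapRows / rowsPerCard);
        // ~0.67% -- negligible, which is why the SFR AA overlap barely
        // matters in practice.
    }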

  6. #22
    Artic_Kid
    Quote Originally Posted by Butcher
    To give a contrasting viewpoint, HL2 on my rig runs roughly twice as fast with SLI enabled, at 1920x1200, 8xAF, 2xAA. That's a large improvement.
    That's encouraging news. Thanks for the help Butcher.

    You didn't mention whether that spec was for AFR mode, or SFR mode.

    I was just over at THG, where they compared single cards with and without SLI. I was surprised to see that four of their seven games showed a decrease in performance with SLI. Here are their results for a 7800GTX(256M):

    GAME .... PERFORMANCE Gain or Loss (in %)
    Age of Empires -18.3%
    Fable - The Lost Chapters -11%
    Quake 4 +19%
    Serious Sam 2 with HDR -10%
    Serious Sam 2 -4%
    Black & White 2 +45%
    F.E.A.R. +40%

    For an average improvement of only about 8.7%! Then on their next page they give the average improvement from that same GPU in SLI mode as merely 26%! These results are less encouraging.

    Perhaps those are GIGO results? Or perhaps these current games are not yet optimized for SLI, and we may yet see improvements in the future ...? Or perhaps not. I don't have a feeling for it yet.

    Quote Originally Posted by Butcher
    SLI looks like a single card to the game developer, you have no way to tell if there are two in SLI there or not really.
    That's like dual-core processors. The application developers don't know (at the time they are writing software) whether it will run on a single-core or dual-core processor. So they will (soon) be writing their software to take advantage of dual-core (if it be present), and then let the given operating system make it happen on whatever processor(s) are present. I imagine it will be the same with SLI. Developers will write more for the SLI option. ...?

    Quote Originally Posted by Butcher
    Additionally, devs don't really have fine grained control of texture transfers in any event (unless you're coding for a console). The general method of writing a 3d game is to send a big bunch of work to the card and let it get on with it.
    Even though you send a "big bunch of work to the card and let it get on with it" I doubt the card does much re-arranging, re-ordering, re-optimizing, of the sequence of events. Maybe a little, but not much. Rather, I suspect the game developer still has a lot of control over the sequencing of events, and therefore can do a better job of optimizing for SLI.

    Which goes to my question: How much improvement (in the future game development) can we expect from SLI? Is the cost/performance worth it? Or should one simply invest instead in a great single GPU card, and save money by going with a non-SLI board?

  7. #23
    kalniel
    Well firstly.. I've begun questioning Tom's recently. They used to be great, but more recently.. meh.

    Secondly, SLI is only ever going to show an improvement where you are GPU limited. What the modern cards have shown us is simply that with most games you are at least partially limited by the system when using these uber cards. SLI *does* have some small system overhead - if the system was the bottleneck, then you've just put even more load on it and will see a decrease in performance.

    As for your final question - that's the biggy.
    Quote Originally Posted by Artic_Kid
    Which goes to my question: How much improvement (in the future game development) can we expect from SLI? Is the cost/performance worth it? Or should one simply invest instead in a great single GPU card, and save money by going with a non-SLI board?
    On a purely GPU limited game, we can expect up to 100% improvement from SLI. Whether future games are going to be GPU limited or not we can't say. In my experience the trend tends to go more towards GPU limited than not - GPU hardware changes faster than CPU and games can most quickly take advantage of this.

    Is the cost worth it? IMO, no. The best use of SLI is not with a new card, which is likely to handle games that come out at the same time, but with older cards that create a GPU bottleneck. The problem, though, is that cards don't just increase in speed, they increase in features. It only takes a new shader model (SM) to start being used by games and you'll need a new generation card to take advantage of it. And even then, one new card that performs twice as well as the old one is likely to cost less than two of the old ones.

    There is one situation where SLI *is* cost effective - buying one card now, then further down the line picking up a second one second hand, when your single card has become a bottleneck for games. There the performance increase vs the cost of a whole new card *is* worth it.

    But SLI adds cost to the second hand market in that there are more buyers interested in picking up a second card too.

  8. #24
    Butcher
    Quote Originally Posted by Artic_Kid
    That's encouraging news. Thanks for the help Butcher.

    You didn't mention whether that spec was for AFR mode, or SFR mode.
    It was using the default HL2 profile, so AFR. I had a quick look through the nvidia sli profiles and the vast majority of games are AFR.

    Quote Originally Posted by Artic_Kid
    I was just over at THG, where they compared single cards with and without SLI. I was surprised to see that four of their seven games showed a decrease in performance with SLI. Here are their results for a 7800GTX(256M):

    GAME .... PERFORMANCE Gain or Loss (in %)
    Age of Empires -18.3%
    Fable - The Lost Chapters -11%
    Quake 4 +19%
    Serious Sam 2 with HDR -10%
    Serious Sam 2 -4%
    Black & White 2 +45%
    F.E.A.R. +40%

    For an average improvement of only about 8.7%! Then on their next page they give the average improvement from that same GPU in SLI mode as merely 26%! These results are less encouraging.

    Perhaps those are GIGO results? Or perhaps these current games are not yet optimized for SLI, and we may yet see improvements in the future ...? Or perhaps not. I don't have a feeling for it yet.
    It's Tom's so almost certainly GIGO.



    Quote Originally Posted by Artic_Kid
    That's like dual-core processors. The application developers don't know (at the time they are writing software) whether it will run on a single-core or dual-core processor. So they will (soon) be writing their software to take advantage of dual-core (if it be present), and then let the given operating system make it happen on whatever processor(s) are present. I imagine it will be the same with SLI. Developers will write more for the SLI option. ...?
    You code differently for dual core - you would tend to try and have (at least) 2 threads running with useful work on them to keep both cores running.
    SLI by contrast is entirely transparent to the dev. There is no concept of threads for GPUs, you just have a single queue of work and it ploughs through it. Whether there is one card or 10 behind there makes no difference to the way you code.
    That makes SLI easy on the developer, but gives much less opportunity for tuning your app for best SLI performance.


    Quote Originally Posted by Artic_Kid
    Even though you send a "big bunch of work to the card and let it get on with it" I doubt the card does much re-arranging, re-ordering, re-optimizing, of the sequence of events. Maybe a little, but not much. Rather, I suspect the game developer still has a lot of control over the sequencing of events, and therefore can do a better job of optimizing for SLI.
    The card doesn't resequence events. The GPU/driver guarantee that submission order is draw order. However, GPUs are very deeply pipelined (several hundred stages), so there can be many jobs going on at the same time at various stages of the pipeline. If you aren't careful about where you use things you'll stall the pipeline waiting for a previous operation to complete.
    For example rendering a texture and immediately using it will stall because the second command wants to start well before the first is completed, but they are dependent on each other. Stalls are generally very bad for performance.
    So yes, potentially a dev could optimise for SLI better, but without knowing exactly how the SLI is working under the covers it's a difficult prospect.
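
    A toy timeline of that kind of stall, assuming a 300-stage pipeline in the range mentioned above:
    Code:
    #include <cstdio>

    int main() {
        const int pipelineDepth = 300; // stages a command traverses (assumed)
        // Independent commands can enter the pipeline one step apart,
        // so command 2 would normally finish one step after command 1.
        int renderDone = pipelineDepth;       // cmd 1: render to texture
        int independentDone = 1 + pipelineDepth;
        // A command that READS that texture cannot enter until the
        // render has drained the whole pipeline:
        int dependentStart = renderDone;
        int dependentDone = dependentStart + pipelineDepth;
        std::printf("independent: t=%d, dependent: t=%d\n",
                    independentDone, dependentDone);
        // 301 vs 600: the dependency roughly doubles the time for the pair.
    }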

    Quote Originally Posted by Artic_Kid
    Which goes to my question: How much improvement (in the future game development) can we expect from SLI? Is the cost/performance worth it? Or should one simply invest instead in a great single GPU card, and save money by going with a non-SLI board?
    Difficult to say. If most games continue to be AFR games then SLI gives about a 95% speed boost, which is well worth it. If more games go to SFR where the gains are closer to 50% then it's a bit more marginal.
    In general I would say only consider SLI if you're going for top end cards such that there isn't a better single card. E.g. at present I wouldn't recommend SLI for anyone considering less than dual 7800GTs.

  9. #25
    Artic_Kid
    Quote Originally Posted by Butcher
    It was using the default HL2 profile, so AFR. I had a quick look through the nvidia sli profiles and the vast majority of games are AFR.
    That is encouraging news. I was previously under the (false) impression that most games are SFR mode. Are most new games likely to go with AFR mode? If so, then that will improve the outlook for SLI considerably.

    Quote Originally Posted by Butcher
    It's Tom's so almost certainly GIGO.
    Has Tom's Hardware Guide fallen into some disrepute? Here's why I ask. Just a couple of days ago someone here told me that Antec power supplies are no longer so good (and that Seasonic is better), based on a review given at THG. Can I trust THG concerning their power supply reviews, but not other things? Any help on understanding that?

    Quote Originally Posted by Butcher
    SLI by contrast is entirely transparent to the dev. There is no concept of threads for GPUs, you just have a single queue of work and it ploughs through it. Whether there is one card or 10 behind there makes no difference to the way you code.
    Yes, I understand. I was making an analogy -- that developers will be forced to optimize for SLI, just as they will be forced to optimize for dual-core CPUs -- though the optimization techniques are obviously different.

    In SLI, it seems to me, the optimization comes largely through better sequencing of the tasks to be done. And the game developer does have control over that.

    Quote Originally Posted by Butcher
    GPUs are very deeply pipelined (several hundred stages), so there can be many jobs going on at the same time at various stages of the pipeline. If you aren't careful about where you use things you'll stall the pipeline waiting for a previous operation to complete.
    For example rendering a texture and immediately using it will stall because the second command wants to start well before the first is completed, but they are dependent on each other. Stalls are generally very bad for performance.
    Thanks Butcher, that pipelining insight is helpful.

    Quote Originally Posted by Butcher
    Difficult to say. If most games continue to be AFR games then SLI gives about a 95% speed boost, which is well worth it. If more games go to SFR where the gains are closer to 50% then it's a bit more marginal.
    That makes sense, and I've been drifting toward the same conclusion.

    Quote Originally Posted by Butcher
    In general I would say only consider SLI if you're going for top end cards such that there isn't a better single card. E.g. at present I wouldn't recommend SLI for anyone considering less than dual 7800GTs.
    Yes, I'm coming to the same conclusion. Thanks Butcher.

  10. #26
    Artic_Kid
    There is a drawback of Alternating Frames (AFR mode) that should be discussed -- delay. AFR doubles the processing power per frame, but delays the frame by one frame period -- approximately 1/60th of a second. That's not much, but perhaps it's enough to make people motion-sick, and give headaches, etc.

    Does anyone here have some experience with that? Does SLI (operating in AFR mode) increase motion-sickness and headaches, etc.?

  11. #27
    Butcher
    Quote Originally Posted by Artic_Kid
    There is a drawback of Alternating Frames (AFR mode) that should be discussed -- delay. AFR doubles the processing power per frame, but delays the frame by one frame period -- approximately 1/60th of a second. That's not much, but perhaps it's enough to make people motion-sick, and give headaches, etc.

    Does anyone here have some experience with that? Does SLI (operating in AFR mode) increase motion-sickness and headaches, etc.?
    Actually AFR SLI is generally less latent than single GPU rendering...

    Consider a game running 2 frames ahead of its rendering. With a single GPU the latency is then, obviously enough, 2 frames.
    With AFR SLI each frame takes the same time, but because they are overlapped 50% the delay is only 1.5 frames.

    If anything SLI will complete frames faster than a single GPU and so feel less latent.
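
    In numbers, under those assumptions (a game queued two frames ahead, each frame taking one full GPU-frame to draw):
    Code:
    #include <cstdio>

    int main() {
        const double gpuFrame = 1.0; // time one GPU spends drawing a frame
        // Single GPU: frames complete one gpuFrame apart, so running the
        // game two frames ahead means two full frames of delay.
        double singleLatency = 2.0 * gpuFrame;
        // AFR: the two GPUs run 50% overlapped, so a frame completes
        // every 0.5 gpuFrame even though each one still takes 1.0 to
        // draw -- a queued frame waits 0.5 for a free GPU, then draws.
        double afrLatency = 0.5 * gpuFrame + 1.0 * gpuFrame;
        std::printf("single GPU: %.1f frames, AFR SLI: %.1f frames\n",
                    singleLatency, afrLatency);
    }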

    NB: You may find this presentation helpful: http://download.nvidia.com/developer...ed_Day/SLI.pdf

  12. #28
    Artic_Kid
    Quote Originally Posted by Butcher
    NB: You may find this presentation helpful: http://download.nvidia.com/developer...ed_Day/SLI.pdf
    Thanks Butcher, that link is helpful. It explains a lot.

    Quote Originally Posted by Butcher
    Actually AFR SLI is generally less latent than single GPU rendering...
    Your statement is explained more fully in the link you gave. I wouldn't have understood it without that link.

    That link claims the latency is typically reduced by going from single-GPU to SLI-AFR mode -- however, it assumes the frame rate doubles. That comparison holds in theory, but it is not how people experience the tradeoff in practice, and so is not the proper comparison, in my view. The typical user does not compare two systems that differ by a doubling of the frame rate, and does not move to SLI in order to double the frame rate. For example, 30 frames per second is an unusable system, so comparing an unusable system with a usable one is not how people experience the tradeoff. Rather, they tune any given system to work best, then compare those with and without SLI -- and at that point, doubling the frame rate would likely be impossible (even with SLI) because CPU-limiting would prevent it. On the other hand, successfully doubling the frame rate to anything much above 60 frames per second would go to waste and not be experienced as an improvement. The range of usable frame rates is too narrow, and does not span a factor of 2 to 1, so that is not the proper comparison to make.

    I'm saying that the typical user considers SLI in order to increase image quality (to higher resolutions and/or higher AA, AF, & HDR) -- at nominally the same frame rate. It goes approximately like this: In SLI-AFR-mode you allow the GPU to process twice as long on the image, and you use the second GPU to keep the same frame-rate as before. That is the proper comparison to make, because that is how the typical user would experience the trade-off between non-SLI and SLI. What performance would one GPU card get you, versus two GPU cards? That is certainly how I viewed the comparison intuitively.

    And under this comparison (that is, assuming nominally the same frame rate), going from single-GPU to SLI-AFR mode always increases the latency. That occurs because they double the Push Buffering (from 1 Push Buffer to 2, thereby doubling that latency), and the AFR mode also doubles the time used by the graphics card (thereby doubling that latency too). All the latencies get doubled, except for the CPU latency, which remains unchanged. If the frame takes one time-unit, the original latency is 3 time-units, and in SLI-AFR mode the latency is at least 5 time-units, for at least a 66% increase in latency. This is what the typical user would experience in moving to SLI.

    If the frame rate is 60 frames per second, the non-SLI latency is 3/60 = 1/20 second, while the SLI-AFR-mode latency is 5/60 = 1/12 second, which I suspect might be noticeable. In other words, when the gamer does something, it would take 1/12th of a second before the change shows up on the screen. I believe I've done that comparison appropriately. ... ???
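
    Spelling out that arithmetic, using the post's own assumptions of 3 versus 5 time-units at 60 frames per second:
    Code:
    #include <cstdio>

    int main() {
        const double period = 1.0 / 60.0; // one frame period at 60 fps
        double nonSli = 3.0 * period;     // CPU + one push buffer + GPU
        double sliAfr = 5.0 * period;     // extra push buffer, plus the
                                          // doubled per-frame GPU time
        std::printf("non-SLI: %.0f ms, SLI-AFR: %.1f ms (+%.1f%%)\n",
                    nonSli * 1000.0, sliAfr * 1000.0,
                    (sliAfr / nonSli - 1.0) * 100.0);
        // 50 ms vs 83.3 ms -- the ~66% increase described above.
    }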

    My question is, would users be bothered by the 66% increase in latency caused by SLI-AFR-mode? Would it increase motion-sickness or headaches? Or is the latency too small to be noticed? Does anyone here have experience with that?

  13. #29
    Butcher
    I've never had latency issues with SLI, but then I'm not much affected by motion sickness or headaches.

  14. #30
    Artic_Kid
    Quote Originally Posted by kalniel
    On a purely GPU limited game, we can expect up to 100% improvement from SLI. Whether future games are going to be GPU limited or not we can't say.
    The "up to 100% improvement" is reached only in SLI-AFR-mode, not in SFR-mode. So part of my question is whether games developers will migrate more to AFR-mode in order to get greater efficiencies from SLI. In other words, whether they will avoid or eliminate passing large textures from one frame to the next. By careful design of when the texture is created and used, the passing of large textures from one frame to the next can be avoided.

    Quote Originally Posted by kalniel
    Is the cost worth it? IMO, no. The best use of SLI is not with a new card, which is likely to handle games that come out at the same time, but with older cards that create a GPU bottleneck. The problem, though, is that cards don't just increase in speed, they increase in features. It only takes a new shader model (SM) to start being used by games and you'll need a new generation card to take advantage of it. And even then, one new card that performs twice as well as the old one is likely to cost less than two of the old ones.
    kalniel, I suspect you have that exactly right. SLI is likely not worth the cost. Or rather, its window of opportunity is small and specific. If you need super high resolution (~1900x1200) with all the eye candy turned on, then a new SLI system might be worth it right now -- but that would be extremely few people. And for people who wait, SLI is still not likely worth the cost, because the future cards (and games) will have better features, and who would then want to buy an old video card for their SLI rig when the old card can't do the new features? Again, rather few people.

    Perhaps that is one advantage of the ATI X1800 XT cards: their shader model is more flexible and better suited to future games than the Nvidia 7800GTX's. (Any thoughts on that?)

    If that is true, then perhaps it's a better idea to use an SLI/Crossfire mobo with an X1800 XT card now, then add a second X1800 XT card later. (Any thoughts on that?)

    Which reminds me. Will two ATI X1800 XT cards work in an (Nvidia-based) SLI mobo (given the proper drivers, of course)? I've heard they will work, but I haven't had that confirmed.

    Quote Originally Posted by kalniel
    But SLI adds cost to the second hand market in that there are more buyers interested in picking up a second card too.
    That's a good point kalniel!

  15. #31
    Artic_Kid
    Quote Originally Posted by Butcher
    I've never had latency issues with SLI, but then I'm not much affected by motion sickness or headaches.
    I've had a type of "motion sickness" from 3D games before. It takes a while to set in on me, perhaps an hour or so. It shows up as a slight woozy feeling, not bad, but enough to take the joy out of things.

    I've heard of it too on these 3D games you play at pubs for $5. Four people put on these special headsets matched to their head motion. To see to the left, you turn your head to the left. And so forth. The four people are displayed in the 3D simulation-world, and they shoot at each other. (And the rest of the pub gets to watch on overhead screens.) The game lasts about four minutes -- and makes lots of players motion-sick. It happens because of the subtle delay between what players expect to see and what they actually see -- and the mind struggles to merge the two sets of data.

    I suspect SLI-AFR-mode might cause a similar problem, if the latency is sufficiently great. Perhaps SLI is too new to have reports from many people. I dunno. Any thoughts?

    ======

    Another thought. According to that link you gave -- in SLI they use a second "Push Buffer" in series after the first Push Buffer -- which doubles the latency caused by the Push Buffer(s). Which suggests a possible way to reduce the latency: set up the Push Buffers to work in parallel, rather than in series. This way, the CPU writes into one Push Buffer while the other Push Buffer is being read out to the GPUs -- with no additional level of latency. Any thoughts on this?

    Or, perhaps there's a way to eliminate the "Push Buffers" altogether, and reduce the latency even further. Which brings up the question, what is a Push Buffer for?

  16. #32
    Butcher
    Quote Originally Posted by Artic_Kid
    Perhaps that is one advantage of the ATI X1800 XT cards: their shader model is more flexible and better suited to future games than the Nvidia 7800GTX's. (Any thoughts on that?)
    They use the same shader model (which is decided by all the parties involved, primarily MS, ATI and nvidia). ATI doesn't have a future-proofing advantage over nvidia there.

    Quote Originally Posted by Artic_Kid
    Another thought. According to that link you gave -- in SLI they use a second "Push Buffer" in series after the first Push Buffer -- which doubles the latency caused by the Push Buffer(s). Which suggests a possible way to reduce the latency: set up the Push Buffers to work in parallel, rather than in series. This way, the CPU writes into one Push Buffer while the other Push Buffer is being read out to the GPUs -- with no additional level of latency. Any thoughts on this?
    That's already what happens. They show them in series, because that's how the CPU writes them. The GPU can be reading one as the CPU is writing the other though, or even reading from the same push buffer that is being written to.

    Quote Originally Posted by Artic_Kid
    Or, perhaps there's a way to eliminate the "Push Buffers" altogether, and reduce the latency even further. Which brings up the question, what is a Push Buffer for?
    Push buffers hold commands waiting to be sent to the GPU. You cannot eliminate them without massively reducing the performance of the system. Without a push buffer the CPU has to wait until the GPU is ready for the next command, which means stalling the game code. That's a bad thing.
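
    As a minimal model of that decoupling (the command names are invented, and a real push buffer holds GPU-encoded commands drained by the hardware, not by a loop):
    Code:
    #include <cstdio>
    #include <queue>
    #include <string>

    int main() {
        std::queue<std::string> pushBuffer;

        // CPU side: game code records commands and moves straight on; it
        // never waits for the GPU to be ready for the next command.
        for (const char* cmd : {"set_state", "draw_mesh", "present"})
            pushBuffer.push(cmd);
        std::printf("CPU queued %zu commands without stalling\n",
                    pushBuffer.size());

        // GPU side: drains the buffer whenever it is ready.
        while (!pushBuffer.empty()) {
            std::printf("GPU executes: %s\n", pushBuffer.front().c_str());
            pushBuffer.pop();
        }
    }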
