Originally Posted by EhSteve
I’ve been browsing around these parts for a while, but hadn’t found a good enough reason to post until now. I’m copying this post to a number of similar forums in the hope that someone, somewhere, has been in a similar situation, or can somehow advise from their ineffable wisdom. I’ll start with some back-story.
Prologue:
I was running a well-functioning workstation, built in April 2009, with the following specifications:
•2x Intel Xeon W5580 3.2GHz
•Supermicro X8DAI
•2x 6GB kit (2GBx3), DDR3 PC3-10600 (DDR3-1333), CL=9, Dual Ranked, Registered ECC, 1.5V, 256Meg x 72 Crucial memory
•PNY NVIDIA Quadro FX 4800
•GeForce GTX 260 – OEM
•Yes, those 2 cards were in the same machine, at the same time, working smoothly off the same drivers
•No, they weren’t run in SLI, I was running multiple monitors with 3D design on the Quadro’s monitor.
•No, they’re not the same hardware
•No, the 260 isn’t soft-moddable into a Quadro
•Cooler Master Real Power 1000W Modular Power Supply
•Samsung PB22-J 256GB SSD
•Windows XP 64-bit SP2
•A few other bits of hardware, not relevant
Everything ran well, aside from the initial effort of finding the right drivers to run the Quadro and GTX at the same time, and configuring them.
Then, one day 3 weeks ago, the motherboard decided that it shouldn’t POST when I powered on. The fans would spin up to full speed and just stay there. I took it back to Scan, where the kind techies tested it with both my hardware and theirs, and agreed that it was dead, and out of its meager 12 month warranty. I needed a solution as soon as possible, so I bought an EVGA X58 SR-2 while I was there, as no other suitable dual socket boards were in stock. When it comes to computing, the word “extreme” in marketing is a comforting one, so I figured it was better to err on the side of more capable hardware than less, even if I’m not an overclocking fanatic.
Act 1:
I took my new board home, pleased and bemused by its 10 year warranty (which extreme user runs 10 year old hardware?), and installed it into the cadaver of my rig. Fortunately it fit in my case with nothing more than some cable rearrangement. I installed both CPUs, all sticks of RAM, the hard drive, and the Quadro in the 1st PCI-E slot (x16). For power, I connected 2x 8-pin CPU connectors to the mobo, one 6-pin PCI-E connector to the PCI-E slot area on the mobo, and one 6-pin to the auxiliary CPU power. Then I formatted my hard drive and reinstalled the OS for the new mobo.
Everything was running smoothly so far. I installed the chipset drivers etc., then the correct Quadro drivers that would be compatible with the GTX 260, and then I installed the GeForce in the 5th PCI-E slot (x16). When I powered on, the system would POST and start booting into XP 64, then the loading bar would hang for around 1-2 minutes before it continued to load into Windows. Minor panic, but it made it into Windows, and I was able to enable the card in Device Manager and activate the monitors (whoop). However, everything was suspiciously slow, and I attributed this to a driver reconfiguration lag while the OS installed driver updates in the background.
After a while, I tried to install Kaspersky so I could get on with having a functioning computer, but everything was still RIDICULOUSLY slow. Perplexed, I ran Task Manager to find that the installer was using a lot of CPU (it’s a 16 thread system, and it was using up 12-20%, and running as if I was rendering at 100% CPU usage). I cancelled the installation and ran Process Explorer to find out more, then found that all processes were using far more than normal CPU usage (task manager was using 5%, explorer.exe using 6%, process explorer using 4%, any foreground process would use WAY more than it should, with no hardware interrupts), and the system was behaving as if it had 100% CPU load.
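To put those percentages in perspective, here is a rough sketch that converts them into "logical CPUs' worth" of work. It assumes the percentages are shares of total capacity across all 16 threads (which is how Task Manager normally reports them); the function name and the dictionary of readings are just made up for illustration, using the figures quoted above.

```python
# Rough sketch: convert "percent of total CPU" readings into the
# equivalent number of fully busy logical CPUs on a 16-thread system.
# Assumption: each percentage is a share of ALL logical CPUs combined.

LOGICAL_CPUS = 16  # the "16 thread system" described in the post


def cores_equivalent(total_percent, logical_cpus=LOGICAL_CPUS):
    """Return how many logical CPUs a given total-CPU percentage represents."""
    return total_percent / 100 * logical_cpus


# Readings quoted in the post (installer given as its low and high figure).
readings = {
    "installer (low)": 12,
    "installer (high)": 20,
    "task manager": 5,
    "explorer.exe": 6,
    "process explorer": 4,
}

for name, percent in readings.items():
    print(f"{name}: {percent}% of total = "
          f"{cores_equivalent(percent):.2f} logical CPUs")
```

In other words, on this machine even a "small" 5-6% reading is close to one whole logical CPU being kept busy by a process that would normally sit near idle.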
Act 2:
I first disabled, then uninstalled, the GTX 260 in Device Manager. Nothing changed after restarting (apart from losing the multi-monitor setup).
With the computer running this slow, it was unusable, so I had a look at the BIOS settings to see what was going on. I disabled extra CPU features and tried again, with no luck. I tried with optimized settings: voltages within limits, temperatures within limits, all settings as they should be. No luck. So I removed the GeForce from its PCI-E slot and powered on again, and everything was back the way it was before I put it in, working smoothly.
I put it back in, and the CPU usage on minor processes went up 20-fold. I disabled the PCI-E slot that the GeForce was in using the mobo jumpers, and it ran smoothly. Then I ran a barebones trial with one CPU and one stick of RAM plus the 2 cards, and it would still slow down with the GeForce plugged in, regardless of which CPU I disabled and how much RAM I put in. I memtested all my RAM and stress tested the CPUs without the GeForce in, and got no errors, and nice temperatures and speeds all round.
By this point I was pretty sure it was the 2nd card causing issues, so I ran the GeForce by itself in slot 1, 3 or 5, and the Quadro by itself in slot 1, 3 or 5, and everything was smooth. When I swapped the GeForce and Quadro together around different x16 slots on the SR-2, everything was unusable. Having been continuously googling for similar problems over this period, I decided that the BIOS might be incompatible with the cards, so I flashed to the latest version, A50, then uninstalled and reinstalled the cards and drivers. No luck. I used TuneUp to remove residual registry entries for the drivers and tried again. No luck.
I also learned that the NF200 chip provides 2 extra PCI-E 2.0 x16 slots in addition to the Northbridge’s standard 2, and hypothesized that the board might not be able to run one card off the NB and one off the NF200 at the same time without an SLI bridge, so I tried running them in various adjacent slots, to no avail.
One of my friends who also builds systems suggested that because the SR-2 is an extreme board, stock auto-voltages might not provide a stable setup. We thought the Northbridge/NF200 might be under a lot more load with multiple non-SLI graphics cards, so we tried boosting the voltage slightly with both cards in, to no avail. We only added 0.04V, up to 1.175V, then I decided it was unwise to add more voltage without knowing even vague upper limits for this chipset.
Having lost a lot of sleep trying to sort this out, I’m now at a loss as to which direction to take. Primarily, I’d like to know if anyone else has had a similar issue with multiple non-SLI cards, and whether they’ve identified the cause. If anyone knows whether it is in fact the SR-2 not supporting 2 non-SLI cards, or if it sounds like a dodgy board, or low-voltage instability, or even knows the “safe” upper limits for voltages on the SR-2 chipset, NF200 etc. for me to test out, I would be absurdly grateful.
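For anyone wanting to repeat the slot swapping methodically, here is a small sketch that prints a checklist of the single-card and dual-card layouts. It assumes only slots 1, 3 and 5 are being tested (the ones named above); the card names are taken from the spec list, and the function names are made up for this example.

```python
# Sketch of a slot-test checklist: every single-card layout and every
# two-card layout across the slots named in the post.
# Assumption: only slots 1, 3 and 5 are of interest here.
from itertools import permutations

CARDS = ("Quadro FX 4800", "GeForce GTX 260")
SLOTS = (1, 3, 5)


def single_card_configs():
    """One card alone, tried in each slot."""
    return [{slot: card} for card in CARDS for slot in SLOTS]


def dual_card_configs():
    """Both cards installed, each in a different slot."""
    return [
        {quadro_slot: CARDS[0], geforce_slot: CARDS[1]}
        for quadro_slot, geforce_slot in permutations(SLOTS, 2)
    ]


for config in single_card_configs() + dual_card_configs():
    layout = ", ".join(
        f"slot {slot}: {card}" for slot, card in sorted(config.items())
    )
    print(f"[ ] {layout}")
```

Ticking off each line with a note of whether the system was smooth or unusable makes it easier to see whether the slowdown follows a particular slot or only ever appears when both cards are present.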
Yours desperately
Eh Steve