Just checked Scan's website to see if there was any update. They have apparently tested it and found exactly the same issue I have, and my MB is fine. I'll update once I get an official reply.
I have received the new mobo and old CPU back from Scan and the CPU is still overheating just as badly as it ever was.
I was at a loss to understand how this could happen, so I experimented with the memory settings, as these were one of the things which may have been different at Scan's end. The results were significant:
- With memory at 1333MHz and 1.5v, the AIDA test could be run with the stock Intel HS and no thermal throttling, CPU temps reaching a maximum of 94°C.
- With memory at 1333MHz and 1.65v, the AIDA test could not be run with the stock Intel HS without thermal throttling (3% throttling).
- With memory at 1600MHz and 1.5v, the AIDA test could not be run with the stock Intel HS without thermal throttling (17% throttling).
- With memory at 2400MHz and 1.65v (XMP), the AIDA test could not be run with the stock Intel HS without thermal throttling (17% throttling).
So with the memory controller on the CPU, the DRAM frequency and voltage make a big difference to the running temperature of the CPU. In this case there is too much heat to run even at very modest memory settings above the minimal default.
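A rough first-order model fits these results: CMOS dynamic power scales roughly with frequency times voltage squared, so the on-die memory controller's power grows quickly with DRAM settings. A minimal sketch, treating the 1333MHz / 1.5v case as the baseline (all of this is illustrative arithmetic, not measured data):

```python
# Hypothetical first-order model: dynamic power ~ C * f * V^2.
# Relative power of the on-die memory controller vs the
# 1333 MHz / 1.5 V baseline, for the settings tested above.
def relative_power(f_mhz, volts, f_ref=1333.0, v_ref=1.5):
    """IMC dynamic power relative to the 1333 MHz / 1.5 V baseline."""
    return (f_mhz / f_ref) * (volts / v_ref) ** 2

for label, f, v in [("1333 @ 1.65v", 1333, 1.65),
                    ("1600 @ 1.50v", 1600, 1.50),
                    ("2400 @ 1.65v (XMP)", 2400, 1.65)]:
    print(f"{label}: {relative_power(f, v):.2f}x baseline IMC power")
```

Even the 1.5v-to-1.65v bump alone is worth about 21% more IMC power, which lines up with the 1333MHz results flipping from "no throttling" to "throttling".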
I am not sure what the justice of this situation permits me to expect, and would be interested to hear other people's opinions. It strikes me that the Intel CPU spec rates it for DDR3-1600, which normally runs at 1.5v, so if it overheats enough to throttle at those DRAM settings then it's faulty.
Last edited by Sylvester; 01-09-2014 at 04:36 PM.
Hi Sylvester,
I have also had my hardware back from Scan since Friday. I have managed to get my CPU to run an AIDA64 stress test with the stock cooler; CPU temp was ~87°C with core temps ranging from 94-98°C.
I can't remember if that was 1333 or 2400MHz. I have to do a couple of runs over the next week or so, as the last run I did had adaptive voltages on in the BIOS, which I have been told can cause the CPU to over-volt itself during stress testing, causing extra heat. I should get back to you by the weekend with my results.
I tried testing with the Noctua NH-L12 again to see if better cooling would make a difference. I ran the tests at stock + XMP settings with 32GB of RAM (DDR3-2400, 1.65v).
Result in Prime95 was a protective thermal shutdown of the entire system within 5 seconds of starting the small FFT test. Core Temp showed all four cores in the yellow at about 96°C in the polling interval prior to shutdown, but the polling interval was longer during stress testing and the temperature rose so rapidly that it was likely higher at shutdown. I was so surprised I repeated this test several times, with the same result each time.
Result in OCCT was that the test was automatically shut down due to reaching the thermal limit of 85°C within 80 seconds of starting the stress part of the test.
Result in AIDA64 was that the test went ahead without thermal throttling, results showing that the cooler was taking at least 14°C off the stock cooler. However, Core Temp issued thermal warnings and the max temp reached 86°C on three out of four cores.
Previous tests using AIDA64 used the Intel retail HS/F because this is the reference test which SCAN recommended and use themselves. The expected temps with the retail HS/F are 60-70°C. While better cooling allowed the test to be run without throttling, this setup still overheated to much higher temps than expected even for the Intel HS/F, and it does not change the fact that this setup fails to run the test without throttling under the Intel reference cooler, thus failing SCAN's test procedure.
So it does not change the fact that there is an overheating fault and the real world impact of this problem is evident from the Prime95 and OCCT tests. Even with a better cooler than Intel's there is an unavoidable risk of thermal shutdown in the middle of running an application or game which cannot be considered fit for purpose.
So I think it's absolutely necessary for me to track down this fault and fix it so the board+CPU are running at normal temperatures to start with. It may be a wider problem than my setup, judging from the reports of other people contributing to this thread and the Intel thread. So I feel it is appropriate to suggest that it would help if anyone experiencing a problem with overheating could contact their mobo manufacturer's technical support to ask for help and make them aware of the issue.
I dialled the RAM back to DDR3-1600 at 1.5v to keep it in spec, added the Noctua, and did some more tests. It is cooler than the Intel retail HS/F (which I reseated once, checking the contact pattern again, which was good both times), but both are still way hotter than they should be. Even at these settings and under the Noctua, the Prime95 small FFT test reached 100°C on 3 cores before I pulled the plug myself to prevent another shutdown.
So I looked at voltages.
Can anyone tell me why OCCT's "CPU VCORE" monitor is registering vcore under OCCT stress testing as 1.63v, when CPU-Z is showing "Core Voltage" as 1.213v at the same time? I don't understand; on my QX9650 these two readings are identical.
EDIT: I set the BIOS power draw limiter to 86w (it was on Auto with a displayed value of 88w) and it made a huge difference to top temps, but caused a lot of CPU frequency changes under load, varying from 4 to 4.4GHz but mostly less than 4.4GHz. In other words this nerfs the turbo; it's basically a less drastic way of disabling it, so not really a solution. But it does give some kind of pointer to the nature of the problem, further indicated by transient power draws reaching up to 220w in the Core Temp monitor even at this setting.
220W power draw would explain the heat but where is the cause? Is it just the mobo power regulation set up in BIOS is out of whack? Or is there a bigger problem with the processor which makes the BIOS look out of whack? How does an 88W processor ever draw 220W even at turbo frequencies?
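For what it's worth, TDP is rated for sustained load at base clock, not for turbo peaks, so a big excursion above 88w is at least arithmetically possible. A back-of-envelope sketch of how the multipliers stack up; every figure here (turbo voltage, AVX activity factor) is an assumption for illustration, not a measurement from this board:

```python
# Hypothetical back-of-envelope: P ~ activity * f * V^2, scaled
# from the rated TDP at base clock.  Turbo clock, higher adaptive
# vcore, and an AVX-heavy load (Prime95 small FFT) all multiply.
# All ratios below are illustrative assumptions.
def power_w(tdp_w, f_ratio, v_ratio, activity_ratio):
    return tdp_w * f_ratio * v_ratio ** 2 * activity_ratio

p = power_w(88,
            f_ratio=4.4 / 4.0,     # 4-core turbo vs base clock
            v_ratio=1.25 / 1.10,   # adaptive vcore vs base (assumed)
            activity_ratio=1.7)    # AVX2 stress vs typical load (assumed)
print(f"~{p:.0f} W")
```

With those (assumed) numbers the model lands in the same ballpark as the 220w transients, which suggests the draw itself may not be a fault so much as what stock turbo plus adaptive voltage plus a power-virus workload simply adds up to.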
Last edited by Sylvester; 05-09-2014 at 09:32 AM.
Tried to boot with the limiter set to 87w this morning and there was some kind of system instability: Windows 7 Ultimate x64 somehow forgot it was activated and started throwing around insulting and heavy-handed false accusations about itself. Once I had restored that, connected my test bench to the internet (thanks so much for all of that, Microsoft), and revalidated, I found the voltage and power dynamics had altered, and I presume the instability triggered one of the myriad mobo auto settings to switch to a stability priority.
This would be good if I knew which one it was, never mind that for now, on with the testing.
Even with stability triggered and power limited to 87w, Core Temp picked up the Prime95 small FFT test drawing over 102w (see the Power reading just above the core temps).
So I tried going back to the 86w power limit and ran OCCT again to see what was happening to OCCT's version of vCore. This is captured in the screeny below: it has "dropped" to 1.436v with occasional wild fluctuations up to 1.6v, which is crazy but true. If you examine it carefully you can see a correlation between the voltage spikes and the temps. Mostly the temps drop and then the voltage spikes, suggesting the mobo is reacting to the temps by changing volts as it should. At around 80 secs there is one example where the temps spike and the voltage drops. OK, but it seems like the calibration is way off and the reactions are unstable, most unstable when the power draws are very high. So I am tending to think the high temps are a result of flawed design, and that default power management on this mobo/CPU combination needs some work. Does that seem fair?
It seems to be pumping up the voltage at about 52.9°C. I'm completely ignorant of how this works, but can you set your vcore (or whatever it is) and remove the power limiter? I think there is an offset somewhere which may be set to Auto. It looks as though the boosting may be broken. Not sure which monitor is correct though.
Well I tried the second of your suggestions first and it worked well, Domestic_Ginger, so thanks. By setting a "normal" vcore and a negative offset on vCore of -0.060v I was able to take 12w off the power draw in Prime95, from 102w down to 90w (with the 86w limit set, go figure), and this helped drop the temps. All volts readings dropped by that amount, so default vcore was 1.14v instead of 1.2v. This has definitely helped. I tried more but it was unstable. I also dropped the graphics and ring bus by 0.050v but that didn't have much effect.
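That 102w-to-90w drop is close to what the simple voltage-squared rule predicts. A quick sanity check (my own first-order arithmetic, nothing more):

```python
# Sanity check on the undervolt: to first order, dynamic power
# scales with V^2, so dropping vcore 1.20 V -> 1.14 V should cut
# draw by roughly (1.14/1.20)^2.  Baseline is the 102 W Prime95
# figure quoted above.
baseline_w = 102.0
predicted_w = baseline_w * (1.14 / 1.20) ** 2
print(f"predicted: {predicted_w:.0f} W (observed: ~90 W)")
```

Predicted ~92w against an observed ~90w, so the offset seems to be doing exactly what the physics says it should.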
However, setting the power limiter to Auto was pretty disastrous even with the negative offset set. Prime95 power consumption shot up from 90w to 140w and temps went up from the low 70s to the mid 90s. See the Core Temp screeny comparison below: left is with the power limiter wild and free, right is with the 86w limit imposed. With that limit in place the CPU clock is reduced to 4GHz though, so effectively turbo is broken and I am underclocking it to stop it burning up. NB please SCAN and Gigabyte, this is a workaround, not a fix.
I still don't know why it is being so crazy power hungry in the first place. I might take a look at loadline calibration tomorrow, may be able to reduce some of the peaks. All suggestions welcome, within reason...!
Last edited by Sylvester; 05-09-2014 at 10:57 PM.
Just an idea to perhaps narrow it down a bit, does the motherboard you're using support MultiCore Enhancement? What this essentially does, is allow the CPU to run at full turbo clocks on all cores simultaneously rather than varying it depending on load to stay within power limits. I seem to remember reading it's often on by default on Gigabyte boards.
CoreTemp is only reporting a single clock, but the cores can clock up independently so you'd need to use something else to check, e.g. http://www.cpuid.com/softwares/tmonitor.html
Under a 'stock' profile, you'd expect to see higher turbo clocks with fewer threads running.
The clock speed reported on your left screenshot would fit in with that - IIRC 4.4GHz is the highest turbo clock for 1/2 cores, but it's normally up to 4.2GHz with all four cores loaded.
Thanks for the link, watercooled. I also snagged HWMonitor while I was there and it produced this.
So I am out of my depth, but some of those volts look a bit high. Anyone know what they should be reading?
No idea. You'll need to stress test to get the max values and work from there.
Software voltage measurements can be weird so I wouldn't pay them too much attention.
OK this is as far as I got today, beginning to see how this thing does its power dance.
To run Prime95 I had to set the 4-core turbo to 41x; it was set to 42x, and that was too much clock for all 4 cores at the same time.
So now it's set:
4 cores @ 4.1GHz
3 cores @ 4.2GHz
2 cores @ 4.3GHz
1 core @ 4.4GHz
I noticed that the CPU draws more power the warmer it gets (due to increased resistance), and gets warmer as it draws more power, which can become a runaway feedback loop; this is where the power limiter comes into play. With four cores doing 4.2GHz and the power limit open it snowballs quickly. With turbo limited to 4.1GHz for 4 cores the heat gain slows, allowing about 5 mins of solid processing before the temps rise above 75°C. When this happens the power draw increases, and with the limiter upped to about 115w it catches the overheat as a power overdraw and throttles the clock down to 4.0GHz, which allows the cores to cool.
Basically the limiter can be used to govern temps at the cost of performance. If it can do that, I don't know why they don't just make a temperature-based limiter and be done with it. But I am still not sure whether this is a fix working as intended or a workaround for a setup which is still drawing too much power. This is not overclocked: it is stock clocked with a nice cooler, drawing way more than its rated TDP under load, with the 4-core turbo nerfed to 4.1GHz.
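The limiter-governs-temps behaviour can be sketched as a toy model (entirely my own illustration, not anything read out of the BIOS): power rises steeply with clock since voltage scales up with frequency (P ~ f^3 is the usual rule of thumb), the limiter drops one 100MHz turbo bin whenever draw exceeds the cap, and temperature settles in proportion to dissipated power. All constants are assumed:

```python
# Toy model of a power limiter acting as an indirect temp governor.
# Assumes ~100 W at 4.0 GHz, P ~ f^3, and a crude linear cooler.
def settle(limit_w, clock_ghz=4.4):
    for _ in range(50):
        power = 100.0 * (clock_ghz / 4.0) ** 3
        if power > limit_w and clock_ghz > 4.0:
            clock_ghz = round(clock_ghz - 0.1, 1)  # drop one turbo bin
            continue
        break
    temp_c = 25 + power / 2.0  # settled temp, crude cooler model
    return clock_ghz, power, temp_c

print(settle(limit_w=999))  # limiter effectively off: full turbo, hottest
print(settle(limit_w=115))  # 115 W cap: clock drops, temps follow
```

In the model a 115w cap costs 0.3GHz and buys roughly 13°C, which is the same trade the real limiter is making: it governs temperature, just one step removed.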
The IA offset still sits at 2.0v, which is confusing me. That is something to do with the adaptive voltage system, but why is it offset by 2.0v? I can't find a way to tweak it and I am wondering if it is just too big a number, but I have no clue despite plenty of googling.
With the power issues contained for now, though, the good news was that XMP DDR3-2400 at 1.65v was not a problem for temps, so at least I can get the full 29GB/s memory bandwidth, which was the main reason for getting this chip and the 2400 RAM. The point being to run a nice VGA later, maybe a Maxwell, for 3D 1080p Elite Dangerous and Star Citizen. Oh yeah!
Presumably the 42x on all cores was the value reached with MCE disabled? Or did you just try the stock turbo settings manually? It would be worth ruling out that the motherboard is interfering with voltages with MCE enabled.
I find it sort of weird that MCE is enabled by default, especially considering how much it seems to increase power draw, and that it's technically OCing the CPU.
I'm not sure what's meant by that 2v reading, but it's certainly not the core voltage, or I don't imagine there would be much of a CPU left. Assuming it's a true reading, it's probably some intermediate regulator voltage.
Had to look up MCE (MultiCore Enhancement), as I've been out of the loop since Core 2 Quad and missed all the Sandy-Ivy-Haswell evolution. So I wasn't wise to this review-driven default overclock approach; I am now, so maybe that answers the question about whether it is broken. The irony, then, is that following SCAN's advice and testing, one mobo has already been RMA'd for overheating. Maybe SCAN need to get wise to this too, or maybe they are wise to it already but are refusing to pick up after mobo manufacturers, and who can blame them.
I set the ratio/core count manually, following the advised figures visible in the mobo BIOS. 4 cores @ 4.2 was too high for temps; the power limiter would not allow it for long anyway. I am not comfortable raising that above the max power recommendation for the Noctua, which is 125w in a well-ventilated case.
As for running it at full turbo (4x 4.4GHz), I really don't think the integrated heat spreader and thermal interface material are up to the job, as the heat is so quick to rise and fall while the actual heatsink stays only warm (not hot) to the touch. This is why I have reseated a dozen times, but my temps seem to match other people's accounts. So it shows a bottleneck at the surface of the CPU, IMHO. I can see why people delid; even with good heat transfer you would want a heatsink engineered for 200w+ to let it rip, which is not so very green after all.
It is interfering with volts, as I have vCore set to "normal", which allows adaptive voltage changes, but I have offset it by -0.060v. This is close to the limit; when I tried lower it would not boot and I had to clear the CMOS. The volts messing was not as bad as I feared though, because the anomalous OCCT reading was down to my running version 4.4.0: updating to OCCT 4.4.1 fixed the high vCore readout issue, and it is pretty congruent with CPU-Z, Core Temp and HWMonitor now.
Like you say though, it's one thing being able to run the thing at stock with a far more efficient cooler, it's quite another trying to overclock what is marketed as an overclocking CPU.
As I think I linked earlier, someone on another forum investigated the heatspreader issue, and it seems the heat transfer problem is more down to the gap between the IHS and the die; improving the TIM doesn't fix the core issue. http://www.anandtech.com/show/8227/d...k-and-i5-4690k
Look at the difference between stock temps, and those achieved by removing the gap.
It really does mean coolers can't work efficiently; I'm seeing people recommending and using uber-expensive closed-loop coolers for stock CPUs. It doesn't mean the cooler is being overwhelmed; as you say, your cooler is barely getting warm. Just by way of comparison, Sandy Bridge used solder to attach the heatspreader, and although their power dissipation is similar, Sandy Bridge ran far cooler thanks to a decent thermal path. Even look at Haswell-E, which obviously dissipates more power than Haswell-DT, yet again runs cooler because of a decent thermal path. It seems it's just inexcusably crap with the likes of the 4790K though, even more so considering it's supposed to fix exactly that issue, and it really doesn't seem like a complex thing to fix.
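The "cool heatsink, hot cores" symptom is exactly what a high die-to-IHS resistance produces. A back-of-envelope sketch, with all resistance values assumed purely for illustration (real figures vary per chip and cooler):

```python
# Back-of-envelope: junction temp = ambient + power * total thermal
# resistance (degrees C per watt).  A gappy die-to-IHS interface adds
# resistance that no cooler downstream can compensate for.
# All resistance values below are assumptions for illustration.
def junction_temp(power_w, r_die_to_ihs, r_cooler=0.15, ambient=25.0):
    return ambient + power_w * (r_die_to_ihs + r_cooler)

print(junction_temp(130, r_die_to_ihs=0.40))  # gappy TIM (assumed)
print(junction_temp(130, r_die_to_ihs=0.15))  # solder/delid-like (assumed)
```

With those assumed numbers the same 130w load lands in the mid 90s with the gappy interface and the mid 60s with a decent one, which matches the stock-vs-delidded deltas in the AnandTech piece linked above and explains why the heatsink itself barely warms up: most of the temperature drop happens inside the package before the heat ever reaches it.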