Someone posted this on AT about this poster on HardwareLuxx
https://www.hardwareluxx.de/communit...l#post25385390
(Google Translate)
This was for an Asus Prime X370 Pro. I had wondered how you force an ECC error. To prove that ECC is actually active, I set up an experiment.
I very carefully overclocked the RAM to find a configuration that was only just unstable.
I have managed to produce this work of art:
DRAM ECC error. So it should be clear that ECC is activated and functional.
EDAC MC0: 1UE on mc # 0csrow # 0channel # 1 (csrow: 0 channel: 1 page: 0x41d7f0 offset: 0x700 grain: 0)
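For anyone wanting to decode a line like that, here is a rough Python sketch that pulls out the fields and works out the physical address. It tolerates the stray spaces the translation added. The function name `parse_edac_line` is made up for illustration, and the 4 KiB page size (`PAGE_SHIFT = 12`) is an assumption that holds on typical x86-64 Linux but is not stated in the original post.

```python
import re

# Assumption: 4 KiB pages, as on typical x86-64 Linux.
PAGE_SHIFT = 12


def parse_edac_line(line):
    """Extract error kind and location fields from an EDAC log line.

    Whitespace around ':' and '#' is removed first, because the
    machine-translated line has extra spaces in it.
    """
    compact = re.sub(r"\s*([:#])\s*", r"\1", line)

    # "1UE" / "1 UE" = one uncorrected error, "CE" = corrected error.
    kind = re.search(r"(\d+)\s*(CE|UE)\b", compact)
    if kind is None:
        raise ValueError("no CE/UE count found in line")

    result = {"count": int(kind.group(1)), "kind": kind.group(2)}

    # Collect the key:value pairs from the parenthesised part.
    pairs = re.findall(
        r"(csrow|channel|page|offset|grain):(0x[0-9a-fA-F]+|\d+)", compact
    )
    for key, value in pairs:
        result[key] = int(value, 0)  # base 0 accepts both "0x..." and decimal

    # Physical address = page frame number shifted up, plus offset in page.
    if "page" in result and "offset" in result:
        result["address"] = (result["page"] << PAGE_SHIFT) | result["offset"]
    return result
```

Fed the line above, this reports an uncorrected error (`UE`) on csrow 0, channel 1. An uncorrected error means ECC detected the fault but could not repair it.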
Yes, all their boards claim they 'support' ECC, but what people were unsure of was whether that meant the boards can take ECC DIMMs and treat them as normal memory, or whether they support ECC functionality. And the same question applied to that Asus board: it obviously POST'ed and ran with ECC. The user was trying to force an ECC error and see whether it got reported.
bsodmike (16-03-2017)
My old AM3+ motherboard has BIOS options for ECC enable, scrubbing control etc. I would have thought it was obvious if a board really supported ECC.
Injecting ECC errors is quite hard, I have only seen it done by modifying a DIMM with circuitry to corrupt a memory transfer when a button is pressed. Was pretty iffy in use.
Well, it seems the ASRock X370 TaiChi has a DRAM scrub option in the BIOS but nothing else about ECC:
(time index 16:43 which the embed video feature here seems to ignore)
That HardwareLuxx thread only had one person commenting on the guy forcing the error. They said they would have expected a kernel panic, but the first guy said that in Linux that's not necessarily true.
But since replicating a similar error with non-ECC memory would be hard and since it doesn't look like that Asus board has a feature to disable ECC, there might still be a doubt whether it actually works. If the memory speed was left constant and the only difference between ECC enabled and disabled in the BIOS was that the error was or was not caught then I would be more convinced that ECC is working.
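For anyone wanting to try that kind of comparison on Linux, one way to watch for caught errors is to read the EDAC counters from sysfs before and after a stress run. A minimal Python sketch, assuming the kernel has an EDAC driver loaded for the memory controller (if not, the directory is simply empty or missing). The function name `read_edac_counts` is made up for illustration.

```python
from pathlib import Path

# Standard EDAC sysfs location; only populated when an EDAC driver is loaded.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_ROOT):
    """Return {"mc0": {"ce": n, "ue": m}, ...} for each memory controller.

    ce = corrected errors, ue = uncorrected errors. Returns an empty
    dict if no EDAC memory controllers are exposed.
    """
    root = Path(root)
    counts = {}
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text())
            ue = int((mc / "ue_count").read_text())
        except (OSError, ValueError):
            continue  # skip controllers with unreadable counters
        counts[mc.name] = {"ce": ce, "ue": ue}
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("No EDAC memory controllers found (driver not loaded?)")
    for name, c in found.items():
        print(f"{name}: corrected={c['ce']} uncorrected={c['ue']}")
```

An empty result does not prove ECC is off. It only shows that the kernel is not reporting through EDAC.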
kalniel (13-03-2017)
Oh I didn't know this thread had actually been created! Well, here's to another collection of epic discussions, speculation and theorising (and of course, some bits about Zen!).
Just to kick off, I've been thinking about the Zen APUs due out later this year (IIRC?). I wonder if they will come with or without L3 cache? Given the CCX with L3 is smaller than Intel's, it's not unlikely they'll just stick with it unless they really need that extra 16mm² to squeeze in more GPU? Either way, Zen has a pretty big L2 cache which could lend itself to managing without L3, but they'd likely be going directly against Intel's mainstream platform so wouldn't want to be throwing away CPU performance?
Anyone read the blog post from AMD that basically says there's nothing wrong with Ryzen or any software? Before, I had confidence AMD were working on some of the issues highlighted by the community; now it seems they're just burying their heads in the sand.
It might have more to do with Microsoft not wanting to make the changes required.
Seems the scheduler and core parking is quite poor compared to Linux from the talk on The Stilt's Ryzen: Strictly technical. That point onwards talks about AMD's statement, but the comparison with Linux was earlier.
Maybe with all these teething issues an R3 or R5 makes more sense: get a decent board, use that, then wait for Ryzen2 as there seem to be plenty of easy fixes they can implement.
You know, I read the Hexus summary of it and was left shaking my head; now I'm not so sure. They do confirm they're working on a software update to improve performance in Balanced mode (I suspect this will improve frequency switching times but still allow core parking), which is good.
If there was an out-and-out scheduling issue with Windows 10 you should find that all loads were equally affected, but we know there are single-threaded loads that Ryzen excels at. There are also games that aren't negatively affected by SMT being enabled. We've heard about cross-CCX communication issues, but again that should affect all loads to some extent, and some of Ryzen's results indicate that it doesn't.
I don't think there's much point denying that Ryzen has these performance issues, but it's pretty clear that only some programming techniques are exposing them, and that "well" written programs aren't impacted by them. In other words, it's yet another example of AMD's "forward-thinking" architecture stuffing its performance in real-world metrics. Whether it'll be a problem long-term will depend on how many titles they can persuade devs to patch, and how many studios they can get to optimise forthcoming titles...
On the Windows scheduling thing, I think they're correct when they say it's working as intended and it's not a problem; however, that doesn't mean it's working optimally for Ryzen's unique design.
Personally I think most of Ryzen's problems come from locking the data fabric's clock speed to that of the memory controller; for some unknown reason they decided on a fixed 1:2 ratio (afaik).
Hmmm, a couple of things.
Uncertain core scaling due to CCX isn't a uniquely Windows problem, but it isn't an OS-wide problem either. Look at http://www.phoronix.com/scan.php?pag...en-cores&num=3 for example - while most multithreaded tests in that review are pretty much CCX agnostic (a couple actually gain performance), TTSIOD shows a clear performance boost from using a single CCX. Those tests are all on Linux, so they're a pretty good indication that the issues some programs have with spreading threads across CCXes aren't just down to the Windows scheduler.
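If you want to try the single-CCX case yourself on Linux, one approach is to pin a process to the CPUs that share an L3 cache, since each CCX has its own L3. A rough Python sketch, not how the Phoronix tests were run: it assumes sysfs cache entry `index3` is the L3 (typical, but worth checking against the `level` file next to it), and the helper names are made up for illustration.

```python
import os
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0-3,8-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpus_sharing_l3(cpu=0):
    """Logical CPUs that share an L3 with `cpu`.

    Assumption: cache/index3 is the L3 on this machine.
    """
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/cache/index3/shared_cpu_list")
    return parse_cpu_list(path.read_text())


def pin_to_l3_domain(cpu=0):
    """Restrict the calling process to the CPUs sharing an L3 with `cpu`."""
    cpus = cpus_sharing_l3(cpu)
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling process; Linux only
    return cpus
```

Calling `pin_to_l3_domain(0)` before starting the workload keeps its threads on one L3 domain, so you can compare run times against an unpinned run.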
As to "for some unknown reason", I think you mean "for some undisclosed reason". I'm pretty sure AMD a) know why they did it, and b) ran it at the speed that provided them with the best performance for a given guarantee of stability. AMD engineers aren't idiots - they'll be fully aware of the performance implications of every design aspect of that processor. If they didn't run it faster it'll be because they couldn't stably run it faster, or that running faster didn't improve performance sufficiently given its impact on other parameters of the processor.