Meh, think I will need to stick with what I have until Ryzen 2.
Anyway, I have some RX470 results I need to add to the GTX1080 review.
While certainly notable, I'd be inclined to say that one benchmark may be more an indication of a problem with that benchmark than with the way the OS spreads threads across CCXes, mainly because the majority show it has little to no effect. Having said that, wasn't it strange results in some Windows benchmarks where all this started?
Yea that, unknown to us mere common folk
Last edited by Corky34; 14-03-2017 at 02:23 PM.
I think for most people interested in gaming their purchasing decision will come down to whether they're GPU-limited on their existing setup. Running an RX 480 at 1440p? Probably be fine with Ryzen. Running a GTX 1080 on a high refresh 1080p screen? Might want to stick on Intel for now.
Ooooh, very interesting.... *goes to look*
EDIT:
Yes, that was my point. That's under Linux, so can't be affected by the Windows Scheduler, yet still shows the same problem (reduced performance when cores are split between CCXes). The fact that it happens on Linux indicates that it's more complicated than just the Win 10 scheduler moving threads between CCXes - in some cases the way the software is written causes the problem (which backs up AMD's claims that a) the Win 10 scheduler's working fine, and b) fixes will have to be implemented by the software vendors).
It's interesting how software can be written to rely on specific assumptions about the hardware in use - for instance, the original build of the original Neverwinter Nights relied on synchronising threads on a single core to manage game timing. Install that game, unpatched, on a quad core CPU and the performance tanks - we're talking genuine slide-show - because it loses that synchronicity. So while the problem seems too widespread to be specific to the individual games in question, if they're making a common assumption about hardware that Ryzen doesn't adhere to, it's quite feasible that a) it is specific to the game code, and b) it's easily patchable (NWN was patched and now works perfectly on multicore CPUs).
Last edited by scaryjim; 14-03-2017 at 02:46 PM.
I will get the results up in the next few hours.
scaryjim (14-03-2017)
Just been doing some poking about. AMD claimed this morning that the Coreinfo dumps going around were incorrect due to an older version of Coreinfo. Interestingly, Coreinfo uses a Windows function call to get its information.
The only logical way old and new versions of the same tool could call that function and produce different output is if they interpret the results differently, since the actual return structures should be identical. If games and other software are making the same types of checks, and therefore the same types of mistakes, when trying to optimise code paths for different architectures, that could easily skew results.
Hmmm, for instance: with Bulldozer/Piledriver cores, iirc you wanted to schedule related threads on logical cores that shared some physical resources, because that would give you the minimum possible cache latency? So if a game is identifying an AMD CPU, querying the core logic and finding cores that share physical resources, it may well be affinitizing threads to that pair of cores, thinking this is more efficient - whereas with Zen it's the opposite, because you've created unwanted competition between two threads for a physical core....
I mean, I have no idea if that is the case, but it's the kind of processor-specific optimisation I could see a dev trying to do, particularly if they were targeting Windows 7, which iirc didn't get a scheduler update for Bulldozer...?
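To make the guess concrete, here is a rough sketch of the two placement strategies. Everything in it is made up for illustration: the `pick_cpus` helper, the topology map, and the assumption that logical CPUs 0/1 share one core, 2/3 the next, and so on. It is not taken from any game or from how any OS actually enumerates cores.

```python
from collections import defaultdict

def pick_cpus(topology: dict[int, int], n_threads: int, share: bool) -> list[int]:
    """Choose logical CPUs for n_threads related threads.

    topology maps logical CPU id -> physical core (or module) id.
    share=True packs threads onto siblings of the same physical unit
    (the Bulldozer-style assumption described above).
    share=False gives each thread its own physical unit where possible.
    """
    by_core = defaultdict(list)
    for cpu, core in sorted(topology.items()):
        by_core[core].append(cpu)
    groups = [by_core[c] for c in sorted(by_core)]
    if share:
        # Fill one physical unit completely before moving to the next.
        order = [cpu for group in groups for cpu in group]
    else:
        # Take the first sibling of every unit, then the second, and so on.
        depth = max((len(g) for g in groups), default=0)
        order = [g[i] for i in range(depth) for g in groups if i < len(g)]
    return order[:n_threads]

# Hypothetical 4-core, 8-thread layout: CPUs 0/1 share core 0, 2/3 share core 1, ...
topo = {cpu: cpu // 2 for cpu in range(8)}
print(pick_cpus(topo, 2, share=True))   # both threads on siblings of one core
print(pick_cpus(topo, 2, share=False))  # one thread per physical core
```

On Linux the resulting list could be handed to something like `os.sched_setaffinity(0, cpus)` to pin the current process. Code that hard-wires the `share=True` path for anything reporting itself as AMD would end up doubling threads onto one Zen core, which is the scenario speculated about above.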
Seems CB have been busy investigating Core Parking and Win7 vs 10. Rather than adding to their original review, they wrote a new article:
https://www.computerbase.de/2017-03/...-core-parking/
(Google translate)
Scores are a bit all over the place: sometimes Win7 is quicker, other times Win10 is.
Biggest difference is for Project Cars, where Ryzen was below the i5-7600K but has now managed to catch the (admittedly stock) i7-7700K.
Take this with a wheelbarrow of salt, but over on the AT forum someone says they heard inside info about a major revision coming in a few months (i.e. way too soon for Zen+).
It starts from this post:
https://forums.anandtech.com/threads...#post-38793715
Seems a bit unlikely if the comments about GF's process we've heard are accurate, as this speculation goes on about higher speeds whereas from what we currently know, 4GHz is already way outside the process's ideal.
Just seems so soon but then R3 and R5 are delayed.
One of the pre-launch rumours was about respins, and I had wondered if maybe they'd sell some parts before the respin.
If all R7 parts were pre-respin, current buyers might not be happy though... they'd feel like Titan buyers.
But this gets the imagination going: what if inter-CCX communications were meant to run faster (maybe full RAM speed rather than half), or there was some other issue? Since Ryzen essentially is a SOC with its own firmware, I'm sure there were lots of things they could have tweaked if the silicon wasn't as expected.
Last edited by kompukare; 14-03-2017 at 09:47 PM.
I wouldn't expect much outside of clock speed bumps for steppings, and/or the binning improvements brought by increasing yields (like the Phenom steppings CAT mentioned, or the different Q6600 steppings - IIRC they were both examples of first-gen products on their respective nodes), but the 14nm process is already pretty mature so I wouldn't be expecting anything major TBH. Major architectural changes don't happen that far down the pipeline, things like the fabric clock speeds are locked to the IMC which isn't a simple thing to change. Besides that, there's not really anything obvious that needs urgently changing so I'm not sure what this 'nonsense' is he's referring to - the other issues we're seeing such as SMT scheduling, core parking, etc. are software issues and have nothing to do with the physical CPU itself.
At the end of the day though, there will always be 'something better' down the line - it's just that we've been stuck with 'the same old' for so long we're not used to being in this situation!
As has always been the case, if you're good with what you have, you'll likely get more for your money next year or whatever. If you're in need of an upgrade, and unless something new is *right* around the corner, get whatever is best for your workload. IMO anyway...
OTOH, claiming the release silicon is 'rebranded ES' is utter nonsense. Intel regularly release the same stepping they used as ES for shipping products, it just means that stepping didn't need additional work to reach final clocks or whatever, and given the clocks Ryzen has reached in time for release, that's clearly not an issue.
If anything I suspect that guy is referring to (or confusing) the 4C8T models which obviously won't have any issue with fabric latency if they're only using one CCX!!!
Edit: The fabric does run at full memory clock speed. Remember the speeds you see for e.g. DDR4 are data rates in megatransfers/s, not MHz. With DDR (Double Data Rate), data is transferred on both the rising and falling edge of the clock, i.e. twice per clock cycle.
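To put numbers on that, a quick sketch of the conversion. The speed grades are just common DDR4 examples and the function name is invented for this post; the only thing assumed is the two-transfers-per-clock rule described above.

```python
def ddr_clock_mhz(data_rate_mts: float) -> float:
    """Return the memory clock in MHz for a DDR data rate given in MT/s."""
    # DDR transfers on both the rising and falling clock edge,
    # so there are two transfers per clock cycle.
    return data_rate_mts / 2

for rate in (2400, 2666, 3200):
    print(f"DDR4-{rate}: {rate} MT/s -> {ddr_clock_mhz(rate):.0f} MHz memory clock")
```

So a fabric running at "only" 1333 MHz alongside DDR4-2666 is running at the full memory clock, not half of it.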
Scheduling a single thread is pretty easy, stick it on core 0 and power gate the rest off. Something like Handbrake is also pretty easy to get right, all the threads run flat out so no need to move any around.
The problem is if you get a load that works in bursts: event-driven stuff, responding to network packets coming in and user inputs. It is soft real time; you are fighting the conflict of "leave a thread where it is, less disruption will make it go faster" and "that thread over there hasn't been run in 20ms, if it is responding to a real-time event the system will stutter". That sort of load is hard and I don't think anyone will ever really solve it. A real-time OS doesn't even try; they just use round-robin to minimise latency at the expense of throughput.
So I don't think it is so much well written programs, as just easy to schedule loads. Games are bursty and hard to schedule.
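A toy model of that trade-off, for anyone who wants to see it in numbers. This is a deliberately simplified single-core round-robin queue, nothing like a real OS scheduler; the thread names, tick counts and helper functions are all invented for the example.

```python
from collections import deque

def round_robin(bursts: dict[str, int], quantum: int) -> list[tuple[str, int]]:
    """Return the (thread, ticks_run) slices produced by a simple round-robin queue.

    bursts maps thread name -> amount of work in ticks.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    queue = deque(bursts.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            # Not finished: go to the back of the queue and wait for another turn.
            queue.append((name, remaining - ran))
    return timeline

def first_run_tick(timeline: list[tuple[str, int]], name: str) -> int:
    """Tick at which `name` first gets the CPU."""
    tick = 0
    for thread, ran in timeline:
        if thread == name:
            return tick
        tick += ran
    raise ValueError(f"{name} never ran")

work = {"encode": 20, "input": 1}  # one long job, one short latency-sensitive job

sticky = round_robin(work, quantum=20)  # big quantum: leave the thread where it is
fair = round_robin(work, quantum=2)     # small quantum: keep rotating

for label, tl in (("quantum=20", sticky), ("quantum=2", fair)):
    print(label, "input waits", first_run_tick(tl, "input"), "ticks,", len(tl), "slices")
```

The large quantum finishes in fewer slices (fewer switches, better throughput) but makes the input thread wait the full 20 ticks; the small quantum gets to it after 2 ticks at the cost of many more switches.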
Have been out of the loop a bit recently with real life getting a bit intense, have there been any database benchmarks yet? That was a big problem for Piledriver.
That's what I meant about Ryzen being a SOC with its own firmware.
Now whether that is changeable in the firmware is another issue, but it is at least possible that the first ones they got back had a problem somewhere and they were forced to run *something* slower via the firmware, and that a later revision could fix that, in a time frame way quicker than Zen+, possibly even in time for R3 and R5.
Only if the designer of that part thought there was a need and benefit for it to be flexible. Running things at a fixed ratio can simplify the logic, so making it programmable may have added extra logic which could add a cycle or two of latency each time you go over the fabric. A design isn't finished when you have run out of things to add; it is finished when you run out of things to remove.
Well if the designer of Zen didn't see a need for the data fabric clock to be (adjustable) running faster than the IMC then I'd like words with that designer.