Thread: AMD - Zen chitchat

  1. #209
    Headless Chicken Terbinator's Avatar
    Join Date
    Apr 2009
    Posts
    7,321
    Thanks
    1,095
    Thanked
    672 times in 550 posts
    • Terbinator's system
      • Motherboard:
      • ASRock H61M
      • CPU:
      • Intel Xeon 1230-V3
      • Memory:
      • Geil Evo Corsa 2133/8GB
      • Storage:
      • M4 128GB, 2TB WD Red
      • Graphics card(s):
      • Gigabyte GTX Titan
      • PSU:
      • Corsair AX760i
      • Case:
      • Coolermaster 130
      • Operating System:
      • Windows 8.1 Pro
      • Monitor(s):
      • Dell Ultrasharp U2711H
      • Internet:
      • Virgin Media 60Mb.

    Re: AMD - Zen chitchat

    Quote Originally Posted by watercooled View Post
    I've been wondering what the Xbox will use. Did Lisa Su rule out the possibility of Zen or was it more of a dismissal of the question? Either I've forgotten or I didn't watch it.
    I believe it was something along the lines of "no semi-custom Zen APUs before 2018" during an investor call around October time. Maybe things have changed/progressed since then?

    12GB GDDR5 and Vega is all but confirmed though, I believe.

  2. #210
    Bows out! CAT-THE-FIFTH's Avatar
    Join Date
    Aug 2006
    Location
    Hopefully somewhere less backstabby
    Posts
    28,789
    Thanks
    3,203
    Thanked
    4,456 times in 3,442 posts
    • CAT-THE-FIFTH's system
      • Motherboard:
      • Less E-PEEN
      • CPU:
      • Massive E-PEEN
      • Memory:
      • RGB E-PEEN
      • Storage:
      • Not in any order
      • Graphics card(s):
      • EVEN BIGGER E-PEEN
      • PSU:
      • OVERSIZED
      • Case:
      • UNDERSIZED
      • Operating System:
      • DOS 6.22
      • Monitor(s):
      • NOT USUALLY ON....WHEN I POST
      • Internet:
      • FUNCTIONAL

    Re: AMD - Zen chitchat

    Quote Originally Posted by watercooled View Post
    I mean, I do understand that when you're reviewing for a website, you have probably hundreds of benchmarks to process, along with hardware changes, and having that much information in front of you means you're not as likely to spot weirdness which might seem like nothing at first glance. Independent reviewers and even users often pick up on things like this, and it would be nice if it got some more attention in the media, to better explain the results. I respect sites more for doing things like this than publishing a ton of robotic test results, not least because it demonstrates the reviewers have a better understanding of what they're actually seeing

    I know a few like Hexus do this when the need arises, but it would be nice if more sites could take this sort of new information and run their own investigation into why some results are the way they are.

    We've seen something similar happen with GPUs too, where in some cases we get benchmarks with release drivers, and that's it. 6 months down the line when you're looking to upgrade, those numbers could be completely invalid because of both software and driver updates, and few places re-run benchmarks (again, I do understand it's a time-consuming process so it's probably not practical to do it frequently). In a way, I suppose this is the sort of reason we get rebrand launches from the likes of AMD as it, in a way, forces sites to re-test the products with the latest drivers, and can show performance improvements even if nothing has changed on the hardware side.
    I remember that part of ROTTR being quite taxing when I played it with my GTX960, and it's why I do like DF, since it is quite obvious they know the areas where issues happen.

    Another issue is reviews tend to not test the updated games or drivers with older CPUs too.

    The internal benchmark under DX12 is jittery - I ran it about 10 times, and occasionally performance would be lower, irrespective of whether it was a warm or cold system, so the screenshots are from the median runs.
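
    A minimal sketch of picking the median pass from a set of repeated runs, assuming each pass is summarised by its average FPS. The function name and the FPS figures are hypothetical and are not measured results.

```python
from statistics import median_low


def pick_median_run(avg_fps_per_run):
    """Return (index, fps) of the median pass.

    median_low() always returns an actual data point, so with an even
    number of passes the lower of the two middle runs is chosen.
    """
    mid = median_low(avg_fps_per_run)
    return avg_fps_per_run.index(mid), mid


# Hypothetical average-FPS figures for ten passes (illustration only).
example_runs = [61.2, 60.8, 54.1, 61.0, 60.5, 61.4, 55.3, 60.9, 61.1, 60.7]
run_index, run_fps = pick_median_run(example_runs)
```

    Reporting the median pass rather than the best one keeps the occasional slow run from being hidden, without letting it dominate the result.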

    The driver released with the GTX1080TI was a new DX12 performance driver, and interestingly enough you do see some gains in certain games now, but all the testing is done with the newest CPUs.



    That's under DX11. So the point where FPS dips is where you see a lot of animated animals run up to you, and the part of the Village of the Remnants where most of the NPCs are located.





    It's less severe when you drop resolution. I thought it might be the GPU or CPU throttling, but after testing it for a while, it wasn't that. But once the jittery runs happened, dropping textures to high in-game solved the problem.

    I even tried a different area of the game, and the issue was not present - DX12 performance was slightly better at lower resolution.

    Look at the RX470 in comparison.





    The GTX1080 is no doubt faster, but the RX470 didn't have the DX12 issue.

    Remember this is the SAME driver as used in the GTX1080TI reviews so it is some weird issue that is happening even with older CPUs like mine.


    Those despicable Elk,stealing the pond weed!

  3. #211
    Senior Member
    Join Date
    Jul 2009
    Location
    West Sussex
    Posts
    1,072
    Thanks
    73
    Thanked
    132 times in 124 posts
    • kompukare's system
      • Motherboard:
      • Asus P8Z77-V LX
      • CPU:
      • Intel i5-3570K
      • Memory:
      • 2 x 8GB Crucial Ballistix Elite PC3-14900
      • Storage:
      • Crucial MX200 | Sandisk Extreme 120GB SSD | WDC 1TB Green | Samsung 1Tb Spinpoint
      • Graphics card(s):
      • Sapphire R9 290 VaporX 7950
      • PSU:
      • Antec 650 Gold TruePower (Seasonic) or Seasonic SII-330
      • Case:
      • Aerocool DS 200 (silenced, 53.6 litres)
      • Operating System:
      • Windows 10-64
      • Monitor(s):
      • 2 x Dell P2414H

    Re: AMD - Zen chitchat

    Quote Originally Posted by watercooled View Post
    I mean, I do understand that when you're reviewing for a website, you have probably hundreds of benchmarks to process, along with hardware changes, and having that much information in front of you means you're not as likely to spot weirdness which might seem like nothing at first glance. Independent reviewers and even users often pick up on things like this, and it would be nice if it got some more attention in the media, to better explain the results. I respect sites more for doing things like this than publishing a ton of robotic test results, not least because it demonstrates the reviewers have a better understanding of what they're actually seeing
    Thing is - like the AotS results with <6> cores - this is not just something which affects Ryzen 7, but also Intel HEDT with 6+ cores. It's just that nobody seems to have noticed, as almost everyone was benching with Nvidia cards even for DX12, despite the suspicions people had about Nvidia hardware and DX12 before Pascal and their lack of proper async compute.

    So, really, reviewers should have continued to keep an eye on this. But while a lot of sites used Haswell-E or Broadwell-E in the past, recently a lot have been using Kaby/Skylake, so the fact that Nvidia's DX12 driver is not able to utilise the additional cores of 6+ core CPUs wasn't noticed.

    The question is of course: is this something in the way Nvidia's drivers handle threading that can be fixed in software (after all, Nvidia's DX11 drivers threaded a lot better than AMD's), or is it related to AMD Radeons having those ACEs or a similar hardware feature?

    If Nvidia's hardware is lacking, there is little they can do to improve the issue, whereas software should eventually be fixable. Might even be that their hardware is the problem but they might be able to do a workaround.

  4. #212
    Bows out! CAT-THE-FIFTH's Avatar
    Join Date
    Aug 2006
    Location
    Hopefully somewhere less backstabby
    Posts
    28,789
    Thanks
    3,203
    Thanked
    4,456 times in 3,442 posts

    Re: AMD - Zen chitchat

    Quote Originally Posted by kompukare View Post
    Thing is - like the AotS results with <6> cores - this is not just something which affects Ryzen 7, but also Intel HEDT with 6+ cores. It's just that nobody seems to have noticed, as almost everyone was benching with Nvidia cards even for DX12, despite the suspicions people had about Nvidia hardware and DX12 before Pascal and their lack of proper async compute.

    So, really, reviewers should have continued to keep an eye on this. But while a lot of sites used Haswell-E or Broadwell-E in the past, recently a lot have been using Kaby/Skylake, so the fact that Nvidia's DX12 driver is not able to utilise the additional cores of 6+ core CPUs wasn't noticed.

    The question is of course: is this something in the way Nvidia's drivers handle threading that can be fixed in software (after all, Nvidia's DX11 drivers threaded a lot better than AMD's), or is it related to AMD Radeons having those ACEs or a similar hardware feature?

    If Nvidia's hardware is lacking, there is little they can do to improve the issue, whereas software should eventually be fixable. Might even be that their hardware is the problem but they might be able to do a workaround.
    But if it was just a threading issue, then why am I having really weird performance drops with a 4C/8T IB Core i7 which are not present under DX11? There is definitely something borked when it comes to VRAM management under DX12 too, since the RX470 I had with 4GB of VRAM didn't exhibit any of the weirdness in ROTTR in the exact same scene, which also happens to be one of the most CPU-intensive parts of the game, and I have an older CPU too.

    It's almost like it can't make a proper decision fast enough.

    The benchmark and one or two other areas I tested didn't show that under DX12. It's basically that one area of the game. DX11 has none of the performance dropouts.

    The game was installed on an SSD too.

    Edit!!

    This is what annoys me more.

    The whole point of DX12/Vulkan/Mantle is to test gaming in more CPU-limited scenarios, i.e. older and slower CPUs.

    That is where Mantle looked great.

    It's all fine and dandy testing the latest and greatest £300 to £500 CPU, but why don't the sites bother testing DX12 with an older CPU like a SB/IB Core i5/Core i7 or an FX8350?

    Lots of people will have older CPUs, so it's not really some absurd situation that a person might upgrade to a qHD screen and get a faster card.
    Last edited by CAT-THE-FIFTH; 02-04-2017 at 04:02 PM.



  5. #213
    Bows out! CAT-THE-FIFTH's Avatar
    Join Date
    Aug 2006
    Location
    Hopefully somewhere less backstabby
    Posts
    28,789
    Thanks
    3,203
    Thanked
    4,456 times in 3,442 posts

    Re: AMD - Zen chitchat

    Another review seeing something similar with ROTTR:

    https://thetechaltar.com/amd-ryzen-1800x-performance/5/

    There seems to be some concern over whether or not NVIDIA cards play well with Ryzen CPUs. So we tested using a GTX 1080 and turning threaded-optimization on and off to see what kind of a difference it makes. As it turns out, Ryzen is affected in a few games, though only significantly in Rise of the Tomb Raider. The other question concerns NVIDIA’s DX12 support and how that affects performance on this platform. In some games, notably Rise of the Tomb Raider, the GTX 1080 and Pascal Titan X tend to get better performance using DX11, though that’s something that is seemingly processor agnostic. That is, DX11 gives better results on both Intel and AMD processors in certain games with NVIDIA GPUs.



  6. #214
    Bows out! CAT-THE-FIFTH's Avatar
    Join Date
    Aug 2006
    Location
    Hopefully somewhere less backstabby
    Posts
    28,789
    Thanks
    3,203
    Thanked
    4,456 times in 3,442 posts

    Re: AMD - Zen chitchat

    Hardware.fr have done an article about DDR4 memory scaling:

    http://www.hardware.fr/articles/958-...i7-6900k.html4





    Look at the memory bandwidth - the Ryzen controller is actually more efficient than the BW-E one.




  7. #215
    Senior Member watercooled's Avatar
    Join Date
    Jan 2009
    Posts
    10,796
    Thanks
    1,496
    Thanked
    928 times in 799 posts

    Re: AMD - Zen chitchat

    This could be common knowledge already, but AIDA was updated to correctly read Ryzen's cache latencies, and they're far better than previously reported:
    https://www.aida64.com/news/aida64-v...cy-cache-speed


    Original numbers: https://www.techpowerup.com/231268/a...cx-compromises

    The L2 and L3 now seem to be lower than the 6900k! Memory latency is still on the high side however, but apparently this should improve with AGESA updates?
    http://semiaccurate.com/2017/04/03/a...emedies-ryzen/

    Edit: WOW! How's that for timing CAT? xD

  8. #216
    Not a good person scaryjim's Avatar
    Join Date
    Jan 2009
    Location
    Manchester
    Posts
    15,021
    Thanks
    1,191
    Thanked
    2,239 times in 1,842 posts
    • scaryjim's system
      • Motherboard:
      • Dell Inspiron
      • CPU:
      • Core i5 8250U
      • Memory:
      • 1x 8GB DDR4 2400
      • Storage:
      • 128GB M.2 SSD + 1TB HDD
      • Graphics card(s):
      • Radeon R5 230
      • PSU:
      • Battery/Dell brick
      • Case:
      • Dell Inspiron 5570
      • Operating System:
      • Windows 10
      • Monitor(s):
      • 15" 1080p laptop panel

    Re: AMD - Zen chitchat

    Hmmm, suspect. To me that looks like AIDA is using a small or predictable dataset for the testing - the in depth cache latency tests some of the sites did on release showed that with sequential data AMD's prefetchers were excellent and cache latency was kept very low, while using random data the latency soared as you filled the cache...

  9. #217
    Registered User
    Join Date
    Jan 2017
    Posts
    8
    Thanks
    0
    Thanked
    0 times in 0 posts

    Re: AMD - Zen chitchat

    Just got off of chat to Scan, and apparently retailers are under one weird (to me at least) NDA from AMD.

    Despite only being a week out from release, and the RRP having already been announced, the Scan rep told me that they're still not allowed to tell us how much they'll be retailing the Ryzen5 line for. Furthermore, they also told me that the NDA also forbids them from telling us when they will be allowed to tell us how much money they want from us.

    I'm hoping someone here is familiar with AMD NDAs but not bound by them. From the above, does it sound like the NDA forbids Scan from telling anyone their prices until the CPUs are already released? Or is there still room for the NDA to allow retailers to reveal their prices before the actual release?

  10. #218
    Bows out! CAT-THE-FIFTH's Avatar
    Join Date
    Aug 2006
    Location
    Hopefully somewhere less backstabby
    Posts
    28,789
    Thanks
    3,203
    Thanked
    4,456 times in 3,442 posts

    Re: AMD - Zen chitchat

    Quote Originally Posted by watercooled View Post
    Edit: WOW! How's that for timing CAT? xD
    Also on the same subject, LOL!!



  11. #219
    Not a good person scaryjim's Avatar
    Join Date
    Jan 2009
    Location
    Manchester
    Posts
    15,021
    Thanks
    1,191
    Thanked
    2,239 times in 1,842 posts

    Re: AMD - Zen chitchat

    Quote Originally Posted by nekomata View Post
    ... I'm hoping someone here is familiar with AMD NDAs but not bound by them. From the above, does it sound like the NDA forbids Scan from telling anyone their prices until the CPUs are already released? Or is there still room for the NDA to allow retailers to reveal their prices before the actual release?
    I'm not familiar with AMD NDAs but I've seen a few different NDAs, and yes, it sounds perfectly reasonable that AMD will have had retailers sign an NDA that forbids them from telling anyone a) how much they will be selling processors for, and b) the date that the NDA lifts. Those are both perfectly standard things to include in an NDA.

    Any reveal of either the prices or the NDA expiry date before the NDA lifts will almost certainly be a breach of the NDA.
    Last edited by scaryjim; 04-04-2017 at 05:43 PM.

  12. #220
    Senior Member watercooled's Avatar
    Join Date
    Jan 2009
    Posts
    10,796
    Thanks
    1,496
    Thanked
    928 times in 799 posts

    Re: AMD - Zen chitchat

    The way I interpreted it (which could be wrong) is that measuring cache latencies with software requires some knowledge about the CPU in order to get accurate results, hence the reason for the AIDA patch.

    With regard to prefetching though, that shouldn't affect at least the L3 given it's a victim cache and doesn't have a prefetcher? Or am I misinterpreting what you're saying?

    There are some results for both linear and random access here: http://www.legitreviews.com/amd-ryze...eview_191753/5

    According to those results, it looks to me like AMD and Intel are overall very close for L2 until they get above 256kB, which exceeds Intel's L2 cache size (which is important to remember - AMD are achieving this latency with a cache twice the size), then AMD remain well below Intel's latency through the CCX's 8MB L3.

  13. #221
    Oh Crumbs.... Biscuit's Avatar
    Join Date
    Feb 2007
    Location
    N. Yorkshire
    Posts
    11,192
    Thanks
    1,392
    Thanked
    1,091 times in 833 posts
    • Biscuit's system
      • Motherboard:
      • MSI B450M Mortar
      • CPU:
      • AMD 2700X (Be Quiet! Dark Rock 3)
      • Memory:
      • 16GB Patriot Viper 2 @ 3466MHz
      • Storage:
      • 500GB WD Black
      • Graphics card(s):
      • Sapphire R9 290X Vapor-X
      • PSU:
      • Seasonic Focus Gold 750W
      • Case:
      • Lian Li PC-V359
      • Operating System:
      • Windows 10 x64
      • Internet:
      • BT Infinity 80/20

    Re: AMD - Zen chitchat

    I'm regularly under NDA with major broadcasters and technology partners. The terms in our case generally allow you to state you have an NDA with the company and that's about it, you can't say anything else without specific permission.

  14. #222
    Not a good person scaryjim's Avatar
    Join Date
    Jan 2009
    Location
    Manchester
    Posts
    15,021
    Thanks
    1,191
    Thanked
    2,239 times in 1,842 posts

    Re: AMD - Zen chitchat

    Quote Originally Posted by watercooled View Post
    ... With regard to prefetching though, that shouldn't affect at least the L3 given it's a victim cache and doesn't have a prefetcher? Or am I misinterpreting what you're saying?

    There are some results for both linear and random access here: http://www.legitreviews.com/amd-ryze...eview_191753/5 ...
    I think there are a few things worth noting:

    - In the 64 byte stride Linear Forward results AMD's results are frankly phenomenal - those pre-fetchers are amazing when they know what data should be coming.

    - Every set of results show an inflection for AMD at 6MB - 8MB block size, so the L3 cache is definitely being hit, and it's hurting AMD when the block size is too big for a single CCX-worth of L3.

    - The larger the stride, the harder it is for AMD's prefetchers, apparently - at a 4096 byte stride the linear forward and full random almost perfectly overlap.

    Ryzen's practical cache latencies are very good, as you say - up to a 6MB block size it comfortably beats Intel's latency. But for full random or large stride data patterns, Ryzen is to all intents and purposes an 8MB L3 cache design. Once your access goes beyond that, you're essentially going to main memory for every access. Previous builds of AIDA were, I would assume, testing the latency on the entire 16MB of cache, and finding half of it very much wanting. "Fixing" the benchmark so it only tests Ryzen under ideal circumstances doesn't strike me as the best way to represent real world performance.

    Indeed, I can't help wondering if part of the fixes that have seen such dramatic results in AotS was telling it to treat Ryzen as having an 8MB cache. If the code manages caching block sizes based on the cache it thinks is available that could be a quick fix to boost performance (and I know gaming used to be very cache heavy back in the day...).
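
    A rough illustration of the two access patterns discussed above (linear forward at a fixed stride versus full random), assuming a pointer-chasing style test where each location stores the offset of the next one to visit. The function names are hypothetical, this is not claimed to be how AIDA or the linked review built their tests, and Python is only used here to show the visiting order; real latency measurement needs native code.

```python
import random


def visit_order(working_set_bytes, stride_bytes, pattern="linear", seed=0):
    """Byte offsets in the order a test would touch them.

    "linear" walks forward one stride at a time (easy to prefetch);
    "random" touches the same offsets in shuffled order (hard to prefetch).
    """
    offsets = list(range(0, working_set_bytes, stride_bytes))
    if pattern == "random":
        random.Random(seed).shuffle(offsets)
    elif pattern != "linear":
        raise ValueError("pattern must be 'linear' or 'random'")
    return offsets


def build_chain(order):
    """Map each offset to the next one visited, wrapping at the end."""
    return {cur: order[(k + 1) % len(order)] for k, cur in enumerate(order)}


def chase(chain, start, steps):
    """Follow the chain; each step depends on the previous load's result."""
    path, pos = [], start
    for _ in range(steps):
        path.append(pos)
        pos = chain[pos]
    return path


# Tiny example: a 1 KiB working set walked at a 64 byte stride.
linear = visit_order(1024, 64)
shuffled = visit_order(1024, 64, pattern="random", seed=1)
```

    Because every load address comes from the previous load, the hardware cannot overlap the accesses, which is why this style of test exposes latency rather than bandwidth. Growing the working set past the 8MB of L3 in one CCX is where the thread expects the curve to jump.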

  15. #223
    Bows out! CAT-THE-FIFTH's Avatar
    Join Date
    Aug 2006
    Location
    Hopefully somewhere less backstabby
    Posts
    28,789
    Thanks
    3,203
    Thanked
    4,456 times in 3,442 posts

    Re: AMD - Zen chitchat

    I saw this mentioned on AT forums:



    So he tests the GTX1060 and RX480 on a R7 and a Core i7 7700K.

    That chap also tests ROTTR but this time tested the game at medium and the GTX1060 saw gains going from DX11 to DX12 which is opposite to what others saw at very high settings.

    I saw exactly the same when I dropped down one or two settings in ROTTR from very high to high and that was on an IB Core i7.
    Last edited by CAT-THE-FIFTH; 05-04-2017 at 02:41 PM.



  16. #224
    Senior Member watercooled's Avatar
    Join Date
    Jan 2009
    Posts
    10,796
    Thanks
    1,496
    Thanked
    928 times in 799 posts

    Re: AMD - Zen chitchat

    Quote Originally Posted by scaryjim View Post
    I think there are a few things worth noting:

    - In the 64 byte stride Linear Forward results AMD's results are frankly phenomenal - those pre-fetchers are amazing when they know what data should be coming.

    - Every set of results show an inflection for AMD at 6MB - 8MB block size, so the L3 cache is definitely being hit, and it's hurting AMD when the block size is too big for a single CCX-worth of L3.
    Yeah that's what I meant by the CCX's L3 - it doesn't appear that L2 cache will normally be evicted to the other CCX's L3 cache, but that doesn't mean there is no access to it e.g. through snooping. A more complex test would be required to demonstrate this e.g. by having one CCX access data and have it evicted to its L3, immediately followed by the other CCX accessing it. IIRC the way this works, is the request is sent to both the other L3 and the IMC simultaneously, and if the data is present in the other L3, the memory access is cancelled (providing the cached data hasn't been invalidated of course).

    Quote Originally Posted by scaryjim View Post
    - The larger the stride, the harder it is for AMD's prefetchers, apparently - at a 4096 byte stride the linear forward and full random almost perfectly overlap.
    The same seems true of Intel's prefetching too from what I can see.

    Quote Originally Posted by scaryjim View Post
    Ryzen's practical cache latencies are very good, as you say - up to a 6MB block size it comfortably beats Intel's latency. But for full random or large stride data patterns, Ryzen is to all intents and purposes an 8MB L3 cache design. Once your access goes beyond that, you're essentially going to main memory for every access. Previous builds of AIDA were, I would assume, testing the latency on the entire 16MB of cache, and finding half of it very much wanting. "Fixing" the benchmark so it only tests Ryzen under ideal circumstances doesn't strike me as the best way to represent real world performance.
    It seems quite normal for cache latency to increase as you approach filling it, likely due to the caching algorithms not being designed to operate in that way. There's not enough granularity in those tests to see in more detail, but Ryzen stays completely flat even at 6MB (75%) while Intel's 20MB cache starts curving upwards before the 75% mark. There's no 15MB point on the graph, but it's very obvious at 16MB, and apparent even at 8 and 12 depending on the other variables. AMD seem to have done a very good job of keeping cache latencies down within a CCX, but like you say, as far as an independent thread is concerned, it appears more like an 8MB cache than a 16MB one, and I think some places are describing it as 2x8MB.

    WRT the comment about AIDA - I don't think that's how it works, nor is it really possible to test in the way you describe. Don't forget it's a cache which is wholly controlled by the CPU itself, and not addressable local memory i.e. you can't choose to use a certain amount. This isn't a workload performance benchmark, you're looking for a precise measurement of one specific aspect of the microarchitecture, and there's really only one right answer for a given test - it's one of those cases where, provided you're measuring correctly, you're never going to get a result better than what is theoretically possible.

    The sort of fixes I assume they will have had to make would be, like I said, more about getting accurate results e.g. ensuring they're timing the measurements correctly. If they were e.g. trying to use a 16MB block size with a single thread, they would have just been hitting main memory for half of it, which I guess is possibly what was happening.

    Cache latency obviously cannot be extrapolated to real-world performance - either they're measuring it correctly or they're not. They freely admitted the older version was not. It's not like a special optimisation to make it look better than it is.

    Quote Originally Posted by scaryjim View Post
    Indeed, I can't help wondering if part of the fixes that have seen such dramatic results in AotS was telling it to treat Ryzen as having an 8MB cache. If the code manages caching block sizes based on the cache it thinks is available that could be a quick fix to boost performance (and I know gaming used to be very cache heavy back in the day...).
    It would be interesting to read what exactly the fixes were. It is possible, to an extent, for software developers to try to keep certain objects in cache by carefully tuning their code, but again they're still relying both on the cache controller's own algorithms behaving how they expect, and on other competing threads not eating into what's available to use.

    There's some interesting information on uArch tuning and superoptimisers at the website for y-cruncher: http://www.numberworld.org/y-crunche...mizations.html

    If I had to guess, I doubt the AotS fix would have been exactly that, with my reasoning being: Ryzen doesn't have just 8MB of L3 cache, and a game is more than able to access all 16MB of it depending on how it's threaded and scheduled. AotS had no 'knowledge' of Ryzen, so could not have been e.g. mistakenly tuned for 16MB of accessible cache per thread. You're just running a binary that already existed and will run happily on much smaller caches e.g. 6MB in the i5, so it doesn't appear like the game was specifically tuned in some way to utilise 16MB of cache.

    So if the performance improvements are down to (or partly down to) caching, I reckon the answer would be more complicated and involve more than just its apparent size. Like many games, it's also multi-threaded which adds another dimension to the possibilities e.g. competition, snooping across the fabric, etc. Perhaps one way to limit the impact of Ryzen's layout would be to somehow avoid snooping the remote L3 cache e.g. by scheduling dependent threads on one CCX?
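
    As a sketch of what "scheduling dependent threads on one CCX" could look like from user space, assuming a Linux system and assuming the logical CPUs of each CCX are numbered contiguously (4 cores x 2 threads per CCX). That numbering is an assumption, not something confirmed here, and should be checked against the real topology before relying on it. The helper names are hypothetical.

```python
import os


def ccx_cpu_set(ccx_index, cores_per_ccx=4, threads_per_core=2):
    """Logical CPU ids ASSUMED to belong to one CCX (contiguous numbering)."""
    per_ccx = cores_per_ccx * threads_per_core
    first = ccx_index * per_ccx
    return set(range(first, first + per_ccx))


def pin_current_process_to_ccx(ccx_index):
    """Restrict the calling process to one CCX where the OS supports it.

    Returns the CPU set applied, or None if nothing was changed.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # platform without this call: do nothing in this sketch
    allowed = os.sched_getaffinity(0) & ccx_cpu_set(ccx_index)
    if not allowed:
        return None  # assumed layout does not match this machine
    os.sched_setaffinity(0, allowed)
    return allowed
```

    Keeping cooperating threads inside one set like this would avoid cross-CCX snooping for shared data, at the cost of leaving the other half of the chip idle for that process.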

    About the design in general, I was listening to another podcast with David Kanter as a guest (techreport) and he mentioned how it's very much a trade off. Intel do their best to keep latency uniform across the larger core-count parts with a huge ring bus, but this adds a significant amount of complexity, and probably cost, power and notably, local latency. Having a more beastly fabric is likely to add more cycles to every single access, whereas Ryzen's design keeps local performance very good at the expense of higher latency for (far less frequent, especially when properly scheduled) snooping across the interconnect.

    He also mentioned that Intel's ring-bus paradigm gets increasingly difficult as you scale in core count, hence why Xeon Phi and probably the Big Skylake use meshes, with non-uniform latency.
