
Thread: GDC: Async Compute What Nvidia says

  1. #1
    Moosing about! CAT-THE-FIFTH's Avatar
    Join Date
    Aug 2006
    Location
    Not here
    Posts
    32,039
    Thanks
    3,910
    Thanked
    5,224 times in 4,015 posts
    • CAT-THE-FIFTH's system
      • Motherboard:
      • Less E-PEEN
      • CPU:
      • Massive E-PEEN
      • Memory:
      • RGB E-PEEN
      • Storage:
      • Not in any order
      • Graphics card(s):
      • EVEN BIGGER E-PEEN
      • PSU:
      • OVERSIZED
      • Case:
      • UNDERSIZED
      • Operating System:
      • DOS 6.22
      • Monitor(s):
      • NOT USUALLY ON....WHEN I POST
      • Internet:
      • FUNCTIONAL

    GDC: Async Compute What Nvidia says

    More details here:

    http://translate.google.com/translat...it-nvidia.html

    We naturally took advantage of GDC to question Nvidia and learn more about what its GPUs are capable of in terms of DirectX 12 multi-engine management... without much success, at first.

    At the heart of DirectX 12, this feature allows rendering to be decomposed into several command queues, which can be of the copy, graphics or compute type, with synchronisation managed between those queues. This lets developers take control over the order in which tasks are executed, or drive multi-GPU setups directly. In some cases, this decomposition makes it possible to exploit the GPU's ability to handle multiple tasks in parallel and thereby boost performance.
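
    To make the command-queue idea concrete, here is a minimal D3D12 sketch (our illustration, not from the article): a graphics queue plus an independent compute queue, ordered through a fence. Device creation, command-list recording and error handling are omitted.

    Code:
        #include <d3d12.h>
        #include <wrl/client.h>
        using Microsoft::WRL::ComPtr;

        // Assumes 'device' is an already-initialised ID3D12Device.
        void CreateEngines(ID3D12Device* device,
                           ComPtr<ID3D12CommandQueue>& gfxQueue,
                           ComPtr<ID3D12CommandQueue>& computeQueue,
                           ComPtr<ID3D12Fence>& fence)
        {
            D3D12_COMMAND_QUEUE_DESC desc = {};

            desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // graphics queue (also accepts compute/copy work)
            device->CreateCommandQueue(&desc, IID_PPV_ARGS(&gfxQueue));

            desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // separate compute-only queue
            device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));

            // Fence used to order work between the two queues.
            device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
        }

        // Per frame, with command lists recorded elsewhere (hypothetical names):
        //   computeQueue->ExecuteCommandLists(1, &computeList);
        //   computeQueue->Signal(fence.Get(), ++fenceValue);  // mark compute work done
        //   gfxQueue->Wait(fence.Get(), fenceValue);          // GPU-side wait, no CPU stall
        //   gfxQueue->ExecuteCommandLists(1, &gfxList);

    Whether those two queues actually execute concurrently on the hardware, rather than being silently serialised, is exactly the question at issue in this article.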

    This is what AMD calls Async Compute, even if the term is not strictly correct: the asynchronous execution of a task does not imply that it is processed concurrently with another, yet it is precisely that concurrency which matters and which yields a performance gain. AMD's GPUs benefit from multiple command processors capable of feeding the GPU's compute units from several different queues. Processing tasks simultaneously maximises the use of all the GPU's resources: processing units, memory bandwidth and so on.

    On the Nvidia side, things are more complicated. While GeForce GPUs are able to process copy queues in parallel with compute and graphics queues, processing those last two concurrently appears problematic. On paper, Maxwell 2 GPUs (GTX 900 series) have a command processor able to handle 32 queues, one of which can be of the graphics type. Yet this support is still not functional in practice, as shown for example by GeForce performance in Ashes of the Singularity.

    Why? Until now we had not been able to get a real answer out of Nvidia, so we naturally wanted to use GDC to try to learn more, and questioned Nvidia at a meeting arranged with Rev Lebaredian, Senior Director of GameWorks. Unfortunately for us, this engineer, who is part of the technical support group for game developers, was very well prepared for questions touching on the specifics of multi-engine support. His answers were initially a verbatim repeat of the brief official statement Nvidia has given the technical press in recent months, namely: "Maxwell GeForces can support concurrent execution at the SM level (groups of processing units)", "it is not yet enabled in the driver", and "Ashes of the Singularity is one (not very important) game among others."

    An unusual bout of canned corporate-speak which shows, if any proof were still needed, that the subject bothers Nvidia. Faced with this impasse, we changed approach and came at the subject from a different angle: is Async Compute actually important (for Maxwell GPUs)? That made Rev Lebaredian relax and opened the way to a much more interesting discussion, in which Nvidia developed two arguments.

    First, while Async Compute is one way to increase performance, what matters in the end is overall performance. If GeForce GPUs are fundamentally more efficient than Radeon GPUs, then using multi-engine to try to boost their performance further is not a top priority.

    Second, if the utilisation rate of the various blocks of a GeForce GPU is already relatively high to begin with, the potential gain from Async Compute is smaller. Nvidia claims that, overall, there are far fewer holes ("bubbles" in GPU parlance) in the activity of its GPUs' units than in its competitor's. And the whole purpose of concurrent execution is to exploit synergies between different tasks to fill those holes.
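
    To put illustrative numbers on that second argument (our own hypothetical figures, not the article's): if a frame's graphics work takes 10 ms but leaves the shader units idle 20% of the time, then up to roughly 2 ms of compute work can in principle be slotted into those bubbles for free, a gain of up to ~20%. If the units are already 95% busy, only ~0.5 ms is recoverable. The fewer bubbles an architecture leaves, the less Async Compute can give back.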

    Behind these arguments actually lies one of the keys to good GPU architecture planning. Integrating one or more advanced command processors into a chip has a cost, and that cost could, for example, be spent differently: on more compute units that boost performance in games directly.

    When developing a GPU architecture, much of the work consists of forecasting the profile of the tasks the chips will face once they reach the market. The balance of the architecture between its different types of units, between compute power and memory bandwidth, between triangle rate and pixel throughput, and so on, is a crucial point that demands good visibility, plenty of pragmatism and a strategic vision. Clearly, Nvidia has pulled this off rather well for several GPU generations.

    To illustrate this, let's make a few comparisons between GM200 and Fiji on the basis of results obtained in Ashes of the Singularity without Async Compute. The comparison is rough and approximate (the GM200 tested is the one in the GTX 980 Ti, which runs a slightly cut-down version of the chip), but interesting nonetheless:

    GM200 (GTX 980 Ti): 6.0 fps per Gtransistor, 7.8 fps per TFLOPS, 142.1 fps per TB/s
    Fiji (R9 Fury X): 5.6 fps per Gtransistor, 5.8 fps per TFLOPS, 97.9 fps per TB/s
    We could do the same with many other games and the result would be similar or even more pronounced (AotS is particularly effective on Radeons): GM200 makes better use of the resources at its disposal than Fiji does. This is an architectural choice, and does not by itself mean one chip is better than the other: raising the efficiency of some units can cost more than simply adding a larger number of them. The architects' job is to find the right balance.
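
    As a rough sanity check, those normalised figures can be multiplied back by the published specs (GM200 in the GTX 980 Ti: 8.0 billion transistors, 0.336 TB/s; Fiji in the Fury X: 8.9 billion transistors, 0.512 TB/s): 6.0 x 8.0 ≈ 48 fps and 142.1 x 0.336 ≈ 48 fps for the 980 Ti, versus 5.6 x 8.9 ≈ 50 fps and 97.9 x 0.512 ≈ 50 fps for the Fury X. The two cards deliver almost the same frame rate; the difference lies in how many transistors and how much bandwidth each chip spends to get there.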

    AMD, by contrast, has clearly bet on the raw throughput of its GPUs, which usually implies lower efficiency and therefore an opportunity to optimise on that front. Add to this that the Async Compute organisation in AotS seems to make heavy use of surplus memory bandwidth, and it is easy to see that there is less to be gained on Nvidia's side. Especially since the synchronisation commands tied to Async Compute have a cost, one that needs to be masked by a significant gain.

    While our own thinking leads us to broadly agree with Nvidia on these arguments, there is another point that matters to gamers, and it is probably what makes the number-one GPU maker pay only lip service to the topic: Async Compute provides a free gain for Radeon owners. Although the capability has been built into AMD GPUs for more than four years, AMD has not been able to turn it into commercial profit: the cards were not sold at a premium because of it. That changes somewhat with AMD's latest range, which pushes the point strongly, but in terms of perception gamers love getting a little boost for free, even if only in a handful of games. Conversely, the generally higher performance of Nvidia's GPUs brings an immediate benefit in games, and can be priced directly into GeForce cards. From the perspective of a company whose purpose is not to post losses, it is clear which of the two approaches makes more sense.

    Still, we are in 2016, and the use of Async Compute should gradually spread, particularly thanks to the similarity between the console GPU architectures and that of the Radeons. Nvidia cannot totally ignore a feature that could reduce or eliminate its performance lead. Without going into detail, Rev Lebaredian was therefore keen to reiterate that there are indeed opportunities at the driver level to implement concurrent execution, from which a performance gain with Async Compute could be had in some cases. These are opportunities that Nvidia constantly re-evaluates, not forgetting that its future GPUs could change the situation at this level.

  2. #2
    Banhammer in peace PeterB kalniel's Avatar
    Join Date
    Aug 2005
    Posts
    31,023
    Thanks
    1,870
    Thanked
    3,381 times in 2,718 posts
    • kalniel's system
      • Motherboard:
      • Gigabyte Z390 Aorus Ultra
      • CPU:
      • Intel i9 9900k
      • Memory:
      • 32GB DDR4 3200 CL16
      • Storage:
      • 1TB Samsung 970Evo+ NVMe
      • Graphics card(s):
      • nVidia GTX 1060 6GB
      • PSU:
      • Seasonic 600W
      • Case:
      • Cooler Master HAF 912
      • Operating System:
      • Win 10 Pro x64
      • Monitor(s):
      • Dell S2721DGF
      • Internet:
      • rubbish

    Re: GDC: Async Compute What Nvidia says

    Seems to be backed up by comments made by the Hitman and Ashes of the Singularity devs who both say async compute is only a modest benefit even for AMD cards:

    http://wccftech.com/async-compute-bo...per-hard-tune/

  3. #3
    Moosing about! CAT-THE-FIFTH's Avatar

    Re: GDC: Async Compute What Nvidia says

    Quote Originally Posted by kalniel
    Seems to be backed up by comments made by the Hitman and Ashes of the Singularity devs who both say async compute is only a modest benefit even for AMD cards:

    http://wccftech.com/async-compute-bo...per-hard-tune/
    For consoles they said up to 20%, and for Hitman they said up to 10%, which is still reasonable for an early iteration on PC.

    But from that article you linked to, this is what the dev said:

    Asynchronous compute granted a gain of 5-10% in performance on AMD cards, and unfortunately no gain on Nvidia cards, but the studio is working with the manufacturer to fix that. They’ll keep on trying.
    It also means Nvidia lied when it said it could easily get an async driver out.

    It probably means all the people who said Nvidia could not really do async compute properly were right.


    id Software is also now saying this about Doom:

    http://wccftech.com/doom-dev-big-gai...-compute-love/

    While on topic of IHVs poking. We are seeing big gains with async compute - just saying.
    So all the Nvidia PR fluff was deflection - they cannot do async properly.

    Plus, with more and more companies now collaborating with AMD on DX12 games, and Nvidia really not showing any advantage in any of the DX12 games released so far, it makes me wonder if we are seeing another FX situation with Maxwell DX12 support.

    It might make my GTX 960 one of the shortest-lived cards in years.

    It's a bloody joke, since when Nvidia launched Maxwell they said these were the first proper DX12 cards.

    People who got the R9 290 series cards at launch are going to have the next 9700 PRO or 8800GTX at this rate!
    Last edited by CAT-THE-FIFTH; 28-03-2016 at 11:39 AM.

  4. #4
    Senior Member
    Join Date
    Jul 2009
    Location
    West Sussex
    Posts
    1,721
    Thanks
    197
    Thanked
    243 times in 223 posts
    • kompukare's system
      • Motherboard:
      • Asus P8Z77-V LX
      • CPU:
      • Intel i5-3570K
      • Memory:
      • 4 x 8GB DDR3
      • Storage:
      • Samsung 850 EVo 500GB | Corsair MP510 960GB | 2 x WD 4TB spinners
      • Graphics card(s):
      • Sappihre R7 260X 1GB (sic)
      • PSU:
      • Antec 650 Gold TruePower (Seasonic)
      • Case:
      • Aerocool DS 200 (silenced, 53.6 litres)
      • Operating System:
      • Windows 10-64
      • Monitor(s):
      • 2 x ViewSonic 27" 1440p

    Re: GDC: Async Compute What Nvidia says

    Aside from Ashes, all this work has to be done for the consoles anyway. Between them, the XbOne and PS4 have sold over 55 million units, so obviously they are the important market.

    The closest thing to async compute is probably something like hyper-threading (SMT): basically, using under-utilised resources while some other part of the chip is busy. So it's not something you can just tack on to a design.

    So I'm not surprised that Nvidia can't just quickly add it. Imagine trying to take the Bulldozer design and change it from CMT to SMT. And Nvidia not telling the truth is hardly new, either!

  5. #5
    Moosing about! CAT-THE-FIFTH's Avatar

    Re: GDC: Async Compute What Nvidia says

    I think it is a hardware issue - probably Maxwell is quite well tuned for DX11, but not so well tuned for DX12 and similar APIs. From what I gathered, one of the reasons AMD's DX11 drivers are so single-thread heavy is partially down to the hardware too.
