NVIDIA spills the beans on Fermi. Good enough to take down the Radeon HD 5870? We take a first look at the architecture.
They seem incredibly reluctant to even talk about gaming, let alone give realistic hints about performance. It's almost as if they are counting on HPC sales to grow exponentially to make up for a poor forecast on gaming profit.
I think AMD will be satisfied.
It's very hard to see how, from a gaming perspective, nVidia will be able to match ATI on a price-per-performance basis. I hope I'm wrong, as the price of 5870s could stay high for quite some time if this does come to pass.
They have more than doubled the SP count (512 is more than a GTX295), so this should be much faster given the inevitable tweaks under the hood...
I reckon it'll compete with a 5870 fairly evenly, and for me the value proposition of NVidia, with CUDA, PhysX, the potential for Flash acceleration etc., is better than ATI's (guess it depends on your viewpoint though). So assuming the power/efficiency figures are good, I'm looking forward to this; it's going to be hard to choose...
I think it will see a lot of use, because there are loads of programmers out there who prefer to program in Python or Java and don't like C. It is also a lot quicker to write useful programs in high-level languages than in C.
Suppose you have an existing program written in Java. It currently takes an hour to run, and because it gets run a great deal you have a business need for it to run faster.
You could re-write the time-critical sections in C, which will make the program about 50% faster (40 minutes), but to do so you would need to learn C, and the resultant code would be more bug-prone.
Alternatively, you could ask your boss for £1000 for an nVidia CUDA card that will run the code 100 times faster (36 seconds), with only minor tweaks to the code, in a language you are already familiar with.
Even if the program is not yet written, it is often still better to write in a high-level language than a low-level one, as development will be faster. If that last bit of performance is still needed, the critical sections can be re-written in C, but most of the time the 100x speedup from using CUDA will be good enough.
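To make the "minor tweaks" point concrete, here's a rough sketch (the array name and the element-wise transform are invented for illustration) of what offloading a simple hot loop looks like from Python using PyCUDA's gpuarray module; the GPU version reads almost the same as the NumPy it replaces, and the actual speedup obviously depends entirely on the workload:
Code:
import numpy as np
import pycuda.autoinit              # sets up the CUDA context on import
import pycuda.gpuarray as gpuarray

# Data the existing program already holds in an ordinary NumPy array.
samples = np.random.randn(4 * 1024 * 1024).astype(np.float32)

# CPU version of the hot loop: a simple element-wise transform.
cpu_result = samples * 2.5 + 1.0

# GPU version: copy to the card, do the same arithmetic there, copy back.
gpu_samples = gpuarray.to_gpu(samples)
gpu_result = (gpu_samples * 2.5 + 1.0).get()

assert np.allclose(cpu_result, gpu_result)

Java programmers get much the same deal from bindings like JCuda; anything more involved than an element-wise operation means writing a small kernel, but the surrounding program stays in the language you already know.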
Exactly.
Originally Posted by RealWorldTechnologies
For anyone interested in the architecture in greater depth, NVIDIA released a whitepaper to the press a few days ago. It's now on the site, so read away (PDF).
http://www.nvidia.com/content/PDF/fe...Whitepaper.pdf
Not only that, but surely to make effective use of a GPU with that many stream processors your code would already have to be written to be massively multithreaded. Having done an MSc which taught Java as its principal language, and therefore knowing the coding skills of many professional Java developers, the concept of them trying to develop a massively multithreaded software architecture to take advantage of this leaves me shivering in terror...
I don't think the language you write in is that big a deal if it comes with a decent library for this sort of stuff, or a good compiler (the world needs more compiler writers). The problem is, most workloads just aren't written to do SIMD. OK, so new CUDA can run multiple kernels, but I doubt you can run as many kernels as you have streams (I guess I should read the whitepaper!).
Originally Posted by chrestomanci
If you want to make a GPGPU run fast, you need to take a lot of data, chop it up, and apply the same operations to each chunk - which is why you can dunk it through something massively parallel.
As soon as the operations you need to perform vary from chunk to chunk (e.g. you have branches), the whole thing breaks down. Now, assuming you've got data that lends itself to parallel processing, there are ways of dealing with conditionals that don't involve branching.
Indeed, the reason GPUs have turned into the parallelised beasts that they are is that graphics shaders and the data they work on are perfect for such situations.
There are a lot of workloads that can have multiple things happening at once, but that's not the same as doing the same thing to lots of data elements at once, which is why we don't have 512-core CPUs (yet...).
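As a rough sketch of the "conditionals without branching" point above (the kernel and names are made up for illustration, not taken from the whitepaper): threads in a warp all execute the same instruction stream, so instead of an if/else that forces divergence you fold the condition into the arithmetic, and the compiler can then emit a select/predicated instruction rather than a branch. Driven from Python via PyCUDA, it might look something like this:
Code:
import numpy as np
import pycuda.autoinit                      # sets up the CUDA context on import
import pycuda.driver as drv
from pycuda.compiler import SourceModule

# Hypothetical kernel: double every positive element, zero the rest,
# using an arithmetic select instead of a divergent if/else.
mod = SourceModule("""
__global__ void scale_positive(float *out, const float *in, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                                   // bounds guard
    {
        float x = in[i];
        float keep = (x > 0.0f) ? 1.0f : 0.0f;  // select, not a divergent branch
        out[i] = keep * x * 2.0f;
    }
}
""")
scale_positive = mod.get_function("scale_positive")

n = 1 << 20
data = np.random.randn(n).astype(np.float32)
result = np.empty_like(data)
scale_positive(drv.Out(result), drv.In(data), np.int32(n),
               block=(256, 1, 1), grid=((n + 255) // 256, 1))

That trick only pays off when both sides of the conditional are cheap, of course; once each chunk genuinely needs different work, you're back to the breakdown described above.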
Yes, they also increased the bus to 384-bit and added more onboard RAM; this all adds up to a huge expense, so it will probably offer better performance than the 5870 but will also cost a HUGE amount more.
edit: Also, given the amount of R&D that's gone into this project, it won't be healthy for Nvidia to compete with AMD's 5000 series (which carried far lower R&D costs) on price: either they sell at a loss or they sell at a huge price that people won't pay. Although it's the start of what could be an amazing platform/design, it's just going to be a profitless technology unless some serious cost cutting and development can be done.
Doesn't matter anyway; by the time Fermi is out, AMD will already have its 6000 series out, or only a few months away, I reckon.
A 384-bit memory bus is actually smaller than the GTX285's, which interfaced using a 512-bit bus, so there will be a small cost saving from using fewer memory chips. It does mean it'll end up shipping with one of those odd-sounding frame buffer sizes though - 1.5GB most likely (I can't see them bothering with a 768MB version of the top-end card).
Of course, ATI only use a 256-bit bus, so unless Nvidia run their GDDR5 at less than 3200 effective they're going to have more memory bandwidth on tap. On the other hand, it's debatable whether ATI's top-end cards are bandwidth-limited anyway, and therefore whether that extra bandwidth will boost performance at all...
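For anyone who wants the sums behind that 3200 figure (assuming the 5870's stock 4.8Gbps-effective GDDR5): 256 bits is 32 bytes per transfer, so the 5870 has roughly 32 x 4.8 = 153.6GB/s on tap, while a 384-bit bus moves 48 bytes per transfer and so only needs 153.6 / 48 = 3.2GT/s (3200 effective) to match it; anything clocked higher than that gives Fermi the bandwidth advantage.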
Sorry, I haven't looked much into Nvidia's top-end cards; I've only focused on the GTX 260 :L.
Thanks for informing me of that; however, the rest is still right, isn't it? R&D needs to be recouped from somewhere, and it's going to be in the price of the cards. AMD just tweaked their design and added more, which is great (it shows) and in the end costs a lot less than changing the whole design!
We will only know whether the extra bandwidth helps once Fermi is released; I'm betting on a Q2 release now, tbh.