It will be released in 2016 and be easy for designers to integrate into GDDR5 designs.
"the company's answer to HBM"
Seems somewhat hyperbolic, if you ask me: even at 14Gb/s on a 512-bit pathway it wouldn't quite double the bandwidth available from the Fury's HBM1 implementation, which means a four-stack HBM2 implementation would still outdo the best a 512-bit GDDR6 implementation could offer, in a much smaller silicon area, and presumably also at lower power usage. Sounds more like a stop-gap while they try to work out what their actual answer to HBM is...
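For what it's worth, the arithmetic behind that comparison can be sketched like this (the Fury's four-stack HBM1 figures and the HBM2 doubling are the commonly quoted numbers; everything else is just bus width times per-pin rate):

```python
# Peak bandwidth in GB/s = bus width (bits) / 8 * per-pin data rate (Gbps).
def bandwidth_gb_s(bus_bits, gbps_per_pin):
    return bus_bits / 8 * gbps_per_pin

gddr6_512bit = bandwidth_gb_s(512, 14)   # hypothetical 512-bit GDDR6 at 14Gb/s
fury_hbm1 = bandwidth_gb_s(4096, 1)      # Fury: four 1024-bit HBM1 stacks at 1Gbps
hbm2_4stack = bandwidth_gb_s(4096, 2)    # four-stack HBM2 at double the data rate

print(gddr6_512bit, fury_hbm1, hbm2_4stack)  # 896.0 512.0 1024.0
```

896 GB/s falls short of doubling the Fury's 512 GB/s, and short of four-stack HBM2's 1024 GB/s, which is the point being made.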
I thought one of the main points of HBM was that it reduces the power consumption of the memory interfaces, because wiggling a pin faster doesn't come for free.
So unless power consumption is addressed somehow (and if it were, I'd expect this press release to be singing it from the rooftops, so I assume it isn't), this seems like a not-so-useful improvement.
AFAICT (and assuming this IS the same stuff as GDDR5X) the actual clock rate won't change, but the doubled prefetch will allow twice as much data to be transmitted per fetch. Which is curious, because I thought that was how GDDR5 already got its high data rates compared to GDDR3/GDDR4. Since it's targeting data rates that are double the existing GDDR5 range, I assume the clock and signalling rates will stay the same, which means it shouldn't consume (much) more power than a comparable GDDR5 implementation. Still a big chunk of the power budget going to driving the memory, though; it definitely looks like more of a stop-gap measure than a forward-looking memory subsystem enhancement. I guess it might be useful for mid-range (128-bit) cards: if it's similar enough to GDDR5 we could start seeing GDDR5/GDDR6 cards instead of the existing GDDR3/GDDR5 ones...
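A rough sketch of that prefetch argument, assuming the GDDR5X-style scheme (8n prefetch over a double-data-rate interface gives 4x the command clock per pin; 16n gives 8x; the 1.75GHz clock is purely an illustrative figure):

```python
# Same interface clock, doubled prefetch depth -> doubled per-pin data rate.
ck_ghz = 1.75                  # illustrative command clock (assumption)
gddr5_gbps = ck_ghz * 4        # 8n prefetch, DDR signalling: 7.0 Gbps per pin
gddr6_gbps = ck_ghz * 8        # 16n prefetch at the same clock: 14.0 Gbps per pin
print(gddr5_gbps, gddr6_gbps)  # 7.0 14.0
```

If that's right, the data rate doubles without the pins toggling any faster, which is why the power shouldn't move much.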
Wiggling a lot of short pins at 1GHz will always beat wiggling long pins at 6GHz or whatever this runs at; this will only help designers smooth out memory bottlenecks on a recycled mid-range GPU design, or let them downsize the memory bus silicon area to boost yields on the new processes. A die-shrunk 295X2 with this to help it keep up at 4K would probably embarrass a lot of top-end chips, and without any pricey interposer.
As long as it contributes to cheaper memory and faster graphics cards I'm all for it. I was afraid we were still going to be stuck with 2GB GDDR5 for low-end and mid-range graphics cards; we all know Nvidia especially likes to milk its 2GB 5.5Gbps cards at up to $250, so hopefully this will allow even greedy-ass Nvidia to add 4GB of 10Gbps memory to its low and mid offerings.
We are likely going to see HBM2 for the top end, the $650 and $500 cards (hopefully they'll be cheaper than that though), HBM1 for the $300-450 parts, and GDDR6 for mid-range and hopefully low-end cards.
Once HBM2 gets going, I would expect it to displace HBM1 entirely. The cost of interposers is the same, the memory cost will probably be the same. If you can buy one part rather than two similar ones, then mass production advantages drive costs down. Or more likely, drive cost down for the low end part, and profit margin up for the higher end part.
It would make sense if they divided their ranges between cards able to go above 1080p and those that can't. Is there really a need for the bandwidth that HBM offers when playing at 1080p and under?
A single stack of HBM2 will provide bandwidth similar to a 7950/7970, and either 4GB or 8GB of capacity. If you want to push high quality 1080p @ 60fps that's not an unreasonable spec to be thinking about. A 14nm 2048-shader die with a single stack of HBM2 would be tiny; get the thermals right and you could have a 1080p low profile card...
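A quick sanity check on that claim, using the 7970's public spec (384-bit GDDR5 at 5.5Gbps) against a single 1024-bit HBM2 stack at a 2Gbps data rate:

```python
# GB/s = bus width (bits) / 8 * per-pin rate (Gbps)
hbm2_one_stack = 1024 / 8 * 2    # 256.0 GB/s for a single HBM2 stack
hd7970 = 384 / 8 * 5.5           # 264.0 GB/s for the 7970's GDDR5 bus
print(hbm2_one_stack, hd7970)    # 256.0 264.0
```

Close enough to call "similar", as the post says.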
The only place I can see for GDDR6 is at lower levels: a 7870-level card could get away with a 128bit GDDR6 interface, saving silicon on the memory controller and I suppose saving power from having a smaller memory interface. OTOH, to push the same bandwidth as a flagship card you'd need a 768bit interface @ 11Gbps, which I think is stretching the bounds of credibility...
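Running the same numbers for those two scenarios (the 7870's 256-bit/4.8Gbps figures are from its spec sheet; the 12Gbps rate for a 128-bit GDDR6 card is an assumed mid-range speed grade):

```python
def gb_s(bits, gbps):
    return bits / 8 * gbps

hd7870 = gb_s(256, 4.8)       # 153.6 GB/s on the 7870's 256-bit GDDR5 bus
gddr6_128bit = gb_s(128, 12)  # 192.0 GB/s on half the bus width (assumed 12Gbps)
gddr6_768bit = gb_s(768, 11)  # 1056.0 GB/s: the implausible flagship case
hbm2_4stack = gb_s(4096, 2)   # 1024.0 GB/s from four HBM2 stacks, for comparison
```

So a 128-bit GDDR6 interface really does cover 7870-class bandwidth with silicon to spare, while matching four-stack HBM2 needs that unlikely 768-bit bus.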
But if someone wants to make a cheap card, then DDR4 should give GDDR5-like performance and is already hitting mainstream prices.
So, if low end uses DDR4 and high end goes HBM2 that leaves the mid range cards. If HBM2 becomes cheap enough, it leaves the lower mid range cards. That is still a lot of cards though.
I thought HBM only came in stacks of 1GB with a 1024-bit memory bus per stack, and the yet-to-be-released HBM2 was meant to double that to 2GB per stack with a 2048-bit memory bus. While a single stack of HBM2 would make for a tiny card, it would leave you with a 2GB tiny card.
Not quite. Based on the latest information I've been able to dredge up from Google:
HBM1 is, as you say, 1GB stacks (4 x 2Gb dies) with a 1024-bit interface at a 1Gbps data rate.
HBM2 uses the same 1024-bit interface, but doubles the data rate (hence the doubling of the bandwidth), and uses either 4 or 8 8Gb dies per stack, giving you either 4GB or 8GB per stack.
I believe there were rumours of 8-hi stacks for HBM1 once upon a time, but they don't appear to have happened.
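Putting those per-stack figures side by side (capacity = dies x die size; bandwidth = 1024 bits / 8 x data rate):

```python
# Per-stack HBM1 vs HBM2, using the figures above.
def stack(gbps, dies, die_gbits):
    return {"bandwidth_GBps": 1024 / 8 * gbps,   # 1024-bit interface for both
            "capacity_GB": dies * die_gbits / 8}

hbm1 = stack(1, 4, 2)      # 128 GB/s, 1 GB per stack (4 x 2Gb dies at 1Gbps)
hbm2_4hi = stack(2, 4, 8)  # 256 GB/s, 4 GB per stack (4 x 8Gb dies at 2Gbps)
hbm2_8hi = stack(2, 8, 8)  # 256 GB/s, 8 GB per stack (8 x 8Gb dies at 2Gbps)
```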
EDIT
I thought about that, but DDR4 currently seems to be topping out at around 3.4Gbps, which leaves it 25% behind the GDDR5 used on mainstream cards, and 50% behind the top-end GDDR5. I reckon there could be space for all of them, with DDR4 at the very low end, GDDR5 and GDDR6 filling the mid-range, and HBM2 from the higher mid-range upwards...?
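As a rough check on those percentages (3.4Gbps DDR4 against roughly 4.5Gbps mainstream and 7Gbps top-end GDDR5; the GDDR5 rates are my readings of the "25% behind" and "50% behind" figures, not quoted specs):

```python
# Per-pin data rates in Gbps; shortfall = 1 - DDR4 / GDDR5.
ddr4, gddr5_mid, gddr5_top = 3.4, 4.5, 7.0
print(round(1 - ddr4 / gddr5_mid, 2))  # 0.24 -> roughly 25% behind mainstream GDDR5
print(round(1 - ddr4 / gddr5_top, 2))  # 0.51 -> roughly 50% behind top-end GDDR5
```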
Last edited by scaryjim; 15-12-2015 at 04:58 PM.
Like I said before, assuming it's easy enough to tweak existing designs, these are more likely to be used in re-brands/die-shrinks of current GDDR5-using GPUs than in top-end parts.
So it seems, from more recent reading I've been doing, that my memory must be letting me down. I could have sworn when HBM1 first came out that I read HBM2 would basically be an 8-hi stack; now it seems either I'm mis-remembering, or HBM2 has changed to increasing the capacity of each stack rather than the number of stacks in each unit.