I offer you ... pancakes. A cooking time that can totally be parallelised!
Really nice analogy though, particularly the bit about splitting tasks up between extra people often needing additional overhead to make sure everything comes together properly. At some point I may have to blog this, and if I do I will totally be crediting and linking this post
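To put a rough number on the overhead bit, here's a minimal sketch (in Go, with completely made-up timings) of the pancake kitchen: mixing the batter is serial, each extra cook adds a parallel pan, and handing out jobs costs a little coordination. Doubling the cooks stops halving the wall time once the serial part dominates, which is basically Amdahl's law in a frying pan.

```go
// A minimal sketch of the "extra cooks" analogy. All durations are
// invented purely for illustration.
package main

import (
	"fmt"
	"sync"
	"time"
)

func cook(pancakes, cooks int) time.Duration {
	start := time.Now()
	time.Sleep(50 * time.Millisecond) // serial part: mixing the batter

	jobs := make(chan int)
	var wg sync.WaitGroup
	for c := 0; c < cooks; c++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				time.Sleep(10 * time.Millisecond) // one pancake per pan
			}
		}()
	}
	for i := 0; i < pancakes; i++ {
		jobs <- i // coordination: handing out the next pancake
	}
	close(jobs)
	wg.Wait()
	return time.Since(start)
}

func main() {
	for _, c := range []int{1, 2, 4, 8} {
		fmt.Printf("%d cook(s): %v\n", c, cook(32, c))
	}
}
```

Going from 1 to 2 cooks nearly halves the time, but 4 to 8 barely helps: the batter-mixing and the handing-out never parallelise.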
I agree... it's a superb analogy
top class
and to be fair... my Intel Core 2 Quad was HOTTER than an oven.....
Not quite. Back in the 386 and prior era, you had an x86 CPU and another socket on your motherboard for an x87 maths co-processor. People who ran games or spreadsheets bought the additional co-processor, others did not.
In the 486 era they combined the x86 and x87 CPUs into the same package and ever since we have pretty much forgotten that we get an x87 co-pro bundled into our CPUs.
When dual-cores first launched there was an x87 co-pro in each core. When the BD FX line launched, AMD only used one co-pro per two CPU cores (and then packaged them together as a module of 2 x x86 units and 1 x x87 unit). So, if your workload is fully integer you essentially have an 8-core processor; the more floating point operations you throw at it, the more you expose the weakness in the lack of x87 units.
To compare them at this level you would need to count x87 units as cores. So a 4-core i5 would actually be 8 cores (4 x x86 and 4 x x87) and an 8-core AMD FX would have 12 cores (8 x x86 and 4 x x87).
So, it's all a bit messy... which is why benchmarking is pretty much the be-all and end-all of identifying the value for money for a specific workload.
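If you wanted to see the integer/FP split for yourself, one rough approach (a sketch in Go; loop counts and constants are arbitrary) is to time the same number of threads doing pure integer work versus pure floating point work. On a module design with one FP unit per two cores, the FP run should scale noticeably worse once the thread count passes the x87 unit count; on a chip with one FPU per core the two curves look similar.

```go
// A rough microbenchmark sketch: pure-integer vs pure-FP scaling.
// Workload sizes are arbitrary; this is illustrative, not rigorous.
package main

import (
	"fmt"
	"sync"
	"time"
)

var sink uint64   // stops the compiler discarding the integer loop
var fsink float64 // same for the FP loop

func run(n int, work func()) time.Duration {
	start := time.Now()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() { defer wg.Done(); work() }()
	}
	wg.Wait()
	return time.Since(start)
}

func intWork() {
	x := uint64(1)
	for i := 0; i < 200_000_000; i++ {
		x = x*6364136223846793005 + 1442695040888963407 // integer ALU only
	}
	sink = x
}

func fpWork() {
	x := 1.0
	for i := 0; i < 200_000_000; i++ {
		x = x*1.0000001 + 0.5 // floating point pipe only
	}
	fsink = x
}

func main() {
	for _, n := range []int{1, 2, 4, 8} {
		fmt.Printf("%d threads  int: %v  fp: %v\n", n, run(n, intWork), run(n, fpWork))
	}
}
```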
Not quite. It's more like AMD's modules have a "full" core supplemented with parts of a second core. The floating point hardware is shared, so there's only one of those (although it was meant to be a big fat FP pipe that could handle the extra throughput, it never seemed to work like that in real life), but there are two integer cores. Intel have 4 very fast cores, and they can send 2 sets of instructions down the pipe at a time to simulate having 8 cores (what they call hyperthreading).

The class action lawsuit was basically claiming that because you couldn't separate one core in a Bulldozer module from the other, it wasn't "really" 8 cores. It's all down to what you call a core. Personally, I'm with AMD on that one: historically the x86 core only did integer calculations, with floating point work done in software or on a separate x87 co-processor. It wasn't until the 486 that it was common to have x87 and x86 implemented in the same piece of silicon, and from there there've been a variety of enhancements to the FP capabilities of modern CPUs. AMD simply decided to rebalance the supporting hardware in silicon: each module effectively has 2 x86 cores and a single x87 core. If I were them I'd be half tempted to reply to the class action suit by pointing out that they actually have 12 cores on their 4-module processor...
For parallel integer workloads, the 8-core Bulldozer/Piledriver is still pretty impressive, and if your software can take advantage of the architecture it's phenomenal: look at the original FX 8150 threaded benchmarks from the Hexus review and watch how the right workload lets it stroll past the i7 2600K (that TrueCrypt AES benchmark, a rough sketch of which is below). However, the design makes it far less software agnostic than many previous generations of processor: if your software doesn't like Bulldozer, there's not much you can do about it.
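For a feel of that kind of test, here's a hedged sketch (in Go, not the actual TrueCrypt code; the buffer size and worker count are made up): parallel AES-CTR over in-memory buffers. With AES-NI in play this is essentially a parallel integer throughput benchmark, which is exactly where the 8150 shone.

```go
// A sketch of a parallel AES throughput test, in the spirit of the
// TrueCrypt benchmark mentioned above. Sizes are arbitrary.
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
	"sync"
	"time"
)

func main() {
	key := make([]byte, 32) // zero key is fine for a throughput test
	iv := make([]byte, 16)
	const bufSize = 64 << 20 // 64 MiB per worker
	const workers = 8

	start := time.Now()
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			block, _ := aes.NewCipher(key)
			buf := make([]byte, bufSize)
			cipher.NewCTR(block, iv).XORKeyStream(buf, buf) // encrypt in place
		}()
	}
	wg.Wait()
	elapsed := time.Since(start)
	mib := float64(workers*bufSize) / (1 << 20)
	fmt.Printf("%.0f MiB in %v (%.0f MiB/s)\n", mib, elapsed, mib/elapsed.Seconds())
}
```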
It's a curious reversal of the Pentium/Athlon battle, where AMD were on top partly because of their excellent floating point performance. It'll be very interesting to see how AMD choose to balance Zen when it comes along. For quite a long time now they've pushed core count over straight-line speed, and I don't think there are that many software processes where the straight-line performance of modern processors is unacceptable. DX11 gaming is perhaps one of the last common ones, so with DX12 and Vulkan waiting in the wings, it's a very interesting time for processor design.
EDIT: Aw, come on Zakky - that's what watercooling is for
I have learned loads in this thread
Thanks guys, very valuable.
I don't know why people bang on about the floating point of the FX processors. From the Hexus review when the 8350 was released:
So at release the chip had the SSE grunt to compete, and video encoding was the one place it could win in benchmarks; the problem seemed to be more general. In gaming, which uses a more general mix of instructions where SSE throughput matters less, the FX started to fall behind, looking more like:
Some games were better, some were worse, about 20% behind Intel on average in games at the time.
In server workloads like databases, which are integer and memory-access bound and which at first glance the FX looks really good for, it actually sucked fairly badly. Some of that has been blamed on the cache being slow, some on the instruction decoders on Piledriver sometimes giving you all the throughput of one of the cheap cat cores.
Of course, with no design updates to FX for three and a half years, it now looks like as much of a bargain as being offered an i7 3770 for £150: not so good.
Bulldozer was an odd one.
I dug up some arithmetic benchmarks that showed that its IPC under ideal circumstances was actually better than Phenom II's (an FX 4100 was matching an equivalent-clocked Phenom II X4), yet in many real-world applications it bottomed out badly. IIRC it was a similar set of benchmarks that showed that for some instructions its FP throughput was markedly worse than the equivalent Phenom II X4's, but obviously instruction optimisations can affect that hugely. And as we now know, of course, DX11 and earlier games are heavily bound by single-thread performance, so even games that utilise many threads efficiently still underperform on AMD processors because of that lower single-thread performance.
IIRC the decoders and branch predictors got completely redesigned in a later iteration (Steamroller?), and the AVX and AES instructions were phenomenally well implemented, but again that's only useful if the software uses them. I guess in many ways it was a bit too forward-looking: it was designed for the way AMD wanted software to be written, but you can't rely on people writing software just to perform well on a given processor. Developers didn't code to it, and so overall it comes out looking a lot worse than it probably should.
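If you're curious whether your own chip actually exposes those features to software, a quick sketch using the golang.org/x/sys/cpu package (an external Go package, assumed installed via go get) looks like this:

```go
// Print whether the running CPU advertises the AVX and AES-NI
// features discussed above.
package main

import (
	"fmt"

	"golang.org/x/sys/cpu"
)

func main() {
	fmt.Println("AVX:   ", cpu.X86.HasAVX)
	fmt.Println("AES-NI:", cpu.X86.HasAES)
}
```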
It wasn't just buggy. Its shocking real-world IPC and power consumption were also its downfall. If Bulldozer had instead been Piledriver, AMD would have been close enough to Intel at the time to charge a lot more for their high-end SKUs.
They actually had the gall to claim an increase in IPC over Phenom II for Bulldozer before it was released.
"In a perfect world... spammers would get caught, go to jail, and share a cell with many men who have enlarged their penises, taken Viagra and are looking for a new relationship."
As I mentioned earlier, under ideal circumstances it was actually true. Sadly, I can't find the post where I mentioned it, or the review where I noticed it, but IIRC it was basically a well-threaded throughput test where a 3.6GHz FX 4100 beat a 3.6GHz Phenom II Quad.
It does raise the question of how much internal benchmarking AMD did before the release of the initial Bulldozer FX CPUs, of course. Did they run a wide range of real-world benchmarks, find out it was a bit of a dog in lots of real-world tasks, but just bite the bullet and go with it because it was what they had? Or did they run lots of theoretical tests, find that it had higher synthetic performance than Phenom II, and trust that it would perform the same across a broad range of real-world benchmarks, getting a horrid shock when it suddenly fell flat under most circumstances? We'll never know, of course, but you'd hope it was more of the former than the latter...
I've run on a dual core for ages, and it's been fine for all I've needed. It's only recently, now that I've started doing more photo editing and batch processing, that I'm finding the needle sitting at 100% CPU load, and I need more grunt to reduce the regular 30-minute periods where I can't do anything else. I may well get a multi-core AMD since, due to budget constraints, I don't think I can justify going for the multi-core Intel i7s.
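If it's the batch processing that's pegging the CPU, most of the win on a multi-core chip comes from just fanning the per-photo work out over a worker pool, one worker per core. A minimal sketch (in Go, with the expensive per-photo step stubbed out and hypothetical file names):

```go
// A minimal worker-pool sketch for a batch photo job: a fixed set of
// workers pull file names from a channel. File names are hypothetical
// and process() is a placeholder for the real decode/filter/re-encode.
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func process(path string) {
	// placeholder for the expensive per-photo work
	_ = path
}

func main() {
	photos := []string{"img001.jpg", "img002.jpg", "img003.jpg"} // hypothetical
	jobs := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < runtime.NumCPU(); i++ { // one worker per logical core
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				process(p)
			}
		}()
	}
	for _, p := range photos {
		jobs <- p
	}
	close(jobs)
	wg.Wait()
	fmt.Println("done")
}
```

Whether the extra cores are AMD modules or Intel cores with hyperthreading, this shape of job scales with thread count, which is exactly the workload the cheaper many-core FX parts were good at.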