We pit an HIS Radeon HD 5870 against a BFG GeForce GTX 295 and see if either can beat up on a high-end Core i7 for video transcoding purposes.
I thought GPGPU acceleration was supposed to be 10-50x faster than a CPU?
I noticed that the NVIDIA card was only about 25% faster than the CPU, and the ATI card, although well ahead of NVIDIA, was still only about 40% faster than the CPU at the same task.
What's the deal with that?
I understand they were using one of the fastest CPUs available right now, but they're also using two very fast GPUs.
I've done some of my own testing with Badaboom and a 9600 GT; the speed wasn't great (no faster than my Q6600, sometimes slower) and the quality appeared worse than using Handbrake on the CPU.
EDIT: No doubt a 9800 or GTX would be faster than my Q6600, but the quality and tweaking in Badaboom is way below Handbrake.
I tried running it out of curiosity on my 8200 IGP too, and it was an utter joke: the Athlon II 240 was much faster (5-10x). Touting CUDA support on the IGP is a bit of a stretch, as doing the work on your CPU would probably be faster anyway.
Last edited by kingpotnoodle; 28-10-2009 at 02:22 PM.
I had the impression NVIDIA's CUDA was better, but that appears not to be the case (at least speed-wise).
It's a pity they're using video encoding as the sole comparison. OK, so it's one of the key areas you might consider, but I'd also prefer a direct comparison that can be objectively measured. Video quality is by its nature subjective; the output of, say, a fluid dynamics run or some other compute-bound task with a definite correct solution would be nice to see.
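To make the point concrete: an objectively measurable benchmark is one where every backend must produce the same, checkable answer, so only the time differs. A minimal sketch (my own illustrative example, not anything from the review):

```python
import math
import time

def leibniz_pi(terms: int) -> float:
    """Deterministic, embarrassingly parallel workload with a known answer:
    any backend (CPU or GPU) must converge to the same value of pi."""
    return 4.0 * sum((-1.0) ** k / (2 * k + 1) for k in range(terms))

start = time.perf_counter()
result = leibniz_pi(1_000_000)
elapsed = time.perf_counter() - start

# Correctness is objective here: the answer is right or it isn't.
assert abs(result - math.pi) < 1e-5
print(f"pi ~= {result:.6f} in {elapsed:.3f}s")
```

Swap the inner loop for a CUDA or OpenCL kernel and you get a fair speed comparison with no arguing over which output "looks better".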
PK
A very poor review this time.
The source you have chosen has already been heavily encoded in a very lossy format. It's not HD as you claim - it's 1000px × 563. It's blurry, has huge compression artifacts, ghosting and halos, and there is no detail. You need to redo the tests using a real high-definition source, ideally one that's not already compressed. Try here if lost.
The output of ATI's final GPU result looks like it's been taken on a five-year-old camera phone; NVIDIA's is considerably better. Yet contrary to the images posted, you claim "AMD came up trumps by offering a much cleaner (albeit slightly softer) output than the NVIDIA's". This is not subjective - it is utter nonsense. Please review the images you posted again.
Failure to point out faster, higher-quality H.264 encoders such as the free x264 is a huge omission. Get Handbrake for ease of use, encode your source again, and be amazed at how poor MediaShow Espresso is for features, ease of use, quality, and speed. GPU video encoding is currently slow and gives terrible quality, because no one has coded a GPU encoder properly yet.
As a video encoding veteran, I see a number of flaws in the testing:
1. If you can't run 8 threads then use another encoder, such as x264 build 1300+. Either way, running 4x4 = 16 threads can affect encoding speed compared to 2x4 or 4x2, even though it might not achieve 100% CPU usage.
2. There are huge differences in speed (up to 1000 times) and quality depending on what h264 encoder setting is used.
3. Encoders from different developers can have very different implementations of the H.264 standard. It is impossible to compare them directly, unless you are willing to go all the way and find settings for each encoder such that a similar PSNR/SSIM is achieved.
4. The FPS of the source/output is of no concern to the encoder. It is likely just a bug in the frontend that fed the wrong FPS value to the muxer. Even if the software is smart enough to do pulldown, it would be 30->24 FPS, not the other way round.
5. This is supposed to be a video encoding comparison, so why include audio? Depending on the settings used, it can affect the speed.
6. The software developer could have added source filtering, which can significantly affect encoding speed and quality. In the ATI / NVIDIA encoders it could be that the CPU is busy filtering and the GPU is waiting for data to process. And this filtering is probably applied blindly, without caring whether the source is clean or not, so Blackbadger's point about using a cleaner source is irrelevant. Using a bad source will bias towards the software that does filtering. Using a good source will bias towards the software that does NOT filter, because filtering can hurt quality, either by destroying detail or by interfering with H.264's motion estimation algorithm.
I understand it is difficult to do a GPGPU comparison, especially with H.264. But this review doesn't really tell us anything, apart from the fact that GPGPU and the Cyberlink encoders are crap.
Last edited by arthurleung; 28-10-2009 at 03:08 PM.
Workstation 1: Intel i7 950 @ 3.8Ghz / X58 / 12GB DDR3-1600 / HD4870 512MB / Antec P180
Workstation 2: Intel C2Q Q9550 @ 3.6Ghz / X38 / 4GB DDR2-800 / 8400GS 512MB / Open Air
Workstation 3: Intel Xeon X3350 @ 3.2Ghz / P35 / 4GB DDR2-800 / HD4770 512MB / Shuttle SP35P2
HTPC: AMD Athlon X4 620 @ 2.6Ghz / 780G / 4GB DDR2-1000 / Antec Mini P180 White
Mobile Workstation: Intel C2D T8300 @ 2.4Ghz / GM965 / 3GB DDR2-667 / DELL Inspiron 1525 / 6+6+9 Cell Battery
Display (Monitor): DELL Ultrasharp 2709W + DELL Ultrasharp 2001FP
Display (Projector): Epson TW-3500 1080p
Speakers: Creative Megaworks THX550 5.1
Headphones: Etymotic hf2 / Ultimate Ears Triple.fi 10 Pro
Storage: 8x2TB Hitachi @ DELL PERC 6/i RAID6 / 13TB Non-RAID Across 12 HDDs
Consoles: PS3 Slim 120GB / Xbox 360 Arcade 20GB / PS2
Amdahl's Law still applies - any serial code in there is a common element to both systems. On a pure throughput of 1,000,000,000 parallel threads, yes the GPU should be faster than a CPU.
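Amdahl's law puts a hard ceiling on that, and it's worth seeing the arithmetic (illustrative fractions of my own choosing, not figures from the review):

```python
def amdahl_speedup(parallel_fraction: float, n: int) -> float:
    """Upper bound on overall speedup with n parallel units when only
    `parallel_fraction` of the work can be parallelised (Amdahl's law)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n)

# Even with effectively unlimited GPU threads, 20% serial work
# caps the whole-pipeline speedup at 5x:
print(amdahl_speedup(0.8, 1_000_000_000))
```

So a transcoder that is "10-50x faster" on its parallel kernels can still look only 25-40% faster end to end if enough of the pipeline (demuxing, filtering, muxing, PCIe transfers) stays serial.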
With lots of parallel work to do, more threads help hide memory accesses on the GPU (600-tick latency for GPU to GPU memory versus 100 ticks for CPU to RAM). Yet it's still a case of deciding what memory to use where: it isn't cached like on a CPU, you have to program where data goes, which can be frustrating when it's 250 KB at a time, with a 600-tick latency.
GPUs work really well when properly coded, with lots of threads to hide memory latency. You'd be surprised just how much of a performance knock you can get from a simple cutilSafeCall() wrapper, which checks that the GPU can actually do the work (speaking from a CUDA standpoint).
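A back-of-the-envelope way to see why so many threads are needed, using the 600-tick figure above (the 20 ticks of compute per access is my own illustrative number):

```python
import math

def threads_to_hide_latency(mem_latency_ticks: int, compute_ticks_per_access: int) -> int:
    """Minimum resident threads so the ALUs never stall on memory: while one
    thread waits out the latency, the others must supply enough compute
    to cover the wait (a simple Little's-law style estimate)."""
    return math.ceil(mem_latency_ticks / compute_ticks_per_access) + 1

# With ~600 ticks of memory latency and, say, 20 ticks of arithmetic
# per memory access, you need ~31 threads in flight just to stay busy:
print(threads_to_hide_latency(600, 20))
```

If the workload can't supply that many independent threads per core, the GPU idles and the theoretical throughput never materialises.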
Thanks for updating the original reference image. It is a lot clearer now.
However, it is still a very lousy image that has been re-compressed. It is not what a true 35 Mbps HD source would look like.
It's a combination of your terrible source and the software you're using that is the problem, not the bit-rate or H.264; H.264 can retain all of the detail at that bit-rate. You claim: "Due to the relatively low bit-rate the YouTube profile employs, a lot of the fine detail - such as the individual strands of hair, as well as much of the background detail - is lost. Even an advanced codec such as H.264 can't work miracles when there's a lot of movement and a fairly low bit-rate, clearly."
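The bit budget makes the point. A quick bits-per-pixel calculation (the 4 Mbps "YouTube profile" figure is my assumption for illustration; the review doesn't state its exact bit-rate):

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int, fps: float) -> float:
    """Average bits the encoder can spend per pixel per frame."""
    return bitrate_bps / (width * height * fps)

# A 35 Mbps 1080p24 master vs an assumed ~4 Mbps "YouTube profile" encode:
print(round(bits_per_pixel(35e6, 1920, 1080, 24), 3))
print(round(bits_per_pixel(4e6, 1920, 1080, 24), 3))
```

Roughly 0.70 bits per pixel at 35 Mbps versus 0.08 at 4 Mbps: at the higher rate a competent H.264 encoder has plenty of headroom to keep fine detail, so losing it points at the source or the encoder, not the codec.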
Also, as you have pointed out, the Espresso software does not support multiple GPUs. You have chosen to pit a new high-end ATI graphics card against a dual-GPU NVIDIA card, which is effectively a mid-range GTX 260 for this application. You need to redo the tests with a high-end GTX 285 if you wish to compare ATI vs NVIDIA in this software.
Weak points in your article outlined in this post... overclock.net/7510321-post3.html (I registered just to make this post so I can't post links yet... if you don't deem this masked link OK, PM me and I will replace it with a quote).
Please make corrections accordingly (some are opinion-based, but others are fact -- you should at least take the resolution, FPS and levels parts into account).
Last edited by minori; 28-10-2009 at 08:26 PM.
You'd be surprised how many HD sources are extremely noisy, unless shot on super-high-end equipment, which most people and low-budget studios don't have. Not to mention a lot of them are upscaled from 720p. The pre-filtering-or-not argument is in my previous post, FYI. x264 can do magic if you understand how it actually compresses and you filter/optimize your source into the most compressible form.
http://x264dev.multimedia.cx/?p=164 <--Good Read
http://x264dev.multimedia.cx/?p=102#more-102 <---This is a MUST READ
Last edited by arthurleung; 29-10-2009 at 12:30 AM.
LOL. I don't see how H.264 being able to produce good quality at ~2 Mbps with certain 720p encodes is relevant. What is relevant is producing an image transparent to the source. Now, even though from the screenshot the source doesn't seem to have much in the way of grain, action, dark scenes, or anything else that requires a high bit-rate, it seems unlikely you are going to get transparency with that level of compression. Can't say for sure without testing myself, but in my experience it just doesn't happen.
Yes, using better settings could make a HUGE difference. For example, here are two encodes of Star Trek, from the same source and using the exact same bitrate:
Source: http://achumpatoxford.com/u/files/27...3b7da4aab8.png
Encode 1: http://achumpatoxford.com/u/files/27...9ecbabb92a.png
Encode 2: http://achumpatoxford.com/u/files/27...d4d7eb56e6.png
Note how encode 1 is almost transparent, while the second isn't even close. You are going to have awful settings for everything with a program like the one used in the review. The only two things I can really take away from this article are that the program is useless... and that I would have at least expected the GPU encodes to beat the CPU encodes by a wider margin.