Quote Originally Posted by Lifepower View Post
I used CUDA for a research project a while back. However, I think the language itself is still in its infancy, much less evolved than HLSL, for instance. In some cases you are better off doing GPGPU with actual shaders (say, SM4) than with CUDA itself.

The speedup figures are relative: with GPGPU you can do very simple things very fast.

If you need to do complex tasks, I'd recommend multi-threading on an actual CPU. In our latest publication, which will appear this November at the conference on optimization and computer software, we benchmark an illumination approach implemented both on the GPU and on a multi-threaded CPU. The results show that CPUs such as the Core i7 Q840, running multiple threads on a 64-bit platform, are extremely powerful. We don't even use the CPU to its full potential; SSE2 and later instruction sets could speed things up even more.

I personally find CUDA a very low-level language, not far from raw x86/x64 assembly, except that it uses C syntax instead of actual instructions. Throughout the code you have to think about the hardware architecture and adapt your problem to the hardware, not the other way around. It is a nice fit for video editing, massive image processing and similar tasks, but even there, from a developer's point of view, you'll save a lot of time and headaches if you develop for the CPU instead.
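To make that "adapt your problem to the hardware" point concrete before I reply: even a trivial element-wise addition in CUDA is written in terms of the launch grid rather than the data itself. This is my own illustrative sketch, not code from Lifepower's work:

```cuda
// Illustrative sketch: a trivial element-wise add still forces you to
// describe the computation in terms of the grid/block/thread hierarchy.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    // Map this thread's position in the launch grid to an array index.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)               // guard: the grid is usually rounded up past n
        c[i] = a[i] + b[i];
}

// Host side: the launch configuration (block size, number of blocks)
// is a hardware decision the programmer makes explicitly:
//   vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
```

Pick the wrong block size or memory access pattern and performance collapses, which is exactly the kind of hardware-first thinking being described above.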
You are quite right that shaders can work just as well. The shader languages are mature, very fast, and available without extra installers. I always insist on "don't count out shaders" when I teach GPU computing.

However, shaders don't give you the optimization options that CUDA does. Yes, you have to tweak the algorithm to fit the hardware, but that is exactly where CUDA is at its best, and it often outperforms CPUs by a large margin.
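One example of those optimization options: CUDA exposes explicitly managed on-chip shared memory and thread-block barriers, which SM4-era shaders simply don't give you. A hedged sketch of a block-level sum reduction (illustrative names, assumes a block size of 256):

```cuda
// Block-level sum reduction using __shared__ memory and __syncthreads():
// explicit on-chip storage and barrier control, unavailable in SM4 shaders.
__global__ void blockSum(const float *in, float *out, int n)
{
    __shared__ float tile[256];              // explicitly managed on-chip memory
    int tid = threadIdx.x;
    int i   = blockIdx.x * blockDim.x + tid;

    tile[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();                         // barrier across the thread block

    // Tree reduction inside the block: halve the active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            tile[tid] += tile[tid + stride];
        __syncthreads();
    }

    if (tid == 0)                            // one partial sum per block
        out[blockIdx.x] = tile[0];
}

// Host side, assuming blockDim.x == 256:
//   blockSum<<<numBlocks, 256>>>(d_in, d_out, n);
```

That kind of explicit control over the memory hierarchy is precisely the tuning that lets CUDA pull ahead of both shaders and CPUs on the right workloads.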