View Full Version: CUDA & OpenCL rendering native in CORE?



grn
09-20-2009, 06:54 PM
How highly do you value GPU rendering?

If you must have it, how soon?

Do you think Worley or some other third-party company will add it to CORE if NewTek doesn't?

pixym
09-20-2009, 09:11 PM
It is by far the most interesting render tech reviewed this year. I really hope to be able to make my renders 20 times faster and sell my render farm...
Regarding Worley, the only thing I know is that Mister Worley is very active on the NVIDIA CUDA forum!
CORE will surely give us this render option in the future.

praa
09-21-2009, 09:53 AM
No to CUDA and yes to open standards, i.e. OpenCL.

grn
09-21-2009, 03:45 PM
It is by far the most interesting render tech reviewed this year. I really hope to be able to make my renders 20 times faster and sell my render farm...
Regarding Worley, the only thing I know is that Mister Worley is very active on the NVIDIA CUDA forum!
CORE will surely give us this render option in the future.

Mister Worley is on the CUDA forums! Now that sounds promising. :D If he is working on a CUDA renderer, it could be something blazing fast, as the guy made the fastest CPU renderer on the planet.

The quality of 3D graphics in general could get a sudden facelift once GPUs replace CPUs. You can install four 3D cards in a single personal computer, and the performance per watt is great. So much muscle - what will we do with it all?


No to CUDA and yes to open standards, i.e. OpenCL.

AFAIK CUDA (which is NVIDIA only) has the upper hand at the moment. Hopefully that will change.

Lightwolf
09-21-2009, 04:09 PM
The problem is that it's not trivial to do, even with CUDA.

I've had a look at StudioGPU recently (which is AMD only and Brook+ based - and AMD currently has the lead in terms of GPU performance, hands down).
It's still missing a lot of things we take for granted in a production renderer... and the graphics board RAM limit doesn't help either (current GPUs are limited to 32-bit addressing as well).

I'm looking forward to what nVidia will offer as an SDK in that area (for Quadros only). I'm expecting GPU-accelerated rendering, but not completely GPU-based rendering, in the medium term. And not really 20x either...

Cheers,
Mike

The Dommo
09-22-2009, 07:27 AM
That's one of the things that made the Wildcat Realizm cards so good - their memory handling.
http://www.3dlabs.com/legacy/Datasheets/realizm800.pdf

Lightwolf
09-22-2009, 07:33 AM
That's one of the things that made the Wildcat Realizm cards so good - their memory handling.
http://www.3dlabs.com/legacy/Datasheets/realizm800.pdf
All current GPUs use "virtual memory" (and have since AGP) - but that doesn't help much with GPU-based computing, where any transfer from CPU RAM to GPU RAM is one of the many bottlenecks.
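
Just to illustrate the point - a toy CUDA snippet (my own, nothing to do with StudioGPU or any shipping renderer) that does nothing but time how long it takes to push 256 MB of "scene data" from CPU RAM into GPU RAM:

// Toy CUDA snippet: time one host-to-device copy of ~256 MB.
// No error checking; the buffer contents are irrelevant - it only measures the upload itself.
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

int main()
{
    const size_t bytes = 256u * 1024u * 1024u;   // 256 MB stand-in for a chunk of geometry/textures
    float* host = (float*)malloc(bytes);
    memset(host, 0, bytes);                      // touch the pages so the copy is realistic

    float* dev = nullptr;
    cudaMalloc((void**)&dev, bytes);

    auto t0 = std::chrono::high_resolution_clock::now();
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);   // blocking copy: CPU RAM -> GPU RAM
    auto t1 = std::chrono::high_resolution_clock::now();

    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    printf("uploaded %zu MB in %.2f ms (%.2f GB/s)\n",
           bytes >> 20, ms, (bytes / (ms * 1e-3)) / 1e9);

    cudaFree(dev);
    free(host);
    return 0;
}

Even on a fast bus that works out to a few GB/s at best - and a renderer that has to re-upload geometry whenever the scene changes pays that price every single time.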

As a side note, 3DLabs' current endeavors are quite interesting as well: http://www.ziilabs.com

Cheers,
Mike

Lightwolf
09-22-2009, 08:00 AM
I'm curious what you base that on. All the tests I've seen show that ATI's top cards get thoroughly creamed by the GTX 285 (in both games and DCC). nVidia's top card is somewhat more expensive than a 4890, but it's also a fair bit faster and has more exclusive, industry-adopted technologies available for it (CUDA/PhysX). ATI's drivers are also still pretty abysmal in various areas.

Are things different with FireGL vs Quadro? I haven't been following workstation cards very closely for the past couple of years...
I meant GPU performance in the context of this thread... and that is computing.
When it comes to raw GFLOPS, nVidia can't beat ATI at the moment... both at single precision and, to an even larger degree, at double precision.

I've got some numbers for fairly simple benchmarks here (matrix multiplication) - using libraries optimized either for CUDA or Brook+:
GTX 285: 448 / 91 (single/double GFLOPS)
Radeon 4870: 750 / 340
Core i7: 99 / 49

Mind you, matrix multiplications are fairly "basic" operations (these are 1024x1024 matrices) - but the numbers are close enough to the relative specs.
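
For reference, that kind of figure is just 2*N^3 floating-point operations divided by the kernel time. A naive CUDA version looks roughly like this (my own quick sketch - the numbers above come from properly optimized CUDA/Brook+ libraries, so don't expect this one to get anywhere near them):

// Naive 1024x1024 matrix multiply, timed with CUDA events, reporting GFLOPS.
// Deliberately simple: no tiling/shared memory and no error checking.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void matmul(const float* A, const float* B, float* C, int N)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < N; ++k)
            acc += A[row * N + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}

int main()
{
    const int N = 1024;
    const size_t bytes = (size_t)N * N * sizeof(float);

    float *dA, *dB, *dC;
    cudaMalloc((void**)&dA, bytes);
    cudaMalloc((void**)&dB, bytes);
    cudaMalloc((void**)&dC, bytes);
    cudaMemset(dA, 0, bytes);                 // contents don't matter for timing
    cudaMemset(dB, 0, bytes);

    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x, (N + block.y - 1) / block.y);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    matmul<<<grid, block>>>(dA, dB, dC, N);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    double gflops = 2.0 * N * N * N / (ms * 1e-3) / 1e9;   // ~2*N^3 flops per multiply
    printf("%dx%d matmul: %.2f ms, %.1f GFLOPS\n", N, N, ms, gflops);
    return 0;
}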

Cheers,
Mike

Lightwolf
09-22-2009, 08:12 AM
Ah, thanks for clarifying. How come there's such a big difference between computing and pushing DX/OGL pixels on the same hardware? Drivers? Memory bandwidth?

How can one brand excel in pushing pixels and lose out when it comes to raw compute power... what is the deciding factor here?
Drivers... the quality of the software.
Which is also why ogling the raw specs of a GPU is kind of a moot point, really. What's the point of a bad renderer on a fast GPU if a good renderer on a CPU produces better results?

As for ATI, apparently their FireGL drivers aren't that bad for pro apps, actually. They're quite competitive compared to Quadros. The consumer line is a different matter though (but even there ATI isn't doing that badly in DX-based apps, especially if you take power usage into account as well).

Cheers,
Mike

adamredwoods
09-23-2009, 04:24 PM
Someday soon I hope.....
http://www.cgarchitect.com/news/SIGGRAPH-2009-CHAOS-GROUP-GPU.shtml

Lightwolf
09-23-2009, 04:56 PM
Someday soon I hope.....
http://www.cgarchitect.com/news/SIGGRAPH-2009-CHAOS-GROUP-GPU.shtml
Hopefully not as laggy as that... then again, everything needs to be transferred to the GPU memory first. Fun :D

Cheers,
Mike

monovich
09-24-2009, 12:22 PM
if that is the definition of laggy, I'll take laggy any day of the week.

Lightwolf
09-24-2009, 03:35 PM
if that is the definition of laggy, I'll take laggy any day of the week.
Laggy means that there is a delay, and in this case there is a massive delay between a change of the scene and a new render.
It might be good for final renders, but a replacement for something like FPrime as an interactive tool it ain't - at least not yet.

Cheers,
Mike

pixym
09-28-2009, 08:37 AM
Another hardware GI tech demo, this time with Caustic RT:
http://vimeo.com/6715300

Sensei
09-28-2009, 09:44 AM
Laggy means that there is a delay, and in this case there is a massive delay between a change of the scene and a new render.
It might be good for final renders, but a replacement for something like FPrime as an interactive tool it ain't - at least not yet.

The lag might be the result of scanning the whole scene and every object's geometry... especially when there are a million polygons... and the SDK is in C++...

Scanning geometry is very, very slow in LW... especially if you're also grabbing each polygon's normal vector and the smoothed normal vectors of all its points. For example, on a 2.0 GHz Core 2 Duo (my laptop), scanning 500,000 polygons takes 0.7-1.0 seconds... and that's just a scanPolys() callback filling an array of polygons, all in C-like code, without even using the C++ vector class so as not to slow the scan down. After removing the pntOtherNormal() calls from the callback, scanning half a million polygons takes 0.3 seconds (a 200-300% speedup).

That's raw time spent entirely in the Layout Master context, so it affects the interactivity of a real-time renderer while moving, rotating, scaling, or morphing an object with bones etc. - anything that causes the geometry to be rebuilt. Transformations of lights, cameras and surfaces don't (or shouldn't) cause that lag.

I think FPrime doesn't use pntOtherNormal(), because that function was only added to the LWSDK recently, in v9.x or so. Instead it has to calculate the smoothing manually, but that work can be deferred from the Layout context to the rendering context (and made multi-threaded), so it doesn't affect the renderer's interactivity.
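
Roughly like this - just a sketch of the idea with made-up Vec3/Poly types (not the LWSDK structures), and it ignores smoothing angles and surface breaks:

// Rough sketch only: accumulate each polygon's face normal onto its points, normalize afterwards.
// This pass can run in the rendering context (and be split across threads) instead of
// inside the Layout scan callback.
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
struct Poly { int v[3]; Vec3 normal; };    // triangle indices + its face normal

std::vector<Vec3> smoothedNormals(const std::vector<Poly>& polys, size_t numPoints)
{
    std::vector<Vec3> acc(numPoints, Vec3{0.0f, 0.0f, 0.0f});

    for (const Poly& p : polys)                       // sum face normals per point
        for (int i = 0; i < 3; ++i) {
            acc[p.v[i]].x += p.normal.x;
            acc[p.v[i]].y += p.normal.y;
            acc[p.v[i]].z += p.normal.z;
        }

    for (Vec3& n : acc) {                             // normalize the sums
        float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
        if (len > 0.0f) { n.x /= len; n.y /= len; n.z /= len; }
    }
    return acc;
}

The accumulate-and-normalize pass only needs the polygon normals and point indices, so it can run in the render threads after the scan has finished.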