
GPU Rendering. Measuring the Power!



silviotoledo
04-23-2010, 07:14 AM
I'd like to start a thread about HOW MUCH MORE powerful rendering would be using a GPU instead of a CPU, and also about how to get this into LightWave.

Octane Render is the only GPU application I know of that is compatible with LightWave.
I still don't know about CORE and GPU support at this moment.

V-Ray seems to be almost 20x faster on the GPU.

Now it seems everyone will have the 40,000 cores from Weta's Avatar.

---------------------------------------------------------------------
http://www.youtube.com/watch?v=4zVAR_l2w8k


A 20x speed increase in rendering over a quad i7 (with 8 virtual cores) from a $310 card http://www.newegg.com/Product/Product... Now imagine 2 or 3 SLI'd GTX 295s!!! Imagine a possible 120x speed increase for just the price of the graphics cards, oh and you can play Crysis at 3840 x 2400 just for kicks. This will change the 3D rendering world :)

Ĺgrén
04-23-2010, 07:59 AM
I'd like to start a thread about HOW MUCH MORE powerful rendering would be using a GPU instead of a CPU, and also about how to get this into LightWave.

How about GPU and CPU ? :)

silviotoledo
04-23-2010, 08:09 AM
Yeah! I refer to both. But now only one CPU and several GPUs :)

It makes no more sense to use the CPU to render once a basic GPU card is 20x more powerful at rendering than an i7 processor.

Tobian
04-23-2010, 08:36 AM
silviotoledo.. can you back up those numbers? Because pretty much all the examples I have ever seen do not indicate that GPUs are especially, massively faster than CPUs for a lot of tasks. Sure, you can render thousands of polys with very basic lights and very basic shading models in practically real time, but very few GPUs have the re-programmability necessary to do anything else, and by the time you add in all the other expensive shading models that add to 'realism', like radiosity, SSS, blurry reflections and refractions and DOF, there's not a lot in it.

V-Ray with GPU support, Octane and others don't look particularly faster than you would expect FPrime to look on a modern 4-core+ CPU. You also have the huge overhead of memory bandwidth between the CPU and the GPU; only on the most modern i7 motherboards with direct QuickPath links do you even have a realistic chance of avoiding a severe bottleneck that wipes out any speed advantages.

I was as enthusiastic about GPU acceleration as anyone, but I am not really convinced that there is a huge advantage, especially for high-poly work with advanced shaders.

warmiak
04-23-2010, 11:29 AM
I was as enthusiastic about GPU acceleration as anyone, but I am not really convinced that there is a huge advantage, especially for high-poly work with advanced shaders.

If you can code your shader as a fragment program running on the GPU then it will always be much faster than anything you can do on the CPU side of things.

Lightwolf
04-23-2010, 01:35 PM
If you can code your shader as a fragment program running on the GPU then it will always be much faster than anything you can do on the CPU side of things.
If... if it's an advanced shader you can't, and if you're programming the GPU using CUDA or OpenCL you're not likely to be messing with fragment shaders anyway.

Cheers,
Mike

probiner
04-23-2010, 01:43 PM
From what I was told, the GPU is faster for calculations that are simple but very extensive. Right/Wrong?

Lightwolf
04-23-2010, 01:45 PM
From what I was told, the GPU is faster for calculations that are simple but very extensive. Right/Wrong?
Pretty much... dumb but extremely fast.

Cheers,
Mike

Tobian
04-23-2010, 02:00 PM
Yeah that's the problem... They are extremely well optimised for simple repetitive operations, like shading thousands of polygons per frame, but once you start going into the realms of programmability and complex shading operations, the advantages drop away. By all means it's a very good idea to tap into all that potential, but I am not sure it's as good as all that: For example Gelato's been dropped, so obviously it wasn't as good as the PR would have us believe! (that said Nvidia now owns Mental Ray, so maybe it has something in mind? :D)

LazyCoder
04-23-2010, 02:40 PM
http://www.refractivesoftware.com/gallery.html

nuff said.

Tobian
04-23-2010, 02:54 PM
Yep, full of technically impressive renders... like you would expect a gallery promoting a product to have :p It's a very impressive new tool. That said, it's a pure GPU renderer and standalone software, so it's not quite integrated into a software pipeline yet. I'll definitely watch what's going on there, but it doesn't have everything yet, and I am not sure what its limitations are just yet.

monovich
04-23-2010, 02:57 PM
GPU renderers are like flying cars to me. I'll be happy when I can buy one and fly it.

StereoMike
04-23-2010, 03:03 PM
Some time ago (2007) I heard that Mental Images would shift rendering to the GPU (an XSI developer told me). I definitely believe GPU rendering will have its place. Arion, Octane and every other engine with a name has this on the roadmap (CORE as well).

mike

Elmar Moelzer
04-23-2010, 04:53 PM
Well and then there is this:
http://unlimiteddetailtechnology.com/description.html

Now, skepticism is called for here, but that stuff SUPPOSEDLY runs on a single CPU.
To me, both amount to much the same thing:
They are nice tech demos for geeks like us, and we all keep dreaming about realtime movie-quality rendering, like we have been for (help me here) how many years now?
Somehow all this stuff just never materialises. Why? Because the theory and the marketing never quite match the reality, and when you, as a developer, actually have to build a working application around all that, you quickly learn that you have to make decisions. Some things just won't work with others, some things will be limited to one piece of hardware and you will have to target the lowest common denominator, or you can't do that complicated shader there, so you have to do a simpler one. Or you have to work with memory constraints. Or the drivers don't fully support the spec as they should. When are drivers not buggy? So again, you have to make something optional here, or limit some functionality there.
You end up cutting corners here and corners there and finally you have a very complicated product (thus less stable and harder to maintain) that took a long time to develop and that does not quite perform as well as the marketing made everyone believe it would.
Then the technology wheel turns on, and what was the dernier cri in hardware rendering tech is outdated and a dead end, and you have to start ALL OVER again with your renderer, because there is no upgrade path.
You think that I am exaggerating? I have been there with 3DFX. For a few years Glide was THE THING. It was gone quicker than it appeared.
If Nvidia has a few more bad releases like they had recently, and a few more products that don't live up to expectations, they will be the next 3DFX. I would NEVER support CUDA.
DirectX 11 also has its issues, since it is limited to Windows only (so it can't be used for CORE). That leaves OpenCL. Well, OpenCL is nice, but it is not the answer to all our prayers either. Why? Because a whole bunch of people with their own personal interests had their hands in the development of this standard.
And then of course, proper OpenGL/CL driver support is also a big question mark. I won't bet my money on AMD (ATI) ever getting their drivers right before the next generation of their hardware is out.

Rayek
04-23-2010, 05:15 PM
Well, I did buy a license of Octane yesterday, if only to test how it works with my GTX 280. And it IS fast, and incredibly easy to use: I was rendering beautiful images within minutes. It has its drawbacks, but as with any renderer it depends on the job it is used for. For product rendering and architectural rendering it works like a charm, and saves a lot of time when it comes to rendering and setting things up. I really like how you can just use an eye dropper to pick an object's material and start adjusting stuff on the fly - at almost real-time speed.

And, come on: the current beta pricing is almost negligible compared to other commercial renderers, even with the price of an Nvidia card thrown into the equation.

Today I just had FUN rendering all kinds of stuff within minutes at a high quality, rather than waiting hours for rendering and dealing with awkward render settings - plain fun! :)

*edit* Just now checked out the physical sky, which is a doddle to use - couldn't resist adding a screenshot. I changed the parameters to where I live now (Vancouver, Canada) and the current time.
http://img140.imageshack.us/img140/2112/clipboard01qw.th.jpg (http://img140.imageshack.us/i/clipboard01qw.jpg/)

Elmar Moelzer
04-24-2010, 05:32 AM
I am just wondering how much faster it really is than, say, FPrime, which is CPU only.
FPrime was very interactive on complexly shaded scenes many years ago already, when it first came out. It should be very fast on modern CPUs, like a mid-priced Core i7.

warmiak
04-24-2010, 01:02 PM
From what I was told, the GPU is faster for calculations that are simple but very extensive. Right/Wrong?

The key is to have each calculation be independent of the others so they can execute in parallel - if you can achieve that, then the GPU will scream ...

There is no mystery here ... for instance, a card like the GTX 8800 Ultra has about 128 simplified cores vs. your typical CPU's 4 cores.

The latest ATI cards take the concept even further by having about 800 extremely simplified "cores" (I hesitate to even call them cores anymore, since they are extremely simple, few-instruction-only units), but you get the idea.
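To make that concrete, here is a minimal CUDA sketch (illustrative only; the kernel and names are invented for this thread, not taken from any shipping renderer) of the kind of work those simplified cores are good at: a trivial Lambert-style shading pass where every pixel is computed independently, so thousands of threads can all run the same tiny function at once.

```cuda
// Minimal data-parallel CUDA sketch: one thread per pixel, every thread runs
// the same tiny shading function on independent data. Illustrative only.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void lambertKernel(const float3* normals, float3 lightDir,
                              float* shade, int numPixels)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numPixels) return;                 // guard the last partial block

    // Simple N.L diffuse term: no recursion, no shared state, hardly any
    // branching - exactly the "dumb but extremely fast" case.
    float3 n = normals[i];
    float d = n.x * lightDir.x + n.y * lightDir.y + n.z * lightDir.z;
    shade[i] = d > 0.0f ? d : 0.0f;
}

int main()
{
    const int numPixels = 1 << 20;              // ~1 million pixels
    float3* dNormals;
    float*  dShade;
    cudaMalloc(&dNormals, numPixels * sizeof(float3));
    cudaMalloc(&dShade,   numPixels * sizeof(float));
    cudaMemset(dNormals, 0, numPixels * sizeof(float3));   // placeholder data

    float3 lightDir = { 0.0f, 0.0f, 1.0f };
    lambertKernel<<<(numPixels + 255) / 256, 256>>>(dNormals, lightDir,
                                                    dShade, numPixels);
    cudaDeviceSynchronize();
    printf("shaded %d pixels\n", numPixels);

    cudaFree(dNormals);
    cudaFree(dShade);
    return 0;
}
```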

Lightwolf
04-24-2010, 02:45 PM
The key is to have each calculation be independent of the others so they can execute in parallel - if you can achieve that, then the GPU will scream ...
Not quite - that's true for any kind of multi-threading. GPUs pose a lot of other problems though (a simple one being recursion), as well as in the way the "cores" are actually used and made available.

There's a reason as to why the research on getting traditional techniques working on GPUs is still ongoing (kd-trees being an example here).

As a simple example: I've seen performance claims for GPU code vs. CPU code of around 50 times faster execution (image processing, which is easier to begin with).
Upon closer inspection, the routine used was brute force, which is pretty much the only way to get it to run on a GPU, but certainly the worst approach on a CPU (more effective techniques have been around for decades). In this case it was an algorithm to find the closest point in a massive point cloud to a specific coordinate.
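As a hedged sketch of that brute-force closest-point search (the names and sizes are invented for illustration, this is not the code being described): one GPU thread computes the distance to each point, which parallelises trivially, while a sensible CPU implementation would build a kd-tree and never touch most of the points at all.

```cuda
// Hedged sketch of a brute-force closest-point search: every thread computes
// one distance, so the GPU stays busy, but a CPU kd-tree would skip most of
// this work entirely. Illustrative names and data only.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

__global__ void distSqKernel(const float3* points, float3 query,
                             float* distSq, int numPoints)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numPoints) return;

    float dx = points[i].x - query.x;
    float dy = points[i].y - query.y;
    float dz = points[i].z - query.z;
    distSq[i] = dx * dx + dy * dy + dz * dz;    // brute force: touch every point
}

int main()
{
    const int numPoints = 1 << 20;              // ~1 million points
    std::vector<float3> cloud(numPoints);
    for (int i = 0; i < numPoints; ++i)         // dummy point cloud
        cloud[i] = { (float)i, 0.0f, 0.0f };

    // Typical GPU pattern: upload everything up front, run the kernel,
    // copy the result back - there is no calling into the CPU mid-kernel.
    float3* dPoints;
    float*  dDistSq;
    cudaMalloc(&dPoints, numPoints * sizeof(float3));
    cudaMalloc(&dDistSq, numPoints * sizeof(float));
    cudaMemcpy(dPoints, cloud.data(), numPoints * sizeof(float3),
               cudaMemcpyHostToDevice);

    float3 query = { 123456.7f, 0.0f, 0.0f };
    distSqKernel<<<(numPoints + 255) / 256, 256>>>(dPoints, query,
                                                   dDistSq, numPoints);

    std::vector<float> distSq(numPoints);
    cudaMemcpy(distSq.data(), dDistSq, numPoints * sizeof(float),
               cudaMemcpyDeviceToHost);

    int best = 0;                               // a real version would reduce on the GPU
    for (int i = 1; i < numPoints; ++i)
        if (distSq[i] < distSq[best]) best = i;
    printf("closest point index: %d\n", best);

    cudaFree(dPoints);
    cudaFree(dDistSq);
    return 0;
}
```

On the CPU, a kd-tree built over the same cloud would answer the query after visiting only a handful of nodes, which is why a "50x faster than the CPU" claim measured against a brute-force CPU loop can be misleading.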

Cheers,
Mike

silviotoledo
04-24-2010, 05:51 PM
Tobian

I'm very interested in the possibility of having massive rendering power at low cost for future projects. But I would like to see similar results for simple skin (SSS) and other shaders done on the GPU too.
The solution, anyway, seems to be perfect for architectural visualization at this moment.
I have not tried it yet. I just would like to learn more about the process, and the tech info I'm collecting here is great.

Lightwolf

Since all the data ends up as ON and OFF energy (0,1 binary), why would it be difficult to break complex functions into simple ones?

Rayek

The render looks beautiful! Please keep us informed about your progress.
Can't SSS be simulated with a surface thickness and several layers of material?
Gerard Strada has done nice samples with primitive techniques back when there were no special shaders. Is there a way to simulate things like this in Octane?

Elmar

Yeah, FPrime is amazing. But it does not have the same photoreal look (radiosity or whatever it is, physically correct) that Octane and Arion (and also Maxwell) have.
Mr. Worley is a genius, but it seems the FPrime era is ending. I would consider the possibility of having VoluMedic working with the GPU too, but maybe you should wait for CORE support for CUDA. Wouldn't VoluMedic data be readable by Octane?
I think until quantum processors are available, the GPU seems to be the best solution for massive processing.

I've heard about a lot of studios which used in-house solutions and GPU devices to accelerate rendering.

Ĺgrén
04-24-2010, 06:09 PM
Mr. Worley is a genius, but it seems the FPrime era is ending.


He has been writing a GPU-based version for a long time, according to a rumor.

Lightwolf
04-24-2010, 06:25 PM
Since all the data ends up as ON and OFF energy (0,1 binary), why would it be difficult to break complex functions into simple ones?

The problem is that a simplistic approach doesn't warrant the complexity of the difference. It's like saying "Hey, a movie is just light on film, it can't be that hard to create". ;)

Basically though GPUs are more or less designed to execute exactly the same code on multiple sets of data at the same time.
This could be running the same rendering loop on different image pixels, but using the same set of data to render (i.e. geometry).
However, one part of the complexity is that the multiple cores on the GPU need to run the same code at the same time. That's quite different from a CPU where multiple cores can run anything at any time.
I.e. one core is still busy tracing a ray while another one shades a certain surface, the third is computing AA and a fourth one is subdividing micro-polygons (this is a scenario that any bucket renderer can handle).

On a GPU the cores are bunched together into larger clusters, and any cluster can only run one piece of code at any one time. You want to compute something different, wait until all cores have finished their task.
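A hedged CUDA sketch of that "one piece of code per cluster" constraint (function names invented for illustration): in CUDA terms, threads are grouped into 32-wide warps that execute in lockstep, so when threads inside a warp disagree at a branch, the hardware runs both branches one after the other with the non-participating threads masked off. That is one reason branch-heavy "smart" algorithms lose much of their advantage on this kind of hardware.

```cuda
// Hedged sketch of warp divergence (invented function names, illustrative
// only). All 32 threads of a warp execute in lockstep; if they disagree at
// the branch below, the warp runs path A and then path B serially, masking
// off the threads that did not take each path.
#include <cuda_runtime.h>

__device__ float expensivePathA(float x) { return sinf(x) * sinf(x); }
__device__ float expensivePathB(float x) { return sqrtf(fabsf(x)) + x * x; }

__global__ void divergentKernel(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    if (in[i] > 0.5f)                       // neighbouring threads may disagree here...
        out[i] = expensivePathA(in[i]);     // ...so the warp executes path A...
    else
        out[i] = expensivePathB(in[i]);     // ...and then path B, one after the other.
}

int main()
{
    const int n = 1 << 20;
    float *dIn, *dOut;
    cudaMalloc(&dIn,  n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemset(dIn, 0, n * sizeof(float)); // placeholder input
    divergentKernel<<<(n + 255) / 256, 256>>>(dIn, dOut, n);
    cudaDeviceSynchronize();
    cudaFree(dIn);
    cudaFree(dOut);
    return 0;
}
```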

It is also still hard (i.e. OpenCL) to actually break down a complex scenario into smaller pieces of code that can be called arbitrarily.

You also can't just mix and match CPU and GPU code (i.e. you're shading a surface with a texture and that texture needs to be swapped in from the HD via the CPU). Once the code is running on the GPU, it needs to run and finish there.

That's also why you can't just "port" a renderer. You have to completely re-think and re-design every single concept to make it work.

Then there's also the question of whether you combine OpenGL/DX-based rendering with GPU-style code (as StudioGPU does, afaik) or only program the GPU directly.

But basically, the smarter the code and the algorithm and the more flexible it is... the harder it is to recreate on a GPU-style system. And production renderers have reached an extremely high level of sophistication in that regard... they're very smart (they reduce computation as much as possible) and flexible (since that's what production rendering is about).

To insert a simplistic car analogy here... you've got hundreds of tiny dragsters racing a couple of normal passenger cars.
And you can surely do your shopping, get to work or drive your kids to school with the dragsters as well - the common theme being: they've got wheels ;)

Cheers,
Mike

Rayek
04-24-2010, 08:49 PM
The render looks beautiful! Please keep us informed about your progress.
Can't SSS be simulated with a surface thickness and several layers of material?
Gerard Strada has done nice samples with primitive techniques back when there were no special shaders. Is there a way to simulate things like this in Octane?

This version does have mappable thin film coatings, which work very well for cars and stuff. Remember, this is still a beta version, and while it works extremely well, some things relating to materials have not been implemented yet - it shouldn't take too long, though, since the developers have completed most of the features on their v1 list so far.

Material-wise this is their list for v1:


Physically based material models

Bump and normal mapping

Opacity / Alpha mapping

Mappable Thin Film Coatings

Complex IOR **

Null material and mix/stacking of materials and layers**

Absorption and transmittance**

Chromatic dispersion**

** means it still needs to be added.
Here's a quick example of the thin film in action:

http://img688.imageshack.us/img688/6196/clipboard01qq.th.jpg (http://img688.imageshack.us/i/clipboard01qq.jpg/)

This only took about 30 secs. Granted, still somewhat noisy, I overdid it, and there are a couple of fireflies (they are working on that). The model consists of about 700k polys. Setting up the materials and lighting took me about a minute - I did not use path tracing, though.

But why take my word for this? The demo is almost fully functional, albeit limited in resolution and the fact you cannot save your work.

This version still has problems with fireflies, although changing the settings can prevent them somewhat. The final version will have got rid of them, according to the developers. Still, I wanted to show you these because it might not be 'production ready' in certain circumstances - remember, it IS a beta.

http://img406.imageshack.us/img406/5962/fireflies.th.jpg (http://img406.imageshack.us/i/fireflies.jpg/)

A 3-minute render with full HDR lighting and path tracing:
http://img265.imageshack.us/img265/4601/clipboard01zt.th.jpg (http://img265.imageshack.us/i/clipboard01zt.jpg/)

Elmar Moelzer
04-25-2010, 04:11 AM
Yeah, FPrime is amazing. But it does not have the same photoreal look (radiosity or whatever it is, physically correct) that Octane and Arion (and also Maxwell) have.
Actually, I think that FPrime is unbiased (Monte Carlo method). AFAIK it is just not getting the same results because it is trying to produce results that match LW's instead of doing its own thing.
I may be wrong here though. Radiosity is not my area of expertise.

On VoluMedic using the GPU: we already employ texture-based volume rendering on the GPU for the interactive previews in the OpenGL viewports. These are somewhat limited though, compared to what the software renderer can do. I do NOT want to use CUDA, since it is vendor specific and is therefore poised to go the way of Glide one day. Since developing a renderer is a significant investment, I am not willing to bet all that money on Nvidia's capability to keep the leadership (which has already suffered quite a lot in the last two years). If anything, I would consider OpenCL, but even there I want to wait at least one more generation before making a decision in that regard (I won't bet my money on ATI/AMD ever getting their drivers right before the next generation of cards is already out).

Rayek
04-25-2010, 11:12 AM
I do NOT want to use CUDA, since it is vendor specific and is therefore poised to go the way of Glide one day. Since developing a renderer is a significant investment, I am not willing to bet all that money on Nvidia's capability to keep the leadership (which has already suffered quite a lot in the last two years). If anything, I would consider OpenCL, but even there I want to wait at least one more generation before making a decision in that regard (I won't bet my money on ATI/AMD ever getting their drivers right before the next generation of cards is already out).

The developers of Octane have stated that as soon as AMD's OpenCL drivers have matured, they will implement OpenCL in their software as well. So hopefully their renderer is 'future-proof'.

Intuition
04-25-2010, 12:17 PM
Yeah, FPrime is amazing. But it does not have the same photoreal look (radiosity or whatever it is, physically correct) that Octane and Arion (and also Maxwell) have.

FPrime doesn't have LWF controls built in, no. But it is capable of the same gamma workflow look as any other render engine. I mean, all engines have their own look, and that is just how things differ under the hood, but overall each engine can replicate a "look" if used properly. FPrime does not have LWF controls built into the UI, but you can do the old linear/sRGB method as in LightWave classic if you want to. It's just harder to preview the final result.

FPrime and LW have some sort of reflection/refraction blurring issues that other engines seem to handle a little better, but the overall clean photoreal look can be attained with LWF. You just have to use the force, since FPrime won't let you see the 2.2 result in the window.

Rayek
05-13-2010, 06:28 PM
There's an insane guy on the Refractive Software forum who has installed 7(!) GTX 480 graphics cards in his office machine. That roughly translates to 5000 pixel samples with path tracing at 720p resolution per frame in about 1 minute: that's 720 frames in 12 hours.

Basically a renderfarm of 350 quad-core PCs in one box.

The newest beta now supports multi gpu rendering.

http://img163.imageshack.us/img163/8561/clipboard01txi.jpg (http://img163.imageshack.us/i/clipboard01txi.jpg/)

Really - completely insane ($20000)


pixym
05-13-2010, 07:11 PM
…Octane Render is the only GPU application I know of that is compatible with LightWave.
Arion seems to be more compatible because it uses the FryRender exporter, which works in Layout.

With Octane you have to export your scene via the OBJ file format. To do that, you have to export the Layout scene as Collada, then open it in Modeler, then export as OBJ with all your textures in the same folder as the OBJ file…

pixym
05-13-2010, 07:19 PM
He has been writing a GPU-based version for a long time, according to a rumor.
I have read several posts of his on a CUDA forum…

Hieron
05-13-2010, 08:00 PM
A 3-minute render with full HDR lighting and path tracing:
http://img265.imageshack.us/img265/4601/clipboard01zt.th.jpg (http://img265.imageshack.us/i/clipboard01zt.jpg/)


Could be me, but that doesn't sound amazingly fast for that shot.

edit: tried the demo quickly, with a rather hefty model. I'm not convinced it has the speed per se, but perhaps I should spend more time on it one day.

pixym
05-13-2010, 08:05 PM
In fact it is not the shot, but the render method used (path tracing)

gristle
05-14-2010, 03:06 AM
I won a lifetime license of Octane in the first contest, and have done several tests with it. I kind of left it alone for a while because of the vertex normal issue, but it looks like that is now fixed (I have not tried it yet personally).

Speed-wise, it is pretty quick on scenes with minimal transparency.
A lot quicker than FPrime is on my 920 quad.

However, I ran a test model of a piece of cut crystal and it was painfully slow to resolve noise. Probably slower than FPrime was for the same object.
Of course, once the MLT goodies get added this might change.

Because Octane is pretty new and has not yet had several features added, I think you cannot do a real-world comparison. It will be interesting to see if features like SSS slow things down. Still, it's good to see this appearing, especially at the price it is.