Rendering Volumetrics



younglion
02-17-2005, 06:50 PM
Hello everyone!
I did a little searching on this but couldn't find anything. I just recently got a G5 and put the latest LightWave (8.2) on it. I loaded a scene that I made in LightWave 7.5 (on a PC) onto the Mac and noticed that it's rendering volumetrics a lot slower than the PC did. One frame took no more than a minute to render on the PC, whereas on the Mac it took over 7 minutes! The PC has dual 2 GHz Xeon processors and 2 GB of RAM; the Mac is a dual 2 GHz G5 with 3.5 GB of RAM, so I figured the Mac would perform better, which it does, except with volumetrics. If anyone has any info about this, your help would be greatly appreciated. Thanks.

brayne
02-17-2005, 09:42 PM
How many threads are you using on the Mac version?

harlan
02-18-2005, 01:30 AM
Threads aren't a "set and forget" option. Some scenes may render faster with the thread count set to 1, while others will render faster with a higher thread setting. Regardless of the logic involved, I always test my scenes with each thread setting to find the best performance for that scene. You'd be surprised how often a setting of 1 will outperform a higher setting.

brayne
02-18-2005, 04:25 AM
This is true, however on a dual processor G5, volumetrics will benefit greatly from a higher thread setting.

marinello2003
02-18-2005, 08:49 AM
I just got a new dual 2.5 GHz PowerMac with 8 GB of RAM, and I am currently rendering a volumetrics scene of an atom-bomb explosion I made, and I am amazed at how slowly it is going at 320x240 resolution. It is currently rendering: 6 hours and counting! I do have an older 2.0 GHz P4 Dell; however, it could not render this at all. I figured the new Mac could.

I have found that the Mac is incredibly fast at rendering anything else! It can render very large scenes at 1920x1080 about 6-8 times faster than my single-processor 1.4 GHz G4 Mac or my 2.0 GHz P4 Dell. Volumetrics do seem to pose a problem! I really hope that NewTek updates the code there very soon.

I am currently testing the 'Thread setting' to see if I notice a difference. If anyone knows or can explain why a certain thread setting is better for one scene vs. another, I would appreciate the explanation. Right now it is just voodoo.

-Brent

brayne
02-18-2005, 09:20 AM
Brent, I'll do my best to explain it, but I can't guarantee anything!

A thread is like a self-contained bunch of instructions. If you have a multi CPU computer, and you set LightWave to only one thread, the whole render is being bundled into one set of instructions, and the Mac OS can't split it up, so only one of your CPUs will be used.

If you set LightWave to two threads, now the Mac OS can give one thread to the first CPU, and the other thread to the second CPU.

OK, that's the theory. Now, when it's put into practice there are many factors at work. When LightWave splits a rendering task into threads, it can't distribute the work based on how long each part will take; it just slices the scene up. So now you have a situation where each thread is working on a part of the scene, but one thread may take longer to finish than another. The problem with this is that LightWave can't start a new render task until all threads are complete, so even though thread 1 may have finished 5 minutes ago, it's now doing nothing while we wait for thread 2 to finish. Sometimes breaking the scene up into more threads will fix this, but on the flip side, too many threads can actually slow the Mac OS down by overloading it.

So, what's the answer? I'm still looking. It varies a lot whether you have a single or dual CPU computer, and it also varies based on your computer configuration and background workload. It also varies based on the scene you are rendering.

I can't really offer you more than that. I think testing the scene beforehand is really the only sure way. My dual G5s are usually set on 8, but I send most of the big stuff to a render farm.

Hope this doesn't just confuse the hell out of you!
Bruce
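
Editor's sketch: the advice above to test each thread setting before committing to a long render can be automated with a small timing loop. This is only an illustration in Python. `render` is a hypothetical callable standing in for however you launch a one-frame test render, and `fake_render` is a placeholder so the sketch runs on its own; neither is part of LightWave.

```python
import time

def time_thread_settings(render, settings=(1, 2, 4, 8)):
    """Run `render(threads)` once per setting and return {threads: seconds}."""
    results = {}
    for threads in settings:
        start = time.perf_counter()
        render(threads)  # hypothetical hook: render one test frame with this thread count
        results[threads] = time.perf_counter() - start
    return results

def best_setting(results):
    """Return the thread count with the shortest measured time."""
    return min(results, key=results.get)

def fake_render(threads):
    # Placeholder workload so the sketch is runnable; the timings are arbitrary.
    time.sleep(0.01 * (1 + abs(threads - 4)))

if __name__ == "__main__":
    timings = time_thread_settings(fake_render)
    for threads, seconds in timings.items():
        print(f"{threads} thread(s): {seconds:.3f} s")
    print("fastest setting:", best_setting(timings))
```

Because background workload also affects the result, it is probably worth repeating the measurement a few times per setting before trusting the winner.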

marinello2003
02-18-2005, 09:28 AM
Thanks!

BTW, what kind of renderfarm?

-Brent

brayne
02-18-2005, 09:33 AM
BTW, what kind of renderfarm?
Just a bunch of Macs all networked together, running ScreamerNet and being controlled by this really nifty piece of software called RenderFarm Commander (http://www.brucerayne.com/renderfarm.html) ;)

Bruce

toby
02-18-2005, 03:57 PM
Also with threading it helps to keep in mind that it takes a certain amount of processor time to split the data into threads; the more threads, the more time. For example, if you have a simple scene that renders very fast, 1-2 sec., multi-threading will likely slow it down.

In my experience, it takes a different amount of time to split different kinds of data, like Hypervoxels. Try rendering one small HV on 1 thread, then with 2. With 2 threads it will take many times longer to render.

Like everyone's been saying, testing with different threads is the way to go. It's different on different machines too.

Volumetrics can be sped up greatly by using shadow-maps instead of ray-traced shadows.
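
Editor's sketch: the overhead argument above can be put into a toy cost model. Everything here is an assumption made for illustration: the formula (a fixed cost per thread plus the work shared among the CPUs actually in use), the parameter names, and the example numbers. It is not a description of how LightWave splits a scene, and it ignores uneven load between threads.

```python
def estimated_render_time(work, threads, cpus, overhead_per_thread):
    """Toy model: per-thread setup cost plus work divided over the CPUs in use."""
    cpus_in_use = min(threads, cpus)  # extra threads cannot use CPUs that do not exist
    return overhead_per_thread * threads + work / cpus_in_use

if __name__ == "__main__":
    cpus = 2
    overhead = 0.5  # assumed seconds of splitting cost per thread
    for label, work in (("tiny scene", 0.5), ("heavy scene", 70.0)):
        for threads in (1, 2, 8):
            seconds = estimated_render_time(work, threads, cpus, overhead)
            print(f"{label}, {threads} thread(s): {seconds:.2f} s")
```

In this model the tiny scene gets slower with every added thread, while the heavy scene gains from the second CPU and then slowly loses ground again as threads (and their overhead) pile up.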

younglion
02-18-2005, 10:18 PM
Hey thanks everyone for the tip.
I knew there was some setting for that but I had forgotten. Anyway, the scene was set to 4 threads, so I put it on one thread and it works fine. I tried setting it to 2 or 8 and everything rendered really fast except the hypervoxels. I did find, however, that at 4 threads it rendered a scene with radiosity turned on surprisingly fast (compared to usual). This is definitely helpful info for using all your processing power. :D

Captain Obvious
02-19-2005, 04:13 PM
As for the overhead with multiple threads, I just did a test rendering of a 200k polygon scene with backdrop radiosity, and it took 97 seconds with one thread and 104 seconds with eight threads. Note that this is on a single-processor, single-thread computer. The overhead doesn't strike me as very big.

brayne
02-19-2005, 05:38 PM
I just did a test on a scene with nothing but hypervoxels in it, and here are the results:

1 Thread: 75 sec
2 Threads: 46.5 sec
4 Threads: 44.5 sec
8 Threads: 39.9 sec

That's on a dual processor G5.

I think this exercise has shown us all a few conclusive things: don't treat a single-CPU computer the same as a dual-CPU computer, and do a test first!

Bruce
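
Editor's sketch: the four timings above can be turned into speedup figures with a few lines of Python. The numbers are the ones reported in the post; the function name and the "fraction of ideal" measure (speedup divided by the two CPUs of the test machine) are illustrative choices.

```python
def speedups(timings):
    """Map each thread count to (time at 1 thread) / (time at that count)."""
    baseline = timings[1]
    return {threads: baseline / seconds for threads, seconds in timings.items()}

# Seconds per frame as reported in the post above (dual-processor G5).
reported = {1: 75.0, 2: 46.5, 4: 44.5, 8: 39.9}
cpus = 2

for threads, factor in speedups(reported).items():
    # Compare the measured speedup with the ideal of 2x on two CPUs.
    print(f"{threads} thread(s): {factor:.2f}x faster, {factor / cpus:.0%} of ideal")
```

With these figures, 2 threads come out at roughly 1.6x and 8 threads at roughly 1.9x, which is close to the 2x ceiling of a two-CPU machine.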

toby
02-22-2005, 02:27 PM
I'm guessing you had enough HV to make multi-threading worthwhile - try this:

default scene
add null
add default hypervoxel

On my dual 2.4 GHz PC:
8 threads - 4.8 seconds
2 threads - 2.2 seconds
1 thread - 0.7 seconds

And I think it was Cpt. Obvious who pointed out that single-CPU Macs don't really benefit from multi-threading, and blanos.com seems to back that up: all the top-performing single CPUs were set to 1 thread.

Captain Obvious
02-22-2005, 02:53 PM
If a single-threaded computer ever gains anything from using multiple threads, something is very odd indeed. ;) Currently, only Pentium 4 systems are capable of running several threads on a single processor. They gain quite a bit (about 25% in many renderers) when you turn on multiple threads in the renderer and HyperThreading... but that your dual-processor system doesn't gain anything at all from multiple CPUs is somewhat odd. But with so short render times, maybe it's the overhead from having multiple threads causing it? Add lots more hypervoxels and try again, please! :D

toby
02-23-2005, 11:20 AM
yea, that was an example of overhead, and my theory of different kinds of data taking longer to split -

I thought multi-threading could help on a single because of Altivec, using spare cycles, but I guess you never know -

Ge4-ce
02-23-2005, 02:45 PM
Also, here I go again with my dual LW layout setup..

Also a thing to keep in mind.

I have a dual 2.0 G5.

I duplicated the LW Layout program, shut down the Hub, and set the threads to 1 in each copy.

I let those babies render at the same time and guess what?! In 99% of cases, when I render, say, a single frame in 1 minute on Layout-1 and then launch the render on Layout-2, I also get a 1 minute render!

So basically, you have gained 100% advantage from the second processor. This is also very noticeable in the Activity Monitor app. When I let only 1 Layout render, with 1, 2, 4 or 8 threads, I saw a maximum of 170% processor use, but mostly about 150%. When I use the 2 Layouts together, I see between 190% and 200% of the processors used. (Each processor counts for 100%.)

Keep in mind, I sport 2.5 GB of RAM, not "that" much, but if you do this with only 512 MB of RAM, it will slow you down more than you can gain.

Secondly, it takes about 2 minutes to set up the second Layout (loading the scene, altering the frames you want rendered, checking, ...), so don't go through all this fuss when you only have 50 frames to render at 3 s/frame.

May the Ge4-ce be with you... always....
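
Editor's sketch: the closing caveat about setup time is simple arithmetic, shown here in Python. The 2-minute setup and the 50 frames at 3 s/frame come from the post; the assumptions that the frames split evenly between the two copies and that each copy keeps the single-copy frame time are mine, as are the function names.

```python
def single_instance_seconds(frames, sec_per_frame):
    """Total time when one copy of Layout renders every frame."""
    return frames * sec_per_frame

def dual_instance_seconds(frames, sec_per_frame, setup_seconds=120):
    """Total time with two copies: one-off setup, then half the frames each.

    Assumes an even split and no slowdown from running two copies at once.
    """
    return setup_seconds + (frames * sec_per_frame) / 2

if __name__ == "__main__":
    for frames, sec_per_frame in ((50, 3), (500, 60)):
        one = single_instance_seconds(frames, sec_per_frame)
        two = dual_instance_seconds(frames, sec_per_frame)
        print(f"{frames} frames at {sec_per_frame} s: one copy {one} s, two copies {two} s")
```

Under these assumptions the second copy only pays off once the whole job takes more than twice the setup time, i.e. more than 4 minutes of rendering.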

Captain Obvious
02-23-2005, 04:00 PM
yea, that was an example of overhead, and my theory of different kinds of data taking longer to split -

I thought multi-threading could help on a single because of Altivec, using spare cycles, but I guess you never know -
First of all, I recently learned that LW generally doesn't use AltiVec (don't blame NewTek for it) for rendering, so it wouldn't help at all. And regardless, having two AltiVec units (or any other unit, for that matter) never ever lets you run several threads at once. You have to have SMT (or "HyperThreading") for it. I can't think of a good car analogy right now, but if I think of one later I'll post it. ;)

brayne
02-23-2005, 04:11 PM
So basically, you have gained 100% advantage from the second processor. This is also very noticeable in the Activity Monitor app. When I let only 1 Layout render, with 1, 2, 4 or 8 threads, I saw a maximum of 170% processor use, but mostly about 150%. When I use the 2 Layouts together, I see between 190% and 200% of the processors used. (Each processor counts for 100%.)

This is why I render most of my stuff with a render farm. By running two simultaneous copies of ScreamerNet, both set to one thread, you get the same sort of advantage, without the overheads of running the whole LightWave application. Once again, this is for a dual processor Mac - running two simultaneous copies on a single processor Mac will give you no advantage at all.
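
Editor's sketch: whether you run two copies of Layout or two render nodes, the frames of the animation have to be divided between them somehow. The Python below shows one simple way, dealing frames out round-robin. It is an illustration only and does not describe how ScreamerNet itself hands out frames; `split_frames` is a hypothetical helper.

```python
def split_frames(first, last, nodes):
    """Deal frames first..last (inclusive) out to `nodes` renderers, round-robin."""
    return [list(range(first + i, last + 1, nodes)) for i in range(nodes)]

if __name__ == "__main__":
    for node, frames in enumerate(split_frames(1, 10, 2), start=1):
        print(f"node {node}: {frames}")
```

Interleaving is chosen here rather than giving each renderer one contiguous half, so that a slow stretch of the animation is shared between the renderers instead of landing on just one of them.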

toby
02-23-2005, 09:53 PM
First of all, I recently learned that LW generally doesn't use AltiVec (don't blame NewTek for it) for rendering, so it wouldn't help at all. And regardless, having two AltiVec units (or any other unit, for that matter) never ever lets you run several threads at once. You have to have SMT (or "HyperThreading") for it. I can't think of a good car analogy right now, but if I think of one later I'll post it. ;)

Odd, for 2 reasons, for one I'm d-a-m-n sure Newtek said it was coded for Altivec, and two, why can't you take advantage of spare cycles if you've got more threads - or at least when you're running 2 different apps? If it's too technical for this forum I understand, I take your word for it, I'm just curious...

Ge4-ce
02-24-2005, 02:19 AM
This is why I render most of my stuff with a render farm. By running two simultaneous copies of ScreamerNet, both set to one thread, you get the same sort of advantage, without the overheads of running the whole LightWave application. Once again, this is for a dual processor Mac - running two simultaneous copies on a single processor Mac will give you no advantage at all.

Correct, it's the same thing. Only I do not like screamernet at all. I once tried to set it up, and it worked, but it was not as easy as I wanted it to be. Screamernet is not user-friendly (just my opinion)


I forgot one thing when using the double-Layout setup or the ScreamerNet solution: always bake particles and motions that could be randomised (like most). There are also a bunch of nifty textures that tend to randomise in motion (the moving textures like ripples, for example). If you don't bake them, you will see a jump in textures or particles when you glue the frames together in editing later on.

And I can sure say this: when you have spent 20 hours rendering an explosion, and you didn't bake the motions, and there is such a jump... you could hit yourself with a truck. It really sucks. It happened to me once. Never again.
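
Editor's sketch: one plausible reason for the jump described above is that an unbaked effect depends on state built up since the start of the animation, so a renderer that starts halfway through does not reproduce the same values. The Python below only illustrates that idea with a random number generator; the function name and the model are assumptions, not a description of LightWave's particle or texture code.

```python
import random

def simulate(frames, rng):
    """Hypothetical unbaked effect: each frame's value depends on the RNG state so far."""
    return {frame: rng.random() for frame in frames}

# "Baked": computed once for the whole animation, then reused by every renderer.
baked = simulate(range(1, 11), random.Random(42))

# Unbaked: each render instance re-simulates its own half from a fresh start.
first_half = simulate(range(1, 6), random.Random(42))
second_half = simulate(range(6, 11), random.Random(42))

print("frame 5 matches baked:", first_half[5] == baked[5])
print("frame 6 matches baked:", second_half[6] == baked[6])
```

The first instance agrees with the baked result, but the second one restarts the sequence at frame 6, which is where the visible jump would appear once the two halves are joined.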

brayne
02-24-2005, 04:27 AM
Correct, it's the same thing. Only I do not like screamernet at all. I once tried to set it up, and it worked, but it was not as easy as I wanted it to be. Screamernet is not user-friendly (just my opinion)

Yes, you're quite right. ScreamerNet is extremely user-unfriendly. Of course there is a utility out there designed to make ScreamerNet easy...

Nope, I'd better stop myself there before I turn this into an advertisement for my own product ;)

Ge4-ce
02-24-2005, 04:44 AM
Yes, you're quite right. ScreamerNet is extremely user-unfriendly. Of course there is a utility out there designed to make ScreamerNet easy...

Nope, I'd better stop myself there before I turn this into an advertisement for my own product ;)

Yeah, you're right, I saw your product before, and I even downloaded the free version once. But I never tested it. I've read some good comments about your product. Once I have some more computers in my inventory, I will buy the pro version. Because the basic version is useful, but also a teaser for the other products, of course. Hey! I do find it just awesome that you give this basic version away for free! I'm a student, however, and every eurocent must be saved. :(

I still have to buy the Adobe CS package (the premium) before I graduate this year. Then I have all the official software needed to start working. boy software is expensive :(

Captain Obvious
02-24-2005, 05:39 PM
Odd, for 2 reasons, for one I'm d-a-m-n sure Newtek said it was coded for Altivec, and two, why can't you take advantage of spare cycles if you've got more threads - or at least when you're running 2 different apps? If it's too technical for this forum I understand, I take your word for it, I'm just curious...
A processor can only keep track of one thing at a time, essentially. It generally cannot send out instructions for two threads at once.

LightWave's renderer supposedly uses double-precision floating point for rendering; AltiVec only does single-precision floating-point calculations.
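
Editor's sketch: the difference between the two precisions can be seen by rounding a double-precision value to single precision and back. This Python snippet uses only the standard `struct` module; `to_single` is a helper name made up for the example, and the snippet says nothing about LightWave's internals.

```python
import struct

def to_single(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

value = 0.1
as_single = to_single(value)

print("double:", repr(value))
print("single:", repr(as_single))
print("difference:", abs(value - as_single))
```

Values that are exactly representable in both formats, such as 0.5, survive the round trip unchanged; most others pick up a small error at around the seventh or eighth significant digit.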

toby
02-24-2005, 08:50 PM
But then 8 threads shouldn't render faster on a dual...? :confused:

Ozzie
03-01-2005, 09:59 PM
Yeah, you're right, I saw your product before, and I even downloaded the free version once. But I never tested it. I've read some good comments about your product. Once I have some more computers in my inventory, I will buy the pro version. Because the basic version is useful, but also a teaser for the other products, of course. Hey! I do find it just awesome that you give this basic version away for free! I'm a student, however, and every eurocent must be saved. :(
I am using RenderFarm Commander at work. We have a room full of dual 2 GHz G5 machines. I can hook up to 10 of them into my Mac using Personal File Sharing. Yesterday I was rendering 150 frames at 2 min 30 seconds per frame per machine. The whole thing was done in under an hour. On a single machine it would have taken over 6 hours. The best part is that I can work in Layout and Modeler whilst RenderFarm Commander is rendering with my machine as a render node. It is so easy to use. Nice work Bruce. Cheers mate.

Mark
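
Editor's sketch: the farm figures above check out with simple arithmetic. The frame count and per-frame time are from the post; the exact number of machines used is not stated, so the machine counts below are assumptions, as is the idea that every machine renders at the same speed with no network overhead.

```python
import math

def farm_minutes(frames, minutes_per_frame, machines):
    """Wall-clock minutes when frames are shared evenly among identical machines."""
    frames_per_machine = math.ceil(frames / machines)  # the busiest machine sets the pace
    return frames_per_machine * minutes_per_frame

if __name__ == "__main__":
    for machines in (1, 10):
        minutes = farm_minutes(150, 2.5, machines)
        print(f"{machines} machine(s): {minutes} min ({minutes / 60:.2f} h)")
```

One machine needs 375 minutes (6.25 hours), while ten machines finish in 37.5 minutes, which fits "over 6 hours" and "under an hour".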

brayne
03-01-2005, 10:24 PM
The best part is that I can work in Layout and Modeler whilst Renderfarm Commander is rendering with my machine as a render node. It is so easy to use. Nice work Bruce. Cheers mate.

Thanks for that Mark.

Wait until you see what I'm adding to RFC right now... ;)

Darrell
03-02-2005, 05:45 PM
I've rendered scenes with volumetrics at 720x420 in under 2 minutes on a G5... so I wouldn't be so quick to say it's a hardware problem... I'll go back and look at my settings and post them on here some day, and show my render times to prove it.

Captain Obvious
03-03-2005, 06:29 AM
But then 8 threads shouldn't render faster on a dual...? :confused:
Well, sometimes. Since LW isn't a bucket renderer, it simply divides the image into two, four or eight different parts and assigns one thread per part. If you have a dual-processor computer (i.e., capable of rendering two threads at once) and set the threads to 2, sometimes one of the two parts will contain much more stuff to render than the other, so one of the threads will be completed long before the other. That means that after the first part is completed, one of the processors will sit idle while the other processor keeps working on the more complex part. If you set the threads to 4 or 8 instead, you can avoid this problem.
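
Editor's sketch: the idle-processor effect described above can be simulated in a few lines of Python. This is a simplified model under stated assumptions: the image is cut into equal-height strips, each row has a made-up render cost, and an idle CPU simply picks up the next strip. A real operating system time-slices all threads instead, but the finish time is governed by the same idea. All names and numbers are hypothetical.

```python
import heapq

def split_evenly(row_costs, parts):
    """Cut the image rows into `parts` equal-height strips and total each strip's cost."""
    size = len(row_costs) // parts
    return [sum(row_costs[i * size:(i + 1) * size]) for i in range(parts)]

def makespan(strip_costs, cpus):
    """Finish time when each idle CPU picks up the next unrendered strip."""
    free_at = [0.0] * cpus  # time at which each CPU next becomes idle
    heapq.heapify(free_at)
    for cost in strip_costs:
        start = heapq.heappop(free_at)  # the CPU that becomes idle first
        heapq.heappush(free_at, start + cost)
    return max(free_at)

# Made-up scene: cheap empty sky in the top half, expensive volumetrics below.
rows = [1] * 4 + [9] * 4

if __name__ == "__main__":
    for threads in (1, 2, 4, 8):
        finish = makespan(split_evenly(rows, threads), cpus=2)
        print(f"{threads} thread(s) on 2 CPUs: done at t={finish}")
```

In this made-up scene, 2 threads finish at t=36 because one CPU idles after its cheap strip, while 4 or 8 threads finish at t=20, the best two CPUs can do for 40 units of work. With `cpus=1` the finish time is 40 regardless of the thread count.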

toby
03-03-2005, 11:31 AM
So it still only renders 2 threads at a time ( but apparently goes back and forth, it looks like it's doing all threads at once ) and the only extra speed is from having enough threads that no processor sits idle - I think I'm grasping it now

Captain Obvious
03-03-2005, 04:32 PM
Yes, essentially it is to make sure no processor is ever sitting idle. With too few threads, it can happen in some scenes. On a single-processor machine, it obviously never matters, because it won't be idle. It never needs to sit and wait.