
Extreme render perf issue here w/ Mac(64) LW vs Win LW64?!



jwiede
02-05-2013, 11:14 PM
On my MacPro (12core/24thread) running LW64 11.5 on Win7 X64, I was able to render the BenchmarkMarbles scene in ~1h5mins, a respectable enough performance. Unfortunately, running LW Mac64 on 10.8.2 on the exact same hardware, across multiple tries rendering the exact same scene with identical settings, I stopped the render after >3hrs with noise still visible. During the render, I noticed that LW was running nowhere near 100% on all CPUs, and seemed to be spending most of its time in system synchronization functions (load was consistently showing as user 16%, system 45%+ during rendering).

I was floored by how much worse LW Mac64 was performing: Even after three hours the render hadn't reached the same level of cleanliness I saw in the LW_Win64 render after ~30mins. Mac64 LW is consistently rendering ~6x slower here than LW64 on Win7x64 using the same hardware, a 12core/24thread modern system. I've tried clearing configs and anything else I can think of, and the Mac LW render performance remains abysmally poor compared to Win64 LW.

I would very much appreciate it if other Mac LW users with modern hw (and esp. those with recent 12c/24t MacPros) could attempt to run the 11.5 content scene "BenchmarkMarbles.lws" (just set content dir, load scene, and hit F9) to confirm this issue. I'm seeing no evidence to suggest a local cause, but would love to hear from others that they don't see this issue. I really, really hope this does not turn out to be a (multicore?) Mac-wide issue.

Before anyone suggests this is just "normal" OS overhead of MacOS versus Win64, don't bother, it isn't. In the past I've run tests comparing Cinebench 64-bit on Mac versus Cinebench 64-bit on Windows on this same hardware, and the difference in performance was far smaller (~1.5% difference in Mac's favor -- CB64 on Mac consistently gives 12.49 on this hw on 10.8.2 while CB64 on Win7x64 consistently gives 12.30). I've seen similar results testing with Maxwell renders in the past, where the differences were so small as to fall within a reasonable margin of error (<5%). Neither Cinebench nor Maxwell shows substantial differences in performance between the 64-bit app on MacOS 10.8.2 and the 64-bit version running on Win7x64 on this hardware.
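
For anyone who wants to check the arithmetic, here is a quick sketch using only the figures quoted in this thread. The two Cinebench scores work out to a gap of roughly 1.5%, while the LW comparison (about 30 minutes on Win64 versus more than 180 minutes on Mac64 for similar cleanliness) is a lower bound of about 6x. The function name is just for illustration.

```python
# Relative differences computed from the figures quoted in the post above.

def percent_diff(a, b):
    """Percentage by which a exceeds b."""
    return (a - b) / b * 100.0

# Cinebench scores on the same Mac Pro (higher is better).
cb_mac, cb_win = 12.49, 12.30
cb_gap = percent_diff(cb_mac, cb_win)

# LightWave: the Win64 render looked this clean after ~30 minutes;
# the Mac64 render still had visible noise after 180 minutes.
lw_slowdown = 180 / 30

print(f"Cinebench gap: {cb_gap:.2f}% in the Mac's favor")
print(f"LightWave slowdown on Mac: at least {lw_slowdown:.0f}x")
```

The point of the comparison is the order of magnitude: a gap of a percent or two is within noise, a factor of six is not.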

*: All LW tests were done using BenchmarkMarbles.lws scene and content from 11.5 content directory. Cinebench tests were done using (current) version 11.529 of Cinebench. Maxwell tests were done using Maxwell 2.7, then-current, but I can repeat tests using current 2.7.10 if precise numbers are desired.

Sensei
02-05-2013, 11:17 PM
What is Render Globals > Multithreading?
Try setting up Multithreading manually to >24.
what happens?

jwiede
02-05-2013, 11:28 PM
During the render, the info panel noted 24 threads were being used, and setting to 24 render threads manually seemed to yield similar results in terms of CPU loading. Visually, there seemed to be 24 separate "sets" of pixel updates occurring as well, just as were present in LW64 on W7x64 -- perhaps put better, the pixel updating in both cases was the same in terms of the number of concurrent "lines" of pixels being drawn by Mac64 LW & Win LW64. The difference was that Win LW64 was producing pixels-per-update-set at a visibly quicker rate.

Just to clarify, this is also with all third-party plugins removed (clean configs, as noted previously). The only plugins present are the "default" LW ones.

Sensei
02-05-2013, 11:29 PM
But set 64...

Spinland
02-05-2013, 11:52 PM
My desktop is a 2011 Mac Mini Server: 2.0GHz i7, four cores, eight threads. It has 16GB of RAM and is running Mt Lion 64 bit. It completed the marbles benchmark render in 3hrs, 18mins. Hope that's a useful data point.

OFF
02-06-2013, 12:18 AM
Check the Image Cache settings!

jwiede
02-06-2013, 01:01 AM
My desktop is a 2011 Mac Mini Server: 2.0GHz i7, four cores, eight threads. It has 16GB of RAM and is running Mt Lion 64 bit. It completed the marbles benchmark render in 3hrs, 18mins. Hope that's a useful data point.
Sounds like it was doing better than mine, which suggests perhaps some kind of scaling/synch issue, but still need to hear from other 8c/16t or 12c/24t Mac folks to confirm. Just out of interest, while you were rendering what did Activity Monitor state for user and system CPU load?

jwiede
02-06-2013, 01:05 AM
But set 64...
Will try tomorrow when I have time, but at the moment I'm more interested in what other 12c/24t Mac folks experience to confirm/deny similar perf.

Spinland
02-06-2013, 01:23 AM
Here's a representative screen shot of my system while rendering the scene:

111169

3dworks
02-06-2013, 01:25 AM
i just loaded the scene and rendered it on my macpro (same specs as yours) and the results are actually faster than on your win 64 setup. all threads were fully used; 11.0.3 was just a little bit faster than 11.5:

http://forums.newtek.com/showthread.php?133251-11-5-s-BenchmarkMarbles-lws-share-your-machine-s-render-time-here&p=1298186&viewfull=1#post1298186

not sure what is wrong with your setup/system? did you try with a clean install of LW 11.5? did you try to render the scene without third party plugins (especially no master plugins installed as default)?

cheers and good luck

markus

OFF
02-06-2013, 01:40 AM
Purely for information:
Image Caching is "on"
111172
Image Caching is "off"
111173

3dworks
02-06-2013, 01:48 AM
screenshot here: http://www.3dworks.com/skitch/Screen_Shot_2013-02-06_at_09.41.38-20130206-094420.png

Spinland
02-06-2013, 01:48 AM
Where is the control for image caching?

3dworks
02-06-2013, 02:01 AM
check this: what is still a source of extreme slowdown is if you manually add buffer export, or plugins like exrtrader, when using the VIPER buffers. see screenshot. i posted this bug in fogbugz as case number 52850. unfortunately, this bug has not been resolved, so native buffer export cannot be used on the mac side without a huge render speed penalty. with exrtrader, you MUST uncheck the 'store all VIPER buffers' option, which is on by default.

but i'm not sure why in your case this would be active if you used the unchanged scene from the content folder? maybe using a startup template scene? let us know how it goes.

screenshot, left CPU activity log is without buffer mode, right one with:

111175

i will update my fogbugz report, it's a shame that this bug made it through...

cheers

markus

EDIT: bug is still open and i got confirmation that a fix for the 11.5 release was too 'risky' and that NT will try to fix it in an update. i reported the bug mid of november...

OFF
02-06-2013, 02:04 AM
Where is the control for image caching?
General options tab, below.

3dworks
02-06-2013, 02:15 AM
Check the Image Cache settings!

image caching is NOT an issue under mac:

111177

jwiede
02-06-2013, 02:36 AM
Purely for information:
Image Caching is "on"
111172
Image Caching is "off"
111173
Yep, it does appear that image caching was the culprit. Running with it explicitly disabled, the speed improvement is visible, and CPU loading shows all cores maxed out. Odd thing is, it was enabled despite (otherwise) clean configs*, and I don't recall explicitly enabling it.

So, next question: When did image caching become a "bad thing"? I don't recall ever seeing explicit advice against it, nor explanation why it was bad. I can imagine how it could be badly written, but when did it become a bad thing? I've done _lots_ of renders with it enabled in 11.0.3 and prior, and never recall the CPU loading as dysfunctional the way it was with 11.5 & the benchmark scene.

I'm glad to have found the culprit, though I'm not sure I'm that happy about having to give up image caching.

*: I clean configs for testing by explicitly renaming the existing "Preferences" folder I always create to "OldPrefs" or similar, and create a new empty "Preferences" dir in same dir as the LW apps. I never run off the "stock config location" due to past testing experiences.
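
For anyone who wants to repeat the clean-config step described in the footnote above, here is a minimal sketch of it as a script. It assumes the layout described there (a "Preferences" folder sitting next to the LW apps); the function name and the numbered backup names are hypothetical, and the demo runs against a temporary directory, not a real install.

```python
from pathlib import Path
import tempfile

def reset_preferences(app_dir, backup_name="OldPrefs"):
    """Set aside app_dir/Preferences and create a fresh, empty one.

    Returns the backup path, or None if there was nothing to set aside.
    """
    app_dir = Path(app_dir)
    prefs = app_dir / "Preferences"
    backup = None
    if prefs.exists():
        backup = app_dir / backup_name
        n = 1
        while backup.exists():  # never clobber an earlier backup
            n += 1
            backup = app_dir / f"{backup_name}{n}"
        prefs.rename(backup)
    prefs.mkdir()
    return backup

if __name__ == "__main__":
    # Demo in a throwaway directory standing in for the LW app folder.
    with tempfile.TemporaryDirectory() as tmp:
        old = Path(tmp) / "Preferences"
        old.mkdir()
        (old / "some.cfg").write_text("old settings")
        print("backed up to:", reset_preferences(tmp))
```

Quit LW before running anything like this, since configs are typically written back on exit.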

Spinland
02-06-2013, 02:43 AM
Yes, odd that it was enabled by default. On my pretty vanilla 11.5 install the default was unchecked. A purely unscientific test over just a couple of runs showed about a 20% speed up with it disabled on an image with GI enabled and three FiberFX instances attached.

jwiede
02-06-2013, 02:53 AM
image caching is NOT an issue under mac:

111177
Actually, it appears it (now) may be one, at least in my case. Can someone please give me a pointer to the "image caching issue" so I can better understand what's going on?

3dworks
02-06-2013, 02:53 AM
did you guys see my post below?

jwiede
02-06-2013, 02:59 AM
did you guys see my post below?
Yep, I saw yours, but that doesn't explain why I see my perf issue with it enabled here, and not without. That's why I want to understand the issue better.

I've checked and rechecked and have no third-party plugins in my "clean" 11.5 install, nor image filters or master plugins with benchmarkmarbles scene loaded, so no buffer export or exrtrader involved. Image Caching alone seems to be the factor here between 100% CPU loading and the degenerate User 15%/System 49% "problem state" I was seeing with the bad perf.

Just to add more strangeness, I've done some quick testing on 11.0.3, and only one of the sample benchmark scenes I tried (out of a bunch) shows a problem with image caching enabled; most still produce 100% CPU loading. I'm pretty sure I didn't encounter this issue previously in 11.0.3 in my own work either -- I'd remember it, the CPU loading catches my eye immediately when it's in the problem state. It might be that what was a rare issue previously in Mac LW prior to 11.5 is now occurring much more frequently (in 11.5).

Regardless, I'm still less than thrilled about the situation in general. If this was a known issue previously, why wasn't it addressed? It certainly seems like a fairly severe issue in terms of perf impact, nor does blanket disabling image caching seem "harmless" either from a perf viewpoint.

3dworks
02-06-2013, 03:21 AM
ah, interesting, i rechecked and can confirm now your issue! apparently there is a difference if you check this option when the scene is already loaded (image cache OFF when loading the scene, then switched to ON) or if you start layout, check this option to ON and then load the scene. in this latter case i can confirm exactly your problems! time for another fogbugz report, i guess?!

111183

3dworks
02-06-2013, 03:34 AM
sent to NT as fogbugz 56218!

Sensei
02-06-2013, 03:39 AM
111184

The earlier part of the graph, on the left, is with Image Caching enabled.
The part on the right is with IC turned off.
You can clearly see that even on 8 threads it jumps between 75-90% CPU used. Without this option it's an almost steady 100% (the little bumps are when new AA passes are started).

Sensei
02-06-2013, 03:55 AM
When all the threads try to access a single file or a single block of memory simultaneously, multitasking has to be blocked, the data written, and then multitasking unblocked.
But with too many threads trying to access the restricted area, each attempt to block multitasking ends up waiting for another thread to unblock it.
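
The pattern described above can be sketched roughly as follows. This is a toy illustration only, not how LW's image cache is actually implemented (nobody in this thread has seen that code); the class, the tile keys and the loader are all made up. Every lookup goes through one global lock, and the counter records how often a thread arrived and found the lock already held. Note that Python threads do not run CPU work in parallel anyway, so this shows the shape of the problem, not its real cost.

```python
import threading

class SingleLockCache:
    """Toy cache in which every lookup takes the same global lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}
        self.waits = 0  # times a thread found the lock already held

    def get(self, key, load):
        if not self._lock.acquire(blocking=False):
            # Lock is busy: this thread now sits idle until it is released.
            self._lock.acquire()
            self.waits += 1  # safe to update, we hold the lock here
        try:
            if key not in self._data:
                self._data[key] = load(key)
            return self._data[key]
        finally:
            self._lock.release()

def run(threads=24, lookups=2000, tiles=16):
    """Have several 'render threads' hammer the same cache."""
    cache = SingleLockCache()
    out = [0] * threads
    keys = [i % tiles for i in range(lookups)]

    def worker(idx):
        out[idx] = sum(cache.get(k, lambda k: k * k) for k in keys)

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return cache, out

if __name__ == "__main__":
    cache, _ = run()
    print("lookups that had to wait for the lock:", cache.waits)
```

The more threads there are, the more of them are queued at the lock at any moment, which would fit the high "system" and low "user" CPU figures reported earlier in the thread.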

JonW
02-06-2013, 05:34 AM
It's not a Mac problem. Image caching off, my W5580 x 2 CPUs 3.2 GHz XP64, does the scene in 1h13m17s. The CPUs were running between 97-100%.

With Image caching on CPU usage dropped to 13-16%

So make sure it's off!

jwiede
02-06-2013, 08:01 AM
So make sure it's off!
Sorry, not a good enough answer, IMO. Image caching serves a legitimate purpose, and offers its own perf benefits that are lost by disabling it. Architecturally it is absolutely possible to write a data-caching system that can handle arbitrary readers/writers without such problems. Just because something is broken, and can be turned off, doesn't make that suddenly a legitimate solution, nor trivialize the issue's presence in a released product.
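
As one illustration of the kind of design meant here (a sketch of a well-known technique, lock striping, and not a claim about what LW does or should do): the cache is split into shards, each with its own lock, so threads asking for different tiles rarely wait on each other. All names below are hypothetical.

```python
import threading

class ShardedCache:
    """Toy cache split into shards, each guarded by its own lock."""

    def __init__(self, shards=32):
        self._shards = [({}, threading.Lock()) for _ in range(shards)]

    def _shard(self, key):
        return self._shards[hash(key) % len(self._shards)]

    def get(self, key, load):
        data, lock = self._shard(key)
        with lock:  # only threads hitting the same shard contend here
            if key not in data:
                data[key] = load(key)
            return data[key]

def demo(threads=8, tiles=64):
    """Several threads read every tile; each tile is loaded exactly once."""
    cache = ShardedCache()
    loads = []  # list.append is thread-safe in CPython

    def load_tile(key):
        loads.append(key)
        return key * 2  # stand-in for decoded image data

    results = [None] * threads

    def worker(idx):
        results[idx] = [cache.get(k, load_tile) for k in range(tiles)]

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results, loads

if __name__ == "__main__":
    results, loads = demo()
    print("tiles loaded:", len(loads), "of", len(results[0]))
```

A real implementation would also have to deal with eviction and memory limits, which this sketch ignores.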

jwiede
02-06-2013, 11:13 AM
I would very much like to hear LW3DG's perspective on this issue (major perf degradation with image caching enabled), and how serious a problem they consider it. I'd esp. welcome any details they can provide as to precisely what's going on when the issue occurs (which might help us find better workarounds besides just disabling image caching).

3dworks
02-06-2013, 12:16 PM
did you also file a bug report? maybe it's good to let them know this way. let's see if the service update will contain fixes for the image buffers issue (way more of a problem if you ask me, there's currently no workaround when not using exrtrader as a third party plugin) and this image cache one... so, fingers crossed! the other serious glitch on mac is the abysmal opengl performance in modeler's glsl mode.

for the rest it seems to be a very solid release to me. what definitely improved is lscript, many issues seem to have vanished, and i got reports that even third party plugins like ozone are now working properly! ...have to test this one when i have more time.

cheers

markus

Airwaves
02-12-2013, 02:02 PM
Purely for information:
Image Caching is "on"
111172
Image Caching is "off"
111173


Just curious on a side note. It shows 592.5 M which I believe is for memory. I have a 64 bit computer that has 32gb ram and when I render the most I see there is 242.5 or something close. Do you know why and maybe I can change a setting to use more RAM so things speed up, if possible.

OFF
02-12-2013, 08:18 PM
I think it depends on the number of polygons and textures in the scene (here ~10,000,000 polygons and 1 GB of textures in my scene),
you can also raise the value of the segment memory limit.

Airwaves
02-13-2013, 12:37 PM
I think it depends on the number of polygons and textures in the scene (here ~10,000,000 polygons and 1 GB of textures in my scene),
you can also raise the value of the segment memory limit.

I read a tutorial a while back that said changing the segment memory limit can hurt more than help. I would love to change it if it will help. I have 32 gb and I only use 4gb when in the program. If I did change it what would you recommend I change the segment memory limit to? Thanks

OFF
02-13-2013, 08:24 PM
I set the segment memory limit to 512 MB; that is enough for me for all sizes and types of scenes.
But again, if your scene contains only null objects or primitives, there will be nothing to fill your memory with.
To fill the 32 gigabytes of memory in my computer I would need to create a scene with at least 50,000,000-60,000,000 polygons
and more than a gigabyte of textures.

Airwaves
02-14-2013, 10:46 AM
I set the segment memory limit to 512 MB; that is enough for me for all sizes and types of scenes.
But again, if your scene contains only null objects or primitives, there will be nothing to fill your memory with.
To fill the 32 gigabytes of memory in my computer I would need to create a scene with at least 50,000,000-60,000,000 polygons
and more than a gigabyte of textures.

When you set the segment memory limit what do you type in to get the 512mb?

My scenes are never complicated and probably do not need lots of RAM. I just wonder why so little RAM is used but CPU is at 100%. I guess the way it is designed is to use CPU and hardly any RAM. I think I overloaded when I got this computer. I thought the 32gb of RAM would help but I guess not. At least I have it if I need it though.

Airwaves
02-14-2013, 10:47 AM
Thanks again.