New 48 core box



jackvee
02-17-2011, 02:17 PM
I am building this system right now (see attachment). I am wondering if anyone else is using such a system for rendering, whether I can use any OS other than Server 2008 R2, and whether LW10 will run on 2008.

Thanks in advance

LazyCoder
02-17-2011, 03:02 PM
I'm sorry was that "building" or "giving birth to?" That thing is a BEAST! :D

Lewis
02-17-2011, 03:05 PM
That's gonna be hell of a machine for rendering :). You plan to use it as Workstation or Renderslave ?

Markc
02-17-2011, 03:07 PM
That must cost some serious cash?:eek:

Ernest
02-17-2011, 03:45 PM
I am building this system right now ( see attachment) I am wondering if anyone else is using such a system for rendering and if I can use any other OS other than Server 2008 r2 and will LW10 run on 2008

Thanks in advance

Only Windows Server will be able to see 4 processors. The other versions are limited to 3. You don't need 2008, but 2003 had some issues when using that many cores, so it's better to avoid the trouble and use 2008.

Maybe it will help you to get a discount if you tell the manufacturer that a 96 core box has been announced by Dell: http://www.v3.co.uk/v3/news/2274769/dell-core-poweredge-server

jackvee
02-17-2011, 03:49 PM
I plan on using it as a render box, but I'll have to try working in LW Layout at least once just to see how cool it is. The case is a rack case and it's only 2 rack spaces high, so I can't fit a full-height video card. I'll be using onboard video, which is not cool for modeling. Dollar-wise we came in just under $8,600.
I hope this thing cranks as fast as it should. Installing the OS now; I will post render times in the next few days.

Lewis
02-17-2011, 03:50 PM
Ernest - AFAIR Win7 and the other non-server versions are limited to 2 CPUs (not 3). You can use many cores, but only 2 sockets/CPUs, so yes, he needs to use a Server version to get all CPUs/cores.

Shiny_Mike
02-17-2011, 05:22 PM
that thing will probably be a monster for doing CFD sims.

JonW
02-17-2011, 06:07 PM
Anandtech did some reviews on 4 x CPU boards for server work & programs written for them & they are really good. There wasn't a lot of information for run of the mill programs, reading between the lines & scaling up from 2 to 4 cpus will be interesting for LW.

Good luck with it.

ToMar
02-18-2011, 10:54 PM
For a full-height GPU try this MEB case: http://www.akiwa.com/item.php?pg=image&id=82

BTW: This Opteron model uses 4 memory channels, not 3

Cageman
02-19-2011, 05:27 AM
I hope this thing cranks as fast as I hope it should. installing OS now I will post render times in the next few days

Looking forward to see some rendertimes. And please, if you can, share the content you test with so we can get some good comparsions with other systems.

:)

Lightwolf
02-19-2011, 05:33 AM
The main question is if a single instance of LW will actually be able to see that many cores.

Anything beyond 32 requires using a different Windows API to detect them.

Cheers,
Mike

Cageman
02-19-2011, 05:36 AM
The main question is if a single instance of LW will actually be able to see that many cores.

Anything beyond 32 requires using a different Windows API to detect them.

Cheers,
Mike

LW itself supports up to 64 threads in LW10 and LW9.6.1. If the OS can pass the information over to LW, which is what I believe Windows Server will do, I don't see why LW wouldn't be able to utilize it?

Lightwolf
02-19-2011, 05:41 AM
LW itself supports up to 64 threads in LW10 and LW9.6.1. If the OS can pass the information over to LW, which is what I believe Windows Server will do, I don't see why LW wouldn't be able to utilize it?
It does pass it on if the right API is used. I suppose in the worst case one could always set the number of threads manually.

Cheers,
Mike

COBRASoft
02-19-2011, 05:50 AM
What a machine!

If 64 cores is too much for LW, you can run 2 instances of it. Each on 32 cores :)!

Lightwolf
02-19-2011, 05:54 AM
What a machine!

If 64 cores is too much for LW, you can run 2 instances of it. Each on 32 cores :)!
It'll certainly be interesting to see how it performs with various scenes and numbers of threads. I'd expect massively diminishing returns at some point (heck, that can even happen on a lowly quad).

Cheers,
Mike

frantbk
02-19-2011, 08:32 AM
It does pass it on if the right API is used. I suppose in the worst case one could always set the number of threads manually.
Cheers,
Mike

So what is the right API to use in Lightwave 9.6/10?

I also thought someone once said that lightwave only uses the 64 thread max on specific functions. Which makes me ask the question if there is a white paper on Lightwave 10 & Core documenting when the max threads are used and when they are not.

I doubt very much if modeller can use all 64 threads (just my 2 cents).

Lightwolf
02-19-2011, 10:43 AM
So what is the right API to use in Lightwave 9.6/10?
If you really want to know: http://software.intel.com/en-us/blogs/2010/12/16/tbb-30-high-end-many-cores-and-windows-processor-groups/

I also believe that OpenMP, as included with icc, (at least until recently) didn't support more than 32 cores in Windows as well. I remember a blurb about it in a benchmark comparison of a 48 core AMD box vs. a 32 core Intel one.

Cheers,
Mike

Lewis
02-19-2011, 10:53 AM
This article (or whatever it is) says win7/server 2008 v2 can work with 256 cores.

short summary:

Windows 7 was developed in parallel with Windows Server 2008 Release 2. The two share the same kernel and thus many features. One of them, for example, is support for up to 256 cores. AMD and Intel are at the six/eight-core range, but in a system with multiple processors, that adds up fast. Intel intends its new Nehalem-EX processor, with eight cores, to be used in servers with four sockets. That’s 32 cores and 64 threads.
Windows XP was built back in the time of Symmetric Multiprocessing (SMP), where multiple cores in a CPU were seen as separate physical CPUs, as were multithreaded CPUs like the new Core i5/i7. Memory was seen as a single fabric. Windows 7, with its NUMA concepts, sees cores as functional nodes, manages threads between the nodes, and allows for partitioning or allocating memory between cores.
This led to all kinds of headaches involving overwrites or conflicts over the same memory allocation memory. NUMA lets a portion of memory be allocated to a core. Switching memory from one core to another, though, isn’t trivial, but with Windows Vista, Microsoft added C functions to copy memory from one allocation to another.

full story here: http://itexpertvoice.com/home/multi-core-support-in-windows-7/

Lightwolf
02-19-2011, 11:05 AM
This article (or whatever it is) says win7/server 2008 v2 can work with 256 cores.

Yup. but that doesn't tell you anything about how the actual software finds out about them. If it's using outdated libraries or older Win32 API calls then it can never find out about the 256 cores.

It's a bit like a 32bit app not seeing anything beyond 4GB.

Cheers,
Mike
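
To make the API point above concrete, here is a minimal sketch of how a program can ask Windows for the processor count across all processor groups. It assumes Windows 7 / Server 2008 R2 or later, where logical processors are organized into groups of up to 64 and older calls only report the calling thread's group. It uses the documented Win32 function `GetActiveProcessorCount` with `ALL_PROCESSOR_GROUPS`; the wrapper name `logical_processor_count` is made up for illustration, and nothing here claims to be how LightWave itself detects cores.

```python
import ctypes
import os
import sys

ALL_PROCESSOR_GROUPS = 0xFFFF  # Win32 constant: count across every processor group


def logical_processor_count():
    """Best-effort count of logical processors visible to the OS."""
    if sys.platform == "win32":
        kernel32 = ctypes.WinDLL("kernel32")
        # GetActiveProcessorCount is only exported on Windows 7 /
        # Server 2008 R2 and later, so look it up defensively.
        func = getattr(kernel32, "GetActiveProcessorCount", None)
        if func is not None:
            func.argtypes = [ctypes.c_ushort]  # WORD GroupNumber
            func.restype = ctypes.c_uint32     # DWORD
            count = func(ALL_PROCESSOR_GROUPS)
            if count:  # 0 signals failure
                return count
    # Fallback for other platforms or older Windows versions.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("logical processors:", logical_processor_count())
```

An application built against older libraries never makes a call like this, which is why it can undercount no matter what the OS supports.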

Lewis
02-19-2011, 11:29 AM
well i doubt that M$/Intel/AMD would accept that a 256-core machine would work as 32 cores or just a little bit above :D. That would be a huge waste of resources, but luckily we will find out "soon" 'coz machines are progressing fast and adding more and more cores, so even mainstream render machines could go over 32 cores in the next 2 years. Heck, even mobile phones are dual-core now :)

Lightwolf
02-19-2011, 11:33 AM
well i doubt that M$/Intel/AMD would accept that 256Cores machine would work as 32 cores or just little bit above :D.
That's not for them to decide. Up until recently even LW only supported up to 16. And software that is designed for older OS versions is also not likely to support more than 32 because the API to actually see if there's more just didn't exist back then.

And if developers still stick to older library/compiler versions, then there's nothing that any of the above can do either.

Cheers,
Mike

Lewis
02-19-2011, 11:56 AM
well if "recent" is 2 years then yeah but in technology 2 years is a lot of time :D. Also whoever wants to have some impact in future ($$$$) it surely can't rely on old compiler/library/system when it's clear that we are "stuck" under 4Ghz and they are going for more and more cores instead GHz so multiprocessing is the key. Even new games now like Quads more than dual core CPUs which is big deal since that market is huge and not so long ago (like 9-10 months ago) 3GHz Dual-Core was giving better frame rates than 2.4GHZ Quads since games didn't use all cores....

Lightwolf
02-19-2011, 12:02 PM
well if "recent" is 2 years then yeah but in technology 2 years is a lot of time :D.
Or not. Developers can be quite conservative (for good reasons), tried and tested trumps new and shiny. Especially if you have thousands of customers.
Even the Cinebench 11.5 release last year capped at 32 cores for example.


Also whoever wants to have some impact in future ($$$$) it surely can't rely on old compiler/library/system when it's clear that we are "stuck" under 4Ghz and they are going for more and more cores instead GHz so multiprocessing is the key.
Obviously you need to keep moving. But it might also mean that older software will not take advantage of the new hardware. Again, this is quite similar to the move from 32- to 64-bit.
Less extreme in terms of finding the right API function to take care of the number of cores.
A lot more extreme if you really want to make good use of them across the board though.

Cheers,
Mike

jackvee
02-21-2011, 01:05 PM
OK, I've done a few tests with mixed results. LW 9.6 basically, well, SUCKED, so I downloaded the LW 10 30-day trial. MUCH better.
What is strange is that LW 10's VPR viewer is amazing: it uses 100% of all CPUs with every scene I loaded, and I am able to create a very accurate VPR animation preview in just minutes. But on most scenes, CPU usage during an actual render does not exceed 70%. I don't understand: if the VPR viewer can access all available threads, why can't my scene (with no external plugins) do the same?
I tried XP 64, Windows 7 64 and Windows Server 2008 R2 Datacenter. Only Server 2008 recognized all the cores.

So here's the scenes (see Attachment ) and the test results so far

Quickroom (1920x1080)
LightWave 9.6, dual core 3.6 GHz overclocked to 4.1 GHz, 4 GB RAM: 41m20s (2480.1 sec), CPU usage maxed out at 90%
LightWave 9.6, 48-core AMD (4-way 12-core 2.3 GHz), 48 GB DDR3 ECC RAM: 12m30s (750.4 sec), CPU usage maxed out at 22%
LightWave 10, 48-core AMD (4-way 12-core 2.3 GHz), 48 GB DDR3 ECC RAM: 10m04s (604.0 sec), CPU usage maxed out at 69%, up and down

My volcano2 (1920x1080)
LightWave 9.6, dual core 3.6 GHz overclocked to 4.1 GHz, 4 GB RAM: 55m32s (3331.8 sec), CPU usage at 100% and 50%
LightWave 9.6, 48-core AMD (4-way 12-core 2.3 GHz), 48 GB DDR3 ECC RAM: 14m22s (862.3 sec), CPU usage maxed out at 65%
LightWave 10, 48-core AMD (4-way 12-core 2.3 GHz), 48 GB DDR3 ECC RAM: 4m16s (256.6 sec), CPU usage at 70% and 40%

cresshead
02-21-2011, 01:42 PM
okay i'm no math's guru but that looks VERY slow.

can lightwave even SEE 48 cores?

Cageman
02-21-2011, 01:42 PM
Sounds cool! :)

jackvee: Would it be possible for you to do a screenrecording of one of these scenes with VPR? Camtasia Studio or whatever (you can download a fully functional 30-day trial version of it).

Would LOVE to see this thing in action!

:)

Cageman
02-21-2011, 01:44 PM
okay i'm no math's guru but that looks VERY slow.

can lightwave even SEE 48 cores?

Going from 55min to 4 min?

LW9.6 can see 32 threads.
LW9.6.1 and LW10 can see 64 threads.

Lewis
02-21-2011, 01:59 PM
but on most scenes with an actual render cpu usage does not exceed
70% ????? I don't understand If VPR Viewer can access all available threads
why can't my scene (with no external plugins) do the same.I tried XP 64, windows 7 64 and window's server 2008 R2 Datacenter Only Server 2008 recognized all the cores.


Did you switch on 64 threads in the Render Globals prefs (or try Automatic mode)? It sounds like it's on 32 threads, which would be circa 70% usage.

Lightwolf
02-21-2011, 02:01 PM
It'd be interesting to see the task manager during a render, with one graph per CPU as well as kernel times active.

Are the scenes available anywhere?

Cheers,
Mike

Cageman
02-21-2011, 02:04 PM
It'd be interesting to see the task manager during a render, with one graph per CPU as well as kernel times active.

Are the scenes available anywhere?

Cheers,
Mike

Look at the bottom of the post: http://www.newtek.com/forums/showpost.php?p=1115241&postcount=25

:)

Lightwolf
02-21-2011, 02:06 PM
Look at the bottom of the post: http://www.newtek.com/forums/showpost.php?p=1115241&postcount=25

:)
D'oh, cheers... I shouldn't read posts while we chat... :D

Cheers,
Mike

cresshead
02-21-2011, 02:21 PM
Going from 55min to 4 min?

LW9.6 can see 32 threads.
LW9.6.1 and LW10 can see 64 threads.

so just 13.74 times faster

so 2.29 mins was the target if you go from 2x to 48x, so 4 mins is not that efficient.

someone want to crunch the numbers as i'm not a math geezer!

also take into account that core 2 or the 2 core machine was just using 50%

it's certainly a fast box...but it doesn't appear to scale up very well with 1 scene..unless my numbers are waay off.
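
For anyone who wants to crunch the numbers as asked above, here is a small sketch of the speedup and per-core efficiency arithmetic. It uses only the volcano2 times posted in the thread (55m32s on the dual core, 4m16s on the 48-core box). The function names are made up for illustration. Note the comparison is rough at best: the two runs used different LightWave versions (9.6 vs 10) and different clock speeds (4.1 GHz vs 2.3 GHz), so this is arithmetic, not a controlled benchmark.

```python
def to_seconds(minutes, seconds):
    """Convert a 'XmYs' render time to seconds."""
    return minutes * 60 + seconds


def speedup(baseline_s, new_s):
    """How many times faster the new run is than the baseline."""
    return baseline_s / new_s


def parallel_efficiency(observed_speedup, baseline_cores, new_cores):
    """Observed speedup as a fraction of the ideal (linear) speedup."""
    ideal = new_cores / baseline_cores
    return observed_speedup / ideal


dual_core_s = to_seconds(55, 32)   # volcano2, LW 9.6, dual core
box48_s = to_seconds(4, 16)        # volcano2, LW 10, 48 cores

s = speedup(dual_core_s, box48_s)
e = parallel_efficiency(s, baseline_cores=2, new_cores=48)
print(f"speedup: {s:.2f}x, efficiency vs. linear scaling: {e:.0%}")
```

With these inputs the speedup comes out at roughly 13x, a little over half of the 24x that perfectly linear scaling from 2 to 48 cores would give.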

Cageman
02-21-2011, 02:24 PM
so just 13.74 times faster

so 2.29 mins was the target if you go from 2x to 48x so 4 mins is not that efficientl.

someone want to crunch the numbers as i'm not a math geezer!

Yes... it isn't as efficient as one would want and I believe there is a lot of room for improvement, especially when VPR is able to utilize 100% CPU all the time.

Lightwolf
02-21-2011, 03:51 PM
so 2.29 mins was the target if you go from 2x to 48x so 4 mins is not that efficientl.
2.29 mins. is completely unrealistic though. Compute times don't really scale in a linear fashion with the amount of cores available.
I think anything beyond +90% per core is extremely good and for that many cores even that seems to be a fairly high number.

Cheers,
Mike
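
The point about non-linear scaling can be illustrated with Amdahl's law, which is a standard model and not something measured on this machine: if a fraction p of the work can run in parallel, the best possible speedup on n cores is 1 / ((1 - p) + p / n). The parallel fractions used below (0.95 and 0.99) are assumptions picked purely for illustration; the real figure for any LightWave scene is unknown here.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only to show the shape of the curve.
    for p in (0.95, 0.99):
        for n in (2, 4, 8, 24, 48):
            print(f"p={p:.2f} cores={n:2d} speedup={amdahl_speedup(p, n):5.2f}x")
```

Even with 95% of the work parallel, 48 cores top out at a bit over 14x, which is why a straight division of render time by core count sets an unrealistic target.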

Lightwolf
02-21-2011, 03:55 PM
Ok I've done a few test with mixed reviews LW 9.6 basically well SUCKED so I downloaded LW 10 30 day trial MUCH better
What is strange is that LW 10's VPR viewer is amazing it uses 100% of all cpus with every scene I loaded I am able to create a very accurate VPR animation preview in just minutes.
but on most scenes with an actual render cpu usage does not exceed
70% ????? I don't understand If VPR Viewer can access all available threads
why can't my scene (with no external plugins) do the same.I tried XP 64, windows 7 64 and window's server 2008 R2 Datacenter Only Server 2008 recognized all the cores.
I just had a look at quickroom. Change the classic camera to perspective and then bench again please.

Classic doesn't scale well on multiple threads at all.

Cheers,
Mike

jackvee
02-21-2011, 04:15 PM
I just had a look at quickroom. Change the classic camera to perspective and then bench again please.

Classic doesn't scale well on multiple threads at all.

Cheers,
Mike

Tonight I will try different LW cameras and settings on the same scenes.
I will switch to the Perspective cam and re-bench.

jackvee
02-21-2011, 04:29 PM
Here's a shot of Task Manager while rendering a VPR animation preview. On some scenes it pegs to 100% for 2 or 3 sec, then drops to about 5% for a sec, then goes back to 100%. On others it pegged at 100% for 90% of the time.

Lightwolf
02-21-2011, 04:34 PM
Here's a shot of task manager ...
You officially make me sick ... :D

Cheers,
Mike

COBRASoft
02-21-2011, 04:47 PM
And me jealous :)

caesar
02-21-2011, 05:00 PM
Here's a shot of task manager rendering an VPR animation preview.On some scenes it pegs to 100% for 2 or 3 sec then drops to about 5% for a sec then back to 100%. On others it pegged @ 100% 90% of the time.

It's a monster!!!

JonW
02-21-2011, 07:14 PM
Firstly, a very cool WTM image!

This box is probably going to render frames quite well (hopefully!). Have you tried Screamernet with, say, 4 nodes? This should really fill in the gaps in CPU usage. Obviously your individual frame time goes up, but the benefit is that you are doing 4 frames at a time, so your time per frame now gets divided by 4.

Or have you tried say 2 instances of LW & get 1 to render odd frames & the other to render even frames. Or 3 instances & get each to render every third frame etc..........


Attached 2 x W5580 (8 x 3.2 GHz) ET 9:38 (578.6)
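
The odd/even (or every-third-frame) split suggested above boils down to giving each instance its own start frame and a frame step equal to the number of instances. A minimal sketch, assuming each LW instance can be given a first frame, last frame and step; the function name `frames_for_instance` is made up for illustration:

```python
def frames_for_instance(first, last, instances, index):
    """Frames rendered by instance `index` (0-based) out of `instances`."""
    if not 0 <= index < instances:
        raise ValueError("index must be in range(instances)")
    # Start offset by the instance index, then step by the instance count.
    return list(range(first + index, last + 1, instances))


if __name__ == "__main__":
    for i in range(3):
        print(f"instance {i}: {frames_for_instance(1, 12, 3, i)}")
```

Every frame lands in exactly one instance's list, so the instances never render the same frame twice.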

VirtualFM
02-22-2011, 01:59 AM
Well, I'm a sucker for benchmarks so I had to have a go at this with my computer. :-) It's a 4 year old PC, with an Intel Quad Core [email protected] (just a slight 10% overclock, as my motherboard doesn't allow much more than that). It doesn't even simulate double cores with hyper-threading like the i7; it's just 4 straight cores.

I have to say that when I looked at your system I was really jealous... then I stopped being jealous when I thought about your energy bill! :D

But now that I've done the benchmarks myself, I'm not jealous at all! Because either AMD really sucks really bad or something is very wrong with your system!

I got these results:
Quickroom:
LW10 (1920x1080) 13'38" (Classic camera)
LW10 (1920x1080) 10'36" (Perspective Camera, AA=3, Reconst. Filter=Box, Sampling=Classic, Adpt. Sampling=0.01)

Volcano2:
5'25" (with your settings)
LW10 (1280x720) 3'27" (Perspective Camera, AA=3, Reconst. Filter=Box, Sampling=Blue Noise, Adpt. Sampling=OFF)
LW10 (1024x1080) 10'38" (Perspective Camera, AA=3, Reconst. Filter=Box, Sampling=Blue Noise, Adpt. Sampling=OFF)

And this was really turned down, because this is not tweaked as a "render machine" but as a multi-purpose system, so these results were obtained while running all the other crap it usually runs that consumes CPU, like Anti-Virus, DropBox, Mozy, Opera (with 100+ sites opened in tabs), Word, Photoshop, Fusion and a bunch of other small utilities. So I think I could squeeze out another 10-20% more power if I quit all those apps!

I know for a fact (as I tested one) that an i7-950 is 25-200% faster than mine (it depends on the scene), so either I chose the wrong settings when running my benchmark or your system is not behaving as it should.

jackvee
02-22-2011, 09:48 AM
OK, I have found the magic. Rendering in Layout just won't max out 48 cores, but all I had to do was use ScreamerNet and configure 4-6 nodes. Now I'm getting 100% CPU usage and it's FREAKING CRANKING, WOW!!! I tried 20 render nodes and RAM was an issue immediately: it sucked up all 48 gigs and came to a standstill. 512 gigs of RAM is not sounding so crazy now. I will render a few animations of the test scenes and I should have a good bench test soon.
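
Since every render node loads its own copy of the scene, the node count is bounded by RAM as well as by cores. A back-of-the-envelope sketch of that limit follows. The per-node footprint and the OS reserve are hypothetical numbers; the thread only says that 20 nodes exhausted 48 GB, not how much each node used.

```python
def max_nodes(total_ram_gb, per_node_gb, reserve_gb=4.0, cores=48):
    """Largest node count that fits in RAM, capped at one node per core."""
    usable = total_ram_gb - reserve_gb  # leave headroom for the OS
    if usable <= 0 or per_node_gb <= 0:
        return 0
    return min(int(usable // per_node_gb), cores)


if __name__ == "__main__":
    # Hypothetical: a scene that needs about 6 GB per node on a 48 GB box.
    print(max_nodes(total_ram_gb=48, per_node_gb=6))
```

Measuring the memory of a single node on the real scene first would replace the guessed per-node figure.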

Lightwolf
02-22-2011, 09:53 AM
Ok I have found the magic.
Did you ever try using the right camera type?
As long as you stick to classic there's (imho) really little point in benching.

Cheers,
Mike

Lewis
02-22-2011, 10:14 AM
As Mike said, there is no point of benching if you are using "classic" camera. That is old render "engine". Use Perspective camera and set threads to 64 in render globals (or automatic) and then see what happens.

jackvee
02-22-2011, 11:24 AM
Did you ever try using the right camera type?
As long as you still to classic there's (imho) really little point in benching.

Cheers,
Mike

Hi Mike, I tried the Perspective camera in Layout. It was about a minute faster, but it only utilized about 2% more CPU headroom. I will render the same animation with every camera through screamer_net and post times.

Lightwolf
02-22-2011, 11:27 AM
I will render the same animation with every camera through screamer_net and post times
Don't bother with the others, Perspective is the one.
Did you try running the scenes on the other machines as well for comparison?

Only 2% more is very odd. While Classic is prone to not even using all cores on a dual machine, Perspective usually does (once the scene is set-up by LW of course).

Cheers,
Mike

Lewis
02-22-2011, 11:55 AM
jackvee, can you post screen grab of your render globals tab in layout ?

bazsa73
02-22-2011, 12:10 PM
I want this computer

jackvee
02-22-2011, 12:15 PM
OK, I rendered my volcano2 scene with the Classic camera set to Enhanced Low. It rendered 150 frames in 22m45s. I re-rendered the same scene with the Perspective camera (AA set to 2, reconstruction filter Classic, sampling pattern Blue Noise, all other settings the same) and it took 34m18s. The animation also had noticeable aliasing artifacts when viewed. If you can suggest camera settings (AA, reconstruction filter, sampling pattern, etc.), I'll try again.

I am using screamer_net and am getting 100% cpu usage.
I notice no difference If I save the scene with 1, automatic or 64 threads

bazsa73
02-22-2011, 12:29 PM
Dont you have some ordinary heavy geometry scene with GI? These hypervoxel renders are always so frustrating.

Meshbuilder
02-22-2011, 12:33 PM
Hi jackvee, your computer looks like a monster!

I would love to see what render time you get with this benchmark scene I have attached.

The scene has good AA settings, Reflections, Refractions, DOF, Final Gather Radiosity, Spherical Light with soft shadows.
Dielectric, Conductor and Simple Skin shaders and some basic textures so that the AA has something to work with :)

15 min render time on my Intel Core 2 Quad 2,5 Ghz.

jackvee
02-22-2011, 01:23 PM
Hi jackvee, your computer looks like a monster!

I would love to see what render time you get with this benchmark scene I have attached.

The scene has good AA settings, Reflections, Refractions, DOF, Final Gather Radiosity, Spherical Light with soft shadows.
Dialectric, Conductor and Simple Skin shaders and some basic textures so that the AA have something to work with :)

15 min render time on my Intel Core 2 Quad 2,5 Ghz.

Hi Meshbuilder, this scene rendered in [email protected] 3m53s But if you look at my other posts this box seems to work best using screamer_net so I have set up a 60 frame camera tilt and pan animation to see how it renders
I'll post after

ToMar
02-22-2011, 01:57 PM
Hi Meshbuilder, this scene rendered in [email protected] 3m53s But if you look at my other posts this box seems to work best using screamer_net so I have set up a 60 frame camera tilt and pan animation to see how it renders
I'll post after

Rendered in 6:46 on my box (with Firefox and YouTube in the background), which is like a single CPU with slower memory compared to yours. For now it looks like my next system is going to be "just" a 2-way (24 Magny-Cours / 32 Bulldozer).

Lightwolf
02-22-2011, 02:02 PM
I notice no difference If I save the scene with 1, automatic or 64 threads
Which isn't surprising, as it's not a setting saved with the scene but a Layout config (which is also picked up by lwsn if configured properly).

Cheers,
Mike

jackvee
02-22-2011, 02:40 PM
Redered at 6:46 on my box (with firefox and YouTube in the back) wich is like a single CPU with slower memmory compared to your's. For now it looks like my next system is going to be "just" a 2-way (24magnycours/32Bulldozer).

OK, using the same scene, I created a 60-frame animation moving the camera slightly inward and tilting up. It rendered in 62m28s in screamer_net using 48 threads. I don't understand why Layout just won't use 100% CPU, yet using just 2 threads in screamer_net will yield 100% CPU usage?

JonW
02-22-2011, 02:41 PM
The BM scene rendered in 4:17 on my W5580 which is now long in the tooth. A pair of X5690 CPUs should render this in the region of 2:28 & turbo 2:23.

JonW
02-22-2011, 02:45 PM
SN is usually at 100% with 2 nodes on W5580 or so close it's not an issue, on my other CPUs it's never quite as good.


ok using the same scene.I created a 60 frame animation moving the camera slightly inward and tilting up it rendered in 62m28s in screamer_net using 48 threads I don't understand why Layout just wont use 100% cpu yet using just 2 threads in screamer_net will yield 100% cpu usage?

That's 1:04 per frame! BloodyHell!

I'm rendering the BM 60 frames on SN only using the W5580, just started but it looks like about high 4 minutes per frame. This is typical that SN takes longer than a straight LW render.

It would cost about $8k for a new W5690 box which would do frames in about 2:30 each, so your quad 6176 box of tricks is looking seriously good for frames! It's nice to have a stack of horse power with as little space & networking as possible!

Baking radiosity would be an interesting exercise?

JonW
02-22-2011, 10:10 PM
Some back of the envelope calcs based with AU$, but it’s more or less relative for US$. Then I factored in the X5690 instead of the prehistoric W5580 so pricing should be within a reasonable factor.

I also priced a box using AMD 6176 2.3GHz & 6168 1.9GHz CPUs. The whole box set-up divided by the GHz works out very close in price per GHz for both. So whether you get the dear or cheap CPUs, you get about the same GHz per $ for a whole box set-up. But if you want maximum grunt in one box get the 2.3GHz; if you haven’t got quite that much cash lying around get the 1.9GHz.

On a box set up for LW, in other words not going mad with handfuls of GTX580 cards etc., which would make the expensive CPUs better value on a whole-box price:

Rendering Screamernet frames will cost:

If a frame rendered on a 48 core AMD 48gb costs $1.00 to render.
A pair of X5690 CPUs 24gb will cost $2.19 per frame

Even if you factor in a good margin of error when rendering Screamernet frames, a quad AMD is still looking extremely economical. It would be worthwhile doing a few benchmarks on cheaper boxes to get some price performance figures.


Taking into account power etc, it’s pretty good, ok it’s using a lot of power but it’s spitting out frames so fast, power is not an issue. This is a bit the same as my W5580 compared to my other boxes. The W5580, although it’s using about 530 watts, it’s less per frame than my other computers.

Screamernet for the BM scene for 60 frames:
2 x W5580: 4h 38m CPU better than 99.9%
2 x X5690: 2h 45m
4 x 6176: 1h 2m
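
The relative cost-per-frame comparison above can be sketched as box price times render time, normalized so the 48-core box costs 1.00 per frame. This is a simplified model that ignores power and assumes cost scales with purchase price; it is an assumption, not the method actually used in the post. The inputs are figures from the thread in US$ (about $8,600 for the AMD box, about $8k for a dual-Xeon box) plus the 60-frame Screamernet times listed above. The post priced things in AU$, so the result is not expected to match its $2.19 exactly.

```python
def relative_cost_per_frame(price_a, minutes_a, price_b, minutes_b):
    """Cost per frame of box B relative to box A (box A = 1.00)."""
    # Cost per frame is taken as proportional to price * render time.
    return (price_b * minutes_b) / (price_a * minutes_a)


if __name__ == "__main__":
    amd_price, amd_minutes = 8600, 62          # 4 x 6176: 1h 2m
    xeon_price, xeon_minutes = 8000, 2 * 60 + 45  # 2 x X5690: 2h 45m (estimated)
    ratio = relative_cost_per_frame(amd_price, amd_minutes, xeon_price, xeon_minutes)
    print(f"dual X5690 cost per frame relative to the 48-core box: {ratio:.2f}")
```

With these US$ figures the ratio lands near 2.5, the same ballpark as the AU$-based estimate in the post.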

jrandom
02-23-2011, 03:55 PM
Oh man. Someday I will get a box like this... just as soon as I actually get good enough at modeling and animating to justify the cost. :P

frantbk
02-24-2011, 07:23 AM
If you really want to know: http://software.intel.com/en-us/blogs/2010/12/16/tbb-30-high-end-many-cores-and-windows-processor-groups/

I also believe that OpenMP, as included with icc, (at least until recently) didn't support more than 32 cores in Windows as well. I remember a blurb about it in a benchmark comparison of a 48 core AMD box vs. a 32 core Intel one.

Cheers,
Mike

Thanks for the link lightwolf. I thought windows 7 Prof & Ultimate supported 192 cores, but only supported 2 physical socket. The real question is what does Lightwave 10 support and which functions support multi-core/multi-threaded operations.

frantbk
02-24-2011, 07:32 AM
Or not. Developers can be quite conservative (for good reasons), tried and tested trumps new and shiny. Especially if you have thousands of customers.
Even the Cinebench 11.5 release last year capped at 32 cores for example.

Obviously you need to keep moving. But it might also mean that older software will not take advantage of the new hardware. Again, this is quite similar to the move from 32- to 64-bit.
Less extreme in terms of finding the right API function to take care of the number of cores.
A lot more extreme if you really want to make good use of them across the board though.

Cheers,
Mike

Don't forget the thread buffer issue. Back in 2002-2003 where I worked there was constant blue screens issue that the software people kept claiming was a hardware problem. As a hardware guy I kept telling them there was nothing wrong with the hardware. A year later Microsoft owned up to the small thread buffer size in windows 2000.

History could repeat itself.

JonW
02-24-2011, 02:49 PM
Lightwolf, that's interesting about the CPUs.


It would be interesting how long 48 cores takes to do this as 1 instance & rendered 60 times with SN & averaged.
http://www.3dspeedmachine.com/benchmarks/render-times/lightwave/

ToMar
02-27-2011, 09:52 AM
Reran the benchmark (the one with 4 balls) without Firefox and YouTube in the background and got 5:39 (5.65 min).
I also noticed that your 4x 12-core Magny-Cours runs at a slightly higher clock rate than my 2x 6-core Istanbul (I thought Magny-Cours topped out at 2.1 GHz. My bad).

So that's about:
2.1 / 2.3 = 0.913
0.913 x 5.65 = 5.16
5.16 / 4 = 1.29
1.07 / 1.29 = 0.83
1 - 0.83 = 0.17, i.e. 17%

That is at least a 17% speed improvement from the increased memory access alone (the loss from less-than-100% scaling with increased core count and clock rate is not factored in).
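
The estimate above can be reproduced step by step. This sketch just follows the arithmetic as posted: scale the 5.65 min result by the 2.1/2.3 clock ratio, divide by 4 for four times the cores, and compare with the 1.07 min per frame figure used for the 48-core box. The variable and function names are made up, and the attribution of the gap to memory access is the poster's interpretation, not something the code shows.

```python
def expected_minutes(measured_min, clock_ratio, core_ratio):
    """Scale a measured time by clock ratio and core-count ratio (ideal scaling)."""
    return measured_min * clock_ratio / core_ratio


measured_12_core = 5.65      # minutes on the 2x 6-core box
clock_ratio = 2.1 / 2.3      # clock ratio as used in the post
core_ratio = 4               # 48 cores vs 12 cores
observed_48_core = 1.07      # minutes per frame, as used in the post

expected = expected_minutes(measured_12_core, clock_ratio, core_ratio)
gain = 1.0 - observed_48_core / expected
print(f"expected {expected:.2f} min, observed {observed_48_core:.2f} min, gain {gain:.0%}")
```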

ToMar
02-27-2011, 09:18 PM
DELL Power Edge C6145: Double the power per space: http://www.youtube.com/watch?v=kBxny9OnBCU&feature=related

http://sites.amd.com/us/Documents/PowerEdge-C6145-Spec-Sheet.pdf

Hieron
02-28-2011, 07:39 AM
My 980X does the benchmark in 4m 45. I guess you can get a system with that speed for perhaps 1250 euros. As usual dollars and euros are quite interchangeable, so I'd say you can get about 6 of them for the price of the 48 core box. Jon, wouldn't it then be better, speed/cost wise, to opt for 5 980x's?

Jackvee, I would really appreciate it if you could record a video of how your 48 core box tackles VPR with 100% CPU showing, used on a standard test scene, for instance the one Jon linked. Nothing HV or anything. It would be awesome to see just how fast it is on such a standardized scene so we can compare. That VPR is something 5 980x's wouldn't be able to beat, of course. Oh, and best not to use a scene with nodal shaders; for example, that benchmark scene makes VPR super slow due to the Material nodal shaders (way slower than an actual render).

ps: making a nice print screen screengrab is better than a photo... and for video grabbing there's some (free?) software available..

ToMar
02-28-2011, 02:05 PM
I just did a search on my Buddys shop.

Intel
CORE I7-980X EXTREME 3.33GHZ
Aktionspreis
EUR 995,22

And thats just the CPU. How do you come up with EUR 1250?

JonW
02-28-2011, 02:12 PM
Having one very fast box would make life easier for large single renders; VPR & baking radiosity are very useful. But having a number of smaller boxes allows one to use another computer if all hell breaks loose.

Hieron
03-01-2011, 04:30 AM
I just did a search on my buddy's shop.

Intel
CORE I7-980X EXTREME 3.33GHZ
Aktionspreis
EUR 995,22

And that's just the CPU. How do you come up with EUR 1250?


In the Netherlands:
980x: 900,-
8 GB RAM: 80,-
Case psu: 100,-
cheap *** vga: 25,-
some mobo: 160,-
small HD: 50,-

total: 1395,-

(which still fits the 5x 980x inside 8500 price +- mind you)

However, no reason to use the 980x. Just go for the 970, my 980x isn't that overclocked anyway, the 970 will reach it fine as well. Which saves 400 euros here. Resulting in: 995,- Which fits 7x with ease !


So yes, having all in 1 box helps on OS price, network connections, overhead, floor space. But efficiency wise... not so sure. And it better not break down.

Needless to say though, I would really like to have one :) That's why I'd like to see some VPR mayhem :)
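
For anyone who wants to rerun the numbers from the parts list above, here is a rough sketch in Python. The prices, the 8500 budget and the 400 euro saving for the 970 are the figures quoted in the post; the names `parts_980x`, `boxes_within` and so on are made up for illustration. Note that the listed items add up to 1315, not the 1395 given as the total, so either one line item or the total is probably off. The sketch works from the itemized sum and checks the post's own totals as well.

```python
# Per-box parts prices in euros, as listed in the post.
parts_980x = {
    "980x": 900,
    "8 GB RAM": 80,
    "case + PSU": 100,
    "cheap VGA": 25,
    "mobo": 160,
    "small HD": 50,
}

BUDGET = 8500      # rough price of the 48 core box, as used in the post
SAVING_970 = 400   # saving quoted in the post for a 970 instead of a 980x


def boxes_within(budget, price_per_box):
    """Number of complete boxes that fit inside the budget."""
    return budget // price_per_box


total_980x = sum(parts_980x.values())   # itemized sum of the list
total_970 = total_980x - SAVING_970     # same build with the cheaper CPU

print("980x box:", total_980x, "->", boxes_within(BUDGET, total_980x), "boxes")
print("970 box: ", total_970, "->", boxes_within(BUDGET, total_970), "boxes")
```

With the post's stated totals of 1395 and 995, the same function gives 6 and 8 boxes, so the "5x 980x" and "7x 970" claims hold either way.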

ToMar
03-02-2011, 03:17 PM
My 980X does the benchmark in 4m 45. I guess you can get a system like that speed for perhaps 1250 euros. As usual dollars and euros are quite interchangeable so I'd say you can get about 6 of them for the price of the 48 core box. Jon, wouldn't it then be better, speed/cost wise to opt for 5 980x's ?

Jackvee I would really appreciate if you could record a video of how your 48 core box tackles VPR with 100% cpu showing. Used on a standard test scene. For instance the one Jon linked. Nothing HV or anything. Would be awesome to see just how fast it is on such a standardized scene so we can compare. That VPR is something 5 980x's wouldn't be able to beat ofc. Oh and best not to have a scene with some nodal shaders, for example that benchmark scene makes VPR superslow due to the Material nodal shaders.. (waay slower than an actual render)

ps: making a nice print screen screengrab is better than a photo... and for video grabbing there's some (free?) software available..

Are those 4m 45s OC or stock? You got to compare eggs to eggs.

Hieron
03-02-2011, 04:15 PM
Are those 4m 45s OC or stock? You got to compare eggs to eggs.


huh. So OC'ing a 980x to a mere 4.0 which is a 2 minute task at the very most, is not allowed? I mentioned mine are OC'ed.

As far as I'm concerned, I'm comparing eggs to eggs. Comparing the power of both with the minimum amount of effort. The fact that you can't OC the 48 cores, doesn't mean you wouldn't on the 6 core. How is that comparing eggs to eggs.. running a 980x at stock would be laughable. Anyway, even at stock the main point still stands.

I was just saying that rendering power/cost is similar, and when the 970 (OC'ed easy) is taken into consideration the favor goes rather heavily to i7's. Which are easy to get, install, replace etc.

Again, I'd like a 48 core box especially if it would blaze through VPR and GI caching, but for that amount you can get around 42 i7 cores as well, in 7 boxes, running around 3.2 (stock) to 4.0 ghz. And they beat those hands down. Then again, that much power in such a small space is way sexier.

ToMar
03-02-2011, 10:07 PM
I was just interested what these boxes can do in their manufacturer-intended configuration. I.e. if my almost 2 year old box still can keep up.

Hieron
03-03-2011, 04:17 AM
I was just interested what these boxes can do in their manufacturer-intended configuration. I.e. if my almost 2 year old box still can keep up.

Ah ok..
Not overclocked I think the 980x would hit 5m45s more or less so sort of on par right? These days there is a 970 which I would recommend over the 980x though..

Well actually, I'd wait for the high end Sandy Bridges to arrive. But this 48 core box is compelling if you look at added OS prices, space requirements, cable management, GI calc, VPR etc. And just pure awesome taskmanager :)

So, perhaps a VPR demo?

ianr
03-16-2011, 10:24 AM
Hi Jack,
Hieron asked for a Camtasia Video Grab of VPR.
I would love if you could get Rob Powers to release
his Sporty Car in Garage scene-lws. to you!
To show us how smooth your VPR Res.movements
are against the LW10 version on YouTube entitled
‘Virtual Cinematography’.
I think they were using Boxx dual Xeons with 24
threads running if I remember correctly. Whaddaaya think?

In the words of Gaga......' show'em what i got'

For a Camtasia Free Trial goto:
http://www.techsmith.com/download/trials.asp

Cageman
03-19-2011, 06:14 AM
Hi jackvee, your computer looks like a monster!

I would love to see what render time you get with this benchmark scene I have attached.

The scene has good AA settings, Reflections, Refractions, DOF, Final Gather Radiosity, Spherical Light with soft shadows.
Dielectric, Conductor and Simple Skin shaders and some basic textures so that the AA has something to work with :)

15 min render time on my Intel Core 2 Quad 2,5 Ghz.

Cool, and thanks for the neato benchmark scene! :) Came in handy to compare my old workstation and my new one. :D

My results:

Intel QuadCore Q6600 2.4GHz = 16m 22s (old workstation)
Intel i7 970 Hexa 3.2GHz = 5m 44s (new workstation)

I've attached screenshots as well. :)
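
To put the two results above side by side, here is a small sketch in Python that turns the "16m 22s" style times into seconds and works out the speedup. The helper name `to_seconds` is just an illustration, and it only understands times written as minutes and seconds like the two quoted here.

```python
import re


def to_seconds(text):
    """Convert a render time such as '16m 22s' to whole seconds."""
    match = re.fullmatch(r"\s*(\d+)m\s*(\d+)s\s*", text)
    if match is None:
        raise ValueError(f"unrecognised render time: {text!r}")
    minutes, seconds = int(match.group(1)), int(match.group(2))
    return minutes * 60 + seconds


old = to_seconds("16m 22s")   # Q6600 2.4GHz, old workstation
new = to_seconds("5m 44s")    # i7 970 3.2GHz, new workstation
speedup = old / new

print(f"{old}s -> {new}s, about {speedup:.2f}x faster")
```

That works out to roughly 2.85x on this benchmark scene.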

wrench
03-21-2011, 05:24 AM
I can't compare directly on my old motherboard now, but just by unlocking the fourth core of my Phenom II and overclocking I calculated it would have taken 1127 seconds (18m47s) on the old one and it now takes 733 (12m13s). (Based on 4x3.23GHz against 3x2.8GHz).

No speed demon, but not bad for a "free" upgrade.

B
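
The estimate above can be reproduced with a quick sketch in Python. It assumes render time scales inversely with cores x clock speed, which is the same simplification the post uses and ignores memory, cache and threading overhead. The function names are invented for illustration.

```python
def scaled_time(measured_s, measured_cores, measured_ghz, target_cores, target_ghz):
    """Estimate render time on another config, assuming time is
    inversely proportional to cores * clock (a rough approximation)."""
    measured_throughput = measured_cores * measured_ghz
    target_throughput = target_cores * target_ghz
    return measured_s * measured_throughput / target_throughput


def fmt(seconds):
    """Format seconds as e.g. '12m13s'."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}m{secs:02d}s"


# Measured: 733 s on 4 cores at 3.23GHz. Estimated: 3 cores at 2.8GHz.
old_estimate = scaled_time(733, 4, 3.23, 3, 2.8)

print("new:", fmt(733))
print("old (estimated):", fmt(old_estimate))
```

This gives about 1127 seconds for the old setup, matching the figure in the post.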

OnlineRender
03-21-2011, 06:30 AM
Cool, and thanks for the neato benchmark scene! :) Came in handy to compare my old workstation and my new one. :D

My results:

Intel QuadCore Q6600 2.4GHz = 16m 22s (old workstation)
Intel i7 970 Hexa 3.2GHz = 5m 44s (new workstation)

I've attached screenshots as well. :)

THAT's Killer !

djlithium
03-21-2011, 07:22 AM
I can't compare directly on my old motherboard now, but just by unlocking the fourth core of my Phenom II and overclocking I calculated it would have taken 1127 seconds (18m47s) on the old one and it now takes 733 (12m13s). (Based on 4x3.23GHz against 3x2.8GHz).

No speed demon, but not bad for a "free" upgrade.

B

I love AMD :)
I picked up 3 x6 PhenomII 1075T based CPU systems in december for under 2000 bucks CND with tax and they scream! The price just keeps coming down too, I checked last night for new pricing and the 1100T's are out for less than what I paid for the 1075T cpu model, but run at 3.3ghz vs 3.0 on the 1075T. CRAZY!

erikals
03-21-2011, 01:32 PM
the 1100T is not bad, and can be overclocked to 4,1 GHz without any big problem
(according to an article i read)... something to consider... :]

djlithium
03-22-2011, 03:28 AM
the 1100T is not bad, and can be overclocked to 4,1 GHz without any big problem
(according to an article i read)... something to consider... :]

I may pick up one 1100T and migrate the other x6 core to a board I know that can take it that currently has a phenomII x4 in it and see what I can push it to. I think 4.1ghz is a bit scary, but 3.5 - 3.8, should be totally safe.

Mitja
03-22-2011, 07:05 AM
Did a test, just to see what we are playing at.
Intel i5 760 @ 2.8GHz ---> the benchmark scene renders in 12:39.
3:53 with 64 cores is definitely not a good performance (my system was 500€!).