Page 1 of 3
Results 1 to 15 of 33

Thread: 64-bit Macs in August

  1. #1
    Gold Member Beamtracer's Avatar
    Join Date
    Feb 2003
    Location
    Citizen of the World
    Posts
    2,226

64-bit Macs in August

    According to MacBidouille, Apple will use the World Wide Developers Conference on June 23 to announce 64-bit Macintoshes, with configurations of 1.4GHz, 1.8GHz and dual 2.3GHz, and that these machines will be in the stores in August this year.

    Of course, none of this can be confirmed by Apple, but June 23 is not that far away for an announcement.

    64-bit Macs running IBM-970 processors, with the 64-bit Panther OS. Yippee!!!!!
    Moral of the day:
    Worry about your own morals, not someone else's!

  2. #2
    d i g i t a l u k a luka's Avatar
    Join Date
    Feb 2003
    Location
    Australia
    Posts
    297
    I wonder if any apps will be hitting 64bits by then

  3. #3
    Gold Member Beamtracer's Avatar
    Join Date
    Feb 2003
    Location
    Citizen of the World
    Posts
    2,226
    That's a question for Newtek!
    Moral of the day:
    Worry about your own morals, not someone else's!

  4. #4
    d i g i t a l u k a luka's Avatar
    Join Date
    Feb 2003
    Location
    Australia
    Posts
    297
Well, NewTek, can we ask you if you are thinking in that direction?

  5. #5
This is way too complex a subject to deal with fully. LW has been '64-bit clean' since v6.5, but it is not clear what advantages one would gain from a massive conversion to actually using 64-bit data everywhere. For many math ops, LW already uses 64-bit double-precision math. Where 32-bit floats can be used, it is probably faster and definitely less of a RAM burden to stay with that type of math. Perhaps running these operations on a 64-bit processor will simply double the speed?

    For integer operations, I would also hope to see some speed-up of operations on 32-bit ints. If not, then using 64-bit ints will just take more RAM to run at the same speed. It is not clear what the performance boost or penalty will be for existing, 32-bit-compiled apps running on a 64-bit CPU, so it is not obvious that building a new LW version for a 64-bit CPU will be worthwhile. If Apple does things right, 32-bit apps could get a nice boost without the serious memory-footprint bloat that could occur from using 64-bit data types indiscriminately. A lot will depend on the compilers available, too.

  6. #6
Yeah, from what I understood about CPU architectures, 64-bit alone won't really speed up floating-point math; double-precision floats take the same amount of time as 32-bit floats on current 32-bit CPUs (Mac and PC).
    If it were THAT easy to boost performance, all of today's CPUs would already be 64-bit, I guess.

You should rather hope it has strong vector (SIMD) units; just look at the P4... SSE2 has support for double-precision floats, and that gives LW a massive boost.
    From what I could find out, AltiVec can only handle single-precision floats, and the PPC 970 has AltiVec-compatible SIMD units, seemingly without any additions... (?)

Some interesting articles about 64-bit computing, the PPC 970, and P4 vs. G4:
    http://arstechnica.com/cpu/index.html

  7. #7
    normally i am different ingo's Avatar
    Join Date
    Feb 2003
    Location
    Close to the baltic sea, nearly in it
    Posts
    1,904
Hmm, August... isn't that the time when Motorola wanted to present the 2 GHz 7457 processor, without needing a dozen fans or a huge power supply like IBM's? Just curious....

  8. #8
    ShortsightedSithAssassin
    Join Date
    Feb 2003
    Location
    UK
    Posts
    1,948
    Isn't it odd that games consoles are already employing 128-bit CPUs, while computers are only now edging towards 64-bit? (Atari's 64-bit Jaguar came out in 1993!)

    I understand that it's one thing to make a dedicated games machine; quite another to develop a stable, mainstream computing workstation. But it's an interesting disparity.

  9. #9
    Registered User wapangy's Avatar
    Join Date
    Feb 2003
    Location
    Colorado Springs CO, USA
    Posts
    213
    Originally posted by Darth Mole
    Isn't it odd that games consoles are already employing 128-bit CPUs, while computers are only now edging towards 64-bit? (Atari's 64-bit Jaguar came out in 1993!)

    I understand that it's one thing to make a dedicated games machine; quite another to develop a stable, mainstream computing workstation. But it's an interesting disparity.
Well, those are just the graphics chips as far as I know, not the main CPU. All graphics chips in computers nowadays are 128-bit.

  10. #10
Well, yeah, I was curious, and in the case of the Jaguar, none of the programmable parts had 64-bit registers, so you could call its 64-bit processors GPUs. The PS2, however, does have a real 128-bit CPU, but it is also designed for pretty specific tasks.

    Originally posted by Darth Mole
    Isn't it odd that games consoles are already employing 128-bit CPUs, while computers are only now edging towards 64-bit?
Well, you've been able to buy 64-bit systems for ages; have a look at SGI, Sun, HP, IBM, Alpha (OK, that one's pretty much dead)... they've all been selling 64-bit workstations for years...

Besides that, a P4 for example has 128-bit SIMD registers (SSE2), and that's exactly what boosts LW's render times. It's not really an issue how big your integers or address pointers can be, until you need more than 4GB of RAM or your integers get so big they over- or underflow.

You should really read the "Introduction to 64-bit Computing and x86-64" article from my link above, to understand why there is no magical doubling of performance from 64-bit alone...

  11. #11
    Registered User
    Join Date
    Feb 2003
    Location
    NJ
    Posts
    303

    Not *just 64-bit!

    Arnie, keep in mind that the 970 brings more than just 64-bit to the table. There are a host of other improvements -- the bus being just one of them. Anyway, the "double precision" and AltiVec discussion seems to be surfacing again, so perhaps we should revisit that for a moment.

It's my understanding that double precision buys you something like an extra 29 bits of precision. 2^29 is about 5*10^8. Therefore, double precision can tolerate about half a billion times more accumulated error before it reaches some absolute error threshold beyond which there would simply be too much error in the calculation. In other words, double precision hides a lot of code-slop and error.

The question is.... Is Lightwave *actually* doing 1 million calculations on a pixel before it reaches the screen? If so, then perhaps they do *need* double precision. For comparison, a single-precision float is only accurate to about 1 part in ~1.7*10^7. Still, many I've talked to ask whether they actually *need* to do 1 million calculations on it to produce the desired result (i.e., photo-realistic rendering).

    For clarity, I'll "repost" a snippet from another NewTek forum that dealt with the topic. Beam might even remember it...

    Here is what was posted:

    [[[Q: Is an updated double precision-centric AltiVec unit the way to go?

    A: No.

    This is why:

    The vector registers have room for four single precision floats to fit in each one. So for single precision, you can do four calculations at a time with a single AltiVec instruction. AltiVec is fast because you can do multiple things in parallel this way.

    Most AltiVec single precision floating point code is 3-4 times faster than the usual scalar single precision floating point code for this reason. The reason that it is more often only three times faster and not the full four times faster (as would be predicted by the parallelism in the vector register I just mentioned) is that there is some additional overhead for making sure that the floats are in the right place in a vector register, that you don't have to deal with in the scalar registers. (There is only one way to put a floating point value in a scalar register.)

    Double precision floating point values are twice as big (take up twice as many bytes) as single precision floating point values. That means you can only cram two of them into the vector register instead of four. If our experience with single precision floating point translates to double precision floating point, then the best you could hope to get by having double precision in AltiVec is a (3 to 4)/2 = 1.5 to 2 times speed up.

    Is that enough to justify massive new hardware on Motorola's or Apple's part?

    In my opinion, no.

    This is especially true when one notes that using the extra silicon to instead add a second or third scalar FPU could probably do a better job of getting you a full 2x or 3x speed up, and the beauty part of this is that it would require absolutely no recoding for AltiVec. In other words, it would be completely backwards compatible with code written for older machines, give *instant speedups everywhere* and require no developer retraining whatsoever. This would be a good thing.

    Even if you still think that SIMD with only two way parallelism is better than two scalar FPU's, you must also consider that double precision is a lot more complicated than single precision. There is no guarantee that pipeline lengths would not be a lot longer. If they were, that 1.5x speed increase might evaporate -- quickly.

    Yes, Intel has SSE2, which has two doubles in a SIMD unit. Yes, it is faster -- for Intel. It makes sense for Intel for a bunch of reasons that have to do with shortcomings in the Pentium architecture and nothing to do with actual advantages with double precision in SIMD.

    To begin with Intel does not have a separate SIMD unit like PowerPC does. If you want to use MMX/SSE/SSE2 on a Pentium, you have to shut down the FPU. That is very expensive to do. As a work around, Intel has added Double precision to its SIMD so that people can do double precision math without having to restart the FPU. You can tell this is what they had in mind because they have a bunch of instructions in SSE2 that only operate on one of the two doubles in the vector. They are in effect using their vector engine as a scalar processing unit to avoid having to switch between the two. Their compilers will even recompile your scalar code to use the vector engine in this way because they avoid the switch penalty.

    Okay, so Intel has double precision in their vector unit and despite what I have said, you still think that is absolutely wonderful. But do they *really* have a double precision vector unit? The answer is not so clear.

    Their vector unit actually does calculations on the two doubles in the vector in a similar "one at a time fashion" to the way an ordinary scalar unit would. They only can get one vector FP op through [every two cycles] for this reason. AltiVec has no such limitation!

    AltiVec can push through one vector FP op per cycle, doing four floating point operations simultaneously (up to 20 in flight concurrently). AltiVec also has a MAF core, which in many cases does two FP operations per instruction. This is the reason why despite large differences in clock frequency, AltiVec can meet and often beat the performance of Intel's vector engine.

    The other big dividend that they get from double precision SIMD is the fact that they can get two doubles into one register. When you only have eight registers this is a big deal! [PowerPC has 32 registers for each of scalar floating point and AltiVec!] In 90% of the cases, we programmers don't need more space in there and the registers the PPC provides are just fine.

    Simply put, (from a developers position) we just don't need double precision in the vector engine, and we wouldn't derive much benefit from it if we had it. The worst thing that could possibly happen for Mac developers is that we get it, because that would mean that the silicon could not be used to make some other part of the processor faster and more efficient, and a lot of code would need to be rewritten for little to no performance benefit. It wouldn't be a logical tradeoff.

    *The only way this would be worthwhile would be to double the width of the vector register so that we get 4x parallelism for double precision FP arithmetic.

    And with respect to 3D apps *requiring* double precision...

    Most 3D rendering apps do not NEED double precision everywhere. They just need it in a few places, and often (if they *really* decide to look) they may find that there are more robust single-precision algorithms out there that would be just as good. In the end they should be using those algorithms anyway, because the speed benefits for SIMD are twice as good for single precision than they are for double precision.

    Apps like this can get a lot more mileage out of the PowerPC if they just increase the amount of parallelism as much as possible in their data processing. Don't just take one square root at a time, do four etc. And this isn't even taking into account multiprocessing just yet or even AltiVec for that matter -- the scalar units alone, by virtue of their pipelines, are capable of doing three to five operations simultaneously! However if you don't give them 3-5 things to do at every given moment, this power goes unused.

    Unfortunately, this can be noticed in quite a few Mac applications already on the market where performance doesn't seem to be as solid as it should be. What is baffling is why many Mac developers aren't taking advantage of this power. What it boils down to is that most of these apps just do one thing at a time (for the most part), and in turn are wasting 60-80% of the CPU cycles. That's a lot of waste. What's nice is that the AltiVec unit is also pipelined, so it is important to do a lot in parallel there too. The only problem is that developers actually have to make a conscious effort to use the processor the way it was designed to be used. ]]] - (Anonymous source)

    --
    Ed M.

  12. #12
    ShortsightedSithAssassin
    Join Date
    Feb 2003
    Location
    UK
    Posts
    1,948
    AAAAAAARRRRRGHHHHH!!!!

    My head hurts

  13. #13
    Inquisitor
    Join Date
    Feb 2003
    Location
    Bristol, UK
    Posts
    230
    Arnie, congratulations on joining the luxology team, I hope you still hang around these forums from time to time as your input and knowledge has been invaluable.

  14. #14

    the new wave macs?

    I wonder whether the new Macs (sporting the 970) will be killers, or simply maimers...

    The first PowerMacs were sort of like speed-bumped Quadras in performance...the first G4s weren't quite what they ought to be.

    Seems the box surrounding the chip often has to catch up to the chip's potential, or maybe Apple wants to use leftover parts before designing and selling brand-new parts which better unleash the new chip.

    Could this be the case when the 970 hits? Yes, more powerful chip, but what if it's strapped into a VW Beetle?

    J
    _______________________________________
    Discuss sustainability at www.ThinkPlan.org

  15. #15
    @Ed M.

    interesting posting...
I don't know how many double-precision SSE2 instructions LW uses on P4s, but SSE2 definitely helps a lot, even though there's no dedicated pipeline for it!

I don't think adding more FPU units will really speed up anything; the articles I read stated that it's a big challenge to keep multiple execution units busy with a single thread. Actually, it's THE challenge, and the reason why vector units and simultaneous multithreading (Intel's Hyper-Threading is one approach) were invented in the first place...

But from a quick search about basic raytracing techniques, it seems that you really don't need double precision for many calculations, though I don't think you can abandon it completely.

All the speculation aside, let's just wait and see; it doesn't matter how something performs if you can't buy it yet.

