
A question about RAID storage



mikala
04-06-2012, 07:59 PM
So I bought a RAID box to fill with HDs and store vast amounts of data on.
My question is... the box says it is made to do RAID 0 through 5, so what is the best RAID setting/type to use?
I know everyone has their opinion, but I'd like to hear the pros and cons of each type of RAID from you and see what suits me best. After seeing what others have encountered in their RAID travels I'll hopefully end up with one that is optimal for my use.
Thanks in advance for any and all replies.

Dexter2999
04-06-2012, 08:26 PM
http://www.youtube.com/watch?v=LDoGdvPcM1w
Explains it all.

I should point out that you'll want to jump ahead 28 minutes.

mikala
04-06-2012, 09:06 PM
Too cool! Thank you!
Just watched it. A lot to think about.
High transfer speed or redundancy?
RAID 5 is looking pretty good at the moment.

Hieron
04-07-2012, 07:56 AM
RAID 10 is our preferred mode here.
We're combining it with 10GbE connections at the moment and working straight off the redundant (= not backup! :)) fileserver.

Lightwolf
04-07-2012, 08:28 AM
High transfer speed or redundancy?
RAID 5 is looking pretty good at the moment.
Well, RAID 5 gives you both; RAID 6 is a good option if you have it.

RAID 5 allows one drive to fail, RAID 6 allows two, and RAID 10 one or two depending on which ones (effectively that's one drive if you consider the worst case), but it doesn't have the performance of RAID 5.
And remember, redundant storage is not backup.
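
A quick Python sketch (just an illustration - the two mirror pairs, drives 0+1 and 2+3, are an assumed layout) shows why RAID 10's tolerance depends on which drives fail:

from itertools import combinations

pairs = [(0, 1), (2, 3)]                              # assumed mirror pairs in a 4-drive RAID 10
for failed in combinations(range(4), 2):              # every possible pair of failed drives
    dead = any(set(p) <= set(failed) for p in pairs)  # did both halves of one mirror go?
    print(failed, "array lost" if dead else "array survives")
# 4 of the 6 two-drive failure patterns survive, 2 don't - so in the worst case
# you can still only count on surviving a single failure.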

Cheers
Mike

Hieron
04-07-2012, 01:02 PM
10 for one or two, depending on which ones (effectively that's one drive considering the worst case) but doesn't have the performance of 5.

Cheers
Mike

I've seen a lot of varying reports on that... depends on hardware and usage scenario, I guess.

Lightwolf
04-07-2012, 01:52 PM
seen alot of varying reports on that... depends on hardware and usage scenario I guess..
Unless it's a cheap NAS or a really bad driver, RAID-5 is faster than RAID 10 for the same number of drives.
RAID 10 needs a minimum of 4, but only two of those actually store unique data. So at best you get the throughput of two drives (and also the storage of two).
RAID-5 with four drives gives you roughly the throughput of 3 (as well as the storage of 3).
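
Back-of-the-envelope, assuming ideal striping and that mirror reads don't overlap (the per-drive speed and size below are made-up placeholders):

n, speed_mbs, size_tb = 4, 150, 2.0   # drive count, MB/s each, TB each - all hypothetical
print("RAID 10: capacity", n // 2 * size_tb, "TB, throughput ~", n // 2 * speed_mbs, "MB/s")
print("RAID 5 : capacity", (n - 1) * size_tb, "TB, throughput ~", (n - 1) * speed_mbs, "MB/s")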

Cheers,
Mike

mikala
04-07-2012, 01:55 PM
R5 seems to be the one for me. Not building a huge array of drives so speed and storage are the big things for me.
Backup will be a separate thing.

Hieron
04-07-2012, 01:59 PM
Afaik, a good RAID 10 setup can read from all n drives. It can write at only the speed of n/2, but it does not need to do anything parity-related, which can get really slow at small file sizes. Like, reallllly slow. When a disk dies, performance of the array goes down a lot too. A 6-disk RAID 10 array may be able to lose 3 drives and still live (the mirrors). Small chance of course, but still.
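
Counting it out for a 6-disk RAID 10 (three mirror pairs assumed: drives 0+1, 2+3, 4+5), a little Python sketch shows how often three simultaneous failures still leave the array alive:

from itertools import combinations

pairs = [(0, 1), (2, 3), (4, 5)]                # assumed pairing
patterns = list(combinations(range(6), 3))      # all ways three drives can fail
alive = sum(not any(set(p) <= set(f) for p in pairs) for f in patterns)
print(alive, "of", len(patterns), "three-drive failure patterns survive")  # 8 of 20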

Indeed, the storage space is n/2, but disk cost is a non-issue for us in this matter.

Anyway, we're just redoing our new Linux fileserver, which holds six 3TB disks and a hot spare. We'll probably test RAID 5 and 10, to be sure. (Connected by 10GbE to the workstations.)

mikala
04-07-2012, 02:02 PM
Another question.
Are there any SSD RAID systems?

Lightwolf
04-07-2012, 02:11 PM
afaik, a good raid 10 setup can read from all n drives.
Theoretically yes. But I've never seen a RAID-1 that overlaps reads either. And indeed, it would also make little sense to read alternating blocks from the different mirror drives for sequential reads of contiguous files.

It can write with only the speed of n/2, but it does not need to do anything parity related which can get really slow at small file sizes. Like, reallllly slow.
I've always had good results with small files in the past on RAID-5.

When a disk dies, performance of the array goes down alot too. A 6 disk raid 10 array may be able to lose 3 drives and still live (the mirrors). Small chance ofcourse, but still.
That's what I meant earlier. If you are cautious and think of the worst case (which is imho the only case to think about when it comes to data integrity), then it's still only one drive that can go down.


anyway, just redoing our new Linux fileserver, which holds 6 3TB disks and a hotspare. We'll probably test raid 5 and 10, to be sure. (connected by 10 Gbe to workstations)
If the drivers are decent I'd expect something like 500MB/s on Raid-5 and 300MB/s on Raid-10.
Another option would be RAID-50 - but I've never really used it.
It also seems that there's currently a trend toward providing redundancy at the file-system level instead.
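
One rough reading of where those figures would come from - assuming ~100 MB/s per spindle, ideal striping and no overlapping mirror reads (all assumptions, of course):

drives, per_drive_mbs = 6, 100                            # six data drives, guessed per-drive speed
print("RAID-5 :", (drives - 1) * per_drive_mbs, "MB/s")   # one drive's worth goes to parity
print("RAID-10:", (drives // 2) * per_drive_mbs, "MB/s")  # only half the drives hold unique data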

Cheers,
Mike

Hieron
04-07-2012, 02:12 PM
Another question.
Are there any SSD RAID systems?


Sure, why not..


If the drivers are decent I'd expect something like 500MB/s on Raid-5 and 300MB/s on Raid-10.
Another option would be RAID-50 - but I've never really used it.
It seems that currently there's a lot of trends that go toward providing redundancy on a file system level instead as well.


kk, it will take a while to test... if it is truly that much faster I'll reconsider. Still, I'm not too happy with all the mentions of slow rebuild speeds on RAID 5, considering that the array will be 15TB at that point. I also wonder if it will go through the 10GbE line at those speeds as well; there's really little info to be had on 10GbE... We've got a pretty good Areca card, but we're actually planning on doing it in software, on a decent system.

Lightwolf
04-07-2012, 02:20 PM
kk, will take a while to test.. if it is truly that much faster I'll reconsider. Still, not too happy with all the mentioning of slow rebuild speeds on raid 5, considering that the array will be 15TB at that point. Also wonder if it will go through the 10 Gbe line at those speed as well, really little info to be had on 10Gbe... Got a pretty good Areca card, but we're actually planning on doing it in software.
I'd check the Areca first... it does keep the RAID config independent from the drivers.
Slow rebuilds are indeed a problem. But that's proportional to the drive sizes in general. Rebuilding a mirror won't be faster either if the same amount of data needs to be checked.
I could imagine that eventually protocol overhead (the file sharing protocol) starts to become an issue. So it depends on what OS you share to and what protocol you use.
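
To put a rough number on it (the ~100 MB/s sustained rebuild rate is just a guess):

capacity_tb, rate_mbs = 3, 100                 # one 3 TB member, assumed rebuild rate
hours = capacity_tb * 1e12 / (rate_mbs * 1e6) / 3600
print(f"~{hours:.1f} h per rebuilt drive")     # roughly 8 hours - mirror or parity alike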

Cheers,
Mike

Hieron
04-07-2012, 02:27 PM
I was more worried about depending on the Areca hardware than on a Linux/driver combo... but I will surely check. I thought rebuilding a mirror is easier and faster; it basically just copies the data from the surviving drive of the pair, so there is nothing to check really?

Regarding protocols: it is Linux sharing to Win7, but there I have absolutely no idea... really happy someone who actually does have some idea is on that.

But seeing that importing image sequences into AE over a 1 Gb/s link is quite a bit slower than doing the same thing from any local HD is a tad worrying. I'll have to just bite the bullet and try.

Lightwolf
04-07-2012, 02:36 PM
Was more worried on depending on the Areca hardware than on a Linux/driver combo.. . But will surely check. I thought rebuilding the mirror is easier and faster, it basically just copies the mirror from the surviving drive of the pair, there is nothing to check really?
There isn't that much more to check on a RAID-5 either... current CPUs can easily handle those computations.
One advantage of the Areca is also that the RAID functionality is independent of the integrity of your RAM on the server. If that runs with ECC though, then it should be fine.


Regarding protocols: it is Linux that shares to Win7, but there I have absolutely no idea.. really happy someone who actually does have some idea is on that.
Hm, MS bumped their protocol up to SMB2, which is more efficient. But I think that requires SAMBA 4 on the Linux side as well (I'd have to check, it may have been backported to 3.x).

But seeing that using AE to import image sequences over 1 Gb/s is quite slower than doing the same thing on any HD is a tad worrying. Will have to just bite the bullet and try.
I haven't used a Linux server for ages, so I wouldn't know what to optimise on that side.
Fusion allows for automatic local caches, something that I use for more elaborate gigs.

Cheers,
Mike

Hieron
04-07-2012, 02:45 PM
kk, thanks. Now only need time to test and implement it...

Lightwolf
04-07-2012, 02:52 PM
kk, thanks. Now only need time to test and implement it...
I just had a look... Samba 3.6 and higher supports SMB2.

Cheers,
Mike

Hieron
04-07-2012, 03:07 PM
Ah ok, thanks. Will be interesting to see how it turns out..

Lightwolf
04-07-2012, 05:21 PM
Ah ok, thanks. Will be interesting to see how it turns out..
I expect a full report once you've tested the set-up... :hey:

Cheers,
Mike

Hieron
04-08-2012, 02:11 PM
:) will try at least

dsol
04-09-2012, 08:09 AM
I have a pretty sweet RAID setup here. I'm using a Promise 2314 Raid-5 E-Sata controller card (in my 2008 MacPro), hooked up to two PROAVIO 10-bay enclosures. Performance is solid, though not great for uncompressed HD, but it was relatively cost-effective to put together and I haven't (fingers crossed) lost any data for the last 4 years I've had it running. It also means I have - what feels like - almost unlimited storage now (about 14TB across those 2 enclosures).

The new Promise Thunderbolt enclosures are pretty nice too. Not cheap, but definitely worth considering if you have a Thunderbolt connector on your Mac or PC (rare on PCs at the moment, until Intel releases Ivy Bridge).

Hieron
05-01-2012, 05:39 AM
I'd check the Areca first... it does keep the RAID config independant from drivers.
Slow rebuilds are indeed a problem. But that's proportional to the drive sizes in general. Rebuilding a mirror won't be faster either if the same amount of data needs to be checked.


The Areca seems troublesome; we had some issues with it. We moved to Linux software RAID via the Areca and then moved it to onboard SATA without any hitch; the RAID was still completely fine.

We'll scrap the hardware RAID option and the Areca, as it is slower and seems to run into an issue when I/Os get very heavy. Being able to move a RAID array over to other controllers (moving it from the Areca to onboard means moving it from 1 to 3 separate controllers!) and having it still work fine is quite awesome. (And nice to know that we won't be dependent on the hardware.)

Afaik RAID 10 is faster, but that's just from info on the webz... so who knows. RAID 5 rebuilds are notoriously slow at least; we rebuilt the RAID 10 a couple of times when the Areca crapped out and it was quite fast (an hour or so).


Theoretically yes. But I've never seen a RAID-1 that overlaps reads either. And indeed, it'd also make little sense to read alternating blocks from the different mirror drives for continuous files.

We're currently close to hitting 1 GB/s (990, so close!) on six 3TB drives in RAID 10, with massive transfers, so caches can't be playing a major role. That is close to 6x the max transfer speed of each individual drive. Imho, it means the RAID 10 is reading from all drives at max speed. I did not expect it to be that ideal.

Mind you, it took our friendly but capable IT guy some effort; it doesn't do that out of the box.

Write speeds hit ±350 MB/s, which is about what you'd expect from effectively 3 drives without parity overhead.
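
Just to sanity-check the arithmetic on those measurements (the per-drive split is only an estimate, assuming all spindles contribute evenly):

read_mbs, write_mbs, drives = 990, 350, 6
print("effective per-drive read :", round(read_mbs / drives), "MB/s")          # ~165, near the drives' rated max
print("effective per-drive write:", round(write_mbs / (drives // 2)), "MB/s")  # ~117 across the 3 mirror pairs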


I've always had good results with small files in the past on RAID-5.

That's not my experience or what the webz says about it, but I guess experiences and expectations vary.


That's what I meant earlier. If you are cautious and think of the worst case (which is imho the only case to think about when it comes to data integrity), then it's still only one drive that can go down.

Sure, but if all else were equal, why not take the extra chance of things working... it doesn't hurt or anything.



If the drivers are decent I'd expect something like 500MB/s on Raid-5 and 300MB/s on Raid-10.
Another option would be RAID-50 - but I've never really used it.
It seems that currently there's a lot of trends that go toward providing redundancy on a file system level instead as well.

Cheers,
Mike

We're hitting far beyond that, and RAID 5 was loads slower. Again, for us. It's easy to get into endless debates, I guess, as a lot of things can be tweaked and set right or wrong.

Atm we're at ±1 GB/s read, 350 MB/s write. This is for sequential transfers. Everyone always assumes that RAID 5 has problems with small files, so no doubt I expect this setup to come out even further ahead under heavy small-file I/O.

We thought about file system redundancy, but I believe there were considerable downsides, so we decided not to. We're going for a good backup system. That together with the raid should cover it well enough.

Anyway, so far so good. The 1 GB/s read, 350 MB/s write is great. The problem now is that the new Intel 10GbE NICs are really hard to get here, so there's no way to get the data out of the fileserver at those speeds, which renders the speed pointless :)

Lightwolf
05-01-2012, 05:46 AM
Anyway, so far so good. The 1GB/s read, 350MB/s write is great. The problem now is that the new Intel 10Gbe nic's are really hard to get here. So no way to get the data out of the fileserver at those speeds, which renders the speed pointless :)
Thanks for the update. The numbers are interesting indeed.

Yeah, I've looked around for 10GbE NICs; they're certainly tricky to get.

Cheers,
Mike

Hieron
05-01-2012, 07:33 AM
Yeah.. sometimes you can see a couple at Newegg, but here nothing yet..
Perhaps demand was higher than they estimated... so much for the 10GbE revolution :)

Btw, raid values:
400MB/s write
416MB/s re-write
944MB/s read
950MB/s re-read

This is on 128 GB sequential transfer.

hazmat777
05-01-2012, 12:48 PM
I know this thread has gotten pretty advanced, so I thought I'd include a "cheat sheet" for people new to the concept of RAID. It's an old article, but was written by a fine PC company up here in the North West...

http://www.pugetsystems.com/labs/articles/RAID-Explained-24

mikala
05-01-2012, 02:16 PM
Hey the more info the better!

Rayek
05-01-2012, 02:22 PM
Another question.
Are there any SSD RAID systems?

Yes: even integrated ones, like the one I am using right now (Revodrive 3 x240GB). It is ridiculously fast (up to 1.5GB/s), but should be used only for performance server stuff or your system drive/current project work drive (in the latter case your geekness factor must be relatively high, such as mine ;-).

Still expensive, though. But very, very cool. Photoshop starts up in one second (first time). LightWave (both Modeler and Layout) starts up within 0.2 seconds - immediate feedback. One drawback: all other machines I work on outside my own office, such as at colleges, feel very slow indeed.

mikala
05-01-2012, 02:44 PM
I picked up an SSD cache drive and have another to install programs and OS on my P9X9 WS mobo. Hopefully it all works out.

Hieron
07-09-2012, 04:54 AM
I expect a full report once you've tested the set-up... :hey:

Cheers,
Mike

Will this do? :) (It took long for them to arrive, and one NIC was DOA.)

Over normal 1 Gb/s NICs: [benchmark screenshot, attachment 105445]

Over Intel X540 10GbE NICs: [benchmark screenshot, attachment 105446]

Same share on 6x3TB in RAID 10 (+1 hot spare), reading from all 6 heads. It required some tinkering (not by me, I know nothing of this stuff) to get it to work that fast, but now it easily surpasses our SSDs, also in actual normal usage scenarios.

Lightwolf
07-09-2012, 05:01 AM
Will this do? :) (took long for them to arrive, and 1 nic was doa)
Cool, thanks a lot. Nice read speeds indeed.

Cheers,
Mike

Red_Oddity
07-09-2012, 07:34 AM
When you do performance benchmarks, make sure the stream/file used for testing is LARGER than the on-board memory of the RAID card, and use direct I/O streaming.

Attached are the iozone benchmarks we ran on our Areca ARC-1880ix (CentOS 6.2, EXT4 filesystem; the device is mounted directly (not a partition), so there was no need for aligning), using RAID 60 and RAID 10 (the RAID 5 benchmark files got lost).
You have to look for the large speed dip in the results; this is where the card runs out of on-board memory. For read speeds, this is important.
For writing, the cache does most of the work (at least up to the point where the card's memory fills up and the disks can't keep up).

For more info on the settings we used and what the results mean i suggest checking the iozone site : http://www.iozone.org/
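
If you just want a very rough sanity check without iozone, a sketch along these lines (Linux-only; the mount path and sizes are placeholders, and it only approximates what proper direct I/O gives you) illustrates the "file bigger than the cache" point:

import os, time

PATH = "/mnt/raid/bench.tmp"   # hypothetical mount point on the array
CHUNK = 8 * 1024 * 1024        # 8 MiB per write/read
TOTAL = 32 * 1024**3           # 32 GiB - keep this well above RAM and controller cache

buf = os.urandom(CHUNK)

# Sequential write, fsync'ed so the page cache doesn't flatter the number.
fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
t0 = time.time()
for _ in range(TOTAL // CHUNK):
    os.write(fd, buf)
os.fsync(fd)
write_mbs = TOTAL / (time.time() - t0) / 1e6
os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)  # drop this file's cached pages
os.close(fd)

# Sequential read of the same file.
fd = os.open(PATH, os.O_RDONLY)
t0 = time.time()
while os.read(fd, CHUNK):
    pass
read_mbs = TOTAL / (time.time() - t0) / 1e6
os.close(fd)
os.unlink(PATH)

print(f"write ~{write_mbs:.0f} MB/s, read ~{read_mbs:.0f} MB/s")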

Hieron
07-10-2012, 01:50 AM
When you do performance benchmarks, make sure the stream/file that is used for testing is LARGER than the on board memory of the RAID card and use direct IO streaming......

edit: hey nice reel 2011.. nice work!

We did those tests you refer to on the RAID array itself, in Linux. Sure, buffering can play a role, and it works in our favour as well. However, here we are testing the Ethernet speed as much as the RAID. The 1 Gbit line was completely saturated, and the ATTO test is exactly what we see with any other sort of application: a fully maxed-out 1 Gbit line. The RAID array can easily keep up, of course.

So we put in 10GbE and hoped latencies would also be much better. If we test only the link speed, we do reach the rated 10 Gbit/s btw, depending on jumbo frame sizes and such.

We run no RAID card; it is software RAID on Linux, which is plenty capable. But yes, Linux has plenty of memory on that machine. If we turn off direct I/O in ATTO, we do get suspicious numbers. The numbers here are as expected imho and seem to match up well with the other drives, their rated speeds, and shares over 1 Gbit links.

Regardless, our reason to go this route was to get a way to work from a fileserver directly, as if it were local, instead of multiple workstations which have to back up to the FS etc. Currently, the FS over 10GbE outperforms loading/working from a local HD or a 4-disk RAID 10 array - something the 1 Gbit line never allowed.

It requires both a fast array in the fileserver and a fast link to it. I would probably not bet my life on the link being able to sustain 800 MB/s per se... but it doesn't really have to. And even if the system uses buffering, no problem I'd say.

lengthy explanation :)

Anyway, imho this is a really cost-effective way to allow people to work with massive amounts of files from a single fileserver at speeds that rival local arrays.