[SATLUG] Software RAID suggestions
j at jvpappas.net
Tue Feb 23 10:32:24 CST 2010
This is my trimmed (actual email was over 12K) post that included my layout
and was bounced. If any of you want that verbose mail, let me know and I
will send explicitly off list.
On Tue, Feb 23, 2010 at 10:26, John Pappas <j at jvpappas.net> wrote:
> On Mon, Feb 22, 2010 at 02:44, FIRESTORM_v1 <firestorm.v1 at gmail.com> wrote:
>> Hello Everyone:
>> I have a question concerning software RAID. Unfortunately I'm cursed
>> with the Promise TX4310 "fake" raid card and am wanting to separate
>> the RAID array (w/controller) from my gaming rig in an effort to cut
>> down on power and with the fact that I recently discovered XBMC (FTW!)
> Kernel RAID tools are very mature, and other than a couple (mostly very low
> level) idiosyncrasies, very stable. I have been running a mirrored OS
> (2x250GB, md0/1/3, boot/root/swap respectively), 5xR5 (5X500GB md2) Data
> setup for years and have not lost data, even across upgrades and distro
> changes. I even run LVM on top of those md (except md0=/boot). I have
> occasionally run into an issue where I had to resync the drives even though
> there was no actual "failed" or bad drive. With a boot MD there is a chance
> that the wrong (errored or bad) physical drive will get booted. That is a
> simple repair with a boot CD, since the MD is then no longer the boot volume.
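The inspect-and-resync routine described above can be sketched with stock mdadm commands; the device names here are hypothetical, and all of this needs root and a real md array:

```shell
# Check overall array state; a degraded two-way mirror shows [U_] instead of [UU]
cat /proc/mdstat

# Detailed status of one array (member list, sync state, event counts)
mdadm --detail /dev/md1

# Re-add a member that dropped out without actually failing;
# the kernel then resyncs it against the surviving half
mdadm /dev/md1 --add /dev/sdb2
```
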
>> My experience with software RAID in Linux is many years old and did
>> not end on a good note, and I was hoping someone here had some good
>> suggestions/stories/pitfalls that they could share with me. From
>> research I've done with this particular RAID card, the best thing to
>> do is to set it for JBOD mode (4x 750GB) and then use the Linux RAID
>> tools to build a software RAID set. I plan on using the same setup as
>> currently deployed with a RAID 5 configuration.
> I have had no devastating problems (related to SW-RAID :), and all the
> others have been surmountable with a little research and planning. I have
> run into a couple of issues (no data effect) that were related to the RAID
> superblock version, specifically 0.9 has the system ID embedded in the GUID,
> while 1.0+ has a host field that holds the system ID. This matters when the
> MDs are numbered, as "foreign" mds are numbered from 126, so I mysteriously
> got md126 and md127 after an upgrade, and could not for the life of me
> determine how to get mdadm to use the numbers that I was explicitly
> assigning to the md via GUIDs in the /etc/mdadm.conf file. Once I updated
> the GUIDs on the 0.9's and hostname on the 1.0+ they became "local" md
> devices and numbering worked as expected.
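A minimal sketch of that fix, assuming a 1.x-superblock array that came up as md127; the UUID is a placeholder, not a real array's:

```shell
# /etc/mdadm.conf pins names to UUIDs (placeholder UUID shown):
#   HOMEHOST <system>
#   ARRAY /dev/md2 UUID=00000000:00000000:00000000:00000000

# Stamp the running hostname into the superblock so the array assembles
# as "local" and keeps its assigned number instead of md126/md127
mdadm --stop /dev/md127
mdadm --assemble /dev/md2 --update=homehost --homehost="$(hostname -s)" /dev/sd[bcdef]1
```
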
> I would use a partition for the md (as with LVM; i.e. /dev/sdb1 rather than
> /dev/sdb directly), as it prevents someone from thinking that the drive is
> empty, and facilitates auto-discovery of md/pv data through the partition
> type (fd or 8e respectively).
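Following that advice, the JBOD-to-kernel-RAID setup the original poster describes might look like this; disk names are hypothetical, and running it would wipe those disks:

```shell
# Partition each disk with one full-size partition of type fd
# (Linux raid autodetect); old-style sfdisk input: start,size,type
for d in /dev/sd[bcde]; do
    echo ',,fd' | sfdisk "$d"
done

# Build a 4-disk RAID5 from the partitions, then watch the initial build
mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]1
watch cat /proc/mdstat
```
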
> Those 750's will take a REALLY long time to rebuild, especially if there is
> only one CPU core on the system or if the system is busy. Those XOR calcs
> take time, as the CPU has to do them, as opposed to a HW RAID controller.
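The kernel does let you trade foreground I/O for rebuild speed via two sysctls; the values below are illustrative, not recommendations:

```shell
# Current rebuild throttles, in KB/s per device
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

# Let an otherwise idle box rebuild faster (example values)
sysctl -w dev.raid.speed_limit_min=50000
sysctl -w dev.raid.speed_limit_max=200000
```
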
>> I plan to boot from a dedicated hard drive not part of the 4 drive set
>> and want this to be as good of a system as I can make it without
>> having to worry about losing my data again.
> A HW RAID card with BBU (battery-backed cache unit) will be the most
> resilient, as that is one of the best ways to plug the RAID5 hole (other
> than not use R5). Without that expense, I would say that kernel RAID would
> be the next best thing. I would also contend that kernel RAID is even
> better than the "hybrid" or "fake" (parity is driver calculated, rather than
> hardware calculated) RAID, as the tools are built into the OS, rather than
> having the reliance on the driver; not to mention much more portable and
> well documented.
>> Unfortunately, the last time I tried this was with IDE drives and when
>> one went out, the entire array died and was unrecoverable. The array
>> would not even work in "degraded" mode to allow me to salvage my data.
> Without having an exact and detailed rundown on what happened, I cannot
> accurately comment on your perception of kernel RAID's resilience; but
> SATA's native hot-swap capability will alleviate the post-failure
> replacement issue and some of the other Hardware level issues that probably
> contributed to your event.
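For what it's worth, the hot-swap replacement cycle mentioned above is only a few commands; device names are hypothetical:

```shell
# Mark the bad member failed and pull it out of the array
mdadm /dev/md2 --fail /dev/sdc1 --remove /dev/sdc1

# Physically swap the disk (SATA hot-swap), partition it to match the
# old one, then add the new partition; the rebuild runs in the background
mdadm /dev/md2 --add /dev/sdc1
cat /proc/mdstat
```
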
>> A lot of the research I've done in regards to Linux support for this
>> card has people saying that it works, but they never come back to say
>> how the performance or the longevity of the array is, fault recovery,
> As long as the controller does not do anything "under the sheets" to a JBOD
> disk, then kernel RAID works great. If the controller tries to be smart (or
> the disk was at one point a member of a RAID volume controlled by that
> controller) then there can be gotchas.
>> I appreciate your insight and any information you can provide me
> Keep me apprised of your decision, and I have included my layout below: