[SATLUG] Create High Availability SAN from scratch
jaret at aberlorn.com
Mon Jun 14 18:37:52 CDT 2010
John Pappas wrote:
> Complex question, but:
> On Mon, Jun 14, 2010 at 15:04, jaret <jaret at aberlorn.com> wrote:
>> First off, I have two servers (calling them HA-1) in a heartbeat
>> configuration with DRBD running to keep data in sync. That way in case one
>> server fails, the second server kicks in and all is good. This is high
>> availability. Each computer has 8 1T hot-swappable drives with Raid 1
>> applied. So, there's 8T total but with Raid 1, the server has 4T available
>> to use. Question: is Raid 1 needed since I have DRBD running? (reference:
> Disk layout would be dependent on I/O needs, but I do not think that Raid1
> (or more accurately RAID 0+1 or 10 given that with more than 2 drives it
> would be unusual and unnecessary to run 8 mirrors, rather you would run 4
> striped mirrored pairs or 2 mirrored stripes) is the best/most efficient,
> given that you are already mirroring via DRBD. I would think that a Raid5
> or 6 would be a better use of space, and performance would probably be good
> enough for most applications.
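If RAID5/6 under DRBD is the direction, the local array could be built with mdadm and then handed to DRBD as its backing device. A rough sketch, assuming the eight drives appear as /dev/sdb through /dev/sdi (device names and the resource file path are assumptions; check yours with lsblk):

```shell
# Build one RAID6 array from the 8 hot-swap drives: 6 data + 2 parity,
# giving ~6T usable instead of the ~4T a RAID10 layout would leave.
mdadm --create /dev/md0 --level=6 --raid-devices=8 /dev/sd[b-i]

# Watch the initial resync progress.
cat /proc/mdstat

# Then point the DRBD resource at the array as its lower-level device,
# e.g. in /etc/drbd.d/r0.res:
#     disk /dev/md0;
```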
> Given that you are building an iSCSI SAN (rather than FibreChannel,
> Infiniband, or SAS), DRBD is only 1/2 of the HA equation. You also need a
> way for the clients to seamlessly switch from one host to another. DRBD
> only takes care of the data, not host access. You will need to equip the
> cluster with a "heartbeat" in order to provide a virtual IP that the client
> hosts use for access, so that in the case of cluster node failure both the
> data (DRBD) and host access/iSCSI service (heartbeat) move to another
> available cluster.
Maybe DRBD in conjunction with the Pacemaker cluster manager, which
supersedes Heartbeat 2. http://www.drbd.org/docs/about/
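To give the initiators one portal address that follows the active node, Pacemaker can manage a floating IP alongside the iSCSI target service on the DRBD primary. A minimal sketch using the pcs shell (the resource names, IP address, and systemd unit name are assumptions; the target daemon varies by distro):

```shell
# Floating IP that iSCSI initiators connect to; it moves on failover.
pcs resource create san_ip ocf:heartbeat:IPaddr2 \
    ip=192.168.1.100 cidr_netmask=24 op monitor interval=30s

# iSCSI target daemon (unit name depends on your target stack).
pcs resource create san_tgt systemd:tgtd op monitor interval=30s

# Keep the IP and the target service on the same node, and start
# the target before advertising the IP.
pcs constraint colocation add san_ip with san_tgt INFINITY
pcs constraint order san_tgt then san_ip
```

A further colocation/order constraint would tie both to the DRBD master role so data and access always fail over together.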
>> On the HA-1 servers, I make these iSCSI targets. Then from a client
>> computer (the iSCSI initiator), I create the appropriate file systems on
>> HA-1 and create a LVM logical volume.
> Data is HA, but overall system is not classical HA as per earlier
>> Now, my storage needs increase and HA-1 is nearing capacity. I would then
>> take two new servers and create HA-2, in the same manner I did with HA-1
>> previously. On the client computer, I would create appropriate file systems
>> on HA-2 and then expand the existing LVM to include HA-2. In this way, the
>> client sees the target as one physical volume, which would be 8T of usable
>> drive space. Is this the right job for LVM or should I be using different
> If you use different VGs then the system will not completely fail, but in
> the remote case of a cluster failure, the LVs could go offline; so the
> definition of HA would have to be defined based on your requirements.
Is the remote case the 0.001 of the 99.999% HA uptime disclaimer,
assuming I use a proper cluster manager and everything else is set up
correctly within the cluster?
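Mechanically, growing the client's volume onto HA-2 would look something like this from the initiator side. A sketch, assuming the new target logs in as /dev/sdc, the volume group is named san_vg, and the filesystem is ext4 (the portal address and all names are assumptions):

```shell
# Discover and log in to the new HA-2 portal.
iscsiadm -m discovery -t sendtargets -p 192.168.1.101
iscsiadm -m node -p 192.168.1.101 --login

# Turn the new LUN into a physical volume, add it to the existing
# volume group, then grow the logical volume and the filesystem.
pvcreate /dev/sdc
vgextend san_vg /dev/sdc
lvextend -l +100%FREE /dev/san_vg/data
resize2fs /dev/san_vg/data   # ext4 supports online grow
```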
>> The current setup has a network transfer rate of 1000Mbps. If I wanted to
>> increase this speed, I would put fibre network adapters using pci-express in
>> the servers and hook them up to a fibre switch. Or I could do point-to-point
>> to save cash but would probably be more difficult to administer. Any
>> recommendations on going the fibre route? (good-bad experience with
> I think that you are mixing protocols as the use of fiber optics does not
> necessarily predicate higher speeds, and FibreChannel is a completely
> different beast than iSCSI (or more accurately TCP/IP).
True. Learning about this opened up a different world than I'm used to.
I'm still fuzzy on definitions, coming from TCP/IP land. Is there a
better way to think of a switched fabric than as an Ethernet switch? In
my programming tools, can I still use the normal toolsets to open
sockets, read bytes, transmit, etc., over Fibre Channel, and will my
inputs/outputs work? Is it only when I go lower in the communication
layers that I'd run into problems and need to understand the protocols?
Any recommended reading / videos out there? I watched some OEM vendor
videos, but they were more product promotion and less technical how-to.
> Given the falling cost of 10GBE, that will make your discussion more
> interesting, but on a budget, the best way to increase iSCSI throughput is
> to team 1GBE adapters (presuming multiple client access, otherwise you are
> still limited to the max throughput of your fastest single adapter).
Hadn't thought of a second 1GBE adapter.
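Teaming the gigabit ports with the Linux bonding driver might look like this using iproute2 (interface names and the mode are assumptions; 802.3ad/LACP needs switch support, balance-alb does not):

```shell
# Create a bond and enslave the two gigabit NICs.
modprobe bonding
ip link add bond0 type bond mode 802.3ad
ip link set eth0 down; ip link set eth0 master bond0
ip link set eth1 down; ip link set eth1 master bond0
ip link set bond0 up
ip addr add 192.168.1.50/24 dev bond0

# Caveat: a single TCP stream (one initiator session) still tops out
# at one link's speed; the aggregate gain shows up with multiple
# clients, or with iSCSI multipath (MPIO) across separate paths.
```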
> Regarding the buy vs build question: The OEMs have R&D budgets, supply
> chain, and support infrastructure advantages that you don't. The good side
> of that being that your build does not have to support those costs, but the
> other side of that token, they will have a fully tested product that will
> have much better integration than a DIY build (also potentially more costly
> depending on requirements). Again, depending on your budget and needs, that
> may or may not be an issue.
Good advice. Another option might not really be an option -- we could
use hosted services, but I'm worried that with data growth and usage on
the hosted servers it would cost more (in terms of downtime) to migrate
to in-house servers later. I'd rather start on an in-house footing,
either going with OEMs or DIY.