[SATLUG] Create High Availability SAN from scratch

jaret jaret at aberlorn.com
Mon Jun 14 18:37:52 CDT 2010


John Pappas wrote:
> Complex question, but:
>
> On Mon, Jun 14, 2010 at 15:04, jaret <jaret at aberlorn.com> wrote:
>
>   
>> First off, I have two servers (calling them HA-1) in a heartbeat
>> configuration with DRBD running to keep data in sync. That way in case one
>> server fails, the second server kicks in and all is good. This is high
>> availability. Each computer has 8 1T hot-swappable drives with Raid 1
>> applied. So, there's 8T total but with Raid 1, the server has 4T available
>> to use. Question: is Raid 1 needed since I have DRBD running?  (reference:
>> http://www.drbd.org/home/mirroring/)
>>
>>     
>
> Disk layout would be dependent on I/O needs, but I do not think that RAID 1
> (or more accurately RAID 0+1 or 10, given that with more than 2 drives it
> would be unusual and unnecessary to run 8 mirrors; rather, you would run 4
> striped mirrored pairs or 2 mirrored stripes) is the best/most efficient,
> given that you are already mirroring via DRBD.  I would think that a RAID 5
> or 6 would be a better use of space, and performance would probably be good
> enough for most applications.
>   
Ok. Thanks.
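
For what it's worth, RAID 5 across the eight 1T drives would leave
roughly 7T usable per node (6T with RAID 6), versus the 4T I get from
mirrored pairs, with DRBD still providing the node-to-node copy. As a
reference for myself, a minimal DRBD resource stanza for the HA-1 pair
might look something like this (hostnames, backing devices, and
replication addresses below are made-up placeholders, not our real
config):

  # /etc/drbd.d/r0.res -- illustrative sketch only
  resource r0 {
    protocol C;                   # synchronous replication between the pair
    on ha1-node-a {
      device    /dev/drbd0;
      disk      /dev/md0;         # local RAID 5/6 array backing the resource
      address   10.0.0.1:7788;    # dedicated replication link
      meta-disk internal;
    }
    on ha1-node-b {
      device    /dev/drbd0;
      disk      /dev/md0;
      address   10.0.0.2:7788;
      meta-disk internal;
    }
  }
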
> Given that you are building an iSCSI SAN (rather than FibreChannel,
> InfiniBand, or SAS), DRBD is only 1/2 of the HA equation.  You also need a
> way for the clients to seamlessly switch from one host to another.  DRBD
> only takes care of the data, not host access.  You will need to equip the
> cluster with a "heartbeat" in order to provide a virtual IP that the client
> hosts use for access, so that in the case of cluster node failure both the
> data (DRBD) and host access/iSCSI service (heartbeat) move to another
> available cluster node.
>
>
>   
Maybe DRBD in conjunction with the Pacemaker cluster manager, which 
supersedes Heartbeat 2.  http://www.drbd.org/docs/about/
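
If I read the docs right, the cluster glue would be something along
these lines in the crm shell: the DRBD resource as master/slave, plus a
floating IP and the iSCSI target service grouped so they fail over
together. All of the names, the lsb:iscsitarget agent, and the
addresses below are illustrative guesses on my part, not a tested
configuration:

  # crm configure   (sketch; type "commit" at the end to apply)
  primitive p_drbd_r0 ocf:linbit:drbd \
      params drbd_resource="r0" op monitor interval="30s"
  ms ms_drbd_r0 p_drbd_r0 \
      meta master-max="1" clone-max="2" notify="true"
  primitive p_vip ocf:heartbeat:IPaddr2 \
      params ip="192.168.1.100" cidr_netmask="24"
  primitive p_iscsi lsb:iscsitarget
  group g_san p_vip p_iscsi
  colocation c_san_on_drbd inf: g_san ms_drbd_r0:Master
  order o_drbd_before_san inf: ms_drbd_r0:promote g_san:start
  commit

The colocation and order constraints are what keep the virtual IP and
the target service on whichever node currently holds the DRBD Primary.
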

>> On the HA-1 servers, I make these iSCSI targets. Then from a client
>> computer (the iSCSI initiator), I create the appropriate file systems on
>> HA-1 and create a LVM logical volume.
>>
>>     
>
> Data is HA, but overall system is not classical HA as per earlier
> discussion.
>
>> Now, my storage needs increase and HA-1 is nearing capacity. I would then
>> take two new servers and create HA-2, in the same manner I did with HA-1
>> previously. On the client computer, I would create appropriate file systems
>> on HA-2 and then expand the existing LVM to include HA-2. In this way, the
>> client sees the target as one physical volume, which would be 8T of usable
>> drive space. Is this the right job for LVM or should I be using different
>> software?
>>
>>     
>
> If you use different VGs then the system will not completely fail, but in
> the remote case of a cluster failure, the LVs could go offline; so the
> definition of HA would have to be based on your requirements.
>
>
>   
Is the remote case the 0.001 of the 99.999% HA uptime disclaimer, 
assuming I use a proper cluster manager and everything else is set up 
correctly within the cluster?
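
For my own notes, the client-side growth steps when HA-2 comes online
would be roughly the following, assuming the new target shows up on the
initiator as /dev/sdd and the existing volume group and volume are
called vg_san/lv_data (all of those names are placeholders):

  pvcreate /dev/sdd                          # label the new iSCSI disk as a PV
  vgextend vg_san /dev/sdd                   # grow the existing volume group
  lvextend -l +100%FREE /dev/vg_san/lv_data  # grow the logical volume
  resize2fs /dev/vg_san/lv_data              # grow the ext3/ext4 filesystem
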
>> The current setup has a network transfer rate of 1000Mbps. If I wanted to
>> increase this speed, I would put fibre network adapters using PCI Express in
>> the servers and hook them up to a fibre switch. Or I could do point-to-point
>> to save cash, but that would probably be more difficult to administer. Any
>> recommendations on going the fibre route? (good-bad experience with
>> vendors/hardware?)
>>
>>     
>
> I think that you are mixing protocols, as the use of fiber optics does not
> necessarily imply higher speeds, and FibreChannel is a completely
> different beast than iSCSI (or more accurately TCP/IP).
>
>   
True. Learning about this opened up a different world than I'm used to. 
I'm still fuzzy on definitions coming from TCP/IP land. Is there a 
better way to think of a switched fabric than as an ethernet switch? In 
my programming tools, I assume I can still use normal toolsets to open 
sockets, read bytes, transmit, etc., over Fibre Channel and my 
inputs/outputs will work? Only when I go lower in the communication 
layer would I run into problems and need to understand the protocols. 
Any recommended reading / videos out there? I watched some OEM vendor 
videos, but it was more product promotion and less technical how-to.
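
If I understand the iSCSI side correctly, at least there my TCP/IP
mental model still applies: the target is exported over the ordinary IP
stack, and on the initiator it shows up as a plain SCSI block device
rather than something I open sockets to. A rough sketch with IET on the
server and open-iscsi on the client (the IQN, IP, and device names are
made up):

  # On the HA-1 node, in /etc/ietd.conf (iSCSI Enterprise Target):
  Target iqn.2010-06.com.example:ha1.san
      Lun 0 Path=/dev/drbd0,Type=blockio

  # On the client (open-iscsi initiator):
  iscsiadm -m discovery -t sendtargets -p 192.168.1.100
  iscsiadm -m node -T iqn.2010-06.com.example:ha1.san -p 192.168.1.100 --login

  # After login the target appears as an ordinary disk (e.g. /dev/sdd),
  # so normal filesystem and LVM tooling works on top of it unchanged.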

> Given the falling cost of 10GBE, that will make your discussion more
> interesting, but on a budget, the best way to increase iSCSI throughput is
> to team 1GBE adapters (presuming multiple client access, otherwise you are
> still limited to the max throughput of your fastest single adapter).
>
>   
Didn't think of a second 1GBE adapter.
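
If I go that route, a Debian-style bonding stanza for the iSCSI-facing
ports might look roughly like this (interface names, addresses, and the
bonding mode are guesses; the right mode depends on what the switch
supports, and the exact option names vary with the ifenslave version):

  # /etc/network/interfaces (illustrative sketch only)
  auto bond0
  iface bond0 inet static
      address 192.168.1.10
      netmask 255.255.255.0
      slaves eth1 eth2
      bond-mode balance-alb     # or 802.3ad if the switch does LACP
      bond-miimon 100
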
> Regarding the buy vs build question:  The OEMs have R&D budgets, supply
> chain, and support infrastructure advantages that you don't.  The good side
> of that is that your build does not have to support those costs, but on the
> other side of that token, they will have a fully tested product that will
> have much better integration than a DIY build (also potentially more costly
> depending on requirements).  Again, depending on your budget and needs, that
> may or may not be an issue.
>   
Good advice. Another option might not really be an option: we could use 
hosted services, but I'm worried that with data growth and usage on the 
hosted servers it will cost more (in terms of downtime) to migrate to 
in-house servers later. I'd rather start on an in-house footing, either 
going with OEMs or DIY.
> HTH,
> John
>   


