[SATLUG] 67 TB for less than $8K

Bruce Dubbs bruce.dubbs at gmail.com
Tue Oct 9 21:36:31 CDT 2012

Brad Knowles wrote:
> On Oct 9, 2012, at 6:16 PM, Bruce Dubbs <bruce.dubbs at gmail.com>
> wrote:
>> But the beauty of the solution is that parts are cheap.  There's no
>> expensive Ferrari engine or transmission.
> But the O&M costs are still very high, especially since the
> probability of failure skyrockets each time that a human being
> actually touches the device, or there is a drive failure and the
> array is put into degraded mode.
> Those are facts of life for anyone who is operating a drive array.
>> Also, the follow up report was that Hitachi drives failures were at
>> 1%, not 5%.  The failure rate included infant mortality, so the
>> burn in time provides a good screening of bad drives.
> The way I read the second post, the overall failure rate was 5%, and
> initial data indicated that they were seeing lower failure rates with
> the Hitachi drives but that they did not yet have enough experience
> with them to see what the overall lifetime failure rate might be.

The 5% was for all the drives since they started 4 years ago.  The 
article (Jul 2011) indicated less than 1% for the newer drives:

"We are currently seeing failures in less than 1 percent of the Hitachi 
Deskstar 5K3000 HDS5C3030ALA630 drives that we’re installing in pod 2.0."

They also say:  "We have yet to see any drives die because of old age" 
but that doesn't seem consistent with their statement of a 5%/year 
failure rate across their entire 9K drive inventory.
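
As a rough sanity check on those numbers, here is a minimal Python sketch 
of what the rates imply in drives per year.  The 9,000-drive fleet and the 
5%/year rate come from the article as quoted above.  Applying 1% to the 
whole fleet is purely hypothetical, since that figure was reported for one 
drive model and was not stated as an annual rate.  The function name is my 
own.

```python
def expected_failures_per_year(drive_count, annual_failure_percent):
    """Expected number of drive failures per year for a fleet.

    annual_failure_percent is a percentage (5 means 5%/year).
    """
    return drive_count * annual_failure_percent / 100


FLEET_SIZE = 9000  # "9K drive inventory" from the article

# 5%/year across the whole fleet, as the article states.
overall = expected_failures_per_year(FLEET_SIZE, 5)

# Hypothetical: what 1%/year would mean if it applied to every drive.
hypothetical = expected_failures_per_year(FLEET_SIZE, 1)

print(f"At 5%/year: about {overall:.0f} failed drives per year")
print(f"At 1%/year (hypothetical): about {hypothetical:.0f} per year")
print(f"At 5%/year that is roughly {overall / 365:.1f} drives per day")
```

Either way, at that fleet size someone is swapping drives regularly, so 
the replacement procedure matters as much as the raw failure rate.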

>> In the case of single unit, there are some places without
>> redundancy (e.g. the motherboard), but the power supplies are
>> redundant and a single unit could be set up with multiple RAID
>> arrays.
> There are two power supplies, but I don't think that they are
> actually redundant.  I think that's just how much power you have to
> provide for all the drives in the case.

That does appear to be the case, but I've never seen a disk drive that 
takes two power inputs.  That could be engineered around though.

The reliability factor really depends on the need.  There are some 
applications (e.g. some scientific research environments) where lots of 
disk space is needed, but occasional downtime is OK.

> Of course, you could set up multiple RAID arrays per device.  If it
> was me, I'd be doing four or five-disk RAID-6 with three (or six) hot
> spare drives per chassis (one or two per controller), but you lose a
> hell of a lot of storage that way.

Again, it depends on the need.  Reliability, speed, cost tradeoffs 
depend on the environment.
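
To put rough numbers on the capacity side of that tradeoff, here is a 
small Python sketch of usable space under the layout Brad describes.  The 
45 bays and 3 TB per drive are illustrative assumptions of mine, not 
figures from this thread, and the function name is hypothetical.  The 
only RAID fact it relies on is that RAID-6 spends two drives' worth of 
space per array on parity.

```python
def raid6_usable_tb(total_drives, drive_tb, array_size, hot_spares):
    """Usable capacity when a chassis is split into equal RAID-6 arrays.

    Returns (usable_tb, arrays, leftover_drives).  Drives that do not
    fill a complete array are counted as leftover, not as capacity.
    """
    if array_size < 4:
        raise ValueError("RAID-6 needs at least 4 drives per array")
    available = total_drives - hot_spares
    arrays = available // array_size
    leftover = available - arrays * array_size
    # Each RAID-6 array gives up two drives' worth of space to parity.
    usable = arrays * (array_size - 2) * drive_tb
    return usable, arrays, leftover


TOTAL_DRIVES = 45  # assumption: bays per chassis
DRIVE_TB = 3.0     # assumption: capacity per drive

raw_tb = TOTAL_DRIVES * DRIVE_TB
for array_size in (4, 5):
    for spares in (3, 6):
        usable, arrays, leftover = raid6_usable_tb(
            TOTAL_DRIVES, DRIVE_TB, array_size, spares)
        print(f"{array_size}-disk RAID-6, {spares} spares: "
              f"{arrays} arrays, {leftover} unused, "
              f"{usable:.0f} of {raw_tb:.0f} TB usable")
```

Under those assumptions roughly half the raw space or more goes to parity 
and spares, which is the loss Brad is pointing at.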

> Due to the way the port multipliers are connected to the PCI cards,
> you're going to get device/channel imbalances pretty much any way you
> do it, and that's going to create significant bottlenecks just due to
> the amount of storage you'd be providing and the way you'd be
> providing it per box.

> Of course, you'd also be left looking for a network card where you
> could handle that much bandwidth at a reasonable latency, and the
> three drive controllers would already have filled the three available
> PCIe slots on the motherboard.
> Ideally, you'd want two dual-interface NICs, each set up in LACP
> bonding mode and each connected to two different switches, so that
> you could have higher bandwidth but also survive either switch or NIC
> failure -- or both.  These could be 10GigE, if you had a place to
> plug them in.

> OTOH, you're going to have such serious performance problems rising
> from the way the drives are laid out and spread across so few SATA
> channels that maybe it wouldn't make a difference if you had just the
> single Gig-E interface per box.

I do think that networking is the real bottleneck, not the drive setup. 
They said they can easily saturate a 1 Gb/s network connection.

A dual 10-Gb ethernet card is available for about $400 if you have 
something to connect it to.  That might saturate the PCIe connection 
(depending on version) to the disk drives, but more likely it would run 
into the 6 Gb/s limit of SATA3.
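
For a back-of-the-envelope comparison of those links, here is a short 
Python sketch.  The line rates and encoding overheads are the nominal 
published ones (8b/10b for SATA and PCIe 1.x/2.0, 128b/130b for PCIe 
3.0).  The x4 lane width for the controller slot is an assumption of 
mine, and so are the helper names.  It ignores protocol overhead and 
port multiplier sharing, so treat the results as upper bounds.

```python
# Usable Gb/s per PCIe lane after line encoding, by generation.
PCIE_LANE_GBPS = {
    1: 2.5 * 8 / 10,     # 2.5 GT/s, 8b/10b
    2: 5.0 * 8 / 10,     # 5.0 GT/s, 8b/10b
    3: 8.0 * 128 / 130,  # 8.0 GT/s, 128b/130b
}


def pcie_gbps(generation, lanes):
    """Nominal usable bandwidth of one PCIe link in Gb/s."""
    return PCIE_LANE_GBPS[generation] * lanes


def sata3_gbps():
    """Nominal usable bandwidth of one SATA3 channel (6 Gb/s, 8b/10b)."""
    return 6.0 * 8 / 10


def narrowest(links):
    """Return the (name, Gb/s) pair with the lowest bandwidth."""
    return min(links.items(), key=lambda item: item[1])


links = {
    "1 Gb ethernet": 1.0,
    "dual 10 Gb ethernet": 2 * 10.0,
    "one SATA3 channel": sata3_gbps(),
    "PCIe 2.0 x4 (assumed slot)": pcie_gbps(2, 4),
}

for name, gbps in sorted(links.items(), key=lambda item: item[1]):
    print(f"{name:28s} {gbps:5.1f} Gb/s")

name, gbps = narrowest(links)
print(f"Narrowest link: {name} at {gbps:.1f} Gb/s")
```

With a single gigabit port the network is the narrowest link by a wide 
margin.  Swap in the dual 10-Gb card and the limit moves to whichever 
SATA channel or PCIe slot the data has to cross.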

>> I'm not going to build one of these things because I don't need it,
>> but I would suggest it to an employer with a limited budget but a
>> need for a large amount of disk space to try out and develop actual
>> experience with the concept from both a technical point of view and
>> from a cost perspective.
> There are definitely lots of lessons that you'd learn from building a
> box like this.  I'm not sure that you necessarily want to try to
> learn those lessons the hard way, however.
> Even if it was your goal to learn those lessons the hard way, this
> seems like a pretty expensive way to learn.

I doubt that an organization that needs 50+ TB of storage thinks that an 
$8K expenditure for HW is expensive.  The engineering time is more 
expensive, but a commercial solution can be even more expensive.

Experience is always valuable.  I don't know how to learn lessons the 
easy way.

   -- Bruce
