[SATLUG] 67 TB for less than $8K

Brad Knowles brad at shub-internet.org
Tue Oct 9 18:52:59 CDT 2012

On Oct 9, 2012, at 4:21 PM, Bruce Dubbs <bruce.dubbs at gmail.com> wrote:

> Well I thought the redundancy was in the multiple systems.  RAID provides some redundancy for individual hard drives.  The rest of the redundancy would be in software.  I'm not sure what you mean by imbalances in SATA chains.  It's not like the drives are daisy chained.  There is one level of disk drive multiplexing, but that's all.
> Is there any evidence that these guys actually lost customer data?

When I said they didn't have any redundancy, I meant they didn't have any redundancy inside the pod -- they did have RAID-5, but when you're talking about 45 drive mechanisms per pod, RAID-5 is not meaningful redundancy anymore.

I have no evidence that they have ever lost any customer data, but the way they achieve their real redundancy is by having hundreds of pods, and they replicate customer data across multiple pods.

If you can only afford to have a single pod, then you would not have any real redundancy even if you did implement RAID-5.

When you face a 5% drive failure rate per year across all the pods, with 45 drives per pod, that's an average of 2.25 drive failures per pod per year, or roughly one drive failure per pod every five to six months.  With RAID-5, a single drive failure puts you into degraded mode, and you can't sustain a second drive failure.
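The arithmetic above is straightforward to check; here is a quick sketch using the numbers from this thread (5% annual failure rate, 45 drives per pod):

```python
# Back-of-envelope failure-rate arithmetic from the numbers in this thread:
# a 5% annual failure rate per drive, 45 drives per pod.
annual_failure_rate = 0.05   # per drive, per year
drives_per_pod = 45

failures_per_pod_per_year = annual_failure_rate * drives_per_pod
months_between_failures = 12 / failures_per_pod_per_year

print(f"Expected failures per pod per year: {failures_per_pod_per_year:.2f}")
print(f"Average months between failures:    {months_between_failures:.1f}")
```

That works out to 2.25 expected failures per pod per year, or one failure roughly every 5.3 months.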

So, what are the odds of a two-drive failure in a single pod, which would lead to catastrophic loss of all data on the pod?

Of course, you know that these things tend to fail when they get stressed, and there is no higher level of stress that can be placed on a drive array than when it is in degraded mode and you're trying to do an array rebuild to get back to "normal" operation.
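One way to put a rough number on that question is to assume independent drive failures at the 5%/year rate and ask how likely a second failure is before a rebuild finishes.  The rebuild window below is a hypothetical two days, not a measured figure, and because failures cluster under rebuild stress (as noted above), this estimate is optimistic:

```python
import math

# Rough estimate, assuming independent drive failures at a constant
# 5%/year rate.  Real failures cluster, especially under the stress
# of a rebuild, so treat this as a lower bound.
annual_rate = 0.05          # per-drive annual failure probability
remaining_drives = 44       # drives left in a 45-drive pod after one failure
rebuild_days = 2            # assumed rebuild window (hypothetical)

# Per-drive probability of failing during the rebuild window,
# modeling failures as a constant-rate (exponential) process:
p_one = 1 - math.exp(-annual_rate * rebuild_days / 365)

# Probability that at least one of the remaining drives fails
# before the rebuild completes:
p_second_failure = 1 - (1 - p_one) ** remaining_drives
print(f"P(second failure during rebuild) ~= {p_second_failure:.3%}")
```

Even under these generous independence assumptions, that comes out to on the order of a percent per rebuild event -- and every pod hits a rebuild event a couple of times a year.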

> Doesn't Google use large arrays of cheap disks also?

Yeah, but they have the same (or better) economies of scale, and they can afford to just throw money at the problem, whereas Backblaze made the conscious choice to cut every possible corner that they could.

I understand the reasons why they made that design choice, but before you base anything on their design, you should be very careful to fully examine all of their assumptions and design criteria.

Once you examine their assumptions and design criteria, I believe you will find that you would probably make different assumptions and choose different things to optimize for, because you can't afford to operate hundreds and hundreds of these pods.

Which brings us all the way back around the circle.

> Wikipedia says "In November 2010, Oracle designated that the X4540 is end-of-life and has no next-generation replacement model."

I saw that.  But the only real differences between the X4500 and the X4540 were the improved motherboard and CPUs of the X4540; the overall architecture was largely the same.

> I've never heard of anything from Sun/Oracle that is inexpensive.

True, but those hardware guys usually did good systems engineering, and they really outdid themselves with the X4500.

My point is that by the time you take the Backblaze design and add back in all the things you'd want at a site that could only afford to have one of these available, you end up adding back in the same kinds of costs that buying an X4500 would have incurred.

Unless you can afford to operate at or near the same kind of scale as Backblaze, you are not in a position to take advantage of their economies of scale.

> The difference is that I can afford to build one of these POD arrays myself and add 3T drives as needed.  The entire price for one without drives is about $2K and then increasing 3T at a time for $120 for each drive.

That's another false economy.  Each time you move the unit to install another drive, you greatly increase the chances of a drive or other hardware failure, and you significantly increase the TCO of the unit.  Every time a human being physically touches the hardware or the rack, you increase the chances that something in that rack or nearby will go Tango-Uniform.  So, you don't ever want to touch the rack again after you install it.  Or, if you do have to touch the rack again, at the very least you don't want to ever again touch the individual devices in the rack.

If you want to buy something that has a bare cost of $2k and allows you to add drives as desired, I'd go with NetGear ReadyNAS devices instead, and add more entire storage units as you need to expand.

Brad Knowles <brad at shub-internet.org>
LinkedIn Profile: <http://tinyurl.com/y8kpxu>
