[SATLUG] 67 TB for less than $8K

Brad Knowles brad at shub-internet.org
Tue Oct 9 21:22:05 CDT 2012

On Oct 9, 2012, at 6:16 PM, Bruce Dubbs <bruce.dubbs at gmail.com> wrote:

> I'm not going to build one of these things because I don't need it, but I would suggest it to an employer with a limited budget but a need for a large amount of disk space to try out and develop actual experience with the concept from both a technical point of view and from a cost perspective.

Here's a good page on the subject: <http://bioteam.net/2011/08/why-you-should-never-build-a-backblaze-pod/>.

Quoting from this page:

> What are the risks?
> The backblaze storage pod was designed for a very specific use case that is not a great fit for more generic usage. A quick glance at the design plans will tell you:
> 	• The system uses a single disk for hosting the operating system
> 	• The system requires 2 power supplies to operate, both must be active and there is no redundancy, spare or failover unit
> 	• The system has no hardware RAID capability
> 	• The system only has 2 GigE network interfaces
> 	• To access/replace a disk drive you need to remove 12 screws
> 	• To access/replace a disk drive you need to remove the top cover
> 	• If you build this yourself totally DIY you will be required to create custom wiring harnesses
> 	• Any monitoring or health status reporting tools will have to be built, installed and configured by hand
> Simply put, this box has no “highly available” features and any sort of significant maintenance on it will almost certainly require the system to be taken offline and possibly even powered down. You also need to mount this unit on extremely heavy-duty rack rails OR put it on a shelf and leave about 12 inches of top clearance free if you want to easily be able to pop the top cover off to get at the drives.
> This is cheap storage, not fast storage and certainly not highly-available storage. It carries a far higher operational and administrative burden than storage arrays traditionally sold into the enterprise.
> Scary huh? My main goal with this blog post is to ensure that readers considering this approach are fully aware of the potential risks.
> Why the folks at Backblaze don’t care about the “risks”
> Short answer: They solve all reliability, availability and operational concerns by operating a bunch of pods simultaneously with a proprietary cloud software layer that handles data movement and multi-pod data replication. To them, a storage pod is a single FRU (field replaceable unit) and they don’t really need to spend a significant amount of time and attention on any single pod.
> Long answer:
> 	• Backblaze does not care about reliability of single pods. They engineer around hardware and data reliability concerns by using many pods and custom software
> 	• Backblaze does not care about downtime for single pods. Customer data is stored on multiple pods, allowing individual pods to break or otherwise be taken offline for maintenance or replacement
> 	• Backblaze does not care about performance of single pods. They have openly stated that their only performance metric is “can we saturate the GigE link as we load a pod with data”
> 	• Backblaze has an unusual duty cycle. A normal backblaze pod is only “active” for the first few weeks of its life as it slowly fills to capacity with customer backup data. After a pod is “full” the system sits essentially idle while it waits for (much less frequent) client restore requests.
> 	• Backblaze does not care about operational burden. Via their custom software and use of many pods at once Backblaze has built an infrastructure that requires very little effort in the datacenter. It looks like a few days a week are spent deploying new pods and I’m guessing that failing pods are “drained” of data and then pulled out to be totally rebuilt or refreshed. Backblaze does not have to dink around trying to debug single-drive failures within individual pods.
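
To put the "saturate the GigE link" metric and the "first few weeks" duty cycle in perspective, here is a rough back-of-the-envelope sketch. It assumes the 67 TB figure from the subject line (in decimal terabytes) and a single, perfectly saturated 1 Gbit/s link with no protocol or filesystem overhead. Those are my assumptions, not numbers from the blog post, so treat the result as a lower bound only.

```python
# Lower bound on the time needed to fill one pod over a single GigE link.
# Assumptions (not from the quoted post): 67 TB decimal capacity, taken from
# the thread subject, and a fully saturated 1 Gbit/s link with zero overhead.
POD_CAPACITY_BYTES = 67 * 10**12
LINK_BITS_PER_SECOND = 10**9
SECONDS_PER_DAY = 86_400


def fill_time_days(capacity_bytes: int, link_bps: int) -> float:
    """Return the days needed to transfer capacity_bytes at link_bps."""
    seconds = capacity_bytes * 8 / link_bps
    return seconds / SECONDS_PER_DAY


days = fill_time_days(POD_CAPACITY_BYTES, LINK_BITS_PER_SECOND)
print(f"Best case: about {days:.1f} days to fill one pod")
```

Even in that best case a pod needs several days of continuous writing to fill, so a real-world fill time of a few weeks is consistent with a box that is not built for speed.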

Now, as the author of that page says, all those warnings aside, there can be good reasons to build one of these things, even if they really cost closer to $12k per pod when you build them in small quantities, rather than the $7k per pod that Backblaze claims.
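
For a rough sense of what that price gap means per terabyte, here is a minimal sketch. It assumes the 67 TB raw capacity from the subject line and ignores RAID overhead, spares, power, and labor, so the numbers are illustrative only.

```python
# Rough cost per raw terabyte for one pod at the two prices mentioned above.
# Assumption: 67 TB raw capacity per pod, taken from the thread subject line.
POD_CAPACITY_TB = 67

# Per-pod prices in US dollars, as quoted in this message.
PRICES_USD = {
    "Backblaze's claimed cost": 7_000,
    "small-quantity build": 12_000,
}


def cost_per_tb(price_usd: float, capacity_tb: float = POD_CAPACITY_TB) -> float:
    """Return dollars per raw terabyte for a pod at the given price."""
    return price_usd / capacity_tb


for label, price in PRICES_USD.items():
    print(f"{label}: ${cost_per_tb(price):.2f} per raw TB")
```

That works out to roughly $104 versus $179 per raw terabyte, before any usable-capacity loss to redundancy.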

My goal in this discussion was simply to bring to light the same issues that Chris highlighted in his blog post.  The only difference is that he actually posted on his blog about this issue, and I hadn't Googled on this subject in a while.

I would encourage everyone to read the comments in the fifth and final post (so far) in Chris' blog series at <http://bioteam.net/2011/08/backblaze-performance/>.  Some good pointers there, too -- I'd pay special attention to some of the alternatives that are mentioned.

I'd also encourage folks to take a look at the comments starting at <http://storagemojo.com/2011/07/20/open-source-storage-array/#comment-217808>.

Brad Knowles <brad at shub-internet.org>
LinkedIn Profile: <http://tinyurl.com/y8kpxu>
