[SATLUG] Open Source deduplication capabilities?

John Pappas j at jvpappas.net
Sun Jun 15 11:40:43 CDT 2008

On Tue, Jun 3, 2008 at 4:28 PM, Jeremy Mann <jeremymann at gmail.com> wrote:

> I just read a great article in this months Information Week about a
> new backup technology called 'data deduplication'. Anybody out there
> actually using it and if so, how much has it dropped your storage
> requirements for your backups?

Did a DataDomain demo for a large customer of mine (used as both a CIFS drop
and a FibreChannel VTL), and they got 30:1 on average; they were keeping TB
of data in GB of space (see the report below: pre-comp is total data
ingested, post-comp is space used on disk).  IMHO, dedupe is the only way to
bring offsite data replication within reach of an SMB where bandwidth is the
bottleneck.  The DD stuff is both super slick and super expensive.  Another
option is Digisense; they are building dedup into their next offering, but
I'm not sure how robust it will be.

On the F/LOSS side, there is brackup.  It uses block hashes both to validate
integrity and to minimize redundant data.
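The idea behind block-hash dedup is simple: split the data into blocks, hash
each block, and store a block only the first time its hash shows up; a file is
then just an ordered list of hashes.  A minimal sketch in Python (names and
the fixed 4 KiB block size are illustrative, not brackup's actual design):

```python
# Fixed-size block dedup sketch: store each unique block once,
# rebuild files from an ordered "recipe" of block hashes.
import hashlib

def dedupe_blocks(data, block_size=4096):
    store = {}    # hash -> block contents (each unique block stored once)
    recipe = []   # ordered hashes needed to reconstruct the data
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        h = hashlib.sha256(block).hexdigest()
        store.setdefault(h, block)   # only kept the first time it's seen
        recipe.append(h)
    return store, recipe

def rebuild(store, recipe):
    # Integrity comes for free: a corrupted block no longer matches its hash.
    return b"".join(store[h] for h in recipe)

if __name__ == "__main__":
    payload = b"A" * 8192 + b"B" * 4096 + b"A" * 4096  # repeated content
    store, recipe = dedupe_blocks(payload)
    assert rebuild(store, recipe) == payload
    print(len(recipe), "blocks referenced,", len(store), "stored")
```

With highly repetitive data (full backups of the same boxes night after
night) most recipes point at blocks already in the store, which is where
ratios like 30:1 come from.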

From the Data Domain demo:
==========  SERVER USAGE   ==========
Resource             Size GiB   Used GiB   Avail GiB   Use%
------------------   --------   --------   ---------   ----
/backup: pre-comp           -    22702.2           -      -
/backup: post-comp     9861.7      796.9      9064.8     8%
/ddvar                   78.7        1.3        73.4     2%
------------------   --------   --------   ---------   ----

Filesys Compression
Total files: 9,839;  bytes/storage_used: 23.1
      Original Bytes:   24,228,883,468,947
 Globally Compressed:    2,365,654,037,970
  Locally Compressed:    1,028,840,034,409
           Meta-data:       17,826,391,624
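For what it's worth, the bytes/storage_used figure checks out against the
raw numbers in the report, assuming it is original bytes divided by what
actually lands on disk (locally compressed data plus metadata):

```python
# Values copied straight from the report above.
original  = 24_228_883_468_947
on_disk   = 1_028_840_034_409    # locally compressed
metadata  = 17_826_391_624

ratio = original / (on_disk + metadata)
print(round(ratio, 1))  # 23.1, matching bytes/storage_used
```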
