[SATLUG] Open Source deduplication capabilities?
brad at shub-internet.org
Wed Jun 18 23:12:29 CDT 2008
On 6/18/08, John Pappas wrote:
> Agreed. I am not a TSM guy, so I am not sure how a restore works (how
> many tapes are needed for a complete restore in an incremental forever?).
I think that depends on your D2D2T environment. In the case of my
current employer, we don't actually use tape at all right now --
everything we back up is disk-to-disk only, and it never goes on
to tape.
There has recently been an acquisition of a really expensive
mainframe tape library that can be shared cross-platform, and our
group has been given use of two tape drives and ~200 tapes per
library. We're still trying to figure out the best way to make all
that work with our current environment, and I imagine that de-dupe is
going to play a role in that.
> Most environments can get away with replacing tape with a dedupe
> methodology (or single instance storage) of any type and see a good
> benefit (think fileshares and OS backups). My comments were based
> on a first-hand demonstration in a large datacenter with real data
> and replication concerns.
Interestingly enough, we just had a presentation this evening at the
Austin Sun User Group (see <http://www.austinsug.org/>) by a senior
consultant working for the second-largest Sun VAR in the country, and
he still has not seen a de-dupe environment that he would use in a
primary storage function. Backup, sure. But not primary. I've
heard the same from the founder of the Austin chapter of the Storage
Networking User Group, and other experts in the area.
> I am not Curtis Preston, but I do know some things, and for the near-line
> or D2D2T needs, DD is a good fit. SIR (unless it has DB knowledge built
> in) falters on DB tablespace storage as the files are "different", where
> block dedupe can account for the tablespace index changes and whatnot.
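
To make the SIR versus block-level distinction above concrete, here is a
rough sketch, not how DD or any other product actually works. It assumes
fixed 4 KiB blocks and SHA-256 fingerprints, and the "tablespace" is
made-up data from a hypothetical helper (make_tablespace); none of those
details come from this thread.

```python
import hashlib

BLOCK_SIZE = 4096  # assumption: fixed-size blocks, chosen only for illustration


def file_level_store(files):
    """Single-instance storage: keep one copy per distinct whole-file hash."""
    store = {}
    for data in files:
        store.setdefault(hashlib.sha256(data).hexdigest(), data)
    return store


def block_level_store(files):
    """Block dedupe: keep one copy per distinct block hash."""
    store = {}
    for data in files:
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            store.setdefault(hashlib.sha256(block).hexdigest(), block)
    return store


def stored_bytes(store):
    """Total bytes physically kept in a store."""
    return sum(len(chunk) for chunk in store.values())


def make_tablespace(n_blocks, changed=None):
    """Hypothetical tablespace file: n_blocks distinct blocks (n_blocks <= 255),
    optionally with one block overwritten to mimic an index update."""
    blocks = [bytes([i]) * BLOCK_SIZE for i in range(n_blocks)]
    if changed is not None:
        blocks[changed] = b"\xff" * BLOCK_SIZE
    return b"".join(blocks)


# Two nightly backups of the same tablespace; one block changed in between.
monday = make_tablespace(100)
tuesday = make_tablespace(100, changed=7)

file_bytes = stored_bytes(file_level_store([monday, tuesday]))
block_bytes = stored_bytes(block_level_store([monday, tuesday]))
print("file-level SIR stores ", file_bytes, "bytes")
print("block-level dedupe stores", block_bytes, "bytes")
```

Because the two files differ in a single block, the whole-file hashes
differ and SIR keeps both files in full, while the block store keeps the
original blocks plus the one that changed. Fixed-size blocks stop lining
up as soon as data is inserted rather than overwritten in place, so treat
this as an illustration of the idea only.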
I think you have to be careful regardless of how you reduce the
amount of data being stored. Because you will have fewer physical
copies of that data around, you have to protect those copies even
more; otherwise the cost of losing that one single copy you've got
of the critical data on your network is ...
Brad Knowles <brad at shub-internet.org>
LinkedIn Profile: <http://tinyurl.com/y8kpxu>