I just posted a comment on Scott Waterhouse’s The Backup Blog saying that I didn’t agree that NetApp was the last major vendor to come out with dedupe. Since that seems to contradict what appears to have happened, here’s why I believe it, and why I think it’s important.
While it’s true that they were the last major OEM disk vendor to ship target dedupe for their VTL, you can hardly say they were the last to the dedupe party. They were the first major OEM disk storage vendor (i.e. EMC, IBM, HDS, HP, NetApp & Sun) to have their own (not OEMd) VTL, and the first vendor of any kind to ship dedupe for primary storage, which they developed themselves. Before that, they supported dedupe for NetBackup with that same product. They’re the second such vendor to ship their own (not OEMd) target dedupe product, and the first such vendor to develop it themselves. (IBM beat them by a few months via their acquisition of Diligent.) I’m not dismissing Avamar, but it isn’t a target dedupe product; it’s source dedupe, which is a whole other animal. So while they may look like the last one to the party, they’re the first to come with a monogamous wife, versus others who are coming with a friend (EMC/Quantum), or caught in a love triangle (IBM/HDS/Diligent or Sun/COPAN/Falconstor).
Why is it important that NetApp developed their own dedupe product? First, I agree with David Chapa when he talks about the advantages of a home-grown product (http://blogs.netapp.com/barandgrill/2008/10/another-music-r.html). Yes, it makes you last to the party, but it comes with a level of knowledge about the product that’s just not possible otherwise. That level of knowledge comes through in support situations.
Second, I’ll take an acquired or homegrown solution over an OEMd solution any day of the week, because owned products don’t get swapped out the way OEMd products do. When a company owns the solution, they’ll make it work, instead of just abandoning it for the Next Best Thing, the way EMC did when they chose to use Quantum for their dedupe engine instead of Falconstor. If EMC had owned the Falconstor code (instead of just OEMing it), I bet they would have made it work, and those who bought CDLs could have just upgraded their code to get dedupe. Instead they’ll have to do a forklift upgrade, or in the case of the 4000 series, bolt another VTL on the back of their current one. And regardless of HDS’ and IBM’s words to the contrary, I’ll believe in the long-term viability of the Diligent-based HDS dedupe product when I see it. Buying an OEMd product puts you at risk that the vendor will change plans, either because it wants to or because it is forced to.
I do agree with Scott that it’s a shame they shipped with RAID5 (not RAID6) and without replication. Since having the only copy of deduplicated data on a RAID5 array is nothing short of scary (the odds of a double-disk failure are just too high), this limits the use of this VTL to those who will back up to it and copy all of its backups immediately to tape.
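To put a rough number on the double-disk worry, here is a back-of-the-envelope sketch in Python. It is only an illustration under assumptions of my own choosing: disks fail independently at a constant rate, and the disk count, MTBF, and rebuild time are hypothetical inputs, not specifications of any NetApp product. Real-world failures tend to be correlated (same batch, same shelf, same heat), so a model like this is, if anything, optimistic.

```python
import math


def p_second_failure(disks, mtbf_hours, rebuild_hours):
    """Estimate the chance that a RAID5 group loses a second disk
    while it is still rebuilding from the first failure.

    Assumes independent failures at a constant rate (exponential model).
    All inputs are hypothetical and supplied by the caller.
    """
    survivors = disks - 1  # one disk has already failed
    # P(at least one survivor fails) = 1 - exp(-survivors * t / MTBF)
    return -math.expm1(-survivors * rebuild_hours / mtbf_hours)


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    p = p_second_failure(disks=14, mtbf_hours=500_000, rebuild_hours=24)
    print(f"Chance of a second failure during one rebuild: {p:.4%}")
```

The result is very sensitive to how long the rebuild takes and to how far the actual failure rate departs from the quoted MTBF, so try it with your own numbers rather than trusting mine. With RAID6 the group survives that second failure; with RAID5 and no replication or tape copy, the deduplicated data is gone.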
----- Signature and Disclaimer -----
Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technologist at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.