NetApp: He who is first shall be last

I just posted a comment on Scott Waterhouse’s The Backup Blog saying that I didn’t agree that NetApp was the last major vendor to come out with dedupe. Since that seems to be the opposite of what happened, here is why I believe this, and why I think it is important.

While it’s true they were the last major OEM disk vendor to ship target dedupe for their VTL, you can hardly say they are the last to the dedupe party. They were the first major OEM disk storage vendor (i.e. EMC, IBM, HDS, HP, NetApp & Sun) to have their own (not OEMd) VTL and the first vendor of any kind to ship dedupe for primary storage (which they developed themselves). Before that, they supported dedupe for NetBackup with that same product. They’re the second such vendor to ship their own (not OEMd) target dedupe product, and the first such vendor to develop it themselves. (IBM beat them by a few months via their acquisition of Diligent.) I’m not dismissing Avamar, but it isn’t a target dedupe product; it’s source dedupe – a whole other animal. So while they may look like the last one to the party, they’re the first to come with a monogamous wife, versus others who are coming with a friend (EMC/Quantum), or caught in a love triangle (IBM/HDS/Diligent or Sun/COPAN/Falconstor).

Why is it important that NetApp developed their own dedupe product?  First, I agree with David Chapa when he talks about the advantages of a home-grown product (http://blogs.netapp.com/barandgrill/2008/10/another-music-r.html).  Yes, it makes you last to the party, but it comes with a level of knowledge about the product that’s just not possible otherwise.  That level of knowledge comes through in support situations.

Second, I’ll take an acquired or homegrown solution over an OEMd solution any day of the week; they won’t change the way OEMd products do. When a company owns the solution, they’ll make it work, instead of just abandoning it for the Next Best Thing, the way EMC did when they chose to use Quantum for their dedupe engine instead of Falconstor. If EMC had owned the Falconstor code (instead of just OEMing it), I bet they would have made it work, and those who bought CDLs could just upgrade their code and get dedupe. Instead they’ll have to do a forklift upgrade, or in the case of the 4000 series, bolt another VTL on the back of their current one. And regardless of HDS’ and IBM’s words to the contrary, I’ll believe in the long-term viability of the Diligent-based HDS dedupe product when I see it. Buying an OEMd product puts you at risk that the company will change plans, either because it wants to or because it is forced to.

I do agree with Scott that it’s a shame they shipped with RAID5 (not RAID6) and without replication. Since having the only copy of deduplicated data on a RAID5 array is nothing short of scary (the odds of a double-disk failure are just too high), this limits the use of this VTL to those who will back up to it and copy all of its backups immediately to tape.
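
To put a rough number on that double-disk worry, here is a minimal Python sketch of the arithmetic. It is a back-of-the-envelope model, not anything taken from NetApp's specifications: the function name, the 3% annual failure rate, the group sizes, and the rebuild times are all hypothetical values chosen for illustration, and the model assumes disks fail independently at a constant rate.

```python
import math

HOURS_PER_YEAR = 24 * 365


def second_failure_probability(disks_in_group, annual_failure_rate, rebuild_hours):
    """Estimate the chance that another disk fails while a RAID5 group rebuilds.

    Assumes independent disks with a constant failure rate (an exponential
    lifetime model). All inputs are illustrative assumptions.
    """
    # Convert the annual failure rate into a constant hourly hazard rate.
    hazard_per_hour = -math.log(1.0 - annual_failure_rate) / HOURS_PER_YEAR
    # One disk has already failed, so only the survivors are at risk.
    surviving_disks = disks_in_group - 1
    # Probability that at least one survivor fails before the rebuild ends.
    return 1.0 - math.exp(-hazard_per_hour * surviving_disks * rebuild_hours)


if __name__ == "__main__":
    # Hypothetical scenarios: (disks in RAID group, rebuild time in hours).
    for disks, hours in [(8, 12), (8, 48), (16, 48)]:
        p = second_failure_probability(disks, 0.03, hours)
        print(f"{disks} disks, {hours}h rebuild: {p:.4%} chance of a second failure")
```

The sketch only shows how the exposure grows with the number of disks in the group and the length of the rebuild. Because it assumes failures are independent, it most likely understates the real risk: disks from the same batch and of the same age tend to fail close together, and an unreadable sector found during a rebuild has the same effect as a second failed disk.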

Written by W. Curtis Preston (@wcpreston), four-time O'Reilly author, and host of The Backup Wrap-up podcast. I am now the Technology Evangelist at Sullivan Strickler, which helps companies manage their legacy data.

6 comments
  • W Curtis… I welcome the debate, but… I think there are two levels to this conversation: one is what the user cares about. And honestly I think that just boils down to a performance conversation–in the sense of throughput, time to finish backup, and time to finish replication. The other level is the technical minutiae of what is really going on with the device. The second is also partly semantics in this case.

    Let me start with the second level. You say that you don’t want the market confused. Fair enough. Neither do we. Having said that, I think you are trying to fit a square peg into a round hole in order to define something. (By the way, we are fine with the “immediate” definition if you prefer that to in line.)

    Why do I say that? You further wrote: “the second reason that the differentiation between inline and post-process is important is that post-processing systems can get

  • “If EMC had owned the Falconstor code (instead of just OEMing it), I bet they would have made it work, and those who bought CDLs could just upgrade their code and get dedupe.”

    You have this all wrong. FalconStor works just fine… so well that EMC wanted FalconStor more than it wanted Quantum… but FalconStor would not give it to EMC exclusively and abandon their other OEMs. EMC’s decision to go with Quantum had nothing to do with technology; it had everything to do with business and EMC’s desire to have a controlling interest in what it does.

  • While Falconstor may have worked out their kinks, EMC definitely had issues with the versions of Falconstor’s dedupe that they tested.

    And if the reason they left Falconstor was only to get exclusive rights, why didn’t they ask that of Quantum? Quantum would have given them the world to get the business, but Quantum is certainly not being exclusive with them.

  • Who else is Quantum OEMing to?

    And don’t confuse the licensing stuff with the OEM business – those are two separate worlds.