Challenge with NetApp deduplication (ASIS)

An interesting aspect of NetApp’s primary dedupe (ASIS) came to light while talking with one of their customers the other day.  It’s one of those things that should have been obvious from the start, but I never really thought about it until this customer brought it up.

Before I bring up a concern about ASIS, let me praise it just a bit. NetApp’s ASIS (Advanced Single Instance Storage) is the only product available today that does true deduplication of any data type, including deduping active data stores such as VMware images. Customers that I have spoken to tell me that there is a performance hit while the post-process dedupe job is running (generally at night), but that after it has run, there is minimal to no performance degradation on the deduped data. Add to that the fact that ASIS is included in the base OS and I think they’ve got a pretty interesting story. Other “data reduction” products include:

  • EMC Celerra
    • It provides file-level dedupe and compression of older files.  This would therefore not work for VMware or database images.  It will provide some space savings, but not as much as a subfile-level approach, and not as much as one that can dedupe active data.
  • StorWize
    • Since half of most dedupe comes from compression, StorWize said let’s just do that!  They are an inline compression system that does for NAS systems what the compression chip in your tape drive does for the tape drive.  Since it’s compressing inline, it actually can improve performance of some applications.  Believe it or not, they’ve even tested it in front of Data Domain systems and increased their capacity!  Like the NetApp approach, it works for any data type.
  • Ocarina
    • Ocarina does content-aware deduplication. While they started doing only file-level dedupe, they have recently added cross-file-level dedupe, so they also are doing “true” dedupe.  But they only do this for certain file types, such as Word documents, jpg files, etc.  If you have a lot of data in the file types they support, they should be able to get more dedupe out of it than other approaches, but they won’t be able to address other data types at all, such as VMware.
  • Content Addressable Storage (CAS) products & Single Instance Storage (SIS) products
    • These products provide object-level or file-level dedupe and will not identify common blocks between files, but they should at least be mentioned in a list such as this.  Some of these products have started calling themselves deduplication products, when (at best) they can call themselves object-level dedupe or file-level dedupe.
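To make the file-level versus subfile-level distinction in the list above concrete, here is a toy sketch in Python. It is not how ASIS or any of the products listed is implemented; the 4 KB fixed block size, the SHA-256 fingerprints, and the two made-up "VM images" are all assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this toy example


def file_level_stored(files):
    """Bytes stored when only whole, identical files are deduplicated."""
    unique = {}
    for data in files.values():
        unique[hashlib.sha256(data).hexdigest()] = len(data)
    return sum(unique.values())


def block_level_stored(files, block_size=BLOCK_SIZE):
    """Bytes stored when identical fixed-size blocks are shared across files."""
    unique = {}
    for data in files.values():
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            unique[hashlib.sha256(block).hexdigest()] = len(block)
    return sum(unique.values())


# Two hypothetical VM images: eight shared "OS" blocks plus one unique block each.
os_blocks = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
files = {
    "vm1.vmdk": os_blocks + b"\xaa" * BLOCK_SIZE,
    "vm2.vmdk": os_blocks + b"\xbb" * BLOCK_SIZE,
}

logical = sum(len(d) for d in files.values())
print("logical bytes:    ", logical)
print("file-level bytes: ", file_level_stored(files))
print("block-level bytes:", block_level_stored(files))
```

Because the two images differ by a single block, the file-level pass sees two different files and saves nothing, while the block-level pass stores the shared blocks only once. That is the gap the CAS/SIS and file-level products cannot close for things like VMware images.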

Alright, on to the interesting thing about NetApp’s primary dedupe.  Here’s the thing: they “redupe” when replicating or when copying to tape.   Let’s look at each of these use cases.

ASIS runs at the filer level, and more specifically at the flex-vol (i.e. volume) level. When that data is replicated to another filer, the data is reduped, or reconstituted to its original size. If you want to run ASIS on the other side, you can. Under “normal” circumstances, where you start out with an empty volume, start filling it, and replicate it as you go, this poses no problem. It also poses no problem if you had a full volume you were replicating and then decided to run dedupe on it after the fact. Dedupe both sides: no problem. However, if you have a volume whose deduped data, once reduped, is larger than the replicated volume’s raw capacity, and you haven’t been replicating it as you go along, you’ll need to begin replication in stages. You’ll replicate some of the data, then dedupe that data. Then you replicate some more data and dedupe that data, and so on.

Update: The above only occurs if you use qtree-based snapmirror.  If you do volume-based snapmirror, there is no problem.  However, many people prefer qtree snapmirror, so they should be aware of this limitation.
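The staged replicate-then-dedupe cycle described above can be sketched as simple arithmetic. This is only a back-of-the-envelope model, not a NetApp tool: the function name `plan_staged_copy` is hypothetical, and it assumes the dedupe ratio is uniform across the data and that each stage fills all the free space on the destination before dedupe runs.

```python
def plan_staged_copy(logical_gb, dest_capacity_gb, dedupe_ratio):
    """Return the reduped (logical) GB copied in each stage.

    Each stage copies as much reduped data as fits in the destination's
    free space; dedupe is then assumed to shrink that data by dedupe_ratio.
    """
    if dedupe_ratio < 1:
        raise ValueError("dedupe_ratio must be at least 1")
    stages = []
    used = 0.0
    remaining = logical_gb
    while remaining > 1e-9:
        free = dest_capacity_gb - used
        chunk = min(remaining, free)
        if chunk <= 1e-9:
            raise RuntimeError("destination fills up before the copy completes")
        stages.append(chunk)
        used += chunk / dedupe_ratio  # space the chunk occupies after dedupe
        remaining -= chunk
    return stages


# Assumed example: 1000 GB of reduped data, a 500 GB destination, 4:1 dedupe.
print(plan_staged_copy(1000, 500, 4))
```

With these assumed numbers the copy takes three stages, because each dedupe pass frees up less room than the previous one. The same arithmetic applies to any situation where reduped data has to be landed on a volume smaller than its logical size.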

A bigger concern is when you’re backing this data up to tape.  Like almost all dedupe products, when the deduped volume is copied to tape, it is reduped.  If you had a full volume fail and needed to restore that volume, you wouldn’t be able to directly do so, as you’d have more data on tape than you could fit on the volume.  You’d have to restore some data, dedupe it, restore some more, dedupe it, and so on.  Therefore, it would seem that anyone with aggressive RTOs and a full deduped ASIS volume would be well advised to have a snapmirror copy of it standing by, as you won’t be able to restore it as fast as a regular volume.  This limitation is confirmed by the following quote from NetApp’s ASIS Implementation Guide, “Backup of the deduplicated volume using NDMP is supported, but there is no space optimization when the data is written to tape because it’s a logical operation.”

Update: Snapmirror to tape (SM2T) doesn’t have this problem; only a regular NDMP dump does. The problem with SM2T is that it doesn’t do file-level recovery, and it’s not manageable via some backup applications. (It is manageable by TSM, NBU, BakBone, CommVault, Atempo and SyncSort.) So, SM2T is fine for a full DR of a volume if you can manage it with your backup app, and that’s alright if you have enough snapshot history to handle single-file restores (which you should be doing anyway).

Like I said — just something I never thought about until someone brought it up. NetApp may be able to address both these challenges at some point, and I hope they do.

9 thoughts on “Challenge with NetApp deduplication (ASIS)”

  1. andriven says:

    Exactly right, and there are actually a few more wrinkles. SnapVault (a D2D product that uses snapshots but allows keeping more snapshots at the destination than at the source) does support deduplication as of ONTAP 7.3. Given that SnapVault is based on qtrees, support for deduplicated qtree SnapMirror can’t be too far behind.

    On a practical level, while it is a drawback, we find that we don’t run into it as an issue very often (I am a NetApp partner engineer).

    Technically speaking, if the replication happens at the volume level, things work swimmingly (since deduplication happens to the blocks at the volume level). Given that qtree SnapMirror has to handle things at the file level (and is slower when there are tons of files), it doesn’t work natively with volume-level block deduplication.

    It’s definitely a good point although in all fairness something that’s been out there for a while (at least if you have good NetApp partner engineers to work with ;-).

  2. cpjlboss says:

    Like I said, it’s something I should have known about before. It’s not like I’m breaking the story or anything. Just bringing it up for discussion. (It does seem news to some.)

  3. andriven says:

    Completely understood…and always glad to see discussions that fall in areas where I spend a lot of time. 🙂

  4. cpjlboss says:

    @Sirisak

    I don’t follow Netvault very closely, so it’s good to hear that it has this functionality.

    @Juan Orlandini

    Kinda hard to know about undocumented features unless you’re using them! Thanks for telling me about that.

  5. alapati says:

    Curtis,

    Nice to see you here. It’s been a long time since we connected. I would like to comment on the following statement.

    Volume SnapMirror (VSM) is more popular with NetApp customers. VSM provides mirroring functionality and therefore a good fit for DR. For backups, many people prefer SnapVault, which is qtree-based replication. Maybe you meant to say many people prefer SnapVault for backup purposes?

    “However, many people prefer qtree snapmirror, so they should be aware of this limitation.”

  6. dunterse says:

    Hi Curtis,

    Some additions, corrections and comments:

    SM2T can be driven today by six backup applications:
    TSM has officially supported it since 5.5.2 (I have a very large German TSM customer who replaced all NDMP dumps with NDMP SM2T in February).
    NBU, BakBone, CommVault, Atempo and SyncSort are also able to control it (hopefully more in the future).
    SM2T is only a solution for disaster-recovery backup to tape.
    But this is fine for everybody who holds enough snapshots for single-file restores.
    The benefit is that every snapshot is back after a disaster restore (customers do not like to lose any of their sometimes 100-plus snapshots whenever a restore has to be done for their secondary storage).
    The next great thing is that SM2T works at full speed for the millions-of-files NAS volume use case (which is a speed problem for any file-level backup).

    Volume SnapMirror has become very popular and widely recommended over the last few months for the remote-office backup use case:
    Customers store their data on a NetApp filer (often a small FAS2020).
    They activate ASIS (dedupe) for every volume and use snapshots for data protection.
    At night they do one ASIS run, followed by a snapshot, followed by one VSM transfer.
    Since ONTAP 7.3, that gives them the benefit that the dedupe savings are kept on the WAN line and on the target filer in the data center.
    BTW, some customers have already added network compression (today under PVR) to cut WAN traffic by another 50%.
    But the best part of this use case is that every restore at the remote office can be done from local snapshots (which avoids the problem of big full-volume restores over the WAN). The replicated data in the data center is only needed in case of disaster.
    All together, that’s the perfect remote-office backup solution. Do you know anything on the market that compares?

    Comparing the different dedupe implementations and the benefits of each, I expect that many will come to the following conclusion in the future:
    It makes the most sense to begin with dedupe at primary storage and keep the benefit for every replication (on the wire and at the target) and for any backup to tape.
    Avoiding duplication is better than duplicating data with all the transport overhead (for example, by doing file-level full backups) and deduping only at the end, at the backup target (VTL or B2D disk cache).
    At a high level, that’s where NetApp wants to get.

  7. cpjlboss says:

    You’d think they’d keep me updated on things like new features. 😉

    Thanks for the update. I’ve updated the article to reflect the new information.
