Encrypt your tapes but not your disks

Update: My opinions on this have changed due to the comments written below.  Feel free to read this post, but make sure you read the follow-up post as well where I change my tune a bit.

Steve Duplessie wrote a blog post inspired by the RSA hack.  His post isn’t about that hack at all.  But for the record, I agree with this guy who says “RSA Silent About Compromise For 7 Days – Assume SecurID Is Broken.”

Steve’s blog post said that the lesson we should learn from the RSA hack is that anyone can get hacked.  I would agree with him.  He said that your security system should be based on that assumption.  I would agree with that.  He said:

Your security strategy should be based on the assumption that you WILL lose your backup tapes.  You will be hacked.  You will have your customer’s name, SS numbers, and bank account information published on a website.

He then goes on to say, “…if your binary data is going to go missing, it had best be encrypted. Encrypt it at rest, in flight, on the truck, on the disk, in the lab, in the warehouse, everywhere.  Encrypt it so when you lose it, it gets stolen, or Chuck leaves the tape on his dashboard while at the bar, it can’t do you any harm.”

Let me start where I do agree.  Encrypt your backup tapes. Encrypt your backup tapes. Let me say it again: Encrypt your backup tapes! With in-drive encryption built into any tape drive worth its salt, it’s a no-brainer.  (You do need to make sure you have a good key management system.)

Where I don’t agree with Steve is when he recommends that you should encrypt your disk drives.  (BTW, I respect Steve a lot and I’m sure he’ll appreciate this blog post as much as the next guy.)  I will go with his assumption (that you’ve been hacked) and explain why encrypting data on disks at rest wouldn’t help.

If the host storing the data has been hacked, the hacker is accessing your system like any other user.  Data encrypted at the drive level is automatically decrypted for the host that is reading the data.  It’s as if you aren’t encrypting; otherwise the apps reading the data wouldn’t be able to read it.  Data encrypted at the application level doesn’t protect you if the server has been hacked, either, because the hacker can just become the appropriate user that runs the app, and voila!  He sees the data unencrypted.  What if the server isn’t hacked, but the SAN is?  If the SAN is hacked between the host and the encryption device (assuming in-SAN encryption, not host-based encryption), such as via WWN spoofing or the like, then the attacker will be able to read the data as well, so you’re not protecting against that.
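
To make the drive-level point concrete, here is a minimal sketch in Python.  The `ToySelfEncryptingDisk` class and its SHA-256-based XOR keystream are hypothetical stand-ins made up for illustration (this is not real cryptography and not any vendor's design).  The device encrypts on write and decrypts on read, so anything that reads through the host interface, including an intruder on a compromised host, gets plaintext.

```python
import hashlib


def keystream(key: bytes, block_no: int, length: int) -> bytes:
    """Toy keystream derived from SHA-256; for illustration only, not secure."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(
            key + block_no.to_bytes(8, "big") + counter.to_bytes(8, "big")
        ).digest()
        counter += 1
    return out[:length]


class ToySelfEncryptingDisk:
    """Hypothetical device that encrypts on write and decrypts on read."""

    def __init__(self, key: bytes):
        self._key = key        # held by the device, never by the caller
        self.raw_media = {}    # what is physically stored: ciphertext only

    def write(self, block_no: int, data: bytes) -> None:
        ks = keystream(self._key, block_no, len(data))
        self.raw_media[block_no] = bytes(a ^ b for a, b in zip(data, ks))

    def read(self, block_no: int) -> bytes:
        stored = self.raw_media[block_no]
        ks = keystream(self._key, block_no, len(stored))
        return bytes(a ^ b for a, b in zip(stored, ks))


disk = ToySelfEncryptingDisk(key=b"device-held key")
secret = b"SSN 000-00-0000"
disk.write(0, secret)

via_host = disk.read(0)          # what any process on the host sees
on_platter = disk.raw_media[0]   # what someone holding only the media sees
```

Reading through `read()` returns the original bytes, while `raw_media` holds only ciphertext.  In this toy model the encryption changes nothing for a reader on the host and only matters when the media leaves without the key.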

Let’s say you didn’t encrypt, and someone grabs a disk out of a RAID array and runs off with it.  They would only have part of the picture and they wouldn’t be able to read any data off that.  (Update: Greg pointed out that this doesn’t apply if we’re talking single disk solutions, or one half of a mirrored double disk solution.  He is right.  This comment only applies to RAID 0+1, 10, 5, 6, etc. where multiple disks are required to create a volume.)
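
As a rough illustration of what one member of a striped set holds, the sketch below deals data out round-robin across disks.  The chunk size, disk count, and function name are assumptions chosen for the example, and parity and mirroring are left out, so this is plain RAID 0-style striping rather than any particular array's layout.

```python
from typing import List


def stripe(data: bytes, num_disks: int, chunk_size: int) -> List[List[bytes]]:
    """Split data into fixed-size chunks and deal them round-robin to disks."""
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for i in range(0, len(data), chunk_size):
        chunk_index = i // chunk_size
        disks[chunk_index % num_disks].append(data[i:i + chunk_size])
    return disks


# 16 chunks of 4 bytes each, labelled so they are easy to recognise.
volume = b"".join(f"c{n:02d}|".encode() for n in range(16))
disks = stripe(volume, num_disks=4, chunk_size=4)

stolen = disks[1]   # one disk pulled from the array
fraction_held = sum(len(c) for c in stolen) / len(volume)
```

Here the stolen disk holds chunks 1, 5, 9 and 13, which is a quarter of the volume with gaps between the pieces.  Anything larger than one chunk is incomplete on that disk.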

The ONLY scenario that encrypting data at rest on disk protects you from is someone literally walking out of your datacenter with the entire disk array on their back and no one seeing it — AND that same person being dumb enough NOT to bring the (much smaller) system that can unencrypt the data with him.  Yeah, that’s gonna happen.

Should every laptop hard drive be encrypted?  Yup.  Should every backup tape be encrypted?  Yup.  Should your smartphone have remote wipe and a really good way to prevent people from accessing it as well?  You bet. They are way too mobile and have way too much sensitive data on them.

But I’m still not sold on storing data at rest on disk in encrypted form.  But I honestly would love for someone to explain to me why I’m wrong.


Written by W. Curtis Preston (@wcpreston), four-time O'Reilly author, and host of The Backup Wrap-up podcast. I am now the Technology Evangelist at Sullivan Strickler, which helps companies manage their legacy data.

13 comments
  • The only real reason I could see for encrypting drives, outside of someone coming in and stealing them, is when a drive goes bad or you’re selling an array when the good old warranty is up and it’s not beneficial to go third party.

    Drive failure obviously does not mean the data cannot be gotten to, and if the drive gets into the hands of someone with the know-how, your data is readily available.

    If you sell your array and neglect to wipe the data first, same situation.

  • Curtis you mention: “But I’m still not sold on encrypting data at rest on disk in encrypted form. But I honestly would love for someone to explain to me why I’m wrong.”

    Not saying you are wrong as you can have an opinion like anyone else.

    However, since we are throwing out hypothetical scenarios, let’s assume that your RAID array is using RAID1 (mirroring) which in that case guess what, all you need is to have one of the drives go missing and it would be an ouch situation, assuming that you have anything of interest to others or yourself.

    How about one of the parity-based approaches which you mention? Ok, sure, in most situations you can take a gamble on the chance that your data is spread across multiple drives. However, just on the remote chance that an array is using a parity chunk size of say 4K to 8K (if not larger, as is becoming the case), let’s just for giggles assume that you have a small 2K document with some rather sensitive data that just happens to land in one of those chunks on the drive that goes missing. Again, this is all hypothetical, balancing threat risks with protection; however, for some environments, taking the chance on that data going missing or explaining why they did not take adequate safeguards can be very expensive.

    Now what about storage where the drives are just installed as JBOD, what if they go missing?

    BTW, fwiw, fyi, here is a post about SEDs (Self Encrypting Disks)
    Securing data at rest: Self Encrypting Disks (SEDs)
    http://storageioblog.com/?p=1734

    Food for thought.

    Cheers
    gs

  • @Rich

    A disk degausser is a whole lot cheaper and less risky than a disk encryption solution.

  • @Greg

    Where the &%^$#%^ are you creating a 2K document? What application is that in? It doesn’t exist!

    As to the other things you point out (RAID 1, JBOD), let me say I’m speaking to people with RAID 10, RAID 5/6, etc. If you’re talking a single disk drive (or a double disk drive with mirroring), then this post obviously doesn’t apply.

  • I totally agree with you on this. This was one reason, I knew Decru would be a failure even before they came out of stealth mode and I told them so, but they didn’t listen, neither did NetApp apparently 🙂

  • Wow W, sorry if I hit a nerve or offended with my response to your post, as I didn’t mean to.

    I saw your note and thought you were simply looking for comments or scenarios to expand the discussion or explore other situations, Oh well.

    However since you are claiming that 2K documents don’t exist, are you kidding as well? Come on W, you can do better than that!

    In fact, simply capture your response to my post, then insert it into notepad or any other editor/basic tool and guess what, you get less than a 2K document!

    Ok, I know, that’s not real world for you, ok, Fair enough, however poke around your website, blogs and other places for documents that are less than 2K and see what you find. In fact, if you don’t have such a tool, download tree size for free or the pro version for a nominal fee and see what it shows for different size files.

    As for getting back to RAID, look into what the chunk size is that is being used. Most have been moving over the past decade or more into multiple Kbytes with some vendors even spreading data in MByte size chunks. You don’t have to take my word, look around, ask around to see what you find.

    Here’s my point, it’s not about 2K documents or chunk size or raid levels and parity.

    The point is that if you have data on a disk that can be retrieved with basic or advanced tools and mechanisms, and you happen to be in an industry, or geography/state/country, that requires taking adequate steps to secure your data against the chance it becomes compromised, then why would you not do so? In other words, by simply enabling encryption of data at rest, regardless of whether it is for RAID or mirror or distributed protection or JBOD, fixed or removable, you take out a very low cost insurance policy to guard against real or perceived threat risks. Plus, using SEDs helps to address the issue of digital shredding instead of the time-consuming task of degaussing and secure erase, all of which are options of course.

    There you have it, like it or not, agree or not, you asked for comments, feel free to remove/delete or disagree.

    Hope all is well
    Cheers
    gs

  • @Greg

    LOL. No offense at ALL! My “^^%&^O” was a friendly “WTH?” not an offended one. As in something I’d say to you in a bar while the two of us were having friendly banter — NOT something I’d say to your face begging you to hit me. 😉

    BTW, I created a Word doc with “Hello world” in it and saved it. 29KB.

    I will concede that you have pointed out a scenario (albeit an incredibly rare one) by which a person who steals an unencrypted drive might be able to get a REALLY TINY file (which isn’t going to be an Excel/Word/PPT file, cause there’s no way they make files that small, or any kind of data in a database). If:

    1. There is a file so small that has any kind of value in it (it would have to be pure text. What would that be? OK, source code.)
    2. It fits perfectly inside the boundary of a block
    3. The person in question reads every single block on the drive individually
    4. The person is sharp enough to deduce what kind of file a given block is

    OK, maybe there’s an INCREDIBLY REMOTE chance here.
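
Steps 3 and 4 in the list above can be sketched roughly as follows.  The function names, the 16-byte block size, the 95% threshold, and the sample bytes are all assumptions made up for illustration.  It is a crude printable-ASCII heuristic, not a real forensic tool.

```python
import string

PRINTABLE = set(string.printable.encode("ascii"))


def looks_like_text(block: bytes, threshold: float = 0.95) -> bool:
    """Heuristic: treat a block as text if almost all bytes are printable ASCII."""
    if not block:
        return False
    printable = sum(1 for b in block if b in PRINTABLE)
    return printable / len(block) >= threshold


def text_blocks(raw: bytes, block_size: int) -> list:
    """Walk the raw image block by block; return indexes that look like text."""
    hits = []
    for offset in range(0, len(raw), block_size):
        if looks_like_text(raw[offset:offset + block_size]):
            hits.append(offset // block_size)
    return hits


BLOCK = 16
image = (
    bytes(range(128, 144))     # block 0: binary data
    + b"password=hunter2"      # block 1: a tiny text file filling one block
    + bytes([0, 255] * 8)      # block 2: binary data
)
found = text_blocks(image, BLOCK)
```

In this made-up image only block 1 is flagged.  The scan itself is cheap, so the rare part of the scenario is conditions 1 and 2, not 3 and 4.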

    The only issue I have with SEDs in an array is key management.

  • Since you’re now a famous blogger I suggest you be careful about “it’s” vs. “its”. For example, you said “any tape drive worth it’s salt, it’s a no-brainer”. This is a pathological example.

    The “it’s vs. its” Police

  • @Sudsy

    I’m actually usually the one catching those sorts of things. I DO know the difference, but I was in a rush this morning.

    Pathological?

  • [quote name=W. Curtis Preston]@Rich

    A disk degausser is a whole lot cheaper and less risky than a disk encryption solution.[/quote]
    And also is against the warranty on a lot of disk arrays – you want to degauss, you need the NRDK option on the disks, as otherwise the vendor wants them back. It also negates most Failure Analysis actions, so you won’t be able to determine the cause of the failure on the disk.

    And if you get NRDK on the disks, a disk shredder is a LOT more fun than the degausser. 😉

  • Single disk out of a raid 10 can still generate lots of interesting, if incomplete data. Rather than talk about a single file, how about a database table? Especially since that is the most likely type of data going onto a raid 10.

    If a typical row size is a full 1k, then a 64k stripe chunk will still yield 60 full rows. So you lose a couple rows at each end, but still a lot of usable data that can be snooped through. And that is just one stripe chunk.

    If you are looking for “juicy” data, it doesn’t matter if you have all of it. There is plenty of it on the disk you have. At that point ppl can start arguing whether you should have used application, system or disk encryption, but you wanted a realistic scenario first.
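
The arithmetic in this comment can be sketched as follows.  The function name and the 300-byte offset are assumptions for illustration; the 1K row and 64K chunk sizes come from the comment.  The count of complete rows depends on where the first row boundary falls inside the chunk.

```python
def full_rows_in_chunk(chunk_size: int, row_size: int, first_row_offset: int) -> int:
    """Count rows that lie entirely inside one chunk.

    first_row_offset is how many bytes into the chunk the first
    complete row begins (0 means rows are aligned to the chunk).
    """
    usable = chunk_size - first_row_offset
    return max(usable, 0) // row_size


CHUNK = 64 * 1024   # 64K stripe chunk, from the comment
ROW = 1024          # 1K row, from the comment

aligned = full_rows_in_chunk(CHUNK, ROW, first_row_offset=0)
misaligned = full_rows_in_chunk(CHUNK, ROW, first_row_offset=300)
```

With perfectly aligned rows the chunk holds 64 complete rows, and with the misaligned start it holds 63, so the comment's figure of 60 leaves some margin for partial rows and overhead.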

  • Ok, ok, uncle!

    I was wrong. (Not surprising, given that I stepped outside of my specialty area.)

    I learned some good stuff here. Blog forthcoming.

  • Curtis, thank you for the info.

    Data Domain has a disk encryption offering; however, I am trying to understand what it really buys me. For example, we will have 3 or 4 EV sites (vaulting sites) that handle tape. Once the data sets are replicated to the EV site, the data will be sent off to tape, leveraging the tape encryption platform we currently have. The remainder of the data is IP or VTL to disk. Data Domain is pitching their encryption option to eliminate tape altogether. Data Domain’s pitch is that encrypting data at rest satisfies some aspects of internal governance rules and compliance regulations. It protects user data against theft of a Data Domain system and loss of the physical storage media during transit, and eliminates accidental exposure during the replacement of failed drives. Data Domain’s setup is dual disk parity RAID 6.

    1. So obviously tapes that go offsite are already encrypted with our tape encryption platform at the Data Domain EV sites (vaulting sites), so I am not worried about tape media in transit.

    2. If a DD drive has failed and needs to be replaced, I am not too worried about the data that is on the drive in an unencrypted state, as this is RAID 6. Even if EMC took the failed drive, or it was lost and somebody tried to grab the unencrypted data off of it, they would need all drives to recreate the volume, as you stated.

    Does leveraging Data Domain’s encryption offering provide me something that I am missing here, besides trying to eliminate tape? Is it worth its salt, as you would say? Any information would help. THX