Stephen Manley published a blog post today called “Tape is Alive? Inconceivable!” To which I have to reply with a quote from Inigo Montoya: “You keep using that word. I do not think it means what you think it means.” I say that because, for me, it’s very conceivable that tape continues to play the role that it does in today’s IT departments. Yes, its role is shrinking in the backup space, but it’s far from “dead,” which is what Stephen’s blog post suggests it should be.
He makes several good points as to why tape should be dead by now. I like and respect Stephen very much, and I’d love to have this discussion over drinks at EMC World or VMworld sometime. I hope that he and his employer see this post as helping him to understand what people who don’t live in the echo chamber of disk think about tape.
Stephen makes a few good points about disk in his post. The first point is that the fastest way to recover a disk system is to have a replicated copy standing by ready to go. Change where you’re mounting your primary data and you’re up and running. He’s right. He’s also right about snapshots or CDP being the fastest way to recover from logical corruption, and the fastest way to do granular recovery of files or emails.
In my initial post on the LinkedIn discussion that started this whole thing, I made additional “pro-disk” points. First, I said that tape is very bad at what most of us use it for: receiving backups across a network, especially incremental backups. I also mentioned that tape cannot be RAID-protected, where disk can be, and that disk enables deduplication, CDP, near-CDP and replication, all of which are better ways to get your data offsite than handing tape to a dude in a truck. I summarized with the statement that I believe that disk is the best place for day-to-day backups.
Disk has all of the above going for it. But it doesn’t have everything going for it, and that’s why tape isn’t dead yet — nor will it be any time soon.
I do have an issue or two with the paragraph in Stephen’s post called “Archival Recovery.” First, there is no such thing. It may seem like semantics, but one does not recover from archives; one retrieves from archives. If one is using archive software to do their archives, there is no “recover” or “restore” button in the GUI. There is only “retrieve.” Stephen seems to be hinting at the fact that most people use their backups as archives, a practice that he and I agree is bad. Where we disagree is whether or not moving many-years-old backup data to disk solves anything. My opinion is that the problem is not that the customer has really old backups on tape. The problem is that they have really old backups. Doing a retrieval from backups is always going to be a really bad thing (regardless of the media you use) and could potentially cost your company millions of dollars in fines and billions of dollars in lost lawsuits if you’re unable to do it quickly enough. (I’ll be making this point again later.)
Disk is the best thing for backups, but not everyone can afford the best. Even companies that fill their data centers with deduplicated disk and the like still tend to use tape somewhere, mainly for cost reasons. They put the first 30-90 days on deduped disk, then they put the next six months on tape. Why? Because it’s cheaper. If it weren’t cheaper, there would be no reason for them to do this. (This is also the reason why EMC still sells tape libraries: people still want to buy them.)
Just to compare cost, at $35 per 1.5 TB tape, storing 20 PB on LTO-5 tapes costs about $467K with no compression, or about $233K with 2:1 compression. In contrast, the cheapest disk system I could find (Promise VTrak 32TB unit) would cost me over $12M to store that same amount of data. Even if I got a 20:1 dedupe ratio in software (which very few people get), it would still cost over $600K (plus the cost of the capacity-based dedupe license from my backup software company).
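The arithmetic behind that comparison can be sketched in a few lines. This is a rough media-only calculation using the prices quoted above; the decimal units (1 PB = 1,000 TB), the rounding up to whole cartridges, and treating "over $12M" as a flat $12M floor are my assumptions, and the function names are purely illustrative.

```python
import math

# Figures quoted in the post.
TAPE_PRICE_USD = 35.0          # per LTO-5 cartridge
TAPE_NATIVE_TB = 1.5           # LTO-5 native capacity
DISK_COST_USD = 12_000_000.0   # "over $12M" for 20 PB, treated here as a floor

# Assumption: decimal units, so 20 PB = 20,000 TB.
DATA_TB = 20 * 1000


def tape_media_cost(data_tb, compression_ratio=1.0):
    """Media-only tape cost: whole cartridges needed times price per cartridge."""
    effective_tb = TAPE_NATIVE_TB * compression_ratio
    cartridges = math.ceil(data_tb / effective_tb)
    return cartridges * TAPE_PRICE_USD


def deduped_disk_cost(raw_disk_cost, dedupe_ratio):
    """Disk cost after dedupe, ignoring any capacity-based license fees."""
    return raw_disk_cost / dedupe_ratio


tape_native = tape_media_cost(DATA_TB)
tape_compressed = tape_media_cost(DATA_TB, compression_ratio=2.0)
disk_deduped = deduped_disk_cost(DISK_COST_USD, 20)

print(f"Tape, no compression:  ${tape_native:,.0f}")
print(f"Tape, 2:1 compression: ${tape_compressed:,.0f}")
print(f"Disk, 20:1 dedupe:     ${disk_deduped:,.0f}")
```

Even with an optimistic 20:1 dedupe ratio, the disk figure stays above the uncompressed tape figure, and that is before adding the dedupe license.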
Tape is also the cheapest way to get data offsite and keep it there. Making another copy on tape at $.013/GB (current LTO-5 pricing) and paying ~$1/tape/month to Iron Mountain is much cheaper than buying another disk array (deduped or not) and replicating data to it. The disk array is much more expensive than a tape, and then you need to pay for bandwidth, power the equipment providing that bandwidth, and power the disks themselves. The power alone for that equipment will cost more than the Iron Mountain bill for the same amount of data, and then you have the bill for the bandwidth itself.
Now let’s talk about long-term archives. This is data stored for a long time that doesn’t need to be in a library. It can go on a shelf and that’ll be just fine. Therefore, the only cost for this data is the cost of the media and the cost of cooling/dehumidifying something that doesn’t generate heat. I can put it on a tape and never touch it for 30 years, and it’ll be fine (Yes, I’m serious; read the rest of the post). If I put it on disk, I’m going to need to buy a new disk every five years and copy it. So, even if the media were the same price (which it most certainly is not), the cost to store it on disk would be six times the cost of storing it on tape.
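Here is where the "six times" comes from, as a minimal sketch. It assumes media prices stay flat, that disk is replaced every five years as described above, and that one tape lasts the whole 30-year retention period; the function name is hypothetical.

```python
import math


def media_purchases(retention_years, media_lifetime_years):
    """How many times the media must be bought to cover the retention period."""
    return math.ceil(retention_years / media_lifetime_years)


RETENTION_YEARS = 30

disk_purchases = media_purchases(RETENTION_YEARS, 5)    # new disk every five years
tape_purchases = media_purchases(RETENTION_YEARS, 30)   # one tape for the full period

# With equal media prices, the cost ratio is just the ratio of purchases.
cost_multiple = disk_purchases / tape_purchases
print(f"Disk purchases: {disk_purchases}, tape purchases: {tape_purchases}")
print(f"Disk costs {cost_multiple:.0f}x tape over {RETENTION_YEARS} years")
```

This leaves out the labor and write-error exposure of each disk-to-disk copy, which only pushes the comparison further in tape's favor.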
Never underestimate the bandwidth of a truck. ’Nuff said. Lousy latency, yes. But definitely unlimited bandwidth.
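To put a rough number on the truck, here is a back-of-the-envelope sketch. The load (1,000 LTO-5 cartridges) and the 24-hour door-to-door time are made-up inputs for illustration only, not figures from any real shipment.

```python
def truck_bandwidth_gbps(cartridges, tb_per_cartridge, transit_hours):
    """Effective throughput of physically shipping tapes, in gigabits per second."""
    total_bits = cartridges * tb_per_cartridge * 1e12 * 8  # TB -> bytes -> bits
    transit_seconds = transit_hours * 3600
    return total_bits / transit_seconds / 1e9


# Hypothetical shipment: 1,000 cartridges at 1.5 TB native, delivered in 24 hours.
rate = truck_bandwidth_gbps(cartridges=1000, tb_per_cartridge=1.5, transit_hours=24)
print(f"Effective bandwidth: {rate:.1f} Gbit/s")
```

Adding more tapes to the truck raises the throughput without changing the transit time, which is the point: latency is fixed and lousy, but bandwidth scales with the size of the load.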
Integrity of Initial Write
LTO is two orders of magnitude better at writing bits than enterprise-grade SATA disks, which is what most data protection data is stored on. The undetectable bit error rate of enterprise SATA is 1:10^15, and LTO is 1:10^17. That’s roughly one undetectable error every 100 TB with SATA disk and one undetectable error every 10 PB with LTO. (If you want more than that, you can have roughly one error every Exabyte with the Oracle and IBM drives.) I would also argue that if one error every 10 PB is too much, then you can make two copies, at a cost an order of magnitude less than doing it on disk. There’s that cost argument again.
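Converting those bit error rates into data volumes is simple division. The sketch below assumes 8 bits per byte and decimal units, and it takes the one-error-per-Exabyte remark about the Oracle and IBM drives to mean a rate of about 1:10^19, which is my inference rather than a quoted spec.

```python
BITS_PER_BYTE = 8
TB = 10**12
PB = 10**15
EB = 10**18


def bytes_per_undetected_error(bits_per_error):
    """Average number of bytes written per undetected bit error."""
    return bits_per_error // BITS_PER_BYTE


sata_tb = bytes_per_undetected_error(10**15) / TB        # enterprise SATA
lto_pb = bytes_per_undetected_error(10**17) / PB         # LTO
enterprise_eb = bytes_per_undetected_error(10**19) / EB  # assumed 1:10^19 drives

print(f"Enterprise SATA: one undetected error per {sata_tb} TB")
print(f"LTO:             one undetected error per {lto_pb} PB")
print(f"Enterprise tape: one undetected error per {enterprise_eb} EB")
```

The exact results are slightly larger than the round numbers usually quoted, but the gap between SATA and LTO is the same factor of 100 either way.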
As I have previously written, tape is also much better than disk at holding onto data for periods longer than five years. This is due to the physics of how disks and tapes are made and operated. There is a formula (KuV/kT) that I explain in a previous blog post; it shows that the bigger your magnetic grains are, the better, and the cooler your device is, the better. The resulting value of this formula gives you an understanding of how well the device will keep its bits in place over long periods of time without suffering what is commonly called “bit rot.” Disks use significantly smaller magnetic grains than tape, and disks run at very high operating temperatures, where tape is stored at ambient temperatures. The result is that disk cannot be trusted to hold onto data for more than five years without suffering bit rot. If you’re going to store data longer than five years on disk, you must move it around. And remember that every time you move it around, you’re subject to the lower write integrity of disk.
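For readers who want the formula written out, here it is in standard notation. The symbol names follow the usual magnetic-recording convention. The exponential relation on the second line is the textbook Néel-Arrhenius form, which I am adding for context; it is not something the earlier post necessarily spelled out.

```latex
% Thermal stability ratio of a magnetic grain
%   K_u : magnetic anisotropy constant of the grain material
%   V   : grain volume (bigger grains are more stable)
%   k_B : Boltzmann constant
%   T   : absolute temperature (cooler is more stable)
\[
  \Delta = \frac{K_u V}{k_B T}
\]

% Assumed textbook relation: expected time before a grain flips on its own,
% where \tau_0 is a material-dependent attempt time
\[
  \tau \approx \tau_0 \exp\!\left(\frac{K_u V}{k_B T}\right)
\]
```

Because the ratio sits inside an exponent, modest differences in grain volume or temperature translate into very large differences in how long the bits stay put.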
I know that those who are proponents of disk-based systems will say that because it’s on disk you can scan it regularly. People who say that obviously don’t know that you can do the same thing on tape. Any modern tape drive supports the SCSI verify command that will compare the checksums of the data stored on tape with the actual data. And modern tape libraries have now worked this into their system, automatically verifying tapes as they have time.
Only optical (i.e. non-magnetic) formats (e.g. Blu-ray, UDO) do a better job of holding onto data for decades. Unfortunately they’re really expensive. Last I checked, UDO media was 75 times more expensive than tape.
Air Gap [Update: I added this a day after writing the initial post because I forgot to add it]
One thing tape can do that replicated disk systems cannot do is create a gap of air between the protected data and the final copy of its backup. Give the final tape copy to Iron Mountain and you create a barrier to someone destroying that backup maliciously. One bad thing about replicated backups is that a malicious sysadmin can delete the primary system, backup system, and replicated backup system with a well-written script. That’s not possible with an air gap.
People who don’t like tape also like to bring up device obsolescence. They say things like “you can’t even get a device to read the tape you wrote 10 years ago.” They’re wrong. First, even if you completely failed to plan, there is a huge market for older tape drives and you can find any tape drive used in the last 20-30 years on eBay if you have no other choice. (I know because I just did it.)
Second, if you’re keeping tapes from twenty-year-old tape drives, you should be keeping the drives. Duh. And if those drives aren’t working, there are companies that will repair them for you. No problem, easy peasy. Device obsolescence is a myth.
Suppose you have a misbehaving disk from many years ago. There are no disk repair companies. There are only data recovery companies that charge astronomical amounts of money to recover data from that drive.
Now consider what you do if you have a malfunctioning tape, which is rare, because there’s not much to malfunction. I have been able to “repair” all of the physically malfunctioning tapes I have ever experienced (which is only a few out of the hundreds of thousands of tapes I’ve handled). The physical structure of a modern tape spool is not that difficult to understand, take apart, and reassemble.
Now consider what happens when your old tape drive malfunctions, which is much more likely. You know what you do? Use a different drive! If you don’t have another drive, you can just send the one that’s malfunctioning to a repair shop that will cost you far less than what a data recovery company will cost you. If you’re in a hurry, buy another one off eBay and have them rush it to you. Better yet, always have a spare drive.
This isn’t really a disk-vs-tape issue, but I just had to comment on the customer that Stephen quoted in his blog post as saying, “I’m legally required to store data for 30 years, but I’m not required by law or business to ever recover it. That data is perfect for tape.” That may be a statement that amuses someone who works for a disk company, but I find the statement to be both idiotic and irresponsible. If one is required by law to store data for 30 years, then one is required by law to be able to retrieve that data when asked for it. This could be a request from a government agency, or an electronic discovery request in a lawsuit. If you are unable to retrieve that data when you were required to store it, you run afoul of that agency and will be fined or worse. If you are unable to retrieve the data for an electronic discovery request in a lawsuit, you risk receiving an adverse inference instruction by the judge that will result in you losing the lawsuit. So whoever said that has no idea what he/she is talking about.
Think I’m exaggerating? Just ask Morgan Stanley, who up until the mid 00’s used their backups as archives. The SEC asked them for a bunch of emails, and their inability to retrieve those emails resulted in a $15M fine. They also had a little over 1400 backup tapes that they needed months of time to be able to pull emails off of to satisfy an electronic discovery request from a major lawsuit from Coleman Holdings in 2005. (They needed this time because they stored the data via backup software, not archive software.) The judge said “archive searches are quick and inexpensive. They do not cost ‘hundred of thousands of dollars’ or ‘take several months.'” (He obviously had never tried to retrieve emails off of backup tapes.) He issued an adverse inference instruction to the jury that said that this was a ploy by Morgan Stanley to hide emails, and that they should take that into consideration in the verdict. They did, and Morgan Stanley lost the case and Coleman Holdings was given a $1.57B judgment.
Why isn’t tape dead? Because there are plenty of things that it is better at than disk. Yes, there are plenty of things that disk is better at than tape. But move all of today’s production, backup, and archive data to disk? Inconceivable!
----- Signature and Disclaimer -----
Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.