A week at NABShow (National Association of Broadcasters) and two days at Tape Summit last week have given me a chance to revisit my thoughts on tape. Here's a brief summary of how my opinion of tape has changed over the years:
Stage 1: Tape was it. It was all I knew. Backing up to disk was crazy, as it was too expensive. (early 90s)
Stage 2: Tape was still it, but tape drives were getting too fast. Multiplexing or disk staging was starting to be required. Disk was too expensive to hold backups long term.
Stage 3: The dedupe craze hit. It became both theoretically possible and financially feasible (for some) to store all backups on disk, and still have an offsite copy.
Stage 4: (Pretty recently). I compared the pricing of today's dedupe systems to similarly-sized tape systems. I was shocked at how expensive disk still was (4x-8x the price of tape).
Stage 5: (Today) I think we have unnecessarily put a very good backup and archive target out to pasture and we should really reconsider that.
[Update: Because people tend to read my old articles, I'm going to update this one almost a year later to reflect my current position on tape.]
[Update2: I just wrote this blog post about my response to another article about this topic.]
First, let me state that I am not saying that we should not have disk in a backup system, or that deduped systems are over-rated. What I am saying is that tape has more to offer than we've been giving it credit for lately. Here are some factors that came into my mind while considering this:
It costs 4-8 times more to acquire a disk-based backup system than it does to acquire an automated tape system.
[Update 3/9/12: Pricing obviously changes all the time, and prices on disk have come down since this original post. I even have some vendors that claim to be as cheap as tape on the initial purchase of one disk system vs. one tape robot in some situations.]
While I've heard this from multiple sources, let me give you a real-life example to drive home this point. I recently priced tape libraries and dedupe disk systems for a 20 TB shop, and I was surprised to learn that disk was actually still way more than the price of tape — even after dedupe. The average street price of the tape libraries I was considering was about $15K, and the average price of the dedupe systems was about $60K. Since the customer was getting rid of their (very old) tape library, their choices were:
A) Buy a new tape library, copy tapes and hand them to a dude in a truck ($15K)
B) Buy a dedupe system AND a tape library. Copy from the dedupe system to the tape library, and then hand tapes to a dude in a truck. ($60K + $15K)
C) Buy two dedupe systems and replicate between them (no truck needed) ($120K)
Option C was 8 times more expensive than Option A and was out of the question. While it meant they could get rid of their Iron Mountain bill, they did not believe they could ever save enough money to recoup that additional $105K. Option B offered no cost savings, so it was difficult to justify the additional $60K. I pointed out that Option A (if done correctly) requires a disk cache in front of their tape library, but they informed me that they were already doing that. (Based on their throughput requirements, though, adding a disk cache wouldn't have added that much to the price.)
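As a quick check on the arithmetic behind those three options, here is a minimal Python sketch. It uses only the two street prices quoted above; the variable names are mine, and it deliberately ignores ongoing costs such as offsite vaulting fees, power, and the disk cache.

```python
# Street prices from the 20 TB example above, in US dollars.
TAPE_LIBRARY = 15_000
DEDUPE_SYSTEM = 60_000

# Acquisition cost of each option (purchase price only, no running costs).
options = {
    "A": TAPE_LIBRARY,                  # new tape library, tapes go on the truck
    "B": DEDUPE_SYSTEM + TAPE_LIBRARY,  # dedupe system copied to tape
    "C": 2 * DEDUPE_SYSTEM,             # two dedupe systems, replicated
}

baseline = options["A"]
for name, cost in options.items():
    extra = cost - baseline
    print(f"Option {name}: ${cost:,} ({cost / baseline:.0f}x A, ${extra:,} more than A)")
```

Run as written, this reproduces the figures in the text: Option C is 8 times the price of Option A and costs $105K more, while Option B adds $60K.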
You can undoubtedly make an argument that a backup-to-disk system is easier to manage than a hybrid tape system, but the simple fact is that the disk system will be more expensive to purchase.
Tape actually has a better bit error rate than disk
For those unfamiliar with the concept of bit error rate (BER), the following definition from Wikipedia should be helpful:
"The bit error rate or bit error ratio (BER) is the number of bit errors divided by the total number of transferred bits during a studied time interval. … The bit error probability p_e is the expectation value of the BER. The BER can be considered as an approximate estimate of the bit error probability. This estimate is accurate for a long time interval and a high number of bit errors."
LTO-5 has a bit error rate of 1:10^17. The TS1130 from IBM and the T10000C from Oracle both have a BER of 1:10^19. SATA disk has a BER of 1:10^14 (SAS/FC is 1:10^15, but no one is using that for backup or archive). This will probably come as a surprise to many people. Tape has actually gotten so good at writing data that it is more reliable at writing data than disk!
While 10^15 may look really close to 10^17, it's not. When it's bits we're talking about, it's the difference between one bad bit in roughly every 113 TB and one in roughly every 11.1 PB! It means that, even using the better SAS/FC figure, you are 100 times more likely to have bad data on disk than you are on an LTO-5 tape drive, and 10,000 times more likely than if the data is stored on a T10000C or TS1130 drive! Against SATA at 10^14, each of those gaps is ten times wider.
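The conversion from a BER to "how much data per bad bit" is easy to check. The sketch below is illustrative only: the function names are mine, and it assumes TB and PB are meant as binary units (2^40 and 2^50 bytes), which is what makes the 113 and 11.1 figures come out.

```python
TIB = 2 ** 40  # bytes per binary terabyte (assumed unit for "TB" above)
PIB = 2 ** 50  # bytes per binary petabyte (assumed unit for "PB" above)


def bytes_per_bit_error(exponent: int) -> float:
    """Average bytes transferred per bit error at a BER of 1 in 10**exponent."""
    return 10 ** exponent / 8


def times_more_likely(worse_exponent: int, better_exponent: int) -> int:
    """How many times more often the worse medium sees a bit error."""
    return 10 ** (better_exponent - worse_exponent)


disk_tib = bytes_per_bit_error(15) / TIB  # SAS/FC disk
lto5_pib = bytes_per_bit_error(17) / PIB  # LTO-5 tape

print(f"Disk at 10^15: one bad bit per {disk_tib:.1f} TiB")
print(f"LTO-5 at 10^17: one bad bit per {lto5_pib:.1f} PiB")
print(f"Disk (10^15) vs LTO-5 (10^17): {times_more_likely(15, 17):,}x")
print(f"Disk (10^15) vs 10^19 drives: {times_more_likely(15, 19):,}x")
```

This gives about 113.7 TiB and 11.1 PiB, with ratios of 100 and 10,000. Passing 14 instead of 15 shows the comparison for SATA.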
Tape uses less power than disk
Every time I calculate power consumption for tape systems vs. disk systems, tape systems win. The reason for this is that tapes in slots take up no power at all, tape drives use very little power while they're not doing anything, and you need far fewer tape drives than you need disk drives. I recently did a comparison for a 20 TB shop that resulted in at least a 2X difference in power consumption, and that included enough disk to do disk staging before the tape system. (I plan to publish this once I double/triple check my numbers, but right now I feel pretty safe in saying at least a 2X difference.)
You buy the system once; you power it all day long every day.
Long-term (5+ years) storage of data on disk is not compatible with the typical lifecycle of disk, but it is compatible with tape.
This one is something we don't talk about. An individual tape is made to hold data much longer than an individual disk, and the lifecycle of most tapes is much longer than the lifecycle of most datasets. You cannot say the same about disks. Storing data on disks for more than 5 years automatically assumes that you're going to migrate data from one disk unit to another.
In addition to the media, it is also very common for tape libraries and tape drives to outlast the disk systems sitting next to them. Where most companies migrate data at the end of the depreciation cycle for disk, they tend to keep their tape libraries and drives much longer than that. They also tend to swap out their drives in the tape libraries; the same is not true in disk units. If you find a disk system in your data center older than five years, I'd be shocked.
What's the problem then?
Let's throw out the claims I've heard:
1. Tape has bitrot
So does disk. It's called magnetism. It happens. The chances of bitrot happening on tape are far less than the chances of it happening on disk. [Update: See this post for further info on this.]
2. Tape is flimsy
Tell you what. Move disks around the way you move tapes around and see how flimsy they are.
3. 80% of tape restores fail. [Update 3/9/12: This is a fake statistic that never existed. See my updated blog post.]
This Gartner statistic has been thrown around so much and I really don't know where Gartner got this number from, but it's out there. [Update: This Gartner statistic never existed.] What I can tell you is that in my entire career of working with backups, I've only had one or two restores that failed due to an actual bad tape — and that's why we make copies. But I can tell you of dozens of situations where bad disk drives caused me all sorts of headaches.
I can also tell you that most of the restore failures I've seen have been caused by human error – not tape failure.
4. Tape is too slow
Baloney. Check your facts again. There isn't a disk drive alive that can keep up with the speed of today's tape drives.
5. Tape is hard to make happy during backups & restores
Agreed. This is why I believe strongly in using at least disk caching. I would never design a system that uses just tape to do backups at this point. I'm actually OK with all of the designs mentioned above (in the A, B, C list). I think dedupe systems are awesome, and the idea of replicating to another one is even better. But I also know that doing this is more expensive than the alternative. The other thing I know is that it can't possibly be cheaper to store data on disk for many, many years, and it may even be risky to do so. (See my comments on BER.)
What I'm really making an argument for is the use of tape for long term archiving, and as a less expensive way of getting data offsite. (Less expensive than having a second dedupe system and replicating to it.)
6. Tapes go bad sitting on the shelf and you never know they're bad until you need them
That is correct. This is why both Spectra Logic and Quantum have come up with products to proactively scan your old archives to find and fix any corruption issues before you need a given tape. If a scan finds something wrong, the bad copy can be fixed by copying from the other copy that you have.
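The general idea of scan-and-repair can be sketched in a few lines of Python. To be clear, this is not how the Spectra Logic or Quantum products work internally (I'm not describing their implementation). It is a hypothetical illustration that assumes each archive copy is readable as a file and that you recorded a SHA-256 checksum when the archive was written. The function names `sha256_of` and `scan_and_repair` are made up for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large archives don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_and_repair(primary: Path, secondary: Path, expected: str) -> str:
    """Verify the primary copy; if it is bad, restore it from a good secondary.

    Returns "ok", "repaired", or "unrecoverable".
    """
    if sha256_of(primary) == expected:
        return "ok"
    if sha256_of(secondary) == expected:
        # The secondary still matches the recorded checksum: overwrite the bad copy.
        shutil.copyfile(secondary, primary)
        return "repaired"
    # Neither copy matches the checksum recorded at write time.
    return "unrecoverable"
```

The point of the design is the same one made above: a scan is only useful if you have a second copy to repair from, which is one more reason to make copies.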
Tape can be your friend for long term archives and cheap offsite storage. Don't dismiss it so lightly.
----- Signature and Disclaimer -----
Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Evangelist at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.