


Written by W. Curtis Preston
Wednesday, 28 May 2008 04:07
I received an email today telling me about a whitepaper sponsored by the LTO group and written by The Clipper Group. I'm used to seeing such whitepapers, and used to seeing them state things in such a way that makes the point the sponsor of the paper is trying to make. I've even written a number of these whitepapers myself. But this one just takes the cake, and I'd like to tell you why.
Feel free to take a look at the whitepaper itself and/or the webcast that talks about it. (If the ultrium.com link doesn't work, it's not my fault. It wasn't working for me either tonight.)
First, I am no tape hater, nor am I being paid to promote disk over tape. I also don't have a problem with using tape for long term archiving.
But I really don't like it when whitepapers use statements to back up their claims when those statements are either untrue or seriously weighted in their favor. Here are some examples from this paper.
- They use an average compression rate of 2:1, despite the fact that most customers get much less than that on average.
- They use hardware compression in their figures for tape, but ignore the fact that all major VTLs have hardware compression.
- Their per-TB pricing for disk is only found in the most expensive disk systems.
- Their TCO includes replacing the disk arrays at the end of their 3-year warranty. Are you SERIOUS?
- They use the latest tape drive (LTO-4), but do not use the latest disk drives in use (1 TB).
- The RAID configuration seems designed to increase cost (two RAID-5 LUNs per drawer, a spare disk drive per drawer, no concept of global spares, etc.).
- They use 85% of LTO-4's write speed plus compression, ignoring the FACT that 90% of customers get less than 50% of the rated throughput of their drives. (Based on my personal observation of data from over 100 customers.)
- "Did not consider LTO-5," but they sure hint at it -- a technology that is promised "sometime in the next two years." If you're going to mention futures, make sure you mention everybody's futures, not just yours.
- None of the calculations (acquisition cost, power/cooling, floor space) take deduplication into account. When they mention it in passing, they state incorrectly that while dedupe will reduce the power/cooling, it will increase the cost. That is simply not the case. The per-GB pricing of dedupe disk is significantly less than non-dedupe disk.
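To put rough numbers on the throughput bullet above, here is a minimal sketch comparing the rate the paper assumes (85% of rated speed with 2:1 compression on top) against the under-50% ceiling described in that bullet. The 120 MB/s figure is LTO-4's published native transfer rate. Reading "rated throughput" as that native rate is an assumption made for illustration, and the function names are hypothetical.

```python
# LTO-4 published native (uncompressed) transfer rate, in MB/s.
LTO4_NATIVE_MBPS = 120.0


def paper_assumed_rate(native_mbps=LTO4_NATIVE_MBPS, utilization=0.85, compression=2.0):
    """Rate implied by '85% of write speed plus compression' (2:1 assumed)."""
    return native_mbps * utilization * compression


def observed_ceiling(native_mbps=LTO4_NATIVE_MBPS, fraction=0.5):
    """Upper bound if a drive gets at most half of its rated throughput.

    Assumption: 'rated throughput' is taken to mean the native rate.
    """
    return native_mbps * fraction


if __name__ == "__main__":
    assumed = paper_assumed_rate()
    ceiling = observed_ceiling()
    print(f"Paper's assumed rate: {assumed:.0f} MB/s")
    print(f"Observed ceiling:     {ceiling:.0f} MB/s")
    print(f"Ratio:                {assumed / ceiling:.1f}x")
```

Under these assumptions the paper's figure is several times higher than what most drives would actually sustain, which feeds directly into how many drives a given backup window requires.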
Their "other factors" completely ignore the benefits of disk, and state as benefits of tape things that are almost all also benefits of disk. For example:
- Tape is removable and portable
- While disk is not removable and portable, it can be replicated, and tape cannot. If you want backups offsite, you can replicate them there, especially if you have deduplication. As to being susceptible to corruption, virtual tapes can be made read-only as well.
- Tape is fast
- Tape is reliable
- Not only are individual disk drives inherently more reliable than individual tape drives and tapes, they can be RAID-protected, where tape cannot.
- Tape can be encrypted
- Tape has WORM
How can you release a whitepaper today that talks about the relative TCO of disk and tape, and not talk about deduplication? Here's the really hilarious part: one of the assumptions the paper makes is that both the disk and tape solutions will keep the first 13 weeks on disk, and the TCO analysis only looks at the additional disk and/or tape needed for long-term backup storage. If you do that AND you include deduplication, dedupe has a major advantage, as the additional storage needed to store the quarterly fulls will be barely incremental. The only additional storage each quarterly full backup will require is the amount needed to store the unique new blocks in that backup. So, instead of needing enough disk for 20 full backups, we'll probably need about 2-20% of that, depending on how much new data is in each full.
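The arithmetic behind that 2-20% range can be sketched as follows. This is a simplified model, not a sizing tool: it assumes the baseline copy already lives in the 13-week disk tier, that each quarterly full adds only its unique new blocks, and that the fraction of new data is the same for every full. The 10 TB full size and the function names are hypothetical.

```python
def raw_storage_tb(full_size_tb, num_fulls):
    """Disk needed to keep every full backup in its entirety."""
    return full_size_tb * num_fulls


def dedupe_storage_tb(full_size_tb, num_fulls, new_data_fraction):
    """Disk needed when each full stores only its unique new blocks.

    Assumption: the baseline already sits in the short-term disk tier,
    so only the new blocks in each full consume extra space.
    """
    return full_size_tb * num_fulls * new_data_fraction


if __name__ == "__main__":
    FULL_SIZE_TB = 10.0  # hypothetical size of one full backup
    NUM_FULLS = 20       # number of long-term fulls, as in the text

    raw = raw_storage_tb(FULL_SIZE_TB, NUM_FULLS)
    for fraction in (0.02, 0.10, 0.20):
        deduped = dedupe_storage_tb(FULL_SIZE_TB, NUM_FULLS, fraction)
        print(f"{fraction:.0%} new data per full: "
              f"{deduped:.0f} TB deduped vs {raw:.0f} TB raw")
```

In this model the deduped footprint as a share of the raw footprint is simply the fraction of new data in each full, which is where the 2-20% range comes from.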
TCO also can't be done so generally, as pricing is all over the board. I'd say there's a 1000% difference from the least to the most expensive systems I look at. That's why you have to compare the cost of system A to system B to system C, not use numbers like "disk cost $10/GB."
The relative TCO of disk versus tape is something I've looked at a lot with customers. Tape is still winning -- by a much smaller margin than it used to -- but it's not 23x or 250x cheaper. If it were really that much cheaper, no one would even be looking at disk.
Comments
Have you never seen a system that does parallel writes to two tapes before? They've been around for years, and most backup software packages allow you to configure for this option.
If I remember right, having a tape drive out of service made life difficult too.
I guess there is a reason it hasn't caught on, isn't used anymore, and nobody speaks of it to this date.
Yes, there was such a thing as RAID tape, and it is technically possible. (The one you list is not the only one I knew about. ARCserve even had a software RAIT.)
The problem is that it exacerbates tape's core difficulty, the inability of incoming data streams to go fast enough to keep the drive happy. If you were to create a RAID1 stripe of five LTO-4 drives, you would need almost 1000 MB/s to stream it! That thing would ALWAYS shoe-shine.
Disk, on the other hand, can go both fast AND slow, so you can RAID as much of it as you want together and still have that array go as slow as you need it to go.
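As a rough check on that streaming figure, here is a minimal sketch of the aggregate data rate needed to keep a set of tape drives streaming. It assumes every drive in the set must be fed at its full rate simultaneously and uses LTO-4's published 120 MB/s native rate. The compression ratio and the function name are illustrative assumptions.

```python
# LTO-4 published native (uncompressed) transfer rate, in MB/s.
LTO4_NATIVE_MBPS = 120.0


def stream_rate_needed(num_drives, per_drive_mbps=LTO4_NATIVE_MBPS, compression=1.0):
    """Incoming data rate required to keep all drives streaming at once.

    Assumption: with compression, the host must supply proportionally
    more data to keep the tape moving at its native speed.
    """
    return num_drives * per_drive_mbps * compression


if __name__ == "__main__":
    native = stream_rate_needed(5)
    compressed = stream_rate_needed(5, compression=2.0)
    print(f"Five drives, no compression:  {native:.0f} MB/s")
    print(f"Five drives, 2:1 compression: {compressed:.0f} MB/s")
```

The "almost 1000 MB/s" figure quoted above falls between these two bounds, so it is plausible for data that compresses moderately well.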
findarticles.com/p/articles/mi_m0EIN/is_1999_Sept_27/ai_55863405/print
And yes, they even refer to it by the proper acronym:
"RAIT (Redundant Array of Independent Tape)"
Let's use my current D-D-T disk environment as an example - a STK B280 (FlexStore) FC-AL controller using multiple 3+1 RAID5 across four trays (400 GB per disk, all SATA), and the LUNs concatenated using Veritas VxVM/VxFS (due to Solaris 8 limitations at the time, and no ZFS back then). Over the 16 disk trays we have for this configuration (staggered RAID/LUN generation down the trays and starting over again at the top), I'm very happily streaming four LTO-2 drives and the servers (E450 and V240) are currently the bottleneck - which should be fixed soon.
Even Sun-STK's 'children' of the above technology (6140 and similar disk) are running at or better than the above speeds for general and heavy I/O (again using RAID5 configurations).
--TSK