Jack Clark of ZDNet wrote an article entitled "AWS Glacier's dazzling price benefits melt next to the cost of tape," where he compares what he believes is the cost of storing 10 PB on tape for five years, versus the cost of doing the same with Amazon's Glacier service. His conclusion is that Amazon's 1c/GB price is ten times the cost of tape.
I mean no disrespect, but I don't believe Jack Clark has ever had anything to do with a total cost of ownership (TCO) study of anything in IT. Because if he had, he'd know that the acquisition cost of the hardware is only a fraction of the TCO of any given IT system. If only IT systems only cost what they cost when you buy them…. If only.
So what does it really cost to store 10 PB on tape? Let's take a look at two published TCO studies to find out. Before looking at these studies, let me say that since both studies were sponsored by tape companies, the point of them was to prove that tape systems are cheaper than disk systems. If these studies are biased in any way, it would be that they might underestimate the price of tape, since the purpose of these two, uh, independent studies is to prove that tape is cheaper. (In fact, I wrote about one of the reports being significantly biased in favor of tape.)
Clipper Group Report
The first report we'll look at is the Clipper Group report that said that tape was 15 times cheaper than disk. It's a very different report, but I'm going to use the graph on page 3, as it gives what it believes to be the TCO of storing a TB of data on tape for a year, based on four different three-year "cycles" of a 12-year period.
As you can see, the cost per TB is much higher in the first three years, because it includes the cost of buying a tape library that is much larger than it needs to be for that period — because you must plan for growth. (This, of course, is one of the major advantages of the Glacier model — you only pay for what you use.) But to get close to Mr. Clark's five-year period, I need to use two three-year periods.
The other problem with the report is that they use graphs and don't show the actual numbers, and they use scales that make the tape numbers look really small. You can see how difficult it is to figure out the actual numbers for tape. It is easy, however, to figure out the cost numbers for disk and then divide them by the multiplier shown in the graph.
The disk number for the first three-year period looks to be about $2600/TB, which is said to be 9x the price of tape. I divide that $2600 by 9 and I get $288/TB for that three-year period, which matches up with the line for tape on the graph. Divide it by 3 and we get $96/TB per year. The disk cost of the second period is $1250/TB. Divide it by 15 (the graph's 15x multiplier) and you get $83/TB for that three-year period; divide that by 3 to get roughly $27/TB per year. If I average those two together, I get $61/TB per year. Since Amazon Glacier stores your data in multiple locations, we'll need two copies, so the cost is $122/TB per year for two copies. Since Jack Clark used 10 PB for five years, we'll multiply this by 10,000 to get to 10 PB, then by five to get to five years. This gives us a cost of $6,100,000 to store 10 PB on tape for five years, based on the numbers from the Clipper Group study.
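For anyone who wants to check the Clipper arithmetic, here is a minimal Python sketch of the same steps. The inputs are the figures eyeballed from the graph above; the variable and function names are made up for illustration. Because it keeps fractions instead of dropping them at each step, it lands at about $6.2M rather than $6.1M, which doesn't change the picture.

```python
# Sketch of the Clipper-based estimate. All inputs are the figures quoted
# in the text (read off a graph, so they are approximate).

def tape_cost_per_tb_year(disk_cost_per_tb, disk_to_tape_ratio, years_in_cycle=3):
    """Tape cost per TB per year, derived from the disk cost and the graph's multiplier."""
    return disk_cost_per_tb / disk_to_tape_ratio / years_in_cycle

first_cycle = tape_cost_per_tb_year(2600, 9)    # first three-year cycle
second_cycle = tape_cost_per_tb_year(1250, 15)  # second three-year cycle

average_per_tb_year = (first_cycle + second_cycle) / 2

COPIES = 2           # two copies, to resemble Glacier's multiple locations
CAPACITY_TB = 10_000 # 10 PB expressed in TB
YEARS = 5

clipper_total = average_per_tb_year * COPIES * CAPACITY_TB * YEARS

print(f"First cycle:  ${first_cycle:,.2f}/TB/yr")
print(f"Second cycle: ${second_cycle:,.2f}/TB/yr")
print(f"Average:      ${average_per_tb_year:,.2f}/TB/yr")
print(f"Total:        ${clipper_total:,.0f} for two copies of 10 PB over five years")
```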
Crossroads Report
Let's look at a more recent report built around a relatively new idea: putting a disk front end on LTFS-based tape. The first fully baked system of this type is from Crossroads, and they just happen to have created a TCO study that compares the cost of storing 2 PB on their system (a combination of disk and tape) vs. storing it on disk for ten years. Awesome! Their 10-year cost for this is $1.64M. Dividing that by 2,000 (2 PB is 2,000 TB) gives us the 10-year cost of one TB, and dividing again by 10 gives us a cost of about $80/TB for one year. Double it like we did the last number, and we have $160/TB/yr for two copies. Multiplying that by 10,000 (10 PB) and then again by five (five years) gives us a cost of $8M for 10 PB for five years, based on the Crossroads report.
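The Crossroads calculation can be sketched the same way. This is only the arithmetic from the paragraph above with illustrative variable names; it does not model anything inside the Crossroads study itself. Without rounding $82 down to $80, the total comes out to $8.2M instead of $8M.

```python
# Sketch of the Crossroads-based estimate, using only the figures quoted in the text.

TEN_YEAR_COST = 1_640_000   # dollars, the study's 10-year cost for 2 PB
STUDY_CAPACITY_TB = 2_000   # 2 PB expressed in TB
STUDY_YEARS = 10

cost_per_tb_year = TEN_YEAR_COST / STUDY_CAPACITY_TB / STUDY_YEARS

COPIES = 2             # doubled, as with the Clipper number
TARGET_CAPACITY_TB = 10_000
TARGET_YEARS = 5

crossroads_total = cost_per_tb_year * COPIES * TARGET_CAPACITY_TB * TARGET_YEARS

print(f"Cost per TB per year: ${cost_per_tb_year:,.2f}")
print(f"Two copies of 10 PB for five years: ${crossroads_total:,.0f}")
```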
On a side note, the Crossroads Strongbox system has the ability to replicate backups between two locations using their disk front end. This makes this system a lot more like what Amazon is offering with their Glacier service. (As opposed to traditional use of tape like the Clipper Group report was based on, where you'd also have to pay for someone like Iron Mountain to move tapes around as well.)
According to two TCO studies, storing two copies of 10 PB of data on tape for five years costs as much as, or more than, storing that same data on Amazon's Glacier.
And you don't have to buy everything up front and you only pay for what you use. You don't have to plan for anything but bandwidth. Yes, this will only work for data whose usage pattern matches what they offer, but they sure have made it cheap — and you don't have to manage it!
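To put a number on the Glacier side of that comparison, here is a rough sketch. It assumes the 1c/GB price is charged per month, that 10 PB means 10,000,000 GB (decimal units), and that the full 10 PB is stored for all 60 months with no retrieval or request fees. Those are assumptions for illustration, not figures from either TCO study. The tape totals are the rounded ones worked out above.

```python
# Rough Glacier cost under the stated assumptions, compared with the two tape estimates.
# Prices are kept in whole cents to avoid floating-point rounding noise.

PRICE_CENTS_PER_GB_MONTH = 1   # assumption: 1c/GB is a monthly price
CAPACITY_GB = 10_000_000       # assumption: 10 PB in decimal GB
MONTHS = 5 * 12                # five years

glacier_total = PRICE_CENTS_PER_GB_MONTH * CAPACITY_GB * MONTHS // 100  # dollars

tape_estimates = {
    "Clipper Group": 6_100_000,  # dollars, from the calculation in the text
    "Crossroads": 8_000_000,     # dollars, from the calculation in the text
}

print(f"Glacier: ${glacier_total:,}")
for study, tape_total in tape_estimates.items():
    ratio = tape_total / glacier_total
    print(f"{study}: ${tape_total:,} ({ratio:.2f}x the Glacier figure)")
```

Under those assumptions Glacier comes to $6M, which is in the same range as the Clipper-based number and below the Crossroads-based one.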
----- Signature and Disclaimer -----
Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technologist at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.