ZDNet confused about Amazon Glacier pricing

Jack Clark of ZDNet wrote an article entitled "AWS Glacier's dazzling price benefits melt next to the cost of tape," where he compares what he believes is the cost of storing 10 PB on tape for five years with the cost of doing the same with Amazon's Glacier service. His conclusion is that Amazon's 1c/GB price is ten times the cost of tape.

I mean no disrespect, but I don't believe Jack Clark has ever had anything to do with a total cost of ownership (TCO) study of anything in IT.  Because if he had, he'd know that the acquisition cost of the hardware is only a fraction of the TCO of any given IT system. If only IT systems only cost what they cost when you buy them…. If only.

So what does it really cost to store 10 PB on tape?  Let's take a look at two published TCO studies to find out.  Before looking at these studies, let me say that since both studies were sponsored by tape companies, the point of them was to prove that tape systems are cheaper than disk systems. If these studies are biased in any way, it would be that they might underestimate the price of tape, since the purpose of these two, uh, independent studies is to prove that tape is cheaper.  (In fact, I wrote about one of the reports being significantly biased in favor of tape.)

Clipper Group Report

The first report we'll look at is the Clipper Group report that said that tape was 15 times cheaper than disk.  It's a very different report, but I'm going to use the graph on page 3, as it gives what it believes to be the TCO of storing a TB of data on tape for a year, based on four different three-year "cycles" of a 12-year period. 

[Graph: disk vs. tape cost per TB, from page 3 of the Clipper Group report]

As you can see, the cost per TB is much higher in the first three years, because it includes the cost of buying a tape library that is much larger than it needs to be for that period — because you must plan for growth.  (This, of course, is one of the major advantages of the Glacier model — you only pay for what you use.)  But to get close to Mr. Clark's five-year period, I need to use two three-year periods.

The other problem with the report is that they use graphs and don't show the actual numbers, and they use scales that make the tape numbers look really small. You can see how difficult it is to figure out the actual numbers for tape. It is easy, however, to figure out the cost numbers for disk and then divide them by the multiplier shown in the graph.

The disk number for the first three-year period looks to be about $2600/TB, which is said to be 9x the price of tape. I divide that $2600 by 9 and I get $288/TB for that three-year period, which matches up with the line for tape on the graph. Divide it by 3 and we get $96/TB per year. The disk cost of the second period is $1250/TB. Divide it by 15 and you get $83/TB for that three-year period; divide that by 3 to get $27/TB per year. If I average those two together, I get $61/TB per year. Since Amazon Glacier stores your data in multiple locations, we'll need two copies, so the cost is $122/TB per year for two copies. Since Jack Clark used 10 PB for five years, we'll multiply this by 10,000 to get to 10 PB, then by five to get to five years. This gives us a cost of $6,100,000 to store 10 PB on tape for five years, based on the numbers from the Clipper Group study.
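For anyone who wants to check that arithmetic, here is a minimal Python sketch of it. The inputs are the approximate values read off the graph above; the function name and the `truncate` switch are hypothetical helpers of my own, added only to show that dropping the cents at each step (as the figures above do) is what produces exactly $6,100,000.

```python
# Figures as read off the Clipper Group graph in the post (approximate).
DISK_COST_PER_TB = [2600, 1250]   # disk $/TB for each three-year cycle
DISK_TO_TAPE_RATIO = [9, 15]      # how many times pricier disk is than tape
YEARS_PER_CYCLE = 3
COPIES = 2                        # two copies, to resemble Glacier's multiple locations
CAPACITY_TB = 10_000              # 10 PB, with 1 PB = 1,000 TB as in the post
YEARS = 5


def clipper_tape_total(truncate):
    """Return the five-year cost in dollars of two tape copies of 10 PB.

    With truncate=True, fractions of a dollar are dropped after every
    step, mirroring the whole-dollar figures quoted in the post.
    """
    yearly = []
    for disk_cost, ratio in zip(DISK_COST_PER_TB, DISK_TO_TAPE_RATIO):
        per_cycle = disk_cost / ratio              # tape $/TB per three-year cycle
        if truncate:
            per_cycle = int(per_cycle)
        per_year = per_cycle / YEARS_PER_CYCLE     # tape $/TB per year
        if truncate:
            per_year = int(per_year)
        yearly.append(per_year)
    average = sum(yearly) / len(yearly)            # average of the two cycles
    if truncate:
        average = int(average)
    return average * COPIES * CAPACITY_TB * YEARS


post_total = clipper_tape_total(truncate=True)
exact_total = clipper_tape_total(truncate=False)
print(f"Truncated as in the post: ${post_total:,.0f}")
print(f"Without truncation:       ${exact_total:,.0f}")
```

Without the truncation the total comes out a little higher, at roughly $6.2M, so the whole-dollar shortcuts are slightly generous to tape.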

Crossroads Report

Let's look at a more recent report that covers a relatively new idea: using a disk front end to LTFS-based tape. The first fully-baked system of this type is from Crossroads, and they just happen to have created a TCO study that compares the cost of storing 2 PB on their system (a combination of disk and tape) vs storing it on disk for ten years. Awesome! Their 10-year cost for this is $1.64M. Dividing that cost by 2,000 (2 PB is 2,000 TB) and then by 10 (years) gives us a cost of roughly $80/TB for one year. Double it like we did the last number, and we have $160/TB/yr for two copies. Multiplying it by 10,000 (10 PB) and then again by five (five years) gives us a cost of $8M for 10 PB for five years based on the Crossroads report.
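The same calculation as a short Python sketch, using only the numbers quoted above. The variable and function names are hypothetical, chosen for illustration. It also shows what happens if the per-TB figure is not rounded down to $80.

```python
# Numbers quoted from the Crossroads TCO study in the post.
SYSTEM_COST = 1_640_000    # dollars over the study's ten years
SYSTEM_TB = 2_000          # 2 PB, with 1 PB = 1,000 TB as in the post
SYSTEM_YEARS = 10

# The scenario from the ZDNet article, with two copies as assumed in the post.
COPIES = 2
TARGET_TB = 10_000         # 10 PB
TARGET_YEARS = 5


def five_year_total(cost_per_tb_year):
    """Scale a $/TB/year figure up to two copies of 10 PB for five years."""
    return cost_per_tb_year * COPIES * TARGET_TB * TARGET_YEARS


per_tb_year = SYSTEM_COST / SYSTEM_TB / SYSTEM_YEARS   # unrounded $/TB/year
unrounded_total = five_year_total(per_tb_year)
rounded_total = five_year_total(80)                    # the post's rounded $80

print(f"Unrounded: ${per_tb_year:.0f}/TB/yr -> ${unrounded_total:,.0f}")
print(f"Rounded:   $80/TB/yr -> ${rounded_total:,.0f}")
```

The unrounded figure is $82/TB per year, which puts the five-year total at $8.2M rather than $8M. Rounding down to $80 again leans slightly in tape's favor.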

On a side note, the Crossroads Strongbox system has the ability to replicate backups between two locations using their disk front end.  This makes this system a lot more like what Amazon is offering with their Glacier service.  (As opposed to traditional use of tape like the Clipper Group report was based on, where you'd also have to pay for someone like Iron Mountain to move tapes around as well.)

Net net

According to two TCO studies, storing two copies of 10 PB of data on tape for five years costs as much as or more than storing that same data on Amazon's Glacier.
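To put the Glacier side of that comparison next to the two tape totals, here is a rough Python sketch. It rests on assumptions that are not spelled out in this post: that the 1c/GB price is charged per month, that 1 TB is 1,000 GB, and that only storage fees count (no retrieval or transfer charges). Glacier is not doubled, since it already keeps the data in multiple locations. All names are hypothetical.

```python
# ASSUMPTION: the 1c/GB Glacier price is billed per GB per month.
GLACIER_CENTS_PER_GB_MONTH = 1
GB_PER_TB = 1_000          # decimal units, matching the post's 1 PB = 10,000 TB / 10
MONTHS_PER_YEAR = 12
CAPACITY_TB = 10_000       # 10 PB
YEARS = 5

# Work in whole cents to avoid floating-point noise, then convert to dollars.
glacier_cents = (GLACIER_CENTS_PER_GB_MONTH * GB_PER_TB * MONTHS_PER_YEAR
                 * CAPACITY_TB * YEARS)
glacier_total = glacier_cents // 100

# Five-year totals for two tape copies, as derived earlier in the post.
tape_totals = {
    "Clipper Group": 6_100_000,
    "Crossroads": 8_000_000,
}

print(f"Glacier (storage only): ${glacier_total:,}")
for study, total in tape_totals.items():
    ratio = total / glacier_total
    print(f"{study}: ${total:,} ({ratio:.2f}x Glacier)")
```

Under those assumptions Glacier works out to $120/TB per year, against the $122 and $160 derived from the two studies.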

And you don't have to buy everything up front and you only pay for what you use.  You don't have to plan for anything but bandwidth.  Yes, this will only work for data whose usage pattern matches what they offer, but they sure have made it cheap — and you don't have to manage it!

Not bad.

 


Written by W. Curtis Preston (@wcpreston), four-time O'Reilly author, and host of The Backup Wrap-up podcast. I am now the Technology Evangelist at Sullivan Strickler, which helps companies manage their legacy data.

3 comments
  • > “we’ll need two copies”

    Since both Glacier & S3 offer 99.999999999% durability, I think we should triple instead of double the cost (S3 stores each byte in 3 different locations).

    Rayson Ho – Open Grid Scheduler

  • Aren’t you (knowingly?) forgetting a number of things here? Deduplicated disk/replication, management overhead, cost of testing restores, restore speed and the cost of waiting for that, etc etc.

    I think your calculation lacks a number of things that usually go into a backup/recovery ROI/TCO discussion

  • @Calle

    I appreciate you weighing in on this. I have no bone to pick here, nor do I have a stake in Amazon, nor do I get compensated in any way should someone pick Amazon. However, I am really impressed with what they’re offering and happen to think it’s very affordable when compared to the options.

    The point of my post was not to do a complete TCO of Glacier vs tape. My point was to use two other people’s TCO studies and compare them to what the guy from ZDNet said. But I’ll give the things you listed a shot.

    If one does consider dedupe, it significantly swings things in favor of Amazon’s Glacier, as you can do that with a disk device, but can’t usually do it with tape. (The exception is with CommVault Simpana, but even they don’t recommend doing that with your active data set.)

    I believe both TCO studies did factor in management overhead. That, again, swings things in favor of Glacier, as their IS no management overhead — they manage it.

    You have to test restores on both sides, so not sure how that factors into anything.

    Regarding restore/retrieval speed, either Glacier meets your needs or it doesn’t. If it doesn’t, then there’s no point in discussing cost. If it does, then it’s not a cost issue, either.

    I have another post on more details on Glacier, but I’m awaiting an official response from Amazon before posting it.