No such thing as a “Pay as you go” appliance

I’ve never seen an appliance solution that I would call “pay as you go.”  I might call it “pay as you grow,” but never “pay as you go.”  There is a distinct difference between the two.

What is “pay as you go?”

I’ll give you a perfect example.  BackupCentral.com runs on a cPanel-based VM. cPanel can automatically copy the backups of my account to an S3 bucket.   I blogged about how to do that here.

I tell cPanel to keep a week of daily backups, four weeks of weekly backups, and three months of monthly backups.  A backup of backupcentral.com is about 20 GB, and the way I store those backups in S3, I have about fifteen copies.  That’s a total of about 300 GB of data I have stored in Amazon S3 at any given time.

Last time I checked, Amazon bills me about $0.38/month.  If I change my mind and decrease my retention, my bill drops.  If I told cPanel not to store the three monthly backups, my monthly bill would decrease by about 20%.  If I told it to keep six months of retention instead, my monthly bill would increase by about 20%.
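To make that arithmetic concrete, here is a minimal sketch of how retention drives the bill. The 20 GB backup size and the retention counts come from above; the per-GB price is just an assumption I backed out of my roughly $0.38/month bill, not an official S3 rate, so check the pricing for whatever storage class you use.

```python
# Rough sketch: how the retention policy drives the monthly S3 bill.
# The 20 GB backup size and retention counts come from the post above;
# the per-GB price is an assumption backed out of the ~$0.38/month figure,
# not an official S3 rate.

BACKUP_SIZE_GB = 20
PRICE_PER_GB_MONTH = 0.0013  # assumed $/GB-month

def monthly_bill(daily: int, weekly: int, monthly: int) -> float:
    """Estimated monthly cost for a given number of retained copies."""
    copies = daily + weekly + monthly
    return copies * BACKUP_SIZE_GB * PRICE_PER_GB_MONTH

current = monthly_bill(daily=7, weekly=4, monthly=3)
no_monthlies = monthly_bill(daily=7, weekly=4, monthly=0)
six_monthlies = monthly_bill(daily=7, weekly=4, monthly=6)

print(f"current retention:  ${current:.2f}/month")
print(f"drop the monthlies: ${no_monthlies:.2f}/month ({1 - no_monthlies / current:.0%} less)")
print(f"six monthlies:      ${six_monthlies:.2f}/month ({six_monthlies / current - 1:.0%} more)")
```

Change the retention numbers and the bill moves in direct proportion. That is what I mean by pay as you go.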

What is “pay as you grow?”


Instead of using S3 — which automatically ensures my data is copied to three locations — I could buy three FTP servers and tell Cpanel to back up to them. I would buy the smallest servers I could find. Each server would need to be capable of storing 300 GB of data.  So let’s say I buy three servers with 500 GB hard drives, to allow for some growth.

Time will pass and backupcentral.com will grow.  That is the nature of things, right?  At some point, I will need more than 500 GB of storage to hold backupcentral.com.  I’ll need to buy another hard drive to go into each server and install that hard drive.

Pay as you grow always starts with a purchase of some hardware, more than you need at the time.  This is done to allow for some growth.  Typically you buy enough hardware to hold three years of growth.  Then a few years later, when you outgrow that hardware, you either replace it with a bigger one (if it’s fully depreciated) or you grow it by adding more nodes/blocks/chunks/bricks/whatever.

Every time you do this, you are buying more than you need at that moment, because you don’t want to have to keep buying and installing new hardware every month.  Even if the hardware you’re buying is the easiest hardware in the world to buy and install, pay as you grow is still a pain, so you minimize the number of times you have to do it. And that means you always buy more than you need.

What’s your point, Curtis?

The company I work for (Druva) has competitors that sell “pay as you grow” appliances, but they often refer to them as “pay as you go.”  And I think the distinction is important. All of them start with selling you a multi-node solution for onsite storage, and (usually) another multi-node solution for offsite storage. These things cost hundreds of thousands of dollars just to start backing up a few terabytes.

It is in their best interests (for multiple reasons) to over-provision and over-sell their appliance configuration.  If they do oversize it, nobody’s going to refund your money when that appliance is fully depreciated and you find out you bought way more than you needed for the last three or five years.

What if you under-provision it?  Then you’d have to deal with whatever the upgrade process is sooner than you’d like.  Let’s say you only buy enough to handle one year of growth.  The problem is that now you’re going through the capital purchasing process every year for a very crucial part of your infrastructure.  Yuck.

In contrast, Druva customers never buy any appliances from us.  They simply install our software client and start backing up to our cloud-based system that runs in AWS.  There’s no onsite appliance to buy, nor do they need a second appliance to get the data offsite. (There is an appliance we can rent to them to help seed their data, but they do not have to buy it.) In our design, data is already offsite.  Meanwhile, the customer only pays for the amount of storage they consume after their data has been globally deduplicated and compressed.

In a true pay as you go system, no customer ever ends up paying for anything they don’t consume. Customers often pay up front for future consumption, just to make the purchasing process easier.  But if they buy too much capacity, anything they paid for in advance just gets applied to the next renewal.  There is no wasted capacity, no wasted compute.

In one model (pay as you grow), you have wasted money and wasted power and cooling while your over-provisioned system sits there waiting for future data.  In the other model (pay as you go), you pay only for what you consume, with no wasted power and cooling.

What do you think?  Is this an important difference?

 

Written by W. Curtis Preston (@wcpreston), four-time O'Reilly author, and host of The Backup Wrap-up podcast. I am now the Technology Evangelist at Sullivan Strickler, which helps companies manage their legacy data.

6 comments
  • You mention that all the data is on AWS. This is becoming more of a problem as time passes. I consult for the data analytics branch of a large retailer. We have [finally!] made the decision at the executive level that we are to get our data off AWS as soon as is reasonable and feasible. It turns out that Amazon is now a competitor of many retail companies, and they are getting off AWS because they don’t want Amazon to have any kind of access to their data. It could give Amazon an unscrupulous advantage.

    On another topic, you mention globally deduplicated. Because dedup reduces the ability to recover from data loss/errors, I would hope that different backup stores would use different dedup algorithms, and not just plop a pre-deduped copy of everything into the Hadoop lake.

    • Druva also runs in Azure for those that prefer that. Having said that, any data Druva stores in Amazon is encrypted in such a way that even Druva can’t read it; therefore, Amazon would be completely unable to do anything with your data even if they wanted to.

      Discussing the other topic you brought up is a bit more complicated. The best high-level statement I can give is that all forms of copying and storing data have risks associated with them, and one must do their best to ameliorate those risks. Every copy you make to any medium, for example, can be corrupted during the copy process. This is why extensive checking should be done after any copy to ensure the data is what it is supposed to be. Suffice it to say that I do not believe deduplication increases your risk. If anything, it reduces the number of blocks of data that must be continually checked for consistency.

  • You said, “Suffice it to say that I do not believe deduplication increases your risk. If anything, it reduces the number of blocks of data that must be continually checked for consistency.”

    Yes, fewer blocks help. What I was getting at was the ability to recover given that you did have an error. Dedup always has at least a minimal probability of collision. Different algorithms (at least different hash algorithms) would mean the error would only occur on one of the separate datastores.

    • Dedupe hash collisions are a myth created by the tape industry that was trying to shoot down the dedupe freight train that was killing their business. It’s something that sounds good, but the real odds are beyond astronomical. Here’s a 10-year-old blog post that explains why this is really a non-issue.

      https://backupcentral.com/de-dupe-hash-collisions/

      Please note that this article was written about SHA-1, which has a 160-bit keyspace. Everyone now uses at least SHA-256.

  • I’m not saying it has a significant probability; I was merely saying that the opportunity to improve the odds in our favor existed when there was more than one store, as long as each store used a different hash algorithm. This is so easy to do that it is hard to explain why you would not do it.

    BTW, the AWS issue is very real, whether for realistic or emotional reasons. Companies that find Amazon moving into their business arena are fearful, so they are leaving AWS as a cloud provider.

    • Ah, but IS there a chance to improve the odds in your favor? I will argue that it’s only different odds, not better ones.

      Your alternative to dedupe is to store three separate copies on three different media. Three copies on three different tapes, for example. Or on three different storage arrays, etc.

      Every time you transmit those copies, the transport medium has the possibility of undetectable bit errors (UDE). Then when you write the copy, there’s a chance of a UDE. Then over time, that magnetic medium will degrade and there’s a chance that the 1 you wrote turns into a 0. All of these are as real as the concept of a hash collision. And the entire industry has been based on these odds for several decades.

      All of the things above have mathematical probabilities attached to them, just as the odds of a hash collision are published. And the odds of a hash collision are much, much less (statistically speaking) than any of the things above. Add to this that by using dedupe we can keep everything on disk, and keeping everything on disk allows us to do many more high-level and low-level data integrity checks that simply aren’t possible with things like tape, so your odds get even better.
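      Just to put a number on it, here’s a back-of-the-envelope birthday-bound sketch. The block count is an assumption I picked for illustration; the digest lengths are the standard SHA-1 and SHA-256 sizes.

      ```python
      # Back-of-the-envelope birthday bound on a dedupe hash collision.
      # P(any two distinct blocks share a hash) is roughly n^2 / 2^(b+1)
      # for n unique blocks and a b-bit hash. The block count below is an
      # assumption for illustration (~8 EB of unique 8 KB chunks).

      def collision_probability(num_blocks: float, hash_bits: int) -> float:
          return num_blocks ** 2 / 2 ** (hash_bits + 1)

      blocks = 1e15  # assumed number of unique chunks
      for name, bits in (("SHA-1", 160), ("SHA-256", 256)):
          print(f"{name}: ~{collision_probability(blocks, bits):.0e}")
      ```

      Even with a quadrillion unique blocks, the SHA-256 estimate is so many orders of magnitude below everything else in the error chain that it is effectively zero.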

      So I don’t think one can successfully argue that your odds are better under the old ways of doing things.

      As to AWS, I know the AWS thing is real, or at least the perception of it. That’s why Druva did the work to support running in Azure.