How to make the cloud cheaper (or more expensive)

Depending on how you do it, the cloud can be much less expensive than using on-premises systems.  But it can also be much more expensive. It all depends on how you use the public cloud.

The expensive way: 24×7 VMs and pre-provisioned block storage

Running one or more VMs in the cloud 24×7 (like you would in a datacenter) is a great way to drive up your costs. It’s generally going to be more expensive than running the same VMs in house (if you’re running them 24×7).  It’s difficult to come up with the incremental cost of an individual VM, as this article attests. But generally speaking, you should be able to run a VM onsite for less than the cost of running that same VM in the cloud. It makes sense; it’s called markup.

Storage can also be more expensive in the cloud for the same reasons.  If you’re provisioning large chunks of block storage (e.g. EBS in AWS) before you actually consume it, your costs are going to be higher than if you only pay for storage as you use it. Paying only for what you use is really only possible with object storage.

It’s also important to note that moving a VM to the cloud doesn’t get rid of the typical system administration tasks associated with said VM. The OS still needs updating; the applications still need updating.  Sure, you don’t have to worry about swapping out the power supply, but most people let a vendor do that part anyway. A VM doesn’t magically start caring for itself just because it moved to the cloud.

The cheap way: Dynamically allocated VMs and object storage

In the public cloud, your costs are directly tied to how much storage, network and compute you use. That means that if you have an application that can dynamically scale its use of cloud resources up and down, you might be able to save money in the cloud, even if the per-hour costs are higher than those you would have onsite. This is because, generally speaking, you don’t save money in the datacenter by turning off a VM. The resources attached to that VM are still there, so your costs don’t go down. But if you have an app that can reduce its compute resources – especially to the point of turning off VMs – you can save a lot of money.
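To put rough numbers on that, here is a minimal sketch of the arithmetic. Every figure in it is a made-up assumption for illustration (the hourly rates, the ten-VM peak, the 60 busy hours a month); none of them come from a real price list.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month

def monthly_cost_cents(rate_cents_per_hour, vm_hours):
    """Cost of compute that is billed (or amortized) by the VM-hour."""
    return rate_cents_per_hour * vm_hours

# Hypothetical rates: the cloud VM costs twice as much per hour as onsite.
ONSITE_RATE = 5    # cents per VM-hour (assumption)
CLOUD_RATE = 10    # cents per VM-hour (assumption)
PEAK_VMS = 10      # VMs needed at the busiest time (assumption)
BUSY_HOURS = 60    # hours per month spent at peak (assumption)

# Onsite: hardware is sized for peak and costs the same on or off.
onsite = monthly_cost_cents(ONSITE_RATE, PEAK_VMS * HOURS_PER_MONTH)

# Cloud, used like a datacenter: all ten VMs running 24x7.
cloud_always_on = monthly_cost_cents(CLOUD_RATE, PEAK_VMS * HOURS_PER_MONTH)

# Cloud, scaled dynamically: ten VMs while busy, one VM the rest of the time.
scaled_vm_hours = PEAK_VMS * BUSY_HOURS + 1 * (HOURS_PER_MONTH - BUSY_HOURS)
cloud_scaled = monthly_cost_cents(CLOUD_RATE, scaled_vm_hours)

for label, cents in [("onsite", onsite),
                     ("cloud, 24x7", cloud_always_on),
                     ("cloud, scaled", cloud_scaled)]:
    print(f"{label:14s} ${cents / 100:,.2f} per month")
```

With these assumed numbers the always-on cloud setup costs twice what onsite does, while the scaled setup costs about a third of onsite, despite the higher hourly rate. Change the assumptions and the answer changes, which is the point.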

The same holds true for storage. If you are using object storage instead of block storage, you pay only for what you use as you use it.  As backups expire and objects are deleted out of the object store, your costs decrease.  This is very different from how pre-provisioned block storage behaves, where deleting files doesn’t save you money.
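Here is the same idea as a small sketch. The price, the provisioned size, and the monthly usage figures are all hypothetical, and I have deliberately used the same per-GB price for both storage types so that the only difference shown is when you pay.

```python
PRICE_CENTS_PER_GB_MONTH = 5   # hypothetical, same for both to isolate the effect
PROVISIONED_GB = 10_000        # block volume sized up front (assumption)

# Hypothetical GB actually stored each month; in month 4 old backups expire.
used_gb_by_month = [4_000, 6_000, 8_000, 5_000]

def block_cost_cents(provisioned_gb):
    """Pre-provisioned block storage bills for the volume size, used or not."""
    return provisioned_gb * PRICE_CENTS_PER_GB_MONTH

def object_cost_cents(stored_gb):
    """Object storage bills for what is actually stored that month."""
    return stored_gb * PRICE_CENTS_PER_GB_MONTH

block_bills = [block_cost_cents(PROVISIONED_GB) for _ in used_gb_by_month]
object_bills = [object_cost_cents(gb) for gb in used_gb_by_month]

for month, (b, o) in enumerate(zip(block_bills, object_bills), start=1):
    print(f"month {month}: block ${b / 100:,.2f}   object ${o / 100:,.2f}")
```

The block bill never moves, while the object bill follows usage up and then back down when the backups expire in month 4.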

Use the cloud the way it’s meant to be used

If your backup vendor is just running its software in 24×7 VMs in the cloud, and requires you to provision block storage for those VMs, then it’s using the cloud in the way that cloud experts generally agree is a great way to drive up costs and not add a lot of value.

Your costs go up and your manageability stays the same. You’re still dealing with an OS and application that need to be updated in the same way they would be onsite. You still have to increase or decrease your software or storage licenses as your needs change.

Another example of why we do backups

There’s a story going around about Apple’s macOS APFS sparse disk images occasionally losing their minds and throwing documents out the window.  Yet another example of why we do backups.

Don’t use APFS images for backups

I’ve never liked these disk images that Apple makes, at least not as a backup method.  This is just another example of why.  For those unfamiliar with them, they’re like a fancy .ISO image.  It’s one big file that you can mount as a file system. The “sparse” part is what the industry would call a thin-provisioned version of this image.  That is, you tell it how big it’s allowed to grow, but it will only consume the amount of space that is actually put into the image.
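If thin provisioning is new to you, an ordinary sparse file shows the same idea on a small scale. This is only an analogy: the sketch below creates a plain sparse file, not an Apple disk image. It assumes a POSIX system (it relies on st_blocks) and a filesystem that supports sparse files. The file name and the 64 MiB size are arbitrary.

```python
import os
import tempfile

def create_sparse_file(path, logical_size):
    """Set a file's logical size without writing any data into it."""
    with open(path, "wb") as f:
        f.truncate(logical_size)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "image.sparse")   # arbitrary name
    create_sparse_file(path, 64 * 1024 ** 2)   # "allowed to grow" to 64 MiB

    info = os.stat(path)
    logical_size = info.st_size        # the size the file claims to be
    allocated = info.st_blocks * 512   # st_blocks is in 512-byte units (POSIX)

    print(f"logical size:   {logical_size:,} bytes")
    print(f"space consumed: {allocated:,} bytes")
```

On a filesystem that supports sparse files, the space consumed should be far smaller than the logical size, because nothing has been written yet.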

The problem that was recently discovered is that if the APFS sparse image runs out of virtual space, it will just keep writing the files like nothing’s wrong.  Even worse, the files will appear to have been copied, as they’ll be in RAM.  Unmount the disk image and remount it and you’ll find that the files were never copied.  Surely Apple needs to fix this.

The one place you’ll see a disk image is if you buy a Time Capsule Time Machine backup appliance.  I’m not sure why, but they chose to do it this way, instead of just mirroring the filesystem, the way Time Machine does on a local machine.  I’m sure they had their reasons, but this is where you’ll see disk images.  (Actually, I haven’t looked into the details of the Time Capsules in a while, so they could have changed.  But I can’t think of any other place where you’d see such a beast.)

I’ve never been a fan

Nine years ago I wrote an article about how I wasn’t a huge fan of Time Machine, and how I really didn’t like Time Capsules because of their disk images — and how they can get corrupted. Time Machine is nice for upgrades or a local copy, but I don’t think you should rely on it as your only backup.

This is why we do real backups.  Real backups are scheduled and happen all the time without you having to do anything. Their data is stored somewhere else, which today typically means the cloud. I simply can’t think of another viable way to back up mobile users and home users.

Protect your backups from ransomware

Don’t get your backup advice from security people. That’s how I felt reading what started out as a really good article about protecting your systems from ransomware.  It was all great until he started talking about how to configure your backup system.  He had no idea what he was talking about. But now I’m going to give you security advice about your backup system, so take it with a grain of salt.

Windows backup servers are risky

Windows-based backup products that store data in a directory are a huge security risk. In fact, many customers of such products have already reported my worst fears: their backups were encrypted with the same ransomware that infected their servers.

This isn’t an anti-Windows rant, or an anti-BackupProductX rant. It’s simply acknowledging the elephant in the room.

  1. If your backup server is accessible via the same network your computers are on, it can be attacked via the same things that attack your computers.
  2. If your backup server runs the same OS as your computers – especially if it’s the OS that most ransomware attacks happen on (Windows) – it can be infected with the same ransomware.
  3. If your backups are stored in a directory (as opposed to a tape drive, an S3 object, or a smart appliance not accessible via SMB/NFS), they can be infected if your backup server is infected.
  4. If your backups are stored on a network mount via NFS/SMB, you’re giving the ransomware even more ways to attack you.
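The four points above can be turned into a simple self-check. This is a minimal sketch; the class name, field names, and messages are hypothetical, made up for this illustration. It is not a security scanner, just the list above expressed as code.

```python
from dataclasses import dataclass

@dataclass
class BackupSetup:
    """Hypothetical description of a backup environment (names are assumptions)."""
    same_network_as_clients: bool   # point 1
    same_os_as_clients: bool        # point 2
    stored_in_directory: bool       # point 3
    stored_on_nfs_or_smb: bool      # point 4

def ransomware_risks(setup):
    """Return the risk factors from the list above that apply to this setup."""
    risks = []
    if setup.same_network_as_clients:
        risks.append("backup server reachable on the same network as clients")
    if setup.same_os_as_clients:
        risks.append("backup server runs the same OS as clients")
    if setup.stored_in_directory:
        risks.append("backups stored in a plain directory")
    if setup.stored_on_nfs_or_smb:
        risks.append("backups stored on an NFS/SMB mount")
    return risks

example = BackupSetup(
    same_network_as_clients=True,
    same_os_as_clients=True,
    stored_in_directory=True,
    stored_on_nfs_or_smb=False,
)
found = ransomware_risks(example)
for risk in found:
    print("RISK:", risk)
```

The more of these that apply to your setup, the more ways ransomware has to reach your backups.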

What should you do?

I don’t want to be guilty of doing what the security guy did, so I’ll say this: research what you can do to protect your systems from ransomware. But I’ll do my best to give some general advice.

The best advice I’ve read is to keep up-to-date on patches and to disable Remote Desktop (RDP) on Windows.  There are also default SMB shares in Windows that should be disabled.

You can also make sure that your backups aren’t just stored in a directory. Unfortunately, that’s the default setup for most inexpensive backup software products. You need to investigate if the software you’re using supports another way to store backups.  If not, it’s time to think about a different product.

The same holds true for those currently storing backups on an NFS/SMB share. Investigate whether your backup software has the ability to store backups on that device without using NFS/SMB. If not, make sure you lock down that share as much as you can. And if you can’t do that either, it’s time to think about another backup product.

Consider a cloud data protection service

A true cloud-based data protection service might be the best way to do this.  In a true cloud-based system, you never see the backup servers. You don’t know what they are and never log in to them. You log in to a web-based portal, and the actual servers that make this happen are completely invisible to you.  (Similar to the way the servers behind other web-based services are invisible to you.)

If your backup servers are invisible to you, they’re invisible to your attackers. If there’s no way to directly access your backup – unless you’ve specifically set up such access for a recovery or DR test – then ransomware can’t get to those backups either.

It should go without saying that this recommendation does not apply if your “cloud” data protection vendor is just putting backup software on VMs that you manage in the cloud – what many have dubbed “cloud washing.” If you’re seeing your backup servers as VMs in the cloud, they’re just as much of a risk as they would be in your own datacenter. It’s one of the reasons why these cloud washing vendors aren’t really giving you the full benefit of the cloud if all they’re doing is putting VMs up there.

Time to fire the man in the van

The man in the van can lose your tapes.  Any questions?

It’s the man, not the mountain

Yes, Iron Mountain has had many very public incidents of losing tapes. You can do a Google search for Iron Mountain Loses Tapes to see what I’m talking about.  When all these stories started hitting the news back in 2005 (thanks to California’s new law requiring you to report such things), Iron Mountain’s official response was, “Iron Mountain performs upwards of five million pickups and deliveries of backup tapes each year, with greater than 99.999% reliability. Nevertheless, since the beginning of the year, four events of human error at Iron Mountain resulted in the loss of a customer’s computer backup tapes. While four losses is not a large number in comparison to an annual rate of five million transportation events, any loss is important to customers and to Iron Mountain … Iron Mountain is advising its customers that current, commonly used disaster recovery processes do not address increased requirements for protecting personal information from inadvertent disclosure.”

The tape vaulting company I used to use back in the day lost one or two of our tapes a year.  We gave them about 50 tapes a day, and retrieved 50 more back.  We tracked each individual tape, and were linked into their system to show when the tapes made it into the vault.  Every once in a while, there would be a discrepancy where one of the tapes would not show up in the vault.  This resulted in a search, and inevitably the tape would be found somewhere along the way.  Good times.

I remember one vaulting customer that received a box of tapes that weren’t theirs.  When they called their rep, he had them read the bar codes off the tapes.  The vaulting company couldn’t figure out whose they were, so it said the customer should keep the tapes!

As long as media vaulting companies employ humans to be the “man” in the van, this problem will continue.  Humans do dumb things.  Humans make mistakes. So until these companies start hiring robots to pick up and deliver tapes, we will continue to see these problems.  However, I think much of the world will have moved to electronic vaulting by then.

I’ve always liked electronic vaulting

If you’re not going to use tapes to get your data offsite, you can use electronic vaulting.  This can be accomplished via a few different methods.

Onsite & Offsite Target Dedupe Appliance

There are a number of vendors that will be happy to sell you an appliance that will dedupe any backups you send to it. Those deduped backups are then replicated to another dedupe appliance offsite. This has been the primary model for the last 15 years or so to accomplish electronic vaulting. The problem is that these appliances are very expensive, and you have to buy two of them – as well as power, cool, and maintain them. It’s the most expensive of the three options mentioned here.

Source dedupe to offsite appliance

It makes more sense to buy backup software that will dedupe the data before it’s sent to an appliance. This appliance can be offsite, so that data is immediately sent offsite.  It can even be a virtual appliance running as a VM in the cloud.  Most people exploring this option opt for an onsite copy that replicates to the offsite appliance or VM.  Most vendors selling this type of solution tend to want to charge you for both copies.
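If you have not seen how source dedupe works, the core idea can be sketched in a few lines. This is a toy illustration, not how any particular product does it. It uses fixed-size chunks (many products use variable-size chunking instead), a tiny 4-byte chunk size so the example is readable, and a hypothetical OffsiteStore class standing in for the appliance or cloud target.

```python
import hashlib

CHUNK_SIZE = 4  # absurdly small, for demonstration only

class OffsiteStore:
    """Hypothetical stand-in for the offsite appliance or cloud target."""
    def __init__(self):
        self.chunks = {}

    def has(self, digest):
        return digest in self.chunks

    def put(self, digest, chunk):
        self.chunks[digest] = chunk

def backup(data, store):
    """Send only chunks the store lacks. Returns (recipe, bytes_sent)."""
    recipe = []      # ordered digests needed to rebuild this backup
    bytes_sent = 0
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if not store.has(digest):   # the check happens before any data moves
            store.put(digest, chunk)
            bytes_sent += len(chunk)
        recipe.append(digest)
    return recipe, bytes_sent

def restore(recipe, store):
    """Rebuild the original data from its recipe."""
    return b"".join(store.chunks[d] for d in recipe)

store = OffsiteStore()
monday = b"AAAABBBBCCCCAAAA"
tuesday = b"AAAABBBBDDDDAAAA"   # one chunk changed since Monday

recipe1, sent1 = backup(monday, store)
recipe2, sent2 = backup(tuesday, store)
print(f"Monday sent {sent1} of {len(monday)} bytes")
print(f"Tuesday sent {sent2} of {len(tuesday)} bytes")
```

Because the duplicate check runs at the source, only new chunks ever cross the wire, which is what makes sending backups straight offsite practical.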

Source dedupe to a cloud service

The third option is to back up to a true cloud service (not just backup software running in some VMs in the cloud), deduping data before it is sent to the cloud. Vendors that use this model tend to only charge you for the cloud copy. If they support a local appliance for quick recoveries, they tend not to charge for that copy. That makes this option the least expensive of the three.

Fire the man, get a plan

Wow, I like that!  There are a number of ways you can now have onsite and offsite backups without ever touching a tape or talking to a man in the van down by the river.  Look into them and join the new millennium.