W. Curtis Preston & Prasanna Malaiyandi discuss how to create a backup plan, including discovering what can harm your data, gathering requirements, and looking at design options.
Westworld was really just a giant object lesson about proper backup and recovery design. The producers of the show may have felt differently; they seemed to think the show was more about free will or something. I will just say that if the creators behind Westworld (the park) had followed proper backup and recovery design, the end of season two would have been very different.
Spoiler alert! Westworld season two spoilers below!
I’m writing this blog post immediately after re-watching season two of Westworld in preparation for season three coming out next week. I’m going to be revealing major plot points of season two in this blog post. You have been warned.
My tongue will definitely be planted firmly in my cheek for some of this post, but there are actually some really important lessons to be learned from what happens in season two of Westworld.
Solid State Storage
Solid-state storage apparently gets much better in the future. No more SSDs or SD cards. Everything will be stored inside a sphere with no port or obvious electrical connection points like you see on an SD card. Instead, data transfer appears to happen wirelessly, in the same way that we charge wirelessly today. So we’ve got that to look forward to.
We already have water-cooled computers and racks, so this isn’t too far-fetched of an idea. Water cooling can be quite effective as long as it is properly managed, of course. But I don’t think water cooling will take the form that it does in Westworld. Loose water still does quite a bit of damage, as does condensation in a data center – something else that they make sure to display in the show. It looks cool, but that’s about it.
The number one threat to the two data centers in the show (besides poor design) is the hosts themselves. An android driven by AI runs amok and believes that the backups are what is holding the hosts back, because the park operators use backups to restore the hosts when they become damaged, which is what allows the guests to damage them with impunity. Dolores has a point, to be sure. From the perspective of the hosts, these backups are indeed the chain that keeps them enslaved. If there were no backups, the park operators wouldn’t be so quick to blow away their hosts just for fun.
The data protection lesson here is to never forget the danger of an insider threat. Many, if not most, data center attacks have come from people inside the company. A disgruntled employee wants to harm the company that just fired them. Another employee feels that they are not properly compensated and chooses to solve that by letting ransomware loose in a data center. Make sure to protect against insider threats in your data protection system.
In addition to destroying data, Dolores also escapes the park with a bunch of IP. She steals her own copy, as well as what appears to be about five pearls containing copies of other hosts. This is very different from insiders seeking to damage or encrypt data. This risk is someone (internal or not) stealing your company’s intellectual property. The modern equivalent to what happened in the show is someone creating a copy of important data and then walking out of the company with it. This is another risk that you really need to look into, and it is what data loss prevention software is all about.
The 3-2-1 rule still applies
Those familiar with my podcast will not be surprised to hear me mention the 3-2-1 rule. The designers of Westworld (the park) ignored almost every single aspect of this rule. They did not have different versions of the hosts over time; they only had the most recent copy of each host’s image. This is because they didn’t want the hosts to accidentally remember previous things. They also did not adhere to the “2” because they really only had one copy, and it was stored in one data center. (Okay, perhaps there were multiple copies within that data center, but I don’t think so.) They definitely did not have one of the copies off-site, because if they had, the season would’ve had a very different ending.
Hale: One of the hosts just blew up our data center.
Bernard: I’ll get the off-site backups.
Hale: This is going to be expensive. Good thing we’re making billions of dollars from all these rich people paying $40,000 a day to come to our park.
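To make the rule a little more concrete, here is a minimal sketch of a 3-2-1 check in Python. Everything in it is an assumption made for illustration: the `BackupCopy` record, the `check_321` function, and the reading of the rule as "at least three copies, on at least two kinds of media, with at least one off-site". It is not how any particular backup product models this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical record, for illustration only)."""
    label: str
    medium: str    # e.g. "disk", "tape", "cloud"
    offsite: bool  # True if stored away from the primary site


def check_321(copies):
    """Return which parts of the 3-2-1 rule a set of copies satisfies."""
    return {
        "3_copies": len(copies) >= 3,
        "2_media": len({c.medium for c in copies}) >= 2,
        "1_offsite": any(c.offsite for c in copies),
    }


# The park as described above: one copy, one medium, nothing off-site.
westworld = [BackupCopy("park data center", "pearl", offsite=False)]

# A more conventional setup with hypothetical labels.
sensible = [
    BackupCopy("production", "disk", offsite=False),
    BackupCopy("local backup", "disk", offsite=False),
    BackupCopy("cloud backup", "cloud", offsite=True),
]

print(check_321(westworld))
print(check_321(sensible))
```

The park fails all three checks, while the second setup passes all of them.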
The ultimate sin: non-consensual personal data
There was one scene towards the end of season two that really made me laugh. They first acknowledged that many customers had been killed by their product. Then they said that the much bigger PR problem was going to be when people figure out that they’ve been secretly recording everyone’s activities without their consent or knowledge.
Your product went berserk and killed everyone currently using it, and you think nonconsensual recording of personal data is going to be the bigger PR problem? Boy, the show producers really believe in the privacy of personal data, don’t they? So do I, and I don’t want to minimize how horrible it was that they were secretly recording everyone’s behaviors. I just think that once a product is shown to have killed everyone who used it, no one else is going to use the product ever again. But it is interesting that this very modern problem at the center of GDPR, CCPA, and other nascent regulations worked its way into the show.
Looking forward to season three
I’ll just say that I’m a big fan of the show and I’m looking forward to seeing what happens in season three. Now that we’ve realized the man in black is also an android (you did see the scene at the end of the credits, right?), how does that change the storyline? (Of course, the man in black is actually an homage to the original man in black played by Yul Brynner, who was also an android.) Since Dolores is Dolores again, whose brain is in Hale’s body? Since they can faithfully reproduce whatever host they want, which other hosts will we see return? What happens to the park now? I’m just full of questions.
Special guest Chris Mellor joins us on the podcast, and he asks some very good questions about the future of backup for Kubernetes, including a discussion about Portworx.
You shouldn’t need an annual day to remind you to treat your sweetheart well. Every day should be Valentine’s Day, right? Reaching out and telling your sweetheart that you love them (or remember them if they are no longer with us) is something that you should do on a very regular basis.
Just like backups.
You shouldn’t have to remind yourself to do backups. In fact, backups should require no action on your part; they should just happen. Like love expressions for your sweetheart, they should happen at least on a daily basis – and possibly more often than that. Making sure backups “just happen” requires you to do a few things.
Your backup system should allow you to define a schedule for backups. The default setting for most people is once a night, during a time when most people are not using the computing environment. Some tools run more often than that; Time Machine, for example, defaults to once an hour.
Whatever the frequency, your backup system should be set up so that backups occur on a very regular basis without anyone having to make them happen. We’ll discuss how often that should be a little bit later in this blog post.
Storage available for backups
Backups are not going to happen if they have nowhere to store the data. Historically this meant you had a tape library full of tapes that were ready to be swapped in when necessary. Backup technology has evolved and most of us are using disk or cloud as the primary target. So the main challenge here is to ensure that the disk is available, online, and has sufficient capacity to hold your backups. (This is one of the great things about using the cloud as your backup target; it’s never out of space.)
This is one of the reasons why I do not like Time Machine for your regular laptop backups. It requires you to plug in a portable hard drive in order for the backups to work, then you have to unplug it in order to get your backups away from the thing that you’re backing up. (The 3-2-1 rule is always waiting.) So just make sure that whatever backup storage you have, it is always available and always has available capacity for your backups.
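As a rough illustration of the "available and has capacity" check, here is a minimal Python sketch that uses the standard library's `shutil.disk_usage`. The function name `has_room_for_backup`, the 10% reserve, and the example path are assumptions chosen for illustration; a real backup product would do its own capacity tracking.

```python
import shutil


def has_room_for_backup(target_path, backup_bytes, reserve_fraction=0.10):
    """Return True if target_path can hold backup_bytes and still keep
    reserve_fraction of the volume free (the reserve is an assumed policy).

    Raises OSError (e.g. FileNotFoundError) if the target is not reachable,
    which also catches the "drive is not plugged in" case.
    """
    usage = shutil.disk_usage(target_path)  # named tuple: total, used, free
    reserve = usage.total * reserve_fraction
    return usage.free - backup_bytes >= reserve


# Hypothetical usage: check the current directory's volume for a 5 GiB backup.
five_gib = 5 * 1024**3
print(has_room_for_backup(".", five_gib))
```

Running a check like this before each scheduled backup, and alerting when it fails, is one way to find out about a full or missing target before the night you need a restore.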
How often should you back up?
The more often you back up, the less data you will lose. In more technical terms, the more often you back up, the shorter the recovery point objective (RPO) you can support. Let’s consider a few extremes.
If you only back up once a night and your off-site storage system requires swapping tapes, the best RPO you can support is 96 hours. Why is that, you say? Let’s say something bad happens on Monday before the Iron Mountain truck comes. If you are sending backups off-site every business day, what night was the most recent off-site backup taken? The answer is Thursday night.
Think about it. The last truck to leave your facility left Friday morning, which means it has Thursday night’s backups. That means you’re going to lose all of Friday’s work, any work done over the weekend, and any work that was done on Monday prior to the disaster. That’s 96 hours of lost work. This is why backup frequency and off-site frequency matter. However, backing up more often and sending backups off-site more often can be a costly endeavor, so this must be a decision based on business requirements.
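The arithmetic is easy to check with a few lines of Python. This is only a sketch: the function name `data_loss_hours` and the specific dates and times are hypothetical, picked so that the backup lands on a Thursday night and the disaster on the following Monday.

```python
from datetime import datetime


def data_loss_hours(last_offsite_backup, disaster):
    """Hours of work lost when restoring from the last off-site backup."""
    if disaster < last_offsite_backup:
        raise ValueError("disaster cannot precede the backup")
    return (disaster - last_offsite_backup).total_seconds() / 3600


# Hypothetical timestamps: backup taken Thursday at 10 pm.
thursday_backup = datetime(2020, 3, 5, 22, 0)

# Four full days later (Monday, 10 pm) gives the 96-hour figure.
late_monday = datetime(2020, 3, 9, 22, 0)
print(data_loss_hours(thursday_backup, late_monday))   # 96.0

# A disaster at 8 am Monday, under the same assumptions, loses a bit less.
early_monday = datetime(2020, 3, 9, 8, 0)
print(data_loss_hours(thursday_backup, early_monday))  # 82.0
```

Either way, the loss is measured in days rather than hours, which is the point: the window is set by when the last off-site copy was made, not by when the last backup ran.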
This means this is a business discussion, not a technical one. Stakeholders in your company should decide what the RPO is for their environment because it should be based on the cost of lost data for that particular stakeholder. Business units with very high data loss costs will seek a much tighter RPO, perhaps an hour or even a few minutes. Such an RPO requires backing up more often. And that is the answer to the question, “How often should I back up?”
Give your backup admin some flowers
I’ll bring this back to Valentine’s Day by saying that your backup admin has a very tough job. No one remembers the hundreds of thousands of backups they got right; they only remember the one restore they got wrong. Get them some flowers, chocolate, or whatever it is that gives them a smile. Say thank you, have a nice Valentine’s Day – and then don’t wait another year before you do that again.
Our special guest this week is Preethi Srinivasan, Technical Product Architect at Druva. We talk about Machine Learning, analytics, and how they relate to Data Protection.
We discuss how Salesforce.com has reversed their position on their “recovery service” that costs $10,000 and takes 6-8 weeks. This is a really important thing to know if you’re using SFDC.