Running CrashPlan on a headless CentOS/cPanel server

I was helping someone figure out how to back up his CentOS/cPanel-based web server using CrashPlan.  He was already backing it up via rsync, but he wanted to back it up with a cloud backup product.  Code42 advertises that CrashPlan and CrashPlan PRO support Linux, so how hard could it be?  Not hard at all if you know what to do.  But if you're on a headless web server you will be at the mercy of what you can find on the Internet, as Code42 won't help you at all once you're running an "unsupported configuration."

We got it to work, but only after trying multiple different methods that didn't work.  So I thought I'd describe what we did that eventually worked, and hopefully someone else will find this when they're in the same situation.

What didn't work

Code42 has an "unsupported" (but totally reliable) method to connect the CrashPlan app on your desktop to the CrashPlan service running on the web server by using ssh tunneling.  It's described here.  We were able to make that method work to configure the backup, but then the backup wouldn't run.  It just stayed stuck at "waiting for backup."  We contacted Code42, but they said they couldn't help us at all because we were running an unsupported configuration.  More on that at the end of this blog.

I thought the path to take would be to see if we could use the GUI that is supposed to display on the console of the server, but display it back to our desktop — a MacBook in this case.  (Something totally normal in Unix/Linux configurations.)  Then, since I would be running the GUI directly from the server being backed up, I could call support.  As it turned out, I ended up fixing it myself.  Let's see what I did.

Use ssh to forward X11

Since OS X no longer includes the X Window System (BTW, it's not "X Windows"), I needed to install XQuartz, which I got from here. We followed the instructions and they seemed to work without a hitch.

X11 forwarding is not turned on by default in CentOS, so you have to edit the sshd config and restart sshd.  (Thanks to this blog post for helping me with this.)

sudo vi /etc/ssh/sshd_config

Uncomment and change these two lines to these values

X11Forwarding yes
X11UseLocalhost no

Now reload sshd so it picks up the change.

$ sudo /etc/init.d/sshd reload
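(That's for the init-script-based CentOS we were on. On a newer, systemd-based CentOS release the equivalent would be something like:

$ sudo systemctl restart sshd

but the idea is the same: make sshd re-read its config.)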

If you do not have xauth installed already, you need to install it, too.

$ sudo yum install xauth

Then back on the client where you want to see the GUI displayed, run this command:

$ ssh -l root -Y <linuxserver>

We saw a message that mentioned that xauth had created a new authority file.
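Before launching anything heavy, you can confirm the forwarding took effect by checking the DISPLAY variable in that ssh session.  (The exact value will vary; with X11UseLocalhost set to no, it is typically the server's hostname plus a display number of 10 or higher.)

$ echo $DISPLAY
<linuxserver>:10.0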

To test if it was working correctly, we wanted to run xterm.  But that wasn't installed yet, so we installed it.

$ sudo yum install xterm
$ xterm

We waited a few seconds, and voila!  An xterm popped up on the Mac.  Awesome.

Run CrashPlanDesktop

$ /usr/local/crashplan/bin/CrashPlanDesktop
$

It just returned the prompt to us and never did anything.  When we looked at the log directory, we saw error messages like the ones mentioned in this blog post.  We followed the suggestions in that blog post about creating temporary directories that CrashPlan can write to, and then specifying those directories in the run.conf file.

$ mkdir /root/.crashplan-tmp
$ mkdir /var/crashplan
$ vi /usr/local/crashplan/bin/run.conf

Add this to the end of the GUI_JAVA_OPTS line: "-Djava.io.tmpdir=/root/.crashplan-tmp"
Add this to the end of the SRV_JAVA_OPTS line: "-Djava.io.tmpdir=/var/crashplan"

So run.conf now looks like this:

SRV_JAVA_OPTS="-Dfile.encoding=UTF-8 -Dapp=CrashPlanService -DappBaseName=CrashPlan -Xms20m -Xmx1024m -Djava.net.preferIPv4Stack=true -Dsun.net.inetaddr.ttl=300 -Dnetworkaddress.cache.ttl=300 -Dsun.net.inetaddr.negative.ttl=0 -Dnetworkaddress.cache.negative.ttl=0 -Dc42.native.md5.enabled=false -Djava.io.tmpdir=/var/crashplan"

GUI_JAVA_OPTS="-Dfile.encoding=UTF-8 -Dapp=CrashPlanDesktop -DappBaseName=CrashPlan -Xms20m -Xmx512m -Djava.net.preferIPv4Stack=true -Dsun.net.inetaddr.ttl=300 -Dnetworkaddress.cache.ttl=300 -Dsun.net.inetaddr.negative.ttl=0 -Dnetworkaddress.cache.negative.ttl=0 -Dc42.native.md5.enabled=false -Djava.io.tmpdir=/root/.crashplan-tmp"
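One note: the CrashPlan service reads run.conf at startup, so the SRV_JAVA_OPTS change won't take effect until the engine is restarted.  On our install, something like this did the trick (using the CrashPlanEngine script that comes with the Linux client, assuming the default install path):

$ sudo /usr/local/crashplan/bin/CrashPlanEngine restart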

After that, everything worked perfectly!

Epilogue: Vindication

We fixed the GUI_JAVA_OPTS line first and were then able to run the GUI and configure the backups, but the backup was still stuck at "waiting for backup."  That was exactly what happened when we used the method of running the GUI locally on the Mac and connecting to the CrashPlan service on the web server.  We then changed the SRV_JAVA_OPTS line and backups kicked off immediately.

In other words, the reason the backup wasn't working had nothing to do with us running an unsupported GUI configuration and had everything to do with the CrashPlan app trying to use directories that it couldn't write to.  Now back to Code42.

You can support something that isn't "supported"

Just because a customer is running an unsupported configuration, that doesn't mean you can't help them troubleshoot something.  The Code42 support person could have told us where the logs are, for example.  (Yes, they were in the obvious place of /usr/local/crashplan/logs, but we didn't know that.)  Luckily we googled the right thing and found that web page.  Luckily we knew what X11 was and could figure out how to install it on our Mac.  They could have at least helped a little.  Instead, they simply said I was running a system that didn't meet the minimum requirements, so they literally could not help me in any way to troubleshoot the problem.

This is very reminiscent of when I was trying to install a Drobo on the iMac in my house. The blog post I wrote back then told Data Robotics to either support Linux or drop it.  I still feel the same way now, but in this case the problem is not that they aren't supporting Linux; it's that they don't support headless Linux, which is what most web servers are running.

It isn't that hard to provide some "best effort" support to someone.  They could also enhance that "how to run CrashPlan on a headless Linux system" post by adding this X11 Forwarding idea to it.  Then if a customer has a few questions, help them.  Tell them it's unsupported and that the support will be best effort.  But make the effort.  Seriously.


----- Signature and Disclaimer -----

Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.

Is your data in the cloud getting backed up?

Roaming the aisles of Dreamforce 2014 taught me one thing: backups are here to stay.  You can move everything you want into the cloud, but your data still has to be protected against natural and human-created disasters.  Moving it to the cloud doesn’t change that.

I’ve always felt that way, but I thought for a while that maybe I was just a lone reed in the wind, the only one worried about data that had been moved to the cloud.  Everyone else seemed happy with the backups of their mission-critical data being put into the hands of the cloud provider.

It was with some joy that I welcomed Backupify to the salesforce.com world when I first heard about them a few years ago.  (To my knowledge, they were the first vendor to offer backup of your salesforce.com data, and the first to back up Facebook, Gmail, and others.)  But I wondered whether or not there would be enough people concerned about their cloud-based data to justify adding that expense to their cloud infrastructure bill.  They might think, for example, that a company the size of salesforce.com is backing up their data – so why should they pay to do it as well?   Only time would tell.

Walking around Dreamforce 2014, though, put my fears to rest.  There were three other companies exhibiting backup solutions for salesforce.com (that I could see), and there are a few others that I found via a simple “backup salesforce”  search.  By the way, I’ll cover these companies in another post.

The key concept I wanted to cover here is that some people believe that by moving their data to the cloud, it’s automatically going to get backed up.  That simply isn’t the case.

Consider salesforce.com, for example.  It is well documented that they back up your data – but not so you can restore it!  Their backup is for them to restore a datacenter that gets destroyed by a disaster, malicious attack, or even just plain human error of one of their many humans.   However, if you need to use that backup to restore your salesforce instance due to error on your end, it will cost you a minimum of $10,000, and it is a best effort restore that might take several days.  In addition, it’s an all-or-nothing restore, so you are forced to roll back your entire salesforce instance to the last good backup they took, which could be several days ago!  Suffice it to say that relying on this service is a really, really bad idea.

This is still better than Amazon.com.  They do not back up customer data at all.  Their method of protecting against disasters is to replicate everything all over the place. However, if something catastrophic happens on your end, their replication will simply make it more catastrophic by immediately replicating it to multiple locations.  There is no way to recover your AWS instance if you or someone else manages to take it out.  If you don’t believe me, read my post about the death of codespaces.com.

The general rule is that backup of the data you place in the cloud is your responsibility – just like it is in the datacenter.  Moving it to the cloud does not change that.

Recommendation

The first thing you need to do is to figure out what data you actually have in the cloud.  Good luck with that.  I’ve got some ideas, but we’ll save those for another post.

The next thing you need to do is find out what the cloud vendor’s policies are in this area.  Do they back up your data at all, or are backups entirely your responsibility?  Please note that I believe backups are entirely your responsibility; I just want to know whether or not you’re going to get any help from them in meeting that responsibility.  Even if you develop your own backup system, it would be nice to know whether or not there is a Plan B.

If they do back up your data, are you allowed to use it?  If so, is there an extra fee as with salesforce.com, or can you use it at will?  It would be really nice to test this backup once in a while so you know how it will work when and if you need it.  But you’re not going to test a backup that costs $10K just to try it.

Finally, since the goal here is to have your own independent backup, make sure to investigate the feasibility and costs of doing that.  With salesforce.com, you’ll probably need more API calls, as a regular backup is likely to exceed your base amount.  With hosting providers, you’re talking about bandwidth.  How much will it cost to perform your first off-host backup of your data, and how much will each incremental backup cost you?  You need to know these numbers before investigating alternatives.

If you’re talking about a hosted system of any kind, whether a physical machine in a colo somewhere or a VM inside AWS, you need to find out whether regular backup software will run inside that machine, or whether you are prevented in any way from running a backup application there.  This could be anything from “we have a customized Linux kernel that doesn’t run regular apps” to “you are not allowed to make outgoing connections on non-standard ports.”  Find out the answers to these questions now.

Examine alternatives

If we’re talking about an application like salesforce, you can start by googling “backup” plus the application name.  If you do that with salesforce, you will find several apps that you can investigate and compare pricing for. You will find that each has set its pricing structure so that it is more or less attractive to smaller or larger instances.  For example, one may have a base price that includes 50 users.  That’s great if you have 50 users, but not if you have 5.  If you have 500 users, though, you might not want an app that charges by individual user if it doesn’t start giving discounts at larger numbers.

If you’re talking about any kind of hosted system running Windows or Linux, you can use almost any cloud backup application that uses source deduplication, continuous data protection (CDP), or near-CDP (otherwise known as snapshots and replication).  This is because after the first full backup is done, each of these will only send new, unique blocks every time they back up.  Since you are likely paying your cloud provider by the bit, this is both financially wise and doesn’t put you at odds with physics.

If you find yourself running an app that there is no way to back up, see if there is an API that can be used to get at least some of the data out.  For example, even though there are several apps that back up salesforce, what if there weren’t?  There are other apps that can connect via the API to at least grab your leads and contacts and put them into other systems such as databases or even spreadsheets.  That would be better than nothing if you found yourself running an app with no automated backup options.

Speaking of that, it’s not really a backup if it’s not automated, and it also needs to be stored in some system other than where the primary data is stored.   Again, I hate to keep using salesforce.com as an example, but please don’t tell me you do a weekly manual export of your various salesforce objects using Data Loader.  That is better than nothing, but not by much.  Too much human involvement means too much chance for human error.  Automate it and get it offsite.

Just do it

I can’t explain all the options in an article like this, but I can hopefully get you thinking and asking questions about this.  Is your salesforce.com data being backed up? What about those apps you have running in a Linux VM in AWS?  You can’t fix what you don’t acknowledge, so it’s time to start looking.


----- Signature and Disclaimer -----

Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.

Is a Copy a Backup?

Are we breaking backup in a new way by fixing it?  That's the thought I had while interviewing Bryce Hein from Quantum. It made me think about a blog post I wrote four years ago asking whether or not snapshots and replication could be considered a backup.  The interview is an interesting one and the blog post has a lot of good points, along with quite a bit of banter in the comments section.
 
What I mean when I say, "is a copy a backup" is this: traditionally, a "backup" changed form during the backup process.  It was put into tar/cpio/dump format, or the format of some commercial backup tool.  That process made the data slightly harder for a black hat to monkey with.
 
I'm a fan of putting operational backup and recovery on disk.  I'm an even bigger fan of backing up in such a way that a "recovery" can simply be done by using the backup as the primary while the real primary is being repaired.  It offers the least amount of downtime in some sort of disaster.

But this does beg the question of whether or not leaving the backup in the same format as the original leaves it vulnerable in some way that putting it into a backup format doesn't.  I think the answer is a big fat no.  Specifically, I'd say that a copy is no more or less susceptible than a file on disk that's in some kind of "backup" format.  Either one could be deleted by a malicious admin, unless you were storing it on some kind of WORM filesystem.  The same is true of backups stored on tape.  If someone has control of your backup system, it doesn't take a rocket scientist to quickly relabel all your tapes, rendering them completely useless to your backup system.

As mentioned in my previous post on snapshots and replication, what makes something a backup (versus just a copy) is not its format.  The question is whether or not it has management, reporting, and cataloging built around it so that it is useful when it needs to be.

In that sense, a CDP or near-CDP style backup is actually more of a backup than a tar tape, assuming the tar tape is just the result of a quick tar command.  The tar tape has no management, reporting, or cataloging, other than what you get on the tape itself.

I just want to close out by saying that backup products that are making instant recovery a reality are my favorite kind of products.  These include CDP and near-CDP style products like Simpana, Zerto, Veeam, AppAssure, RecoverPoint, and any of the storage array or storage virtualization products that accomplish backup via snapshots and replication. This is the way backup should be done.  Backup continuously or semi-continuously, and recover instantly by being able to use the backup as the primary when bad stuff happens.

One thing's for sure: you can't do that with tape. 😉


----- Signature and Disclaimer -----

Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.

Keep Your Private Data Private – Good Password Practices Part 2

This is the third post in a series of posts on keeping your private data private.  It was inspired by the Jennifer Lawrence (et al) nude photo scandal, and then encouraged by the “gmail hack” (which wasn’t really a gmail hack), news of which broke literally while I was working on this post.  Previous posts talked about two-factor authentication and preventing hackers from guessing your password.

As I said in the last post, password best practices boil down to three things: preventing hackers from guessing your password, preventing them from stealing it in plain text, and limiting the damage if they do either one.  This blog post is about protecting yourself from the last two.  To read about protecting against the first one, read my previous blog post.

Note: if at any point in this article, you find yourself saying “give me a break” or your eyes start rolling into the back of your head due to boredom, just skip to the next blog post where I talk about password managers.

Limiting the damage if hackers steal your password 

You should assume that any given password may eventually get compromised.  Therefore, you do not want to use the same password on every system. It’s one thing to have your gmail.com account password in the hands of bad guys.  But if that same username and password are used on your amazon.com account?  You’ll be buying $500 espresso machines for all your best friends in the Czech Republic before you can say Karlovy Vary.

Now I’ve gone and made it impossible, right?  I want you to use a hard-to-guess password, I don’t want you to write it down, and I want you to use a different one on every system.

One thing that people do is to combine the password mentioned above with a 2-3 letter code for each site.  Prepend, append, or (better yet) split your password with this code.  So take the “base” password above and make it one of these for Facebook:

FbStephen12p$4#oS
FStephen12p$4#oSb
FBStephen12p$4#oS
Stephenfb12p$4#oS

Then you do the same thing for the other accounts that you have.  This has the benefit of giving you a unique password for every site that’s relatively easy to remember, and it makes it harder to guess.  Adding those two characters increases the entropy of the password as well.

Another thing that people do is to have classes of passwords.  They use really secure and separate passwords for sites where money is involved (e.g. bank, Amazon.com, any site that stores your credit card), another set of passwords for sites with sensitive personal information (e.g. Facebook, Gmail, Dropbox), and then a “junk” password they use at places where you wouldn’t care if it got hacked (e.g. the website that stores your recipes).

Preventing them from stealing your password in plain text

This blog post says that half of all internet sites store your passwords in plain text. For example, it was revealed only a few years ago that LinkedIn was storing passwords in plain text.  You’d think they’d know better.  There’s literally nothing you can do to protect against that.  No matter how good your password is, if they steal the password file and your password is in plain text — you’re toast.  Well, shame on them.

What you can do, though, is avoid installing software that would steal your passwords as you type them by watching your keystrokes. Don’t click on links in emails you don’t recognize.  Don’t click on links in emails from places you do recognize!  If Bank of America sends you an email, open BOA’s website on your own and log in.  Don’t click on the link in the email.  If you do, at the very least you’re letting a spammer know you’re a real person.  At worst, you’ll land on a normal-looking website that is nothing but a dummy site made to look like BOA and designed to steal your password as you type it in.

Also, no bank should ever call you and ask you for personally identifiable information.  They should not be calling asking for passwords, your SSN, or anything like that.  Unfortunately, some actual banks do this.  The bank I belong to will call me about some fraud, and then ask me to verify my identity by giving them my account number or SSN or something.  I refuse to give them that information and then I call back the actual number of the bank and talk to the fraud department.  In my case, it really is the bank just doing stupid stuff, but it could be someone trying to steal your information.  Either way, I believe it’s a really bad idea for banks to teach people that someone might call them and ask them for such information.

And if you get a phone call from “computer support” claiming you’ve got a virus and they need to login to your computer to fix it, again… hang up!  Tell them they’re full of crap and they are a worthless excuse for a human being.  In fact, feel free to unload the worst things you’ve ever wanted to say to a human being on them.  It’ll be cathartic, and it’s not like they can complain to anyone.

This practice of trying to get you to give up your password or other personal info is referred to as social engineering.  If you want to see how it works, watch a great movie called Sneakers, or a not-as-great movie called Trackdown.  Both are available on Netflix On-Demand, and they both show you exactly the kinds of things hackers do to get people to reveal their personal information.  Sneakers is the better movie, but Trackdown is actually more technically correct.  It’s loosely based on the story of Kevin Mitnick, considered one of the greatest hackers of all time.  (In real life, Kevin Mitnick now does what Robert Redford’s character does in Sneakers.)

Use a Password Manager

This is becoming my default recommendation. Use a password manager to create random passwords for you, remember them, and enter them for you.

I’m talking about products like 1Password, LastPass, and Dashlane.  Instead of having to create and remember dozens of different passwords, you can just have them create and store your passwords for you.  I have been trying out Dashlane and like it quite a bit.  Some of them also support two-factor authentication, something I talked about in my last post.

The first thing Dashlane did was to import all of the passwords stored in my browser.  It turns out there were 150+ of them!  If I did nothing else, it would allow me to turn off the “remember password” feature on my browser.  (It’s a really bad feature because if someone gets your laptop, they have the ability to automatically login as you to your most important sites, and your browser’s history will take them right to those sites.)

The second thing Dashlane did was to run a security audit on all my passwords.  Like many people, I failed the audit.  But then they walked me through exactly what I needed to do to make things all better.  They also synchronized my passwords to my iPad and Android phones.

The software will remember your passwords and automatically log you in — but not before requiring you to login to the password manager (usually once per session). That way if someone stole your laptop, they wouldn’t be able to use the password manager to gain access to anything — assuming you didn’t put your master password on a sticky on your laptop, of course. 😉  They also allow you to specify that a particular site requires an entry of the master password every single time you use it, not just once per session. Pretty impressive stuff.

It unfortunately doesn’t yet support logging into apps on iOS/Android, but it can sync your passwords to those devices.  That way if you forget a given password, it can either display it to you or copy it into the buffer so you can paste it into the app.  I’ve been pretty impressed with Dashlane.

Summary

• Don’t use easy to guess passwords

• Don’t use the same password everywhere

• Don’t open stupid stuff that’s designed to steal your data

• Consider using a password manager

I hope this post helps and hasn’t been too overwhelming.

----- Signature and Disclaimer -----

Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.

Keep Your Private Data Private – Good Password Practices Part 1

This is the second post in a series of posts on keeping your private data private.  The series was inspired by the Jennifer Lawrence (et al) nude photo scandal.  Then, literally while I was writing this blog post, this happened.  I’m still not sure exactly what happened, but the short version is: change your gmail password.

Password best practices boil down to three things: preventing hackers from guessing your password, preventing them from stealing it in plain text, and limiting the damage if they do either one.  This blog post is about protecting yourself from the first one of them.

Note: if at any point in this article, you find yourself saying “give me a break” or your eyes start rolling into the back of your head due to boredom, just skip to my next blog post where I talk about password managers.

Preventing hackers from guessing your password

Proper password systems do not store your password in plain text; they store it in encrypted format.  (Although this blog post says that half of internet sites do store them in plain text. There’s literally nothing you can do to protect against that.  No matter how good your password is, if they steal the password file and your password is in plain text — you’re toast.)  When you enter your password to login to something, they encrypt what you typed and compare the encrypted result to the stored encrypted result.  If they match, then you’re authenticated. This means that if a site is hacked and their password database is compromised, the hacker will not have direct access to your password.
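To make “encrypted format” concrete, here’s a quick illustration using openssl’s old MD5-crypt mode (a demo only; real sites should use something stronger, like bcrypt).  The site stores only the salted, one-way hash, and at login it re-runs the same function on whatever you typed and compares the results:

$ openssl passwd -1 -salt xG3a 'KittyCat57'
$1$xG3a$...

That salted hash is what gets stored — never the password itself.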

They do have a couple of techniques they can use to guess your password, however.  The first is called a brute force attack against the website.  The only thing they need to do this is your user name, which they may have obtained via a variety of methods.  If they have that, they simply try to login to the system as your user name again and again, guessing at various passwords each time until they get it right.  A good website would have brute force detection and would notice this and shut down the account.  But that doesn’t stop hackers from trying this method.

If they are able to gain access to the actual password file/database, they can try a different brute force attack that would be undetectable and will always result in them guessing some password of some account, because there are always people who use really bad passwords. They can use software that uses the same one-way encryption system the website uses.  They can try millions of combinations of different passwords until they find one whose encrypted version matches your stored encrypted password, and voila!  

Like the website brute force method above, they usually start with words they store in a dictionary file, which includes ridiculous passwords like Password and 12345 (which people still use, believe it or not), as well as every common word in dictionaries in multiple languages.  They also know to append or prepend numbers around the word, such as Password1 or KittyCat57.  It takes them a few milliseconds to try everything from Kittycat1 to Kittycat99, experimenting with capitalizing each letter, etc.  They’ve got nothing but time and super powerful computers at their disposal.  They might not guess your account, but you can bet that they will guess a bunch of accounts.  (Which is why you should change your password as soon as you hear that a company you use has been compromised.)  And, yes, they know about all the variations of dictionary words as well. They know Password is also spelled Pa$$word, Passw0rd, P@ssword, etc.  So variations on dictionary words are also bad ideas for a password.

So the key here is to use a password that is hard to guess randomly.  Such a password is said to have good entropy.  This is a mathematical term that I won’t go into in great detail here, but suffice it to say that having good entropy is a function of two things: the number of characters you use (e.g. a 12-character password), and the number of different types of characters you use (e.g. upper/lower case, numbers, special characters).  It’s a partnership.  Long passwords are key, but not if they’re composed of all 9s (e.g. 999999999999).  Having an upper and lower case letter, a number, and a special character is good, but 1a8# would be guessed in seconds.  If you want to learn more about entropy, here’s a great blog post. I will say that those who understand entropy seem to prefer longer passwords over more complex passwords, as you will see below.
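To put a rough number on it (my simplification, not a formal definition): a truly random password of length L drawn from a pool of C possible characters has about L × log2(C) bits of entropy.  So 12 characters drawn randomly from a pool of about 72 symbols (upper, lower, digits, common specials) gives roughly 12 × 6.2 ≈ 74 bits, while 12 characters drawn only from digits gives roughly 12 × 3.3 ≈ 40 bits.  That’s why length and character variety have to work together.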

It’s important to say that this means any of the following are out:

• Any word in any dictionary in any language (including Klingon and LOTR Elvish. “You shall not password” is no good either.)

• Variations on dictionary words (e.g. Pa$$word or $uperman)

• Any phrase or number associated with you (e.g. your name, birthday, or address).  This matters more in an attack targeting you specifically.

• Any string that is just numbers (e.g. 438939), unless it’s really long, like 40 characters

You need a long, seemingly random string of characters that is also easy to remember. If you have to look at the sticky on your monitor every time you enter it, you did it wrong.  The key is to get a really good password that is hard to guess randomly and then stick with it.  (No, I am not a fan of “change your password every month” policies. It would make much more sense for them to enforce entropy via software, forcing you to make a good password, and then let you keep it.)

One method is referred to in this xkcd comic and is commonly referred to as correct horse battery staple (see the comic for why).  The practice is to select four completely random words that have nothing to do with each other that you can make a story out of, and use the entire phrase as your password.  Again, the real key is to use words that have nothing to do with each other.  “Mommy of four babies” is bad, “Mommy Electric tomato coffee” is good.   Think of a story that helps you remember them in order and you’re all set.  Think of a mommy that likes electrically warmed tomatoes in her coffee.  Yuck. But you’ll never forget it.  The phrase I used above gets an entropy score of 131 and a rating of Very Strong (perhaps overkill) at this password checker!  That’s what you’re looking for.  Some password systems will not allow you to use it because it’s too long, or because it doesn’t contain any capital characters, numbers or special characters.  Therefore, I’m not a big fan of this method by itself. But it definitely scores high in the entropy department because the phrases are so long.  (This is why I said entropy folks prefer longer, simple passwords over shorter, more complex passwords.)
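Using the comic’s own numbers: each word drawn at random from a list of 2,048 common words adds log2(2048) = 11 bits, so four words gives 4 × 11 = 44 bits of entropy, versus the roughly 28 bits the comic estimates for a short “complex” password like Tr0ub4dor&3.  More words means more entropy, and the phrase is far easier to remember.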

Another method is to make up a much longer silly sentence or phrase and then make an acronym (technically an initialism) of that phrase, while turning some of the initials into numbers or special characters. The phrase should not be a common phrase like “I like walks in the park,” but “I like to pay $ for hash on Sundays” is good, and it becomes “1l2p$4#oS”.  (The more random the phrase, the harder it will be to guess.  The less random the phrase, the easier it will be to remember. You need a balance.)  Now you have a nine-character password that contains upper and lower case letters, three numbers, and two special characters.  If you really have to, early on you could write down the sentence version of the password and refer to that, without writing the actual password down anywhere.  Then after you’ve committed it to muscle memory you can either discard or securely store that sentence.  This has been my personal favorite way to create passwords for years.  However, running the password above through the same password checker gives me an entropy score of only 36, saying it is a “reasonable” password but that skilled hackers might be able to guess it. I had to add five more random characters before it would say that the password was strong.

So I’d say that combining correct horse battery staple with the initialism method would make a pretty strong, hard to guess password.  So “I like to pay $ for hash on Sundays” becomes “Stephen likes to pay $ for hash on Sundays,” which becomes “Stephen12p$4#oS” or “Stephen12p$4#oSundays” if you want to go crazy.

Eyes rolling over yet? If so, go to my next blog post where I talk about password managers.

It should go without saying, but you should not use any of the passwords you read in this article, nor should you use any password that you run through a password checker. You never know who may be running that site; they could be using it to grab passwords.  Use those sites to enter examples of passwords like the one you will use, not the actual one you will use.


----- Signature and Disclaimer -----

Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.

Keep Your Private Data Private – Two-Factor Authentication

There are nude photos of you being posted on a website without your permission!  Well, that’s what Jennifer Lawrence (and a host of other celebs) learned yesterday.  Poor folks.  They never meant for those pictures to be public.  And you probably never mean for those personal emails you wrote, or pictures you took, or private Facebook messages you drunk-typed at two in the morning, to be made public either.  So I thought I’d write a few posts about how to prevent just that thing from happening.  And while I’m at it, I’ll talk about protecting them from failure as well. It’ll probably take me a few posts, but I needed something to blog about.

The first thing I want to talk about is how to keep someone from being able to access your account just because they got ahold of your password.  How many stories have you read of someone hacking an entire password database?  Passwords are typically sent and stored in an encrypted format, so just because someone hacked blabla.com doesn’t mean your blabla.com password is known — but it could be.  (I don’t want to go into details, but suffice it to say that there are a number of scenarios where someone could steal your password without your consent or knowledge, and yes — even if you’re using SSL.)  So let’s talk about how to protect your account from being accessed by a “black hat” even if they get access to your password.  The secret is something called two-factor authentication, or TFA for short.

If you have an ATM card, you’ve been using TFA for years.  It involves pairing something you have (the ATM card) with something you know (your PIN).  This is different than how most people access common Internet sites; they use only something they know (e.g. their password).  If someone else gets your password, then poof — all bets are off.  However, what if your password only worked if it was used on a device that you physically own?  In other words, what if your password only worked on your laptop or mobile phone? Then the black hat would need to steal both your password and your device to get access to your data.  And if you were a user of a big site that got hacked, you would probably want to change your password, but at least you would know that you didn’t get hacked before you changed it.

Just ask the former owners of codespaces.com if they wished they had used two-factor authentication.  If they had, the hacker would not have been able to gain access to their entire infrastructure and destroy their entire company — and the backups of said company — in a few keystrokes.  It’s not a perfect system, but it’s better than single-factor authentication.

You won’t like the limits that this places on your digital lifestyle.  If you find yourself wanting to access Facebook from a friend’s phone, for example, you won’t be able to do so without jumping through a hoop or two.  Security always makes things harder to do; it’s kind of the point.  But IMO, TFA is a very minor tradeoff to make in order to help keep your private data private.

Here is a great article on how to enable TFA on several popular Internet services.  If it doesn’t cover your favorite service, just google “servicename two-factor authentication.”  If your favorite site doesn’t support TFA, then maybe you should find a different site.

Later blog posts will talk about best practices for passwords, encrypting data at rest and in flight, and — of course — backing all this stuff up.


----- Signature and Disclaimer -----

Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.

Does Undetectable Bit Error Rate Matter?

Magnetic devices make mistakes; the only question is how many mistakes they will make. I was presenting a slide yesterday that listed the UBERs (undetectable bit error rates) for various magnetic media, and someone decided to take issue with it.  He basically claimed that it was all bunk and that modern-day magnetic devices don’t make mistakes like I was talking about.  He said he had worked several years in businesses that make disk drives, and that there was essentially no such thing as an undetectable bit error.  I thought I’d throw out a few other thoughts on this issue and see what others think.

He said that all modern devices (disk and tape) do read-after-write checks, and therefore they catch such errors.  But my (albeit somewhat cursory) knowledge of ECC technology is that the read after write is not a block-for-block comparison.  My basic understanding is that a CRC is calculated on the block before the write, the write is made, the block is read back, the CRC is calculated on what was read back, and if they match all is good.  HOWEVER, since the CRC is so small (12-16 bits), there is a possibility that the block doesn’t match, but the CRC does match.  The result is an undetected bit error.  (This is my best attempt at understanding how ECC and UBERs work.  If someone else who has a deep understanding of how it really works can explain it in plain English, I’m all eyes.)
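If that understanding is right, the arithmetic behind the hole is simple: a corrupted block will match a given 16-bit CRC by pure chance with probability 2^-16, or about 1 in 65,536.  Real drives layer much stronger ECC on top of the CRC, which is why published UBERs are down in the 10^-15 to 10^-17 range, but the principle is the same: a small check value can collide.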

There was a representative in the room from Exablox, and he previously worked at Data Domain.  He mentioned that both vendors do high-level checking that looks for bit errors that disk drives make, and that they had found such errors many times — which is why they do it.

Stephen Foskett has said that he thinks that any modern disk array does such checking, and so the fact that disk drives have higher UBERs than tape drives is irrelevant.  Any such errors would be caught by the higher level checks performed by the array or filesystem.  For example, ZFS has such checking as well.  But if all modern arrays do such checks, why do some vendors make sure to mention that THEY do such checking, suggesting that other vendors don’t?

Unless someone can explain to me why I should, I definitely don’t agree with the person who made the comment in my show.  If drives didn’t make these errors, they wouldn’t need to publish a UBER in the first place.  I somewhat agree with Stephen — if we’re talking about arrays that do higher-level checks.  But I don’t think all arrays do such checks.  So I think that UBER still matters.

What do you think?


----- Signature and Disclaimer -----

Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.

More proof that one basket for all your eggs is bad: codespaces.com is gone

Codespaces.com ceased to exist on June 17th, 2014 because they failed to adhere to the standard advice of not putting all your eggs in one basket.  There are a number of things that they could have done to prevent this, but they apparently did none of them.

Before I continue, let me say this.  I know it’s been more than a year since I blogged.  I don’t know what to say other than I’ve been very busy building my new company.  Truth in IT now has six full time employees, several part time employees, and several more contractors.  We do a little bit of everything, including backup seminars, storage seminars, webinars, viral video production, lead nurturing programs, and some other things we’re working on in stealth at the moment.  Hopefully I’ll get back to blogging more often.  OK.  Back to the business at hand.

Here’s the short story on codespaces.com.  Their websites, storage, and backups were all stored in the Amazon.com egg basket.  Then on June 17th, they were subjected to a DDOS attack by someone who was going to extort money from them.  He gained access to their Amazon control panel.  When they took steps to try and fix the problem, he reacted by wiping out their entire company.  According to their site, he “removed all EBS snapshots, S3 buckets, all AMI’s, some EBS instances and several machine instances. In summary, most of our data, backups, machine configurations and offsite backups were either partially or completely deleted.”  I hate being a Monday morning quarterback, but this is what happens when you put all your eggs in one basket.

I’m a fan of cloud services. (Truth in IT is run entirely in the cloud.)  I’m a fan of disk backups. (Truth in IT uses both a cloud-based sync and share service and a cloud-based backup service.)  But if it’s on disk and is accessible electronically, it is at risk.  Having your services, storage, and backups all accessible via the same system is just asking for it.

I do not see this as a cloud failure.  I see this as a process and design failure.  They would have been just as likely to have this happen to them if they had done this in their own data center; that is, if they used a single system to store their server images, applications and data, snapshots of that data, and extra copies of those snapshots.  Yes, using Amazon made it easier to do this by offering all of these services in one place. But the fact that it was in the cloud was not the issue — the fact that they stored everything in one place was the issue.

I love snapshot-based backups, which is what codespaces.com used. It should go without saying, though, that snapshots must be replicated to be any good in times like this.  However, as I have repeatedly told my friends at companies that push this model of data protection, even a replicated snapshot can be deleted by a malicious admin or a rolling bug in the code.  So as long as backups are accessible electronically, I still like having some other kind of backup of those backups.

Use a third-party replication/CDP system to copy them to a different vendor’s array that has a different password and control panel.  Back them up to tape once in a while.  Had they done any of these things into a system that was not immediately controllable via the Amazon control panel, their backups would have been safer.  (The hacker would have had to hack both systems.)  However, since all server data, application data, and backup data were all accessible via a single Amazon.com console, the hacker was able to access their data and their backups via the same console.

I love cloud-based computing services.  There’s nothing wrong with them running their company on that.  But also storing their backups via the same Amazon console as their server?  Not so much.

I love cloud-based backups.  They are certainly the best way to protect cloud-based servers.  I’m also fine with such backups being stored on S3.  But if your S3 backups are in the same account as your AWS instances, you’re vulnerable to this kind of attack.
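One way to get that separation (a sketch only; the bucket name and profile are made up) is to push a copy of your backups to a bucket owned by a completely separate AWS account with its own credentials, so that compromising one console doesn’t expose both the servers and the backups:

$ # "backup-acct" is an AWS CLI profile whose keys belong to a second, independent account
$ aws s3 sync /var/backups s3://example-separate-account-bucket --profile backup-acct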

I also want to say that this is one of the few advantages that tape has — the ability to create an “air gap.”  As a removable medium, it can be used to place distance (i.e. an “air gap”) between the data you’re protecting and the protection of that data.  Store those backups at an offsite storage company and make retrieval of those tapes difficult.  For example, require two-person authentication when picking up backup tapes outside of normal operations.

For those of you backing up things in a more traditional manner using servers in a non-cloud datacenter, this still applies to you.  The admin/root password to your production servers should not be the same password as your development servers — and it should not be the same one as your backup servers.  Your backup person should not have privileged access to your production servers (except via the backup software), and administrators of your production servers should not have privileged access to your backup system.  That way a single person cannot damage both your production systems and the backups of those systems.

I would add that many backup software packages have the ability to run scripts before and after backups run, and these scripts usually run as a privileged user.  If a backup user can create such a script and then run it, he/she could issue an arbitrary command, such as deleting all data — and that script would run as a privileged user.  Look into that and lock that down as much as you can.  Otherwise, the backup system could be hacked to do just what this person did.
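As a minimal sketch of that lockdown (the script path is hypothetical; check where your backup product actually keeps its pre/post scripts), make sure the scripts are owned by root and that the backup user can execute but not modify them:

$ sudo chown -R root:root /opt/backup/scripts
$ sudo chmod -R 755 /opt/backup/scripts
$ # now only root can edit the scripts, so a compromised backup account can't inject commands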

Don’t store all your eggs in one basket.  It’s always been a bad idea.


----- Signature and Disclaimer -----

Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.

Get rid of tape? Inconceivable!

Stephen Manley published a blog post today called “Tape is Alive? Inconceivable!”  To which I have to reply with a quote from Inigo Montoya, “You keep using that word. I do not think it means what you think it means.”  I say that because, for me, it’s very conceivable that tape continues to play the role that it does in today’s IT departments.  Yes, its role is shrinking in the backup space, but it’s far from “dead,” which is what Stephen’s blog post suggests should happen.

He makes several good points as to why tape should be dead by now.  I like and respect Stephen very much, and I’d love to have this discussion over drinks at EMC World or VMworld sometime.  I hope that he and his employer see this post as helping him to understand what people who don’t live in the echo chamber of disk think about tape.

Stephen makes a few good points about disk in his post.  The first point is that the fastest way to recover a disk system is to have a replicated copy standing by ready to go.  Change where you’re mounting your primary data and you’re up and running.  He’s right.  He’s also right about snapshots or CDP being the fastest way to recover from logical corruption, and the fastest way to do granular recovery of files or emails.

In my initial post on the LinkedIn discussion that started this whole thing, I make additional “pro-disk” points. First, I say that tape is very bad at what most of us use it for: receiving backups across a network — especially incremental backups.  I also mention that tape cannot be RAID-protected, where disk can be. I also mention that disk enables deduplication, CDP, near-CDP and replication — all superior ways to get your data offsite than handing tape to a dude in a truck.  I summarize with the statement that I believe that disk is the best place for day-to-day backups.

But…

Disk has all of the above going for it.  But it doesn’t have everything going for it, and that’s why tape isn’t dead yet — nor will it be any time soon.

I do have an issue or two with the paragraph in Stephen’s post called “Archival Recovery.”  First, there is no such thing.  It may seem like semantics, but one does not recover from archives; one retrieves from archives.  If one is using archive software to do their archives, there is no “recover” or “restore” button in the GUI.  There is only “retrieve.”  Stephen seems to be hinting at the fact that most people use their backups as archives — a practice he and I agree is bad.  Where we disagree is whether or not moving many-years-old backup data to disk solves anything. My opinion is that the problem is not that the customer has really old backups on tape.  The problem is that they have really old backups.  Doing a retrieval from backups is always going to be a really bad thing (regardless of the media you use) and could potentially cost your company millions of dollars in fines and billions of dollars in lost lawsuits if you’re unable to do it quickly enough.   (I’ll be making this point again later.)

Cost

Disk is the best thing for backups, but not everyone can afford the best.  Even companies that fill their data centers with deduplicated disk and the like still tend to use tape somewhere — mainly for cost reasons.  They put the first 30-90 days on deduped disk, then they put the next six months on tape.  Why?  Because it’s cheaper.  If it wasn’t cheaper, there would be no reason for them to do this.  (This is also the reason why EMC still sells tape libraries — because people still want to buy them.)

Just to compare costs: at $35 per 1.5 TB tape, storing 20 PB on LTO-5 tapes costs about $467K with no compression, or about $233K with 2:1 compression.  In contrast, the cheapest disk system I could find (a Promise VTrak 32 TB unit) would cost me over $12M to store that same amount of data.  Even if I got a 20:1 dedupe ratio in software (which very few people get), it would still cost over $600K (plus the cost of the capacity-based dedupe license from my backup software company).
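Working out the tape math: 20 PB ÷ 1.5 TB/tape ≈ 13,334 tapes, and 13,334 × $35 ≈ $467K.  With 2:1 compression you need half as many tapes, hence roughly $233K.  Either way, it’s a rounding error next to $12M of raw disk.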

It’s also the cheapest way to get data offsite and keep it there.  Making another copy on tape at $.013/GB (current LTO-5 pricing) and paying ~$1/tape/month to Iron Mountain is much cheaper than buying another disk array (deduped or not) and replicating data to it.  The disk array is much more expensive than a tape, and then you need to pay for bandwidth — and you have to power the equipment providing that bandwidth and power the disks themselves.  The power alone for that equipment will cost more than the Iron Mountain bill for the same amount of data — and then you have the bill for the bandwidth itself.

Now let’s talk about long-term archives.  This is data stored for a long time that doesn’t need to be in a library.  It can go on a shelf and that’ll be just fine.  Therefore, the only cost for this data is the cost of the media and the cost of cooling/dehumidifying something that doesn’t generate heat.  I can put it on a tape and never touch it for 30 years, and it’ll be fine (yes, I’m serious; read the rest of the post).  If I put it on disk, I’m going to need to buy a new disk every five years and copy it.  So, even if the media were the same price (which it most certainly is not), the cost to store it on disk would be six times the cost of storing it on tape.

Unlimited Bandwidth

Never underestimate the bandwidth of a truck.  ’Nuf said.  Lousy latency, yes.  But definitely unlimited bandwidth.

Integrity of Initial Write

LTO is two orders of magnitude better at writing bits than enterprise-grade SATA disks, which is what most data protection data is stored on.  The undetectable bit error rate of enterprise SATA is 1:10^15, and LTO is 1:10^17.  That’s one undetectable error every 100 TB with SATA disk and one undetectable error every 10 PB with LTO.  (If you want more than that, you can have one error every exabyte with the Oracle and IBM drives.)  I would also argue that if one error every 10 PB is too much, then you can make two copies — at a cost an order of magnitude less than doing it on disk.  There’s that cost argument again.
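For those checking the math: the quoted UBERs are per bit read, so 10^15 bits ÷ 8 ≈ 1.25 × 10^14 bytes, or about 125 TB, which is where the “one error every 100 TB” round number for SATA comes from.  The same conversion on 10^17 bits gives roughly 12.5 PB for LTO.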

Long-term Integrity

As I have previously written, tape is also much better than disk at holding onto data for periods longer than five years.  This is due to the physics of how disks and tapes are made and operated.  There is a formula (KuV/kBT, the ratio of a magnetic grain’s anisotropy energy to thermal energy) that I explain in a previous blog post: the bigger your magnetic grains are, the better, and the cooler your device is, the better.  The resulting value of this formula gives you an understanding of how well the device will keep its bits in place over long periods of time, and not suffer what is commonly called “bit rot.”  This matters because disks use significantly smaller magnetic grains than tape, and disks run at very high operating temperatures, where tape is stored at ambient temperatures.  The result is that disk cannot be trusted to hold onto data for more than five years without suffering bit rot.  If you’re going to store data longer than five years on disk, you must move it around.  And remember that every time you move it around, you’re subject to the lower write integrity of disk.

I know that those who are proponents of disk-based systems will say that because it’s on disk you can scan it regularly.  People who say that obviously don’t know that you can do the same thing on tape.  Any modern tape drive supports the SCSI verify command that will compare the checksums of the data stored on tape with the actual data.  And modern tape libraries have now worked this into their system, automatically verifying tapes as they have time.
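
Even without a library that automates verification, you can spot-check a tape by hand.  Here's a minimal sketch using GNU tar's --compare mode (the device name /dev/nst0 is an assumption; this re-reads the archive from tape rather than issuing the drive's SCSI verify command, but it proves the same point):

$ mt -f /dev/nst0 rewind            # position the tape at the beginning
$ tar --compare --file=/dev/nst0    # re-read the archive and diff it against the filesystem

Any read errors or mismatches get reported; silence means the tape read back cleanly.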

Only optical (i.e. non-magnetic) formats (e.g. Blu-ray, UDO) do a better job of holding onto data for decades.  Unfortunately, they’re really expensive.  Last I checked, UDO media was 75 times more expensive than tape.

Air Gap [Update: I added this a day after writing the initial post because I forgot to add it]

One thing tape can do that replicated disk systems cannot do is create a gap of air between the protected data and the final copy of its backup.  Give the final tape copy to Iron Mountain and you create a barrier to someone destroying that backup maliciously.  One bad thing about replicated backups is that a malicious sysadmin can delete the primary system, backup system, and replicated backup system with a well-written script.  That’s not possible with an air gap.

Device Obsolescence

People that don’t like tape also like to bring up device obsolescence.  They say things like “you can’t even get a device to read the tape you wrote 10 years ago.”  They’re wrong.  First, even if you completely failed to plan, there is a huge market for older tape drives, and you can find any tape drive used in the last 20-30 years on eBay if you have no other choice.  (I know because I just did it.)

Second, if you’re keeping tapes from twenty-year-old tape drives, you should be keeping the drives.  Duh.  And if those drives aren’t working, there are companies that will repair them for you.  No problem, easy peasy.  Device obsolescence is a myth.

Device Life

Suppose you have a misbehaving disk from many years ago.  There are no disk repair companies.  There are only data recovery companies that charge astronomical amounts of money to recover data from that drive.

Now consider what you do if you have a malfunctioning tape, which is rare, because there’s not much to malfunction.  I have been able to “repair” all of the physically malfunctioning tapes I have ever encountered (which is only a few out of the hundreds of thousands of tapes I’ve handled).  The physical structure of a modern tape spool is not that difficult to understand, take apart, and reassemble.

Now consider what happens when your old tape drive malfunctions, which is much more likely.  You know what you do?  Use a different drive!  If you don’t have another drive, you can just send the one that’s malfunctioning to a repair shop that will cost you far less than what a data recovery company will cost you.  If you’re in a hurry, buy another one off eBay and have them rush it to you.  Better yet, always have a spare drive.

Legal Issues

This isn’t really a disk-vs-tape issue, but I just had to comment on the customer that Stephen quoted in his blog post as saying, “I’m legally required to store data for 30 years, but I’m not required by law or business to ever recover it. That data is perfect for tape.”  That may amuse someone who works for a disk company, but I find the statement to be both idiotic and irresponsible.  If one is required by law to store data for 30 years, then one is required by law to be able to retrieve that data when asked for it.  This could be a request from a government agency, or an electronic discovery request in a lawsuit.  If you are unable to retrieve data you were required to store, you run afoul of that agency and will be fined or worse.  If you are unable to retrieve the data for an electronic discovery request in a lawsuit, you risk receiving an adverse inference instruction from the judge that can result in you losing the lawsuit.  So whoever said that has no idea what he/she is talking about.

Think I’m exaggerating?  Just ask Morgan Stanley, who up until the mid-2000s used their backups as archives.  The SEC asked them for a bunch of emails, and their inability to retrieve those emails resulted in a $15M fine.  They also had a little over 1400 backup tapes that they needed months of time to pull emails off of to satisfy an electronic discovery request from a major lawsuit brought by Coleman Holdings in 2005.  (They needed this time because they stored the data via backup software, not archive software.)  The judge said “archive searches are quick and inexpensive. They do not cost ‘hundreds of thousands of dollars’ or ‘take several months.’”  (He obviously had never tried to retrieve emails off of backup tapes.)  He issued an adverse inference instruction to the jury saying that this was a ploy by Morgan Stanley to hide emails, and that they should take that into consideration in the verdict.  They did: Morgan Stanley lost the case, and Coleman Holdings was awarded a $1.57B judgment.

Doing a retrieval for a lawsuit or a government agency request is a piece of cake — regardless of the medium you use — if you use archive software.  If you use backup software to store data for many years, it won’t matter what medium you use either — retrieval will take forever.  (I do feel it important to mention that there is one product I know of that will truly help you in this case, and that’s Index Engines.  It’s a brute-force approach, but it’s manageable.  They support disk and tape.)

Summary

Why isn’t tape dead?  Because there are plenty of things that it is better at than disk.  Yes, there are plenty of things that disk is better at than tape.  But move all of today’s production, backup, and archive data to disk?  Inconceivable!

----- Signature and Disclaimer -----

Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.

Love my Mac, Starting to hate Apple

Keep it up, Apple, and I’m going back to Windows.

I was a Windows customer for many years.  Despite running virus/malware protection and being pretty good at doing the right things security-wise, I had to completely rebuild Windows at least once a year — and it usually happened when I really didn’t have the time for it.  It happened one too many times and I said, “that’s it,” and I bought my first MacBook Pro.  (The last Windows OS I ran on bare metal was Windows XP.)

I made the conversion to MacOS more than four years ago.  During all this time, I have never — never — had to rebuild MacOS.  When I get a new Mac, I just use Time Machine to move the OS, apps, and data to the new machine.  When a new version of the OS comes out, I just push a button and it upgrades itself.  I cannot say enough nice things about how much easier it is to have a Mac than a Windows box.  (I just got an email today from a Windows user complaining about what he was told about transferring his apps and user data to his new Windows 8 machine.  He was told that it wasn’t possible.)

My first Mac was a used MacBook Pro for roughly $600, for which I promptly got more RAM and a bigger disk drive.  I liked it.  I soon bought a brand new MacBook Pro with a 500 GB SSD drive, making it cost much more than it would have otherwise.  (In hindsight, I should’ve bought the cheapest one I could and then upgraded the things I didn’t like.)  It wasn’t that long before I realized that I hadn’t put enough RAM in it, so I added more.  (I didn’t account for the amount of RAM that Parallels would take.)

My company’s second Mac was an iMac.  After we started doing video editing on that, we decided to max out its RAM.  Another MacBook Pro had more RAM installed in it because Lion wanted more than Snow Leopard, and on another MacBook Pro we replaced the built-in hard drive with an SSD unit and upgraded its RAM.  We are still using that original MacBook Pro and it works fine — because we upgraded to more RAM and a better disk — because we could.  It’s what people that know how to use computers do — they upgrade or repair the little parts in them to make them better.

The first expensive application we bought (besides Microsoft Office) was Final Cut Pro 7, and I bought it at Fry’s Electronics — an authorized reseller of Apple products.  I somehow managed to pay $1000 for a piece of software that Apple was going to replace in just a few days with a completely different product.  Not an upgrade, mind you, but a complete ground-up rework of the product.  Anyone who followed that world knows what’s coming next.  I wish I had known at the time.

First, Apple ruins Final Cut Pro

For those who don’t follow the professional video editing space, Final Cut Pro was the industry standard for a long time.  Other products eventually surpassed it in functionality and speed, but a lot of people hung onto Final Cut Pro 7 anyway because (A) they knew it already and (B) it worked with all their existing and past project files.  They waited for years for a 64-bit upgrade to Final Cut Pro 7.

Apple responded by coming out with Final Cut Pro X, a product that was closer in functionality to iMovie than to Final Cut Pro — and couldn’t open Final Cut Pro 7 projects.  (In case you missed that, the two reasons that people were holding onto Final Cut Pro 7 were gone.  They didn’t know how to use the new product because it was night-and-day different, and it couldn’t open the old product’s projects.)  FCP X was missing literally dozens of features that were important to the pro editing community.  (They have since restored a lot of those missing features, but not all of them.)  And the day they started selling FCP X, they stopped selling FCP 7.  Without going into the details, suffice it to say that there was a mass exodus, and Adobe and Avid both had a very good year.  (Both products offered, and may still be offering, big discounts to FCP customers that wanted to jump ship.)

But what really killed me is what happened to me personally.  I thought that while Apple was addressing the concerns that many had with FCP X, I’d continue using FCP 7.  So I called them to pay for commercial support for FCP 7 so I could call and ask stupid questions — of which I had many — as I was learning to use the product.  Their response was to say that support for FCP 7 was unavailable.  I couldn’t pay them to take my calls on FCP 7.  What?

So here I am with a piece of software that I just paid $1000 for, and I can’t get any help from the company that just sold it to me.  I can’t return it to Fry’s because the package has been opened.  I can’t return it to Apple because I bought it at Fry’s.  I asked Apple to give me a free copy of FCP X to ease the pain, and they told me they’d look into it and then slowly stopped returning my emails.  Thanks a bunch, Apple.  (Hey Apple: If you’re reading this, it’s never too late to make an apology & give me that free copy of FCP X.)

Apple ruins the MacBook Pro

Have you seen the new MBP?  Cool, huh?  Did you know that if you want the one with the Retina display, you’d be getting the least upgradeable, least repairable laptop in history?  That’s what iFixit had to say after they tore down the 15″ and 13″ MBPs.  You won’t be able to upgrade the RAM because it’s soldered to the motherboard.  You’ll have to replace the entire top just to replace the screen — because Apple fused the two together.

When I mention this to Apple fans and employees, what I get is, “well, it’s just like the iPad!”  You’re right.  The 15-inch MacBook Pro is a $2200 iPad.  This means that they can do things like they do in the iPad, where they charge you hundreds of dollars to go from a 16 GB SSD chip to a 64 GB SSD chip, although the actual difference in cost is a fraction of that.  Except now we’re not talking hundreds of dollars — we’re talking thousands.  This means that you’ll be forced to buy the most expensive one you can afford, because if you do like I did and underestimate how much RAM you’ll need, you’ll be screwed.  (It costs $200 more to go from an 8 GB version to a 16 GB version, despite the fact that buying that same RAM upgrade directly from Crucial would cost you $30 — not $200.)

Apple’s response is also that they’ll let the market decide.  You can have the MBP with the Retina Display and no possibility of upgrade, or the MBP without the Retina Display and the ability to upgrade.

First, I want to say that that’s not a fair fight.  Second, can you please show me on the Apple website where they show any difference between the two MBPs other than CPU speed and the display?  Everyone is going to buy the cheaper laptop with the cooler display, validating Apple’s theory that you’ll buy whatever they tell you to buy.  (Update: If you do order one of the Retina laptops, it does say in the memory and hard drive sections, “Please note that the memory is built into the computer, so if you think you may need more memory in the future, it is important to upgrade at the time of purchase.”  But I don’t think the average schmo is going to know what that means.)

Apple Ruins the iMac

I just found out today that they did the same thing they did above, but with the iMac.  And they did this to make the iMac thinner.  My first question is: why the heck did the iMac need to be thinner?  There’s already a giant empty chunk of air behind my current iMac because it’s so stinking thin already.  What exactly are they accomplishing by making it thinner?

One of the coolest things about the old iMac was how easy it was to upgrade the RAM.  There was a special door on the bottom to add more RAM.  Two screws and you’re in like Flynn.  Now it’s almost as bad as the MacBook Pros, according to the folks over at iFixit.  First, they removed the optical drive.  Great, just like FCP.  They made it better by removing features!  Their teardown analysis includes sentences like the following:

  • “To our dismay, we’re forced to break out our heat gun and guitar picks to get past the adhesive holding the display down.”
  • “Repair faux pas alert! To save space and eliminate the gap between the glass and the pixels, Apple opted to fuse the front glass and the LCD. This means that if you want to replace one, you’ll have to replace both.”
  • “Putting things back together will require peeling off and replacing all of the original adhesive, which will be a major pain for repairers.”
  • “The speakers may look simple, but removing them is nerve-wracking. For seemingly no reason other than to push our buttons, Apple has added a barb to the bottom of the speaker assemblies that makes them harder-than-necessary to remove.”
  • “Good news: The iMac’s RAM is “user-replaceable.” Bad news: You have to unglue your screen and remove the logic board in order to do so. This is just barely less-terrible than having soldered RAM that’s completely non-removable.”

It is obvious to me that Apple doesn’t care at all about upgradeability and repairability.  Because otherwise they wouldn’t design a system that requires ungluing the display just to upgrade the RAM!  How ridiculous is that?  And they did all this to make something thinner that totally didn’t need to be thinner.  This isn’t a laptop.  There is absolutely no benefit to making it thinner.  You should have left well enough alone.

Will they screw up the Mac Pro, too?

I have it on good authority that they are also doing a major redesign of the Mac Pro (the tower config).  This is why we have waited to replace our iMac with a Mac Pro, even though the video editing process could totally use the juice.  But now I’m scared that they’ll come out with another non-repairable product.

Keep it up, Apple, and I’m gone

Mac OS may be better than Windows in some ways, but it also comes with a lot of downsides.  I continually get sick of not being able to integrate my Office suite with many of today’s cool cloud applications, for example.  I still have to run a copy of Windows in Parallels so I can use Visio and Dragon NaturallySpeaking.

You are proving to me that you do not want intelligent people as your customers.  You don’t want people that try to extend the life of their devices by adding a little more RAM or a faster disk drive.  You want people that will go “ooh” and “ahh” when you release a thinner iMac and never ask how you did that, or that don’t care that they now have to pay extra for a DVD drive that still isn’t Blu-ray.

Like I said when I started this blog post: I like my Mac.  I love my new iPad Mini.  But I am really starting to hate Apple.

----- Signature and Disclaimer -----

Written by W. Curtis Preston (@wcpreston). For those of you unfamiliar with my work, I've specialized in backup & recovery since 1993. I've written the O'Reilly books on backup and have worked with a number of native and commercial tools. I am now Chief Technical Architect at Druva, the leading provider of cloud-based data protection and data management tools for endpoints, infrastructure, and cloud applications. These posts reflect my own opinion and are not necessarily the opinion of my employer.