Include All Files; Reject Some

I had a Twitter chat with @JLivens the other day where the question was "what do you back up?"  My first response was to, of course, say that I back up everything – thrice – 'cause I'm me. If you're curious, my critical personal data is synced on multiple computers with history using Dropbox (which I'm reconsidering based on how things have been going over there lately), then it's backed up with the free version of CrashPlan to another computer that isn't at my house, AND I can't resist the urge to throw in a Time Machine backup every once in a while.  You know what?  I haven't done one of those in a week or so.  Just a second.  My little Time Machine icon is spinning now. Ah, there, I feel better.

Side note: for all my talk about tape lately, you'll notice that I don't have any tape in my setup for now.  I am about to embark on a project that may make me reconsider that as I might have an archiving need soon.  Can't keep it all on spinning disk!

Alright, back to the topic at hand.  What do I back up?  I actually do back up everything, but that is not the point I wanted to get across in this post. 

It's easy to come up with a list of directories you don't want to back up.  Your /tmp folder, your "Temporary Internet Files," your folder on your work laptop that contains the illegally downloaded movies that you should have never downloaded in the first place.  Yeah, I'm talking to you.  Pay for the media/software you consume.

But what I wanted to talk about was how to make your backup selection if you want to exclude things.  What I've found is that the human tendency is to say "just back up the Documents folder," or something like that.  And that is what I really want to talk you out of.  There is too much risk in doing it this way.  You could accidentally put some important data in a directory you're not backing up.  You could create a whole other directory that contains really important data and forget to add it to the list.  The risk outweighs the benefit of excluding the other data.

If your backup software has the ability, please have it autoselect both filesystems/drives and folders/directories.  If it supports it and if you want to do so, you can also create an exclude list of the directories you definitely don't want to back up.
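To make the difference concrete, here is a minimal sketch in Python of "select everything, then subtract the exclude list."  It is not how any particular backup product works; the function names and the patterns are made up for illustration, and the patterns are simple shell-style globs matched against paths relative to the backup root.

```python
import fnmatch
import os


def is_excluded(rel_path, excludes):
    """Return True if rel_path matches any shell-style exclude pattern."""
    return any(fnmatch.fnmatch(rel_path, pattern) for pattern in excludes)


def select_for_backup(root, excludes):
    """Walk everything under root and return the relative paths of all files
    that are NOT on the exclude list (hypothetical helper, for illustration)."""
    selected = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        if rel_dir == ".":
            rel_dir = ""
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if not is_excluded(os.path.join(rel_dir, d), excludes)
        ]
        for name in filenames:
            rel_path = os.path.join(rel_dir, name)
            if not is_excluded(rel_path, excludes):
                selected.append(rel_path)
    return sorted(selected)


# Illustrative exclude list: a top-level tmp directory and the Windows page file.
EXCLUDES = ["tmp", "pagefile.sys"]
```

The point of doing it this way is that a brand new directory you create next month gets backed up automatically, because it isn't on the exclude list.  With an include list, that same directory is silently skipped until you remember to add it.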

And that's what I came to say: back up everything, but exclude what you don't want.  Hopefully the title makes sense now.


Written by W. Curtis Preston (@wcpreston), four-time O'Reilly author, and host of The Backup Wrap-up podcast. I am now the Technology Evangelist at Sullivan Strickler, which helps companies manage their legacy data.

4 comments
  • Interesting that you say that Curtis, because I tell my customers the same thing. Now, of course, one might say that backing up your “personal” computer is different than backing up a company’s IT infrastructure. Sure, there may be a few more pieces of hardware/software involved, but it’s really all the same. Value is the name of the game here. For example, what you choose to back up is a direct reflection of what’s of “value” to you. In other words, without putting a monetary value on it, perception is reality. If an organization perceives their Oracle environment as “important”, then of course they will back that up. The same goes if you value your “My Documents” folder. I often tell customers the same thing. Determine your objectives (whether personal or professional), then react accordingly. Normally in professional settings, technical objectives = business objectives = “what will it cost if I’m not able to recover this data” (which is ultimately the deciding factor on whether it should be backed up and on what – tape, disk, VTLs, etc). The bottom line in my humble opinion (and for the truly paranoid individuals such as myself): back up everything if you have the space. If you don’t have the space, research de-duplication (which is a whole different subject)!

  • Hello Curtis,

    Good point! I couldn’t agree more.

    Are there any tools that “suggest” the contents of the exclude list automatically? I know it may sound strange, since only the user knows what not to back up, but on the other hand, who would spend x GB of the backup media on pagefile.sys? Right?

    Regards

  • NetBackup makes some suggestions, but it’s just a list that has been built over time. I don’t know of any that automatically build a list for you. Nice idea, though.

  • Yep, hashes of “known files” could be managed, and a list could be generated for each installation individually. Of course, user verification/confirmation should be received before the exclude list is applied. But by comparing/merging exclude lists (anonymously, of course), some service could come up with a bunch of exclude list “templates”. Collective mind, you know 🙂

    Oh … ideas are just floating around … 🙂

    Regards