A little while ago I wrote a blog entry about how I was disappointed in Time Machine and how I was trying to figure out something better. I believe I’ve found my solution, and even have a working shell script that does the job for me. Those of you that don’t have Macs really want to pay attention as well, as what I ended up doing works for anything you can run rsync on.
First let me say that I think Time Machine is awesome and puts any other native backup utility that I’ve seen to shame. All you have to do is plug in an external drive and Time Machine automatically pops up and says “Hey! I noticed you plugged in a drive! Would you like to use that for backup?” Tell it yes, and you’re doing backups. From then on, all you have to do is plug in the drive. As soon as it sees it, it kicks off Time Machine. NICE. Restores are nice, too. You’re presented with a number of points of view of your directories over time, allowing you to just grab the files you want. Again, very nice.
But, as I said in my previous post, Time Machine starts to lose its coolness when you bring in other machines or buy a Time Capsule. In order to back up to a centralized system (e.g. Time Capsule, or a Mac w/a large hard drive), it creates a disk image (referred to as a sparsebundle) and backs up to that. The problem with that is that this disk image sometimes gets corrupted. So people that know what they’re doing know to keep two versions of it and fsck it every night after backups. If the fsck fails, you swap it out with the good version. And it’s important to say that since this is one big image, we’re not just talking about keeping an extra full around; this sparse image contains the “full” and all incrementals since then. That’s a lot of wasted space and effort.
Time Machine Structure
After some unsuccessful attempts, I decided to dig deeper into the format of the Time Machine backup itself. I looked at the format of a local backup, without the sparsebundle complexity. The first thing you see is a Backups.backupdb directory, and inside that is a directory named after the host the backups are for. Inside that directory is a series of directories named after the date and time the backup was made, along with a seemingly random number tacked on the end of the name (presumably to prevent naming conflicts). There is also a directory named Latest, which is just a symbolic link to the latest backup. During the backup, there’s another directory named like the others, but with the phrase “inProgress” tacked on the end. When a backup is successful, the “inProgress” part of the name is removed. Here’s how that looks:
W-Curtis-Preston1s-MacBook-Pro:W. Curtis Preston1’s MacBook Pro wcurtispreston$ pwd
/volumes/backup/Backups.backupdb/W. Curtis Preston1’s MacBook Pro
W-Curtis-Preston1s-MacBook-Pro:W. Curtis Preston1’s MacBook Pro wcurtispreston$ ls
2009-12-10-103835 2009-12-29-131111 2009-12-29-151117 2009-12-29-171133 2009-12-29-191606 Latest
2009-12-17-141618 2009-12-29-141131 2009-12-29-161126 2009-12-29-181318 2009-12-29-201128
If you cd to this directory and look around, you might be a bit confused. Each of the directories appears to look like a full backup, but your drive couldn’t possibly hold all those full backups. What’s up with that? Run “du -sk *” on the directories and you’ll see that each of them takes up a different amount of space. It’s a little thing called hard links. Each ‘backup’ directory looks like a full backup. If a file is changed or new before a backup, it is found in that directory. But if there’s a file being backed up tonight that hasn’t changed, Time Machine simply makes a hard link to the same named file in a previous backup. A hard link takes up very little space (almost none, actually) and links the two files. This is a feature available in Unix-style filesystems (e.g. UFS, EXT3, ZFS, HFS+) and in NTFS, but not on FAT32.
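If hard links are new to you, here is a minimal sketch you can run in a scratch directory to see them in action. The file names are made up for the demo; nothing here touches a real backup.

```shell
#!/bin/sh
# Demonstrate that a hard link is just a second name for the same data.
demo=$(mktemp -d)                      # scratch directory, safe to delete
echo "hello" > "$demo/original"
ln "$demo/original" "$demo/linked"     # no -s flag, so this is a hard link

# Both names report the same inode number ...
inode_a=$(ls -i "$demo/original" | awk '{print $1}')
inode_b=$(ls -i "$demo/linked"   | awk '{print $1}')

# ... and the link count (second column of ls -l) is now 2.
links=$(ls -l "$demo/original" | awk '{print $2}')

echo "inodes: $inode_a $inode_b, link count: $links"
```

The data is stored once; each extra name only costs a directory entry, which is why a directory full of links to unchanged files looks like a full backup without using the space of one.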
That’s when it hit me. Time Machine is nothing but a GUI around the age-old idea of using rsync and hard links to create a snapshot-like backup that was first popularized by Mike Rubel several years ago and written about in my book. The instructions on this original page don’t work with Mac OS, but the concept does. (If you want to understand the concepts behind this tool, such as how hard links work, read that original blog post.) This idea was enhanced and developed and blossomed into open-source backup products such as rsnapshot and rdiff-backup. These are awesome tools that handle everything for you kind of like Time Machine does. I can do this! And I can do it for my Linux system as well.
So what’s up with the SparseBundle files? The reason they went with this is that when backing up your OS, you need to watch carefully the permissions of all the files you’re backing up. When backing up across the LAN, however, they couldn’t exactly match permissions, etc, of HFS+ on an AFP or SMB share. So they create an image on that share that mimics a real filesystem there. I don’t know.. I still think it’s a copout, but that’s the best explanation I can come up with.
Why not use rsnapshot, flyback, timevault, rdiff-backup, etc?
There are two reasons for that.
The first reason is that all those tools are built to run on a Linux server. I can’t find any made to run on Mac. Although my primary server is Linux, I needed my backup to NOT be on Linux (due to using an unsupported 8TB LUN on my Drobo). This way I could put my other drobo on one of my Mac machines and back up to it. In addition, this works for either all-Mac or Mac/Windows environs that don’t have Linux.
The second reason is similar to the first. Once you start talking about the Mac as a client, they start talking about loading this package or that package (e.g. Darwin Ports), and I thought that was beyond the capabilities of many users. Some of the ported versions were also out of date. This script, albeit simple, runs on a base Mac OS X machine (although I’ve only tested it on Snow Leopard; there could be older versions of rsync that it won’t work with).
Using rsync and hard links to mimic Time Machine
With rsync (which is built into Mac OS, Linux, and FreeBSD, and can be installed on Windows) and a little shell scripting, you can mimic everything but the GUI and you don’t need no stinking sparse files. I even wrote a little script that I’ll share with you at the end of this post.
There are a few things you need to do to make this all work.
- You need a backup drive that’s big enough to hold a full backup and many days or weeks of changes. You can get a 1-2TB USB drive for $100-$200. Just do it.
- Pick a Linux/MacOS/FreeBSD/Unix machine to act as the backup server. It doesn’t need to be dedicated, as the drive will be external. Even if you loaded it on cygwin on an NTFS machine, it won’t work until you take out the use of the Latest symbolic link, as NTFS doesn’t have those. (It has shortcuts, which are close, but no cigar.)
- If you’re using a Mac backup server, format that drive as an HFS+ drive that enforces permissions. I believe that’s the default, but make sure. To test the latter, try to use it as a backup drive for Time Machine. If it likes it, you’re good to go.
- If you’re using Linux, FreeBSD, Unix, your favorite Unix-savvy filesystem will do (e.g. ext, ufs — no FAT or NTFS).
- Make sure you don’t have conflicting user ids on the systems you’re backing up. Either use a centralized LDAP-like setup, or synchronize all your user IDs across multiple computers. Otherwise I’m not sure what will happen. I’ve included the --numeric-ids argument to rsync that should address this, but I’m not sure.
- Make sure that the “backup server” can ssh as root to the machines you’re going to back up without a password. On Macs, you’ll need to enable remote login, and you’ll need to follow these directions for how to set up ssh & public/private key authentication. If one of the clients you want to back up is a Windows client, download cygwin and install and configure the sshd setup so you can ssh to it as Administrator with no password.
- Determine a max size that you want the backup setup to grow to. Usually this will be a value just shy of a full filesystem. I, however, have a thin-provisioned Drobo where the output of a df -k shows that I have far more space than I actually have (I have a 16TB LUN and only 3.5TB of disk in the Drobo), so I need to pick a value that matches how much actual disk I’ve put in the system.
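Two of the items above are easy to check from the command line before you run anything. This is only a sketch: the names `backups`, `used_kb`, and `check_ssh` are made up for illustration, and `/` stands in for your real backup volume.

```shell
#!/bin/sh
# Stand-in for the backup volume; point this at your real one.
backups="/"

# -P forces the POSIX one-line-per-filesystem layout so the columns do not
# wrap; column 3 of the second line is the "used" figure in KB.
used_kb=$(df -Pk "$backups" | awk 'NR==2 {print $3}')
echo "used on $backups: $used_kb KB"

# check_ssh USER@HOST: succeed only if key-based login works.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example (hypothetical client name):
#   check_ssh root@othermac && echo "passwordless ssh works"
```

The used figure gives you a baseline for choosing a max size. On a thin-provisioned device the total and available columns are the misleading ones, so base the limit on the disk you have actually installed.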
Download this shell script and change the three values at the top:
maxsize="1234567" # A value in KB of how large the backup directory is allowed to grow before you delete older backups
The entries on the next few lines will be used as the source directory argument for each iteration of rsync. The following are a few examples. Read the man page for more
I’m sorry for the clumsiness of that last config item. I couldn’t get Mac’s bourne shell to do the right thing, so I have to echo to an external file and read it later.
Once you’ve changed those values, you can run the script AT YOUR OWN RISK DON’T SUE ME IF YOU BLOW UP YOUR WHOLE SYSTEM! As long as the backup directory is writeable, and you specified the source directories correctly, it should then create something that looks a whole lot like Time Machine.
- First it takes the source path name and gets rid of all the characters that aren’t good in a filename (e.g. :, /, etc) and replaces them with dashes (and deletes any leading or trailing dashes). It then creates a directory in the $backups dir with that name. For example, root@othermac:/ becomes root-othermac. In that directory, it will create a directory starting with the naming convention Backup.
.inProgress and will rsync to that directory. The first backup can take quite some time. If the backup is successful, it removes the .inProgress part of the name and creates a symbolic link to Latest.
- The next time it backs up, it tells rsync (using the --link-dest argument) to compare the files it’s supposed to back up with the backup in the Latest directory. If the file is the same, it makes a hard link from the Latest directory to the .inProgress directory. If the file is new, it is transferred from the source to the new directory. Again, if the backup succeeds, it removes the inProgress part and changes the Latest link to point to the latest backup.
- Every time it runs it compares the “used” value of the filesystem (df -k) to what you specified as the maxsize for the backup directory. If it ever exceeds that it starts removing the oldest backups until it’s below that size. However, it will never delete the newest directory. It’ll just complain and quit.
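The naming and --link-dest steps above can be sketched in a few lines of shell. This is not the downloadable script, just a minimal illustration under some assumptions: the function names `sanitize_name` and `snapshot` are made up, the set of characters treated as “good” is a guess, runs of dashes are collapsed to one, and the snapshot directories are named with a plain timestamp rather than the script’s own naming convention.

```shell
#!/bin/sh
# sanitize_name SRC: turn an rsync source such as root@othermac:/ into a
# directory name. Anything outside letters, digits, dot, and underscore
# becomes a dash; runs of dashes are collapsed; edge dashes are trimmed.
sanitize_name() {
    printf '%s' "$1" |
        sed -e 's/[^A-Za-z0-9._]/-/g' -e 's/--*/-/g' -e 's/^-//' -e 's/-$//'
}

# snapshot SRC BACKUPS: one backup pass of SRC into BACKUPS/<name>/<stamp>.
snapshot() {
    src=$1
    backups=$2
    dest="$backups/$(sanitize_name "$src")"
    stamp=$(date +%Y-%m-%d-%H%M%S)
    work="$dest/$stamp.inProgress"

    if [ -e "$dest/$stamp" ]; then
        echo "snapshot $stamp already exists" >&2
        return 1
    fi
    mkdir -p "$work" || return 1

    if [ -d "$dest/Latest" ]; then
        # Unchanged files become hard links into the previous snapshot.
        rsync -a --numeric-ids --link-dest="$dest/Latest" "$src" "$work/" || return 1
    else
        # First run: nothing to link against, so this is the "full".
        rsync -a --numeric-ids "$src" "$work/" || return 1
    fi

    # Success: drop the .inProgress suffix and repoint Latest.
    mv "$work" "$dest/$stamp" || return 1
    rm -f "$dest/Latest"
    ln -s "$stamp" "$dest/Latest"
}

# Example (hypothetical host and volume):
#   snapshot root@othermac:/ /Volumes/Drobo
```

If rsync fails, the .inProgress directory is left behind and Latest still points at the last good backup, so a broken run never becomes the basis for the next one.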
If you’re not familiar with Unix and hard links you may be wondering what happens when it deletes that first “full” backup. Don’t the files go away? Nope. That’s the beauty of hard links. All that happens is that the link count is dropped by one. Even if you deleted every directory except for the most recent version, you’d still have all the files you need and the link count for all those files would be down to 1.
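Here is a small sketch that combines the pruning rule with the link-count behavior, so you can watch it happen. The function names `used_kb` and `prune` are hypothetical, the snapshots are faked with ln rather than created by rsync, and a max size of 0 KB is used only to force pruning in the demo.

```shell
#!/bin/sh
# used_kb DIR: the "used" column of df -k for the filesystem holding DIR.
used_kb() {
    df -Pk "$1" | awk 'NR==2 {print $3}'
}

# prune DEST MAXSIZE_KB: remove the oldest date-stamped snapshot while the
# filesystem is over MAXSIZE_KB, but refuse to remove the last one.
prune() {
    dest=$1
    maxsize=$2
    while [ "$(used_kb "$dest")" -gt "$maxsize" ]; do
        # Date-stamped names sort oldest first; skip Latest and .inProgress.
        snapshots=$(ls "$dest" | grep '^[0-9]' | grep -v 'inProgress$')
        count=$(printf '%s\n' "$snapshots" | grep -c .)
        if [ "$count" -le 1 ]; then
            echo "still over maxsize, but will not delete the newest backup" >&2
            return 1
        fi
        oldest=$(printf '%s\n' "$snapshots" | head -n 1)
        rm -rf "${dest:?}/$oldest"
    done
}

# Demo: two fake snapshots sharing one file through a hard link.
root=$(mktemp -d)
mkdir "$root/2009-12-10-103835" "$root/2009-12-17-141618"
echo "precious data" > "$root/2009-12-10-103835/file"
ln "$root/2009-12-10-103835/file" "$root/2009-12-17-141618/file"
before=$(ls -l "$root/2009-12-17-141618/file" | awk '{print $2}')

# A maxsize of 0 KB forces pruning until only the newest snapshot is left.
prune "$root" 0 || echo "prune stopped at the newest backup"

after=$(ls -l "$root/2009-12-17-141618/file" | awk '{print $2}')
echo "link count $before -> $after: $(cat "$root/2009-12-17-141618/file")"
```

The file was first written into the older snapshot, yet it is still readable from the newer one after the older directory is gone. Only the link count changes.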
I’m putting it in cron to run once a day. I can put how to do that in here, but I’m tired.
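For anyone who wants a starting point, a daily cron entry might look something like the sketch below. The script path, the log path, and the 2:30 a.m. start time are all assumptions; use wherever you saved the script and whatever time suits you. The snippet only prints the line so you can look at it before installing it.

```shell
#!/bin/sh
# Hypothetical locations; substitute your own.
script="/usr/local/bin/rsync-backup.sh"
log="/var/log/rsync-backup.log"

# Fields: minute hour day-of-month month day-of-week command
cron_line="30 2 * * * $script >> $log 2>&1"
echo "$cron_line"

# To install it, append the line to root's crontab:
#   (sudo crontab -l 2>/dev/null; echo "$cron_line") | sudo crontab -
```

It goes in root’s crontab because the backup server needs to ssh to the clients as root.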
Restoring with these backups
All you have to do is have a running system that has rsync on it and you can restore to it. Suppose you had been backing up the / drive from a client named elvis, but needed to restore just the /Users directory, and your backup volume was /Volumes/Drobo. The source argument would have been root@elvis:/, which would have been changed to root-elvis, so our total backup directory to use for restores would be /Volumes/Drobo/root-elvis. To restore the /Users directory from that directory back to its original location, issue the following command from the backup server:
rsync -avP --numeric-ids /Volumes/Drobo/root-elvis/Users/ root@elvis:/Users
The -a (short for archive) tells it to copy recursively and preserve permissions, ownership, timestamps, links, etc. Note that -a by itself does not cover resource forks or extended attributes; the rsync that Apple bundles with Mac OS X has a separate -E flag for those. The -v tells it to be verbose, and the -P tells it to show progress when copying large files. The --numeric-ids tells it to not map user ids to usernames and just use the numeric id, which could be very helpful if you’ve got conflicting user ids.
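Before running a restore for real, it can be worth previewing it. The sketch below wraps rsync’s -n (--dry-run) flag in a made-up helper called `preview_restore`; it lists what would be transferred without writing anything.

```shell
#!/bin/sh
# preview_restore SRC DEST: show what a restore would copy, change nothing.
# -n is rsync's --dry-run; the other flags match the restore command above.
preview_restore() {
    rsync -avn --numeric-ids "$1" "$2"
}

# Example, using the paths from the elvis restore above:
#   preview_restore /Volumes/Drobo/root-elvis/Users/ root@elvis:/Users
```

If the list looks right, rerun the same command with -P in place of -n to do the actual copy.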
You can also use network shares and drag and drop any files you want as well, but the rsync method will give you the truest restore of permissions, etc. It should also be able to restore a system drive. Boot the Mac off the Mac OS CD. Using the disk utility, prepare the system drive for the restore by giving it a filesystem and mounting it. Assuming your backup server is apollo, and the directory structure is the same as the one above, and assuming you’ve mounted the root drive on /Volumes/mnt, the command to run from the Mac to be restored would be:
sudo rsync -avP --numeric-ids root@apollo:/Volumes/Drobo/root-elvis/ /Volumes/mnt
Make the drive bootable:
sudo bless --verbose --folder "/Volumes/mnt/System/Library/CoreServices" --bootinfo
Mind you, I haven’t tested this last part, but it matches other procedures I’ve seen.
OK, it’s not as cool looking as Time Machine (and you really need to learn rsync to restore), but it handles my Linux box and my Macs. It doesn’t use those stupid sparsebundle files that get corrupted occasionally and doesn’t use the extra space that keeping multiple copies of those files would use. It can be used to recover both files and the system drive. I’m happy.
I am totally open to how to make this better, but I’m pretty happy with it right now.
BTW, this is version .9 and has no reporting and very little error handling. I literally started running it for the first time yesterday. But I’m swamped right now and don’t have time to add that stuff. Anyone want to contribute?
This also means that I’m actively working on this script. If you actually use it, you ought to check back for updated versions. (I’ve made five more changes just today.)