Hi.
Sorry for the rudeness in the summary.
rsnapshot is of course very usable, and it handles backup jobs for heaps
of people without problems!
I've been looking for a sane backup-tool to do my incremental backups to
a remote machine...
rsnapshot seemed perfect at first sight:
- rsync based
- daily, monthly, etc backups, with automatic rotation!
- hard linking to save space
- instant browsability of backups
- simple config files (I used the command line option to specify a
custom config file, and include_conf in that file so I could specify
different snapshot_root's for each "backup job")
- widely used, proven to work
- good documentation
But...
The way rsnapshot only syncs to the top-most interval/retain config
directory (e.g. "daily"), while all the subsequent snapshot jobs (e.g.
"weekly"/"monthly") depend on the first one in this list, is really
flawed, confusing and error prone in my opinion.
If you run "rsnapshot weekly" with "retain daily 3" at the top of the
config file, you'll get a weekly backup _only_ if the oldest daily
snapshot (daily.2) exists, and regardless of when it was created.
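
To make that dependency concrete, here is a simplified shell model of
the promotion step. It is only a sketch of the behaviour described
above, not rsnapshot's actual code: it assumes "retain weekly 4", and
promote_weekly and its two arguments are made-up names. The daily slot
that feeds weekly.0 is passed in rather than hard-coded.

```shell
#!/bin/bash
# Simplified model of the promotion step, NOT rsnapshot's actual code.
# Assumes "retain weekly 4"; promote_weekly and its arguments are made up.
promote_weekly() {
    local root=$1          # the snapshot_root directory
    local from=$2          # daily slot that feeds weekly.0, e.g. daily.2
    local i

    # Nothing to promote: no new weekly.0 appears, whatever the calendar says.
    if [ ! -d "$root/$from" ]; then
        echo "no $root/$from: no new weekly.0 is created"
        return 0
    fi

    # Make room: drop the oldest weekly, shift the others up by one.
    rm -rf "$root/weekly.3"
    for i in 2 1 0; do
        if [ -d "$root/weekly.$i" ]; then
            mv "$root/weekly.$i" "$root/weekly.$((i + 1))"
        fi
    done

    # The daily snapshot is moved as-is; its age is never checked.
    mv "$root/$from" "$root/weekly.0"
}
```

Called as "promote_weekly /some/snapshot_root daily.2", this creates
weekly.0 from whatever is sitting in that daily slot, however old it is.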
If you no longer need daily backups, and decide to comment out
"rsnapshot daily" in cron, while forgetting to comment out "retain daily
NUM" in the config file, then your backup plan is completely borked!
In addition to this, it also seems like it is very important to run the
cron daily/weekly/monthly jobs in the correct order, and with a
(uncertain) delay between each of them, in order for rsnapshot to work
correctly.
I also dislike the way rsnapshot forces you to separate arguments with
<TAB> in the config file.
Especially because my editor (vim) expands all tabs to 4 spaces. It
would be much better if rsnapshot required quotes around the config
arguments that contain spaces.
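
Until then, a small check like the one below can catch the problem
before a backup run. It is a rough sketch: check_tabs is a made-up
helper, and it assumes that every non-blank, non-comment line in the
config needs at least one tab between the parameter and its value.

```shell
#!/bin/bash
# Hypothetical helper: print config lines that contain no tab at all.
# Assumption: every non-blank, non-comment line needs at least one tab.
check_tabs() {
    local conf=$1
    local tab
    tab=$(printf '\t')
    # First grep drops blank lines and comments and prefixes the rest
    # with "line-number:"; the second keeps only lines without a tab.
    if grep -n -v -E '^[[:space:]]*(#|$)' "$conf" | grep -v "$tab"; then
        return 1    # at least one suspicious line was printed
    fi
    return 0
}
```

Run it as "check_tabs htpc.conf". rsnapshot's own "rsnapshot configtest"
is still the authoritative check, and in vim ":set noexpandtab" stops
the tab expansion for the current buffer.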
I hope I'm not offending anyone, but large portions of the rsnapshot
code also seem old and outdated (at least at first sight), and would
probably benefit a lot from some simple refactoring, cleanup and
abstraction. I'm blaming this mostly on the age of the Perl code
(2003-2005).
And..
Why should you have to specify "retain daily 3", "retain weekly 4", and
so on, in the config when you'll have to create separate cron jobs for
all of these anyway?
It doesn't make sense.
It would be better to specify this on the command line. Something like:

    rsnapshot daily 3
    rsnapshot weekly 4

Or even (the way I'd like it):

    rsnapshot -c htpc.conf -s daily -k 3
    rsnapshot -c htpc.conf -s weekly -k 4
    rsnapshot -c laptop.conf -s daily -k 30
    rsnapshot -c laptop.conf -s weekly -k 6

-c (above) would obviously be the config file to use (containing stuff
like $SNAPSHOT_ROOT, $SNAPSHOT_NAME, $DESTINATION)
-s would specify the $SNAPSHOT_SUBNAME (e.g. daily/weekly/monthly)
-k would specify the number of backups to keep for
$SNAPSHOT_NAME.$SNAPSHOT_SUBNAME* in $SNAPSHOT_ROOT
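
Parsing those three options in bash could look roughly like this. It is
a sketch only: parse_args and the CONFIG variable are made-up names, and
just the option letters -c, -s and -k come from the proposal above.

```shell
#!/bin/bash
# Sketch of the proposed -c/-s/-k interface using the getopts builtin.
# parse_args and CONFIG are hypothetical names.
parse_args() {
    local opt
    OPTIND=1                      # restart option parsing on every call
    CONFIG=
    SNAPSHOT_SUBNAME=
    KEEP=
    while getopts "c:s:k:" opt; do
        case $opt in
            c) CONFIG=$OPTARG ;;
            s) SNAPSHOT_SUBNAME=$OPTARG ;;
            k) KEEP=$OPTARG ;;
            *) return 2 ;;        # unknown option or missing argument
        esac
    done
    if [ -z "$CONFIG" ] || [ -z "$SNAPSHOT_SUBNAME" ]; then
        echo "usage: rsnapshot-ng.sh -c CONFIG -s SUBNAME [-k KEEP]" >&2
        return 2
    fi
    # -k is optional, but if given it must be digits only.
    case $KEEP in
        *[!0-9]*) echo "-k must be a non-negative integer" >&2; return 2 ;;
    esac
    return 0
}
```

A script would call it as 'parse_args "$@" || exit 2' and then read
$CONFIG, $SNAPSHOT_SUBNAME and $KEEP.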
I created a bash script which serves as a proof of concept for what
I'd like to do.
It doesn't support different config files, and it doesn't parse command
line arguments like in the example above, but it has (imo) a much better
algorithm for dealing with incremental backups!
To create some dirs and files for testing my script, run:

    mkdir -p ~/rsnapshot-ng/source
    mkdir -p ~/rsnapshot-ng/source/somedir
    touch ~/rsnapshot-ng/source/file1
    touch ~/rsnapshot-ng/source/somedir/file2
    mkdir -p ~/rsnapshot-ng/dest
    cd ~/rsnapshot-ng/

Then copy the included script (at the bottom of this mail) to
~/rsnapshot-ng/rsnapshot-ng.sh, and run e.g.:

    bash rsnapshot-ng.sh daily 2
    bash rsnapshot-ng.sh weekly 3

You should now have directories similar to the following, in
~/rsnapshot-ng/dest:

    laptop.daily.201204040013
    laptop.latest
    laptop.weekly.201204040014

If you take a look at the script, you'll see that every backup job
writes to $SNAPSHOT_ROOT/$SNAPSHOT_NAME.$SNAPSHOT_SUBNAME.$DATE (using
$SNAPSHOT_ROOT/$SNAPSHOT_NAME.latest as the --link-dest DIR (rsync
specific...)), then removes $SNAPSHOT_ROOT/$SNAPSHOT_NAME.latest, before
copying the latest backup dir to $SNAPSHOT_ROOT/$SNAPSHOT_NAME.latest
with hardlinks.
Lastly it ensures that only the specified number of backups are kept
based on the $KEEP argument.
This solution ensures that the daily/weekly/monthly/etc backups are
actually up-to-date when created, and it also avoids most of the config
peculiarities you'll have with rsnapshot.
There are still race condition issues when running multiple backup jobs
simultaneously, but I'm sure that could be solved with some thought.
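
One simple way to get there would be to let only one job run at a time.
The sketch below does that with mkdir, which either creates the
directory or fails, so two jobs can never both get the lock. with_lock
and the lock path are made-up names and not part of rsnapshot-ng.sh.

```shell
#!/bin/bash
# Sketch: serialize backup jobs with an atomic mkdir lock.
# with_lock and the lock directory are hypothetical names.
with_lock() {
    local lockdir=$1
    local status
    shift
    # mkdir cannot half-succeed, so only one job gets past this line.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another backup job holds $lockdir, giving up" >&2
        return 1
    fi
    "$@"                 # run the wrapped command
    status=$?
    rmdir "$lockdir"     # release the lock, then report the job's status
    return $status
}
```

Usage would be something like
"with_lock ~/rsnapshot-ng/dest/.lock bash rsnapshot-ng.sh daily 2".
If a job gets killed, the lock directory stays behind and has to be
removed by hand, so this is a starting point rather than a full
solution.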
What do you guys think?
The actual script, rsnapshot-ng.sh (I threw this together, so bugs or
stupidness might occur):
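#!/bin/bash
# Usage: bash rsnapshot-ng.sh SUBNAME [KEEP]    (e.g. "daily 2")
SOURCE=~/rsnapshot-ng/source
SNAPSHOT_ROOT=~/rsnapshot-ng/dest
SNAPSHOT_NAME=laptop
SNAPSHOT_SUBNAME=$1
KEEP=$2
DATE=$(date +%Y%m%d%H%M)
DESTINATION="$SNAPSHOT_ROOT/$SNAPSHOT_NAME.$SNAPSHOT_SUBNAME.$DATE"
LATEST="$SNAPSHOT_NAME.latest"
if [ -z "$SNAPSHOT_SUBNAME" ]; then
    echo "usage: $0 SUBNAME [KEEP]" >&2
    exit 1
fi
# Sync into a new dated directory, hard-linking unchanged files against
# the previous backup (--link-dest is relative to $DESTINATION).
# Stop here if rsync fails, so a broken run never replaces "latest".
rsync -v -a --delete-excluded --delete --link-dest="../$LATEST" \
    "$SOURCE" "$DESTINATION" || exit 1
# Replace "latest" with a hard-linked copy of the backup just made.
# -f keeps the first run quiet when "latest" does not exist yet.
rm -rf "$SNAPSHOT_ROOT/$LATEST"
cp -al "$DESTINATION" "$SNAPSHOT_ROOT/$LATEST"
# Keep only the $KEEP newest directories matching
# $SNAPSHOT_NAME.$SNAPSHOT_SUBNAME* in $SNAPSHOT_ROOT
# (xargs -r: do not run rm when there is nothing to delete)
if [ -n "$KEEP" ]; then
    /bin/ls -dt1 "$SNAPSHOT_ROOT/$SNAPSHOT_NAME.$SNAPSHOT_SUBNAME"* |
        tail -n +$((KEEP + 1)) | xargs -r rm -r
fi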
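
The pruning step above picks the newest directories by modification
time (ls -t). Since the directory names already carry a fixed-width
timestamp, sorting by name might be more robust. A possible variant is
sketched below; prune_snapshots is a made-up name, and it assumes the
12-digit $DATE suffix used by the script.

```shell
#!/bin/bash
# Hypothetical alternative to the "ls -t | tail | xargs" pipeline.
# Orders snapshots by the timestamp in their name instead of by mtime.
prune_snapshots() {
    local prefix=$1      # e.g. "$SNAPSHOT_ROOT/laptop.daily"
    local keep=$2        # number of newest snapshots to keep
    local dir excess

    # Glob results are sorted by name, so the oldest timestamp is first.
    set -- "$prefix".[0-9]*
    [ -e "$1" ] || return 0          # the glob matched nothing

    # Delete from the front of the list until only $keep remain.
    excess=$(( $# - keep ))
    for dir in "$@"; do
        [ "$excess" -gt 0 ] || break
        rm -rf -- "$dir"
        excess=$(( excess - 1 ))
    done
}
```

In the script it would be called as
'prune_snapshots "$SNAPSHOT_ROOT/$SNAPSHOT_NAME.$SNAPSHOT_SUBNAME" "$KEEP"'
inside the existing 'if [ -n "$KEEP" ]' check.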
