I'm running rsnapshot 1.2.9 happily for the most part, but occasionally I
run into a problem that keeps messing things up. I've got lazy_delete
turned on, so that the lockfile is removed before running the delete of
any old dirs.
Sometimes the sync finishes in time for the next cron job to start, but
the hourly.delete directory isn't fully removed (or daily.delete or
whatever) by the time the next task starts up. In that case, some code at
line 2612 is triggered:

    if (1 == $exists) {
        # if we found any leftover directories, delete them now before they pile up and cause problems
        # this is a directory
        if (0 == $is_file) {
            display_rm_rf("$config_vars{'snapshot_root'}/$interval.delete/");
            if (0 == $test) {
                $result = rm_rf("$config_vars{'snapshot_root'}/$$id_ref{'interval'}.delete/" );
                if (0 == $result) {
                    bail("Error! rm_rf(\"$config_vars{'snapshot_root'}/$interval.delete/\")");
                }
            }

In other words, if we start up and see an interval.delete directory still
sitting around undeleted, we try to delete it before continuing. That is
fine, even though the other task may still be working on the exact same
deletion. The problem is that the three calls do not build the path the
same way. One line says this:
    display_rm_rf("$config_vars{'snapshot_root'}/$interval.delete/");
Another says:
    $result = rm_rf("$config_vars{'snapshot_root'}/$$id_ref{'interval'}.delete/" );
And a third says:
    bail("Error! rm_rf(\"$config_vars{'snapshot_root'}/$interval.delete/\")");
In my log files, I'm seeing that it is trying to delete the correct path
($interval.delete) but I keep getting errors from the actual command being
executed (deleting $$id_ref{'interval'}.delete) like this:

    [28/Dec/2006:23:45:00] /bin/rm -rf /usr/.snapshots/hourly.delete/
    [28/Dec/2006:23:45:00] /usr/local/bin/rsnapshot daily: ERROR: cmd_rm_rf() needs a valid file path as an argument
    [28/Dec/2006:23:45:00] /usr/local/bin/rsnapshot daily: ERROR: Error! rm_rf("/usr/.snapshots/hourly.delete/")

The paths it shows are correct, but the command that actually runs is
apparently getting a different argument and failing. Should all three use
$interval.delete, or should they all use $$id_ref{'interval'}.delete? Is
$id_ref properly set at this point in execution?
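
To show the kind of fix I have in mind, here is a minimal sketch that builds the path once and reuses it for the log line, the delete, and the error message, so the three can never disagree. The helper name delete_dir_path and the standalone %config_vars hash are my own stand-ins for illustration, not code from rsnapshot itself.

```perl
use strict;
use warnings;

# Stand-in for rsnapshot's parsed config (assumed value, taken from my logs).
my %config_vars = (snapshot_root => '/usr/.snapshots');

# Hypothetical helper: build the leftover-directory path in exactly one place.
sub delete_dir_path {
    my ($interval) = @_;
    die "interval must be a non-empty string\n"
        unless defined $interval && length $interval;
    return "$config_vars{'snapshot_root'}/$interval.delete/";
}

# The same string would then feed display_rm_rf(), rm_rf() and bail().
my $path = delete_dir_path('hourly');
print "would log:    /bin/rm -rf $path\n";
print "would remove: $path\n";
```

Refusing an empty or undefined interval up front would also turn a silent mismatch into an obvious error at the point where the path is built.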
This brings me to the other problem I've been noticing. When lazy_delete
is turned on, remove_lockfile() is called at line 2740. Then the delete
runs, and handle_interval() returns. This puts us back at line 261, and
on line 263 remove_lockfile() is called again. If the next invocation
starts up while the delete is happening (the major point of lazy_delete),
then we remove a lockfile that doesn't belong to us. That leaves the new
invocation running without a lockfile, so yet another invocation can
start up before it is done (exactly what the lockfile should have
prevented).
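
The sequence is easier to see in a small simulation. This is only a sketch: the PIDs 1001 and 1002, the temporary lockfile, and the write_lock/remove_lock helpers are made up for the demonstration and are not rsnapshot's own functions.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir      = tempdir(CLEANUP => 1);
my $lockfile = "$dir/rsnapshot.pid";

# Hypothetical stand-in: record the owning PID in the lockfile.
sub write_lock {
    my ($pid) = @_;
    open(my $fh, '>', $lockfile) or die "cannot write $lockfile: $!";
    print {$fh} "$pid\n";
    close($fh) or die "cannot close $lockfile: $!";
    return;
}

# Hypothetical stand-in: remove the lockfile without checking who owns it.
sub remove_lock {
    return unlink($lockfile);
}

write_lock(1001);   # run A starts and takes the lock
remove_lock();      # run A: lazy_delete releases the lock before deleting
write_lock(1002);   # run B starts while A is still deleting
remove_lock();      # run A returns and removes the lock a second time

my $state = -e $lockfile
    ? "lockfile present"
    : "lockfile gone, run B is unprotected";
print "$state\n";
```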
I can think of two ways to resolve this. They both make sense to me, but I
want to see what others think and submit a fix for the next release of
rsnapshot. One way would be to have remove_lockfile() make sure that the
pid in the lockfile is my own, otherwise, don't delete it. Another way
would be to check the lazy_delete flag at line 263, and don't call
remove_lockfile a second time if lazy_delete is set. What is your
preference?
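
For the first option, something along these lines is what I am picturing. It is a rough sketch, assuming the lockfile contains nothing but the owner's PID; remove_lockfile_if_owned and write_pid are hypothetical names, not the existing remove_lockfile().

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: remove the lockfile only if it records our PID.
# Returns 1 if the lockfile was removed, 0 if it was left alone.
sub remove_lockfile_if_owned {
    my ($lockfile, $my_pid) = @_;
    $my_pid = $$ unless defined $my_pid;

    return 0 unless -e $lockfile;             # nothing to remove

    open(my $fh, '<', $lockfile) or return 0; # unreadable: leave it alone
    my $owner = <$fh>;
    close($fh);

    $owner = '' unless defined $owner;
    $owner =~ s/\s+//g;                       # strip newline and spaces

    # Leave the lockfile alone unless it holds exactly our own PID.
    return 0 unless $owner =~ /^\d+$/ && $owner == $my_pid;

    return unlink($lockfile) ? 1 : 0;
}

# Usage: a lock written by another (made-up) PID survives, ours is removed.
my $dir  = tempdir(CLEANUP => 1);
my $lock = "$dir/rsnapshot.pid";

sub write_pid {
    my ($file, $pid) = @_;
    open(my $fh, '>', $file) or die "cannot write $file: $!";
    print {$fh} "$pid\n";
    close($fh) or die "cannot close $file: $!";
    return;
}

write_pid($lock, $$ + 1);
my $foreign = remove_lockfile_if_owned($lock);
write_pid($lock, $$);
my $own = remove_lockfile_if_owned($lock);
print "foreign lock removed: $foreign, own lock removed: $own\n";
```

There is still a small window between reading the PID and unlinking the file, so this narrows the race rather than closing it completely. The second option (skipping the call at line 263 when lazy_delete is set) avoids that window but relies on every exit path behaving the same way.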
Thanks for your help with these issues. I'm extremely grateful for
rsnapshot and all the help it has been for me.
Thanks,
Mac
--
Mac Newbold                     Code Greene, LLC
                                1440 S. Foothill Dr. Suite #250
Office: 801-438-0142            Salt Lake City, UT 84108
Cell: 801-694-6334              www.codegreene.com
_______________________________________________
rsnapshot-discuss mailing list
rsnapshot-discuss < at > lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss
