Delete Enough To Finish
Post Delete Enough To Finish 
I saw the Delete Enough To Finish page listed on the Wiki. It seems
this feature could be implemented as follows.

Rdiff-backup will work and work, filling up the hard drive.
Rdiff-backup will catch not-enough-space errors using a try...catch type
block. When such an error occurs, it will do the following:
1. Delete ONE increment, using the already-built increment deletion
feature.
2. Try again
3. If still not enough space, delete another increment, etc...

The increment deletion would be configured to not delete too much; for
example, to leave at least two increments, or at least 1 week, etc. If
there is not enough space and no more increments can be deleted, then
rdiff-backup can declare the hard drive to be full.
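
A minimal sketch of the loop described above, in Python since that is what rdiff-backup is written in. Everything here is an assumption made for illustration: `run_with_pruning` and its three hooks (`do_backup`, `list_increments`, `delete_increment`) are hypothetical names, not rdiff-backup functions, and the sketch assumes a full disk surfaces as an `OSError` with `errno.ENOSPC`.

```python
import errno


def run_with_pruning(do_backup, list_increments, delete_increment, min_keep=2):
    """Retry do_backup(), deleting the oldest increment whenever the disk fills up.

    All three callables are hypothetical hooks supplied by the caller;
    none of them is a real rdiff-backup API.
    """
    while True:
        try:
            return do_backup()
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise  # some other I/O problem: do not touch any increments
            increments = sorted(list_increments())  # oldest first
            if len(increments) <= min_keep:
                raise  # retention floor reached: the drive really is full
            delete_increment(increments[0])  # delete ONE increment, then try again
```

The retention floor is expressed here as a count (`min_keep`); a "keep at least 1 week" rule would replace that one comparison with a check on the age of the oldest increment.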

Is there any reason this could not be implemented?

-- Bob

Post Delete Enough To Finish 
Bob Fischer <bob.fischer17 < at > earthlink.net> writes:
> Rdiff-backup will catch not-enough-space errors using a try...catch type
> block. When such an error occurs, it will do the following:
> 1. Delete ONE increment, using the already-built increment deletion
> feature.
> 2. Try again
> 3. If still not enough space, delete another increment, etc...
>
> The increment deletion would be configured to not delete too much; for
> example, to leave at least two increments, or at least 1 week, etc. If
> there is not enough space and no more increments can be deleted, then
> rdiff-backup can declare the hard drive to be full.
>
> Is there any reason this could not be implemented?

Seems pretty practical; might not work for everyone,
but would be useful for many.

The most obvious concern is that one overly large file would cause
all the backups to be pared down to the minimum retain value and
then fail anyway, but assuming one sets a minimum retain value
correctly, I suppose this wouldn't be deadly.

It occurs to me that if rdiff-backup were given the ability
to blow away stuff in order to make space,
then if the metadata-diddling features from the wishlist were available,
it could be given a pretty complex combination of increment
and file priorities for space management;
i.e.,
first blow away all copies of the big file "foo" except for 2,
then all copies of the not-quite-as-big file "bar" except for
the past three months,
then blow away increments retaining at least five months' worth.
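
One way to picture such a priority list is as an ordered set of rules that each nominate expendable versions, consumed in order until enough space is freed. This is only a sketch under loud assumptions: rdiff-backup has no per-file pruning like this, and `Version`, `keep_newest`, `keep_younger_than` and `plan_deletions` are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Version:
    path: str      # file this stored version belongs to
    age_days: int  # how old the stored version is
    size: int      # bytes it occupies in the repository


def keep_newest(path, count):
    """Rule: for `path`, everything except the `count` newest versions is expendable."""
    def rule(versions):
        mine = sorted((v for v in versions if v.path == path), key=lambda v: v.age_days)
        return mine[count:]
    return rule


def keep_younger_than(days, path=None):
    """Rule: versions older than `days` are expendable (optionally only those of `path`)."""
    def rule(versions):
        return [v for v in versions
                if v.age_days > days and (path is None or v.path == path)]
    return rule


def plan_deletions(versions, rules, bytes_needed):
    """Apply rules in priority order, oldest victims first, until enough bytes are freed."""
    remaining = list(versions)
    victims, freed = [], 0
    for rule in rules:
        for v in sorted(rule(remaining), key=lambda v: v.age_days, reverse=True):
            if freed >= bytes_needed:
                return victims, freed
            victims.append(v)
            remaining.remove(v)
            freed += v.size
    return victims, freed  # caller must check whether freed >= bytes_needed
```

The example above would then read roughly as `[keep_newest("foo", 2), keep_younger_than(90, "bar"), keep_younger_than(150)]`, taking three and five months as about 90 and 150 days.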

It has just occurred to me that rdiff-backup may not have to
in fact perform the backup in order to figure out how much space
it would require; it only needs to do all the work involved in making
the backup and keep count of the results. This would be a trade
of lots of cpu and bandwidth for more precise space management.
It would still have to do the try...catch iteration in case more
files were created during the run, and I don't know the
rdiff algorithm well enough to know if it would need much scratch
space.
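
The "do the work but only keep count" idea could be sketched as a write-only sink that discards the data and tallies the bytes, followed by a free-space check. This is an illustration, not how rdiff-backup is structured: `CountingSink`, `estimate_space`, `would_fit` and the `write_item` hook are hypothetical, and the sketch ignores scratch space and filesystem overhead.

```python
import io
import shutil


class CountingSink(io.RawIOBase):
    """Write-only file object that throws data away and counts the bytes it was given."""

    def __init__(self):
        super().__init__()
        self.total = 0

    def writable(self):
        return True

    def write(self, data):
        self.total += len(data)
        return len(data)


def estimate_space(items, write_item):
    """Run write_item(item, sink) for each item; return the bytes that would be written.

    `write_item` is a hypothetical hook standing in for whatever produces
    the diff or copy of one file.
    """
    sink = CountingSink()
    for item in items:
        write_item(item, sink)
    return sink.total


def would_fit(bytes_needed, dest_dir):
    """Compare the estimate with the free space currently reported for dest_dir."""
    return bytes_needed <= shutil.disk_usage(dest_dir).free
```

Because the free-space figure can change while the real backup runs, this only narrows the odds; the retry loop would still be needed as a fallback.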

--akb

Post Delete Enough To Finish 
> The most obvious concern is that one overly large file would cause
> all the backups to be pared down to the minimum retain value and
> then fail anyway...

For me, this means that the backup device just isn't big enough.

I think it's important to KISS with backups --- "Keep it Simple
Stupid". I would want a simple set of rules I can easily analyze, so I
know that the right stuff is being backed up. If I have to get a bigger
HDD for backups, so be it.

Except for database dump files, most large files on home PCs today are
multimedia --- music and photos. These files basically do not change;
they are write-once, read-many. For myself at least, the set of files
that change is minuscule compared to my modest (by today's standards)
160 GB hard drive.

> It has just occurred to me that rdiff-backup may not have to
> in fact perform the backup in order to figure out how much space
> it would require...

My biggest concern with rdiff-backup is CPU time. It's about 5X slower
than rsync, which performs a very similar computation. I'm running
backups on a P-II box because I only had to pay for the disks that way.
Anything that increases CPU time could be impractical.

Any chance rdiff-backup might be re-coded in C? Or at least its inner
loop in C?

-- Bob

Post Delete Enough To Finish 
Bob Fischer <bob.fischer17 < at > earthlink.net> writes:
> For me, this means that the backup device just isn't big enough.

This is true for me as well; I would want my purging of old stuff to
be independent of ongoing backups; if I run out of space during
a backup at all, something is wrong. But this is an
environment-dependent issue; several folks have asked for the "purge
enough to complete" option, and I expect not all of them want the same
behavior as each other, let alone as you or I...


> My biggest concern with rdiff-backup is CPU time. It's about 5X slower
> than rsync, which performs a very similar computation. I'm running
> backups on a P-II box because I only had to pay for the disks that way.
> Anything that increases CPU time could be impractical.
>
> Any chance rdiff-backup might be re-coded in C? Or at least its inner
> loop in C?

rdiff-backup uses librsync, which is written in C.
rsync does not use librsync, and thus might have some better
optimizations in it, or the delay could be Python-related or
code-design-related.


> I think it's important to KISS with backups --- "Keep it Simple

Ah, the art of usability.

The thing about backups is that to a greater extent than
most other routine system functions, one size does not fit all,
so it comes down to balancing flexibility with complexity.

dump is simple, but not flexible; Legato is flexible, but not simple.

I use rdiff-backup because it hits a really nice medium point;
the files are as easy to get to as with dump or tarfiles,
while space and version management is much better.

it is not just an either/or trade though; you can have a lot
of flexibility with simple tools if they are done right;
doing it right is, of course, a bit of work.

--akb
who doesn't code rdiff-backup, just writes about it

Post Delete Enough To Finish 
Bob Fischer <bob.fischer17 < at > earthlink.net>
wrote the following on Mon, 26 Jan 2004 23:39:58 -0500

> Any chance rdiff-backup might be re-coded in C? Or at least its
> inner loop in C?

As Andrew pointed out, librsync is already written in C, and a few of
the time-critical portions are already in C. There is no single function
(or even 15) that takes up 90% of the time, IIRC.

So the inner loop is already in C :-). Or alternately, the inner loop
that needs to be sped up is the big one that recurses down a directory
and checks to see if any files have changed.

My understanding of the way disk and processor speeds are changing is
that the difference favors rdiff-backup, and that long term it's more
important to increase reliability and features than CPU speed.


--
Ben Escoto
