Disk space watcher and autodeleter
Post Disk space watcher and autodeleter 
Hello,

I'm attempting to replace a custom "hacky" script that has a
similar technique (rsync links) as rsnapshot with rsnapshot.

The one major feature I see missing is the ability to keep
the disk space on a snapshot mountpoint below a specific
level.

This works by calling "du" to see the disk space consumed,
and deleting snapshots until the level falls below the
threshold. (It's otherwise configured to keep snapshots
"forever" - assuming the disk space never runs out.)

Even better would be to do a trial run of the sync to
estimate the space for the next snapshot, and subtract that
from the threshold. The threshold could then be set much
tighter (90%ish), as there's less danger of overrunning the
disk.
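
A minimal sketch of such a trial run, assuming rsync is on the PATH and that a dry run against the most recent snapshot directory is a fair proxy for what rsnapshot would transfer. The function names and both path parameters are hypothetical, and the rsync arguments rsnapshot really uses may differ. The figure is file data size, not disk blocks, and it ignores space freed by rotation, so treat it as a rough estimate only.

```python
import re
import subprocess


def parse_transferred_bytes(stats_output: str) -> int:
    """Extract the 'Total transferred file size' figure from `rsync --stats` output."""
    match = re.search(r"Total transferred file size:\s*([\d,.]+)", stats_output)
    if match is None:
        raise ValueError("no 'Total transferred file size' line found")
    # Newer rsync versions print digit grouping (e.g. 1,234,567); strip it.
    return int(re.sub(r"[^\d]", "", match.group(1)))


def estimate_next_snapshot_bytes(source: str, latest_snapshot: str) -> int:
    """Dry-run rsync from `source` onto the latest snapshot; nothing is modified.

    Both paths are illustrative parameters, not rsnapshot configuration names.
    """
    cmd = [
        "rsync", "-a", "--dry-run", "--stats",
        source.rstrip("/") + "/",   # trailing slash: compare directory contents
        latest_snapshot,
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_transferred_bytes(result.stdout)
```

The estimate could then be added to the current usage before comparing against the threshold.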

Does anyone have a cmd_preexec or other script to do this?

If I was to implement this, could the patches be considered
for inclusion in rbackup?

Thanks

------------------------------------------------------------------------------
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
_______________________________________________
rsnapshot-discuss mailing list
rsnapshot-discuss < at > lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss

Reply with quote
Post Disk space watcher and autodeleter 
Hello, Wilson,

You wrote on 25.02.10:

Quote:
I'm attempting to replace a custom "hacky" script that has a
similar technique (rsync links) as rsnapshot with rsnapshot.

Quote:
The one major feature I see missing is the ability to keep
the disk space on a snapshot mountpoint below a specific
level.

Quote:
This works by calling "du" to see the disk space consumed,
and deleting snapshots until the level falls below the
threshold.

Sorry - that's not a good idea.

There has to be enough space for a complete new backup - it's the worst
case, and it can happen.

And if there is not enough room for a complete new backup, then your
(desired) routine may have to delete all yearly and most monthly backups
- then you really don't need backups.

The better way is to install a bigger disk.

Quote:
(It's otherwise configured to keep snapshots
"forever" - assuming the disk space never runs out.)

How?
My oldest backups are "yearly", and I have defined

interval yearly 10

(those are nearly full backups ...)

Best regards!
Helmut

Post Disk space watcher and autodeleter 
Wilson Snyder wrote:

Quote:
The one major feature I see missing is the ability to keep
the disk space on a snapshot mountpoint below a specific
level.

This works by calling "du" to see the disk space consumed,
and deleting snapshots until the level falls below the
threshold.

It would be better to use df instead of du for this - du has to grovel
over the entire filesystem to see how big all the files and directories
are, whereas df makes a single system call to the filesystem.
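
As a rough sketch of the df-style approach: Python's `shutil.disk_usage` asks the filesystem for its totals instead of walking the tree. The pruning loop below is illustrative only; `prune_until_below`, its parameters, and the idea of passing in a `delete` callable are assumptions, not anything rsnapshot provides.

```python
import shutil
from typing import Callable, List, Sequence


def used_fraction(path: str) -> float:
    """Fraction of the filesystem holding `path` that is in use (roughly df's Use%)."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def prune_until_below(
    snapshots_oldest_first: Sequence[str],
    threshold: float,
    get_fraction: Callable[[], float],
    delete: Callable[[str], None],
    keep_min: int = 1,
) -> List[str]:
    """Delete oldest snapshots until usage is at or below `threshold`.

    Usage is re-measured after every deletion, because hard-linked
    snapshots free an unpredictable amount of space. At least
    `keep_min` snapshots are always kept.
    """
    remaining = list(snapshots_oldest_first)
    deleted: List[str] = []
    while len(remaining) > keep_min and get_fraction() > threshold:
        victim = remaining.pop(0)
        delete(victim)
        deleted.append(victim)
    return deleted
```

In real use, `get_fraction` might be `lambda: used_fraction("/Volumes/Backup")` and `delete` might be `shutil.rmtree`; a dry run with a `delete` that only logs is a safer first step.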

--
David Cantrell | London Perl Mongers Deputy Chief Heretic

Longum iter est per praecepta, breve et efficax per exempla.

Post Disk space watcher and autodeleter 
Helmut Hullen wrote:
Quote:
Hello, Wilson,
Quote:
The one major feature I see missing is the ability to keep
the disk space on a snapshot mountpoint below a specific
level.
Sorry - that's not a good idea.

In an ideal world, no it isn't. But let's be pragmatic!

Quote:
There has to be enough space for a complete new backup - it's the worst
case, and it can happen.

I hope it doesn't on *my* backup box!

Filesystem     Size  Used  Avail  Use%  Mounted on
...
/dev/disk1s3   1.8T  1.8T    40G   98%  /Volumes/Backup

It's been hovering between 30 and 50G free for the last coupla months.

Quote:
The better way is to install a bigger disk.

If your only consideration is backups then that's true. But I, and I
suspect Wilson, also care about things like money, and available slots
for plugging the damned disks in.

--
David Cantrell | Enforcer, South London Linguistic Massive

What profiteth a man, if he win a flame war, yet lose his cool?

Post Disk space watcher and autodeleter 
On Thu, Feb 25, 2010 at 09:37:23AM -0500, Wilson Snyder wrote:
Quote:

Hello,

I'm attempting to replace a custom "hacky" script that has a
similar technique (rsync links) as rsnapshot with rsnapshot.

The one major feature I see missing is the ability to keep
the disk space on a snapshot mountpoint below a specific
level.

Quote:
Even better would be to do a trial run of the sync to
estimate the space for the next snapshot, and subtract that
from the threshold. The threshold could then be set much
tighter (90%ish), as there's less danger of overrunning the
disk.

Does anyone have a cmd_preexec or other script to do this?

If I was to implement this, could the patches be considered
for inclusion in rbackup?

s/rbackup/rsnapshot/

If rsnapshot is working properly and in a steady state, then it
should use pretty much constant disk space relative to the size
of the file systems being backed up (unless there is an unusual
amount of space taken up by files which have changed).

So my conclusion is that if you are running out of disk space
on your rsnapshot destination, it means one of these things:
(1) your destination disk is not big enough
(2) you are trying to back up too much stuff
(3) you are trying to keep too many different backups
(4) your rsnapshot had not yet retained a complete set of backups,
but ran out of space when it tried to
(5) the source file system(s) have grown in size beyond the
capacity of the destination disk
(6) there is an unusual amount of file changing (counted by the
aggregate size of the files being changed) on the source(s)
(7) your rsnapshot is not working properly - for example destination
files are not properly hard linked and therefore take up much
more space on the destination disk than they need to

Problems 1 to 5 can be solved by provisioning more disk space for
the destination or changing what you are trying to back up so less
space is needed. Basically capacity planning.

Problem 6 can sometimes be managed by investigating what has changed
and seeing whether a large directory (or a directory with large files,
or a directory that has large directories for children) has been moved
(renamed) recently.

If so, it can save space to move/rename the relevant directory in the
right destination snapshot to match the change that has happened at source,
and modify snapshots between now and then. But please don't try this if
you are not sure about what you are doing - if in doubt, don't change the
destination snapshots.

But the more interesting point relates to (7). Running low on disk space
can be a warning that there is a problem with your rsnapshot installation.
If you are not watching rsnapshot closely, you may not notice if unchanged
files are not hard-linked together as they should be, until you start to
run low on disk space.

So this is a possible counter-argument about a risk that could arise if
rsnapshot were configured to automatically delete old snapshots when disk
space starts running low. This could mask a symptom of a problem that
should be addressed.

However I am interested in the idea. If risks like the masked symptom
risk can be managed appropriately, and of course if it is implemented
in a sensible way that doesn't cause other problems, then I would be
happy to consider it (and also happy to hear opinions for or against
from people on this list).

And the idea of a tool that can estimate disk space requirements for
the next snapshot seems like a good one. For example, that could be
used to build an early warning system that would check whether there
seems to be enough space for the next rsnapshot, and automatically
email a warning a few hours before the rsnapshot is attempted.

(I like to run "rsnapshot -t -q daily" from cron during business
hours as an early warning system - if there is any output, then I
get an email.)

--
___________________________________________________________________________
David Keegel <djk < at > cybersource.com.au> http://www.cyber.com.au/users/djk/
Cybersource P/L: Linux/Unix Systems Administration Consulting/Contracting

Post Disk space watcher and autodeleter 
Quote:
On Thu, Feb 25, 2010 at 09:37:23AM -0500, Wilson Snyder wrote:
Quote:

Hello,

I'm attempting to replace a custom "hacky" script that has a
similar technique (rsync links) as rsnapshot with rsnapshot.

The one major feature I see missing is the ability to keep
the disk space on a snapshot mountpoint below a specific
level.

Quote:
Even better would be to do a trial run of the sync to
estimate the space for the next snapshot, and subtract that
from the threshold. The threshold could then be set much
tighter (90%ish), as there's less danger of overrunning the
disk.

Does anyone have a cmd_preexec or other script to do this?

If I was to implement this, could the patches be considered
for inclusion in rsnapshot?
...

If rsnapshot is working properly and in a steady state, then it
should use pretty much constant disk space relative to the size
of the file systems being backed up (unless there is an unusual
amount of space taken up by files which have changed).

So my conclusion is that if you are running out of disk space
on your rsnapshot destination, it means one of these things:
(1) your destination disk is not big enough
(2) you are trying to back up too much stuff
(3) you are trying to keep too many different backups
(4) your rsnapshot had not yet retained a complete set of backups,
but ran out of space when it tried to
(5) the source file system(s) have grown in size beyond the
capacity of the destination disk
(6) there is an unusual amount of file changing (counted by the
aggregate size of the files being changed) on the source(s)
(7) your rsnapshot is not working properly - for example destination
files are not properly hard linked and therefore take up much
more space on the destination disk than they need to

Problems 1 to 5 can be solved by provisioning more disk space for
the destination or changing what you are trying to back up so less
space is needed. Basically capacity planning.

Problem 6 can sometimes be managed by investigating what has changed
and seeing whether a large directory (or a directory with large files,
or a directory that has large directories for children) has been moved
(renamed) recently.

If so, it can save space to move/rename the relevant directory in the
right destination snapshot to match the change that has happened at source,
and modify snapshots between now and then. But please don't try this if
you are not sure about what you are doing - if in doubt, don't change the
destination snapshots.

Thanks. Perhaps I should have given you our use case.
We're using it to back up 500GB+ disks used for Subversion
checkouts and chip simulations. These fit well into your
case 6; often on weekends there are few changes and few
checkout changes, but then the next day a user may remove
their areas, make a new checkout and run several simulations
overnight, which can easily make the next snapshot
several 100GB. We have dozens of filesystems like this, all
of which may vary from 1-25% of the disk per snapshot. The
point is I don't want to *think* about provisioning.

Quote:
But the more interesting point relates to (7). Running low on disk space
can be a warning that there is a problem with your rsnapshot installation.
If you are not watching rsnapshot closely, you may not notice if unchanged
files are not hard-linked together as they should be, until you start to
run low on disk space.

So this is a possible counter-argument about a risk that could arise if
rsnapshot were configured to automatically delete old snapshots when disk
space starts running low. This could mask a symptom of a problem that
should be addressed.

Those are good points. Perhaps one idea is to have a
configurable minimum number of snapshots to keep?

Quote:
However I am interested in the idea. If risks like the masked symptom
risk can be managed appropriately, and of course if it is implemented
in a sensible way that doesn't cause other problems, then I would be
happy to consider it (and also happy to hear opinions for or against
from people on this list).

And the idea of a tool that can estimate disk space requirements for
the next snapshot seems like a good one. For example, that could be
used to build an early warning system that would check whether there
seems to be enough space for the next rsnapshot, and automatically
email a warning a few hours before the rsnapshot is attempted.

If the warning is ignored, and there's nothing to delete
(perhaps due to the limits set), and the disk would
otherwise fill, what is the right default behavior? Skip it
entirely with an error? Having a half-snapshot seems
dangerous, and filling a filesystem often causes strange
behaviors.
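
One possible answer, sketched as a small decision function. This is purely illustrative and not how rsnapshot behaves; the `Action` names, the parameters, and the "skip with an error" default are all assumptions made for the example.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"   # enough room: take the snapshot
    PRUNE = "prune"       # too tight, but old snapshots may still be deleted
    SKIP = "skip"         # too tight and nothing left to delete: fail loudly


def decide(
    total_bytes: int,
    used_bytes: int,
    estimated_bytes: int,
    threshold: float,
    deletable_snapshots: int,
) -> Action:
    """Choose what to do before a snapshot, given a size estimate.

    `threshold` is the maximum allowed used fraction (e.g. 0.90), and
    `deletable_snapshots` is how many snapshots exceed the configured minimum.
    """
    projected = (used_bytes + estimated_bytes) / total_bytes
    if projected <= threshold:
        return Action.PROCEED
    if deletable_snapshots > 0:
        return Action.PRUNE
    # Skipping avoids both a half-written snapshot and a full filesystem.
    return Action.SKIP
```

A caller would re-run `decide` after each pruning step, and exit non-zero on `Action.SKIP` so that cron mails the error.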

Quote:
(I like to run "rsnapshot -t -q daily" from cron during business
hours as an early warning system - if there is any output, then I
get an email.)

Thanks, I'll look into this further then.

-Wilson

Post Disk space watcher and autodeleter 
Hello, Wilson,

You wrote on 25.02.10:

Quote:
Quote:
So this is a possible counter-argument about a risk that could arise
if rsnapshot were configured to automatically delete old snapshots
when disk space starts running low. This could mask a symptom of a
problem that should be addressed.

Quote:
Those are good points. Perhaps one idea is to have a
configurable minimum number of snapshots to keep?

One more blinkenlight ... the more options, the more confusion. Sorry.

A backup has to be reliable.
If/when I see that the system may run out of space, then I have to install
more space, not play with options.

I have paid once for a restore. I know it's not a good idea to replace
space with hope.

Best regards!
Helmut

Post Disk space watcher and autodeleter 
Hello, David,

You wrote on 25.02.10:

Quote:
Quote:
Quote:
The one major feature I see missing is the ability to keep
the disk space on a snapshot mountpoint below a specific
level.

Quote:
Quote:
Sorry - that's not a good idea.

Quote:
In an ideal world, no it isn't. But let's be pragmatic!

Quote:
Quote:
There has to be enough space for a complete new backup - it's the
worst case, and it can happen.

Quote:
I hope it doesn't on *my* backup box!

Quote:
Filesystem     Size  Used  Avail  Use%  Mounted on
....
/dev/disk1s3   1.8T  1.8T    40G   98%  /Volumes/Backup

Quote:
It's been hovering between 30 and 50G free for the last coupla
months.


I would have walked to my friendly hard disk shop and ordered a new disk.

If I install a backup system, it has to work. Reliably.
I have seen the effects of too-small backup media in some other
installations. Some colleagues were very unhappy ...

Maybe it solves some problems if I move the yearly backups to a
second/third/fourth disk (eSATA works fine), but if the program is
allowed to delete them without asking me, then I really don't need them.

It's a security backup, not a trash bin.

Best regards!
Helmut

Post Disk space watcher and autodeleter 
On Fri, 26 Feb 2010, Helmut Hullen wrote:

Quote:
Hello, Wilson,

You wrote on 25.02.10:

Quote:
Quote:
So this is a possible counter-argument about a risk that could arise
if rsnapshot were configured to automatically delete old snapshots
when disk space starts running low. This could mask a symptom of a
problem that should be addressed.

Quote:
Those are good points. Perhaps one idea is to have a
configurable minimum number of snapshots to keep?

One more blinkenlight ... the more options, the more confusion. Sorry.

A backup has to be reliable.
If/when I see the system may run out of space then I have to install
more space. Not playing with options.


How about merely adding the 'estimate next backup usage' idea? Then
calculating this sometime before an actual backup is set to occur could be
used to trigger external scripts: in case of too little space it could
either send a warning email or delete old snapshots, depending on the
user's preference. It would be quite a cpu intensive operation but if the
user wants to do it I can't see any harm.

Quote:
I have paid once for a restore. I know it's not a good idea to replace
space with hope.

Best regards!
Helmut

Post Disk space watcher and autodeleter 
On Fri, Feb 26, 2010 at 09:47:00AM +0000, Freddie wrote:

Quote:
How about merely adding the 'estimate next backup usage' idea? Then
calculating this sometime before an actual backup is set to occur could be
used to trigger external scripts: in case of too little space it could
either send a warning email or delete old snapshots, depending on the
user's preference. It would be quite a cpu intensive operation but if the
user wants to do it I can't see any harm.

It would actually use very little CPU, but lots of I/O, as it would have
to, at minimum, read an entire backup (or at least all its dirents) and
compare to the data that you're about to back up.
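
To illustrate why the cost is I/O and not CPU, here is a rough sketch that walks the source tree and stats each file against its copy in the latest snapshot. It borrows the idea behind rsync's default quick check (compare size and modification time), but it is not rsync's implementation; the function name and both paths are hypothetical, and deletions, hard links and block rounding are ignored.

```python
import os


def estimate_changed_bytes(source: str, snapshot: str) -> int:
    """Sum the sizes of source files that are new, or whose size or mtime
    differs from the file at the same relative path in `snapshot`.

    Only metadata is read (one lstat per file on each side), which is
    cheap on CPU but touches every directory entry.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        for name in filenames:
            try:
                src_stat = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished while we were walking
            try:
                snap_stat = os.lstat(os.path.join(snapshot, rel, name))
            except OSError:
                snap_stat = None  # not present in the snapshot: counts as new
            if (
                snap_stat is None
                or snap_stat.st_size != src_stat.st_size
                or int(snap_stat.st_mtime) != int(src_stat.st_mtime)
            ):
                total += src_stat.st_size
    return total
```

On a tree with millions of files, the stat calls dominate the run time, which matches the point above about dirents.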

--
David Cantrell | A machine for turning tea into grumpiness

Anyone who cannot cope with mathematics is not fully human.
At best he is a tolerable subhuman who has learned to wear
shoes, bathe and not make messes in the house.
-- Robert A Heinlein

Post Disk space watcher and autodeleter 
On Fri, Feb 26, 2010 at 08:37:00AM +0100, Helmut Hullen wrote:
Quote:
Hello, David,
Quote:
Filesystem     Size  Used  Avail  Use%  Mounted on
/dev/disk1s3   1.8T  1.8T    40G   98%  /Volumes/Backup
It's been hovering between 30 and 50G free for the last coupla
months.
I would have walked to my friendly hard disk shop and ordered a new disk.

Like I said, it's been there for a coupla months without any problems.

Quote:
If I install a backup system it has to work. Reliable.

I can live with one backup failing because it runs out of space, and if
that does happen, I'll get an email notification, so I can free up some
space while I wait for a new disk to arrive.

Like I said, be pragmatic. Absolute rigour isn't the right solution for
everyone in every situation. And I kinda trust rsnapshot users to do
their own risk analysis so I won't dictate to them what they should do.

If I needed absolute reliability, I'd have a nice modern Sun box,
running Solaris, in a proper data centre, with air conditioning, UPS,
redundant power supplies etc etc etc. Instead, I have an Apple G4 Cube
with some external Firewire disks. Oh, and if I needed absolute
reliability I'd be using proper filesystem point-in-time snapshots, not
rsync.

--
David Cantrell | even more awesome than a panda-fur coat

What plaything can you offer me today?

Post Disk space watcher and autodeleter 
Quote:
Quote:
On Thu, Feb 25, 2010 at 09:37:23AM -0500, Wilson Snyder wrote:
Quote:

Hello,

I'm attempting to replace a custom "hacky" script that has a
similar technique (rsync links) as rsnapshot with rsnapshot.

The one major feature I see missing is the ability to keep
the disk space on a snapshot mountpoint below a specific
level.

Quote:
Even better would be to do a trial run of the sync to
estimate the space for the next snapshot, and subtract that
from the threshold. The threshold could then be set much
tighter (90%ish), as there's less danger of overrunning the
disk.

Does anyone have a cmd_preexec or other script to do this?

If I was to implement this, could the patches be considered
for inclusion in rsnapshot?
...

If rsnapshot is working properly and in a steady state, then it
should use pretty much constant disk space relative to the size
of the file systems being backed up (unless there is an unusual
amount of space taken up by files which have changed).

... [Discussion about possible problems]

Quote:
Running low on disk space
can be a warning that there is a problem with your rsnapshot installation.
If you are not watching rsnapshot closely, you may not notice if unchanged
files are not hard-linked together as they should be, until you start to
run low on disk space.

So this is a possible counter-argument about a risk that could arise if
rsnapshot were configured to automatically delete old snapshots when disk
space starts running low. This could mask a symptom of a problem that
should be addressed.

Those are good points. Perhaps one idea is to have a
configurable minimum number of snapshots to keep?

Quote:
However I am interested in the idea. If risks like the masked symptom
risk can be managed appropriately, and of course if it is implemented
in a sensible way that doesn't cause other problems, then I would be
happy to consider it (and also happy to hear opinions for or against
from people on this list).

I've just posted a patch to implement these features.

BTW there are a few other minor patches I've posted too; I
don't know if SourceForge is notifying you all, as I haven't
seen comments on them.

Quote:
Quote:
And the idea of a tool that can estimate disk space requirements for
the next snapshot seems like a good one. For example, that could be
used to build an early warning system that would check whether there
seems to be enough space for the next rsnapshot, and automatically
email a warning a few hours before the rsnapshot is attempted.

If the warning is ignored, and there's nothing to delete
(perhaps due to the limits set), and the disk would
otherwise fill, what is the right default behavior? Skip it
entirely with an error? Having a half-snapshot seems
dangerous, and filling a filesystem often causes strange
behaviors.

This I haven't done, but it's an easy extension now that the
info is there. (An early warning wouldn't work for my
application because we can easily have 10+GB of data change
in the few minutes before snapshot time.)

-Wilson
