Limit creation of log and session data
Dear list,

I'm running rdiff-backup 1.0.3 on Debian Testing. I'm using it to take
a backup of a web app every 30 minutes. (There might be better tools
for this, but I like rdiff-backup.) The problem is that quite quickly
the log files and session data become larger than the web app itself. This
is a real problem because I back up the rdiff-backup directory to an off-line
disc once every day, and the off-line backup grows and grows.

The problem can be solved by implementing the "Delete Intermediate"
feature request:
<http://rdiff-backup.solutionsfirst.com.au/index.php/DeleteIntermediate>
However, other (easier) approaches come to mind:
1) Never, ever gzip an empty file. All my error_log files are
108 bytes instead of 0 because they are compressed.
2) Don't write any logs or session data if there are no changes.
(Make it an option or the default.) This would help a lot!
3) I'm already using "--no-file-statistics". Any other options like
this that could be useful?
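
On point 1, a minimal Python sketch of why a compressed empty file is never 0 bytes: the gzip container adds a header and trailer even when there is no payload. The exact size depends on what the writer stores in the header (for instance an original file name), so this will not necessarily print 108.

```python
import gzip

# Compressing zero bytes still produces a gzip header and trailer,
# so the compressed "empty" file is never 0 bytes long.
empty_compressed = gzip.compress(b"")
print(len(b""), "->", len(empty_compressed), "bytes")

# Round trip: the payload is still empty after decompression.
assert gzip.decompress(empty_compressed) == b""
```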

I'm aware that I can post process the rdiff-backup-data and remove
empty error_log files and so on, but it's much cleaner if rdiff-backup
cleans up its own mess :-)
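
The post-processing mentioned above could look roughly like the sketch below. It is not part of rdiff-backup; the function names are made up for illustration, and it assumes the logs sit in the rdiff-backup-data directory with names starting with "error_log." and an optional ".gz" suffix.

```python
import gzip
from pathlib import Path
from typing import List


def is_empty_log(path: Path) -> bool:
    """Return True if the log holds no data, compressed or not."""
    if path.suffix == ".gz":
        try:
            with gzip.open(path, "rb") as fh:
                return fh.read(1) == b""
        except OSError:
            # Unreadable archive: keep it so it can be inspected.
            return False
    return path.stat().st_size == 0


def prune_empty_error_logs(data_dir: str) -> List[str]:
    """Delete empty error_log files in data_dir; return the removed names."""
    removed = []
    # Assumed naming: error_log.<timestamp>.data[.gz]
    for path in sorted(Path(data_dir).glob("error_log.*"), key=lambda p: p.name):
        if path.is_file() and is_empty_log(path):
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling prune_empty_error_logs("/path/to/backup/rdiff-backup-data") after each run would be the intended use; try it on a copy of the directory first.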

Regards,
Hans - until now a very happy rdiff-backup user

Post Limit creation of log and session data
"Hans F. Nordhaug" <Hans.F.Nordhaug < at > hiMolde.no>
wrote the following on Tue, 13 Dec 2005 09:47:15 +0100

However, other (easier) approaches come to mind:
1) Never, ever gzip an empty file. All my error_log files are
108 bytes instead of 0 because they are compressed.
2) Don't write any logs or session data if there are no changes.
(Make it an option or the default.) This would help a lot!

Yes, I think the error_log file could be skipped if there are no
errors. Anyone see any drawbacks to this?

I'm not sure what other files this advice would apply to however.
There is only one backup.log file, so no logging means no change in
disk space. The session_statistics file only takes up about 500
bytes, and can still contain useful statistics (you might want to know
that nothing has changed, how long the session took, etc.).

The mirror_metadata file I would want to keep. However, the
development versions use diffs on these files, so they should take up
less space than 1.0.x and earlier.


--
Ben Escoto

Post Limit creation of log and session data
* Ben Escoto <ben < at > emerose.org> [2005-12-15]:
"Hans F. Nordhaug" <Hans.F.Nordhaug < at > hiMolde.no>
wrote the following on Tue, 13 Dec 2005 09:47:15 +0100

However, other (easier) approaches come to mind:
1) Never, ever gzip an empty file. All my error_log files are
108 bytes instead of 0 because they are compressed.
2) Don't write any logs or session data if there are no changes.
(Make it an option or the default.) This would help a lot!

Yes, I think the error_log file could be skipped if there are no
errors. Anyone see any drawbacks to this?

(That would be great.)

I'm not sure what other files this advice would apply to however.
There is only one backup.log file, so no logging means no change in
disk space. The session_statistics file only takes up about 500
bytes, and can still contain useful statistics (you might want to know
that nothing has changed, how long the session took, etc.).

Yes, I can see the use for session_statistics, but couldn't there be
an option equivalent to "--no-file-statistics" to suppress these files
if the user really wants that? (I know I can do this myself, but then I
have to update the code every time a new version is released.)

The mirror_metadata file I would want to keep. However, the
development versions use diffs on these files, so they should take up
less space than 1.0.x and earlier.

OK, I'll try the development version. (I'm currently running 1.0.3.)

Thx a lot for your reply and keep up the good work!

Hans

Post Limit creation of log and session data
On Thu, 15 Dec 2005, Ben Escoto wrote:

"Hans F. Nordhaug" <Hans.F.Nordhaug < at > hiMolde.no>
wrote the following on Tue, 13 Dec 2005 09:47:15 +0100

However, other (easier) approaches come to mind:
1) Never, ever gzip an empty file. All my error_log files are
108 bytes instead of 0 because they are compressed.
2) Don't write any logs or session data if there are no changes.
(Make it an option or the default.) This would help a lot!

Yes, I think the error_log file could be skipped if there are no
errors. Anyone see any drawbacks to this?

Which filesystems allocate disk space for an empty file? AFAIK, ext2 only
uses an entry in the 'directory', but no blocks for contents. So, by just
not gzipping an empty file, it will probably take less space already.

I currently use the error_log file as a means to check the result of the
rdiff-backup run. Attaching the contents of `ls error_log*|tail -n1` to a
cron email is quite easy, but would give strange results when no error_log
file was created in the last run.
While my crude error_log selection line could be adjusted to use e.g. the
timestamp from the current_mirror file, I really prefer to have all log
and session files, even if they're empty.
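
A rough Python equivalent of the `ls error_log*|tail -n1` check that also copes with a missing error_log file might look like this. It is only a sketch: the function names are invented here, and it assumes that the file names sort chronologically (as timestamped names in one time zone would) and that the logs are plain text, optionally gzipped.

```python
import gzip
from pathlib import Path
from typing import Optional


def latest_error_log(data_dir: str) -> Optional[Path]:
    """Return the error_log file whose name sorts last, or None if there is none."""
    logs = sorted(Path(data_dir).glob("error_log*"), key=lambda p: p.name)
    return logs[-1] if logs else None


def error_report(data_dir: str) -> str:
    """Build the text to attach to the cron email."""
    log = latest_error_log(data_dir)
    if log is None:
        return "No error_log file found for the last run."
    # Read gzipped and plain logs the same way.
    opener = gzip.open if log.suffix == ".gz" else open
    with opener(log, "rt", errors="replace") as fh:
        body = fh.read()
    return body or f"{log.name}: no errors logged."
```

Comparing the chosen file's timestamp with the one on the current_mirror file would additionally confirm that the log belongs to the latest run; that check is left out here.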

If deleting (or not creating) file_statistics files is possible, and
mirror_metadata files are managed incrementally, I doubt there will be
much overhead in disk usage. And a few megabytes is a small price to pay
for long-term "incremental restores".

A switch for not writing session_statistics files might be useful, but
adding an 'rm rdiff-backup-data/session_statistics*' in the script calling
rdiff-backup shouldn't be too hard either.
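
For a wrapper script written in Python rather than shell, the `rm` step could be sketched as below. The function name and the `keep` parameter (retain the newest few files) are hypothetical additions for illustration, not rdiff-backup options, and the sketch assumes the file names sort chronologically.

```python
from pathlib import Path
from typing import List


def remove_session_statistics(data_dir: str, keep: int = 0) -> List[str]:
    """Delete session_statistics* files in data_dir, keeping the `keep` newest by name."""
    files = sorted(Path(data_dir).glob("session_statistics*"), key=lambda p: p.name)
    # Everything except the last `keep` entries is removed.
    doomed = files[:max(len(files) - keep, 0)]
    removed = []
    for path in doomed:
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

With keep=0 this behaves like the plain 'rm rdiff-backup-data/session_statistics*'; with keep=1 the statistics of the latest session survive.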


--
Maarten

Post Limit creation of log and session data
* Maarten Bezemer <mcbrdiff < at > robuust.nl> [2005-12-16]:

On Thu, 15 Dec 2005, Ben Escoto wrote:

"Hans F. Nordhaug" <Hans.F.Nordhaug < at > hiMolde.no>
wrote the following on Tue, 13 Dec 2005 09:47:15 +0100

However, other (easier) approaches come to mind:
1) Never, ever gzip an empty file. All my error_log files are
108 bytes instead of 0 because they are compressed.
2) Don't write any logs or session data if there are no changes.
(Make it an option or the default.) This would help a lot!

Yes, I think the error_log file could be skipped if there are no
errors. Anyone see any drawbacks to this?

Which filesystems allocate disk space for an empty file? AFAIK, ext2 only
uses an entry in the 'directory', but no blocks for contents. So, by just
not gzipping an empty file, it will probably take less space already.

Which is exactly what I suggested. (I just complained about gzipping
an empty file.) However, a "million" empty files do clutter the
directory...

A switch for not writing session_statistics files might be useful, but
adding an 'rm rdiff-backup-data/session_statistics*' in the script calling
rdiff-backup shouldn't be too hard either.

I agree, this is what I called post processing. But to repeat myself -
to be consistent it makes sense to also have a "--no-session-statistics"
option.

Hans

PS! Since you claim it's easy to add 'rm rdiff-backup-data/session_statistics*'
to my cron scripts, I would say it's just as easy to add an "if error_log
exists" test to your cron scripts :-)

Post Limit creation of log and session data
Maarten Bezemer <mcbrdiff < at > robuust.nl>
wrote the following on Fri, 16 Dec 2005 00:56:45 +0100 (CET)

Which filesystems allocate disk space for an empty file? AFAIK, ext2 only
uses an entry in the 'directory', but no blocks for contents. So, by just
not gzipping an empty file, it will probably take less space already.

I currently use the error_log file as a means to check the result of the
rdiff-backup run. Attaching the contents of `ls error_log*|tail -n1` to a
cron email is quite easy, but would give strange results when no error_log
file was created in the last run.
While my crude error_log selection line could be adjusted to use e.g. the
timestamp from the current_mirror file, I really prefer to have all log
and session files, even if they're empty.

It was a bit more work than I thought, but the patches I just checked
in avoid gzipping 0 length error_log, mirror_metadata,
extended_attributes, and access_control_list files (but all are still
written).

session_statistics files are still written, so if they bug anyone, do
a "rm session_statistics*".

"Hans F. Nordhaug" <Hans.F.Nordhaug < at > hiMolde.no>
wrote the following on Thu, 15 Dec 2005 22:11:12 +0100

The mirror_metadata file I would want to keep. However, the
development versions use diffs on these files, so they should take up
less space than 1.0.x and earlier.

OK, I'll try the development version. (I'm currently running 1.0.3.)

Sure, but make sure to keep an eye on your backups since the
development version is the development version.


--
Ben Escoto
