Techniques for trimming the sizes of backups
Post Techniques for trimming the sizes of backups 
Hi All,

I've been running rdiff-backup successfully on a few servers for
several years now, but noticed recently that the disk space
requirements have grown quite large (not a sudden change in disk usage,
just something I looked into when it became an issue!).

I've got a few questions about how to reduce this - ideally without
using the "--remove-older-than" option over the whole backup, as
having the history has been useful in the past. Really just after
making point changes to certain files in the backup, without affecting
the integrity of the backup as a whole.

1) How to safely remove a given file from the backup directory?
I've got a given file (Mail.tar.gz) that should never have been backed
up, but was. Because the file was regenerated daily, at around the time
the rdiff-backup script ran, it would either appear different every day
or appear/disappear depending on timings.

Looking in {backup}/rdiff-backup-data/increments/ I see three
different sets of files that reference Mail.tar.gz in their names:
- ~40GB: rdiff-backup-data/increments/Mail.tar.gz.20*diff
- ~9.5GB: rdiff-backup-data/increments/Mail.tar.gz.20*snapshot
- ~0: rdiff-backup-data/increments/Mail.tar.gz.20*missing

There are presumably also a bunch of meta-data files that reference Mail.tar.gz.

So, given the fact that I do NOT want this file in the backups at all,
but ideally want to keep all history going back for everything else in
the backup, what are my options for removing this?

Is it safe to just delete all the
rdiff-backup-data/increments/Mail.tar.gz* files? I don't care that
this will leave some phantom trace of this file, as long as it doesn't
affect the rest of the backups..
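
Before deleting anything, it may help to see exactly which increment files are involved and how much space each type takes. What follows is a minimal Python sketch, assuming the increment files are named `Mail.tar.gz.<timestamp>.<type>` as in the listing above, possibly with a trailing `.gz`. The function name `tally_increments` is illustrative and not part of rdiff-backup. It only reads file sizes and deletes nothing.

```python
from collections import defaultdict
from pathlib import Path

# Increment types seen in the listing above.
KINDS = ("diff", "snapshot", "missing")


def tally_increments(increments_dir, basename):
    """Return {kind: (file_count, total_bytes)} for one file's increments."""
    totals = defaultdict(lambda: [0, 0])
    # Note: this also matches increments of any file whose name starts with
    # "<basename>.", so check the names before acting on the result.
    for path in Path(increments_dir).glob(basename + ".*"):
        parts = path.name.split(".")
        # The type is the last suffix, or the one before a trailing ".gz".
        if parts[-1] == "gz" and parts[-2] in KINDS:
            kind = parts[-2]
        else:
            kind = parts[-1]
        if kind in KINDS and path.is_file():
            totals[kind][0] += 1
            totals[kind][1] += path.stat().st_size
    return {kind: tuple(value) for kind, value in totals.items()}
```

Calling `tally_increments("{backup}/rdiff-backup-data/increments", "Mail.tar.gz")`, with `{backup}` replaced by the real repository path, would give a per-type count and byte total to compare against the figures above.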

2) What is the difference between the mirror_metadata* and
file_statistics* files? Are they both required for restoring from
backups? file_statistics appears to be largely a subset of the
mirror_metadata information.

I only ask because one user has ~2.5GB of mirror_metadata* files
(which seem to contain all the important backup meta-data), but
~6.5GB of file_statistics* files - so if the latter are not necessary
for restoring from backups then I can save some disk space.
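
The comparison above can be reproduced with a short Python sketch that sums the size of each family of files in the rdiff-backup-data directory. It assumes the files sit directly in that directory and are named `<prefix>.<timestamp>...`. The function name `size_by_prefix` is hypothetical. It says nothing about whether either family is needed for a restore; it only measures.

```python
from pathlib import Path


def size_by_prefix(data_dir, prefixes=("mirror_metadata", "file_statistics")):
    """Sum the size in bytes of the files in data_dir that start with each prefix."""
    totals = {prefix: 0 for prefix in prefixes}
    for path in Path(data_dir).iterdir():
        if not path.is_file():
            continue
        for prefix in prefixes:
            # Require the dot so that only "<prefix>.<...>" names are counted.
            if path.name.startswith(prefix + "."):
                totals[prefix] += path.stat().st_size
    return totals
```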

TIA,
Anthony


_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users < at > nongnu.org
http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki

Post Techniques for trimming the sizes of backups 
Anthony Toole spake thusly on 02/08/2010 11:04 AM:


Is it safe to just delete all the
rdiff-backup-data/increments/Mail.tar.gz* files? I don't care that
this will leave some phantom trace of this file, as long as it doesn't
affect the rest of the backups..


Hi -- I recently did this with a .Trash folder that was getting backed
up. I excluded the file in my include list and then manually deleted
the increments files. The next time I ran rdiff-backup I got some
errors about it, but then all seemed fine and with no errors on the next
run.
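
The manual deletion described above can be rehearsed as a dry run first, ideally on a copy of the repository. This is a rough Python sketch, assuming the increments of a single file are named `<basename>.<something>` inside the increments directory. The helper `remove_increments` is hypothetical, not an rdiff-backup command, and it does not handle directories such as .Trash.

```python
from pathlib import Path


def remove_increments(increments_dir, basename, dry_run=True):
    """List the increment files for basename; delete them only if dry_run is False."""
    matches = sorted(
        path
        for path in Path(increments_dir).glob(basename + ".*")
        if path.is_file()
    )
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

With the default `dry_run=True` nothing is removed, so the returned list can be checked by eye before running it again with `dry_run=False`.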

(I've only been using rdiff-backup for a couple of months, so I'm not
speaking from a lot of experience at this...)

Scott



Post Techniques for trimming the sizes of backups 
On Mon, Feb 08, 2010 at 05:04:36PM +0000, Anthony Toole wrote:
Hi All,

I've been running rdiff-backup successfully on a few servers for
several years now - but noticed recently that the disk space
requirements have got quite large (not a sudden change in disk usage,
just something I looked in to when it became an issue!).

Can't really answer your other problem, but thought I would add:

I use a fusecompress filesystem as the destination for rdiff-backup; it does
file-by-file compression.

So:

the partition is mounted at /backup/.laptop
fusecompress is mounted at /backup/laptop

/backup/laptop is the rdiff-backup destination.

You can work with the files in /backup/laptop just like on any other POSIX
filesystem. The files in /backup/.laptop are the compressed ones; you can use an
offline tool to decompress them if you want.

I have seen quite large savings, especially with things like CVS, maildir,
etc.
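
To get a feel for what file-by-file compression might save on a given tree, here is a rough Python sketch. It uses `zlib` purely as a stand-in (it is not what fusecompress does internally), assumes that a file which does not shrink would be stored as-is, and reads each file fully into memory, so it is only suitable for small test trees. The function name `per_file_savings` is illustrative.

```python
import zlib
from pathlib import Path


def per_file_savings(root):
    """Compress every regular file under root on its own.

    Returns (raw_bytes, compressed_bytes) summed over all files.
    """
    raw = packed = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            data = path.read_bytes()
            raw += len(data)
            # Assumption: incompressible files are kept uncompressed.
            packed += min(len(data), len(zlib.compress(data)))
    return raw, packed
```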

[snip]



Post Techniques for trimming the sizes of backups 
On Wed, Feb 10, 2010 at 08:57:16PM +1100, Alex Samad wrote:
Can't really answer your other problem, but thought I would add:
I use a fusecompress filesystem as the destination for rdiff-backup; it does
file-by-file compression.

I'm experimenting with LessFS. It's another fuse-based filesystem, and in
addition to compressing blocks, it checksums each block and only stores
identical blocks once -- "de-duplication". This seems like a particular win
with rdiff-backup, because of the problem with handling of renamed files.

--
Matthew Miller mattdm < at > mattdm.org <http://mattdm.org/>



Post Techniques for trimming the sizes of backups 
Thanks Scott. Have you tried to recover anything from the time frame
where the deleted file existed (or earlier)?

I can give it a go on a copy of my backup if nobody can say for sure
if it will work or not..

On 10 February 2010 01:55, Scott Carpenter <scottc < at > movingtofreedom.org> wrote:
Anthony Toole spake thusly on 02/08/2010 11:04 AM:


Is it safe to just delete all the
rdiff-backup-data/increments/Mail.tar.gz* files?  I don't care that
this will leave some phantom trace of this file, as long as it doesn't
affect the rest of the backups..


Hi -- I recently did this with a .Trash folder that was getting backed up.
 I excluded the file in my include list and then manually deleted the
increments files.  The next time I ran rdiff-backup I got some errors about
it, but then all seemed fine and with no errors on the next run.

(I've only been using rdiff-backup for a couple of months, so I'm not
speaking from a lot of experience at this...)

Scott




Post Techniques for trimming the sizes of backups 
On Wed, Feb 10, 2010 at 08:24:33AM -0500, Matthew Miller wrote:
On Wed, Feb 10, 2010 at 08:57:16PM +1100, Alex Samad wrote:
Can't really answer your other problem, but thought I would add:
I use a fusecompress filesystem as the destination for rdiff-backup; it does
file-by-file compression.

I'm experimenting with LessFS. It's another fuse-based filesystem, and in
addition to compressing blocks, it checksums each block and only stores
identical blocks once -- "de-duplication". This seems like a particular win
with rdiff-backup, because of the problem with handling of renamed files.

That's nice... what compression technique does it use?


--
"I don't know where bin Laden is. I have no idea and really don't care. It's not that important. It's not our priority. "

- George W. Bush
03/13/2002
Washington, DC


Post Techniques for trimming the sizes of backups 
On Thu, Feb 11, 2010 at 06:44:46AM +1100, Alex Samad wrote:
I'm experimenting with LessFS. It's another fuse-based filesystem, and in
addition to compressing blocks, it checksums each block and only stores
identical blocks once -- "de-duplication". This seems like a particular win
with rdiff-backup, because of the problem with handling of renamed files.
That's nice... what compression technique does it use?

Read about it yourself here: http://www.lessfs.com/wordpress/?page_id=50

In short, it uses a 192-bit hash function (happens to be Tiger) to uniquely
identify each block, and then compresses each block with LZO or QUICKLZ.
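
The idea can be illustrated with a toy Python sketch of a block store that keeps each distinct block once. It is not LessFS code: `hashlib.blake2b` with a 24-byte digest stands in for the 192-bit Tiger hash, `zlib` stands in for LZO or QuickLZ, and the 4096-byte block size and the `BlockStore` class are assumptions made for the example.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # illustrative value, not taken from LessFS


class BlockStore:
    """Toy content-addressed store: each distinct block is compressed and kept once."""

    def __init__(self):
        self.blocks = {}  # 192-bit digest -> compressed block
        self.files = {}   # file name -> ordered list of digests

    def write(self, name, data):
        digests = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            # Stand-in for Tiger: any 192-bit digest serves the sketch.
            digest = hashlib.blake2b(block, digest_size=24).digest()
            if digest not in self.blocks:
                # Stand-in for LZO/QuickLZ; only unseen blocks are stored.
                self.blocks[digest] = zlib.compress(block)
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        return b"".join(zlib.decompress(self.blocks[d]) for d in self.files[name])
```

Writing the same bytes under a second name adds an entry to `files` but no new entries to `blocks`, which is why a renamed file would cost almost nothing extra in a store of this kind.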

--
Matthew Miller mattdm < at > mattdm.org <http://mattdm.org/>



Post Techniques for trimming the sizes of backups 
On Wed, Feb 10, 2010 at 03:05:14PM -0500, Matthew Miller wrote:
On Thu, Feb 11, 2010 at 06:44:46AM +1100, Alex Samad wrote:
I'm experimenting with LessFS. It's another fuse-based filesystem, and in
addition to compressing blocks, it checksums each block and only stores
identical blocks once -- "de-duplication". This seems like a particular win
with rdiff-backup, because of the problem with handling of renamed files.
That's nice... what compression technique does it use?

Read about it yourself here: http://www.lessfs.com/wordpress/?page_id=50

In short, it uses a 192-bit hash function (happens to be Tiger) to uniquely
identify each block, and then compresses each block with LZO or QUICKLZ.

I had a quick read of the web site, and I'm wondering how effective it would
be with something like rdiff-backup. My line of thinking is that rdiff-backup
stores the differences, so I would guess all the original files would
benefit, but the differences wouldn't.

Also, with fusecompress you can specify by MIME type which files pass
through, i.e. don't get affected by fusecompress.

I will have to investigate a bit more and run some tests.

--
"America stands for liberty, for the pursuit of happiness, and for the unalienalienable right of life."

- George W. Bush
11/03/2003
Washington, DC


Post Techniques for trimming the sizes of backups 
Anthony Toole spake thusly on 02/10/2010 06:49 AM:
Thanks Scott. Have you tried to recover anything from the time frame
where the deleted file existed (or earlier)?

I can give it a go on a copy of my backup if nobody can say for sure
if it will work or not..



I have not. I just had (and still have) faith that it will be okay. :)

Experimentation is well-advised!

Scott


