rdiff-backup reliability and lzop compression
Post rdiff-backup reliability and lzop compression
hello!

i'm new to rdiff-backup and want to bring up two things.

the following isn't meant as criticism, nor as "i know better....". rdiff-backup looks rather promising, and i don't know of any other application (are there any?) that can do incremental backups with a binary diff, which is very efficient for saving space. a really neat utility! cool stuff - thanks for making it!


first, i'd be happy to know how "reliable" rdiff-backup is in general.
is my data really _safe_?
is this ready for the enterprise?
any "real world" experiences? ('.....i'm backing up 2 TB daily and never had a problem.....')
i have some doubts, because i tried rdiff-backup some time ago and had some problems (i can't describe them exactly anymore) - and now, while trying again, i'm having problems with large files.

i posted a bug report for this at http://savannah.nongnu.org/bugs/?func=detailitem&item_id=15539


now the more interesting part:

regarding disk space, one idea comes to mind:

what about storing _all_ of the backup data compressed and adding a layer of "realtime compression/decompression" - i.e. instead of only gzipping the data in the rdiff-backup-data subdir, why not compress the data in "destination_directory", too?


i'm thinking of lzo, which is a sort of realtime compression library.
see http://www.oberhumer.com/opensource/lzo/
there is also a python version there!

look at this little example:

time cp test.dat test2.dat
real 2m39.438s
user 0m0.163s
sys 0m28.336s

time lzop -c test.dat >test.dat.lzo
real 2m27.205s
user 1m5.725s
sys 0m20.681s

you can see that copying the data is slower in wall-clock time (2m39s) than writing it in compressed format (2m27s). (ok, compressing needs a lot more cpu: about 1m06s of user time versus 0.16s for the plain copy...)
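For anyone who wants to repeat this comparison without lzop installed, here is a rough sketch in Python. It is only an approximation of the test above: it uses the standard library's zlib at its fastest level as a stand-in for lzo (an assumption, since lzo is not in the standard library), and the names `copy_plain`, `copy_compressed` and `compare` are made up for illustration. The timings depend heavily on the disk, the CPU and how compressible the data is.

```python
import os
import shutil
import tempfile
import time
import zlib


def copy_plain(src, dst):
    """Copy src to dst unchanged and return the number of bytes written."""
    shutil.copyfile(src, dst)
    return os.path.getsize(dst)


def copy_compressed(src, dst, level=1, chunk_size=1 << 20):
    """Stream src into dst through zlib (stand-in for lzo); return bytes written."""
    compressor = zlib.compressobj(level)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(compressor.compress(chunk))
        fout.write(compressor.flush())
    return os.path.getsize(dst)


def timed(func, *args):
    """Run func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def compare(src):
    """Return (bytes_written, seconds) for a plain and a compressed copy of src."""
    with tempfile.TemporaryDirectory() as tmp:
        plain = timed(copy_plain, src, os.path.join(tmp, "copy.dat"))
        packed = timed(copy_compressed, src, os.path.join(tmp, "copy.dat.z"))
    return {"plain": plain, "compressed": packed}
```

Calling `compare("test.dat")` gives the output size and wall-clock time of both variants. Whether the compressed write wins depends on whether the disk or the CPU is the bottleneck, which is the point of the timings above.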

adding lzo could probably save both space and time for backups.
i'm not sure, but maybe someone finds this interesting and would like to give it some thought?

regards

roland
(sysadmin)

Post rdiff-backup reliability and lzop compression
roland wrote:
first i'd be happy to know, how "reliable"
rdiff-backup is in general.
is my data really _safe_ ?

What does "safe" mean? Safe enough that you never have to carry out test
restores? NO backup system is that safe, ergo test restores must be
carried out and they will answer your question.
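A test restore is only useful if the result is actually compared against the original. The sketch below is one way to do that comparison, assuming the backup has already been restored into a scratch directory by whatever means you normally use. The function names are hypothetical, and it only checks file contents (SHA-256), not permissions, ownership or timestamps. Files that changed on the source after the backup ran will show up as differences too.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each regular file's path (relative to root) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only contents are compared here.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def compare_trees(source_root, restored_root):
    """Return (missing, extra, changed) lists of relative paths."""
    src = tree_digests(source_root)
    dst = tree_digests(restored_root)
    missing = sorted(set(src) - set(dst))   # in source, not in restore
    extra = sorted(set(dst) - set(src))     # in restore, not in source
    changed = sorted(p for p in set(src) & set(dst) if src[p] != dst[p])
    return missing, extra, changed
```

Three empty lists from `compare_trees(source, restored)` mean every regular file came back with identical contents.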

That said, I've been backing up my and customers' data for over three
years using rdiff-backup with a minimum of trouble.

Keith

Post rdiff-backup reliability and lzop compression
roland wrote:
first i'd be happy to know, how "reliable" rdiff-backup is in general.

Speaking as a user, rdiff-backup isn't perfect -- it has bugs, to be
sure -- but none of the issues we've seen impact data integrity. Our
internal QA team has given it their approval, and we're using it for
data for which lawsuits could (and quite likely would!) be spawned were
it lost.

regarding disk space, one idea comes to mind:

what about storing _all_ of the backup data compressed and adding a
layer of "realtime compression/decompression" - i.e. instead of only
gzipping the data in the rdiff-backup-data subdir, why not compress the
data in "destination_directory", too?

The rdiff algorithm implies random access -- if you're going to rewrite
just one block, you don't want to recompress the whole file. Also, some
users (me!) consider having the data be in original form on the backup
server a substantial convenience -- so even if the above issue didn't
apply, we'd still want this feature to be optional.
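To make the random-access point concrete: if a file is compressed as one stream, changing one block means recompressing the whole stream, whereas compressing fixed-size blocks independently lets a single block be rewritten on its own. The sketch below shows only that general idea. It is not how rdiff-backup stores anything, zlib is used as a stand-in for lzo, and the block size and function names are assumptions chosen for illustration.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # hypothetical block size, chosen for illustration


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each fixed-size block of data independently."""
    return [
        zlib.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def read_block(blocks, index):
    """Decompress a single block without touching the others."""
    return zlib.decompress(blocks[index])


def rewrite_block(blocks, index, new_plain):
    """Replace one block in place; only this block is recompressed."""
    blocks[index] = zlib.compress(new_plain)


def decompress_all(blocks):
    """Reassemble the full plaintext from all blocks."""
    return b"".join(zlib.decompress(b) for b in blocks)
```

The price is a somewhat worse compression ratio, because each block is compressed without knowledge of its neighbours, plus the bookkeeping needed to locate blocks whose compressed size changes.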

Post rdiff-backup reliability and lzop compression
On Fri, 27 Jan 2006, roland wrote:

some "real world" experiences? ('.....i'm backing up 2 TB daily and
never had a problem.....')

it's not 2 TB by any means... but these are the average daily stats for
/home backup on my shell server.

--------------[ Average of 32 stat files ]--------------
ElapsedTime 13852.59 (3 hours 50 minutes 52.59 seconds)
SourceFiles 1244847.15625
SourceFileSize 126665014885.0 (118 GB)
MirrorFiles 1243889.65625
MirrorFileSize 126329389861.0 (118 GB)
NewFiles 1663.78125
NewFileSize 490312504.406 (468 MB)
DeletedFiles 706.28125
DeletedFileSize 186038110.812 (177 MB)
ChangedFiles 11117.75
ChangedSourceSize 11392924536.5 (10.6 GB)
ChangedMirrorSize 11361573906.6 (10.6 GB)
IncrementFiles 13503.6875
IncrementFileSize 348486468.125 (332 MB)
TotalDestinationSizeChange 684111491.594 (652 MB)
Errors 0

so that's an average of 652 MB for a daily increment on a 118 GB /home partition.
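As a quick sanity check on those averages, the arithmetic can be redone in a few lines. The constants are copied from the stat output above; the helper names are made up, and the "MB" in the report is assumed to mean 2^20 bytes, which is what makes the figures line up.

```python
# Figures copied from the averaged statistics above (bytes and seconds).
ELAPSED_SECONDS = 13852.59
SOURCE_FILE_SIZE = 126665014885.0
CHANGED_SOURCE_SIZE = 11392924536.5
INCREMENT_FILE_SIZE = 348486468.125
TOTAL_DEST_SIZE_CHANGE = 684111491.594


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def to_mib(n_bytes):
    """Convert bytes to units of 2**20 bytes."""
    return n_bytes / 2 ** 20


def split_seconds(seconds):
    """Split a duration in seconds into (hours, minutes, seconds)."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return int(hours), int(minutes), secs


# Daily growth of the backup destination relative to the source size.
daily_growth = percent(TOTAL_DEST_SIZE_CHANGE, SOURCE_FILE_SIZE)

# Size of the stored increments relative to the size of the changed files.
increment_ratio = percent(INCREMENT_FILE_SIZE, CHANGED_SOURCE_SIZE)

if __name__ == "__main__":
    print("daily growth: %.2f%% of source" % daily_growth)
    print("increments:   %.2f%% of changed data" % increment_ratio)
    print("destination change: %.0f MiB" % to_mib(TOTAL_DEST_SIZE_CHANGE))
    print("elapsed: %dh %dm %.2fs" % split_seconds(ELAPSED_SECONDS))
```

The destination grows by roughly 0.54% of the source size per day, and the increments come to about 3% of the size of the files that changed.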

the backup runs over dsl (6 Mbps down, 768 kbps up) ... and the backup server is an
old p3 750 with slow raid5.


regarding disk space, one idea comes to mind:

what about storing _all_ of the backup data compressed and adding a
layer of "realtime compression/decompression" - i.e. instead of only
gzipping the data in the rdiff-backup-data subdir, why not compress the
data in "destination_directory", too?

maybe you could do this with a block compression layer below the
filesystem... dunno if there are any of them which are writeable though.

-dean

Post rdiff-backup reliability and lzop compression
dean gaudet <dean-list-rdiff-backup < at > arctic.org>
wrote the following on Sun, 29 Jan 2006 13:13:54 -0800 (PST)

maybe you could do this with a block compression layer below the
filesystem... dunno if there are any of them which are writeable though.

Yes, perhaps Roland could use a compressed filesystem and run
rdiff-backup with --no-compression. This shouldn't be a big deal now
that FUSE is in the Linux kernel.


--
Ben Escoto

Post rdiff-backup reliability and lzop compression
roland wrote:

some "real world" experiences? ('.....i'm backing up 2 TB daily and
never had a problem.....')

We have been using rdiff-backup for 3 months now.
We run rdiff-backup with 1.2 TB from server1 to server2 and the same
amount from server2 to server1, both starting at the same time.
Depending on the number of new, deleted and changed files, the backup takes
from 2 hours (nothing changed) to 24 hours (ca. 70 GB changed) on a PIII-800.

Our last full backup to AIT-2 tapes took more than 48 hours.

After the first backup the bottleneck is the CPU, so using
--ssh-no-compression is faster for us on a 1 Gbit link between the servers.

All our restores were successful.

Carsten
