Alan <alan < at > ufies.org> wrote the following on Fri, 9 Jan 2004 13:35:37 -0800:
However, I found a bit of a gotcha. When I moved from tar/scp to
rdiff-backup I was dumping my database every night to a .sql file and
then bzipping it and including that in my nightly tar. When I moved to
rdiff-backup I left it like that until I realized that because of the
bzip the .sql file was completely different each time, so the entire
file was transferred as an increment. When I removed the bzip part of
the process the base file was larger, but the increments were much
smaller because they were simply text diffs of new/changed data, not a
binary diff of an entirely changed file. Something to think about
anyway.
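
A rough way to see the effect described above, as a sketch only: it does not
call rdiff-backup or librsync. The fake dump, the made-up `users` table and the
crude block matcher `reusable_fraction` are all assumptions for illustration.
It estimates how much of tonight's file could be rebuilt from blocks of last
night's file, once for the plain dump and once for the bzip2'd dump.

```python
import bz2
import hashlib

BLOCK = 16  # block size for the crude matcher below (an arbitrary choice)


def make_dump(row_ids):
    """Build a fake SQL dump, one INSERT per row (hypothetical schema)."""
    lines = []
    for i in row_ids:
        token = hashlib.sha256(str(i).encode()).hexdigest()[:32]
        lines.append(f"INSERT INTO users VALUES ({i}, '{token}');\n")
    return "".join(lines).encode()


def reusable_fraction(old, new):
    """Fraction of `new` covered by BLOCK-sized pieces of `old`.

    Old blocks are taken at fixed offsets and the new data is scanned at
    every offset, loosely in the spirit of rsync-style delta tools.
    """
    known = {old[i:i + BLOCK] for i in range(0, len(old) - BLOCK + 1, BLOCK)}
    matched = 0
    i = 0
    while i + BLOCK <= len(new):
        if new[i:i + BLOCK] in known:
            matched += BLOCK
            i += BLOCK
        else:
            i += 1
    return matched / len(new)


old_dump = make_dump(range(1000))
# Next night: one new row shows up in the middle of the dump.
new_dump = make_dump(list(range(500)) + [5000] + list(range(500, 1000)))

plain = reusable_fraction(old_dump, new_dump)
packed = reusable_fraction(bz2.compress(old_dump), bz2.compress(new_dump))
print(f"plain .sql: {plain:.0%} reusable")
print(f"bzip2'd   : {packed:.0%} reusable")
```

The plain dump should come out almost entirely reusable, while the compressed
one shares next to nothing with its predecessor, which is why the whole file
ends up in the increment.
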
This is a tough problem. The xdelta program (similar to rdiff) would
decompress the files to better find the differences. But that leads
to its own problems because some files get really huge when you
decompress them...
I think there is a patch to gzip floating around that adds an option
to reset the buffer at certain clever intervals. The end result is
that similar data gzipped stays similar---one extra byte at the
beginning doesn't result in two totally separate gzip archives.
Perhaps eventually that patch will become standard, and programs that
compress changing files will use diff-friendly compression.
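
To illustrate the idea of resetting the compressor at content-chosen points,
here is a minimal sketch using Python's zlib. It is not the gzip patch itself:
the boundary rule (`is_boundary`, a CRC of each line), the fake dump and the
function names are assumptions made up for this example. After each boundary
the compressor is flushed with Z_FULL_FLUSH, so the bytes that follow depend
only on the data after that point.

```python
import hashlib
import zlib


def is_boundary(line):
    """Content-defined cut: roughly one line in 16 ends a segment."""
    return zlib.crc32(line) % 16 == 0


def resettable_deflate(data):
    """Deflate `data`, resetting the compressor after each boundary line.

    Returns the list of compressed segments; joined together they form one
    ordinary raw-deflate stream.
    """
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # raw deflate, no checksum
    segments = []
    pending = b""
    for line in data.splitlines(keepends=True):
        pending += comp.compress(line)
        if is_boundary(line):
            # Z_FULL_FLUSH byte-aligns the output and forgets earlier input.
            segments.append(pending + comp.flush(zlib.Z_FULL_FLUSH))
            pending = b""
    segments.append(pending + comp.flush(zlib.Z_FINISH))
    return segments


def make_dump(row_ids):
    """Build a fake SQL dump, one INSERT per row (hypothetical schema)."""
    lines = []
    for i in row_ids:
        token = hashlib.sha256(str(i).encode()).hexdigest()[:32]
        lines.append(f"INSERT INTO users VALUES ({i}, '{token}');\n")
    return "".join(lines).encode()


old_data = make_dump(range(1000))
new_data = make_dump([5000] + list(range(1000)))  # one extra row at the start

old_segments = resettable_deflate(old_data)
new_segments = resettable_deflate(new_data)
shared = set(old_segments) & set(new_segments)
print(f"{len(shared)} of {len(old_segments)} compressed segments unchanged")
```

Because the cut points are chosen from the content rather than from byte
offsets, the extra row at the beginning only disturbs the segment it lands in;
the later segments compress to the same bytes as before, just shifted, which is
what a delta tool needs.
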
--
Ben Escoto
