New feature: --verify-full
I have implemented a --verify-full option, which verifies all repository data, including almost everything in rdiff-backup-data (notable exclusion: backup.log). In addition to doing a more comprehensive verification, it should be faster than the existing --verify-at-time=<something other than now>.

There are two new options:
--verify-full
--verify-full-since=<time>

The first does a full verify of all data in the repository. The second verifies all data written since the given time. For example:

rdiff-backup --verify-full-since=3D

would verify all data written to the repository in the past three days and all files in the current mirror regardless of when they last changed.
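
A minimal sketch of the selection rule described above, not the code from the patch. It assumes the interval letters s, m, h, D, W, M, Y as documented for rdiff-backup time strings, with a month taken as 30 days and a year as 365 days. The names interval_to_seconds and needs_verify are hypothetical.

```python
import re

# Seconds per unit letter (assumption: month = 30 days, year = 365 days).
_UNIT_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 3600,
    "D": 86400,
    "W": 7 * 86400,
    "M": 30 * 86400,
    "Y": 365 * 86400,
}


def interval_to_seconds(interval):
    """Convert an interval such as '3D' or '1W2D' to a number of seconds."""
    pairs = re.findall(r"(\d+)([smhDWMY])", interval)
    # Reject strings that contain anything besides number/unit pairs.
    if not pairs or "".join(n + u for n, u in pairs) != interval:
        raise ValueError("bad interval: %r" % (interval,))
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in pairs)


def needs_verify(written_at, in_current_mirror, cutoff):
    """Hypothetical predicate: current mirror files are always verified;
    anything else is verified only if it was written at or after cutoff."""
    return in_current_mirror or written_at >= cutoff
```

With cutoff = time.time() - interval_to_seconds("3D"), an increment written a week ago would be skipped, while a mirror file last changed a year ago would still be checked.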

In addition to the new feature, I have also implemented a small script that will create integrity data for an existing repository. The script has some caveats: (a) it does not detect corruption in your current repository; it only creates integrity data for the repository as it exists when you run the script; (b) it creates one large integrity data file, which means that --verify-full-since with a time on or before the date of that integrity file will verify all files in the repository, even if they were written to the repo before the given time.
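
To illustrate caveat (a), here is a minimal sketch of what creating integrity data after the fact amounts to. It is not the attached script: the in-memory manifest, the choice of SHA-256, and the names file_digest, build_manifest and verify_manifest are all assumptions made for the example.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Record a digest for every regular file under root, as it exists now."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def verify_manifest(root, manifest):
    """Return the relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_digest(full) != expected:
            bad.append(rel)
    return bad
```

A file that was already damaged when build_manifest ran gets the digest of its damaged contents, so a later verify_manifest reports it as fine. Only changes made after the manifest was created can be detected.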

I implemented this against CVS HEAD, although I may backport it to 1.2.8 because I'm feeling a little uncomfortable using the dev version on a production system. It seems like there have been a lot of changes, some of which do not appear to be finished (unicode support being one). The attached patch was generated by git. Let me know if something different is needed to apply to CVS.
~ Daniel

Here's the patch for 1.2.8


~ Daniel

I have implemented a --verify-full option... <snip>

Here's the patch for 1.2.8

OK, after testing this on a large backup over the weekend I have to say this is definitely not production ready yet. I'm seeing some very strange errors which appear to be related to metadata writing problems. The metadata writer seems to be garbling data as it writes. I haven't found the source of that bug yet... I didn't even think I did anything that would have affected metadata writing. I've included an example of the corruption at the end of this message. The problem only appears about 80% of the way through a 100+ GB backup, which is even more frustrating (why does it start at that point?)

In general, I'm disappointed with the quality of the code in rdiff-backup. It is very fragile. Some things that contribute to this: Frequent use of global variables makes it hard to know what will be impacted when editing the code. Non-standard use of classes (static.MakeClass) makes subclassing to reduce duplication difficult. There's a lot of code duplication with very small differences, which makes it difficult to add enhancements and/or refactor the code. One specific example of this is GzipFile write buffering. This should be done in one place (probably rpath.GzipFile). Instead, it's done in at least two places, and each one is implemented a little bit differently.
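
As a sketch of what "done in one place" could look like, the following wraps Python's standard gzip module in a single small writer that owns the buffering. This is not rdiff-backup code and not a drop-in for rpath.GzipFile; the class name BufferedGzipWriter and the 64 KiB threshold are assumptions for illustration.

```python
import gzip


class BufferedGzipWriter:
    """Collect small writes and pass them to the gzip stream in larger chunks."""

    def __init__(self, path, buffer_size=64 * 1024):
        self._file = gzip.open(path, "wb")
        self._chunks = []
        self._size = 0
        self._buffer_size = buffer_size

    def write(self, data):
        # data must be bytes; flush once enough has accumulated.
        self._chunks.append(data)
        self._size += len(data)
        if self._size >= self._buffer_size:
            self.flush()

    def flush(self):
        # Hand the pending bytes to the gzip stream in one call.
        if self._chunks:
            self._file.write(b"".join(self._chunks))
            self._chunks = []
            self._size = 0

    def close(self):
        self.flush()
        self._file.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False
```

Every caller that needs buffered compressed output would go through this one class, so a fix to the flush logic applies everywhere at once.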

I'm getting discouraged with this modification, which I thought should have been fairly straightforward. Especially now that I'm seeing big-time corruption in places I didn't even think I'd touched. I would be grateful if someone would review my patch and tell me if I've overlooked something simple.

~ Daniel


Example of metadata corruption:

File ODUsers/daniel/Code/PyOE/temp/ReportLab-2.3/src/rl_addons/renderPM/libart_lgpl/NEWS
18ffa6cb324e4df2pae3dda71266e9c7b76756c2ator:00000000|type:00000000|location:0,0|flags3
94a474555c301ffc3fa21c2d5beb 42554
ResourceFork None
e 32a33c31becg318fa5d7460ae2a5039a209d1e 1253799928
Uid 1041
Unad 1041Came.se.inPyOE/temp/ReportLab-76 staff
Permissions 420
File ODUsers/daniel/Code/PyOE/temp/ReportLab-2.3/src/rl_addons/renderPM/libart_lgpl/README.CVS
6ad5343d4b3f2bd538dd6a4dc252aa3ourca89c2ator:00000000|type:00000000|location:0,0|flags3
94a474555c301ffc3fa21c2d5beb 42554
ResourceFork None
e 32a33c31becg318fa5d7460ae2a5039a209d1e 1253799928
Uid 1041
UnatVS
e daniel
Gid 20
Gname staf8/raff
Permissions 420
