Posted by Robert Nichols 
Hard link issues (was: Incredibly slow i/o to NAS server)
May 28, 2017 04:50PM
On 12/05/2016 03:26 PM, Robert Nichols wrote:
> On 12/05/2016 09:28 AM, Joe Steele wrote:
>> I believe the hard link problems you describe are identified in this bug report (that I submitted in 2009):
>> http://savannah.nongnu.org/bugs/?func=detailitem&item_id=26848
>> Attached to the report are patches to fix the issue. Some distros (which I think include Debian, Ubuntu, & Suse) have since been including the patches as part of their packaging.
> Those patches look like they address the problems with checksums being misplaced or lost, but they don't appear to have anything to do with what I see as the greater problem of sets of hard links being broken up and inconsistencies between the hard link counts in the metadata and the mirror. I finally got rdiff-backup to build (needed a patch for librsync >= 1.0.0), so I can do some testing later.

Well, after two months of use I can confirm that those patches _do_ appear to solve the problem of hard links being broken up or inconsistent. I still don't understand how, but it works. My hard link audit hasn't found any issues since I installed those patches.
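For anyone wanting to run a similar check, here is a minimal sketch of a hard link audit in Python. It is not the audit script mentioned above, whose details aren't given here; the function names and the two-directory layout (a source tree and its mirror) are assumptions made for illustration. It groups regular files by (device, inode) and reports any set of hard-linked paths in the source that is not reproduced as an identical set in the mirror.

```python
import os
import stat
from collections import defaultdict


def find_hard_link_sets(root):
    """Group regular files under root that share an inode.

    Returns a set of tuples, each holding the sorted relative paths
    of one hard link set (only sets with 2+ paths inside root).
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                key = (st.st_dev, st.st_ino)
                groups[key].append(os.path.relpath(path, root))
    return {tuple(sorted(paths)) for paths in groups.values() if len(paths) > 1}


def broken_link_sets(source_root, mirror_root):
    """Return source hard link sets that the mirror does not reproduce exactly."""
    missing = find_hard_link_sets(source_root) - find_hard_link_sets(mirror_root)
    return sorted(missing)
```

Calling broken_link_sets("/data", "/backup/data") (both paths hypothetical) returns a list of path tuples; an empty list means every hard link set found in the source is linked the same way in the mirror. This only compares what is on disk, so it says nothing about the link counts recorded in the backup's own metadata.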

Handling the hard links correctly does aggravate the problem of tiny "zero-diff" files being created, mainly whenever a file's hard link set changes, but also under some other, hard-to-pin-down conditions. In a ~500GB archive I found I had accumulated over a million (10^6) such files, using ~4.3GB of space. I now have a new audit that gets rid of those. I've really got to make another stab at learning Python so that I can fix those problems at the source. So far, I've only succeeded in being devoured by a snake.

