Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG
Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
After the recent thread on bad md5sum file names, I ran a check on all
my 1.1 million cpool files to check whether the md5sum file names are
correct.

I got a total of 71 errors out of 1.1 million files:
- 3 had data in them (though each file was only a few hundred bytes
  long)

- 68 of the 71 were *zero* sized when decompressed:
  - 29 were 8 bytes long, corresponding to zlib compression of a zero
    length file
  - 39 were 57 bytes long, corresponding to a zero length file with an
    rsync checksum
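
As a sanity check on the 8-byte figure, here is a minimal Python sketch (Python is my choice for illustration; nothing in the thread uses it). With default settings, zlib-compressing empty input should give an 8-byte stream: a 2-byte header, a 2-byte empty deflate block, and a 4-byte Adler-32 trailer. The 57-byte variant with an rsync checksum is BackupPC-specific and is not reproduced here. The function names are hypothetical.

```python
import zlib


def compressed_empty_length() -> int:
    """Return the size of a default-level zlib stream that encodes zero bytes."""
    return len(zlib.compress(b""))


def decompresses_to_empty(blob: bytes) -> bool:
    """True if a plain zlib stream decompresses to zero bytes of data."""
    return len(zlib.decompress(blob)) == 0


if __name__ == "__main__":
    print(compressed_empty_length())
    print(decompresses_to_empty(zlib.compress(b"")))
```

A file that is 8 bytes on disk and passes `decompresses_to_empty` matches the first group described above.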

Each such cpool file has anywhere from 2 to several thousand links

The 68 *zero* length files should *not* be in the pool since zero
length files are not pooled. So, something is really messed up here.

It turns out though that none of those zero-length decompressed cpool
files were originally zero length but somehow they were stored in the
pool as zero length with an md5sum that is correct for the original
non-zero length file.

Some are attrib files and some are regular files.

Now it seems unlikely that the files were corrupted after the backups
were completed since the header and trailers are correct and there is
no way that the filesystem would just happen to zero out the data
while leaving the header and trailers intact (including checksums).

Also, it's not the rsync checksum caching causing the problem since
some of the zero length files are without checksums.

Now the fact that the md5sum file names are correct relative to the
original data means that the file was originally read correctly by
BackupPC.

So it seems that for some reason the data was truncated when
compressing and writing the cpool/pc file but after the partial file
md5sum was calculated. And it seems to have happened multiple times
for some of these files since there are multiple pc files linked to
the same pool file (and before linking to a cpool file, the actual
content of the files is compared since the partial file md5sum is not
unique).

Also, on my latest full backup a spot check shows that the files are
backed up correctly to the right non-zero length cpool file which of
course has the same (now correct) partial file md5sum. Though as you
would expect, that cpool file has a _0 suffix since the earlier zero
length is already stored (incorrectly) as the base of the chain.
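
The chain behaviour described here can be sketched as follows. This is an in-memory illustration only, assuming (as the post implies) that candidates are named `digest`, `digest_0`, `digest_1`, and so on, and that content is compared before a link is reused. The dict stands in for the cpool directory and all names are hypothetical.

```python
from typing import Dict, Iterator, Tuple


def chain_names(digest: str) -> Iterator[str]:
    """Yield candidate pool names: digest, digest_0, digest_1, ..."""
    yield digest
    index = 0
    while True:
        yield f"{digest}_{index}"
        index += 1


def link_into_pool(pool: Dict[str, bytes], digest: str, content: bytes) -> Tuple[str, bool]:
    """Walk the chain; reuse an entry with identical content, else add a new one.

    Returns (pool_name, reused).
    """
    for name in chain_names(digest):
        if name not in pool:
            pool[name] = content  # first free slot in the chain
            return name, False
        if pool[name] == content:
            return name, True  # identical content already pooled
    raise AssertionError("unreachable: chain_names never ends")


if __name__ == "__main__":
    # The base of the chain holds the (incorrectly) empty content.
    pool = {"abc123": b""}
    print(link_into_pool(pool, "abc123", b"real file data"))
```

With a truncated entry sitting at the base of the chain, the correct content no longer matches it and ends up one slot further along, which is consistent with the `_0` suffix observed.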

I am not sure what is going on with the other 3 files since I have yet
to find them in the pc tree (my 'find' routine is still running)

I will continue to investigate this but this is very strange and
worrying since truncated cpool files mean data loss!

In summary, what could possibly cause BackupPC to truncate the data
sometime between reading the file/calculating the partial file md5sum
and compressing/writing the file to the cpool?
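
The thread never spells out how the "partial file md5sum" is computed. The sketch below reflects my understanding of the BackupPC 3.x scheme: an MD5 over the file length plus the whole content for small files, and for files above 256 KiB only the first 128 KiB chunk and one later 128 KiB chunk within the first 1 MiB. Treat the constants, the length encoding, and the function names as assumptions, and check them against your version's source before relying on them.

```python
import hashlib

CHUNK = 131072        # 128 KiB (assumed)
SMALL_LIMIT = 262144  # 256 KiB (assumed)
WINDOW = 1048576      # 1 MiB (assumed)


def partial_md5(data: bytes) -> str:
    """Assumed BackupPC 3.x style pool digest of uncompressed content."""
    size = len(data)
    if size == 0:
        return ""  # zero-length files are not pooled, so they get no digest
    md5 = hashlib.md5()
    md5.update(str(size).encode("ascii"))  # length is mixed in as decimal text
    if size > SMALL_LIMIT:
        md5.update(data[:CHUNK])
        start = min(size, WINDOW) - CHUNK
        md5.update(data[start:start + CHUNK])
    else:
        md5.update(data)
    return md5.hexdigest()


def name_matches(pool_name: str, data: bytes) -> bool:
    """Compare a pool file name (ignoring any _N chain suffix) with the digest."""
    base = pool_name.split("_", 1)[0]
    return base == partial_md5(data)
```

Under these assumptions, a pool file whose name was derived from the original content but whose stored data decompresses to nothing fails `name_matches`, which is the kind of mismatch reported above.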

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2dcopy1
_______________________________________________
BackupPC-users mailing list
BackupPC-users < at > lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/

Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
Hi,

Jeffrey J. Kosowsky wrote on 2011-10-04 18:58:51 -0400 [[BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG]:
After the recent thread on bad md5sum file names, I ran a check on all
my 1.1 million cpool files to check whether the md5sum file names are
correct.

I got a total of 71 errors out of 1.1 million files:
[...]
- 68 of the 71 were *zero* sized when decompressed
[...]
Each such cpool file has anywhere from 2 to several thousand links
[...]
It turns out though that none of those zero-length decompressed cpool
files were originally zero length but somehow they were stored in the
pool as zero length with an md5sum that is correct for the original
non-zero length file.
[...]
Now it seems unlikely that the files were corrupted after the backups
were completed since the header and trailers are correct and there is
no way that the filesystem would just happen to zero out the data
while leaving the header and trailers intact (including checksums).
[...]
Also, on my latest full backup a spot check shows that the files are
backed up correctly to the right non-zero length cpool file which of
course has the same (now correct) partial file md5sum. Though as you
would expect, that cpool file has a _0 suffix since the earlier zero
length is already stored (incorrectly) as the base of the chain.
[...]
In summary, what could possibly cause BackupPC to truncate the data
sometime between reading the file/calculating the partial file md5sum
and compressing/writing the file to the cpool?

the first and only thing that springs to my mind is a full disk. In some
situations, BackupPC needs to create a temporary file (RStmp, I think) to
reconstruct the remote file contents. This file can become quite large, I
suppose. Independent of that, I remember there is *at least* an "incorrect
size" fixup which needs to copy already written content to a different hash
chain (because the hash turns out to be incorrect *after*
transmission/compression). Without looking closely at the code, I could
imagine (but am not sure) that this could interact badly with a full disk:

* output file is already open, headers have been written
* huge RStmp file is written, filling up the disk
* received file contents are for some reason written to disk (which doesn't
work - no space left) and read back for writing into the output file (giving
zero-length contents)
* trailing information is written to the output file - this works, because
there is enough space left in the already allocated block for the file
* RStmp file gets removed and the rest of the backup continues without
apparent error

Actually, for the case I tried to invent above, this doesn't seem to fit, but
the general idea could apply - at least the symptoms are "correct content
stored somewhere but read back incorrectly". This would mean the result of a
write operation would have to be unchecked by BackupPC somewhere (or handled
incorrectly).
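
To make "unchecked write result" concrete, here is a small Python sketch of the defensive pattern: write to a temporary file, then refuse to use what is read back unless its length matches what was written. This is an illustration of the idea only, not BackupPC's code (which is Perl), and the helper names are hypothetical.

```python
import os
import tempfile


def write_checked(path: str, data: bytes) -> None:
    """Write data and fail loudly if the file on disk is shorter than expected."""
    with open(path, "wb") as fh:
        written = fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())  # force the data to disk so ENOSPC surfaces here
    if written != len(data) or os.path.getsize(path) != len(data):
        raise OSError(f"short write to {path}: expected {len(data)} bytes")


def read_back_checked(path: str, expected_len: int) -> bytes:
    """Read a file back and reject it if the length is not what was written."""
    with open(path, "rb") as fh:
        data = fh.read()
    if len(data) != expected_len:
        raise OSError(f"{path}: read {len(data)} bytes, expected {expected_len}")
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "scratch")
        write_checked(target, b"payload")
        print(read_back_checked(target, 7))
```

Without the length check in `read_back_checked`, an empty temporary file would silently yield zero-length content, which is the symptom under discussion.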

So, the question is: have you been running BackupPC with an almost full disk?
Would there be at least one file in the backup set, of which the
*uncompressed* size is large in comparison to the reserved space (->
DfMaxUsagePct)?

For the moment, that's the most concrete thing I can think of. Of course,
writing to a temporary location might be fine and reading could fail (you
haven't modified your BackupPC code to use a signal handler for some arbitrary
purposes, have you? ;-). Or your Perl version could have an obscure bug that
occasionally trashes the contents of a string. Doesn't sound very likely,
though.

What *size* are the original files?

Ah, yes. How many backups are (or rather were) you running in parallel? No one
said the RStmp needs to be created by the affected backup ...

Regards,
Holger


Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
Holger Parplies wrote at about 17:41:48 +0200 on Wednesday, October 5, 2011:
Hi,

Jeffrey J. Kosowsky wrote on 2011-10-04 18:58:51 -0400 [[BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG]:
After the recent thread on bad md5sum file names, I ran a check on all
my 1.1 million cpool files to check whether the md5sum file names are
correct.

I got a total of 71 errors out of 1.1 million files:
[...]
- 68 of the 71 were *zero* sized when decompressed
[...]
Each such cpool file has anywhere from 2 to several thousand links
[...]
It turns out though that none of those zero-length decompressed cpool
files were originally zero length but somehow they were stored in the
pool as zero length with an md5sum that is correct for the original
non-zero length file.
[...]
Now it seems unlikely that the files were corrupted after the backups
were completed since the header and trailers are correct and there is
no way that the filesystem would just happen to zero out the data
while leaving the header and trailers intact (including checksums).
[...]
Also, on my latest full backup a spot check shows that the files are
backed up correctly to the right non-zero length cpool file which of
course has the same (now correct) partial file md5sum. Though as you
would expect, that cpool file has a _0 suffix since the earlier zero
length is already stored (incorrectly) as the base of the chain.
[...]
In summary, what could possibly cause BackupPC to truncate the data
sometime between reading the file/calculating the partial file md5sum
and compressing/writing the file to the cpool?

the first and only thing that springs to my mind is a full disk. In some
situations, BackupPC needs to create a temporary file (RStmp, I think) to
reconstruct the remote file contents. This file can become quite large, I
suppose. Independent of that, I remember there is *at least* an "incorrect
size" fixup which needs to copy already written content to a different hash
chain (because the hash turns out to be incorrect *after*
transmission/compression). Without looking closely at the code, I could
imagine (but am not sure) that this could interact badly with a full disk:

* output file is already open, headers have been written
* huge RStmp file is written, filling up the disk
* received file contents are for some reason written to disk (which doesn't
work - no space left) and read back for writing into the output file (giving
zero-length contents)
* trailing information is written to the output file - this works, because
there is enough space left in the already allocated block for the file
* RStmp file gets removed and the rest of the backup continues without
apparent error

Actually, for the case I tried to invent above, this doesn't seem to fit, but
the general idea could apply - at least the symptoms are "correct content
stored somewhere but read back incorrectly". This would mean the result of a
write operation would have to be unchecked by BackupPC somewhere (or handled
incorrectly).

So, the question is: have you been running BackupPC with an almost full disk?

Nope - disk has plenty of space...

Would there be at least one file in the backup set, of which the
*uncompressed* size is large in comparison to the reserved space (->
DfMaxUsagePct)?

Nothing large by today's standard - I don't back up any large databases
or video files.


For the moment, that's the most concrete thing I can think of. Of course,
writing to a temporary location might be fine and reading could fail (you
haven't modified your BackupPC code to use a signal handler for some arbitrary
purposes, have you? ;-). Or your Perl version could have an obscure bug that
occasionally trashes the contents of a string. Doesn't sound very likely,
though.

What *size* are the original files?

About half are attrib files of normal directories so they are quite
small. One I just checked was a kernel Documentation file of < 20K


Ah, yes. How many backups are (or rather were) you running in parallel? No one
said the RStmp needs to be created by the affected backup ...

I don't run more than 2-3 in parallel.
And again my disk is far from full (about 60% of a 250GB partition)
and the files with errors so far all seem to be small.

I do have the partition mounted over NFS but I'm now using an updated
kernel on both machines (kernel 2.6.32) so it's not the same buggy
stuff I had years ago with an old 2.6.12 kernel.

But still, I would think an NFS error would trash the entire file, not
just the data portion of a compressed file...

Looking at the timestamps of the bad pool files, the errors occurred in
the Feb-April time frame (note this pool was started in February) and
there have been no errors since then. But the errors are sprinkled
across ~10 different days during that time period. So whatever
happened, happened several times. Now I haven't really changed/added
many files since April other than normal daily logs, mail spools, and
a few files that I have been editing so it could be that the rare event
hasn't occurred because I haven't added many new files to the pool.
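
The "how many distinct days" observation is easy to reproduce from file modification times. A minimal sketch, assuming you already have a list of the bad pool file paths; the function names are hypothetical and days are counted in UTC for simplicity.

```python
import os
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, List


def mtimes_of(paths: Iterable[str]) -> List[float]:
    """Collect modification times (seconds since the epoch) for the given files."""
    return [os.stat(path).st_mtime for path in paths]


def errors_per_day(mtimes: Iterable[float]) -> Counter:
    """Count timestamps per UTC calendar day, keyed by ISO date string."""
    days: Counter = Counter()
    for ts in mtimes:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        days[day] += 1
    return days
```

Calling `errors_per_day(mtimes_of(bad_files))` and counting the keys gives the number of distinct days on which bad files were written.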

Finally, remember it's possible that many people are having this
problem but just don't know it, since the only way one would know
would be if one actually computed the partial file md5sums of all the
pool files and/or restored & tested one's backups. Since the error
affects only 71 out of 1.1 million files it's possible that no one has
ever noticed...

It would be interesting if other people would run a test on their
pools to see if they have similar such issues (remember I only tested
my pool in response to the recent thread of the guy who was having
issues with his pool)...


Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
On Wed, 2011-10-05 at 21:35 -0400, Jeffrey J. Kosowsky wrote:

Finally, remember it's possible that many people are having this
problem but just don't know it, since the only way one would know
would be if one actually computed the partial file md5sums of all the
pool files and/or restored & tested one's backups. Since the error
affects only 71 out of 1.1 million files it's possible that no one has
ever noticed...

It would be interesting if other people would run a test on their
pools to see if they have similar such issues (remember I only tested
my pool in response to the recent thread of the guy who was having
issues with his pool)...

Do you have a script or series of commands to do this check with?

I have access to a couple of backuppc installs of various ages and sizes
that I can test.

--
Tim Fletcher <tim < at > night-shade.org.uk>



Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
Tim Fletcher <tim < at > night-shade.org.uk> wrote on 10/06/2011 05:17:03 AM:

Do you have a script or series of commands to do this check with?

I have access to a couple of backuppc installs of various ages and sizes
that I can test.

Me too, if it can run in a reasonable amount of time. I'd hate to find out during a major restore that something is corrupt.

Tim Massey
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!

http://www.OutOfTheBoxSolutions.com
tmassey < at > obscorp.com
22108 Harper Ave.
St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)
Cell: (586)945-8796

Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
Hi,

Tim Fletcher wrote on 2011-10-06 10:17:03 +0100 [Re: [BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG]:
On Wed, 2011-10-05 at 21:35 -0400, Jeffrey J. Kosowsky wrote:
Finally, remember it's possible that many people are having this
problem but just don't know it,

perfectly possible. I was just saying what possible cause came to my mind (and
many people *could* be running with an almost full disk). As you (Jeffrey)
said, the fact that the errors appeared only within a small time frame may or
may not be significant. I guess I don't need to ask whether you are *sure*
that the disk wasn't almost full back then.

To be honest, I would *hope* that only you had these issues and everyone
else's backups are fine, i.e. that your hardware and not the BackupPC software
was the trigger (though it would probably need some sort of software bug to
come up with the exact symptoms).

since the only way one would know would be if one actually computed the
partial file md5sums of all the pool files and/or restored & tested one's
backups.

Almost.

Since the error affects only 71 out of 1.1 million files it's possible
that no one has ever noticed...

Well, let's think about that for a moment. We *have* had multiple issues that
*sounded* like corrupt attrib files. What would happen, if you had an attrib
file that decompresses to "" in the reference backup?

It would be interesting if other people would run a test on their
pools to see if they have similar such issues (remember I only tested
my pool in response to the recent thread of the guy who was having
issues with his pool)...

Do you have a script or series of commands to do this check with?

Actually, what I would propose in response to what you have found would be to
test for pool files that decompress to zero length. That should be
computationally less expensive than computing hashes - in particular, you can
stop decompressing once you have decompressed any content at all. Sure, that
just checks for this issue, not for possible different ones. On the one hand,
having the *correct* content in the pool under an incorrect hash would not be
a *serious* issue - it wouldn't prevent restoring your data, it would just
make pooling not work correctly (for the files affected). On the other,
different instances of this problem might point toward a common cause. And I
guess it would be possible to have *truncated* data (i.e. not zero-length, but
incomplete just the same) in your files as well.
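
The "stop as soon as any content appears" idea can be sketched in Python as below. It handles a plain zlib stream only; cpool files that carry rsync checksums (the 57-byte case earlier in the thread) are not plain zlib streams, so a real check would need BackupPC's own reader. The function names are hypothetical.

```python
import io
import zlib
from typing import BinaryIO


def stream_is_empty(fh: BinaryIO, chunk_size: int = 65536) -> bool:
    """Return True if the zlib stream read from fh yields no data at all."""
    inflater = zlib.decompressobj()
    while True:
        chunk = fh.read(chunk_size)
        if not chunk:
            break
        # max_length=1: ask for at most one byte of output, enough to decide.
        if inflater.decompress(chunk, 1):
            return False  # some content exists, no need to inflate the rest
        if inflater.eof:
            break
    return not inflater.flush()


def file_is_empty(path: str) -> bool:
    """Convenience wrapper for a file on disk."""
    with open(path, "rb") as fh:
        return stream_is_empty(fh)


if __name__ == "__main__":
    print(stream_is_empty(io.BytesIO(zlib.compress(b""))))
    print(stream_is_empty(io.BytesIO(zlib.compress(b"x" * 100000))))
```

Because only one output byte is ever requested, the cost per file stays small regardless of how large the decompressed content would be. A corrupt stream raises `zlib.error`, which the caller would want to report separately.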

You weren't asking me, but, yes, I wrote a script to check pool file contents
against the file names back in 2007. I'll append it here, but it would really
be interesting to add information on whether the file decompressed to
zero-length. I could easily add the decompressed file length to the output,
but it would make lines longer than 80 characters. Ok, I did that (and added
counting of zero-length files) - please make your terminals at least 93
characters wide :-). I just scanned 1/16th of my pool and found various
mismatches, though none of them zero-length. Probably top-level attrib files.
Link counts might be interesting - I'll add them later.

I have access to a couple of backuppc installs of various ages and sizes
that I can test.

Try something like

BackupPC_verifyPool -s -p

to scan the whole pool, or

BackupPC_verifyPool -s -p -r 0

to test it on the 0/0/0 - 0/0/f pool subdirectories (-r takes a Perl
expression evaluating to an array of numbers between 0 and 255, e.g. "0",
"0 .. 255" (the default), or "0, 1, 10 .. 15, 5"; note the quotes to make your
shell pass it as a single argument). If you have switched off compression,
you'll have to add a '-u' (though I'm not sure this test makes much sense in
that case). You'll want either '-p' (progress) or '-v' (verbose) to see
anything happening. It *will* take time to traverse the pool, but you can
safely interrupt the script at any time and use the range parameter to resume
it later (though not at the exact place) - or just suspend and resume it (^Z).
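
To illustrate how a range value relates to pool subdirectories, here is a Python sketch. It handles only the simple "number" and "a .. b" forms shown in the examples (the real script evaluates an arbitrary Perl expression), and the mapping of a number to the first two directory levels is inferred from the statement that `-r 0` covers 0/0/0 - 0/0/f. Function names are hypothetical.

```python
from typing import List


def parse_range(spec: str) -> List[int]:
    """Expand a spec such as "0, 1, 10 .. 15, 5" into a list of ints in 0..255."""
    numbers: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if ".." in part:
            low, high = (int(piece) for piece in part.split(".."))
            numbers.extend(range(low, high + 1))
        else:
            numbers.append(int(part))
    for number in numbers:
        if not 0 <= number <= 255:
            raise ValueError(f"range value out of bounds: {number}")
    return numbers


def pool_subdirs(number: int) -> List[str]:
    """Map one range value to its 16 third-level pool subdirectories (inferred)."""
    first, second = f"{number:02x}"  # two hex digits give the first two levels
    return [f"{first}/{second}/{third:x}" for third in range(16)]


if __name__ == "__main__":
    for value in parse_range("0, 1, 10 .. 15, 5"):
        subdirs = pool_subdirs(value)
        print(value, subdirs[0], "-", subdirs[-1])
```

Resuming an interrupted scan then amounts to passing a range that starts at the last value that was fully processed.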

You might need to change the 'use lib' statement in line 64 to match your
distribution.

Hope that helps.

Regards,
Holger


Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
On Thu, Oct 6, 2011 at 11:56 AM, Timothy J Massey <tmassey < at > obscorp.com> wrote:
Personally, I feel that compression has no place in backups.  Back when we were highly limited in capacity by terrible analog devices (i.e. tape!) I used it from necessity.  Now, I just throw bigger hard drives at it and am thankful.  :-)


No, it makes perfect sense for backuppc where the point is to keep as much history as possible online in a given space.  If you have trouble with compression, just throw a faster CPU at it.  Just anecdotally, I saw 95% compression recently on a system where someone requested including their web content directory and forgot to mention the 40Gb of log files that happened to be there.

--
   Les Mikesell
     lesmikesell < at > gmail.com

Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
Les Mikesell <lesmikesell < at > gmail.com> wrote on 10/06/2011 01:21:29 PM:

On Thu, Oct 6, 2011 at 11:56 AM, Timothy J Massey <tmassey < at > obscorp.com> wrote:
Personally, I feel that compression has no place in backups. Back
when we were highly limited in capacity by terrible analog devices
(i.e. tape!) I used it from necessity. Now, I just throw bigger
hard drives at it and am thankful. :-)

No, it makes perfect sense for backuppc where the point is to keep
as much history as possible online in a given space.

No, the point of backup is to be able to *restore* as much historical data as possible. Keeping the data is not the important part. Restoring it is. Anything that is between storing data and *restoring* that data is in the way of that job.

Obviously, there *are* things that have to go between it: a filesystem to store the data, for example. But if I can avoid something in between storing my data and using my data, I absolutely will.

Compression falls in that area.

If you have
trouble with compression, just throw a faster CPU at it. Just
anecdotally, I saw 95% compression recently on a system where
someone requested including their web content directory and forgot
to mention the 40Gb of log files that happened to be there.

That's all well and good. My issue is *NOT* performance. Or capacity, for that matter. I'm not saying that there is no value to compression. I'm saying that my objective for a backup server is FIRST to be as simple and reliable as possible, and THEN only to have other features. Features that detract from that first requirement are considered skeptically.

This entire thread is a *PERFECT* example of why I have my reasons. I have avoided an entire category of failure simply by throwing more disk at it (or by having a smaller "window" of backups). Seeing as I have, at a minimum, 4 months of data (with varying gaps between the backups) within the backup server itself, and archive data in long-term storage every three months, I have what I (and my clients) feel to be enough data. Extra capacity would have no value. Extra reliability *always* has value.

YMMV, of course.

Timothy J. Massey
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!

http://www.OutOfTheBoxSolutions.com
tmassey < at > obscorp.com
22108 Harper Ave.
St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)
Cell: (586)945-8796

Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
On Thu, Oct 6, 2011 at 1:04 PM, Timothy J Massey <tmassey < at > obscorp.com> wrote:

On Thu, Oct 6, 2011 at 11:56 AM, Timothy J Massey <tmassey < at > obscorp.com> wrote:

Personally, I feel that compression has no place in backups.  Back
when we were highly limited in capacity by terrible analog devices
(i.e. tape!) I used it from necessity.  Now, I just throw bigger
hard drives at it and am thankful.  :-)

No, it makes perfect sense for backuppc where the point is to keep
as much history as possible online in a given space.


No, the point of backup is to be able to *restore* as much historical data as possible.  Keeping the data is not the important part.  Restoring it is.  Anything that is between storing data and *restoring* that data is in the way of that job.


Obviously, there *are* things that have to go between it:  a filesystem to store the data, for example.  But if I can avoid something in between storing my data and using my data, I absolutely will.

Compression falls in that area.

My experience is that the failures are more likely in the parts underneath storing the data than in the compression process.   Admittedly, that goes all the way back to storing zip files on floppies vs. large uncompressed text files and media reliability has improved a bit.



  If you have
trouble with compression, just throw a faster CPU at it.  Just
anecdotally, I saw 95% compression recently on a system where
someone requested including their web content directory and forgot
to mention the 40Gb of log files that happened to be there.


That's all well and good.  My issue is *NOT* performance.  Or capacity, for that matter.  I'm not saying that there is no value to compression.  I'm saying that my objective for a backup server is FIRST to be as simple and reliable as possible, and THEN only to have other features.  Features that detract from that first requirement are considered skeptically.

Media fails.  Things that reduce the media necessary to hold a given amount of data reduce the chances of failure.  The CPU and RAM can fail too, but if those go you are fried whether you were compressing or not.
 

This entire thread is a *PERFECT* example of why I have my reasons.  I have avoided an entire category of failure simply by throwing more disk at it (or by having a smaller "window" of backups).  Seeing as I have, at a minimum, 4 months of data (with varying gaps between the backups) within the backup server itself, and archive data in long-term storage every three months, I have what I (and my clients) feel to be enough data.  Extra capacity would have no value.  Extra reliability *always* has value.

YMMV, of course.

With compressible data you increase both capacity and reliability by compressing before storage.   There's no magical difference between the reliability of 'cat' vs 'zcat'.  Either one could fail.

-- 

   Les Mikesell
     lesmikesell < at > gmail.com

Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
On Thursday 06 October 2011 20:04:57 Timothy J Massey wrote:
Les Mikesell <lesmikesell < at > gmail.com> wrote on 10/06/2011 01:21:29 PM:
On Thu, Oct 6, 2011 at 11:56 AM, Timothy J Massey <tmassey < at > obscorp.com> wrote:
Personally, I feel that compression has no place in backups. Back
when we were highly limited in capacity by terrible analog devices
(i.e. tape!) I used it from necessity. Now, I just throw bigger
hard drives at it and am thankful. :-)

No, it makes perfect sense for backuppc where the point is to keep
as much history as possible online in a given space.

No, the point of backup is to be able to *restore* as much historical data
as possible. Keeping the data is not the important part. Restoring it
is. Anything that is between storing data and *restoring* that data is in
the way of that job.

Actually the point of a backup is to restore the most recent version of
<something> from just before the trouble (whatever that might be).

Storing or restoring historical data is called an archive. Interestingly most
commercial archive-solutions advertise their (certified) long-term archive but
never the ability to get back that data. Makes you wonder...

Have fun,

Arnold


Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
Hi,

Les Mikesell wrote on 2011-10-06 13:42:09 -0500 [Re: [BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG]:
On Thu, Oct 6, 2011 at 1:04 PM, Timothy J Massey <tmassey < at > obscorp.com> wrote:

No, the point of backup is to be able to *restore* as much historical data
as possible. Keeping the data is not the important part. Restoring it is.
Anything that is between storing data and *restoring* that data is in the
way of that job.
[...]
My experience is that the failures are more likely in the parts underneath
storing the data than in the compression process. Admittedly, that goes
all the way back to storing zip files on floppies vs. large uncompressed
text files and media reliability has improved a bit.
[...]
Media fails. Things that reduce the media necessary to hold a given amount
of data reduce the chances of failure. The CPU and RAM can fail too, but
if those go you are fried whether you were compressing or not.
[...]
With compressible data you increase both capacity and reliability by
compressing before storage. There's no magical difference between the
reliability of 'cat' vs 'zcat'. Either one could fail.

the problem, I believe, is not 'cat' or 'zcat' failing, it's a *media* error,
as you pointed out, rendering a complete compressed file unusable instead of
only the erroneous bytes/sectors. Yes, there are compression algorithms that
are able to recover after an error, but I don't think BackupPC uses any of
these.

Sure, the common case might be losing a complete disk rather than having a few
bytes altered, but in that case, you can either recover from the remaining
disks (presuming you have some form of redundancy), or you lose your complete
pool, whether or not compressed.

While you might reduce the chances of failure with compression, you increase
the impact of failure.
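
The difference in impact can be demonstrated with a short Python sketch: damage one byte of an uncompressed copy and one byte of a zlib-compressed copy, then see how much of the original survives. The helper names are hypothetical, and the one-shot `zlib.decompress` call used here is all-or-nothing; a streaming decoder might salvage the data before the damaged position.

```python
import zlib


def flip_byte(blob: bytes, index: int) -> bytes:
    """Return a copy of blob with every bit of one byte inverted."""
    damaged = bytearray(blob)
    damaged[index] ^= 0xFF
    return bytes(damaged)


def surviving_bytes(original: bytes, damaged: bytes) -> int:
    """Count positions where the damaged copy still matches the original."""
    return sum(a == b for a, b in zip(original, damaged))


def recover_compressed(damaged: bytes) -> bytes:
    """One-shot decompression: returns the data, or nothing if zlib rejects it."""
    try:
        return zlib.decompress(damaged)
    except zlib.error:
        return b""


if __name__ == "__main__":
    text = b"some log line that repeats\n" * 1000
    packed = zlib.compress(text)
    raw_damaged = flip_byte(text, 100)
    packed_damaged = flip_byte(packed, len(packed) // 2)
    print("raw copy, bytes intact:", surviving_bytes(text, raw_damaged), "of", len(text))
    print("compressed copy recovered intact:", recover_compressed(packed_damaged) == text)
```

The raw copy loses exactly the damaged byte, while the compressed copy is not recovered intact, which is the trade-off being described: fewer sectors exposed to failure, but a larger loss per failure.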

This entire thread is a *PERFECT* example of why I have my reasons.

I agree with your reasons, but it remains to be seen whether compression makes
any difference in the context of this thread. But I'll reply to that
separately.

Regards,
Holger


Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
On Thu, Oct 6, 2011 at 5:21 PM, Arnold Krille <arnold < at > arnoldarts.de> wrote:

No, it makes perfect sense for backuppc where the point is to keep
as much history as possible online in a given space.

No, the point of backup is to be able to *restore* as much historical data
as possible.  Keeping the data is not the important part.  Restoring it
is.  Anything that is between storing data and *restoring* that data is in
the way of that job.

Actually the point of a backup is to restore the most recent version of
<something> from just before the trouble (whatever that might be).

Yes, but throw in the fact that it may take some unpredictable amount
of time after the 'trouble' (which could have been accidentally
deleting a rarely used file) before anyone notices and you see why you
need some history available to restore from the version just before
the trouble.

--
Les Mikesell
lesmikesell < at > gmail.com


Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
On Thu, Oct 6, 2011 at 5:42 PM, Holger Parplies <wbppc < at > parplies.de> wrote:

[...]
With compressible data you increase both capacity and reliability by
compressing before storage.   There's no magical difference between the
reliability of 'cat' vs 'zcat'.  Either one could fail.

the problem, I believe, is not 'cat' or 'zcat' failing, it's a *media* error,
as you pointed out, rendering a complete compressed file unusable instead of
only the erroneous bytes/sectors. Yes, there are compression algorithms that
are able to recover after an error, but I don't think BackupPC uses any of
these.

Sure, the common case might be losing a complete disk rather than having a few
bytes altered, but in that case, you can either recover from the remaining
disks (presuming you have some form of redundancy), or you lose your complete
pool, whether or not compressed.

I like RAID1 where you can recover from any single surviving disk.

While you might reduce the chances of failure with compression, you increase
the impact of failure.

Maybe, maybe not. You might find something usable if you scrape some
plain text or maybe even part of a tar file off a disk past a media
error which is pretty hard to do anyway, but most other file types
won't have much chance of working.
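
For comparison, a rough sketch of what "scraping" a damaged compressed file can look like, again assuming a plain zlib stream from Python's zlib module rather than BackupPC's own file format. The helpers salvage and common_prefix_len and the sample data are hypothetical. The stream is decompressed in small chunks and whatever came out before the first error is kept:

```python
import zlib

def salvage(blob, chunk_size=64):
    """Decompress incrementally; keep the output produced before the first error."""
    decoder = zlib.decompressobj()
    recovered = bytearray()
    for start in range(0, len(blob), chunk_size):
        try:
            recovered += decoder.decompress(blob[start:start + chunk_size])
        except zlib.error:
            break  # nothing after the damaged spot can be decoded
    else:
        recovered += decoder.flush()  # stream was intact, collect any remainder
    return bytes(recovered)

def common_prefix_len(a, b):
    """Length of the leading run of bytes on which a and b agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Made-up sample data standing in for a backed-up text file.
original = b"".join(b"record %05d: plain text payload\n" % i for i in range(2000))

damaged = bytearray(zlib.compress(original))
damaged[len(damaged) // 2] ^= 0xFF  # simulate one bad byte on the medium

recovered = salvage(bytes(damaged))
good = common_prefix_len(original, recovered)
print(f"{good} of {len(original)} bytes recovered intact")
```

Only the part before the damage comes back intact. The decoder may also emit a little garbage before it notices the error, which is why the sketch compares against the original instead of trusting everything it recovered.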

--
Les Mikesell
lesmikesell < at > gmail.com


Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
Hi,

Les Mikesell wrote on 2011-10-06 18:17:06 -0500 [Re: [BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG]:
On Thu, Oct 6, 2011 at 5:21 PM, Arnold Krille <arnold < at > arnoldarts.de> wrote:

No, it makes perfect sense for backuppc where the point is to keep
as much history as possible online in a given space.

No, the point of backup is to be able to *restore* as much historical data
as possible.  Keeping the data is not the important part.  Restoring it
is.  Anything that is between storing data and *restoring* that data is in
the way of that job.

Actually the point of a backup is to restore the most recent version of
<something> from just before the trouble (whatever that might be).

Yes, but throw in the fact that it may take some unpredictable amount
of time after the 'trouble' (which could have been accidentally
deleting a rarely used file) before anyone notices and you see why you
need some history available to restore from the version just before
the trouble.

I think you've all got it wrong. The real *point* of a backup is ...
whatever the person doing the backup wants it for. For some people that
might just be being able to say, "hey, we did all we could to preserve the
data as long as legally required - too bad it didn't work out". Usually, it
seems to be sufficient that the data is stored, but some of us really *do*
want to be able to *restore* it, too, while others are doing backups mainly
for watching the progress. Fine. That's what the flexibility of BackupPC is
for, right?

Regards,
Holger


Post Bad md5sums due to zero size (uncompressed) cpool files - WE 
On Thu, Oct 06, 2011 at 05:54:05PM +0200, Holger Parplies wrote:
Try something like

BackupPC_verifyPool -s -p

to scan the whole pool, or

BackupPC_verifyPool -s -p -r 0

to test it on the 0/0/0 - 0/0/f pool subdirectories (-r takes a Perl
expression evaluating to an array of numbers between 0 and 255, e.g. "0",
"0 .. 255" (the default), or "0, 1, 10 .. 15, 5"; note the quotes to make your
shell pass it as a single argument). If you have switched off compression,
you'll have to add a '-u' (though I'm not sure this test makes much sense in
that case). You'll want either '-p' (progress) or '-v' (verbose) to see
anything happening. It *will* take time to traverse the pool, but you can
safely interrupt the script at any time and use the range parameter to resume
it later (though not at the exact place) - or just suspend and resume it (^Z).

You might need to change the 'use lib' statement in line 64 to match your
distribution.

I ran this with -r 0 and got as a summary:

39000 files in 16 directories checked, 4 had wrong digests, of these 0
zero-length.

running it with -r "1,2" now.
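
For anyone planning a scan in batches, here is a small sketch of how the -r values quoted above could be expanded and mapped to pool subdirectories. It is not taken from BackupPC_verifyPool: expand_range and pool_subdirs are hypothetical helpers, the parser only understands the comma and ".." forms from the examples (not arbitrary Perl expressions), and the mapping of a number to its first two directory levels (high and low hex digit) is an assumption. Only the case 0 -> 0/0/0 - 0/0/f is confirmed in this thread.

```python
def expand_range(spec):
    """Expand a spec such as "0, 1, 10 .. 15, 5" into a list of ints, order kept."""
    numbers = []
    for part in spec.split(","):
        if ".." in part:
            low, high = (int(p) for p in part.split(".."))
            numbers.extend(range(low, high + 1))
        else:
            numbers.append(int(part))
    for n in numbers:
        if not 0 <= n <= 255:
            raise ValueError(f"range value out of bounds: {n}")
    return numbers

def pool_subdirs(n):
    """List the 16 third-level pool subdirectories for one range value.

    ASSUMPTION: the value selects the first two levels as its high and low
    hex digit; the thread only confirms this for the value 0.
    """
    return [f"{n >> 4:x}/{n & 0xF:x}/{third:x}" for third in range(16)]

for value in expand_range("0, 1, 10 .. 15, 5"):
    dirs = pool_subdirs(value)
    print(value, dirs[0], "-", dirs[-1])
```

Under that assumption each range value covers 16 directories, which is consistent with the "16 directories checked" in the summary for -r 0.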

--
-- rouilj

John Rouillard System Administrator
Renesys Corporation 603-244-9084 (cell) 603-643-9300 x 111

