On 2018-08-13 11:44, Conrad L. Macina wrote:
Thanks much for your reply. :) This indeed corroborates things.
George
>> When running an incremental or numeric backup, I thought NetWorker walked
>> the file system (or named path/saveset) to generate a list of what files
>> will be backed up.
>> 1. Is this true?
> Yes.
>
>> 2. If so, does it also do this when running a full?
> I believe it does, but the "walk" takes less time because NetWorker doesn't
> have to do any time comparisons to determine whether or not to back up each
> individual file.
>
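To make the full vs. incremental "walk" concrete, here is a minimal Python sketch. It is not NetWorker code: the function name build_work_list and the `since` parameter are made up for illustration, and it compares only mtime, which is an assumption (the reply above just says "time comparisons"). With since=None every file is selected, as in a full; otherwise each file costs one extra comparison.

```python
import os
from typing import Dict, Optional, Tuple


def build_work_list(root: str, since: Optional[float] = None) -> Dict[str, Tuple[int, float]]:
    """Walk root and return {path: (size, mtime)} for files selected for backup.

    since=None models a full backup: every file found is selected.
    Otherwise only files with mtime newer than `since` are selected
    (hypothetical rule; the real selection logic is not documented here).
    """
    work: Dict[str, Tuple[int, float]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # inode metadata only, no file contents read
            except OSError:
                continue  # file vanished while we were walking
            if since is None or st.st_mtime > since:
                work[path] = (st.st_size, st.st_mtime)
    return work
```

Calling build_work_list("/data") would model a full, and build_work_list("/data", since=last_backup_time) an incremental, where last_backup_time is a Unix timestamp you supply.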
>> I will sometimes see warning message(s) in the savegroup completion
>> notifications for a pathname/file whose size grew or shrunk during save.
>> It seems, therefore, that the only way it would know this would be to
>> compare the file's size before and after the backup.
> Correct. NetWorker compares the file's size and mtime it found during the
> "walk" against the same values when it's ready to actually write the backup
> to media.
>
>> 3. When it does this comparison does it determine the pre-backup size on
>> the fly, just before it backs up the file? Or does it instead already have
>> its size stored in the listing that it generated when it did the initial
>> walk-through?
> I believe it reads the inode information (or Windows equivalent) to get the
> mtime and size.
>
>> Sometimes there will be a warning message regarding a file that changed
>> during save.
>> 4. How does it determine this change? Does it use some kind of fast
>> checksum like a CRC before/after, or maybe even a secure cryptographic hash
>> (seems that would add a lot of time to the backups)? Or does it simply
>> infer a change if the modtimes/ctimes differ before/after?
> It just looks at the mtime and size at the time of the "walk" vs. those
> values when it's ready to back up the file.
>
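The size-and-mtime comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Snapshot, snapshot and classify_change are hypothetical names, and the labels "grew", "shrunk" and "changed" simply echo the warnings mentioned in this thread rather than NetWorker's exact message text. No checksum or hash is involved, only two stat calls per file.

```python
import os
from typing import NamedTuple, Optional


class Snapshot(NamedTuple):
    size: int
    mtime: float


def snapshot(path: str) -> Snapshot:
    """Read size and mtime from the file's metadata (the inode on Unix)."""
    st = os.stat(path)
    return Snapshot(st.st_size, st.st_mtime)


def classify_change(before: Snapshot, after: Snapshot) -> Optional[str]:
    """Compare the walk-time snapshot with the save-time snapshot.

    Returns a label for the warning, or None if nothing differs.
    """
    if after.size > before.size:
        return "grew"
    if after.size < before.size:
        return "shrunk"
    if after.mtime != before.mtime:
        return "changed"  # same size, but modified in between
    return None
```

You would take one snapshot(path) during the walk and another just before writing the file, then pass both to classify_change.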
>> 5. Does it store the pre-backup times and/or CRCs in the walk-through
>> listing or does it generate those on the fly before/after?
> I'm pretty sure they're generated on the fly. Although the file time and
> size are stored in NetWorker's databases, there would be no value in
> referencing information about prior backups before beginning a new one. It
> only looks at time and size, not CRC.
>
>> Are any of these details documented anywhere?
> I think the documents with details of the inner workings of NetWorker are
> accessible only to EMC employees. You might find some hints in the Technical
> Overview document for the version you're running, particularly if the
> process changed since the prior version. Here's the one for 9.2:
> https://support.emc.com/docu87149_DELl_EMC_NETWORKER_9.2_-_TECHNICAL_OVERVIEW.pdf?language=en_US&language=en_US
>
> ----
>
> Your second message is related so I'll address the questions here.
>
>> If a file is added or modified after a backup starts but before any data
>> is sent, say while it's still walking the file system, then it's anyone's
>> guess whether it will get backed up on that backup, or instead the next
>> one, as there's no way to know how far it's into its walk-through, i.e.
>> whether it's already past that point.
>> Is that right?
> Yes.
>
>> So let's say the file system has a small number of inodes in use, and a
>> file is copied there or created just after NetWorker starts its walking,
>> then it would be much more likely that the file would not get backed up
>> this time as the walk-through might be very fast, most likely completing
>> before the file could be created or copied there, maybe?
>> Alternatively, if a very large number of inodes are in use then the
>> probability of it not reaching that point in its walk-through before the
>> file is added/copied there would be higher, albeit maybe detecting that
>> the file grew or changed during the save. Something like that?
>> I know there are a lot of factors that could come into play, but is that
>> the general idea?
> Basically, what you're saying here is:
> 1. The longer it takes NetWorker to "walk" the file system, the more
> probable it is that a file created or changed during the "walk" will be
> processed in that backup
> 2. A backup with few changed files will have a shorter "walk" and therefore
> a lower probability of reaching the file after it changed
> 3. Conversely, a backup with many changed files will be more likely to reach
> the file after it is created or changed.
> All this is correct, but you have to be aware that we're dealing in
> probabilities. It's not something you can count on.
>
>> I often see files whose ctimes (Linux) are newer than the start time of
>> the last incremental, but older than the completion time. These get
>> captured on the next incremental. I've always inferred that either they
>> weren't there when NetWorker walked the file system on the previous
>> backup, or they were, but one or more file attributes (e.g. permissions,
>> owner, group, etc.) was changed after the walk-through completed, or after
>> it passed that point where the file lived, thus being left off that backup
>> and not captured until the next one.
> This is true for a file that was created after NetWorker "walked" the file
> system and before it did the backup. The file won't be in the work list so
> it won't get backed up. But if NetWorker encountered a file during the
> "walk" and the file had changed before it got around to backing it up,
> NetWorker will back up the modified file and generate a warning message.
>
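The two cases just described (a file created after the walk vs. a file modified after the walk) can be reproduced with a small two-phase simulation. This is a sketch of the behavior as explained in this thread, not of NetWorker internals; walk, save and demo are hypothetical names, and the "backup" is just a list of paths.

```python
import os
import tempfile


def walk(root):
    """Phase 1: record (size, mtime_ns) for every file found under root."""
    found = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            found[path] = (st.st_size, st.st_mtime_ns)
    return found


def save(work_list):
    """Phase 2: 'back up' each listed file, warning if it changed since the walk."""
    backed_up, warnings = [], []
    for path, seen in work_list.items():
        st = os.stat(path)
        if (st.st_size, st.st_mtime_ns) != seen:
            warnings.append(path)  # still backed up, but flagged
        backed_up.append(path)
    return backed_up, warnings


def demo():
    with tempfile.TemporaryDirectory() as root:
        existing = os.path.join(root, "existing.dat")
        late = os.path.join(root, "late.dat")
        with open(existing, "w") as f:
            f.write("before")
        work_list = walk(root)
        # Between the walk and the save: one file is modified, one is created.
        with open(existing, "a") as f:
            f.write(" and after")
        with open(late, "w") as f:
            f.write("created after the walk")
        backed_up, warnings = save(work_list)
        return (existing in backed_up, existing in warnings, late in backed_up)
```

demo() returns a tuple of three booleans: whether the modified file was backed up, whether it raised a warning, and whether the late file was backed up. Under this model the late file is never in the work list, so it waits for the next backup.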
>
> DISCLAIMER: All this is based on my understanding, which may be incomplete,
> inaccurate or out of date.
>
>
>
> From: EMC Data Protection Q & A
> [mailto:EMC-DATAPROTECTION-L@LISTSERV.TEMPLE.EDU] On Behalf Of
> EMC-DATAPROTECTION-L automatic digest system
> Sent: Saturday, August 11, 2018 12:00 AM
> To: EMC-DATAPROTECTION-L@LISTSERV.TEMPLE.EDU
> Subject: EMC-DATAPROTECTION-L Digest - 7 Aug 2018 to 10 Aug 2018 (#2018-37)
> Table of contents:
> • Some questions on shrunk, grew and changed files?
> • Adding a file after backups start?
> 1. Some questions on shrunk, grew and changed files?
> o Some questions on shrunk, grew and changed files? (08/10)
> From: George Sinclair - NOAA Federal <george.sinclair@NOAA.GOV>
> 2. Adding a file after backups start?
> o Adding a file after backups start? (08/10)
> From: George Sinclair - NOAA Federal <george.sinclair@NOAA.GOV>
--
George Sinclair
Voice: (301) 713-4921
- The preceding message is personal and does not reflect any official or unofficial position of the United States Department of Commerce -
- Any opinions expressed in this message are NOT those of the US Govt. -