Hello,
Thanks for your analysis and discussion of the problem. Please see my
comments below.
On Wednesday 14 March 2007 03:36, Troy Daniels wrote:
When a verify job is scheduled under 2.0.2 to perform a Volume to
Catalog verify it now selects the job to be verified (and the associated
tape(s)) at the scheduled start time. It then waits until all jobs of
lower priority finish before performing the actual verify. This results
in it verifying the previous night's backup instead of the current night's
one.
Under 1.38.5 the job wouldn't select the job to verify until it actually
ran (Actual start time instead of scheduled start time). In my case this
is after the job I actually want to verify has finished running for the
evening.
OK, thanks. This time, the problem is crystal clear. The above describes
what I need to know/understand.
I haven't looked at the code, but I suspect that some of the startup code
was moved earlier in the process. In general this was done so that the
catalog entries are more complete if the job is subsequently cancelled
before it is actually started.
In this case, if this is what happened, I can see that this will create a
problem for you. However, it does seem to make sense that the Verify job
would select the job to be verified when Verify is scheduled rather than
at some later time when additional jobs may have run.
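To make the timing difference concrete, here is a minimal sketch in Python. It
is not Bacula code: the BackupRun record, the last_finished helper, the job ids
and the times are all hypothetical, and it only models the behaviour as
described above (target chosen at the scheduled time versus at the actual
start time).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BackupRun:
    job_id: int    # hypothetical catalog job id
    name: str      # job resource name
    end_time: int  # minutes since midnight of the first day (illustrative unit)


def last_finished(runs: List[BackupRun], name: str, at_time: int) -> Optional[BackupRun]:
    """Return the most recent run of `name` that had finished by `at_time`."""
    done = [r for r in runs if r.name == name and r.end_time <= at_time]
    return max(done, key=lambda r: r.end_time, default=None)


# Hypothetical timeline: last night's backup ended at 01:30 (t=90);
# tonight's backup ends at 01:40 the following morning (t=1540).
runs = [
    BackupRun(job_id=101, name="Backup", end_time=90),
    BackupRun(job_id=102, name="Backup", end_time=1540),
]

scheduled_time = 23 * 60 + 15  # verify is scheduled for 23:15
actual_start_time = 1545       # verify really starts after the backup is done

# Behaviour reported for 2.0.2: target picked at the scheduled time.
early = last_finished(runs, "Backup", scheduled_time)
# Behaviour reported for 1.38.x: target picked when the verify actually starts.
late = last_finished(runs, "Backup", actual_start_time)

print(early.job_id)  # 101, the previous night's backup
print(late.job_id)   # 102, the backup that just finished
```

With selection at the scheduled time, the run that is still in progress at
23:15 cannot be chosen, which matches the symptom reported above.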
I'm not sure what the solution is. It seems there are two ways to resolve
it: 1. Start the Verify immediately after the backup (probably with a
RunScript). 2. Put the code back the way it was in 1.38.11 (assuming that
is the problem).
Do you or anyone else on this list have any comments on the above two
possible solutions?
Well I've already updated my config to implement option 1 as a
'workaround', and am happy to trigger my verifies that way. I've scheduled
a 'TriggerVerify' admin job that will run after the backup is complete and
launch the verify. I used this as I believe using a RunScript within the
backup job wouldn't work as the backup job will still be 'running' when the
verify job is scheduled, so it'll still automatically select the previous
night's job.
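The message does not say how the 'TriggerVerify' job launches the verify. One
possible approach, sketched here as an assumption, is for the Admin job to run
a small script that pipes a "run" command into bconsole. The job name
"VerifyBackup" and the helper names are hypothetical, and bconsole is assumed
to be on the PATH and already configured to reach the Director.

```python
import subprocess


def build_run_command(job_name: str) -> str:
    """Build the console command that starts `job_name` without prompting."""
    return f"run job={job_name} yes\n"


def trigger_verify(job_name: str, bconsole: str = "bconsole") -> int:
    """Pipe the run command into bconsole and return its exit status.

    `bconsole` is the path to the console binary (an assumption; adjust it
    for the local installation).
    """
    result = subprocess.run(
        [bconsole],
        input=build_run_command(job_name),
        text=True,
        check=False,
    )
    return result.returncode


# Example use from the script the Admin job runs (job name is hypothetical):
#   raise SystemExit(trigger_verify("VerifyBackup"))
```

Because the verify is only queued once the Admin job itself runs, the job to
be verified is selected after the backup has completed.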
You mention below that you are considering looking over the documentation.
The above, IMO, regardless of how the code evolves, would be a very valuable
contribution to the documentation, because as you indicate, it gives
additional flexibility.
Also, I personally feel the current implementation isn't as intuitive as
how it worked in 1.38.11 - however this can probably be dealt with by
appropriate documentation within the manual.
Yes, I agree with you about the current implementation not being very
intuitive. I think the "correct" solution is to put the code back as it was
in 1.38.11, and document the potential problems with that (not the problems
with the current "broken" code) as well as document the Admin workaround job.
I also think that for a longer term, "cleaner" solution it might be a good
idea to add a RunAfterJob, which runs one or more jobs after the current job,
and by "runs", I mean it starts the job within Bacula, not through a script
as RunScript does. This, however, needs to be carefully examined -- Eric,
what do you think?
It comes down to the difference between scheduled time and start time under
Bacula, especially when you throw priorities into the mix. You might
schedule a job to start at 23:15, but you can stop it running until after
the backup job finishes using priorities. Seems to me that if you've set up
your job that way, you are most likely going to want to verify the job that
just ran.
What I would *really* like to see is the ability to set which job gets
verified by jobid - Currently I can say "Verify job 'BackupCatalog'" but I
can't say "Verify last night's/week's/month's/whatever 'BackupCatalog' job".
As a longer term project, I might consider this, but if you do submit a
Feature Request, you will need to be *very* explicit about the syntax -- for
example "verify last night's BackupCatalog job" seems to me rather difficult
to parse, and the syntax would probably need to be very precise (a good flow
state chart would help). There are also problems with what happens if the job
runs into
the morning ...
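As an illustration of how precise such a rule would have to be, here is a
minimal sketch in Python of one possible meaning for "last night's job". It is
purely hypothetical and not a proposal for Bacula syntax: the select_job
helper, the 24-hour window, the job ids and the choice to key on start time are
all assumptions made for the example.

```python
from datetime import datetime, timedelta


def select_job(jobs, name, now, window=timedelta(hours=24)):
    """Pick the newest job called `name` that started within `window` of `now`.

    Keying on the start time means a backup that began at 23:00 and ran
    into the morning still counts as "last night's" job.
    """
    candidates = [
        j for j in jobs
        if j["name"] == name and now - window <= j["start"] < now
    ]
    return max(candidates, key=lambda j: j["start"], default=None)


# Hypothetical catalog entries.
jobs = [
    {"jobid": 201, "name": "BackupCatalog", "start": datetime(2007, 3, 12, 23, 0)},
    {"jobid": 202, "name": "BackupCatalog", "start": datetime(2007, 3, 13, 23, 0)},
]

picked = select_job(jobs, "BackupCatalog", now=datetime(2007, 3, 14, 9, 0))
print(picked["jobid"])  # 202
```

Even this small rule has to decide between start time and end time and fix a
window length, which is the kind of detail a Feature Request would need to
spell out.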
This would aid in resource scheduling IMO - in my case my full backups take
up most of Saturday, and there is barely enough time to perform the
verifies before Saturday night's incremental backups start - it would be
nice to be able to schedule the verify jobs for Sunday when the tape drive
is otherwise idle.
Again, the problem here is how to simply and precisely specify such actions.
A nice side effect of the changed behaviour is that I can now achieve this.
This probably needs to be a feature request, doesn't it - or is there one in
already that covers it?
I don't recall any existing feature request in this area -- all the ones I
have received to date (with the possible exception of the NDMP request) are
in the current 2.0.3 projects file.
Just had another thought - is this change limited to VolumeToCatalog
verifies, or was it applied across the board to all verify job types? Seems
to me, that it would be a lot easier to overlook the fact that your verify
job is actually verifying the previous night's backup if tapes aren't
involved.
Good question. I'll look at this when I am looking at the problem.
If I get time today I'll look into the documentation and see if I can come
up with a suitable addition explaining this situation. I might even throw
in a feature request if one doesn't already exist.
Yes, thanks, as noted above, suggestions for the documentation would be
welcome.
In order to be sure this bug is properly fixed (one way or another), would you
please submit a bug report containing the first two paragraphs quoted above
from your previous email?
Thanks,
Kern
