| Author |
Message |
dave2
Guest
|
 backups stuck in pending state, group never cancels or completes
Hi,
I've seen an issue at a few sites where backups in a running
group are "stuck" in the "Waiting to Run" queue with a state of active
but never seem to transition to a running state. In addition, the group
doesn't time out as it should. In one case the backup admin noticed only
after 3 days that she hadn't gotten success emails.
Anyone run into this and have any thoughts?
Thanks,
Dave
Dave Gold
Sr. Technical Consultant
Cambridge Computer Services
dgold < at > cambridgecomputer.com
781-250-3260
*Celebrating our 20th Anniversary*
|
| Fri Jun 15, 2012 6:04 am |
|
 |
Matthew Powell
Guest
|
I have seen this on occasion here, but mainly on my Windows server
groups. I have not had it happen on the other flavors of OS. We are
running Legato NetWorker 7.5.3 writing to a SAM/QFS file system on the back
end, so it has nothing to do with the tape process, because our NetWorker
servers never see any tape. The SAM/QFS back end takes care of the two
copies for us, so there is no cloning overhead. I have always wondered if
anyone else had seen this same type of thing. Nothing in the client logs or
the group logs indicates any error. I generally just stop and then restart
the job, and it fires right off.
--
Matt Powell
mtpowel < at > clemson.edu
Storage Administrator
Clemson University
340 Computer Court
Anderson, SC, 29625
office: 864-656-0589
cell: 864-247-2823
|
| Fri Jun 15, 2012 6:32 am |
|
 |
Browning, David
Guest
|
Same here; we see it on occasion. We run 7.6.2.7 on our machines.
It happens maybe once a month or so.
Like Matthew said, just kill the job and start it back up, and it's OK.
The big issue is catching it. We have so many jobs running all of the
time that it's difficult to spot a "stuck" job, since it never
fails.
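One way the "catching it" part could be automated is a periodic check that flags anything sitting in the waiting state for too long. The sketch below is only an illustration: the `JobRecord` type, its fields, and the six-hour threshold are all hypothetical, and how you fill the records (for example from whatever job listing your site already exports) is left open.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobRecord:
    """Hypothetical job record; not a NetWorker data structure."""
    name: str
    state: str               # e.g. "waiting to run" or "running"
    queued_since: datetime   # when the job entered its current state


def find_stuck_jobs(jobs, now, max_wait=timedelta(hours=6)):
    """Return the jobs that have been waiting to run for longer than max_wait."""
    return [
        job
        for job in jobs
        if job.state.lower() == "waiting to run"
        and now - job.queued_since > max_wait
    ]


if __name__ == "__main__":
    now = datetime(2012, 6, 15, 12, 0)
    sample = [
        JobRecord("client_a", "waiting to run", now - timedelta(hours=30)),
        JobRecord("client_b", "running", now - timedelta(hours=30)),
        JobRecord("client_c", "waiting to run", now - timedelta(minutes=20)),
    ]
    for job in find_stuck_jobs(sample, now):
        print(f"possibly stuck: {job.name}")
```

Because a stuck job never fails, the check keys on elapsed time in the waiting state rather than on any error status.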
David M. Browning Jr.
IT Project Coordinator Enterprise Backups and Help Desk
-----Original Message-----
From: EMC NetWorker discussion [mailto:NETWORKER < at > LISTSERV.TEMPLE.EDU] On
Behalf Of Matthew Powell
Sent: Friday, June 15, 2012 9:27 AM
To: NETWORKER < at > LISTSERV.TEMPLE.EDU
Subject: Re: [Networker] backups stuck in pending state, group never
cancels or completes
|
| Fri Jun 15, 2012 6:41 am |
|
 |
stan
Joined: 25 Jan 2008
Posts: 697
|
I have seen this type of behavior occur on Windows and Linux clients that lose access to their storage. On Linux, I typically find at least one save process that cannot be killed off. In NetWorker 7.6.3, the hardlimit attribute of the group resource can be set to kill the group after a set number of minutes. I have found that feature to work very nicely.
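For sites on versions without that attribute, the general idea of a hard runtime limit can be sketched with a small wrapper. This is not how NetWorker implements it; it is a generic Python illustration using `subprocess.run` with a timeout, and the function name `run_with_hard_limit` is made up for the example.

```python
import subprocess
import sys


def run_with_hard_limit(argv, limit_seconds):
    """Run argv and kill it if it is still running after limit_seconds.

    Returns the process exit code, or None if the limit was hit.
    """
    try:
        completed = subprocess.run(argv, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising this exception.
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for a long-running job: a child Python process that sleeps.
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    result = run_with_hard_limit(sleeper, limit_seconds=1)
    print("limit hit, process killed" if result is None else f"exit code {result}")
```

Note that a wrapper like this only kills the process it started. As described above, a save process blocked on lost storage may not be killable at all, so a limit of this kind is a safety net and not a fix.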
|
| Fri Jun 15, 2012 10:44 am |
|
 |
coretouch
Joined: 03 Jun 2012
Posts: 19
|
Hi,
Check the parallelism of the NetWorker server client and, if it is something like 12, increase it up to 64. If this does not help, stop the NSR server software, rename the x:\Program Files\Legato\nsr\res\nsrdb directory, and start again.
|
| Fri Jun 15, 2012 8:34 pm |
|
 |
jee
Joined: 10 Jun 2007
Posts: 119
|
Do you actually mean "rename the ...\nsr\tmp Directory and Start again"?
|
| Sun Jun 17, 2012 12:09 pm |
|
 |
Dave Gold
Guest
|
Ah, that is an interesting idea. I'll give that a try, and also Stan's idea.
Dave
*From:* Rainer rethmeier [mailto:rethmeier.rainer < at > web.de]
*Sent:* Monday, June 18, 2012 2:15 PM
*To:* Dave Gold
*Subject:* Re: [Networker] backups stuck in pending state, group never
cancels or completes
Sorry, my mistake. I meant the jobsdb, not the resdb.
Regards
Sent from my iPad
On 18.06.2012 at 15:44, Dave Gold <dave2 < at > cambridgecomputer.com> wrote:
Thanks for the thoughts. Not sure I agree about renaming nsrdb, but res
corruption is something that I didn't really think about.
Dave
|
| Mon Jun 18, 2012 10:22 am |
|
 |
bingo
Joined: 27 Jul 2007
Posts: 510
|
- Depending on your NW version, you could use the jobquery/jobkill commands to see and correct what is still running.
Abort the job or the group and restart it.
- If a client hangs in the "waiting to run" state, restart its nsrexecd process. This will cause the client (and the group, of course) to fail.
You now have a stable state.
- If you (re)start a group from the command line, verify that it is not waiting for another <CR>.
I always restart a group for certain clients from the command line:
savegrp -c <client_1> [-c <client_2> ...] -R <groupname>
This avoids unnecessary CFI backups, since you already have them.
- Also watch the alerts: NW could simply be waiting for another medium (in another pool).
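If the per-client restart is something you do often, the command line can be assembled by a small helper. The sketch below only builds the argument list, using the `-c` and `-R` options exactly as in the command quoted in this post; the function name and the example client and group names are hypothetical, and nothing is executed.

```python
def build_savegrp_restart(group, clients):
    """Build the argv for: savegrp -c <client_1> [-c <client_2> ...] -R <groupname>."""
    if not clients:
        raise ValueError("at least one client is required")
    argv = ["savegrp"]
    for client in clients:
        argv += ["-c", client]
    argv += ["-R", group]
    return argv


if __name__ == "__main__":
    # Hypothetical client and group names, for illustration only.
    print(" ".join(build_savegrp_restart("Daily_Windows", ["host1", "host2"])))
```

Keeping the command as a list makes it easy to hand to `subprocess.run` later. If you do run it that way, passing `stdin=subprocess.DEVNULL` is one possible way to keep it from sitting at a prompt, though whether that covers the <CR> case mentioned above is an assumption to verify on your own server.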
|
| Mon Jun 18, 2012 10:58 am |
|
 |
bangrn
Joined: 15 Jan 2008
Posts: 17
|
Can you check whether that 'active' save set was actually backed up? Run mminfo.
|
| Tue Jun 19, 2012 2:58 pm |
|
All times are GMT - 8 Hours