backups stuck in pending state, group never cancels or completes
Post backups stuck in pending state, group never cancels or compl 
Hi,



I've seen an issue at a few sites where backups in a running
group are "stuck" in the "Waiting to Run" queue with a state of active,
but never transition to a running state. In addition, the group
doesn't time out as it should. In one case the backup admin noticed only
after 3 days that she hadn't gotten success emails.



Anyone run into this and have any thoughts?



Thanks,



Dave





Dave Gold

Sr. Technical Consultant

Cambridge Computer Services

dgold < at > cambridgecomputer.com

781-250-3260

*Celebrating our 20th Anniversary*

Post backups stuck in pending state, group never cancels or compl 
I have seen this on occasion here, but mainly on my Windows
server groups. I have not had this happen on the other flavors of OS. We are
running Legato NetWorker 7.5.3 writing to a SAM/QFS file system on the back
end, so it has nothing to do with the tape process: our NetWorker servers
never see any tape. The SAM/QFS back end takes care of the two copies for us,
so there is no cloning overhead. I have always wondered if anyone else had seen
this same type of thing. There is nothing in the logs of the clients or the
group that indicates any error. I generally just stop and then restart the job,
and it fires right off.


On 6/15/12 9:58 AM, "dave2 < at > CAMBRIDGECOMPUTER.COM"
<dave2 < at > CAMBRIDGECOMPUTER.COM> wrote:

--
Matt Powell
mtpowel < at > clemson.edu
Storage Administrator
Clemson University
340 Computer Court
Anderson, SC, 29625
office: 864-656-0589
cell: 864-247-2823

Post backups stuck in pending state, group never cancels or compl 
Same here, we see it on occasion. We run 7.6.2.7 on our machines.
It happens maybe once a month or so.

Like Matthew said, just kill the job and start it back up, and it's OK.

The big issue is catching it: we have so many jobs running
all of the time that it's difficult to spot a "stuck" job, since it never
fails.

David M. Browning Jr.
IT Project Coordinator Enterprise Backups and Help Desk
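
One way to catch such jobs is to flag anything that has sat in a waiting state longer than a threshold. A minimal Python sketch, assuming job records have already been collected from whatever monitoring source is available; the `Job` fields, the `"waiting"` state label, and the 6-hour threshold are hypothetical and are not NetWorker's own data model:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Job:
    # Hypothetical record; field names are illustrative, not NetWorker's.
    group: str
    client: str
    state: str           # e.g. "waiting", "running", "completed"
    queued_at: datetime  # when the job entered its current state


def find_stuck_jobs(jobs, now, max_wait=timedelta(hours=6)):
    """Return the jobs that have been waiting longer than max_wait."""
    return [
        job for job in jobs
        if job.state == "waiting" and now - job.queued_at > max_wait
    ]


if __name__ == "__main__":
    now = datetime(2012, 6, 18, 9, 0)
    jobs = [
        Job("Windows", "win01", "waiting", now - timedelta(days=3)),
        Job("Windows", "win02", "running", now - timedelta(hours=1)),
        Job("Linux", "lnx01", "waiting", now - timedelta(minutes=20)),
    ]
    for job in find_stuck_jobs(jobs, now):
        print(f"stuck: {job.group}/{job.client} waiting since {job.queued_at}")
```

Because a stuck job never fails, a check like this has to run on a schedule and alert on its own; waiting for a failure notification will not surface it.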

-----Original Message-----
From: EMC NetWorker discussion [mailto:NETWORKER < at > LISTSERV.TEMPLE.EDU] On
Behalf Of Matthew Powell
Sent: Friday, June 15, 2012 9:27 AM
To: NETWORKER < at > LISTSERV.TEMPLE.EDU
Subject: Re: [Networker] backups stuck in pending state, group never
cancels or completes


Post backups stuck in pending state, group never cancels or compl 
I have seen this type of behavior occur on Windows and Linux clients that lose access to their storage. On Linux, I typically find at least one save process that cannot be killed off. In NetWorker 7.6.3, the hardlimit attribute of the group resource can be set to kill the group after a certain number of minutes pass by. I have found that feature to work very nicely.
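
The idea behind a hard limit can be illustrated outside NetWorker as a watchdog that kills a command once its time budget is used up. A minimal Python sketch; `run_with_hard_limit` is a hypothetical helper and does not reflect how NetWorker implements the hardlimit attribute:

```python
import subprocess
import sys


def run_with_hard_limit(argv, limit_seconds):
    """Run argv and kill it if it exceeds limit_seconds.

    Returns the exit code, or None if the limit was hit.
    """
    try:
        completed = subprocess.run(argv, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising on timeout.
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for a hung job: a child process that sleeps for 30 seconds.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    result = run_with_hard_limit(hung, limit_seconds=1)
    print("killed by hard limit" if result is None else f"exit code {result}")
```

A limit of this kind only helps when the process can actually be killed; a process blocked on lost storage, like the unkillable save processes mentioned above, may not respond to it.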

On 06 15, 2012, at 10:26 AM, Matthew Powell wrote:


Post backups stuck in pending state, group never cancels or compl 
Hi,
Check the parallelism of the NetWorker Server client and, if it is something like 12, increase it up to 64. If this does not help, stop the NSR server software, rename the x:\Program Files\Legato\nsr\res\nsrdb directory, and start it again.

Sent from my iPad

On 15.06.2012 at 15:58, dave2 < at > CAMBRIDGECOMPUTER.COM wrote:

Hi,


Post backups stuck in pending state, group never cancels or compl 
Do you actually mean "rename the ...\nsr\tmp Directory and Start again"?


On Friday 15 Jun 2012 17:43:18 you wrote:
Post backups stuck in pending state, group never cancels or compl 
Ah, that is an interesting idea. I'll give that a try, and also Stan's idea.



Dave



*From:* Rainer rethmeier [mailto:rethmeier.rainer < at > web.de]
*Sent:* Monday, June 18, 2012 2:15 PM
*To:* Dave Gold
*Subject:* Re: [Networker] backups stuck in pending state, group never
cancels or completes



Sorry, my mistake: I meant the jobsdb, not the resdb.

Regards

Sent from my iPad


On 18.06.2012 at 15:44, Dave Gold <dave2 < at > cambridgecomputer.com> wrote:

Thanks for the thoughts. Not sure I agree about renaming nsrdb, but res
corruption is something that I didn't really think about.

Dave

On Fri, Jun 15, 2012 at 11:43 AM, Rainer rethmeier <rethmeier.rainer < at > web.de>
wrote:



Post backups stuck in pending state, group never cancels or compl 
- Depending on your NW version, you can use the jobquery/jobkill commands to see and correct what is still running.
Abort the job or the group and restart it.

- If a client hangs in the "waiting to run" state, restart its nsrexecd process. This will cause the client (and the group, of course) to fail.
You now have a stable state.

- If you (re)start a group from the command line, verify that it is not waiting for another <CR>.
I always restart a group for certain clients from the command line:
savegrp -c <client_1> [-c <client_2> ...] -R <groupname>
This avoids unnecessary CFI backups, since you already have them.

- Also watch the alerts: NW could simply be waiting for other media (in another pool).
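
The command-line restart described above can be scripted. A minimal Python sketch that only assembles the argument list, using nothing beyond the `-c` and `-R` options quoted in the post; the helper name, client names, and group name are hypothetical. Actually running the command is left out; one option is `subprocess.run(argv, stdin=subprocess.DEVNULL)`, so that the process has no terminal input to sit waiting on.

```python
import shlex


def build_savegrp_restart(group, clients):
    """Build the argv for restarting a group for selected clients."""
    if not clients:
        raise ValueError("at least one client is required")
    argv = ["savegrp"]
    for client in clients:
        argv += ["-c", client]  # one -c per client, as in the post
    argv += ["-R", group]       # -R <groupname>, as in the post
    return argv


if __name__ == "__main__":
    argv = build_savegrp_restart("Windows_Servers", ["win01", "win02"])
    # Print the command for review before running it by hand.
    print(shlex.join(argv))
```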

Post  
Can you check whether that 'active' save set was actually backed up? Run mminfo to see.
