The nsrd process stopped responding
Post The nsrd process stopped responding 
I just opened case 3130419 with LEGATO tech support about this issue,
but I am wondering if anyone on this list has run into a situation where
the server's nsrd process just dies. I had this happen around 12:20 with
an RPC error. A bunch of processes were killed off according to the
daemon.log file, then logging stopped about ten minutes later. Other nsr
processes such as nsrexecd, nsrmmdb, etc. were still in the process list.
The nsrd process also generated a core file in /nsr/cores/nsrd at the time
it died. I shut down the nsr processes via nsr_shutdown and the shutdown
appeared to happen normally. Then I restarted, and the restart appeared to
be normal. Backups that were in progress resumed, although a few crashed
when this problem began, so those were not restarted.

Our NetWorker complex consists of a Solaris 9 server running NetWorker
Power Edition 7.1.2. We also have one storage node, which also runs
Solaris 9 and NetWorker 7.1.2. A variety of clients are backed up each
night, including Tru64 Unix, Windows, Solaris, Novell, Linux, and NDMP. A
few of our Windows clients use the MS SQL module and one uses the MS
Exchange module. Our NetWorker server has a Sony PetaSite library with
five Sony S-AIT drives connected to it via fiber. Our storage node has a
Qualstar library with twelve Sony AIT-2 drives connected via LVD SCSI. We
do not do any library or drive sharing.

The errors in daemon.log look like:

01/17/05 00:29:05 savegrp: RPC error: Unable to send
01/17/05 00:29:05 savegrp: Cannot query the pool resources. Unable to verify the save sets on the media.
01/17/05 00:29:05 nsrexecd: Recvd signal to kill process group - pid=-13435, sig=2
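
For anyone who wants to pull entries like these out of a large daemon.log, here is a minimal Python sketch. It assumes every line has the shape shown in the excerpt above ("MM/DD/YY HH:MM:SS process: message"); the names `parse_line`, `suspicious_entries` and the `PATTERNS` list are made up for illustration and are not part of NetWorker.

```python
import re
from datetime import datetime

# Assumed line shape, based only on the excerpt above:
#   "MM/DD/YY HH:MM:SS process: message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+): (.*)$")

# Substrings treated as interesting; this list is an illustrative assumption.
PATTERNS = ("RPC error", "Cannot query", "signal to kill")


def parse_line(line):
    """Return (timestamp, process, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S")
    return stamp, match.group(2), match.group(3)


def suspicious_entries(lines):
    """Keep only parsed entries whose message contains one of PATTERNS."""
    found = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and any(p in parsed[2] for p in PATTERNS):
            found.append(parsed)
    return found


# The three lines quoted in the post.
SAMPLE = [
    "01/17/05 00:29:05 savegrp: RPC error: Unable to send",
    "01/17/05 00:29:05 savegrp: Cannot query the pool resources. "
    "Unable to verify the save sets on the media.",
    "01/17/05 00:29:05 nsrexecd: Recvd signal to kill process group - "
    "pid=-13435, sig=2",
]

for stamp, process, message in suspicious_entries(SAMPLE):
    print(stamp.isoformat(), process, message)
```

To run it against a real log, pass an open file object instead of `SAMPLE`. Lines that do not match the assumed shape are skipped silently.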

Note: To sign off this list, send a "signoff networker" command via email
to listserv < at > listserv.temple.edu or visit the list's Web site at
http://listserv.temple.edu/archives/networker.html where you can
should be sent to stan < at > temple.edu

Post The nsrd process stopped responding 
Today at 12:10 nsrd died with a core dump in strlen (I have a trace):

t@1 (l@1) program terminated by signal SEGV (no mapping at the fault address)
0xffffffff7e03d1ec: strlen+0x007c: ld [%o1], %o2
(dbx) where
current thread: t@1
=>[1] strlen(0x0, 0x0, 0x0, 0x7efefeff, 0x81010100, 0x1007815be), at
0xffffffff7e03d1ec
[2] _doprnt(0x0, 0xffffffff7ffe1fa0, 0x0, 0x0, 0x73, 0x0), at
0xffffffff7e08fd34
[3] vsprintf(0x1007815a0, 0x10014dee0, 0xffffffff7ffe2178, 0x1, 0x30,
0x1000d20b8), at 0xffffffff7e091ed0
[4] err_setstr(0x0, 0x1389, 0x10014dee0, 0x1003e6e28, 0x0, 0x1003f6c70), at
0x10012789c
[5] build_clone_rlist(0x304, 0xffffffff7ffe2388, 0xffffffff7ffe2430,
0xffffffff7ffe2448, 0x0, 0x1006d7c70), at 0x100040f5c
[6] 0x1000320a8(0x4, 0x100b9aa80, 0xffffffff7ffe2750, 0x0, 0x1, 0x2880), at
0x1000320a7
[7] 0x1000377e4(0xffffffff7ffe2530, 0x100b9aa80, 0xffffffff7ffe2710, 0x2,
0x64, 0x0), at 0x1000377e3
[8] svcrm_broker_2(0x4, 0xffffffff7ffe2750, 0x10058de20, 0xffffffff7ffe2710,
0x12c8, 0x12b8), at 0x100033844
[9] svcnsr_start_pools_2(0xffffffff7ffe5640, 0x1009e96e0,
0xffffffff7ffed750, 0x100299020, 0x2c00, 0xffffffff7ffe2750), at 0x10007a278
[10] nsrprog_2(0xffffffff7ffed750, 0x100c75420, 0x0, 0x5f3d7, 0x800000000,
0xffffffff7ffed850), at 0x1000a2060
[11] 0x1001186b8(0x100c75420, 0xffffffff7fffd820, 0x0, 0x0,
0xffffffff7ffff9d0, 0x0), at 0x1001186b7
[12] svc_getreqset_varped(0xffffffff7fffd9d0, 0x100591d50, 0x8291400,
0x2008, 0x0, 0xb), at 0x100118858
[13] 0x100086604(0x78b0, 0x1, 0xffffffff7ffff9e0, 0xffffffffffffdfc0,
0xffffffff7fffd9c0, 0x2000), at 0x100086603
[14] 0x1000853e8(0x0, 0x1, 0xa, 0x0, 0x7850, 0x1002b7200), at 0x1000853e7
[15] main(0x1, 0xffffffff7ffffcf8, 0xffffffff7ffffd08, 0x1002991a0,
0x1003e82c0, 0x4), at 0x100084950
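
A trace like this is easier to compare between sites once the wrapped lines are joined back into one record per frame. The Python sketch below does that for the first five frames above. The helper names (`parse_frames`, `null_first_arg`) are hypothetical, and the regular expressions assume the `[N] name(args), at 0xADDR` layout seen in this post only. Note that the values dbx prints in parentheses may simply be register contents, so a leading 0x0 is a hint rather than proof. Here it would be consistent with a null string pointer reaching strlen through vsprintf, but that is my reading of the trace, not something confirmed.

```python
import re

# A frame starts with "[N]", optionally preceded by dbx's "=>" marker.
FRAME_START = re.compile(r"^(?:=>)?\[(\d+)\]\s*(.*)$")
# After joining wrapped lines: "name(arg, arg, ...), at 0xADDR"
FRAME_BODY = re.compile(r"^(\S+?)\((.*?)\),\s*at\s+(0x[0-9a-fA-F]+)$")


def parse_frames(text):
    """Join wrapped lines and return one dict per stack frame."""
    joined = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        start = FRAME_START.match(line)
        if start:
            if current is not None:
                joined.append(current)
            current = [int(start.group(1)), start.group(2)]
        elif current is not None:
            # Continuation of the previous frame (wrapped by the mailer).
            current[1] += " " + line
    if current is not None:
        joined.append(current)

    frames = []
    for number, body in joined:
        match = FRAME_BODY.match(body)
        if match:
            args = [a.strip() for a in match.group(2).split(",")]
            frames.append({"frame": number, "function": match.group(1),
                           "args": args, "at": match.group(3)})
    return frames


def null_first_arg(frames):
    """Names of frames whose first printed value is 0x0."""
    return [f["function"] for f in frames if f["args"][0] == "0x0"]


# First five frames, copied from the trace above with their line wraps.
SAMPLE_TRACE = """\
=>[1] strlen(0x0, 0x0, 0x0, 0x7efefeff, 0x81010100, 0x1007815be), at
0xffffffff7e03d1ec
[2] _doprnt(0x0, 0xffffffff7ffe1fa0, 0x0, 0x0, 0x73, 0x0), at
0xffffffff7e08fd34
[3] vsprintf(0x1007815a0, 0x10014dee0, 0xffffffff7ffe2178, 0x1, 0x30,
0x1000d20b8), at 0xffffffff7e091ed0
[4] err_setstr(0x0, 0x1389, 0x10014dee0, 0x1003e6e28, 0x0, 0x1003f6c70), at
0x10012789c
[5] build_clone_rlist(0x304, 0xffffffff7ffe2388, 0xffffffff7ffe2430,
0xffffffff7ffe2448, 0x0, 0x1006d7c70), at 0x100040f5c
"""

frames = parse_frames(SAMPLE_TRACE)
print(" <- ".join(f["function"] for f in frames))
print(null_first_arg(frames))
```

The first print gives the call chain from the faulting function outward, which is the part worth quoting when comparing against someone else's core file.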

I expect to upgrade to 7.1.3 soon, so I have not opened a case. (I expect
LEGATO would ask for this upgrade anyway before taking a support call; that
has always been their standard reaction.)

Maarten


On Monday 17 January 2005 17:09, Stan Horwitz wrote:
[snip]

--
Maarten Boot,
Compuware Europe B.V.
Hoogoorddreef 5
1101 BA Amsterdam