 |
Page 1 of 1
|
| Author |
Message |
marshall.a.skare
Guest
|
 Worrying bptm log entry and 84 error later on
Content-Transfer-Encoding: quoted-printable
Hi everyone,
=20
I'm trying to track down the cause of some 84 and 14 failures we've been
having lately. I know the problem is not tape, drive, job, time-of-day
or robot specific. However the problem seems to be happening on the
weekends when we run our full backups. We're on NBU 4.5GA running on
Solaris 8 master and media servers with an STK L700 library that has
Seagate Viper 200s in it.
=20
One of the entries I found in a bptm log bothered me a little bit. Is
this normal, or should I perform an inventory on the robot? The first
line says the media is not in the correct storage unit or volume pool.
However, it looks like the tape ends up being used anyway, and it also
appears that the tape was able to store data. The job associated with
these log entries bombed out roughly 4 hours later with an 84 error.
=20
Also, during the whole backup attempt, I'd see the db_lock_media error
messages about every 30 seconds. I've read elsewhere that this isn't
really a problem.
=20
20:19:23.649 [13116] <2> select_media: skipping media id 000309, it is
not in correct storage unit or volume pool
22:09:35.428 [13711] <2> check_available_drives: checking drives, about
to request media id 000309
22:09:35.589 [13711] <2> select_media: selected media id 000309 for
backup[0], crmmnt52(rl =3D 4) <----------
22:09:35.592 [13711] <2> mount_open_media: Waiting for mount of media id
000309 (copy 1) on server crmmdb17.
22:09:58.590 [13725] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:10:24.370 [13739] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:10:27.510 [13746] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:10:31.100 [13754] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:10:50.500 [13761] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:10:52.710 [13768] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:11:00.730 [13775] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:11:05.320 [13783] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:11:24.389 [13711] <2> io_open: file
/usr/openv/netbackup/db/media/tpreq/000309 successfully opened
22:11:24.389 [13711] <2> write_backup: media id 000309 mounted on drive
index 0, drivepath /dev/rmt/8cbn, drivename LTO_crmmdb17_0, copy 1
22:11:24.588 [13711] <2> io_position_for_write: position media id
000309, copy 1, current number images =3D 23
22:11:25.220 [13790] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:11:36.810 [13799] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:11:59.039 [13711] <2> io_position_for_write: empty header found on
000309, OK, copy 1
22:11:59.039 [13711] <2> io_close: closing
/usr/openv/netbackup/db/media/tpreq/000309, from bptm.c.17346
22:11:59.046 [13711] <2> io_open: file
/usr/openv/netbackup/db/media/tpreq/000309 successfully opened
22:12:07.410 [13807] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:12:22.924 [13711] <4> write_backup: begin writing backup id
crmmnt52_1105157367, copy 1, fragment 1, to media id 000309 on drive
index 0
=20
If our problem is really network-related, are there any unusual causes I
should check first? The clients we're backing up are a mixture of
Solaris 7/8 and Windows XP/2000 Server/2003 Server. I think I've ruled
out the drives altogether since I took the time to run TapeRx on all of
the drives in this system, and have written and read up to 4GB without
an error. I've had backup jobs fail with error 84 that have transferred
less data than that when they failed.
=20
Thanks for any help!
=20
Marshall Skare
ATIS - Unix Engineering
(612) 277-4434
=20
This message is for the designated recipient only and may contain =
privileged, proprietary, or otherwise private information. If you have =
received it in error, please notify the sender immediately and delete =
the original. Any other use of the email by you is prohibited.
|
| Mon Jan 10, 2005 2:49 pm |
|
 |
thoke
Guest
|
 Worrying bptm log entry and 84 error later on
--Apple-Mail-1-393558136
Content-Transfer-Encoding: quoted-printable
charset=WINDOWS-1252;
format=flowed
Marshall,
First off, the "unable to lock media" is totally normal. It's likely=20
that either a bptm -U or some other process came in and was trying to=20
select that piece of media. It's good that it shows up as locked. =20
That way, no other process can grab it and use it.
Second, you'll need to see the bptm log at the time of the error. =20
Additionally, you'll need to look at the syslog during that same time. =20=
So, what do bptm and syslog say?
You say you're on 4.5 GA. Is that really the case? You should be=20
patched up from there as GA is a REALLY old release by now.
-Tim
On Jan 10, 2005, at 4:49 PM, <marshall.a.skare < at > accenture.com> wrote:
Hi everyone,
=A0
I=92m trying to track down the cause of some 84 and 14 failures we=92ve=20=
been having lately.=A0 I know the problem is not tape, drive, job,=20
time-of-day or robot specific.=A0 However the problem seems to be=20
happening on the weekends when we run our full backups.=A0 We=92re on =
NBU=20
4.5GA running on Solaris 8 master and media servers with an STK L700=20=
library that has Seagate Viper 200s in it.
=A0
One of the entries I found in a bptm log bothered me a little bit.=A0 =
Is=20
this normal, or should I perform an inventory on the robot?=A0 The =
first=20
line says the media is not in the correct storage unit or volume=20
pool.=A0 However, it looks like the tape ends up being used anyway, =
and=20
it also appears that the tape was able to store data. =A0The job=20
associated with these log entries bombed out roughly 4 hours later=20
with an 84 error.
=A0
Also, during the whole backup attempt, I=92d see the db_lock_media =
error=20
messages about every 30 seconds.=A0 I=92ve read elsewhere that this =
isn=92t=20
really a problem.
=A0
20:19:23.649 [13116] <2> select_media: skipping media id 000309, it is=20=
not in correct storage unit or volume pool
22:09:35.428 [13711] <2> check_available_drives: checking drives,=20
about to request media id 000309
22:09:35.589 [13711] <2> select_media: selected media id 000309 for=20
backup[0], crmmnt52(rl =3D 4) <----------
22:09:35.592 [13711] <2> mount_open_media: Waiting for mount of media=20=
id 000309 (copy 1) on server crmmdb17.
22:09:58.590 [13725] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:10:24.370 [13739] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:10:27.510 [13746] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:10:31.100 [13754] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:10:50.500 [13761] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:10:52.710 [13768] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:11:00.730 [13775] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:11:05.320 [13783] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:11:24.389 [13711] <2> io_open: file=20
/usr/openv/netbackup/db/media/tpreq/000309 successfully opened
22:11:24.389 [13711] <2> write_backup: media id 000309 mounted on=20
drive index 0, drivepath /dev/rmt/8cbn, drivename LTO_crmmdb17_0, copy=20=
1
22:11:24.588 [13711] <2> io_position_for_write: position media id=20
000309, copy 1, current number images =3D 23
22:11:25.220 [13790] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:11:36.810 [13799] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:11:59.039 [13711] <2> io_position_for_write: empty header found on=20=
000309, OK, copy 1
22:11:59.039 [13711] <2> io_close: closing=20
/usr/openv/netbackup/db/media/tpreq/000309, from bptm.c.17346
22:11:59.046 [13711] <2> io_open: file=20
/usr/openv/netbackup/db/media/tpreq/000309 successfully opened
22:12:07.410 [13807] <2> db_lock_media: unable to lock media at offset=20=
63 (000309)
22:12:22.924 [13711] <4> write_backup: begin writing backup id=20
crmmnt52_1105157367, copy 1, fragment 1, to media id 000309 on drive=20=
index 0
=A0
If our problem is really network-related, are there any unusual causes=20=
I should check first?=A0 The clients we=92re backing up are a mixture =
of=20
Solaris 7/8 and Windows XP/2000 Server/2003 Server.=A0 I think I=92ve=20=
ruled out the drives altogether since I took the time to run TapeRx on=20=
all of the drives in this system, and have written and read up to 4GB=20=
without an error.=A0 I=92ve had backup jobs fail with error 84 that =
have=20
transferred less data than that when they failed.
=A0
Thanks for any help!
=A0
Marshall Skare
ATIS - Unix Engineering
(612) 277-4434
=A0
This message is for the designated recipient only and may contain=20
privileged, proprietary, or otherwise private information. If you have=20=
received it in error, please notify the sender immediately and delete=20=
the original. Any other use of the email by you is prohibited.
--Apple-Mail-1-393558136
Content-Transfer-Encoding: quoted-printable
|
| Mon Jan 10, 2005 4:11 pm |
|
 |
dlogcher
Guest
|
 Worrying bptm log entry and 84 error later on
I just ran into this problem as well, and it turns out it was a bad HBA.
The server had three HBA's for SAN and tape connectivity. Two were for
the old HP SAN, and one for the new EMC SAN. The two now only server the
LTO tape drives. I yanked one of the two connections and the errors stopped.
I tried replacing the cable, and errors resumed. I moved switch ports and
replaced Gbics. So I narrowed it down to the HBA (thanks Greg).
Try mounting a tape and doing a regular OS tar to the tar on a good sized
filesystem to see if it barfs.
--
Dan
marshall.a.skare < at > accenture.com wrote:
Hi everyone,
I?m trying to track down the cause of some 84 and 14 failures we?ve been
having lately. I know the problem is not tape, drive, job, time-of-day
or robot specific. However the problem seems to be happening on the
weekends when we run our full backups. We?re on NBU 4.5GA running on
Solaris 8 master and media servers with an STK L700 library that has
Seagate Viper 200s in it.
One of the entries I found in a bptm log bothered me a little bit. Is
this normal, or should I perform an inventory on the robot? The first
line says the media is not in the correct storage unit or volume pool.
However, it looks like the tape ends up being used anyway, and it also
appears that the tape was able to store data. The job associated with
these log entries bombed out roughly 4 hours later with an 84 error.
Also, during the whole backup attempt, I?d see the db_lock_media error
messages about every 30 seconds. I?ve read elsewhere that this isn?t
really a problem.
20:19:23.649 [13116] <2> select_media: skipping media id 000309, it is
not in correct storage unit or volume pool
22:09:35.428 [13711] <2> check_available_drives: checking drives, about
to request media id 000309
22:09:35.589 [13711] <2> select_media: selected media id 000309 for
backup[0], crmmnt52(rl = 4) <----------
22:09:35.592 [13711] <2> mount_open_media: Waiting for mount of media id
000309 (copy 1) on server crmmdb17.
22:09:58.590 [13725] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:10:24.370 [13739] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:10:27.510 [13746] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:10:31.100 [13754] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:10:50.500 [13761] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:10:52.710 [13768] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:11:00.730 [13775] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:11:05.320 [13783] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:11:24.389 [13711] <2> io_open: file
/usr/openv/netbackup/db/media/tpreq/000309 successfully opened
22:11:24.389 [13711] <2> write_backup: media id 000309 mounted on drive
index 0, drivepath /dev/rmt/8cbn, drivename LTO_crmmdb17_0, copy 1
22:11:24.588 [13711] <2> io_position_for_write: position media id
000309, copy 1, current number images = 23
22:11:25.220 [13790] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:11:36.810 [13799] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:11:59.039 [13711] <2> io_position_for_write: empty header found on
000309, OK, copy 1
22:11:59.039 [13711] <2> io_close: closing
/usr/openv/netbackup/db/media/tpreq/000309, from bptm.c.17346
22:11:59.046 [13711] <2> io_open: file
/usr/openv/netbackup/db/media/tpreq/000309 successfully opened
22:12:07.410 [13807] <2> db_lock_media: unable to lock media at offset
63 (000309)
22:12:22.924 [13711] <4> write_backup: begin writing backup id
crmmnt52_1105157367, copy 1, fragment 1, to media id 000309 on drive index 0
If our problem is really network-related, are there any unusual causes I
should check first? The clients we?re backing up are a mixture of
Solaris 7/8 and Windows XP/2000 Server/2003 Server. I think I?ve ruled
out the drives altogether since I took the time to run TapeRx on all of
the drives in this system, and have written and read up to 4GB without
an error. I?ve had backup jobs fail with error 84 that have transferred
less data than that when they failed.
Thanks for any help!
Marshall Skare
ATIS - Unix Engineering
(612) 277-4434
This message is for the designated recipient only and may contain
privileged, proprietary, or otherwise private information. If you have
received it in error, please notify the sender immediately and delete
the original. Any other use of the email by you is prohibited.
|
| Tue Jan 11, 2005 7:07 am |
|
 |
|
|
The time now is Sat May 26, 2012 1:07 am | All times are GMT - 8 Hours
|
Page 1 of 1
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
|
|