Hi,
You don't *have* to have reach/put errors for it to be a robot arm issue; there are more points of failure in a robotic arm than that. Reach/put errors are just the obvious alignment ones.
(When I was level 2, if I had an error with "robot" in it, it was pretty easy to ask a hardware engineer to do an onsite visit just to health check it, even if it had been checked recently.)
I also agree with Kevin's comment that you have a data loss situation if the frozen tape is put back into circulation, so it's not a situation to treat lightly. I'd work all angles.
Given you've already:
- confirmed the firmware is up to date
- had the robotic arm replaced
I'd then:
- Make sure that tape is pulled out of circulation, so that no one accidentally unfreezes it
- Start collecting iostat stats
- Get level 3 to do a diagnostic dump and analyse it before replacing the tape drive, because once you pull the drive you lose that history. Not all errors are reported to syslog/bptm. (I'd definitely do this if you've already been replacing LTO-3 drives.)
- Check what was replaced: was it the whole robotic component or only a portion, how many tape drives have been replaced, and what were the serial IDs?
- Get someone to check robotic arm operation, in case the wrong component was replaced
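While you're collecting evidence for level 3, it can help to tally which media IDs are being frozen and when the robot failures happen. Below is a rough Python sketch of that idea, not a supported tool. It assumes each error record has been unwrapped onto a single line, and it guesses the field layout purely from the two sample lines quoted further down this thread (epoch time, three numbers, server, three numbers, client, process, message). The names `parse_record` and `summarise` are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# ASSUMPTION: layout inferred only from the two sample lines in this thread:
# epoch, 3 numeric fields, server, 3 numeric fields, client, process, message.
RECORD = re.compile(
    r"^(?P<epoch>\d{9,})\s+\d+\s+\d+\s+\d+\s+(?P<server>\S+)\s+"
    r"\d+\s+\d+\s+\d+\s+(?P<client>\S+)\s+(?P<process>\S+)\s+(?P<message>.*)$"
)
FREEZE = re.compile(r"FREEZING media id (?P<media>\w+)")


def parse_record(line):
    """Return a dict for one unwrapped record, or None if it does not match."""
    m = RECORD.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # Treat the first field as seconds since the Unix epoch (UTC).
    rec["when"] = datetime.fromtimestamp(int(rec["epoch"]), tz=timezone.utc)
    return rec


def summarise(lines):
    """Count freeze events per media ID and collect robot-failure timestamps."""
    frozen = Counter()
    robot_failures = []
    for line in lines:
        rec = parse_record(line)
        if rec is None or rec["process"] != "bptm":
            continue  # skip anything that is not a bptm record
        hit = FREEZE.search(rec["message"])
        if hit:
            frozen[hit.group("media")] += 1
        elif "Robot operation failed" in rec["message"]:
            robot_failures.append(rec["when"])
    return frozen, robot_failures


if __name__ == "__main__":
    sample = [
        "1323762270 1 386 16 media-server 0 0 0 *NULL* bptm error unloading "
        "media, TpErrno = Robot operation failed",
        "1322549252 1 388 16 media-server 1136618 1136513 0 client-hostname "
        "bptm FREEZING media id XAC228, External event caused rewind during "
        "write, all data on media is lost",
    ]
    frozen, robot_failures = summarise(sample)
    print("Frozen media:", dict(frozen))
    print("Robot failures at:", [t.isoformat() for t in robot_failures])
```

If the same few media IDs keep turning up, or the robot failures cluster around particular times, that is useful to hand over with the diagnostic dump. If your records don't match the assumed layout, the regular expression will need adjusting.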
If this comes up blank, as a level 2 I'd be escalating to level 3 so that they are across the decision to replace and can fully investigate the pending and previous replacements.
Again, hope this helps. I'm mainly just pulling from my collective memory of lots of tape support cases. I've even seen replaced tape drives being diverted to support for stress testing when there was a silent error, but that was only the once; normally it was possible to get a reason if you dug.
Robyn
On Wed, Dec 14, 2011 at 11:58 PM, Justin Piszcz <jpiszcz < at > lucidpixels.com> wrote:
Hi,
Thanks for the reply. We are at the current revision for the robot that
they recommend. We have replaced arms in the past but cannot confirm
or deny whether that fixed any of the problems. Normally (again,
normally...) when there are robot arm issues there are reach/put errors
etc.; we have not seen them in this case.
Justin.
On Wed, Dec 14, 2011 at 7:35 AM, Robyn Hirano
<robyn.hirano < at > roddconsulting.com.au> wrote:
Hi,
That looks like a robotic arm problem rather than the tape drive or tapes.
I'd be checking the robotics firmware (there's a command or the library
panel normally shows as well) and requesting an engineer onsite to
healthcheck the robotic arm.
But it's often one of the components associated with the gripper (robotics)
that's out of alignment and needs realigning or replacing.
Robyn
--
Robyn Hirano
Rodd Consulting Pty Ltd
M: +61 412 352 725
E: robyn.hirano < at > roddconsulting.com.au
-----Original Message-----
From: veritas-bu-bounces < at > mailman.eng.auburn.edu
[mailto:veritas-bu-bounces < at > mailman.eng.auburn.edu] On Behalf Of Justin Piszcz
Sent: Wednesday, 14 December 2011 11:07 PM
To: veritas-bu < at > mailman.eng.auburn.edu
Subject: [Veritas-bu] Tape/media errors/HP LTO-3
Hi,
We're running the latest F/W for HP LTO-3 tape drives (M6BS) for
4.0GBPS/FC drives.
As was noted in the previous conversation, errors such as:
1323762270 1 386 16 media-server 0 0 0 *NULL* bptm error unloading
media, TpErrno = Robot operation failed
1322549252 1 388 16 media-server 1136618 1136513 0 client-hostname
bptm FREEZING media id XAC228, External event caused rewind during
write, all data on media is lost
When these errors occur in your environments (on multiple tapes), do
you get the drives replaced in advance or wait for them to fail
completely? In the past I had been getting them replaced regularly,
but it's getting problematic: they used to be servicing components
multiple times per week.
Justin.
_______________________________________________
Veritas-bu maillist - Veritas-bu < at > mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
--
Robyn Hirano
RODD Consulting Pty Ltd
M: +61 412 352 725
E: robyn.hirano < at > roddconsulting.com.au