Arno Lehmann wrote:
Hi,
On 3/11/2007 2:39 AM, Gerard Sharpe wrote:
Arno Lehmann wrote:
Hi,
On 3/8/2007 2:38 PM, Gerard Sharpe wrote:
Hi all,
I'm after some advice about an HP SureStore E AutoLoader 818 that I
picked up second hand today.
The seller I purchased it from on eBay mentioned that the drive had
been tested and confirmed to be working OK, but I'm having no joy.
I am able to back up a small amount of data via tar, use the autoloader
via mtx, and rewind, check the status and so on with mt. The issue is
that when I try to back up a large amount of data with tar, it writes
to tape for around 20 seconds and then stops with the error:
---
tar: /dev/st0: Cannot write: Input/output error
---
Checking the logs I see this:
---
Mar 8 23:58:38 localhost kernel: st0: Current: sense key: Hardware Error
Mar 8 23:58:38 localhost kernel: ASC=0xc <<vendor>> ASCQ=0x80
Mar 8 23:58:38 localhost kernel: Info fld=0x707fff
Mar 8 23:58:38 localhost kernel: st0: Current: sense key: Hardware Error
Mar 8 23:58:38 localhost kernel: ASC=0xc <<vendor>> ASCQ=0x80
Mar 8 23:58:38 localhost kernel: Info fld=0x2d0
---
That looks bad.
I've checked the connections, tried different DLT tapes and the other
obvious things, and I'll try a different SCSI card tomorrow just in case.
It just seems odd that it backs up small amounts of data but not large
amounts.
I guess my question is: is there any way I can get a more descriptive
error message to help isolate the source of the problem, or has anyone
else had a similar issue?
Not a similar issue... but you might try tapeinfo on the drive.
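On the question of a more descriptive message: the sense lines in the log can at least be picked apart by hand. What follows is only a minimal sketch in Python; the function name parse_sense and the small lookup table are made up for this illustration and are not part of any tool. In the standard SCSI tables ASC 0x0c means "write error", while ASCQ values from 0x80 upwards are vendor specific, which is why the kernel prints <<vendor>> instead of a text. What 0x80 means exactly would have to come from HP's documentation for the drive.

```python
import re

# ASC 0x0c is "write error" in the standard SCSI ASC/ASCQ tables.
# Only the code seen in this thread is listed here.
ASC_NAMES = {0x0C: "write error"}


def parse_sense(lines):
    """Collect sense key, ASC, ASCQ and info field from kernel log lines.

    A new record starts at every "sense key:" line; the ASC/ASCQ and
    "Info fld" lines that follow are attached to that record.
    """
    records = []
    current = None
    for line in lines:
        m = re.search(r"sense key: (.+)$", line)
        if m:
            current = {"key": m.group(1).strip(), "asc": None,
                       "ascq": None, "info": None}
            records.append(current)
            continue
        if current is None:
            continue
        m = re.search(r"ASC=0x([0-9a-fA-F]+).*ASCQ=0x([0-9a-fA-F]+)", line)
        if m:
            current["asc"] = int(m.group(1), 16)
            current["ascq"] = int(m.group(2), 16)
            # ASCQ 0x80 and above is reserved for vendor-specific codes.
            current["vendor_specific"] = current["ascq"] >= 0x80
            current["asc_name"] = ASC_NAMES.get(current["asc"], "unknown")
            continue
        m = re.search(r"Info fld=0x([0-9a-fA-F]+)", line)
        if m:
            current["info"] = int(m.group(1), 16)
    return records


if __name__ == "__main__":
    log = [
        "Mar 8 23:58:38 localhost kernel: st0: Current: sense key: Hardware Error",
        "Mar 8 23:58:38 localhost kernel: ASC=0xc <<vendor>> ASCQ=0x80",
        "Mar 8 23:58:38 localhost kernel: Info fld=0x707fff",
    ]
    for rec in parse_sense(log):
        print(rec)
```

This does not give more information than the kernel already logged; it only makes the fields easier to compare across several failed runs.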
Also, could this be caused by the drive requiring cleaning?
I don't think so. Usually the drive reports the need for cleaning with an LED.
Admittedly, I don't know your autochanger model, but it might help to
check the device configuration and reset it to its factory defaults.
Usually that should be possible through the control panel somehow.
Also, HP has a tool for library and tape testing called Library and Tape
Tools, or L&TT for short, which you can download here:
http://h20000.www2.hp.com/bizsupport/TechSupport/DriverDownload.jsp?pnameOID=406731&locale=en_US&taskId=135&prodSeriesId=406729&prodTypeId=12169
or via this URL:
http://h18006.www1.hp.com/products/storageworks/ltt/index.html
Hope that helps you,
Arno
Thanks for the reply, Arno. I'm downloading the HP tools at the moment to
see what they have to say.
I also grabbed some more info from dmesg, and I'm now wondering if it may
be the SCSI card or the kernel:
This looks like a hardware problem.
Quite often, checking and perhaps replacing the cabling and terminators
helps in these cases. It might also be a problem with the SCSI driver,
but you would need to know a lot about the driver's code to verify that...
For cases like these, I try another SCSI HBA using a different driver,
for example LSI instead of Adaptec. If that driver works, you can be
quite sure it's a software problem.
I can't recall you telling us, but is that an Adaptec 2940U(2)W HBA?
There seem to be different revisions of the controller chips on these
cards, some of them buggy.
I looked through the driver's code once, and there were lots of special
cases for problem handling depending on the exact chipset revision.
Arno
---
scsi0: At time of recovery, card was not paused
Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi0: Dumping Card State in Command phase, at SEQADDR 0x157
Card was paused
ACCUM = 0x80, SINDEX = 0xac, DINDEX = 0xc0, ARG_2 = 0x0
HCNT = 0x0 SCBPTR = 0x0
SCSISIGI[0x84]:(BSYI|CDI) ERROR[0x0] SCSIBUSL[0x80]
LASTPHASE[0x80]:(CDI) SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI)
SBLKCTL[0x2]:(SELWIDE) SCSIRATE[0xf]:(SXFR_ULTRA2)
SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0x0] SSTAT0[0x7]:(DMADONE|SPIORDY|SDONE)
SSTAT1[0x2]:(PHASECHG) SSTAT2[0x0] SSTAT3[0x0]
SIMODE0[0x0] SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)
SXFRCTL0[0x88]:(SPIOEN|DFON) DFCNTRL[0x4]:(DIRECTION)
DFSTATUS[0x6d]:(FIFOEMP|DFTHRESH|HDONE|FIFOQWDEMP|DFCACHETH)
STACK: 0x37 0xcc 0x151 0x192
SCB count = 4
Kernel NEXTQSCB = 3
Card NEXTQSCB = 3
QINFIFO entries:
Waiting Queue entries:
Disconnected Queue entries:
QOUTFIFO entries:
Sequencer Free SCB List: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Sequencer SCB Info:
0 SCB_CONTROL[0x0] SCB_SCSIID[0x57] SCB_LUN[0x0] SCB_TAG[0x2]
1 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
2 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
3 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
4 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
5 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
6 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
7 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
8 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
9 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
10 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
11 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
12 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
13 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
14 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
15 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
Pending list:
2 SCB_CONTROL[0x0] SCB_SCSIID[0x57] SCB_LUN[0x0]
Kernel Free SCB list: 1 0
Untagged Q(5): 2
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
st 0:0:5:0: Device is active, asserting ATN
Recovery code sleeping
(scsi0:A:5:0): Recovery SCB completes
Unexpected busfree in Command phase
SEQADDR == 0x157
Recovery code awake
aic7xxx_abort returns 0x2002
st 0:0:5:0: Attempting to queue a TARGET RESET message
CDB: 0xa 0x0 0x0 0x28 0x0 0x0
st 0:0:5:0: Command not found
aic7xxx_dev_reset returns 0x2002
target0:0:5: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 15)
st0: Current: sense key: Not Ready
Additional sense: Logical unit is in process of becoming ready
st0: Current: sense key: Not Ready
Additional sense: Logical unit is in process of becoming ready
st0: <<DEFERRED>>: sense key: Hardware Error
ASC=0xc <<vendor>> ASCQ=0x80
Info fld=0x40fff
---
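For anyone reading the dump above, the "CDB:" line shows the command that was outstanding when the driver started recovery. A rough sketch of decoding it by hand follows; decode_write6 is a hypothetical helper written for this mail, not something from the driver. It relies on the standard layout of the 6-byte WRITE command for tape devices: opcode 0x0a, bit 0 of byte 1 as the FIXED flag, and bytes 2 to 4 as the transfer length.

```python
def decode_write6(cdb):
    """Decode a 6-byte WRITE CDB as laid out for sequential-access devices."""
    if len(cdb) != 6 or cdb[0] != 0x0A:
        raise ValueError("not a WRITE(6) CDB")
    # FIXED = 1: the length counts fixed-size blocks.
    # FIXED = 0: the length is the size in bytes of one variable block.
    fixed = bool(cdb[1] & 0x01)
    # The transfer length is a 24-bit big-endian value in bytes 2..4.
    length = (cdb[2] << 16) | (cdb[3] << 8) | cdb[4]
    return {
        "fixed": fixed,
        "transfer_length": length,
        "unit": "blocks" if fixed else "bytes",
    }


if __name__ == "__main__":
    # Bytes copied from the "CDB:" line in the dmesg output.
    cdb = [int(x, 16) for x in "0xa 0x0 0x0 0x28 0x0 0x0".split()]
    print(decode_write6(cdb))
```

For the CDB in the log this gives a variable-block write of 0x002800 = 10240 bytes, which would be consistent with tar's default record size of 20 x 512 bytes. So the command that timed out looks like an ordinary data write rather than anything exotic.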
R
GS
Hi Arno,
I've now tried two separate AHA-2944UW cards (BIOS v2.20) on two
machines, a Linux box (multiple kernels tested: 2.4.27, 2.6.8.13-k7 and
2.6.8.14-k7) and Windows XP with the HP tools, but still with no luck. I
tried different cables and terminators (I've also terminated directly on
the drive to eliminate everything else).
The drive is detected on boot and seems to work fine on Linux for the
quick drive test used by bacula in btape, but fails on longer writes
(btape fill test for example) with the error above.
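To narrow down how much data goes through before the failure, something along these lines could be used. It is only a sketch under my own assumptions: write_until_error is a made-up name, the 10240-byte block size is simply tar's default record size, and random data is used so that hardware compression does not shrink what reaches the tape. Pointing it at a tape device overwrites whatever is on the loaded tape.

```python
import os


def write_until_error(path, block_size=10240, max_blocks=1000):
    """Write equal-sized blocks to an existing file or device.

    Returns (blocks_written, error), where error is None if all
    max_blocks blocks were written without an OSError or short write.
    """
    block = os.urandom(block_size)  # incompressible payload
    written = 0
    fd = os.open(path, os.O_WRONLY)
    try:
        for _ in range(max_blocks):
            try:
                n = os.write(fd, block)
            except OSError as exc:
                # On the failing drive this is where EIO would show up.
                return written, exc
            if n != block_size:
                return written, OSError("short write of %d bytes" % n)
            written += 1
    finally:
        os.close(fd)
    return written, None


if __name__ == "__main__":
    # WARNING: destroys the data on the tape currently loaded.
    count, err = write_until_error("/dev/st0", max_blocks=100000)
    print("blocks written:", count, "error:", err)
```

If the block count at the point of failure is roughly the same on every run and every tape, that would point away from the media and towards the drive, the bus or the HBA.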
I also tried disabling auto wide negotiation and other settings in the
SCSI BIOS without any luck. Linux loads the aic7xxx module and uses the
aic7880 driver by the looks of things.
This has got me curious, since it worked for the previous owner a few
months ago, which makes me more inclined to believe it's an issue
(compatibility maybe?) with the AHA-2944UW cards I have. The previous
owner has also given me the HP branded cable that they used; however, one
end uses a Very High Density connector, so I'm unable to use it.
Arno or anyone else reading this, are you able to suggest a particular
brand and model of HVD card that you have had success with?
Thanks again
Gerard