Re: Kernel bug or disk failure

Todd Denniston writes:

Sam Varshavchik wrote, On 07/13/2008 10:51 AM:
Chris Snook writes:

Sam Varshavchik wrote:
Every other week or so, I get a disk kicked out of my RAID, with this:

Jul  6 04:05:38 commodore kernel: (scsi1:A:0:0): scsi1: device overrun (status 10) on 0:0:0
Jul  6 04:05:38 commodore kernel: Unexpected busfree in DT Data-in phase, 1 SCBs aborted, PRGMCNT == 0x22f
Jul  6 04:05:38 commodore kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Jul  6 04:05:38 commodore kernel: scsi1: Dumping Card State at program address 0x22d Mode 0x22
Jul  6 04:05:38 commodore kernel: Card was paused

… followed by a rather dry dump of the HBA's registers. This is the aic79xx driver.

This does not look like a disk error to me. I re-add the drive to the array, and it rebuilds with no downtime. SMART shows 0 in the defect list on this drive, and over the disk's lifetime 0 uncorrectable reads and 1 uncorrectable write -- but this kernel barf has already happened 4-5 times, and it's getting rather annoying.
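
To put a number on "4-5 times", one option is to count the card-state dumps per day in the syslog. The following is only a rough sketch, assuming the traditional syslog line layout shown in the excerpt above; the function name count_card_dumps is made up for illustration, and the log path mentioned afterwards is an assumption about this system.

```python
import re
from collections import Counter

# Matches a traditional syslog line from the kernel that announces a
# card-state dump, e.g.
#   "Jul  6 04:05:38 commodore kernel: >>>> Dump Card State Begins <<<<"
DUMP_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"\d{2}:\d{2}:\d{2}\s+\S+\s+kernel:.*Dump Card State Begins"
)


def count_card_dumps(lines):
    """Return a Counter mapping (month, day) to the number of dumps seen."""
    counts = Counter()
    for line in lines:
        match = DUMP_RE.match(line)
        if match:
            counts[(match.group("month"), int(match.group("day")))] += 1
    return counts
```

It accepts any iterable of lines, so an open log file (for instance /var/log/messages, if that is where kernel messages end up) can be passed in directly. A dump every week or two with no pattern tied to one drive would fit the controller or cabling theory better than a failing disk.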


Looks more like a controller problem than a drive problem. Do you have a spare HBA to test?

No, but I have one on order now. I reseated the cable, but that didn't help: the card dumped again about 12 hours later, though apparently non-fatally, because the RAID did not degrade.
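
Whether a given dump was fatal can be checked by looking at /proc/mdstat for a degraded array right after it happens. Below is a minimal sketch, assuming the usual md status line of the form "[2/1] [U_]"; the helper name degraded_arrays is hypothetical, and the parsing covers only that common layout.

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status line shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # An array section starts with a line such as "md0 : active raid1 ..."
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line carries "[total/active] [UU]"; "_" marks a missing disk.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            total, active = int(status.group(1)), int(status.group(2))
            if active < total or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded
```

Feeding it the contents of /proc/mdstat (read with open("/proc/mdstat").read()) gives an empty list while all members are present, which matches the "RAID did not degrade" case described above.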


May I suggest that, when it is convenient to do so, you:
1) Reboot.
2) Press Ctrl-A to enter the SCSI card's setup utility when the aic79xx boot text shows up during BIOS initialization.
3) Set the speed of the SCSI bus to that drive a little slower.
4) If you still get the fault, or the drive is not recognized, repeat with another speed until you get the desired result. (Some drives do not work at all of the speeds below their rated one; for example, a Promise U160-rated array communicated only at 160, 80, 66, 16 and 6.)

I'll try that if the replacement card still trips like this. These drives have been spinning away, 24x7, for a few years, with nary a hiccup.



