Re: [RESEND][2.6.15] New ATA error messages on upgrade to 2.6.15

On Apr 2, 2006, at 17:24:43, Alan Cox wrote:

> On Sul, 2006-04-02 at 15:55 -0400, Kyle Moffett wrote:
>> (2) It's extremely unlikely that the card itself is faulty; it exhibits identical symptoms on both drives and has ever since I originally purchased the card and installed 2.4.X on the system.
> If it has always shown those symptoms then I'd say it's quite likely the card if the crystals/PLLs on it are out. It looks like the timing is wrong, which means either the input clocks (eg PCI clock) are wrong (eg 37.5MHz not 33 due to BIOS overclock settings or just plain out), the card has a dodgy crystal/PLL or the kernel set it up wrong.
> PCI timings won't move between motherboards, PLL faults won't move between cards.
> Unless anyone else is seeing the same problem with the same card variant or you have two cards that do it then there isn't much that can be done I suspect other than assume the hardware is iffy, rightly or wrongly. I'd have expected a lot more reports if it were the controller.

Hmm, ok, thanks for the information. If it were possible, I'd be extremely suspicious that the card's firmware was either buggy or Linux didn't know how to respond to the odd hardware variant; I don't recall them producing that model of card for very long, so it's quite possible there aren't many of them around and they have some kind of timing quirk nobody knows about.

CRC issues aside, there is that other MULTWRITE_EXT error that only occurs on hdi (and if I swap hdi and hdg, the error follows the drive). The error is also specific to 2.6.15+; it does not occur on the 2.6.12+patch that I switched from a month ago. I'm assuming that since the drive/card stop giving BadCRC errors, they're able to communicate successfully at the extremely low speed.

With a little more tinkering with hdparm I was able to determine that the drives on the built-in controller and the primary bus of the PCI controller were both in DMA mode, the former in udma4 and the latter in udma3. The originally problematic drive (the one giving the MULTWRITE_EXT errors) was in PIO mode, though "hdparm -d1 /dev/hdi" "fixed" that problem and resulted in a drastic increase in drive bus speed as measured by "hdparm -tT" (from 2MB/sec to around 23MB/sec or so). hdi ended up in udma2 according to "hdparm -i".
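
For anyone wanting to check the same thing on several drives at once, here is a rough Python sketch of how the active mode could be pulled out of "hdparm -i" output and compared against the nominal burst ceiling for that mode. The SAMPLE text is a made-up excerpt, not output captured from this system, and the function and table names are purely illustrative; it assumes the usual "hdparm -i" convention of marking the active mode with an asterisk.

```python
import re

# Nominal maximum burst rates for the Ultra DMA modes, in MB/s.
UDMA_MAX_MB_S = {0: 16.7, 1: 25.0, 2: 33.3, 3: 44.4, 4: 66.7, 5: 100.0, 6: 133.0}


def active_udma_mode(hdparm_i_output):
    """Return the starred (active) UDMA mode number, or None if there is none."""
    for line in hdparm_i_output.splitlines():
        if "UDMA modes:" in line:
            match = re.search(r"\*udma(\d)", line)
            return int(match.group(1)) if match else None
    return None


# Hypothetical excerpt in the style of "hdparm -i /dev/hdi" (assumption,
# not real output from the machine discussed here).
SAMPLE = """\
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 *udma2 udma3 udma4 udma5
"""

mode = active_udma_mode(SAMPLE)
ceiling = UDMA_MAX_MB_S[mode]
measured = 23.0  # MB/s, the approximate "hdparm -tT" figure quoted above
print(f"udma{mode}: ceiling {ceiling} MB/s, measured is {measured / ceiling:.0%} of it")
```

The ceiling is the bus burst rate, so a sustained figure somewhat below it is expected and is not by itself a sign of trouble.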

Just for clarity, I'm repeating the _new_ error below. This one recurs about once or twice an hour, but only on the Samsung drive. If the answer is (as seems likely) "Your drive has bad firmware but the error is totally harmless", then I'll be perfectly happy, although I'd kind of prefer if the kernel could detect the buggy firmware and work around it (maybe by switching back to whatever the old behavior was, whenever it changed). I'd otherwise be happy to git-bisect except for the fact that a number of people rely on this system for day-to-day activities.

Mar 28 03:15:13 penelope kernel: hdi: status timeout: status=0xd0 { Busy }
Mar 28 03:15:13 penelope kernel: PDC202XX: Secondary channel reset.
Mar 28 03:15:13 penelope kernel: hdi: no DRQ after issuing MULTWRITE_EXT
Mar 28 03:15:13 penelope kernel: ide4: reset: success
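
As a reading aid for the status byte in that log, here is a small Python sketch that decodes it using the standard ATA status register bit layout. The labels are modelled on what the old IDE driver prints, and the rule that only "Busy" is reported when the busy bit is set is my understanding of that driver rather than something checked against the 2.6.15 source; the helper names are illustrative.

```python
import re

# Standard ATA status register bits, highest bit first.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the flag names a driver would report for this status byte."""
    if status & 0x80:
        # While the drive is busy the remaining bits are not meaningful.
        return ["Busy"]
    return [name for bit, name in STATUS_BITS if bit != 0x80 and status & bit]


def status_from_log(line):
    """Extract the status byte from a 'status=0x..' kernel log line, or None."""
    match = re.search(r"status=0x([0-9a-fA-F]{2})", line)
    return int(match.group(1), 16) if match else None


line = "Mar 28 03:15:13 penelope kernel: hdi: status timeout: status=0xd0 { Busy }"
status = status_from_log(line)
print(decode_status(status))                # flags as reported
print("DRQ set:", bool(status & 0x08))      # raw DataRequest bit
```

For 0xd0 the raw DataRequest bit (0x08) is clear while the busy bit is set, which fits the "no DRQ after issuing MULTWRITE_EXT" message: the drive was still busy when the driver gave up waiting for it to request data.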

The drive on the built-in controller is correctly set to udma4 mode, though if I attempt to bump that up to udma5 (which is listed as supported in "hdparm -i /dev/hda"), then the drive becomes completely unresponsive until the next reboot. I'm waiting for the RAID to finish rebuilding before I try increasing the UDMA speed on the other drives to see what happens.

Thanks again for the help and consideration!

Cheers,
Kyle Moffett
