Re: [RESEND][2.6.15] New ATA error messages on upgrade to 2.6.15

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Apr 2, 2006, at 17:24:43, Alan Cox wrote:
On Sul, 2006-04-02 at 15:55 -0400, Kyle Moffett wrote:
(2) It's extremely unlikely that the card itself is faulty; it exhibits identical symptoms on both drives and has ever since I originally purchased the card and installed 2.4.X on the system.

If it has always shown those symptoms then I'd say its quite likely the card if the crystals/PLLs on it are out. It looks like the timing is wrong, which means either the input clocks (eg PCI clock) are wrong (eg 37.5Mhz not 33 due to BIOS overclock settings or just plain out), the card has a dodgy crystal/PLL or the kernel set it up wrong.

PCI timings won't move between motherboards, PLL faults wont move between cards.

Unless anyone else is seeing the same problem with the same card variant or you have two cards that do it then there isn't much that can be done I suspect other than assume the hardware is iffy, rightly or wrongly. I'd have expected a lot more reports if it were the controller.

Hmm, okm thanks for the information. If it was possible, I'd be extremely suspicious that the card's firmware was either buggy or Linux didn't know how to repsond to the odd hardware variant; I don't recall them producing that model of card for very long, so it's quite possible there aren't many of them around and they have some kind of timing quirk nobody knows about.

CRC issues aside, there is that other MULTWRITE_EXT error that only occurs on hdi (and if I swap hdi and hdg, the error follows the drive). The error also is specific to 2.6.15+, it does not occur on the 2.6.12+patch that I switched from a month ago. I'm assuming that since the drive/card stop giving BadCRC errors that they're able to communicate successfully at the extremely low speed.

With a little more tinkering with hdparm I was able to determine that the drives on the built-in controller and the primary bus of the PCI controller were both in DMA mode, the former in udma4 and the latter in udma3. The originally problematic drive (the one giving the MULTWRITE_EXT errors) was in PIO mode, though "hdparm -d1 /dev/hdi" "fixed" that problem and resulted in a drastic increase in drive bus speed as measured by "hdparm -tT". (from 2MB/sec to around 23MB/sec or so). hdi ended up in udma2 according to "hdparm -i"

Just for clarity, I'm repeating the _new_ error below. This one recurs about once or twice an hour, but only on the samsung drive. If the answer is (as it seems likely) "Your drive has bad firmware but the error is totally harmless", then I'll be perfectly happy, although I'd kind of prefer if the kernel could detect the buggy firmware and work around it (maybe by switching back to whatever the old behavior was, whenever changed). I'd otherwise be happy to git- bisect except for the fact that a number of people rely on this system for day-to-day activities.

Mar 28 03:15:13 penelope kernel: hdi: status timeout: status=0xd0 { Busy }
Mar 28 03:15:13 penelope kernel: PDC202XX: Secondary channel reset.
Mar 28 03:15:13 penelope kernel: hdi: no DRQ after issuing MULTWRITE_EXT
Mar 28 03:15:13 penelope kernel: ide4: reset: success

The drive on the built-in controller is correctly set to udma4 mode, though if I attempt to bump that up to udma5 (which is listed as supported in "hdparm -i /dev/hda"), then the drive becomes completely unresponsive until the next reboot. I'm waiting for the RAID to finish rebuilding before I try increasing the UDMA speed on the other drives to see what happens.

Thanks again for the help and consideration!

Cheers,
Kyle Moffett

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux