Re: Major SATA / EXT3 Issue?

Chris Holvenstot wrote:
I am curious if anyone else has had major problems with SATA drives on
the current series of kernels.  I have (or rather had) two SATA drives
on my system - the first was a Maxtor MaxLine 500 and the second was a
Maxtor MaxLine 250.

Both of these drives were set to the 1.5 gigabit / second mode.

My SATA controller is integrated on my MSI motherboard and sports four
ports.  It is implemented using the Nvidia CK804 chipset.  My processor
is an AMD64 X2 4600+ running the 32 bit version of Linux.
I have had these drives up and running for about six months.

The first drive "failed" about 10 days ago - and unfortunately I focused
on hardware error and after several attempts to get the drive back
online I physically pulled it from the system.  This drive was used for
backups and thus was not critical to day-to-day operations.
However, tonight I "lost" a second SATA drive, this one I use on a daily
basis for my kernel build and test processes.  It failed in the same
manner as the first, which makes me a little suspicious.


The first drive “failed” while I was running a modified Ubuntu 7.04
system. Because I focused on hardware as the reason for the failure I
did not collect specific information about the version of the kernel
being used, but it was likely to be 2.6.24-git8.


The second drive “failed” tonight on what is, except for the kernel, a
fairly standard Ubuntu 7.10 system (the same hardware - I upgraded my OS
this past week) – the kernel in use tonight at the time of the second
failure was 2.6.24-rc1-git1


In each case the failure mode appears to have been the same – the system
seems to lock up, and after a reboot I get a long string of messages like:


Oct 26 20:07:37 localhost kernel: [ 101.581091] ata2: timeout waiting for ADMA IDLE, stat=0x440
Oct 26 20:07:37 localhost kernel: [ 101.581096] sd 1:0:0:0: [sda] Write Protect is off
Oct 26 20:07:37 localhost kernel: [ 101.581174] res 71/04:08:00:00:00/04:00:1d:00:00/e0 Emask 0x1 (device error)
Oct 26 20:07:37 localhost kernel: [ 101.644992] ata2.00: configured for UDMA/33
Oct 26 20:07:37 localhost kernel: [ 101.644994] ata2: EH complete
Oct 26 20:07:37 localhost kernel: [ 101.645006] sd 1:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
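
As a reading aid for the "res" line in the log above (not part of the original exchange): in libata's error report the first two bytes of the res field are the ATA status and error registers. The sketch below decodes them in Python. The helper name decode_res is hypothetical, and the bit tables cover only a subset of the classic ATA register bits, so treat it as an illustration rather than a complete decoder.

```python
# Subset of the classic ATA status and error register bits (assumed sufficient here).
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def decode_res(res_field):
    """Decode the status/error bytes at the start of a libata 'res' field.

    res_field is the hex string that follows 'res ' in the log line,
    e.g. '71/04:08:00:00:00/04:00:1d:00:00/e0'.
    """
    status_hex, rest = res_field.split("/", 1)
    error_hex = rest.split(":", 1)[0]
    status, error = int(status_hex, 16), int(error_hex, 16)
    return {
        "status": [name for bit, name in STATUS_BITS.items() if status & bit],
        "error": [name for bit, name in ERROR_BITS.items() if error & bit],
    }


if __name__ == "__main__":
    print(decode_res("71/04:08:00:00:00/04:00:1d:00:00/e0"))
```

For the line quoted above this reports the status bits DRDY, DF, DSC and ERR and the error bit ABRT, i.e. the device itself flagged an error and aborted the command.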

You should try to get some output from dmesg itself rather than from the messages log, as the log daemon seems to have a nasty habit of discarding critical output from these errors. In this case the line for the failing command is missing, and even the message ordering seems off.
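
As a rough sketch of the triage this makes possible (not part of the original exchange): the Python helper below keeps only the lines of saved dmesg text that look libata-related. The name filter_ata_lines and the pattern are assumptions made for illustration, and the pattern is certainly not an exhaustive list of driver prefixes.

```python
import re

# Assumed pattern: ATA ports/devices (ata2, ata2.00), the sata_nv driver,
# SCSI hosts, sd device addresses, and libata's "cmd"/"res" taskfile lines.
ATA_PATTERN = re.compile(
    r"\bata\d+(\.\d+)?\b"
    r"|\bsata_nv\b"
    r"|\bscsi\d+\b"
    r"|\bsd \d+:\d+:\d+:\d+"
    r"|\b(?:cmd|res) [0-9a-f]{2}/"
)


def filter_ata_lines(dmesg_text):
    """Return the lines of dmesg_text that match ATA_PATTERN, in original order."""
    return [line for line in dmesg_text.splitlines() if ATA_PATTERN.search(line)]


# The text itself could come from a file written with `dmesg > dmesg.txt`,
# or from subprocess.run(["dmesg"], capture_output=True, text=True).stdout.
```

Filtering the raw dmesg text, rather than the syslog copy, keeps the "cmd" and "res" lines next to the port message they belong to.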



The hardware appears to be correctly identified by the BIOS during the
power up sequence.

Not much is seen in the dmesg log except for:


[ 43.649673] scsi0 : sata_nv
[ 43.649722] scsi1 : sata_nv
[ 43.649776] ata1: SATA max UDMA/133 cmd 0x9f0 ctl 0xbf0 bmdma 0xcc00 irq 19
[ 43.649778] ata2: SATA max UDMA/133 cmd 0x970 ctl 0xb70 bmdma 0xcc08 irq 19

There should be more than this at the very least. As above, please try to get output from dmesg itself.



When I try to run a file system check on these devices I get:




e2fsck 1.40.2 (12-Jul-2007)
fsck.ext2: No such file or directory while trying to open /dev/sdb1

The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>


I have a gut feeling that when the system appears to lock up what is
really going on is that the contents of the drive are being trashed. But
I have no proof of that.

I don't think that is the case; it looks more like the drives have not been detected at all. If this happens after a reboot when they were working before, that most likely points to some kind of hardware issue.
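
One way to tell "not detected" apart from "contents trashed" is to check whether the device nodes and the kernel's block device entries exist at all; the "No such file or directory" from fsck.ext2 suggests the /dev/sdb1 node is simply absent. The Python sketch below is not from the original exchange; the function names are hypothetical, and it assumes a Linux system with sysfs mounted at /sys.

```python
import os


def missing_device_nodes(paths):
    """Return the entries of paths that do not exist on this system."""
    return [p for p in paths if not os.path.exists(p)]


def kernel_block_devices(sys_block="/sys/block"):
    """List block device names the kernel has registered.

    Returns an empty list if the sys_block directory itself is missing.
    """
    try:
        return sorted(os.listdir(sys_block))
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    print("missing nodes:", missing_device_nodes(["/dev/sda", "/dev/sdb", "/dev/sdb1"]))
    print("kernel block devices:", kernel_block_devices())
```

If a disk does not show up in /sys/block at all, the kernel never registered it, and no filesystem check can say anything about its contents.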



When I try to do a parted to see what the system thinks is on the drive
I get the error message:


Error: Error opening /dev/sdb: No medium found

I am not having any problems with my EXT3 file systems located on
“standard” IDE / PATA drives.


My config file, which has not changed in months beyond taking the
defaults during make oldconfig looks like:

--
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/
