Martin Drab wrote:
Well, I had a similar experience lately with the Adaptec AAC-2410SA RAID
5 array. Due to the CPU overheating the whole box was suddenly shot down
by the CPU damage protection mechanism. While there is no battery backup
on this particular RAID controller, the sudden poweroff caused some very
localized inconsistency of one disk in the RAID. The configuration was
1x160 GB and 3x120GB, with the 160 GB being split into 120 GB part within
the RAID 5 and a 40 GB part as a separate volume. The inconsistency
happend in the 40 GB part of the 160 GB HDD (as reported by the Adaptec
BIOS media check). In particular the problem was in the /dev/sda2 (with
/dev/sda being the 40 GB Volume, /dev/sda1 being an NTFS Windows system,
and /dev/sda2 being ext3 Linux system).
Now, what is interesting, is that Linux completely refused any possible
access to every byte within /dev/sda, not even dd(1) reading from any
position within /dev/sda, not even "fdisk /dev/sda", nothing. Everything
ended up with lots of following messages:
sd 0:0:0:0: SCSI error: return code = 0x8000002
sda: Current: sense key: Hardware Error
Additional sense: Internal target failure
Info fld=0x0
end_request: I/O error, dev sda, sector <some sector number>
But /dev/sda is not a Linux filesystem, running fsck on it makes no
sense. You wanted to run on /dev/sda2.
I've consulted this with Mark Salyzyn, because I thought it was a problem
of the AACRAID driver. But I was told, that there is nothing that AACRAID
can possibly do about it, and that it is a problem of the upper Linux
layers (block device layer?) that are strictly fault intollerant, and
thouth the problem was just an inconsistency of one particular localized
region inside /dev/sda2, Linux was COMPLETELY UNABLE (!!!!!) to read a
single byte from the ENTIRE VOLUME (/dev/sda)!
The obvious test of this "it's not us" statement is to connect that one
drive to another type controller and see if the upper level code
recovers. I'm assuming that "sda" is a real drive and not some
pseudo-drive which exists only in the firmware of the RAID controller.
That message is curious, did you cat /proc/scsi/scsi to see what the
system thought was there? Use the infamous "cdrecord -scanbus" command?
And now for the best part: From Windows, I was able to access the ENTIRE
VOLUME without the slightest problem. Not only did Windows boot entirely
from the /dev/sda1, but using Total Commander's ext3 plugin I was also
able to access the ENTIRE /dev/sda2 and at least extract the most
important data and configurations, before I did the complete low-level
formatting of the drive, which fixed the inconsistency problem.
I call this "AN IRONY" to be forced to use Windows to extract information
from Linux partition, wouldn't you? ;)
(Besides, even GRUB (using BIOS) accessed the /dev/sda without
complications - as it was the bootable volume. Only Linux failed here a
100%. :()
From the way you say sda when you presumably mean sda1 or sda2 it's not
clear if you don't understand the difference between drive and partition
access or are just so pissed off you are not taking the time to state
the distinction clearly.
There was a problem with recovery from errors in RAID-5 which is
addressed by recent changes to fail a sector, try rewriting it, etc. I
would have to read linux-raid archives to explain it, so I'll stop with
the overview. I don't think that's the issue here, you're using a RAID
controller rather than the software RAID, so it should not apply.
I assume that the problem is gone, so we can't do any more analysis
after the fact.
--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]