Re: FYI: RAID5 unusably unstable through 2.6.14

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 3 Feb 2006, Martin Drab wrote:

> On Thu, 2 Feb 2006, Bill Davidsen wrote:
> 
> Just to state clearly in the first place. I've allready solved the problem 
> by low-level formatting the entire disk that this inconsistent array in 
> question was part of.
> 
> So now everything is back to normal. So unforunatelly I would not be able 
> to do any more tests on the device in the non-working state.
> 
> I mentioned this problem here now just to let you konw that there is such 
> a problematic Linux behviour (and IMO flawed) in such circumstances, and 
> that perhaps it may let you think of such situations when doing further 
> improvements and development in the design of the block device layer (or 
> wherever the problem may possibly come from).

Perhaps maybe rather of the SCSI layer, than of the block layer ??
 
> And I also hope you would understand, that I wouldn't try to create that 
> state again deliberatelly, since my main system is running on that array 
> and I wouldn't risk loosing some more data because of this.
> 
> However maybe someone perhaps in Adaptec or smewhere else may have 
> some simillar system at the disposal on which he could allow to experiment 
> on demand without any serious risk of loosing anything important.
> 
> So what I may say is that it is an Adaptec 2410SA with 8205 firmware and 
> without a battery backup system (which is probably the crutial thing). 
> And the inconsistency was caused by a MB protection of CPU overheat 
> shutdown, because I've started the system and booted Linux from the array 
> in question (which consisted by just one part of one disk), while I've 
> forgotten to turn on the water cooling of the CPU and northbridge. So 
> after about 3 minutes the system automatically shut down and Linux was 
> probably doing some writing in that very moment, which wasn't able to 
> complete fully (most probably due to the lack of the battery backup system 
> on the RAID controller). So my guess is that this may be artificially 
> reproduced when you suddenly switch off a power source of the system while 
> Linux is doing some writing to the array.
> 
> My arrays in particular are:
> 
> 	HDD1 (160 GB): 120 GB Array 1, 40 GB Array 2
> 	HDD2 (120 GB): 120 GB Array 1
> 	HDD3 (120 GB): 120 GB Array 1
> 	HDD4 (120 GB): 120 GB Array 1
> 
> Where Array 1 is a RAID 5 array /dev/sdb (labeled as "Data 1"), which 
> contains just one 330 GB partition /dev/sdb1, and Array 2 is a bootable 
> (in Adaptec BIOS setup so called) Volume array (i.e. no RAID) /dev/sda 
> (labeled as "SYSTEM"), which contains /dev/sda1 (NTFS Windows), /dev/sda2 
> (ext3 Linux), /dev/sda3 (Linux swap). Problem was accessing the whole Array 2. 
> Array 1 from Linux worked well.
> 
> Then, when I tried, the array checking function within the BIOS of the 
> Adaptec controller found an inconsistency on the position somewhere in the 
> middle of the /dev/sda, so somewhere within the /dev/sda2 in particular. 
> So I low-level formatted the entire HDD1, resynced the Array 1 (which is 
> RAID 5, so no problem) and reinstalled both systems in Array 2, and now it 
> is all back to normal again.
> 
> > Martin Drab wrote:
> > 
> > > Well, I had a similar experience lately with the Adaptec AAC-2410SA RAID 5
> > > array. Due to the CPU overheating the whole box was suddenly shot down by
> > > the CPU damage protection mechanism. While there is no battery backup on
> > > this particular RAID controller, the sudden poweroff caused some very
> > > localized inconsistency of one disk in the RAID. The configuration was 1x160
> > > GB and 3x120GB, with the 160 GB being split into 120 GB part within the RAID
> > > 5 and a 40 GB part as a separate volume. The inconsistency happend in the 40
> > > GB part of the 160 GB HDD (as reported by the Adaptec BIOS media check). In
> > > particular the problem was in the /dev/sda2 (with /dev/sda being the 40 GB
> > > Volume, /dev/sda1 being an NTFS Windows system, and /dev/sda2 being ext3
> > > Linux system).
> > > 
> > > Now, what is interesting, is that Linux completely refused any possible
> > > access to every byte within /dev/sda, not even dd(1) reading from any
> > > position within /dev/sda, not even "fdisk /dev/sda", nothing. Everything
>                                         ^^^^^^^^^^^^^^
> > > ended up with lots of following messages:
> > > 
> > >         sd 0:0:0:0: SCSI error: return code = 0x8000002
> > >         sda: Current: sense key: Hardware Error
> > >             Additional sense: Internal target failure
> > >         Info fld=0x0
> > >         end_request: I/O error, dev sda, sector <some sector number>
> > 
> > But /dev/sda is not a Linux filesystem, running fsck on it makes no sense. You
> > wanted to run on /dev/sda2.
> 
> But I was talking about fdisk(1). This wasn't a problematic behaviour of a 
> filesystem, but of the entire block device.
> 
> > > I've consulted this with Mark Salyzyn, because I thought it was a problem of
> > > the AACRAID driver. But I was told, that there is nothing that AACRAID can
> > > possibly do about it, and that it is a problem of the upper Linux layers
> > > (block device layer?) that are strictly fault intollerant, and thouth the
> > > problem was just an inconsistency of one particular localized region inside
> > > /dev/sda2, Linux was COMPLETELY UNABLE (!!!!!) to read a single byte from
> > > the ENTIRE VOLUME (/dev/sda)!
> > 
> > The obvious test of this "it's not us" statement is to connect that one drive
> > to another type controller and see if the upper level code recovers. I'm
> > assuming that "sda" is a real drive and not some pseudo-drive which exists
> > only in the firmware of the RAID controller.
> 
> /dev/sda is a 40 GB RAID array consisting of just one 40 GB part of one 
> 160 GB drive. But it is in fact a virtual device supplied by the 
> controller. I.e. this 40 GB part of that disc behaves as an entire 
> harddisk (with it's own MBR etc.). And it is at the end of the drive, so 
> it may be a little tricky to find the exact position of the partitions 
> there, but it may be possible.
> 
> > That message is curious, did you
> > cat /proc/scsi/scsi to see what the system thought was there? Use the infamous
> > "cdrecord -scanbus" command?
> 
> ----------
> $ cdrecord -scanbus
> Cdrecord-Clone 2.01.01a03-dvd (i686-pc-linux-gnu) Copyright (C) 1995-2005 J�g Schilling
> Note: This version is an unofficial (modified) version with DVD support
> Note: and therefore may have bugs that are not present in the original.
> Note: Please send bug reports or support requests to warly at mandriva.com.
> Note: The author of cdrecord should not be bothered with problems in this 
> version.
> Linux sg driver version: 3.5.33
> Using libscg version 'schily-0.8'.
> scsibus0:
>         0,0,0     0) 'Adaptec ' 'SYSTEM          ' 'V1.0' Disk
>         0,1,0     1) 'Adaptec ' 'Data 1          ' 'V1.0' Disk
>         0,2,0     2) *
>         0,3,0     3) *
>         0,4,0     4) *
>         0,5,0     5) *
>         0,6,0     6) *
>         0,7,0     7) *
> 
> $ cat /proc/scsi/scsi
> Attached devices:
> Host: scsi0 Channel: 00 Id: 00 Lun: 00
>   Vendor: Adaptec  Model: SYSTEM           Rev: V1.0
>   Type:   Direct-Access                    ANSI SCSI revision: 02
> Host: scsi0 Channel: 00 Id: 01 Lun: 00
>   Vendor: Adaptec  Model: Data 1           Rev: V1.0
>   Type:   Direct-Access                    ANSI SCSI revision: 02
> -----------
> 
> The 0,0,0 is the /dev/sda. And even though this is now, after low-level 
> formatting of the previously inconsistent disc, the indications back then 
> were just the same. Which means every indication behaved as usual. Both 
> arrays were properly identified. But when I was accessing the inconsistent 
> one, i.e. /dev/sda, in any way (even just bytes, this has nothing to do 
> with any filesystems) the error messages mentioned above appeared. I'm not 
> sure what exactly was generating them, but I've CC'd Mark Salyzyn, maybe 
> he can explain more to it.
> 
> > > And now for the best part: From Windows, I was able to access the ENTIRE
> > > VOLUME without the slightest problem. Not only did Windows boot entirely
> > > from the /dev/sda1, but using Total Commander's ext3 plugin I was also able
> > > to access the ENTIRE /dev/sda2 and at least extract the most important data
> > > and configurations, before I did the complete low-level formatting of the
> > > drive, which fixed the inconsistency problem.
> > > 
> > > I call this "AN IRONY" to be forced to use Windows to extract information
> > > from Linux partition, wouldn't you? ;)
> > > 
> > > (Besides, even GRUB (using BIOS) accessed the /dev/sda without complications
> > > - as it was the bootable volume. Only Linux failed here a 100%. :()
> > 
> > From the way you say sda when you presumably mean sda1 or sda2 it's not clear
> > if you don't understand the difference between drive and partition access or
> > are just so pissed off you are not taking the time to state the distinction
> > clearly.
> 
> No, I understand the differences very clearly. But maybe I was just 
> unclear in my expressions (for which I appologize). What I mean is that 
> the problem was with the entire RAID array /dev/sda. So whenever ANY 
> access to ANY part of /dev/sda, which of course also includes accesses to 
> all of /dev/sda1, /dev/sda2, and /dev/sda3, the error messages appeared 
> and no access was performed. That includes even accesses like this
> "dd if=/dev/sda of=/dev/null bs=512 count=1" and any other possible 
> accesses. So the problem was with the entire device /dev/sda.
> 
> > There was a problem with recovery from errors in RAID-5 which is addressed by
> > recent changes to fail a sector, try rewriting it, etc.
> 
> Maybe this was again my bad explanation, but this wasn't a problem of a 
> RAID 5 array, and much less of a software array. Adaptec 2410SA is a 
> 4-channel HW SATA-I RAID controller.
> 
> > I would have to read linux-raid archives to explain it, so I'll stop 
> > with the overview. I don't think that's the issue here, you're using a 
> > RAID controller rather than the software RAID, so it should not apply.
> 
> Yes, exactly. And again, I've solved it by lowlevel formatting.
> 
> > I assume that the problem is gone, so we can't do any more analysis after the
> > fact.
> 
> Unfortunatelly, yes. But I've described above how did it happen, so maybe 
> someone in Adaptec would be able to reproduce, Mark?
> 
> Martin

====================================================
Martin Drab
Department of Solid State Engineering
Department of Mathematics
Faculty of Nuclear Sciences and Physical Engineering
Czech Technical University in Prague
Trojanova 13
120 00  Praha 2, Czech Republic
Tel: +420 22435 8649
Fax: +420 22435 8601
E-mail: [email protected]
====================================================

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux