On Wed, 2006-02-01 at 10:36 +0000, Terry Barnaby wrote:
> Gilboa Davara wrote:
> > On Wed, 2006-02-01 at 09:45 +0000, Terry Barnaby wrote:
> >
> >>Hi,
> >>
> >>I have just set up a Raid 5 disk array using 4 SATA disks on Fedora 4.
> >>To test the setup I unplugged the SATA cable from one of the disk drives.
> >>I was expecting the system to carry on with messages from the Raid system
> >>indicating that there was a disk drive down.
> >>
> >>However the Raid 5 partition became completely inaccessible after un-plugging
> >>the drive. The kernel reported disk errors but there were no error messages
> >>from the Raid system and "mdadm -Q --detail /dev/md2" reported that there
> >>were no problems with the Raid array.
> >>
> >>When I rebooted the system (needed a reset) the Raid system reported that
> >>one disk was down and the partition became readable again.
> >>
> >>It appears that the default configuration of the Raid 5 system does not
> >>handle a complete drive failure during up-time. I presume it may respond to
> >>disk errors from a disk drive that is connected but once disconnected the
> >>Raid system appears to ignore errors.
> >>
> >>Is there a configuration option to allow the Raid system to respond to
> >>a completely broken drive or cable ?
> >>
> >>Terry
> >>
> >
> >
> > A. Are you sure your machine/controller supports hot plug? SATA doesn't
> > support it by default. (You'll need special drive enclosures and a
> > hot-plug-supporting controller.)
> > B. Can you post your complete machine configuration?
> >
> > Gilboa
> >
> Hi,
>
> Thank you for the response.
>
> A. No, I don't think the SATA controller is a hot-plug-supporting controller.
> It is a: "Intel Corporation 82801FB/FW (ICH6/ICH6W) SATA Controller (rev 04)".
> B.
> Motherboard: AOPEN i915Ga-PLF
> CPU: Pentium 4 3GHz
> Disks: 4 * SATA WD Caviar 320G
>
> Partitions: Each disk has: 1 - 20G, 2 - 1G (swap), 3 - ~300G
> Raid: "/"      /dev/md0 Raid1 using /dev/sda1,/dev/sdb1
>       "/spare" /dev/md1 Raid1 using /dev/sdc1,/dev/sdd1
>       "/data"  /dev/md2 Raid5 using /dev/sda3,/dev/sdb3,/dev/sdc3,/dev/sdd3
>
> Although the SATA controller is not a "hot-plug" controller I assumed that
> disconnecting a SATA disk to simulate a cable failure or complete drive failure
> would cause the RAID system to react correctly. Certainly I see kernel
> error messages from the disk/controller in question and I would have assumed that
> the RAID system would react to this ...
>
> Terry
>

By default, software RAID1/5/6 supports on-line drive kill/remove/rebuild/etc.
However, it seems that the MD driver is unaware of the dead drive.
What does /proc/mdstat say?

Gilboa
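
For anyone following along, the /proc/mdstat check asked about above can also be scripted. What follows is only a rough Python sketch, assuming the usual mdstat layout in which each array's status line carries something like "[4/3] [UUU_]" (an underscore marks a missing member). The sample text, its block counts, and the function name degraded_arrays are hypothetical and are not taken from Terry's machine.

```python
import re

# Hypothetical /proc/mdstat content for a degraded 4-disk RAID5.
# Block counts and device order are illustrative only.
SAMPLE = """\
Personalities : [raid1] [raid5]
md2 : active raid5 sdd3[3](F) sdc3[2] sdb3[1] sda3[0]
      937199616 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

md0 : active raid1 sdb1[1] sda1[0]
      20482752 blocks [2/2] [UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for arrays missing a member."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        # An array section starts with a line such as "md2 : active raid5 ...".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status looks like "[4/3] [UUU_]": expected/active, then a map.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected, active, member_map = status.groups()
            if "_" in member_map or expected != active:
                result[current] = member_map
            current = None
    return result


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a real system the text would come from reading /proc/mdstat rather than from SAMPLE. If the MD driver never noticed the dead drive, as seems to be the case here, the function would return an empty result even though the array is unusable, which is exactly the symptom in question.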