Software raid not kicking devices out of the array

Hello,

I have a server with 5 Serial ATA disks, 4 of which are arranged as 2
software RAID1 arrays. Today the server stopped responding (no ping,
nothing on the screen, even Num Lock not working), and after inspecting
the logs I found 5 records like:

Mar  9 19:30:00 shaman ata5: status=0x51 { DriveReady SeekComplete Error }
Mar  9 19:30:00 shaman ata5: error=0x0c { DriveStatusError }

(not consecutive) before the freeze. The first one was at 19:03, about
half an hour before the freeze. I'm pretty sure that the reason the
server stopped responding is a hard drive failure.

So the question is: isn't RAID supposed to kick a device out of the
array on an I/O error? Of course I could write a script that monitors
the logs and kicks drives out, but that does not sound like a good
solution.
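
For reference, here is a minimal sketch of what such a log-watching
script might look like. It only counts "ataN: error=" records per port
and builds the mdadm commands without running them. The names
count_ata_errors, fail_commands and port_to_member, the threshold of 3,
and the device paths in the usage note are all made up for
illustration. In particular, the mapping from an ATA port to an array
member is an assumption that has to be maintained by hand, since the
log lines above do not name the md device.

```python
import re
from collections import Counter

# Matches libata error lines such as:
#   "Mar  9 19:30:00 shaman ata5: error=0x0c { DriveStatusError }"
ATA_ERROR = re.compile(r"\b(ata\d+): error=0x([0-9a-fA-F]+)")


def count_ata_errors(lines):
    """Count error= records per ATA port name (e.g. 'ata5')."""
    counts = Counter()
    for line in lines:
        match = ATA_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def fail_commands(counts, port_to_member, threshold=3):
    """Build (but do not run) mdadm commands for ports at or over threshold.

    port_to_member is a hypothetical, hand-maintained mapping from an ATA
    port name to an (array, member) pair; the kernel log alone does not
    say which array member sits behind a given port.
    """
    commands = []
    for port, seen in sorted(counts.items()):
        if seen >= threshold and port in port_to_member:
            array, member = port_to_member[port]
            commands.append(["mdadm", array, "--fail", member])
    return commands
```

With a mapping such as {"ata5": ("/dev/md1", "/dev/sdd1")} (example
paths only), fail_commands returns a list that could be handed to
subprocess.run once the counts pass the threshold. This just automates
the manual mdadm -f step, so it shares the weakness of any log-scraping
approach: it reacts only after the errors have been written to the log.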

The drive was still in the array after the reboot, and it continued to
produce these errors until I removed it from the array with mdadm -f.

I'm attaching the dmesg output from the machine after the reboot.

Anton Titov
Host.bg

Attachment: dmesg.shaman.gz
Description: GNU Zip compressed data

