On Tue, 2006-03-14 at 07:21, Reuben D. Budiardja wrote:
> So my question, having said all that, is: is there anything else other
> than a real hard-drive problem that would cause something like this?
> In other words, could the problem be in the controller, motherboard,
> etc., rather than the hard drive itself, that would cause hard drives
> to fail like that?
> Or is it just that Maxtor makes bad drives?
> Or is it that a consumer-level hard drive just cannot be used for this
> kind of work?

All manufacturers have made bad batches of drives, even in their
enterprise SCSI lines. I think what happens is that certain sections of
a drive aren't accessed for a long time, so you don't notice the second
drive going bad. Then when the first one fails, the rebuild has to
traverse all the sectors on the remaining disks to reconstruct the
broken one, and that's when it hits the problem. This can happen even
in RAID1 mirrors.

One thing that can help is to periodically force a read of all sectors
with something like 'cat /dev/hdn >/dev/null' for each of the
underlying disks, then look at the 'dmesg' output for errors so you'll
notice any problem before the second drive goes (a rough sketch of such
a check is at the end of this message). You should probably try a
different cable too - newer drives are fairly sensitive to cable
problems.

I'm using a RAID1 mirror for a backup archive that consists of 2
internal IDE drives plus a set of 3 external FireWire drives that are
added to the array, re-synced, and then rotated offsite. This gives a
slightly longer history and the security of an offline/offsite copy,
and as a side effect it tests the whole disk during the mirror process.
But the Linux FireWire drivers have been a problem. I think Fedora
works again, but I switched to CentOS with the centosplus kernel during
the several months that the Fedora kernel was broken.

-- 
Les Mikesell
lesmikesell@xxxxxxxxx
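
Here is a minimal sketch of the periodic read check described above,
assuming the mirror members are /dev/hda and /dev/hdc (hypothetical
names - substitute your own devices). Something like this could be run
from cron:

  #!/bin/sh
  # Read every sector of each member disk; a failing sector shows up
  # as an I/O error in the kernel log rather than on stdout.
  for disk in /dev/hda /dev/hdc; do
      cat $disk > /dev/null
  done
  # Then scan the kernel log for errors produced by the reads.
  dmesg | grep -i error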