Fedora Users — Re: hard drive problems

On Monday 06 February 2006 21:30, zanecb@xxxxxxxxxxxxxxxxxxxxxxx wrote:
>> Zane C. B. wrote:
>>> hdb: lost interrupt
>>> hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
>>> ide: failed opcode was: unknown
>>> hda: drive not ready for command
>>> hda: irq timeout: status=0xd0 { Busy }
>>> ide: failed opcode was: unknown
>>> ide0: reset: success
>>>
>>> Any ideas what is happening or suggestions for testing for what is
>>> happening?
>>>
>>> smartctl -H /dev/hda and smartctl -l error /dev/hda show the drive
>>> as being good.
>>
>> Download and run the manufacturer's diagnostics tools (full barrage
>> of tests) to eliminate hard drive faults first. Then we can see
>> whether we're looking at a controller issue?
>
>Actually trying to avoid this so I don't have to take the machine out
> of production.

If it's indeed that important, then DO IT NOW while you still have data
that can be recovered when it does curl up its toes. If the data is
valuable, then production must understand that their baby needs a fresh
diaper.  If they can't do that, then I assume you can get an overtime
approval to check it after hours?  But by then, it may well be too
late.  IBM published some papers a few years ago about how they were
attempting to arrive at some sort of meaningful indicator of
impending drive failure, but their best work at the time could give
only a 20 minute warning, and this was in the heyday of the Deathstars.
Apply Moore's Law, and it might be 8 hours today.  Emphasis on the might...
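
For what it's worth, the status bytes in the quoted log can be decoded
by hand, which at least shows what the drive was reporting at each
step. Below is a minimal Python sketch, assuming the standard ATA
status register bit layout. The function name decode_status and the
table name STATUS_BITS are made up for illustration, and the labels
simply mirror the wording in the kernel messages above.

```python
# Illustrative sketch: decode an ATA status byte into flag names.
# Assumes the standard ATA status register layout (bit 7 = BSY ... bit 0 = ERR).
# STATUS_BITS and decode_status are hypothetical names, not kernel code.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the list of flag names set in an ATA status byte."""
    if status & 0x80:
        # While Busy is set, the remaining bits are not meaningful,
        # so only Busy is reported.
        return ["Busy"]
    return [name for mask, name in STATUS_BITS if status & mask]


for value in (0x58, 0xD0):
    flags = " ".join(decode_status(value))
    print(f"status=0x{value:02x} {{ {flags} }}")
```

Run against the two values from the log, this gives DriveReady,
SeekComplete and DataRequest for 0x58, and just Busy for 0xd0. So the
drive was sitting there with data pending and then went busy and never
came back before the timeout. That does not say whether the drive, the
cable or the controller is at fault, only what the symptom was.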

I trust that you do have backups?  Don't you?

Common Sense...  Why is it so uncommon?


-- 
Cheers, Gene
People having trouble with vz bouncing email to me should add the word
'online' between the 'verizon', and the dot which bypasses vz's
stupid bounce rules.  I do use spamassassin too. :-)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2006 by Maurice Eugene Heskett, all rights reserved.