On Fri, 2004-04-30 at 15:01, Guolin Cheng wrote:
> Hi, jludwig,
>
> Thanks for your helpful information.
>
> Because I'm running Linux, so I assume there are no viruses. Then comes
> several questions:
>
> 1, How can I know whether all the spare sectors are in use and the disk
> will lose data, or it is just the beginning of disk failure?
>

There is no reliable way to know how many spare sectors are left (even new
drives use a few, since perfect media is rare). Remapping is handled by the
drive's firmware and happens automatically.

> 2, How I can identify that the hard drive becomes dying at the first
> minute?

Run the smartd daemon:

    chkconfig smartd on

This enables it at boot; "service smartd start" starts it right away.

>
> 3, How to identify the malfunctioning hard drives? Should I idle the
> machine and test hard drives one by one to figure it out? Mostly it is
> the failure-reporting hard drive failed, but I remember for sure, in a
> few cases, other alternative hard drives failed instead.

The only way to really check a hard drive is to run several full read/write
passes over every sector. Needless to say, the drive must be taken out of
service, and all data on it will be destroyed.

>
> 4, Should I replace hard drives when I first see this kind of disk error
> messages in case data begin to lose?

When you see this, it usually indicates that the drive has used up all its
spares. When you do see it:

1) Back up your data.
2) Watch for another R/W failure.
3) Depending on the nature of the drive and the system, have a new drive
   ready.
4) Don't assume the drive has failed or lost sectors.

I have had drives that were "thrown out" when all that was really needed was
a so-called "low level format", which rechecks all sectors. (This is not a
true low level format, which can only be done at the factory or another
facility with the proper equipment.)

> Thanks a LOT...
>
> --Guolin Cheng
>

Snip

--
jludwig <wralphie@xxxxxxxxxxx>
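
To go with the smartd advice above, here is a minimal sketch of pulling the sector counters out of "smartctl -A" output (smartctl ships in the same smartmontools package as smartd). It assumes the usual ten-column attribute table and that the drive reports Reallocated_Sector_Ct and Current_Pending_Sector, which not every drive does. The function names and the SAMPLE text are made up for illustration; SAMPLE is not output from a real drive. A reallocated count tells you that sectors have been remapped, not how many spares remain.

```python
import subprocess

# Attribute names as printed by `smartctl -A`. Which of these a given
# drive reports varies by vendor and model (assumption: yours has them).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")


def parse_smart_attributes(text, names=WATCHED):
    """Return {attribute_name: raw_value} for watched attributes in text."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID;
        # the header row and the banner lines fail this check and are skipped.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in names:
            raw = fields[9]  # RAW_VALUE column
            if raw.isdigit():
                found[fields[1]] = int(raw)
    return found


def read_drive(device):
    """Run smartctl on one device (hypothetical helper; needs root)."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_smart_attributes(result.stdout)


# Illustrative text in the shape of a smartctl attribute table (invented).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    print(parse_smart_attributes(SAMPLE))
```

Calling read_drive("/dev/hda") on each disk in turn and comparing the numbers from day to day is one way to see which drive is actually degrading. A count that keeps climbing is a stronger warning than a small, stable one.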