Re: Catastrophic disk failure, where was smartd?

Bruno Wolff III wrote:
On Wed, Mar 26, 2008 at 08:35:49 -0500,
  "David G. Mackay" <mackay_d@xxxxxxxxxxxxx> wrote:
Shouldn't there have been some indication of problems prior to the
failure?

Only if you are lucky. Someone at Google published some information about
SMART around a year ago. For a high percentage of catastrophic failures,
there is no warning from SMART.


The big issue is that most SMART implementations don't scan the disk for bad blocks. In my experience several years ago, with 1000+ disks in service, the #1 failure was bad blocks, and SMART did little to catch that. The #2 failure was a failure to spin up at all, but this seemed to be confined to certain batches.

One thing I would do is run a simple "dd if=/dev/sdx of=/dev/null bs=1M" on each of my disks, maybe once a week or once a month, to scan them yourself. If the disk detects a sector getting too many errors (still correctable with the extra error-correcting bits the disk has), it will move the data from the bad sector to a spare and mark the bad sector bad, and I believe SMART counts when this has been done.
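
For anyone who wants to script this, here is a rough sketch of the scan plus a check of the remap counter. The function names read_scan and reallocated_count are made up for illustration, and /dev/sdx is a placeholder. It also assumes the drive reports an attribute called Reallocated_Sector_Ct with the raw value in the tenth column of "smartctl -A" output; not every drive or vendor does, so check your own output first.

```shell
#!/bin/sh
# Read every block of a device (or file) and throw the data away.
# A non-zero exit status from dd means something could not be read.
read_scan() {
    dd if="$1" of=/dev/null bs=1M 2>/dev/null
}

# Pull the raw reallocated-sector count out of `smartctl -A` output
# supplied on stdin.
# Assumption: the attribute is named Reallocated_Sector_Ct and its raw
# value is in column 10. Prints nothing if the attribute is absent.
reallocated_count() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}

# Example usage (run as root; /dev/sdx is a placeholder):
#   read_scan /dev/sdx || echo "read errors on /dev/sdx"
#   smartctl -A /dev/sdx | reallocated_count
```

Comparing the count before and after a scan (or from one week to the next) should show whether the disk has been quietly remapping sectors.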

                               Roger

