Fedora Users — Re: smartctl -l error /dev/... When to worry ?

Hi Randy!

Randy Kelsoe wrote:

Hannes Mayer wrote:
Hi all!
I just discovered smartctl and the interesting output with
# smartctl -l error /dev/hda
hda reports no errors (newest disk), but hdb and hdf do report some
errors every few hours (see outputs below)
When do I actually have to start to worry about errors ?
I mean, is an error every few hours normal for older (1-2 years) disks ?
########################### hdb #################################
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
Warning: ATA error count 60 inconsistent with error log pointer 3
Error 60 occurred at disk power-on lifetime: 5386 hours
  When the command that caused the error occurred, the device was active or idle.
Error 59 occurred at disk power-on lifetime: 5386 hours
  When the command that caused the error occurred, the device was active or idle.
Error 58 occurred at disk power-on lifetime: 5386 hours
  When the command that caused the error occurred, the device was active or idle.
Error 57 occurred at disk power-on lifetime: 5385 hours
  When the command that caused the error occurred, the device was active or idle.
Error 56 occurred at disk power-on lifetime: 5375 hours
  When the command that caused the error occurred, the device was active or idle.

I have trimmed the above errors to show that the most recent 5 errors on hdb occurred within 11 hours of each other. You need to run 'smartctl -a /dev/hdb | grep -i power_on' to get the current age of the disk, then compare it to the number of power-on hours when the errors occurred. The above errors occurred when your drive was about 224 days old (5386 hours / 24), so they might be old errors, might have occurred during a power failure, etc.
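
The "within 11 hours" reading of the error log can be checked with a short script. This is only a sketch, assuming the log lines keep the "Error N occurred at disk power-on lifetime: H hours" shape shown in the output above; the name `error_hours` and the pasted sample text are illustrative and not part of smartmontools.

```python
import re

# Sample entries copied from the trimmed hdb error log quoted above.
LOG = """\
Error 60 occurred at disk power-on lifetime: 5386 hours
Error 59 occurred at disk power-on lifetime: 5386 hours
Error 58 occurred at disk power-on lifetime: 5386 hours
Error 57 occurred at disk power-on lifetime: 5385 hours
Error 56 occurred at disk power-on lifetime: 5375 hours
"""

# Assumed line shape; other smartctl versions may word this differently.
PATTERN = re.compile(r"Error (\d+) occurred at disk power-on lifetime: (\d+) hours")


def error_hours(text):
    """Return {error number: power-on hours} for every entry found in text."""
    return {int(num): int(hrs) for num, hrs in PATTERN.findall(text)}


hours = error_hours(LOG)
span = max(hours.values()) - min(hours.values())
latest_days = max(hours.values()) / 24

print(f"{len(hours)} errors spanning {span} hours")
print(f"latest error at roughly {latest_days:.0f} days of power-on time")
```

In practice the text would come from the output of 'smartctl -l error' saved to a file rather than from a pasted string.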

You might also want to get a newer version of smartmontools. The latest is 5.32, and you are running 5.21.


I get this for hdb:
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       646543
which I assume is in minutes, so that would be almost 449 days (646543 / 60 / 24).
The errors occurred at day 224, so that is pretty old.
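
The arithmetic behind that estimate, as a small sketch. It rests on the same assumption made above, that the raw value for hdb is counted in minutes rather than hours; the variable and function names are made up for illustration.

```python
MINUTES_PER_DAY = 60 * 24
HOURS_PER_DAY = 24


def minutes_to_days(raw_minutes):
    """Convert a raw power-on counter, assumed to be in minutes, to days."""
    return raw_minutes / MINUTES_PER_DAY


raw_power_on = 646543      # raw Power_On_Hours value for hdb (assumed minutes)
latest_error_hours = 5386  # power-on hours at the most recent logged error

drive_age_days = minutes_to_days(raw_power_on)
error_age_days = latest_error_hours / HOURS_PER_DAY
days_since_error = drive_age_days - error_age_days

print(f"drive age:              {drive_age_days:.1f} days")
print(f"latest error at:        {error_age_days:.1f} days")
print(f"power-on time since it: {days_since_error:.1f} days")
```

If the counter were really in hours, the same raw value would mean more than 70 years of power-on time, which is why minutes is the plausible reading here.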

>> Error 1482 occurred at disk power-on lifetime: 5368 hours
>>   When the command that caused the error occurred, the device was in
>> an unknown state.
>>   Commands leading to the command that caused the error were:
>>   CR FR SC SN CL CH DH DC   Timestamp  Command/Feature_Name
>>   -- -- -- -- -- -- -- --   ---------  --------------------
>>   08 00 00 01 00 00 b0 00     220.000  DEVICE RESET
>>   ec 00 01 01 00 00 b0 00     213.264  IDENTIFY DEVICE
>>
>> #################################################

See a theme here? What device do or did you have as the master device on the IDE bus with this drive? This looks like a device on the bus had a problem, and the system tried to recover by resetting devices on the bus. If you have a CDROM drive on the same bus, and it had problems, you might see something like this in your error logs. Again, do the 'smartctl -a /dev/hdf | grep -i power_on' and compare the age of the drive to when the error occurred.


hdb is the slave on my IDE bus, currently with FC2.
hda is the master with windoze running idle 99% of the time, just mounting it
from FC2 from time to time.

hdf was the slave with hdb, when hdb still had windoze on it.

OK, so for hdf we have:
5368 hours = about 224 days
  9 Power_On_Hours          0x0032   237   237   000    Old_age   Always       -       16981
16981 hours = 707.5 days

So these errors are pretty old too.
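
The same comparison for hdf, this time pulling the raw value straight out of the attribute row. A rough sketch only: it assumes the raw value is the last whitespace-separated field on the row and is a plain integer counted in hours, which holds for the hdf row above but not for every drive. The helper name `raw_value` is hypothetical.

```python
HOURS_PER_DAY = 24

# Attribute row for hdf as printed by 'smartctl -a /dev/hdf | grep -i power_on'.
ROW_HDF = ("  9 Power_On_Hours          0x0032   237   237   000"
           "    Old_age   Always       -       16981")


def raw_value(attribute_row):
    """Return the last field of an attribute row (the raw value) as an int."""
    return int(attribute_row.split()[-1])


current_hours = raw_value(ROW_HDF)
error_hours_hdf = 5368  # power-on hours at the logged hdf error

hours_since_error = current_hours - error_hours_hdf

print(f"drive age:   {current_hours / HOURS_PER_DAY:.1f} days")
print(f"error at:    {error_hours_hdf / HOURS_PER_DAY:.1f} days")
print(f"since error: {hours_since_error / HOURS_PER_DAY:.1f} days of power-on time")
```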

But I get confused by the Power_On_Hours. Can this value be altered in some way ?
hdb shows 449 days and hdf 707 days, but hdb has been in use for much longer...

Thank you very much Randy!

Cheers,
Hannes.