On 09-10-01 09:09:40, Robin Laing wrote:
> Tony Nelson wrote:
> > On 09-09-23 09:29:56, Gene Poole wrote:
> >> I've very recently upgraded 2 of my machines.  One machine was
> >> upgraded from Fedora 9 to Fedora 11, and the other machine was
> >> upgraded from Fedora 10 to Fedora 11.  On machine 1 I have 2 hard
> >> disks (both Seagates - 500 GB and 1000 GB), on machine 2 I have
> >> 1 hard disk (Western Digital 320 GB).  All of the interfaces are
> >> SATA.  The questionable status is that on machine 1 the 500 GB
> >> drive is showing as failing and on machine 2 the 320 GB drive is
> >> showing as failing.  Neither drive, under the old releases, showed
> >> up as failing.  How do I know that these drives are truly failing?
> >
> > 1) Wait.  If the disk is going bad, it will fail.
> >
> > 2) Run as root `smartctl -A /dev/sdx` (for each sdx) and look at
> > the "WHEN_FAILED" column; it will be "-" if not failed.
> >
> > 3) Run as root `smartctl -a /dev/sdx` (for each sdx) and look at
> > the whole output.
> >
> > 4) Run as root `smartctl -t long /dev/sdx` (for each sdx) and wait
> > until the time the test should finish, then view the results with
> > `smartctl -l selftest /dev/sdx` (for each sdx) or `smartctl -a
> > /dev/sdx` (for each sdx).
> >
> > See `man smartctl`.
> >
> > Note that the new disk health monitoring tool "palimpsest" in
> > package gnome-disk-utility is panicky and not to be trusted, unless
> > you like buying lots of hard drives.  It doesn't just look at
> > "WHEN_FAILED", but has its own criteria such as nonzero
> > Reallocated_Event_Count, which is fairly normal for a modern drive
> > that has been in use for a while.  A nonzero Current_Pending_Sector
> > or Offline_Uncorrectable is bad, as it means data loss, though
> > not general drive failure.
> > I recommend enabling Automatic Offline
> > Testing with `smartctl -o on /dev/sdx` (for
> > each sdx), which will do a surface scan every few hours, giving the
> > best chance to repair or recover any sectors that are going bad.
> >
>
> Will `smartctl -o on /dev/sdx` (for each sdx) fix the nonzero
> Reallocated_Event_Count issue on RAID arrays in a non-destructive
> way?

No.  Nor for non-RAID either.  It doesn't "fix" Reallocated_Event_Count
-- rather, its purpose is to make Reallocated_Event_Count go up faster,
in that as soon as a sector starts to go bad it will be reallocated if
it is still readable, and the sooner it is caught the more likely it is
to be readable.

A non-zero Reallocated_Event_Count is not a problem.  Whatever says it
is a problem is the real problem.  Fix that instead.

Non-zero Current_Pending_Sector is a problem, but RAID should be fixing
that already.  I don't know, but I think that enabling Automatic Offline
Testing should cause any uncorrectable sectors to be noticed and fixed
sooner by RAID.

> Do you have to use the /dev/sdx devices or the /dev/md devices?
 ...

Automatic Offline Testing must be enabled on an actual ATA hard disk,
so not on a fake disk such as dm or md.

See `man smartctl`.

-- 
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson@xxxxxxxxxxxxxxxxx>
      '                              <http://www.georgeanelson.com/>
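
P.S.  A minimal sketch of the check described above, in case it saves
some squinting at tables.  It assumes the usual ten-column ATA attribute
table printed by `smartctl -A` (WHEN_FAILED in column 9, RAW_VALUE in
column 10); `smart_report` is a made-up helper name, not part of
smartmontools, and some drives print raw values in other formats that
this will not handle.

```shell
#!/bin/sh
# smart_report: read `smartctl -A` output on stdin and print only the
# rows worth worrying about.  Hypothetical helper, a sketch only.
smart_report() {
    awk '
        # Attribute rows start with a numeric ID and have >= 10 columns;
        # header lines and blank lines are skipped by this pattern.
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            # Column 9 is WHEN_FAILED; "-" means the attribute never failed.
            if ($9 != "-")
                print "FAILED " $2 " (" $9 ")"
            # These two attributes mean unreadable sectors (data loss)
            # whenever their raw value (column 10) is nonzero.
            if (($2 == "Current_Pending_Sector" || $2 == "Offline_Uncorrectable") && $10 + 0 > 0)
                print "DATA-LOSS " $2 " raw=" $10
        }
    '
}
```

Run it as root against a real disk, for example
`smartctl -A /dev/sdx | smart_report` (for each sdx).  No output means
nothing was flagged.  A nonzero Reallocated_Event_Count is deliberately
not reported, for the reasons given above.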