Tod Merley wrote:
On Dec 4, 2007 2:03 PM, Konstantin Svist <fry.kun@xxxxxxxxx> wrote:
Here's the situation:
I have some large mysql db files on my main partition. For some strange
reason, some of these files are very slow to be read off the disk:
~3MB/s (I'm SCPing them to another machine). File size doesn't seem to
be relevant (other files of approximately same size are being
transferred at ~20MB/s), and the speed is not the same throughout the file.
My first thought was that the file got very fragmented (I'm fairly sure
that the partition was filled up to [almost] 100% at some point).
Usually, that can be remedied by copying the file to another location on
the same partition, or rather, that's what would fix fragmented files
on NTFS. When I tried that, the read speed of the new file was only
slightly better (up to ~5MB/s), although even that could be because the
file was still in the memory cache.
I also thought that the HD might be nearing the end of its useful
lifetime (and has to re-read sectors, causing the horrible slowdown),
but I didn't notice any alerts from SMART (not in the output of
smartctl -a, anyway).
The partition is ~60-70% full, type ext3 with noatime enabled.
Speed tests were performed by "scp myfile localhost:/dev/null"
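
Since the speed reportedly varies within a single file, it may help to time
the read chunk by chunk rather than for the whole transfer. The following is
only a rough sketch in Python, not part of the original test: the function
name chunk_read_speeds and the 64 MB chunk size are made up for illustration.
It reads the file locally, so it also leaves out any overhead scp itself
might add.

```python
import time


def chunk_read_speeds(path, chunk_bytes=64 * 1024 * 1024):
    """Read `path` sequentially; return a list of (offset, MB/s) per chunk.

    Hypothetical helper for illustration. Chunk size is an arbitrary choice.
    """
    speeds = []
    offset = 0
    # buffering=0 bypasses Python's own buffer; the OS page cache still applies,
    # so a file that was read recently will look much faster than the disk is.
    with open(path, "rb", buffering=0) as f:
        while True:
            start = time.perf_counter()
            data = f.read(chunk_bytes)
            elapsed = time.perf_counter() - start
            if not data:
                break
            megabytes = len(data) / 1e6
            rate = megabytes / elapsed if elapsed > 0 else float("inf")
            speeds.append((offset, rate))
            offset += len(data)
    return speeds


if __name__ == "__main__":
    import sys

    for off, rate in chunk_read_speeds(sys.argv[1]):
        print(f"offset {off:>14d}  {rate:8.1f} MB/s")
```

Slow chunks that show up at the same offsets on repeated runs would point at
specific regions of the file (or disk). To keep the memory cache from skewing
the numbers, the page cache can be dropped first as root with
"echo 3 > /proc/sys/vm/drop_caches" on kernels that support it.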
----SNIP
Hi Konstantin Svist!
If I understand the data:
Lots of seek errors - lots of ECC recoveries - running very warm (117
degrees Fahrenheit). Check for cooling issues. Low-level format if
possible.
Sounds kind of like the little guy has worked very hard for you the
last year and a half! Yes, maybe time to change him out.
Just some thoughts!
Tod
Thanks, Tod!
Just now I checked the value of "195 Hardware_ECC_Recovered" - it seems
to be increasing by a large amount every second. However, the same thing
is happening on most of my HDs: I see that number increase by ~15-20K/s
on the machine with the slow file, and by ~45K/s on some 1-month-old SATA
drives (those might have a different load, however).
Is this normal? (these are all HD-intensive servers, BTW, mostly DB
accesses)
I also remember hearing on this list that it's not certain how to read
some SMART values... look for the thread "Fedora May Be Killing Your
Laptop's Hard Drive?". The attribute in question there was "193
Load_Cycle_Count", but maybe the same thing applies to the other "raw data"?
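
For comparing drives, the per-second increase mentioned above can be computed
from two snapshots of the attribute table that "smartctl -A" prints. This is
a hedged sketch only: parse_attributes and raw_rate are hypothetical helper
names, and the regular expression assumes the usual ten-column table layout
(ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED,
RAW_VALUE). Raw values are vendor-specific, so the rate is just a number to
compare between samples, not necessarily an error count.

```python
import re

# One attribute row of the `smartctl -A` table (assumed ten-column layout).
_ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+"   # ID#, ATTRIBUTE_NAME, FLAG
    r"\s+(\d+)\s+(\d+)\s+(\d+|-+)"          # VALUE, WORST, THRESH
    r"\s+\S+\s+\S+\s+\S+"                   # TYPE, UPDATED, WHEN_FAILED
    r"\s+(\d+)"                             # leading integer of RAW_VALUE
)


def parse_attributes(smartctl_output):
    """Return {attribute_id: {...}} parsed from `smartctl -A` text output."""
    attrs = {}
    for line in smartctl_output.splitlines():
        m = _ROW.match(line)
        if not m:
            continue  # header lines and anything that is not an attribute row
        attr_id, name, value, worst, thresh, raw = m.groups()
        attrs[int(attr_id)] = {
            "name": name,
            "value": int(value),   # normalized value
            "worst": int(worst),
            "thresh": int(thresh) if thresh.isdigit() else None,
            "raw": int(raw),       # vendor-specific raw number
        }
    return attrs


def raw_rate(before, after, attr_id, seconds):
    """Average change per second of an attribute's raw value between snapshots."""
    return (after[attr_id]["raw"] - before[attr_id]["raw"]) / seconds
```

Taking one snapshot, waiting a known number of seconds, taking another, and
calling raw_rate(first, second, 195, seconds) gives the rate for
Hardware_ECC_Recovered. The "value" and "thresh" fields are kept as well,
since the normalized value relative to its threshold is what smartctl itself
uses to flag a failing attribute.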