Roger wrote:
Jul 16 19:50:18 asterix kernel: ata1: port reset, p_is 0 is 0 pis 0 cmd 4c017 tf 7f ss 0 se 0
Hm, there it seems to be resetting the device.
starts to just die and then, I guess, switches to read-only mode, and I can't even use ssh to get into the machine, so I always reset it. But when I do, it just runs for 4 days max and then dies. Do you reckon it is the hard drive that is now weak, or what?
I read some time ago that there is a design tradeoff for the bearings on hard drives: they can be optimized either for running a long time or for stopping and starting, as on a laptop drive. Maybe that has something to do with it.
Your problem does sound about the same as what I experienced, also on x86_64; mine was an nVidia chipset DFI Lanparty board.
I have since replaced that server with a smaller machine running FC3, which uses a lot less memory and does not complain at all.
My box was also up 24 hours a day as a mailserver. I replaced it with an embedded ARM board that draws 1.5W and a USB flash pen. I wrote an article about it at http://warmcat.com/_wp/?p=5, but of course it just does my mail, not 1000 users.
Do you reckon it could be the SATA drivers that kill the hard drives?
Sorry, I don't have any answers. I dropped that machine after the second drive was killed, and it is sitting in the corner looking lonely. But I would think more people would be hitting this problem if it were simply a driver issue. Maybe it requires extended uptimes on the box in addition?
-Andy