FC 1.0. Prior to the lockup, this machine had been up
over 25 days without problems and with a moderate load
(moved many GB of data to/from NFS and to/from SMB). System info:
  Kernel 2.4.20-1115 (FC 1.0 stock kernel)
  Supermicro dual XEON 2.2, hyperthread not enabled in BIOS
  Soft raid (dual 120G HD)
  Dual 1G ethernet
    - one had several NFS and SMB mounted partitions (read and write)
    - one has an NFS partition (Solaris 7 server)
  2G RAM
The machine had both NFS and SMB mounts, but the NFS
server was down at the time (cable removed). Also, I did
df as a user and left it up yesterday.
This morning, the machine was locked up and would only respond to pings. I could not log in, so I had to do a hard reboot.
/var/log/messages reported:
  ... smb_request: result -104, setting invalid
  ... smb_retry: successful, new pid=9141, generation =2
This was repeated every hour, with generation 3, 4, 5, then 6. That was the last message in /var/log/messages.
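For anyone who wants to check their own logs for the same pattern, here is a minimal sketch in Python that pulls the smb_retry pid and generation numbers out of a list of log lines. The regex is based only on the smb_retry line quoted above, the syslog prefix (timestamp, host) is not shown there so it is not matched, and the function name is made up for illustration.

```python
import re

# Pattern taken from the single smb_retry line quoted in this post.
# The spacing around "=" is kept loose because the quoted line has
# "generation =2"; other kernels may format it differently (assumption).
RETRY_RE = re.compile(
    r"smb_retry: successful, new pid=(\d+), generation\s*=\s*(\d+)"
)

def smb_retry_generations(lines):
    """Return a list of (pid, generation) pairs, one per smb_retry line."""
    found = []
    for line in lines:
        match = RETRY_RE.search(line)
        if match:
            found.append((int(match.group(1)), int(match.group(2))))
    return found
```

To use it on a real log, pass an open file, e.g. smb_retry_generations(open("/var/log/messages", errors="replace")). A steadily climbing generation number with no other messages in between would match what I saw before the lockup.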
I found two threads on kernel lockups, but judging from the info
in them, this is still a problem (last messages dated 1/8).
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=109497
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=113148
Note: I am loading the latest kernel and will retry, but I really need a STABLE box....
Questions:
1) Should I move to RH Enterprise?
2) Should I use a stock 2.4.24 kernel (all I need is basic stuff: soft RAID, e1000, NFS, SAMBA, CD-ROM)?
3) Do you think that the latest kernel will fix it?
4) Any help on how to test this (e.g., Stress?)?
Cheers,
-- Wade Hampton