On Wednesday May 10, [email protected] wrote:
> We have an NFS server here with a fairly high load. The clients are
> Linux, FreeBSD and Solaris. The exported filesystem is XFS, which is on
> an LVM drive. After between 3 and 30 days it seems that locking
> completely stops working; clients generally either error or simply lock
> up when they try to lock a file. The only way to fix it seems to be a
> reboot.
Reboot the client or the server?
>
> Last time it happened was on 2.6.17-rc2, it started around 2.6.15.
>
> There is nothing in the dmesg on the server, the (Linux) clients are
> printing this in the dmesg when something tries to create a lock:
>
> lockd: server xxx.xxx.xxx.xxx not responding, still trying
> lockd: server xxx.xxx.xxx.xxx not responding, still trying
> lockd: server xxx.xxx.xxx.xxx not responding, still trying
> lockd: server xxx.xxx.xxx.xxx not responding, still trying
Sounds like the server has locked up.
What does 'ps' on the server show for 'lockd'? Is it in 'D'? What is
the 'wchan'? Are any 'nfsd's permanently in 'D'?
Try
    echo t > /proc/sysrq-trigger
and see what the stack trace for lockd is - probably only useful if it
is in 'D'.
Maybe a 'tcpdump -s 1500' of traffic between client and server would
help.
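
The 'ps' check above could be scripted roughly as follows. This is only
a sketch: the helper name filter_stuck is made up for illustration, and
it assumes a procps-style 'ps' whose 'comm' column reports the kernel
threads as plain "lockd" and "nfsd".

```shell
#!/bin/sh
# filter_stuck (hypothetical helper): reads `ps` output on stdin with the
# columns PID STAT WCHAN COMMAND and prints only the lockd/nfsd threads
# whose state starts with 'D' (uninterruptible sleep).
filter_stuck() {
    awk '($4 == "lockd" || $4 == "nfsd") && $2 ~ /^D/ { print $1, $2, $3, $4 }'
}

# Typical use on the server (assumes procps ps):
#   ps -eo pid,stat,wchan:32,comm | filter_stuck
```

If this prints a lockd line, the WCHAN column shows where it is
sleeping, and the sysrq-trigger stack trace should say more. Empty
output means none of those threads was in 'D' at that moment, so it is
worth running a few times to see whether a thread stays there.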
NeilBrown