Re: 2.6.15.1: persistent nasty hang in sync_page killing NFS (ne2k-pci / DP83815-related?), i686/PIII

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 29 Jan 2006, Trond Myklebust stipulated:
> As a general rule of thumb: if tcpdump/ethereal can see the reply on the
> client, then the engine socket should see it too. If tcpdump is indeed
> seeing those replies, you should check the RPC code by
> setting /proc/sys/sunrpc/rpc_debug to 1.

tcpdump is seeing them.

... have a pile of messages in the midst of a locked-up transfer:

Jan 30 16:50:57 loki warning: kernel: -pid- proc flgs status -client- -prog- --rqstp- -timeout -rpcwait -action- --exit--
Jan 30 16:50:57 loki warning: kernel: 15046 0006 0021 -00011 c1a11600 100003 c801f000 00000000 xprt_resend c02c3798 c01c0e4d
Jan 30 16:50:57 loki warning: kernel: 15047 0006 0021 000000 c1a11600 100003 c801f0b8 00000070 xprt_pending c02c3869 c01c0e4d
Jan 30 16:50:57 loki warning: kernel: 15048 0006 0021 -00011 c1a11600 100003 c801f170 00000000 xprt_resend c02c3798 c01c0e4d
Jan 30 16:50:57 loki warning: kernel: 15049 0006 0001 -00011 c1a11600 100003 c801f228 00000000 xprt_sending c02c3798 c01c0e4d
Jan 30 16:50:57 loki warning: kernel: RPC: 15047 xprt_timer
Jan 30 16:50:57 loki warning: kernel: RPC:      cong 256, cwnd was 256, now 256
Jan 30 16:50:57 loki warning: kernel: RPC: 15048 xprt_cwnd_limited cong = 0 cwnd = 256
Jan 30 16:50:57 loki warning: kernel: RPC: 15048 xprt_prepare_transmit
Jan 30 16:50:57 loki warning: kernel: RPC: 15048 xprt_transmit(116)
Jan 30 16:50:57 loki warning: kernel: RPC: 15048 xmit complete
Jan 30 16:50:57 loki warning: kernel: RPC: 15047 xprt_prepare_transmit
Jan 30 16:50:57 loki warning: kernel: RPC: 15047 xprt_cwnd_limited cong = 256 cwnd = 256
Jan 30 16:50:57 loki warning: kernel: RPC: 15047 failed to lock transport c1a11800
Jan 30 16:50:58 loki warning: kernel: RPC: 15048 xprt_timer
Jan 30 16:50:58 loki warning: kernel: RPC:      cong 256, cwnd was 256, now 256
Jan 30 16:50:58 loki warning: kernel: RPC: 15046 xprt_cwnd_limited cong = 0 cwnd = 256
Jan 30 16:50:58 loki warning: kernel: RPC: 15046 xprt_prepare_transmit
Jan 30 16:50:58 loki warning: kernel: RPC: 15046 xprt_transmit(116)
Jan 30 16:50:58 loki warning: kernel: RPC: 15046 xmit complete
Jan 30 16:50:58 loki warning: kernel: RPC: 15048 xprt_prepare_transmit
Jan 30 16:50:58 loki warning: kernel: RPC: 15048 xprt_cwnd_limited cong = 256 cwnd = 256
Jan 30 16:50:58 loki warning: kernel: RPC: 15048 failed to lock transport c1a11800
Jan 30 16:50:59 loki warning: kernel: RPC: 15046 xprt_timer
Jan 30 16:50:59 loki warning: kernel: RPC:      cong 256, cwnd was 256, now 256
Jan 30 16:50:59 loki warning: kernel: RPC: 15047 xprt_cwnd_limited cong = 0 cwnd = 256
Jan 30 16:50:59 loki warning: kernel: RPC: 15047 xprt_prepare_transmit
Jan 30 16:50:59 loki warning: kernel: RPC: 15047 xprt_transmit(116)
Jan 30 16:50:59 loki warning: kernel: RPC: 15047 xmit complete
Jan 30 16:50:59 loki warning: kernel: RPC: 15046 xprt_prepare_transmit
Jan 30 16:50:59 loki warning: kernel: RPC: 15046 xprt_cwnd_limited cong = 256 cwnd = 256
Jan 30 16:50:59 loki warning: kernel: RPC: 15046 failed to lock transport c1a11800
Jan 30 16:51:00 loki warning: kernel: RPC: 15047 xprt_timer
[repeats indefinitely]
Jan 30 16:51:38 loki warning: kernel: -pid- proc flgs status -client- -prog- --rqstp- -timeout -rpcwait -action- --exit--
Jan 30 16:51:38 loki warning: kernel: 15046 0006 0021 -00011 c1a11600 100003 c801f000 00000000 xprt_resend c02c3798 c01c0e4d
Jan 30 16:51:38 loki warning: kernel: 15047 0006 0021 000000 c1a11600 100003 c801f0b8 00000140 xprt_pending c02c3869 c01c0e4d
Jan 30 16:51:38 loki warning: kernel: 15048 0006 0021 -00011 c1a11600 100003 c801f170 00000000 xprt_resend c02c3798 c01c0e4d

The RPC messages are emitted at pretty much exactly the same frequency
as the ACKs.

I *guess* that the `failed to lock transport' is the underlying error...
time to add some debugging and find out what task is locking the
transport. Back soon, must rebuild the kernel and reboot to clear this
lock ;)

-- 
`I won't make a secret of the fact that your statement/question
 sent a wave of shock and horror through us.' --- David Anderson
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux