Re: lockd: couldn't create RPC handle for (host)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Dec 18, 2005 at 03:33:56AM -0500, Trond Myklebust wrote:
> Any Oopses? (use 'dmesg')
> 
> Could you also check dmesg for any entries of the form
> 
> 'lockd: new process, skipping host shutdown'
> or
> 'lockd_up: makesock failed, error='

Well, there are the oopses I reported last week or so:

Unable to handle kernel NULL pointer dereference at 0000000000000018 RIP: 
<ffffffff801dbd9e>{nlmclnt_mark_reclaim+62}
PGD 7e0e7067 PUD 7e6c0067 PMD 0 
Oops: 0000 [1] 
CPU 0 
Modules linked in:
Pid: 1316, comm: lockd Not tainted 2.6.14.2 #2
RIP: 0010:[<ffffffff801dbd9e>] <ffffffff801dbd9e>{nlmclnt_mark_reclaim+62}
RSP: 0018:ffff81007dfade70  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff81007ad74740 RCX: ffff81007e24a858
RDX: ffff81007e24a8f0 RSI: ffff81007e24a8e8 RDI: ffff81007ad74740
RBP: ffff81007e820e00 R08: 00000000fffffffa R09: 0000000000000001
R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: ffffffff803ec420 R15: ffff81007df62014
FS:  00002aaaab00c4a0(0000) GS:ffffffff804b6800(0000) knlGS:00000000557a9080
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000018 CR3: 000000007e113000 CR4: 00000000000006e0
Process lockd (pid: 1316, threadinfo ffff81007dfac000, task ffff81007eea61c0)
Stack: ffffffff801dbe6b ffff81007ad74740 ffffffff801e3d8c 3256cc84d3030002 
       0000000000000000 ffff81007df4fc68 ffff81007df4fc00 ffffffff803ed4a0 
       ffff81007df4fca0 ffff81007df4fc68 
Call Trace:<ffffffff801dbe6b>{nlmclnt_recovery+139} <ffffffff801e3d8c>{nlm4svc_proc_sm_notify+188}
       <ffffffff8034c5a4>{svc_process+884} <ffffffff8012ab40>{default_wake_function+0}
       <ffffffff801dde00>{lockd+352} <ffffffff801ddca0>{lockd+0}
       <ffffffff8010e352>{child_rip+8} <ffffffff801ddca0>{lockd+0}
       <ffffffff801ddca0>{lockd+0} <ffffffff8010e34a>{child_rip+0}
       

Code: 48 39 78 18 75 1c 8b 86 8c 00 00 00 a8 01 74 12 83 c8 02 89 
RIP <ffffffff801dbd9e>{nlmclnt_mark_reclaim+62} RSP <ffff81007dfade70>
CR2: 0000000000000018

Every machine with a dead lockd has had this oops.  Other stuff that
looks related (these came after the oops, a few days later):

lockd: unexpected unlock status: 1
lockd: weird return 7 for CANCEL call

Neither of those two messages you mentioned are present.

> > > Is anything at all listening on port 32768 on 'jacquere'? (check using
> > > 'netstat -ap | grep 32768').
> > 
> > Er... sort of?
> > 
> > # netstat -ap | grep 32768
> > udp    11144      0 *:32768                 *:*                                -                   
> > I'm not sure what that means...  lsof|grep 32768 returns nothing.
> 
> That could be a kernel process. The RPC client has no reason to set up a
> full file descriptor for the socket.
> 
> Could you please finally double-check that the entries in /proc/mounts
> for your NFS mounts do contain the 'lock' mount option.

Yep:

xarello:/export/home /export/home nfs rw,nosuid,nodev,v3,rsize=8192,wsize=8192,hard,intr,udp,lock,addr=xarello 0 0

> Finally, please do
> 
> echo 1 > /proc/sys/sunrpc/rpc_lockd
> then unmount one of your NFS partitions, and then mount it again.

That file doesn't exist.

$ ls /proc/sys/sunrpc 
nfs_debug  nfsd_debug  nlm_debug  rpc_debug  tcp_slot_table_entries  udp_slot_table_entries

Thanks,
-ryan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux