Re: BUG: unable to handle kernel NULL pointer dereference - nfs v3

On Monday July 16, david.ml@euro-web.fr wrote:
> Hi,
> 
> I'm not sure is the good place to poste that, and if not - please excuse me.

This is the correct place to post this, thanks.

> 
> I was running nfs server v2 since a year on one server, there is few days, i 
> have update my kernel to 2.6.21.3 with support of nfsv3 server.
> 
> Somes times per days i have somes crash as below, needing i reboot the server 
> to nfs re-become up.
> 
> ************
> BUG: unable to handle kernel NULL pointer dereference at virtual address 
> 00000004
  ^^^^^^^^

This says that it tried to access memory at address '4'.  There is no
memory there, so it caused the BUG.

>  printing eip:
> c01e7279
> *pde = 09ecc001
> Oops: 0000 [#1]
> SMP
> CPU:    0
> EIP:    0060:[<c01e7279>]    Not tainted VLI
> EFLAGS: 00010246   (2.6.21.3-sdf88-core #9)
                              ^^^^^^^^^^^

What is "-sdf88-core"?  Are there any extra patches applied to this
kernel that we should know about?

> EIP is at encode_fsid+0x67/0x89

This is presumably where the illegal access happened.

> eax: e5bde8c0   ebx: f7593404   ecx: 00000000   edx: 00000006
> esi: dc569048   edi: f75934ec   ebp: f7593404   esp: f75f1f18

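Memory accesses are (almost) always relative to the value in some
register.  Of these registers, the most likely base is ecx (it holds 0,
so an offset of 4 would give address 4), with edx (it holds 6, so it
would need an offset of -2) a vague possibility.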
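
To make that reasoning concrete, here is a minimal sketch that checks
which registers lie within a small offset of the faulting address.  The
register values are copied from the oops; the function names and the
0x100 cut-off are my own assumptions, and it only considers a simple
base+offset access, not a scaled index.

```python
# Register values copied from the oops report.
REGISTERS = {
    "eax": 0xE5BDE8C0, "ebx": 0xF7593404, "ecx": 0x00000000, "edx": 0x00000006,
    "esi": 0xDC569048, "edi": 0xF75934EC, "ebp": 0xF7593404, "esp": 0xF75F1F18,
}
FAULT_ADDRESS = 0x00000004


def signed32(value):
    """Interpret a value as a signed 32-bit displacement."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def candidate_bases(registers, fault_address, max_offset=0x100):
    """Return {register: offset} for registers within max_offset of the fault address.

    max_offset is an arbitrary cut-off chosen for illustration.
    """
    result = {}
    for name, value in registers.items():
        offset = signed32(fault_address - value)
        if abs(offset) <= max_offset:
            result[name] = offset
    return result


if __name__ == "__main__":
    for name, offset in candidate_bases(REGISTERS, FAULT_ADDRESS).items():
        print(f"{name}: offset {offset:+d}")
```

With the values above, only ecx (offset +4) and edx (offset -2) qualify.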

> Code: e2 08 09 d1 09 c1 eb 10 8b 83 88 00 00 00 8b 40 30 89 c3 89 c1 c1 fb 1f 
> 89 d8 0f c8 89 06 89 c8 eb 1e 

Unfortunately "ksymoops" doesn't seem to decode this into something
useful enough.  Normally one of the numbers has <> around it, marking
the byte at EIP.  Are you sure you copied the numbers across exactly?
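
A quick way to check a "Code:" line for that marker is sketched below.
The posted bytes are copied from the report; the second sample line is
made up purely to show what a marked byte looks like, and the helper
name is my own.

```python
import re

# Code bytes copied from the "Code:" lines of the report.
POSTED_CODE = (
    "e2 08 09 d1 09 c1 eb 10 8b 83 88 00 00 00 8b 40 30 89 c3 89 c1 c1 fb 1f "
    "89 d8 0f c8 89 06 89 c8 eb 1e"
)
# Hypothetical line, NOT from the report: shows a byte marked with <>.
HYPOTHETICAL_MARKED = "8b 40 30 <8b> 43 04 89 c3"

MARKED = re.compile(r"<([0-9a-fA-F]{2})>")


def find_marked_byte(code_text):
    """Return (index, value) of the first byte wrapped in <>, or None if none is marked."""
    for index, token in enumerate(code_text.split()):
        match = MARKED.fullmatch(token)
        if match:
            return index, int(match.group(1), 16)
    return None


if __name__ == "__main__":
    print(find_marked_byte(POSTED_CODE))
    print(find_marked_byte(HYPOTHETICAL_MARKED))
```

For the bytes as posted this finds no marker, which is why the exact
faulting instruction cannot be read straight off the dump.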

This code decodes as:
   0:   e2 08                     loop   a <_EIP+0xa>
   2:   09 d1                     or     %edx,%ecx
   4:   09 c1                     or     %eax,%ecx
   6:   eb 10                     jmp    18 <_EIP+0x18>
   8:   8b 83 88 00 00 00         mov    0x88(%ebx),%eax
   e:   8b 40 30                  mov    0x30(%eax),%eax
  11:   89 c3                     mov    %eax,%ebx
 ....
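
If you want to reproduce a listing like this yourself, one possible
approach (a sketch, assuming a Unix-like system with GNU objdump built
with i386 support; the helper names are my own) is to write the bytes
to a file and disassemble it as raw 32-bit x86 code:

```python
import shutil
import subprocess
import tempfile

# Code bytes copied from the "Code:" lines of the report.
POSTED_CODE = (
    "e2 08 09 d1 09 c1 eb 10 8b 83 88 00 00 00 8b 40 30 89 c3 89 c1 c1 fb 1f "
    "89 d8 0f c8 89 06 89 c8 eb 1e"
)


def code_bytes(hex_text):
    """Convert space-separated hex pairs into raw bytes."""
    return bytes(int(token, 16) for token in hex_text.split())


def disassemble(raw):
    """Disassemble raw i386 code with objdump; return None if objdump is not installed."""
    if shutil.which("objdump") is None:
        return None
    with tempfile.NamedTemporaryFile(suffix=".bin") as handle:
        handle.write(raw)
        handle.flush()
        result = subprocess.run(
            ["objdump", "-D", "-b", "binary", "-m", "i386", handle.name],
            capture_output=True, text=True, check=False,
        )
    return result.stdout


if __name__ == "__main__":
    listing = disassemble(code_bytes(POSTED_CODE))
    print(listing if listing is not None else "objdump not found")
```

Since the dump does not say where EIP falls within these bytes, the
disassembly starts at the first byte, so the first few instructions may
be decoded from the middle of a real instruction.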

 From the 'jmp' onwards, that is what I would expect to see in
 encode_fsid.  The code before that doesn't make a lot of sense, so
 it is hard to pinpoint exactly where the error is.

 In any case, there is no place in encode_fsid where an offset of 4
 from any register is indexed, nor an offset of -2.
 So either there is something wrong with the decoding and displaying
 of this information, or there is something very wrong with your
 hardware.

 I would suggest:
   1/ if possible, run memtest86 on the machine for a while, to make
      sure there isn't a problem with the memory.
   2/ If the problem happens again, post another report with all the
      "oops" information again.  Maybe the next time it will be slightly
      different and will make more sense in some way.

NeilBrown
