[vma list corruption] Re: proc_pid_readlink oopses again on 2.6.14.5

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 27, 2005 at 09:52:26PM -0800, Joshua Kwan wrote:
> Unable to handle kernel paging request at virtual address b7c1fc35
>  printing eip:
> c017c2a8
> *pde = 0afff067
> *pte = 00000000
> Oops: 0000 [#1]
> Modules linked in: hostap_pci hostap ieee80211_crypt tulip ipt_state
> ipt_MASQUERADE ppp_deflate zlib_deflate zlib_inflate bsd_comp ipt_LOG
> iptable_mangle sch_ingress cls_u32 sch_sfq sch_cbq iptable_nat ip_nat
> ipt_REJECT iptable_filter ip_tables ppp_async crc_ccitt ppp_generic slhc
> ip_conntrack_irc i2c_dev via686a hwmon i2c_isa i2c_core
> CPU:    0
> EIP:    0060:[<c017c2a8>]    Not tainted VLI
> EFLAGS: 00010282   (2.6.14-influx)
> EIP is at proc_exe_link+0x28/0xa0
> eax: cd4af070   ebx: cd4af040   ecx: cd4af070   edx: b7c1fc20
> esi: 00000000   edi: c445bf6c   ebp: c445a000   esp: c445bf30
> ds: 007b   es: 007b   ss: 0068
> Process lsof (pid: 17477, threadinfo=c445a000 task=c0d82560)
> Stack: c9473e60 00000000 c017de2c c9473e60 c445bf4c c445bf48 00000000
> c0355fc0
>        c9473e60 00001000 c0156093 d0d6af70 bffc9250 00001000 c9473e60
> d0d6af70
>        c12f3f60 00000000 00000000 00000000 00000000 00000001 00000000
> 00000000
> Call Trace:
>  [<c017de2c>] proc_pid_readlink+0x4c/0xc0
>  [<c0156093>] sys_readlink+0x53/0x80
>  [<c01060df>] do_syscall_trace+0x9f/0x148
>  [<c0102af9>] syscall_call+0x7/0xb
> Code: 90 90 90 56 53 8b 44 24 0c ff 70 f0 e8 b2 69 f9 ff 85 c0 89 c3 5a
> 74 77 8d 48 30 89 c8 ff 00 0f 88 72 07 00 00 8b 13 85 d2 74 14 <f6> 42
> 15 10 74 07 8b 42 4c 85 c0 75 28 8b 52 0c 85 d2 75 ec be

Until the last line it made sense.  Code, however, is flat-out BS.
This chunk is from around proc_exe_link(), all right.  But it starts
at 3 bytes before the beginning of that function.  Perfect match to
build with your .config using gcc4, but...  no way in hell you would
get an oops at that location - it's in the middle of long chunk of
NOP.  So something's rotten here...

FWIW, with the same build you will get the following at 0x28 from the
beginning of function:
        vma = mm->mmap;
        while (vma) {
                if ((vma->vm_flags & VM_EXECUTABLE) && vma->vm_file)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                        break;
                vma = vma->vm_next;
        }
This check.  It turns into
	f6 42 15 10             testb  $0x10,0x15(%edx)
with vma in %edx.  Since your %edx is 0xb7c1fc20 and address you are trying
to access is 0xb7c1fc35, it's a match.

So you've got 0xb7c1fc20 as vma.  Which is not good, since that's a userland
address.  The next question is where it'd come from - it might be
	* fscked task->mm
	* fscked mm->mmap
	* fscked vma somewhere in the chain.

Doing lsof will walk vma chains of many processes, so if something is
corrupted it will step into that...
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux