Re: Unable to handle kernel paging request, another 2.6.16.25 server reboots

In-Reply-To: <[email protected]>

On Mon, 24 Jul 2006 12:01:48 +0400, Jim Klimov wrote:

>  I recently wrote about problems with a fileserver rebooting
>  frequently. Another similar server got under NFS load today
>  and rebooted at least twice in the past few hours.
>
>  This server has a similar motherboard (Supermicro X5DP8-G2),
>  dual Xeons@533, two older 3Ware controllers (7506+8506) and
>  a reiserfs v3 archive.
>
>  The server reported last week has two 3Ware 9550 controllers,
>  ext3 archives and primarily a Samba usage.

I decoded your oops.  It's in netfilter:

Unable to handle kernel paging request at virtual address f9445d43
printing eip:
 c0392bba
*pde = 32e59067
Oops: 0000 [#1]
 SMP 
Modules linked in: w83781d hwmon_vid hwmon i2c_isa i2c_core w83627hf_wdt
CPU:    0
EIP:    0060:[<c0392bba>]    Not tainted VLI
EFLAGS: 00010282   (2.6.16.25 #3) 
EIP is at ipt_do_table+0xae/0x385
eax: 00000003   ebx: 00000000   ecx: cbf4b8d8   edx: f944a2c8
esi: e2262940   edi: f9445cf0   ebp: 80000000   esp: f700fac8
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 10685, threadinfo=f700e000 task=f36a7ab0)
Stack: f9446b10 00000282 c33e2180 f36cda80 00000000 c047deec f944a2c8 f9418000 
       f7788800 c0530cd4 00000000 cbf4b8d8 00000000 00000003 f700fba0 00000000 
       f700fba0 00000003 c052f0d8 80000000 c03947e7 f7788800 c047dec0 00000000 
Call Trace:
 [<c03947e7>] ipt_local_out_hook+0x72/0x77
 [<c035dfd9>] nf_iterate+0x69/0x83
 [<c036a1ca>] dst_output+0x0/0x7
 [<c036a1ca>] dst_output+0x0/0x7
 [<c035e050>] nf_hook_slow+0x5d/0xea
 [<c036a1ca>] dst_output+0x0/0x7
 [<c0368178>] ip_queue_xmit+0x3d4/0x4f5
 [<c036a1ca>] dst_output+0x0/0x7
 [<c012ce4a>] __rcu_process_callbacks+0x7d/0xc5
 [<c0115d92>] activate_task+0x99/0xa5
 [<c011659c>] try_to_wake_up+0x29c/0x33b
 [<c037d0d1>] tcp_v4_send_check+0x4a/0xdc
 [<c037868d>] tcp_transmit_skb+0x2e6/0x45a
 [<c0379879>] tcp_push_one+0x97/0x104
 [<c036ec4c>] tcp_sendmsg+0x36b/0xb4d
 [<c035dfd9>] nf_iterate+0x69/0x83
 [<c037e653>] tcp_v4_rcv+0x4e6/0x81f
 [<c0389cbe>] inet_sendmsg+0x47/0x5f
 [<c0344768>] sock_sendmsg+0xc9/0xe3
 [<c03649ad>] ip_rcv+0x2bc/0x56f
 [<c034e58a>] netif_receive_skb+0x227/0x2d7
 [<c0348c4c>] __kfree_skb+0x3a/0xc3
 [<c012f3e4>] autoremove_wake_function+0x0/0x43
 [<c0125923>] update_wall_time_one_tick+0x6/0x7e
 [<c01259ce>] update_wall_time+0x8/0x35
 [<c01062ab>] timer_interrupt+0x5b/0x86
 [<c0139975>] handle_IRQ_event+0x26/0x59
 [<c03447b0>] kernel_sendmsg+0x2e/0x3c
 [<c0347a3f>] sock_no_sendpage+0x80/0x9f
 [<c036e8a5>] tcp_sendpage+0x49/0x85
 [<c03a9573>] svc_sendto+0x134/0x250
 [<c034e7ce>] net_rx_action+0x88/0x15f
 [<c0104f82>] do_IRQ+0x1e/0x24
 [<c01035e2>] common_interrupt+0x1a/0x20
 [<c03aa597>] svc_tcp_sendto+0x4d/0x99
 [<c0258eab>] _atomic_dec_and_lock+0x33/0x4c
 [<c03aad1a>] svc_send+0xaa/0xed
 [<c0210abc>] fh_put+0x133/0x17d
 [<c03ac4da>] svcauth_unix_release+0x43/0x45
 [<c021d1fd>] nfs3svc_release_fhandle+0x0/0xe
 [<c03a8b14>] svc_process+0x1b1/0x619
 [<c01183f8>] default_wake_function+0x0/0xc
 [<c020e10d>] nfsd+0x178/0x301
 [<c020df95>] nfsd+0x0/0x301
 [<c01010a1>] kernel_thread_helper+0x5/0xb

   6:   8b 40 10                  mov    0x10(%eax),%eax
   9:   8b 44 86 34               mov    0x34(%esi,%eax,4),%eax
   d:   89 44 24 1c               mov    %eax,0x1c(%esp)
  11:   89 c7                     mov    %eax,%edi
  13:   8b 44 24 34               mov    0x34(%esp),%eax
  17:   8b 54 24 1c               mov    0x1c(%esp),%edx
  1b:   03 7c 86 0c               add    0xc(%esi,%eax,4),%edi
  1f:   03 54 86 20               add    0x20(%esi,%eax,4),%edx
  23:   89 5c 24 10               mov    %ebx,0x10(%esp)
  27:   89 54 24 18               mov    %edx,0x18(%esp)
   0:   0f b6 5f 53               movzbl 0x53(%edi),%ebx   <=====
   4:   89 d8                     mov    %ebx,%eax
   6:   24 08                     and    $0x8,%al
   8:   84 c0                     test   %al,%al
   a:   0f 84 b4 02 00 00         je     2c4 <_EIP+0x2c4>
  10:   8b 47 08                  mov    0x8(%edi),%eax

This is in net/ipv4/netfilter/ip_tables.c::ipt_do_table():

        table_base = (void *)private->entries[smp_processor_id()];
        e = get_entry(table_base, private->hook_entry[hook]);

        /* For return from builtin chain */
        back = get_entry(table_base, private->underflow[hook]);

        do {
                IP_NF_ASSERT(e);
                IP_NF_ASSERT(back);
===>            if (ip_packet_match(ip, indev, outdev, &e->ip, offset)) {

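'e' is an invalid pointer. (ip_packet_match() was inlined, which is why
the fault is reported in ipt_do_table() itself.)  'e' is in edi
(f9445cf0), and the faulting address f9445d43 is edi + 0x53, the byte
load marked above.
hook == 3, i.e. NF_IP_LOCAL_OUT, which matches ipt_local_out_hook in the
call trace.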


The call trace seems to show that svc_tcp_sendto() was interrupted by an
IRQ for an incoming packet, or maybe the timer interrupt?

Can you build with CONFIG_FRAME_POINTER and see if you can get a cleaner
trace?

-- 
Chuck