Re: 2.6.21-git8+ BUG: NMI Watchdog detected LOCKUP on CPU1

On 08/05/07, Andrew Morton <[email protected]> wrote:
On Tue, 08 May 2007 10:35:14 +0200 Michal Piotrowski <[email protected]> wrote:

> Hi,
>
> / filesystem was full
>
> [39525.460000] BUG: NMI Watchdog detected LOCKUP on CPU1, eip 08056990, registers:
> [39525.468000] Modules linked in: loop ipt_MASQUERADE iptable_nat nf_nat autofs4 af_packet nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 binfmt_misc thermal processor fan container nvram snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss evdev snd_pcm intel_agp snd_timer snd agpgart soundcore i2c_i801 snd_page_alloc ide_cd cdrom rtc unix
> [39525.518000] CPU:    1
> [39525.518000] EIP:    0073:[<08056990>]    Not tainted VLI
> [39525.518000] EFLAGS: 00000202   (2.6.21-ga989705c #187)
> [39525.529000] EIP is at 0x8056990
> [39525.529000] eax: 6e560d60   ebx: 0000000b   ecx: 00000000   edx: 000dd15e
> [39525.541000] esi: 00000000   edi: 6e560220   ebp: bfeb0a58   esp: bfeb0990
> [39525.547000] ds: 007b   es: 007b   fs: 0000  gs: 0033  ss: 007b
> [39525.553000] Process line (pid: 4277, ti=cf200000 task=f6f560b0 task.ti=cf200000)
> [39525.560000] Kernel panic - not syncing: Aiee, killing interrupt handler!
>
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-git8/git-console.log
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-git8/git-config
>

I don't know what caused the CPU to jump into hyperspace like that, but Patrick
tells me that this:

> [38773.921000] printk: 15909 messages suppressed.
> [38773.926000] ipt_hook: happy cracking.
> [38778.921000] printk: 16332 messages suppressed.
> [38778.925000] ipt_hook: happy cracking.
> [38783.921000] printk: 16175 messages suppressed.
> [38783.926000] ipt_hook: happy cracking.
> [38788.921000] printk: 16390 messages suppressed.
> [38788.925000] ipt_hook: happy cracking.
> [38793.921000] printk: 16289 messages suppressed.
> [38793.925000] ipt_hook: happy cracking.
> [38798.921000] printk: 16172 messages suppressed.
> [38798.926000] ipt_hook: happy cracking.
> [38803.921000] printk: 15738 messages suppressed.
> [38803.925000] ipt_hook: happy cracking.
> [38808.921000] printk: 14731 messages suppressed.

  happens when a local process sends packets with invalid IP headers
  through raw sockets.

Yes, it was an isic session.


  [ 5225.195000] UDP: short packet: From 37.126.206.54:46544 39671/1182 to 127.0.0.1:40761

  This seems to indicate something on the local machine (packets are not
  routed to 127.0.0.1) is sending invalid packets, probably with
  incorrectly set up skb pointers.

  I'd suggest to add a WARN_ON(1) in ipt_local_hook().

So can you please add the appropriate WARN_ON?
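
For anyone following along, the kind of condition that produces the
"happy cracking" message can be illustrated in userspace. This is only a
sketch under assumptions: it treats a packet as bogus when it is shorter
than a minimal IPv4 header or when its IHL field claims fewer than 20
bytes. The names ipv4_header_is_bogus and MIN_IPV4_HDR_LEN are
hypothetical and this is not the kernel's code. In the real hook, the
requested WARN_ON(1) would go on the branch where this helper returns 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of an IPv4 header without options, in bytes. */
#define MIN_IPV4_HDR_LEN 20u

/*
 * Hypothetical userspace helper (not kernel code): returns 1 if the
 * buffer cannot hold a sane IPv4 header, 0 otherwise.
 */
static int ipv4_header_is_bogus(const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL || len == 0);

    /* Too short to contain even a minimal header. */
    if (len < MIN_IPV4_HDR_LEN)
        return 1;

    /* IHL is the low nibble of the first byte, counted in 32-bit words. */
    size_t hdr_len = (size_t)(pkt[0] & 0x0f) * 4u;

    /* A header length below 20 bytes is invalid. */
    return hdr_len < MIN_IPV4_HDR_LEN;
}
```

A first byte of 0x45 (version 4, IHL 5) passes; a first byte of 0x42, or
any buffer under 20 bytes, is flagged.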

Whatever happens, that printk should be toned down, shouldn't it?  We
prefer to not let unprivileged apps spam the logs.
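
As background for the "messages suppressed" lines in the log, here is a
minimal userspace sketch of message rate limiting. It is a fixed-window
limiter chosen for brevity, which is a simplification and not the
kernel's actual printk rate-limit algorithm. The struct, the function
name, and the parameters are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-window limiter; times are in seconds. */
struct ratelimit {
    long interval;            /* length of one window */
    int burst;                /* messages allowed per window */
    int printed;              /* messages allowed so far in this window */
    long window_start;        /* start time of the current window */
    unsigned long suppressed; /* messages dropped so far */
};

/* Returns 1 if a message may be printed at time `now`, 0 if it is dropped. */
static int ratelimit_allow(struct ratelimit *rl, long now)
{
    assert(rl != NULL);

    /* Start a new window once the interval has elapsed. */
    if (now - rl->window_start >= rl->interval) {
        rl->window_start = now;
        rl->printed = 0;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return 1;
    }

    rl->suppressed++;
    return 0;
}
```

Even with such a limiter, a flood still yields one line per window, which
is why a limiter alone does not stop an unprivileged sender from filling
the log over time.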



[39293.925000] ipt_hook: happy cracking.
[39429.024000] printk: 15828 messages suppressed.
[39429.028000] nf_conntrack: table full, dropping packet.
[39430.034000] nf_conntrack: table full, dropping packet.
[39431.039000] nf_conntrack: table full, dropping packet.
[39432.044000] nf_conntrack: table full, dropping packet.
[39444.056000] nf_conntrack: table full, dropping packet.
[39445.061000] nf_conntrack: table full, dropping packet.
[39525.460000] BUG: NMI Watchdog detected LOCKUP on CPU1, eip 08056990, registers:

This lockup occurred after an isic test. Hmmm... linus_stress?

FAIL aio_dio_bugs Command <LD_LIBRARY_PATH=/usr/local/autotest/client/deps/libaio/lib/ /usr/local/autotest/client/tests/aio_dio_bugs/src/aio-dio-extend-stat file> failed, rc=32512
GOOD aiostress completed successfully
GOOD bonnie completed successfully
GOOD cpu_hotplug completed successfully
GOOD cyclictest completed successfully
GOOD dbench completed successfully
FAIL disktest running test disktest
<--[random error]-->
FAIL fs_mark Command <./fs_mark -d /mnt -s 10240 -n 1000> failed, rc=256
GOOD fsfuzzer completed successfully
GOOD fsx completed successfully
FAIL interbench Command </usr/local/autotest/client/tests/interbench/src/interbench -m 'run #0'  -c> failed, rc=256
GOOD iozone completed successfully
FAIL isic running test job
 Traceback (most recent call last):
   File "/usr/local/autotest/client/bin/job.py", line 179, in __runtest
     test.runtest(self, url, tag, args, dargs)
   File "/usr/local/autotest/client/bin/test.py", line 195, in runtest
     fork_waitfor(job.resultdir, pid)
   File "/usr/local/autotest/client/bin/parallel.py", line 40, in fork_waitfor
     (pid, status) = os.waitpid(pid, 0)
 KeyboardInterrupt
GOOD linus_stress completed successfully
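
A note on the rc values in the results above: if they are raw wait
statuses (an assumption, the autotest output does not say so), they can
be decoded with the standard <sys/wait.h> macros. The helper name below
is hypothetical. Under that assumption rc=32512 decodes to exit code 127,
which a shell conventionally returns when a command cannot be found or
run, and rc=256 decodes to exit code 1.

```c
#include <assert.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: returns the exit code encoded in a wait status,
 * or -1 if the process did not exit normally (e.g. killed by a signal).
 */
static int exit_code_from_status(int status)
{
    assert(status >= 0);

    if (WIFEXITED(status))
        return WEXITSTATUS(status);

    return -1;
}
```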

I don't remember what the next test was.

I'll try to find out how to reproduce this lockup. Anyway, IMO it's
not a network-related problem.

Regards,
Michal

--
Michal K. K. Piotrowski
Kernel Monkeys
(http://kernel.wikidot.com/start)