Crash soon after an alloc_skb failure in 2.6.16 and previous, swap disabled

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

I'm running a few machines with swap turned off and am experiencing crashes when the system is extremely low on kernel memory. So far the crashes observed are always inside the recv function of the Ethernet module, below is the trace for the tg3 module but a similar result is also seen with the e1000 module. Th crash is not necessarily related to the Ethernet modules but may happen at a later stage deeper in the networking code.

I don't have console access to the machine so I can't know what the final oops/crash message is (if any) but this can be reproduced on any machine quite easily by consuming all of the available memory, I guess that if done at userspace the OOM killer will prevent this from happening but a simple LKM can allocate all this memory and this issue should surface quickly.

Some commercial unices have tunable kernel parameters for randomly denying allocations within the kernel to help QA-abuse code within the kernel which relies on memory allocation, this kind of feature helps locate crash cases similar to this one. Maybe it should be introduced to the kernel as well and be tested from time to time prior to major releases?

The benefits of running a system without swap are arguable, but in my particular scenario I prefer to have connections dropped rather than experience the overheads and latencies of a heavily swapping system. Nonetheless, the kernel should handle most of the OOM cases as elegantly and gently as possible.

I tried tuning vm.min_free_kbytes and vm.lowmem_reserve_ratio with fairly high values but it didn't seem to help. Are there other parameters I should look at?

Thanks!

Mar 28 18:03:47 t06 kernel: swapper: page allocation failure. order:1, mode:0x20
Mar 28 18:03:47 t06 kernel:
Mar 28 18:03:47 t06 kernel: Call Trace: <IRQ> <ffffffff8015d501>{__alloc_pages+705} Mar 28 18:03:47 t06 kernel: <ffffffff8017be67>{cache_grow+359} <ffffffff8017c362>{cache_alloc_refill+450} Mar 28 18:03:47 t06 kernel: <ffffffff8017c4c2>{__kmalloc+226} <ffffffff8044f45b>{__alloc_skb+107} Mar 28 18:03:47 t06 kernel: <ffffffff8048423b>{tcp_collapse+331} <ffffffff80484670>{tcp_prune_queue+448} Mar 28 18:03:47 t06 kernel: <ffffffff8048743c>{tcp_data_queue+540} <ffffffff80488e99>{tcp_rcv_established+1849} Mar 28 18:03:47 t06 kernel: <ffffffff8048fcbf>{tcp_v4_do_rcv+63} <ffffffff8046dc5c>{nf_iterate+92} Mar 28 18:03:47 t06 kernel: <ffffffff8049102e>{tcp_v4_rcv+2478} <ffffffff80474180>{ip_local_deliver_finish+0} Mar 28 18:03:47 t06 kernel: <ffffffff804744d2>{ip_local_deliver+370} <ffffffff80474aa7>{ip_rcv+1255} Mar 28 18:03:47 t06 kernel: <ffffffff8044f476>{__alloc_skb+134} <ffffffff80456d90>{netif_receive_skb+976} Mar 28 18:03:47 t06 kernel: <ffffffff803855fd>{tg3_poll+1901} <ffffffff80455618>{net_rx_action+184} Mar 28 18:03:47 t06 kernel: <ffffffff80133743>{__do_softirq+99} <ffffffff8010bdc2>{call_softirq+30} Mar 28 18:03:47 t06 kernel: <ffffffff8010ceac>{do_softirq+44} <ffffffff8010cf07>{do_IRQ+71} Mar 28 18:03:47 t06 kernel: <ffffffff801096f0>{default_idle+0} <ffffffff8010b120>{ret_from_intr+0} <EOI> Mar 28 18:03:47 t06 kernel: <ffffffff804be41f>{thread_return+0} <ffffffff8010971d>{default_idle+45} Mar 28 18:03:47 t06 kernel: <ffffffff801097bb>{cpu_idle+107} <ffffffff807c543f>{start_secondary+1247}
Mar 28 18:03:47 t06 kernel: Mem-info:
Mar 28 18:03:47 t06 kernel: Node 1 DMA per-cpu: empty
Mar 28 18:03:47 t06 kernel: Node 1 DMA32 per-cpu: empty
Mar 28 18:03:47 t06 kernel: Node 1 Normal per-cpu:
Mar 28 18:03:47 t06 kernel: cpu 0 hot: high 186, batch 31 used:0
Mar 28 18:03:47 t06 kernel: cpu 0 cold: high 62, batch 15 used:0
Mar 28 18:03:47 t06 kernel: cpu 1 hot: high 186, batch 31 used:11
Mar 28 18:03:47 t06 kernel: cpu 1 cold: high 62, batch 15 used:57
Mar 28 18:03:47 t06 kernel: Node 1 HighMem per-cpu: empty
Mar 28 18:03:47 t06 kernel: Node 0 DMA per-cpu:
Mar 28 18:03:47 t06 kernel: cpu 0 hot: high 0, batch 1 used:0
Mar 28 18:03:47 t06 kernel: cpu 0 cold: high 0, batch 1 used:0
Mar 28 18:03:47 t06 kernel: cpu 1 hot: high 0, batch 1 used:0
Mar 28 18:03:47 t06 kernel: cpu 1 cold: high 0, batch 1 used:0
Mar 28 18:03:47 t06 kernel: Node 0 DMA32 per-cpu:
Mar 28 18:03:47 t06 kernel: cpu 0 hot: high 186, batch 31 used:80
Mar 28 18:03:47 t06 kernel: cpu 0 cold: high 62, batch 15 used:50
Mar 28 18:03:47 t06 kernel: cpu 1 hot: high 186, batch 31 used:69
Mar 28 18:03:47 t06 kernel: cpu 1 cold: high 62, batch 15 used:57
Mar 28 18:03:47 t06 kernel: Node 0 Normal per-cpu: empty
Mar 28 18:03:47 t06 kernel: Node 0 HighMem per-cpu: empty
Mar 28 18:03:47 t06 kernel: Free pages:      310220kB (0kB HighMem)
Mar 28 18:03:47 t06 kernel: Active:897229 inactive:1864947 dirty:1200746 writeback:315 unstable:0 free:77555 slab:167046 mapped:644593 pagetables:23124 Mar 28 18:03:47 t06 kernel: Node 1 DMA free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 28 18:03:47 t06 kernel: lowmem_reserve[]: 0 0 8080 8080
Mar 28 18:03:47 t06 kernel: Node 1 DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 28 18:03:47 t06 kernel: lowmem_reserve[]: 0 0 8080 8080
Mar 28 18:03:47 t06 kernel: Node 1 Normal free:184796kB min:9444kB low:11804kB high:14164kB active:2372304kB inactive:5198104kB present:8273920kB
pages_scanned:33 all_unreclaimable? no
Mar 28 18:03:47 t06 kernel: lowmem_reserve[]: 0 0 0 0
Mar 28 18:03:47 t06 kernel: Node 1 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 28 18:03:47 t06 kernel: lowmem_reserve[]: 0 0 0 0
Mar 28 18:03:47 t06 kernel: Node 0 DMA free:8996kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:8564kB pages_scanned:10 all_unreclaimable? yes
Mar 28 18:03:47 t06 kernel: lowmem_reserve[]: 0 3896 3896 3896
Mar 28 18:03:47 t06 kernel: Node 0 DMA32 free:116428kB min:4552kB low:5688kB high:6828kB active:1216612kB inactive:2261684kB present:3989664kB pages_scanned:33 all_unreclaimable? no
Mar 28 18:03:47 t06 kernel: lowmem_reserve[]: 0 0 0 0
Mar 28 18:03:47 t06 kernel: Node 0 Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 28 18:03:47 t06 kernel: lowmem_reserve[]: 0 0 0 0
Mar 28 18:03:47 t06 kernel: Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 28 18:03:47 t06 kernel: lowmem_reserve[]: 0 0 0 0
Mar 28 18:03:47 t06 kernel: Node 1 DMA: empty
Mar 28 18:03:47 t06 kernel: Node 1 DMA32: empty
Mar 28 18:03:47 t06 kernel: Node 1 Normal: 45755*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 184796kB
Mar 28 18:03:47 t06 kernel: Node 1 HighMem: empty
Mar 28 18:03:47 t06 kernel: Node 0 DMA: 5*4kB 4*8kB 3*16kB 4*32kB 1*64kB 2*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 2*4096kB = 8996kB Mar 28 18:03:47 t06 kernel: Node 0 DMA32: 28893*4kB 1*8kB 1*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 116428kB
Mar 28 18:03:47 t06 kernel: Node 0 Normal: empty
Mar 28 18:03:47 t06 kernel: Node 0 HighMem: empty
Mar 28 18:03:47 t06 kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
Mar 28 18:03:47 t06 kernel: Free swap  = 0kB
Mar 28 18:03:47 t06 kernel: Total swap = 0kB
Mar 28 18:03:47 t06 kernel: Free swap:            0kB
Mar 28 18:03:47 t06 kernel: 3145728 pages of RAM
Mar 28 18:03:47 t06 kernel: 100273 reserved pages
Mar 28 18:03:47 t06 kernel: 2717673 pages shared
Mar 28 18:03:47 t06 kernel: 0 pages swap cached

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux