Re: Crash soon after an alloc_skb failure in 2.6.16 and previous, swap disabled

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Just Marc <[email protected]> wrote:
>
> I'm running a few machines with swap turned off and am experiencing 
> crashes when the system is extremely low on kernel memory.   So far the 
> crashes observed are always inside the recv function of the Ethernet 
> module, below is the trace for the tg3 module but a similar result is 
> also seen with the e1000 module.   Th crash is not necessarily related 
> to the Ethernet modules but may happen at a later stage deeper in the 
> networking code.
> 
> I don't have console access to the machine so I can't know what the 
> final oops/crash message is (if any) but this can be reproduced on any 
> machine quite easily by consuming all of the available memory,  I guess 
> that if done at userspace the OOM killer will prevent this from 
> happening but a simple LKM can allocate all this memory and this issue 
> should surface quickly.

We'd really need to see that final oops trace, please.

It's not unusual for a hard-working gigabit NIC to exhaust the page
allocator reserves and perhaps we're a bit too noisy in the logs when it
happens.  But it's sufficiently rare and sufficiently associated with other
problems (like this one) that nobody has yet gone and stuck the
__GFP_NOWARN into the relevant drivers to suppress the messages.

If we really have broken something in there then someone else will hit this
soon enough.  But nobody has, as far as I know.

A digital photo of the screen would suit.

Or perhaps netconsole.  If the crash is really associated with the NIC
running out of txbufs then netconsole might not be useful.  But perhaps the
crash is something else altogether.

Netconsole is pretty easy to get going.  See
Documentation/networking/netconsole.txt.  To the target machine's kernel
boot command line you add

[email protected]/eth0,[email protected]/nn:nn:nn:nn:nn:nn

where a.b.c.d is the target machine's IP address and A.B.C.D is your
workstation's IP address and nn:nn:nn:nn:nn:nn is your workstation's MAC
address.

On the workstation, do

	netcat -u -l -p 5140


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux