Re: [PATCH] netpoll can lock up on low memory.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Aug 06, 2005 at 09:45:03AM +0200, Ingo Molnar wrote:
> 
> * Andi Kleen <[email protected]> wrote:
> 
> > On Fri, Aug 05, 2005 at 01:01:57PM -0700, Matt Mackall wrote:
> > > The netpoll philosophy is to assume that its traffic is an absolute
> > > priority - it is better to potentially hang trying to deliver a panic
> > > message than to give up and crash silently.
> > 
> > That would be ok if netpoll was only used to deliver panics. But it is 
> > not. It delivers all messages, and you cannot hang the kernel during 
> > that. Actually even for panics it is wrong, because often it is more 
> > important to reboot in a panic than (with a panic timeout) to actually 
> > deliver the panic. That's needed e.g. in a failover cluster.
> 
> without going into the merits of this discussion, reliable failover 
> clusters must include (and do include) an external ability to cut power.  
> No amount of in-kernel logic will prevent the kernel from hanging, given 
> a bad enough kernel bug.

Ok, true, but we should do a best effort.

> 
> So the right question is not 'can we prevent the kernel from hanging, 
> ever' (we cannot), but 'which change makes it less likely for the kernel 
> to hang'. (and, obviously: assuming all other kernel components are 
> functioning per specification, netpoll itself most not hang :-)
> 
> even a plain printk to VGA can hang in certain kernel crashes. Netpoll 
> is more complex and thus has more exposure to hangs. E.g. netpoll relies 
> on the network driver to correctly recycle skbs within a bound amount of 
> time. If the network driver leaks skbs, it's game over for netpoll.

I don't think we even need to think about such rare cases,
until the easy cases ("everything hangs when the cable is pulled") 
are not fixed.

> [ i'd prefer a hang over nondeterministic behavior, and e.g. losing 
>   console messages is sure nondeterministic behavior. What if the 
>   console message is "WARNING: the box has just been broken into"? ]

That just makes netconsole useless in production. If it causes frequenet
hangs people will not use it.


> 
> we could do one thing (see the patch below): i think it would be useful 
> to fill up the netlogging skb queue straight at initialization time.  
> Especially if netpoll is used for dumping alone, the system might not be 
> in a situation to fill up the queue at the point of crash, so better be 
> a bit more prepared and keep the pipeline filled.

You're solving a completely different issue here?

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]
  Powered by Linux