Re: [patch 00/61] ANNOUNCE: lock validator -V1

On Tue, May 30, 2006 at 12:26:27PM +0200, Arjan van de Ven wrote:
> On Tue, 2006-05-30 at 11:14 +0200, Benoit Boissinot wrote:
> > On 5/29/06, Ingo Molnar <[email protected]> wrote:
> > > We are pleased to announce the first release of the "lock dependency
> > > correctness validator" kernel debugging feature, which can be downloaded
> > > from:
> > >
> > >   http://redhat.com/~mingo/lockdep-patches/
> > > [snip]
> > 
> > I get this right after ipw2200 is loaded (it is quite verbose, I
> > probably shouldn't post everything...)
> > 
> > ipw2200: Detected Intel PRO/Wireless 2200BG Network Connection
> > ipw2200: Detected geography ZZD (13 802.11bg channels, 0 802.11a channels)
> 
> 
> >  <c0301efa> netlink_broadcast+0x7a/0x360  
> 
> this isn't allowed to be called from IRQ context, because it takes
> nl_table_lock for read, but that is taken as
>         write_lock_bh(&nl_table_lock);
> in 
> 	static void netlink_table_grab(void)
> so without disabling interrupts; which would thus deadlock if this
> read_lock-from-irq would hit.
> 
> >  <c02fb6a4> wireless_send_event+0x304/0x340
> >  <e1cf8e11> ipw_rx+0x1371/0x1bb0 [ipw2200] 
> >  <e1cfe6ac> ipw_irq_tasklet+0x13c/0x500 [ipw2200]
> >  <c0121ea0> tasklet_action+0x40/0x90  
> 
> but it's more complex than that, since we ARE in BH context.
> The complexity comes from us holding &priv->lock, which is 
> used in hard irq context.

It is probably related, but I got this in my log too:

BUG: warning at kernel/softirq.c:86/local_bh_disable()
 <c010402d> show_trace+0xd/0x10  <c0104687> dump_stack+0x17/0x20
 <c0121fdc> local_bh_disable+0x5c/0x70  <c03520f1> _read_lock_bh+0x11/0x30
 <c02e8dce> sock_def_readable+0x1e/0x80  <c0302130> netlink_broadcast+0x2b0/0x360
 <c02fb6a4> wireless_send_event+0x304/0x340  <e1cf8e11> ipw_rx+0x1371/0x1bb0 [ipw2200]
 <e1cfe6ac> ipw_irq_tasklet+0x13c/0x500 [ipw2200] <c0121ea0> tasklet_action+0x40/0x90
 <c01223b4> __do_softirq+0x54/0xc0  <c01056bb> do_softirq+0x5b/0xf0
 =======================
 <c0122455> irq_exit+0x35/0x40  <c01057c7> do_IRQ+0x77/0xc0
 <c0103949> common_interrupt+0x25/0x2c 

> 
> so the deadlock is like this:
> 
> 
> cpu 0: user context                                     cpu 1: softirq context
>    netlink_table_grab takes nl_table_lock as            take priv->lock in ipw_irq_tasklet
>    write_lock_bh, but leaves irqs enabled
> 
> 
>    hardirq comes in and the isr tries to take           in ipw_rx, call wireless_send_event which
>    priv->lock but has to wait on cpu 1                  tries to take nl_table_lock for read
>                                                         but has to wait for cpu 0
> 
> and... kaboom kabang deadlock :)
> 
> 
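
To make the cycle in the scenario quoted above explicit, here is a rough
userspace sketch of it as a wait-for graph. This is not kernel code and not
how the lock validator itself works; the enum names, wait_graph, add_wait(),
has_cycle() and scenario_deadlocks() are all made up for illustration. An
edge A -> B means "A cannot make progress until B does". The flag models
whether hard irqs stay enabled on cpu 0 while nl_table_lock is write-held
(which is what write_lock_bh does); the "false" case is a hypothetical
variant for comparison, not a proposed fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model: each execution context is a node in a wait-for graph. */
enum ctx {
    CPU0_USER,     /* netlink_table_grab(), holds nl_table_lock for write */
    CPU0_HARDIRQ,  /* ipw2200 ISR that interrupted CPU0_USER              */
    CPU1_SOFTIRQ,  /* ipw_irq_tasklet, holds priv->lock                   */
    MAX_NODES
};

struct wait_graph {
    bool edge[MAX_NODES][MAX_NODES]; /* edge[a][b]: a waits for b */
};

static void add_wait(struct wait_graph *g, int waiter, int holder)
{
    assert(waiter >= 0 && waiter < MAX_NODES);
    assert(holder >= 0 && holder < MAX_NODES);
    g->edge[waiter][holder] = true;
}

/* Depth-first search; state: 0 = unvisited, 1 = on current path, 2 = done. */
static bool dfs(const struct wait_graph *g, int n, int *state)
{
    state[n] = 1;
    for (int m = 0; m < MAX_NODES; m++) {
        if (!g->edge[n][m])
            continue;
        if (state[m] == 1)
            return true;               /* back edge: cycle found */
        if (state[m] == 0 && dfs(g, m, state))
            return true;
    }
    state[n] = 2;
    return false;
}

static bool has_cycle(const struct wait_graph *g)
{
    int state[MAX_NODES] = {0};

    for (int n = 0; n < MAX_NODES; n++)
        if (state[n] == 0 && dfs(g, n, state))
            return true;
    return false;
}

/* Build the graph for the quoted scenario and report whether it deadlocks. */
bool scenario_deadlocks(bool irqs_enabled_under_write_lock)
{
    struct wait_graph g;

    memset(&g, 0, sizeof(g));

    /* cpu 1: wireless_send_event wants nl_table_lock for read. */
    add_wait(&g, CPU1_SOFTIRQ, CPU0_USER);

    if (irqs_enabled_under_write_lock) {
        /* cpu 0: the interrupted lock holder resumes only after the ISR returns. */
        add_wait(&g, CPU0_USER, CPU0_HARDIRQ);
        /* cpu 0: the ISR spins on priv->lock, held by the tasklet on cpu 1. */
        add_wait(&g, CPU0_HARDIRQ, CPU1_SOFTIRQ);
    }
    return has_cycle(&g);
}
```

With the flag set, the three edges close the loop cpu 0 user -> cpu 0 hardirq
-> cpu 1 softirq -> cpu 0 user, so scenario_deadlocks(true) reports a cycle.
With it clear, only the cpu 1 -> cpu 0 edge remains and the model sees no
cycle.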

-- 
powered by bash/screen/(urxvt/fvwm|linux-console)/gentoo/gnu/linux OS