Re: [lock validator] inet6_destroy_sock(): soft-safe -> soft-unsafe lock dependency

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Herbert Xu <[email protected]> wrote:

> On Thu, Feb 02, 2006 at 11:54:29AM +0100, Ingo Molnar wrote:
> > 
> > hm, i got a new one:
> > 
> > ============================================
> > [ BUG: circular locking deadlock detected! ]
> > --------------------------------------------
> > sshd/28997 is trying to acquire lock:
> >  (&sk->sk_lock.slock){-+}, at: [<c0c6be28>] packet_rcv+0xbf/0x34b
> > 
> > but task is already holding lock:
> >  (&dev->xmit_lock){-+}, at: [<c0bb04ec>] qdisc_restart+0x46/0x207
> > 
> > which lock already depends on the new lock,
> > which could lead to circular deadlocks!
> > 
> > the dependency chain (in reverse order) is:
> > -> #2 (&dev->xmit_lock){-+}: [<c0bb04ec>] qdisc_restart+0x46/0x207
> > -> #1 (&dev->queue_lock){-+}: [<c0b98137>] dev_queue_xmit+0xc3/0x21e
> > -> #0 (&sk->sk_lock.slock){-+}: [<c0c6be28>] packet_rcv+0xbf/0x34b
> 
> I believe this is a false positive and I think I can see where it went 
> wrong.  The dependency between #0 and #1 is the broken premise.
> 
> The validator is probably putting all sk_lock's in the same basket.  
> That is, it's mixing up the socket locks for TCP, UDP as well as 
> AF_PACKET.  While it is true that TCP and UDP's sk_lock may sit 
> outside queue_lock, AF_PACKET never transmits while holding its 
> sk_lock.

you are right, the validator indeed treated all these protocols as one, 
so this is a false positive. (there are two slock buckets already, 
because there is some natural lock nesting between listen sockets and 
ordinary sockets, but otherwise you are right that all net protocols 
were indeed treated as one category.)

> So the #0 => #1 dependency shouldn't exist.  Can you get the validator 
> to print out the reasoning for the #0 => #1 dependency? That should 
> clarify the problem.

the slock -> queue_lock dependency was first observed at:

-> (&sk->sk_lock.slock/1){-+}
    -> &dev->queue_lock          [<c0b98137>] dev_queue_xmit+0xc3/0x21e

so it comes from dev_queue_xmit(). No further information about where 
that was done - but i suspect it was TCP's sendmsg. af_packet.c never 
seems to be doing that.

The queue_lock -> xmit_lock dependency was first observed at:

-> (&dev->queue_lock){-+}
    -> &dev->xmit_lock           [<c0bb0827>] dev_activate+0xc8/0xe5

i think i'll solve this by splitting af_packet.c's slock into a separate 
bucket (separate lock type). (Such exceptions and locking assymetries 
can be expressed towards the validator in a pretty straightforward way, 
by initializing those locks in a special way.)

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux