Re: [RFC/PATCH] debug workqueue deadlocks with lockdep

On 07/04, Johannes Berg wrote:
> 
> On Tue, 2007-07-03 at 21:31 +0400, Oleg Nesterov wrote:
> 
> > If A does NOT take a lock L1, then it is OK to do cancel_work_sync(A)
> > under L1, regardless of which other work_structs this workqueue has,
> > before or after A.
> 
> Ah, cancel_work_sync() waits only for it if A is currently running?

Yes. And no other work (except a barrier) can run before the caller of
wait_on_work() is woken.
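
To make that concrete, here is a toy userspace model of the rule, not the
kernel code. Every name in it (toy_cwq, toy_wait_on_work, TOY_BARRIER) is
made up for illustration, and placing the barrier at the head of the pending
list is my assumption about how "no other work can run first" is achieved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_QUEUE_MAX 8
#define TOY_BARRIER   (-1)  /* hypothetical marker for a barrier entry */

/* Toy stand-in for one per-CPU queue; work items are plain integer ids. */
struct toy_cwq {
	int current_work;           /* id of the running work, 0 if idle */
	int pending[TOY_QUEUE_MAX]; /* pending[0] is the next item to run */
	size_t nr_pending;
};

/*
 * Returns true if the caller would have to sleep.  That is the case only
 * when 'work' is running right now; the barrier is then put in front of
 * everything else that is pending.
 */
static bool toy_wait_on_work(struct toy_cwq *cwq, int work)
{
	if (cwq->current_work != work)
		return false;       /* not running: nothing to wait for */

	assert(cwq->nr_pending < TOY_QUEUE_MAX);
	memmove(&cwq->pending[1], &cwq->pending[0],
		cwq->nr_pending * sizeof(cwq->pending[0]));
	cwq->pending[0] = TOY_BARRIER;
	cwq->nr_pending++;
	return true;
}
```

With work 1 running and works 2 and 3 pending, waiting on 2 returns false and
leaves the queue alone, while waiting on 1 returns true and leaves the barrier
ahead of 2 and 3.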

> > Now we have a false positive if some time we queue B into that workqueue,
> > and this is not good.
> 
> Right. I was thinking of the flush_workqueue case where any of A or B
> matters.

Aha, now I see where I was confused. Yes, we can't avoid the false positives
with flush_workqueue().

I hope this won't be a problem, because almost every usage of flush_workqueue()
is pointless nowadays. So even if we have a false positive, it probably
means the code needs cleanups anyway.

But see below,

> > We can avoid this problem if we put lockdep_map into work_struct, so
> > that wait_on_work() "locks" work->lockdep_map, while flush_workqueue()
> > takes wq->lockdep_map.
> 
> Yeah, and then we'll take both wq->lockdep_map and the
> work_struct->lockdep_map when running that work. That should work, I'll
> give it a go later.

If you are going to do this, may I suggest making two separate patches?
Precisely because we can't avoid the false positives with flush_workqueue(),
it would be nice to have the option of reverting the second patch if there
are too many false positives (I hope this won't happen).

(please ignore if this is not suitable for you).
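
To illustrate why the two maps behave differently, here is a rough userspace
sketch. It is a toy dependency tracker, not lockdep: lock classes are small
integers, and all names (toy_acquire_under, toy_run_work, WORK_A_MAP and so
on) are hypothetical. It only records "X was acquired while Y was held" and
reports when a new acquisition would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy lock classes; in the real proposal these would be lockdep_maps. */
enum { L1, WQ_MAP, WORK_A_MAP, WORK_B_MAP, NR_CLASSES };

/* dep[a][b] is true if class b was acquired while class a was held. */
static bool dep[NR_CLASSES][NR_CLASSES];

static void toy_reset(void)
{
	memset(dep, 0, sizeof(dep));
}

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool toy_reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int i = 0; i < NR_CLASSES; i++)
		if (dep[from][i] && !seen[i] && toy_reachable(i, to, seen))
			return true;
	return false;
}

/* Record the new edge; return true if it closes a cycle (a "report"). */
static bool toy_acquire_under(int held, int acquired)
{
	bool seen[NR_CLASSES] = { false };
	bool cycle;

	assert(held >= 0 && held < NR_CLASSES);
	assert(acquired >= 0 && acquired < NR_CLASSES);
	cycle = toy_reachable(acquired, held, seen);
	dep[held][acquired] = true;
	return cycle;
}

/*
 * Model of running one work: both the workqueue map and the work's own map
 * are "held" while the work function takes 'lock_taken' (-1 for none).
 */
static void toy_run_work(int work_map, int lock_taken)
{
	toy_acquire_under(WQ_MAP, work_map);
	if (lock_taken >= 0) {
		toy_acquire_under(WQ_MAP, lock_taken);
		toy_acquire_under(work_map, lock_taken);
	}
}
```

If A takes no lock and B takes L1, then after both have run once,
toy_acquire_under(L1, WORK_A_MAP) (the cancel_work_sync(A) case) reports
nothing, while toy_acquire_under(L1, WQ_MAP) (the flush_workqueue() case)
still reports a cycle through B.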

> > > @@ -257,7 +260,9 @@ static void run_workqueue(struct cpu_wor
> > >
> > >  		BUG_ON(get_wq_data(work) != cwq);
> > >  		work_clear_pending(work);
> > > +		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> > >  		f(work);
> > > +		lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
> >                                                    ^^^
> > Isn't it better to call lock_release() with nested == 1 ?
> 
> Not sure, Ingo?

Ingo, could you also explain the meaning of the "nested" parameter? It looks
unnecessary: lock_release_nested() does a quick check and falls back to
lock_release_non_nested() when hlock is not on top of the stack.

Oleg.
