Re: [RFC][PATCH] Fix hang in posix_locks_deadlock()

On Fri, Oct 26, 2007 at 06:47:08PM -0400, J. Bruce Fields wrote:
> Hm.  After another look: assume we have four tasks, t1, t2, t3, and t4.
> Assume t1 and t2 share the same current->files (so they're the same
> "owner" for the purpose of posix_same_owner()).  Assume:
> 
> 	t1 is waiting on a conflicting lock held by t3.
> 	t2 is waiting on a conflicting lock held by t4.
> 
> Now suppose t4 requests a lock that conflicts with a lock held by t1 and
> t2.  The list_for_each_entry() above will search for a task with t1 or
> t2 as owner, which is waiting on a lock.  If it finds t1 first, the loop
> won't be noticed, so t4 will be put to sleep.  Now we have a loop; t3
> can release its lock (it no longer matters), and we'll have
> 
> 	t2 waiting on a conflicting lock held by t4, and
> 	t4 waiting on a conflicting lock held by t2.
> 
> If a new task t5 then requests a lock conflicting with the one held by
> t2, then the above function will go into an infinite loop.  I think.
> 
> Consider the directed graph with each vertex representing the set of all
> tasks sharing the same file table, and each edge representing the
> relationship "a task at this vertex is waiting on a lock held by a task
> on another vertex".  The existence of multiple tasks with the same file
> table means that we can no longer assume that each vertex has outdegree
> at most one, so we have to switch to an algorithm that works on an
> arbitrary directed graph.  That sounds painful.
> 
> Am I right about that, and about the example above?  It'd be interesting
> to code it up just to make sure.
> 
> If so, one can imagine various bandaids, but maybe we should just rip
> out the deadlock detection completely.... It's hard to imagine it being
> really useful anyway.

OK, well I cooked up a similar example, which was kind of fun, and
verified that I can indeed lock up the kernel this way.
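
For anyone who wants to trace the quoted scenario without booting a test
kernel, here is a rough userspace model of it.  This is only a sketch, not
the kernel code: chain_walk(), struct blocked, WALK_LIMIT and the OWNER_*
names are made up for illustration, and the step limit exists only so the
model returns instead of spinning the way the real walk would.  "A" stands
for the owner shared by t1 and t2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical owner ids; OWNER_A is the file table shared by t1 and t2. */
enum { OWNER_A, OWNER_T3, OWNER_T4, OWNER_T5 };

/* One entry of a toy blocked list: "waiter" sleeps on a lock of "holder". */
struct blocked {
	int waiter;
	int holder;
};

#define WALK_LIMIT 64	/* illustration only: lets the model give up */

/*
 * Follow the first blocked entry for each owner in turn, as a walk that
 * assumes outdegree at most one would.
 * Returns 1 if the chain leads back to caller, 0 if the chain ends,
 * and -1 if the step limit is hit (an unbounded walk would never return).
 */
int chain_walk(const struct blocked *list, size_t n, int caller, int blocker)
{
	assert(list != NULL || n == 0);

	for (int steps = 0; steps < WALK_LIMIT; steps++) {
		size_t i;

		if (blocker == caller)
			return 1;
		for (i = 0; i < n; i++)
			if (list[i].waiter == blocker)
				break;
		if (i == n)
			return 0;	/* nobody in the chain is waiting */
		blocker = list[i].holder;
	}
	return -1;
}

/* Step 1: t4 asks for a lock held by A while t1->t3 and t2->t4 exist. */
int missed_deadlock(void)
{
	const struct blocked list[] = {
		{ OWNER_A, OWNER_T3 },	/* t1, found first */
		{ OWNER_A, OWNER_T4 },	/* t2, never looked at */
	};

	return chain_walk(list, 2, OWNER_T4, OWNER_A);
}

/* Step 2: t3 has released its lock, t4 now sleeps on A, and t5 arrives. */
int runaway_walk(void)
{
	const struct blocked list[] = {
		{ OWNER_A, OWNER_T4 },
		{ OWNER_T4, OWNER_A },
	};

	return chain_walk(list, 2, OWNER_T5, OWNER_A);
}
```

In this model missed_deadlock() reports no deadlock because only t1's
entry is followed, and runaway_walk() only comes back because of the
artificial limit.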

The only way this can happen, though, is if you already have deadlocked
threads--that is to say, two threads that are each waiting on posix file
locks held by the other.  (Or a similar cycle of length more than 2.) So
hopefully your application is doing some other kind of deadlock
detection (e.g. by killing threads that block for too long); otherwise
it already has a bug.
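
Since the walk only runs away when such a cycle already exists among the
waiters, one conceivable bandaid is a search over the general graph that
remembers which owners it has already visited.  The following is a minimal
userspace sketch of that idea under my own assumptions, not a proposed
patch: struct wait_graph, would_deadlock(), MAX_OWNERS and the OWNER_* ids
are hypothetical, and a fixed adjacency matrix is used purely to keep the
example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_OWNERS 8	/* hypothetical bound, for the sketch only */

/* Hypothetical owner ids; OWNER_A is the file table shared by t1 and t2. */
enum { OWNER_A, OWNER_T3, OWNER_T4, OWNER_T5 };

/* waits_on[w][h] is true if some task of owner w sleeps on a lock of h. */
struct wait_graph {
	bool waits_on[MAX_OWNERS][MAX_OWNERS];
};

void wait_graph_init(struct wait_graph *g)
{
	memset(g, 0, sizeof(*g));
}

void wait_graph_add(struct wait_graph *g, int waiter, int holder)
{
	assert(waiter >= 0 && waiter < MAX_OWNERS);
	assert(holder >= 0 && holder < MAX_OWNERS);
	g->waits_on[waiter][holder] = true;
}

/* Depth-first search; the visited array is what guarantees termination. */
static bool reaches(const struct wait_graph *g, int from, int target,
		    bool *visited)
{
	if (from == target)
		return true;
	if (visited[from])
		return false;
	visited[from] = true;
	for (int next = 0; next < MAX_OWNERS; next++)
		if (g->waits_on[from][next] &&
		    reaches(g, next, target, visited))
			return true;
	return false;
}

/* Would putting caller to sleep on blocker close a cycle through caller? */
bool would_deadlock(const struct wait_graph *g, int caller, int blocker)
{
	bool visited[MAX_OWNERS] = { false };

	return reaches(g, blocker, caller, visited);
}

/* t4 asks for a lock held by A while A waits on both t3 and t4. */
bool first_request_detected(void)
{
	struct wait_graph g;

	wait_graph_init(&g);
	wait_graph_add(&g, OWNER_A, OWNER_T3);
	wait_graph_add(&g, OWNER_A, OWNER_T4);
	return would_deadlock(&g, OWNER_T4, OWNER_A);
}

/* A and t4 are already deadlocked; t5 asks for a lock held by A. */
bool bystander_deadlocks(void)
{
	struct wait_graph g;

	wait_graph_init(&g);
	wait_graph_add(&g, OWNER_A, OWNER_T4);
	wait_graph_add(&g, OWNER_T4, OWNER_A);
	return would_deadlock(&g, OWNER_T5, OWNER_A);
}
```

With both of owner A's edges followed, the first request is flagged, and
the t5 case terminates with "no cycle through t5" even though A and t4 are
stuck on each other.  Whether per-request bookkeeping like this is worth
its cost in the kernel is a separate question.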

--b.