[BUG] futex_unlock_pi returns w/o unlocking

Ingo,
	So we've just hunted down a nasty bug w/ 2.6.17-rt8. We've had some odd
test failures where userspace apps were hanging when dealing w/ pi
mutexes.

After a good bit of debugging (not by me) it was found that we were
deadlocking by double-locking a mutex; however, it wasn't userland's
fault. On top of that, the mutex __owner field was null, so something
other than the application had gone wrong.

We found that futex_unlock_pi() was somehow returning without error
while not actually clearing the mutex __lock value. Further digging
found the failure case, and it's a bit obscure.

1) In futex_unlock_pi() we start w/ ret=0 and we go down to the first
futex_atomic_cmpxchg_inatomic(), where we find uval==-EFAULT.  We then
jump to the pi_faulted label.
2) From pi_faulted: We increment attempt, unlock the sem and hit the
retry label.
3) From the retry label, with ret still zero, we again hit EFAULT on the
first futex_atomic_cmpxchg_inatomic(), and again goto the pi_faulted
label.
4) Again from pi_faulted: we increment attempt and enter the
conditional, where we call futex_handle_fault.
5) futex_handle_fault fails, and we goto the out_unlock_release_sem
label. 
6) From out_unlock_release_sem we return, and since ret is still zero,
we return without error, while never actually unlocking the lock.
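
The six steps above can be replayed in a small user-space model. This is
only a sketch of the control flow as described in this mail, not the
kernel code: model_unlock_pi(), model_handle_fault() and the
cmpxchg_faults counter are hypothetical names, the mmap_sem handling is
left out, and the "fault" is just a counter saying how many times the
first cmpxchg reports -EFAULT.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for futex_handle_fault(), original attempt check. */
static int model_handle_fault(int attempt)
{
	if (attempt >= 2)
		return -EFAULT;
	return 0;	/* pretend the page was faulted in */
}

/*
 * Simplified model of the futex_unlock_pi() path described above.
 * cmpxchg_faults: how many times the first cmpxchg reports -EFAULT.
 * *unlocked is set to true only if the model reaches the real unlock.
 */
static int model_unlock_pi(int cmpxchg_faults, bool *unlocked)
{
	int ret = 0, attempt = 0;

	*unlocked = false;
retry:
	if (cmpxchg_faults > 0) {
		cmpxchg_faults--;
		goto pi_faulted;	/* steps 1 and 3: ret is not set here */
	}
	*unlocked = true;		/* the cmpxchg went through */
	return ret;

pi_faulted:
	if (attempt++) {
		if (model_handle_fault(attempt))
			goto out;	/* step 5 */
		goto retry;
	}
	goto retry;			/* step 2: first fault, just retry */
out:
	return ret;			/* step 6: ret is still 0 */
}
```

Calling model_unlock_pi(2, &unlocked) in this model returns 0 with
unlocked still false, which is the "success without unlocking" case.
With a single fault the retry succeeds and the lock is released.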


Issue #1: At the first futex_atomic_cmpxchg_inatomic() we should
probably be setting ret=-EFAULT before jumping to pi_faulted. However,
in our case this doesn't really affect anything, as the glibc we're
using ignores the error value from futex_unlock_pi().

Issue #2: Look at futex_handle_fault(): its first conditional will
return -EFAULT if attempt is >= 2. However, from the "if(attempt++)
futex_handle_fault(attempt)" logic above, we'll *never* call
futex_handle_fault when attempt is less than two. So we never get a
chance to even try to fault the page in.
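
The off-by-one in Issue #2 can be checked in isolation. The sketch below
models only the attempt counter: attempt_seen_on_fault(), check_old()
and check_new() are hypothetical helpers, and check_new() uses the
"> 2" comparison from the patch at the end of this mail. The find_vma()
and VM_WRITE parts of the real conditional are left out.

```c
#include <errno.h>

/*
 * Value of attempt that would be passed to futex_handle_fault() on the
 * n-th fault (n >= 1) under the "if (attempt++)" pattern, or 0 if the
 * call is skipped on that fault.
 */
static int attempt_seen_on_fault(int n)
{
	int attempt = 0, seen = 0;

	for (int i = 0; i < n; i++) {
		seen = 0;
		if (attempt++)		/* tests the old value, then increments */
			seen = attempt;
	}
	return seen;
}

/* Attempt part of the original check: gives up at attempt >= 2. */
static int check_old(int attempt)
{
	return attempt >= 2 ? -EFAULT : 0;
}

/* Attempt part of the patched check: gives up at attempt > 2. */
static int check_new(int attempt)
{
	return attempt > 2 ? -EFAULT : 0;
}
```

In this model the first fault never reaches the helper, the second
fault passes attempt == 2, and check_old(2) already returns -EFAULT.
check_new(2) lets that call proceed and only gives up on the third
fault.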

This very simple and hackish fix for issue #2 is probably not the
correct solution, but with the odd if(attempt++) logic all over futex.c
it might actually be the right thing to do.

Your thoughts?

thanks
-john

Index: 2.6-rt/kernel/futex.c
===================================================================
--- 2.6-rt.orig/kernel/futex.c	2006-08-01 13:14:50.000000000 -0700
+++ 2.6-rt/kernel/futex.c	2006-08-02 17:25:32.000000000 -0700
@@ -298,7 +298,7 @@
 	struct vm_area_struct * vma;
 	struct mm_struct *mm = current->mm;
 
-	if (attempt >= 2 || !(vma = find_vma(mm, address)) ||
+	if (attempt > 2 || !(vma = find_vma(mm, address)) ||
 	    vma->vm_start > address || !(vma->vm_flags & VM_WRITE))
 		return -EFAULT;
 


