Re: Recursion bug in -rt

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dinakar,
after further testing and investigation I believe you original assessment was correct.
The problem you are seeing is not a library problem.
The changes to down_futex need to be reverted.  There is a new patch at

http://source.mvista.com/~dsingleton/patch-2.6.15-rc5-rt2-rf2.

that reverts the changes to down_futex.

Thanks for testing this.

David




Dinakar Guniguntala wrote:

Hi David,

I hit this bug with -rt22-rf11

==========================================
[ BUG: lock recursion deadlock detected! |
------------------------------------------
already locked:  [f7abbc94] {futex}
.. held by:          testpi-3: 4595 [f7becdd0,  59]
... acquired at:               futex_wait_robust+0x142/0x1f3
------------------------------
| showing all locks held by: |  (testpi-3/4595 [f7becdd0,  59]):
------------------------------

#001:             [f7abbc94] {futex}
... acquired at:               futex_wait_robust+0x142/0x1f3

-{current task's backtrace}----------------->
[<c0103e04>] dump_stack+0x1e/0x20 (20)
[<c0136bc2>] check_deadlock+0x2d7/0x334 (44)
[<c01379bc>] task_blocks_on_lock+0x2c/0x224 (36)
[<c03f29c5>] __down_interruptible+0x37c/0x95d (160)
[<c013aebf>] down_futex+0xa3/0xe7 (40)
[<c013ebc5>] futex_wait_robust+0x142/0x1f3 (72)
[<c013f35c>] do_futex+0x9a/0x109 (40)
[<c013f4dd>] sys_futex+0x112/0x11e (68)
[<c0102f03>] sysenter_past_esp+0x54/0x75 (-8116)
------------------------------
| showing all locks held by: |  (testpi-3/4595 [f7becdd0,  59]):
------------------------------

#001:             [f7abbc94] {futex}
... acquired at:               futex_wait_robust+0x142/0x1f3

---------------------------------------------------------------------

futex.c -> futex_wait_robust

       if ((curval & FUTEX_PID) == current->pid) {
               ret = -EAGAIN;
               goto out_unlock;
       }

rt.c    -> down_futex

       if (!owner_task || owner_task == current) {
               up(sem);
               up_read(&current->mm->mmap_sem);
               return -EAGAIN;
       }

I noticed that both the above checks below have been removed in your
patch. I do understand that the futex_wait_robust path has been
made similar to the futex_wait path, but I think we are not taking
PI into consideration. Basically it looks like we still need to check
if the current task has become owner. or are we missing a lock somewhere ?

I added the down_futex check above and my test has been
running for hours without the oops. Without this check it
used to oops within minutes.

Patch that works for me attached below.  Thoughts?

       -Dinakar


------------------------------------------------------------------------

Index: linux-2.6.14-rt22-rayrt5/kernel/rt.c
===================================================================
--- linux-2.6.14-rt22-rayrt5.orig/kernel/rt.c	2005-12-15 02:15:13.000000000 +0530
+++ linux-2.6.14-rt22-rayrt5/kernel/rt.c	2005-12-15 02:18:29.000000000 +0530
@@ -3001,7 +3001,7 @@
	 * if the owner can't be found return try again.
	 */

-	if (!owner_task) {
+	if (!owner_task || owner_task == current) {
		up(sem);
		up_read(&current->mm->mmap_sem);
		return -EAGAIN;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux