Re: [PATCH] Fix deadlock in RPC scheduling code.

On Thu, 2006-03-09 at 11:35 +0100, Aurelien Degremont wrote:
> Hello,
> 
> Here is a small patch which fixes a deadlock in the RPC scheduling
> code (mainly in rpc_wake_up_task()). The problem happens if an RPC
> task, waiting for the response to a sync request, is woken up by a
> signal while, at the same time, the kernel tries to wake up some RPC
> tasks. When that happens, a deadlock can occur due to an error in RPC
> workqueue lock management.
> 
> This error was introduced by linux-2.6.8.1-49-rpc_workqueue.patch
> (included in 2.6.11). The code has not changed since. The issue was
> detected on linux-2.6.12.
> 
> 
> **DESCRIPTION**
> 
> This deadlock is caused by incorrect handling of two locks: queue->lock
> and the RPC_TASK_WAKEUP bit in task->tk_runstate.
> When dealing with RPC workqueues, the code must take the associated
> queue lock before modifying the queue or its tasks.
> And it does... except in one function. Nested locks should always be
> taken in the same order.
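
For readers following the lock-ordering argument above, here is a minimal
userspace sketch of the rule "nested locks are always taken in the same
order". It is not kernel code: the two pthread mutexes are hypothetical
stand-ins for queue->lock and the RPC_TASK_WAKEUP bit, and wake_one(),
waker() and run_lock_order_demo() are names invented for this illustration.
It only shows the ordering rule and says nothing about whether the queue
pointer is valid when the lock is taken.

```c
#include <pthread.h>

/* Hypothetical userspace stand-ins, NOT the kernel's data structures. */
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;  /* models queue->lock */
static pthread_mutex_t wakeup_lock = PTHREAD_MUTEX_INITIALIZER; /* models the RPC_TASK_WAKEUP bit */
static long wakeups; /* protected by both locks */

#define ITERATIONS 100000

/*
 * Every caller takes queue_lock first, then wakeup_lock.
 * If one path took them in the opposite order, two threads could each
 * hold one lock while waiting for the other (an ABBA deadlock).
 */
static void wake_one(void)
{
    pthread_mutex_lock(&queue_lock);
    pthread_mutex_lock(&wakeup_lock);
    wakeups++;
    pthread_mutex_unlock(&wakeup_lock);
    pthread_mutex_unlock(&queue_lock);
}

static void *waker(void *unused)
{
    (void)unused;
    for (int i = 0; i < ITERATIONS; i++)
        wake_one();
    return NULL;
}

/* Runs two concurrent wakers; returns the total count, or -1 on error. */
long run_lock_order_demo(void)
{
    pthread_t a, b;

    wakeups = 0;
    if (pthread_create(&a, NULL, waker, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, waker, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return wakeups;
}
```

Built with -pthread, run_lock_order_demo() completes because both threads
agree on the order. Which lock comes first is a design choice; the point is
only that all paths agree on it.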
> 
> As a consequence, in some circumstances (in fact, I have seen it
> several times), this code deadlocks. Moreover, when this code is called,
> for example by nfs_file_flush(), lock_kernel() is held, so the whole
> system hangs.
> 
> 
> **PATCH**
> 
> To fix the problem, we only need to invert the lock hierarchy in
> rpc_wake_up_task() and take care to check that the task is really
> queued before trying to grab the lock.

No you can't. You have absolutely _no_ guarantee that
task->u.tk_wait.rpc_waitq is valid here.

The real fix is the one I posted in response to this thread last week.
See

http://client.linux-nfs.org/Linux-2.6.x/2.6.16-rc5/linux-2.6.16-89-fix_race_in_rpc_wakeup.dif

Cheers,
  Trond
