Re: RT and Cascade interrupts

Oleg Nesterov wrote:
> This all is very unlikely of course, but it would be nice to verify
> that kernel/timer.c is not the source of the problem.

The problem I am seeing is on a single CPU system.

> John, if it is easy to reproduce the problem, could you please retest
> with this patch?

I've done so, and the second assert fires
when running the test.  So here we have a case
of RPC_TASK_HAS_TIMER set while the associated
timer->base == NULL.  This is easily explained
by the following interleaving:

__run_timers()
    timer->base = NULL;
        rpc_run_timer()
            task->tk_timeout_fn(task)
            /* ksoftirqd preempted */
                                             :
                                         /* RPC client */
                                         rpc_execute()
                                            rpc_delete_timer()
                                                del_timer() returns 0
                                                BUG_ON(test_bit(RPC_TASK_HAS_TIMER,
                                                  &task->tk_runstate));
                                             :
         /* rpc_run_timer() resumes */
         clear_bit(RPC_TASK_HAS_TIMER,
           &task->tk_runstate);

I don't see how this would imply a kernel/timer.c
problem.  It also doesn't appear to explain the
timer cascade corruption I've seen, since the
end result is deleting an already dequeued timer,
which is safe here.

BTW, if preemption is explicitly disabled in rpc_run_timer(),
the BUG_ON assert does not fire, nor (as reported
earlier) does the cascade corruption occur.  I'm still
investigating.
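
To make the window concrete, here is a small single-threaded userspace
sketch that replays the interleaving above in a fixed order.  It is only
a model: the struct and function names (model_timer, model_task,
replay_trips_assert, and so on) are made up for illustration and are not
kernel code, and "preemption disabled" is modeled simply by running the
RPC client step after the flag has been cleared.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
struct model_timer {
	void *base;		/* non-NULL while the timer is queued */
};

struct model_task {
	struct model_timer timer;
	bool has_timer;		/* stands in for RPC_TASK_HAS_TIMER */
};

/* Like del_timer(): returns 1 if a pending timer was dequeued, else 0. */
static int model_del_timer(struct model_timer *t)
{
	if (t->base == NULL)
		return 0;
	t->base = NULL;
	return 1;
}

/*
 * Models the patched rpc_delete_timer() check.  Returns true when the
 * second assert would fire: the timer is not pending, yet the flag is
 * still set.
 */
static bool model_delete_timer_trips(struct model_task *task)
{
	if (!task->has_timer)
		return false;
	if (model_del_timer(&task->timer)) {
		task->has_timer = false;
		return false;
	}
	return task->has_timer;
}

/* Replays the trace; returns true if the second assert would fire. */
static bool replay_trips_assert(bool preempt_disabled)
{
	static int dummy_base;
	struct model_task task = {
		.timer = { .base = &dummy_base },
		.has_timer = true,
	};
	bool tripped = false;

	/* __run_timers(): the timer is dequeued before its handler runs. */
	task.timer.base = NULL;

	/* rpc_run_timer(): timeout fn has run, flag not yet cleared. */
	if (!preempt_disabled)
		tripped = model_delete_timer_trips(&task); /* client runs here */

	/* rpc_run_timer() resumes and clears the flag. */
	task.has_timer = false;

	if (preempt_disabled)
		tripped = model_delete_timer_trips(&task); /* client runs after */

	return tripped;
}
```

In this model, replay_trips_assert(false) reports the assert firing and
replay_trips_assert(true) does not, which matches what I observe.  It
says nothing about the cascade corruption itself.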

-john



--- 2.6.12-rc5/net/sunrpc/sched.c~	Wed Jun  1 17:49:57 2005
+++ 2.6.12-rc5/net/sunrpc/sched.c	Wed Jun  1 18:00:31 2005
@@ -137,8 +137,12 @@ rpc_delete_timer(struct rpc_task *task)
 {
 	if (RPC_IS_QUEUED(task))
 		return;
-	if (test_and_clear_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate)) {
-		del_singleshot_timer_sync(&task->tk_timer);
+	if (test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate)) {
+		if (del_singleshot_timer_sync(&task->tk_timer)) {
+			BUG_ON(!test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate));
+			clear_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate);
+		} else
+			BUG_ON(test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate));
 		dprintk("RPC: %4d deleting timer\n", task->tk_pid);
 	}
 }
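
For reference, the two assertions in the patch above can be summarized
as a small decision function.  This is a sketch with hypothetical names
(patch_outcome, classify_patch_check), assuming the outer test_bit()
was true and that the delete call returns nonzero only when it removed
a pending timer.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not kernel identifiers. */
enum patch_outcome {
	OUTCOME_OK,		/* neither assertion fires */
	OUTCOME_FIRST_ASSERT,	/* timer was pending, flag already clear */
	OUTCOME_SECOND_ASSERT	/* timer not pending, flag still set */
};

/*
 * del_result:     return value of the timer delete (nonzero if a
 *                 pending timer was removed).
 * flag_after_del: state of the HAS_TIMER bit when re-read afterwards.
 */
static enum patch_outcome classify_patch_check(int del_result,
					       bool flag_after_del)
{
	if (del_result)
		return flag_after_del ? OUTCOME_OK : OUTCOME_FIRST_ASSERT;
	return flag_after_del ? OUTCOME_SECOND_ASSERT : OUTCOME_OK;
}
```

The case reported in this mail corresponds to del_result == 0 with the
flag still set, i.e. OUTCOME_SECOND_ASSERT.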


