Re: RT and Cascade interrupts

del_timer_sync() is known to be racy. I now think it also has
(theoretical) problems with singleshot timers.

Suppose we have an rpc timer running on CPU_0, and it is
preempted *after* it has cleared the RPC_TASK_HAS_TIMER bit.

Then __rpc_execute's main loop on CPU_1 calls rpc_delete_timer(),
which sees the bit already clear and returns without doing
del_timer_sync().

Now it calls ->tk_action()->rpc_sleep_on()->__rpc_add_timer(),
so the timer is pending on CPU_1.

Preemption occurs and the task is rescheduled on another (not CPU_1)
processor (this step is not strictly necessary).

On the next loop iteration, __rpc_execute calls rpc_delete_timer() again,
and this time it calls del_timer_sync().

del_timer_sync:

	// __run_timers starts on CPU_1, picks rpc timer, sets
	// timer->base = 0

	ret += del_timer(timer); // return 0, timer is not pending.

	for_each_online_cpu() {
		if (timer running on that cpu) {
			// finds the timer still running on CPU_0,
			// waits for __run_timers on CPU_0 to change
			//	->running_timer,
			// the timer on CPU_0 completes.
			break;
		}
	}

	if (timer_pending())	// NO
		goto del_again;

	// The timer is still running on CPU_1
	return;
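
The interleaving above can be replayed deterministically in user space.
What follows is only a toy model, not kernel code: struct toy_timer,
the toy_* helpers and replay_race() are hypothetical names made up for
illustration, 'pending' stands in for timer->base != NULL, and the
"wait" on CPU_0 is modeled by simply letting that handler finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPUS 2

/* Toy timer: 'pending' stands in for timer->base != NULL. */
struct toy_timer {
	bool pending;
};

/* Stand-in for each CPU's ->running_timer. */
static const struct toy_timer *running_timer[NCPUS];

/* Like del_timer(): succeeds only if the timer is still pending. */
static int toy_del_timer(struct toy_timer *t)
{
	if (!t->pending)
		return 0;
	t->pending = false;
	return 1;
}

/* __run_timers picks the timer: no longer pending, now running on 'cpu'. */
static void toy_run_timers_pick(int cpu, struct toy_timer *t)
{
	t->pending = false;
	running_timer[cpu] = t;
}

/* The del_timer_sync() logic from the walk-through above. */
static int toy_del_timer_sync(struct toy_timer *t)
{
	int ret = 0;
	int cpu;

del_again:
	ret += toy_del_timer(t);
	for (cpu = 0; cpu < NCPUS; cpu++) {
		if (running_timer[cpu] == t) {
			/* "Wait": in this replay the handler there just finishes. */
			running_timer[cpu] = NULL;
			break;	/* stops at the first CPU found, as above */
		}
	}
	if (t->pending)
		goto del_again;
	return ret;
}

/* Returns 1 if the timer is still running on CPU_1 after the "sync". */
int replay_race(void)
{
	struct toy_timer t = { .pending = false };
	int ret, still_running;

	running_timer[0] = &t;		/* old handler preempted on CPU_0 */
	running_timer[1] = NULL;
	t.pending = true;		/* __rpc_add_timer() re-arms on CPU_1 */
	toy_run_timers_pick(1, &t);	/* __run_timers on CPU_1 picks it */

	ret = toy_del_timer_sync(&t);
	assert(ret == 0);		/* del_timer() saw a non-pending timer */

	still_running = (running_timer[1] == &t);
	running_timer[1] = NULL;	/* do not leave a dangling pointer */
	return still_running;
}
```

In this model replay_race() returns 1: the loop breaks after CPU_0, the
timer is not pending, and the function returns while CPU_1 still has the
timer as its running timer.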

All of this is very unlikely, of course, but it would be nice to verify
that kernel/timer.c is not the source of the problem.

John, if it is easy to reproduce the problem, could you please retest
with this patch?

Oleg.

--- 2.6.12-rc5/net/sunrpc/sched.c~	Wed Jun  1 17:49:57 2005
+++ 2.6.12-rc5/net/sunrpc/sched.c	Wed Jun  1 18:00:31 2005
@@ -137,8 +137,12 @@ rpc_delete_timer(struct rpc_task *task)
 {
 	if (RPC_IS_QUEUED(task))
 		return;
-	if (test_and_clear_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate)) {
-		del_singleshot_timer_sync(&task->tk_timer);
+	if (test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate)) {
+		if (del_singleshot_timer_sync(&task->tk_timer)) {
+			BUG_ON(!test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate));
+			clear_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate);
+		} else
+			BUG_ON(test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate));
 		dprintk("RPC: %4d deleting timer\n", task->tk_pid);
 	}
 }