Basically , there is a race condition in sys_timer_delete() and
posix_timer_event() .
>From sys_timer_delete() :
/*
* This keeps any tasks waiting on the spin lock from thinking
* they got something (see the lock code above).
*/
if (timer->it_process) {
if (timer->it_sigev_notify ==
(SIGEV_SIGNAL|SIGEV_THREAD_ID))
put_task_struct(timer->it_process);
timer->it_process = NULL;
}
unlock_timer(timer, flags);
/* Preemption happens here. */
release_posix_timer(timer, IT_ID_SET);
So when the timer is getting triggered , right before it takes the timer
lock, preemption happens. You finish the code above. Then your preempted
again right after unlock timer, shown above.
At this point, your triggering a timer that is half deleted, in posix_timer_fn() .
timer->it_process = NULL , so when you try to send the signal to the
timer owner you crash with an OOPS , cause the timer owner was just set
to NULL.
George, at least CC me, after all I found/documented this bug ..
Preliminary fix included ..
Daniel
On Tue, 2005-04-26 at 16:42, George Anzinger wrote:
> Ingo,
>
> In tracking down the failure of a system running the RT patch we have found a
> preemption between the time run_timer_list clears its spinlock and the call back
> function (in this case in posix-timers.c) gets its spinlock. The bad news is
> that it is possible for the timer to be released at this point leaving the call
> back code with a pointer to a bogus timer.
>
> This was/is possible, of course, in SMP systems and is why del_timer_sync()
> exists. I suspect that del_timer_sync() needs to also do the "right thing" in
> UP RT systems.
>
> This means removing the #ifdef CONFIG_SMP at about line 56 of kernel/timer.c
> thus setting up base->running_timer in all cases (or at least in SMP and RT
> cases) and also the #ifdef CONFIG_SMP around del_timer_sync() and, of course,
> the defines that redirect calls to these functions.
>
> Does this make sense?
Source: MontaVista Software, Inc.
MR: 11506
Type: Defect Fix
Disposition: needs submitting to LKML
Signed-off-by: Daniel Walker <[email protected]>
Description:
Ok, so here is the run down.. Basically , there is a race condition in
sys_timer_delete() and posix_timer_event() .
>From sys_timer_delete() :
timer->it_process = NULL;
}
unlock_timer(timer, flags);
/* Preemption happens here. */
release_posix_timer(timer, IT_ID_SET);
So when the timer is getting triggered , right before it takes the timer
lock, preemption happens. You finish the code above. Then your preempted
again right after unlock timer, shown above.
At this point, your triggering a timer that is half deleted, in posix_timer_fn() .
timer->it_process = NULL , so when you try to send the signal to the
timer owner you crash with an OOPS , because the timer owner was just set
to NULL.
Index: linux-2.6.10/kernel/posix-timers.c
===================================================================
--- linux-2.6.10.orig/kernel/posix-timers.c 2005-04-26 17:38:25.000000000 +0000
+++ linux-2.6.10/kernel/posix-timers.c 2005-04-26 17:53:54.000000000 +0000
@@ -433,6 +433,14 @@ exit:
int posix_timer_event(struct k_itimer *timr,int si_private)
{
+ /*
+ * If it_process is NULL then this timer is
+ * in the process of being deleted. At this
+ * point we can't do very much. So we
+ * try to return gracefuly.
+ */
+ if (timr->it_process == NULL) return 1;
+
memset(&timr->sigq->info, 0, sizeof(siginfo_t));
timr->sigq->info.si_sys_private = si_private;
/*
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]