* Jarek Poplawski <[email protected]> wrote:
> BTW, I've looked a bit at these NMI watchdog traces, and now I'm not
> even sure it's necessarily the spinlock's problem (but I don't exclude
> this possibility yet). It seems both processors use task_rq_lock(), so
> there could also be a problem with that loop. The way the loop verifies
> that it took the right lock looks racy: there is a small probability
> that, if we have taken the wrong lock, the check inside the loop runs
> just before the value is changed elsewhere under the right lock.
> Another possible problem could be the result of a wrong optimization,
> or of a wrong propagation of a change to this task_rq(p) value.
ok, could you elaborate on this in a bit more detail? You say it's racy -
any correctness bug in task_rq_lock() will cause the kernel to blow up
in spectacular ways. It's a fairly straightforward loop:
static inline struct rq *__task_rq_lock(struct task_struct *p)
	__acquires(rq->lock)
{
	struct rq *rq;

repeat_lock_task:
	rq = task_rq(p);
	spin_lock(&rq->lock);
	if (unlikely(rq != task_rq(p))) {
		spin_unlock(&rq->lock);
		goto repeat_lock_task;
	}
	return rq;
}
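(For reference, the task_rq_lock() variant that shows up in the traces is
the same retry loop with interrupts disabled around the lookup - sketched
below from the same era's kernel/sched.c, so exact details may differ a
bit between kernel versions:)

static struct rq *task_rq_lock(struct task_struct *p, unsigned long *flags)
	__acquires(rq->lock)
{
	struct rq *rq;

repeat_lock_task:
	/*
	 * rq->lock is also taken from irq context (e.g. the timer tick),
	 * so interrupts must be disabled before acquiring it:
	 */
	local_irq_save(*flags);
	rq = task_rq(p);
	spin_lock(&rq->lock);
	/* if the task migrated between the lookup and the lock, retry: */
	if (unlikely(rq != task_rq(p))) {
		spin_unlock_irqrestore(&rq->lock, *flags);
		goto repeat_lock_task;
	}
	return rq;
}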
the result of task_rq() depends on p->thread_info->cpu, which will only
change if a task has migrated over to another CPU. That is a
fundamentally 'slow' operation, but even if a task triggers it
intentionally at a high frequency (for example via repeated calls to
sched_setaffinity) there's no way it could be faster than the spinlock
code here. So ... what problems can you see with it?
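(For reference, task_rq() itself is just a per-CPU runqueue lookup keyed
off that field - a rough sketch of the helpers involved, approximate for
this kernel version:)

#define cpu_rq(cpu)	(&per_cpu(runqueues, (cpu)))	/* per-CPU runqueue */
#define task_rq(p)	cpu_rq(task_cpu(p))

static inline unsigned int task_cpu(const struct task_struct *p)
{
	/* updated via set_task_cpu() when the task migrates */
	return p->thread_info->cpu;
}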
Ingo