Sonny Rao <[email protected]> wrote:
>
> Hi, I ran into a deadlock on 2.6.16-mm2 when I was running a
> multi-threaded application and was reading /proc/<pid>/numa_maps for
> the app.
>
> I recompiled with DEBUG_SPINLOCK and I can get various error messages
> on kernels ranging from 2.6.17-rc1 to serveral mm kernels including
> 2.6.16-mm[12] and 2.6.16-rc5-mm[23] (mm kernels before this seem to break
> a lot on my box)
>
> My current guess, based on my rudimentary understanding of the code, is
> that we are rescheduling while holding a spinlock in
> check_pte_range() which is called from show_numa_map() in mempolicy.c.
>
> Specifically, the gather_stats() function which is called inside
> check_pte_range() has a cond_resched() at the end. Maybe that line
> should be changed to cond_resched_lock() or should simply be removed.
>
> I'll try removing it and see what happens.
Yes, that's a bug and that cond_resched() needs to go.
We would have found this quite quickly if cond_resched() had a
might_sleep() in it. It really should have such a check, but we cannot do
this because in some configurations, might_sleep() calls cond_resched().
That was rather nasty or us. Ingo, can you think of a fix please?
This bug would also have been exposed as a scheduling-while-atomic warning
on those rare occasions when the cond_resched() actually calls schedule().
But that won't be enabled unless the NUMA guys actually test with all debug
options, as I repeatedly and apparently ineffectively have suggested.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]