I'm concerned that we don't have adequate protection for the scheduler
during cpu hotplug events, but I'm willing to believe I simply don't
understand the mechanism well enough. We had a crash in (comparatively
ancient) 2.6.16.* but I think the relevant code is basically unchanged
since then.

First we started some CPU-intensive workloads, then we added two CPUs.
The system crashed almost immediately. The crash was in
find_busiest_group(), when the kernel dereferenced "this", which was NULL.
"this" is only set when the walk over the groups finds the local group.
If we never find a local group, "this" stays NULL, and when we go on to
calculate *imbalance we dereference it and crash.
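
To make the failure mode concrete, here is a minimal user-space sketch of
the pattern. It is not the kernel code: struct group, find_imbalance() and
the halved-difference formula are simplified, hypothetical stand-ins that
only keep the shape of the loop and the place where "this" gets used.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a sched_group. */
struct group {
    unsigned long cpumask;   /* bit n set => CPU n belongs to this group */
    unsigned long load;      /* aggregate load of the group */
    struct group *next;      /* groups form a circular list */
};

/*
 * Returns 0 and stores a load difference in *imbalance when both a local
 * group and a busier remote group are found; returns -1 otherwise.
 */
int find_imbalance(struct group *first, int this_cpu, unsigned long *imbalance)
{
    struct group *this = NULL, *busiest = NULL, *g = first;

    do {
        if (g->cpumask & (1UL << this_cpu))
            this = g;                        /* the local group */
        else if (!busiest || g->load > busiest->load)
            busiest = g;                     /* busiest remote group so far */
        g = g->next;
    } while (g != first);

    /*
     * Without the !this check, this->load below would fault whenever
     * no group in the list contains this_cpu.
     */
    if (!this || !busiest || busiest->load <= this->load)
        return -1;

    *imbalance = (busiest->load - this->load) / 2;
    return 0;
}
```

A check like this only hides the symptom. It says nothing about why the
group list did not contain the current CPU in the first place.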

As I looked over the code, though, I couldn't tell whether the fault lies
with find_busiest_group() for not covering this case, or whether the method
the hotplug code uses to rebuild the sched_domains simply doesn't protect
find_busiest_group() (and find_idlest_group()) at all.

Can anybody explain how synchronize_sched() is really syncing? It looks
like a half-implemented RCU setup. I fear we really don't have any way
to protect the two functions above from hotplug's desire to twiddle
with the sched_domains.
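
For reference, this is the general shape I would expect such a scheme to
have, written as a single-threaded user-space model. Everything here is an
assumption for illustration: the names are hypothetical, reader_enter() and
reader_exit() stand in for a preempt-disabled region around the domain
walk, and grace_period_done() models the condition an updater would wait
on. It is more conservative than a real grace period, which only needs the
readers that started before the update to finish.

```c
#include <stddef.h>

#define NCPUS 4

/* Hypothetical stand-in for a sched_domain. */
struct domain {
    int span;
};

static struct domain *cpu_domain[NCPUS];  /* pointer each CPU's balancer reads */
static int in_reader[NCPUS];              /* 1 while that CPU is in a read section */

/* Reader side: mark the section, then fetch the current domain pointer. */
struct domain *reader_enter(int cpu)
{
    in_reader[cpu] = 1;
    return cpu_domain[cpu];
}

/* Reader side: the pointer returned by reader_enter() must not be used after this. */
void reader_exit(int cpu)
{
    in_reader[cpu] = 0;
}

/* Updater step 1: install the new domain and hand back the old one. */
struct domain *publish(int cpu, struct domain *new_sd)
{
    struct domain *old = cpu_domain[cpu];

    cpu_domain[cpu] = new_sd;
    return old;
}

/* Updater step 2: the old domain may be torn down only once this returns 1. */
int grace_period_done(void)
{
    for (int cpu = 0; cpu < NCPUS; cpu++)
        if (in_reader[cpu])
            return 0;
    return 1;
}
```

In this model a reader is only safe if its whole walk sits between
reader_enter() and reader_exit(), and if the updater really waits before
touching the old structure. Whether both halves hold for the two functions
above is exactly what I can't tell.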

Do we?

Rick