Bad hotplug/scheduler interaction?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I'm concerned that we don't have adequate protection for the scheduler
during cpu hotplug events, but I'm willing to believe I simply don't
understand the mechanism well enough.  We had a crash in (comparatively
ancient) 2.6.16.* but I think the relevant code is basically unchanged
since then.

First we introduced some cpu-intensive workloads.  Then we added two cpus.
System quickly crashed.  The crash was in find_busiest_group(), when
the kernel tried to access "this", which was NULL.  If we don't find a
localgroup, we won't set this, and when we try to calculate *imbalance,
we'll dereference a NULL "this" and crash.

As I looked over the code, though, I couldn't tell if the fault was with
find_busiest_group() for not covering this case, or if the problem was
that the method the hotplug code is using to reconstruct the sched_domains
really doesn't protect find_busiest_group (and find_idlest_group) at all.
Can anybody explain how synchronize_sched() is really syncing?  It looks
like a half-implemented RCU setup.  I fear we really don't have any way
to protect the two functions above from hotplug's desire to twiddle
with the sched_domains.

Do we?

Rick
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux