Re: Status on CPU hotplug issues

On Fri, Oct 06, 2006 at 04:29:24PM -0700, Andrew Morton wrote:
> Can you describe the nature of the cpu-hotplug tests you're running?  I'd
> be fairly staggered if the kernel was able to survive a full-on cpu-hotplug
> stress test for more than one second, frankly.  There's a lot of code in
> there which is non-hotplug-aware.  Running a non-preemptible kernel would
> make things appear more stable, perhaps.

Certainly.  The test suite is one the OSDL Hotplug SIG put together last
summer; it consists of several test cases:

   http://developer.osdl.org/dev/HOTPLUG/planning/hotplug_cpu_test_plan_status.html

   hotplug01:  Check IRQ behavior during CPU hotplug events
   hotplug02:  Check process migration during CPU hotplug events
   hotplug03:  Verify tasks get scheduled on newly onlined CPUs
   hotplug04:  Verify that offlining all CPUs is disallowed
   hotplug05:  (Unimplemented)
   hotplug06:  Check userspace tools (sar, top) during CPU hotplug events
   hotplug07:  Stress case doing a kernel compile while CPUs are
               hotplugged on and off repeatedly

It can be downloaded here:
   http://developer.osdl.org/dev/hotplug/tests/

Results are posted here:
   http://crucible.osdl.org/runs/hotplug_report.html
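
To give a rough idea of what the on/off cycling in a case like hotplug07
involves, here is a minimal sketch in Python.  It is not taken from the
OSDL test suite; the function names are made up for illustration.  The
only thing it relies on is the standard sysfs interface, where writing
"0" or "1" to /sys/devices/system/cpu/cpuN/online offlines or onlines a
CPU (root is required, and cpu0 often cannot be offlined).

```python
import os

SYSFS_CPU_ROOT = "/sys/devices/system/cpu"  # standard Linux sysfs location


def cpu_online_path(cpu, sysfs_root=SYSFS_CPU_ROOT):
    """Return the path of the 'online' control file for one CPU."""
    return os.path.join(sysfs_root, f"cpu{cpu}", "online")


def set_cpu_online(cpu, online, sysfs_root=SYSFS_CPU_ROOT):
    """Write '1' (online) or '0' (offline) to the CPU's control file."""
    with open(cpu_online_path(cpu, sysfs_root), "w") as f:
        f.write("1" if online else "0")


def read_cpu_online(cpu, sysfs_root=SYSFS_CPU_ROOT):
    """Return True if the control file currently reads '1'."""
    with open(cpu_online_path(cpu, sysfs_root)) as f:
        return f.read().strip() == "1"


def toggle_cycle(cpus, iterations, sysfs_root=SYSFS_CPU_ROOT):
    """Offline and then re-online each listed CPU, 'iterations' times.

    Hypothetical helper: a real stress test would run a workload
    (for example a kernel compile) concurrently with this loop.
    """
    for _ in range(iterations):
        for cpu in cpus:
            set_cpu_online(cpu, False, sysfs_root)
            set_cpu_online(cpu, True, sysfs_root)
```

Run as root, something like toggle_cycle([1, 2, 3], 100) would cycle
CPUs 1 through 3 a hundred times.  The sysfs_root parameter exists only
so the helpers can be pointed at a scratch directory for a dry run.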

We've been running this test suite fairly continuously for several
months, and irregularly for about a year before that.  We find that on
some platforms like PPC64 it's quite robust, and on others there are
issues, but the developers tend to be quick to provide fixes as the
issues are found.  I'm glad to see that the results are finally showing
green for ia64.

> > Issues were found in four areas: General kernel, cpu hotplug, sysstat,
> > and the test harness itself.
> 
> It's surprising that AMD and Intel CPUs behave differently.  Also a good
> start on diagnosing things.

I was surprised to see this too.  Note that the .config files for the
AMD and EM64T machines are considerably different (different drivers,
etc.), but the CPU config should be relatively comparable.  Still, I
wouldn't rule out the different behaviors being due to configuration
differences.  We could work on homogenizing the configs of the two
systems, if that would help in troubleshooting.

Thanks,
Bryce
