Re: [PATCH -mm][resend] Disable CPU hotplug during suspend

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Saturday 29 July 2006 00:40, Nathan Lynch wrote:
> Rafael J. Wysocki wrote:
> > On Friday 28 July 2006 20:20, Nathan Lynch wrote:
> > > > +#ifdef CONFIG_SUSPEND_SMP
> > > > +static cpumask_t frozen_cpus;
> > > > +
> > > > +int disable_nonboot_cpus(void)
> > > > +{
> > > > +	int cpu, error = 0;
> > > > +
> > > > +	/* We take all of the non-boot CPUs down in one shot to avoid races
> > > > +	 * with the userspace trying to use the CPU hotplug at the same time
> > > > +	 */
> > > > +	mutex_lock(&cpu_add_remove_lock);
> > > > +	cpus_clear(frozen_cpus);
> > > > +	printk("Disabling non-boot CPUs ...\n");
> > > > +	for_each_online_cpu(cpu) {
> > > > +		if (cpu == 0)
> > > > +			continue;
> > > 
> > > Assuming cpu 0 is online is not okay in generic code.
> > 
> > Absolutely.  Thanks for pointing this out.
> > 
> > > This should be something like:
> > > 
> > > 	int cpu, first_cpu, error = 0;
> > > 
> > > 	/* We take all of the non-boot CPUs down in one shot to avoid races
> > > 	 * with the userspace trying to use the CPU hotplug at the same time
> > > 	 */
> > > 	mutex_lock(&cpu_add_remove_lock);
> > > 	cpus_clear(frozen_cpus);
> > > 	first_cpu = first_cpu(cpu_online_mask);
> > > 	printk("Disabling non-boot CPUs ...\n");
> > > 	for_each_online_cpu(cpu) {
> > > 		if (cpu == first_cpu)
> > > 			continue;
> >
> > 
> > I'm not quite sure if we can finish with CPU0 offline.  Perhaps it's
> > better to check if CPU0 is online and bring it up if not and then
> > continue or return an error if that fails?
> 
> You can't assume that cpu 0 is even present in generic code. :-)

Yes.  I should have said "the first present CPU" instead of "CPU0".

> But maybe I'm misunderstanding the motivation for using cpu 0 here.  I
> had assumed it was because on i386 (and others?) the BSP can't be
> offlined.  Is there some other reason?

Yes.

First, the arch-dependent suspend code assumes implicitly that it will be
running on the BSP, so some strange things may happen if it doesn't.

Second, we have to make sure that this function will always leaves the
same CPU online.  It's a bit difficult to explain, but I'll do my best.
Suppose that disable_nonboot_cpus() exits running on CPU1, assuming it's
possible.  Then the system memory state saved in the suspend image will
reflect this situation.  Now the resume code will almost certainly run on the
BSP (say it's CPU0), but when the system memory is restored from the suspend
image the kernel will think it's running on CPU1.

In the last patch I send yesterday I made disable_nonboot_cpus() check if the
first present CPU, first_cpu(cpu_present_map), is online, try to bring it up
if not and migrate itself to it before the loop over all online CPUs is run.

I think that's general enough.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux