Re: [PATCH] i386/x86_64: Don't IPI to offline cpus on shutdown

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Zwane Mwaikambo <[email protected]> writes:

> On Sun, 27 Nov 2005, Eric W. Biederman wrote:
>
>> Andi Kleen <[email protected]> writes:
>> 
>> > On Sun, Nov 27, 2005 at 02:05:45AM -0800, Zwane Mwaikambo wrote:
>> >> http://bugzilla.kernel.org/show_bug.cgi?id=5203
>> >> 
>> >> There is a small race during SMP shutdown between the processor issuing 
>> >> the shutdown and the other processors clearing themselves off the 
>> >> cpu_online_map as they do this without using the normal cpu offline 
>> >> synchronisation. To avoid this we should wait for all the other processors
>> >> to clear their corresponding bits and then proceed. This way we can safely
>> >> make the cpu_online test in smp_send_reschedule, it's safe during normal 
>> >> runtime as smp_send_reschedule is called with a lock held / preemption 
>> >> disabled.
>> >
>> > Looking at the backtrace in the bug - how can sys_reboot call do_exit???
>> > I would say the problem is in whatever causes that. It shouldn't 
>> > do that. sys_reboot shouldn't schedule, it's that simple.
>> > Your patch is just papering over that real bug.
>> 
>> sys_reboot in the case of halt (after everything else is done)
>> has directly called do_exit for years.
>> 
>> There are some very subtle interactions there.  The one
>> I always remember (having found it the hard way) is that
>> interrupts must be left on so ctrl-alt-del still works.
>> 
>> This do_exit may be something similar, although I can't
>> think of how it could be useful.
>
> I wondered the same thing, perhaps i am papering over the bug, what 
> exactly is do_exit doing there?

So I just looked at a couple of kernels I have lying around.
the do_exit has been in halt since 1.2.?  In 1.0 the halt
case was not even implemented.  

So I think not coping with the do_exit is a problem, as every
other architecture and every other kernel has been able to.

To handle ctrl-alt-del after a halt (or any other action) being able
to schedule sounds important.

As Andi Kleen indicated the problem with do_exit is that it schedules.
I strongly disagree that simply scheduling after halting is a bug.

So why are we calling smp_send_stop from machine_halt?

We don't.

Looking more closely at the bug report the problem here
is that halt -p is called which triggers not a halt but
an attempt to power off.  

machine_power_off calls machine_shutdown which calls smp_send_stop.

If pm_power_off is set we should never make it out machine_power_off
to the call of do_exit.  So pm_power_off must not be set in this case.
When pm_power_off is not set we expect machine_power_off to devolve
into machine_halt.

So how do we fix this?

Playing too much with smp_send_stop is dangerous because it
must also be safe to be called from panic.

It looks like the obviously correct fix is to only call
machine_shutdown when pm_power_off is defined.  Doing
that will make Andi's assumption about not scheduling
true and generally simplify what must be supported.

This turns machine_power_off into a noop like machine_halt
when pm_power_off is not defined.

If the expected behavior is that sys_reboot(LINUX_REBOOT_CMD_POWER_OFF)
becomes sys_reboot(LINUX_REBOOT_CMD_HALT) if pm_power_off is NULL
this is not quite a comprehensive fix as we pass a different parameter
to the reboot notifier and we set system_state to a different value
before calling device_shutdown().

Unfortunately any fix more comprehensive I can think of is not
obviously correct.  The core problem is that there is no architecture
independent way to detect if machine_power will become a noop, without
calling it.

Eric

diff --git a/arch/i386/kernel/reboot.c b/arch/i386/kernel/reboot.c
index 350ea66..2a8cb44 100644
--- a/arch/i386/kernel/reboot.c
+++ b/arch/i386/kernel/reboot.c
@@ -12,6 +12,7 @@
 #include <linux/efi.h>
 #include <linux/dmi.h>
 #include <linux/ctype.h>
+#include <linux/pm.h>
 #include <asm/uaccess.h>
 #include <asm/apic.h>
 #include <asm/desc.h>
@@ -347,10 +348,10 @@ void machine_halt(void)
 
 void machine_power_off(void)
 {
-	machine_shutdown();
-
-	if (pm_power_off)
+	if (pm_power_off) {
+		machine_shutdown();
 		pm_power_off();
+	}
 }
 
 
diff --git a/arch/x86_64/kernel/reboot.c b/arch/x86_64/kernel/reboot.c
index 75235ed..57117b8 100644
--- a/arch/x86_64/kernel/reboot.c
+++ b/arch/x86_64/kernel/reboot.c
@@ -6,6 +6,7 @@
 #include <linux/kernel.h>
 #include <linux/ctype.h>
 #include <linux/string.h>
+#include <linux/pm.h>
 #include <asm/io.h>
 #include <asm/kdebug.h>
 #include <asm/delay.h>
@@ -154,10 +155,11 @@ void machine_halt(void)
 
 void machine_power_off(void)
 {
-	if (!reboot_force) {
-		machine_shutdown();
-	}
-	if (pm_power_off)
+	if (pm_power_off) {
+		if (!reboot_force) {
+			machine_shutdown();
+		}
 		pm_power_off();
+	}
 }
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux