Re: [2.6.17-rc5-mm2] crash when doing second suspend: BUG in arch/i386/kernel/nmi.c:174

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Makes the start/stop paths of nmi watchdog more robust to handle the
suspend/resume cases more gracefully.

Signed-off-by:  Don Zickus <[email protected]>

---

On Tue, Jun 06, 2006 at 09:23:59AM -0700, Jeremy Fitzhardinge wrote:
> Shaohua Li wrote:
> >Does below patch help? The nmi suspend/resume doesn't look good to me.
> >Only CPU0 uses the suspend/resume code path. Other CPUs run the CPU
> >hotplug code path.
> >  
> Unfortunately this just oopses immediately on the first suspend.  The 
> stack trace is unclear (and I'm just going from memory at the moment), 
> but it looked like it got an invalid op.  I'll try to get a clearer idea 
> of the crash later today.
> 
>    J

Can you apply this patch on top of Shaohua's.  This should fix all your
suspend problems.  

Inside the patch is a little hack to handle the scenario when we come out
of resume we do _not_ want the nmi watchdog enabled (to match the
case entering suspend).  

Compiled but not tested, as I don't have easy access to my test machines
right now.  Mainly posted for Andrew to pick up for rc6-mm1.

Cheers,
Don

Index: linux-don/arch/i386/kernel/nmi.c
===================================================================
--- linux-don.orig/arch/i386/kernel/nmi.c
+++ linux-don/arch/i386/kernel/nmi.c
@@ -745,6 +745,7 @@ static void stop_intel_arch_watchdog(voi
 
 void setup_apic_nmi_watchdog (void *unused)
 {
+	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
 #ifdef CONFIG_LOCKDEP
 	/*
 	 * The NMI watchdog uses spinlocks (notifier chains, etc.),
@@ -761,6 +762,14 @@ void setup_apic_nmi_watchdog (void *unus
 	    (nmi_watchdog != NMI_IO_APIC))
 	    	return;
 
+	if (wd->enabled == 1)
+		return;
+
+	/* cheap hack to support suspend/resume */
+	/* if cpu0 is not active neither should the other cpus */
+	if ((smp_processor_id() != 0) && (atomic_read(&nmi_active) <= 0))
+		return;
+
 	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		switch (boot_cpu_data.x86_vendor) {
 		case X86_VENDOR_AMD:
@@ -798,17 +807,22 @@ void setup_apic_nmi_watchdog (void *unus
 			return;
 		}
 	}
-	__get_cpu_var(nmi_watchdog_ctlblk.enabled) = 1;
+	wd->enabled = 1;
 	atomic_inc(&nmi_active);
 }
 
 void stop_apic_nmi_watchdog(void *unused)
 {
+	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
+
 	/* only support LOCAL and IO APICs for now */
 	if ((nmi_watchdog != NMI_LOCAL_APIC) &&
 	    (nmi_watchdog != NMI_IO_APIC))
 	    	return;
 
+	if (wd->enabled == 0)
+		return;
+
 	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		switch (boot_cpu_data.x86_vendor) {
 		case X86_VENDOR_AMD:
@@ -836,7 +850,7 @@ void stop_apic_nmi_watchdog(void *unused
 			return;
 		}
 	}
-	__get_cpu_var(nmi_watchdog_ctlblk.enabled) = 0;
+	wd->enabled = 0;
 	atomic_dec(&nmi_active);
 }
 
Index: linux-don/arch/x86_64/kernel/nmi.c
===================================================================
--- linux-don.orig/arch/x86_64/kernel/nmi.c
+++ linux-don/arch/x86_64/kernel/nmi.c
@@ -672,6 +672,7 @@ static void stop_intel_arch_watchdog(voi
 
 void setup_apic_nmi_watchdog(void *unused)
 {
+	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
 #ifdef CONFIG_LOCKDEP
 	/*
 	 * The NMI watchdog uses spinlocks (notifier chains, etc.),
@@ -688,6 +689,14 @@ void setup_apic_nmi_watchdog(void *unuse
 	    (nmi_watchdog != NMI_IO_APIC))
 	    	return;
 
+	if (wd->enabled == 1)
+		return;
+
+	/* cheap hack to support suspend/resume */
+	/* if cpu0 is not active neither should the other cpus */
+	if ((smp_processor_id() != 0) && (atomic_read(&nmi_active) <= 0))
+		return;
+
 	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		switch (boot_cpu_data.x86_vendor) {
 		case X86_VENDOR_AMD:
@@ -709,17 +718,22 @@ void setup_apic_nmi_watchdog(void *unuse
 			return;
 		}
 	}
-	__get_cpu_var(nmi_watchdog_ctlblk.enabled) = 1;
+	wd->enabled = 1;
 	atomic_inc(&nmi_active);
 }
 
 void stop_apic_nmi_watchdog(void *unused)
 {
+	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
+
 	/* only support LOCAL and IO APICs for now */
 	if ((nmi_watchdog != NMI_LOCAL_APIC) &&
 	    (nmi_watchdog != NMI_IO_APIC))
 	    	return;
 
+	if (wd->enabled == 0)
+		return;
+
 	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		switch (boot_cpu_data.x86_vendor) {
 		case X86_VENDOR_AMD:
@@ -738,7 +752,7 @@ void stop_apic_nmi_watchdog(void *unused
 			return;
 		}
 	}
-	__get_cpu_var(nmi_watchdog_ctlblk.enabled) = 0;
+	wd->enabled = 0;
 	atomic_dec(&nmi_active);
 }
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux