Re: [PATCH] Allow users to force a panic on NMI

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi!

> To quote Alan Cox:
> 
> The default Linux behaviour on an NMI of either memory or unknown is to
> continue operation. For many environments such as scientific computing
> it is preferable that the box is taken out and the error dealt with than
> an uncorrected parity/ECC error get propogated.

> This is just a refreshed post of Alan's original patch
> <http://www.ussg.iu.edu/hypermail/linux/kernel/0510.2/1208.html>, with
> hopes this time it sticks. :)
> 
> It applies cleanly on top of my other nmi patches.  

> Index: linux-don/arch/i386/kernel/traps.c
> ===================================================================
> --- linux-don.orig/arch/i386/kernel/traps.c
> +++ linux-don/arch/i386/kernel/traps.c
> @@ -602,6 +602,8 @@ static void mem_parity_error(unsigned ch
>  			"to continue\n");
>  	printk(KERN_EMERG "You probably have a hardware problem with your RAM "
>  			"chips\n");
> +	if (panic_on_unrecovered_nmi)
> +                panic("NMI: Not continuing");
>  
>  	/* Clear and disable the memory parity error line. */
>  	clear_mem_error(reason);
> @@ -637,6 +639,10 @@ static void unknown_nmi_error(unsigned c
>  		reason, smp_processor_id());
>  	printk("Dazed and confused, but trying to continue\n");

'Trying to contninue'

>  	printk("Do you have a strange power saving mode enabled?\n");
> +
> +	if (panic_on_unrecovered_nmi)
> +                panic("NMI: Not continuing");
> +

'not really'. Move printks around so it makes sense...

> Index: linux-don/arch/x86_64/kernel/traps.c
> ===================================================================
> --- linux-don.orig/arch/x86_64/kernel/traps.c
> +++ linux-don/arch/x86_64/kernel/traps.c
> @@ -608,6 +608,8 @@ mem_parity_error(unsigned char reason, s
>  {
>  	printk("Uhhuh. NMI received. Dazed and confused, but trying to continue\n");
>  	printk("You probably have a hardware problem with your RAM chips\n");
> +	if (panic_on_unrecovered_nmi)
> +               panic("NMI: Not continuing");
>  
>  	/* Clear and disable the memory parity error line. */

same here.

> @@ -633,6 +635,10 @@ unknown_nmi_error(unsigned char reason, 
>  {	printk("Uhhuh. NMI received for unknown reason %02x.\n", reason);
>  	printk("Dazed and confused, but trying to continue\n");
>  	printk("Do you have a strange power saving mode enabled?\n");
> +
> +	if (panic_on_unrecovered_nmi)
> +                panic("NMI: Not continuing");
> +
>  }
>  

and here.

-- 
Thanks for all the (sleeping) penguins.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux