Re: [PATCH] Add mem_nmi_panic enable system to panic on hard error

On Tue, 2005-12-13 at 01:48 -0500, Dave Jones wrote:
> On Tue, Dec 13, 2005 at 03:38:16PM +0900, Hidetoshi Seto wrote:
>  > Some x86 server fires NMI with reason 0x80 up if a parity error
>  > occurs on its PCI-bus system or DIMM module.
> 
> Hmm, are you sure this isn't a bios error misconfiguring
> some northbridge register perhaps ?  Some chipsets offer
> such reporting as a feature. Could be your server has this
> on by default.

This is done deliberately on some systems, which is why the patches I
submitted to Andrew Morton, now in his tree, let you set
"panic_on_unrecovered_nmi" to halt on a memory error.

See the -mm tree. The functionality is already there. Also see the various
x86-64 logging tools, which can parse MCE-based reports.
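
As a rough illustration of the policy (a standalone sketch, not the code in
the -mm tree): the NMI reason byte is read from system control port 0x61,
where bit 7 (0x80) reports a memory parity error or SERR and bit 6 (0x40)
an I/O check. The helper names below are hypothetical, and the flag is
passed in as a plain int rather than read from the real sysctl.

```c
/* Bits of the NMI reason byte read from x86 system control port 0x61. */
#define NMI_REASON_SERR  0x80  /* memory parity error or PCI SERR */
#define NMI_REASON_IOCHK 0x40  /* I/O channel check */

/* Hypothetical helper: describe the highest-priority reason bit that is set. */
static const char *nmi_reason_name(unsigned char reason)
{
    if (reason & NMI_REASON_SERR)
        return "memory parity / SERR";
    if (reason & NMI_REASON_IOCHK)
        return "I/O check";
    return "unknown";
}

/*
 * Hypothetical policy check: return 1 when a parity/SERR NMI should halt
 * the machine, i.e. the 0x80 bit is set and the administrator has turned
 * the panic flag on. Assumption: the real handler does more than this.
 */
static int nmi_should_panic(unsigned char reason, int panic_on_unrecovered_nmi)
{
    return (reason & NMI_REASON_SERR) && panic_on_unrecovered_nmi;
}
```

With the flag at 0 a reason of 0x80 is only reported; with the flag at 1 the
same reason would stop the box.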

> (I believe the EDAC code has also triggered similar cases
>  on certain cards which is why it too disables this checking
>  by default).

EDAC has it enabled by default at the moment, although it first needs to
clear leftover flags and possibly gain a small blacklist.

> The sysctl seems pointless too. If this is needed at all,
> why would you ever want to turn it off ?

Debugging and stress-testing systems, for one.
