Re: Problems with EDAC coexisting with BIOS

Alan Cox wrote:

On Mer, 2006-05-03 at 21:25 +0100, Tim Small wrote:

something with NMI-signalled errors, I was wondering what the problems with using NMI-signalled ECC errors were?


The big problem with NMI is that it can occur *during* a PCI
configuration sequence (ie during pci_config_* functions). That means we
can't safely do some I/O, especially configuration space I/O in an NMI
handler. At best we could set a flag and catch it afterwards.

I was assuming this was the case - but I don't think that deferring the work until after the NMI handler has returned is necessarily a big disadvantage, at least as far as ECC register-status checking is concerned. None of the hardware that I've looked at makes any sort of guarantee about the timeliness of ECC-error-triggered NMI delivery anyway, so any of the really smart (and urgent) stuff that you could potentially do as part of the ECC error handling (e.g. terminating a process if one of its physical pages was mangled) is not possible to do in a reliable manner anyway.
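To make the "set a flag and catch it afterwards" idea concrete, here is a minimal user-space sketch in C11. It is not kernel code and not taken from EDAC: the names ecc_nmi_note_error, ecc_deferred_check and read_ecc_status_stub are hypothetical, and the stub merely stands in for whatever chipset status read would really be done once configuration-space access is safe again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Set from the (simulated) NMI path, consumed later from a safe context. */
static atomic_bool ecc_nmi_pending;

/* Hypothetical NMI-side hook: a single atomic store, no I/O of any kind. */
static void ecc_nmi_note_error(void)
{
    atomic_store_explicit(&ecc_nmi_pending, true, memory_order_release);
}

/* Stand-in for the real ECC status register read (assumption, not a real API). */
static uint32_t read_ecc_status_stub(void)
{
    return 0x1u;
}

/*
 * Called later, outside NMI context. Atomically tests and clears the flag;
 * only if an NMI was noted does it go on to read the status register.
 * Several NMIs arriving before this runs collapse into one deferred check.
 */
static bool ecc_deferred_check(uint32_t *status_out)
{
    assert(status_out != NULL);
    if (!atomic_exchange_explicit(&ecc_nmi_pending, false,
                                  memory_order_acq_rel))
        return false;
    *status_out = read_ecc_status_stub();
    return true;
}
```

Note that the flag only records that something happened, which fits the point above: the register contents are inspected whenever the deferred check gets to run, with no timeliness guarantee.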

About the best thing it is possible to do is to try and arrange to take the page(s) in which an uncorrectable error occurred out of further use (maybe do the same for correctable errors, if the same physical page sees repeated correctable errors), plus maybe give the option of panicking if an uncorrectable page was in use by the kernel?
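As a rough illustration of that policy, the decision could look something like the following user-space C sketch. Everything here is an assumption made for the example: the threshold of 3 correctable errors, the fixed-size tracking table, and the names ecc_decide and bump_ce_count are all hypothetical and do not correspond to existing kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum ecc_kind   { ECC_CORRECTABLE, ECC_UNCORRECTABLE };
enum ecc_action { ECC_IGNORE, ECC_RETIRE_PAGE, ECC_PANIC };

#define CE_RETIRE_THRESHOLD 3   /* assumed: retire after 3 correctable errors */
#define TRACKED_PAGES       64  /* assumed: small fixed-size tracking table */

struct page_ce_count {
    uint64_t pfn;
    unsigned count;
    bool     used;
};

static struct page_ce_count ce_table[TRACKED_PAGES];

/* Record one correctable error for pfn and return its running count. */
static unsigned bump_ce_count(uint64_t pfn)
{
    struct page_ce_count *free_slot = NULL;

    for (size_t i = 0; i < TRACKED_PAGES; i++) {
        if (ce_table[i].used && ce_table[i].pfn == pfn)
            return ++ce_table[i].count;
        if (!ce_table[i].used && free_slot == NULL)
            free_slot = &ce_table[i];
    }
    if (free_slot == NULL)
        return 1;               /* table full: treat as a first sighting */
    free_slot->used = true;
    free_slot->pfn = pfn;
    free_slot->count = 1;
    return 1;
}

/* Decide what to do about an ECC error reported against physical page pfn. */
static enum ecc_action ecc_decide(enum ecc_kind kind, uint64_t pfn,
                                  bool kernel_page, bool panic_on_kernel_ue)
{
    assert(kind == ECC_CORRECTABLE || kind == ECC_UNCORRECTABLE);

    if (kind == ECC_UNCORRECTABLE)
        return (kernel_page && panic_on_kernel_ue) ? ECC_PANIC
                                                   : ECC_RETIRE_PAGE;

    return bump_ce_count(pfn) >= CE_RETIRE_THRESHOLD ? ECC_RETIRE_PAGE
                                                     : ECC_IGNORE;
}
```

The panic is kept behind an explicit option, and a page is only retired for correctable errors once the same page has been seen repeatedly.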

My first thought was to schedule a tasklet as part of the ECC-specific NMI handling, or are there any gotchas with doing this from within an NMI handler?


Cheers,

Tim.
