Hello,
> We have about 100 servers based on Intel S5000PSL-SATA motherboards.
> They have been running for anywhere between 1 and 10 months. For the
> past few months, after updating them all to the 2.6.20.15 kernel
> (because of a bug in the 2.6.18 kernel), we are seeing some strange NMI
> errors. For example:
>
> Aug 29 09:02:10 master kernel: Uhhuh. NMI received for unknown reason 30.
> Aug 29 09:02:10 master kernel: Do you have a strange power saving mode enabled?
> Aug 29 09:02:10 master kernel: Dazed and confused, but trying to continue
I'm also working with Andrew and Samson. It seems that the cause of
the problem is CONFIG_PCIEAER, which was introduced after 2.6.18 and
defaults to y.
With CONFIG_PCIEAER=n, scanpci works fine with no errors. This is the
workaround that they'll likely use for now.
With CONFIG_PCIEAER=y, scanpci always triggers the NMI error. The
option aerdriver.forceload=1 has no effect.
The related dmesg output at boot is:
Evaluate _OSC Set fails. Status = 0x0005
Evaluate _OSC Set fails. Status = 0x0005
aer_init: AER service init fails - Run ACPI _OSC fails
aer: probe of 0000:00:02.0:pcie01 failed with error 2
aer_init: AER service init fails - No ACPI _OSC support
aer: probe of 0000:00:03.0:pcie01 failed with error 1
Evaluate _OSC Set fails. Status = 0x0005
Evaluate _OSC Set fails. Status = 0x0005
aer_init: AER service init fails - Run ACPI _OSC fails
aer: probe of 0000:00:04.0:pcie01 failed with error 2
Evaluate _OSC Set fails. Status = 0x0005
Evaluate _OSC Set fails. Status = 0x0005
aer_init: AER service init fails - Run ACPI _OSC fails
aer: probe of 0000:00:05.0:pcie01 failed with error 2
Evaluate _OSC Set fails. Status = 0x0005
Evaluate _OSC Set fails. Status = 0x0005
aer_init: AER service init fails - Run ACPI _OSC fails
aer: probe of 0000:00:06.0:pcie01 failed with error 2
aer_init: AER service init fails - No ACPI _OSC support
aer: probe of 0000:00:07.0:pcie01 failed with error 1
Full dmesg, lspci, and ACPI DSDT are available here:
http://jim.sh/~jim/tmp/nmi/
-jim
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]