Andi Kleen (on Wed, 7 Jun 2006 09:20:23 +0200) wrote:
>On Wednesday 07 June 2006 06:49, Keith Owens wrote:
>> Following a suggestion by Brendan Trotter, I ran some more tests to
>> track down the problem with sending NMI IPI on Dell Xeons.
>>
>> BIOS Logical  OS ACPI   Cpus  Cpus  IPI 2          NMI IPI
>> Processor               BIOS  OS                   (APIC_DM_NMI)
>>
>> Enabled       Enabled   4     4     Not delivered  Delivered as NMI
>> Enabled       Disabled  4     2     Machine reset  Machine reset
>> Disabled      Enabled   2     2     Not delivered  Delivered as NMI
>> Disabled      Disabled  2     2     Not delivered  Delivered as NMI
>>
>> So the killer combination with this motherboard is when the BIOS knows
>> about logical processors but the OS does not. Sending IPI 2 or NMI IPI
>> with that combination kills the machine. Brendan suggested that the
>> BIOS is seeing the broadcast NMI on the logical processors which are
>> not under OS control and that the BIOS cannot cope.
>
>How did you manage that? Normally the OS should use all CPUs
>known to BIOS. Or did you boot with special boot options to limit it?
Two ways:
(1) Boot with a kernel with CONFIG_ACPI=n, so the OS only finds 2 cpus
in the MPT instead of the 4 listed by ACPI.
(2) The kernel has ACPI=y, but is booted with maxcpus=2.
In both cases, send_IPI_allbutself() with IPI 2 or an NMI will result
in a hard reset.
>> Should we change the x86_64 send_IPI_allbutself() so it is only
>> delivered to cpus that the OS knows about, instead of doing a general
>> broadcast?
>
>Hmm, we should be doing that already to avoid races for CPU hotplug. But
>maybe it's not working correctly for KDB.
This problem is not KDB specific, although that is where it was first
noticed. Any code that sends a broadcast IPI 2 or an NMI IPI will
crash these Dell boxes when there is a mismatch between the cpus known
to the BIOS and the cpus known to the OS.
>Does it go away when you
>enable CPU hotplug?
HOTPLUG_CPU was already on in all of my test kernels.
>Anyways, should be a SMOP to force it. I wouldn't
>have a problem to use sequence ipis always and get rid of the broadcasts.
>There were benchmarks at some point and there wasn't a noticeable
>difference.
I will try forcing send_IPI_allbutself() to use the mask version rather
than the broadcast shortcut. Later tonight ...
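
For readers following along, the difference between the broadcast shortcut and the mask version can be sketched in userspace roughly as below. This is only an illustration under stated assumptions, not the kernel's code: cpu_bits_t, deliver_ipi() and the two send_ipi_allbutself_*() helpers are hypothetical names, and a bitmask stands in for the real APIC writes. The "present" set models the cpus the hardware and BIOS know about, the "online" set models the cpus the OS brought up.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 64

/* Hypothetical type: one bit per cpu id. Not the kernel's cpumask_t. */
typedef uint64_t cpu_bits_t;

/* Records which cpus received an IPI; stands in for real APIC writes. */
static cpu_bits_t delivered;

/* Hypothetical stand-in for writing the APIC interrupt command register. */
static void deliver_ipi(int cpu, int vector)
{
    (void)vector;
    delivered |= (cpu_bits_t)1 << cpu;
}

/*
 * Broadcast shortcut (modelled): the hardware delivers to every cpu that
 * is physically present except the sender, including cpus the OS never
 * brought online.
 */
static void send_ipi_allbutself_shortcut(cpu_bits_t present, int self, int vector)
{
    assert(self >= 0 && self < MAX_CPUS);
    for (int cpu = 0; cpu < MAX_CPUS; cpu++) {
        if (cpu != self && (present & ((cpu_bits_t)1 << cpu)))
            deliver_ipi(cpu, vector);
    }
}

/*
 * Mask version (modelled): build the target set from the cpus the OS
 * knows about, drop the sender, and deliver to each remaining cpu.
 */
static void send_ipi_allbutself_mask(cpu_bits_t online, int self, int vector)
{
    assert(self >= 0 && self < MAX_CPUS);
    cpu_bits_t mask = online & ~((cpu_bits_t)1 << self);
    for (int cpu = 0; cpu < MAX_CPUS; cpu++) {
        if (mask & ((cpu_bits_t)1 << cpu))
            deliver_ipi(cpu, vector);
    }
}
```

With 4 cpus present but only 2 online (the failing combination in the table), the shortcut model reaches cpus 2 and 3 as well, while the mask model only reaches cpu 1 when sent from cpu 0.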