Re: 2.6.12-rc2-mm3

On Mon, 18 Apr 2005 11:56:15 +0200, Alexander Nyberg wrote:
>> >This patch fixes the NMI checking problems in -mm x64 for me. It 
>> 
>> What problems?
>> 
>
>Sorry, in -mm on x64 check_nmi_watchdog() has started to be run as a
>late_initcall(). Currently it reports the NMIs as stuck on a few systems
>although they are not, both of mine are reported as stuck. This appears
>to be because the current event mask uses don't appear to tick much
>running mdelay() on opteron (in my case).

Please provide a complete dmesg log up to and including the failure
point where the kernel complains about stuck NMIs.

I tried an SMP build of 2.6.12-rc2-mm3 on a UP amd64 machine and
immediately found a bug that triggers bogus stuck-NMI failures. I want
to check whether what you're seeing is caused by the same bug.

> Also in -mm because nmi_hz is
>set to 1 in setup_k7_watchdog() the NMI watchdog checking takes 10
>seconds, a bit much.

Orthogonal issue. Let's ignore this one for now.

>Patch below uses RETIRED_UOPS for a more constant rate of NMI sending, 

This may or may not work as you intend. There is _no_ documented reason
to assume that RETIRED_UOPS would provide a more steady stream of events
than CYCLES_PROCESSOR_IS_RUNNING. Both events are likely to be idle in
HLT states, for instance.

The local APIC + performance counter driven NMI watchdog simply cannot
provide wall-clock-like behaviour. For that you need the I/O-APIC driven
watchdog, or you must prevent the kernel from using HLT when idle.
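
To illustrate the point, here is a deliberately simplified model, not
kernel code. It assumes the counter is set up to overflow nmi_hz times per
second of counted time, and that the chosen event only ticks during some
fraction of wall-clock time (for example, not while halted). The function
name and both parameters are made up for this sketch.

```c
#include <math.h>

/*
 * Simplified model (hypothetical helper, not from the kernel source).
 * Returns the wall-clock seconds between watchdog NMIs when the event
 * only ticks during running_fraction (0..1] of wall-clock time.
 */
static double nmi_wall_interval(double nmi_hz, double running_fraction)
{
	/* In this model the counter never overflows, so no NMI arrives. */
	if (nmi_hz <= 0.0 || running_fraction <= 0.0)
		return HUGE_VAL;

	return 1.0 / (nmi_hz * running_fraction);
}
```

With nmi_hz = 1 and a CPU that is running only a quarter of the time, the
model gives one NMI every 4 seconds instead of every second, whichever
event is counted.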

>@@ -68,7 +69,7 @@
> #define K7_EVNTSEL_INT		(1 << 20)
> #define K7_EVNTSEL_OS		(1 << 17)
> #define K7_EVNTSEL_USR		(1 << 16)
>-#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0x76
>+#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0xC1 /* Retired uops */
> #define K7_NMI_EVENT		K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING

This is as bogus as "#define ONE 2". CYCLES_PROCESSOR_IS_RUNNING
_is_ event 0x76 (AMD renamed it recently, but that's irrelevant).
Using RETIRED_UOPS requires a new define, and a modification to
the K7_NMI_EVENT #define.
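
Roughly, the change would look like the following. This is a sketch only:
K7_EVENT_RETIRED_UOPS is a made-up name, 0xC1 is the value taken from the
patch above, and k7_nmi_evntsel() is a hypothetical helper that merely
shows the bits being combined. It is not the actual setup_k7_watchdog()
code.

```c
#define K7_EVNTSEL_INT		(1 << 20)
#define K7_EVNTSEL_OS		(1 << 17)
#define K7_EVNTSEL_USR		(1 << 16)
/* Existing define keeps its real event number. */
#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0x76
/* New define for the new event (hypothetical name, value from the patch). */
#define K7_EVENT_RETIRED_UOPS	0xC1
/* Only this line selects which event drives the watchdog. */
#define K7_NMI_EVENT		K7_EVENT_RETIRED_UOPS

/* Hypothetical helper: combine the control bits with the selected event. */
static unsigned int k7_nmi_evntsel(void)
{
	return K7_EVNTSEL_INT | K7_EVNTSEL_OS | K7_EVNTSEL_USR | K7_NMI_EVENT;
}
```

Switching back to the old event is then a one-line change to K7_NMI_EVENT,
and no define ends up carrying the wrong event number.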

/Mikael