Re: i386 nmi_watchdog: Merge check_nmi_watchdog fixes from x86_64

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Andy Isaacson <[email protected]> writes:

> On Sat, Oct 15, 2005 at 06:49:56AM -0600, Eric W. Biederman wrote:
>> Andy Isaacson <[email protected]> writes:
>> > I believe (but have not verified) that GCC simply inhibits
>> > dead-store-elimination when the address of the variable has been taken,
>> > so this theoretical possibility is not a real danger under gcc.  And in
>> > any case, it doesn't apply to your check_nmi_watchdog, because you've
>> > got function calls after the assignment.
>> 
>> It comes very close to applying to check_nmi_watchdog as both
> [snip]
>
> You prodded me to looking in a bit more depth at the code around this,
> and I'm actually a bit concerned that volatile may not be enough of a
> guarantee that other CPUs will see the correct value.  I grant that this
> is a mostly theoretical concern, but let's take a look at the code:
>
> +static __init void nmi_cpu_busy(void *data)
> +{
> +	volatile int *endflag = data;
> +	local_irq_enable();
> +	while (*endflag == 0)
> +		barrier();
> +}
>  static int __init check_nmi_watchdog(void)
>  {
> +	volatile int endflag = 0;
> ...
> +	if (nmi_watchdog == NMI_LOCAL_APIC)
> +		smp_call_function(nmi_cpu_busy, (void *)&endflag, 0, 0);
> ...
> +	endflag = 1;
> 	printk("OK.\n");
>         if (nmi_watchdog == NMI_LOCAL_APIC)
> 		nmi_hz = 1;
> +       kfree(prev_nmi_count);
> 	return 0;
> }
>
> So CPU#0 does a smp_call_function, does some work, then sets endflag,
> does a printk, sets a static variable, calls kfree, then leaves the
> stack frame.
>
> Meanwhile, CPU#1 - CPU#N are polling waiting for endflag to make a
> transition from 0->1.
>
> Nothing that CPU#0 does after setting endflag is a guarantee that its
> store to endflag will be seen by other CPUs.  In particular, if the
> caller immediately zeros that stack location (not an unlikely
> happenstance), then it's possible that the two stores to endflag might
> be coalesced by a write buffer in CPU#0's bus interface.

Yes but if they are at all separated in time things are better. And
the constant reads of endflag are going to ping-pong the cache
line between all of the cpus which will tend to get the store pushed
out as soon as possible.  No other magic is necessary on x86.

> In particular, a NUMA x86 system is rather likely to be unfair to remote
> nodes in this case (where the other local CPUs are polling too).

In the NUMA systems I am familiar with the system will slow to a crawl
instead of performing and being unfair.

>> If there is a better idiom to synchronize the cpus which does
>> not mark the variable volatile I could see switching to that. 
>> 
>> There is a theoretical race there as some cpu might not see endflag
>> get set and the next function on the stack could set it that same
>> stack slot to 0.  I can see value in fixing that, if there is a simple
>> and clear solution.  Currently I am having a failure of imagination.
>
> Exactly (he says; after writing the same thing above I re-read the rest
> of your post).
>
> I would imagine that some kind of write memory barrier is appropriate,
> but I'm not up to date on what's in fashion in the kernel code these
> days.

Probably a counter, to ensure the code exits.  The code prints
a message if it fails, and possibly the boot breaks.  So I don't expect
bugs in this code to persist for a long time without someone screaming.

Andrew has forwarded me the first bug report already.  
http://bugzilla.kernel.org/show_bug.cgi?id=5462

It looks like we have a bug of the watchdog timers instead of one of
the expected failure modes.  After I get some more information I will
be able to see what the practical consequences are.

Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux