Re: [BUG] long freezes on thinkpad t60

* Linus Torvalds <[email protected]> wrote:

> With the sequence counters, the situation is more complex:
> 
> 	CPU #0					CPU #1
> 
> 	A (= code before the spinlock)
> 
> 	lock xadd mem	(serializing instruction)
> 
> 	B (= code afte xadd, but not inside lock)
> 
> 						lock release
> 
> 	cmp head, tail
> 
> 	C (= code inside the lock)
> 
> Now, B is basically the empty set, but that's not the issue I worry 
> about. The thing is, I can guarantee by the Intel memory ordering 
> rules that neither B nor C will ever have memops that leak past the 
> "xadd", but I'm not at all as sure that we cannot have memops in C 
> that leak into B!
> 
> And B really isn't protected by the lock - it may run while another 
> CPU still holds the lock, and we know the other CPU released it only 
> as part of the compare. But that compare isn't a serializing 
> instruction!
> 
> IOW, I could imagine a load inside C being speculated, and being moved 
> *ahead* of the load that compares the spinlock head with the tail! 
> IOW, the load that is _inside_ the spinlock has effectively moved to 
> outside the protected region, and the spinlock isn't really a reliable 
> mutual exclusion barrier any more!
> 
> (Yes, there is a data-dependency on the compare, but it is only used 
> for a conditional branch, and conditional branches are control 
> dependencies and can be speculated, so CPU speculation can easily 
> break that apparent dependency chain and do later loads *before* the 
> spinlock load completes!)
> 
> Now, I have good reason to believe that all Intel and AMD CPU's have a 
> stricter-than-documented memory ordering, and that your spinlock may 
> actually work perfectly well. But it still worries me. As far as I can 
> tell, there's a theoretical problem with your spinlock implementation.

hm, i agree with you that this is problematic. Especially on an SMT CPU 
it would be a big architectural restriction if prefetches couldnt cross 
cache misses. (and that's the only way i could see Nick's scheme 
working: MESI coherency coupled with the speculative use of that 
cacheline's value never "surviving" a MESI invalidation of that 
cacheline. That would guarantee that once we have the lock, any 
speculative result is fully coherent and no other CPU has modified it.)

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

References:
- Re: [BUG] long freezes on thinkpad t60
  - From: Ingo Molnar <[email protected]>
- Re: [BUG] long freezes on thinkpad t60
  - From: Chuck Ebbert <[email protected]>
- Re: [BUG] long freezes on thinkpad t60
  - From: Linus Torvalds <[email protected]>
- Re: [BUG] long freezes on thinkpad t60
  - From: Eric Dumazet <[email protected]>
- Re: [BUG] long freezes on thinkpad t60
  - From: Linus Torvalds <[email protected]>
- Re: [BUG] long freezes on thinkpad t60
  - From: Nick Piggin <[email protected]>
- Re: [BUG] long freezes on thinkpad t60
  - From: Linus Torvalds <[email protected]>
- Re: [BUG] long freezes on thinkpad t60
  - From: Nick Piggin <[email protected]>
- Re: [BUG] long freezes on thinkpad t60
  - From: Linus Torvalds <[email protected]>
- Re: [BUG] long freezes on thinkpad t60
  - From: Linus Torvalds <[email protected]>

Prev by Date: Re: How would I do this? (expert tricks) OT
Next by Date: Re: Oops on rmmod asus_acpi
Previous by thread: Re: [BUG] long freezes on thinkpad t60
Next by thread: Re: [BUG] long freezes on thinkpad t60
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]