Re: [patch] cpufreq: mark cpufreq_tsc() as core_initcall_sync

On Tue, Nov 21, 2006 at 11:04:41PM +0300, Oleg Nesterov wrote:
> On 11/20, Alan Stern wrote:
> >
> > On Mon, 20 Nov 2006, Oleg Nesterov wrote:
> >
> > > So, if we have global A == B == 0,
> > >
> > > 	CPU_0		CPU_1
> > >
> > > 	A = 1;		B = 2;
> > > 	mb();		mb();
> > > 	b = B;		a = A;
> > >
> > > It could happen that a == b == 0, yes?
> >
> > 	Both CPUs execute their "mb" instructions.  The mb forces each
> > 	cache to wait until it receives an Acknowledgement for the
> > 	Invalidate it sent.
> >
> > 	Both caches send an Acknowledgement message to the other.  The
> > 	mb instructions complete.
> >
> > 	"b = B" and "a = A" execute.  The caches return A==0 and B==0
> > 	because they haven't yet invalidated their cache lines.
> >
> > The reason the code failed is because the mb instructions didn't force
> > the caches to wait until the Invalidate messages in their queues had been
> > fully carried out (i.e., the lines had actually been invalidated).
> 
> However, from
> 	http://marc.theaimsgroup.com/?l=linux-kernel&m=113435711112941
> 
> Paul E. McKenney wrote:
> >
> > 2.      rmb() guarantees that any changes seen by the interconnect
> >         preceding the rmb() will be seen by any reads following the
> >         rmb().
> >
> > 3.      mb() combines the guarantees made by rmb() and wmb().
> 
> Confused :(

There are the weasel words "seen by the interconnect".  Alan is
pointing out that the stores to A and B might not have been "seen by the
interconnect" at the time that the pair of mb() instructions execute,
since the other function of the mb() instructions is to ensure that
any stores prior to each mb() are "seen by the interconnect" before any
subsequent stores are "seen by the interconnect".

Why wouldn't the store to A be seen by the interconnect at the time of
CPU 1's mb()?  Because the cacheline containing A is still residing at
CPU 1.  CPU 0's store to A cannot possibly be seen by the interconnect
until after CPU 0 receives the corresponding cacheline.
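
To make the sequence Alan describes above concrete, here is a toy,
single-threaded model of it.  This is only a sketch under stated
assumptions: two CPUs, one cache line per variable, and an Invalidate
that the receiver queues and acknowledges at once without applying it.
All names (struct cpu, store(), mb(), load(), run_sb()) are made up for
illustration; none of this is kernel code or a description of any real
hardware.  The drain_invalidates flag selects between an mb() that only
waits for Acknowledgements and one that also applies the queued
Invalidates.

```c
#include <stdbool.h>

/* Toy model: two CPUs, two variables, one cache line per variable.
 * Every name here is hypothetical and exists only for illustration. */
enum { VAR_A = 0, VAR_B = 1, NVARS = 2, NCPUS = 2 };

struct cpu {
	int  value[NVARS];       /* cached copy of each variable */
	bool valid[NVARS];       /* is the cached copy still usable? */
	bool inv_pending[NVARS]; /* Invalidate queued but not yet applied */
};

static struct cpu cpus[NCPUS];

/* Both caches start out holding A == 0 and B == 0. */
static void reset(void)
{
	for (int c = 0; c < NCPUS; c++)
		for (int v = 0; v < NVARS; v++) {
			cpus[c].value[v] = 0;
			cpus[c].valid[v] = true;
			cpus[c].inv_pending[v] = false;
		}
}

/* Store: update the local line and send an Invalidate to the other CPU.
 * The receiver only queues it; its Acknowledgement is assumed to come
 * back immediately, before the line is actually invalidated. */
static void store(int cpu, int var, int val)
{
	cpus[cpu].value[var] = val;
	cpus[cpu].valid[var] = true;
	cpus[1 - cpu].inv_pending[var] = true;
}

/* mb(): all Acknowledgements have already arrived in this model, so
 * waiting for them is a no-op.  With drain_invalidates set, the barrier
 * also applies every Invalidate sitting in this CPU's queue. */
static void mb(int cpu, bool drain_invalidates)
{
	if (!drain_invalidates)
		return;
	for (int v = 0; v < NVARS; v++)
		if (cpus[cpu].inv_pending[v]) {
			cpus[cpu].valid[v] = false;
			cpus[cpu].inv_pending[v] = false;
		}
}

/* Load: hit in the local cache if the line is valid, otherwise fetch
 * the current value from the other CPU. */
static int load(int cpu, int var)
{
	if (!cpus[cpu].valid[var]) {
		cpus[cpu].value[var] = cpus[1 - cpu].value[var];
		cpus[cpu].valid[var] = true;
	}
	return cpus[cpu].value[var];
}

struct sb_result { int a, b; };

/* Replay the example from the thread in one fixed order. */
static struct sb_result run_sb(bool drain_invalidates)
{
	struct sb_result r;

	reset();
	store(0, VAR_A, 1);            /* CPU_0: A = 1;  */
	store(1, VAR_B, 2);            /* CPU_1: B = 2;  */
	mb(0, drain_invalidates);      /* CPU_0: mb();   */
	mb(1, drain_invalidates);      /* CPU_1: mb();   */
	r.b = load(0, VAR_B);          /* CPU_0: b = B;  */
	r.a = load(1, VAR_A);          /* CPU_1: a = A;  */
	return r;
}
```

In this model run_sb(false) ends with a == 0 and b == 0, because each
load hits a line whose Invalidate is still queued, while run_sb(true)
ends with a == 1 and b == 2.  The model says nothing about what any
real CPU's mb() does; it only replays the reasoning quoted above.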

Yes, it is confusing.  Memory barriers work a bit more straightforwardly
on MMIO accesses, thankfully.  But it would probably be good to strive
for minimal numbers of memory barriers, especially in common code.  :-/

						Thanx, Paul
