David Howells wrote:
Nick Piggin <[email protected]> wrote:
Isn't the Alpha's split caches a counter-example of your model,
because the coherency itself is out of order?
I'd forgotten I need to adjust my documentation to deal with this. It seems
this is the reason for read_barrier_depends(), and that read_barrier_depends()
is also a partial cache sync.
Do you know of any docs on Alpha's split caches? The Alpha Arch Handbook
doesn't say very much about cache operation on the Alpha.
I've looked around for what exactly is meant by "split cache" in conjunction
with Alpha CPUs, and I've found three different opinions of what it means:
(1) Separate Data and Instruction caches.
(2) Serial data caches (CPU -> L1 Cache -> L2 Cache -> L3 Cache -> Memory).
(3) Parallel linked data caches, where a CPU's request can be satisfied by
either data cache, in which whilst one data cache is being interrogated by
the CPU, the other one can use the memory bus (at least, that's what I
understand).
I don't have any docs myself, Paul might be the one to talk to as he's
done the most recent research on this (though some of it directly with
Alpha engineers, if I remember correctly).
IIRC some alpha models have split cache close to what you describe in
#3, however I'm not sure of the fine details (eg. I don't think the
split caches are redundant, but just treated as two entities by the
cache coherence protocol).
Again, I don't have anything definitive that you can put in your docco,
sorry.
Why do you need to include caches and queues in your model? Do
programmers care? Isn't the following sufficient...
I don't think it is sufficient, given the number of times the way the cache
interacts with everything has come up in this discussion.
: | m |
CPU -----> | e |
: | m |
: | o |
CPU -----> | r |
: | y |
... and bugger the implementation details?
Ah, but if the cache is on the CPU side of the dotted line, does that then mean
that a write memory barrier guarantees the CPU's cache to have updated memory?
I don't think it has to[*]. It would guarantee the _order_ in which
"global memory" of this model ie. visibility for other "CPUs" see the
writes, whether that visibility ultimately be implemented by cache
coherency protocol or something else, I don't think matters (for a
discussion of memory ordering). If anything it confused the matter
for the case of Alpha.
All the programmer needs to know is that there is some horizon (memory)
beyond which stores are visible to other CPUs, and stores can travel
there at different speeds so later ones can overtake earlier ones. And
likewise loads can come from memory to the CPU at different speeds too,
so later loads can contain earlier results.
[*] Nor would your model require a smp_wmb() to update CPU caches either,
I think: it wouldn't have to flush the store buffer, just order it.
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]