Russell King wrote:
On Wed, Dec 06, 2006 at 11:16:55AM -0800, Christoph Lameter wrote:
No. If you read what I said, you'll see that you can _cheaply_ use
cmpxchg in a ll/sc based implementation. Take an atomic increment
operation.
do {
old = load_locked(addr);
} while (store_exclusive(old, old + 1, addr);
On a cmpxchg, that "store_exclusive" (loosely) becomes your cmpxchg
instruction, comparing the first arg, and if equal storing the second.
The "load_locked" macro becomes a standard pointer deref. Ergo, x86
becomes:
do {
load value
manipulate it
conditional store
} while not stored
On ll/sc, the load_locked() macro is the load locked instruction. The
store_exclusive() macro is the exclusive store and it doesn't need to
use the first parameter at all. Ergo, ARM becomes:
do {
ldrex r1, [r2]
manipulate r1
strex r0, r1, [r2]
} while failed
Notice that both are optimal.
Now let's consider the cmpxchg case.
do {
val = *addr;
} while (cmpxchg(val, val + 1, addr);
The x86 case is _identical_ to the ll/sc based implementation. Absolutely
entirely. No impact what so ever.
Let's look at the ll/sc case. The cmpxchg code implemented on this has
to reload the original value, compare it, if equal store the new value.
So:
do {
val = *addr;
(r2 = addr,
ldrex r1, [r2]
compare r1, r0
strexeq r4, r3, [r2] (store exclusive if equal)
} while store failed or comparecondition failed
Note how the cmpxchg has _forced_ the ll/sc implementation to become
more complex.
So, let's recap.
Implementing ll/sc based accessor macros allows both ll/sc _and_ cmpxchg
architectures to produce optimal code.
Implementing an cmpxchg based accessor macro allows cmpxchg architectures
to produce optimal code and ll/sc non-optimal code.
See my point?
Wrong. Your ll/sc implementation with cmpxchg is buggy. The cmpxchg
load_locked is not locked at all, and there can be interleaving writes
between the load and cmpxchg which do not cause the store_conditional
to fail.
It might be reasonable to implement this watered down version, but:
don't some architectures have restrictions on what instructions can
be issued between the ll and the sc?
But in general I agree with you, in that a higher level primitive is
preferable (eg. atomic_add_unless).
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]