Satyam Sharma wrote:
On Tue, 24 Jul 2007, Nick Piggin wrote:
Satyam Sharma wrote:
From: Satyam Sharma <[email protected]>
[8/8] i386: bitops: smp_mb__{before, after}_clear_bit() definitions
From Documentation/atomic_ops.txt, those archs that require explicit
memory barriers around clear_bit() must also implement these two interfaces.
However, for i386, clear_bit() is a strict, locked, atomic and
un-reorderable operation and includes an implicit memory barrier already.
But these two functions have been wrongly defined as "barrier()" which is
a pointless _compiler optimization_ barrier, and only serves to make gcc
not do legitimate optimizations that it could have otherwise done.
So let's make these proper no-ops, because that's exactly what we require
these to be on the i386 platform.
No. clear_bit is not a compiler barrier on i386,
Obvious.
thus smp_mb__before/after
must be.
Not so obvious. Why do we require these to be a full compiler barrier
is precisely the question I raised here.
Consider this (the above two functions exist only for clear_bit(),
the atomic variant, as you already know), the _only_ memory reference
we care about is that of the address of the passed bit-string:
No. Memory barriers explicitly extend to all memory references.
(1) The compiler must not optimize / elid it (i.e. we need to disallow
compiler optimization for that reference) -- but we've already taken
care of that with the __asm__ __volatile__ and the constraints on
the memory "addr" operand there, and,
(2) For the i386, it also includes an implicit memory (CPU) barrier
already.
Repeating what has been said before: A CPU memory barrier is not a
compiler barrier or vice versa. Seeing as we are talking about
the compiler barrier, it is irrelevant as to whether or not the
assembly includes a CPU barrier.
So I /think/ it makes sense to let the compiler optimize _other_ memory
references across the call to clear_bit(). There's a difference. I think
we'd be safe even if we do this, because the synchronization in callers
must be based upon the _passed bit-string_, otherwise _they_ are the
ones who're buggy.
Yes it makes sense to let the compiler move memory operations over
clear_bit(), because we have defined the interface to be nice and
relaxed. And this is exactly why we do need to have an additional
barrier there in smp_mb__*_clear_bit().
[ For those interested, I've been looking at the code generated
for the test kernel I built with these patches, and I don't
really see anything gcc did that it shouldn't have -- but ok,
that doesn't mean other versions/toolchains for other setups
won't. Also, the test box has been up all night, but I'm only
running Firefox on it anyway, and don't really know how to
verify if I've introduced any correctness issues / bugs. ]
correct output != correct input.
Without a barrier there, we _allow_ the compiler to reorder. If it
does not reorder, the missing barrier is still a bug :)
--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]