Paul E. McKenney wrote:
On Thu, Aug 09, 2007 at 12:36:17PM -0400, Chris Snook wrote:
Paul E. McKenney wrote:
The compiler is within its rights to read a 32-bit quantity 16 bits at
at time, even on a 32-bit machine. I would be glad to help pummel any
compiler writer that pulls such a dirty trick, but the C standard really
does permit this.
Yes, but we don't write code for these compilers. There are countless
pieces of kernel code which would break in this condition, and there
doesn't seem to be any interest in fixing this.
Use of volatile does in fact save you from the compiler pushing stores out
of loops regardless of whether you are also doing reads. The C standard
has the notion of sequence points, which occur at various places including
the ends of statements and the control expressions for "if" and "while"
statements. The compiler is not permitted to move volatile references
across a sequence point. Therefore, the compiler is not allowed to
push a volatile store out of a loop. Now the CPU might well do such a
reordering, but that is a separate issue to be dealt with via memory
barriers. Note that it is the CPU and I/O system, not the compiler,
that is forcing you to use reads to flush writes to MMIO registers.
Sequence points enforce read-after-write ordering, not write-after-write.
We flush writes with reads for MMIO because of this effect as well as the
CPU/bus effects.
Neither volatile reads nor volatile writes may be moved across sequence
points.
By the compiler, or by the CPU? If you're depending on volatile writes being
visible to other CPUs, you're screwed either way, because the CPU can hold that
data in cache as long as it wants before it writes it to memory. When this
finally does happen, it will happen atomically, which is all that atomic_set
guarantees. If you need to guarantee that the value is written to memory at a
particular time in your execution sequence, you either have to read it from
memory to force the compiler to store it first (and a volatile cast in
atomic_read will suffice for this) or you have to use LOCK_PREFIX instructions
which will invalidate remote cache lines containing the same variable. This
patch doesn't change either of these cases.
And you would be amazed at what compiler writers will do in order to
get an additional fraction of a percent out of SpecCPU...
Probably not :)
In short, please retain atomic_set()'s volatility, especially on those
architectures that declared the atomic_t's counter to be volatile.
Like i386 and x86_64? These used to have volatile in the atomic_t
declaration. We removed it, and the sky did not fall.
Interesting. You tested all possible configs on all possible hardware?
No, but I can reason about it and be confident that this won't break anything
that isn't already broken. At worst, this patch will make any existing subtly
incorrect uses of atomic_t much more obvious and easier to track down.
-- Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]