Re: [patch 04/15] Generic Mutex Subsystem, add-atomic-call-func-x86_64.patch

On Tue, 20 Dec 2005, Russell King wrote:
> 
> Also, Nico has an alternative idea for mutexes which does not
> involve decrementing or incrementing - it's an atomic swap.
> That works out at about the same cycle count on non-Intel ARM
> CPUs as the present semaphore path.  I'm willing to bet that
> it will be faster than the present semaphore path on Intel ARM
> CPUs.

Don't be so sure, especially not in the future.

An atomic "swap" operation is, from a CPU design standpoint, fundamentally 
more expensive than a "load + store".

Now, most ARM architectures don't notice this, because they are all 
in-order, and not SMP-aware anyway. No subtle memory ordering, no nothing. 
That is the only case where "swap" basically becomes a cheap "load+store".

What I'm trying to say is that a plain "load + store" is almost always 
going to be the best option in the long run.
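
For comparison, here is a minimal sketch of what a swap-based lock looks like 
in C. It is not taken from Nico's patch, which is not shown in this thread: 
the type `swap_mutex`, both function names and the 0/1 encoding are 
assumptions made for illustration, and the GCC/Clang `__atomic` builtins 
stand in for whatever swap instruction the CPU provides. The point is only 
that the lock path is a single indivisible read-modify-write, rather than a 
separate load and store.

```c
#include <assert.h>

/* Hypothetical type and names, for illustration only. */
struct swap_mutex {
	int locked;		/* 0 = free, 1 = held */
};

/* Returns 1 if the lock was taken, 0 if it was already held. */
static inline int swap_mutex_trylock(struct swap_mutex *m)
{
	/* One atomic read-modify-write: store 1, get the old value back. */
	int old = __atomic_exchange_n(&m->locked, 1, __ATOMIC_ACQUIRE);

	return old == 0;
}

static inline void swap_mutex_unlock(struct swap_mutex *m)
{
	/* Debug check only: the lock must currently be held. */
	assert(__atomic_load_n(&m->locked, __ATOMIC_RELAXED) == 1);

	/* Release is a plain store with release ordering. */
	__atomic_store_n(&m->locked, 0, __ATOMIC_RELEASE);
}
```

A caller that gets 0 back from `swap_mutex_trylock()` would fall into a 
slow path (not shown) instead of spinning on the swap.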

It's also almost certainly always the best option for UP + non-preempt, 
for both present and future CPUs. The reason is simply that a 
microarchitecture will _always_ be optimized for that case, since it's 
pretty much by definition the common situation.

Is preemption even the common case on ARM? I'd assume not. Why are people 
so interested in the preemption case? IOW, why don't you just do

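	ldr  lr, [%0]		@ load the count
	subs lr, lr, %1		@ subtract, setting the flags
	str  lr, [%0]		@ store the new count back
	blmi failure		@ went negative: take the slow path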

as the _base_ timings, since that should be the common case. That's the 
drop-dead fastest UP case.

There's an additional advantage of the regular load/store case: if some 
CPU has scheduling issues, you can actually write this out as C code (with 
an extra empty ASM to make sure that the compiler doesn't move anything 
out of the critical region). Again, that probably doesn't matter on most 
ARM chips, but in the general case it sure does matter.
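
A minimal sketch of that C version follows, assuming a semaphore-style 
count (1 = unlocked, 0 = locked, negative = contended). Every name in it 
(`up_mutex`, `compiler_barrier()`, the slow-path stubs and their counters) 
is hypothetical and not from the patch series. The empty asm with a 
"memory" clobber is the GCC/Clang idiom for a compiler-only barrier; it 
emits no instructions. The code is only correct on UP without preemption, 
and only if the lock is never taken from interrupt context, because the 
load, subtract and store are not atomic as a group.

```c
#include <assert.h>

/* Empty asm: stops the compiler moving memory accesses across it. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical type: 1 = unlocked, 0 = locked, < 0 = contended. */
struct up_mutex {
	int count;
};

/* Stubs standing in for the real sleep/wake slow paths. */
static int lock_slowpath_calls;
static int unlock_slowpath_calls;

static void lock_slowpath(struct up_mutex *m)
{
	(void)m;
	lock_slowpath_calls++;
}

static void unlock_slowpath(struct up_mutex *m)
{
	(void)m;
	unlock_slowpath_calls++;
}

static inline void up_mutex_lock(struct up_mutex *m)
{
	int c = m->count - 1;	/* ldr + subs */

	m->count = c;		/* str */
	if (c < 0)		/* blmi failure */
		lock_slowpath(m);
	compiler_barrier();	/* critical region starts after this */
}

static inline void up_mutex_unlock(struct up_mutex *m)
{
	int c;

	compiler_barrier();	/* critical region ends before this */
	assert(m->count <= 0);	/* debug check: must be held */
	c = m->count + 1;
	m->count = c;
	if (c <= 0)		/* someone is waiting */
		unlock_slowpath(m);
}
```

Written this way the compiler is free to schedule the load, subtract and 
store around neighbouring code, while the barrier keeps the accesses of the 
critical region from leaking out of it.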

(Btw, inlining _any_ of these, except perhaps the above trivial case, is 
probably wrong. ARM chips don't tend to have caches all that big, I 
bet.)

			Linus