On Thu, Feb 23, 2006 at 08:48:43AM -0800, Linus Torvalds wrote:
> For example, on x86(-64), memcpy() is mostly inlined for the interesting
> cases. That's not always so. Other architectures will have things like the
> page copying and clearing as _the_ hottest functions. Same goes for
> architecture-specific things like context switching etc, that have
> different names on different architectures.
On x86-64 the way gcc inlines memcpy() is rather broken, too. If the
kernel is compiled with -Os instead of -O2, gcc seems to always use
rep ; movs or rep ; stos, which is substantially slower for small
data structures: a factor of 10 or more for some sizes. The difference
is a few pipelined instructions issued in parallel versus tying up the
entire pipeline until the size of the move is known and dispatched.
Each instance fixed was worth a few percent on the P4 when looking at
lmbench's af_unix component.
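
As a rough way to check this on a given compiler, something like the sketch below can be built at both optimization levels and the generated assembly compared. The struct and function names (small_msg, copy_memcpy, copy_fields) are made up for illustration and are not taken from the kernel; whether a particular gcc version actually emits rep ; movs for copy_memcpy() at -Os depends on the version and tuning flags, so treat the output as something to inspect rather than assume.

```c
#include <string.h>

/* Hypothetical small structure: three 8-byte fields, 24 bytes on x86-64. */
struct small_msg {
	unsigned long a;
	unsigned long b;
	unsigned long c;
};

/*
 * Constant-size memcpy(): gcc expands this inline, and the instruction
 * sequence it picks is what may differ between -Os and -O2.
 */
void copy_memcpy(struct small_msg *dst, const struct small_msg *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/*
 * Field-by-field copy: plain loads and stores, one way to sidestep the
 * inline memcpy() expansion for a small structure.
 */
void copy_fields(struct small_msg *dst, const struct small_msg *src)
{
	dst->a = src->a;
	dst->b = src->b;
	dst->c = src->c;
}
```

Saving this as copy.c (file name assumed) and running "gcc -Os -S copy.c" and then "gcc -O2 -S copy.c" gives two assembly listings to compare for the two functions.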
-ben
--
"Ladies and gentlemen, I'm sorry to interrupt, but the police are here
and they've asked us to stop the party." Don't Email: <[email protected]>.