Hi,
On Mon, Jun 12, 2006 at 01:14:34PM -0400, Chuck Ebbert wrote:
> Kernel code starts out ~30K bytes smaller with gcc 4.1 and using C
> for current_thread_info() helps even more than with 4.0. Nice...
Especially since current_thread_info() often incurs an AGI stall (address
generation interlock, read: a severe pipeline stall): it often cannot be
interleaved with nearby opcodes because there are no suitable ones, e.g.
in a function prologue.
    mov $0xffffe000,%eax
    and %esp,%eax
cannot overlap: the "and" has to wait for the result of the "mov", and
any access through %eax right afterwards has to wait for the address
generation in turn.
This shows up quite noticeably during profiling (IIRC 8 hits vs. 1 to 2
hits in other places), which really hurts since this function is used
basically *everywhere*.
As such, whatever optimization we can gain (e.g. the compiler merging
multiple current_thread_info() invocations) is very worthwhile.
I came up with this C variant recently, too, but I didn't think it
would make much of a difference. It seems to be really useful when
moving to newer gcc versions, however.
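For reference, a minimal user-space sketch of the masking idea behind
such a C variant. It assumes 8 KiB stacks (THREAD_SIZE of 8192, which
matches the 0xffffe000 mask above); thread_info_of() and the one-field
struct are made-up names for illustration, and the stack pointer is
passed in as an argument here instead of being read from %esp:

```c
#include <stdint.h>

/* Assumption: 8 KiB kernel stacks, i.e. the 0xffffe000 mask on 32 bit. */
#define THREAD_SIZE 8192UL

/* The mask trick only works for power-of-two stack sizes. */
_Static_assert((THREAD_SIZE & (THREAD_SIZE - 1)) == 0,
	       "THREAD_SIZE must be a power of two");

/* Hypothetical stand-in for the kernel's struct thread_info. */
struct thread_info {
	int cpu;
};

/*
 * Round a stack pointer value down to the start of its
 * THREAD_SIZE-aligned stack area, which is where thread_info
 * sits in this layout.
 */
static inline struct thread_info *thread_info_of(uintptr_t sp)
{
	return (struct thread_info *)(sp & ~(uintptr_t)(THREAD_SIZE - 1));
}
```

With sp = 0xc1234f5c this yields 0xc1234000. Since it is plain C
arithmetic on one value, the compiler can schedule the "and" freely and
reuse the result when the function is called several times in a row.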
Andreas Mohr
--
No programming skills!? Why not help translate many Linux applications!
https://launchpad.ubuntu.com/rosetta
(or alternatively buy nicely packaged Linux distros/OSS software to help
support Linux developers creating shiny new things for you?)