In-Reply-To: <[email protected]>
On Mon, 12 Jun 2006 20:48:33 +0200, Andreas Mohr wrote:
> > Kernel code starts out ~30K bytes smaller with gcc 4.1 and using C
> > for current_thread_info() helps even more than with 4.0. Nice...
>
> Especially since current_thread_info() often has an AGI stall (read:
> severe pipeline stall) since it often cannot properly intermingle
> with nearby opcodes due to lack of suitable ones, e.g. at a
> function prologue.
> 	mov $0xffffe000,%eax
> 	and %esp,%eax
> are fundamentally incompatible due to having to wait for the address
> generation before the "and" can be executed.
> This shows up during profiling quite noticeably (IIRC 8 hits vs. 1 to 2
> hits on other places), which really hurts since this function is used
> basically *everywhere*.
Hmmm. The compiler does it this way:
	mov %esp,%eax
	and $0xffffe000,%eax
which could be faster because esp can be moved to eax while the mask
is being fetched.
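
For reference, the C form under discussion boils down to masking the stack pointer with ~(THREAD_SIZE - 1). Below is a minimal sketch, assuming 8K kernel stacks (which is what the 0xffffe000 mask implies). The name thread_info_base() and the explicit sp argument are made up for illustration so this builds in userspace; the kernel version reads %esp directly and returns a struct thread_info pointer.

```c
#include <stdint.h>

/* Assumption: 8 KiB kernel stacks, matching the 0xffffe000 mask quoted above. */
#define THREAD_SIZE 8192u

/*
 * Hypothetical helper, not the kernel's current_thread_info():
 * round a 32-bit stack address down to the start of its
 * THREAD_SIZE-aligned stack area, where thread_info would sit.
 */
static inline uint32_t thread_info_base(uint32_t sp)
{
	/* ~(THREAD_SIZE - 1) == 0xffffe000 for 32-bit unsigned */
	return sp & ~(THREAD_SIZE - 1u);
}
```

With sp = 0xc1235f40 this returns 0xc1234000. Written in plain C like this, the compiler is free to choose the mov/and order and to schedule the two instructions among whatever else is nearby, which a fixed inline asm sequence does not allow.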
--
Chuck