Re: [patch 00/2] improve .text size on gcc 4.0 and newer compilers

* Andrew Morton <[email protected]> wrote:

> Either way, we need to go all over the tree.  In practice, we'll only 
> bother going over the bits which we most care about (core kernel, core 
> networking, a handful of net and block drivers).  I suspect many of 
> the bad inlining decisions are in poorly-maintained code - we've been 
> pretty careful about this for several years.

i've gone over quite a few inlining decisions, and i have to say that
even for my _own code_, most of the historic inlining decisions were
wrong. What seems simple on the C level can be quite bloaty on the
assembly level - even if you are well aware of how C maps to assembly!

furthermore, there's also a new CPU-architecture argument: the cost of
icache misses has gone up disproportionately over the past couple of
years. Firstly, lots of instruction-scheduling 'metadata' got embedded
into the L1 cache (like what used to be the BTB cache). Secondly, the
(physical) latency gap between L1 cache and L2 cache has increased.
Thirdly, CPUs are much better at untangling data dependencies, hence
more compact but also more complex code can still perform well. So the
L1 icache is more important than it used to be, and small code size is
more important than raw cycle count - _and_ small code has less of a
speed hit than it used to have. x86 CPUs have become simple JIT
compilers, and code size reductions tend to become the best way to
inform the CPU of what operations we want to compute.

So as an end-result, most of the historic inlining decisions _even in
well-maintained core code_ are incorrect these days. We really only
want to inline things where the inlining results in _smaller_ code
(due to less clobbering of function-call related caller-saved
registers). That pretty much only leaves the truly trivial 1-2
instruction code sequences like get_current() and the __always_inline
stuff like kmalloc() or memset(). All the rest wants to be a function
call these days.
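
To make the rule of thumb above concrete, here is a minimal sketch in
plain userspace C, not actual kernel code. The names struct task,
get_current_task() and count_set_flags() are made up for illustration,
and the attributes are the GCC/clang spellings rather than the kernel's
own macros. The accessor is a single load, so inlining it should be
smaller than a call; the second helper looks just as innocent in C but
expands to a nested loop, so it is kept out of line and its body exists
only once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-task structure; not the kernel's. */
struct task {
	int pid;
	unsigned long flags;
};

static struct task boot_task = { .pid = 1, .flags = 0 };
static struct task *current_task = &boot_task;

/*
 * Trivial accessor: a single load. A call sequence plus the
 * caller-saved register clobbers would be larger than the body
 * itself, so this is the kind of thing worth inlining.
 */
static inline __attribute__((always_inline))
struct task *get_current_task(void)
{
	return current_task;
}

/*
 * Looks simple on the C level, but the nested loop and its branches
 * expand to many instructions. Kept out of line so the code is
 * emitted once instead of at every call site.
 */
__attribute__((noinline))
size_t count_set_flags(const unsigned long *words, size_t n)
{
	size_t i, count = 0;

	assert(words != NULL || n == 0);
	for (i = 0; i < n; i++) {
		unsigned long w = words[i];

		while (w) {
			w &= w - 1;	/* clear the lowest set bit */
			count++;
		}
	}
	return count;
}
```

Whether a given helper really shrinks .text when uninlined depends on
the compiler and the number of call sites, so the honest check is to
compare the section sizes of the built object before and after.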

	Ingo