On Mon, 25 Jun 2007, Segher Boessenkool wrote:
>> then do we need a new option 'optimize for best overall performance' that
>> goes for size (and the corresponding wins there) most of the time, but is
>> ignored where it makes a huge difference?
>
> That's -Os mostly. Some awful CPUs really need higher
> loop/label/function alignment though to get any
> performance; you could add -falign-xxx options for those.
>
>> in reality this was a flaw in gcc that on modern CPUs with the larger
>> difference between CPU speed and memory speed it still preferred to unroll
>> loops (eating more memory and blowing out the CPU cache) when it shouldn't
>> have.
>
> You told it to unroll loops, so it did. No flaw. If you
> feel the optimisations enabled by -O2 should depend on the
> CPU tuning selected, please file a PR.
>
> Also note that whether or not it is profitable to unroll
> a particular loop depends largely on how "hot" that loop
> is, and GCC doesn't know much about that if you don't feed
> it profiling information (it can guess a bit, sure, but it
> can guess wrong too).
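
For readers following the unrolling argument, here is a minimal C sketch of the trade-off being discussed. The function names sum_rolled and sum_unrolled4 are hypothetical and the unroll factor of 4 is an arbitrary choice for illustration; this is a hand-written approximation of what an unrolled loop looks like, not GCC's actual output.

```c
#include <stddef.h>

/* Rolled loop: smallest code, one compare-and-branch per element. */
long sum_rolled(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/*
 * Hand-unrolled by 4 (illustrative factor): one compare-and-branch per
 * four elements, at the cost of a larger loop body plus a cleanup loop
 * for the leftover elements. More code means more instruction-cache
 * pressure, which only pays off if the loop is actually hot.
 */
long sum_unrolled4(const int *v, size_t n)
{
    long total = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    for (; i < n; i++)      /* remainder: 0 to 3 elements */
        total += v[i];
    return total;
}
```

Both functions return the same result; only code size and branch count differ. Building the rolled version with -Os and with -O2 -funroll-loops, then comparing the object sizes, shows the size cost. Building with -fprofile-generate, running a representative workload, and rebuilding with -fprofile-use is how GCC is given the "hotness" information mentioned above.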
actually, what you are saying is that the compiler can't know enough to
figure out how to optimize for speed. it will just do what you tell it to,
either unroll loops or not.
this argues that both -O2 and -Os are incorrect for a project to use and
instead the project needs to make its own decisions on this.
if this is the true feeling of the gcc team I'm very disappointed; it
feels like a huge step backwards.
David Lang