> > > It's documented as being suboptimal to use inc/dec due to it modifying all
> > > of eflags resulting in dependency related stalls. add/sub only modifies
> > > one bit of eflags so is more optimal. However there is a problem of
> >
> > ?! add/sub doesn't modify "only one bit in eflags", it modifies all.
> > In fact, it's dec/inc which does not modify all bits.
> > It doesn't touch 'carry' bit (IIRC).
> >
> > If inc/dec is slower on P4, it must be just another P4 quirk.
>
> You're right about the add and the number of modified bits. The documented
> part is found in the P4 optimisation manual;
>
> "The inc and dec instructions should always be avoided. Using add
> and sub instructions instead avoids data dependence and improves
> performance."
I read this as "get saner processor".
But frankly any CPU optimization manuals, not just Intel's,
go to semi-absurd suggestions like "align all data
and branch targets to $BIGNUM bytes", "do not use instruction X
(because we failed to make it work reasonably fast), use Y".
Of course next gen CPU will prefer Y over X
(for example, AMD's recomended way to do NOPs).
--
vda
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
|
|