On Thu, 28 Jul 2005, Steven Rostedt wrote:
>
> In the thread "[RFC][PATCH] Make MAX_RT_PRIO and MAX_USER_RT_PRIO
> configurable" I discovered that a C version of find_first_bit is faster
> than the asm version now when compiled against gcc 3.3.6 and gcc 4.0.1
> (both from versions of Debian unstable). I wrote a benchmark (attached)
> that runs the code 1,000,000 times.
I suspect the old "rep scas" has always been slower than
compiler-generated code, at least under your test conditions. Many of the
old asm's are actually _very_ old, and some of them come from pre-0.01
days and are more about me learning the i386 (and gcc inline asm).
That said, I don't much like your benchmarking methodology. I suspect that
quite often, the code in question runs from L2 cache, not in a tight loop,
and so that "run a million times" approach is not necessarily the best
one.
I'll apply this one as obvious: I doubt the compiler generates bigger code
or has any real downsides, but I just wanted to say that in general I just
wish people didn't always time the hot-cache case ;)
Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
|
|