Re: [RFC, patch] i386: vgetcpu(), take 2

On Wed, 21 Jun 2006, Andi Kleen wrote:
> 
> My measurements show otherwise: I get 60+ cycles on K8 and 150+ cycles
> on P4. That is with a full vsyscall around it. It is still far better
> than CPUID, though slower than RDTSCP on those CPUs that support it.
> 
> I changed the CPUID fallback path to use LSL on x86-64

One note of warning: 

Playing "clever games" has a real tendency to suck badly eventually. I'm 
betting LSL is pretty damn low on any list of instructions to be optimized 
by the CPU core, so it would tend to always be microcoded, while other ops 
might get faster.

> Measuring this way is a bad idea because you get far too much 
> noise from the RDTSCs. Usually you need to put a loop of a few thousand 
> iterations inside the RDTSCP and divide the result by the loop count

And measuring that way isn't perfect either, because it tends to show you 
how well an instruction works in that particular instruction mix, but not 
necessarily in real life.
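
For reference, the loop-and-divide measurement being discussed looks 
roughly like the sketch below. It is only an illustration under stated 
assumptions: it reads clock_gettime(CLOCK_MONOTONIC) instead of 
RDTSC/RDTSCP so that it builds on any POSIX system, and now_ns(), 
op_under_test() and avg_ns_per_call() are hypothetical names, not 
anything from the patch.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds. Assumption: this stands in for the
 * RDTSC/RDTSCP reads discussed in the thread. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical operation under test; replace with the real sequence. */
static void op_under_test(void)
{
	static volatile unsigned int sink;

	sink++;		/* volatile keeps the compiler from dropping it */
}

/* Read the timer once before and once after the whole loop, then divide
 * by the iteration count, so the cost of the two timer reads is spread
 * over all iterations instead of dominating a single measurement. */
static double avg_ns_per_call(void (*op)(void), unsigned long iters)
{
	uint64_t start, end;
	unsigned long i;

	assert(iters > 0);

	start = now_ns();
	for (i = 0; i < iters; i++)
		op();
	end = now_ns();

	return (double)(end - start) / (double)iters;
}
```

Calling avg_ns_per_call(op_under_test, 100000) gives an average that 
still includes the loop and indirect-call overhead, and it describes the 
operation only in this tight loop's instruction mix.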

Benchmarking single instructions is simply damn hard. It's often better to 
try to find a real load where that particular sequence is important enough 
to be measurable at all, and then try the alternatives. Not perfect 
either, but if you can't find such a load, maybe you shouldn't be doing it 
in the first place.. And if you _can_ find such a real load, at least you 
measured something that was actually real.

		Linus