Re: [RFC, patch] i386: vgetcpu(), take 2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



In-Reply-To: <[email protected]>

On Wed, 21 Jun 2006 19:14:37 +0200, Andi Kleen wrote:

>> 
>> /* test how fast lsl/jnz/and runs.
>>  */
>> #define _GNU_SOURCE
>> #include <stdio.h>
>> #include <stdlib.h>
>> 
>> #define rdtscll(t)   asm volatile ("rdtsc" : "=A" (t))
>> 
>> #ifndef ITERS
>> #define ITERS        1000000
>> #endif
>> 
>> int main(int argc, char * const argv[])
>> {
>>      unsigned long long tsc1, tsc2;
>>      int count, cpu, junk;
>> 
>>      rdtscll(tsc1);
>>      asm (
>>              "       pushl %%ds              \n"
>>              "       popl %2                 \n"
>>              "1:                             \n"
>> #ifdef DO_TEST
>>              "       lsl %2,%0               \n"
>>              "       jnz 2f                  \n"
>>              "       and $0xff,%0            \n"
>> #endif
>>              "       dec %1                  \n"
>>              "       jnz 1b                  \n"
>>              "2:                             \n"
>>              : "=&r" (cpu), "=&r" (count), "=&r" (junk)
>>              : "1" (ITERS), "0" (-1)
>>      );
>>      rdtscll(tsc2);
>
> Measuring this way is a bad idea because you get far too much 
> noise from the RDTSCs. Usually you need to put a a few thousands entry 
> loop inside the RDTSCP and devide the result by the loop count

I got tired of people (namely me) forgetting to compile the C code
with optimization, so I did the loop in assembler.  It does 1000000
iterations by default.  Later I added the DO_TEST that lets you test
the empty loop just because I was curious.

A more realistic test with the two 'mov' instructions inside the loops
still only takes 16 clocks, so I'm wondering why you get 60?  Does the
vsyscall add that much overhead?  With this I get 29-30 clocks per loop
on Pentium II:


/* vgetcpu.c: test how fast vgetcpu runs
 * boot kernel with vgetcpu patch first, then build this:
 *  gcc -O3 -o vgetcpu vgetcpu.c <srcpath>/arch/i386/kernel/vsyscall-int80.so
 * (don't forget the optimization (-O3))
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

extern int __vgetcpu(void);

#define rdtscll(t)      asm("rdtsc" : "=A" (t))

int main(int argc, char * const argv[])
{
        long long tsc1, tsc2;
        int i, iters = 999999;

        rdtscll(tsc1);
        for (i = 0; i < iters; i++)
                __vgetcpu();
        rdtscll(tsc2);

        printf("loops: %d, avg: %llu\n", iters, (tsc2 - tsc1) / iters);

        return 0;
}
-- 
Chuck
 "You can't read a newspaper if you can't read."  --George W. Bush
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux