Re: [PATCH] i386-pda UP optimization

Andi Kleen wrote:
>> For umask/getppid, assuming you're just running 1e7 iterations, you're
>> seeing a difference of 25 and 35ns per iteration difference.  I wonder
>> why it would be different for different syscalls; I would expect it to
>> be a constant overhead either way.
>>     
>
> They got different numbers of current references? 
>   

My understanding is that Eric has changed UP current (and other PDA ops)
to not touch %gs at all, and the difference in reported times in due
omitting the %gs load in entry.S (though %gs is still save/restored on
the stack).

> On such micro benchmarks everything should be cache hot in theory
> (unless it's a system with really small cache)
>   

Yes, that would be my thought too, but maybe there's excessive aliasing
on one of the ways, but I think he's using a Pentium M which has a 8-way L1.

>> been planning on a patch to rearrange the gdt in order to pack all the
>> commonly used segment descriptors into one or two cache lines so that
>> all the segment register reloads can be done with a minimum of cache
>> misses.  It would be interesting for you to replace the:
>>
>>     movl $(__KERNEL_PDA), %edx; movl %edx, %gs
>>
>> with an appropriate read of the gdt entry, hm, which is a bit complex to
>> find.
>>     
>
> On UP it could be hardcoded. And oprofile can be used to profile for cache misses.
>   

Yes, assuming oprofile doesn't interfere with things too much. 
Actually, just counting cache miss events during the course of a syscall
would be most interesting (ie, no need to sample).

    J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

References:
- Re: [PATCH] i386-pda UP optimization
  - From: Eric Dumazet <dada1@cosmosbay.com>
- Re: [PATCH] i386-pda UP optimization
  - From: Jeremy Fitzhardinge <jeremy@goop.org>
- Re: [PATCH] i386-pda UP optimization
  - From: Andi Kleen <ak@suse.de>

Prev by Date: Re: [patch] cpufreq: mark cpufreq_tsc() as core_initcall_sync
Next by Date: [PATCH 1/3] ext2/3/4: enable "undeletable" file attribute.
Previous by thread: Re: [PATCH] i386-pda UP optimization
Next by thread: Re: [PATCH] i386-pda UP optimization
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]