On Friday 03 November 2006 08:26, Eric Dumazet wrote:
> Hi Andi
>
> While doing some oprofile analysis, I got this result on ip_route_input() :
> one particular instruction seems to spend a lot of cycles.
>
> machine is a dual core 285, 2.6 GHz
Single socket?
> ################ BEGIN
> 370 0.0017 :ffffffff803e98f1: mov %gs:0x8,%rax
> 222769 1.0454 :ffffffff803e98fa: incl 0x38(%rcx,%rax,1)
> ################ END
> 6 2.8e-05 :ffffffff803e98fe: mov (%rdx),%rdx
> 833 0.0039 :ffffffff803e9901: test %rdx,%rdx
>
> __raw_get_cpu_var(rt_cache_stat).field++ appears to be very expensive
First the profile events are not exact. It could be an earlier instruction.
The reordering Window is relatively large (~80 macro ops)
But let's assume it is this one.
Weird. Maybe it's a cache miss. Can you check with DATA_CACHE_MISSES ?
Or possible a TLB miss, although that's far less likely (L1_AND_L2_DTLB_MISSES)
> (about 18000 RT_CACHE_STAT_INC(in_hlist_search); are done per second, not an
> impressive count in fact)
>
> Are segment prefixes that expensive ?
No, they are supposed to be one cycle only.
> Or is it only the first access to %gs:8 that is doing extra checks ?
> (because other RT_CACHE_STAT_INC() done in the same function dont have this cost)
> is it the loading of %rcx (done in ffffffff803e98bd) that is stalling ?
My first guess would be a cache miss of some sort.
Maybe the rest of the code needs enough cache to push this line out.
Or since you got dual core with separated caches they are bouncing
for some reason (they shouldn't, but maybe there is a bug)
> so that RT_CACHE_STAT_INC(field) would map to
>
> addl $1,%gs:OFFSET /* no register needed */
I doubt that will make much difference.
-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]