Re: [patch 9/9] Scheduler profiling - Use conditional calls

On Fri, 1 Jun 2007 11:54:13 -0400 Mathieu Desnoyers <[email protected]> wrote:

> * Andrew Morton ([email protected]) wrote:
> > On Wed, 30 May 2007 10:00:34 -0400
> > Mathieu Desnoyers <[email protected]> wrote:
> > 
> > > @@ -2990,7 +2991,8 @@
> > >  			print_irqtrace_events(prev);
> > >  		dump_stack();
> > >  	}
> > > -	profile_hit(SCHED_PROFILING, __builtin_return_address(0));
> > > +	cond_call(profile_on,
> > > +		profile_hit(SCHED_PROFILING, __builtin_return_address(0)));
> > >  
> > 
> > That's looking pretty neat.  Do you have any before-and-after performance
> > figures for i386 and for a non-optimised architecture?
> 
> Sure, here is the result of a small test comparing:
> 1 - Branch depending on a cache miss (has to fetch from memory, caused by a
>     128-byte stride). This test is likely to resemble the side effect
>     the original profile_hit code was causing, under the
>     assumption that the kernel is already using L1 and L2 caches at
>     their full capacity and that a supplementary data load would cause
>     cache thrashing.
> 2 - Branch depending on L1 cache hit. Just for comparison.
> 3 - Branch depending on a load immediate in the instruction stream.
> 
> It has been compiled with gcc -O2. Tests done on a 3GHz P4.
> 
> In the first test series, the branch is not taken:
> 
> number of tests : 1000
> number of branches per test : 81920
> memory hit cycles per iteration (mean) : 48.252
> L1 cache hit cycles per iteration (mean) : 16.1693
> instruction stream based test, cycles per iteration (mean) : 16.0432
> 
> 
> In the second test series, the branch is taken and an integer is
> incremented within the block:
> 
> number of tests : 1000
> number of branches per test : 81920
> memory hit cycles per iteration (mean) : 48.2691
> L1 cache hit cycles per iteration (mean) : 16.396
> instruction stream based test, cycles per iteration (mean) : 16.0441
> 
> Therefore, the memory fetch based test seems to be 200% slower than the
> load immediate based test.

Confused.  From what did you calculate that 200%?

> (I am adding these results to the documentation)

Good, thanks.