RE: [PATCH RFC] smt nice introduces significant lock contention

Con Kolivas wrote on Thursday, June 01, 2006 6:59 PM
> On Friday 02 June 2006 09:57, Chen, Kenneth W wrote:
> > Chris Mason wrote on Thursday, June 01, 2006 3:56 PM
> >
> > > Hello everyone,
> > >
> > > Recent benchmarks showed some performance regressions between 2.6.5 and
> > > 2.6.16.  We tracked down one of the regressions to lock contention in
> > > schedule-heavy workloads (~70,000 context switches per second).
> > >
> > > kernel/sched.c:dependent_sleeper() was responsible for most of the lock
> > > contention, hammering on the run queue locks.  The patch below is more of
> > > a discussion point than a suggested fix (although it does reduce lock
> > > contention significantly).  The dependent_sleeper code looks very
> > > expensive to me, especially since it takes a spinlock to bounce control
> > > between two different siblings on the same cpu.
> >
> > Yeah, this sort of echoes a recent discovery on one of our benchmarks
> > that schedule() is red hot in the kernel.  I was scratching my head,
> > not sure what was going on.  This dependent_sleeper could be the culprit,
> > considering it is called at almost every context switch.  I don't think
> > we are hurt by lock contention, but by the large amount of code it
> > executes.  It really needs to be cut down ....
> 
> Thanks for the suggestion. How about something like this, which takes your
> idea and expands on it further. Compiled; boot and runtime tests OK.
> 
>  /*
> + * Try to lock all the siblings of a cpu that is already locked. As we're
> + * only doing trylock, the order is not important.
> + */
> +static int trylock_smt_siblings(cpumask_t sibling_map)
> +{
> +	cpumask_t locked_siblings;
> +	int i;
> +
> +	cpus_clear(locked_siblings);
> +	for_each_cpu_mask(i, sibling_map) {
> +		if (!spin_trylock(&cpu_rq(i)->lock))
> +			break;
> +		cpu_set(i, locked_siblings);
> +	}
> +
> +	/* Did we lock all the siblings? */
> +	if (cpus_equal(sibling_map, locked_siblings))
> +		return 1;
> +
> +	for_each_cpu_mask(i, locked_siblings)
> +		spin_unlock(&cpu_rq(i)->lock);
> +	return 0;
> +}


I like Chris's version of trylock_smt_siblings().  Why create all those
local variables?  sibling_map is passed by value, so the whole mask is
duplicated on the stack (I think it should be passed by pointer), then
another locked_siblings mask is declared, followed by a bitmask compare.
With CONFIG_NR_CPUS=1024 a cpumask_t is 128 bytes, so the big iron people
won't be amused by all that bitmask copying.

Let me hack up something too, so we can compare notes etc.
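Something along these lines, maybe (a completely untested sketch, same
trylock logic as Con's, but the mask goes by pointer and we remember where
the trylock failed instead of keeping a second mask):

static int trylock_smt_siblings(const cpumask_t *sibling_map)
{
	int i, j;

	for_each_cpu_mask(i, *sibling_map) {
		if (!spin_trylock(&cpu_rq(i)->lock))
			goto unlock;
	}
	/* Got every sibling runqueue lock. */
	return 1;

unlock:
	/* Back out: unlock only the siblings we actually locked. */
	for_each_cpu_mask(j, *sibling_map) {
		if (j == i)
			break;
		spin_unlock(&cpu_rq(j)->lock);
	}
	return 0;
}

The back-out loop relies on for_each_cpu_mask walking the mask in the same
order both times, so everything before the failing cpu is exactly what got
locked.  No mask copies, no extra mask on the stack.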

- Ken
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
