Re: [patch 2/4] net: percpufy frequently used vars -- struct proto.memory_allocated

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ravikiran G Thirumalai <[email protected]> wrote:
>
> On Tue, Mar 07, 2006 at 06:14:22PM -0800, Andrew Morton wrote:
> > Ravikiran G Thirumalai <[email protected]> wrote:
> > >
> > > -	if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> > >  +	if (percpu_counter_read(sk->sk_prot->memory_allocated) <
> > >  +			sk->sk_prot->sysctl_mem[0]) {
> > 
> > Bear in mind that percpu_counter_read[_positive] can be inaccurate on large
> > CPU counts.
> > 
> > It might be worth running percpu_counter_sum() to get the exact count if we
> > think we're about to cause something to fail.
> 
> The problem is percpu_counter_sum has to read all the cpus cachelines.  If
> we have to use percpu_counter_sum everywhere, then might as well use plain
> per-cpu counters instead of  batching ones no?

I didn't say "use it everywhere" ;)

Just in places like this:

	if (percpu_counter_read(something) > something_else)
		make_an_application_fail();

in that case it's worth running percpu_counter_sum().  And bear in mind
that once we've done that, the following percpu_counter_read()s become
accurate, so we won't run the expensive percpu_counter_sum() again
for a while.  Unless we're really close to or over the limit, in which case
blowing a few cycles is relatively unimportant.

All that should be captured in library code (per_cpu_counter_exceeds(ptr,
threshold), for example) rather than open-coded everywhere.

> sysctl_mem[0] is about 196K  and on a 16 cpu box variance is 512 bytes, which 
> is OK with just percpu_counter_read I hope.

You mean a 16 CPU box with NR_CPUS=16 as well...

>  Maybe, on very large cpu counts, 
> we should just change the FBC_BATCH so that variance does not go quadratic.
> Something like 32.  So that variance is 32 * NR_CPUS in that case, instead
> of (NR_CPUS * NR_CPUS * 2) currently.  Comments?

Sure, we need to make that happen.  But it got all mixed up with the
spinlock removal and it does need quite some thought and testing and
documentation to help developers to choose the right settings and
appropriate selection of defaults, etc.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux