Re: [PATCH] percpu data: only iterate over possible CPUs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Andrew Morton a écrit :
Eric Dumazet <[email protected]> wrote:
Andrew Morton a écrit :
Andi Kleen <[email protected]> wrote:
On Thursday 09 February 2006 19:04, Andrew Morton wrote:
Ashok Raj <[email protected]> wrote:
The problem was with ACPI just simply looking at the namespace doesnt
 exactly give us an idea of how many processors are possible in this platform.
We need to fix this asap - the performance penalty for HOTPLUG_CPU=y,
NR_CPUS=lots will be appreciable.
What is this performance penalty exactly?
All those for_each_cpu() loops will hit NR_CPUS cachelines instead of
hweight(cpu_possible_map) cachelines.
You mean NR_CPUS bits, mostly all included in a single cacheline, and even in a single long word :) for most cases (NR_CPUS <= 32 or 64)


No, I mean cachelines:

static void recalc_bh_state(void)
{
	int i;
	int tot = 0;

	if (__get_cpu_var(bh_accounting).ratelimit++ < 4096)
		return;
	__get_cpu_var(bh_accounting).ratelimit = 0;
	for_each_cpu(i)
		tot += per_cpu(bh_accounting, i).nr;

That's going to hit NR_CPUS cachelines even on a 2-way.

Or am I missing something really obvious here?


OK I see. This can be solved with this patch :

[PATCH] HOTPLUG_CPU : avoid hitting too many cachelines in recalc_bh_state()

Instead of using for_each_cpu(i), we can use for_each_online_cpu(i) : The difference matters if HOTPUG_CPU=y

When a CPU goes offline (ie removed from online map), it might have a non null bh_accounting.nr, so this patch adds a transfert of this counter to an online CPU counter.

We already have a hotcpu_notifier, (function buffer_cpu_notify()), where we can do this bh_accounting.nr transfert.

Signed-off-by: Eric Dumazet <[email protected]>
--- a/fs/buffer.c	2006-02-10 15:08:21.000000000 +0100
+++ b/fs/buffer.c	2006-02-10 15:47:55.000000000 +0100
@@ -3138,7 +3138,7 @@
 	if (__get_cpu_var(bh_accounting).ratelimit++ < 4096)
 		return;
 	__get_cpu_var(bh_accounting).ratelimit = 0;
-	for_each_cpu(i)
+	for_each_online_cpu(i)
 		tot += per_cpu(bh_accounting, i).nr;
 	buffer_heads_over_limit = (tot > max_buffer_heads);
 }
@@ -3187,6 +3187,9 @@
 		brelse(b->bhs[i]);
 		b->bhs[i] = NULL;
 	}
+	get_cpu_var(bh_accounting).nr += per_cpu(bh_accounting, cpu).nr ;
+	per_cpu(bh_accounting, cpu).nr = 0;
+	put_cpu_var(bh_accounting);
 }
 
 static int buffer_cpu_notify(struct notifier_block *self,

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux