__percpu_counter_add() cache the result in percpu variable until it
exceeds the batch.
The prop_norm_percpu() use the percpu_counter_read(&pl->events) to read
the counter ,and use percpu_counter_add(&pl->events, -half) to half the
counter.
There are potential problems:
1.The counter may be negative
2.After some calculation, it may be converted to a big positive number.
1.
For example, the batch is 32, when the bdi add 32 to the pl->events,
the pl->events->count will be 32.Suppose one of the percpu counter is 1.
In the prop_norm_percpu(),the half will be 16.Because it is under the
batch, the pl->events->count won't be modified and one of the percpu
counter may be -15. If call the prop_norm_percpu() again, the half will
still be 16,though it should be 8.The percpu counter may be -31.
Now, there pl->events->count is still 32.
If do the third calculation, the percpu counter will be -47, it will
be merged into the pl->evnets->count.Then pl->events->count will be
negative.
2.When the pl->events->count is negative,
unsigned long val = percpu_counter_read(&pl->events);
This statement may return a negative number, so the val would be a big
number.Because of the overflow, the pl->events->count will be converted
into a big positive number after some calculation.
Because of the overflow, I catch some very big numerators when call the
prop_fraction_percpu().
I think that it should use percpu_counter_sum() instead of the
percpu_counter_read() to be more robust.
diff -Nur a/proportions.c b/proportions.c
--- a/proportions.c 2007-12-12 11:05:59.000000000 +0800
+++ b/proportions.c 2007-12-13 11:05:40.000000000 +0800
@@ -241,7 +241,7 @@
* can never result in a negative number.
*/
while (pl->period != global_period) {
- unsigned long val = percpu_counter_read(&pl->events);
+ unsigned long val = percpu_counter_sum(&pl->events);
unsigned long half = (val + 1) >> 1;
/*
Here is the relative codes:
static
void prop_norm_percpu(struct prop_global *pg, struct prop_local_percpu
*pl)
{
/*
* For each missed period, we half the local counter.
* basically:
* pl->events >> (global_period - pl->period);
*
* but since the distributed nature of percpu counters make division
* rather hard, use a regular subtraction loop. This is safe, because
* the events will only every be incremented, hence the subtraction
* can never result in a negative number.
*/
while (pl->period != global_period) {
unsigned long val = percpu_counter_read(&pl->events);
unsigned long half = (val + 1) >> 1;
/*
* Half of zero won't be much less, break out.
* This limits the loop to shift iterations, even
* if we missed a million.
*/
if (!val)
break;
percpu_counter_add(&pl->events, -half);
pl->period += period;
}
pl->period = global_period;
spin_unlock_irqrestore(&pl->lock, flags);
}
void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32
batch)
{
s64 count;
s32 *pcount;
int cpu = get_cpu();
pcount = per_cpu_ptr(fbc->counters, cpu);
count = *pcount + amount;
if (count >= batch || count <= -batch) {
spin_lock(&fbc->lock);
fbc->count += count;
*pcount = 0;
spin_unlock(&fbc->lock);
} else {
*pcount = count;
}
put_cpu();
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]