Re: [BUG] Something goes wrong with timer statistics.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2007.05.30 14:44:49 +0200, Eric Dumazet wrote:
> On Wed, 30 May 2007 13:38:25 +0200
> Björn Steinbrink <[email protected]> wrote:
> 
> > On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote:
> > > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> > > > On 29/05/07, Ian Kumlien <[email protected]> wrote:
> > > > > Hi,
> > > > >
> > > > > As the daystar sets, i try to play some with my new would be
> > > > > firewall/server, but since this will be running for quite some time i
> > > > > have been experimenting with powertop to find out what i can do to limit
> > > > > it's power usage.
> > > > >
> > > > > But, if i run powertop for too long or a few times to many... this
> > > > > happens:
> > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> > > 
> > > This was reported before. It's incredibly hard to reproduce.
> > 
> > OK, second try, this time with a patch. In timer_stats_update_stats,
> > input is allocated on the stack, so it is uninitialized, in particular
> > the "next" field is random. Now in tstat_lookup, the new entry "curr" is
> > initialized with the values from "input" (passed as "entry") and "next"
> > is set to NULL _after_ it is added to the list, so if a second CPU is
> > running the fastpath lookup while we're inserting the new element, it
> > might get the garbage value in "next". The patch below fixes that.
> > 
> > Björn
> > 
> > 
> > 
> > Initialize the "next" field of a timer stats entry before it is inserted
> > into the list to avoid a race with the fastpath lookup.
> > 
> > Signed-off-by: Björn Steinbrink <[email protected]>
> > ---
> > diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
> > index 868f1bc..5bc8f91 100644
> > --- a/kernel/time/timer_stats.c
> > +++ b/kernel/time/timer_stats.c
> > @@ -202,12 +202,12 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
> >  	if (curr) {
> >  		*curr = *entry;
> >  		curr->count = 0;
> > +		curr->next = NULL;
> >  		memcpy(curr->comm, comm, TASK_COMM_LEN);
> >  		if (prev)
> >  			prev->next = curr;
> >  		else
> >  			*head = curr;
> > -		curr->next = NULL;
> >  	}
> >   out_unlock:
> >  	spin_unlock(&table_lock);
> > 
> 
> Your analysis might be right, not the fix.
> 
> You *cannot* assume curr->next = NULL will be done before insert.
> 
> You probably also need a memory barrier.

Ehrm, right. I somehow thought of the spinlock being enough to satisfy
that constraint, but if that was the case, the whole problem wouldn't
exist in the first place. D'oh!

Thanks,
Björn


Initialize the "next" field of a timer stats entry before it is inserted
into the list to avoid a race with the fastpath lookup.

Thanks to Eric Dumazet for reminding me of the memory barrier.

Signed-off-by: Björn Steinbrink <[email protected]>
---
diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
index 868f1bc..ab0ba6c 100644
--- a/kernel/time/timer_stats.c
+++ b/kernel/time/timer_stats.c
@@ -202,12 +202,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
 	if (curr) {
 		*curr = *entry;
 		curr->count = 0;
+		curr->next = NULL;
 		memcpy(curr->comm, comm, TASK_COMM_LEN);
+
+		smp_mb(); /* Ensure that curr is initialized before insert */
+
 		if (prev)
 			prev->next = curr;
 		else
 			*head = curr;
-		curr->next = NULL;
 	}
  out_unlock:
 	spin_unlock(&table_lock);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux