Re: 2.6.16-rc1-mm3 — Linux Kernel

Eric Dumazet wrote:
> Andrew Morton a écrit :
> 
>> Andy Whitcroft <apw@shadowen.org> wrote:
>>
>>> Yes.  I think I have this one.  It appears that the patch below is the
>>>  trigger for all our recent panic woe's.  The last of the testing should
>>>  complete in the next few hours and I will be able to confirm that
>>>  hypothesis; results so far are all good.
>>>
>>>      reduce-size-of-percpudata-and-make-sure-per_cpuobject.patch
>>
>>
>> That patch did have some missed conversions, which might well explain the
>> crash.
>>
>> Thanks for narrowing it down - I'll keep that patch in next -mm (and will
>> include the known fixups).  Could you please boot test that?  If we're
>> still in trouble, I'll drop it.

Sounds eminently fair.  I think the patch has merit so now we know the
symptoms we can spent a little effort to get the kinks out.  Will test
the next -mm as a matter of course.


> The NULL choice was maybe wrong. We might need more than one page to
> fully catch all accesses. Something like 32KB.

The crash behavoir is handy to catch that the problem exists, and is
very cheap (0 cost) at run time.  However, once its known I think we
need something more targetted to allow tracking of the cause.  Perhaps
we could set the offset thingy to -1 or something and simply do
something like the following in per_cpu():

	if (__per_cpu_offset[i] == -1)
		BUG();
	else
		*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset[cpu])

> In the meantime could you apply this one ?
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> 
> 
> 
> ------------------------------------------------------------------------
> 
> --- a/arch/i386/kernel/nmi.c	2006-01-27 07:51:04.000000000 +0100
> +++ b/arch/i386/kernel/nmi.c	2006-01-27 07:52:14.000000000 +0100
> @@ -148,7 +148,7 @@
>  	if (nmi_watchdog == NMI_LOCAL_APIC)
>  		smp_call_function(nmi_cpu_busy, (void *)&endflag, 0, 0);
>  
> -	for (cpu = 0; cpu < NR_CPUS; cpu++)
> +	for_each_cpu(cpu)
>  		prev_nmi_count[cpu] = per_cpu(irq_stat, cpu).__nmi_count;
>  	local_irq_enable();
>  	mdelay((10*1000)/nmi_hz); // wait 10 ticks

No change to the panic's in alloc_slabmgmt.  A very quick review seems
to say that slab percpu data is actually not in percpu space, so that
seems a little odd.  Not had any real time to trace it further.

If you have any other missed ones than this send them along and I'll put
them through the mill.

-apw
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: 2.6.16-rc1-mm3
  - From: Eric Dumazet <dada1@cosmosbay.com>

References:
- 2.6.16-rc1-mm3
  - From: Andrew Morton <akpm@osdl.org>
- Re: 2.6.16-rc1-mm3
  - From: Andy Whitcroft <apw@shadowen.org>
- Re: 2.6.16-rc1-mm3
  - From: Pekka Enberg <penberg@cs.helsinki.fi>
- Re: 2.6.16-rc1-mm3
  - From: Andy Whitcroft <apw@shadowen.org>
- Re: 2.6.16-rc1-mm3
  - From: Pekka Enberg <penberg@cs.helsinki.fi>
- Re: 2.6.16-rc1-mm3
  - From: Andy Whitcroft <apw@shadowen.org>
- Re: 2.6.16-rc1-mm3
  - From: Pekka Enberg <penberg@cs.helsinki.fi>
- Re: 2.6.16-rc1-mm3
  - From: Andy Whitcroft <apw@shadowen.org>
- Re: 2.6.16-rc1-mm3
  - From: Andrew Morton <akpm@osdl.org>
- Re: 2.6.16-rc1-mm3
  - From: Eric Dumazet <dada1@cosmosbay.com>

Prev by Date: Re: [patch 0/9] Critical Mempools
Next by Date: Re: GPL V3 and Linux
Previous by thread: Re: 2.6.16-rc1-mm3
Next by thread: Re: 2.6.16-rc1-mm3
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]