Re: 2.6.23-rc1-mm2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On (01/08/07 22:52), Torsten Kaiser didst pronounce:
> On 8/1/07, Andrew Morton <[email protected]> wrote:
> > On Wed, 01 Aug 2007 16:30:08 -0400
> > [email protected] wrote:
> >
> > > As an aside, it looks like bits&pieces of dynticks-for-x86_64 are in there.
> > > In particular, x86_64-enable-high-resolution-timers-and-dynticks.patch is in
> > > there, adding a menu that depends on GENERIC_CLOCKEVENTS, but then nothing
> > > in the x86_64 tree actually *sets* it.  There's a few other dynticks-related
> > > prep patches in there as well.  Does this mean it's back to "coming soon to
> > > a CPU near you" status? :)
> >
> > I've lost the plot on that stuff: I'm just leaving things as-is for now,
> > wait for Thomas to return from vacation so we can have another run at it.
> 
> For what its worth: 2.6.22-rc6-mm1 with NO_HZ works for me on an AMD
> SMP system without trouble.
> 
> Next try with 2.6.23-rc1-mm2 and SPARSEMEM:
> Probably the same exception, but this time with Call Trace:
> [    0.000000] Bootmem setup node 0 0000000000000000-0000000080000000
> [    0.000000] Bootmem setup node 1 0000000080000000-0000000120000000
> [    0.000000] Zone PFN ranges:
> [    0.000000]   DMA             0 ->     4096
> [    0.000000]   DMA32        4096 ->  1048576
> [    0.000000]   Normal    1048576 ->  1179648
> [    0.000000] Movable zone start PFN for each node
> [    0.000000] early_node_map[4] active PFN ranges
> [    0.000000]     0:        0 ->      159
> [    0.000000]     0:      256 ->   524288
> [    0.000000]     1:   524288 ->   917488
> [    0.000000]     1:  1048576 ->  1179648
> PANIC: early exception rip ffffffff807cddb5 error 2 cr2 ffffe20003000010
> [    0.000000]
> [    0.000000] Call Trace:
> [    0.000000]  [<ffffffff807cddb5>] memmap_init_zone+0xb5/0x130
> [    0.000000]  [<ffffffff807ce874>] init_currently_empty_zone+0x84/0x110
> [    0.000000]  [<ffffffff807cec93>] free_area_init_node+0x393/0x3e0
> [    0.000000]  [<ffffffff807cefea>] free_area_init_nodes+0x2da/0x320
> [    0.000000]  [<ffffffff807c9c97>] paging_init+0x87/0x90
> [    0.000000]  [<ffffffff807c0f85>] setup_arch+0x355/0x470
> [    0.000000]  [<ffffffff807bc967>] start_kernel+0x57/0x330
> [    0.000000]  [<ffffffff807bc12d>] _sinittext+0x12d/0x140
> [    0.000000]
> [    0.000000] RIP memmap_init_zone+0xb5/0x130
> 
> (gdb) list *0xffffffff807cddb5
> 0xffffffff807cddb5 is in memmap_init_zone (include/linux/list.h:32).
> 27      #define LIST_HEAD(name) \
> 28              struct list_head name = LIST_HEAD_INIT(name)
> 29
> 30      static inline void INIT_LIST_HEAD(struct list_head *list)
> 31      {
> 32              list->next = list;
> 33              list->prev = list;
> 34      }
> 35
> 36      /*
> 
> I will test more tomorrow...

Well.... That doesn't make a whole pile of sense unless the memory map
is not present. Looking at your boot log, we see this gem

> [    0.000000]     1:   524288 ->   917488
> [    0.000000]     1:  1048576 ->  1179648

Node 1 spans a region with a nice little hole in the middle of DMA32. In our
test machines, we wouldn't see a hole like this, at least that I can recall
so it would appear to work on some machines. On SPARSEMEM, sparse_init()
is responsible for allocating memmap for each section. In 2.6.22-rc6-mm1,
it allocated the memory if the section was *valid*. In 2.6.23-rc1-mm1,
it allocates the memory if the section is *present* due to the patch
sparsemem-record-when-a-section-has-a-valid-mem_map.patch[1]. Much later in
the init process, memmap is initialised based on spanned memory, not present
memory so initialisation will init memmap that resides in holes if a zone
spans that area in a node which is the case on this machine.  I think this
is why it kablamos - it's inits memmap that wasn't allocated because it's
not present and the suprise is that it doesn't blow up sooner. Please try
the patch below Torsten, thanks.

[1] yeah, I acked this patch and I had read through it. My bad if the
    patch below does fix the problem

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.23-rc1-mm2-clean/mm/sparse.c linux-2.6.23-rc1-mm2-present_revert/mm/sparse.c
--- linux-2.6.23-rc1-mm2-clean/mm/sparse.c	2007-08-01 10:09:39.000000000 +0100
+++ linux-2.6.23-rc1-mm2-present_revert/mm/sparse.c	2007-08-02 00:27:00.000000000 +0100
@@ -483,7 +483,7 @@ void __init sparse_init(void)
 	unsigned long *usemap;
 
 	for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
-		if (!present_section_nr(pnum))
+		if (!valid_section_nr(pnum))
 			continue;
 
 		map = sparse_early_mem_map_alloc(pnum);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux