Re: 2.6.21-rc7-mm2 -- x86_64 blade hard hangs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On (26/04/07 15:46), Mel Gorman didst pronounce:
> On (26/04/07 14:33), Mel Gorman didst pronounce:
> > On (26/04/07 15:51), Andi Kleen didst pronounce:
> > > Andy Whitcroft <[email protected]> writes:
> > > 
> > > > Getting hard boot failures before any kernel output on an x86_64 blade:
> > > > 
> > > > root (hd0,0)
> > > >  Filesystem type is ext2fs, partition type 0x83
> > > > kernel /vmlinuz-autobench ro root=/dev/VolGroup00/LogVol00 rhgb
> > > > console=tty0 console=ttyS1,19200 selinux=no autobench_args:
> > > > root=30726124 ABAT:1177507960 profile=2
> > > >    [Linux-bzImage, setup=0x1e00, size=0x21fe48]
> > > > initrd /initrd-autobench.img
> > > >    [Linux-initrd @ 0x37e5f000, 0x190984 bytes]
> > > > [silence]
> > > > 
> > > > Kernel config is here:
> > > > 
> > > > http://test.kernel.org/abat/85082/build/dotconfig
> > > 
> > > You should boot with earlyprintk=serial,ttyS1,19200
> > > 
> > 
> > Here is the console log. Haven't taken a closer look yet in case it's
> > blindingly obvious to you.
> > 
> > kernel /vmlinuz-autobench ro root=/dev/VolGroup00/LogVol00 rhgb
> > console=tty0 co
> > nsole=ttyS1,19200 selinux=no autobench_args: root=30726124
> > ABAT:1177594149 logl
> > evel=8 earlyprintk=serial,ttyS1,19200 
> >    [Linux-bzImage, setup=0x1e00, size=0x220da8]
> > initrd /initrd-autobench.img
> >    [Linux-initrd @ 0x37e5f000, 0x190982 bytes]
> > Linux version 2.6.21-rc7-mm2-autokern1 ([email protected])
> > (gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)) #1 SMP Thu Apr 26
> > 08:16:37 CDT 2007
> > Command line: ro root=/dev/VolGroup00/LogVol00 rhgb console=tty0
> > console=ttyS1,19200 selinux=no autobench_args: root=30726124
> > ABAT:1177594149 loglevel=8 earlyprintk=serial,ttyS1,19200 
> > BIOS-provided physical RAM map:
> >  BIOS-e820: 0000000000000000 - 000000000009d400 (usable)
> >  BIOS-e820: 000000000009d400 - 00000000000a0000 (reserved)
> >  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> >  BIOS-e820: 0000000000100000 - 000000003ffcddc0 (usable)
> >  BIOS-e820: 000000003ffcddc0 - 000000003ffd0000 (ACPI data)
> >  BIOS-e820: 000000003ffd0000 - 0000000040000000 (reserved)
> >  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> > end_pfn_map = 1048576
> > kernel direct mapping tables up to 100000000 @ 8000-d000
> > DMI 2.3 present.
> > ACPI: RSDP 000FDFC0, 0014 (r0 IBM   )
> > ACPI: RSDT 3FFCFF80, 0034 (r1 IBM    SERBLADE     1000 IBM  45444F43)
> > ACPI: FACP 3FFCFEC0, 0084 (r2 IBM    SERBLADE     1000 IBM  45444F43)
> > ACPI: DSDT 3FFCDDC0, 1EA6 (r1 IBM    SERBLADE     1000 INTL  2002025)
> > ACPI: FACS 3FFCFCC0, 0040
> > ACPI: APIC 3FFCFE00, 009C (r1 IBM    SERBLADE     1000 IBM  45444F43)
> > ACPI: SRAT 3FFCFD40, 0098 (r1 IBM    SERBLADE     1000 IBM  45444F43)
> > ACPI: HPET 3FFCFD00, 0038 (r1 IBM    SERBLADE     1000 IBM  45444F43)
> > SRAT: PXM 0 -> APIC 0 -> Node 0
> > SRAT: PXM 0 -> APIC 1 -> Node 0
> > SRAT: PXM 1 -> APIC 2 -> Node 1
> > SRAT: PXM 1 -> APIC 3 -> Node 1
> > SRAT: Node 0 PXM 0 0-40000000
> > PANIC: early exception rip ffffffff808a7e83 error 0 cr2 5cf0
> > Call Trace:
> >  [<ffffffff808a7e83>] reserve_bootmem_node+0x0/0xc
> >  [<ffffffff808a3a8c>] reserve_bootmem_generic+0x6d/0x99
> >  [<ffffffff8089cc00>] setup_arch+0x2f7/0x469
> >  [<ffffffff80896510>] start_kernel+0x64/0x2aa
> >  [<ffffffff80896148>] _sinittext+0x148/0x14c
> > RIP reserve_bootmem_node+0x0/0xc
> > 
> 
> When I looked closer, nothing seemed to make sense so here is another
> log with stack unwinder, frame pointer etc etc
> 
> PANIC: early exception rip ffffffff8095a873 error 0 cr2 5cf0
> 
> Call Trace:
>  [<ffffffff8020b174>] dump_trace+0xb6/0x3d8
>  [<ffffffff8020b4d2>] show_trace+0x3c/0x52
>  [<ffffffff8020b4fd>] dump_stack+0x15/0x17
>  [<ffffffff802001e7>]
> DWARF2 unwinder stuck at 0xffffffff802001e7
> Leftover inexact backtrace:
>  [<ffffffff8095a873>] reserve_bootmem_node+0x1/0x12
>  [<ffffffff8096114f>] acpi_table_parse+0x4f/0x69
>  [<ffffffff80955f91>] reserve_bootmem_generic+0x73/0x9e
>  [<ffffffff8094edd3>] setup_arch+0x301/0x49c
>  [<ffffffff8094855f>] start_kernel+0x6c/0x30b
>  [<ffffffff80948144>] _sinittext+0x144/0x14b
> 
> RIP reserve_bootmem_node+0x1/0x12
> 
> Will keep kicking.
> 

NODE_DATA(0) was returning NULL because of
x86_64-set-node_possible_map-at-runtime.patch. This patch appears to
break acpi_scan_nodes() so that setup_node_bootmem() never gets called and
node_data[] remains full of zeros. Later on, reserve_bootmem_generic() is
called but with NODE_DATA(0) == NULL, it all goes pear shaped. Author of
patch cc'd.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux