Re: 2.6.17-rc5-mm2 — Linux Kernel

On Thu, 01 Jun 2006 21:34:37 +1200
Reuben Farrelly <[email protected]> wrote:

> 
> 
> On 1/06/2006 8:48 p.m., Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm2/
> > 
> > 
> > - A cfq bug was fixed in mainline, so the git-cfq tree has been restored.
> > 
> > - Various lock-validator and genirq fixes have been added.  Should be
> >   slightly less oopsy than 2.6.17-rc5-mm1.
> > 
> > - I just realised that I've been accidentally not updating the PCI tree for
> >   a while.  Will be restored in next -mm.
> > 
> > - Has been booted and has passed various stress-tests on quad x86_64,
> >   quad ancient-Xeon, quad power4, quad ia64, dual old-PIII and a modern
> >   pentium-M laptop.  So if it breaks, it's your fault.
> 
> What an optimist if ever I've seen one ;)

Dammit.

> ...
>
> Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> ICH7: IDE controller at PCI slot 0000:00:1f.1
> ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 177
> ICH7: chipset revision 1
> ICH7: not 100% native mode: will probe irqs later
>      ide0: BM-DMA at 0x30b0-0x30b7, BIOS settings: hda:DMA, hdb:pio
> hda: PIONEER DVD-RW DVR-111D, ATAPI CD/DVD-ROM drive
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> ACPI (acpi_bus-0192): Device `IDES]is not power manageable [20060310]
> ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 185
> ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
> ahci 0000:00:1f.2: flags: 64bit ncq led clo pio slum part
> ata1: SATA max UDMA/133 cmd 0xFFFFC20000016100 ctl 0x0 bmdma 0x0 irq 58
> ata2: SATA max UDMA/133 cmd 0xFFFFC20000016180 ctl 0x0 bmdma 0x0 irq 58
> ata3: SATA max UDMA/133 cmd 0xFFFFC20000016200 ctl 0x0 bmdma 0x0 irq 58

I assume you're using the ahci driver here.

> ata4: SATA max UDMA/133 cmd 0xFFFFC20000016280 ctl 0x0 bmdma 0x0 irq 58
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
>   [<0000000000000000>]
> PGD 0
> Oops: 0010 [1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> Pid: 0, comm: idle Not tainted 2.6.17-rc5-mm2 #2
> RIP: 0010:[<0000000000000000>]  [<0000000000000000>]
> RSP: 0000:ffffffff80660f98  EFLAGS: 00010006
> RAX: 0000000000003a00 RBX: ffffffff8090dec8 RCX: 0000000000000000
> RDX: ffffffff8090dec8 RSI: ffffffff808fe100 RDI: 000000000000003a
> RBP: ffffffff80660fb0 R08: 0000000000000001 R09: ffffffff802676aa
> R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000003a
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffffffff808fa000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
> Process idle (pid: 0, threadinfo ffffffff8090c000, task ffffffff80593760)
> Stack: ffffffff80270132 ffffffff8025dbb1 ffffffff8094e084 ffffffff8090def0
>         ffffffff802641a9  <EOI> ff6500005d4be8fa 65c900000020250c 00000010250c8b48
>         f700001fd8e98148 7400000003582444
> Call Trace:
> 

And we did a jump-to-zero.  I suspect the sata changes.
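
For reference, the "Oops: 0010" line together with RIP and CR2 both being
zero is what points at a jump-to-zero.  The sketch below is not kernel code.
It assumes the architectural x86 page-fault error-code bit layout, and the
helper names (looks_like_jump_to_zero, fault_summary) are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* x86 page-fault error-code bits (architectural layout). */
#define PF_PROT   0x01u  /* 0: page not present, 1: protection violation */
#define PF_WRITE  0x02u  /* 0: read access, 1: write access */
#define PF_USER   0x04u  /* 0: kernel mode, 1: user mode */
#define PF_RSVD   0x08u  /* reserved bit set in a page-table entry */
#define PF_INSTR  0x10u  /* the faulting access was an instruction fetch */

/* Hypothetical helper: true when the fault looks like a call through a NULL
 * function pointer, i.e. a kernel-mode instruction fetch where the faulting
 * address (CR2) equals the instruction pointer and both are zero. */
static bool looks_like_jump_to_zero(unsigned err, unsigned long long rip,
                                    unsigned long long cr2)
{
    return (err & PF_INSTR) && !(err & PF_USER) && rip == 0 && cr2 == rip;
}

/* Hypothetical helper: write a short description of the error code into buf.
 * Returns the snprintf() result (length the full string needs). */
static int fault_summary(unsigned err, char *buf, size_t len)
{
    const char *access = (err & PF_INSTR) ? "instruction fetch"
                       : (err & PF_WRITE) ? "write" : "read";

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s, %s, %s mode", access,
                    (err & PF_PROT) ? "protection violation"
                                    : "page not present",
                    (err & PF_USER) ? "user" : "kernel");
}
```

With the values from the report (error code 0x0010, RIP 0, CR2 0),
looks_like_jump_to_zero() returns true and fault_summary() produces
"instruction fetch, page not present, kernel mode".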

Is this the mysterious missing ->mode_filter, perhaps?  I don't think so -
we test for NULL there.

Should ahci.c have a data_xfer vector?  Right now it's left at NULL.
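
To illustrate the difference between the two cases, here is a minimal
sketch.  It does not use the real libata structures: struct port_ops, the
signatures and the helper names are invented, and only the member names
mode_filter and data_xfer are borrowed from the discussion above.  An
optional hook that the caller tests for NULL is harmless when missing; a
hook called unconditionally becomes a jump to address zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for a driver operations table. */
struct port_ops {
    unsigned (*mode_filter)(unsigned mask);            /* optional hook */
    int (*data_xfer)(unsigned char *buf, size_t len);  /* left NULL below */
};

/* Optional hook: the caller tests for NULL, so a driver that does not
 * provide one simply gets the mask back unchanged. */
static unsigned apply_mode_filter(const struct port_ops *ops, unsigned mask)
{
    assert(ops != NULL);
    if (ops->mode_filter)
        return ops->mode_filter(mask);
    return mask;
}

/* Guarded call: report an error instead of calling through a NULL pointer.
 * Without the check, ops->data_xfer(buf, len) would jump to address zero. */
static int do_data_xfer(const struct port_ops *ops, unsigned char *buf,
                        size_t len)
{
    assert(ops != NULL);
    if (!ops->data_xfer)
        return -1;
    return ops->data_xfer(buf, len);
}

/* A table that sets neither hook.  Members left out of a static
 * initializer are zero-initialized, so both pointers are NULL. */
static const struct port_ops sketch_ops = { .mode_filter = NULL };
```

Here apply_mode_filter(&sketch_ops, mask) returns mask unchanged and
do_data_xfer(&sketch_ops, buf, len) returns -1.  Whether the real fix is a
guard at the call site or a vector filled in by the driver is a separate
question.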

> 
> Code:  Bad RIP value.
> RIP  [<0000000000000000>] RSP <ffffffff80660f98>
> CR2: 0000000000000000
>   <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
>   BUG: warning at kernel/lockdep.c:1853/trace_hardirqs_on()
> 
> Call Trace:
>    [<ffffffff8026e6ed>] show_trace+0xad/0x225
>    [<ffffffff8026e87a>] dump_stack+0x15/0x1b
>    [<ffffffff802a05da>] trace_hardirqs_on+0xa1/0x124
>    [<ffffffff80276fec>] smp_send_stop+0x4c/0x68
>    [<ffffffff8028a491>] panic+0xa7/0x220
>    [<ffffffff80216376>] do_exit+0x74/0x94f
>    [<ffffffff8020b195>] do_page_fault+0x895/0x9c4
>    [<ffffffff802649dd>] error_exit+0x0/0x8e
> Rebooting in 60 seconds..BUG: warning at kernel/panic.c:114/panic()
> 

And here we collapsed instead of generating a backtrace.  Both Ingo and the
x86_64 guys have been playing with the backtrace code.

> 
> Hardware posted at http://www.reub.net/files/kernel/system-hardware

A .config would be useful too.

> Box has MSI capabilities and MSI compiled in.
> 

Hopefully MSI is fixed now.
