Re: 2.6.18-rc1-mm1

On Sun, 09 Jul 2006 23:22:14 +1200
Reuben Farrelly <[email protected]> wrote:

> 
> 
> On 9/07/2006 9:11 p.m., Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc1/2.6.18-rc1-mm1/
> > 
> > - We're getting a relatively large number of crash reports coming out of the
> >   core sysfs/kobject/driver/bus code, and they're all really hard to diagnose.
> > 
> >   I am suspecting that what's happening is that some registration functions
> >   are failing and the caller is ignoring that failure.  The code proceeds and
> >   crashes much later, in obscure ways.
> > 
> >   All these functions return error codes, and we're not checking them.  We
> >   should.  So there's a patch which marks all these things as __must_check,
> >   which causes around 1,500 new warnings.
> > 
> >   These are all bugs and they all need to be fixed.
> 
> Works.  Well, it boots without crashing here and has been up for 30 or so 
> minutes without incident or so much as a log entry.

Shock.  Have you tested suspend-to-ram and suspend-to-disk?

> I assume that the bulk of those warnings about the return error codes will be 
> largely dealt with by individual maintainers as there are far too many to post here?

I admire your faith in your fellow man.  I'll see what can be done to
reduce the warnings by changing some deregistration/removal API
functions so they return void.  That should remove maybe half of them.

As for the rest I guess we just need to slam that patch into mainline and
start bitching at people.
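
For anyone unsure what the annotation changes in practice, here is a minimal userspace sketch. It assumes that __must_check maps to GCC's warn_unused_result attribute; the demo_* names are hypothetical and are not real driver-core functions.

```c
#include <errno.h>
#include <stddef.h>

/* Assumption: __must_check is GCC's warn_unused_result attribute. */
#define __must_check __attribute__((warn_unused_result))

/* Hypothetical stand-in for a driver-core object; not a real kernel type. */
struct demo_device {
	const char *name;
	int registered;
};

/* Hypothetical registration function: returns 0 or a negative errno. */
__must_check int demo_register(struct demo_device *dev)
{
	if (dev == NULL || dev->name == NULL)
		return -EINVAL;
	dev->registered = 1;
	return 0;
}

/* Removal has nothing useful to report here, so it returns void. */
void demo_unregister(struct demo_device *dev)
{
	if (dev != NULL)
		dev->registered = 0;
}

/* A caller that checks the result and propagates the failure at once. */
int demo_probe(struct demo_device *dev)
{
	int err = demo_register(dev);

	if (err)
		return err;
	return 0;
}
```

A bare `demo_register(&dev);` statement would make GCC warn that the return value is ignored, which is the class of warning the patch produces. Making the removal side return void is what takes those call sites out of the warning count.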


> Some minor problems noted - possibly PCI/ACPI related (read on past the IDE bit 
> if that's not your cup of tea).
> 
> 1. I've disabled the old IDE stuff and enabled Alan's IDE support 
> (CONFIG_SCSI_ATA_GENERIC=y).  But it seems to be a bit unhappy with my IDE CD 
> burner:
> 
> ata_piix 0000:00:1f.1: version 2.00ac5
> ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
> PCI: Setting latency timer of device 0000:00:1f.1 to 64
> ata5: PATA max UDMA/133 cmd 0x1F0 ctl 0x3F6 bmdma 0x30B0 irq 14
> scsi4 : ata_piix
> ata5.00: ATAPI, max UDMA/66
> ata5.00: configured for UDMA/66
> ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata5.00: (BMDMA stat 0x24)
> ata5.00: tag 0 cmd 0xa0 Emask 0x4 stat 0x40 err 0x0 (timeout)
> ata5: soft resetting port
> ata5.00: configured for UDMA/66
> ata5: EH complete
> ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata5.00: (BMDMA stat 0x24)
> ata5.00: tag 0 cmd 0xa0 Emask 0x4 stat 0x40 err 0x0 (timeout)
> ata5: soft resetting port
> ata5.00: configured for UDMA/66
> Losing some ticks... checking if CPU frequency changed.
> ata5: EH complete
> ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata5.00: (BMDMA stat 0x24)
> ata5.00: tag 0 cmd 0xa0 Emask 0x4 stat 0x40 err 0x0 (timeout)
> ata5: soft resetting port
> ata5.00: configured for UDMA/66
> ata5: EH complete
> ata5.00: limiting speed to UDMA/44
> ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata5.00: (BMDMA stat 0x24)
> ata5.00: tag 0 cmd 0xa0 Emask 0x4 stat 0x40 err 0x0 (timeout)
> ata5: soft resetting port
> ata5.00: configured for UDMA/44
> ata5: EH complete
> ata6: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0x30B8 irq 15
> scsi5 : ata_piix
> ata6: port disabled. ignoring.
> ATA: abnormal status 0xFF on port 0x177
> SCSI device sda: 586072368 512-byte hdwr sectors (300069 MB)
> sda: Write Protect is off
> sda: Mode Sense: 00 3a 00 00
> SCSI device sda: drive cache: write back

Alan stuff.

> Note also the message midway through about losing some ticks, which if I recall 
> correctly is not new to this -mm release.  I'm not sure who to cc about this.

John stuff.  I suspect it's natural and normal, if the IDE error handling
did something rude with interrupt holdoff.


> The IDE device obviously ended up not being detected by the system.  Usually 
> this device comes up as:
> 
> Jul  2 12:03:28 tornado kernel: hda: ATAPI 40X DVD-ROM DVD-R CD-R/RW drive, 2000kB Cache, UDMA(66)
> 
> 
> 2. Onto some more minor warnings:
> 
> ACPI: bus type pci registered
> PCI: BIOS Bug: MCFG area at f0000000 is not E820-reserved
> PCI: Not using MMCONFIG.
> PCI: Using configuration type 1
> ACPI: Interpreter enabled
> 
> Is there any way to verify that there really is a BIOS bug there?  If it is, is 
> there anyone within Intel or are there any known contacts who can push and poke 
> to get this looked at/fixed?  (It's a new Intel board, I'd hope they could get 
> it right..).
> 
> Plus we're not using MMCONFIG - even though I have it enabled.

Andi stuff.
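
On the "is it really a BIOS bug" question: going by the wording of the message, the test appears to be whether the MCFG window lies entirely inside regions the BIOS memory map marks as reserved. A rough sketch of that kind of coverage check follows. The demo_* names, the entry layout and the window size are all assumptions for illustration, not the kernel's actual code. You could feed it the BIOS-e820 lines from the top of your dmesg.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region types, loosely modelled on an E820-style map. */
enum demo_e820_type { DEMO_E820_RAM = 1, DEMO_E820_RESERVED = 2 };

/* Hypothetical map entry: one contiguous region of a single type. */
struct demo_e820_entry {
	uint64_t addr;
	uint64_t size;
	enum demo_e820_type type;
};

/*
 * Return 1 if every byte of [start, end) is covered by entries of the
 * given type, 0 otherwise. The map may be unsorted; adjacent entries
 * of the same type are chained together.
 */
int demo_range_all_mapped(const struct demo_e820_entry *map, size_t n,
			  uint64_t start, uint64_t end,
			  enum demo_e820_type type)
{
	size_t i;
	int progressed = 1;

	while (start < end && progressed) {
		progressed = 0;
		for (i = 0; i < n; i++) {
			uint64_t e_start = map[i].addr;
			uint64_t e_end = map[i].addr + map[i].size;

			if (map[i].type != type)
				continue;
			if (e_start <= start && start < e_end) {
				/* Covered up to e_end; look for what follows. */
				start = e_end;
				progressed = 1;
				break;
			}
		}
	}
	return start >= end;
}
```

If a check like this returns 0 for the window starting at f0000000, then the BIOS map really does not reserve that area and the message is at least accurate about that.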

> Based on previous postings to lkml, I believe Randy Dunlap may have one of these 
> boards too - Randy are you seeing this and the next bunch of warnings I am seeing?
> 
> 3. Power Management warnings, been there ages, but I've had bigger things to 
> worry about (like fatal oopses) so haven't bothered asking:
> 
> Device `[PEX0]' is not power manageable
> ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 17 (level, low) -> IRQ 17
> PCI: Setting latency timer of device 0000:00:1c.0 to 64
> Device `[PEX2]' is not power manageable
> ACPI: PCI Interrupt 0000:00:1c.2[C] -> GSI 18 (level, low) -> IRQ 18
> PCI: Setting latency timer of device 0000:00:1c.2 to 64
> Device `[PEX3]' is not power manageable
> ACPI: PCI Interrupt 0000:00:1c.3[D] -> GSI 19 (level, low) -> IRQ 19
> PCI: Setting latency timer of device 0000:00:1c.3 to 64
> Device `[PEX4]' is not power manageable
> ACPI: PCI Interrupt 0000:00:1c.4[A] -> GSI 17 (level, low) -> IRQ 17
> PCI: Setting latency timer of device 0000:00:1c.4 to 64
> Device `[PEX5]' is not power manageable
> ACPI: PCI Interrupt 0000:00:1c.5[B] -> GSI 16 (level, low) -> IRQ 16

ACPI stuff.  I suspect the kernel isn't doing anything wrong here.

> and
> 
> Device `[IDES]' is not power manageable

I don't know what device that is.

> [root@tornado ~]# cat /proc/interrupts
>             CPU0       CPU1
>    0:     258266          0   IO-APIC-edge     timer
>    4:        355          0   IO-APIC-edge     serial
>    6:          5          0   IO-APIC-edge     floppy
>    8:          1          0   IO-APIC-edge     rtc
>    9:          0          0   IO-APIC-fasteoi  acpi
>   14:         28          0   IO-APIC-edge     libata
>   15:          0          0   IO-APIC-edge     libata
>   16:          0          0   IO-APIC-fasteoi  uhci_hcd:usb5
>   18:          0          0   IO-APIC-fasteoi  uhci_hcd:usb4
>   19:        980          0   IO-APIC-fasteoi  uhci_hcd:usb3, serial
>   23:        105          0   IO-APIC-fasteoi  ehci_hcd:usb1, uhci_hcd:usb2
> 313:      82513          0   PCI-MSI-<NULL>  eth0
> 314:      57370          0   PCI-MSI-<NULL>  libata
> NMI:        217        188
> LOC:     258118     257890
> ERR:          0
> MIS:          0
> [root@tornado ~]#
> 
> The full dmesg is up at http://www.reub.net/files/kernel/2.6.18-rc1-mm1.dmesg 
> and config is up at http://www.reub.net/files/kernel/2.6.18-rc1-mm1.config
> 
> Minor issues and possibly most if not all are not of concern, but occasionally 
> supposedly minor things show up much bigger problems when questions are asked 
> and people start poking around :)
> 

Thanks.