On Sun, 23 Oct 2005 10:22:49 -0400 Adam Kropelin wrote:

> Sasa Ostrouska <[email protected]> wrote:
> > Oct 20 03:01:50 rc-vaio kernel: Unable to handle kernel paging request at virtual address f8e43706
> > Oct 20 03:01:50 rc-vaio kernel: printing eip:
> > Oct 20 03:01:50 rc-vaio kernel: c01eaf49
> > Oct 20 03:01:50 rc-vaio kernel: *pde = 01bae067
> > Oct 20 03:01:50 rc-vaio kernel: Oops: 0000 [#1]
> > Oct 20 03:01:50 rc-vaio kernel: PREEMPT
> > Oct 20 03:01:50 rc-vaio kernel: Modules linked in: snd_pcm_oss
> > snd_mixer_oss lp ipv6 uhci_hcd joydev parport_pc parport psmouse pcspkr
> > rtc sis_agp shpchp pci_hotplug i2c_sis96x i2c_core usb_storage
> > snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd
> > snd_page_alloc ohci_hcd ehci_hcd usbcore sis900 ohci1394 ieee1394 tsdev
> > pcmcia firmware_class yenta_socket rsrc_nonstatic pcmcia_core ide_scsi
> > agpgart
> > Oct 20 03:01:50 rc-vaio kernel: CPU: 0
> > Oct 20 03:01:50 rc-vaio kernel: EIP: 0060:[<c01eaf49>] Not tainted VLI
> > Oct 20 03:01:50 rc-vaio kernel: EFLAGS: 00010297 (2.6.14-rc4)
> > Oct 20 03:01:50 rc-vaio kernel: EIP is at vsnprintf+0x369/0x500
> > Oct 20 03:01:50 rc-vaio kernel: eax: f8e43706 ebx: 0000000a ecx: f8e43706 edx: fffffffe
> > Oct 20 03:01:50 rc-vaio kernel: esi: f596e11f edi: 00000000 ebp: f596efff esp: f398ded0
> > Oct 20 03:01:50 rc-vaio kernel: ds: 007b es: 007b ss: 0068
> > Oct 20 03:01:50 rc-vaio kernel: Process grep (pid: 7529, threadinfo=f398c000 task=f6122030)
> > Oct 20 03:01:50 rc-vaio kernel: Stack: 000003e1 00000000 00000010 00000004 00000002 00000001 ffffffff ffffffff
> > Oct 20 03:01:50 rc-vaio kernel: 00000eed f596e113 c0331532 f596e113 f665c380 f665c380 00000113 c017c52f
> > Oct 20 03:01:50 rc-vaio kernel: f398df44 c0330829 f7fe0ca0 c011fcb4 f665c380 c0331520 00000000 c0330829
> > Oct 20 03:01:50 rc-vaio kernel: Call Trace:
> > Oct 20 03:01:50 rc-vaio kernel: [<c017c52f>] seq_printf+0x2f/0x60
> > Oct 20 03:01:50 rc-vaio kernel: [<c011fcb4>] r_show+0x84/0x90
> > Oct 20 03:01:50 rc-vaio kernel: [<c017c0f1>] seq_read+0x221/0x290
> > Oct 20 03:01:50 rc-vaio kernel: [<c015bae7>] vfs_read+0xc7/0x180
> > Oct 20 03:01:50 rc-vaio kernel: [<c015be77>] sys_read+0x47/0x80
> > Oct 20 03:01:50 rc-vaio kernel: [<c0103005>] syscall_call+0x7/0xb
> > Oct 20 03:01:50 rc-vaio kernel: Code: 00 83 cf 01 89 44 24 1c eb bc 8b
> > 44 24 40 8b 54 24 18 83 44 24 40 04 8b 08 b8 fe 14 34 c0 81 f9 ff 0f 00
> > 00 0f 46 c8 89 c8 eb 06 <80> 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 83
> > e7 10 89 c3 75 20
> > Oct 20 03:01:50 rc-vaio kernel: <6>note: grep[7529] exited with preempt_count 1
>
> If I had to guess (and I do) I'd say one of your shutdown scripts tried
> to grep thru something in /proc and the module that once supplied the
> data for that something is gone, without having removed its /proc
> entries. Lacking any particular insight on what module to blame, I'd
> start by disabling various modules and booting cleanly so they never
> load. Binary search your way thru them until you find the culprit.

$ git grep -w r_show
fs/reiserfs/procfs.c:static int r_show(struct seq_file *m, void *v)
fs/reiserfs/procfs.c:	.show = r_show,
kernel/resource.c:static int r_show(struct seq_file *m, void *v)
kernel/resource.c:	.show = r_show,

This does not look like r_show() from reiserfs, because that function
does not call seq_printf() directly, so it must be r_show() from
kernel/resource.c:

static int r_show(struct seq_file *m, void *v)
{
	struct resource *root = m->private;
	struct resource *r = v, *p;
	int width = root->end < 0x10000 ? 4 : 8;
	int depth;

	for (depth = 0, p = r; depth < MAX_IORES_LEVEL; depth++, p = p->parent)
		if (p->parent == root)
			break;
	seq_printf(m, "%*s%0*lx-%0*lx : %s\n",
			depth * 2, "",
			width, r->start,
			width, r->end,
			r->name ? r->name : "<BAD>");
	return 0;
}

This function is responsible for /proc/ioports and /proc/iomem.
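
To make the format string easier to read, here is a small user-space sketch of the same formatting. It is not kernel code: format_resource_line() is a hypothetical helper name, and snprintf() stands in for seq_printf(). It only shows how depth and width shape one line of /proc/ioports or /proc/iomem, and that the name pointer is followed only by the final %s.

```c
#include <stdio.h>

/*
 * Hypothetical user-space stand-in for the seq_printf() call in r_show().
 * depth * 2 is the indentation, width is the number of hex digits.
 */
static int format_resource_line(char *buf, size_t len, int depth, int width,
				unsigned long start, unsigned long end,
				const char *name)
{
	/*
	 * The NULL test mirrors r_show(). It catches a missing name, but
	 * it cannot catch a non-NULL pointer into memory that has since
	 * been freed; such a pointer is only followed while %s is expanded.
	 */
	return snprintf(buf, len, "%*s%0*lx-%0*lx : %s\n",
			depth * 2, "",
			width, start,
			width, end,
			name ? name : "<BAD>");
}
```

With depth 0, width 4, the range 0x3f8-0x3ff and the name "serial", this yields "03f8-03ff : serial"; a NULL name at depth 1 yields a line indented by two spaces and ending in "<BAD>".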
The first parameters of seq_printf() on the stack were:

	f665c380	m
	c0331520	"%*s%0*lx-%0*lx : %s\n"
	00000000	depth * 2
	c0330829	""

(unfortunately, no more information is available in the stack dump).

None of them looks like the bad pointer (f8e43706). That address is in
both eax and ecx, and the faulting instruction (80 38 00, i.e.
cmpb $0x0,(%eax)) is part of a byte-by-byte scan for a terminating NUL,
which fits vsnprintf() measuring a %s argument. So the most likely
culprit is r->name - probably some module set the resource name to a
string constant and was then unloaded without performing the proper
cleanup. And depth == 0 means that the problematic resource most likely
did not belong to a PCI device - maybe it was some legacy resource.

You have the list of modules which were loaded at oops time (see
"Modules linked in:" above); please also show the lsmod output obtained
while the system is working normally - then we can find which modules
had been unloaded and investigate those more closely.
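
The comparison asked for above can be done by eye, but a rough sketch may help if the lists are long. This is only an illustration under some assumptions: both lists are taken as plain space-separated module names (the first column of lsmod, and the "Modules linked in:" text with the wrapping removed), and list_has() and print_unloaded() are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` occurs as a whole word in the space-separated `list`. */
static int list_has(const char *list, const char *name)
{
	size_t n = strlen(name);
	const char *p = list;

	while ((p = strstr(p, name)) != NULL) {
		int starts = (p == list) || (p[-1] == ' ');
		int ends = (p[n] == '\0') || (p[n] == ' ');

		if (starts && ends)
			return 1;
		p++;	/* partial match such as "snd" in "snd_pcm" */
	}
	return 0;
}

/*
 * Print every module named in `running` (normal operation) that is
 * missing from `at_oops`, i.e. a candidate for "unloaded before the
 * oops". Returns the number of candidates printed.
 */
static int print_unloaded(const char *running, const char *at_oops)
{
	char name[64];
	int count = 0;
	const char *p = running;

	while (*p) {
		size_t len = strcspn(p, " ");

		if (len > 0 && len < sizeof(name)) {
			memcpy(name, p, len);
			name[len] = '\0';
			if (!list_has(at_oops, name)) {
				printf("%s\n", name);
				count++;
			}
		}
		p += len;
		while (*p == ' ')
			p++;
	}
	return count;
}
```

Calling print_unloaded() with the lsmod names as the first argument and the oops-time list as the second prints the modules worth a closer look. The whole-word check matters here, because names such as snd and snd_pcm share a prefix.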