Re: bad pmd filemap.c, oops; 2.4.30 and 2.4.32

I'm starting to suspect bad hardware. Booting now hangs (with 2.4.27, 2.4.30 and 2.4.32) after the SCSI drivers load:


.....

Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
Uniform Multi-Platform E-IDE driver Revision: 7.00beta4-2.4

ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx

hda: TEAC CD-ROM CD-224E, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: attached ide-cdrom driver.
hda: ATAPI 24X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.12
SCSI subsystem driver Revision: 1.00
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
        <Adaptec 3960D Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
        <Adaptec 3960D Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs

scsi2 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
        <Adaptec aic7899 Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs

blk: queue f7e46018, I/O limit 4095Mb (mask 0xffffffff)


If I wait several minutes (around 10 or 15 minutes), I get:

scsi0:0:0:0: Attempting to queue an ABORT message
CDB: 0x12 0x0 0x0 0x0 0xff 0x0
scsi0:0:0:0: Command already completed
aic7xxx_abort returns 0x2002
scsi0:0:0:0: Attempting to queue an ABORT message
CDB: 0x0 0x0 0x0 0x0 0x0 0x0
scsi0:0:0:0: Command already completed
aic7xxx_abort returns 0x2002
scsi0:0:0:0: Attempting to queue a TARGET RESET message
CDB: 0x12 0x0 0x0 0x0 0xff 0x0
scsi0:0:0:0: Is not an active device
aic7xxx_dev_reset returns 0x2002
scsi0:0:0:0: Attempting to queue an ABORT message
CDB: 0x0 0x0 0x0 0x0 0x0 0x0
scsi0:0:0:0: Command already completed
aic7xxx_abort returns 0x2002
scsi0:0:0:0: Attempting to queue an ABORT message
CDB: 0x0 0x0 0x0 0x0 0x0 0x0
scsi0:0:0:0: Command already completed
aic7xxx_abort returns 0x2002
scsi: device set offline - not ready or command retry failed after bus reset: host 0 channel 0 id 0 lun 0

The messages repeated for all 15 targets on scsi0. It's looking like it will repeat for scsi1 as well.
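
As a reading aid, the CDB lines in the abort messages can be decoded by opcode: 0x12 is INQUIRY and 0x00 is TEST UNIT READY, so these are the ordinary bus-scan probes rather than reads or writes. A minimal Python sketch follows, covering only the two opcodes seen in this log; the helper name decode_cdb is made up for illustration and is not part of the aic7xxx driver or any tool.

```python
# Opcode names from the SCSI Primary Commands set. Only the two opcodes
# that appear in the log above are listed; anything else is "unknown".
SCSI_OPCODES = {
    0x00: "TEST UNIT READY",
    0x12: "INQUIRY",
}


def decode_cdb(text):
    """Decode a 6-byte CDB printed as '0x12 0x0 0x0 0x0 0xff 0x0'.

    Hypothetical helper for illustration only.
    """
    cdb = [int(tok, 16) for tok in text.split()]
    if len(cdb) != 6:
        raise ValueError("expected a 6-byte CDB")
    name = SCSI_OPCODES.get(cdb[0], "unknown opcode 0x%02x" % cdb[0])
    if cdb[0] == 0x12:
        # For INQUIRY, byte 4 carries the allocation length, i.e. how many
        # bytes of inquiry data the host is prepared to receive.
        return "%s, allocation length %d" % (name, cdb[4])
    return name


for line in ("0x12 0x0 0x0 0x0 0xff 0x0", "0x0 0x0 0x0 0x0 0x0 0x0"):
    print(decode_cdb(line))
```

Both commands are ones a healthy, connected target answers almost immediately, so timeouts on them say nothing about the data on the disks.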

How likely is it that a failing SCSI controller contributed to the other problems I was seeing?



-Chris

On Fri, 30 Dec 2005, Chris Stromsoe wrote:

I oopsed again last night with an identical EIP and Call Trace to the oops from the 28th. The new oops is below, the prior below that. I'm going to reboot the machine into UP and see if that helps.


-Chris

Unable to handle kernel paging request at virtual address c211ce80
c0259bb3
*pde = 020001e3
Oops: 0002
CPU:    2
EIP:    0010:[alloc_skb+275/480]    Not tainted
EFLAGS: 00010282
eax: c211ce80   ebx: f5303680   ecx: f7eeb780   edx: 00000680
esi: 000001f0   edi: 00000000   ebp: d348ddf0   esp: d348dddc
ds: 0018   es: 0018   ss: 0018
Process innfeed (pid: 25080, stackpage=d348d000)

Stack: 000006bc 000001f0 ebabc980 eb0e64d8 eb0e6400 d348de68 c027b50b 00000680
       000001f0 000005a8 00000000 d348de54 00000000 00000000 00000001 00000000
       012815b5 00000000 00000000 d7a160a0 d348c000 636686ac 000c3dec 000087c0
Call Trace: [tcp_sendmsg+2619/4512] [inet_sendmsg+65/80] [sock_sendmsg+102/176]
  [sock_readv_writev+116/176] [sock_writev+79/96]

Code: c7 00 01 00 00 00 8b 83 8c 00 00 00 c7 40 04 00 00 00 00 8b
Using defaults from ksymoops -t elf32-i386 -a i386

eax; c211ce80 <_end+1d3b380/38650560>
ebx; f5303680 <_end+34f21b80/38650560>
ecx; f7eeb780 <_end+37b09c80/38650560>
ebp; d348ddf0 <_end+130ac2f0/38650560>
esp; d348dddc <_end+130ac2dc/38650560>


Code;  00000000 Before first symbol
00000000 <_EIP>:
Code;  00000000 Before first symbol
  0:   c7 00 01 00 00 00         movl   $0x1,(%eax)
Code;  00000006 Before first symbol
  6:   8b 83 8c 00 00 00         mov    0x8c(%ebx),%eax
Code;  0000000c Before first symbol
  c:   c7 40 04 00 00 00 00      movl   $0x0,0x4(%eax)
Code;  00000013 Before first symbol
 13:   8b 00                     mov    (%eax),%eax
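
As a reading aid for the "Oops: 0002" line above: on i386 that value is the page-fault error code, and its low three bits give the cause, the access type, and the CPU mode. A minimal Python sketch of the decode follows; the function name decode_i386_fault is made up for illustration and is not part of the kernel or ksymoops.

```python
def decode_i386_fault(error_code):
    """Decode the low three bits of an i386 page-fault error code.

    Hypothetical helper for illustration only.
    """
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if error_code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if error_code & 0x2 else "read",
        # bit 2: 0 = fault taken in kernel mode, 1 = user mode
        "mode": "user" if error_code & 0x4 else "kernel",
    }


# The oops prints the code in hex, e.g. "Oops: 0002".
info = decode_i386_fault(int("0002", 16))
print(info["cause"], "/", info["access"], "/", info["mode"])
```

For 0002 this gives a kernel-mode write to a not-present page, which matches the first decoded instruction (movl $0x1,(%eax)) with eax equal to the faulting address c211ce80.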


On Wed, 28 Dec 2005, Chris Stromsoe wrote:

Unable to handle kernel paging request at virtual address c22eee80
c0259bb3
*pde = 020001e3
Oops: 0002
CPU:    2
EIP:    0010:[alloc_skb+275/480]    Not tainted
EFLAGS: 00010282
eax: c22eee80   ebx: ccbdb480   ecx: 000006bc   edx: 00000680
esi: 000001f0   edi: 00000000   ebp: f663bdf0   esp: f663bddc
ds: 0018   es: 0018   ss: 0018
Process innfeed (pid: 526, stackpage=f663b000)

Stack: 000006bc 000001f0 ccbdb080 00000000 f7185800 f663be68 c027b50b 00000680
       000001f0 000005a8 00000000 f663be54 00000000 00000287 d84bec38 d84bec34
       d84bec54 f663a000 00000000 d5fbd8a0 f663a000 586d4438 0002c774 000005a8
Call Trace: [tcp_sendmsg+2619/4512] [inet_sendmsg+65/80] [sock_sendmsg+102/176]
  [sock_readv_writev+116/176] [sock_writev+79/96]

Code: c7 00 01 00 00 00 8b 83 8c 00 00 00 c7 40 04 00 00 00 00 8b
Using defaults from ksymoops -t elf32-i386 -a i386

eax; c22eee80 <_end+1f0d380/38650560>
ebx; ccbdb480 <_end+c7f9980/38650560>
ebp; f663bdf0 <_end+3625a2f0/38650560>
esp; f663bddc <_end+3625a2dc/38650560>


Code;  00000000 Before first symbol
00000000 <_EIP>:
Code;  00000000 Before first symbol
  0:   c7 00 01 00 00 00         movl   $0x1,(%eax)
Code;  00000006 Before first symbol
  6:   8b 83 8c 00 00 00         mov    0x8c(%ebx),%eax
Code;  0000000c Before first symbol
  c:   c7 40 04 00 00 00 00      movl   $0x0,0x4(%eax)
Code;  00000013 Before first symbol
 13:   8b 00                     mov    (%eax),%eax
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
