Once the bus is properly terminated, all the problems are solved. Except that if the other machine ever goes down, I get SCSI bus problems, which will pretty much kill the other machine.
doh.
On Fri, 2004-07-16 at 12:05 +0900, Naoki wrote:
This is not good, but I don't know why it's happening.

[root@xxxxxxxxxxxxxxxxxxxxx root]# cat /proc/scsi/aic79xx/1
Adaptec AIC79xx driver version: 1.3.11
Adaptec 39320D Ultra320 SCSI adapter
aic7902: Ultra320 Wide Channel A, SCSI Id=6, PCI-X 67-100Mhz, 512 SCBs
Allocated SCBs: 4, SG List Length: 128

Serial EEPROM:
0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8
0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8 0x17c8
0x09f4 0x01c7 0x2806 0x0010 0xffff 0xffff 0xffff 0xffff
0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0x0410 0xb457

Target 0 Negotiation Settings
        User: 320.000MB/s transfers (160.000MHz DT|IU|QAS, 16bit)
        Goal: 3.300MB/s transfers
        Curr: 3.300MB/s transfers
        Transmission Errors 0
        Channel A Target 0 Lun 0 Settings
                Commands Queued 15
                Commands Active 0
                Command Openings 4
                Max Tagged Openings 4
                Device Queue Frozen Count 0

-n.

Naoki wrote:
> I tried changing the order in /etc/modprobe.conf from this :
>
> alias scsi_hostadapter aic79xx
> alias scsi_hostadapter1 mptbase
> alias scsi_hostadapter2 mptscsih
>
> To this :
>
> alias scsi_hostadapter mptbase
> alias scsi_hostadapter1 mptscsih
> alias scsi_hostadapter2 aic79xx
>
> But that had no effect. Then I opened up the kernel's initrd file and
> the order of the modules on the filesystem was :
>
> [root@xxxxxxxxxxxxxxxxxxxxx lib]# ls -lrt
> total 1011
> -rwxr--r--  1 root root  19996 Jul  1 21:48 sd_mod.ko
> -rwxr--r--  1 root root 125928 Jul  1 21:48 scsi_mod.ko
> -rwxr--r--  1 root root 229628 Jul  1 21:48 aic79xx.ko
> -rwxr--r--  1 root root  49188 Jul  1 21:48 mptscsih.ko
> -rwxr--r--  1 root root  53148 Jul  1 21:48 mptbase.ko
> -rwxr--r--  1 root root 545724 Jul  1 21:48 xfs.ko
>
> So I removed the aic*.ko file and that sort of worked, now my machine
> boots with all controllers attached and boots from the correct device.
> Obviously because the other controller was never probed.
>
> Now my next questions are a) How to change the order of loaded modules
> in the initrd ( other than changing the physical layout ),
> b) How to correctly load the driver after boot ( modprobe aic79xx from
> rc.local?).
>
> Now that I'm trying to modprobe it I'm seeing some odd errors and the
> kernel messages :
>
> scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
> <Adaptec 39320D Ultra320 SCSI adapter>
> aic7902: Ultra320 Wide Channel A, SCSI Id=6, PCI-X 67-100Mhz,
> 512 SCBs
>
> (scsi1:A:0): 320.000MB/s transfers (160.000MHz DT|IU|QAS, 16bit)
> (scsi1:A:0:0): Unexpected busfree in Message-out phase, 1 SCBs
> aborted, PRGMCNT == 0x1b0
> >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
> scsi1: Dumping Card State at program address 0x1ae Mode 0x33
> Card was paused
> HS_MAILBOX[0x0] INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x11]
> DFFSTAT[0x30] SCSISIGI[0x0] SCSIPHASE[0x0] SCSIBUS[0x0]
> LASTPHASE[0xa0] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x10]
> SEQINTCTL[0x0] SEQ_FLAGS[0x40] SEQ_FLAGS2[0x0] SSTAT0[0x0]
> SSTAT1[0x8] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0xc8]
> SIMODE1[0xac] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0]
> LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x0]
>
> SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0x3 CURRSCB 0x3 NEXTSCB 0x0
> qinstart = 1 qinfifonext = 1
> QINFIFO:
> WAITING_TID_QUEUES:
> Pending list:
> Total 0
> Kernel Free SCB list: 3 2 1 0
> Sequencer Complete DMA-inprog list:
> Sequencer Complete list:
> Sequencer DMA-Up and Complete list:
>
> scsi1: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0
> SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
> SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
> SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
> HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
> scsi1: FIFO1 Free, LONGJMP == 0x80ff, SCB 0x0
> SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
> SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
> SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
> HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
> LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
> 0x0 0x0 0x0 0x0
> scsi1: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42
> scsi1: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
> SIMODE0[0xc]
> CCSCBCTL[0x4]
> scsi1: REG0 == 0x3, SINDEX = 0x80, DINDEX = 0x0
> scsi1: SCBPTR == 0x3, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xff29
> CDB 3b a 0 0 0 0
> STACK: 0x125 0x0 0x0 0x0 0x0 0x0 0x0 0x29
> <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
> DevQ(0:0:0): 0 waiting
> scsi1: Transmission error detected
> LQISTAT1[0x0] LASTPHASE[0xe0] SCSISIGI[0x60] PERRDIAG[0xd1]
> >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
> scsi1: Dumping Card State at program address 0x1af Mode 0x11
> Card was paused
> HS_MAILBOX[0x0] INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x11]
> DFFSTAT[0x11] SCSISIGI[0x74] SCSIPHASE[0x2] SCSIBUS[0x0]
> LASTPHASE[0xe0] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x10]
> SEQINTCTL[0x0] SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] SSTAT0[0x2]
> SSTAT1[0x11] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0]
> SIMODE1[0xac] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0]
> LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x80]
>
> SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0x3 CURRSCB 0x3 NEXTSCB 0x0
> qinstart = 16 qinfifonext = 16
> QINFIFO:
> WAITING_TID_QUEUES:
> Pending list:
> 3 FIFO_USE[0x0] SCB_CONTROL[0x40] SCB_SCSIID[0x6]
> Total 1
> Kernel Free SCB list: 2 1 0
> Sequencer Complete DMA-inprog list:
> Sequencer Complete list:
> Sequencer DMA-Up and Complete list:
>
> scsi1: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0
> SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
> SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
> SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
> HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x0]
> scsi1: FIFO1 Active, LONGJMP == 0x80ff, SCB 0x3
> SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x4] DFSTATUS[0x88]
> SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
> SOFFCNT[0x0] MDFFSTAT[0x4] SHADDR = 0x00, SHCNT = 0x0
> HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
> LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
> 0x0 0x0 0x0 0x0
> scsi1: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42
> scsi1: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
> SIMODE0[0xc]
> CCSCBCTL[0x4]
> scsi1: REG0 == 0x3, SINDEX = 0x111, DINDEX = 0x1ba
> scsi1: SCBPTR == 0x3, SCB_NEXT == 0xffc0, SCB_NEXT2 == 0xff29
> CDB 12 0 0 0 24 0
> STACK: 0xe2 0x125 0x0 0x0 0x0 0x0 0x0 0xa7
> <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
> DevQ(0:0:0): 0 waiting
> scsi1:0:0:0: Attempting to abort cmd 04246bb4: 0x12 0x0 0x0 0x0 0x24 0x0
> scsi1: At time of recovery, card was not paused
> >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
> scsi1: Dumping Card State at program address 0x94 Mode 0x11
> Card was paused
> HS_MAILBOX[0x0] INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x11]
> DFFSTAT[0x11] SCSISIGI[0x74] SCSIPHASE[0x2] SCSIBUS[0x0]
> LASTPHASE[0x60] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x10]
> SEQINTCTL[0x80] SEQ_FLAGS[0x20] SEQ_FLAGS2[0x0] SSTAT0[0x0]
> SSTAT1[0x1] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0]
> SIMODE1[0xac] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0]
> LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x80]
>
> SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0x3 CURRSCB 0x3 NEXTSCB 0x0
> qinstart = 16 qinfifonext = 16
> QINFIFO:
> WAITING_TID_QUEUES:
> Pending list:
> 3 FIFO_USE[0x0] SCB_CONTROL[0x40] SCB_SCSIID[0x6]
> Total 1
> Kernel Free SCB list: 2 1 0
> Sequencer Complete DMA-inprog list:
> Sequencer Complete list:
> Sequencer DMA-Up and Complete list:
>
> scsi1: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0
> SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
> SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
> SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
> HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x0]
> scsi1: FIFO1 Active, LONGJMP == 0x80ff, SCB 0x3
> SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x28] DFSTATUS[0x80]
> SG_CACHE_SHADOW[0xa] SG_STATE[0x0] DFFSXFRCTL[0x0]
> SOFFCNT[0x0] MDFFSTAT[0xc] SHADDR = 0x046bee4, SHCNT = 0x24
> HADDR = 0x046bee4, HCNT = 0x24 CCSGCTL[0x10]
> LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
> 0x0 0x0 0x0 0x0
> scsi1: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42
> scsi1: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
> SIMODE0[0xc]
> CCSCBCTL[0x4]
> scsi1: REG0 == 0x3, SINDEX = 0x122, DINDEX = 0x1ba
> scsi1: SCBPTR == 0x3, SCB_NEXT == 0xffc0, SCB_NEXT2 == 0xff29
> CDB 12 0 0 80 8 3c
> STACK: 0x29 0x206 0x125 0x0 0x0 0x0 0x0 0x0
> <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
> DevQ(0:0:0): 0 waiting
> scsi1:0:0:0: Device is active, asserting ATN
> Recovery code sleeping
> Recovery code awake
> Timer Expired
> Recovery code sleeping
> Recovery code awake
> Timer Expired
> scsi1: Device reset returning 0x2003
> Recovery SCB completes
> Recovery SCB completes
> Vendor: SA-8630 Model: Rev: R0.0
> Type: Direct-Access ANSI SCSI revision: 03
> (scsi1:A:0): 3.300MB/s transfers
> scsi1:A:0:0: Tagged Queuing enabled. Depth 4
> SCSI device sdb: 3914432512 512-byte hdwr sectors (2004189 MB)
> SCSI device sdb: drive cache: write back
> sdb:
> Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
> Attached scsi generic sg2 at scsi1, channel 0, id 0, lun 0, type 0
> scsi2 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
> <Adaptec 39320D Ultra320 SCSI adapter>
> aic7902: Ultra320 Wide Channel B, SCSI Id=6, PCI-X 67-100Mhz,
> 512 SCBs
>
> [root@xxxxxxxxxxxxxxxxxxxxx root]# uname -a
> Linux localhost.localdomain 2.6.7-1.488smp #1 SMP Wed Jul 14 10:02:03
> EDT 2004 i686 i686 i386 GNU/Linux
>
>
> -n.
>
> Naoki wrote:
>
>> Excellent Jeff, I'll try that out!
>>
>>
>>> You should be able to do similar for the scsi controllers, since they
>>> have different drivers and specify something like
>>> alias scsi0 LSImodule
>>> alias scsi1 Adaptecmodule
>>> or one of the other tricks using an install line in modprobe.conf to
>>> make sure the LSI gets the driver loaded first and Linux sees it as
>>> scsi0.
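
For anyone chasing the same symptom (a target that is allowed 320.000MB/s but ends up negotiated at 3.300MB/s), here is a rough sketch of how the "Negotiation Settings" part of the /proc/scsi/aic79xx/1 output quoted in this thread could be checked from a script. It is only an illustration, not anything the driver ships: the function names parse_negotiation and degraded_targets and the 0.5 threshold are made up for this example, and the parsing assumes the "User:", "Goal:" and "Curr:" lines look exactly like the ones pasted above.

```python
import re

# Line patterns taken from the /proc/scsi/aic79xx output quoted in this thread.
TARGET_RE = re.compile(r"^\s*Target (\d+) Negotiation Settings")
RATE_RE = re.compile(r"^\s*(User|Goal|Curr):\s*([0-9.]+)MB/s transfers")


def parse_negotiation(text):
    """Return {target_id: {"User": MB/s, "Goal": MB/s, "Curr": MB/s}}.

    Hypothetical helper: keys are only present if the matching line exists.
    """
    targets = {}
    current = None
    for line in text.splitlines():
        m = TARGET_RE.match(line)
        if m:
            current = targets.setdefault(int(m.group(1)), {})
            continue
        m = RATE_RE.match(line)
        if m and current is not None:
            current[m.group(1)] = float(m.group(2))
    return targets


def degraded_targets(targets, ratio=0.5):
    """List target ids whose current rate is below ratio * user rate.

    The 0.5 default is an arbitrary threshold chosen for this example.
    """
    return sorted(
        tid
        for tid, rates in targets.items()
        if "User" in rates and "Curr" in rates
        and rates["Curr"] < ratio * rates["User"]
    )


# Excerpt copied from the output earlier in the thread.
SAMPLE = """\
Target 0 Negotiation Settings
        User: 320.000MB/s transfers (160.000MHz DT|IU|QAS, 16bit)
        Goal: 3.300MB/s transfers
        Curr: 3.300MB/s transfers
"""

if __name__ == "__main__":
    rates = parse_negotiation(SAMPLE)
    for tid in degraded_targets(rates):
        r = rates[tid]
        print(f"target {tid}: running at {r['Curr']} MB/s, "
              f"configured for {r['User']} MB/s")
```

On a live box the text would come from reading /proc/scsi/aic79xx/1 (or whichever host number the card gets) instead of SAMPLE. A flagged target only says that negotiation fell back; it does not say whether cabling, termination or the device is to blame.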