hello, we are experiencing problems with the new qlogic driver in 2.6.12 on a set of servers with qla2310 HBAs. The problem is as follows: The Infotrend storage array we are using has two controllers, each of them has two virtual discs with a couple of partitions exported as shared storage. The controllers are linked inside of the storage box, each controller has one qlogic fabric switch attached, and half of the servers are connected to the lefthand switch, the other half is connected to the righthand switch. Now, with the qlogic driver in 2.6.11.12, we can access all shares on both controllers from every server, while the new driver allows only access to the respective controller where the switch is attached to directly, thus depriving the servers of half of it's shared storage devices. Example: on server s05, we have a boot device (lun 3 on primary controller), and 2 shared storages (lun 9 on primary, lun 10 on secondary controller). With 2.6.11.12, this looks as follows: s05:~# cat /proc/scsi/scsi Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 03 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-Access ANSI SCSI revision: 03 Host: scsi0 Channel: 00 Id: 00 Lun: 09 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-Access ANSI SCSI revision: 03 Host: scsi0 Channel: 00 Id: 01 Lun: 10 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-Access ANSI SCSI revision: 03 and the driver sees everything: s05:~# cat /proc/scsi/qla2xxx/0 QLogic PCI to Fibre Channel Host Adapter for QLA2310: Firmware version 3.03.08 IPX, Driver version 8.00.02b4-k ISP: ISP2300, Serial# R74545 Request Queue = 0xcf940000, Response Queue = 0xcf980000 Request Queue count = 2048, Response Queue count = 512 Total number of active commands = 0 Total number of interrupts = 1117762 Device queue depth = 0x20 Number of free request entries = 964 Number of mailbox timeouts = 0 Number of ISP aborts = 0 Number of loop resyncs = 0 Number of retries for empty slots = 0 Number of reqs in pending_q= 0, retry_q= 0, done_q= 0, scsi_retry_q= 0 Host adapter:loop state = <READY>, flags = 0x1a03 Dpc flags = 0x0 MBX flags = 0x0 Link down Timeout = 030 Port down retry = 030 Login retry count = 030 Commands retried with dropped frame(s) = 0 Product ID = 4953 5020 2020 0001 SCSI Device Information: scsi-qla0-adapter-node=200000e08b1bd113; scsi-qla0-adapter-port=210000e08b1bd113; scsi-qla0-target-0=210000d023800002; scsi-qla0-target-1=210000d023600002; SCSI LUN Information: (Id:Lun) * - indicates lun is not registered with the OS. ( 0: 0): Total reqs 2, Pending reqs 0, flags 0x0*, 0:0:81 00 ( 0: 3): Total reqs 470693, Pending reqs 0, flags 0x0, 0:0:81 00 ( 0: 9): Total reqs 227717, Pending reqs 0, flags 0x0, 0:0:81 00 ( 0:11): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:81 00 ( 0:13): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:81 00 ( 1: 0): Total reqs 2, Pending reqs 0, flags 0x0*, 0:0:82 00 ( 1:10): Total reqs 12, Pending reqs 0, flags 0x0, 0:0:82 00 ( 1:12): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:82 00 ( 1:14): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:82 00 while on 2.6.12.5 and 2.6.13-rc6 it looks like this: sm05:~# scsiadd -a 0 0 0 9 Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 03 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-Access ANSI SCSI revision: 03 Host: scsi0 Channel: 00 Id: 00 Lun: 09 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-Access ANSI SCSI revision: 03 sm05:~# scsiadd -a 0 0 1 10 Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 03 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-Access ANSI SCSI revision: 03 Host: scsi0 Channel: 00 Id: 00 Lun: 09 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-Access ANSI SCSI revision: 03 unfortunately, the proc interface was removed: s05:/sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/0000:02:02.0/host0# find . . ./rport-0:0-1 ./rport-0:0-1/power ./rport-0:0-1/power/state ./rport-0:0-0 ./rport-0:0-0/target0:0:0 ./rport-0:0-0/target0:0:0/0:0:0:9 ./rport-0:0-0/target0:0:0/0:0:0:9/ioerr_cnt ./rport-0:0-0/target0:0:0/0:0:0:9/iodone_cnt ./rport-0:0-0/target0:0:0/0:0:0:9/iorequest_cnt ./rport-0:0-0/target0:0:0/0:0:0:9/iocounterbits ./rport-0:0-0/target0:0:0/0:0:0:9/timeout ./rport-0:0-0/target0:0:0/0:0:0:9/state ./rport-0:0-0/target0:0:0/0:0:0:9/delete ./rport-0:0-0/target0:0:0/0:0:0:9/rescan ./rport-0:0-0/target0:0:0/0:0:0:9/rev ./rport-0:0-0/target0:0:0/0:0:0:9/model ./rport-0:0-0/target0:0:0/0:0:0:9/vendor ./rport-0:0-0/target0:0:0/0:0:0:9/scsi_level ./rport-0:0-0/target0:0:0/0:0:0:9/type ./rport-0:0-0/target0:0:0/0:0:0:9/queue_type ./rport-0:0-0/target0:0:0/0:0:0:9/queue_depth ./rport-0:0-0/target0:0:0/0:0:0:9/device_blocked ./rport-0:0-0/target0:0:0/0:0:0:9/bus ./rport-0:0-0/target0:0:0/0:0:0:9/driver ./rport-0:0-0/target0:0:0/0:0:0:9/block ./rport-0:0-0/target0:0:0/0:0:0:9/power ./rport-0:0-0/target0:0:0/0:0:0:9/power/state ./rport-0:0-0/target0:0:0/0:0:0:3 ./rport-0:0-0/target0:0:0/0:0:0:3/ioerr_cnt ./rport-0:0-0/target0:0:0/0:0:0:3/iodone_cnt ./rport-0:0-0/target0:0:0/0:0:0:3/iorequest_cnt ./rport-0:0-0/target0:0:0/0:0:0:3/iocounterbits ./rport-0:0-0/target0:0:0/0:0:0:3/timeout ./rport-0:0-0/target0:0:0/0:0:0:3/state ./rport-0:0-0/target0:0:0/0:0:0:3/delete ./rport-0:0-0/target0:0:0/0:0:0:3/rescan ./rport-0:0-0/target0:0:0/0:0:0:3/rev ./rport-0:0-0/target0:0:0/0:0:0:3/model ./rport-0:0-0/target0:0:0/0:0:0:3/vendor ./rport-0:0-0/target0:0:0/0:0:0:3/scsi_level ./rport-0:0-0/target0:0:0/0:0:0:3/type ./rport-0:0-0/target0:0:0/0:0:0:3/queue_type ./rport-0:0-0/target0:0:0/0:0:0:3/queue_depth ./rport-0:0-0/target0:0:0/0:0:0:3/device_blocked ./rport-0:0-0/target0:0:0/0:0:0:3/bus ./rport-0:0-0/target0:0:0/0:0:0:3/driver ./rport-0:0-0/target0:0:0/0:0:0:3/block ./rport-0:0-0/target0:0:0/0:0:0:3/power ./rport-0:0-0/target0:0:0/0:0:0:3/power/state ./rport-0:0-0/target0:0:0/power ./rport-0:0-0/target0:0:0/power/state ./rport-0:0-0/power ./rport-0:0-0/power/state ./nvram ./fw_dump ./power ./power/state apparently the targets on rport-0:0-1 are not scanned at all, and so the devices on the secondary controller are not reachable. placing an additional link between the two fabric switches did double the amount of targets, but not solve our problem. It seems to us the 2.6.12+ driver does not allow access to controllers not directly attached to the very same fabric switch. how can this be fixed? Best regards Frederik Schueler -- ENOSIG
Attachment:
signature.asc
Description: Digital signature
- Follow-Ups:
- Re: new qla2xxx driver breaks SAN setup with 2 controllers
- From: Patrick Mansfield <[email protected]>
- Re: new qla2xxx driver breaks SAN setup with 2 controllers
- Prev by Date: usb oops in 2.6.13-rc6-mm2
- Next by Date: Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
- Previous by thread: usb oops in 2.6.13-rc6-mm2
- Next by thread: Re: new qla2xxx driver breaks SAN setup with 2 controllers
- Index(es):