File corruption with 2940U2 SCSI card and aic7xxx driver.

I recently installed an Adaptec 2940U2 controller and two disks in my
Debian Sarge system, kernel version 2.6.8.  Prior to this installation,
the system had been rock-solid and IDE-only.  The card and drives are correctly
detected and identified by the kernel at boot.  Unfortunately, I am
experiencing consistent corruption on large files written to the SCSI
drives.  For example, if I copy a file from the old, stable IDE drive
to one of the SCSI disks using dd:

dd if=alphabet of=/dev/sda1
205200+0 records in
205200+0 records out
105062400 bytes transferred in 5.480344 seconds (19170767 bytes/sec)

Then copy the file back:

dd if=/dev/sda1 of=alphabet_ver2 count=205200
205200+0 records in
205200+0 records out
105062400 bytes transferred in 5.840856 seconds (17987500 bytes/sec)

The md5sums are different:

md5sum alphabet alphabet_ver2
5a96c70a890ff479568f75c54abb82a8  alphabet
e507a5b662b5f528bb6aa3a489a0e04e  alphabet_ver2

The original file, "alphabet", contains the line
"abcdefghijklmnopqrstuvwxyz" repeated many times; however, the file
read from the SCSI drive, "alphabet_ver2", contains a number of lines
like "abcdefghijklmnopqrstubcdefghijklmnopqrstuvwxyz" and
"abcdopqrstuvwxyz": all the correct characters, just out of order.
Curiously, all of the corruption appears to occur when writing the
file to the disk, as reading the data from the disk a second time
yields the same corrupt data:

dd if=/dev/sda1 of=alphabet_ver3 count=205200
205200+0 records in
205200+0 records out
105062400 bytes transferred in 5.840856 seconds (17987500 bytes/sec)
md5sum alphabet alphabet_ver2 alphabet_ver3
5a96c70a890ff479568f75c54abb82a8  alphabet
e507a5b662b5f528bb6aa3a489a0e04e  alphabet_ver2
e507a5b662b5f528bb6aa3a489a0e04e  alphabet_ver3
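
The md5sums only show that the files differ, not where.  A minimal
Python sketch for locating the damaged byte ranges follows; the function
names and the 512-byte sector size are assumptions made for illustration,
not part of the original test.  Because the 27-byte line (26 letters plus
newline) has no repeated characters, data displaced by anything other than
a multiple of 27 bytes should show up as one contiguous mismatching range.

```python
def diff_ranges(path_a, path_b, chunk_size=1 << 20):
    """Return (start, end) byte ranges, end exclusive, where the files differ.

    Only the common length of the two files is compared.
    """
    ranges = []
    start = None  # start offset of the mismatch run currently open, if any
    offset = 0    # file offset of the chunk being compared
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a or not b:
                break
            n = min(len(a), len(b))
            if a[:n] == b[:n]:
                # Whole chunk matches: close any run left open by the last chunk.
                if start is not None:
                    ranges.append((start, offset))
                    start = None
            else:
                for i in range(n):
                    same = a[i] == b[i]
                    if not same and start is None:
                        start = offset + i
                    elif same and start is not None:
                        ranges.append((start, offset + i))
                        start = None
            offset += n
    if start is not None:
        ranges.append((start, offset))
    return ranges


def report(path_a, path_b, sector=512):
    """Print each differing range with its length and offset within a sector."""
    for start, end in diff_ranges(path_a, path_b):
        print("offset %d  length %d  start mod %d = %d"
              % (start, end - start, sector, start % sector))
```

Calling report("alphabet", "alphabet_ver2") would list each damaged
range.  If the starts and lengths turned out to be multiples of 512 or
4096, that might suggest whole sectors or pages are being misplaced
rather than individual bytes.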

The corruption on write appears to be different each time:

dd if=alphabet of=/dev/sda1;\
dd if=/dev/sda1 of=alphabet_ver4 count=205200;md5sum alphabet*
205200+0 records in
205200+0 records out
105062400 bytes transferred in 5.488071 seconds (19143775 bytes/sec)
205200+0 records in
205200+0 records out
105062400 bytes transferred in 5.776168 seconds (18188944 bytes/sec)
5a96c70a890ff479568f75c54abb82a8  alphabet
e507a5b662b5f528bb6aa3a489a0e04e  alphabet_ver2
e507a5b662b5f528bb6aa3a489a0e04e  alphabet_ver3
40a369cb78d68f9b6d293dfd5012c87f  alphabet_ver4
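
Since the alphabet pattern repeats every 27 bytes, it cannot show where
misplaced data originally belonged.  One possible refinement, sketched
here in Python, is a test pattern in which every block carries its own
block number, so that a record found in the wrong place identifies its
source.  The helper names and the 512-byte block size (chosen to match
dd's default) are assumptions.  Writing this pattern to a device such as
/dev/sda1 destroys its contents, and a read-back shortly after the write
may be served from the kernel's cache rather than from the disk.

```python
import os

BLOCK_SIZE = 512   # assumed granularity; matches dd's default block size
RECORD_SIZE = 16   # 15 digits plus a newline


def make_block(index, size=BLOCK_SIZE):
    """Build one block that repeats its own index as 16-byte text records."""
    record = b"%015d\n" % index
    return record * (size // RECORD_SIZE)


def write_pattern(path, n_blocks, size=BLOCK_SIZE):
    """Write n_blocks numbered blocks to path and flush them to the device."""
    with open(path, "wb") as f:
        for i in range(n_blocks):
            f.write(make_block(i, size))
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path, n_blocks, size=BLOCK_SIZE):
    """Return (block_number, offset_in_block, found_record) for each bad record."""
    bad = []
    with open(path, "rb") as f:
        for i in range(n_blocks):
            block = f.read(size)
            if block == make_block(i, size):
                continue
            expected = b"%015d\n" % i
            for off in range(0, len(block), RECORD_SIZE):
                rec = block[off:off + RECORD_SIZE]
                if rec != expected:
                    bad.append((i, off, rec))
    return bad
```

For a run comparable to the dd test above, write_pattern(target, 205200)
followed by verify_pattern(target, 205200) would cover the same 105062400
bytes, where target is a placeholder for the device or file under test.
Each tuple returned names the block that was damaged and, through the
record found there, the block whose data landed in it.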

You'll note that I've given up trying to create a filesystem on the
SCSI disk since the filesystem was always corrupted quickly and
fatally.  I have exhausted my ideas for troubleshooting this problem.
I would greatly appreciate any ideas for further troubleshooting.
Here is a brief list of what I have tried:

- Copying data to and from the other SCSI disk, sdb.
- Changing PCI slots and SCSI cables.
- Verifying that the 2940 card does not share an interrupt with any other card.
- Trying the aic7xxx_old driver.
- Trying the new version of the aic7xxx driver with a 2.6.16 kernel.
- Disabling write caching on the drives.
- Enabling the debug information in the aic7xxx driver module (see
below for transcript).  There is no indication of problems from the
debug output.

In all cases, I get the same results.  This set of card, cable, and
drives worked flawlessly when it was removed from another computer
(which ran Windows and SUSE Linux).

A few relevant system details:

- Debian version 3.1 (Sarge)
- kernel-image-2.6.8-3-686 ver. 2.6.8-16sarge4 (primarily) and
linux-image-2.6.16-1-686 ver. 2.6.16-11bpo1 from backports.org
- Pentium III 500 MHz with 640 MB memory, VIA Apollo Pro 133 chipset

The relevant kernel messages during boot with full aic7xxx debug:

PCI: Found IRQ 11 for device 0000:00:0c.0
aic7xxx: PCI Device 0:12:0 failed memory mapped test.  Using PIO.
ahc_pci:0:12:0: Reading SEEPROM...done.
ahc_pci:0:12:0: BIOS eeprom is present
ahc_pci:0:12:0: Secondary High byte termination Enabled
ahc_pci:0:12:0: Secondary Low byte termination Enabled
ahc_pci:0:12:0: Primary Low Byte termination Enabled
ahc_pci:0:12:0: Primary High Byte termination Enabled
ahc_pci:0:12:0: Downloading Sequencer Program... 423 instructions downloaded
ahc_pci:0:12:0: Features 0x56f6, Bugs 0x6, Flags 0x20485440
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
       <Adaptec 2940 Ultra2 SCSI adapter>
       aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi0: Slave Alloc 0
(scsi0:A:0:0): Sending WDTR 1
(scsi0:A:0:0): Received WDTR 1 filtered to 1
(scsi0:A:0): 1.960MB/s transfers (0.980MHz , offset 255, 16bit)
scsi0: target 0 using 16bit transfers
(scsi0:A:0:0): Sending SDTR period a, offset 7f
(scsi0:A:0:0): Received SDTR period a, offset 7f
       Filtered to period a, offset 7f
(scsi0:A:0): 80.000MB/s transfers (40.000MHz, offset 127, 16bit)
scsi0: target 0 synchronous at 40.0MHz, offset = 0x7f
 Vendor: QUANTUM   Model: ATLAS10K2-TY367L  Rev: DA40
 Type:   Direct-Access                      ANSI SCSI revision: 03
scsi0: Slave Configure 0
(scsi0:A:0): 80.000MB/s transfers (40.000MHz, offset 127, 16bit)
scsi0:A:0:0: Tagged Queuing enabled.  Depth 8
scsi0: Slave Alloc 1
scsi0: Slave Destroy 1
scsi0: Slave Alloc 2
scsi0: Slave Destroy 2
scsi0: Slave Alloc 3
scsi0: Slave Destroy 3
scsi0: Slave Alloc 4
scsi0: Slave Destroy 4
scsi0: Slave Alloc 5
SCSI device sda: 71132959 512-byte hdwr sectors (36420 MB)
scsi0: Slave Destroy 5
scsi0: Slave Alloc 6
(scsi0:A:6:0): Sending WDTR 1
(scsi0:A:6:0): Received WDTR 1 filtered to 1
(scsi0:A:6): 1.960MB/s transfers (0.980MHz, offset 255, 16bit)
scsi0: target 6 using 16bit transfers
(scsi0:A:6:0): Sending SDTR period a, offset 7f
(scsi0:A:6:0): Received SDTR period a, offset 3f
       Filtered to period a, offset 3f
(scsi0:A:6): 80.000MB/s transfers (40.000MHz, offset 63, 16bit)
scsi0: target 6 synchronous at 40.0MHz, offset = 0x3f
 Vendor: IBM       Model: DDYS-T36950N      Rev: S96H
 Type:   Direct-Access                      ANSI SCSI revision: 03
scsi0: Slave Configure 6
(scsi0:A:6): 80.000MB/s transfers (40.000MHz, offset 63, 16bit)
scsi0:A:6:0: Tagged Queuing enabled.  Depth 8
SCSI device sda: drive cache: write through
/dev/scsi/host0/bus0/target0/lun0: p1 p2 p3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sdb: 71687340 512-byte hdwr sectors (36704 MB)
SCSI device sdb: drive cache: write back
/dev/scsi/host0/bus0/target6/lun0: p1 p2
Attached scsi disk sdb at scsi0, channel 0, id 6, lun 0
scsi0: Slave Alloc 8
scsi0: Slave Destroy 8
scsi0: Slave Alloc 9
scsi0: Slave Destroy 9
scsi0: Slave Alloc 10
scsi0: Slave Destroy 10
scsi0: Slave Alloc 11
scsi0: Slave Destroy 11
scsi0: Slave Alloc 12
scsi0: Slave Destroy 12
scsi0: Slave Alloc 13
scsi0: Slave Destroy 13
scsi0: Slave Alloc 14
scsi0: Slave Destroy 14
scsi0: Slave Alloc 15
scsi0: Slave Destroy 15

The aic7xxx driver does not emit any further kernel messages.  The
aic7xxx module is loaded with the following flags:
verbose,debug:0xffff,pci_parity

Please CC me directly with any comments or ideas you have.  Thanks for
your time.

--Ethan