Hello,
I have 10+ machines with K8N-DRE motherboards, which use the Nvidia
CK804 chipset and its SATA controllers, and all of them seem to
exhibit the behavior described below. All machines have 16GB of RAM
and are running x86_64 kernels.
With one disk active everything is fine: if I run
"dd if=/dev/sda of=/dev/null bs=64k" there are no problems. If I
background that command and start a second dd on sdb, the I/O rate
roughly doubles (to about 130k, from about 65k) for 1-2 seconds, then
drops to 0 and the machine hangs. The dd's can be killed with the
alt-sysrq keys or by issuing a kill against the processes, but the
processes are in disk wait, so it takes 10-30 seconds before the kill
actually takes effect. After the kill completes everything seems OK
again. While the hang lasts, access to the disks appears to be
completely blocked: I cannot get anything that accesses the disks to
run once the dd's hang.
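For anyone who wants to repeat the test, the two-dd sequence above can be sketched as a small script. This is only a sketch: the function name parallel_read and the optional block-count argument are illustrative additions, not part of the original test, which read each device to the end.

```shell
#!/bin/sh
# Sketch of the two-reader test described above.
# parallel_read and its optional COUNT argument are hypothetical helpers.

parallel_read() {
    dev_a=$1
    dev_b=$2
    count=${3:-}    # optional block limit; empty means read to the end

    # Build the optional count= argument only when a limit was given.
    if [ -n "$count" ]; then
        limit="count=$count"
    else
        limit=""
    fi

    # Start both readers in the background so their I/O overlaps.
    dd if="$dev_a" of=/dev/null bs=64k $limit 2>/dev/null &
    pid_a=$!
    dd if="$dev_b" of=/dev/null bs=64k $limit 2>/dev/null &
    pid_b=$!

    # Wait for each reader and report its exit status.
    wait "$pid_a"; rc_a=$?
    wait "$pid_b"; rc_b=$?
    echo "$dev_a: exit $rc_a"
    echo "$dev_b: exit $rc_b"
}
```

On an affected machine this would be called as "parallel_read /dev/sda /dev/sdb" (as root, since it reads the raw devices). Passing a small count as the third argument bounds the run, which may or may not be long enough to trigger the hang.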
I get these messages from dmesg when the event happens on 2.6.16:
ata1: command 0x25 timeout, stat 0x50 host_stat 0x24
ata2: command 0x25 timeout, stat 0x50 host_stat 0x24
On 2.6.17-rc2 the messages look slightly different:
ata1: command 0xc8 timeout, stat 0x50 host_stat 0x24
ata1: status=0x50 { DriveReady SeekComplete }
sda: Current: sense key: No Sense
Additional sense: No additional sense information
Info fld=0x3481ff
ata2: command 0xc8 timeout, stat 0x50 host_stat 0x24
ata2: status=0x50 { DriveReady SeekComplete }
sdb: Current: sense key: No Sense
Additional sense: No additional sense information
Info fld=0x166ff
ata1: command 0xc8 timeout, stat 0x50 host_stat 0x24
ata1: status=0x50 { DriveReady SeekComplete }
sda: Current: sense key: No Sense
Additional sense: No additional sense information
Info fld=0x3482ff
ata2: command 0xc8 timeout, stat 0x50 host_stat 0x24
ata2: status=0x50 { DriveReady SeekComplete }
sdb: Current: sense key: No Sense
Additional sense: No additional sense information
Info fld=0x167ff
ata1: command 0xca timeout, stat 0x50 host_stat 0x24
ata1: status=0x50 { DriveReady SeekComplete }
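The hex fields in the timeout lines above can be decoded with a few shell functions. This is a sketch under stated assumptions: the function names are hypothetical, the opcode and status bit names follow the ATA command set, and host_stat is assumed to be the conventional PCI bus-master IDE status register.

```shell
#!/bin/sh
# Sketch: decode "command 0xNN timeout, stat 0xNN host_stat 0xNN" fields.
# All function names here are hypothetical helpers.

# decode_bits VALUE "MASK:NAME"... prints the names of the set bits.
decode_bits() {
    v=$(( $1 )); shift
    out=""
    for pair in "$@"; do
        mask=${pair%%:*}
        name=${pair#*:}
        if [ $(( v & $mask )) -ne 0 ]; then
            out="$out $name"
        fi
    done
    echo "${out# }"
}

# ATA command opcode -> name (only the opcodes seen in the logs above).
ata_cmd_name() {
    case $(( $1 )) in
        37)  echo "READ DMA EXT" ;;   # 0x25
        200) echo "READ DMA" ;;       # 0xc8
        202) echo "WRITE DMA" ;;      # 0xca
        *)   echo "unknown" ;;
    esac
}

# ATA device status register bits.
ata_status_bits() {
    decode_bits "$1" 0x80:BSY 0x40:DRDY 0x20:DF 0x10:DSC 0x08:DRQ 0x01:ERR
}

# Bus-master IDE status register bits (assumed meaning of host_stat).
bmdma_status_bits() {
    decode_bits "$1" 0x01:ACTIVE 0x02:ERROR 0x04:INTR \
        0x20:DRV0_DMA 0x40:DRV1_DMA 0x80:SIMPLEX
}
```

With these, "ata_status_bits 0x50" gives "DRDY DSC", matching the "DriveReady SeekComplete" text in the log, and "bmdma_status_bits 0x24" gives "INTR DRV0_DMA". If the host_stat assumption holds, the interrupt bit is set while the active and error bits are clear, but the decode alone cannot say why the command still timed out.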
The bootup messages for the disks look like this:
SCSI subsystem initialized
libata version 1.20 loaded.
sata_nv 0000:00:07.0: version 0.8
PCI: Setting latency timer of device 0000:00:07.0 to 64
ata1: SATA max UDMA/133 cmd 0xF000 ctl 0xEC02 bmdma 0xE000 irq 7
ata2: SATA max UDMA/133 cmd 0xE800 ctl 0xE402 bmdma 0xE008 irq 7
logips2pp: Detected unknown logitech mouse model 1
ata1: SATA link up 1.5 Gbps (SStatus 113)
ata1: dev 0 cfg 49:2f00 82:346b 83:7f01 84:4003 85:3469 86:3c01 87:4003 88:203f
ata1: dev 0 ATA-6, max UDMA/100, 625142448 sectors: LBA48
nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
ata1: dev 0 configured for UDMA/100
scsi0 : sata_nv
ata2: SATA link up 1.5 Gbps (SStatus 113)
input: PS/2 Logitech Mouse as /class/input/input1
ata2: dev 0 cfg 49:2f00 82:346b 83:7f01 84:4003 85:3469 86:3c01 87:4003 88:203f
ata2: dev 0 ATA-6, max UDMA/100, 625142448 sectors: LBA48
nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
ata2: dev 0 configured for UDMA/100
scsi1 : sata_nv
Vendor: ATA Model: WDC WD3200SD-01K Rev: 08.0
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
sda:<4>nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
sda1
sd 0:0:0:0: Attached scsi disk sda
Vendor: ATA Model: WDC WD3200SD-01K Rev: 08.0
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdb: 625142448 512-byte hdwr sectors (320073 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 625142448 512-byte hdwr sectors (320073 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
sdb:<4>nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
sdb1
nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
sd 1:0:0:0: Attached scsi disk sdb
PCI: Setting latency timer of device 0000:00:08.0 to 64
ata3: SATA max UDMA/133 cmd 0xDC00 ctl 0xD802 bmdma 0xCC00 irq 5
ata4: SATA max UDMA/133 cmd 0xD400 ctl 0xD002 bmdma 0xCC08 irq 5
ata3: SATA link down (SStatus 0)
scsi2 : sata_nv
ata4: SATA link down (SStatus 0)
scsi3 : sata_nv
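Two of the numbers in the boot messages above can be sanity-checked with shell arithmetic. This is a sketch: the function names are hypothetical, the SStatus field layout (DET, SPD, IPM) is taken from the Serial ATA specification, and the MB figure is computed as decimal megabytes with rounding, which may not be how the kernel derives it.

```shell
#!/bin/sh
# Sketch: decode the SStatus value and check the reported disk size.
# Function names are hypothetical helpers.

# The log prints SStatus in hex without a 0x prefix (e.g. "113").
sstatus_decode() {
    v=$(( 0x$1 ))
    det=$(( v & 0xf ))           # device detection
    spd=$(( (v >> 4) & 0xf ))    # negotiated speed
    ipm=$(( (v >> 8) & 0xf ))    # interface power management state
    echo "DET=$det SPD=$spd IPM=$ipm"
}

# 512-byte sectors -> decimal megabytes, rounded to nearest.
sectors_to_mb() {
    echo $(( ($1 * 512 + 500000) / 1000000 ))
}
```

"sstatus_decode 113" gives DET=3 (device present, link established), SPD=1 (1.5 Gbps) and IPM=1 (active), which agrees with "SATA link up 1.5 Gbps" for ata1 and ata2. "sstatus_decode 0" gives all zeros, matching the link-down ports ata3 and ata4. "sectors_to_mb 625142448" gives 320073, the size shown for sda and sdb.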
So far I have not been able to trigger these messages while running
only one disk at a time; either disk appears to run by itself with no
issues.
I tested with 2 different FC5 2.6.16 variants and with 2.6.17-rc2,
and all of them exhibit the same behavior.
What can I try to debug this?
Roger Heflin