It looks like an obscure issue. Apparently, to trigger it you need _both_ more than 2GB of RAM and SMP.

On second thought, why would that only occur in SMP mode? Right now the box has 3G of RAM, no SMP, and it works like a charm. If I enable SMP, all hell breaks loose.

I too am running an Athlon X2 using sata_nv. I have an ASUS motherboard. But what I noticed was that the problem went away if I used 2 gigs of RAM instead of 4 gigs. When you use the whole 4 gigs there is some memory mapping going on, and I thought perhaps the problem was related to sata_nv not liking the memory mapped over the 4gig barrier.

That's possible. Unfortunately I cannot verify this, since there are 2GB of RAM in my box. I remember someone having a problem with sata_nv DMAing over 2GB of RAM, so there may be something wrong with it.

Looks like it. I played around today as well, and it seems not to occur with 2GB.

OTOH, today I played with a Tyan Thunder K8WE, based on the Nvidia CK8-04 chipset (not that much different from the regular nForce4), in a 2-processor configuration with 8GB of RAM, and it had no issues at all (I installed SuSE 9.3 with the distro kernel on it). Anyway, its SATA controller is handled by the sata_nv driver, so the problem you describe does not seem to be software-related.
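To make the "4gig barrier" idea above concrete, here is a minimal sketch of the check a device with a limited DMA address mask implies. The helper name `dma_reachable` and the 32-bit default are assumptions made for illustration only; nothing in this thread establishes which mask sata_nv or the controller actually uses.

```python
GIB = 1 << 30

def dma_reachable(phys_addr: int, length: int, mask_bits: int = 32) -> bool:
    """Hypothetical helper: True if the whole buffer lies below 2**mask_bits."""
    if phys_addr < 0 or length <= 0:
        raise ValueError("phys_addr must be >= 0 and length must be > 0")
    # The last byte of the buffer must still be addressable by the device.
    return phys_addr + length <= (1 << mask_bits)

if __name__ == "__main__":
    # A page well below 4 GiB versus one in RAM remapped above 4 GiB.
    print(dma_reachable(3 * GIB, 4096))
    print(dma_reachable(4 * GIB + 4096, 4096))
```

If a driver handed the second kind of buffer to a controller that can only address 32 bits, the transfer would go to the wrong place or fail, which would fit problems that appear only with more RAM installed. That remains a guess here, not a diagnosis.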
Hehe, as if reading your mind, I ordered a Tyan K8E today, and in case that won't work, I'll just put a PCI riser there and stick an external SATA from Silicon Image or Promise there :)

I've heard from another person (though I have not seen it myself) that there was much the same problem (hanging SATA) on an Opteron-based machine. I was too late to ask for the mb model though, so this info is kind of useless, but I mention it here for statistical purposes.

[Jeff, could you please say if this is a known problem?]
Ok, the last status update.

The Tyan K8E arrived today, and after a short while (a much longer while than on the MSI motherboard, though) it started spitting out errors again (this is in SMP mode):
end_request: I/O error, dev sda, sector 35447511
raid1: sda3: rescheduling sector 27623856
raid1: sda3: redirecting sector 27623856 to another mirror
ATA: abnormal status 0xD0 on port 0x9F7
ATA: abnormal status 0xD0 on port 0x9F7
ATA: abnormal status 0xD0 on port 0x9F7
ata1: command 0x25 timeout, stat 0xd0 host_stat 0x1
ata1: status=0xd0 { Busy }
SCSI error : <0 0 0 0> return code = 0x8000002
sda: Current: sense key: Aborted Command
    Additional sense: Scsi parity error

And so on, and so forth, with various blocks.

At that moment, a resync of the 160G raid-1 was running in the background.
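For anyone reading the log above, the status byte 0xD0 can be split into its individual ATA status register bits. This is a small sketch; the bit names follow the older ATA naming (DSC, CORR, IDX), which later revisions of the spec renamed or obsoleted, and the table and function names are made up for this example. Command 0x25 in the timeout line is READ DMA EXT.

```python
# ATA status register bits, most significant first (older ATA naming).
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]

def decode_ata_status(status: int) -> list[str]:
    """Return the names of the bits set in an 8-bit ATA status value."""
    if not 0 <= status <= 0xFF:
        raise ValueError("status must fit in one byte")
    return [name for bit, name in ATA_STATUS_BITS if status & bit]

if __name__ == "__main__":
    print(decode_ata_status(0xD0))  # ['BSY', 'DRDY', 'DSC']
```

While BSY is set the other bits are not considered valid, which is presumably why the kernel line shows only `{ Busy }`.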
The MSI mobo from the original mail had the nForce4-SLI chipset, whereas this Tyan K8E has the nForce4 Ultra.
Yet again, if I enable APIC, the boot process hangs here:

sata_nv version 0.6
PCI: Setting latency timer of device 0000:00:07.0 to 64
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xDC00 irq 11
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xDC08 irq 11

These are the last messages that I get. The behaviour is the same on both motherboards: if I enable APIC, it hangs at that point.
When I disable APIC, the boot sequence looks like:

sata_nv version 0.6
PCI: Setting latency timer of device 0000:00:07.0 to 64
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xDC00 irq 11
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xDC08 irq 11
ata1: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4003 85:7c69 86:3e01 87:4003 88:407f
ata1: dev 0 ATA, max UDMA/133, 320173056 sectors: lba48
nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
ata1: dev 0 configured for UDMA/133
scsi0 : sata_nv
...

Thus, perfectly normal. Further on it goes on to ata2, and so on.

Here is the lspci output from the new motherboard:

root@anarxi:~# lspci
0000:00:00.0 Memory controller: nVidia Corporation: Unknown device 005e (rev a3)
0000:00:01.0 ISA bridge: nVidia Corporation: Unknown device 0050 (rev a3)
0000:00:01.1 SMBus: nVidia Corporation: Unknown device 0052 (rev a2)
0000:00:06.0 IDE interface: nVidia Corporation: Unknown device 0053 (rev f2)
0000:00:07.0 IDE interface: nVidia Corporation: Unknown device 0054 (rev f3)
0000:00:09.0 PCI bridge: nVidia Corporation: Unknown device 005c (rev a2)
0000:00:0a.0 Bridge: nVidia Corporation: Unknown device 0057 (rev a3)
0000:00:0b.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev a3)
0000:00:0c.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev a3)
0000:00:0d.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev a3)
0000:00:0e.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev a3)
0000:00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:01:06.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 80)
0000:01:07.0 VGA compatible controller: S3 Inc. 86c764/765 [Trio32/64/64V+]

And lspci -v from the IDE parts:

0000:00:06.0 IDE interface: nVidia Corporation: Unknown device 0053 (rev f2) (prog-if 8a [Master SecP PriP])
        Subsystem: Tyan Computer: Unknown device 2865
        Flags: bus master, 66MHz, fast devsel, latency 0
        I/O ports at f000 [size=16]
        Capabilities: [44] Power Management version 2

0000:00:07.0 IDE interface: nVidia Corporation: Unknown device 0054 (rev f3) (prog-if 85 [Master SecO PriO])
        Subsystem: Tyan Computer: Unknown device 2865
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 11
        I/O ports at 09f0 [size=8]
        I/O ports at 0bf0 [size=4]
        I/O ports at 0970 [size=8]
        I/O ports at 0b70 [size=4]
        I/O ports at dc00 [size=16]
        Memory at febfd000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [44] Power Management version 2

I'd very much appreciate some feedback on this. The box crashes every 40-50 minutes in SMP mode, and without the second core it's overloaded and running on the edge of a crash as well.
Let me know if I need to test anything or if you want more info.

Thanks!

Regards,
Vladimir
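As a side note on the lspci -v output above: the two IDE functions differ in their prog-if byte (8a versus 85), and the bracketed flags lspci prints are just a bitwise decoding of that byte. A small sketch of the decoding follows; the table and function names are made up for this example, and the bit meanings are the standard ones for PCI IDE controllers (PriO/SecO: channel operating in native mode; PriP/SecP: mode is programmable; Master: bus-master capable).

```python
# PCI IDE programming-interface bits, in the order lspci prints them.
IDE_PROG_IF_FLAGS = [
    (0x80, "Master"),  # bus-master DMA capable
    (0x08, "SecP"),    # secondary channel mode is programmable
    (0x04, "SecO"),    # secondary channel operating in native mode
    (0x02, "PriP"),    # primary channel mode is programmable
    (0x01, "PriO"),    # primary channel operating in native mode
]

def decode_ide_prog_if(prog_if: int) -> list[str]:
    """Return lspci-style flag names for a PCI IDE prog-if byte."""
    if not 0 <= prog_if <= 0xFF:
        raise ValueError("prog_if must fit in one byte")
    return [name for bit, name in IDE_PROG_IF_FLAGS if prog_if & bit]

if __name__ == "__main__":
    print(decode_ide_prog_if(0x8A))  # ['Master', 'SecP', 'PriP']
    print(decode_ide_prog_if(0x85))  # ['Master', 'SecO', 'PriO']
```

Read this way, 00:06.0 reports both channels in compatibility mode, while 00:07.0 (the function sata_nv binds to in the boot log) reports both in native mode, which is consistent with its non-legacy I/O ports. Whether that has any bearing on the hang is not something this output can show.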