Hi there,

we've got a database server machine running a 2.6.18.2 vanilla kernel on Debian Etch. The database is MySQL 5. Everything works fine, but sometimes the server "lags", i.e. it doesn't respond for 30 seconds. We've now investigated the problem and found these messages in syslog (and dmesg):

15:55:44 omega11 kernel: ata1: port is slow to respond, please be patient
15:55:44 omega11 kernel: ata1: soft resetting port
15:55:44 omega11 kernel: ata1: port is slow to respond, please be patient
15:55:44 omega11 kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
15:55:44 omega11 kernel: ATA: abnormal status 0xD0 on port 0xFFFFC2000000401C
15:55:44 omega11 last message repeated 5 times
15:55:44 omega11 kernel: ata1.00: qc timeout (cmd 0xec)
15:55:44 omega11 kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
15:55:44 omega11 kernel: ata1: failed to recover some devices, retrying in 5 secs
15:55:44 omega11 kernel: ata1: hard resetting port
15:55:44 omega11 kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
15:55:44 omega11 kernel: ata1.00: configured for UDMA/133
15:55:44 omega11 kernel: ata1: EH complete
15:55:44 omega11 kernel: SCSI device sda: 293046768 512-byte hdwr sectors (150040 MB)
15:55:44 omega11 kernel: sda: Write Protect is off
15:55:44 omega11 kernel: SCSI device sda: drive cache: write back

We've been getting these messages up to 5 times a day for as far back as our syslogs reach. It seems no kind of queuing is used:

# cat /sys/block/sda/device/queue_type
none
# cat /sys/block/sda/device/queue_depth
1

The server has been up for 91 days now and has low to medium load (depending on the time of day). Since it's a production server located in a datacenter, we can't just test some random kernel on it :(

Does somebody have a clue what's going on here? Could it be a hardware failure? We have an identical machine using the same kernel. It's used as a webserver. These messages show up there as well, but not that often (10 times in 91 days of uptime). If it is a hardware failure, then both machines would have to be affected by the same hardware problem.

What can we do to fix this problem? Is it known? I've found many posts related to SATA problems, but none seemed to be about this problem. Do you need additional information?

Thanks

cu, Emmy

P.S.: Please CC me, since I'm not subscribed.
00:01.0 PCI bridge: Broadcom HT1000 PCI/PCI-X bridge
00:02.0 Host bridge: Broadcom HT1000 Legacy South Bridge
00:02.1 IDE interface: Broadcom HT1000 Legacy IDE controller
00:02.2 ISA bridge: Broadcom HT1000 LPC Bridge
00:03.0 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:03.1 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:03.2 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:04.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller (rev 05)
00:05.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller (rev 05)
00:06.0 VGA compatible controller: XGI - Xabre Graphics Inc Volari Z7
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:0d.0 PCI bridge: Broadcom HT1000 PCI/PCI-X bridge (rev c0)
01:0e.0 RAID bus controller: Broadcom BCM5785 (HT1000) SATA Native SATA Mode
Attachment:
config.gz
Description: GNU Zip compressed data
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 275
stepping : 2
cpu MHz : 2194.616
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips : 4390.69
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 275
stepping : 2
cpu MHz : 2194.616
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips : 4390.11
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 2
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 275
stepping : 2
cpu MHz : 2194.616
cache size : 1024 KB
physical id : 1
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips : 4393.11
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 3
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 275
stepping : 2
cpu MHz : 2194.616
cache size : 1024 KB
physical id : 1
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
bogomips : 4393.51
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
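
To put a number on how often the port resets from the syslog excerpt happen on each machine, something like the following could be run against the log file. It is only a minimal sketch: it assumes Python 3 is available, that the log lives in /var/log/syslog unless another path is given, and that the messages have the same wording as the "soft resetting port" / "hard resetting port" lines quoted in the report. The names RESET_RE and count_resets are made up for this illustration.

```python
import re
import sys
from collections import Counter

# Matches kernel lines such as:
#   15:55:44 omega11 kernel: ata1: soft resetting port
# The wording is taken from the 2.6.18.2 messages quoted in the report;
# other kernel versions may phrase it differently.
RESET_RE = re.compile(r"kernel: (ata\d+): (soft|hard) resetting port")


def count_resets(lines):
    """Count reset messages, keyed by (port, reset kind)."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Assumed default path; pass a different log file as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/syslog"
    with open(path, errors="replace") as handle:
        totals = count_resets(handle)
    for (port, kind), number in sorted(totals.items()):
        print(f"{port} {kind} resets: {number}")
```

Running it on both the database server and the webserver would give comparable counts per port, which may help when judging whether the two machines really show the same problem at different rates.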