On 10/22/06, Martin J. Bligh <[email protected]> wrote:
Jesse Brandeburg wrote:
> Analysis follows, but I wanted to ask you to bisect back if you can to
> find the apparent patch to make the difference. Basically at this
> point I'd say its not likely to be an e1000 issue, but I'd like to
> follow up and make sure.
That's going to be ugly, since I can't reproduce it at will. Maybe if
I netperf it to the other box I can push it over.
try tbench with 100 sessions (from dbench package) and see if that hurts.
> Nothing seems out of order, but the latency may be low, I'd be curious
> what these looked like before with the old kernel. Some of the other
> things to compare would have been the lspci -vv output from your
> chipset with old/new kernel, in case the bridge/system configuration
> changed. There are no known problems right now with this chipset
> 82546EB
OK. will try later when I have more time. For now I switched to the
onboard via rhine controller.
ouch.
> shared int, fine, but whats with the ERR: ?
Hmm. Having rebooted they look rather lower. but might be a time thing.
CPU0
0: 1405995 XT-PIC timer
1: 5910 XT-PIC i8042
2: 0 XT-PIC cascade
5: 0 XT-PIC uhci_hcd:usb3
7: 27135 XT-PIC ehci_hcd:usb2, VIA8237, eth0
10: 0 XT-PIC uhci_hcd:usb4, uhci_hcd:usb5,
uhci_hcd:usb6
11: 0 XT-PIC ehci_hcd:usb1, uhci_hcd:usb7,
uhci_hcd:usb8
12: 157547 XT-PIC i8042
14: 36296 XT-PIC ide0
15: 196690 XT-PIC ide1
NMI: 0
LOC: 1406006
ERR: 26
> except you didn't include any of the e1000 load information nor the
> system's boot information as it came up.
OK, it had gone since reboot, but I rebooted just now .... new info
attached.
> This chipset is one of the most frequent common elements in problem
> reports of TX hangs for e1000. My current theory (we've bought a
> bunch of these systems and never reproduced the issue) is that there
> is something either design specific or BIOS specific that causes this
> chipset to interact very badly with e1000 hardware. Some systems have
> the issue and some don't. If you could bisect back to a working point
> it would be interesting to see where that pointed.
OK, is going to be hard to bisect, since the other one was an Ubuntu
kernel, but I guess I can give 2.6.15 virgin a shot, at least.
thanks, I know how difficult and time consuming bisecting is.
> doesn't seem you're overclocked. Good.
Nah, I'm pretty conservative with hardware, get enough problems when
it's all running within specs ;-)
Thanks for looking at all this.
welcome, like to help when I can.
Linux version 2.6.18 (mbligh@titus) (gcc version 3.4.6 (Ubuntu 3.4.6-1ubuntu2)) #2 Sun > e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Tx Queue <0>
TDH <26>
TDT <26>
next_to_use <26>
next_to_clean <39>
buffer_info[next_to_clean]
time_stamp <77145>
next_to_watch <3b>
jiffies <7734f>
next_to_watch.status <0>
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
hey, this one is different. It is actually the common tx hang
signature (TDH == TDT) for these kinds of systems. I've come up with a
workaround driver, code is still in development.
you can try it if you would like.
http://sourceforge.net/tracker/download.php?group_id=42302&atid=447449&file_id=198849&aid=1463045
Thanks,
Jesse
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]