I have three Supermicro systems based on the X6DAL-TB2 motherboard, which
has built-in Broadcom 5721 gigabit PCI-E NICs. eth0 on these boxes fails
whenever a decent amount of data (~100Mb) is pushed across it. When it
fails, I get these error messages in /var/log/messages:
Oct 2 19:08:53 office kernel: NETDEV WATCHDOG: eth0: transmit timed out
Oct 2 19:08:53 office kernel: tg3: eth0: transmit timed out, resetting
Oct 2 19:08:53 office kernel: tg3: tg3_stop_block timed out, ofs=1400 enable_bit=2
Oct 2 19:08:53 office kernel: tg3: tg3_stop_block timed out, ofs=c00 enable_bit=2
Oct 2 19:08:53 office kernel: tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2
Oct 2 19:08:53 office kernel: tg3: eth0: Link is down.
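To get a count of how often the failed resets occur, the messages can be
tallied with something like the following. This is only a rough sketch in
Python: the function name summarize_tg3 is made up for illustration, and it
assumes the log lines look exactly like the ones quoted above.

```python
import re
from collections import Counter

# Pulls the register offset out of a "tg3_stop_block timed out" line.
STOP_BLOCK_RE = re.compile(r"tg3: tg3_stop_block timed out, ofs=([0-9a-f]+)")
WATCHDOG_TEXT = "NETDEV WATCHDOG: eth0: transmit timed out"


def summarize_tg3(lines):
    """Count watchdog timeouts and tally stop_block offsets (hypothetical helper)."""
    timeouts = 0
    offsets = Counter()
    for line in lines:
        if WATCHDOG_TEXT in line:
            timeouts += 1
        match = STOP_BLOCK_RE.search(line)
        if match:
            offsets[match.group(1)] += 1
    return timeouts, offsets
```

It could be fed the file directly, e.g. summarize_tg3(open("/var/log/messages")),
to see whether the same offsets show up on every failure.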
I made a cron job to log ifconfig output to a file every minute. It shows
that the NIC resets itself at least every couple of minutes while data is
being passed: the TX/RX stats in ifconfig go back to 0. The messages above
do not appear in /var/log/messages every time the NIC resets like this. I
think the NIC is resetting because of some bug, and sometimes the reset
fails and locks up the NIC, which produces the messages above. That only
happens once or twice a day; the other NIC resets happen, as I said, every
2-3 minutes. Is there any information that would be helpful in debugging
this problem? Let me know what to run and I'll do it. eth1 never has this
problem: I have pushed 5GB+ onto the box over eth1 and it doesn't blink.
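For anyone who wants to reproduce the check, the counter resets can be
picked out of the per-minute ifconfig snapshots roughly like this. It is a
minimal sketch in Python, not what my cron job actually runs: the helper
names parse_counters and find_resets are made up, and it assumes each
snapshot is the output of "ifconfig eth0" in the classic net-tools format
with an "RX bytes:... TX bytes:..." line.

```python
import re

# Matches the byte counters in classic net-tools ifconfig output, e.g.
#   RX bytes:12345 (12.0 KiB)  TX bytes:6789 (6.6 KiB)
# Assumes the snapshot covers a single interface (only the first match is used).
BYTES_RE = re.compile(r"RX bytes:(\d+).*?TX bytes:(\d+)")


def parse_counters(ifconfig_text):
    """Return (rx_bytes, tx_bytes) from one ifconfig snapshot, or None."""
    match = BYTES_RE.search(ifconfig_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def find_resets(samples):
    """Return indices of samples where either counter went backwards.

    samples is a list of (rx_bytes, tx_bytes) tuples in time order.
    A decrease can also be a counter wrapping around, so a hit is only
    a hint that the NIC was reset, not proof.
    """
    resets = []
    for i in range(1, len(samples)):
        prev_rx, prev_tx = samples[i - 1]
        rx, tx = samples[i]
        if rx < prev_rx or tx < prev_tx:
            resets.append(i)
    return resets
```

Comparing the timestamps of the flagged samples against /var/log/messages
should show which resets were silent and which ones logged the watchdog
timeout.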
Tom