After upgrading from Linux 2.6.30 (Fedora 11) to 2.6.31 (Fedora 12), I am seeing significant packet loss on an Intel 82574L NIC using the e1000e driver. I did not see this with kernel 2.6.30. I notice that 2.6.30 uses e1000e version 0.3.3.4-k4, whereas 2.6.31 uses version 1.0.2-k2.

I have tried setting IntMode to 0, 1, and 2, and InterruptThrottleRate to 0, 1, 3 (the default), 1000, 5000, 10000, and 100000. I have also tried booting with the "noapic" kernel parameter.

I am testing with ttcp, sending 100000 UDP packets of 1450 bytes each at about 910 Mbps. With InterruptThrottleRate set to 1, 3, 5000, or 10000, I see the following behaviour on the receiver side:

    ttcp -u -4 -l 1450 -s -fm -r
    ttcp-r: buflen=1450, nbuf=2048, align=16384/0, port=5001  udp
    ttcp-r: socket
    ttcp-r: 98486900 bytes in 1.22 real seconds = 617.22 Mbit/sec +++
    ttcp-r: 67924 I/O calls, msec/call = 0.02, calls/sec = 55794.64
    ttcp-r: 0.0user 0.0sys 0:01real 0% 0i+0d 0maxrss 0+0pf 4963+3csw

So in total (145000000 - 98486900)/1450 = 32078 out of 100000 packets were dropped, or about 32%.

This is the difference between /proc/interrupts (the change in each counter) before and after the test. lan0 is the interface being tested.
Notice that there are a significant number of interrupts on the "sequence error" interrupt; I'm guessing that's 57:

    55:      0      0      0   8603   PCI-MSI-edge
    56:      0      0      0     25   PCI-MSI-edge   (name garbled)
    57:   4868      0      0      0   PCI-MSI-edge   lan0
    67:      0      0      2      0   PCI-MSI-edge   (name garbled)
    68:      0      0      0      0   PCI-MSI-edge
    69:      0      0      0      0   PCI-MSI-edge   lan1

This is the difference between the output from 'ethtool -S lan0' before and after the test; only fields which changed are shown:

    rx_broadcast:          585730 -    581046 =      4684
    rx_bytes:           931068459 - 822452567 = 108615892
    rx_csum_offload_good:  650149 -    577551 =     72598
    rx_long_byte_count: 931068459 - 822452567 = 108615892
    rx_missed_errors:       31003 -         6 =     30997
    rx_packets:            655692 -    583072 =     72620
    rx_smbus:                5784 -      5763 =        21
    tx_broadcast:             972 -       969 =         3
    tx_bytes:              388453 -    385439 =      3014
    tx_packets:              3025 -      3012 =        13

Notice the large rx_missed_errors count, which indicates NIC FIFO or PCI bus exhaustion.

If I disable interrupt throttling or set the limit very high, e.g., 100000, the same test generates about 65,000 data interrupts and 93,000 error interrupts, and rx_missed_errors increases by 34,000. This suggests to me that the NIC is attempting to raise an interrupt for every packet received.

An Intel 82576 NIC in the same system, running on the igb driver, is performing OK under 2.6.31 (0 to 0.1% packet loss). For comparison, the same UDP test generates about 6000 interrupts on the 82576.

dmesg, dmidecode, ethtool, lspci, 'netstat -s', and /proc/interrupts output is attached.

N.B. I tried removing the 82576 NIC from the system before testing as well; no change.

- Kelvin
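P.S. For anyone who wants to reproduce the deltas above, here is a minimal Python sketch of the before/after comparison and the loss arithmetic. It is not the script used for this report. It assumes the counters are in the "name: value" form that 'ethtool -S' prints; the function names (parse_counters, diff_counters, packets_lost) are made up for illustration, and the tx_dropped line in the sample text is an invented unchanged counter, included only to show that unchanged fields are filtered out.

```python
import re


def parse_counters(text):
    """Parse 'name: value' lines into a dict, skipping anything else."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*([\w.\-]+):\s*(\d+)\s*$", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters


def diff_counters(before, after):
    """Return after - before for every counter whose value changed."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }


def packets_lost(sent_packets, payload_bytes, received_bytes):
    """Packets missing at the receiver, from the byte count ttcp reports."""
    return (sent_packets * payload_bytes - received_bytes) // payload_bytes


# Sample text: three counters taken from the report, plus one
# hypothetical unchanged counter (tx_dropped) for illustration.
BEFORE = """NIC statistics:
     rx_packets: 583072
     rx_missed_errors: 6
     tx_packets: 3012
     tx_dropped: 0
"""
AFTER = """NIC statistics:
     rx_packets: 655692
     rx_missed_errors: 31003
     tx_packets: 3025
     tx_dropped: 0
"""

delta = diff_counters(parse_counters(BEFORE), parse_counters(AFTER))
for name in sorted(delta):
    print(f"{name}: {delta[name]}")

lost = packets_lost(100000, 1450, 98486900)
print(f"lost {lost} of 100000 packets ({100.0 * lost / 100000:.0f}%)")
```

In practice the two text blocks would come from saving 'ethtool -S lan0' to a file before and after the ttcp run. The same parser should also cope with other "name: value" dumps, but /proc/interrupts has one column per CPU and would need its own parsing.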
Attachment:
testhost.dmesg.gz
Description: GNU Zip compressed data
Attachment:
testhost.dmidecode.gz
Description: GNU Zip compressed data
Attachment:
testhost.ethtool.gz
Description: GNU Zip compressed data
Attachment:
testhost.lspci.gz
Description: GNU Zip compressed data
Attachment:
testhost.netstat.gz
Description: GNU Zip compressed data
Attachment:
testhost.proc-interrupts.gz
Description: GNU Zip compressed data