Severe problem with e1000 driver in 2.4.31/32 (at least)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

today we had to experience a bug in e1000 network driver on this type of
network card (a current PCI e1000 sold everywhere):

02:07.0 Ethernet controller: Intel Corp.: Unknown device 107c (rev 05)
        Subsystem: Intel Corp.: Unknown device 1376
        Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 18
        Memory at f9000000 (32-bit, non-prefetchable) [size=128K]
        Memory at f9020000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at a800 [size=64]
        Expansion ROM at <unassigned> [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2
        Capabilities: [e4] PCI-X non-bridge device.

We can simply crash the box (2.4.32 stock kernel, 2.4.31 is the same) by
performing this:

1) Boot box with network physically disconnected
2) start pinging some host somewhere, you get "unreachable", let it run
3) connect the network and await some ping replies.
4) disconnect the network again
5) box is dead

The box runs into a BUG in e1000_hw.c line 5052. The BUG shows up because the
code is obviously executed inside an interrupt, which seems not intended.
As this BUG is always reproducable and pretty annoying we made this pretty bad
workaround:

In e1000_osdep.h replace:

#define msec_delay(x)   do { if(in_interrupt()) { \
                                /* Don't mdelay in interrupt context! */ \
                                BUG(); \
                        } else { \
                                set_current_state(TASK_UNINTERRUPTIBLE); \
                                schedule_timeout((x * HZ)/1000 + 2); \
                        } } while(0)

with

#define msec_delay(x)   do { if(in_interrupt()) { \
                                /* Don't mdelay in interrupt context! */ \
                                mdelay(x); \
                        } else { \
                                set_current_state(TASK_UNINTERRUPTIBLE); \
                                schedule_timeout((x * HZ)/1000 + 2); \
                        } } while(0)


Obviously this is not a solution, but it saves the world at least. This is why
I offer no applyable patch.

Someone with inside knowledge of this driver should take a look why
e1000_config_dsp_after_link_change in e1000_hw.c is executed in interrupt
context. Btw I don't know if this shows up in 2.6 either.

-- 
Regards,
Stephan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux