I just noticed that the Ubuntu setup says "GSI 20(level,low) -> IRQ 20"
whereas I remember my built kernels saying "No GSI...... IRQ 11". I'll
investigate what that means and how to enable it. Pointers appreciated.
sdw
Stephen D. Williams wrote:
I have been working for days to get a recent kernel to work with these
small-format UP Celeron 2Ghz (running at 1.33Ghz) motherboards that I
am planning to use as thin clients. I'm doing a PXE boot, loading
kernels, and trying to get networking to come up.
I eventually realized that the problem is that the e100 driver loads
but does not allow any packet traffic. The system isn't crashed, but
I do get transmit timeouts.
I've used kernels: 2.6.10, 2.6.11, and 2.6.12.4, stock with only the
"squashfs" patch applied and compiled as 586/....
The interesting thing is that Ubuntu 5.04, booted "Live" on the box,
works just fine with the e100 driver with a kernel shown as:
"2.6.10-5-386". I'm going to work on pulling this kernel and its
modules off to use.
Any help urgently appreciated.
sdw
Matti Aarnio wrote:
On Tue, Aug 09, 2005 at 09:16:21AM -0700, Daniel Walker wrote:
It looks like this might be an SMP race , it seem that both processors
are in e100_down(). There is a while loop in e100_clean_cbs() that
appears to have an unsafe looping condition .
It looks like cbs_avail might jump over params.cbs.count , then you
would have to wait for a rollover . Is this a PREEMPT_NONE kernel?
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_BKL is not set
which is probably same as "NONE".
There is _one_ processor in down, but other may be in trying to send
some data out, or otherwise polling the card.
However... while real bugs in their own sense, none of these are
as important as original "card dies" thing, during a recovery of
which all this soft-lockup merryment happens.
Also, as it happens only once a week or so (except when it happens
right after another), testing code patches is rather slow.
I can guess which things make it more likely, but I can't make it
happen at will.
/Matti Aarnio
This patch may help, but it's not a complete fix.
--- linux-2.6.12.orig/drivers/net/e100.c 2005-08-05
16:45:59.000000000 +0000
+++ linux-2.6.12/drivers/net/e100.c 2005-08-09
16:14:45.000000000 +0000
@@ -1393,7 +1393,7 @@ static inline int e100_tx_clean(struct n
static void e100_clean_cbs(struct nic *nic)
{
if(nic->cbs) {
- while(nic->cbs_avail != nic->params.cbs.count) {
+ while(nic->cbs_avail < nic->params.cbs.count) {
struct cb *cb = nic->cb_to_clean;
if(cb->skb) {
pci_unmap_single(nic->pdev,
On Tue, 2005-08-09 at 16:36 +0300, Matti Aarnio wrote:
Running very recent Fedora Core Development kernel I can following
soft-oops.. ( 2.6.12-1.1455_FC5smp )
e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
BUG: soft lockup detected on CPU#0!
Pid: 10743, comm: ifconfig
EIP: 0060:[<f88bf2f9>] CPU: 0
EIP is at e100_clean_cbs+0x2f/0x12b [e100]
EFLAGS: 00000293 Not tainted (2.6.12-1.1455_FC5smp)
EAX: 495c7c2b EBX: 495c7c2b ECX: f6c311a0 EDX: 00000000
ESI: 00000040 EDI: f6c30000 EBP: f71a4b20 DS: 007b ES: 007b
CR0: 8005003b CR2: 0804a544 CR3: 01e9cd80 CR4: 000006f0
[<f88c0708>] e100_down+0x66/0x9a [e100]
[<f88c1623>] e100_close+0xa/0xd [e100]
[<c02b7adb>] dev_close+0x40/0x7e
[<c02b8f59>] dev_change_flags+0x46/0xf5
[<c02f76b3>] devinet_ioctl+0x564/0x5df
[<c02af22c>] sock_ioctl+0xc3/0x250
[<c02af169>] sock_ioctl+0x0/0x250
[<c01762ef>] do_ioctl+0x1f/0x6d
[<c017648f>] vfs_ioctl+0x50/0x1c6
[<c0176662>] sys_ioctl+0x5d/0x6f
[<c010394d>] syscall_call+0x7/0xb
[<c014473f>] softlockup_tick+0x6f/0x80
[<c01085b8>] timer_interrupt+0x2d/0x75
[<c01448dd>] handle_IRQ_event+0x2e/0x5a
[<c01449cb>] __do_IRQ+0xc2/0x127
[<c0105f7e>] do_IRQ+0x4e/0x86
=======================
[<c01160cc>] smp_apic_timer_interrupt+0xc1/0xca
[<c0104382>] common_interrupt+0x1a/0x20
[<f88bf2f9>] e100_clean_cbs+0x2f/0x12b [e100]
[<f88c0708>] e100_down+0x66/0x9a [e100]
[<f88c1623>] e100_close+0xa/0xd [e100]
[<c02b7adb>] dev_close+0x40/0x7e
[<c02b8f59>] dev_change_flags+0x46/0xf5
[<c02f76b3>] devinet_ioctl+0x564/0x5df
[<c02af22c>] sock_ioctl+0xc3/0x250
[<c02af169>] sock_ioctl+0x0/0x250
[<c01762ef>] do_ioctl+0x1f/0x6d
[<c017648f>] vfs_ioctl+0x50/0x1c6
[<c0176662>] sys_ioctl+0x5d/0x6f
[<c010394d>] syscall_call+0x7/0xb
Preconditions for this are:
- E100 card stopped working for some reason (no idea why, it just
does sometimes at this oldish 2x P-III machine)
- There are active datastreams running in and out
(around 0.2 Mbps out, multiple megabits in.)
- Commanding then "ifconfig eth0 down" results in what feels like
system freezing, but it does recover in about 30-60 seconds
(it takes long enough for me to sweat bullets...)
- While in freeze state, keyboard can go crazy, but mouse does
respond, as well as tvtime shows bt848 captured live video.
-
To unsubscribe from this list: send the line "unsubscribe
linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
To unsubscribe from this list: send the line "unsubscribe
linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
begin:vcard
fn:Stephen Williams
n:Williams;Stephen
email;internet:[email protected]
tel;work:703-724-0118
tel;fax:703-995-0407
tel;pager:[email protected]
tel;home:703-729-5405
tel;cell:703-371-9362
x-mozilla-html:TRUE
version:2.1
end:vcard
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
|
|