Re: Linux-2.6.21 hangs during post boot initialization phase

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Neil Horman wrote:
On Fri, Apr 27, 2007 at 04:05:11PM +1000, Peter Williams wrote:
Linus Torvalds wrote:
On Fri, 27 Apr 2007, Peter Williams wrote:
The 2.6.21 kernel is hanging during the post boot phase where various daemons
are being started (not always the same daemon unfortunately).

This problem was not present in 2.6.21-rc7 and there is no oops or other
unusual output in the system log at the time the hang occurs.
Can you use "git bisect" to narrow it down a bit more? It's only 125 commits, so bisecting even just three or four kernels will narrow it down to a handful.
As the changes became, smaller the builds became quicker :-) and after 7 iterations we have:


author	Neil Horman <[email protected]>
	Fri, 20 Apr 2007 13:54:58 +0000 (09:54 -0400)
committer	Jeff Garzik <[email protected]>
	Tue, 24 Apr 2007 16:43:07 +0000 (12:43 -0400)
commit	b748d9e3b80dc7e6ce6bf7399f57964b99a4104c
tree	887909e1f735bb444ef0e3e370f34401fa6eee02	tree | snapshot
parent	d91c088b39e3c66d309938de858775bb90fd1ead	commit | diff
sis900: Allocate rx replacement buffer before rx operation

The sis900 driver appears to have a bug in which the receive routine
passes the skbuff holding the received frame to the network stack before
refilling the buffer in the rx ring. If a new skbuff cannot be allocated, the
driver simply leaves a hole in the rx ring, which causes the driver to stop
receiving frames and become non-recoverable without an rmmod/insmod according to reporters. This patch reverses that order, attempting to allocate a replacement buffer first, and receiving the new frame only if one can be allocated. If no skbuff can be allocated, the current skbuf in the rx ring is recycled, dropping
the current frame, but keeping the NIC operational.

Signed-off-by: Neil Horman <[email protected]>
Signed-off-by: Jeff Garzik <[email protected]>

Peter
--
Peter Williams                                   [email protected]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce

This was reported to me last night, and I've posted a patch to fix it, its
available here:
http://marc.info/?l=linux-netdev&m=117761259222165&w=2

It applies on top of the previous patch, and should fix your problem.

Here's a copy of the patch

Thanks & Regards
Neil


diff --git a/drivers/net/sis900.c b/drivers/net/sis900.c
index a6a0f09..7e44939 100644
--- a/drivers/net/sis900.c
+++ b/drivers/net/sis900.c
@@ -1754,6 +1754,7 @@ static int sis900_rx(struct net_device *net_dev)
 			sis_priv->rx_ring[entry].cmdsts = RX_BUF_SIZE;
 		} else {
 			struct sk_buff * skb;
+			struct sk_buff * rx_skb;
pci_unmap_single(sis_priv->pci_dev,
 				sis_priv->rx_ring[entry].bufptr, RX_BUF_SIZE,
@@ -1787,10 +1788,10 @@ static int sis900_rx(struct net_device *net_dev)
 			}
/* give the socket buffer to upper layers */
-			skb = sis_priv->rx_skbuff[entry];
-			skb_put(skb, rx_size);
-			skb->protocol = eth_type_trans(skb, net_dev);
-			netif_rx(skb);
+			rx_skb = sis_priv->rx_skbuff[entry];
+			skb_put(rx_skb, rx_size);
+			skb->protocol = eth_type_trans(rx_skb, net_dev);
+			netif_rx(rx_skb);
/* some network statistics */
 			if ((rx_status & BCAST) == MCAST)

This patch fixes the problem for me.

Peter
--
Peter Williams                                   [email protected]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux