Re: sungem hangs in atomic if netconsole enabled but no carrier

Johannes Berg <[email protected]> :
[...]
> think it should not hang the system completely. So far I haven't been
> able to figure out where it actually hangs and don't even know how to do
> so -- I'm open for suggestions on how to find out why/where it hangs or
> even fixes.

See the thread "Netconsole violates dev->hard_start_xmit synch rules",
started on 06/09/2005 on [email protected], for some interesting
background.

(the innocent hero slowly fades into the swamps of netpolling...)

Still with us?

If you were using sundance.c, you would probably hit a bug on the first timeout:

[net/sched/sch_generic.c]
static void dev_watchdog(unsigned long arg)
{
        struct net_device *dev = (struct net_device *)arg;

        spin_lock(&dev->xmit_lock);
        ^^^^^^^^^
        if (dev->qdisc != &noop_qdisc) {
                if (netif_device_present(dev) &&
                    netif_running(dev) &&
                    netif_carrier_ok(dev)) {
                        if (netif_queue_stopped(dev) &&
                            (jiffies - dev->trans_start) > dev->watchdog_timeo) {
                                printk(KERN_INFO "NETDEV WATCHDOG: %s: transmit timed out\n", dev->name);
                                dev->tx_timeout(dev);
                                ^^^^^^^^^^^^^^^

[net/core/netpoll.c]
static void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
{
        int status;
        struct netpoll_info *npinfo;

        if (!np || !np->dev || !netif_running(np->dev)) {
                __kfree_skb(skb);
                return;
        }

        npinfo = np->dev->npinfo;

        /* avoid recursion */
        if (npinfo->poll_owner == smp_processor_id() ||
            np->dev->xmit_lock_owner == smp_processor_id()) {
                if (np->drop)
                        np->drop(skb);
                else
                        __kfree_skb(skb);
                return;
        }

        do {
                npinfo->tries--;
                spin_lock(&np->dev->xmit_lock);
                ^^^^^^^^^

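A quick glance shows no netif_carrier_{on/off} in the sundance driver,
so dev_watchdog() cannot be held back by the netif_carrier_ok() test
there. That makes it a good candidate for this kind of hang.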
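
The two spin_lock() calls underlined above can be modelled in user space.
What follows is only a sketch under assumptions, not kernel code: the
names model_printk(), model_tx_timeout(), model_netpoll_send() and
model_dev_watchdog() are made up, a C11 atomic_flag stands in for
dev->xmit_lock, and a trylock replaces the real spin so that the model
reports the would-be hang instead of hanging. It also assumes that the
driver's tx_timeout handler printks, and it leaves out the
xmit_lock_owner test.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for dev->xmit_lock: a non-recursive lock. */
static atomic_flag xmit_lock = ATOMIC_FLAG_INIT;

/* Set when the logging path finds xmit_lock already taken. */
static bool would_hang;

/* Returns true if the lock was free and is now held by the caller. */
static bool model_trylock(void)
{
        return !atomic_flag_test_and_set(&xmit_lock);
}

static void model_unlock(void)
{
        atomic_flag_clear(&xmit_lock);
}

/*
 * Models netpoll_send_skb(): it needs xmit_lock to hand the log line
 * to the driver. A real spin_lock() would spin forever here if the
 * same CPU already held the lock; the model only records the fact.
 */
static void model_netpoll_send(void)
{
        if (!model_trylock()) {
                would_hang = true;
                return;
        }
        /* ... the packet would be passed to hard_start_xmit here ... */
        model_unlock();
}

/* Models a printk that netconsole forwards to the network device. */
bool model_printk(void)
{
        would_hang = false;
        model_netpoll_send();
        return would_hang;
}

/* Models a tx_timeout handler that logs a message (assumption). */
static void model_tx_timeout(void)
{
        model_netpoll_send();
}

/* Models dev_watchdog(): takes xmit_lock, then calls tx_timeout. */
bool model_dev_watchdog(void)
{
        bool locked = model_trylock();

        assert(locked);         /* nobody else holds it in this model */
        would_hang = false;
        model_tx_timeout();
        model_unlock();
        return would_hang;
}
```

In this model, model_printk() called on its own returns false, while
model_dev_watchdog() returns true: the log line is sent with the lock
already held by the caller.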

However you are using sungem.c and despite the fact that I should really
have something for dinner *now*, you are protected by netif_carrier_off.

But (drums roll):

[drivers/net/sungem.c]
#define DEFAULT_MSG     (NETIF_MSG_DRV          | \
                         NETIF_MSG_PROBE        | \
                         NETIF_MSG_LINK)

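Thus, with NETIF_MSG_LINK set by default, gem_link_timer() will
periodically complain that the link is down.

With netconsole enabled, each of these printks is sent through the
device, so gem_start_xmit() is issued.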

Repeat until the TX ring is full: netif_stop_queue() is called.

gem_link_timer() printks.

net/core/netpoll.c::netpoll_send_skb() notices that the queue is stopped
and decides to try the usual NAPI poll(). A few function calls later, the
driver ends up in drivers/net/sungem.c::gem_poll(), where it takes so many
(irq-)locks that I do not even want to verify whether it has a chance
to play nice with the pending gem_link_timer().
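
The sequence above can be sketched as a small user-space model. It is
only an illustration under assumptions: struct model_dev, the model_*()
functions and the ring size of 4 are made up, and the model assumes that
nothing queued is ever completed while the carrier is down. It does not
reproduce the locking in gem_poll(); it only shows how the link-down
messages alone fill the ring and push the send path into polling.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_TX_RING_SIZE 4    /* hypothetical; not the real ring size */

struct model_dev {
        bool carrier_ok;        /* false: no link, as in the report */
        bool queue_stopped;     /* models netif_stop_queue() state */
        int  tx_pending;        /* frames sitting in the TX ring */
        int  polls;             /* times the send path had to poll */
};

/* Models gem_start_xmit(): queue one frame, stop the queue when full. */
static void model_start_xmit(struct model_dev *dev)
{
        assert(dev->tx_pending < MODEL_TX_RING_SIZE);
        dev->tx_pending++;
        if (dev->tx_pending == MODEL_TX_RING_SIZE)
                dev->queue_stopped = true;
}

/* Models TX completion: assumed to make progress only with a carrier. */
static void model_tx_complete(struct model_dev *dev)
{
        if (!dev->carrier_ok)
                return;
        dev->tx_pending = 0;
        dev->queue_stopped = false;
}

/* Models netpoll_send_skb() for one log line. */
static void model_netconsole_write(struct model_dev *dev)
{
        if (dev->queue_stopped) {
                dev->polls++;                   /* fall back to poll() */
                model_tx_complete(dev);
                if (dev->queue_stopped)
                        return;                 /* still stuck: give up */
        }
        model_start_xmit(dev);
}

/* Models gem_link_timer(): complains for as long as the link is down. */
void model_link_timer(struct model_dev *dev)
{
        if (!dev->carrier_ok)
                model_netconsole_write(dev);
}
```

Calling model_link_timer() MODEL_TX_RING_SIZE times on a zeroed
struct model_dev leaves queue_stopped set; every later call goes through
the polling branch and gets nowhere, which is the point at which the
real driver would enter gem_poll().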

--
Ueimor