On Sun, May 22, 2005 at 12:39:59AM +0200, Francois Romieu wrote:
> Jeff Garzik <[email protected]> :
> [tulip_select_media]
> > 1) called from timer context, from the media poll timer
> >
> > 2) called from spin_lock_irqsave() context, in the ->tx_timeout hook.
> >
> > The first case can be fixed by moving all the timer code to a workqueue.
> > Then when the existing timer fires, kick the workqueue.
> >
> > The second case can be fixed by kicking the workqueue upon tx_timeout
> > (which is why I did not suggest using queue_delayed_work()).
>
> First try below. It only moves tulip_select_media() to process context.
Cool - thanks.
> The original patch (with s/udelay/msleep/ or such) is not included.
That's fine. I'll take care of that once Jeff is happy with this.
> Comments/suggestions ?
Basic workqueue create/destroy looks correct - but I've only played with
workqueues once before, so it wouldn't hurt if someone else double-checked
too. Comments below are mostly about the other parts.
> +static inline int tulip_create_workqueue(void)
> +{
> + ktulipd_workqueue = create_workqueue("ktulipd");
> + return ktulipd_workqueue ? 0 : -ENOMEM;
> +}
This just obfuscates the code. It's only called in one place.
Please just directly call create_workqueue("ktulipd") from tulip_init()
and check the return value.
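Untested sketch of what I mean (assuming tulip_init() otherwise just
registers the PCI driver, as tulip_cleanup() suggests):

static int __init tulip_init (void)
{
        int ret;

        /* One workqueue shared by all tulip instances. */
        ktulipd_workqueue = create_workqueue("ktulipd");
        if (!ktulipd_workqueue)
                return -ENOMEM;

        ret = pci_register_driver(&tulip_driver);
        if (ret)
                destroy_workqueue(ktulipd_workqueue);
        return ret;
}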
> +static inline void tulip_destroy_workqueue(void)
> +{
> + destroy_workqueue(ktulipd_workqueue);
> +}
Same thing - just call destroy_workqueue(ktulipd_workqueue) directly
from tulip_cleanup().
> @@ -526,20 +549,9 @@ static void tulip_tx_timeout(struct net_
...
> + tp->timeout_recovery = 1;
> + queue_work(ktulipd_workqueue, &tp->media_work);
> + goto out_unlock;
This is the key bit: the expensive recovery moves out of the tx_timeout
path (which runs under spin_lock_irqsave()) into process context.
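For context, I'd expect the deferred side to look roughly like this
(sketch only - tulip_media_task and the INIT_WORK wiring are my guesses
at the patch's naming, using the three-argument INIT_WORK):

static void tulip_media_task(void *data)
{
        struct net_device *dev = data;

        /* Process context now, so tulip_select_media() may sleep. */
        tulip_select_media(dev, 0);
}

        /* ...and at init time: */
        INIT_WORK(&tp->media_work, tulip_media_task, dev);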
> - /* Stop and restart the chip's Tx processes . */
> -
> - tulip_restart_rxtx(tp);
> - /* Trigger an immediate transmit demand. */
> - iowrite32(0, ioaddr + CSR1);
> -
> - tp->stats.tx_errors++;
> + tulip_tx_timeout_complete(tp, ioaddr);
This doesn't fix the existing issue with tulip_restart_rxtx().
Even without the patch to tulip_select_media(),
tulip_restart_rxtx() does not comply with jgarzik's Linux driver
requirements because it can spin-delay for up to 1200us.
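Once that path runs from the workqueue, the busy-wait could become a
sleeping poll instead - illustrative sketch only, with BUSY_MASK
standing in for whatever CSR5 bits tulip_stop_rxtx() actually checks:

        /* Sleep between status polls instead of udelay(); msleep(1)
         * coarsens the granularity, but that's fine in process
         * context.
         */
        int i = 1200 / 10;      /* same ~1200us poll budget as today */

        while (--i && (ioread32(ioaddr + CSR5) & BUSY_MASK))
                msleep(1);      /* was udelay(10) */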
> static void __exit tulip_cleanup (void)
> {
> pci_unregister_driver (&tulip_driver);
> + tulip_destroy_workqueue();
> }
Only one workqueue for all instances of tulip cards, right?
...
> @@ -127,6 +128,14 @@ void tulip_timer(unsigned long data)
> }
> break;
> }
> +
> + spin_lock_irqsave (&tp->lock, flags);
> + if (tp->timeout_recovery) {
> + tp->timeout_recovery = 0;
> + tulip_tx_timeout_complete(tp, ioaddr);
> + }
> + spin_unlock_irqrestore (&tp->lock, flags);
This suffers from the original issue: IRQs stay blocked while the CPU
might spin for up to 1200us in tulip_tx_timeout_complete().
If tp->timeout_recovery acts as a sort of semaphore for us,
do we even need the spinlock?
I suspect "yes", because timeout_recovery is a bitfield and clearing
it is a read/modify/write operation. This is why I don't like bitfields.
i.e. something like:
        if (tp->timeout_recovery) {
                tulip_tx_timeout_complete(tp, ioaddr);
                spin_lock_irqsave (&tp->lock, flags);
                tp->timeout_recovery = 0; /* Bitfields are NOT atomic. */
                spin_unlock_irqrestore (&tp->lock, flags);
        }
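An alternative that drops the lock here entirely - sketch, assuming
tp grew an "unsigned long flags" word for atomic bitops instead of
the bitfield (TULIP_TIMEOUT_RECOVERY is a hypothetical bit number):

        /* test_and_clear_bit() is atomic, so no spinlock is needed
         * just to clear the flag.  This still doesn't address the
         * 1200us spin inside tulip_tx_timeout_complete() itself.
         */
        if (test_and_clear_bit(TULIP_TIMEOUT_RECOVERY, &tp->flags))
                tulip_tx_timeout_complete(tp, ioaddr);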
thanks,
grant