Re: patch tulip-natsemi-dp83840a-phy-fix.patch added to -mm tree

On Sun, May 22, 2005 at 12:39:59AM +0200, Francois Romieu wrote:
> Jeff Garzik <[email protected]> :
> [tulip_media_select]
> > 1) called from timer context, from the media poll timer
> > 
> > 2) called from spin_lock_irqsave() context, in the ->tx_timeout hook.
> > 
> > The first case can be fixed by moving all the timer code to a workqueue.
> > Then when the existing timer fires, kick the workqueue.
> > 
> > The second case can be fixed by kicking the workqueue upon tx_timeout 
> > (which is the reason why I did not suggest queue_delayed_work() use).
> 
> First try below. It only moves tulip_select_media() to process context.

Cool - thanks.

> The original patch (with s/udelay/msleep/ or such) is not included.

That's fine. I'll take care of that once Jeff is happy with this.


> Comments/suggestions ?

The basic workqueue create/destroy looks correct, but I've only played
with workqueues once before, so it wouldn't hurt if someone else
double-checked too.

Comments below are mostly about the other parts.


> +static inline int tulip_create_workqueue(void)
> +{
> +	ktulipd_workqueue = create_workqueue("ktulipd");
> +	return ktulipd_workqueue ? 0 : -ENOMEM;
> +}

This wrapper just obfuscates the code - it's only called in one place.
Please call create_workqueue("ktulipd") directly from tulip_init()
and check the return value.
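
ie. something like this in tulip_init() - a rough sketch, with the
surrounding init order elided:

	ktulipd_workqueue = create_workqueue("ktulipd");
	if (!ktulipd_workqueue)
		return -ENOMEM;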

> +static inline void tulip_destroy_workqueue(void)
> +{
> +	destroy_workqueue(ktulipd_workqueue);
> +}

Same thing.

> @@ -526,20 +549,9 @@ static void tulip_tx_timeout(struct net_
...
> +		tp->timeout_recovery = 1;
> +		queue_work(ktulipd_workqueue, &tp->media_work);
> +		goto out_unlock;

This is the key bit.
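
For reference, I'd expect the work handler side to end up looking
roughly like this - a sketch from my reading of the patch, so the
handler name and the exact tulip_select_media() call are my
assumptions, not quoted from it:

	static void tulip_media_task(void *data)	/* name assumed */
	{
		struct net_device *dev = data;
		struct tulip_private *tp = netdev_priv(dev);

		/* Process context: sleeping (e.g. msleep()) is now
		 * legal where the old timer path had to udelay(). */
		tulip_select_media(dev, 0);
	}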

> -	/* Stop and restart the chip's Tx processes . */
> -
> -	tulip_restart_rxtx(tp);
> -	/* Trigger an immediate transmit demand. */
> -	iowrite32(0, ioaddr + CSR1);
> -
> -	tp->stats.tx_errors++;
> +	tulip_tx_timeout_complete(tp, ioaddr);

This doesn't fix the existing issue with tulip_restart_rxtx().
Even without the patch to tulip_select_media(), tulip_restart_rxtx()
does not comply with jgarzik's Linux driver requirements because it
can spin-delay for up to 1200us.
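
Once the restart also runs in process context, the same trick applies:
the busy-wait can sleep. A minimal sketch - chip_stopped() below is a
hypothetical helper standing in for the driver's real CSR status test:

	int limit = 10;	/* bound the wait; was up to 1200us of udelay() */

	while (!chip_stopped(ioaddr) && --limit)
		msleep(1);	/* sleep instead of spinning with IRQs blocked */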


>  static void __exit tulip_cleanup (void)
>  {
>  	pci_unregister_driver (&tulip_driver);
> +	tulip_destroy_workqueue();
>  }

Only one workqueue for all instances of tulip cards, right?


...
> @@ -127,6 +128,14 @@ void tulip_timer(unsigned long data)
>  	}
>  	break;
>  	}
> +
> +	spin_lock_irqsave (&tp->lock, flags);
> +	if (tp->timeout_recovery) {
> +		tp->timeout_recovery = 0;
> +		tulip_tx_timeout_complete(tp, ioaddr);
> +	}
> +	spin_unlock_irqrestore (&tp->lock, flags);


This suffers from the original issue: IRQs stay blocked while the CPU
may spin for up to 1200us in tulip_tx_timeout_complete().

If tp->timeout_recovery acts as a sort of semaphore for us,
do we even need the spinlock?

I suspect "yes", because timeout_recovery is a bitfield and clearing
it is a read/modify/write operation. This is why I don't like bitfields.

ie. something like this, which also keeps the potential 1200us spin
outside the IRQs-off region:
	if (tp->timeout_recovery) {
		tulip_tx_timeout_complete(tp, ioaddr);

		spin_lock_irqsave (&tp->lock, flags);
		tp->timeout_recovery = 0;	/* Bitfields are NOT atomic. */
		spin_unlock_irqrestore (&tp->lock, flags);
	}
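
If it weren't a bitfield, an atomic bitop would make the lock
unnecessary here - e.g. (sketch; assumes converting timeout_recovery
into a bit in an unsigned long, which the patch doesn't do):

	/* tp->task_flags: hypothetical unsigned long in tulip_private */
	if (test_and_clear_bit(TULIP_TIMEOUT_RECOVERY, &tp->task_flags))
		tulip_tx_timeout_complete(tp, ioaddr);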

thanks,
grant