Frequent spurious tx_timeouts for libertas
Ben Hutchings
bhutchings at solarflare.com
Mon May 2 16:47:39 EDT 2011
On Mon, 2011-05-02 at 20:59 +0100, Daniel Drake wrote:
> On 2 May 2011 03:24, Ben Hutchings <bhutchings at solarflare.com> wrote:
> >> Also, while looking at this code, I spotted a bug in dev_watchdog():
> >>     /*
> >>      * old device drivers set dev->trans_start
> >>      */
> >>     trans_start = txq->trans_start ? : dev->trans_start;
> >>
> >> i.e. it is trying to figure out whether to read trans_start from txq
> >> or dev. In both cases, trans_start is updated based on the value of
> >> jiffies, which will occasionally be 0 (as it wraps around). Therefore
> >> this line of code will occasionally make the wrong decision.
> >
> > No, I don't think so.
> >
> > If only dev->trans_start is being updated then the watchdog reads that.
> > If both txq->trans_start and dev->trans_start are being updated then it
> > doesn't matter much which the watchdog reads.
> > If only txq->trans_start is being updated then dev->trans_start is
> > always set to 0, so when txq->trans_start is 0 the watchdog still gets
> > 0.
>
> dev->trans_start is unconditionally initialized by dev_activate() in
> sch_generic.c:
>
>     if (need_watchdog) {
>         dev->trans_start = jiffies;
>         dev_watchdog_up(dev);
>     }
>
> so it is (usually) not 0.
[...]
You're right. Seems like we have an incomplete compatibility hack that
can hurt drivers that are doing the right thing.
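
To make the failure concrete, here is a minimal userspace model of the
watchdog's choice. It is only a sketch, not kernel code: the model_*
structs and the watchdog_* helpers are made-up names, and I am assuming
the watchdog fires when the current time is past trans_start plus the
timeout, compared in the wrap-safe style of time_after().

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for netdev_queue and net_device; only the
 * field the watchdog looks at is modelled. */
struct model_txq { unsigned long trans_start; };
struct model_dev { unsigned long trans_start; };

/* Wrap-safe "a is after b", in the style of the kernel's time_after(). */
static int model_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Same selection as the quoted line, written without the GNU "?:"
 * extension: a queue stamp of 0 is read as "never set". */
static unsigned long watchdog_trans_start(const struct model_txq *txq,
                                          const struct model_dev *dev)
{
    return txq->trans_start ? txq->trans_start : dev->trans_start;
}

/* Assumed timeout test: now is later than last transmit + timeout. */
static int watchdog_would_fire(const struct model_txq *txq,
                               const struct model_dev *dev,
                               unsigned long now, unsigned long timeo)
{
    return model_time_after(now, watchdog_trans_start(txq, dev) + timeo);
}

/* The device stamp was set once at activation, long before the wrap.
 * The driver only updates the queue stamp. Returns 1 if the watchdog
 * fires for a transmit stamped exactly when the counter read 0. */
static int spurious_timeout_demo(void)
{
    struct model_dev dev = { ULONG_MAX - 100000UL };
    struct model_txq txq = { 1 };

    /* Stamp of 1, checked at time 10 with a timeout of 500: quiet. */
    assert(!watchdog_would_fire(&txq, &dev, 10, 500));

    /* Stamp of 0: the fallback picks the stale device stamp instead. */
    txq.trans_start = 0;
    return watchdog_would_fire(&txq, &dev, 10, 500);
}
```

In this model the queue was stamped only 10 ticks before the check, yet
the watchdog reports a timeout because it falls back to the activation
time.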
For those few single-queue drivers that need to update the transmit
time, perhaps we could add a dev_trans_update() as a wrapper for
txq_trans_update(). Then delete net_device::trans_start and change
dev_trans_start() to avoid using it.
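
Roughly what I have in mind, again as a userspace model and not a
patch. Everything prefixed model_ is a hypothetical stand-in,
model_jiffies plays the role of jiffies, and I am assuming every queue
gets stamped when the device is activated so that no "0 means unset"
test is needed.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_MAX_TXQ 4

static unsigned long model_jiffies;    /* stand-in for jiffies */

struct model_txq { unsigned long trans_start; };

/* Note: no device-level trans_start field any more. */
struct model_netdev {
    unsigned int num_tx_queues;
    struct model_txq tx[MODEL_MAX_TXQ];
};

/* Wrap-safe "a is after b", in the style of time_after(). */
static int model_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Models txq_trans_update(): stamp one queue with the current time. */
static void model_txq_trans_update(struct model_txq *txq)
{
    txq->trans_start = model_jiffies;
}

/* Proposed dev_trans_update(): single-queue drivers stamp queue 0. */
static void model_dev_trans_update(struct model_netdev *dev)
{
    model_txq_trans_update(&dev->tx[0]);
}

/* dev_trans_start() without the device field: the newest queue stamp,
 * with 0 treated as an ordinary time value. */
static unsigned long model_dev_trans_start(const struct model_netdev *dev)
{
    unsigned long res = dev->tx[0].trans_start;
    unsigned int i;

    for (i = 1; i < dev->num_tx_queues; i++) {
        unsigned long val = dev->tx[i].trans_start;

        if (model_time_after(val, res))
            res = val;
    }
    return res;
}

/* Stamp both queues just before the wrap, then queue 0 again at the
 * moment the counter reads 0, and report the newest stamp. */
static unsigned long wrap_demo(void)
{
    struct model_netdev dev = { .num_tx_queues = 2 };

    model_jiffies = ULONG_MAX - 50;
    model_dev_trans_update(&dev);
    model_jiffies = ULONG_MAX - 5;
    model_txq_trans_update(&dev.tx[1]);
    model_jiffies = 0;
    model_dev_trans_update(&dev);

    return model_dev_trans_start(&dev);
}
```

With the fallback gone, the stamp taken at 0 is correctly seen as the
most recent one, even though queue 1 holds a numerically larger value.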
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.