Re: bad networking related lag in v2.6.22-rc2

Ingo Molnar wrote:
> if you feel inclined to try the git-bisection then by all means please 
> do it (it will certainly be helpful and educative), but it's optional: i 
> dont think you should 'need' to go through extra debugging chores, my 
> analysis based on the excellent trace you provided still holds and 
> whoever modified htb_dequeue()'s logic recently ought to be able to 
> figure that out (or send you a debug patch to further narrow the problem 
> down).
>
> The trace shows a _clearly_ anomalous loop: for example there's 56396 
> (!) calls to rb_first() in htb_dequeue() [without the kernel ever 
> exiting that function]:
> 
>   earth4:~/s> grep rb_first trace-to-ingo.txt  | wc -l
>   56396

How is this trace to be understood? Is it simply a call trace in
execution order? If that's the case, then we are exiting htb_dequeue:
each call to qdisc_watchdog_schedule happens at the very end of
that function, which would imply a bug in __qdisc_run.

Looking at the recent changes to __qdisc_run, this indeed seems
to be the case: when the qdisc is throttled and has packets queued,
we return a value != 0, causing __qdisc_run to loop until all
packets have been sent, which may take a long time.
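
To illustrate the suspected behaviour outside the kernel, here is a
rough userspace sketch. It is not the kernel code: struct toy_qdisc,
toy_dequeue, toy_restart_old, toy_restart_patched and toy_run are
made-up names, and the caller loop assumes, per the description above,
that it keeps going for as long as the return value is != 0. The
max_iters cap exists only so that the toy terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a qdisc; only what the illustration needs. */
struct toy_qdisc {
	unsigned int qlen;	/* packets still queued */
	bool throttled;		/* true: dequeue hands out no packet for now */
};

/* Returns true if a packet was dequeued (skb != NULL in the real code). */
static bool toy_dequeue(struct toy_qdisc *q)
{
	if (q->throttled || q->qlen == 0)
		return false;
	q->qlen--;
	return true;
}

/* Old convention: always report the queue length. */
static unsigned int toy_restart_old(struct toy_qdisc *q)
{
	toy_dequeue(q);
	return q->qlen;
}

/* Patched convention: report 0 when nothing could be dequeued. */
static unsigned int toy_restart_patched(struct toy_qdisc *q)
{
	bool got_skb = toy_dequeue(q);

	return got_skb ? q->qlen : 0;
}

/*
 * Assumed shape of the caller: loop while the restart function returns
 * a value != 0. Returns the number of calls made, capped at max_iters.
 */
static unsigned long toy_run(struct toy_qdisc *q,
			     unsigned int (*restart)(struct toy_qdisc *),
			     unsigned long max_iters)
{
	unsigned long iters = 0;

	assert(q != NULL && restart != NULL);
	while (iters < max_iters) {
		iters++;
		if (restart(q) == 0)
			break;
	}
	return iters;
}
```

With a throttled toy queue holding 5 packets, toy_run with
toy_restart_old spins until it hits the cap, while toy_restart_patched
returns after a single call and leaves the packets queued. With an
unthrottled queue both variants drain the 5 packets in 5 calls.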

Anant, can you please verify by testing the attached patch? Thanks.

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f28bb2d..f536060 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -174,7 +174,7 @@ requeue:
 
 out:
 	BUG_ON((int) q->q.qlen < 0);
-	return q->q.qlen;
+	return skb ? q->q.qlen : 0;
 }
 
 void __qdisc_run(struct net_device *dev)

