Re: bad networking related lag in v2.6.22-rc2

Ingo Molnar wrote:
> if you feel inclined to try the git-bisection then by all means please 
> do it (it will certainly be helpful and educative), but it's optional: i 
> dont think you should 'need' to go through extra debugging chores, my 
> analysis based on the excellent trace you provided still holds and 
> whoever modified htb_dequeue()'s logic recently ought to be able to 
> figure that out (or send you a debug patch to further narrow the problem 
> down).
>
> The trace shows a _clearly_ anomalous loop: for example there's 56396 
> (!) calls to rb_first() in htb_dequeue() [without the kernel ever 
> exiting that function]:
> 
>   earth4:~/s> grep rb_first trace-to-ingo.txt  | wc -l
>   56396


How is this trace to be understood? Is it simply a call trace in
execution order? If that's the case, then we are exiting htb_dequeue:
each call to qdisc_watchdog_schedule happens at the very end of
that function, which would imply a bug in __qdisc_run.

Looking at the recent changes to __qdisc_run, this indeed seems
to be the case. When the qdisc is throttled and has packets queued,
we return a value != 0, causing __qdisc_run to loop until all
packets have been sent, which may be a long time.
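
To make the suspected loop concrete, here is a small userspace toy model
(not kernel code). All names (toy_qdisc, toy_dequeue, toy_restart,
toy_run) are hypothetical, the caller's loop condition is an assumption
taken from the description above, and the throttle is modelled as a
fixed number of failed dequeue attempts rather than a timer. With
report_idle false the restart step returns the queue length
unconditionally; with report_idle true it returns 0 whenever nothing
was dequeued.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a qdisc; NOT the kernel's struct Qdisc. */
struct toy_qdisc {
	int  qlen;           /* packets still queued */
	long throttled_for;  /* dequeue attempts that fail before the throttle lifts */
};

/* Hypothetical dequeue: hands out nothing while the qdisc is throttled. */
static bool toy_dequeue(struct toy_qdisc *q)
{
	if (q->throttled_for > 0) {
		q->throttled_for--;
		return false;
	}
	if (q->qlen == 0)
		return false;
	q->qlen--;
	return true;
}

/*
 * One restart pass.
 * report_idle == false: always return qlen.
 * report_idle == true:  return 0 when no packet was dequeued.
 */
static int toy_restart(struct toy_qdisc *q, bool report_idle)
{
	bool got_skb = toy_dequeue(q);

	assert(q->qlen >= 0);  /* userspace analogue of the BUG_ON check */
	if (report_idle && !got_skb)
		return 0;
	return q->qlen;
}

/* Assumed caller loop: keep going while a pass reports remaining work.
 * Returns the number of passes made. */
static long toy_run(struct toy_qdisc *q, bool report_idle)
{
	long passes = 1;

	while (toy_restart(q, report_idle) != 0)
		passes++;
	return passes;
}
```

In this model, a queue of 3 packets throttled for 1000 attempts makes
toy_run() spin through every throttled attempt when report_idle is
false, while with report_idle true it returns after a single pass and
leaves the packets queued for a later run.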

Anant, can you please verify by testing the attached patch? Thanks.
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f28bb2d..f536060 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -174,7 +174,7 @@ requeue:
 
 out:
 	BUG_ON((int) q->q.qlen < 0);
-	return q->q.qlen;
+	return skb ? q->q.qlen : 0;
 }
 
 void __qdisc_run(struct net_device *dev)
