Ingo Molnar wrote:
> if you feel inclined to try the git-bisection then by all means please
> do it (it will certainly be helpful and educative), but it's optional: i
> dont think you should 'need' to go through extra debugging chores, my
> analysis based on the excellent trace you provided still holds and
> whoever modified htb_dequeue()'s logic recently ought to be able to
> figure that out (or send you a debug patch to further narrow the problem
> down).
>
> The trace shows a _clearly_ anomalous loop: for example there's 56396
> (!) calls to rb_first() in htb_dequeue() [without the kernel ever
> exiting that function]:
>
> earth4:~/s> grep rb_first trace-to-ingo.txt | wc -l
> 56396

How is this trace to be understood? Is it simply a call trace in
execution order? If that's the case, then we are exiting htb_dequeue():
each call to qdisc_watchdog_schedule() happens at the very end of
that function. That would imply a bug in __qdisc_run().

Looking at the recent changes to __qdisc_run(), this indeed seems
to be the case: when the qdisc is throttled and has packets queued,
we return a value != 0, causing __qdisc_run() to loop until all
packets have been sent, which may take a long time.

Anant, can you please verify by testing the attached patch? Thanks.

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f28bb2d..f536060 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -174,7 +174,7 @@ requeue:
 out:
 	BUG_ON((int) q->q.qlen < 0);
-	return q->q.qlen;
+	return skb ? q->q.qlen : 0;
 }
 void __qdisc_run(struct net_device *dev)
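
To illustrate the reasoning above, here is a minimal user-space sketch of
the looping behaviour. It is not kernel code: struct toy_qdisc, toy_dequeue(),
toy_restart_old(), toy_restart_new() and toy_run() are hypothetical names
invented for this illustration, and the loop shape (call the restart function
until it returns 0) is an assumption based on the description above. The
max_iter cap exists only so the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a qdisc: only what the illustration needs. */
struct toy_qdisc {
	int  qlen;       /* packets still queued */
	bool throttled;  /* true while the shaper refuses to hand out packets */
};

/* Toy dequeue: returns true if a packet was handed out. */
static bool toy_dequeue(struct toy_qdisc *q)
{
	if (q->throttled || q->qlen == 0)
		return false;
	q->qlen--;
	return true;
}

/* Models the return value before the patch: always the queue length. */
static int toy_restart_old(struct toy_qdisc *q)
{
	(void)toy_dequeue(q);
	assert(q->qlen >= 0);   /* plays the role of the BUG_ON() */
	return q->qlen;
}

/* Models the return value after the patch: 0 when nothing was dequeued. */
static int toy_restart_new(struct toy_qdisc *q)
{
	bool got_skb = toy_dequeue(q);
	assert(q->qlen >= 0);
	return got_skb ? q->qlen : 0;
}

/*
 * Models the caller's loop: keep calling restart() until it returns 0.
 * Returns the number of calls made, capped at max_iter.
 */
static int toy_run(struct toy_qdisc *q,
		   int (*restart)(struct toy_qdisc *), int max_iter)
{
	int calls = 0;

	while (calls < max_iter) {
		calls++;
		if (restart(q) == 0)
			break;
	}
	return calls;
}
```

With a throttled queue holding 3 packets and a cap of 1000, toy_run() with
toy_restart_old() uses all 1000 calls, since the return value never reaches
0 while nothing can be dequeued. With toy_restart_new() it stops after a
single call and leaves the 3 packets queued for the watchdog to pick up later.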