Hi

On Wed, 22 Jun 2005 20:25, Ingo Molnar wrote:
> William Weston reported unusually high scheduling latencies on his x86
> HT box, on the -RT kernel. I managed to reproduce it on my HT box and
> the latency tracer shows the incident in action:

Thanks for picking this up. I've had a long hard look at the code and your
patch.

> the reason for this anomaly is the following code in dependent_sleeper():
>
>         /*
>          * If a user task with lower static priority than the
>          * running task on the SMT sibling is trying to schedule,
>          * delay it till there is proportionately less timeslice
>          * left of the sibling task to prevent a lower priority
>          * task from using an unfair proportion of the
>          * physical cpu's resources. -ck
>          */
> [...]
>         if (((smt_curr->time_slice * (100 -
>                 sd->per_cpu_gain) / 100) > task_timeslice(p)))
>                 ret = 1;
>
> note that in contrast to the comment above, we dont actually do the
> check based on static priority, we do the check based on timeslices. But
> timeslices go up and down, and even highprio tasks can randomly have
> very low timeslices (just before their next refill) and can thus be
> judged as 'lowprio' by the above piece of code.

I don't see it like that. task_timeslice(p) will always return the same
value, based purely on static priority, and smt_curr->time_slice cannot ever
be larger than task_timeslice(p) unless there is a significant enough 'nice'
difference. It is not smt_curr that is rescheduled as a result of this test;
it is p that is not scheduled, and we look at p's task_timeslice, which does
not alter. The task that is delayed in either case is dependent on its static
priority, which determines its task_timeslice(), compared against the current
value of ->time_slice on the sibling, which is emptied as that task runs and
is expected to fluctuate.

> This condition is
> clearly buggy. The correct test is to check for static_prio _and_ to
> check for the preemption priority. Even on different static priority
> levels, a higher-prio interactive task should not be delayed due to a
> higher-static-prio CPU hog.

> -       if (((smt_curr->time_slice * (100 - sd->per_cpu_gain) /
> -               100) > task_timeslice(p)))
> +       if (smt_curr->static_prio < p->static_prio &&
> +               !TASK_PREEMPTS_CURR(p, smt_rq) &&
> +               smt_slice(smt_curr, sd) > task_timeslice(p))

Checking for smt_curr->static_prio < p->static_prio appears redundant to me,
because the condition can only be met if there is a significant difference in
timeslices, as I mentioned above.

> +       if (TASK_PREEMPTS_CURR(p, smt_rq) &&

Is this check necessary? The proportion is supposed to be distributed
according to static priority only.

If this code is causing large latencies then I believe it can only occur with
different nice levels running on siblings and high priority tasks starting new
timeslices repeatedly and never getting to the last per_cpu_gain% of their
timeslice. Ingo, do you think this might be what is being seen? If this truly
can happen then this code will have to move to a jiffy-based proportion, as
the real time code is, to prevent this problem.

Cheers,
Con
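
The timeslice argument above can be checked with a small stand-alone model. This is only a sketch under stated assumptions: the constants assume HZ=1000 and are modelled on the 2.6-era O(1) scheduler rather than taken from this thread, per_cpu_gain = 25 in the examples is an assumed value, and the helper names (nice_to_prio, scale_prio, model_task_timeslice, sibling_delays_task) are hypothetical stand-ins, not kernel functions.

```c
#include <assert.h>

/* ASSUMED constants (HZ=1000, values in ms), modelled on the 2.6-era
 * O(1) scheduler; they are not quoted anywhere in this thread. */
enum {
    MAX_RT_PRIO   = 100,
    MAX_PRIO      = 140,
    MAX_USER_PRIO = 40,
    MIN_TIMESLICE = 5,
    DEF_TIMESLICE = 100
};

/* Hypothetical helper: map a nice level (-20..19) to a static priority. */
int nice_to_prio(int nice)
{
    return MAX_RT_PRIO + nice + 20;
}

/* Hypothetical helper: scale a base slice by priority, with a floor. */
int scale_prio(int base, int prio)
{
    int slice = base * (MAX_PRIO - prio) / (MAX_USER_PRIO / 2);
    return slice > MIN_TIMESLICE ? slice : MIN_TIMESLICE;
}

/* Model of task_timeslice(): depends on static priority only, so it
 * returns the same value for a task no matter how much it has run. */
int model_task_timeslice(int static_prio)
{
    if (static_prio < nice_to_prio(0))
        return scale_prio(DEF_TIMESLICE * 4, static_prio);
    return scale_prio(DEF_TIMESLICE, static_prio);
}

/* The test under discussion: returns 1 if p (identified here only by its
 * static priority) would be delayed, given the timeslice the sibling's
 * current task still has left and the domain's per_cpu_gain percentage. */
int sibling_delays_task(int smt_curr_time_slice, int per_cpu_gain,
                        int p_static_prio)
{
    assert(per_cpu_gain >= 0 && per_cpu_gain <= 100);
    return smt_curr_time_slice * (100 - per_cpu_gain) / 100
           > model_task_timeslice(p_static_prio);
}
```

In this model, with the assumed gain of 25, a task at the same nice level as the sibling's task is never delayed, even when the sibling has just been given a full slice. A nice 10 task against a nice 0 sibling is delayed only until the sibling has used roughly a third of its slice.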