On Fri, 10 Jun 2005 22:03, Con Kolivas wrote:
> On Fri, 10 Jun 2005 17:02, Ingo Molnar wrote:
> > * Martin J. Bligh <[email protected]> wrote:
> > > > I'm assuming it was the CPU scheduler patches. There are 36 of them
> > > > ;)
> >
> > So the candidates for the regression are:
> >
> > sched-implement-nice-support-across-physical-cpus-on-smp.patch
> > sched-change_prio_bias_only_if_queued.patch
> > sched-account_rt_tasks_in_prio_bias.patch
> > consolidate-preempt-options-into-kernel-kconfigpreempt.patch
> > enable-preempt_bkl-on-preemptsmp-too.patch
> > sched-tweak-idle-thread-setup-semantics.patch
> > sched-voluntary-kernel-preemption.patch
> > sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
> > sched-task_noninteractive.patch
> > sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
>
> These tend to run together, so just try adding my four patches together.
> In retrospect I guess they're likely candidates because they also change
> the _ratio_ of balance, which they should not, so they are buggy as a
> group currently. Easy enough to fix, but it will make it easy to pinpoint
> the problem if they're responsible.
>
> sched-implement-nice-support-across-physical-cpus-on-smp.patch
> sched-change_prio_bias_only_if_queued.patch
> sched-account_rt_tasks_in_prio_bias.patch
> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch

By the way, it has already been decided to remove these patches from -mm
pending the completion of the current scheduler work. If they turn out to be
responsible for this regression, I apologise profusely :-|.

It is clearer to me now that I have made a mistake with the priority
biasing, and the following patch corrects it to the planned behaviour. This
is academic at this stage, as we won't be looking at this particular feature
again in earnest until the other 32 scheduler patches (and any followups) go
upstream. It's already known that schedstats data will be off without
further code to understand smp nice as well (thanks Nick for pointing out
the data)... more academic stuff, but obviously something to consider
when/if we get there.

Cheers,
Con
The priority biasing was off: it multiplied the total load by the total
priority bias, and this ruins the ratio of loads between runqueues. This
patch should correct the ratios of loads between runqueues to be
proportional to overall load.

Signed-off-by: Con Kolivas <[email protected]>

Index: linux-2.6.12-rc6-mm1/kernel/sched.c
===================================================================
--- linux-2.6.12-rc6-mm1.orig/kernel/sched.c	2005-06-10 23:56:56.000000000 +1000
+++ linux-2.6.12-rc6-mm1/kernel/sched.c	2005-06-10 23:59:57.000000000 +1000
@@ -978,7 +978,7 @@ static inline unsigned long __source_loa
 	else
 		source_load = min(cpu_load, load_now);
 
-	if (idle == NOT_IDLE || rq->nr_running > 1)
+	if (idle == NOT_IDLE || rq->nr_running > 1) {
 		/*
 		 * If we are busy rebalancing the load is biased by
 		 * priority to create 'nice' support across cpus. When
@@ -987,7 +987,10 @@ static inline unsigned long __source_loa
 		 * prevent idle rebalance from trying to pull tasks from a
 		 * queue with only one running task.
 		 */
-		source_load *= rq->prio_bias;
+		unsigned long prio_bias = rq->prio_bias / rq->nr_running;
+
+		source_load *= prio_bias;
+	}
 
 	return source_load;
 }
@@ -1011,8 +1014,11 @@ static inline unsigned long __target_loa
 	else
 		target_load = max(cpu_load, load_now);
 
-	if (idle == NOT_IDLE || rq->nr_running > 1)
-		target_load *= rq->prio_bias;
+	if (idle == NOT_IDLE || rq->nr_running > 1) {
+		unsigned long prio_bias = rq->prio_bias / rq->nr_running;
+
+		target_load *= prio_bias;
+	}
 
 	return target_load;
 }
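To make the ratio argument concrete, here is a small user-space sketch (not
kernel code; the per-task bias of 10 and the LOAD_UNIT constant are made-up
numbers for illustration). It shows that multiplying total load by the raw
total prio_bias skews the inter-runqueue ratio with task count, while
dividing by nr_running first (i.e. using the average per-task bias) keeps
the ratio proportional to the raw load when the average priority is the
same.

/*
 * Toy illustration of the prio_bias fix (user-space sketch, not kernel
 * code). Assumed numbers: two runqueues of equal-priority tasks, a
 * per-task bias of 10 and a load unit of 128 are arbitrary choices.
 */
#include <stdio.h>

#define LOAD_UNIT 128UL		/* stand-in for one task's worth of load */

static unsigned long old_bias(unsigned long load, unsigned long prio_bias)
{
	/* pre-fix behaviour: total load * total bias */
	return load * prio_bias;
}

static unsigned long new_bias(unsigned long load, unsigned long prio_bias,
			      unsigned long nr_running)
{
	/* post-fix behaviour: total load * average per-task bias */
	return load * (prio_bias / nr_running);
}

int main(void)
{
	/* RQ A: 4 tasks, RQ B: 2 tasks, all at the same priority (bias 10 each) */
	unsigned long nr_a = 4, nr_b = 2;
	unsigned long load_a = nr_a * LOAD_UNIT, load_b = nr_b * LOAD_UNIT;
	unsigned long bias_a = nr_a * 10, bias_b = nr_b * 10;

	printf("raw load ratio A:B        = %lu:1\n", load_a / load_b);
	printf("old biased load ratio A:B = %lu:1\n",
	       old_bias(load_a, bias_a) / old_bias(load_b, bias_b));
	printf("new biased load ratio A:B = %lu:1\n",
	       new_bias(load_a, bias_a, nr_a) / new_bias(load_b, bias_b, nr_b));
	return 0;
}

With these assumed numbers the raw load ratio is 2:1; the old biasing
reports 4:1 because the bias itself grows with nr_running, while the
divided form reports 2:1, so only the average nice level (not the task
count squared) moves the balance.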