Re: [PATCH] sched: move enough load to balance average load per task

Siddha, Suresh B wrote:
> On Wed, Apr 12, 2006 at 09:46:32AM +1000, Peter Williams wrote:
>> Siddha, Suresh B wrote:
>>> On Mon, Apr 10, 2006 at 04:45:32PM +1000, Peter Williams wrote:
>>>> Problem:
>>>>
>>>> The current implementation of find_busiest_group() recognizes that approximately equal average loads per task for each group/queue are desirable (e.g. this condition will increase the probability that the top N highest priority tasks on an N CPU system will be on different CPUs) by being slightly more aggressive when *imbalance is small but the average load per task in the "busiest" group is more than that in "this" group. Unfortunately, the amount moved from "busiest" to "this" is too small to reduce the average load per task on "busiest" (at best there will be no change and at worst it will get bigger).
>>> Peter, We don't need to reduce the average load per task on "busiest"
>>> always. By moving a "busiest_load_per_task", we will increase the
>>> average load per task of the less busy CPU (thereby trying to achieve
>>> equality with busiest...)
>> Well, first off, we don't always move busiest_load_per_task; we move UP TO busiest_load_per_task, so there is no way you can make definitive statements about what will happen to the value "this_load_per_task" as a result of setting *imbalance to busiest_load_per_task. Load balancing is a probabilistic endeavour and we need to take steps that increase the probability that we get the desired result.

> I agree with you. But the previous code was more conservative and may
> slowly (just from a theory point of view... I don't have an example to
> show this..) balance towards the desired state. With this code, I feel
> we are more aggressive. For example, on a DP system: if I run one high
> priority and two low priority processes, they keep hopping from one
> processor to another... you may argue it is because of "top" or some
> other process... I agree that is the case.. But the same thing doesn't
> happen with the previous version.. I like the conservative approach...

>> Without this patch there is no chance that busiest_load_per_task will
>> get smaller.

> Is there an example for this?

Yes, we just take a slight variation of your scenario that prompted the first patch (to which this patch is a minor modification) by adding one normal priority task to each of the CPUs. This gives us a 2 CPU system with CPU-0 having 2 high priority tasks plus 1 normal priority task and CPU-1 having 2 normal priority tasks. Clearly, the desirable load balancing outcome would be for the two high priority tasks to be on different CPUs; otherwise we have a high priority task stuck on a run queue while a normal priority task is running on another (less heavily loaded) CPU.

In order to analyze what happens during load balancing, let's use W as the load weight for a normal task and suppose that the load weights of the two high priority tasks are (W + k) and that "this" == CPU-1 in find_busiest_group(). This will result in "busiest" == CPU-0 and:

this_load             = 2W
this_load_per_task    = W
max_load              = 3W + 2k
busiest_load_per_task = W + 2k/3
avg_load              = 5W/2 + k
max_pull              = W/2 + k
*imbalance            = W/2 + k
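
For concreteness, here's a small stand-alone sketch (not kernel code; the values W = 1000 and k = 300 are arbitrary, and I've assumed max_pull = min(max_load - avg_load, avg_load - this_load), which matches the figures above) that evaluates these quantities:

#include <stdio.h>

/* Illustrative values only: W is the weighted load of a normal task
 * and W + k that of a high priority task (0 < k < 3W/2). */
#define W 1000.0
#define K 300.0

static double min(double a, double b) { return a < b ? a : b; }

int main(void)
{
	/* "busiest" == CPU-0: two high priority tasks plus one normal task.
	 * "this"    == CPU-1: two normal tasks. */
	double max_load  = 2 * (W + K) + W;		/* 3W + 2k  */
	double this_load = 2 * W;			/* 2W       */
	double busiest_load_per_task = max_load / 3;	/* W + 2k/3 */
	double this_load_per_task    = this_load / 2;	/* W        */
	double avg_load  = (max_load + this_load) / 2;	/* 5W/2 + k */

	/* Assumed form of max_pull, consistent with the figures above. */
	double max_pull  = min(max_load - avg_load, avg_load - this_load);
	double imbalance = max_pull;			/* W/2 + k  */

	printf("this_load_per_task    = %g\n", this_load_per_task);
	printf("busiest_load_per_task = %g\n", busiest_load_per_task);
	printf("avg_load              = %g\n", avg_load);
	printf("*imbalance            = %g\n", imbalance);
	return 0;
}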

Whenever k < 3W/2 we have W/2 + k < W + 2k/3, i.e. *imbalance < busiest_load_per_task, and we end up in the small imbalance code.

(max_load - this_load) = W + 2k, which is greater than busiest_load_per_task = W + 2k/3, so we decide that we want to move some load from "busiest" to "this".

Without this patch we would set *imbalance to busiest_load_per_task, and the only task on "busiest" that has a weighted load less than or equal to this value is the normal task (the high priority tasks' weight, W + k, exceeds W + 2k/3), so this is the one that will be moved, resulting in:

this_load             = 3W
this_load_per_task    = W
max_load              = 2W + 2k
busiest_load_per_task = W + k

Even if you reverse the roles of "busiest" and "this", this will be considered balanced and the system will stabilize in this undesirable state. NB, as predicted, the average load per task on "this" hasn't changed and the average load per task on "busiest" has increased. We still have the situation where a high priority task is stuck on a run queue while a low priority task is running on another CPU -- we've failed :-(.

With this patch, *imbalance will be set to (W + 4k/3), which is bigger than the weighted load (W + k) of the high priority tasks, so one of them will be moved, resulting in:

this_load             = 3W + k
this_load_per_task    = W + k/3
max_load              = 2W + k
busiest_load_per_task = W + k/2
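
To see the two moves side by side, here's another stand-alone sketch (again not the kernel's code; the selection rule is the simplification assumed above, namely that the heaviest task on "busiest" whose weighted load does not exceed *imbalance is the one moved):

#include <stdio.h>

#define W 1000.0
#define K 300.0		/* any 0 < k < 3W/2 will do */

/* Simplified rule assumed in the discussion above: move the heaviest
 * task on "busiest" whose weighted load does not exceed *imbalance. */
static void move_and_report(const char *label, double imbalance)
{
	double high  = W + K;	/* weight of a high priority task */
	double moved = (high <= imbalance) ? high : W;

	double busiest_load = 2 * high + W - moved;	/* CPU-0 after move */
	double this_load    = 2 * W + moved;		/* CPU-1 after move */

	printf("%s (*imbalance = %g): moved the %s task\n",
	       label, imbalance, moved == high ? "high priority" : "normal");
	printf("  busiest_load_per_task = %g\n", busiest_load / 2);
	printf("  this_load_per_task    = %g\n", this_load / 3);
}

int main(void)
{
	double busiest_load_per_task = W + 2 * K / 3;
	double load_gap = (3 * W + 2 * K) - 2 * W;	/* max_load - this_load */

	move_and_report("without patch", busiest_load_per_task);
	move_and_report("with patch", (busiest_load_per_task + load_gap) / 2);
	return 0;
}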


Note that, without the patch, busiest_load_per_task cannot get smaller and whether this_load_per_task will get bigger is indeterminate. With this patch there IS a chance that busiest_load_per_task will decrease and an INCREASED chance that this_load_per_task will get bigger. Ergo we have increased the probability that the (absolute) difference between this_load_per_task and busiest_load_per_task will decrease. This is a desirable outcome.

> All I am saying is we are more aggressive.. I don't have any issue with
> the desired outcome..

We need to be more aggressive but not too aggressive, and I think this patch achieves the required balance.

NB busiest_load_per_task < *imbalance < (max_load - this_load) is true for this path through the code. To be precise, *imbalance will be halfway between busiest_load_per_task and (max_load - this_load).
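
In code terms, the idea amounts to something like the following sketch (the helper and its name are mine, not the actual patch text; it just expresses the midpoint rule stated above):

/*
 * Sketch only, not the actual patch: the small imbalance adjustment
 * described above, expressed as a helper. All arguments are the
 * weighted loads used by find_busiest_group().
 */
static unsigned long adjusted_imbalance(unsigned long imbalance,
					unsigned long busiest_load_per_task,
					unsigned long max_load,
					unsigned long this_load)
{
	if (imbalance < busiest_load_per_task &&
	    max_load - this_load > busiest_load_per_task)
		return (busiest_load_per_task + (max_load - this_load)) / 2;

	return imbalance;
}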

Peter
--
Peter Williams                                   [email protected]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce