Re: RT task scheduling — Linux Kernel

Darren Hart wrote:

My last mail specifically addresses preempt-rt, but I'd like to know people's thoughts regarding this issue in the mainline kernel. Please see my previous post "realtime-preempt scheduling - rt_overload behavior" for a testcase that produces unpredictable scheduling results.
Part of the issue here is to define what we consider "correct behavior" for SCHED_FIFO realtime tasks. Do we (A) need to strive for "strict realtime priority scheduling" where the NR_CPUS highest priority runnable SCHED_FIFO tasks are _always_ running? Or do we (B) take the best effort approach with an upper limit on RT priority imbalances, where an imbalance may occur (say at wakeup or exit) but will be remedied within 1 tick? The smpnice patches improve load balancing, but don't provide (A).
More details in the previous mail...
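
To make option (A) concrete, here is a minimal sketch of a check for the "strict" property on a snapshot of runnable tasks. Everything in it is hypothetical (struct toy_task, strict_rt_invariant_holds(), the snapshot model itself); it is not kernel code. It assumes a larger rt_prio value means higher priority, as with the user-visible SCHED_FIFO priorities.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of one runnable SCHED_FIFO task (not a kernel type). */
struct toy_task {
    int rt_prio;   /* assumption: larger value = higher priority */
    bool running;  /* true if the task currently occupies a CPU */
};

/*
 * Returns true if the snapshot satisfies option (A): as many tasks as
 * possible are running, and no waiting task outranks a running one.
 */
static bool strict_rt_invariant_holds(const struct toy_task *t, size_t n,
                                      size_t nr_cpus)
{
    size_t running = 0;
    int lowest_running = 0, highest_waiting = 0;
    bool any_running = false, any_waiting = false;

    for (size_t i = 0; i < n; i++) {
        if (t[i].running) {
            running++;
            if (!any_running || t[i].rt_prio < lowest_running)
                lowest_running = t[i].rt_prio;
            any_running = true;
        } else {
            if (!any_waiting || t[i].rt_prio > highest_waiting)
                highest_waiting = t[i].rt_prio;
            any_waiting = true;
        }
    }

    /* Every CPU must be in use while runnable tasks remain. */
    if (running != (n < nr_cpus ? n : nr_cpus))
        return false;

    /* No waiting task may have a higher priority than a running one. */
    return !(any_running && any_waiting && highest_waiting > lowest_running);
}
```

Under option (B) the same check would be allowed to fail briefly, for example right after a wakeup, provided it holds again within one tick.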

I'm currently researching some ideas to improve smpnice that may help in this situation. The basic idea is that as well as trying to equally distribute the weighted load among the groups/queues we should also try to achieve equal "average load per task" for each group/queue. (As well as helping with problems such as yours, this will help to restore the "equal distribution of nr_running" amongst groups/queues aim that is implicit without smpnice due to the fact that load is just a smoothed version of nr_running.)
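
As a rough illustration of the two aims, the sketch below models a queue by its weighted load and its nr_running and computes the "average load per task". The struct toy_rq type and the avg_load_per_task() helper are made up for this example and are not the kernel's data structures; an empty queue is assumed to have an average of 0.

```c
/* Hypothetical toy model of a run queue; not the kernel's struct. */
struct toy_rq {
    unsigned long weighted_load; /* sum of the load weights of runnable tasks */
    unsigned long nr_running;    /* number of runnable tasks */
};

/* Average load per task; an empty queue is defined here to have average 0. */
static unsigned long avg_load_per_task(const struct toy_rq *rq)
{
    return rq->nr_running ? rq->weighted_load / rq->nr_running : 0;
}
```

For instance, a queue holding one task of weight 4 and a queue holding four tasks of weight 1 have the same weighted load (4), so they look balanced by the first aim, yet their averages are 4 and 1 and their nr_running values are 1 and 4, so the second aim is not met.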

In find_busiest_group(), I think that load balancing in the case where *imbalance is greater than busiest_load_per_task will tend towards this result and also when *imbalance is less than busiest_load_per_task AND busiest_load_per_task is less than this_load_per_task. However, in the case where *imbalance is less than busiest_load_per_task AND busiest_load_per_task is greater than this_load_per_task this will not be the case as the amount of load moved from "busiest" to "this" will be less than or equal to busiest_load_per_task and this will actually increase the value of busiest_load_per_task. So, although it will achieve the aim of equally distributing the weighted load, it won't help the second aim of equal "average load per task" values for groups/queues.
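
The three cases above can be written down as a small classifier. This is only a sketch of the reasoning, not code from find_busiest_group(); the enum, classify() and avg_after_move() are hypothetical names, and the boundary cases (equal values) that the description does not cover are reported separately rather than guessed.

```c
/* Outcome of a balancing move with respect to the "average load per task" aim. */
enum avg_trend {
    TOWARDS_EQUAL_AVG,     /* balancing also tends to equalise load per task */
    NOT_TOWARDS_EQUAL_AVG, /* weighted load evens out, load per task does not */
    NOT_COVERED            /* boundary cases the description does not address */
};

static enum avg_trend classify(unsigned long imbalance,
                               unsigned long busiest_load_per_task,
                               unsigned long this_load_per_task)
{
    if (imbalance > busiest_load_per_task)
        return TOWARDS_EQUAL_AVG;
    if (imbalance < busiest_load_per_task) {
        if (busiest_load_per_task < this_load_per_task)
            return TOWARDS_EQUAL_AVG;
        if (busiest_load_per_task > this_load_per_task)
            return NOT_TOWARDS_EQUAL_AVG;
    }
    return NOT_COVERED;
}

/*
 * Average load per task left on a queue after one task of weight w leaves.
 * The caller must ensure nr_running > 1 and w <= load.
 */
static unsigned long avg_after_move(unsigned long load,
                                    unsigned long nr_running,
                                    unsigned long w)
{
    return (load - w) / (nr_running - 1);
}
```

The problem case shows up with, say, three tasks of weight 1, 1 and 7 on "busiest": the load is 9 and the average is 3. Moving one of the weight 1 tasks (less than the average) leaves a load of 8 over 2 tasks, so the average on "busiest" rises from 3 to 4.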

The obvious way to fix this problem is to alter the code so that more than busiest_load_per_task is moved from "busiest" to "this" in these cases while at the same time ensuring that the imbalance between their loads doesn't get any bigger. I'm working on a patch along these lines.
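
One possible reading of that condition, as a sketch only: this is not the patch, and move_is_acceptable() is a hypothetical helper. It assumes "imbalance" here means the absolute difference between the two weighted loads and that a single task of weight w is moved. Moving w changes the gap from gap to |gap - 2w|, which is no larger than gap exactly when w <= gap.

```c
#include <stdbool.h>

/*
 * Returns true if moving one task of weight w from "busiest" to "this"
 * (a) moves more than busiest_load_per_task and
 * (b) does not enlarge the gap between the two weighted loads.
 */
static bool move_is_acceptable(unsigned long busiest_load,
                               unsigned long this_load,
                               unsigned long busiest_load_per_task,
                               unsigned long w)
{
    unsigned long gap;

    if (busiest_load <= this_load)
        return false; /* nothing to pull: "busiest" is not actually busier */

    gap = busiest_load - this_load;

    /* After the move the gap is |gap - 2w|; that is <= gap iff w <= gap. */
    return w > busiest_load_per_task && w <= gap;
}
```

With loads of 9 and 1 and a busiest_load_per_task of 3, moving a task of weight 7 passes (the gap goes from 8 to 6), while with loads of 9 and 5 the same move is rejected because the gap would grow from 4 to 10.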

Changes to find_idlest_group() and try_to_wake_up() taking into account the "average load per task" on the candidate queues/groups as well as their weighted loads may also help and I'll be looking at them as well. It's not immediately obvious to me how this can be done so any ideas would be welcome. It will likely involve taking the load weight of the waking task into account as well.


Peter
--
Peter Williams                                   [email protected]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce