Re: [ANNOUNCE][RFC] PlugSched-6.3.1 for 2.6.16-rc5

Al Boldi wrote:

Peter Williams wrote:

Al Boldi wrote:

This is especially visible in spa_no_frills. spa_ws recovers from
this lockup somewhat, and the problem shows up there as a choking
behavior instead.

Running '# top d.1 (then shift T)' on another vt shows this choking
behavior as the proc gets boosted.

But anyway, based on the evidence, I think the problem is caused by the
fact that the eatm tasks are running to completion in less than one time
slice without sleeping and this means that they never have their
priorities reassessed.

Yes.

The reason that spa_ebs doesn't demonstrate the
problem is that it uses a smaller time slice for the first time slice
that a task gets. The reason that it does this is that it gives newly
forked processes a fairly high priority and if they're left to run for a
full 120 msecs at that high priority they can hose the system.  Having a
shorter first time slice gives the scheduler a chance to reassess the
task's priority before it does much damage.
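
A toy model of the effect described above, not the SPA code. The function
name, the 30 msec first slice and the 100 msec burst are assumptions made up
for illustration; only the 120 msec full slice comes from this thread. It
counts how many times a slice expires (and so how many chances the scheduler
gets to reassess priority) before a task that never sleeps runs to completion.

```python
def reassessments(burst_ms, slice_ms, first_slice_ms=None):
    """Count slice expiries before a non-sleeping task finishes.

    Each expiry stands in for one chance to reassess the task's priority.
    first_slice_ms is a hypothetical shorter slice used only for the first
    slice a task gets; it defaults to the normal slice.
    """
    if first_slice_ms is None:
        first_slice_ms = slice_ms
    remaining = burst_ms
    current = first_slice_ms
    count = 0
    while remaining > current:
        remaining -= current   # slice used up, task is still runnable
        count += 1             # scheduler gets to look at the task again
        current = slice_ms     # later slices are full length
    return count


# A 100 msec task with a 120 msec slice is never looked at again.
print(reassessments(100, 120))
# With an assumed 30 msec first slice it is reassessed once.
print(reassessments(100, 120, first_slice_ms=30))
```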

But how does this explain spa_no_frills setting promotion to max not having
this problem?

I'm still puzzled by this.  The only thing I can think of is that the
promotion mechanism is too simple in that it just moves all promotable
tasks up one slot without regard for how long they've been on the queue.
Doing this was a deliberate decision based on the desire to minimize
overhead and the belief that it wouldn't matter in the grand scheme of
things.  I may do some experimenting with a slightly more sophisticated
version.

Properly done, promotion should hardly ever occur but the cost would be
slightly more complex enqueue/dequeue operations.  The current version
will do unnecessary promotions but it was felt this was more than
compensated for by the lower enqueue/dequeue costs.  We'll see how a
more sophisticated version goes in terms of trade offs.
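
To make the trade-off concrete, here is a minimal sketch of the two
approaches. It is one possible reading of "more sophisticated", not the
PlugSched implementation: the function names, the (name, queued_at) tuples
and the min_wait threshold are all hypothetical. The second variant needs a
timestamp recorded at enqueue time, which is the extra enqueue/dequeue cost
mentioned above.

```python
from collections import deque


def promote_all(slots):
    """Simple scheme: every queued task moves up one slot (towards index 0),
    no matter how long it has been waiting."""
    for prio in range(1, len(slots)):
        slots[prio - 1].extend(slots[prio])
        slots[prio].clear()


def promote_aged(slots, now, min_wait):
    """Assumed alternative: only tasks that have waited at least min_wait
    ticks in their current slot move up; the rest stay where they are."""
    for prio in range(1, len(slots)):
        stay = deque()
        for name, queued_at in slots[prio]:
            if now - queued_at >= min_wait:
                # Promoted tasks restart their wait in the new slot.
                slots[prio - 1].append((name, now))
            else:
                stay.append((name, queued_at))
        slots[prio] = stay


def make_slots():
    # Slot 0 is the best priority; "b" was queued much more recently than "a".
    return [deque(), deque([("a", 0)]), deque([("b", 9)])]


simple = make_slots()
promote_all(simple)                      # both "a" and "b" move up one slot

aged = make_slots()
promote_aged(aged, now=10, min_wait=5)   # only "a" has waited long enough
```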

The reason that the other schedulers don't have this strategy is that I
didn't think that it was necessary.  Obviously I was wrong and should
extend it to the other schedulers.  It's doubtful whether this will help
a great deal with spa_no_frills as it is pure round robin and doesn't
reassess priorities except when nice changes or the task changes
policies.

Would it hurt to add it to spa_no_frills and let the children inherit it?

That would be the plan :-)

This is one good reason not to use spa_no_frills on
production systems.

spa_ebs is great, but rather bursty.  Even setting max_ia_bonus=0 doesn't fix
that.  Is there a way to smooth it like spa_no_frills?

The principal determinant would be the smoothness of the yardstick.
This is supposed to represent the task with the highest (recent) CPU
usage rate per share and is used to determine how fairly CPU is being
distributed among the currently active tasks.  Tasks are given a
priority based on how their CPU usage rate per share compares to this
yardstick.  This means that as the system load and/or type of task
running changes the priorities of the tasks can change dramatically.
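
A rough illustration of why a moving yardstick makes priorities jump. This
is not the spa_ebs code: the function name, the 40-slot range, the linear
mapping and the usage figures are assumptions chosen only to show the
effect. Each task is ranked by its usage per share relative to the largest
usage per share currently present.

```python
def ebs_priorities(tasks, num_slots=40):
    """tasks maps name -> (recent_cpu_usage, shares).

    Returns name -> slot, where 0 is the best priority. The task with the
    highest usage per share defines the yardstick and lands in the worst
    slot; the linear mapping is an assumption for illustration.
    """
    per_share = {name: usage / shares for name, (usage, shares) in tasks.items()}
    yardstick = max(per_share.values())
    if yardstick == 0:
        return {name: 0 for name in tasks}
    return {
        name: int((num_slots - 1) * rate / yardstick)
        for name, rate in per_share.items()
    }


# Two light tasks on their own ...
before = ebs_priorities({"a": (20, 1), "b": (10, 1)})
# ... and the same two tasks after a CPU hog arrives and moves the yardstick.
after = ebs_priorities({"a": (20, 1), "b": (10, 1), "hog": (80, 1)})
print(before, after)
```

In this model neither "a" nor "b" changed its behaviour, yet both move a
long way up the priority range as soon as the yardstick changes.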

Is the burstiness that you're seeing just in the observed priorities or
is it associated with behavioural burstiness as well?

Perhaps you should consider creating a child
scheduler on top of it that meets your needs?

Perhaps.

Good.  I've been hoping that other interested parties might be
encouraged by the small interface to SPA children to try different ideas
for scheduling.

Anyway, an alternative (and safer) way to reduce the effects of this
problem (while you're waiting for me to do the above change) is to reduce
the size of the time slice.  The only bad effect of doing this is that
you'll do slightly worse (less than 1%) on kernbench.

Actually, setting timeslice to 5, 50, or 100 gives me better performance on
kernbench.  After closer inspection, I found 120ms a rather awkward
timeslice, whereas 5, 50, and 100 exhibited a smoother and faster response.
This may be machine dependent, hence the need for an autotuner.

When I had the SPA schedulers fully instrumented I did some long term
measurements of my work station and found that the average CPU burst for
all tasks was only a few msecs.  The exceptions were some of the tasks
involved in building kernels.  So the only bad effects of reducing the
time slice will be causing those tasks to have more context switches
than otherwise and this will slightly reduce their throughput.

One thing that could be played with here is to vary the time slice based
on the priority.  This would be in the opposite direction to the normal
scheduler with higher priority tasks (i.e. those with lower prio values)
getting smaller time slices.  The rationale being:

1. stop tasks that have been given large bonuses from shutting out other
tasks for too long, and

2. reduce the context switch rate for tasks that haven't received bonuses.

Because tasks that get large bonuses will have short CPU bursts they
should not be adversely affected (if this is done properly) as they will
(except in exceptional circumstances such as a change in behaviour)
surrender the CPU voluntarily before their reduced time slice has
expired.  Imaginative use of the available statistics could make this
largely automatic but there would be a need to be aware that the
statistics can be distorted by the shorter time slices.

On the other hand, giving tasks without bonuses longer time slices
shouldn't adversely affect interactive performance as the interactive
tasks will (courtesy of their bonuses) preempt them.
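
One way the priority-dependent time slice idea could look, as a sketch
only. Nothing like this exists in the posted patch: the function name, the
0..39 prio range and the linear interpolation are assumptions, and the 5 and
120 msec endpoints are simply borrowed from the values mentioned earlier in
this thread.

```python
def time_slice_ms(prio, best_prio=0, worst_prio=39, min_slice=5, max_slice=120):
    """Return a time slice that grows as priority gets worse.

    Lower prio values mean higher priority, so tasks holding large bonuses
    get short slices and tasks without bonuses get long ones. All defaults
    are illustrative assumptions.
    """
    prio = min(max(prio, best_prio), worst_prio)   # clamp to the valid range
    span = worst_prio - best_prio
    return min_slice + (max_slice - min_slice) * (prio - best_prio) // span


for prio in (0, 20, 39):
    print(prio, time_slice_ms(prio))
```

A real version would presumably also have to account for the distortion of
the statistics by the shorter slices, which this sketch ignores.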

Peter
--
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce