On Fri, Nov 16, 2007 at 11:48:50AM +0100, Dmitry Adamushko wrote:
> could you try to change either :
>
> cat /proc/sys/kernel/sched_stat_granularity
>
> put it to the value equal to a tick on your system
This didn't seem to have any effect.
> or just remove bit #3 (which is responsible for 8 == 1000) here:
>
> cat /proc/sys/kernel/sched_features
>
> (this one is enabled by default in 2.6.23.1)
Aha. Turning off bit 3 appears to instantly fix my problem while it's
occurring in an existing process, and I can't reproduce it on any new
processes afterward.
> anyway, when it comes to calculating rq->cpu_load[], a nice(0) cpu-hog
> task (on cpu_0) may generate a similar load (contribute to
> rq->cpu_load[]) as e.g. some negatively reniced task (on cpu_1) which
> runs only periodically (say, once in a tick for N ms., etc.) [*]
>
> The thing is that the higher a prio of the task, the bigger 'weight'
> it has (prio_to_wait[] table in sched.c) ... and roughly, the load it
> generates is not only 'proportional' to 'run-time per fixed interval
> of time' but also to its 'weight'. That's why the [*] above.
Right. I gathered from reading the scheduler source earlier that the
load average is intended to be proportional to the priority of the
task, but I was really confused by the fairly nondeterministic effect
on the cpu_load average that my test process is having.
> so you may have a situation :
>
> cpu_0 : e.g. a nice(-20) task running periodically every tick and
> generating, say ~10% cpu load ;
Part of the problem may be that my high-priority task can run much
more often than every tick. In my test case and in the VMware code
that I originally observed the problem in, the thread can wake up
based on /dev/rtc or on a device IRQ. Either of these can happen much
more frequently than the scheduler tick, if I understand correctly.
> cpu_1 : 2-3 nice(0) cpu-hog tasks ;
>
> both cpus may be seen with similar rq->load_cpu[]...
When I try this, cpu0 has a cpu_load[] of over 10000 and cpu1 has a
load of 2048 or so.
> yeah, one would
> argue that one of the cpu hogs could be migrated to cpu_0 and consume
> remaining 'time slots' and it would not "disturb" the nice(-20) task
> as :
> it's able to preempt the lower prio task whenever it want (provided,
> fine-grained kernel preemption) and we don't care that much of
> trashing of caches here.
Yes, that's the behaviour I expected to see (and what my application
would prefer).
> btw., without the precise load balancing, there can be situations when
> the nice(-20) (or say, a RT periodic task) can be even not seen (i.e.
> don't contribute to cpu_load[]) on cpu_0...
> we do sampling every tick (sched.c :: update_cpu_load()) and consider
> this_rq->ls.load.weight at this particular moment (that is the sum of
> 'weights' for all runnable tasks on this rq)... and it may well be
> that the aforementioned high-priority task is just never (or likely,
> rarely) runnable at this particular moment (it runs for short interval
> of time in between ticks).
Indeed. I think this is the major contributor to the nondeterminism
I'm seeing.
Thanks much,
--Micah
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]