Re: Volanomark slows by 80% under CFS

Tim Chen wrote:
Ingo,

Volanomark slows by 80% with CFS scheduler on 2.6.23-rc1. Benchmark was run on a 2 socket Core2 machine.

The change in the scheduler's treatment of sched_yield could play a part in changing Volanomark's behavior. In CFS, sched_yield is implemented by dequeueing and requeueing the process. The time the process has spent running has probably reduced the CPU time due to it by only a bit, so the process could get requeued pretty close to the head of the queue and may get scheduled again pretty quickly if there is still a lot of CPU time due.

It may make sense to queue the yielding process a bit further back in the queue. As an experiment, I made a slight change by zeroing out wait_runtime (i.e. having the process give up the CPU time due to it). Let's put aside for a second the gripes that Volanomark should have used a better mechanism than sched_yield to coordinate threads. With the change, Volanomark runs better and is only 40% (instead of 80%) down from the old scheduler without CFS.

Of course we should not tune for Volanomark, and this is only reference data. What are your views on how CFS's sched_yield should behave?

Regards,
Tim

The primary purpose of sched_yield is for SCHED_FIFO realtime processes, where nothing else will run, ever, unless the running thread blocks or yields the CPU. Under CFS, the yielding process will still be leftmost in the rbtree; otherwise it would have already been scheduled out.

Zeroing out wait_runtime on sched_yield strikes me as completely appropriate. If the process wanted to sleep a finite duration, it should actually call a sleep function, but sched_yield is essentially saying "I don't have anything else to do right now", so it's hardly fair to claim you've been waiting for your chance when you just gave it up.
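
To make the difference concrete, here is a toy model of the two yield behaviours being discussed. It is only a sketch, not kernel code: struct toy_task, toy_pick_next, toy_yield_requeue and toy_yield_zero are hypothetical names, the units are arbitrary, and a linear scan stands in for the rbtree. The only idea borrowed from the thread is that the task with the most CPU time due runs next.

```c
#include <assert.h>
#include <stddef.h>

/* Toy task: wait_runtime stands in for "CPU time still due" (arbitrary units). */
struct toy_task {
    const char *name;
    long wait_runtime;
};

/* Pick the task with the most CPU time due; ties go to the lowest index. */
static size_t toy_pick_next(const struct toy_task *tasks, size_t n)
{
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (tasks[i].wait_runtime > tasks[best].wait_runtime)
            best = i;
    }
    return best;
}

/* Yield modelled as dequeue + requeue: only the time just run is charged. */
static void toy_yield_requeue(struct toy_task *t, long ran)
{
    t->wait_runtime -= ran;
}

/* Yield modelled with the experimental change: give up all CPU time due. */
static void toy_yield_zero(struct toy_task *t)
{
    t->wait_runtime = 0;
}
```

With tasks holding 50, 10 and 5 units, a yielder that ran for 1 unit and is merely requeued still has 49 units and is picked again straight away. After toy_yield_zero it drops behind the task holding 10 units, which is the behaviour the experiment was after.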

As for the remaining 40% degradation, if Volanomark is using sched_yield for synchronization, the scheduler is probably cycling through threads until it gets to the one that actually wants to do work. The O(1) scheduler will do this very quickly, whereas CFS has a bit more overhead. Interactivity boosting may also have helped the old scheduler find the right thread faster.

I think Volanomark is being pretty stupid, and deserves to run slowly, but there are legitimate reasons to want to call sched_yield in a non-SCHED_FIFO process. If I'm performing multiple different calculations on the same set of data in multiple threads, and accessing the shared data in a linear fashion, I'd like to be able to have one thread give the other some CPU time so they can stay at the same point in the stream and improve cache hit rates. That is only an optimization if I can do it without wasting CPU or gradually nicing myself into oblivion. Having sched_yield zero out wait_runtime seems like an appropriate way to make this use case work to the extent possible. Any user attempting such an optimization should have the good sense to do real work between sched_yield calls, to avoid calling the scheduler in a tight loop.
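
A rough user-space sketch of what "real work between sched_yield calls" could look like, assuming a single pass over shared data. The function name sum_with_yield, the chunk parameter and the yield counter are made up for illustration; only sched_yield() from <sched.h> is a real call. A real program would run one such loop per thread and would have to pick the chunk size by measurement.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Sum data[0..n) and offer the CPU to other threads after every
 * `chunk` elements, so each yield is separated by real work.
 * The number of yields issued is stored in *yields if it is non-NULL.
 */
static long sum_with_yield(const int *data, size_t n, size_t chunk,
                           size_t *yields)
{
    assert(chunk > 0);
    long total = 0;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        total += data[i];
        /* Yield at chunk boundaries, but not after the final element. */
        if ((i + 1) % chunk == 0 && i + 1 < n) {
            sched_yield();
            count++;
        }
    }
    if (yields)
        *yields = count;
    return total;
}
```

The larger the chunk, the less often the scheduler is entered. A chunk of 1 degenerates into the tight yield loop warned about above.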

	-- Chris