Re: pluggable scheduler thread (was Re: Volanomark slows by 80% under CFS)

Andrea Arcangeli wrote:
> On Fri, Jul 27, 2007 at 11:43:23PM -0400, Chris Snook wrote:
>> I'm pretty sure the point of posting a patch that triples CFS performance on a certain benchmark and arguably improves the semantics of sched_yield was to improve CFS. You have a point, but it is a point for a different thread. I have taken the liberty of starting this thread for you.

> I've no real interest in starting or participating in flamewars
> (especially the ones not backed by hard numbers). So I adjusted the
> subject a bit in the hope the discussion will not degenerate as you
> predicted, hope you don't mind.

Not at all.  I clearly misread your tone.

> I'm pretty sure the point of posting that email was to show the
> remaining performance regression with the sched_yield fix applied
> too. Given you considered my post both off-topic and inflammatory, I
> guess you think it's possible and reasonably easy to fix that
> remaining regression without a pluggable scheduler, right? So please
> enlighten us on how you intend to achieve it.

There are four possibilities that are immediately obvious to me:

a) The remaining difference is due mostly to the algorithmic complexity of the rbtree-based design in CFS.

If this is the case, we should be able to vary the test parameters (CPU count, thread count, etc.), graph the results, and see a roughly logarithmic divergence between the schedulers as some parameter(s) vary. If this is the problem, we may be able to fix it with data structure tweaks or optimized base cases, much as quicksort can be optimized by using insertion sort below a certain threshold.
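To make the base-case idea concrete (this sketch has nothing to do with CFS's actual code; the function names and the cutoff constant are invented here), a hybrid quicksort that hands small partitions to insertion sort looks roughly like this:

#include <stddef.h>

/* Partitions at or below this size are finished with insertion sort.
 * The value is a guess; in practice you'd tune it by measurement. */
#define SMALL_SORT_CUTOFF 16

void insertion_sort(int *a, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		int key = a[i];
		size_t j = i;

		while (j > 0 && a[j - 1] > key) {
			a[j] = a[j - 1];
			j--;
		}
		a[j] = key;
	}
}

void hybrid_quicksort(int *a, size_t n)
{
	while (n > SMALL_SORT_CUTOFF) {
		int pivot = a[n - 1];	/* last element as pivot, for brevity */
		size_t i = 0, j;
		int tmp;

		for (j = 0; j + 1 < n; j++) {
			if (a[j] < pivot) {
				tmp = a[i]; a[i] = a[j]; a[j] = tmp;
				i++;
			}
		}
		tmp = a[i]; a[i] = a[n - 1]; a[n - 1] = tmp;

		/* Recurse into the smaller side and loop on the larger one,
		 * to keep stack depth logarithmic. */
		if (i < n - i - 1) {
			hybrid_quicksort(a, i);
			a += i + 1;
			n -= i + 1;
		} else {
			hybrid_quicksort(a + i + 1, n - i - 1);
			n = i;
		}
	}
	insertion_sort(a, n);
}

The point isn't the sort itself; it's that a structure with good asymptotic behavior can still lose on small inputs, and a cheap special case can buy back the constant factor.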

b) The remaining difference is due mostly to how the scheduler handles volanomark.

vmstat can give us a comparison of context switches between O(1), CFS, and CFS+patch. If the decrease in throughput correlates with an increase in context switches, we may be able to induce more O(1)-like behavior by charging tasks for context switch overhead.
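Just to sketch what "charging tasks for context switch overhead" could mean (this is a toy userspace model of the idea, not CFS code; the struct, the field names, and the penalty constant are all invented here):

#include <stdint.h>

struct toy_task {
	uint64_t vruntime_ns;	/* virtual runtime used to order the runqueue */
	uint64_t switches;	/* times this task has been switched out */
};

/* Assumed flat cost of one context switch; a real implementation would
 * measure or calibrate this rather than hard-coding it. */
#define SWITCH_CHARGE_NS 2000

/* Called when 'prev' is switched out after running for 'ran_ns'. */
void account_switch(struct toy_task *prev, uint64_t ran_ns)
{
	prev->vruntime_ns += ran_ns;		/* charge the time actually used */
	prev->vruntime_ns += SWITCH_CHARGE_NS;	/* plus a fee for the switch itself */
	prev->switches++;
}

A task that yields or blocks very frequently would accumulate charges faster than its wall-clock usage suggests, drift toward the back of the queue, and end up batched more the way O(1) batched it.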

c) The remaining difference is due mostly to how the scheduler handles something other than volanomark.

If context switch count is not the problem, context switch pattern still could be. I doubt we'd see a 40% difference due to cache misses, but it's possible. Fortunately, oprofile can sample based on cache misses, so we can debug this too.
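If we want a raw number for what a forced switch costs on a given box, cache effects included, a crude pipe ping-pong in the style of lmbench's lat_ctx works. It's no substitute for oprofile's cache-miss sampling, and the program and iteration count below are just my own throwaway sketch; on SMP you'd want to pin both processes to one CPU (e.g. with taskset -c 0) so each round trip really forces two context switches:

#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/wait.h>

#define ITERATIONS 100000

int main(void)
{
	int ping[2], pong[2];
	char byte = 0;
	struct timeval start, end;
	double usecs;

	if (pipe(ping) || pipe(pong)) {
		perror("pipe");
		return 1;
	}

	if (fork() == 0) {
		/* Child: echo every byte straight back to the parent. */
		for (int i = 0; i < ITERATIONS; i++) {
			if (read(ping[0], &byte, 1) != 1)
				break;
			write(pong[1], &byte, 1);
		}
		_exit(0);
	}

	gettimeofday(&start, NULL);
	for (int i = 0; i < ITERATIONS; i++) {
		write(ping[1], &byte, 1);
		if (read(pong[0], &byte, 1) != 1)
			break;
	}
	gettimeofday(&end, NULL);
	wait(NULL);

	usecs = (end.tv_sec - start.tv_sec) * 1e6 +
		(end.tv_usec - start.tv_usec);
	printf("%.2f us per round trip (two switches when pinned)\n",
	       usecs / ITERATIONS);
	return 0;
}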

d) The remaining difference is due mostly to some implementation detail in CFS.

It's possible there's some constant-factor overhead in CFS that is magnified heavily by the context switching volanomark deliberately induces. If this is the case, oprofile sampling on clock cycles should catch it.

Tim --

Since you're already set up to do this benchmarking, would you mind varying the parameters a bit and collecting vmstat data? If you want to run oprofile too, that wouldn't hurt.

> Also consider that the other numbers likely used nptl, so they shouldn't be
> affected by the sched_yield changes.

>> Sure there is. We can run a fully-functional POSIX OS without using any block devices at all. We cannot run a fully-functional POSIX OS without a scheduler. Any feature without which the OS cannot execute userspace code is sufficiently primitive that somewhere there is a device on which it will be impossible to debug if that feature fails to initialize. It is quite reasonable to insist on only having one implementation of such features in any given kernel build.

> Sounds like a red herring to me... There aren't just pluggable I/O
> schedulers in the kernel, there are pluggable packet schedulers too
> (see `tc qdisc`). And both are switchable at runtime (not just at boot
> time).
>
> Can you run your fully-functional POSIX OS without a packet scheduler
> and without an I/O scheduler? I wonder where you are going to
> read/write data without a HD and network?

If I'm missing both, I'm pretty screwed, but if either one is functional, I can send something out.

> Also, those pluggable things don't increase the risk of a crash much
> compared to the complexity of the schedulers.

>> Whether or not these alternatives belong in the source tree as config-time options is a political question, but preserving boot-time debugging capability is a perfectly reasonable technical motivation.

> The scheduler is invoked very late in the boot process (printk and
> serial console, kdb are working for ages before the scheduler kicks
> in), so it's fully debuggable (no debugger depends on the scheduler;
> they run inside the nmi handler...). I don't really see your point.

I'm more concerned about embedded systems. These are the same people who want userspace character drivers to control their custom hardware. Having the robot point to where it hurts is a lot more convenient than hooking up a JTAG debugger.

> And even if there were a subtle bug in the scheduler, you'd never
> trigger it at boot with so few tasks and so few context switches.

Sure, but it's the non-subtle bugs that worry me. These are usually related to low-level hardware setup, so they could miss the mainstream developers and clobber unsuspecting embedded developers.

I acknowledge that debugging such problems shouldn't be terribly hard on mainstream systems, but some people are going to want to choose a single scheduler at build time and avoid the hassle. If we can improve CFS to be regression-free (and I think we can, if we give ourselves a few percent tolerance and keep tracking down the corner cases), the pluggable scheduler infrastructure will just be another disused feature.

	-- Chris
