Re: Interesting interaction between lguest and CFS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Matt Mackall <[email protected]> wrote:

> On Tue, Jun 05, 2007 at 09:50:15PM +0200, Ingo Molnar wrote:
> > Matt, could you run this for 1-2 minutes and send us the sched_debug.txt 
> > output?
> 
> http://selenic.com/sched_debug.txt.gz

thanks. It shows the anomaly in action:

  now at 365294215491977 nsecs

  .jiffies               : 91248553
  .next_balance          : 0
  .curr->pid             : 18589
  .clock                 : 125652924079659272
  .prev_clock_raw        : 365201238127457
  .clock_warps           : 9
  .clock_unstable_events : 61896358
  .clock_max_delta       : 3997813

next one is:

  now at 365295219388142 nsecs

  .jiffies               : 91248804
  .next_balance          : 0
  .curr->pid             : 18591
  .clock                 : 125653018059166371
  .prev_clock_raw        : 365295217642619
  .clock_warps           : 9
  .clock_unstable_events : 61896359
  .clock_max_delta       : 92976502936

251 jiffies passed, at 250 Hz that's 1 second - this proves that the 
sample is indeed an accurate once-per-second sample according to the 
timer interrupt. The 'now' timestamp (ktime_get() based) shows 
1003896165 nanosecs passed - this too is showing a precise 1 second 
sample, according to GTOD.

So all the time references we have show that (no surprise here) 1 second 
passed between the two samples. But sched_clock() shows a _large_ jump:

  .clock                 : 125652924079659272
  .clock                 : 125653018059166371

also reflected in .clock_max_delta:

  .clock_max_delta       : 92976502936

that's a 93 seconds jump (!) in a single 1-second sample. We also had a 
single sched-clock-unstable event:

  .clock_unstable_events : 61896358
  .clock_unstable_events : 61896359

could you please try a test-boot with 'notsc' - do the scheduling 
weirdnesses go away? Also, 

There are two reasons why the sched_clock() in -mm could behave like 
that - either the sched-clock-share patches in it are buggy and we do 
not smoothly switch over from sc->unstable == 1 to sc->unstable == 0, or 
the TSC itself is unstable. To test the latter theory, could you run a 1 
minute tsc-dump on your box:

	./tsc-dump > tsc-dump.txt

You can pick tsc-dump up from:

	http://redhat.com/~mingo/time-warp-test/

(please run this on a recent -mm kernel so that we have the same 
ACPI-idle characteristics as on the buggy kernel.)

to test the former theory, could you boot with 'notsc' - do the 
weirdnesses go away? (please create another sched-debug.txt as well)

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux