Re: preempt-rt, NUMA and strange latency traces

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2006-02-08 at 04:41 -0500, Steven Rostedt wrote:
> On Tue, 7 Feb 2006, Sébastien Dugué wrote:
> 
> >   Hi,
> >
> >   I've been experimenting with 2.6.15-rt16 on a dual 2.8GHz Xeon box
> > with quite good results and decided to make a run on a NUMA dual node
> > IBM x440 (8 1.4GHz Xeon, 28GB ram).
> >
> >   However, the kernel crashes early when creating the slabs. Does the
> > current preempt-rt patchset supports NUMA machines or has support
> > been disabled until things settle down?
> 
> Yeah, currently the -rt patch doesn't work well with NUMA.

  OK, I'll have to wait for that I guess.

> 
> >
> >   Going on, I compiled a non NUMA RT kernel which booted just fine,
> > but when examining the latency traces, I came upon strange jumps
> > in the latencies such as:
> >
> >
> >    <...>-6459  2D.h1   42us : rcu_pending (update_process_times)
> >    <...>-6459  2D.h1   42us : scheduler_tick (update_process_times)
> >    <...>-6459  2D.h1   43us : sched_clock (scheduler_tick)
> >    <...>-6459  2D.h1   44us!: _raw_spin_lock (scheduler_tick)
> >    <...>-6459  2D.h2 28806us : _raw_spin_unlock (scheduler_tick)
> >    <...>-6459  2D.h1 28806us : rebalance_tick (scheduler_tick)
> >    <...>-6459  2D.h1 28807us : irq_exit (smp_apic_timer_interrupt)
> >    <...>-6459  2D..1 28808us < (608)
> >    <...>-6459  2D..1 28809us : smp_apic_timer_interrupt (c03e2a02 0 0)
> >    <...>-6459  2D.h1 28810us : handle_nextevent_update (smp_apic_timer_interrupt)
> >    <...>-6459  2D.h1 28810us : hrtimer_interrupt (handle_nextevent_update)
> 
> Hmm, are the TSC of the CPUS in sync?  If not, you will get bad
> messurements of the latency tracer.  That is currently why we can't use
> tracing with x86_64 x2 CPUS.
> 
> >
> >   There does not seem to be a precise code path leading to those jumps,
> > it seems
> > they can appear anywhere. Furthermore the jump seems to always be of ~ 27 ms
> >
> >   I tried running on only 1 CPU, tried using the TSC instead of the cyclone
> > timer but to no avail, the phenomenon is still there.
> 
> OK, this scares me.  You get these with only one CPU?  Is it still
> compiled for SMP?  If not, see if the latencies go away if you turn off
> SMP.

  Yes the kernel is still compiled for SMP. An UP compiled kernel did
not boot. I will try to fix it and go for it again.

> 
>  >
> >   My test program only consists in a thread running at max RT priority doing
> > a nanosleep().
> >
> >   What could be going on here?
> 
> Good question.  Could you send your .config

  The more I think about it, the more I tend to believe it's hardware 
related. It seems as if the CPU just hangs for ~27 ms before
resuming processing.

  Anyway, here is my .config. Maybe I've got something misconfigured
somewhere after all.

  Thanks for replying.



Attachment: config-2.6.15-rt16-no-numa.gz
Description: GNU Zip compressed data


[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux