Re: 2.6.16-mm1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



john stultz <[email protected]> wrote:
>
> On Thu, 2006-03-23 at 16:04 -0800, Andrew Morton wrote:
> > "R. J. Wysocki" <[email protected]> wrote:
> > >
> > > On Thursday 23 March 2006 10:40, Andrew Morton wrote:
> > > > 
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16/2.6.16-mm1/
> > > 
> > > On a uniprocessor AMD64 w/ CONFIG_SMP unset (2.6.16-rc6-mm2 works on this box
> > > just fine, .config attached):
> > 
> > hm, uniproc x86_64 seems to cause problems sometimes.  I should test it more.
> > 
> > > }-- snip --{
> > > PID hash table entries: 4096 (order: 12, 32768 bytes)
> > > time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
> > > time.c: Detected 1795.400 MHz processor.
> > > disabling early console
> > > Console: colour dummy device 80x25
> > > time.c: Lost 103 timer tick(s)! rip 10:start_kernel+0x121/0x220
> > > last cli 0x0
> > > last cli caller 0x0
> > > time.c: Lost 3 timer tick(s)! rip 10:__do_softirq+0x44/0xc0
> > > last cli 0x0
> > > last cli caller 0x0
> > > time.c: Lost 3 timer tick(s)! rip 10:__do_softirq+0x44/0xc0
> 
> It seems report_lost_ticks has been set to one w/ 2.6.16-mm1, thus these
> debug messages will be shown.
> 
> Rafael: To properly compare, could you boot 2.6.16-rc6-mm2 w/ the
> "report_lost_ticks" boot option and see if the same sort of messages do
> not appear?

<looks>

OK, we have an andi-easter-egg:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16/2.6.16-mm1/broken-out/x86_64-mm-lost-cli-debug.patch.

So you're right, it's not the time patches.  I just blame you whenever anything
time-related goes wrong - who else is there? ;)

The above patch records the most-recent caller of local_irq_disable() in a
global variable, then prints that out in the lost-ticks handler.  But how
do we know that the global didn't get overwritten between the most-recent
local_irq_enable() and the call to handle_lost_ticks()?

I guess the code assumes that the local_irq_enable() will result in
insta-entry into the timer IRQ handler.  Which is probably good enough, as
interrupts from other sources won't be pending most times.

So why did we lose three ticks after __do_sortirq()'s local_irq_disable()? 
Dunno.

I think a better debugging patch would be:

u64 timestamp[NR_CPUS];

local_irq_disable()
{
	cli;
	timestamp[me] = rdtslcc();
}

local_irq_enable()
{
	if (rdtscll() - timestamp[me] > enough)
		whine();
	sti;
}

(Is there any point in do_softirq() doing local_irq_save() instead of
local_irq_disable()?  __do_softirq() will unconditionally enable anyway..)


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux