* Fernando Lopez-Lezcano <[email protected]> wrote:
> Here's one with rc5-rt3:
>
> Oct 21 15:01:46 cmn3 kernel: BUG: ktimer expired short without user
> signal! (hald-addon-stor:4309)
and no "BUG: foo:1234 waking up bar:4321, expiring ktimer short" message
prior to that? Very weird, this line:
> Oct 21 15:01:46 cmn3 kernel: .. expires: 1012/751245500
> Oct 21 15:01:46 cmn3 kernel: .. expired: 1012/750908115
> Oct 21 15:01:46 cmn3 kernel: .. at line: 942
suggests that the ktimer was expired by ktimer_try_to_cancel() /
ktimer_cancel(), in ktimer_schedule(). I.e. something must have woken
the task early. Probably this theory of mine is incorrect then. I'll try
extend the debug info a bit: it would be interesting to see a 'timer
inserted at' timestamp as well (was it shortly before the problem
happened?), and a 'which PID cancelled the timer' info.
a heavy-hitting but complex-to-set-up solution would be to add a serial
console, and to enable WAKEUP_TIMING+LATENCY_TRACING in the .config, and
to edit kernel/latency.c to initialize the default value of the
following variables:
int wakeup_timing = 0;
int trace_all_cpus = 1;
int trace_freerunning = 1;
int trace_print_at_crash = 1;
int trace_user_triggered = 1;
these variables are in the top portion of latency.c. Important: if you
try this then you should probably also enable IGNORE_PRINTK_LOGLEVEL,
which will improve mass-output to the serial console. Another important
thing is to add a stop_trace() call to kernel/ktimers.c's
check_ktimer_signal() function:
unlock_ktimer_base(timer, &flags);
stop_trace();
printk("BUG: ktimer expired short without user signal! (%s:%d)\n",
current->comm, current->pid);
(otherwise all the trace output you'd be getting would be boring printk
related trace entries.)
this will cause the dump_stack() to also output thousands of trace
entries - all the kernel activity (from all CPUs) that preceded the
ktimer problem. Hopefully this pinpoints the bug.
> In both cases the machine goes catatonic, I don't know if right after
> this or not. It responds to the SysRQ key but that's pretty much it, I
> should probably try to get a serial console going somehow.
would it be easy for you to try the UP kernel? One possibility is that
this is some sort of SMP/APIC-timer related problem.
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]