Re: [patch] Real-Time Preemption, -RT-2.6.12-rc6-V0.7.48-00

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* William Weston <[email protected]> wrote:

> OK.  Running on -50-25 now.  The burnP6 starvation doesn't seem to 
> affect the whole system, but comes close enough to require the reset 
> button every time.  I usually, but not always, lose network, X, the 
> keyboard, mouse, and serial console.  I'm still unable to get any sort 
> of a trace from these lockups, since it's looking more like a bunch of 
> processes starving than a kernel crash or a full lockup.
> 
> Once, with VLC (viewing a 5mbit/s mcast/udp stream) and two burnP6 
> instances running, I was able to fire up top on the serial console and 
> found out that the IRQ thread for the ns83820 nic was using 99% of one 
> CPU.

aha! that's an important clue. It seems you've got a screaming interrupt 
or some other loop in ns83820 irq handling. Could you do this:

	chrt -o 0 -p `pidof 'IRQ 18'`

(assuming your ns83820 device is still on IRQ18) To check the command 
was effective, enter the following command:

	chrt -o -p `pidof 'IRQ 18'`

and you should see output like:

	pid 748's current scheduling policy: SCHED_OTHER
	pid 748's current scheduling priority: 0

i.e. the thread is not SCHED_FIFO anymore.

this will not fix the ns83820 problem for you, but will make it more 
debuggable - you will still probably lose networking, but keyboard and 
the local console should work fine. You should see a 99% CPU-time 
looping ns83820 IRQ thread when the condition triggers. To debug the 
condition further, could you do something like:

	vmstat 1

what kind of interrupt rate ('in' field) does it report? If it's very 
high then it's a screaming interrupt, if it's low then the IRQ thread is 
looping for some other reason. (but both would be bugs of the -RT 
kernel.)

also, could you try to get a trace of what the IRQ thread is doing. I've 
attached trace-it.c, just run it as root (on a LATENCY_TRACING-enabled 
-RT kernel) to get a finegrained trace of what's going on in the system. 
Whenever the thread-spinning occurs, just run this utility:

	trace-it > trace.txt

and you should get a couple of milliseconds worth of system activity.  
The trace output should look like a really long latency-trace. (The 
latency traces usually compress really well with bzip2 -9, so you can 
attach it to public replies too, if compressed - that way others can 
have a look too.)

> Once, with a normal desktop load and a yum update, this came across on 
> the serial console:
> 
> cat/2100[CPU#1]: BUG in update_out_trace at kernel/latency.c:587

on SMP this could occur if the TSCs of different CPUs are too apart from 
each other. I'll probably put an automatic check for this into the 
/proc/latency_trace code.

	Ingo
/*
 * Copyright (C) 1999, Ingo Molnar <[email protected]>
 */

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/wait.h>
#include <linux/unistd.h>

int main (int argc, char **argv)
{
	int ret;

	if (getuid() != 0) {
		fprintf(stderr, "needs to run as root.\n");
		exit(1);
	}
	ret = system("cat /proc/sys/kernel/mcount_enabled >/dev/null 2>/dev/null");
	if (ret) {
		fprintf(stderr, "CONFIG_LATENCY_TRACING not enabled?\n");
		exit(1);
	}
	system("echo 1 > /proc/sys/kernel/trace_all_cpus");
	system("echo 1 > /proc/sys/kernel/trace_enabled");
	system("echo 0 > /proc/sys/kernel/trace_freerunning");
	system("echo 0 > /proc/sys/kernel/trace_print_at_crash");
	system("echo 1 > /proc/sys/kernel/trace_user_triggered");
	system("echo 0 > /proc/sys/kernel/trace_verbose");
	system("echo 0 > /proc/sys/kernel/preempt_max_latency");
	system("echo 0 > /proc/sys/kernel/preempt_thresh");
	system("[ -e /proc/sys/kernel/wakeup_timing ] && echo 0 > /proc/sys/kernel/wakeup_timing");
	system("echo 1 > /proc/sys/kernel/mcount_enabled");

	gettimeofday(0, 1);
	usleep(100000);
	gettimeofday(0, 0);

	system("cat /proc/latency_trace");

	return 0;
}


Attachment: trace-it
Description: Binary data


[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux