* David Miller <[email protected]> wrote:

> I suspect that what is happening is that the NOHZ period is longer 
> than the softlockup timeout (10 seconds) and we get an interrupt 
> before the watchdog thread gets onto the cpu.

Indeed! Does the patch below do the trick?


Subject: softlockup: do the wakeup from a hrtimer
From: Ingo Molnar <[email protected]>

David Miller reported soft lockup false positives that trigger
on NOHZ kernels when a CPU idles for more than 10 seconds.

The solution is to drive the wakeup of the watchdog threads
not from the timer tick (which has no guaranteed frequency),
but from the watchdog tasks themselves.

Reported-by: David Miller <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
 kernel/softlockup.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

Index: linux/kernel/softlockup.c
--- linux.orig/kernel/softlockup.c
+++ linux/kernel/softlockup.c
@@ -100,10 +100,6 @@ void softlockup_tick(void)
 	now = get_timestamp(this_cpu);
-	/* Wake up the high-prio watchdog task every second: */
-	if (now > (touch_timestamp + 1))
-		wake_up_process(per_cpu(watchdog_task, this_cpu));
 	/* Warn about unreasonable 10+ seconds delays: */
 	if (now <= (touch_timestamp + softlockup_thresh))
@@ -141,7 +137,7 @@ static int watchdog(void *__bind_cpu)
 	while (!kthread_should_stop()) {
-		schedule();
+		msleep(1000);
 	return 0;
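
To illustrate why moving the wakeup out of the tick should help, here is
a rough userspace model of the two schemes. It is only a sketch, not
kernel code: the function names, the whole-second time base and the
SOFTLOCKUP_THRESH macro are made up for this example. It assumes that
the CPU goes idle at t = 0 with a fresh timestamp, that no tick fires
while it idles, and that the watchdog's own one-second sleep brings the
CPU out of idle so the thread can touch the timestamp.

```c
#include <stdbool.h>

/* Hypothetical model constant: the 10 second timeout discussed above. */
#define SOFTLOCKUP_THRESH 10

/*
 * Old scheme: the watchdog thread is woken only from the timer tick.
 * Under NOHZ no tick fires while the CPU idles, so the timestamp is
 * still the one from t = 0 when the first tick after idle checks it.
 */
bool tick_driven_false_positive(int idle_secs)
{
	int touch_timestamp = 0;
	int now = idle_secs;	/* first tick after the idle period */

	/* The check runs before the freshly woken watchdog gets the CPU. */
	return now > touch_timestamp + SOFTLOCKUP_THRESH;
}

/*
 * New scheme: the watchdog thread sleeps one second at a time and
 * touches the timestamp every time it wakes up (modelling assumption).
 */
bool self_timed_false_positive(int idle_secs)
{
	int touch_timestamp = 0;
	int now = idle_secs;
	int t;

	for (t = 1; t <= idle_secs; t++)
		touch_timestamp = t;	/* watchdog woke itself up */

	return now > touch_timestamp + SOFTLOCKUP_THRESH;
}
```

In this model a 15 second idle period trips the tick-driven check but
not the self-timed one, which matches the behaviour described in the
report.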
