[RFC] [PATCH] Fix misrouted interrupts deadlocks

While testing a kernel on a machine booted with the "irqpoll"
option, I hit the following lockup:

	__do_IRQ()
	   spin_lock(&desc->lock);
	   desc->chip->ack(); /* IRQ is ACKed */
	note_interrupt()
	misrouted_irq()
	handle_IRQ_event()
	   if (...)
	      local_irq_enable_in_hardirq();
	/* interrupts are enabled from now on */
	...
	__do_IRQ() /* same IRQ we started from */
	   spin_lock(&desc->lock); /* LOCKUP */
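
For anyone who wants to see this re-entry outside the kernel, here is
a minimal userspace sketch; it is not kernel code. It assumes a POSIX
error-checking mutex as a stand-in for desc->lock, so the nested
acquisition reports EDEADLK instead of spinning forever as a kernel
spinlock would. The name nested_lock_result() is made up for
illustration.

```c
#define _XOPEN_SOURCE 700	/* for PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Take the same non-recursive lock twice from one thread, the way the
 * nested __do_IRQ() does for the same IRQ. Returns the result of the
 * second lock attempt, or -1 if setup fails.
 */
static int nested_lock_result(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t lock;
	int rc, ret;

	if (pthread_mutexattr_init(&attr))
		return -1;
	if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK) ||
	    pthread_mutex_init(&lock, &attr)) {
		pthread_mutexattr_destroy(&attr);
		return -1;
	}
	pthread_mutexattr_destroy(&attr);

	rc = pthread_mutex_lock(&lock);		/* outer __do_IRQ() */
	assert(rc == 0);
	(void)rc;
	ret = pthread_mutex_lock(&lock);	/* nested __do_IRQ() */

	pthread_mutex_unlock(&lock);
	pthread_mutex_destroy(&lock);
	return ret;
}
```

Calling nested_lock_result() should return EDEADLK: the second
acquisition can never succeed while the first one is still held by
the same context.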

Looking at the misrouted_irq() code, I also found that a potential
deadlock like this can take place:

CPU 1:
__do_IRQ()
   spin_lock(&desc->lock); /* irq = A */
misrouted_irq()
   for (i = 1; i < NR_IRQS; i++) {
      spin_lock(&desc->lock); /* irq = B */
      if (desc->status & IRQ_INPROGRESS) {

CPU 2:
__do_IRQ()
   spin_lock(&desc->lock); /* irq = B */
misrouted_irq()
   for (i = 1; i < NR_IRQS; i++) {
      spin_lock(&desc->lock); /* irq = A */
      if (desc->status & IRQ_INPROGRESS) {

As the second lock on both CPUs is taken before checking whether
that IRQ is being handled on another processor, this may cause
a deadlock. This issue is only theoretical.

I propose the attached patch to fix both problems: while
misrouted_irq() is trying to handle a misrouted IRQ, the currently
held desc->lock can be dropped.
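
To illustrate why dropping the lock removes the two-CPU ordering
problem, here is a rough userspace sketch, again not kernel code. It
assumes two pthread mutexes standing in for the desc->lock of IRQ A
and IRQ B, and two threads standing in for the CPUs. All names
(fake_desc, poll_others(), fake_cpu(), run_two_cpus(), ROUNDS) are
hypothetical. Because each thread releases its own lock before
touching the other descriptor, no thread ever holds two locks at
once, so the A/B versus B/A ordering cannot occur.

```c
#include <assert.h>
#include <pthread.h>

#define NR_DESC	2	/* hypothetical: descriptors "A" and "B" */
#define ROUNDS	10000

struct fake_desc {
	pthread_mutex_t lock;
	unsigned long polled;	/* times another "CPU" polled us */
};

static struct fake_desc descs[NR_DESC] = {
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
};

/* Stand-in for misrouted_irq(): visit every other descriptor. */
static void poll_others(int self)
{
	int i;

	for (i = 0; i < NR_DESC; i++) {
		if (i == self)
			continue;
		pthread_mutex_lock(&descs[i].lock);
		descs[i].polled++;
		pthread_mutex_unlock(&descs[i].lock);
	}
}

/* Stand-in for one CPU running the patched path ROUNDS times. */
static void *fake_cpu(void *arg)
{
	int self = *(int *)arg;
	int n;

	for (n = 0; n < ROUNDS; n++) {
		pthread_mutex_lock(&descs[self].lock);	 /* __do_IRQ() */
		pthread_mutex_unlock(&descs[self].lock); /* as in the patch */
		poll_others(self);
		pthread_mutex_lock(&descs[self].lock);	 /* retake */
		pthread_mutex_unlock(&descs[self].lock); /* end of __do_IRQ() */
	}
	return NULL;
}

/* Run both "CPUs" to completion; returns total polls, or -1 on error. */
static long run_two_cpus(void)
{
	pthread_t t[NR_DESC];
	int ids[NR_DESC];
	int i;

	for (i = 0; i < NR_DESC; i++) {
		ids[i] = i;
		if (pthread_create(&t[i], NULL, fake_cpu, &ids[i]))
			return -1;
	}
	for (i = 0; i < NR_DESC; i++)
		if (pthread_join(t[i], NULL))
			return -1;
	return (long)(descs[0].polled + descs[1].polled);
}
```

run_two_cpus() always terminates in this form. If the unlock/retake
pair around poll_others() is removed, each thread holds its own lock
while waiting for the other's, and the run can hang.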

Please comment.

--- ./kernel/irq/spurious.c.irqlockup	2006-11-09 11:19:10.000000000 +0300
+++ ./kernel/irq/spurious.c	2006-11-10 16:53:38.000000000 +0300
@@ -147,7 +147,11 @@ void note_interrupt(unsigned int irq, st
 	if (unlikely(irqfixup)) {
 		/* Don't punish working computers */
 		if ((irqfixup == 2 && irq == 0) || action_ret == IRQ_NONE) {
-			int ok = misrouted_irq(irq);
+			int ok;
+
+			spin_unlock(&desc->lock);
+			ok = misrouted_irq(irq);
+			spin_lock(&desc->lock);
 			if (action_ret == IRQ_NONE)
 				desc->irqs_unhandled -= ok;
 		}
