Re: [PATCH 2/2] x86_64 irq: Handle irqs pending in IRR during irq migration.


As I reported when I tested this patch, it works, but I saw an abnormally high load average while the error message was being triggered. Anyway, it is better to have a load average three or four times higher than you would expect than a crash/reboot, isn't it? :)

Luigi Genoni

p.s.
I will test the other, definitive patch on Monday on both 8- and 16-CPU systems.

On Sat, 3 Feb 2007, Eric W. Biederman wrote:

Date: Sat, 03 Feb 2007 00:55:11 -0700
From: Eric W. Biederman <[email protected]>
To: Arjan van de Ven <[email protected]>
Cc: Andrew Morton <[email protected]>, [email protected],
    "Lu, Yinghai" <[email protected]>,
    Luigi Genoni <[email protected]>, Ingo Molnar <[email protected]>,
    Natalie Protasevich <[email protected]>, Andi Kleen <[email protected]>
Subject: Re: [PATCH 2/2] x86_64 irq:  Handle irqs pending in IRR during irq
    migration.
Resent-Date: Sat, 03 Feb 2007 09:05:10 +0100
Resent-From: <[email protected]>

Arjan van de Ven <[email protected]> writes:

Once the migration operation is complete we know we will receive
no more interrupts on this vector so the irq pending state for
this irq will no longer be updated.  If the irq is not pending and
we are in the intermediate state we immediately free the vector,
otherwise we free the vector in do_IRQ when the pending irq
arrives.

So is this a for-2.6.20 thing?  The bug was present in 2.6.19, so
I assume it doesn't affect many people?

I got a few reports of this; irqbalance may trigger this kernel bug it
seems... I would suggest to consider this for 2.6.20 since it's a
hard-hang case


Yes.  The bug I fixed will not happen if you don't migrate irqs.

At the very least we want the patch below (already in -mm)
that makes it not a hard hang case.

Subject: [PATCH] x86_64:  Survive having no irq mapping for a vector

Occasionally the kernel has bugs that result in no irq being
found for a given cpu vector.  If we acknowledge the irq
the system has a good chance of continuing even though we dropped
or missed an irq message.  If we continue to simply print a
message, drop the irq, and not acknowledge it, the system is
likely to become non-responsive shortly thereafter.

Signed-off-by: Eric W. Biederman <[email protected]>
---
arch/x86_64/kernel/irq.c |   11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86_64/kernel/irq.c b/arch/x86_64/kernel/irq.c
index 0c06af6..648055a 100644
--- a/arch/x86_64/kernel/irq.c
+++ b/arch/x86_64/kernel/irq.c
@@ -120,9 +120,14 @@ asmlinkage unsigned int do_IRQ(struct pt_regs *regs)

	if (likely(irq < NR_IRQS))
		generic_handle_irq(irq);
-	else if (printk_ratelimit())
-		printk(KERN_EMERG "%s: %d.%d No irq handler for vector\n",
-			__func__, smp_processor_id(), vector);
+	else {
+		if (!disable_apic)
+			ack_APIC_irq();
+
+		if (printk_ratelimit())
+			printk(KERN_EMERG "%s: %d.%d No irq handler for vector\n",
+				__func__, smp_processor_id(), vector);
+	}

	irq_exit();

--
1.4.4.1.g278f
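
For readers following the thread, here is a rough user-space sketch of why the missing acknowledgement turns a stray vector into a hang. It is not kernel code. It only models the local APIC rule that a vector stays "in service" until an EOI is written, and that while it is in service no vector of the same or a lower priority class (vector >> 4) can be delivered. Every name in it (sim_apic, sim_do_irq, sim_ack_apic_irq, sim_can_deliver, NR_IRQS_SIM) is made up for illustration, and the ack_unmapped flag simply stands in for "kernel with or without the patch above".

```c
#include <assert.h>
#include <stdbool.h>

#define NR_VECTORS  256
#define NR_IRQS_SIM 224          /* hypothetical stand-in for NR_IRQS */
#define NO_IRQ_SIM  NR_IRQS_SIM  /* marks a vector with no irq mapping */

/* Toy model of the local APIC in-service register (ISR). */
struct sim_apic {
	bool in_service[NR_VECTORS];
};

/* Priority class of the highest in-service vector, or -1 if none. */
int highest_in_service_class(const struct sim_apic *apic)
{
	for (int v = NR_VECTORS - 1; v >= 0; v--)
		if (apic->in_service[v])
			return v >> 4;
	return -1;
}

/* A vector is deliverable only if its class beats every in-service class. */
bool sim_can_deliver(const struct sim_apic *apic, int vector)
{
	return (vector >> 4) > highest_in_service_class(apic);
}

/* Model of an EOI: clears the highest-priority in-service bit. */
void sim_ack_apic_irq(struct sim_apic *apic)
{
	for (int v = NR_VECTORS - 1; v >= 0; v--) {
		if (apic->in_service[v]) {
			apic->in_service[v] = false;
			return;
		}
	}
}

/*
 * Model of the tail of do_IRQ.  Returns true if a handler ran.
 * ack_unmapped == false models the old behaviour (print and drop),
 * ack_unmapped == true models the patched behaviour (ack, then print).
 */
bool sim_do_irq(struct sim_apic *apic, const int *vector_to_irq,
		int vector, bool ack_unmapped)
{
	assert(vector >= 0 && vector < NR_VECTORS);

	apic->in_service[vector] = true;   /* delivery sets the ISR bit */

	if (vector_to_irq[vector] < NR_IRQS_SIM) {
		/* In this model the normal handler path issues the EOI. */
		sim_ack_apic_irq(apic);
		return true;
	}

	if (ack_unmapped)
		sim_ack_apic_irq(apic);    /* what the patch adds */
	return false;
}
```

In this model, calling sim_do_irq() for an unmapped vector such as 0x39 with ack_unmapped set to false leaves sim_can_deliver() false for a mapped vector like 0x31 in the same priority class, which is the non-responsive state the changelog describes. With ack_unmapped set to true the stray interrupt is still lost, but later interrupts keep flowing.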

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
