> Natalie,
> Regarding the "IRQ compression" in mp_register_gsi()....
>
> Some time ago we invented ioapic_renumber_irq() to handle the
> case where the ES7000 BIOS is missing the INTERRUPT_OVERRIDES
> needed to tell that legacy IRQs used pins > 15, and PCI
> interrupts used pins < 16.
>
> In that case, the ES7000 uses es7000_rename_gsi() to simply
> re-number the IRQs for the lower pins to some unused numbers
> above the highest GSI in the system -- emulating what the
> BIOS should have done.
>
> Later, GSI compression was also added to mp_register_gsi():
>
> if (triggering == ACPI_LEVEL_SENSITIVE)
> {
> gsi = pci_irq++;
> ...
> }
>
> This GSI "compression" was intended to handle systems which
> have a large number of sparsely populated IOAPICs. On these
> systems, the maximum GSI is well above NR_IRQS. While there
> can be a large number of devices on these boxes, the total in
> use is actually still below NR_IRQs -- so squeezing the GSIs
> into the IRQ numbers works.
>
> The problem with doing this is that it also compresses the
> IRQ numbers for the other 99.99% of the systems on earth,
> causing complexity and bugs, such as several mentioned on this thread.
>
> I'm thinking that it would be simpler for the big iron that
> has GSI's > NR_IRQs, to simply use the existing
> ioapci_renumber_irq() hook to do whatever compression or
> re-naming they fancy.
> This will also allow for simpler systems to remain simple and
> use identity irq = gsi numbering.
> (don't get me started on CONFIG_PCI_MSI, that is an independent issue)
>
> Also, as we discussed, it would probably be cleaner even on
> those large systems if the compression algorithm did not
> re-name every GSI, but only did re-names when it really has
> to (GSI > NR_IRQS). eg.
>
> < gsi = pci_irq++
> ---
> > if (GSI < NR_IRQS)
> > gsi = irq;
> > else
> > BUG_ON(irq_used >= NR_IRQS)
> > gsi = platform specific algorithm to grab the unused IRQs that
> > can never have any real devices attached.
> > irqs_used++;
>
> If we had done this, then we wouldn't have needed kimbal's patch.
> If we do this, then we can delete that workaround on top of
> workaround.
Sure sounds like a great idea. I just signed up for a large partition (>
1700 GSIs) to try this over the weekend.
The problem with oapic_renumber_gsi() is that it didn't actually migrate
to x86_64 because it doesn't have sub-archs so far. But using actual
number of interrupts vs. range is definitely promising idea. I was also
thinking that all the complexity is actually located on the first
IO-APIC, this may have some potential for compression only coming into
play on non-zero IO-APICs. But using the total number of IRQs used is
definitely the most logical thing to try...I will test this shortly -
Thanks,
--Natalie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]