On Tue, Oct 09, 2007 at 11:06:23PM +0200, Muli Ben-Yehuda wrote:
> Hi Chandru,
>
> Thanks for the patch. Comments on the patch below, but first a general
> question for my education: the main problem here is that aacraid
> continues DMA'ing when it shouldn't. Why can't we shut it down
> cleanly? Even without the presence of an IOMMU, it seems dangerous to
> let an adapter continue DMA'ing to and from memory when the kernel is
> in an inconsistent state.
>
Hi Muli,
After a kernel crash, we can't rely on the crashed kernel's data
structures or methods any more. We can't call the shutdown methods of all
the device drivers, as we might be invoking the very driver that caused
the crash. Hence we don't perform any device shutdown in the kdump case.
Instead, after the crash we try to take the shortest route to the second
kernel (execute the minimum amount of code in the crashed kernel).
Any special handling needed to bring up the second kernel on hardware in
a potentially unknown state is done in the second kernel.
Ongoing DMA is not much of a concern as long as the TCE tables are not
corrupted. In that case the DMA targets the first kernel's memory
buffers and the second kernel is not affected. But if the TCE tables
themselves are corrupted, ongoing DMA can potentially interfere with the
second kernel's operation. I don't know how that can be addressed.
> The patch below looks reasonable *if* that is the least worst way of
> doing it - let's see if we can come up with something cleaner that
> doesn't rely in the new kernel on data (which may or may not be
> corrupted...) from the old kernel.
>
I think the issue here is that some DMA was in progress when the first
kernel crashed. After the crash, the second kernel booted, created new TCE
tables and switched to them. The in-flight DMA then failed and the hardware
raised an alarm.
In this case, it would probably make sense to re-use the TCE tables of the
previous kernel (unless we have a way to tell the hardware not to flag a
DMA error if a TCE mapping is changed while DMA is going on?).
I think we also need to reserve some TCE table entries (in the first
kernel) which the second kernel can use for saving the kernel core file to
disk. There might be a case where the first kernel has used up all TCE
entries and the second kernel can't allocate any more. I think ppc64 has
taken the approach of freeing some entries in the second kernel, but that
has the problem that you might be clearing an entry which is still being
used by ongoing DMA.
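
To make the reservation idea concrete, here is a rough toy model in C. It
is only a sketch, not kernel code: the names (toy_tce_table,
first_kernel_alloc, kdump_kernel_alloc), the table size and the "valid"
bit layout are all made up for illustration and do not correspond to the
real ppc64 IOMMU interfaces. It assumes the second kernel inherits the
first kernel's table unchanged and only ever allocates from a tail range
that the first kernel never touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define TCE_ENTRIES   16u    /* total entries in the toy table */
#define TCE_RESERVED   4u    /* tail entries kept free for the kdump kernel */
#define TCE_VALID      1ull  /* assumed: low bit marks an entry as mapped */

struct toy_tce_table {
    uint64_t entry[TCE_ENTRIES];
};

/* Find a free slot in [start, end) and map phys there; -1 if range is full. */
static int toy_alloc_range(struct toy_tce_table *t, size_t start, size_t end,
                           uint64_t phys)
{
    assert((phys & TCE_VALID) == 0); /* address must leave the flag bit clear */
    for (size_t i = start; i < end; i++) {
        if (!(t->entry[i] & TCE_VALID)) {
            t->entry[i] = phys | TCE_VALID;
            return (int)i;
        }
    }
    return -1;
}

/* First kernel: may only use the non-reserved head of the table. */
static int first_kernel_alloc(struct toy_tce_table *t, uint64_t phys)
{
    return toy_alloc_range(t, 0, TCE_ENTRIES - TCE_RESERVED, phys);
}

/*
 * Second (kdump) kernel: re-uses the inherited table and allocates only
 * from the reserved tail, so entries that may still back in-flight DMA
 * from the first kernel are never cleared or overwritten.
 */
static int kdump_kernel_alloc(struct toy_tce_table *t, uint64_t phys)
{
    return toy_alloc_range(t, TCE_ENTRIES - TCE_RESERVED, TCE_ENTRIES, phys);
}
```

With this split, the first kernel can run out of entries without leaving
the second kernel stuck, and the second kernel never has to free an entry
it did not create. The cost is that the reserved entries are unavailable
to the first kernel during normal operation.
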
Thanks
Vivek