Re: [Fastboot] [RFC][PATCH] Add missing notifier before crashing

Vivek Goyal wrote:

On Wed, May 31, 2006 at 06:20:45PM +0900, Akiyama, Nobuyuki wrote:

Hello Vivek-san,

On Tue, 30 May 2006 10:56:58 -0400
Vivek Goyal <[email protected]> wrote:

On Tue, May 30, 2006 at 06:33:59PM +0900, Akiyama, Nobuyuki wrote:

Hello,

The panic notifier(i.e. panic_notifier_list) does not be called
if kdump is activated because crash_kexec() does not return.
And there is nothing to notify of crash before crashing by SysRq-c.

Although notify_die() exists, the function depends on architecture.
If notify_die() is added in panic and SysRq respectively like existing
implementation, the code will be ugly.
I think that adding a generic hook in crash_kexec() is better to simplify
the code. The panic_notifier_list-user will have nothing to do.
If you want to catch SysRq, use the crash_notifier_list.

What's the use of introducing crash_notifier_list? Who is going to use
it for what purpose?

Probably we don't want to create any such infrastructure because carries

the risk of hanging in between and reducing the reliability of dumpoperation.

I really understand what you concern about.
But a certain program indeed needs some processing before
crashing even if panic occurs.
Now standard kernel includes some panic_notifier_list-user,
these are the familiar examples.


I see the panic_notifier_list but I am afraid that we can not afford
to send the crash/panic notifications if system admin has chosen to load
the kdump kernel and has decided to take a dump in case of a crash event.
This might very seriously compromise the reliability of kdump. IIRC, we
recently saw an issue with powerpc where we did not even start booting into
the second kernel as system lost way somewhere while handling notifiers.

So far on a panic event kernel only used to display the panic string
and halt the system but now it tries to do much more. That is boot
into the second kernel and caputure the dump. Of course, one can argue
that I want to implement a different policy in case of system crash. In
that case probably we should not load the kdump kernel at all.

panic_notifier_list helps in this regard that multiple subsystem can
register their own policy and policy will be excuted in the registered

priority order. Given the nature of kdump, I feel it is kind ofmutually exclusive and can not co-exist with other policies. Otherwise, wewill end up calling all other policies first and trigger booting into

the second kernel last. This is equivalent to giving higher priority to
all other policies and least priority to kdump.

As other example, a cluster support software needs to clean up
current environment and to take over immediately.  > To do so, the cluster software immediately and surely want to
know the current node dies.
Some mission critical systems demand to start taking over
within a few milli-second after the system dies.


I am no failover expert, but a quick thought suggests me that doing
anything after the panic is not reliable. So should't all failover
mechanisms depend on auto detecting that failure has occurred instead
of failing system informing that I am about to fail so you better take
over. Something like syncying two systems with a hearbeat message kind
of thing and if main node fails, it will stop generating hearbeat
messages and spare node will take over.

In the type of systems we build, we very much would like bothkexec/kdump to be reliable and

a method for early panic notification to external systems.

We need kernel dumping i.e. for verification of a Telco server indeedcrashed only because some HW malfunctionwas detected. As these situations occur very seldomly as much reliableinformation is very much appriciated.

On the other hand, for Telco servers running under full load, we simplycan't afford waisting any time in waiting

for others servers to take over.

Informing other servers about them taking over has the single highestpriority for servers running in production!-Doing notification directly in the panic function would only cost a fewCPU cycles.-Setting up some external watchdog functionality would most likely cost1 millisecond. (factor 1.000.000)-Doing notification in dump kernel would take seconds. (factor1.000.000.000)


Too me it seems the issue can be boiled down to these two issues:

issue 1:
We need early panic notification in the kernel.

From the kernel dumping people could you please give some informationabout what could kind of

functionality could be acceptable in an early panic notification function.

The most simple solution is simply to hardcode outb() directly in thepanic function in panic.c.

A more flexible solution would be to allow something like this

if (function pointer)
 call function pointer

and the called function could do outb on what ever suits the system.

Using a function pointer allows each vendor to execute whatever isneeded in their environment and avoids every vendor to

alter the panic function to suit their purpose ?

issue 2:
Just something nice to have:

I do actually not know if this is an issue, as I am not working withreading the kernel dumps,

that is if the information is already available in the dump.

The panic function is called with a buffer as argument. Could thisbuffer somehow be transferred

to the dumping kernel i.e. for storage in special loggin systems..

./Preben












-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

References:
- [RFC][PATCH] Add missing notifier before crashing
  - From: "Akiyama, Nobuyuki" <[email protected]>
- Re: [Fastboot] [RFC][PATCH] Add missing notifier before crashing
  - From: Vivek Goyal <[email protected]>
- Re: [Fastboot] [RFC][PATCH] Add missing notifier before crashing
  - From: "Akiyama, Nobuyuki" <[email protected]>
- Re: [Fastboot] [RFC][PATCH] Add missing notifier before crashing
  - From: Vivek Goyal <[email protected]>

Prev by Date: Re: IO APIC IRQ assignment
Next by Date: Re: 2.6.17-rc5-mm2
Previous by thread: Re: [Fastboot] [RFC][PATCH] Add missing notifier before crashing
Next by thread: Re: [Fastboot] [RFC][PATCH] Add missing notifier before crashing
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]