On Thu, 12 Oct 2006 13:43:15 -0400 Edward Goggin wrote:
> I'm looking for information about potential causes for a
> spontaneous reboot of a dual core Opteron running RHEL 4
> linux (2.6.9 derivative). I'm thinking the cause is
> due to the box taking a triple fault and resetting the
> BIOS.
>
> I'm also thinking that the triple fault is caused by a
> kernel stack overflow into an unmapped virtual page
> frame. Is this a reasonable explanation? Are there
> others?
>
> What are reasonable debugging strategies for handling
> this? Following the kernel stack overflow hunch, I'm
> going to try increasing the Opteron stack size from
> 8K to 16K. Can this be done by simply changing
> THREAD_ORDER in include/asm-x86_64/page.h from 1 to 2?
>
> Also, is there any kernel stack overflow detection
> debugging code anywhere for x86_64 as there is for
> i386?
[Please start a new thread instead of replying to a different
one and changing the subject.]
arch/x86_64/Kconfig.debug has these Kernel hacking options
[in a current kernel, not 2.6.9; maybe you should/could try
a newer kernel]:
config DEBUG_STACKOVERFLOW
	bool "Check for stack overflows"
	depends on DEBUG_KERNEL
	help
	  This option will cause messages to be printed if free stack space
	  drops below a certain limit.

config DEBUG_STACK_USAGE
	bool "Stack utilization instrumentation"
	depends on DEBUG_KERNEL
	help
	  Enables the display of the minimum amount of free stack which each
	  task has ever had available in the sysrq-T and sysrq-P debug output.
	  This option will slow down process creation somewhat.
---
~Randy