Re: linux-2.6 x86_64 kgdb issue

On Thu, 2006-05-25 at 12:07 +0530, Amit S. Kale wrote:
> On Wednesday 24 May 2006 23:41, Vladimir A. Barinov wrote:
> > Amit S. Kale wrote:
> > >Looking at this again:
> > >Call Trace:  {kgdb_notify}
> > >                    {notifier_call_chain}
> > >                    {do_stack_segment}
> > >                    {stack_segment}
> > >                      {io_outb}
> > >                      {kgdb_mem2hex}
> > >
> > >Why is io_outb being called from kgdb_mem2hex? kgdb_mem2hex refers to data
> > >directly and not through io_outb.
> > >
> > >Perhaps it's got something to do with the iommu feature. Have you used "iommu"
> > > on the kernel command line?
> >
> > I have iommu switched on, but disabling it led to the same dump_stack
> > result.
> 
> This is confusing. Could you shed some light on this?
> 
> > But I was using an earlier version of kgdb_8250.c, in which io_outb() was a
> > callback similar to kgdb_oiwrite() in kgdb-2.6.16.tar.bz2.
> > After merging kgdb_8250.c from kgdb-2.6.16.tar.bz2, I got this
> > dump_stack output:
> >                     {notifier_call_chain}
> >                     {do_stack_segment}
> >                      {stack_segment}
> >                      {kgdb_mem2hex}
> >                     {kgdb_mem2hex}
> >
> >
> > Also, I see this stack exception behavior on 2.6.10. On 2.6.16
> > patched with kgdb-2.6.16.tar.bz2, the target instead reboots after
> > multiple steps and a final "continue" command.
> 
> Now it looks like we have a stack overflow, which would result in a stack
> exception. A stack overflow usually results in a complete breakdown of the
> kernel, since there is no stack left to handle the stack exception itself.
> Processors up to the Pentium used to float their buses, which would be
> detected by the surrounding hardware and cause a reset. I am not sure whether
> modern processors and/or hardware do that.
> 
> >
> > Just want to note that in the 2.6.10 kernel the stack exception doesn't
> > occur if CONFIG_64BIT in linux/kernel/kgdb.c is not defined.
> 
> CONFIG_64BIT probably requires more stack. That's why you see a stack 
> exception.

I added some debug info to the thread and got stack overflows; doubling
the size of the stack was trivial. I was saving a backtrace of the stack
at each preemption point (e.g. a spinlock) so that I could see the
context of the active holders of spinlocks. I configured it with
CONFIG_DEBUG_PREEMPT_AUDIT and enabled large stacks in:
------------------------------------------------------------------------
 			include/asm-i386/thread_info.h:
------------------------------------------------------------------------
#ifdef CONFIG_DEBUG_PREEMPT_AUDIT
#define THREAD_SIZE     (8192 * 2)
#else
#ifdef CONFIG_4KSTACKS
#define THREAD_SIZE         (4096)
#else
#define THREAD_SIZE         (8192)
#endif
#endif
-------------------------------------------------------------------------
All you really need to do is change THREAD_SIZE from
(8192) to (8192 * 2). I didn't have any problems on i386.



> 
> Unfortunately, the x86_64 architecture doesn't provide any stack overflow
> debugging mechanism. Perhaps you can add a little code to
> kgdb_handle_exception that checks whether we are beyond 7168 bytes of stack
> usage on entry. If we are, declare a panic indicating a possible stack
> overflow later.

I was getting stack overflows on the SPARC architecture when compiling
the kernel with -O1 for kgdb/kdbx debugging. I allocated a hot physical
page for each CPU as it was brought online and then mapped it on the fly
when we got a stack overflow. I then pushed out the register window that
caused the trap and continued with the normal panic path.

 
Perhaps we should add a kgdb config menu option and #define
CONFIG_16KSTACKS to double the stack size so the kernel can be 
debugged with more context available. I'm currently using -O0 for 
the networking stack and -O1 for the rest of the kernel. Sounds like 
it would be helpful now for AMD64 targets.

-piet

> 
> -Amit
> 
> >
> > Vladimir
> >
> > >-Amit
> > >
> > >On Friday 19 May 2006 23:45, Vladimir A. Barinov wrote:
> > >>Hi All,
> > >>
> > >>I'm working with an EM64T dual Xeon board and have problems with kgdb
> > >>when SMP is on.
> > >>During step-by-step debugging I got an error message and the gdb server
> > >>lost its connection
> > >>to the target (gdb log is attached).
> > >>
> > >>Putting a simple printk() and dump_stack() into kgdb_notify():
> > >>    .....
> > >>    if  (cmd ==  DIE_TRAP) {
> > >>        printk("DIE_TRAP, args->str=%s, kgdb_may_fault=%d\n",
> > >>               args->str, kgdb_may_fault);
> > >>        dump_stack();
> > >>    }
> > >>    ......
> > >>
> > >>I got this trace:
> > >>DIE_TRAP, args->str=stack segment, kgdb_may_fault=1
> > >>Call Trace:  {kgdb_notify}
> > >>                    {notifier_call_chain}
> > >>                    {do_stack_segment}
> > >>                    {stack_segment}
> > >>                      {io_outb}
> > >>                      {kgdb_mem2hex}
> > >>
> > >>The stack exception always occurs at the same step during debugging, in
> > >>kgdb_mem2hex().
> > >>I've attached a patch that fixes this issue. Could you please review
> > >>whether this patch is an appropriate fix for the problem?
> > >>
> > >>Vladimir
> 
> 
-- 
---
[email protected]
