Re: Serial related oops

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 22, 2007 at 03:02:46PM +0000, Jose Goncalves wrote:
> It could be a silly question (tamper with me as I'm not familiar with
> such low level programming), but couldn't it be possible for a interrupt
> to hit in the middle of the serial_in() calls and mess with %ebx?

I'm no expert on x86, but if an interrupt was messing with %ebx, you'd
have random crashes verywhere - userspace, kernel space in unpredicatable
ways.

> What I find real hard to understand is why a hardware fault happens
> always in the same software instruction! I would expect a hardware fault
> to hit randomly...

Well, compared with your previous report, your latest report is different.
Your first report had  both EIP and %ebx being zero (because they got
corrupted when returning from serial_in).  This time only %ebx was
corrupted.

Consequently, this time we oopsed in the subsequent serial_in() rather
than trying to return to serial8250_startup() as last time.

> I left my application running this night, with a 2.6.16.41 kernel
> unpatched  on the serial driver (my last Oops report was with Frederik
> patch to remove the insertion made in 2.6.12) and it crashed again on
> exactly the same point!

>From that I take it that you removed the test in serial8250_startup which
sets UART_BUG_TXEN, and the problem persisted.  That tends to suggest
that it's not the culpret.

> > For all we know, it could be a one-off fault on the hardware you
> > happen to have - other identical units may not behave the same (can
> > you check?)
> 
> Yes I have other units that I can test it. I'll do that to see if it's
> really a one-off fault on the hardware.

Would be nice to know.

> If it continues to crash with other units I will then test with the
> msleep(10) before the "And clear the interrupt registers again for
> luck.", as you suggested earlier.
> 
> > If it is a one off case, you are welcome to patch that test out in
> > your kernel build to remove the problem, and if it's an isolated case
> > I encourage you to do this.  This is one of the great advantages of
> > open source - if you hit such a problem rather than throwing the
> > hardware away you can work around such issues.
> 
> I didn't understand what you mean by "you are welcome to patch that test
> out in your kernel build to remove the problem". Which test are you
> talking about?

The one which sets UART_BUG_TXEN.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux