Re: bcm43xx regression 2.6.19rc3 -> rc5, rtnl_lock trouble?

Larry Finger wrote:
> Johannes Berg wrote:
>> Hah, that's a lot more plausible than bcm43xx's drain patch actually
>> causing this. So maybe somehow interrupts for bcm43xx aren't routed
>> properly or something...
>>
>> Ray, please check /proc/interrupts when this happens.

When it happens, I can't. The keyboard is entirely dead (I'm in X, perhaps at
a console it would be okay). The only thing that works is magic SysRq. even
ctrl-alt-f1 to get to a console doesn't work.

That said, /proc/interrupts doesn't show MSI routed things on my AMD64 laptop.

>> I am convinced that the patch in question (drain tx status) is not
>> causing this -- the patch should be a no-op in most cases anyway, and in
>> those cases where it isn't a no-op it'll run only once at card init and
>> remove some things from a hardware-internal FIFO.
>

Okay, I can buy that.

> I agree that drain tx status should not cause the problem.
> 
> Ray, does -rc6 solve your problem as it did for Joseph?

I can't get it to repeat other than the first two times. However, I
accidentally stopped NetworkManager from handling my wireless a few days ago,
and haven't restarted it, so that may play into this.

Humor me one last time, I beg. Did you look at the messages file I posted? (Or
maybe I didn't include this second bit... Damn, I need to be more careful with
cutting and pasting...)

The second sysrq-t shows locking stuff going on, can you tell me if it looks
reasonable? It still seems to me that something acquiring and not releasing
rtnl_lock explains what I was seeing (rtnl lock is implicated in both sysrq-t
backtraces). I don't know if that thing is bcm43xx, though.

Is this part reasonable?:
 1 lock held by events/0/4:
  #0:  (&bcm->mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
 2 locks held by NetworkManager/4837:
  #0:  (rtnl_mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
  #1:  (&bcm->mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
 1 lock held by wpa_supplicant/5953:
  #0:  (rtnl_mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10

(So locks A, A&B, B)

...of the below...

 Showing all locks held in the system:
 1 lock held by events/0/4:
  #0:  (&bcm->mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
 1 lock held by getty/4224:
  #0:  (&tty->atomic_read_lock){--..}, at: [mutex_lock_interruptible+9/16]
mutex_lock_interruptible+0x9/0x10
 1 lock held by getty/4225:
  #0:  (&tty->atomic_read_lock){--..}, at: [mutex_lock_interruptible+9/16]
mutex_lock_interruptible+0x9/0x10
 1 lock held by getty/4226:
  #0:  (&tty->atomic_read_lock){--..}, at: [mutex_lock_interruptible+9/16]
mutex_lock_interruptible+0x9/0x10
 1 lock held by getty/4227:
  #0:  (&tty->atomic_read_lock){--..}, at: [mutex_lock_interruptible+9/16]
mutex_lock_interruptible+0x9/0x10
 1 lock held by getty/4228:
  #0:  (&tty->atomic_read_lock){--..}, at: [mutex_lock_interruptible+9/16]
mutex_lock_interruptible+0x9/0x10
 1 lock held by getty/4229:
  #0:  (&tty->atomic_read_lock){--..}, at: [mutex_lock_interruptible+9/16]
mutex_lock_interruptible+0x9/0x10
 2 locks held by NetworkManager/4837:
  #0:  (rtnl_mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
  #1:  (&bcm->mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
 1 lock held by wpa_supplicant/5953:
  #0:  (rtnl_mutex){--..}, at: [mutex_lock+9/16] mutex_lock+0x9/0x10
 1 lock held by less/29492:
  #0:  (&tty->atomic_read_lock){--..}, at: [mutex_lock_interruptible+9/16]
mutex_lock_interruptible+0x9/0x10
 1 lock held by bash/9871:
  #0:  (&tty->atomic_read_lock){--..}, at: [mutex_lock_interruptible+9/16]
mutex_lock_interruptible+0x9/0x10

 =============================================

Regardless, I'm going to withdraw my regression report until I can reproduce
this. I can't justify holding anything up if we can't even finger a culprit to
look at. In the meantime I'll try running with rc6.

Ray
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Problem with DMA on x86_64 with 3 GB RAM
  - From: Larry Finger <[email protected]>
- Re: bcm43xx regression 2.6.19rc3 -> rc5, rtnl_lock trouble?
  - From: Larry Finger <[email protected]>
- Re: bcm43xx regression 2.6.19rc3 -> rc5, rtnl_lock trouble?
  - From: Adrian Bunk <[email protected]>

References:
- bcm43xx regression 2.6.19rc3 -> rc5, rtnl_lock trouble?
  - From: Ray Lee <[email protected]>
- Re: bcm43xx regression 2.6.19rc3 -> rc5, rtnl_lock trouble?
  - From: [email protected] (Joseph Fannin)
- Re: bcm43xx regression 2.6.19rc3 -> rc5, rtnl_lock trouble?
  - From: Johannes Berg <[email protected]>
- Re: bcm43xx regression 2.6.19rc3 -> rc5, rtnl_lock trouble?
  - From: Larry Finger <[email protected]>

Prev by Date: Re: [patch] cpufreq: mark cpufreq_tsc() as core_initcall_sync
Next by Date: Re: [PATCH 1/10] cxgb3 - main header files
Previous by thread: Re: bcm43xx regression 2.6.19rc3 -> rc5, rtnl_lock trouble?
Next by thread: Re: bcm43xx regression 2.6.19rc3 -> rc5, rtnl_lock trouble?
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]