[email protected] (Rasmus Bøg Hansen) writes:
> Oleg Verych <[email protected]> writes:
>
>> [ Adding e-mail of Andrew Morton, he may have clue about who to ping ;]
>> [ MAINTAINERS.smbfs seems to be emply ]
>>
>> On 2006-11-14, Rasmus BЬg Hansen wrote:
>> []
>>> [1.] One line summary of the problem:
>>>
>>> Kernel BUG's and freezes after a soft lockup.
>>>
>>> [2.] Full description of the problem/report:
>>>
>>> The night before sunday, my server froze. It was entirely dead and had
>>> to be power cycled. There was no seriel console connected but it
>>> managed to log a short BUG before, which seems related to smbfs.
>>>
>>> As it happened in the night, I am unsure what triggered the bug, but
>>> it was during the nightly backup routines, which includes running
>>> rsync over ssh (over ADSL so pretty slow) and writing some large
>>> .tar.bz2 to a smbfs drive. I assume (but do no know for sure) that it
>>> was the last one that triggered the bug.
>>
>> Nobody seems to picked this up. So.
>> Why don't you try debian's kernel 2.6.18 from unstable?
>
> I haven't tried that, partially to get rid of the initrd and have
> drivers for the controllers/disks in-kernel, partially because I like
> having built my own kernel. That might be silly, of course - I can try
> the Debian kernel, if you think it would make a difference.
>
>> I see, you've build it yourself, then try to enable some more locking
>> debuging in the "kernel hacking" section.
>
> I am now running with all lock debugging turned on. "Unfortunately"
> the machine has been running stable since the crash last weekend -
> perhaps the extra-large backup rourines at sunday may trigger it
> again...
>
>> (gitweb down, i can't check history of smbfs, and i have amd64 arch, anyway)
>>> Nov 12 03:54:57 gere kernel: BUG: soft lockup detected on CPU#0!
>>> Nov 12 03:54:57 gere kernel: [softlockup_tick+170/195] softlockup_tick+0xaa/0xc3
>>> Nov 12 03:54:57 gere kernel: [update_process_times+56/137] update_process_times+0x38/0x89
>>> Nov 12 03:54:57 gere kernel: [smp_apic_timer_interrupt+105/117] smp_apic_timer_interrupt+0x69/0x75
>>> Nov 12 03:54:57 gere kernel: [smbiod+238/348] smbiod+0xee/0x15c
>> this
>>
>>> Nov 12 03:54:57 gere kernel: [apic_timer_interrupt+31/36] apic_timer_interrupt+0x1f/0x24
>>> Nov 12 03:54:57 gere kernel: [journal_init_revoke+49/678] journal_init_revoke+0x31/0x2a6
>>> Nov 12 03:54:57 gere kernel: [smbiod+238/348] smbiod+0xee/0x15c
>> and this *may be* double (un)lock.
>
> Hopefully lock debugging will tell.
I got this - I think it was this morning (somehow kernel logging was
disabled so I can't tell the exact time):
----
=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
nfsd/1788 is trying to acquire lock:
(&inode->i_mutex){--..}, at: [<c02cc35c>] mutex_lock+0x8/0xa
but task is already holding lock:
(&inode->i_mutex){--..}, at: [<c02cc35c>] mutex_lock+0x8/0xa
other info that might help us debug this:
2 locks held by nfsd/1788:
#0: (hash_sem){----}, at: [<e0930d99>] exp_readlock+0x12/0x16 [nfsd]
#1: (&inode->i_mutex){--..}, at: [<c02cc35c>] mutex_lock+0x8/0xa
stack backtrace:
[<c0103c10>] show_trace+0x27/0x2b
[<c0103d2d>] dump_stack+0x26/0x2a
[<c01353ba>] print_deadlock_bug+0xb5/0xba
[<c0135420>] check_deadlock+0x61/0x71
[<c0136da3>] __lock_acquire+0x334/0x9b6
[<c0137b3d>] lock_acquire+0x75/0x90
[<c02cb9a1>] __mutex_lock_slowpath+0x85/0x270
[<c02cc35c>] mutex_lock+0x8/0xa
[<e092cb75>] nfsd_setattr+0x3ca/0x574 [nfsd]
[<e092e3a8>] nfsd_create_v3+0x3a6/0x540 [nfsd]
[<e0934d4b>] nfsd3_proc_create+0x118/0x161 [nfsd]
[<e0929751>] nfsd_dispatch+0xd8/0x1ff [nfsd]
[<e08e8503>] svc_process+0x4e5/0x6da [sunrpc]
[<e09294e7>] nfsd+0x1cc/0x35e [nfsd]
[<c01010b9>] kernel_thread_helper+0x5/0xb
----
Also it doesn't seem to be related to smbfs AFAICS - I am not enough
into lock debugging to say if this is relevant, useful or just
noise...
Regards
/Rasmus
--
Rasmus Bøg Hansen
MSC Aps
Bøgesvinget 8
2740 Skovlunde
44 53 93 66
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]