On Sat, 2005-04-02 at 17:09 -0500, Steven Rostedt wrote: > On Sat, 2005-04-02 at 15:44 -0500, Steven Rostedt wrote: > > On Sat, 2005-04-02 at 22:35 +0200, Ingo Molnar wrote: > I tried the 2.6.12-rc1-RT-V0.7.43-06 kernel again, and I still have the > serial, to do sysrq. The console is using X which locks (even all the > ctrl-alt-X functions) and the ssh session that I run the test stops > after the processes try to grab the locks. It doesn't reply to ping. But > the sysrq from the serial shows constantly: > > s test3_rt: 2269 [cf304690, 89] (not blocked) > s test3_rt: 2270 [cf304050, 88] (not blocked) > s test3_rt: 2271 [cef86cb0, 87] (not blocked) > s test3_rt: 2272 [cef86670, 86] (not blocked) > R test3_rt: 2273 [cf28a050, 85] (not blocked) > > > So it seems that it's in a deadlock somewhere. Since the 43-06 gets much > further, this seems to be another problem. I'm not going to look at this > problem anymore, since it doesn't show up in the lastest. > > I'll run a few more tests to see if I can narrow things down on 43-06. Once again I spoke too soon. I got the same lockup with 43-06, it just took a little more to do so. All I did differently was to put the test into a while loop on the command line: while : ; do test3_rs ; done On the second iteration, it locked up in the same place as 36-02. But this is a different issue than what I showed with a NFS mounted system. That was a BUG where as this has something in a spinning loop somewhere. The attached is the gzipped minicom capture. After the system locked up, I was still able to get the serial input and do a few task dumps. I did several and it clearly shows that the system is locked. What you constantly see is: R test3_rt: 2263 [cbd6b9b0, 116] (not blocked) S test3_rt: 2264 [cc06d8d0, 89] (not blocked) S test3_rt: 2265 [cc06c090, 88] (not blocked) R test3_rt: 2266 [cc06ccb0, 87] (not blocked) D test3_rt: 2267 [cc1286c0, 86] blocked on: [ce127650] {(struct semaphore *)(&inode->i_sem)} .. held by: test3_rt: 2268 [ce822760, 85] ... acquired at: vfs_readdir+0x53/0xa0 D test3_rt: 2268 [ce822760, 85] (not blocked) Process 2263 started the run and has just released the semaphore, it was preempted before getting to the wait and although still running, it is not really part of the picture. Process 2267 is blocked on 2268 which is sleeping at fs/jbd/transaction.c:636 which is in do_get_write_access waiting (I believe) for a transaction to finish. Probably for kjournald to run, since it is also in the running state. But the problem seems to be with 2266 where as it's trace shows: test3_rt [cc06ccb0]R C0371AD6 [on rq] 6824 2266 2263 2267 2265 (NOTLB) cadebc48 00000046 c04eea30 c0371ad6 00000000 00000001 00000016 c0371aa1 c0102d00 00000000 00000000 11de7305 071e89c0 00000071 cc06ccb0 cc06cde8 ffffffff cc06ccb0 cfae68d4 cadebc5c c0371ad6 10000000 cabc71ac cfae68d4 Call Trace: [<c0371ad6>] preempt_schedule_irq+0x46/0x70 (20) [<c0102d00>] need_resched+0x20/0x24 (-7260) --------------------------- | preempt count: 00000002 ] | 2-level deep critical section nesting: ---------------------------------------- .. [<c03710df>] .... __schedule+0x4f/0x840 .....[<c0371ad6>] .. ( <= preempt_schedule_irq+0x46/0x70) .. [<c0371179>] .... __schedule+0xe9/0x840 .....[<c0371ad6>] .. ( <= preempt_schedule_irq+0x46/0x70) Which is not very helpful. So it is probably stuck in some spinning "yield" loop, which was the reason I was writing this test to begin with! It's most likely also waiting for kjournald to do some work, and is starving it in a schedule or yield loop never actually going to sleep letting kjournald do the real work. -- Steve
Attachment:
lockup.cap.gz
Description: GNU Zip compressed data
- Follow-Ups:
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- From: Ingo Molnar <[email protected]>
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- References:
- [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-10
- From: Ingo Molnar <[email protected]>
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- From: Gene Heskett <[email protected]>
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- From: "K.R. Foley" <[email protected]>
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- From: Gene Heskett <[email protected]>
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- From: Ingo Molnar <[email protected]>
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- From: Steven Rostedt <[email protected]>
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- From: Steven Rostedt <[email protected]>
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- From: Ingo Molnar <[email protected]>
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- From: Steven Rostedt <[email protected]>
- Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- From: Steven Rostedt <[email protected]>
- [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-10
- Prev by Date: Re: [PATCH] kernel/module.c - fix warning and reduce stack usage - reintroduction of mistakenly dropped patch
- Next by Date: sched /HT processor
- Previous by thread: Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- Next by thread: Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.43-00
- Index(es):