Re: [FIXED] Re: Total machine lockup w/ current kernels while installing from CD

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Bernhard Rosenkraenzer <[email protected]> wrote:
>
> On Thursday, 11. May 2006 03:22, Bernhard Rosenkraenzer wrote:
> > Hi,
> > I've built a CD that installs a customized system
> > [... crash at a random point ...]
> > BUG: soft lockup detected on CPU#0!
> >
> > Pid: 421, comm: kjournald
> > EIP: 0060:[<b01a2f52>] CPU: 0
> > EIP is at journal_commit_transaction+0x92e/0xfcc
> > EFLAGS: 00000297 Not tainted (2.6.16-rc6 #1)
> > EAX: 00000001 EBX: c2d34788 ECX: 00000001 EDX: c785e000
> > ESI: b3ff8d04 EDI: 000000f0 EBP: b683b840 DS: 007b ES: 007b
> > CR0: 8005003b CR2: 0841f7fc CR3: 17217000 CR4: 000006d0
> >  [<b02bd52e>] schedule+0x2ee/0x5b6
> >  [<b01a6a88>] kjournald+0x201/0x213
> >  [<b0111089>] smp_apic_timer_interrupt+0x32/0x49
> >  [<b01a6937>] kjournald+0xb0/0x213
> >  [<b01a5ffa>] commit_timeout+0x0/0x9
> >  [<b012a789>] autoremove_wake_function+0x0/0x4b
> >  [<b01a6887>] kjournald+0x0/0x213
> >  [<b0101005>] kernel_thread_helper+0x5/0xb
> 
> After backing out lots of changes, I've figured out the problem is caused by 
> this bit of 2.6.16-rc6:
> 
> diff -urN linux-2.6.16-rc5/kernel/sched.c linux-2.6.16-rc6/kernel/sched.c
> --- linux-2.6.16-rc5/kernel/sched.c	2006-05-11 20:04:18.000000000 +0200
> +++ linux-2.6.16-rc6/kernel/sched.c	2006-05-11 20:00:00.000000000 +0200
> @@ -4028,6 +4021,8 @@
>  	 */
>  	if (unlikely(preempt_count()))
>  		return;
> +	if (unlikely(system_state != SYSTEM_RUNNING))
> +		return;
>  	do {
>  		add_preempt_count(PREEMPT_ACTIVE);
>  		schedule();

(That's cond_resched())

> 
> The problem is that (to save a couple of bits of space), my simple installer 
> was running inside an initrd -- and system_state isn't set to SYSTEM_RUNNING 
> before linuxrc is executed --> scheduler breakage causes the oops.

ah-hah.

It's odd that we'll run initrds in a !SYSTEM_RUNNING state.

It's not an oops - it's sort-of a warning.  Did the system actually
continue to run and boot up OK?

If so, I'd assume that the ext3 filesystem was mounted on a very slow
device - perhaps an IDE disk in PIO mode?

Perhaps we should poke the softlockup detector if someone called
cond_resched() when in a reschedulable state.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux