Re: [Patch] jbd commit code deadloop when installing Linux

On 28 Jun 2006 13:46:22 +0800
Zou Nan hai <[email protected]> wrote:

> On Wed, 2006-06-28 at 14:55, Andrew Morton wrote:
> > On Wed, 28 Jun 2006 08:38:59 +0200
> > Ingo Molnar <[email protected]> wrote:
> > 
> > > 
> > > * Andrew Morton <[email protected]> wrote:
> > > 
> > > > > > We see a system hang in the ext3 jbd code
> > > > > > when the Linux install program anaconda is copying
> > > > > > packages.
> > > > > > 
> > > > > > That is because anaconda is invoked from linuxrc
> > > > > > in the initrd while system_state is still SYSTEM_BOOTING.
> > > 
> > > [ argh ...! ]
> > 
> > That's what I thought  ;)
> > 
> > > > > > Thus the cond_resched checks in journal_commit_transaction
> > > > > > will always return 1 without actually scheduling,
> > > > > > and the system falls into a dead loop.
> > > > 
> > > > That's a bug in cond_resched().
> > > > 
> > > > Something like this..
> > > 
> > > Acked-by: Ingo Molnar <[email protected]>
> > > 
> > 
> > Thanks.  Zou, it'd be great if you could test this in your setup, please. 
> > I've tagged it as 2.6.17.x material.
> 
> Andrew, 
>    I am building the environment to test.
>    The patch was my original idea, but I was afraid of breaking any code
> that relies on the old, wrong cond_resched semantics.

We prefer the "right" fix, however painful or risky that might be.

> However, I later did a grep and found that very little code checks the
> return value of cond_resched, so the patch should be safe.

Hope so.

> However, I think cond_resched_lock and cond_resched_softirq also need
> fixing to make the semantics consistent.
> 
> Please check the following patch.
> 

Ah.  I think the return value from these functions should mean "something
disruptive happened", if you like.

See, the callers of cond_resched_lock() aren't interested in whether
cond_resched_lock() actually called schedule().  They want to know whether
cond_resched_lock() dropped the lock.  Because if the lock was dropped, the
caller needs to take some special action, regardless of whether schedule()
was finally called.

So I think the patch I queued is OK, agree?
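
To illustrate the "something disruptive happened" reading of the return value, here is a rough user-space sketch. It is not the kernel's code: toy_lock, lock_break_requested, scheduler_ready, toy_schedule, toy_cond_resched_lock and toy_process are all hypothetical stand-ins invented for this example, and the toy clears its request flag itself, which the real functions do not necessarily do. The only point is that the helper reports 1 whenever it dropped the lock, whether or not the scheduler stand-in ran.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of this is kernel code. */
struct toy_lock { bool held; };

static bool lock_break_requested; /* toy stand-in for "someone wants the CPU or the lock" */
static bool scheduler_ready;      /* toy stand-in for "system is past early boot" */
static int  schedule_calls;       /* how many times the toy scheduler ran */

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }
static void toy_schedule(void) { schedule_calls++; }

/*
 * Returns 1 if the lock was dropped and retaken, 0 otherwise.
 * The result deliberately does not depend on whether toy_schedule() ran.
 */
static int toy_cond_resched_lock(struct toy_lock *l)
{
    assert(l->held);               /* caller must hold the lock */
    if (!lock_break_requested)
        return 0;

    lock_break_requested = false;  /* toy simplification: request is consumed */
    toy_lock_release(l);
    if (scheduler_ready)
        toy_schedule();            /* skipped during the toy "boot" phase */
    toy_lock_acquire(l);
    return 1;                      /* the lock was dropped: caller must react */
}

/*
 * A caller that walks n_items under the lock. Whenever the lock was
 * dropped, whatever it protects may have changed, so the caller has to
 * revalidate its state; here that is only counted.
 */
static int toy_process(struct toy_lock *l, int n_items, int request_at)
{
    int revalidations = 0;

    toy_lock_acquire(l);
    for (int i = 0; i < n_items; i++) {
        if (i == request_at)
            lock_break_requested = true;
        if (toy_cond_resched_lock(l))
            revalidations++;       /* special action after a lock drop */
    }
    toy_lock_release(l);
    return revalidations;
}
```

In this sketch toy_process() revalidates once per dropped lock, both with scheduler_ready set and with it clear; only schedule_calls differs between the two cases.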
