Re: CFQ + 2.6.13-rc4-RT-V0.7.52-02 = BUG: scheduling with irqs disabled

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 25 2005, Ingo Molnar wrote:
> 
> * Jens Axboe <[email protected]> wrote:
> 
> > > BUG: scheduling with irqs disabled: libc6.postinst/0x20000000/13229
> > > caller is ___down_mutex+0xe9/0x1a0
> > >  [<c029c1f9>] schedule+0x59/0xf0 (8)
> > >  [<c029ced9>] ___down_mutex+0xe9/0x1a0 (28)
> > >  [<c0221832>] cfq_exit_single_io_context+0x22/0xa0 (84)
> > >  [<c02218ea>] cfq_exit_io_context+0x3a/0x50 (16)
> > >  [<c021db84>] exit_io_context+0x64/0x70 (16)
> > >  [<c011efda>] do_exit+0x5a/0x3e0 (20)
> > >  [<c011f3ca>] do_group_exit+0x2a/0xb0 (24)
> > >  [<c0103039>] syscall_call+0x7/0xb (20)
> > 
> > Hmm, Ingo I seem to remember you saying that the following construct:
> > 
> >         local_irq_save(flags);
> >         spin_lock(lock);
> > 
> > which is equivelant to spin_lock_irqsave() in mainline being illegal 
> > in -RT, is that correct? This is what cfq uses right now for an 
> > exiting task, as the above trace indicates.
> 
> yes, that's correct. Mainline's exit_io_contexts() uses the above 
> construct because there's no task_lock_irqsave(current, flags) API.
> 
> note that recent -RT kernels are a lot less drastic about these cases 
> and print a once-per-bootup warning, not a scary message like above.  
> This message should not happen in recent -RT kernels.
> 
> The problem was this piece of code in exit_io_contexts():
> 
>         local_irq_save(flags);
>         task_lock(current);
>         ioc = current->io_context;
>         current->io_context = NULL;
>         ioc->task = NULL;
>         task_unlock(current);
>         local_irq_restore(flags);
> 
> i understand the detached use of flags, but i'm also wondering why irqs 
> have to be disabled here in the first place? At this point in do_exit() 
> we should normally not have any pending IO attached to our io_context.  
> What am i missing?

There can quite easily be lots of pending IO for the io_context (and, in
CFQ's case, below cfq_io_contexts), task exiting is completely decoupled
from any pending io.

Then there's the cfq_exit_io_context() locking. I have to ponder this a
bit, I cannot even convince myself that it is currently safe right now.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]
  Powered by Linux