Re: [PATCH] Page writeback broken after resume: wb_timer lost

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi!

> > I have noticed for some time that nr_dirty never drops but
> > increases except when VM pressure forces it down. This only
> > occurs after a resume, never on a freshly booted system.
> > 
> > It seems the wb_timer is lost when the timer function is
> > trying to start a frozen pdflush thread, and this occurs
> > during suspend or resume.
> > 
> > I have included a patch which work for me. Don't know if the
> > test also should include a check for freezing to be safe, ie
> >   if ( !frozen(..) && !freezing(..) )

Yep, I have seen this too. Sync took *way* too long and I believe I
lost some data because of this problem.

> Maybe the code over in page-writeback.c should just rearm the timee within
> the timer handler rather than waiting for a pdflush thread to do it.  I'll
> think about that.
> 
> But the main questions is: what on earth is going on here?  We've taken a
> kernel thread and we've done a wake_up_process() on it, but because it was
> in a frozen state it just never gets to run, even after the resume. 
> Presumably it goes back into interruptible sleep after the resume.  We took
> it off the list (in the expectation that it'd run again) so we've lost
> control of it.

I guess you should not try to wake up process while it is frozen. Such
wakeups are likely to get lost. Should we add some BUG_ON() somewhere?

...we have to eat some wakeups, because we fake some.

Or perhaps we should do WARN_ON(frozen(current)) just after schedule()
below?

> Pavel, Rafael: this amounts to a lost wakeup.  What's the story?

								Pavel

Refrigerator looks like this:

/* Refrigerator is place where frozen processes are stored :-). */
void refrigerator(void)
{
        /* Hmm, should we be allowed to suspend when there are
realtime
           processes around? */
        long save;
        save = current->state;
        pr_debug("%s entered refrigerator\n", current->comm);
        printk("=");

        frozen_process(current);
        spin_lock_irq(&current->sighand->siglock);
        recalc_sigpending(); /* We sent fake signal, clean it up */
        spin_unlock_irq(&current->sighand->siglock);

        while (frozen(current)) {
                current->state = TASK_UNINTERRUPTIBLE;
                schedule();
        }
        pr_debug("%s left refrigerator\n", current->comm);
        current->state = save;
}

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux