On Sat, 2006-10-28 at 22:10 +1000, Nick Piggin wrote:
> Richard Purdie wrote:
> > I'm open to suggestions as to how to do it. The main problem was that in
> > in order to mark the page bad you needed to hold an extra reference on
> > the page so that the "mark bad" code would be the last code to touch the
> > page. The swapoff code doesn't need this and I don't like the idea of
> > passing some count value to a common function as that would be
> > confusing. I guess swapoff could start taking an extra reference but I
> > can see people objecting to that too as it doesn't need it.
>
> If you have the page locked (which you can't from where you're trying,
> but we'll tackle that next), and the page is swapcache, then doesn't
> that pin you a reference to the swap entry as well? So I still don't
> see why you need that extra reference.... I know there is a
> try_to_unuse_page/entry hiding in try_to_unuse somewhere ;)
The final call to swap_free() releases the swap mapping and beyond that
point you can't tell which swap page you were dealing with.
try_to_unuse_page() calls swap_free() each time it removes a reference.
To make sure we control the last swap_free(), I had it take a reference
with swap_duplicate() and then we guarantee we call the last free. The
last free can then mark the page bad and take care of the accounting.
This does mean try_to_unuse_page() succeeds when the refcount is 2 and
not 1 as in the case of the usual try_to_unuse().
> That isn't your only problem though, and we simply don't want to do
> this (potentially expensive) unusing from interrupt context. Noting
> the error and dealing with it in process context I think is the best
> way to do it.
The reasoning was that this circumstance should be extremely rare. If it
happens, we have a hardware problem. Recovering from that hardware
problem gracefully is more important than a slightly longer interrupt.
But yes, process context would be nicer, *if* we can find a way to do
it.
> BTW. what's the driver you're using? It might be useful to have an
> option to schedule a timer for the completions (at least the ones
> which generate errors).
Its a custom driver. I'm sure I can force it into using interrupt
context for the completions to try and test things although I'm also
fairly sure the existing patch will break when I do that for the reasons
you mention :-(.
> >>You say that SetPageError makes the processes die unexpectedly? How and
> >>where? We use SetPageError for IO errors, and it doesn't mean the page
> >>has errors AFAIK.
> >
> > Most of the testing I did was on ARM. If you mark a page with
> > SetPageError, the next time userspace tries to access it, you get a data
> > abort and the process dies with a SIGBUS or other faults (even if the
> > page is marked as up to date). The effect was very clear and I just
> > assumed it was behaving as intended. It was this problem I started to
> > address as I didn't like processes randomly dieing...
>
> I can't for the life of me see how or why this is happening. I don't
> doubt it is a problem, but it smells like another bug.
I can't work out the code path it happens in and until I do, I'm not
sure how I can track it down...
> > Ignoring the filesystem calls to PageError(), the remaining two are
> > wait_on_page_writeback_range() and write_one_page() which something
> > like:
> >
> > if (PageError(page))
> > return -EIO;
> >
> > I'd guess something somewhere acts on the EIO and ignores the fact the
> > page is uptodate. I'm struggling a bit to work out the exact codepaths
> > though.
>
> I don't think we have to worry about any of the filesystem code, even
> if we are talking about a swap file. It should all go straight through
> page_io.c. And wait_on_page_writeback doesn't have its return value
> checked in any of the swap code AFAIK...
>
> Still, something must be triggering it somewhere.
Something must be but I wish I knew what/where...
> > 3. Any other ideas?
> >
> > Can you give me a hint as to where in the swap out path you might want
> > to detect the error? The only way I can see is to have the end of
> > swap_writepage() wait_on_page_writeback() but that would appear to have
> > significant performance implications. I'm struggling to see a way you
> > can do this which is always outside interrupt context.
>
> swap_writepage sounds better than . You've even got the page
> already locked for you, so the job's half done ;)
>
> What performance implications did you imagine? The fastpath will
> just be a single PageError test, and error case slowpath doesn't
> matter too much beyond having something that actually works.
If swap_writepage() is to check for an error, it will have to wait until
the IO is complete with something like wait_on_page_writeback() before
it can check. The performance implication is the extra
wait_on_page_writeback() on every call to swap_writepage(). In the
meantime, it will have to give up the page lock and then reacquire it.
Unless you're thinking of something different?
Regards,
Richard
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]