Re: [RFC] page fault retry with NOPAGE_RETRY

> Livelocks.  I've described the deliberate one, but I fear there are
> accidental ones I haven't thought of.

But if you check for signal pending in handle_pte_fault() and possibly
do a cond_resched(), there should be no livelock situation...

> > My thinking was something around the lines of no_page() always does the
> > retry logic. Then, we do something like:
> > 
> > handle_pte_fault() gets modified. If do_no_page() returns
> > VM_FAULT_RETRY, it checks pte_present() again. If the PTE is present, it
> > returns VM_FAULT_MINOR. If PTE is absent, it checks for signals, and
> > returns VM_FAULT_MINOR if a signal is pending. If PTE is absent and no
> > signals are pending, it returns VM_FAULT_RETRY.
> 
> You forget that the point of this optimisation is to undo mmap_sem while
> waiting on the disk IO.  Once we've done that we cannot go looking at ptes
> or vmas: another thread could have munapped the whole lot or anything. 
> (And we always need to be afraid of use_mm()..)

Wait wait .. .we don't need to have the mmap sem to -look- at a PTE. Or
the hardware would be in pain when doing table walk. My point is, if the
PTE is -present- (which we can check without taking the mmap sem), we
just return to userland and re-do the instruction.

> Once mmap_sem has been dropped we need to go all the way back to the
> process's virtual address and redo the vma lookup. The easy and clean way
> of doing that is to rerun the fault, reuse all the existing code.

Sure, but what I'm saying is that we can still check if the PTE is
present or a signal pending and based on that, decide to return either
VM_FAULT_MINOR or VM_FAULT_RETRY. The former would cause do_page_fault()
to return all the way to userland while the later would just loop in
do_page_fault() as per Mike patch.

> > In addition, we still need to modify all archs do_page_fault() to handle
> > VM_FAULT_RETRY...
> 
> Yup, it would need some temporary ifdeffery while architetures convert.
> 
> But bear in mind my earlier comments regarding possible optimisations to
> this code.

Yup and I think my idea does, and I still don't see why we need this
additional burden of MAY_RETRY to carry around... (that is, I don't see
the livelock if we are careful enough to test for signals when we get a
VM_FAULT_RETRY result, and possibly cond_resched() or test
need_resched() and go back to userland, either way is fine by me).

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [RFC] page fault retry with NOPAGE_RETRY
  - From: Andrew Morton <[email protected]>

References:
- [RFC] page fault retry with NOPAGE_RETRY
  - From: Benjamin Herrenschmidt <[email protected]>
- Re: [RFC] page fault retry with NOPAGE_RETRY
  - From: Andrew Morton <[email protected]>
- Re: [RFC] page fault retry with NOPAGE_RETRY
  - From: Mike Waychison <[email protected]>
- Re: [RFC] page fault retry with NOPAGE_RETRY
  - From: Benjamin Herrenschmidt <[email protected]>
- Re: [RFC] page fault retry with NOPAGE_RETRY
  - From: Benjamin Herrenschmidt <[email protected]>
- Re: [RFC] page fault retry with NOPAGE_RETRY
  - From: Andrew Morton <[email protected]>

Prev by Date: Re: 2.6.18-rc7-mm1: networking breakage on HPC nx6325 + SUSE 10.1
Next by Date: Re: [RFC] page fault retry with NOPAGE_RETRY
Previous by thread: Re: [RFC] page fault retry with NOPAGE_RETRY
Next by thread: Re: [RFC] page fault retry with NOPAGE_RETRY
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]