On Fri, 5 Oct 2007, Keir Fraser wrote:
> On 5/10/07 10:05, "Jeremy Fitzhardinge" <[email protected]> wrote:
> > Andi says:
> >> Do I misread that patch or does it really walk the complete address
> >> space and try to take all possible locks? Isn't that very slow?
> >
> > That's pretty much what it has to do. Pinning/unpinning walks the whole
> > pagetable anyway, so it shouldn't be much more expensive. And they're
> > relatively rare operations (fork, exec, exit).
>
> It is a shame to do 3x walks per pin or unpin, rather than 1x, though.
>
> One way to improve this, possibly, is to pin the pte tables individually as
> you go, rather than doing one big pin/unpin just at the root pgd. Then you
> can lock/unlock the pte's as you go. I'd suggest that as a possible post
> 2.6.23 improvement, however. Jan's patch has actually had some testing.

A few points come to mind looking at Jan's patch:

The comment about nested pagetable locks is wrong: mm/mremap.c does
nest pagetable locks; but you wouldn't (as I understand it) be doing
this pinning/unpinning anywhere which could race with an mremap() on
the same mm, so that's a niggle about a comment, not a real issue.

Yes, it would be greatly preferable to take the locks one by one
as needed. As it stands, I think you're in danger of overflowing
the PREEMPT_BITS 8 area of preempt_count(), venturing into the
SOFTIRQ area: I don't know the real-life consequence of that.
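
To make the overflow concrete, here is a small userspace sketch of the
arithmetic. It assumes the layout of that era's include/linux/hardirq.h
(PREEMPT_BITS 8 at shift 0, SOFTIRQ_BITS 8 directly above it) and that
each spin_lock() on a preemptible kernel adds 1 to preempt_count(). The
helper names are hypothetical, made up for illustration, and are not
kernel functions.

```c
#include <stdint.h>

/* Assumed layout: preempt field in bits 0-7, softirq field in bits 8-15. */
#define PREEMPT_BITS   8
#define SOFTIRQ_BITS   8
#define PREEMPT_SHIFT  0
#define SOFTIRQ_SHIFT  (PREEMPT_SHIFT + PREEMPT_BITS)
#define PREEMPT_MASK   (((1u << PREEMPT_BITS) - 1) << PREEMPT_SHIFT)
#define SOFTIRQ_MASK   (((1u << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)

/*
 * Hypothetical helper: the counter value after taking nlocks spinlocks
 * without releasing any, modelling one preempt_disable() per lock.
 */
static uint32_t count_after_locks(uint32_t start, uint32_t nlocks)
{
	return start + nlocks;
}

/* Extract the preempt-depth field from a counter value. */
static uint32_t preempt_field(uint32_t count)
{
	return (count & PREEMPT_MASK) >> PREEMPT_SHIFT;
}

/* Extract the softirq field from a counter value. */
static uint32_t softirq_field(uint32_t count)
{
	return (count & SOFTIRQ_MASK) >> SOFTIRQ_SHIFT;
}
```

Under those assumptions, holding 255 locks at once still fits in the
preempt field, while the 256th carries into the softirq field: the
preempt field reads 0 and the softirq field reads 1, so the count would
look like softirq context with preemption enabled.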

I don't see any protection against hugetlb areas, where the pmd
entry may indicate a hugetlb page rather than a pagetable page.
I guess you'll be needing to test pte_huge(). I don't know if
you want to lock those or skip them: locking is usually just
with page_table_lock, but beware there's also sharing of huge
page pmds between mms - Ken Chen should be able to help on that.

If a 2.6.23 fix is needed, I suggest simply excluding split ptlocks
in the Xen case, as shown by the mm/Kconfig - line in Jan's patch.

Hugh