Re: process creation time increases linearly with shmem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 26 Aug 2005, Linus Torvalds wrote:
> On Fri, 26 Aug 2005, Hugh Dickins wrote:
> > 
> > I see some flaws in the various patches posted, including Rik's.
> > Here's another version - doing it inside copy_page_range, so this
> > kind of vma special-casing is over in mm/ rather than kernel/.
> 
> I like this approach better, but I don't understand your particular 
> choice of bits.
> 
> > +	 * Assume the fork will probably exec: don't waste time copying
> > +	 * ptes where a page fault will fill them correctly afterwards.
> > +	 */
> > +	if ((vma->vm_flags & (VM_MAYSHARE|VM_HUGETLB|VM_NONLINEAR|VM_RESERVED))
> > +								== VM_MAYSHARE)
> > +		return 0;
> > +
> >  	if (is_vm_hugetlb_page(vma))
> >  		return copy_hugetlb_page_range(dst_mm, src_mm, vma);
> 
> First off, if you just did it below the hugetlb check, you'd not need to
> check hugetlb again.

Yes: I wanted to include VM_HUGETLB in the list as documentation really;
and it costs nothing to test it along with the other flags - or are there
architectures where the more bits you test, the costlier?

> And while I understand VM_NONLINEAR and VM_RESERVED,
> can you please comment on why VM_MAYSHARE is so important, and why no
> other information matters.

The VM_MAYSHARE one isn't terribly important, there's no correctness
reason to replace VM_SHARED there.   It's just that do_mmap_pgoff takes
VM_SHARED and VM_MAYWRITE off a MAP_SHARED mapping of a file which was
not opened for writing.  We can safely avoid copying the ptes of such a
vma, just as with the writable ones, but the VM_MAYSHARE test catches
them where the VM_SHARED test does not.

> Now, VM_MAYSHARE is a sign of the mapping being a shared mapping. Fair 
> enough. But afaik, a shared anonymous mapping absolutely needs its page 
> tables copied, because those page tables contains either the pointers to 
> the shared pages, or the swap entries.
> 
> So I really think you need to verify that it's a file mapping too.

Either I'm misunderstanding, or you're remembering back to how shared
anonymous was done in 2.2 (perhaps).  In 2.4 and 2.6, shared anonymous
is "backed" by a shared memory object, created by shmem_zero_setup:
which sets vm_file even though we came into do_mmap_pgoff with no file.

> Also, arguably, there are other cases that may or may not be worth 
> worrying about. What about non-shared non-writable file mappings? What 
> about private mappings that haven't been COW'ed? 

Non-shared non-currently-writable file mappings might have been writable
and modified in the past, so we cannot necessarily skip those.

We could, and I did, consider testing whether the vma has an anon_vma:
we always allocate a vma's anon_vma just before first allocating it a
private page, and it's a good test which swapoff uses to narrow its
search.

But partly I thought that a little too tricksy, and hard to explain;
and partly I thought it was liable to catch the executable text,
some of which is most likely to be needed in between fork and exec.

> So I think that in addition to your tests, you should test for
> "vma->vm_file", and you could toy with testing for "vma->anon_vma"  being
> NULL (the latter will cause a _lot_ of hits, because any read-only private
> mapping will trigger, but it's a good stress-test and conceptually
> interesting, even if I suspect it will kill any performance gain through
> extra minor faults in the child).

Ah yes, I wrote the paragraph above before reading this one, honest!

Well, I still don't think we need to test vm_file.  We can add an
anon_vma test if you like, if we really want to minimize the fork
overhead, in favour of later faults.  Do we?

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]
  Powered by Linux