Re: process creation time increases linearly with shmem

On Thu, 25 Aug 2005, Linus Torvalds wrote:
> On Fri, 26 Aug 2005, Nick Piggin wrote:
> > 
> > > Skipping MAP_SHARED in fork() sounds like a good idea to me...
> > 
> > Indeed. Linus, can you remember why we haven't done this before?
> 
> Hmm. Historical reasons. Also, if the child ends up needing it, it will 
> now have to fault them in.
> 
> That said, I think it's a valid optimization. Especially as the child 
> _probably_ doesn't need it (ie there's at least some likelihood of an 
> execve() or similar).

I agree, seems a great idea to me (sulking because I was too dumb
to get it, even when Nick and Andi first posted their patches).

It won't just save on the copying at fork time, it'll save on
undoing it all again when the child mm is torn down for exec.
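
One rough way to watch that cost from userspace is to time a fork() and
wait while a populated shared mapping is in place. This is only a sketch:
the function name time_fork_with_shmem is made up for illustration, the
mapping is shared anonymous memory rather than any particular shmem
segment, and whether the time grows with the mapping size depends on the
kernel it runs on.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: time one fork() + waitpid() while `len` bytes of
 * populated shared anonymous memory are mapped in the parent.
 * Returns elapsed nanoseconds, or -1 on failure.
 */
static long long time_fork_with_shmem(size_t len)
{
	struct timespec t0, t1;
	unsigned char *p;
	pid_t pid;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Touch every page so the parent has ptes for the whole range. */
	memset(p, 1, len);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	pid = fork();
	if (pid == 0)
		_exit(0);	/* child: exit at once, tearing down its mm */
	if (pid < 0 || waitpid(pid, NULL, 0) != pid) {
		munmap(p, len);
		return -1;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	munmap(p, len);
	return (t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (t1.tv_nsec - t0.tv_nsec);
}
```

Calling it with a few increasing sizes (say 1MB, 16MB, 256MB) and
comparing the results gives a crude picture of the setup and teardown
cost; a single run is noisy, so repeat and average.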

The refaulting will hurt the performance of something: let's
just hope that something doesn't turn out to be a show-stopper.

I see some flaws in the various patches posted, including Rik's.
Here's another version - doing it inside copy_page_range, so this
kind of vma special-casing is over in mm/ rather than kernel/.

There is no point in testing vm_file: the vm_flags cover the cases.
Test VM_MAYSHARE rather than VM_SHARED, so as to include the MAP_SHARED
mappings that can never be written too.  VM_NONLINEAR must be excluded,
since its ptes are essential for defining the file offsets.  VM_RESERVED
must be excluded, since faults on remap_pfn_range areas would usually
put in anon zeroed pages instead of the driver pages - or perhaps that
would be better as a test against VM_IO, or vma->vm_ops->nopage?
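
The mask-and-compare idiom used for that test can be tried out in
userspace. This is a minimal sketch, not kernel code: the bit values
below are placeholders picked for the example (the real definitions are
in the kernel headers), and can_skip_pte_copy and VM_OTHER are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Placeholder bit values for this sketch only; the kernel's real
 * definitions live in its own headers and may differ.
 */
#define VM_MAYSHARE	0x01UL
#define VM_HUGETLB	0x02UL
#define VM_NONLINEAR	0x04UL
#define VM_RESERVED	0x08UL
#define VM_OTHER	0x10UL	/* hypothetical stand-in for any unrelated flag */

/*
 * True only when VM_MAYSHARE is set and none of the three excluded
 * flags is set.  Flags outside the mask do not affect the result.
 */
static bool can_skip_pte_copy(unsigned long vm_flags)
{
	const unsigned long mask = VM_MAYSHARE | VM_HUGETLB |
				   VM_NONLINEAR | VM_RESERVED;

	return (vm_flags & mask) == VM_MAYSHARE;
}
```

Masking first and then comparing against the single wanted bit checks
"this one set, those three clear" in one expression, which is why the
patch needs no separate test for each excluded flag.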

Having to exclude the VM_NONLINEAR seems rather a shame, since those
are always shared and likely enormous.  The InfiniBand people's idea 
of a way for the app to set VM_DONTCOPY (to avoid rdma get_user_pages
problems) becomes attractive as a way for apps to speed their forks.
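
Later kernels let an application ask for this with
madvise(MADV_DONTFORK), which marks the range so that a child does not
inherit it. Whether that is the same interface as the one being
proposed at the time is an assumption here. The sketch below (the
function name child_loses_dontfork_region is made up) maps a shared
region, marks it, forks, and reports whether the child faulted when it
touched the range.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper.  Returns 1 if the child took SIGSEGV touching
 * the region (mapping not inherited), 0 if the child could read it,
 * and -1 if setup failed (e.g. MADV_DONTFORK not supported).
 */
static int child_loses_dontfork_region(size_t len)
{
	unsigned char *p;
	int status = 0;
	pid_t pid;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	p[0] = 42;

	/* Ask the kernel not to carry this range across fork(). */
	if (madvise(p, len, MADV_DONTFORK) != 0) {
		munmap(p, len);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		munmap(p, len);
		return -1;
	}
	if (pid == 0) {
		/* Child: make sure a fault kills us with the default action. */
		signal(SIGSEGV, SIG_DFL);
		volatile unsigned char v = *(volatile unsigned char *)p;
		_exit(v == 42 ? 0 : 2);
	}

	if (waitpid(pid, &status, 0) != pid) {
		munmap(p, len);
		return -1;
	}
	munmap(p, len);

	return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

Passing one page, e.g. (size_t)sysconf(_SC_PAGESIZE), is enough to see
the effect. The parent keeps its mapping throughout; only the child
starts without it, so nothing is copied for that range at fork time.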

Hugh

--- 2.6.13-rc7/mm/memory.c	2005-08-24 11:13:41.000000000 +0100
+++ linux/mm/memory.c	2005-08-26 10:09:50.000000000 +0100
@@ -498,6 +498,14 @@ int copy_page_range(struct mm_struct *ds
 	unsigned long addr = vma->vm_start;
 	unsigned long end = vma->vm_end;
 
+	/*
+	 * Assume the fork will probably exec: don't waste time copying
+	 * ptes where a page fault will fill them correctly afterwards.
+	 */
+	if ((vma->vm_flags & (VM_MAYSHARE|VM_HUGETLB|VM_NONLINEAR|VM_RESERVED))
+								== VM_MAYSHARE)
+		return 0;
+
 	if (is_vm_hugetlb_page(vma))
 		return copy_hugetlb_page_range(dst_mm, src_mm, vma);
 