[PATCH] get_user_pages(..., write==1, ...) may return with readable pte.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Handle the case in get_user_pages() when a call to __handle_mm_fault()
inserts a writable pte, and a process doing dup_mmap converts it
to readable before get_user_pages() does the subsequent request to
follow_page().


Signed-off-by: Robin Holt <[email protected]>

---

Hugh, Nick, and Linus,

I think I have tripped over another flavor of a get_user_pages bug
we addressed back in 2005.  I do not have a test case to prove it is
the issue I am trying to address, but I have done as thorough a code
walk-through as I can.

Assume a pte is currently empty.  A first pthread is in the kernel on
a call path which is leading to get_user_pages.  A second pthread is
in the process of doing a fork.  The process doing get_user_pages()
gets into __handle_mm_fault() and grabs ptl just before the process
doing a fork attempts to grab the ptl to convert the pages to COW.
__handle_mm_fault() will insert the writable pte and unlock ptl then
return with VM_FAULT_WRITE set.  The process doing a fork then gets
the lock and starts converting the pte to RO/COW.  The get_user_pages()
process then clears FOLL_WRITE from foll_flags and calls follow_page()
without write, adds to the map count for the page, but does not have a
writable mapping.


Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c	2006-10-06 12:06:25.000000000 -0500
+++ linux-2.6/mm/memory.c	2006-10-13 15:06:38.286230638 -0500
@@ -1063,15 +1063,7 @@ int get_user_pages(struct task_struct *t
 				int ret;
 				ret = __handle_mm_fault(mm, vma, start,
 						foll_flags & FOLL_WRITE);
-				/*
-				 * The VM_FAULT_WRITE bit tells us that do_wp_page has
-				 * broken COW when necessary, even if maybe_mkwrite
-				 * decided not to set pte_write. We can thus safely do
-				 * subsequent page lookups as if they were reads.
-				 */
-				if (ret & VM_FAULT_WRITE)
-					foll_flags &= ~FOLL_WRITE;
-				
+
 				switch (ret & ~VM_FAULT_WRITE) {
 				case VM_FAULT_MINOR:
 					tsk->min_flt++;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux