Re: [PATCH] Add notification of page becoming writable to VMA ops

On Mon, 24 Oct 2005, David Howells wrote:
> 
> The attached patch adds a new VMA operation to notify a filesystem or other
> driver about the MMU generating a fault because userspace attempted to write
> to a page mapped through a read-only PTE.
> 
> This facility permits the filesystem or driver to:
> 
>  (*) Implement storage allocation/reservation on attempted write, and so to
>      deal with problems such as ENOSPC more gracefully (perhaps by generating
>      SIGBUS).
> 
>  (*) Delay making the page writable until the contents have been written to a
>      backing cache. This is useful for NFS/AFS when using FS-Cache/CacheFS.
>      It permits the filesystem to have some guarantee about the state of the
>      cache.

I've only given it a quick look. It looks pretty good, but too hastily
thrown together, without an understanding of the intervening changes:

> --- linux-2.6.14-rc4-mm1/mm/memory.c	2005-10-17 14:26:44.000000000 +0100
> +++ linux-2.6.14-rc4-mm1-cachefs/mm/memory.c	2005-10-20 18:53:04.000000000 +0100
> @@ -1261,19 +1261,53 @@ static int do_wp_page(struct mm_struct *
> +	if (unlikely(vma->vm_flags & VM_SHARED)) {
> +		if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
> +			/*
> +			 * Notify the page owner without the lock held,
> +			 * so they can sleep if they want to.
> +			 */
> +			pte_unmap(page_table);
> +			if (!PageReserved(old_page))
> +				page_cache_get(old_page);
> +			spin_unlock(&mm->page_table_lock);

No, you need to pay attention to Nick's PageReserved removal, and
my pte lock stuff, throughout do_wp_page - there shouldn't be any
references to PageReserved or page_table_lock there now (and you'll
need to recheck the mapping/locking/unlocking/unmapping).  Sorry,
I don't have the time to spare to do it myself right now.

> +			page_table = pte_offset_map(pmd, address);
> +			if (!pte_same(*page_table, orig_pte)) {
> +				ret |= VM_FAULT_WRITE;

No, don't add VM_FAULT_WRITE in this case: you should only do that
when you've gone through the maybe_mkwrite yourself; this case
should remain the default VM_FAULT_MINOR.
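
To make the return-value rule concrete, here is a minimal userspace
sketch. It is not kernel code: every model_* name and MODEL_* value
is hypothetical, the lock is a plain flag standing in for the pte
lock, and the PTE is a bare integer. It only illustrates the shape
being asked for: drop the lock around the callback, retake it,
recheck the PTE, and report the write only when this path itself
made the PTE writable.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; not the kernel's VM_FAULT_* constants. */
#define MODEL_FAULT_SIGBUS 0x01
#define MODEL_FAULT_MINOR  0x02
#define MODEL_FAULT_WRITE  0x10
#define MODEL_PTE_WRITE    0x1UL

struct model_mm {
	bool locked;        /* stands in for the pte lock */
	unsigned long pte;  /* the one PTE being faulted on */
};

typedef int (*model_mkwrite_fn)(struct model_mm *mm);

static void model_lock(struct model_mm *mm)
{
	assert(!mm->locked);
	mm->locked = true;
}

static void model_unlock(struct model_mm *mm)
{
	assert(mm->locked);
	mm->locked = false;
}

/* Hypothetical callbacks covering the three outcomes. */
static int model_mkwrite_ok(struct model_mm *mm)
{
	assert(!mm->locked);	/* callback must be free to sleep */
	return 0;
}

static int model_mkwrite_raced(struct model_mm *mm)
{
	/* Pretend another thread changed the PTE while we slept. */
	model_lock(mm);
	mm->pte ^= 0x100UL;
	model_unlock(mm);
	return 0;
}

static int model_mkwrite_fail(struct model_mm *mm)
{
	(void)mm;
	return -1;
}

/* Entered with the lock held; always returns with it released. */
static int model_wp_fault(struct model_mm *mm, model_mkwrite_fn mkwrite)
{
	unsigned long orig_pte = mm->pte;
	int ret = MODEL_FAULT_MINOR;

	assert(mm->locked);
	if (mkwrite) {
		model_unlock(mm);
		if (mkwrite(mm) < 0)
			return MODEL_FAULT_SIGBUS;
		model_lock(mm);
		if (mm->pte != orig_pte)
			goto unlock;	/* PTE changed: plain minor fault */
	}
	mm->pte |= MODEL_PTE_WRITE;	/* stands in for maybe_mkwrite */
	ret |= MODEL_FAULT_WRITE;
unlock:
	model_unlock(mm);
	return ret;
}
```

In this model the raced case returns only MODEL_FAULT_MINOR and
leaves the PTE unwritable, so the fault is simply retried.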

> @@ -1847,18 +1890,28 @@ retry:
> +		} else {
> +			/* if the page will be shareable, see if the backing
> +			 * address space wants to know that the page is about
> +			 * to become writable */
> +			if (vma->vm_ops->page_mkwrite &&
> +			    vma->vm_ops->page_mkwrite(vma, new_page) < 0)
> +				return VM_FAULT_SIGBUS;
> +		}
>  	}

This isn't necessarily wrong, and may be exactly how it was before,
I don't remember.  But it implies that when page_mkwrite fails,
it page_cache_releases the page.  Is that desirable?  Or should
that be left to the caller?
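
For comparison, a minimal userspace sketch of the "leave it to the
caller" alternative. Again this is not kernel code: the model_*
names, the refcount struct and the MODEL_* values are all
hypothetical. It shows only the ownership convention, where the
callback never touches the reference and the fault path drops it
exactly once on failure.

```c
#include <assert.h>

/* Illustrative values only; not the kernel's VM_FAULT_* constants. */
#define MODEL_FAULT_SIGBUS 0x01
#define MODEL_FAULT_MINOR  0x02

struct model_page {
	int refcount;	/* stands in for the page reference count */
};

typedef int (*model_mkwrite_fn)(struct model_page *page);

static void model_page_get(struct model_page *page)
{
	page->refcount++;
}

static void model_page_release(struct model_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/* Hypothetical callbacks: neither one touches the refcount. */
static int model_mkwrite_accept(struct model_page *page)
{
	(void)page;
	return 0;
}

static int model_mkwrite_refuse(struct model_page *page)
{
	(void)page;
	return -1;
}

/*
 * Caller-owned convention: the fault path holds the reference it was
 * handed and is the only place that drops it when the callback fails.
 */
static int model_no_page(struct model_page *page, model_mkwrite_fn mkwrite)
{
	model_page_get(page);	/* reference as returned by the lookup */

	if (mkwrite && mkwrite(page) < 0) {
		model_page_release(page);
		return MODEL_FAULT_SIGBUS;
	}
	return MODEL_FAULT_MINOR;	/* reference now backs the mapping */
}
```

Under this convention a refusing callback leaves the count where it
started, and a callback that also released the page would trip the
assertion rather than silently underflow.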

> @@ -1945,7 +1998,7 @@ static int do_file_page(struct mm_struct

Drop all those changes to do_file_page (which I added), they're no
longer necessary.  A case appeared which made it clear that we cannot
rely on resolving this issue for get_user_pages in a single call to
handle_mm_fault, and that's why the VM_FAULT_WRITE stuff got added.

This complication of do_file_page was always ugly, and I'm delighted
to drop it.  Whereas the call to do_wp_page from do_swap_page is less
obtrusive and may still be a worthwhile optimization, though I added
it for the same disgraced reason a year or more back.

Hugh