Re: [PATCH]: Handling spurious page fault for hugetlb region for 2.6.14-rc4-git5

Rohit Seth <[email protected]> wrote:
>
> The prefetching problem is handled OK for regular pages because we can
>  handle page faults corresponding to those pages.  That is currently not
>  true for hugepages.  Currently the kernel assumes that PAGE_FAULT
>  happening against a hugetlb page is caused by truncate and returns
>  SIGBUS.

Doh.  No fault handler.  The penny finally drops.

Adam, I think this patch is temporary?


From: "Seth, Rohit" <[email protected]>

We prefault hugepages at mmap() time, but hardware TLB prefetching may
mean that the TLB still caches stale NULL (not-present) entries for
addresses where the pagetable in fact already holds the desired
virtual->physical translation.

For regular pages this problem is resolved via the resulting pagefault, in
the pagefault handler.  But hugepages don't support pagefaults - they're
supposed to be prefaulted.

So we need minimal pagefault handling for these stale hugepage TLB entries.

An alternative is to invalidate the relevant TLB entries at hugepage
mmap()-time, but this is apparently too expensive.
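
(Illustration only, not part of the patch: a minimal userspace sketch of
the usage model described above, assuming a hugetlbfs mount at /mnt/huge
and 2MB huge pages.  The whole mapping is populated at mmap() time, so
the later access is not expected to fault at all; before this fix, a
spurious fault caused by hardware prefetching turned into SIGBUS.)

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define LENGTH	(2UL * 1024 * 1024)	/* assumed single 2MB huge page */

int main(void)
{
	char *p;
	int fd = open("/mnt/huge/example", O_CREAT | O_RDWR, 0600);

	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* hugetlbfs prefaults the huge pages backing this range here */
	p = mmap(NULL, LENGTH, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		close(fd);
		return 1;
	}

	/*
	 * No fault is expected here; if a stale TLB entry does trigger
	 * one, the patch makes the handler return VM_FAULT_MINOR instead
	 * of delivering SIGBUS.
	 */
	memset(p, 0, LENGTH);

	munmap(p, LENGTH);
	close(fd);
	return 0;
}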

Note: Adam Litke <[email protected]>'s demand-paging-for-hugepages patches are
now in -mm.  If/when these are merged up, this fix should probably be
reverted.

Signed-off-by: Rohit Seth <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

 include/linux/hugetlb.h |   13 +++++++++++++
 mm/memory.c             |   14 ++++++++++++--
 2 files changed, 25 insertions(+), 2 deletions(-)

diff -puN include/linux/hugetlb.h~handling-spurious-page-fault-for-hugetlb-region include/linux/hugetlb.h
--- devel/include/linux/hugetlb.h~handling-spurious-page-fault-for-hugetlb-region	2005-10-18 21:04:34.000000000 -0700
+++ devel-akpm/include/linux/hugetlb.h	2005-10-18 21:04:34.000000000 -0700
@@ -155,11 +155,24 @@ static inline void set_file_hugepages(st
 {
 	file->f_op = &hugetlbfs_file_operations;
 }
+
+static inline int valid_hugetlb_file_off(struct vm_area_struct *vma,
+					  unsigned long address)
+{
+	struct inode *inode = vma->vm_file->f_dentry->d_inode;
+	loff_t file_off = address - vma->vm_start;
+
+	file_off += (vma->vm_pgoff << PAGE_SHIFT);
+
+	return (file_off < inode->i_size);
+}
+
 #else /* !CONFIG_HUGETLBFS */
 
 #define is_file_hugepages(file)		0
 #define set_file_hugepages(file)	BUG()
 #define hugetlb_zero_setup(size)	ERR_PTR(-ENOSYS)
+#define valid_hugetlb_file_off(vma, address) 	0
 
 #endif /* !CONFIG_HUGETLBFS */
 
diff -puN mm/memory.c~handling-spurious-page-fault-for-hugetlb-region mm/memory.c
--- devel/mm/memory.c~handling-spurious-page-fault-for-hugetlb-region	2005-10-18 21:04:34.000000000 -0700
+++ devel-akpm/mm/memory.c	2005-10-18 21:04:34.000000000 -0700
@@ -2045,8 +2045,18 @@ int __handle_mm_fault(struct mm_struct *
 
 	inc_page_state(pgfault);
 
-	if (is_vm_hugetlb_page(vma))
-		return VM_FAULT_SIGBUS;	/* mapping truncation does this. */
+	if (unlikely(is_vm_hugetlb_page(vma))) {
+		if (valid_hugetlb_file_off(vma, address))
+			/* We get here only if there was a stale (zero) TLB entry
+			 * (because of HW prefetching).
+			 * Low-level arch code (if needed) should have already
+			 * purged the stale entry as part of this fault handling.
+			 * Here we just return.
+			 */
+			return VM_FAULT_MINOR;
+		else
+			return VM_FAULT_SIGBUS;	/* mapping truncation does this. */
+	}
 
 	/*
 	 * We need the page table lock to synchronize with kswapd
_
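
(Again illustration only, not kernel code: the new valid_hugetlb_file_off()
check above just translates the faulting address into an offset within the
backing hugetlbfs file and compares it against i_size.  A standalone sketch
of the same arithmetic, with a made-up structure and example values:)

#include <stdio.h>

#define PAGE_SHIFT	12	/* assumed 4kB base page size */

struct fake_vma {
	unsigned long vm_start;	/* start address of the mapping */
	unsigned long vm_pgoff;	/* file offset of the mapping, in base pages */
	long long i_size;	/* size of the backing hugetlbfs file */
};

static int valid_file_off(const struct fake_vma *vma, unsigned long address)
{
	long long file_off = address - vma->vm_start;

	file_off += (long long)vma->vm_pgoff << PAGE_SHIFT;
	return file_off < vma->i_size;
}

int main(void)
{
	struct fake_vma vma = {
		.vm_start = 0x60000000UL,
		.vm_pgoff = 0,			/* maps from file offset 0 */
		.i_size   = 4LL * 1024 * 1024,	/* 4MB file, two 2MB pages */
	};

	/* inside the file: spurious fault, handler now returns VM_FAULT_MINOR */
	printf("%d\n", valid_file_off(&vma, 0x60000000UL + 123));

	/* past EOF (the truncate case): handler still returns VM_FAULT_SIGBUS */
	printf("%d\n", valid_file_off(&vma, 0x60000000UL + 5 * 1024 * 1024));

	return 0;
}

In other words the SIGBUS-on-truncate behaviour is preserved; only faults
that fall within the prefaulted file range are treated as spurious and the
faulting access is simply retried.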

