"Seth, Rohit" <[email protected]> wrote:
>
> Linus,
>
> [PATCH]: Handle spurious page fault for hugetlb region
>
> The hugetlb pages are currently pre-faulted. At the time of mmap of
> hugepages, we populate the new PTEs. It is possible that HW has already cached
> some of the unused PTEs internally.
What's an "unused pte"? One which maps a regular-sized page at the same
virtual address? How can such a thing come about, and why isn't it already
a problem for regular-sized pages? From where does the hardware prefetch
the pte contents?
IOW: please tell us more about this hardware pte-fetcher.
> These stale entries never get a chance to
> be purged in existing control flow.
I'd have thought that invalidating those ptes at mmap()-time would be a
more consistent approach.
> This patch extends the check in page fault code for hugepages. Check if
> a faulted address falls with in size for the hugetlb file backing it. We
> return VM_FAULT_MINOR for these cases (assuming that the arch specific
> page-faulting code purges the stale entry for the archs that need it).
Do you have an example of the code which does this purging?
> --- linux-2.6.14-rc4-git5-x86/include/linux/hugetlb.h 2005-10-18 13:14:24.879947360 -0700
> +++ b/include/linux/hugetlb.h 2005-10-18 13:13:55.711381656 -0700
> @@ -155,11 +155,24 @@
> {
> file->f_op = &hugetlbfs_file_operations;
> }
> +
> +static inline int valid_hugetlb_file_off(struct vm_area_struct *vma,
> + unsigned long address)
> +{
> + struct inode *inode = vma->vm_file->f_dentry->d_inode;
> + loff_t file_off = address - vma->vm_start;
> +
> + file_off += (vma->vm_pgoff << PAGE_SHIFT);
> +
> + return (file_off < inode->i_size);
> +}
I suppose we should use i_size_read() here.
> + if (valid_hugetlb_file_off(vma, address))
> + /* We get here only if there was a stale(zero) TLB entry
> + * (because of HW prefetching).
> + * Low-level arch code (if needed) should have already
> + * purged the stale entry as part of this fault handling.
> + * Here we just return.
> + */
If the low-level code has purged the stale pte then it knows what's
happening. Perhaps it shouldn't call into handle_mm_fault() at all?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]