Re: hugepage: Strict page reservation for hugepage inodes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 28, 2006 at 04:53:49PM +0800, Zhang, Yanmin wrote:
[snip]
> >>+	if (vma->vm_flags & VM_MAYSHARE) {
> >>+
> >>+		/* idx = radix tree index, i.e. offset into file in
> >>+		 * HPAGE_SIZE units */
> >>+		idx = ((addr - vma->vm_start) >> HPAGE_SHIFT)
> >>+			+ (vma->vm_pgoff >> (HPAGE_SHIFT - PAGE_SHIFT));
> >>+
> >>+		/* The hugetlbfs specific inode info stores the number
> >>+		 * of "guaranteed available" (huge) pages.  That is,
> >>+		 * the first 'prereserved_hpages' pages of the inode
> >>+		 * are either already instantiated, or have been
> >>+		 * pre-reserved (by hugetlb_reserve_for_inode()). Here
> >>+		 * we're in the process of instantiating the page, so
> >>+		 * we use this to determine whether to draw from the
> >>+		 * pre-reserved pool or the truly free pool. */
> >>+		if (idx < HUGETLBFS_I(inode)->prereserved_hpages)
> >>+			use_reserve = 1;
> >>+	}
> >>+
> >>+	if (!use_reserve) {
> >>+		if (free_huge_pages <= reserved_huge_pages)
> >>+			goto fail;
> >>+	} else {
> >>+		BUG_ON(reserved_huge_pages == 0);
> >>+		reserved_huge_pages--;
> [YM] Consider this scenario of multi-thread:
> One process has 2 threads. The process mmaps a hugetlb area with 1
> huge page and there is a free huge page. Later on, the 2 threads
> fault on the huge page at the same time.  The second thread would
> fail, and WARN_ON check is triggered, then the second thread is
> killed by function hugetlb_no_page.

That's why this patch *must* go after my other patch which serializes
the allocation->instantiation path.

> >> 	}
> >>+
> >>+	page = dequeue_huge_page(vma, addr);
> >>+	if (!page)
> >>+		goto fail;
> >>+
> >> 	spin_unlock(&hugetlb_lock);
> >> 	set_page_count(page, 1);
> >> 	return page;
> >>+
> >>+ fail:
> >>+	WARN_ON(use_reserve); /* reserved allocations shouldn't fail */
> >>+	spin_unlock(&hugetlb_lock);
> >>+	return NULL;
> >>+}
> >>+
> >>+/* hugetlb_extend_reservation()
> >>+ *
> >>+ * Ensure that at least 'atleast' hugepages are, and will remain,
> >>+ * available to instantiate the first 'atleast' pages of the given
> >>+ * inode.  If the inode doesn't already have this many pages reserved
> >>+ * or instantiated, set aside some hugepages in the reserved pool to
> >>+ * satisfy later faults (or fail now if there aren't enough, rather
> >>+ * than getting the SIGBUS later).
> >>+ */
> >>+int hugetlb_extend_reservation(struct hugetlbfs_inode_info *info,
> >>+			       unsigned long atleast)
> >>+{
> >>+	struct inode *inode = &info->vfs_inode;
> >>+	struct address_space *mapping = inode->i_mapping;
> >>+	unsigned long idx;
> >>+	unsigned long change_in_reserve = 0;
> >>+	struct page *page;
> >>+	int ret = 0;
> >>+
> >>+	spin_lock(&hugetlb_lock);
> >>+	read_lock_irq(&inode->i_mapping->tree_lock);
> >>+
> >>+	if (info->prereserved_hpages >= atleast)
> >>+		goto out;
> >>+
> >>+	/* prereserved_hpages stores the number of pages already
> >>+	 * guaranteed (reserved or instantiated) for this inode.
> >>+	 * Count how many extra pages we need to reserve. */
> >>+	for (idx = info->prereserved_hpages; idx < atleast; idx++) {
> >>+		page = radix_tree_lookup(&mapping->page_tree, idx);
> >>+		if (!page)
> >>+			/* Pages which are already instantiated don't
> >>+			 * need to be reserved */
> >>+			change_in_reserve++;
> >>+	}
> [YM] Why always to go through the page cache? prereserved_hpages and
> reserved_huge_pages are protected by hugetlb_lock.

Erm.. sorry, I don't see how that helps.  We need to go through the
page cache to see which pages have already been instantiated, and so
don't need to be reserved.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux