On Wed, 2007-10-03 at 08:47 -0700, Adam Litke wrote:
> When shrinking the size of the hugetlb pool via the nr_hugepages sysctl, we
> are careful to keep enough pages around to satisfy reservations. But the
> calculation is flawed for the following scenario:
>
> Action Pool Counters (Total, Free, Resv)
> ====== =============
> Set pool to 1 page 1 1 0
> Map 1 page MAP_PRIVATE 1 1 0
> Touch the page to fault it in 1 0 0
> Set pool to 3 pages 3 2 0
> Map 2 pages MAP_SHARED 3 2 2
> Set pool to 2 pages 2 1 2 <-- Mistake, should be 3 2 2
> Touch the 2 shared pages 2 0 1 <-- Program crashes here
>
> The last touch above will terminate the process due to lack of huge pages.
>
> This patch corrects the calculation so that it factors in pages being used
> for private mappings. Andrew, this is a standalone fix suitable for
> mainline. It is also now corrected in my latest dynamic pool resizing
> patchset which I will send out soon.
>
> Signed-off-by: Adam Litke <[email protected]>
> ---
>
> mm/hugetlb.c | 6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 84c795e..7af3908 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -224,14 +224,14 @@ static void try_to_free_low(unsigned long count)
> for (i = 0; i < MAX_NUMNODES; ++i) {
> struct page *page, *next;
> list_for_each_entry_safe(page, next, &hugepage_freelists[i], lru) {
> + if (count >= nr_huge_pages)
> + return;
> if (PageHighMem(page))
> continue;
> list_del(&page->lru);
> update_and_free_page(page);
> free_huge_pages--;
> free_huge_pages_node[page_to_nid(page)]--;
> - if (count >= nr_huge_pages)
> - return;
> }
> }
> }
That's an excellent problem description. I'm just a bit hazy on how the
patch fixes it. :)
What is the actual error in this loop? The fact that we can go trying
to free pages when the count is actually OK?
BTW, try_to_free_low(count) kinda sucks for a function name. Is that
count the number of pages we're trying to end up with, or the total
number of low pages that we're trying to free?
Also, as I look at try_to_free_low(), why do we need to #ifdef it out in
the case of !HIGHMEM? If we have CONFIG_HIGHMEM=yes, we still might not
have any _actual_ high memory. So, they loop obviously doesn't *hurt*
when there is no high memory.
> @@ -251,7 +251,7 @@ static unsigned long set_max_huge_pages(unsigned long count)
> return nr_huge_pages;
>
> spin_lock(&hugetlb_lock);
> - count = max(count, resv_huge_pages);
> + count = max(count, resv_huge_pages + nr_huge_pages - free_huge_pages);
> try_to_free_low(count);
> while (count < nr_huge_pages) {
> struct page *page = dequeue_huge_page(NULL, 0);
The real problem with this line is that "count" is too ambiguous. :)
We could rewrite the original max() line this way:
if (resv_huge_pages > nr_of_pages_to_end_up_with)
nr_of_pages_to_end_up_with = resv_huge_pages;
try_to_make_the_total_nr_of_hpages(nr_of_pages_to_end_up_with);
Which makes it more clear that you're setting the number of total pages
to the number of reserved pages, which is obviously screwy.
OK, so this is actually saying: "count can never go below
resv_huge_pages+nr_huge_pages"?
Could we change try_to_free_low() to free a distinct number of pages?
if (count > free_huge_pages)
count = free_huge_pages;
try_to_free_nr_huge_pages(count);
I feel a bit sketchy about the "resv_huge_pages + nr_huge_pages -
free_huge_pages" logic. Could you elaborate a bit there on what the
rules are?
-- Dave
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]