Re: Can't mlock hugetlb in 2.6.15

On Sun, 22 Jan 2006, Don Dupuis wrote:
> On 1/21/06, Nick Piggin <[email protected]> wrote:
> > Andrew Morton wrote:
> > > Don Dupuis <[email protected]> wrote:
> > >>I have an app that mlocks hugepages. The same app works just fine in 2.6.14.
> > > That being said, we shouldn't have broken your application.
> > Don, an strace log of the failing sequence of syscalls could be helpful.
> 
> sducstart:
> open("/pivot3/mem/sduc", O_RDWR|O_CREAT, 0666) = 3
> mmap2(NULL, 1761607680, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_LOCKED,
> 3, 0) = 0x4e000000
> 
> This is the strace output of sductest that is a test program to access
> the shared memory that was setup by sducstart:
> open("/pivot3/mem/sduc", O_RDWR)        = 3
> mmap2(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_LOCKED, 3,
> 0) = -1 ENOMEM (Cannot allocate memory)

Thanks a lot for the strace, that indeed helped to track it down.

This has nothing to do with mlock or MAP_LOCKED.  Those, by the way,
make more sense in 2.6.15 than before: hugepages are now faulted in on
demand (though never paged out, as Andrew said), so locking gives you a
way to prefault the hugepage area, as earlier releases always did.

Please try the patch below, and let us know if it works for you - thanks.
Looks like we'll need this in 2.6.16-rc-git and 2.6.15-stable.


2.6.15's hugepage faulting introduced huge_pages_needed accounting into
hugetlbfs: to count how many pages are already in cache, as a spot check
on how far a new mapping may be allowed to extend the file.  But it's
muddled: the range to scan was computed in PAGE_SIZE units, whereas each
hugepage found in cache covers HPAGE_SIZE.  Once pages were already in
cache, it would overshoot, wrap its hugepages count backwards, and so
fail a harmless repeat mapping with -ENOMEM.
Fixes the problem found by Don Dupuis.

Signed-off-by: Hugh Dickins <[email protected]>
---

 fs/hugetlbfs/inode.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

--- 2.6.15/fs/hugetlbfs/inode.c	2006-01-03 03:21:10.000000000 +0000
+++ linux/fs/hugetlbfs/inode.c	2006-01-23 18:39:47.000000000 +0000
@@ -71,8 +71,8 @@ huge_pages_needed(struct address_space *
 	unsigned long start = vma->vm_start;
 	unsigned long end = vma->vm_end;
 	unsigned long hugepages = (end - start) >> HPAGE_SHIFT;
-	pgoff_t next = vma->vm_pgoff;
-	pgoff_t endpg = next + ((end - start) >> PAGE_SHIFT);
+	pgoff_t next = vma->vm_pgoff >> (HPAGE_SHIFT - PAGE_SHIFT);
+	pgoff_t endpg = next + hugepages;
 
 	pagevec_init(&pvec, 0);
 	while (next < endpg) {