>>-----Original Message-----
>>From: David Gibson [mailto:[email protected]]
>>Sent: 2006年3月1日 7:36
>>To: Zhang, Yanmin
>>Cc: Andrew Morton; William Lee Irwin; [email protected]
>>Subject: Re: hugepage: Strict page reservation for hugepage inodes
>>
>>hugepage: Strict page reservation for hugepage inodes
>>
>>These days, hugepages are demand-allocated at first fault time.
>>There's a somewhat dubious (and racy) heuristic when making a new
>>mmap() to check if there are enough available hugepages to fully
>>satisfy that mapping.
>>
>>A particularly obvious case where the heuristic breaks down is where a
>>process maps its hugepages not as a single chunk, but as a bunch of
>>individually mmap()ed (or shmat()ed) blocks without touching and
>>instantiating the pages in between allocations. In this case the size
>>of each block is compared against the total number of available
>>hugepages. It's thus easy for the process to become overcommitted,
>>because each block mapping will succeed, although the total number of
>>hugepages required by all blocks exceeds the number available. In
>>particular, this defeats such a program which will detect a mapping
>>failure and adjust its hugepage usage downward accordingly.
>>
>>The patch below addresses this problem, by strictly reserving a number
>>of physical hugepages for hugepage inodes which have been mapped, but
>>not instatiated.
The patch reserves a number of physical hugepages for hugepage inodes
which have been mapped into address spaces, but it doesn't just reserve the
pages what it needed. It reserves all huge pages from 0 to inode->i_size. For example,
fd = open("/mnt/hugepages/file1", O_CREAT|O_RDWR, 0755);
*addr = mmap(NULL, HUGEPAGE_SIZE*3, PROT_NONE, MAP_SHARED, fd, HUGEPAGE_SIZE*5);
The patch would reserve 8 huge pages instead of 3 pages. I know that shmget/shmat
have no such problem. But mmap has it and the patch looks not perfect.
I have an idea. How about to record all the start/end address of huge page mmaping of the inode?
Long long ago, there was a patch at http://marc.theaimsgroup.com/?l=lse-tech&m=108187931924134&w=2.
Of course, we need port it to the latest kernel if this idea is better.
Yanmin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]