[PATCH] Sanely size hash tables when using large base pages.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At the moment both the pidhash and inode/dentry cache hash tables (common
by way of alloc_large_system_hash()) are incorrectly sized by their
respective detection logic when we attempt to use large base pages on
systems with little memory.

This results in odd behaviour when using a 64kB PAGE_SIZE, such as:

PID hash table entries: 512 (order: 9, 2048 bytes)
...
Dentry cache hash table entries: 8192 (order: -1, 32768 bytes)
Inode-cache hash table entries: 4096 (order: -2, 16384 bytes)

The mount cache hash table is seemingly the only one that gets this right
by directly taking PAGE_SIZE in to account.

The following patch attempts to catch the bogus values and round it up to
at least 0-order (or down, in the PID hash case).

Signed-off-by: Paul Mundt <[email protected]>

--

 kernel/pid.c    |   17 +++++++++++++----
 mm/page_alloc.c |    4 ++++

 2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/kernel/pid.c b/kernel/pid.c
index 2efe9d8..198c6a9 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -383,20 +383,29 @@ void free_pid_ns(struct kref *kref)
 /*
  * The pid hash table is scaled according to the amount of memory in the
  * machine.  From a minimum of 16 slots up to 4096 slots at one gigabyte or
- * more.
+ * more.  As a safety net for large base page on small memory systems
+ * the hash table is scaled to a 0-order allocation in the event that the
+ * initial detection logic sizes the table incorrectly (which can result
+ * in a very large number of slots).
  */
 void __init pidhash_init(void)
 {
-	int i, pidhash_size;
+	int i, pidhash_size, size;
 	unsigned long megabytes = nr_kernel_pages >> (20 - PAGE_SHIFT);
 
 	pidhash_shift = max(4, fls(megabytes * 4));
 	pidhash_shift = min(12, pidhash_shift);
 	pidhash_size = 1 << pidhash_shift;
 
+	size = pidhash_size * sizeof(struct hlist_head);
+	if (unlikely(size < PAGE_SIZE)) {
+		size = PAGE_SIZE;
+		pidhash_size = size / sizeof(struct hlist_head);
+		pidhash_shift = 0;
+	}
+
 	printk("PID hash table entries: %d (order: %d, %Zd bytes)\n",
-		pidhash_size, pidhash_shift,
-		pidhash_size * sizeof(struct hlist_head));
+		pidhash_size, pidhash_shift, size);
 
 	pid_hash = alloc_bootmem(pidhash_size *	sizeof(*(pid_hash)));
 	if (!pid_hash)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8c1a116..4a9a83f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3321,6 +3321,10 @@ void *__init alloc_large_system_hash(con
 			numentries >>= (scale - PAGE_SHIFT);
 		else
 			numentries <<= (PAGE_SHIFT - scale);
+
+		/* Make sure we've got at least a 0-order allocation.. */
+		if (unlikely((numentries * bucketsize) < PAGE_SIZE))
+			numentries = PAGE_SIZE / bucketsize;
 	}
 	numentries = roundup_pow_of_two(numentries);
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux