[PATCH] cpuset sched_load_balance sched domain confusion fix

From: Paul Jackson <[email protected]>

Fix a bug in the cpuset code that recalculates dynamic sched domains.

For sufficiently complex cpuset configurations, the recalc code
could get confused: it overwrote some state, then used the
overwritten values as if they still held their previous values.
This could result in a kernel oops or other random chaos from
overwritten memory.

The fix stashes the two values of interest, a->pn and b->pn, in
separate local variables, apn and bpn, to keep them apart from what
will be overwritten.
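
To make the failure mode concrete, here is a minimal userspace sketch
of the merge loop.  It is not the kernel code: struct toy_cpuset,
toy_overlap(), toy_partition(), toy_count_partitions() and the
use_fix switch are all hypothetical names invented for illustration,
and a plain unsigned long stands in for cpumask_t.

```c
#include <assert.h>

/* Hypothetical stand-in for struct cpuset: a CPU bitmask plus a partition number. */
struct toy_cpuset {
	unsigned long cpus;	/* bit n set means CPU n is in this set */
	int pn;			/* partition number */
};

static int toy_overlap(const struct toy_cpuset *a, const struct toy_cpuset *b)
{
	return (a->cpus & b->cpus) != 0;
}

/*
 * Merge overlapping sets into partitions and return the partition
 * count as tracked by ndoms.  With use_fix == 0 the inner loop
 * rereads a->pn and b->pn, as the code did before this patch.
 */
static int toy_partition(struct toy_cpuset *csa, int csn, int use_fix)
{
	int i, j, k, ndoms = csn;

	assert(csn > 0);
	for (i = 0; i < csn; i++)
		csa[i].pn = i;		/* every set starts in its own partition */
restart:
	for (i = 0; i < csn; i++) {
		struct toy_cpuset *a = &csa[i];
		int apn = a->pn;

		for (j = 0; j < csn; j++) {
			struct toy_cpuset *b = &csa[j];
			int bpn = b->pn;

			if (apn != bpn && toy_overlap(a, b)) {
				for (k = 0; k < csn; k++) {
					struct toy_cpuset *c = &csa[k];

					if (use_fix) {
						if (c->pn == bpn)
							c->pn = apn;
					} else {
						/* b->pn is overwritten once k reaches j */
						if (c->pn == b->pn)
							c->pn = a->pn;
					}
				}
				ndoms--;	/* one less element */
				goto restart;
			}
		}
	}
	return ndoms;
}

/* Count the distinct partition numbers actually present in csa[]. */
static int toy_count_partitions(const struct toy_cpuset *csa, int csn)
{
	int i, j, n = 0;

	for (i = 0; i < csn; i++) {
		for (j = 0; j < i; j++)
			if (csa[j].pn == csa[i].pn)
				break;
		if (j == i)
			n++;
	}
	return n;
}
```

With four sets whose masks are 0x3, 0x4, 0x5 and 0x2, the unfixed
path leaves the last set behind when its partition is merged, then
merges it a second time, so ndoms is decremented once too often and
ends at 0 although one partition remains.  The fixed path ends with
ndoms == 1, matching the real partition count.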

Besides the fix, this patch also:
 1) detects this confusion, which is easy to do -- if there are or
    ever come to be any more such bugs, the code notices when it is
    about to go out of bounds and 'continue's past it, resulting in
    overly simplified sched domain setups rather than an oops or
    memory trashing, and
 2) in that case, prints a message with a few clues, the first ten
    times this happens on a boot, so that someone might notice
    someday and chase the problem down.
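
The guard in the two points above can be sketched in userspace as
follows.  This is an illustration only: toy_fill_slots() and
toy_warnings_left are hypothetical names, fprintf() stands in for
printk(), and an int array stands in for the array of cpumask_t
domains.

```c
#include <assert.h>
#include <stdio.h>

/* Warning budget, counted down to zero; mirrors the static counter in the patch. */
static int toy_warnings_left = 10;

/*
 * Copy each non-negative partition number into the next free slot,
 * never writing past slots[ndoms - 1].  Entries that would not fit
 * are skipped, and only the first ten skips are reported.
 * Returns the number of slots filled.
 */
static int toy_fill_slots(const int *pn, int csn, int *slots, int ndoms)
{
	int i, nslot = 0;

	assert(ndoms >= 0);
	for (i = 0; i < csn; i++) {
		int apn = pn[i];

		if (apn >= 0) {
			if (nslot == ndoms) {
				if (toy_warnings_left) {
					fprintf(stderr,
						"fill confused: nslot %d, ndoms %d,"
						" csn %d, i %d, apn %d\n",
						nslot, ndoms, csn, i, apn);
					toy_warnings_left--;
				}
				continue;	/* skip instead of overflowing */
			}
			slots[nslot++] = apn;
		}
	}
	return nslot;
}
```

The trade-off is the one described above: when the count is wrong,
some entries are silently dropped after the warning budget runs out,
which gives a simpler result than intended but never writes outside
the array.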

Signed-off-by: Paul Jackson <[email protected]>

---

This is a needed fix for the *-mm patch:
    cpuset-sched_load_balance-flag.patch

 kernel/cpuset.c |   21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

--- 2.6.23-mm1.orig/kernel/cpuset.c	2007-10-17 18:56:09.814604327 -0700
+++ 2.6.23-mm1/kernel/cpuset.c	2007-10-18 01:56:23.863274785 -0700
@@ -631,16 +631,18 @@ restart:
 	/* Find the best partition (set of sched domains) */
 	for (i = 0; i < csn; i++) {
 		struct cpuset *a = csa[i];
+		int apn = a->pn;
 
 		for (j = 0; j < csn; j++) {
 			struct cpuset *b = csa[j];
+			int bpn = b->pn;
 
-			if (a->pn != b->pn && cpusets_overlap(a, b)) {
+			if (apn != bpn && cpusets_overlap(a, b)) {
 				for (k = 0; k < csn; k++) {
 					struct cpuset *c = csa[k];
 
-					if (c->pn == b->pn)
-						c->pn = a->pn;
+					if (c->pn == bpn)
+						c->pn = apn;
 				}
 				ndoms--;	/* one less element */
 				goto restart;
@@ -660,6 +662,19 @@ restart:
 		if (apn >= 0) {
 			cpumask_t *dp = doms + nslot;
 
+			if (nslot == ndoms) {
+				static int warnings = 10;
+				if (warnings) {
+					printk(KERN_WARNING
+					 "rebuild_sched_domains confused:"
+					  " nslot %d, ndoms %d, csn %d, i %d,"
+					  " apn %d\n",
+					  nslot, ndoms, csn, i, apn);
+					warnings--;
+				}
+				continue;
+			}
+
 			cpus_clear(*dp);
 			for (j = i; j < csn; j++) {
 				struct cpuset *b = csa[j];

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <[email protected]> 1.650.933.1373