Re: [BUG] sched: big numa dynamic sched domain memory corruption

Andrew, Ingo,

I believe Suresh is correct, and that the mainline no longer
has this bug.  So the patch that was buried in my report should
not be applied.


Suresh wrote:
> Paul can you please test the mainline code and confirm?

I have tested the mainline, 2.6.18-rc2-mm1, and I am now nearly certain
you are correct.  I did not see the failure.  There is a slim chance
that I botched the forward port of the tripwire code needed to catch
this bug efficiently, and so could have missed seeing the bug on that
account.

I will now try backporting the three patches you recommended for
systems on a release such as SLES10:

  sched: fix group power for allnodes_domains
  sched_domai: Allocate sched_group structures dynamically
  sched: build_sched_domains() fix

I wish you well on any further code improvements you have planned for
this code.  It's tough to understand, with such issues as many #ifdef's,
an interesting memory layout of the key sched domain arrays that I
didn't see described much in the comments, and a variety of memory
allocation calls that are tough to unravel on error.  Portions of
the code could use some more comments, explaining what is going on.
For example, I still haven't figured exactly what 'cpu_power' means.

The allocations of sched_group_allnodes, sched_group_phys and
sched_group_core are -big- on our ia64 SN2 systems (1024 CPUs),
and could fail once a system has been up for a while and is
getting memory tight and fragmented.

It is not obvious to me from the code or comments just how sched
domains are arranged on various large systems with hyper-threading
(SMT) and/or multiple cores (MC) and/or multiple processor packages
per node, and how scheduling is affected by all this.

This was about the third bug that has come by in it -- which I
in particular notice when it is someone playing with cpu_exclusive
cpusets who hits the bug.  Any kernel backtrace with 'cpuset' buried in
it tends to migrate to my inbox.  This latest bug was particularly
nasty, as is usually the case with random memory corruption bugs,
costing us a bunch of hours.

Good luck.

If you are aware of any other fixes/patches besides the above that us
big honkin numa iron SLES10 users need for reliable operation, let me
know.

Thanks.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <[email protected]> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [BUG] sched: big numa dynamic sched domain memory corruption
  - From: "Siddha, Suresh B" <[email protected]>

References:
- [BUG] sched: big numa dynamic sched domain memory corruption
  - From: Paul Jackson <[email protected]>
- Re: [BUG] sched: big numa dynamic sched domain memory corruption
  - From: Ingo Molnar <[email protected]>
- Re: [BUG] sched: big numa dynamic sched domain memory corruption
  - From: "Siddha, Suresh B" <[email protected]>

Prev by Date: do { } while (0) question
Next by Date: Re: let md auto-detect 128+ raid members, fix potential race condition
Previous by thread: Re: [BUG] sched: big numa dynamic sched domain memory corruption
Next by thread: Re: [BUG] sched: big numa dynamic sched domain memory corruption
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]