Re: NUMA mempolicy /proc code in mainline shouldn't have been merged

Christoph Lameter <[email protected]> wrote:
>
> On Sat, 10 Sep 2005, Andrew Morton wrote:
> 
> > > Well it's ugly because you said that the fixes to make it less ugly 
> > > were "useless". I can still submit those fixes that make numa_maps a 
> > > part of smaps and that clean up the way policies are displayed.
> > 
> > It would be useful to see these.
> 
> URLs (these are not up to date; in particular, the conversion
> functions are much simpler in recent versions, thanks to some help from
> Paul Jackson since then):
> 
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0507.3/1662.html
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0507.3/1663.html
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0507.3/1665.html
> 

That doesn't look like a great improvement.  The code you have there now
is quite straightforward.

> 
> > > > >  - it presents lots of kernel internal information and mempolicy
> > > > >  internals (like how many people have a page mapped) etc.
> > > > >  to userland that shouldn't be exposed like this.
> > > Very important information.
> > Important to whom?  Kernel developers or userspace developers?  If the
> > latter, what use do they actually make of it?  Shouldn't it be documented?
> 
> Both. System administrators would like to know on which node an 
> application has allocated memory. They would also like to change the way 
> applications allocate memory while they are running, but Andi has 
> philosophical concerns about that and will not even discuss methods to 
> fix the design issues in order to make that possible. I've got a couple 
> here ready to pull out if an opportunity arises.

Well I can understand concerns about externally fiddling with a process's
allocation policies, but that's a separate issue from the one at hand.

I'd expect that the ability to know "on which node an application has
allocated memory" would be more valuable to a developer than to a sysadmin.
Can you provide an example usage?

> I still have a hard time seeing how people can accept the line of 
> reasoning that says:
> 
>  Users are not allowed to know on which nodes the operating system 
>  allocated resources for a process

This would only be useful if the process wasn't explicitly setting memory
policies, yes?  It's "hey, where did all my memory end up?".

> and are also not allowed to see the 
>  memory policies in effect for the memory areas

And this is the case where the process _has_ set some memory allocation
policies.

Certainly I can see value in that.  How can a developer test his
code without any form of runtime feedback?
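
To make that concrete: the developer's test code would look something like
this (an untested sketch using mbind() from libnuma's numaif.h; the
nodemask and maxnode handling is from memory, so treat the details as
approximate):

#define _GNU_SOURCE
#include <numaif.h>
#include <sys/mman.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	size_t len = 64 * 4096;
	unsigned long nodemask = 0x3;	/* nodes 0 and 1 */
	void *buf;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* ask for this region to be interleaved across nodes 0 and 1 */
	if (mbind(buf, len, MPOL_INTERLEAVE, &nodemask,
		  sizeof(nodemask) * 8, 0) != 0) {
		perror("mbind");
		return 1;
	}

	/* touch every page so the allocations actually happen */
	memset(buf, 0, len);

	/* ...and now the only feedback channel is numa_maps */
	printf("look at /proc/%d/numa_maps while I sleep\n", (int)getpid());
	pause();
	return 0;
}

Without numa_maps that mbind() call is write-only: you issue it and hope.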

> Then the application developers have to guess the effect that the memory 
> policies have on memory allocation. For memory allocation debugging, the 
> poor app guys today must simply imagine what the operating system is 
> doing. They can see the amount of total memory allocated on a node via 
> other /proc entries and then guess, based on that, which application has 
> taken it. Then they modify their apps and do another run. 

Agree.
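
And those "other proc entries" don't take you very far.  Something like
this is the whole toolkit today (a sketch; I'm assuming the per-node
totals live in /sys/devices/system/node/node<N>/meminfo):

#include <stdio.h>

int main(void)
{
	char path[64], line[256];
	int node;

	/* walk node0, node1, ... until one is missing */
	for (node = 0; ; node++) {
		FILE *f;

		snprintf(path, sizeof(path),
			 "/sys/devices/system/node/node%d/meminfo", node);
		f = fopen(path, "r");
		if (!f)
			break;
		while (fgets(line, sizeof(line), f))
			fputs(line, stdout);
		fclose(f);
	}
	return 0;
}

There's no per-process or per-VMA breakdown anywhere in that output, which
is exactly the gap numa_maps fills.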

What does Andi mean by "there was a theoretical usecase where it might be
needed, but there were better solutions proposed for this"?  The
application developer's problem here seems very real to me.  Maybe the
"theoretical usecase" was something different?

> My thinking today is that I'd rather keep /proc/<pid>/numa_maps 
> instead of folding it into smaps, because the smaps format is a bit 
> verbose and would make it difficult to see the allocation distribution.

We don't want /proc/pid/smaps to have different formats on NUMA versus !NUMA.

> I have a new series of patches here that does a gradual thing with
> the policy layer:

That's off-topic.



Andi, I don't understand the objection.  If I were developing the memory
policy tuning component of some honking number-crunching app, I'd sure as
heck want some way of seeing the results of my efforts.  How do I do that
without numa_maps?

And the content doesn't look too bad:

2000000000000000 default MaxRef=43 Pages=11 Mapped=11 N0=4 N1=3 N2=2 N3=2
2000000000038000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2
2000000000040000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1
2000000000058000 default MaxRef=43 Pages=61 Mapped=61 N0=14 N1=15 N2=16 N3=16
2000000000268000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2

It's easy to parse and it is extensible.  It needs documenting though.
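
For instance, a first-cut parser that sums up the per-node counts for a
process is only a screenful of C (an untested sketch; it simply
pattern-matches the N<node>=<pages> fields and skips everything else):

#include <stdio.h>
#include <string.h>

#define MAX_NODES 64

int main(int argc, char **argv)
{
	unsigned long node_pages[MAX_NODES] = { 0 };
	char path[64], line[1024];
	FILE *f;
	int i;

	/* default to the calling process if no pid is given */
	snprintf(path, sizeof(path), "/proc/%s/numa_maps",
		 argc > 1 ? argv[1] : "self");
	f = fopen(path, "r");
	if (!f) {
		perror(path);
		return 1;
	}

	while (fgets(line, sizeof(line), f)) {
		char *tok = strtok(line, " \n");

		/* every field is a plain token; only the N<node>=<pages>
		   ones match this pattern, the rest are skipped */
		while (tok) {
			unsigned int node;
			unsigned long pages;

			if (sscanf(tok, "N%u=%lu", &node, &pages) == 2 &&
			    node < MAX_NODES)
				node_pages[node] += pages;
			tok = strtok(NULL, " \n");
		}
	}
	fclose(f);

	for (i = 0; i < MAX_NODES; i++)
		if (node_pages[i])
			printf("node %d: %lu pages\n", i, node_pages[i]);
	return 0;
}

New Key=value fields can be added to each line without breaking a reader
written this way, which is what I mean by extensible.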

