Re: [PATCH 0/13] maps: pagemap, kpagemap, and related cleanups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 12, 2007 at 05:42:01PM -0700, Andrew Morton wrote:
> On Fri, 13 Apr 2007 10:15:24 +1000 Nick Piggin <[email protected]> wrote:
> 
> > >>+                       ((char *)page)[1] = PAGE_SHIFT;
> > > 
> > > 
> > > OK.
> > 
> > Shouldn't we just expose page size and endianness by other means? (another file or
> > syscall).
> 
> I don't think so - this file exposes fairly deep kernel internals and
> that's unavoidable, really - it's *supposed* to do that.  It is explicitly
> designed for monitoring kernel behaviour.
> 
> So it needs special handling by userspace.  Keeping the number of files
> which need such special handling to a minimum will keep the number of
> applications which are exposed to kernel changes to a minimum.
> 
> > >>+               for (; i < 2 * chunk / KPMSIZE; i += 2, pfn++) {
> > >>+                       ppage = pfn_to_page(pfn);
> > >>+                       if (!ppage) {
> > >>+                               page[i] = 0;
> > >>+                               page[i + 1] = 0;
> > >>+                       } else {
> > >>+                               page[i] = ppage->flags;
> > >>+                               page[i + 1] = atomic_read(&ppage->_count);
> > >>+                       }
> > >>+               }
> > > 
> > > 
> > > Not a good idea to expose raw flags in this manner - it changes at the drop
> > > of a hat.  We'd need to also expose the kernel's PG_foo-to-bitnumber
> > > mapping to make this viable.
> > 
> > I don't think it is viable because that makes the flags part of the
> > userspace ABI.
> 
> It *will* be viable.  If the application wants to know if a page is dirty,
> it looks up "PG_dirty" in /proc/pg_foo-to-bitnumber and uses PG_dirty's
> numerical offset when inspecting fields in /proc/kpagemap.  If correctly
> designed, such a monitoring application will be able to report upon page
> flags which we haven't even thought up yet.

We can probably fit this in the existing (variable-sized) header.
 
> > I wonder what they are needed for.
> 
> Poking deeply into the kernel to provide information about kernel state. 
> 
> There are real-world needs for this, and the people who develop tools to
> process this information will have decent kernel understanding and will
> know that the file's contents may alter across kernel versions.  It sure
> beats poking around in /dev/kmem.
> 
> I doubt if there's a sensible way in which we can prettify this interface
> without losing information.  But we should aim to make it as robust as
> possible agaisnt future kenrel changes, of course.
> 
> And we should satisfy ourselves that all the required information has been
> made available.  The fact that it will satisfy the Oracle requirement is
> encouraging.
> 
> Matt, these changes make the new field in /proc/pid/smaps redundant, don't
> they?

Which new field? From /proc/kpagemap + /proc/*/pagemap, you can
basically synthesize any statistic you want, including all the
existing ones. For some data, /proc/pid/smaps (or /proc/meminfo) will
be considerably more efficient.

But in general, most of the statistics in smaps are basically useless
for shared mappings, just like RSS. Problem is, we really don't know
what statistics we want yet, or even if it can be distilled down to
simple numbers anyway.

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux