Re: [PATCH 01/14] Per zone counter functionality

On Fri, 9 Jun 2006 08:54:39 -0700 (PDT)
Christoph Lameter <[email protected]> wrote:

> On Thu, 8 Jun 2006, Andrew Morton wrote:
> 
> > Is the use of 8-bit accumulators more efficient than using 32-bit ones? 
> > Obviously it's better from a cache POV, given that we have a pretty large
> > array of them.  But is there a downside on some architectures in not using
> > the natural wordsize?   I assume not, but I don't really know...
> 
> The advantage is that the whole thing fits into one cacheline, right next to
> the pcp information. Some architectures need additional cycles, but this
> increases the cache hit rate. A memory access is far more expensive than
> those extra cycles.
> 
> > > +#ifdef CONFIG_SMP
> > > +typedef atomic_long_t vm_stat_t;
> > > +#define VM_STAT_GET(x) atomic_long_read(&(x))
> > > +#define VM_STAT_ADD(x,v) atomic_long_add(v, &(x))
> > > +#else
> > > +typedef unsigned long vm_stat_t;
> > > +#define VM_STAT_GET(x) (x)
> > > +#define VM_STAT_ADD(x,v) (x) += (v)
> > > +#endif
> > 
> > Is there a need to do this?  On !SMP the atomic ops for well-cared-for
> > architectures use nonatomic RMWs anyway.  For most architectures I'd expect
> > that we can simply use atomic_long_foo() in both cases with no loss of
> > efficiency.
> 
> Maybe I am not quite up to date on !SMP. I thought they still needed
> atomic ops for MMU races.

There's no need for an atomic op - at most the architecture would need
local_irq_disable() protection, and that's only if it doesn't have an
atomic-wrt-this-cpu add instruction.
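
To illustrate, here is a rough userspace sketch of what a uniprocessor
atomic_long_add() can boil down to on an architecture without such an
instruction: a plain read-modify-write with local interrupts masked around
it.  Everything prefixed model_ is a made-up stand-in, not the kernel's real
implementation, and the interrupt flag is just a boolean.  With something
like this underneath, the VM_STAT_* macros would not need an #ifdef
CONFIG_SMP variant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's interrupt-enable flag. */
static bool model_irqs_enabled = true;

/* Models local_irq_save(): remember the old state, then mask interrupts. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = model_irqs_enabled ? 1UL : 0UL;

	model_irqs_enabled = false;
	return flags;
}

/* Models local_irq_restore(): put the saved state back. */
static void model_irq_restore(unsigned long flags)
{
	model_irqs_enabled = (flags != 0);
}

/* Hypothetical uniprocessor version of atomic_long_t. */
typedef struct { long counter; } model_atomic_long_t;

static long model_atomic_long_read(const model_atomic_long_t *v)
{
	return v->counter;
}

static void model_atomic_long_add(long i, model_atomic_long_t *v)
{
	unsigned long flags = model_irq_save();

	v->counter += i;	/* ordinary, non-atomic read-modify-write */
	model_irq_restore(flags);
}

/* One definition, no SMP/!SMP split. */
typedef model_atomic_long_t vm_stat_t;
#define VM_STAT_GET(x)		model_atomic_long_read(&(x))
#define VM_STAT_ADD(x, v)	model_atomic_long_add((v), &(x))
```

An architecture which does have a suitable add instruction would drop the
save/restore pair and be left with just the add.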


> > > +void refresh_cpu_vm_stats(int cpu)
> > > +{
> > > +	struct zone *zone;
> > > +	int i;
> > > +	unsigned long flags;
> > > +
> > > +	for_each_zone(zone) {
> > > +		struct per_cpu_pageset *pcp;
> > > +
> > > +		pcp = zone_pcp(zone, cpu);
> > > +
> > > +		for (i = 0; i < NR_STAT_ITEMS; i++)
> > > +			if (pcp->vm_stat_diff[i]) {
> > > +				local_irq_save(flags);
> > > +				zone_page_state_add(pcp->vm_stat_diff[i],
> > > +					zone, i);
> > > +				pcp->vm_stat_diff[i] = 0;
> > > +				local_irq_restore(flags);
> > > +			}
> > > +	}
> > > +}
> > 
> > Note that when this function is called via on_each_cpu(), local interrupts
> > are already disabled.  So a small efficiency gain would come from changing
> > the API definition here to "caller must have disabled local interrupts".
> 
> Interrupts are enabled for on_each_cpu on IA64.

Not from my reading of arch/ia64/kernel/smp.c:handle_IPI().  And if I've
misread it, ia64 has broken invalidate_bh_lrus() and who knows what else.
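
For what it's worth, the API change I was suggesting would look roughly
like the userspace sketch below.  The sketch_ names, the flattened zone
structure and the boolean interrupt flag are all invented for illustration;
the real function walks every zone and uses zone_pcp().  The point is only
that the per-item save/restore disappears and the requirement moves into
the function's contract.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_STAT_ITEMS 4	/* made-up size, stands in for NR_STAT_ITEMS */

/* Hypothetical stand-in for "local interrupts are disabled". */
static bool sketch_irqs_off;

/* Flattened model: one zone, one CPU's differentials. */
struct sketch_zone {
	long vm_stat[SKETCH_NR_STAT_ITEMS];		/* zone-wide counters */
	signed char vm_stat_diff[SKETCH_NR_STAT_ITEMS];	/* per-cpu deltas */
};

/*
 * Fold the per-cpu deltas into the zone counters.
 * Caller must have disabled local interrupts.
 */
static void sketch_refresh_vm_stats(struct sketch_zone *zone)
{
	int i;

	assert(sketch_irqs_off);	/* enforce the calling convention */

	for (i = 0; i < SKETCH_NR_STAT_ITEMS; i++) {
		if (zone->vm_stat_diff[i]) {
			zone->vm_stat[i] += zone->vm_stat_diff[i];
			zone->vm_stat_diff[i] = 0;
		}
	}
}
```

Callers which do not already run with interrupts off would then wrap the
call in their own save/restore pair.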

> > Well I guess if this doesn't oops then we've finally answered that "Should
> > this ever happen" in __alloc_pages().
> 
> Why would this oops? I thought all the zones are always populated?

That's my point - probably the check in __alloc_pages() isn't needed.

> > Would it be possible/sensible to move all this stuff into a new .c file? 
> > page_alloc.c is getting awfully large and multipurpose, and this code is a
> > single logical chunk.
> 
> Right, I thought about that one as well. Can we stabilize this first before I
> do another big reorg?

That's unfortunate patch ordering.  Do it (much) later I guess.