Re: 2.6.17-mm6 — Linux Kernel

On Wed, 5 Jul 2006 17:05:49 -0700
"Keith Mannthey" <[email protected]> wrote:

> On 7/5/06, Andrew Morton <[email protected]> wrote:
> > On Wed, 5 Jul 2006 16:44:57 -0700
> > Andrew Morton <[email protected]> wrote:
> >
> > > I guess a medium-term fix would be to add a boot parameter to override
> > > PERCPU_ENOUGH_ROOM - it's hard to justify increasing it permanently just
> > > for the benefit of the tiny minority of kernels which are hand-built with
> > > lots of drivers in vmlinux.
> 
> I am not really loading a lot of drivers.  I am building with a ton of drivers.
> >
> > That's not right, is it.  PERCPU_ENOUGH_ROOM covers vmlinux and all loaded
> > modules, so if vmlinux blows it all then `modprobe the-same-stuff' will
> > blow it as well.
> >
> > > But first let's find out where it all went.
> >
> > I agree with that person.
> :)
> 
> This is what I get; it is different from yours for sure. I am a little
> confused by the large offset change near the start.....?

Yes, I had an unexplained 8k hole.  That was i386.  Your x86_64 output
looks OK though.

> elm3a153:/home/keith/linux-2.6.17-mm6-orig # nm -n vmlinux | grep per_cpu
>...
> ffffffff80658000 A __per_cpu_start			It starts here
> ffffffff80658000 D per_cpu__init_tss			8k
> ffffffff8065a080 d per_cpu__idle_state		
> ffffffff8065a084 d per_cpu__cpu_idle_state
> ffffffff8065a0a0 D per_cpu__vector_irq
> ffffffff8065a4a0 D per_cpu__device_mce
> ffffffff8065a518 d per_cpu__next_check
> ffffffff8065a520 d per_cpu__threshold_banks
> ffffffff8065a550 d per_cpu__bank_map
> ffffffff8065a580 d per_cpu__flush_state
> ffffffff8065a600 D per_cpu__cpu_state
> ffffffff8065a620 d per_cpu__perfctr_nmi_owner
> ffffffff8065a624 d per_cpu__evntsel_nmi_owner
> ffffffff8065a640 d per_cpu__nmi_watchdog_ctlblk
> ffffffff8065a660 d per_cpu__last_irq_sum
> ffffffff8065a668 d per_cpu__alert_counter
> ffffffff8065a670 d per_cpu__nmi_touch
> ffffffff8065a680 D per_cpu__current_kprobe
> ffffffff8065a6a0 D per_cpu__kprobe_ctlblk
> ffffffff8065a7e0 D per_cpu__mmu_gathers		4k
> ffffffff8065b7e0 d per_cpu__runqueues			5k
> ffffffff8065ca60 d per_cpu__cpu_domains
> ffffffff8065cae0 d per_cpu__core_domains
> ffffffff8065cb60 d per_cpu__phys_domains
> ffffffff8065cbe0 d per_cpu__node_domains
> ffffffff8065cc60 d per_cpu__allnodes_domains
> ffffffff8065cce0 D per_cpu__kstat			wham - 17.5k
> ffffffff80661120 D per_cpu__process_counts
> ffffffff80661130 d per_cpu__cpu_profile_hits
> ffffffff80661140 d per_cpu__cpu_profile_flip
> ffffffff80661148 d per_cpu__tasklet_vec
> ffffffff80661150 d per_cpu__tasklet_hi_vec
> ffffffff80661158 d per_cpu__ksoftirqd
> ffffffff80661160 d per_cpu__tvec_bases
> ffffffff80661180 D per_cpu__rcu_data
> ffffffff80661200 D per_cpu__rcu_bh_data
> ffffffff80661280 d per_cpu__rcu_tasklet
> ffffffff806612c0 d per_cpu__hrtimer_bases
> ffffffff80661340 d per_cpu__kprobe_instance
> ffffffff80661348 d per_cpu__taskstats_seqnum
> ffffffff80661360 d per_cpu__ratelimits.18857
> ffffffff80661380 d per_cpu__committed_space
> ffffffff806613a0 d per_cpu__lru_add_pvecs
> ffffffff80661420 d per_cpu__lru_add_active_pvecs
> ffffffff806614a0 d per_cpu__lru_add_tail_pvecs
> ffffffff80661520 D per_cpu__vm_event_states
> ffffffff80661640 d per_cpu__reap_work
> ffffffff806616a0 d per_cpu__reap_node
> ffffffff806616c0 d per_cpu__bh_lrus
> ffffffff80661700 d per_cpu__bh_accounting
> ffffffff80661720 d per_cpu__fdtable_defer_list
> ffffffff806617c0 d per_cpu__blk_cpu_done
> ffffffff806617e0 D per_cpu__radix_tree_preloads
> ffffffff80661860 d per_cpu__trickle_count
> ffffffff80661864 d per_cpu__proc_event_counts
> ffffffff80661880 d per_cpu__loopback_stats
> ffffffff80661980 d per_cpu__sockets_in_use
> ffffffff80661a00 D per_cpu__softnet_data
> ffffffff80662000 D per_cpu__netdev_rx_stat
> ffffffff80662010 d per_cpu__net_rand_state
> ffffffff80662080 d per_cpu__flow_tables
> ffffffff80662100 d per_cpu__flow_hash_info
> ffffffff80662180 d per_cpu__flow_flush_tasklets
> ffffffff806621c0 d per_cpu__rt_cache_stat
> ffffffff80662200 d per_cpu____icmp_socket
> ffffffff80662208 A __per_cpu_end

So you've been hit by the expansion of NR_IRQS which bloats kernel_stat
which gobbles per-cpu data.
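
The size annotations in the listing can be reproduced by differencing
consecutive addresses in the `nm -n` output. Below is a minimal Python sketch
of that arithmetic, using an excerpt of the listing above. The helper names
(`parse_nm`, `symbol_sizes`) are made up for illustration, and the 32768-byte
figure for PERCPU_ENOUGH_ROOM is an assumption about the 2.6.17 default, so
check include/linux/percpu.h in your tree before relying on it.

```python
# Excerpt of the `nm -n vmlinux | grep per_cpu` output quoted above.
# Only some symbols are kept, so the "size" of a symbol whose true
# successor was dropped also covers the dropped symbols.
NM_EXCERPT = """\
ffffffff80658000 A __per_cpu_start
ffffffff80658000 D per_cpu__init_tss
ffffffff8065a080 d per_cpu__idle_state
ffffffff8065a7e0 D per_cpu__mmu_gathers
ffffffff8065b7e0 d per_cpu__runqueues
ffffffff8065ca60 d per_cpu__cpu_domains
ffffffff8065cce0 D per_cpu__kstat
ffffffff80661120 D per_cpu__process_counts
ffffffff80662208 A __per_cpu_end
"""

# ASSUMPTION: default per-cpu area size in 2.6.17; not stated in the thread.
PERCPU_ENOUGH_ROOM = 32768


def parse_nm(text):
    """Parse lines of the form '<hex address> <type> <name>'."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        symbols.append((int(parts[0], 16), parts[2]))
    return sorted(symbols)  # nm -n is already sorted; sort again to be safe


def symbol_sizes(symbols):
    """Estimate each symbol's size as the gap to the next symbol."""
    return {
        name: next_addr - addr
        for (addr, name), (next_addr, _) in zip(symbols, symbols[1:])
    }


symbols = parse_nm(NM_EXCERPT)
sizes = symbol_sizes(symbols)
addr_of = {name: addr for addr, name in symbols}
total = addr_of["__per_cpu_end"] - addr_of["__per_cpu_start"]

for name in ("per_cpu__init_tss", "per_cpu__mmu_gathers",
             "per_cpu__runqueues", "per_cpu__kstat"):
    print(f"{name:24s} {sizes[name]:6d} bytes")
print(f"total per-cpu section    {total:6d} bytes "
      f"(assumed limit {PERCPU_ENOUGH_ROOM})")
```

The gap after per_cpu__kstat includes any alignment padding, so these figures
are upper bounds on the object sizes rather than exact sizeof() values.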

In 2.6.17 NR_IRQS is 244.  In -mm (due to the x86_64 genirq conversion)
NR_IRQS became (256 + 32 * NR_CPUS).  Hence the kstat "array" became
two-dimensional.  It's now O(NR_CPUS^2).
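
To make the quadratic growth concrete, here is a rough Python sketch of the
arithmetic. It assumes kernel_stat is a 64-byte cpu_usage_stat followed by
irqs[NR_IRQS] of 4-byte unsigned ints; the 64-byte figure is an assumption,
not something stated in this thread, though together with NR_CPUS=128 it
happens to match the 17472-byte gap after per_cpu__kstat in the listing
above. The function names are hypothetical.

```python
UINT_BYTES = 4       # sizeof(unsigned int) on x86_64
CPUSTAT_BYTES = 64   # ASSUMPTION: eight 64-bit counters in cpu_usage_stat


def nr_irqs_mm(nr_cpus):
    """NR_IRQS after the x86_64 genirq conversion, as quoted in this mail."""
    return 256 + 32 * nr_cpus


def kstat_bytes(nr_irqs):
    """Approximate size of one CPU's kernel_stat for a given NR_IRQS."""
    return CPUSTAT_BYTES + UINT_BYTES * nr_irqs


def total_kstat_bytes(nr_cpus):
    """kernel_stat summed over all possible CPUs: grows as NR_CPUS^2."""
    return nr_cpus * kstat_bytes(nr_irqs_mm(nr_cpus))


# 2.6.17 baseline, using the NR_IRQS value given in this mail.
print("2.6.17 per-cpu kstat:", kstat_bytes(244), "bytes")

for nr_cpus in (8, 32, 128, 256):
    per_cpu = kstat_bytes(nr_irqs_mm(nr_cpus))
    print(f"NR_CPUS={nr_cpus:3d}  NR_IRQS={nr_irqs_mm(nr_cpus):5d}  "
          f"per-cpu kstat={per_cpu:6d}  total={total_kstat_bytes(nr_cpus):8d}")
```

Doubling NR_CPUS roughly doubles the per-cpu footprint and roughly
quadruples the total, which is the O(NR_CPUS^2) behaviour described above.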

I don't know what's a sane max for NR_CPUS on x86_64, but that'll sure be a
showstopper if the ia64 guys try the same trick.

I guess one fix would be to de-percpuify kernel_stat.irqs[].  Or
dynamically allocate it with alloc_percpu().

(cc's people, runs away)