Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

Andrew Morton wrote:
> On Thu, 29 Jun 2006 09:44:08 -0700
> Paul Jackson <[email protected]> wrote:
>
>
>>>You're probably correct on that model. However, it all depends on the actual
>>>workload. Are people who actually have large-CPU (>256) systems actually
>>>running fork()-heavy things like webservers on them, or are they running things
>>>like database servers and computations, which tend to have persistent
>>>processes?
>>
>>It may well be mostly as you say - the large-CPU systems not running
>>the fork() heavy jobs.
>>
>>Sooner or later, someone will want to run a fork()-heavy job on a
>>large-CPU system.  On a 1024 CPU system, it would apparently take
>>just 14 exits/sec/CPU to hit this bottleneck, if Jay's number of
>>14000 applied.
>>
>>Chris Sturdivant's reply is reasonable -- we'll hit it sooner or later,
>>and deal with it then.
>>
>
>
> I agree, and I'm viewing this as blocking the taskstats merge.  Because if
> this _is_ a problem then it's a big one because fixing it will be
> intrusive, and might well involve userspace-visible changes.
>
> The only ways I can see of fixing the problem generally are to either
>
> a) throw more CPU(s) at stats collection: allow userspace to register for
>    "stats generated by CPU N", then run a stats collection daemon on each
>    CPU or
>
> b) make the kernel recognise when it's getting overloaded and switch to
>    some degraded mode where it stops trying to send all the data to
>    userspace - just send a summary, or a "we goofed" message or something.



Andrew,

Based on previous discussions, the above solutions can be expanded/modified as follows:

a) allow userspace to listen to a group of cpus instead of all of them. Multiple
collection daemons can then distribute the load, as you pointed out. Doing collection
by cpu group rather than by individual cpu reduces the aggregation burden on
userspace (and scales better with NR_CPUS).

b) do flow control on the kernel send side. This can involve buffering and sending
later (to handle the bursty case) or dropping (to handle sustained load), as pointed
out by you and Jamal in other threads.

c) increase the receiver's socket buffer. This can and should always be done, but
requires no kernel changes.


With regard to the taskstats changes needed to handle the problem, and their impact
on userspace-visible interfaces:

a) will change the userspace interface
b) will be transparent to userspace
c) is immaterial going forward (except perhaps as a change in Documentation)


I'm sending a patch that demonstrates how a) can be done quite simply;
a patch for b) is in progress.

If the approach suggested in patch a) is acceptable (I'll provide testing and
stability results once comments on it are largely over), could taskstats acceptance
into 2.6.18 go ahead and patch b) be added later? (A solution outline has already
been provided, and a preliminary patch should be out by eod.)

--Shailabh


