Parag Warudkar wrote:
>
> On Nov 15, 2005, at 5:29 PM, Shailabh Nagar wrote:
<snip>
>
>>> Does this mean that, whether or not per-task delay accounting is
>>> used, we have a constant overhead of sizeof(spinlock_t) +
>>> 2*sizeof(uint32_t) + 2*sizeof(uint64_t) bytes going into struct
>>> task_struct? Is it possible/beneficial to use struct task_delay_info
>>> *delays instead and allocate it only if a task wants to use the
>>> information?
>>>
>>
>> Doing so would have value in the case where the feature is configured
>> but no one ever registers to listen for it.
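To make the trade-off concrete, here's a rough sketch of the two layouts
(the field names are illustrative shorthand on my part, not necessarily
what the patch uses):

        struct task_delay_info {
                spinlock_t lock;
                uint32_t blkio_count;   /* # of block I/O delays seen */
                uint32_t swapin_count;  /* # of swapin delays seen */
                uint64_t blkio_delay;   /* cumulative delay, in ns */
                uint64_t swapin_delay;  /* cumulative delay, in ns */
        };

        struct task_struct {
                /* ... */
                /* embedded: the constant per-task cost quoted above */
                struct task_delay_info delays;
                /* vs. the proposed alternative: one pointer,
                   NULL until someone actually wants the data */
                /* struct task_delay_info *delays; */
        };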
>
>
> Precisely. Such a feature will be used only occasionally, I suppose.
> I didn't read the code deeply, but are any scheduling decisions altered
> based on this data? If not, then it makes sense not to account unless
> required.
>
> I think it should be possible to do it on demand, per process, instead
> of forcing the accounting on _all_ processes, which cumulatively
> becomes a sizeable o/h.
>
> Per-process activation of this feature will add significant value IMHO.
> (Of course, only if that's possible in the first place.)
Per-task activation is useful/possible only for long-running tasks.
If one is trying to gather stats for a user-defined grouping of tasks,
requiring monitoring to be turned on for each task individually would
involve too much overhead and inaccuracy.
>
>> The cost of doing this would be
>> - adding more code to the fork path to allocate conditionally
>
>
> Just an unlikely branch in the normal code path - not a big deal.
> Also, I am thinking it could be handled outside of fork?
Only if per-task activation is done - that's probably what you meant?
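For reference, the fork-path cost under discussion is just one
predicted-not-taken branch. A minimal sketch, assuming a global
delayacct_on flag, a delayacct_cache slab, and a helper called from
copy_process() (all of these names are mine):

        static inline void delayacct_tsk_init(struct task_struct *tsk)
        {
                tsk->delays = NULL;     /* common case: no accounting */
                if (unlikely(delayacct_on)) {
                        tsk->delays = kmem_cache_alloc(delayacct_cache,
                                                       GFP_KERNEL);
                        if (tsk->delays) {
                                memset(tsk->delays, 0,
                                       sizeof(*tsk->delays));
                                spin_lock_init(&tsk->delays->lock);
                        }
                }
        }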
>
>> - make the collecting of the delays conditional on a similar check
>
>
> Weighing this against the actual accounting - I think it's a win.
Hmm... since there is locking involved in the stats collection, this is
starting to make a lot of sense.
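Something like this, where the "off" case never takes the lock at all
(a sketch reusing the hypothetical fields from above):

        static inline void delayacct_blkio_end(struct task_struct *tsk,
                                               uint64_t delay_ns)
        {
                if (likely(!tsk->delays))       /* accounting off: return */
                        return;                 /* before taking any lock */
                spin_lock(&tsk->delays->lock);
                tsk->delays->blkio_delay += delay_ns;
                tsk->delays->blkio_count++;
                spin_unlock(&tsk->delays->lock);
        }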
>
>> - cache pollution from following an extra pointer in the
>> pgflt/io_schedule paths
>> I'm not sure if this really matters for these two code paths.
>
>
> Possibly.
>
>> Even if one does this, once the first listener registers, all future
>> tasks (and even the current ones) will have to go ahead and allocate
>> the structure, and accounting of delays will have to switch to
>> unconditional mode. This is because the delay data has cumulative
>> value... future listeners will be interested in data collected earlier
>> (as long as the task is still running). And once the first listener
>> registers, you can no longer be sure no one's interested in the
>> future.
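(For illustration, the "allocate for the current tasks too" step when
the first listener registers would look roughly like this; thread
handling and allocation failures are glossed over:)

        struct task_struct *p;

        read_lock(&tasklist_lock);
        for_each_process(p) {
                if (!p->delays) {
                        /* GFP_ATOMIC: tasklist_lock is held */
                        p->delays = kmem_cache_alloc(delayacct_cache,
                                                     GFP_ATOMIC);
                        if (p->delays) {
                                memset(p->delays, 0, sizeof(*p->delays));
                                spin_lock_init(&p->delays->lock);
                        }
                }
        }
        read_unlock(&tasklist_lock);
        delayacct_on = 1;       /* future forks allocate as well */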
>>
>
> Is it possible to do it per process? Forcing it on all processes is
> what I was trying to avoid given the feature's usage pattern.
>
>> Another alternative is to let userland control the overhead of
>> allocation and collection completely through a
>> /proc/sys/kernel/delayacct variable. When it's switched on, it
>> triggers an allocation for all existing tasks in the system, turns on
>> allocation in fork() for future tasks, and starts collection of the
>> stats. When turned off, collection of stats stops, as does allocation
>> for future tasks (it's not worth going in and deallocating structs for
>> existing tasks).
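(A handler for that sysctl could reuse the loop sketched above. A rough
sketch - the proc_dointvec signature is abbreviated since it varies
across kernel versions, and delayacct_alloc_existing_tasks() is a
hypothetical wrapper around that loop:)

        static int delayacct_sysctl_handler(struct ctl_table *table,
                                            int write, void __user *buf,
                                            size_t *lenp, loff_t *ppos)
        {
                int ret = proc_dointvec(table, write, buf, lenp, ppos);

                /* on a successful write of a nonzero value, catch up
                   the tasks that were forked while accounting was off */
                if (!ret && write && delayacct_on)
                        delayacct_alloc_existing_tasks();
                return ret;
        }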
>
>
>> Does this seem worth it ?
>>
>
> Definitely not unless we can do it per process and on demand.
Per-task activation doesn't seem like a good idea. Neither does allowing
the allocation of the delay struct to be switched on/off dynamically.
So how about this:
Have /proc/sys/kernel/delayacct and a corresponding kernel boot
parameter (for setting the switch early) that control just the
collection of data. Allocation always happens.
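The plumbing for the boot-parameter side is tiny. A sketch, with the
parameter name and flag as placeholders:

        int delayacct_on __read_mostly; /* /proc/sys/kernel/delayacct */

        /* e.g. "delayacct=1" on the kernel command line */
        static int __init delayacct_setup(char *str)
        {
                delayacct_on = (simple_strtoul(str, NULL, 0) != 0);
                return 1;
        }
        __setup("delayacct=", delayacct_setup);

With allocation unconditional, the collection sites then only have to
test delayacct_on - no pointer chase, no lock on the off path.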
>> -- Shailabh
>>
>
> Cheers
>
> Parag
>