Shailabh Nagar wrote:
Andrew Morton wrote:
On Fri, 23 Jun 2006 22:59:04 -0400
Shailabh Nagar <[email protected]> wrote:
It was due to a loop in fill_tgid() when per-TG stats
data are assembled for netlink:
do {
rc = delayacct_add_tsk(stats, tsk);
if (rc)
break;
} while_each_thread(first, tsk);
and it is executed inside a lock.
Fortunately single threaded appls do not hit this code.
Am I reading this right? We do that loop when each thread within the
thread group exits?
Yes.
How come?
To get the sum of all per-tid data for threads that are currently alive.
This is returned to userspace with each thread exit.
I realise that. How about we stop doing it?
When a thread exits it only makes sense to send up the stats for that
thread.
Why does the kernel assume that userspace is also interested in
the accumulated stats of its siblings? And if userspace _is_
interested in
that info, it's still present in-kernel and can be queried for.
The reason for sending out sum of siblings's stats was as follows:
I didn't maintain a per-tgid data structure in-kernel where the exiting
threads taskstats could be accumalated
, erroneously thinking that this would require such a structure to be
*additionally* updated each time a statististic
was being collected and that would be way too much overhead. Also to
save on space. Thus if userspace wants to get the per-tgid stats for the
thread group when the last thread exits, then it cannot
do so by querying since such a query only returns the sum of currently
live threads (data from exited threads is lost).
So, the current design chooses to return the sum of all siblings + self
when each thread exits. Using this userspace
can maintain the per-tgid data for all currently living threads of the
group + previously exited threads.
But as pointed out in an earlier mail, it looks like this is
unnecessarily elaborate way of trying to avoid maintaining
a separate per-tgid data structure in the kernel (in addition to the
per-tid ones we already have).
What can be done is to create a taskstats structure for a thread group
the moment the *second* thread gets created.
Then each exiting thread can accumalate its stats to this struct. If
userspace queries for per-tgid data, the sum of all
live threads + value in this struct can be returned. And when the last
thread of the thread group exits, the struct's
value can be output.
While this will mean an extra taskstats structure hanging around for the
lifetime of a multithreaded app (not single threaded
ones), it should cut down on the overhead of running through all threads
that we see in the current design.
More importantly, it will reduce the frequency of per-tgid data send to
once for each thread group exit instead of once
per thread exit.
Will that work for everyone ?
As long as the per-pid delayacct struct has a pointer to the per-tgid
data struct and deoes not need to go through the loop on every exit.
Is there some better lock we can use in there? It only has to be
threadgroup-wide rather than kernel-wide.
The lock we're holding is the tasklist_lock. To go through all the
threads of a thread group
thats the only lock that can protect integrity of while_each_thread
afaics.
At present, yes. That's persumably not impossible to fix.
In the above design, if a userspace query for per-tgid data arrives,
then I'll still need to run through
all the threads of a thread group (to return their sum + that of already
exited threads accumalated in the
extra per-tgid taskstats struct).
But, this query-reply logic can be separated from that executed at
exit.
Thanks,
- jay
So that could still benefit from such a thread group specific lock.
Scope of change is a bit more of course
so will need to take a closer look.
--Shailabh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]