On Tue, 17 May 2005, Mitchell Blank Jr wrote:
> Maybe #ifdef CONFIG_SMP around this? On uniproc you're still saving the
> sti/cli around reading the tsk->signal stuff but that's probably not
> enough to warrant the code bloat.
>
> Or maybe this is a small enough amount of code not to matter... just a
> suggestion.
Here is the patch with more explanations and #ifdef CONFIG_SMP
Avoid taking the tasklist_lock in sys_times if the process is
single threaded. In a NUMA system taking the tasklist_lock may
cause a bouncing cacheline if multiple independent processes
continually call sys_times to measure their performance.
Signed-off-by: Christoph Lameter <[email protected]>
Signed-off-by: Shai Fultheim <[email protected]>
Index: linux-2.6.11/kernel/sys.c
===================================================================
--- linux-2.6.11.orig/kernel/sys.c 2005-05-17 15:21:50.000000000 -0700
+++ linux-2.6.11/kernel/sys.c 2005-05-17 17:00:42.000000000 -0700
@@ -894,35 +894,69 @@ asmlinkage long sys_times(struct tms __u
*/
if (tbuf) {
struct tms tmp;
- struct task_struct *tsk = current;
- struct task_struct *t;
cputime_t utime, stime, cutime, cstime;
- read_lock(&tasklist_lock);
- utime = tsk->signal->utime;
- stime = tsk->signal->stime;
- t = tsk;
- do {
- utime = cputime_add(utime, t->utime);
- stime = cputime_add(stime, t->stime);
- t = next_thread(t);
- } while (t != tsk);
-
- /*
- * While we have tasklist_lock read-locked, no dying thread
- * can be updating current->signal->[us]time. Instead,
- * we got their counts included in the live thread loop.
- * However, another thread can come in right now and
- * do a wait call that updates current->signal->c[us]time.
- * To make sure we always see that pair updated atomically,
- * we take the siglock around fetching them.
- */
- spin_lock_irq(&tsk->sighand->siglock);
- cutime = tsk->signal->cutime;
- cstime = tsk->signal->cstime;
- spin_unlock_irq(&tsk->sighand->siglock);
- read_unlock(&tasklist_lock);
+#ifdef CONFIG_SMP
+ if (current == next_thread(current)) {
+ /*
+ * Single thread case without the use of any locks.
+ *
+ * We may race with release_task if two threads are
+ * executing. However, release task first adds up the
+ * counters (__exit_signal) before removing the task
+ * from the process tasklist (__unhash_process).
+ * __exit_signal also acquires and releases the
+ * siglock which results in the proper memory ordering
+ * so that the list modifications are always visible
+ * after the counters have been updated.
+ *
+ * If the counters have been updated by the second thread
+ * but the thread has not yet been removed from the list
+ * then the other branch will be executing which will
+ * block on tasklist_lock until the exit handling of the
+ * other task is finished.
+ *
+ * This also implies that the sighand->siglock cannot
+ * be held by another processor. So we can also
+ * skip acquiring that lock.
+ */
+ utime = cputime_add(current->signal->utime, current->utime);
+ stime = cputime_add(current->signal->utime, current->stime);
+ cutime = current->signal->cutime;
+ cstime = current->signal->cstime;
+ } else
+#endif
+ {
+ /* Process with multiple threads */
+ struct task_struct *tsk = current;
+ struct task_struct *t;
+
+ read_lock(&tasklist_lock);
+ utime = tsk->signal->utime;
+ stime = tsk->signal->stime;
+ t = tsk;
+ do {
+ utime = cputime_add(utime, t->utime);
+ stime = cputime_add(stime, t->stime);
+ t = next_thread(t);
+ } while (t != tsk);
+
+ /*
+ * While we have tasklist_lock read-locked, no dying thread
+ * can be updating current->signal->[us]time. Instead,
+ * we got their counts included in the live thread loop.
+ * However, another thread can come in right now and
+ * do a wait call that updates current->signal->c[us]time.
+ * To make sure we always see that pair updated atomically,
+ * we take the siglock around fetching them.
+ */
+ spin_lock_irq(&tsk->sighand->siglock);
+ cutime = tsk->signal->cutime;
+ cstime = tsk->signal->cstime;
+ spin_unlock_irq(&tsk->sighand->siglock);
+ read_unlock(&tasklist_lock);
+ }
tmp.tms_utime = cputime_to_clock_t(utime);
tmp.tms_stime = cputime_to_clock_t(stime);
tmp.tms_cutime = cputime_to_clock_t(cutime);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]