Andi,
On Fri, Jun 30, 2006 at 02:59:05PM +0200, Andi Kleen wrote:
> Stephane Eranian <[email protected]> writes:
> >
> > __kprobes struct task_struct *
> > __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
> > {
> > struct thread_struct *prev = &prev_p->thread,
> > *next = &next_p->thread;
> > int cpu = smp_processor_id();
> > struct tss_struct *tss = &per_cpu(init_tss, cpu);
> >
> > if (unlikely(__get_cpu_var(pmu_ctx) || next_p->pfm_context))
> > __pfm_ctxswout(prev_p, next_p);
> >
> > /*
> > * Reload esp0, LDT and the page table pointer:
> > */
> > tss->rsp0 = next->rsp0;
> >
> > There is now a single hook and a conditional branch.
> > this is similar to what you have with the debug registers.
>
> It's still more than there was before. Also __get_cpu_var
> is quite a lot of instructions.
>
Ok, let me explain why we have the pmu_tx per-cpu variable.
Perfmon supports per thread and cpu-wide monitoring (across all
threads running on one CPU).
In per-thread mode, a perfmon context is attached to the monitored
thread in task->pfm_context. When the thread is context switched in
that context is also stored into the per-cpu variable to indicate
the currently active context. This may seem a bit redundant because
in this mode current->pfm_context == pmu_ctx in SMP mode. However
in UP mode, we support lazy save/restore and there pmu_ctx stores
the last owner.
In cpu-wide mode, the perfmon context is not attched to any thread.
It is logically attached to the CPU being monitored and is pointed
to by pmu_ctx. That variable is used to get to the active perfmon
context on PMU interrupt for instance (because current->pfm_context is
always NULL).
So why do we need care about context switch in cpu-wide mode?
It is because we support a mode where the idle thread is excluded
from cpu-wide monitoring. This is very useful to distinguish
'useful kernel work' from 'idle'. As you realize, that means
that we need to turn off when the idle thread is context switched
in and turn it back on when it is switched off.
Having said that, and based on your suggestion, I think we could
simply mark the PERFMON flag in the idle thread when needed in cpu-wide
mode. That would bring the test in __switch_to() to something along
those lines:
prev = &prev_p->thread;
next = &next_p->thread;
if (unlikely((prev->flags & PERFMON) || (next->flags & PERFMON)))
pfm_switch_to(prev_p, next_p);
I could split prev from next and merge next into the DEBUG and BITMAP stuff
as you suggest below.
What do you think?
> I would suggest you borrow some bits in one of the process
> or thread info flags and then do a single test
>
> if (unlikely(thr->flags & (DEBUG|PERFMON)) != 0) {
> if (flags & DEBUG)
> ... do debug ...
> if (flags & PERFMON)
> ... do perfmon ...
> }
>
> [which you're at it you can probably add ioports in there too -
> improving existing code is always a good thing]
>
> Ideally flags is in some cache line that is already
> touched during context switch. If not you might need
> to change the layout.
>
> It's ok to put the do_perfmon stuff into a separate noinline
> function because that will disturb the register allocation
> in the caller less.
>
> I would suggest doing this in separate preparing patches that
> first just do it for existing facilities.
>
> -Andi
>
> P.S.: My comments probably apply to the i386 versions too
> although I haven't read them.
Yes, this is the same for i386.
--
-Stephane
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]