Arnd Bergmann wrote:
On Tuesday 30 January 2007 23:54, Maynard Johnson wrote:
Why do you store them per spu in the first place? The physical spu
doesn't have any relevance to this at all; the only data that is
per spu is the sample data collected on a profiling interrupt,
which you can then copy into the per-context data on a context switch.
The sample data is written out to the event buffer on every profiling
interrupt. But we don't write out the SPU program counter samples
directly to the event buffer. First, we have to find the cached_info
for the appropriate SPU context to retrieve the cached vma-to-fileoffset
map. Then we do the vma_map_lookup to find the fileoffset corresponding
to the SPU PC sample, which we then write out to the event buffer. This
is one of the most time-critical pieces of the SPU profiling code, so I
used an array to hold the cached_info for fast random access. But as I
stated in a code comment above, the negative implication of this current
implementation is that the array can only hold the cached_info for
currently running SPU tasks. I need to give this some more thought.
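For illustration, here is a minimal sketch in plain C of the interrupt-time
path described above. All names in it are hypothetical (spu_info, cached_info
fields, oprofile_add_event); vma_map_lookup is the lookup mentioned above, but
its signature here is assumed, not the real one.

#include <stddef.h>
#include <stdint.h>

#define MAX_SPUS 128

struct vma_to_fileoffset_map;              /* built when the SPU binary is mapped */
struct cached_info {
	struct vma_to_fileoffset_map *map; /* vma -> offset in the SPU ELF image */
	uint64_t app_cookie;               /* identifies the owning executable */
};

/* One slot per physical SPU; valid only while that context is running. */
static struct cached_info *spu_info[MAX_SPUS];

/* Assumed helpers: the map lookup and the OProfile event-buffer writer. */
uint64_t vma_map_lookup(struct vma_to_fileoffset_map *map, uint32_t spu_pc);
void oprofile_add_event(uint64_t file_offset, uint64_t app_cookie);

/* Called for every SPU program-counter sample taken on a profiling
 * interrupt: translate the PC through the cached map, then record it. */
void record_spu_sample(int spu_num, uint32_t spu_pc)
{
	struct cached_info *info = spu_info[spu_num];

	if (!info)          /* context already switched out; drop the sample */
		return;

	oprofile_add_event(vma_map_lookup(info->map, spu_pc), info->app_cookie);
}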
I've given this some more thought, and I'm coming to the conclusion that
a pure array-based implementation for holding cached_info (getting rid
of the lists) would work well for the vast majority of cases in which
OProfile will be used. Yes, it is true that the mapping of an SPU
context to a physical spu-numbered array location cannot be guaranteed
to stay valid, and that's why I discard the cached_info at that array
location when the SPU task is switched out. Yes, it would be terribly
inefficient if the same SPU task gets switched back in later and we
would have to recreate the cached_info. However, I contend that
OProfile users are interested in profiling one application at a time.
They are not going to want to muddy the waters with multiple SPU apps
running at the same time. I can't think of any reason why someone would
consciously choose to do that.
Any thoughts from the general community, especially OProfile users?
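To make the trade-off concrete, here is a rough sketch of the array-slot
lifetime described above, using the same kind of per-physical-SPU spu_info[]
array as in the earlier sketch; the helper names (create_cached_info,
destroy_cached_info, the notify_* entry points) are made up for the example.

#include <stddef.h>
#include <stdint.h>

#define MAX_SPUS 128

struct cached_info;                        /* holds the vma-to-fileoffset map */

/* Assumed helpers: build the map from the SPU object, and free it. */
struct cached_info *create_cached_info(uint64_t object_id);
void destroy_cached_info(struct cached_info *info);

static struct cached_info *spu_info[MAX_SPUS];

void notify_spu_switch_in(int spu_num, uint64_t object_id)
{
	/* Rebuilding the map here is what makes frequent switching of the
	 * same context expensive under this scheme. */
	spu_info[spu_num] = create_cached_info(object_id);
}

void notify_spu_switch_out(int spu_num)
{
	destroy_cached_info(spu_info[spu_num]);
	spu_info[spu_num] = NULL;   /* samples after this point are dropped */
}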
Please assume that in the near future we will be scheduling SPU contexts
in and out multiple times a second. Even in a single application, you
can easily have more contexts than you have physical SPUs.
Arnd, thanks for pointing this out. That's definitely a good reason why
my simplistic approach won't work. I'll look at other options.
The event buffer by definition needs to be per context. If you for some
reason want to collect the samples per physical SPU during an event
interrupt, you should at least make sure that they are copied into the
per-context event buffer on a context switch.
Yes, and it is. Right now, with the current simplistic approach, the
context and the physical SPU are kept in sync.
At the context switch point, you probably also want to drain the
hw event counters, so that you account all events correctly.
Yeah, that's a good idea. The few extraneous invalid samples would
probably never rise above the noise level, but we should do this anyway
for completeness.
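As a rough illustration of the suggested switch-out handling (drain the
hardware counters first, then move the staged samples to the owning context),
here is a sketch; every name in it (spu_sample_fifo, stop_spu_perf_counters,
read_and_clear_spu_counters, ctx_buf_append) is made up for the example.

#include <stddef.h>
#include <stdint.h>

struct spu_sample_fifo {
	uint32_t pc[256];      /* program-counter samples staged per physical SPU */
	size_t   count;
};

struct spu_context_buf;        /* the per-context event buffer */

/* Assumed helpers. */
void stop_spu_perf_counters(int spu_num);
size_t read_and_clear_spu_counters(int spu_num, uint32_t *pc, size_t max);
void ctx_buf_append(struct spu_context_buf *ctx, const uint32_t *pc, size_t n);

void spu_switch_out_notify(int spu_num, struct spu_context_buf *ctx,
			   struct spu_sample_fifo *staged)
{
	size_t n;

	/* Drain the hardware counters first so no event collected while this
	 * context was running gets charged to the next one. */
	stop_spu_perf_counters(spu_num);
	n = read_and_clear_spu_counters(spu_num, staged->pc + staged->count,
					256 - staged->count);
	staged->count += n;

	/* Move everything staged for this physical SPU into the per-context
	 * buffer, then reset the staging area for the incoming context. */
	ctx_buf_append(ctx, staged->pc, staged->count);
	staged->count = 0;
}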
We also want to be able to profile the context switch code itself, which
means that we also need one event buffer associated with the kernel to
collect events that have a zero context_id.
The hardware design precludes tracing both SPU and PPU simultaneously.
-Maynard
Of course, the recording of raw samples in the per-context buffer does
not need to have the dcookies along with it; you can still resolve
the pointers when the SPU context gets destroyed (or an object gets
unmapped).
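A small sketch of that deferred-resolution idea, with illustrative names only
(raw_sample, get_exec_dcookie, pc_to_file_offset, oprofile_emit): keep raw PCs
in the per-context buffer and attach the dcookie once at teardown rather than
on every interrupt.

#include <stddef.h>
#include <stdint.h>

struct raw_sample { uint32_t spu_pc; };

struct spu_ctx_samples {
	struct raw_sample *samples;
	size_t count;
};

/* Assumed helpers: dcookie lookup for the mapped object and the final
 * writer into OProfile's event buffer. */
uint64_t get_exec_dcookie(const struct spu_ctx_samples *ctx);
uint64_t pc_to_file_offset(const struct spu_ctx_samples *ctx, uint32_t pc);
void oprofile_emit(uint64_t dcookie, uint64_t file_offset);

/* Called when the SPU context is destroyed (or its object unmapped). */
void flush_ctx_samples(struct spu_ctx_samples *ctx)
{
	uint64_t cookie = get_exec_dcookie(ctx);  /* resolved once, not per sample */

	for (size_t i = 0; i < ctx->count; i++)
		oprofile_emit(cookie, pc_to_file_offset(ctx, ctx->samples[i].spu_pc));

	ctx->count = 0;
}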
Arnd <><