Re: [PATCH] Linux Kernel Markers 0.5 for Linux 2.6.17 (with probe management)

* Frank Ch. Eigler ([email protected]) wrote:
> Thank you for continuing on.
> 
Thanks for the comments, see below :

> > +#define MARK_SYM(name) \
> > +		here: asm volatile \
> > +			(MARK_KPROBE_PREFIX#name " = %0" : : "m" (*&&here)); \
> 
> Regarding MARK_SYM, if I read Ingo's message correctly, this is the
> only type of marker he believes is necessary, since it would put a
> place for kprobes to put a breakpoint.  FWIW, this still appears to be
> applicable only if the int3 overheads are tolerable, and if parameter
> data extraction is unnecessary or sufficiently robust "by accident".
> 
I agree. And I don't see how the int3 approach fills the need of embedded
systems and high performance computing developers, so I think that a "call"
alternative, where the stack is set up by gcc itself, is worthy.
> 
> Regarding:
> 
> > +#define MARK_JUMP(name, format, args...) \
> > [...]
> 
> All this leap/jump/branch elaboration may be based upon scanty
> benchmarks.  The plain conditional-indirect function call is already
> "fast", especially if its hosting compilation unit is compiled with
> -freorder-blocks.
> 

Yes, I just wanted to show that there was some performance merits to this
technique, more benchmarks will be needed. The main problem I have seen with
your condition+call approach is that there is no safe way to remove the function
pointer without causing a huge mess (null pointer dereference). This is why I
thought of this alternative that has the abolity of keep a consistent execution
flow at _all_ times.

> Direct jumps to instrumentation are unlikely to be of a great deal of
> use, in that there would have to be some assembly-level code in
> between to save/restore affected registers.  Remember, ultimately the
> handler will be written in C.
> 
Yup, I agree, but it's nice to have the ability to jump over inline functions :)

> Regarding use of an empty_function() sentinel instead of NULLs, it is
> worth measuring carefully whether a unconditional indirect call to
> such a dummy function is faster than a conditional indirect call.  It
> may have to be a per-architecture internal implementation option.
> 
Yes, but you also have to take in account the stack setup cost. Idoubt that it
will be faster than a jump, but this is worthy to test.

> Regarding varargs, I still believe that varargs have poorer
> type-safety.  Remember, it's not just type-safety at the marker site
> (which gcc's printf-format-checker could validate), but the handler's
> too.  I believe it is incorrect that a non-varargs function can always
> safely take a call from a varargs call - varargs changes the calling
> conventions.  This would mean that the handler would itself have to be
> varargs, with all the attendant run-time overheads.  (Plus the idea of
> using a build-time script to generate the non-verargs handlers from
> analyzing particular MARK() calls is itself probably a bit complex for
> its own good.)
> 

Absolutely, you are right. Good catch! I am putting a asmlinkage prefix to the
probe function, which will make sure that every argument is passed on the stack.

From http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html

"regparm (number)
    On the Intel 386, the regparm attribute causes the compiler to
    pass arguments number one to number if they are of integral type in
    registers EAX, EDX, and ECX instead of on the stack. Functions that
    take a variable number of arguments will continue to be passed all
    of their arguments on the stack.

    Beware that on some ELF systems this attribute is unsuitable for
    global functions in shared libraries with lazy binding (which is the
    default). Lazy binding will send the first call via resolving code
    in the loader, which might assume EAX, EDX and ECX can be clobbered,
    as per the standard calling conventions. Solaris 8 is affected by
    this. GNU systems with GLIBC 2.1 or higher, and FreeBSD, are believed
    to be safe since the loaders there save all registers. (Lazy binding
    can be disabled with the linker or the loader if desired, to avoid
    the problem.)"

so a regparm(0) seems like a good solution there.

> Regarding measurements in general, one must avoid being misled by
> microbenchmarks such as those that represent an extreme of caching
> friendliness.  It may be necessary to run large system-level tests to
> meaningfully compare alternatives.
> 

Yes, the problem is that, after having ran such tests on a slower approach, the
macrobenchmarks failed to show any difference. Yes, those tests should be done,
but I don't expect any revelation from them.

> 
> > +int marker_set_probe(const char *name, void (*probe)(const char *fmt, ...),
> > +			enum marker_type type);
> > +void marker_disable_probe(const char *name, void (*probe)(const char *fmt, ...),
> > +			enum marker_type type);
> 
> I'm unclear about what a marker_type value represents.  Could you go
> over that again?
> 

marker_type MARKER_CALL
  Sets the jump to a function call

marker_type MARKER_INLINE
  Sets the jump to the provided inline function

> 
> > +static int marker_get_pointers(const char *name,
> > + [...]
> > +	ptrs->call = (void**)kallsyms_lookup_name(call_sym);
> > +	ptrs->jmpselect = (void**)kallsyms_lookup_name(jmpselect_sym);
> > +	ptrs->jmpcall = (void**)kallsyms_lookup_name(jmpcall_sym);
> > +	ptrs->jmpinline = (void**)kallsyms_lookup_name(jmpinline_sym);
> > +	ptrs->jmpover = (void**)kallsyms_lookup_name(jmpover_sym);
> > [...]
> 
> I wonder whether, as for kprobes, it will be necessary to lock into
> memory a module containing an active marker.
> 

Nope, as the module_exit _has_ to call marker_disable_probe and that
marker_disable_probe does a synchronize_sched(). Because the function call is
surrounded by preempt_disable(), there will be no execution stream within the
function when the module will be unloaded.

Regards,

Mathieu

OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

Attachment: signature.asc
Description: Digital signature

References:
- [PATCH] Linux Kernel Markers 0.5 for Linux 2.6.17 (with probe management)
  - From: Mathieu Desnoyers <[email protected]>
- Re: [PATCH] Linux Kernel Markers 0.5 for Linux 2.6.17 (with probe management)
  - From: "Frank Ch. Eigler" <[email protected]>

Prev by Date: Re: Flushing writes to PCI devices
Next by Date: Re: 2.6.18-rc7-mm1
Previous by thread: Re: [PATCH] Linux Kernel Markers 0.5 for Linux 2.6.17 (with probe management)
Next by thread: [patch -mm] namespaces: exit_task_namespaces() invalidates nsproxy
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]