Re: [patch] x86, ptrace: support for branch trace store(BTS)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tuesday 13 November 2007 20:59:54 Siddha, Suresh B wrote:
> Support for Intel's last branch recording to ptrace. This gives debuggers
> access to this hardware feature and allows them to show an execution trace
> of the debugged application.

Cool. Finally someone gets around to supporting this properly.

Ok mostly properly, the patch needs some improvements.

But can you please add a mode for tracing kernel code too? I guess
this means a way to forbid it to user space or allocate the register.
That would be probably needed by kernel debuggers too anyways, so
might be better to add it from the begging.

I'm sure some people will suggest now that should be all only
supported with utrace. My suggestion is to ignore them.

Also there was some discussion recently that complex kernel
interfaces like this need manpages on proposal so that proper review
of the interfaces is possible. I would suggest to write manpages
(or manpages patches) and point to them on the next submission.

> Index: linux-2.6/arch/x86/kernel/process_32.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/process_32.c	2007-11-07 09:08:32.%N +0100
> +++ linux-2.6/arch/x86/kernel/process_32.c	2007-11-07 09:09:36.%N +0100
> @@ -735,6 +735,18 @@
>  		__switch_to_xtra(prev_p, next_p, tss);
>  
>  	/*
> +	 * Last branch recording recofiguration of trace hardware and
> +	 * disentangling of trace data per task.
> +	 */
> +	if (unlikely(task_thread_info(prev_p)->flags & _TIF_BTS_TRACE ||
> +		     task_thread_info(prev_p)->flags & _TIF_BTS_TRACE_TS))
> +		ptrace_bts_task_departs(prev_p);
> +
> +	if (unlikely(task_thread_info(next_p)->flags & _TIF_BTS_TRACE ||
> +		     task_thread_info(next_p)->flags & _TIF_BTS_TRACE_TS))
> +		ptrace_bts_task_arrives(next_p);
> +

Congratulations. You managed to pick exactly the wrong place.  
Why do you think __switch_to_xtra was invented?  The bits should be in 
CTXSW_PREV/NEXT and then the checks moved there. Then it would be zero
cost for the nobody using it case.


> +#ifdef __i386__
> +static int ptrace_bts_disable_netburst(void)
> +{
> +	unsigned long long debugctl_msr = 0;
> +	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl_msr);
> +	debugctl_msr &=
> +		~((1 << 2) | /* TR */
> +		  (1 << 3));  /* BTS */
> +	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl_msr);
> +
> +	return 0;
> +}
> +
> +static int ptrace_bts_disable_pentium_m(void)
> +{
> +	unsigned long long debugctl_msr = 0;
> +	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl_msr);

etc. Surely it would be faster to cache the value of the MSR in RAM?
Might matter for the context switch.

> +	if (size_in_records > 4000)
> +		return -EINVAL;

Nasty undocumented undefined magic numbers.


> +
> +	if (!ptrace_bts_ops.allocate_bts)
> +		return -EOPNOTSUPP;
> +	return (*ptrace_bts_ops.allocate_bts)(child, size_in_records);
> +}
> +
> +#ifdef __i386__

etc. The code duplication for i386 is indeed quite ugly. Would
be good to find some better way to do this. A lot of it seems
to be just different sizeof(), surely that could be passed around too
or gotten from the ops structure? 

Also it's unclear why you got different MSR access functions for 32bit 
and 64bit.

> +static int ptrace_bts_init_intel(void)

This should be probably in the standard CPU initialization code,
possibly using synthesized feature bits too.

Also do you have user space to use this?

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux