Commit graph

361441 commits

Author SHA1 Message Date
Oleg Nesterov
457d1772f1 uprobes/tracing: Generalize struct uprobe_trace_entry_head
struct uprobe_trace_entry_head has a single member for reporting,
"unsigned long ip". If we want to support uretprobes we need to
create another struct which has "func" and "ret_ip" and duplicate
a lot of functions, like trace_kprobe.c does.

To avoid this copy-and-paste horror we turn ->ip into ->vaddr[]
and add couple of trivial helpers to calculate sizeof/data. This
uglifies the code a bit, but this allows us to avoid a lot more
complications later, when we add the support for ret-probes.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:32:01 +02:00
Oleg Nesterov
0e3853d202 uprobes/tracing: Kill the pointless local_save_flags/preempt_count calls
uprobe_trace_func() is never called with irqs or preemption
disabled, no need to ask preempt_count() or local_save_flags().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:32:00 +02:00
Oleg Nesterov
456fdbcb86 uprobes/tracing: Kill the pointless seq_print_ip_sym() call
seq_print_ip_sym(ip) in print_uprobe_event() is pointless,
kallsyms_lookup(ip) can not resolve a user-space address.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:31:59 +02:00
Oleg Nesterov
07720b63a9 uprobes/tracing: Kill the pointless task_pt_regs() calls
uprobe_trace_func() and uprobe_perf_func() do not need task_pt_regs(),
we already have "struct pt_regs *regs".

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:31:59 +02:00
Anton Arapov
decc6bfb49 uretprobes: Documentation update
add the uretprobe syntax and update an example

Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-13 15:31:59 +02:00
Anton Arapov
a0d60aef4b uretprobes: Remove -ENOSYS as return probes implemented
Enclose return probes implementation.

Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-13 15:31:58 +02:00
Anton Arapov
ded49c5530 uretprobes: Limit the depth of return probe nestedness
Unlike the kretprobes we can't trust userspace, thus must have
protection from user space attacks. User-space have  "unlimited"
stack, and this patch limits the return probes nestedness as a
simple remedy for it.

Note that this implementation leaks return_instance on siglongjmp
until exit()/exec().

The intention is to have KISS and bare minimum solution for the
initial implementation in order to not complicate the uretprobes
code.

In the future we may come up with more sophisticated solution that
remove this depth limitation. It is not easy task and lays beyond
this patchset.

Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-13 15:31:58 +02:00
Anton Arapov
fec8898d86 uretprobes: Return probe exit, invoke handlers
Uretprobe handlers are invoked when the trampoline is hit, on completion
the trampoline is replaced with the saved return address and the uretprobe
instance deleted.

TODO: handle_trampoline() assumes that ->return_instances is always valid.
We should teach it to handle longjmp() which can invalidate the pending
return_instance's. This is nontrivial, we will try to do this in a separate
series.

Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-13 15:31:57 +02:00
Anton Arapov
0dfd0eb8e4 uretprobes: Return probe entry, prepare_uretprobe()
When a uprobe with return probe consumer is hit, prepare_uretprobe()
function is invoked. It creates return_instance, hijacks return address
and replaces it with the trampoline.

* Return instances are kept as stack per uprobed task.
* Return instance is chained, when the original return address is
  trampoline's page vaddr (e.g. recursive call of the probed function).

Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-13 15:31:57 +02:00
Anton Arapov
f15706b79d uretprobes/powerpc: Hijack return address
Hijack the return address and replace it with a trampoline address.
PowerPC implementation.

Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-13 15:31:56 +02:00
Anton Arapov
791eca1010 uretprobes/x86: Hijack return address
Hijack the return address and replace it with a trampoline address.

Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-13 15:31:55 +02:00
Anton Arapov
e78aebfd27 uretprobes: Reserve the first slot in xol_vma for trampoline
Allocate trampoline page, as the very first one in uprobed
task xol area, and fill it with breakpoint opcode.

Also introduce get_trampoline_vaddr() helper, to wrap the
trampoline address extraction from area->vaddr. That removes
confusion and eases the debug experience in case ->vaddr
notion will be changed.

Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-13 15:31:54 +02:00
Anton Arapov
ea024870cf uretprobes: Introduce uprobe_consumer->ret_handler()
Enclose return probes implementation, introduce ->ret_handler() and update
existing code to rely on ->handler() *and* ->ret_handler() for uprobe and
uretprobe respectively.

Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-13 15:31:53 +02:00
Oleg Nesterov
3f47107c5c uprobes: Change write_opcode() to use copy_*page()
Change write_opcode() to use copy_highpage() + copy_to_page()
and simplify the code.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-04-04 13:57:06 +02:00
Oleg Nesterov
5669ccee21 uprobes: Introduce copy_to_page()
Extract the kmap_atomic/memcpy/kunmap_atomic code from
xol_get_insn_slot() into the new simple helper, copy_to_page().
It will have more users soon.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-04-04 13:57:05 +02:00
Oleg Nesterov
98763a1bb1 uprobes: Kill the unnecesary filp != NULL check in __copy_insn()
__copy_insn(filp) can only be called after valid_vma() returns T,
vma->vm_file passed as "filp" can not be NULL.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-04-04 13:57:05 +02:00
Oleg Nesterov
2edb7b5574 uprobes: Change __copy_insn() to use copy_from_page()
Change __copy_insn() to use copy_from_page() and simplify the code.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-04-04 13:57:05 +02:00
Oleg Nesterov
ab0d805c7b uprobes: Turn copy_opcode() into copy_from_page()
No functional changes. Rename copy_opcode() into copy_from_page() and
add the new "int len" argument to make it more more generic for the
new users.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-04-04 13:57:04 +02:00
Ananth N Mavinakayanahalli
3c9eb54f71 uprobes/powerpc: Remove additional trap instruction check
prepare_uprobe() already checks if the underlying unstruction
(on file) is a trap variant. We don't need to check this again.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-04 13:57:04 +02:00
Ananth N Mavinakayanahalli
ab07e807be uprobes/powerpc: Teach uprobes to ignore gdb breakpoints
Powerpc has many trap variants that could be used by entities like gdb.
Currently, running gdb on a program being traced by uprobes causes an
endless loop since uprobes doesn't understand that the trap was inserted
by some other entity and a SIGTRAP needs to be delivered.

Teach uprobes to ignore breakpoints that do not belong to it.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-04 13:57:04 +02:00
Ananth N Mavinakayanahalli
0908ad6e56 uprobes: Add trap variant helper
Some architectures like powerpc have multiple variants of the trap
instruction. Introduce an additional helper is_trap_insn() for run-time
handling of non-uprobe traps on such architectures.

While there, change is_swbp_at_addr() to is_trap_at_addr() for reading
clarity.

With this change, the uprobe registration path will supercede any trap
instruction inserted at the requested location, while taking care of
delivering the SIGTRAP for cases where the trap notification came in
for an address without a uprobe. See [1] for a more detailed explanation.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-March/104771.html

This change was suggested by Oleg Nesterov.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-04 13:57:04 +02:00
Oleg Nesterov
f281769e81 uprobes: Use file_inode()
Cleanup. Now that we have f_inode/file_inode() we can use it instead
of vm_file->f_mapping->host.

This should not make any difference for uprobes, but in theory this
change is more correct. We use this inode as a key, to compare it
with uprobe->inode set by uprobe_register(inode), and the caller uses
d_inode.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-04-04 13:57:03 +02:00
Ingo Molnar
b847d0501a perf/core improvements and fixes:
. Revert "perf sched: Handle PERF_RECORD_EXIT events" to get 'perf sched lat'
   back working.
 
 . We don't use Newt anymore, just plain libslang.
 
 . Kill a bunch of die() calls, from Namhyung Kim.
 
 . Add --no-demangle to report/top, from Namhyung Kim.
 
 . Fix dependency of the python binding wrt libtraceevent, from Naohiro Aota.
 
 . Introduce per core aggregation in 'perf stat', from Stephane Eranian.
 
 . Add memory profiling via PEBS, from Stephane Eranian.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJRWbZJAAoJENZQFvNTUqpAf1EP+wVatb3apHN/LyVq2gKBRZLs
 U8Ut/l8zpWmGDCHYvj6PazUbH92BGxqc5DNO7jHBOTLds3WQLvDV/QhJi3JIF6I4
 o+PdH1u0KhiJityZqRWmhaYIS4DvuWFqZUceXgA3IUZJ2onKTzjD6ABq1nsgHMSZ
 LwrhN5TLy7MmoY48wr02Ofc7HtbDfqzPDT2tqhpNaKPj+XslTR+nmVCTOeTNLZhX
 aSCTv8qgxt8Y2O8pdmbRg+RIYIKpx25U2BFHvvqNXVJypzuJe7lnRLYnumNRxKms
 b8NTS6UVI3TYyDMXLL1EiLK1OBJaY4XLFL5aX6sJcB1GiVSQJvG3mOpTFceQ8tvn
 ZJR3SzCasioYrQNTIt1cd482zlTiNpy5RdrdGaTfkL1B7pHIYvPnrLZGh57WneyG
 aVu4GnM5pu78WlsOq/hvgde/J9QX07jTIohlyRxcvqVi6s0NZHIB2TCwjCj61njZ
 2lITA5oM0kEJuKaoL3sXU/PJd8piTZvkrSCP5+1j87LMPA3AtgcZgXCNcpWSAZ9+
 J5geK5FN0sBLzA4UXpfUJPvUiiPIn3U7TLkBm9bx7j1KQvd53GB39TgFLl8a4Hzo
 WQSW6ieBc8vFYFsYNdj33CPciXgOxBKshbqqSwLEmok7IVPNFJ+b6szP+hkxuVIx
 KKXXDRdIZ7hw6PBlgvqz
 =VNmo
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

  * Revert "perf sched: Handle PERF_RECORD_EXIT events" to get 'perf sched lat'
    back working.

  * We don't use Newt anymore, just plain libslang.

  * Kill a bunch of die() calls, from Namhyung Kim.

  * Add --no-demangle to report/top, from Namhyung Kim.

  * Fix dependency of the python binding wrt libtraceevent, from Naohiro Aota.

  * Introduce per core aggregation in 'perf stat', from Stephane Eranian.

  * Add memory profiling via PEBS, from Stephane Eranian.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-01 21:41:35 +02:00
Arnaldo Carvalho de Melo
d06f791179 perf map browser: Exit just on well known key presses
Initial motivation was to avoid the confusing exit when when '/' is
pressed in non verbose mode, as specified in the help line searches
are only available in verbose mode.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-20xezxim2y4agmkx7f3sucll@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:23:55 -03:00
Arnaldo Carvalho de Melo
6692c262df perf tools: Remove dependency on libnewt
Now that the map browser shares the input routine with the hists
browser, there is no need for using any libnewt routine, so remove all
traces except for honouring NO_NEWT=1 on the makefile command line as an
indication that TUI support is not needed, in fact it just sets
NO_SLANG=1.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wae5o7xca9m52bj1re28jc5j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:23:21 -03:00
Arnaldo Carvalho de Melo
a403253634 perf map browser: Use ui_browser__input_window()
Instead of an ad-hoc, libnewt based equivalent.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-elrijp95pijt66y6mmij4xm1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:23:10 -03:00
Naohiro Aota
707ef2e69a perf python: Fix dependency for python/perf.so
The python/perf.so binding lacks dependency for libtraceevent.a so that
it cause the following error building python/perf.so. This patch
introduce the dependency for it.

   $ make python/perf.so
       CHK -fstack-protector-all
       CHK -Wstack-protector
       CHK -Wvolatile-register-var
       CHK -D_FORTIFY_SOURCE=2
       CHK bionic
       CHK libelf
       CHK libdw
       CHK libunwind
       CHK -DLIBELF_MMAP
       CHK libaudit
       CHK libnewt
       CHK gtk2
       CHK -DHAVE_GTK_INFO_BAR
       CHK perl
       CHK python
       CHK python version
       CHK libbfd
       CHK -DHAVE_STRLCPY
       CHK -DHAVE_ON_EXIT
       CHK -DBACKTRACE_SUPPORT
       CHK libnuma
       GEN python/perf.so
   x86_64-pc-linux-gnu-gcc: error: ../lib/traceevent/libtraceevent.a: No such file or directory
   error: command 'x86_64-pc-linux-gnu-gcc' failed with exit status 1
   cp: cannot stat 'python_ext_build/lib/perf.so': No such file or directory
   make: *** [python/perf.so] Error 1

Signed-off-by: Naohiro Aota <naota@elisp.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/87wqswzznx.fsf@locke.i-did-not-set--mail-host-address--so-tickle-me
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:23:01 -03:00
Arnaldo Carvalho de Melo
b5ded71397 perf tools: Convert needless static variable to local
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-k85ajz97xbrd8fkt2a8pp7q1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:22:48 -03:00
Arnaldo Carvalho de Melo
1c6763cb99 Revert "perf sched: Handle PERF_RECORD_EXIT events"
This reverts commit 0439539f72.

This caused this segfault:

[root@sandy linux]# perf sched rec
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.306 MB perf.data (~57062 samples) ]
perf
[root@sandy linux]# perf sched lat
perf: builtin-sched.c:781: thread_atoms_search: Assertion `!(thread != atoms->thread)' failed.
Aborted (core dumped)
[root@sandy linux]#

Further investigation is needed to check that even with machine__remove_thread()
not really deleting the thread referenced in the PERF_RECORD_EXIT (it goes to
machine->dead_threads, because references may still exist to them in things like
 hist, etc) some event later comes for this dead thread and then
machine__findnew_thread() will create a new thead instance that will not be the
same as the one referenced by work_atoms->thread in thread_atoms_search().

For now just revert this patch to get the 'perf sched lat' back working.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
echo Link: http://lkml.kernel.org/n/tip-`ranpwd -l 24`@git.kernel.org
Link: http://lkml.kernel.org/n/tip-hg4s6e5txiwqe00h8rdg1sin@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:22:34 -03:00
Namhyung Kim
62667746a6 perf tools: Fix output of symbol_daddr offset
The symbol addresses in a dso have relative offsets from the start of a
mapping.  So in order to ouput correct offset value from @ip, one of
them should be converted.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1359040242-8269-19-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:22:15 -03:00
Stephane Eranian
bad4091791 perf machine: Detect data vs. text mappings
Leverages the PERF_RECORD_MISC_MMAP_DATA bit in the RECORD_MMAP record
header. When the bit is set then the mapping type is set to
MAP__VARIABLE.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-17-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:22:00 -03:00
Stephane Eranian
028f12ee6b perf tools: Add new mem command for memory access profiling
This new command is a wrapper on top of perf record and perf report to
make it easier to configure for memory access profiling.

To record loads:
$ perf mem -t load rec .....

To record stores:
$ perf mem -t store rec .....

To get the report:
$ perf mem -t load rep

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-15-git-send-email-eranian@google.com
[ Fixed minor conflict with 66857b5 "Sort command-list.txt alphabetically" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:21:44 -03:00
Stephane Eranian
f4f7e28d0e perf report: Add support for mem access profiling
This patch adds the --mem-mode option to perf report.

This mode requires a perf.data file created with memory access samples.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-13-git-send-email-eranian@google.com
[ Removed duplicates in the --sort help, man page needs updating,
  Fixed minor conflict with 328ccda "perf report: Add --no-demangle option" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:21:28 -03:00
Stephane Eranian
ccf49bfc6b perf record: Add support for mem access profiling
We use the -W option to obtain the cost of the memory accesses.

Data address sampling is obtained via the -d option.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-14-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:20:28 -03:00
Stephane Eranian
98a3b32c99 perf tools: Add mem access sampling core support
This patch adds the sorting and histogram support
functions to enable profiling of memory accesses.

The following sorting orders are added:
 - symbol_daddr: data address symbol (or raw address)
 - dso_daddr: data address shared object
 - locked: access uses locked transaction
 - tlb : TLB access
 - mem : memory level of the access (L1, L2, L3, RAM, ...)
 - snoop: access snoop mode

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-12-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed, the move of methods to
  machine.[ch], and the rename of dsrc to data_src, to match the change
  made in the PERF_SAMPLE_DSRC in a previous patch. ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:20:13 -03:00
Andi Kleen
05484298cb perf tools: Add support for weight v7 (modified)
perf record has a new option -W that enables weightened sampling.

Add sorting support in top/report for the average weight per sample and the
total weight sum. This allows to both compare relative cost per event
and the total cost over the measurement period.

Add the necessary glue to perf report, record and the library.

v2: Merge with new hist refactoring.
v3: Fix manpage. Remove value check.
Rename global_weight to weight and weight to local_weight.
v4: Readd sort keys to manpage
v5: Move weight to end
v6: Move weight to template
v7: Rename weight key.

Original patch from Andi modified by Stephane Eranian <eranian@google.com>
to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:19:43 -03:00
Stephane Eranian
2fe85427e3 perf: Add PERF_RECORD_MISC_MMAP_DATA to RECORD_MMAP
Type of mapping was lost and made it hard for a tool
to distinguish code vs. data mmaps. Perf has the ability
to distinguish the two.

Use a bit in the header->misc bitmask to keep track of
the mmap type. If PERF_RECORD_MISC_MMAP_DATA is set then
the mapping is not executable (!VM_EXEC). If not set, then
the mapping is executable.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-16-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:19:02 -03:00
Stephane Eranian
9ad64c0f48 perf/x86: Add support for PEBS Precise Store
This patch adds support for PEBS Precise Store
which is available on Intel Sandy Bridge and
Ivy Bridge processors.

To use Precise store, the proper PEBS event
must be used: mem_trans_retired:precise_stores.
For the perf tool, the generic mem-stores event
exported via sysfs can be used directly.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-11-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:17:06 -03:00
Stephane Eranian
a63fcab452 perf/x86: Export PEBS load latency threshold register to sysfs
Make the PEBS Load Latency threshold register layout
and encoding visible to user level tools.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-10-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:16:49 -03:00
Stephane Eranian
f20093eef5 perf/x86: Add memory profiling via PEBS Load Latency
This patch adds support for memory profiling using the
PEBS Load Latency facility.

Load accesses are sampled by HW and the instruction
address, data address, load latency, data source, tlb,
locked information can be saved in the sampling buffer
if using the PERF_SAMPLE_COST (for latency),
PERF_SAMPLE_ADDR, PERF_SAMPLE_DATA_SRC types.

To enable PEBS Load Latency, users have to use the
model specific event:

 - on NHM/WSM: MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD
 - on SNB/IVB: MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD

To make things easier, this patch also exports a generic
alias via sysfs: mem-loads. It export the right event
encoding based on the host CPU and can be used directly
by the perf tool.

Loosely based on Intel's Lin Ming patch posted on LKML
in July 2011.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-9-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:16:31 -03:00
Stephane Eranian
d6be9ad6c9 perf: Add generic memory sampling interface
This patch adds PERF_SAMPLE_DATA_SRC.

PERF_SAMPLE_DATA_SRC collects the data source, i.e., where
did the data associated with the sampled instruction
come from. Information is stored in a perf_mem_data_src
structure. It contains opcode, mem level, tlb, snoop,
lock information, subject to availability in hardware.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-8-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:15:59 -03:00
Andi Kleen
c3feedf2aa perf/core: Add weighted samples
For some events it's useful to weight sample with a hardware
provided number. This expresses how expensive the action the
sample represent was.  This allows the profiler to scale
the samples to be more informative to the programmer.

There is already the period which is used similarly, but it
means something different, so I chose to not overload it.
Instead a new sample type for WEIGHT is added.

Can be used for multiple things. Initially it is used for TSX
abort costs and profiling by memory latencies (so to make
expensive load appear higher up in the histograms). The concept
is quite generic and can be extended to many other kinds of
events or architectures, as long as the hardware provides
suitable auxillary values. In principle it could be also used
for software tracepoints.

This adds the generic glue. A new optional sample format for a
64-bit weight value.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:15:44 -03:00
Stephane Eranian
9fac2cf316 perf/x86: Add flags to event constraints
This patch adds a flags field to each event constraint.
It can be used to store event specific features which can
then later be used by scheduling code or low-level x86 code.

The flags are propagated into event->hw.flags during the
get_event_constraint() call. They are cleared during the
put_event_constraint() call.

This mechanism is going to be used by the PEBS-LL patches.
It avoids defining yet another table to hold event specific
information.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:15:04 -03:00
Stephane Eranian
3a54aaa0a3 perf/x86: Improve sysfs event mapping with event string
This patch extends Jiri's changes to make generic
events mapping visible via sysfs. The patch extends
the mechanism to non-generic events by allowing
the mappings to be hardcoded in strings.

This mechanism will be used by the PEBS-LL patch
later on.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ fixed up conflict with 2663960 "perf: Make EVENT_ATTR global" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-26 17:36:45 -03:00
Andi Kleen
1a6461b128 perf/x86: Support CPU specific sysfs events
Add a way for the CPU initialization code to register additional
events, and merge them into the events attribute directory. Used
in the next patch.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-2-git-send-email-eranian@google.com
[ small cleanups ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ merge_attr returns a **, not just * ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-26 16:50:23 -03:00
Namhyung Kim
328ccdace8 perf report: Add --no-demangle option
It's sometimes useful to see undemangled raw symbol name for example
other tools using the perf output to do manipulation of binaries.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: William Cohen <wcohen@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: William Cohen <wcohen@redhat.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=55571
Link: http://lkml.kernel.org/r/1364203098-17741-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-26 16:38:21 -03:00
Stephane Eranian
12c08a9f59 perf stat: Add per-core aggregation
This patch adds the --per-core option to perf stat.

This option is used to aggregate system-wide counts
on a per physical core basis. On processors with
hyperthreading, this means counts of all HT threads
running on a physical core are aggregated.

This mode is useful to find imblance between physical
cores running an uniform workload. Cores are identified
by socket: S0-C1, means physical core 1 on socket 0. Note
that cores are identified using their physical core id,
thus their numbering may not be continuous.

Per core aggregation can be combined with interval printing:

 # perf stat -a --per-core -I 1000 -e cycles sleep 1000
 #           time core         cpus             counts events
      1.000090030 S0-C0           1          4,765,747 cycles
      1.000090030 S0-C1           1          5,580,647 cycles
      1.000090030 S0-C2           1            221,181 cycles
      1.000090030 S0-C3           1            266,092 cycles

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1360846649-6411-4-git-send-email-eranian@google.com
[ committer note: Remove parts already applied on 86ee6e1 to keep bisectability ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-25 16:13:26 -03:00
Stephane Eranian
d4304958a2 perf stat: Rename --aggr-socket to --per-socket
To make it more obvious what this option does as suggested by Andi on
LKML.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1360846649-6411-3-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-25 16:09:24 -03:00
Stephane Eranian
86ee6e18f6 perf stat: Refactor aggregation code
Refactor aggregation code by introducing a single aggr_mode variable and an
enum for aggregation.

Also refactor cpumap code having to do with cpu to socket mappings. All in
preparation for extended modes, such as cpu -> core.

Also fix socket aggregation and ensure that sockets are printed in increasing
order.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1360846649-6411-2-git-send-email-eranian@google.com
[ committer note: Fixup conflicts with a7e191c "--repeat forever" and
  acf2892 "Use perf_evlist__prepare/start_workload()" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-25 15:29:53 -03:00
Namhyung Kim
ebf3c675d7 perf tools: Cleanup calc_data_size logic
It's for calculating whole trace data size during reading.  However
relation functions are called only in this file, no need to
conditionalize it with tricky +1 offset and rename the variable to
more meaningful name like trace_data_size.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1363850332-25297-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-03-21 13:37:37 -03:00