Be clearer about DSO long names and report which file the
kernel symbols were obtained from, all in --verbose mode:
[root@mica ~]# perf report -v > /dev/null
Looking at the vmlinux_path (5 entries long)
Using /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux for symbols
[root@mica ~]# mv /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux /tmp/dd
[root@mica ~]# perf report -v > /dev/null
Looking at the vmlinux_path (5 entries long)
Using /proc/kallsyms for symbols
[root@mica ~]#
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1266866139-6361-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
First, for programs and prelinked libraries, the annotate code was
fooled by objdump output IPs (src->eip in the code) being
wrongly converted to absolute IPs. In such cases no
conversion was needed, but in
src->eip = strtoull(src->line, NULL, 16);
src->eip = map->unmap_ip(map, src->eip); // = eip + map->start - map->pgoff
we were reading an absolute address from objdump (e.g. 8048604) and
then almost doubling it, because eip and map->start are
close to each other for small programs.
As a result, record_precise_ip() later found no match against
the real runtime IPs.
Second, as with `perf annotate`, the problem with
non-prelinked *.so was that we were doing the rip -> objdump address
conversion wrong.
Also, because the `perf top` code, unlike `perf annotate`, does
annotation based on absolute IPs for performance reasons(*), a new
helper for mapping objdump addresses to IPs is introduced.
(*) we get samples info in absolute IPs, and since we do lots of
hit-testing on absolute IPs at runtime in record_precise_ip(), it's
better to convert objdump addresses to IPs once and do no conversion
at runtime.
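The "convert once, compare at runtime" idea from the footnote can be
sketched roughly as below. All names here (top_map, top_line,
top_objdump_to_ip, top_resolve_lines, top_record_hit) are hypothetical
and the objdump_absolute flag is an assumption standing in for however
the real code tells ET_EXEC-like objects from ET_DYN ones; the real
helper and record_precise_ip() may well look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified map and annotated-line records. */
struct top_map {
	uint64_t start, pgoff;
	bool objdump_absolute;	/* assumed: true for ET_EXEC-like objects */
};

struct top_line {
	uint64_t eip;		/* 0 means "no address on this line" */
	unsigned int hits;
};

/* objdump address -> absolute runtime IP. */
static uint64_t top_objdump_to_ip(const struct top_map *map, uint64_t addr)
{
	assert(map != NULL);
	if (map->objdump_absolute)
		return addr;			/* already absolute */
	return addr + map->start - map->pgoff;	/* relative to the mapping */
}

/* Done once, right after parsing the objdump output. */
static void top_resolve_lines(const struct top_map *map,
			      struct top_line *lines, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (lines[i].eip != 0)
			lines[i].eip = top_objdump_to_ip(map, lines[i].eip);
}

/* Done for every sample: plain comparison, no conversion. */
static bool top_record_hit(struct top_line *lines, size_t n, uint64_t ip)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (lines[i].eip != 0 && lines[i].eip == ip) {
			lines[i].hits++;
			return true;
		}
	}
	return false;
}
```

The point of the split is that the per-sample path only compares
absolute values; the address translation cost is paid once per
annotated symbol.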
I also had to fix how objdump output is parsed (it assumed a hardcoded
8/16 character format, which was inappropriate for ET_DYN dsos
with small addresses like '4ac').
Also note that not all objdump output lines have associated
IPs, e.g. look at the source lines here:
000004ac <my_strlen>:
extern "C"
int my_strlen(const char *s)
4ac: 55 push %ebp
4ad: 89 e5 mov %esp,%ebp
4af: 83 ec 10 sub $0x10,%esp
{
int len = 0;
4b2: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%ebp)
4b9: eb 08 jmp 4c3 <my_strlen+0x17>
while (*s) {
++len;
4bb: 83 45 fc 01 addl $0x1,-0x4(%ebp)
++s;
4bf: 83 45 08 01 addl $0x1,0x8(%ebp)
So we mark them with eip=0, and ignore such lines in annotate
lookup code.
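A rough sketch of width-independent parsing, assuming an instruction
line is "optional whitespace, hex digits, then a colon" and everything
else has no address. demo_parse_line_addr is a hypothetical name, not
the function in the patch, and this is only an illustration of the idea
rather than the actual parser.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Return the address that starts an objdump line, or 0 when the line
 * carries no address (source text, braces, "000004ac <sym>:" headers).
 */
static uint64_t demo_parse_line_addr(const char *line)
{
	char *end;
	uint64_t addr;

	assert(line != NULL);
	while (*line == ' ' || *line == '\t')
		line++;

	/* No fixed 8/16 column width: take however many hex digits exist. */
	addr = strtoull(line, &end, 16);

	/* Need at least one digit, immediately followed by ':'. */
	if (end == line || *end != ':')
		return 0;
	return addr;
}
```

Note the limitation: a source line that happens to look like hex digits
followed by a colon (say a goto label named "abc:") would be mistaken
for an address by this sketch.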
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
[ Note: one hunk of this patch was applied by Mike in 57d8188 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1265550376-12665-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The problem was that we were incorrectly calculating objdump
addresses for sym->start and sym->end:
For simple ET_DYN type DSO (*.so) with one function, objdump -dS
output is something like this:
000004ac <my_strlen>:
int my_strlen(const char *s)
4ac: 55 push %ebp
4ad: 89 e5 mov %esp,%ebp
4af: 83 ec 10 sub $0x10,%esp
{
i.e. we have relative-to-dso-mapping IPs (=RIP) there.
For ET_EXEC type and probably for prelinked libs as well (sorry
can't test - I don't use prelink) objdump outputs absolute IPs,
e.g.
08048604 <zz_strlen>:
extern "C"
int zz_strlen(const char *s)
8048604: 55 push %ebp
8048605: 89 e5 mov %esp,%ebp
8048607: 83 ec 10 sub $0x10,%esp
{
So, if sym->start is always relative to dso mapping(*), we'll
have to unmap it for ET_EXEC like cases, and leave as is for
ET_DYN cases.
(*) and it is - we've explicitly made it relative. Look for
adjust_symbols handling in dso__load_sym()
Previously we were always unmapping sym->start and for ET_DYN
dsos resulting addresses were wrong, and so objdump output was
empty.
The end result was that perf annotate output for symbols from
non-prelinked *.so always showed only 0.00%, which is
wrong.
To fix it, let's introduce a helper for converting rip to
objdump address, and also let's document what map_ip() and
unmap_ip() do -- I had to study sources for several hours to
understand it.
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1265223128-11786-8-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I noticed while writing the first test in 'perf regtest' that
just to test the symbol handling routines one needs to create a
perf session, which is a layer centered on a perf.data file,
events, etc, so I untied these layers.
This reduces the complexity for the users, as most of the symbols
and session APIs now take fewer parameters. It does so without
adding more state to all the map instances, by keeping only the
data that is needed to split the kernel (kallsyms and ELF symtab
sections) maps and do vmlinux relocation on the main kernel map.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1265223128-11786-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
And this resulted in the need for adding some missing includes
in some places that were getting the definitions needed out of
sheer luck.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261957026-15580-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>