kernel-fxtec-pro1x

Author	SHA1	Message	Date
Chris Wilson	933da83aa1	perf: Propagate term signal to child If we launch the child on behalf of the user, ensure that it dies along with ourselves when we are interrupted. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> LKML-Reference: <1254616502-4728-1-git-send-email-chris@chris-wilson.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-04 19:37:39 +02:00
Mulyadi Santosa	1ad0560e8c	perf tools: Run generate-cmdlist.sh properly Right now generate-cmdlist.sh is not executable, so we should call it as an argument ".". This fixes cases where due to different umask defaults the generate-cmdlist.sh script is not executable in a kernel tree checkout. Signed-off-by: Mulyadi Santosa <mulyadi.santosa@gmail.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <f284c33d0909251201w422e9687x8cd3a784e85adf7d@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-01 10:12:03 +02:00
Arjan van de Ven	39a90a8ef1	perf timechart: Add a power-only mode For doing work on the Linux power management components, I need to make long (30+ seconds) traces. Currently, this then results in a HUGE svg file, with mostly process data that isn't interesting. This patch adds a --power-only mode to perf timechart that only outputs the CPU power section of the SVG; this significantly reduces the size of the SVG file, making even 30+ second traces viewable with inkscape. As a minor tweak for the same effect, the minimum text size is decreased; current inkscape cannot zoom in deep enough to show text this small, but it reduces inkscape compute time. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: peterz@infradead.org LKML-Reference: <20090924154013.0675ab71@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-01 09:26:40 +02:00
Arnaldo Carvalho de Melo	8357275bb9	perf top: Add poll_idle to the skip list Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20090925220239.GA5488@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-30 13:56:54 +02:00
Eric Dumazet	725b13685c	perf tools: Dont use openat() openat() is still a young glibc facility, better to not use it in a non performance critical program (perf list) Many machines have older glibc (RHEL 4 Update 5 -> glibc-2.3.4-2.36 on my dev machine for example). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ulrich Drepper <drepper@redhat.com> LKML-Reference: <4ABB767D.6080004@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 21:20:08 +02:00
Eric Dumazet	a255a9981a	perf tools: Fix buffer allocation "perf top" cores dump on my dev machine, if run from a directory where vmlinux is present: * glibc detected * malloc(): memory corruption: 0x085670d0 *** Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: <stable@kernel.org> LKML-Reference: <4ABB6EB7.7000002@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 15:11:23 +02:00
Kirill Smelkov	67030036eb	perf tools: .gitignore += perf*.html I've tried building the docs in tools/perf/Documentation/ , and after that `git status` showed dozen of untracked htmls. Let's ignore them. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> LKML-Reference: <1253790022-10300-1-git-send-email-kirr@mns.spb.ru> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 14:01:22 +02:00
Mike Galbraith	dd906a0fe8	perf tools: Handle relative paths while loading module symbols Inform util/module.c::mod_dso__load_module_paths() that relative paths do exist in some modules.dep, and make it fail noisily should it encounter a path that it doesn't understand, or a module it cannot open. Reported-by: Avi Kivity <avi@redhat.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: rostedt@goodmis.org Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <1253779628.10513.8.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 11:40:35 +02:00
Mike Galbraith	508c4d0874	perf tools: Fix module symbol loading bug Avi Kivity reported 'perf annotate' failures with modules, the requested function was not annotated. If there are no modules currently loaded, or the last module scanned is not loaded, dso__load_modules() steps on the value from dso__load_vmlinux(), so we happily load the kallsyms symbols on top of what we've already loaded. Fix that such that the total count of symbols loaded is returned. Should module symbol load fail after parsing of vmlinux, is's a hard failure, so do not silently fall-back to kallsyms. Reported-by: Avi Kivity <avi@redhat.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: rostedt@goodmis.org Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <1253697658.11461.36.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-23 13:45:48 +02:00
Ingo Molnar	c7f7fea30b	perf stat: Fix zero total printouts Before: 0 sched:sched_switch # nan M/sec After: 0 sched:sched_switch # 0.000 M/sec Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-22 15:01:47 +02:00
Ingo Molnar	cdd6c482c9	perf: Do the big rename: Performance Counters -> Performance Events Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f \| grep -vE 'oprofile\|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N \| sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-21 14:28:04 +02:00
Arjan van de Ven	611a546bec	perf util: SVG performance improvements Tweak the output SVG to increase performance in SVG viewers by limiting the different types of font sizes and by smarter transformations on the text. At least with Inkscape this gives a notable performance improvement during zoom and scrolling. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090920181438.3a49cb93@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-20 19:37:36 +02:00
Arjan van de Ven	5094b65545	perf util: Make the timechart SVG width dynamic This patch adds a command line option for timechart that allows the user to specify the width of the SVG file. This patch also makes sure that each second of recording has at least 200 units (pixels at 96 DPI) of width. This impacts recordings longer than 5 seconds; recordings shorter than 5 second will scale up to have a width of 1000 units for the whole recording (as before). Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090920181416.69570c5d@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-20 19:37:35 +02:00
Arjan van de Ven	a92fe7b306	perf timechart: Show the duration of scheduler delays in the SVG Given that scheduler latencies are the hot thing nowadays, show the duration of said latencies in the SVG in text form. In addition, if the latency is more than 10 msec, pick a brighter yellow color as a way to point these long delays out. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090920181353.796f4509@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-20 19:37:35 +02:00
Arjan van de Ven	4f1202c8e6	perf timechart: Show the name of the waker/wakee in timechart Timechart currently shows thin green lines for sending or receiving wakeups. This patch also prints (in a very small font) the name of the process that is being woken/wakes up this process. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090920181328.68baa978@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-20 19:37:34 +02:00
Arjan van de Ven	ec60a3fe47	perf utils: Use a define for the maximum length of a trace event As per Ingo's review: use a #define rather than an open coded constant for the maximum length of a trace event for storing in the perf.data file. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: fweisbec@gmail.com Cc: peterz@infradead.org Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090919133630.10533d3e@infradead.org> [ add a few comments to nearby functions ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 18:57:53 +02:00
Arjan van de Ven	151750cec5	perf: Add timechart help text and add timechart to "perf help" As suggested by Ingo, add a timechart man page help text, as well as add it to the "perf help" overview. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: fweisbec@gmail.com Cc: peterz@infradead.org Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090919133604.3767fa35@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 18:57:53 +02:00
Arjan van de Ven	964a0b3d2b	perf utils: Be consistent about minimum text size in the svghelper Be more consistent in the svghelper about the minimum text size by having a global #define for this. There needs to be a minimum text size in order to keep the size of the SVG file within the reach of what current SVG viewers can cope with. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: fweisbec@gmail.com Cc: peterz@infradead.org Cc: Paul Mackerras <paulus@samba.org> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <20090919133507.7374ef8b@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 18:57:52 +02:00
Arjan van de Ven	3c09eebd61	perf timechart: Add "perf timechart record" Add a command line option to record a trace, similar to "perf sched record". Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: fweisbec@gmail.com Cc: peterz@infradead.org Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090919133442.0dc2c7f5@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 18:57:51 +02:00
Arjan van de Ven	10274989fd	perf: Add the timechart tool timechart is a tool to visualize what is going on in the system. The user makes a trace of what is going on with > perf record --timechart /usr/bin/some_command and then can turn the output of this into an svg file > perf timechart which then can be viewed with any SVG view; inkscape works well enough for me. The idea behind timechart is to create a "infinitely zoomable" picture; something that has high level information on a 1:1 zoom level, but which exposes more details every time you zoom into a specific area. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090912130713.6a77bbc0@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 11:42:13 +02:00
Arjan van de Ven	f48d55ce78	perf: Add a SVG helper library file The timechart tool writes out SVG format output; this patch adds a set of helper functions to abstract dealing with SVG from the core timechart code. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090912130613.677f0516@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 11:42:13 +02:00
Arjan van de Ven	fd39e055c4	perf: Add a sample_event type to the event_union Add a sample_event type to the event_union so that raw samples can be processed easily. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090912130511.411434b5@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 11:42:12 +02:00
Arjan van de Ven	a79d7db9fd	perf: Allow perf utilities to have "callback" options without arguments timechart needs to add a "callback" type command line argument that does not take arguments. This patch adds the parse-options.h infrastructure to make this possible. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090912130440.548666c1@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 11:42:11 +02:00
Arjan van de Ven	8755a8f27a	perf: Store trace event name/id pairs in perf.data The trace event name<->id mapping is dynamic for each kernel compile. In order for perf.data to be useable outside the actual system, we thus need to store a table of this mapping for later use. This patch adds this table to perf.data, and provides helper functions for lookup up fields from this table. To avoid mistakes, lookup-from-table is kept completely seprate from lookup-from-local-debugfs. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090912130405.6960d099@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 11:42:11 +02:00
Arjan van de Ven	393b2ad8c7	perf: Add a timestamp to fork events perf timechart needs to know when a process forked, in order to be able to visualize properly when tasks start. This patch adds a time field to the event structure, and fills it in appropriately. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090912130341.51ad2de2@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 11:42:10 +02:00
Ingo Molnar	929bf0d015	Merge branch 'linus' into perfcounters/core Merge reason: Bring in tracing changes we depend on. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-19 11:28:41 +02:00
Mike Galbraith	4b77a72977	perf sched: Add --input=file option to builtin-sched.c perf sched record passes unparsed args on to perf record, so specifying an output file via perf sched record -o FILE (cmd) just works. Ergo, provide an option to specify input file as well. Also add the missing 'map' command to help. Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1253254944.20589.11.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-18 08:36:06 +02:00
Li Zefan	1281a49b7a	perf trace: Sample timestamp and cpu when using record flag Sample timestamp and cpu just like the -R option. Before: init-0 [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0 init-0 [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0 init-0 [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=1 handler=i8042 init-0 [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0 init-0 [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=1 handler=i8042 After: init-0 [001] 7364.568965353: irq_handler_entry: irq=18 handler=eth0 init-0 [001] 7365.530226877: irq_handler_entry: irq=1 handler=i8042 init-0 [001] 7365.542831563: irq_handler_entry: irq=18 handler=eth0 init-0 [001] 7365.644156299: irq_handler_entry: irq=18 handler=eth0 init-0 [001] 7365.694556201: irq_handler_entry: irq=18 handler=eth0 Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4AB1F827.8040905@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-18 00:30:25 +02:00
Li Zefan	270bbbe80d	perf tools: Increase MAX_EVENT_LENGTH The name length of some trace events is longer than 30, like sys_enter_sched_get_priority_max and ext4_mb_discard_preallocations. Passing those events to perf-record will fail, try: # ./perf record -f -e syscalls:sys_enter_sched_get_priority_max -F 1 -a Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4AB1F4AB.7050205@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-18 00:30:25 +02:00
Li Zefan	6706ccf8e7	perf tools: Fix memory leak in read_ftrace_printk() get_tracing_file() should be paired with put_tracing_file(). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4AB1F48F.4070807@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-18 00:30:24 +02:00
Ingo Molnar	40749d0ff4	perf sched: Determine the number of CPUs automatically For 'perf sched map' output, determine max_cpu automatically, instead of the static default of 15. [ v2: use sysconf() pointed out by Arjan van de Ven <arjan@infradead.org> ] Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-17 22:09:19 +02:00
Peter Zijlstra	8b412664d0	perf record: Disable profiling before draining the buffer I noticed that perf-record continues profiling itself after the child terminated and we're draining the buffer. This can cause a _lot_ of overhead with --all recording - we keep and keep recording, which produces new and new events. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-17 22:08:27 +02:00
Ingo Molnar	0ec04e16d0	perf sched: Add 'perf sched map' scheduling event map printout This prints a textual context-switching outline of workload captured via perf sched record. For example, on a 16 CPU box it outputs: N1 O1 . . . S1 . . . B0 . I0 C1 . M1 . 23002.773423 secs N1 O1 . Q0 . S1 . . . B0 . I0 C1 . M1 . 23002.773423 secs N1 O1 . Q0 . S1 . . . B0 . R1 C1 . M1 . 23002.773485 secs N1 O1 . Q0 . S1 . S0 . B0 . R1 C1 . M1 . 23002.773478 secs L0 O1 . Q0 . S1 . S0 . B0 . R1 C1 . M1 . 23002.773523 secs L0 O1 . . . S1 . S0 . B0 . R1 C1 . M1 . 23002.773531 secs L0 O1 . . . S1 . S0 . B0 . R1 C1 T1 M1 . 23002.773547 secs T1 => irqbalance:2089 L0 O1 . . . S1 . S0 . P0 . R1 C1 T1 M1 . 23002.773549 secs N1 O1 . . . S1 . S0 . P0 . R1 C1 T1 M1 . 23002.773566 secs N1 O1 . . . J0 . S0 . P0 . R1 C1 T1 M1 . 23002.773571 secs N1 O1 . . . J0 . S0 B0 P0 . R1 C1 T1 M1 . 23002.773592 secs N1 O1 . . . J0 . U0 B0 P0 . R1 C1 T1 M1 . 23002.773582 secs N1 O1 . . . S1 . U0 B0 P0 . R1 C1 T1 M1 . 23002.773604 secs N1 O1 . . . S1 . U0 B0 . . R1 C1 T1 M1 . 23002.773615 secs N1 O1 . . . S1 . U0 B0 . . K0 C1 T1 M1 . 23002.773631 secs N1 O1 . M0 . S1 . U0 B0 . . K0 C1 T1 M1 . 23002.773624 secs N1 O1 . M0 . S1 . U0 . . . K0 C1 T1 M1 . 23002.773644 secs N1 O1 . M0 . S1 . U0 . . . R1 C1 T1 M1 . 23002.773662 secs N1 O1 . M0 . S1 . . . . . R1 C1 T1 M1 . 23002.773648 secs N1 O1 . . . S1 . . . . . R1 C1 T1 M1 . 23002.773680 secs N1 O1 . . . L0 . . . . . R1 C1 T1 M1 . 23002.773717 secs N0 O1 . . . L0 . . . . . R1 C1 T1 M1 . 23002.773709 secs N1 O1 . . . L0 . . . . . R1 C1 T1 M1 . 23002.773747 secs Columns stand for individual CPUs, from CPU0 to CPU15, and the two-letter shortcuts stand for tasks that are running on a CPU. '' denotes the CPU that had the event. A dot signals an idle CPU. New tasks are assigned new two-letter shortcuts - when they occur first they are printed. In the above example 'T1' stood for irqbalance: T1 => irqbalance:2089 Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-16 16:41:30 +02:00
Ingo Molnar	80ed0987f3	perf sched: Make idle thread and comm/pid names more consistent Peter noticed that we have 3 ways of referring to the idle thread: [idle]:0 swapper:0 swapper-0 Standardize on 'swapper:0'. Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-16 12:15:47 +02:00
Ingo Molnar	c8a3775104	perf sched: Sanity check context switch events Use 'perf sched latency' to track the current task based on context-switch events, and flag the cases where there's some impossible transition: such as a PID being switched out that was not switched in. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-16 12:08:28 +02:00
Ingo Molnar	dc02bf7178	perf sched: Account for lost events, increase default buffering Output such lost event and state machine weirdness stats: TOTAL: \| 14974.910 ms \| 46384 \| --------------------------------------------------- INFO: 8.865% lost events (19132 out of 215819, in 8 chunks) INFO: 0.198% state machine bugs (49 out of 24708) (due to lost events?) And increase buffering to -m 1024 (4 MB) by default. Since we use output multiplexing that kind of space is needed. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-16 11:48:05 +02:00
mingo	39aeb52f99	perf sched: Add support for sched:sched_stat_runtime events This allows more precise 'perf sched latency' output: --------------------------------------------------------------------------------------- Task \| Runtime ms \| Switches \| Average delay ms \| Maximum delay ms \| --------------------------------------------------------------------------------------- ksoftirqd/0-4 \| 0.010 ms \| 2 \| avg: 2.476 ms \| max: 2.977 ms \| perf-12328 \| 15.844 ms \| 66 \| avg: 1.118 ms \| max: 9.979 ms \| bdi-default-235 \| 0.009 ms \| 1 \| avg: 0.998 ms \| max: 0.998 ms \| events/1-8 \| 0.020 ms \| 2 \| avg: 0.998 ms \| max: 0.998 ms \| events/0-7 \| 0.018 ms \| 2 \| avg: 0.992 ms \| max: 0.996 ms \| sleep-12329 \| 0.742 ms \| 3 \| avg: 0.906 ms \| max: 2.289 ms \| sshd-12122 \| 0.163 ms \| 2 \| avg: 0.283 ms \| max: 0.562 ms \| loop-getpid-lon-12322 \| 1023.636 ms \| 69 \| avg: 0.208 ms \| max: 5.996 ms \| loop-getpid-lon-12321 \| 1038.638 ms \| 5 \| avg: 0.073 ms \| max: 0.171 ms \| migration/1-5 \| 0.000 ms \| 1 \| avg: 0.006 ms \| max: 0.006 ms \| --------------------------------------------------------------------------------------- TOTAL: \| 2079.078 ms \| 153 \| ------------------------------------------------- Also, streamline the code a bit more, add asserts for various state machine failures (they should be debugged if they occur) and fix a few odd ends. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-14 20:08:23 +02:00
mingo	08f69e6c2e	perf sched: Print PIDs too Often it's useful to know the PID of the task as well - print it out too. ( While at it, reformat the output to be a bit more paste-into-commit-logs friendly. ) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-14 18:34:57 +02:00
Ingo Molnar	d11533893b	perf sched: Fix 'perf sched latency' output on 32-bit systems Before: ----------------------------------------------------------------------------------- Task \| Runtime ms \| Switches \| Average delay ms \| Maximum delay ms \| ----------------------------------------------------------------------------------- perf \|4853313.251 ms \| 10 \| avg: 0.046 ms \| max: 0.337 ms \| flush-8:0 \|2426659.202 ms \| 5 \| avg: 0.015 ms \| max: 0.016 ms \| sleep \|485331.966 ms \| 1 \| avg: 0.012 ms \| max: 0.012 ms \| ksoftirqd/1 \|485331.320 ms \| 1 \| avg: 0.005 ms \| max: 0.005 ms \| ----------------------------------------------------------------------------------- TOTAL: \|8250635.739 ms \| 17 \| --------------------------------------------- After: ----------------------------------------------------------------------------------- Task \| Runtime ms \| Switches \| Average delay ms \| Maximum delay ms \| ----------------------------------------------------------------------------------- perf \| 0.206 ms \| 10 \| avg: 0.046 ms \| max: 0.337 ms \| flush-8:0 \| 2.680 ms \| 5 \| avg: 0.015 ms \| max: 0.016 ms \| sleep \| 0.662 ms \| 1 \| avg: 0.012 ms \| max: 0.012 ms \| ksoftirqd/1 \| 0.015 ms \| 1 \| avg: 0.005 ms \| max: 0.005 ms \| ----------------------------------------------------------------------------------- TOTAL: \| 3.563 ms \| 17 \| --------------------------------------------- Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-14 18:22:53 +02:00
Ingo Molnar	ea57c4f520	perf tools: Implement counter output multiplexing Finish the -M/--multiplex option implementation: - separate it out from group_fd - correctly set it via the ioctl and dont mmap counters that are multiplexed - modify the perf record event loop to deal with buffer-less counters. - remove the -g option from perf sched record - account for unordered events in perf sched latency - (add -f to perf sched record to ease measurements) - skip idle threads (pid==0) in latency output The result is better latency output by 'perf sched latency': ----------------------------------------------------------------------------------- Task \| Runtime ms \| Switches \| Average delay ms \| Maximum delay ms \| ----------------------------------------------------------------------------------- ksoftirqd/8 \| 0.071 ms \| 2 \| avg: 0.458 ms \| max: 0.913 ms \| at-spi-registry \| 0.609 ms \| 19 \| avg: 0.013 ms \| max: 0.023 ms \| perf \| 3.316 ms \| 16 \| avg: 0.013 ms \| max: 0.054 ms \| Xorg \| 0.392 ms \| 19 \| avg: 0.011 ms \| max: 0.018 ms \| sleep \| 0.537 ms \| 2 \| avg: 0.009 ms \| max: 0.009 ms \| ----------------------------------------------------------------------------------- TOTAL: \| 4.925 ms \| 58 \| --------------------------------------------- Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-14 15:45:11 +02:00
Frederic Weisbecker	aa1ab9d26a	perf tools: Fix processing of randomly serialized sched traces Currently it's possible to meet such too high latency results with 'perf sched latency'. ----------------------------------------------------------------------------------- Task \| Runtime ms \| Switches \| Average delay ms \| Maximum delay ms \| ----------------------------------------------------------------------------------- xfce4-panel \| 0.222 ms \| 2 \| avg: 4718.345 ms \| max: 9436.493 ms \| scsi_eh_3 \| 3.962 ms \| 36 \| avg: 55.957 ms \| max: 1977.829 ms \| The origin is on traces that are sometimes badly serialized across cpus. For example the raw traces that raised such results for xfce4-panel: (1) [init]-0 [000] 1494.663899990: sched_switch: task swapper:0 [140] (R) ==> xfce4-panel:4569 [120] (2) xfce4-panel-4569 [000] 1494.663928373: sched_switch: task xfce4-panel:4569 [120] (S) ==> swapper:0 [140] (3) Xorg-4276 [001] 1494.663860125: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (4) Xorg-4276 [001] 1504.098252756: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (5) perf-5219 [000] 1504.100353302: sched_switch: task perf:5219 [120] (S) ==> xfce4-panel:4569 [120] The traces are processed in the order they arrive. Then in (2), xfce4-panel sleeps, it is first waken up in (3) and eventually scheduled in (5). The latency reported is then 1504 - 1495 = 9 secs, as reported by perf sched. But this is wrong, we are confident in the fact the traces are nicely serialized while we should actually more trust the timestamps. If we reorder by timestamps we get: (1) Xorg-4276 [001] 1494.663860125: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (2) [init]-0 [000] 1494.663899990: sched_switch: task swapper:0 [140] (R) ==> xfce4-panel:4569 [120] (3) xfce4-panel-4569 [000] 1494.663928373: sched_switch: task xfce4-panel:4569 [120] (S) ==> swapper:0 [140] (4) Xorg-4276 [001] 1504.098252756: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (5) perf-5219 [000] 1504.100353302: sched_switch: task perf:5219 [120] (S) ==> xfce4-panel:4569 [120] Now the trace make more sense, xfce4-panel is sleeping. Then it is woken up in (1), scheduled in (2) It goes to sleep in (3), woken up in (4) and scheduled in (5). Now, latency captured between (1) and (2) is of 39 us. And between (4) and (5) it is 2.1 ms. Such pattern of bad serializing is the origin of the high latencies reported by perf sched. Basically, we need to check whether wake up time is higher than schedule out time. If it's not the case, we need to tag the current work atom as invalid. Beside that, we may need to work later on a better ordering of the traces given by the kernel. After this patch: xfce4-session \| 0.221 ms \| 1 \| avg: 0.538 ms \| max: 0.538 ms \| Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-14 15:45:10 +02:00
Frederic Weisbecker	d13025222c	perf tools: Add an option to multiplex counters in a single channel Add an option to multiplex counters output in the channel of the group leader, ie: the first counter opened: -M --multiplex The effect is better serialized samples. This is especially useful for tracepoint samples that need to be well serialized for their post-processing. Also make use of this option in 'perf sched'. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-14 09:52:23 +02:00
Ingo Molnar	c13f0d3c81	perf sched: Add 'perf sched trace', improve documentation Alias 'perf sched trace' to 'perf trace', for workflow completeness. Add a bit of documentation for perf sched. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-13 17:55:23 +02:00
Ingo Molnar	1fc35b29b4	perf sched: Implement the 'perf sched record' subcommand Implement the 'perf sched record' subcommand that adds a default list of events, turns on raw sampling and system-wide tracing and passes off the rest of the command to perf record. This is more convenient than having to specify the events all the time. Before: $ perf record -a -R -e sched:sched_switch:r -e sched:sched_stat_wait:r -e sched:sched_stat_sleep:r -e sched:sched_stat_iowait:r -e sched:sched_process_exit:r -e sched:sched_process_fork:r -e sched:sched_wakeup:r -e sched:sched_migrate_task:r -c 1 sleep 1 After: $ perf sched record -f sleep 1 Also fix an assumption in the event string parser that assumed that strings passed in can be modified. (In this case they wont be as they come from a readonly constant section.) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-13 10:22:51 +02:00
Ingo Molnar	b5fae128e4	perf sched: Clean up PID sorting logic Use a sort list for thread atoms insertion as well - instead of hardcoded for PID. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-13 10:22:50 +02:00
Ingo Molnar	b1ffe8f3e0	perf sched: Finish latency => atom rename and misc cleanups - Rename 'latency' field/variable names to the better 'atom' ones - Reduce the number of #include lines and consolidate them - Gather file scope variables at the top of the file - Remove unused bits No change in functionality. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-13 10:22:49 +02:00
Ingo Molnar	f2858d8ad9	perf sched: Add 'perf sched latency' and 'perf sched replay' Separate the option parsing cleanly and add two variants: - 'perf sched latency' (can be abbreviated via 'perf sched lat') - 'perf sched replay' (can be abbreviated via 'perf sched rep') Also add a repeat count option to replay and add a separation set of options for replay. Do the sorting setup only in the latency sub-command. Display separate help screens for 'perf sched' and 'perf sched replay -h' - i.e. further separation of the sub-commands. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-13 10:22:49 +02:00
Frederic Weisbecker	daa1d7a5ea	perf sched: Implement multidimensional sorting Implement multidimensional sorting on perf sched so that you can sort either by number of switches, latency average, latency maximum, runtime. perf sched -l -s avg,max (this is the default) ----------------------------------------------------------------------------------- Task \| Runtime ms \| Switches \| Average delay ms \| Maximum delay ms \| ----------------------------------------------------------------------------------- gnome-power-man \| 0.113 ms \| 1 \| avg: 4998.531 ms \| max: 4998.531 ms \| xfdesktop \| 1.190 ms \| 7 \| avg: 136.475 ms \| max: 940.933 ms \| xfce-mcs-manage \| 2.194 ms \| 22 \| avg: 38.534 ms \| max: 735.174 ms \| notification-da \| 2.749 ms \| 31 \| avg: 27.436 ms \| max: 731.791 ms \| xfce4-session \| 3.343 ms \| 28 \| avg: 26.796 ms \| max: 734.891 ms \| xfwm4 \| 3.159 ms \| 22 \| avg: 12.406 ms \| max: 241.333 ms \| xchat \| 42.789 ms \| 214 \| avg: 11.886 ms \| max: 100.349 ms \| xfce4-terminal \| 5.386 ms \| 22 \| avg: 11.414 ms \| max: 241.611 ms \| firefox \| 151.992 ms \| 123 \| avg: 9.543 ms \| max: 153.717 ms \| xfce4-panel \| 24.324 ms \| 47 \| avg: 8.189 ms \| max: 242.352 ms \| :5090 \| 6.932 ms \| 111 \| avg: 8.131 ms \| max: 102.665 ms \| events/0 \| 0.758 ms \| 12 \| avg: 1.964 ms \| max: 21.879 ms \| Xorg \| 280.558 ms \| 340 \| avg: 1.864 ms \| max: 99.526 ms \| geany \| 63.391 ms \| 295 \| avg: 1.099 ms \| max: 9.334 ms \| reiserfs/0 \| 0.039 ms \| 2 \| avg: 0.854 ms \| max: 1.487 ms \| kondemand/0 \| 8.251 ms \| 245 \| avg: 0.691 ms \| max: 34.372 ms \| Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-13 10:22:48 +02:00
Frederic Weisbecker	7362262687	perf sched: Fix nsec to msec conversion We are dividing a time in ns by 1e9. This is a nsec to sec conversion. What we want is msecs. Fix it by dividing by 1e6. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-13 10:22:48 +02:00
Frederic Weisbecker	66685678a0	perf sched: Export the total, max latency and total runtime to thread atoms list Add a field in the thread atom list that keeps track of the total and max latencies and also the total runtime. This makes a faster output and also prepares for sorting. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-13 10:22:47 +02:00

1 2 3 4 5 ...

310 commits