ftrace: update txt document
Impact: Documentation update only A lot of changes have gone into ftrace. This patch updates the ftrace.txt document. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
This commit is contained in:
parent
457d2ee225
commit
9b803c0fc3
1 changed files with 71 additions and 96 deletions
|
@ -50,26 +50,26 @@ of ftrace. Here is a list of some of the key files:
|
|||
|
||||
Note: all time values are in microseconds.
|
||||
|
||||
current_tracer : This is used to set or display the current tracer
|
||||
current_tracer: This is used to set or display the current tracer
|
||||
that is configured.
|
||||
|
||||
available_tracers : This holds the different types of tracers that
|
||||
available_tracers: This holds the different types of tracers that
|
||||
have been compiled into the kernel. The tracers
|
||||
listed here can be configured by echoing their name
|
||||
into current_tracer.
|
||||
|
||||
tracing_enabled : This sets or displays whether the current_tracer
|
||||
tracing_enabled: This sets or displays whether the current_tracer
|
||||
is activated and tracing or not. Echo 0 into this
|
||||
file to disable the tracer or 1 to enable it.
|
||||
|
||||
trace : This file holds the output of the trace in a human readable
|
||||
trace: This file holds the output of the trace in a human readable
|
||||
format (described below).
|
||||
|
||||
latency_trace : This file shows the same trace but the information
|
||||
latency_trace: This file shows the same trace but the information
|
||||
is organized more to display possible latencies
|
||||
in the system (described below).
|
||||
|
||||
trace_pipe : The output is the same as the "trace" file but this
|
||||
trace_pipe: The output is the same as the "trace" file but this
|
||||
file is meant to be streamed with live tracing.
|
||||
Reads from this file will block until new data
|
||||
is retrieved. Unlike the "trace" and "latency_trace"
|
||||
|
@ -82,11 +82,11 @@ of ftrace. Here is a list of some of the key files:
|
|||
tracer is not adding more data, they will display
|
||||
the same information every time they are read.
|
||||
|
||||
iter_ctrl : This file lets the user control the amount of data
|
||||
iter_ctrl: This file lets the user control the amount of data
|
||||
that is displayed in one of the above output
|
||||
files.
|
||||
|
||||
trace_max_latency : Some of the tracers record the max latency.
|
||||
trace_max_latency: Some of the tracers record the max latency.
|
||||
For example, the time interrupts are disabled.
|
||||
This time is saved in this file. The max trace
|
||||
will also be stored, and displayed by either
|
||||
|
@ -94,29 +94,26 @@ of ftrace. Here is a list of some of the key files:
|
|||
only be recorded if the latency is greater than
|
||||
the value in this file. (in microseconds)
|
||||
|
||||
trace_entries : This sets or displays the number of trace
|
||||
entries each CPU buffer can hold. The tracer buffers
|
||||
are the same size for each CPU. The displayed number
|
||||
is the size of the CPU buffer and not total size. The
|
||||
trace_entries: This sets or displays the number of bytes each CPU
|
||||
buffer can hold. The tracer buffers are the same size
|
||||
for each CPU. The displayed number is the size of the
|
||||
CPU buffer and not total size of all buffers. The
|
||||
trace buffers are allocated in pages (blocks of memory
|
||||
that the kernel uses for allocation, usually 4 KB in size).
|
||||
Since each entry is smaller than a page, if the last
|
||||
allocated page has room for more entries than were
|
||||
requested, the rest of the page is used to allocate
|
||||
entries.
|
||||
If the last page allocated has room for more bytes
|
||||
than requested, the rest of the page will be used,
|
||||
making the actual allocation bigger than requested.
|
||||
(Note, the size may not be a multiple of the page size due
|
||||
to buffer managment overhead.)
|
||||
|
||||
This can only be updated when the current_tracer
|
||||
is set to "none".
|
||||
is set to "nop".
|
||||
|
||||
NOTE: It is planned on changing the allocated buffers
|
||||
from being the number of possible CPUS to
|
||||
the number of online CPUS.
|
||||
|
||||
tracing_cpumask : This is a mask that lets the user only trace
|
||||
tracing_cpumask: This is a mask that lets the user only trace
|
||||
on specified CPUS. The format is a hex string
|
||||
representing the CPUS.
|
||||
|
||||
set_ftrace_filter : When dynamic ftrace is configured in (see the
|
||||
set_ftrace_filter: When dynamic ftrace is configured in (see the
|
||||
section below "dynamic ftrace"), the code is dynamically
|
||||
modified (code text rewrite) to disable calling of the
|
||||
function profiler (mcount). This lets tracing be configured
|
||||
|
@ -130,14 +127,11 @@ of ftrace. Here is a list of some of the key files:
|
|||
be traced. If a function exists in both set_ftrace_filter
|
||||
and set_ftrace_notrace, the function will _not_ be traced.
|
||||
|
||||
available_filter_functions : When a function is encountered the first
|
||||
time by the dynamic tracer, it is recorded and
|
||||
later the call is converted into a nop. This file
|
||||
lists the functions that have been recorded
|
||||
by the dynamic tracer and these functions can
|
||||
be used to set the ftrace filter by the above
|
||||
"set_ftrace_filter" file. (See the section "dynamic ftrace"
|
||||
below for more details).
|
||||
available_filter_functions: This lists the functions that ftrace
|
||||
has processed and can trace. These are the function
|
||||
names that you can pass to "set_ftrace_filter" or
|
||||
"set_ftrace_notrace". (See the section "dynamic ftrace"
|
||||
below for more details.)
|
||||
|
||||
|
||||
The Tracers
|
||||
|
@ -145,7 +139,7 @@ The Tracers
|
|||
|
||||
Here is the list of current tracers that may be configured.
|
||||
|
||||
ftrace - function tracer that uses mcount to trace all functions.
|
||||
function - function tracer that uses mcount to trace all functions.
|
||||
|
||||
sched_switch - traces the context switches between tasks.
|
||||
|
||||
|
@ -166,8 +160,8 @@ Here is the list of current tracers that may be configured.
|
|||
the highest priority task to get scheduled after
|
||||
it has been woken up.
|
||||
|
||||
none - This is not a tracer. To remove all tracers from tracing
|
||||
simply echo "none" into current_tracer.
|
||||
nop - This is not a tracer. To remove all tracers from tracing
|
||||
simply echo "nop" into current_tracer.
|
||||
|
||||
|
||||
Examples of using the tracer
|
||||
|
@ -182,7 +176,7 @@ Output format:
|
|||
Here is an example of the output format of the file "trace"
|
||||
|
||||
--------
|
||||
# tracer: ftrace
|
||||
# tracer: function
|
||||
#
|
||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||
# | | | | |
|
||||
|
@ -192,7 +186,7 @@ Here is an example of the output format of the file "trace"
|
|||
--------
|
||||
|
||||
A header is printed with the tracer name that is represented by the trace.
|
||||
In this case the tracer is "ftrace". Then a header showing the format. Task
|
||||
In this case the tracer is "function". Then a header showing the format. Task
|
||||
name "bash", the task PID "4251", the CPU that it was running on
|
||||
"01", the timestamp in <secs>.<usecs> format, the function name that was
|
||||
traced "path_put" and the parent function that called this function
|
||||
|
@ -1003,22 +997,20 @@ is the stack for the hard interrupt. This hides the fact that NEED_RESCHED
|
|||
has been set. We do not see the 'N' until we switch back to the task's
|
||||
assigned stack.
|
||||
|
||||
ftrace
|
||||
------
|
||||
function
|
||||
--------
|
||||
|
||||
ftrace is not only the name of the tracing infrastructure, but it
|
||||
is also a name of one of the tracers. The tracer is the function
|
||||
tracer. Enabling the function tracer can be done from the
|
||||
debug file system. Make sure the ftrace_enabled is set otherwise
|
||||
this tracer is a nop.
|
||||
This tracer is the function tracer. Enabling the function tracer
|
||||
can be done from the debug file system. Make sure the ftrace_enabled is
|
||||
set; otherwise this tracer is a nop.
|
||||
|
||||
# sysctl kernel.ftrace_enabled=1
|
||||
# echo ftrace > /debug/tracing/current_tracer
|
||||
# echo function > /debug/tracing/current_tracer
|
||||
# echo 1 > /debug/tracing/tracing_enabled
|
||||
# usleep 1
|
||||
# echo 0 > /debug/tracing/tracing_enabled
|
||||
# cat /debug/tracing/trace
|
||||
# tracer: ftrace
|
||||
# tracer: function
|
||||
#
|
||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||
# | | | | |
|
||||
|
@ -1040,10 +1032,10 @@ this tracer is a nop.
|
|||
[...]
|
||||
|
||||
|
||||
Note: ftrace uses ring buffers to store the above entries. The newest data
|
||||
may overwrite the oldest data. Sometimes using echo to stop the trace
|
||||
is not sufficient because the tracing could have overwritten the data
|
||||
that you wanted to record. For this reason, it is sometimes better to
|
||||
Note: function tracer uses ring buffers to store the above entries.
|
||||
The newest data may overwrite the oldest data. Sometimes using echo to
|
||||
stop the trace is not sufficient because the tracing could have overwritten
|
||||
the data that you wanted to record. For this reason, it is sometimes better to
|
||||
disable tracing directly from a program. This allows you to stop the
|
||||
tracing at the point that you hit the part that you are interested in.
|
||||
To disable the tracing directly from a C program, something like following
|
||||
|
@ -1077,18 +1069,31 @@ every kernel function, produced by the -pg switch in gcc), starts
|
|||
of pointing to a simple return. (Enabling FTRACE will include the
|
||||
-pg switch in the compiling of the kernel.)
|
||||
|
||||
When dynamic ftrace is initialized, it calls kstop_machine to make
|
||||
the machine act like a uniprocessor so that it can freely modify code
|
||||
without worrying about other processors executing that same code. At
|
||||
initialization, the mcount calls are changed to call a "record_ip"
|
||||
function. After this, the first time a kernel function is called,
|
||||
it has the calling address saved in a hash table.
|
||||
At compile time every C file object is run through the
|
||||
recordmcount.pl script (located in the scripts directory). This
|
||||
script will process the C object using objdump to find all the
|
||||
locations in the .text section that call mcount. (Note, only
|
||||
the .text section is processed, since processing other sections
|
||||
like .init.text may cause races due to those sections being freed).
|
||||
|
||||
Later on the ftraced kernel thread is awoken and will again call
|
||||
kstop_machine if new functions have been recorded. The ftraced thread
|
||||
will change all calls to mcount to "nop". Just calling mcount
|
||||
and having mcount return has shown a 10% overhead. By converting
|
||||
it to a nop, there is no measurable overhead to the system.
|
||||
A new section called "__mcount_loc" is created that holds references
|
||||
to all the mcount call sites in the .text section. This section is
|
||||
compiled back into the original object. The final linker will add
|
||||
all these references into a single table.
|
||||
|
||||
On boot up, before SMP is initialized, the dynamic ftrace code
|
||||
scans this table and updates all the locations into nops. It also
|
||||
records the locations, which are added to the available_filter_functions
|
||||
list. Modules are processed as they are loaded and before they are
|
||||
executed. When a module is unloaded, it also removes its functions from
|
||||
the ftrace function list. This is automatic in the module unload
|
||||
code, and the module author does not need to worry about it.
|
||||
|
||||
When tracing is enabled, kstop_machine is called to prevent races
|
||||
with the CPUS executing code being modified (which can cause the
|
||||
CPU to do undesireable things), and the nops are patched back
|
||||
to calls. But this time, they do not call mcount (which is just
|
||||
a function stub). They now call into the ftrace infrastructure.
|
||||
|
||||
One special side-effect to the recording of the functions being
|
||||
traced is that we can now selectively choose which functions we
|
||||
|
@ -1251,36 +1256,6 @@ Produces:
|
|||
|
||||
We can see that there's no more lock or preempt tracing.
|
||||
|
||||
ftraced
|
||||
-------
|
||||
|
||||
As mentioned above, when dynamic ftrace is configured in, a kernel
|
||||
thread wakes up once a second and checks to see if there are mcount
|
||||
calls that need to be converted into nops. If there are not any, then
|
||||
it simply goes back to sleep. But if there are some, it will call
|
||||
kstop_machine to convert the calls to nops.
|
||||
|
||||
There may be a case in which you do not want this added latency.
|
||||
Perhaps you are doing some audio recording and this activity might
|
||||
cause skips in the playback. There is an interface to disable
|
||||
and enable the "ftraced" kernel thread.
|
||||
|
||||
# echo 0 > /debug/tracing/ftraced_enabled
|
||||
|
||||
This will disable the calling of kstop_machine to update the
|
||||
mcount calls to nops. Remember that there is a large overhead
|
||||
to calling mcount. Without this kernel thread, that overhead will
|
||||
exist.
|
||||
|
||||
If there are recorded calls to mcount, any write to the ftraced_enabled
|
||||
file will cause the kstop_machine to run. This means that a
|
||||
user can manually perform the updates when they want to by simply
|
||||
echoing a '0' into the ftraced_enabled file.
|
||||
|
||||
The updates are also done at the beginning of enabling a tracer
|
||||
that uses ftrace function recording.
|
||||
|
||||
|
||||
trace_pipe
|
||||
----------
|
||||
|
||||
|
@ -1289,14 +1264,14 @@ on the tracing is different. Every read from trace_pipe is consumed.
|
|||
This means that subsequent reads will be different. The trace
|
||||
is live.
|
||||
|
||||
# echo ftrace > /debug/tracing/current_tracer
|
||||
# echo function > /debug/tracing/current_tracer
|
||||
# cat /debug/tracing/trace_pipe > /tmp/trace.out &
|
||||
[1] 4153
|
||||
# echo 1 > /debug/tracing/tracing_enabled
|
||||
# usleep 1
|
||||
# echo 0 > /debug/tracing/tracing_enabled
|
||||
# cat /debug/tracing/trace
|
||||
# tracer: ftrace
|
||||
# tracer: function
|
||||
#
|
||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||
# | | | | |
|
||||
|
@ -1317,7 +1292,7 @@ is live.
|
|||
|
||||
Note, reading the trace_pipe file will block until more input is added.
|
||||
By changing the tracer, trace_pipe will issue an EOF. We needed
|
||||
to set the ftrace tracer _before_ cating the trace_pipe file.
|
||||
to set the function tracer _before_ we "cat" the trace_pipe file.
|
||||
|
||||
|
||||
trace entries
|
||||
|
@ -1334,10 +1309,10 @@ number of entries.
|
|||
65620
|
||||
|
||||
Note, to modify this, you must have tracing completely disabled. To do that,
|
||||
echo "none" into the current_tracer. If the current_tracer is not set
|
||||
to "none", an EINVAL error will be returned.
|
||||
echo "nop" into the current_tracer. If the current_tracer is not set
|
||||
to "nop", an EINVAL error will be returned.
|
||||
|
||||
# echo none > /debug/tracing/current_tracer
|
||||
# echo nop > /debug/tracing/current_tracer
|
||||
# echo 100000 > /debug/tracing/trace_entries
|
||||
# cat /debug/tracing/trace_entries
|
||||
100045
|
||||
|
|
Loading…
Reference in a new issue