Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
  tracing, net: fix net tree and tracing tree merge interaction
  tracing, powerpc: fix powerpc tree and tracing tree interaction
  ring-buffer: do not remove reader page from list on ring buffer free
  function-graph: allow unregistering twice
  trace: make argument 'mem' of trace_seq_putmem() const
  tracing: add missing 'extern' keywords to trace_output.h
  tracing: provide trace_seq_reserve()
  blktrace: print out BLK_TN_MESSAGE properly
  blktrace: extract duplidate code
  blktrace: fix memory leak when freeing struct blk_io_trace
  blktrace: fix blk_probes_ref chaos
  blktrace: make classic output more classic
  blktrace: fix off-by-one bug
  blktrace: fix the original blktrace
  blktrace: fix a race when creating blk_tree_root in debugfs
  blktrace: fix timestamp in binary output
  tracing, Text Edit Lock: cleanup
  tracing: filter fix for TRACE_EVENT_FORMAT events
  ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
  x86: kretprobe-booster interrupt emulation code fix
  ...

Fix up trivial conflicts in
 arch/parisc/include/asm/ftrace.h
 include/linux/memory.h
 kernel/extable.c
 kernel/module.c
Linus Torvalds 2009-04-05 11:04:19 -07:00
commit 714f83d5d9
145 changed files with 13836 additions and 3480 deletions


@ -0,0 +1,71 @@
What: /sys/kernel/debug/kmemtrace/
Date: July 2008
Contact: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Description:
In kmemtrace-enabled kernels, the following files are created:
/sys/kernel/debug/kmemtrace/
cpu<n> (0400) Per-CPU tracing data, see below. (binary)
total_overruns (0400) Total number of bytes which were dropped from
cpu<n> files because of full buffer condition,
non-binary. (text)
abi_version (0400) Kernel's kmemtrace ABI version. (text)
Each per-CPU file should be read according to the relay interface. That is,
the reader should set affinity to that specific CPU and, as currently done by
the userspace application (though there are other methods), use poll() with
an infinite timeout before every read(). Otherwise, erroneous data may be
read. The binary data has the following _core_ format:
Event ID (1 byte) Unsigned integer, one of:
0 - represents an allocation (KMEMTRACE_EVENT_ALLOC)
1 - represents a freeing of previously allocated memory
(KMEMTRACE_EVENT_FREE)
Type ID (1 byte) Unsigned integer, one of:
0 - this is a kmalloc() / kfree()
1 - this is a kmem_cache_alloc() / kmem_cache_free()
2 - this is a __get_free_pages() et al.
Event size (2 bytes) Unsigned integer representing the
size of this event. Used to extend
kmemtrace. Discard the bytes you
don't know about.
Sequence number (4 bytes) Signed integer used to reorder data
logged on SMP machines. Wraparound
must be taken into account, although
it is unlikely.
Caller address (8 bytes) Return address to the caller.
Pointer to mem (8 bytes) Pointer to target memory area. Can be
NULL, but not all such calls might be
recorded.
In case of KMEMTRACE_EVENT_ALLOC events, the next fields follow:
Requested bytes (8 bytes) Total number of requested bytes,
unsigned, must not be zero.
Allocated bytes (8 bytes) Total number of actually allocated
bytes, unsigned, must not be lower
than requested bytes.
Requested flags (4 bytes) GFP flags supplied by the caller.
Target CPU (4 bytes) Signed integer, valid for event id 1.
If equal to -1, target CPU is the same
as origin CPU, but the reverse might
not be true.
The data is made available in the same endianness the machine has.
Other event ids and type ids may be defined and added. Other fields may be
added by increasing event size, but see below for details.
Every modification to the ABI, including new id definitions, are followed
by bumping the ABI version by one.
Adding new data to the packet (features) is done at the end of the mandatory
data:
Feature size (2 byte)
Feature ID (1 byte)
Feature data (Feature size - 3 bytes)
Users:
kmemtrace-user - git://repo.or.cz/kmemtrace-user.git
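The core record layout above maps naturally onto a pair of C structures. The sketch below is editorial and only illustrative: the struct and field names are invented here, the ABI is defined solely by the byte sizes and ordering listed above, records are written in the machine's native endianness, and unknown trailing bytes must be skipped using the event size field.

#include <stdint.h>

/* Core header, present in every event (editorial names). */
struct kmemtrace_event_hdr {
	uint8_t		event_id;	/* 0 = KMEMTRACE_EVENT_ALLOC, 1 = KMEMTRACE_EVENT_FREE */
	uint8_t		type_id;	/* 0 = kmalloc()/kfree(), 1 = kmem_cache_*(), 2 = __get_free_pages() et al. */
	uint16_t	event_size;	/* total size of this event; discard bytes you don't know about */
	int32_t		seq;		/* sequence number for reordering SMP data; wraparound possible */
	uint64_t	call_site;	/* return address to the caller */
	uint64_t	ptr;		/* pointer to target memory area; can be 0 */
};

/* Additional fields that follow the header for ALLOC events. */
struct kmemtrace_alloc_fields {
	uint64_t	bytes_req;	/* requested bytes, never zero */
	uint64_t	bytes_alloc;	/* actually allocated bytes, never lower than bytes_req */
	uint32_t	gfp_flags;	/* GFP flags supplied by the caller */
	int32_t		target_cpu;	/* -1 means target CPU is the same as origin CPU */
};

With these field widths both structures are naturally packed (24 bytes each) on common ABIs, but a careful reader should rely on the documented sizes and order rather than sizeof().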

File diff suppressed because it is too large.


@ -50,6 +50,7 @@ parameter is applicable:
	ISAPNP	ISA PnP code is enabled.
	ISDN	Appropriate ISDN support is enabled.
	JOY	Appropriate joystick support is enabled.
+	KMEMTRACE kmemtrace is enabled.
	LIBATA	Libata driver is enabled
	LP	Printer support is enabled.
	LOOP	Loopback device support is enabled.
@ -1081,6 +1082,15 @@ and is between 256 and 4096 characters. It is defined in the file
			use the HighMem zone if it exists, and the Normal
			zone if it does not.
+	kmemtrace.enable=	[KNL,KMEMTRACE] Format: { yes | no }
+			Controls whether kmemtrace is enabled
+			at boot-time.
+	kmemtrace.subbufs=n	[KNL,KMEMTRACE] Overrides the number of
+			subbufs kmemtrace's relay channel has. Set this
+			higher than default (KMEMTRACE_N_SUBBUFS in code) if
+			you experience buffer overruns.
	movablecore=nn[KMG]	[KNL,X86-32,IA-64,PPC,X86-64] This parameter
			is similar to kernelcore except it specifies the
			amount of memory used for migratable allocations.
@ -2367,6 +2377,8 @@ and is between 256 and 4096 characters. It is defined in the file
	tp720=		[HW,PS2]
+	trace_buf_size=nn[KMG]	[ftrace] will set tracing buffer size.
	trix=		[HW,OSS] MediaTrix AudioTrix Pro
			Format:
			<io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq>
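The new parameters above can be combined on the kernel command line; a purely illustrative example (the values are arbitrary):

	kmemtrace.enable=yes kmemtrace.subbufs=8192 trace_buf_size=16M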


@ -115,6 +115,8 @@ On all - write a character to /proc/sysrq-trigger. e.g.:
'x'	- Used by xmon interface on ppc/powerpc platforms.
+'z'	- Dump the ftrace buffer
'0'-'9' - Sets the console log level, controlling which kernel messages
	  will be printed to your console. ('0', for example would make
	  it so that only emergency messages like PANICs or OOPSes would
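For instance, following the /proc/sysrq-trigger method quoted in the hunk header above, the new key can be exercised from a root shell (illustrative):

	echo z > /proc/sysrq-trigger
	dmesg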


@ -45,8 +45,8 @@ In include/trace/subsys.h :
#include <linux/tracepoint.h>
DECLARE_TRACE(subsys_eventname,
-	TPPROTO(int firstarg, struct task_struct *p),
-	TPARGS(firstarg, p));
+	TP_PROTO(int firstarg, struct task_struct *p),
+	TP_ARGS(firstarg, p));
In subsys/file.c (where the tracing statement must be added) :
@ -66,10 +66,10 @@ Where :
- subsys is the name of your subsystem.
- eventname is the name of the event to trace.
-- TPPROTO(int firstarg, struct task_struct *p) is the prototype of the
+- TP_PROTO(int firstarg, struct task_struct *p) is the prototype of the
  function called by this tracepoint.
-- TPARGS(firstarg, p) are the parameters names, same as found in the
+- TP_ARGS(firstarg, p) are the parameters names, same as found in the
  prototype.
Connecting a function (probe) to a tracepoint is done by providing a
@ -103,13 +103,14 @@ used to export the defined tracepoints.
* Probe / tracepoint example
-See the example provided in samples/tracepoints/src
+See the example provided in samples/tracepoints
-Compile them with your kernel.
+Compile them with your kernel. They are built during 'make' (not
+'make modules') when CONFIG_SAMPLE_TRACEPOINTS=m.
Run, as root :
-modprobe tracepoint-example (insmod order is not important)
-modprobe tracepoint-probe-example
+modprobe tracepoint-sample (insmod order is not important)
+modprobe tracepoint-probe-sample
-cat /proc/tracepoint-example (returns an expected error)
-rmmod tracepoint-example tracepoint-probe-example
+cat /proc/tracepoint-sample (returns an expected error)
+rmmod tracepoint-sample tracepoint-probe-sample
dmesg
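To make the renames above concrete, here is a small editorial sketch of the updated usage. The subsys/eventname names are placeholders exactly as in the document; the DEFINE_TRACE() and register_trace_*()/unregister_trace_*() calls are the companion pieces of this API as recalled by the editor, not quoted from this hunk.

/* include/trace/subsys.h -- declare the tracepoint with the new macro names */
#include <linux/tracepoint.h>

DECLARE_TRACE(subsys_eventname,
	TP_PROTO(int firstarg, struct task_struct *p),
	TP_ARGS(firstarg, p));

/* subsys/file.c -- instantiate the tracepoint and fire it in the traced code */
#include <trace/subsys.h>

DEFINE_TRACE(subsys_eventname);

void somefct(void)
{
	/* ... */
	trace_subsys_eventname(arg, task);
	/* ... */
}

/* A probe module then attaches and detaches with:
 *	register_trace_subsys_eventname(probe_func);
 *	unregister_trace_subsys_eventname(probe_func);
 */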


@ -0,0 +1,126 @@
kmemtrace - Kernel Memory Tracer
by Eduard - Gabriel Munteanu
<eduard.munteanu@linux360.ro>
I. Introduction
===============
kmemtrace helps kernel developers figure out two things:
1) how different allocators (SLAB, SLUB etc.) perform
2) how kernel code allocates memory and how much
To do this, we trace every allocation and export information to the userspace
through the relay interface. We export things such as the number of requested
bytes, the number of bytes actually allocated (i.e. including internal
fragmentation), whether this is a slab allocation or a plain kmalloc() and so
on.
The actual analysis is performed by a userspace tool (see section III for
details on where to get it from). It logs the data exported by the kernel,
processes it and (as of writing this) can provide the following information:
- the total amount of memory allocated and fragmentation per call-site
- the amount of memory allocated and fragmentation per allocation
- total memory allocated and fragmentation in the collected dataset
- number of cross-CPU allocation and frees (makes sense in NUMA environments)
Moreover, it can potentially find inconsistent and erroneous behavior in
kernel code, such as using slab free functions on kmalloc'ed memory or
allocating less memory than requested (but not truly failed allocations).
kmemtrace also makes provisions for tracing on some arch and analysing the
data on another.
II. Design and goals
====================
kmemtrace was designed to handle rather large amounts of data. Thus, it uses
the relay interface to export whatever is logged to userspace, which then
stores it. Analysis and reporting is done asynchronously, that is, after the
data is collected and stored. By design, it allows one to log and analyse
on different machines and different arches.
As of writing this, the ABI is not considered stable, though it might not
change much. However, no guarantees are made about compatibility yet. When
deemed stable, the ABI should still allow easy extension while maintaining
backward compatibility. This is described further in Documentation/ABI.
Summary of design goals:
- allow logging and analysis to be done across different machines
- be fast and anticipate usage in high-load environments (*)
- be reasonably extensible
- make it possible for GNU/Linux distributions to have kmemtrace
included in their repositories
(*) - one of the reasons Pekka Enberg's original userspace data analysis
tool's code was rewritten from Perl to C (although this is more than a
simple conversion)
III. Quick usage guide
======================
1) Get a kernel that supports kmemtrace and build it accordingly (i.e. enable
CONFIG_KMEMTRACE).
2) Get the userspace tool and build it:
$ git-clone git://repo.or.cz/kmemtrace-user.git # current repository
$ cd kmemtrace-user/
$ ./autogen.sh
$ ./configure
$ make
3) Boot the kmemtrace-enabled kernel if you haven't, preferably in the
'single' runlevel (so that relay buffers don't fill up easily), and run
kmemtrace:
# '$' does not mean user, but root here.
$ mount -t debugfs none /sys/kernel/debug
$ mount -t proc none /proc
$ cd path/to/kmemtrace-user/
$ ./kmemtraced
Wait a bit, then stop it with CTRL+C.
$ cat /sys/kernel/debug/kmemtrace/total_overruns # Check if we didn't
# overrun, should
# be zero.
$ (Optionally) [Run kmemtrace_check separately on each cpu[0-9]*.out file to
check its correctness]
$ ./kmemtrace-report
Now you should have a nice and short summary of how the allocator performs.
IV. FAQ and known issues
========================
Q: 'cat /sys/kernel/debug/kmemtrace/total_overruns' is non-zero, how do I fix
this? Should I worry?
A: If it's non-zero, this affects kmemtrace's accuracy, depending on how
large the number is. You can fix it by supplying a higher
'kmemtrace.subbufs=N' kernel parameter.
---
Q: kmemtrace_check reports errors, how do I fix this? Should I worry?
A: This is a bug and should be reported. It can occur for a variety of
reasons:
- possible bugs in relay code
- possible misuse of relay by kmemtrace
- timestamps being collected unorderly
Or you may fix it yourself and send us a patch.
---
Q: kmemtrace_report shows many errors, how do I fix this? Should I worry?
A: This is a known issue and I'm working on it. These might be true errors
in kernel code, which may have inconsistent behavior (e.g. allocating memory
with kmem_cache_alloc() and freeing it with kfree()). Pekka Enberg pointed
out this behavior may work with SLAB, but may fail with other allocators.
It may also be due to lack of tracing in some unusual allocator functions.
We don't want bug reports regarding this issue yet.
---
V. See also
===========
Documentation/kernel-parameters.txt
Documentation/ABI/testing/debugfs-kmemtrace
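As a companion to the quick usage guide, the sketch below shows the reading scheme the ABI document describes for a single per-CPU file: pin the reader to that CPU, poll() with an infinite timeout, then read(). It is illustrative only; kmemtraced is the real implementation, and error handling, CPU iteration and packet decoding are omitted.

#define _GNU_SOURCE
#include <fcntl.h>
#include <poll.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	cpu_set_t set;
	struct pollfd pfd;
	char buf[4096];
	ssize_t n;

	CPU_ZERO(&set);
	CPU_SET(0, &set);			/* reading cpu0, so run on CPU 0 */
	sched_setaffinity(0, sizeof(set), &set);

	pfd.fd = open("/sys/kernel/debug/kmemtrace/cpu0", O_RDONLY);
	if (pfd.fd < 0)
		return 1;
	pfd.events = POLLIN;

	while (poll(&pfd, 1, -1) > 0) {		/* infinite timeout before every read() */
		n = read(pfd.fd, buf, sizeof(buf));
		if (n <= 0)
			break;
		fwrite(buf, 1, n, stdout);	/* raw binary records */
	}
	close(pfd.fd);
	return 0;
}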


@ -2654,6 +2654,12 @@ M: jason.wessel@windriver.com
L:	kgdb-bugreport@lists.sourceforge.net
S:	Maintained
+KMEMTRACE
+P:	Eduard - Gabriel Munteanu
+M:	eduard.munteanu@linux360.ro
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
KPROBES
P:	Ananth N Mavinakayanahalli
M:	ananth@in.ibm.com


@ -6,6 +6,7 @@ config OPROFILE
	tristate "OProfile system profiling (EXPERIMENTAL)"
	depends on PROFILING
	depends on HAVE_OPROFILE
+	depends on TRACING_SUPPORT
	select TRACING
	select RING_BUFFER
	help


@ -0,0 +1 @@
/* empty */


@ -14,17 +14,4 @@ typedef struct {
void ack_bad_irq(unsigned int irq);
-#define HARDIRQ_BITS	12
-/*
- * The hardirq mask has to be large enough to have
- * space for potentially nestable IRQ sources in the system
- * to nest on a single CPU. On Alpha, interrupts are masked at the CPU
- * by IPL as well as at the system level. We only have 8 IPLs (UNIX PALcode)
- * so we really only have 8 nestable IRQs, but allow some overhead
- */
-#if (1 << HARDIRQ_BITS) < 16
-#error HARDIRQ_BITS is too low!
-#endif
#endif /* _ALPHA_HARDIRQ_H */


@ -0,0 +1 @@
/* empty */


@ -20,15 +20,4 @@ void ack_bad_irq(unsigned int irq);
#endif /* __ASSEMBLY__ */
-#define HARDIRQ_BITS	12
-/*
- * The hardirq mask has to be large enough to have
- * space for potentially all IRQ sources in the system
- * nesting on a single CPU:
- */
-#if (1 << HARDIRQ_BITS) < NR_IRQS
-# error HARDIRQ_BITS is too low!
-#endif
#endif /* __ASM_AVR32_HARDIRQ_H */


@ -0,0 +1 @@
/* empty */


@ -0,0 +1 @@
/* empty */


@ -0,0 +1 @@
/* empty */


@ -22,6 +22,9 @@ config IA64
	select HAVE_OPROFILE
	select HAVE_KPROBES
	select HAVE_KRETPROBES
+	select HAVE_FTRACE_MCOUNT_RECORD
+	select HAVE_DYNAMIC_FTRACE if (!ITANIUM)
+	select HAVE_FUNCTION_TRACER
	select HAVE_DMA_ATTRS
	select HAVE_KVM
	select HAVE_ARCH_TRACEHOOK


@ -0,0 +1,28 @@
#ifndef _ASM_IA64_FTRACE_H
#define _ASM_IA64_FTRACE_H
#ifdef CONFIG_FUNCTION_TRACER
#define MCOUNT_INSN_SIZE 32 /* sizeof mcount call */
#ifndef __ASSEMBLY__
extern void _mcount(unsigned long pfs, unsigned long r1, unsigned long b0, unsigned long r0);
#define mcount _mcount
#include <asm/kprobes.h>
/* In IA64, MCOUNT_ADDR is set in link time, so it's not a constant at compile time */
#define MCOUNT_ADDR (((struct fnptr *)mcount)->ip)
#define FTRACE_ADDR (((struct fnptr *)ftrace_caller)->ip)
static inline unsigned long ftrace_call_adjust(unsigned long addr)
{
/* second bundle, insn 2 */
return addr - 0x12;
}
struct dyn_arch_ftrace {
};
#endif
#endif /* CONFIG_FUNCTION_TRACER */
#endif /* _ASM_IA64_FTRACE_H */


@ -20,16 +20,6 @@
#define local_softirq_pending()		(local_cpu_data->softirq_pending)
-#define HARDIRQ_BITS	14
-/*
- * The hardirq mask has to be large enough to have space for potentially all IRQ sources
- * in the system nesting on a single CPU:
- */
-#if (1 << HARDIRQ_BITS) < NR_IRQS
-# error HARDIRQ_BITS is too low!
-#endif
extern void __iomem *ipi_base_addr;
void ack_bad_irq(unsigned int irq);


@ -2,6 +2,10 @@
# Makefile for the linux kernel.
#
+ifdef CONFIG_DYNAMIC_FTRACE
+CFLAGS_REMOVE_ftrace.o = -pg
+endif
extra-y	:= head.o init_task.o vmlinux.lds
obj-y := acpi.o entry.o efi.o efi_stub.o gate-data.o fsys.o ia64_ksyms.o irq.o irq_ia64.o \
@ -28,6 +32,7 @@ obj-$(CONFIG_IA64_CYCLONE) += cyclone.o
obj-$(CONFIG_CPU_FREQ)		+= cpufreq/
obj-$(CONFIG_IA64_MCA_RECOVERY)	+= mca_recovery.o
obj-$(CONFIG_KPROBES)		+= kprobes.o jprobes.o
+obj-$(CONFIG_DYNAMIC_FTRACE)	+= ftrace.o
obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o crash.o
obj-$(CONFIG_CRASH_DUMP)	+= crash_dump.o
obj-$(CONFIG_IA64_UNCACHED_ALLOCATOR)	+= uncached.o


@ -47,6 +47,7 @@
#include <asm/processor.h>
#include <asm/thread_info.h>
#include <asm/unistd.h>
+#include <asm/ftrace.h>
#include "minstate.h"
@ -1404,6 +1405,105 @@ GLOBAL_ENTRY(unw_init_running)
	br.ret.sptk.many rp
END(unw_init_running)
#ifdef CONFIG_FUNCTION_TRACER
#ifdef CONFIG_DYNAMIC_FTRACE
GLOBAL_ENTRY(_mcount)
br ftrace_stub
END(_mcount)
.here:
br.ret.sptk.many b0
GLOBAL_ENTRY(ftrace_caller)
alloc out0 = ar.pfs, 8, 0, 4, 0
mov out3 = r0
;;
mov out2 = b0
add r3 = 0x20, r3
mov out1 = r1;
br.call.sptk.many b0 = ftrace_patch_gp
//this might be called from module, so we must patch gp
ftrace_patch_gp:
movl gp=__gp
mov b0 = r3
;;
.global ftrace_call;
ftrace_call:
{
.mlx
nop.m 0x0
movl r3 = .here;;
}
alloc loc0 = ar.pfs, 4, 4, 2, 0
;;
mov loc1 = b0
mov out0 = b0
mov loc2 = r8
mov loc3 = r15
;;
adds out0 = -MCOUNT_INSN_SIZE, out0
mov out1 = in2
mov b6 = r3
br.call.sptk.many b0 = b6
;;
mov ar.pfs = loc0
mov b0 = loc1
mov r8 = loc2
mov r15 = loc3
br ftrace_stub
;;
END(ftrace_caller)
#else
GLOBAL_ENTRY(_mcount)
movl r2 = ftrace_stub
movl r3 = ftrace_trace_function;;
ld8 r3 = [r3];;
ld8 r3 = [r3];;
cmp.eq p7,p0 = r2, r3
(p7) br.sptk.many ftrace_stub
;;
alloc loc0 = ar.pfs, 4, 4, 2, 0
;;
mov loc1 = b0
mov out0 = b0
mov loc2 = r8
mov loc3 = r15
;;
adds out0 = -MCOUNT_INSN_SIZE, out0
mov out1 = in2
mov b6 = r3
br.call.sptk.many b0 = b6
;;
mov ar.pfs = loc0
mov b0 = loc1
mov r8 = loc2
mov r15 = loc3
br ftrace_stub
;;
END(_mcount)
#endif
GLOBAL_ENTRY(ftrace_stub)
mov r3 = b0
movl r2 = _mcount_ret_helper
;;
mov b6 = r2
mov b7 = r3
br.ret.sptk.many b6
_mcount_ret_helper:
mov b0 = r42
mov r1 = r41
mov ar.pfs = r40
br b7
END(ftrace_stub)
#endif /* CONFIG_FUNCTION_TRACER */
	.rodata
	.align 8
	.globl sys_call_table

arch/ia64/kernel/ftrace.c (new file, 206 lines)

@ -0,0 +1,206 @@
/*
* Dynamic function tracing support.
*
* Copyright (C) 2008 Shaohua Li <shaohua.li@intel.com>
*
* For licencing details, see COPYING.
*
* Defines low-level handling of mcount calls when the kernel
* is compiled with the -pg flag. When using dynamic ftrace, the
* mcount call-sites get patched lazily with NOP till they are
* enabled. All code mutation routines here take effect atomically.
*/
#include <linux/uaccess.h>
#include <linux/ftrace.h>
#include <asm/cacheflush.h>
#include <asm/patch.h>
/* In IA64, each function will be added below two bundles with -pg option */
static unsigned char __attribute__((aligned(8)))
ftrace_orig_code[MCOUNT_INSN_SIZE] = {
0x02, 0x40, 0x31, 0x10, 0x80, 0x05, /* alloc r40=ar.pfs,12,8,0 */
0xb0, 0x02, 0x00, 0x00, 0x42, 0x40, /* mov r43=r0;; */
0x05, 0x00, 0xc4, 0x00, /* mov r42=b0 */
0x11, 0x48, 0x01, 0x02, 0x00, 0x21, /* mov r41=r1 */
0x00, 0x00, 0x00, 0x02, 0x00, 0x00, /* nop.i 0x0 */
0x08, 0x00, 0x00, 0x50 /* br.call.sptk.many b0 = _mcount;; */
};
struct ftrace_orig_insn {
u64 dummy1, dummy2, dummy3;
u64 dummy4:64-41+13;
u64 imm20:20;
u64 dummy5:3;
u64 sign:1;
u64 dummy6:4;
};
/* mcount stub will be converted below for nop */
static unsigned char ftrace_nop_code[MCOUNT_INSN_SIZE] = {
0x00, 0x00, 0x00, 0x00, 0x01, 0x00, /* [MII] nop.m 0x0 */
0x30, 0x00, 0x00, 0x60, 0x00, 0x00, /* mov r3=ip */
0x00, 0x00, 0x04, 0x00, /* nop.i 0x0 */
0x05, 0x00, 0x00, 0x00, 0x01, 0x00, /* [MLX] nop.m 0x0 */
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* nop.x 0x0;; */
0x00, 0x00, 0x04, 0x00
};
static unsigned char *ftrace_nop_replace(void)
{
return ftrace_nop_code;
}
/*
* mcount stub will be converted below for call
* Note: Just the last instruction is changed against nop
* */
static unsigned char __attribute__((aligned(8)))
ftrace_call_code[MCOUNT_INSN_SIZE] = {
0x00, 0x00, 0x00, 0x00, 0x01, 0x00, /* [MII] nop.m 0x0 */
0x30, 0x00, 0x00, 0x60, 0x00, 0x00, /* mov r3=ip */
0x00, 0x00, 0x04, 0x00, /* nop.i 0x0 */
0x05, 0x00, 0x00, 0x00, 0x01, 0x00, /* [MLX] nop.m 0x0 */
0xff, 0xff, 0xff, 0xff, 0x7f, 0x00, /* brl.many .;;*/
0xf8, 0xff, 0xff, 0xc8
};
struct ftrace_call_insn {
u64 dummy1, dummy2;
u64 dummy3:48;
u64 imm39_l:16;
u64 imm39_h:23;
u64 dummy4:13;
u64 imm20:20;
u64 dummy5:3;
u64 i:1;
u64 dummy6:4;
};
static unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
{
struct ftrace_call_insn *code = (void *)ftrace_call_code;
unsigned long offset = addr - (ip + 0x10);
code->imm39_l = offset >> 24;
code->imm39_h = offset >> 40;
code->imm20 = offset >> 4;
code->i = offset >> 63;
return ftrace_call_code;
}
static int
ftrace_modify_code(unsigned long ip, unsigned char *old_code,
unsigned char *new_code, int do_check)
{
unsigned char replaced[MCOUNT_INSN_SIZE];
/*
* Note: Due to modules and __init, code can
* disappear and change, we need to protect against faulting
* as well as code changing. We do this by using the
* probe_kernel_* functions.
*
* No real locking needed, this code is run through
* kstop_machine, or before SMP starts.
*/
if (!do_check)
goto skip_check;
/* read the text we want to modify */
if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
return -EFAULT;
/* Make sure it is what we expect it to be */
if (memcmp(replaced, old_code, MCOUNT_INSN_SIZE) != 0)
return -EINVAL;
skip_check:
/* replace the text with the new text */
if (probe_kernel_write(((void *)ip), new_code, MCOUNT_INSN_SIZE))
return -EPERM;
flush_icache_range(ip, ip + MCOUNT_INSN_SIZE);
return 0;
}
static int ftrace_make_nop_check(struct dyn_ftrace *rec, unsigned long addr)
{
unsigned char __attribute__((aligned(8))) replaced[MCOUNT_INSN_SIZE];
unsigned long ip = rec->ip;
if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
return -EFAULT;
if (rec->flags & FTRACE_FL_CONVERTED) {
struct ftrace_call_insn *call_insn, *tmp_call;
call_insn = (void *)ftrace_call_code;
tmp_call = (void *)replaced;
call_insn->imm39_l = tmp_call->imm39_l;
call_insn->imm39_h = tmp_call->imm39_h;
call_insn->imm20 = tmp_call->imm20;
call_insn->i = tmp_call->i;
if (memcmp(replaced, ftrace_call_code, MCOUNT_INSN_SIZE) != 0)
return -EINVAL;
return 0;
} else {
struct ftrace_orig_insn *call_insn, *tmp_call;
call_insn = (void *)ftrace_orig_code;
tmp_call = (void *)replaced;
call_insn->sign = tmp_call->sign;
call_insn->imm20 = tmp_call->imm20;
if (memcmp(replaced, ftrace_orig_code, MCOUNT_INSN_SIZE) != 0)
return -EINVAL;
return 0;
}
}
int ftrace_make_nop(struct module *mod,
struct dyn_ftrace *rec, unsigned long addr)
{
int ret;
char *new;
ret = ftrace_make_nop_check(rec, addr);
if (ret)
return ret;
new = ftrace_nop_replace();
return ftrace_modify_code(rec->ip, NULL, new, 0);
}
int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
{
unsigned long ip = rec->ip;
unsigned char *old, *new;
old= ftrace_nop_replace();
new = ftrace_call_replace(ip, addr);
return ftrace_modify_code(ip, old, new, 1);
}
/* in IA64, _mcount can't directly call ftrace_stub. Only jump is ok */
int ftrace_update_ftrace_func(ftrace_func_t func)
{
unsigned long ip;
unsigned long addr = ((struct fnptr *)ftrace_call)->ip;
if (func == ftrace_stub)
return 0;
ip = ((struct fnptr *)func)->ip;
ia64_patch_imm64(addr + 2, ip);
flush_icache_range(addr, addr + 16);
return 0;
}
/* run from kstop_machine */
int __init ftrace_dyn_arch_init(void *data)
{
*(unsigned long *)data = 0;
return 0;
}


@ -112,3 +112,9 @@ EXPORT_SYMBOL_GPL(esi_call_phys);
#endif
extern char ia64_ivt[];
EXPORT_SYMBOL(ia64_ivt);
+#include <asm/ftrace.h>
+#ifdef CONFIG_FUNCTION_TRACER
+/* mcount is defined in assembly */
+EXPORT_SYMBOL(_mcount);
+#endif


@ -0,0 +1 @@
/* empty */


@ -0,0 +1 @@
/* empty */


@ -557,7 +557,6 @@ extern void mod_return_to_handler(void);
void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
{
	unsigned long old;
-	unsigned long long calltime;
	int faulted;
	struct ftrace_graph_ent trace;
	unsigned long return_hooker = (unsigned long)&return_to_handler;
@ -606,10 +605,7 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
		return;
	}
-	calltime = cpu_clock(raw_smp_processor_id());
-	if (ftrace_push_return_trace(old, calltime,
-				self_addr, &trace.depth) == -EBUSY) {
+	if (ftrace_push_return_trace(old, self_addr, &trace.depth) == -EBUSY) {
		*parent = old;
		return;
	}


@ -0,0 +1 @@
/* empty */


@ -34,6 +34,8 @@ config X86
	select HAVE_FUNCTION_TRACER
	select HAVE_FUNCTION_GRAPH_TRACER
	select HAVE_FUNCTION_TRACE_MCOUNT_TEST
+	select HAVE_FTRACE_NMI_ENTER if DYNAMIC_FTRACE
+	select HAVE_FTRACE_SYSCALLS
	select HAVE_KVM
	select HAVE_ARCH_KGDB
	select HAVE_ARCH_TRACEHOOK


@ -126,6 +126,11 @@ void clflush_cache_range(void *addr, unsigned int size);
#ifdef CONFIG_DEBUG_RODATA
void mark_rodata_ro(void);
extern const int rodata_test_data;
+void set_kernel_text_rw(void);
+void set_kernel_text_ro(void);
+#else
+static inline void set_kernel_text_rw(void) { }
+static inline void set_kernel_text_ro(void) { }
#endif
#ifdef CONFIG_DEBUG_RODATA_TEST


@ -111,6 +111,8 @@ enum fixed_addresses {
#ifdef CONFIG_PARAVIRT
	FIX_PARAVIRT_BOOTMAP,
#endif
+	FIX_TEXT_POKE0,	/* reserve 2 pages for text_poke() */
+	FIX_TEXT_POKE1,
	__end_of_permanent_fixed_addresses,
#ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
	FIX_OHCI1394_BASE,


@ -28,6 +28,13 @@
#endif
+/* FIXME: I don't want to stay hardcoded */
+#ifdef CONFIG_X86_64
+# define FTRACE_SYSCALL_MAX     296
+#else
+# define FTRACE_SYSCALL_MAX     333
+#endif
#ifdef CONFIG_FUNCTION_TRACER
#define MCOUNT_ADDR		((long)(mcount))
#define MCOUNT_INSN_SIZE	5 /* sizeof mcount call */


@ -80,8 +80,6 @@
#define PTRACE_SINGLEBLOCK	33	/* resume execution until next branch */
-#ifdef CONFIG_X86_PTRACE_BTS
#ifndef __ASSEMBLY__
#include <linux/types.h>
@ -140,6 +138,5 @@ struct ptrace_bts_config {
   BTS records are read from oldest to newest.
   Returns number of BTS records drained.
*/
-#endif /* CONFIG_X86_PTRACE_BTS */
#endif /* _ASM_X86_PTRACE_ABI_H */


@ -94,6 +94,7 @@ struct thread_info {
#define TIF_FORCED_TF		24	/* true if TF in eflags artificially */
#define TIF_DEBUGCTLMSR		25	/* uses thread_struct.debugctlmsr */
#define TIF_DS_AREA_MSR		26	/* uses thread_struct.ds_area_msr */
+#define TIF_SYSCALL_FTRACE	27	/* for ftrace syscall instrumentation */
#define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
#define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
@ -115,15 +116,17 @@ struct thread_info {
#define _TIF_FORCED_TF		(1 << TIF_FORCED_TF)
#define _TIF_DEBUGCTLMSR	(1 << TIF_DEBUGCTLMSR)
#define _TIF_DS_AREA_MSR	(1 << TIF_DS_AREA_MSR)
+#define _TIF_SYSCALL_FTRACE	(1 << TIF_SYSCALL_FTRACE)
/* work to do in syscall_trace_enter() */
#define _TIF_WORK_SYSCALL_ENTRY	\
-	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | \
+	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_FTRACE |	\
	 _TIF_SYSCALL_AUDIT | _TIF_SECCOMP | _TIF_SINGLESTEP)
/* work to do in syscall_trace_leave() */
#define _TIF_WORK_SYSCALL_EXIT	\
-	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | _TIF_SINGLESTEP)
+	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | _TIF_SINGLESTEP |	\
+	 _TIF_SYSCALL_FTRACE)
/* work to do on interrupt/exception return */
#define _TIF_WORK_MASK							\
@ -132,7 +135,7 @@ struct thread_info {
	 _TIF_SINGLESTEP|_TIF_SECCOMP|_TIF_SYSCALL_EMU))
/* work to do on any return to user space */
-#define _TIF_ALLWORK_MASK (0x0000FFFF & ~_TIF_SECCOMP)
+#define _TIF_ALLWORK_MASK ((0x0000FFFF & ~_TIF_SECCOMP) | _TIF_SYSCALL_FTRACE)
/* Only used for 64 bit */
#define _TIF_DO_NOTIFY_MASK	\


@ -66,7 +66,8 @@ obj-$(CONFIG_X86_MPPARSE) += mpparse.o
obj-y				+= apic/
obj-$(CONFIG_X86_REBOOTFIXUPS)	+= reboot_fixups_32.o
obj-$(CONFIG_DYNAMIC_FTRACE)	+= ftrace.o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER)	+= ftrace.o
+obj-$(CONFIG_FTRACE_SYSCALLS)	+= ftrace.o
obj-$(CONFIG_KEXEC)		+= machine_kexec_$(BITS).o
obj-$(CONFIG_KEXEC)		+= relocate_kernel_$(BITS).o crash.o
obj-$(CONFIG_CRASH_DUMP)	+= crash_dump_$(BITS).o


@ -5,6 +5,7 @@
#include <linux/kprobes.h> #include <linux/kprobes.h>
#include <linux/mm.h> #include <linux/mm.h>
#include <linux/vmalloc.h> #include <linux/vmalloc.h>
#include <linux/memory.h>
#include <asm/alternative.h> #include <asm/alternative.h>
#include <asm/sections.h> #include <asm/sections.h>
#include <asm/pgtable.h> #include <asm/pgtable.h>
@ -12,7 +13,9 @@
#include <asm/nmi.h> #include <asm/nmi.h>
#include <asm/vsyscall.h> #include <asm/vsyscall.h>
#include <asm/cacheflush.h> #include <asm/cacheflush.h>
#include <asm/tlbflush.h>
#include <asm/io.h> #include <asm/io.h>
#include <asm/fixmap.h>
#define MAX_PATCH_LEN (255-1) #define MAX_PATCH_LEN (255-1)
@ -226,6 +229,7 @@ static void alternatives_smp_lock(u8 **start, u8 **end, u8 *text, u8 *text_end)
{ {
u8 **ptr; u8 **ptr;
mutex_lock(&text_mutex);
for (ptr = start; ptr < end; ptr++) { for (ptr = start; ptr < end; ptr++) {
if (*ptr < text) if (*ptr < text)
continue; continue;
@ -234,6 +238,7 @@ static void alternatives_smp_lock(u8 **start, u8 **end, u8 *text, u8 *text_end)
/* turn DS segment override prefix into lock prefix */ /* turn DS segment override prefix into lock prefix */
text_poke(*ptr, ((unsigned char []){0xf0}), 1); text_poke(*ptr, ((unsigned char []){0xf0}), 1);
}; };
mutex_unlock(&text_mutex);
} }
static void alternatives_smp_unlock(u8 **start, u8 **end, u8 *text, u8 *text_end) static void alternatives_smp_unlock(u8 **start, u8 **end, u8 *text, u8 *text_end)
@ -243,6 +248,7 @@ static void alternatives_smp_unlock(u8 **start, u8 **end, u8 *text, u8 *text_end
if (noreplace_smp) if (noreplace_smp)
return; return;
mutex_lock(&text_mutex);
for (ptr = start; ptr < end; ptr++) { for (ptr = start; ptr < end; ptr++) {
if (*ptr < text) if (*ptr < text)
continue; continue;
@ -251,6 +257,7 @@ static void alternatives_smp_unlock(u8 **start, u8 **end, u8 *text, u8 *text_end
/* turn lock prefix into DS segment override prefix */ /* turn lock prefix into DS segment override prefix */
text_poke(*ptr, ((unsigned char []){0x3E}), 1); text_poke(*ptr, ((unsigned char []){0x3E}), 1);
}; };
mutex_unlock(&text_mutex);
} }
struct smp_alt_module { struct smp_alt_module {
@ -500,15 +507,16 @@ void *text_poke_early(void *addr, const void *opcode, size_t len)
* It means the size must be writable atomically and the address must be aligned * It means the size must be writable atomically and the address must be aligned
* in a way that permits an atomic write. It also makes sure we fit on a single * in a way that permits an atomic write. It also makes sure we fit on a single
* page. * page.
*
* Note: Must be called under text_mutex.
*/ */
void *__kprobes text_poke(void *addr, const void *opcode, size_t len) void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
{ {
unsigned long flags;
char *vaddr; char *vaddr;
int nr_pages = 2;
struct page *pages[2]; struct page *pages[2];
int i; int i;
might_sleep();
if (!core_kernel_text((unsigned long)addr)) { if (!core_kernel_text((unsigned long)addr)) {
pages[0] = vmalloc_to_page(addr); pages[0] = vmalloc_to_page(addr);
pages[1] = vmalloc_to_page(addr + PAGE_SIZE); pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
@ -518,18 +526,21 @@ void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
pages[1] = virt_to_page(addr + PAGE_SIZE); pages[1] = virt_to_page(addr + PAGE_SIZE);
} }
BUG_ON(!pages[0]); BUG_ON(!pages[0]);
if (!pages[1]) local_irq_save(flags);
nr_pages = 1; set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL); if (pages[1])
BUG_ON(!vaddr); set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
local_irq_disable(); vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len); memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
local_irq_enable(); clear_fixmap(FIX_TEXT_POKE0);
vunmap(vaddr); if (pages[1])
clear_fixmap(FIX_TEXT_POKE1);
local_flush_tlb();
sync_core(); sync_core();
/* Could also do a CLFLUSH here to speed up CPU recovery; but /* Could also do a CLFLUSH here to speed up CPU recovery; but
that causes hangs on some VIA CPUs. */ that causes hangs on some VIA CPUs. */
for (i = 0; i < len; i++) for (i = 0; i < len; i++)
BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]); BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]);
local_irq_restore(flags);
return addr; return addr;
} }


@ -33,7 +33,7 @@
#include <linux/cpufreq.h>
#include <linux/compiler.h>
#include <linux/dmi.h>
-#include <linux/ftrace.h>
+#include <trace/power.h>
#include <linux/acpi.h>
#include <linux/io.h>
@ -72,6 +72,8 @@ struct acpi_cpufreq_data {
static DEFINE_PER_CPU(struct acpi_cpufreq_data *, drv_data);
+DEFINE_TRACE(power_mark);
/* acpi_perf_data is a pointer to percpu data. */
static struct acpi_processor_performance *acpi_perf_data;


@ -15,6 +15,7 @@
#include <linux/bug.h>
#include <linux/nmi.h>
#include <linux/sysfs.h>
+#include <linux/ftrace.h>
#include <asm/stacktrace.h>
@ -196,6 +197,11 @@ unsigned __kprobes long oops_begin(void)
	int cpu;
	unsigned long flags;
+	/* notify the hw-branch tracer so it may disable tracing and
+	   add the last trace to the trace buffer -
+	   the earlier this happens, the more useful the trace. */
+	trace_hw_branch_oops();
	oops_enter();
	/* racy, but better than risking deadlock. */


@ -18,6 +18,7 @@
#include <linux/init.h> #include <linux/init.h>
#include <linux/list.h> #include <linux/list.h>
#include <asm/cacheflush.h>
#include <asm/ftrace.h> #include <asm/ftrace.h>
#include <linux/ftrace.h> #include <linux/ftrace.h>
#include <asm/nops.h> #include <asm/nops.h>
@ -26,6 +27,18 @@
#ifdef CONFIG_DYNAMIC_FTRACE #ifdef CONFIG_DYNAMIC_FTRACE
int ftrace_arch_code_modify_prepare(void)
{
set_kernel_text_rw();
return 0;
}
int ftrace_arch_code_modify_post_process(void)
{
set_kernel_text_ro();
return 0;
}
union ftrace_code_union { union ftrace_code_union {
char code[MCOUNT_INSN_SIZE]; char code[MCOUNT_INSN_SIZE];
struct { struct {
@ -66,11 +79,11 @@ static unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
* *
* 1) Put the instruction pointer into the IP buffer * 1) Put the instruction pointer into the IP buffer
* and the new code into the "code" buffer. * and the new code into the "code" buffer.
* 2) Set a flag that says we are modifying code * 2) Wait for any running NMIs to finish and set a flag that says
* 3) Wait for any running NMIs to finish. * we are modifying code, it is done in an atomic operation.
* 4) Write the code * 3) Write the code
* 5) clear the flag. * 4) clear the flag.
* 6) Wait for any running NMIs to finish. * 5) Wait for any running NMIs to finish.
* *
* If an NMI is executed, the first thing it does is to call * If an NMI is executed, the first thing it does is to call
* "ftrace_nmi_enter". This will check if the flag is set to write * "ftrace_nmi_enter". This will check if the flag is set to write
@ -82,9 +95,9 @@ static unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
* are the same as what exists. * are the same as what exists.
*/ */
static atomic_t in_nmi = ATOMIC_INIT(0); #define MOD_CODE_WRITE_FLAG (1 << 31) /* set when NMI should do the write */
static atomic_t nmi_running = ATOMIC_INIT(0);
static int mod_code_status; /* holds return value of text write */ static int mod_code_status; /* holds return value of text write */
static int mod_code_write; /* set when NMI should do the write */
static void *mod_code_ip; /* holds the IP to write to */ static void *mod_code_ip; /* holds the IP to write to */
static void *mod_code_newcode; /* holds the text to write to the IP */ static void *mod_code_newcode; /* holds the text to write to the IP */
@ -101,6 +114,20 @@ int ftrace_arch_read_dyn_info(char *buf, int size)
return r; return r;
} }
static void clear_mod_flag(void)
{
int old = atomic_read(&nmi_running);
for (;;) {
int new = old & ~MOD_CODE_WRITE_FLAG;
if (old == new)
break;
old = atomic_cmpxchg(&nmi_running, old, new);
}
}
static void ftrace_mod_code(void) static void ftrace_mod_code(void)
{ {
/* /*
@ -111,37 +138,52 @@ static void ftrace_mod_code(void)
*/ */
mod_code_status = probe_kernel_write(mod_code_ip, mod_code_newcode, mod_code_status = probe_kernel_write(mod_code_ip, mod_code_newcode,
MCOUNT_INSN_SIZE); MCOUNT_INSN_SIZE);
/* if we fail, then kill any new writers */
if (mod_code_status)
clear_mod_flag();
} }
void ftrace_nmi_enter(void) void ftrace_nmi_enter(void)
{ {
atomic_inc(&in_nmi); if (atomic_inc_return(&nmi_running) & MOD_CODE_WRITE_FLAG) {
/* Must have in_nmi seen before reading write flag */ smp_rmb();
smp_mb();
if (mod_code_write) {
ftrace_mod_code(); ftrace_mod_code();
atomic_inc(&nmi_update_count); atomic_inc(&nmi_update_count);
} }
/* Must have previous changes seen before executions */
smp_mb();
} }
void ftrace_nmi_exit(void) void ftrace_nmi_exit(void)
{ {
/* Finish all executions before clearing in_nmi */ /* Finish all executions before clearing nmi_running */
smp_wmb(); smp_mb();
atomic_dec(&in_nmi); atomic_dec(&nmi_running);
}
static void wait_for_nmi_and_set_mod_flag(void)
{
if (!atomic_cmpxchg(&nmi_running, 0, MOD_CODE_WRITE_FLAG))
return;
do {
cpu_relax();
} while (atomic_cmpxchg(&nmi_running, 0, MOD_CODE_WRITE_FLAG));
nmi_wait_count++;
} }
static void wait_for_nmi(void) static void wait_for_nmi(void)
{ {
int waited = 0; if (!atomic_read(&nmi_running))
return;
while (atomic_read(&in_nmi)) { do {
waited = 1;
cpu_relax(); cpu_relax();
} } while (atomic_read(&nmi_running));
if (waited) nmi_wait_count++;
nmi_wait_count++;
} }
static int static int
@ -151,14 +193,9 @@ do_ftrace_mod_code(unsigned long ip, void *new_code)
mod_code_newcode = new_code; mod_code_newcode = new_code;
/* The buffers need to be visible before we let NMIs write them */ /* The buffers need to be visible before we let NMIs write them */
smp_wmb();
mod_code_write = 1;
/* Make sure write bit is visible before we wait on NMIs */
smp_mb(); smp_mb();
wait_for_nmi(); wait_for_nmi_and_set_mod_flag();
/* Make sure all running NMIs have finished before we write the code */ /* Make sure all running NMIs have finished before we write the code */
smp_mb(); smp_mb();
@ -166,13 +203,9 @@ do_ftrace_mod_code(unsigned long ip, void *new_code)
ftrace_mod_code(); ftrace_mod_code();
/* Make sure the write happens before clearing the bit */ /* Make sure the write happens before clearing the bit */
smp_wmb();
mod_code_write = 0;
/* make sure NMIs see the cleared bit */
smp_mb(); smp_mb();
clear_mod_flag();
wait_for_nmi(); wait_for_nmi();
return mod_code_status; return mod_code_status;
@ -368,25 +401,6 @@ int ftrace_disable_ftrace_graph_caller(void)
return ftrace_mod_jmp(ip, old_offset, new_offset); return ftrace_mod_jmp(ip, old_offset, new_offset);
} }
#else /* CONFIG_DYNAMIC_FTRACE */
/*
* These functions are picked from those used on
* this page for dynamic ftrace. They have been
* simplified to ignore all traces in NMI context.
*/
static atomic_t in_nmi;
void ftrace_nmi_enter(void)
{
atomic_inc(&in_nmi);
}
void ftrace_nmi_exit(void)
{
atomic_dec(&in_nmi);
}
#endif /* !CONFIG_DYNAMIC_FTRACE */ #endif /* !CONFIG_DYNAMIC_FTRACE */
/* /*
@ -396,14 +410,13 @@ void ftrace_nmi_exit(void)
void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr) void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
{ {
unsigned long old; unsigned long old;
unsigned long long calltime;
int faulted; int faulted;
struct ftrace_graph_ent trace; struct ftrace_graph_ent trace;
unsigned long return_hooker = (unsigned long) unsigned long return_hooker = (unsigned long)
&return_to_handler; &return_to_handler;
/* Nmi's are currently unsupported */ /* Nmi's are currently unsupported */
if (unlikely(atomic_read(&in_nmi))) if (unlikely(in_nmi()))
return; return;
if (unlikely(atomic_read(&current->tracing_graph_pause))) if (unlikely(atomic_read(&current->tracing_graph_pause)))
@ -439,17 +452,7 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
return; return;
} }
if (unlikely(!__kernel_text_address(old))) { if (ftrace_push_return_trace(old, self_addr, &trace.depth) == -EBUSY) {
ftrace_graph_stop();
*parent = old;
WARN_ON(1);
return;
}
calltime = cpu_clock(raw_smp_processor_id());
if (ftrace_push_return_trace(old, calltime,
self_addr, &trace.depth) == -EBUSY) {
*parent = old; *parent = old;
return; return;
} }
@ -463,3 +466,66 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
} }
} }
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */ #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
#ifdef CONFIG_FTRACE_SYSCALLS
extern unsigned long __start_syscalls_metadata[];
extern unsigned long __stop_syscalls_metadata[];
extern unsigned long *sys_call_table;
static struct syscall_metadata **syscalls_metadata;
static struct syscall_metadata *find_syscall_meta(unsigned long *syscall)
{
struct syscall_metadata *start;
struct syscall_metadata *stop;
char str[KSYM_SYMBOL_LEN];
start = (struct syscall_metadata *)__start_syscalls_metadata;
stop = (struct syscall_metadata *)__stop_syscalls_metadata;
kallsyms_lookup((unsigned long) syscall, NULL, NULL, NULL, str);
for ( ; start < stop; start++) {
if (start->name && !strcmp(start->name, str))
return start;
}
return NULL;
}
struct syscall_metadata *syscall_nr_to_meta(int nr)
{
if (!syscalls_metadata || nr >= FTRACE_SYSCALL_MAX || nr < 0)
return NULL;
return syscalls_metadata[nr];
}
void arch_init_ftrace_syscalls(void)
{
int i;
struct syscall_metadata *meta;
unsigned long **psys_syscall_table = &sys_call_table;
static atomic_t refs;
if (atomic_inc_return(&refs) != 1)
goto end;
syscalls_metadata = kzalloc(sizeof(*syscalls_metadata) *
FTRACE_SYSCALL_MAX, GFP_KERNEL);
if (!syscalls_metadata) {
WARN_ON(1);
return;
}
for (i = 0; i < FTRACE_SYSCALL_MAX; i++) {
meta = find_syscall_meta(psys_syscall_table[i]);
syscalls_metadata[i] = meta;
}
return;
/* Paranoid: avoid overflow */
end:
atomic_dec(&refs);
}
#endif


@ -638,13 +638,13 @@ static void __used __kprobes kretprobe_trampoline_holder(void)
#else #else
" pushf\n" " pushf\n"
/* /*
* Skip cs, ip, orig_ax. * Skip cs, ip, orig_ax and gs.
* trampoline_handler() will plug in these values * trampoline_handler() will plug in these values
*/ */
" subl $12, %esp\n" " subl $16, %esp\n"
" pushl %fs\n" " pushl %fs\n"
" pushl %ds\n"
" pushl %es\n" " pushl %es\n"
" pushl %ds\n"
" pushl %eax\n" " pushl %eax\n"
" pushl %ebp\n" " pushl %ebp\n"
" pushl %edi\n" " pushl %edi\n"
@ -655,10 +655,10 @@ static void __used __kprobes kretprobe_trampoline_holder(void)
" movl %esp, %eax\n" " movl %esp, %eax\n"
" call trampoline_handler\n" " call trampoline_handler\n"
/* Move flags to cs */ /* Move flags to cs */
" movl 52(%esp), %edx\n" " movl 56(%esp), %edx\n"
" movl %edx, 48(%esp)\n" " movl %edx, 52(%esp)\n"
/* Replace saved flags with true return address. */ /* Replace saved flags with true return address. */
" movl %eax, 52(%esp)\n" " movl %eax, 56(%esp)\n"
" popl %ebx\n" " popl %ebx\n"
" popl %ecx\n" " popl %ecx\n"
" popl %edx\n" " popl %edx\n"
@ -666,8 +666,8 @@ static void __used __kprobes kretprobe_trampoline_holder(void)
" popl %edi\n" " popl %edi\n"
" popl %ebp\n" " popl %ebp\n"
" popl %eax\n" " popl %eax\n"
/* Skip ip, orig_ax, es, ds, fs */ /* Skip ds, es, fs, gs, orig_ax and ip */
" addl $20, %esp\n" " addl $24, %esp\n"
" popf\n" " popf\n"
#endif #endif
" ret\n"); " ret\n");
@ -691,6 +691,7 @@ static __used __kprobes void *trampoline_handler(struct pt_regs *regs)
regs->cs = __KERNEL_CS; regs->cs = __KERNEL_CS;
#else #else
regs->cs = __KERNEL_CS | get_kernel_rpl(); regs->cs = __KERNEL_CS | get_kernel_rpl();
regs->gs = 0;
#endif #endif
regs->ip = trampoline_address; regs->ip = trampoline_address;
regs->orig_ax = ~0UL; regs->orig_ax = ~0UL;


@ -8,7 +8,7 @@
#include <linux/module.h>
#include <linux/pm.h>
#include <linux/clockchips.h>
-#include <linux/ftrace.h>
+#include <trace/power.h>
#include <asm/system.h>
#include <asm/apic.h>
#include <asm/idle.h>
@ -22,6 +22,9 @@ EXPORT_SYMBOL(idle_nomwait);
struct kmem_cache *task_xstate_cachep;
+DEFINE_TRACE(power_start);
+DEFINE_TRACE(power_end);
int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
{
	*dst = *src;


@ -21,6 +21,7 @@
#include <linux/audit.h>
#include <linux/seccomp.h>
#include <linux/signal.h>
+#include <linux/ftrace.h>
#include <asm/uaccess.h>
#include <asm/pgtable.h>
@ -1415,6 +1416,9 @@ asmregparm long syscall_trace_enter(struct pt_regs *regs)
	    tracehook_report_syscall_entry(regs))
		ret = -1L;
+	if (unlikely(test_thread_flag(TIF_SYSCALL_FTRACE)))
+		ftrace_syscall_enter(regs);
	if (unlikely(current->audit_context)) {
		if (IS_IA32)
			audit_syscall_entry(AUDIT_ARCH_I386,
@ -1438,6 +1442,9 @@ asmregparm void syscall_trace_leave(struct pt_regs *regs)
	if (unlikely(current->audit_context))
		audit_syscall_exit(AUDITSC_RESULT(regs->ax), regs->ax);
+	if (unlikely(test_thread_flag(TIF_SYSCALL_FTRACE)))
+		ftrace_syscall_exit(regs);
	if (test_thread_flag(TIF_SYSCALL_TRACE))
		tracehook_report_syscall_exit(regs, 0);


@ -59,7 +59,8 @@ config KVM_AMD
config KVM_TRACE
	bool "KVM trace support"
-	depends on KVM && MARKERS && SYSFS
+	depends on KVM && SYSFS
+	select MARKERS
	select RELAY
	select DEBUG_FS
	default n


@ -1054,17 +1054,47 @@ static noinline int do_test_wp_bit(void)
const int rodata_test_data = 0xC3; const int rodata_test_data = 0xC3;
EXPORT_SYMBOL_GPL(rodata_test_data); EXPORT_SYMBOL_GPL(rodata_test_data);
static int kernel_set_to_readonly;
void set_kernel_text_rw(void)
{
unsigned long start = PFN_ALIGN(_text);
unsigned long size = PFN_ALIGN(_etext) - start;
if (!kernel_set_to_readonly)
return;
pr_debug("Set kernel text: %lx - %lx for read write\n",
start, start+size);
set_pages_rw(virt_to_page(start), size >> PAGE_SHIFT);
}
void set_kernel_text_ro(void)
{
unsigned long start = PFN_ALIGN(_text);
unsigned long size = PFN_ALIGN(_etext) - start;
if (!kernel_set_to_readonly)
return;
pr_debug("Set kernel text: %lx - %lx for read only\n",
start, start+size);
set_pages_ro(virt_to_page(start), size >> PAGE_SHIFT);
}
void mark_rodata_ro(void) void mark_rodata_ro(void)
{ {
unsigned long start = PFN_ALIGN(_text); unsigned long start = PFN_ALIGN(_text);
unsigned long size = PFN_ALIGN(_etext) - start; unsigned long size = PFN_ALIGN(_etext) - start;
#ifndef CONFIG_DYNAMIC_FTRACE
/* Dynamic tracing modifies the kernel text section */
set_pages_ro(virt_to_page(start), size >> PAGE_SHIFT); set_pages_ro(virt_to_page(start), size >> PAGE_SHIFT);
printk(KERN_INFO "Write protecting the kernel text: %luk\n", printk(KERN_INFO "Write protecting the kernel text: %luk\n",
size >> 10); size >> 10);
kernel_set_to_readonly = 1;
#ifdef CONFIG_CPA_DEBUG #ifdef CONFIG_CPA_DEBUG
printk(KERN_INFO "Testing CPA: Reverting %lx-%lx\n", printk(KERN_INFO "Testing CPA: Reverting %lx-%lx\n",
start, start+size); start, start+size);
@ -1073,7 +1103,6 @@ void mark_rodata_ro(void)
printk(KERN_INFO "Testing CPA: write protecting again\n"); printk(KERN_INFO "Testing CPA: write protecting again\n");
set_pages_ro(virt_to_page(start), size>>PAGE_SHIFT); set_pages_ro(virt_to_page(start), size>>PAGE_SHIFT);
#endif #endif
#endif /* CONFIG_DYNAMIC_FTRACE */
start += size; start += size;
size = (unsigned long)__end_rodata - start; size = (unsigned long)__end_rodata - start;


@ -734,21 +734,48 @@ void __init mem_init(void)
const int rodata_test_data = 0xC3; const int rodata_test_data = 0xC3;
EXPORT_SYMBOL_GPL(rodata_test_data); EXPORT_SYMBOL_GPL(rodata_test_data);
static int kernel_set_to_readonly;
void set_kernel_text_rw(void)
{
unsigned long start = PFN_ALIGN(_stext);
unsigned long end = PFN_ALIGN(__start_rodata);
if (!kernel_set_to_readonly)
return;
pr_debug("Set kernel text: %lx - %lx for read write\n",
start, end);
set_memory_rw(start, (end - start) >> PAGE_SHIFT);
}
void set_kernel_text_ro(void)
{
unsigned long start = PFN_ALIGN(_stext);
unsigned long end = PFN_ALIGN(__start_rodata);
if (!kernel_set_to_readonly)
return;
pr_debug("Set kernel text: %lx - %lx for read only\n",
start, end);
set_memory_ro(start, (end - start) >> PAGE_SHIFT);
}
void mark_rodata_ro(void) void mark_rodata_ro(void)
{ {
unsigned long start = PFN_ALIGN(_stext), end = PFN_ALIGN(__end_rodata); unsigned long start = PFN_ALIGN(_stext), end = PFN_ALIGN(__end_rodata);
unsigned long rodata_start = unsigned long rodata_start =
((unsigned long)__start_rodata + PAGE_SIZE - 1) & PAGE_MASK; ((unsigned long)__start_rodata + PAGE_SIZE - 1) & PAGE_MASK;
#ifdef CONFIG_DYNAMIC_FTRACE
/* Dynamic tracing modifies the kernel text section */
start = rodata_start;
#endif
printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n", printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
(end - start) >> 10); (end - start) >> 10);
set_memory_ro(start, (end - start) >> PAGE_SHIFT); set_memory_ro(start, (end - start) >> PAGE_SHIFT);
kernel_set_to_readonly = 1;
/* /*
* The rodata section (but not the kernel text!) should also be * The rodata section (but not the kernel text!) should also be
* not-executable. * not-executable.


@ -0,0 +1 @@
/* empty */


@ -44,22 +44,6 @@ config LBD
	  If unsure, say N.
-config BLK_DEV_IO_TRACE
-	bool "Support for tracing block io actions"
-	depends on SYSFS
-	select RELAY
-	select DEBUG_FS
-	select TRACEPOINTS
-	help
-	  Say Y here if you want to be able to trace the block layer actions
-	  on a given queue. Tracing allows you to see any traffic happening
-	  on a block device queue. For more information (and the userspace
-	  support tools needed), fetch the blktrace tools from:
-	  git://git.kernel.dk/blktrace.git
-	  If unsure, say N.
config BLK_DEV_BSG
	bool "Block layer SG support v4 (EXPERIMENTAL)"
	depends on EXPERIMENTAL


@ -13,6 +13,5 @@ obj-$(CONFIG_IOSCHED_AS) += as-iosched.o
obj-$(CONFIG_IOSCHED_DEADLINE)	+= deadline-iosched.o
obj-$(CONFIG_IOSCHED_CFQ)	+= cfq-iosched.o
-obj-$(CONFIG_BLK_DEV_IO_TRACE)	+= blktrace.o
obj-$(CONFIG_BLOCK_COMPAT)	+= compat_ioctl.o
obj-$(CONFIG_BLK_DEV_INTEGRITY)	+= blk-integrity.o


@ -283,7 +283,7 @@ static void sysrq_ftrace_dump(int key, struct tty_struct *tty)
}
static struct sysrq_key_op sysrq_ftrace_dump_op = {
	.handler	= sysrq_ftrace_dump,
-	.help_msg	= "dumpZ-ftrace-buffer",
+	.help_msg	= "dump-ftrace-buffer(Z)",
	.action_msg	= "Dump ftrace buffer",
	.enable_mask	= SYSRQ_ENABLE_DUMP,
};


@@ -161,7 +161,7 @@ struct op_sample
{
	entry->event = ring_buffer_lock_reserve
		(op_ring_buffer_write, sizeof(struct op_sample) +
-		 size * sizeof(entry->sample->data[0]), &entry->irq_flags);
		 size * sizeof(entry->sample->data[0]));
	if (entry->event)
		entry->sample = ring_buffer_event_data(entry->event);
	else
@@ -178,8 +178,7 @@ struct op_sample
int op_cpu_buffer_write_commit(struct op_entry *entry)
{
-	return ring_buffer_unlock_commit(op_ring_buffer_write, entry->event,
-					 entry->irq_flags);
	return ring_buffer_unlock_commit(op_ring_buffer_write, entry->event);
}
struct op_sample *op_cpu_buffer_read_entry(struct op_entry *entry, int cpu)
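The reworked ring-buffer API above no longer passes irq_flags between reserve and commit. A minimal sketch of the new calling convention (hypothetical buffer and payload, not part of this patch):

#include <linux/errno.h>
#include <linux/ring_buffer.h>

struct my_sample {			/* hypothetical payload */
	unsigned long value;
};

static int my_write_sample(struct ring_buffer *rb, unsigned long value)
{
	struct ring_buffer_event *event;
	struct my_sample *sample;

	/* Reservation no longer hands irq_flags back to the caller. */
	event = ring_buffer_lock_reserve(rb, sizeof(*sample));
	if (!event)
		return -ENOMEM;

	sample = ring_buffer_event_data(event);
	sample->value = value;

	/* Commit likewise takes only the buffer and the event. */
	return ring_buffer_unlock_commit(rb, event);
}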

View file

@@ -30,6 +30,7 @@
static struct vfsmount *debugfs_mount;
static int debugfs_mount_count;
static bool debugfs_registered;
static struct inode *debugfs_get_inode(struct super_block *sb, int mode, dev_t dev)
{
@@ -496,6 +497,16 @@ struct dentry *debugfs_rename(struct dentry *old_dir, struct dentry *old_dentry,
}
EXPORT_SYMBOL_GPL(debugfs_rename);
/**
 * debugfs_initialized - Tells whether debugfs has been registered
 */
bool debugfs_initialized(void)
{
	return debugfs_registered;
}
EXPORT_SYMBOL_GPL(debugfs_initialized);
static struct kobject *debug_kobj;
static int __init debugfs_init(void)
@@ -509,11 +520,16 @@ static int __init debugfs_init(void)
	retval = register_filesystem(&debug_fs_type);
	if (retval)
		kobject_put(debug_kobj);
	else
		debugfs_registered = true;
	return retval;
}
static void __exit debugfs_exit(void)
{
	debugfs_registered = false;
	simple_release_fs(&debugfs_mount, &debugfs_mount_count);
	unregister_filesystem(&debug_fs_type);
	kobject_put(debug_kobj);
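A sketch of how callers are expected to use the new helper (hypothetical driver, not part of this patch): skip debugfs setup when the filesystem was never registered, e.g. because CONFIG_DEBUG_FS is off and only the stubs exist.

#include <linux/debugfs.h>

static struct dentry *my_dir;	/* hypothetical driver state */

static void my_create_debugfs(void)
{
	/* Bail out early if debugfs never registered. */
	if (!debugfs_initialized())
		return;

	my_dir = debugfs_create_dir("my_driver", NULL);
}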

View file

@@ -19,6 +19,7 @@
#include <linux/kmod.h>
#include <linux/ctype.h>
#include <linux/genhd.h>
#include <linux/blktrace_api.h>
#include "check.h"
@@ -294,6 +295,9 @@ static struct attribute_group part_attr_group = {
static struct attribute_group *part_attr_groups[] = {
	&part_attr_group,
#ifdef CONFIG_BLK_DEV_IO_TRACE
	&blk_trace_attr_group,
#endif
	NULL
};

include/asm-frv/ftrace.h (new file)
View file

@ -0,0 +1 @@
/* empty */

View file

@@ -61,6 +61,30 @@
#define BRANCH_PROFILE()
#endif
#ifdef CONFIG_EVENT_TRACER
#define FTRACE_EVENTS() VMLINUX_SYMBOL(__start_ftrace_events) = .; \
*(_ftrace_events) \
VMLINUX_SYMBOL(__stop_ftrace_events) = .;
#else
#define FTRACE_EVENTS()
#endif
#ifdef CONFIG_TRACING
#define TRACE_PRINTKS() VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .; \
*(__trace_printk_fmt) /* Trace_printk fmt' pointer */ \
VMLINUX_SYMBOL(__stop___trace_bprintk_fmt) = .;
#else
#define TRACE_PRINTKS()
#endif
#ifdef CONFIG_FTRACE_SYSCALLS
#define TRACE_SYSCALLS() VMLINUX_SYMBOL(__start_syscalls_metadata) = .; \
*(__syscalls_metadata) \
VMLINUX_SYMBOL(__stop_syscalls_metadata) = .;
#else
#define TRACE_SYSCALLS()
#endif
/* .data section */
#define DATA_DATA							\
	*(.data)							\
@@ -86,7 +110,10 @@
	*(__verbose)							\
	VMLINUX_SYMBOL(__stop___verbose) = .;				\
	LIKELY_PROFILE()						\
-	BRANCH_PROFILE()
	BRANCH_PROFILE()						\
	TRACE_PRINTKS()							\
	FTRACE_EVENTS()							\
	TRACE_SYSCALLS()
#define RO_DATA(align)							\
	. = ALIGN((align));						\

View file

@ -0,0 +1 @@
/* empty */

View file

@ -0,0 +1 @@
/* empty */

View file

@@ -144,6 +144,9 @@ struct blk_user_trace_setup {
#ifdef __KERNEL__
#if defined(CONFIG_BLK_DEV_IO_TRACE)
#include <linux/sysfs.h>
struct blk_trace {
	int trace_state;
	struct rchan *rchan;
@@ -194,6 +197,8 @@ extern int blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
extern int blk_trace_startstop(struct request_queue *q, int start);
extern int blk_trace_remove(struct request_queue *q);
extern struct attribute_group blk_trace_attr_group;
#else /* !CONFIG_BLK_DEV_IO_TRACE */
#define blk_trace_ioctl(bdev, cmd, arg)		(-ENOTTY)
#define blk_trace_shutdown(q)			do { } while (0)

View file

@@ -68,6 +68,7 @@ struct ftrace_branch_data
			unsigned long miss;
			unsigned long hit;
		};
		unsigned long miss_hit[2];
	};
};
@@ -125,10 +126,7 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
		.line = __LINE__,				\
	};							\
	______r = !!(cond);					\
-	if (______r)						\
-		______f.hit++;					\
-	else							\
-		______f.miss++;					\
	______f.miss_hit[______r]++;				\
	______r;						\
}))
#endif /* CONFIG_PROFILE_ALL_BRANCHES */

View file

@@ -71,6 +71,9 @@ struct dentry *debugfs_create_bool(const char *name, mode_t mode,
struct dentry *debugfs_create_blob(const char *name, mode_t mode,
				   struct dentry *parent,
				   struct debugfs_blob_wrapper *blob);
bool debugfs_initialized(void);
#else
#include <linux/err.h>
@@ -183,6 +186,11 @@ static inline struct dentry *debugfs_create_blob(const char *name, mode_t mode,
	return ERR_PTR(-ENODEV);
}
static inline bool debugfs_initialized(void)
{
	return false;
}
#endif
#endif

View file

@@ -1,15 +1,18 @@
#ifndef _LINUX_FTRACE_H
#define _LINUX_FTRACE_H
-#include <linux/linkage.h>
-#include <linux/fs.h>
-#include <linux/ktime.h>
-#include <linux/init.h>
-#include <linux/types.h>
-#include <linux/module.h>
#include <linux/trace_clock.h>
#include <linux/kallsyms.h>
#include <linux/linkage.h>
#include <linux/bitops.h>
#include <linux/module.h>
#include <linux/ktime.h>
#include <linux/sched.h>
#include <linux/types.h>
#include <linux/init.h>
#include <linux/fs.h>
#include <asm/ftrace.h>
#ifdef CONFIG_FUNCTION_TRACER #ifdef CONFIG_FUNCTION_TRACER
@@ -95,9 +98,41 @@ stack_trace_sysctl(struct ctl_table *table, int write,
			loff_t *ppos);
#endif
struct ftrace_func_command {
struct list_head list;
char *name;
int (*func)(char *func, char *cmd,
char *params, int enable);
};
#ifdef CONFIG_DYNAMIC_FTRACE
-/* asm/ftrace.h must be defined for archs supporting dynamic ftrace */
-#include <asm/ftrace.h>
int ftrace_arch_code_modify_prepare(void);
int ftrace_arch_code_modify_post_process(void);
struct seq_file;
struct ftrace_probe_ops {
void (*func)(unsigned long ip,
unsigned long parent_ip,
void **data);
int (*callback)(unsigned long ip, void **data);
void (*free)(void **data);
int (*print)(struct seq_file *m,
unsigned long ip,
struct ftrace_probe_ops *ops,
void *data);
};
extern int
register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
void *data);
extern void
unregister_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
void *data);
extern void
unregister_ftrace_function_probe_func(char *glob, struct ftrace_probe_ops *ops);
extern void unregister_ftrace_function_probe_all(char *glob);
enum {
	FTRACE_FL_FREE = (1 << 0),
@@ -110,15 +145,23 @@
};
struct dyn_ftrace {
-	struct list_head	list;
-	unsigned long		ip; /* address of mcount call-site */
-	unsigned long		flags;
-	struct dyn_arch_ftrace	arch;
	union {
		unsigned long		ip; /* address of mcount call-site */
		struct dyn_ftrace	*freelist;
	};
	union {
		unsigned long		flags;
		struct dyn_ftrace	*newlist;
	};
	struct dyn_arch_ftrace		arch;
};
int ftrace_force_update(void);
void ftrace_set_filter(unsigned char *buf, int len, int reset);
int register_ftrace_command(struct ftrace_func_command *cmd);
int unregister_ftrace_command(struct ftrace_func_command *cmd);
/* defined in arch */ /* defined in arch */
extern int ftrace_ip_converted(unsigned long ip); extern int ftrace_ip_converted(unsigned long ip);
extern int ftrace_dyn_arch_init(void *data); extern int ftrace_dyn_arch_init(void *data);
@@ -126,6 +169,10 @@ extern int ftrace_update_ftrace_func(ftrace_func_t func);
extern void ftrace_caller(void);
extern void ftrace_call(void);
extern void mcount_call(void);
#ifndef FTRACE_ADDR
#define FTRACE_ADDR ((unsigned long)ftrace_caller)
#endif
#ifdef CONFIG_FUNCTION_GRAPH_TRACER #ifdef CONFIG_FUNCTION_GRAPH_TRACER
extern void ftrace_graph_caller(void); extern void ftrace_graph_caller(void);
extern int ftrace_enable_ftrace_graph_caller(void); extern int ftrace_enable_ftrace_graph_caller(void);
@@ -136,7 +183,7 @@ static inline int ftrace_disable_ftrace_graph_caller(void) { return 0; }
#endif
/**
- * ftrace_make_nop - convert code into top
 * ftrace_make_nop - convert code into nop
 * @mod: module structure if called by module load initialization
 * @rec: the mcount call site record
 * @addr: the address that the call site should be calling
@@ -181,7 +228,6 @@
 */
extern int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr);
/* May be defined in arch */
extern int ftrace_arch_read_dyn_info(char *buf, int size);
@@ -198,6 +244,14 @@ extern void ftrace_enable_daemon(void);
# define ftrace_disable_daemon()		do { } while (0)
# define ftrace_enable_daemon()			do { } while (0)
static inline void ftrace_release(void *start, unsigned long size) { }
static inline int register_ftrace_command(struct ftrace_func_command *cmd)
{
return -EINVAL;
}
static inline int unregister_ftrace_command(char *cmd_name)
{
return -EINVAL;
}
#endif /* CONFIG_DYNAMIC_FTRACE */
/* totally disable ftrace - can not re-enable after this */
@@ -233,24 +287,25 @@ static inline void __ftrace_enabled_restore(int enabled)
#endif
}
-#ifdef CONFIG_FRAME_POINTER
-/* TODO: need to fix this for ARM */
#ifndef HAVE_ARCH_CALLER_ADDR
# ifdef CONFIG_FRAME_POINTER
#  define CALLER_ADDR0 ((unsigned long)__builtin_return_address(0))
#  define CALLER_ADDR1 ((unsigned long)__builtin_return_address(1))
#  define CALLER_ADDR2 ((unsigned long)__builtin_return_address(2))
#  define CALLER_ADDR3 ((unsigned long)__builtin_return_address(3))
#  define CALLER_ADDR4 ((unsigned long)__builtin_return_address(4))
#  define CALLER_ADDR5 ((unsigned long)__builtin_return_address(5))
#  define CALLER_ADDR6 ((unsigned long)__builtin_return_address(6))
-#else
# else
#  define CALLER_ADDR0 ((unsigned long)__builtin_return_address(0))
#  define CALLER_ADDR1 0UL
#  define CALLER_ADDR2 0UL
#  define CALLER_ADDR3 0UL
#  define CALLER_ADDR4 0UL
#  define CALLER_ADDR5 0UL
#  define CALLER_ADDR6 0UL
-#endif
# endif
#endif /* ifndef HAVE_ARCH_CALLER_ADDR */
#ifdef CONFIG_IRQSOFF_TRACER
extern void time_hardirqs_on(unsigned long a0, unsigned long a1);
@@ -268,54 +323,6 @@ static inline void __ftrace_enabled_restore(int enabled)
# define trace_preempt_off(a0, a1)		do { } while (0)
#endif
#ifdef CONFIG_TRACING
extern int ftrace_dump_on_oops;
extern void tracing_start(void);
extern void tracing_stop(void);
extern void ftrace_off_permanent(void);
extern void
ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3);
/**
* ftrace_printk - printf formatting in the ftrace buffer
* @fmt: the printf format for printing
*
* Note: __ftrace_printk is an internal function for ftrace_printk and
* the @ip is passed in via the ftrace_printk macro.
*
* This function allows a kernel developer to debug fast path sections
* that printk is not appropriate for. By scattering in various
* printk like tracing in the code, a developer can quickly see
* where problems are occurring.
*
* This is intended as a debugging tool for the developer only.
* Please refrain from leaving ftrace_printks scattered around in
* your code.
*/
# define ftrace_printk(fmt...) __ftrace_printk(_THIS_IP_, fmt)
extern int
__ftrace_printk(unsigned long ip, const char *fmt, ...)
__attribute__ ((format (printf, 2, 3)));
extern void ftrace_dump(void);
#else
static inline void
ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { }
static inline int
ftrace_printk(const char *fmt, ...) __attribute__ ((format (printf, 1, 2)));
static inline void tracing_start(void) { }
static inline void tracing_stop(void) { }
static inline void ftrace_off_permanent(void) { }
static inline int
ftrace_printk(const char *fmt, ...)
{
return 0;
}
static inline void ftrace_dump(void) { }
#endif
#ifdef CONFIG_FTRACE_MCOUNT_RECORD
extern void ftrace_init(void);
extern void ftrace_init_module(struct module *mod,
@@ -327,36 +334,6 @@ ftrace_init_module(struct module *mod,
			       unsigned long *start, unsigned long *end) { }
#endif
enum {
POWER_NONE = 0,
POWER_CSTATE = 1,
POWER_PSTATE = 2,
};
struct power_trace {
#ifdef CONFIG_POWER_TRACER
ktime_t stamp;
ktime_t end;
int type;
int state;
#endif
};
#ifdef CONFIG_POWER_TRACER
extern void trace_power_start(struct power_trace *it, unsigned int type,
unsigned int state);
extern void trace_power_mark(struct power_trace *it, unsigned int type,
unsigned int state);
extern void trace_power_end(struct power_trace *it);
#else
static inline void trace_power_start(struct power_trace *it, unsigned int type,
unsigned int state) { }
static inline void trace_power_mark(struct power_trace *it, unsigned int type,
unsigned int state) { }
static inline void trace_power_end(struct power_trace *it) { }
#endif
/*
 * Structure that defines an entry function trace.
 */
@@ -398,8 +375,7 @@ struct ftrace_ret_stack {
extern void return_to_handler(void);
extern int
-ftrace_push_return_trace(unsigned long ret, unsigned long long time,
-			 unsigned long func, int *depth);
ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth);
extern void
ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret);
@@ -514,6 +490,50 @@ static inline int test_tsk_trace_graph(struct task_struct *tsk)
	return tsk->trace & TSK_TRACE_FL_GRAPH;
}
extern int ftrace_dump_on_oops;
#endif /* CONFIG_TRACING */
#ifdef CONFIG_HW_BRANCH_TRACER
void trace_hw_branch(u64 from, u64 to);
void trace_hw_branch_oops(void);
#else /* CONFIG_HW_BRANCH_TRACER */
static inline void trace_hw_branch(u64 from, u64 to) {}
static inline void trace_hw_branch_oops(void) {}
#endif /* CONFIG_HW_BRANCH_TRACER */
/*
* A syscall entry in the ftrace syscalls array.
*
* @name: name of the syscall
* @nb_args: number of parameters it takes
* @types: list of types as strings
* @args: list of args as strings (args[i] matches types[i])
*/
struct syscall_metadata {
const char *name;
int nb_args;
const char **types;
const char **args;
};
#ifdef CONFIG_FTRACE_SYSCALLS
extern void arch_init_ftrace_syscalls(void);
extern struct syscall_metadata *syscall_nr_to_meta(int nr);
extern void start_ftrace_syscalls(void);
extern void stop_ftrace_syscalls(void);
extern void ftrace_syscall_enter(struct pt_regs *regs);
extern void ftrace_syscall_exit(struct pt_regs *regs);
#else
static inline void start_ftrace_syscalls(void) { }
static inline void stop_ftrace_syscalls(void) { }
static inline void ftrace_syscall_enter(struct pt_regs *regs) { }
static inline void ftrace_syscall_exit(struct pt_regs *regs) { }
#endif
#endif /* _LINUX_FTRACE_H */
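A sketch of how a tracer might hook into the new ftrace_func_command interface declared above. The command name, the write syntax and the module wiring here are illustrative assumptions, not defined by this patch:

#include <linux/ftrace.h>
#include <linux/init.h>
#include <linux/kernel.h>

/* Hypothetical command callback: invoked when the command is written to
 * the function filter files (enable=1 for set_ftrace_filter, 0 otherwise). */
static int my_cmd_func(char *func, char *cmd, char *params, int enable)
{
	pr_info("ftrace command %s on %s (params=%s, enable=%d)\n",
		cmd, func, params, enable);
	return 0;
}

static struct ftrace_func_command my_cmd = {
	.name	= "mycmd",		/* hypothetical command name */
	.func	= my_cmd_func,
};

static int __init my_cmd_init(void)
{
	return register_ftrace_command(&my_cmd);
}
device_initcall(my_cmd_init);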

View file

@@ -2,7 +2,7 @@
#define _LINUX_FTRACE_IRQ_H
-#if defined(CONFIG_DYNAMIC_FTRACE) || defined(CONFIG_FUNCTION_GRAPH_TRACER)
#ifdef CONFIG_FTRACE_NMI_ENTER
extern void ftrace_nmi_enter(void);
extern void ftrace_nmi_exit(void);
#else

View file

@@ -15,55 +15,61 @@
 * - bits 0-7 are the preemption count (max preemption depth: 256)
 * - bits 8-15 are the softirq count (max # of softirqs: 256)
 *
- * The hardirq count can be overridden per architecture, the default is:
 * The hardirq count can in theory reach the same as NR_IRQS.
 * In reality, the number of nested IRQS is limited to the stack
 * size as well. For archs with over 1000 IRQS it is not practical
 * to expect that they will all nest. We give a max of 10 bits for
 * hardirq nesting. An arch may choose to give less than 10 bits.
 * m68k expects it to be 8.
 *
- * - bits 16-27 are the hardirq count (max # of hardirqs: 4096)
- * - ( bit 28 is the PREEMPT_ACTIVE flag. )
 * - bits 16-25 are the hardirq count (max # of nested hardirqs: 1024)
 * - bit 26 is the NMI_MASK
 * - bit 28 is the PREEMPT_ACTIVE flag
 *
 * PREEMPT_MASK: 0x000000ff
 * SOFTIRQ_MASK: 0x0000ff00
- * HARDIRQ_MASK: 0x0fff0000
 * HARDIRQ_MASK: 0x03ff0000
 *     NMI_MASK: 0x04000000
 */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define NMI_BITS	1
#define MAX_HARDIRQ_BITS 10
#ifndef HARDIRQ_BITS
-#define HARDIRQ_BITS	12
-#ifndef MAX_HARDIRQS_PER_CPU
-#define MAX_HARDIRQS_PER_CPU NR_IRQS
# define HARDIRQ_BITS	MAX_HARDIRQ_BITS
#endif
-/*
- * The hardirq mask has to be large enough to have space for potentially
- * all IRQ sources in the system nesting on a single CPU.
- */
-#if (1 << HARDIRQ_BITS) < MAX_HARDIRQS_PER_CPU
-# error HARDIRQ_BITS is too low!
-#endif
#if HARDIRQ_BITS > MAX_HARDIRQ_BITS
#error HARDIRQ_BITS too high!
#endif
#define PREEMPT_SHIFT	0
#define SOFTIRQ_SHIFT	(PREEMPT_SHIFT + PREEMPT_BITS)
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)
#define __IRQ_MASK(x)	((1UL << (x))-1)
#define PREEMPT_MASK	(__IRQ_MASK(PREEMPT_BITS) << PREEMPT_SHIFT)
#define SOFTIRQ_MASK	(__IRQ_MASK(SOFTIRQ_BITS) << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK	(__IRQ_MASK(HARDIRQ_BITS) << HARDIRQ_SHIFT)
#define NMI_MASK	(__IRQ_MASK(NMI_BITS) << NMI_SHIFT)
#define PREEMPT_OFFSET	(1UL << PREEMPT_SHIFT)
#define SOFTIRQ_OFFSET	(1UL << SOFTIRQ_SHIFT)
#define HARDIRQ_OFFSET	(1UL << HARDIRQ_SHIFT)
#define NMI_OFFSET	(1UL << NMI_SHIFT)
-#if PREEMPT_ACTIVE < (1 << (HARDIRQ_SHIFT + HARDIRQ_BITS))
#if PREEMPT_ACTIVE < (1 << (NMI_SHIFT + NMI_BITS))
#error PREEMPT_ACTIVE is too low!
#endif
#define hardirq_count()	(preempt_count() & HARDIRQ_MASK)
#define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
-#define irq_count()	(preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK))
#define irq_count()	(preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK \
				 | NMI_MASK))
/*
 * Are we doing bottom half or hardware interrupt processing?
@@ -73,6 +79,11 @@
#define in_softirq()		(softirq_count())
#define in_interrupt()		(irq_count())
/*
* Are we in NMI context?
*/
#define in_nmi() (preempt_count() & NMI_MASK)
#if defined(CONFIG_PREEMPT)
# define PREEMPT_INATOMIC_BASE kernel_locked()
# define PREEMPT_CHECK_OFFSET 1
@@ -164,20 +175,24 @@ extern void irq_enter(void);
 */
extern void irq_exit(void);
#define nmi_enter()						\
	do {							\
		ftrace_nmi_enter();				\
-		lockdep_off();					\
		BUG_ON(in_nmi());				\
-		rcu_nmi_enter();				\
		add_preempt_count(NMI_OFFSET + HARDIRQ_OFFSET);	\
-		__irq_enter();					\
		lockdep_off();					\
		rcu_nmi_enter();				\
		trace_hardirq_enter();				\
	} while (0)
#define nmi_exit()						\
	do {							\
-		__irq_exit();					\
		trace_hardirq_exit();				\
		rcu_nmi_exit();					\
		lockdep_on();					\
-		ftrace_nmi_exit();				\
		BUG_ON(!in_nmi());				\
		sub_preempt_count(NMI_OFFSET + HARDIRQ_OFFSET);	\
		ftrace_nmi_exit();				\
	} while (0)
#endif /* LINUX_HARDIRQ_H */
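With the new NMI_MASK bits, code can test for NMI context directly. A tiny hypothetical helper (not part of this patch):

#include <linux/hardirq.h>
#include <linux/types.h>

/* Hypothetical helper: may we sleep in the current context? */
static bool my_may_sleep(void)
{
	/*
	 * in_nmi() tests the new NMI_MASK bits; in_interrupt() now covers
	 * NMI context as well, because irq_count() includes NMI_MASK.
	 */
	return !in_nmi() && !in_interrupt();
}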

View file

@@ -278,6 +278,11 @@ enum
	NR_SOFTIRQS
};
/* map softirq index to softirq name. update 'softirq_to_name' in
 * kernel/softirq.c when adding a new softirq.
 */
extern char *softirq_to_name[NR_SOFTIRQS];
/* softirq mask and active fields moved to irq_cpustat_t in
 * asm/hardirq.h to get better cache usage. KAO
 */

View file

@@ -391,6 +391,139 @@ static inline char *pack_hex_byte(char *buf, u8 byte)
	({ if (0) printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); 0; })
#endif
/*
* General tracing related utility functions - trace_printk(),
* tracing_on/tracing_off and tracing_start()/tracing_stop
*
* Use tracing_on/tracing_off when you want to quickly turn on or off
* tracing. It simply enables or disables the recording of the trace events.
* This also corresponds to the user space debugfs/tracing/tracing_on
* file, which gives a means for the kernel and userspace to interact.
* Place a tracing_off() in the kernel where you want tracing to end.
* From user space, examine the trace, and then echo 1 > tracing_on
* to continue tracing.
*
* tracing_stop/tracing_start has slightly more overhead. It is used
* by things like suspend to ram where disabling the recording of the
* trace is not enough, but tracing must actually stop because things
* like calling smp_processor_id() may crash the system.
*
* Most likely, you want to use tracing_on/tracing_off.
*/
#ifdef CONFIG_RING_BUFFER
void tracing_on(void);
void tracing_off(void);
/* trace_off_permanent stops recording with no way to bring it back */
void tracing_off_permanent(void);
int tracing_is_on(void);
#else
static inline void tracing_on(void) { }
static inline void tracing_off(void) { }
static inline void tracing_off_permanent(void) { }
static inline int tracing_is_on(void) { return 0; }
#endif
#ifdef CONFIG_TRACING
extern void tracing_start(void);
extern void tracing_stop(void);
extern void ftrace_off_permanent(void);
extern void
ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3);
static inline void __attribute__ ((format (printf, 1, 2)))
____trace_printk_check_format(const char *fmt, ...)
{
}
#define __trace_printk_check_format(fmt, args...) \
do { \
if (0) \
____trace_printk_check_format(fmt, ##args); \
} while (0)
/**
* trace_printk - printf formatting in the ftrace buffer
* @fmt: the printf format for printing
*
* Note: __trace_printk is an internal function for trace_printk and
* the @ip is passed in via the trace_printk macro.
*
* This function allows a kernel developer to debug fast path sections
* that printk is not appropriate for. By scattering in various
* printk like tracing in the code, a developer can quickly see
* where problems are occurring.
*
* This is intended as a debugging tool for the developer only.
* Please refrain from leaving trace_printks scattered around in
* your code.
*/
#define trace_printk(fmt, args...) \
do { \
__trace_printk_check_format(fmt, ##args); \
if (__builtin_constant_p(fmt)) { \
static const char *trace_printk_fmt \
__attribute__((section("__trace_printk_fmt"))) = \
__builtin_constant_p(fmt) ? fmt : NULL; \
\
__trace_bprintk(_THIS_IP_, trace_printk_fmt, ##args); \
} else \
__trace_printk(_THIS_IP_, fmt, ##args); \
} while (0)
extern int
__trace_bprintk(unsigned long ip, const char *fmt, ...)
__attribute__ ((format (printf, 2, 3)));
extern int
__trace_printk(unsigned long ip, const char *fmt, ...)
__attribute__ ((format (printf, 2, 3)));
/*
* The double __builtin_constant_p is because gcc will give us an error
* if we try to allocate the static variable to fmt if it is not a
* constant. Even with the outer if statement.
*/
#define ftrace_vprintk(fmt, vargs) \
do { \
if (__builtin_constant_p(fmt)) { \
static const char *trace_printk_fmt \
__attribute__((section("__trace_printk_fmt"))) = \
__builtin_constant_p(fmt) ? fmt : NULL; \
\
__ftrace_vbprintk(_THIS_IP_, trace_printk_fmt, vargs); \
} else \
__ftrace_vprintk(_THIS_IP_, fmt, vargs); \
} while (0)
extern int
__ftrace_vbprintk(unsigned long ip, const char *fmt, va_list ap);
extern int
__ftrace_vprintk(unsigned long ip, const char *fmt, va_list ap);
extern void ftrace_dump(void);
#else
static inline void
ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { }
static inline int
trace_printk(const char *fmt, ...) __attribute__ ((format (printf, 1, 2)));
static inline void tracing_start(void) { }
static inline void tracing_stop(void) { }
static inline void ftrace_off_permanent(void) { }
static inline int
trace_printk(const char *fmt, ...)
{
return 0;
}
static inline int
ftrace_vprintk(const char *fmt, va_list ap)
{
return 0;
}
static inline void ftrace_dump(void) { }
#endif /* CONFIG_TRACING */
/*
 * Display an IP address in readable format.
 */
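A typical (and deliberately temporary) use of the trace_printk() facility introduced in the block above, in a hypothetical hot path:

#include <linux/kernel.h>
#include <linux/types.h>

/* Hypothetical hot-path function instrumented for a quick look. */
static void my_fast_path(int cpu, u64 delta)
{
	/*
	 * Goes to the ftrace ring buffer, not the printk log; the constant
	 * format string is stored once in the __trace_printk_fmt section.
	 */
	trace_printk("cpu=%d delta=%llu\n", cpu, (unsigned long long)delta);
}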

View file

@@ -110,4 +110,10 @@ struct memory_accessor {
			     off_t offset, size_t count);
};
/*
* Kernel text modification mutex, used for code patching. Users of this lock
* can sleep.
*/
extern struct mutex text_mutex;
#endif /* _LINUX_MEMORY_H_ */

View file

@@ -333,6 +333,11 @@ struct module
	unsigned int num_tracepoints;
#endif
#ifdef CONFIG_TRACING
const char **trace_bprintk_fmt_start;
unsigned int num_trace_bprintk_fmt;
#endif
#ifdef CONFIG_MODULE_UNLOAD
	/* What modules depend on me? */
	struct list_head modules_which_use_me;

View file

@@ -8,7 +8,7 @@ struct ring_buffer;
struct ring_buffer_iter;
/*
- * Don't reference this struct directly, use functions below.
 * Don't refer to this struct directly, use functions below.
 */
struct ring_buffer_event {
	u32		type:2, len:3, time_delta:27;
@@ -18,10 +18,13 @@ struct ring_buffer_event {
/**
 * enum ring_buffer_type - internal ring buffer types
 *
- * @RINGBUF_TYPE_PADDING:	Left over page padding
- *				 array is ignored
- *				 size is variable depending on how much
 * @RINGBUF_TYPE_PADDING:	Left over page padding or discarded event
 *				 If time_delta is 0:
 *				  array is ignored
 *				  size is variable depending on how much
 *				  padding is needed
 *				 If time_delta is non zero:
 *				  everything else same as RINGBUF_TYPE_DATA
 *
 * @RINGBUF_TYPE_TIME_EXTEND:	Extend the time delta
 *				 array[0] = time delta (28 .. 59)
@@ -65,6 +68,8 @@ ring_buffer_event_time_delta(struct ring_buffer_event *event)
	return event->time_delta;
}
void ring_buffer_event_discard(struct ring_buffer_event *event);
/*
 * size is in bytes for each per CPU buffer.
 */
@@ -74,13 +79,10 @@ void ring_buffer_free(struct ring_buffer *buffer);
int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size);
-struct ring_buffer_event *
-ring_buffer_lock_reserve(struct ring_buffer *buffer,
-			 unsigned long length,
-			 unsigned long *flags);
struct ring_buffer_event *ring_buffer_lock_reserve(struct ring_buffer *buffer,
						   unsigned long length);
int ring_buffer_unlock_commit(struct ring_buffer *buffer,
-			      struct ring_buffer_event *event,
-			      unsigned long flags);
			      struct ring_buffer_event *event);
int ring_buffer_write(struct ring_buffer *buffer,
		      unsigned long length, void *data);
@@ -121,17 +123,19 @@ unsigned long ring_buffer_overruns(struct ring_buffer *buffer);
unsigned long ring_buffer_entries_cpu(struct ring_buffer *buffer, int cpu);
unsigned long ring_buffer_overrun_cpu(struct ring_buffer *buffer, int cpu);
-u64 ring_buffer_time_stamp(int cpu);
-void ring_buffer_normalize_time_stamp(int cpu, u64 *ts);
u64 ring_buffer_time_stamp(struct ring_buffer *buffer, int cpu);
void ring_buffer_normalize_time_stamp(struct ring_buffer *buffer,
				      int cpu, u64 *ts);
void ring_buffer_set_clock(struct ring_buffer *buffer,
			   u64 (*clock)(void));
size_t ring_buffer_page_len(void *page);
void tracing_on(void);
void tracing_off(void);
void tracing_off_permanent(void);
void *ring_buffer_alloc_read_page(struct ring_buffer *buffer);
void ring_buffer_free_read_page(struct ring_buffer *buffer, void *data);
-int ring_buffer_read_page(struct ring_buffer *buffer,
-			  void **data_page, int cpu, int full);
int ring_buffer_read_page(struct ring_buffer *buffer, void **data_page,
			  size_t len, int cpu, int full);
enum ring_buffer_flags {
	RB_FL_OVERWRITE		= 1 << 0,
View file

@@ -138,6 +138,8 @@ extern unsigned long nr_uninterruptible(void);
extern unsigned long nr_active(void);
extern unsigned long nr_iowait(void);
extern unsigned long get_parent_ip(unsigned long addr);
struct seq_file;
struct cfs_rq;
struct task_group;
@@ -1405,6 +1407,8 @@ struct task_struct {
	int curr_ret_stack;
	/* Stack of return addresses for return function tracing */
	struct ftrace_ret_stack	*ret_stack;
	/* time stamp for last schedule */
	unsigned long long ftrace_timestamp;
	/*
	 * Number of functions that haven't been traced
	 * because of depth overrun.

View file

@@ -14,6 +14,7 @@
#include <asm/page.h>		/* kmalloc_sizes.h needs PAGE_SIZE */
#include <asm/cache.h>		/* kmalloc_sizes.h needs L1_CACHE_BYTES */
#include <linux/compiler.h>
#include <trace/kmemtrace.h>
/* Size description struct for general caches. */
struct cache_sizes {
@@ -28,8 +29,26 @@ extern struct cache_sizes malloc_sizes[];
void *kmem_cache_alloc(struct kmem_cache *, gfp_t);
void *__kmalloc(size_t size, gfp_t flags);
-static inline void *kmalloc(size_t size, gfp_t flags)
#ifdef CONFIG_KMEMTRACE
extern void *kmem_cache_alloc_notrace(struct kmem_cache *cachep, gfp_t flags);
extern size_t slab_buffer_size(struct kmem_cache *cachep);
#else
static __always_inline void *
kmem_cache_alloc_notrace(struct kmem_cache *cachep, gfp_t flags)
{
return kmem_cache_alloc(cachep, flags);
}
static inline size_t slab_buffer_size(struct kmem_cache *cachep)
{
return 0;
}
#endif
static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
struct kmem_cache *cachep;
void *ret;
	if (__builtin_constant_p(size)) {
		int i = 0;
@@ -47,10 +66,17 @@ static inline void *kmalloc(size_t size, gfp_t flags)
found:
#ifdef CONFIG_ZONE_DMA
		if (flags & GFP_DMA)
-			return kmem_cache_alloc(malloc_sizes[i].cs_dmacachep,
-						flags);
			cachep = malloc_sizes[i].cs_dmacachep;
		else
#endif
-		return kmem_cache_alloc(malloc_sizes[i].cs_cachep, flags);
			cachep = malloc_sizes[i].cs_cachep;
		ret = kmem_cache_alloc_notrace(cachep, flags);
		kmemtrace_mark_alloc(KMEMTRACE_TYPE_KMALLOC, _THIS_IP_, ret,
				     size, slab_buffer_size(cachep), flags);
		return ret;
	}
	return __kmalloc(size, flags);
}
@@ -59,8 +85,25 @@ static inline void *kmalloc(size_t size, gfp_t flags)
extern void *__kmalloc_node(size_t size, gfp_t flags, int node);
extern void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
-static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
#ifdef CONFIG_KMEMTRACE
extern void *kmem_cache_alloc_node_notrace(struct kmem_cache *cachep,
gfp_t flags,
int nodeid);
#else
static __always_inline void *
kmem_cache_alloc_node_notrace(struct kmem_cache *cachep,
gfp_t flags,
int nodeid)
{
return kmem_cache_alloc_node(cachep, flags, nodeid);
}
#endif
static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
{
struct kmem_cache *cachep;
void *ret;
	if (__builtin_constant_p(size)) {
		int i = 0;
@@ -78,11 +121,18 @@ static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
found:
#ifdef CONFIG_ZONE_DMA
		if (flags & GFP_DMA)
-			return kmem_cache_alloc_node(malloc_sizes[i].cs_dmacachep,
-						     flags, node);
			cachep = malloc_sizes[i].cs_dmacachep;
		else
#endif
-		return kmem_cache_alloc_node(malloc_sizes[i].cs_cachep,
-					     flags, node);
			cachep = malloc_sizes[i].cs_cachep;
ret = kmem_cache_alloc_node_notrace(cachep, flags, node);
kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC, _THIS_IP_,
ret, size, slab_buffer_size(cachep),
flags, node);
return ret;
	}
	return __kmalloc_node(size, flags, node);
}
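The pattern these slab changes follow, shown as a hypothetical standalone wrapper (not from the patch): allocate through the _notrace variant so that exactly one kmemtrace event is emitted for the caller-visible allocation.

#include <linux/kernel.h>
#include <linux/slab.h>
#include <trace/kmemtrace.h>

/* Hypothetical wrapper mirroring the instrumentation added above. */
static void *my_traced_cache_alloc(struct kmem_cache *cachep, size_t objsize,
				   gfp_t flags)
{
	/* _notrace variant avoids a second, duplicate trace event. */
	void *ret = kmem_cache_alloc_notrace(cachep, flags);

	kmemtrace_mark_alloc(KMEMTRACE_TYPE_CACHE, _THIS_IP_, ret,
			     objsize, objsize, flags);
	return ret;
}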

View file

@@ -3,14 +3,15 @@
void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
-static inline void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
static __always_inline void *kmem_cache_alloc(struct kmem_cache *cachep,
					      gfp_t flags)
{
	return kmem_cache_alloc_node(cachep, flags, -1);
}
void *__kmalloc_node(size_t size, gfp_t flags, int node);
-static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
{
	return __kmalloc_node(size, flags, node);
}
@@ -23,12 +24,12 @@ static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 * kmalloc is the normal method of allocating memory
 * in the kernel.
 */
-static inline void *kmalloc(size_t size, gfp_t flags)
static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
	return __kmalloc_node(size, flags, -1);
}
-static inline void *__kmalloc(size_t size, gfp_t flags)
static __always_inline void *__kmalloc(size_t size, gfp_t flags)
{
	return kmalloc(size, flags);
}

View file

@@ -10,6 +10,7 @@
#include <linux/gfp.h>
#include <linux/workqueue.h>
#include <linux/kobject.h>
#include <trace/kmemtrace.h>
enum stat_item {
	ALLOC_FASTPATH,		/* Allocation from cpu slab */
@@ -217,13 +218,31 @@ static __always_inline struct kmem_cache *kmalloc_slab(size_t size)
void *kmem_cache_alloc(struct kmem_cache *, gfp_t);
void *__kmalloc(size_t size, gfp_t flags);
#ifdef CONFIG_KMEMTRACE
extern void *kmem_cache_alloc_notrace(struct kmem_cache *s, gfp_t gfpflags);
#else
static __always_inline void *
kmem_cache_alloc_notrace(struct kmem_cache *s, gfp_t gfpflags)
{
return kmem_cache_alloc(s, gfpflags);
}
#endif
static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
{
-	return (void *)__get_free_pages(flags | __GFP_COMP, get_order(size));
	unsigned int order = get_order(size);
	void *ret = (void *) __get_free_pages(flags | __GFP_COMP, order);
kmemtrace_mark_alloc(KMEMTRACE_TYPE_KMALLOC, _THIS_IP_, ret,
size, PAGE_SIZE << order, flags);
return ret;
}
static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
	void *ret;
	if (__builtin_constant_p(size)) {
		if (size > SLUB_MAX_SIZE)
			return kmalloc_large(size, flags);
@@ -234,7 +253,13 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
		if (!s)
			return ZERO_SIZE_PTR;
-		return kmem_cache_alloc(s, flags);
		ret = kmem_cache_alloc_notrace(s, flags);
		kmemtrace_mark_alloc(KMEMTRACE_TYPE_KMALLOC,
				     _THIS_IP_, ret,
				     size, s->size, flags);
		return ret;
	}
}
	return __kmalloc(size, flags);
@@ -244,8 +269,24 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
void *__kmalloc_node(size_t size, gfp_t flags, int node);
void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
#ifdef CONFIG_KMEMTRACE
extern void *kmem_cache_alloc_node_notrace(struct kmem_cache *s,
gfp_t gfpflags,
int node);
#else
static __always_inline void *
kmem_cache_alloc_node_notrace(struct kmem_cache *s,
gfp_t gfpflags,
int node)
{
return kmem_cache_alloc_node(s, gfpflags, node);
}
#endif
static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
{
	void *ret;
	if (__builtin_constant_p(size) &&
	    size <= SLUB_MAX_SIZE && !(flags & SLUB_DMA)) {
		struct kmem_cache *s = kmalloc_slab(size);
@@ -253,7 +294,13 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
		if (!s)
			return ZERO_SIZE_PTR;
-		return kmem_cache_alloc_node(s, flags, node);
		ret = kmem_cache_alloc_node_notrace(s, flags, node);
		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
					  _THIS_IP_, ret,
					  size, s->size, flags, node);
		return ret;
	}
	return __kmalloc_node(size, flags, node);
}

View file

@@ -65,6 +65,7 @@ struct old_linux_dirent;
#include <asm/signal.h>
#include <linux/quota.h>
#include <linux/key.h>
#include <linux/ftrace.h>
#define __SC_DECL1(t1, a1)	t1 a1
#define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
@@ -95,7 +96,46 @@ struct old_linux_dirent;
#define __SC_TEST5(t5, a5, ...)	__SC_TEST(t5); __SC_TEST4(__VA_ARGS__)
#define __SC_TEST6(t6, a6, ...)	__SC_TEST(t6); __SC_TEST5(__VA_ARGS__)
#ifdef CONFIG_FTRACE_SYSCALLS
#define __SC_STR_ADECL1(t, a) #a
#define __SC_STR_ADECL2(t, a, ...) #a, __SC_STR_ADECL1(__VA_ARGS__)
#define __SC_STR_ADECL3(t, a, ...) #a, __SC_STR_ADECL2(__VA_ARGS__)
#define __SC_STR_ADECL4(t, a, ...) #a, __SC_STR_ADECL3(__VA_ARGS__)
#define __SC_STR_ADECL5(t, a, ...) #a, __SC_STR_ADECL4(__VA_ARGS__)
#define __SC_STR_ADECL6(t, a, ...) #a, __SC_STR_ADECL5(__VA_ARGS__)
#define __SC_STR_TDECL1(t, a) #t
#define __SC_STR_TDECL2(t, a, ...) #t, __SC_STR_TDECL1(__VA_ARGS__)
#define __SC_STR_TDECL3(t, a, ...) #t, __SC_STR_TDECL2(__VA_ARGS__)
#define __SC_STR_TDECL4(t, a, ...) #t, __SC_STR_TDECL3(__VA_ARGS__)
#define __SC_STR_TDECL5(t, a, ...) #t, __SC_STR_TDECL4(__VA_ARGS__)
#define __SC_STR_TDECL6(t, a, ...) #t, __SC_STR_TDECL5(__VA_ARGS__)
#define SYSCALL_METADATA(sname, nb) \
static const struct syscall_metadata __used \
__attribute__((__aligned__(4))) \
__attribute__((section("__syscalls_metadata"))) \
__syscall_meta_##sname = { \
.name = "sys"#sname, \
.nb_args = nb, \
.types = types_##sname, \
.args = args_##sname, \
}
#define SYSCALL_DEFINE0(sname) \
static const struct syscall_metadata __used \
__attribute__((__aligned__(4))) \
__attribute__((section("__syscalls_metadata"))) \
__syscall_meta_##sname = { \
.name = "sys_"#sname, \
.nb_args = 0, \
}; \
asmlinkage long sys_##sname(void)
#else
#define SYSCALL_DEFINE0(name)	   asmlinkage long sys_##name(void)
#endif
#define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
#define SYSCALL_DEFINE2(name, ...) SYSCALL_DEFINEx(2, _##name, __VA_ARGS__)
#define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
@@ -117,10 +157,26 @@ struct old_linux_dirent;
#endif
#endif
#ifdef CONFIG_FTRACE_SYSCALLS
#define SYSCALL_DEFINEx(x, sname, ...) \
static const char *types_##sname[] = { \
__SC_STR_TDECL##x(__VA_ARGS__) \
}; \
static const char *args_##sname[] = { \
__SC_STR_ADECL##x(__VA_ARGS__) \
}; \
SYSCALL_METADATA(sname, x); \
__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
#else
#define SYSCALL_DEFINEx(x, sname, ...) \
__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
#endif
#ifdef CONFIG_HAVE_SYSCALL_WRAPPERS
#define SYSCALL_DEFINE(name) static inline long SYSC_##name
-#define SYSCALL_DEFINEx(x, name, ...)				\
#define __SYSCALL_DEFINEx(x, name, ...)				\
	asmlinkage long sys##name(__SC_DECL##x(__VA_ARGS__));	\
	static inline long SYSC##name(__SC_DECL##x(__VA_ARGS__)); \
	asmlinkage long SyS##name(__SC_LONG##x(__VA_ARGS__))	\
@@ -134,7 +190,7 @@ struct old_linux_dirent;
#else /* CONFIG_HAVE_SYSCALL_WRAPPERS */
#define SYSCALL_DEFINE(name) asmlinkage long sys_##name
-#define SYSCALL_DEFINEx(x, name, ...)				\
#define __SYSCALL_DEFINEx(x, name, ...)				\
	asmlinkage long sys##name(__SC_DECL##x(__VA_ARGS__))
#endif /* CONFIG_HAVE_SYSCALL_WRAPPERS */
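With CONFIG_FTRACE_SYSCALLS, every SYSCALL_DEFINEx() user now also emits a struct syscall_metadata entry. A hypothetical debugging helper showing how that metadata can be consumed through syscall_nr_to_meta(), declared in the linux/ftrace.h changes earlier in this series:

#include <linux/ftrace.h>
#include <linux/kernel.h>

/* Hypothetical helper: pretty-print one syscall's recorded signature. */
static void my_dump_syscall(int nr)
{
	struct syscall_metadata *meta = syscall_nr_to_meta(nr);
	int i;

	if (!meta)
		return;

	pr_info("syscall %s takes %d args\n", meta->name, meta->nb_args);
	for (i = 0; i < meta->nb_args; i++)
		pr_info("  arg%d: %s %s\n", i, meta->types[i], meta->args[i]);
}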

View file

@ -0,0 +1,19 @@
#ifndef _LINUX_TRACE_CLOCK_H
#define _LINUX_TRACE_CLOCK_H
/*
* 3 trace clock variants, with differing scalability/precision
* tradeoffs:
*
* - local: CPU-local trace clock
* - medium: scalable global clock with some jitter
* - global: globally monotonic, serialized clock
*/
#include <linux/compiler.h>
#include <linux/types.h>
extern u64 notrace trace_clock_local(void);
extern u64 notrace trace_clock(void);
extern u64 notrace trace_clock_global(void);
#endif /* _LINUX_TRACE_CLOCK_H */
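A hypothetical use of the cheapest, CPU-local clock variant to time a short section of code (the value follows sched_clock(), roughly nanosecond resolution):

#include <linux/kernel.h>
#include <linux/trace_clock.h>
#include <linux/types.h>

/* Hypothetical timing of a section with the local trace clock. */
static void my_timed_section(void)
{
	u64 t0 = trace_clock_local();

	/* ... the work being measured ... */

	trace_printk("section took %llu\n",
		     (unsigned long long)(trace_clock_local() - t0));
}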

View file

@@ -31,8 +31,8 @@ struct tracepoint {
 * Keep in sync with vmlinux.lds.h.
 */
-#define TPPROTO(args...)	args
-#define TPARGS(args...)		args
#define TP_PROTO(args...)	args
#define TP_ARGS(args...)	args
#ifdef CONFIG_TRACEPOINTS
@@ -65,7 +65,7 @@ struct tracepoint {
	{								\
		if (unlikely(__tracepoint_##name.state))		\
			__DO_TRACE(&__tracepoint_##name,		\
-				TPPROTO(proto), TPARGS(args));		\
				TP_PROTO(proto), TP_ARGS(args));	\
	}								\
	static inline int register_trace_##name(void (*probe)(proto))	\
	{								\
@@ -153,4 +153,114 @@ static inline void tracepoint_synchronize_unregister(void)
	synchronize_sched();
}
#define PARAMS(args...) args
#define TRACE_FORMAT(name, proto, args, fmt) \
DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
/*
* For use with the TRACE_EVENT macro:
*
* We define a tracepoint, its arguments, its printk format
 * and its 'fast binary record' layout.
*
 * Firstly, name your tracepoint via TRACE_EVENT(name, ...): the
* 'subsystem_event' notation is fine.
*
* Think about this whole construct as the
* 'trace_sched_switch() function' from now on.
*
*
* TRACE_EVENT(sched_switch,
*
* *
* * A function has a regular function arguments
* * prototype, declare it via TP_PROTO():
* *
*
* TP_PROTO(struct rq *rq, struct task_struct *prev,
* struct task_struct *next),
*
* *
* * Define the call signature of the 'function'.
* * (Design sidenote: we use this instead of a
* * TP_PROTO1/TP_PROTO2/TP_PROTO3 ugliness.)
* *
*
* TP_ARGS(rq, prev, next),
*
* *
* * Fast binary tracing: define the trace record via
* * TP_STRUCT__entry(). You can think about it like a
* * regular C structure local variable definition.
* *
* * This is how the trace record is structured and will
* * be saved into the ring buffer. These are the fields
* * that will be exposed to user-space in
* * /debug/tracing/events/<*>/format.
* *
* * The declared 'local variable' is called '__entry'
* *
 * *	__field(pid_t, prev_pid) is equivalent to a standard declaration:
* *
* * pid_t prev_pid;
* *
* * __array(char, prev_comm, TASK_COMM_LEN) is equivalent to:
* *
* * char prev_comm[TASK_COMM_LEN];
* *
*
* TP_STRUCT__entry(
* __array( char, prev_comm, TASK_COMM_LEN )
* __field( pid_t, prev_pid )
* __field( int, prev_prio )
* __array( char, next_comm, TASK_COMM_LEN )
* __field( pid_t, next_pid )
* __field( int, next_prio )
* ),
*
* *
* * Assign the entry into the trace record, by embedding
* * a full C statement block into TP_fast_assign(). You
* * can refer to the trace record as '__entry' -
* * otherwise you can put arbitrary C code in here.
* *
* * Note: this C code will execute every time a trace event
* * happens, on an active tracepoint.
* *
*
* TP_fast_assign(
* memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
* __entry->prev_pid = prev->pid;
* __entry->prev_prio = prev->prio;
* memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
* __entry->next_pid = next->pid;
* __entry->next_prio = next->prio;
* )
*
* *
* * Formatted output of a trace record via TP_printk().
* * This is how the tracepoint will appear under ftrace
* * plugins that make use of this tracepoint.
* *
 * * (raw-binary tracing won't actually perform this step.)
* *
*
* TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
* __entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
* __entry->next_comm, __entry->next_pid, __entry->next_prio),
*
* );
*
* This macro construct is thus used for the regular printk format
* tracing setup, it is used to construct a function pointer based
* tracepoint callback (this is used by programmatic plugins and
 * can also be used by generic instrumentation like SystemTap), and
* it is also used to expose a structured trace record in
* /debug/tracing/events/.
*/
#define TRACE_EVENT(name, proto, args, struct, assign, print) \
DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
#endif #endif

View file

@@ -5,72 +5,72 @@
#include <linux/tracepoint.h>
DECLARE_TRACE(block_rq_abort,
	TP_PROTO(struct request_queue *q, struct request *rq),
	TP_ARGS(q, rq));
DECLARE_TRACE(block_rq_insert,
	TP_PROTO(struct request_queue *q, struct request *rq),
	TP_ARGS(q, rq));
DECLARE_TRACE(block_rq_issue,
	TP_PROTO(struct request_queue *q, struct request *rq),
	TP_ARGS(q, rq));
DECLARE_TRACE(block_rq_requeue,
	TP_PROTO(struct request_queue *q, struct request *rq),
	TP_ARGS(q, rq));
DECLARE_TRACE(block_rq_complete,
	TP_PROTO(struct request_queue *q, struct request *rq),
	TP_ARGS(q, rq));
DECLARE_TRACE(block_bio_bounce,
	TP_PROTO(struct request_queue *q, struct bio *bio),
	TP_ARGS(q, bio));
DECLARE_TRACE(block_bio_complete,
	TP_PROTO(struct request_queue *q, struct bio *bio),
	TP_ARGS(q, bio));
DECLARE_TRACE(block_bio_backmerge,
	TP_PROTO(struct request_queue *q, struct bio *bio),
	TP_ARGS(q, bio));
DECLARE_TRACE(block_bio_frontmerge,
	TP_PROTO(struct request_queue *q, struct bio *bio),
	TP_ARGS(q, bio));
DECLARE_TRACE(block_bio_queue,
	TP_PROTO(struct request_queue *q, struct bio *bio),
	TP_ARGS(q, bio));
DECLARE_TRACE(block_getrq,
	TP_PROTO(struct request_queue *q, struct bio *bio, int rw),
	TP_ARGS(q, bio, rw));
DECLARE_TRACE(block_sleeprq,
	TP_PROTO(struct request_queue *q, struct bio *bio, int rw),
	TP_ARGS(q, bio, rw));
DECLARE_TRACE(block_plug,
	TP_PROTO(struct request_queue *q),
	TP_ARGS(q));
DECLARE_TRACE(block_unplug_timer,
	TP_PROTO(struct request_queue *q),
	TP_ARGS(q));
DECLARE_TRACE(block_unplug_io,
	TP_PROTO(struct request_queue *q),
	TP_ARGS(q));
DECLARE_TRACE(block_split,
	TP_PROTO(struct request_queue *q, struct bio *bio, unsigned int pdu),
	TP_ARGS(q, bio, pdu));
DECLARE_TRACE(block_remap,
	TP_PROTO(struct request_queue *q, struct bio *bio, dev_t dev,
		 sector_t from, sector_t to),
	TP_ARGS(q, bio, dev, from, to));
#endif
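A sketch of attaching a probe to one of these tracepoints from a module, using the register_trace_<name>() helpers that DECLARE_TRACE() generates. The module itself is hypothetical and the header path is assumed to be <trace/block.h>:

#include <linux/blkdev.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <trace/block.h>

/* Hypothetical probe: log every queue plug event. */
static void my_plug_probe(struct request_queue *q)
{
	pr_info("block: queue %p plugged\n", q);
}

static int __init my_probe_init(void)
{
	/* register_trace_block_plug() is generated by DECLARE_TRACE() above. */
	return register_trace_block_plug(my_plug_probe);
}

static void __exit my_probe_exit(void)
{
	unregister_trace_block_plug(my_plug_probe);
	tracepoint_synchronize_unregister();
}

module_init(my_probe_init);
module_exit(my_probe_exit);
MODULE_LICENSE("GPL");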

include/trace/irq.h (new file)
View file

@ -0,0 +1,9 @@
#ifndef _TRACE_IRQ_H
#define _TRACE_IRQ_H
#include <linux/interrupt.h>
#include <linux/tracepoint.h>
#include <trace/irq_event_types.h>
#endif

View file

@ -0,0 +1,55 @@
/* use <trace/irq.h> instead */
#ifndef TRACE_FORMAT
# error Do not include this file directly.
# error Unless you know what you are doing.
#endif
#undef TRACE_SYSTEM
#define TRACE_SYSTEM irq
/*
* Tracepoint for entry of interrupt handler:
*/
TRACE_FORMAT(irq_handler_entry,
TP_PROTO(int irq, struct irqaction *action),
TP_ARGS(irq, action),
TP_FMT("irq=%d handler=%s", irq, action->name)
);
/*
* Tracepoint for return of an interrupt handler:
*/
TRACE_EVENT(irq_handler_exit,
TP_PROTO(int irq, struct irqaction *action, int ret),
TP_ARGS(irq, action, ret),
TP_STRUCT__entry(
__field( int, irq )
__field( int, ret )
),
TP_fast_assign(
__entry->irq = irq;
__entry->ret = ret;
),
TP_printk("irq=%d return=%s",
__entry->irq, __entry->ret ? "handled" : "unhandled")
);
TRACE_FORMAT(softirq_entry,
TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
TP_ARGS(h, vec),
TP_FMT("softirq=%d action=%s", (int)(h - vec), softirq_to_name[h-vec])
);
TRACE_FORMAT(softirq_exit,
TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
TP_ARGS(h, vec),
TP_FMT("softirq=%d action=%s", (int)(h - vec), softirq_to_name[h-vec])
);
#undef TRACE_SYSTEM

include/trace/kmemtrace.h (new file)
View file

@ -0,0 +1,75 @@
/*
* Copyright (C) 2008 Eduard - Gabriel Munteanu
*
* This file is released under GPL version 2.
*/
#ifndef _LINUX_KMEMTRACE_H
#define _LINUX_KMEMTRACE_H
#ifdef __KERNEL__
#include <linux/types.h>
#include <linux/marker.h>
enum kmemtrace_type_id {
KMEMTRACE_TYPE_KMALLOC = 0, /* kmalloc() or kfree(). */
KMEMTRACE_TYPE_CACHE, /* kmem_cache_*(). */
KMEMTRACE_TYPE_PAGES, /* __get_free_pages() and friends. */
};
#ifdef CONFIG_KMEMTRACE
extern void kmemtrace_init(void);
extern void kmemtrace_mark_alloc_node(enum kmemtrace_type_id type_id,
unsigned long call_site,
const void *ptr,
size_t bytes_req,
size_t bytes_alloc,
gfp_t gfp_flags,
int node);
extern void kmemtrace_mark_free(enum kmemtrace_type_id type_id,
unsigned long call_site,
const void *ptr);
#else /* CONFIG_KMEMTRACE */
static inline void kmemtrace_init(void)
{
}
static inline void kmemtrace_mark_alloc_node(enum kmemtrace_type_id type_id,
unsigned long call_site,
const void *ptr,
size_t bytes_req,
size_t bytes_alloc,
gfp_t gfp_flags,
int node)
{
}
static inline void kmemtrace_mark_free(enum kmemtrace_type_id type_id,
unsigned long call_site,
const void *ptr)
{
}
#endif /* CONFIG_KMEMTRACE */
static inline void kmemtrace_mark_alloc(enum kmemtrace_type_id type_id,
unsigned long call_site,
const void *ptr,
size_t bytes_req,
size_t bytes_alloc,
gfp_t gfp_flags)
{
kmemtrace_mark_alloc_node(type_id, call_site, ptr,
bytes_req, bytes_alloc, gfp_flags, -1);
}
#endif /* __KERNEL__ */
#endif /* _LINUX_KMEMTRACE_H */
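(Usage sketch, not part of this patch: the hook functions above are meant to be called from the allocator paths themselves. A hypothetical wrapper around kmalloc()/kfree() would report events roughly as follows, using _RET_IP_ as the call site; kmemtrace_mark_alloc() is the non-NUMA variant that passes node = -1.)

#include <linux/slab.h>
#include <trace/kmemtrace.h>

/* Illustrative only: how an allocation path might feed kmemtrace. */
static void *traced_alloc(size_t size, gfp_t flags)
{
	void *ptr = kmalloc(size, flags);

	if (ptr)
		kmemtrace_mark_alloc(KMEMTRACE_TYPE_KMALLOC, _RET_IP_, ptr,
				     size, ksize(ptr), flags);
	return ptr;
}

static void traced_free(void *ptr)
{
	kmemtrace_mark_free(KMEMTRACE_TYPE_KMALLOC, _RET_IP_, ptr);
	kfree(ptr);
}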

include/trace/lockdep.h (new file)

@@ -0,0 +1,9 @@
#ifndef _TRACE_LOCKDEP_H
#define _TRACE_LOCKDEP_H
#include <linux/lockdep.h>
#include <linux/tracepoint.h>
#include <trace/lockdep_event_types.h>
#endif

include/trace/lockdep_event_types.h (new file)

@@ -0,0 +1,44 @@
#ifndef TRACE_FORMAT
# error Do not include this file directly.
# error Unless you know what you are doing.
#endif
#undef TRACE_SYSTEM
#define TRACE_SYSTEM lock
#ifdef CONFIG_LOCKDEP
TRACE_FORMAT(lock_acquire,
TP_PROTO(struct lockdep_map *lock, unsigned int subclass,
int trylock, int read, int check,
struct lockdep_map *next_lock, unsigned long ip),
TP_ARGS(lock, subclass, trylock, read, check, next_lock, ip),
TP_FMT("%s%s%s", trylock ? "try " : "",
read ? "read " : "", lock->name)
);
TRACE_FORMAT(lock_release,
TP_PROTO(struct lockdep_map *lock, int nested, unsigned long ip),
TP_ARGS(lock, nested, ip),
TP_FMT("%s", lock->name)
);
#ifdef CONFIG_LOCK_STAT
TRACE_FORMAT(lock_contended,
TP_PROTO(struct lockdep_map *lock, unsigned long ip),
TP_ARGS(lock, ip),
TP_FMT("%s", lock->name)
);
TRACE_FORMAT(lock_acquired,
TP_PROTO(struct lockdep_map *lock, unsigned long ip),
TP_ARGS(lock, ip),
TP_FMT("%s", lock->name)
);
#endif
#endif
#undef TRACE_SYSTEM

include/trace/power.h (new file)

@@ -0,0 +1,32 @@
#ifndef _TRACE_POWER_H
#define _TRACE_POWER_H
#include <linux/ktime.h>
#include <linux/tracepoint.h>
enum {
POWER_NONE = 0,
POWER_CSTATE = 1,
POWER_PSTATE = 2,
};
struct power_trace {
ktime_t stamp;
ktime_t end;
int type;
int state;
};
DECLARE_TRACE(power_start,
TP_PROTO(struct power_trace *it, unsigned int type, unsigned int state),
TP_ARGS(it, type, state));
DECLARE_TRACE(power_mark,
TP_PROTO(struct power_trace *it, unsigned int type, unsigned int state),
TP_ARGS(it, type, state));
DECLARE_TRACE(power_end,
TP_PROTO(struct power_trace *it),
TP_ARGS(it));
#endif /* _TRACE_POWER_H */

include/trace/sched.h

@@ -4,53 +4,6 @@
#include <linux/sched.h>
#include <linux/tracepoint.h>
-DECLARE_TRACE(sched_kthread_stop,
-TPPROTO(struct task_struct *t),
-TPARGS(t));
-DECLARE_TRACE(sched_kthread_stop_ret,
-TPPROTO(int ret),
-TPARGS(ret));
-DECLARE_TRACE(sched_wait_task,
-TPPROTO(struct rq *rq, struct task_struct *p),
-TPARGS(rq, p));
-DECLARE_TRACE(sched_wakeup,
-TPPROTO(struct rq *rq, struct task_struct *p, int success),
-TPARGS(rq, p, success));
-DECLARE_TRACE(sched_wakeup_new,
-TPPROTO(struct rq *rq, struct task_struct *p, int success),
-TPARGS(rq, p, success));
-DECLARE_TRACE(sched_switch,
-TPPROTO(struct rq *rq, struct task_struct *prev,
-struct task_struct *next),
-TPARGS(rq, prev, next));
-DECLARE_TRACE(sched_migrate_task,
-TPPROTO(struct task_struct *p, int orig_cpu, int dest_cpu),
-TPARGS(p, orig_cpu, dest_cpu));
-DECLARE_TRACE(sched_process_free,
-TPPROTO(struct task_struct *p),
-TPARGS(p));
-DECLARE_TRACE(sched_process_exit,
-TPPROTO(struct task_struct *p),
-TPARGS(p));
-DECLARE_TRACE(sched_process_wait,
-TPPROTO(struct pid *pid),
-TPARGS(pid));
-DECLARE_TRACE(sched_process_fork,
-TPPROTO(struct task_struct *parent, struct task_struct *child),
-TPARGS(parent, child));
-DECLARE_TRACE(sched_signal_send,
-TPPROTO(int sig, struct task_struct *p),
-TPARGS(sig, p));
+#include <trace/sched_event_types.h>
#endif

include/trace/sched_event_types.h (new file)

@@ -0,0 +1,337 @@
/* use <trace/sched.h> instead */
#ifndef TRACE_EVENT
# error Do not include this file directly.
# error Unless you know what you are doing.
#endif
#undef TRACE_SYSTEM
#define TRACE_SYSTEM sched
/*
* Tracepoint for calling kthread_stop, performed to end a kthread:
*/
TRACE_EVENT(sched_kthread_stop,
TP_PROTO(struct task_struct *t),
TP_ARGS(t),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
),
TP_fast_assign(
memcpy(__entry->comm, t->comm, TASK_COMM_LEN);
__entry->pid = t->pid;
),
TP_printk("task %s:%d", __entry->comm, __entry->pid)
);
/*
* Tracepoint for the return value of the kthread stopping:
*/
TRACE_EVENT(sched_kthread_stop_ret,
TP_PROTO(int ret),
TP_ARGS(ret),
TP_STRUCT__entry(
__field( int, ret )
),
TP_fast_assign(
__entry->ret = ret;
),
TP_printk("ret %d", __entry->ret)
);
/*
* Tracepoint for waiting on task to unschedule:
*
* (NOTE: the 'rq' argument is not used by generic trace events,
* but used by the latency tracer plugin. )
*/
TRACE_EVENT(sched_wait_task,
TP_PROTO(struct rq *rq, struct task_struct *p),
TP_ARGS(rq, p),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
__field( int, prio )
),
TP_fast_assign(
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid = p->pid;
__entry->prio = p->prio;
),
TP_printk("task %s:%d [%d]",
__entry->comm, __entry->pid, __entry->prio)
);
/*
* Tracepoint for waking up a task:
*
* (NOTE: the 'rq' argument is not used by generic trace events,
* but used by the latency tracer plugin. )
*/
TRACE_EVENT(sched_wakeup,
TP_PROTO(struct rq *rq, struct task_struct *p, int success),
TP_ARGS(rq, p, success),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
__field( int, prio )
__field( int, success )
),
TP_fast_assign(
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid = p->pid;
__entry->prio = p->prio;
__entry->success = success;
),
TP_printk("task %s:%d [%d] success=%d",
__entry->comm, __entry->pid, __entry->prio,
__entry->success)
);
/*
* Tracepoint for waking up a new task:
*
* (NOTE: the 'rq' argument is not used by generic trace events,
* but used by the latency tracer plugin. )
*/
TRACE_EVENT(sched_wakeup_new,
TP_PROTO(struct rq *rq, struct task_struct *p, int success),
TP_ARGS(rq, p, success),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
__field( int, prio )
__field( int, success )
),
TP_fast_assign(
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid = p->pid;
__entry->prio = p->prio;
__entry->success = success;
),
TP_printk("task %s:%d [%d] success=%d",
__entry->comm, __entry->pid, __entry->prio,
__entry->success)
);
/*
* Tracepoint for task switches, performed by the scheduler:
*
* (NOTE: the 'rq' argument is not used by generic trace events,
* but used by the latency tracer plugin. )
*/
TRACE_EVENT(sched_switch,
TP_PROTO(struct rq *rq, struct task_struct *prev,
struct task_struct *next),
TP_ARGS(rq, prev, next),
TP_STRUCT__entry(
__array( char, prev_comm, TASK_COMM_LEN )
__field( pid_t, prev_pid )
__field( int, prev_prio )
__array( char, next_comm, TASK_COMM_LEN )
__field( pid_t, next_pid )
__field( int, next_prio )
),
TP_fast_assign(
memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
__entry->prev_pid = prev->pid;
__entry->prev_prio = prev->prio;
memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
__entry->next_pid = next->pid;
__entry->next_prio = next->prio;
),
TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
__entry->next_comm, __entry->next_pid, __entry->next_prio)
);
/*
* Tracepoint for a task being migrated:
*/
TRACE_EVENT(sched_migrate_task,
TP_PROTO(struct task_struct *p, int orig_cpu, int dest_cpu),
TP_ARGS(p, orig_cpu, dest_cpu),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
__field( int, prio )
__field( int, orig_cpu )
__field( int, dest_cpu )
),
TP_fast_assign(
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid = p->pid;
__entry->prio = p->prio;
__entry->orig_cpu = orig_cpu;
__entry->dest_cpu = dest_cpu;
),
TP_printk("task %s:%d [%d] from: %d to: %d",
__entry->comm, __entry->pid, __entry->prio,
__entry->orig_cpu, __entry->dest_cpu)
);
/*
* Tracepoint for freeing a task:
*/
TRACE_EVENT(sched_process_free,
TP_PROTO(struct task_struct *p),
TP_ARGS(p),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
__field( int, prio )
),
TP_fast_assign(
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid = p->pid;
__entry->prio = p->prio;
),
TP_printk("task %s:%d [%d]",
__entry->comm, __entry->pid, __entry->prio)
);
/*
* Tracepoint for a task exiting:
*/
TRACE_EVENT(sched_process_exit,
TP_PROTO(struct task_struct *p),
TP_ARGS(p),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
__field( int, prio )
),
TP_fast_assign(
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid = p->pid;
__entry->prio = p->prio;
),
TP_printk("task %s:%d [%d]",
__entry->comm, __entry->pid, __entry->prio)
);
/*
* Tracepoint for a waiting task:
*/
TRACE_EVENT(sched_process_wait,
TP_PROTO(struct pid *pid),
TP_ARGS(pid),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
__field( int, prio )
),
TP_fast_assign(
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
__entry->pid = pid_nr(pid);
__entry->prio = current->prio;
),
TP_printk("task %s:%d [%d]",
__entry->comm, __entry->pid, __entry->prio)
);
/*
* Tracepoint for do_fork:
*/
TRACE_EVENT(sched_process_fork,
TP_PROTO(struct task_struct *parent, struct task_struct *child),
TP_ARGS(parent, child),
TP_STRUCT__entry(
__array( char, parent_comm, TASK_COMM_LEN )
__field( pid_t, parent_pid )
__array( char, child_comm, TASK_COMM_LEN )
__field( pid_t, child_pid )
),
TP_fast_assign(
memcpy(__entry->parent_comm, parent->comm, TASK_COMM_LEN);
__entry->parent_pid = parent->pid;
memcpy(__entry->child_comm, child->comm, TASK_COMM_LEN);
__entry->child_pid = child->pid;
),
TP_printk("parent %s:%d child %s:%d",
__entry->parent_comm, __entry->parent_pid,
__entry->child_comm, __entry->child_pid)
);
/*
* Tracepoint for sending a signal:
*/
TRACE_EVENT(sched_signal_send,
TP_PROTO(int sig, struct task_struct *p),
TP_ARGS(sig, p),
TP_STRUCT__entry(
__field( int, sig )
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
),
TP_fast_assign(
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid = p->pid;
__entry->sig = sig;
),
TP_printk("sig: %d task %s:%d",
__entry->sig, __entry->comm, __entry->pid)
);
#undef TRACE_SYSTEM

include/trace/skb.h

@@ -5,7 +5,7 @@
#include <linux/tracepoint.h>
DECLARE_TRACE(kfree_skb,
-TPPROTO(struct sk_buff *skb, void *location),
-TPARGS(skb, location));
+TP_PROTO(struct sk_buff *skb, void *location),
+TP_ARGS(skb, location));
#endif

include/trace/trace_event_types.h (new file)

@@ -0,0 +1,5 @@
/* trace/<type>_event_types.h here */
#include <trace/sched_event_types.h>
#include <trace/irq_event_types.h>
#include <trace/lockdep_event_types.h>

include/trace/trace_events.h (new file)

@@ -0,0 +1,5 @@
/* trace/<type>.h here */
#include <trace/sched.h>
#include <trace/irq.h>
#include <trace/lockdep.h>

include/trace/workqueue.h (new file)

@@ -0,0 +1,25 @@
#ifndef __TRACE_WORKQUEUE_H
#define __TRACE_WORKQUEUE_H
#include <linux/tracepoint.h>
#include <linux/workqueue.h>
#include <linux/sched.h>
DECLARE_TRACE(workqueue_insertion,
TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
TP_ARGS(wq_thread, work));
DECLARE_TRACE(workqueue_execution,
TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
TP_ARGS(wq_thread, work));
/* Trace the creation of one workqueue thread on a cpu */
DECLARE_TRACE(workqueue_creation,
TP_PROTO(struct task_struct *wq_thread, int cpu),
TP_ARGS(wq_thread, cpu));
DECLARE_TRACE(workqueue_destruction,
TP_PROTO(struct task_struct *wq_thread),
TP_ARGS(wq_thread));
#endif /* __TRACE_WORKQUEUE_H */
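(Sketch of the emitting side, which is not shown in this diff: each DECLARE_TRACE() above has to be paired with a DEFINE_TRACE() in kernel/workqueue.c, and the generated trace_*() function is called at the matching point. Roughly, and with an illustrative function name:)

#include <trace/workqueue.h>

DEFINE_TRACE(workqueue_execution);

/* Illustrative: fire the tracepoint just before running a queued work item. */
static void run_one_work(struct task_struct *wq_thread, struct work_struct *work)
{
	trace_workqueue_execution(wq_thread, work);
	work->func(work);
}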

init/Kconfig

@@ -1007,7 +1007,7 @@ config TRACEPOINTS
config MARKERS
bool "Activate markers"
-depends on TRACEPOINTS
+select TRACEPOINTS
help
Place an empty function call at each marker site. Can be
dynamically changed for a probe function.

init/main.c

@@ -71,6 +71,7 @@
#include <asm/setup.h>
#include <asm/sections.h>
#include <asm/cacheflush.h>
+#include <trace/kmemtrace.h>
#ifdef CONFIG_X86_LOCAL_APIC
#include <asm/smp.h>
@@ -648,6 +649,7 @@ asmlinkage void __init start_kernel(void)
enable_debug_pagealloc();
cpu_hotplug_init();
kmem_cache_init();
+kmemtrace_init();
debug_objects_mem_init();
idr_init_cache();
setup_per_cpu_pageset();
@@ -769,6 +771,7 @@ static void __init do_basic_setup(void)
{
rcu_init_sched(); /* needed by module_init stage. */
init_workqueues();
+cpuset_init_smp();
usermodehelper_init();
driver_init();
init_irq_proc();
@@ -863,8 +866,6 @@ static int __init kernel_init(void * unused)
smp_init();
sched_init_smp();
-cpuset_init_smp();
do_basic_setup();
/*

kernel/extable.c

@@ -15,11 +15,22 @@
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
-#include <linux/module.h>
-#include <linux/init.h>
#include <linux/ftrace.h>
-#include <asm/uaccess.h>
+#include <linux/memory.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/init.h>
#include <asm/sections.h>
+#include <asm/uaccess.h>
+/*
+* mutex protecting text section modification (dynamic code patching).
+* some users need to sleep (allocating memory...) while they hold this lock.
+*
+* NOT exported to modules - patching kernel text is a really delicate matter.
+*/
+DEFINE_MUTEX(text_mutex);
extern struct exception_table_entry __start___ex_table[];
extern struct exception_table_entry __stop___ex_table[];
@@ -49,7 +60,7 @@ static inline int init_kernel_text(unsigned long addr)
return 0;
}
-__notrace_funcgraph int core_kernel_text(unsigned long addr)
+int core_kernel_text(unsigned long addr)
{
if (addr >= (unsigned long)_stext &&
addr <= (unsigned long)_etext)
@@ -61,7 +72,7 @@ __notrace_funcgraph int core_kernel_text(unsigned long addr)
return 0;
}
-__notrace_funcgraph int __kernel_text_address(unsigned long addr)
+int __kernel_text_address(unsigned long addr)
{
if (core_kernel_text(addr))
return 1;
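(The text_mutex comment added above describes the intended pattern: any code that patches kernel text takes the mutex around the modification and is allowed to sleep while holding it. Schematically, and purely as an illustration - the kprobes hunks later in this diff show the real users, and text_poke() is the x86 patching helper:)

#include <linux/memory.h>	/* declares text_mutex */
#include <linux/mutex.h>
#include <asm/alternative.h>	/* text_poke() on x86 */

static void patch_kernel_text(void *addr, const void *opcode, size_t len)
{
	mutex_lock(&text_mutex);
	/* sleeping (e.g. allocating memory) is allowed while the lock is held */
	text_poke(addr, opcode, len);
	mutex_unlock(&text_mutex);
}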

kernel/irq/handle.c

@@ -17,6 +17,7 @@
#include <linux/kernel_stat.h>
#include <linux/rculist.h>
#include <linux/hash.h>
+#include <trace/irq.h>
#include <linux/bootmem.h>
#include "internals.h"
@@ -338,6 +339,9 @@ irqreturn_t no_action(int cpl, void *dev_id)
return IRQ_NONE;
}
+DEFINE_TRACE(irq_handler_entry);
+DEFINE_TRACE(irq_handler_exit);
/**
* handle_IRQ_event - irq action chain handler
* @irq: the interrupt number
@@ -356,7 +360,9 @@ irqreturn_t handle_IRQ_event(unsigned int irq, struct irqaction *action)
local_irq_enable_in_hardirq();
do {
+trace_irq_handler_entry(irq, action);
ret = action->handler(irq, action->dev_id);
+trace_irq_handler_exit(irq, action, ret);
if (ret == IRQ_HANDLED)
status |= action->flags;
retval |= ret;

kernel/kprobes.c

@@ -43,6 +43,7 @@
#include <linux/seq_file.h>
#include <linux/debugfs.h>
#include <linux/kdebug.h>
+#include <linux/memory.h>
#include <asm-generic/sections.h>
#include <asm/cacheflush.h>
@@ -699,9 +700,10 @@ int __kprobes register_kprobe(struct kprobe *p)
goto out;
}
+mutex_lock(&text_mutex);
ret = arch_prepare_kprobe(p);
if (ret)
-goto out;
+goto out_unlock_text;
INIT_HLIST_NODE(&p->hlist);
hlist_add_head_rcu(&p->hlist,
@@ -710,6 +712,8 @@ int __kprobes register_kprobe(struct kprobe *p)
if (kprobe_enabled)
arch_arm_kprobe(p);
+out_unlock_text:
+mutex_unlock(&text_mutex);
out:
mutex_unlock(&kprobe_mutex);
@@ -746,8 +750,11 @@ static int __kprobes __unregister_kprobe_top(struct kprobe *p)
* enabled and not gone - otherwise, the breakpoint would
* already have been removed. We save on flushing icache.
*/
-if (kprobe_enabled && !kprobe_gone(old_p))
+if (kprobe_enabled && !kprobe_gone(old_p)) {
+mutex_lock(&text_mutex);
arch_disarm_kprobe(p);
+mutex_unlock(&text_mutex);
+}
hlist_del_rcu(&old_p->hlist);
} else {
if (p->break_handler && !kprobe_gone(p))
@@ -912,10 +919,8 @@ static int __kprobes pre_handler_kretprobe(struct kprobe *p,
ri->rp = rp;
ri->task = current;
-if (rp->entry_handler && rp->entry_handler(ri, regs)) {
-spin_unlock_irqrestore(&rp->lock, flags);
+if (rp->entry_handler && rp->entry_handler(ri, regs))
return 0;
-}
arch_prepare_kretprobe(ri, regs);
@@ -1280,12 +1285,14 @@ static void __kprobes enable_all_kprobes(void)
if (kprobe_enabled)
goto already_enabled;
+mutex_lock(&text_mutex);
for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
head = &kprobe_table[i];
hlist_for_each_entry_rcu(p, node, head, hlist)
if (!kprobe_gone(p))
arch_arm_kprobe(p);
}
+mutex_unlock(&text_mutex);
kprobe_enabled = true;
printk(KERN_INFO "Kprobes globally enabled\n");
@@ -1310,6 +1317,7 @@ static void __kprobes disable_all_kprobes(void)
kprobe_enabled = false;
printk(KERN_INFO "Kprobes globally disabled\n");
+mutex_lock(&text_mutex);
for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
head = &kprobe_table[i];
hlist_for_each_entry_rcu(p, node, head, hlist) {
@@ -1318,6 +1326,7 @@ static void __kprobes disable_all_kprobes(void)
}
}
+mutex_unlock(&text_mutex);
mutex_unlock(&kprobe_mutex);
/* Allow all currently running kprobes to complete */
synchronize_sched();

kernel/lockdep.c

@@ -42,6 +42,7 @@
#include <linux/hash.h>
#include <linux/ftrace.h>
#include <linux/stringify.h>
+#include <trace/lockdep.h>
#include <asm/sections.h>
@@ -2923,6 +2924,8 @@ void lock_set_class(struct lockdep_map *lock, const char *name,
}
EXPORT_SYMBOL_GPL(lock_set_class);
+DEFINE_TRACE(lock_acquire);
/*
* We are not always called with irqs disabled - do that here,
* and also avoid lockdep recursion:
@@ -2933,6 +2936,8 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
{
unsigned long flags;
+trace_lock_acquire(lock, subclass, trylock, read, check, nest_lock, ip);
if (unlikely(current->lockdep_recursion))
return;
@@ -2947,11 +2952,15 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
}
EXPORT_SYMBOL_GPL(lock_acquire);
+DEFINE_TRACE(lock_release);
void lock_release(struct lockdep_map *lock, int nested,
unsigned long ip)
{
unsigned long flags;
+trace_lock_release(lock, nested, ip);
if (unlikely(current->lockdep_recursion))
return;
@@ -3100,10 +3109,14 @@ __lock_acquired(struct lockdep_map *lock, unsigned long ip)
lock->ip = ip;
}
+DEFINE_TRACE(lock_contended);
void lock_contended(struct lockdep_map *lock, unsigned long ip)
{
unsigned long flags;
+trace_lock_contended(lock, ip);
if (unlikely(!lock_stat))
return;
@@ -3119,10 +3132,14 @@ void lock_contended(struct lockdep_map *lock, unsigned long ip)
}
EXPORT_SYMBOL_GPL(lock_contended);
+DEFINE_TRACE(lock_acquired);
void lock_acquired(struct lockdep_map *lock, unsigned long ip)
{
unsigned long flags;
+trace_lock_acquired(lock, ip);
if (unlikely(!lock_stat))
return;

kernel/module.c

@@ -2777,7 +2777,7 @@ bool is_module_address(unsigned long addr)
* Must be called with preempt disabled or module mutex held so that
* module doesn't get freed during this.
*/
-__notrace_funcgraph struct module *__module_address(unsigned long addr)
+struct module *__module_address(unsigned long addr)
{
struct module *mod;

kernel/relay.c

@@ -677,9 +677,7 @@ int relay_late_setup_files(struct rchan *chan,
*/
for_each_online_cpu(i) {
if (unlikely(!chan->buf[i])) {
-printk(KERN_ERR "relay_late_setup_files: CPU %u "
-"has no buffer, it must have!\n", i);
-BUG();
+WARN_ONCE(1, KERN_ERR "CPU has no buffer!\n");
err = -EINVAL;
break;
}

kernel/sched.c

@@ -4773,10 +4773,7 @@ void scheduler_tick(void)
#endif
}
-#if defined(CONFIG_PREEMPT) && (defined(CONFIG_DEBUG_PREEMPT) || \
-defined(CONFIG_PREEMPT_TRACER))
-static inline unsigned long get_parent_ip(unsigned long addr)
+unsigned long get_parent_ip(unsigned long addr)
{
if (in_lock_functions(addr)) {
addr = CALLER_ADDR2;
@@ -4786,6 +4783,9 @@ static inline unsigned long get_parent_ip(unsigned long addr)
return addr;
}
+#if defined(CONFIG_PREEMPT) && (defined(CONFIG_DEBUG_PREEMPT) || \
+defined(CONFIG_PREEMPT_TRACER))
void __kprobes add_preempt_count(int val)
{
#ifdef CONFIG_DEBUG_PREEMPT

kernel/sched_clock.c

@@ -25,6 +25,7 @@
* consistent between cpus (never more than 2 jiffies difference).
*/
#include <linux/spinlock.h>
+#include <linux/hardirq.h>
#include <linux/module.h>
#include <linux/percpu.h>
#include <linux/ktime.h>
@@ -154,6 +155,17 @@ u64 sched_clock_cpu(int cpu)
return sched_clock();
scd = cpu_sdc(cpu);
+/*
+* Normally this is not called in NMI context - but if it is,
+* trying to do any locking here is totally lethal.
+*/
+if (unlikely(in_nmi()))
+return scd->clock;
+if (unlikely(!sched_clock_running))
+return 0ull;
WARN_ON_ONCE(!irqs_disabled());
now = sched_clock();

kernel/softirq.c

@@ -21,8 +21,10 @@
#include <linux/freezer.h>
#include <linux/kthread.h>
#include <linux/rcupdate.h>
+#include <linux/ftrace.h>
#include <linux/smp.h>
#include <linux/tick.h>
+#include <trace/irq.h>
#include <asm/irq.h>
/*
@@ -52,6 +54,11 @@ static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp
static DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+char *softirq_to_name[NR_SOFTIRQS] = {
+"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
+"TASKLET", "SCHED", "HRTIMER", "RCU"
+};
/*
* we cannot loop indefinitely here to avoid userspace starvation,
* but we also don't want to introduce a worst case 1/HZ latency
@@ -79,13 +86,23 @@ static void __local_bh_disable(unsigned long ip)
WARN_ON_ONCE(in_irq());
raw_local_irq_save(flags);
-add_preempt_count(SOFTIRQ_OFFSET);
+/*
+* The preempt tracer hooks into add_preempt_count and will break
+* lockdep because it calls back into lockdep after SOFTIRQ_OFFSET
+* is set and before current->softirq_enabled is cleared.
+* We must manually increment preempt_count here and manually
+* call the trace_preempt_off later.
+*/
+preempt_count() += SOFTIRQ_OFFSET;
/*
* Were softirqs turned off above:
*/
if (softirq_count() == SOFTIRQ_OFFSET)
trace_softirqs_off(ip);
raw_local_irq_restore(flags);
+if (preempt_count() == SOFTIRQ_OFFSET)
+trace_preempt_off(CALLER_ADDR0, get_parent_ip(CALLER_ADDR1));
}
#else /* !CONFIG_TRACE_IRQFLAGS */
static inline void __local_bh_disable(unsigned long ip)
@@ -169,6 +186,9 @@ EXPORT_SYMBOL(local_bh_enable_ip);
*/
#define MAX_SOFTIRQ_RESTART 10
+DEFINE_TRACE(softirq_entry);
+DEFINE_TRACE(softirq_exit);
asmlinkage void __do_softirq(void)
{
struct softirq_action *h;
@@ -195,12 +215,14 @@ asmlinkage void __do_softirq(void)
if (pending & 1) {
int prev_count = preempt_count();
+trace_softirq_entry(h, softirq_vec);
h->action(h);
+trace_softirq_exit(h, softirq_vec);
if (unlikely(prev_count != preempt_count())) {
-printk(KERN_ERR "huh, entered softirq %td %p"
+printk(KERN_ERR "huh, entered softirq %td %s %p"
"with preempt_count %08x,"
" exited with %08x?\n", h - softirq_vec,
+softirq_to_name[h - softirq_vec],
h->action, prev_count, preempt_count());
preempt_count() = prev_count;
}

kernel/trace/Kconfig

@@ -9,6 +9,9 @@ config USER_STACKTRACE_SUPPORT
config NOP_TRACER
bool
+config HAVE_FTRACE_NMI_ENTER
+bool
config HAVE_FUNCTION_TRACER
bool
@@ -31,12 +34,20 @@ config HAVE_FTRACE_MCOUNT_RECORD
config HAVE_HW_BRANCH_TRACER
bool
+config HAVE_FTRACE_SYSCALLS
+bool
config TRACER_MAX_TRACE
bool
config RING_BUFFER
bool
+config FTRACE_NMI_ENTER
+bool
+depends on HAVE_FTRACE_NMI_ENTER
+default y
config TRACING
bool
select DEBUG_FS
@@ -44,13 +55,29 @@ config TRACING
select STACKTRACE if STACKTRACE_SUPPORT
select TRACEPOINTS
select NOP_TRACER
+select BINARY_PRINTF
+#
+# Minimum requirements an architecture has to meet for us to
+# be able to offer generic tracing facilities:
+#
+config TRACING_SUPPORT
+bool
+# PPC32 has no irqflags tracing support, but it can use most of the
+# tracers anyway, they were tested to build and work. Note that new
+# exceptions to this list aren't welcomed, better implement the
+# irqflags tracing for your architecture.
+depends on TRACE_IRQFLAGS_SUPPORT || PPC32
+depends on STACKTRACE_SUPPORT
+default y
+if TRACING_SUPPORT
menu "Tracers"
config FUNCTION_TRACER
bool "Kernel Function Tracer"
depends on HAVE_FUNCTION_TRACER
-depends on DEBUG_KERNEL
select FRAME_POINTER
select KALLSYMS
select TRACING
@@ -82,7 +109,6 @@ config IRQSOFF_TRACER
default n
depends on TRACE_IRQFLAGS_SUPPORT
depends on GENERIC_TIME
-depends on DEBUG_KERNEL
select TRACE_IRQFLAGS
select TRACING
select TRACER_MAX_TRACE
@@ -105,7 +131,6 @@ config PREEMPT_TRACER
default n
depends on GENERIC_TIME
depends on PREEMPT
-depends on DEBUG_KERNEL
select TRACING
select TRACER_MAX_TRACE
help
@@ -126,13 +151,13 @@ config SYSPROF_TRACER
bool "Sysprof Tracer"
depends on X86
select TRACING
+select CONTEXT_SWITCH_TRACER
help
This tracer provides the trace needed by the 'Sysprof' userspace
tool.
config SCHED_TRACER
bool "Scheduling Latency Tracer"
-depends on DEBUG_KERNEL
select TRACING
select CONTEXT_SWITCH_TRACER
select TRACER_MAX_TRACE
@@ -142,16 +167,30 @@ config SCHED_TRACER
config CONTEXT_SWITCH_TRACER
bool "Trace process context switches"
-depends on DEBUG_KERNEL
select TRACING
select MARKERS
help
This tracer gets called from the context switch and records
all switching of tasks.
+config EVENT_TRACER
+bool "Trace various events in the kernel"
+select TRACING
+help
+This tracer hooks to various trace points in the kernel
+allowing the user to pick and choose which trace point they
+want to trace.
+config FTRACE_SYSCALLS
+bool "Trace syscalls"
+depends on HAVE_FTRACE_SYSCALLS
+select TRACING
+select KALLSYMS
+help
+Basic tracer to catch the syscall entry and exit events.
config BOOT_TRACER
bool "Trace boot initcalls"
-depends on DEBUG_KERNEL
select TRACING
select CONTEXT_SWITCH_TRACER
help
@@ -164,13 +203,11 @@ config BOOT_TRACER
representation of the delays during initcalls - but the raw
/debug/tracing/trace text output is readable too.
-( Note that tracing self tests can't be enabled if this tracer is
-selected, because the self-tests are an initcall as well and that
-would invalidate the boot trace. )
+You must pass in ftrace=initcall to the kernel command line
+to enable this on bootup.
config TRACE_BRANCH_PROFILING
bool "Trace likely/unlikely profiler"
-depends on DEBUG_KERNEL
select TRACING
help
This tracer profiles all the the likely and unlikely macros
@@ -223,7 +260,6 @@ config BRANCH_TRACER
config POWER_TRACER
bool "Trace power consumption behavior"
-depends on DEBUG_KERNEL
depends on X86
select TRACING
help
@@ -235,7 +271,6 @@ config POWER_TRACER
config STACK_TRACER
bool "Trace max stack"
depends on HAVE_FUNCTION_TRACER
-depends on DEBUG_KERNEL
select FUNCTION_TRACER
select STACKTRACE
select KALLSYMS
@@ -265,11 +300,66 @@ config HW_BRANCH_TRACER
This tracer records all branches on the system in a circular
buffer giving access to the last N branches for each cpu.
+config KMEMTRACE
+bool "Trace SLAB allocations"
+select TRACING
+help
+kmemtrace provides tracing for slab allocator functions, such as
+kmalloc, kfree, kmem_cache_alloc, kmem_cache_free etc.. Collected
+data is then fed to the userspace application in order to analyse
+allocation hotspots, internal fragmentation and so on, making it
+possible to see how well an allocator performs, as well as debug
+and profile kernel code.
+This requires an userspace application to use. See
+Documentation/vm/kmemtrace.txt for more information.
+Saying Y will make the kernel somewhat larger and slower. However,
+if you disable kmemtrace at run-time or boot-time, the performance
+impact is minimal (depending on the arch the kernel is built for).
+If unsure, say N.
+config WORKQUEUE_TRACER
+bool "Trace workqueues"
+select TRACING
+help
+The workqueue tracer provides some statistical informations
+about each cpu workqueue thread such as the number of the
+works inserted and executed since their creation. It can help
+to evaluate the amount of work each of them have to perform.
+For example it can help a developer to decide whether he should
+choose a per cpu workqueue instead of a singlethreaded one.
+config BLK_DEV_IO_TRACE
+bool "Support for tracing block io actions"
+depends on SYSFS
+depends on BLOCK
+select RELAY
+select DEBUG_FS
+select TRACEPOINTS
+select TRACING
+select STACKTRACE
+help
+Say Y here if you want to be able to trace the block layer actions
+on a given queue. Tracing allows you to see any traffic happening
+on a block device queue. For more information (and the userspace
+support tools needed), fetch the blktrace tools from:
+git://git.kernel.dk/blktrace.git
+Tracing also is possible using the ftrace interface, e.g.:
+echo 1 > /sys/block/sda/sda1/trace/enable
+echo blk > /sys/kernel/debug/tracing/current_tracer
+cat /sys/kernel/debug/tracing/trace_pipe
+If unsure, say N.
config DYNAMIC_FTRACE
bool "enable/disable ftrace tracepoints dynamically"
depends on FUNCTION_TRACER
depends on HAVE_DYNAMIC_FTRACE
-depends on DEBUG_KERNEL
default y
help
This option will modify all the calls to ftrace dynamically
@@ -295,7 +385,7 @@ config FTRACE_SELFTEST
config FTRACE_STARTUP_TEST
bool "Perform a startup test on ftrace"
-depends on TRACING && DEBUG_KERNEL && !BOOT_TRACER
+depends on TRACING
select FTRACE_SELFTEST
help
This option performs a series of startup tests on ftrace. On bootup
@@ -305,7 +395,7 @@ config FTRACE_STARTUP_TEST
config MMIOTRACE
bool "Memory mapped IO tracing"
-depends on HAVE_MMIOTRACE_SUPPORT && DEBUG_KERNEL && PCI
+depends on HAVE_MMIOTRACE_SUPPORT && PCI
select TRACING
help
Mmiotrace traces Memory Mapped I/O access and is meant for
@@ -327,3 +417,6 @@ config MMIOTRACE_TEST
Say N, unless you absolutely know what you are doing.
endmenu
+endif # TRACING_SUPPORT

kernel/trace/Makefile

@@ -19,6 +19,10 @@ obj-$(CONFIG_FUNCTION_TRACER) += libftrace.o
obj-$(CONFIG_RING_BUFFER) += ring_buffer.o
obj-$(CONFIG_TRACING) += trace.o
+obj-$(CONFIG_TRACING) += trace_clock.o
+obj-$(CONFIG_TRACING) += trace_output.o
+obj-$(CONFIG_TRACING) += trace_stat.o
+obj-$(CONFIG_TRACING) += trace_printk.o
obj-$(CONFIG_CONTEXT_SWITCH_TRACER) += trace_sched_switch.o
obj-$(CONFIG_SYSPROF_TRACER) += trace_sysprof.o
obj-$(CONFIG_FUNCTION_TRACER) += trace_functions.o
@@ -33,5 +37,14 @@ obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += trace_functions_graph.o
obj-$(CONFIG_TRACE_BRANCH_PROFILING) += trace_branch.o
obj-$(CONFIG_HW_BRANCH_TRACER) += trace_hw_branches.o
obj-$(CONFIG_POWER_TRACER) += trace_power.o
+obj-$(CONFIG_KMEMTRACE) += kmemtrace.o
+obj-$(CONFIG_WORKQUEUE_TRACER) += trace_workqueue.o
+obj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o
+obj-$(CONFIG_EVENT_TRACER) += trace_events.o
+obj-$(CONFIG_EVENT_TRACER) += events.o
+obj-$(CONFIG_EVENT_TRACER) += trace_export.o
+obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
+obj-$(CONFIG_EVENT_PROFILE) += trace_event_profile.o
+obj-$(CONFIG_EVENT_TRACER) += trace_events_filter.o
libftrace-y := ftrace.o

(One file's diff was suppressed because it is too large.)

kernel/trace/events.c (new file)

@@ -0,0 +1,14 @@
/*
* This is the place to register all trace points as events.
*/
#include <linux/stringify.h>
#include <trace/trace_events.h>
#include "trace_output.h"
#include "trace_events_stage_1.h"
#include "trace_events_stage_2.h"
#include "trace_events_stage_3.h"

(Some files were not shown because too many files have changed in this diff.)