Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-5-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On Intel Nehalem and Westmere CPUs the generic perf LLC-* events count the
L2 caches, not the real L3 LLC - this was inconsistent with behavior on
other CPUs.
Fixing this requires the use of the special OFFCORE_RESPONSE
events which need a separate mask register.
This has been implemented by the previous patch, now use this infrastructure
to set correct events for the LLC-* on Nehalem and Westmere.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-3-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change logs against Andi's original version:
- Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra)
- Fixed a major event scheduling issue. There cannot be a ref++ on an
event that has already done ref++ once and without calling
put_constraint() in between. (Stephane Eranian)
- Use thread_cpumask for percore allocation. (Lin Ming)
- Use MSR names in the extra reg lists. (Lin Ming)
- Remove redundant "c = NULL" in intel_percore_constraints
- Fix comment of perf_event_attr::config1
Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event
that can be used to monitor any offcore accesses from a core.
This is a very useful event for various tunings, and it's
also needed to implement the generic LLC-* events correctly.
Unfortunately this event requires programming a mask in a separate
register. And worse this separate register is per core, not per
CPU thread.
This patch:
- Teaches perf_events that OFFCORE_RESPONSE needs extra parameters.
The extra parameters are passed by user space in the
perf_event_attr::config1 field.
- Adds support to the Intel perf_event core to schedule per
core resources. This adds fairly generic infrastructure that
can be also used for other per core resources.
The basic code has is patterned after the similar AMD northbridge
constraints code.
Thanks to Stephane Eranian who pointed out some problems
in the original version and suggested improvements.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch updates PEBS event constraints for Intel Atom, Nehalem, Westmere.
This patch also reorganizes the PEBS format/constraint detection code. It is
now based on processor model and not PEBS format. Two processors may use the
same PEBS format without have the same list of PEBS events.
In this second version, we simplified the initialization of the PEBS
constraints by leveraging the existing switch() statement in perf_event_intel.c.
We also renamed the constraint tables to be more consistent with regular
constraints.
In this 3rd version, we drop BR_INST_RETIRED.MISPRED from Intel Atom as it does
not seem to work. Use MISPREDICTED_BRANCH_RETIRED instead. Also add FP_ASSIST.*
o both Intel Nehalem and Westmere. I misssed those in the earlier patches.
Events were tested using libpfm4 perf_examples.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4d6e6b02.815bdf0a.637b.07a7@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds basic SandyBridge support, including hardware
cache events and PEBS events support.
It has been tested on SandyBridge CPUs with perf stat and also
with PEBS based profiling - both work fine.
The patch does not affect other models.
v2 -> v3:
- fix PEBS event 0xd0 with right umask combinations
- move snb pebs constraint assignment to intel_pmu_init
v1 -> v2:
- add more raw and PEBS events constraints
- use offcore events for LLC-* cache events
- remove the call to Nehalem workaround enable_all function
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1299072424.2175.24.camel@localhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6745/1: kprobes insn decoding fix
ARM: tlb: move noMMU tlb_flush() to asm/tlb.h
ARM: tlb: delay page freeing for SMP and ARMv7 CPUs
ARM: Keep exit text/data around for SMP_ON_UP
ARM: Ensure predictable endian state on signal handler entry
ARM: 6740/1: Place correctly notes section in the linker script
ARM: 6700/1: SPEAr: Correct SOC config base address for spear320
ARM: 6722/1: SPEAr: sp810: switch to slow mode before reset
ARM: 6712/1: SPEAr: replace readl(), writel() with relaxed versions in uncompress.h
ARM: 6720/1: SPEAr: Append UL to VMALLOC_END
ARM: 6676/1: Correct the cpu_architecture() function for ARMv7
ARM: 6739/1: update .gitignore for boot/compressed
ARM: 6743/1: errata: interrupted ICALLUIS may prevent completion of broadcasted operation
ARM: 6742/1: pmu: avoid setting IRQ affinity on UP systems
ARM: 6741/1: errata: pl310 cache sync operation may be faulty
Marcin Slusarz says:
> In arch/arm/kernel/kprobes-decode.c there's a function
> arm_kprobe_decode_insn which does:
>
> } else if ((insn & 0x0e000000) == 0x0c400000) {
> ...
>
> This is always false, so code below is dead.
> I found this bug by coccinelle (http://coccinelle.lip6.fr/).
Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There's no need to noMMU to put tlb_flush() in asm/tlbflush.h - it's
part of the tlb shootdown interface. Move it to asm/tlb.h instead, as
per x86.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We need to delay freeing any mapped page on SMP and ARMv7 systems to
ensure that the data is not accessed by other CPUs, or is used for
speculative prefetch with ARMv7. This includes not only mapped pages
but also pages used for the page tables themselves.
This avoids races with the MMU/other CPUs accessing pages after they've
been freed but before we've invalidated the TLB.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When SMP_ON_UP is used and the spinlocks are inlined, we end up with
inline spinlocks in the exit code, with references from the SMP
alternatives section to the exit sections. This causes link time
errors. Avoid this by placing the exit sections in the init-discarded
region.
Cc: <stable@kernel.org>
Tested-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Ensure a predictable endian state when entering signal handlers. This
avoids programs which use SETEND to momentarily switch their endian
state from having their signal handlers entered with an unpredictable
endian state.
Cc: <stable@kernel.org>
Acked-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 18991197b4 added --build-id
linker option when toolchain supports it. ARM one does, but for some
reason places the section at 0 when linker script doesn't mention it
explicitly.
The 1e621a8e37 worked around the problem
removing this section from binary image with explicit objcopy options,
but it still exists in vmlinux, confusing tools like debuggers and perf.
This problem was discussed here:
http://lists.infradead.org/pipermail/linux-arm-kernel/2010-May/015994.htmlhttp://lists.infradead.org/pipermail/linux-arm-kernel/2010-May/015994.html
but the proposed changes to the linker script were substantial.
This patch simply places NOTES (36 bytes long, at least when compiled
with CodeSourcery toolchain) between data and bss, which seem to be
the right place (and suggested by the sample linker script in
include/asm-generic/vmlinux.lds.h).
It is enough to place it correctly in vmlinux (so debuggers are happy):
Section Headers:
[11] .data PROGBITS c07ce000 7ce000 020fc0 00 WA 0 0 32
[12] .notes NOTE c07eefc0 7eefc0 000024 00 AX 0 0 4
[13] .bss NOBITS c07ef000 7eefe4 01e628 00 WA 0 0 32
Program Headers:
LOAD 0x008000 0xc0008000 0xc0008000 0x7e6fe4 0x805628 RWE 0x8000
NOTE 0x7eefc0 0xc07eefc0 0xc07eefc0 0x00024 0x00024 R E 0x4
Section to Segment mapping:
Segment Sections...
00 <...> .data .notes .bss
01 .notes
and to get it exposed as /sys/kernel/notes used by perf tools.
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
SPEAR320_SOC_CONFIG_BASE was wrong, causing the wrong registers to be
accessed.
Reviewed-by: Stanley Miao <stanley.miao@windriver.com>
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In sysctl_soft_reset(), switch to slow mode before resetting the system
via the system controller. This is required.
Reviewed-by: Stanley Miao <stanley.miao@windriver.com>
Signed-off-by: Shiraz Hashim <shiraz.hashim@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
readl() and writel() calls the outer cache maintainance operations
which are not available during Linux uncompression. This patch replaces
readl() and writel() with readl_relaxed() and writel_relaxed() to avoid
the link time errors.
Reviewed-by: Stanley Miao <stanley.miao@windriver.com>
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch fixes following warning:
arch/arm/mm/init.c:606: warning: format '%08lx' expects type 'long unsigned int', but argument 12 has type 'unsigned int'
by appending UL to VMALLOC_END's Number.
Reviewed-by: Stanley Miao <stanley.miao@windriver.com>
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If ID_MMFR0[3:0] >= 3, the architecture version is ARMv7. The code was
currently only testing for ID_MMFR0[3:0] == 3.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On versions of the Cortex-A9 prior to r3p0, an interrupted ICIALLUIS
operation may prevent the completion of a following broadcasted
operation if the second operation is received by a CPU before the
ICIALLUIS has completed, potentially leading to corrupted entries in
the cache or TLB.
This workaround sets a bit in the diagnostic register of the Cortex-A9,
causing CP15 maintenance operations to be uninterruptible.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Now that we can execute a CONFIG_SMP kernel on a uniprocessor system,
extra care has to be taken in the PMU IRQ affinity setting code to
ensure that we don't always fail to initialise.
This patch changes the CPU PMU initialisation code so that when we
only have a single IRQ, whose affinity can not be changed at the
controller, we report success (0) rather than -EINVAL.
Reported-by: Avik Sil <avik.sil@linaro.org>
Acked-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The effect of cache sync operation is to drain the store buffer and
wait for all internal buffers to be empty. In normal conditions, store
buffer is able to merge the normal memory writes within its 32-byte
data buffers. Due to this erratum present in r3p0, the effect of cache
sync operation on the store buffer still remains when the operation
completes. This means that the store buffer is always asked to drain
and this prevents it from merging any further writes.
This can severely affect performance on the write traffic esp. on
Normal memory NC one.
The proposed workaround is to replace the normal offset of cache sync
operation(0x730) by another offset targeting an unmapped PL310
register 0x740.
Signed-off-by: srinidhi kasagar <srinidhi.kasagar@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since commit 1130e5b3ff regulators are exported to debugfs. The names
of the regulators that contains slash ('/') causes an ops during kernel
boot. This patch fixes this issue.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Max8998 PMIC driver's platform data has been changed once again in
commit 735a3d9efd. This patch fixes build break caused by that commit.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
s3c24xx_ts_set_platdata is annotated __init and not used by any module,
thus don't export it.
This patch fixes below warning:
WARNING: arch/arm/plat-samsung/built-in.o(__ksymtab+0x90): Section mismatch
in reference from the variable __ksymtab_s3c24xx_ts_set_platdata to the
function .init.text:s3c24xx_ts_set_platdata()
The symbol s3c24xx_ts_set_platdata is exported and annotated __init
Fix this by removing the __init annotation of s3c24xx_ts_set_platdata
or drop the export.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
With no caller left, the function and the DIE_NMIWATCHDOG
enumerator can both go away.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Don Zickus <dzickus@redhat.com>
LKML-Reference: <4D5D521C0200007800032702@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
NET_SKB_PAD has been increased from 32 to 64 and later to
max(32, L1_CACHE_BYTES). This led to a 25% throughput decrease for
streaming workloads accompanied by a 37% CPU cost increase on s390.
Define a architecture specific NET_SKB_PAD with the old value of 32.
Signed-off-by: Horst Hartmann <horsth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use inline assemblies for atomic_read/set(). This way there shouldn't
be any questions or subtle volatile semantics left.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The 'output' variable is passed from decompress_kernel to
check_ipl_parmblock before it is initialized. That disables the
safe guard against the overwrite of the ipl parameter block.
Fix this by passing the correct value to check_ipl_parmblock.
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Let's make atomic_read() and atomic_set() behave like on all/most other
architectures. Generated code is identical with gcc 4.5.2.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
For S5P platforms, the end address in memory resource information for UART
devices is one byte more than the intended value. Fix this by reducing the
end address by one byte.
Signed-off-by: Thomas Abraham <thomas.abraham@linaro.org>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
This patch adds support for AMD family 15h core counters. There are
major changes compared to family 10h. First, there is a new perfctr
msr range for up to 6 counters. Northbridge counters are separate
now. This patch only adds support for core counters. Second, certain
events may only be scheduled on certain counters. For this we need to
extend the event scheduling and constraints.
We use cpu feature flags to calculate family 15h msr address offsets.
This way we later can implement a faster ALTERNATIVE() version for
this.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110215135210.GB5874@erda.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of storing the base addresses we can store the counter's msr
addresses directly in config_base/event_base of struct hw_perf_event.
This avoids recalculating the address with each msr access. The
addresses are configured one time. We also need this change to later
modify the address calculation.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1296664860-10886-5-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch allows the reservation of perfctrs with new msr addresses
introduced for AMD cpu family 15h (0xc0010200/0xc0010201, etc).
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1296664860-10886-4-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds helper functions to calculate perfctr msr addresses.
We need this to later add support for AMD family 15h cpus. For this we
have to change the algorithms to generate the perfctr's msr addresses.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1296664860-10886-3-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use helper function in x86_pmu_enable_all() to minimize access to
x86_pmu.eventsel in the fast path. The counter's msr address is now
calculated using struct hw_perf_event. Later we add code that
calculates the msr addresses with a table lookup which shouldn't be
done in the fast path.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1296664860-10886-2-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Several people have reported spurious unknown NMI
messages on some P4 CPUs.
This patch fixes it by checking for an overflow (negative
counter values) directly, instead of relying on the
P4_CCCR_OVF bit.
Reported-by: George Spelvin <linux@horizon.com>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Don Zickus <dzickus@redhat.com>
Reported-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <AANLkTinfuTfCck_FfaOHrDqQZZehtRzkBum4SpFoO=KJ@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68knommu: set flow handler for secondary interrupt controller of 5249
m68knommu: remove use of IRQ_FLG_LOCK from 68360 platform support
m68knommu: fix dereference of port.tty
m68knommu: add missing linker __modver section
m68knommu: fix mis-named variable int set_irq_chip loop
m68knommu: add optimize memmove() function
m68k: remove arch specific non-optimized memcmp()
m68knommu: fix use of un-defined _TIF_WORK_MASK
m68knommu: Rename m548x_wdt.c to m54xx_wdt.c
m68knommu: fix m548x_wdt.c compilation after headers renaming
m68knommu: Remove dependencies on nonexistent M68KNOMMU
The secondary interrupt controller of the ColdFire 5249 code is not
setting the edge triggered flow handler. Set it.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>