The following structure elements duplicate the information in
'struct device.of_node' and so are being eliminated. This patch
makes all readers of these elements use device.of_node instead.
(struct of_device *)->node
(struct dev_archdata *)->prom_node (sparc)
(struct dev_archdata *)->of_node (powerpc & microblaze)
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits)
perf tools: Add mode to build without newt support
perf symbols: symbol inconsistency message should be done only at verbose=1
perf tui: Add explicit -lslang option
perf options: Type check all the remaining OPT_ variants
perf options: Type check OPT_BOOLEAN and fix the offenders
perf options: Check v type in OPT_U?INTEGER
perf options: Introduce OPT_UINTEGER
perf tui: Add workaround for slang < 2.1.4
perf record: Fix bug mismatch with -c option definition
perf options: Introduce OPT_U64
perf tui: Add help window to show key associations
perf tui: Make <- exit menus too
perf newt: Add single key shortcuts for zoom into DSO and threads
perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed
perf newt: Fix the 'A'/'a' shortcut for annotate
perf newt: Make <- exit the ui_browser
x86, perf: P4 PMU - fix counters management logic
perf newt: Make <- zoom out filters
perf report: Report number of events, not samples
perf hist: Clarify events_stats fields usage
...
Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c
When we build with ftrace enabled its possible that loadcam_entry would
have used the stack pointer (even though the code doesn't need it). We
call loadcam_entry in __secondary_start before the stack is setup. To
ensure that loadcam_entry doesn't use the stack pointer the easiest
solution is to just have it in asm code.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
In CONFIG_PTE_64BIT the PTE format has unique permission bits for user
and supervisor execute. However on !CONFIG_PTE_64BIT we overload the
supervisor bit to imply user execute with _PAGE_USER set. This allows
us to use the same permission check mask for user or supervisor code on
!CONFIG_PTE_64BIT.
However, on CONFIG_PTE_64BIT we map _PAGE_EXEC to _PAGE_BAP_UX so we
need a different permission mask based on the fault coming from a kernel
address or user space.
Without unique permission masks we see issues like the following with
modules:
Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0xf938d040
Oops: Kernel access of bad area, sig: 11 [#1]
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Jin Qing <b24347@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
For KVM we need to find the location of the HTAB. We can either rely
on internal data structures of the kernel or ask the hardware.
Ben issued complaints about the internal data structure method, so
let's switch it to our own inquiry of the HTAB. Now we're fully
independend :-).
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
When an interrupt occurs we don't know yet if we're in guest context or
in host context. When in guest context, KVM needs to handle it.
So let's pull the same trick we did on Book3S_64: Just add a macro to
determine if we're in guest context or not and if so jump on to KVM code.
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
We need the SWITCH_FRAME_SIZE define on Book3S_32 now too.
So let's export it unconditionally.
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
Our shadow MMU code needs to know where the HTAB is located and how
big it is. So we need some variables from the kernel exported to
module space if KVM is built as a module.
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
We need to keep the pointer to the shadow vcpu somewhere accessible from
within really early interrupt code. The best fit I found was the thread
struct, as that resides in an SPRG.
So let's put a pointer to the shadow vcpu in the thread struct and add
an asm-offset so we can find it.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
The shadow vcpu now contains some fields we don't use from the vcpu anymore.
Access to them happens using inline functions that happily use the shadow
vcpu fields.
So let's now ifdef them out to booke only and add asm-offsets.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Upstream recently added a new name for PPC64: Book3S_64.
So instead of using CONFIG_PPC64 we should use CONFIG_PPC_BOOK3S consotently.
That makes understanding the code easier (I hope).
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
We have quite some code that can be used by Book3S_32 and Book3S_64 alike,
so let's call it "Book3S" instead of "Book3S_64", so we can later on
use it from the 32 bit port too.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Book3S knows how to convert floats to doubles and vice versa. BookE doesn't.
So let's make sure we don't export them on BookE.
This fixes a link error on BookE with CONFIG_KVM=y.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Anton Blanchard found that large POWER systems would occasionally
crash in the exception exit path when profiling with perf_events.
The symptom was that an interrupt would occur late in the exit path
when the MSR[RI] (recoverable interrupt) bit was clear. Interrupts
should be hard-disabled at this point but they were enabled. Because
the interrupt was not recoverable the system panicked.
The reason is that the exception exit path was calling
perf_event_do_pending after hard-disabling interrupts, and
perf_event_do_pending will re-enable interrupts.
The simplest and cleanest fix for this is to use the same mechanism
that 32-bit powerpc does, namely to cause a self-IPI by setting the
decrementer to 1. This means we can remove the tests in the exception
exit path and raw_local_irq_restore.
This also makes sure that the call to perf_event_do_pending from
timer_interrupt() happens within irq_enter/irq_exit. (Note that
calling perf_event_do_pending from timer_interrupt does not mean that
there is a possible 1/HZ latency; setting the decrementer to 1 ensures
that the timer interrupt will happen immediately, i.e. within one
timebase tick, which is a few nanoseconds or 10s of nanoseconds.)
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[paulus@samba.org: Set cpuhw->event[i]->hw.config in
power_pmu_commit_txn.]
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100508102841.GA10650@brick.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since the *_map cpumask variants are deprecated, change the comments to
instead refer to *_mask.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Dynamically allocate cpu_sibling_map and cpu_core_map cpumasks.
We don't need to set_cpu_online() the boot cpu in smp_prepare_boot_cpu,
init/main.c does it for us.
We also postpone setting of the boot cpu in cpu_sibling_map and cpu_core_map
until when the memory allocator is available (smp_prepare_cpus), similar
to x86.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use new cpumask API in /proc/cpuinfo code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This separates the per cpu output from the summary output at the end of the
file, making it easier to convert to the new cpumask API in a subsequent
patch.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use new cpumask_* functions, and dynamically allocate cpumask in fixup_irqs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the new cpumask_* functions and dynamically allocate the cpumask in
smp_cpus_done.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use cpumask_first, cpumask_next in rtasd code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is a trivial 4xx plaform that uses the new simple bsp from
Josh and is handy to use in simulators such as ISS or even Mambo
who don't properly implement most of the actual devices in the
SoC but really only the core.
Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
476 requires an isync after loading MMU and debug related SPR's. Some of
these are in performance-critical paths and may need to be optimized, but
initially, we're playing it safe.
Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
The 47x core's MCSR varies from 44x, so it needs it's own machine check
handler.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This patch adds the base support for the 476 processor. The code was
primarily written by Ben Herrenschmidt and Torez Smith, but I've been
maintaining it for a while.
The goal is to have a single binary that will run on 44x and 47x, but
we still have some details to work out. The biggest is that the L1 cache
line size differs on the two platforms, but it's currently a compile-time
option.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
The 47x platform supports multiple cores and shares code with 44x.
Break out code that is common for initializing the primary and secondary
cpus into a function which can be called for both.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This patch adds a marker to the exception stack frame to aid in debugging.
It's already inserted on other platforms and xmon recognizes it and
identifies exception frames when showing stack traces.
Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
When we compare the devices DMA mask to the amount of memory we need to
make sure we treat the DMA mask as an address boundary. For example if
the DMA_MASK(32) and we have 4G of memory we'd incorrectly set the dma
ops to swiotlb. We need to add one to the dma mask when we convert it.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Currently, platforms using CONFIG_OF add a 'struct device_node *of_node'
to dev->archdata. However, with CONFIG_OF becoming generic for all
architectures, it makes sense for commonality to move it out of archdata
and into struct device proper.
This patch adds a struct device_node *of_node member to struct device
and updates all locations which currently write the device_node pointer
into archdata to also update dev->of_node. Subsequent patches will
modify callers to use the archdata location and ultimately remove
the archdata member entirely.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
CC: Michal Simek <monstr@monstr.eu>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: "David S. Miller" <davem@davemloft.net>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Jeremy Kerr <jeremy.kerr@canonical.com>
CC: microblaze-uclinux@itee.uq.edu.au
CC: linux-kernel@vger.kernel.org
CC: linuxppc-dev@ozlabs.org
CC: sparclinux@vger.kernel.org
Firmware changed the way it represents memory and cpu affinity on POWER7.
Unfortunately the old method now caps the topology to work around issues
with legacy operating systems. For Linux to get the correct topology we
need to use the new form 1 affinity information.
We set the form 1 field in the client architecture, and if we see "1" in the
ibm,associativity-form property firmware supports form 1 affinity and
we should look at the first field in the ibm,associativity-reference-points
array. If not we use the second field as we always have.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To emulate paired single instructions, we need to be able to call FPU
operations from within the kernel. Since we don't want gcc to spill
arbitrary FPU code everywhere, we tell it to use a soft fpu.
Since we know we can really call the FPU in safe areas, let's also add
some calls that we can later use to actually execute real world FPU
operations on the host's FPU.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch ports the kprobe-based event tracer to powerpc. This patch
is based on x86 port. This brings powerpc on par with x86.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Adds support for suspend/resume for VIO devices. This is needed for
support for HMC initiated hibernation.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have had issues in the past with ibm,os-term initiating shutdown of a
partition. This is confusing to the user, especially if panic_timeout is
non zero.
The temporary fix was to avoid calling ibm,os-term if a panic_timeout was set
and since we set it on every boot we basically never call ibm,os-term.
An extended version of ibm,os-term has since been implemented which gives us
the behaviour we want:
"When the platform supports extended ibm,os-term behavior, the return to the
RTAS will always occur unless there is a kernel assisted dump active as
initiated by an ibm,configure-kernel-dump call."
This patch checks for the ibm,extended-os-term property and calls ibm,os-term
if it exists.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use set_cpus_allowed_ptr rather than set_cpus_allowed.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E1,E2;
@@
- set_cpus_allowed(E1, cpumask_of_cpu(E2))
+ set_cpus_allowed_ptr(E1, cpumask_of(E2))
@@
expression E;
identifier I;
@@
- set_cpus_allowed(E, I)
+ set_cpus_allowed_ptr(E, &I)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add an unlock before exiting the function.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@r exists@
expression E1;
identifier f;
@@
f (...) { <+...
* spin_lock_irq (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This avoids storing these registers in memory.
CPU6 errata will still use the old way.
Remove some G2 leftover accesses from 2.4
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Only the swap function cares about the ACCESSED bit in
the pte. Do not waste cycles updateting ACCESSED when swap
is not compiled into the kernel.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Only modules will cause ITLB Misses as we always pin
the first 8MB of kernel memory.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This removes a couple of insn's from the TLB Miss
handlers whithout changing functionality.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add support for H_EM_GET_PARMS hcall that will return data
related to power modes from the platform. Export the data
directly to user space for administrative tools to interpret
and use.
cat /proc/powerpc/lparcfg will export power mode data
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Data address breakpoint exceptions are currently handled along with page-faults
which require interrupts to remain in enabled state. Since exception handling
for data breakpoints aren't pre-empt safe, we handle them separately.
Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
BenH: Added to vio_cmo_dev_attrs as well
Provide a modalias entry for VIO devices in sysfs. I believe
this was another initrd generation bugfix for anaconda.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Now that software events use perf_arch_fetch_caller_regs() too, we
need the powerpc version to be always built.
Fixes the following build error:
(.text+0x3210): undefined reference to `perf_arch_fetch_caller_regs'
(.text+0x3324): undefined reference to `perf_arch_fetch_caller_regs'
(.text+0x33bc): undefined reference to `perf_arch_fetch_caller_regs'
(.text+0x33ec): undefined reference to `perf_arch_fetch_caller_regs'
(.text+0xd4a0): undefined reference to `perf_arch_fetch_caller_regs'
arch/powerpc/kernel/built-in.o:(.text+0xd528): more undefined references to `perf_arch_fetch_caller_regs' follow
make[1]: *** [.tmp_vmlinux1] Error 1
make: *** [sub-make] Error 2
Reported-by: Michael Ellerman <michael@ellerman.id.au>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
powerpc/perf_events: Fix call-graph recording, add perf_arch_fetch_caller_regs
perf top: Add missing initialization to zero
perf probe: Use original address instead of CU-based address
perf probe: Fix offset to allow signed value
perf top: Improve the autosizing of column lenghts
perf probe: Fix need_dwarf flag if lazy matching is used
perf probe: Fix probe_point buffer overrun
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Remove IOMMU_VMERGE config option
powerpc: Fix swiotlb to respect the boot option
powerpc: Do not call prink when CONFIG_PRINTK is not defined
powerpc: Use correct ccr bit for syscall error status
powerpc/fsl-booke: Get coherent bit from PTE
powerpc/85xx: Make sure lwarx hint isn't set on ppc32
The description says:
Cause IO segments sent to a device for DMA to be merged virtually
by the IOMMU when they happen to have been allocated contiguously.
This doesn't add pressure to the IOMMU allocator. However, some
drivers don't support getting large merged segments coming back
from *_map_sg().
Most drivers don't have this problem; it is safe to say Y here.
It's out of date. Long ago, drivers didn't have a way to tell IOMMUs
about their segment length limit (that is, the maximum segment length
that they can handle). So IOMMUs merged as many segments as possible
and gave too large segments to drivers.
dma_get_max_seg_size() was introduced to solve the above
problem. Device drives can use the API to tell IOMMU about the maximum
segment length that they can handle. In addition, the default limit
(64K) should be safe for everyone.
So this config option seems to be unnecessary.
Note that this config option just enables users to disable the virtual
merging by default. Users can still disable the virtual merging by the
boot parameter.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc initializes swiotlb before parsing the kernel boot options so
swiotlb options (e.g. specifying the swiotlb buffer size) are ignored.
Any time before freeing bootmem works for swiotlb so this patch moves
powerpc's swiotlb initialization after parsing the kernel boot
options, mem_init (as x86 does).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Becky Bruce <beckyb@kernel.crashing.org>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When printk() is disabled (CONFIG_PRINTK) at menu item
General setup
-> Configure standard kernel features (for small systems)
-> Enable support for printk
then there should be no printk() calls at all.
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits)
perf: Fix unexported generic perf_arch_fetch_caller_regs
perf record: Don't try to find buildids in a zero sized file
perf: export perf_trace_regs and perf_arch_fetch_caller_regs
perf, x86: Fix hw_perf_enable() event assignment
perf, ppc: Fix compile error due to new cpu notifiers
perf: Make the install relative to DESTDIR if specified
kprobes: Calculate the index correctly when freeing the out-of-line execution slot
perf tools: Fix sparse CPU numbering related bugs
perf_event: Fix oops triggered by cpu offline/online
perf: Drop the obsolete profile naming for trace events
perf: Take a hot regs snapshot for trace events
perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot
perf/x86-64: Use frame pointer to walk on irq and process stacks
lockdep: Move lock events under lockdep recursion protection
perf report: Print the map table just after samples for which no map was found
perf report: Add multiple event support
perf session: Change perf_session post processing functions to take histogram tree
perf session: Add storage for seperating event types in report
perf session: Change add_hist_entry to take the tree root instead of session
perf record: Add ID and to recorded event data when recording multiple events
...
This implements a powerpc version of perf_arch_fetch_caller_regs
to get correct call-graphs.
It's implemented in assembly because that way we can be sure there isn't
a stack frame for perf_arch_fetch_caller_regs. If it was in C, gcc might
or might not create a stack frame for it, which would affect the number
of levels we have to skip.
With this, we see results from perf record -e lock:lock_acquire like
this:
# Samples: 24878
#
# Overhead Command Shared Object Symbol
# ........ .............. ................. ......
#
14.99% perf [kernel.kallsyms] [k] ._raw_spin_lock
|
--- ._raw_spin_lock
|
|--25.00%-- .alloc_fd
| (nil)
| |
| |--50.00%-- .anon_inode_getfd
| | .sys_perf_event_open
| | syscall_exit
| | syscall
| | create_counter
| | __cmd_record
| | run_builtin
| | main
| | 0xfd2e704
| | 0xfd2e8c0
| | (nil)
... etc.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: anton@samba.org
Cc: linuxppc-dev@ozlabs.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100318050513.GA6575@drongo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We shouldn't be always setting 'M' in the TLB entry since its reasonable
for somethings to be mapped non-coherent. The PTE should have 'M' set
properly.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf: Provide generic perf_sample_data initialization
MAINTAINERS: Add Arnaldo as tools/perf/ co-maintainer
perf trace: Don't use pager if scripting
perf trace/scripting: Remove extraneous header read
perf, ARM: Modify kuser rmb() call to compile for Thumb-2
x86/stacktrace: Don't dereference bad frame pointers
perf archive: Don't try to collect files without a build-id
perf_events, x86: Fixup fixed counter constraints
perf, x86: Restrict the ANY flag
perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE
perf, x86: add some IBS macros to perf_event.h
perf, x86: make IBS macros available in perf_event.h
hw-breakpoints: Remove stub unthrottle callback
x86/hw-breakpoints: Remove the name field
perf: Remove pointless breakpoint union
perf lock: Drop the buffers multiplexing dependency
perf lock: Fix and add misc documentally things
percpu: Add __percpu sparse annotations to hw_breakpoint
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/booke: Fix breakpoint/watchpoint one-shot behavior
powerpc: Reduce printk from pseries_mach_cpu_die()
powerpc: Move checks in pseries_mach_cpu_die()
powerpc: Reset kernel stack on cpu online from cede state
powerpc: Fix G5 thermal shutdown
powerpc/pseries: Pass CPPR value to H_XIRR hcall
powerpc/booke: Fix a couple typos in the advanced ptrace code
powerpc: Fix SMP build with disabled CPU hotplugging.
powerpc: Dynamically allocate pacas
powerpc/perf: e500 support
powerpc/perf: Build callchain code regardless of hardware event support.
powerpc/cpm2: Checkpatch cleanup
powerpc/86xx: Renaming following split of GE Fanuc joint venture
powerpc/86xx: Convert gef_pic_lock to raw_spinlock
powerpc/qe: Convert qe_ic_lock to raw_spinlock
powerpc/82xx: Convert pci_pic_lock to raw_spinlock
powerpc/85xx: Convert socrates_fpga_pic_lock to raw_spinlock
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (56 commits)
doc: fix typo in comment explaining rb_tree usage
Remove fs/ntfs/ChangeLog
doc: fix console doc typo
doc: cpuset: Update the cpuset flag file
Fix of spelling in arch/sparc/kernel/leon_kernel.c no longer needed
Remove drivers/parport/ChangeLog
Remove drivers/char/ChangeLog
doc: typo - Table 1-2 should refer to "status", not "statm"
tree-wide: fix typos "ass?o[sc]iac?te" -> "associate" in comments
No need to patch AMD-provided drivers/gpu/drm/radeon/atombios.h
devres/irq: Fix devm_irq_match comment
Remove reference to kthread_create_on_cpu
tree-wide: Assorted spelling fixes
tree-wide: fix 'lenght' typo in comments and code
drm/kms: fix spelling in error message
doc: capitalization and other minor fixes in pnp doc
devres: typo fix s/dev/devm/
Remove redundant trailing semicolons from macros
fix typo "definetly" -> "definitely" in comment
tree-wide: s/widht/width/g typo in comments
...
Fix trivial conflict in Documentation/laptops/00-INDEX
This converts powerpc to use the generic pci_set_dma_mask and
pci_set_consistent_dma_mask (drivers/pci/pci.c).
The generic pci_set_dma_mask does what powerpc's pci_set_dma_mask does.
Unlike powerpc's pci_set_consistent_dma_mask, the gneric
pci_set_consistent_dma_mask sets only coherent_dma_mask. It doesn't work
for powerpc? pci_set_consistent_dma_mask API should set only
coherent_dma_mask?
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Greg KH <greg@kroah.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add generic implementations of the old and really old uname system calls.
Note that sh only implements sys_olduname but not sys_oldolduname, but I'm
not going to bother with another ifdef for that special case.
m32r implemented an old uname but never wired it up, so kill it, too.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On an architecture that supports 32-bit compat we need to override the
reported machine in uname with the 32-bit value. Instead of doing this
separately in every architecture introduce a COMPAT_UTS_MACHINE define in
<asm/compat.h> and apply it directly in sys_newuname().
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a generic implementation of the ipc demultiplexer syscall. Except for
s390 and sparc64 all implementations of the sys_ipc are nearly identical.
There are slight differences in the types of the parameters, where mips
and powerpc as the only 64-bit architectures with sys_ipc use unsigned
long for the "third" argument as it gets casted to a pointer later, while
it traditionally is an "int" like most other paramters. frv goes even
further and uses unsigned long for all parameters execept for "ptr" which
is a pointer type everywhere. The change from int to unsigned long for
"third" and back to "int" for the others on frv should be fine due to the
in-register calling conventions for syscalls (we already had a similar
issue with the generic sys_ptrace), but I'd prefer to have the arch
maintainers looks over this in details.
Except for that h8300, m68k and m68knommu lack an impplementation of the
semtimedop sub call which this patch adds, and various architectures have
gets used - at least on i386 it seems superflous as the compat code on
x86-64 and ia64 doesn't even bother to implement it.
[akpm@linux-foundation.org: add sys_ipc to sys_ni.c]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix:
arch/powerpc/kernel/perf_event.c:1334: error: 'power_pmu_notifier' undeclared (first use in this function)
arch/powerpc/kernel/perf_event.c:1334: error: (Each undeclared identifier is reported only once
arch/powerpc/kernel/perf_event.c:1334: error: for each function it appears in.)
arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'power_pmu_notifier'
arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'register_cpu_notifier'
Due to commit 3f6da390 (perf: Rework and fix the arch CPU-hotplug hooks).
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove the hw_perf_event_*() hotplug hooks in favour of per PMU hotplug
notifiers. This has the advantage of reducing the static weak interface
as well as exposing all hotplug actions to the PMU.
Use this to fix x86 hotplug usage where we did things in ONLINE which
should have been done in UP_PREPARE or STARTING.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100305154128.736225361@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This makes it easier to extend perf_sample_data and fixes a bug on arm
and sparc, which failed to set ->raw to NULL, which can cause crashes
when combined with PERF_SAMPLE_RAW.
It also optimizes PowerPC and tracepoint, because the struct
initialization is forced to zero out the whole structure.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jean Pihet <jpihet@mvista.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Jamie Iles <jamie.iles@picochip.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: stable@kernel.org
LKML-Reference: <20100304140100.315416040@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Another fix for the extended ptrace patches in the -next tree.
The handling of breakpoints and watchpoints is inconsistent. When a
breakpoint or watchpoint is hit, the interrupt handler is clearing the
proper bits in the dbcr* registers, but leaving the dac* and iac* registers
alone. The ptrace code to delete the break/watchpoints checks the dac* and
iac* registers for zero to determine if they are enabled. Instead, they
should check the dbcr* bits.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cpu hotplug (offline) without dlpar operation will place cpu
in cede state and the extended_cede_processor() function will
return when resumed.
Kernel stack pointer needs to be reset before
start_secondary() is called to continue the online operation.
Added new function start_secondary_resume() to do the above
steps.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/booke: Fix a couple typos in the advanced ptrace code
Found and fixed a couple typos in the advanced ptrace patches.
(These patches are currently in benh's next tree.)
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On 64-bit kernels we currently have a 512 byte struct paca_struct for
each cpu (usually just called "the paca"). Currently they are statically
allocated, which means a kernel built for a large number of cpus will
waste a lot of space if it's booted on a machine with few cpus.
We can avoid that by only allocating the number of pacas we need at
boot. However this is complicated by the fact that we need to access
the paca before we know how many cpus there are in the system.
The solution is to dynamically allocate enough space for NR_CPUS pacas,
but then later in boot when we know how many cpus we have, we free any
unused pacas.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Constify struct sysfs_ops.
This is part of the ops structure constification
effort started by Arjan van de Ven et al.
Benefits of this constification:
* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime
* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime
* potentially better optimized code as the compiler
can assume that the const data cannot be changed
* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing
Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This implements perf_event support for the Freescale embedded performance
monitor, based on the existing perf_event.c that supports server/classic
chips.
Some limitations:
- Performance monitor interrupts are regular EE interrupts, and thus you
can't profile places with interrupts disabled. We may want to implement
soft IRQ-disabling, with perfmon interrupts exempted and treated as NMIs.
- When trying to schedule multiple event groups at once, and using
restricted events, situations could arise where scheduling fails even
though it would be possible. Consider three groups, each with two events.
One group has restricted events, the others don't. The two non-restricted
groups are scheduled, then one is removed, which happens to occupy the two
counters that can't do restricted events. The remaining non-restricted
group will not be moved to the non-restricted-capable counters to make
room if the restricted group tries to be scheduled.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
It's also useful for software events, as well as future support for
other types of hardware counters.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
SRR1 stores more information that just the MSR value. It also stores
valuable information about the type of interrupt we received, for
example whether the storage interrupt we just got was because of a
missing htab entry or not.
We use that information to speed up the exit path.
Now if we get preempted before we can interpret the shadow_msr values,
we get into vcpu_put which then calls the MSR handler, which then sets
all the SRR1 information bits in shadow_msr to 0. Great.
So let's preserve the SRR1 specific bits in shadow_msr whenever we set
the MSR. They don't hurt.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
We need to explicitly only giveup VSX in KVM, so let's export that
specific function to module space.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Currently we're racy when doing the transition from IR=1 to IR=0, from
the module memory entry code to the real mode SLB switching code.
To work around that I took a look at the RTAS entry code which is faced
with a similar problem and did the same thing:
A small helper in linear mapped memory that does mtmsr with IR=0 and
then RFIs info the actual handler.
Thanks to that trick we can safely take page faults in the entry code
and only need to be really wary of what to do as of the SLB switching
part.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
We're being horribly racy right now. All the entry and exit code hijacks
random fields from the PACA that could easily be used by different code in
case we get interrupted, for example by a #MC or even page fault.
After discussing this with Ben, we figured it's best to reserve some more
space in the PACA and just shove off some vcpu state to there.
That way we can drastically improve the readability of the code, make it
less racy and less complex.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (88 commits)
powerpc: Fix lwsync feature fixup vs. modules on 64-bit
powerpc: Convert pmc_owner_lock to raw_spinlock
powerpc: Convert die.lock to raw_spinlock
powerpc: Convert tlbivax_lock to raw_spinlock
powerpc: Convert mpic locks to raw_spinlock
powerpc: Convert pmac_pic_lock to raw_spinlock
powerpc: Convert big_irq_lock to raw_spinlock
powerpc: Convert feature_lock to raw_spinlock
powerpc: Convert i8259_lock to raw_spinlock
powerpc: Convert beat_htab_lock to raw_spinlock
powerpc: Convert confirm_error_lock to raw_spinlock
powerpc: Convert ipic_lock to raw_spinlock
powerpc: Convert native_tlbie_lock to raw_spinlock
powerpc: Convert beatic_irq_mask_lock to raw_spinlock
powerpc: Convert nv_lock to raw_spinlock
powerpc: Convert context_lock to raw_spinlock
powerpc/85xx: Add NOR, LEDs and PIB support for MPC8568E-MDS boards
powerpc/86xx: Enable VME driver on the GE SBC610
powerpc/86xx: Enable VME driver on the GE PPC9A
powerpc/86xx: Add MSI section to GE PPC9A DTS
...
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (48 commits)
x86/PCI: Prevent mmconfig memory corruption
ACPI: Use GPE reference counting to support shared GPEs
x86/PCI: use host bridge _CRS info by default on 2008 and newer machines
PCI: augment bus resource table with a list
PCI: add pci_bus_for_each_resource(), remove direct bus->resource[] refs
PCI: read bridge windows before filling in subtractive decode resources
PCI: split up pci_read_bridge_bases()
PCIe PME: use pci_pcie_cap()
PCI PM: Run-time callbacks for PCI bus type
PCIe PME: use pci_is_pcie()
PCI / ACPI / PM: Platform support for PCI PME wake-up
ACPI / ACPICA: Multiple system notify handlers per device
ACPI / PM: Add more run-time wake-up fields
ACPI: Use GPE reference counting to support shared GPEs
PCI PM: Make it possible to force using INTx for PCIe PME signaling
PCI PM: PCIe PME root port service driver
PCI PM: Add function for checking PME status of devices
PCI: mark is_pcie obsolete
PCI: set PCI_PREF_RANGE_TYPE_64 in pci_bridge_check_ranges
PCI: pciehp: second try to get big range for pcie devices
...
Since the cpu argument to hw_perf_group_sched_in() is always
smp_processor_id(), simplify the code a little by removing this argument
and using the current cpu where needed.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1265890918.5396.3.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (41 commits)
of: remove undefined request_OF_resource & release_OF_resource
of/sparc: Remove sparc-local declaration of allnodes and devtree_lock
of: move definition of of_chosen into common code.
of: remove unused extern reference to devtree_lock
of: put default string compare and #a/s-cell values into common header
of/flattree: Don't assume HAVE_LMB
of: protect linux/of.h with CONFIG_OF
proc_devtree: fix THIS_MODULE without module.h
of: Remove old and misplaced function declarations
of/flattree: Make the kernel accept ePAPR style phandle information
of/flattree: endian-convert members of boot_param_header
of: assume big-endian properties, adding conversions where necessary
of: use __be32 for cell value accessors
of/flattree: use OF_ROOT_NODE_{SIZE,ADDR}_CELLS DEFAULT for fdt parsing
of/flattree: use callback to setup initrd from /chosen
proc_devtree: include linux/of.h
of: make set_node_proc_entry private to proc_devtree.c
of: include linux/proc_fs.h
of/flattree: merge early_init_dt_scan_memory() common code
of: add 'of_' prefix to machine_is_compatible()
...
No functional change; this converts loops that iterate from 0 to
PCI_BUS_NUM_RESOURCES through pci_bus resource[] table to use the
pci_bus_for_each_resource() iterator instead.
This doesn't change the way resources are stored; it merely removes
dependencies on the fact that they're in a table.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Now that we return the new resource start position, there is no
need to update "struct resource" inside the align function.
Therefore, mark the struct resource as const.
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
As suggested by Linus, align functions should return the start
of a resource, not void. An update of "res->start" is no longer
necessary.
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pmc_owner_lock needs to be a real spinlock in RT. Convert it to
raw_spinlock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
die.lock needs to be a real spinlock in RT. Convert it to
raw_spinlock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
big_irq_lock needs to be a real spinlock in RT. Convert it to
raw_spinlock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
24 is offset between the opcode past bl and past rfi. This makes it more
obvious.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
powerpc/booke: Add support for advanced debug registers
From: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Based on patches originally written by Torez Smith.
This patch defines context switch and trap related functionality
for BookE specific Debug Registers. It adds support to ptrace()
for setting and getting BookE related Debug Registers
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <dwg@au1.ibm.com>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Sergio Durigan Junior <sergiodj@br.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@br.ibm.com>
Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc: Extended ptrace interface
From: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Based on patches originally written by Torez Smith.
Add a new extended ptrace interface so that user-space has a single
interface for powerpc, without having to know the specific layout
of the debug registers.
Implement:
PPC_PTRACE_GETHWDEBUGINFO
PPC_PTRACE_SETHWDEBUG
PPC_PTRACE_DELHWDEBUG
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: David Gibson <dwg@au1.ibm.com>
Cc: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Sergio Durigan Junior <sergiodj@br.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@br.ibm.com>
Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/booke: Introduce new CONFIG options for advanced debug registers
From: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Introduce new config options to simplify the ifdefs pertaining to the
advanced debug registers for booke and 40x processors:
CONFIG_PPC_ADV_DEBUG_REGS - boolean: true for dac-based processors
CONFIG_PPC_ADV_DEBUG_IACS - number of IAC registers
CONFIG_PPC_ADV_DEBUG_DACS - number of DAC registers
CONFIG_PPC_ADV_DEBUG_DVCS - number of DVC registers
CONFIG_PPC_ADV_DEBUG_DAC_RANGE - DAC ranges supported
Beginning conservatively, since I only have the facilities to test 440
hardware. I believe all 40x and booke platforms support at least 2 IAC
and 2 DAC registers. For 440, 4 IAC and 2 DVC registers are enabled, as
well as the DAC ranges.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I often get asked if BAD interrupts are really bad. On some boxes (eg
IBM machines running a hypervisor) there are valid cases where are
presented with an interrupt that is not for us. These cases are common
enough to show up as thousands of BAD interrupts a day.
Tone them down by calling them spurious. Since they can be a significant cause
of OS jitter, we may as well log them per cpu so we know where they are
occurring.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With NO_HZ it is useful to know how often the decrementer is going off. The
patch below adds an entry for it and also adds it into the /proc/stat
summaries.
While here, I added performance monitoring and machine check exceptions.
I found it useful to keep an eye on the PMU exception rate
when using the perf tool. Since it's possible to take a completely
handled machine check on a System p box it also sounds like a good idea to
keep a machine check summary.
The event naming matches x86 to keep gratuitous differences to a minimum.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On a large machine I noticed the columns of /proc/interrupts failed to line up
with the header after CPU9. At sufficiently large numbers of CPUs it becomes
impossible to line up the CPU number with the counts.
While fixing this I noticed x86 has a number of updates that we may as well
pull in. On PowerPC we currently omit an interrupt completely if there is no
active handler, whereas on x86 it is printed if there is a non zero count.
The x86 code also spaces the first column correctly based on nr_irqs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
PowerPC is currently using asm-generic/hardirq.h which statically allocates an
NR_CPUS irq_stat array. Switch to an arch specific implementation which uses
per cpu data:
On a kernel with NR_CPUS=1024, this saves quite a lot of memory:
text data bss dec hex filename
8767938 2944132 1636796 13348866 cbb002 vmlinux.baseline
8767779 2944260 1505724 13217763 c9afe3 vmlinux.irq_cpustat
A saving of around 128kB.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rather than defining of_chosen in each arch, it can be defined for all
in driver/of/base.c
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Neither the powerpc nor the microblaze code use devtree_lock anymore.
Remove the extern reference.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
We don't always have lmb available, so make arches provide an
early_init_dt_alloc_memory_arch() to handle the allocation of
memory in the fdt code.
When we don't have lmb.h included, we need asm/page.h for __va.
Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
The boot_param_header has big-endian fields, so change the types to
__be32, and perform endian conversion when we access them.
Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
At present, the fdt code sets the kernel-wide initrd_start and
initrd_end variables when parsing /chosen. On ARM, we only set these
once the bootmem has been reserved.
This change adds an arch hook to setup the initrd from the device
tree:
void early_init_dt_setup_initrd_arch(unsigned long start,
unsigned long end);
The arch-specific code can then setup the initrd however it likes.
Compiled on powerpc, with CONFIG_BLK_DEV_INITRD=y and =n.
Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Merge common code between PowerPC and Microblaze architectures.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Michal Simek <monstr@monstr.eu>
machine is compatible is an OF-specific call. It should have
the of_ prefix to protect the global namespace.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Michal Simek <monstr@monstr.eu>
Merge common function between powerpc, sparc and microblaze. Code is
identical for powerpc and microblaze, but adds a lock (and release) of
the devtree_lock on sparc.
Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Merge common code between PowerPC and Microblaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The clockevent multiplier and shift is useful information, but we
only need to print it once.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
RTAS should never cause an exception but if it does (for example accessing
outside our RMO) then we might go a long way through the kernel before
oopsing. If we unset MSR_RI we should at least stop things on exception
exit.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Frans Pop <elendil@planet.nl>
Cc: linuxppc-dev@ozlabs.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We use firmware_has_feature quite a lot these days, so it's worth putting
powerpc_firmware_features into __read_mostly.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add printout of last accessed sysfs file, added to x86 in
ae87221d3c (sysfs: crash debugging)
Also add the notify_die hook that allows us to print out the ftrace
buffer on oops. This is useful in conjunction with ftrace function_graph:
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=128 NUMA pSeries
last sysfs file: /sys/class/net/tunl0/type
Dumping ftrace buffer:
...
0) | .sysrq_handle_crash() {
0) 0.476 us | .hash_page();
0) 0.488 us | .xmon_fault_handler();
0) | .bad_page_fault() {
0) | .search_exception_tables() {
0) 0.590 us | .search_module_extables();
0) 2.546 us | }
0) | .printk() {
0) | .vprintk() {
0) 0.488 us | ._raw_spin_lock();
0) 0.572 us | .emit_log_char();
Showing the function graph of a sysrq-c crash.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
String constants that are continued on subsequent lines with \
are not good.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Replace platfrom -> platform.
This is a frequent spelling bug.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Updated variant of a patch by Joel Schopp.
The field containing the number of supported cores which we pass to
firmware via the ibm,client-architecture call was set by a previous
patch statically as high as is possible (NR_CPUS).
However, that value isn't quite right for a system that supports
multiple threads per core, thus permitting the firmware to assign
more cores to a Linux partition than it can really cope with.
This patch improves it by using the device-tree to determine the
number of threads supported by the processors in order to adjust
the value passed to firmware.
Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds 2 fields to the ibm_architecture_vec array.
The first of these fields indicates the number of cores which Linux can
boot. It does not account for SMT, so it may result in cpus assigned to
Linux which cannot be booted. A second patch follows that dynamically
updates this for SMT.
The second field just indicates that our OS is Linux, and not another
OS. The system may or may not use this hint to performance tune
settings for Linux.
Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Using perf to trace L1 dcache misses and dumping data addresses I found a few
variables taking a lot of misses. Since they are almost never written, they
should go into the __read_mostly section.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The cputime code has a few places that do per_cpu(, smp_processor_id()).
Replace them with __get_cpu_var().
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Here are the powerpc bits to remove TIF_ABI_PENDING now that
set_personality() is called at the appropriate place in exec.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add missing call to pci_fixup_device(pci_fixup_early, ...) when
building the pci_dev from scratch off the Open Firmware device-tree
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add missing hookup to existing pci_slot when building the pci_dev from
scratch off the Open Firmware device-tree
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We are missing these when building the pci_dev from scratch off
the Open Firmware device-tree
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In struct device_node, the phandle is named 'linux_phandle' for PowerPC
and MicroBlaze, and 'node' for SPARC. There is no good reason for the
difference, it is just an artifact of the code diverging over a couple
of years. This patch renames both to simply .phandle.
Note: the .node also existed in PowerPC/MicroBlaze, but the only user
seems to be arch/powerpc/platforms/powermac/pfunc_core.c. It doesn't
look like the assignment between .linux_phandle and .node is
significantly different enough to warrant the separate code paths
unless ibm,phandle properties actually appear in Apple device trees.
I think it is safe to eliminate the old .node property and use
phandle everywhere.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Merge common code between PowerPC and MicroBlaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Merge common code between PowerPC and Microblaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When running perf across all cpus with backtracing (-a -g), sometimes we
get samples without associated backtraces:
23.44% init [kernel] [k] restore
11.46% init eeba0c [k] 0x00000000eeba0c
6.77% swapper [kernel] [k] .perf_ctx_adjust_freq
5.73% init [kernel] [k] .__trace_hcall_entry
4.69% perf libc-2.9.so [.] 0x0000000006bb8c
|
|--11.11%-- 0xfffa941bbbc
It turns out the backtrace code has a check for the idle task and the IP
sampling does not. This creates problems when profiling an interrupt
heavy workload (in my case 10Gbit ethernet) since we get no backtraces
for interrupts received while idle (ie most of the workload).
Right now x86 and sh check that current is not NULL, which should never
happen so remove that too.
Idle task's exclusion must be performed from the core code, on top
of perf_event_attr:exclude_idle.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
LKML-Reference: <20100118054707.GT12666@kryten>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Move the defintion and lock helper routines for the cpu hotplug driver
lock from pseries to powerpc code to avoid build breaks for platforms
other than pseries that use cpu hotplug.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It looks like the previous patch sent out to move RTAS and
other items from /proc/ppc64 to /proc/powerpc missed a few
files needed for RAS and DLPAR functionality.
Original Patch here:
http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-September/076096.html
This patch updates the remaining files to be created under /proc/powerpc.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The newly added fixup for buggy dcbX insn's has
a bug that always trigger a kernel TLB walk so a user space
dcbX insn will cause a Kernel Machine Check if it hits DTLB error.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We noticed that recent kernels didn't boot on our 1GHz Canyonlands 460EX
boards anymore. As it seems, patch 8d165db1 [powerpc: Improve
decrementer accuracy] introduced this problem. The routine div_sc()
overflows with shift = 32 resulting in this incorrect setup:
time_init: decrementer frequency = 1000.000012 MHz
time_init: processor frequency = 1000.000012 MHz
clocksource: timebase mult[400000] shift[22] registered
clockevent: decrementer mult[33] shift[32] cpu[0]
This patch now introduces a local div_dc64() version of this function
so that this overflow doesn't happen anymore.
Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Detlev Zundel <dzu@denx.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It seems there is a thinko in the TLB invalidation code that makes the
tlbie in the loop executed just once. The intended check was probably
'gt', not 'lt'.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Various kernel asm modifies SRR0/SRR1 just before executing
a rfi. If such code crosses a page boundary you risk a TLB miss
which will clobber SRR0/SRR1. Avoid this by always pinning
kernel instruction TLB space.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI/cardbus: Add a fixup hook and fix powerpc
PCI: change PCI nomenclature in drivers/pci/ (non-comment changes)
PCI: change PCI nomenclature in drivers/pci/ (comment changes)
PCI: fix section mismatch on update_res()
PCI: add Intel 82599 Virtual Function specific reset method
PCI: add Intel USB specific reset method
PCI: support device-specific reset methods
PCI: Handle case when no pci device can provide cache line size hint
PCI/PM: Propagate wake-up enable for PCIe devices too
vgaarbiter: fix a typo in the vgaarbiter Documentation
This patch fixes the handling of VSX alignment faults in little-endian
mode (the current code assumes the processor is in big-endian mode).
The patch also makes the handlers clear the top 8 bytes of the register
when handling an 8 byte VSX load.
This is based on 2.6.32.
Signed-off-by: Neil Campbell <neilc@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The cardbus code creates PCI devices without ever going through the
necessary fixup bits and pieces that normal PCI devices go through.
There's in fact a commented out call to pcibios_fixup_bus() in there,
it's commented because ... it doesn't work.
I could make pcibios_fixup_bus() do the right thing on powerpc easily
but I felt it cleaner instead to provide a specific hook pci_fixup_cardbus
for which a weak empty implementation is provided by the PCI core.
This fixes cardbus on powerbooks and probably all other PowerPC
platforms which was broken completely for ever on some platforms and
since 2.6.31 on others such as PowerBooks when we made the DMA ops
mandatory (since those are setup by the fixups).
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'next' of git://git.secretlab.ca/git/linux-2.6: (23 commits)
powerpc: fix up for mmu_mapin_ram api change
powerpc: wii: allow ioremap within the memory hole
powerpc: allow ioremap within reserved memory regions
wii: use both mem1 and mem2 as ram
wii: bootwrapper: add fixup to calc useable mem2
powerpc: gamecube/wii: early debugging using usbgecko
powerpc: reserve fixmap entries for early debug
powerpc: wii: default config
powerpc: wii: platform support
powerpc: wii: hollywood interrupt controller support
powerpc: broadway processor support
powerpc: wii: bootwrapper bits
powerpc: wii: device tree
powerpc: gamecube: default config
powerpc: gamecube: platform support
powerpc: gamecube/wii: flipper interrupt controller support
powerpc: gamecube/wii: udbg support for usbgecko
powerpc: gamecube/wii: do not include PCI support
powerpc: gamecube/wii: declare as non-coherent platforms
powerpc: gamecube/wii: introduce GAMECUBE_COMMON
...
Fix up conflicts in arch/powerpc/mm/fsl_booke_mmu.c.
Hopefully even close to correctly.
* 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
modpost: fix segfault with short symbol names
module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y
Kbuild: clear marker out of modpost
module: make MODULE_SYMBOL_PREFIX into a CONFIG option
ARM: unexport symbols used to implement floating point emulation
ARM: use unified discard definition in linker script
x86: don't export inline function
sparc64: don't export static inline pci_ functions
Use bitmap library and kill some unused iommu helper functions.
1. s/iommu_area_free/bitmap_clear/
2. s/iommu_area_reserve/bitmap_set/
3. Use bitmap_find_next_zero_area instead of find_next_zero_area
This cannot be simple substitution because find_next_zero_area
doesn't check the last bit of the limit in bitmap
4. Remove iommu_area_free, iommu_area_reserve, and find_next_zero_area
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
powerpc applies relocations to the kcrctab. They're absolute symbols,
but it's not completely unreasonable: other archs may too, but the
relocation is often 0.
http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-November/077972.html
Inspired-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Name space cleanup. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
Further name space cleanup. No functional change
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
The raw_spin* namespace was taken by lockdep for the architecture
specific implementations. raw_spin_* would be the ideal name space for
the spinlocks which are not converted to sleeping locks in preempt-rt.
Linus suggested to convert the raw_ to arch_ locks and cleanup the
name space instead of using an artifical name like core_spin,
atomic_spin or whatever
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
m68k: rename global variable vmalloc_end to m68k_vmalloc_end
percpu: add missing per_cpu_ptr_to_phys() definition for UP
percpu: Fix kdump failure if booted with percpu_alloc=page
percpu: make misc percpu symbols unique
percpu: make percpu symbols in ia64 unique
percpu: make percpu symbols in powerpc unique
percpu: make percpu symbols in x86 unique
percpu: make percpu symbols in xen unique
percpu: make percpu symbols in cpufreq unique
percpu: make percpu symbols in oprofile unique
percpu: make percpu symbols in tracer unique
percpu: make percpu symbols under kernel/ and mm/ unique
percpu: remove some sparse warnings
percpu: make alloc_percpu() handle array types
vmalloc: fix use of non-existent percpu variable in put_cpu_var()
this_cpu: Use this_cpu_xx in trace_functions_graph.c
this_cpu: Use this_cpu_xx for ftrace
this_cpu: Use this_cpu_xx in nmi handling
this_cpu: Use this_cpu operations in RCU
this_cpu: Use this_cpu ops for VM statistics
...
Fix up trivial (famous last words) global per-cpu naming conflicts in
arch/x86/kvm/svm.c
mm/slab.c
Add support for using the USB Gecko adapter as an early debugging
console on the Nintendo GameCube and Wii video game consoles.
The USB Gecko is a 3rd party memory card interface adapter that provides
a EXI (External Interface) to USB serial converter.
Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch extends the cputable entry of the 750CL to also match
the 750CL-based "Broadway" cpu found on the Nintendo Wii.
As of this patch, the following "Broadway" design revision levels have
been seen in the wild:
- DD1.2 (87102)
- DD2.0 (87200)
Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (151 commits)
powerpc: Fix usage of 64-bit instruction in 32-bit altivec code
MAINTAINERS: Add PowerPC patterns
powerpc/pseries: Track previous CPPR values to correctly EOI interrupts
powerpc/pseries: Correct pseries/dlpar.c build break without CONFIG_SMP
powerpc: Make "intspec" pointers in irq_host->xlate() const
powerpc/8xx: DTLB Miss cleanup
powerpc/8xx: Remove DIRTY pte handling in DTLB Error.
powerpc/8xx: Start using dcbX instructions in various copy routines
powerpc/8xx: Restore _PAGE_WRITETHRU
powerpc/8xx: Add missing Guarded setting in DTLB Error.
powerpc/8xx: Fixup DAR from buggy dcbX instructions.
powerpc/8xx: Tag DAR with 0x00f0 to catch buggy instructions.
powerpc/8xx: Update TLB asm so it behaves as linux mm expects.
powerpc/8xx: Invalidate non present TLBs
powerpc/pseries: Serialize cpu hotplug operations during deactivate Vs deallocate
pseries/pseries: Add code to online/offline CPUs of a DLPAR node
powerpc: stop_this_cpu: remove the cpu from the online map.
powerpc/pseries: Add kernel based CPU DLPAR handling
sysfs/cpu: Add probe/release files
powerpc/pseries: Kernel DLPAR Infrastructure
...
New helper - sys_mmap_pgoff(); switch syscalls to using it.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Merge common code between PowerPC and Microblaze. This patch
splits the arch-specific stuff out into a new function,
early_init_dt_scan_chosen_arch().
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
A cell is firmly established as a u32. No need to do an ugly typedef
to redefine it to cell_t. Eliminate the unnecessary typedef so that
it doesn't have to be added to the of_fdt header file
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Merge common code between PowerPC and Microblaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Merge common code between PowerPC and Microblaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Merge common code between PowerPC and Microblaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
e821ea70f3 introduced a bug by copying
some 64-bit originated code as-is to be used by both 32 and 64-bit
but this code contains a 64-bit ony "cmpdi" instruction.
This changes it to cmpwi, which is fine since VRSAVE can only contains
a 32-bit value anyway.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@kernel.org>
Writing a driver using SCLPC on the MPC5200B I detected, that the
intspec arrays to map irqs to Linux virq cannot be const, because the
mapping and xlate functions only take non const pointers. All those
functions do not modify the intspec, so a const pointer could be used.
Signed-off-by: Roman Fietze <roman.fietze@telemotive.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use symbolic constant for PRESENT and avoid branching.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There is no need to do set the DIRTY bit directly in DTLB Error.
Trap to do_page_fault() and let the generic MM code do the work.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Now that 8xx can fixup dcbX instructions, start using them
where possible like every other PowerPc arch do.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
8xx has not had WRITETHRU due to lack of bits in the pte.
After the recent rewrite of the 8xx TLB code, there are
two bits left. Use one of them to WRITETHRU.
Perhaps use the last SW bit to PAGE_SPECIAL or PAGE_FILE?
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
only DTLB Miss did set this bit, DTLB Error needs too otherwise
the setting is lost when the page becomes dirty.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is an assembler version to fixup DAR not being set
by dcbX, icbi instructions. There are two versions, one
uses selfmodifing code, the other uses a
jump table but is much bigger(default).
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
dcbz, dcbf, dcbi, dcbst and icbi do not set DAR when they
cause a DTLB Error. Dectect this by tagging DAR with 0x00f0
at every exception exit that modifies DAR.
Test for DAR=0x00f0 in DataTLBError and bail
to handle_page_fault().
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Update the TLB asm to make proper use of _PAGE_DIRY and _PAGE_ACCESSED.
Get rid of _PAGE_HWWRITE too.
Pros:
- I/D TLB Miss never needs to write to the linux pte.
- _PAGE_ACCESSED is only set on TLB Error fixing accounting
- _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
when a page has been made dirty.
- Proper RO/RW mapping of user space.
- Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
- kernel RO/user NA support.
Cons:
- A few more instructions in the TLB Miss routines.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Remove the CPU from the online map to prevent smp_call_function
from sending messages to a stopped CPU.
Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Version 3 of this patch is updated with documentation added to
Documentation/ABI. There are no changes to any of the C code from v2
of the patch.
In order to support kernel DLPAR of CPU resources we need to provide an
interface to add (probe) and remove (release) the resource from the system.
This patch Creates new generic probe and release sysfs files to facilitate
cpu probe/release. The probe/release interface provides for allowing each
arch to supply their own routines for implementing the backend of adding
and removing cpus to/from the system.
This also creates the powerpc specific stubs to handle the arch callouts
from writes to the sysfs files.
The creation and use of these files is regulated by the
CONFIG_ARCH_CPU_PROBE_RELEASE option so that only architectures that need the
capability will have the files created.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
timers, init: Limit the number of per cpu calibration bootup messages
posix-cpu-timers: optimize and document timer_create callback
clockevents: Add missing include to pacify sparse
x86: vmiclock: Fix printk format
x86: Fix printk format due to variable type change
sparc: fix printk for change of variable type
clocksource/events: Fix fallout of generic code changes
nohz: Allow 32-bit machines to sleep for more than 2.15 seconds
nohz: Track last do_timer() cpu
nohz: Prevent clocksource wrapping during idle
nohz: Type cast printk argument
mips: Use generic mult/shift factor calculation for clocks
clocksource: Provide a generic mult/shift factor calculation
clockevents: Use u32 for mult and shift factors
nohz: Introduce arch_needs_cpu
nohz: Reuse ktime in sub-functions of tick_check_idle.
time: Remove xtime_cache
time: Implement logarithmic time accumulation
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits)
security/tomoyo: Remove now unnecessary handling of security_sysctl.
security/tomoyo: Add a special case to handle accesses through the internal proc mount.
sysctl: Drop & in front of every proc_handler.
sysctl: Remove CTL_NONE and CTL_UNNUMBERED
sysctl: kill dead ctl_handler definitions.
sysctl: Remove the last of the generic binary sysctl support
sysctl net: Remove unused binary sysctl code
sysctl security/tomoyo: Don't look at ctl_name
sysctl arm: Remove binary sysctl support
sysctl x86: Remove dead binary sysctl support
sysctl sh: Remove dead binary sysctl support
sysctl powerpc: Remove dead binary sysctl support
sysctl ia64: Remove dead binary sysctl support
sysctl s390: Remove dead sysctl binary support
sysctl frv: Remove dead binary sysctl support
sysctl mips/lasat: Remove dead binary sysctl support
sysctl drivers: Remove dead binary sysctl support
sysctl crypto: Remove dead binary sysctl support
sysctl security/keys: Remove dead binary sysctl support
sysctl kernel: Remove binary sysctl logic
...
We can kill unused swiotlb variable.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Since they are static inline functions.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The code under "if (is_global_init())" is bogus, and is_global_init()
itself is not right in mt case.
Contrary to what the comment says, nowadays force_sig_info() does kill
init even if the handler is SIG_DFL. Note that force_sig_info() clears
SIGNAL_UNKILLABLE exactly for this case.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The typename member of struct irq_chip was kept for migration purposes
and is obsolete since more than 2 years. Fix up the leftovers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The ioctl is only used for powermac systems and reads a partition
number from an array which is initialized at boot time way before the
nvram code is initialized. So it's safe to switch to unlocked_ioctl.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Merge common code between PowerPC and MicroBlaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
Merge common code between PowerPC and MicroBlaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
Merge common code between PowerPC and MicroBlaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
Merge common code between PowerPC and Microblaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
Merge common code between PowerPC and Microblaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
Merge common code between PowerPC and MicroBlaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
Merge common code between PowerPC and Microblaze
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
Merge common code between Microblaze and PowerPC.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
Re-write the code so its more standalone and fixed some issues:
* Bump'd # of CAM entries to 64 to support e500mc
* Make the code handle MAS7 properly
* Use pr_cont instead of creating a string as we go
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>