The current implementation of CONFIG_RELOCATABLE in BookE is based
on mapping the page aligned kernel load address to KERNELBASE. This
approach however is not enough for platforms, where the TLB page size
is large (e.g, 256M on 44x). So we are renaming the RELOCATABLE used
currently in BookE to DYNAMIC_MEMSTART to reflect the actual method.
The CONFIG_RELOCATABLE for PPC32(BookE) based on processing of the
dynamic relocations will be introduced in the later in the patch series.
This change would allow the use of the old method of RELOCATABLE for
platforms which can afford to enforce the page alignment (platforms with
smaller TLB size).
Changes since v3:
* Introduced a new config, NONSTATIC_KERNEL, to denote a kernel which is
either a RELOCATABLE or DYNAMIC_MEMSTART(Suggested by: Josh Boyer)
Suggested-by: Scott Wood <scottwood@freescale.com>
Tested-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux ppc dev <linuxppc-dev@lists.ozlabs.org>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
Now all ARCH_POPULATES_NODE_MAP archs select HAVE_MEBLOCK_NODE_MAP -
there's no user of early_node_map[] left. Kill early_node_map[] and
replace ARCH_POPULATES_NODE_MAP with HAVE_MEMBLOCK_NODE_MAP. Also,
relocate for_each_mem_pfn_range() and helper from mm.h to memblock.h
as page_alloc.c would no longer host an alternative implementation.
This change is ultimately one to one mapping and shouldn't cause any
observable difference; however, after the recent changes, there are
some functions which now would fit memblock.c better than page_alloc.c
and dependency on HAVE_MEMBLOCK_NODE_MAP instead of HAVE_MEMBLOCK
doesn't make much sense on some of them. Further cleanups for
functions inside HAVE_MEMBLOCK_NODE_MAP in mm.h would be nice.
-v2: Fix compile bug introduced by mis-spelling
CONFIG_HAVE_MEMBLOCK_NODE_MAP to CONFIG_MEMBLOCK_HAVE_NODE_MAP in
mmzone.h. Reported by Stephen Rothwell.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
powerpc doesn't access early_node_map[] directly and enabling
HAVE_MEMBLOCK_NODE_MAP is trivial - replacing add_active_range() calls
with memblock_set_node() and selecting HAVE_MEMBLOCK_NODE_MAP is
enough.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
This patch provides cpu_idle_wait() routine for the powerpc
platform which is required by the cpuidle subsystem. This
routine is required to change the idle handler on SMP systems.
The equivalent routine for x86 is in arch/x86/kernel/process.c
but the powerpc implementation is different.
cpuidle_disable variable is to enable/disable cpuidle
framework if power_save option is set during the boot
time.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc copied pci_iomap from generic code, probably to avoid
pulling the rest of iomap.c in. Since that's in
a separate file now, we can reuse the common implementation.
The only difference is handling of nocache flag,
that turns out to be done correctly by the
generic code since arch/powerpc/include/asm/io.h
defines ioremap_nocache same as ioremap.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All interrupts which must be non threaded are marked
IRQF_NO_THREAD. So it's safe to allow force threaded handlers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kexec is not supported on 47x. 47x is a variant of 44x with slightly
different MMU and SMP support. There was a typo in the config dependency
for kexec. This patch fixes the same.
Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linux ppc dev <linuxppc-dev@lists.ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
scsi: drop unused Kconfig symbol
pci: drop unused Kconfig symbol
stmmac: drop unused Kconfig symbol
x86: drop unused Kconfig symbol
powerpc: drop unused Kconfig symbols
powerpc: 40x: drop unused Kconfig symbol
mips: drop unused Kconfig symbols
openrisc: drop unused Kconfig symbols
arm: at91: drop unused Kconfig symbol
samples: drop unused Kconfig symbol
m32r: drop unused Kconfig symbol
score: drop unused Kconfig symbols
sh: drop unused Kconfig symbol
um: drop unused Kconfig symbol
sparc: drop unused Kconfig symbol
alpha: drop unused Kconfig symbol
Fix up trivial conflict in drivers/net/ethernet/stmicro/stmmac/Kconfig
as per Michal: the STMMAC_DUAL_MAC config variable is still unused and
should be deleted.
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (106 commits)
powerpc/p3060qds: Add support for P3060QDS board
powerpc/83xx: Add shutdown request support to MCU handling on MPC8349 MITX
powerpc/85xx: Make kexec to interate over online cpus
powerpc/fsl_booke: Fix comment in head_fsl_booke.S
powerpc/85xx: issue 15 EOI after core reset for FSL CoreNet devices
powerpc/8xxx: Fix interrupt handling in MPC8xxx GPIO driver
powerpc/85xx: Add 'fsl,pq3-gpio' compatiable for GPIO driver
powerpc/86xx: Correct Gianfar support for GE boards
powerpc/cpm: Clear muram before it is in use.
drivers/virt: add ioctl for 32-bit compat on 64-bit to fsl-hv-manager
powerpc/fsl_msi: add support for "msi-address-64" property
powerpc/85xx: Setup secondary cores PIR with hard SMP id
powerpc/fsl-booke: Fix settlbcam for 64-bit
powerpc/85xx: Adding DCSR node to dtsi device trees
powerpc/85xx: clean up FPGA device tree nodes for Freecsale QorIQ boards
powerpc/85xx: fix PHYS_64BIT selection for P1022DS
powerpc/fsl-booke: Fix setup_initial_memory_limit to not blindly map
powerpc: respect mem= setting for early memory limit setup
powerpc: Update corenet64_smp_defconfig
powerpc: Update mpc85xx/corenet 32-bit defconfigs
...
Fix up trivial conflicts in:
- arch/powerpc/configs/40x/hcu4_defconfig
removed stale file, edited elsewhere
- arch/powerpc/include/asm/udbg.h, arch/powerpc/kernel/udbg.c:
added opal and gelic drivers vs added ePAPR driver
- drivers/tty/serial/8250.c
moved UPIO_TSI to powerpc vs removed UPIO_DWAPB support
Enable hugepages on Freescale BookE processors. This allows the kernel to
use huge TLB entries to map pages, which can greatly reduce the number of
TLB misses and the amount of TLB thrashing experienced by applications with
large memory footprints. Care should be taken when using this on FSL
processors, as the number of large TLB entries supported by the core is low
(16-64) on current processors.
The supported set of hugepage sizes include 4m, 16m, 64m, 256m, and 1g.
Page sizes larger than the max zone size are called "gigantic" pages and
must be allocated on the command line (and cannot be deallocated).
This is currently only fully implemented for Freescale 32-bit BookE
processors, but there is some infrastructure in the code for
64-bit BooKE.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Allow the p1010 processor to select the flexcan network driver.
Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>,
Acked-by: Wolfgang Grandegger <wg@grandegger.com>,
Cc: U Bhaskar-B22300 <B22300@freescale.com>
Cc: socketcan-core@lists.berlios.de,
Cc: netdev@vger.kernel.org,
Cc: PPC list <linuxppc-dev@lists.ozlabs.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds kexec support for PPC440 based chipsets. This work is based
on the KEXEC patches for FSL BookE.
The FSL BookE patch and the code flow could be found at the link below:
http://patchwork.ozlabs.org/patch/49359/
Steps:
1) Invalidate all the TLB entries except the one this code is run from
2) Create a tmp mapping for our code in the other address space and jump to it
3) Invalidate the entry we used
4) Create a 1:1 mapping for 0-2GiB in blocks of 256M
5) Jump to the new 1:1 mapping and invalidate the tmp mapping
I have tested this patches on Ebony, Sequoia boards and Virtex on QEMU.
You need kexec-tools commit e8b7939b1e or newer for ppc440x support,
available at:
git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
Some trivial conflicts due to other various merges
adding to the end of common lists sooner than this one.
arch/ia64/Kconfig
arch/powerpc/Kconfig
arch/x86/Kconfig
lib/Kconfig
lib/Makefile
Signed-off-by: Len Brown <len.brown@intel.com>
cmpxchg() is widely used by lockless code, including NMI-safe lockless
code. But on some architectures, the cmpxchg() implementation is not
NMI-safe, on these architectures the lockless code may need a
spin_trylock_irqsave() based implementation.
This patch adds a Kconfig option: ARCH_HAVE_NMI_SAFE_CMPXCHG, so that
NMI-safe lockless code can depend on it or provide different
implementation according to it.
On many architectures, cmpxchg is only NMI-safe for several specific
operand sizes. So, ARCH_HAVE_NMI_SAFE_CMPXCHG define in this patch
only guarantees cmpxchg is NMI-safe for sizeof(unsigned long).
Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Richard Henderson <rth@twiddle.net>
CC: Mikael Starvik <starvik@axis.com>
Acked-by: David Howells <dhowells@redhat.com>
CC: Yoshinori Sato <ysato@users.sourceforge.jp>
CC: Tony Luck <tony.luck@intel.com>
CC: Hirokazu Takata <takata@linux-m32r.org>
CC: Geert Uytterhoeven <geert@linux-m68k.org>
CC: Michal Simek <monstr@monstr.eu>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
CC: Kyle McMartin <kyle@mcmartin.ca>
CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: Chen Liqin <liqin.chen@sunplusct.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ingo Molnar <mingo@redhat.com>
CC: Chris Zankel <chris@zankel.net>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (99 commits)
drivers/virt: add missing linux/interrupt.h to fsl_hypervisor.c
powerpc/85xx: fix mpic configuration in CAMP mode
powerpc: Copy back TIF flags on return from softirq stack
powerpc/64: Make server perfmon only built on ppc64 server devices
powerpc/pseries: Fix hvc_vio.c build due to recent changes
powerpc: Exporting boot_cpuid_phys
powerpc: Add CFAR to oops output
hvc_console: Add kdb support
powerpc/pseries: Fix hvterm_raw_get_chars to accept < 16 chars, fixing xmon
powerpc/irq: Quieten irq mapping printks
powerpc: Enable lockup and hung task detectors in pseries and ppc64 defeconfigs
powerpc: Add mpt2sas driver to pseries and ppc64 defconfig
powerpc: Disable IRQs off tracer in ppc64 defconfig
powerpc: Sync pseries and ppc64 defconfigs
powerpc/pseries/hvconsole: Fix dropped console output
hvc_console: Improve tty/console put_chars handling
powerpc/kdump: Fix timeout in crash_kexec_wait_realmode
powerpc/mm: Fix output of total_ram.
powerpc/cpufreq: Add cpufreq driver for Momentum Maple boards
powerpc: Correct annotations of pmu registration functions
...
Fix up trivial Kconfig/Makefile conflicts in arch/powerpc, drivers, and
drivers/cpufreq
An implementation of a code generator for BPF programs to speed up packet
filtering on PPC64, inspired by Eric Dumazet's x86-64 version.
Filter code is generated as an ABI-compliant function in module_alloc()'d mem
with stackframe & prologue/epilogue generated if required (simple filters don't
need anything more than an li/blr). The filter's local variables, M[], live in
registers. Supports all BPF opcodes, although "complicated" loads from negative
packet offsets (e.g. SKF_LL_OFF) are not yet supported.
There are a couple of further optimisations left for future work; many-pass
assembly with branch-reach reduction and a register allocator to push M[]
variables into volatile registers would improve the code quality further.
This currently supports big-endian 64-bit PowerPC only (but is fairly simple
to port to PPC32 or LE!).
Enabled in the same way as x86-64:
echo 1 > /proc/sys/net/core/bpf_jit_enable
Or, enabled with extra debug output:
echo 2 > /proc/sys/net/core/bpf_jit_enable
Signed-off-by: Matt Evans <matt@ozlabs.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 44x code (which is shared by 47x) assumes the available physical memory
begins at 0x00000000. This is not necessarily the case in an AMP
environment.
Support CONFIG_RELOCATABLE for 476 in order to allow the kernel to be
loaded into a higher memory range.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This patch adds support for the new "jump label" feature.
Unlike x86 and sparc we just merrily patch the code with no locks etc,
as far as I know this is safe, but I'm not really sure what the x86/sparc
code is protecting against so maybe it's not.
I also don't see any reason for us to implement the poke_early() routine,
even though sparc does.
[BenH: Updated the patch to upstream generic changes]
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
By the previous style change, CONFIG_GENERIC_FIND_NEXT_BIT,
CONFIG_GENERIC_FIND_BIT_LE, and CONFIG_GENERIC_FIND_LAST_BIT are not used
to test for existence of find bitops anymore.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch implements the raw syscall tracepoints on PowerPC and exports
them for ftrace syscalls to use.
To minimise reworking existing code, I slightly re-ordered the thread
info flags such that the new TIF_SYSCALL_TRACEPOINT bit would still fit
within the 16 bits of the andi. instruction's UI field. The instructions
in question are in /arch/powerpc/kernel/entry_{32,64}.S to and the
_TIF_SYSCALL_T_OR_A with the thread flags to see if system call tracing
is enabled.
In the case of 64bit PowerPC, arch_syscall_addr and
arch_syscall_match_sym_name are overridden to allow ftrace syscalls to
work given the unusual system call table structure and symbol names that
start with a period.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In case other architectures require RCU freed page-tables to implement
gup_fast() and software filled hashes and similar things, provide the
means to do so by moving the logic into generic code.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Requested-by: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a first cut at making bootwrapper code which will
produce a zImage compliant with the requirements set down
by ePAPR.
This is a very simple bootwrapper, taking the device tree
blob supplied by the ePAPR boot program and passing it on
to the kernel. It builds on the earlier patch to build a
relocatable ET_DYN zImage to meet the other ePAPR image
requirements.
For good measure we have some paranoid checks which will
generate warnings if some of the ePAPR entry condition
guarantees are not met.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We currently dont have CPU Hotplug support working on 85xx so we need to
disable Suspsend support as it will force enabling of CPU Hotplug.
arch/powerpc/kernel/built-in.o: In function `cpu_die': arch/powerpc/kernel/smp.c:702: undefined reference to `start_secondary_resume'
make: *** [.tmp_vmlinux1] Error 1
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
1. Add an option to include RapidIO support if the PCI is available.
2. Add FSL_RIO configuration option to enable controller selection.
3. Add RapidIO support option into x86 and MIPS architectures.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This introduces CONFIG_GENERIC_FIND_BIT_LE to tell whether to use generic
implementation of find_*_bit_le() in lib/find_next_bit.c or not.
For now we select CONFIG_GENERIC_FIND_BIT_LE for all architectures which
enable CONFIG_GENERIC_FIND_NEXT_BIT.
But m68knommu wants to define own faster find_next_zero_bit_le() and
continues using generic find_next_{,zero_}bit().
(CONFIG_GENERIC_FIND_NEXT_BIT and !CONFIG_GENERIC_FIND_BIT_LE)
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All architectures are finally converted. Remove the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Michal Simek <monstr@monstr.eu>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jeff Dike <jdike@addtoit.com>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (72 commits)
powerpc/pseries: Fix build of topology stuff without CONFIG_NUMA
powerpc/pseries: Fix VPHN build errors on non-SMP systems
powerpc/83xx: add mpc8308_p1m DMA controller device-tree node
powerpc/83xx: add DMA controller to mpc8308 device-tree node
powerpc/512x: try to free dma descriptors in case of allocation failure
powerpc/512x: add MPC8308 dma support
powerpc/512x: fix the hanged dma transfer issue
powerpc/512x: scatter/gather dma fix
powerpc/powermac: Make auto-loading of therm_pm72 possible
of/address: Use propper endianess in get_flags
powerpc/pci: Use printf extension %pR for struct resource
powerpc: Remove unnecessary casts of void ptr
powerpc: Disable VPHN polling during a suspend operation
powerpc/pseries: Poll VPA for topology changes and update NUMA maps
powerpc: iommu: Add device name to iommu error printks
powerpc: Record vma->phys_addr in ioremap()
powerpc: Update compat_arch_ptrace
powerpc: Fix PPC_PTRACE_SETHWDEBUG on PPC_BOOK3S
powerpc/time: printk time stamp init not correct
powerpc: Minor cleanups for machdep.h
...
The device tree code is now in two pieces: some which can be used generically
on any platform which selects CONFIG_OF_FLATTREE, and some early which is used
at boot time on only a few architectures. This patch segregates the early
code so that only those architectures which care about it need compile it.
This also means that some of the requirements in the early code (such as
a cmd_line variable) that most architectures (e.g. X86) don't provide
can be ignored.
Signed-off-by: Stephen Neuendorffer <stephen.neuendorffer@xilinx.com>
[grant.likely@secretlab.ca: remove extra blank line addition]
[grant.likely@secretlab.ca: fixed incorrect #ifdef CONFIG_EARLY_FLATTREE check]
[grant.likely@secretlab.ca: Made OF_EARLY_FLATTREE select instead of depend
on OF_FLATTREE]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Add suspend/resume support for 4xx compatible CPUs.
See /sys/power/state for available power states configured in.
Add two different idle states (idle-wait and idle-doze) controlled via sysfs.
Default is idle-wait.
cat /sys/devices/system/cpu/cpu0/idle
[wait] doze
To save additional power, use idle-doze.
echo doze > /sys/devices/system/cpu/cpu0/idle
cat /sys/devices/system/cpu/cpu0/idle
wait [doze]
Signed-off-by: Victor Gallardo <vgallardo@apm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
commit ffe8018c34 of the -mm tree
fixes the initramfs size calculation for e.g. s390 but breaks it
for 32bit architectures which do not define CONFIG_32BIT.
This patch fix the problem for PPC32 which will elsewise end up
with a __initramfs_size of 0.
Signed-off-by: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* git://git.infradead.org/mtd-2.6: (82 commits)
mtd: fix build error in m25p80.c
mtd: Remove redundant mutex from mtd_blkdevs.c
MTD: Fix wrong check register_blkdev return value
Revert "mtd: cleanup Kconfig dependencies"
mtd: cfi_cmdset_0002: make sector erase command variable
mtd: cfi_cmdset_0002: add CFI detection for SST 38VF640x chips
mtd: cfi_util: add support for switching SST 39VF640xB chips into QRY mode
mtd: cfi_cmdset_0001: use defined value of P_ID_INTEL_PERFORMANCE instead of hardcoded one
block2mtd: dubious assignment
P4080/mtd: Fix the freescale lbc issue with 36bit mode
P4080/eLBC: Make Freescale elbc interrupt common to elbc devices
mtd: phram: use KBUILD_MODNAME
mtd: OneNAND: S5PC110: Fix double call suspend & resume function
mtd: nand: fix MTD_MODE_RAW writes
jffs2: use kmemdup
mtd: sm_ftl: cosmetic, use bool when possible
mtd: r852: remove useless pci powerup/down from suspend/resume routines
mtd: blktrans: fix a race vs kthread_stop
mtd: blktrans: kill BKL
mtd: allow to unload the mtdtrans module if its block devices aren't open
...
Fix up trivial whitespace-introduced conflict in drivers/mtd/mtdchar.c
Conflicts:
drivers/mtd/mtd_blkdevs.c
Merge Grant's device-tree bits so that we can apply the subsequent fixes.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6: (38 commits)
kbuild: convert `arch/tile' to the kconfig mainmenu upgrade
README: cite nconfig
Revert "kconfig: Temporarily disable dependency warnings"
kconfig: Use PATH_MAX instead of 128 for path buffer sizes.
kconfig: Fix realloc usage()
kconfig: Propagate const
kconfig: Don't go out from read config loop when you read new symbol
kconfig: fix menuconfig on debian lenny
kbuild: migrate all arch to the kconfig mainmenu upgrade
kconfig: expand file names
kconfig: use the file's name of sourced file
kconfig: constify file name
kconfig: don't emit warning upon rootmenu's prompt redefinition
kconfig: replace KERNELVERSION usage by the mainmenu's prompt
kconfig: delay gconf window initialization
kconfig: expand by default the rootmenu's prompt
kconfig: add a symbol string expansion helper
kconfig: regen parser
kconfig: implement the `mainmenu' directive
kconfig: allow PACKAGE to be defined on the compiler's command-line
...
Fix up trivial conflict in arch/mn10300/Kconfig
Move Freescale elbc interrupt from nand driver to elbc driver.
Then all elbc devices can use the interrupt instead of ONLY nand.
For former nand driver, it had the two functions:
1. detecting nand flash partitions;
2. registering elbc interrupt.
Now, second function is removed to fsl_lbc.c.
Signed-off-by: Lan Chunhe-B25806 <b25806@freescale.com>
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Wood Scott-B07421 <B07421@freescale.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like wakeup a task to drain buffers.
Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.
The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.
Architectures that don't have anything like this get to do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'timers-timekeeping-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
um: Fix read_persistent_clock fallout
kgdb: Do not access xtime directly
powerpc: Clean up obsolete code relating to decrementer and timebase
powerpc: Rework VDSO gettimeofday to prevent time going backwards
clocksource: Add __clocksource_updatefreq_hz/khz methods
x86: Convert common clocksources to use clocksource_register_hz/khz
timekeeping: Make xtime and wall_to_monotonic static
hrtimer: Cleanup direct access to wall_to_monotonic
um: Convert to use read_persistent_clock
timkeeping: Fix update_vsyscall to provide wall_to_monotonic offset
powerpc: Cleanup xtime usage
powerpc: Simplify update_vsyscall
time: Kill off CONFIG_GENERIC_TIME
time: Implement timespec_add
x86: Fix vtime/file timestamp inconsistencies
Trivial conflicts in Documentation/feature-removal-schedule.txt
Much less trivial conflicts in arch/powerpc/kernel/time.c resolved as
per Thomas' earlier merge commit 47916be4e2 ("Merge branch
'powerpc.cherry-picks' into timers/clocksource")
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (63 commits)
of/platform: Register of_platform_drivers with an "of:" prefix
of/address: Clean up function declarations
of/spi: call of_register_spi_devices() from spi core code
of: Provide default of_node_to_nid() implementation.
of/device: Make of_device_make_bus_id() usable by other code.
of/irq: Fix endian issues in parsing interrupt specifiers
of: Fix phandle endian issues
of/flattree: fix of_flat_dt_is_compatible() to match the full compatible string
of: remove of_default_bus_ids
of: make of_find_device_by_node generic
microblaze: remove references to of_device and to_of_device
sparc: remove references to of_device and to_of_device
powerpc: remove references to of_device and to_of_device
of/device: Replace of_device with platform_device in includes and core code
of/device: Protect against binding of_platform_drivers to non-OF devices
of: remove asm/of_device.h
of: remove asm/of_platform.h
of/platform: remove all of_bus_type and of_platform_bus_type references
of: Merge of_platform_bus_type with platform_bus_type
drivercore/of: Add OF style matching to platform bus
...
Fix up trivial conflicts in arch/microblaze/kernel/Makefile due to just
some obj-y removals by the devicetree branch, while the microblaze
updates added a new file.
Adds support for kexec on 85xx machines for the BookE platform.
Including support for SMP machines
Based off work from Maxim Uvarov <muvarov@mvista.com>
Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
On PowerPC we should always use generic ISA DMA API implementation
as there is simply no other implementation exist.
Without this patch, the following build error pops up:
sound/built-in.o: In function 'snd_dma_pointer':
(.text+0x74ae): undefined reference to 'dma_spin_lock'
...
make: *** [.tmp_vmlinux1] Error 1
This is PPC_85xx, SMP and some sound drivers set to =y.
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Acked-by: Dave Liu <daveliu@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Now that all arches have been converted over to use generic time via
clocksources or arch_gettimeoffset(), we can remove the GENERIC_TIME
config option and simplify the generic code.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <1279068988-21864-4-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
via following scripts
FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
sed -i \
-e 's/lmb/memblock/g' \
-e 's/LMB/MEMBLOCK/g' \
$FILES
for N in $(find . -name lmb.[ch]); do
M=$(echo $N | sed 's/lmb/memblock/g')
mv $N $M
done
and remove some wrong change like lmbench and dlmb etc.
also move memblock.c from lib/ to mm/
Suggested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Enables support for HMC initiated partition hibernation. This is
a firmware assisted hibernation, since the firmware handles writing
the memory out to disk, along with other partition information,
so we just mimic suspend to ram.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The SPARSE_IRQ considerably adds overhead to critical path of IRQ
handling. However it doesn't benefit much in space for most systems with
limited IRQ_NR. Should be disabled unless really necessary.
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Microblaze and PPC both use PROC_DEVICETREE, and OLPC will as well.. put
the Kconfig option into fs/ rather than in arch/*/Kconfig.
Signed-off-by: Andres Salomon <dilinger@queued.net>
[grant.likely@secretlab.ca: changed depends to PROC_FS && !SPARC]
[grant.likely@secretlab.ca: moved to drivers/of/Kconfig]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
All of the options in drivers/of/Kconfig depend on CONFIG_OF. Putting
all of them inside a menu block simplifies the dependency statements.
It also creates a logical group for adding user selectable OF options.
This patch also changes (PPC_OF || MICROBLAZE) statements to (!SPARC)
so that those options are available to other architectures (and in
fact the !SPARC conditions should probably be re-evalutated since the
code is more generic now)
This patch also moves the definition of CONFIG_DTC from arch/* to
drivers/of/Kconfig
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
now that CONFIG_OF is defined globally
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
so that we can make CONFIG_OF global and remove it from
the architecture Kconfig files later.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Implement perf-events based hw-breakpoint interfaces for PowerPC
64-bit server (Book III S) processors. This allows access to a
given location to be used as an event that can be counted or
profiled by the perf_events subsystem.
This is done using the DABR (data breakpoint register), which can
also be used for process debugging via ptrace. When perf_event
hw_breakpoint support is configured in, the perf_event subsystem
manages the DABR and arbitrates access to it, and ptrace then
creates a perf_event when it is requested to set a data breakpoint.
[Adopted suggestions from Paul Mackerras <paulus@samba.org> to
- emulate_step() all system-wide breakpoints and single-step only the
per-task breakpoints
- perform arch-specific cleanup before unregistration through
arch_unregister_hw_breakpoint()
]
Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds support kexec on FSL-BookE where the MMU can not be simply
switched off. The code borrows the initial MMU-setup code to create the
identical mapping mapping. The only difference to the original boot code
is the size of the mapping(s) and the executeable address.
The kexec code maps the first 2 GiB of memory in 256 MiB steps. This
should work also on e500v1 boxes.
SMP support is still not available.
(Kumar: Added minor change to build to ifdef CONFIG_PPC_STD_MMU_64 some
code that was PPC64 specific)
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This patch ports the kprobe-based event tracer to powerpc. This patch
is based on x86 port. This brings powerpc on par with x86.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The description says:
Cause IO segments sent to a device for DMA to be merged virtually
by the IOMMU when they happen to have been allocated contiguously.
This doesn't add pressure to the IOMMU allocator. However, some
drivers don't support getting large merged segments coming back
from *_map_sg().
Most drivers don't have this problem; it is safe to say Y here.
It's out of date. Long ago, drivers didn't have a way to tell IOMMUs
about their segment length limit (that is, the maximum segment length
that they can handle). So IOMMUs merged as many segments as possible
and gave too large segments to drivers.
dma_get_max_seg_size() was introduced to solve the above
problem. Device drives can use the API to tell IOMMU about the maximum
segment length that they can handle. In addition, the default limit
(64K) should be safe for everyone.
So this config option seems to be unnecessary.
Note that this config option just enables users to disable the virtual
merging by default. Users can still disable the virtual merging by the
boot parameter.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/booke: Introduce new CONFIG options for advanced debug registers
From: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Introduce new config options to simplify the ifdefs pertaining to the
advanced debug registers for booke and 40x processors:
CONFIG_PPC_ADV_DEBUG_REGS - boolean: true for dac-based processors
CONFIG_PPC_ADV_DEBUG_IACS - number of IAC registers
CONFIG_PPC_ADV_DEBUG_DACS - number of DAC registers
CONFIG_PPC_ADV_DEBUG_DVCS - number of DVC registers
CONFIG_PPC_ADV_DEBUG_DAC_RANGE - DAC ranges supported
Beginning conservatively, since I only have the facilities to test 440
hardware. I believe all 40x and booke platforms support at least 2 IAC
and 2 DAC registers. For 440, 4 IAC and 2 DVC registers are enabled, as
well as the DAC ranges.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With dynamic irq descriptors the overhead of a large NR_IRQS is much lower
than it used to be. With more MSI-X capable adapters and drivers exploiting
multiple vectors we may as well allow the user to increase it beyond the
current maximum of 512.
32768 seems large enough that we'd never have to bump it again (although I bet
my prediction is horribly wrong). It boot tests OK and the vmlinux footprint
increase is only around 500kB due to:
struct irq_map_entry irq_map[NR_IRQS];
We format /proc/interrupts correctly with the previous changes:
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
286: 0 0 0 0 0 0
516: 0 0 0 0 0 0
16689: 1833 0 0 0 0 0
17157: 0 0 0 0 0 0
17158: 319 0 0 0 0 0
25092: 0 0 0 0 0 0
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'next' of git://git.secretlab.ca/git/linux-2.6: (23 commits)
powerpc: fix up for mmu_mapin_ram api change
powerpc: wii: allow ioremap within the memory hole
powerpc: allow ioremap within reserved memory regions
wii: use both mem1 and mem2 as ram
wii: bootwrapper: add fixup to calc useable mem2
powerpc: gamecube/wii: early debugging using usbgecko
powerpc: reserve fixmap entries for early debug
powerpc: wii: default config
powerpc: wii: platform support
powerpc: wii: hollywood interrupt controller support
powerpc: broadway processor support
powerpc: wii: bootwrapper bits
powerpc: wii: device tree
powerpc: gamecube: default config
powerpc: gamecube: platform support
powerpc: gamecube/wii: flipper interrupt controller support
powerpc: gamecube/wii: udbg support for usbgecko
powerpc: gamecube/wii: do not include PCI support
powerpc: gamecube/wii: declare as non-coherent platforms
powerpc: gamecube/wii: introduce GAMECUBE_COMMON
...
Fix up conflicts in arch/powerpc/mm/fsl_booke_mmu.c.
Hopefully even close to correctly.
The Nintendo GameCube and Wii video game consoles do not have PCI hardware.
Avoid wasting their scarce memory by not including PCI support into the
kernel.
Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Version 3 of this patch is updated with documentation added to
Documentation/ABI. There are no changes to any of the C code from v2
of the patch.
In order to support kernel DLPAR of CPU resources we need to provide an
interface to add (probe) and remove (release) the resource from the system.
This patch Creates new generic probe and release sysfs files to facilitate
cpu probe/release. The probe/release interface provides for allowing each
arch to supply their own routines for implementing the backend of adding
and removing cpus to/from the system.
This also creates the powerpc specific stubs to handle the arch callouts
from writes to the sysfs files.
The creation and use of these files is regulated by the
CONFIG_ARCH_CPU_PROBE_RELEASE option so that only architectures that need the
capability will have the files created.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Merge common code between Microblaze and PowerPC.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
This patch adds suspend/resume support for MPC8540 and MPC8641D-
compatible CPUs. To reach sleep state, we just write the SLP bit
into the PM control and status register.
So far we don't support Deep Sleep mode as found in newer MPC85xx
CPUs (i.e. MPC8536). It can be relatively easy implemented though,
and for it we reserve 'mem' suspend type.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Defining CONFIG_SPARSE_IRQ enables generic code that gets rid of the
static irq_desc array, and replaces it with an array of pointers to
irq_descs.
It also allows node local allocation of irq_descs, however we
currently don't have the information available to do that, so we just
allocate them on all on node 0.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The irq_desc array consumes quite a lot of space, and for systems
that don't need or can't have 512 irqs it's just wasted space.
The first 16 are reserved for ISA, so the minimum of 32 is really
16 - and no one has asked for more than 512 so leave that as the
maximum.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Just as with kexec, hibernation may fail even on well-tested platforms:
some PCI device, a driver of which doesn't play well with hibernation,
is enough to break resuming.
Hibernation code is not much platform dependent, and hiding features only
because these were not verified on a particular hardware is
counterproductive: we just prevent the features from being widely tested.
For example, with this patch I just tested hibernation on a MPC83xx
board, and it works quite well, modulo a few drivers that need some
fixing.
So, let's make it possible to select hibernation support for all
PowerPCs, then let's wait for any possible bug reports, and actually fix
(or just collect ;-) the bugs instead of hiding them. If some platforms
really can't stand hibernation, we can make a blacklist, with proper
comments why exactly hibernation doesn't work, whether it is possible to
fix, and what needs to be done to fix it.
CONFIG_HIBERNATION is still =n by default, so the commit doesn't change
anything apart from ability to set it to =y.
I'm not sure if EXPERIMENTAL dependency is needed, I'd rather not add it
for a few reasons:
1) It doesn't matter much, for distro kernels user has no clue that some
feature is experimental. Majority of defconfigs enable EXPERIMENTAL
anyway (90 vs. 4, which, btw, means that EXPERIMENTAL is overused
in Kconfigs);
2) EXPERIMENTAL is a good thing for features that change default
behaviour of a kernel, while for hibernation user has to explicitly
issue 'echo disk > /sys/power/state' to trigger any hibernation bugs;
3) Per init/Kconfig, EXPERIMENTAL is a good thing to scare and discourage
users from 'widespread use of a feature', while we want to encourage
that use.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Some System p configurations can already have more than 16 nodes so we
need to increase NODES_SHIFT. I chose 256 to give us some room to grow in the
future, although we can look at something smaller if the memory bloat is
considered too much.
Unless we clamp MAX_ACTIVE_REGIONS we end up with 300kB of extra bloat in
early_node_map in mm/page_alloc.c:
< 6144 early_node_map
> 307200 early_node_map
due to:
#if MAX_NUMNODES >= 32
/* If there can be many nodes, allow up to 50 holes per node */
#define MAX_ACTIVE_REGIONS (MAX_NUMNODES*50)
#else
/* By default, allow up to 256 distinct regions */
#define MAX_ACTIVE_REGIONS 256
Since our memory is mostly contiguous it seems reasonable to keep this
at 256 for now. I also set 32bit to 32 to save space (is there any chance
a 32bit system will have more than 32 discontiguous memory ranges?).
Even with that fixed we have a few data structures that grow:
< 896 bootmem_node_data
> 14336 bootmem_node_data
< 1280 node_devices
> 20480 node_devices
< 25088 kmalloc_caches
> 59648 kmalloc_caches
< 1632 hstates
> 21792 hstates
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Bye-bye Performance Counters, welcome Performance Events!
In the past few months the perfcounters subsystem has grown out its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.
Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.
All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)
The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.
Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.
User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)
This patch has been generated via the following script:
FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
sed -i \
-e 's/PERF_EVENT_/PERF_RECORD_/g' \
-e 's/PERF_COUNTER/PERF_EVENT/g' \
-e 's/perf_counter/perf_event/g' \
-e 's/nb_counters/nb_events/g' \
-e 's/swcounter/swevent/g' \
-e 's/tpcounter_event/tp_event/g' \
$FILES
for N in $(find . -name perf_counter.[ch]); do
M=$(echo $N | sed 's/perf_counter/perf_event/g')
mv $N $M
done
FILES=$(find . -name perf_event.*)
sed -i \
-e 's/COUNTER_MASK/REG_MASK/g' \
-e 's/COUNTER/EVENT/g' \
-e 's/\<event\>/event_id/g' \
-e 's/counter/event/g' \
-e 's/Counter/Event/g' \
$FILES
... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.
Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.
( NOTE: 'counters' are still the proper terminology when we deal
with hardware registers - and these sed scripts are a bit
over-eager in renaming them. I've undone some of that, but
in case there's something left where 'counter' would be
better than 'event' we can undo that on an individual basis
instead of touching an otherwise nicely automated patch. )
Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (134 commits)
powerpc/nvram: Enable use Generic NVRAM driver for different size chips
powerpc/iseries: Fix oops reading from /proc/iSeries/mf/*/cmdline
powerpc/ps3: Workaround for flash memory I/O error
powerpc/booke: Don't set DABR on 64-bit BookE, use DAC1 instead
powerpc/perf_counters: Reduce stack usage of power_check_constraints
powerpc: Fix bug where perf_counters breaks oprofile
powerpc/85xx: Fix SMP compile error and allow NULL for smp_ops
powerpc/irq: Improve nanodoc
powerpc: Fix some late PowerMac G5 with PCIe ATI graphics
powerpc/fsl-booke: Use HW PTE format if CONFIG_PTE_64BIT
powerpc/book3e: Add missing page sizes
powerpc/pseries: Fix to handle slb resize across migration
powerpc/powermac: Thermal control turns system off too eagerly
powerpc/pci: Merge ppc32 and ppc64 versions of phb_scan()
powerpc/405ex: support cuImage via included dtb
powerpc/405ex: provide necessary fixup function to support cuImage
powerpc/40x: Add support for the ESTeem 195E (PPC405EP) SBC
powerpc/44x: Add Eiger AMCC (AppliedMicro) PPC460SX evaluation board support.
powerpc/44x: Update Arches defconfig
powerpc/44x: Update Arches dts
...
Fix up conflicts in drivers/char/agp/uninorth-agp.c
This contains all the bits that didn't fit in previous patches :-) This
includes the actual exception handlers assembly, the changes to the
kernel entry, other misc bits and wiring it all up in Kconfig.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The current definitions set ranges and defaults for 32 and 64-bit
only using "PPC_STD_MMU" which means hash based MMU. This uselessly
restrict the usefulness for the upcoming 64-bit BookE port, but more
than that, it's broken on 32-bit since the only 32-bit platform
supporting multiple page sizes currently is 44x which does -not-
have PPC_STD_MMU_32 set.
This fixes it by using PPC64 and PPC32 instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Now that percpu allows arbitrary embedding of the first chunk,
powerpc64 can easily be converted to dynamic percpu allocator.
Convert it. powerpc supports several large page sizes. Cap atom_size
at 1M. There isn't much to gain by going above that anyway.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix
changes. As alpha in percpu tree uses 'weak' attribute instead of
inline assembly, there's no need for __used attribute.
Conflicts:
arch/alpha/include/asm/percpu.h
arch/mn10300/kernel/vmlinux.lds.S
include/linux/percpu-defs.h
Based on initial work from: Dale Farnsworth <dale@farnsworth.org>
Add the low level irq tracing hooks for 32-bit powerpc needed
to enable full lockdep functionality.
The approach taken to deal with the code in entry_32.S is that
we don't trace all the transitions of MSR:EE when we just turn
it off to peek at TI_FLAGS without races. Only when we are
calling into C code or returning from exceptions with a state
that have changed from what lockdep thinks.
There's a little bugger though: If we take an exception that
keeps interrupts enabled (such as an alignment exception) while
interrupts are enabled, we will call trace_hardirqs_on() on the
way back spurriously. Not a big deal, but to get rid of it would
require remembering in pt_regs that the exception was one of the
type that kept interrupts enabled which we don't know at this
stage. (Well, we could test all cases for regs->trap but that
sucks too much).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Kumar Gala <galak@kernel.crashing.org>
This patch makes most !CONFIG_HAVE_SETUP_PER_CPU_AREA archs use
dynamic percpu allocator. The first chunk is allocated using
embedding helper and 8k is reserved for modules. This ensures that
the new allocator behaves almost identically to the original allocator
as long as static percpu variables are concerned, so it shouldn't
introduce much breakage.
s390 and alpha use custom SHIFT_PERCPU_PTR() to work around addressing
range limit the addressing model imposes. Unfortunately, this breaks
if the address is specified using a variable, so for now, the two
archs aren't converted.
The following architectures are affected by this change.
* sh
* arm
* cris
* mips
* sparc(32)
* blackfin
* avr32
* parisc (broken, under investigation)
* m32r
* powerpc(32)
As this change makes the dynamic allocator the default one,
CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is replaced with its invert -
CONFIG_HAVE_LEGACY_PER_CPU_AREA, which is added to yet-to-be converted
archs. These archs implement their own setup_per_cpu_areas() and the
conversion is not trivial.
* powerpc(64)
* sparc(64)
* ia64
* alpha
* s390
Boot and batch alloc/free tests on x86_32 with debug code (x86_32
doesn't use default first chunk initialization). Compile tested on
sparc(32), powerpc(32), arm and alpha.
Kyle McMartin reported that this change breaks parisc. The problem is
still under investigation and he is okay with pushing this patch
forward and fixing parisc later.
[ Impact: use dynamic allocator for most archs w/o custom percpu setup ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
This enables the perf_counter subsystem on 32-bit powerpc. Since we
don't have any support for hardware counters on 32-bit powerpc yet,
only software counters can be used.
Besides selecting HAVE_PERF_COUNTERS for 32-bit powerpc as well as
64-bit, the main thing this does is add an implementation of
set_perf_counter_pending(). This needs to arrange for
perf_counter_do_pending() to be called when interrupts are enabled.
Rather than add code to local_irq_restore as 64-bit does, the 32-bit
set_perf_counter_pending() generates an interrupt by setting the
decrementer to 1 so that a decrementer interrupt will become pending
in 1 or 2 timebase ticks (if a decrementer interrupt isn't already
pending). When interrupts are enabled, timer_interrupt() will be
called, and some new code in there calls perf_counter_do_pending().
We use a per-cpu array of flags to indicate whether we need to call
perf_counter_do_pending() or not.
This introduces a couple of new Kconfig symbols: PPC_HAVE_PMU_SUPPORT,
which is selected by processor families for which we have hardware PMU
support (currently only PPC64), and PPC_PERF_CTRS, which enables the
powerpc-specific perf_counter back-end.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linuxppc-dev@ozlabs.org
Cc: benh@kernel.crashing.org
LKML-Reference: <19000.55404.103840.393470@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This makes 32-bit powerpc use the generic atomic64_t implementation.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently we are wasting time calling the generic calibrate_delay()
function. We don't need it since our implementation of __delay() is
based on the CPU timebase. So instead, we use our own small
implementation that initializes loops_per_jiffy to something sensible
to make the few users like spinlock debug be happy
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch includes the basic infrastructure to use swiotlb
bounce buffering on 32-bit powerpc. It is not yet enabled on
any platforms. Probably the most interesting bit is the
addition of addr_needs_map to dma_ops - we need this as
a dma_op because the decision of whether or not an addr
can be mapped by a device is device-specific.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The implementation we just revived has issues, such as using a
Kconfig-defined virtual address area in kernel space that nothing
actually carves out (and thus will overlap whatever is there),
or having some dependencies on being self contained in a single
PTE page which adds unnecessary constraints on the kernel virtual
address space.
This fixes it by using more classic PTE accessors and automatically
locating the area for consistent memory, carving an appropriate hole
in the kernel virtual address space, leaving only the size of that
area as a Kconfig option. It also brings some dma-mask related fixes
from the ARM implementation which was almost identical initially but
grew its own fixes.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This reverts commit 33f00dcedb.
While it was a good idea to try to use the mm/vmalloc.c allocator instead
of our own (in fact, ours is itself a dup on an old variant of the vmalloc
one), unfortunately, the approach is terminally busted since
dma_alloc_coherent() can be called at interrupt time or in atomic contexts
and there's little chances we'll make the code in mm/vmalloc.c cope with\ that :-(
Until we can get the generic code to forbid that idiocy and fix all
drivers abusing it, we pretty much have no choice but revert to
our custom virtual space allocator.
There's also a problem with SMP safety since freeing such mapping
would require an IPI which cannot be done at interrupt time.
However, right now, I don't think we support any platform that is
both SMP and has non-coherent DMA (don't laugh, I know such things
do exist !) so we can sort that out later.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
So select GENERIC_HARDIRQS_NO__DO_IRQ to disable it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>