Use the generic code, just add the MPIC priority setting,
I don't see any use in mucking around with the decrementer,
as 32-bit will have EE off all along, and 64-bit will be able
to deal with it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use generic cpu_state, call idle_task_exit() properly, and
remove smp_core99_cpu_die() which isn't useful, the generic
function does the job just fine.
Remove the last remnants of cpu_enable(), everybody uses the normal
__cpu_up() path now
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is used by some "soft" hotplug implementations. I needs to
call idle_task_exit() when the CPU is going away, and we remove
the now no-longer needed set_cpu_online() and local_irq_enable()
which are handled by the return to start_secondary
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Various thing are torn down when a CPU is hot-unplugged. That CPU
is expected to go back to start_secondary when re-plugged to re
initialize everything, such as clock sources, maps, ...
Some implementations just return from cpu_die() callback
in the idle loop when the CPU is "re-plugged". This is not enough.
We fix it using a little asm trampoline which resets the stack
and calls back into start_secondary as if we were all fresh from
boot. The trampoline already existed on ppc64, but we add it for
ppc32
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With some implementations, it is possible that a timer interrupt
occurs every few seconds on an offline CPU. In this case, just
re-arm the decrementer and return immediately
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
avr32: Fix missing irq namespace conversion
powerpc: qe_ic: Rename get_irq_desc_data and get_irq_desc_chip
genirq: Remove the now obsolete config options and select statements
arm: versatile : Fix typo introduced in irq namespace cleanup
sound: Fixup the last user of the old irq functions
genirq: Remove obsolete comment
genirq: Remove now obsolete set_irq_wake()
sh: Fix irq cleanup fallout
x86: apb_timer: Fixup genirq fallout
genirq: Fix misnamed label in handle_edge_eoi_irq
Fix up crazy conflict in arch/powerpc/include/asm/qe_ic.h:
- commit eead4d5c63 ("powerpc: qe_ic: Rename get_irq_desc_data and
get_irq_desc_chip") made the helper functions use
irq_desc_get_handler_data() instead of the legacy (and no longer
existing) get_irq_desc_data.
- commit d4db35e8dc ("powerpc/qe_ic: Fix another breakage from the
irq_data conversion") used irq_desc_get_chip_data() instead.
According to Thomas, the former is the correct direct conversion, but it
does look like both should work (arch/powerpc/sysdev/qe_lib/qe_ic.c
seems to initialize both to the same thing), and the chip data in some
ways is the more logical. Somebody should really decide on one of the
other.
This merge picks irq_desc_get_handler_data() as the straightforward pure
conversion to new names, as per Thomas.
These two functions disappeared in commit
0c6f8a8b91
"genirq: Remove compat code"
but they still exist in qe_ic.h.
This patch renames the function to their new names.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Lennert Buytenhek <buytenh@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LKML-Reference: <20110330132504.GA31832@riccoc20.at.omicron.at>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ensure the Chelsio T3/T4 network drivers and iWARP drivers are
enabled in the pseries config.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Recent upstream builds with allmodconfig fail due to lack of space
between 0x3000 and 0x6000. We have a hard block at 0x7000 but we can
spare a page by moving the STAB0 from 0x6000 to 0x8000.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
commit cf9efce0ce (powerpc: Account time using timebase rather
than PURR) used in_irq() to detect if the time was spent in
interrupt processing. This only catches hardirq context so if we
are in softirq context and in the idle loop we end up accounting it
as idle time. If we instead use in_interrupt() we catch both softirq
and hardirq time.
The issue was found when running a network intensive workload. top
showed the following:
0.0%us, 1.1%sy, 0.0%ni, 85.7%id, 0.0%wa, 9.9%hi, 3.3%si, 0.0%st
85.7% idle. But this was wildly different to the perf events data.
To confirm the suspicion I ran something to keep the core busy:
# yes > /dev/null &
8.2%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 10.3%hi, 81.4%si, 0.0%st
We only got 8.2% of the CPU for the userspace task and softirq has
shot up to 81.4%.
With the patch below top shows the correct stats:
0.0%us, 0.0%sy, 0.0%ni, 5.3%id, 0.0%wa, 13.3%hi, 81.3%si, 0.0%st
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If a given firmware doesn't have a token to support query-cpu-stopped-state,
its not likely to change during the lifetime of the kernel.
Only print this information once, not once per secondary thread.
While here, make the line wrap grep friendly.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To try to avoid future confusion, rename irq to hwirq when it refers
to a xics domain number instead of a linux irq number.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
commit 79f26c268e (powerpc:
platforms/pseries irq_data conversion) pushed irq_desc down into many
functions, dererencing the descriptor irq field as late as possible.
But it incorrectly passed a linix virtural irq number to RTAS,
resulting in the interrupt not being disabled and possibly
other bad things, such as another interrupt being disabled and/or
a checkstop.
In addition this missed the point of xics_mask_unknown_vec and
the seperation of xics_mask_real_irq from xics_mask_irq. When
xics_mask_unknown_vec is called it's because the hardware delivered an
irq source for which we have no linux irq allocated, and thefore we can
not have an irq_desc allocated.
Revert xics_mask_real_irq to its prior version, naming the argument
hwirq to highlight the difference.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These syscalls have been added recently:
name_to_handle_at
open_by_handle_at
clock_adjtime
syncfs
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
339 is the SPR number for MAS5 documented by Power ISA 2.06, and
implemented by e500mc. It is not yet used anywhere in the kernel,
so nothing should be relying on the wrong number.
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
pfns are unsigned long, but MEMORY_START is phys_addr_t. This leads
to page_to_pfn() returning phys_addr_t, and thus type mismatches in a few
print statements.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is used by Alsa to mmap buffers allocated with dma_alloc_coherent()
into userspace. We need a special variant to handle machines with
non-coherent DMAs as those buffers have "special" virt addresses and
require non-cachable mappings
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For normal halt, reboot, and poweroff events, refrain from overwriting
the lnx,oops-log partition. Also, don't save the dmesg buffer on an
emergency-restart event if we've already saved it earlier in panic().
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Uwe Kleine-König reported:
while working on an defconfig (arm/mx27) I noticed that just updating
it[1] results in removing CONFIG_EEPROM_AT24=y. The reason is that
since commit
v2.6.36-5965-g5f2365d (misc devices: do not enable by default)
MISC_DEVICES isn't enabled anymore by default. So all defconfigs that
have CONFIG_SOME_SYMBOL=y (or =m) (with SOME_SYMBOL depending on
MISC_DEVICES) but not CONFIG_MISC_DEVICES=y suffer from the same
problem.
This restores those misc devices to the powerpc defconfigs.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: Uwe Kleine-König
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The core irq_set_type() function updates the flow type when the chip
callback returns 0. So setting the type is bogus. The core also
updates the LEVEL flag.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The core irq_set_type() function updates the flow type when the chip
callback returns 0. So setting the type is bogus. The core also
updates IRQ_LEVEL.
Use irq_data to get the level type information in the chip functions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The core irq_set_type() function updates the flow type when the chip
callback returns 0. So setting the type is bogus.
The new core code allows to update the type in irq_data and return
IRQ_SET_MASK_OK_NOCOPY, so the core code will not touch it, except for
setting the IRQ_LEVEL flag.
Retrieve the IRQ_LEVEL information from irq_data which avoids a
redundant sparse irq lookup as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The core irq_set_type() function updates the flow type when the chip
callback returns 0. So setting the type is bogus. The level flag is
updated in the core as well.
Use the proper accessors for setting the irq handlers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The core irq_set_type() function updates the flow type when the chip
callback returns 0. So setting the type is bogus.
The new core code allows to update the type in irq_data and return
IRQ_SET_MASK_OK_NOCOPY, so the core code will not touch it, except for
setting the IRQ_LEVEL flag.
Use the proper accessors for setting the irq handlers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The core code provides the same functionality when the
IRQCHIP_EOI_IF_HANDLED flag is set for the irq chip.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The core irq_set_type() function updates the flow type when the chip
callback returns 0. So setting the type is bogus.
The new core code allows to update the type in irq_data and return
IRQ_SET_MASK_OK_NOCOPY, so the core code will not touch it, except for
setting the IRQ_LEVEL flag.
Use the proper accessors for setting the irq handlers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The core irq_set_type() function updates the flow type when the chip
callback returns 0. It also updates irq_data, so this can be used in
irq_ack() to check for the level bit. That avoids a redundant sparse
irq lookup.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The irq chip has no irq_set_type() callback. So calling the call is
pointless. Set IRQ_LEVEL via the proper accessor.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- all the integration parameters have been captured by the binding.
- the block name really uniquely identifies this hardware.
Some advocate putting SoC names everywhere in case software needs
to work around some chip-specific bug, but more precise SoC
information already exists in SVR, and board information already
exists in the top-level device tree node.
Note that sometimes the SoC name is a worse identifier than the
block version, as the block version can change between revisions
of the same SoC.
As a matter of historical reference, neither SEC versions 2.x
nor 3.x (driven by talitos) ever needed CHIP references.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Cc: Kumar Gala <kumar.gala@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Acked-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Help clarify that the number trailing in compatible nomenclature
is the version number of the device, i.e., change:
"fsl,p4080-sec4.0", "fsl,sec4.0";
to:
"fsl,p4080-sec-v4.0", "fsl,sec-v4.0";
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Cc: Kumar Gala <kumar.gala@freescale.com>
Cc: Steve Cornelius <sec@pobox.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The SEC4 supercedes the SEC2.x/3.x as Freescale's
Integrated Security Engine. Its programming model is
incompatible with all prior versions of the SEC (talitos).
The SEC4 is also known as the Cryptographic Accelerator
and Assurance Module (CAAM); this driver is named caam.
This initial submission does not include support for Data Path
mode operation - AEAD descriptors are submitted via the job
ring interface, while the Queue Interface (QI) is enabled
for use by others. Only AEAD algorithms are implemented
at this time, for use with IPsec.
Many thanks to the Freescale STC team for their contributions
to this driver.
Signed-off-by: Steve Cornelius <sec@pobox.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit ddd588b5dd ("oom: suppress nodes that are not allowed from
meminfo on oom kill") moved lib/show_mem.o out of lib/lib.a, which
resulted in build warnings on all architectures that implement their own
versions of show_mem():
lib/lib.a(show_mem.o): In function `show_mem':
show_mem.c:(.text+0x1f4): multiple definition of `show_mem'
arch/sparc/mm/built-in.o:(.text+0xd70): first defined here
The fix is to remove __show_mem() and add its argument to show_mem() in
all implementations to prevent this breakage.
Architectures that implement their own show_mem() actually don't do
anything with the argument yet, but they could be made to filter nodes
that aren't allowed in the current context in the future just like the
generic implementation.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: James Bottomley <James.Bottomley@hansenpartnership.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
deal with races in /proc/*/{syscall,stack,personality}
proc: enable writing to /proc/pid/mem
proc: make check_mem_permission() return an mm_struct on success
proc: hold cred_guard_mutex in check_mem_permission()
proc: disable mem_write after exec
mm: implement access_remote_vm
mm: factor out main logic of access_process_vm
mm: use mm_struct to resolve gate vma's in __get_user_pages
mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm
mm: arch: make in_gate_area take an mm_struct instead of a task_struct
mm: arch: make get_gate_vma take an mm_struct instead of a task_struct
x86: mark associated mm when running a task in 32 bit compatibility mode
x86: add context tag to mark mm when running a task in 32-bit compatibility mode
auxv: require the target to be tracable (or yourself)
close race in /proc/*/environ
report errors in /proc/*/*map* sanely
pagemap: close races with suid execve
make sessionid permissions in /proc/*/task/* match those in /proc/*
fix leaks in path_lookupat()
Fix up trivial conflicts in fs/proc/base.c
The Xen PV drivers in a crashed HVM guest can not connect to the dom0
backend drivers because both frontend and backend drivers are still in
connected state. To run the connection reset function only in case of a
crashdump, the is_kdump_kernel() function needs to be available for the PV
driver modules.
Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
usable for modules.
Leave 'elfcorehdr' as early_param(). This changes powerpc from __setup()
to early_param(). It adds an address range check from x86 also on ia64
and powerpc.
[akpm@linux-foundation.org: additional #includes]
[akpm@linux-foundation.org: remove elfcorehdr_addr export]
[akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is no user now.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: David Miller <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Removes resource reservation from the common sybsystem initialization code
and make it part of mport driver initialization. This resolves conflict
with resource reservation by device specific mport drivers.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Changes mport ID and host destination ID assignment to implement unified
method common to all mport drivers. Makes "riohdid=" kernel command line
parameter common for all architectures with support for more that one host
destination ID assignment.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Subsystem initialization sequence modified to support presence of multiple
RapidIO controllers in the system. The new sequence is compatible with
initialization of PCI devices.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1. Add an option to include RapidIO support if the PCI is available.
2. Add FSL_RIO configuration option to enable controller selection.
3. Add RapidIO support option into x86 and MIPS architectures.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This set of patches eliminates RapidIO dependency on PowerPC architecture
and makes it available to other architectures (x86 and MIPS). It also
enables support of new platform independent RapidIO controllers such as
PCI-to-SRIO and PCI Express-to-SRIO.
This patch:
Extend number of mport callback functions to eliminate direct linking of
architecture specific mport operations.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
minix bit operations are only used by minix filesystem and useless by
other modules. Because byte order of inode and block bitmaps is different
on each architecture like below:
m68k:
big-endian 16bit indexed bitmaps
h8300, microblaze, s390, sparc, m68knommu:
big-endian 32 or 64bit indexed bitmaps
m32r, mips, sh, xtensa:
big-endian 32 or 64bit indexed bitmaps for big-endian mode
little-endian bitmaps for little-endian mode
Others:
little-endian bitmaps
In order to move minix bit operations from asm/bitops.h to architecture
independent code in minix filesystem, this provides two config options.
CONFIG_MINIX_FS_BIG_ENDIAN_16BIT_INDEXED is only selected by m68k.
CONFIG_MINIX_FS_NATIVE_ENDIAN is selected by the architectures which use
native byte order bitmaps (h8300, microblaze, s390, sparc, m68knommu,
m32r, mips, sh, xtensa). The architectures which always use little-endian
bitmaps do not select these options.
Finally, we can remove minix bit operations from asm/bitops.h for all
architectures.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As the result of conversions, there are no users of ext2 non-atomic bit
operations except for ext2 filesystem itself. Now we can put them into
architecture independent code in ext2 filesystem, and remove from
asm/bitops.h for all architectures.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This introduces CONFIG_GENERIC_FIND_BIT_LE to tell whether to use generic
implementation of find_*_bit_le() in lib/find_next_bit.c or not.
For now we select CONFIG_GENERIC_FIND_BIT_LE for all architectures which
enable CONFIG_GENERIC_FIND_NEXT_BIT.
But m68knommu wants to define own faster find_next_zero_bit_le() and
continues using generic find_next_{,zero_}bit().
(CONFIG_GENERIC_FIND_NEXT_BIT and !CONFIG_GENERIC_FIND_BIT_LE)
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce little-endian bit operations by renaming existing powerpc native
little-endian bit operations and changing them to take any pointer types.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This makes the little-endian bitops take any pointer types by changing the
prototypes and adding casts in the preprocessor macros.
That would seem to at least make all the filesystem code happier, and they
can continue to do just something like
#define ext2_set_bit __test_and_set_bit_le
(or whatever the exact sequence ends up being).
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that gate vma's are referenced with respect to a particular mm and not a
particular task it only makes sense to propagate the change to this predicate as
well.
Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Morally, the question of whether an address lies in a gate vma should be asked
with respect to an mm, not a particular task. Moreover, dropping the dependency
on task_struct will help make existing and future operations on mm's more
flexible and convenient.
Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Morally, the presence of a gate vma is more an attribute of a particular mm than
a particular task. Moreover, dropping the dependency on task_struct will help
make both existing and future operations on mm's more flexible and convenient.
Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
All architectures can use the common dma_addr_t typedef now. We can
remove the arch specific dma_addr_t.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a node parameter to alloc_thread_info(), and change its name to
alloc_thread_info_node()
This change is needed to allow NUMA aware kthread_create_on_cpu()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In some cases during a threaded core dump not all the threads will have
a full register set. This happens when the signal causing the core dump
races with a thread exiting. The race happens when the exiting thread
has entered the kernel for the last time before the signal arrives, but
doesn't get far enough through the exit code to avoid being included
in the core dump.
So we get a thread included in the core dump which is never going to go
out to userspace again and only has a partial register set recorded
Normally we would catch each thread as it is about to go into userspace
and capture the full register set then.
However, this exiting thread is never going to go out to userspace
again, so we have no way to capture its full register set. It doesn't
really matter, though, as this is a thread which is effectively
already dead.
So instead of hitting a BUG() in this case (a really bad choice of
action in the first place), we use a poison value for the register
values.
[BenH]: Some cosmetic/stylistic changes and fix build on ppc32
Signed-off-by: Mike Wolf <mjw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The following code snippet:
unsigned int cpu = 0;
if (mpic->flags & MPIC_PRIMARY)
cpu = hard_smp_processor_id();
is seen in several places in the 'mpic.c' code. This changeset factors
that pattern out into a helper function called 'mpic_processor_id'.
Signed-off-by: Meador Inge <meador_inge@mentor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This property, defined in the Open PIC binding, tells the kernel not to use
the reset bit in the global configuration register. Additionally, its
presence mandates that only sources which are actually used (i.e. appear in
the device tree) should have their VECPRI bits initialized.
Although, "pic-no-reset" can be used for the same use cases that
"protected-sources" is covering, the "protected-sources" implementation was
left completely intact. This is a more pragmatic approach as there are
already several existing systems which use protected sources. If
"pic-no-reset" *and* "protected-sources" are both used, however, then
"pic-no-reset" takes precedence in terms of the init behavior and the
sanity checks done by protected sources will still take place.
Signed-off-by: Meador Inge <meador_inge@mentor.com>
Cc: Hollis Blanchard <hollis_blanchard@mentor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit b5d937de03 has a bug which causes
basically a NULL dereference in the PCI code during boot on ppc64
machines.
fetch_dev_dn() is called when dev->dev.of_node is NULL, so using that
as the starting point for the search makes no sense. It should instead
start from the device node of the PHB.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
trace, filters: Initialize the match variable in process_ops() properly
trace, documentation: Fix branch profiling location in debugfs
oprofile, s390: Cleanups
oprofile, s390: Remove hwsampler_files.c and merge it into init.c
perf: Fix tear-down of inherited group events
perf: Reorder & optimize perf_event_context to remove alignment padding on 64 bit builds
perf: Handle stopped state with tracepoints
perf: Fix the software events state check
perf, powerpc: Handle events that raise an exception without overflowing
perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints
perf, x86: Clean up SandyBridge PEBS events
perf lock: Fix sorting by wait_min
perf tools: Version incorrect with some versions of grep
perf evlist: New command to list the names of events present in a perf.data file
perf script: Add support for H/W and S/W events
perf script: Add support for dumping symbols
perf script: Support custom field selection for output
perf script: Move printing of 'common' data from print_event and rename
perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse
perf script: Change process_event prototype
...
* 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (55 commits)
KVM: unbreak userspace that does not sets tss address
KVM: MMU: cleanup pte write path
KVM: MMU: introduce a common function to get no-dirty-logged slot
KVM: fix rcu usage in init_rmode_* functions
KVM: fix kvmclock regression due to missing clock update
KVM: emulator: Fix permission checking in io permission bitmap
KVM: emulator: Fix io permission checking for 64bit guest
KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n
KVM: x86: Remove useless regs_page pointer from kvm_lapic
KVM: improve comment on rcu use in irqfd_deassign
KVM: MMU: remove unused macros
KVM: MMU: cleanup page alloc and free
KVM: MMU: do not record gfn in kvm_mmu_pte_write
KVM: MMU: move mmu pages calculated out of mmu lock
KVM: MMU: set spte accessed bit properly
KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits
KVM: Start lock documentation
KVM: better readability of efer_reserved_bits
KVM: Clear async page fault hash after switching to real mode
KVM: VMX: Initialize vm86 TSS only once.
...
Previously SPRGs 4-7 were improperly read and written in
kvm_arch_vcpu_ioctl_get_regs() and kvm_arch_vcpu_ioctl_set_regs();
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Tyser <ptyser@xes-inc.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (76 commits)
pch_uart: reference clock on CM-iTC
pch_phub: add new device ML7213
n_gsm: fix UIH control byte : P bit should be 0
n_gsm: add a documentation
serial: msm_serial_hs: Add MSM high speed UART driver
tty_audit: fix tty_audit_add_data live lock on audit disabled
tty: move cd1865.h to drivers/staging/tty/
Staging: tty: fix build with epca.c driver
pcmcia: synclink_cs: fix prototype for mgslpc_ioctl()
Staging: generic_serial: fix double locking bug
nozomi: don't use flush_scheduled_work()
tty/serial: Relax the device_type restriction from of_serial
MAINTAINERS: Update HVC file patterns
tty: phase out of ioctl file pointer for tty3270 as well
tty: forgot to remove ipwireless from drivers/char/pcmcia/Makefile
pch_uart: Fix DMA channel miss-setting issue.
pch_uart: fix exclusive access issue
pch_uart: fix auto flow control miss-setting issue
pch_uart: fix uart clock setting issue
pch_uart : Use dev_xxx not pr_xxx
...
Fix up trivial conflicts in drivers/misc/pch_phub.c (same patch applied
twice, then changes to the same area in one branch)
None of the existing cpufreq drivers uses the second argument of
its .suspend() callback (which isn't useful anyway), so remove it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
percpu: Generic support for this_cpu_cmpxchg_double()
alpha: use L1_CACHE_BYTES for cacheline size in the linker script
percpu: align percpu readmostly subsection to cacheline
Fix up trivial conflict in arch/x86/kernel/vmlinux.lds.S due to the
percpu alignment having changed ("x86: Reduce back the alignment of the
per-CPU data section")
Events on POWER7 can roll back if a speculative event doesn't
eventually complete. Unfortunately in some rare cases they will
raise a performance monitor exception. We need to catch this to
ensure we reset the PMC. In all cases the PMC will be 256 or less
cycles from overflow.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org> # as far back as it applies cleanly
LKML-Reference: <20110309143842.6c22845e@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
x86: Clean up apic.c and apic.h
x86: Remove superflous goal definition of tsc_sync
x86: dt: Correct local apic documentation in device tree bindings
x86: dt: Cleanup local apic setup
x86: dt: Fix OLPC=y/INTEL_CE=n build
rtc: cmos: Add OF bindings
x86: ce4100: Use OF to setup devices
x86: ioapic: Add OF bindings for IO_APIC
x86: dtb: Add generic bus probe
x86: dtb: Add support for PCI devices backed by dtb nodes
x86: dtb: Add device tree support for HPET
x86: dtb: Add early parsing of IO_APIC
x86: dtb: Add irq domain abstraction
x86: dtb: Add a device tree for CE4100
x86: Add device tree support
x86: e820: Remove conditional early mapping in parse_e820_ext
x86: OLPC: Make OLPC=n build again
x86: OLPC: Remove extra OLPC_OPENFIRMWARE_DT indirection
x86: OLPC: Cleanup config maze completely
x86: OLPC: Hide OLPC_OPENFIRMWARE config switch
...
Fix up conflicts in arch/x86/platform/ce4100/ce4100.c
* 'core-futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
arm: Remove bogus comment in futex_atomic_cmpxchg_inatomic()
futex: Deobfuscate handle_futex_death()
plist: Add priority list test
plist: Shrink struct plist_head
futex,plist: Remove debug lock assignment from plist_node
futex,plist: Pass the real head of the priority list to plist_del()
futex: Sanitize futex ops argument types
futex: Sanitize cmpxchg_futex_value_locked API
futex: Remove redundant pagefault_disable in futex_atomic_cmpxchg_inatomic()
futex: Avoid redudant evaluation of task_pid_vnr()
futex: Update futex_wait_setup comments about locking
sram_params.sram_size and sram_params.sram_offset were unsigned.
If get_cache_sram_size() or get_cache_sram_offset() returns error code
then it is not seen to the caller. Made sram_size and sram_offset signed.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Now handles multiple ranges, doesn't make assumptions about interrupt
specifier format, and doesn't claim interrupts that don't correspond to an
available range.
Also has some better error checking.
The device tree binding is updated to clarify some existing assumptions.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Conversion from float to integer should based on both the instruction
encoding and the sign of the operand.
A simple testcase to show the issue:
static float fm;
static signed int si_min = (-2147483647 - 1);
static unsigned int ui;
int main()
{
fm = (float) si_min; ;
ui = (unsigned int)fm;
printf("ui=%d, should be %d\n", ui, si_min);
return 0;
}
Result: ui=-1, should be -2147483648
Signed-off-by: Shan Hai <shan.hai@windriver.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Update p1022 sata compatible to "fsl,p1022-sata", "fsl,pq-sata-v2".
p1022ds sata controller is v2 version comparing previous FSL sata
controller, for example, mpc8536.
Signed-off-by: Lei Xu <B33228@freescale.com>
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Commit da3ed89e7c added
'fsl,qoriq-gpio' compatiable searching in the old way
using for_each_compatible_node(). But the driver have
previously been changed to use a struct of_device_id
compatible list passed to for_each_matching_node().
Add 'fsl,qoriq-gpio' compatiable to the existing
compatible list instead of adding another
for_each_compatible_node() loop.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The MPC852 based mgsuvd board from Keymile was initially ported,
but later on not developed further. This patch removes the respective
files to decrease merging conflicts and unneeded maintenance.
Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
Acked-by: Heiko Schocher<hs@denx.de>
Cc: Vitaly Bordug <vitb@kernel.crashing.org>
Cc: Marcelo Tosatti <marcelo@kvack.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The mgcoge board from keymile is now base for some other
similar boards. Therefore the board specific name mgcoge
was renamed to a generic name km82xx. Additionally some
enhancements were made:
- rework partition table in dts file
- add cpm2_pio_c gpio controller in dts file
- update defconfig
- add pin description for SCC1
- add pin description and configuration for USB
Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
Acked-by: Heiko Schocher <hs@denx.de>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Heiko Schocher <hs@denx.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Beside the MPC 8360 based board kmeter1 other km83xx boards
from keymile will follow. Therefore the board specific naming
kmeter1 for functions and files were replaced with km83xx.
Additionally some updates were made:
- update defconfig for 2.6.38
- rework flash partitioning in dts file
- add gpio controller for qe_pio_c in dts
Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
Acked-by: Heiko Schocher <hs@denx.de>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Heiko Schocher <hs@denx.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This erratum can occur if a single-precision floating-point,
double-precision floating-point or vector floating-point instruction on a
mispredicted branch path signals one of the floating-point data interrupts
which are enabled by the SPEFSCR (FINVE, FDBZE, FUNFE or FOVFE bits). This
interrupt must be recorded in a one-cycle window when the misprediction is
resolved. If this extremely rare event should occur, the result could be:
The SPE Data Exception from the mispredicted path may be reported
erroneously if a single-precision floating-point, double-precision
floating-point or vector floating-point instruction is the second
instruction on the correct branch path.
According to errata description, some efp instructions which are not
supposed to trigger SPE exceptions can trigger the exceptions in this case.
However, as we haven't emulated these instructions here, a signal will
send to userspace, and userspace application would exit.
This patch re-issue the efp instruction that we haven't emulated,
so that hardware can properly execute it again if this case happen.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
FSL PCIe controller v2.1:
- New MSI inbound window
- Same Inbound windows address as PCIe controller v1.x
Added new pit_t member(pmit) to struct ccsr_pci for MSI inbound window
FSL PCIe controller v2.2 and v2.3:
- Different addresses for PCIe inbound window 3,2,1
- Exposed PCIe inbound window 0
- New PCIe interrupt status register
Added new config and interrupt Status register to struct ccsr_pci & updated
pit_t array size to reflect the 4 inbound windows.
Device tree is used to maintain backward compatibility i.e. update inbound
window 1 index depending upon "compatible" field witin PCIE node.
Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
Acked-by: Roy Zang <tie-fei.zang@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
If the spin table is located in the linear mapping (which can happen if
we have 4G or more of memory) we need to access the spin table via a
cacheable coherent mapping like we do on ppc32 (and do explicit cache
flush).
See the following commit for the ppc32 version of this issue:
commit d1d47ec6e6
Author: Peter Tyser <ptyser@xes-inc.com>
Date: Fri Dec 18 16:50:37 2009 -0600
powerpc/85xx: Fix SMP when "cpu-release-addr" is in lowmem
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
all remaining callers pass LOOKUP_PARENT to it, so
flags argument can die; renamed to kern_path_parent()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic
prototypes to use u32 types for the futex as this is the data type the
futex core code uses all over the place.
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311025058.GD26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The cmpxchg_futex_value_locked API was funny in that it returned either
the original, user-exposed futex value OR an error code such as -EFAULT.
This was confusing at best, and could be a source of livelocks in places
that retry the cmpxchg_futex_value_locked after trying to fix the issue
by running fault_in_user_writeable().
This change makes the cmpxchg_futex_value_locked API more similar to the
get_futex_value_locked one, returning an error code and updating the
original value through a reference argument.
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile]
Acked-by: Tony Luck <tony.luck@intel.com> [ia64]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu> [microblaze]
Acked-by: David Howells <dhowells@redhat.com> [frv]
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311024851.GC26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
On upcoming hardware, we have a PCI adapter with two functions, one of
which uses MSI and the other uses MSI-X. This adapter, when MSI is
disabled using the "old" firmware interface (RTAS_CHANGE_FN), still
signals an MSI-X interrupt and triggers an EEH. We are working with the
vendor to ensure that the hardware is not at fault, but if we use the
"new" interface (RTAS_CHANGE_MSI_FN) to disable MSI, we also
automatically disable MSI-X and the adapter does not appear to signal
any stray MSI-X interrupt.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This feature triggers nasty races in the scheduler between the
rebuilding of the topology and the load balancing code, causing
the machine to hang.
Disable it for now until the races are fixed.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The combination of commit
8154c5d22d and
93c22703ef
Broke boot on iSeries.
The problem is that iSeries very early boot code, which generates
the device-tree and runs before our normal early initializations
does need access the lppaca's very early, before the PACA array is
initialized, and in fact even before the boot PACA has been
initialized (it contains all 0's at this stage).
However, the first patch above makes that code use the new
llpaca_of(cpu) accessor, which itself is changed by the second patch to
use the PACA array.
We fix that by reverting iSeries to directly dereferencing the array. In
addition, we fix all iterators in the iSeries code to always skip CPU
whose number is above 63 which is the maximum size of that array and
the maximum number of supported CPUs on these machines.
Additionally, we make sure the boot_paca is properly initialized
in our early startup code.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If firmware allows us to map all of a partition's memory for DMA on a
particular bridge, create a 1:1 mapping of that memory. Add hooks for
dealing with hotplug events. Dynamic DMA windows can use larger than the
default page size, and we use the largest one possible.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Move SPRN_PID declearations in various locations into one place.
Signed-off-by: Tseng-Hui (Frank) Lin <thlin@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Create the lnx,oops-log NVRAM partition, and capture the end of the printk
buffer in it when there's an oops or panic. If we can't create the
lnx,oops-log partition, capture the oops/panic report in ibm,rtas-log.
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Adapt the functions used to create and write to the RTAS-log partition
to work with any OS-type partition.
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
memblock_enforce_memory_limit() takes the desired maximum quantity of memory
to end up with, not an address above which memory will not be used.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The rtas_event_scan() function uses smp_processor_id() to select a
starting point in cpu_online_mask, and does so under the protection
of get_online_cpus(). This might not select the current processor
in any case, so switch to raw_smp_processor_id().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch below removes an extra "l" in the word.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
A number of drivers are using pgprot_writecombine() to enable write
combining on userspace mappings. Implement it on powerpc.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the new functions and free the descriptor when the virq is
destroyed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Define the ARCH_IRQ_INIT_FLAGS instead of fixing it up in a loop.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Minor cleanup of notifier_from_errno() in powerpc.
notifier_from_errno() now contains the if(ret)/else conditional.
There is no need to do it in the powerpc code.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Untested, but looks like an obvious typo to me.
[BenH: No feedback, but it's obviously wrong]
Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix the error in spelling the config option for hw-breakpoints and fix
the build issue that follows.
Signed-off by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kyle Moffett points out that mpc85xx has started using the
ppc_md.machine_kexec hook. As such, revert patch c94868788c
(powerpc/kexec: Remove ppc_md.machine_kexec).
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
hpte_need_flush() might be called outside of a preempt section
when manipulating the kernel page tables, so we need to use the
appopriate variants of per-cpu variable accesses. There should
be no risk of being in the middle of a batch and a context
switch will flush any pending batch.
[Patch extracted from a larger patch in Peter's preemptible
mmu_gather series]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Get rid of old users of of_platform_driver in arch/powerpc. Most
of_platform_driver users can be converted to use the platform_bus
directly.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
arch/powerpc/kernel/ibmebus.c is the only remaining user of the
of_bus_type support code for initializing the bus and registering
drivers. All others have either been switched to the vanilla platform
bus or already have their own infrastructure.
This patch moves the functionality that ibmebus is using out of
drivers/of/{platform,device}.c and into ibmebus.c where it is actually
used. Also renames the moved symbols from of_platform_* to
ibmebus_bus_* to reflect the actual usage.
This patch is part of moving all of the of_platform_bus_type users
over to the platform_bus_type.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reason: Import mainline device tree changes on which further patches
depend on or conflict.
Trivial conflict in: drivers/spi/pxa2xx_spi_pci.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This is useful for system management software so that it can kick
off things like gettys and everything that's started from a tty,
before we reuse it from/for something else or shut it down.
Without this ioctl it would have to temporarily become the owner of
the tty, then call vhangup() and then give it up again.
Cc: Lennart Poettering <lennart@poettering.net>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Spinlocks on shared processor partitions use H_YIELD to notify the
hypervisor we are waiting on another virtual CPU. Unfortunately this means
the hcall tracepoints can recurse.
The patch below adds a percpu depth and checks it on both the entry and
exit hcall tracepoints.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@kernel.org
When converting to the new cpumask code I screwed up:
- if (cpu_isset(cpu, numa_cpumask_lookup_table[node])) {
- cpu_clear(cpu, numa_cpumask_lookup_table[node]);
+ if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) {
+ cpumask_set_cpu(cpu, node_to_cpumask_map[node]);
This was introduced in commit 25863de07a (powerpc/cpumask: Convert NUMA code
to new cpumask API)
Fix it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There is no need to start up the timer and monitor topology changes on a
dedicated processor partition, so disable it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The rest of the NUMA code expects an OF associativity property with
the first cell containing the length. Without this fix all topology changes
cause us to misparse the property and put the cpu into node 0.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The hypervisor uses unsigned 1 byte counters to signal topology changes to
the OS. Since they can wrap we need to check for any difference, not just if
the hypervisor count is greater than the previous count.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
VPHN supports up to 8 distance fields but the number of entries in
ibm,associativity-reference-points signifies how many are in use.
Don't look at all the VPHN counts, only distance_ref_points_depth
worth.
Since we already cap our distance metrics at MAX_DISTANCE_REF_POINTS,
use that to size the VPHN arrays and add a BUILD_BUG_ON to avoid it growing
larger than the VPHN maximum of 8.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Correct a spelling error in VPHN comments in numa.c.
Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Some of those functions try to adjust the CPU features, for example
to remove NAP support on some revisions. However, they seem to use
r5 as an index into the CPU table entry, which might have been right
a long time ago but no longer is. r4 is the right register to use.
This probably caused some off behaviours on some PowerMac variants
using 750cx or 7455 processor revisions.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@kernel.org
When calling setup_cpu() on 64-bit, we pass a pointer to the
cputable entry we have found. This used to be fine when cur_cpu_spec
was a pointer to that entry, but nowadays, we copy the entry into
a separate variable, and we do so before we call the setup_cpu()
callback. That means that any attempt by that callback at patching
the CPU table entry (to adjust CPU features for example) will patch
the wrong table.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
max_mapnr is a pfn, not an index innto mem_map[]. So don't add
ARCH_PFN_OFFSET a second time.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, ppc32 uses sysdata for the pci_controller pointer, and
ppc64 uses it to hold the device_node pointer. This patch moves the
of_node pointer into (struct pci_bus*)->dev.of_node and
(struct pci_dev*)->dev.of_node so that sysdata can be converted to always
use the pci_controller pointer instead. It also fixes up the
allocating of pci devices so that the of_node pointer gets assigned
consistently and increments the ref count.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
There is a tiny difference between PPC32 and PPC64. Microblaze uses the
PPC32 variant.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
[grant.likely@secretlab.ca: Added comment to #endif, moved documentation
block to function implementation, fixed for non ppc and microblaze
compiles]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Define a version of memory_block_size_bytes() for powerpc/pseries such that
a memory block spans an entire lmb.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Reviewed-by: Robin Holt <holt@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The 476FP core may hang if an instruction fetch happens during an msync
following a tlbsync. This workaround makes sure that enough instruction
cache lines are pre-fetched before executing the msync. (sync and msync
are the same to the compiler.)
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
The DD2 core still has some unstability. Define CPU_FTR_476_DD2 to
enable workarounds in later patches.
This is based on an earlier, unreleased patch for DD1 by Ben Herrenschmidt.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This fix is a reset for USB PHY that requires some amount of time for power
to be stable on Canyonlands.
Signed-off-by: Rupjyoti Sarmah <rsarmah@apm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
All architecture specific rwsem headers carry the same function
prototypes. Just x86 adds asmregparm, which is an empty define on all
other architectures. S390 has a stale rwsem_downgrade_write()
prototype.
Remove the duplicates and add the prototypes to linux/rwsem.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.970840140@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Instead of having the same implementation in each architecture, move
it to linux/rwsem.h and remove the duplicates. It's unlikely that an
arch will ever implement something different, but we can deal with
that when it happens.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.876773757@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The rwsem initializers and related macros and functions are mostly the
same. Some of them lack the lockdep initializer, but having it in
place does not matter for architectures which do not support lockdep.
powerpc, sparc, x86: No functional change
sh, s390: Removes the duplicate init_rwsem (inline and #define)
alpha, ia64, xtensa: Use the lockdep capable init function in
lib/rwsem.c which is just uninlining the init
function for the LOCKDEP=n case
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.771812729@linutronix.de>
The difference between these declarations is the data type of the
count member and the lack of lockdep in some architectures/
long is equivivalent to signed long and the #ifdef guarded dep_map
member does not hurt anyone.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.679641914@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
All rwsem implementations include the same headers. Include them from
include/linux/rwsem.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.483520950@linutronix.de>
Currently percpu readmostly subsection may share cachelines with other
percpu subsections which may result in unnecessary cacheline bounce
and performance degradation.
This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR()
linker macros, makes each arch linker scripts specify its cacheline
size and use it to align percpu subsections.
This is based on Shaohua's x86 only patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shaohua.li@intel.com>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix time function double declaration with glibc
perf tools: Fix build by checking if extra warnings are supported
perf tools: Fix build when using gcc 3.4.6
perf tools: Add missing header, fixes build
perf tools: Fix 64 bit integer format strings
perf test: Fix build on older glibcs
perf: perf_event_exit_task_context: s/rcu_dereference/rcu_dereference_raw/
perf test: Use cpu_map->[cpu] when setting affinity
perf symbols: Fix annotation of thumb code
perf: Annotate cpuctx->ctx.mutex to avoid a lockdep splat
powerpc, perf: Fix frequency calculation for overflowing counters (FSL version)
perf: Fix perf_event_init_task()/perf_event_free_task() interaction
perf: Fix find_get_context() vs perf_event_exit_task() race
All architectures are finally converted. Remove the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Michal Simek <monstr@monstr.eu>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jeff Dike <jdike@addtoit.com>
Don't say that enable timed out when it was disable, and
show which IRQ had the problem.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Upcoming servers will include a Broadcom NIC, add to the defconfig to
increase testing coverage and make sure mainline builds come up with
networking.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
- Enable 64kB pages so it gets some regular testing.
- The largest POWER7 has 1024 threads so bump NR_CPUS it to match.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
IRQSOFF_TRACER and STACK_TRACER force the kernel to be built with -pg
which is a substantial overhead.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The dts-installed variable is initialised using a wildcard path that
will be expanded relative to the build directory. Use the existing
variable dtstree to generate an absolute wildcard path that will work
when building in a separate directory.
Reported-by: Gerhard Pircher <gerhard_pircher@gmx.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Gerhard Pircher <gerhard_pircher@gmx.net> [against 2.6.32]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Decoding machine checks is CPU specific and so machine_check_generic doesn't
do the right thing on 64bit chips. Luckily we never call into this code
because we call ppc_md.machine_check_exception instead if available.
Since we check cur_cpu_spec->machine_check before calling it, we may as
well remove machine_check_generic from 64bit archs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The spec suggests we should first check the extended log flag before checking
the length field.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The FWNMI code uses a global buffer without any locks to read the RTAS error
information. If two CPUs take a machine check at once then we will corrupt
this buffer.
Since most FWNMI rtas messages are not of the extended type, we can create a
64bit percpu buffer and use it where possible. If we do receive an extended
RTAS log then we fall back to the old behaviour of using the global buffer.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rework pseries machine check handler:
- If MSR_RI isn't set, we cannot recover even if the machine check was fully
recovered
- Rename nonfatal to recovered
- Handle RTAS_DISP_LIMITED_RECOVERY
- Use BUS_MCEERR_AR instead of BUS_ADRERR
- Don't check all the RTAS error log fields when receiving a synchronous
machine check. Recent versions of the pseries firmware do not fill them
in during a machine check and instead send a follow up error log with
the detailed information. If we see a synchronous machine check, and we
came from userspace then kill the task.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If a machine check comes from userspace we send a SIGBUS to the task and
fail to printk anything.
If we are taking machine checks due to bad hardware we want to know about
it right away. Furthermore if we don't complain loudly then it will look
a lot like a bug in the userspace application, potentially causing a lot
of confusion.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We are calling debugger_fault_handler twice in machine_check_exception.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Newer versions of the System p firwmare send a partial RTAS error log in the
machine check handler with a more detailed response appearing sometime later
via check event.
This means at machine check time we do not have enough information to
ascertain exactly what went on. Furthermore, I have found the RTAS error
logs in the machine check handler contain no useful information, so halting on
them makes little sense. If we want to halt it would make more sense to do
it following the error log received sometime later via check event.
In light of this, never halt the error log in the pseries machine
check handler.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We should never force MSR_RI on. If we take a machine check with MSR_RI off
then we have no chance of recovering safely.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We were printing 64 bits of DSISR in show_regs even though it is 32 bit.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We should disable ftrace during kexec, some of the tracers are very invasive
and we do not want them going off while doing the low level work of swapping
one kernel out for another. This mirrors what we do on x86.
Even though we cannot return from a kexec on powerpc (since we do not implement
CONFIG_KEXEC_JUMP), add the restore code in case we do one day.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the crash handler hooks to run the SPU stop code, just like we do for
ehea and cell RAS code.
While I'm here I noticed "CPUSs reliabally"
so fix the spelling MISTAKESs reliabally.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We check for a valid handler before calling ppc_md.machine_kexec_prepare
so we can just remove these empty handlers.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There's no need to initialise ppc_md.machine_kexec and
ppc_md.machine_kexec_prepare to the default handlers.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
No one uses ppc_md.machine_crash_shutdown, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
No one uses ppc_md.machine_kexec, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
No one uses ppc_md.machine_kexec_cleanup, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Move all the kexec handlers together.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With cmwq, there's no reason to use a separate workqueue in
cpufreq_spudemand. Use system_wq instead. The work items are already
sync canceled on stop, so it's already guaranteed that no work is
running when spu_gov_exit() is entered.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Dave Jones <davej@redhat.com>
Cc: cpufreq@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Simplify read file operation for /proc/powerpc/rtas/* interface
by using simple_read_from_buffer.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Simplify several write fileoperations for spufs by using
simple_write_to_buffer().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
32-bit variant of the previous patch for 64-bit:
<<
When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
With one level stack. But if we have irqsoff tracing enabled,
it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
goes two stack frames up. If this is from user space, then there may
not exist a second stack....
>>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
With one level stack. But if we have irqsoff tracing enabled,
it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
goes two stack frames up. If this is from user space, then there may
not exist a second stack.
Add a second stack when calling trace_hardirqs_on/off() otherwise
the following oops might occur:
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT SMP NR_CPUS=2 PA Semi PWRficient
last sysfs file: /sys/block/sda/size
Modules linked in: ohci_hcd ehci_hcd usbcore
NIP: c0000000000e1c00 LR: c0000000000034d4 CTR: 000000011012c440
REGS: c00000003e2f3af0 TRAP: 0300 Not tainted (2.6.37-rc6+)
MSR: 9000000000001032 <ME,IR,DR> CR: 48044444 XER: 20000000
DAR: 00000001ffb9db50, DSISR: 0000000040000000
TASK = c00000003e1a00a0[2088] 'emacs' THREAD: c00000003e2f0000 CPU: 1
GPR00: 0000000000000001 c00000003e2f3d70 c00000000084e0d0 c0000000008816e8
GPR04: 000000001034c678 000000001032e8f9 0000000010336540 0000000040020000
GPR08: 0000000040020000 00000001ffb9db40 c00000003e2f3e30 0000000060000000
GPR12: 100000000000f032 c00000000fff0280 000000001032e8c9 0000000000000008
GPR16: 00000000105be9c0 00000000105be950 00000000105be9b0 00000000105be950
GPR20: 00000000ffb9dc50 00000000ffb9dbf0 00000000102f0000 00000000102f0000
GPR24: 00000000102e0000 00000000102f0000 0000000010336540 c0000000009ded38
GPR28: 00000000102e0000 c0000000000034d4 c0000000007ccb10 c00000003e2f3d70
NIP [c0000000000e1c00] .trace_hardirqs_off+0xb0/0x1d0
LR [c0000000000034d4] decrementer_common+0xd4/0x100
Call Trace:
[c00000003e2f3d70] [c00000003e2f3e30] 0xc00000003e2f3e30 (unreliable)
[c00000003e2f3e30] [c0000000000034d4] decrementer_common+0xd4/0x100
Instruction dump:
81690000 7f8b0000 419e0018 f84a0028 60000000 60000000 60000000 e95f0000
80030000 e92a0000 eb6301f8 2f800000 <eb890010> 41fe00dc a06d000a eb1e8050
---[ end trace 4ec7fd2be9240928 ]---
Reported-by: Joerg Sommer <joerg@alea.gnuu.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When we create an alternative feature section, the else case must be the
same size or smaller than the body. This is because when we patch the
else case in we just overwrite the body, so there must be room.
Up to now we just did this by inspection, but it's quite easy to enforce
it in the assembler, so we should.
The only change is to add the ifgt block, but that effects the alignment
of the tabs and so the whole macro is modified.
Also add a test, but #if 0 it because we don't want to break the build.
Anyone who's modifying the feature macros should enable the test.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
is used to configure any non-standard kernel with a much larger scope than
only small devices.
This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
references to the option throughout the kernel. A new CONFIG_EMBEDDED
option is added that automatically selects CONFIG_EXPERT when enabled and
can be used in the future to isolate options that should only be
considered for embedded systems (RISC architectures, SLOB, etc).
Calling the option "EXPERT" more accurately represents its intention: only
expert users who understand the impact of the configuration changes they
are making should enable it.
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David Woodhouse <david.woodhouse@intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Greg KH <gregkh@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Robin Holt <holt@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When fixing the frequency calculations for perf on powerpc I
forgot to fix the FSL version.
If we dont set event->hw.last_period the frequency to period
calculations in perf go haywire and we continually
throttle/unthrottle the PMU.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110118214404.2f42e634@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit a4f740cf, "of/flattree: Add of_flat_dt_match() helper function"
introduced build failures in arch/powerpc/platform/83xx by mistyping
'static' as 'struct' in the compatible string list, and omitting a few
semicolons. This patch fixes it.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix tracepoint id to string perf.data header table
perf tools: Fix handling of wildcards in tracepoint event selectors
powerpc: perf: Fix frequency calculation for overflowing counters
When profiling a benchmark that is almost 100% userspace, I noticed some wildly
inaccurate profiles that showed almost all time spent in the kernel.
Closer examination shows we were programming a tiny number of cycles into the
PMU after each overflow (about ~200 away from the next overflow). This gets us
stuck in a loop which we eventually break out of by throttling the PMU (there
are regular throttle/unthrottle events in the log).
It looks like we aren't setting event->hw.last_period to something same and the
frequency to period calculations in perf are going haywire.
With the following patch we find the correct period after a few interrupts and
stay there. I also see no more throttle events.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
LKML-Reference: <20110117161742.5feb3761@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The physical address is never used by the device tree code when
allocating memory for unflattening. Change the architecture's alloc
hook to return the virutal address instead.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Alter compound get_page/put_page to keep references on subpages too, in
order to allow __split_huge_page_refcount to split an hugepage even while
subpages have been pinned by one of the get_user_pages() variants.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (142 commits)
KVM: Initialize fpu state in preemptible context
KVM: VMX: when entering real mode align segment base to 16 bytes
KVM: MMU: handle 'map_writable' in set_spte() function
KVM: MMU: audit: allow audit more guests at the same time
KVM: Fetch guest cr3 from hardware on demand
KVM: Replace reads of vcpu->arch.cr3 by an accessor
KVM: MMU: only write protect mappings at pagetable level
KVM: VMX: Correct asm constraint in vmcs_load()/vmcs_clear()
KVM: MMU: Initialize base_role for tdp mmus
KVM: VMX: Optimize atomic EFER load
KVM: VMX: Add definitions for more vm entry/exit control bits
KVM: SVM: copy instruction bytes from VMCB
KVM: SVM: implement enhanced INVLPG intercept
KVM: SVM: enhance mov DR intercept handler
KVM: SVM: enhance MOV CR intercept handler
KVM: SVM: add new SVM feature bit names
KVM: cleanup emulate_instruction
KVM: move complete_insn_gp() into x86.c
KVM: x86: fix CR8 handling
KVM guest: Fix kvm clock initialization when it's configured out
...
In fsl_rio_dbell_handler() the code currently simply acknowledges the QFI
queue full interrupt, but does nothing to resolve the queue full
condition. Instead, it jumps to the end of the isr. When a queue full
condition occurs, the isr is then re-entered immediately and continually,
forever.
The fix is to just fall through and read out current doorbell entries.
Signed-off-by: Thomas Taranowski <tom@baringforge.com>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the warnings genereted by arch/powerpc/include/asm/immap_qe.h when
CONFIG_PHYS_ADDR_T_64BIT is defined:
immap_qe.h: In function 'immrbar_virt_to_phys':
immap_qe.h:472:8: warning: cast from pointer to integer of different size
immap_qe.h:472:24: warning: cast from pointer to integer of different size
immap_qe.h:473:5: warning: cast from pointer to integer of different size
immap_qe.h:473:21: warning: cast from pointer to integer of different size
immap_qe.h:474:36: warning: cast from pointer to integer of different size
Note that the QE does not support 36-bit physical addresses, so even when
CONFIG_PHYS_ADDR_T_64BIT is defined, the QE MURAM must be located below the
4GB boundary.
Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
In order to prevent the fsl_dma driver from claiming the DMA channels that the
P1022DS audio driver needs, the compatible properties for those nodes must say
"fsl,ssi-dma-channel" instead of "fsl,eloplus-dma-channel".
Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
MPC8308 has ULPI pin muxing settings in SICRH register, bits 17-18
which is different from both MPC8313 and MPC8315.
Also MPC8308 doesn't have REFSEL, UTMI_PHY_EN and OTG_PORT fields
in the USB DR controller CONTROL register.
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Tested-by: Wolfgang Denk <wd@denx.de>
Acked-by: Wolfgang Denk <wd@denx.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Moved setting of RFXE bit so we get machine checks on RIO errors into
cpu_setup so that the RIO code isn't core specific.
Signed-off-by: Shaohui Xie <b21989@freescale.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Roy Zang <tie-fei.zang@freescale.com>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Also make 74xx HID1 definition conditional.
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Shaohui Xie <b21989@freescale.com>
Cc: Roy Zang <tie-fei.zang@freescale.com>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
IA64 support forces us to abstract the allocation of the kvm structure.
But instead of mixing this up with arch-specific initialization and
doing the same on destruction, split both steps. This allows to move
generic destruction calls into generic code.
It also fixes error clean-up on failures of kvm_create_vm for IA64.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (72 commits)
powerpc/pseries: Fix build of topology stuff without CONFIG_NUMA
powerpc/pseries: Fix VPHN build errors on non-SMP systems
powerpc/83xx: add mpc8308_p1m DMA controller device-tree node
powerpc/83xx: add DMA controller to mpc8308 device-tree node
powerpc/512x: try to free dma descriptors in case of allocation failure
powerpc/512x: add MPC8308 dma support
powerpc/512x: fix the hanged dma transfer issue
powerpc/512x: scatter/gather dma fix
powerpc/powermac: Make auto-loading of therm_pm72 possible
of/address: Use propper endianess in get_flags
powerpc/pci: Use printf extension %pR for struct resource
powerpc: Remove unnecessary casts of void ptr
powerpc: Disable VPHN polling during a suspend operation
powerpc/pseries: Poll VPA for topology changes and update NUMA maps
powerpc: iommu: Add device name to iommu error printks
powerpc: Record vma->phys_addr in ioremap()
powerpc: Update compat_arch_ptrace
powerpc: Fix PPC_PTRACE_SETHWDEBUG on PPC_BOOK3S
powerpc/time: printk time stamp init not correct
powerpc: Minor cleanups for machdep.h
...
The header asm/hvcall.h was previously included indirectly via
smp.h. On non-SMP systems, however, these declarations are excluded
and the build breaks. This is easily fixed by including asm/hvcall.h
directly.
The VPHN feature is only meaningful on NUMA systems that implement
the SPLPAR option, so exclude the VPHN code on systems without
SPLPAR enabled.
Also, expose unmap_cpu_from_node() on systems with SPLPAR enabled,
even if CONFIG_HOTPLUG_CPU is disabled.
Lastly, map_cpu_to_node() is now needed by VPHN to manipulate the
node masks after boot time, so remove the __cpuinit annotation to
fix a section mismatch.
Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (29 commits)
of/flattree: forward declare struct device_node in of_fdt.h
ipmi: explicitly include of_address.h and of_irq.h
sparc: explicitly cast negative phandle checks to s32
powerpc/405: Fix missing #{address,size}-cells in i2c node
powerpc/5200: dts: refactor dts files
powerpc/5200: dts: Change combatible strings on localbus
powerpc/5200: dts: remove unused properties
powerpc/5200: dts: rename nodes to prepare for refactoring dts files
of/flattree: Update dtc to current mainline.
of/device: Don't register disabled devices
powerpc/dts: fix syntax bugs in bluestone.dts
of: Fixes for OF probing on little endian systems
of: make drivers depend on CONFIG_OF instead of CONFIG_PPC_OF
of/flattree: Add of_flat_dt_match() helper function
of_serial: explicitly include of_irq.h
of/flattree: Refactor unflatten_device_tree and add fdt_unflatten_tree
of/flattree: Reorder unflatten_dt_node
of/flattree: Refactor unflatten_dt_node
of/flattree: Add non-boottime device tree functions
of/flattree: Add Kconfig for EARLY_FLATTREE
...
Fix up trivial conflict in arch/sparc/prom/tree_32.c as per Grant.
Remove kobject.h from files which don't need it, notably,
sched.h and fs.h.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (36 commits)
serial: apbuart: Fixup apbuart_console_init()
TTY: Add tty ioctl to figure device node of the system console.
tty: add 'active' sysfs attribute to tty0 and console device
drivers: serial: apbuart: Handle OF failures gracefully
Serial: Avoid unbalanced IRQ wake disable during resume
tty: fix typos/errors in tty_driver.h comments
pch_uart : fix warnings for 64bit compile
8250: fix uninitialized FIFOs
ip2: fix compiler warning on ip2main_pci_tbl
specialix: fix compiler warning on specialix_pci_tbl
rocket: fix compiler warning on rocket_pci_ids
8250: add a UPIO_DWAPB32 for 32 bit accesses
8250: use container_of() instead of casting
serial: omap-serial: Add support for kernel debugger
serial: fix pch_uart kconfig & build
drivers: char: hvc: add arm JTAG DCC console support
RS485 documentation: add 16C950 UART description
serial: ifx6x60: fix memory leak
serial: ifx6x60: free IRQ on error
Serial: EG20T: add PCH_UART driver
...
Fixed up conflicts in drivers/serial/apbuart.c with evil merge that
makes the code look fairly sane (unlike either side).
RCU free the struct inode. This will allow:
- Subsequent store-free path walking patch. The inode must be consulted for
permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
to take i_lock no longer need to take sb_inode_list_lock to walk the list in
the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
page lock to follow page->mapping.
The downsides of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.
In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.
The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
dget_locked was a shortcut to avoid the lazy lru manipulation when we already
held dcache_lock (lru manipulation was relatively cheap at that point).
However, how that the lru lock is an innermost one, we never hold it at any
caller, so the lock cost can now be avoided. We already have well working lazy
dcache LRU, so it should be fine to defer LRU manipulations to scan time.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Protect d_unhashed(dentry) condition with d_lock. This means keeping
DCACHE_UNHASHED bit in synch with hash manipulations.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Make d_count non-atomic and protect it with d_lock. This allows us to ensure a
0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when
we start protecting many other dentry members with d_lock.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
This patch creates mpc5200b.dtsi containing the information for the MPC5200b
SoC then modifies all of the dts files for MPC5200b based systems to use
mpc5200b.dtsi.
Signed-off-by: John Bonesio <bones@secretlab.ca>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch changes some incorrect compatible strings on the local plus bus node
in dts files for MPC5200b based systems.
Signed-off-by: John Bonesio <bones@secretlab.ca>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch remove unused properties in dts files in preparation of refactoring
the dts files for MPC5200b based boards.
Signed-off-by: John Bonesio <bones@secretlab.ca>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch renames nodes in dts fils for MPC5200b files to prepare for
refactoring of these files later. When refactoring it will be easier to verify
the results if the node names aren't changing at the same time.
Signed-off-by: John Bonesio <bones@secretlab.ca>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>