On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
boundary. For example:
u64 val = PAGE_ALIGN(size);
always returns a value < 4GB even if size is greater than 4GB.
The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
example):
#define PAGE_SHIFT 12
#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE-1))
...
#define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK)
The "~" is performed on a 32-bit value, so everything in "and" with
PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary.
Using the ALIGN() macro seems to be the right way, because it uses
typeof(addr) for the mask.
Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in
include/linux/mm.h.
See also lkml discussion: http://lkml.org/lkml/2008/6/11/237
[akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
[akpm@linux-foundation.org: fix v850]
[akpm@linux-foundation.org: fix powerpc]
[akpm@linux-foundation.org: fix arm]
[akpm@linux-foundation.org: fix mips]
[akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
[akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
[akpm@linux-foundation.org: fix powerpc]
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Based upon a report by Daniel Smolik.
We do it too early, which triggers a BUG in
cpufreq_register_notifier().
Signed-off-by: David S. Miller <davem@davemloft.net>
We're calling request_irq() with a IRQs disabled.
No straightforward fix exists because we want to
enable these IRQs and setup state atomically before
getting into the IRQ handler the first time.
What happens now is that we mark the VIRQ to not be
automatically enabled by request_irq(). Then we
make explicit enable_irq() calls when we grab the
LDC channel.
This way we don't need to call request_irq() illegally
under the LDC channel lock any more.
Bump LDC version and release date.
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
remove CONFIG_KMOD from core kernel code
remove CONFIG_KMOD from lib
remove CONFIG_KMOD from sparc64
rework try_then_request_module to do less in non-modular kernels
remove mention of CONFIG_KMOD from documentation
make CONFIG_KMOD invisible
modules: Take a shortcut for checking if an address is in a module
module: turn longs into ints for module sizes
Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs
module: reorder struct module to save space on 64 bit builds
module: generic each_symbol iterator function
module: don't use stop_machine for waiting rmmod
One place is just a comment, the other a conditional, unused
inclusion of linux/kmod.h.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This converts all instances of bus_id in the sparc core kernel to use
either dev_set_name(), or dev_name() depending on the need.
This is done in anticipation of removing the bus_id field from struct
driver.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This allow to dynamically generate attributes and share show/store
functions between attributes. Right now most attributes are generated
by special macros and lots of duplicated code. With the attribute
passed it's instead possible to attach some data to the attribute
and then use that in shared low level functions to do different things.
I need this for the dynamically generated bank attributes in the x86
machine check code, but it'll allow some further cleanups.
I converted all users in tree to the new show/store prototype. It's a single
huge patch to avoid unbisectable sections.
Runtime tested: x86-32, x86-64
Compiled only: ia64, powerpc
Not compile tested/only grep converted: sh, arm, avr32
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kobjects do not have a limit in name size since a while, so stop
pretending that they do.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jack Ren and Eric Miao tracked down the following long standing
problem in the NOHZ code:
scheduler switch to idle task
enable interrupts
Window starts here
----> interrupt happens (does not set NEED_RESCHED)
irq_exit() stops the tick
----> interrupt happens (does set NEED_RESCHED)
return from schedule()
cpu_idle(): preempt_disable();
Window ends here
The interrupts can happen at any point inside the race window. The
first interrupt stops the tick, the second one causes the scheduler to
rerun and switch away from idle again and we end up with the tick
disabled.
The fact that it needs two interrupts where the first one does not set
NEED_RESCHED and the second one does made the bug obscure and extremly
hard to reproduce and analyse. Kudos to Jack and Eric.
Solution: Limit the NOHZ functionality to the idle loop to make sure
that we can not run into such a situation ever again.
cpu_idle()
{
preempt_disable();
while(1) {
tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we
are in the idle loop
while (!need_resched())
halt();
tick_nohz_restart_sched_tick(); <- disables NOHZ mode
preempt_enable_no_resched();
schedule();
preempt_disable();
}
}
In hindsight we should have done this forever, but ...
/me grabs a large brown paperbag.
Debugged-by: Jack Ren <jack.ren@marvell.com>,
Debugged-by: eric miao <eric.y.miao@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits)
Revert "x86/PCI: ACPI based PCI gap calculation"
PCI: remove unnecessary volatile in PCIe hotplug struct controller
x86/PCI: ACPI based PCI gap calculation
PCI: include linux/pm_wakeup.h for device_set_wakeup_capable
PCI PM: Fix pci_prepare_to_sleep
x86/PCI: Fix PCI config space for domains > 0
Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n
PCI: Simplify PCI device PM code
PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep
PCI ACPI: Rework PCI handling of wake-up
ACPI: Introduce new device wakeup flag 'prepared'
ACPI: Introduce acpi_device_sleep_wake function
PCI: rework pci_set_power_state function to call platform first
PCI: Introduce platform_pci_power_manageable function
ACPI: Introduce acpi_bus_power_manageable function
PCI: make pci_name use dev_name
PCI: handle pci_name() being const
PCI: add stub for pci_set_consistent_dma_mask()
PCI: remove unused arch pcibios_update_resource() functions
PCI: fix pci_setup_device()'s sprinting into a const buffer
...
Fixed up conflicts in various files (arch/x86/kernel/setup_64.c,
arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c,
drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86
and ACPI updates manually.
* 'tracing/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (228 commits)
ftrace: build fix for ftraced_suspend
ftrace: separate out the function enabled variable
ftrace: add ftrace_kill_atomic
ftrace: use current CPU for function startup
ftrace: start wakeup tracing after setting function tracer
ftrace: check proper config for preempt type
ftrace: trace schedule
ftrace: define function trace nop
ftrace: move sched_switch enable after markers
ftrace: prevent ftrace modifications while being kprobe'd, v2
fix "ftrace: store mcount address in rec->ip"
mmiotrace broken in linux-next (8-bit writes only)
ftrace: avoid modifying kprobe'd records
ftrace: freeze kprobe'd records
kprobes: enable clean usage of get_kprobe
ftrace: store mcount address in rec->ip
ftrace: build fix with gcc 4.3
namespacecheck: fixes
ftrace: fix "notrace" filtering priority
ftrace: fix printout
...
Today's linux-next build (spac64 allmodconfig) failed like this:
arch/sparc64/kernel/stacktrace.c:50: warning: type defaults to `int' in declaration of `EXPORT_SYMBOL_GPL'
arch/sparc64/kernel/stacktrace.c:50: warning: parameter names (without types) in function declaration
arch/sparc64/kernel/stacktrace.c:50: warning: data definition has no type or storage class
Signed-off-by: Stephen Rothwell <sf@canb.auug.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Also fixes up the sparc code that was assuming this is not a constant.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Andrew Morton reported this against linux-next:
ERROR: ".save_stack_trace" [tests/backtracetest.ko] undefined!
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Alexander Beregalov reported this build failure:
$ make CROSS_COMPILE=sparc64-unknown-linux-gnu- image modules && sudo
make modules_install
CHK include/linux/version.h
CHK include/linux/utsrelease.h
CALL scripts/checksyscalls.sh
CHK include/linux/compile.h
dnsdomainname: Unknown host
CC arch/sparc64/kernel/sparc64_ksyms.o
arch/sparc64/kernel/sparc64_ksyms.c:116: error: '_mcount' undeclared
here (not in a function)
cc1: warnings being treated as errors
arch/sparc64/kernel/sparc64_ksyms.c:116: error: type defaults to 'int'
in declaration of '_mcount'
And bisected it back to:
| commit 395a59d0f8
| Author: Abhishek Sagar <sagar.abhishek@gmail.com>
| Date: Sat Jun 21 23:47:27 2008 +0530
|
| ftrace: store mcount address in rec->ip
the mcount prototype is only available under CONFIG_FTRACE,
extend it to CONFIG_MCOUNT as well.
Reported-and-bisected-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It's never used and the comments refer to nonatomic and retry
interchangably. So get rid of it.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This adds kernel/smp.c which contains helpers for IPI function calls. In
addition to supporting the existing smp_call_function() in a more efficient
manner, it also adds a more scalable variant called smp_call_function_single()
for calling a given function on a single CPU only.
The core of this is based on the x86-64 patch from Nick Piggin, lots of
changes since then. "Alan D. Brunelle" <Alan.Brunelle@hp.com> has
contributed lots of fixes and suggestions as well. Also thanks to
Paul E. McKenney <paulmck@linux.vnet.ibm.com> for reviewing RCU usage
and getting rid of the data allocation fallback deadlock.
Acked-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Record the address of the mcount call-site. Currently all archs except sparc64
record the address of the instruction following the mcount call-site. Some
general cleanups are entailed. Storing mcount addresses in rec->ip enables
looking them up in the kprobe hash table later on to check if they're kprobe'd.
Signed-off-by: Abhishek Sagar <sagar.abhishek@gmail.com>
Cc: davem@davemloft.net
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When we fully commit to returning back to kernel mode from
a trap, zero out the regs->magic value to prevent false
positives during stack backtraces.
Signed-off-by: David S. Miller <davem@davemloft.net>
The offset to the pt_regs area was wrong, so we weren't
looking at the right location for the magic cookie.
A trap frame is composed of a "struct sparc_stackf" then
a "struct pt_regs", the code was using "struct reg_window"
instead of "struct sparc_stackf".
Signed-off-by: David S. Miller <davem@davemloft.net>
Because of the silly way I set up the initial stack for
new kernel threads, there is a loop at the top of the
stack.
To fix this, properly add another stack frame that is copied
from the parent and terminate it in the child by setting
the frame pointer in that frame to zero.
Signed-off-by: David S. Miller <davem@davemloft.net>
When a cpu really is stuck in the kernel, it can be often
impossible to figure out which cpu is stuck where. The
worst case is when the stuck cpu has interrupts disabled.
Therefore, implement a global cpu state capture that uses
SMP message interrupts which are not disabled by the
normal IRQ enable/disable APIs of the kernel.
As long as we can get a sysrq 'y' to the kernel, we can
get a dump. Even if the console interrupt cpu is wedged,
we can trigger it from userspace using /proc/sysrq-trigger
The output is made compact so that this facility is more
useful on high cpu count systems, which is where this
facility will likely find itself the most useful :)
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the CVS keywords that weren't updated for a long time
from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just like mmap, we need to validate address ranges regardless
of MAP_FIXED.
sparc{,64}_mmap_check()'s flag argument is unused, remove.
Based upon a report and preliminary patch by
Jan Lieskovsky <jlieskov@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So, forever, we've had this ptrace_signal_deliver implementation
which tries to handle all of the nasties that can occur when the
debugger looks at a process about to take a signal. It's meant
to address all of these issues inside of the kernel so that the
debugger need not be mindful of such things.
Problem is, this doesn't work.
The idea was that we should do the syscall restart business first, so
that the debugger captures that state. Otherwise, if the debugger for
example saves the child's state, makes the child execute something
else, then restores the saved state, we won't handle the syscall
restart properly because we lose the "we're in a syscall" state.
The code here worked for most cases, but if the debugger actually
passes the signal through to the child unaltered, it's possible that
we would do a syscall restart when we shouldn't have.
In particular this breaks the case of debugging a process under a gdb
which is being debugged by yet another gdb. gdb uses sigsuspend
to wait for SIGCHLD of the inferior, but if gdb itself is being
debugged by a top-level gdb we get a ptrace_stop(). The top-level gdb
does a PTRACE_CONT with SIGCHLD to let the inferior gdb see the
signal. But ptrace_signal_deliver() assumed the debugger would cancel
out the signal and therefore did a syscall restart, because the return
error was ERESTARTNOHAND.
Fix this by simply making ptrace_signal_deliver() a nop, and providing
a way for the debugger to control system call restarting properly:
1) Report a "in syscall" software bit in regs->{tstate,psr}.
It is set early on in trap entry to a system call and is fully
visible to the debugger via ptrace() and regsets.
2) Test this bit right before doing a syscall restart. We have
to do a final recheck right after get_signal_to_deliver() in
case the debugger cleared the bit during ptrace_stop().
3) Clear the bit in trap return so we don't accidently try to set
that bit in the real register.
As a result we also get a ptrace_{is,clear}_syscall() for sparc32 just
like sparc64 has.
M68K has this same exact bug, and is now the only other user of the
ptrace_signal_deliver hook. It needs to be fixed in the same exact
way as sparc.
Signed-off-by: David S. Miller <davem@davemloft.net>
Forever we had a PTRACE_SUNOS_DETACH which was unconditionally
recognized, regardless of the personality of the process.
Unfortunately, this value is what ended up in the GLIBC sys/ptrace.h
header file on sparc as PTRACE_DETACH and PT_DETACH.
So continue to recognize this old value. Luckily, it doesn't conflict
with anything we actually care about.
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to be more liberal about the alignment of the buffer given to
us by sigaltstack(). The user should not need to be mindful of all of
the alignment constraints we have for the stack frame.
This mirrors how we handle this situation in clone() as well.
Also, we align the stack even in non-SA_ONSTACK cases so that signals
due to bad stack alignment can be delivered properly. This makes such
errors easier to debug and recover from.
Finally, add the sanity check x86 has to make sure we won't overflow
the signal stack.
This fixes glibc testcases nptl/tst-cancel20.c and
nptl/tst-cancelx20.c
Signed-off-by: David S. Miller <davem@davemloft.net>
We clobber %i1 as well as %i0 for these system calls,
because they give two return values.
Therefore, on error, we have to restore %i1 properly
or else the restart explodes since it uses the wrong
arguments.
This fixes glibc's nptl/tst-eintr1.c testcase.
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 2664ef44cf.
Ingo moved around where the softlockup dependency sits
so this change is no longer necessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
The change I put into copy_thread() just papered over the real
problem.
When we are looking to see if we should do a syscall restart, when
deliverying a signal, we should only interpret the syscall return
value as an error if the carry condition code(s) are set.
Otherwise it's a success return.
Also, sigreturn paths should do a pt_regs_clear_trap_type().
It turns out that doing a syscall restart when returning from a fork()
does and should happen, from time to time. Even if copy_thread()
returns success, copy_process() can still unwind and signal
-ERESTARTNOINTR in the parent.
Signed-off-by: David S. Miller <davem@davemloft.net>
It just creates confusion, errors, and bugs.
For one thing, this can cause dup sysfs or procfs nodes to get
created:
[ 1.198015] proc_dir_entry '00.0' already registered
[ 1.198036] Call Trace:
[ 1.198052] [00000000004f2534] create_proc_entry+0x7c/0x98
[ 1.198092] [00000000005719e4] pci_proc_attach_device+0xa4/0xd4
[ 1.198126] [00000000007d991c] pci_proc_init+0x64/0x88
[ 1.198158] [00000000007c62a4] kernel_init+0x190/0x330
[ 1.198183] [0000000000426cf8] kernel_thread+0x38/0x48
[ 1.198210] [00000000006a0d90] rest_init+0x18/0x5c
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove dulicated include file <asm/timer.h> in arch/sparc64/kernel/smp.c.
Signed-off-by: Huang Weiyi <hwy@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current limitations:
1) On SMP single stepping has some fundamental issues,
shared with other sw single-step architectures such
as mips and arm.
2) On 32-bit sparc we don't support SMP kgdb yet. That
requires some reworking of the IPI mechanisms and
infrastructure on that platform.
Signed-off-by: David S. Miller <davem@davemloft.net>
entry.S was a hodge-podge of several totally unrelated
sets of assembler routines, ranging from FPU trap handlers
to hypervisor call functions.
Split it up into topic-sized pieces.
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a regression added by
238468b2ac ("[SPARC64]: Use trap type
stored in pt_regs to handle syscall restart.")
Because we now encode the "returning from syscall" status in the
pt_regs area, we have to be mindful to zap it out in the child
of a fork.
During a parallel kernel build I saw an accidental -EINTR return
from vfork() in 'make' because of this bug.
Signed-off-by: David S. Miller <davem@davemloft.net>
If we use this from more than one place, it's better to
have helpers instead of twiddling magic constants all
over.
Add pt_regs_trap_type(), pt_regs_clear_trap_type(), and
pt_regs_is_syscall().
Use them in do_signal().
Signed-off-by: David S. Miller <davem@davemloft.net>
Back around the same time we were bootstrapping the first 32-bit sparc
Linux kernel with a SunOS userland, we made the signal frame match
that of SunOS.
By the time we even started putting together a native Linux userland
for 32-bit Sparc we realized this layout wasn't sufficient for Linux's
needs.
Therefore we changed the layout, yet kept support for the old style
signal frame layout in there. The detection mechanism is that we had
sys_sigaction() start passing in a negative signal number to indicate
"new style signal frames please".
Anyways, no binaries exist in the world that use the old stuff. In
fact, I bet Jakub Jelinek and myself are the only two people who ever
had such binaries to be honest.
So let's get rid of this stuff.
I added an assertion using WARN_ON_ONCE() that makes sure 32-bit
applications are passing in that negative signal number still.
Signed-off-by: David S. Miller <davem@davemloft.net>
I must have disabled this due to other bugs which were fixed over
time. And this is needed in order for child devices of "pmu"
to get proper resource values.
Signed-off-by: David S. Miller <davem@davemloft.net>
It's completely superfluous, CONFIG_COMPAT is sufficient.
What this used to be is an umbrella for enabling code shared
by all 32-bit compat binary support types. But with the
removal of SunOS and Solaris support, the only one left is
Linux 32-bit ELF.
Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
Kernel bugzilla 10273
As reported by Jos van der Ende, ever since commit
5a606b72a4 ("[SPARC64]: Do not ACK an
INO if it is disabled or inprogress.") sun4u interrupts
can get stuck.
What this changset did was add the following conditional to
the various IRQ chip ->enable() handlers on sparc64:
if (unlikely(desc->status & (IRQ_DISABLED|IRQ_INPROGRESS)))
return;
which is correct, however it means that special care is needed
in the ->enable() method.
Specifically we must put the interrupt into IDLE state during
an enable, or else it might never be sent out again.
Setting the INO interrupt state to IDLE resets the state machine,
the interrupt input to the INO is retested by the hardware, and
if an interrupt is being signalled by the device, the INO
moves back into TRANSMIT state, and an interrupt vector is sent
to the cpu.
The two sun4v IRQ chip handlers were already doing this properly,
only sun4u got it wrong.
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise all sorts of bad things can happen, including
spurious softlockup reports.
Other platforms have this same bug, in one form or
another, just don't see the issue because they
don't sleep as long as sparc64 can in NOHZ.
Thanks to some brilliant debugging by Peter Zijlstra.
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we have a magic cookie in the pt_regs, we can
properly detect trap frames in stack bactraces.
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we indicate the "restart system call" in the
trap type field of pt_regs->magic, we don't need to
set the %l6 boolean in all of the trap return paths.
And we therefore don't need to pass it to do_notify_resume().
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we can check the trap type directly, we don't need the
funny restart_syscall indication from the trap return paths.
Signed-off-by: David S. Miller <davem@davemloft.net>
This sets us up for several simplifications and facilities:
1) The magic cookie lets us identify trap frames more
accurately in stack backtraces.
2) The trap type lets us simplify all of the "are we in
a syscall" state management and checks.
3) We can now see if a task off the cpu is sleeping in
a system call or not. In fact, we can see what
trap it is sleeping in whatever the type. The utrace
guys will use this.
Based upon some discussions with Roland McGrath.
Signed-off-by: David S. Miller <davem@davemloft.net>
The following cleanups are now possible:
- arch/sparc64/kernel/entry.S:ret_sys_call no longer has to be global
- arch/sparc64/kernel/sparc64_ksyms.c:
remove no longer used prototypes
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there is only code to parse NUMA attributes on
sun4v/niagara systems, but later on we will add such parsing
for older systems.
Signed-off-by: David S. Miller <davem@davemloft.net>
We have to do it like this before we can move the PROM and MDESC device
tree code over to using lmb_alloc().
Signed-off-by: David S. Miller <davem@davemloft.net>
None of these files use any of the functionality promised by
asm/semaphore.h. It's possible that they rely on it dragging in some
unrelated header file, but I can't build all these files, so we'll have
fix any build failures as they come up.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility. Thanks to Peter Zijlstra for fixing the lockdep
warning. Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
1) ptrace should pass 'current' to task_user_regset_view()
2) When fetching general registers using a 64-bit view, and
the target is 32-bit, we have to convert.
3) Skip the whole register window get/set code block if
the user isn't asking to access anything in there.
Otherwise we have problems if the user doesn't have
an address space setup. Fetching ptrace register is
still valid at such a time, and ptrace does not try
to access the register window area of the regset.
Signed-off-by: David S. Miller <davem@davemloft.net>
The calculation of the FPU reg save area pointer
was wrong.
Based upon an OOPS report from Tom Callaway.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some IOMMUs allocate memory areas spanning LLD's segment boundary limit. It
forces low level drivers to have a workaround to adjust scatter lists that the
IOMMU builds. We are in the process of making all the IOMMUs respect the
segment boundary limits to remove such work around in LLDs.
SPARC64 IOMMUs were rewritten to use the IOMMU helper functions and the commit
89c94f2f70 made the IOMMUs not allocate memory
areas spanning the segment boundary limit.
However, SPARC64 IOMMUs allocate memory areas first then try to merge them
(while some IOMMUs walk through all the sg entries to see how they can be
merged first and allocate memory areas). So SPARC64 IOMMUs also need the
boundary limit checking when they try to merge sg entries.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sparse still doesn't like the funny cast we make from a scalar to a
"union semun" (which is correct by the C language and in particular
works with the sparc64 calling conventions, but sparse doesn't grok
that yet).
Signed-off-by: David S. Miller <davem@davemloft.net>
Add 'UL' markers to DCU_* macros.
Declare C functions called from assembler in entry.h
Declare C functions called from within the sparc64 arch
code in include/asm-sparc64/*.h headers as appropriate.
Remove unused routines in traps.c
Signed-off-by: David S. Miller <davem@davemloft.net>
We create a local header file entry.h, under arch/sparc64/kernel/,
that we can use to declare routines either defined in assembler
or only invoked from assembler. As well as other data objects
which are private to the inner sparc64 kernel arch code.
Signed-off-by: David S. Miller <davem@davemloft.net>
Doing a 'flushw' every stack trace capture creates so much overhead
that it makes lockdep next to unusable.
We only care about the frame pointer chain and the function caller
program counters, so flush those by hand to the stack frame.
This is significantly more efficient than a 'flushw' because:
1) We only save 16 bytes per active register window to the stack.
2) This doesn't push the entire register window context of the current
call chain out of the cpu, forcing register window fill traps as we
return back down.
Note that we can't use 'restore' and 'save' instructions to move
around the register windows because that wouldn't work on Niagara
processors. They optimize 'save' into a new register window by
simply clearing out the registers instead of pulling them in from
the on-chip register window backing store.
Based upon a report by Tom Callaway.
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: exec PT_DTRACE
[SPARC64]: Use shorter list_splice_init() for brevity.
[SPARC64]: Remove most limitations to kernel image size.
The PT_DTRACE flag is meaningless and obsolete.
Don't touch it.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently kernel images are limited to 8MB in size, and this causes
problems especially when enabling features that take up a lot of
kernel image space such as lockdep.
The code now will align the kernel image size up to 4MB and map that
many locked TLB entries. So, the only practical limitation is the
number of available locked TLB entries which is 16 on Cheetah and 64
on pre-Cheetah sparc64 cpus. Niagara cpus don't actually have hw
locked TLB entry support. Rather, the hypervisor transparently
provides support for "locked" TLB entries since it runs with physical
addressing and does the initial TLB miss processing.
Fully utilizing this change requires some help from SILO, a patch for
which will be submitted to the maintainer. Essentially, SILO will
only currently map up to 8MB for the kernel image and that needs to be
increased.
Note that neither this patch nor the SILO bits will help with network
booting. The openfirmware code will only map up to a certain amount
of kernel image during a network boot and there isn't much we can to
about that other than to implemented a layered network booting
facility. Solaris has this, and calls it "wanboot" and we may
implement something similar at some point.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix following warnings:
WARNING: vmlinux.o(.text+0x4b258): Section mismatch in reference from the function dr_cpu_data() to the function .devinit.text:mdesc_fill_in_cpu_data()
WARNING: vmlinux.o(.text+0x4b290): Section mismatch in reference from the function dr_cpu_data() to the function .cpuinit.text:cpu_up()
mdesc_fill_in_cpu_data() is only used during early init and for
cpu hotplug so the __cpuinit annotation is the correct choice.
We have the call chain:
dr_cpu_data() => dr_cpu_configure() => mdesc_fill_in_cpu_data()
dr_cpu_data() is used only during early init and for cpu
hotplug. So annotating them all __cpuinit solves the
section mismatch and should be correct.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
arch/sparc64/kernel/process.c:504:17: warning: symbol 'sparc_do_fork' was not declared. Should it be static?
arch/sparc64/kernel/process.c:655:5: warning: symbol 'dump_fpu' was not declared. Should it be static?
arch/sparc64/kernel/process.c:708:16: warning: symbol 'sparc_execve' was not declared. Should it be static?
Signed-off-by: David S. Miller <davem@davemloft.net>
arch/sparc64/kernel/process.c:219:6: warning: symbol '__show_regs' was not declared. Should it be static?
Signed-off-by: David S. Miller <davem@davemloft.net>
Noticed via sparse:
arch/sparc64/kernel/process.c:215:6: warning: symbol 'show_stackframe' was not declared. Should it be static?
arch/sparc64/kernel/process.c:243:6: warning: symbol 'show_stackframe32' was not declared. Should it be static?
It is totally unused.
Signed-off-by: David S. Miller <davem@davemloft.net>
arch/sparc64/kernel/process.c:123:6: warning: symbol 'machine_alt_power_off' was not declared. Should it be static?
Signed-off-by: David S. Miller <davem@davemloft.net>
And also it's helper function pci_is_controller(). Both
are unused.
I can't remove the equivalent from sparc32 yet as some
ancient bus probing code still uses that platform's version.
Signed-off-by: David S. Miller <davem@davemloft.net>
The idea of this thing is we could save/restore the firmware's
palette when breaking in and out of the firmware prompt.
Only one driver implemented this (atyfb) and it's value is
questionable. If you're just debugging you don't really
care that the characters end up being purple or whatever.
And we can provide better debugging and firmware command
facilities with minimal in-kernel console I/O drivers.
Signed-off-by: David S. Miller <davem@davemloft.net>
The functions time_before, time_before_eq, time_after, and
time_after_eq are more robust for comparing jiffies against other
values.
So following patch implements usage of the time_after() macro, defined
at linux/jiffies.h, which deals with wrapping correctly
Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are no callers of this on the Sparc platforms.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mimicks almost perfectly the powerpc IOMMU code, except that it
doesn't have the IOMMU_PAGE_SIZE != PAGE_SIZE handling, and it also
lacks the device dma mask support bits.
I'll add that later as time permits, but this gets us at least back to
where we were beforehand.
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: Make use of the new fs/compat_binfmt_elf.c
[SPARC64]: Make use of compat_sys_ptrace()
Manually fixed trivial delete/modift conflict in arch/sparc64/kernel/binfmt_elf32.c
Suppress A.OUT library support if CONFIG_ARCH_SUPPORTS_AOUT is not set.
Not all architectures support the A.OUT binfmt, so the ELF binfmt should not
be permitted to go looking for A.OUT libraries to load in such a case. Not
only that, but under such conditions A.OUT core dumps are not produced either.
To make this work, this patch also does the following:
(1) Makes the existence of the contents of linux/a.out.h contingent on
CONFIG_ARCH_SUPPORTS_AOUT.
(2) Renames dump_thread() to aout_dump_thread() as it's only called by A.OUT
core dumping code.
(3) Moves aout_dump_thread() into asm/a.out-core.h and makes it inline. This
is then included only where needed. This means that this bit of arch
code will be stored in the appropriate A.OUT binfmt module rather than
the core kernel.
(4) Drops A.OUT support for Blackfin (according to Mike Frysinger it's not
needed) and FRV.
This patch depends on the previous patch to move STACK_TOP[_MAX] out of
asm/a.out.h and into asm/processor.h as they're required whether or not A.OUT
format is available.
[jdike@addtoit.com: uml: re-remove accidentally restored code]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Supporting SunOS ptrace() is pretty pointless and these
kinds of quirks keep us from being able to share more
code with other platforms.
Signed-off-by: David S. Miller <davem@davemloft.net>
The early per-cpu handling needs a slight tweak to work when booting
on a non-zero cpu.
We got away with this for a long time, but can't any longer as now
even printk() calls functions (cpu_clock() for example) that thus make
early references to per-cpu variables.
Signed-off-by: David S. Miller <davem@davemloft.net>
calibrate_delay() must be __cpuinit, not __{dev,}init.
I've verified that this is correct for all users.
While doing the latter, I also did the following cleanups:
- remove pointless additional prototypes in C files
- ensure all users #include <linux/delay.h>
This fixes the following section mismatches with CONFIG_HOTPLUG=n,
CONFIG_HOTPLUG_CPU=y:
WARNING: vmlinux.o(.text+0x1128d): Section mismatch: reference to .init.text.1:calibrate_delay (between 'check_cx686_slop' and 'set_cx86_reorder')
WARNING: vmlinux.o(.text+0x25102): Section mismatch: reference to .init.text.1:calibrate_delay (between 'smp_callin' and 'cpu_coregroup_map')
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Zankel <chris@zankel.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NR_OPEN (historically set to 1024*1024) actually forbids processes to open
more than 1024*1024 handles.
Unfortunatly some production servers hit the not so 'ridiculously high
value' of 1024*1024 file descriptors per process.
Changing NR_OPEN is not considered safe because of vmalloc space potential
exhaust.
This patch introduces a new sysctl (/proc/sys/fs/nr_open) wich defaults to
1024*1024, so that admins can decide to change this limit if their workload
needs it.
[akpm@linux-foundation.org: export it for sparc64]
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- All implementations can be __devinit
- The function prototypes were in asm/timex.h but they all must be the same,
so create a single declaration in linux/timex.h.
- uninline the sparc64 version to match the other architectures
- Don't bother #defining ARCH_HAS_READ_CURRENT_TIMER to a particular value.
[ezk@cs.sunysb.edu: fix build]
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Changeset fde6a3c82d ("iommu sg merging:
sparc64: make iommu respect the segment size limits") broke sparc64
because whilst it added the segment limiting code to the first pass of
SG mapping (in prepare_sg()) it did not add matching code to the
second pass handling (in fill_sg())
As a result the two passes disagree where the segment boundaries
should be, resulting in OOPSes, DMA corruption, and corrupted
superblocks.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the new timerfd API as it is implemented by the following patch:
int timerfd_create(int clockid, int flags);
int timerfd_settime(int ufd, int flags,
const struct itimerspec *utmr,
struct itimerspec *otmr);
int timerfd_gettime(int ufd, struct itimerspec *otmr);
The timerfd_create() API creates an un-programmed timerfd fd. The "clockid"
parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME.
The timerfd_settime() API give new settings by the timerfd fd, by optionally
retrieving the previous expiration time (in case the "otmr" parameter is not
NULL).
The time value specified in "utmr" is absolute, if the TFD_TIMER_ABSTIME bit
is set in the "flags" parameter. Otherwise it's a relative time.
The timerfd_gettime() API returns the next expiration time of the timer, or
{0, 0} if the timerfd has not been set yet.
Like the previous timerfd API implementation, read(2) and poll(2) are
supported (with the same interface). Here's a simple test program I used to
exercise the new timerfd APIs:
http://www.xmailserver.org/timerfd-test2.c
[akpm@linux-foundation.org: coding-style cleanups]
[akpm@linux-foundation.org: fix ia64 build]
[akpm@linux-foundation.org: fix m68k build]
[akpm@linux-foundation.org: fix mips build]
[akpm@linux-foundation.org: fix alpha, arm, blackfin, cris, m68k, s390, sparc and sparc64 builds]
[heiko.carstens@de.ibm.com: fix s390]
[akpm@linux-foundation.org: fix powerpc build]
[akpm@linux-foundation.org: fix sparc64 more]
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WARNING: vmlinux.o(.text+0x39be4): Section mismatch in reference from the function probe_existing_entries() to the function .init.text:page_in_phys_avail()
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the deprecated __attribute_used__.
[Introduce __section in a few places to silence checkpatch /sam]
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
This patch consolidate all definitions of .init.text, .init.data
and .exit.text, .exit.data section definitions in
the generic vmlinux.lds.h.
This is a preparational patch - alone it does not buy
us much good.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Fix following Section mismatch warning in sparc64:
WARNING: arch/sparc64/kernel/built-in.o(.text+0x13dec): Section mismatch: reference to .devinit.text:pci_scan_one_pbm (between 'psycho_scan_bus' and 'psycho_pbm_init')
WARNING: arch/sparc64/kernel/built-in.o(.text+0x14b58): Section mismatch: reference to .devinit.text:pci_scan_one_pbm (between 'sabre_scan_bus' and 'sabre_init')
WARNING: arch/sparc64/kernel/built-in.o(.text+0x15ea4): Section mismatch: reference to .devinit.text:pci_scan_one_pbm (between 'schizo_scan_bus' and 'schizo_pbm_init')
WARNING: arch/sparc64/kernel/built-in.o(.text+0x17780): Section mismatch: reference to .devinit.text:pci_scan_one_pbm (between 'pci_sun4v_scan_bus' and 'pci_sun4v_get_head')
WARNING: arch/sparc64/kernel/built-in.o(.text+0x17d5c): Section mismatch: reference to .devinit.text:pci_scan_one_pbm (between 'pci_fire_scan_bus' and 'pci_fire_get_head')
WARNING: arch/sparc64/kernel/built-in.o(.text+0x23860): Section mismatch: reference to .devinit.text:vio_dev_release (between 'vio_create_one' and 'vio_add')
WARNING: arch/sparc64/kernel/built-in.o(.text+0x23868): Section mismatch: reference to .devinit.text:vio_dev_release (between 'vio_create_one' and 'vio_add')
The pci_* were all missing __init annotations.
For the vio.c case it was a function with a wrong annotation which was removed.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) Trap level wasn't being passed down properly, we need to
move it from %l4 into the correct outgoing arg register.
2) Although the TPC often provides the most direct clue, we
have the caller PC so we should provide that as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
This was caught and identified by Greg Onufer.
Since we setup the 256M/4M bitmap table after taking over the trap
table, it's possible for some 4M mapping to get loaded in the TLB
beforhand which later will be 256M mappings.
This can cause illegal TLB multiple-match conditions. Fix this by
setting up the bitmap before we take over the trap table.
Next, __flush_tlb_all() was not doing anything on hypervisor
platforms. Fix by adding sun4v_mmu_demap_all() and calling it.
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to mask out the proper bits when testing the dispatch status
register else we can see unrelated NACK bits from previous cross call
sends.
Signed-off-by: David S. Miller <davem@davemloft.net>
get_cpu() always returns zero on non-SMP builds, but we
really want the physical cpu number in this code in order
to do the right thing.
Based upon a non-SMP kernel boot failure report from Bernd Zeimetz.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds checking for possible NULL pointer dereference
if of_find_property() failed.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There should be a pci_dev_put when breaking out of a loop that iterates
over calls to pci_get_device and similar functions.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can't export verify_compat_iovec when CONFIG_NET is
disabled, and consequently the Solaris compat module
should also depend upon CONFIG_NET.
Signed-off-by: David S. Miller <davem@davemloft.net>
Looks like the MAP_FIXED case is using the wrong address hint. I'd
expect the comment "don't mess with it" means pass the request
straight on through, not change the address requested to -ENOMEM.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
zero out dma_length in the entry immediately following the last mapped
entry for unmap_sg.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
A few trivial Makefile cleanups
- dependencipes in head.o was all wrong - deleted
- CMODEL_CFLAG was not used anywhere
- NEW_GCC was then not used outside sparc64/Makefe - do not export it
- FIXME seems not appropriate - all other put oprofile in drivers-y too
- No reason to do -I. (and it still builds)
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Invoke the desc->handle_irq directly in the top-level dispatch,
just like other sophisticated ports.
This will allow us to decrease the cost of the MSI queue dispatch.
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the easiest things to isolate is the pid printed in kernel log.
There was a patch, that made this for arch-independent code, this one makes
so for arch/xxx files.
It took some time to cross-compile it, but hopefully these are all the
printks in arch code.
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the largest patch in the set. Make all (I hope) the places where
the pid is shown to or get from user operate on the virtual pids.
The idea is:
- all in-kernel data structures must store either struct pid itself
or the pid's global nr, obtained with pid_nr() call;
- when seeking the task from kernel code with the stored id one
should use find_task_by_pid() call that works with global pids;
- when showing pid's numerical value to the user the virtual one
should be used, but however when one shows task's pid outside this
task's namespace the global one is to be used;
- when getting the pid from userspace one need to consider this as
the virtual one and use appropriate task/pid-searching functions.
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: nuther build fix]
[akpm@linux-foundation.org: yet nuther build fix]
[akpm@linux-foundation.org: remove unneeded casts]
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Paul Menage <menage@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Also of_unregister_driver. These will be shortly also used by the
PowerPC code.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not use *alloc_bootmem_low*(), because ARCH_LOW_ADDRESS_LIMIT
is 4GB and this results in boot failures if all of the physical
memory in the machine is above 4GB.
Signed-off-by: David S. Miller <davem@davemloft.net>
All asm/ipc.h files do only #include <asm-generic/ipc.h>.
This patch therefore removes all include/asm-*/ipc.h files and moves the
contents of include/asm-generic/ipc.h to include/linux/ipc.h.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For some time /proc/sys/kernel/core_pattern has been able to set its output
destination as a pipe, allowing a user space helper to receive and
intellegently process a core. This infrastructure however has some
shortcommings which can be enhanced. Specifically:
1) The coredump code in the kernel should ignore RLIMIT_CORE limitation
when core_pattern is a pipe, since file system resources are not being
consumed in this case, unless the user application wishes to save the core,
at which point the app is restricted by usual file system limits and
restrictions.
2) The core_pattern code should be able to parse and pass options to the
user space helper as an argv array. The real core limit of the uid of the
crashing proces should also be passable to the user space helper (since it
is overridden to zero when called).
3) Some miscellaneous bugs need to be cleaned up (specifically the
recognition of a recursive core dump, should the user mode helper itself
crash. Also, the core dump code in the kernel should not wait for the user
mode helper to exit, since the same context is responsible for writing to
the pipe, and a read of the pipe by the user mode helper will result in a
deadlock.
This patch:
Remove the check of RLIMIT_CORE if core_pattern is a pipe. In the event that
core_pattern is a pipe, the entire core will be fed to the user mode helper.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Cc: <martin.pitt@ubuntu.com>
Cc: <wwoods@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch single-linked binfmt formats list to usual list_head's. This leads
to one-liners in register_binfmt() and unregister_binfmt(). The downside
is one pointer more in struct linux_binfmt. This is not a problem, since
the set of registered binfmts on typical box is very small -- (ELF +
something distro enabled for you).
Test-booted, played with executable .txt files, modprobe/rmmod binfmt_misc.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 2c941a2040 looks incomplete. The
helper functions like prepare_sg() need to support sg chaining too.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Introduce architecture dependent kretprobe blacklists to prohibit users
from inserting return probes on the function in which kprobes can be
inserted but kretprobes can not.
This patch also removes "__kprobes" mark from "__switch_to" on x86_64 and
registers "__switch_to" to the blacklist on x86-64, because that mark is to
prohibit user from inserting only kretprobe.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert cpu_sibling_map from a static array sized by NR_CPUS to a per_cpu
variable. This saves sizeof(cpumask_t) * NR unused cpus. Access is mostly
from startup and CPU HOTPLUG functions.
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This updates the sparc64 iommu/pci dma mappers to sg chaining.
Acked-by: David S. Miller <davem@davemloft.net>
Later updated to newer kernel with unified sparc64 iommu sg handling.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
It no longer translates to "real irqs" (aka. INO buckets)
so reflect that by using a simpler name for it.
Signed-off-by: David S. Miller <davem@davemloft.net>
All the users go through virt_irq_to_bucket() and essentially
want to go from a virt_irq to an INO, but we have a way
to do that already via virt_to_real_irq_table[].dev_ino.
This also allows us to kill both virt_to_real_irq() and
virt_irq_to_bucket().
Signed-off-by: David S. Miller <davem@davemloft.net>
We have a place to stick INO information in the
virt_to_real_irq_table[], which is currently only used for VIRQs.
And that is readily accessible from the one __irq_ino() call site.
Signed-off-by: David S. Miller <davem@davemloft.net>
We were simply concatenating the devhandle and devino and using that
as the cookie, which defeats the entire purpose of the VIRQ hypervisor
interfaces.
Now that we use physical addresses for the INO buckets, we can
allocate them dynamically for VIRQs and encode the cookies as
~__pa(bucket). This allows us to test for and decode the cookie with
a simple:
brlz $reg1, 1f
xnor $reg1, %g0, $reg2
sequence.
This works because bit 64 is never set in traditional
INO vectors, and it is also never set in a physical
address. So xnor'ing the physical address of the bucket
always gives us a negative number, and thus a unique
condition we can test cheaply.
Inspired by ideas from Greg Onufer.
Signed-off-by: David S. Miller <davem@davemloft.net>