I found an issue in cfi_cmdset0001.c. It is related to cache region
invalidation in the buffered write procedure.
The code performs cache invalidation from "cmd_addr" to "cmd_adr + len" in
do_write_buffer() while we modify region from "adr" to "adr+len".
This issue affects writes + reads of data by small chunks.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I'm seeing a kernel panic on an ES7000-600 when booting in virtual wire
mode. The panic happens because smp_read_mpc() is passed a physical
address, and it should be virtual. I tested the attached patch on the
ES7000-600 and on a 2 cpu Dell box, and saw no problems on either.
Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some allocations are restricted to a limited set of nodes (due to memory
policies or cpuset constraints). If the page allocator is not able to find
enough memory then that does not mean that overall system memory is low.
In particular going postal and more or less randomly shooting at processes
is not likely going to help the situation but may just lead to suicide (the
whole system coming down).
It is better to signal to the process that no memory exists given the
constraints that the process (or the configuration of the process) has
placed on the allocation behavior. The process may be killed but then the
sysadmin or developer can investigate the situation. The solution is
similar to what we do when running out of hugepages.
This patch adds a check before we kill processes. At that point
performance considerations do not matter much so we just scan the zonelist
and reconstruct a list of nodes. If the list of nodes does not contain all
online nodes then this is a constrained allocation and we should kill the
current process.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In the badness() calculation, there's currently this piece of code:
/*
* Processes which fork a lot of child processes are likely
* a good choice. We add the vmsize of the children if they
* have an own mm. This prevents forking servers to flood the
* machine with an endless amount of children
*/
list_for_each(tsk, &p->children) {
struct task_struct *chld;
chld = list_entry(tsk, struct task_struct, sibling);
if (chld->mm = p->mm && chld->mm)
points += chld->mm->total_vm;
}
The intention is clear: If some server (apache) keeps spawning new children
and we run OOM, we want to kill the father rather than picking a child.
This -- to some degree -- also helps a bit with getting fork bombs under
control, though I'd consider this a desirable side-effect rather than a
feature.
There's one problem with this: No matter how many or few children there are,
if just one of them misbehaves, and all others (including the father) do
everything right, we still always kill the whole family. This hits in real
life; whether it's javascript in konqueror resulting in kdeinit (and thus the
whole KDE session) being hit or just a classical server that spawns children.
Sidenote: The killer does kill all direct children as well, not only the
selected father, see oom_kill_process().
The idea in attached patch is that we do want to account the memory
consumption of the (direct) children to the father -- however not fully.
This maintains the property that fathers with too many children will still
very likely be picked, whereas a single misbehaving child has the chance to
be picked by the OOM killer.
In the patch I account only half (rounded up) of the children's vm_size to
the parent. This means that if one child eats more mem than the rest of
the family, it will be picked, otherwise it's still the father and thus the
whole family that gets selected.
This is heuristics -- we could debate whether accounting for a fourth would
be better than for half of it. Or -- if people would consider it worth the
trouble -- make it a sysctl. For now I sticked to accounting for half,
which should IMHO be a significant improvement.
The patch does one more thing: As users tend to be irritated by the choice
of killed processes (mainly because the children are killed first, despite
some of them having a very low OOM score), I added some more output: The
selected (father) process will be reported first and it's oom_score printed
to syslog.
Description:
Only account for half of children's vm size in oom score calculation
This should still give the parent enough point in case of fork bombs. If
any child however has more than 50% of the vm size of all children
together, it'll get a higher score and be elected.
This patch also makes the kernel display the oom_score.
Signed-off-by: Kurt Garloff <garloff@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch makes ata_sg_setup_one() trim sg entry (thus making
qc->n_elem zero) if padding results in zero length sg entry.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This patch makes ata_for_each_sg() start with pad_sgent when
qc->n_elem is zero. Previously, ata_for_each_sg() unconditionally
started with qc->__sg, handling the first sg to fill_sg() routines
even when the entry was invalid. And while at it, unwind ?: in
ata_qc_next_sg() into if statement.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
For ATAPI commands, padding can reduce qc->n_elem by one and thus to
zero making assert(qc->n_elem > 0)'s in ata_fill_sg() and qs_fill_sg()
fail for legal commands. This patch fixes the assert()'s to take
qc->pad_len into account.
Although the condition check seems a bit excessive, as this part of
code isn't still stable yet, I think it's worth to keep those.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Some of netfilter-related members are initalized / copied twice in
skb_clone(). Remove one.
Pointed out by Olivier MATZ <olivier.matz@6wind.com>.
And this patch also fixes order of copying / clearing members.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When redirecting an outgoing packet to loopback, it keeps the original
conntrack reference and information from the outgoing path, which
falsely triggers the check for DNAT on input and the dst_entry is
released to trigger rerouting. ip_route_input refuses to route the
packet because it has a local source address and it is dropped.
Look at the packet itself to dermine if it was NATed. Also fix a
missing inversion that causes unneccesary xfrm lookups.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes 2 bugs in the USB-IrDA code.
The first one is a buffer overrun in the RX path. We are now using
IRDA_SKB_MAX_MTU when initializing the Rx URB.
The second one is a potential stack recursion when unplugging the USB
dongle. It seems that first we get the Rx URB with a generic error
code, and after a while the Rx URB comes again with a "disconnect"
error code. Since we are resubmitting the Rx URB immediately after
receiving the first error one, we might enter an endless loop.
When getting an error Rx URB, the patch defers the Rx URB resubmitting
so that it gives us a chance to catch the disconnect one, in case the
dongle has juts been unplugged.
Tested against 2.6.16-rc2.
Patch from Jean Tourrilhes
Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: Samuel Ortiz <samuel.ortiz@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ICMP errors are only SNATed when their source matches the source of the
connection they are related to, otherwise the source address is not
changed. This creates problems with ICMP frag. required messages
originating from a router behind the NAT, if private IPs are used the
packet has a good change of getting dropped on the path to its destination.
Always NAT ICMP errors similar to the original connection.
Based on report by Al Viro.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the logical and physical cpu ids of a secondary thread don't match, we will
fail to spin the thread up on pSeries machines due to a bug in pseries/smp.c
We call the RTAS "start-cpu" method with the physical cpu id, the address of
pSeries_secondary_smp_init and the value to pass that function in r3. Currently
we pass "lcpu", the logical cpu id, but pSeries_secondary_smp_init expects
the physical cpu id in r3.
We should be passing pcpu instead.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For UP to SMP kexec to work we need to jump into pSeries_secondary_smp_init
event on a UP + KEXEC kernel. The secondary cpus will not find their hw_cpu_id
in the paca and so they'll jump into kexec_wait, ready for a kexec.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Because smp_release_cpus() is built for SMP || KEXEC, it's not safe to
unconditionally call it from setup_system(). On a UP && KEXEC kernel we'll
start up the secondary CPUs which will then go beserk and we die.
Simple fix is to conditionally call smp_release_cpus() in setup_system(). With
that in place we don't need the dummy definition of smp_release_cpus() because
all call sites are #ifdef'ed either SMP or KEXEC.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fallback gracefully when reading /proc/ppc64/lparcfg when the /rtas
device node can't be found.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A few symbols are exported twice, remove them from ppc_ksyms.c
Remove users of sys_ctrler in arch/ppc/
WARNING: vmlinux: duplicate symbol '__delay' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol '__up' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol '__down' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol '__down_interruptible' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'sys_ctrler' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strncat' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strncmp' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strchr' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strrchr' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strnlen' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strpbrk' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'memscan' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strstr' previous definition was in vmlinux
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes a regression which was introduced by moving ppc32 to use
the same sort of lockless gettimeofday as ppc64 has been using for
some time. This involves getting the timebase and performing some
simple arithmetic to convert it to seconds and microseconds. However,
the factor and offset used there weren't being updated when NTP
varied the tick length using adjtimex. 64-bit didn't notice the
problem because it had a hook in the 32-bit adjtimex compat routine
that attempted to work out what the generic timekeeping code would
do and alter the factor and offset to match. However, that code
was very complex and it wasn't clear that it still matched what the
generic code would do.
Now we use the generic current_tick_length() routine that was recently
added to check that the current tick will be as long as we expect; if
not we recompute the factor and offset. This keeps gettimeofday and
xtime in sync. In addition we check that gettimeofday hasn't got ahead
of xtime on each timer interrupt; if it has, we resync.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Patch claiming to remove enable_irq_nosync() had left it alive but killed
disable_irq_nosync() instead...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1) it should use nr_processes(), not nr_threads; otherwise we are getting
very confused find(1) and friends, among other things.
2) better do that at stat() time than at every damn lookup in procfs root.
Patch had been sitting in FC4 kernels for many months now...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
sbp2.c mangles INQUIRY response in a way that only applies to standard
inquiry data (i.e. when both cmddt and evpd bits are 0). Leave other cases
alone; e.g. when asking for VPD the length of reply is in byte 3, not 4
and byte 4 is the first byte of device serial number.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
audit_log_exit() is called from atomic contexts and gets explicit
gfp_mask argument; it should use it for all allocations rather
than doing some with gfp_mask and some with GFP_KERNEL.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The Xgl on r300 doesn't work unless you add a verify bitblt function to the
DRM, and we need to pass TX_CNTL to flush texture caches.
Signed-off-by: Dave Airlie <airlied@linux.ie>
acpi_rs_get_list_length() needs to account for all the vendor-defined data
bytes. Failing to include these causes buffers to be sized too small,
which causes slab corruption when we later convert AML to resources and run
off the end of the buffer.
This causes slab corruption on machines that use ACPI vendor-defined
resources. All HP ia64 machines do, and I'm told that some NEC machines
may as well.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make sure maxnodes is safe size before calculating nlongs in
get_nodes().
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I got all of these backwards. We want to return
min(input timeout, new timeout)
to userspace to prevent increasing the time-remaining value.
Thanks to Ernst Herzberg <earny@net4u.de> for reporting and diagnosing.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One of the parameters to the __pud_free_tlb() macro for powerpc is
incorrect (see patch) . We get away with it by accident, because the one
place the macro is called, the second parameter is a variable named "pud".
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Don't print KERN_INFO in the middle of a printk line.
printk(KERN_INFO "OEM ID: %s ",str);
is just above this. This is already fixed up in i386 copy.
Signed-off-by: Martin J. Bligh <mbligh@google.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The windfarm_pm112 module relies on smu_sat_get_sdb_partition which is in
windfarm_smu_sat.c but is not exported to modules, so despite Kconfig
having the option to build the pm112 as modules, this can never be loaded.
This patch fixes that by exporting smu_sat_get_sdb_partition with
EXPORT_SYMBOL_GPL
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There's a rather theoretical case of the BUG triggering in
fuse_reset_request():
- iget() fails because of OOM after a successful CREATE_OPEN request
- during IO on the resulting RELEASE request the connection is aborted
Fix and add warning to fuse_reset_request().
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Restore the compatibility with the older code and make it possible to
suspend if the kernel command line doesn't contain the "resume=" argument
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Just rename the compat system call to keep the name consistent with all the
other *64 compat system calls.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The last changes that introduced the additional_cpus command line parameter
also introduced a regression regarding smp initialization speed. In
smp_setup_cpu_possible_map() cpu_present_map is set to the same value as
cpu_possible_map. Especially that means that bits in the present map will be
set for cpus that are not present. This will cause a slow down in the initial
cpu_up() loop in smp_init() since trying to take cpus online that aren't
present takes a while.
Fix this by setting only bits for present cpus in cpu_present_map and set
cpu_present_map to cpu_possible_map in smp_cpus_done().
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce possible_cpus command line option. Hard sets the number of bits set
in cpu_possible_map. Unlike the additional_cpus parameter this one guarantees
that num_possible_cpus() will stay constant even if the system gets rebooted
and a different number of cpus are present at startup.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce additional_cpus command line option. By default no additional cpu
can be attached to the system anymore. Only the cpus present at IPL time can
be switched on/off. If it is desired that additional cpus can be attached to
the system the maximum number of additional cpus needs to be specified with
this option.
This change is necessary in order to limit the waste of per_cpu data
structures.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Set preempt_count of idle_thread to zero before switching off cpu. Otherwise
the preempt_count will be wrong if the cpu is switched on again since the
thread will be reused.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If __ccw_device_disband_start() fails to initiate disbanding, it should finish
with ccw_device_disband_done() (which leaves the device in offline state)
instead of ccw_device_verify_done() (which leaves the device in online state).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
The boot sequence on s390 sometimes takes ages and we spend a very long
time (up to one or two minutes) in calibrate_migration_costs. The time
spent there differs from boot to boot. Also the calculated costs differ
a lot. I've seen differences by up to a factor of 15 (yes, factor not
percent). Also I doubt that making these measurements make much sense on
a completely virtualized architecture where you cannot tell how much cpu
time you will get anyway.
So introduce the CONFIG_DEFAULT_MIGRATION_COST method for an architecture
to set the scheduler migration costs. This turns off automatic detection
of migration costs. Makes sense on virtual platforms, where migration
costs are hard to measure accurately.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jean-Luc Leger <reiga@dspnet.fr.eu.org> found this obvious typo.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix IO-port leakage from request_region in case of error during TPM
initialization, adds more pnp-verification and fixes a WTX-bug.
Signed-off-by: Marcel Selhorst <selhorst@crypto.rub.de>
Acked-by: Kylene Jo Hall <kjhall@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix a deadlock possible in the ext2 file system implementation. This
deadlock occurs when a file is removed from an ext2 file system which was
mounted with the "sync" mount option.
The problem is that ext2_xattr_delete_inode() was invoking the routine,
sync_dirty_buffer(), using a buffer head which was previously locked via
lock_buffer(). The first thing that sync_dirty_buffer() does is to lock
the buffer head that it was passed. It does this via lock_buffer(). Oops.
The solution is to unlock the buffer head in ext2_xattr_delete_inode()
before invoking sync_dirty_buffer(). This makes the code in
ext2_xattr_delete_inode() obey the same locking rules as all other callers
of sync_dirty_buffer() in the ext2 file system implementation.
Signed-off-by: Peter Staubach <staubach@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>