* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits)
netxen: update copyright
netxen: fix tx timeout recovery
netxen: fix file firmware leak
netxen: improve pci memory access
netxen: change firmware write size
tg3: Fix return ring size breakage
netxen: build fix for INET=n
cdc-phonet: autoconfigure Phonet address
Phonet: back-end for autoconfigured addresses
Phonet: fix netlink address dump error handling
ipv6: Add IFA_F_DADFAILED flag
net: Add DEVTYPE support for Ethernet based devices
mv643xx_eth.c: remove unused txq_set_wrr()
ucc_geth: Fix hangs after switching from full to half duplex
ucc_geth: Rearrange some code to avoid forward declarations
phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
drivers/net/phy: introduce missing kfree
drivers/net/wan: introduce missing kfree
net: force bridge module(s) to be GPL
Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded
...
Fixed up trivial conflicts:
- arch/x86/include/asm/socket.h
converted to <asm-generic/socket.h> in the x86 tree. The generic
header has the same new #define's, so that works out fine.
- drivers/net/tun.c
fix conflict between 89f56d1e9 ("tun: reuse struct sock fields") that
switched over to using 'tun->socket.sk' instead of the redundantly
available (and thus removed) 'tun->sk', and 2b980dbd ("lsm: Add hooks
to the TUN driver") which added a new 'tun->sk' use.
Noted in 'next' by Stephen Rothwell.
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits)
rcu: Move end of special early-boot RCU operation earlier
rcu: Changes from reviews: avoid casts, fix/add warnings, improve comments
rcu: Create rcutree plugins to handle hotplug CPU for multi-level trees
rcu: Remove lockdep annotations from RCU's _notrace() API members
rcu: Add #ifdef to suppress __rcu_offline_cpu() warning in !HOTPLUG_CPU builds
rcu: Add CPU-offline processing for single-node configurations
rcu: Add "notrace" to RCU function headers used by ftrace
rcu: Remove CONFIG_PREEMPT_RCU
rcu: Merge preemptable-RCU functionality into hierarchical RCU
rcu: Simplify rcu_pending()/rcu_check_callbacks() API
rcu: Use debugfs_remove_recursive() simplify code.
rcu: Merge per-RCU-flavor initialization into pre-existing macro
rcu: Fix online/offline indication for rcudata.csv trace file
rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h
rcu: Renamings to increase RCU clarity
rcu: Move private definitions from include/linux/rcutree.h to kernel/rcutree.h
rcu: Expunge lingering references to CONFIG_CLASSIC_RCU, optimize on !SMP
rcu: Delay rcu_barrier() wait until beginning of next CPU-hotunplug operation.
rcu: Fix typo in rcu_irq_exit() comment header
rcu: Make rcupreempt_trace.c look at offline CPUs
...
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (59 commits)
x86/gart: Do not select AGP for GART_IOMMU
x86/amd-iommu: Initialize passthrough mode when requested
x86/amd-iommu: Don't detach device from pt domain on driver unbind
x86/amd-iommu: Make sure a device is assigned in passthrough mode
x86/amd-iommu: Align locking between attach_device and detach_device
x86/amd-iommu: Fix device table write order
x86/amd-iommu: Add passthrough mode initialization functions
x86/amd-iommu: Add core functions for pd allocation/freeing
x86/dma: Mark iommu_pass_through as __read_mostly
x86/amd-iommu: Change iommu_map_page to support multiple page sizes
x86/amd-iommu: Support higher level PTEs in iommu_page_unmap
x86/amd-iommu: Remove old page table handling macros
x86/amd-iommu: Use 2-level page tables for dma_ops domains
x86/amd-iommu: Remove bus_addr check in iommu_map_page
x86/amd-iommu: Remove last usages of IOMMU_PTE_L0_INDEX
x86/amd-iommu: Change alloc_pte to support 64 bit address space
x86/amd-iommu: Introduce increase_address_space function
x86/amd-iommu: Flush domains if address space size was increased
x86/amd-iommu: Introduce set_dte_entry function
x86/amd-iommu: Add a gneric version of amd_iommu_flush_all_devices
...
Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
for credential management. The additional code keeps track of the number of
pointers from task_structs to any given cred struct, and checks to see that
this number never exceeds the usage count of the cred struct (which includes
all references, not just those from task_structs).
Furthermore, if SELinux is enabled, the code also checks that the security
pointer in the cred struct is never seen to be invalid.
This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
kernel thread on seeing cred->security be a NULL pointer (it appears that the
credential struct has been previously released):
http://www.kerneloops.org/oops.php?number=252883
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
We call lmb_end_of_DRAM() to test whether a DMA mask is ok on a machine
without IOMMU, but this function is marked as __init.
I don't think there's a clean way to get the top of RAM max_pfn doesn't
appear to include highmem or I missed (or we have a bug :-) so for now,
let's just avoid having a broken 2.6.31 by making this function
non-__init and we can revisit later.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's problematic to allow signed element_nr's or total's to be passed as
part of the flex array API.
flex_array_alloc() allows total_nr_elements to be set to a negative
quantity, which is obviously erroneous.
flex_array_get() and flex_array_put() allows negative array indices in
dereferencing an array part, which could address memory mapped before
struct flex_array.
The fix is to convert all existing element_nr formals to be qualified as
unsigned. Existing checks to compare it to total_nr_elements or the max
array size based on element_size need not be changed.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
flex_array_free_parts() does not take `src' or `element_nr' formals, so
remove their respective comments.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If all array elements fit into the base structure and data is copied using
flex_array_put() starting at a non-zero index, flex_array_get() will fail
to return the data.
This fixes the bug by only checking for NULL parts when all elements do
not fit in the base structure when flex_array_get() is used. Otherwise,
fa_element_to_part_nr() will always be 0 since there are no parts
structures needed and such element may never have been put. Thus, it will
remain NULL due to the kzalloc() of the base.
Additionally, flex_array_put() now only checks for a NULL part when all
elements do not fit in the base structure. This is otherwise unnecessary
since the base structure is guaranteed to exist (or we would have already
hit a NULL pointer).
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Create a kernel/rcutree_plugin.h file that contains definitions
for preemptable RCU (or, under the #else branch of the #ifdef,
empty definitions for the classic non-preemptable semantics).
These definitions fit into plugins defined in kernel/rcutree.c
for this purpose.
This variant of preemptable RCU uses a new algorithm whose
read-side expense is roughly that of classic hierarchical RCU
under CONFIG_PREEMPT. This new algorithm's update-side expense
is similar to that of classic hierarchical RCU, and, in absence
of read-side preemption or blocking, is exactly that of classic
hierarchical RCU. Perhaps more important, this new algorithm
has a much simpler implementation, saving well over 1,000 lines
of code compared to mainline's implementation of preemptable
RCU, which will hopefully be retired in favor of this new
algorithm.
The simplifications are obtained by maintaining per-task
nesting state for running tasks, and using a simple
lock-protected algorithm to handle accounting when tasks block
within RCU read-side critical sections, making use of lessons
learned while creating numerous user-level RCU implementations
over the past 18 months.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference: <12509746134003-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When 'and'ing two bitmasks (where 'andnot' is a variation on it), some
cases want to know whether the result is the empty set or not. In
particular, the TLB IPI sending code wants to do cpumask operations and
determine if there are any CPU's left in the final set.
So this just makes the bitmask (and cpumask) functions return a boolean
for whether the result has any bits set.
Cc: stable@kernel.org (2.6.30, needed by TLB shootdown fix)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
swiotlb_full() in lib/swiotlb.c throws one of two panic messages
based on whether the direction of transfer is from the device
or to the device. The logic around this is somewhat weird in
the case of bidirectional transfers. It appears to want to
throw both in succession, but since its a panic only the first
makes it.
This patch adds a third, separate error for DMA_BIDIRECTIONAL
to make things a bit clearer.
Signed-off-by: Casey Dahlin <cdahlin@redhat.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Becky Bruce <beckyb@kernel.crashing.org>
[ further fixed the error message ]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <200908202327.n7KNRuqK001504@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While it's debatable whether or not a NULL device argument to
the DMA API functions is valid... since it certainly isn't
valid on devices with an IOMMU... dma-debug really shouldn't be
dereferencing null pointers either.
Guard against that in err_printk and the driver_filter
functions. A Fedora rawhide user was seeing this in one of the
dvb drivers resulting in an oops on boot.
[ A patch has been sent for testing to the driver, but I feel
the dma debugging support should be fixed as well. (There's
still a pile of legacy garbage in the kernel passing null
pointers to dma_{alloc,free}_*. :( ]
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Cc: mchehab@infradead.org
Cc: Joerg Roedel <joerg.roedel@amd.com>
LKML-Reference: <20090820011708.GP25206@bombadil.infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Very lightly tested, doesn't crash the kernel.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Conflicts:
arch/sparc/kernel/smp_64.c
arch/x86/kernel/cpu/perf_counter.c
arch/x86/kernel/setup_percpu.c
drivers/cpufreq/cpufreq_ondemand.c
mm/percpu.c
Conflicts in core and arch percpu codes are mostly from commit
ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all
the first chunk allocators into mm/percpu.c, the changes are moved
from arch code to mm/percpu.c.
Signed-off-by: Tejun Heo <tj@kernel.org>
These includes were added by 079effb693
("kmemtrace, kbuild: fix slab.h dependency problem in
lib/decompress_inflate.c") to fix the build when using kmemtrace. However
this is not necessary when used to create a compressed kernel, and
actually creates issues (brings a lot of things unavailable in the
decompression environment), so don't include it if STATIC is defined.
Signed-off-by: Albin Tonnerre <albin.tonnerre@free-electrons.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
decompress_bunzip2 and decompress_unlzma have a nasty hack that subtracts
4 from the input length if being called in the pre-boot environment.
This is a nasty hack because it relies on the fact that flush = NULL only
when called from the pre-boot environment (i.e.
arch/x86/boot/compressed/misc.c). initramfs.c/do_mounts_rd.c pass in a
flush buffer (flush != NULL).
This hack prevents the decompressors from being used with flush = NULL by
other callers unless knowledge of the hack is propagated to them.
This patch removes the hack by making decompress (called only from the
pre-boot environment) a wrapper function that subtracts 4 from the input
length before calling the decompressor.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix and improve comments in decompress/generic.h that describe the
decompressor API. Also remove an unused definition, and rename INBUF_LEN
in lib/decompress_inflate.c to conform to bzip2/lzma naming.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
flex_array_get() calculates an index value, then drops it on the floor;
simply remove it.
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sg_miter_start() is currently unaware of the direction of the copy
process (to or from the scatter list). It is important to know the
direction because the page has to be flushed in case the data written
is seen on a different mapping in user land on cache incoherent
architectures.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>
Once a structure goes over PAGE_SIZE*2, we see occasional allocation
failures. Some people have chosen to switch over to things like vmalloc()
that will let them keep array-like access to such a large structures.
But, vmalloc() has plenty of downsides.
Here's an alternative. I think it's what Andrew was suggesting here:
http://lkml.org/lkml/2009/7/2/518
I call it a flexible array. It does all of its work in PAGE_SIZE bits, so
never does an order>0 allocation. The base level has
PAGE_SIZE-2*sizeof(int) bytes of storage for pointers to the second level.
So, with a 32-bit arch, you get about 4MB (4183112 bytes) of total
storage when the objects pack nicely into a page. It is half that on
64-bit because the pointers are twice the size. There's a table detailing
this in the code.
There are kerneldocs for the functions, but here's an
overview:
flex_array_alloc() - dynamically allocate a base structure
flex_array_free() - free the array and all of the
second-level pages
flex_array_free_parts() - free the second-level pages, but
not the base (for static bases)
flex_array_put() - copy into the array at the given index
flex_array_get() - copy out of the array at the given index
flex_array_prealloc() - preallocate the second-level pages
between the given indexes to
guarantee no allocs will occur at
put() time.
We could also potentially just pass the "element_size" into each of the
API functions instead of storing it internally. That would get us one
more base pointer on 32-bit.
I've been testing this by running it in userspace. The header and patch
that I've been using are here, as well as the little script I'm using to
generate the size table which goes in the kerneldocs.
http://sr71.net/~dave/linux/flexarray/
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The generic atomic64_t implementation in lib/ did not export the functions
it defined, which means that modules that use atomic64_t would not link on
platforms (such as 32-bit powerpc). For example, trying to build a kernel
with CONFIG_NET_RDS on such a platform would fail with:
ERROR: "atomic64_read" [net/rds/rds.ko] undefined!
ERROR: "atomic64_set" [net/rds/rds.ko] undefined!
Fix this by exporting the atomic64_t functions to modules. (I export the
entire API even if it's not all currently used by in-tree modules to avoid
having to continue fixing this in dribs and drabs)
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The member was intended, not the local variable.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Greg Banks <gnb@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This converts swiotlb to use phys_to_dma and dma_to_phys instead of
swiotlb_phys_to_bus() and swiotlb_bus_to_phys().
swiotlb_phys_to_bus() and swiotlb_bus_to_phys() are not necessary so
this patch also removes them.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Becky Bruce <beckyb@kernel.crashing.org>
This converts swiotlb to use dma_capable() instead of
swiotlb_arch_address_needs_mapping() and is_buffer_dma_capable().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Becky Bruce <beckyb@kernel.crashing.org>
swiotlb_bus_to_virt is unncessary; we can use swiotlb_bus_to_phys and
phys_to_virt instead.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Becky Bruce <beckyb@kernel.crashing.org>
is_current_single_threaded() can safely miss a freshly forked CLONE_VM
task, but in this case it must not miss its parent. That is why we take
mm->mmap_sem for writing to make sure a thread/task with the same ->mm
can't pass exit_mm() and disappear.
However we can avoid ->mmap_sem and rely on rcu/barriers:
- if we do not see the exiting parent on thread/process list
we see the result of list_del_rcu(), in this case we must
also see the result of list_add_rcu() which does wmb().
- if we do see the parent but its ->mm == NULL, we need rmb()
to make sure we can't miss the child.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
- is_single_threaded(task) is not safe unless task == current,
we can't use task->signal or task->mm.
- it doesn't make sense unless task == current, the task can
fork right after the check.
Rename it to current_is_single_threaded() and kill the argument.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
- Fix the comment, is_single_threaded(p) actually means that nobody shares
->mm with p.
I think this helper should be renamed, and it should not have arguments.
With or without this patch it must not be used unless p == current,
otherwise we can't safely use p->signal or p->mm.
- "if (atomic_read(&p->signal->count) != 1)" is not right when we have a
zombie group leader, use signal->live instead.
- Add PF_KTHREAD check to skip kernel threads which may borrow p->mm,
otherwise we can return the wrong "false".
- Use for_each_process() instead of do_each_thread(), all threads must use
the same ->mm.
- Use down_write(mm->mmap_sem) + rcu_read_lock() instead of tasklist_lock
to iterate over the process list. If there is another CLONE_VM process
it can't pass exit_mm() which takes the same mm->mmap_sem. We can miss
a freshly forked CLONE_VM task, but this doesn't matter because we must
see its parent and return false.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
* 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
dma-debug: Fix the overlap() function to be correct and readable
oprofile: reset bt_lost_no_mapping with other stats
x86/oprofile: rename kernel parameter for architectural perfmon to arch_perfmon
signals: declare sys_rt_tgsigqueueinfo in syscalls.h
rcu: Mark Hierarchical RCU no longer experimental
dma-debug: Put all hash-chain locks into the same lock class
dma-debug: fix off-by-one error in overlap function
Linus noticed how unclean and buggy the overlap() function is:
- It uses convoluted (and bug-causing) positive checks for
range overlap - instead of using a more natural negative
check.
- Even the positive checks are buggy: a positive intersection
check has four natural cases while we checked only for three,
missing the (addr < start && addr2 == end) case for example.
- The variables are mis-named, making it non-obvious how the
check was done.
- It needlessly uses u64 instead of unsigned long. Since these
are kernel memory pointers and we explicitly exclude highmem
ranges anyway we cannot ever overflow 32 bits, even if we
could. (and on 64-bit it doesnt matter anyway)
All in one, this function needs a total revamp. I used Linus's
suggestions minus the paranoid checks (we cannot overflow really
because if we get totally bad DMA ranges passed far more things
break in the systems than just DMA debugging). I also fixed a
few other small details i noticed.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix
changes. As alpha in percpu tree uses 'weak' attribute instead of
inline assembly, there's no need for __used attribute.
Conflicts:
arch/alpha/include/asm/percpu.h
arch/mn10300/kernel/vmlinux.lds.S
include/linux/percpu-defs.h
to make it selectable if it is available.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
(feature suggested by Sergey Senozhatsky)
Kmemleak needs to track all the memory allocations but some of these
happen before kmemleak is initialised. These are stored in an internal
buffer which may be exceeded in some kernel configurations. This patch
adds a configuration option with a default value of 400 and also removes
the stack dump when the early log buffer is exceeded.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by>
Some archs (alpha and s390) need to use weak definitions for percpu
variables in modules so that the compiler generates external
references for them.
This patch implements weak percpu definitions which arch can enable by
defining ARCH_NEEDS_WEAK_PER_CPU in arch percpu header file. This
weak definition adds the following two restrictions on percpu variable
definitions.
1. percpu symbols must be unique whether static or not
2. percpu variables can't be defined inside a function
To ensure that these restrictions are observed in generic code, config
option DEBUG_FORCE_WEAK_PER_CPU enables weak percpu definitions for
all cases.
This patch is inspired by Ivan Kokshaysky's alpha percpu patch.
[ Impact: stricter rules for percpu variables, one more debug config option ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Howells <dhowells@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: add dummy pgprot_noncached()
lib/checksum.c: fix endianess bug
asm-generic: hook up new system calls
asm-generic: list Arnd as asm-generic maintainer
asm-generic: drop HARDIRQ_BITS definition from hardirq.h
asm-generic: uaccess: fix up local access_ok() usage
asm-generic: uaccess: add missing access_ok() check to strnlen_user()
Selecting DEBUG_SLAB or SLUB_DEBUG by the KMEMLEAK menu entry may cause
issues with other dependencies (KMEMCHECK). These configuration options
aren't strictly needed by kmemleak but they may increase the chances of
finding leaks. This patch also updates the KMEMLEAK config entry help
text.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: Select frame pointers on x86
dma-debug: be more careful when building reference entries
dma-debug: check for sg_call_ents in best-fit algorithm too
x86 stack traces are a piece of crap without frame pointers, and its not
like the 'performance gain' of not having stack pointers matters when you
selected lockdep.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <new-submission>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new generic checksum code has a small dependency on endianess and
worked only on big-endian systems. I could not find a nice efficient
way to express this, so I added an #ifdef. Using
'result += le16_to_cpu(*buff);' would have worked as well, but
would be slightly less efficient on big-endian systems and IMHO
would not be clearer.
Also fix a bug that prevents this from working on 64-bit machines.
If you have a 64-bit CPU and want to use the generic checksum
code, you should probably do some more optimizations anyway, but
at least the code should not break.
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This patch adds lib/gcd.c which contains a greatest common divider
implementation taken from sound/core/pcm_timer.c
Several usages of this new library function will be sent to subsystem
maintainers.
[akpm@linux-foundation.org: use swap() (pointed out by Joe)]
[akpm@linux-foundation.org: just add gcd.o to obj-y, remove Kconfig changes]
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julius Volz <juliusv@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox reported that lockdep runs out of its stack-trace entries
with certain configs:
BUG: MAX_STACK_TRACE_ENTRIES too low
This happens because there are 1024 hash buckets, each with a
separate lock. Lockdep puts each lock into a separate lock class and
tracks them independently.
But in reality we never take more than one of the buckets, so they
really belong into a single lock-class. Annotate the has bucket lock
init accordingly.
[ Impact: reduce the lockdep footprint of dma-debug ]
Reported-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
* akpm: (182 commits)
fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset
fbdev: *bfin*: fix __dev{init,exit} markings
fbdev: *bfin*: drop unnecessary calls to memset
fbdev: bfin-t350mcqb-fb: drop unused local variables
fbdev: blackfin has __raw I/O accessors, so use them in fb.h
fbdev: s1d13xxxfb: add accelerated bitblt functions
tcx: use standard fields for framebuffer physical address and length
fbdev: add support for handoff from firmware to hw framebuffers
intelfb: fix a bug when changing video timing
fbdev: use framebuffer_release() for freeing fb_info structures
radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb?
s3c-fb: CPUFREQ frequency scaling support
s3c-fb: fix resource releasing on error during probing
carminefb: fix possible access beyond end of carmine_modedb[]
acornfb: remove fb_mmap function
mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF
mb862xxfb: restrict compliation of platform driver to PPC
Samsung SoC Framebuffer driver: add Alpha Channel support
atmel-lcdc: fix pixclock upper bound detection
offb: use framebuffer_alloc() to allocate fb_info struct
...
Manually fix up conflicts due to kmemcheck in mm/slab.c
Furthermore, notice that the initial checks:
if (!node->rb_left)
child = node->rb_right;
else if (!node->rb_right)
child = node->rb_left;
else
{
...
}
guarantee that old->rb_right is set in the final else branch, therefore
we can omit checking that again.
Signed-off-by: Wolfram Strepp <wstrepp@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are two cases when a node, having 2 childs, is erased:
'normal case': the successor is not the right-hand-child of the node to be erased
'special case': the successor is the right-hand child of the node to be erased
Here some ascii-art, with following symbols (referring to the code):
O: node to be deleted
N: the successor of O
P: parent of N
C: child of N
L: some other node
normal case:
O N
/ \ / \
/ \ / \
L \ L \
/ \ P ----> / \ P
/ \ / \
/ /
N C
\ / \
\
C
/ \
special case:
O|P N
/ \ / \
/ \ / \
L \ L \
/ \ N ----> / C
\ / \
\
C
/ \
Notice that for the special case we don't have to reconnect C to N.
Signed-off-by: Wolfram Strepp <wstrepp@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
First, move some code around in order to make the next change more obvious.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wolfram Strepp <wstrepp@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a call to write_lock() in gen_pool_destroy which is not balanced
by any corresponding write_unlock(). This causes problems with preemption
because the preemption-disable counter is incremented in the write_lock()
call, but never decremented by any call to write_unlock(). This bug is
gen_pool_destroy, and one of them is non-x86 arch-specific code.
Signed-off-by: Zygo Blaxell <zygo.blaxell@xandros.com>
Cc: Jiri Kosina <trivial@kernel.org>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For example:
hex_dump_to_buffer("AB", 2, 16, 1, buf, 100, 0);
pr_info("[%s]\n", buf);
I'd expect the output to be "[41 42]", but actually it's "[41 42 ]"
This patch also makes the required buf to be minimum. To print the hex
format of "AB", a buf with size 6 should be sufficient, but
hex_dump_to_buffer() required at least 8.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
radix_tree_lookup() and radix_tree_lookup_slot() have much the
same code except for the return value.
Introduce radix_tree_lookup_element() to do the real work.
/*
* is_slot == 1 : search for the slot.
* is_slot == 0 : search for the node.
*/
static void * radix_tree_lookup_element(struct radix_tree_root *root,
unsigned long index, int is_slot);
Signed-off-by: Huang Shijie <shijie8@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
_atomic_dec_and_lock() should not unconditionally take the lock before
calling atomic_dec_and_test() in the UP case. For consistency reasons it
should behave exactly like in the SMP case.
Besides that this works around the problem that with CONFIG_DEBUG_SPINLOCK
this spins in __spin_lock_debug() if the lock is already taken even if the
counter doesn't drop to 0.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: Valerie Aurora <vaurora@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The counterpart of radix_tree_next_hole(). To be used by context readahead.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Vladislav Bolkhovitin <vst@vlnb.net>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (64 commits)
debugfs: use specified mode to possibly mark files read/write only
debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem.
xen: remove driver_data direct access of struct device from more drivers
usb: gadget: at91_udc: remove driver_data direct access of struct device
uml: remove driver_data direct access of struct device
block/ps3: remove driver_data direct access of struct device
s390: remove driver_data direct access of struct device
parport: remove driver_data direct access of struct device
parisc: remove driver_data direct access of struct device
of_serial: remove driver_data direct access of struct device
mips: remove driver_data direct access of struct device
ipmi: remove driver_data direct access of struct device
infiniband: ehca: remove driver_data direct access of struct device
ibmvscsi: gadget: at91_udc: remove driver_data direct access of struct device
hvcs: remove driver_data direct access of struct device
xen block: remove driver_data direct access of struct device
thermal: remove driver_data direct access of struct device
scsi: remove driver_data direct access of struct device
pcmcia: remove driver_data direct access of struct device
PCIE: remove driver_data direct access of struct device
...
Manually fix up trivial conflicts due to different direct driver_data
direct access fixups in drivers/block/{ps3disk.c,ps3vram.c}
This patch fixes a bug in the overlap function which returned true if
one region ends exactly before the second region begins. This is no
overlap but the function returned true in that case.
Cc: stable@kernel.org
Reported-by: Andrew Randrianasulu <randrik@mail.ru>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
kset_create should check the kobject_set_name return value.
Add the return value checking code.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The Kconfig options of kmemcheck are hidden under arch/x86 which makes porting
to other architectures harder. To fix that, move the Kconfig bits to
lib/Kconfig.kmemcheck and introduce a CONFIG_HAVE_ARCH_KMEMCHECK config option
that architectures can define.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
let it rip!
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
The current code is not very careful when it builds reference
dma_debug_entries which get passed to hash_bucket_find(). But since this
function changed to a best-fit algorithm these entries have to be more
acurate. This patch adds this higher level of accuracy.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
If we don't check for sg_call_ents the hash_bucket_find function might
still return the wrong dma_debug_entry for sg mappings.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Many processor architectures have no 64-bit atomic instructions, but
we need atomic64_t in order to support the perf_counter subsystem.
This adds an implementation of 64-bit atomic operations using hashed
spinlocks to provide atomicity. For each atomic operation, the address
of the atomic64_t variable is hashed to an index into an array of 16
spinlocks. That spinlock is taken (with interrupts disabled) around the
operation, which can then be coded non-atomically within the lock.
On UP, all the spinlock manipulation goes away and we simply disable
interrupts around each operation. In fact gcc eliminates the whole
atomic64_lock variable as well.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It's theoretically possible that there are exception table entries
which point into the (freed) init text of modules. These could cause
future problems if other modules get loaded into that memory and cause
an exception as we'd see the wrong fixup. The only case I know of is
kvm-intel.ko (when CONFIG_CC_OPTIMIZE_FOR_SIZE=n).
Amerigo fixed this long-standing FIXME in the x86 version, but this
patch is more general.
This implements trim_init_extable(); most archs are simple since they
use the standard lib/extable.c sort code. Alpha and IA64 use relative
addresses in their fixups, so thier trimming is a slight variation.
Sparc32 is unique; it doesn't seem to define ARCH_HAS_SORT_EXTABLE,
yet it defines its own sort_extable() which overrides the one in lib.
It doesn't sort, so we have to mark deleted entries instead of
actually trimming them.
Inspired-by: Amerigo Wang <amwang@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-alpha@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
* 'for-linus' of git://linux-arm.org/linux-2.6:
kmemleak: Add the corresponding MAINTAINERS entry
kmemleak: Simple testing module for kmemleak
kmemleak: Enable the building of the memory leak detector
kmemleak: Remove some of the kmemleak false positives
kmemleak: Add modules support
kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash
kmemleak: Add the vmalloc memory allocation/freeing hooks
kmemleak: Add the slub memory allocation/freeing hooks
kmemleak: Add the slob memory allocation/freeing hooks
kmemleak: Add the slab memory allocation/freeing hooks
kmemleak: Add documentation on the memory leak detector
kmemleak: Add the base support
Manual conflict resolution (with the slab/earlyboot changes) in:
drivers/char/vt.c
init/main.c
mm/slab.c
* 'topic/slab/earlyboot' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
vgacon: use slab allocator instead of the bootmem allocator
irq: use kcalloc() instead of the bootmem allocator
sched: use slab in cpupri_init()
sched: use alloc_cpumask_var() instead of alloc_bootmem_cpumask_var()
memcg: don't use bootmem allocator in setup code
irq/cpumask: make memoryless node zero happy
x86: remove some alloc_bootmem_cpumask_var calling
vt: use kzalloc() instead of the bootmem allocator
sched: use kzalloc() instead of the bootmem allocator
init: introduce mm_init()
vmalloc: use kzalloc() instead of alloc_bootmem()
slab: setup allocators earlier in the boot sequence
bootmem: fix slab fallback on numa
bootmem: use slab if bootmem is no longer available
Add a generic (unoptimized) implementation of checksum.c in pure C
for use by all architectures that cannot be bother with implementing
their own version.
Based on microblaze code by Michal Simek <monstr@monstr.eu>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Now that we set up the slab allocator earlier, we can get rid of some
alloc_bootmem_cpumask_var() calls in boot code.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
This patch adds a loadable module that deliberately leaks memory. It
is used for testing various memory leaking scenarios.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds the Kconfig.debug and Makefile entries needed for
building kmemleak into the kernel.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* serial-from-alan: (79 commits)
moxa: prevent opening unavailable ports
imx: serial: use tty_encode_baud_rate to set true rate
imx: serial: add IrDA support to serial driver
imx: serial: use rational library function
lib: isolate rational fractions helper function
imx: serial: handle initialisation failure correctly
imx: serial: be sure to stop xmit upon shutdown
imx: serial: notify higher layers in case xmit IRQ was not called
imx: serial: fix one bit field type
imx: serial: fix whitespaces (no changes in functionality)
tty: use prepare/finish_wait
tty: remove sleep_on
sierra: driver interface blacklisting
sierra: driver urb handling improvements
tty: resolve some sierra breakage
timbuart: Fix the termios logic
serial: Added Timberdale UART driver
tty: Add URL for ttydev queue
devpts: unregister the file system on error
tty: Untangle termios and mm mutex dependencies
...
Provide a helper function to determine optimum numerator
denominator value pairs taking into account restricted
register size. Useful especially with PLL and other clock
configurations.
Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
vsprintf: introduce %pf format specifier
printk: add support of hh length modifier for printk
* 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (61 commits)
amd-iommu: remove unnecessary "AMD IOMMU: " prefix
amd-iommu: detach device explicitly before attaching it to a new domain
amd-iommu: remove BUS_NOTIFY_BOUND_DRIVER handling
dma-debug: simplify logic in driver_filter()
dma-debug: disable/enable irqs only once in device_dma_allocations
dma-debug: use pr_* instead of printk(KERN_* ...)
dma-debug: code style fixes
dma-debug: comment style fixes
dma-debug: change hash_bucket_find from first-fit to best-fit
x86: enable GART-IOMMU only after setting up protection methods
amd_iommu: fix lock imbalance
dma-debug: add documentation for the driver filter
dma-debug: add dma_debug_driver kernel command line
dma-debug: add debugfs file for driver filter
dma-debug: add variables and checks for driver filter
dma-debug: fix debug_dma_sync_sg_for_cpu and debug_dma_sync_sg_for_device
dma-debug: use sg_dma_len accessor
dma-debug: use sg_dma_address accessor instead of using dma_address directly
amd-iommu: don't free dma adresses below 512MB with CONFIG_IOMMU_STRESS
amd-iommu: don't preallocate page tables with CONFIG_IOMMU_STRESS
...
This patch makes the driver_filter function more readable by
reorganizing the code. The removal of a code code block to an upper
indentation level makes hard-to-read line-wraps unnecessary.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
There is no need to disable/enable irqs on each loop iteration. Just
disable irqs for the whole time the loop runs.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The pr_* macros are shorter than the old printk(KERN_ ...) variant.
Change the dma-debug code to use the new macros and save a few
unnecessary line breaks. If lines don't break the source code can also
be grepped more easily.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch changes the recent updates to dma-debug to conform with
coding style guidelines of Linux and the -tip tree.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Last patch series introduced some new comment which does not fit the
Kernel comment style guidelines. Fix it with this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Merge reason: This branch was on an -rc5 base so pull almost-2.6.30
to resync with the latest upstream fixes and make sure
the combination works fine.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some device drivers map the same physical address multiple times to a
dma address. Without an IOMMU this results in the same dma address being
put into the dma-debug hash multiple times. With a first-fit match in
hash_bucket_find() this function may return the wrong dma_debug_entry.
This can result in false positive warnings. This patch fixes it by
changing the first-fit behavior of hash_bucket_find() into a best-fit
algorithm.
Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: lethal@linux-sh.org
Cc: just.for.lkml@googlemail.com
Cc: hancockrwd@gmail.com
Cc: jens.axboe@oracle.com
Cc: bharrosh@panasas.com
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
LKML-Reference: <20090605104132.GE24836@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the dma-api/driver_filter file to debugfs. The root user
can write a driver name into this file to see only dma-api errors for
that particular driver in the kernel log. Writing an empty string to
that file disables the driver filter.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch adds the state variables for the driver filter and a function
to check if the filter is enabled and matches to the current device. The
check is built into the err_printk function.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>