Add ARM support for the DMA debug infrastructure, which allows the
DMA API usage to be debugged.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Replace the page_to_dma() and dma_to_page() macros with their PFN
equivalents. This allows us to map parts of memory which do not have
a struct page allocated to them to bus addresses. This will be used
internally by dma_alloc_coherent()/dma_alloc_writecombine().
Build tested on Versatile, OMAP1, IOP13xx and KS8695.
Tested-by: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Conflicts:
MAINTAINERS
arch/arm/mach-omap2/pm24xx.c
drivers/scsi/bfa/bfa_fcpim.c
Needed to update to apply fixes for which the old branch was too
outdated.
The hardware page tables use an XN bit 'execute never'. Historically,
we've had a Linux 'execute allow' bit, in the positive sense. Get rid
of this artifact as future hardware will continue to have the XN sense.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
FIRST_USER_PGD_NR is now unnecessary, as this has been replaced by
FIRST_USER_ADDRESS except in the architecture code. Fix up the last
usage of FIRST_USER_PGD_NR, and remove the definition.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove some knowledge of our 2-level page table layout from the
identity mapping code - we assume that a step size of PGDIR_SIZE will
allow us to step over all entries. While this is true today, it won't
be true in the near future.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We have two places where we create identity mappings - one when we bring
secondary CPUs online, and one where we setup some mappings for soft-
reboot. Combine these two into a single implementation. Also collect
the identity mapping deletion function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This switches the ordering of the Linux vs hardware page tables in
each page, thereby eliminating some of the arithmetic in the page
table walks. As we now place the Linux page table at the beginning
of the page, we can deal with the offset in the pgt by simply masking
it away, along with the other control bits.
This also makes the arithmetic all be positive, rather than a mixture.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since commit 3e4d3af501 "mm: stack based kmap_atomic()", it is actively
wrong to rely on fixed kmap type indices (namely KM_L2_CACHE) as
kmap_atomic() totally ignores them and a concurrent instance of it may
happily reuse any slot for any purpose. Because kmap_atomic() is now
able to deal with reentrancy, we can get rid of the ad hoc mapping here.
While the code is made much simpler, there is a needless cache flush
introduced by the usage of __kunmap_atomic(). It is not clear if the
performance difference to remove that is worth the cost in code
maintenance (I don't think there are that many highmem users on that
platform anyway) but that should be reconsidered when/if someone cares
enough to do some measurements.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Since commit 3e4d3af501 "mm: stack based kmap_atomic()", it is actively
wrong to rely on fixed kmap type indices (namely KM_L2_CACHE) as
kmap_atomic() totally ignores them and a concurrent instance of it may
happily reuse any slot for any purpose. Because kmap_atomic() is now
able to deal with reentrancy, we can get rid of the ad hoc mapping here,
and we even don't have to disable IRQs anymore (highmem case).
While the code is made much simpler, there is a needless cache flush
introduced by the usage of __kunmap_atomic(). It is not clear if the
performance difference to remove that is worth the cost in code
maintenance (I don't think there are that many highmem users on that
platform if at all anyway).
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Since commit 3e4d3af501 "mm: stack based kmap_atomic()", it is no longer
necessary to carry an ad hoc version of kmap_atomic() added in commit
7e5a69e83b "ARM: 6007/1: fix highmem with VIPT cache and DMA" to cope
with reentrancy.
In fact, it is now actively wrong to rely on fixed kmap type indices
(namely KM_L1_CACHE) as kmap_atomic() totally ignores them now and a
concurrent instance of it may reuse any slot for any purpose.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Since CPU_PJ4 is shared between PXA95x and MMP2, select CPU_PJ4 in MMP2
configuration.
Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
The core of PXA955 is PJ4. Add new PJ4 support. And add new macro
CONFIG_PXA95x.
Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Cache ownership must be acquired by reading/writing data from the
cache line to make cache operation have the desired effect on the
SMP MPCore CPU. However, the ownership is never acquired in the
v6_dma_inv_range function when cleaning the first line and
flushing the last one, in case the address is not aligned
to D_CACHE_LINE_SIZE boundary.
Fix this by reading/writing data if needed, before performing
cache operations.
While at it, fix v6_dma_flush_range to prevent RWFO outside
the buffer.
Cc: stable@kernel.org
Signed-off-by: Valentine Barshak <vbarshak@mvista.com>
Signed-off-by: George G. Davis <gdavis@mvista.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The current implementation of the v7_coherent_*_range function assumes
that the D and I cache lines have the same size, which is incorrect
architecturally. This patch adds the icache_line_size macro which reads
the CTR register. The main loop in v7_coherent_*_range is split in two
independent loops or the D and I caches. This also has the performance
advantage that the DSB is moved outside the main loop.
Reported-by: Kevin Sapp <ksapp@quicinc.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The current implementation of the dcache_line_size macro reads the L1
cache size from the CCSIDR register. This, however, is not guaranteed to
be the smallest cache line in the cache hierarchy. The patch changes to
the macro to use the more architecturally correct CTR register.
Reported-by: Kevin Sapp <ksapp@quicinc.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Directives such as .long and .word do not magically cause the
assembler location counter to become aligned in gas. As a result,
using these directives in code sections can result in misaligned
data words when building a Thumb-2 kernel (CONFIG_THUMB2_KERNEL).
This is a Bad Thing, since the ABI permits the compiler to assume
that fundamental types of word size or above are word- aligned when
accessing them from C. If the data is not really word-aligned,
this can cause impaired performance and stray alignment faults in
some circumstances.
In general, the following rules should be applied when using data
word declaration directives inside code sections:
* .quad and .double:
.align 3
* .long, .word, .single, .float:
.align (or .align 2)
* .short:
No explicit alignment required, since Thumb-2
instructions are always 2 or 4 bytes in size.
immediately after an instruction.
In this specific case, we can achieve the desired alignment by
forcing a 32-bit branch instruction using the W() macro, since the
assembler location counter is already 32-bit aligned in this case.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove knowledge of the 2-level wrapping in pgd_free(), and use the
pXd_none_or_clear_bad() macros when checking the entries.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Adding KERN_WARNING in the middle of strings now produces those tokens
in the output, rather than accepting the level as was once the case.
Fix this in the one reported case. There might be more...
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds initial support for Renesas SH-Mobile AG5.
At this point the AG5 CPU support is limited to the ARM
core, SCIF serial and a CMT timer together with L2 cache
and the GIC. The AG5EVM board also supports Ethernet.
Future patches will add support for GPIO, INTCS, CPGA
and platform data / driver updates for devices such as
IIC, LCDC, FSI, KEYSC, CEU and SDHI among others.
The code in entry-macro.S will be cleaned up when the
ARM IRQ demux code improvements have been merged.
Depends on the AG5EVM mach-type recently registered but
not yet present in arch/arm/tools/mach-types.
As the AG5EVM board comes with 512MiB memory it is
recommended to turn on HIGHMEM.
Many thanks to Yoshii-san for initial bring up.
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
It's enough to include the asm/smp_plat.h once in arch/arm/mm/flush.c
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
An out by one bug meant that the DMA coherent allocator was aligning
to one more bit than it should, causing it to run out of available
memory quicker. Fix this.
Reported-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The SWP instruction was deprecated in the ARMv6 architecture,
superseded by the LDREX/STREX family of instructions for
load-linked/store-conditional operations. The ARMv7 multiprocessing
extensions mandate that SWP/SWPB instructions are treated as undefined
from reset, with the ability to enable them through the System Control
Register SW bit.
This patch adds the alternative solution to emulate the SWP and SWPB
instructions using LDREX/STREX sequences, and log statistics to
/proc/cpu/swp_emulation. To correctly deal with copy-on-write, it also
modifies cpu_v7_set_pte_ext to change the mappings to priviliged RO when
user RO.
Signed-off-by: Leif Lindholm <leif.lindholm@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch removes the domain switching functionality via the set_fs and
__switch_to functions on cores that have a TLS register.
Currently, the ioremap and vmalloc areas share the same level 1 page
tables and therefore have the same domain (DOMAIN_KERNEL). When the
kernel domain is modified from Client to Manager (via the __set_fs or in
the __switch_to function), the XN (eXecute Never) bit is overridden and
newer CPUs can speculatively prefetch the ioremap'ed memory.
Linux performs the kernel domain switching to allow user-specific
functions (copy_to/from_user, get/put_user etc.) to access kernel
memory. In order for these functions to work with the kernel domain set
to Client, the patch modifies the LDRT/STRT and related instructions to
the LDR/STR ones.
The user pages access rights are also modified for kernel read-only
access rather than read/write so that the copy-on-write mechanism still
works. CPU_USE_DOMAINS gets disabled only if the hardware has a TLS register
(CPU_32v6K is defined) since writing the TLS value to the high vectors page
isn't possible.
The user addresses passed to the kernel are checked by the access_ok()
function so that they do not point to the kernel space.
Tested-by: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (215 commits)
ARM: memblock: setup lowmem mappings using memblock
ARM: memblock: move meminfo into find_limits directly
ARM: memblock: convert free_highpages() to use memblock
ARM: move freeing of highmem pages out of mem_init()
ARM: memblock: convert memory detail printing to use memblock
ARM: memblock: use memblock to free memory into arm_bootmem_init()
ARM: memblock: use memblock when initializing memory allocators
ARM: ensure membank array is always sorted
ARM: 6466/1: implement flush_icache_all for the rest of the CPUs
ARM: 6464/2: fix spinlock recursion in adjust_pte()
ARM: fix memblock breakage
ARM: 6465/1: Fix data abort accessing proc_info from __lookup_processor_type
ARM: 6460/1: ixp2000: fix type of ixp2000_timer_interrupt
ARM: 6449/1: Fix for compiler warning of uninitialized variable.
ARM: 6445/1: fixup TCM memory types
ARM: imx: Add wake functionality to GPIO
ARM: mx5: Add gpio-keys to mx51 babbage board
ARM: imx: Add gpio-keys to plat-mxc
mx31_3ds: Fix spi registration
mx31_3ds: Fix the logic for detecting the debug board
...
Use memblock information to setup lowmem mappings rather than the
membank array.
This allows platforms to manipulate the memblock information during
initialization to reserve (and remove) memory from the kernel's view
of memory - and thus allowing platforms to setup their own private
mappings for this memory without causing problems with multiple
aliasing mappings:
size = min(size, SZ_2M);
base = memblock_alloc(size, min(align, SZ_2M));
memblock_free(base, size);
memblock_remove(base, size);
This is needed because multiple mappings of regions with differing
attributes (sharability, type, cache) are not permitted with ARMv6
and above.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
bootmem_init() no longer makes several uses of the membank
information, so move this into the one remaining called function
which does use it.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Free the high pages using the memblock memory lists - and more
importantly, exclude any memblock allocations in highmem from the
free'd memory.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This was missing from the noMMU code, so there was the possibility
of things not working as expected if out of order memory information
was passed.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 81d11955bf ("ARM: 6405/1: Handle __flush_icache_all for
CONFIG_SMP_ON_UP") added a new function to struct cpu_cache_fns:
flush_icache_all(). It also implemented this for v6 and v7 but not
for v5 and backwards. Without the function pointer in place, we
will be calling wrong cache functions.
For example with ep93xx we get following:
Unable to handle kernel paging request at virtual address ee070f38
pgd = c0004000
[ee070f38] *pgd=00000000
Internal error: Oops: 80000005 [#1] PREEMPT
last sysfs file:
Modules linked in:
CPU: 0 Not tainted (2.6.36+ #1)
PC is at 0xee070f38
LR is at __dma_alloc+0x11c/0x2d0
pc : [<ee070f38>] lr : [<c0032c8c>] psr: 60000013
sp : c581bde0 ip : 00000000 fp : c0472000
r10: c0472000 r9 : 000000d0 r8 : 00020000
r7 : 0001ffff r6 : 00000000 r5 : c0472400 r4 : c5980000
r3 : c03ab7e0 r2 : 00000000 r1 : c59a0000 r0 : c5980000
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: c000717f Table: c0004000 DAC: 00000017
Process swapper (pid: 1, stack limit = 0xc581a270)
[<c0032c8c>] (__dma_alloc+0x11c/0x2d0)
[<c0032e5c>] (dma_alloc_writecombine+0x1c/0x24)
[<c0204148>] (ep93xx_pcm_preallocate_dma_buffer+0x44/0x60)
[<c02041c0>] (ep93xx_pcm_new+0x5c/0x88)
[<c01ff188>] (snd_soc_instantiate_cards+0x8a8/0xbc0)
[<c01ff59c>] (soc_probe+0xfc/0x134)
[<c01adafc>] (platform_drv_probe+0x18/0x1c)
[<c01acca4>] (driver_probe_device+0xb0/0x16c)
[<c01ac284>] (bus_for_each_drv+0x48/0x84)
[<c01ace90>] (device_attach+0x50/0x68)
[<c01ac0f8>] (bus_probe_device+0x24/0x44)
[<c01aad7c>] (device_add+0x2fc/0x44c)
[<c01adfa8>] (platform_device_add+0x104/0x15c)
[<c0015eb8>] (simone_init+0x60/0x94)
[<c0021410>] (do_one_initcall+0xd0/0x1a4)
__dma_alloc() calls (inlined) __dma_alloc_buffer() which ends up
calling dmac_flush_range(). Now since the entries in the
arm920_cache_fns are shifted by one, we jump into address 0xee070f38
which is actually next instruction after the arm920_cache_fns
structure.
So implement flush_icache_all() for the rest of the supported CPUs
using a generic 'invalidate I cache' instruction.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When running following code in a machine which has VIVT caches and
USE_SPLIT_PTLOCKS is not defined:
fd = open("/etc/passwd", O_RDONLY);
addr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
addr2 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
v = *((int *)addr);
we will hang in spinlock recursion in the page fault handler:
BUG: spinlock recursion on CPU#0, mmap_test/717
lock: c5e295d8, .magic: dead4ead, .owner: mmap_test/717,
.owner_cpu: 0
[<c0026604>] (unwind_backtrace+0x0/0xec)
[<c014ee48>] (do_raw_spin_lock+0x40/0x140)
[<c0027f68>] (update_mmu_cache+0x208/0x250)
[<c0079db4>] (__do_fault+0x320/0x3ec)
[<c007af7c>] (handle_mm_fault+0x2f0/0x6d8)
[<c0027834>] (do_page_fault+0xdc/0x1cc)
[<c00202d0>] (do_DataAbort+0x34/0x94)
This comes from the fact that when USE_SPLIT_PTLOCKS is not defined,
the only lock protecting the page tables is mm->page_table_lock
which is already locked before update_mmu_cache() is called.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>