In commit e616c59140, highmem support was
deactivated for SMP platforms without hardware TLB ops broadcast because
usage of kmap_high_get() requires that IRQs be disabled when kmap_lock
is locked which is incompatible with the IPI mechanism used by the
software TLB ops broadcast invoked through flush_all_zero_pkmaps().
The reason for kmap_high_get() is to ensure that the currently kmap'd
page usage count does not decrease to zero while we're using its
existing virtual mapping in an atomic context. With a VIVT cache this
is essential to do due to cache coherency issues, but with a VIPT cache
this is only an optimization so not to pay the price of establishing a
second mapping if an existing one can be used. However, on VIPT
platforms without hardware TLB maintenance we can give up on that
optimization in order to be able to use highmem.
From ARMv7 onwards the TLB ops are broadcasted in hardware, so let's
disable ARCH_NEEDS_KMAP_HIGH_GET only when CONFIG_SMP and
CONFIG_CPU_TLB_V6 are defined.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Saeed Bishara <saeed.bishara@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add pud_offset() et.al. between the pgd and pmd code in preparation of
using pgtable-nopud.h rather than 4level-fixup.h.
This incorporates a fix from Jamie Iles <jamie@jamieiles.com> for
uaccess_with_memcpy.c.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The unsigned long datatype is not sufficient for mapping physical addresses
>= 4GB.
This patch ensures that the phys_addr_t datatype is used to represent physical
addresses when converting from a PFN.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
For the Kernel to support 2 level and 3 level page tables, physical
addresses (and also page table entries) need to be 32 or 64-bits depending
upon the configuration.
This patch uses the %08llx conversion specifier for physical addresses
and page table entries, ensuring that they are cast to (long long) so
that common code can be used regardless of the datatype widths.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
With LPAE we no longer have software bits in a separate Linux PTE and
the early_pte_alloc() function should pass PTE_HWTABLE_OFF +
PTE_HWTABLE_SIZE to early_alloc() to avoid allocating extra memory.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The hardware page tables use an XN bit 'execute never'. Historically,
we've had a Linux 'execute allow' bit, in the positive sense. Get rid
of this artifact as future hardware will continue to have the XN sense.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We have two places where we create identity mappings - one when we bring
secondary CPUs online, and one where we setup some mappings for soft-
reboot. Combine these two into a single implementation. Also collect
the identity mapping deletion function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch removes the domain switching functionality via the set_fs and
__switch_to functions on cores that have a TLS register.
Currently, the ioremap and vmalloc areas share the same level 1 page
tables and therefore have the same domain (DOMAIN_KERNEL). When the
kernel domain is modified from Client to Manager (via the __set_fs or in
the __switch_to function), the XN (eXecute Never) bit is overridden and
newer CPUs can speculatively prefetch the ioremap'ed memory.
Linux performs the kernel domain switching to allow user-specific
functions (copy_to/from_user, get/put_user etc.) to access kernel
memory. In order for these functions to work with the kernel domain set
to Client, the patch modifies the LDRT/STRT and related instructions to
the LDR/STR ones.
The user pages access rights are also modified for kernel read-only
access rather than read/write so that the copy-on-write mechanism still
works. CPU_USE_DOMAINS gets disabled only if the hardware has a TLS register
(CPU_32v6K is defined) since writing the TLS value to the high vectors page
isn't possible.
The user addresses passed to the kernel are checked by the access_ok()
function so that they do not point to the kernel space.
Tested-by: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Use memblock information to setup lowmem mappings rather than the
membank array.
This allows platforms to manipulate the memblock information during
initialization to reserve (and remove) memory from the kernel's view
of memory - and thus allowing platforms to setup their own private
mappings for this memory without causing problems with multiple
aliasing mappings:
size = min(size, SZ_2M);
base = memblock_alloc(size, min(align, SZ_2M));
memblock_free(base, size);
memblock_remove(base, size);
This is needed because multiple mappings of regions with differing
attributes (sharability, type, cache) are not permitted with ARMv6
and above.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This was missing from the noMMU code, so there was the possibility
of things not working as expected if out of order memory information
was passed.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Will says:
| Commit e63075a3 removed the explicit MEMBLOCK_REAL_LIMIT #define
| and introduced the requirement that arch code calls
| memblock_set_current_limit to ensure that the __va macro can
| be used on physical addresses returned from memblock_alloc.
Unfortunately, ARM was missed out of this change. Fix this.
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
After Santosh's fixup of the generic MT_MEMORY and
MT_MEMORY_NONCACHED I add this fix to the TCM memory types.
The main change is that the ITCM memory is L_PTE_WRITE and
DOMAIN_KERNEL which works just fine. The changed to the DTCM
is just cosmetic to fit with surrounding code.
Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Rickard Andersson <rickard.andersson@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
UP systems do not implement all the instructions that SMP systems have,
so in order to boot a SMP kernel on a UP system, we need to rewrite
parts of the kernel.
Do this using an 'alternatives' scheme, where the kernel code and data
is modified prior to initialization to replace the SMP instructions,
thereby rendering the problematical code ineffectual. We use the linker
to generate a list of 32-bit word locations and their replacement values,
and run through these replacements when we detect a UP system.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The commit f1a2481c0 sets up the default flags for MT_MEMORY and
MT_MEMORY_NONCACHED memory types. L_PTE_USER flag is wrongly
set as default for these entries so remove it. Also adding
the 'L_PTE_WRITE' flag so that these pages become read-write
instead of just being read-only
[this stops them being exposed to userspace, which is the main
concern here --rmk]
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch populates the L1 entries for MT_MEMORY and MT_MEMORY_NONCACHED
types so that at boot-up, we can map memories outside system memory
at page level granularity
Previously the mapping was limiting to section level, which creates
unnecessary additional mapping for which physical memory may not
present. On the newer ARM with speculation, this is dangerous and can
result in untraceable aborts.
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ARMv7 onwards requires that there are no aliases to the same physical
location using different memory types (i.e. Normal vs Strongly Ordered).
Access to SO mappings when the unaligned accesses are handled in
hardware is also Unpredictable (pgprot_noncached() mappings in user
space).
The /dev/mem driver requires uncached mappings with O_SYNC. The patch
implements the phys_mem_access_prot() function which generates Strongly
Ordered memory attributes if !pfn_valid() (independent of O_SYNC) and
Normal Noncacheable (writecombine) if O_SYNC.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The earlier TCM memory regions were mapped as MT_MEMORY_UNCACHED
which doesn't really work on platforms supporting the new v6
features like the NX bit. Add unique MT_MEMORY_[I|D]TCM types
instead.
Cc: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add a common early allocator function, in preparation for switching
over to LMB. When we do, this function will need to do a little more
than just allocating memory; we need it zero initialized too.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the platform specific bootmem memory reservations out of
arch/arm/mm/mmu.c into their respective platform files.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Everything should now be using sparsemem rather than discontigmem, so
remove the code supporting discontigmem from ARM.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than storing the minimum size of the vmalloc area, store the
maximum permitted address of the vmalloc area instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When crash happens in interrupt context there is no userspace context.
We always use current->active_mm in those cases.
Signed-off-by: Mika Westerberg <ext-mika.1.westerberg@nokia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Write combining/cached device mappings are not setting the shared bit,
which could potentially cause problems on SMP systems since the cache
lines won't participate in the cache coherency protocol.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
The ARM setup code includes its own parser for early params, there's
also one in the generic init code.
This patch removes __early_init (and related code) from
arch/arm/kernel/setup.c, and changes users to the generic early_init
macro instead.
The generic macro takes a char * argument, rather than char **, so we
need to update the parser functions a little.
Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We need to do that if we tinker with the MMU entries.
This fixes the occasional bug with kexec where the new
fails to uncompress with "crc error". Most likely at
least kexec on v6 and v7 need this fix.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
PAGE_KERNEL should not be executable; any area marked executable can
be prefetched into the instruction cache. We don't want vmalloc areas
to be read in this way.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The zero page is read-only, and has its cache state cleared during
boot. No further maintanence for this page is required.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Mapping the same memory using two different attributes (memory
type, shareability, cacheability) is unpredictable. During boot,
we encounter a situation when we're updating the kernel's page
tables which can lead to dirty cache lines existing in the cache
which are subsequently missed. This causes stack corruption,
and therefore a crash.
Therefore, ensure that the shared and cacheability settings
matches the configuration that will be used later; this together
with the restriction in early_cachepolicy() ensures that we won't
create a mismatch during boot.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We suffer an unfortunate combination of "features" which makes highmem
support on platforms without hardware TLB maintainence broadcast difficult:
- we need kmap_high_get() support for DMA cache coherence
- this requires kmap_high() to take a spinlock with IRQs disabled
- kmap_high() occasionally calls flush_all_zero_pkmaps() to clear
out old mappings
- flush_all_zero_pkmaps() calls flush_tlb_kernel_range(), which
on s/w IPI'd systems eventually calls smp_call_function_many()
- smp_call_function_many() must not be called with IRQs disabled:
WARNING: at kernel/smp.c:380 smp_call_function_many+0xc4/0x240()
Modules linked in:
Backtrace:
[<c00306f0>] (dump_backtrace+0x0/0x108) from [<c0286e6c>] (dump_stack+0x18/0x1c)
r6:c007cd18 r5:c02ff228 r4:0000017c
[<c0286e54>] (dump_stack+0x0/0x1c) from [<c0053e08>] (warn_slowpath_common+0x50/0x80)
[<c0053db8>] (warn_slowpath_common+0x0/0x80) from [<c0053e50>] (warn_slowpath_null+0x18/0x1c)
r7:00000003 r6:00000001 r5:c1ff4000 r4:c035fa34
[<c0053e38>] (warn_slowpath_null+0x0/0x1c) from [<c007cd18>] (smp_call_function_many+0xc4/0x240)
[<c007cc54>] (smp_call_function_many+0x0/0x240) from [<c007cec0>] (smp_call_function+0x2c/0x38)
[<c007ce94>] (smp_call_function+0x0/0x38) from [<c005980c>] (on_each_cpu+0x1c/0x38)
[<c00597f0>] (on_each_cpu+0x0/0x38) from [<c0031788>] (flush_tlb_kernel_range+0x50/0x58)
r6:00000001 r5:00000800 r4:c05f3590
[<c0031738>] (flush_tlb_kernel_range+0x0/0x58) from [<c009c600>] (flush_all_zero_pkmaps+0xc0/0xe8)
[<c009c540>] (flush_all_zero_pkmaps+0x0/0xe8) from [<c009c6b4>] (kmap_high+0x8c/0x1e0)
[<c009c628>] (kmap_high+0x0/0x1e0) from [<c00364a8>] (kmap+0x44/0x5c)
[<c0036464>] (kmap+0x0/0x5c) from [<c0109dfc>] (cramfs_readpage+0x3c/0x194)
[<c0109dc0>] (cramfs_readpage+0x0/0x194) from [<c0090c14>] (__do_page_cache_readahead+0x1f0/0x290)
[<c0090a24>] (__do_page_cache_readahead+0x0/0x290) from [<c0090ce4>] (ra_submit+0x30/0x38)
[<c0090cb4>] (ra_submit+0x0/0x38) from [<c0089384>] (filemap_fault+0x3dc/0x438)
r4:c1819988
[<c0088fa8>] (filemap_fault+0x0/0x438) from [<c009d21c>] (__do_fault+0x58/0x43c)
[<c009d1c4>] (__do_fault+0x0/0x43c) from [<c009e8cc>] (handle_mm_fault+0x104/0x318)
[<c009e7c8>] (handle_mm_fault+0x0/0x318) from [<c0033c98>] (do_page_fault+0x188/0x1e4)
[<c0033b10>] (do_page_fault+0x0/0x1e4) from [<c0033ddc>] (do_translation_fault+0x7c/0x84)
[<c0033d60>] (do_translation_fault+0x0/0x84) from [<c002b474>] (do_DataAbort+0x40/0xa4)
r8:c1ff5e20 r7:c0340120 r6:00000805 r5:c1ff5e54 r4:c03400d0
[<c002b434>] (do_DataAbort+0x0/0xa4) from [<c002bcac>] (__dabt_svc+0x4c/0x60)
...
So we disable highmem support on these systems.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Currently, highmem is selectable, and you can request an increased
vmalloc area. However, none of this has any effect on the memory
layout since a patch in the highmem series was accidentally dropped.
Moreover, even if you did want highmem, all memory would still be
registered as lowmem, possibly resulting in overflow of the available
virtual mapping space.
The highmem boundary is determined by the highest allowed beginning
of the vmalloc area, which depends on its configurable minimum size
(see commit 60296c71f6 for details on
this).
We should create mappings and initialize bootmem only for low memory,
while the zone allocator must still be told about highmem.
Currently, memory nodes which are completely located in high memory
are not supported. This is not a huge limitation since systems
relying on highmem support are unlikely to have discontiguous memory
with large holes.
[ A similar patch was meant to be merged before commit 5f0fbf9eca
and be available in Linux v2.6.30, however some git rebase screw-up
of mine dropped the first commit of the series, and that goofage
escaped testing somehow as well. -- Nico ]
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Nicolas Pitre <nico@marvell.com>