Commit graph

1741 commits

Author SHA1 Message Date
Punit Agrawal
c3e4ed5c3d arm64: hugetlb: Override huge_pte_clear() to support contiguous hugepages
The default huge_pte_clear() implementation does not clear contiguous
page table entries when it encounters contiguous hugepages that are
supported on arm64.

Fix this by overriding the default implementation to clear all the
entries associated with contiguous hugepages.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: David Woods <dwoods@mellanox.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-22 17:47:10 +01:00
Hoeun Ryu
a88ce63b64 arm64: kexec: have own crash_smp_send_stop() for crash dump for nonpanic cores
Commit 0ee5941 : (x86/panic: replace smp_send_stop() with kdump friendly
version in panic path) introduced crash_smp_send_stop() which is a weak
function and can be overridden by architecture codes to fix the side effect
caused by commit f06e515 : (kernel/panic.c: add "crash_kexec_post_
notifiers" option).

 ARM64 architecture uses the weak version function and the problem is that
the weak function simply calls smp_send_stop() which makes other CPUs
offline and takes away the chance to save crash information for nonpanic
CPUs in machine_crash_shutdown() when crash_kexec_post_notifiers kernel
option is enabled.

 Calling smp_send_crash_stop() in machine_crash_shutdown() is useless
because all nonpanic CPUs are already offline by smp_send_stop() in this
case and smp_send_crash_stop() only works against online CPUs.

 The result is that secondary CPUs registers are not saved by
crash_save_cpu() and the vmcore file misreports these CPUs as being
offline.

 crash_smp_send_stop() is implemented to fix this problem by replacing the
existing smp_send_crash_stop() and adding a check for multiple calling to
the function. The function (strong symbol version) saves crash information
for nonpanic CPUs and machine_crash_shutdown() tries to save crash
information for nonpanic CPUs only when crash_kexec_post_notifiers kernel
option is disabled.

* crash_kexec_post_notifiers : false

  panic()
    __crash_kexec()
      machine_crash_shutdown()
        crash_smp_send_stop()    <= save crash dump for nonpanic cores

* crash_kexec_post_notifiers : true

  panic()
    crash_smp_send_stop()        <= save crash dump for nonpanic cores
    __crash_kexec()
      machine_crash_shutdown()
        crash_smp_send_stop()    <= just return.

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-21 18:01:04 +01:00
Catalin Marinas
af29678fe7 arm64: Remove the !CONFIG_ARM64_HW_AFDBM alternative code paths
Since the pte handling for hardware AF/DBM works even when the hardware
feature is not present, make the pte accessors implementation permanent
and remove the corresponding #ifdefs. The Kconfig option is kept as it
can still be used to disable the feature at the hardware level.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-21 11:13:11 +01:00
Catalin Marinas
64c26841b3 arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect()
ptep_set_wrprotect() is only called on CoW mappings which are private
(!VM_SHARED) with the pte either read-only (!PTE_WRITE && PTE_RDONLY) or
writable and software-dirty (PTE_WRITE && !PTE_RDONLY && PTE_DIRTY).
There is no race with the hardware update of the dirty state: clearing
of PTE_RDONLY when PTE_WRITE (a.k.a. PTE_DBM) is set. This patch removes
the code setting the software PTE_DIRTY bit in ptep_set_wrprotect() as
superfluous. A VM_WARN_ONCE is introduced in case the above logic is
wrong or the core mm code changes its use of ptep_set_wrprotect().

Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-21 11:13:00 +01:00
Catalin Marinas
73e86cb03c arm64: Move PTE_RDONLY bit handling out of set_pte_at()
Currently PTE_RDONLY is treated as a hardware only bit and not handled
by the pte_mkwrite(), pte_wrprotect() or the user PAGE_* definitions.
The set_pte_at() function is responsible for setting this bit based on
the write permission or dirty state. This patch moves the PTE_RDONLY
handling out of set_pte_at into the pte_mkwrite()/pte_wrprotect()
functions. The PAGE_* definitions to need to be updated to explicitly
include PTE_RDONLY when !PTE_WRITE.

The patch also removes the redundant PAGE_COPY(_EXEC) definitions as
they are identical to the corresponding PAGE_READONLY(_EXEC).

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-21 11:12:50 +01:00
Catalin Marinas
0966253d7c kvm: arm64: Convert kvm_set_s2pte_readonly() from inline asm to cmpxchg()
To take advantage of the LSE atomic instructions and also make the code
cleaner, convert the kvm_set_s2pte_readonly() function to use the more
generic cmpxchg().

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-21 11:12:39 +01:00
Catalin Marinas
3bbf7157ac arm64: Convert pte handling from inline asm to using (cmp)xchg
With the support for hardware updates of the access and dirty states,
the following pte handling functions had to be implemented using
exclusives: __ptep_test_and_clear_young(), ptep_get_and_clear(),
ptep_set_wrprotect() and ptep_set_access_flags(). To take advantage of
the LSE atomic instructions and also make the code cleaner, convert
these pte functions to use the more generic cmpxchg()/xchg().

Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-21 11:12:29 +01:00
Ingo Molnar
94edf6f3c2 Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

 - Removal of spin_unlock_wait()
 - SRCU updates
 - Torture-test updates
 - Documentation updates
 - Miscellaneous fixes
 - CPU-hotplug fixes
 - Miscellaneous non-RCU fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-21 09:45:19 +02:00
Ard Biesheuvel
760b61d76d efi/libstub/arm64: Use hidden attribute for struct screen_info reference
To prevent the compiler from emitting absolute references to screen_info
when building position independent code, redeclare the symbol with hidden
visibility.

Tested-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170818194947.19347-3-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-21 09:43:49 +02:00
Linus Torvalds
2615a38f14 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "A few small fixes for timer drivers:

   - Prevent infinite recursion in the arm architected timer driver with
     ftrace

   - Propagate error codes to the caller in case of failure in EM STI
     driver

   - Adjust a bogus loop iteration in the arm architected timer driver

   - Add a missing Kconfig dependency to the pistachio clocksource to
     prevent build failures

   - Correctly check for IS_ERR() instead of NULL in the shared timer-of
     code"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is enabled
  clocksource/drivers/Kconfig: Fix CLKSRC_PISTACHIO dependencies
  clocksource/drivers/timer-of: Checking for IS_ERR() instead of NULL
  clocksource/drivers/em_sti: Fix error return codes in em_sti_probe()
  clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
2017-08-20 09:34:24 -07:00
Kees Cook
c715b72c1b mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes
Moving the x86_64 and arm64 PIE base from 0x555555554000 to 0x000100000000
broke AddressSanitizer.  This is a partial revert of:

  eab09532d4 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
  02445990a9 ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB")

The AddressSanitizer tool has hard-coded expectations about where
executable mappings are loaded.

The motivation for changing the PIE base in the above commits was to
avoid the Stack-Clash CVEs that allowed executable mappings to get too
close to heap and stack.  This was mainly a problem on 32-bit, but the
64-bit bases were moved too, in an effort to proactively protect those
systems (proofs of concept do exist that show 64-bit collisions, but
other recent changes to fix stack accounting and setuid behaviors will
minimize the impact).

The new 32-bit PIE base is fine for ASan (since it matches the ET_EXEC
base), so only the 64-bit PIE base needs to be reverted to let x86 and
arm64 ASan binaries run again.  Future changes to the 64-bit PIE base on
these architectures can be made optional once a more dynamic method for
dealing with AddressSanitizer is found.  (e.g.  always loading PIE into
the mmap region for marked binaries.)

Link: http://lkml.kernel.org/r/20170807201542.GA21271@beast
Fixes: eab09532d4 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
Fixes: 02445990a9 ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Kostya Serebryany <kcc@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:02 -07:00
Catalin Marinas
a7ba38d680 Merge branch 'for-next/kernel-mode-neon' into for-next/core
* for-next/kernel-mode-neon:
  arm64: neon/efi: Make EFI fpsimd save/restore variables static
  arm64: neon: Forbid when irqs are disabled
  arm64: neon: Export kernel_neon_busy to loadable modules
  arm64: neon: Temporarily add a kernel_mode_begin_partial() definition
  arm64: neon: Remove support for nested or hardirq kernel-mode NEON
  arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
  arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
  arm64: neon: Add missing header guard in <asm/neon.h>
  arm64: neon: replace generic definition of may_use_simd()
2017-08-18 18:32:50 +01:00
Paul E. McKenney
952111d7db arch: Remove spin_unlock_wait() arch-specific definitions
There is no agreed-upon definition of spin_unlock_wait()'s semantics,
and it appears that all callers could do just as well with a lock/unlock
pair.  This commit therefore removes the underlying arch-specific
arch_spin_unlock_wait() for all architectures providing them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
2017-08-17 08:08:59 -07:00
Catalin Marinas
df5b95bee1 Merge branch 'arm64/vmap-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core
* 'arm64/vmap-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux:
  arm64: add VMAP_STACK overflow detection
  arm64: add on_accessible_stack()
  arm64: add basic VMAP_STACK support
  arm64: use an irq stack pointer
  arm64: assembler: allow adr_this_cpu to use the stack pointer
  arm64: factor out entry stack manipulation
  efi/arm64: add EFI_KIMG_ALIGN
  arm64: move SEGMENT_ALIGN to <asm/memory.h>
  arm64: clean up irq stack definitions
  arm64: clean up THREAD_* definitions
  arm64: factor out PAGE_* and CONT_* definitions
  arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP
  fork: allow arch-override of VMAP stack alignment
  arm64: remove __die()'s stack dump
2017-08-15 18:40:58 +01:00
Mark Rutland
872d8327ce arm64: add VMAP_STACK overflow detection
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.

Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).

Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.

The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.

This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:

[  305.388749] lkdtm: Performing direct entry OVERFLOW
[  305.395444] Insufficient stack space to handle exception!
[  305.395482] ESR: 0x96000047 -- DABT (current EL)
[  305.399890] FAR: 0xffff00000a5e7f30
[  305.401315] Task stack:     [0xffff00000a5e8000..0xffff00000a5ec000]
[  305.403815] IRQ stack:      [0xffff000008000000..0xffff000008004000]
[  305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[  305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[  305.412785] Hardware name: linux,dummy-virt (DT)
[  305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[  305.419221] PC is at recursive_loop+0x10/0x48
[  305.421637] LR is at recursive_loop+0x38/0x48
[  305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
[  305.428020] sp : ffff00000a5e7f50
[  305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[  305.433191] x27: ffff000008981000 x26: ffff000008f80400
[  305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[  305.440369] x23: ffff000008f80138 x22: 0000000000000009
[  305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[  305.444552] x19: 0000000000000013 x18: 0000000000000006
[  305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[  305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[  305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[  305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[  305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[  305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[  305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[  305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[  305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[  305.467724] Kernel panic - not syncing: kernel stack overflow
[  305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[  305.473325] Hardware name: linux,dummy-virt (DT)
[  305.475070] Call trace:
[  305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
[  305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
[  305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
[  305.483294] [<ffff0000080c3288>] panic+0x118/0x280
[  305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
[  305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
[  305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[  305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[  305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[  305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[  305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[  305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[  305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[  305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[  305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[  305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[  305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[  305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
[  305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
[  305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
[  305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
[  305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
[  305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
[  305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
[  305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
[  305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
[  305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
[  305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[  305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[  305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[  305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[  305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[  305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[  305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[  305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[  305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[  305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[  305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
[  305.504720] Kernel Offset: disabled
[  305.505189] CPU features: 0x002082
[  305.505473] Memory Limit: none
[  305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow

This patch was co-authored by Ard Biesheuvel and Mark Rutland.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-08-15 18:36:18 +01:00
Mark Rutland
12964443e8 arm64: add on_accessible_stack()
Both unwind_frame() and dump_backtrace() try to check whether a stack
address is sane to access, with very similar logic. Both will need
updating in order to handle overflow stacks.

Factor out this logic into a helper, so that we can avoid further
duplication when we add overflow stacks.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-08-15 18:36:12 +01:00
Mark Rutland
e3067861ba arm64: add basic VMAP_STACK support
This patch enables arm64 to be built with vmap'd task and IRQ stacks.

As vmap'd stacks are mapped at page granularity, stacks must be a multiple of
PAGE_SIZE. This means that a 64K page kernel must use stacks of at least 64K in
size.

To minimize the increase in Image size, IRQ stacks are dynamically allocated at
boot time, rather than embedding the boot CPU's IRQ stack in the kernel image.

This patch was co-authored by Ard Biesheuvel and Mark Rutland.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-08-15 18:36:04 +01:00
Mark Rutland
f60fe78f13 arm64: use an irq stack pointer
We allocate our IRQ stacks using a percpu array. This allows us to generate our
IRQ stack pointers with adr_this_cpu, but bloats the kernel Image with the boot
CPU's IRQ stack. Additionally, these are packed with other percpu variables,
and aren't guaranteed to have guard pages.

When we enable VMAP_STACK we'll want to vmap our IRQ stacks also, in order to
provide guard pages and to permit more stringent alignment requirements. Doing
so will require that we use a percpu pointer to each IRQ stack, rather than
allocating a percpu IRQ stack in the kernel image.

This patch updates our IRQ stack code to use a percpu pointer to the base of
each IRQ stack. This will allow us to change the way the stack is allocated
with minimal changes elsewhere. In some cases we may try to backtrace before
the IRQ stack pointers are initialised, so on_irq_stack() is updated to account
for this.

In testing with cyclictest, there was no measureable difference between using
adr_this_cpu (for irq_stack) and ldr_this_cpu (for irq_stack_ptr) in the IRQ
entry path.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-08-15 18:35:54 +01:00
Ard Biesheuvel
8ea41b11ef arm64: assembler: allow adr_this_cpu to use the stack pointer
Given that adr_this_cpu already requires a temp register in addition
to the destination register, tweak the instruction sequence so that sp
may be used as well.

This will simplify switching to per-cpu stacks in subsequent patches. While
this limits the range of adr_this_cpu, to +/-4GiB, we don't currently use
adr_this_cpu in modules, and this is not problematic for the main kernel image.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: add more commit text]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-08-15 18:35:47 +01:00
Mark Rutland
170976bcab efi/arm64: add EFI_KIMG_ALIGN
The EFI stub is intimately coupled with the kernel, and takes advantage
of this by relocating the kernel at a weaker alignment than the
documented boot protocol mandates.

However, it does so by assuming it can align the kernel to the segment
alignment, and assumes that this is 64K. In subsequent patches, we'll
have to consider other details to determine this de-facto alignment
constraint.

This patch adds a new EFI_KIMG_ALIGN definition that will track the
kernel's de-facto alignment requirements. Subsequent patches will modify
this as required.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
2017-08-15 18:35:32 +01:00
Mark Rutland
8018ba4edf arm64: move SEGMENT_ALIGN to <asm/memory.h>
Currently we define SEGMENT_ALIGN directly in our vmlinux.lds.S.

This is unfortunate, as the EFI stub currently open-codes the same
number, and in future we'll want to fiddle with this.

This patch moves the definition to our <asm/memory.h>, where it can be
used by both vmlinux.lds.S and the EFI stub code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-08-15 18:35:22 +01:00
Mark Rutland
f60ad4edcf arm64: clean up irq stack definitions
Before we add yet another stack to the kernel, it would be nice to
ensure that we consistently organise stack definitions and related
helper functions.

This patch moves the basic IRQ stack defintions to <asm/memory.h> to
live with their task stack counterparts. Helpers used for unwinding are
moved into <asm/stacktrace.h>, where subsequent patches will add helpers
for other stacks. Includes are fixed up accordingly.

This patch is a pure refactoring -- there should be no functional
changes as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-08-15 18:35:14 +01:00
Mark Rutland
dbc9344a68 arm64: clean up THREAD_* definitions
Currently we define THREAD_SIZE and THREAD_SIZE_ORDER separately, with
the latter dependent on particular CONFIG_ARM64_*K_PAGES definitions.
This is somewhat opaque, and will get in the way of future modifications
to THREAD_SIZE.

This patch cleans this up, defining both in terms of a common
THREAD_SHIFT, and using PAGE_SHIFT to calculate THREAD_SIZE_ORDER,
rather than using a number of definitions dependent on config symbols.
Subsequent patches will make use of this to alter the stack size used in
some configurations.

At the same time, these are moved into <asm/memory.h>, which will avoid
circular include issues in subsequent patches. To ensure that existing
code isn't adversely affected, <asm/thread_info.h> is updated to
transitively include these definitions.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-08-15 18:35:07 +01:00
Mark Rutland
b6531456ba arm64: factor out PAGE_* and CONT_* definitions
Some headers rely on PAGE_* definitions from <asm/page.h>, but cannot
include this due to potential circular includes. For example, a number
of definitions in <asm/memory.h> rely on PAGE_SHIFT, and <asm/page.h>
includes <asm/memory.h>.

This requires users of these definitions to include both headers, which
is fragile and error-prone.

This patch ameliorates matters by moving the basic definitions out to a
new header, <asm/page-def.h>. Both <asm/page.h> and <asm/memory.h> are
updated to include this, avoiding this fragility, and avoiding the
possibility of circular include dependencies.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-08-15 18:35:00 +01:00
Ard Biesheuvel
34be98f494 arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP
For historical reasons, we leave the top 16 bytes of our task and IRQ
stacks unused, a practice used to ensure that the SP can always be
masked to find the base of the current stack (historically, where
thread_info could be found).

However, this is not necessary, as:

* When an exception is taken from a task stack, we decrement the SP by
  S_FRAME_SIZE and stash the exception registers before we compare the
  SP against the task stack. In such cases, the SP must be at least
  S_FRAME_SIZE below the limit, and can be safely masked to determine
  whether the task stack is in use.

* When transitioning to an IRQ stack, we'll place a dummy frame onto the
  IRQ stack before enabling asynchronous exceptions, or executing code
  we expect to trigger faults. Thus, if an exception is taken from the
  IRQ stack, the SP must be at least 16 bytes below the limit.

* We no longer mask the SP to find the thread_info, which is now found
  via sp_el0. Note that historically, the offset was critical to ensure
  that cpu_switch_to() found the correct stack for new threads that
  hadn't yet executed ret_from_fork().

Given that, this initial offset serves no purpose, and can be removed.
This brings us in-line with other architectures (e.g. x86) which do not
rely on this masking.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: rebase, kill THREAD_START_SP, commit msg additions]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-08-15 18:34:53 +01:00
Dou Liyang
969ff73e72 arm64: numa: Remove the unused parent_node() macro
Commit a7be6e5a7f ("mm: drop useless local parameters of
__register_one_node()") removes the last user of parent_node().

The parent_node() macro in ARM64 platform is unnecessary.

Remove it for cleanup.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-15 14:37:45 +01:00
Ingo Molnar
b60bf53abc Merge branch 'clockevents/4.13-fixes' of http://git.linaro.org/people/daniel.lezcano/linux into timers/urgent
Pull clockevents fixes from Daniel Lezcano:

" - Fix error check against IS_ERR() instead of NULL for the timer-of code (Dan Carpenter)
  - Fix infinite recusion with ftrace for the ARM architected timer (Ding Tianhong)
  - Fix the error code return in the em_sti's probe function (Gustavo A. R.  Silva)
  - Fix Kconfig dependency for the pistachio driver (Matt Redfearn)
  - Fix mem frame loop initialization for the ARM architected timer (Matthias Kaehlcke)"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-13 11:39:45 +02:00
Ding Tianhong
adb4f11e0a clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is enabled
On platforms with an arch timer erratum workaround, it's possible for
arch_timer_reg_read_stable() to recurse into itself when certain
tracing options are enabled, leading to stack overflows and related
problems.

For example, when PREEMPT_TRACER and FUNCTION_GRAPH_TRACER are
selected, it's possible to trigger this with:

$ mount -t debugfs nodev /sys/kernel/debug/
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer

The problem is that in such cases, preempt_disable() instrumentation
attempts to acquire a timestamp via trace_clock(), resulting in a call
back to arch_timer_reg_read_stable(), and hence recursion.

This patch changes arch_timer_reg_read_stable() to use
preempt_{disable,enable}_notrace(), which avoids this.

This problem is similar to the fixed by upstream commit 96b3d28bf4
("sched/clock: Prevent tracing recursion in sched_clock_cpu()").

Fixes: 6acc71ccac ("arm64: arch_timer: Allows a CPU-specific erratum to only affect a subset of CPUs")
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-11 16:01:43 +02:00
Kevin Brodsky
82d24d114f arm64: compat: Remove leftover variable declaration
Commit a1d5ebaf8c ("arm64: big-endian: don't treat code as data when
copying sigret code") moved the 32-bit sigreturn trampoline code from
the aarch32_sigret_code array to kuser32.S. The commit removed the
array definition from signal32.c, but not its declaration in
signal32.h. Remove the leftover declaration.

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-10 18:17:32 +01:00
Masami Hiramatsu
229a718605 irq: Make the irqentry text section unconditional
Generate irqentry and softirqentry text sections without
any Kconfig dependencies. This will add extra sections, but
there should be no performace impact.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David S . Miller <davem@davemloft.net>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: linux-arch@vger.kernel.org
Cc: linux-cris-kernel@axis.com
Cc: mathieu.desnoyers@efficios.com
Link: http://lkml.kernel.org/r/150172789110.27216.3955739126693102122.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 16:28:53 +02:00
Peter Zijlstra
a9668cd6ee locking: Remove smp_mb__before_spinlock()
Now that there are no users of smp_mb__before_spinlock() left, remove
it entirely.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:29:03 +02:00
Peter Zijlstra
d89e588ca4 locking: Introduce smp_mb__after_spinlock()
Since its inception, our understanding of ACQUIRE, esp. as applied to
spinlocks, has changed somewhat. Also, I wonder if, with a simple
change, we cannot make it provide more.

The problem with the comment is that the STORE done by spin_lock isn't
itself ordered by the ACQUIRE, and therefore a later LOAD can pass over
it and cross with any prior STORE, rendering the default WMB
insufficient (pointed out by Alan).

Now, this is only really a problem on PowerPC and ARM64, both of
which already defined smp_mb__before_spinlock() as a smp_mb().

At the same time, we can get a much stronger construct if we place
that same barrier _inside_ the spin_lock(). In that case we upgrade
the RCpc spinlock to an RCsc.  That would make all schedule() calls
fully transitive against one another.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:29:02 +02:00
Catalin Marinas
0553896787 Merge branch 'arm64/exception-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core
* 'arm64/exception-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux:
  arm64: unwind: remove sp from struct stackframe
  arm64: unwind: reference pt_regs via embedded stack frame
  arm64: unwind: disregard frame.sp when validating frame pointer
  arm64: unwind: avoid percpu indirection for irq stack
  arm64: move non-entry code out of .entry.text
  arm64: consistently use bl for C exception entry
  arm64: Add ASM_BUG()
2017-08-09 15:37:49 +01:00
Dave Martin
66c3ec5a71 arm64: neon: Forbid when irqs are disabled
Currently, may_use_simd() can return true if IRQs are disabled.  If
the caller goes ahead and calls kernel_neon_begin(), this can
result in use of local_bh_enable() in an unsafe context.

In particular, __efi_fpsimd_begin() may do this when calling EFI as
part of system shutdown.

This patch ensures that callers don't think they can use
kernel_neon_begin() in such a context.

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09 15:05:59 +01:00
Ard Biesheuvel
31e43ad3b7 arm64: unwind: remove sp from struct stackframe
The unwind code sets the sp member of struct stackframe to
'frame pointer + 0x10' unconditionally, without regard for whether
doing so produces a legal value. So let's simply remove it now that
we have stopped using it anyway.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
2017-08-09 14:10:29 +01:00
Ard Biesheuvel
7326749801 arm64: unwind: reference pt_regs via embedded stack frame
As it turns out, the unwind code is slightly broken, and probably has
been for a while. The problem is in the dumping of the exception stack,
which is intended to dump the contents of the pt_regs struct at each
level in the call stack where an exception was taken and routed to a
routine marked as __exception (which means its stack frame is right
below the pt_regs struct on the stack).

'Right below the pt_regs struct' is ill defined, though: the unwind
code assigns 'frame pointer + 0x10' to the .sp member of the stackframe
struct at each level, and dump_backtrace() happily dereferences that as
the pt_regs pointer when encountering an __exception routine. However,
the actual size of the stack frame created by this routine (which could
be one of many __exception routines we have in the kernel) is not known,
and so frame.sp is pretty useless to figure out where struct pt_regs
really is.

So it seems the only way to ensure that we can find our struct pt_regs
when walking the stack frames is to put it at a known fixed offset of
the stack frame pointer that is passed to such __exception routines.
The simplest way to do that is to put it inside pt_regs itself, which is
the main change implemented by this patch. As a bonus, doing this allows
us to get rid of a fair amount of cruft related to walking from one stack
to the other, which is especially nice since we intend to introduce yet
another stack for overflow handling once we add support for vmapped
stacks. It also fixes an inconsistency where we only add a stack frame
pointing to ELR_EL1 if we are executing from the IRQ stack but not when
we are executing from the task stack.

To consistly identify exceptions regs even in the presence of exceptions
taken from entry code, we must check whether the next frame was created
by entry text, rather than whether the current frame was crated by
exception text.

To avoid backtracing using PCs that fall in the idmap, or are controlled
by userspace, we must explcitly zero the FP and LR in startup paths, and
must ensure that the frame embedded in pt_regs is zeroed upon entry from
EL0. To avoid these NULL entries showin in the backtrace, unwind_frame()
is updated to avoid them.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: compare current frame against .entry.text, avoid bogus PCs]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
2017-08-09 14:07:13 +01:00
Robin Murphy
5d7bdeb1ee arm64: uaccess: Implement *_flushcache variants
Implement the set of copy functions with guarantees of a clean cache
upon completion necessary to support the pmem driver.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09 12:16:26 +01:00
Robin Murphy
d50e071fda arm64: Implement pmem API support
Add a clean-to-point-of-persistence cache maintenance helper, and wire
up the basic architectural support for the pmem driver based on it.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[catalin.marinas@arm.com: move arch_*_pmem() functions to arch/arm64/mm/flush.c]
[catalin.marinas@arm.com: change dmb(sy) to dmb(osh)]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09 12:15:45 +01:00
Robin Murphy
e1bc5d1b8e arm64: Handle trapped DC CVAP
Cache clean to PoP is subject to the same access controls as to PoC, so
if we are trapping userspace cache maintenance with SCTLR_EL1.UCI, we
need to be prepared to handle it. To avoid getting into complicated
fights with binutils about ARMv8.2 options, we'll just cheat and use the
raw SYS instruction rather than the 'proper' DC alias.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09 11:00:43 +01:00
Robin Murphy
7aac405ebb arm64: Expose DC CVAP to userspace
The ARMv8.2-DCPoP feature introduces persistent memory support to the
architecture, by defining a point of persistence in the memory
hierarchy, and a corresponding cache maintenance operation, DC CVAP.
Expose the support via HWCAP and MRS emulation.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09 11:00:35 +01:00
Robin Murphy
d46befef4c arm64: Convert __inval_cache_range() to area-based
__inval_cache_range() is already the odd one out among our data cache
maintenance routines as the only remaining range-based one; as we're
going to want an invalidation routine to call from C code for the pmem
API, let's tweak the prototype and name to bring it in line with the
clean operations, and to make its relationship with __dma_inv_area()
neatly mirror that of __clean_dcache_area_poc() and __dma_clean_area().
The loop clearing the early page tables gets mildly massaged in the
process for the sake of consistency.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09 11:00:23 +01:00
Robin Murphy
09c2a7dc4c arm64: mm: Fix set_memory_valid() declaration
Clearly, set_memory_valid() has never been seen in the same room as its
declaration... Whilst the type mismatch is such that kexec probably
wasn't broken in practice, fix it to match the definition as it should.

Fixes: 9b0aa14e31 ("arm64: mm: add set_memory_valid()")
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09 10:59:52 +01:00
Ard Biesheuvel
c736533075 arm64: unwind: disregard frame.sp when validating frame pointer
Currently, when unwinding the call stack, we validate the frame pointer
of each frame against frame.sp, whose value is not clearly defined, and
which makes it more difficult to link stack frames together across
different stacks. It is far better to simply check whether the frame
pointer itself points into a valid stack.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
2017-08-08 16:28:26 +01:00
Mark Rutland
096683724c arm64: unwind: avoid percpu indirection for irq stack
Our IRQ_STACK_PTR() and on_irq_stack() helpers both take a cpu argument,
used to generate a percpu address. In all cases, they are passed
{raw_,}smp_processor_id(), so this parameter is redundant.

Since {raw_,}smp_processor_id() use a percpu variable internally, this
approach means we generate a percpu offset to find the current cpu, then
use this to index an array of percpu offsets, which we then use to find
the current CPU's IRQ stack pointer. Thus, most of the work is
redundant.

Instead, we can consistently use raw_cpu_ptr() to generate the CPU's
irq_stack pointer by simply adding the percpu offset to the irq_stack
address, which is simpler in both respects.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
2017-08-08 16:28:25 +01:00
Mark Rutland
ed84b4e958 arm64: move non-entry code out of .entry.text
Currently, cpu_switch_to and ret_from_fork both live in .entry.text,
though neither form the critical path for an exception entry.

In subsequent patches, we will require that code in .entry.text is part
of the critical path for exception entry, for which we can assume
certain properties (e.g. the presence of exception regs on the stack).

Neither cpu_switch_to nor ret_from_fork will meet these requirements, so
we must move them out of .entry.text. To ensure that neither are kprobed
after being moved out of .entry.text, we must explicitly blacklist them,
requiring a new NOKPROBE() asm helper.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
2017-08-08 16:28:25 +01:00
Mark Rutland
db44e9c5ec arm64: Add ASM_BUG()
Currently. we can only use BUG() from C code, though there are
situations where we would like an equivalent mechanism in assembly code.

This patch refactors our BUG() definition such that it can be used in
either C or assembly, in the form of a new ASM_BUG().

The refactoring requires the removal of escape sequences, such as '\n'
and '\t', but these aren't strictly necessary as we can use ';' to
terminate assembler statements.

The low-level assembly is factored out into <asm/asm-bug.h>, with
<asm/bug.h> retained as the C wrapper.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
2017-08-08 16:28:13 +01:00
Julien Thierry
1f9b8936f3 arm64: Decode information from ESR upon mem faults
When receiving unhandled faults from the CPU, description is very sparse.
Adding information about faults decoded from ESR.

Added defines to esr.h corresponding ESR fields. Values are based on ARM
Archtecture Reference Manual (DDI 0487B.a), section D7.2.28 ESR_ELx, Exception
Syndrome Register (ELx) (pages D7-2275 to D7-2280).

New output is of the form:
[   77.818059] Mem abort info:
[   77.820826]   Exception class = DABT (current EL), IL = 32 bits
[   77.826706]   SET = 0, FnV = 0
[   77.829742]   EA = 0, S1PTW = 0
[   77.832849] Data abort info:
[   77.835713]   ISV = 0, ISS = 0x00000070
[   77.839522]   CM = 0, WnR = 1

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: fix "%lu" in a pr_alert() call]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-07 09:58:33 +01:00
Dave Martin
17c2895860 arm64: Abstract syscallno manipulation
The -1 "no syscall" value is written in various ways, shared with
the user ABI in some places, and generally obscure.

This patch attempts to make things a little more consistent and
readable by replacing all these uses with a single #define.  A
couple of symbolic helpers are provided to clarify the intent
further.

Because the in-syscall check in do_signal() is changed from >= 0 to
!= NO_SYSCALL by this patch, different behaviour may be observable
if syscallno is set to values less than -1 by a tracer.  However,
this is not different from the behaviour that is already observable
if a tracer sets syscallno to a value >= __NR_(compat_)syscalls.

It appears that this can cause spurious syscall restarting, but
that is not a new behaviour either, and does not appear harmful.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-07 09:58:33 +01:00
Dave Martin
35d0e6fb4d arm64: syscallno is secretly an int, make it official
The upper 32 bits of the syscallno field in thread_struct are
handled inconsistently, being sometimes zero extended and sometimes
sign-extended.  In fact, only the lower 32 bits seem to have any
real significance for the behaviour of the code: it's been OK to
handle the upper bits inconsistently because they don't matter.

Currently, the only place I can find where those bits are
significant is in calling trace_sys_enter(), which may be
unintentional: for example, if a compat tracer attempts to cancel a
syscall by passing -1 to (COMPAT_)PTRACE_SET_SYSCALL at the
syscall-enter-stop, it will be traced as syscall 4294967295
rather than -1 as might be expected (and as occurs for a native
tracer doing the same thing).  Elsewhere, reads of syscallno cast
it to an int or truncate it.

There's also a conspicuous amount of code and casting to bodge
around the fact that although semantically an int, syscallno is
stored as a u64.

Let's not pretend any more.

In order to preserve the stp x instruction that stores the syscall
number in entry.S, this patch special-cases the layout of struct
pt_regs for big endian so that the newly 32-bit syscallno field
maps onto the low bits of the stored value.  This is not beautiful,
but benchmarking of the getpid syscall on Juno suggests indicates a
minor slowdown if the stp is split into an stp x and stp w.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-07 09:58:33 +01:00
Catalin Marinas
174dfb1286 arm64: neon: Temporarily add a kernel_mode_begin_partial() definition
The crypto code currently relies on kernel_mode_begin_partial() being
available. Until the corresponding crypto patches are merged, define
this macro temporarily, though with different semantics as it cannot be
called in interrupt context.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-04 15:10:12 +01:00