commit 8d92e992a785f35d23f845206cf8c6cafbc264e0 upstream.
The default definitions use fill pattern 0x90 for padding, which on ARC
generates an unintended "ldh_s r12,[r0,0x20]" corresponding to opcode 0x9090.
So use ".align 4", which inserts a "nop_s" instruction instead.
Cc: stable@vger.kernel.org
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7980dff398f86a618f502378fa27cf7e77449afa upstream.
Add a missing property to GMAC node so that multicast filtering works
correctly.
Fixes: 556cc1c5f5 ("ARC: [axs101] Add support for AXS101 SDP (software development platform)")
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3624379d90ad2b65f9dbb30d7f7ce5498d2fe322 ]
If IOC was already enabled (due to bootloader) it technically needs to
be reconfigured with aperture base,size corresponding to Linux memory map
which will certainly be different than uboot's. But disabling and
reenabling IOC when DMA might be potentially active is tricky business.
To avoid random memory issues later, just panic here and ask the user to
upgrade the bootloader to one which doesn't enable IOC.
This was actually seen as an issue on some HSDK boards with a version
of uboot which enabled IOC. There were random issues later with starting
X, peripherals, etc.
Also while I'm at it, replace hardcoded bits in ARC_REG_IO_COH_PARTIAL
and ARC_REG_IO_COH_ENABLE registers by definitions.
Inspired by: https://lkml.org/lkml/2018/1/19/557
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 5effc09c4907901f0e71e68e5f2e14211d9a203f upstream.
8-letter strings representing ARC perf events are stored in two
32-bit registers as ASCII characters, for example "IJMP", "IALL", "IJMPTAK" etc.
And the same order of bytes in the word is used regardless of CPU endianness.
Which means in case of a big-endian CPU core we need to swap bytes to get
the same order as on a little-endian CPU.
Otherwise we're seeing the following error message on boot:
------------------------->8----------------------
ARC perf : 8 counters (32 bits), 40 conditions, [overflow IRQ support]
sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
arc_unwind_core+0xd4/0xfc
dump_stack+0x64/0x80
sysfs_warn_dup+0x46/0x58
sysfs_add_file_mode_ns+0xb2/0x168
create_files+0x70/0x2a0
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144 perf_event_sysfs_init+0x70/0xa0
Failed to register pmu: arc_pct, reason -17
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
arc_unwind_core+0xd4/0xfc
dump_stack+0x64/0x80
__warn+0x9c/0xd4
warn_slowpath_fmt+0x22/0x2c
perf_event_sysfs_init+0x70/0xa0
---[ end trace a75fb9a9837bd1ec ]---
------------------------->8----------------------
What happens here is that we're trying to register more than one raw perf event
with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters
and encoded into two 32-bit words. In this particular case we deal with 2
events:
* "IJMP____" which counts all jump & branch instructions
* "IJMPC___" which counts only conditional jumps & branches
Those strings are split in two 32-bit words this way "IJMP" + "____" &
"IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core
being big-endian then we read "PMJI" + "____" & "PMJI" + "___C".
And since we interpret the array of ASCII letters we read as a null-terminated
string, on a big-endian CPU we end up with 2 events of the same name "PMJI".
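The collision described above can be reproduced in plain C. This is only a
sketch: pack4(), swap32(), store_word() and event_name() are hypothetical
helpers written for this illustration, not the functions used by the ARC perf
driver, and unused name characters are assumed to be NUL bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack four ASCII characters into a word, first character in the lowest byte. */
static uint32_t pack4(char c0, char c1, char c2, char c3)
{
	return ((uint32_t)(unsigned char)c0) |
	       ((uint32_t)(unsigned char)c1 << 8) |
	       ((uint32_t)(unsigned char)c2 << 16) |
	       ((uint32_t)(unsigned char)c3 << 24);
}

/* Reverse the byte order of a 32-bit word. */
static uint32_t swap32(uint32_t w)
{
	return (w >> 24) | ((w >> 8) & 0x0000ff00u) |
	       ((w << 8) & 0x00ff0000u) | (w << 24);
}

/* Write a word to memory the way a CPU of the given endianness would. */
static void store_word(uint32_t w, int big_endian, char *buf)
{
	for (int i = 0; i < 4; i++) {
		int shift = big_endian ? 8 * (3 - i) : 8 * i;

		buf[i] = (char)((w >> shift) & 0xffu);
	}
}

/* Build the NUL-terminated event name from the two name words. */
static void event_name(uint32_t lo, uint32_t hi, int big_endian,
		       int swap_fix, char out[9])
{
	assert(out != NULL);
	if (big_endian && swap_fix) {
		/* The fix: undo the byte reversal before storing. */
		lo = swap32(lo);
		hi = swap32(hi);
	}
	store_word(lo, big_endian, out);
	store_word(hi, big_endian, out + 4);
	out[8] = '\0';
}
```

With swap_fix set to 0 on the big-endian path, both "IJMP" and "IJMPC" decode
to the same string, which is what makes sysfs reject the second one as a
duplicate.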
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a8c715b4dd73c26a81a9cc8dc792aa715d8b4bb2 ]
As of today if a userspace process tries to access a kernel virtual address
(0x7000_0000 to 0x7fff_ffff) such that a legit kernel mapping already
exists, that process hangs instead of being killed with SIGSEGV.
Fix that by ensuring that do_page_fault() handles kernel vaddr only if
in kernel mode.
And given this, we can also simplify the code a bit. Now a vmalloc fault
implies kernel mode so its failure (for some reason) can reuse the
@no_context label and we can remove @bad_area_nosemaphore.
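The decision being changed can be sketched as a small classifier. This is an
illustration under stated assumptions, not the code in do_page_fault(): the
address bounds are the ones quoted above, and the enum and function names are
invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bounds taken from the range quoted in the changelog. */
#define EXAMPLE_KVADDR_START 0x70000000u
#define EXAMPLE_KVADDR_END   0x7fffffffu

enum fault_path {
	FAULT_VMALLOC,	/* kernel-mode access to a kernel vaddr */
	FAULT_SIGSEGV,	/* user-mode access to a kernel vaddr */
	FAULT_NORMAL	/* everything else: regular vma lookup */
};

/* Pick the handling path for a faulting address (hypothetical helper). */
static enum fault_path classify_fault(uint32_t addr, bool user_mode)
{
	bool kernel_vaddr = addr >= EXAMPLE_KVADDR_START &&
			    addr <= EXAMPLE_KVADDR_END;

	if (kernel_vaddr && !user_mode)
		return FAULT_VMALLOC;
	if (kernel_vaddr)
		return FAULT_SIGSEGV;
	return FAULT_NORMAL;
}
```

The key point is the added user_mode test: before the fix, a user access to
such an address could be treated like a kernel vmalloc fault.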
User-space reproducer for the original problem:
------------------------>8-----------------
#include <stdlib.h>
#include <stdint.h>

int main(int argc, char *argv[])
{
	volatile uint32_t temp;

	/* Read from a kernel virtual address; should end in SIGSEGV */
	temp = *(volatile uint32_t *)(0x70000000);
	(void)temp;
	return 0;
}
------------------------>8-----------------
Cc: <stable@vger.kernel.org>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4d447455e73b47c43dd35fcc38ed823d3182a474 ]
do_page_fault() forgot to relinquish mmap_sem if a signal came while
handling handle_mm_fault() - due to, say, a Ctrl+C or OOM etc.
This would later cause a deadlock by acquiring it twice.
This came to light when running libc testsuite tst-tls3-malloc test but
is likely also the cause for prior seen LTP failures. Using lockdep
clearly showed what the issue was.
| # while true; do ./tst-tls3-malloc ; done
| Didn't expect signal from child: got `Segmentation fault'
| ^C
| ============================================
| WARNING: possible recursive locking detected
| 4.17.0+ #25 Not tainted
| --------------------------------------------
| tst-tls3-malloc/510 is trying to acquire lock:
| 606c7728 (&mm->mmap_sem){++++}, at: __might_fault+0x28/0x5c
|
|but task is already holding lock:
|606c7728 (&mm->mmap_sem){++++}, at: do_page_fault+0x9c/0x2a0
|
| other info that might help us debug this:
| Possible unsafe locking scenario:
|
| CPU0
| ----
| lock(&mm->mmap_sem);
| lock(&mm->mmap_sem);
|
| *** DEADLOCK ***
|
------------------------------------------------------------
What the change does is not obvious (note to myself)
prior code was
| do_page_fault
|
| down_read() <-- lock taken
| handle_mm_fault <-- signal pending as this runs
| if fatal_signal_pending
| if VM_FAULT_ERROR
| up_read
| if user_mode
| return <-- lock still held, this was the BUG
New code
| do_page_fault
|
| down_read() <-- lock taken
| handle_mm_fault <-- signal pending as this runs
| if fatal_signal_pending
| if VM_FAULT_RETRY
| return <-- not same case as above, but still OK since
| core mm already relinq lock for FAULT_RETRY
| ...
|
| < Now falls through for bug case above >
|
| up_read() <-- lock relinquished
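The two flows can be modelled with a toy lock-depth counter. This sketch only
covers the "fatal signal pending, user mode" path from the diagrams above;
struct fake_mm and all fake_*() and fault_*() names are hypothetical and stand
in for mmap_sem and the real fault handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for mm->mmap_sem: counts how many read locks are held. */
struct fake_mm {
	int read_locks;
};

static void fake_down_read(struct fake_mm *mm)
{
	mm->read_locks++;
}

static void fake_up_read(struct fake_mm *mm)
{
	assert(mm->read_locks > 0);
	mm->read_locks--;
}

/* Core mm drops the lock itself when it asks for a retry. */
static void fake_handle_mm_fault(struct fake_mm *mm, bool fault_retry)
{
	if (fault_retry)
		fake_up_read(mm);
}

/* Prior flow; returns the lock depth left behind on return. */
static int fault_old_signal(bool fault_error, bool fault_retry)
{
	struct fake_mm mm = { 0 };

	fake_down_read(&mm);
	fake_handle_mm_fault(&mm, fault_retry);
	if (fault_error && !fault_retry)
		fake_up_read(&mm);
	return mm.read_locks;	/* early return for user mode */
}

/* New flow; returns the lock depth left behind on return. */
static int fault_new_signal(bool fault_retry)
{
	struct fake_mm mm = { 0 };

	fake_down_read(&mm);
	fake_handle_mm_fault(&mm, fault_retry);
	if (fault_retry)
		return mm.read_locks;	/* core mm already unlocked */
	fake_up_read(&mm);		/* falls through to the unlock */
	return mm.read_locks;
}
```

In this model the old flow leaves a depth of 1 when the fault is neither an
error nor a retry, which corresponds to the leaked lock; the new flow always
ends at 0.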
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f731a8e89f8c78985707c626680f3e24c7a60772 ]
signal handling core calls show_regs() with preemption disabled which
on ARC takes mmap_sem for mm/vma access, causing lockdep splat.
| [ARCLinux]# ./segv-null-ptr
| potentially unexpected fatal signal 11.
| BUG: sleeping function called from invalid context at kernel/fork.c:1011
| in_atomic(): 1, irqs_disabled(): 0, pid: 70, name: segv-null-ptr
| no locks held by segv-null-ptr/70.
| CPU: 0 PID: 70 Comm: segv-null-ptr Not tainted 4.18.0+ #69
|
| Stack Trace:
| arc_unwind_core+0xcc/0x100
| ___might_sleep+0x17a/0x190
| mmput+0x16/0xb8
| show_regs+0x52/0x310
| get_signal+0x5ee/0x610
| do_signal+0x2c/0x218
| resume_user_mode_begin+0x90/0xd8
Workaround by re-enabling preemption temporarily.
Note that the preemption disabling in core code around show_regs()
was introduced by commit 3a9f84d354 ("signals, debug: fix BUG: using
smp_processor_id() in preemptible code in print_fatal_signal()")
to silence a different lockdep splat seen on x86 back in 2009.
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 493a2f812446e92bcb1e69a77381b4d39808d730 upstream.
After reworking U-boot args handling code and adding paranoid
arguments check we can eliminate CONFIG_ARC_UBOOT_SUPPORT and
enable uboot support unconditionally.
For the JTAG case we can assume that core registers will come up with the
reset value of 0, or in the worst case we rely on the user passing
'-on=clear_regs' to the Metaware debugger.
Cc: stable@vger.kernel.org
Tested-by: Corentin LABBE <clabbe@baylibre.com>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd5de2721ea7d16e2b16c4049ac49f229551b290 upstream.
As kernelci.org reports, this function is not used in
vdk_hs38_defconfig:
arch/arc/kernel/unwind.c:188:14: warning: 'unw_hdr_alloc' defined but not used [-Wunused-function]
Fixes: bc79c9a721 ("ARC: dw2 unwind: Reinstante unwinding out of modules")
Link: https://kernelci.org/build/id/5d1cae3f59b514300340c132/logs/
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ecc906a11c2a0940e1a380debd8bd5bc09faf454 ]
GMAC controller on HSDK boards supports 256 Hash Table size so we need to
add the multicast filter bins property. This allows for the Hash filter
to work properly using stmmac driver.
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0728aeb7ead99a9b0dac2f3c92b3752b4e02ff97 ]
We now have an HSDK device in our kernelci lab, but a kernel built via
the hsdk_defconfig lacks ramfs support, so it cannot boot kernelci jobs
yet.
So this patch enables CONFIG_BLK_DEV_RAM in hsdk_defconfig.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit edb64bca50cd736c6894cc6081d5263c007ce005 ]
In case of devboards we really often disable bootloader and load
Linux image in memory via JTAG. Even if kernel tries to verify
uboot_tag and uboot_arg there is still a chance that we treat some
garbage in registers as valid u-boot arguments in JTAG case.
E.g. it is enough to have '1' in r0 to treat any value in r2 as
a boot command line.
So check that magic number passed from u-boot is correct and drop
u-boot arguments otherwise. That helps to reduce the possibility
of using garbage as u-boot arguments in JTAG case.
We can safely check the U-boot magic value (0x0) passed to Linux via
the r1 register, as U-boot has passed it from the beginning. So there are no
backward-compatibility issues.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7b2e932f633bcb7b190fc7031ce6dac75f8c3472 ]
The first release of core4 (0x54) was dual issue only (HS4x).
Newer releases allow hardware to be configured as single issue (HS3x)
or dual issue.
Prevent accessing an HS4x-only aux register in HS3x, which otherwise
leads to illegal instruction exceptions.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e494239a007e601448110ac304fe055951f9de3b ]
There's a hardware bug which affects the HSDK platform, triggered by
micro-ops for auto-saving regfile on taken interrupt. The workaround is
to inhibit autosave.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d5e3c55e01d8b1774b37b4647c30fb22f1d39077 ]
Newer ARC gcc handles lp_start, lp_end in a different way and doesn't
like them in the clobber list.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f8a15f97664178f27dfbf86a38f780a532cb6df0 ]
ARCv2 optimized memcpy uses PREFETCHW instruction for prefetching the
next cache line but doesn't ensure that the line is not past the end of
the buffer. PREFETCHW changes the line ownership and marks it dirty,
which can cause data corruption if this area is used for DMA IO.
Fix the issue by avoiding the PREFETCHW. This leads to performance
degradation but it is OK as we'll introduce a new memcpy implementation
optimized for unaligned memory access.
We also cut off all PREFETCH instructions as they are quite useless
here:
* we call PREFETCH right before LOAD instruction call.
* we copy 16 or 32 bytes of data (depending on CONFIG_ARC_HAS_LL64)
in a main logical loop. so we call PREFETCH 4 times (or 2 times)
for each L1 cache line (in case of 64B L1 cache Line which is
default case). Obviously this is not optimal.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ab6c03676cb190156603cf4c5ecf97aa406c9c53 ]
and use smaller/on-stack buffer instead
The motivation for this change was lockdep splat like below.
| potentially unexpected fatal signal 11.
| BUG: sleeping function called from invalid context at ../mm/page_alloc.c:4317
| in_atomic(): 1, irqs_disabled(): 0, pid: 57, name: segv
| no locks held by segv/57.
| Preemption disabled at:
| [<8182f17e>] get_signal+0x4a6/0x7c4
| CPU: 0 PID: 57 Comm: segv Not tainted 4.17.0+ #23
|
| Stack Trace:
| arc_unwind_core.constprop.1+0xd0/0xf4
| __might_sleep+0x1f6/0x234
| __get_free_pages+0x174/0xca0
| show_regs+0x22/0x330
| get_signal+0x4ac/0x7c4 # print_fatal_signals() -> preempt_disable()
| do_signal+0x30/0x224
| resume_user_mode_begin+0x90/0xd8
So signal handling core calls show_regs() with preemption disabled but
an ensuing GFP_KERNEL page allocator call is flagged by lockdep.
We could have switched to GFP_NOWAIT, but it turns out that is not enough
anyway, and eliding the page allocator call leads to less code and fewer
instruction traces to sift through when debugging pesky crashes.
FWIW, this patch doesn't cure the lockdep splat (which next patch does).
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4e868f8419cb4cb558c5d428e7ab5629cef864c7 ]
| CC mm/nobootmem.o
|In file included from ./include/asm-generic/bug.h:18:0,
| from ./arch/arc/include/asm/bug.h:32,
| from ./include/linux/bug.h:5,
| from ./include/linux/mmdebug.h:5,
| from ./include/linux/gfp.h:5,
| from ./include/linux/slab.h:15,
| from mm/nobootmem.c:14:
|mm/nobootmem.c: In function '__free_pages_memory':
|./include/linux/kernel.h:845:29: warning: comparison of distinct pointer types lacks a cast
| (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
| ^
|./include/linux/kernel.h:859:4: note: in expansion of macro '__typecheck'
| (__typecheck(x, y) && __no_side_effects(x, y))
| ^~~~~~~~~~~
|./include/linux/kernel.h:869:24: note: in expansion of macro '__safe_cmp'
| __builtin_choose_expr(__safe_cmp(x, y), \
| ^~~~~~~~~~
|./include/linux/kernel.h:878:19: note: in expansion of macro '__careful_cmp'
| #define min(x, y) __careful_cmp(x, y, <)
| ^~~~~~~~~~~~~
|mm/nobootmem.c:104:11: note: in expansion of macro 'min'
| order = min(MAX_ORDER - 1UL, __ffs(start));
Change __ffs return value from 'int' to 'unsigned long' as it
is done in other implementations (like asm-generic, x86, etc...)
to avoid build-time warnings in places where type is strictly
checked.
As __ffs may return values in [0-31] interval changing return
type to unsigned is valid.
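A portable sketch of the idea follows. ffs_index() and min_ul() are
hypothetical stand-ins for __ffs() and the type-checked min() macro, and
EXAMPLE_MAX_ORDER is an assumed value used only for illustration; the point is
that both operands of the comparison now share the type 'unsigned long'.

```c
#include <assert.h>

#define EXAMPLE_MAX_ORDER 11UL	/* assumed value, for illustration only */

/* Index of the least significant set bit; word must be non-zero. */
static unsigned long ffs_index(unsigned long word)
{
	unsigned long bit = 0;

	assert(word != 0);
	while (!(word & 1UL)) {
		word >>= 1;
		bit++;
	}
	return bit;
}

/* Both arguments have the same type, so no cast is needed. */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Mirrors the shape of: order = min(MAX_ORDER - 1UL, __ffs(start)); */
static unsigned long example_order(unsigned long start)
{
	return min_ul(EXAMPLE_MAX_ORDER - 1UL, ffs_index(start));
}
```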
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit b6835ea77729e7faf4656ca637ba53f42b8ee3fd upstream.
The default value of ARCH_SLAB_MINALIGN in "include/linux/slab.h" is
"__alignof__(unsigned long long)" which for ARC unexpectedly turns out
to be 4. This is not a compiler bug, but as defined by ARC ABI [1].
Thus slab allocator would allocate a struct which is 32-bit aligned,
which is generally OK even if struct has long long members.
There was however a potential problem when it had any atomic64_t, which
uses LLOCKD/SCONDD instructions which are required by the ISA to take
64-bit aligned addresses. This is the problem we ran into:
[ 4.015732] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
[ 4.167881] Misaligned Access
[ 4.172356] Path: /bin/busybox.nosuid
[ 4.176004] CPU: 2 PID: 171 Comm: rm Not tainted 4.19.14-yocto-standard #1
[ 4.182851]
[ 4.182851] [ECR ]: 0x000d0000 => Check Programmer's Manual
[ 4.190061] [EFA ]: 0xbeaec3fc
[ 4.190061] [BLINK ]: ext4_delete_entry+0x210/0x234
[ 4.190061] [ERET ]: ext4_delete_entry+0x13e/0x234
[ 4.202985] [STAT32]: 0x80080002 : IE K
[ 4.207236] BTA: 0x9009329c SP: 0xbe5b1ec4 FP: 0x00000000
[ 4.212790] LPS: 0x9074b118 LPE: 0x9074b120 LPC: 0x00000000
[ 4.218348] r00: 0x00000040 r01: 0x00000021 r02: 0x00000001
...
...
[ 4.270510] Stack Trace:
[ 4.274510] ext4_delete_entry+0x13e/0x234
[ 4.278695] ext4_rmdir+0xe0/0x238
[ 4.282187] vfs_rmdir+0x50/0xf0
[ 4.285492] do_rmdir+0x9e/0x154
[ 4.288802] EV_Trap+0x110/0x114
The fix is to make sure slab allocations are 64-bit aligned.
Do note that atomic64_t is __attribute__((aligned(8))) which means gcc
does generate 64-bit aligned references, relative to beginning of
container struct. However the issue is if the container itself is not
64-bit aligned, atomic64_t ends up unaligned, which is what this patch
prevents.
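The address arithmetic behind this can be checked with a small sketch. The
struct below is a hypothetical container with a uint64_t standing in for
atomic64_t; no misaligned access is performed, only addresses are computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container; 'counter' stands in for an atomic64_t member. */
struct example_container {
	uint32_t flags;
	_Alignas(8) uint64_t counter;
};

/* Would 'counter' be 64-bit aligned if the struct were placed at 'base'? */
static int member_is_aligned(uintptr_t base)
{
	uintptr_t addr = base + offsetof(struct example_container, counter);

	return (addr % 8) == 0;
}

/* Round an address up to a power-of-two alignment. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

The member offset is a multiple of 8, so its alignment is decided entirely by
the base address handed out by the allocator. For reference, the EFA value in
the log above (0xbeaec3fc) is 4-byte aligned but not 8-byte aligned.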
[1] https://github.com/foss-for-synopsys-dwc-arc-processors/toolchain/wiki/files/ARCv2_ABI.pdf
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org> # 4.8+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: reworked changelog, added dependency on LL64+LLSC]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a66f2e57bd566240d8b3884eedf503928fbbe557 upstream.
Handle U-boot arguments paranoidly:
* don't allow to pass unknown tag.
* try to use external device tree blob only if corresponding tag
(TAG_DTB) is set.
* don't check uboot_tag if kernel is built without ARC_UBOOT_SUPPORT.
NOTE:
If U-boot args are invalid we skip them and try to use embedded device
tree blob. We can't panic on invalid U-boot args as we really do get passed
invalid args due to a bug in U-boot code.
This happens if we don't provide an external DTB to U-boot and
don't set the 'bootargs' U-boot environment variable (which is the default
case at least for the HSDK board). In that case we will pass
{r0 = 1 (bootargs in r2); r1 = 0; r2 = 0;} to linux which is invalid.
While I'm at it, refactor U-boot arguments handling code.
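The rules listed above could be sketched roughly as follows. This is a
simplified model, not the kernel's implementation: the enum names and the
value chosen for the DTB tag are assumptions (only "r0 = 1 means bootargs in
r2" comes from the text), and a zero argument is used as the only validity
test.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative tag values; only EX_TAG_CMDLINE = 1 is given in the text. */
enum example_tag {
	EX_TAG_NONE = 0,
	EX_TAG_CMDLINE = 1,
	EX_TAG_DTB = 2		/* assumed value */
};

enum boot_source {
	USE_EMBEDDED_DTB,		/* args skipped or absent */
	USE_EMBEDDED_DTB_AND_CMDLINE,	/* valid bootargs pointer */
	USE_EXTERNAL_DTB		/* valid external DTB pointer */
};

/* Decide what to boot with from the tag (r0) and argument (r2). */
static enum boot_source pick_boot_source(uint32_t tag, uint32_t arg)
{
	switch (tag) {
	case EX_TAG_DTB:
		return arg ? USE_EXTERNAL_DTB : USE_EMBEDDED_DTB;
	case EX_TAG_CMDLINE:
		return arg ? USE_EMBEDDED_DTB_AND_CMDLINE : USE_EMBEDDED_DTB;
	case EX_TAG_NONE:
	default:
		/* Unknown tags are rejected instead of being trusted. */
		return USE_EMBEDDED_DTB;
	}
}
```

Note that the invalid combination from the NOTE, tag 1 with a zero argument,
falls back to the embedded device tree rather than causing a panic.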
Cc: stable@vger.kernel.org
Tested-by: Corentin LABBE <clabbe@baylibre.com>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 252f6e8eae909bc075a1b1e3b9efb095ae4c0b56 upstream.
It is currently done in arc_init_IRQ() which might be too late
considering gcc 7.3.1 onwards (GNU 2018.03) generates unaligned
memory accesses by default.
Cc: stable@vger.kernel.org #4.4+
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote changelog]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3affbf0e154ee351add6fcc254c59c3f3947fa8f upstream.
So far we've mapped branches to "ijmp" which also counts conditional
branches NOT taken. This makes us different from other architectures
such as ARM which seem to be counting only taken branches.
So use the "ijmptak" hardware condition, which only counts jump
instructions that are taken.
'ijmptak' event is available on both ARCompact and ARCv2 ISA based
cores.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: reworked changelog]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3010a0465383300f909f62b8a83f83ffa7b2517 upstream.
In setup_arch_memory we reserve the memory area wherein the kernel
is located. The current implementation may reserve more memory than
is actually required when CONFIG_LINUX_LINK_BASE is not
equal to CONFIG_LINUX_RAM_BASE. This happens because we calculate the
start of the reserved region relative to CONFIG_LINUX_RAM_BASE
and the end of the region relative to CONFIG_LINUX_LINK_BASE.
For example in case of HSDK board we wasted 256MiB of physical memory:
------------------->8------------------------------
Memory: 770416K/1048576K available (5496K kernel code,
240K rwdata, 1064K rodata, 2200K init, 275K bss,
278160K reserved, 0K cma-reserved)
------------------->8------------------------------
Fix that.
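The over-reservation is simple arithmetic, sketched below. The two base
addresses are assumptions picked so that they differ by the 256MiB mentioned
above; reserved_bytes() and wasted_bytes() are hypothetical helpers, not
kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed example addresses, chosen to be 256MiB apart. */
#define EXAMPLE_RAM_BASE  0x80000000ull
#define EXAMPLE_LINK_BASE 0x90000000ull

/* Size of a reserved region that runs from region_start to kernel_end. */
static uint64_t reserved_bytes(uint64_t region_start, uint64_t kernel_end)
{
	assert(kernel_end >= region_start);
	return kernel_end - region_start;
}

/* Extra memory reserved when the region starts at the RAM base. */
static uint64_t wasted_bytes(uint64_t kernel_end)
{
	return reserved_bytes(EXAMPLE_RAM_BASE, kernel_end) -
	       reserved_bytes(EXAMPLE_LINK_BASE, kernel_end);
}
```

Whatever the kernel image size is, the waste equals the gap between the two
base addresses, since the end of the region is the same in both cases.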
Fixes: 9ed68785f7 ("ARC: mm: Decouple RAM base address from kernel link addr")
Cc: stable@vger.kernel.org #4.14+
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6a72b7daeeb521753803550f0ed711152bb2555 upstream.
ARCv2 optimized memset uses PREFETCHW instruction for prefetching the
next cache line but doesn't ensure that the line is not past the end of
the buffer. PREFETCHW changes the line ownership and marks it dirty,
which can cause issues in SMP config when next line was already owned by
other core. Fix the issue by avoiding the PREFETCHW.
Some more details:
The current code has 3 logical loops (ignoring the unaligned part)
(a) Big loop for doing aligned 64 bytes per iteration with PREALLOC
(b) Loop for 32 x 2 bytes with PREFETCHW
(c) any left over bytes
loop (a) was already eliding the last 64 bytes, so PREALLOC was
safe. The fix was removing PREFETCHW from (b).
Another potential issue (applicable to configs with 32 or 128 byte L1
cache line) is that PREALLOC assumes 64 byte cache line and may not do
the right thing, especially for 32 bytes. While it would be easy to adapt,
there are no known configs with those line sizes, so for now, just
compile out PREALLOC in such cases.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Cc: stable@vger.kernel.org #4.4+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote changelog, used asm .macro vs. "C" macro]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf287607c80f24387fedb431a346dc67f25be12c upstream.
It turned out we used to use default implementation of sched_clock()
from kernel/sched/clock.c which was as precise as 1/HZ, i.e.
by default we had 10 msec granularity of time measurement.
Now given ARC built-in timers are clocked with the same frequency as
CPU cores we may get much higher precision of time tracking.
Thus we switch to generic sched_clock which really reads ARC hardware
counters.
This is especially helpful for measuring short events.
That's what we used to have:
------------------------------>8------------------------
$ perf stat /bin/sh -c /root/lmbench-master/bin/arc/hello > /dev/null
Performance counter stats for '/bin/sh -c /root/lmbench-master/bin/arc/hello':
10.000000 task-clock (msec) # 2.832 CPUs utilized
1 context-switches # 0.100 K/sec
1 cpu-migrations # 0.100 K/sec
63 page-faults # 0.006 M/sec
3049480 cycles # 0.305 GHz
1091259 instructions # 0.36 insn per cycle
256828 branches # 25.683 M/sec
27026 branch-misses # 10.52% of all branches
0.003530687 seconds time elapsed
0.000000000 seconds user
0.010000000 seconds sys
------------------------------>8------------------------
And now we'll see:
------------------------------>8------------------------
$ perf stat /bin/sh -c /root/lmbench-master/bin/arc/hello > /dev/null
Performance counter stats for '/bin/sh -c /root/lmbench-master/bin/arc/hello':
3.004322 task-clock (msec) # 0.865 CPUs utilized
1 context-switches # 0.333 K/sec
1 cpu-migrations # 0.333 K/sec
63 page-faults # 0.021 M/sec
2986734 cycles # 0.994 GHz
1087466 instructions # 0.36 insn per cycle
255209 branches # 84.947 M/sec
26002 branch-misses # 10.19% of all branches
0.003474829 seconds time elapsed
0.003519000 seconds user
0.000000000 seconds sys
------------------------------>8------------------------
Note how much more meaningful the second output is: the time spent on
execution pretty much matches the number of cycles spent (we're running
@ 1GHz here).
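The numbers above can be cross-checked with a short calculation. The helpers
below are hypothetical and only do unit conversion; HZ = 100 is inferred from
the 10 msec granularity mentioned earlier.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC_EXAMPLE 1000000000ull

/*
 * Convert a cycle count to nanoseconds for a given clock rate.
 * Overflows for very large cycle counts; fine for a sketch.
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t clock_hz)
{
	assert(clock_hz > 0);
	return cycles * NSEC_PER_SEC_EXAMPLE / clock_hz;
}

/* Granularity of a clock that only advances once per scheduler tick. */
static uint64_t tick_ns(unsigned int ticks_per_sec)
{
	assert(ticks_per_sec > 0);
	return NSEC_PER_SEC_EXAMPLE / ticks_per_sec;
}
```

At 1GHz the 2986734 cycles of the second run correspond to about 2.99 msec,
close to the reported 3.004322 msec task-clock, while a tick-based clock at
HZ = 100 cannot resolve anything finer than 10 msec.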
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 10d443431dc2bb733cf7add99b453e3fb9047a2e ]
Some ARC CPUs do not support unaligned loads/stores. Currently, the generic
implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC.
This can lead to malfunction of some drivers, as the generic functions do a
plain dereference of a pointer that can be unaligned.
Let's use {get/put}_unaligned() helpers instead of plain dereference of
pointer in order to fix this. The helpers allow getting and storing data
from an unaligned address whilst preserving the CPU internal alignment.
According to [1], the use of these helpers is costly in terms of
performance so we added an initial check for a buffer already aligned so
that the usage of the helpers can be avoided, when possible.
[1] Documentation/unaligned-memory-access.txt
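A user-space sketch of the approach is shown below. It is not the ARC
implementation: put_unaligned_u32() is a hypothetical stand-in for the
kernel's put_unaligned() built on memcpy(), and example_reads32() models a
readsl()-style loop that repeatedly reads one 32-bit register into a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value at a possibly unaligned address. */
static void put_unaligned_u32(uint32_t val, void *dst)
{
	memcpy(dst, &val, sizeof(val));
}

/* Read 'count' words from one register into 'buf'. */
static void example_reads32(const volatile uint32_t *reg, void *buf,
			    size_t count)
{
	assert(reg != NULL && buf != NULL);

	if (((uintptr_t)buf & (sizeof(uint32_t) - 1)) == 0) {
		/* Fast path: buffer is aligned, plain stores are safe. */
		uint32_t *p = buf;

		while (count--)
			*p++ = *reg;
	} else {
		/* Slow path: buffer is unaligned, use the helper. */
		unsigned char *p = buf;

		while (count--) {
			put_unaligned_u32(*reg, p);
			p += sizeof(uint32_t);
		}
	}
}
```

The alignment test is done once per call rather than once per word, so the
common aligned case pays almost nothing for the extra safety.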
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Tested-by: Vitor Soares <soares@synopsys.com>
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 6b04114f6fae5e84d33404c2970b1949c032546e upstream.
By default NFSv3 doesn't support ACL (Access Control Lists)
which might be quite convenient to have so that
mounted NFS behaves exactly as any other local file-system.
In particular missing support of ACL makes umask useless.
This among other things fixes Glibc's "nptl/tst-umask1".
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Cupertino Miranda <cmiranda@synopsys.com>
Cc: stable@vger.kernel.org #4.14+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b7cc40c32a8bfa6f2581a71747f6a7d491fe43ba upstream.
Change the default defconfig (used with 'make defconfig') to the ARCv2
nsim_hs_defconfig, and also switch the default Kconfig ISA selection to
ARCv2.
This allows several default defconfigs (e.g. make defconfig, make
allnoconfig, make tinyconfig) to all work with ARCv2 by default.
Note since we change default architecture from ARCompact to ARCv2
it's required to explicitly mention architecture type in ARCompact
defconfigs otherwise ARCv2 will be implied and binaries will be
generated for ARCv2.
Cc: <stable@vger.kernel.org> # 4.4.x
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Per ARC TLS ABI, r25 is designated TP (thread pointer register).
However so far kernel didn't do any special treatment, like setting up
usermode r25, even for CLONE_SETTLS. We instead relied on libc runtime
to do this, in say clone libc wrapper [1]. This was deliberate to keep
kernel ABI agnostic (userspace could potentially change TP, especially
for different ARC ISA say ARCompact vs. ARCv2 with different spare
registers etc).
However userspace setting up r25 after the clone syscall opens a race if
the child is not scheduled and gets a signal instead. It starts off in
userspace not in clone but in a signal handler, and anything TP-specific
there such as pthread_self() fails, which showed up with uClibc
testsuite nptl/tst-kill6 [2].
Fix this by having kernel populate r25 to TP value. So this locks in
ABI, but it was not going to change anyways, and fwiw is same for both
ARCompact (arc700 core) and ARCv2 (HS3x cores).
[1] https://cgit.uclibc-ng.org/cgi/cgit/uclibc-ng.git/tree/libc/sysdeps/linux/arc/clone.S
[2] https://github.com/wbx-github/uclibc-ng-test/blob/master/test/nptl/tst-kill6.c
Fixes: ARC STAR 9001378481
Cc: stable@vger.kernel.org
Reported-by: Nikita Sobolev <sobolev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
There's not much sense in doing that because if user or
his build-system didn't set CROSS_COMPILE we still may
very well make incorrect guess.
But as it turned out setting CROSS_COMPILE is not as harmless
as one may think: with recent changes that implemented automatic
discovery of __host__ gcc features unconditional setup of
CROSS_COMPILE leads to failures on execution of "make xxx_defconfig"
with absent cross-compiler, for more info see [1].
Setting CROSS_COMPILE also gets in the way if we want only to build
.dtb's (again with absent cross-compiler which is not really needed
for building .dtb's), see [2].
Note, we had to change LIBGCC assignment type from ":=" to "="
so that it is resolved on its usage, otherwise if it is resolved
at declaration time with missing CROSS_COMPILE we're getting this
error message from host GCC:
| gcc: error: unrecognized command line option -mmedium-calls
| gcc: error: unrecognized command line option -mno-sdata
[1] http://lists.infradead.org/pipermail/linux-snps-arc/2018-September/004308.html
[2] http://lists.infradead.org/pipermail/linux-snps-arc/2018-September/004320.html
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This check is very naive: we simply test if GCC invoked without
"-mcpu=XXX" has ARC700 define set. In that case we think that GCC
was built with "--with-cpu=arc700" and has libgcc built for ARC700.
Otherwise if ARC700 is not defined we think that everything was built
for ARCv2.
But in reality our life is much more interesting.
1. Regardless of GCC configuration (i.e. what we pass in "--with-cpu")
it may generate code for any ARC core.
2. libgcc might be built with explicitly specified "-mcpu=YYY".
That's exactly what happens in case of multilibbed toolchains:
- GCC is configured with default settings
- All the libs built for many different CPU flavors
I.e. that check gets in the way of usage of multilibbed
toolchains. And even non-multilibbed toolchains are affected.
OpenEmbedded also builds GCC without "--with-cpu" because
each and every target component later is compiled with explicitly
set "-mcpu=ZZZ".
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
__GFP_HIGHMEM flag is cleared by upper layer functions
(in include/linux/dma-mapping.h) so we'll never get a
__GFP_HIGHMEM flag in arch_dma_alloc gfp argument.
That's why alloc_pages will never return highmem page
here.
Get rid of highmem pages handling and cleanup arch_dma_alloc
and arch_dma_free functions.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
So far the IOC treatment was global on ARC, being turned on (or off)
for all devices in the system. With this patch, this can now be done
per device using the "dma-coherent" DT property; IOW with this patch
we can use both HW-coherent and regular DMA peripherals simultaneously.
The changes involved are too many, so a summary is listed below:
1. common code calls ARC arch_setup_dma_ops() per device.
2. For coherent dma (IOC) it plugs in generic @dma_direct_ops which
doesn't need any arch specific backend: No need for any explicit
cache flushes or MMU mappings to provide for uncached access
- dma_(map|sync)_single* return early as corresponding dma ops callbacks
are NULL in generic code.
So arch_sync_dma_*() -> dma_cache_*() need not handle the coherent
dma case, hence drop ARC __dma_cache_*_ioc() which were no-op anyways
3. For noncoherent dma (non IOC) generic @dma_noncoherent_ops is used
which in turn calls ARC specific routines
- arch_dma_alloc() no longer checks for @ioc_enable since this is
called only for !IOC case.
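The per-device selection summarized above can be sketched roughly as
below. This is a simplified model and not the actual ARC implementation:
every type and function name is a hypothetical stand-in, and the exact
condition used to pick the coherent path is an assumption made for the
example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
struct sketch_dma_ops {
	const char *name;
	/* NULL means "nothing to do": the caller returns early. */
	void (*sync_for_device)(void *addr, size_t size);
};

struct sketch_device {
	const struct sketch_dma_ops *dma_ops;
};

/* Placeholder for an arch-specific cache writeback routine. */
static void sketch_cache_wback(void *addr, size_t size)
{
	(void)addr;
	(void)size;
}

/* Coherent (IOC) case: no cache maintenance callbacks at all. */
static const struct sketch_dma_ops sketch_direct_ops = {
	.name = "direct",
	.sync_for_device = NULL,
};

/* Non-coherent case: callbacks end up in arch cache routines. */
static const struct sketch_dma_ops sketch_noncoherent_ops = {
	.name = "noncoherent",
	.sync_for_device = sketch_cache_wback,
};

/* Called once per device; dt_coherent models the "dma-coherent" property. */
static inline void sketch_setup_dma_ops(struct sketch_device *dev,
					bool ioc_enabled, bool dt_coherent)
{
	dev->dma_ops = (ioc_enabled && dt_coherent) ?
		&sketch_direct_ops : &sketch_noncoherent_ops;
}

/* Generic-layer helper: returns early when the callback is NULL. */
static inline void sketch_sync_for_device(const struct sketch_device *dev,
					  void *addr, size_t size)
{
	if (!dev->dma_ops->sync_for_device)
		return;
	dev->dma_ops->sync_for_device(addr, size);
}
```

With this split the cache routines never need to test for the coherent
case themselves, since they are only reachable through the non-coherent
ops table.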
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote changelog]
Mark DMA devices on AXS103 and HSDK boards connected through IOC
port as dma-coherent.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
In 4.19-rc1, Eugeniy reported weird boot and IO errors on ARC HSDK
| INFO: task syslogd:77 blocked for more than 10 seconds.
| Not tainted 4.19.0-rc1-00007-gf213acea4e88 #40
| "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
| message.
| syslogd D 0 77 76 0x00000000
|
| Stack Trace:
| __switch_to+0x0/0xac
| __schedule+0x1b2/0x730
| io_schedule+0x5c/0xc0
| __lock_page+0x98/0xdc
| find_lock_entry+0x38/0x100
| shmem_getpage_gfp.isra.3+0x82/0xbfc
| shmem_fault+0x46/0x138
| handle_mm_fault+0x5bc/0x924
| do_page_fault+0x100/0x2b8
| ret_from_exception+0x0/0x8
He bisected to 84c6591103 ("locking/atomics,
asm-generic/bitops/lock.h: Rewrite using atomic_fetch_*()")
This commit however only unmasked the real issue introduced by commit
4aef66c8ae ("locking/atomic, arch/arc: Fix build") which missed the
retry-if-scond-failed branch in atomic_fetch_##op() macros.
The bisected commit started using atomic_fetch_##op() macros for building
the rest of atomics.
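For readers unfamiliar with the pattern, the missing retry can be
illustrated with a portable analogue. The sketch below is not the ARC
llock/scond assembly; it uses C11 atomic_compare_exchange_weak(), which
may fail spuriously much like a store-conditional, and the function name
is hypothetical.

```c
#include <stdatomic.h>

/*
 * Hypothetical, portable stand-in for an LL/SC based fetch-or.
 * The weak compare-exchange may fail even when the value is unchanged,
 * so the update must be retried until it succeeds. Dropping the loop
 * silently loses updates, which is the class of bug described above.
 */
static inline int sketch_atomic_fetch_or(atomic_int *v, int mask)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	/* On failure 'old' is refreshed with the current value; retry. */
	while (!atomic_compare_exchange_weak(v, &old, old | mask))
		;

	return old;	/* value observed just before the update */
}
```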
Fixes: 4aef66c8ae ("locking/atomic, arch/arc: Fix build")
Reported-by: Eugeniy Paltsev <paltsev@synopsys.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: wrote changelog]
Commit cafa0010cd ("Raise the minimum required gcc version to 4.6")
bumped the minimum GCC version to 4.6 for all architectures.
With GCC >= 4.6 assumed, 'upto_gcc44' is empty, 'atleast_gcc44' is y.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
SWAP support on ARC was fixed earlier by
commit 6e3761145a ("ARC: Fix CONFIG_SWAP")
so now we may safely enable it on platforms that
have external media like USB and SD-card.
Note: it was already allowed for HSDK
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org # 6e3761145a: ARC: Fix CONFIG_SWAP
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Otherwise the kernel uses a random MAC, which is not very convenient.
With that change in place the user might set the desired MAC in U-Boot
with "setenv ethaddr 11:22:33:44:55:66", save environment and
then from boot to boot the same MAC will be used by the kernel.
Note that for this to work the board's .dtb must be passed
in U-Boot's "bootm" command like this:
------------------->8-----------------
bootm 0x82000000 - 0x84000000
------------------->8-----------------
Here 0x82000000 is location of uImage while
0x84000000 is location of either axs10x.dtb or hsdk.dtb
previously loaded from SD-card, USB storage or TFTP server.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: stable@vger.kernel.org # 4.14
Cc: devicetree@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
- Remove CONFIG_DEFAULT_HOSTNAME from defconfigs
There's no reason to set the same hostname to all ARC boards
by default. It usually gets overwritten by init scripts anyways.
- Remove disabled CONFIG_DEVKMEM from defconfigs
It is disabled by default
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Commit a0f97e06a4 ("kbuild: enable 'make CFLAGS=...' to add
additional options to CC") renamed CFLAGS to KBUILD_CFLAGS.
Commit 222d394d30 ("kbuild: enable 'make AFLAGS=...' to add
additional options to AS") renamed AFLAGS to KBUILD_AFLAGS.
Commit 06c5040cdb ("kbuild: enable 'make CPPFLAGS=...' to add
additional options to CPP") renamed CPPFLAGS to KBUILD_CPPFLAGS.
For some reason, LDFLAGS was not renamed.
Using a well-known variable like LDFLAGS may result in accidental
override of the variable.
Kbuild generally uses KBUILD_ prefixed variables for the internally
appended options, so here is one more conversion to sanitize the
naming convention.
I did not touch Makefiles under tools/ since the tools build system
is a different world.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>