13222 commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Greg Kroah-Hartman
|
d8a623cfbb |
This is the 4.19.80 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2o0vkACgkQONu9yGCS aT50rQ//So/ShGy5+D6cmsPVbd8p/9X0fM5D6VjgJ2pj+HbalxW7JhJ0tyVdfkO1 AS6p/24bmUMWI3pxJBbJjfggbyleV2UFqlHIei+SjfHd0sghZ+E/XY7mLdcKvx0a dGdXuxF+enb18pNiUKUrJUgqksgfO40yMXyZhjGf5A+/JmYsNaJhkyCWgPfGIB5i YuowOWC4Rkds/CQxyRSRJl+YSQWh8j7Ke0tdjhMQxnmyzFD0wcE5vUtq2vIP7yxO ypPrz9zWTase3y1yCcPsXd0m4Ji1VpF99hlgAr0HLFNcZn2kNYFpyblfOrI/zfoK RqUJgVVlZaWvhMOrueQO+XbebXdCWTJHiNTUDxIFakEAPNz+XkTQQf6LvzOjcFxB oLHnp1qBCn72y6o6Q3XmauFcmYBokyfBvXDAJsXYr+X5B4akn/fpyzzBCInX0h+r og5zpLbXDLszG3j76TPSpcjd+t4xqMy+MG+jkH/mLtMAFyaVi2sfMzsZJAQCACr2 wNHRFQQGjDmQjovkwSRYENQBUuITSj+VrBUD0M7lnfA/Fni3BAJaxY5I6WeEvbeW ejzPVFZB8dfEy7hAKKiVEuI8/akcp8MUXfLJOfTxytLIpYllOBoVC4LtjgO5FYu3 grSbRuowjrlUCtnCU9H18avLE1ScFFEGTmPOqI6xDqKiZf59QsQ= =sKVA -----END PGP SIGNATURE----- Merge 4.19.80 into android-4.19 Changes in 4.19.80 panic: ensure preemption is disabled during panic() f2fs: use EINVAL for superblock with invalid magic USB: rio500: Remove Rio 500 kernel driver USB: yurex: Don't retry on unexpected errors USB: yurex: fix NULL-derefs on disconnect USB: usb-skeleton: fix runtime PM after driver unbind USB: usb-skeleton: fix NULL-deref on disconnect xhci: Fix false warning message about wrong bounce buffer write length xhci: Prevent device initiated U1/U2 link pm if exit latency is too long xhci: Check all endpoints for LPM timeout xhci: Fix USB 3.1 capability detection on early xHCI 1.1 spec based hosts usb: xhci: wait for CNR controller not ready bit in xhci resume xhci: Prevent deadlock when xhci adapter breaks during init xhci: Increase STS_SAVE timeout in xhci_suspend() USB: adutux: fix use-after-free on disconnect USB: adutux: fix NULL-derefs on disconnect USB: adutux: fix use-after-free on release USB: iowarrior: fix use-after-free on disconnect USB: iowarrior: fix use-after-free on release USB: iowarrior: fix use-after-free after driver unbind USB: usblp: fix runtime PM after driver unbind USB: chaoskey: fix use-after-free on release USB: ldusb: fix NULL-derefs on driver unbind serial: uartlite: fix exit path null pointer USB: serial: keyspan: fix NULL-derefs on open() and write() USB: serial: ftdi_sio: add device IDs for Sienna and Echelon PL-20 USB: serial: option: add Telit FN980 compositions USB: serial: option: add support for Cinterion CLS8 devices USB: serial: fix runtime PM after driver unbind USB: usblcd: fix I/O after disconnect USB: microtek: fix info-leak at probe USB: dummy-hcd: fix power budget for SuperSpeed mode usb: renesas_usbhs: gadget: Do not discard queues in usb_ep_set_{halt,wedge}() usb: renesas_usbhs: gadget: Fix usb_ep_set_{halt,wedge}() behavior USB: legousbtower: fix slab info leak at probe USB: legousbtower: fix deadlock on disconnect USB: legousbtower: fix potential NULL-deref on disconnect USB: legousbtower: fix open after failed reset request USB: legousbtower: fix use-after-free on release mei: me: add comet point (lake) LP device ids mei: avoid FW version request on Ibex Peak and earlier gpio: eic: sprd: Fix the incorrect EIC offset when toggling Staging: fbtft: fix memory leak in fbtft_framebuffer_alloc staging: vt6655: Fix memory leak in vt6655_probe iio: adc: hx711: fix bug in sampling of data iio: adc: ad799x: fix probe error handling iio: adc: axp288: Override TS pin bias current for some models iio: light: opt3001: fix mutex unlock race efivar/ssdt: Don't iterate over EFI vars if no SSDT override was specified perf llvm: Don't access out-of-scope array perf inject jit: Fix JIT_CODE_MOVE filename blk-wbt: fix performance regression in wbt scale_up/scale_down CIFS: Gracefully handle QueryInfo errors during open CIFS: Force revalidate inode when dentry is stale CIFS: Force reval dentry if LOOKUP_REVAL flag is set kernel/sysctl.c: do not override max_threads provided by userspace mm/vmpressure.c: fix a signedness bug in vmpressure_register_event() firmware: google: increment VPD key_len properly gpiolib: don't clear FLAG_IS_OUT when emulating open-drain/open-source iio: adc: stm32-adc: move registers definitions iio: adc: stm32-adc: fix a race when using several adcs with dma and irq cifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic btrfs: fix incorrect updating of log root tree btrfs: fix uninitialized ret in ref-verify NFS: Fix O_DIRECT accounting of number of bytes read/written MIPS: Disable Loongson MMI instructions for kernel build MIPS: elf_hwcap: Export userspace ASEs ACPICA: ACPI 6.3: PPTT add additional fields in Processor Structure Flags ACPI/PPTT: Add support for ACPI 6.3 thread flag arm64: topology: Use PPTT to determine if PE is a thread Fix the locking in dcache_readdir() and friends media: stkwebcam: fix runtime PM after driver unbind arm64/sve: Fix wrong free for task->thread.sve_state tracing/hwlat: Report total time spent in all NMIs during the sample tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency ftrace: Get a reference counter for the trace_array on filter files tracing: Get trace_array reference for available_tracers files hwmon: Fix HWMON_P_MIN_ALARM mask x86/asm: Fix MWAITX C-state hint value PCI: vmd: Fix config addressing when using bus offsets perf/hw_breakpoint: Fix arch_hw_breakpoint use-before-initialization Linux 4.19.80 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I8ef1835585f79b752712a835852a4bc960ae3b97 |
||
Dan Carpenter
|
491a39dcee |
mm/vmpressure.c: fix a signedness bug in vmpressure_register_event()
commit 518a86713078168acd67cf50bc0b45d54b4cce6c upstream.
The "mode" and "level" variables are enums and in this context GCC will
treat them as unsigned ints so the error handling is never triggered.
I also removed the bogus initializer because it isn't required any more
and it's sort of confusing.
[akpm@linux-foundation.org: reduce implicit and explicit typecasting]
[akpm@linux-foundation.org: fix return value, add comment, per Matthew]
Link: http://lkml.kernel.org/r/20190925110449.GO3264@mwanda
Fixes:
|
||
Roman Gushchin
|
91711ef418 |
UPSTREAM: mm: vmalloc: show number of vmalloc pages in /proc/meminfo
Vmalloc() is getting more and more used these days (kernel stacks, bpf and
percpu allocator are new top users), and the total % of memory consumed by
vmalloc() can be pretty significant and changes dynamically.
/proc/meminfo is the best place to display this information: its top goal
is to show top consumers of the memory.
Since the VmallocUsed field in /proc/meminfo is not in use for quite a
long time (it has been defined to 0 by
|
||
Greg Kroah-Hartman
|
ef55d5261c |
This is the 4.19.79 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2grBgACgkQONu9yGCS aT6xRBAA0pTW2W/VvzBHBLeVlmNtwQZb8x7civVb72iZkltKR9tTPim90PULpz/P iO7kh8KqkgVUqdgBE0VzkHGWUSThggfSTQiqzCqOgTwV8WQWqSF8ET0HU8zbglYB 5pXSojoRYmurGVznd4Ll6aWa5brXIKwf1mDSrFHagOyOLxQmyggHaTRSLx36BSfj gunE2ideB1oTaPmd/2aTI03CU3jRwXmowe8rZIDa8pJEpplZPFdk0YOPXg2t6uRI bjJGO8bhfR/14r/3h76IwsEiVVXIcCeEVm0fos/H6NUypedfi7jlT0Ldzg1/zZti mUMkbPGHcJbOWfBYPQq8xQzviCa+MFraA4Tek5h/Lf7kf3NpjE20AnH3pb9TaqQf mJYUGziCoOOOz8k+0eNtIjIZiCysOnf9sI5rGhMYb9qfZoZGG6RiitqyVYNa+rzJ wvIUQZ4vSnYmQMAXqxyayfSZvFbMxv6pAdeH0NrXVRgFF6dnKG9TSsCnIuQaJxAE OQRaYEJktMUBs81hS0IjnJNDFLW3r++s87xEYvCt4L7XGSrxMJ3jW6xLZlmET68G 4UIddJ81zIuqpGY1qoWdWZAp3nfRfSX4ehOnoNmIDyC9pRhiCKc+N6j5rX8gBNO/ SO8YOaNf9RTphhEG6Op7u4ZbU+UR4pYP+rjKveyT2HKPH6D/Tv0= =wt6H -----END PGP SIGNATURE----- Merge 4.19.79 into android-4.19 Changes in 4.19.79 s390/process: avoid potential reading of freed stack KVM: s390: Test for bad access register and size at the start of S390_MEM_OP s390/topology: avoid firing events before kobjs are created s390/cio: exclude subchannels with no parent from pseudo check KVM: PPC: Book3S HV: Fix race in re-enabling XIVE escalation interrupts KVM: PPC: Book3S HV: Check for MMU ready on piggybacked virtual cores KVM: PPC: Book3S HV: Don't lose pending doorbell request on migration on P9 KVM: X86: Fix userspace set invalid CR4 KVM: nVMX: handle page fault in vmread fix nbd: fix max number of supported devs PM / devfreq: tegra: Fix kHz to Hz conversion ASoC: Define a set of DAPM pre/post-up events ASoC: sgtl5000: Improve VAG power and mute control powerpc/mce: Fix MCE handling for huge pages powerpc/mce: Schedule work from irq_work powerpc/powernv: Restrict OPAL symbol map to only be readable by root powerpc/powernv/ioda: Fix race in TCE level allocation powerpc/book3s64/mm: Don't do tlbie fixup for some hardware revisions can: mcp251x: mcp251x_hw_reset(): allow more time after a reset tools lib traceevent: Fix "robust" test of do_generate_dynamic_list_file crypto: qat - Silence smp_processor_id() warning crypto: skcipher - Unmap pages after an external error crypto: cavium/zip - Add missing single_release() crypto: caam - fix concurrency issue in givencrypt descriptor crypto: ccree - account for TEE not ready to report crypto: ccree - use the full crypt length value MIPS: Treat Loongson Extensions as ASEs power: supply: sbs-battery: use correct flags field power: supply: sbs-battery: only return health when battery present tracing: Make sure variable reference alias has correct var_ref_idx usercopy: Avoid HIGHMEM pfn warning timer: Read jiffies once when forwarding base clk PCI: vmd: Fix shadow offsets to reflect spec changes PCI: Restore Resizable BAR size bits correctly for 1MB BARs watchdog: imx2_wdt: fix min() calculation in imx2_wdt_set_timeout perf stat: Fix a segmentation fault when using repeat forever drm/omap: fix max fclk divider for omap36xx drm/msm/dsi: Fix return value check for clk_get_parent drm/nouveau/kms/nv50-: Don't create MSTMs for eDP connectors drm/i915/gvt: update vgpu workload head pointer correctly mmc: sdhci: improve ADMA error reporting mmc: sdhci-of-esdhc: set DMA snooping based on DMA coherence Revert "locking/pvqspinlock: Don't wait if vCPU is preempted" xen/xenbus: fix self-deadlock after killing user process ieee802154: atusb: fix use-after-free at disconnect s390/cio: avoid calling strlen on null pointer cfg80211: initialize on-stack chandefs arm64: cpufeature: Detect SSBS and advertise to userspace ima: always return negative code for error ima: fix freeing ongoing ahash_request fs: nfs: Fix possible null-pointer dereferences in encode_attrs() 9p: Transport error uninitialized 9p: avoid attaching writeback_fid on mmap with type PRIVATE xen/pci: reserve MCFG areas earlier ceph: fix directories inode i_blkbits initialization ceph: reconnect connection if session hang in opening state watchdog: aspeed: Add support for AST2600 netfilter: nf_tables: allow lookups in dynamic sets drm/amdgpu: Fix KFD-related kernel oops on Hawaii drm/amdgpu: Check for valid number of registers to read pNFS: Ensure we do clear the return-on-close layout stateid on fatal errors pwm: stm32-lp: Add check in case requested period cannot be achieved x86/purgatory: Disable the stackleak GCC plugin for the purgatory ntb: point to right memory window index thermal: Fix use-after-free when unregistering thermal zone device thermal_hwmon: Sanitize thermal_zone type libnvdimm/region: Initialize bad block for volatile namespaces fuse: fix memleak in cuse_channel_open libnvdimm/nfit_test: Fix acpi_handle redefinition sched/membarrier: Call sync_core only before usermode for same mm sched/membarrier: Fix private expedited registration check sched/core: Fix migration to invalid CPU in __set_cpus_allowed_ptr() perf build: Add detection of java-11-openjdk-devel package kernel/elfcore.c: include proper prototypes perf unwind: Fix libunwind build failure on i386 systems nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs drm/radeon: Bail earlier when radeon.cik_/si_support=0 is passed KVM: PPC: Book3S HV: XIVE: Free escalation interrupts before disabling the VP KVM: nVMX: Fix consistency check on injected exception error code nbd: fix crash when the blksize is zero powerpc/pseries: Fix cpu_hotplug_lock acquisition in resize_hpt() powerpc/book3s64/radix: Rename CPU_FTR_P9_TLBIE_BUG feature flag tools lib traceevent: Do not free tep->cmdlines in add_new_comm() on failure tick: broadcast-hrtimer: Fix a race in bc_set_next perf tools: Fix segfault in cpu_cache_level__read() perf stat: Reset previous counts on repeat with interval riscv: Avoid interrupts being erroneously enabled in handle_exception() arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3 KVM: arm64: Set SCTLR_EL2.DSSBS if SSBD is forcefully disabled and !vhe arm64: docs: Document SSBS HWCAP arm64: fix SSBS sanitization arm64: Add sysfs vulnerability show for spectre-v1 arm64: add sysfs vulnerability show for meltdown arm64: enable generic CPU vulnerabilites support arm64: Always enable ssb vulnerability detection arm64: Provide a command line to disable spectre_v2 mitigation arm64: Advertise mitigation of Spectre-v2, or lack thereof arm64: Always enable spectre-v2 vulnerability detection arm64: add sysfs vulnerability show for spectre-v2 arm64: add sysfs vulnerability show for speculative store bypass arm64: ssbs: Don't treat CPUs with SSBS as unaffected by SSB arm64: Force SSBS on context switch arm64: Use firmware to detect CPUs that are not affected by Spectre-v2 arm64/speculation: Support 'mitigations=' cmdline option vfs: Fix EOVERFLOW testing in put_compat_statfs64 coresight: etm4x: Use explicit barriers on enable/disable staging: erofs: fix an error handling in erofs_readdir() staging: erofs: some compressed cluster should be submitted for corrupted images staging: erofs: add two missing erofs_workgroup_put for corrupted images staging: erofs: detect potential multiref due to corrupted images cfg80211: add and use strongly typed element iteration macros cfg80211: Use const more consistently in for_each_element macros nl80211: validate beacon head Linux 4.19.79 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ie4f85994b5f3e53658c42833d0dc712575d0902e |
||
Kees Cook
|
12c6c4a50f |
usercopy: Avoid HIGHMEM pfn warning
commit 314eed30ede02fa925990f535652254b5bad6b65 upstream.
When running on a system with >512MB RAM with a 32-bit kernel built with:
CONFIG_DEBUG_VIRTUAL=y
CONFIG_HIGHMEM=y
CONFIG_HARDENED_USERCOPY=y
all execve()s will fail due to argv copying into kmap()ed pages, and on
usercopy checking the calls ultimately of virt_to_page() will be looking
for "bad" kmap (highmem) pointers due to CONFIG_DEBUG_VIRTUAL=y:
------------[ cut here ]------------
kernel BUG at ../arch/x86/mm/physaddr.c:83!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc8 #6
Hardware name: Dell Inc. Inspiron 1318/0C236D, BIOS A04 01/15/2009
EIP: __phys_addr+0xaf/0x100
...
Call Trace:
__check_object_size+0xaf/0x3c0
? __might_sleep+0x80/0xa0
copy_strings+0x1c2/0x370
copy_strings_kernel+0x2b/0x40
__do_execve_file+0x4ca/0x810
? kmem_cache_alloc+0x1c7/0x370
do_execve+0x1b/0x20
...
The check is from arch/x86/mm/physaddr.c:
VIRTUAL_BUG_ON((phys_addr >> PAGE_SHIFT) > max_low_pfn);
Due to the kmap() in fs/exec.c:
kaddr = kmap(kmapped_page);
...
if (copy_from_user(kaddr+offset, str, bytes_to_copy)) ...
Now we can fetch the correct page to avoid the pfn check. In both cases,
hardened usercopy will need to walk the page-span checker (if enabled)
to do sanity checking.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Fixes:
|
||
Catalin Marinas
|
13daec4d70 |
UPSTREAM: mm: untag user pointers in mmap/munmap/mremap/brk
(Upstream commit ce18d171cb7368557e6498a3ce111d7d3dc03e4d). There isn't a good reason to differentiate between the user address space layout modification syscalls and the other memory permission/attributes ones (e.g. mprotect, madvise) w.r.t. the tagged address ABI. Untag the user addresses on entry to these functions. Link: http://lkml.kernel.org/r/20190821164730.47450-2-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Will Deacon <will@kernel.org> Acked-by: Andrey Konovalov <andreyknvl@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Szabolcs Nagy <szabolcs.nagy@arm.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Dave P Martin <Dave.Martin@arm.com> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 135692346 Change-Id: If6a7be6d5f39f411bd65c49100088d0ceae530ee |
||
Andrey Konovalov
|
6e651342c6 |
UPSTREAM: mm: untag user pointers in get_vaddr_frames
(Upstream commit 5d65e7a7d8cd5c77baa1acf129a11b8b45ffee75). This patch is a part of a series that extends kernel ABI to allow to pass tagged user pointers (with the top byte set to something else other than 0x00) as syscall arguments. get_vaddr_frames uses provided user pointers for vma lookups, which can only by done with untagged pointers. Instead of locating and changing all callers of this function, perform untagging in it. Link: http://lkml.kernel.org/r/28f05e49c92b2a69c4703323d6c12208f3d881fe.1563904656.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Auger <eric.auger@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jens Wiklander <jens.wiklander@linaro.org> Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 135692346 Change-Id: I1a8fd78531a7cf299cb41192519857b40e0d2305 |
||
Andrey Konovalov
|
d59b4a6646 |
UPSTREAM: mm: untag user pointers in mm/gup.c
(Upstream commit f9652594195fca8c3d8b8ee392ad0ff9f701bb20). This patch is a part of a series that extends kernel ABI to allow to pass tagged user pointers (with the top byte set to something else other than 0x00) as syscall arguments. mm/gup.c provides a kernel interface that accepts user addresses and manipulates user pages directly (for example get_user_pages, that is used by the futex syscall). Since a user can provided tagged addresses, we need to handle this case. Add untagging to gup.c functions that use user addresses for vma lookups. Link: http://lkml.kernel.org/r/4731bddba3c938658c10ff4ed55cc01c60f4c8f8.1563904656.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Auger <eric.auger@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jens Wiklander <jens.wiklander@linaro.org> Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 135692346 Change-Id: Icdfc4cf0851cd84bac89908d53ba17c65df40077 |
||
Andrey Konovalov
|
690c4ca8a5 |
UPSTREAM: mm: untag user pointers passed to memory syscalls
(Upstream commit 057d3389108eda8a20c7f496f011846932680d88). This patch is a part of a series that extends kernel ABI to allow to pass tagged user pointers (with the top byte set to something else other than 0x00) as syscall arguments. This patch allows tagged pointers to be passed to the following memory syscalls: get_mempolicy, madvise, mbind, mincore, mlock, mlock2, mprotect, mremap, msync, munlock, move_pages. The mmap and mremap syscalls do not currently accept tagged addresses. Architectures may interpret the tag as a background colour for the corresponding vma. Link: http://lkml.kernel.org/r/aaf0c0969d46b2feb9017f3e1b3ef3970b633d91.1563904656.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Auger <eric.auger@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jens Wiklander <jens.wiklander@linaro.org> Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 135692346 Change-Id: I1a2d89eedb45e618e85ca515f4c9121460711efb |
||
Greg Kroah-Hartman
|
7f1f24fed2 |
This is the 4.19.77 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2YehoACgkQONu9yGCS aT536hAA0w6XF1ogPi23xS5BQ386c3BjWaOJddu0PFkR9utguuZaDG5AKjnkHtm2 foxKeCGyChCc1h6qFD2wLavsQMephzLWXkMw1/uXfwAPXGWY1hquD46zrFO0IYOc dsmKh5V8ZT70lmPTBHCDVowp1xUczNFhuNRHz1KzVQ3lsKMAhsRiy6vpsXWzIeOM VCrayI4jxTVZvoA9Odm9fTXpau0Qb9k4pqGGU84dz/xWfGzrz/AOmp+kWuZRGepS fQi+46HK3idmGGNBVGkJYkDdpkefdrYdjlVHkkoAZHlaB+QaOrre0trNArK+knRu jJIl8R/M50yYbkn97vVIpwoh19rV0BI7KUFKO4C4NCByOOV+9vjJ5g1FuZ7c3pmj XKZiBmFBJVqbI1uwgqEn+/Tv6DyEz4gUzo5GHOe2i/Bh2nSguHZzO+Yhat1U+J+Y 3QVfrIS10jICWBAqm147FepGtNxbCPP3plyqbJdrMwgM6GOMd83v+rY/5okB2YK4 SHhzuevQoxujjeoOgHKjIqTiCIaJMt5ZvlCFqODg8NgpjTNDAbCA/UUQWs7MlrqX 6uwH2QLwk3h+bEXUfGzsbsk5isTDpHr1ch6SI9T675JHRZCE461RoR82XpjLOyHD 90tBJk9pfKlLqSEyiaWPPpXErhoS5WCVKM5yuT/rSS/i2Ve4VH4= =Q+c9 -----END PGP SIGNATURE----- Merge 4.19.77 into android-4.19 Changes in 4.19.77 arcnet: provide a buffer big enough to actually receive packets cdc_ncm: fix divide-by-zero caused by invalid wMaxPacketSize macsec: drop skb sk before calling gro_cells_receive net/phy: fix DP83865 10 Mbps HDX loopback disable function net: qrtr: Stop rx_worker before freeing node net/sched: act_sample: don't push mac header on ip6gre ingress net_sched: add max len check for TCA_KIND nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs openvswitch: change type of UPCALL_PID attribute to NLA_UNSPEC ppp: Fix memory leak in ppp_write sch_netem: fix a divide by zero in tabledist() skge: fix checksum byte order usbnet: ignore endpoints with invalid wMaxPacketSize usbnet: sanity checking of packet sizes and device mtu net: sched: fix possible crash in tcf_action_destroy() tcp: better handle TCP_USER_TIMEOUT in SYN_SENT state net/mlx5: Add device ID of upcoming BlueField-2 mISDN: enforce CAP_NET_RAW for raw sockets appletalk: enforce CAP_NET_RAW for raw sockets ax25: enforce CAP_NET_RAW for raw sockets ieee802154: enforce CAP_NET_RAW for raw sockets nfc: enforce CAP_NET_RAW for raw sockets nfp: flower: prevent memory leak in nfp_flower_spawn_phy_reprs ALSA: hda: Flush interrupts on disabling regulator: lm363x: Fix off-by-one n_voltages for lm3632 ldo_vpos/ldo_vneg ASoC: tlv320aic31xx: suppress error message for EPROBE_DEFER ASoC: sgtl5000: Fix of unmute outputs on probe ASoC: sgtl5000: Fix charge pump source assignment firmware: qcom_scm: Use proper types for dma mappings dmaengine: bcm2835: Print error in case setting DMA mask fails leds: leds-lp5562 allow firmware files up to the maximum length media: dib0700: fix link error for dibx000_i2c_set_speed media: mtk-cir: lower de-glitch counter for rc-mm protocol media: exynos4-is: fix leaked of_node references media: hdpvr: Add device num check and handling media: i2c: ov5640: Check for devm_gpiod_get_optional() error time/tick-broadcast: Fix tick_broadcast_offline() lockdep complaint sched/fair: Fix imbalance due to CPU affinity sched/core: Fix CPU controller for !RT_GROUP_SCHED x86/apic: Make apic_pending_intr_clear() more robust sched/deadline: Fix bandwidth accounting at all levels after offline migration x86/reboot: Always use NMI fallback when shutdown via reboot vector IPI fails x86/apic: Soft disable APIC before initializing it ALSA: hda - Show the fatal CORB/RIRB error more clearly ALSA: i2c: ak4xxx-adda: Fix a possible null pointer dereference in build_adc_controls() EDAC/mc: Fix grain_bits calculation media: iguanair: add sanity checks base: soc: Export soc_device_register/unregister APIs ALSA: usb-audio: Skip bSynchAddress endpoint check if it is invalid ia64:unwind: fix double free for mod->arch.init_unw_table EDAC/altera: Use the proper type for the IRQ status bits ASoC: rsnd: don't call clk_get_rate() under atomic context arm64/prefetch: fix a -Wtype-limits warning md/raid1: end bio when the device faulty md: don't call spare_active in md_reap_sync_thread if all member devices can't work md: don't set In_sync if array is frozen media: media/platform: fsl-viu.c: fix build for MICROBLAZE ACPI / processor: don't print errors for processorIDs == 0xff loop: Add LOOP_SET_DIRECT_IO to compat ioctl EDAC, pnd2: Fix ioremap() size in dnv_rd_reg() efi: cper: print AER info of PCIe fatal error firmware: arm_scmi: Check if platform has released shmem before using sched/fair: Use rq_lock/unlock in online_fair_sched_group idle: Prevent late-arriving interrupts from disrupting offline media: gspca: zero usb_buf on error perf config: Honour $PERF_CONFIG env var to specify alternate .perfconfig perf test vfs_getname: Disable ~/.perfconfig to get default output media: mtk-mdp: fix reference count on old device tree media: fdp1: Reduce FCP not found message level to debug media: em28xx: modules workqueue not inited for 2nd device media: rc: imon: Allow iMON RC protocol for ffdc 7e device dmaengine: iop-adma: use correct printk format strings perf record: Support aarch64 random socket_id assignment media: vsp1: fix memory leak of dl on error return path media: i2c: ov5645: Fix power sequence media: omap3isp: Don't set streaming state on random subdevs media: imx: mipi csi-2: Don't fail if initial state times-out net: lpc-enet: fix printk format strings m68k: Prevent some compiler warnings in Coldfire builds ARM: dts: imx7d: cl-som-imx7: make ethernet work again ARM: dts: imx7-colibri: disable HS400 media: radio/si470x: kill urb on error media: hdpvr: add terminating 0 at end of string ASoC: uniphier: Fix double reset assersion when transitioning to suspend state tools headers: Fixup bitsperlong per arch includes ASoC: sun4i-i2s: Don't use the oversample to calculate BCLK led: triggers: Fix a memory leak bug nbd: add missing config put media: mceusb: fix (eliminate) TX IR signal length limit media: dvb-frontends: use ida for pll number posix-cpu-timers: Sanitize bogus WARNONS media: dvb-core: fix a memory leak bug libperf: Fix alignment trap with xyarray contents in 'perf stat' EDAC/amd64: Recognize DRAM device type ECC capability EDAC/amd64: Decode syndrome before translating address PM / devfreq: passive: Use non-devm notifiers PM / devfreq: exynos-bus: Correct clock enable sequence media: cec-notifier: clear cec_adap in cec_notifier_unregister media: saa7146: add cleanup in hexium_attach() media: cpia2_usb: fix memory leaks media: saa7134: fix terminology around saa7134_i2c_eeprom_md7134_gate() perf trace beauty ioctl: Fix off-by-one error in cmd->string table media: ov9650: add a sanity check ASoC: es8316: fix headphone mixer volume table ACPI / CPPC: do not require the _PSD method sched/cpufreq: Align trace event behavior of fast switching x86/apic/vector: Warn when vector space exhaustion breaks affinity arm64: kpti: ensure patched kernel text is fetched from PoU x86/mm/pti: Do not invoke PTI functions when PTI is disabled ASoC: fsl_ssi: Fix clock control issue in master mode x86/mm/pti: Handle unaligned address gracefully in pti_clone_pagetable() nvmet: fix data units read and written counters in SMART log nvme-multipath: fix ana log nsid lookup when nsid is not found ALSA: firewire-motu: add support for MOTU 4pre iommu/amd: Silence warnings under memory pressure libata/ahci: Drop PCS quirk for Denverton and beyond iommu/iova: Avoid false sharing on fq_timer_on libtraceevent: Change users plugin directory ARM: dts: exynos: Mark LDO10 as always-on on Peach Pit/Pi Chromebooks ACPI: custom_method: fix memory leaks ACPI / PCI: fix acpi_pci_irq_enable() memory leak closures: fix a race on wakeup from closure_sync hwmon: (acpi_power_meter) Change log level for 'unsafe software power cap' md/raid1: fail run raid1 array when active disk less than one dmaengine: ti: edma: Do not reset reserved paRAM slots kprobes: Prohibit probing on BUG() and WARN() address s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding x86/cpu: Add Tiger Lake to Intel family platform/x86: intel_pmc_core: Do not ioremap RAM ASoC: dmaengine: Make the pcm->name equal to pcm->id if the name is not set raid5: don't set STRIPE_HANDLE to stripe which is in batch list mmc: core: Clarify sdio_irq_pending flag for MMC_CAP2_SDIO_IRQ_NOTHREAD mmc: sdhci: Fix incorrect switch to HS mode mmc: core: Add helper function to indicate if SDIO IRQs is enabled mmc: dw_mmc: Re-store SDIO IRQs mask at system resume raid5: don't increment read_errors on EILSEQ return libertas: Add missing sentinel at end of if_usb.c fw_table e1000e: add workaround for possible stalled packet ALSA: hda - Drop unsol event handler for Intel HDMI codecs drm/amd/powerplay/smu7: enforce minimal VBITimeout (v2) media: ttusb-dec: Fix info-leak in ttusb_dec_send_command() ALSA: hda/realtek - Blacklist PC beep for Lenovo ThinkCentre M73/93 iommu/amd: Override wrong IVRS IOAPIC on Raven Ridge systems btrfs: extent-tree: Make sure we only allocate extents from block groups with the same type media: omap3isp: Set device on omap3isp subdevs PM / devfreq: passive: fix compiler warning iwlwifi: fw: don't send GEO_TX_POWER_LIMIT command to FW version 36 ALSA: firewire-tascam: handle error code when getting current source of clock ALSA: firewire-tascam: check intermediate state of clock status and retry scsi: scsi_dh_rdac: zero cdb in send_mode_select() scsi: qla2xxx: Fix Relogin to prevent modifying scan_state flag printk: Do not lose last line in kmsg buffer dump IB/mlx5: Free mpi in mp_slave mode IB/hfi1: Define variables as unsigned long to fix KASAN warning randstruct: Check member structs in is_pure_ops_struct() Revert "ceph: use ceph_evict_inode to cleanup inode's resource" ceph: use ceph_evict_inode to cleanup inode's resource ALSA: hda/realtek - PCI quirk for Medion E4254 blk-mq: add callback of .cleanup_rq scsi: implement .cleanup_rq callback powerpc/imc: Dont create debugfs files for cpu-less nodes fuse: fix missing unlock_page in fuse_writepage() parisc: Disable HP HSC-PCI Cards to prevent kernel crash KVM: x86: always stop emulation on page fault KVM: x86: set ctxt->have_exception in x86_decode_insn() KVM: x86: Manually calculate reserved bits when loading PDPTRS media: sn9c20x: Add MSI MS-1039 laptop to flip_dmi_table media: don't drop front-end reference count for ->detach binfmt_elf: Do not move brk for INTERP-less ET_EXEC ASoC: Intel: NHLT: Fix debug print format ASoC: Intel: Skylake: Use correct function to access iomem space ASoC: Intel: Fix use of potentially uninitialized variable ARM: samsung: Fix system restart on S3C6410 ARM: zynq: Use memcpy_toio instead of memcpy on smp bring-up Revert "arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}" arm64: tlb: Ensure we execute an ISB following walk cache invalidation arm64: dts: rockchip: limit clock rate of MMC controllers for RK3328 alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP regulator: Defer init completion for a while after late_initcall efifb: BGRT: Improve efifb_bgrt_sanity_check gfs2: clear buf_in_tr when ending a transaction in sweep_bh_for_rgrps memcg, oom: don't require __GFP_FS when invoking memcg OOM killer memcg, kmem: do not fail __GFP_NOFAIL charges i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask block: fix null pointer dereference in blk_mq_rq_timed_out() smb3: allow disabling requesting leases ovl: Fix dereferencing possible ERR_PTR() ovl: filter of trusted xattr results in audit btrfs: fix allocation of free space cache v1 bitmap pages Btrfs: fix use-after-free when using the tree modification log btrfs: Relinquish CPUs in btrfs_compare_trees btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls Btrfs: fix race setting up and completing qgroup rescan workers md/raid6: Set R5_ReadError when there is read failure on parity disk md: don't report active array_state until after revalidate_disk() completes. md: only call set_in_sync() when it is expected to succeed. cfg80211: Purge frame registrations on iftype change /dev/mem: Bail out upon SIGKILL. ext4: fix warning inside ext4_convert_unwritten_extents_endio ext4: fix punch hole for inline_data file systems quota: fix wrong condition in is_quota_modification() hwrng: core - don't wait on add_early_randomness() i2c: riic: Clear NACK in tend isr CIFS: fix max ea value size CIFS: Fix oplock handling for SMB 2.1+ protocols md/raid0: avoid RAID0 data corruption due to layout confusion. fuse: fix deadlock with aio poll and fuse_iqueue::waitq.lock mm/compaction.c: clear total_{migrate,free}_scanned before scanning a new zone drm/amd/display: Restore backlight brightness after system resume Linux 4.19.77 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I2c74f09f497a4b45b244a7dd263c5116533dfccb |
||
Yafang Shao
|
4d8bdf7f3a |
mm/compaction.c: clear total_{migrate,free}_scanned before scanning a new zone
[ Upstream commit a94b525241c0fff3598809131d7cfcfe1d572d8c ]
total_{migrate,free}_scanned will be added to COMPACTMIGRATE_SCANNED and
COMPACTFREE_SCANNED in compact_zone(). We should clear them before
scanning a new zone. In the proc triggered compaction, we forgot clearing
them.
[laoar.shao@gmail.com: introduce a helper compact_zone_counters_init()]
Link: http://lkml.kernel.org/r/1563869295-25748-1-git-send-email-laoar.shao@gmail.com
[akpm@linux-foundation.org: expand compact_zone_counters_init() into its single callsite, per mhocko]
[vbabka@suse.cz: squash compact_zone() list_head init as well]
Link: http://lkml.kernel.org/r/1fb6f7da-f776-9e42-22f8-bbb79b030b98@suse.cz
[akpm@linux-foundation.org: kcompactd_do_work(): avoid unnecessary initialization of cc.zone]
Link: http://lkml.kernel.org/r/1563789275-9639-1-git-send-email-laoar.shao@gmail.com
Fixes:
|
||
Michal Hocko
|
b4a734a529 |
memcg, kmem: do not fail __GFP_NOFAIL charges
commit e55d9d9bfb69405bd7615c0f8d229d8fafb3e9b8 upstream. Thomas has noticed the following NULL ptr dereference when using cgroup v1 kmem limit: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 3 PID: 16923 Comm: gtk-update-icon Not tainted 4.19.51 #42 Hardware name: Gigabyte Technology Co., Ltd. Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015 RIP: 0010:create_empty_buffers+0x24/0x100 Code: cd 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 d4 ba 01 00 00 00 55 53 48 89 fb e8 97 fe ff ff 48 89 c5 48 89 c2 eb 03 48 89 ca <48> 8b 4a 08 4c 09 22 48 85 c9 75 f1 48 89 6a 08 48 8b 43 18 48 8d RSP: 0018:ffff927ac1b37bf8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: fffff2d4429fd740 RCX: 0000000100097149 RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff9075a99fbe00 RBP: 0000000000000000 R08: fffff2d440949cc8 R09: 00000000000960c0 R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000 R13: ffff907601f18360 R14: 0000000000002000 R15: 0000000000001000 FS: 00007fb55b288bc0(0000) GS:ffff90761f8c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000007aebc002 CR4: 00000000001606e0 Call Trace: create_page_buffers+0x4d/0x60 __block_write_begin_int+0x8e/0x5a0 ? ext4_inode_attach_jinode.part.82+0xb0/0xb0 ? jbd2__journal_start+0xd7/0x1f0 ext4_da_write_begin+0x112/0x3d0 generic_perform_write+0xf1/0x1b0 ? file_update_time+0x70/0x140 __generic_file_write_iter+0x141/0x1a0 ext4_file_write_iter+0xef/0x3b0 __vfs_write+0x17e/0x1e0 vfs_write+0xa5/0x1a0 ksys_write+0x57/0xd0 do_syscall_64+0x55/0x160 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Tetsuo then noticed that this is because the __memcg_kmem_charge_memcg fails __GFP_NOFAIL charge when the kmem limit is reached. This is a wrong behavior because nofail allocations are not allowed to fail. Normal charge path simply forces the charge even if that means to cross the limit. Kmem accounting should be doing the same. Link: http://lkml.kernel.org/r/20190906125608.32129-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com> Debugged-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Thomas Lindroth <thomas.lindroth@gmail.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Tetsuo Handa
|
d40b3eafb5 |
memcg, oom: don't require __GFP_FS when invoking memcg OOM killer
commit f9c645621a28e37813a1de96d9cbd89cde94a1e4 upstream. Masoud Sharbiani noticed that commit |
||
Andrey Ryabinin
|
a0aef70a47 |
UPSTREAM: mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y
(Upstream commit 00fb24a42a68b1ee0f6495993fe1be7124433dfb). The code like this: ptr = kmalloc(size, GFP_KERNEL); page = virt_to_page(ptr); offset = offset_in_page(ptr); kfree(page_address(page) + offset); may produce false-positive invalid-free reports on the kernel with CONFIG_KASAN_SW_TAGS=y. In the example above we lose the original tag assigned to 'ptr', so kfree() gets the pointer with 0xFF tag. In kfree() we check that 0xFF tag is different from the tag in shadow hence print false report. Instead of just comparing tags, do the following: 1) Check that shadow doesn't contain KASAN_TAG_INVALID. Otherwise it's double-free and it doesn't matter what tag the pointer have. 2) If pointer tag is different from 0xFF, make sure that tag in the shadow is the same as in the pointer. Link: http://lkml.kernel.org/r/20190819172540.19581-1-aryabinin@virtuozzo.com Fixes: 7f94ffbc4c6a ("kasan: add hooks implementation for tag-based mode") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: Walter Wu <walter-zh.wu@mediatek.com> Reported-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I3250be3a6e243086c561899a3458387370db693a |
||
Nathan Chancellor
|
c89c308809 |
UPSTREAM: kasan: initialize tag to 0xff in __kasan_kmalloc
(Upstream commit 0600597c854e53d2f9b7a6a718c1da2b8b4cb4db). When building with -Wuninitialized and CONFIG_KASAN_SW_TAGS unset, Clang warns: mm/kasan/common.c:484:40: warning: variable 'tag' is uninitialized when used here [-Wuninitialized] kasan_unpoison_shadow(set_tag(object, tag), size); ^~~ set_tag ignores tag in this configuration but clang doesn't realize it at this point in its pipeline, as it points to arch_kasan_set_tag as being the point where it is used, which will later be expanded to (void *)(object) without a use of tag. Initialize tag to 0xff, as it removes this warning and doesn't change the meaning of the code. Link: https://github.com/ClangBuiltLinux/linux/issues/465 Link: http://lkml.kernel.org/r/20190502163057.6603-1-natechancellor@gmail.com Fixes: 7f94ffbc4c6a ("kasan: add hooks implementation for tag-based mode") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I15c57bb15b36ed9f2dbfe31641a5a2905d3ab668 |
||
Peter Zijlstra
|
40b539bf2e |
UPSTREAM: x86/uaccess, kasan: Fix KASAN vs SMAP
(Upstream commit 57b78a62e7f23c4686fe54091cdc3d12e60d6513). KASAN inserts extra code for every LOAD/STORE emitted by te compiler. Much of this code is simple and safe to run with AC=1, however the kasan_report() function, called on error, is most certainly not safe to call with AC=1. Therefore wrap kasan_report() in user_access_{save,restore}; which for x86 SMAP, saves/restores EFLAGS and clears AC before calling the real function. Also ensure all the functions are without __fentry__ hook. The function tracer is also not safe. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I636890ae2e0dcb5a54f27942d1b00098bb1a43fe |
||
Qian Cai
|
0ee9b4391a |
UPSTREAM: kasan: fix variable 'tag' set but not used warning
(Upstream commit c412a769d2452161e97f163c4c4f31efc6626f06). set_tag() compiles away when CONFIG_KASAN_SW_TAGS=n, so make arch_kasan_set_tag() a static inline function to fix warnings below. mm/kasan/common.c: In function '__kasan_kmalloc': mm/kasan/common.c:475:5: warning: variable 'tag' set but not used [-Wunused-but-set-variable] u8 tag; ^~~ Link: http://lkml.kernel.org/r/20190307185244.54648-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I46b33353a25e5e3e0ef7b61dba56ae52922a5452 |
||
Andrey Konovalov
|
6395402ef0 |
UPSTREAM: kasan: fix coccinelle warnings in kasan_p*_table
(Upstream commit 5c0198b6fb73867dea296a69a944a4fdbceff3d8).
kasan_p4d_table(), kasan_pmd_table() and kasan_pud_table() are declared
as returning bool, but return 0 instead of false, which produces a
coccinelle warning. Fix it.
Link: http://lkml.kernel.org/r/1fa6fadf644859e8a6a8ecce258444b49be8c7ee.1551716733.git.andreyknvl@google.com
Fixes:
|
||
Arnd Bergmann
|
cdb775fce4 |
UPSTREAM: kasan: fix kasan_check_read/write definitions
(Upstream commit bcf6f55a0d05eedd8ebb6ecc60ae3f93205ad833). Building little-endian allmodconfig kernels on arm64 started failing with the generated atomic.h implementation, since we now try to call kasan helpers from the EFI stub: aarch64-linux-gnu-ld: drivers/firmware/efi/libstub/arm-stub.stub.o: in function `atomic_set': include/generated/atomic-instrumented.h:44: undefined reference to `__efistub_kasan_check_write' I suspect that we get similar problems in other files that explicitly disable KASAN for some reason but call atomic_t based helper functions. We can fix this by checking the predefined __SANITIZE_ADDRESS__ macro that the compiler sets instead of checking CONFIG_KASAN, but this in turn requires a small hack in mm/kasan/common.c so we do see the extern declaration there instead of the inline function. Link: http://lkml.kernel.org/r/20181211133453.2835077-1-arnd@arndb.de Fixes: b1864b828644 ("locking/atomics: build atomic headers as required") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Anders Roxell <anders.roxell@linaro.org> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au>, Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Ib6836f63fc9dfe594cd8ce4e99f2ad4489c98a53 |
||
Andrey Ryabinin
|
f46851e627 |
BACKPORT: kasan: remove use after scope bugs detection.
(Upstream commit 7771bdbbfd3d6f204631b6fd9e1bbc30cd15918e). Use after scope bugs detector seems to be almost entirely useless for the linux kernel. It exists over two years, but I've seen only one valid bug so far [1]. And the bug was fixed before it has been reported. There were some other use-after-scope reports, but they were false-positives due to different reasons like incompatibility with structleak plugin. This feature significantly increases stack usage, especially with GCC < 9 version, and causes a 32K stack overflow. It probably adds performance penalty too. Given all that, let's remove use-after-scope detector entirely. While preparing this patch I've noticed that we mistakenly enable use-after-scope detection for clang compiler regardless of CONFIG_KASAN_EXTRA setting. This is also fixed now. [1] http://lkml.kernel.org/r/<20171129052106.rhgbjhhis53hkgfn@wfg-t540p.sh.intel.com> Link: http://lkml.kernel.org/r/20190111185842.13978-1-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Will Deacon <will.deacon@arm.com> [arm64] Cc: Qian Cai <cai@lca.pw> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: I00d869a5e287e3e198950b78ad17bd6ff6a51595 Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 |
||
Qian Cai
|
e50674c599 |
UPSTREAM: slub: fix a crash with SLUB_DEBUG + KASAN_SW_TAGS
(Upstream commit 6373dca16c911b2828ef8d836d7f6f1800e1bbbc). In process_slab(), "p = get_freepointer()" could return a tagged pointer, but "addr = page_address()" always return a native pointer. As the result, slab_index() is messed up here, return (p - addr) / s->size; All other callers of slab_index() have the same situation where "addr" is from page_address(), so just need to untag "p". # cat /sys/kernel/slab/hugetlbfs_inode_cache/alloc_calls Unable to handle kernel paging request at virtual address 2bff808aa4856d48 Mem abort info: ESR = 0x96000007 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000007 CM = 0, WnR = 0 swapper pgtable: 64k pages, 48-bit VAs, pgdp = 0000000002498338 [2bff808aa4856d48] pgd=00000097fcfd0003, pud=00000097fcfd0003, pmd=00000097fca30003, pte=00e8008b24850712 Internal error: Oops: 96000007 [#1] SMP CPU: 3 PID: 79210 Comm: read_all Tainted: G L 5.0.0-rc7+ #84 Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.0.6 07/10/2018 pstate: 00400089 (nzcv daIf +PAN -UAO) pc : get_map+0x78/0xec lr : get_map+0xa0/0xec sp : aeff808989e3f8e0 x29: aeff808989e3f940 x28: ffff800826200000 x27: ffff100012d47000 x26: 9700000000002500 x25: 0000000000000001 x24: 52ff8008200131f8 x23: 52ff8008200130a0 x22: 52ff800820013098 x21: ffff800826200000 x20: ffff100013172ba0 x19: 2bff808a8971bc00 x18: ffff1000148f5538 x17: 000000000000001b x16: 00000000000000ff x15: ffff1000148f5000 x14: 00000000000000d2 x13: 0000000000000001 x12: 0000000000000000 x11: 0000000020000002 x10: 2bff808aa4856d48 x9 : 0000020000000000 x8 : 68ff80082620ebb0 x7 : 0000000000000000 x6 : ffff1000105da1dc x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000010 x2 : 2bff808a8971bc00 x1 : ffff7fe002098800 x0 : ffff80082620ceb0 Process read_all (pid: 79210, stack limit = 0x00000000f65b9361) Call trace: get_map+0x78/0xec process_slab+0x7c/0x47c list_locations+0xb0/0x3c8 alloc_calls_show+0x34/0x40 slab_attr_show+0x34/0x48 sysfs_kf_seq_show+0x2e4/0x570 kernfs_seq_show+0x12c/0x1a0 seq_read+0x48c/0xf84 kernfs_fop_read+0xd4/0x448 __vfs_read+0x94/0x5d4 vfs_read+0xcc/0x194 ksys_read+0x6c/0xe8 __arm64_sys_read+0x68/0xb0 el0_svc_handler+0x230/0x3bc el0_svc+0x8/0xc Code: d3467d2a 9ac92329 8b0a0e6a f9800151 (c85f7d4b) ---[ end trace a383a9a44ff13176 ]--- Kernel panic - not syncing: Fatal exception SMP: stopping secondary CPUs SMP: failed to stop secondary CPUs 1-7,32,40,127 Kernel Offset: disabled CPU features: 0x002,20000c18 Memory Limit: none ---[ end Kernel panic - not syncing: Fatal exception ]--- Link: http://lkml.kernel.org/r/20190220020251.82039-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I14d53cd0c3ce9672edc3b9666c68241e42d2fced |
||
Andrey Konovalov
|
b26d7811dd |
UPSTREAM: kasan, slab: remove redundant kasan_slab_alloc hooks
(Upstream commit 557ea25383a231fe3ffc72881ada35c24b960dbc). kasan_slab_alloc() calls in kmem_cache_alloc() and kmem_cache_alloc_node() are redundant as they are already called via slab_alloc/slab_alloc_node()-> slab_post_alloc_hook()->kasan_slab_alloc(). Remove them. Link: http://lkml.kernel.org/r/4ca1655cdcfc4379c49c50f7bf80f81c4ad01485.1550602886.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Qian Cai <cai@lca.pw> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgeniy Stepanov <eugenis@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I61be422e28a4e31f36f9ec5bea654077691b89b9 |
||
Andrey Konovalov
|
8607c71441 |
UPSTREAM: kasan, slab: make freelist stored without tags
(Upstream commit 51dedad06b5f6c3eea7ec1069631b1ef7796912a). Similarly to "kasan, slub: move kasan_poison_slab hook before page_address", move kasan_poison_slab() before alloc_slabmgmt(), which calls page_address(), to make page_address() return value to be non-tagged. This, combined with calling kasan_reset_tag() for off-slab slab management object, leads to freelist being stored non-tagged. Link: http://lkml.kernel.org/r/dfb53b44a4d00de3879a05a9f04c1f55e584f7a1.1550602886.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Qian Cai <cai@lca.pw> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgeniy Stepanov <eugenis@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I65954f8eed880ff2a0fe76a81bcf2703f707a585 |
||
Andrey Konovalov
|
137a7d4213 |
UPSTREAM: kasan, slab: fix conflicts with CONFIG_HARDENED_USERCOPY
(Upstream commit 219667c23c68eb3dbc0d5662b9246f28477fe529). Similarly to commit 96fedce27e13 ("kasan: make tag based mode work with CONFIG_HARDENED_USERCOPY"), we need to reset pointer tags in __check_heap_object() in mm/slab.c before doing any pointer math. Link: http://lkml.kernel.org/r/9a5c0f958db10e69df5ff9f2b997866b56b7effc.1550602886.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Qian Cai <cai@lca.pw> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgeniy Stepanov <eugenis@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I411184ecec0e3f77014bb330de3db5728d2065ed |
||
Andrey Konovalov
|
d735952416 |
UPSTREAM: kasan: prevent tracing of tags.c
(Upstream commit dc15a8a2543cb9ebe67998eefe2880ce1a20d42e). Similarly to commit 0d0c8de8788b ("kasan: mark file common so ftrace doesn't trace it") add the -pg flag to mm/kasan/tags.c to prevent conflicts with tracing. Link: http://lkml.kernel.org/r/9c4c3ce5ccfb894c7fe66d91de7c1da2787b4da4.1550602886.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Qian Cai <cai@lca.pw> Tested-by: Qian Cai <cai@lca.pw> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Evgeniy Stepanov <eugenis@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I41a129790bd12e699f816e27fb086a6570d4223e |
||
Andrey Konovalov
|
aa846c8644 |
UPSTREAM: kasan: fix random seed generation for tag-based mode
(Upstream commit 3f41b609382388f95c0a05b69b8db0d706adafb4). There are two issues with assigning random percpu seeds right now: 1. We use for_each_possible_cpu() to iterate over cpus, but cpumask is not set up yet at the moment of kasan_init(), and thus we only set the seed for cpu #0. 2. A call to get_random_u32() always returns the same number and produces a message in dmesg, since the random subsystem is not yet initialized. Fix 1 by calling kasan_init_tags() after cpumask is set up. Fix 2 by using get_cycles() instead of get_random_u32(). This gives us lower quality random numbers, but it's good enough, as KASAN is meant to be used as a debugging tool and not a mitigation. Link: http://lkml.kernel.org/r/1f815cc914b61f3516ed4cc9bfd9eeca9bd5d9de.1550677973.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Ic4b29dfd24515f02302d5cbd0d79eab5c6f0642c |
||
Qian Cai
|
2a9788f5f8 |
UPSTREAM: slub: fix SLAB_CONSISTENCY_CHECKS + KASAN_SW_TAGS
(Upstream commit 338cfaad4993d3bc35a740e28981747770a65f90). Enabling SLUB_DEBUG's SLAB_CONSISTENCY_CHECKS with KASAN_SW_TAGS triggers endless false positives during boot below due to check_valid_pointer() checks tagged pointers which have no addresses that is valid within slab pages: BUG radix_tree_node (Tainted: G B ): Freelist Pointer check fails ----------------------------------------------------------------------------- INFO: Slab objects=69 used=69 fp=0x (null) flags=0x7ffffffc000200 INFO: Object @offset=15060037153926966016 fp=0x Redzone: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 18 6b 06 00 08 80 ff d0 .........k...... Object : 18 6b 06 00 08 80 ff d0 00 00 00 00 00 00 00 00 .k.............. Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Redzone: bb bb bb bb bb bb bb bb ........ Padding: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B 5.0.0-rc5+ #18 Call trace: dump_backtrace+0x0/0x450 show_stack+0x20/0x2c __dump_stack+0x20/0x28 dump_stack+0xa0/0xfc print_trailer+0x1bc/0x1d0 object_err+0x40/0x50 alloc_debug_processing+0xf0/0x19c ___slab_alloc+0x554/0x704 kmem_cache_alloc+0x2f8/0x440 radix_tree_node_alloc+0x90/0x2fc idr_get_free+0x1e8/0x6d0 idr_alloc_u32+0x11c/0x2a4 idr_alloc+0x74/0xe0 worker_pool_assign_id+0x5c/0xbc workqueue_init_early+0x49c/0xd50 start_kernel+0x52c/0xac4 FIX radix_tree_node: Marking all objects used Link: http://lkml.kernel.org/r/20190209044128.3290-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Change-Id: Id7d2fe26fd720bd4bd9fa9ae7f5a04be881066e7 Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 |
||
Andrey Konovalov
|
712b50a15c |
UPSTREAM: kasan, slub: fix more conflicts with CONFIG_SLAB_FREELIST_HARDENED
(Upstream commit d36a63a943e37081e92e4abdf4a207fd2e83a006). When CONFIG_KASAN_SW_TAGS is enabled, ptr_addr might be tagged. Normally, this doesn't cause any issues, as both set_freepointer() and get_freepointer() are called with a pointer with the same tag. However, there are some issues with CONFIG_SLUB_DEBUG code. For example, when __free_slub() iterates over objects in a cache, it passes untagged pointers to check_object(). check_object() in turns calls get_freepointer() with an untagged pointer, which causes the freepointer to be restored incorrectly. Add kasan_reset_tag to freelist_ptr(). Also add a detailed comment. Link: http://lkml.kernel.org/r/bf858f26ef32eb7bd24c665755b3aee4bc58d0e4.1550103861.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Qian Cai <cai@lca.pw> Tested-by: Qian Cai <cai@lca.pw> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Ie57f08f676ea7a244a20f1dee98fc725594cafa6 |
||
Andrey Konovalov
|
5dbe32c94e |
UPSTREAM: kasan, slub: fix conflicts with CONFIG_SLAB_FREELIST_HARDENED
(Upstream commit 18e506610238eda2b0c5a19a123d3d6ec0ab2de6). CONFIG_SLAB_FREELIST_HARDENED hashes freelist pointer with the address of the object where the pointer gets stored. With tag based KASAN we don't account for that when building freelist, as we call set_freepointer() with the first argument untagged. This patch changes the code to properly propagate tags throughout the loop. Link: http://lkml.kernel.org/r/3df171559c52201376f246bf7ce3184fe21c1dc7.1549921721.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Qian Cai <cai@lca.pw> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Evgeniy Stepanov <eugenis@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I218d23e0fe3164f6f03d4f08bb93630f3e496f82 |
||
Andrey Konovalov
|
d79e51f2a9 |
UPSTREAM: kasan, slub: move kasan_poison_slab hook before page_address
(Upstream commit a71012242837fe5e67d8c999cfc357174ed5dba0). With tag based KASAN page_address() looks at the page flags to see whether the resulting pointer needs to have a tag set. Since we don't want to set a tag when page_address() is called on SLAB pages, we call page_kasan_tag_reset() in kasan_poison_slab(). However in allocate_slab() page_address() is called before kasan_poison_slab(). Fix it by changing the order. [andreyknvl@google.com: fix compilation error when CONFIG_SLUB_DEBUG=n] Link: http://lkml.kernel.org/r/ac27cc0bbaeb414ed77bcd6671a877cf3546d56e.1550066133.git.andreyknvl@google.com Link: http://lkml.kernel.org/r/cd895d627465a3f1c712647072d17f10883be2a1.1549921721.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgeniy Stepanov <eugenis@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Qian Cai <cai@lca.pw> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I53ade71e532b3e4163b84a3a23c7c54ab2d059bd |
||
Andrey Konovalov
|
11543367c3 |
UPSTREAM: kasan, kmemleak: pass tagged pointers to kmemleak
(Upstream commit 53128245b43daad600d9fe72940206570e064112). Right now we call kmemleak hooks before assigning tags to pointers in KASAN hooks. As a result, when an objects gets allocated, kmemleak sees a differently tagged pointer, compared to the one it sees when the object gets freed. Fix it by calling KASAN hooks before kmemleak's ones. Link: http://lkml.kernel.org/r/cd825aa4897b0fc37d3316838993881daccbe9f5.1549921721.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Qian Cai <cai@lca.pw> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgeniy Stepanov <eugenis@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I7766731cf2fe22effd91bfb89da915355d0e7981 |
||
Andrey Konovalov
|
ceface4ee3 |
UPSTREAM: kasan: fix assigning tags twice
(Upstream commit e1db95befb3e9e3476629afec6e0f5d0707b9825). When an object is kmalloc()'ed, two hooks are called: kasan_slab_alloc() and kasan_kmalloc(). Right now we assign a tag twice, once in each of the hooks. Fix it by assigning a tag only in the former hook. Link: http://lkml.kernel.org/r/ce8c6431da735aa7ec051fd6497153df690eb021.1549921721.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgeniy Stepanov <eugenis@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Qian Cai <cai@lca.pw> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I70f85dda8fcbcc796ebde8294a6f110d390bff3f |
||
Anders Roxell
|
b4d4745fe4 |
UPSTREAM: kasan: mark file common so ftrace doesn't trace it
(Upstream commit 0d0c8de8788b6c441ea01365612de7efc20cc682). When option CONFIG_KASAN is enabled toghether with ftrace, function ftrace_graph_caller() gets in to a recursion, via functions kasan_check_read() and kasan_check_write(). Breakpoint 2, ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179 179 mcount_get_pc x0 // function's pc (gdb) bt #0 ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179 #1 0xffffff90101406c8 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:151 #2 0xffffff90106fd084 in kasan_check_write (p=0xffffffc06c170878, size=4) at ../mm/kasan/common.c:105 #3 0xffffff90104a2464 in atomic_add_return (v=<optimized out>, i=<optimized out>) at ./include/generated/atomic-instrumented.h:71 #4 atomic_inc_return (v=<optimized out>) at ./include/generated/atomic-fallback.h:284 #5 trace_graph_entry (trace=0xffffffc03f5ff380) at ../kernel/trace/trace_functions_graph.c:441 #6 0xffffff9010481774 in trace_graph_entry_watchdog (trace=<optimized out>) at ../kernel/trace/trace_selftest.c:741 #7 0xffffff90104a185c in function_graph_enter (ret=<optimized out>, func=<optimized out>, frame_pointer=18446743799894897728, retp=<optimized out>) at ../kernel/trace/trace_functions_graph.c:196 #8 0xffffff9010140628 in prepare_ftrace_return (self_addr=18446743592948977792, parent=0xffffffc03f5ff418, frame_pointer=18446743799894897728) at ../arch/arm64/kernel/ftrace.c:231 #9 0xffffff90101406f4 in ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:182 Backtrace stopped: previous frame identical to this frame (corrupt stack?) (gdb) Rework so that the kasan implementation isn't traced. Link: http://lkml.kernel.org/r/20181212183447.15890-1-anders.roxell@linaro.org Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Acked-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Id4dc25b795f81b4193c5f861657e9091acd99cef |
||
Andrey Konovalov
|
0f007a2bf6 |
UPSTREAM: kasan: fix krealloc handling for tag-based mode
(Upstream commit a3fe7cdf02e318870fb71218726cc2321ff41f30). Right now tag-based KASAN can retag the memory that is reallocated via krealloc and return a differently tagged pointer even if the same slab object gets used and no reallocated technically happens. There are a few issues with this approach. One is that krealloc callers can't rely on comparing the return value with the passed argument to check whether reallocation happened. Another is that if a caller knows that no reallocation happened, that it can access object memory through the old pointer, which leads to false positives. Look at nf_ct_ext_add() to see an example. Fix this by keeping the same tag if the memory don't actually gets reallocated during krealloc. Link: http://lkml.kernel.org/r/bb2a71d17ed072bcc528cbee46fcbd71a6da3be4.1546540962.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Ie9bd73c43fa76ac2ba946bfbd2cb88e5c0000dfb |
||
Andrey Konovalov
|
d8c656e717 |
UPSTREAM: kasan: make tag based mode work with CONFIG_HARDENED_USERCOPY
(Upstream commit 96fedce27e1356a2fff1c270710d9405848db562). With CONFIG_HARDENED_USERCOPY enabled __check_heap_object() compares and then subtracts a potentially tagged pointer with a non-tagged address of the page that this pointer belongs to, which leads to unexpected behavior. Untag the pointer in __check_heap_object() before doing any of these operations. Link: http://lkml.kernel.org/r/7e756a298d514c4482f52aea6151db34818d395d.1546540962.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Id1a000155fe12fc39b3caee7a097f9bea0ba3fce |
||
Andrey Konovalov
|
c84291fd95 |
UPSTREAM: kasan, arm64: use ARCH_SLAB_MINALIGN instead of manual aligning
(Upstream commit eb214f2dda31ffa989033b1e0f848ba0d3cb6188). Instead of changing cache->align to be aligned to KASAN_SHADOW_SCALE_SIZE in kasan_cache_create() we can reuse the ARCH_SLAB_MINALIGN macro. Link: http://lkml.kernel.org/r/52ddd881916bcc153a9924c154daacde78522227.1546540962.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Suggested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Ia1c8de45c621b68555f77a569431e232ff4dc781 |
||
Qian Cai
|
88299f649b |
BACKPORT: mm/memblock.c: skip kmemleak for kasan_init()
(Upstream commit fed84c78527009d4f799a3ed9a566502fa026d82). Kmemleak does not play well with KASAN (tested on both HPE Apollo 70 and Huawei TaiShan 2280 aarch64 servers). After calling start_kernel()->setup_arch()->kasan_init(), kmemleak early log buffer went from something like 280 to 260000 which caused kmemleak disabled and crash dump memory reservation failed. The multitude of kmemleak_alloc() calls is from nested loops while KASAN is setting up full memory mappings, so let early kmemleak allocations skip those memblock_alloc_internal() calls came from kasan_init() given that those early KASAN memory mappings should not reference to other memory. Hence, no kmemleak false positives. kasan_init kasan_map_populate [1] kasan_pgd_populate [2] kasan_pud_populate [3] kasan_pmd_populate [4] kasan_pte_populate [5] kasan_alloc_zeroed_page memblock_alloc_try_nid memblock_alloc_internal kmemleak_alloc [1] for_each_memblock(memory, reg) [2] while (pgdp++, addr = next, addr != end) [3] while (pudp++, addr = next, addr != end && pud_none(READ_ONCE(*pudp))) [4] while (pmdp++, addr = next, addr != end && pmd_none(READ_ONCE(*pmdp))) [5] while (ptep++, addr = next, addr != end && pte_none(READ_ONCE(*ptep))) Link: http://lkml.kernel.org/r/1543442925-17794-1-git-send-email-cai@gmx.us Signed-off-by: Qian Cai <cai@gmx.us> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: I2423d511c3938c882c738341673b13b3beff5475 Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 |
||
Andrey Konovalov
|
f7de9038cd |
UPSTREAM: kasan: add SPDX-License-Identifier mark to source files
(Upstream commit e886bf9d9abedf8236464bfd21bc5707748b4a02). This patch adds a "SPDX-License-Identifier: GPL-2.0" mark to all source files under mm/kasan. Link: http://lkml.kernel.org/r/bce2d1e618afa5142e81961ab8fa4b4165337380.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Ia5cdf04c5183e6c2cb75d98d60e6f4f5d961cd6e |
||
Andrey Konovalov
|
c80e94ddfc |
UPSTREAM: kasan: add __must_check annotations to kasan hooks
(Upstream commit 66afc7f1e07a1db74453be9167ac0d1205653854). This patch adds __must_check annotations to kasan hooks that return a pointer to make sure that a tagged pointer always gets propagated. Link: http://lkml.kernel.org/r/03b269c5e453945f724bfca3159d4e1333a8fb1c.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I758f33506b67ed5d4b7539745db311ab31c37f46 |
||
Andrey Konovalov
|
be21fa3044 |
UPSTREAM: kasan, mm, arm64: tag non slab memory allocated via pagealloc
(Upstream commit 2813b9c0296259fb11e75c839bab2d958ba4f96c). Tag-based KASAN doesn't check memory accesses through pointers tagged with 0xff. When page_address is used to get pointer to memory that corresponds to some page, the tag of the resulting pointer gets set to 0xff, even though the allocated memory might have been tagged differently. For slab pages it's impossible to recover the correct tag to return from page_address, since the page might contain multiple slab objects tagged with different values, and we can't know in advance which one of them is going to get accessed. For non slab pages however, we can recover the tag in page_address, since the whole page was marked with the same tag. This patch adds tagging to non slab memory allocated with pagealloc. To set the tag of the pointer returned from page_address, the tag gets stored to page->flags when the memory gets allocated. Link: http://lkml.kernel.org/r/d758ddcef46a5abc9970182b9137e2fbee202a2c.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I500bdf42462fee0ee14495a6be51815e7e44460f |
||
Andrey Konovalov
|
854b1b3576 |
UPSTREAM: kasan: add hooks implementation for tag-based mode
(Upstream commit 7f94ffbc4c6a1bdb51d39965e4f2acaa19bd798f). This commit adds tag-based KASAN specific hooks implementation and adjusts common generic and tag-based KASAN ones. 1. When a new slab cache is created, tag-based KASAN rounds up the size of the objects in this cache to KASAN_SHADOW_SCALE_SIZE (== 16). 2. On each kmalloc tag-based KASAN generates a random tag, sets the shadow memory, that corresponds to this object to this tag, and embeds this tag value into the top byte of the returned pointer. 3. On each kfree tag-based KASAN poisons the shadow memory with a random tag to allow detection of use-after-free bugs. The rest of the logic of the hook implementation is very much similar to the one provided by generic KASAN. Tag-based KASAN saves allocation and free stack metadata to the slab object the same way generic KASAN does. Link: http://lkml.kernel.org/r/bda78069e3b8422039794050ddcb2d53d053ed41.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I23c40472207cba40dc962bb182225e378322249e |
||
Andrey Konovalov
|
cf9a2eedfd |
UPSTREAM: mm: move obj_to_index to include/linux/slab_def.h
(Upstream commit 5b7c4148222d7acaa1612e5eec84fc66c88d54f3). While with SLUB we can actually preassign tags for caches with contructors and store them in pointers in the freelist, SLAB doesn't allow that since the freelist is stored as an array of indexes, so there are no pointers to store the tags. Instead we compute the tag twice, once when a slab is created before calling the constructor and then again each time when an object is allocated with kmalloc. Tag is computed simply by taking the lowest byte of the index that corresponds to the object. However in kasan_kmalloc we only have access to the objects pointer, so we need a way to find out which index this object corresponds to. This patch moves obj_to_index from slab.c to include/linux/slab_def.h to be reused by KASAN. Link: http://lkml.kernel.org/r/c02cd9e574cfd93858e43ac94b05e38f891fef64.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I6cb3332ea05e6152ab8de0490171ed5d4947def3 |
||
Andrey Konovalov
|
f8adce49ea |
UPSTREAM: kasan: add bug reporting routines for tag-based mode
(Upstream commit 121e8f81d38cc43834195722d0768340dc130a33). This commit adds rountines, that print tag-based KASAN error reports. Those are quite similar to generic KASAN, the difference is: 1. The way tag-based KASAN finds the first bad shadow cell (with a mismatching tag). Tag-based KASAN compares memory tags from the shadow memory to the pointer tag. 2. Tag-based KASAN reports all bugs with the "KASAN: invalid-access" header. Also simplify generic KASAN find_first_bad_addr. Link: http://lkml.kernel.org/r/aee6897b1bd077732a315fd84c6b4f234dbfdfcb.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I20ef95c8c568df73adf69b140dbbf4cd9da52f34 |
||
Andrey Konovalov
|
1ff3965086 |
UPSTREAM: kasan: split out generic_report.c from report.c
(Upstream commit 11cd3cd69a256a353dd1a249b48ccd727d945952). Move generic KASAN specific error reporting routines to generic_report.c without any functional changes, leaving common error reporting code in report.c to be later reused by tag-based KASAN. Link: http://lkml.kernel.org/r/ba48c32f8e5aefedee78998ccff0413bee9e0f5b.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Ic65fb3da92607d345be5bbe2c486269d1db53745 |
||
Andrey Konovalov
|
70b223ca1d |
UPSTREAM: kasan, mm: perform untagged pointers comparison in krealloc
(Upstream commit 772a2fa50ffb2f4282be8436da6e70530a2ac63c). The krealloc function checks where the same buffer was reused or a new one allocated by comparing kernel pointers. Tag-based KASAN changes memory tag on the krealloc'ed chunk of memory and therefore also changes the pointer tag of the returned pointer. Therefore we need to perform comparison on untagged (with tags reset) pointers to check whether it's the same memory region or not. Link: http://lkml.kernel.org/r/14f6190d7846186a3506cd66d82446646fe65090.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I1e64158a5a0d683fc19c76296bc5fa345639bf30 |
||
Andrey Konovalov
|
9900949b99 |
UPSTREAM: kasan: preassign tags to objects with ctors or SLAB_TYPESAFE_BY_RCU
(Upstream commit 4d176711ea7a8d4873e7157ac6ab242ade3ba351). An object constructor can initialize pointers within this objects based on the address of the object. Since the object address might be tagged, we need to assign a tag before calling constructor. The implemented approach is to assign tags to objects with constructors when a slab is allocated and call constructors once as usual. The downside is that such object would always have the same tag when it is reallocated, so we won't catch use-after-frees on it. Also pressign tags for objects from SLAB_TYPESAFE_BY_RCU caches, since they can be validy accessed after having been freed. Link: http://lkml.kernel.org/r/f158a8a74a031d66f0a9398a5b0ed453c37ba09a.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I48b1de0516b9998f3b3e3917a15dd0bde27897cf |
||
Andrey Konovalov
|
f8ce5f5926 |
UPSTREAM: kasan: add tag related helper functions
(Upstream commit 3c9e3aa11094e821aff4a8f6812a6e032293dbc0). This commit adds a few helper functions, that are meant to be used to work with tags embedded in the top byte of kernel pointers: to set, to get or to reset the top byte. Link: http://lkml.kernel.org/r/f6c6437bb8e143bc44f42c3c259c62e734be7935.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I6a46eb172205fe869d126ae296f64d44e007ac88 |
||
Andrey Konovalov
|
fe1cf9c054 |
BACKPORT: kasan: initialize shadow to 0xff for tag-based mode
Use kasan_alloc_zeroed_page instead of defining kasan_alloc_raw_page. (Upstream commit 080eb83f54cf5b96ae5b6ce3c1896e35c341aff9). A tag-based KASAN shadow memory cell contains a memory tag, that corresponds to the tag in the top byte of the pointer, that points to that memory. The native top byte value of kernel pointers is 0xff, so with tag-based KASAN we need to initialize shadow memory to 0xff. [cai@lca.pw: arm64: skip kmemleak for KASAN again\ Link: http://lkml.kernel.org/r/20181226020550.63712-1-cai@lca.pw Link: http://lkml.kernel.org/r/5cc1b789aad7c99cf4f3ec5b328b147ad53edb40.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I370390e529de29fee99820f3ae2fc3d0408379ef |
||
Andrey Konovalov
|
02e8a3fcb0 |
BACKPORT: kasan: rename kasan_zero_page to kasan_early_shadow_page
(Upstream commit 9577dd7486487722ed8f0773243223f108e8089f). With tag based KASAN mode the early shadow value is 0xff and not 0x00, so this patch renames kasan_zero_(page|pte|pmd|pud|p4d) to kasan_early_shadow_(page|pte|pmd|pud|p4d) to avoid confusion. Link: http://lkml.kernel.org/r/3fed313280ebf4f88645f5b89ccbc066d320e177.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 128674696 Change-Id: I22d043b1f2ce489b12832f6f1ba1593c4ccdaedf Signed-off-by: Andrey Konovalov <andreyknvl@google.com> |
||
Andrey Konovalov
|
8712cc728d |
BACKPORT: kasan: add CONFIG_KASAN_GENERIC and CONFIG_KASAN_SW_TAGS
The conflict during backport is caused by the include/linux/compiler_attributes.h file not being present. (Upstream commit 2bd926b439b4cb6b9ed240a9781cd01958b53d85). This commit splits the current CONFIG_KASAN config option into two: 1. CONFIG_KASAN_GENERIC, that enables the generic KASAN mode (the one that exists now); 2. CONFIG_KASAN_SW_TAGS, that enables the software tag-based KASAN mode. The name CONFIG_KASAN_SW_TAGS is chosen as in the future we will have another hardware tag-based KASAN mode, that will rely on hardware memory tagging support in arm64. With CONFIG_KASAN_SW_TAGS enabled, compiler options are changed to instrument kernel files with -fsantize=kernel-hwaddress (except the ones for which KASAN_SANITIZE := n is set). Both CONFIG_KASAN_GENERIC and CONFIG_KASAN_SW_TAGS support both CONFIG_KASAN_INLINE and CONFIG_KASAN_OUTLINE instrumentation modes. This commit also adds empty placeholder (for now) implementation of tag-based KASAN specific hooks inserted by the compiler and adjusts common hooks implementation. While this commit adds the CONFIG_KASAN_SW_TAGS config option, this option is not selectable, as it depends on HAVE_ARCH_KASAN_SW_TAGS, which we will enable once all the infrastracture code has been added. Link: http://lkml.kernel.org/r/b2550106eb8a68b10fefbabce820910b115aa853.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: Id95c0c0b6857c6b30f2bea4597aea6c90273ef89 Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 |
||
Andrey Konovalov
|
3915eb8bdf |
UPSTREAM: kasan: rename source files to reflect the new naming scheme
(Upstream commit b938fcf42739de8270e6ea41593722929c8a7dd0). We now have two KASAN modes: generic KASAN and tag-based KASAN. Rename kasan.c to generic.c to reflect that. Also rename kasan_init.c to init.c as it contains initialization code for both KASAN modes. Link: http://lkml.kernel.org/r/88c6fd2a883e459e6242030497230e5fb0d44d44.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I375981e1e9013517f073b29e27e750c48ba24879 |
||
Andrey Konovalov
|
17045587b4 |
UPSTREAM: kasan: move common generic and tag-based code to common.c
(Upstream commit bffa986c6f80e39d9903015fc7d0d99a66bbf559). Tag-based KASAN reuses a significant part of the generic KASAN code, so move the common parts to common.c without any functional changes. Link: http://lkml.kernel.org/r/114064d002356e03bb8cc91f7835e20dc61b51d9.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Iffe8a07fa53751e4232c8458aab0a8adda951f55 |
||
Andrey Konovalov
|
9938c593b5 |
UPSTREAM: kasan, slub: handle pointer tags in early_kmem_cache_node_alloc
(Upstream commit 12b22386998ccf97497a49c88f9579cf9c0dee55). The previous patch updated KASAN hooks signatures and their usage in SLAB and SLUB code, except for the early_kmem_cache_node_alloc function. This patch handles that function separately, as it requires to reorder some of the initialization code to correctly propagate a tagged pointer in case a tag is assigned by kasan_kmalloc. Link: http://lkml.kernel.org/r/fc8d0fdcf733a7a52e8d0daaa650f4736a57de8c.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: Ifcfcd2a679ebf9444a1c6525ad55b3a65a4a5941 |
||
Andrey Konovalov
|
0697e14f31 |
UPSTREAM: kasan, mm: change hooks signatures
(Upstream commit 0116523cfffa62aeb5aa3b85ce7419f3dae0c1b8). Patch series "kasan: add software tag-based mode for arm64", v13. This patchset adds a new software tag-based mode to KASAN [1]. (Initially this mode was called KHWASAN, but it got renamed, see the naming rationale at the end of this section). The plan is to implement HWASan [2] for the kernel with the incentive, that it's going to have comparable to KASAN performance, but in the same time consume much less memory, trading that off for somewhat imprecise bug detection and being supported only for arm64. The underlying ideas of the approach used by software tag-based KASAN are: 1. By using the Top Byte Ignore (TBI) arm64 CPU feature, we can store pointer tags in the top byte of each kernel pointer. 2. Using shadow memory, we can store memory tags for each chunk of kernel memory. 3. On each memory allocation, we can generate a random tag, embed it into the returned pointer and set the memory tags that correspond to this chunk of memory to the same value. 4. By using compiler instrumentation, before each memory access we can add a check that the pointer tag matches the tag of the memory that is being accessed. 5. On a tag mismatch we report an error. With this patchset the existing KASAN mode gets renamed to generic KASAN, with the word "generic" meaning that the implementation can be supported by any architecture as it is purely software. The new mode this patchset adds is called software tag-based KASAN. The word "tag-based" refers to the fact that this mode uses tags embedded into the top byte of kernel pointers and the TBI arm64 CPU feature that allows to dereference such pointers. The word "software" here means that shadow memory manipulation and tag checking on pointer dereference is done in software. As it is the only tag-based implementation right now, "software tag-based" KASAN is sometimes referred to as simply "tag-based" in this patchset. A potential expansion of this mode is a hardware tag-based mode, which would use hardware memory tagging support (announced by Arm [3]) instead of compiler instrumentation and manual shadow memory manipulation. Same as generic KASAN, software tag-based KASAN is strictly a debugging feature. [1] https://www.kernel.org/doc/html/latest/dev-tools/kasan.html [2] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html [3] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architecture-2018-developments-armv85a ====== Rationale On mobile devices generic KASAN's memory usage is significant problem. One of the main reasons to have tag-based KASAN is to be able to perform a similar set of checks as the generic one does, but with lower memory requirements. Comment from Vishwath Mohan <vishwath@google.com>: I don't have data on-hand, but anecdotally both ASAN and KASAN have proven problematic to enable for environments that don't tolerate the increased memory pressure well. This includes (a) Low-memory form factors - Wear, TV, Things, lower-tier phones like Go, (c) Connected components like Pixel's visual core [1]. These are both places I'd love to have a low(er) memory footprint option at my disposal. Comment from Evgenii Stepanov <eugenis@google.com>: Looking at a live Android device under load, slab (according to /proc/meminfo) + kernel stack take 8-10% available RAM (~350MB). KASAN's overhead of 2x - 3x on top of it is not insignificant. Not having this overhead enables near-production use - ex. running KASAN/KHWASAN kernel on a personal, daily-use device to catch bugs that do not reproduce in test configuration. These are the ones that often cost the most engineering time to track down. CPU overhead is bad, but generally tolerable. RAM is critical, in our experience. Once it gets low enough, OOM-killer makes your life miserable. [1] https://www.blog.google/products/pixel/pixel-visual-core-image-processing-and-machine-learning-pixel-2/ ====== Technical details Software tag-based KASAN mode is implemented in a very similar way to the generic one. This patchset essentially does the following: 1. TCR_TBI1 is set to enable Top Byte Ignore. 2. Shadow memory is used (with a different scale, 1:16, so each shadow byte corresponds to 16 bytes of kernel memory) to store memory tags. 3. All slab objects are aligned to shadow scale, which is 16 bytes. 4. All pointers returned from the slab allocator are tagged with a random tag and the corresponding shadow memory is poisoned with the same value. 5. Compiler instrumentation is used to insert tag checks. Either by calling callbacks or by inlining them (CONFIG_KASAN_OUTLINE and CONFIG_KASAN_INLINE flags are reused). 6. When a tag mismatch is detected in callback instrumentation mode KASAN simply prints a bug report. In case of inline instrumentation, clang inserts a brk instruction, and KASAN has it's own brk handler, which reports the bug. 7. The memory in between slab objects is marked with a reserved tag, and acts as a redzone. 8. When a slab object is freed it's marked with a reserved tag. Bug detection is imprecise for two reasons: 1. We won't catch some small out-of-bounds accesses, that fall into the same shadow cell, as the last byte of a slab object. 2. We only have 1 byte to store tags, which means we have a 1/256 probability of a tag match for an incorrect access (actually even slightly less due to reserved tag values). Despite that there's a particular type of bugs that tag-based KASAN can detect compared to generic KASAN: use-after-free after the object has been allocated by someone else. ====== Testing Some kernel developers voiced a concern that changing the top byte of kernel pointers may lead to subtle bugs that are difficult to discover. To address this concern deliberate testing has been performed. It doesn't seem feasible to do some kind of static checking to find potential issues with pointer tagging, so a dynamic approach was taken. All pointer comparisons/subtractions have been instrumented in an LLVM compiler pass and a kernel module that would print a bug report whenever two pointers with different tags are being compared/subtracted (ignoring comparisons with NULL pointers and with pointers obtained by casting an error code to a pointer type) has been used. Then the kernel has been booted in QEMU and on an Odroid C2 board and syzkaller has been run. This yielded the following results. The two places that look interesting are: is_vmalloc_addr in include/linux/mm.h is_kernel_rodata in mm/util.c Here we compare a pointer with some fixed untagged values to make sure that the pointer lies in a particular part of the kernel address space. Since tag-based KASAN doesn't add tags to pointers that belong to rodata or vmalloc regions, this should work as is. To make sure debug checks to those two functions that check that the result doesn't change whether we operate on pointers with or without untagging has been added. A few other cases that don't look that interesting: Comparing pointers to achieve unique sorting order of pointee objects (e.g. sorting locks addresses before performing a double lock): tty_ldisc_lock_pair_timeout in drivers/tty/tty_ldisc.c pipe_double_lock in fs/pipe.c unix_state_double_lock in net/unix/af_unix.c lock_two_nondirectories in fs/inode.c mutex_lock_double in kernel/events/core.c ep_cmp_ffd in fs/eventpoll.c fsnotify_compare_groups fs/notify/mark.c Nothing needs to be done here, since the tags embedded into pointers don't change, so the sorting order would still be unique. Checks that a pointer belongs to some particular allocation: is_sibling_entry in lib/radix-tree.c object_is_on_stack in include/linux/sched/task_stack.h Nothing needs to be done here either, since two pointers can only belong to the same allocation if they have the same tag. Overall, since the kernel boots and works, there are no critical bugs. As for the rest, the traditional kernel testing way (use until fails) is the only one that looks feasible. Another point here is that tag-based KASAN is available under a separate config option that needs to be deliberately enabled. Even though it might be used in a "near-production" environment to find bugs that are not found during fuzzing or running tests, it is still a debug tool. ====== Benchmarks The following numbers were collected on Odroid C2 board. Both generic and tag-based KASAN were used in inline instrumentation mode. Boot time [1]: * ~1.7 sec for clean kernel * ~5.0 sec for generic KASAN * ~5.0 sec for tag-based KASAN Network performance [2]: * 8.33 Gbits/sec for clean kernel * 3.17 Gbits/sec for generic KASAN * 2.85 Gbits/sec for tag-based KASAN Slab memory usage after boot [3]: * ~40 kb for clean kernel * ~105 kb (~260% overhead) for generic KASAN * ~47 kb (~20% overhead) for tag-based KASAN KASAN memory overhead consists of three main parts: 1. Increased slab memory usage due to redzones. 2. Shadow memory (the whole reserved once during boot). 3. Quaratine (grows gradually until some preset limit; the more the limit, the more the chance to detect a use-after-free). Comparing tag-based vs generic KASAN for each of these points: 1. 20% vs 260% overhead. 2. 1/16th vs 1/8th of physical memory. 3. Tag-based KASAN doesn't require quarantine. [1] Time before the ext4 driver is initialized. [2] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`. [3] Measured as `cat /proc/meminfo | grep Slab`. ====== Some notes A few notes: 1. The patchset can be found here: https://github.com/xairy/kasan-prototype/tree/khwasan 2. Building requires a recent Clang version (7.0.0 or later). 3. Stack instrumentation is not supported yet and will be added later. This patch (of 25): Tag-based KASAN changes the value of the top byte of pointers returned from the kernel allocation functions (such as kmalloc). This patch updates KASAN hooks signatures and their usage in SLAB and SLUB code to reflect that. Link: http://lkml.kernel.org/r/aec2b5e3973781ff8a6bb6760f8543643202c451.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I62e554e732ec79ffd195e2269c8a50aed14381c0 |
||
Clark Williams
|
52023a31dc |
UPSTREAM: mm/kasan/quarantine.c: make quarantine_lock a raw_spinlock_t
(Upstream commit 026d1eaf5ef1a5d6258b46e4e411cd9f5ab8c41d). The static lock quarantine_lock is used in quarantine.c to protect the quarantine queue datastructures. It is taken inside quarantine queue manipulation routines (quarantine_put(), quarantine_reduce() and quarantine_remove_cache()), with IRQs disabled. This is not a problem on a stock kernel but is problematic on an RT kernel where spin locks are sleeping spinlocks, which can sleep and can not be acquired with disabled interrupts. Convert the quarantine_lock to a raw spinlock_t. The usage of quarantine_lock is confined to quarantine.c and the work performed while the lock is held is used for debug purpose. [bigeasy@linutronix.de: slightly altered the commit message] Link: http://lkml.kernel.org/r/20181010214945.5owshc3mlrh74z4b@linutronix.de Signed-off-by: Clark Williams <williams@redhat.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 128674696 Change-Id: I12f35246b81b23cad5ce8b90407c00b86bf90cc0 |
||
Greg Kroah-Hartman
|
8ca5759502 |
This is the 4.19.73 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1/KiEACgkQONu9yGCS aT49JBAAy7b3wv1WXAtg9wsyS1JL4HbMXt3YjtokIX+UpkznoqII4B85QftPBbiD 9zDuTWPjhrqKv1GsMkFRCqBVp5wGVik1MIbjVuKdstFN5W8KQybpbYnSW4T52+wS cs6oOPkLydAfWzKeq+ekEeU8yr5dua+Ui3huundZ49wseJWQP3fh9T+ToUx8V/cr tsLiRRgI0djj7KQWVuM1j8YGKT/6qk/UL0HMVZyoIdLmsxpLap+LWe0+CRXn8rvs eJJlVQTVtYf/ySoHkpnwR12VsjRYjx6pNkm/GrebMCkM7wF/4RMqxk7j9EU0PENH VUdRrUd+j/YPp6QzjSFMK0+0eb7Gm3X0FEN0IGZshu1r/CDnoj/7hqnBmOlYIbhv pdteYaLqWq7JjAHu7vF+S4aNQRGpAZb05LsbTJ39Eu3FbdVTLXsAuUveZ7Y4/y0X ri2M3d/sF/cjc3C+V7Y7h422SM36jSAK6496VAoRyqqjX/3JyROhgfU9NAMzVr83 4uI904z9lH4TZGOd5YQgX2VuOtBcGwa7+g6fy97u1tp8UxSWFZRGDDLRysF/dIJO Wi51UK0Q7EWnqBTe0TFF6TjE5tC7R3ZgzqEQ1MU4eLI5mqokg82DAK4Ub2Wk5Qch CGs5/d16OOrLtG2RoaOGz9UdQR7IHUXLSqkKbaEdstc16MXNXns= =cmGh -----END PGP SIGNATURE----- Merge 4.19.73 into android-4.19 Changes in 4.19.73 ALSA: hda - Fix potential endless loop at applying quirks ALSA: hda/realtek - Fix overridden device-specific initialization ALSA: hda/realtek - Add quirk for HP Pavilion 15 ALSA: hda/realtek - Enable internal speaker & headset mic of ASUS UX431FL ALSA: hda/realtek - Fix the problem of two front mics on a ThinkCentre sched/fair: Don't assign runtime for throttled cfs_rq drm/vmwgfx: Fix double free in vmw_recv_msg() vhost/test: fix build for vhost test vhost/test: fix build for vhost test - again powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction batman-adv: fix uninit-value in batadv_netlink_get_ifindex() batman-adv: Only read OGM tvlv_len after buffer len check hv_sock: Fix hang when a connection is closed Blk-iolatency: warn on negative inflight IO counter blk-iolatency: fix STS_AGAIN handling {nl,mac}80211: fix interface combinations on crypto controlled devices timekeeping: Use proper ktime_add when adding nsecs in coarse offset selftests: fib_rule_tests: use pre-defined DEV_ADDR x86/ftrace: Fix warning and considate ftrace_jmp_replace() and ftrace_call_replace() powerpc/64: mark start_here_multiplatform as __ref media: stm32-dcmi: fix irq = 0 case arm64: dts: rockchip: enable usb-host regulators at boot on rk3328-rock64 scripts/decode_stacktrace: match basepath using shell prefix operator, not regex riscv: remove unused variable in ftrace nvme-fc: use separate work queue to avoid warning clk: s2mps11: Add used attribute to s2mps11_dt_match remoteproc: qcom: q6v5: shore up resource probe handling modules: always page-align module section allocations kernel/module: Fix mem leak in module_add_modinfo_attrs drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse" media: cec/v4l2: move V4L2 specific CEC functions to V4L2 media: cec: remove cec-edid.c scsi: qla2xxx: Move log messages before issuing command to firmware keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h Drivers: hv: kvp: Fix two "this statement may fall through" warnings x86, hibernate: Fix nosave_regions setup for hibernation remoteproc: qcom: q6v5-mss: add SCM probe dependency drm/amdgpu/gfx9: Update gfx9 golden settings. drm/amdgpu: Update gc_9_0 golden settings. KVM: x86: hyperv: enforce vp_index < KVM_MAX_VCPUS KVM: x86: hyperv: consistently use 'hv_vcpu' for 'struct kvm_vcpu_hv' variables KVM: x86: hyperv: keep track of mismatched VP indexes KVM: hyperv: define VP assist page helpers x86/kvm/lapic: preserve gfn_to_hva_cache len on cache reinit drm/i915: Fix intel_dp_mst_best_encoder() drm/i915: Rename PLANE_CTL_DECOMPRESSION_ENABLE drm/i915/gen9+: Fix initial readout for Y tiled framebuffers drm/atomic_helper: Disallow new modesets on unregistered connectors Drivers: hv: kvp: Fix the indentation of some "break" statements Drivers: hv: kvp: Fix the recent regression caused by incorrect clean-up powerplay: Respect units on max dcfclk watermark drm/amd/pp: Fix truncated clock value when set watermark drm/amd/dm: Understand why attaching path/tile properties are needed ARM: davinci: da8xx: define gpio interrupts as separate resources ARM: davinci: dm365: define gpio interrupts as separate resources ARM: davinci: dm646x: define gpio interrupts as separate resources ARM: davinci: dm355: define gpio interrupts as separate resources ARM: davinci: dm644x: define gpio interrupts as separate resources s390/zcrypt: reinit ap queue state machine during device probe media: vim2m: use workqueue media: vim2m: use cancel_delayed_work_sync instead of flush_schedule_work drm/i915: Restore sane defaults for KMS on GEM error load drm/i915: Cleanup gt powerstate from gem KVM: PPC: Book3S HV: Fix race between kvm_unmap_hva_range and MMU mode switch Btrfs: clean up scrub is_dev_replace parameter Btrfs: fix deadlock with memory reclaim during scrub btrfs: Remove extent_io_ops::fill_delalloc btrfs: Fix error handling in btrfs_cleanup_ordered_extents scsi: megaraid_sas: Fix combined reply queue mode detection scsi: megaraid_sas: Add check for reset adapter bit scsi: megaraid_sas: Use 63-bit DMA addressing powerpc/pkeys: Fix handling of pkey state across fork() btrfs: volumes: Make sure no dev extent is beyond device boundary btrfs: Use real device structure to verify dev extent media: vim2m: only cancel work if it is for right context ARC: show_regs: lockdep: re-enable preemption ARC: mm: do_page_fault fixes #1: relinquish mmap_sem if signal arrives while handle_mm_fault IB/uverbs: Fix OOPs upon device disassociation crypto: ccree - fix resume race condition on init crypto: ccree - add missing inline qualifier drm/vblank: Allow dynamic per-crtc max_vblank_count drm/i915/ilk: Fix warning when reading emon_status with no output mfd: Kconfig: Fix I2C_DESIGNWARE_PLATFORM dependencies tpm: Fix some name collisions with drivers/char/tpm.h bcache: replace hard coded number with BUCKET_GC_GEN_MAX bcache: treat stale && dirty keys as bad keys KVM: VMX: Compare only a single byte for VMCS' "launched" in vCPU-run iio: adc: exynos-adc: Add S5PV210 variant dt-bindings: iio: adc: exynos-adc: Add S5PV210 variant iio: adc: exynos-adc: Use proper number of channels for Exynos4x12 mt76: fix corrupted software generated tx CCMP PN drm/nouveau: Don't WARN_ON VCPI allocation failures iwlwifi: fix devices with PCI Device ID 0x34F0 and 11ac RF modules iwlwifi: add new card for 9260 series x86/kvmclock: set offset for kvm unstable clock spi: spi-gpio: fix SPI_CS_HIGH capability powerpc/kvm: Save and restore host AMR/IAMR/UAMOR mmc: renesas_sdhi: Fix card initialization failure in high speed mode btrfs: scrub: pass fs_info to scrub_setup_ctx btrfs: scrub: move scrub_setup_ctx allocation out of device_list_mutex btrfs: scrub: fix circular locking dependency warning btrfs: init csum_list before possible free PCI: qcom: Fix error handling in runtime PM support PCI: qcom: Don't deassert reset GPIO during probe drm: add __user attribute to ptr_to_compat() CIFS: Fix error paths in writeback code CIFS: Fix leaking locked VFS cache pages in writeback retry drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set drm/i915: Sanity check mmap length against object size usb: typec: tcpm: Try PD-2.0 if sink does not respond to 3.0 source-caps arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's IB/mlx5: Reset access mask when looping inside page fault handler kvm: mmu: Fix overflow on kvm mmu page limit calculation x86/kvm: move kvm_load/put_guest_xcr0 into atomic context KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels cifs: Fix lease buffer length error media: i2c: tda1997x: select V4L2_FWNODE ext4: protect journal inode's blocks using block_validity ARM: dts: qcom: ipq4019: fix PCI range ARM: dts: qcom: ipq4019: Fix MSI IRQ type ARM: dts: qcom: ipq4019: enlarge PCIe BAR range dt-bindings: mmc: Add supports-cqe property dt-bindings: mmc: Add disable-cqe-dcmd property. PCI: Add macro for Switchtec quirk declarations PCI: Reset Lenovo ThinkPad P50 nvgpu at boot if necessary dm mpath: fix missing call of path selector type->end_io blk-mq: free hw queue's resource in hctx's release handler mmc: sdhci-pci: Add support for Intel CML PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code cifs: smbd: take an array of reqeusts when sending upper layer data dm crypt: move detailed message into debug level signal/arc: Use force_sig_fault where appropriate ARC: mm: fix uninitialised signal code in do_page_fault ARC: mm: SIGSEGV userspace trying to access kernel virtual memory drm/amdkfd: Add missing Polaris10 ID kvm: Check irqchip mode before assign irqfd drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2) drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc Btrfs: fix race between block group removal and block group allocation cifs: add spinlock for the openFileList to cifsInodeInfo clk: tegra: Fix maximum audio sync clock for Tegra124/210 clk: tegra210: Fix default rates for HDA clocks IB/hfi1: Avoid hardlockup with flushlist_lock apparmor: reset pos on failure to unpack for various functions scsi: target/core: Use the SECTOR_SHIFT constant scsi: target/iblock: Fix overrun in WRITE SAME emulation staging: wilc1000: fix error path cleanup in wilc_wlan_initialize() scsi: zfcp: fix request object use-after-free in send path causing wrong traces cifs: Properly handle auto disabling of serverino option ALSA: hda - Don't resume forcibly i915 HDMI/DP codec ceph: use ceph_evict_inode to cleanup inode's resource KVM: x86: optimize check for valid PAT value KVM: VMX: Always signal #GP on WRMSR to MSR_IA32_CR_PAT with bad value KVM: VMX: Fix handling of #MC that occurs during VM-Entry KVM: VMX: check CPUID before allowing read/write of IA32_XSS KVM: PPC: Use ccr field in pt_regs struct embedded in vcpu struct KVM: PPC: Book3S HV: Fix CR0 setting in TM emulation ARM: dts: gemini: Set DIR-685 SPI CS as active low RDMA/srp: Document srp_parse_in() arguments RDMA/srp: Accept again source addresses that do not have a port number btrfs: correctly validate compression type resource: Include resource end in walk_*() interfaces resource: Fix find_next_iomem_res() iteration issue resource: fix locking in find_next_iomem_res() pstore: Fix double-free in pstore_mkfile() failure path dm thin metadata: check if in fail_io mode when setting needs_check drm/panel: Add support for Armadeus ST0700 Adapt ALSA: hda - Fix intermittent CORB/RIRB stall on Intel chips powerpc/mm: Limit rma_size to 1TB when running without HV mode iommu/iova: Remove stale cached32_node gpio: don't WARN() on NULL descs if gpiolib is disabled i2c: at91: disable TXRDY interrupt after sending data i2c: at91: fix clk_offset for sama5d2 mm/migrate.c: initialize pud_entry in migrate_vma() iio: adc: gyroadc: fix uninitialized return code NFSv4: Fix delegation state recovery bcache: only clear BTREE_NODE_dirty bit when it is set bcache: add comments for mutex_lock(&b->write_lock) bcache: fix race in btree_flush_write() drm/i915: Make sure cdclk is high enough for DP audio on VLV/CHV virtio/s390: fix race on airq_areas[] drm/atomic_helper: Allow DPMS On<->Off changes for unregistered connectors ext4: don't perform block validity checks on the journal inode ext4: fix block validity checks for journal inodes using indirect blocks ext4: unsigned int compared against zero PCI: Reset both NVIDIA GPU and HDA in ThinkPad P50 workaround powerpc/tm: Remove msr_tm_active() powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts vhost: make sure log_num < in_num Linux 4.19.73 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I7bc57825aeb36759bb8e8726888da9af06392c09 |
||
Ralph Campbell
|
2e7e7c8f94 |
mm/migrate.c: initialize pud_entry in migrate_vma()
[ Upstream commit 7b358c6f12dc82364f6d317f8c8f1d794adbc3f5 ]
When CONFIG_MIGRATE_VMA_HELPER is enabled, migrate_vma() calls
migrate_vma_collect() which initializes a struct mm_walk but didn't
initialize mm_walk.pud_entry. (Found by code inspection) Use a C
structure initialization to make sure it is set to NULL.
Link: http://lkml.kernel.org/r/20190719233225.12243-1-rcampbell@nvidia.com
Fixes:
|
||
Greg Kroah-Hartman
|
6b1f307bd0 |
This is the 4.19.70 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1yF0AACgkQONu9yGCS aT64ew/6AzJDRMcmnx1COeRP8tfQ5A8ghjnp6REEca1MYJWjDlqd0X+EMd/7zorZ YkBzb1ND1c/9KeGQzdx8lJ5lcNRVJcD7tT6irT5zMnBcyR9uPamAaVgmVHXxuorK el4nvX9g/MxBLsuqLHxGr5pNXi7mHu6zfXyQ86TJzlez7yHlPuiJb8bDpHnCRJ1P n/MemPq/nQgC5jPBQRhT+IpqC1MTIHNhRaHHg/5Gdrrz+eVumnk+1zbWqtBRuJKS qS7RL1pI0Su00i5bY1r76iSkoRkGw9SoeIgz3sycbtAvGo9TI16/hZPvWAsYOAjL 2DQS0rMPSnM4QV0odbUImFt86f0YJiAL7xYS8EYCc4GX/eLbNRtP8yLq+8rlo4Oa 36HbiNhGSEjRxenfVRUD/STgBYzfVeQOMEyFJNRtfNDP/l66sLk/pEOEd83j06H4 G87BJgKFC35dv5QrbCmJO8P1IXLs5QaChD3dL6R9/hbvCU2A/MOqhPL16JCXWA20 +hOWn8ryrtBa5Dt0avAkrrnUNC8cVWyD44uAm+Hu/49CZpkTasZMB7Z81VSIsuvO xoT1Jx0J1W/LtHwCghSkui/fjQjVpkP3xnB7zVGem73Mpcm68g6mLajKaUrLnJHp /sz2mppF7gZob43anTtUhSV9OvrzqftkR+iFg3rCKSJyQE1o48o= =hMvg -----END PGP SIGNATURE----- Merge 4.19.70 into android-4.19 Changes in 4.19.70 dmaengine: ste_dma40: fix unneeded variable warning nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns afs: Fix the CB.ProbeUuid service handler to reply correctly afs: Fix loop index mixup in afs_deliver_vl_get_entry_by_name_u() fs: afs: Fix a possible null-pointer dereference in afs_put_read() afs: Only update d_fsdata if different in afs_d_revalidate() nvmet-loop: Flush nvme_delete_wq when removing the port nvme: fix a possible deadlock when passthru commands sent to a multipath device nvme-pci: Fix async probe remove race soundwire: cadence_master: fix register definition for SLAVE_STATE soundwire: cadence_master: fix definitions for INTSTAT0/1 auxdisplay: panel: need to delete scan_timer when misc_register fails in panel_attach dmaengine: stm32-mdma: Fix a possible null-pointer dereference in stm32_mdma_irq_handler() omap-dma/omap_vout_vrfb: fix off-by-one fi value iommu/dma: Handle SG length overflow better usb: gadget: composite: Clear "suspended" on reset/disconnect usb: gadget: mass_storage: Fix races between fsg_disable and fsg_set_alt xen/blkback: fix memory leaks arm64: cpufeature: Don't treat granule sizes as strict i2c: rcar: avoid race when unregistering slave client i2c: emev2: avoid race when unregistering slave client drm/ast: Fixed reboot test may cause system hanged usb: host: fotg2: restart hcd after port reset tools: hv: fixed Python pep8/flake8 warnings for lsvmbus tools: hv: fix KVP and VSS daemons exit code drm/i915: fix broadwell EU computation watchdog: bcm2835_wdt: Fix module autoload drm/bridge: tfp410: fix memleak in get_modes() scsi: ufs: Fix RX_TERMINATION_FORCE_ENABLE define value drm/tilcdc: Register cpufreq notifier after we have initialized crtc net/tls: Fixed return value when tls_complete_pending_work() fails net/tls: swap sk_write_space on close net: tls, fix sk_write_space NULL write when tx disabled ipv6/addrconf: allow adding multicast addr if IFA_F_MCAUTOJOIN is set ipv6: Default fib6_type to RTN_UNICAST when not set net/smc: make sure EPOLLOUT is raised tcp: make sure EPOLLOUT wont be missed ipv4/icmp: fix rt dst dev null pointer dereference mm/zsmalloc.c: fix build when CONFIG_COMPACTION=n ALSA: usb-audio: Check mixer unit bitmap yet more strictly ALSA: line6: Fix memory leak at line6_init_pcm() error path ALSA: hda - Fixes inverted Conexant GPIO mic mute led ALSA: seq: Fix potential concurrent access to the deleted pool ALSA: usb-audio: Fix invalid NULL check in snd_emuusb_set_samplerate() ALSA: usb-audio: Add implicit fb quirk for Behringer UFX1604 kvm: x86: skip populating logical dest map if apic is not sw enabled KVM: x86: Don't update RIP or do single-step on faulting emulation uprobes/x86: Fix detection of 32-bit user mode x86/apic: Do not initialize LDR and DFR for bigsmp x86/apic: Include the LDR when clearing out APIC registers ftrace: Fix NULL pointer dereference in t_probe_next() ftrace: Check for successful allocation of hash ftrace: Check for empty hash and comment the race with registering probes usb-storage: Add new JMS567 revision to unusual_devs USB: cdc-wdm: fix race between write and disconnect due to flag abuse usb: hcd: use managed device resources usb: chipidea: udc: don't do hardware access if gadget has stopped usb: host: ohci: fix a race condition between shutdown and irq usb: host: xhci: rcar: Fix typo in compatible string matching USB: storage: ums-realtek: Update module parameter description for auto_delink_en USB: storage: ums-realtek: Whitelist auto-delink support mei: me: add Tiger Lake point LP device ID mmc: sdhci-of-at91: add quirk for broken HS200 mmc: core: Fix init of SD cards reporting an invalid VDD range stm class: Fix a double free of stm_source_device intel_th: pci: Add support for another Lewisburg PCH intel_th: pci: Add Tiger Lake support typec: tcpm: fix a typo in the comparison of pdo_max_voltage fsi: scom: Don't abort operations for minor errors lib: logic_pio: Fix RCU usage lib: logic_pio: Avoid possible overlap for unregistering regions lib: logic_pio: Add logic_pio_unregister_range() drm/amdgpu: Add APTX quirk for Dell Latitude 5495 drm/i915: Don't deballoon unused ggtt drm_mm_node in linux guest drm/i915: Call dma_set_max_seg_size() in i915_driver_hw_probe() bus: hisi_lpc: Unregister logical PIO range to avoid potential use-after-free bus: hisi_lpc: Add .remove method to avoid driver unbind crash VMCI: Release resource if the work is already queued crypto: ccp - Ignore unconfigured CCP device on suspend/resume Revert "cfg80211: fix processing world regdomain when non modular" mac80211: fix possible sta leak mac80211: Don't memset RXCB prior to PAE intercept mac80211: Correctly set noencrypt for PAE frames KVM: PPC: Book3S: Fix incorrect guest-to-user-translation error handling KVM: arm/arm64: vgic: Fix potential deadlock when ap_list is long KVM: arm/arm64: vgic-v2: Handle SGI bits in GICD_I{S,C}PENDR0 as WI NFS: Clean up list moves of struct nfs_page NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend() NFS: Pass error information to the pgio error cleanup routine NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0 i2c: piix4: Fix port selection for AMD Family 16h Model 30h x86/ptrace: fix up botched merge of spectrev1 fix mt76: mt76x0u: do not reset radio on resume Revert "ASoC: Fail card instantiation if DAI format setup fails" Linux 4.19.70 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I35ff8a403a05a8c66d87cb4b542997e63c422288 |
||
Andrew Morton
|
5dd2db1ab0 |
mm/zsmalloc.c: fix build when CONFIG_COMPACTION=n
commit 441e254cd40dc03beec3c650ce6ce6074bc6517f upstream. Fixes: 701d678599d0c1 ("mm/zsmalloc.c: fix race condition in zs_destroy_pool") Link: http://lkml.kernel.org/r/201908251039.5oSbEEUT%25lkp@intel.com Reported-by: kbuild test robot <lkp@intel.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Henry Burns <henrywolfeburns@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Jonathan Adams <jwadams@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Greg Kroah-Hartman
|
12dc90c620 |
This is the 4.19.69 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1ncKwACgkQONu9yGCS aT6FPg//RiiJo8O+CUzkP4MFohy8JUuGC1MnnSfSFJn9bzAljWhYtSoJlZ9PbfHq qx9oWuQNNVZRn9nWuRbTRfRlz6ztc7whsjhAth4eNCtXvu+xAvFLvFhlbVt6xiZ0 Wg3jXDtIY3Y8km01uJdzVk/juUqvTU8nioM4s1OWTFRfOfakLMK9CkxOKfZMFnxP mVILTcOxZAf0Js2tRMRPvm8c6OhegkXZjWUhGMlvmFKk/pqUouVXH8pKbBoTj8zR VoHB6pWs3YG3S15WgkNfKiR9WpeXywC7XN9ilziczaQ8HbsH6Y+5wM9Ncx+3FVxd mzygLCtlckYWjiabS/w3tQHrH+LV8MaPYuW/2tlL9sBljlDW5RBW+g7vaBQjIpCK gco/z0qmeEIYt8ktLL08i9FQBPp00Fra9x3jZKLz8Tp+W//EBm4ENMm2cxHtRKtd 3fG70ngJmycksCK/e8N466/1f/aAOxfBZkog3R/4yqNvOX8rkQGRJlhv0AzIdsy8 RlTDotDwwQFbMWROVs/Jea+9Wwp71jlPOXyMqX2EqUR/WfDWuIZEyzSdbqKPNgHL Q9OoUB7kIKGusqQC6ABYKtDn7T046o6ePEB8r2aIvPmw5AiWMefSubHNV6mP3KJb 0NbQlUelMTvxuwm5NLMY2EWbWi+jvg4OgJvajwcDretTPeBGMZI= =945Y -----END PGP SIGNATURE----- Merge 4.19.69 into android-4.19 Changes in 4.19.69 HID: Add 044f:b320 ThrustMaster, Inc. 2 in 1 DT MIPS: kernel: only use i8253 clocksource with periodic clockevent mips: fix cacheinfo netfilter: ebtables: fix a memory leak bug in compat ASoC: dapm: Fix handling of custom_stop_condition on DAPM graph walks selftests/bpf: fix sendmsg6_prog on s390 bonding: Force slave speed check after link state recovery for 802.3ad net: mvpp2: Don't check for 3 consecutive Idle frames for 10G links selftests: forwarding: gre_multipath: Enable IPv4 forwarding selftests: forwarding: gre_multipath: Fix flower filters can: dev: call netif_carrier_off() in register_candev() can: mcp251x: add error check when wq alloc failed can: gw: Fix error path of cgw_module_init ASoC: Fail card instantiation if DAI format setup fails st21nfca_connectivity_event_received: null check the allocation st_nci_hci_connectivity_event_received: null check the allocation ASoC: rockchip: Fix mono capture ASoC: ti: davinci-mcasp: Correct slot_width posed constraint net: usb: qmi_wwan: Add the BroadMobi BM818 card qed: RDMA - Fix the hw_ver returned in device attributes isdn: mISDN: hfcsusb: Fix possible null-pointer dereferences in start_isoc_chain() mac80211_hwsim: Fix possible null-pointer dereferences in hwsim_dump_radio_nl() netfilter: ipset: Actually allow destination MAC address for hash:ip,mac sets too netfilter: ipset: Copy the right MAC address in bitmap:ip,mac and hash:ip,mac sets netfilter: ipset: Fix rename concurrency with listing rxrpc: Fix potential deadlock rxrpc: Fix the lack of notification when sendmsg() fails on a DATA packet isdn: hfcsusb: Fix mISDN driver crash caused by transfer buffer on the stack net: phy: phy_led_triggers: Fix a possible null-pointer dereference in phy_led_trigger_change_speed() perf bench numa: Fix cpu0 binding can: sja1000: force the string buffer NULL-terminated can: peak_usb: force the string buffer NULL-terminated net/ethernet/qlogic/qed: force the string buffer NULL-terminated NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim() NFS: Fix regression whereby fscache errors are appearing on 'nofsc' mounts HID: quirks: Set the INCREMENT_USAGE_ON_DUPLICATE quirk on Saitek X52 HID: input: fix a4tech horizontal wheel custom usage drm/rockchip: Suspend DP late SMB3: Fix potential memory leak when processing compound chain SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL s390: put _stext and _etext into .text section net: cxgb3_main: Fix a resource leak in a error path in 'init_one()' net: stmmac: Fix issues when number of Queues >= 4 net: stmmac: tc: Do not return a fragment entry net: hisilicon: make hip04_tx_reclaim non-reentrant net: hisilicon: fix hip04-xmit never return TX_BUSY net: hisilicon: Fix dma_map_single failed on arm64 libata: have ata_scsi_rw_xlat() fail invalid passthrough requests libata: add SG safety checks in SFF pio transfers x86/lib/cpu: Address missing prototypes warning drm/vmwgfx: fix memory leak when too many retries have occurred block, bfq: handle NULL return value by bfq_init_rq() perf ftrace: Fix failure to set cpumask when only one cpu is present perf cpumap: Fix writing to illegal memory in handling cpumap mask perf pmu-events: Fix missing "cpu_clk_unhalted.core" event KVM: arm64: Don't write junk to sysregs on reset KVM: arm: Don't write junk to CP15 registers on reset selftests: kvm: Adding config fragments HID: wacom: correct misreported EKR ring values HID: wacom: Correct distance scale for 2nd-gen Intuos devices Revert "dm bufio: fix deadlock with loop device" clk: socfpga: stratix10: fix rate caclulationg for cnt_clks ceph: clear page dirty before invalidate page ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply libceph: fix PG split vs OSD (re)connect race drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX gpiolib: never report open-drain/source lines as 'input' to user-space Drivers: hv: vmbus: Fix virt_to_hvpfn() for X86_PAE userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386 x86/apic: Handle missing global clockevent gracefully x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h x86/boot: Save fields explicitly, zero out everything else x86/boot: Fix boot regression caused by bootparam sanitizing dm kcopyd: always complete failed jobs dm btree: fix order of block initialization in btree_split_beneath dm integrity: fix a crash due to BUG_ON in __journal_read_write() dm raid: add missing cleanup in raid_ctr() dm space map metadata: fix missing store of apply_bops() return value dm table: fix invalid memory accesses with too high sector number dm zoned: improve error handling in reclaim dm zoned: improve error handling in i/o map code dm zoned: properly handle backing device failure genirq: Properly pair kobject_del() with kobject_add() mm, page_owner: handle THP splits correctly mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely mm/zsmalloc.c: fix race condition in zs_destroy_pool xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT xfs: don't trip over uninitialized buffer on extent read of corrupted inode xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h xfs: Add helper function xfs_attr_try_sf_addname xfs: Add attibute set and helper functions xfs: Add attibute remove and helper functions xfs: always rejoin held resources during defer roll dm zoned: fix potential NULL dereference in dmz_do_reclaim() powerpc: Allow flush_(inval_)dcache_range to work across ranges >4GB rxrpc: Fix local endpoint refcounting rxrpc: Fix read-after-free in rxrpc_queue_local() rxrpc: Fix local endpoint replacement rxrpc: Fix local refcounting Linux 4.19.69 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I9824a29e0434a6a80e2f32fdb88c0ac1fe8e5af5 |
||
Laura Abbott
|
c41e798f4f |
UPSTREAM: mm: slub: Fix slab walking for init_on_free
Upstream commit 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free"). To properly clear the slab on free with slab_want_init_on_free, we walk the list of free objects using get_freepointer/set_freepointer. The value we get from get_freepointer may not be valid. This isn't an issue since an actual value will get written later but this means there's a chance of triggering a bug if we use this value with set_freepointer: kernel BUG at mm/slub.c:306! invalid opcode: 0000 [#1] PREEMPT PTI CPU: 0 PID: 0 Comm: swapper Not tainted 5.2.0-05754-g6471384a #4 RIP: 0010:kfree+0x58a/0x5c0 Code: 48 83 05 78 37 51 02 01 0f 0b 48 83 05 7e 37 51 02 01 48 83 05 7e 37 51 02 01 48 83 05 7e 37 51 02 01 48 83 05 d6 37 51 02 01 <0f> 0b 48 83 05 d4 37 51 02 01 48 83 05 d4 37 51 02 01 48 83 05 d4 RSP: 0000:ffffffff82603d90 EFLAGS: 00010002 RAX: ffff8c3976c04320 RBX: ffff8c3976c04300 RCX: 0000000000000000 RDX: ffff8c3976c04300 RSI: 0000000000000000 RDI: ffff8c3976c04320 RBP: ffffffff82603db8 R08: 0000000000000000 R09: 0000000000000000 R10: ffff8c3976c04320 R11: ffffffff8289e1e0 R12: ffffd52cc8db0100 R13: ffff8c3976c01a00 R14: ffffffff810f10d4 R15: ffff8c3976c04300 FS: 0000000000000000(0000) GS:ffffffff8266b000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8c397ffff000 CR3: 0000000125020000 CR4: 00000000000406b0 Call Trace: apply_wqattrs_prepare+0x154/0x280 apply_workqueue_attrs_locked+0x4e/0xe0 apply_workqueue_attrs+0x36/0x60 alloc_workqueue+0x25a/0x6d0 workqueue_init_early+0x246/0x348 start_kernel+0x3c7/0x7ec x86_64_start_reservations+0x40/0x49 x86_64_start_kernel+0xda/0xe4 secondary_startup_64+0xb6/0xc0 Modules linked in: ---[ end trace f67eb9af4d8d492b ]--- Fix this by ensuring the value we set with set_freepointer is either NULL or another value in the chain. Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Laura Abbott <labbott@redhat.com> Fixes: 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options") Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: I89414f6bf1559771d5d7c14853ddfe9f769989e5 Bug: 138435492 Test: Boot cuttlefish with and without Test: CONFIG_INIT_ON_ALLOC_DEFAULT_ON/CONFIG_INIT_ON_FREE_DEFAULT_ON Signed-off-by: Alexander Potapenko <glider@google.com> |
||
Alexander Potapenko
|
a5587d8fce |
UPSTREAM: mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options
Upstream commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options"). Patch series "add init_on_alloc/init_on_free boot options", v10. Provide init_on_alloc and init_on_free boot options. These are aimed at preventing possible information leaks and making the control-flow bugs that depend on uninitialized values more deterministic. Enabling either of the options guarantees that the memory returned by the page allocator and SL[AU]B is initialized with zeroes. SLOB allocator isn't supported at the moment, as its emulation of kmem caches complicates handling of SLAB_TYPESAFE_BY_RCU caches correctly. Enabling init_on_free also guarantees that pages and heap objects are initialized right after they're freed, so it won't be possible to access stale data by using a dangling pointer. As suggested by Michal Hocko, right now we don't let the heap users to disable initialization for certain allocations. There's not enough evidence that doing so can speed up real-life cases, and introducing ways to opt-out may result in things going out of control. This patch (of 2): The new options are needed to prevent possible information leaks and make control-flow bugs that depend on uninitialized values more deterministic. This is expected to be on-by-default on Android and Chrome OS. And it gives the opportunity for anyone else to use it under distros too via the boot args. (The init_on_free feature is regularly requested by folks where memory forensics is included in their threat models.) init_on_alloc=1 makes the kernel initialize newly allocated pages and heap objects with zeroes. Initialization is done at allocation time at the places where checks for __GFP_ZERO are performed. init_on_free=1 makes the kernel initialize freed pages and heap objects with zeroes upon their deletion. This helps to ensure sensitive data doesn't leak via use-after-free accesses. Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator returns zeroed memory. The two exceptions are slab caches with constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never zero-initialized to preserve their semantics. Both init_on_alloc and init_on_free default to zero, but those defaults can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON. If either SLUB poisoning or page poisoning is enabled, those options take precedence over init_on_alloc and init_on_free: initialization is only applied to unpoisoned allocations. Slowdown for the new features compared to init_on_free=0, init_on_alloc=0: hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%) hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%) Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%) Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%) Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%) Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%) The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline is within the standard error. The new features are also going to pave the way for hardware memory tagging (e.g. arm64's MTE), which will require both on_alloc and on_free hooks to set the tags for heap objects. With MTE, tagging will have the same cost as memory initialization. Although init_on_free is rather costly, there are paranoid use-cases where in-memory data lifetime is desired to be minimized. There are various arguments for/against the realism of the associated threat models, but given that we'll need the infrastructure for MTE anyway, and there are people who want wipe-on-free behavior no matter what the performance cost, it seems reasonable to include it in this series. [glider@google.com: v8] Link: http://lkml.kernel.org/r/20190626121943.131390-2-glider@google.com [glider@google.com: v9] Link: http://lkml.kernel.org/r/20190627130316.254309-2-glider@google.com [glider@google.com: v10] Link: http://lkml.kernel.org/r/20190628093131.199499-2-glider@google.com Link: http://lkml.kernel.org/r/20190617151050.92663-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.cz> [page and dmapool parts Acked-by: James Morris <jamorris@linux.microsoft.com>] Cc: Christoph Lameter <cl@linux.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Sandeep Patil <sspatil@android.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: If0620a6a8aed34c21e98458c965e94f5a9dfd297 Bug: 138435492 Test: Boot cuttlefish with and without Test: CONFIG_INIT_ON_ALLOC_DEFAULT_ON/CONFIG_INIT_ON_FREE_DEFAULT_ON Signed-off-by: Alexander Potapenko <glider@google.com> |
||
Henry Burns
|
ed11e60033 |
mm/zsmalloc.c: fix race condition in zs_destroy_pool
commit 701d678599d0c1623aaf4139c03eea260a75b027 upstream.
In zs_destroy_pool() we call flush_work(&pool->free_work). However, we
have no guarantee that migration isn't happening in the background at
that time.
Since migration can't directly free pages, it relies on free_work being
scheduled to free the pages. But there's nothing preventing an
in-progress migrate from queuing the work *after*
zs_unregister_migration() has called flush_work(). Which would mean
pages still pointing at the inode when we free it.
Since we know at destroy time all objects should be free, no new
migrations can come in (since zs_page_isolate() fails for fully-free
zspages). This means it is sufficient to track a "# isolated zspages"
count by class, and have the destroy logic ensure all such pages have
drained before proceeding. Keeping that state under the class spinlock
keeps the logic straightforward.
In this case a memory leak could lead to an eventual crash if compaction
hits the leaked page. This crash would only occur if people are
changing their zswap backend at runtime (which eventually starts
destruction).
Link: http://lkml.kernel.org/r/20190809181751.219326-2-henryburns@google.com
Fixes:
|
||
Henry Burns
|
b30a2f608e |
mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely
commit 1a87aa03597efa9641e92875b883c94c7f872ccb upstream.
In zs_page_migrate() we call putback_zspage() after we have finished
migrating all pages in this zspage. However, the return value is
ignored. If a zs_free() races in between zs_page_isolate() and
zs_page_migrate(), freeing the last object in the zspage,
putback_zspage() will leave the page in ZS_EMPTY for potentially an
unbounded amount of time.
To fix this, we need to do the same thing as zs_page_putback() does:
schedule free_work to occur.
To avoid duplicated code, move the sequence to a new
putback_zspage_deferred() function which both zs_page_migrate() and
zs_page_putback() call.
Link: http://lkml.kernel.org/r/20190809181751.219326-1-henryburns@google.com
Fixes:
|
||
Vlastimil Babka
|
db67ac0316 |
mm, page_owner: handle THP splits correctly
commit f7da677bc6e72033f0981b9d58b5c5d409fa641e upstream.
THP splitting path is missing the split_page_owner() call that
split_page() has.
As a result, split THP pages are wrongly reported in the page_owner file
as order-9 pages. Furthermore when the former head page is freed, the
remaining former tail pages are not listed in the page_owner file at
all. This patch fixes that by adding the split_page_owner() call into
__split_huge_page().
Link: http://lkml.kernel.org/r/20190820131828.22684-2-vbabka@suse.cz
Fixes:
|
||
Greg Kroah-Hartman
|
bf707e4df1 |
This is the 4.19.68 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1iS0cACgkQONu9yGCS aT7zbhAAyPU6KNVLa1Pj/xQf8puJ3v+FTws0X9Ii9zyWCNNuTXSi8mf4arP8oMjH NsYvhCGBHIQO0l3kFJRnOMLp0pPZPPUgHLvFWQljKWRABUOzMLjCWolnjegIPo8E cUwYkwx5T5oZtYH7ScxfQLiQUJB28L65+gi+/LBqANcEYL6WaSa2+UIBeVUzbMTt tQw6TAmE6f/kGAbmiQFeIdHj6Q9MrQyGBwTdhVSe+OENlZnZq8pxbwy3GXwEBuOP rtmUfZrSxdUmWBa7+/oW14TBe/c6j2LT0tVoyZdUFGZNbOUNv4vEImXghd28YSuv fppeSkom5di+RDH+B+LCNm+rV5vbfQqGTMuwBT6do2EDuQQ5KqS5kR8tULI4GKVl pNejWeK2qcNNloC7imH+4rHIy9AFewKz1ixbGolSaPXyCYfYEBx1xfpw4npng2+X aWJuk7/DnEEdzeKu9msdycQpf0aT5vkJqquBZQTzd5HAMlgqAF/sjvKjIsaUPNu1 wDc+tyyAF0lObR7aswuhvttZ/8yczMW6pOJiM/XAfFn/a+Y7V2FRedGcIsxyeKOw YMNoW6VskIunShpbKhxVjPViI27NM0o8vmwCKmRQ6FRpxOX6vViNj+uQOjGQxfez N5g9jqxLsq9YGVQfoseS2Taao8JkBrTOW0wiJtE3/nwt/FaeHpU= =M37u -----END PGP SIGNATURE----- Merge 4.19.68 into android-4.19 Changes in 4.19.68 sh: kernel: hw_breakpoint: Fix missing break in switch statement seq_file: fix problem when seeking mid-record mm/hmm: fix bad subpage pointer in try_to_unmap_one mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind mm/memcontrol.c: fix use after free in mem_cgroup_iter() mm/usercopy: use memory range to be accessed for wraparound check Revert "pwm: Set class for exported channels in sysfs" cpufreq: schedutil: Don't skip freq update when limits change xtensa: add missing isync to the cpu_reset TLB code ALSA: hda/realtek - Add quirk for HP Envy x360 ALSA: usb-audio: Fix a stack buffer overflow bug in check_input_term ALSA: usb-audio: Fix an OOB bug in parse_audio_mixer_unit ALSA: hda - Apply workaround for another AMD chip 1022:1487 ALSA: hda - Fix a memory leak bug ALSA: hda - Add a generic reboot_notify ALSA: hda - Let all conexant codec enter D3 when rebooting HID: holtek: test for sanity of intfdata HID: hiddev: avoid opening a disconnected device HID: hiddev: do cleanup in failure of opening a device Input: kbtab - sanity check for endpoint type Input: iforce - add sanity checks net: usb: pegasus: fix improper read if get_registers() fail netfilter: ebtables: also count base chain policies riscv: Make __fstate_clean() work correctly. clk: at91: generated: Truncate divisor to GENERATED_MAX_DIV + 1 clk: sprd: Select REGMAP_MMIO to avoid compile errors clk: renesas: cpg-mssr: Fix reset control race condition xen/pciback: remove set but not used variable 'old_state' irqchip/gic-v3-its: Free unused vpt_page when alloc vpe table fail irqchip/irq-imx-gpcv2: Forward irq type to parent perf header: Fix divide by zero error if f_header.attr_size==0 perf header: Fix use of unitialized value warning libata: zpodd: Fix small read overflow in zpodd_get_mech_type() drm/bridge: lvds-encoder: Fix build error while CONFIG_DRM_KMS_HELPER=m Btrfs: fix deadlock between fiemap and transaction commits scsi: hpsa: correct scsi command status issue after reset scsi: qla2xxx: Fix possible fcport null-pointer dereferences drm/amdgpu: fix a potential information leaking bug ata: libahci: do not complain in case of deferred probe kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules kbuild: Check for unknown options with cc-option usage in Kconfig and clang arm64/efi: fix variable 'si' set but not used arm64: unwind: Prohibit probing on return_address() arm64/mm: fix variable 'pud' set but not used IB/core: Add mitigation for Spectre V1 IB/mlx5: Fix MR registration flow to use UMR properly IB/mad: Fix use-after-free in ib mad completion handling drm: msm: Fix add_gpu_components drm/exynos: fix missing decrement of retry counter Revert "kmemleak: allow to coexist with fault injection" ocfs2: remove set but not used variable 'last_hash' asm-generic: fix -Wtype-limits compiler warnings arm64: KVM: regmap: Fix unexpected switch fall-through KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block staging: comedi: dt3000: Fix signed integer overflow 'divider * base' staging: comedi: dt3000: Fix rounding up of timer divisor iio: adc: max9611: Fix temperature reading in probe USB: core: Fix races in character device registration and deregistraion usb: gadget: udc: renesas_usb3: Fix sysfs interface of "role" usb: cdc-acm: make sure a refcount is taken early enough USB: CDC: fix sanity checks in CDC union parser USB: serial: option: add D-Link DWM-222 device ID USB: serial: option: Add support for ZTE MF871A USB: serial: option: add the BroadMobi BM818 card USB: serial: option: Add Motorola modem UARTs drm/i915/cfl: Add a new CFL PCI ID. dm: disable DISCARD if the underlying storage no longer supports it arm64: ftrace: Ensure module ftrace trampoline is coherent with I-side netfilter: conntrack: Use consistent ct id hash calculation Input: psmouse - fix build error of multiple definition iommu/amd: Move iommu_init_pci() to .init section bnx2x: Fix VF's VLAN reconfiguration in reload. bonding: Add vlan tx offload to hw_enc_features net: dsa: Check existence of .port_mdb_add callback before calling it net/mlx4_en: fix a memory leak bug net/packet: fix race in tpacket_snd() sctp: fix memleak in sctp_send_reset_streams sctp: fix the transport error_count check team: Add vlan tx offload to hw_enc_features tipc: initialise addr_trail_end when setting node addresses xen/netback: Reset nr_frags before freeing skb net/mlx5e: Only support tx/rx pause setting for port owner net/mlx5e: Use flow keys dissector to parse packets for ARFS mmc: sdhci-of-arasan: Do now show error message in case of deffered probe Linux 4.19.68 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I8d9bb84a852deab3581f6cb37b4cb16e9bfe927c |
||
Yang Shi
|
01d8d08f4c |
Revert "kmemleak: allow to coexist with fault injection"
[ Upstream commit df9576def004d2cd5beedc00cb6e8901427634b9 ] When running ltp's oom test with kmemleak enabled, the below warning was triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is passed in: WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50 Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50 ... kmemleak_alloc+0x4e/0xb0 kmem_cache_alloc+0x2a7/0x3e0 mempool_alloc_slab+0x2d/0x40 mempool_alloc+0x118/0x2b0 bio_alloc_bioset+0x19d/0x350 get_swap_bio+0x80/0x230 __swap_writepage+0x5ff/0xb20 The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak has __GFP_NOFAIL set all the time due to |
||
Isaac J. Manjarres
|
056368fc3e |
mm/usercopy: use memory range to be accessed for wraparound check
commit 951531691c4bcaa59f56a316e018bc2ff1ddf855 upstream.
Currently, when checking to see if accessing n bytes starting at address
"ptr" will cause a wraparound in the memory addresses, the check in
check_bogus_address() adds an extra byte, which is incorrect, as the
range of addresses that will be accessed is [ptr, ptr + (n - 1)].
This can lead to incorrectly detecting a wraparound in the memory
address, when trying to read 4 KB from memory that is mapped to the the
last possible page in the virtual address space, when in fact, accessing
that range of memory would not cause a wraparound to occur.
Use the memory range that will actually be accessed when considering if
accessing a certain amount of bytes will cause the memory address to
wrap around.
Link: http://lkml.kernel.org/r/1564509253-23287-1-git-send-email-isaacm@codeaurora.org
Fixes:
|
||
Miles Chen
|
c8282f1b56 |
mm/memcontrol.c: fix use after free in mem_cgroup_iter()
commit 54a83d6bcbf8f4700013766b974bf9190d40b689 upstream.
This patch is sent to report an use after free in mem_cgroup_iter()
after merging commit be2657752e9e ("mm: memcg: fix use after free in
mem_cgroup_iter()").
I work with android kernel tree (4.9 & 4.14), and commit be2657752e9e
("mm: memcg: fix use after free in mem_cgroup_iter()") has been merged
to the trees. However, I can still observe use after free issues
addressed in the commit be2657752e9e. (on low-end devices, a few times
this month)
backtrace:
css_tryget <- crash here
mem_cgroup_iter
shrink_node
shrink_zones
do_try_to_free_pages
try_to_free_pages
__perform_reclaim
__alloc_pages_direct_reclaim
__alloc_pages_slowpath
__alloc_pages_nodemask
To debug, I poisoned mem_cgroup before freeing it:
static void __mem_cgroup_free(struct mem_cgroup *memcg)
for_each_node(node)
free_mem_cgroup_per_node_info(memcg, node);
free_percpu(memcg->stat);
+ /* poison memcg before freeing it */
+ memset(memcg, 0x78, sizeof(struct mem_cgroup));
kfree(memcg);
}
The coredump shows the position=0xdbbc2a00 is freed.
(gdb) p/x ((struct mem_cgroup_per_node *)0xe5009e00)->iter[8]
$13 = {position = 0xdbbc2a00, generation = 0x2efd}
0xdbbc2a00: 0xdbbc2e00 0x00000000 0xdbbc2800 0x00000100
0xdbbc2a10: 0x00000200 0x78787878 0x00026218 0x00000000
0xdbbc2a20: 0xdcad6000 0x00000001 0x78787800 0x00000000
0xdbbc2a30: 0x78780000 0x00000000 0x0068fb84 0x78787878
0xdbbc2a40: 0x78787878 0x78787878 0x78787878 0xe3fa5cc0
0xdbbc2a50: 0x78787878 0x78787878 0x00000000 0x00000000
0xdbbc2a60: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a70: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a80: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a90: 0x00000001 0x00000000 0x00000000 0x00100000
0xdbbc2aa0: 0x00000001 0xdbbc2ac8 0x00000000 0x00000000
0xdbbc2ab0: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2ac0: 0x00000000 0x00000000 0xe5b02618 0x00001000
0xdbbc2ad0: 0x00000000 0x78787878 0x78787878 0x78787878
0xdbbc2ae0: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2af0: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b00: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b10: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b20: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b30: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b40: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b50: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b60: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b70: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b80: 0x78787878 0x78787878 0x00000000 0x78787878
0xdbbc2b90: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2ba0: 0x78787878 0x78787878 0x78787878 0x78787878
In the reclaim path, try_to_free_pages() does not setup
sc.target_mem_cgroup and sc is passed to do_try_to_free_pages(), ...,
shrink_node().
In mem_cgroup_iter(), root is set to root_mem_cgroup because
sc->target_mem_cgroup is NULL. It is possible to assign a memcg to
root_mem_cgroup.nodeinfo.iter in mem_cgroup_iter().
try_to_free_pages
struct scan_control sc = {...}, target_mem_cgroup is 0x0;
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup *root = sc->target_mem_cgroup;
memcg = mem_cgroup_iter(root, NULL, &reclaim);
mem_cgroup_iter()
if (!root)
root = root_mem_cgroup;
...
css = css_next_descendant_pre(css, &root->css);
memcg = mem_cgroup_from_css(css);
cmpxchg(&iter->position, pos, memcg);
My device uses memcg non-hierarchical mode. When we release a memcg:
invalidate_reclaim_iterators() reaches only dead_memcg and its parents.
If non-hierarchical mode is used, invalidate_reclaim_iterators() never
reaches root_mem_cgroup.
static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
{
struct mem_cgroup *memcg = dead_memcg;
for (; memcg; memcg = parent_mem_cgroup(memcg)
...
}
So the use after free scenario looks like:
CPU1 CPU2
try_to_free_pages
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup_iter()
if (!root)
root = root_mem_cgroup;
...
css = css_next_descendant_pre(css, &root->css);
memcg = mem_cgroup_from_css(css);
cmpxchg(&iter->position, pos, memcg);
invalidate_reclaim_iterators(memcg);
...
__mem_cgroup_free()
kfree(memcg);
try_to_free_pages
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup_iter()
if (!root)
root = root_mem_cgroup;
...
mz = mem_cgroup_nodeinfo(root, reclaim->pgdat->node_id);
iter = &mz->iter[reclaim->priority];
pos = READ_ONCE(iter->position);
css_tryget(&pos->css) <- use after free
To avoid this, we should also invalidate root_mem_cgroup.nodeinfo.iter
in invalidate_reclaim_iterators().
[cai@lca.pw: fix -Wparentheses compilation warning]
Link: http://lkml.kernel.org/r/1564580753-17531-1-git-send-email-cai@lca.pw
Link: http://lkml.kernel.org/r/20190730015729.4406-1-miles.chen@mediatek.com
Fixes:
|
||
Yang Shi
|
3c0cb90e92 |
mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind
commit a53190a4aaa36494f4d7209fd1fcc6f2ee08e0e0 upstream.
When running syzkaller internally, we ran into the below bug on 4.9.x
kernel:
kernel BUG at mm/huge_memory.c:2124!
invalid opcode: 0000 [#1] SMP KASAN
CPU: 0 PID: 1518 Comm: syz-executor107 Not tainted 4.9.168+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
task: ffff880067b34900 task.stack: ffff880068998000
RIP: split_huge_page_to_list+0x8fb/0x1030 mm/huge_memory.c:2124
Call Trace:
split_huge_page include/linux/huge_mm.h:100 [inline]
queue_pages_pte_range+0x7e1/0x1480 mm/mempolicy.c:538
walk_pmd_range mm/pagewalk.c:50 [inline]
walk_pud_range mm/pagewalk.c:90 [inline]
walk_pgd_range mm/pagewalk.c:116 [inline]
__walk_page_range+0x44a/0xdb0 mm/pagewalk.c:208
walk_page_range+0x154/0x370 mm/pagewalk.c:285
queue_pages_range+0x115/0x150 mm/mempolicy.c:694
do_mbind mm/mempolicy.c:1241 [inline]
SYSC_mbind+0x3c3/0x1030 mm/mempolicy.c:1370
SyS_mbind+0x46/0x60 mm/mempolicy.c:1352
do_syscall_64+0x1d2/0x600 arch/x86/entry/common.c:282
entry_SYSCALL_64_after_swapgs+0x5d/0xdb
Code: c7 80 1c 02 00 e8 26 0a 76 01 <0f> 0b 48 c7 c7 40 46 45 84 e8 4c
RIP [<ffffffff81895d6b>] split_huge_page_to_list+0x8fb/0x1030 mm/huge_memory.c:2124
RSP <ffff88006899f980>
with the below test:
uint64_t r[1] = {0xffffffffffffffff};
int main(void)
{
syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
intptr_t res = 0;
res = syscall(__NR_socket, 0x11, 3, 0x300);
if (res != -1)
r[0] = res;
*(uint32_t*)0x20000040 = 0x10000;
*(uint32_t*)0x20000044 = 1;
*(uint32_t*)0x20000048 = 0xc520;
*(uint32_t*)0x2000004c = 1;
syscall(__NR_setsockopt, r[0], 0x107, 0xd, 0x20000040, 0x10);
syscall(__NR_mmap, 0x20fed000, 0x10000, 0, 0x8811, r[0], 0);
*(uint64_t*)0x20000340 = 2;
syscall(__NR_mbind, 0x20ff9000, 0x4000, 0x4002, 0x20000340, 0x45d4, 3);
return 0;
}
Actually the test does:
mmap(0x20000000, 16777216, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000
socket(AF_PACKET, SOCK_RAW, 768) = 3
setsockopt(3, SOL_PACKET, PACKET_TX_RING, {block_size=65536, block_nr=1, frame_size=50464, frame_nr=1}, 16) = 0
mmap(0x20fed000, 65536, PROT_NONE, MAP_SHARED|MAP_FIXED|MAP_POPULATE|MAP_DENYWRITE, 3, 0) = 0x20fed000
mbind(..., MPOL_MF_STRICT|MPOL_MF_MOVE) = 0
The setsockopt() would allocate compound pages (16 pages in this test)
for packet tx ring, then the mmap() would call packet_mmap() to map the
pages into the user address space specified by the mmap() call.
When calling mbind(), it would scan the vma to queue the pages for
migration to the new node. It would split any huge page since 4.9
doesn't support THP migration, however, the packet tx ring compound
pages are not THP and even not movable. So, the above bug is triggered.
However, the later kernel is not hit by this issue due to commit
|
||
Yang Shi
|
cd825d8714 |
mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified
commit d883544515aae54842c21730b880172e7894fde9 upstream. When both MPOL_MF_MOVE* and MPOL_MF_STRICT was specified, mbind() should try best to migrate misplaced pages, if some of the pages could not be migrated, then return -EIO. There are three different sub-cases: 1. vma is not migratable 2. vma is migratable, but there are unmovable pages 3. vma is migratable, pages are movable, but migrate_pages() fails If #1 happens, kernel would just abort immediately, then return -EIO, after a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified"). If #3 happens, kernel would set policy and migrate pages with best-effort, but won't rollback the migrated pages and reset the policy back. Before that commit, they behaves in the same way. It'd better to keep their behavior consistent. But, rolling back the migrated pages and resetting the policy back sounds not feasible, so just make #1 behave as same as #3. Userspace will know that not everything was successfully migrated (via -EIO), and can take whatever steps it deems necessary - attempt rollback, determine which exact page(s) are violating the policy, etc. Make queue_pages_range() return 1 to indicate there are unmovable pages or vma is not migratable. The #2 is not handled correctly in the current kernel, the following patch will fix it. [yang.shi@linux.alibaba.com: fix review comments from Vlastimil] Link: http://lkml.kernel.org/r/1563556862-54056-2-git-send-email-yang.shi@linux.alibaba.com Link: http://lkml.kernel.org/r/1561162809-59140-2-git-send-email-yang.shi@linux.alibaba.com Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Ralph Campbell
|
f0fed8283d |
mm/hmm: fix bad subpage pointer in try_to_unmap_one
commit 1de13ee59225dfc98d483f8cce7d83f97c0b31de upstream.
When migrating an anonymous private page to a ZONE_DEVICE private page,
the source page->mapping and page->index fields are copied to the
destination ZONE_DEVICE struct page and the page_mapcount() is
increased. This is so rmap_walk() can be used to unmap and migrate the
page back to system memory.
However, try_to_unmap_one() computes the subpage pointer from a swap pte
which computes an invalid page pointer and a kernel panic results such
as:
BUG: unable to handle page fault for address: ffffea1fffffffc8
Currently, only single pages can be migrated to device private memory so
no subpage computation is needed and it can be set to "page".
[rcampbell@nvidia.com: add comment]
Link: http://lkml.kernel.org/r/20190724232700.23327-4-rcampbell@nvidia.com
Link: http://lkml.kernel.org/r/20190719192955.30462-4-rcampbell@nvidia.com
Fixes:
|
||
Greg Kroah-Hartman
|
b1e96f1650 |
This is the 4.19.67 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1WZYYACgkQONu9yGCS aT5VjRAApdD6wuKcKhZ8j010Ni18w6W+3qs6IuIXv94eav0zFSRaO9Zp93lZq2p0 h+k+ssZ+P8a4EuDquzDydlagno9hojHFAYr+9loPZlZUw578Jzg9JbJK9Z1MyQCo BCRElzZG67E+WjLP0wGHnS0oVhIoHlJaVWP3pEYkTJILY65ErLT/fYGs64YUAEKr Ct1pKoIHPEC0606IKx12kmV645ME4z6pI7g4kLDhk2BozglbxGlwdHgVuIe/NzDP PraR1gqMoOD2skjK673ozsZ65yuiVeqSTsbs49Xao1lAS6etUMbC/ACU/yrhL48H mMM/EFTSKb5TjJSxQAXU1ANQrm4X6n1yPkNW/MdthnPAotDY3Nda4NNVE9X2toM7 XW0HfFdcVD7aJtpC/h6ckndGTaOGkHSPjhJtSlBEjF76BA+KhZ9hhcjNWng92bWL d5Nws4b82wvgM6T99mkZfbMc2MOopPMf+I94W0JcMa49+rXhyhJdrC72GpxKLdSq +XtZJupFWg0RrPlZfmc4Az8f/uY0UfR9gNSaHJokaZAkMzH2x4MzMnPxwRiXAw4W qz1s+sgZlqUQcWvODzaNvG1l7QtjD5rbdJ+FAjN2+16F8rep52Yl/IQffYr04DDD wikYmcUoMh8hCoj6Atj2LAAU9ulhl6ja8s0YpmHz/HQETufHAZc= =gOG+ -----END PGP SIGNATURE----- Merge 4.19.67 into android-4.19 Changes in 4.19.67 iio: cros_ec_accel_legacy: Fix incorrect channel setting iio: adc: max9611: Fix misuse of GENMASK macro staging: gasket: apex: fix copy-paste typo staging: android: ion: Bail out upon SIGKILL when allocating memory. crypto: ccp - Fix oops by properly managing allocated structures crypto: ccp - Add support for valid authsize values less than 16 crypto: ccp - Ignore tag length when decrypting GCM ciphertext usb: usbfs: fix double-free of usb memory upon submiturb error usb: iowarrior: fix deadlock on disconnect sound: fix a memory leak bug mmc: cavium: Set the correct dma max segment size for mmc_host mmc: cavium: Add the missing dma unmap when the dma has finished. loop: set PF_MEMALLOC_NOIO for the worker thread Input: usbtouchscreen - initialize PM mutex before using it Input: elantech - enable SMBus on new (2018+) systems Input: synaptics - enable RMI mode for HP Spectre X360 x86/mm: Check for pfn instead of page in vmalloc_sync_one() x86/mm: Sync also unmappings in vmalloc_sync_all() mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy() perf annotate: Fix s390 gap between kernel end and module start perf db-export: Fix thread__exec_comm() perf record: Fix module size on s390 x86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS gfs2: gfs2_walk_metadata fix usb: host: xhci-rcar: Fix timeout in xhci_suspend() usb: yurex: Fix use-after-free in yurex_delete usb: typec: tcpm: free log buf memory when remove debug file usb: typec: tcpm: remove tcpm dir if no children usb: typec: tcpm: Add NULL check before dereferencing config usb: typec: tcpm: Ignore unsupported/unknown alternate mode requests can: rcar_canfd: fix possible IRQ storm on high load can: peak_usb: fix potential double kfree_skb() netfilter: nfnetlink: avoid deadlock due to synchronous request_module vfio-ccw: Set pa_nr to 0 if memory allocation fails for pa_iova_pfn netfilter: Fix rpfilter dropping vrf packets by mistake netfilter: conntrack: always store window size un-scaled netfilter: nft_hash: fix symhash with modulus one scripts/sphinx-pre-install: fix script for RHEL/CentOS drm/amd/display: Wait for backlight programming completion in set backlight level drm/amd/display: use encoder's engine id to find matched free audio device drm/amd/display: Fix dc_create failure handling and 666 color depths drm/amd/display: Only enable audio if speaker allocation exists drm/amd/display: Increase size of audios array iscsi_ibft: make ISCSI_IBFT dependson ACPI instead of ISCSI_IBFT_FIND nl80211: fix NL80211_HE_MAX_CAPABILITY_LEN mac80211: don't warn about CW params when not using them allocate_flower_entry: should check for null deref hwmon: (nct6775) Fix register address and added missed tolerance for nct6106 drm: silence variable 'conn' set but not used cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init() s390/qdio: add sanity checks to the fast-requeue path ALSA: compress: Fix regression on compressed capture streams ALSA: compress: Prevent bypasses of set_params ALSA: compress: Don't allow paritial drain operations on capture streams ALSA: compress: Be more restrictive about when a drain is allowed perf tools: Fix proper buffer size for feature processing perf probe: Avoid calling freeing routine multiple times for same pointer drbd: dynamically allocate shash descriptor ACPI/IORT: Fix off-by-one check in iort_dev_find_its_id() nvme: fix multipath crash when ANA is deactivated ARM: davinci: fix sleep.S build error on ARMv4 ARM: dts: bcm: bcm47094: add missing #cells for mdio-bus-mux scsi: megaraid_sas: fix panic on loading firmware crashdump scsi: ibmvfc: fix WARN_ON during event pool release scsi: scsi_dh_alua: always use a 2 second delay before retrying RTPG test_firmware: fix a memory leak bug tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop perf/core: Fix creating kernel counters for PMUs that override event->cpu s390/dma: provide proper ARCH_ZONE_DMA_BITS value HID: sony: Fix race condition between rumble and device remove. x86/purgatory: Do not use __builtin_memcpy and __builtin_memset ALSA: usb-audio: fix a memory leak bug can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices can: peak_usb: pcan_usb_fd: Fix info-leaks to USB devices hwmon: (nct7802) Fix wrong detection of in4 presence drm/i915: Fix wrong escape clock divisor init for GLK ALSA: firewire: fix a memory leak bug ALSA: hiface: fix multiple memory leak bugs ALSA: hda - Don't override global PCM hw info flag ALSA: hda - Workaround for crackled sound on AMD controller (1022:1457) mac80211: don't WARN on short WMM parameters from AP dax: dax_layout_busy_page() should not unmap cow pages SMB3: Fix deadlock in validate negotiate hits reconnect smb3: send CAP_DFS capability during session setup NFSv4: Fix an Oops in nfs4_do_setattr KVM: Fix leak vCPU's VMCS value into other pCPU mwifiex: fix 802.11n/WPA detection iwlwifi: don't unmap as page memory that was mapped as single iwlwifi: mvm: fix an out-of-bound access iwlwifi: mvm: don't send GEO_TX_POWER_LIMIT on version < 41 iwlwifi: mvm: fix version check for GEO_TX_POWER_LIMIT support Linux 4.19.67 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I5ea813ed5ba6d1eeda51eb4031395ee3e8ba54c3 |
||
Joerg Roedel
|
46b306f3cd |
mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
commit 3f8fd02b1bf1d7ba964485a56f2f4b53ae88c167 upstream.
On x86-32 with PTI enabled, parts of the kernel page-tables are not shared
between processes. This can cause mappings in the vmalloc/ioremap area to
persist in some page-tables after the region is unmapped and released.
When the region is re-used the processes with the old mappings do not fault
in the new mappings but still access the old ones.
This causes undefined behavior, in reality often data corruption, kernel
oopses and panics and even spontaneous reboots.
Fix this problem by activly syncing unmaps in the vmalloc/ioremap area to
all page-tables in the system before the regions can be re-used.
References: https://bugzilla.suse.com/show_bug.cgi?id=1118689
Fixes:
|
||
Greg Kroah-Hartman
|
de4c70d6a9 |
This is the 4.19.65 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1Js7MACgkQONu9yGCS aT4PQxAAo7xa4kYvDxc1RjUY/yIlp6lQ3rpYAAfZB0t8vN+dqivnJZ7m6JHeWX1Y CMcxg85zxLVFeuiXdP821Zj68AB5zqlWMhX0bXm2lhw/Eo9+XHzXtnrLZHhz0/Xd M5cmfIPmoyPCUQQfzSfUMvch+ZpwzEt5op5pUfSjckSpjHQZ0HFj1WJ4D8Hn9jAJ y4+DAKDZgtqhb3GvpS6MoVnBJgcPk9+mBiDkSb12L392+FvHqfeBi3tDRhvyiZAO iJrk747SPds7NlNmuRnj7YyUSDhBzaceRCz0Jsv9FT5EKXoPErXdsL3Bkfa9TREM pH0OaMgNr6WSXLO9qIMcfxMeaKVIvIbotqBTkBTzhEAGPkHA75dhi0lpixXXFExg MaqhLfmHO0dOEr9FrvYGe7f2wUA1Rdw/qRTM3KPEKmHxMqBS7eufIWMHwie1n9Oe cYoP6UkxUIvhUyFV2BlMRFdMfaDbtR0iqy8Dqh36NISD6PAYaUGSoVeSO1fEg4Jy 5GgrKPg6rcz2XNY2cVbsm2zLpqY4dY58SFK9ORfuULdKUQvScvFGrdSSW0CgX+uc F/5NmPutUoboHVxFraDPx7yo46pHf1RW0Me4xZ0aJ3e9ituLAN4fmJ9u46nofb5M thPelQlMVt30O41uViJ0ADkOjCsiBr3AxOFvc76Ct9Q/BJVxhLk= =JVBv -----END PGP SIGNATURE----- Merge 4.19.65 into android-4.19 Changes in 4.19.65 ARM: riscpc: fix DMA ARM: dts: rockchip: Make rk3288-veyron-minnie run at hs200 ARM: dts: rockchip: Make rk3288-veyron-mickey's emmc work again ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend ftrace: Enable trampoline when rec count returns back to one dmaengine: tegra-apb: Error out if DMA_PREP_INTERRUPT flag is unset arm64: dts: rockchip: fix isp iommu clocks and power domain kernel/module.c: Only return -EEXIST for modules that have finished loading firmware/psci: psci_checker: Park kthreads before stopping them MIPS: lantiq: Fix bitfield masking dmaengine: rcar-dmac: Reject zero-length slave DMA requests clk: tegra210: fix PLLU and PLLU_OUT1 fs/adfs: super: fix use-after-free bug clk: sprd: Add check for return value of sprd_clk_regmap_init() btrfs: fix minimum number of chunk errors for DUP btrfs: qgroup: Don't hold qgroup_ioctl_lock in btrfs_qgroup_inherit() cifs: Fix a race condition with cifs_echo_request ceph: fix improper use of smp_mb__before_atomic() ceph: return -ERANGE if virtual xattr value didn't fit in buffer ACPI: blacklist: fix clang warning for unused DMI table scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized perf version: Fix segfault due to missing OPT_END() x86: kvm: avoid constant-conversion warning ACPI: fix false-positive -Wuninitialized warning be2net: Signal that the device cannot transmit during reconfiguration x86/apic: Silence -Wtype-limits compiler warnings x86: math-emu: Hide clang warnings for 16-bit overflow mm/cma.c: fail if fixed declaration can't be honored lib/test_overflow.c: avoid tainting the kernel and fix wrap size lib/test_string.c: avoid masking memset16/32/64 failures coda: add error handling for fget coda: fix build using bare-metal toolchain uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings ipc/mqueue.c: only perform resource calculation if user valid mlxsw: spectrum_dcb: Configure DSCP map as the last rule is removed xen/pv: Fix a boot up hang revealed by int3 self test x86/kvm: Don't call kvm_spurious_fault() from .fixup x86/paravirt: Fix callee-saved function ELF sizes x86, boot: Remove multiple copy of static function sanitize_boot_params() drm/nouveau: fix memory leak in nouveau_conn_reset() kconfig: Clear "written" flag to avoid data loss kbuild: initialize CLANG_FLAGS correctly in the top Makefile Btrfs: fix incremental send failure after deduplication Btrfs: fix race leading to fs corruption after transaction abort mmc: dw_mmc: Fix occasional hang after tuning on eMMC mmc: meson-mx-sdio: Fix misuse of GENMASK macro gpiolib: fix incorrect IRQ requesting of an active-low lineevent IB/hfi1: Fix Spectre v1 vulnerability mtd: rawnand: micron: handle on-die "ECC-off" devices correctly selinux: fix memory leak in policydb_init() ALSA: hda: Fix 1-minute detection delay when i915 module is not available mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker s390/dasd: fix endless loop after read unit address configuration cgroup: kselftest: relax fs_spec checks parisc: Fix build of compressed kernel even with debug enabled drivers/perf: arm_pmu: Fix failure path in PM notifier arm64: compat: Allow single-byte watchpoints on all addresses arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG} nbd: replace kill_bdev() with __invalidate_device() again xen/swiotlb: fix condition for calling xen_destroy_contiguous_region() IB/mlx5: Fix unreg_umr to ignore the mkey state IB/mlx5: Use direct mkey destroy command upon UMR unreg failure IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache IB/mlx5: Fix clean_mr() to work in the expected order IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification IB/hfi1: Check for error on call to alloc_rsm_map_table drm/i915/gvt: fix incorrect cache entry for guest page mapping eeprom: at24: make spd world-readable again ARC: enable uboot support unconditionally objtool: Support GCC 9 cold subfunction naming scheme gcc-9: properly declare the {pv,hv}clock_page storage x86/vdso: Prevent segfaults due to hoisted vclock reads scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA x86/cpufeatures: Carve out CQM features retrieval x86/cpufeatures: Combine word 11 and 12 into a new scattered features word x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations x86/speculation: Enable Spectre v1 swapgs mitigations x86/entry/64: Use JMP instead of JMPQ x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS Documentation: Add swapgs description to the Spectre v1 documentation Linux 4.19.65 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Iceeabdb164657e0a616db618e6aa8445d56b0dc1 |
||
Yang Shi
|
beb0cc781b |
mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker
commit fa1e512fac717f34e7c12d7a384c46e90a647392 upstream. Shakeel Butt reported premature oom on kernel with "cgroup_disable=memory" since mem_cgroup_is_root() returns false even though memcg is actually NULL. The drop_caches is also broken. It is because commit |
||
Doug Berger
|
439c79ed77 |
mm/cma.c: fail if fixed declaration can't be honored
[ Upstream commit c633324e311243586675e732249339685e5d6faa ]
The description of cma_declare_contiguous() indicates that if the
'fixed' argument is true the reserved contiguous area must be exactly at
the address of the 'base' argument.
However, the function currently allows the 'base', 'size', and 'limit'
arguments to be silently adjusted to meet alignment constraints. This
commit enforces the documented behavior through explicit checks that
return an error if the region does not fit within a specified region.
Link: http://lkml.kernel.org/r/1561422051-16142-1-git-send-email-opendmb@gmail.com
Fixes:
|
||
Greg Kroah-Hartman
|
75ff56e1a2 |
This is the 4.19.63 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1BJrAACgkQONu9yGCS aT6MpRAAt2nuozm5Z/MshFAuGFzddAwPoYtkIPSy8BiPHYjf7x0+D5Ew4dz5OihS ElfbA94hMOpvhhXlzBU3ZFJsWZIK78gzV6+LiHyb5R97Jdzj/zT4h40y0kxKw+pS gghnZ6zx+pGSIXm/EsODW2gg98yTrmhBFpUhXpAGoC/71c1vxVlj3jHcuhd778YK NRlj2tFWJGIBpmXrApo1Eg7qQRj4tzbOjthfgANWPq+EP68PgiOaxMDMMp7IUwju KrXEFcXlk5aHvfaJ06FSBRBnn45XMyGXPYV/76HsaqkBNmg1r7o02dZKy/0S84Fn YXoEFxHt6NFOJ52LiO95z7cQ0xeTSYygNCtYXJ70uyDMnrCPXCrKp7DRyka+vDPs RCrcpB1QjcCb3xTL//SPkNWM3oZEW9CawpRFh9bmqbw/h7ZjaUEuwNIJnunISulu 2fvOjUmFWfUVIARiwKFVuIkXzgf3cSLYZTtiDFC5/yBpkGVNnXqyO7YtWZwnrMHq L3DC3pOKuYXMa03KWGEzZoCZEXjtoRhRwCSgV7wbK5o90sZeRj/HC1zXEyDPPD/R 7A1rePTuwlAH3gHCJGhYkmYqULx62ZdvV6IC2N7xNxeTL1Y7OVNBT7ZUqexxY6WC OG1vVxUKNJIvBLYmc6cmQATgR6XH5/B9H2p1YRBLuAQuqHk+jqo= =ZQf3 -----END PGP SIGNATURE----- Merge 4.19.63 into android-4.19 Changes in 4.19.63 hvsock: fix epollout hang from race condition drm/panel: simple: Fix panel_simple_dsi_probe iio: adc: stm32-dfsdm: manage the get_irq error case iio: adc: stm32-dfsdm: missing error case during probe staging: vt6656: use meaningful error code during buffer allocation usb: core: hub: Disable hub-initiated U1/U2 tty: max310x: Fix invalid baudrate divisors calculator pinctrl: rockchip: fix leaked of_node references tty: serial: cpm_uart - fix init when SMC is relocated drm/amd/display: Fill prescale_params->scale for RGB565 drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE drm/amd/display: Disable ABM before destroy ABM struct drm/amdkfd: Fix a potential memory leak drm/amdkfd: Fix sdma queue map issue drm/edid: Fix a missing-check bug in drm_load_edid_firmware() PCI: Return error if cannot probe VF drm/bridge: tc358767: read display_props in get_modes() drm/bridge: sii902x: pixel clock unit is 10kHz instead of 1kHz gpu: host1x: Increase maximum DMA segment size drm/crc-debugfs: User irqsafe spinlock in drm_crtc_add_crc_entry drm/crc-debugfs: Also sprinkle irqrestore over early exits memstick: Fix error cleanup path of memstick_init tty/serial: digicolor: Fix digicolor-usart already registered warning tty: serial: msm_serial: avoid system lockup condition serial: 8250: Fix TX interrupt handling condition drm/amd/display: Always allocate initial connector state state drm/virtio: Add memory barriers for capset cache. phy: renesas: rcar-gen2: Fix memory leak at error paths drm/amd/display: fix compilation error powerpc/pseries/mobility: prevent cpu hotplug during DT update drm/rockchip: Properly adjust to a true clock in adjusted_mode serial: imx: fix locking in set_termios() tty: serial_core: Set port active bit in uart_port_activate usb: gadget: Zero ffs_io_data mmc: sdhci: sdhci-pci-o2micro: Check if controller supports 8-bit width powerpc/pci/of: Fix OF flags parsing for 64bit BARs drm/msm: Depopulate platform on probe failure serial: mctrl_gpio: Check if GPIO property exisits before requesting it PCI: sysfs: Ignore lockdep for remove attribute i2c: stm32f7: fix the get_irq error cases kbuild: Add -Werror=unknown-warning-option to CLANG_FLAGS genksyms: Teach parser about 128-bit built-in types PCI: xilinx-nwl: Fix Multi MSI data programming iio: iio-utils: Fix possible incorrect mask calculation powerpc/cacheflush: fix variable set but not used powerpc/xmon: Fix disabling tracing while in xmon recordmcount: Fix spurious mcount entries on powerpc mfd: madera: Add missing of table registration mfd: core: Set fwnode for created devices mfd: arizona: Fix undefined behavior mfd: hi655x-pmic: Fix missing return value check for devm_regmap_init_mmio_clk mm/swap: fix release_pages() when releasing devmap pages um: Silence lockdep complaint about mmap_sem powerpc/4xx/uic: clear pending interrupt after irq type/pol change RDMA/i40iw: Set queue pair state when being queried serial: sh-sci: Terminate TX DMA during buffer flushing serial: sh-sci: Fix TX DMA buffer flushing and workqueue races IB/mlx5: Fixed reporting counters on 2nd port for Dual port RoCE powerpc/mm: Handle page table allocation failures IB/ipoib: Add child to parent list only if device initialized arm64: assembler: Switch ESB-instruction with a vanilla nop if !ARM64_HAS_RAS PCI: mobiveil: Fix PCI base address in MEM/IO outbound windows PCI: mobiveil: Fix the Class Code field kallsyms: exclude kasan local symbols on s390 PCI: mobiveil: Initialize Primary/Secondary/Subordinate bus numbers PCI: mobiveil: Use the 1st inbound window for MEM inbound transactions perf test mmap-thread-lookup: Initialize variable to suppress memory sanitizer warning perf stat: Fix use-after-freed pointer detected by the smatch tool perf top: Fix potential NULL pointer dereference detected by the smatch tool perf session: Fix potential NULL pointer dereference found by the smatch tool perf annotate: Fix dereferencing freed memory found by the smatch tool perf hists browser: Fix potential NULL pointer dereference found by the smatch tool RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM PCI: dwc: pci-dra7xx: Fix compilation when !CONFIG_GPIOLIB powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h block: init flush rq ref count to 1 f2fs: avoid out-of-range memory access mailbox: handle failed named mailbox channel request dlm: check if workqueues are NULL before flushing/destroying powerpc/eeh: Handle hugepages in ioremap space block/bio-integrity: fix a memory leak bug sh: prevent warnings when using iounmap mm/kmemleak.c: fix check for softirq context 9p: pass the correct prototype to read_cache_page mm/gup.c: mark undo_dev_pagemap as __maybe_unused mm/gup.c: remove some BUG_ONs from get_gate_page() memcg, fsnotify: no oom-kill for remote memcg charging mm/mmu_notifier: use hlist_add_head_rcu() proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup proc: use down_read_killable mmap_sem for /proc/pid/pagemap proc: use down_read_killable mmap_sem for /proc/pid/clear_refs proc: use down_read_killable mmap_sem for /proc/pid/map_files cxgb4: reduce kernel stack usage in cudbg_collect_mem_region() proc: use down_read_killable mmap_sem for /proc/pid/maps locking/lockdep: Fix lock used or unused stats error mm: use down_read_killable for locking mmap_sem in access_remote_vm locking/lockdep: Hide unused 'class' variable usb: wusbcore: fix unbalanced get/put cluster_id usb: pci-quirks: Correct AMD PLL quirk detection btrfs: inode: Don't compress if NODATASUM or NODATACOW set x86/sysfb_efi: Add quirks for some devices with swapped width and height x86/speculation/mds: Apply more accurate check on hypervisor platform binder: prevent transactions to context manager from its own process. fpga-manager: altera-ps-spi: Fix build error mei: me: add mule creek canyon (EHL) device ids hpet: Fix division by zero in hpet_time_div() ALSA: ac97: Fix double free of ac97_codec_device ALSA: line6: Fix wrong altsetting for LINE6_PODHD500_1 ALSA: hda - Add a conexant codec entry to let mute led work powerpc/xive: Fix loop exit-condition in xive_find_target_in_mask() powerpc/tm: Fix oops on sigreturn on systems without TM libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl() access: avoid the RCU grace period for the temporary subjective credentials Linux 4.19.63 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ic31529aa6fd283d16d6bfb182187a9402a4db44f |
||
Konstantin Khlebnikov
|
b07687243d |
mm: use down_read_killable for locking mmap_sem in access_remote_vm
[ Upstream commit 1e426fe28261b03f297992e89da3320b42816f4e ] This function is used by ptrace and proc files like /proc/pid/cmdline and /proc/pid/environ. Access_remote_vm never returns error codes, all errors are ignored and only size of successfully read data is returned. So, if current task was killed we'll simply return 0 (bytes read). Mmap_sem could be locked for a long time or forever if something goes wrong. Using a killable lock permits cleanup of stuck tasks and simplifies investigation. Link: http://lkml.kernel.org/r/156007494202.3335.16782303099589302087.stgit@buzz Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: Michal Koutný <mkoutny@suse.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Jean-Philippe Brucker
|
a8c568fc48 |
mm/mmu_notifier: use hlist_add_head_rcu()
[ Upstream commit 543bdb2d825fe2400d6e951f1786d92139a16931 ]
Make mmu_notifier_register() safer by issuing a memory barrier before
registering a new notifier. This fixes a theoretical bug on weakly
ordered CPUs. For example, take this simplified use of notifiers by a
driver:
my_struct->mn.ops = &my_ops; /* (1) */
mmu_notifier_register(&my_struct->mn, mm)
...
hlist_add_head(&mn->hlist, &mm->mmu_notifiers); /* (2) */
...
Once mmu_notifier_register() releases the mm locks, another thread can
invalidate a range:
mmu_notifier_invalidate_range()
...
hlist_for_each_entry_rcu(mn, &mm->mmu_notifiers, hlist) {
if (mn->ops->invalidate_range)
The read side relies on the data dependency between mn and ops to ensure
that the pointer is properly initialized. But the write side doesn't have
any dependency between (1) and (2), so they could be reordered and the
readers could dereference an invalid mn->ops. mmu_notifier_register()
does take all the mm locks before adding to the hlist, but those have
acquire semantics which isn't sufficient.
By calling hlist_add_head_rcu() instead of hlist_add_head() we update the
hlist using a store-release, ensuring that readers see prior
initialization of my_struct. This situation is better illustated by
litmus test MP+onceassign+derefonce.
Link: http://lkml.kernel.org/r/20190502133532.24981-1-jean-philippe.brucker@arm.com
Fixes:
|
||
Andy Lutomirski
|
041b127df7 |
mm/gup.c: remove some BUG_ONs from get_gate_page()
[ Upstream commit b5d1c39f34d1c9bca0c4b9ae2e339fbbe264a9c7 ] If we end up without a PGD or PUD entry backing the gate area, don't BUG -- just fail gracefully. It's not entirely implausible that this could happen some day on x86. It doesn't right now even with an execute-only emulated vsyscall page because the fixmap shares the PUD, but the core mm code shouldn't rely on that particular detail to avoid OOPSing. Link: http://lkml.kernel.org/r/a1d9f4efb75b9d464e59fd6af00104b21c58f6f7.1561610798.git.luto@kernel.org Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Guenter Roeck
|
fa099d6ddf |
mm/gup.c: mark undo_dev_pagemap as __maybe_unused
[ Upstream commit 790c73690c2bbecb3f6f8becbdb11ddc9bcff8cc ] Several mips builds generate the following build warning. mm/gup.c:1788:13: warning: 'undo_dev_pagemap' defined but not used The function is declared unconditionally but only called from behind various ifdefs. Mark it __maybe_unused. Link: http://lkml.kernel.org/r/1562072523-22311-1-git-send-email-linux@roeck-us.net Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Dmitry Vyukov
|
071f2135cf |
mm/kmemleak.c: fix check for softirq context
[ Upstream commit 6ef9056952532c3b746de46aa10d45b4d7797bd8 ] in_softirq() is a wrong predicate to check if we are in a softirq context. It also returns true if we have BH disabled, so objects are falsely stamped with "softirq" comm. The correct predicate is in_serving_softirq(). If user does cat from /sys/kernel/debug/kmemleak previously they would see this, which is clearly wrong, this is system call context (see the comm): unreferenced object 0xffff88805bd661c0 (size 64): comm "softirq", pid 0, jiffies 4294942959 (age 12.400s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00 ................ 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................ backtrace: [<0000000007dcb30c>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<0000000007dcb30c>] slab_post_alloc_hook mm/slab.h:439 [inline] [<0000000007dcb30c>] slab_alloc mm/slab.c:3326 [inline] [<0000000007dcb30c>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 [<00000000969722b7>] kmalloc include/linux/slab.h:547 [inline] [<00000000969722b7>] kzalloc include/linux/slab.h:742 [inline] [<00000000969722b7>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline] [<00000000969722b7>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085 [<00000000a4134b5f>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475 [<00000000d20248ad>] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957 [<000000003d367be7>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246 [<000000003c7c76af>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616 [<000000000c1aeb23>] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130 [<000000000157b92b>] __sys_setsockopt+0x9e/0x120 net/socket.c:2078 [<00000000a9f3d058>] __do_sys_setsockopt net/socket.c:2089 [inline] [<00000000a9f3d058>] __se_sys_setsockopt net/socket.c:2086 [inline] [<00000000a9f3d058>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086 [<000000001b8da885>] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301 [<00000000ba770c62>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 now they will see this: unreferenced object 0xffff88805413c800 (size 64): comm "syz-executor.4", pid 8960, jiffies 4294994003 (age 14.350s) hex dump (first 32 bytes): 00 7a 8a 57 80 88 ff ff e0 00 00 01 00 00 00 00 .z.W............ 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................ backtrace: [<00000000c5d3be64>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<00000000c5d3be64>] slab_post_alloc_hook mm/slab.h:439 [inline] [<00000000c5d3be64>] slab_alloc mm/slab.c:3326 [inline] [<00000000c5d3be64>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 [<0000000023865be2>] kmalloc include/linux/slab.h:547 [inline] [<0000000023865be2>] kzalloc include/linux/slab.h:742 [inline] [<0000000023865be2>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline] [<0000000023865be2>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085 [<000000003029a9d4>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475 [<00000000ccd0a87c>] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957 [<00000000a85a3785>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246 [<00000000ec13c18d>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616 [<0000000052d748e3>] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130 [<00000000512f1014>] __sys_setsockopt+0x9e/0x120 net/socket.c:2078 [<00000000181758bc>] __do_sys_setsockopt net/socket.c:2089 [inline] [<00000000181758bc>] __se_sys_setsockopt net/socket.c:2086 [inline] [<00000000181758bc>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086 [<00000000d4b73623>] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301 [<00000000c1098bec>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Link: http://lkml.kernel.org/r/20190517171507.96046-1-dvyukov@gmail.com Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Ira Weiny
|
30edc7c1fe |
mm/swap: fix release_pages() when releasing devmap pages
[ Upstream commit c5d6c45e90c49150670346967971e14576afd7f1 ] release_pages() is an optimized version of a loop around put_page(). Unfortunately for devmap pages the logic is not entirely correct in release_pages(). This is because device pages can be more than type MEMORY_DEVICE_PUBLIC. There are in fact 4 types, private, public, FS DAX, and PCI P2PDMA. Some of these have specific needs to "put" the page while others do not. This logic to handle any special needs is contained in put_devmap_managed_page(). Therefore all devmap pages should be processed by this function where we can contain the correct logic for a page put. Handle all device type pages within release_pages() by calling put_devmap_managed_page() on all devmap pages. If put_devmap_managed_page() returns true the page has been put and we continue with the next page. A false return of put_devmap_managed_page() means the page did not require special processing and should fall to "normal" processing. This was found via code inspection while determining if release_pages() and the new put_user_pages() could be interchangeable.[1] [1] https://lkml.kernel.org/r/20190523172852.GA27175@iweiny-DESK2.sc.intel.com Link: https://lkml.kernel.org/r/20190605214922.17684-1-ira.weiny@intel.com Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Greg Kroah-Hartman
|
f232ce65ba |
This is the 4.19.62 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl09QMoACgkQONu9yGCS aT7bBQ//dApu9DLQ3tyfc+tucIc3ViOcif3zlfLWRqClcLXw1HF63/Q/u40kYuf1 TOnMkVXFWmpeGPgz2/tWw9TN+ibHaPBfyMoPGw/Ehs5dB8qn/F4AnLqkePFimIYQ +YOZTbvPqW/xSh8I79i6y2NxrZLGvO9FDY9YUblBKz4lKWc+HULXn9RmjtxlBOLE s6epoM0tWKPPeM5MyJIwT+9MZ0o3Cj89YNv+PqYLQSzcvbUbHDUiMt+DO4rqSyyf 6fQe4mSX4tk+tbviyEruvKH3kJg0rWS1Maw1/h+AOJ8Gmr2WZ+vYBorPEdwdYu5K eootc3Jkj4wtcPsvKtNzWVPqJxdd5j651vS8/bxA0DmOVQM7A8dUBWKtMJGGG7nM nIRvxoMo+kv9QE/DJpeQ/apE9WnBflSqw6/DYdqs11gTX4E+beR75aRoVD1ue0lS if5CfnM0BTiOO1rMdMo42rzcBfh2DLt5a18WXsMmYOaSMGEJuF0KjgaGc5E4N00A QlqIJ7PKk+Kb53jz5oLV2vB/SXNvAQRLAvMTMH2Mst4veDonYyiVTfp6C+r8DJYc hvyaF1FoCMTWV5XsINmM2mlJ2+/G0nV5kLDmVPUHiFxE0JXnPQ8iCx9a96Ub+Zim F//muwohNIUa6EEN6AwrEUsoyVZ7cP/aR5wHQ9PAY3RV5nis474= =zWcX -----END PGP SIGNATURE----- Merge 4.19.62 into android-4.19 Changes in 4.19.62 bnx2x: Prevent load reordering in tx completion processing caif-hsi: fix possible deadlock in cfhsi_exit_module() hv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback() igmp: fix memory leak in igmpv3_del_delrec() ipv4: don't set IPv6 only flags to IPv4 addresses ipv6: rt6_check should return NULL if 'from' is NULL ipv6: Unlink sibling route in case of failure net: bcmgenet: use promisc for unsupported filters net: dsa: mv88e6xxx: wait after reset deactivation net: make skb_dst_force return true when dst is refcounted net: neigh: fix multiple neigh timer scheduling net: openvswitch: fix csum updates for MPLS actions net: phy: sfp: hwmon: Fix scaling of RX power net: stmmac: Re-work the queue selection for TSO packets nfc: fix potential illegal memory access r8169: fix issue with confused RX unit after PHY power-down on RTL8411b rxrpc: Fix send on a connected, but unbound socket sctp: fix error handling on stream scheduler initialization sky2: Disable MSI on ASUS P6T tcp: be more careful in tcp_fragment() tcp: fix tcp_set_congestion_control() use from bpf hook tcp: Reset bytes_acked and bytes_received when disconnecting vrf: make sure skb->data contains ip header to make routing net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn macsec: fix use-after-free of skb during RX macsec: fix checksumming after decryption netrom: fix a memory leak in nr_rx_frame() netrom: hold sock when setting skb->destructor net_sched: unset TCQ_F_CAN_BYPASS when adding filters net/tls: make sure offload also gets the keys wiped sctp: not bind the socket in sctp_connect net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query net: bridge: don't cache ether dest pointer on input net: bridge: stp: don't cache eth dest pointer before skb pull dma-buf: balance refcount inbalance dma-buf: Discard old fence_excl on retrying get_fences_rcu for realloc gpio: davinci: silence error prints in case of EPROBE_DEFER MIPS: lb60: Fix pin mappings perf/core: Fix exclusive events' grouping perf/core: Fix race between close() and fork() ext4: don't allow any modifications to an immutable file ext4: enforce the immutable flag on open files mm: add filemap_fdatawait_range_keep_errors() jbd2: introduce jbd2_inode dirty range scoping ext4: use jbd2_inode dirty range scoping ext4: allow directory holes KVM: nVMX: do not use dangling shadow VMCS after guest reset KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested mm: vmscan: scan anonymous pages on file refaults net: sched: verify that q!=NULL before setting q->flags Linux 4.19.62 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I2eb23bda9d5294a5c874fe4f403934fd99e84661 |
||
Kuo-Hsin Yang
|
c1d98b766e |
mm: vmscan: scan anonymous pages on file refaults
commit 2c012a4ad1a2cd3fb5a0f9307b9d219f84eda1fa upstream. When file refaults are detected and there are many inactive file pages, the system never reclaim anonymous pages, the file pages are dropped aggressively when there are still a lot of cold anonymous pages and system thrashes. This issue impacts the performance of applications with large executable, e.g. chrome. With this patch, when file refault is detected, inactive_list_is_low() always returns true for file pages in get_scan_count() to enable scanning anonymous pages. The problem can be reproduced by the following test program. ---8<--- void fallocate_file(const char *filename, off_t size) { struct stat st; int fd; if (!stat(filename, &st) && st.st_size >= size) return; fd = open(filename, O_WRONLY | O_CREAT, 0600); if (fd < 0) { perror("create file"); exit(1); } if (posix_fallocate(fd, 0, size)) { perror("fallocate"); exit(1); } close(fd); } long *alloc_anon(long size) { long *start = malloc(size); memset(start, 1, size); return start; } long access_file(const char *filename, long size, long rounds) { int fd, i; volatile char *start1, *end1, *start2; const int page_size = getpagesize(); long sum = 0; fd = open(filename, O_RDONLY); if (fd == -1) { perror("open"); exit(1); } /* * Some applications, e.g. chrome, use a lot of executable file * pages, map some of the pages with PROT_EXEC flag to simulate * the behavior. */ start1 = mmap(NULL, size / 2, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0); if (start1 == MAP_FAILED) { perror("mmap"); exit(1); } end1 = start1 + size / 2; start2 = mmap(NULL, size / 2, PROT_READ, MAP_SHARED, fd, size / 2); if (start2 == MAP_FAILED) { perror("mmap"); exit(1); } for (i = 0; i < rounds; ++i) { struct timeval before, after; volatile char *ptr1 = start1, *ptr2 = start2; gettimeofday(&before, NULL); for (; ptr1 < end1; ptr1 += page_size, ptr2 += page_size) sum += *ptr1 + *ptr2; gettimeofday(&after, NULL); printf("File access time, round %d: %f (sec) ", i, (after.tv_sec - before.tv_sec) + (after.tv_usec - before.tv_usec) / 1000000.0); } return sum; } int main(int argc, char *argv[]) { const long MB = 1024 * 1024; long anon_mb, file_mb, file_rounds; const char filename[] = "large"; long *ret1; long ret2; if (argc != 4) { printf("usage: thrash ANON_MB FILE_MB FILE_ROUNDS "); exit(0); } anon_mb = atoi(argv[1]); file_mb = atoi(argv[2]); file_rounds = atoi(argv[3]); fallocate_file(filename, file_mb * MB); printf("Allocate %ld MB anonymous pages ", anon_mb); ret1 = alloc_anon(anon_mb * MB); printf("Access %ld MB file pages ", file_mb); ret2 = access_file(filename, file_mb * MB, file_rounds); printf("Print result to prevent optimization: %ld ", *ret1 + ret2); return 0; } ---8<--- Running the test program on 2GB RAM VM with kernel 5.2.0-rc5, the program fills ram with 2048 MB memory, access a 200 MB file for 10 times. Without this patch, the file cache is dropped aggresively and every access to the file is from disk. $ ./thrash 2048 200 10 Allocate 2048 MB anonymous pages Access 200 MB file pages File access time, round 0: 2.489316 (sec) File access time, round 1: 2.581277 (sec) File access time, round 2: 2.487624 (sec) File access time, round 3: 2.449100 (sec) File access time, round 4: 2.420423 (sec) File access time, round 5: 2.343411 (sec) File access time, round 6: 2.454833 (sec) File access time, round 7: 2.483398 (sec) File access time, round 8: 2.572701 (sec) File access time, round 9: 2.493014 (sec) With this patch, these file pages can be cached. $ ./thrash 2048 200 10 Allocate 2048 MB anonymous pages Access 200 MB file pages File access time, round 0: 2.475189 (sec) File access time, round 1: 2.440777 (sec) File access time, round 2: 2.411671 (sec) File access time, round 3: 1.955267 (sec) File access time, round 4: 0.029924 (sec) File access time, round 5: 0.000808 (sec) File access time, round 6: 0.000771 (sec) File access time, round 7: 0.000746 (sec) File access time, round 8: 0.000738 (sec) File access time, round 9: 0.000747 (sec) Checked the swap out stats during the test [1], 19006 pages swapped out with this patch, 3418 pages swapped out without this patch. There are more swap out, but I think it's within reasonable range when file backed data set doesn't fit into the memory. $ ./thrash 2000 100 2100 5 1 # ANON_MB FILE_EXEC FILE_NOEXEC ROUNDS PROCESSES Allocate 2000 MB anonymous pages active_anon: 1613644, inactive_anon: 348656, active_file: 892, inactive_file: 1384 (kB) pswpout: 7972443, pgpgin: 478615246 Access 100 MB executable file pages Access 2100 MB regular file pages File access time, round 0: 12.165, (sec) active_anon: 1433788, inactive_anon: 478116, active_file: 17896, inactive_file: 24328 (kB) File access time, round 1: 11.493, (sec) active_anon: 1430576, inactive_anon: 477144, active_file: 25440, inactive_file: 26172 (kB) File access time, round 2: 11.455, (sec) active_anon: 1427436, inactive_anon: 476060, active_file: 21112, inactive_file: 28808 (kB) File access time, round 3: 11.454, (sec) active_anon: 1420444, inactive_anon: 473632, active_file: 23216, inactive_file: 35036 (kB) File access time, round 4: 11.479, (sec) active_anon: 1413964, inactive_anon: 471460, active_file: 31728, inactive_file: 32224 (kB) pswpout: 7991449 (+ 19006), pgpgin: 489924366 (+ 11309120) With 4 processes accessing non-overlapping parts of a large file, 30316 pages swapped out with this patch, 5152 pages swapped out without this patch. The swapout number is small comparing to pgpgin. [1]: https://github.com/vovo/testing/blob/master/mem_thrash.c Link: http://lkml.kernel.org/r/20190701081038.GA83398@google.com Fixes: |
||
Ross Zwisler
|
4becd6c11e |
mm: add filemap_fdatawait_range_keep_errors()
commit aa0bfcd939c30617385ffa28682c062d78050eba upstream. In the spirit of filemap_fdatawait_range() and filemap_fdatawait_keep_errors(), introduce filemap_fdatawait_range_keep_errors() which both takes a range upon which to wait and does not clear errors from the address space. Signed-off-by: Ross Zwisler <zwisler@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Greg Kroah-Hartman
|
5ad6eeba58 |
This is the 4.19.58 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0lmYwACgkQONu9yGCS aT4h5w//ZG0BYEwxoa4Qc8rwvncnk78miK/VRH5JVTiToDqTuttHZQoMp+NLD2fQ V679f/2+VqEPn8o6yJsrbM8uea0iIratI8U6L2OEt6TKPbar3CPcRUPJeqlPWkej tf3qjAtvNNjLcl7xCYt9JNvpF4RwA8rLWWP5hZyYMi7xcMiB0FOriTlVJYHJ0PLK Iqg+edkBxKwx7mvFlZnJkT0ln5hCqT4QBq2XrOYGUfy2Ans5Ytg5dhhp41QDD6iu oE4mS+fybCzNOR3BWl7pfpeJRg8TKq4XNzYsQr9ftt2e3OZxOi3Jg+RLsgzjJB9P 1aTsuSzSeMXVGrAwRpBAot7TC+8F88sci0gibh4pg5N0ujGdvRW4gyzYHtdKhsTc wmjYMKbAxJWwz0vkRp1aSnUMSRur4Wo3qCWaOWpjkP4xhSBTTER5e5cqeuVSWde5 FaD8s0yjnQsUaH3oxZ7zDL//MR0N+C4Izs9c2A8HkdksWTdTvI7YX8c766iIZgrm JFV0FIZYIHAyuXT04W9n3VSvV4tLS+ouwYZpgG09oK0lBA8NT6RyZWzijY3VE0ed Kl+t6iu02qZgZrvnq4pHUVnLQtw7KfyL3mzeljVxEeaTbGODPOJfypY1OMfhWYw+ dIlmsmfa2aANf5wttl8CjLkAIIG3JmuWO2exMQidvXlGCE+rKVM= =u7q2 -----END PGP SIGNATURE----- Merge 4.19.58 into android-4.19 Changes in 4.19.58 Bluetooth: Fix faulty expression for minimum encryption key size check block: Fix a NULL pointer dereference in generic_make_request() md/raid0: Do not bypass blocking queue entered for raid0 bios netfilter: nf_flow_table: ignore DF bit setting netfilter: nft_flow_offload: set liberal tracking mode for tcp netfilter: nft_flow_offload: don't offload when sequence numbers need adjustment netfilter: nft_flow_offload: IPCB is only valid for ipv4 family ASoC : cs4265 : readable register too low ASoC: ak4458: add return value for ak4458_probe ASoC: soc-pcm: BE dai needs prepare when pause release after resume ASoC: ak4458: rstn_control - return a non-zero on error only spi: bitbang: Fix NULL pointer dereference in spi_unregister_master drm/mediatek: fix unbind functions drm/mediatek: unbind components in mtk_drm_unbind() drm/mediatek: call drm_atomic_helper_shutdown() when unbinding driver drm/mediatek: clear num_pipes when unbind driver drm/mediatek: call mtk_dsi_stop() after mtk_drm_crtc_atomic_disable() ASoC: max98090: remove 24-bit format support if RJ is 0 ASoC: sun4i-i2s: Fix sun8i tx channel offset mask ASoC: sun4i-i2s: Add offset to RX channel select x86/CPU: Add more Icelake model numbers usb: gadget: fusb300_udc: Fix memory leak of fusb300->ep[i] usb: gadget: udc: lpc32xx: allocate descriptor with GFP_ATOMIC ALSA: hdac: fix memory release for SST and SOF drivers SoC: rt274: Fix internal jack assignment in set_jack callback scsi: hpsa: correct ioaccel2 chaining drm: panel-orientation-quirks: Add quirk for GPD pocket2 drm: panel-orientation-quirks: Add quirk for GPD MicroPC platform/x86: asus-wmi: Only Tell EC the OS will handle display hotkeys from asus_nb_wmi platform/x86: intel-vbtn: Report switch events when event wakes device platform/x86: mlx-platform: Fix parent device in i2c-mux-reg device registration platform/mellanox: mlxreg-hotplug: Add devm_free_irq call to remove flow i2c: pca-platform: Fix GPIO lookup code cpuset: restore sanity to cpuset_cpus_allowed_fallback() scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE mm/mlock.c: change count_mm_mlocked_page_nr return type tracing: avoid build warning with HAVE_NOP_MCOUNT module: Fix livepatch/ftrace module text permissions race ftrace: Fix NULL pointer dereference in free_ftrace_func_mapper() drm/i915/dmc: protect against reading random memory ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME crypto: user - prevent operating on larval algorithms crypto: cryptd - Fix skcipher instance memory leak ALSA: seq: fix incorrect order of dest_client/dest_ports arguments ALSA: firewire-lib/fireworks: fix miss detection of received MIDI messages ALSA: line6: Fix write on zero-sized buffer ALSA: usb-audio: fix sign unintended sign extension on left shifts ALSA: hda/realtek: Add quirks for several Clevo notebook barebones ALSA: hda/realtek - Change front mic location for Lenovo M710q lib/mpi: Fix karactx leak in mpi_powm fs/userfaultfd.c: disable irqs for fault_pending and event locks tracing/snapshot: Resize spare buffer if size changed ARM: dts: armada-xp-98dx3236: Switch to armada-38x-uart serial node arm64: kaslr: keep modules inside module region when KASAN is enabled drm/amd/powerplay: use hardware fan control if no powerplay fan table drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE drm/etnaviv: add missing failure path to destroy suballoc drm/imx: notify drm core before sending event during crtc disable drm/imx: only send event on crtc disable if kept disabled ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code() mm/vmscan.c: prevent useless kswapd loops btrfs: Ensure replaced device doesn't have pending chunk allocation tty: rocket: fix incorrect forward declaration of 'rp_init()' mlxsw: spectrum: Handle VLAN device unlinking net/smc: move unhash before release of clcsock media: s5p-mfc: fix incorrect bus assignment in virtual child device drm/fb-helper: generic: Don't take module ref for fbcon f2fs: don't access node/meta inode mapping after iput mac80211: mesh: fix missing unlock on error in table_path_del() scsi: tcmu: fix use after free selftests: fib_rule_tests: Fix icmp proto with ipv6 x86/boot/compressed/64: Do not corrupt EDX on EFER.LME=1 setting net: hns: Fixes the missing put_device in positive leg for roce reset ALSA: hda: Initialize power_state field properly rds: Fix warning. ip6: fix skb leak in ip6frag_expire_frag_queue() netfilter: ipv6: nf_defrag: fix leakage of unqueued fragments sc16is7xx: move label 'err_spi' to correct section net: hns: fix unsigned comparison to less than zero bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K netfilter: ipv6: nf_defrag: accept duplicate fragments again KVM: x86: degrade WARN to pr_warn_ratelimited KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC nfsd: Fix overflow causing non-working mounts on 1 TB machines svcrdma: Ignore source port when computing DRC hash MIPS: Fix bounds check virt_addr_valid MIPS: Add missing EHB in mtc0 -> mfc0 sequence. MIPS: have "plain" make calls build dtbs for selected platforms dmaengine: qcom: bam_dma: Fix completed descriptors count dmaengine: imx-sdma: remove BD_INTR for channel0 Linux 4.19.58 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Shakeel Butt
|
27ce6c2675 |
mm/vmscan.c: prevent useless kswapd loops
commit dffcac2cb88e4ec5906235d64a83d802580b119e upstream.
In production we have noticed hard lockups on large machines running
large jobs due to kswaps hoarding lru lock within isolate_lru_pages when
sc->reclaim_idx is 0 which is a small zone. The lru was couple hundred
GiBs and the condition (page_zonenum(page) > sc->reclaim_idx) in
isolate_lru_pages() was basically skipping GiBs of pages while holding
the LRU spinlock with interrupt disabled.
On further inspection, it seems like there are two issues:
(1) If kswapd on the return from balance_pgdat() could not sleep (i.e.
node is still unbalanced), the classzone_idx is unintentionally set
to 0 and the whole reclaim cycle of kswapd will try to reclaim only
the lowest and smallest zone while traversing the whole memory.
(2) Fundamentally isolate_lru_pages() is really bad when the
allocation has woken kswapd for a smaller zone on a very large machine
running very large jobs. It can hoard the LRU spinlock while skipping
over 100s of GiBs of pages.
This patch only fixes (1). (2) needs a more fundamental solution. To
fix (1), in the kswapd context, if pgdat->kswapd_classzone_idx is
invalid use the classzone_idx of the previous kswapd loop otherwise use
the one the waker has requested.
Link: http://lkml.kernel.org/r/20190701201847.251028-1-shakeelb@google.com
Fixes:
|
||
swkhack
|
79fccb9815 |
mm/mlock.c: change count_mm_mlocked_page_nr return type
[ Upstream commit 0874bb49bb21bf24deda853e8bf61b8325e24bcb ]
On a 64-bit machine the value of "vma->vm_end - vma->vm_start" may be
negative when using 32 bit ints and the "count >> PAGE_SHIFT"'s result
will be wrong. So change the local variable and return value to
unsigned long to fix the problem.
Link: http://lkml.kernel.org/r/20190513023701.83056-1-swkhack@gmail.com
Fixes:
|
||
Greg Kroah-Hartman
|
5b2dde5e0b |
This is the 4.19.57 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0cjioACgkQONu9yGCS aT4TNg//Sr2cN3HmcbJrjfNAifpjT1XRix0Qy0EOYMhieCh26SbHyB0yo/N0UMCK iGv4ThqoBE+goK9bfb1F4CL0iMo88RM11lTy7UbemSQg2+MNJb8mvaq8YkpexTdw SRgXT1kyOPoHVGCypTgQcKHLdLAuOkQQGCxccU0n+Vc006nLPI0b9yRvgUnUwzvY EO9zLSfMLQhCcsLVoXLqaJ0AeU+VG5mkILjHZjcNElT+0T/LwoPO+VBLkuQt3KLp BWe+N11xsc2ZR53jptpl9UU2aaUGIKeYttKgwj7rcqUuigk4hQ0AIZmZuQWzhgBu 6ERnKRgKARKQt4igxL5IsbIJiSK4/VJvuaR+26Sobc6zfDPQ0qfOuJaZeLYQjRQe SXjLNXzozA1SV593o1atLhFeY+tGMRQ4dlFCE9x/gJ68v5dya+f0e7X+zP8+HV+v u7pfgHT3Jb43D/G6H4sHE0VZZF4vh3Ba675Xp4NzOQOaFHJtQQUPCROiyYjJF6+H 2fgkwsokE8oFPgqWrYuOIzV9t5THjSNqhT7lyZ/LNDJiMTnJytqfQ01zbHoaHCAb i5QB09x+72L7L/U9B9BGH+zEPTC2myw3dKmMv7kUxNx/3QKVDb/6cVnLnTWs4zrJ lw52HzgB2aV8pRtvgg0OeHedJ8UGVYfVq2/YHUHbiukgZ61n3J8= =OkFp -----END PGP SIGNATURE----- Merge 4.19.57 into android-4.19 Changes in 4.19.57 perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul perf help: Remove needless use of strncpy() perf header: Fix unchecked usage of strncpy() arm64: Don't unconditionally add -Wno-psabi to KBUILD_CFLAGS Revert "x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP" IB/hfi1: Close PSM sdma_progress sleep window 9p/xen: fix check for xenbus_read error in front_probe 9p: Use a slab for allocating requests 9p: embed fcall in req to round down buffer allocs 9p: add a per-client fcall kmem_cache 9p: rename p9_free_req() function 9p: Add refcount to p9_req_t 9p/rdma: do not disconnect on down_interruptible EAGAIN 9p: Rename req to rreq in trans_fd 9p: acl: fix uninitialized iattr access 9p/rdma: remove useless check in cm_event_handler 9p: p9dirent_read: check network-provided name length 9p: potential NULL dereference 9p/trans_fd: abort p9_read_work if req status changed 9p/trans_fd: put worker reqs on destroy net/9p: include trans_common.h to fix missing prototype warning. qmi_wwan: Fix out-of-bounds read Revert "usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup" usb: dwc3: gadget: combine unaligned and zero flags usb: dwc3: gadget: track number of TRBs per request usb: dwc3: gadget: use num_trbs when skipping TRBs on ->dequeue() usb: dwc3: gadget: extract dwc3_gadget_ep_skip_trbs() usb: dwc3: gadget: introduce cancelled_list usb: dwc3: gadget: move requests to cancelled_list usb: dwc3: gadget: remove wait_end_transfer usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup fs/proc/array.c: allow reporting eip/esp for all coredumping threads mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask fs/binfmt_flat.c: make load_flat_shared_library() work clk: socfpga: stratix10: fix divider entry for the emac clocks mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge mm/page_idle.c: fix oops because end_pfn is larger than max_pfn dm log writes: make sure super sector log updates are written in order scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck() x86/speculation: Allow guests to use SSBD even if host does not x86/microcode: Fix the microcode load on CPU hotplug for real x86/resctrl: Prevent possible overrun during bitmap operations KVM: x86/mmu: Allocate PAE root array when using SVM's 32-bit NPT NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O cpu/speculation: Warn on unsupported mitigations= parameter SUNRPC: Clean up initialisation of the struct rpc_rqst irqchip/mips-gic: Use the correct local interrupt map registers eeprom: at24: fix unexpected timeout under high load af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET bonding: Always enable vlan tx offload ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop net/packet: fix memory leak in packet_set_ring() net: remove duplicate fetch in sock_getsockopt net: stmmac: fixed new system time seconds value calculation net: stmmac: set IC bit when transmitting frames with HW timestamp sctp: change to hold sk after auth shkey is created successfully team: Always enable vlan tx offload tipc: change to use register_pernet_device tipc: check msg->req data len in tipc_nl_compat_bearer_disable tun: wake up waitqueues after IFF_UP is set bpf: simplify definition of BPF_FIB_LOOKUP related flags bpf: lpm_trie: check left child of last leftmost node for NULL bpf: fix nested bpf tracepoints with per-cpu data bpf: fix unconnected udp hooks bpf: udp: Avoid calling reuseport's bpf_prog from udp_gro bpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err arm64: futex: Avoid copying out uninitialised stack in failed cmpxchg() bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd futex: Update comments and docs about return values of arch futex code RDMA: Directly cast the sockaddr union to sockaddr tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb usb: dwc3: Reset num_trbs after skipping arm64: insn: Fix ldadd instruction encoding Linux 4.19.57 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Colin Ian King
|
87cf811ab6 |
mm/page_idle.c: fix oops because end_pfn is larger than max_pfn
commit 7298e3b0a149c91323b3205d325e942c3b3b9ef6 upstream.
Currently the calcuation of end_pfn can round up the pfn number to more
than the actual maximum number of pfns, causing an Oops. Fix this by
ensuring end_pfn is never more than max_pfn.
This can be easily triggered when on systems where the end_pfn gets
rounded up to more than max_pfn using the idle-page stress-ng stress test:
sudo stress-ng --idle-page 0
BUG: unable to handle kernel paging request at 00000000000020d8
#PF error: [normal kernel read fault]
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 1 PID: 11039 Comm: stress-ng-idle- Not tainted 5.0.0-5-generic #6-Ubuntu
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:page_idle_get_page+0xc8/0x1a0
Code: 0f b1 0a 75 7d 48 8b 03 48 89 c2 48 c1 e8 33 83 e0 07 48 c1 ea 36 48 8d 0c 40 4c 8d 24 88 49 c1 e4 07 4c 03 24 d5 00 89 c3 be <49> 8b 44 24 58 48 8d b8 80 a1 02 00 e8 07 d5 77 00 48 8b 53 08 48
RSP: 0018:ffffafd7c672fde8 EFLAGS: 00010202
RAX: 0000000000000005 RBX: ffffe36341fff700 RCX: 000000000000000f
RDX: 0000000000000284 RSI: 0000000000000275 RDI: 0000000001fff700
RBP: ffffafd7c672fe00 R08: ffffa0bc34056410 R09: 0000000000000276
R10: ffffa0bc754e9b40 R11: ffffa0bc330f6400 R12: 0000000000002080
R13: ffffe36341fff700 R14: 0000000000080000 R15: ffffa0bc330f6400
FS: 00007f0ec1ea5740(0000) GS:ffffa0bc7db00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000020d8 CR3: 0000000077d68000 CR4: 00000000000006e0
Call Trace:
page_idle_bitmap_write+0x8c/0x140
sysfs_kf_bin_write+0x5c/0x70
kernfs_fop_write+0x12e/0x1b0
__vfs_write+0x1b/0x40
vfs_write+0xab/0x1b0
ksys_write+0x55/0xc0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x5a/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Link: http://lkml.kernel.org/r/20190618124352.28307-1-colin.king@canonical.com
Fixes:
|
||
Naoya Horiguchi
|
1192fb703d |
mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge
commit faf53def3b143df11062d87c12afe6afeb6f8cc7 upstream.
madvise(MADV_SOFT_OFFLINE) often returns -EBUSY when calling soft offline
for hugepages with overcommitting enabled. That was caused by the
suboptimal code in current soft-offline code. See the following part:
ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
MIGRATE_SYNC, MR_MEMORY_FAILURE);
if (ret) {
...
} else {
/*
* We set PG_hwpoison only when the migration source hugepage
* was successfully dissolved, because otherwise hwpoisoned
* hugepage remains on free hugepage list, then userspace will
* find it as SIGBUS by allocation failure. That's not expected
* in soft-offlining.
*/
ret = dissolve_free_huge_page(page);
if (!ret) {
if (set_hwpoison_free_buddy_page(page))
num_poisoned_pages_inc();
}
}
return ret;
Here dissolve_free_huge_page() returns -EBUSY if the migration source page
was freed into buddy in migrate_pages(), but even in that case we actually
has a chance that set_hwpoison_free_buddy_page() succeeds. So that means
current code gives up offlining too early now.
dissolve_free_huge_page() checks that a given hugepage is suitable for
dissolving, where we should return success for !PageHuge() case because
the given hugepage is considered as already dissolved.
This change also affects other callers of dissolve_free_huge_page(), which
are cleaned up together.
[n-horiguchi@ah.jp.nec.com: v3]
Link: http://lkml.kernel.org/r/1560761476-4651-3-git-send-email-n-horiguchi@ah.jp.nec.comLink: http://lkml.kernel.org/r/1560154686-18497-3-git-send-email-n-horiguchi@ah.jp.nec.com
Fixes:
|
||
Naoya Horiguchi
|
aab6291888 |
mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails
commit b38e5962f8ed0d2a2b28a887fc2221f7f41db119 upstream.
The pass/fail of soft offline should be judged by checking whether the
raw error page was finally contained or not (i.e. the result of
set_hwpoison_free_buddy_page()), but current code do not work like
that. It might lead us to misjudge the test result when
set_hwpoison_free_buddy_page() fails.
Without this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may
not offline the original page and will not return an error.
Link: http://lkml.kernel.org/r/1560154686-18497-2-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Fixes:
|
||
zhong jiang
|
49e9b499a3 |
mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask
commit 29b190fa774dd1b72a1a6f19687d55dc72ea83be upstream. mpol_rebind_nodemask() is called for MPOL_BIND and MPOL_INTERLEAVE mempoclicies when the tasks's cpuset's mems_allowed changes. For policies created without MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES, it works by remapping the policy's allowed nodes (stored in v.nodes) using the previous value of mems_allowed (stored in w.cpuset_mems_allowed) as the domain of map and the new mems_allowed (passed as nodes) as the range of the map (see the comment of bitmap_remap() for details). The result of remapping is stored back as policy's nodemask in v.nodes, and the new value of mems_allowed should be stored in w.cpuset_mems_allowed to facilitate the next rebind, if it happens. However, |
||
Greg Kroah-Hartman
|
237d383597 |
This is the 4.19.54 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0Nx3oACgkQONu9yGCS aT7hJRAAzdu1EKr/VUiFJshvPL/+veH1MLdMgafW6gXcoQCQSdMuv1j3mv3axElC dz2e/nHXvIhMKMrhzgEvVwV8m8eKPwM4IZjTJ8ji8st9uy0R2UIdbPUHrLtRAQzI bK2BIk6697B9/9z2s8nDXhKex9EtZ2vj8DeQLT1AgQKMM0+abPA2YR9+cp9HEjQR Wa5NjajwWgpty1VVk+Q6GMD+GBnEILyqxtVGv30z/prWoVadFAXJEZrWsaXotUvh ySc2CS9Iu/muiIegTSStnNXaWaawZblA4JdDcdRJ235rPK8sbACYtDgZispYopi4 y/9kUtuZzWEv61aFZkXd13jKZ8pJsrLZLoc+yDqew0IA6M2Hz7d7Pq/UWQ0m4qnt rCcU1vTFooSp/IHOToXN0oQQrTqD2oK5zO4V9S5McwQniuO/RqUPJfX/sKjtrvNT SI/yLVKMXImAwVXxNEfO4bE+T9F/QYptOHdpfqJkPvpKLzjxnPBPswG8poIKwQzJ w3p1AoQ1ISuW6b94a/nI5nSC28gVslVg+oTjRjF7kfIcOtvglX79XPvfdM5bB2DC qD51A8veEGCtAu57/tSyisGOomqTqqqbaiYXwuhgTI1NSszbqKDLNvjsw1GKj0rK mSmDzVMgQhxaHOagOeDqLYlj1zgTamx5R6+dretf8AOXyEE2RSg= =+MJy -----END PGP SIGNATURE----- Merge 4.19.54 into android-4.19 Changes in 4.19.54 ax25: fix inconsistent lock state in ax25_destroy_timer be2net: Fix number of Rx queues used for flow hashing hv_netvsc: Set probe mode to sync ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero lapb: fixed leak of control-blocks. neigh: fix use-after-free read in pneigh_get_next net: dsa: rtl8366: Fix up VLAN filtering net: openvswitch: do not free vport if register_netdevice() is failed. nfc: Ensure presence of required attributes in the deactivate_target handler sctp: Free cookie before we memdup a new one sunhv: Fix device naming inconsistency between sunhv_console and sunhv_reg tipc: purge deferredq list for each grp member in tipc_group_delete vsock/virtio: set SOCK_DONE on peer shutdown net/mlx5: Avoid reloading already removed devices net: mvpp2: prs: Fix parser range for VID filtering net: mvpp2: prs: Use the correct helpers when removing all VID filters Staging: vc04_services: Fix a couple error codes perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints netfilter: nf_queue: fix reinject verdict handling ipvs: Fix use-after-free in ip_vs_in selftests: netfilter: missing error check when setting up veth interface clk: ti: clkctrl: Fix clkdm_clk handling powerpc/powernv: Return for invalid IMC domain usb: xhci: Fix a potential null pointer dereference in xhci_debugfs_create_endpoint() mISDN: make sure device name is NUL terminated x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor perf/ring_buffer: Fix exposing a temporarily decreased data_head perf/ring_buffer: Add ordering to rb->nest increment perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb->user_page data gpio: fix gpio-adp5588 build errors net: stmmac: update rx tail pointer register to fix rx dma hang issue. net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE() ACPI/PCI: PM: Add missing wakeup.flags.valid checks drm/etnaviv: lock MMU while dumping core net: aquantia: tx clean budget logic error net: aquantia: fix LRO with FCS error i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr ALSA: hda - Force polling mode on CNL for fixing codec communication configfs: Fix use-after-free when accessing sd->s_dentry perf data: Fix 'strncat may truncate' build failure with recent gcc perf namespace: Protect reading thread's namespace perf record: Fix s390 missing module symbol and warning for non-root users ia64: fix build errors by exporting paddr_to_nid() xen/pvcalls: Remove set but not used variable xenbus: Avoid deadlock during suspend due to open transactions KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu arm64: fix syscall_fn_t type arm64: use the correct function type in SYSCALL_DEFINE0 arm64: use the correct function type for __arm64_sys_ni_syscall net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs net: phylink: ensure consistent phy interface mode net: phy: dp83867: Set up RGMII TX delay scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route() scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask scsi: scsi_dh_alua: Fix possible null-ptr-deref scsi: libsas: delete sas port if expander discover failed mlxsw: spectrum: Prevent force of 56G ocfs2: fix error path kobject memory leak coredump: fix race condition between collapse_huge_page() and core dumping Abort file_remove_privs() for non-reg. files Linux 4.19.54 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Andrea Arcangeli
|
465ce9a50f |
coredump: fix race condition between collapse_huge_page() and core dumping
commit 59ea6d06cfa9247b586a695c21f94afa7183af74 upstream.
When fixing the race conditions between the coredump and the mmap_sem
holders outside the context of the process, we focused on
mmget_not_zero()/get_task_mm() callers in 04f5866e41fb70 ("coredump: fix
race condition between mmget_not_zero()/get_task_mm() and core
dumping"), but those aren't the only cases where the mmap_sem can be
taken outside of the context of the process as Michal Hocko noticed
while backporting that commit to older -stable kernels.
If mmgrab() is called in the context of the process, but then the
mm_count reference is transferred outside the context of the process,
that can also be a problem if the mmap_sem has to be taken for writing
through that mm_count reference.
khugepaged registration calls mmgrab() in the context of the process,
but the mmap_sem for writing is taken later in the context of the
khugepaged kernel thread.
collapse_huge_page() after taking the mmap_sem for writing doesn't
modify any vma, so it's not obvious that it could cause a problem to the
coredump, but it happens to modify the pmd in a way that breaks an
invariant that pmd_trans_huge_lock() relies upon. collapse_huge_page()
needs the mmap_sem for writing just to block concurrent page faults that
call pmd_trans_huge_lock().
Specifically the invariant that "!pmd_trans_huge()" cannot become a
"pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.
The coredump will call __get_user_pages() without mmap_sem for reading,
which eventually can invoke a lockless page fault which will need a
functional pmd_trans_huge_lock().
So collapse_huge_page() needs to use mmget_still_valid() to check it's
not running concurrently with the coredump... as long as the coredump
can invoke page faults without holding the mmap_sem for reading.
This has "Fixes: khugepaged" to facilitate backporting, but in my view
it's more a bug in the coredump code that will eventually have to be
rewritten to stop invoking page faults without the mmap_sem for reading.
So the long term plan is still to drop all mmget_still_valid().
Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com
Fixes:
|
||
Greg Kroah-Hartman
|
f613e8938d |
This is the 4.19.53 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0J4rgACgkQONu9yGCS aT6JVhAAouXsxBjKDxMlJkMrcCFPVHTgWyrHMJ0aeV3jWACGoKbd+RBZtEoNrD4n X5/AJdDMvSPuPMWqE8BO/+Q1e4YQYRoHQyQ24aRkq66a8+UpybRMpshFb+tfGQO+ ysVypoBg97Y+8ao0MFKxu79+rrJor566f8jXznrxvFG3MnI0Kq3BBGq59UcQTu4z e1jd2OJjQxttZovOlA6v9rRTdHdzoI/84DZfYgqV1pUoC8pwjg7UO0+1cwnKlqem No4iWGP4lmfXN/VuzRpBs9a6kqE14GL5Ah13U3uhYXk9Dd2iLkS2fPKPiaDpiHql /SIL/3q4BCJVDZePIZuLoRe9qKLDh7rdNziIDhSUskW1H1tHAhsJUktWo0HarZW2 6Z5K10S/MsXAxWoFnY1TMBhZg+h+f/pHxy6FqZdvV4DGsx7uDWoVhn+o4Qw5WTSG 7KGMvvyDE7EyciwXVkQoN3hwX9pcxBwDtMswzSbW1VM9h3nYOzfY6OwSc/GDbM+f 9zQb+sgLIu1K5FUqyYNMfV/YXGGHpml181ZRWm3bZ4N63odDKMwzWfkc5RlIlDoM kqmMZKvpTSA4MeRzGzcUMsIpXv8SeNvR3cotCBqau+IRAIkLX8tZClJL+4FilLIN PkOjXtuEt648ZDybVGZykmAPIVuWreosWdtIH79ZOmnpveRQN7I= =DiXP -----END PGP SIGNATURE----- Merge 4.19.53 into android-4.19 Changes in 4.19.53 drm/nouveau: add kconfig option to turn off nouveau legacy contexts. (v3) nouveau: Fix build with CONFIG_NOUVEAU_LEGACY_CTX_SUPPORT disabled HID: multitouch: handle faulty Elo touch device HID: wacom: Don't set tool type until we're in range HID: wacom: Don't report anything prior to the tool entering range HID: wacom: Send BTN_TOUCH in response to INTUOSP2_BT eraser contact HID: wacom: Correct button numbering 2nd-gen Intuos Pro over Bluetooth HID: wacom: Sync INTUOSP2_BT touch state after each frame if necessary Revert "ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops" ALSA: oxfw: allow PCM capture for Stanton SCS.1m ALSA: hda/realtek - Update headset mode for ALC256 ALSA: firewire-motu: fix destruction of data for isochronous resources libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node fs/ocfs2: fix race in ocfs2_dentry_attach_lock() mm/vmscan.c: fix trying to reclaim unevictable LRU page signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO ptrace: restore smp_rmb() in __ptrace_may_access() iommu/arm-smmu: Avoid constant zero in TLBI writes i2c: acorn: fix i2c warning bcache: fix stack corruption by PRECEDING_KEY() bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css() ASoC: cs42xx8: Add regcache mask dirty ASoC: fsl_asrc: Fix the issue about unsupported rate drm/i915/sdvo: Implement proper HDMI audio support for SDVO x86/uaccess, kcov: Disable stack protector ALSA: seq: Protect in-kernel ioctl calls with mutex ALSA: seq: Fix race of get-subscription call vs port-delete ioctls Revert "ALSA: seq: Protect in-kernel ioctl calls with mutex" s390/kasan: fix strncpy_from_user kasan checks Drivers: misc: fix out-of-bounds access in function param_set_kgdbts_var f2fs: fix to avoid accessing xattr across the boundary scsi: qedi: remove memset/memcpy to nfunc and use func instead scsi: qedi: remove set but not used variables 'cdev' and 'udev' scsi: lpfc: correct rcu unlock issue in lpfc_nvme_info_show scsi: lpfc: add check for loss of ndlp when sending RRQ arm64/mm: Inhibit huge-vmap with ptdump nvme: fix srcu locking on error return in nvme_get_ns_from_disk nvme: remove the ifdef around nvme_nvm_ioctl nvme: merge nvme_ns_ioctl into nvme_ioctl nvme: release namespace SRCU protection before performing controller ioctls nvme: fix memory leak for power latency tolerance platform/x86: pmc_atom: Add Lex 3I380D industrial PC to critclk_systems DMI table platform/x86: pmc_atom: Add several Beckhoff Automation boards to critclk_systems DMI table scsi: bnx2fc: fix incorrect cast to u64 on shift operation libnvdimm: Fix compilation warnings with W=1 selftests: fib_rule_tests: fix local IPv4 address typo selftests/timers: Add missing fflush(stdout) calls tracing: Prevent hist_field_var_ref() from accessing NULL tracing_map_elts usbnet: ipheth: fix racing condition KVM: arm/arm64: Move cc/it checks under hyp's Makefile to avoid instrumentation KVM: x86/pmu: mask the result of rdpmc according to the width of the counters KVM: x86/pmu: do not mask the value that is written to fixed PMUs KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION tools/kvm_stat: fix fields filter for child events drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an invalid read drm/vmwgfx: NULL pointer dereference from vmw_cmd_dx_view_define() usb: dwc2: Fix DMA cache alignment issues usb: dwc2: host: Fix wMaxPacketSize handling (fix webcam regression) USB: Fix chipmunk-like voice when using Logitech C270 for recording audio. USB: usb-storage: Add new ID to ums-realtek USB: serial: pl2303: add Allied Telesis VT-Kit3 USB: serial: option: add support for Simcom SIM7500/SIM7600 RNDIS mode USB: serial: option: add Telit 0x1260 and 0x1261 compositions timekeeping: Repair ktime_get_coarse*() granularity RAS/CEC: Convert the timer callback to a workqueue RAS/CEC: Fix binary search function x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback x86/kasan: Fix boot with 5-level paging and KASAN x86/mm/KASLR: Compute the size of the vmemmap section properly x86/resctrl: Prevent NULL pointer dereference when local MBM is disabled drm/edid: abstract override/firmware EDID retrieval drm: add fallback override/firmware EDID modes workaround rtc: pcf8523: don't return invalid date when battery is low Linux 4.19.53 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Minchan Kim
|
54a20289cb |
mm/vmscan.c: fix trying to reclaim unevictable LRU page
commit a58f2cef26e1ca44182c8b22f4f4395e702a5795 upstream. There was the below bug report from Wu Fangsuo. On the CMA allocation path, isolate_migratepages_range() could isolate unevictable LRU pages and reclaim_clean_page_from_list() can try to reclaim them if they are clean file-backed pages. page:ffffffbf02f33b40 count:86 mapcount:84 mapping:ffffffc08fa7a810 index:0x24 flags: 0x19040c(referenced|uptodate|arch_1|mappedtodisk|unevictable|mlocked) raw: 000000000019040c ffffffc08fa7a810 0000000000000024 0000005600000053 raw: ffffffc009b05b20 ffffffc009b05b20 0000000000000000 ffffffc09bf3ee80 page dumped because: VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page)) page->mem_cgroup:ffffffc09bf3ee80 ------------[ cut here ]------------ kernel BUG at /home/build/farmland/adroid9.0/kernel/linux/mm/vmscan.c:1350! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 7125 Comm: syz-executor Tainted: G S 4.14.81 #3 Hardware name: ASR AQUILAC EVB (DT) task: ffffffc00a54cd00 task.stack: ffffffc009b00000 PC is at shrink_page_list+0x1998/0x3240 LR is at shrink_page_list+0x1998/0x3240 pc : [<ffffff90083a2158>] lr : [<ffffff90083a2158>] pstate: 60400045 sp : ffffffc009b05940 .. shrink_page_list+0x1998/0x3240 reclaim_clean_pages_from_list+0x3c0/0x4f0 alloc_contig_range+0x3bc/0x650 cma_alloc+0x214/0x668 ion_cma_allocate+0x98/0x1d8 ion_alloc+0x200/0x7e0 ion_ioctl+0x18c/0x378 do_vfs_ioctl+0x17c/0x1780 SyS_ioctl+0xac/0xc0 Wu found it's due to commit |
||
Shakeel Butt
|
553a1f0d3c |
mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node
commit 3510955b327176fd4cbab5baa75b449f077722a2 upstream.
Syzbot reported following memory leak:
ffffffffda RBX: 0000000000000003 RCX: 0000000000441f79
BUG: memory leak
unreferenced object 0xffff888114f26040 (size 32):
comm "syz-executor626", pid 7056, jiffies 4294948701 (age 39.410s)
hex dump (first 32 bytes):
40 60 f2 14 81 88 ff ff 40 60 f2 14 81 88 ff ff @`......@`......
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
slab_post_alloc_hook mm/slab.h:439 [inline]
slab_alloc mm/slab.c:3326 [inline]
kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
kmalloc include/linux/slab.h:547 [inline]
__memcg_init_list_lru_node+0x58/0xf0 mm/list_lru.c:352
memcg_init_list_lru_node mm/list_lru.c:375 [inline]
memcg_init_list_lru mm/list_lru.c:459 [inline]
__list_lru_init+0x193/0x2a0 mm/list_lru.c:626
alloc_super+0x2e0/0x310 fs/super.c:269
sget_userns+0x94/0x2a0 fs/super.c:609
sget+0x8d/0xb0 fs/super.c:660
mount_nodev+0x31/0xb0 fs/super.c:1387
fuse_mount+0x2d/0x40 fs/fuse/inode.c:1236
legacy_get_tree+0x27/0x80 fs/fs_context.c:661
vfs_get_tree+0x2e/0x120 fs/super.c:1476
do_new_mount fs/namespace.c:2790 [inline]
do_mount+0x932/0xc50 fs/namespace.c:3110
ksys_mount+0xab/0x120 fs/namespace.c:3319
__do_sys_mount fs/namespace.c:3333 [inline]
__se_sys_mount fs/namespace.c:3330 [inline]
__x64_sys_mount+0x26/0x30 fs/namespace.c:3330
do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This is a simple off by one bug on the error path.
Link: http://lkml.kernel.org/r/20190528043202.99980-1-shakeelb@google.com
Fixes:
|
||
Greg Kroah-Hartman
|
d1f7f3be99 |
This is the 4.19.51 stable release
-----BEGIN PGP SIGNATURE----- iQIyBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0EwEMACgkQONu9yGCS aT4UpA/3UqmMAaHH4g20cic8YDoBkYXoQUyj/kbf1w6bXqCuLTHOFS/qXa6QtPK3 WfwNWChIob+EbfAMjreQZjT6pwBbxyCNUvsvpB0k8YhJ3v/sIZL/Wc+1ZDu+jBC9 xfqwT9+JGzgu8P4PUTr1BsGMhgiH9qoJAeq7RCuDcicMIJ4/aJVr4Cvrs+18PVUe 95T5vSn6G62QyOrUExmuuztvNM2P/kos6yJTkN80l3uPLMUjsnWCsMKu+9utd5ea ew332Z/BQs+ff4oljH1uRwsM/Z7+AKXlXatXD0sHQ8CTEqh44SgUSU96vB/h9W8I 6a0t4M2atsdaGjMHmiiPA+gIgd1rW0lsHk6ob6qgfzuRBFGN9BTUfZgQwhOW7uXt e4o5RrWELkbk/TlJzrG1dFjhfyeb7q3LHOOg8kOVU0KdPD44ekJW9qoI8tlMPI+5 mafCCS/oS6TaW20ZKmjjkIbfTndzdO3dy5EWLy3elCEyLF2gDZ6WCAL+SMngdApC /dbuBigF/+RBaEU1e56DkcYUYXjt6UO84O3dYAx69s6EhHS5CP4yCySuYpdxyg/G MWdFDhtbnFyMXqoK1ROS0hNnxpydkh+R1Ns0TeSYibJI2J2enMGIa0thu4aVLD3v +GLqHV2PsPTRQmF5ChnvNV7O53i4j4WBQQjL+80PQulJwRFb/A== =bIqz -----END PGP SIGNATURE----- Merge 4.19.51 into android-4.19 Changes in 4.19.51 rapidio: fix a NULL pointer dereference when create_workqueue() fails fs/fat/file.c: issue flush after the writeback of FAT sysctl: return -EINVAL if val violates minmax ipc: prevent lockup on alloc_msg and free_msg drm/pl111: Initialize clock spinlock early ARM: prevent tracing IPI_CPU_BACKTRACE mm/hmm: select mmu notifier when selecting HMM hugetlbfs: on restore reserve error path retain subpool reservation mem-hotplug: fix node spanned pages when we have a node with only ZONE_MOVABLE mm/cma.c: fix crash on CMA allocation if bitmap allocation fails initramfs: free initrd memory if opening /initrd.image fails mm/cma.c: fix the bitmap status to show failed allocation reason mm: page_mkclean vs MADV_DONTNEED race mm/cma_debug.c: fix the break condition in cma_maxchunk_get() mm/slab.c: fix an infinite loop in leaks_show() kernel/sys.c: prctl: fix false positive in validate_prctl_map() thermal: rcar_gen3_thermal: disable interrupt in .remove drivers: thermal: tsens: Don't print error message on -EPROBE_DEFER mfd: tps65912-spi: Add missing of table registration mfd: intel-lpss: Set the device in reset state when init drm/nouveau/disp/dp: respect sink limits when selecting failsafe link configuration mfd: twl6040: Fix device init errors for ACCCTL register perf/x86/intel: Allow PEBS multi-entry in watermark mode drm/nouveau/kms/gf119-gp10x: push HeadSetControlOutputResource() mthd when encoders change drm/bridge: adv7511: Fix low refresh rate selection objtool: Don't use ignore flag for fake jumps drm/nouveau/kms/gv100-: fix spurious window immediate interlocks bpf: fix undefined behavior in narrow load handling EDAC/mpc85xx: Prevent building as a module pwm: meson: Use the spin-lock only to protect register modifications mailbox: stm32-ipcc: check invalid irq ntp: Allow TAI-UTC offset to be set to zero f2fs: fix to avoid panic in do_recover_data() f2fs: fix to avoid panic in f2fs_inplace_write_data() f2fs: fix to avoid panic in f2fs_remove_inode_page() f2fs: fix to do sanity check on free nid f2fs: fix to clear dirty inode in error path of f2fs_iget() f2fs: fix to avoid panic in dec_valid_block_count() f2fs: fix to use inline space only if inline_xattr is enable f2fs: fix to do sanity check on valid block count of segment f2fs: fix to do checksum even if inode page is uptodate percpu: remove spurious lock dependency between percpu and sched configfs: fix possible use-after-free in configfs_register_group uml: fix a boot splat wrt use of cpu_all_mask PCI: dwc: Free MSI in dw_pcie_host_init() error path PCI: dwc: Free MSI IRQ page in dw_pcie_free_msi() ovl: do not generate duplicate fsnotify events for "fake" path mmc: mmci: Prevent polling for busy detection in IRQ context netfilter: nf_flow_table: fix missing error check for rhashtable_insert_fast netfilter: nf_conntrack_h323: restore boundary check correctness mips: Make sure dt memory regions are valid netfilter: nf_tables: fix base chain stat rcu_dereference usage watchdog: imx2_wdt: Fix set_timeout for big timeout values watchdog: fix compile time error of pretimeout governors blk-mq: move cancel of requeue_work into blk_mq_release iommu/vt-d: Set intel_iommu_gfx_mapped correctly misc: pci_endpoint_test: Fix test_reg_bar to be updated in pci_endpoint_test PCI: designware-ep: Use aligned ATU window for raising MSI interrupts nvme-pci: unquiesce admin queue on shutdown nvme-pci: shutdown on timeout during deletion netfilter: nf_flow_table: check ttl value in flow offload data path netfilter: nf_flow_table: fix netdev refcnt leak ALSA: hda - Register irq handler after the chip initialization nvmem: core: fix read buffer in place nvmem: sunxi_sid: Support SID on A83T and H5 fuse: retrieve: cap requested size to negotiated max_write nfsd: allow fh_want_write to be called twice nfsd: avoid uninitialized variable warning vfio: Fix WARNING "do not call blocking ops when !TASK_RUNNING" iommu/arm-smmu-v3: Don't disable SMMU in kdump kernel switchtec: Fix unintended mask of MRPC event net: thunderbolt: Unregister ThunderboltIP protocol handler when suspending x86/PCI: Fix PCI IRQ routing table memory leak i40e: Queues are reserved despite "Invalid argument" error platform/chrome: cros_ec_proto: check for NULL transfer function PCI: keystone: Prevent ARM32 specific code to be compiled for ARM64 soc: mediatek: pwrap: Zero initialize rdata in pwrap_init_cipher clk: rockchip: Turn on "aclk_dmac1" for suspend on rk3288 soc: rockchip: Set the proper PWM for rk3288 ARM: dts: imx51: Specify IMX5_CLK_IPG as "ahb" clock to SDMA ARM: dts: imx50: Specify IMX5_CLK_IPG as "ahb" clock to SDMA ARM: dts: imx53: Specify IMX5_CLK_IPG as "ahb" clock to SDMA ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ahb" clock to SDMA ARM: dts: imx6sll: Specify IMX6SLL_CLK_IPG as "ipg" clock to SDMA ARM: dts: imx7d: Specify IMX7D_CLK_IPG as "ipg" clock to SDMA ARM: dts: imx6ul: Specify IMX6UL_CLK_IPG as "ipg" clock to SDMA ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ipg" clock to SDMA ARM: dts: imx6qdl: Specify IMX6QDL_CLK_IPG as "ipg" clock to SDMA PCI: rpadlpar: Fix leaked device_node references in add/remove paths drm/amd/display: Use plane->color_space for dpp if specified ARM: OMAP2+: pm33xx-core: Do not Turn OFF CEFUSE as PPA may be using it platform/x86: intel_pmc_ipc: adding error handling power: supply: max14656: fix potential use-before-alloc net: hns3: return 0 and print warning when hit duplicate MAC PCI: rcar: Fix a potential NULL pointer dereference PCI: rcar: Fix 64bit MSI message address handling scsi: qla2xxx: Reset the FCF_ASYNC_{SENT|ACTIVE} flags video: hgafb: fix potential NULL pointer dereference video: imsttfb: fix potential NULL pointer dereferences block, bfq: increase idling for weight-raised queues PCI: xilinx: Check for __get_free_pages() failure gpio: gpio-omap: add check for off wake capable gpios ice: Add missing case in print_link_msg for printing flow control dmaengine: idma64: Use actual device for DMA transfers pwm: tiehrpwm: Update shadow register for disabling PWMs ARM: dts: exynos: Always enable necessary APIO_1V8 and ABB_1V8 regulators on Arndale Octa pwm: Fix deadlock warning when removing PWM device ARM: exynos: Fix undefined instruction during Exynos5422 resume usb: typec: fusb302: Check vconn is off when we start toggling soc: renesas: Identify R-Car M3-W ES1.3 gpio: vf610: Do not share irq_chip percpu: do not search past bitmap when allocating an area Revert "Bluetooth: Align minimum encryption key size for LE and BR/EDR connections" Revert "drm/nouveau: add kconfig option to turn off nouveau legacy contexts. (v3)" ovl: check the capability before cred overridden ovl: support stacked SEEK_HOLE/SEEK_DATA drm/vc4: fix fb references in async update ALSA: seq: Cover unsubscribe_port() in list_mutex Linux 4.19.51 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Dennis Zhou
|
526972e95e |
percpu: do not search past bitmap when allocating an area
[ Upstream commit 8c43004af01635cc9fbb11031d070e5e0d327ef2 ] pcpu_find_block_fit() guarantees that a fit is found within PCPU_BITMAP_BLOCK_BITS. Iteration is used to determine the first fit as it compares against the block's contig_hint. This can lead to incorrectly scanning past the end of the bitmap. The behavior was okay given the check after for bit_off >= end and the correctness of the hints from pcpu_find_block_fit(). This patch fixes this by bounding the end offset by the number of bits in a chunk. Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
John Sperbeck
|
5329dcafea |
percpu: remove spurious lock dependency between percpu and sched
[ Upstream commit 198790d9a3aeaef5792d33a560020861126edc22 ] In free_percpu() we sometimes call pcpu_schedule_balance_work() to queue a work item (which does a wakeup) while holding pcpu_lock. This creates an unnecessary lock dependency between pcpu_lock and the scheduler's pi_lock. There are other places where we call pcpu_schedule_balance_work() without hold pcpu_lock, and this case doesn't need to be different. Moving the call outside the lock prevents the following lockdep splat when running tools/testing/selftests/bpf/{test_maps,test_progs} in sequence with lockdep enabled: ====================================================== WARNING: possible circular locking dependency detected 5.1.0-dbg-DEV #1 Not tainted ------------------------------------------------------ kworker/23:255/18872 is trying to acquire lock: 000000000bc79290 (&(&pool->lock)->rlock){-.-.}, at: __queue_work+0xb2/0x520 but task is already holding lock: 00000000e3e7a6aa (pcpu_lock){..-.}, at: free_percpu+0x36/0x260 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (pcpu_lock){..-.}: lock_acquire+0x9e/0x180 _raw_spin_lock_irqsave+0x3a/0x50 pcpu_alloc+0xfa/0x780 __alloc_percpu_gfp+0x12/0x20 alloc_htab_elem+0x184/0x2b0 __htab_percpu_map_update_elem+0x252/0x290 bpf_percpu_hash_update+0x7c/0x130 __do_sys_bpf+0x1912/0x1be0 __x64_sys_bpf+0x1a/0x20 do_syscall_64+0x59/0x400 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #3 (&htab->buckets[i].lock){....}: lock_acquire+0x9e/0x180 _raw_spin_lock_irqsave+0x3a/0x50 htab_map_update_elem+0x1af/0x3a0 -> #2 (&rq->lock){-.-.}: lock_acquire+0x9e/0x180 _raw_spin_lock+0x2f/0x40 task_fork_fair+0x37/0x160 sched_fork+0x211/0x310 copy_process.part.43+0x7b1/0x2160 _do_fork+0xda/0x6b0 kernel_thread+0x29/0x30 rest_init+0x22/0x260 arch_call_rest_init+0xe/0x10 start_kernel+0x4fd/0x520 x86_64_start_reservations+0x24/0x26 x86_64_start_kernel+0x6f/0x72 secondary_startup_64+0xa4/0xb0 -> #1 (&p->pi_lock){-.-.}: lock_acquire+0x9e/0x180 _raw_spin_lock_irqsave+0x3a/0x50 try_to_wake_up+0x41/0x600 wake_up_process+0x15/0x20 create_worker+0x16b/0x1e0 workqueue_init+0x279/0x2ee kernel_init_freeable+0xf7/0x288 kernel_init+0xf/0x180 ret_from_fork+0x24/0x30 -> #0 (&(&pool->lock)->rlock){-.-.}: __lock_acquire+0x101f/0x12a0 lock_acquire+0x9e/0x180 _raw_spin_lock+0x2f/0x40 __queue_work+0xb2/0x520 queue_work_on+0x38/0x80 free_percpu+0x221/0x260 pcpu_freelist_destroy+0x11/0x20 stack_map_free+0x2a/0x40 bpf_map_free_deferred+0x3c/0x50 process_one_work+0x1f7/0x580 worker_thread+0x54/0x410 kthread+0x10f/0x150 ret_from_fork+0x24/0x30 other info that might help us debug this: Chain exists of: &(&pool->lock)->rlock --> &htab->buckets[i].lock --> pcpu_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pcpu_lock); lock(&htab->buckets[i].lock); lock(pcpu_lock); lock(&(&pool->lock)->rlock); *** DEADLOCK *** 3 locks held by kworker/23:255/18872: #0: 00000000b36a6e16 ((wq_completion)events){+.+.}, at: process_one_work+0x17a/0x580 #1: 00000000dfd966f0 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x17a/0x580 #2: 00000000e3e7a6aa (pcpu_lock){..-.}, at: free_percpu+0x36/0x260 stack backtrace: CPU: 23 PID: 18872 Comm: kworker/23:255 Not tainted 5.1.0-dbg-DEV #1 Hardware name: ... Workqueue: events bpf_map_free_deferred Call Trace: dump_stack+0x67/0x95 print_circular_bug.isra.38+0x1c6/0x220 check_prev_add.constprop.50+0x9f6/0xd20 __lock_acquire+0x101f/0x12a0 lock_acquire+0x9e/0x180 _raw_spin_lock+0x2f/0x40 __queue_work+0xb2/0x520 queue_work_on+0x38/0x80 free_percpu+0x221/0x260 pcpu_freelist_destroy+0x11/0x20 stack_map_free+0x2a/0x40 bpf_map_free_deferred+0x3c/0x50 process_one_work+0x1f7/0x580 worker_thread+0x54/0x410 kthread+0x10f/0x150 ret_from_fork+0x24/0x30 Signed-off-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Qian Cai
|
515d18ced8 |
mm/slab.c: fix an infinite loop in leaks_show()
[ Upstream commit 745e10146c31b1c6ed3326286704ae251b17f663 ]
"cat /proc/slab_allocators" could hang forever on SMP machines with
kmemleak or object debugging enabled due to other CPUs running do_drain()
will keep making kmemleak_object or debug_objects_cache dirty and unable
to escape the first loop in leaks_show(),
do {
set_store_user_clean(cachep);
drain_cpu_caches(cachep);
...
} while (!is_store_user_clean(cachep));
For example,
do_drain
slabs_destroy
slab_destroy
kmem_cache_free
__cache_free
___cache_free
kmemleak_free_recursive
delete_object_full
__delete_object
put_object
free_object_rcu
kmem_cache_free
cache_free_debugcheck --> dirty kmemleak_object
One approach is to check cachep->name and skip both kmemleak_object and
debug_objects_cache in leaks_show(). The other is to set store_user_clean
after drain_cpu_caches() which leaves a small window between
drain_cpu_caches() and set_store_user_clean() where per-CPU caches could
be dirty again lead to slightly wrong information has been stored but
could also speed up things significantly which sounds like a good
compromise. For example,
# cat /proc/slab_allocators
0m42.778s # 1st approach
0m0.737s # 2nd approach
[akpm@linux-foundation.org: tweak comment]
Link: http://lkml.kernel.org/r/20190411032635.10325-1-cai@lca.pw
Fixes:
|
||
Yue Hu
|
13e1ea0881 |
mm/cma_debug.c: fix the break condition in cma_maxchunk_get()
[ Upstream commit f0fd50504a54f5548eb666dc16ddf8394e44e4b7 ] If not find zero bit in find_next_zero_bit(), it will return the size parameter passed in, so the start bit should be compared with bitmap_maxno rather than cma->count. Although getting maxchunk is working fine due to zero value of order_per_bit currently, the operation will be stuck if order_per_bit is set as non-zero. Link: http://lkml.kernel.org/r/20190319092734.276-1-zbestahu@gmail.com Signed-off-by: Yue Hu <huyue2@yulong.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Joe Perches <joe@perches.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Safonov <d.safonov@partner.samsung.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Aneesh Kumar K.V
|
38c5fce7fc |
mm: page_mkclean vs MADV_DONTNEED race
[ Upstream commit 024eee0e83f0df52317be607ca521e0fc572aa07 ] MADV_DONTNEED is handled with mmap_sem taken in read mode. We call page_mkclean without holding mmap_sem. MADV_DONTNEED implies that pages in the region are unmapped and subsequent access to the pages in that range is handled as a new page fault. This implies that if we don't have parallel access to the region when MADV_DONTNEED is run we expect those range to be unallocated. w.r.t page_mkclean() we need to make sure that we don't break the MADV_DONTNEED semantics. MADV_DONTNEED check for pmd_none without holding pmd_lock. This implies we skip the pmd if we temporarily mark pmd none. Avoid doing that while marking the page clean. Keep the sequence same for dax too even though we don't support MADV_DONTNEED for dax mapping The bug was noticed by code review and I didn't observe any failures w.r.t test run. This is similar to commit |
||
Yue Hu
|
77a01e3357 |
mm/cma.c: fix the bitmap status to show failed allocation reason
[ Upstream commit 2b59e01a3aa665f751d1410b99fae9336bd424e1 ] Currently one bit in cma bitmap represents number of pages rather than one page, cma->count means cma size in pages. So to find available pages via find_next_zero_bit()/find_next_bit() we should use cma size not in pages but in bits although current free pages number is correct due to zero value of order_per_bit. Once order_per_bit is changed the bitmap status will be incorrect. The size input in cma_debug_show_areas() is not correct. It will affect the available pages at some position to debug the failure issue. This is an example with order_per_bit = 1 Before this change: [ 4.120060] cma: number of available pages: 1@93+4@108+7@121+7@137+7@153+7@169+7@185+7@201+3@213+3@221+3@229+3@237+3@245+3@253+3@261+3@269+3@277+3@285+3@293+3@301+3@309+3@317+3@325+19@333+15@369+512@512=> 638 free of 1024 total pages After this change: [ 4.143234] cma: number of available pages: 2@93+8@108+14@121+14@137+14@153+14@169+14@185+14@201+6@213+6@221+6@229+6@237+6@245+6@253+6@261+6@269+6@277+6@285+6@293+6@301+6@309+6@317+6@325+38@333+30@369=> 252 free of 1024 total pages Obviously the bitmap status before is incorrect. Link: http://lkml.kernel.org/r/20190320060829.9144-1-zbestahu@gmail.com Signed-off-by: Yue Hu <huyue2@yulong.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Laura Abbott <labbott@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Yue Hu
|
e5f8857ea9 |
mm/cma.c: fix crash on CMA allocation if bitmap allocation fails
[ Upstream commit 1df3a339074e31db95c4790ea9236874b13ccd87 ]
|
||
Linxu Fang
|
5094a85d6d |
mem-hotplug: fix node spanned pages when we have a node with only ZONE_MOVABLE
[ Upstream commit 299c83dce9ea3a79bb4b5511d2cb996b6b8e5111 ] |
||
Mike Kravetz
|
ffaafd27b0 |
hugetlbfs: on restore reserve error path retain subpool reservation
[ Upstream commit 0919e1b69ab459e06df45d3ba6658d281962db80 ] When a huge page is allocated, PagePrivate() is set if the allocation consumed a reservation. When freeing a huge page, PagePrivate is checked. If set, it indicates the reservation should be restored. PagePrivate being set at free huge page time mostly happens on error paths. When huge page reservations are created, a check is made to determine if the mapping is associated with an explicitly mounted filesystem. If so, pages are also reserved within the filesystem. The default action when freeing a huge page is to decrement the usage count in any associated explicitly mounted filesystem. However, if the reservation is to be restored the reservation/use count within the filesystem should not be decrementd. Otherwise, a subsequent page allocation and free for the same mapping location will cause the file filesystem usage to go 'negative'. Filesystem Size Used Avail Use% Mounted on nodev 4.0G -4.0M 4.1G - /opt/hugepool To fix, when freeing a huge page do not adjust filesystem usage if PagePrivate() is set to indicate the reservation should be restored. I did not cc stable as the problem has been around since reserves were added to hugetlbfs and nobody has noticed. Link: http://lkml.kernel.org/r/20190328234704.27083-2-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Jérôme Glisse
|
85e1a6c4b3 |
mm/hmm: select mmu notifier when selecting HMM
[ Upstream commit 734fb89968900b5c5f8edd5038bd4cdeab8c61d2 ] To avoid random config build issue, select mmu notifier when HMM is selected. In any cases when HMM get selected it will be by users that will also wants the mmu notifier. Link: http://lkml.kernel.org/r/20190403193318.16478-2-jglisse@redhat.com Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Greg Kroah-Hartman
|
3f534fa2fc |
This is the 4.19.49 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlz8soUACgkQONu9yGCS aT5o7w//cc+XhbVg3ro68Rq2byYVuntdiKtYDYlEQtv5DORELdX29g6gBlLlO+ma J29W/8quC9fi8EERG14TRvhh8MJNlQCv5g+Afvqr42zCOBowUFL4q3o06dON2e3v qhKsEhwcvHH/dR+ixtw1zpSNl6bIdG87C5gxYpWvpjCHpVpVDwLHfRK7GbSS/ioe Ys2uJwtbB4ZpileCoA4O61alYNLq/zP/gJiKraMbrwfOPfY3dmDIqo6KMniUyFs3 0E0w6AKp2PalsoVxyl6mD0KGFFChWz0AfSE6b3IdlzXC2qOZkWlAKPWxpQMHWdme HVGXHxrW6aDVk+RGOmIk1dKWDdVqTZJ1Cx1RscrtIQsQwuXqR/STgoqu1hbQ/B4W /jMak4BICuyfnOzsmuQbIhOWaf8KTIokYfQ92SAEDQl+xRhxqe2VTSCkl0gVE+Yr KxMoyzyZTYidpWRedUGJiACNjlgogYp3L9xiJjHz2dlQD9r4GTsHXo2CbzV1jYrL HKqmkz6angoHx3Vw6tffCe9117KFKkOxX8MG3xRMZMfRglRbNd3m8Uo+YGOWcscU e1mUE+71woj89TquN7cTg5hY5/kRiGwVJBqihSuWzeJsf0LPab9Txv7ipHKuzh84 i8e3Isi3eoZR55ZsU7Jl8QeDIZ2OSNprUy9ldvaI2K4fg/VxpA4= =hIeN -----END PGP SIGNATURE----- Merge 4.19.49 into android-4.19 Changes in 4.19.49 sparc64: Fix regression in non-hypervisor TLB flush xcall include/linux/bitops.h: sanitize rotate primitives xhci: update bounce buffer with correct sg num xhci: Use %zu for printing size_t type xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic() usb: xhci: avoid null pointer deref when bos field is NULL usbip: usbip_host: fix BUG: sleeping function called from invalid context usbip: usbip_host: fix stub_dev lock context imbalance regression USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor USB: sisusbvga: fix oops in error path of sisusb_probe USB: Add LPM quirk for Surface Dock GigE adapter USB: rio500: refuse more than one device at a time USB: rio500: fix memory leak in close after disconnect media: usb: siano: Fix general protection fault in smsusb media: usb: siano: Fix false-positive "uninitialized variable" warning media: smsusb: better handle optional alignment brcmfmac: fix NULL pointer derefence during USB disconnect scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs) tracing: Avoid memory leak in predicate_parse() Btrfs: fix wrong ctime and mtime of a directory after log replay Btrfs: fix race updating log root item during fsync Btrfs: fix fsync not persisting changed attributes of a directory Btrfs: incremental send, fix file corruption when no-holes feature is enabled iio: dac: ds4422/ds4424 fix chip verification iio: adc: ti-ads8688: fix timestamp is not updated in buffer s390/crypto: fix gcm-aes-s390 selftest failures s390/crypto: fix possible sleep during spinlock aquired KVM: PPC: Book3S HV: XIVE: Do not clear IRQ data of passthrough interrupts powerpc/perf: Fix MMCRA corruption by bhrb_filter ALSA: line6: Assure canceling delayed work at disconnection ALSA: hda/realtek - Set default power save node to 0 ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops KVM: s390: Do not report unusabled IDs via KVM_CAP_MAX_VCPU_ID drm/nouveau/i2c: Disable i2c bus access after ->fini() i2c: mlxcpld: Fix wrong initialization order in probe i2c: synquacer: fix synquacer_i2c_doxfer() return value tty: serial: msm_serial: Fix XON/XOFF tty: max310x: Fix external crystal register setup memcg: make it work on sparse non-0-node systems kernel/signal.c: trace_signal_deliver when signal_group_exit arm64: Fix the arm64_personality() syscall wrapper redirection docs: Fix conf.py for Sphinx 2.0 doc: Cope with the deprecation of AutoReporter doc: Cope with Sphinx logging deprecations ima: show rules with IMA_INMASK correctly evm: check hash algorithm passed to init_desc() vt/fbcon: deinitialize resources in visual_init() after failed memory allocation serial: sh-sci: disable DMA for uart_console staging: vc04_services: prevent integer overflow in create_pagelist() staging: wlan-ng: fix adapter initialization failure cifs: fix memory leak of pneg_inbuf on -EOPNOTSUPP ioctl case CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM Revert "lockd: Show pid of lockd for remote locks" gcc-plugins: Fix build failures under Darwin host drm/tegra: gem: Fix CPU-cache maintenance for BO's allocated using get_pages() drm/vmwgfx: Don't send drm sysfs hotplug events on initial master set drm/sun4i: Fix sun8i HDMI PHY clock initialization drm/sun4i: Fix sun8i HDMI PHY configuration for > 148.5 MHz drm/rockchip: shutdown drm subsystem on shutdown drm/lease: Make sure implicit planes are leased Compiler Attributes: add support for __copy (gcc >= 9) include/linux/module.h: copy __init/__exit attrs to init/cleanup_module Revert "x86/build: Move _etext to actual end of .text" Revert "binder: fix handling of misaligned binder object" binder: fix race between munmap() and direct reclaim x86/ftrace: Do not call function graph from dynamic trampolines x86/ftrace: Set trampoline pages as executable x86/kprobes: Set instruction page as executable scsi: lpfc: Fix backport of faf5a744f4f8 ("scsi: lpfc: avoid uninitialized variable warning") of: overlay: validate overlay properties #address-cells and #size-cells of: overlay: set node fields from properties when add new overlay node media: uvcvideo: Fix uvc_alloc_entity() allocation alignment Linux 4.19.49 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Jiri Slaby
|
8b057ad846 |
memcg: make it work on sparse non-0-node systems
commit 3e8589963773a5c23e2f1fe4bcad0e9a90b7f471 upstream.
We have a single node system with node 0 disabled:
Scanning NUMA topology in Northbridge 24
Number of physical nodes 2
Skipping disabled node 0
Node 1 MemBase 0000000000000000 Limit 00000000fbff0000
NODE_DATA(1) allocated [mem 0xfbfda000-0xfbfeffff]
This causes crashes in memcg when system boots:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
#PF error: [normal kernel read fault]
...
RIP: 0010:list_lru_add+0x94/0x170
...
Call Trace:
d_lru_add+0x44/0x50
dput.part.34+0xfc/0x110
__fput+0x108/0x230
task_work_run+0x9f/0xc0
exit_to_usermode_loop+0xf5/0x100
It is reproducible as far as 4.12. I did not try older kernels. You have
to have a new enough systemd, e.g. 241 (the reason is unknown -- was not
investigated). Cannot be reproduced with systemd 234.
The system crashes because the size of lru array is never updated in
memcg_update_all_list_lrus and the reads are past the zero-sized array,
causing dereferences of random memory.
The root cause are list_lru_memcg_aware checks in the list_lru code. The
test in list_lru_memcg_aware is broken: it assumes node 0 is always
present, but it is not true on some systems as can be seen above.
So fix this by avoiding checks on node 0. Remember the memcg-awareness by
a bool flag in struct list_lru.
Link: http://lkml.kernel.org/r/20190522091940.3615-1-jslaby@suse.cz
Fixes:
|
||
Greg Kroah-Hartman
|
50f91435a2 |
This is the 4.19.45 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzk4CsACgkQONu9yGCS aT5Xaw//UWopx4Yqbiv+4HBgW+2ijP4utxI4lBNYITD44jvkyVJnztUtVkWepu5r Tkl/7zytXOpxbpuhS0xqpWwG7lL5eT4NCG08KSX4lYQVjIWX4YzVkw9gLe9V2AaK IqTzaWtbuagARbnR3UC65TI4kjRGsr9ldY0AbbGGVTM6IwPquHN9Qd9TAzRwRohn CxY94Bwp1RcN2sSPkD3nUCUGOSNh97BXyypeM7FyceOzOpyAdQCXoUPc84cPqdNC 4GBkd5Z1IL/7zX3HDjQeGS0KK6e1enslSmsbSSUVuHI90LCr3CZPJkFF8RFnPnff 2RA7bdhp8C1JPeLDimr+SNSLEl9yywoH6d4UQAnBwoLDjiFCEITVgjDtYzzd81+1 ES6lbUAs8v/LXkaCaExq6pNNd1prg6Mj9Fe6cz+G9V/YV1tLUsoAJHdFucu8Sp7w rwz/PZ6waCf8VRO4aYFF9b+u7PQ/RFZWQYsz22P7PhAYg0CTajV1FWGk1AYi0+wQ 5YCmthbWhDo9U5lAFyQ0pVTXv/UNgEu6MfV1/jKtCk5AzsbE77orj1xusKckHq2e QojgmELmHMlFFajI0h/ddDo7iwz/5OrPVs9D03RysiOciMzdTKPucPyC0Ah4yEBA sJ0cQkaVtqO2Nu3E42lfQTpVIqBgi8NGav+kRwryB1YyKeaXLsM= =HJ7O -----END PGP SIGNATURE----- Merge 4.19.45 into android-4.19 Changes in 4.19.45 locking/rwsem: Prevent decrement of reader count before increment x86/speculation/mds: Revert CPU buffer clear on double fault exit x86/speculation/mds: Improve CPU buffer clear documentation objtool: Fix function fallthrough detection arm64: dts: rockchip: Disable DCMDs on RK3399's eMMC controller. ARM: dts: exynos: Fix interrupt for shared EINTs on Exynos5260 ARM: dts: exynos: Fix audio (microphone) routing on Odroid XU3 mmc: sdhci-of-arasan: Add DTS property to disable DCMDs. ARM: exynos: Fix a leaked reference by adding missing of_node_put power: supply: axp288_charger: Fix unchecked return value power: supply: axp288_fuel_gauge: Add ACEPC T8 and T11 mini PCs to the blacklist arm64: mmap: Ensure file offset is treated as unsigned arm64: arch_timer: Ensure counter register reads occur with seqlock held arm64: compat: Reduce address limit arm64: Clear OSDLR_EL1 on CPU boot arm64: Save and restore OSDLR_EL1 across suspend/resume sched/x86: Save [ER]FLAGS on context switch crypto: crypto4xx - fix ctr-aes missing output IV crypto: crypto4xx - fix cfb and ofb "overran dst buffer" issues crypto: salsa20 - don't access already-freed walk.iv crypto: chacha20poly1305 - set cra_name correctly crypto: ccp - Do not free psp_master when PLATFORM_INIT fails crypto: vmx - fix copy-paste error in CTR mode crypto: skcipher - don't WARN on unprocessed data after slow walk step crypto: crct10dif-generic - fix use via crypto_shash_digest() crypto: x86/crct10dif-pcl - fix use via crypto_shash_digest() crypto: arm64/gcm-aes-ce - fix no-NEON fallback code crypto: gcm - fix incompatibility between "gcm" and "gcm_base" crypto: rockchip - update IV buffer to contain the next IV crypto: arm/aes-neonbs - don't access already-freed walk.iv crypto: arm64/aes-neonbs - don't access already-freed walk.iv mmc: core: Fix tag set memory leak ALSA: line6: toneport: Fix broken usage of timer for delayed execution ALSA: usb-audio: Fix a memory leak bug ALSA: hda/hdmi - Read the pin sense from register when repolling ALSA: hda/hdmi - Consider eld_valid when reporting jack event ALSA: hda/realtek - EAPD turn on later ALSA: hdea/realtek - Headset fixup for System76 Gazelle (gaze14) ASoC: max98090: Fix restore of DAPM Muxes ASoC: RT5677-SPI: Disable 16Bit SPI Transfers ASoC: fsl_esai: Fix missing break in switch statement ASoC: codec: hdac_hdmi add device_link to card device bpf, arm64: remove prefetch insn in xadd mapping crypto: ccree - remove special handling of chained sg crypto: ccree - fix mem leak on error path crypto: ccree - don't map MAC key on stack crypto: ccree - use correct internal state sizes for export crypto: ccree - don't map AEAD key and IV on stack crypto: ccree - pm resume first enable the source clk crypto: ccree - HOST_POWER_DOWN_EN should be the last CC access during suspend crypto: ccree - add function to handle cryptocell tee fips error crypto: ccree - handle tee fips error during power management resume mm/mincore.c: make mincore() more conservative mm/huge_memory: fix vmf_insert_pfn_{pmd, pud}() crash, handle unaligned addresses mm/hugetlb.c: don't put_page in lock of hugetlb_lock hugetlb: use same fault hash key for shared and private mappings ocfs2: fix ocfs2 read inode data panic in ocfs2_iget userfaultfd: use RCU to free the task struct when fork fails ACPI: PM: Set enable_for_wake for wakeup GPEs during suspend-to-idle mfd: da9063: Fix OTP control register names to match datasheets for DA9063/63L mfd: max77620: Fix swapped FPS_PERIOD_MAX_US values mtd: spi-nor: intel-spi: Avoid crossing 4K address boundary on read/write tty: vt.c: Fix TIOCL_BLANKSCREEN console blanking if blankinterval == 0 tty/vt: fix write/write race in ioctl(KDSKBSENT) handler jbd2: check superblock mapped prior to committing ext4: make sanity check in mballoc more strict ext4: ignore e_value_offs for xattrs with value-in-ea-inode ext4: avoid drop reference to iloc.bh twice ext4: fix use-after-free race with debug_want_extra_isize ext4: actually request zeroing of inode table after grow ext4: fix ext4_show_options for file systems w/o journal btrfs: Check the first key and level for cached extent buffer btrfs: Correctly free extent buffer in case btree_read_extent_buffer_pages fails btrfs: Honour FITRIM range constraints during free space trim Btrfs: send, flush dellaloc in order to avoid data loss Btrfs: do not start a transaction during fiemap Btrfs: do not start a transaction at iterate_extent_inodes() bcache: fix a race between cache register and cacheset unregister bcache: never set KEY_PTRS of journal key to 0 in journal_reclaim() ipmi:ssif: compare block number correctly for multi-part return messages crypto: ccm - fix incompatibility between "ccm" and "ccm_base" fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount tty: Don't force RISCV SBI console as preferred console ext4: zero out the unused memory region in the extent tree block ext4: fix data corruption caused by overlapping unaligned and aligned IO ext4: fix use-after-free in dx_release() ext4: avoid panic during forced reboot due to aborted journal ALSA: hda/realtek - Corrected fixup for System76 Gazelle (gaze14) ALSA: hda/realtek - Fixup headphone noise via runtime suspend ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug jbd2: fix potential double free KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes KVM: lapic: Busy wait for timer to expire when using hv_timer kbuild: turn auto.conf.cmd into a mandatory include file xen/pvh: set xen_domain_type to HVM in xen_pvh_init libnvdimm/namespace: Fix label tracking error iov_iter: optimize page_copy_sane() pstore: Centralize init/exit routines pstore: Allocate compression during late_initcall() pstore: Refactor compression initialization ext4: fix compile error when using BUFFER_TRACE ext4: don't update s_rev_level if not required Linux 4.19.45 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Mike Kravetz
|
a3ccc156f3 |
hugetlb: use same fault hash key for shared and private mappings
commit 1b426bac66e6cc83c9f2d92b96e4e72acf43419a upstream. hugetlb uses a fault mutex hash table to prevent page faults of the same pages concurrently. The key for shared and private mappings is different. Shared keys off address_space and file index. Private keys off mm and virtual address. Consider a private mappings of a populated hugetlbfs file. A fault will map the page from the file and if needed do a COW to map a writable page. Hugetlbfs hole punch uses the fault mutex to prevent mappings of file pages. It uses the address_space file index key. However, private mappings will use a different key and could race with this code to map the file page. This causes problems (BUG) for the page cache remove code as it expects the page to be unmapped. A sample stack is: page dumped because: VM_BUG_ON_PAGE(page_mapped(page)) kernel BUG at mm/filemap.c:169! ... RIP: 0010:unaccount_page_cache_page+0x1b8/0x200 ... Call Trace: __delete_from_page_cache+0x39/0x220 delete_from_page_cache+0x45/0x70 remove_inode_hugepages+0x13c/0x380 ? __add_to_page_cache_locked+0x162/0x380 hugetlbfs_fallocate+0x403/0x540 ? _cond_resched+0x15/0x30 ? __inode_security_revalidate+0x5d/0x70 ? selinux_file_permission+0x100/0x130 vfs_fallocate+0x13f/0x270 ksys_fallocate+0x3c/0x80 __x64_sys_fallocate+0x1a/0x20 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 There seems to be another potential COW issue/race with this approach of different private and shared keys as noted in commit |
||
Kai Shen
|
0b16b09a72 |
mm/hugetlb.c: don't put_page in lock of hugetlb_lock
commit 2bf753e64b4a702e27ce26ff520c59563c62f96b upstream.
spinlock recursion happened when do LTP test:
#!/bin/bash
./runltp -p -f hugetlb &
./runltp -p -f hugetlb &
./runltp -p -f hugetlb &
./runltp -p -f hugetlb &
./runltp -p -f hugetlb &
The dtor returned by get_compound_page_dtor in __put_compound_page may be
the function of free_huge_page which will lock the hugetlb_lock, so don't
put_page in lock of hugetlb_lock.
BUG: spinlock recursion on CPU#0, hugemmap05/1079
lock: hugetlb_lock+0x0/0x18, .magic: dead4ead, .owner: hugemmap05/1079, .owner_cpu: 0
Call trace:
dump_backtrace+0x0/0x198
show_stack+0x24/0x30
dump_stack+0xa4/0xcc
spin_dump+0x84/0xa8
do_raw_spin_lock+0xd0/0x108
_raw_spin_lock+0x20/0x30
free_huge_page+0x9c/0x260
__put_compound_page+0x44/0x50
__put_page+0x2c/0x60
alloc_surplus_huge_page.constprop.19+0xf0/0x140
hugetlb_acct_memory+0x104/0x378
hugetlb_reserve_pages+0xe0/0x250
hugetlbfs_file_mmap+0xc0/0x140
mmap_region+0x3e8/0x5b0
do_mmap+0x280/0x460
vm_mmap_pgoff+0xf4/0x128
ksys_mmap_pgoff+0xb4/0x258
__arm64_sys_mmap+0x34/0x48
el0_svc_common+0x78/0x130
el0_svc_handler+0x38/0x78
el0_svc+0x8/0xc
Link: http://lkml.kernel.org/r/b8ade452-2d6b-0372-32c2-703644032b47@huawei.com
Fixes:
|
||
Dan Williams
|
58db381368 |
mm/huge_memory: fix vmf_insert_pfn_{pmd, pud}() crash, handle unaligned addresses
commit fce86ff5802bac3a7b19db171aa1949ef9caac31 upstream. Starting with c6f3c5ee40c1 ("mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd()") vmf_insert_pfn_pmd() internally calls pmdp_set_access_flags(). That helper enforces a pmd aligned @address argument via VM_BUG_ON() assertion. Update the implementation to take a 'struct vm_fault' argument directly and apply the address alignment fixup internally to fix crash signatures like: kernel BUG at arch/x86/mm/pgtable.c:515! invalid opcode: 0000 [#1] SMP NOPTI CPU: 51 PID: 43713 Comm: java Tainted: G OE 4.19.35 #1 [..] RIP: 0010:pmdp_set_access_flags+0x48/0x50 [..] Call Trace: vmf_insert_pfn_pmd+0x198/0x350 dax_iomap_fault+0xe82/0x1190 ext4_dax_huge_fault+0x103/0x1f0 ? __switch_to_asm+0x40/0x70 __handle_mm_fault+0x3f6/0x1370 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 handle_mm_fault+0xda/0x200 __do_page_fault+0x249/0x4f0 do_page_fault+0x32/0x110 ? page_fault+0x8/0x30 page_fault+0x1e/0x30 Link: http://lkml.kernel.org/r/155741946350.372037.11148198430068238140.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: c6f3c5ee40c1 ("mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd()") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Piotr Balcer <piotr.balcer@intel.com> Tested-by: Yan Ma <yan.ma@intel.com> Tested-by: Pankaj Gupta <pagupta@redhat.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Chandan Rajendra <chandan@linux.ibm.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Jiri Kosina
|
f580a54bbd |
mm/mincore.c: make mincore() more conservative
commit 134fca9063ad4851de767d1768180e5dede9a881 upstream. The semantics of what mincore() considers to be resident is not completely clear, but Linux has always (since 2.3.52, which is when mincore() was initially done) treated it as "page is available in page cache". That's potentially a problem, as that [in]directly exposes meta-information about pagecache / memory mapping state even about memory not strictly belonging to the process executing the syscall, opening possibilities for sidechannel attacks. Change the semantics of mincore() so that it only reveals pagecache information for non-anonymous mappings that belog to files that the calling process could (if it tried to) successfully open for writing; otherwise we'd be including shared non-exclusive mappings, which - is the sidechannel - is not the usecase for mincore(), as that's primarily used for data, not (shared) text [jkosina@suse.cz: v2] Link: http://lkml.kernel.org/r/20190312141708.6652-2-vbabka@suse.cz [mhocko@suse.com: restructure can_do_mincore() conditions] Link: http://lkml.kernel.org/r/nycvar.YFH.7.76.1903062342020.19912@cbobk.fhfr.pm Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Josh Snyder <joshs@netflix.com> Acked-by: Michal Hocko <mhocko@suse.com> Originally-by: Linus Torvalds <torvalds@linux-foundation.org> Originally-by: Dominique Martinet <asmadeus@codewreck.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Chinner <david@fromorbit.com> Cc: Kevin Easton <kevin@guarana.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Cyril Hrubis <chrubis@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Daniel Gruss <daniel@gruss.cc> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Christoph Hellwig
|
2bf50b5a8e |
FROMLIST: mm: don't cast ->readpage to filler_t for do_read_cache_page
We can just pass a NULL filler and do the right thing inside of do_read_cache_page based on the NULL parameter. Signed-off-by: Christoph Hellwig <hch@lst.de> Bug: 133186739 Change-Id: I720afb2e0c28d5649b62ed3852b48ac63dc0d7a8 Link: https://lkml.org/lkml/2019/5/1/294 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> |
||
Greg Kroah-Hartman
|
0b63cd6d63 |
This is the 4.19.44 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzdoMwACgkQONu9yGCS aT5H9RAAkvsM0ryV/0BLcS7bAc2qXHz3shwcqMRHeqLSHCI728Ew3vbrenquQ9ou I1I2gXVjVr3fT5IrNeEKC/RJg5EvUBD25dm3Iei9cTf8kqnnFaEqQFe+jIvVOMfR zr//hEG3lOO5tLtyK1pnZxrW1uKl7B+1O64bVMpHSola+5teV51r/PxqYM5lNKYm A0YhhcrTZ4f1ZY1nctgb3g5XousPHJAMhe8ACv6zO7I+eWwn410nUeAJL7WK8SvL UqW0xgLAkKsYrriaDCPC9VeOVMUetCHeT9KbYQ8W9p+kb5bMQua4d/YJtCxulnqH TAQosuCVSVoq+jQIZS+El4+teCQW/aAkHFGo+08e9ti+trEliz2u2PxiloZoGlxl M4iIPwjkp7YJ9WHoVdNjObPiv5vt+2pWDEPhB7p++nsk5/2ArbShqWVDREj09SRp RpOJ7UGl+Nckk8IppfQgRUOlVyMW6a+Fw7GX7dxnyh9U1g48w5ebNTVeUvKxQVCJ a99U9qqGRgmEsgicnOXs4tRMYuU0gn3j0Qjhd9G2xYijh8ElMLgTJyv+2JiUIavC nXmSVW3G54dR7N94aDfWm/z85w/uFmTmT4BiqNuc4/gHVyyk12Q9RbsV4T617xls wiCkGoLKlpi9cp6abxB9H+oav9PdZd9ztwZMSB35k1AIgh6oeo0= =v14F -----END PGP SIGNATURE----- Merge 4.19.44 into android-4.19 Changes in 4.19.44 bfq: update internal depth state when queue depth changes platform/x86: sony-laptop: Fix unintentional fall-through platform/x86: thinkpad_acpi: Disable Bluetooth for some machines platform/x86: dell-laptop: fix rfkill functionality hwmon: (pwm-fan) Disable PWM if fetching cooling data fails kernfs: fix barrier usage in __kernfs_new_node() virt: vbox: Sanity-check parameter types for hgcm-calls coming from userspace USB: serial: fix unthrottle races iio: adc: xilinx: fix potential use-after-free on remove iio: adc: xilinx: fix potential use-after-free on probe iio: adc: xilinx: prevent touching unclocked h/w on remove acpi/nfit: Always dump _DSM output payload libnvdimm/namespace: Fix a potential NULL pointer dereference HID: input: add mapping for Expose/Overview key HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys HID: input: add mapping for "Toggle Display" key libnvdimm/btt: Fix a kmemdup failure check s390/dasd: Fix capacity calculation for large volumes mac80211: fix unaligned access in mesh table hash function mac80211: Increase MAX_MSG_LEN cfg80211: Handle WMM rules in regulatory domain intersection mac80211: fix memory accounting with A-MSDU aggregation nl80211: Add NL80211_FLAG_CLEAR_SKB flag for other NL commands libnvdimm/pmem: fix a possible OOB access when read and write pmem s390/3270: fix lockdep false positive on view->lock drm/amd/display: extending AUX SW Timeout clocksource/drivers/npcm: select TIMER_OF clocksource/drivers/oxnas: Fix OX820 compatible selftests: fib_tests: Fix 'Command line is not complete' errors mISDN: Check address length before reading address family vxge: fix return of a free'd memblock on a failed dma mapping qede: fix write to free'd pointer error and double free of ptp afs: Unlock pages for __pagevec_release() drm/amd/display: If one stream full updates, full update all planes s390/pkey: add one more argument space for debug feature entry x86/build/lto: Fix truncated .bss with -fdata-sections x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T KVM: fix spectrev1 gadgets KVM: x86: avoid misreporting level-triggered irqs as edge-triggered in tracing tools lib traceevent: Fix missing equality check for strcmp ipmi: ipmi_si_hardcode.c: init si_type array to fix a crash ocelot: Don't sleep in atomic context (irqs_disabled()) scsi: aic7xxx: fix EISA support mm: fix inactive list balancing between NUMA nodes and cgroups init: initialize jump labels before command line option parsing selftests: netfilter: check icmp pkttoobig errors are set as related ipvs: do not schedule icmp errors from tunnels netfilter: ctnetlink: don't use conntrack/expect object addresses as id netfilter: nf_tables: prevent shift wrap in nft_chain_parse_hook() MIPS: perf: ath79: Fix perfcount IRQ assignment s390: ctcm: fix ctcm_new_device error return code drm/sun4i: Set device driver data at bind time for use in unbind drm/sun4i: Fix component unbinding and component master deletion selftests/net: correct the return value for run_netsocktests netfilter: fix nf_l4proto_log_invalid to log invalid packets gpu: ipu-v3: dp: fix CSC handling drm/imx: don't skip DP channel disable for background plane ARM: 8856/1: NOMMU: Fix CCR register faulty initialization when MPU is disabled spi: Micrel eth switch: declare missing of table spi: ST ST95HF NFC: declare missing of table drm/sun4i: Unbind components before releasing DRM and memory Input: synaptics-rmi4 - fix possible double free RDMA/hns: Bugfix for mapping user db mm/memory_hotplug.c: drop memory device reference after find_memory_block() powerpc/smp: Fix NMI IPI timeout powerpc/smp: Fix NMI IPI xmon timeout net: dsa: mv88e6xxx: fix few issues in mv88e6390x_port_set_cmode mm/memory.c: fix modifying of page protection by insert_pfn() usb: typec: Fix unchecked return value netfilter: nf_tables: use-after-free in dynamic operations netfilter: nf_tables: add missing ->release_ops() in error path of newrule() net: fec: manage ahb clock in runtime pm mlxsw: spectrum_switchdev: Add MDB entries in prepare phase mlxsw: core: Do not use WQ_MEM_RECLAIM for EMAD workqueue mlxsw: core: Do not use WQ_MEM_RECLAIM for mlxsw ordered workqueue mlxsw: core: Do not use WQ_MEM_RECLAIM for mlxsw workqueue net/tls: fix the IV leaks net: strparser: partially revert "strparser: Call skb_unclone conditionally" NFC: nci: Add some bounds checking in nci_hci_cmd_received() nfc: nci: Potential off by one in ->pipes[] array x86/kprobes: Avoid kretprobe recursion bug cw1200: fix missing unlock on error in cw1200_hw_scan() mwl8k: Fix rate_idx underflow rtlwifi: rtl8723ae: Fix missing break in switch statement Don't jump to compute_result state from check_result state um: Don't hardcode path as it is architecture dependent powerpc/64s: Include cpu header bonding: fix arp_validate toggling in active-backup mode bridge: Fix error path for kobject_init_and_add() dpaa_eth: fix SG frame cleanup fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied ipv4: Fix raw socket lookup for local traffic net: dsa: Fix error cleanup path in dsa_init_module net: ethernet: stmmac: dwmac-sun8i: enable support of unicast filtering net: macb: Change interrupt and napi enable order in open net: seeq: fix crash caused by not set dev.parent net: ucc_geth - fix Oops when changing number of buffers in the ring packet: Fix error path in packet_init selinux: do not report error on connect(AF_UNSPEC) vlan: disable SIOCSHWTSTAMP in container vrf: sit mtu should not be updated when vrf netdev is the link tuntap: fix dividing by zero in ebpf queue selection tuntap: synchronize through tfiles array instead of tun->numqueues isdn: bas_gigaset: use usb_fill_int_urb() properly tipc: fix hanging clients using poll with EPOLLOUT flag drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctl drivers/virt/fsl_hypervisor.c: prevent integer overflow in ioctl powerpc/book3s/64: check for NULL pointer in pgd_alloc() powerpc/powernv/idle: Restore IAMR after idle powerpc/booke64: set RI in default MSR PCI: hv: Fix a memory leak in hv_eject_device_work() PCI: hv: Add hv_pci_remove_slots() when we unload the driver PCI: hv: Add pci_destroy_slot() in pci_devices_present_work(), if necessary Linux 4.19.44 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Jan Kara
|
6832199422 |
mm/memory.c: fix modifying of page protection by insert_pfn()
[ Upstream commit cae85cb8add35f678cf487139d05e083ce2f570a ]
Aneesh has reported that PPC triggers the following warning when
excercising DAX code:
IP set_pte_at+0x3c/0x190
LR insert_pfn+0x208/0x280
Call Trace:
insert_pfn+0x68/0x280
dax_iomap_pte_fault.isra.7+0x734/0xa40
__xfs_filemap_fault+0x280/0x2d0
do_wp_page+0x48c/0xa40
__handle_mm_fault+0x8d0/0x1fd0
handle_mm_fault+0x140/0x250
__do_page_fault+0x300/0xd60
handle_page_fault+0x18
Now that is WARN_ON in set_pte_at which is
VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
The problem is that on some architectures set_pte_at() cannot cope with
a situation where there is already some (different) valid entry present.
Use ptep_set_access_flags() instead to modify the pfn which is built to
deal with modifying existing PTE.
Link: http://lkml.kernel.org/r/20190311084537.16029-1-jack@suse.cz
Fixes:
|
||
David Hildenbrand
|
6a60fb62c8 |
mm/memory_hotplug.c: drop memory device reference after find_memory_block()
[ Upstream commit 89c02e69fc5245f8a2f34b58b42d43a737af1a5e ]
Right now we are using find_memory_block() to get the node id for the
pfn range to online. We are missing to drop a reference to the memory
block device. While the device still gets unregistered via
device_unregister(), resulting in no user visible problem, the device is
never released via device_release(), resulting in a memory leak. Fix
that by properly using a put_device().
Link: http://lkml.kernel.org/r/20190411110955.1430-1-david@redhat.com
Fixes:
|
||
Johannes Weiner
|
6536de8232 |
mm: fix inactive list balancing between NUMA nodes and cgroups
[ Upstream commit 3b991208b897f52507168374033771a984b947b1 ]
During !CONFIG_CGROUP reclaim, we expand the inactive list size if it's
thrashing on the node that is about to be reclaimed. But when cgroups
are enabled, we suddenly ignore the node scope and use the cgroup scope
only. The result is that pressure bleeds between NUMA nodes depending
on whether cgroups are merely compiled into Linux. This behavioral
difference is unexpected and undesirable.
When the refault adaptivity of the inactive list was first introduced,
there were no statistics at the lruvec level - the intersection of node
and memcg - so it was better than nothing.
But now that we have that infrastructure, use lruvec_page_state() to
make the list balancing decision always NUMA aware.
[hannes@cmpxchg.org: fix bisection hole]
Link: http://lkml.kernel.org/r/20190417155241.GB23013@cmpxchg.org
Link: http://lkml.kernel.org/r/20190412144438.2645-1-hannes@cmpxchg.org
Fixes:
|
||
Greg Kroah-Hartman
|
66aebe2d89 |
This is the 4.19.42 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzVnqQACgkQONu9yGCS aT4iJQ//SEDeqCHsZIblTqIywc87Xn/M4WGvQSr5l7NlFFWXRmIZ2VqGSa8IqY6m mbq0WJHIb0N3fJ09rh6/OuoFz8j7L4aq3OngCsAfmj3Esb2mAt8QkpvHgKdO8wEp iWstqAUUgCJLaaCO2KN5Z4uvpib+HhpaoOkHNto94w6M5Hc9Ax9iqMWqOk+ugcy9 q3nPWomhesrmzNjw4gOtG2gyGlkqUZOkbvQDYttXO5jAV2sgfy1puaG3maApRRvT uZFOYBJEa13MFx+QgaLvZ7asyhtlBVzQyRx/CcBjecCMRv3oC/ZZtemDmwqEZ4ll kj4IaRMzthRhmkpLEEW4SkN/TZDL3C0n114Wp+hRMmLviY7yOOovrbiHs8k/RYzU eOKZ2v/zI2FKGYbH0Nsa1HPJdOO1ESAVIxs/XHyk8odOPCpx+KN8NOPdnWILrM1l uhVKK62iQlZsLlny+EE7Oy3nqZAg1hHwNkV7WuyLfFrSVDQ/IOLK953QJdNlmgCd NbVRCcZ8gtL47jLD4F3uf8bcNQhheg/1m/A95xl860WmTDXrQSvN3FmvVWZ9LUCf ftsA76tMcDFlpw/Xu12z+P93VliaATt/I6aRbJyKvL3BBL5dB7Cx0asMDi5ecT6P sgaxYPp56xX0m6V10vEtaFCRcyuP2JmR60uU5Zvq9t5CtG+7BL0= =SzxN -----END PGP SIGNATURE----- Merge 4.19.42 into android-4.19 Changes in 4.19.42 net: stmmac: Use bfsize1 in ndesc_init_rx_desc scsi: libsas: fix a race condition when smp task timeout Drivers: hv: vmbus: Remove the undesired put_cpu_ptr() in hv_synic_cleanup() ubsan: Fix nasty -Wbuiltin-declaration-mismatch GCC-9 warnings staging: greybus: power_supply: fix prop-descriptor request size staging: most: cdev: fix chrdev_region leak in mod_exit ASoC: tlv320aic3x: fix reset gpio reference counting ASoC: hdmi-codec: fix S/PDIF DAI ASoC: stm32: sai: fix iec958 controls indexation ASoC: stm32: sai: fix exposed capabilities in spdif mode ASoC:soc-pcm:fix a codec fixup issue in TDM case ASoC:intel:skl:fix a simultaneous playback & capture issue on hda platform ASoC: nau8824: fix the issue of the widget with prefix name ASoC: nau8810: fix the issue of widget with prefixed name ASoC: samsung: odroid: Fix clock configuration for 44100 sample rate ASoC: rt5682: recording has no sound after booting ASoC: wm_adsp: Add locking to wm_adsp2_bus_error clk: meson-gxbb: round the vdec dividers to closest ASoC: stm32: dfsdm: manage multiple prepare ASoC: stm32: dfsdm: fix debugfs warnings on entry creation ASoC: cs4270: Set auto-increment bit for register writes ASoC: dapm: Fix NULL pointer dereference in snd_soc_dapm_free_kcontrol drm/omap: hdmi4_cec: Fix CEC clock handling for PM IB/hfi1: Eliminate opcode tests on mr deref IB/hfi1: Fix the allocation of RSM table MIPS: KGDB: fix kgdb support for SMP platforms. ASoC: tlv320aic32x4: Fix Common Pins drm/mediatek: Fix an error code in mtk_hdmi_dt_parse_pdata() perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS perf/x86/intel: Initialize TFA MSR linux/kernel.h: Use parentheses around argument in u64_to_user_ptr() ASoC: rockchip: pdm: fix regmap_ops hang issue drm/amd/display: fix cursor black issue ASoC: cs35l35: Disable regulators on driver removal objtool: Add rewind_stack_do_exit() to the noreturn list slab: fix a crash by reading /proc/slab_allocators drm/sun4i: tcon top: Fix NULL/invalid pointer dereference in sun8i_tcon_top_un/bind virtio_pci: fix a NULL pointer reference in vp_del_vqs RDMA/vmw_pvrdma: Fix memory leak on pvrdma_pci_remove RDMA/hns: Fix bug that caused srq creation to fail scsi: csiostor: fix missing data copy in csio_scsi_err_handler() drm/mediatek: fix possible object reference leak ASoC: Intel: kbl: fix wrong number of channels virtio-blk: limit number of hw queues by nr_cpu_ids nvme-fc: correct csn initialization and increments on error platform/x86: pmc_atom: Drop __initconst on dmi table perf/core: Fix perf_event_disable_inatomic() race iommu/amd: Set exclusion range correctly genirq: Prevent use-after-free and work list corruption usb: dwc3: Fix default lpm_nyet_threshold value USB: serial: f81232: fix interrupt worker not stop USB: cdc-acm: fix unthrottle races usb-storage: Set virt_boundary_mask to avoid SG overflows intel_th: pci: Add Comet Lake support cpufreq: armada-37xx: fix frequency calculation for opp soc: sunxi: Fix missing dependency on REGMAP_MMIO scsi: lpfc: change snprintf to scnprintf for possible overflow scsi: qla2xxx: Fix incorrect region-size setting in optrom SYSFS routines scsi: qla2xxx: Fix device staying in blocked state Bluetooth: hidp: fix buffer overflow Bluetooth: Align minimum encryption key size for LE and BR/EDR connections UAS: fix alignment of scatter/gather segments ASoC: Intel: avoid Oops if DMA setup fails locking/futex: Allow low-level atomic operations to return -EAGAIN arm64: futex: Bound number of LDXR/STXR loops in FUTEX_WAKE_OP Linux 4.19.42 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Qian Cai
|
78bc98235e |
slab: fix a crash by reading /proc/slab_allocators
[ Upstream commit fcf88917dd435c6a4cb2830cb086ee58605a1d85 ] The commit |
||
Greg Kroah-Hartman
|
44c5f03127 |
This is the 4.19.41 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzSZ3MACgkQONu9yGCS aT7f2hAAww50MxygeVQXgB0zJ7oSlEBwJAUH1GRCj1ual70pDoK9W8iuacPI4gNu +A0V4cD+lndMZ8E+F9/Vmzhmj/xHdxRfCkRmtnpa5IKUJqA/bLr1kdspYO6G354G ZBGvizvmFaAikbqNSnoadQZ5AYoT/C4FxxD6t/meQtnwmrK+OhRlh1xCkTddAbQo UEPDns9pxqrwHK7tS0GbPXLWgsr4y7BPq92ILdt/GFZGmEBJa0+tf66/DF+RMKjD OCmAkiFr+hN8KScUHeZpJKdmI3s7dHEUnQnwkHyC1TwjEXwxhtA0QZgzl9j1F8JI f27kbJrpmYuw2QADJyZmCnTiqBo7XglItxEZ2csJKyV7+jFFlzKT4S26iyrHwgXp BTdSuTaVe2Ubp/EiULwoSsmxaIyW6fROotQK/x4oB0eDm+OSXq217nnbnF2MRpMb eOqzRPCd5CvL8+7T9rc0kOJcDuK/9h95OjzZWBo9rDGz0glxlokw7cJ4fUyrJMPo kcw7ZIRRHLtKGTD/ZKX3XrErAqz9I4RAEzBRAXT6W0is7PUi00vFNcOU4ThNSsSK ihfuHsp9Jq/SKEUl6/CumU8e91LWv77/gqYxLLuygpuKLFLpMMc/PQOruxXT6ojx QC/KIUtcMfTP+xy23Wg2UQ1BXhl+HcfShXzvcEXxHzjgHpglKvc= =S4RT -----END PGP SIGNATURE----- Merge 4.19.41 into android-4.19 Changes in 4.19.41 iwlwifi: fix driver operation for 5350 mwifiex: Make resume actually do something useful again on SDIO cards mac80211: don't attempt to rename ERR_PTR() debugfs dirs i2c: synquacer: fix enumeration of slave devices i2c: imx: correct the method of getting private data in notifier_call i2c: Remove unnecessary call to irq_find_mapping i2c: Clear client->irq in i2c_device_remove i2c: Allow recovery of the initial IRQ by an I2C client device. i2c: Prevent runtime suspend of adapter when Host Notify is required ALSA: hda/realtek - Add new Dell platform for headset mode ALSA: hda/realtek - Fixed Dell AIO speaker noise ALSA: hda/realtek - Apply the fixup for ASUS Q325UAR USB: yurex: Fix protection fault after device removal USB: w1 ds2490: Fix bug caused by improper use of altsetting array USB: dummy-hcd: Fix failure to give back unlinked URBs usb: usbip: fix isoc packet num validation in get_pipe USB: core: Fix unterminated string returned by usb_string() USB: core: Fix bug caused by duplicate interface PM usage counter nvme-loop: init nvmet_ctrl fatal_err_work when allocate efi: Fix debugobjects warning on 'efi_rts_work' arm64: dts: rockchip: fix rk3328-roc-cc gmac2io tx/rx_delay HID: logitech: check the return value of create_singlethread_workqueue HID: debug: fix race condition with between rdesc_show() and device removal rtc: cros-ec: Fail suspend/resume if wake IRQ can't be configured rtc: sh: Fix invalid alarm warning for non-enabled alarm batman-adv: Reduce claim hash refcnt only for removed entry batman-adv: Reduce tt_local hash refcnt only for removed entry batman-adv: Reduce tt_global hash refcnt only for removed entry batman-adv: fix warning in function batadv_v_elp_get_throughput ARM: dts: rockchip: Fix gpu opp node names for rk3288 reset: meson-audio-arb: Fix missing .owner setting of reset_controller_dev igb: Fix WARN_ONCE on runtime suspend riscv: fix accessing 8-byte variable from RV32 HID: quirks: Fix keyboard + touchpad on Lenovo Miix 630 net: hns3: fix compile error net/mlx5: E-Switch, Fix esw manager vport indication for more vport commands bonding: show full hw address in sysfs for slave entries net: stmmac: use correct DMA buffer size in the RX descriptor net: stmmac: ratelimit RX error logs net: stmmac: don't stop NAPI processing when dropping a packet net: stmmac: don't overwrite discard_frame status net: stmmac: fix dropping of multi-descriptor RX frames net: stmmac: don't log oversized frames jffs2: fix use-after-free on symlink traversal debugfs: fix use-after-free on symlink traversal mfd: twl-core: Disable IRQ while suspended block: use blk_free_flush_queue() to free hctx->fq in blk_mq_init_hctx rtc: da9063: set uie_unsupported when relevant HID: input: add mapping for Assistant key vfio/pci: use correct format characters scsi: core: add new RDAC LENOVO/DE_Series device scsi: storvsc: Fix calculation of sub-channel count arm/mach-at91/pm : fix possible object reference leak arm64: fix wrong check of on_sdei_stack in nmi context net: hns: fix KASAN: use-after-free in hns_nic_net_xmit_hw() net: hns: Use NAPI_POLL_WEIGHT for hns driver net: hns: Fix probabilistic memory overwrite when HNS driver initialized net: hns: fix ICMP6 neighbor solicitation messages discard problem net: hns: Fix WARNING when remove HNS driver with SMMU enabled libcxgb: fix incorrect ppmax calculation KVM: SVM: prevent DBG_DECRYPT and DBG_ENCRYPT overflow kmemleak: powerpc: skip scanning holes in the .bss section hugetlbfs: fix memory leak for resv_map sh: fix multiple function definition build errors xsysace: Fix error handling in ace_setup fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock ARM: orion: don't use using 64-bit DMA masks ARM: iop: don't use using 64-bit DMA masks block: pass no-op callback to INIT_WORK(). perf/x86/amd: Update generic hardware cache events for Family 17h Bluetooth: btusb: request wake pin with NOAUTOEN Bluetooth: mediatek: fix up an error path to restore bdev->tx_state clk: qcom: Add missing freq for usb30_master_clk on 8998 staging: iio: adt7316: allow adt751x to use internal vref for all dacs staging: iio: adt7316: fix the dac read calculation staging: iio: adt7316: fix the dac write calculation scsi: RDMA/srpt: Fix a credit leak for aborted commands ASoC: Intel: bytcr_rt5651: Revert "Fix DMIC map headsetmic mapping" ASoC: wm_adsp: Correct handling of compressed streams that restart ASoC: stm32: fix sai driver name initialisation platform/x86: intel_pmc_core: Fix PCH IP name platform/x86: intel_pmc_core: Handle CFL regmap properly IB/core: Unregister notifier before freeing MAD security IB/core: Fix potential memory leak while creating MAD agents IB/core: Destroy QP if XRC QP fails Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ Input: stmfts - acknowledge that setting brightness is a blocking call gpio: mxc: add check to return defer probe if clock tree NOT ready selinux: avoid silent denials in permissive mode under RCU walk selinux: never allow relabeling on context mounts mac80211: Honor SW_CRYPTO_CONTROL for unicast keys in AP VLAN mode powerpc/mm/hash: Handle mmap_min_addr correctly in get_unmapped_area topdown search x86/mce: Improve error message when kernel cannot recover, p2 clk: x86: Add system specific quirk to mark clocks as critical x86/mm/KASLR: Fix the size of the direct mapping section x86/mm: Fix a crash with kmemleak_scan() x86/mm/tlb: Revert "x86/mm: Align TLB invalidation info" i2c: i2c-stm32f7: Fix SDADEL minimum formula media: v4l2: i2c: ov7670: Fix PLL bypass register values ASoC: wm_adsp: Check for buffer in trigger stop mm/kmemleak.c: fix unused-function warning Linux 4.19.41 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Arnd Bergmann
|
e7c2d06656 |
mm/kmemleak.c: fix unused-function warning
commit dce5b0bdeec61bdbee56121ceb1d014151d5cab1 upstream. The only references outside of the #ifdef have been removed, so now we get a warning in non-SMP configurations: mm/kmemleak.c:1404:13: error: unused function 'scan_large_block' [-Werror,-Wunused-function] Add a new #ifdef around it. Link: http://lkml.kernel.org/r/20190416123148.3502045-1-arnd@arndb.de Fixes: 298a32b13208 ("kmemleak: powerpc: skip scanning holes in the .bss section") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Catalin Marinas
|
6a62bbe823 |
kmemleak: powerpc: skip scanning holes in the .bss section
[ Upstream commit 298a32b132087550d3fa80641ca58323c5dfd4d9 ]
Commit
|
||
Greg Kroah-Hartman
|
dda1dd3ba3 |
This is the 4.19.39 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzNPTYACgkQONu9yGCS aT5+LA//TEZMvKwfq8ASsp17ncB9jNrV9G5gwwTv4Aa+Zv16LN5pLhlTwqJE1JCc wJzisFxCGNil0LTZYlpTwGnPvjNB4MPqyMEU5B7GkLsbRCjCJCZUpCEhrsEhR6D+ finLDnSAHh8+c/kvETIazvzTw9BP8YUDe7j933L54pXONfAHUvYDkz57ri5oEyIE 9g+Haot6gytgnPH1iU3A+uSjMll2naBHT1ga7F6fqU6EuhIogtEepG1Y6TWT6zfQ IDO4HZwMhdi75ITkv06lh/1zdQaMaGWE+R7svKfQ5UoH89eIUvRjKl8Pkdd5JPha 1LOz5HBBTxCZdzl70UF8n0ZzlNYP71rQ1CT4bo+U4kPNWNvKZb2MxJEOxB7/JqJo qNrv1lBE1ERBM5s8NlTSx1P+RSs3G6kLhDIgz+uB8ALl87zDkwUA7ynP13NFsP2c QY7E0eianWSYJxrqhkgWXQ/ToYRLsCBARvAHMZgDveVUNe83SyK1NFhZD77weDSC WtirUFzKpw0ZtVjs8Ox6zh4HRtAFXsyPvnlPbtrMdaU0GtEox2Skg7SKboEZfyUa 8nu98ZdEdzqAKKUOs2o/3vVFnq4NKmw1khpo1OEW5Ob76KdDHYGMiiGOYH1vEGY1 OxqNyNgw8/8OT6drCrAgx7+4X+BTXdiSXORUyp9AP7WpELwMy4Q= =35Eg -----END PGP SIGNATURE----- Merge 4.19.39 into android-4.19 Changes in 4.19.39 selinux: use kernel linux/socket.h for genheaders and mdp Revert "ACPICA: Clear status of GPEs before enabling them" mm: make page ref count overflow check tighter and more explicit mm: add 'try_get_page()' helper function mm: prevent get_user_pages() from overflowing page refcount fs: prevent page refcount overflow in pipe_buf_get ARM: dts: bcm283x: Fix hdmi hpd gpio pull s390: limit brk randomization to 32MB net: ieee802154: fix a potential NULL pointer dereference ieee802154: hwsim: propagate genlmsg_reply return code net: stmmac: don't set own bit too early for jumbo frames qlcnic: Avoid potential NULL pointer dereference xsk: fix umem memory leak on cleanup staging: axis-fifo: add CONFIG_OF dependency staging, mt7621-pci: fix build without pci support netfilter: nft_set_rbtree: check for inactive element after flag mismatch netfilter: bridge: set skb transport_header before entering NF_INET_PRE_ROUTING netfilter: fix NETFILTER_XT_TARGET_TEE dependencies netfilter: ip6t_srh: fix NULL pointer dereferences s390/qeth: fix race when initializing the IP address table ARM: imx51: fix a leaked reference by adding missing of_node_put sc16is7xx: missing unregister/delete driver on error in sc16is7xx_init() serial: ar933x_uart: Fix build failure with disabled console KVM: arm64: Reset the PMU in preemptible context KVM: arm/arm64: vgic-its: Take the srcu lock when writing to guest memory KVM: arm/arm64: vgic-its: Take the srcu lock when parsing the memslots usb: dwc3: pci: add support for Comet Lake PCH ID usb: gadget: net2280: Fix overrun of OUT messages usb: gadget: net2280: Fix net2280_dequeue() usb: gadget: net2272: Fix net2272_dequeue() ARM: dts: pfla02: increase phy reset duration i2c: i801: Add support for Intel Comet Lake net: ks8851: Dequeue RX packets explicitly net: ks8851: Reassert reset pin if chip ID check fails net: ks8851: Delay requesting IRQ until opened net: ks8851: Set initial carrier state to down staging: rtl8188eu: Fix potential NULL pointer dereference of kcalloc staging: rtlwifi: rtl8822b: fix to avoid potential NULL pointer dereference staging: rtl8712: uninitialized memory in read_bbreg_hdl() staging: rtlwifi: Fix potential NULL pointer dereference of kzalloc net: macb: Add null check for PCLK and HCLK net/sched: don't dereference a->goto_chain to read the chain index ARM: dts: imx6qdl: Fix typo in imx6qdl-icore-rqs.dtsi drm/tegra: hub: Fix dereference before check NFS: Fix a typo in nfs_init_timeout_values() net: xilinx: fix possible object reference leak net: ibm: fix possible object reference leak net: ethernet: ti: fix possible object reference leak drm: Fix drm_release() and device unplug gpio: aspeed: fix a potential NULL pointer dereference drm/meson: Fix invalid pointer in meson_drv_unbind() drm/meson: Uninstall IRQ handler ARM: davinci: fix build failure with allnoconfig scsi: mpt3sas: Fix kernel panic during expander reset scsi: aacraid: Insure we don't access PCIe space during AER/EEH scsi: qla4xxx: fix a potential NULL pointer dereference usb: usb251xb: fix to avoid potential NULL pointer dereference leds: trigger: netdev: fix refcnt leak on interface rename x86/realmode: Don't leak the trampoline kernel address usb: u132-hcd: fix resource leak ceph: fix use-after-free on symlink traversal scsi: zfcp: reduce flood of fcrscn1 trace records on multi-element RSCN x86/mm: Don't exceed the valid physical address space libata: fix using DMA buffers on stack gpio: of: Fix of_gpiochip_add() error path nvme-multipath: relax ANA state check perf machine: Update kernel map address and re-order properly kconfig/[mn]conf: handle backspace (^H) key iommu/amd: Reserve exclusion range in iova-domain ptrace: take into account saved_sigmask in PTRACE{GET,SET}SIGMASK leds: pca9532: fix a potential NULL pointer dereference leds: trigger: netdev: use memcpy in device_name_store Linux 4.19.39 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Linus Torvalds
|
d972ebbf42 |
mm: prevent get_user_pages() from overflowing page refcount
commit 8fde12ca79aff9b5ba951fce1a2641901b8d8e64 upstream. If the page refcount wraps around past zero, it will be freed while there are still four billion references to it. One of the possible avenues for an attacker to try to make this happen is by doing direct IO on a page multiple times. This patch makes get_user_pages() refuse to take a new page reference if there are already more than two billion references to the page. Reported-by: Jann Horn <jannh@google.com> Acked-by: Matthew Wilcox <willy@infradead.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Greg Kroah-Hartman
|
5e7b4fbe36 |
This is the 4.19.38 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzKo0YACgkQONu9yGCS aT4dbQ//U1bo/8bdBJec+a0aNMy3cxzPF1Ozbrb/vEaHofj1BR87hgo4BODBO7pu 6ppwloPle9VFrsfT1FYOjsicUBhT4NmieHlsC3msAR4xlBEbHEOBTEbUdu3HinGV Jn/uL/NDTrq+wA5rROGOh9sTlQ5w6dqItjHAWvnGkXlerbUJwIgnzbgH5qGBFZhQ 6SbPmqJv5V+C+qYy3yXNs2CnbtS7+cfulLy26MNnkFMEZGbHTWeNbeu9H41AK6T4 xtO8INse28RD6lbAPvW/xb//iAXsOHv+7KF1TgtZq89Z1RmlaqLSdPdgTYvCxm+Y RhWa8KyIdhADJ8z8sRcPviFI5bR65cfCMUAEgBcFNYYByDv36KCBLsXajn4JbBsF OOOtqnGaZyAJBZgMXySfVJIXLAx7cUlt07YD9cIdsOzjl1DCMP76XvypeGXLw5Mk ZBXBJ+By+8jwnE7PAtecij/VH6qCDsfn4HqoRELsRLVahFsnFFid5lutVIjsO21j QHrwi4hChuYGa89MhD48KyC2ZuaQmbs3rm6F3O0iQ0aipknvlsDoB4jYYp9qRI04 0FYMlZLlVyg+sNYOM2XvTtpOBFa1PFwFwscqXoyt0CGtig0D+pD3gDYExRONj6Fp 8h+OUBWbVHWscceMc6G1p/Qu+YcgmQTu8CFAUO8l/X8xq655c1A= =isRm -----END PGP SIGNATURE----- Merge 4.19.38 into android-4.19 Changes in 4.19.38 netfilter: nft_compat: use refcnt_t type for nft_xt reference count netfilter: nft_compat: make lists per netns netfilter: nf_tables: split set destruction in deactivate and destroy phase netfilter: nft_compat: destroy function must not have side effects netfilter: nf_tables: warn when expr implements only one of activate/deactivate netfilter: nf_tables: unbind set in rule from commit path netfilter: nft_compat: don't use refcount_inc on newly allocated entry netfilter: nft_compat: use .release_ops and remove list of extension netfilter: nf_tables: fix set double-free in abort path netfilter: nf_tables: bogus EBUSY when deleting set after flush netfilter: nf_tables: bogus EBUSY in helper removal from transaction net/ibmvnic: Fix RTNL deadlock during device reset net: mvpp2: fix validate for PPv2.1 ext4: fix some error pointer dereferences tipc: handle the err returned from cmd header function loop: do not print warn message if partition scan is successful drm/rockchip: fix for mailbox read validation. vsock/virtio: fix kernel panic from virtio_transport_reset_no_sock ipvs: fix warning on unused variable powerpc/vdso32: fix CLOCK_MONOTONIC on PPC64 ALSA: hda/ca0132 - Fix build error without CONFIG_PCI net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework cifs: fix memory leak in SMB2_read cifs: do not attempt cifs operation on smb2+ rename error tracing: Fix a memory leak by early error exit in trace_pid_write() tracing: Fix buffer_ref pipe ops gpio: eic: sprd: Fix incorrect irq type setting for the sync EIC zram: pass down the bvec we need to read into in the work struct lib/Kconfig.debug: fix build error without CONFIG_BLOCK MIPS: scall64-o32: Fix indirect syscall number load trace: Fix preempt_enable_no_resched() abuse IB/rdmavt: Fix frwr memory registration RDMA/mlx5: Do not allow the user to write to the clock page sched/numa: Fix a possible divide-by-zero ceph: only use d_name directly when parent is locked ceph: ensure d_name stability in ceph_dentry_hash() ceph: fix ci->i_head_snapc leak nfsd: Don't release the callback slot unless it was actually held sunrpc: don't mark uninitialised items as VALID. perf/x86/intel: Update KBL Package C-state events to also include PC8/PC9/PC10 counters Input: synaptics-rmi4 - write config register values to the right offset vfio/type1: Limit DMA mappings per container dmaengine: sh: rcar-dmac: With cyclic DMA residue 0 is valid dmaengine: sh: rcar-dmac: Fix glitch in dmaengine_tx_status ARM: 8857/1: efi: enable CP15 DMB instructions before cleaning the cache powerpc/mm/radix: Make Radix require HUGETLB_PAGE drm/vc4: Fix memory leak during gpu reset. Revert "drm/i915/fbdev: Actually configure untiled displays" drm/vc4: Fix compilation error reported by kbuild test bot USB: Add new USB LPM helpers USB: Consolidate LPM checks to avoid enabling LPM twice slip: make slhc_free() silently accept an error pointer intel_th: gth: Fix an off-by-one in output unassigning fs/proc/proc_sysctl.c: Fix a NULL pointer dereference workqueue: Try to catch flush_work() without INIT_WORK(). binder: fix handling of misaligned binder object sched/deadline: Correctly handle active 0-lag timers NFS: Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family. netfilter: ebtables: CONFIG_COMPAT: drop a bogus WARN_ON fm10k: Fix a potential NULL pointer dereference tipc: check bearer name with right length in tipc_nl_compat_bearer_enable tipc: check link name with right length in tipc_nl_compat_link_set net: netrom: Fix error cleanup path of nr_proto_init net/rds: Check address length before reading address family rxrpc: fix race condition in rxrpc_input_packet() aio: clear IOCB_HIPRI aio: use assigned completion handler aio: separate out ring reservation from req allocation aio: don't zero entire aio_kiocb aio_get_req() aio: use iocb_put() instead of open coding it aio: split out iocb copy from io_submit_one() aio: abstract out io_event filler helper aio: initialize kiocb private in case any filesystems expect it. aio: simplify - and fix - fget/fput for io_submit() pin iocb through aio. aio: fold lookup_kiocb() into its sole caller aio: keep io_event in aio_kiocb aio: store event at final iocb_put() Fix aio_poll() races x86, retpolines: Raise limit for generating indirect calls from switch-case x86/retpolines: Disable switch jump tables when retpolines are enabled mm: Fix warning in insert_pfn() x86/fpu: Don't export __kernel_fpu_{begin,end}() ipv4: add sanity checks in ipv4_link_failure() ipv4: set the tcp_min_rtt_wlen range from 0 to one day mlxsw: spectrum: Fix autoneg status in ethtool net/mlx5e: ethtool, Remove unsupported SFP EEPROM high pages query net: rds: exchange of 8K and 1M pool net/rose: fix unbound loop in rose_loopback_timer() net: stmmac: move stmmac_check_ether_addr() to driver probe net/tls: fix refcount adjustment in fallback stmmac: pci: Adjust IOT2000 matching team: fix possible recursive locking when add slaves net: hns: Fix WARNING when hns modules installed mlxsw: pci: Reincrease PCI reset timeout mlxsw: spectrum: Put MC TCs into DWRR mode net/mlx5e: Fix the max MTU check in case of XDP net/mlx5e: Fix use-after-free after xdp_return_frame net/tls: avoid potential deadlock in tls_set_device_offload_rx() net/tls: don't leak IV and record seq when offload fails powerpc/fsl: Add FSL_PPC_BOOK3E as supported arch for nospectre_v2 boot arg Linux 4.19.38 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Jan Kara
|
423497a96d |
mm: Fix warning in insert_pfn()
commit f2c57d91b0d96aa13ccff4e3b178038f17b00658 upstream. In DAX mode a write pagefault can race with write(2) in the following way: CPU0 CPU1 write fault for mapped zero page (hole) dax_iomap_rw() iomap_apply() xfs_file_iomap_begin() - allocates blocks dax_iomap_actor() invalidate_inode_pages2_range() - invalidates radix tree entries in given range dax_iomap_pte_fault() grab_mapping_entry() - no entry found, creates empty ... xfs_file_iomap_begin() - finds already allocated block ... vmf_insert_mixed_mkwrite() - WARNs and does nothing because there is still zero page mapped in PTE unmap_mapping_pages() This race results in WARN_ON from insert_pfn() and is occasionally triggered by fstest generic/344. Note that the race is otherwise harmless as before write(2) on CPU0 is finished, we will invalidate page tables properly and thus user of mmap will see modified data from write(2) from that point on. So just restrict the warning only to the case when the PFN in PTE is not zero page. Link: http://lkml.kernel.org/r/20180824154542.26872-1-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Greg Kroah-Hartman
|
9bf5904866 |
This is the 4.19.37 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzEBokACgkQONu9yGCS aT7G7w/8C93URGM67H7ynkCHTo8y3hkRE2rUJPckJNdS+IJKuecmOphak4tF0h07 qPWDPya70Q1S0cNu661TuVAGrhmE5jBx8/xfZaAOeaaU0xtZive+TfSHdAQQaHct tDk32O85N1aZ49rDEz9ibr7CGLVFDZtyhxV5gFMYQpjbqA7MzJC61zQg1jHyPSCz sKjQzW+uXMuSLru8jXHMvp41K5sFFp5gYdQbAVKlWtt79qPxWdxZPJbLbM0LBbtz XHt9E45Ink3ALF9P6tZ4e6gi4zzlNbh9yR92+X5NK5/8AP57yWba4W9JHWIfMBpC yyDYTOEAzdxqa2Jrgwr4WTdKH6U7FbQZFmWfTBB4VotbHLBWkVXj0OnF10qxP9eQ p5wGDTJAlWezhX1BTCfYroglDsvqhj+gHfwHzDRF1Del1dRgydRMQc0qLD1d9tul ovzwOkx1xyJrM2wq05I5gc0FoVyOL6/KCwqMrpVfKa3WKY7Uttjgf56bMqdIIkns i/6opzF+wtvwlLlCoXgYPXdm6kbWdgvS+skVHfWcHmZFMuGrFGGzJNwzXb7qnVjK T0hD1OestsfTyD/amnDNYkNeCkoOZqtHAi+xYOQR4kGY5cxP1lQJf85MgAy6RZSY h+rjys76Qf6+hTCtrowLr8SgksX4ACWxm+UarfAiiNnnDXwGfu8= =SrFV -----END PGP SIGNATURE----- Merge 4.19.37 into android-4.19 Changes in 4.19.37 bonding: fix event handling for stacked bonds failover: allow name change on IFF_UP slave interfaces net: atm: Fix potential Spectre v1 vulnerabilities net: bridge: fix per-port af_packet sockets net: bridge: multicast: use rcu to access port list from br_multicast_start_querier net: Fix missing meta data in skb with vlan packet net: fou: do not use guehdr after iptunnel_pull_offloads in gue_udp_recv tcp: tcp_grow_window() needs to respect tcp_space() team: set slave to promisc if team is already in promisc mode tipc: missing entries in name table of publications vhost: reject zero size iova range ipv4: recompile ip options in ipv4_link_failure ipv4: ensure rcu_read_lock() in ipv4_link_failure() net: thunderx: raise XDP MTU to 1508 net: thunderx: don't allow jumbo frames with XDP net/mlx5: FPGA, tls, hold rcu read lock a bit longer net/tls: prevent bad memory access in tls_is_sk_tx_device_offloaded() net/mlx5: FPGA, tls, idr remove on flow delete route: Avoid crash from dereferencing NULL rt->from sch_cake: Use tc_skb_protocol() helper for getting packet protocol sch_cake: Make sure we can write the IP header before changing DSCP bits nfp: flower: replace CFI with vlan present nfp: flower: remove vlan CFI bit from push vlan action sch_cake: Simplify logic in cake_select_tin() net: IP defrag: encapsulate rbtree defrag code into callable functions net: IP6 defrag: use rbtrees for IPv6 defrag net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c CIFS: keep FileInfo handle live during oplock break cifs: Fix use-after-free in SMB2_write cifs: Fix use-after-free in SMB2_read cifs: fix handle leak in smb2_query_symlink() KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU KVM: x86: svm: make sure NMI is injected after nmi_singlestep Staging: iio: meter: fixed typo staging: iio: ad7192: Fix ad7193 channel address iio: gyro: mpu3050: fix chip ID reading iio/gyro/bmg160: Use millidegrees for temperature scale iio:chemical:bme680: Fix, report temperature in millidegrees iio:chemical:bme680: Fix SPI read interface iio: cros_ec: Fix the maths for gyro scale calculation iio: ad_sigma_delta: select channel when reading register iio: dac: mcp4725: add missing powerdown bits in store eeprom iio: Fix scan mask selection iio: adc: at91: disable adc channel interrupt in timeout case iio: core: fix a possible circular locking dependency io: accel: kxcjk1013: restore the range after resume. staging: most: core: use device description as name staging: comedi: vmk80xx: Fix use of uninitialized semaphore staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf staging: comedi: ni_usb6501: Fix use of uninitialized mutex staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf ALSA: hda/realtek - add two more pin configuration sets to quirk table ALSA: core: Fix card races between register and disconnect Input: elan_i2c - add hardware ID for multiple Lenovo laptops serial: sh-sci: Fix HSCIF RX sampling point adjustment serial: sh-sci: Fix HSCIF RX sampling point calculation vt: fix cursor when clearing the screen scsi: core: set result when the command cannot be dispatched Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO" Revert "svm: Fix AVIC incomplete IPI emulation" coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier crypto: x86/poly1305 - fix overflow during partial reduction drm/ttm: fix out-of-bounds read in ttm_put_pages() v2 arm64: futex: Restore oldval initialization to work around buggy compilers x86/kprobes: Verify stack frame on kretprobe kprobes: Mark ftrace mcount handler functions nokprobe kprobes: Fix error check when reusing optimized probes rt2x00: do not increment sequence number while re-transmitting mac80211: do not call driver wake_tx_queue op during reconfig drm/amdgpu/gmc9: fix VM_L2_CNTL3 programming perf/x86/amd: Add event map for AMD Family 17h x86/cpu/bugs: Use __initconst for 'const' init data perf/x86: Fix incorrect PEBS_REGS x86/speculation: Prevent deadlock on ssb_state::lock timers/sched_clock: Prevent generic sched_clock wrap caused by tick_freeze() nfit/ars: Remove ars_start_flags nfit/ars: Introduce scrub_flags nfit/ars: Allow root to busy-poll the ARS state machine nfit/ars: Avoid stale ARS results mmc: sdhci: Fix data command CRC error handling mmc: sdhci: Rename SDHCI_ACMD12_ERR and SDHCI_INT_ACMD12ERR mmc: sdhci: Handle auto-command errors modpost: file2alias: go back to simple devtable lookup modpost: file2alias: check prototype of handler tpm/tpm_i2c_atmel: Return -E2BIG when the transfer is incomplete tpm: Fix the type of the return value in calc_tpm2_event_size() Revert "kbuild: use -Oz instead of -Os when using clang" sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup device_cgroup: fix RCU imbalance in error case mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n ALSA: info: Fix racy addition/deletion of nodes percpu: stop printing kernel addresses tools include: Adopt linux/bits.h ASoC: rockchip: add missing INTERLEAVED PCM attribute i2c-hid: properly terminate i2c_hid_dmi_desc_override_table[] array Revert "locking/lockdep: Add debug_locks check in __lock_downgrade()" kernel/sysctl.c: fix out-of-bounds access when setting file-max Linux 4.19.37 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Matteo Croce
|
6580376fe8 |
percpu: stop printing kernel addresses
commit 00206a69ee32f03e6f40837684dcbe475ea02266 upstream. Since commit |
||
Konstantin Khlebnikov
|
1343fd8f96 |
mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
commit e8277b3b52240ec1caad8e6df278863e4bf42eac upstream. Commit |
||
Andrea Arcangeli
|
6ff17bc593 |
coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
commit 04f5866e41fb70690e28397487d8bd8eea7d712a upstream.
The core dumping code has always run without holding the mmap_sem for
writing, despite that is the only way to ensure that the entire vma
layout will not change from under it. Only using some signal
serialization on the processes belonging to the mm is not nearly enough.
This was pointed out earlier. For example in Hugh's post from Jul 2017:
https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils
"Not strictly relevant here, but a related note: I was very surprised
to discover, only quite recently, how handle_mm_fault() may be called
without down_read(mmap_sem) - when core dumping. That seems a
misguided optimization to me, which would also be nice to correct"
In particular because the growsdown and growsup can move the
vm_start/vm_end the various loops the core dump does around the vma will
not be consistent if page faults can happen concurrently.
Pretty much all users calling mmget_not_zero()/get_task_mm() and then
taking the mmap_sem had the potential to introduce unexpected side
effects in the core dumping code.
Adding mmap_sem for writing around the ->core_dump invocation is a
viable long term fix, but it requires removing all copy user and page
faults and to replace them with get_dump_page() for all binary formats
which is not suitable as a short term fix.
For the time being this solution manually covers the places that can
confuse the core dump either by altering the vma layout or the vma flags
while it runs. Once ->core_dump runs under mmap_sem for writing the
function mmget_still_valid() can be dropped.
Allowing mmap_sem protected sections to run in parallel with the
coredump provides some minor parallelism advantage to the swapoff code
(which seems to be safe enough by never mangling any vma field and can
keep doing swapins in parallel to the core dumping) and to some other
corner case.
In order to facilitate the backporting I added "Fixes: 86039bd3b4e6"
however the side effect of this same race condition in /proc/pid/mem
should be reproducible since before 2.6.12-rc2 so I couldn't add any
other "Fixes:" because there's no hash beyond the git genesis commit.
Because find_extend_vma() is the only location outside of the process
context that could modify the "mm" structures under mmap_sem for
reading, by adding the mmget_still_valid() check to it, all other cases
that take the mmap_sem for reading don't need the new check after
mmget_not_zero()/get_task_mm(). The expand_stack() in page fault
context also doesn't need the new check, because all tasks under core
dumping are frozen.
Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com
Fixes:
|
||
Greg Kroah-Hartman
|
10f41ccfc7 |
This is the 4.19.36 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAly6xzUACgkQONu9yGCS aT5sIA//b7nAk2zuhmbkonsBfzFq5uBJmqXcCrOgy3XHMs4fE+Q11kLd1wMAV7dx U7FNHe4PIJ8Rczxgqr2VP3VmFbV6UuTK+UTclJKfbV3ouIAQiQBuutABBmbDUj2p FInc/yAYyhVc9n7gX78czTiUxKnKi4+sisUYDCZPr3hr6jDPcLvm/WVWdyrcXJje rYFNmE/2MBH1NofG+MOpq+ILhKHXlf2APN2/spl+I42a8bwodiSl9g+dhuWr7wgT Ln2Ocf7BZ6BPCQKoveZdD1Gd56NNR/lJh4ulqpuhaZw4Yp+B/C7GmrBtdPzVSGka IwPWoSc9/9VSUl+ooSZHms78VLbqq0rNNclskL2bN6m962u04Eu7sB2Tg/bwUs52 Wkcw0DY4J/oMJtj/CMHcQOUPsk6vwHxqnjsj+LYJ1ZjHO68tUshnENxXrbAoDc45 2fuY3TCA+XqFvqNt5HbkLPtFR78u8QmZ1lP/Pkri6xoG/GA6O0EAxhS0Z9hncGK7 8wJNuxLMd2UX94wlajQ+DF7yyCU4HOFEdeSEOwlHHBid/fckXsGzL2tKJUAbbUPP ux3An8kJHni8nQrmUkyy1Nx29ROyAFxBLOQshWGpXgJrV3qRMYLyB2Icv0WYCGFk zZCTupPgvb46u81VzqxrLH4RZdy4Ar4uB3BQGPKs596rlYmvnSo= =CArs -----END PGP SIGNATURE----- Merge 4.19.36 into android-4.19 Changes in 4.19.36 ARC: u-boot args: check that magic number is correct arc: hsdk_defconfig: Enable CONFIG_BLK_DEV_RAM inotify: Fix fsnotify_mark refcount leak in inotify_update_existing_watch() perf/core: Restore mmap record type correctly ext4: avoid panic during forced reboot ext4: add missing brelse() in add_new_gdb_meta_bg() ext4: report real fs size after failed resize ALSA: echoaudio: add a check for ioremap_nocache ALSA: sb8: add a check for request_region auxdisplay: hd44780: Fix memory leak on ->remove() drm/udl: use drm_gem_object_put_unlocked. IB/mlx4: Fix race condition between catas error reset and aliasguid flows i40iw: Avoid panic when handling the inetdev event mmc: davinci: remove extraneous __init annotation ALSA: opl3: fix mismatch between snd_opl3_drum_switch definition and declaration thermal/intel_powerclamp: fix __percpu declaration of worker_data thermal: samsung: Fix incorrect check after code merge thermal: bcm2835: Fix crash in bcm2835_thermal_debugfs thermal/int340x_thermal: Add additional UUIDs thermal/int340x_thermal: fix mode setting thermal/intel_powerclamp: fix truncated kthread name scsi: iscsi: flush running unbind operations when removing a session sched/cpufreq: Fix 32-bit math overflow sched/core: Fix buffer overflow in cgroup2 property cpu.max x86/mm: Don't leak kernel addresses tools/power turbostat: return the exit status of a command perf list: Don't forget to drop the reference to the allocated thread_map perf config: Fix an error in the config template documentation perf config: Fix a memory leak in collect_config() perf build-id: Fix memory leak in print_sdt_events() perf top: Fix error handling in cmd_top() perf hist: Add missing map__put() in error case perf evsel: Free evsel->counts in perf_evsel__exit() perf tests: Fix a memory leak of cpu_map object in the openat_syscall_event_on_all_cpus test perf tests: Fix memory leak by expr__find_other() in test__expr() perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test() ACPI / utils: Drop reference in test for device presence PM / Domains: Avoid a potential deadlock blk-iolatency: #include "blk.h" drm/exynos/mixer: fix MIXER shadow registry synchronisation code irqchip/stm32: Don't clear rising/falling config registers at init irqchip/mbigen: Don't clear eventid when freeing an MSI x86/hpet: Prevent potential NULL pointer dereference x86/hyperv: Prevent potential NULL pointer dereference x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors drm/nouveau/debugfs: Fix check of pm_runtime_get_sync failure iommu/vt-d: Check capability before disabling protected memory x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error fix incorrect error code mapping for OBJECTID_NOT_FOUND x86/gart: Exclude GART aperture from kcore ext4: prohibit fstrim in norecovery mode drm/cirrus: Use drm_framebuffer_put to avoid kernel oops in clean-up gpio: pxa: handle corner case of unprobed device rsi: improve kernel thread handling to fix kernel panic f2fs: fix to avoid NULL pointer dereference on se->discard_map 9p: do not trust pdu content for stat item size 9p locks: add mount option for lock retry interval ASoC: Fix UBSAN warning at snd_soc_get/put_volsw_sx() f2fs: fix to do sanity check with current segment number netfilter: xt_cgroup: shrink size of v2 path serial: uartps: console_setup() can't be placed to init section powerpc/pseries: Remove prrn_work workqueue media: au0828: cannot kfree dev before usb disconnect Bluetooth: Fix debugfs NULL pointer dereference HID: i2c-hid: override HID descriptors for certain devices pinctrl: core: make sure strcmp() doesn't get a null parameter ARM: samsung: Limit SAMSUNG_PM_CHECK config option to non-Exynos platforms usbip: fix vhci_hcd controller counting ACPI / SBS: Fix GPE storm on recent MacBookPro's HID: usbhid: Add quirk for Redragon/Dragonrise Seymur 2 KVM: nVMX: restore host state in nested_vmx_vmexit for VMFail compiler.h: update definition of unreachable() netfilter: nf_flow_table: remove flowtable hook flush routine in netns exit routine f2fs: cleanup dirty pages if recover failed net: stmmac: Set OWN bit for jumbo frames cifs: fallback to older infolevels on findfirst queryinfo retry kernel: hung_task.c: disable on suspend platform/x86: Add Intel AtomISP2 dummy / power-management driver drm/ttm: Fix bo_global and mem_global kfree error ALSA: hda: fix front speakers on Huawei MBXP ACPI: EC / PM: Disable non-wakeup GPEs for suspend-to-idle net/rds: fix warn in rds_message_alloc_sgs xfrm: destroy xfrm_state synchronously on net exit path crypto: sha256/arm - fix crash bug in Thumb2 build crypto: sha512/arm - fix crash bug in Thumb2 build net: ip6_gre: fix possible NULL pointer dereference in ip6erspan_set_version iommu/dmar: Fix buffer overflow during PCI bus notification scsi: core: Avoid that system resume triggers a kernel warning soc/tegra: pmc: Drop locking from tegra_powergate_is_powered() lkdtm: Print real addresses lkdtm: Add tests for NULL pointer dereference drm/panel: panel-innolux: set display off in innolux_panel_unprepare crypto: axis - fix for recursive locking from bottom half Revert "ACPI / EC: Remove old CLEAR_ON_RESUME quirk" coresight: cpu-debug: Support for CA73 CPUs PCI: Blacklist power management of Gigabyte X299 DESIGNARE EX PCIe ports drm/nouveau/volt/gf117: fix speedo readout register ARM: 8839/1: kprobe: make patch_lock a raw_spinlock_t drm/amdkfd: use init_mqd function to allocate object for hid_mqd (CI) appletalk: Fix use-after-free in atalk_proc_exit lib/div64.c: off by one in shift rxrpc: Fix client call connect/disconnect race f2fs: fix to dirty inode for i_mode recovery include/linux/swap.h: use offsetof() instead of custom __swapoffset macro bpf: fix use after free in bpf_evict_inode IB/hfi1: Failed to drain send queue when QP is put into error state mm: hide incomplete nr_indirectly_reclaimable in /proc/zoneinfo mm: hide incomplete nr_indirectly_reclaimable in sysfs appletalk: Fix compile regression Linux 4.19.36 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Roman Gushchin
|
d49dea545a |
mm: hide incomplete nr_indirectly_reclaimable in /proc/zoneinfo
[fixed differently upstream, this is a work-around to resolve it for 4.19.y] Yongqin reported that /proc/zoneinfo format is broken in 4.14 due to commit |
||
Greg Kroah-Hartman
|
3e166630b0 |
This is the 4.19.35 stable release
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAly2yf8ACgkQONu9yGCS
aT5OkA/+Jf3UWnM8MdPoxVUROYg6WRSMThvuKkcspkvXxXqu7Z9a/+meqdIZ0NDr
j+RdhCfZf7RRNpTpYB9jMqXXuRFhVWniAVM3/me2LxFvwyKkNqURyIz9azh7T0VW
nDa2NDQ9C0IyLFLW4cyzZncrmWntgW5XKZfVot1TMNpryHB4FPsKP5rInDQpSNIu
bbJBWuZziUqdEnk3TWuqdQr4ZTXgnjYSBaTbOQBU3F6E88TRmFIatyYovLcJCQh5
jBmXV3U6mGPZe64I+JXj8dTp32WD74afMX1YSZmjdLL6KgbD9a4k47UzpXzEbKT8
uMG057zPiYcwOyMCFZOWsCYIH7Yr4oMIAzOEyYcDI1uCuKd4aESdpgiWfUWambR8
BJO5oGELzLArMVWusWK7xBfm+Et2ePOFWC7V9MRVVQJLdAWkEafYuLoPqrhhqxQX
Iu3ThoY3UmEoNqi1345gmxQBJfbOqdK7CwQGzKOtiPNs0nyMfg62B9Rcr9O+FE5A
+womRQF2Tik1cZhXxkJVSD4f+orL+xuOu59ufvMTBe4co9tPVFm9Qp6SeWoOVuwZ
l8SS7DD5wb7vzDUZOO4KmA7D6p3YQWdBvGeN6jKqjASIFS9t5Sb+fNiKznTo/WCT
1I68sLm7/rs95+WazKu66ewW8bh91VAo+Gmpfo5zEJjcfajnDxY=
=F1Te
-----END PGP SIGNATURE-----
Merge 4.19.35 into android-4.19
Changes in 4.19.35
kvm: nVMX: NMI-window and interrupt-window exiting should wake L2 from HLT
drm/i915/gvt: do not let pin count of shadow mm go negative
powerpc/tm: Limit TM code inside PPC_TRANSACTIONAL_MEM
hv_netvsc: Fix unwanted wakeup after tx_disable
ibmvnic: Fix completion structure initialization
ip6_tunnel: Match to ARPHRD_TUNNEL6 for dev type
ipv6: Fix dangling pointer when ipv6 fragment
ipv6: sit: reset ip header pointer in ipip6_rcv
kcm: switch order of device registration to fix a crash
net: ethtool: not call vzalloc for zero sized memory request
net-gro: Fix GRO flush when receiving a GSO packet.
net/mlx5: Decrease default mr cache size
netns: provide pure entropy for net_hash_mix()
net: rds: force to destroy connection if t_sock is NULL in rds_tcp_kill_sock().
net/sched: act_sample: fix divide by zero in the traffic path
net/sched: fix ->get helper of the matchall cls
openvswitch: fix flow actions reallocation
qmi_wwan: add Olicard 600
r8169: disable ASPM again
sctp: initialize _pad of sockaddr_in before copying to user memory
tcp: Ensure DCTCP reacts to losses
tcp: fix a potential NULL pointer dereference in tcp_sk_exit
vrf: check accept_source_route on the original netdevice
net/mlx5e: Fix error handling when refreshing TIRs
net/mlx5e: Add a lock on tir list
nfp: validate the return code from dev_queue_xmit()
nfp: disable netpoll on representors
bnxt_en: Improve RX consumer index validity check.
bnxt_en: Reset device on RX buffer errors.
net: ip_gre: fix possible use-after-free in erspan_rcv
net: ip6_gre: fix possible use-after-free in ip6erspan_rcv
net: core: netif_receive_skb_list: unlist skb before passing to pt->func
r8169: disable default rx interrupt coalescing on RTL8168
net: mlx5: Add a missing check on idr_find, free buf
net/mlx5e: Update xoff formula
net/mlx5e: Update xon formula
kbuild: deb-pkg: fix bindeb-pkg breakage when O= is used
kbuild: clang: choose GCC_TOOLCHAIN_DIR not on LD
x86/vdso: Drop implicit common-page-size linker flag
lib/string.c: implement a basic bcmp
Revert "clk: meson: clean-up clock registration"
netfilter: nfnetlink_cttimeout: pass default timeout policy to obj_to_nlattr
netfilter: nfnetlink_cttimeout: fetch timeouts for udplite and gre, too
arm64: kaslr: Reserve size of ARM64_MEMSTART_ALIGN in linear region
tty: mark Siemens R3964 line discipline as BROKEN
tty: ldisc: add sysctl to prevent autoloading of ldiscs
hwmon: (w83773g) Select REGMAP_I2C to fix build error
ACPICA: Clear status of GPEs before enabling them
ACPICA: Namespace: remove address node from global list after method termination
ALSA: seq: Fix OOB-reads from strlcpy
ALSA: hda/realtek: Enable headset MIC of Acer TravelMate B114-21 with ALC233
ALSA: hda/realtek - Add quirk for Tuxedo XC 1509
ALSA: hda - Add two more machines to the power_save_blacklist
mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd()
arm64: dts: rockchip: fix rk3328 sdmmc0 write errors
parisc: Detect QEMU earlier in boot process
parisc: regs_return_value() should return gpr28
parisc: also set iaoq_b in instruction_pointer_set()
alarmtimer: Return correct remaining time
drm/i915/gvt: do not deliver a workload if its creation fails
drm/udl: add a release method and delay modeset teardown
kvm: svm: fix potential get_num_contig_pages overflow
include/linux/bitrev.h: fix constant bitrev
mm: writeback: use exact memcg dirty counts
ASoC: intel: Fix crash at suspend/resume after failed codec registration
ASoC: fsl_esai: fix channel swap issue when stream starts
Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
btrfs: prop: fix zstd compression parameter validation
btrfs: prop: fix vanished compression property after failed set
riscv: Fix syscall_get_arguments() and syscall_set_arguments()
block: do not leak memory in bio_copy_user_iov()
block: fix the return errno for direct IO
genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
genirq: Initialize request_mutex if CONFIG_SPARSE_IRQ=n
virtio: Honour 'may_reduce_num' in vring_create_virtqueue
ARM: dts: rockchip: fix rk3288 cpu opp node reference
ARM: dts: am335x-evmsk: Correct the regulators for the audio codec
ARM: dts: am335x-evm: Correct the regulators for the audio codec
ARM: dts: at91: Fix typo in ISC_D0 on PC9
arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value
arm64: dts: rockchip: fix rk3328 rgmii high tx error rate
arm64: backtrace: Don't bother trying to unwind the userspace stack
xen: Prevent buffer overflow in privcmd ioctl
sched/fair: Do not re-read ->h_load_next during hierarchical load calculation
xtensa: fix return_address
x86/asm: Remove dead __GNUC__ conditionals
x86/asm: Use stricter assembly constraints in bitops
x86/perf/amd: Resolve race condition when disabling PMC
x86/perf/amd: Resolve NMI latency issues for active PMCs
x86/perf/amd: Remove need to check "running" bit in NMI handler
PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller
PCI: pciehp: Ignore Link State Changes after powering off a slot
dm integrity: change memcmp to strncmp in dm_integrity_ctr
dm: revert
|
||
Greg Thelen
|
43f47331a4 |
mm: writeback: use exact memcg dirty counts
commit 0b3d6e6f2dd0a7b697b1aa8c167265908940624b upstream.
Since commit
|
||
Aneesh Kumar K.V
|
9a62d69114 |
mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd()
commit c6f3c5ee40c10bb65725047a220570f718507001 upstream. With some architectures like ppc64, set_pmd_at() cannot cope with a situation where there is already some (different) valid entry present. Use pmdp_set_access_flags() instead to modify the pfn which is built to deal with modifying existing PMD entries. This is similar to commit cae85cb8add3 ("mm/memory.c: fix modifying of page protection by insert_pfn()") We also do similar update w.r.t insert_pfn_pud eventhough ppc64 don't support pud pfn entries now. Without this patch we also see the below message in kernel log "BUG: non-zero pgtables_bytes on freeing mm:" Link: http://lkml.kernel.org/r/20190402115125.18803-1-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reported-by: Chandan Rajendra <chandan@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Greg Kroah-Hartman
|
d885da678e |
This is the 4.19.34 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlynu40ACgkQONu9yGCS aT5X6g//Wkfm/+qSZ0GhLDQkPniiH1QkvzhOmVrrxu+KB0qsiwsEl8Srw33ZVkJK LT8+IPGiG9jEGu9dj+BYXTIfy9ZvfSsEL2N6GhYwDSXP0fok2rUaHbZvv1IB2g4W afhGdNwNAUCJ/j1UrUsi+SAFJ+xWbVxFpGstd0cqM9IbKdEV7RIukvuKckHiKOKR qI8FxC+G2PAr+BtnETfk5/suPDJ7B3ZicDoMhiWJGxJ6dfFTVmkSmasSoPDaMiHm 4S3hN2lu+WTeRpRPPB17Dlk4MmIp0k+bGYBKAlaxAMCc/RZxvbT2pRYaMQbId2/L mNUfSnOQFGEAhlAPfb7wdbObphnyT34GhlkWfZBTrnhPO0/FomLOvU6xVdcNuakX Tv2JKfDzb+2ttcMZ+0T84Ru9RztoswFATSw8uFMVxW8oTS6MVWnHu96Kxfl7QO3J PdlIGcyqxSuWNE8OX1QVtdSruGZfwUDNs94S4nQJtkB8BViRwhGJlqaXuy4d9Wp6 fGlI2W6qhjyosi2wBSMTjh/ytk/jq0vfs+z2XjR2gAYssvB/SOLR/AlSVguWsDnf WaoFBkXvCbuPvPlo0TrLpl5RW5WlOtLXHE3Vr3dKp458wLwpf/OZBGoZiknp7DrF PzBZs2ie5tmyqTxbAygl7WkbQPJ682pd5R4nf5CY+zvUaOMZv1g= =Iuup -----END PGP SIGNATURE----- Merge 4.19.34 into android-4.19 Changes in 4.19.34 arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals ext4: cleanup bh release code in ext4_ind_remove_space() tty/serial: atmel: Add is_half_duplex helper tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped CIFS: fix POSIX lock leak and invalid ptr deref h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux- f2fs: fix to adapt small inline xattr space in __find_inline_xattr() f2fs: fix to avoid deadlock in f2fs_read_inline_dir() tracing: kdb: Fix ftdump to not sleep net/mlx5: Avoid panic when setting vport rate net/mlx5: Avoid panic when setting vport mac, getting vport config gpio: gpio-omap: fix level interrupt idling include/linux/relay.h: fix percpu annotation in struct rchan sysctl: handle overflow for file-max net: stmmac: Avoid sometimes uninitialized Clang warnings enic: fix build warning without CONFIG_CPUMASK_OFFSTACK libbpf: force fixdep compilation at the start of the build scsi: hisi_sas: Set PHY linkrate when disconnected scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO iio: adc: fix warning in Qualcomm PM8xxx HK/XOADC driver x86/hyperv: Fix kernel panic when kexec on HyperV perf c2c: Fix c2c report for empty numa node mm/sparse: fix a bad comparison mm/cma.c: cma_declare_contiguous: correct err handling mm/page_ext.c: fix an imbalance with kmemleak mm, swap: bounds check swap_info array accesses to avoid NULL derefs mm,oom: don't kill global init via memory.oom.group memcg: killed threads should not invoke memcg OOM killer mm, mempolicy: fix uninit memory access mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512! mm/slab.c: kmemleak no scan alien caches ocfs2: fix a panic problem caused by o2cb_ctl f2fs: do not use mutex lock in atomic context fs/file.c: initialize init_files.resize_wait page_poison: play nicely with KASAN cifs: use correct format characters dm thin: add sanity checks to thin-pool and external snapshot creation f2fs: fix to check inline_xattr_size boundary correctly cifs: Accept validate negotiate if server return NT_STATUS_NOT_SUPPORTED cifs: Fix NULL pointer dereference of devname netfilter: nf_tables: check the result of dereferencing base_chain->stats netfilter: conntrack: tcp: only close if RST matches exact sequence jbd2: fix invalid descriptor block checksum fs: fix guard_bio_eod to check for real EOD errors tools lib traceevent: Fix buffer overflow in arg_eval PCI/PME: Fix hotplug/sysfs remove deadlock in pcie_pme_remove() wil6210: check null pointer in _wil_cfg80211_merge_extra_ies mt76: fix a leaked reference by adding a missing of_node_put crypto: crypto4xx - add missing of_node_put after of_device_is_available crypto: cavium/zip - fix collision with generic cra_driver_name usb: chipidea: Grab the (legacy) USB PHY by phandle first powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c kbuild: invoke syncconfig if include/config/auto.conf.cmd is missing powerpc/xmon: Fix opcode being uninitialized in print_insn_powerpc coresight: etm4x: Add support to enable ETMv4.2 serial: 8250_pxa: honor the port number from devicetree ARM: 8840/1: use a raw_spinlock_t in unwind iommu/io-pgtable-arm-v7s: Only kmemleak_ignore L2 tables powerpc/hugetlb: Handle mmap_min_addr correctly in get_unmapped_area callback btrfs: qgroup: Make qgroup async transaction commit more aggressive mmc: omap: fix the maximum timeout setting net: dsa: mv88e6xxx: Add lockdep classes to fix false positive splat e1000e: Fix -Wformat-truncation warnings mlxsw: spectrum: Avoid -Wformat-truncation warnings platform/x86: ideapad-laptop: Fix no_hw_rfkill_list for Lenovo RESCUER R720-15IKBN platform/mellanox: mlxreg-hotplug: Fix KASAN warning loop: set GENHD_FL_NO_PART_SCAN after blkdev_reread_part() IB/mlx4: Increase the timeout for CM cache clk: fractional-divider: check parent rate only if flag is set perf annotate: Fix getting source line failure ASoC: qcom: Fix of-node refcount unbalance in qcom_snd_parse_of() cpufreq: acpi-cpufreq: Report if CPU doesn't support boost technologies efi: cper: Fix possible out-of-bounds access s390/ism: ignore some errors during deregistration scsi: megaraid_sas: return error when create DMA pool failed scsi: fcoe: make use of fip_mode enum complete drm/amd/display: Clear stream->mode_changed after commit perf test: Fix failure of 'evsel-tp-sched' test on s390 mwifiex: don't advertise IBSS features without FW support perf report: Don't shadow inlined symbol with different addr range SoC: imx-sgtl5000: add missing put_device() media: ov7740: fix runtime pm initialization media: sh_veu: Correct return type for mem2mem buffer helpers media: s5p-jpeg: Correct return type for mem2mem buffer helpers media: rockchip/rga: Correct return type for mem2mem buffer helpers media: s5p-g2d: Correct return type for mem2mem buffer helpers media: mx2_emmaprp: Correct return type for mem2mem buffer helpers media: mtk-jpeg: Correct return type for mem2mem buffer helpers mt76: usb: do not run mt76u_queues_deinit twice xen/gntdev: Do not destroy context while dma-bufs are in use vfs: fix preadv64v2 and pwritev64v2 compat syscalls with offset == -1 HID: intel-ish-hid: avoid binding wrong ishtp_cl_device cgroup, rstat: Don't flush subtree root unless necessary jbd2: fix race when writing superblock leds: lp55xx: fix null deref on firmware load failure perf report: Add s390 diagnosic sampling descriptor size iwlwifi: pcie: fix emergency path ACPI / video: Refactor and fix dmi_is_desktop() selftests: skip seccomp get_metadata test if not real root kprobes: Prohibit probing on bsearch() kprobes: Prohibit probing on RCU debug routine netfilter: conntrack: fix cloned unconfirmed skb->_nfct race in __nf_conntrack_confirm ARM: 8833/1: Ensure that NEON code always compiles with Clang ARM: dts: meson8b: fix the Ethernet data line signals in eth_rgmii_pins ALSA: PCM: check if ops are defined before suspending PCM ath10k: fix shadow register implementation for WCN3990 usb: f_fs: Avoid crash due to out-of-scope stack ptr access sched/topology: Fix percpu data types in struct sd_data & struct s_data bcache: fix input overflow to cache set sysfs file io_error_halflife bcache: fix input overflow to sequential_cutoff bcache: fix potential div-zero error of writeback_rate_i_term_inverse bcache: improve sysfs_strtoul_clamp() genirq: Avoid summation loops for /proc/stat net: marvell: mvpp2: fix stuck in-band SGMII negotiation iw_cxgb4: fix srqidx leak during connection abort net: phy: consider latched link-down status in polling mode fbdev: fbmem: fix memory access if logo is bigger than the screen cdrom: Fix race condition in cdrom_sysctl_register drm: rcar-du: add missing of_node_put drm/amd/display: Don't re-program planes for DPMS changes drm/amd/display: Disconnect mpcc when changing tg perf/aux: Make perf_event accessible to setup_aux() e1000e: fix cyclic resets at link up with active tx e1000e: Exclude device from suspend direct complete optimization platform/x86: intel_pmc_core: Fix PCH IP sts reading i2c: of: Try to find an I2C adapter matching the parent staging: spi: mt7621: Add return code check on device_reset() iwlwifi: mvm: fix RFH config command with >=10 CPUs ASoC: fsl-asoc-card: fix object reference leaks in fsl_asoc_card_probe sched/debug: Initialize sd_sysctl_cpus if !CONFIG_CPUMASK_OFFSTACK efi/memattr: Don't bail on zero VA if it equals the region's PA sched/core: Use READ_ONCE()/WRITE_ONCE() in move_queued_task()/task_rq_lock() drm/vkms: Bugfix extra vblank frame ARM: dts: lpc32xx: Remove leading 0x and 0s from bindings notation efi/arm/arm64: Allow SetVirtualAddressMap() to be omitted soc: qcom: gsbi: Fix error handling in gsbi_probe() mt7601u: bump supported EEPROM version ARM: 8830/1: NOMMU: Toggle only bits in EXC_RETURN we are really care of ARM: avoid Cortex-A9 livelock on tight dmb loops block, bfq: fix in-service-queue check for queue merging bpf: fix missing prototype warnings selftests/bpf: skip verifier tests for unsupported program types powerpc/64s: Clear on-stack exception marker upon exception return cgroup/pids: turn cgroup_subsys->free() into cgroup_subsys->release() to fix the accounting backlight: pwm_bl: Use gpiod_get_value_cansleep() to get initial state tty: increase the default flip buffer limit to 2*640K powerpc/pseries: Perform full re-add of CPU for topology update post-migration drm/amd/display: Enable vblank interrupt during CRC capture ALSA: dice: add support for Solid State Logic Duende Classic/Mini usb: dwc3: gadget: Fix OTG events when gadget driver isn't loaded platform/x86: intel-hid: Missing power button release on some Dell models perf script python: Use PyBytes for attr in trace-event-python perf script python: Add trace_context extension module to sys.modules media: mt9m111: set initial frame size other than 0x0 hwrng: virtio - Avoid repeated init of completion soc/tegra: fuse: Fix illegal free of IO base address HID: intel-ish: ipc: handle PIMR before ish_wakeup also clear PISR busy_clear bit f2fs: UBSAN: set boolean value iostat_enable correctly hpet: Fix missing '=' character in the __setup() code of hpet_mmap_enable cpu/hotplug: Mute hotplug lockdep during init dmaengine: imx-dma: fix warning comparison of distinct pointer types dmaengine: qcom_hidma: assign channel cookie correctly dmaengine: qcom_hidma: initialize tx flags in hidma_prep_dma_* netfilter: physdev: relax br_netfilter dependency media: rcar-vin: Allow independent VIN link enablement media: s5p-jpeg: Check for fmt_ver_flag when doing fmt enumeration regulator: act8865: Fix act8600_sudcdc_voltage_ranges setting pinctrl: meson: meson8b: add the eth_rxd2 and eth_rxd3 pins drm: Auto-set allow_fb_modifiers when given modifiers at plane init drm/nouveau: Stop using drm_crtc_force_disable x86/build: Specify elf_i386 linker emulation explicitly for i386 objects selinux: do not override context on context mounts brcmfmac: Use firmware_request_nowarn for the clm_blob wlcore: Fix memory leak in case wl12xx_fetch_firmware failure x86/build: Mark per-CPU symbols as absolute explicitly for LLD drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup clk: meson: clean-up clock registration clk: rockchip: fix frac settings of GPLL clock for rk3328 dmaengine: tegra: avoid overflow of byte tracking Input: soc_button_array - fix mapping of the 5th GPIO in a PNP0C40 device drm/dp/mst: Configure no_stop_bit correctly for remote i2c xfers net: stmmac: Avoid one more sometimes uninitialized Clang warning ACPI / video: Extend chassis-type detection with a "Lunch Box" check bcache: fix potential div-zero error of writeback_rate_p_term_inverse kprobes/x86: Blacklist non-attachable interrupt functions Linux 4.19.34 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Qian Cai
|
a6c56bf63e |
page_poison: play nicely with KASAN
[ Upstream commit 4117992df66a26fa33908b4969e04801534baab1 ] KASAN does not play well with the page poisoning (CONFIG_PAGE_POISONING). It triggers false positives in the allocation path: BUG: KASAN: use-after-free in memchr_inv+0x2ea/0x330 Read of size 8 at addr ffff88881f800000 by task swapper/0 CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc1+ #54 Call Trace: dump_stack+0xe0/0x19a print_address_description.cold.2+0x9/0x28b kasan_report.cold.3+0x7a/0xb5 __asan_report_load8_noabort+0x19/0x20 memchr_inv+0x2ea/0x330 kernel_poison_pages+0x103/0x3d5 get_page_from_freelist+0x15e7/0x4d90 because KASAN has not yet unpoisoned the shadow page for allocation before it checks memchr_inv() but only found a stale poison pattern. Also, false positives in free path, BUG: KASAN: slab-out-of-bounds in kernel_poison_pages+0x29e/0x3d5 Write of size 4096 at addr ffff8888112cc000 by task swapper/0/1 CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc1+ #55 Call Trace: dump_stack+0xe0/0x19a print_address_description.cold.2+0x9/0x28b kasan_report.cold.3+0x7a/0xb5 check_memory_region+0x22d/0x250 memset+0x28/0x40 kernel_poison_pages+0x29e/0x3d5 __free_pages_ok+0x75f/0x13e0 due to KASAN adds poisoned redzones around slab objects, but the page poisoning needs to poison the whole page. Link: http://lkml.kernel.org/r/20190114233405.67843-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Qian Cai
|
f09c424cea |
mm/slab.c: kmemleak no scan alien caches
[ Upstream commit 92d1d07daad65c300c7d0b68bbef8867e9895d54 ]
Kmemleak throws endless warnings during boot due to in
__alloc_alien_cache(),
alc = kmalloc_node(memsize, gfp, node);
init_arraycache(&alc->ac, entries, batch);
kmemleak_no_scan(ac);
Kmemleak does not track the array cache (alc->ac) but the alien cache
(alc) instead, so let it track the latter by lifting kmemleak_no_scan()
out of init_arraycache().
There is another place that calls init_arraycache(), but
alloc_kmem_cache_cpus() uses the percpu allocation where will never be
considered as a leak.
kmemleak: Found object by alias at 0xffff8007b9aa7e38
CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #2
Call trace:
dump_backtrace+0x0/0x168
show_stack+0x24/0x30
dump_stack+0x88/0xb0
lookup_object+0x84/0xac
find_and_get_object+0x84/0xe4
kmemleak_no_scan+0x74/0xf4
setup_kmem_cache_node+0x2b4/0x35c
__do_tune_cpucache+0x250/0x2d4
do_tune_cpucache+0x4c/0xe4
enable_cpucache+0xc8/0x110
setup_cpu_cache+0x40/0x1b8
__kmem_cache_create+0x240/0x358
create_cache+0xc0/0x198
kmem_cache_create_usercopy+0x158/0x20c
kmem_cache_create+0x50/0x64
fsnotify_init+0x58/0x6c
do_one_initcall+0x194/0x388
kernel_init_freeable+0x668/0x688
kernel_init+0x18/0x124
ret_from_fork+0x10/0x18
kmemleak: Object 0xffff8007b9aa7e00 (size 256):
kmemleak: comm "swapper/0", pid 1, jiffies 4294697137
kmemleak: min_count = 1
kmemleak: count = 0
kmemleak: flags = 0x1
kmemleak: checksum = 0
kmemleak: backtrace:
kmemleak_alloc+0x84/0xb8
kmem_cache_alloc_node_trace+0x31c/0x3a0
__kmalloc_node+0x58/0x78
setup_kmem_cache_node+0x26c/0x35c
__do_tune_cpucache+0x250/0x2d4
do_tune_cpucache+0x4c/0xe4
enable_cpucache+0xc8/0x110
setup_cpu_cache+0x40/0x1b8
__kmem_cache_create+0x240/0x358
create_cache+0xc0/0x198
kmem_cache_create_usercopy+0x158/0x20c
kmem_cache_create+0x50/0x64
fsnotify_init+0x58/0x6c
do_one_initcall+0x194/0x388
kernel_init_freeable+0x668/0x688
kernel_init+0x18/0x124
kmemleak: Not scanning unknown object at 0xffff8007b9aa7e38
CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #2
Call trace:
dump_backtrace+0x0/0x168
show_stack+0x24/0x30
dump_stack+0x88/0xb0
kmemleak_no_scan+0x90/0xf4
setup_kmem_cache_node+0x2b4/0x35c
__do_tune_cpucache+0x250/0x2d4
do_tune_cpucache+0x4c/0xe4
enable_cpucache+0xc8/0x110
setup_cpu_cache+0x40/0x1b8
__kmem_cache_create+0x240/0x358
create_cache+0xc0/0x198
kmem_cache_create_usercopy+0x158/0x20c
kmem_cache_create+0x50/0x64
fsnotify_init+0x58/0x6c
do_one_initcall+0x194/0x388
kernel_init_freeable+0x668/0x688
kernel_init+0x18/0x124
ret_from_fork+0x10/0x18
Link: http://lkml.kernel.org/r/20190129184518.39808-1-cai@lca.pw
Fixes:
|
||
Uladzislau Rezki (Sony)
|
8a0fc62e33 |
mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512!
[ Upstream commit afd07389d3f4933c7f7817a92fb5e053d59a3182 ] One of the vmalloc stress test case triggers the kernel BUG(): <snip> [60.562151] ------------[ cut here ]------------ [60.562154] kernel BUG at mm/vmalloc.c:512! [60.562206] invalid opcode: 0000 [#1] PREEMPT SMP PTI [60.562247] CPU: 0 PID: 430 Comm: vmalloc_test/0 Not tainted 4.20.0+ #161 [60.562293] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [60.562351] RIP: 0010:alloc_vmap_area+0x36f/0x390 <snip> it can happen due to big align request resulting in overflowing of calculated address, i.e. it becomes 0 after ALIGN()'s fixup. Fix it by checking if calculated address is within vstart/vend range. Link: http://lkml.kernel.org/r/20190124115648.9433-2-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joel Fernandes <joelaf@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Vlastimil Babka
|
67abbb9c54 |
mm, mempolicy: fix uninit memory access
[ Upstream commit 2e25644e8da4ed3a27e7b8315aaae74660be72dc ] Syzbot with KMSAN reports (excerpt): ================================================================== BUG: KMSAN: uninit-value in mpol_rebind_policy mm/mempolicy.c:353 [inline] BUG: KMSAN: uninit-value in mpol_rebind_mm+0x249/0x370 mm/mempolicy.c:384 CPU: 1 PID: 17420 Comm: syz-executor4 Not tainted 4.20.0-rc7+ #15 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:295 mpol_rebind_policy mm/mempolicy.c:353 [inline] mpol_rebind_mm+0x249/0x370 mm/mempolicy.c:384 update_tasks_nodemask+0x608/0xca0 kernel/cgroup/cpuset.c:1120 update_nodemasks_hier kernel/cgroup/cpuset.c:1185 [inline] update_nodemask kernel/cgroup/cpuset.c:1253 [inline] cpuset_write_resmask+0x2a98/0x34b0 kernel/cgroup/cpuset.c:1728 ... Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline] kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158 kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176 kmem_cache_alloc+0x572/0xb90 mm/slub.c:2777 mpol_new mm/mempolicy.c:276 [inline] do_mbind mm/mempolicy.c:1180 [inline] kernel_mbind+0x8a7/0x31a0 mm/mempolicy.c:1347 __do_sys_mbind mm/mempolicy.c:1354 [inline] As it's difficult to report where exactly the uninit value resides in the mempolicy object, we have to guess a bit. mm/mempolicy.c:353 contains this part of mpol_rebind_policy(): if (!mpol_store_user_nodemask(pol) && nodes_equal(pol->w.cpuset_mems_allowed, *newmask)) "mpol_store_user_nodemask(pol)" is testing pol->flags, which I couldn't ever see being uninitialized after leaving mpol_new(). So I'll guess it's actually about accessing pol->w.cpuset_mems_allowed on line 354, but still part of statement starting on line 353. For w.cpuset_mems_allowed to be not initialized, and the nodes_equal() reachable for a mempolicy where mpol_set_nodemask() is called in do_mbind(), it seems the only possibility is a MPOL_PREFERRED policy with empty set of nodes, i.e. MPOL_LOCAL equivalent, with MPOL_F_LOCAL flag. Let's exclude such policies from the nodes_equal() check. Note the uninit access should be benign anyway, as rebinding this kind of policy is always a no-op. Therefore no actual need for stable inclusion. Link: http://lkml.kernel.org/r/a71997c3-e8ae-a787-d5ce-3db05768b27c@suse.cz Link: http://lkml.kernel.org/r/73da3e9c-cc84-509e-17d9-0c434bb9967d@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: syzbot+b19c2dc2c990ea657a71@syzkaller.appspotmail.com Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Yisheng Xie <xieyisheng1@huawei.com> Cc: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Tetsuo Handa
|
9d785b92cf |
memcg: killed threads should not invoke memcg OOM killer
[ Upstream commit 7775face207922ea62a4e96b9cd45abfdc7b9840 ] If a memory cgroup contains a single process with many threads (including different process group sharing the mm) then it is possible to trigger a race when the oom killer complains that there are no oom elible tasks and complain into the log which is both annoying and confusing because there is no actual problem. The race looks as follows: P1 oom_reaper P2 try_charge try_charge mem_cgroup_out_of_memory mutex_lock(oom_lock) out_of_memory oom_kill_process(P1,P2) wake_oom_reaper mutex_unlock(oom_lock) oom_reap_task mutex_lock(oom_lock) select_bad_process # no victim The problem is more visible with many threads. Fix this by checking for fatal_signal_pending from mem_cgroup_out_of_memory when the oom_lock is already held. The oom bypass is safe because we do the same early in the try_charge path already. The situation migh have changed in the mean time. It should be safe to check for fatal_signal_pending and tsk_is_oom_victim but for a better code readability abstract the current charge bypass condition into should_force_charge and reuse it from that path. " Link: http://lkml.kernel.org/r/01370f70-e1f6-ebe4-b95e-0df21a0bc15e@i-love.sakura.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Tetsuo Handa
|
eed3ca0a66 |
mm,oom: don't kill global init via memory.oom.group
[ Upstream commit d342a0b38674867ea67fde47b0e1e60ffe9f17a2 ]
Since setting global init process to some memory cgroup is technically
possible, oom_kill_memcg_member() must check it.
Tasks in /test1 are going to be killed due to memory.oom.group set
Memory cgroup out of memory: Killed process 1 (systemd) total-vm:43400kB, anon-rss:1228kB, file-rss:3992kB, shmem-rss:0kB
oom_reaper: reaped process 1 (systemd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000008b
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
static char buffer[10485760];
static int pipe_fd[2] = { EOF, EOF };
unsigned int i;
int fd;
char buf[64] = { };
if (pipe(pipe_fd))
return 1;
if (chdir("/sys/fs/cgroup/"))
return 1;
fd = open("cgroup.subtree_control", O_WRONLY);
write(fd, "+memory", 7);
close(fd);
mkdir("test1", 0755);
fd = open("test1/memory.oom.group", O_WRONLY);
write(fd, "1", 1);
close(fd);
fd = open("test1/cgroup.procs", O_WRONLY);
write(fd, "1", 1);
snprintf(buf, sizeof(buf) - 1, "%d", getpid());
write(fd, buf, strlen(buf));
close(fd);
snprintf(buf, sizeof(buf) - 1, "%lu", sizeof(buffer) * 5);
fd = open("test1/memory.max", O_WRONLY);
write(fd, buf, strlen(buf));
close(fd);
for (i = 0; i < 10; i++)
if (fork() == 0) {
char c;
close(pipe_fd[1]);
read(pipe_fd[0], &c, 1);
memset(buffer, 0, sizeof(buffer));
sleep(3);
_exit(0);
}
close(pipe_fd[0]);
close(pipe_fd[1]);
sleep(3);
return 0;
}
[ 37.052923][ T9185] a.out invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[ 37.056169][ T9185] CPU: 4 PID: 9185 Comm: a.out Kdump: loaded Not tainted 5.0.0-rc4-next-20190131 #280
[ 37.059205][ T9185] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[ 37.062954][ T9185] Call Trace:
[ 37.063976][ T9185] dump_stack+0x67/0x95
[ 37.065263][ T9185] dump_header+0x51/0x570
[ 37.066619][ T9185] ? trace_hardirqs_on+0x3f/0x110
[ 37.068171][ T9185] ? _raw_spin_unlock_irqrestore+0x3d/0x70
[ 37.069967][ T9185] oom_kill_process+0x18d/0x210
[ 37.071515][ T9185] out_of_memory+0x11b/0x380
[ 37.072936][ T9185] mem_cgroup_out_of_memory+0xb6/0xd0
[ 37.074601][ T9185] try_charge+0x790/0x820
[ 37.076021][ T9185] mem_cgroup_try_charge+0x42/0x1d0
[ 37.077629][ T9185] mem_cgroup_try_charge_delay+0x11/0x30
[ 37.079370][ T9185] do_anonymous_page+0x105/0x5e0
[ 37.080939][ T9185] __handle_mm_fault+0x9cb/0x1070
[ 37.082485][ T9185] handle_mm_fault+0x1b2/0x3a0
[ 37.083819][ T9185] ? handle_mm_fault+0x47/0x3a0
[ 37.085181][ T9185] __do_page_fault+0x255/0x4c0
[ 37.086529][ T9185] do_page_fault+0x28/0x260
[ 37.087788][ T9185] ? page_fault+0x8/0x30
[ 37.088978][ T9185] page_fault+0x1e/0x30
[ 37.090142][ T9185] RIP: 0033:0x7f8b183aefe0
[ 37.091433][ T9185] Code: 20 f3 44 0f 7f 44 17 d0 f3 44 0f 7f 47 30 f3 44 0f 7f 44 17 c0 48 01 fa 48 83 e2 c0 48 39 d1 74 a3 66 0f 1f 84 00 00 00 00 00 <66> 44 0f 7f 01 66 44 0f 7f 41 10 66 44 0f 7f 41 20 66 44 0f 7f 41
[ 37.096917][ T9185] RSP: 002b:00007fffc5d329e8 EFLAGS: 00010206
[ 37.098615][ T9185] RAX: 00000000006010e0 RBX: 0000000000000008 RCX: 0000000000c30000
[ 37.100905][ T9185] RDX: 00000000010010c0 RSI: 0000000000000000 RDI: 00000000006010e0
[ 37.103349][ T9185] RBP: 0000000000000000 R08: 00007f8b188f4740 R09: 0000000000000000
[ 37.105797][ T9185] R10: 00007fffc5d32420 R11: 00007f8b183aef40 R12: 0000000000000005
[ 37.108228][ T9185] R13: 0000000000000000 R14: ffffffffffffffff R15: 0000000000000000
[ 37.110840][ T9185] memory: usage 51200kB, limit 51200kB, failcnt 125
[ 37.113045][ T9185] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
[ 37.115808][ T9185] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
[ 37.117660][ T9185] Memory cgroup stats for /test1: cache:0KB rss:49484KB rss_huge:30720KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:49700KB inactive_file:0KB active_file:0KB unevictable:0KB
[ 37.123371][ T9185] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/test1,task_memcg=/test1,task=a.out,pid=9188,uid=0
[ 37.128158][ T9185] Memory cgroup out of memory: Killed process 9188 (a.out) total-vm:14456kB, anon-rss:10324kB, file-rss:504kB, shmem-rss:0kB
[ 37.132710][ T9185] Tasks in /test1 are going to be killed due to memory.oom.group set
[ 37.132833][ T54] oom_reaper: reaped process 9188 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 37.135498][ T9185] Memory cgroup out of memory: Killed process 1 (systemd) total-vm:43400kB, anon-rss:1228kB, file-rss:3992kB, shmem-rss:0kB
[ 37.143434][ T9185] Memory cgroup out of memory: Killed process 9182 (a.out) total-vm:14456kB, anon-rss:76kB, file-rss:588kB, shmem-rss:0kB
[ 37.144328][ T54] oom_reaper: reaped process 1 (systemd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 37.147585][ T9185] Memory cgroup out of memory: Killed process 9183 (a.out) total-vm:14456kB, anon-rss:6228kB, file-rss:512kB, shmem-rss:0kB
[ 37.157222][ T9185] Memory cgroup out of memory: Killed process 9184 (a.out) total-vm:14456kB, anon-rss:6228kB, file-rss:508kB, shmem-rss:0kB
[ 37.157259][ T9185] Memory cgroup out of memory: Killed process 9185 (a.out) total-vm:14456kB, anon-rss:6228kB, file-rss:512kB, shmem-rss:0kB
[ 37.157291][ T9185] Memory cgroup out of memory: Killed process 9186 (a.out) total-vm:14456kB, anon-rss:4180kB, file-rss:508kB, shmem-rss:0kB
[ 37.157306][ T54] oom_reaper: reaped process 9183 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 37.157328][ T9185] Memory cgroup out of memory: Killed process 9187 (a.out) total-vm:14456kB, anon-rss:4180kB, file-rss:512kB, shmem-rss:0kB
[ 37.157452][ T9185] Memory cgroup out of memory: Killed process 9189 (a.out) total-vm:14456kB, anon-rss:6228kB, file-rss:512kB, shmem-rss:0kB
[ 37.158733][ T9185] Memory cgroup out of memory: Killed process 9190 (a.out) total-vm:14456kB, anon-rss:552kB, file-rss:512kB, shmem-rss:0kB
[ 37.160083][ T54] oom_reaper: reaped process 9186 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 37.160187][ T54] oom_reaper: reaped process 9189 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 37.206941][ T54] oom_reaper: reaped process 9185 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 37.212300][ T9185] Memory cgroup out of memory: Killed process 9191 (a.out) total-vm:14456kB, anon-rss:4180kB, file-rss:512kB, shmem-rss:0kB
[ 37.212317][ T54] oom_reaper: reaped process 9190 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 37.218860][ T9185] Memory cgroup out of memory: Killed process 9192 (a.out) total-vm:14456kB, anon-rss:1080kB, file-rss:512kB, shmem-rss:0kB
[ 37.227667][ T54] oom_reaper: reaped process 9192 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 37.292323][ T9193] abrt-hook-ccpp (9193) used greatest stack depth: 10480 bytes left
[ 37.351843][ T1] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000008b
[ 37.354833][ T1] CPU: 7 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.0.0-rc4-next-20190131 #280
[ 37.357876][ T1] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[ 37.361685][ T1] Call Trace:
[ 37.363239][ T1] dump_stack+0x67/0x95
[ 37.365010][ T1] panic+0xfc/0x2b0
[ 37.366853][ T1] do_exit+0xd55/0xd60
[ 37.368595][ T1] do_group_exit+0x47/0xc0
[ 37.370415][ T1] get_signal+0x32a/0x920
[ 37.372449][ T1] ? _raw_spin_unlock_irqrestore+0x3d/0x70
[ 37.374596][ T1] do_signal+0x32/0x6e0
[ 37.376430][ T1] ? exit_to_usermode_loop+0x26/0x9b
[ 37.378418][ T1] ? prepare_exit_to_usermode+0xa8/0xd0
[ 37.380571][ T1] exit_to_usermode_loop+0x3e/0x9b
[ 37.382588][ T1] prepare_exit_to_usermode+0xa8/0xd0
[ 37.384594][ T1] ? page_fault+0x8/0x30
[ 37.386453][ T1] retint_user+0x8/0x18
[ 37.388160][ T1] RIP: 0033:0x7f42c06974a8
[ 37.389922][ T1] Code: Bad RIP value.
[ 37.391788][ T1] RSP: 002b:00007ffc3effd388 EFLAGS: 00010213
[ 37.394075][ T1] RAX: 000000000000000e RBX: 00007ffc3effd390 RCX: 0000000000000000
[ 37.396963][ T1] RDX: 000000000000002a RSI: 00007ffc3effd390 RDI: 0000000000000004
[ 37.399550][ T1] RBP: 00007ffc3effd680 R08: 0000000000000000 R09: 0000000000000000
[ 37.402334][ T1] R10: 00000000ffffffff R11: 0000000000000246 R12: 0000000000000001
[ 37.404890][ T1] R13: ffffffffffffffff R14: 0000000000000884 R15: 000056460b1ac3b0
Link: http://lkml.kernel.org/r/201902010336.x113a4EO027170@www262.sakura.ne.jp
Fixes:
|
||
Daniel Jordan
|
ed3345a660 |
mm, swap: bounds check swap_info array accesses to avoid NULL derefs
[ Upstream commit c10d38cc8d3e43f946b6c2bf4602c86791587f30 ]
Dan Carpenter reports a potential NULL dereference in
get_swap_page_of_type:
Smatch complains that the NULL checks on "si" aren't consistent. This
seems like a real bug because we have not ensured that the type is
valid and so "si" can be NULL.
Add the missing check for NULL, taking care to use a read barrier to
ensure CPU1 observes CPU0's updates in the correct order:
CPU0 CPU1
alloc_swap_info() if (type >= nr_swapfiles)
swap_info[type] = p /* handle invalid entry */
smp_wmb() smp_rmb()
++nr_swapfiles p = swap_info[type]
Without smp_rmb, CPU1 might observe CPU0's write to nr_swapfiles before
CPU0's write to swap_info[type] and read NULL from swap_info[type].
Ying Huang noticed other places in swapfile.c don't order these reads
properly. Introduce swap_type_to_swap_info to encourage correct usage.
Use READ_ONCE and WRITE_ONCE to follow the Linux Kernel Memory Model
(see tools/memory-model/Documentation/explanation.txt).
This ordering need not be enforced in places where swap_lock is held
(e.g. si_swapinfo) because swap_lock serializes updates to nr_swapfiles
and the swap_info array.
Link: http://lkml.kernel.org/r/20190131024410.29859-1-daniel.m.jordan@oracle.com
Fixes:
|
||
Qian Cai
|
4c6d7dc741 |
mm/page_ext.c: fix an imbalance with kmemleak
[ Upstream commit 0c81585499601acd1d0e1cbf424cabfaee60628c ] After offlining a memory block, kmemleak scan will trigger a crash, as it encounters a page ext address that has already been freed during memory offlining. At the beginning in alloc_page_ext(), it calls kmemleak_alloc(), but it does not call kmemleak_free() in free_page_ext(). BUG: unable to handle kernel paging request at ffff888453d00000 PGD 128a01067 P4D 128a01067 PUD 128a04067 PMD 47e09e067 PTE 800ffffbac2ff060 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI CPU: 1 PID: 1594 Comm: bash Not tainted 5.0.0-rc8+ #15 Hardware name: HP ProLiant DL180 Gen9/ProLiant DL180 Gen9, BIOS U20 10/25/2017 RIP: 0010:scan_block+0xb5/0x290 Code: 85 6e 01 00 00 48 b8 00 00 30 f5 81 88 ff ff 48 39 c3 0f 84 5b 01 00 00 48 89 d8 48 c1 e8 03 42 80 3c 20 00 0f 85 87 01 00 00 <4c> 8b 3b e8 f3 0c fa ff 4c 39 3d 0c 6b 4c 01 0f 87 08 01 00 00 4c RSP: 0018:ffff8881ec57f8e0 EFLAGS: 00010082 RAX: 0000000000000000 RBX: ffff888453d00000 RCX: ffffffffa61e5a54 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff888453d00000 RBP: ffff8881ec57f920 R08: fffffbfff4ed588d R09: fffffbfff4ed588c R10: fffffbfff4ed588c R11: ffffffffa76ac463 R12: dffffc0000000000 R13: ffff888453d00ff9 R14: ffff8881f80cef48 R15: ffff8881f80cef48 FS: 00007f6c0e3f8740(0000) GS:ffff8881f7680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff888453d00000 CR3: 00000001c4244003 CR4: 00000000001606a0 Call Trace: scan_gray_list+0x269/0x430 kmemleak_scan+0x5a8/0x10f0 kmemleak_write+0x541/0x6ca full_proxy_write+0xf8/0x190 __vfs_write+0xeb/0x980 vfs_write+0x15a/0x4f0 ksys_write+0xd2/0x1b0 __x64_sys_write+0x73/0xb0 do_syscall_64+0xeb/0xaaa entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f6c0dad73b8 Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 63 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55 RSP: 002b:00007ffd5b863cb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f6c0dad73b8 RDX: 0000000000000005 RSI: 000055a9216e1710 RDI: 0000000000000001 RBP: 000055a9216e1710 R08: 000000000000000a R09: 00007ffd5b863840 R10: 000000000000000a R11: 0000000000000246 R12: 00007f6c0dda9780 R13: 0000000000000005 R14: 00007f6c0dda4740 R15: 0000000000000005 Modules linked in: nls_iso8859_1 nls_cp437 vfat fat kvm_intel kvm irqbypass efivars ip_tables x_tables xfs sd_mod ahci libahci igb i2c_algo_bit libata i2c_core dm_mirror dm_region_hash dm_log dm_mod efivarfs CR2: ffff888453d00000 ---[ end trace ccf646c7456717c5 ]--- Kernel panic - not syncing: Fatal exception Shutting down cpus with NMI Kernel Offset: 0x24c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception ]--- Link: http://lkml.kernel.org/r/20190227173147.75650-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Peng Fan
|
f555b008c5 |
mm/cma.c: cma_declare_contiguous: correct err handling
[ Upstream commit 0d3bd18a5efd66097ef58622b898d3139790aa9d ] In case cma_init_reserved_mem failed, need to free the memblock allocated by memblock_reserve or memblock_alloc_range. Quote Catalin's comments: https://lkml.org/lkml/2019/2/26/482 Kmemleak is supposed to work with the memblock_{alloc,free} pair and it ignores the memblock_reserve() as a memblock_alloc() implementation detail. It is, however, tolerant to memblock_free() being called on a sub-range or just a different range from a previous memblock_alloc(). So the original patch looks fine to me. FWIW: Link: http://lkml.kernel.org/r/20190227144631.16708-1-peng.fan@nxp.com Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Qian Cai
|
7b287c47e4 |
mm/sparse: fix a bad comparison
[ Upstream commit d778015ac95bc036af73342c878ab19250e01fe1 ]
next_present_section_nr() could only return an unsigned number -1, so
just check it specifically where compilers will convert -1 to unsigned
if needed.
mm/sparse.c: In function 'sparse_init_nid':
mm/sparse.c:200:20: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
((section_nr >= 0) && \
^~
mm/sparse.c:478:2: note: in expansion of macro
'for_each_present_section_nr'
for_each_present_section_nr(pnum_begin, pnum) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/sparse.c:200:20: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
((section_nr >= 0) && \
^~
mm/sparse.c:497:2: note: in expansion of macro
'for_each_present_section_nr'
for_each_present_section_nr(pnum_begin, pnum) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/sparse.c: In function 'sparse_init':
mm/sparse.c:200:20: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
((section_nr >= 0) && \
^~
mm/sparse.c:520:2: note: in expansion of macro
'for_each_present_section_nr'
for_each_present_section_nr(pnum_begin + 1, pnum_end) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~
Link: http://lkml.kernel.org/r/20190228181839.86504-1-cai@lca.pw
Fixes:
|
||
Greg Kroah-Hartman
|
0b065cd568 |
This is the 4.19.33 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlykNfcACgkQONu9yGCS aT40dRAAiCeYjEC1zH8dkAnbFlKo6IZuhKgISfVTgrWlRe9nUTYaenBXqAfGjufH EzXHrD1IANRAnFfWg8xt01TNBfTaiEYnYzFmJkWAHFWGKxa5fRU5Kan0MB97r8s9 NjoSRsnFl8l2oJI88zwFa7k89Itop9ST/zvZIgnrysAr+j8yEZb7BZWaU2UrKK/q qfnJxjfCb/jeqAxwVh3OkasXj0gG2JkGR/uEGTw2EARuI6pvKo5OCzYz0tXTN6ZJ CSzM4X7dhkGSgLIUw3JOCB28riK9TYbOdPr4MFYMrnoU5VL8+n62tXoKXewobJ1C 2+Pmg5E54r13Rr35eoGCiHsW2LGQrOyvm8S9TFB/0SmTPtzUjFlHBC62Vs/AW5ut HSmwVy+ILM/xTIdts5QT58Gw+5NCmHxw2oEdrgcct+6FtnR9XPOqZYyzH1tw1ZB+ DL2PqYyT9czuo2bKanWA37M8339q2INDFskXqKRokQ9GNiqUx1E6fNBqtK5SqyDI CdohtAs7xSQoPPbKDiITOmt82MM8xvefKmqTIvHN5B7Ns4lT1QC74DCXEFEKtw6M l+p64h6Qw4DiKmqna7fsbKPjmg8pg1lVrCOwD5iUF3JXy+Wxi/OKnr2gqAVMChAq GSfFFf+MMZhYzJPRQKxOn4GwDBDz5niaWvQmlPcPWLJ5jdbzl+Y= =Yyc/ -----END PGP SIGNATURE----- Merge 4.19.33 into android-4.19 Changes in 4.19.33 Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer ipmi_si: Fix crash when using hard-coded device dccp: do not use ipv6 header for ipv4 flow genetlink: Fix a memory leak on error path gtp: change NET_UDP_TUNNEL dependency to select ipv6: make ip6_create_rt_rcu return ip6_null_entry instead of NULL mac8390: Fix mmio access size probe mISDN: hfcpci: Test both vendor & device ID for Digium HFC4S net: aquantia: fix rx checksum offload for UDP/TCP over IPv6 net: datagram: fix unbounded loop in __skb_try_recv_datagram() net/packet: Set __GFP_NOWARN upon allocation in alloc_pg_vec net: phy: meson-gxl: fix interrupt support net: rose: fix a possible stack overflow net: stmmac: fix memory corruption with large MTUs net-sysfs: call dev_hold if kobject_init_and_add success packets: Always register packet sk in the same order rhashtable: Still do rehash when we get EEXIST sctp: get sctphdr by offset in sctp_compute_cksum sctp: use memdup_user instead of vmemdup_user tcp: do not use ipv6 header for ipv4 flow tipc: allow service ranges to be connect()'ed on RDM/DGRAM tipc: change to check tipc_own_id to return in tipc_net_stop tipc: fix cancellation of topology subscriptions tun: properly test for IFF_UP vrf: prevent adding upper devices vxlan: Don't call gro_cells_destroy() before device is unregistered ila: Fix rhashtable walker list corruption net: sched: fix cleanup NULL pointer exception in act_mirr thunderx: enable page recycling for non-XDP case thunderx: eliminate extra calls to put_page() for pages held for recycling tun: add a missing rcu_read_unlock() in error path powerpc/fsl: Add infrastructure to fixup branch predictor flush powerpc/fsl: Add macro to flush the branch predictor powerpc/fsl: Emulate SPRN_BUCSR register powerpc/fsl: Add nospectre_v2 command line argument powerpc/fsl: Flush the branch predictor at each kernel entry (64bit) powerpc/fsl: Flush the branch predictor at each kernel entry (32 bit) powerpc/fsl: Flush branch predictor when entering KVM powerpc/fsl: Enable runtime patching if nospectre_v2 boot arg is used powerpc/fsl: Update Spectre v2 reporting powerpc/fsl: Fixed warning: orphan section `__btb_flush_fixup' powerpc/fsl: Fix the flush of branch predictor. powerpc/security: Fix spectre_v2 reporting Btrfs: fix incorrect file size after shrinking truncate and fsync btrfs: remove WARN_ON in log_dir_items btrfs: don't report readahead errors and don't update statistics btrfs: raid56: properly unmap parity page in finish_parity_scrub() btrfs: Avoid possible qgroup_rsv_size overflow in btrfs_calculate_inode_block_rsv_size Btrfs: fix assertion failure on fsync with NO_HOLES enabled ARM: imx6q: cpuidle: fix bug that CPU might not wake up at expected time powerpc: bpf: Fix generation of load/store DW instructions vfio: ccw: only free cp on final interrupt NFS: fix mount/umount race in nlmclnt. NFSv4.1 don't free interrupted slot on open net: dsa: qca8k: remove leftover phy accessors ALSA: rawmidi: Fix potential Spectre v1 vulnerability ALSA: seq: oss: Fix Spectre v1 vulnerability ALSA: pcm: Fix possible OOB access in PCM oss plugins ALSA: pcm: Don't suspend stream in unrecoverable PCM state ALSA: hda/realtek - Add support headset mode for DELL WYSE AIO ALSA: hda/realtek - Add support headset mode for New DELL WYSE NB ALSA: hda/realtek: Enable headset MIC of Acer AIO with ALC286 ALSA: hda/realtek: Enable headset MIC of Acer Aspire Z24-890 with ALC286 ALSA: hda/realtek - Add support for Acer Aspire E5-523G/ES1-432 headset mic ALSA: hda/realtek: Enable ASUS X441MB and X705FD headset MIC with ALC256 ALSA: hda/realtek: Enable headset mic of ASUS P5440FF with ALC256 ALSA: hda/realtek: Enable headset MIC of ASUS X430UN and X512DK with ALC256 ALSA: hda/realtek - Fix speakers on Acer Predator Helios 500 Ryzen laptops kbuild: modversions: Fix relative CRC byte order interpretation fs/open.c: allow opening only regular files during execve() ocfs2: fix inode bh swapping mixup in ocfs2_reflink_inodes_lock scsi: sd: Fix a race between closing an sd device and sd I/O scsi: sd: Quiesce warning if device does not report optimal I/O size scsi: zfcp: fix rport unblock if deleted SCSI devices on Scsi_Host scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices drm/rockchip: vop: reset scale mode when win is disabled tty: mxs-auart: fix a potential NULL pointer dereference tty: atmel_serial: fix a potential NULL pointer dereference tty: serial: qcom_geni_serial: Initialize baud in qcom_geni_console_setup staging: comedi: ni_mio_common: Fix divide-by-zero for DIO cmdtest staging: speakup_soft: Fix alternate speech with other synths staging: vt6655: Remove vif check from vnt_interrupt staging: vt6655: Fix interrupt race condition on device start up. staging: erofs: fix to handle error path of erofs_vmap() serial: max310x: Fix to avoid potential NULL pointer dereference serial: mvebu-uart: Fix to avoid a potential NULL pointer dereference serial: sh-sci: Fix setting SCSCR_TIE while transferring data USB: serial: cp210x: add new device id USB: serial: ftdi_sio: add additional NovaTech products USB: serial: mos7720: fix mos_parport refcount imbalance on error path USB: serial: option: set driver_info for SIM5218 and compatibles USB: serial: option: add support for Quectel EM12 USB: serial: option: add Olicard 600 Disable kgdboc failed by echo space to /sys/module/kgdboc/parameters/kgdboc fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links drm/vgem: fix use-after-free when drm_gem_handle_create() fails drm/vkms: fix use-after-free when drm_gem_handle_create() fails drm/i915/gvt: Fix MI_FLUSH_DW parsing with correct index check gpio: exar: add a check for the return value of ida_simple_get fails gpio: adnp: Fix testing wrong value in adnp_gpio_direction_input phy: sun4i-usb: Support set_mode to USB_HOST for non-OTG PHYs usb: mtu3: fix EXTCON dependency USB: gadget: f_hid: fix deadlock in f_hidg_write() usb: common: Consider only available nodes for dr_mode usb: host: xhci-rcar: Add XHCI_TRUST_TX_LENGTH quirk xhci: Fix port resume done detection for SS ports with LPM enabled usb: xhci: dbc: Don't free all memory with spinlock held xhci: Don't let USB3 ports stuck in polling state prevent suspend usb: cdc-acm: fix race during wakeup blocking TX traffic mm: add support for kmem caches in DMA32 zone iommu/io-pgtable-arm-v7s: request DMA32 memory, and improve debugging mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified mm/migrate.c: add missing flush_dcache_page for non-mapped page migrate perf pmu: Fix parser error for uncore event alias perf intel-pt: Fix TSC slip objtool: Query pkg-config for libelf location powerpc/pseries/energy: Use OF accessor functions to read ibm,drc-indexes powerpc/64: Fix memcmp reading past the end of src/dest watchdog: Respect watchdog cpumask on CPU hotplug cpu/hotplug: Prevent crash when CPU bringup fails on CONFIG_HOTPLUG_CPU=n x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y KVM: Reject device ioctls from processes other than the VM's creator KVM: x86: update %rip after emulating IO KVM: x86: Emulate MSR_IA32_ARCH_CAPABILITIES on AMD hosts staging: erofs: fix error handling when failed to read compresssed data staging: erofs: keep corrupted fs from crashing kernel in erofs_readdir() bpf: do not restore dst_reg when cur_state is freed drivers: base: Helpers for adding device connection descriptions platform: x86: intel_cht_int33fe: Register all connections at once platform: x86: intel_cht_int33fe: Add connection for the DP alt mode platform: x86: intel_cht_int33fe: Add connections for the USB Type-C port usb: typec: class: Don't use port parent for getting mux handles platform: x86: intel_cht_int33fe: Remove the old connections for the muxes Linux 4.19.33 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Lars Persson
|
f70ddae24b |
mm/migrate.c: add missing flush_dcache_page for non-mapped page migrate
commit d2b2c6dd227ba5b8a802858748ec9a780cb75b47 upstream. Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL and SIGSEGV that could not be traced back to a userspace code bug. They had all the magic signs of an I/D cache coherency issue. Now recently we noticed that the /proc/sys/vm/compact_memory interface was quite efficient at provoking this class of userspace crashes. Studying the code in mm/migrate.c there is a distinction made between migrating a page that is mapped at the instant of migration and one that is not mapped. Our problem turned out to be the non-mapped pages. For the non-mapped page the code performs a copy of the page content and all relevant meta-data of the page without doing the required D-cache maintenance. This leaves dirty data in the D-cache of the CPU and on the 1004K cores this data is not visible to the I-cache. A subsequent page-fault that triggers a mapping of the page will happily serve the process with potentially stale code. What about ARM then, this bug should have seen greater exposure? Well ARM became immune to this flaw back in 2010, see commit |
||
Yang Shi
|
5966777dd8 |
mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
commit a7f40cfe3b7ada57af9b62fd28430eeb4a7cfcb7 upstream. When MPOL_MF_STRICT was specified and an existing page was already on a node that does not follow the policy, mbind() should return -EIO. But commit |
||
Nicolas Boichat
|
62d342d670 |
mm: add support for kmem caches in DMA32 zone
commit 6d6ea1e967a246f12cfe2f5fb743b70b2e608d4a upstream.
Patch series "iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables",
v6.
This is a followup to the discussion in [1], [2].
IOMMUs using ARMv7 short-descriptor format require page tables (level 1
and 2) to be allocated within the first 4GB of RAM, even on 64-bit
systems.
For L1 tables that are bigger than a page, we can just use
__get_free_pages with GFP_DMA32 (on arm64 systems only, arm would still
use GFP_DMA).
For L2 tables that only take 1KB, it would be a waste to allocate a full
page, so we considered 3 approaches:
1. This series, adding support for GFP_DMA32 slab caches.
2. genalloc, which requires pre-allocating the maximum number of L2 page
tables (4096, so 4MB of memory).
3. page_frag, which is not very memory-efficient as it is unable to reuse
freed fragments until the whole page is freed. [3]
This series is the most memory-efficient approach.
stable@ note:
We confirmed that this is a regression, and IOMMU errors happen on 4.19
and linux-next/master on MT8173 (elm, Acer Chromebook R13). The issue
most likely starts from commit
|
||
Linus Torvalds
|
1d1d2f0762 |
UPSTREAM: filemap: add a comment about FAULT_FLAG_RETRY_NOWAIT behavior
I thought Josef Bacik's patch to drop the mmap_sem was buggy, because when looking at the error cases, there was one case where we returned VM_FAULT_RETRY without actually dropping the mmap_sem. Josef had to explain to me (using small words) that yes, that's actually what we're supposed to do, and his patch was correct. Which not only convinced me he knew what he was doing and I should stop arguing with him, but also that I should add a comment to the case I was confused about. Bug: 124328118 Change-Id: I28390f4924110ee259c20304753bd64a8f6121f8 Patiently-pointed-out-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 8b0f9fa2e02dc95216577c3387b0707c5f60fbaf) Signed-off-by: Minchan Kim <minchan@google.com> |
||
Josef Bacik
|
9f5fa1fd7a |
BACKPORT: filemap: drop the mmap_sem for all blocking operations
Currently we only drop the mmap_sem if there is contention on the page lock. The idea is that we issue readahead and then go to lock the page while it is under IO and we want to not hold the mmap_sem during the IO. The problem with this is the assumption that the readahead does anything. In the case that the box is under extreme memory or IO pressure we may end up not reading anything at all for readahead, which means we will end up reading in the page under the mmap_sem. Even if the readahead does something, it could get throttled because of io pressure on the system and the process is in a lower priority cgroup. Holding the mmap_sem while doing IO is problematic because it can cause system-wide priority inversions. Consider some large company that does a lot of web traffic. This large company has load balancing logic in it's core web server, cause some engineer thought this was a brilliant plan. This load balancing logic gets statistics from /proc about the system, which trip over processes mmap_sem for various reasons. Now the web server application is in a protected cgroup, but these other processes may not be, and if they are being throttled while their mmap_sem is held we'll stall, and cause this nice death spiral. Instead rework filemap fault path to drop the mmap sem at any point that we may do IO or block for an extended period of time. This includes while issuing readahead, locking the page, or needing to call ->readpage because readahead did not occur. Then once we have a fully uptodate page we can return with VM_FAULT_RETRY and come back again to find our nicely in-cache page that was gotten outside of the mmap_sem. This patch also adds a new helper for locking the page with the mmap_sem dropped. This doesn't make sense currently as generally speaking if the page is already locked it'll have been read in (unless there was an error) before it was unlocked. However a forthcoming patchset will change this with the ability to abort read-ahead bio's if necessary, making it more likely that we could contend for a page lock and still have a not uptodate page. This allows us to deal with this case by grabbing the lock and issuing the IO without the mmap_sem held, and then returning VM_FAULT_RETRY to come back around. Test: fio mmap ioengine with 8-thread on 4cpu under memory pressure for stability check Test: locking holding latency measure with lockstat for performance check Bug: 124328118 Change-Id: I0af79cbe6f2ef9fe3927961e042347490b892684 [josef@toxicpanda.com: v6] Link: http://lkml.kernel.org/r/20181212152757.10017-1-josef@toxicpanda.com [kirill@shutemov.name: fix race in filemap_fault()] Link: http://lkml.kernel.org/r/20181228235106.okk3oastsnpxusxs@kshutemo-mobl1 [akpm@linux-foundation.org: coding style fixes] Link: http://lkml.kernel.org/r/20181211173801.29535-4-josef@toxicpanda.com Signed-off-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Tested-by: syzbot+b437b5a429d680cf2217@syzkaller.appspotmail.com Cc: Dave Chinner <david@fromorbit.com> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 6b4c9f4469819a0c1a38a0a4541337e0f9bf6c11) Signed-off-by: Minchan Kim <minchan@google.com> |
||
Josef Bacik
|
76b6ee8f38 |
BACKPORT: filemap: kill page_cache_read usage in filemap_fault
Patch series "drop the mmap_sem when doing IO in the fault path", v6. Now that we have proper isolation in place with cgroups2 we have started going through and fixing the various priority inversions. Most are all gone now, but this one is sort of weird since it's not necessarily a priority inversion that happens within the kernel, but rather because of something userspace does. We have giant applications that we want to protect, and parts of these giant applications do things like watch the system state to determine how healthy the box is for load balancing and such. This involves running 'ps' or other such utilities. These utilities will often walk /proc/<pid>/whatever, and these files can sometimes need to down_read(&task->mmap_sem). Not usually a big deal, but we noticed when we are stress testing that sometimes our protected application has latency spikes trying to get the mmap_sem for tasks that are in lower priority cgroups. This is because any down_write() on a semaphore essentially turns it into a mutex, so even if we currently have it held for reading, any new readers will not be allowed on to keep from starving the writer. This is fine, except a lower priority task could be stuck doing IO because it has been throttled to the point that its IO is taking much longer than normal. But because a higher priority group depends on this completing it is now stuck behind lower priority work. In order to avoid this particular priority inversion we want to use the existing retry mechanism to stop from holding the mmap_sem at all if we are going to do IO. This already exists in the read case sort of, but needed to be extended for more than just grabbing the page lock. With io.latency we throttle at submit_bio() time, so the readahead stuff can block and even page_cache_read can block, so all these paths need to have the mmap_sem dropped. The other big thing is ->page_mkwrite. btrfs is particularly shitty here because we have to reserve space for the dirty page, which can be a very expensive operation. We use the same retry method as the read path, and simply cache the page and verify the page is still setup properly the next pass through ->page_mkwrite(). I've tested these patches with xfstests and there are no regressions. This patch (of 3): If we do not have a page at filemap_fault time we'll do this weird forced page_cache_read thing to populate the page, and then drop it again and loop around and find it. This makes for 2 ways we can read a page in filemap_fault, and it's not really needed. Instead add a FGP_FOR_MMAP flag so that pagecache_get_page() will return a unlocked page that's in pagecache. Then use the normal page locking and readpage logic already in filemap_fault. This simplifies the no page in page cache case significantly. Bug: 124328118 Change-Id: I6e04df24ba52bb96c05c9386248983f477ab78e5 [akpm@linux-foundation.org: fix comment text] [josef@toxicpanda.com: don't unlock null page in FGP_FOR_MMAP case] Link: http://lkml.kernel.org/r/20190312201742.22935-1-josef@toxicpanda.com Link: http://lkml.kernel.org/r/20181211173801.29535-2-josef@toxicpanda.com Signed-off-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Rik van Riel <riel@redhat.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit a75d4c33377277b6034dd1e2663bce444f952c14) Signed-off-by: Minchan Kim <minchan@google.com> |
||
Josef Bacik
|
0cb18b7b49 |
UPSTREAM: filemap: pass vm_fault to the mmap ra helpers
All of the arguments to these functions come from the vmf. Cut down on the amount of arguments passed by simply passing in the vmf to these two helpers. Bug: 124328118 Change-Id: I874f8320d8393f2e39aab030fcf43e51c882e90a Link: http://lkml.kernel.org/r/20181211173801.29535-3-josef@toxicpanda.com Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 2a1180f1bd389e9d47693e5eb384b95f482d8d19) Signed-off-by: Minchan Kim <minchan@google.com> |
||
Greg Kroah-Hartman
|
bb418a146a |
This is the 4.19.31 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlyWhJcACgkQONu9yGCS aT6XzxAAzP2QGzC4SVPgcFH1woF/d8Cz0zQ81mLXzjXtEPm39fZCM2hbBnxkXLu1 peFyrKNk6/c9541D9gsQCQT6Fu+H6u1bJKcIezlKJ2xyB/MsU1hXkjZrTJYW3RRs gimy1EGdood2el1ubEBZiaspazoeRzBqtg1Nsmr4V0l+RT8HwtKKw+0+Nxixfp59 NoVkqTpPI5mL0FiH2R9ogcfg3SvgMZOsOhOBjdPvSjiJJsbvIWcW48MCs95XSUpF R+l/fWn+oiFCcIqBaFheujuqZMvVrUHZHaWAPMuoR/c3Cdf0lTBokdv6UM9c0nv3 61jX5r5ImRI/dfQANN5mbB1YKcs5xOI+I7QZHQ2q4clsWrWyLapXW4clrAZJ6z5t UVeVbuLV2y5PL9GJyBcXpyY0BOf4e2gZURaPY3C5McNwgybNoiR0ZePqKb8ZhZyh jYOYRoBjJJpZoVTSt6MNX95NTvGaSAtqKMu1s3IeMfpwCfQKBPMOuBHr/dUqSC6I U0xxjk/71C15dSPVcTVJT/lmcKc6TXgoagnfbn8GBtDOAjBNsYyUJLQI+db1ERCe 9MEB9k1Z87ROQ5jQCQmWsewOVAtFZBEvSszFmpKv3zTe8M2oFpXG56zckdiumwHU nSfeZTTeWzsFJd30MioEnGYm3ZwKwZx7wi0x4B4WWvBfSpp20Us= =xtLx -----END PGP SIGNATURE----- Merge 4.19.31 into android-4.19 Changes in 4.19.31 media: videobuf2-v4l2: drop WARN_ON in vb2_warn_zero_bytesused() 9p: use inode->i_lock to protect i_size_write() under 32-bit 9p/net: fix memory leak in p9_client_create ASoC: fsl_esai: fix register setting issue in RIGHT_J mode ASoC: codecs: pcm186x: fix wrong usage of DECLARE_TLV_DB_SCALE() ASoC: codecs: pcm186x: Fix energysense SLEEP bit iio: adc: exynos-adc: Fix NULL pointer exception on unbind mei: hbm: clean the feature flags on link reset mei: bus: move hw module get/put to probe/release stm class: Fix an endless loop in channel allocation crypto: caam - fix hash context DMA unmap size crypto: ccree - fix missing break in switch statement crypto: caam - fixed handling of sg list crypto: caam - fix DMA mapping of stack memory crypto: ccree - fix free of unallocated mlli buffer crypto: ccree - unmap buffer before copying IV crypto: ccree - don't copy zero size ciphertext crypto: cfb - add missing 'chunksize' property crypto: cfb - remove bogus memcpy() with src == dest crypto: ahash - fix another early termination in hash walk crypto: rockchip - fix scatterlist nents error crypto: rockchip - update new iv to device in multiple operations drm/imx: ignore plane updates on disabled crtcs gpu: ipu-v3: Fix i.MX51 CSI control registers offset drm/imx: imx-ldb: add missing of_node_puts gpu: ipu-v3: Fix CSI offsets for imx53 ASoC: rt5682: Correct the setting while select ASRC clk for AD/DA filter clocksource: timer-ti-dm: Fix pwm dmtimer usage of fck reparenting KVM: arm/arm64: vgic: Make vgic_dist->lpi_list_lock a raw_spinlock arm64: dts: rockchip: fix graph_port warning on rk3399 bob kevin and excavator s390/dasd: fix using offset into zero size array error Input: pwm-vibra - prevent unbalanced regulator Input: pwm-vibra - stop regulator after disabling pwm, not before ARM: dts: Configure clock parent for pwm vibra ARM: OMAP2+: Variable "reg" in function omap4_dsi_mux_pads() could be uninitialized ASoC: dapm: fix out-of-bounds accesses to DAPM lookup tables ASoC: rsnd: fixup rsnd_ssi_master_clk_start() user count check KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded arm/arm64: KVM: Allow a VCPU to fully reset itself arm/arm64: KVM: Don't panic on failure to properly reset system registers KVM: arm/arm64: vgic: Always initialize the group of private IRQs KVM: arm64: Forbid kprobing of the VHE world-switch code ASoC: samsung: Prevent clk_get_rate() calls in atomic context ARM: OMAP2+: fix lack of timer interrupts on CPU1 after hotplug Input: cap11xx - switch to using set_brightness_blocking() Input: ps2-gpio - flush TX work when closing port Input: matrix_keypad - use flush_delayed_work() mac80211: call drv_ibss_join() on restart mac80211: Fix Tx aggregation session tear down with ITXQs netfilter: compat: initialize all fields in xt_init blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue ipvs: fix dependency on nf_defrag_ipv6 floppy: check_events callback should not return a negative number xprtrdma: Make sure Send CQ is allocated on an existing compvec NFS: Don't use page_file_mapping after removing the page mm/gup: fix gup_pmd_range() for dax Revert "mm: use early_pfn_to_nid in page_ext_init" scsi: qla2xxx: Fix panic from use after free in qla2x00_async_tm_cmd net: dsa: bcm_sf2: potential array overflow in bcm_sf2_sw_suspend() x86/CPU: Add Icelake model number mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs net: hns: Fix object reference leaks in hns_dsaf_roce_reset() i2c: cadence: Fix the hold bit setting i2c: bcm2835: Clear current buffer pointers and counts after a transfer auxdisplay: ht16k33: fix potential user-after-free on module unload Input: st-keyscan - fix potential zalloc NULL dereference clk: sunxi-ng: v3s: Fix TCON reset de-assert bit kallsyms: Handle too long symbols in kallsyms.c clk: sunxi: A31: Fix wrong AHB gate number esp: Skip TX bytes accounting when sending from a request socket ARM: 8824/1: fix a migrating irq bug when hotplug cpu bpf: only adjust gso_size on bytestream protocols bpf: fix lockdep false positive in stackmap af_key: unconditionally clone on broadcast ARM: 8835/1: dma-mapping: Clear DMA ops on teardown assoc_array: Fix shortcut creation keys: Fix dependency loop between construction record and auth key scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task net: systemport: Fix reception of BPDUs net: dsa: bcm_sf2: Do not assume DSA master supports WoL pinctrl: meson: meson8b: fix the sdxc_a data 1..3 pins qmi_wwan: apply SET_DTR quirk to Sierra WP7607 net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe() xfrm: Fix inbound traffic via XFRM interfaces across network namespaces mailbox: bcm-flexrm-mailbox: Fix FlexRM ring flush timeout issue ASoC: topology: free created components in tplg load error qed: Fix iWARP buffer size provided for syn packet processing. qed: Fix iWARP syn packet mac address validation. ARM: dts: armada-xp: fix Armada XP boards NAND description arm64: Relax GIC version check during early boot ARM: tegra: Restore DT ABI on Tegra124 Chromebooks net: marvell: mvneta: fix DMA debug warning mm: handle lru_add_drain_all for UP properly tmpfs: fix link accounting when a tmpfile is linked in ixgbe: fix older devices that do not support IXGBE_MRQC_L3L4TXSWEN ARCv2: lib: memcpy: fix doing prefetchw outside of buffer ARC: uacces: remove lp_start, lp_end from clobber list ARCv2: support manual regfile save on interrupts ARCv2: don't assume core 0x54 has dual issue phonet: fix building with clang mac80211_hwsim: propagate genlmsg_reply return code bpf, lpm: fix lookup bug in map_delete_elem net: thunderx: make CFG_DONE message to run through generic send-ack sequence net: thunderx: add nicvf_send_msg_to_pf result check for set_rx_mode_task nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K nfp: bpf: fix ALU32 high bits clearance bug bnxt_en: Fix typo in firmware message timeout logic. bnxt_en: Wait longer for the firmware message response to complete. net: set static variable an initial value in atl2_probe() selftests: fib_tests: sleep after changing carrier. again. tmpfs: fix uninitialized return value in shmem_link stm class: Prevent division by zero nfit: acpi_nfit_ctl(): Check out_obj->type in the right place acpi/nfit: Fix bus command validation nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot nfit/ars: Attempt short-ARS even in the no_init_ars case libnvdimm/label: Clear 'updating' flag after label-set update libnvdimm, pfn: Fix over-trim in trim_pfn_device() libnvdimm/pmem: Honor force_raw for legacy pmem regions libnvdimm: Fix altmap reservation size calculation fix cgroup_do_mount() handling of failure exits crypto: aead - set CRYPTO_TFM_NEED_KEY if ->setkey() fails crypto: aegis - fix handling chunked inputs crypto: arm/crct10dif - revert to C code for short inputs crypto: arm64/aes-neonbs - fix returning final keystream block crypto: arm64/crct10dif - revert to C code for short inputs crypto: hash - set CRYPTO_TFM_NEED_KEY if ->setkey() fails crypto: morus - fix handling chunked inputs crypto: pcbc - remove bogus memcpy()s with src == dest crypto: skcipher - set CRYPTO_TFM_NEED_KEY if ->setkey() fails crypto: testmgr - skip crc32c context test for ahash algorithms crypto: x86/aegis - fix handling chunked inputs and MAY_SLEEP crypto: x86/aesni-gcm - fix crash on empty plaintext crypto: x86/morus - fix handling chunked inputs and MAY_SLEEP crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling crypto: arm64/aes-ccm - fix bugs in non-NEON fallback routine CIFS: Do not reset lease state to NONE on lease break CIFS: Do not skip SMB2 message IDs on send failures CIFS: Fix read after write for files with read caching tracing: Use strncpy instead of memcpy for string keys in hist triggers tracing: Do not free iter->trace in fail path of tracing_open_pipe() tracing/perf: Use strndup_user() instead of buggy open-coded version xen: fix dom0 boot on huge systems ACPI / device_sysfs: Avoid OF modalias creation for removed device mmc: sdhci-esdhc-imx: fix HS400 timing issue mmc:fix a bug when max_discard is 0 netfilter: ipt_CLUSTERIP: fix warning unused variable cn spi: ti-qspi: Fix mmap read when more than one CS in use spi: pxa2xx: Setup maximum supported DMA transfer length regulator: s2mps11: Fix steps for buck7, buck8 and LDO35 regulator: max77620: Initialize values for DT properties regulator: s2mpa01: Fix step values for some LDOs clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability s390/setup: fix early warning messages s390/virtio: handle find on invalid queue gracefully scsi: virtio_scsi: don't send sc payload with tmfs scsi: aacraid: Fix performance issue on logical drives scsi: sd: Optimal I/O size should be a multiple of physical block size scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock scsi: qla2xxx: Fix LUN discovery if loop id is not assigned yet by firmware fs/devpts: always delete dcache dentry-s in dput() splice: don't merge into linked buffers ovl: During copy up, first copy up data and then xattrs ovl: Do not lose security.capability xattr over metadata file copy-up m68k: Add -ffreestanding to CFLAGS Btrfs: setup a nofs context for memory allocation at btrfs_create_tree() Btrfs: setup a nofs context for memory allocation at __btrfs_set_acl btrfs: ensure that a DUP or RAID1 block group has exactly two stripes Btrfs: fix corruption reading shared and compressed extents after hole punching soc: qcom: rpmh: Avoid accessing freed memory from batch API libertas_tf: don't set URB_ZERO_PACKET on IN USB transfer irqchip/gic-v3-its: Avoid parsing _indirect_ twice for Device table irqchip/brcmstb-l2: Use _irqsave locking variants in non-interrupt code x86/kprobes: Prohibit probing on optprobe template code cpufreq: kryo: Release OPP tables on module removal cpufreq: tegra124: add missing of_node_put() cpufreq: pxa2xx: remove incorrect __init annotation ext4: fix check of inode in swap_inode_boot_loader ext4: cleanup pagecache before swap i_data ext4: update quota information while swapping boot loader inode ext4: add mask of ext4 flags to swap ext4: fix crash during online resizing PCI/ASPM: Use LTR if already enabled by platform PCI/DPC: Fix print AER status in DPC event handling PCI: dwc: skip MSI init if MSIs have been explicitly disabled IB/hfi1: Close race condition on user context disable and close cxl: Wrap iterations over afu slices inside 'afu_list_lock' ext2: Fix underflow in ext2_max_size() clk: uniphier: Fix update register for CPU-gear clk: clk-twl6040: Fix imprecise external abort for pdmclk clk: samsung: exynos5: Fix possible NULL pointer exception on platform_device_alloc() failure clk: samsung: exynos5: Fix kfree() of const memory on setting driver_override clk: ingenic: Fix round_rate misbehaving with non-integer dividers clk: ingenic: Fix doc of ingenic_cgu_div_info usb: chipidea: tegra: Fix missed ci_hdrc_remove_device() usb: typec: tps6598x: handle block writes separately with plain-I2C adapters dmaengine: usb-dmac: Make DMAC system sleep callbacks explicit mm: hwpoison: fix thp split handing in soft_offline_in_use_page() mm/vmalloc: fix size check for remap_vmalloc_range_partial() mm/memory.c: do_fault: avoid usage of stale vm_area_struct kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv device property: Fix the length used in PROPERTY_ENTRY_STRING() intel_th: Don't reference unassigned outputs parport_pc: fix find_superio io compare code, should use equal test. i2c: tegra: fix maximum transfer size media: i2c: ov5640: Fix post-reset delay gpio: pca953x: Fix dereference of irq data in shutdown can: flexcan: FLEXCAN_IFLAG_MB: add () around macro argument drm/i915: Relax mmap VMA check bpf: only test gso type on gso packets serial: uartps: Fix stuck ISR if RX disabled with non-empty FIFO serial: 8250_of: assume reg-shift of 2 for mrvl,mmp-uart serial: 8250_pci: Fix number of ports for ACCES serial cards serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup() jbd2: clear dirty flag when revoking a buffer from an older transaction jbd2: fix compile warning when using JBUFFER_TRACE selinux: add the missing walk_size + len check in selinux_sctp_bind_connect security/selinux: fix SECURITY_LSM_NATIVE_LABELS on reused superblock powerpc/32: Clear on-stack exception marker upon exception return powerpc/wii: properly disable use of BATs when requested. powerpc/powernv: Make opal log only readable by root powerpc/83xx: Also save/restore SPRG4-7 during suspend powerpc/powernv: Don't reprogram SLW image on every KVM guest entry/exit powerpc: Fix 32-bit KVM-PR lockup and host crash with MacOS guest powerpc/ptrace: Simplify vr_get/set() to avoid GCC warning powerpc/hugetlb: Don't do runtime allocation of 16G pages in LPAR configuration powerpc/traps: fix recoverability of machine check handling on book3s/32 powerpc/traps: Fix the message printed when stack overflows ARM: s3c24xx: Fix boolean expressions in osiris_dvs_notify arm64: Fix HCR.TGE status for NMI contexts arm64: debug: Ensure debug handlers check triggering exception level arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2 ipmi_si: fix use-after-free of resource->name dm: fix to_sector() for 32bit dm integrity: limit the rate of error messages mfd: sm501: Fix potential NULL pointer dereference cpcap-charger: generate events for userspace NFS: Fix I/O request leakages NFS: Fix an I/O request leakage in nfs_do_recoalesce NFS: Don't recoalesce on error in nfs_pageio_complete_mirror() nfsd: fix performance-limiting session calculation nfsd: fix memory corruption caused by readdir nfsd: fix wrong check in write_v4_end_grace() NFSv4.1: Reinitialise sequence results before retransmitting a request svcrpc: fix UDP on servers with lots of threads PM / wakeup: Rework wakeup source timer cancellation bcache: never writeback a discard operation stable-kernel-rules.rst: add link to networking patch queue vt: perform safe console erase in the right order x86/unwind/orc: Fix ORC unwind table alignment perf intel-pt: Fix CYC timestamp calculation after OVF perf tools: Fix split_kallsyms_for_kcore() for trampoline symbols perf auxtrace: Define auxtrace record alignment perf intel-pt: Fix overlap calculation for padding perf/x86/intel/uncore: Fix client IMC events return huge result perf intel-pt: Fix divide by zero when TSC is not available md: Fix failed allocation of md_register_thread tpm/tpm_crb: Avoid unaligned reads in crb_recv() tpm: Unify the send callback behaviour rcu: Do RCU GP kthread self-wakeup from softirq and interrupt media: imx: prpencvf: Stop upstream before disabling IDMA channel media: lgdt330x: fix lock status reporting media: uvcvideo: Avoid NULL pointer dereference at the end of streaming media: vimc: Add vimc-streamer for stream control media: imx: csi: Disable CSI immediately after last EOF media: imx: csi: Stop upstream before disabling IDMA channel drm/fb-helper: generic: Fix drm_fbdev_client_restore() drm/radeon/evergreen_cs: fix missing break in switch statement drm/amd/powerplay: correct power reading on fiji drm/amd/display: don't call dm_pp_ function from an fpu block KVM: Call kvm_arch_memslots_updated() before updating memslots KVM: x86/mmu: Detect MMIO generation wrap in any address space KVM: x86/mmu: Do not cache MMIO accesses while memslots are in flux KVM: nVMX: Sign extend displacements of VMX instr's mem operands KVM: nVMX: Apply addr size mask to effective address for VMX instructions KVM: nVMX: Ignore limit checks on VMX instructions using flat segments bcache: use (REQ_META|REQ_PRIO) to indicate bio for metadata s390/setup: fix boot crash for machine without EDAT-1 Linux 4.19.31 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Jan Stancek
|
09417dd35e |
mm/memory.c: do_fault: avoid usage of stale vm_area_struct
commit fc8efd2ddfed3f343c11b693e87140ff358d7ff5 upstream. LTP testcase mtest06 [1] can trigger a crash on s390x running 5.0.0-rc8. This is a stress test, where one thread mmaps/writes/munmaps memory area and other thread is trying to read from it: CPU: 0 PID: 2611 Comm: mmap1 Not tainted 5.0.0-rc8+ #51 Hardware name: IBM 2964 N63 400 (z/VM 6.4.0) Krnl PSW : 0404e00180000000 00000000001ac8d8 (__lock_acquire+0x7/0x7a8) Call Trace: ([<0000000000000000>] (null)) [<00000000001adae4>] lock_acquire+0xec/0x258 [<000000000080d1ac>] _raw_spin_lock_bh+0x5c/0x98 [<000000000012a780>] page_table_free+0x48/0x1a8 [<00000000002f6e54>] do_fault+0xdc/0x670 [<00000000002fadae>] __handle_mm_fault+0x416/0x5f0 [<00000000002fb138>] handle_mm_fault+0x1b0/0x320 [<00000000001248cc>] do_dat_exception+0x19c/0x2c8 [<000000000080e5ee>] pgm_check_handler+0x19e/0x200 page_table_free() is called with NULL mm parameter, but because "0" is a valid address on s390 (see S390_lowcore), it keeps going until it eventually crashes in lockdep's lock_acquire. This crash is reproducible at least since 4.14. Problem is that "vmf->vma" used in do_fault() can become stale. Because mmap_sem may be released, other threads can come in, call munmap() and cause "vma" be returned to kmem cache, and get zeroed/re-initialized and re-used: handle_mm_fault | __handle_mm_fault | do_fault | vma = vmf->vma | do_read_fault | __do_fault | vma->vm_ops->fault(vmf); | mmap_sem is released | | | do_munmap() | remove_vma_list() | remove_vma() | vm_area_free() | # vma is released | ... | # same vma is allocated | # from kmem cache | do_mmap() | vm_area_alloc() | memset(vma, 0, ...) | pte_free(vma->vm_mm, ...); | page_table_free | spin_lock_bh(&mm->context.lock);| <crash> | Cache mm_struct to avoid using potentially stale "vma". [1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/mtest06/mmap1.c Link: http://lkml.kernel.org/r/5b3fdf19e2a5be460a384b936f5b56e13733f1b8.1551595137.git.jstancek@redhat.com Signed-off-by: Jan Stancek <jstancek@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Acked-by: Rafael Aquini <aquini@redhat.com> Reviewed-by: Minchan Kim <minchan@kernel.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Rik van Riel <riel@surriel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Roman Penyaev
|
c1ddc7b785 |
mm/vmalloc: fix size check for remap_vmalloc_range_partial()
commit 401592d2e095947344e10ec0623adbcd58934dd4 upstream. When VM_NO_GUARD is not set area->size includes adjacent guard page, thus for correct size checking get_vm_area_size() should be used, but not area->size. This fixes possible kernel oops when userspace tries to mmap an area on 1 page bigger than was allocated by vmalloc_user() call: the size check inside remap_vmalloc_range_partial() accounts non-existing guard page also, so check successfully passes but vmalloc_to_page() returns NULL (guard page does not physically exist). The following code pattern example should trigger an oops: static int oops_mmap(struct file *file, struct vm_area_struct *vma) { void *mem; mem = vmalloc_user(4096); BUG_ON(!mem); /* Do not care about mem leak */ return remap_vmalloc_range(vma, mem, 0); } And userspace simply mmaps size + PAGE_SIZE: mmap(NULL, 8192, PROT_WRITE|PROT_READ, MAP_PRIVATE, fd, 0); Possible candidates for oops which do not have any explicit size checks: *** drivers/media/usb/stkwebcam/stk-webcam.c: v4l_stk_mmap[789] ret = remap_vmalloc_range(vma, sbuf->buffer, 0); Or the following one: *** drivers/video/fbdev/core/fbmem.c static int fb_mmap(struct file *file, struct vm_area_struct * vma) ... res = fb->fb_mmap(info, vma); Where fb_mmap callback calls remap_vmalloc_range() directly without any explicit checks: *** drivers/video/fbdev/vfb.c static int vfb_mmap(struct fb_info *info, struct vm_area_struct *vma) { return remap_vmalloc_range(vma, (void *)info->fix.smem_start, vma->vm_pgoff); } Link: http://lkml.kernel.org/r/20190103145954.16942-2-rpenyaev@suse.de Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Joe Perches <joe@perches.com> Cc: "Luis R. Rodriguez" <mcgrof@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
zhongjiang
|
234c0cc982 |
mm: hwpoison: fix thp split handing in soft_offline_in_use_page()
commit 46612b751c4941c5c0472ddf04027e877ae5990f upstream. When soft_offline_in_use_page() runs on a thp tail page after pmd is split, we trigger the following VM_BUG_ON_PAGE(): Memory failure: 0x3755ff: non anonymous thp __get_any_page: 0x3755ff: unknown zero refcount page type 2fffff80000000 Soft offlining pfn 0x34d805 at process virtual address 0x20fff000 page:ffffea000d360140 count:0 mapcount:0 mapping:0000000000000000 index:0x1 flags: 0x2fffff80000000() raw: 002fffff80000000 ffffea000d360108 ffffea000d360188 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) ------------[ cut here ]------------ kernel BUG at ./include/linux/mm.h:519! soft_offline_in_use_page() passed refcount and page lock from tail page to head page, which is not needed because we can pass any subpage to split_huge_page(). Naoya had fixed a similar issue in |
||
Darrick J. Wong
|
b3139fbb3b |
tmpfs: fix uninitialized return value in shmem_link
[ Upstream commit 29b00e609960ae0fcff382f4c7079dd0874a5311 ] When we made the shmem_reserve_inode call in shmem_link conditional, we forgot to update the declaration for ret so that it always has a known value. Dan Carpenter pointed out this deficiency in the original patch. Fixes: 1062af920c07 ("tmpfs: fix link accounting when a tmpfile is linked in") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Matej Kupljen <matej.kupljen@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Darrick J. Wong
|
064a61d3e7 |
tmpfs: fix link accounting when a tmpfile is linked in
[ Upstream commit 1062af920c07f5b54cf5060fde3339da6df0cf6b ]
tmpfs has a peculiarity of accounting hard links as if they were
separate inodes: so that when the number of inodes is limited, as it is
by default, a user cannot soak up an unlimited amount of unreclaimable
dcache memory just by repeatedly linking a file.
But when v3.11 added O_TMPFILE, and the ability to use linkat() on the
fd, we missed accommodating this new case in tmpfs: "df -i" shows that
an extra "inode" remains accounted after the file is unlinked and the fd
closed and the actual inode evicted. If a user repeatedly links
tmpfiles into a tmpfs, the limit will be hit (ENOSPC) even after they
are deleted.
Just skip the extra reservation from shmem_link() in this case: there's
a sense in which this first link of a tmpfile is then cheaper than a
hard link of another file, but the accounting works out, and there's
still good limiting, so no need to do anything more complicated.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1902182134370.7035@eggly.anvils
Fixes:
|
||
Michal Hocko
|
e6e9d6e290 |
mm: handle lru_add_drain_all for UP properly
[ Upstream commit 6ea183d60c469560e7b08a83c9804299e84ec9eb ]
Since for_each_cpu(cpu, mask) added by commit
|
||
Jann Horn
|
33e83ea302 |
mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs
[ Upstream commit 2c2ade81741c66082f8211f0b96cf509cc4c0218 ] The basic idea behind ->pagecnt_bias is: If we pre-allocate the maximum number of references that we might need to create in the fastpath later, the bump-allocation fastpath only has to modify the non-atomic bias value that tracks the number of extra references we hold instead of the atomic refcount. The maximum number of allocations we can serve (under the assumption that no allocation is made with size 0) is nc->size, so that's the bias used. However, even when all memory in the allocation has been given away, a reference to the page is still held; and in the `offset < 0` slowpath, the page may be reused if everyone else has dropped their references. This means that the necessary number of references is actually `nc->size+1`. Luckily, from a quick grep, it looks like the only path that can call page_frag_alloc(fragsz=1) is TAP with the IFF_NAPI_FRAGS flag, which requires CAP_NET_ADMIN in the init namespace and is only intended to be used for kernel testing and fuzzing. To test for this issue, put a `WARN_ON(page_ref_count(page) == 0)` in the `offset < 0` path, below the virt_to_page() call, and then repeatedly call writev() on a TAP device with IFF_TAP|IFF_NO_PI|IFF_NAPI_FRAGS|IFF_NAPI, with a vector consisting of 15 elements containing 1 byte each. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Qian Cai
|
53dcaeeff1 |
Revert "mm: use early_pfn_to_nid in page_ext_init"
[ Upstream commit 2f1ee0913ce58efe7f18fbd518bd54c598559b89 ] This reverts commit |
||
Yu Zhao
|
8b1a7762e0 |
mm/gup: fix gup_pmd_range() for dax
[ Upstream commit 414fd080d125408cb15d04ff4907e1dd8145c8c7 ] For dax pmd, pmd_trans_huge() returns false but pmd_huge() returns true on x86. So the function works as long as hugetlb is configured. However, dax doesn't depend on hugetlb. Link: http://lkml.kernel.org/r/20190111034033.601-1-yuzhao@google.com Signed-off-by: Yu Zhao <yuzhao@google.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Keith Busch <keith.busch@intel.com> Cc: "Michael S . Tsirkin" <mst@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Johannes Weiner
|
e550f94252 |
UPSTREAM: psi: pressure stall information for CPU, memory, and IO
When systems are overcommitted and resources become contended, it's hard to tell exactly the impact this has on workload productivity, or how close the system is to lockups and OOM kills. In particular, when machines work multiple jobs concurrently, the impact of overcommit in terms of latency and throughput on the individual job can be enormous. In order to maximize hardware utilization without sacrificing individual job health or risk complete machine lockups, this patch implements a way to quantify resource pressure in the system. A kernel built with CONFIG_PSI=y creates files in /proc/pressure/ that expose the percentage of time the system is stalled on CPU, memory, or IO, respectively. Stall states are aggregate versions of the per-task delay accounting delays: cpu: some tasks are runnable but not executing on a CPU memory: tasks are reclaiming, or waiting for swapin or thrashing cache io: tasks are waiting for io completions These percentages of walltime can be thought of as pressure percentages, and they give a general sense of system health and productivity loss incurred by resource overcommit. They can also indicate when the system is approaching lockup scenarios and OOMs. To do this, psi keeps track of the task states associated with each CPU and samples the time they spend in stall states. Every 2 seconds, the samples are averaged across CPUs - weighted by the CPUs' non-idle time to eliminate artifacts from unused CPUs - and translated into percentages of walltime. A running average of those percentages is maintained over 10s, 1m, and 5m periods (similar to the loadaverage). [hannes@cmpxchg.org: doc fixlet, per Randy] Link: http://lkml.kernel.org/r/20180828205625.GA14030@cmpxchg.org [hannes@cmpxchg.org: code optimization] Link: http://lkml.kernel.org/r/20180907175015.GA8479@cmpxchg.org [hannes@cmpxchg.org: rename psi_clock() to psi_update_work(), per Peter] Link: http://lkml.kernel.org/r/20180907145404.GB11088@cmpxchg.org [hannes@cmpxchg.org: fix build] Link: http://lkml.kernel.org/r/20180913014222.GA2370@cmpxchg.org Link: http://lkml.kernel.org/r/20180828172258.3185-9-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Drake <drake@endlessm.com> Tested-by: Suren Baghdasaryan <surenb@google.com> Cc: Christopher Lameter <cl@linux.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <jweiner@fb.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Enderborg <peter.enderborg@sony.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit eb414681d5a07d28d2ff90dc05f69ec6b232ebd2) Bug: 127712811 Test: lmkd in PSI mode Change-Id: Id00d23c977169b0c4636d92016fc1fee0274be05 Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
Johannes Weiner
|
580f26b93e |
UPSTREAM: delayacct: track delays from thrashing cache pages
Delay accounting already measures the time a task spends in direct reclaim and waiting for swapin, but in low memory situations tasks spend can spend a significant amount of their time waiting on thrashing page cache. This isn't tracked right now. To know the full impact of memory contention on an individual task, measure the delay when waiting for a recently evicted active cache page to read back into memory. Also update tools/accounting/getdelays.c: [hannes@computer accounting]$ sudo ./getdelays -d -p 1 print delayacct stats ON PID 1 CPU count real total virtual total delay total delay average 50318 745000000 847346785 400533713 0.008ms IO count delay total delay average 435 122601218 0ms SWAP count delay total delay average 0 0 0ms RECLAIM count delay total delay average 0 0 0ms THRASHING count delay total delay average 19 12621439 0ms Link: http://lkml.kernel.org/r/20180828172258.3185-4-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Drake <drake@endlessm.com> Tested-by: Suren Baghdasaryan <surenb@google.com> Cc: Christopher Lameter <cl@linux.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <jweiner@fb.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Enderborg <peter.enderborg@sony.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit b1d29ba82cf2bc784f4c963ddd6a2cf29e229b33) Bug: 127712811 Test: lmkd in PSI mode Change-Id: I259f693987cf04e6a52ee7e8accf55a17e0de005 Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
Johannes Weiner
|
b4a56abdf7 |
UPSTREAM: mm: workingset: tell cache transitions from workingset thrashing
Refaults happen during transitions between workingsets as well as in-place thrashing. Knowing the difference between the two has a range of applications, including measuring the impact of memory shortage on the system performance, as well as the ability to smarter balance pressure between the filesystem cache and the swap-backed workingset. During workingset transitions, inactive cache refaults and pushes out established active cache. When that active cache isn't stale, however, and also ends up refaulting, that's bonafide thrashing. Introduce a new page flag that tells on eviction whether the page has been active or not in its lifetime. This bit is then stored in the shadow entry, to classify refaults as transitioning or thrashing. How many page->flags does this leave us with on 32-bit? 20 bits are always page flags 21 if you have an MMU 23 with the zone bits for DMA, Normal, HighMem, Movable 29 with the sparsemem section bits 30 if PAE is enabled 31 with this patch. So on 32-bit PAE, that leaves 1 bit for distinguishing two NUMA nodes. If that's not enough, the system can switch to discontigmem and re-gain the 6 or 7 sparsemem section bits. Link: http://lkml.kernel.org/r/20180828172258.3185-3-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Drake <drake@endlessm.com> Tested-by: Suren Baghdasaryan <surenb@google.com> Cc: Christopher Lameter <cl@linux.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <jweiner@fb.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Enderborg <peter.enderborg@sony.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 8508cf3ffad4defa202b303e5b6379efc4cd9054) Bug: 127712811 Test: lmkd in PSI mode Change-Id: I71df060dce5590a3c654f9a0e8e54deeb74b64c2 Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
Greg Kroah-Hartman
|
2e568c979c |
This is the 4.19.29 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlyJb/EACgkQONu9yGCS aT4y0g//b9t9/onhTaXcY/ByPmBAwqNgugi7eYcZqGDBp7aCDOBLF6eOwbhdvvuS ZTaZ5eWG3Twz3mZu9vveuskgMci2npDyLPgqBWGzW+Ef5r/xPd40diaI75ZUc68T gimWbQ0VANuXKklK6LysBUaVQWE3ilIy6qnnpj0DI3ipNDoE62Ry1LNthuKy+73J w6r7uwkb6X/CkXpNB/L4cDdpSy/CvhGQhd6p91lBuE4DfyPqEzslYCokD9aPXp9b Fedt/Re+8eULBNcgqPYxkS5pBrbHtqrGf00AMlzC8DkC+GZyDqSP2xjv6AiTfGJd uf0/Jvsv2OBnP4aYsbk+uB2z3plzPBgmXxa/1bm+yrGCMvbpi9mMx75HM2joAeVp tVN4ZN65kNgJkXCchJTHdQ3s6teOD8Par1czy570HyKBU6l1j3AGArGm+b4WGPWx dL+82coojMKxKNdTHfxUXES6QGKp716r3un6mCrKR0xET/SDayzDQMaSM8UOtArK ELzNeKzKTc5oBx6i+JfGmY8ZsedpNGCIPpsiuoSYAaon5ZzNbruzOAlDOThs157d YezDHZ9XMrx3kN/xYnqZD63x/5egq9REbZGWljeykbNkWcEY74jIkKwNLxqv3P64 JsLp60owvjzwtzKycjZogNU//GGNTBdb+6pESq4MxJpPTteFWnc= =n9iV -----END PGP SIGNATURE----- Merge 4.19.29 into android-4.19 Changes in 4.19.29 media: uvcvideo: Fix 'type' check leading to overflow vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel perf script: Fix crash with printing mixed trace point and other events perf core: Fix perf_proc_update_handler() bug perf tools: Handle TOPOLOGY headers with no CPU perf script: Fix crash when processing recorded stat data IB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM iommu/amd: Call free_iova_fast with pfn in map_sg iommu/amd: Unmap all mapped pages in error path of map_sg riscv: fixup max_low_pfn with PFN_DOWN. ipvs: Fix signed integer overflow when setsockopt timeout iommu/amd: Fix IOMMU page flush when detach device from a domain clk: ti: Fix error handling in ti_clk_parse_divider_data() clk: qcom: gcc: Use active only source for CPUSS clocks xtensa: SMP: fix ccount_timer_shutdown riscv: Adjust mmap base address at a third of task size IB/ipoib: Fix for use-after-free in ipoib_cm_tx_start selftests: cpu-hotplug: fix case where CPUs offline > CPUs present xtensa: SMP: fix secondary CPU initialization xtensa: smp_lx200_defconfig: fix vectors clash xtensa: SMP: mark each possible CPU as present iomap: get/put the page in iomap_page_create/release() iomap: fix a use after free in iomap_dio_rw xtensa: SMP: limit number of possible CPUs by NR_CPUS net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case net: hns: Fix for missing of_node_put() after of_parse_phandle() net: hns: Restart autoneg need return failed when autoneg off net: hns: Fix wrong read accesses via Clause 45 MDIO protocol net: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup() netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present gpio: vf610: Mask all GPIO interrupts selftests: net: use LDLIBS instead of LDFLAGS selftests: timers: use LDLIBS instead of LDFLAGS nfs: Fix NULL pointer dereference of dev_name qed: Fix bug in tx promiscuous mode settings qed: Fix LACP pdu drops for VFs qed: Fix VF probe failure while FLR qed: Fix system crash in ll2 xmit qed: Fix stack out of bounds bug scsi: libfc: free skb when receiving invalid flogi resp scsi: scsi_debug: fix write_same with virtual_gb problem scsi: bnx2fc: Fix error handling in probe() scsi: 53c700: pass correct "dev" to dma_alloc_attrs() platform/x86: Fix unmet dependency warning for ACPI_CMPC platform/x86: Fix unmet dependency warning for SAMSUNG_Q10 net: macb: Apply RXUBR workaround only to versions with errata x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode cifs: fix computation for MAX_SMB2_HDR_SIZE x86/microcode/amd: Don't falsely trick the late loading mechanism arm64: kprobe: Always blacklist the KVM world-switch code apparmor: Fix aa_label_build() error handling for failed merges x86/kexec: Don't setup EFI info if EFI runtime is not enabled proc: fix /proc/net/* after setns(2) x86_64: increase stack size for KASAN_EXTRA mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone lib/test_kmod.c: potential double free in error handling fs/drop_caches.c: avoid softlockups in drop_pagecache_sb() autofs: drop dentry reference only when it is never used autofs: fix error return in autofs_fill_super() mm, memory_hotplug: fix off-by-one in is_pageblock_removable ARM: OMAP: dts: N950/N9: fix onenand timings ARM: dts: omap4-droid4: Fix typo in cpcap IRQ flags ARM: dts: sun8i: h3: Add ethernet0 alias to Beelink X2 arm: dts: meson: Fix IRQ trigger type for macirq ARM: dts: meson8b: odroidc1: mark the SD card detection GPIO active-low ARM: dts: meson8m2: mxiii-plus: mark the SD card detection GPIO active-low ARM: dts: imx6sx: correct backward compatible of gpt arm64: dts: renesas: r8a7796: Enable DMA for SCIF2 arm64: dts: renesas: r8a77965: Enable DMA for SCIF2 soc: fsl: qbman: avoid race in clearing QMan interrupt pinctrl: mcp23s08: spi: Fix regmap allocation for mcp23s18 wlcore: sdio: Fixup power on/off sequence bpftool: Fix prog dump by tag bpftool: fix percpu maps updating bpf: sock recvbuff must be limited by rmem_max in bpf_setsockopt() ARM: pxa: ssp: unneeded to free devm_ allocated data arm64: dts: add msm8996 compatible to gicv3 batman-adv: release station info tidstats DTS: CI20: Fix bugs in ci20's device tree. usb: phy: fix link errors irqchip/gic-v4: Fix occasional VLPI drop irqchip/gic-v3-its: Gracefully fail on LPI exhaustion irqchip/mmp: Only touch the PJ4 IRQ & FIQ bits on enable/disable drm/amdgpu: Add missing power attribute to APU check drm/radeon: check if device is root before getting pci speed caps drm/amdgpu: Transfer fences to dmabuf importer net: stmmac: Fallback to Platform Data clock in Watchdog conversion net: stmmac: Send TSO packets always from Queue 0 net: stmmac: Disable EEE mode earlier in XMIT callback irqchip/gic-v3-its: Fix ITT_entry_size accessor relay: check return of create_buf_file() properly bpf, selftests: fix handling of sparse CPU allocations bpf: fix lockdep false positive in percpu_freelist bpf: fix potential deadlock in bpf_prog_register bpf: Fix syscall's stackmap lookup potential deadlock drm/sun4i: tcon: Prepare and enable TCON channel 0 clock at init dmaengine: at_xdmac: Fix wrongfull report of a channel as in use vsock/virtio: fix kernel panic after device hot-unplug vsock/virtio: reset connected sockets on device removal dmaengine: dmatest: Abort test in case of mapping error selftests: netfilter: fix config fragment CONFIG_NF_TABLES_INET selftests: netfilter: add simple masq/redirect test cases netfilter: nf_nat: skip nat clash resolution for same-origin entries s390/qeth: release cmd buffer in error paths s390/qeth: fix use-after-free in error path s390/qeth: cancel close_dev work before removing a card perf symbols: Filter out hidden symbols from labels perf trace: Support multiple "vfs_getname" probes MIPS: Remove function size check in get_frame_info() Revert "scsi: libfc: Add WARN_ON() when deleting rports" i2c: omap: Use noirq system sleep pm ops to idle device for suspend drm/amdgpu: use spin_lock_irqsave to protect vm_manager.pasid_idr nvme: lock NS list changes while handling command effects nvme-pci: fix rapid add remove sequence fs: ratelimit __find_get_block_slow() failure message. qed: Fix EQ full firmware assert. qed: Consider TX tcs while deriving the max num_queues for PF. qede: Fix system crash on configuring channels. blk-iolatency: fix IO hang due to negative inflight counter nvme-pci: add missing unlock for reset error Input: wacom_serial4 - add support for Wacom ArtPad II tablet Input: elan_i2c - add id for touchpad found in Lenovo s21e-20 iscsi_ibft: Fix missing break in switch statement scsi: aacraid: Fix missing break in switch statement x86/PCI: Fixup RTIT_BAR of Intel Denverton Trace Hub arm64: dts: zcu100-revC: Give wifi some time after power-on arm64: dts: hikey: Give wifi some time after power-on arm64: dts: hikey: Revert "Enable HS200 mode on eMMC" ARM: dts: exynos: Fix pinctrl definition for eMMC RTSN line on Odroid X2/U3 ARM: dts: exynos: Add minimal clkout parameters to Exynos3250 PMU ARM: dts: exynos: Fix max voltage for buck8 regulator on Odroid XU3/XU4 drm: disable uncached DMA optimization for ARM and arm64 netfilter: xt_TEE: fix wrong interface selection netfilter: xt_TEE: add missing code to get interface index in checkentry. gfs2: Fix missed wakeups in find_insert_glock staging: erofs: add error handling for xattr submodule staging: erofs: fix fast symlink w/o xattr when fs xattr is on staging: erofs: fix memleak of inode's shared xattr array staging: erofs: fix race of initializing xattrs of a inode at the same time staging: erofs: keep corrupted fs from crashing kernel in erofs_namei() cifs: allow calling SMB2_xxx_free(NULL) ath9k: Avoid OF no-EEPROM quirks without qca,no-eeprom driver core: Postpone DMA tear-down until after devres release perf/x86/intel: Make cpuc allocations consistent perf/x86/intel: Generalize dynamic constraint creation x86: Add TSX Force Abort CPUID/MSR perf/x86/intel: Implement support for TSX Force Abort Linux 4.19.29 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Michal Hocko
|
2cc84e2ea6 |
mm, memory_hotplug: fix off-by-one in is_pageblock_removable
[ Upstream commit 891cb2a72d821f930a39d5900cb7a3aa752c1d5b ] Rong Chen has reported the following boot crash: PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 239 Comm: udevd Not tainted 5.0.0-rc4-00149-gefad4e4 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 RIP: 0010:page_mapping+0x12/0x80 Code: 5d c3 48 89 df e8 0e ad 02 00 85 c0 75 da 89 e8 5b 5d c3 0f 1f 44 00 00 53 48 89 fb 48 8b 43 08 48 8d 50 ff a8 01 48 0f 45 da <48> 8b 53 08 48 8d 42 ff 83 e2 01 48 0f 44 c3 48 83 38 ff 74 2f 48 RSP: 0018:ffff88801fa87cd8 EFLAGS: 00010202 RAX: ffffffffffffffff RBX: fffffffffffffffe RCX: 000000000000000a RDX: fffffffffffffffe RSI: ffffffff820b9a20 RDI: ffff88801e5c0000 RBP: 6db6db6db6db6db7 R08: ffff88801e8bb000 R09: 0000000001b64d13 R10: ffff88801fa87cf8 R11: 0000000000000001 R12: ffff88801e640000 R13: ffffffff820b9a20 R14: ffff88801f145258 R15: 0000000000000001 FS: 00007fb2079817c0(0000) GS:ffff88801dd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000006 CR3: 000000001fa82000 CR4: 00000000000006a0 Call Trace: __dump_page+0x14/0x2c0 is_mem_section_removable+0x24c/0x2c0 removable_show+0x87/0xa0 dev_attr_show+0x25/0x60 sysfs_kf_seq_show+0xba/0x110 seq_read+0x196/0x3f0 __vfs_read+0x34/0x180 vfs_read+0xa0/0x150 ksys_read+0x44/0xb0 do_syscall_64+0x5e/0x4a0 entry_SYSCALL_64_after_hwframe+0x49/0xbe and bisected it down to commit efad4e475c31 ("mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone"). The reason for the crash is that the mapping is garbage for poisoned (uninitialized) page. This shouldn't happen as all pages in the zone's boundary should be initialized. Later debugging revealed that the actual problem is an off-by-one when evaluating the end_page. 'start_pfn + nr_pages' resp 'zone_end_pfn' refers to a pfn after the range and as such it might belong to a differen memory section. This along with CONFIG_SPARSEMEM then makes the loop condition completely bogus because a pointer arithmetic doesn't work for pages from two different sections in that memory model. Fix the issue by reworking is_pageblock_removable to be pfn based and only use struct page where necessary. This makes the code slightly easier to follow and we will remove the problematic pointer arithmetic completely. Link: http://lkml.kernel.org/r/20190218181544.14616-1-mhocko@kernel.org Fixes: efad4e475c31 ("mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: <rong.a.chen@intel.com> Tested-by: <rong.a.chen@intel.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Mikhail Zaslonko
|
71df1c8bc7 |
mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone
[ Upstream commit 24feb47c5fa5b825efb0151f28906dfdad027e61 ] If memory end is not aligned with the sparse memory section boundary, the mapping of such a section is only partly initialized. This may lead to VM_BUG_ON due to uninitialized struct pages access from test_pages_in_a_zone() function triggered by memory_hotplug sysfs handlers. Here are the the panic examples: CONFIG_DEBUG_VM_PGFLAGS=y kernel parameter mem=2050M -------------------------- page:000003d082008000 is uninitialized and poisoned page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) Call Trace: test_pages_in_a_zone+0xde/0x160 show_valid_zones+0x5c/0x190 dev_attr_show+0x34/0x70 sysfs_kf_seq_show+0xc8/0x148 seq_read+0x204/0x480 __vfs_read+0x32/0x178 vfs_read+0x82/0x138 ksys_read+0x5a/0xb0 system_call+0xdc/0x2d8 Last Breaking-Event-Address: test_pages_in_a_zone+0xde/0x160 Kernel panic - not syncing: Fatal exception: panic_on_oops Fix this by checking whether the pfn to check is within the zone. [mhocko@suse.com: separated this change from http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com] Link: http://lkml.kernel.org/r/20190128144506.15603-3-mhocko@kernel.org [mhocko@suse.com: separated this change from http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com] Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Michal Hocko
|
6027792d6a |
mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
[ Upstream commit efad4e475c312456edb3c789d0996d12ed744c13 ] Patch series "mm, memory_hotplug: fix uninitialized pages fallouts", v2. Mikhail Zaslonko has posted fixes for the two bugs quite some time ago [1]. I have pushed back on those fixes because I believed that it is much better to plug the problem at the initialization time rather than play whack-a-mole all over the hotplug code and find all the places which expect the full memory section to be initialized. We have ended up with commit 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full memory section") merged and cause a regression [2][3]. The reason is that there might be memory layouts when two NUMA nodes share the same memory section so the merged fix is simply incorrect. In order to plug this hole we really have to be zone range aware in those handlers. I have split up the original patch into two. One is unchanged (patch 2) and I took a different approach for `removable' crash. [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948 [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz This patch (of 2): Mikhail has reported the following VM_BUG_ON triggered when reading sysfs removable state of a memory block: page:000003d08300c000 is uninitialized and poisoned page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) Call Trace: is_mem_section_removable+0xb4/0x190 show_mem_removable+0x9a/0xd8 dev_attr_show+0x34/0x70 sysfs_kf_seq_show+0xc8/0x148 seq_read+0x204/0x480 __vfs_read+0x32/0x178 vfs_read+0x82/0x138 ksys_read+0x5a/0xb0 system_call+0xdc/0x2d8 Last Breaking-Event-Address: is_mem_section_removable+0xb4/0x190 Kernel panic - not syncing: Fatal exception: panic_on_oops The reason is that the memory block spans the zone boundary and we are stumbling over an unitialized struct page. Fix this by enforcing zone range in is_mem_section_removable so that we never run away from a zone. Link: http://lkml.kernel.org/r/20190128144506.15603-2-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Debugged-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Greg Kroah-Hartman
|
36d178b3bc |
This is the 4.19.27 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlx+qs4ACgkQONu9yGCS aT4OQQ/8DMkQa61GLVtL1Zm0i2GW+fpgAJj59QP7jRdilolP7BBnrX2Ne26Dx+X0 bYLJDJw6WexsvLKOkSMol4Y3Q3NsBguC+MasVrrWBHpfntizlaPzsYuTKkyNRG0c vjOYyTeAxElWqEh1WaVQhk/juhBbrTtkZIK63h9M60KYb0qHA7TY9oTGokEA8WA0 11Yuge2aV6cR8zup1coWR9NRvW/uealiENAhr0Jw1ZRtrBqwzQoFyn5psXdbzkjb HrwvDa/nkwfJuc/1+grOTF5HfI7D5+giq0iRtvBuChjDwqfJ2IrVRPF/DKSmR0Oz Sq1ILUdj61gd2hp4N9zpUE87r2tV2ABg/yZrtDcKFhCiUWnfaaGArn7mm/R7OhCD PjRDz7acVvKq3LLenM6xcFsAWUdjOoUWcGk70eO/ZZctW8o+HozxFcVrNmT/75zJ LcagplvYLiLB4Gard6niNH8qunmq/jD+GkgdKtF9GFHOjG1QMdQ9d/VxeJWGAjtA nE8ZX9EFYgmq14OR+YK0DNOOFBZe10lcR9rMIFp6T9AanY7rVT7LqqG6BUXb226P mJmNS9sSJhictucyyf/ej9j0jULBBsAYJA5+fkECHIioewACjFGn1dZfm3BBrDAb OoNIwaLmrvjh2iKwsj17OK9Fv8PG6VKwYRFQ/I3OGVouZUX5SEk= =edyH -----END PGP SIGNATURE----- Merge 4.19.27 into android-4.19 Changes in 4.19.27 irq/matrix: Split out the CPU selection code into a helper irq/matrix: Spread managed interrupts on allocation genirq/matrix: Improve target CPU selection for managed interrupts. mac80211: Change default tx_sk_pacing_shift to 7 scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached drm/msm: Unblock writer if reader closes file ASoC: Intel: Haswell/Broadwell: fix setting for .dynamic field ALSA: compress: prevent potential divide by zero bugs ASoC: Variable "val" in function rt274_i2c_probe() could be uninitialized clk: tegra: dfll: Fix a potential Oop in remove() clk: sysfs: fix invalid JSON in clk_dump clk: vc5: Abort clock configuration without upstream clock thermal: int340x_thermal: Fix a NULL vs IS_ERR() check usb: dwc3: gadget: synchronize_irq dwc irq in suspend usb: dwc3: gadget: Fix the uninitialized link_state when udc starts usb: gadget: Potential NULL dereference on allocation error selftests: rtc: rtctest: fix alarm tests selftests: rtc: rtctest: add alarm test on minute boundary genirq: Make sure the initial affinity is not empty x86/mm/mem_encrypt: Fix erroneous sizeof() ASoC: rt5682: Fix PLL source register definitions ASoC: dapm: change snprintf to scnprintf for possible overflow ASoC: imx-audmux: change snprintf to scnprintf for possible overflow selftests/vm/gup_benchmark.c: match gup struct to kernel phy: ath79-usb: Fix the power on error path phy: ath79-usb: Fix the main reset name to match the DT binding selftests: seccomp: use LDLIBS instead of LDFLAGS selftests: gpio-mockup-chardev: Check asprintf() for error irqchip/gic-v3-mbi: Fix uninitialized mbi_lock ARC: fix __ffs return value to avoid build warnings ARC: show_regs: lockdep: avoid page allocator... drivers: thermal: int340x_thermal: Fix sysfs race condition staging: rtl8723bs: Fix build error with Clang when inlining is disabled mac80211: fix miscounting of ttl-dropped frames sched/wait: Fix rcuwait_wake_up() ordering sched/wake_q: Fix wakeup ordering for wake_q futex: Fix (possible) missed wakeup locking/rwsem: Fix (possible) missed wakeup drm/amd/powerplay: OD setting fix on Vega10 tty: serial: qcom_geni_serial: Allow mctrl when flow control is disabled serial: fsl_lpuart: fix maximum acceptable baud rate with over-sampling drm/sun4i: hdmi: Fix usage of TMDS clock staging: android: ion: Support cpu access during dma_buf_detach direct-io: allow direct writes to empty inodes writeback: synchronize sync(2) against cgroup writeback membership switches scsi: lpfc: nvme: avoid hang / use-after-free when destroying localport scsi: lpfc: nvmet: avoid hang / use-after-free when destroying targetport scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state() net: altera_tse: fix connect_local_phy error path hv_netvsc: Fix ethtool change hash key error hv_netvsc: Refactor assignments of struct netvsc_device_info hv_netvsc: Fix hash key value reset after other ops nvme-rdma: fix timeout handler nvme-multipath: drop optimization for static ANA group IDs drm/msm: Fix A6XX support for opp-level net: usb: asix: ax88772_bind return error when hw_reset fail net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP ibmveth: Do not process frames after calling napi_reschedule mac80211: don't initiate TDLS connection if station is not associated to AP mac80211: Add attribute aligned(2) to struct 'action' cfg80211: extend range deviation for DMG svm: Fix AVIC incomplete IPI emulation KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1 kvm: selftests: Fix region overlap check in kvm_util mmc: spi: Fix card detection during probe mmc: tmio_mmc_core: don't claim spurious interrupts mmc: tmio: fix access width of Block Count Register mmc: core: Fix NULL ptr crash from mmc_should_fail_request mmc: cqhci: fix space allocated for transfer descriptor mmc: cqhci: Fix a tiny potential memory leak on error condition mmc: sdhci-esdhc-imx: correct the fix of ERR004536 mm: enforce min addr even if capable() in expand_downwards() drm: Block fb changes for async plane updates hugetlbfs: fix races and page leaks during migration MIPS: fix truncation in __cmpxchg_small for short values MIPS: BCM63XX: provide DMA masks for ethernet devices MIPS: eBPF: Fix icache flush end address x86/uaccess: Don't leak the AC flag into __put_user() value evaluation Linux 4.19.27 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Mike Kravetz
|
527cabfffb |
hugetlbfs: fix races and page leaks during migration
commit cb6acd01e2e43fd8bad11155752b7699c3d0fb76 upstream.
hugetlb pages should only be migrated if they are 'active'. The
routines set/clear_page_huge_active() modify the active state of hugetlb
pages.
When a new hugetlb page is allocated at fault time, set_page_huge_active
is called before the page is locked. Therefore, another thread could
race and migrate the page while it is being added to page table by the
fault code. This race is somewhat hard to trigger, but can be seen by
strategically adding udelay to simulate worst case scheduling behavior.
Depending on 'how' the code races, various BUG()s could be triggered.
To address this issue, simply delay the set_page_huge_active call until
after the page is successfully added to the page table.
Hugetlb pages can also be leaked at migration time if the pages are
associated with a file in an explicitly mounted hugetlbfs filesystem.
For example, consider a two node system with 4GB worth of huge pages
available. A program mmaps a 2G file in a hugetlbfs filesystem. It
then migrates the pages associated with the file from one node to
another. When the program exits, huge page counts are as follows:
node0
1024 free_hugepages
1024 nr_hugepages
node1
0 free_hugepages
1024 nr_hugepages
Filesystem Size Used Avail Use% Mounted on
nodev 4.0G 2.0G 2.0G 50% /var/opt/hugepool
That is as expected. 2G of huge pages are taken from the free_hugepages
counts, and 2G is the size of the file in the explicitly mounted
filesystem. If the file is then removed, the counts become:
node0
1024 free_hugepages
1024 nr_hugepages
node1
1024 free_hugepages
1024 nr_hugepages
Filesystem Size Used Avail Use% Mounted on
nodev 4.0G 2.0G 2.0G 50% /var/opt/hugepool
Note that the filesystem still shows 2G of pages used, while there
actually are no huge pages in use. The only way to 'fix' the filesystem
accounting is to unmount the filesystem
If a hugetlb page is associated with an explicitly mounted filesystem,
this information in contained in the page_private field. At migration
time, this information is not preserved. To fix, simply transfer
page_private from old to new page at migration time if necessary.
There is a related race with removing a huge page from a file and
migration. When a huge page is removed from the pagecache, the
page_mapping() field is cleared, yet page_private remains set until the
page is actually freed by free_huge_page(). A page could be migrated
while in this state. However, since page_mapping() is not set the
hugetlbfs specific routine to transfer page_private is not called and we
leak the page count in the filesystem.
To fix that, check for this condition before migrating a huge page. If
the condition is detected, return EBUSY for the page.
Link: http://lkml.kernel.org/r/74510272-7319-7372-9ea6-ec914734c179@oracle.com
Link: http://lkml.kernel.org/r/20190212221400.3512-1-mike.kravetz@oracle.com
Fixes:
|
||
Jann Horn
|
de04d2973a |
mm: enforce min addr even if capable() in expand_downwards()
commit 0a1d52994d440e21def1c2174932410b4f2a98a1 upstream.
security_mmap_addr() does a capability check with current_cred(), but
we can reach this code from contexts like a VFS write handler where
current_cred() must not be used.
This can be abused on systems without SMAP to make NULL pointer
dereferences exploitable again.
Fixes:
|
||
Tejun Heo
|
edca54b897 |
writeback: synchronize sync(2) against cgroup writeback membership switches
[ Upstream commit 7fc5854f8c6efae9e7624970ab49a1eac2faefb1 ] sync_inodes_sb() can race against cgwb (cgroup writeback) membership switches and fail to writeback some inodes. For example, if an inode switches to another wb while sync_inodes_sb() is in progress, the new wb might not be visible to bdi_split_work_to_wbs() at all or the inode might jump from a wb which hasn't issued writebacks yet to one which already has. This patch adds backing_dev_info->wb_switch_rwsem to synchronize cgwb switch path against sync_inodes_sb() so that sync_inodes_sb() is guaranteed to see all the target wbs and inodes can't jump wbs to escape syncing. v2: Fixed misplaced rwsem init. Spotted by Jiufei. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Jiufei Xue <xuejiufei@gmail.com> Link: http://lkml.kernel.org/r/dc694ae2-f07f-61e1-7097-7c8411cee12d@gmail.com Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Greg Kroah-Hartman
|
c97d2b535c |
This is the 4.19.26 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlx2U68ACgkQONu9yGCS aT6rHA/+PGrMAAzSp193bme2rap7P/EzDD0JF0aQOLwizdit5A4zIYSZApVVSaw5 U8KIQg+fcwIjTvzBmkHddDVVJBEv/VdE8Xkf0EOIOidIQNqMoRQN7ctU1bnKfUyC 20/xZBlc78BEU4TB4JpNiKkzD3bsl/Gsb2A3VenhHe0H8dbj4oJYyNWWY99/433y u1SNmTogLJ0p64+1XWMCTqPJA2xkUIOciFSLiO7d51MfpbguMwcW+T9XSwYat3tp uTdfvZYKc4iGPa4pZCZmZY7+Lrxne2OjMhp3bUlp0coKrjlNO848X+kG7gsVZ7EN MA7YOXUj/Ngjl+E9EilNY5gybY2hK0IzaaV8W9TRVYvPBjgSwe5Mj9/EXYO0L8q0 Arla5HsJdRhLX7HDJ41w8BIngko87hetAaTZVCct+F+nnMYGrrMsle+BgYYbnd7r NYWB8HwZZ3krkWmAZb+Phpleo3Oj7beum5MLoydSH4pXhTVft5D/V8K2fwLmZtgP FsIC1J09jDvXJ5qHTFBY4ugTq8l21YZ2oMPdwcB6X8hBIKji4XydQ0HPrdMyAtXB tqUkzyLZlP1uzm9T3E+hyATWeM4ue7qLFacp7cX5LokDWIxlWi7tcIqBHqRZez+y S7un8/Lmz8ZvzC674JTZl1iXSHuuK6TFLrwI3mw/CVhg2qRvZtk= =EDHP -----END PGP SIGNATURE----- Merge 4.19.26 into android-4.19 Changes in 4.19.26 ARM: 8834/1: Fix: kprobes: optimized kprobes illegal instruction tracing: Fix number of entries in trace header MIPS: eBPF: Always return sign extended 32b values gpio: MT7621: use a per instance irq_chip structure gpio: pxa: avoid attempting to set pin direction via pinctrl on MMP2 mac80211: Restore vif beacon interval if start ap fails mac80211: Use linked list instead of rhashtable walk for mesh tables mac80211: Free mpath object when rhashtable insertion fails libceph: handle an empty authorize reply ceph: avoid repeatedly adding inode to mdsc->snap_flush_list numa: change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES proc, oom: do not report alien mms when setting oom_score_adj ALSA: hda/realtek - Headset microphone and internal speaker support for System76 oryp5 ALSA: hda/realtek: Disable PC beep in passthrough on alc285 KEYS: allow reaching the keys quotas exactly backlight: pwm_bl: Fix devicetree parsing with auto-generated brightness tables mfd: ti_am335x_tscadc: Use PLATFORM_DEVID_AUTO while registering mfd cells pvcalls-front: read all data before closing the connection pvcalls-front: don't try to free unallocated rings pvcalls-front: properly allocate sk pvcalls-back: set -ENOTCONN in pvcalls_conn_back_read mfd: twl-core: Fix section annotations on {,un}protect_pm_master mfd: db8500-prcmu: Fix some section annotations mfd: mt6397: Do not call irq_domain_remove if PMIC unsupported mfd: ab8500-core: Return zero in get_register_interruptible() mfd: bd9571mwv: Add volatile register to make DVFS work mfd: qcom_rpm: write fw_version to CTRL_REG mfd: wm5110: Add missing ASRC rate register mfd: axp20x: Add AC power supply cell for AXP813 mfd: axp20x: Re-align MFD cell entries mfd: axp20x: Add supported cells for AXP803 mfd: cros_ec_dev: Add missing mfd_remove_devices() call in remove mfd: tps65218: Use devm_regmap_add_irq_chip and clean up error path in probe() mfd: mc13xxx: Fix a missing check of a register-read failure xen/pvcalls: remove set but not used variable 'intf' qed: Fix qed_chain_set_prod() for PBL chains with non power of 2 page count qed: Fix qed_ll2_post_rx_buffer_notify_fw() by adding a write memory barrier net: hns: Fix use after free identified by SLUB debug bpf: Fix [::] -> [::1] rewrite in sys_sendmsg selftests/bpf: Test [::] -> [::1] rewrite in sys_sendmsg in test_sock_addr watchdog: mt7621_wdt/rt2880_wdt: Fix compilation problem net/mlx4: Get rid of page operation after dma_alloc_coherent MIPS: ath79: Enable OF serial ports in the default config xprtrdma: Double free in rpcrdma_sendctxs_create() mlxsw: spectrum_acl: Add cleanup after C-TCAM update error condition selftests: forwarding: Add a test for VLAN deletion netfilter: nf_tables: fix leaking object reference count scsi: qla4xxx: check return code of qla4xxx_copy_from_fwddb_param scsi: isci: initialize shost fully before calling scsi_add_host() include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR MIPS: jazz: fix 64bit build netfilter: nft_flow_offload: Fix reverse route lookup bpf: correctly set initial window on active Fast Open sender pvcalls-front: Avoid get_free_pages(GFP_KERNEL) under spinlock bpf: fix panic in stack_map_get_build_id() on i386 and arm32 netfilter: nft_flow_offload: fix interaction with vrf slave device RDMA/mthca: Clear QP objects during their allocation powerpc/8xx: fix setting of pagetable for Abatron BDI debug tool. acpi/nfit: Fix race accessing memdev in nfit_get_smbios_id() net: stmmac: Fix PCI module removal leak net: stmmac: dwxgmac2: Only clear interrupts that are active net: stmmac: Check if CBS is supported before configuring net: stmmac: Fix the logic of checking if RX Watchdog must be enabled net: stmmac: Prevent RX starvation in stmmac_napi_poll() isdn: i4l: isdn_tty: Fix some concurrency double-free bugs scsi: tcmu: avoid cmd/qfull timers updated whenever a new cmd comes scsi: ufs: Fix system suspend status scsi: qedi: Add ep_state for login completion on un-reachable targets scsi: ufs: Fix geometry descriptor size scsi: cxgb4i: add wait_for_completion() netfilter: nft_flow_offload: fix checking method of conntrack helper always clear the X2APIC_ENABLE bit for PV guest drm/meson: add missing of_node_put drm/amdkfd: Don't assign dGPUs to APU topology devices drm/amd/display: fix PME notification not working in RV desktop vhost: return EINVAL if iovecs size does not match the message size drm/sun4i: backend: add missing of_node_puts pvcalls-front: fix potential null dereference selftests: tc-testing: drop test on missing tunnel key id selftests: tc-testing: fix tunnel_key failure if dst_port is unspecified selftests: tc-testing: fix parsing of ife type afs: Don't set vnode->cb_s_break in afs_validate() afs: Fix key refcounting in file locking code bpf: don't assume build-id length is always 20 bytes bpf: zero out build_id for BPF_STACK_BUILD_ID_IP selftests/bpf: retry tests that expect build-id atm: he: fix sign-extension overflow on large shift hwmon: (tmp421) Correct the misspelling of the tmp442 compatible attribute in OF device ID table leds: lp5523: fix a missing check of return value of lp55xx_read bpf: bpf_setsockopt: reset sock dst on SO_MARK changes dpaa_eth: NETIF_F_LLTX requires to do our own update of trans_start mlxsw: pci: Return error on PCI reset timeout net: bridge: Mark FDB entries that were added by user as such mlxsw: spectrum_switchdev: Do not treat static FDB entries as sticky selftests: forwarding: Add a test case for externally learned FDB entries net/mlx5e: Fix wrong (zero) TX drop counter indication for representor isdn: avm: Fix string plus integer warning from Clang batman-adv: fix uninit-value in batadv_interface_tx() inet_diag: fix reporting cgroup classid and fallback to priority ipv6: propagate genlmsg_reply return code net: ena: fix race between link up and device initalization net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames net/mlx5e: Don't overwrite pedit action when multiple pedit used net/packet: fix 4gb buffer limit due to overflow check net: sfp: do not probe SFP module before we're attached sctp: call gso_reset_checksum when computing checksum in sctp_gso_segment sctp: set stream ext to NULL after freeing it in sctp_stream_outq_migrate team: avoid complex list operations in team_nl_cmd_options_set() Revert "socket: fix struct ifreq size in compat ioctl" Revert "kill dev_ifsioc()" net: socket: fix SIOCGIFNAME in compat net: socket: make bond ioctls go through compat_ifreq_ioctl() geneve: should not call rt6_lookup() when ipv6 was disabled sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach() net_sched: fix a race condition in tcindex_destroy() net_sched: fix a memory leak in cls_tcindex net_sched: fix two more memory leaks in cls_tcindex net/mlx5e: XDP, fix redirect resources availability check RDMA/srp: Rework SCSI device reset handling KEYS: user: Align the payload buffer KEYS: always initialize keyring_index_key::desc_len parisc: Fix ptrace syscall number modification ARCv2: Enable unaligned access in early ASM code ARC: U-boot: check arguments paranoidly ARC: define ARCH_SLAB_MINALIGN = 8 drm/amdgpu: Set DPM_FLAG_NEVER_SKIP when enabling PM-runtime gpu: drm: radeon: Set DPM_FLAG_NEVER_SKIP when enabling PM-runtime drm/i915/fbdev: Actually configure untiled displays drm/amd/display: Fix MST reboot/poweroff sequence mac80211: allocate tailroom for forwarded mesh packets kvm: x86: Return LA57 feature based on hardware capability net: validate untrusted gso packets without csum offload net: avoid false positives in untrusted gso validation staging: erofs: fix a bug when appling cache strategy staging: erofs: complete error handing of z_erofs_do_read_page staging: erofs: replace BUG_ON with DBG_BUGON in data.c staging: erofs: drop multiref support temporarily staging: erofs: remove the redundant d_rehash() for the root dentry staging: erofs: atomic_cond_read_relaxed on ref-locked workgroup staging: erofs: fix `erofs_workgroup_{try_to_freeze, unfreeze}' staging: erofs: add a full barrier in erofs_workgroup_unfreeze staging: erofs: {dir,inode,super}.c: rectify BUG_ONs staging: erofs: unzip_{pagevec.h,vle.c}: rectify BUG_ONs staging: erofs: unzip_vle_lz4.c,utils.c: rectify BUG_ONs Revert "bridge: do not add port to router list when receives query with source 0.0.0.0" netfilter: nf_tables: fix flush after rule deletion in the same batch netfilter: nft_compat: use-after-free when deleting targets netfilter: ipv6: Don't preserve original oif for loopback address netfilter: nfnetlink_osf: add missing fmatch check netfilter: ipt_CLUSTERIP: fix sleep-in-atomic bug in clusterip_config_entry_put() udlfb: handle unplug properly pinctrl: max77620: Use define directive for max77620_pinconf_param values net: phylink: avoid resolving link state too early Linux 4.19.26 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Ralph Campbell
|
c7ddb2689d |
numa: change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES
commit 050c17f239fd53adb55aa768d4f41bc76c0fe045 upstream. The system call, get_mempolicy() [1], passes an unsigned long *nodemask pointer and an unsigned long maxnode argument which specifies the length of the user's nodemask array in bits (which is rounded up). The manual page says that if the maxnode value is too small, get_mempolicy will return EINVAL but there is no system call to return this minimum value. To determine this value, some programs search /proc/<pid>/status for a line starting with "Mems_allowed:" and use the number of digits in the mask to determine the minimum value. A recent change to the way this line is formatted [2] causes these programs to compute a value less than MAX_NUMNODES so get_mempolicy() returns EINVAL. Change get_mempolicy(), the older compat version of get_mempolicy(), and the copy_nodes_to_user() function to use nr_node_ids instead of MAX_NUMNODES, thus preserving the defacto method of computing the minimum size for the nodemask array and the maxnode argument. [1] http://man7.org/linux/man-pages/man2/get_mempolicy.2.html [2] https://lore.kernel.org/lkml/1545405631-6808-1-git-send-email-longman@redhat.com Link: http://lkml.kernel.org/r/20190211180245.22295-1-rcampbell@nvidia.com Fixes: 4fb8e5b89bcbbbb ("include/linux/nodemask.h: use nr_node_ids (not MAX_NUMNODES) in __nodemask_pr_numnodes()") Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Cc: Waiman Long <longman@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Greg Kroah-Hartman
|
cca7d2df6d |
This is the 4.19.24 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlxtHSUACgkQONu9yGCS aT6WIw//Xdd2P7aIWSDnmS6L+RlEtdz2hnO07O2lMVIhsH1MgetAk0nJbP8b9sYp zJY4kW6gEue1/J7vHIiTnDpnhaylxESOxOb49O/E95bnPIdhE9EqR1aiMmo8R5rU 8F2lH6kZncAawZ/7WGnQdHqukO9GtwFO0hwtsDT5RcwiIPe/fHkIpp1QfN21ntib nV7xQOfDFEMVcbyzAFa+59A4UUQSWKTkld9tcJFG62pjcN9DlpPej86M2qy9mG8U 0d7hEkYGqHthh092TAxgobsJDq5j0jUUVjK80/Fq0TKe/OSvsgQ8MRBryVsgBf8J hJBCLHay4Ps3ONtNI3IXDVZu9dgdL8UynBEfbWdgJHlGgKJEvMGxFCckzI8OP3yr HJtFm6gG24kySFt+QnvwHg1/zwAho5FDBMfx02RkukJwDiN16DZSbKRimrTVAioV bBTmL+JyoCBLYGfUzq4gM1+3L5TRyMBqjgcjLvXWuXF18fdpVdb0exKlNtLHMUf6 aqzhroF90sn5vUybpw02ouduxmRPWDaDel7ArA3zb16Dp+ODCxYtpkV7dmRBjPVK YxwweO604Rd2musacQkVAB5JkshjPdV7WA/FmhJf7Fn1/CaaXliAVrfMNbEGW15a NFQvTMrJva2pihXYJY4p/UfhnsBmxXmY30wYlLX2VxRFb+qoy3Y= =MO0g -----END PGP SIGNATURE----- Merge 4.19.24 into android-4.19 Changes in 4.19.24 dt-bindings: eeprom: at24: add "atmel,24c2048" compatible string eeprom: at24: add support for 24c2048 blk-mq: fix a hung issue when fsync ARM: 8789/1: signal: copy registers using __copy_to_user() ARM: 8790/1: signal: always use __copy_to_user to save iwmmxt context ARM: 8791/1: vfp: use __copy_to_user() when saving VFP state ARM: 8792/1: oabi-compat: copy oabi events using __copy_to_user() ARM: 8793/1: signal: replace __put_user_error with __put_user ARM: 8794/1: uaccess: Prevent speculative use of the current addr_limit ARM: 8795/1: spectre-v1.1: use put_user() for __put_user() ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitization ARM: 8797/1: spectre-v1.1: harden __copy_to_user ARM: 8810/1: vfp: Fix wrong assignement to ufp_exc ARM: make lookup_processor_type() non-__init ARM: split out processor lookup ARM: clean up per-processor check_bugs method call ARM: add PROC_VTABLE and PROC_TABLE macros ARM: spectre-v2: per-CPU vtables to work around big.Little systems ARM: ensure that processor vtables is not lost after boot ARM: fix the cockup in the previous patch drm/amdgpu/sriov:Correct pfvf exchange logic ACPI: NUMA: Use correct type for printing addresses on i386-PAE perf report: Fix wrong iteration count in --branch-history perf test shell: Use a fallback to get the pathname in vfs_getname tools uapi: fix RISC-V 64-bit support riscv: fix trace_sys_exit hook cpufreq: check if policy is inactive early in __cpufreq_get() drm/bridge: tc358767: add bus flags drm/bridge: tc358767: add defines for DP1_SRCCTRL & PHY_2LANE drm/bridge: tc358767: fix single lane configuration drm/bridge: tc358767: fix initial DP0/1_SRCCTRL value drm/bridge: tc358767: reject modes which require too much BW drm/bridge: tc358767: fix output H/V syncs nvme-pci: use the same attributes when freeing host_mem_desc_bufs. nvme-pci: fix out of bounds access in nvme_cqe_pending nvme-multipath: zero out ANA log buffer nvme: pad fake subsys NQN vid and ssvid with zeros drm/amdgpu: set WRITE_BURST_LENGTH to 64B to workaround SDMA1 hang ARM: dts: da850-evm: Correct the audio codec regulators ARM: dts: da850-evm: Correct the sound card name ARM: dts: da850-lcdk: Correct the audio codec regulators ARM: dts: da850-lcdk: Correct the sound card name ARM: dts: kirkwood: Fix polarity of GPIO fan lines gpio: pl061: handle failed allocations drm/nouveau: Don't disable polling in fallback mode drm/nouveau/falcon: avoid touching registers if engine is off cifs: Limit memory used by lock request calls to a page kvm: sev: Fail KVM_SEV_INIT if already initialized CIFS: Do not assume one credit for async responses gpio: mxc: move gpio noirq suspend/resume to syscore phase Revert "Input: elan_i2c - add ACPI ID for touchpad in ASUS Aspire F5-573G" Input: elan_i2c - add ACPI ID for touchpad in Lenovo V330-15ISK ARM: OMAP5+: Fix inverted nirq pin interrupts with irq_set_type perf/core: Fix impossible ring-buffer sizes warning perf/x86: Add check_period PMU callback ALSA: hda - Add quirk for HP EliteBook 840 G5 ALSA: usb-audio: Fix implicit fb endpoint setup by quirk ASoC: hdmi-codec: fix oops on re-probe tools uapi: fix Alpha support riscv: Add pte bit to distinguish swap from invalid x86/kvm/nVMX: read from MSR_IA32_VMX_PROCBASED_CTLS2 only when it is available kvm: vmx: Fix entry number check for add_atomic_switch_msr() mmc: sunxi: Filter out unsupported modes declared in the device tree mmc: block: handle complete_work on separate workqueue Input: bma150 - register input device after setting private data Input: elantech - enable 3rd button support on Fujitsu CELSIUS H780 Revert "nfsd4: return default lease period" Revert "mm: don't reclaim inodes with many attached pages" Revert "mm: slowly shrink slabs with a relatively small number of objects" alpha: fix page fault handling for r16-r18 targets alpha: Fix Eiger NR_IRQS to 128 s390/zcrypt: fix specification exception on z196 during ap probe tracing/uprobes: Fix output for multiple string arguments x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls scsi: sd: fix entropy gathering for most rotational disks signal: Restore the stop PTRACE_EVENT_EXIT md/raid1: don't clear bitmap bits on interrupted recovery. x86/a.out: Clear the dump structure initially dm crypt: don't overallocate the integrity tag space dm thin: fix bug where bio that overwrites thin block ignores FUA drm: Use array_size() when creating lease drm/vkms: Fix license inconsistent drm/i915: Block fbdev HPD processing during suspend drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC set mm: proc: smaps_rollup: fix pss_locked calculation Linux 4.19.24 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Dave Chinner
|
657fbf79a8 |
Revert "mm: slowly shrink slabs with a relatively small number of objects"
commit a9a238e83fbb0df31c3b9b67003f8f9d1d1b6c96 upstream. This reverts commit |
||
Greg Kroah-Hartman
|
6e0411bdc2 |
This is the 4.19.21 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlxjFL8ACgkQONu9yGCS aT643xAAk+mnsrOfU6/LBFjcJUUUYohK01UAU+PRvTjy4uH8rrA4G01xvp11ftIu jjEikA2582ScR5a2Ww+E4SfqOdKT4z9hOGLTnyI0P4xN9jeVidvu9+C90AYyBYhi orHm1osVQIj6n9+OQ5db+DzZYbZLbyfCoqNXbq9EoLvNRS3FUUH0y2VXqcz9Ghcj obpdHTVMKRaFkRWdCglo+3hSpoKrncSVpKrwUXR18GCt8jjZjj39kI9t6UoGNfc7 nN3GOd26U1tpGo6ShZJYu6aPjV+zoYNlsg1o2zn9qJANIdulYe30vqNhWCeJ+/T7 WcT3EHv4pPEO3Lvgfp+l10Nc6IbYdJEFUpAP3CvfP+MvRfKvz8Vo3Nm/BQlr20+q +MUYJb+wxhlHPRLV192XbnYFkEzZg7vzymoMPL034XheAkOkPbOK0IIVo41p5Rai LxmOdvhzfAktbtD/VWnLTUbexjs2EJ05bvZRjdPKKIMBNKnAWz4ux3KcHxpdsUF8 KMCKwpJE8KDM4uiaKVdyfMhFeIg37pmy+7Uv9cUFjWwtsL3K+CiAXg8uaSvnajKr bOhwbFIgxoOI9VRBK8M0wvzouphA0miVbxOY81sdfbMeWNpbCLdb968AFz25llta rVlWp7bSAUQTsvTkVBv6mrTKPwzO4jnfHYN8yj5gO7pr5fmC2ig= =L+Mp -----END PGP SIGNATURE----- Merge 4.19.21 into android-4.19 Changes in 4.19.21 devres: Align data[] to ARCH_KMALLOC_MINALIGN drm/bufs: Fix Spectre v1 vulnerability staging: iio: adc: ad7280a: handle error from __ad7280_read32() drm/vgem: Fix vgem_init to get drm device available. pinctrl: bcm2835: Use raw spinlock for RT compatibility ASoC: Intel: mrfld: fix uninitialized variable access gpiolib: Fix possible use after free on label drm/sun4i: Initialize registers in tcon-top driver genirq/affinity: Spread IRQs to all available NUMA nodes gpu: ipu-v3: image-convert: Prevent race between run and unprepare nds32: Fix gcc 8.0 compiler option incompatible. wil6210: fix reset flow for Talyn-mb wil6210: fix memory leak in wil_find_tx_bcast_2 ath10k: assign 'n_cipher_suites' for WCN3990 ath9k: dynack: use authentication messages for 'late' ack scsi: lpfc: Correct LCB RJT handling scsi: mpt3sas: Call sas_remove_host before removing the target devices scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event ARM: 8808/1: kexec:offline panic_smp_self_stop CPU clk: boston: fix possible memory leak in clk_boston_setup() dlm: Don't swamp the CPU with callbacks queued during recovery x86/PCI: Fix Broadcom CNB20LE unintended sign extension (redux) powerpc/pseries: add of_node_put() in dlpar_detach_node() crypto: aes_ti - disable interrupts while accessing S-box drm/vc4: ->x_scaling[1] should never be set to VC4_SCALING_NONE serial: fsl_lpuart: clear parity enable bit when disable parity ptp: check gettime64 return code in PTP_SYS_OFFSET ioctl MIPS: Boston: Disable EG20T prefetch dpaa2-ptp: defer probe when portal allocation failed iwlwifi: fw: do not set sgi bits for HE connection staging:iio:ad2s90: Make probe handle spi_setup failure fpga: altera-cvp: Fix registration for CvP incapable devices Tools: hv: kvp: Fix a warning of buffer overflow with gcc 8.0.1 fpga: altera-cvp: fix 'bad IO access' on x86_64 vbox: fix link error with 'gcc -Og' platform/chrome: don't report EC_MKBP_EVENT_SENSOR_FIFO as wakeup i40e: prevent overlapping tx_timeout recover scsi: hisi_sas: change the time of SAS SSP connection staging: iio: ad7780: update voltage on read usbnet: smsc95xx: fix rx packet alignment drm/rockchip: fix for mailbox read size ARM: OMAP2+: hwmod: Fix some section annotations drm/amd/display: fix gamma not being applied correctly drm/amd/display: calculate stream->phy_pix_clk before clock mapping bpf: libbpf: retry map creation without the name net/mlx5: EQ, Use the right place to store/read IRQ affinity hint modpost: validate symbol names also in find_elf_symbol perf tools: Add Hygon Dhyana support soc/tegra: Don't leak device tree node reference media: rc: ensure close() is called on rc_unregister_device media: video-i2c: avoid accessing released memory area when removing driver media: mtk-vcodec: Release device nodes in mtk_vcodec_init_enc_pm() staging: erofs: fix the definition of DBG_BUGON clk: meson: meson8b: do not use cpu_div3 for cpu_scale_out_sel clk: meson: meson8b: fix the width of the cpu_scale_div clock clk: meson: meson8b: mark the CPU clock as CLK_IS_CRITICAL ptp: Fix pass zero to ERR_PTR() in ptp_clock_register dmaengine: xilinx_dma: Remove __aligned attribute on zynqmp_dma_desc_ll powerpc/32: Add .data..Lubsan_data*/.data..Lubsan_type* sections explicitly iio: adc: meson-saradc: check for devm_kasprintf failure iio: adc: meson-saradc: fix internal clock names iio: accel: kxcjk1013: Add KIOX010A ACPI Hardware-ID media: adv*/tc358743/ths8200: fill in min width/height/pixelclock ACPI: SPCR: Consider baud rate 0 as preconfigured state staging: pi433: fix potential null dereference f2fs: move dir data flush to write checkpoint process f2fs: fix race between write_checkpoint and write_begin f2fs: fix wrong return value of f2fs_acl_create i2c: sh_mobile: add support for r8a77990 (R-Car E3) arm64: io: Ensure calls to delay routines are ordered against prior readX() net: aquantia: return 'err' if set MPI_DEINIT state fails sunvdc: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN soc: bcm: brcmstb: Don't leak device tree node reference nfsd4: fix crash on writing v4_end_grace before nfsd startup drm: Clear state->acquire_ctx before leaving drm_atomic_helper_commit_duplicated_state() perf: arm_spe: handle devm_kasprintf() failure arm64: io: Ensure value passed to __iormb() is held in a 64-bit register Thermal: do not clear passive state during system sleep thermal: Fix locking in cooling device sysfs update cur_state firmware/efi: Add NULL pointer checks in efivars API functions s390/zcrypt: improve special ap message cmd handling mt76x0: dfs: fix IBI_R11 configuration on non-radar channels arm64: ftrace: don't adjust the LR value drm/v3d: Fix prime imports of buffers from other drivers. ARM: dts: mmp2: fix TWSI2 ARM: dts: aspeed: add missing memory unit-address x86/fpu: Add might_fault() to user_insn() media: i2c: TDA1997x: select CONFIG_HDMI media: DaVinci-VPBE: fix error handling in vpbe_initialize() smack: fix access permissions for keyring xtensa: xtfpga.dtsi: fix dtc warnings about SPI usb: dwc3: Correct the logic for checking TRB full in __dwc3_prepare_one_trb() usb: dwc2: Disable power down feature on Samsung SoCs usb: hub: delay hub autosuspend if USB3 port is still link training timekeeping: Use proper seqcount initializer usb: mtu3: fix the issue about SetFeature(U1/U2_Enable) clk: sunxi-ng: a33: Set CLK_SET_RATE_PARENT for all audio module clocks media: imx274: select REGMAP_I2C drm/amdgpu/powerplay: fix clock stretcher limits on polaris (v2) tipc: fix node keep alive interval calculation driver core: Move async_synchronize_full call kobject: return error code if writing /sys/.../uevent fails IB/hfi1: Unreserve a reserved request when it is completed usb: dwc3: trace: add missing break statement to make compiler happy gpio: mt7621: report failure of devm_kasprintf() gpio: mt7621: pass mediatek_gpio_bank_probe() failure up the stack pinctrl: sx150x: handle failure case of devm_kstrdup iommu/amd: Fix amd_iommu=force_isolation ARM: dts: Fix OMAP4430 SDP Ethernet startup mips: bpf: fix encoding bug for mm_srlv32_op media: coda: fix H.264 deblocking filter controls ARM: dts: Fix up the D-Link DIR-685 MTD partition info watchdog: renesas_wdt: don't set divider while watchdog is running ARM: dts: imx51-zii-rdu1: Do not specify "power-gpio" for hpa1 usb: dwc3: gadget: Disable CSP for stream OUT ep iommu/arm-smmu-v3: Avoid memory corruption from Hisilicon MSI payloads iommu/arm-smmu: Add support for qcom,smmu-v2 variant iommu/arm-smmu-v3: Use explicit mb() when moving cons pointer sata_rcar: fix deferred probing clk: imx6sl: ensure MMDC CH0 handshake is bypassed platform/x86: mlx-platform: Fix tachometer registers cpuidle: big.LITTLE: fix refcount leak OPP: Use opp_table->regulators to verify no regulator case tee: optee: avoid possible double list_del() drm/msm/dsi: fix dsi clock names in DSI 10nm PLL driver drm/msm: dpu: Only check flush register against pending flushes lightnvm: pblk: fix resubmission of overwritten write err lbas lightnvm: pblk: add lock protection to list operations i2c-axxia: check for error conditions first phy: sun4i-usb: add support for missing USB PHY index mlxsw: spectrum_acl: Limit priority value udf: Fix BUG on corrupted inode switchtec: Fix SWITCHTEC_IOCTL_EVENT_IDX_ALL flags overwrite selftests/bpf: use __bpf_constant_htons in test_prog.c ARM: pxa: avoid section mismatch warning ASoC: fsl: Fix SND_SOC_EUKREA_TLV320 build error on i.MX8M KVM: PPC: Book3S: Only report KVM_CAP_SPAPR_TCE_VFIO on powernv machines mmc: bcm2835: Recover from MMC_SEND_EXT_CSD mmc: bcm2835: reset host on timeout mmc: meson-mx-sdio: check devm_kasprintf for failure memstick: Prevent memstick host from getting runtime suspended during card detection mmc: sdhci-of-esdhc: Fix timeout checks mmc: sdhci-omap: Fix timeout checks mmc: sdhci-xenon: Fix timeout checks mmc: jz4740: Get CD/WP GPIOs from descriptors usb: renesas_usbhs: add support for RZ/G2E btrfs: harden agaist duplicate fsid on scanned devices serial: sh-sci: Fix locking in sci_submit_rx() serial: sh-sci: Resume PIO in sci_rx_interrupt() on DMA failure tty: serial: samsung: Properly set flags in autoCTS mode perf test: Fix perf_event_attr test failure perf dso: Fix unchecked usage of strncpy() perf header: Fix unchecked usage of strncpy() btrfs: use tagged writepage to mitigate livelock of snapshot perf probe: Fix unchecked usage of strncpy() i2c: sh_mobile: Add support for r8a774c0 (RZ/G2E) bnxt_en: Disable MSIX before re-reserving NQs/CMPL rings. tools/power/x86/intel_pstate_tracer: Fix non root execution for post processing a trace file livepatch: check kzalloc return values arm64: KVM: Skip MMIO insn after emulation usb: musb: dsps: fix otg state machine usb: musb: dsps: fix runtime pm for peripheral mode perf header: Fix up argument to ctime() perf tools: Cast off_t to s64 to avoid warning on bionic libc percpu: convert spin_lock_irq to spin_lock_irqsave. net: hns3: fix incomplete uninitialization of IRQ in the hns3_nic_uninit_vector_data() drm/amd/display: Add retry to read ddc_clock pin Bluetooth: hci_bcm: Handle deferred probing for the clock supply drm/amd/display: fix YCbCr420 blank color powerpc/uaccess: fix warning/error with access_ok() mac80211: fix radiotap vendor presence bitmap handling xfrm6_tunnel: Fix spi check in __xfrm6_tunnel_alloc_spi mlxsw: spectrum: Properly cleanup LAG uppers when removing port from LAG scsi: smartpqi: correct host serial num for ssa scsi: smartpqi: correct volume status scsi: smartpqi: increase fw status register read timeout cw1200: Fix concurrency use-after-free bugs in cw1200_hw_scan() net: hns3: add max vector number check for pf powerpc/perf: Fix thresholding counter data for unknown type iwlwifi: mvm: fix setting HE ppe FW config powerpc/powernv/ioda: Allocate indirect TCE levels of cached userspace addresses on demand mlx5: update timecounter at least twice per counter overflow drbd: narrow rcu_read_lock in drbd_sync_handshake drbd: disconnect, if the wrong UUIDs are attached on a connected peer drbd: skip spurious timeout (ping-timeo) when failing promote drbd: Avoid Clang warning about pointless switch statment drm/amd/display: validate extended dongle caps video: clps711x-fb: release disp device node in probe() md: fix raid10 hang issue caused by barrier fbdev: fbmem: behave better with small rotated displays and many CPUs i40e: define proper net_device::neigh_priv_len ice: Do not enable NAPI on q_vectors that have no rings igb: Fix an issue that PME is not enabled during runtime suspend ACPI/APEI: Clear GHES block_status before panic() fbdev: fbcon: Fix unregister crash when more than one framebuffer powerpc/mm: Fix reporting of kernel execute faults on the 8xx pinctrl: meson: meson8: fix the GPIO function for the GPIOAO pins pinctrl: meson: meson8b: fix the GPIO function for the GPIOAO pins KVM: x86: svm: report MSR_IA32_MCG_EXT_CTL as unsupported powerpc/fadump: Do not allow hot-remove memory from fadump reserved area. kvm: Change offset in kvm_write_guest_offset_cached to unsigned NFS: nfs_compare_mount_options always compare auth flavors. perf build: Don't unconditionally link the libbfd feature test to -liberty and -lz hwmon: (lm80) fix a missing check of the status of SMBus read hwmon: (lm80) fix a missing check of bus read in lm80 probe seq_buf: Make seq_buf_puts() null-terminate the buffer crypto: ux500 - Use proper enum in cryp_set_dma_transfer crypto: ux500 - Use proper enum in hash_set_dma_transfer MIPS: ralink: Select CONFIG_CPU_MIPSR2_IRQ_VI on MT7620/8 cifs: check ntwrk_buf_start for NULL before dereferencing it f2fs: fix use-after-free issue when accessing sbi->stat_info um: Avoid marking pages with "changed protection" niu: fix missing checks of niu_pci_eeprom_read f2fs: fix sbi->extent_list corruption issue cgroup: fix parsing empty mount option string perf python: Do not force closing original perf descriptor in evlist.get_pollfd() scripts/decode_stacktrace: only strip base path when a prefix of the path arch/sh/boards/mach-kfr2r09/setup.c: fix struct mtd_oob_ops build warning ocfs2: don't clear bh uptodate for block read ocfs2: improve ocfs2 Makefile mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init zram: fix lockdep warning of free block handling isdn: hisax: hfc_pci: Fix a possible concurrency use-after-free bug in HFCPCI_l1hw() gdrom: fix a memory leak bug fsl/fman: Use GFP_ATOMIC in {memac,tgec}_add_hash_mac_address() block/swim3: Fix -EBUSY error when re-opening device after unmount thermal: bcm2835: enable hwmon explicitly kdb: Don't back trace on a cpu that didn't round up PCI: imx: Enable MSI from downstream components thermal: generic-adc: Fix adc to temp interpolation HID: lenovo: Add checks to fix of_led_classdev_register arm64/sve: ptrace: Fix SVE_PT_REGS_OFFSET definition kernel/hung_task.c: break RCU locks based on jiffies proc/sysctl: fix return error for proc_doulongvec_minmax() kernel/hung_task.c: force console verbose before panic fs/epoll: drop ovflist branch prediction exec: load_script: don't blindly truncate shebang string kernel/kcov.c: mark write_comp_data() as notrace scripts/gdb: fix lx-version string output xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat xfs: cancel COW blocks before swapext xfs: Fix error code in 'xfs_ioc_getbmap()' xfs: fix overflow in xfs_attr3_leaf_verify xfs: fix shared extent data corruption due to missing cow reservation xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers xfs: delalloc -> unwritten COW fork allocation can go wrong fs/xfs: fix f_ffree value for statfs when project quota is set xfs: fix PAGE_MASK usage in xfs_free_file_space xfs: fix inverted return from xfs_btree_sblock_verify_crc thermal: hwmon: inline helpers when CONFIG_THERMAL_HWMON is not set dccp: fool proof ccid_hc_[rt]x_parse_options() enic: fix checksum validation for IPv6 lib/test_rhashtable: Make test_insert_dup() allocate its hash table dynamically net: dp83640: expire old TX-skb net: dsa: Fix lockdep false positive splat net: dsa: Fix NULL checking in dsa_slave_set_eee() net: dsa: mv88e6xxx: Fix counting of ATU violations net: dsa: slave: Don't propagate flag changes on down slave interfaces net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames net: systemport: Fix WoL with password after deep sleep rds: fix refcount bug in rds_sock_addref Revert "net: phy: marvell: avoid pause mode on SGMII-to-Copper for 88e151x" rxrpc: bad unlock balance in rxrpc_recvmsg sctp: check and update stream->out_curr when allocating stream_out sctp: walk the list of asoc safely skge: potential memory corruption in skge_get_regs() virtio_net: Account for tx bytes and packets on sending xdp_frames net/mlx5e: FPGA, fix Innova IPsec TX offload data path performance xfs: eof trim writeback mapping as soon as it is cached ALSA: compress: Fix stop handling on compressed capture streams ALSA: usb-audio: Add support for new T+A USB DAC ALSA: hda - Serialize codec registrations ALSA: hda/realtek - Fix lose hp_pins for disable auto mute ALSA: hda/realtek - Use a common helper for hp pin reference ALSA: hda/realtek - Headset microphone support for System76 darp5 fuse: call pipe_buf_release() under pipe lock fuse: decrement NR_WRITEBACK_TEMP on the right page fuse: handle zero sized retrieve correctly HID: debug: fix the ring buffer implementation dmaengine: bcm2835: Fix interrupt race on RT dmaengine: bcm2835: Fix abort of transactions dmaengine: imx-dma: fix wrong callback invoke futex: Handle early deadlock return correctly irqchip/gic-v3-its: Plug allocation race for devices sharing a DevID usb: phy: am335x: fix race condition in _probe usb: dwc3: gadget: Handle 0 xfer length for OUT EP usb: gadget: udc: net2272: Fix bitwise and boolean operations usb: gadget: musb: fix short isoc packets with inventra dma staging: speakup: fix tty-operation NULL derefs scsi: cxlflash: Prevent deadlock when adapter probe fails scsi: aic94xx: fix module loading KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222) kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974) KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221) cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM perf/x86/intel/uncore: Add Node ID mask x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out() perf/core: Don't WARN() for impossible ring-buffer sizes perf tests evsel-tp-sched: Fix bitwise operator serial: fix race between flush_to_ldisc and tty_open serial: 8250_pci: Make PCI class test non fatal serial: sh-sci: Do not free irqs that have already been freed cacheinfo: Keep the old value if of_property_read_u32 fails IB/hfi1: Add limit test for RC/UC send via loopback perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu() ath9k: dynack: make ewma estimation faster ath9k: dynack: check da->enabled first in sampling routines Linux 4.19.21 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Waiman Long
|
f73c77535f |
mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init
[ Upstream commit 3c0c12cc8f00ca5f81acb010023b8eb13e9a7004 ] When CONFIG_KASAN is enabled on large memory SMP systems, the deferrred pages initialization can take a long time. Below were the reported init times on a 8-socket 96-core 4TB IvyBridge system. 1) Non-debug kernel without CONFIG_KASAN [ 8.764222] node 1 initialised, 132086516 pages in 7027ms 2) Debug kernel with CONFIG_KASAN [ 146.288115] node 1 initialised, 132075466 pages in 143052ms So the page init time in a debug kernel was 20X of the non-debug kernel. The long init time can be problematic as the page initialization is done with interrupt disabled. In this particular case, it caused the appearance of following warning messages as well as NMI backtraces of all the cores that were doing the initialization. [ 68.240049] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 68.241000] rcu: 25-...0: (100 ticks this GP) idle=b72/1/0x4000000000000000 softirq=915/915 fqs=16252 [ 68.241000] rcu: 44-...0: (95 ticks this GP) idle=49a/1/0x4000000000000000 softirq=788/788 fqs=16253 [ 68.241000] rcu: 54-...0: (104 ticks this GP) idle=03a/1/0x4000000000000000 softirq=721/825 fqs=16253 [ 68.241000] rcu: 60-...0: (103 ticks this GP) idle=cbe/1/0x4000000000000000 softirq=637/740 fqs=16253 [ 68.241000] rcu: 72-...0: (105 ticks this GP) idle=786/1/0x4000000000000000 softirq=536/641 fqs=16253 [ 68.241000] rcu: 84-...0: (99 ticks this GP) idle=292/1/0x4000000000000000 softirq=537/537 fqs=16253 [ 68.241000] rcu: 111-...0: (104 ticks this GP) idle=bde/1/0x4000000000000000 softirq=474/476 fqs=16253 [ 68.241000] rcu: (detected by 13, t=65018 jiffies, g=249, q=2) The long init time was mainly caused by the call to kasan_free_pages() to poison the newly initialized pages. On a 4TB system, we are talking about almost 500GB of memory probably on the same node. In reality, we may not need to poison the newly initialized pages before they are ever allocated. So KASAN poisoning of freed pages before the completion of deferred memory initialization is now disabled. Those pages will be properly poisoned when they are allocated or freed after deferred pages initialization is done. With this change, the new page initialization time became: [ 21.948010] node 1 initialised, 132075466 pages in 18702ms This was still about double the non-debug kernel time, but was much better than before. Link: http://lkml.kernel.org/r/1544459388-8736-1-git-send-email-longman@redhat.com Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com> Cc: Oscar Salvador <osalvador@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Dennis Zhou
|
7c54ba4a5f |
percpu: convert spin_lock_irq to spin_lock_irqsave.
[ Upstream commit 6ab7d47bcbf0144a8cb81536c2cead4cde18acfe ] From Michael Cree: "Bisection lead to commit |
||
Greg Kroah-Hartman
|
c28f73fe42 |
This is the 4.19.20 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlxbC5gACgkQONu9yGCS aT4DYQ//Uqm/Q63KQuExgd7W+61FoP4NFHlYXZ31B5Rkydryyk2K5P2ONSdVd5n9 k3wjzRrxvlPvjOwbh9PHv+pLkGxBDqpT1X8IAXPe36bYUkvXoH71BE4YSRPRUJAf sdzw/vs7WE/Kx41iT3SXiQih8ok0y3LoACBKmUsEXoLI1cJZCUnnSFpP++QNe1Iz B/y04BigL8R7OWR/jow6OPWe9uXOI8iEe9QKVX26g4oaakzly4vkp6OwROSwM31q 0wut8jF/AtDcZpZXjJLjDCj10k5DRN8jwGcLD7iZeIKqexOabjUrsvfIHfbpUtXr e7pJw2aUM8BFb8Ba2lsB7gkqvdHQohqVKQE4Qy59aPyesm2G5miH4gAbncoixjCa u3eQV5ACpFLksUFR4RAMKq+10k7swsutyyJr5vG4qdbRpcTCNJirEwAGGqgI6IEP SDqtw6u8gMP8+SicwA9p71Wwntcq9RR6fx0gX/3wi2DQp6F8Txem00SqaciE7uQ1 uIOUrhcpWzIq4m58SGhgTSQcBkm5qBD5S154/xRKIo0mvME+NwBub/x3fIsixN/u AzWQmQPXBajHbYXbKGC7t2jNHkU5d9FedZ4iDmJk/+ZZsWyFByY1bH1cg4Qnq89e tDxL114YmSujbZD/mFlbGWcqdmGNT355BmyetKDx6w0rNiU/RBU= =oprJ -----END PGP SIGNATURE----- Merge 4.19.20 into android-4.19 Changes in 4.19.20 Fix "net: ipv4: do not handle duplicate fragments as overlapping" drm/msm/gpu: fix building without debugfs ipv6: Consider sk_bound_dev_if when binding a socket to an address ipv6: sr: clear IP6CB(skb) on SRH ip4ip6 encapsulation ipvlan, l3mdev: fix broken l3s mode wrt local routes l2tp: copy 4 more bytes to linear part if necessary l2tp: fix reading optional fields of L2TPv3 net: ip_gre: always reports o_key to userspace net: ip_gre: use erspan key field for tunnel lookup net/mlx4_core: Add masking for a few queries on HCA caps netrom: switch to sock timer API net/rose: fix NULL ax25_cb kernel panic net: set default network namespace in init_dummy_netdev() ravb: expand rx descriptor data to accommodate hw checksum sctp: improve the events for sctp stream reset tun: move the call to tun_set_real_num_queues ucc_geth: Reset BQL queue when stopping device vhost: fix OOB in get_rx_bufs() net: ip6_gre: always reports o_key to userspace sctp: improve the events for sctp stream adding net/mlx5e: Allow MAC invalidation while spoofchk is ON ip6mr: Fix notifiers call on mroute_clean_tables() Revert "net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager" sctp: set chunk transport correctly when it's a new asoc sctp: set flow sport from saddr only when it's 0 virtio_net: Don't enable NAPI when interface is down virtio_net: Don't call free_old_xmit_skbs for xdp_frames virtio_net: Fix not restoring real_num_rx_queues virtio_net: Fix out of bounds access of sq virtio_net: Don't process redirected XDP frames when XDP is disabled virtio_net: Use xdp_return_frame to free xdp_frames on destroying vqs virtio_net: Differentiate sk_buff and xdp_frame on freeing CIFS: Do not count -ENODATA as failure for query directory CIFS: Fix trace command logging for SMB2 reads and writes CIFS: Do not consider -ENODATA as stat failure for reads fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb() iommu/vt-d: Fix memory leak in intel_iommu_put_resv_regions() selftests/seccomp: Enhance per-arch ptrace syscall skip tests NFS: Fix up return value on fatal errors in nfs_page_async_flush() ARM: cns3xxx: Fix writing to wrong PCI config registers after alignment arm64: kaslr: ensure randomized quantities are clean also when kaslr is off arm64: Do not issue IPIs for user executable ptes arm64: hyp-stub: Forbid kprobing of the hyp-stub arm64: hibernate: Clean the __hyp_text to PoC after resume gpio: altera-a10sr: Set proper output level for direction_output gpiolib: fix line event timestamps for nested irqs gpio: pcf857x: Fix interrupts on multiple instances gpio: sprd: Fix the incorrect data register gpio: sprd: Fix incorrect irq type setting for the async EIC gfs2: Revert "Fix loop in gfs2_rbm_find" mmc: bcm2835: Fix DMA channel leak on probe error mmc: mediatek: fix incorrect register setting of hs400_cmd_int_delay ALSA: usb-audio: Add Opus #3 to quirks for native DSD support ALSA: hda/realtek - Fixed hp_pin no value IB/hfi1: Remove overly conservative VM_EXEC flag check platform/x86: asus-nb-wmi: Map 0x35 to KEY_SCREENLOCK platform/x86: asus-nb-wmi: Drop mapping of 0x33 and 0x34 scan codes mmc: sdhci-iproc: handle mmc_of_parse() errors during probe Btrfs: fix deadlock when allocating tree block during leaf/node split btrfs: On error always free subvol_name in btrfs_mount kernel/exit.c: release ptraced tasks before zap_pid_ns_processes mm/hugetlb.c: teach follow_hugetlb_page() to handle FOLL_NOWAIT oom, oom_reaper: do not enqueue same task twice mm,memory_hotplug: fix scan_movable_pages() for gigantic hugepages mm, oom: fix use-after-free in oom_kill_process mm: hwpoison: use do_send_sig_info() instead of force_sig() mm: migrate: don't rely on __PageMovable() of newpage after unlocking it of: Convert to using %pOFn instead of device_node.name of: overlay: add tests to validate kfrees from overlay removal of: overlay: add missing of_node_get() in __of_attach_node_sysfs of: overlay: use prop add changeset entry for property in new nodes of: overlay: do not duplicate properties from overlay for new nodes md/raid5: fix 'out of memory' during raid cache recovery cifs: Always resolve hostname before reconnecting Linux 4.19.20 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
David Hildenbrand
|
214dea147f |
mm: migrate: don't rely on __PageMovable() of newpage after unlocking it
commit e0a352fabce61f730341d119fbedf71ffdb8663f upstream. We had a race in the old balloon compaction code before |
||
Naoya Horiguchi
|
ced41d9d6a |
mm: hwpoison: use do_send_sig_info() instead of force_sig()
commit 6376360ecbe525a9c17b3d081dfd88ba3e4ed65b upstream.
Currently memory_failure() is racy against process's exiting, which
results in kernel crash by null pointer dereference.
The root cause is that memory_failure() uses force_sig() to forcibly
kill asynchronous (meaning not in the current context) processes. As
discussed in thread https://lkml.org/lkml/2010/6/8/236 years ago for OOM
fixes, this is not a right thing to do. OOM solves this issue by using
do_send_sig_info() as done in commit
|
||
Shakeel Butt
|
b6f534ab69 |
mm, oom: fix use-after-free in oom_kill_process
commit cefc7ef3c87d02fc9307835868ff721ea12cc597 upstream.
Syzbot instance running on upstream kernel found a use-after-free bug in
oom_kill_process. On further inspection it seems like the process
selected to be oom-killed has exited even before reaching
read_lock(&tasklist_lock) in oom_kill_process(). More specifically the
tsk->usage is 1 which is due to get_task_struct() in oom_evaluate_task()
and the put_task_struct within for_each_thread() frees the tsk and
for_each_thread() tries to access the tsk. The easiest fix is to do
get/put across the for_each_thread() on the selected task.
Now the next question is should we continue with the oom-kill as the
previously selected task has exited? However before adding more
complexity and heuristics, let's answer why we even look at the children
of oom-kill selected task? The select_bad_process() has already selected
the worst process in the system/memcg. Due to race, the selected
process might not be the worst at the kill time but does that matter?
The userspace can use the oom_score_adj interface to prefer children to
be killed before the parent. I looked at the history but it seems like
this is there before git history.
Link: http://lkml.kernel.org/r/20190121215850.221745-1-shakeelb@google.com
Reported-by: syzbot+7fbbfa368521945f0e3d@syzkaller.appspotmail.com
Fixes:
|
||
Oscar Salvador
|
d9f4d88d56 |
mm,memory_hotplug: fix scan_movable_pages() for gigantic hugepages
commit eeb0efd071d821a88da3fbd35f2d478f40d3b2ea upstream. This is the same sort of error we saw in commit 17e2e7d7e1b8 ("mm, page_alloc: fix has_unmovable_pages for HugePages"). Gigantic hugepages cross several memblocks, so it can be that the page we get in scan_movable_pages() is a page-tail belonging to a 1G-hugepage. If that happens, page_hstate()->size_to_hstate() will return NULL, and we will blow up in hugepage_migration_supported(). The splat is as follows: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 1350 Comm: bash Tainted: G E 5.0.0-rc1-mm1-1-default+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__offline_pages+0x6ae/0x900 Call Trace: memory_subsys_offline+0x42/0x60 device_offline+0x80/0xa0 state_store+0xab/0xc0 kernfs_fop_write+0x102/0x180 __vfs_write+0x26/0x190 vfs_write+0xad/0x1b0 ksys_write+0x42/0x90 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Modules linked in: af_packet(E) xt_tcpudp(E) ipt_REJECT(E) xt_conntrack(E) nf_conntrack(E) nf_defrag_ipv4(E) ip_set(E) nfnetlink(E) ebtable_nat(E) ebtable_broute(E) bridge(E) stp(E) llc(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ebtable_filter(E) ebtables(E) iptable_filter(E) ip_tables(E) x_tables(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) bochs_drm(E) ttm(E) aesni_intel(E) drm_kms_helper(E) aes_x86_64(E) crypto_simd(E) cryptd(E) glue_helper(E) drm(E) virtio_net(E) syscopyarea(E) sysfillrect(E) net_failover(E) sysimgblt(E) pcspkr(E) failover(E) i2c_piix4(E) fb_sys_fops(E) parport_pc(E) parport(E) button(E) btrfs(E) libcrc32c(E) xor(E) zstd_decompress(E) zstd_compress(E) xxhash(E) raid6_pq(E) sd_mod(E) ata_generic(E) ata_piix(E) ahci(E) libahci(E) libata(E) crc32c_intel(E) serio_raw(E) virtio_pci(E) virtio_ring(E) virtio(E) sg(E) scsi_mod(E) autofs4(E) [akpm@linux-foundation.org: fix brace layout, per David. Reduce indentation] Link: http://lkml.kernel.org/r/20190122154407.18417-1-osalvador@suse.de Signed-off-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Tetsuo Handa
|
7e70ddc332 |
oom, oom_reaper: do not enqueue same task twice
commit 9bcdeb51bd7d2ae9fe65ea4d60643d2aeef5bfe3 upstream. Arkadiusz reported that enabling memcg's group oom killing causes strange memcg statistics where there is no task in a memcg despite the number of tasks in that memcg is not 0. It turned out that there is a bug in wake_oom_reaper() which allows enqueuing same task twice which makes impossible to decrease the number of tasks in that memcg due to a refcount leak. This bug existed since the OOM reaper became invokable from task_will_free_mem(current) path in out_of_memory() in Linux 4.7, T1@P1 |T2@P1 |T3@P1 |OOM reaper ----------+----------+----------+------------ # Processing an OOM victim in a different memcg domain. try_charge() mem_cgroup_out_of_memory() mutex_lock(&oom_lock) try_charge() mem_cgroup_out_of_memory() mutex_lock(&oom_lock) try_charge() mem_cgroup_out_of_memory() mutex_lock(&oom_lock) out_of_memory() oom_kill_process(P1) do_send_sig_info(SIGKILL, @P1) mark_oom_victim(T1@P1) wake_oom_reaper(T1@P1) # T1@P1 is enqueued. mutex_unlock(&oom_lock) out_of_memory() mark_oom_victim(T2@P1) wake_oom_reaper(T2@P1) # T2@P1 is enqueued. mutex_unlock(&oom_lock) out_of_memory() mark_oom_victim(T1@P1) wake_oom_reaper(T1@P1) # T1@P1 is enqueued again due to oom_reaper_list == T2@P1 && T1@P1->oom_reaper_list == NULL. mutex_unlock(&oom_lock) # Completed processing an OOM victim in a different memcg domain. spin_lock(&oom_reaper_lock) # T1P1 is dequeued. spin_unlock(&oom_reaper_lock) but memcg's group oom killing made it easier to trigger this bug by calling wake_oom_reaper() on the same task from one out_of_memory() request. Fix this bug using an approach used by commit |
||
Andrea Arcangeli
|
15033ca6bd |
mm/hugetlb.c: teach follow_hugetlb_page() to handle FOLL_NOWAIT
commit 1ac25013fb9e4ed595cd608a406191e93520881e upstream. hugetlb needs the same fix as faultin_nopage (which was applied in commit |
||
Greg Kroah-Hartman
|
18ba00a34e |
This is the 4.19.19 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlxSoGgACgkQONu9yGCS aT5IShAAmr4efAHaepYtsEM5q7hpYwdIXauhfmVzYS1KITQ0L+FSO6b6r18/X/Xu ZnGr87x9s7rzTwbgW9EK+cPBE7tVI3Tlem1prTSAQLneiv2zN/iTvJigWGUmifB/ +xldQxiM1S8j6dlVTu0lFl9P8voJ/zFA1II1DV4KJYiPX2lfVis77Tolcd/3TUJ4 V67abXHeAsLc+bU8kcMxamVievyQndwVlMT4XjStJyl6xy1zDozgiNwphyLqT7yc GY0H4jCbFNLfhZlMQgpHanvXzHshbJ4VtMNnjmUApplftzrVf864rgi+sRcoHWK/ Q6ER8LtgFYoqwG1ZjLUIEMChjhs/Xv+FLHWsCvCIkINyzo0PSgubnbYQVccXkBmB XSZ62YTh9sdQtdihWUly0Gr53yMvSn9+ndwJjvBDEErjC9b/D9jOLMcY1L/cCDaQ J34dp6ES+6YdQjAu0TEwuuJHMdGR+BvtKgGIL7V6ujxZhuU39eaJ0QLZktsmIO12 qWQyUKaLA450Qqiqza7foTWAVM6nFu9fd9xsZUKZV6lNnFjKYy/vWjhHsMCR8eKi aojGiRVRNnrlIwgk9h7H4EiuRkK3CJFc9jyZhA6u05NLMdIjD9YuLaMLHuSN0R+m lSgdmM3dVC4gyosKZHGdwzX/ytHTQiA8QRb4SWEnck7piKyWplg= =/rR0 -----END PGP SIGNATURE----- Merge 4.19.19 into android-4.19 Changes in 4.19.19 amd-xgbe: Fix mdio access for non-zero ports and clause 45 PHYs net: bridge: Fix ethernet header pointer before check skb forwardable net: Fix usage of pskb_trim_rcsum net: phy: marvell: Errata for mv88e6390 internal PHYs net: phy: mdio_bus: add missing device_del() in mdiobus_register() error handling net/sched: act_tunnel_key: fix memory leak in case of action replace net_sched: refetch skb protocol for each filter openvswitch: Avoid OOB read when parsing flow nlattrs vhost: log dirty page correctly mlxsw: pci: Increase PCI SW reset timeout net: ipv4: Fix memory leak in network namespace dismantle mlxsw: spectrum_fid: Update dummy FID index mlxsw: pci: Ring CQ's doorbell before RDQ's net/sched: cls_flower: allocate mask dynamically in fl_change() udp: with udp_segment release on error path ip6_gre: fix tunnel list corruption for x-netns erspan: build the header with the right proto according to erspan_ver net: phy: marvell: Fix deadlock from wrong locking ip6_gre: update version related info when changing link tcp: allow MSG_ZEROCOPY transmission also in CLOSE_WAIT state mei: me: mark LBG devices as having dma support mei: me: add denverton innovation engine device IDs USB: leds: fix regression in usbport led trigger USB: serial: simple: add Motorola Tetra TPG2200 device id USB: serial: pl2303: add new PID to support PL2303TB ceph: clear inode pointer when snap realm gets dropped by its inode ASoC: atom: fix a missing check of snd_pcm_lib_malloc_pages ASoC: rt5514-spi: Fix potential NULL pointer dereference ASoC: tlv320aic32x4: Kernel OOPS while entering DAPM standby mode clk: socfpga: stratix10: fix rate calculation for pll clocks clk: socfpga: stratix10: fix naming convention for the fixed-clocks inotify: Fix fd refcount leak in inotify_add_watch(). ALSA: hda/realtek - Fix typo for ALC225 model ALSA: hda - Add mute LED support for HP ProBook 470 G5 ARCv2: lib: memeset: fix doing prefetchw outside of buffer ARC: adjust memblock_reserve of kernel memory ARC: perf: map generic branches to correct hardware condition s390/mm: always force a load of the primary ASCE on context switch s390/early: improve machine detection s390/smp: fix CPU hotplug deadlock with CPU rescan misc: ibmvsm: Fix potential NULL pointer dereference char/mwave: fix potential Spectre v1 vulnerability mmc: dw_mmc-bluefield: : Fix the license information mmc: meson-gx: Free irq in release() callback staging: rtl8188eu: Add device code for D-Link DWA-121 rev B1 tty: Handle problem if line discipline does not have receive_buf uart: Fix crash in uart_write and uart_put_char tty/n_hdlc: fix __might_sleep warning hv_balloon: avoid touching uninitialized struct page during tail onlining Drivers: hv: vmbus: Check for ring when getting debug info vgacon: unconfuse vc_origin when using soft scrollback CIFS: Fix possible hang during async MTU reads and writes CIFS: Fix credits calculations for reads with errors CIFS: Fix credit calculation for encrypted reads with errors CIFS: Do not reconnect TCP session in add_credits() smb3: add credits we receive from oplock/break PDUs Input: xpad - add support for SteelSeries Stratus Duo Input: input_event - provide override for sparc64 Input: uinput - fix undefined behavior in uinput_validate_absinfo() acpi/nfit: Block function zero DSMs acpi/nfit: Fix command-supported detection scsi: ufs: Use explicit access size in ufshcd_dump_regs dm thin: fix passdown_double_checking_shared_status() dm crypt: fix parsing of extended IV arguments drm/amdgpu: Add APTX quirk for Lenovo laptop KVM: x86: Fix single-step debugging KVM: x86: Fix PV IPIs for 32-bit KVM host KVM: x86: WARN_ONCE if sending a PV IPI returns a fatal error kvm: x86/vmx: Use kzalloc for cached_vmcs12 KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned x86/pkeys: Properly copy pkey state at fork() x86/selftests/pkeys: Fork() to check for state being preserved x86/kaslr: Fix incorrect i8254 outb() parameters x86/entry/64/compat: Fix stack switching for XEN PV posix-cpu-timers: Unbreak timer rearming net: sun: cassini: Cleanup license conflict irqchip/gic-v3-its: Align PCI Multi-MSI allocation on their size can: dev: __can_get_echo_skb(): fix bogous check for non-existing skb by removing it can: bcm: check timer values before ktime conversion can: flexcan: fix NULL pointer exception during bringup vt: make vt_console_print() compatible with the unicode screen buffer vt: always call notifier with the console lock held vt: invoke notifier on screen size change drm/meson: Fix atomic mode switching regression bpf: improve verifier branch analysis bpf: add per-insn complexity limit bpf: move {prev_,}insn_idx into verifier env bpf: move tmp variable into ax register in interpreter bpf: enable access to ax register also from verifier rewrite bpf: restrict map value pointer arithmetic for unprivileged bpf: restrict stack pointer arithmetic for unprivileged bpf: restrict unknown scalars of mixed signed bounds for unprivileged bpf: fix check_map_access smin_value test when pointer contains offset bpf: prevent out of bounds speculation on pointer arithmetic bpf: fix sanitation of alu op with pointer / scalar type from different paths bpf: fix inner map masking to prevent oob under speculation s390/smp: Fix calling smp_call_ipl_cpu() from ipl CPU nvmet-rdma: Add unlikely for response allocated check nvmet-rdma: fix null dereference under heavy load Revert "mm, memory_hotplug: initialize struct pages for the full memory section" usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup ide: fix a typo in the settings proc file name Input: input_event - fix the CONFIG_SPARC64 mixup Linux 4.19.19 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Michal Hocko
|
6bab957396 |
Revert "mm, memory_hotplug: initialize struct pages for the full memory section"
commit 4aa9fc2a435abe95a1e8d7f8c7b3d6356514b37a upstream. This reverts commit 2830bf6f05fb3e05bc4743274b806c821807a684. The underlying assumption that one sparse section belongs into a single numa node doesn't hold really. Robert Shteynfeld has reported a boot failure. The boot log was not captured but his memory layout is as follows: Early memory node ranges node 1: [mem 0x0000000000001000-0x0000000000090fff] node 1: [mem 0x0000000000100000-0x00000000dbdf8fff] node 1: [mem 0x0000000100000000-0x0000001423ffffff] node 0: [mem 0x0000001424000000-0x0000002023ffffff] This means that node0 starts in the middle of a memory section which is also in node1. memmap_init_zone tries to initialize padding of a section even when it is outside of the given pfn range because there are code paths (e.g. memory hotplug) which assume that the full worth of memory section is always initialized. In this particular case, though, such a range is already intialized and most likely already managed by the page allocator. Scribbling over those pages corrupts the internal state and likely blows up when any of those pages gets used. Reported-by: Robert Shteynfeld <robert.shteynfeld@gmail.com> Fixes: 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full memory section") Cc: stable@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Greg Kroah-Hartman
|
26bf816608 |
This is the 4.19.18 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlxMGy0ACgkQONu9yGCS aT5ppQ/8COjyZg1aTrCrd0ttMHYotw3Lb4B6E/SCf2ub4X38SxGz9irhQ7r2FKdK w0ZXlLOF2ddqWe6BUnIfWago4Pk1GBpg3bgnp5XyYTjlJbfI2yZ9ggiO0iNYBPaL fN2JwM9eze/7cDlpYbhwGpF4+Wz8wTrzh+NIputcvC6n3SQH/cTGmOUa9rlamQju uukkvLanAYY3sqDCl4B415Ds44ROU4filqHYIkvZC81jc3Q0YZ8M7cTmpLcDQKGz 8Z+Veil07jEM9bF2W8iX79nwxMT+edFC62HMuRCoxJKq+1kccw1TVMWpQ8TWbv13 zeLOqXxNP6VcNaC251q3QzlInRDp1dtr8KtzA/OG0WFnZBTEDng/iChhiL8qZt0R 9+Sz7n9uZ5pMRK3tr03Ccjg3AneKWRqad2iaTB/kOwAdu7Uqxz8U9qUuRDFPV7OY KTMCCfdS8XpMHl/S+Cvg2dqSNiBEkNmowYO6NvQClG0aoN4/6wH+m2TZ0hCl6PVq pNFOTJmp7FOaztEZC4rqW8DoOGeGaNo5DP9A2XKKDR20F7EiAE437ApEQ4p5QGVk ek4uslZkwJWU/UOzXRl/Hoz0OlI0ixsdZy1vw88HCl7SD1E7xHJpnRUkOjigTT1Q nbCt0Nm/A2+c1tKbzU+PVW8FtIbutZhW1BtrqaIbbHr9NBTICR0= =Yg+/ -----END PGP SIGNATURE----- Merge 4.19.18 into android-4.19 Changes in 4.19.18 ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address mlxsw: spectrum: Disable lag port TX before removing it mlxsw: spectrum_switchdev: Set PVID correctly during VLAN deletion net: dsa: mv88x6xxx: mv88e6390 errata net, skbuff: do not prefer skb allocation fails early qmi_wwan: add MTU default to qmap network interface r8169: Add support for new Realtek Ethernet ipv6: Take rcu_read_lock in __inet6_bind for mapped addresses net: clear skb->tstamp in bridge forwarding path netfilter: ipset: Allow matching on destination MAC address for mac and ipmac sets gpio: pl061: Move irq_chip definition inside struct pl061 drm/amd/display: Guard against null stream_state in set_crc_source drm/amdkfd: fix interrupt spin lock ixgbe: allow IPsec Tx offload in VEPA mode platform/x86: asus-wmi: Tell the EC the OS will handle the display off hotkey e1000e: allow non-monotonic SYSTIM readings usb: typec: tcpm: Do not disconnect link for self powered devices selftests/bpf: enable (uncomment) all tests in test_libbpf.sh of: overlay: add missing of_node_put() after add new node to changeset writeback: don't decrement wb->refcnt if !wb->bdi serial: set suppress_bind_attrs flag only if builtin bpf: Allow narrow loads with offset > 0 ALSA: oxfw: add support for APOGEE duet FireWire x86/mce: Fix -Wmissing-prototypes warnings MIPS: SiByte: Enable swiotlb for SWARM, LittleSur and BigSur crypto: ecc - regularize scalar for scalar multiplication arm64: perf: set suppress_bind_attrs flag to true drm/atomic-helper: Complete fake_commit->flip_done potentially earlier clk: meson: meson8b: fix incorrect divider mapping in cpu_scale_table samples: bpf: fix: error handling regarding kprobe_events usb: gadget: udc: renesas_usb3: add a safety connection way for forced_b_device fpga: altera-cvp: fix probing for multiple FPGAs on the bus selinux: always allow mounting submounts ASoC: pcm3168a: Don't disable pcm3168a when CONFIG_PM defined scsi: qedi: Check for session online before getting iSCSI TLV data. drm/amdgpu: Reorder uvd ring init before uvd resume rxe: IB_WR_REG_MR does not capture MR's iova field efi/libstub: Disable some warnings for x86{,_64} jffs2: Fix use of uninitialized delayed_work, lockdep breakage clk: imx: make mux parent strings const pstore/ram: Do not treat empty buffers as valid media: uvcvideo: Refactor teardown of uvc on USB disconnect powerpc/xmon: Fix invocation inside lock region powerpc/pseries/cpuidle: Fix preempt warning media: firewire: Fix app_info parameter type in avc_ca{,_app}_info ASoC: use dma_ops of parent device for acp_audio_dma media: venus: core: Set dma maximum segment size staging: erofs: fix use-after-free of on-stack `z_erofs_vle_unzip_io' net: call sk_dst_reset when set SO_DONTROUTE scsi: target: use consistent left-aligned ASCII INQUIRY data scsi: target/core: Make sure that target_wait_for_sess_cmds() waits long enough selftests: do not macro-expand failed assertion expressions arm64: kasan: Increase stack size for KASAN_EXTRA clk: imx6q: reset exclusive gates on init arm64: Fix minor issues with the dcache_by_line_op macro bpf: relax verifier restriction on BPF_MOV | BPF_ALU kconfig: fix file name and line number of warn_ignored_character() kconfig: fix memory leak when EOF is encountered in quotation mmc: atmel-mci: do not assume idle after atmci_request_end btrfs: volumes: Make sure there is no overlap of dev extents at mount time btrfs: alloc_chunk: fix more DUP stripe size handling btrfs: fix use-after-free due to race between replace start and cancel btrfs: improve error handling of btrfs_add_link tty/serial: do not free trasnmit buffer page under port lock perf intel-pt: Fix error with config term "pt=0" perf tests ARM: Disable breakpoint tests 32-bit perf svghelper: Fix unchecked usage of strncpy() perf parse-events: Fix unchecked usage of strncpy() perf vendor events intel: Fix Load_Miss_Real_Latency on SKL/SKX netfilter: ipt_CLUSTERIP: check MAC address when duplicate config is set netfilter: ipt_CLUSTERIP: remove wrong WARN_ON_ONCE in netns exit routine netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routine x86/topology: Use total_cpus for max logical packages calculation dm crypt: use u64 instead of sector_t to store iv_offset dm kcopyd: Fix bug causing workqueue stalls perf stat: Avoid segfaults caused by negated options tools lib subcmd: Don't add the kernel sources to the include path dm snapshot: Fix excessive memory usage and workqueue stalls perf cs-etm: Correct packets swapping in cs_etm__flush() perf tools: Add missing sigqueue() prototype for systems lacking it perf tools: Add missing open_memstream() prototype for systems lacking it quota: Lock s_umount in exclusive mode for Q_XQUOTA{ON,OFF} quotactls. clocksource/drivers/integrator-ap: Add missing of_node_put() dm: Check for device sector overflow if CONFIG_LBDAF is not set Bluetooth: btusb: Add support for Intel bluetooth device 8087:0029 ALSA: bebob: fix model-id of unit for Apogee Ensemble sysfs: Disable lockdep for driver bind/unbind files IB/usnic: Fix potential deadlock scsi: mpt3sas: fix memory ordering on 64bit writes scsi: smartpqi: correct lun reset issues ath10k: fix peer stats null pointer dereference scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown() scsi: megaraid: fix out-of-bound array accesses iomap: don't search past page end in iomap_is_partially_uptodate ocfs2: fix panic due to unrecovered local alloc mm/page-writeback.c: don't break integrity writeback on ->writepage() error mm/swap: use nr_node_ids for avail_lists in swap_info_struct userfaultfd: clear flag if remap event not enabled mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps iwlwifi: mvm: Send LQ command as async when necessary Bluetooth: Fix unnecessary error message for HCI request completion ipmi: fix use-after-free of user->release_barrier.rda ipmi: msghandler: Fix potential Spectre v1 vulnerabilities ipmi: Prevent use-after-free in deliver_response ipmi:ssif: Fix handling of multi-part return messages ipmi: Don't initialize anything in the core until something uses it Linux 4.19.18 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Aaron Lu
|
b0cd52e644 |
mm/swap: use nr_node_ids for avail_lists in swap_info_struct
[ Upstream commit 66f71da9dd38af17dc17209cdde7987d4679a699 ]
Since
|
||
Brian Foster
|
dc15e3fd3f |
mm/page-writeback.c: don't break integrity writeback on ->writepage() error
[ Upstream commit 3fa750dcf29e8606e3969d13d8e188cc1c0f511d ] write_cache_pages() is used in both background and integrity writeback scenarios by various filesystems. Background writeback is mostly concerned with cleaning a certain number of dirty pages based on various mm heuristics. It may not write the full set of dirty pages or wait for I/O to complete. Integrity writeback is responsible for persisting a set of dirty pages before the writeback job completes. For example, an fsync() call must perform integrity writeback to ensure data is on disk before the call returns. write_cache_pages() unconditionally breaks out of its processing loop in the event of a ->writepage() error. This is fine for background writeback, which had no strict requirements and will eventually come around again. This can cause problems for integrity writeback on filesystems that might need to clean up state associated with failed page writeouts. For example, XFS performs internal delayed allocation accounting before returning a ->writepage() error, where applicable. If the current writeback happens to be associated with an unmount and write_cache_pages() completes the writeback prematurely due to error, the filesystem is unmounted in an inconsistent state if dirty+delalloc pages still exist. To handle this problem, update write_cache_pages() to always process the full set of pages for integrity writeback regardless of ->writepage() errors. Save the first encountered error and return it to the caller once complete. This facilitates XFS (or any other fs that expects integrity writeback to process the entire set of dirty pages) to clean up its internal state completely in the event of persistent mapping errors. Background writeback continues to exit on the first error encountered. [akpm@linux-foundation.org: fix typo in comment] Link: http://lkml.kernel.org/r/20181116134304.32440-1-bfoster@redhat.com Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Greg Kroah-Hartman
|
976f78d572 |
This is the 4.19.16 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlw/nGYACgkQONu9yGCS aT5/LA//bP/+XrOaB6YIkiM7EfhWTuATY6DOkhwT7kpIgRXMR4FvyTA3o7iEz0DE 5HfSL2wUpF+UZ0sC8c+zFCaNqhMNTwl95J6w4YI3N4V6IsTWxDYQOosQLU1y11Zw w5SV1FXqvKnbPchHehg/toDORs+sryw9QbydTXOPukqEQ1J9Kx8xtcyNivpvccVs /Jn+MNwnDZXWgw1gyx4/BcbtSVnu9RgLdtXSyBBUfZmZxy4Tx+e+ckfp+sd0TpE7 H7QPrMZHZys7EVKfvP1SWOJgStJNGav869Klj8HAZm3rI0R3EhMZBEIxG96HsxFd XOqRfn3Yarl0OQHKggRJQi0EbcOAEUAzWgJKxKFaoqBJyYVoQivp3XJvF+2B56Yb sg4EISWR2OXdO4ER1eYbPyDL+ZO+P0C5eQ16NRly1PifiUk1iHs1dyGg266GU4Tj cHWmdt743nMNCndQ+cUnHAqbJS+UQ6Y/96bOxZlKei93fQfMqynUZBV9FN6DejJt mMNqwV0aEEPlTx37rvExrxS30ydYg1lnF9BY7QP8r71RjpdXgB8fjLN3W2S21SWv 04zMSg9kAKgC3vRDc2vr7nZ9zkeujD/VBVp3HdTLU9gDb1xUL4MqdXNnTiUOzS29 wBWBi7+uiPhSC282kNM08PE1SDq6WtKU9WixJxLP9jYZccMjJDk= =Fi9w -----END PGP SIGNATURE----- Merge 4.19.16 into android-4.19 Changes in 4.19.16 Btrfs: fix deadlock when using free space tree due to block group creation staging: rtl8188eu: Fix module loading from tasklet for CCMP encryption staging: rtl8188eu: Fix module loading from tasklet for WEP encryption cpufreq: scmi: Fix frequency invariance in slow path x86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINE ALSA: hda/realtek - Support Dell headset mode for New AIO platform ALSA: hda/realtek - Add unplug function into unplug state of Headset Mode for ALC225 ALSA: hda/realtek - Disable headset Mic VREF for headset mode of ALC225 CIFS: Fix adjustment of credits for MTU requests CIFS: Do not set credits to 1 if the server didn't grant anything CIFS: Do not hide EINTR after sending network packets CIFS: Fix credit computation for compounded requests cifs: Fix potential OOB access of lock element array usb: cdc-acm: send ZLP for Telit 3G Intel based modems USB: storage: don't insert sane sense for SPC3+ when bad sense specified USB: storage: add quirk for SMI SM3350 USB: Add USB_QUIRK_DELAY_CTRL_MSG quirk for Corsair K70 RGB slab: alien caches must not be initialized if the allocation of the alien cache failed mm/usercopy.c: no check page span for stack objects mm, memcg: fix reclaim deadlock with writeback ACPI: power: Skip duplicate power resource references in _PRx ACPI / PMIC: xpower: Fix TS-pin current-source handling ACPI/IORT: Fix rc_dma_get_range() i2c: dev: prevent adapter retries and timeout being set as minus value mtd: rawnand: qcom: fix memory corruption that causes panic vfio/type1: Fix unmap overflow off-by-one drm/amdgpu: Add new VegaM pci id PCI: dwc: Use interrupt masking instead of disabling PCI: dwc: Take lock when ACKing an interrupt PCI: dwc: Move interrupt acking into the proper callback drm/amd/display: Fix MST dp_blank REG_WAIT timeout drm/fb_helper: Allow leaking fbdev smem_start drm/fb-helper: Partially bring back workaround for bugs of SDL 1.2 drm/i915: Unwind failure on pinning the gen7 ppgtt drm/amdgpu: Don't ignore rc from drm_dp_mst_topology_mgr_resume() drm/amdgpu: Don't fail resume process if resuming atomic state fails rbd: don't return 0 on unmap if RBD_DEV_FLAG_REMOVING is set ext4: make sure enough credits are reserved for dioread_nolock writes ext4: fix a potential fiemap/page fault deadlock w/ inline_data ext4: avoid kernel warning when writing the superblock to a dead device ext4: use ext4_write_inode() when fsyncing w/o a journal ext4: track writeback errors using the generic tracking infrastructure ext4: fix special inode number checks in __ext4_iget() mm: page_mapped: don't assume compound page is huge or THP sunrpc: use-after-free in svc_process_common() KVM: arm/arm64: Fix VMID alloc race by reverting to lock-less arm64: compat: Don't pull syscall number from regs in arm_compat_syscall Btrfs: fix access to available allocation bits when starting balance Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation Btrfs: use nofs context when initializing security xattrs to avoid deadlock Linux 4.19.16 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Jan Stancek
|
160f79c0a0 |
mm: page_mapped: don't assume compound page is huge or THP
commit 8ab88c7169b7fba98812ead6524b9d05bc76cf00 upstream.
LTP proc01 testcase has been observed to rarely trigger crashes
on arm64:
page_mapped+0x78/0xb4
stable_page_flags+0x27c/0x338
kpageflags_read+0xfc/0x164
proc_reg_read+0x7c/0xb8
__vfs_read+0x58/0x178
vfs_read+0x90/0x14c
SyS_read+0x60/0xc0
The issue is that page_mapped() assumes that if compound page is not
huge, then it must be THP. But if this is 'normal' compound page
(COMPOUND_PAGE_DTOR), then following loop can keep running (for
HPAGE_PMD_NR iterations) until it tries to read from memory that isn't
mapped and triggers a panic:
for (i = 0; i < hpage_nr_pages(page); i++) {
if (atomic_read(&page[i]._mapcount) >= 0)
return true;
}
I could replicate this on x86 (v4.20-rc4-98-g60b548237fed) only
with a custom kernel module [1] which:
- allocates compound page (PAGEC) of order 1
- allocates 2 normal pages (COPY), which are initialized to 0xff (to
satisfy _mapcount >= 0)
- 2 PAGEC page structs are copied to address of first COPY page
- second page of COPY is marked as not present
- call to page_mapped(COPY) now triggers fault on access to 2nd COPY
page at offset 0x30 (_mapcount)
[1] https://github.com/jstancek/reproducers/blob/master/kernel/page_mapped_crash/repro.c
Fix the loop to iterate for "1 << compound_order" pages.
Kirrill said "IIRC, sound subsystem can producuce custom mapped compound
pages".
Link: http://lkml.kernel.org/r/c440d69879e34209feba21e12d236d06bc0a25db.1543577156.git.jstancek@redhat.com
Fixes:
|
||
Michal Hocko
|
97b02b6324 |
mm, memcg: fix reclaim deadlock with writeback
commit 63f3655f950186752236bb88a22f8252c11ce394 upstream.
Liu Bo has experienced a deadlock between memcg (legacy) reclaim and the
ext4 writeback
task1:
wait_on_page_bit+0x82/0xa0
shrink_page_list+0x907/0x960
shrink_inactive_list+0x2c7/0x680
shrink_node_memcg+0x404/0x830
shrink_node+0xd8/0x300
do_try_to_free_pages+0x10d/0x330
try_to_free_mem_cgroup_pages+0xd5/0x1b0
try_charge+0x14d/0x720
memcg_kmem_charge_memcg+0x3c/0xa0
memcg_kmem_charge+0x7e/0xd0
__alloc_pages_nodemask+0x178/0x260
alloc_pages_current+0x95/0x140
pte_alloc_one+0x17/0x40
__pte_alloc+0x1e/0x110
alloc_set_pte+0x5fe/0xc20
do_fault+0x103/0x970
handle_mm_fault+0x61e/0xd10
__do_page_fault+0x252/0x4d0
do_page_fault+0x30/0x80
page_fault+0x28/0x30
task2:
__lock_page+0x86/0xa0
mpage_prepare_extent_to_map+0x2e7/0x310 [ext4]
ext4_writepages+0x479/0xd60
do_writepages+0x1e/0x30
__writeback_single_inode+0x45/0x320
writeback_sb_inodes+0x272/0x600
__writeback_inodes_wb+0x92/0xc0
wb_writeback+0x268/0x300
wb_workfn+0xb4/0x390
process_one_work+0x189/0x420
worker_thread+0x4e/0x4b0
kthread+0xe6/0x100
ret_from_fork+0x41/0x50
He adds
"task1 is waiting for the PageWriteback bit of the page that task2 has
collected in mpd->io_submit->io_bio, and tasks2 is waiting for the
LOCKED bit the page which tasks1 has locked"
More precisely task1 is handling a page fault and it has a page locked
while it charges a new page table to a memcg. That in turn hits a
memory limit reclaim and the memcg reclaim for legacy controller is
waiting on the writeback but that is never going to finish because the
writeback itself is waiting for the page locked in the #PF path. So
this is essentially ABBA deadlock:
lock_page(A)
SetPageWriteback(A)
unlock_page(A)
lock_page(B)
lock_page(B)
pte_alloc_pne
shrink_page_list
wait_on_page_writeback(A)
SetPageWriteback(B)
unlock_page(B)
# flush A, B to clear the writeback
This accumulating of more pages to flush is used by several filesystems
to generate a more optimal IO patterns.
Waiting for the writeback in legacy memcg controller is a workaround for
pre-mature OOM killer invocations because there is no dirty IO
throttling available for the controller. There is no easy way around
that unfortunately. Therefore fix this specific issue by pre-allocating
the page table outside of the page lock. We have that handy
infrastructure for that already so simply reuse the fault-around pattern
which already does this.
There are probably other hidden __GFP_ACCOUNT | GFP_KERNEL allocations
from under a fs page locked but they should be really rare. I am not
aware of a better solution unfortunately.
[akpm@linux-foundation.org: fix mm/memory.c:__do_fault()]
[akpm@linux-foundation.org: coding-style fixes]
[mhocko@kernel.org: enhance comment, per Johannes]
Link: http://lkml.kernel.org/r/20181214084948.GA5624@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20181213092221.27270-1-mhocko@kernel.org
Fixes:
|
||
Qian Cai
|
8a4b6e8cb7 |
mm/usercopy.c: no check page span for stack objects
commit 7bff3c06997374fb9b9991536a547b840549a813 upstream. It is easy to trigger this with CONFIG_HARDENED_USERCOPY_PAGESPAN=y, usercopy: Kernel memory overwrite attempt detected to spans multiple pages (offset 0, size 23)! kernel BUG at mm/usercopy.c:102! For example, print_worker_info char name[WQ_NAME_LEN] = { }; char desc[WORKER_DESC_LEN] = { }; probe_kernel_read(name, wq->name, sizeof(name) - 1); probe_kernel_read(desc, worker->desc, sizeof(desc) - 1); __copy_from_user_inatomic check_object_size check_heap_object check_page_span This is because on-stack variables could cross PAGE_SIZE boundary, and failed this check, if (likely(((unsigned long)ptr & (unsigned long)PAGE_MASK) == ((unsigned long)end & (unsigned long)PAGE_MASK))) ptr = FFFF889007D7EFF8 end = FFFF889007D7F00E Hence, fix it by checking if it is a stack object first. [keescook@chromium.org: improve comments after reorder] Link: http://lkml.kernel.org/r/20190103165151.GA32845@beast Link: http://lkml.kernel.org/r/20181231030254.99441-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Christoph Lameter
|
f928ca3917 |
slab: alien caches must not be initialized if the allocation of the alien cache failed
commit 09c2e76ed734a1d36470d257a778aaba28e86531 upstream. Callers of __alloc_alien() check for NULL. We must do the same check in __alloc_alien_cache to avoid NULL pointer dereferences on allocation failures. Link: http://lkml.kernel.org/r/010001680f42f192-82b4e12e-1565-4ee0-ae1f-1e98974906aa-000000@email.amazonses.com Fixes: |
||
Joel Fernandes (Google)
|
dec031f640 |
UPSTREAM: mm/memfd: Add an F_SEAL_FUTURE_WRITE seal to memfd
Android uses ashmem for sharing memory regions. We are looking forward to migrating all usecases of ashmem to memfd so that we can possibly remove the ashmem driver in the future from staging while also benefiting from using memfd and contributing to it. Note staging drivers are also not ABI and generally can be removed at anytime. One of the main usecases Android has is the ability to create a region and mmap it as writeable, then add protection against making any "future" writes while keeping the existing already mmap'ed writeable-region active. This allows us to implement a usecase where receivers of the shared memory buffer can get a read-only view, while the sender continues to write to the buffer. See CursorWindow documentation in Android for more details: https://developer.android.com/reference/android/database/CursorWindow This usecase cannot be implemented with the existing F_SEAL_WRITE seal. To support the usecase, this patch adds a new F_SEAL_FUTURE_WRITE seal which prevents any future mmap and write syscalls from succeeding while keeping the existing mmap active. A better way to do F_SEAL_FUTURE_WRITE seal was discussed [1] last week where we don't need to modify core VFS structures to get the same behavior of the seal. This solves several side-effects pointed by Andy. self-tests are provided in later patch to verify the expected semantics. [1] https://lore.kernel.org/lkml/20181111173650.GA256781@google.com/ [Thanks a lot to Andy for suggestions to improve code] Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Acked-by: John Stultz <john.stultz@linaro.org> Change-Id: I6710c045954378f87bfbff6311d372a3b8549064 |
||
Joel Fernandes
|
198aac25b7 |
Revert "UPSTREAM: mm: Add an F_SEAL_FUTURE_WRITE seal to memfd"
This reverts commit
|
||
Joel Fernandes
|
3a49374afc |
Revert "UPSTREAM: mm/memfd: make F_SEAL_FUTURE_WRITE seal more robust"
This reverts commit
|
||
Greg Kroah-Hartman
|
caf54339d3 |
This is the 4.19.15 stable release
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlw6+/8ACgkQONu9yGCS
aT6VKw/9FUsbfy4MzFMH4XmTn/k9AHhcYdQ+gSEIcJbt/JLT13fU64e/O8QlQ3PF
5GWNY5ObA+HKlReCufSuW+AuAw5s/FLVaGLn8HZQ/FU27ZgTrGpFjb3vcnYSjsU0
vurXjstzndiRmpSahNufU6t2X7fkgyd41M94572pyidcT5NcP+ngVICwXtQOsXjH
QkIaMZHTmr4le0Z1oNvDraNkESJnxo7+D2eJebx5yDReD/Mdm3gAl2q0UkDXpZzk
qb3tH1oronm7ZfiEBCZYrewxMfz78ugJW3hpOu//JCbrVI2Ja0sBSh3VB6EFceoY
WI9z8JkZ3xQeLQnCdiabdQ66mGQa9XiLUwj7+sR//P7OduwJEv8HTYpDi8iqA6Vj
SigQmjEunjSHccqBWaPy1ZMAIXoNWQBC4EJ2erv3pAPyJr2FBw9o2Bmu6JAV18ow
iX94YnQtllZp8cJsEKEUWEmXZPLcTy6mXLMLoQ922P4p4KRJVQUhde4EeZZLFn27
6sPwASnrfEW9RS/i1XuxdDPbnMYg6uE0UoRfxp1tAUBKaVArjMglyIAj7t9GA07W
4480c3AegmDFZ+GxX+w5+duKRZnxBi+sHw8aBbZRi5m9mlxeFCSWSe0hPPRR2LIQ
fZrFySHmgbl1NtTP4cvZOb7bTxoyfjcIQfiqu7cwNsYGXtbfOuk=
=A6Ro
-----END PGP SIGNATURE-----
Merge 4.19.15 into android-4.19
Changes in 4.19.15
ARM: dts: sun8i: a83t: bananapi-m3: increase vcc-pd voltage to 3.3V
pinctrl: meson: fix pull enable register calculation
arm64: dts: mt7622: fix no more console output on rfb1
powerpc: Fix COFF zImage booting on old powermacs
powerpc/mm: Fix linux page tables build with some configs
HID: ite: Add USB id match for another ITE based keyboard rfkill key quirk
ARM: dts: imx7d-pico: Describe the Wifi clock
ARM: imx: update the cpu power up timing setting on i.mx6sx
ARM: dts: imx7d-nitrogen7: Fix the description of the Wifi clock
IB/mlx5: Block DEVX umem from the non applicable cases
Input: restore EV_ABS ABS_RESERVED
powerpc/mm: Fallback to RAM if the altmap is unusable
drm/amdgpu: Fix DEBUG_LOCKS_WARN_ON(depth <= 0) in amdgpu_ctx.lock
IB/core: Fix oops in netdev_next_upper_dev_rcu()
checkstack.pl: fix for aarch64
xfrm: Fix error return code in xfrm_output_one()
xfrm: Fix bucket count reported to userspace
xfrm: Fix NULL pointer dereference in xfrm_input when skb_dst_force clears the dst_entry.
ieee802154: hwsim: fix off-by-one in parse nested
netfilter: nf_tables: fix suspicious RCU usage in nft_chain_stats_replace()
netfilter: seqadj: re-load tcp header pointer after possible head reallocation
Revert "scsi: qla2xxx: Fix NVMe Target discovery"
scsi: bnx2fc: Fix NULL dereference in error handling
Input: omap-keypad - fix idle configuration to not block SoC idle states
Input: synaptics - enable RMI on ThinkPad T560
ibmvnic: Convert reset work item mutex to spin lock
ibmvnic: Fix non-atomic memory allocation in IRQ context
ieee802154: ca8210: fix possible u8 overflow in ca8210_rx_done
x86/mm: Fix guard hole handling
x86/dump_pagetables: Fix LDT remap address marker
i40e: fix mac filter delete when setting mac address
ixgbe: Fix race when the VF driver does a reset
netfilter: ipset: do not call ipset_nest_end after nla_nest_cancel
netfilter: nat: can't use dst_hold on noref dst
netfilter: nf_conncount: use rb_link_node_rcu() instead of rb_link_node()
bnx2x: Clear fip MAC when fcoe offload support is disabled
bnx2x: Remove configured vlans as part of unload sequence.
bnx2x: Send update-svid ramrod with retry/poll flags enabled
scsi: target: iscsi: cxgbit: fix csk leak
scsi: target: iscsi: cxgbit: add missing spin_lock_init()
mt76: fix potential NULL pointer dereference in mt76_stop_tx_queues
x86, hyperv: remove PCI dependency
drivers: net: xgene: Remove unnecessary forward declarations
net/tls: Init routines in create_ctx
w90p910_ether: remove incorrect __init annotation
net: hns: Incorrect offset address used for some registers.
net: hns: All ports can not work when insmod hns ko after rmmod.
net: hns: Some registers use wrong address according to the datasheet.
net: hns: Fixed bug that netdev was opened twice
net: hns: Clean rx fbd when ae stopped.
net: hns: Free irq when exit from abnormal branch
net: hns: Avoid net reset caused by pause frames storm
net: hns: Fix ntuple-filters status error.
net: hns: Add mac pcs config when enable|disable mac
net: hns: Fix ping failed when use net bridge and send multicast
mac80211: fix a kernel panic when TXing after TXQ teardown
SUNRPC: Fix a race with XPRT_CONNECTING
qed: Fix an error code qed_ll2_start_xmit()
net: macb: fix random memory corruption on RX with 64-bit DMA
net: macb: fix dropped RX frames due to a race
net: macb: add missing barriers when reading descriptors
lan743x: Expand phy search for LAN7431
lan78xx: Resolve issue with changing MAC address
vxge: ensure data0 is initialized in when fetching firmware version information
nl80211: fix memory leak if validate_pae_over_nl80211() fails
mac80211: free skb fraglist before freeing the skb
kbuild: fix false positive warning/error about missing libelf
m68k: Fix memblock-related crashes
virtio: fix test build after uio.h change
lan743x: Remove MAC Reset from initialization
gpio: mvebu: only fail on missing clk if pwm is actually to be used
Input: synaptics - enable SMBus for HP EliteBook 840 G4
net: netxen: fix a missing check and an uninitialized use
qmi_wwan: Fix qmap header retrieval in qmimux_rx_fixup
serial/sunsu: fix refcount leak
auxdisplay: charlcd: fix x/y command parsing
scsi: zfcp: fix posting too many status read buffers leading to adapter shutdown
scsi: lpfc: do not set queue->page_count to 0 if pc_sli4_params.wqpcnt is invalid
fork: record start_time late
zram: fix double free backing device
hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined
mm, devm_memremap_pages: mark devm_memremap_pages() EXPORT_SYMBOL_GPL
mm, devm_memremap_pages: kill mapping "System RAM" support
mm, devm_memremap_pages: fix shutdown handling
mm, devm_memremap_pages: add MEMORY_DEVICE_PRIVATE support
mm, hmm: use devm semantics for hmm_devmem_{add, remove}
mm, hmm: mark hmm_devmem_{add, add_resource} EXPORT_SYMBOL_GPL
mm, swap: fix swapoff with KSM pages
memcg, oom: notify on oom killer invocation from the charge path
sunrpc: fix cache_head leak due to queued request
sunrpc: use SVC_NET() in svcauth_gss_* functions
powerpc: remove old GCC version checks
powerpc: consolidate -mno-sched-epilog into FTRACE flags
powerpc: avoid -mno-sched-epilog on GCC 4.9 and newer
powerpc: Disable -Wbuiltin-requires-header when setjmp is used
kbuild: add -no-integrated-as Clang option unconditionally
kbuild: consolidate Clang compiler flags
Makefile: Export clang toolchain variables
powerpc/boot: Set target when cross-compiling for clang
raid6/ppc: Fix build for clang
dma-direct: do not include SME mask in the DMA supported check
mt76x0: init hw capabilities
media: cx23885: only reset DMA on problematic CPUs
ALSA: cs46xx: Potential NULL dereference in probe
ALSA: usb-audio: Avoid access before bLength check in build_audio_procunit()
ALSA: usb-audio: Check mixer unit descriptors more strictly
ALSA: usb-audio: Fix an out-of-bound read in create_composite_quirks
ALSA: usb-audio: Always check descriptor sizes in parser code
srcu: Lock srcu_data structure in srcu_gp_start()
driver core: Add missing dev->bus->need_parent_lock checks
Fix failure path in alloc_pid()
block: deactivate blk_stat timer in wbt_disable_default()
block: mq-deadline: Fix write completion handling
dlm: fixed memory leaks after failed ls_remove_names allocation
dlm: possible memory leak on error path in create_lkb()
dlm: lost put_lkb on error path in receive_convert() and receive_unlock()
dlm: memory leaks on error path in dlm_user_request()
gfs2: Get rid of potential double-freeing in gfs2_create_inode
gfs2: Fix loop in gfs2_rbm_find
b43: Fix error in cordic routine
selinux: policydb - fix byte order and alignment issues
PCI / PM: Allow runtime PM without callback functions
lockd: Show pid of lockd for remote locks
nfsd4: zero-length WRITE should succeed
arm64: drop linker script hack to hide __efistub_ symbols
arm64: relocatable: fix inconsistencies in linker script and options
leds: pwm: silently error out on EPROBE_DEFER
Revert "powerpc/tm: Unset MSR[TS] if not recheckpointing"
powerpc/tm: Set MSR[TS] just prior to recheckpoint
iio: dac: ad5686: fix bit shift read register
9p/net: put a lower bound on msize
rxe: fix error completion wr_id and qp_num
RDMA/srpt: Fix a use-after-free in the channel release code
iommu/vt-d: Handle domain agaw being less than iommu agaw
sched/fair: Fix infinite loop in update_blocked_averages() by reverting
|
||
Michal Hocko
|
1c5e0be35d |
memcg, oom: notify on oom killer invocation from the charge path
commit 7056d3a37d2c6aaaab10c13e8e69adc67ec1fc65 upstream. Burt Holzman has noticed that memcg v1 doesn't notify about OOM events via eventfd anymore. The reason is that |
||
Huang Ying
|
8da70752f5 |
mm, swap: fix swapoff with KSM pages
commit 7af7a8e19f0c5425ff639b0f0d2d244c2a647724 upstream. KSM pages may be mapped to the multiple VMAs that cannot be reached from one anon_vma. So during swapin, a new copy of the page need to be generated if a different anon_vma is needed, please refer to comments of ksm_might_need_to_copy() for details. During swapoff, unuse_vma() uses anon_vma (if available) to locate VMA and virtual address mapped to the page, so not all mappings to a swapped out KSM page could be found. So in try_to_unuse(), even if the swap count of a swap entry isn't zero, the page needs to be deleted from swap cache, so that, in the next round a new page could be allocated and swapin for the other mappings of the swapped out KSM page. But this contradicts with the THP swap support. Where the THP could be deleted from swap cache only after the swap count of every swap entry in the huge swap cluster backing the THP has reach 0. So try_to_unuse() is changed in commit |
||
Dan Williams
|
0f1a62e073 |
mm, hmm: mark hmm_devmem_{add, add_resource} EXPORT_SYMBOL_GPL
commit 02917e9f8676207a4c577d4d94eae12bf348e9d7 upstream. At Maintainer Summit, Greg brought up a topic I proposed around EXPORT_SYMBOL_GPL usage. The motivation was considerations for when EXPORT_SYMBOL_GPL is warranted and the criteria for taking the exceptional step of reclassifying an existing export. Specifically, I wanted to make the case that although the line is fuzzy and hard to specify in abstract terms, it is nonetheless clear that devm_memremap_pages() and HMM (Heterogeneous Memory Management) have crossed it. The devm_memremap_pages() facility should have been EXPORT_SYMBOL_GPL from the beginning, and HMM as a derivative of that functionality should have naturally picked up that designation as well. Contrary to typical rules, the HMM infrastructure was merged upstream with zero in-tree consumers. There was a promise at the time that those users would be merged "soon", but it has been over a year with no drivers arriving. While the Nouveau driver is about to belatedly make good on that promise it is clear that HMM was targeted first and foremost at an out-of-tree consumer. HMM is derived from devm_memremap_pages(), a facility Christoph and I spearheaded to support persistent memory. It combines a device lifetime model with a dynamically created 'struct page' / memmap array for any physical address range. It enables coordination and control of the many code paths in the kernel built to interact with memory via 'struct page' objects. With HMM the integration goes even deeper by allowing device drivers to hook and manipulate page fault and page free events. One interpretation of when EXPORT_SYMBOL is suitable is when it is exporting stable and generic leaf functionality. The devm_memremap_pages() facility continues to see expanding use cases, peer-to-peer DMA being the most recent, with no clear end date when it will stop attracting reworks and semantic changes. It is not suitable to export devm_memremap_pages() as a stable 3rd party driver API due to the fact that it is still changing and manipulates core behavior. Moreover, it is not in the best interest of the long term development of the core memory management subsystem to permit any external driver to effectively define its own system-wide memory management policies with no encouragement to engage with upstream. I am also concerned that HMM was designed in a way to minimize further engagement with the core-MM. That, with these hooks in place, device-drivers are free to implement their own policies without much consideration for whether and how the core-MM could grow to meet that need. Going forward not only should HMM be EXPORT_SYMBOL_GPL, but the core-MM should be allowed the opportunity and stimulus to change and address these new use cases as first class functionality. Original changelog: hmm_devmem_add(), and hmm_devmem_add_resource() duplicated devm_memremap_pages() and are now simple now wrappers around the core facility to inject a dev_pagemap instance into the global pgmap_radix and hook page-idle events. The devm_memremap_pages() interface is base infrastructure for HMM. HMM has more and deeper ties into the kernel memory management implementation than base ZONE_DEVICE which is itself a EXPORT_SYMBOL_GPL facility. Originally, the HMM page structure creation routines copied the devm_memremap_pages() code and reused ZONE_DEVICE. A cleanup to unify the implementations was discussed during the initial review: http://lkml.iu.edu/hypermail/linux/kernel/1701.2/00812.html Recent work to extend devm_memremap_pages() for the peer-to-peer-DMA facility enabled this cleanup to move forward. In addition to the integration with devm_memremap_pages() HMM depends on other GPL-only symbols: mmu_notifier_unregister_no_release percpu_ref region_intersects __class_create It goes further to consume / indirectly expose functionality that is not exported to any other driver: alloc_pages_vma walk_page_range HMM is derived from devm_memremap_pages(), and extends deep core-kernel fundamentals. Similar to devm_memremap_pages(), mark its entry points EXPORT_SYMBOL_GPL(). [logang@deltatee.com: PCI/P2PDMA: match interface changes to devm_memremap_pages()] Link: http://lkml.kernel.org/r/20181130225911.2900-1-logang@deltatee.com Link: http://lkml.kernel.org/r/154275560565.76910.15919297436557795278.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Balbir Singh <bsingharora@gmail.com>, Cc: Michal Hocko <mhocko@suse.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Dan Williams
|
e890a86706 |
mm, hmm: use devm semantics for hmm_devmem_{add, remove}
commit 58ef15b765af0d2cbe6799ec564f1dc485010ab8 upstream. devm semantics arrange for resources to be torn down when device-driver-probe fails or when device-driver-release completes. Similar to devm_memremap_pages() there is no need to support an explicit remove operation when the users properly adhere to devm semantics. Note that devm_kzalloc() automatically handles allocating node-local memory. Link: http://lkml.kernel.org/r/154275559545.76910.9186690723515469051.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Michal Hocko
|
2c87072a3b |
hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined
commit b15c87263a69272423771118c653e9a1d0672caa upstream. We have received a bug report that an injected MCE about faulty memory prevents memory offline to succeed on 4.4 base kernel. The underlying reason was that the HWPoison page has an elevated reference count and the migration keeps failing. There are two problems with that. First of all it is dubious to migrate the poisoned page because we know that accessing that memory is possible to fail. Secondly it doesn't make any sense to migrate a potentially broken content and preserve the memory corruption over to a new location. Oscar has found out that 4.4 and the current upstream kernels behave slightly differently with his simply testcase === int main(void) { int ret; int i; int fd; char *array = malloc(4096); char *array_locked = malloc(4096); fd = open("/tmp/data", O_RDONLY); read(fd, array, 4095); for (i = 0; i < 4096; i++) array_locked[i] = 'd'; ret = mlock((void *)PAGE_ALIGN((unsigned long)array_locked), sizeof(array_locked)); if (ret) perror("mlock"); sleep (20); ret = madvise((void *)PAGE_ALIGN((unsigned long)array_locked), 4096, MADV_HWPOISON); if (ret) perror("madvise"); for (i = 0; i < 4096; i++) array_locked[i] = 'd'; return 0; } === + offline this memory. In 4.4 kernels he saw the hwpoisoned page to be returned back to the LRU list kernel: [<ffffffff81019ac9>] dump_trace+0x59/0x340 kernel: [<ffffffff81019e9a>] show_stack_log_lvl+0xea/0x170 kernel: [<ffffffff8101ac71>] show_stack+0x21/0x40 kernel: [<ffffffff8132bb90>] dump_stack+0x5c/0x7c kernel: [<ffffffff810815a1>] warn_slowpath_common+0x81/0xb0 kernel: [<ffffffff811a275c>] __pagevec_lru_add_fn+0x14c/0x160 kernel: [<ffffffff811a2eed>] pagevec_lru_move_fn+0xad/0x100 kernel: [<ffffffff811a334c>] __lru_cache_add+0x6c/0xb0 kernel: [<ffffffff81195236>] add_to_page_cache_lru+0x46/0x70 kernel: [<ffffffffa02b4373>] extent_readpages+0xc3/0x1a0 [btrfs] kernel: [<ffffffff811a16d7>] __do_page_cache_readahead+0x177/0x200 kernel: [<ffffffff811a18c8>] ondemand_readahead+0x168/0x2a0 kernel: [<ffffffff8119673f>] generic_file_read_iter+0x41f/0x660 kernel: [<ffffffff8120e50d>] __vfs_read+0xcd/0x140 kernel: [<ffffffff8120e9ea>] vfs_read+0x7a/0x120 kernel: [<ffffffff8121404b>] kernel_read+0x3b/0x50 kernel: [<ffffffff81215c80>] do_execveat_common.isra.29+0x490/0x6f0 kernel: [<ffffffff81215f08>] do_execve+0x28/0x30 kernel: [<ffffffff81095ddb>] call_usermodehelper_exec_async+0xfb/0x130 kernel: [<ffffffff8161c045>] ret_from_fork+0x55/0x80 And that latter confuses the hotremove path because an LRU page is attempted to be migrated and that fails due to an elevated reference count. It is quite possible that the reuse of the HWPoisoned page is some kind of fixed race condition but I am not really sure about that. With the upstream kernel the failure is slightly different. The page doesn't seem to have LRU bit set but isolate_movable_page simply fails and do_migrate_range simply puts all the isolated pages back to LRU and therefore no progress is made and scan_movable_pages finds same set of pages over and over again. Fix both cases by explicitly checking HWPoisoned pages before we even try to get reference on the page, try to unmap it if it is still mapped. As explained by Naoya: : Hwpoison code never unmapped those for no big reason because : Ksm pages never dominate memory, so we simply didn't have strong : motivation to save the pages. Also put WARN_ON(PageLRU) in case there is a race and we can hit LRU HWPoison pages which shouldn't happen but I couldn't convince myself about that. Naoya has noted the following: : Theoretically no such gurantee, because try_to_unmap() doesn't have a : guarantee of success and then memory_failure() returns immediately : when hwpoison_user_mappings fails. : Or the following code (comes after hwpoison_user_mappings block) also impli= : es : that the target page can still have PageLRU flag. : : /* : * Torn down by someone else? : */ : if (PageLRU(p) && !PageSwapCache(p) && p->mapping =3D=3D NULL) { : action_result(pfn, MF_MSG_TRUNCATED_LRU, MF_IGNORED); : res =3D -EBUSY; : goto out; : } : : So I think it's OK to keep "if (WARN_ON(PageLRU(page)))" block in : current version of your patch. Link: http://lkml.kernel.org/r/20181206120135.14079-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.com> Debugged-by: Oscar Salvador <osalvador@suse.com> Tested-by: Oscar Salvador <osalvador@suse.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Greg Kroah-Hartman
|
a872d2d074 |
This is the 4.19.13 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlwnaqgACgkQONu9yGCS aT5BPA//TXG7P4K4Nor0eu6CLJ8KaO3wZSneHUp+cV3/zZsPe/6K4pgwz5Kmho7R ii82FuXKTwqr+CLegkGlwF01q/HFT7u487Yz1eqdrf3oqvjQjC9+ut/qO//9JePd OLZCCrPtFqWT8ClpHhxWA3skYx9UsnBxseUFE+cMCuTVin1/YGQc/xV6CQBgZfs3 V3dfmv9D1lCZ1nlvgEHh+VMvqlvnBEgUufLYZZEb6yK9GVQuRk+piXMf2rxm1RuN aBZHVI4tdHhYkEbhQ46ADaPLBghNeSoa2bIBnHu0G1YO+oRewQlVM/rEvMv+XOdX GoRSX1fNYZUjI0u6EsDw0WPBILoJaLmXF8bIH3hTmTkTev4Vslyiuz0SJNwLwrkx 0Zzg2D+AF9MdvO4EBwoAnqwzO2lM6WkIsHp85NMymggp5+VL1yuuo0kr7OMw51Rl U5ReIwcq+7TZp3WtqUQHEGO5TOfPoAdW8sINcQeWTjod6c3EHPxmvrS8EE6KgPI1 o+jE2j+uxUbgzzeq4ovJvsJj28WKqZ0jCLyMozCN6hpzki+S5qzNHYMYz3quZGQH GN82w5cZGrtPFHAm1Ft5hVB+uS9vj6+84jIprFVYwPnBN6f5tK8Rjsz5cJ5Oh7UW q5EAuLxcLt+5v2TMYlZRNLg/fzZBS3FnZy0KLx8XSJ+jm6E4LM0= =GR64 -----END PGP SIGNATURE----- Merge 4.19.13 into android-4.19 Changes in 4.19.13 iomap: Revert "fs/iomap.c: get/put the page in iomap_page_create/release()" Revert "vfs: Allow userns root to call mknod on owned filesystems." USB: hso: Fix OOB memory access in hso_probe/hso_get_config_data xhci: Don't prevent USB2 bus suspend in state check intended for USB3 only USB: xhci: fix 'broken_suspend' placement in struct xchi_hcd USB: serial: option: add GosunCn ZTE WeLink ME3630 USB: serial: option: add HP lt4132 USB: serial: option: add Simcom SIM7500/SIM7600 (MBIM mode) USB: serial: option: add Fibocom NL668 series USB: serial: option: add Telit LN940 series ubifs: Handle re-linking of inodes correctly while recovery scsi: t10-pi: Return correct ref tag when queue has no integrity profile scsi: sd: use mempool for discard special page mmc: core: Reset HPI enabled state during re-init and in case of errors mmc: core: Allow BKOPS and CACHE ctrl even if no HPI support mmc: core: Use a minimum 1600ms timeout when enabling CACHE ctrl mmc: omap_hsmmc: fix DMA API warning gpio: max7301: fix driver for use with CONFIG_VMAP_STACK gpiolib-acpi: Only defer request_irq for GpioInt ACPI event handlers posix-timers: Fix division by zero bug KVM: X86: Fix NULL deref in vcpu_scan_ioapic kvm: x86: Add AMD's EX_CFG to the list of ignored MSRs KVM: Fix UAF in nested posted interrupt processing Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels futex: Cure exit race x86/mtrr: Don't copy uninitialized gentry fields back to userspace x86/mm: Fix decoy address handling vs 32-bit builds x86/vdso: Pass --eh-frame-hdr to the linker x86/intel_rdt: Ensure a CPU remains online for the region's pseudo-locking sequence panic: avoid deadlocks in re-entrant console drivers mm: add mm_pxd_folded checks to pgtable_bytes accounting functions mm: make the __PAGETABLE_PxD_FOLDED defines non-empty mm: introduce mm_[p4d|pud|pmd]_folded xfrm_user: fix freeing of xfrm states on acquire rtlwifi: Fix leak of skb when processing C2H_BT_INFO iwlwifi: mvm: don't send GEO_TX_POWER_LIMIT to old firmwares Revert "mwifiex: restructure rx_reorder_tbl_lock usage" iwlwifi: add new cards for 9560, 9462, 9461 and killer series media: ov5640: Fix set format regression mm, memory_hotplug: initialize struct pages for the full memory section mm: thp: fix flags for pmd migration when split mm, page_alloc: fix has_unmovable_pages for HugePages mm: don't miss the last page because of round-off error Input: elantech - disable elan-i2c for P52 and P72 proc/sysctl: don't return ENOMEM on lookup when a table is unregistering drm/ioctl: Fix Spectre v1 vulnerabilities Linux 4.19.13 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Roman Gushchin
|
a5e8809697 |
mm: don't miss the last page because of round-off error
commit 68600f623d69da428c6163275f97ca126e1a8ec5 upstream. I've noticed, that dying memory cgroups are often pinned in memory by a single pagecache page. Even under moderate memory pressure they sometimes stayed in such state for a long time. That looked strange. My investigation showed that the problem is caused by applying the LRU pressure balancing math: scan = div64_u64(scan * fraction[lru], denominator), where denominator = fraction[anon] + fraction[file] + 1. Because fraction[lru] is always less than denominator, if the initial scan size is 1, the result is always 0. This means the last page is not scanned and has no chances to be reclaimed. Fix this by rounding up the result of the division. In practice this change significantly improves the speed of dying cgroups reclaim. [guro@fb.com: prevent double calculation of DIV64_U64_ROUND_UP() arguments] Link: http://lkml.kernel.org/r/20180829213311.GA13501@castle Link: http://lkml.kernel.org/r/20180827162621.30187-3-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Oscar Salvador
|
e27666dd8f |
mm, page_alloc: fix has_unmovable_pages for HugePages
commit 17e2e7d7e1b83fa324b3f099bfe426659aa3c2a4 upstream. While playing with gigantic hugepages and memory_hotplug, I triggered the following #PF when "cat memoryX/removable": BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 1481 Comm: cat Tainted: G E 4.20.0-rc6-mm1-1-default+ #18 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:has_unmovable_pages+0x154/0x210 Call Trace: is_mem_section_removable+0x7d/0x100 removable_show+0x90/0xb0 dev_attr_show+0x1c/0x50 sysfs_kf_seq_show+0xca/0x1b0 seq_read+0x133/0x380 __vfs_read+0x26/0x180 vfs_read+0x89/0x140 ksys_read+0x42/0x90 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The reason is we do not pass the Head to page_hstate(), and so, the call to compound_order() in page_hstate() returns 0, so we end up checking all hstates's size to match PAGE_SIZE. Obviously, we do not find any hstate matching that size, and we return NULL. Then, we dereference that NULL pointer in hugepage_migration_supported() and we got the #PF from above. Fix that by getting the head page before calling page_hstate(). Also, since gigantic pages span several pageblocks, re-adjust the logic for skipping pages. While are it, we can also get rid of the round_up(). [osalvador@suse.de: remove round_up(), adjust skip pages logic per Michal] Link: http://lkml.kernel.org/r/20181221062809.31771-1-osalvador@suse.de Link: http://lkml.kernel.org/r/20181217225113.17864-1-osalvador@suse.de Signed-off-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Peter Xu
|
161a5654cf |
mm: thp: fix flags for pmd migration when split
commit 2e83ee1d8694a61d0d95a5b694f2e61e8dde8627 upstream. When splitting a huge migrating PMD, we'll transfer all the existing PMD bits and apply them again onto the small PTEs. However we are fetching the bits unconditionally via pmd_soft_dirty(), pmd_write() or pmd_yound() while actually they don't make sense at all when it's a migration entry. Fix them up. Since at it, drop the ifdef together as not needed. Note that if my understanding is correct about the problem then if without the patch there is chance to lose some of the dirty bits in the migrating pmd pages (on x86_64 we're fetching bit 11 which is part of swap offset instead of bit 2) and it could potentially corrupt the memory of an userspace program which depends on the dirty bit. Link: http://lkml.kernel.org/r/20181213051510.20306-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: <stable@vger.kernel.org> [4.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Mikhail Zaslonko
|
7592dbfaf3 |
mm, memory_hotplug: initialize struct pages for the full memory section
commit 2830bf6f05fb3e05bc4743274b806c821807a684 upstream. If memory end is not aligned with the sparse memory section boundary, the mapping of such a section is only partly initialized. This may lead to VM_BUG_ON due to uninitialized struct page access from is_mem_section_removable() or test_pages_in_a_zone() function triggered by memory_hotplug sysfs handlers: Here are the the panic examples: CONFIG_DEBUG_VM=y CONFIG_DEBUG_VM_PGFLAGS=y kernel parameter mem=2050M -------------------------- page:000003d082008000 is uninitialized and poisoned page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) Call Trace: ( test_pages_in_a_zone+0xde/0x160) show_valid_zones+0x5c/0x190 dev_attr_show+0x34/0x70 sysfs_kf_seq_show+0xc8/0x148 seq_read+0x204/0x480 __vfs_read+0x32/0x178 vfs_read+0x82/0x138 ksys_read+0x5a/0xb0 system_call+0xdc/0x2d8 Last Breaking-Event-Address: test_pages_in_a_zone+0xde/0x160 Kernel panic - not syncing: Fatal exception: panic_on_oops kernel parameter mem=3075M -------------------------- page:000003d08300c000 is uninitialized and poisoned page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) Call Trace: ( is_mem_section_removable+0xb4/0x190) show_mem_removable+0x9a/0xd8 dev_attr_show+0x34/0x70 sysfs_kf_seq_show+0xc8/0x148 seq_read+0x204/0x480 __vfs_read+0x32/0x178 vfs_read+0x82/0x138 ksys_read+0x5a/0xb0 system_call+0xdc/0x2d8 Last Breaking-Event-Address: is_mem_section_removable+0xb4/0x190 Kernel panic - not syncing: Fatal exception: panic_on_oops Fix the problem by initializing the last memory section of each zone in memmap_init_zone() till the very end, even if it goes beyond the zone end. Michal said: : This has alwways been problem AFAIU. It just went unnoticed because we : have zeroed memmaps during allocation before |
||
Joel Fernandes (Google)
|
2e0d7ea44a |
UPSTREAM: mm/memfd: make F_SEAL_FUTURE_WRITE seal more robust
A better way to do F_SEAL_FUTURE_WRITE seal was discussed [1] last week where we don't need to modify core VFS structures to get the same behavior of the seal. This solves several side-effects pointed out by Andy [2]. [1] https://lore.kernel.org/lkml/20181111173650.GA256781@google.com/ [2] https://lore.kernel.org/lkml/69CE06CC-E47C-4992-848A-66EB23EE6C74@amacapital.net/ Suggested-by: Andy Lutomirski <luto@kernel.org> Fixes: 5e653c2923fd ("mm: Add an F_SEAL_FUTURE_WRITE seal to memfd") Change-id: I5d2414cfcf8ac42d3632d0b0dc960c742d490e2f Verified with test program at: https://lore.kernel.org/patchwork/patch/1008117/ Backport link: https://lore.kernel.org/patchwork/patch/1014892/ Bug: 113362644 Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> |
||
Joel Fernandes (Google)
|
1dc8ca4429 |
UPSTREAM: mm: Add an F_SEAL_FUTURE_WRITE seal to memfd
Android uses ashmem for sharing memory regions. We are looking forward to migrating all usecases of ashmem to memfd so that we can possibly remove the ashmem driver in the future from staging while also benefiting from using memfd and contributing to it. Note staging drivers are also not ABI and generally can be removed at anytime. One of the main usecases Android has is the ability to create a region and mmap it as writeable, then add protection against making any "future" writes while keeping the existing already mmap'ed writeable-region active. This allows us to implement a usecase where receivers of the shared memory buffer can get a read-only view, while the sender continues to write to the buffer. See CursorWindow documentation in Android for more details: https://developer.android.com/reference/android/database/CursorWindow This usecase cannot be implemented with the existing F_SEAL_WRITE seal. To support the usecase, this patch adds a new F_SEAL_FUTURE_WRITE seal which prevents any future mmap and write syscalls from succeeding while keeping the existing mmap active. The following program shows the seal working in action: #include <stdio.h> #include <errno.h> #include <sys/mman.h> #include <linux/memfd.h> #include <linux/fcntl.h> #include <asm/unistd.h> #include <unistd.h> #define F_SEAL_FUTURE_WRITE 0x0010 #define REGION_SIZE (5 * 1024 * 1024) int memfd_create_region(const char *name, size_t size) { int ret; int fd = syscall(__NR_memfd_create, name, MFD_ALLOW_SEALING); if (fd < 0) return fd; ret = ftruncate(fd, size); if (ret < 0) { close(fd); return ret; } return fd; } int main() { int ret, fd; void *addr, *addr2, *addr3, *addr1; ret = memfd_create_region("test_region", REGION_SIZE); printf("ret=%d\n", ret); fd = ret; // Create map addr = mmap(0, REGION_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); if (addr == MAP_FAILED) printf("map 0 failed\n"); else printf("map 0 passed\n"); if ((ret = write(fd, "test", 4)) != 4) printf("write failed even though no future-write seal " "(ret=%d errno =%d)\n", ret, errno); else printf("write passed\n"); addr1 = mmap(0, REGION_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); if (addr1 == MAP_FAILED) perror("map 1 prot-write failed even though no seal\n"); else printf("map 1 prot-write passed as expected\n"); ret = fcntl(fd, F_ADD_SEALS, F_SEAL_FUTURE_WRITE | F_SEAL_GROW | F_SEAL_SHRINK); if (ret == -1) printf("fcntl failed, errno: %d\n", errno); else printf("future-write seal now active\n"); if ((ret = write(fd, "test", 4)) != 4) printf("write failed as expected due to future-write seal\n"); else printf("write passed (unexpected)\n"); addr2 = mmap(0, REGION_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); if (addr2 == MAP_FAILED) perror("map 2 prot-write failed as expected due to seal\n"); else printf("map 2 passed\n"); addr3 = mmap(0, REGION_SIZE, PROT_READ, MAP_SHARED, fd, 0); if (addr3 == MAP_FAILED) perror("map 3 failed\n"); else printf("map 3 prot-read passed as expected\n"); } The output of running this program is as follows: ret=3 map 0 passed write passed map 1 prot-write passed as expected future-write seal now active write failed as expected due to future-write seal map 2 prot-write failed as expected due to seal : Permission denied map 3 prot-read passed as expected Cc: jreck@google.com Cc: john.stultz@linaro.org Cc: tkjos@google.com Cc: gregkh@linuxfoundation.org Cc: hch@infradead.org Reviewed-by: John Stultz <john.stultz@linaro.org> Reported-by: Jann Horn <jannh@google.com> Change-id: Ie702c08f1f41ce78e5e8db3f415446ceedb01795 Bug: 113362644 Verified with test program at: https://lore.kernel.org/patchwork/patch/1008117/ Backport link: https://lore.kernel.org/patchwork/patch/1014892/ Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> |
||
Greg Kroah-Hartman
|
67319b77a0 |
This is the 4.19.10 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlwXXUsACgkQONu9yGCS aT5lHBAAm4DiCe303AjPWGQauwDWZPhXcF2ieF/gx77TSotIonxRa4w4nQdQAxVh aIiMzyxihwtgd6bCMNMkCjImWqUw+f189D6RGzKJZYLCB39HCPskJ6oMPvuzNXAL yF1+84288ZY+Z9DXxK3T9x8KJUj5qXexjgoMfdS9+lJWku/BsCTPFk8tIjxY5bI9 hMSIePIfvZqmXWuz7Btw9uykOYwAzk3tqcVv1P1vSeWaUE7dWQts17NUZhnDt5zp alSnmUUt7I7w+9CWpORFOHC+ekfltf/7VjIVgzBf9cKTgxGeZ8+htceYGTRIwegg kzU4cq8IZGWp+Umfhm9r7vWxf+tjdil42dYkiDWs/XnbKVw5f2UFi8c2rAItmfVw vpSZK1hgUFm8dojOFIjbJF2AfhLpDDSqKuZNhw1SIzDmsA6rV8cLNdQx+suL9Xc5 JoL+b1wH1uvrPnSOloScakF32gjsrU5mReP+yPgl3LNc1Hn/Nu85262i4OEzs+Od Kmy/TfaRWYlWWtejH3fydmVGGadJ4owNYqhuB9eYQgBKWbcSShDXZmvJ+VKVdmcs k9Nz/Lyt4GxrFYiaWGuQeE0VTG9z87FwQvuikYKJF7FptN4kixBITfzRlKh3JbM4 sR/nASeAvGiv5WrwszcM6AJ0Ps0yzZJr5JZ1w7wbWX84QH457mo= =bF+8 -----END PGP SIGNATURE----- Merge 4.19.10 into android-4.19 Changes in 4.19.10 ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes ipv6: Check available headroom in ip6_xmit() even without options neighbour: Avoid writing before skb->head in neigh_hh_output() ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output net: 8139cp: fix a BUG triggered by changing mtu with network traffic net/mlx4_core: Correctly set PFC param if global pause is turned off. net/mlx4_en: Change min MTU size to ETH_MIN_MTU net: phy: don't allow __set_phy_supported to add unsupported modes net: Prevent invalid access to skb->prev in __qdisc_drop_all net: use skb_list_del_init() to remove from RX sublists Revert "net/ibm/emac: wrong bit is used for STA control" rtnetlink: ndo_dflt_fdb_dump() only work for ARPHRD_ETHER devices sctp: kfree_rcu asoc tcp: Do not underestimate rwnd_limited tcp: fix NULL ref in tail loss probe tun: forbid iface creation with rtnl ops virtio-net: keep vnet header zeroed after processing XDP net: phy: sfp: correct store of detected link modes sctp: update frag_point when stream_interleave is set net: restore call to netdev_queue_numa_node_write when resetting XPS net: fix XPS static_key accounting ARM: OMAP2+: prm44xx: Fix section annotation on omap44xx_prm_enable_io_wakeup ASoC: rsnd: fixup clock start checker ASoC: qdsp6: q6afe: Fix wrong MI2S SD line mask ASoC: qdsp6: q6afe-dai: Fix the dai widgets staging: rtl8723bs: Fix the return value in case of error in 'rtw_wx_read32()' ARM: dts: am3517: Fix pinmuxing for CD on MMC1 ARM: dts: LogicPD Torpedo: Fix mmc3_dat1 interrupt ARM: dts: logicpd-somlv: Fix interrupt on mmc3_dat1 ARM: dts: am3517-som: Fix WL127x Wifi interrupt ARM: OMAP1: ams-delta: Fix possible use of uninitialized field tools: bpftool: prevent infinite loop in get_fdinfo() ASoC: sun8i-codec: fix crash on module removal arm64: dts: sdm845-mtp: Reserve reserved gpios sysv: return 'err' instead of 0 in __sysv_write_inode netfilter: nf_conncount: use spin_lock_bh instead of spin_lock netfilter: nf_conncount: fix list_del corruption in conn_free netfilter: nf_conncount: fix unexpected permanent node of list. netfilter: nf_tables: don't skip inactive chains during update selftests: add script to stress-test nft packet path vs. control plane perf tools: Fix crash on synthesizing the unit netfilter: xt_RATEEST: remove netns exit routine netfilter: nf_tables: fix use-after-free when deleting compat expressions s390/cio: Fix cleanup of pfn_array alloc failure s390/cio: Fix cleanup when unsupported IDA format is used hwmon (ina2xx) Fix NULL id pointer in probe() hwmon: (raspberrypi) Fix initial notify ASoC: rockchip: add missing slave_config setting for I2S ASoC: wm_adsp: Fix dma-unsafe read of scratch registers ASoC: Intel: Power down links before turning off display audio power ASoC: qcom: Set dai_link id to each dai_link s390/cpum_cf: Reject request for sampling in event initialization hwmon: (ina2xx) Fix current value calculation ASoC: omap-abe-twl6040: Fix missing audio card caused by deferred probing ASoC: dapm: Recalculate audio map forcely when card instantiated spi: omap2-mcspi: Add missing suspend and resume calls hwmon: (mlxreg-fan) Fix macros for tacho fault reading bpf: allocate local storage buffers using GFP_ATOMIC aio: fix failure to put the file pointer netfilter: xt_hashlimit: fix a possible memory leak in htable_create() hwmon: (w83795) temp4_type has writable permission perf tools: Restore proper cwd on return from mnt namespace PCI: imx6: Fix link training status detection in link up check ASoC: acpi: fix: continue searching when machine is ignored objtool: Fix double-free in .cold detection error path objtool: Fix segfault in .cold detection with -ffunction-sections phy: qcom-qusb2: Use HSTX_TRIM fused value as is phy: qcom-qusb2: Fix HSTX_TRIM tuning with fused value for SDM845 ARM: dts: at91: sama5d2: use the divided clock for SMC Btrfs: send, fix infinite loop due to directory rename dependencies RDMA/mlx5: Fix fence type for IB_WR_LOCAL_INV WR RDMA/core: Add GIDs while changing MAC addr only for registered ndev RDMA/bnxt_re: Fix system hang when registration with L2 driver fails RDMA/bnxt_re: Avoid accessing the device structure after it is freed RDMA/rdmavt: Fix rvt_create_ah function signature tools: bpftool: fix potential NULL pointer dereference in do_load ASoC: omap-mcbsp: Fix latency value calculation for pm_qos ASoC: omap-mcpdm: Add pm_qos handling to avoid under/overruns with CPU_IDLE ASoC: omap-dmic: Add pm_qos handling to avoid overruns with CPU_IDLE exportfs: do not read dentry after free RDMA/hns: Bugfix pbl configuration for rereg mr bpf: fix check of allowed specifiers in bpf_trace_printk fsi: master-ast-cf: select GENERIC_ALLOCATOR ipvs: call ip_vs_dst_notifier earlier than ipv6_dev_notf USB: omap_udc: use devm_request_irq() USB: omap_udc: fix crashes on probe error and module removal USB: omap_udc: fix omap_udc_start() on 15xx machines USB: omap_udc: fix USB gadget functionality on Palm Tungsten E USB: omap_udc: fix rejection of out transfers when DMA is used thunderbolt: Prevent root port runtime suspend during NVM upgrade drm/meson: add support for 1080p25 mode netfilter: ipv6: Preserve link scope traffic original oif IB/mlx5: Fix page fault handling for MW netfilter: add missing error handling code for register functions netfilter: nat: fix double register in masquerade modules netfilter: nf_conncount: remove wrong condition check routine KVM: VMX: Update shared MSRs to be saved/restored on MSR_EFER.LMA changes KVM: x86: fix empty-body warnings x86/kvm/vmx: fix old-style function declaration net: thunderx: fix NULL pointer dereference in nic_remove usb: gadget: u_ether: fix unsafe list iteration netfilter: nf_tables: deactivate expressions in rule replecement routine ALSA: usb-audio: Add vendor and product name for Dell WD19 Dock cachefiles: Fix an assertion failure when trying to update a failed object fscache: Fix race in fscache_op_complete() due to split atomic_sub & read cachefiles: Fix page leak in cachefiles_read_backing_file while vmscan is active igb: fix uninitialized variables ixgbe: recognize 1000BaseLX SFP modules as 1Gbps net: hisilicon: remove unexpected free_netdev drm/amdgpu: Add delay after enable RLC ucode drm/ast: fixed reading monitor EDID not stable issue xen: xlate_mmu: add missing header to fix 'W=1' warning Revert "xen/balloon: Mark unallocated host memory as UNUSABLE" pvcalls-front: fixes incorrect error handling pstore/ram: Correctly calculate usable PRZ bytes afs: Fix validation/callback interaction fscache: fix race between enablement and dropping of object cachefiles: Explicitly cast enumerated type in put_object fscache, cachefiles: remove redundant variable 'cache' nvme: warn when finding multi-port subsystems without multipathing enabled nvme: flush namespace scanning work just before removing namespaces nvme-rdma: fix double freeing of async event data ACPI/IORT: Fix iort_get_platform_device_domain() uninitialized pointer value ocfs2: fix deadlock caused by ocfs2_defrag_extent() mm/page_alloc.c: fix calculation of pgdat->nr_zones hfs: do not free node before using hfsplus: do not free node before using debugobjects: avoid recursive calls with kmemleak proc: fixup map_files test on arm kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace initramfs: clean old path before creating a hardlink ocfs2: fix potential use after free flexfiles: enforce per-mirror stateid only for v4 DSes dax: Check page->mapping isn't NULL ALSA: fireface: fix reference to wrong register for clock configuration ALSA: hda/realtek - Fixed headphone issue for ALC700 ALSA: hda/realtek: ALC294 mic and headset-mode fixups for ASUS X542UN ALSA: hda/realtek: Enable audio jacks of ASUS UX533FD with ALC294 ALSA: hda/realtek: Enable audio jacks of ASUS UX433FN/UX333FA with ALC294 ALSA: hda/realtek - Fix the mute LED regresion on Lenovo X1 Carbon IB/hfi1: Fix an out-of-bounds access in get_hw_stats bpf: fix off-by-one error in adjust_subprog_starts tcp: lack of available data can also cause TSO defer Linux 4.19.10 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Wei Yang
|
505bc9f389 |
mm/page_alloc.c: fix calculation of pgdat->nr_zones
[ Upstream commit 8f416836c0d50b198cad1225132e5abebf8980dc ]
init_currently_empty_zone() will adjust pgdat->nr_zones and set it to
'zone_idx(zone) + 1' unconditionally. This is correct in the normal
case, while not exact in hot-plug situation.
This function is used in two places:
* free_area_init_core()
* move_pfn_range_to_zone()
In the first case, we are sure zone index increase monotonically. While
in the second one, this is under users control.
One way to reproduce this is:
----------------------------
1. create a virtual machine with empty node1
-m 4G,slots=32,maxmem=32G \
-smp 4,maxcpus=8 \
-numa node,nodeid=0,mem=4G,cpus=0-3 \
-numa node,nodeid=1,mem=0G,cpus=4-7
2. hot-add cpu 3-7
cpu-add [3-7]
2. hot-add memory to nod1
object_add memory-backend-ram,id=ram0,size=1G
device_add pc-dimm,id=dimm0,memdev=ram0,node=1
3. online memory with following order
echo online_movable > memory47/state
echo online > memory40/state
After this, node1 will have its nr_zones equals to (ZONE_NORMAL + 1)
instead of (ZONE_MOVABLE + 1).
Michal said:
"Having an incorrect nr_zones might result in all sorts of problems
which would be quite hard to debug (e.g. reclaim not considering the
movable zone). I do not expect many users would suffer from this it
but still this is trivial and obviously right thing to do so
backporting to the stable tree shouldn't be harmful (last famous
words)"
Link: http://lkml.kernel.org/r/20181117022022.9956-1-richard.weiyang@gmail.com
Fixes:
|
||
Greg Kroah-Hartman
|
49fe708f16 |
This is the 4.19.8 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlwLshEACgkQONu9yGCS aT4wJA//V/G9RbjbXaY9kjfMQW/mgySwfPmhvyzS1O9J3ic3b5WVO1J547UkWyd9 DwjIOUNx8IGDTLiAs15Z92CqKYOxpGp9zy0hbNMLXE3WTLXyyg94K/jlk6jk3vXw jCvYGQaQuMyNhPr8chS3Nmkdqx3ZLC1NmmGIBSRJevseWXe2yVowTo4EuKDxnmEL dwYsEQAgsbPiZamt1J6gqKvgbcKnBk119cHXSJBFEpdtmSxjxEFz5sJIptO0QCI8 Ck08bMUA7YaQ5CGsvbOTGJtq8EW5Vakk9DTJWDDwkdk1kZ+Xv6u2992Ey3nesvin oKWayd9a+1qYBlkXVyZGiKBSSE9KPN8beZsiYSUidH1qZdT8XoWKLX7cOeaL1kWl SHsrXy3je3UWVaz7YEiAdmdEuocjbH9Nfb4q0bfPfCYmdFB5tjrFz4gpUjbdTEpC oh31h9gOvuOXWedFfOckh/Ung5CDinxmXLS8zFBNe7WrHA1ZLTypMaHwASuRlsTD UMJ9meuMtghHg6tt+jkz5GFEP1SqnP9rCQfBuFslWlR1Y/Y3SJRSeyL7OmXUBa5N w/L2iwOO+SK91WRivZXqinOaMMlolYk4OF1dCehlgTFCF5Dfn8olz6mm7G7zd37S swAcz1ogWZb+AmQ/EWlxeIzTOjss1I+howbdMjQctpLjkYAKr7g= =+hPU -----END PGP SIGNATURE----- Merge 4.19.8 into android-4.19 Changes in 4.19.8 blk-mq: fix corruption with direct issue test_hexdump: use memcpy instead of strncpy unifdef: use memcpy instead of strncpy iser: set sector for ambiguous mr status errors uprobes: Fix handle_swbp() vs. unregister() + register() race once more mtd: nand: Fix memory allocation in nanddev_bbt_init() arm64: ftrace: Fix to enable syscall events on arm64 sched, trace: Fix prev_state output in sched_switch tracepoint tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique MIPS: ralink: Fix mt7620 nd_sd pinmux mips: fix mips_get_syscall_arg o32 check IB/mlx5: Avoid load failure due to unknown link width tracing/fgraph: Fix set_graph_function from showing interrupts drm/ast: Fix incorrect free on ioregs drm/amd/dm: Don't forget to attach MST encoders drm: set is_master to 0 upon drm_new_set_master() failure drm/meson: Fixes for drm_crtc_vblank_on/off support drm/meson: Enable fast_io in meson_dw_hdmi_regmap_config drm/meson: Fix OOB memory accesses in meson_viu_set_osd_lut() userfaultfd: use ENOENT instead of EFAULT if the atomic copy user fails userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem userfaultfd: shmem: add i_size checks userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set kgdboc: Fix restrict error kgdboc: Fix warning with module build svm: Add mutex_lock to protect apic_access_page_done on AMD systems selinux: add support for RTM_NEWCHAIN, RTM_DELCHAIN, and RTM_GETCHAIN i40e: Fix deletion of MAC filters scsi: lpfc: fix block guard enablement on SLI3 adapters Input: xpad - quirk all PDP Xbox One gamepads Input: synaptics - add PNP ID for ThinkPad P50 to SMBus Input: matrix_keypad - check for errors from of_get_named_gpio() Input: cros_ec_keyb - fix button/switch capability reports Input: elan_i2c - add ELAN0620 to the ACPI table Input: elan_i2c - add ACPI ID for Lenovo IdeaPad 330-15ARR Input: elan_i2c - add support for ELAN0621 touchpad btrfs: tree-checker: Don't check max block group size as current max chunk size limit is unreliable ARC: change defconfig defaults to ARCv2 arc: [devboards] Add support of NFSv3 ACL tipc: use destination length for copy string blk-mq: punt failed direct issue to dispatch list Linux 4.19.8 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Andrea Arcangeli
|
8f193a716e |
userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set
commit dcf7fe9d89763a28e0f43975b422ff141fe79e43 upstream.
Set the page dirty if VM_WRITE is not set because in such case the pte
won't be marked dirty and the page would be reclaimed without writepage
(i.e. swapout in the shmem case).
This was found by source review. Most apps (certainly including QEMU)
only use UFFDIO_COPY on PROT_READ|PROT_WRITE mappings or the app can't
modify the memory in the first place. This is for correctness and it
could help the non cooperative use case to avoid unexpected data loss.
Link: http://lkml.kernel.org/r/20181126173452.26955-6-aarcange@redhat.com
Reviewed-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Fixes:
|
||
Andrea Arcangeli
|
4ce337622f |
userfaultfd: shmem: add i_size checks
commit e2a50c1f64145a04959df2442305d57307e5395a upstream.
With MAP_SHARED: recheck the i_size after taking the PT lock, to
serialize against truncate with the PT lock. Delete the page from the
pagecache if the i_size_read check fails.
With MAP_PRIVATE: check the i_size after the PT lock before mapping
anonymous memory or zeropages into the MAP_PRIVATE shmem mapping.
A mostly irrelevant cleanup: like we do the delete_from_page_cache()
pagecache removal after dropping the PT lock, the PT lock is a spinlock
so drop it before the sleepable page lock.
Link: http://lkml.kernel.org/r/20181126173452.26955-5-aarcange@redhat.com
Fixes:
|
||
Andrea Arcangeli
|
6e44dd02c9 |
userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem
commit 5b51072e97d587186c2f5390c8c9c1fb7e179505 upstream.
Userfaultfd did not create private memory when UFFDIO_COPY was invoked
on a MAP_PRIVATE shmem mapping. Instead it wrote to the shmem file,
even when that had not been opened for writing. Though, fortunately,
that could only happen where there was a hole in the file.
Fix the shmem-backed implementation of UFFDIO_COPY to create private
memory for MAP_PRIVATE mappings. The hugetlbfs-backed implementation
was already correct.
This change is visible to userland, if userfaultfd has been used in
unintended ways: so it introduces a small risk of incompatibility, but
is necessary in order to respect file permissions.
An app that uses UFFDIO_COPY for anything like postcopy live migration
won't notice the difference, and in fact it'll run faster because there
will be no copy-on-write and memory waste in the tmpfs pagecache
anymore.
Userfaults on MAP_PRIVATE shmem keep triggering only on file holes like
before.
The real zeropage can also be built on a MAP_PRIVATE shmem mapping
through UFFDIO_ZEROPAGE and that's safe because the zeropage pte is
never dirty, in turn even an mprotect upgrading the vma permission from
PROT_READ to PROT_READ|PROT_WRITE won't make the zeropage pte writable.
Link: http://lkml.kernel.org/r/20181126173452.26955-3-aarcange@redhat.com
Fixes:
|
||
Andrea Arcangeli
|
10f98c134b |
userfaultfd: use ENOENT instead of EFAULT if the atomic copy user fails
commit 9e368259ad988356c4c95150fafd1a06af095d98 upstream.
Patch series "userfaultfd shmem updates".
Jann found two bugs in the userfaultfd shmem MAP_SHARED backend: the
lack of the VM_MAYWRITE check and the lack of i_size checks.
Then looking into the above we also fixed the MAP_PRIVATE case.
Hugh by source review also found a data loss source if UFFDIO_COPY is
used on shmem MAP_SHARED PROT_READ mappings (the production usages
incidentally run with PROT_READ|PROT_WRITE, so the data loss couldn't
happen in those production usages like with QEMU).
The whole patchset is marked for stable.
We verified QEMU postcopy live migration with guest running on shmem
MAP_PRIVATE run as well as before after the fix of shmem MAP_PRIVATE.
Regardless if it's shmem or hugetlbfs or MAP_PRIVATE or MAP_SHARED, QEMU
unconditionally invokes a punch hole if the guest mapping is filebacked
and a MADV_DONTNEED too (needed to get rid of the MAP_PRIVATE COWs and
for the anon backend).
This patch (of 5):
We internally used EFAULT to communicate with the caller, switch to
ENOENT, so EFAULT can be used as a non internal retval.
Link: http://lkml.kernel.org/r/20181126173452.26955-2-aarcange@redhat.com
Fixes:
|
||
Greg Kroah-Hartman
|
c454ec1e21 |
This is the 4.19.7 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlwIG48ACgkQONu9yGCS aT7g6Q//RkJ8ZWaRkykcCGaWIvwI6QF1tmKalIEWmToPdndDuQdUDGzWVwfE9G7P yLcnp3GMlXo4F82BBwG8lFSAm9zaeqaLabnJnXbCc5mZ3xi/2aNqIGHzBY1isNZl 0fTzzcelnAKzjp0Aa/egRLOeraSLgVt/Cp7Ha3FXMP6RNxUMzs1pbQ2IFZ3m+P4G CAD3Iye6geOaZTu/kXiiooUEUGFQFbV4c3AZ4VW7dZDdrG+ekwtF4YHtkEPseWJQ Ugtrbr6S0IxYQ91o1Pk77kg4uwUFYo12jrk8Ni4gaPZE6mQCa08tr2Alg2oZkJGw PdXnt2ASYGRWFYK2JAuTvKzhHrTEJYhiC323dKYCAx7BgfFaqdo5F20oNzYxXFBB gGA3AzDDtLUD3OOO+lxrDxXMhpwXUx92WXsoJVsaSafdqIDAueq14sH19wqm0gUJ D1fC2dWTsFrPZKjkU8Z6rJAyO1XZED55h7v1YlqAt2ibjCeDKpjnW3yvUt8Ivpqc nlnmp8v/Yl2cdY55XtlgUadpknSc2jApFMwhSWetxAaqDCvha2dLQ28YMyPRJzat ZHOkizM/VUntXvlUzFvVTsqLQiX0sfLG6MKcUkzWehPomNKT+B8XL1wtzytv9QXb jOY8nRD5PiQo2p35cqdDCskBwqzEwY+WxDe7ji0yHZysBZLxoxQ= =OiCf -----END PGP SIGNATURE----- Merge 4.19.7 into android-4.19 Changes in 4.19.7 mm/huge_memory: rename freeze_page() to unmap_page() mm/huge_memory: splitting set mapping+index before unfreeze mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() mm/khugepaged: collapse_shmem() stop if punched or truncated mm/khugepaged: fix crashes due to misaccounted holes mm/khugepaged: collapse_shmem() remember to clear holes mm/khugepaged: minor reorderings in collapse_shmem() mm/khugepaged: collapse_shmem() without freezing new_page mm/khugepaged: collapse_shmem() do not crash on Compound lan743x: Enable driver to work with LAN7431 lan743x: fix return value for lan743x_tx_napi_poll net: don't keep lonely packets forever in the gro hash net: gemini: Fix copy/paste error net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue packet: copy user buffers before orphan or clone rapidio/rionet: do not free skb before reading its length s390/qeth: fix length check in SNMP processing usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2 net: thunderx: set xdp_prog to NULL if bpf_prog_add fails net: skb_scrub_packet(): Scrub offload_fwd_mark virtio-net: disable guest csum during XDP set virtio-net: fail XDP set if guest csum is negotiated net/dim: Update DIM start sample after each DIM iteration tcp: defer SACK compression after DupThresh net: phy: add workaround for issue where PHY driver doesn't bind to the device tipc: fix lockdep warning during node delete x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation x86/speculation: Apply IBPB more strictly to avoid cross-process data leak x86/speculation: Propagate information about RSB filling mitigation to sysfs x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support x86/retpoline: Remove minimal retpoline support x86/speculation: Update the TIF_SSBD comment x86/speculation: Clean up spectre_v2_parse_cmdline() x86/speculation: Remove unnecessary ret variable in cpu_show_common() x86/speculation: Move STIPB/IBPB string conditionals out of cpu_show_common() x86/speculation: Disable STIBP when enhanced IBRS is in use x86/speculation: Rename SSBD update functions x86/speculation: Reorganize speculation control MSRs update sched/smt: Make sched_smt_present track topology x86/Kconfig: Select SCHED_SMT if SMP enabled sched/smt: Expose sched_smt_present static key x86/speculation: Rework SMT state change x86/l1tf: Show actual SMT state x86/speculation: Reorder the spec_v2 code x86/speculation: Mark string arrays const correctly x86/speculataion: Mark command line parser data __initdata x86/speculation: Unify conditional spectre v2 print functions x86/speculation: Add command line control for indirect branch speculation x86/speculation: Prepare for per task indirect branch speculation control x86/process: Consolidate and simplify switch_to_xtra() code x86/speculation: Avoid __switch_to_xtra() calls x86/speculation: Prepare for conditional IBPB in switch_mm() ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS x86/speculation: Split out TIF update x86/speculation: Prevent stale SPEC_CTRL msr content x86/speculation: Prepare arch_smt_update() for PRCTL mode x86/speculation: Add prctl() control for indirect branch speculation x86/speculation: Enable prctl mode for spectre_v2_user x86/speculation: Add seccomp Spectre v2 user space protection mode x86/speculation: Provide IBPB always command line options userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas kvm: mmu: Fix race in emulated page table writes kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb KVM: nVMX/nSVM: Fix bug which sets vcpu->arch.tsc_offset to L1 tsc_offset KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall KVM: LAPIC: Fix pv ipis use-before-initialization KVM: X86: Fix scan ioapic use-before-initialization KVM: VMX: re-add ple_gap module parameter xtensa: enable coprocessors that are being flushed xtensa: fix coprocessor context offset definitions xtensa: fix coprocessor part of ptrace_{get,set}xregs udf: Allow mounting volumes with incorrect identification strings btrfs: Always try all copies when reading extent buffers Btrfs: ensure path name is null terminated at btrfs_control_ioctl Btrfs: fix rare chances for data loss when doing a fast fsync Btrfs: fix race between enabling quotas and subvolume creation btrfs: relocation: set trans to be NULL after ending transaction PCI: layerscape: Fix wrong invocation of outbound window disable accessor PCI: dwc: Fix MSI-X EP framework address calculation bug PCI: Fix incorrect value returned from pcie_get_speed_cap() arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou. x86/MCE/AMD: Fix the thresholding machinery initialization order x86/fpu: Disable bottom halves while loading FPU registers perf/x86/intel: Move branch tracing setup to the Intel-specific source file perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts() perf/x86/intel: Disallow precise_ip on BTS events fs: fix lost error code in dio_complete ALSA: wss: Fix invalid snd_free_pages() at error path ALSA: ac97: Fix incorrect bit shift at AC97-SPSA control write ALSA: control: Fix race between adding and removing a user element ALSA: sparc: Fix invalid snd_free_pages() at error path ALSA: hda: Add ASRock N68C-S UCC the power_save blacklist ALSA: hda/realtek - Support ALC300 ALSA: hda/realtek - fix headset mic detection for MSI MS-B171 ALSA: hda/realtek - fix the pop noise on headphone for lenovo laptops ALSA: hda/realtek - Add auto-mute quirk for HP Spectre x360 laptop function_graph: Create function_graph_enter() to consolidate architecture code ARM: function_graph: Simplify with function_graph_enter() microblaze: function_graph: Simplify with function_graph_enter() x86/function_graph: Simplify with function_graph_enter() nds32: function_graph: Simplify with function_graph_enter() powerpc/function_graph: Simplify with function_graph_enter() sh/function_graph: Simplify with function_graph_enter() sparc/function_graph: Simplify with function_graph_enter() parisc: function_graph: Simplify with function_graph_enter() riscv/function_graph: Simplify with function_graph_enter() s390/function_graph: Simplify with function_graph_enter() arm64: function_graph: Simplify with function_graph_enter() MIPS: function_graph: Simplify with function_graph_enter() function_graph: Make ftrace_push_return_trace() static function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack function_graph: Have profiler use curr_ret_stack and not depth function_graph: Move return callback before update of curr_ret_stack function_graph: Reverse the order of pushing the ret_stack and the callback binder: fix race that allows malicious free of live buffer ext2: initialize opts.s_mount_opt as zero before using it ext2: fix potential use after free ASoC: intel: cht_bsw_max98090_ti: Add quirk for boards using pmc_plt_clk_0 ASoC: pcm186x: Fix device reset-registers trigger value ARM: dts: rockchip: Remove @0 from the veyron memory node dmaengine: at_hdmac: fix memory leak in at_dma_xlate() dmaengine: at_hdmac: fix module unloading staging: most: use format specifier "%s" in snprintf staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION staging: mt7621-dma: fix potentially dereferencing uninitialized 'tx_desc' staging: mt7621-pinctrl: fix uninitialized variable ngroups staging: rtl8723bs: Fix incorrect sense of ether_addr_equal staging: rtl8723bs: Add missing return for cfg80211_rtw_get_station USB: usb-storage: Add new IDs to ums-realtek usb: core: quirks: add RESET_RESUME quirk for Cherry G230 Stream series Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid" iio/hid-sensors: Fix IIO_CHAN_INFO_RAW returning wrong values for signed numbers iio:st_magn: Fix enable device after trigger lib/test_kmod.c: fix rmmod double free mm: cleancache: fix corruption on missed inode invalidation mm: use swp_offset as key in shmem_replace_page() Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl() misc: mic/scif: fix copy-paste error in scif_create_remote_lookup Linux 4.19.7 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Greg Kroah-Hartman
|
635c56d224 |
This is the 4.19.6 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlwCSE8ACgkQONu9yGCS aT58lg//YXiTDY8JuG+LX8PJyL28s5gIQZyq7a8aEuxGFXbTfmym0TecN74F2gFM 7YBJ9j4u/W5xp/u/29VUOUE9OUiRdMa+GJz73ncgslHApp7r3Z5r9PJFJHtW07Xu IElCg2GvQLR0pzyNlsa+Nv738pldDr0d9xZDmsOp1Cs0aCfJQAbU1y9P5WNN8j3y rHQP19/2+HF0j6LqYxIRmgioSrmeHrEN/nWIDlFpW74+QPyI7d/6aJpr1Tfdy64u 6BE/48OunHjOPbO6fWcNjFm0FUlTYDKd8jtzkaIHmFKgXpDFb+3yN4AiMd4/ucPS SNqVqvzTfU8aKWEtIabTTG1m3AwuqJUrExYUQZwNe32zOhEMIE+rMpmgafSN3SjE k0cER70OS1rJ5rs/cqBY8UpqhPxqfTFSwEwHGqn66PeuYgCpjoXHIBVyn/s+I3CZ Be8udYwi3KXBYrMGppzFp5PklwkqrUIFFouF2ijtPBjKfZpte9/ZOGWxvZMux6Ev rqFaq/zf9DjvQ3BSwHh2QuQKK5WnGQVuwjDWHR/vso4bApErHFhDWvGAIFyFxRsK W70DUeUxSScNjNKDgyxzRUV18VF0IN8zMXfh4hCMtoq6+XzDG/DUBt6fBFXaZCun kWyCTZk+9sMkGVlL8kAB2UPbAjfuDRAijouwC+u0j0VRMXlsAWM= =ju/p -----END PGP SIGNATURE----- Merge 4.19.6 into android-4.19 Changes in 4.19.6 HID: steam: remove input device when a hid client is running. efi/libstub: arm: support building with clang usb: core: Fix hub port connection events lost usb: dwc3: gadget: fix ISOC TRB type on unaligned transfers usb: dwc3: gadget: Properly check last unaligned/zero chain TRB usb: dwc3: core: Clean up ULPI device usb: dwc3: Fix NULL pointer exception in dwc3_pci_remove() xhci: Fix leaking USB3 shared_hcd at xhci removal xhci: handle port status events for removed USB3 hcd xhci: Add check for invalid byte size error when UAS devices are connected. usb: xhci: fix uninitialized completion when USB3 port got wrong status usb: xhci: fix timeout for transition from RExit to U0 xhci: Add quirk to workaround the errata seen on Cavium Thunder-X2 Soc usb: xhci: Prevent bus suspend if a port connect change or polling state is detected ALSA: oss: Use kvzalloc() for local buffer allocations MAINTAINERS: Add Sasha as a stable branch maintainer Documentation/security-bugs: Clarify treatment of embargoed information Documentation/security-bugs: Postpone fix publication in exceptional cases mmc: sdhci-pci: Try "cd" for card-detect lookup before using NULL mmc: sdhci-pci: Workaround GLK firmware failing to restore the tuning value gpio: don't free unallocated ida on gpiochip_add_data_with_key() error path iwlwifi: fix wrong WGDS_WIFI_DATA_SIZE iwlwifi: mvm: support sta_statistics() even on older firmware iwlwifi: mvm: fix regulatory domain update when the firmware starts iwlwifi: mvm: don't use SAR Geo if basic SAR is not used brcmfmac: fix reporting support for 160 MHz channels opp: ti-opp-supply: Dynamically update u_volt_min opp: ti-opp-supply: Correct the supply in _get_optimal_vdd_voltage call tools/power/cpupower: fix compilation with STATIC=true v9fs_dir_readdir: fix double-free on p9stat_read error selinux: Add __GFP_NOWARN to allocation at str_read() Input: synaptics - avoid using uninitialized variable when probing bfs: add sanity check at bfs_fill_super() sctp: clear the transport of some out_chunk_list chunks in sctp_assoc_rm_peer gfs2: Don't leave s_fs_info pointing to freed memory in init_sbd llc: do not use sk_eat_skb() mm: don't warn about large allocations for slab mm/memory.c: recheck page table entry with page table lock held tcp: do not release socket ownership in tcp_close() drm/fb-helper: Blacklist writeback when adding connectors to fbdev drm/amdgpu: Add missing firmware entry for HAINAN drm/vc4: Set ->legacy_cursor_update to false when doing non-async updates drm/amdgpu: Fix oops when pp_funcs->switch_power_profile is unset drm/i915: Disable LP3 watermarks on all SNB machines drm/ast: change resolution may cause screen blurred drm/ast: fixed cursor may disappear sometimes drm/ast: Remove existing framebuffers before loading driver can: flexcan: Unlock the MB unconditionally can: dev: can_get_echo_skb(): factor out non sending code to __can_get_echo_skb() can: dev: __can_get_echo_skb(): replace struct can_frame by canfd_frame to access frame length can: dev: __can_get_echo_skb(): Don't crash the kernel if can_priv::echo_skb is accessed out of bounds can: dev: __can_get_echo_skb(): print error message, if trying to echo non existing skb can: rx-offload: introduce can_rx_offload_get_echo_skb() and can_rx_offload_queue_sorted() functions can: rx-offload: rename can_rx_offload_irq_queue_err_skb() to can_rx_offload_queue_tail() can: flexcan: use can_rx_offload_queue_sorted() for flexcan_irq_bus_*() can: flexcan: handle tx-complete CAN frames via rx-offload infrastructure can: raw: check for CAN FD capable netdev in raw_sendmsg() can: hi311x: Use level-triggered interrupt can: flexcan: Always use last mailbox for TX can: flexcan: remove not needed struct flexcan_priv::tx_mb and struct flexcan_priv::tx_mb_idx ACPICA: AML interpreter: add region addresses in global list during initialization IB/hfi1: Eliminate races in the SDMA send error path fsnotify: generalize handling of extra event flags fanotify: fix handling of events on child sub-directory pinctrl: meson: fix pinconf bias disable pinctrl: meson: fix gxbb ao pull register bits pinctrl: meson: fix gxl ao pull register bits pinctrl: meson: fix meson8 ao pull register bits pinctrl: meson: fix meson8b ao pull register bits tools/testing/nvdimm: Fix the array size for dimm devices. scsi: lpfc: fix remoteport access scsi: hisi_sas: Remove set but not used variable 'dq_list' KVM: PPC: Move and undef TRACE_INCLUDE_PATH/FILE cpufreq: imx6q: add return value check for voltage scale rtc: cmos: Do not export alarm rtc_ops when we do not support alarms rtc: pcf2127: fix a kmemleak caused in pcf2127_i2c_gather_write crypto: simd - correctly take reqsize of wrapped skcipher into account floppy: fix race condition in __floppy_read_block_0() powerpc/io: Fix the IO workarounds code to work with Radix sched/fair: Fix cpu_util_wake() for 'execl' type workloads perf/x86/intel/uncore: Add more IMC PCI IDs for KabyLake and CoffeeLake CPUs block: copy ioprio in __bio_clone_fast() and bounce SUNRPC: Fix a bogus get/put in generic_key_to_expire() riscv: add missing vdso_install target RISC-V: Silence some module warnings on 32-bit drm/amdgpu: fix bug with IH ring setup kdb: Use strscpy with destination buffer size NFSv4: Fix an Oops during delegation callbacks powerpc/numa: Suppress "VPHN is not supported" messages efi/arm: Revert deferred unmap of early memmap mapping z3fold: fix possible reclaim races mm, memory_hotplug: check zone_movable in has_unmovable_pages tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset mm, page_alloc: check for max order in hot path dax: Avoid losing wakeup in dax_lock_mapping_entry include/linux/pfn_t.h: force '~' to be parsed as an unary operator tty: wipe buffer. tty: wipe buffer if not echoing data gfs2: Fix iomap buffer head reference counting bug rcu: Make need_resched() respond to urgent RCU-QS needs media: ov5640: Re-work MIPI startup sequence media: ov5640: Fix timings setup code media: ov5640: fix exposure regression media: ov5640: fix auto gain & exposure when changing mode media: ov5640: fix wrong binning value in exposure calculation media: ov5640: fix auto controls values when switching to manual mode Linux 4.19.6 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Yu Zhao
|
b66375b599 |
mm: use swp_offset as key in shmem_replace_page()
commit c1cb20d43728aa9b5393bd8d489bc85c142949b2 upstream.
We changed the key of swap cache tree from swp_entry_t.val to
swp_offset. We need to do so in shmem_replace_page() as well.
Hugh said:
"shmem_replace_page() has been wrong since the day I wrote it: good
enough to work on swap "type" 0, which is all most people ever use
(especially those few who need shmem_replace_page() at all), but
broken once there are any non-0 swp_type bits set in the higher order
bits"
Link: http://lkml.kernel.org/r/20181121215442.138545-1-yuzhao@google.com
Fixes:
|
||
Pavel Tikhomirov
|
16a2d60224 |
mm: cleancache: fix corruption on missed inode invalidation
commit 6ff38bd40230af35e446239396e5fc8ebd6a5248 upstream.
If all pages are deleted from the mapping by memory reclaim and also
moved to the cleancache:
__delete_from_page_cache
(no shadow case)
unaccount_page_cache_page
cleancache_put_page
page_cache_delete
mapping->nrpages -= nr
(nrpages becomes 0)
We don't clean the cleancache for an inode after final file truncation
(removal).
truncate_inode_pages_final
check (nrpages || nrexceptional) is false
no truncate_inode_pages
no cleancache_invalidate_inode(mapping)
These way when reading the new file created with same inode we may get
these trash leftover pages from cleancache and see wrong data instead of
the contents of the new file.
Fix it by always doing truncate_inode_pages which is already ready for
nrpages == 0 && nrexceptional == 0 case and just invalidates inode.
[akpm@linux-foundation.org: add comment, per Jan]
Link: http://lkml.kernel.org/r/20181112095734.17979-1-ptikhomirov@virtuozzo.com
Fixes: commit
|
||
Andrea Arcangeli
|
34b7a7cc53 |
userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas
commit 29ec90660d68bbdd69507c1c8b4e33aa299278b1 upstream. After the VMA to register the uffd onto is found, check that it has VM_MAYWRITE set before allowing registration. This way we inherit all common code checks before allowing to fill file holes in shmem and hugetlbfs with UFFDIO_COPY. The userfaultfd memory model is not applicable for readonly files unless it's a MAP_PRIVATE. Link: http://lkml.kernel.org/r/20181126173452.26955-4-aarcange@redhat.com Fixes: |
||
Hugh Dickins
|
8b37c40503 |
mm/khugepaged: collapse_shmem() do not crash on Compound
commit 06a5e1268a5fb9c2b346a3da6b97e85f2eba0f07 upstream.
collapse_shmem()'s VM_BUG_ON_PAGE(PageTransCompound) was unsafe: before
it holds page lock of the first page, racing truncation then extension
might conceivably have inserted a hugepage there already. Fail with the
SCAN_PAGE_COMPOUND result, instead of crashing (CONFIG_DEBUG_VM=y) or
otherwise mishandling the unexpected hugepage - though later we might
code up a more constructive way of handling it, with SCAN_SUCCESS.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261529310.2275@eggly.anvils
Fixes:
|
||
Hugh Dickins
|
af24c01831 |
mm/khugepaged: collapse_shmem() without freezing new_page
commit 87c460a0bded56195b5eb497d44709777ef7b415 upstream.
khugepaged's collapse_shmem() does almost all of its work, to assemble
the huge new_page from 512 scattered old pages, with the new_page's
refcount frozen to 0 (and refcounts of all old pages so far also frozen
to 0). Including shmem_getpage() to read in any which were out on swap,
memory reclaim if necessary to allocate their intermediate pages, and
copying over all the data from old to new.
Imagine the frozen refcount as a spinlock held, but without any lock
debugging to highlight the abuse: it's not good, and under serious load
heads into lockups - speculative getters of the page are not expecting
to spin while khugepaged is rescheduled.
One can get a little further under load by hacking around elsewhere; but
fortunately, freezing the new_page turns out to have been entirely
unnecessary, with no hacks needed elsewhere.
The huge new_page lock is already held throughout, and guards all its
subpages as they are brought one by one into the page cache tree; and
anything reading the data in that page, without the lock, before it has
been marked PageUptodate, would already be in the wrong. So simply
eliminate the freezing of the new_page.
Each of the old pages remains frozen with refcount 0 after it has been
replaced by a new_page subpage in the page cache tree, until they are
all unfrozen on success or failure: just as before. They could be
unfrozen sooner, but cause no problem once no longer visible to
find_get_entry(), filemap_map_pages() and other speculative lookups.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261527570.2275@eggly.anvils
Fixes:
|
||
Hugh Dickins
|
3e9646c76c |
mm/khugepaged: minor reorderings in collapse_shmem()
commit 042a30824871fa3149b0127009074b75cc25863c upstream.
Several cleanups in collapse_shmem(): most of which probably do not
really matter, beyond doing things in a more familiar and reassuring
order. Simplify the failure gotos in the main loop, and on success
update stats while interrupts still disabled from the last iteration.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261526400.2275@eggly.anvils
Fixes:
|
||
Hugh Dickins
|
ee13d69bc1 |
mm/khugepaged: collapse_shmem() remember to clear holes
commit 2af8ff291848cc4b1cce24b6c943394eb2c761e8 upstream.
Huge tmpfs testing reminds us that there is no __GFP_ZERO in the gfp
flags khugepaged uses to allocate a huge page - in all common cases it
would just be a waste of effort - so collapse_shmem() must remember to
clear out any holes that it instantiates.
The obvious place to do so, where they are put into the page cache tree,
is not a good choice: because interrupts are disabled there. Leave it
until further down, once success is assured, where the other pages are
copied (before setting PageUptodate).
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261525080.2275@eggly.anvils
Fixes:
|
||
Hugh Dickins
|
78141aabfb |
mm/khugepaged: fix crashes due to misaccounted holes
commit aaa52e340073b7f4593b3c4ddafcafa70cf838b5 upstream. Huge tmpfs testing on a shortish file mapped into a pmd-rounded extent hit shmem_evict_inode()'s WARN_ON(inode->i_blocks) followed by clear_inode()'s BUG_ON(inode->i_data.nrpages) when the file was later closed and unlinked. khugepaged's collapse_shmem() was forgetting to update mapping->nrpages on the rollback path, after it had added but then needs to undo some holes. There is indeed an irritating asymmetry between shmem_charge(), whose callers want it to increment nrpages after successfully accounting blocks, and shmem_uncharge(), when __delete_from_page_cache() already decremented nrpages itself: oh well, just add a comment on that to them both. And shmem_recalc_inode() is supposed to be called when the accounting is expected to be in balance (so it can deduce from imbalance that reclaim discarded some pages): so change shmem_charge() to update nrpages earlier (though it's rare for the difference to matter at all). Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261523450.2275@eggly.anvils Fixes: |
||
Hugh Dickins
|
8797f2f4fe |
mm/khugepaged: collapse_shmem() stop if punched or truncated
commit 701270fa193aadf00bdcf607738f64997275d4c7 upstream.
Huge tmpfs testing showed that although collapse_shmem() recognizes a
concurrently truncated or hole-punched page correctly, its handling of
holes was liable to refill an emptied extent. Add check to stop that.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261522040.2275@eggly.anvils
Fixes:
|
||
Hugh Dickins
|
d31ff4722f |
mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
commit 006d3ff27e884f80bd7d306b041afc415f63598f upstream.
Huge tmpfs testing, on 32-bit kernel with lockdep enabled, showed that
__split_huge_page() was using i_size_read() while holding the irq-safe
lru_lock and page tree lock, but the 32-bit i_size_read() uses an
irq-unsafe seqlock which should not be nested inside them.
Instead, read the i_size earlier in split_huge_page_to_list(), and pass
the end offset down to __split_huge_page(): all while holding head page
lock, which is enough to prevent truncation of that extent before the
page tree lock has been taken.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261520070.2275@eggly.anvils
Fixes:
|
||
Hugh Dickins
|
7e18656c9a |
mm/huge_memory: splitting set mapping+index before unfreeze
commit 173d9d9fd3ddae84c110fea8aedf1f26af6be9ec upstream. Huge tmpfs stress testing has occasionally hit shmem_undo_range()'s VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page). Move the setting of mapping and index up before the page_ref_unfreeze() in __split_huge_page_tail() to fix this: so that a page cache lookup cannot get a reference while the tail's mapping and index are unstable. In fact, might as well move them up before the smp_wmb(): I don't see an actual need for that, but if I'm missing something, this way round is safer than the other, and no less efficient. You might argue that VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page) is misplaced, and should be left until after the trylock_page(); but left as is has not crashed since, and gives more stringent assurance. Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261516380.2275@eggly.anvils Fixes: |
||
Hugh Dickins
|
69697e6a61 |
mm/huge_memory: rename freeze_page() to unmap_page()
commit 906f9cdfc2a0800f13683f9e4ebdfd08c12ee81b upstream.
The term "freeze" is used in several ways in the kernel, and in mm it
has the particular meaning of forcing page refcount temporarily to 0.
freeze_page() is just too confusing a name for a function that unmaps a
page: rename it unmap_page(), and rename unfreeze_page() remap_page().
Went to change the mention of freeze_page() added later in mm/rmap.c,
but found it to be incorrect: ordinary page reclaim reaches there too;
but the substance of the comment still seems correct, so edit it down.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261514080.2275@eggly.anvils
Fixes:
|
||
Rik van Riel
|
fe0ea3084c |
ANDROID: add extra free kbytes tunable
Add a userspace visible knob to tell the VM to keep an extra amount of memory free, by increasing the gap between each zone's min and low watermarks. This is useful for realtime applications that call system calls and have a bound on the number of allocations that happen in any short time period. In this application, extra_free_kbytes would be left at an amount equal to or larger than than the maximum number of allocations that happen in any burst. It may also be useful to reduce the memory use of virtual machines (temporarily?), in a way that does not cause memory fragmentation like ballooning does. [ccross] Revived for use on old kernels where no other solution exists. The tunable will be removed on kernels that do better at avoiding direct reclaim. [surenb] Will be reverted as soon as Android framework is reworked to use upstream-supported watermark_scale_factor instead of extra_free_kbytes. Bug: 86445363 Bug: 109664768 Bug: 120445732 Change-Id: I765a42be8e964bfd3e2886d1ca85a29d60c3bb3e Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
Colin Cross
|
533e4ed309 |
ANDROID: mm: add a field to store names for private anonymous memory
Userspace processes often have multiple allocators that each do
anonymous mmaps to get memory. When examining memory usage of
individual processes or systems as a whole, it is useful to be
able to break down the various heaps that were allocated by
each layer and examine their size, RSS, and physical memory
usage.
This patch adds a user pointer to the shared union in
vm_area_struct that points to a null terminated string inside
the user process containing a name for the vma. vmas that
point to the same address will be merged, but vmas that
point to equivalent strings at different addresses will
not be merged.
Userspace can set the name for a region of memory by calling
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name);
Setting the name to NULL clears it.
The names of named anonymous vmas are shown in /proc/pid/maps
as [anon:<name>] and in /proc/pid/smaps in a new "Name" field
that is only present for named vmas. If the userspace pointer
is no longer valid all or part of the name will be replaced
with "<fault>".
The idea to store a userspace pointer to reduce the complexity
within mm (at the expense of the complexity of reading
/proc/pid/mem) came from Dave Hansen. This results in no
runtime overhead in the mm subsystem other than comparing
the anon_name pointers when considering vma merging. The pointer
is stored in a union with fieds that are only used on file-backed
mappings, so it does not increase memory usage.
Includes fix from Jed Davis <jld@mozilla.com> for typo in
prctl_set_vma_anon_name, which could attempt to set the name
across two vmas at the same time due to a typo, which might
corrupt the vma list. Fix it to use tmp instead of end to limit
the name setting to a single vma at a time.
Bug: 120441514
Change-Id: I9aa7b6b5ef536cd780599ba4e2fba8ceebe8b59f
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
[AmitP: Fix get_user_pages_remote() call to align with upstream commit
|
||
Michal Hocko
|
9dec38554a |
mm, page_alloc: check for max order in hot path
[ Upstream commit c63ae43ba53bc432b414fd73dd5f4b01fcb1ab43 ] Konstantin has noticed that kvmalloc might trigger the following warning: WARNING: CPU: 0 PID: 6676 at mm/vmstat.c:986 __fragmentation_index+0x54/0x60 [...] Call Trace: fragmentation_index+0x76/0x90 compaction_suitable+0x4f/0xf0 shrink_node+0x295/0x310 node_reclaim+0x205/0x250 get_page_from_freelist+0x649/0xad0 __alloc_pages_nodemask+0x12a/0x2a0 kmalloc_large_node+0x47/0x90 __kmalloc_node+0x22b/0x2e0 kvmalloc_node+0x3e/0x70 xt_alloc_table_info+0x3a/0x80 [x_tables] do_ip6t_set_ctl+0xcd/0x1c0 [ip6_tables] nf_setsockopt+0x44/0x60 SyS_setsockopt+0x6f/0xc0 do_syscall_64+0x67/0x120 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 the problem is that we only check for an out of bound order in the slow path and the node reclaim might happen from the fast path already. This is fixable by making sure that kvmalloc doesn't ever use kmalloc for requests that are larger than KMALLOC_MAX_SIZE but this also shows that the code is rather fragile. A recent UBSAN report just underlines that by the following report UBSAN: Undefined behaviour in mm/page_alloc.c:3117:19 shift exponent 51 is too large for 32-bit type 'int' CPU: 0 PID: 6520 Comm: syz-executor1 Not tainted 4.19.0-rc2 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xd2/0x148 lib/dump_stack.c:113 ubsan_epilogue+0x12/0x94 lib/ubsan.c:159 __ubsan_handle_shift_out_of_bounds+0x2b6/0x30b lib/ubsan.c:425 __zone_watermark_ok+0x2c7/0x400 mm/page_alloc.c:3117 zone_watermark_fast mm/page_alloc.c:3216 [inline] get_page_from_freelist+0xc49/0x44c0 mm/page_alloc.c:3300 __alloc_pages_nodemask+0x21e/0x640 mm/page_alloc.c:4370 alloc_pages_current+0xcc/0x210 mm/mempolicy.c:2093 alloc_pages include/linux/gfp.h:509 [inline] __get_free_pages+0x12/0x60 mm/page_alloc.c:4414 dma_mem_alloc+0x36/0x50 arch/x86/include/asm/floppy.h:156 raw_cmd_copyin drivers/block/floppy.c:3159 [inline] raw_cmd_ioctl drivers/block/floppy.c:3206 [inline] fd_locked_ioctl+0xa00/0x2c10 drivers/block/floppy.c:3544 fd_ioctl+0x40/0x60 drivers/block/floppy.c:3571 __blkdev_driver_ioctl block/ioctl.c:303 [inline] blkdev_ioctl+0xb3c/0x1a30 block/ioctl.c:601 block_ioctl+0x105/0x150 fs/block_dev.c:1883 vfs_ioctl fs/ioctl.c:46 [inline] do_vfs_ioctl+0x1c0/0x1150 fs/ioctl.c:687 ksys_ioctl+0x9e/0xb0 fs/ioctl.c:702 __do_sys_ioctl fs/ioctl.c:709 [inline] __se_sys_ioctl fs/ioctl.c:707 [inline] __x64_sys_ioctl+0x7e/0xc0 fs/ioctl.c:707 do_syscall_64+0xc4/0x510 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Note that this is not a kvmalloc path. It is just that the fast path really depends on having sanitzed order as well. Therefore move the order check to the fast path. Link: http://lkml.kernel.org/r/20181113094305.GM15120@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reported-by: Kyungtae Kim <kt0755@gmail.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Byoungyoung Lee <lifeasageek@gmail.com> Cc: "Dae R. Jeong" <threeearcat@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Yufen Yu
|
db89fc007b |
tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset
[ Upstream commit 1a413646931cb14442065cfc17561e50f5b5bb44 ] Other filesystems such as ext4, f2fs and ubifs all return ENXIO when lseek (SEEK_DATA or SEEK_HOLE) requests a negative offset. man 2 lseek says : EINVAL whence is not valid. Or: the resulting file offset would be : negative, or beyond the end of a seekable device. : : ENXIO whence is SEEK_DATA or SEEK_HOLE, and the file offset is beyond : the end of the file. Make tmpfs return ENXIO under these circumstances as well. After this, tmpfs also passes xfstests's generic/448. [akpm@linux-foundation.org: rewrite changelog] Link: http://lkml.kernel.org/r/1540434176-14349-1-git-send-email-yuyufen@huawei.com Signed-off-by: Yufen Yu <yuyufen@huawei.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hugh Dickins <hughd@google.com> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Michal Hocko
|
b44fd1268b |
mm, memory_hotplug: check zone_movable in has_unmovable_pages
[ Upstream commit 9d7899999c62c1a81129b76d2a6ecbc4655e1597 ] Page state checks are racy. Under a heavy memory workload (e.g. stress -m 200 -t 2h) it is quite easy to hit a race window when the page is allocated but its state is not fully populated yet. A debugging patch to dump the struct page state shows has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0 page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1 flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked) Note that the state has been checked for both PageLRU and PageSwapBacked already. Closing this race completely would require some sort of retry logic. This can be tricky and error prone (think of potential endless or long taking loops). Workaround this problem for movable zones at least. Such a zone should only contain movable pages. Commit |
||
Vitaly Wool
|
510066729b |
z3fold: fix possible reclaim races
[ Upstream commit ca0246bb97c23da9d267c2107c07fb77e38205c9 ] Reclaim and free can race on an object which is basically fine but in order for reclaim to be able to map "freed" object we need to encode object length in the handle. handle_to_chunks() is then introduced to extract object length from a handle and use it during mapping. Moreover, to avoid racing on a z3fold "headless" page release, we should not try to free that page in z3fold_free() if the reclaim bit is set. Also, in the unlikely case of trying to reclaim a page being freed, we should not proceed with that page. While at it, fix the page accounting in reclaim function. This patch supersedes "[PATCH] z3fold: fix reclaim lock-ups". Link: http://lkml.kernel.org/r/20181105162225.74e8837d03583a9b707cf559@gmail.com Signed-off-by: Vitaly Wool <vitaly.vul@sony.com> Signed-off-by: Jongseok Kim <ks77sj@gmail.com> Reported-by-by: Jongseok Kim <ks77sj@gmail.com> Reviewed-by: Snild Dolkow <snild@sony.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Aneesh Kumar K.V
|
5999609a93 |
mm/memory.c: recheck page table entry with page table lock held
commit ff09d7ec9786be4ad7589aa987d7dc66e2dd9160 upstream. We clear the pte temporarily during read/modify/write update of the pte. If we take a page fault while the pte is cleared, the application can get SIGBUS. One such case is with remap_pfn_range without a backing vm_ops->fault callback. do_fault will return SIGBUS in that case. cpu 0 cpu1 mprotect() ptep_modify_prot_start()/pte cleared. . . page fault. . . prep_modify_prot_commit() Fix this by taking page table lock and rechecking for pte_none. [aneesh.kumar@linux.ibm.com: fix crash observed with syzkaller run] Link: http://lkml.kernel.org/r/87va6bwlfg.fsf@linux.ibm.com Link: http://lkml.kernel.org/r/20180926031858.9692-1-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ido Schimmel <idosch@idosch.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Dmitry Vyukov
|
3996e891ec |
mm: don't warn about large allocations for slab
commit 61448479a9f2c954cde0cfe778cb6bec5d0a748d upstream. Slub does not call kmalloc_slab() for sizes > KMALLOC_MAX_CACHE_SIZE, instead it falls back to kmalloc_large(). For slab KMALLOC_MAX_CACHE_SIZE == KMALLOC_MAX_SIZE and it calls kmalloc_slab() for all allocations relying on NULL return value for over-sized allocations. This inconsistency leads to unwanted warnings from kmalloc_slab() for over-sized allocations for slab. Returning NULL for failed allocations is the expected behavior. Make slub and slab code consistent by checking size > KMALLOC_MAX_CACHE_SIZE in slab before calling kmalloc_slab(). While we are here also fix the check in kmalloc_slab(). We should check against KMALLOC_MAX_CACHE_SIZE rather than KMALLOC_MAX_SIZE. It all kinda worked because for slab the constants are the same, and slub always checks the size against KMALLOC_MAX_CACHE_SIZE before kmalloc_slab(). But if we get there with size > KMALLOC_MAX_CACHE_SIZE anyhow bad things will happen. For example, in case of a newly introduced bug in slub code. Also move the check in kmalloc_slab() from function entry to the size > 192 case. This partially compensates for the additional check in slab code and makes slub code a bit faster (at least theoretically). Also drop __GFP_NOWARN in the warning check. This warning means a bug in slab code itself, user-passed flags have nothing to do with it. Nothing of this affects slob. Link: http://lkml.kernel.org/r/20180927171502.226522-1-dvyukov@gmail.com Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: syzbot+87829a10073277282ad1@syzkaller.appspotmail.com Reported-by: syzbot+ef4e8fc3a06e9019bb40@syzkaller.appspotmail.com Reported-by: syzbot+6e438f4036df52cbb863@syzkaller.appspotmail.com Reported-by: syzbot+8574471d8734457d98aa@syzkaller.appspotmail.com Reported-by: syzbot+af1504df0807a083dbd9@syzkaller.appspotmail.com Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |