99 commits

Author SHA1 Message Date
Greg Kroah-Hartman
287ec341d6 This is the 4.19.93 stable release

Merge 4.19.93 into android-4.19

Changes in 4.19.93
	scsi: lpfc: Fix discovery failures when target device connectivity bounces
	scsi: mpt3sas: Fix clear pending bit in ioctl status
	scsi: lpfc: Fix locking on mailbox command completion
	Input: atmel_mxt_ts - disable IRQ across suspend
	f2fs: fix to update time in lazytime mode
	iommu: rockchip: Free domain on .domain_free
	iommu/tegra-smmu: Fix page tables in > 4 GiB memory
	dmaengine: xilinx_dma: Clear desc_pendingcount in xilinx_dma_reset
	scsi: target: compare full CHAP_A Algorithm strings
	scsi: lpfc: Fix SLI3 hba in loop mode not discovering devices
	scsi: csiostor: Don't enable IRQs too early
	scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()
	powerpc/pseries: Mark accumulate_stolen_time() as notrace
	powerpc/pseries: Don't fail hash page table insert for bolted mapping
	powerpc/tools: Don't quote $objdump in scripts
	dma-debug: add a schedule point in debug_dma_dump_mappings()
	leds: lm3692x: Handle failure to probe the regulator
	clocksource/drivers/asm9260: Add a check for of_clk_get
	clocksource/drivers/timer-of: Use unique device name instead of timer
	powerpc/security/book3s64: Report L1TF status in sysfs
	powerpc/book3s64/hash: Add cond_resched to avoid soft lockup warning
	ext4: update direct I/O read lock pattern for IOCB_NOWAIT
	ext4: iomap that extends beyond EOF should be marked dirty
	jbd2: Fix statistics for the number of logged blocks
	scsi: tracing: Fix handling of TRANSFER LENGTH == 0 for READ(6) and WRITE(6)
	scsi: lpfc: Fix duplicate unreg_rpi error in port offline flow
	f2fs: fix to update dir's i_pino during cross_rename
	clk: qcom: Allow constant ratio freq tables for rcg
	clk: clk-gpio: propagate rate change to parent
	irqchip/irq-bcm7038-l1: Enable parent IRQ if necessary
	irqchip: ingenic: Error out if IRQ domain creation failed
	fs/quota: handle overflows of sysctl fs.quota.* and report as unsigned long
	scsi: lpfc: fix: Coverity: lpfc_cmpl_els_rsp(): Null pointer dereferences
	PCI: rpaphp: Fix up pointer to first drc-info entry
	scsi: ufs: fix potential bug which ends in system hang
	powerpc/pseries/cmm: Implement release() function for sysfs device
	PCI: rpaphp: Don't rely on firmware feature to imply drc-info support
	PCI: rpaphp: Annotate and correctly byte swap DRC properties
	PCI: rpaphp: Correctly match ibm, my-drc-index to drc-name when using drc-info
	powerpc/security: Fix wrong message when RFI Flush is disable
	scsi: atari_scsi: sun3_scsi: Set sg_tablesize to 1 instead of SG_NONE
	clk: pxa: fix one of the pxa RTC clocks
	bcache: at least try to shrink 1 node in bch_mca_scan()
	HID: quirks: Add quirk for HP MSU1465 PIXART OEM mouse
	HID: logitech-hidpp: Silence intermittent get_battery_capacity errors
	ARM: 8937/1: spectre-v2: remove Brahma-B53 from hardening
	libnvdimm/btt: fix variable 'rc' set but not used
	HID: Improve Windows Precision Touchpad detection.
	HID: rmi: Check that the RMI_STARTED bit is set before unregistering the RMI transport device
	watchdog: Fix the race between the release of watchdog_core_data and cdev
	scsi: pm80xx: Fix for SATA device discovery
	scsi: ufs: Fix error handing during hibern8 enter
	scsi: scsi_debug: num_tgts must be >= 0
	scsi: NCR5380: Add disconnect_mask module parameter
	scsi: iscsi: Don't send data to unbound connection
	scsi: target: iscsi: Wait for all commands to finish before freeing a session
	gpio: mpc8xxx: Don't overwrite default irq_set_type callback
	apparmor: fix unsigned len comparison with less than zero
	scripts/kallsyms: fix definitely-lost memory leak
	powerpc: Don't add -mabi= flags when building with Clang
	cdrom: respect device capabilities during opening action
	perf script: Fix brstackinsn for AUXTRACE
	perf regs: Make perf_reg_name() return "unknown" instead of NULL
	s390/zcrypt: handle new reply code FILTERED_BY_HYPERVISOR
	libfdt: define INT32_MAX and UINT32_MAX in libfdt_env.h
	s390/cpum_sf: Check for SDBT and SDB consistency
	ocfs2: fix passing zero to 'PTR_ERR' warning
	mailbox: imx: Fix Tx doorbell shutdown path
	kernel: sysctl: make drop_caches write-only
	userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK
	Revert "powerpc/vcpu: Assume dedicated processors as non-preempt"
	x86/mce: Fix possibly incorrect severity calculation on AMD
	net, sysctl: Fix compiler warning when only cBPF is present
	netfilter: nf_queue: enqueue skbs with NULL dst
	ALSA: hda - Downgrade error message for single-cmd fallback
	bonding: fix active-backup transition after link failure
	perf strbuf: Remove redundant va_end() in strbuf_addv()
	Make filldir[64]() verify the directory entry filename is valid
	filldir[64]: remove WARN_ON_ONCE() for bad directory entries
	netfilter: ebtables: compat: reject all padding in matches/watchers
	6pack,mkiss: fix possible deadlock
	netfilter: bridge: make sure to pull arp header in br_nf_forward_arp()
	inetpeer: fix data-race in inet_putpeer / inet_putpeer
	net: add a READ_ONCE() in skb_peek_tail()
	net: icmp: fix data-race in cmp_global_allow()
	hrtimer: Annotate lockless access to timer->state
	net: ena: fix napi handler misbehavior when the napi budget is zero
	net/mlxfw: Fix out-of-memory error in mfa2 flash burning
	net: stmmac: dwmac-meson8b: Fix the RGMII TX delay on Meson8b/8m2 SoCs
	ptp: fix the race between the release of ptp_clock and cdev
	tcp: Fix highest_sack and highest_sack_seq
	udp: fix integer overflow while computing available space in sk_rcvbuf
	vhost/vsock: accept only packets with the right dst_cid
	net: add bool confirm_neigh parameter for dst_ops.update_pmtu
	ip6_gre: do not confirm neighbor when do pmtu update
	gtp: do not confirm neighbor when do pmtu update
	net/dst: add new function skb_dst_update_pmtu_no_confirm
	tunnel: do not confirm neighbor when do pmtu update
	vti: do not confirm neighbor when do pmtu update
	sit: do not confirm neighbor when do pmtu update
	net/dst: do not confirm neighbor for vxlan and geneve pmtu update
	gtp: do not allow adding duplicate tid and ms_addr pdp context
	net: marvell: mvpp2: phylink requires the link interrupt
	tcp/dccp: fix possible race __inet_lookup_established()
	tcp: do not send empty skb from tcp_write_xmit()
	gtp: fix wrong condition in gtp_genl_dump_pdp()
	gtp: fix an use-after-free in ipv4_pdp_find()
	gtp: avoid zero size hashtable
	spi: fsl: don't map irq during probe
	tty/serial: atmel: fix out of range clock divider handling
	pinctrl: baytrail: Really serialize all register accesses
	spi: fsl: use platform_get_irq() instead of of_irq_to_resource()
	Linux 4.19.93

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie31b3fba19c5a45be0b85f272bc50cb8b67ea3c0
2020-01-04 19:29:03 +01:00
Mike Rapoport
9df1ac5dd9 userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK
[ Upstream commit 3c1c24d91ffd536de0a64688a9df7f49e58fadbc ]

A while ago Andy noticed
(http://lkml.kernel.org/r/CALCETrWY+5ynDct7eU_nDUqx=okQvjm=Y5wJvA4ahBja=CQXGw@mail.gmail.com)
that UFFD_FEATURE_EVENT_FORK used by an unprivileged user may have
security implications.

As the first step of the solution, the following patch limits the availability
of UFFD_FEATURE_EVENT_FORK to those having CAP_SYS_PTRACE.

The usage of CAP_SYS_PTRACE ensures compatibility with CRIU.

Yet, if there are other users of non-cooperative userfaultfd that run
without CAP_SYS_PTRACE, they would be broken :(

Current implementation of UFFD_FEATURE_EVENT_FORK modifies the file
descriptor table from the read() implementation of uffd, which may have
security implications for unprivileged use of the userfaultfd.

Limit availability of UFFD_FEATURE_EVENT_FORK only for callers that have
CAP_SYS_PTRACE.
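
The restriction described here can be modelled in a few lines. This is a minimal sketch in Python rather than the kernel's C, not the patch itself: check_api_features and has_cap_sys_ptrace are hypothetical names, the bit value used for UFFD_FEATURE_EVENT_FORK is an assumption, and reporting the refusal as -EPERM is likewise assumed.

```python
import errno

# Assumed bit value, used here for illustration only.
UFFD_FEATURE_EVENT_FORK = 1 << 1


def check_api_features(features: int, has_cap_sys_ptrace: bool) -> int:
    """Return 0 if the requested feature bits are allowed, else a negative errno.

    Hypothetical model: only the fork-event feature is gated on the capability.
    """
    if (features & UFFD_FEATURE_EVENT_FORK) and not has_cap_sys_ptrace:
        return -errno.EPERM
    return 0
```

In this model an unprivileged caller that does not request the fork event is unaffected, which matches the stated intent of limiting only UFFD_FEATURE_EVENT_FORK.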

Link: http://lkml.kernel.org/r/1572967777-8812-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Daniel Colascione <dancol@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Nosh Minwalla <nosh@google.com>
Cc: Pavel Emelyanov <ovzxemul@gmail.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:13:18 +01:00
Andrey Konovalov
324c38cc0d UPSTREAM: userfaultfd: untag user pointers
(Upstream commit 7d0325749a6c77b075424ab9de76bcb73a118430).

This patch is part of a series that extends the kernel ABI to allow passing
tagged user pointers (with the top byte set to something other than 0x00)
as syscall arguments.

userfaultfd code uses the provided user pointers for vma lookups, which can
only be done with untagged pointers.

Untag user pointers in validate_range().
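
As a rough illustration of what untagging before validation means, the sketch below strips the top byte and then applies simple alignment and bounds checks. It is a simplified Python model under stated assumptions, not the kernel function: PAGE_SIZE, TASK_SIZE, and the exact set of checks are assumed, and the real untagging on arm64 is done differently from a plain mask.

```python
import errno

PAGE_SIZE = 4096       # assumed page size
TASK_SIZE = 1 << 48    # assumed user address-space limit
TAG_SHIFT = 56         # the tag lives in the top byte of a 64-bit pointer


def untagged_addr(addr: int) -> int:
    """Clear the top (tag) byte of a user pointer (simplified model)."""
    return addr & ((1 << TAG_SHIFT) - 1)


def validate_range(start: int, length: int):
    """Return (status, untagged_start); status is 0 or a negative errno."""
    start = untagged_addr(start)
    # Both the start and the length must be page aligned, and length non-zero.
    if start % PAGE_SIZE or length % PAGE_SIZE or length == 0:
        return -errno.EINVAL, start
    # The range must lie entirely inside the user address space.
    if start >= TASK_SIZE or length > TASK_SIZE - start:
        return -errno.EINVAL, start
    return 0, start
```

Without the untagging step, a pointer carrying a non-zero tag would exceed TASK_SIZE in this model and be rejected even though it refers to a valid mapping.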

Link: http://lkml.kernel.org/r/cdc59ddd7011012ca2e689bc88c3b65b1ea7e413.1563904656.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jens Wiklander <jens.wiklander@linaro.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: Ib1f0d2cffdd94e03651904a08d6852f3a183d2a3
2019-10-07 15:27:41 -04:00
Greg Kroah-Hartman
12dc90c620 This is the 4.19.69 stable release

Merge 4.19.69 into android-4.19

Changes in 4.19.69
	HID: Add 044f:b320 ThrustMaster, Inc. 2 in 1 DT
	MIPS: kernel: only use i8253 clocksource with periodic clockevent
	mips: fix cacheinfo
	netfilter: ebtables: fix a memory leak bug in compat
	ASoC: dapm: Fix handling of custom_stop_condition on DAPM graph walks
	selftests/bpf: fix sendmsg6_prog on s390
	bonding: Force slave speed check after link state recovery for 802.3ad
	net: mvpp2: Don't check for 3 consecutive Idle frames for 10G links
	selftests: forwarding: gre_multipath: Enable IPv4 forwarding
	selftests: forwarding: gre_multipath: Fix flower filters
	can: dev: call netif_carrier_off() in register_candev()
	can: mcp251x: add error check when wq alloc failed
	can: gw: Fix error path of cgw_module_init
	ASoC: Fail card instantiation if DAI format setup fails
	st21nfca_connectivity_event_received: null check the allocation
	st_nci_hci_connectivity_event_received: null check the allocation
	ASoC: rockchip: Fix mono capture
	ASoC: ti: davinci-mcasp: Correct slot_width posed constraint
	net: usb: qmi_wwan: Add the BroadMobi BM818 card
	qed: RDMA - Fix the hw_ver returned in device attributes
	isdn: mISDN: hfcsusb: Fix possible null-pointer dereferences in start_isoc_chain()
	mac80211_hwsim: Fix possible null-pointer dereferences in hwsim_dump_radio_nl()
	netfilter: ipset: Actually allow destination MAC address for hash:ip,mac sets too
	netfilter: ipset: Copy the right MAC address in bitmap:ip,mac and hash:ip,mac sets
	netfilter: ipset: Fix rename concurrency with listing
	rxrpc: Fix potential deadlock
	rxrpc: Fix the lack of notification when sendmsg() fails on a DATA packet
	isdn: hfcsusb: Fix mISDN driver crash caused by transfer buffer on the stack
	net: phy: phy_led_triggers: Fix a possible null-pointer dereference in phy_led_trigger_change_speed()
	perf bench numa: Fix cpu0 binding
	can: sja1000: force the string buffer NULL-terminated
	can: peak_usb: force the string buffer NULL-terminated
	net/ethernet/qlogic/qed: force the string buffer NULL-terminated
	NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()
	NFS: Fix regression whereby fscache errors are appearing on 'nofsc' mounts
	HID: quirks: Set the INCREMENT_USAGE_ON_DUPLICATE quirk on Saitek X52
	HID: input: fix a4tech horizontal wheel custom usage
	drm/rockchip: Suspend DP late
	SMB3: Fix potential memory leak when processing compound chain
	SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL
	s390: put _stext and _etext into .text section
	net: cxgb3_main: Fix a resource leak in a error path in 'init_one()'
	net: stmmac: Fix issues when number of Queues >= 4
	net: stmmac: tc: Do not return a fragment entry
	net: hisilicon: make hip04_tx_reclaim non-reentrant
	net: hisilicon: fix hip04-xmit never return TX_BUSY
	net: hisilicon: Fix dma_map_single failed on arm64
	libata: have ata_scsi_rw_xlat() fail invalid passthrough requests
	libata: add SG safety checks in SFF pio transfers
	x86/lib/cpu: Address missing prototypes warning
	drm/vmwgfx: fix memory leak when too many retries have occurred
	block, bfq: handle NULL return value by bfq_init_rq()
	perf ftrace: Fix failure to set cpumask when only one cpu is present
	perf cpumap: Fix writing to illegal memory in handling cpumap mask
	perf pmu-events: Fix missing "cpu_clk_unhalted.core" event
	KVM: arm64: Don't write junk to sysregs on reset
	KVM: arm: Don't write junk to CP15 registers on reset
	selftests: kvm: Adding config fragments
	HID: wacom: correct misreported EKR ring values
	HID: wacom: Correct distance scale for 2nd-gen Intuos devices
	Revert "dm bufio: fix deadlock with loop device"
	clk: socfpga: stratix10: fix rate caclulationg for cnt_clks
	ceph: clear page dirty before invalidate page
	ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply
	libceph: fix PG split vs OSD (re)connect race
	drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX
	gpiolib: never report open-drain/source lines as 'input' to user-space
	Drivers: hv: vmbus: Fix virt_to_hvpfn() for X86_PAE
	userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
	x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386
	x86/apic: Handle missing global clockevent gracefully
	x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h
	x86/boot: Save fields explicitly, zero out everything else
	x86/boot: Fix boot regression caused by bootparam sanitizing
	dm kcopyd: always complete failed jobs
	dm btree: fix order of block initialization in btree_split_beneath
	dm integrity: fix a crash due to BUG_ON in __journal_read_write()
	dm raid: add missing cleanup in raid_ctr()
	dm space map metadata: fix missing store of apply_bops() return value
	dm table: fix invalid memory accesses with too high sector number
	dm zoned: improve error handling in reclaim
	dm zoned: improve error handling in i/o map code
	dm zoned: properly handle backing device failure
	genirq: Properly pair kobject_del() with kobject_add()
	mm, page_owner: handle THP splits correctly
	mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely
	mm/zsmalloc.c: fix race condition in zs_destroy_pool
	xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
	xfs: don't trip over uninitialized buffer on extent read of corrupted inode
	xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
	xfs: Add helper function xfs_attr_try_sf_addname
	xfs: Add attibute set and helper functions
	xfs: Add attibute remove and helper functions
	xfs: always rejoin held resources during defer roll
	dm zoned: fix potential NULL dereference in dmz_do_reclaim()
	powerpc: Allow flush_(inval_)dcache_range to work across ranges >4GB
	rxrpc: Fix local endpoint refcounting
	rxrpc: Fix read-after-free in rxrpc_queue_local()
	rxrpc: Fix local endpoint replacement
	rxrpc: Fix local refcounting
	Linux 4.19.69

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9824a29e0434a6a80e2f32fdb88c0ac1fe8e5af5
2019-09-02 17:39:29 +02:00
Oleg Nesterov
cf13e30c58 userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
commit 46d0b24c5ee10a15dfb25e20642f5a5ed59c5003 upstream.

userfaultfd_release() should clear vm_flags/vm_userfaultfd_ctx even if
mm->core_state != NULL.

Otherwise a page fault can see userfaultfd_missing() == T and use an
already freed userfaultfd_ctx.
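
A toy model of the behaviour change, assuming the pre-fix code skipped the cleanup whenever a core dump was in progress. The Vma class, the release function, and the flag values are hypothetical stand-ins for the kernel structures, written in Python for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative flag bits; treat the exact values as assumptions.
VM_UFFD_MISSING = 0x200
VM_UFFD_WP = 0x1000


@dataclass
class Vma:
    flags: int
    uffd_ctx: Optional[object] = None


def release(vmas: List[Vma], ctx: object, core_dumping: bool, fixed: bool = True) -> None:
    """Detach ctx from every vma that refers to it.

    With fixed=False the function models the assumed old behaviour of
    bailing out while a core dump is in progress.
    """
    if core_dumping and not fixed:
        return
    for vma in vmas:
        if vma.uffd_ctx is not ctx:
            continue
        vma.flags &= ~(VM_UFFD_MISSING | VM_UFFD_WP)
        vma.uffd_ctx = None
```

In the unfixed variant a vma keeps both its uffd flag and a reference to ctx after release, which is the stale state a later page fault could trip over.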

Link: http://lkml.kernel.org/r/20190820160237.GB4983@redhat.com
Fixes: 04f5866e41fb ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-29 08:28:52 +02:00
Greg Kroah-Hartman
5ad6eeba58 This is the 4.19.58 stable release

Merge 4.19.58 into android-4.19

Changes in 4.19.58
	Bluetooth: Fix faulty expression for minimum encryption key size check
	block: Fix a NULL pointer dereference in generic_make_request()
	md/raid0: Do not bypass blocking queue entered for raid0 bios
	netfilter: nf_flow_table: ignore DF bit setting
	netfilter: nft_flow_offload: set liberal tracking mode for tcp
	netfilter: nft_flow_offload: don't offload when sequence numbers need adjustment
	netfilter: nft_flow_offload: IPCB is only valid for ipv4 family
	ASoC : cs4265 : readable register too low
	ASoC: ak4458: add return value for ak4458_probe
	ASoC: soc-pcm: BE dai needs prepare when pause release after resume
	ASoC: ak4458: rstn_control - return a non-zero on error only
	spi: bitbang: Fix NULL pointer dereference in spi_unregister_master
	drm/mediatek: fix unbind functions
	drm/mediatek: unbind components in mtk_drm_unbind()
	drm/mediatek: call drm_atomic_helper_shutdown() when unbinding driver
	drm/mediatek: clear num_pipes when unbind driver
	drm/mediatek: call mtk_dsi_stop() after mtk_drm_crtc_atomic_disable()
	ASoC: max98090: remove 24-bit format support if RJ is 0
	ASoC: sun4i-i2s: Fix sun8i tx channel offset mask
	ASoC: sun4i-i2s: Add offset to RX channel select
	x86/CPU: Add more Icelake model numbers
	usb: gadget: fusb300_udc: Fix memory leak of fusb300->ep[i]
	usb: gadget: udc: lpc32xx: allocate descriptor with GFP_ATOMIC
	ALSA: hdac: fix memory release for SST and SOF drivers
	SoC: rt274: Fix internal jack assignment in set_jack callback
	scsi: hpsa: correct ioaccel2 chaining
	drm: panel-orientation-quirks: Add quirk for GPD pocket2
	drm: panel-orientation-quirks: Add quirk for GPD MicroPC
	platform/x86: asus-wmi: Only Tell EC the OS will handle display hotkeys from asus_nb_wmi
	platform/x86: intel-vbtn: Report switch events when event wakes device
	platform/x86: mlx-platform: Fix parent device in i2c-mux-reg device registration
	platform/mellanox: mlxreg-hotplug: Add devm_free_irq call to remove flow
	i2c: pca-platform: Fix GPIO lookup code
	cpuset: restore sanity to cpuset_cpus_allowed_fallback()
	scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE
	mm/mlock.c: change count_mm_mlocked_page_nr return type
	tracing: avoid build warning with HAVE_NOP_MCOUNT
	module: Fix livepatch/ftrace module text permissions race
	ftrace: Fix NULL pointer dereference in free_ftrace_func_mapper()
	drm/i915/dmc: protect against reading random memory
	ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME
	crypto: user - prevent operating on larval algorithms
	crypto: cryptd - Fix skcipher instance memory leak
	ALSA: seq: fix incorrect order of dest_client/dest_ports arguments
	ALSA: firewire-lib/fireworks: fix miss detection of received MIDI messages
	ALSA: line6: Fix write on zero-sized buffer
	ALSA: usb-audio: fix sign unintended sign extension on left shifts
	ALSA: hda/realtek: Add quirks for several Clevo notebook barebones
	ALSA: hda/realtek - Change front mic location for Lenovo M710q
	lib/mpi: Fix karactx leak in mpi_powm
	fs/userfaultfd.c: disable irqs for fault_pending and event locks
	tracing/snapshot: Resize spare buffer if size changed
	ARM: dts: armada-xp-98dx3236: Switch to armada-38x-uart serial node
	arm64: kaslr: keep modules inside module region when KASAN is enabled
	drm/amd/powerplay: use hardware fan control if no powerplay fan table
	drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE
	drm/etnaviv: add missing failure path to destroy suballoc
	drm/imx: notify drm core before sending event during crtc disable
	drm/imx: only send event on crtc disable if kept disabled
	ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()
	mm/vmscan.c: prevent useless kswapd loops
	btrfs: Ensure replaced device doesn't have pending chunk allocation
	tty: rocket: fix incorrect forward declaration of 'rp_init()'
	mlxsw: spectrum: Handle VLAN device unlinking
	net/smc: move unhash before release of clcsock
	media: s5p-mfc: fix incorrect bus assignment in virtual child device
	drm/fb-helper: generic: Don't take module ref for fbcon
	f2fs: don't access node/meta inode mapping after iput
	mac80211: mesh: fix missing unlock on error in table_path_del()
	scsi: tcmu: fix use after free
	selftests: fib_rule_tests: Fix icmp proto with ipv6
	x86/boot/compressed/64: Do not corrupt EDX on EFER.LME=1 setting
	net: hns: Fixes the missing put_device in positive leg for roce reset
	ALSA: hda: Initialize power_state field properly
	rds: Fix warning.
	ip6: fix skb leak in ip6frag_expire_frag_queue()
	netfilter: ipv6: nf_defrag: fix leakage of unqueued fragments
	sc16is7xx: move label 'err_spi' to correct section
	net: hns: fix unsigned comparison to less than zero
	bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
	netfilter: ipv6: nf_defrag: accept duplicate fragments again
	KVM: x86: degrade WARN to pr_warn_ratelimited
	KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC
	nfsd: Fix overflow causing non-working mounts on 1 TB machines
	svcrdma: Ignore source port when computing DRC hash
	MIPS: Fix bounds check virt_addr_valid
	MIPS: Add missing EHB in mtc0 -> mfc0 sequence.
	MIPS: have "plain" make calls build dtbs for selected platforms
	dmaengine: qcom: bam_dma: Fix completed descriptors count
	dmaengine: imx-sdma: remove BD_INTR for channel0
	Linux 4.19.58

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-07-10 11:40:00 +02:00
Eric Biggers
052b318100 fs/userfaultfd.c: disable irqs for fault_pending and event locks
commit cbcfa130a911c613a1d9d921af2eea171c414172 upstream.

When IOCB_CMD_POLL is used on a userfaultfd, aio_poll() disables IRQs
and takes kioctx::ctx_lock, then userfaultfd_ctx::fd_wqh.lock.

This may have to wait for userfaultfd_ctx::fd_wqh.lock to be released by
userfaultfd_ctx_read(), which in turn can be waiting for
userfaultfd_ctx::fault_pending_wqh.lock or
userfaultfd_ctx::event_wqh.lock.

But elsewhere the fault_pending_wqh and event_wqh locks are taken with
IRQs enabled.  Since the IRQ handler may take kioctx::ctx_lock, lockdep
reports that a deadlock is possible.

Fix it by always disabling IRQs when taking the fault_pending_wqh and
event_wqh locks.

Commit ae62c16e105a ("userfaultfd: disable irqs when taking the
waitqueue lock") didn't fix this because it only accounted for the
fd_wqh lock, not the other locks nested inside it.
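
The reasoning above can be sketched as a small dependency check: any lock nested, directly or transitively, inside a lock that an IRQ handler may take must itself never be acquired with IRQs enabled. The Python below is a toy model of that rule only, not lockdep; the function name and the data layout are hypothetical, while the lock names follow the message.

```python
from typing import Iterable, List, Set, Tuple


def irq_unsafe_acquisitions(
    irq_safe_locks: Iterable[str],
    nesting: Iterable[Tuple[str, str]],
    taken_with_irqs_enabled: Iterable[str],
) -> List[str]:
    """Return the locks that are taken with IRQs enabled somewhere even though
    they nest (transitively) inside a lock an IRQ handler may take."""
    must_disable_irqs: Set[str] = set(irq_safe_locks)
    pairs = list(nesting)
    changed = True
    while changed:  # propagate the requirement down the nesting chain
        changed = False
        for outer, inner in pairs:
            if outer in must_disable_irqs and inner not in must_disable_irqs:
                must_disable_irqs.add(inner)
                changed = True
    return sorted(l for l in set(taken_with_irqs_enabled) if l in must_disable_irqs)


# Lock order as described: ctx_lock -> fd_wqh -> {fault_pending_wqh, event_wqh}
NESTING = [
    ("ctx_lock", "fd_wqh"),
    ("fd_wqh", "fault_pending_wqh"),
    ("fd_wqh", "event_wqh"),
]
```

Calling irq_unsafe_acquisitions(["ctx_lock"], NESTING, ["fault_pending_wqh", "event_wqh"]) flags both inner locks, modelling the state before this fix. Passing an empty list for the third argument, as after the fix, flags nothing.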

Link: http://lkml.kernel.org/r/20190627075004.21259-1-ebiggers@kernel.org
Fixes: bfe4037e72 ("aio: implement IOCB_CMD_POLL")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reported-by: syzbot+fab6de82892b6b9c6191@syzkaller.appspotmail.com
Reported-by: syzbot+53c0b767f7ca0dc0c451@syzkaller.appspotmail.com
Reported-by: syzbot+a3accb352f9c22041cfa@syzkaller.appspotmail.com
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[4.19+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-10 09:53:42 +02:00
Greg Kroah-Hartman
9bf5904866 This is the 4.19.37 stable release

Merge 4.19.37 into android-4.19

Changes in 4.19.37
	bonding: fix event handling for stacked bonds
	failover: allow name change on IFF_UP slave interfaces
	net: atm: Fix potential Spectre v1 vulnerabilities
	net: bridge: fix per-port af_packet sockets
	net: bridge: multicast: use rcu to access port list from br_multicast_start_querier
	net: Fix missing meta data in skb with vlan packet
	net: fou: do not use guehdr after iptunnel_pull_offloads in gue_udp_recv
	tcp: tcp_grow_window() needs to respect tcp_space()
	team: set slave to promisc if team is already in promisc mode
	tipc: missing entries in name table of publications
	vhost: reject zero size iova range
	ipv4: recompile ip options in ipv4_link_failure
	ipv4: ensure rcu_read_lock() in ipv4_link_failure()
	net: thunderx: raise XDP MTU to 1508
	net: thunderx: don't allow jumbo frames with XDP
	net/mlx5: FPGA, tls, hold rcu read lock a bit longer
	net/tls: prevent bad memory access in tls_is_sk_tx_device_offloaded()
	net/mlx5: FPGA, tls, idr remove on flow delete
	route: Avoid crash from dereferencing NULL rt->from
	sch_cake: Use tc_skb_protocol() helper for getting packet protocol
	sch_cake: Make sure we can write the IP header before changing DSCP bits
	nfp: flower: replace CFI with vlan present
	nfp: flower: remove vlan CFI bit from push vlan action
	sch_cake: Simplify logic in cake_select_tin()
	net: IP defrag: encapsulate rbtree defrag code into callable functions
	net: IP6 defrag: use rbtrees for IPv6 defrag
	net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c
	CIFS: keep FileInfo handle live during oplock break
	cifs: Fix use-after-free in SMB2_write
	cifs: Fix use-after-free in SMB2_read
	cifs: fix handle leak in smb2_query_symlink()
	KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU
	KVM: x86: svm: make sure NMI is injected after nmi_singlestep
	Staging: iio: meter: fixed typo
	staging: iio: ad7192: Fix ad7193 channel address
	iio: gyro: mpu3050: fix chip ID reading
	iio/gyro/bmg160: Use millidegrees for temperature scale
	iio:chemical:bme680: Fix, report temperature in millidegrees
	iio:chemical:bme680: Fix SPI read interface
	iio: cros_ec: Fix the maths for gyro scale calculation
	iio: ad_sigma_delta: select channel when reading register
	iio: dac: mcp4725: add missing powerdown bits in store eeprom
	iio: Fix scan mask selection
	iio: adc: at91: disable adc channel interrupt in timeout case
	iio: core: fix a possible circular locking dependency
	io: accel: kxcjk1013: restore the range after resume.
	staging: most: core: use device description as name
	staging: comedi: vmk80xx: Fix use of uninitialized semaphore
	staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf
	staging: comedi: ni_usb6501: Fix use of uninitialized mutex
	staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf
	ALSA: hda/realtek - add two more pin configuration sets to quirk table
	ALSA: core: Fix card races between register and disconnect
	Input: elan_i2c - add hardware ID for multiple Lenovo laptops
	serial: sh-sci: Fix HSCIF RX sampling point adjustment
	serial: sh-sci: Fix HSCIF RX sampling point calculation
	vt: fix cursor when clearing the screen
	scsi: core: set result when the command cannot be dispatched
	Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO"
	Revert "svm: Fix AVIC incomplete IPI emulation"
	coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
	ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier
	crypto: x86/poly1305 - fix overflow during partial reduction
	drm/ttm: fix out-of-bounds read in ttm_put_pages() v2
	arm64: futex: Restore oldval initialization to work around buggy compilers
	x86/kprobes: Verify stack frame on kretprobe
	kprobes: Mark ftrace mcount handler functions nokprobe
	kprobes: Fix error check when reusing optimized probes
	rt2x00: do not increment sequence number while re-transmitting
	mac80211: do not call driver wake_tx_queue op during reconfig
	drm/amdgpu/gmc9: fix VM_L2_CNTL3 programming
	perf/x86/amd: Add event map for AMD Family 17h
	x86/cpu/bugs: Use __initconst for 'const' init data
	perf/x86: Fix incorrect PEBS_REGS
	x86/speculation: Prevent deadlock on ssb_state::lock
	timers/sched_clock: Prevent generic sched_clock wrap caused by tick_freeze()
	nfit/ars: Remove ars_start_flags
	nfit/ars: Introduce scrub_flags
	nfit/ars: Allow root to busy-poll the ARS state machine
	nfit/ars: Avoid stale ARS results
	mmc: sdhci: Fix data command CRC error handling
	mmc: sdhci: Rename SDHCI_ACMD12_ERR and SDHCI_INT_ACMD12ERR
	mmc: sdhci: Handle auto-command errors
	modpost: file2alias: go back to simple devtable lookup
	modpost: file2alias: check prototype of handler
	tpm/tpm_i2c_atmel: Return -E2BIG when the transfer is incomplete
	tpm: Fix the type of the return value in calc_tpm2_event_size()
	Revert "kbuild: use -Oz instead of -Os when using clang"
	sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup
	device_cgroup: fix RCU imbalance in error case
	mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
	ALSA: info: Fix racy addition/deletion of nodes
	percpu: stop printing kernel addresses
	tools include: Adopt linux/bits.h
	ASoC: rockchip: add missing INTERLEAVED PCM attribute
	i2c-hid: properly terminate i2c_hid_dmi_desc_override_table[] array
	Revert "locking/lockdep: Add debug_locks check in __lock_downgrade()"
	kernel/sysctl.c: fix out-of-bounds access when setting file-max
	Linux 4.19.37

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-04-30 12:53:00 +02:00
Andrea Arcangeli
6ff17bc593 coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
commit 04f5866e41fb70690e28397487d8bd8eea7d712a upstream.

The core dumping code has always run without holding the mmap_sem for
writing, even though that is the only way to ensure that the entire vma
layout will not change from under it.  Only using some signal
serialization on the processes belonging to the mm is not nearly enough.
This was pointed out earlier.  For example in Hugh's post from Jul 2017:

  https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils

  "Not strictly relevant here, but a related note: I was very surprised
   to discover, only quite recently, how handle_mm_fault() may be called
   without down_read(mmap_sem) - when core dumping. That seems a
   misguided optimization to me, which would also be nice to correct"

In particular, because growsdown and growsup can move vm_start/vm_end,
the various loops the core dump does around the vma will not be
consistent if page faults can happen concurrently.

Pretty much all users calling mmget_not_zero()/get_task_mm() and then
taking the mmap_sem had the potential to introduce unexpected side
effects in the core dumping code.

Adding mmap_sem for writing around the ->core_dump invocation is a
viable long term fix, but it requires removing all copy user and page
faults and replacing them with get_dump_page() for all binary formats,
which is not suitable as a short term fix.

For the time being this solution manually covers the places that can
confuse the core dump either by altering the vma layout or the vma flags
while it runs.  Once ->core_dump runs under mmap_sem for writing the
function mmget_still_valid() can be dropped.

Allowing mmap_sem protected sections to run in parallel with the
coredump provides some minor parallelism advantage to the swapoff code
(which seems to be safe enough by never mangling any vma field and can
keep doing swapins in parallel to the core dumping) and to some other
corner case.

In order to facilitate the backporting I added "Fixes: 86039bd3b4e6"
however the side effect of this same race condition in /proc/pid/mem
should be reproducible since before 2.6.12-rc2 so I couldn't add any
other "Fixes:" because there's no hash beyond the git genesis commit.

Because find_extend_vma() is the only location outside of the process
context that could modify the "mm" structures under mmap_sem for
reading, by adding the mmget_still_valid() check to it, all other cases
that take the mmap_sem for reading don't need the new check after
mmget_not_zero()/get_task_mm().  The expand_stack() in page fault
context also doesn't need the new check, because all tasks under core
dumping are frozen.
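
As a rough illustration of the check described above, the sketch below
models it in plain user-space C.  The struct layouts and the helper
may_extend_vma() are hypothetical stand-ins, not the kernel's
definitions, and the body of mmget_still_valid() is an assumption about
its shape: a caller that obtained the mm from outside the process backs
off once a core dump has started.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the field
 * needed for the illustration is modelled. */
struct core_state { int nr_threads; };
struct mm_struct { struct core_state *core_state; /* non-NULL while dumping */ };

/* Assumed shape of the check: the mm is "still valid" for layout
 * changes only while no core dump is in progress. */
static bool mmget_still_valid(const struct mm_struct *mm)
{
    return mm->core_state == NULL;
}

/* Hypothetical caller modelling a find_extend_vma()-style path:
 * refuse to touch the vma layout once a core dump has started. */
static int may_extend_vma(const struct mm_struct *mm)
{
    if (!mmget_still_valid(mm))
        return -1; /* back off, the dumper relies on a stable layout */
    return 0;      /* safe to proceed under mmap_sem */
}
```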

Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com
Fixes: 86039bd3b4 ("userfaultfd: add new syscall to provide memory externalization")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-27 09:36:37 +02:00
Greg Kroah-Hartman
26bf816608 This is the 4.19.18 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlxMGy0ACgkQONu9yGCS
 aT5ppQ/8COjyZg1aTrCrd0ttMHYotw3Lb4B6E/SCf2ub4X38SxGz9irhQ7r2FKdK
 w0ZXlLOF2ddqWe6BUnIfWago4Pk1GBpg3bgnp5XyYTjlJbfI2yZ9ggiO0iNYBPaL
 fN2JwM9eze/7cDlpYbhwGpF4+Wz8wTrzh+NIputcvC6n3SQH/cTGmOUa9rlamQju
 uukkvLanAYY3sqDCl4B415Ds44ROU4filqHYIkvZC81jc3Q0YZ8M7cTmpLcDQKGz
 8Z+Veil07jEM9bF2W8iX79nwxMT+edFC62HMuRCoxJKq+1kccw1TVMWpQ8TWbv13
 zeLOqXxNP6VcNaC251q3QzlInRDp1dtr8KtzA/OG0WFnZBTEDng/iChhiL8qZt0R
 9+Sz7n9uZ5pMRK3tr03Ccjg3AneKWRqad2iaTB/kOwAdu7Uqxz8U9qUuRDFPV7OY
 KTMCCfdS8XpMHl/S+Cvg2dqSNiBEkNmowYO6NvQClG0aoN4/6wH+m2TZ0hCl6PVq
 pNFOTJmp7FOaztEZC4rqW8DoOGeGaNo5DP9A2XKKDR20F7EiAE437ApEQ4p5QGVk
 ek4uslZkwJWU/UOzXRl/Hoz0OlI0ixsdZy1vw88HCl7SD1E7xHJpnRUkOjigTT1Q
 nbCt0Nm/A2+c1tKbzU+PVW8FtIbutZhW1BtrqaIbbHr9NBTICR0=
 =Yg+/
 -----END PGP SIGNATURE-----

Merge 4.19.18 into android-4.19

Changes in 4.19.18
	ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address
	mlxsw: spectrum: Disable lag port TX before removing it
	mlxsw: spectrum_switchdev: Set PVID correctly during VLAN deletion
	net: dsa: mv88x6xxx: mv88e6390 errata
	net, skbuff: do not prefer skb allocation fails early
	qmi_wwan: add MTU default to qmap network interface
	r8169: Add support for new Realtek Ethernet
	ipv6: Take rcu_read_lock in __inet6_bind for mapped addresses
	net: clear skb->tstamp in bridge forwarding path
	netfilter: ipset: Allow matching on destination MAC address for mac and ipmac sets
	gpio: pl061: Move irq_chip definition inside struct pl061
	drm/amd/display: Guard against null stream_state in set_crc_source
	drm/amdkfd: fix interrupt spin lock
	ixgbe: allow IPsec Tx offload in VEPA mode
	platform/x86: asus-wmi: Tell the EC the OS will handle the display off hotkey
	e1000e: allow non-monotonic SYSTIM readings
	usb: typec: tcpm: Do not disconnect link for self powered devices
	selftests/bpf: enable (uncomment) all tests in test_libbpf.sh
	of: overlay: add missing of_node_put() after add new node to changeset
	writeback: don't decrement wb->refcnt if !wb->bdi
	serial: set suppress_bind_attrs flag only if builtin
	bpf: Allow narrow loads with offset > 0
	ALSA: oxfw: add support for APOGEE duet FireWire
	x86/mce: Fix -Wmissing-prototypes warnings
	MIPS: SiByte: Enable swiotlb for SWARM, LittleSur and BigSur
	crypto: ecc - regularize scalar for scalar multiplication
	arm64: perf: set suppress_bind_attrs flag to true
	drm/atomic-helper: Complete fake_commit->flip_done potentially earlier
	clk: meson: meson8b: fix incorrect divider mapping in cpu_scale_table
	samples: bpf: fix: error handling regarding kprobe_events
	usb: gadget: udc: renesas_usb3: add a safety connection way for forced_b_device
	fpga: altera-cvp: fix probing for multiple FPGAs on the bus
	selinux: always allow mounting submounts
	ASoC: pcm3168a: Don't disable pcm3168a when CONFIG_PM defined
	scsi: qedi: Check for session online before getting iSCSI TLV data.
	drm/amdgpu: Reorder uvd ring init before uvd resume
	rxe: IB_WR_REG_MR does not capture MR's iova field
	efi/libstub: Disable some warnings for x86{,_64}
	jffs2: Fix use of uninitialized delayed_work, lockdep breakage
	clk: imx: make mux parent strings const
	pstore/ram: Do not treat empty buffers as valid
	media: uvcvideo: Refactor teardown of uvc on USB disconnect
	powerpc/xmon: Fix invocation inside lock region
	powerpc/pseries/cpuidle: Fix preempt warning
	media: firewire: Fix app_info parameter type in avc_ca{,_app}_info
	ASoC: use dma_ops of parent device for acp_audio_dma
	media: venus: core: Set dma maximum segment size
	staging: erofs: fix use-after-free of on-stack `z_erofs_vle_unzip_io'
	net: call sk_dst_reset when set SO_DONTROUTE
	scsi: target: use consistent left-aligned ASCII INQUIRY data
	scsi: target/core: Make sure that target_wait_for_sess_cmds() waits long enough
	selftests: do not macro-expand failed assertion expressions
	arm64: kasan: Increase stack size for KASAN_EXTRA
	clk: imx6q: reset exclusive gates on init
	arm64: Fix minor issues with the dcache_by_line_op macro
	bpf: relax verifier restriction on BPF_MOV | BPF_ALU
	kconfig: fix file name and line number of warn_ignored_character()
	kconfig: fix memory leak when EOF is encountered in quotation
	mmc: atmel-mci: do not assume idle after atmci_request_end
	btrfs: volumes: Make sure there is no overlap of dev extents at mount time
	btrfs: alloc_chunk: fix more DUP stripe size handling
	btrfs: fix use-after-free due to race between replace start and cancel
	btrfs: improve error handling of btrfs_add_link
	tty/serial: do not free trasnmit buffer page under port lock
	perf intel-pt: Fix error with config term "pt=0"
	perf tests ARM: Disable breakpoint tests 32-bit
	perf svghelper: Fix unchecked usage of strncpy()
	perf parse-events: Fix unchecked usage of strncpy()
	perf vendor events intel: Fix Load_Miss_Real_Latency on SKL/SKX
	netfilter: ipt_CLUSTERIP: check MAC address when duplicate config is set
	netfilter: ipt_CLUSTERIP: remove wrong WARN_ON_ONCE in netns exit routine
	netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routine
	x86/topology: Use total_cpus for max logical packages calculation
	dm crypt: use u64 instead of sector_t to store iv_offset
	dm kcopyd: Fix bug causing workqueue stalls
	perf stat: Avoid segfaults caused by negated options
	tools lib subcmd: Don't add the kernel sources to the include path
	dm snapshot: Fix excessive memory usage and workqueue stalls
	perf cs-etm: Correct packets swapping in cs_etm__flush()
	perf tools: Add missing sigqueue() prototype for systems lacking it
	perf tools: Add missing open_memstream() prototype for systems lacking it
	quota: Lock s_umount in exclusive mode for Q_XQUOTA{ON,OFF} quotactls.
	clocksource/drivers/integrator-ap: Add missing of_node_put()
	dm: Check for device sector overflow if CONFIG_LBDAF is not set
	Bluetooth: btusb: Add support for Intel bluetooth device 8087:0029
	ALSA: bebob: fix model-id of unit for Apogee Ensemble
	sysfs: Disable lockdep for driver bind/unbind files
	IB/usnic: Fix potential deadlock
	scsi: mpt3sas: fix memory ordering on 64bit writes
	scsi: smartpqi: correct lun reset issues
	ath10k: fix peer stats null pointer dereference
	scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown()
	scsi: megaraid: fix out-of-bound array accesses
	iomap: don't search past page end in iomap_is_partially_uptodate
	ocfs2: fix panic due to unrecovered local alloc
	mm/page-writeback.c: don't break integrity writeback on ->writepage() error
	mm/swap: use nr_node_ids for avail_lists in swap_info_struct
	userfaultfd: clear flag if remap event not enabled
	mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps
	iwlwifi: mvm: Send LQ command as async when necessary
	Bluetooth: Fix unnecessary error message for HCI request completion
	ipmi: fix use-after-free of user->release_barrier.rda
	ipmi: msghandler: Fix potential Spectre v1 vulnerabilities
	ipmi: Prevent use-after-free in deliver_response
	ipmi:ssif: Fix handling of multi-part return messages
	ipmi: Don't initialize anything in the core until something uses it
	Linux 4.19.18

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-01-26 11:58:37 +01:00
Peter Xu
2011eb7418 userfaultfd: clear flag if remap event not enabled
[ Upstream commit 3cfd22be0ad663248fadfc8f6ffa3e255c394552 ]

When the process being tracked does mremap() without
UFFD_FEATURE_EVENT_REMAP on the corresponding tracking uffd file handle,
we should not generate the remap event, and at the same time we should
clear all the uffd flags on the new VMA.  Without this patch, we can still
have the VM_UFFD_MISSING|VM_UFFD_WP flags on the new VMA even though the
fault handling process does not even know of the existence of the VMA.
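
The flag handling described above can be sketched as a small user-space
model.  The helper flags_after_mremap() is hypothetical, and the flag
values are quoted from memory of include/linux/mm.h for this kernel
series, so treat them as illustrative rather than authoritative.

```c
#include <stdbool.h>

/* Illustrative flag values (assumed, see the note above). */
#define VM_UFFD_MISSING 0x00000200UL
#define VM_UFFD_WP      0x00001000UL

/* Hypothetical helper: returns the flags the new VMA should carry
 * after mremap().  Without the remap event enabled, the tracker will
 * never learn about the new VMA, so the uffd flags are dropped. */
static unsigned long flags_after_mremap(unsigned long vm_flags,
                                        bool remap_event_enabled)
{
    if (!remap_event_enabled)
        vm_flags &= ~(VM_UFFD_MISSING | VM_UFFD_WP);
    return vm_flags;
}
```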

Link: http://lkml.kernel.org/r/20181211053409.20317-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Pravin Shedge <pravin.shedge4linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-26 09:32:43 +01:00
Greg Kroah-Hartman
a87fb6b90d This is the 4.19.11 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlwai8oACgkQONu9yGCS
 aT7mEA//TNX+LqwK18576UwG/nnUmFNlcfFsTycY1cAOSa4PdYdA5yavO8+BvRuf
 D8iLvHhaFM7YINvkWy8Yngb4H6MLCBqFYrpPcwIBTf5vPf4i7Ct31X9Jw7Kilv1B
 j6sCgGvI7BUjkXAL/rqFLfnlS3qkUcaF3g1OOvyaCcg8A+mxP0mZ+8hWNC6GzVue
 If7RzoeQFVSeG38Ji6acrwwfeIGcD4JS8nmHv0ATMRn9QDj/Sc1rHlv6kWxKzrlD
 k1876ciCGSdo3LWxqhbNiyL6z1cNL+eYQiof7NCCb2BReVhteT2Wsp7SdwTiA/0V
 tT2ZqC7z+qXKrO3O1KbKYO3OVUsV/Au3E9cj2RCripkO4UJdnYMZ43XYaHA4lHsq
 NoV0THuaO+O1CqBV0hZC22gIwue1vJ+D5G+jeygOl9bBS5NGEeCentNguRKUVOQC
 sybn0x8EQ1ldWUxIYateJ/9NCDDTXsbD/heEtXMGYL48KG3x4ibagysXwWobGspK
 uoJKAXD3UtcsLCeJ7p6qlA+hhtUBcFm48m3ADvJ0SYDDFynzAK+BOER39XSUW8AF
 u6LAFc0/XV+1Ci+GuIVXL1grIehZyRzqmamqfn+6c9kOnZ1DMEyVPMUdtKyi+c3G
 4wvYKK+uf6RBGr2n8Fg9rMaL6ZWOSolj7SV/QBSducKhJS4quYo=
 =dihN
 -----END PGP SIGNATURE-----

Merge 4.19.11 into android-4.19

Changes in 4.19.11
	sched/pelt: Fix warning and clean up IRQ PELT config
	scsi: raid_attrs: fix unused variable warning
	staging: olpc_dcon: add a missing dependency
	slimbus: ngd: mark PM functions as __maybe_unused
	i2c: aspeed: fix build warning
	ARM: dts: qcom-apq8064-arrow-sd-600eval fix graph_endpoint warning
	drm/msm: fix address space warning
	pinctrl: sunxi: a83t: Fix IRQ offset typo for PH11
	aio: fix spectre gadget in lookup_ioctx
	scripts/spdxcheck.py: always open files in binary mode
	fs/iomap.c: get/put the page in iomap_page_create/release()
	userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
	arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearing
	block/bio: Do not zero user pages
	ovl: fix decode of dir file handle with multi lower layers
	ovl: fix missing override creds in link of a metacopy upper
	MMC: OMAP: fix broken MMC on OMAP15XX/OMAP5910/OMAP310
	mmc: core: use mrq->sbc when sending CMD23 for RPMB
	mmc: sdhci-omap: Fix DCRC error handling during tuning
	mmc: sdhci: fix the timeout check window for clock and reset
	fuse: continue to send FUSE_RELEASEDIR when FUSE_OPEN returns ENOSYS
	ARM: mmp/mmp2: fix cpu_is_mmp2() on mmp2-dt
	ARM: dts: bcm2837: Fix polarity of wifi reset GPIOs
	dm thin: send event about thin-pool state change _after_ making it
	dm cache metadata: verify cache has blocks in blocks_are_clean_separate_dirty()
	dm: call blk_queue_split() to impose device limits on bios
	tracing: Fix memory leak in create_filter()
	tracing: Fix memory leak in set_trigger_filter()
	tracing: Fix memory leak of instance function hash filters
	media: vb2: don't call __vb2_queue_cancel if vb2_start_streaming failed
	powerpc/msi: Fix NULL pointer access in teardown code
	powerpc: Look for "stdout-path" when setting up legacy consoles
	drm/nouveau/kms: Fix memory leak in nv50_mstm_del()
	drm/nouveau/kms/nv50-: also flush fb writes when rewinding push buffer
	Revert "drm/rockchip: Allow driver to be shutdown on reboot/kexec"
	drm/i915/gvt: Fix tiled memory decoding bug on BDW
	drm/i915/execlists: Apply a full mb before execution for Braswell
	drm/amdgpu/powerplay: Apply avfs cks-off voltages on VI
	drm/amdkfd: add new vega10 pci ids
	drm/amdgpu: add some additional vega10 pci ids
	drm/amdgpu: update smu firmware images for VI variants (v2)
	drm/amdgpu: update SMC firmware image for polaris10 variants
	dm zoned: Fix target BIO completion handling
	x86/build: Fix compiler support check for CONFIG_RETPOLINE
	Linux 4.19.11

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-12-19 19:30:07 +01:00
Andrea Arcangeli
d41c49daf2 userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
commit 01e881f5a1fca4677e82733061868c6d6ea05ca7 upstream.

Calling UFFDIO_UNREGISTER on virtual ranges not yet registered in uffd
could trigger a harmless false positive WARN_ON.  Check the vma is
already registered before checking VM_MAYWRITE to shut off the false
positive warning.

Link: http://lkml.kernel.org/r/20181206212028.18726-2-aarcange@redhat.com
Cc: <stable@vger.kernel.org>
Fixes: 29ec90660d68 ("userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: syzbot+06c7092e7d71218a2c16@syzkaller.appspotmail.com
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-19 19:19:50 +01:00
Greg Kroah-Hartman
c454ec1e21 This is the 4.19.7 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlwIG48ACgkQONu9yGCS
 aT7g6Q//RkJ8ZWaRkykcCGaWIvwI6QF1tmKalIEWmToPdndDuQdUDGzWVwfE9G7P
 yLcnp3GMlXo4F82BBwG8lFSAm9zaeqaLabnJnXbCc5mZ3xi/2aNqIGHzBY1isNZl
 0fTzzcelnAKzjp0Aa/egRLOeraSLgVt/Cp7Ha3FXMP6RNxUMzs1pbQ2IFZ3m+P4G
 CAD3Iye6geOaZTu/kXiiooUEUGFQFbV4c3AZ4VW7dZDdrG+ekwtF4YHtkEPseWJQ
 Ugtrbr6S0IxYQ91o1Pk77kg4uwUFYo12jrk8Ni4gaPZE6mQCa08tr2Alg2oZkJGw
 PdXnt2ASYGRWFYK2JAuTvKzhHrTEJYhiC323dKYCAx7BgfFaqdo5F20oNzYxXFBB
 gGA3AzDDtLUD3OOO+lxrDxXMhpwXUx92WXsoJVsaSafdqIDAueq14sH19wqm0gUJ
 D1fC2dWTsFrPZKjkU8Z6rJAyO1XZED55h7v1YlqAt2ibjCeDKpjnW3yvUt8Ivpqc
 nlnmp8v/Yl2cdY55XtlgUadpknSc2jApFMwhSWetxAaqDCvha2dLQ28YMyPRJzat
 ZHOkizM/VUntXvlUzFvVTsqLQiX0sfLG6MKcUkzWehPomNKT+B8XL1wtzytv9QXb
 jOY8nRD5PiQo2p35cqdDCskBwqzEwY+WxDe7ji0yHZysBZLxoxQ=
 =OiCf
 -----END PGP SIGNATURE-----

Merge 4.19.7 into android-4.19

Changes in 4.19.7
	mm/huge_memory: rename freeze_page() to unmap_page()
	mm/huge_memory: splitting set mapping+index before unfreeze
	mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
	mm/khugepaged: collapse_shmem() stop if punched or truncated
	mm/khugepaged: fix crashes due to misaccounted holes
	mm/khugepaged: collapse_shmem() remember to clear holes
	mm/khugepaged: minor reorderings in collapse_shmem()
	mm/khugepaged: collapse_shmem() without freezing new_page
	mm/khugepaged: collapse_shmem() do not crash on Compound
	lan743x: Enable driver to work with LAN7431
	lan743x: fix return value for lan743x_tx_napi_poll
	net: don't keep lonely packets forever in the gro hash
	net: gemini: Fix copy/paste error
	net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue
	packet: copy user buffers before orphan or clone
	rapidio/rionet: do not free skb before reading its length
	s390/qeth: fix length check in SNMP processing
	usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2
	net: thunderx: set xdp_prog to NULL if bpf_prog_add fails
	net: skb_scrub_packet(): Scrub offload_fwd_mark
	virtio-net: disable guest csum during XDP set
	virtio-net: fail XDP set if guest csum is negotiated
	net/dim: Update DIM start sample after each DIM iteration
	tcp: defer SACK compression after DupThresh
	net: phy: add workaround for issue where PHY driver doesn't bind to the device
	tipc: fix lockdep warning during node delete
	x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
	x86/speculation: Apply IBPB more strictly to avoid cross-process data leak
	x86/speculation: Propagate information about RSB filling mitigation to sysfs
	x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant
	x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support
	x86/retpoline: Remove minimal retpoline support
	x86/speculation: Update the TIF_SSBD comment
	x86/speculation: Clean up spectre_v2_parse_cmdline()
	x86/speculation: Remove unnecessary ret variable in cpu_show_common()
	x86/speculation: Move STIPB/IBPB string conditionals out of cpu_show_common()
	x86/speculation: Disable STIBP when enhanced IBRS is in use
	x86/speculation: Rename SSBD update functions
	x86/speculation: Reorganize speculation control MSRs update
	sched/smt: Make sched_smt_present track topology
	x86/Kconfig: Select SCHED_SMT if SMP enabled
	sched/smt: Expose sched_smt_present static key
	x86/speculation: Rework SMT state change
	x86/l1tf: Show actual SMT state
	x86/speculation: Reorder the spec_v2 code
	x86/speculation: Mark string arrays const correctly
	x86/speculataion: Mark command line parser data __initdata
	x86/speculation: Unify conditional spectre v2 print functions
	x86/speculation: Add command line control for indirect branch speculation
	x86/speculation: Prepare for per task indirect branch speculation control
	x86/process: Consolidate and simplify switch_to_xtra() code
	x86/speculation: Avoid __switch_to_xtra() calls
	x86/speculation: Prepare for conditional IBPB in switch_mm()
	ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS
	x86/speculation: Split out TIF update
	x86/speculation: Prevent stale SPEC_CTRL msr content
	x86/speculation: Prepare arch_smt_update() for PRCTL mode
	x86/speculation: Add prctl() control for indirect branch speculation
	x86/speculation: Enable prctl mode for spectre_v2_user
	x86/speculation: Add seccomp Spectre v2 user space protection mode
	x86/speculation: Provide IBPB always command line options
	userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas
	kvm: mmu: Fix race in emulated page table writes
	kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb
	KVM: nVMX/nSVM: Fix bug which sets vcpu->arch.tsc_offset to L1 tsc_offset
	KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall
	KVM: LAPIC: Fix pv ipis use-before-initialization
	KVM: X86: Fix scan ioapic use-before-initialization
	KVM: VMX: re-add ple_gap module parameter
	xtensa: enable coprocessors that are being flushed
	xtensa: fix coprocessor context offset definitions
	xtensa: fix coprocessor part of ptrace_{get,set}xregs
	udf: Allow mounting volumes with incorrect identification strings
	btrfs: Always try all copies when reading extent buffers
	Btrfs: ensure path name is null terminated at btrfs_control_ioctl
	Btrfs: fix rare chances for data loss when doing a fast fsync
	Btrfs: fix race between enabling quotas and subvolume creation
	btrfs: relocation: set trans to be NULL after ending transaction
	PCI: layerscape: Fix wrong invocation of outbound window disable accessor
	PCI: dwc: Fix MSI-X EP framework address calculation bug
	PCI: Fix incorrect value returned from pcie_get_speed_cap()
	arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou.
	x86/MCE/AMD: Fix the thresholding machinery initialization order
	x86/fpu: Disable bottom halves while loading FPU registers
	perf/x86/intel: Move branch tracing setup to the Intel-specific source file
	perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts()
	perf/x86/intel: Disallow precise_ip on BTS events
	fs: fix lost error code in dio_complete
	ALSA: wss: Fix invalid snd_free_pages() at error path
	ALSA: ac97: Fix incorrect bit shift at AC97-SPSA control write
	ALSA: control: Fix race between adding and removing a user element
	ALSA: sparc: Fix invalid snd_free_pages() at error path
	ALSA: hda: Add ASRock N68C-S UCC the power_save blacklist
	ALSA: hda/realtek - Support ALC300
	ALSA: hda/realtek - fix headset mic detection for MSI MS-B171
	ALSA: hda/realtek - fix the pop noise on headphone for lenovo laptops
	ALSA: hda/realtek - Add auto-mute quirk for HP Spectre x360 laptop
	function_graph: Create function_graph_enter() to consolidate architecture code
	ARM: function_graph: Simplify with function_graph_enter()
	microblaze: function_graph: Simplify with function_graph_enter()
	x86/function_graph: Simplify with function_graph_enter()
	nds32: function_graph: Simplify with function_graph_enter()
	powerpc/function_graph: Simplify with function_graph_enter()
	sh/function_graph: Simplify with function_graph_enter()
	sparc/function_graph: Simplify with function_graph_enter()
	parisc: function_graph: Simplify with function_graph_enter()
	riscv/function_graph: Simplify with function_graph_enter()
	s390/function_graph: Simplify with function_graph_enter()
	arm64: function_graph: Simplify with function_graph_enter()
	MIPS: function_graph: Simplify with function_graph_enter()
	function_graph: Make ftrace_push_return_trace() static
	function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack
	function_graph: Have profiler use curr_ret_stack and not depth
	function_graph: Move return callback before update of curr_ret_stack
	function_graph: Reverse the order of pushing the ret_stack and the callback
	binder: fix race that allows malicious free of live buffer
	ext2: initialize opts.s_mount_opt as zero before using it
	ext2: fix potential use after free
	ASoC: intel: cht_bsw_max98090_ti: Add quirk for boards using pmc_plt_clk_0
	ASoC: pcm186x: Fix device reset-registers trigger value
	ARM: dts: rockchip: Remove @0 from the veyron memory node
	dmaengine: at_hdmac: fix memory leak in at_dma_xlate()
	dmaengine: at_hdmac: fix module unloading
	staging: most: use format specifier "%s" in snprintf
	staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION
	staging: mt7621-dma: fix potentially dereferencing uninitialized 'tx_desc'
	staging: mt7621-pinctrl: fix uninitialized variable ngroups
	staging: rtl8723bs: Fix incorrect sense of ether_addr_equal
	staging: rtl8723bs: Add missing return for cfg80211_rtw_get_station
	USB: usb-storage: Add new IDs to ums-realtek
	usb: core: quirks: add RESET_RESUME quirk for Cherry G230 Stream series
	Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid"
	iio/hid-sensors: Fix IIO_CHAN_INFO_RAW returning wrong values for signed numbers
	iio:st_magn: Fix enable device after trigger
	lib/test_kmod.c: fix rmmod double free
	mm: cleancache: fix corruption on missed inode invalidation
	mm: use swp_offset as key in shmem_replace_page()
	Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl()
	misc: mic/scif: fix copy-paste error in scif_create_remote_lookup
	Linux 4.19.7

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-12-06 10:34:09 +01:00
Andrea Arcangeli
34b7a7cc53 userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas
commit 29ec90660d68bbdd69507c1c8b4e33aa299278b1 upstream.

After the VMA to register the uffd onto is found, check that it has
VM_MAYWRITE set before allowing registration.  This way we inherit all
common code checks before allowing to fill file holes in shmem and
hugetlbfs with UFFDIO_COPY.

The userfaultfd memory model is not applicable for readonly files unless
it's a MAP_PRIVATE.
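
A minimal sketch of the registration check, written as a user-space
model.  The helper uffd_check_register() and the -EPERM return value are
assumptions made for illustration; the VM_MAYWRITE value is quoted from
memory of include/linux/mm.h.

```c
#include <errno.h>

/* Illustrative flag value (assumed, see the note above). */
#define VM_MAYWRITE 0x00000020UL

/* Hypothetical helper: 0 if registration may proceed, a negative
 * errno otherwise.  The error code is an assumption. */
static int uffd_check_register(unsigned long vm_flags)
{
    if (!(vm_flags & VM_MAYWRITE))
        return -EPERM;
    return 0;
}
```

A MAP_PRIVATE mapping of a read-only file keeps VM_MAYWRITE, because
writes go to private copies, which is why such mappings pass this check.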

Link: http://lkml.kernel.org/r/20181126173452.26955-4-aarcange@redhat.com
Fixes: ff62a34210 ("hugetlb: implement memfd sealing")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Hugh Dickins <hughd@google.com>
Reported-by: Jann Horn <jannh@google.com>
Fixes: 4c27fe4c4c ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
Cc: <stable@vger.kernel.org>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05 19:32:04 +01:00
Colin Cross
533e4ed309 ANDROID: mm: add a field to store names for private anonymous memory
Userspace processes often have multiple allocators that each do
anonymous mmaps to get memory.  When examining memory usage of
individual processes or systems as a whole, it is useful to be
able to break down the various heaps that were allocated by
each layer and examine their size, RSS, and physical memory
usage.

This patch adds a user pointer to the shared union in
vm_area_struct that points to a null terminated string inside
the user process containing a name for the vma.  vmas that
point to the same address will be merged, but vmas that
point to equivalent strings at different addresses will
not be merged.

Userspace can set the name for a region of memory by calling
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name);
Setting the name to NULL clears it.

The names of named anonymous vmas are shown in /proc/pid/maps
as [anon:<name>] and in /proc/pid/smaps in a new "Name" field
that is only present for named vmas.  If the userspace pointer
is no longer valid all or part of the name will be replaced
with "<fault>".

The idea to store a userspace pointer to reduce the complexity
within mm (at the expense of the complexity of reading
/proc/pid/mem) came from Dave Hansen.  This results in no
runtime overhead in the mm subsystem other than comparing
the anon_name pointers when considering vma merging.  The pointer
is stored in a union with fields that are only used on file-backed
mappings, so it does not increase memory usage.

Includes a fix from Jed Davis <jld@mozilla.com> for a typo in
prctl_set_vma_anon_name that could attempt to set the name
across two vmas at the same time, which might corrupt the vma
list.  Fix it to use tmp instead of end to limit the name
setting to a single vma at a time.

Bug: 120441514
Change-Id: I9aa7b6b5ef536cd780599ba4e2fba8ceebe8b59f
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
[AmitP: Fix get_user_pages_remote() call to align with upstream commit
        5b56d49fc3 ("mm: add locked parameter to get_user_pages_remote()")]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2018-12-05 09:48:11 -08:00
Christoph Hellwig
d2e97f026b userfaultfd: disable irqs when taking the waitqueue lock
commit ae62c16e105a869524afcf8a07ee85c5ae5d0479 upstream.

userfaultfd contains home-grown locking of the waitqueue lock, and does
not disable interrupts.  This relies on the fact that no one else takes it
from interrupt context and violates an invariant of the normal waitqueue
locking scheme.  With aio poll it is easy to trigger other locks that
disable interrupts (or are called from interrupt context).

Link: http://lkml.kernel.org/r/20181018154101.18750-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>	[4.19.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13 11:08:46 -08:00
Souptick Joarder
2b74030354 mm: Change return type int to vm_fault_t for fault handlers
Use new return type vm_fault_t for fault handler.  For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno.  Once all instances are converted, vm_fault_t will become a
distinct type.

Ref-> commit 1c8f422059 ("mm: change return type to vm_fault_t")

The aim is to change the return type of finish_fault() and
handle_mm_fault() to vm_fault_t.  As part of that cleanup, the return
types of all other recursively called functions have been changed to
vm_fault_t.

The places from which handle_mm_fault() is invoked will be changed to
vm_fault_t as well, but in a separate patch.

vmf_error() is the inline function newly introduced in 4.17-rc6.
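
As a rough userspace illustration of the distinction being documented,
the sketch below models a handler that returns a VM_FAULT code rather
than an errno.  The typedef, the two constant values and the body of
vmf_error() are assumptions modeled on the kernel headers of this era,
and demo_fault() is a hypothetical handler, not kernel code.

```c
#include <errno.h>

/* Userspace model only; values are assumed, not taken from this patch. */
typedef int vm_fault_t;

#define VM_FAULT_OOM    0x0001
#define VM_FAULT_SIGBUS 0x0002

/* Translate a negative errno into a VM_FAULT code. */
static inline vm_fault_t vmf_error(int err)
{
	if (err == -ENOMEM)
		return VM_FAULT_OOM;
	return VM_FAULT_SIGBUS;
}

/*
 * Hypothetical fault handler: backing_err is the errno-style result
 * (0 or negative) of some backing-store operation.
 */
static vm_fault_t demo_fault(int backing_err)
{
	if (backing_err)
		return vmf_error(backing_err);
	return 0; /* no VM_FAULT bits set: fault handled */
}
```

The point of the sketch is that callers compare the result against
VM_FAULT_* bits and never against -Exxx values.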

[akpm@linux-foundation.org: don't shadow outer local `ret' in __do_huge_pmd_anonymous_page()]
Link: http://lkml.kernel.org/r/20180604171727.GA20279@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-23 18:48:44 -07:00
Matthew Wilcox
c430d1e848 userfaultfd: use fault_wqh lock
The userfaultfd code currently uses the unlocked waitqueue helpers for
managing fault_wqh, but instead of holding the waitqueue lock for this
waitqueue around these calls, it holds the waitqueue lock of
fault_pending_wqh, which is a different waitqueue instance.  Given that
the waitqueue is not exposed to the rest of the kernel this actually
works ok at the moment, but prevents the userfaultfd locking rules from
being enforced using lockdep.

Switch to the internally locked waitqueue helpers instead.  This means
that the lock inside fault_wqh now nests inside the fault_pending_wqh
lock, but that's not a problem since it was entirely unused before.

[hch@lst.de: slight changelog updates]
[rppt@linux.vnet.ibm.com: spotted changelog spellos]
Link: http://lkml.kernel.org/r/20171214152344.6880-3-hch@lst.de
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:47 -07:00
Colin Ian King
5241d47274 fs/userfaultfd.c: remove redundant pointer uwq
Pointer uwq is being assigned but is never used hence it is redundant
and can be removed.

Cleans up clang warning:
  warning: variable 'uwq' set but not used [-Wunused-but-set-variable]

Link: http://lkml.kernel.org/r/20180717090802.18357-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:32 -07:00
Mike Rapoport
31e810aa10 userfaultfd: remove uffd flags from vma->vm_flags if UFFD_EVENT_FORK fails
The fix in commit 0cbb4b4f4c ("userfaultfd: clear the
vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails") cleared the
vma->vm_userfaultfd_ctx but kept userfaultfd flags in vma->vm_flags
that were copied from the parent process VMA.

As the result, there is an inconsistency between the values of
vma->vm_userfaultfd_ctx.ctx and vma->vm_flags which triggers BUG_ON
in userfaultfd_release().

Clearing the uffd flags from vma->vm_flags in case of UFFD_EVENT_FORK
failure resolves the issue.

Link: http://lkml.kernel.org/r/1532931975-25473-1-git-send-email-rppt@linux.vnet.ibm.com
Fixes: 0cbb4b4f4c ("userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails")
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reported-by: syzbot+121be635a7a35ddb7dcb@syzkaller.appspotmail.com
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02 16:03:40 -07:00
Janosch Frank
1e2c043628 userfaultfd: hugetlbfs: fix userfaultfd_huge_must_wait() pte access
Use huge_ptep_get() to translate huge ptes to normal ptes so we can
check them with the huge_pte_* functions.  Otherwise some architectures
will check the wrong values and will not wait for userspace to bring in
the memory.

Link: http://lkml.kernel.org/r/20180626132421.78084-1-frankja@linux.ibm.com
Fixes: 369cd2121b ("userfaultfd: hugetlbfs: userfaultfd_huge_must_wait for hugepmd ranges")
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-03 17:32:18 -07:00
Mike Rapoport
df2cc96e77 userfaultfd: prevent non-cooperative events vs mcopy_atomic races
If a process monitored with userfaultfd changes its memory mappings or
forks() at the same time as the uffd monitor fills the process memory
with UFFDIO_COPY, the actual creation of page table entries and copying
of the data in mcopy_atomic may happen either before or after the memory
mapping modifications and there is no way for the uffd monitor to
maintain a consistent view of the process memory layout.

For instance, let's consider fork() running in parallel with
userfaultfd_copy():

process        		         |	uffd monitor
---------------------------------+------------------------------
fork()        		         | userfaultfd_copy()
...        		         | ...
    dup_mmap()        	         |     down_read(mmap_sem)
    down_write(mmap_sem)         |     /* create PTEs, copy data */
        dup_uffd()               |     up_read(mmap_sem)
        copy_page_range()        |
        up_write(mmap_sem)       |
        dup_uffd_complete()      |
            /* notify monitor */ |

If the userfaultfd_copy() takes the mmap_sem first, the new page(s) will
be present by the time copy_page_range() is called and they will appear
in the child's memory mappings.  However, if the fork() is the first to
take the mmap_sem, the new pages won't be mapped in the child's address
space.

If the pages are not present and child tries to access them, the monitor
will get page fault notification and everything is fine.  However, if
the pages *are present*, the child can access them without uffd
noticing.  And if we copy them into child it'll see the wrong data.
Since we are talking about background copy, we'd need to decide whether
the pages should be copied or not regardless of #PF notifications.

Since userfaultfd monitor has no way to determine what was the order,
let's disallow userfaultfd_copy in parallel with the non-cooperative
events.  In such a case we return -EAGAIN and the uffd monitor can
understand that userfaultfd_copy() clashed with a non-cooperative event
and take an appropriate action.
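
A monitor-side reaction to the new -EAGAIN could look roughly like the
sketch below.  It is a minimal model, not code from this patch: the
callback type, copy_with_retry() and flaky_attempt() are hypothetical
names, and the negative-errno return convention is an assumption (a
real ioctl() wrapper would translate -1/errno into that form).

```c
#include <errno.h>

/*
 * Hypothetical callback type: performs one UFFDIO_COPY-style attempt and
 * returns 0 on success or a negative errno on failure.
 */
typedef int (*copy_attempt_fn)(void *ctx);

/*
 * Retry an attempt while it reports -EAGAIN, for at most max_tries
 * attempts.  Returns the result of the last attempt, or -EAGAIN if
 * max_tries is <= 0 and nothing was tried.
 */
static int copy_with_retry(copy_attempt_fn attempt, void *ctx, int max_tries)
{
	int ret = -EAGAIN;

	for (int i = 0; i < max_tries && ret == -EAGAIN; i++) {
		/*
		 * A real monitor would first read and handle the pending
		 * non-cooperative events (fork, mremap, ...) here.
		 */
		ret = attempt(ctx);
	}
	return ret;
}

/* Stand-in for the kernel: fails with -EAGAIN until *remaining is zero. */
static int flaky_attempt(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}
```

Bounding the number of attempts is a design choice of the sketch; a
monitor may prefer to block on the uffd until the event that caused the
clash has been consumed.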

Link: http://lkml.kernel.org/r/1527061324-19949-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-07 17:34:38 -07:00
Linus Torvalds
a9a08845e9 vfs: do bulk POLL* -> EPOLL* replacement
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
        L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
        for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

with de-mangling cleanups yet to come.

NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do.  But the keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.

The next patch from Al will sort out the final differences, and we
should be all done.

Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-11 14:34:03 -08:00
Eric Biggers
284cd241a1 userfaultfd: convert to use anon_inode_getfd()
Nothing actually calls userfaultfd_file_create() besides the
userfaultfd() system call itself.  So simplify things by folding it into
the system call and using anon_inode_getfd() instead of
anon_inode_getfile().  Do the same in resolve_userfault_fork() as well.

This removes over 50 lines with no change in functionality.

Link: http://lkml.kernel.org/r/20171229212403.22800-1-ebiggers3@gmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-31 17:18:39 -08:00
Huang Ying
a365ac09d3 mm, userfaultfd, THP: avoid waiting when PMD under THP migration
If THP migration is enabled, for a VMA handled by userfaultfd, consider
the following situation,

  do_page_fault()
    __do_huge_pmd_anonymous_page()
     handle_userfault()
       userfault_msg()
         /* a huge page is allocated and mapped at fault address */
         /* the huge page is under migration, leaves migration entry
            in page table */
       userfaultfd_must_wait()
         /* return true because !pmd_present() */
       /* may wait in loop until fatal signal */

That is, it may be possible for userfaultfd_must_wait() to encounter a
PMD entry which is !pmd_none() && !pmd_present().  In the current
implementation, we will wait for such PMD entries, which may cause
unnecessary waiting, and potential soft lockup.

This is fixed by not waiting when !pmd_none() && !pmd_present(), and
only waiting when pmd_none().

This may not be a problem in practice, because userfaultfd_must_wait()
is always called with mm->mmap_sem read-locked.  mremap() will
write-lock mm->mmap_sem.  And UFFDIO_COPY doesn't support copying THP
mappings.  But the change introduced still makes the code more correct,
and makes the PMD and PTE code more consistent.
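
The rule described above can be modeled in a few lines.  This is a
simplified sketch, not the kernel code: enum pmd_state and both
functions are hypothetical, and the PTE-level check that follows for a
present, non-huge PMD is left out.

```c
#include <stdbool.h>

/* Simplified model of a PMD entry's state; not the kernel representation. */
enum pmd_state {
	PMD_STATE_NONE,      /* pmd_none(): nothing mapped yet */
	PMD_STATE_PRESENT,   /* pmd_present(): something is mapped */
	PMD_STATE_MIGRATION, /* !pmd_none() && !pmd_present() */
};

/* Behaviour before the fix, as described: wait whenever not present. */
static bool must_wait_old(enum pmd_state s)
{
	return s != PMD_STATE_PRESENT;
}

/* Behaviour after the fix: wait only when the entry is empty. */
static bool must_wait_new(enum pmd_state s)
{
	return s == PMD_STATE_NONE;
}
```

The two functions differ only for PMD_STATE_MIGRATION, which is the
case the call chain above runs into.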

Link: http://lkml.kernel.org/r/20171207011752.3292-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.UK>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-31 17:18:37 -08:00
Linus Torvalds
168fe32a07 Merge branch 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull poll annotations from Al Viro:
 "This introduces a __bitwise type for POLL### bitmap, and propagates
  the annotations through the tree. Most of that stuff is as simple as
  'make ->poll() instances return __poll_t and do the same to local
  variables used to hold the future return value'.

  Some of the obvious brainos found in process are fixed (e.g. POLLIN
  misspelled as POLL_IN). At that point the amount of sparse warnings is
  low and most of them are for genuine bugs - e.g. ->poll() instance
  deciding to return -EINVAL instead of a bitmap. I hadn't touched those
  in this series - it's large enough as it is.

  Another problem it has caught was eventpoll() ABI mess; select.c and
  eventpoll.c assumed that corresponding POLL### and EPOLL### were
  equal. That's true for some, but not all of them - EPOLL### are
  arch-independent, but POLL### are not.

  The last commit in this series separates userland POLL### values from
  the (now arch-independent) kernel-side ones, converting between them
  in the few places where they are copied to/from userland. AFAICS, this
  is the least disruptive fix preserving poll(2) ABI and making epoll()
  work on all architectures.

  As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
  it will trigger only on what would've triggered EPOLLWRBAND on other
  architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
  at all on sparc. With this patch they should work consistently on all
  architectures"

* 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
  make kernel-side POLL... arch-independent
  eventpoll: no need to mask the result of epi_item_poll() again
  eventpoll: constify struct epoll_event pointers
  debugging printk in sg_poll() uses %x to print POLL... bitmap
  annotate poll(2) guts
  9p: untangle ->poll() mess
  ->si_band gets POLL... bitmap stored into a user-visible long field
  ring_buffer_poll_wait() return value used as return value of ->poll()
  the rest of drivers/*: annotate ->poll() instances
  media: annotate ->poll() instances
  fs: annotate ->poll() instances
  ipc, kernel, mm: annotate ->poll() instances
  net: annotate ->poll() instances
  apparmor: annotate ->poll() instances
  tomoyo: annotate ->poll() instances
  sound: annotate ->poll() instances
  acpi: annotate ->poll() instances
  crypto: annotate ->poll() instances
  block: annotate ->poll() instances
  x86: annotate ->poll() instances
  ...
2018-01-30 17:58:07 -08:00
Andrea Arcangeli
0cbb4b4f4c userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails
The previous fix in commit 384632e67e ("userfaultfd: non-cooperative:
fix fork use after free") corrected the refcounting in case of
UFFD_EVENT_FORK failure for the fork userfault paths.

That still didn't clear the vma->vm_userfaultfd_ctx of the vmas that
were set to point to the aborted new uffd ctx earlier in
dup_userfaultfd.

Link: http://lkml.kernel.org/r/20171223002505.593-2-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-04 16:45:09 -08:00
Al Viro
076ccb76e1 fs: annotate ->poll() instances
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-11-27 16:20:05 -05:00
Mike Rapoport
00bb31fa44 userfaultfd: use mmgrab instead of open-coded increment of mm_count
Link: http://lkml.kernel.org/r/1508132478-7738-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:05 -08:00
Mark Rutland
6aa7de0591 locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.

For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.

However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:

----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()

// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch

virtual patch

@ depends on patch @
expression E1, E2;
@@

- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)

@ depends on patch @
expression E;
@@

- ACCESS_ONCE(E)
+ READ_ONCE(E)
----
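
To show what the two rules in the script do to ordinary code, here is a
small userspace sketch.  The macro definitions are simplified stand-ins
and not the kernel's real ones (they assume the GNU __typeof__
extension), and shared_flag, publish() and observe() are hypothetical
names.

```c
/*
 * Simplified userspace stand-ins; the kernel's real macros are more
 * involved.  Requires the GNU __typeof__ extension (gcc, clang).
 */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static int shared_flag;

/* Before the conversion: ACCESS_ONCE(shared_flag) = v; */
static void publish(int v)
{
	WRITE_ONCE(shared_flag, v);
}

/* Before the conversion: return ACCESS_ONCE(shared_flag); */
static int observe(void)
{
	return READ_ONCE(shared_flag);
}
```

The first rule matches assignments through ACCESS_ONCE() and must run
before the second, otherwise the left-hand side would be turned into a
READ_ONCE() that cannot be assigned to in the general case.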

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25 11:01:08 +02:00
Andrea Arcangeli
384632e67e userfaultfd: non-cooperative: fix fork use after free
When reading the event from the uffd, we put it on a temporary
fork_event list to detect if we can still access it after releasing and
retaking the event_wqh.lock.

If fork aborts and removes the event from the fork_event all is fine as
long as we're still in the userfault read context and fork_event head is
still alive.

We have to put the event, which is allocated on the fork kernel stack,
back from the fork_event list-head to the event_wqh head before
returning from userfaultfd_ctx_read, because the fork_event head
lifetime is limited to the userfaultfd_ctx_read stack lifetime.

Forgetting to move the event back to its event_wqh place then results in
__remove_wait_queue(&ctx->event_wqh, &ewq->wq); in
userfaultfd_event_wait_completion to remove it from a head that has been
already freed from the reader stack.

This could only happen if resolve_userfault_fork failed (for example if
there are no file descriptors available to allocate the fork uffd).  If
it succeeded it was put back correctly.

Furthermore, after find_userfault_evt receives a fork event, the forked
userfault context in fork_nctx and uwq->msg.arg.reserved.reserved1 can
be released by the fork thread as soon as the event_wqh.lock is
released.  Taking a reference on the fork_nctx before dropping the lock
prevents a use after free in resolve_userfault_fork().

If the fork side aborted and it already released everything, we still
try to succeed resolve_userfault_fork(), if possible.

Fixes: 893e26e61d ("userfaultfd: non-cooperative: Add fork() event")
Link: http://lkml.kernel.org/r/20170920180413.26713-1-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03 17:54:25 -07:00
Andrea Arcangeli
656710a60e userfaultfd: non-cooperative: closing the uffd without triggering SIGBUS
This is an enhancement to avoid a non cooperative userfaultfd manager
having to unregister all regions before it can close the uffd after all
userfaultfd activity completed.

The UFFDIO_UNREGISTER would serialize against the handle_userfault by
taking the mmap_sem for writing, but we can simply repeat the page fault
if we detect the uffd was closed and so the regular page fault paths
should take over.

Link: http://lkml.kernel.org/r/20170823181227.19926-1-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:47 -07:00
Andrea Arcangeli
a36985d31a userfaultfd: provide pid in userfault msg - add feat union
No ABI change, but this will make it more explicit to software that ptid
is only available if requested by passing UFFD_FEATURE_THREAD_ID to
UFFDIO_API.  The fact it's a union will also self document it shouldn't
be taken for granted there's a ptid there.
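
A reader of the uffd could guard its access to the union roughly as in
the sketch below.  It assumes a <linux/userfaultfd.h> recent enough to
have the feat union; fault_ptid() and its "requested" parameter (the
feature bits the caller passed to UFFDIO_API) are hypothetical.

```c
#include <stdint.h>
#include <linux/userfaultfd.h>

/*
 * Return the faulting thread id carried in a page-fault message, or -1
 * if the message is not a page fault or UFFD_FEATURE_THREAD_ID was not
 * among the features the caller requested through UFFDIO_API.
 */
static int64_t fault_ptid(const struct uffd_msg *msg, uint64_t requested)
{
	if (msg->event != UFFD_EVENT_PAGEFAULT)
		return -1;
	if (!(requested & UFFD_FEATURE_THREAD_ID))
		return -1;
	return msg->arg.pagefault.feat.ptid;
}
```

Returning -1 instead of reading the field unconditionally keeps callers
from treating leftover bytes as a thread id.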

Link: http://lkml.kernel.org/r/20170802165145.22628-7-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Alexey Perevalov <a.perevalov@samsung.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:29 -07:00
Alexey Perevalov
9d4ac93482 userfaultfd: provide pid in userfault msg
It could be useful for calculating downtime during postcopy live
migration per vCPU.  Side observer or application itself will be
informed about proper task's sleep during userfaultfd processing.

The process's thread id is provided when the user requests it by setting
the UFFD_FEATURE_THREAD_ID bit in uffdio_api.features.

Link: http://lkml.kernel.org/r/20170802165145.22628-6-aarcange@redhat.com
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:29 -07:00
Prakash Sangappa
2d6d6f5a09 mm: userfaultfd: add feature to request for a signal delivery
In some cases, userfaultfd mechanism should just deliver a SIGBUS signal
to the faulting process, instead of the page-fault event.  Dealing with
page-fault event using a monitor thread can be an overhead in these
cases.  For example, applications like the database could use the
signaling mechanism for robustness purposes.

The database uses hugetlbfs for performance reasons.  Files on hugetlbfs
filesystem are created and huge pages allocated using fallocate() API.
Pages are deallocated/freed using fallocate() hole punching support.
These files are mmapped and accessed by many processes as shared memory.
The database keeps track of which offsets in the hugetlbfs file have
pages allocated.

Any access to a mapped address over holes in the file, which can occur
due to bugs in the application, is considered invalid, and the process
is expected to simply receive a SIGBUS.  However, currently when a hole in the file
is accessed via the mapped address, kernel/mm attempts to automatically
allocate a page at page fault time, resulting in implicitly filling the
hole in the file.  This may not be the desired behavior for applications
like the database that want to explicitly manage page allocations of
hugetlbfs files.

Using userfaultfd mechanism with this support to get a signal, database
application can prevent pages from being allocated implicitly when
processes access mapped address over holes in the file.

This patch adds the UFFD_FEATURE_SIGBUS feature to the userfaultfd
mechanism to request a SIGBUS signal.

See following for previous discussion about the database requirement
leading to this proposal as suggested by Andrea.

http://www.spinics.net/lists/linux-mm/msg129224.html

Link: http://lkml.kernel.org/r/1501552446-748335-2-git-send-email-prakash.sangappa@oracle.com
Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:29 -07:00
Mike Rapoport
ce53e8e6f2 userfaultfd: report UFFDIO_ZEROPAGE as available for shmem VMAs
Now that shmem VMAs can be filled with zero page via userfaultfd we can
report that UFFDIO_ZEROPAGE is available for those VMAs.

Link: http://lkml.kernel.org/r/1497939652-16528-7-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:28 -07:00
Ingo Molnar
040cca3ab2 Merge branch 'linus' into locking/core, to resolve conflicts
Conflicts:
	include/linux/mm_types.h
	mm/huge_memory.c

I removed the smp_mb__before_spinlock() like the following commit does:

  8b1b436dd1 ("mm, locking: Rework {set,clear,mm}_tlb_flush_pending()")

and fixed up the affected commits.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-11 13:51:59 +02:00
Mike Rapoport
e86b298beb userfaultfd: replace ENOSPC with ESRCH in case mm has gone during copy/zeropage
When the process exit races with outstanding mcopy_atomic, it would be
better to return ESRCH error.  When such race occurs the process and
its mm are going away and returning "no such process" to the uffd
monitor seems a better fit than ENOSPC.

Link: http://lkml.kernel.org/r/1502111545-32305-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-10 15:54:07 -07:00
Peter Zijlstra
a9668cd6ee locking: Remove smp_mb__before_spinlock()
Now that there are no users of smp_mb__before_spinlock() left, remove
it entirely.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:29:03 +02:00
Mike Rapoport
5a18b64e3f userfaultfd: non-cooperative: flush event_wqh at release time
There may still be threads waiting on event_wqh at the time the
userfault file descriptor is closed.  Flush the events wait-queue to
prevent waiting threads from hanging.

Link: http://lkml.kernel.org/r/1501398127-30419-1-git-send-email-rppt@linux.vnet.ibm.com
Fixes: 9cd75c3cd4 ("userfaultfd: non-cooperative: add ability to report non-PF events from uffd descriptor")
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-02 17:16:13 -07:00
Mike Rapoport
9d95aa4bad userfaultfd_zeropage: return -ENOSPC in case mm has gone
In the non-cooperative userfaultfd case, the process exit may race with
outstanding mcopy_atomic called by the uffd monitor.  Returning -ENOSPC
instead of -EINVAL when mm is already gone will allow uffd monitor to
distinguish this case from other error conditions.

Unfortunately I overlooked userfaultfd_zeropage when updating
userfaultfd_copy().

Link: http://lkml.kernel.org/r/1501136819-21857-1-git-send-email-rppt@linux.vnet.ibm.com
Fixes: 96333187ab ("userfaultfd_copy: return -ENOSPC in case mm has gone")
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-02 17:16:12 -07:00
Punit Agrawal
7868a2087e mm/hugetlb: add size parameter to huge_pte_offset()
A poisoned or migrated hugepage is stored as a swap entry in the page
tables.  On architectures that support hugepages consisting of
contiguous page table entries (such as on arm64) this leads to ambiguity
in determining the page table entry to return in huge_pte_offset() when
a poisoned entry is encountered.

Let's remove the ambiguity by adding a size parameter to convey
additional information about the requested address.  Also fixup the
definition/usage of huge_pte_offset() throughout the tree.

Link: http://lkml.kernel.org/r/20170522133604.11392-4-punit.agrawal@arm.com
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: James Hogan <james.hogan@imgtec.com> (odd fixer:METAG ARCHITECTURE)
Cc: Ralf Baechle <ralf@linux-mips.org> (supporter:MIPS)
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-06 16:24:34 -07:00
Mike Rapoport
f93ae36462 fs/userfaultfd.c: drop dead code
The calculations of start and end in the __wake_userfault function are
not used and can be removed.

Link: http://lkml.kernel.org/r/1494930917-3134-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-06 16:24:33 -07:00
Ingo Molnar
2055da9738 sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming
So I've noticed a number of instances where it was not obvious from the
code whether ->task_list was for a wait-queue head or a wait-queue entry.

Furthermore, there's a number of wait-queue users where the lists are
not for 'tasks' but other entities (poll tables, etc.), in which case
the 'task_list' name is actively confusing.

To clear this all up, name the wait-queue head and entry list structure
fields unambiguously:

	struct wait_queue_head::task_list	=> ::head
	struct wait_queue_entry::task_list	=> ::entry

For example, this code:

	rqw->wait.task_list.next != &wait->task_list

... it was pretty unclear (to me) what it's doing, while now it's written this way:

	rqw->wait.head.next != &wait->entry

... which makes it pretty clear that we are iterating a list until we see the head.

Other examples are:

	list_for_each_entry_safe(pos, next, &x->task_list, task_list) {
	list_for_each_entry(wq, &fence->wait.task_list, task_list) {

... where it's unclear (to me) what we are iterating, and during review it's
hard to tell whether it's trying to walk a wait-queue entry (which would be
a bug), while now it's written as:

	list_for_each_entry_safe(pos, next, &x->head, entry) {
	list_for_each_entry(wq, &fence->wait.head, entry) {
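The head/entry distinction can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: struct list_head is re-declared locally, and the sketch_* helpers are hypothetical names invented for this illustration. It shows a walk that starts at ->head and stops once the iterator is back at the head, touching only ->entry links in between.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* The queue itself: its list member is the anchor of the list. */
struct wait_queue_head {
	struct list_head head;		/* was ::task_list */
};

/* One waiter: its list member links it into some queue. */
struct wait_queue_entry {
	void *private;
	struct list_head entry;		/* was ::task_list */
};

/* Hypothetical helper: make an empty queue whose head points at itself. */
static void sketch_init_head(struct wait_queue_head *wq_head)
{
	assert(wq_head);
	wq_head->head.next = &wq_head->head;
	wq_head->head.prev = &wq_head->head;
}

/* Hypothetical helper: link an entry in just before the head (the tail). */
static void sketch_add_tail(struct wait_queue_head *wq_head,
			    struct wait_queue_entry *wq_entry)
{
	struct list_head *last = wq_head->head.prev;

	assert(wq_entry);
	wq_entry->entry.next = &wq_head->head;
	wq_entry->entry.prev = last;
	last->next = &wq_entry->entry;
	wq_head->head.prev = &wq_entry->entry;
}

/* Walk entries until the iterator is back at the head, counting waiters. */
static size_t sketch_count_waiters(const struct wait_queue_head *wq_head)
{
	const struct list_head *pos;
	size_t n = 0;

	for (pos = wq_head->head.next; pos != &wq_head->head; pos = pos->next)
		n++;
	return n;
}
```

With the new names, the loop condition reads as "stop at the head", which is the same property the renamed list_for_each_entry() calls above make visible.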

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20 12:19:14 +02:00
Ingo Molnar
ac6424b981 sched/wait: Rename wait_queue_t => wait_queue_entry_t
Rename:

	wait_queue_t		=>	wait_queue_entry_t

'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
which had to carry the name.

Start sorting this out by renaming it to 'wait_queue_entry_t'.

This also allows the real structure name 'struct __wait_queue' to
lose its double underscore and become 'struct wait_queue_entry',
which is the more canonical nomenclature for such data types.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20 12:18:27 +02:00
Andrea Arcangeli
64c2b20301 userfaultfd: shmem: handle coredumping in handle_userfault()
Anon and hugetlbfs handle FOLL_DUMP set by get_dump_page() internally to
__get_user_pages().

shmem, by contrast, has no special FOLL_DUMP handling there, so
handle_mm_fault() is invoked without mmap_sem and ends up calling
handle_userfault() that isn't expecting to be invoked without mmap_sem
held.

This makes handle_userfault() fail immediately if invoked through
shmem_vm_ops->fault during coredumping and solves the problem.

The side effect is a BUG_ON with no lock held triggered by the
coredumping process which exits.  Only 4.11 is affected, pre-4.11 anon
memory holes are skipped in __get_user_pages by checking FOLL_DUMP
explicitly against empty pagetables (mm/gup.c:no_page_table()).

It's zero cost as we already had a check for current->flags to prevent
futex from triggering userfaults during exit (PF_EXITING).

Link: http://lkml.kernel.org/r/20170615214838.27429-1-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: <stable@vger.kernel.org>	[4.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-17 06:37:05 +09:00
Mike Rapoport
045098e944 userfaultfd: report actual registered features in fdinfo
fdinfo for a userfault file descriptor reports UFFD_API_FEATURES.  Until
recently, UFFD_API_FEATURES was defined as 0, so the corresponding field
in fdinfo always contained zero.  Now, with the introduction of several
additional features, UFFD_API_FEATURES is no longer 0, and it seems
better to report the actual features requested for the userfaultfd
object described by the fdinfo.

First, applications that were already using userfault will still see zero
in the features field in fdinfo.  Next, reporting actual features rather
than available features gives a clear indication of which userfault
features an application uses.
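The difference between "available" and "requested" features can be sketched in userspace as follows. Everything here is an assumption made for illustration: the SKETCH_* feature bits, the sketch_uffd_ctx structure, and the output format are hypothetical and do not reproduce the kernel's actual fdinfo layout. The point is only that the reported value comes from the per-object context rather than from a global mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical feature bits; the real UFFD_FEATURE_* values are not used. */
#define SKETCH_FEATURE_EVENT_A	(1u << 0)
#define SKETCH_FEATURE_EVENT_B	(1u << 1)
#define SKETCH_FEATURE_EVENT_C	(1u << 2)

/* Everything the implementation offers (plays the role of UFFD_API_FEATURES). */
#define SKETCH_API_FEATURES \
	(SKETCH_FEATURE_EVENT_A | SKETCH_FEATURE_EVENT_B | SKETCH_FEATURE_EVENT_C)

/* Per-object state: only the features this particular user asked for. */
struct sketch_uffd_ctx {
	uint64_t features;
};

/*
 * Format the per-object features, not SKETCH_API_FEATURES, so that an
 * object that requested nothing still reports zero.
 * Returns the number of characters snprintf() would have written.
 */
static int sketch_show_features(const struct sketch_uffd_ctx *ctx,
				char *buf, size_t len)
{
	assert(ctx && buf && len > 0);
	return snprintf(buf, len, "features:\t%llx\n",
			(unsigned long long)ctx->features);
}
```

An object created without requesting any feature prints 0 here even though SKETCH_API_FEATURES is nonzero, which matches the compatibility argument made above.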

Link: http://lkml.kernel.org/r/1491140181-22121-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08 00:47:48 -07:00
Linus Torvalds
baeedc7158 Merge branch 'prep-for-5level'
Merge 5-level page table prep from Kirill Shutemov:
 "Here's the relatively low-risk part of the 5-level paging patchset. Merging
  it now will make x86 5-level paging enabling in v4.12 easier.

  The first patch is actually x86-specific: detect 5-level paging
  support. It boils down to a single define.

  The rest of patchset converts Linux MMU abstraction from 4- to 5-level
  paging.

  Enabling the new abstraction in most cases requires adding a single line
  of code in arch-specific code. The rest is taken care of by asm-generic/.

  Changes to mm/ code are mostly mechanical: add support for new page
  table level -- p4d_t -- where we deal with pud_t now.

  v2:
   - fix build on microblaze (Michal);
   - comment for __ARCH_HAS_5LEVEL_HACK in kasan_populate_zero_shadow();
   - acks from Michal"
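To make the extra p4d level concrete, here is a small userspace sketch of how a virtual address splits into per-level table indices. It assumes the x86-64 layout with 4 KiB pages and 9 index bits per level (57-bit addresses when five levels are in use); the SKETCH_* constants and the sketch_split_address() helper are hypothetical names for this illustration and are not kernel macros. Only lower-half (user-space) addresses are handled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 layout: 4 KiB pages, 9 index bits per table level. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_LEVEL_BITS	9
#define SKETCH_PMD_SHIFT	(SKETCH_PAGE_SHIFT + SKETCH_LEVEL_BITS)	/* 21 */
#define SKETCH_PUD_SHIFT	(SKETCH_PMD_SHIFT + SKETCH_LEVEL_BITS)	/* 30 */
#define SKETCH_P4D_SHIFT	(SKETCH_PUD_SHIFT + SKETCH_LEVEL_BITS)	/* 39 */
#define SKETCH_PGD_SHIFT	(SKETCH_P4D_SHIFT + SKETCH_LEVEL_BITS)	/* 48 */
#define SKETCH_INDEX_MASK	((1u << SKETCH_LEVEL_BITS) - 1)

/* One index per level, top (pgd) to bottom (pte). */
struct sketch_pt_indices {
	unsigned int pgd, p4d, pud, pmd, pte;
};

/* Extract the 9-bit table index that starts at the given bit position. */
static unsigned int sketch_index(uint64_t addr, unsigned int shift)
{
	return (unsigned int)((addr >> shift) & SKETCH_INDEX_MASK);
}

/* Split a lower-half 57-bit virtual address into its five table indices. */
static struct sketch_pt_indices sketch_split_address(uint64_t addr)
{
	struct sketch_pt_indices ix;

	/* This sketch only handles addresses that fit in 57 bits. */
	assert((addr >> (SKETCH_PGD_SHIFT + SKETCH_LEVEL_BITS)) == 0);

	ix.pgd = sketch_index(addr, SKETCH_PGD_SHIFT);
	ix.p4d = sketch_index(addr, SKETCH_P4D_SHIFT);	/* the new level */
	ix.pud = sketch_index(addr, SKETCH_PUD_SHIFT);
	ix.pmd = sketch_index(addr, SKETCH_PMD_SHIFT);
	ix.pte = sketch_index(addr, SKETCH_PAGE_SHIFT);
	return ix;
}
```

In a 4-level configuration under the same assumptions, the p4d step is folded away and the top-level index starts at bit 39 instead, which is why most mm/ changes amount to adding a p4d step wherever code previously went straight from the top level to pud_t.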

* emailed patches from Kirill A Shutemov <kirill.shutemov@linux.intel.com>:
  mm: introduce __p4d_alloc()
  mm: convert generic code to 5-level paging
  asm-generic: introduce <asm-generic/pgtable-nop4d.h>
  arch, mm: convert all architectures to use 5level-fixup.h
  asm-generic: introduce __ARCH_USE_5LEVEL_HACK
  asm-generic: introduce 5level-fixup.h
  x86/cpufeature: Add 5-level paging detection
2017-03-10 08:59:07 -08:00
David Hildenbrand
2378cd6181 userfaultfd: remove wrong comment from userfaultfd_ctx_get()
It's a void function, so there is no return value.

Link: http://lkml.kernel.org/r/20170309150817.7510-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-09 17:01:10 -08:00