Commit graph

13242 commits

Author SHA1 Message Date
Greg Kroah-Hartman
9f80205d66 This is the 4.19.151 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl+Gt3oACgkQONu9yGCS
 aT4MAQ/+NmMSJHbPu5DRX4qfg+toj9SRdDJfv0H6+dmJ/Wabeogq3GF6VB5ku8fJ
 N5uJLVb+iz/n4cbvTvE1sI+5WEPzK/nRVRnwUUH3SDycB+tCk4loKbGwV+gZTvYi
 PsHqHQ9UUKpdBs4gJm9I/FCOlA4mtcdK2W0JDMfdCmZc3ACyCWo+y83ESAfjOcO9
 vda/1xaeqNPaPFxteVKpmXujlHQw3meH8ZzpkQi7t1HeZ6bnpoObMIy8c1gtYjSt
 jeLvw9pnvgQEpAidIZb3MwVyxHE/mFGRcoQmI0WYoWWXOnRAHUfFQTsK7ZWbTYAF
 MUvJlGg3zchhBetfo2523rdGM9/AHMnwmGAuTs5PkZdPDRwLaWjeFs2u8njmjEk9
 R8SrtF+2zxuO92SPp37jXqRUb7G69vPnlZ6IIcMpr61qeUAf78dWN67VW0kVMs/K
 hMHsl0E1Ax3K9GV9ZSQc7oe7/fMj+xXabpeuDZ/5lxvbBKrIP/UZLMJeZg28y+91
 0JbFhMfEN71sTfCHXpxe9bAvmI1XKlYh37RLAOJtNQRqwKsP2Tuim5C+tJsREmF9
 eAPZq0QrjjTb1t8wii7HKXyjnrSfTpOS7HUPsGbeiU4JSVmWGeLy6IxmjZh1fjY+
 WLr+Hn0Q62w5OGXBxS7rEfq3FMnAEkT/27a5IwrDPoGje+rO6WY=
 =TSOP
 -----END PGP SIGNATURE-----

Merge 4.19.151 into android-4.19-stable

Changes in 4.19.151
	fbdev, newport_con: Move FONT_EXTRA_WORDS macros into linux/font.h
	Fonts: Support FONT_EXTRA_WORDS macros for built-in fonts
	fbcon: Fix global-out-of-bounds read in fbcon_get_font()
	Revert "ravb: Fixed to be able to unload modules"
	net: wireless: nl80211: fix out-of-bounds access in nl80211_del_key()
	drm/nouveau/mem: guard against NULL pointer access in mem_del
	usermodehelper: reset umask to default before executing user process
	platform/x86: intel-vbtn: Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360
	platform/x86: thinkpad_acpi: initialize tp_nvram_state variable
	platform/x86: intel-vbtn: Switch to an allow-list for SW_TABLET_MODE reporting
	platform/x86: thinkpad_acpi: re-initialize ACPI buffer size when reuse
	driver core: Fix probe_count imbalance in really_probe()
	perf top: Fix stdio interface input handling with glibc 2.28+
	i2c: i801: Exclude device from suspend direct complete optimization
	mtd: rawnand: sunxi: Fix the probe error path
	arm64: dts: stratix10: add status to qspi dts node
	nvme-core: put ctrl ref when module ref get fail
	macsec: avoid use-after-free in macsec_handle_frame()
	mm/khugepaged: fix filemap page_to_pgoff(page) != offset
	xfrmi: drop ignore_df check before updating pmtu
	cifs: Fix incomplete memory allocation on setxattr path
	i2c: meson: fix clock setting overwrite
	i2c: meson: fixup rate calculation with filter delay
	i2c: owl: Clear NACK and BUS error bits
	sctp: fix sctp_auth_init_hmacs() error path
	team: set dev->needed_headroom in team_setup_by_port()
	net: team: fix memory leak in __team_options_register
	openvswitch: handle DNAT tuple collision
	drm/amdgpu: prevent double kfree ttm->sg
	xfrm: clone XFRMA_SET_MARK in xfrm_do_migrate
	xfrm: clone XFRMA_REPLAY_ESN_VAL in xfrm_do_migrate
	xfrm: clone XFRMA_SEC_CTX in xfrm_do_migrate
	xfrm: clone whole liftime_cur structure in xfrm_do_migrate
	net: stmmac: removed enabling eee in EEE set callback
	platform/x86: fix kconfig dependency warning for FUJITSU_LAPTOP
	xfrm: Use correct address family in xfrm_state_find
	bonding: set dev->needed_headroom in bond_setup_by_slave()
	mdio: fix mdio-thunder.c dependency & build error
	net: usb: ax88179_178a: fix missing stop entry in driver_info
	net/mlx5e: Fix VLAN cleanup flow
	net/mlx5e: Fix VLAN create flow
	rxrpc: Fix rxkad token xdr encoding
	rxrpc: Downgrade the BUG() for unsupported token type in rxrpc_read()
	rxrpc: Fix some missing _bh annotations on locking conn->state_lock
	rxrpc: Fix server keyring leak
	perf: Fix task_function_call() error handling
	mmc: core: don't set limits.discard_granularity as 0
	mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged
	net: usb: rtl8150: set random MAC address when set_ethernet_addr() fails
	Linux 4.19.151

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9ee2b0fc4fc39f27be6ae680529e1046f249a3e6
2020-10-14 12:11:08 +02:00
Vijay Balakrishna
94c5167581 mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged
commit 4aab2be0983031a05cb4a19696c9da5749523426 upstream.

When memory is hotplug added or removed the min_free_kbytes should be
recalculated based on what is expected by khugepaged.  Currently after
hotplug, min_free_kbytes will be set to a lower default and higher
default set when THP enabled is lost.

This change restores min_free_kbytes as expected for THP consumers.

[vijayb@linux.microsoft.com: v5]
  Link: https://lkml.kernel.org/r/1601398153-5517-1-git-send-email-vijayb@linux.microsoft.com

Fixes: f000565adb ("thp: set recommended min free kbytes")
Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Allen Pais <apais@microsoft.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/1600305709-2319-2-git-send-email-vijayb@linux.microsoft.com
Link: https://lkml.kernel.org/r/1600204258-13683-1-git-send-email-vijayb@linux.microsoft.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-14 10:31:26 +02:00
Hugh Dickins
fbe96d5aab mm/khugepaged: fix filemap page_to_pgoff(page) != offset
commit 033b5d77551167f8c24ca862ce83d3e0745f9245 upstream.

There have been elusive reports of filemap_fault() hitting its
VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page) on kernels built
with CONFIG_READ_ONLY_THP_FOR_FS=y.

Suren has hit it on a kernel with CONFIG_READ_ONLY_THP_FOR_FS=y and
CONFIG_NUMA is not set: and he has analyzed it down to how khugepaged
without NUMA reuses the same huge page after collapse_file() failed
(whereas NUMA targets its allocation to the respective node each time).
And most of us were usually testing with CONFIG_NUMA=y kernels.

collapse_file(old start)
  new_page = khugepaged_alloc_page(hpage)
  __SetPageLocked(new_page)
  new_page->index = start // hpage->index=old offset
  new_page->mapping = mapping
  xas_store(&xas, new_page)

                          filemap_fault
                            page = find_get_page(mapping, offset)
                            // if offset falls inside hpage then
                            // compound_head(page) == hpage
                            lock_page_maybe_drop_mmap()
                              __lock_page(page)

  // collapse fails
  xas_store(&xas, old page)
  new_page->mapping = NULL
  unlock_page(new_page)

collapse_file(new start)
  new_page = khugepaged_alloc_page(hpage)
  __SetPageLocked(new_page)
  new_page->index = start // hpage->index=new offset
  new_page->mapping = mapping // mapping becomes valid again

                            // since compound_head(page) == hpage
                            // page_to_pgoff(page) got changed
                            VM_BUG_ON_PAGE(page_to_pgoff(page) != offset)

An initial patch replaced __SetPageLocked() by lock_page(), which did
fix the race which Suren illustrates above.  But testing showed that it's
not good enough: if the racing task's __lock_page() gets delayed long
after its find_get_page(), then it may follow collapse_file(new start)'s
successful final unlock_page(), and crash on the same VM_BUG_ON_PAGE.

It could be fixed by relaxing filemap_fault()'s VM_BUG_ON_PAGE to a
check and retry (as is done for mapping), with similar relaxations in
find_lock_entry() and pagecache_get_page(): but it's not obvious what
else might get caught out; and khugepaged non-NUMA appears to be unique
in exposing a page to page cache, then revoking, without going through
a full cycle of freeing before reuse.

Instead, non-NUMA khugepaged_prealloc_page() release the old page
if anyone else has a reference to it (1% of cases when I tested).

Although never reported on huge tmpfs, I believe its find_lock_entry()
has been at similar risk; but huge tmpfs does not rely on khugepaged
for its normal working nearly so much as READ_ONLY_THP_FOR_FS does.

Reported-by: Denis Lisov <dennis.lissov@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206569
Link: https://lore.kernel.org/linux-mm/?q=20200219144635.3b7417145de19b65f258c943%40linux-foundation.org
Reported-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/linux-xfs/?q=20200616013309.GB815%40lca.pw
Reported-and-analyzed-by: Suren Baghdasaryan <surenb@google.com>
Fixes: 87c460a0bded ("mm/khugepaged: collapse_shmem() without freezing new_page")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # v4.9+
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-14 10:31:23 +02:00
Greg Kroah-Hartman
2dce03a5c2 This is the 4.19.150 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl99WYoACgkQONu9yGCS
 aT75hA/+Lu0udzlCD/Uw1SDFKPo3Xed6hFiFUOHHtQxVel9r3DNOX1+kfNAH5ifo
 G2NW6O25Bl4qxmG02QRW85r7JRHhoMQOx1DldLGlJCfcgmVcwEaiRg6HgMh+9OiC
 qPAuE6zbB5h97dPQppe5u7e6pzjrsTgR6pYlnuwPVF6TSmTYXM3OGubXItOyneRZ
 ePnzH9w4bk/n4UAARYOowfFhRnO/Qml+QPxc8rFbK2inGXCJ31QLITJCa3Y3KXP3
 AC2aM2M8B04GJBFhXH8pLFrzvB/+S1DwzmtT3d6TWdQRqdSr+GAPJ/3jX2eKMuwK
 6vfF/caGvfYpomEFCHFKyLxmFwhbSEfVD0ht4jr3aiF/E0ii8sXKN8mnnowpNFpG
 23kG+baxxe1ZbjXw4VjtGGXruJQ6im5o7siRnmfKYv18Fo5O3yEga9pCjOdkHT20
 gjes0GfTtgr3nrlW19B03ZYL7p3ri46NlY7Zlawvtz3dlWY9rTkII3nxD5k5Ywxs
 KgDVpREwr7LuKOgGTxkwLGHNEF7b1mJrjdrlX/X2SDJD/IQc3xdD5382lXSQRaem
 QaZhu6S6SNCjI9fGQ7jOYMR3ouGegFsPOnr6BQxvKmoolsUzXTeTxR0u2R5dqYGI
 1RwmuTIQTvURoFy1XyhNyLJAbCvk+9BeNp8I5/YTNbvkFKlnawM=
 =qKZh
 -----END PGP SIGNATURE-----

Merge 4.19.150 into android-4.19-stable

Changes in 4.19.150
	mmc: sdhci: Workaround broken command queuing on Intel GLK based IRBIS models
	USB: gadget: f_ncm: Fix NDP16 datagram validation
	gpio: mockup: fix resource leak in error path
	gpio: tc35894: fix up tc35894 interrupt configuration
	clk: socfpga: stratix10: fix the divider for the emac_ptp_free_clk
	vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock
	vsock/virtio: stop workers during the .remove()
	vsock/virtio: add transport parameter to the virtio_transport_reset_no_sock()
	net: virtio_vsock: Enhance connection semantics
	Input: i8042 - add nopnp quirk for Acer Aspire 5 A515
	ftrace: Move RCU is watching check after recursion check
	drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
	drivers/net/wan/hdlc_fr: Add needed_headroom for PVC devices
	drm/sun4i: mixer: Extend regmap max_register
	net: dec: de2104x: Increase receive ring size for Tulip
	rndis_host: increase sleep time in the query-response loop
	nvme-core: get/put ctrl and transport module in nvme_dev_open/release()
	drivers/net/wan/lapbether: Make skb->protocol consistent with the header
	drivers/net/wan/hdlc: Set skb->protocol before transmitting
	mac80211: do not allow bigger VHT MPDUs than the hardware supports
	spi: fsl-espi: Only process interrupts for expected events
	nvme-fc: fail new connections to a deleted host or remote port
	gpio: sprd: Clear interrupt when setting the type as edge
	pinctrl: mvebu: Fix i2c sda definition for 98DX3236
	nfs: Fix security label length not being reset
	clk: samsung: exynos4: mark 'chipid' clock as CLK_IGNORE_UNUSED
	iommu/exynos: add missing put_device() call in exynos_iommu_of_xlate()
	i2c: cpm: Fix i2c_ram structure
	Input: trackpoint - enable Synaptics trackpoints
	random32: Restore __latent_entropy attribute on net_rand_state
	mm: replace memmap_context by meminit_context
	mm: don't rely on system state to detect hot-plug operations
	net/packet: fix overflow in tpacket_rcv
	epoll: do not insert into poll queues until all sanity checks are done
	epoll: replace ->visited/visited_list with generation count
	epoll: EPOLL_CTL_ADD: close the race in decision to take fast path
	ep_create_wakeup_source(): dentry name can change under you...
	netfilter: ctnetlink: add a range check for l3/l4 protonum
	Linux 4.19.150

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ib6f1b6fce01bec80efd4a905d03903ff20ca89be
2020-10-07 08:45:35 +02:00
Laurent Dufour
b6f69f72c1 mm: don't rely on system state to detect hot-plug operations
commit f85086f95fa36194eb0db5cd5c12e56801b98523 upstream.

In register_mem_sect_under_node() the system_state's value is checked to
detect whether the call is made during boot time or during an hot-plug
operation.  Unfortunately, that check against SYSTEM_BOOTING is wrong
because regular memory is registered at SYSTEM_SCHEDULING state.  In
addition, memory hot-plug operation can be triggered at this system
state by the ACPI [1].  So checking against the system state is not
enough.

The consequence is that on system with interleaved node's ranges like this:

 Early memory node ranges
   node   1: [mem 0x0000000000000000-0x000000011fffffff]
   node   2: [mem 0x0000000120000000-0x000000014fffffff]
   node   1: [mem 0x0000000150000000-0x00000001ffffffff]
   node   0: [mem 0x0000000200000000-0x000000048fffffff]
   node   2: [mem 0x0000000490000000-0x00000007ffffffff]

This can be seen on PowerPC LPAR after multiple memory hot-plug and
hot-unplug operations are done.  At the next reboot the node's memory
ranges can be interleaved and since the call to link_mem_sections() is
made in topology_init() while the system is in the SYSTEM_SCHEDULING
state, the node's id is not checked, and the sections registered to
multiple nodes:

  $ ls -l /sys/devices/system/memory/memory21/node*
  total 0
  lrwxrwxrwx 1 root root     0 Aug 24 05:27 node1 -> ../../node/node1
  lrwxrwxrwx 1 root root     0 Aug 24 05:27 node2 -> ../../node/node2

In that case, the system is able to boot but if later one of theses
memory blocks is hot-unplugged and then hot-plugged, the sysfs
inconsistency is detected and this is triggering a BUG_ON():

  kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
  CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
  Call Trace:
    add_memory_resource+0x23c/0x340 (unreliable)
    __add_memory+0x5c/0xf0
    dlpar_add_lmb+0x1b4/0x500
    dlpar_memory+0x1f8/0xb80
    handle_dlpar_errorlog+0xc0/0x190
    dlpar_store+0x198/0x4a0
    kobj_attr_store+0x30/0x50
    sysfs_kf_write+0x64/0x90
    kernfs_fop_write+0x1b0/0x290
    vfs_write+0xe8/0x290
    ksys_write+0xdc/0x130
    system_call_exception+0x160/0x270
    system_call_common+0xf0/0x27c

This patch addresses the root cause by not relying on the system_state
value to detect whether the call is due to a hot-plug operation.  An
extra parameter is added to link_mem_sections() detailing whether the
operation is due to a hot-plug operation.

[1] According to Oscar Salvador, using this qemu command line, ACPI
memory hotplug operations are raised at SYSTEM_SCHEDULING state:

  $QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \
        -m size=$MEM,slots=255,maxmem=4294967296k  \
        -numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \
        -object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \
        -object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \
        -object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \
        -object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \
        -object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \
        -object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \
        -object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \

Fixes: 4fbce63391 ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()")
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Scott Cheloha <cheloha@linux.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-07 08:00:08 +02:00
Laurent Dufour
25eaea1b33 mm: replace memmap_context by meminit_context
commit c1d0da83358a2316d9be7f229f26126dbaa07468 upstream.

Patch series "mm: fix memory to node bad links in sysfs", v3.

Sometimes, firmware may expose interleaved memory layout like this:

 Early memory node ranges
   node   1: [mem 0x0000000000000000-0x000000011fffffff]
   node   2: [mem 0x0000000120000000-0x000000014fffffff]
   node   1: [mem 0x0000000150000000-0x00000001ffffffff]
   node   0: [mem 0x0000000200000000-0x000000048fffffff]
   node   2: [mem 0x0000000490000000-0x00000007ffffffff]

In that case, we can see memory blocks assigned to multiple nodes in
sysfs:

  $ ls -l /sys/devices/system/memory/memory21
  total 0
  lrwxrwxrwx 1 root root     0 Aug 24 05:27 node1 -> ../../node/node1
  lrwxrwxrwx 1 root root     0 Aug 24 05:27 node2 -> ../../node/node2
  -rw-r--r-- 1 root root 65536 Aug 24 05:27 online
  -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device
  -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index
  drwxr-xr-x 2 root root     0 Aug 24 05:27 power
  -r--r--r-- 1 root root 65536 Aug 24 05:27 removable
  -rw-r--r-- 1 root root 65536 Aug 24 05:27 state
  lrwxrwxrwx 1 root root     0 Aug 24 05:25 subsystem -> ../../../../bus/memory
  -rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent
  -r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones

The same applies in the node's directory with a memory21 link in both
the node1 and node2's directory.

This is wrong but doesn't prevent the system to run.  However when
later, one of these memory blocks is hot-unplugged and then hot-plugged,
the system is detecting an inconsistency in the sysfs layout and a
BUG_ON() is raised:

  kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
  CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
  Call Trace:
    add_memory_resource+0x23c/0x340 (unreliable)
    __add_memory+0x5c/0xf0
    dlpar_add_lmb+0x1b4/0x500
    dlpar_memory+0x1f8/0xb80
    handle_dlpar_errorlog+0xc0/0x190
    dlpar_store+0x198/0x4a0
    kobj_attr_store+0x30/0x50
    sysfs_kf_write+0x64/0x90
    kernfs_fop_write+0x1b0/0x290
    vfs_write+0xe8/0x290
    ksys_write+0xdc/0x130
    system_call_exception+0x160/0x270
    system_call_common+0xf0/0x27c

This has been seen on PowerPC LPAR.

The root cause of this issue is that when node's memory is registered,
the range used can overlap another node's range, thus the memory block
is registered to multiple nodes in sysfs.

There are two issues here:

 (a) The sysfs memory and node's layouts are broken due to these
     multiple links

 (b) The link errors in link_mem_sections() should not lead to a system
     panic.

To address (a) register_mem_sect_under_node should not rely on the
system state to detect whether the link operation is triggered by a hot
plug operation or not.  This is addressed by the patches 1 and 2 of this
series.

Issue (b) will be addressed separately.

This patch (of 2):

The memmap_context enum is used to detect whether a memory operation is
due to a hot-add operation or happening at boot time.

Make it general to the hotplug operation and rename it as
meminit_context.

There is no functional change introduced by this patch

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Scott Cheloha <cheloha@linux.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com
Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-07 08:00:08 +02:00
Greg Kroah-Hartman
9ce79d9bed This is the 4.19.149 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl91ulMACgkQONu9yGCS
 aT7ezhAArTOQxPGkhktgdGfCMYgjvIHdny8o4pNGumnxW6TG7FCiJHoZuj8OLkdx
 2x5brOOvSGgcGTOwJXyUjL6opQzD5syTCuzbgEpGB2Tyd1x5q8vgqvI2XPxZeYHy
 x+mUDgacT+4m7FNbFDhNMZoTS4KCiJ3IcTevjeQexDtIs6R38HhxNl0Ee67gkqxZ
 p7c6L3kbUuR5T9EWGE1DPPLhOFGeOMk592qzkFsCGERsuswQOpXrxyw6zkik/0UG
 6Losmo2i+OtQFeiDz0WYJZNO9ySI511j+7R2Ewch/nFuTp6yFzy9kJZnP0YWK/KE
 U4BLmopgzCs9q+TQ/QNjxlCltl4eOrrjkFXF3Zz8o5ddbKwrugEsJUdUUDIpva71
 qEUgSw7vguGKoCttBenCDwyYOcjIVJRBFSWTVDzkgw5pXrz3m7qePF1Kj+KzG0pN
 8gTqosXPlYPzH1mh+2vRVntiCpZRMJYo18CX+ifqN20dHH3dsM4vA5NiWwjTJVY8
 JddRXfujxBQ0jxs2jFKvPZNrgqeY3Mh51L0a5G+HbHCIb+4kgD+2jl+C/X38TKch
 osTM1/qQriFVxtlH9TkTa8opYvrYBWO+G+XhNVc2tSpmd8T2EaKokMAVVvGiK3l9
 ZPq06SytJyKDPsSLvk4BKxCUv5CY0VT18k6mCYd1fq4oxTR92A4=
 =5bC5
 -----END PGP SIGNATURE-----

Merge 4.19.149 into android-4.19-stable

Changes in 4.19.149
	selinux: allow labeling before policy is loaded
	media: mc-device.c: fix memleak in media_device_register_entity
	dma-fence: Serialise signal enabling (dma_fence_enable_sw_signaling)
	ath10k: fix array out-of-bounds access
	ath10k: fix memory leak for tpc_stats_final
	mm: fix double page fault on arm64 if PTE_AF is cleared
	scsi: aacraid: fix illegal IO beyond last LBA
	m68k: q40: Fix info-leak in rtc_ioctl
	gma/gma500: fix a memory disclosure bug due to uninitialized bytes
	ASoC: kirkwood: fix IRQ error handling
	media: smiapp: Fix error handling at NVM reading
	arch/x86/lib/usercopy_64.c: fix __copy_user_flushcache() cache writeback
	x86/ioapic: Unbreak check_timer()
	ALSA: usb-audio: Add delay quirk for H570e USB headsets
	ALSA: hda/realtek - Couldn't detect Mic if booting with headset plugged
	ALSA: hda/realtek: Enable front panel headset LED on Lenovo ThinkStation P520
	lib/string.c: implement stpcpy
	leds: mlxreg: Fix possible buffer overflow
	PM / devfreq: tegra30: Fix integer overflow on CPU's freq max out
	scsi: fnic: fix use after free
	scsi: lpfc: Fix kernel crash at lpfc_nvme_info_show during remote port bounce
	net: silence data-races on sk_backlog.tail
	clk/ti/adpll: allocate room for terminating null
	drm/amdgpu/powerplay: fix AVFS handling with custom powerplay table
	mtd: cfi_cmdset_0002: don't free cfi->cfiq in error path of cfi_amdstd_setup()
	mfd: mfd-core: Protect against NULL call-back function pointer
	drm/amdgpu/powerplay/smu7: fix AVFS handling with custom powerplay table
	tpm_crb: fix fTPM on AMD Zen+ CPUs
	tracing: Adding NULL checks for trace_array descriptor pointer
	bcache: fix a lost wake-up problem caused by mca_cannibalize_lock
	dmaengine: mediatek: hsdma_probe: fixed a memory leak when devm_request_irq fails
	RDMA/qedr: Fix potential use after free
	RDMA/i40iw: Fix potential use after free
	fix dget_parent() fastpath race
	xfs: fix attr leaf header freemap.size underflow
	RDMA/iw_cgxb4: Fix an error handling path in 'c4iw_connect()'
	ubi: Fix producing anchor PEBs
	mmc: core: Fix size overflow for mmc partitions
	gfs2: clean up iopen glock mess in gfs2_create_inode
	scsi: pm80xx: Cleanup command when a reset times out
	debugfs: Fix !DEBUG_FS debugfs_create_automount
	CIFS: Properly process SMB3 lease breaks
	ASoC: max98090: remove msleep in PLL unlocked workaround
	kernel/sys.c: avoid copying possible padding bytes in copy_to_user
	KVM: arm/arm64: vgic: Fix potential double free dist->spis in __kvm_vgic_destroy()
	xfs: fix log reservation overflows when allocating large rt extents
	neigh_stat_seq_next() should increase position index
	rt_cpu_seq_next should increase position index
	ipv6_route_seq_next should increase position index
	seqlock: Require WRITE_ONCE surrounding raw_seqcount_barrier
	media: ti-vpe: cal: Restrict DMA to avoid memory corruption
	sctp: move trace_sctp_probe_path into sctp_outq_sack
	ACPI: EC: Reference count query handlers under lock
	scsi: ufs: Make ufshcd_add_command_trace() easier to read
	scsi: ufs: Fix a race condition in the tracing code
	dmaengine: zynqmp_dma: fix burst length configuration
	s390/cpum_sf: Use kzalloc and minor changes
	powerpc/eeh: Only dump stack once if an MMIO loop is detected
	Bluetooth: btrtl: Use kvmalloc for FW allocations
	tracing: Set kernel_stack's caller size properly
	ARM: 8948/1: Prevent OOB access in stacktrace
	ar5523: Add USB ID of SMCWUSBT-G2 wireless adapter
	ceph: ensure we have a new cap before continuing in fill_inode
	selftests/ftrace: fix glob selftest
	tools/power/x86/intel_pstate_tracer: changes for python 3 compatibility
	Bluetooth: Fix refcount use-after-free issue
	mm/swapfile.c: swap_next should increase position index
	mm: pagewalk: fix termination condition in walk_pte_range()
	Bluetooth: prefetch channel before killing sock
	KVM: fix overflow of zero page refcount with ksm running
	ALSA: hda: Clear RIRB status before reading WP
	skbuff: fix a data race in skb_queue_len()
	audit: CONFIG_CHANGE don't log internal bookkeeping as an event
	selinux: sel_avc_get_stat_idx should increase position index
	scsi: lpfc: Fix RQ buffer leakage when no IOCBs available
	scsi: lpfc: Fix coverity errors in fmdi attribute handling
	drm/omap: fix possible object reference leak
	clk: stratix10: use do_div() for 64-bit calculation
	crypto: chelsio - This fixes the kernel panic which occurs during a libkcapi test
	mt76: clear skb pointers from rx aggregation reorder buffer during cleanup
	ALSA: usb-audio: Don't create a mixer element with bogus volume range
	perf test: Fix test trace+probe_vfs_getname.sh on s390
	RDMA/rxe: Fix configuration of atomic queue pair attributes
	KVM: x86: fix incorrect comparison in trace event
	dmaengine: stm32-mdma: use vchan_terminate_vdesc() in .terminate_all
	media: staging/imx: Missing assignment in imx_media_capture_device_register()
	x86/pkeys: Add check for pkey "overflow"
	bpf: Remove recursion prevention from rcu free callback
	dmaengine: stm32-dma: use vchan_terminate_vdesc() in .terminate_all
	dmaengine: tegra-apb: Prevent race conditions on channel's freeing
	drm/amd/display: dal_ddc_i2c_payloads_create can fail causing panic
	firmware: arm_sdei: Use cpus_read_lock() to avoid races with cpuhp
	random: fix data races at timer_rand_state
	bus: hisi_lpc: Fixup IO ports addresses to avoid use-after-free in host removal
	media: go7007: Fix URB type for interrupt handling
	Bluetooth: guard against controllers sending zero'd events
	timekeeping: Prevent 32bit truncation in scale64_check_overflow()
	ext4: fix a data race at inode->i_disksize
	perf jevents: Fix leak of mapfile memory
	mm: avoid data corruption on CoW fault into PFN-mapped VMA
	drm/amdgpu: increase atombios cmd timeout
	drm/amd/display: Stop if retimer is not available
	ath10k: use kzalloc to read for ath10k_sdio_hif_diag_read
	scsi: aacraid: Disabling TM path and only processing IOP reset
	Bluetooth: L2CAP: handle l2cap config request during open state
	media: tda10071: fix unsigned sign extension overflow
	xfs: don't ever return a stale pointer from __xfs_dir3_free_read
	xfs: mark dir corrupt when lookup-by-hash fails
	ext4: mark block bitmap corrupted when found instead of BUGON
	tpm: ibmvtpm: Wait for buffer to be set before proceeding
	rtc: sa1100: fix possible race condition
	rtc: ds1374: fix possible race condition
	nfsd: Don't add locks to closed or closing open stateids
	RDMA/cm: Remove a race freeing timewait_info
	KVM: PPC: Book3S HV: Treat TM-related invalid form instructions on P9 like the valid ones
	drm/msm: fix leaks if initialization fails
	drm/msm/a5xx: Always set an OPP supported hardware value
	tracing: Use address-of operator on section symbols
	thermal: rcar_thermal: Handle probe error gracefully
	perf parse-events: Fix 3 use after frees found with clang ASAN
	serial: 8250_port: Don't service RX FIFO if throttled
	serial: 8250_omap: Fix sleeping function called from invalid context during probe
	serial: 8250: 8250_omap: Terminate DMA before pushing data on RX timeout
	perf cpumap: Fix snprintf overflow check
	cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_work_fn
	tools: gpio-hammer: Avoid potential overflow in main
	nvme-multipath: do not reset on unknown status
	nvme: Fix controller creation races with teardown flow
	RDMA/rxe: Set sys_image_guid to be aligned with HW IB devices
	scsi: hpsa: correct race condition in offload enabled
	SUNRPC: Fix a potential buffer overflow in 'svc_print_xprts()'
	svcrdma: Fix leak of transport addresses
	PCI: Use ioremap(), not phys_to_virt() for platform ROM
	ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len
	ALSA: usb-audio: Fix case when USB MIDI interface has more than one extra endpoint descriptor
	PCI: pciehp: Fix MSI interrupt race
	NFS: Fix races nfs_page_group_destroy() vs nfs_destroy_unlinked_subrequests()
	mm/kmemleak.c: use address-of operator on section symbols
	mm/filemap.c: clear page error before actual read
	mm/vmscan.c: fix data races using kswapd_classzone_idx
	nvmet-rdma: fix double free of rdma queue
	mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area
	scsi: qedi: Fix termination timeouts in session logout
	serial: uartps: Wait for tx_empty in console setup
	KVM: Remove CREATE_IRQCHIP/SET_PIT2 race
	bdev: Reduce time holding bd_mutex in sync in blkdev_close()
	drivers: char: tlclk.c: Avoid data race between init and interrupt handler
	KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi()
	net: openvswitch: use u64 for meter bucket
	scsi: aacraid: Fix error handling paths in aac_probe_one()
	staging:r8188eu: avoid skb_clone for amsdu to msdu conversion
	sparc64: vcc: Fix error return code in vcc_probe()
	arm64: cpufeature: Relax checks for AArch32 support at EL[0-2]
	dt-bindings: sound: wm8994: Correct required supplies based on actual implementaion
	atm: fix a memory leak of vcc->user_back
	perf mem2node: Avoid double free related to realloc
	power: supply: max17040: Correct voltage reading
	phy: samsung: s5pv210-usb2: Add delay after reset
	Bluetooth: Handle Inquiry Cancel error after Inquiry Complete
	USB: EHCI: ehci-mv: fix error handling in mv_ehci_probe()
	tipc: fix memory leak in service subscripting
	tty: serial: samsung: Correct clock selection logic
	ALSA: hda: Fix potential race in unsol event handler
	powerpc/traps: Make unrecoverable NMIs die instead of panic
	fuse: don't check refcount after stealing page
	USB: EHCI: ehci-mv: fix less than zero comparison of an unsigned int
	scsi: cxlflash: Fix error return code in cxlflash_probe()
	arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register
	e1000: Do not perform reset in reset_task if we are already down
	drm/nouveau/debugfs: fix runtime pm imbalance on error
	drm/nouveau: fix runtime pm imbalance on error
	drm/nouveau/dispnv50: fix runtime pm imbalance on error
	printk: handle blank console arguments passed in.
	usb: dwc3: Increase timeout for CmdAct cleared by device controller
	btrfs: don't force read-only after error in drop snapshot
	vfio/pci: fix memory leaks of eventfd ctx
	perf evsel: Fix 2 memory leaks
	perf trace: Fix the selection for architectures to generate the errno name tables
	perf stat: Fix duration_time value for higher intervals
	perf util: Fix memory leak of prefix_if_not_in
	perf metricgroup: Free metric_events on error
	perf kcore_copy: Fix module map when there are no modules loaded
	ASoC: img-i2s-out: Fix runtime PM imbalance on error
	wlcore: fix runtime pm imbalance in wl1271_tx_work
	wlcore: fix runtime pm imbalance in wlcore_regdomain_config
	mtd: rawnand: omap_elm: Fix runtime PM imbalance on error
	PCI: tegra: Fix runtime PM imbalance on error
	ceph: fix potential race in ceph_check_caps
	mm/swap_state: fix a data race in swapin_nr_pages
	rapidio: avoid data race between file operation callbacks and mport_cdev_add().
	mtd: parser: cmdline: Support MTD names containing one or more colons
	x86/speculation/mds: Mark mds_user_clear_cpu_buffers() __always_inline
	vfio/pci: Clear error and request eventfd ctx after releasing
	cifs: Fix double add page to memcg when cifs_readpages
	nvme: fix possible deadlock when I/O is blocked
	scsi: libfc: Handling of extra kref
	scsi: libfc: Skip additional kref updating work event
	selftests/x86/syscall_nt: Clear weird flags after each test
	vfio/pci: fix racy on error and request eventfd ctx
	btrfs: qgroup: fix data leak caused by race between writeback and truncate
	ubi: fastmap: Free unused fastmap anchor peb during detach
	perf parse-events: Use strcmp() to compare the PMU name
	net: openvswitch: use div_u64() for 64-by-32 divisions
	nvme: explicitly update mpath disk capacity on revalidation
	ASoC: wm8994: Skip setting of the WM8994_MICBIAS register for WM1811
	ASoC: wm8994: Ensure the device is resumed in wm89xx_mic_detect functions
	ASoC: Intel: bytcr_rt5640: Add quirk for MPMAN Converter9 2-in-1
	RISC-V: Take text_mutex in ftrace_init_nop()
	s390/init: add missing __init annotations
	lockdep: fix order in trace_hardirqs_off_caller()
	drm/amdkfd: fix a memory leak issue
	i2c: core: Call i2c_acpi_install_space_handler() before i2c_acpi_register_devices()
	objtool: Fix noreturn detection for ignored functions
	ieee802154: fix one possible memleak in ca8210_dev_com_init
	ieee802154/adf7242: check status of adf7242_read_reg
	clocksource/drivers/h8300_timer8: Fix wrong return value in h8300_8timer_init()
	mwifiex: Increase AES key storage size to 256 bits
	batman-adv: bla: fix type misuse for backbone_gw hash indexing
	atm: eni: fix the missed pci_disable_device() for eni_init_one()
	batman-adv: mcast/TT: fix wrongly dropped or rerouted packets
	mac802154: tx: fix use-after-free
	bpf: Fix clobbering of r2 in bpf_gen_ld_abs
	drm/vc4/vc4_hdmi: fill ASoC card owner
	net: qed: RDMA personality shouldn't fail VF load
	drm/sun4i: sun8i-csc: Secondary CSC register correction
	batman-adv: Add missing include for in_interrupt()
	batman-adv: mcast: fix duplicate mcast packets in BLA backbone from mesh
	batman-adv: mcast: fix duplicate mcast packets from BLA backbone to mesh
	bpf: Fix a rcu warning for bpffs map pretty-print
	ALSA: asihpi: fix iounmap in error handler
	regmap: fix page selection for noinc reads
	MIPS: Add the missing 'CPU_1074K' into __get_cpu_type()
	KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE
	KVM: SVM: Add a dedicated INVD intercept routine
	tracing: fix double free
	s390/dasd: Fix zero write for FBA devices
	kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()
	mm, THP, swap: fix allocating cluster for swapfile by mistake
	s390/zcrypt: Fix ZCRYPT_PERDEV_REQCNT ioctl
	kprobes: Fix compiler warning for !CONFIG_KPROBES_ON_FTRACE
	ata: define AC_ERR_OK
	ata: make qc_prep return ata_completion_errors
	ata: sata_mv, avoid trigerrable BUG_ON
	KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch
	Linux 4.19.149

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Idfc1b35ec63b4b464aeb6e32709102bee0efc872
2020-10-01 16:49:05 +02:00
Gao Xiang
f3e8ed3d33 mm, THP, swap: fix allocating cluster for swapfile by mistake
commit 41663430588c737dd735bad5a0d1ba325dcabd59 upstream.

SWP_FS is used to make swap_{read,write}page() go through the
filesystem, and it's only used for swap files over NFS.  So, !SWP_FS
means non NFS for now, it could be either file backed or device backed.
Something similar goes with legacy SWP_FILE.

So in order to achieve the goal of the original patch, SWP_BLKDEV should
be used instead.

FS corruption can be observed with SSD device + XFS + fragmented
swapfile due to CONFIG_THP_SWAP=y.

I reproduced the issue with the following details:

Environment:

  QEMU + upstream kernel + buildroot + NVMe (2 GB)

Kernel config:

  CONFIG_BLK_DEV_NVME=y
  CONFIG_THP_SWAP=y

Some reproducible steps:

  mkfs.xfs -f /dev/nvme0n1
  mkdir /tmp/mnt
  mount /dev/nvme0n1 /tmp/mnt
  bs="32k"
  sz="1024m"    # doesn't matter too much, I also tried 16m
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw

  mkswap /tmp/mnt/sw
  swapon /tmp/mnt/sw

  stress --vm 2 --vm-bytes 600M   # doesn't matter too much as well

Symptoms:
 - FS corruption (e.g. checksum failure)
 - memory corruption at: 0xd2808010
 - segfault

Fixes: f0eea189e8 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
Fixes: 38d8b4e6bd ("mm, THP, swap: delay splitting THP during swap out")
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Eric Sandeen <esandeen@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200820045323.7809-1-hsiangkao@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-01 13:14:53 +02:00
Qian Cai
8cc3afd53d mm/swap_state: fix a data race in swapin_nr_pages
[ Upstream commit d6c1f098f2a7ba62627c9bc17cda28f534ef9e4a ]

"prev_offset" is a static variable in swapin_nr_pages() that can be
accessed concurrently with only mmap_sem held in read mode as noticed by
KCSAN,

 BUG: KCSAN: data-race in swap_cluster_readahead / swap_cluster_readahead

 write to 0xffffffff92763830 of 8 bytes by task 14795 on cpu 17:
  swap_cluster_readahead+0x2a6/0x5e0
  swapin_readahead+0x92/0x8dc
  do_swap_page+0x49b/0xf20
  __handle_mm_fault+0xcfb/0xd70
  handle_mm_fault+0xfc/0x2f0
  do_page_fault+0x263/0x715
  page_fault+0x34/0x40

 1 lock held by (dnf)/14795:
  #0: ffff897bd2e98858 (&mm->mmap_sem#2){++++}-{3:3}, at: do_page_fault+0x143/0x715
  do_user_addr_fault at arch/x86/mm/fault.c:1405
  (inlined by) do_page_fault at arch/x86/mm/fault.c:1535
 irq event stamp: 83493
 count_memcg_event_mm+0x1a6/0x270
 count_memcg_event_mm+0x119/0x270
 __do_softirq+0x365/0x589
 irq_exit+0xa2/0xc0

 read to 0xffffffff92763830 of 8 bytes by task 1 on cpu 22:
  swap_cluster_readahead+0xfd/0x5e0
  swapin_readahead+0x92/0x8dc
  do_swap_page+0x49b/0xf20
  __handle_mm_fault+0xcfb/0xd70
  handle_mm_fault+0xfc/0x2f0
  do_page_fault+0x263/0x715
  page_fault+0x34/0x40

 1 lock held by systemd/1:
  #0: ffff897c38f14858 (&mm->mmap_sem#2){++++}-{3:3}, at: do_page_fault+0x143/0x715
 irq event stamp: 43530289
 count_memcg_event_mm+0x1a6/0x270
 count_memcg_event_mm+0x119/0x270
 __do_softirq+0x365/0x589
 irq_exit+0xa2/0xc0

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Marco Elver <elver@google.com>
Cc: Hugh Dickins <hughd@google.com>
Link: http://lkml.kernel.org/r/20200402213748.2237-1-cai@lca.pw
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:14:47 +02:00
Jaewon Kim
6bee7991f6 mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area
[ Upstream commit 09ef5283fd96ac424ef0e569626f359bf9ab86c9 ]

On passing requirement to vm_unmapped_area, arch_get_unmapped_area and
arch_get_unmapped_area_topdown did not set align_offset.  Internally on
both unmapped_area and unmapped_area_topdown, if info->align_mask is 0,
then info->align_offset was meaningless.

But commit df529cabb7a2 ("mm: mmap: add trace point of
vm_unmapped_area") always prints info->align_offset even though it is
uninitialized.

Fix this uninitialized value issue by setting it to 0 explicitly.

Before:
  vm_unmapped_area: addr=0x755b155000 err=0 total_vm=0x15aaf0 flags=0x1 len=0x109000 lo=0x8000 hi=0x75eed48000 mask=0x0 ofs=0x4022

After:
  vm_unmapped_area: addr=0x74a4ca1000 err=0 total_vm=0x168ab1 flags=0x1 len=0x9000 lo=0x8000 hi=0x753d94b000 mask=0x0 ofs=0x0

Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20200409094035.19457-1-jaewon31.kim@samsung.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:14:42 +02:00
Qian Cai
b73c744019 mm/vmscan.c: fix data races using kswapd_classzone_idx
[ Upstream commit 5644e1fbbfe15ad06785502bbfe5751223e5841d ]

pgdat->kswapd_classzone_idx could be accessed concurrently in
wakeup_kswapd().  Plain writes and reads without any lock protection
result in data races.  Fix them by adding a pair of READ|WRITE_ONCE() as
well as saving a branch (compilers might well optimize the original code
in an unintentional way anyway).  While at it, also take care of
pgdat->kswapd_order and non-kswapd threads in allow_direct_reclaim().  The
data races were reported by KCSAN,

 BUG: KCSAN: data-race in wakeup_kswapd / wakeup_kswapd

 write to 0xffff9f427ffff2dc of 4 bytes by task 7454 on cpu 13:
  wakeup_kswapd+0xf1/0x400
  wakeup_kswapd at mm/vmscan.c:3967
  wake_all_kswapds+0x59/0xc0
  wake_all_kswapds at mm/page_alloc.c:4241
  __alloc_pages_slowpath+0xdcc/0x1290
  __alloc_pages_slowpath at mm/page_alloc.c:4512
  __alloc_pages_nodemask+0x3bb/0x450
  alloc_pages_vma+0x8a/0x2c0
  do_anonymous_page+0x16e/0x6f0
  __handle_mm_fault+0xcd5/0xd40
  handle_mm_fault+0xfc/0x2f0
  do_page_fault+0x263/0x6f9
  page_fault+0x34/0x40

 1 lock held by mtest01/7454:
  #0: ffff9f425afe8808 (&mm->mmap_sem#2){++++}, at:
 do_page_fault+0x143/0x6f9
 do_user_addr_fault at arch/x86/mm/fault.c:1405
 (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
 irq event stamp: 6944085
 count_memcg_event_mm+0x1a6/0x270
 count_memcg_event_mm+0x119/0x270
 __do_softirq+0x34c/0x57c
 irq_exit+0xa2/0xc0

 read to 0xffff9f427ffff2dc of 4 bytes by task 7472 on cpu 38:
  wakeup_kswapd+0xc8/0x400
  wake_all_kswapds+0x59/0xc0
  __alloc_pages_slowpath+0xdcc/0x1290
  __alloc_pages_nodemask+0x3bb/0x450
  alloc_pages_vma+0x8a/0x2c0
  do_anonymous_page+0x16e/0x6f0
  __handle_mm_fault+0xcd5/0xd40
  handle_mm_fault+0xfc/0x2f0
  do_page_fault+0x263/0x6f9
  page_fault+0x34/0x40

 1 lock held by mtest01/7472:
  #0: ffff9f425a9ac148 (&mm->mmap_sem#2){++++}, at:
 do_page_fault+0x143/0x6f9
 irq event stamp: 6793561
 count_memcg_event_mm+0x1a6/0x270
 count_memcg_event_mm+0x119/0x270
 __do_softirq+0x34c/0x57c
 irq_exit+0xa2/0xc0

 BUG: KCSAN: data-race in kswapd / wakeup_kswapd

 write to 0xffff90973ffff2dc of 4 bytes by task 820 on cpu 6:
  kswapd+0x27c/0x8d0
  kthread+0x1e0/0x200
  ret_from_fork+0x27/0x50

 read to 0xffff90973ffff2dc of 4 bytes by task 6299 on cpu 0:
  wakeup_kswapd+0xf3/0x450
  wake_all_kswapds+0x59/0xc0
  __alloc_pages_slowpath+0xdcc/0x1290
  __alloc_pages_nodemask+0x3bb/0x450
  alloc_pages_vma+0x8a/0x2c0
  do_anonymous_page+0x170/0x700
  __handle_mm_fault+0xc9f/0xd00
  handle_mm_fault+0xfc/0x2f0
  do_page_fault+0x263/0x6f9
  page_fault+0x34/0x40

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Marco Elver <elver@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Link: http://lkml.kernel.org/r/1582749472-5171-1-git-send-email-cai@lca.pw
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:14:41 +02:00
Xianting Tian
cebefe4f6f mm/filemap.c: clear page error before actual read
[ Upstream commit faffdfa04fa11ccf048cebdde73db41ede0679e0 ]

Mount failure issue happens under the scenario: Application forked dozens
of threads to mount the same number of cramfs images separately in docker,
but several mounts failed with high probability.  Mount failed due to the
checking result of the page(read from the superblock of loop dev) is not
uptodate after wait_on_page_locked(page) returned in function cramfs_read:

   wait_on_page_locked(page);
   if (!PageUptodate(page)) {
      ...
   }

The reason of the checking result of the page not uptodate: systemd-udevd
read the loopX dev before mount, because the status of loopX is Lo_unbound
at this time, so loop_make_request directly trigger the calling of io_end
handler end_buffer_async_read, which called SetPageError(page).  So It
caused the page can't be set to uptodate in function
end_buffer_async_read:

   if(page_uptodate && !PageError(page)) {
      SetPageUptodate(page);
   }

Then mount operation is performed, it used the same page which is just
accessed by systemd-udevd above, Because this page is not uptodate, it
will launch a actual read via submit_bh, then wait on this page by calling
wait_on_page_locked(page).  When the I/O of the page done, io_end handler
end_buffer_async_read is called, because no one cleared the page
error(during the whole read path of mount), which is caused by
systemd-udevd reading, so this page is still in "PageError" status, which
can't be set to uptodate in function end_buffer_async_read, then caused
mount failure.

But sometimes mount succeed even through systemd-udeved read loopX dev
just before, The reason is systemd-udevd launched other loopX read just
between step 3.1 and 3.2, the steps as below:

1, loopX dev default status is Lo_unbound;
2, systemd-udved read loopX dev (page is set to PageError);
3, mount operation
   1) set loopX status to Lo_bound;
   ==>systemd-udevd read loopX dev<==
   2) read loopX dev(page has no error)
   3) mount succeed

As the loopX dev status is set to Lo_bound after step 3.1, so the other
loopX dev read by systemd-udevd will go through the whole I/O stack, part
of the call trace as below:

   SYS_read
      vfs_read
          do_sync_read
              blkdev_aio_read
                 generic_file_aio_read
                     do_generic_file_read:
                        ClearPageError(page);
                        mapping->a_ops->readpage(filp, page);

here, mapping->a_ops->readpage() is blkdev_readpage.  In latest kernel,
some function name changed, the call trace as below:

   blkdev_read_iter
      generic_file_read_iter
         generic_file_buffered_read:
            /*
             * A previous I/O error may have been due to temporary
             * failures, eg. mutipath errors.
             * Pg_error will be set again if readpage fails.
             */
            ClearPageError(page);
            /* Start the actual read. The read will unlock the page*/
            error=mapping->a_ops->readpage(flip, page);

We can see ClearPageError(page) is called before the actual read,
then the read in step 3.2 succeed.

This patch is to add the calling of ClearPageError just before the actual
read of read path of cramfs mount.  Without the patch, the call trace as
below when performing cramfs mount:

   do_mount
      cramfs_read
         cramfs_blkdev_read
            read_cache_page
               do_read_cache_page:
                  filler(data, page);
                  or
                  mapping->a_ops->readpage(data, page);

With the patch, the call trace as below when performing mount:

   do_mount
      cramfs_read
         cramfs_blkdev_read
            read_cache_page:
               do_read_cache_page:
                  ClearPageError(page); <== new add
                  filler(data, page);
                  or
                  mapping->a_ops->readpage(data, page);

With the patch, mount operation trigger the calling of
ClearPageError(page) before the actual read, the page has no error if no
additional page error happen when I/O done.

Signed-off-by: Xianting Tian <xianting_tian@126.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: <yubin@h3c.com>
Link: http://lkml.kernel.org/r/1583318844-22971-1-git-send-email-xianting_tian@126.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:14:41 +02:00
Nathan Chancellor
afe001488e mm/kmemleak.c: use address-of operator on section symbols
[ Upstream commit b0d14fc43d39203ae025f20ef4d5d25d9ccf4be1 ]

Clang warns:

  mm/kmemleak.c:1955:28: warning: array comparison always evaluates to a constant [-Wtautological-compare]
        if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
                                  ^
  mm/kmemleak.c:1955:60: warning: array comparison always evaluates to a constant [-Wtautological-compare]
        if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)

These are not true arrays, they are linker defined symbols, which are just
addresses.  Using the address of operator silences the warning and does
not change the resulting assembly with either clang/ld.lld or gcc/ld
(tested with diff + objdump -Dr).

Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/895
Link: http://lkml.kernel.org/r/20200220051551.44000-1-natechancellor@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:14:41 +02:00
Kirill A. Shutemov
2b294ac325 mm: avoid data corruption on CoW fault into PFN-mapped VMA
[ Upstream commit c3e5ea6ee574ae5e845a40ac8198de1fb63bb3ab ]

Jeff Moyer has reported that one of xfstests triggers a warning when run
on DAX-enabled filesystem:

	WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
	...
	wp_page_copy+0x98c/0xd50 (unreliable)
	do_wp_page+0xd8/0xad0
	__handle_mm_fault+0x748/0x1b90
	handle_mm_fault+0x120/0x1f0
	__do_page_fault+0x240/0xd70
	do_page_fault+0x38/0xd0
	handle_page_fault+0x10/0x30

The warning happens on failed __copy_from_user_inatomic() which tries to
copy data into a CoW page.

This happens because of race between MADV_DONTNEED and CoW page fault:

	CPU0					CPU1
 handle_mm_fault()
   do_wp_page()
     wp_page_copy()
       do_wp_page()
					madvise(MADV_DONTNEED)
					  zap_page_range()
					    zap_pte_range()
					      ptep_get_and_clear_full()
					      <TLB flush>
	 __copy_from_user_inatomic()
	 sees empty PTE and fails
	 WARN_ON_ONCE(1)
	 clear_page()

The solution is to re-try __copy_from_user_inatomic() under PTL after
checking that PTE is matches the orig_pte.

The second copy attempt can still fail, like due to non-readable PTE, but
there's nothing reasonable we can do about, except clearing the CoW page.

Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@vger.kernel.org>
Cc: Justin He <Justin.He@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Link: http://lkml.kernel.org/r/20200218154151.13349-1-kirill.shutemov@linux.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:14:36 +02:00
Steven Price
f9cb6b6124 mm: pagewalk: fix termination condition in walk_pte_range()
[ Upstream commit c02a98753e0a36ba65a05818626fa6adeb4e7c97 ]

If walk_pte_range() is called with a 'end' argument that is beyond the
last page of memory (e.g.  ~0UL) then the comparison between 'addr' and
'end' will always fail and the loop will be infinite.  Instead change the
comparison to >= while accounting for overflow.

Link: http://lkml.kernel.org/r/20191218162402.45610-15-steven.price@arm.com
Signed-off-by: Steven Price <steven.price@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zong Li <zong.li@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:14:32 +02:00
Vasily Averin
52f5a09ab7 mm/swapfile.c: swap_next should increase position index
[ Upstream commit 10c8d69f314d557d94d74ec492575ae6a4f1eb1c ]

If seq_file .next fuction does not change position index, read after
some lseek can generate unexpected output.

In Aug 2018 NeilBrown noticed commit 1f4aace60b ("fs/seq_file.c:
simplify seq_file iteration code and interface") "Some ->next functions
do not increment *pos when they return NULL...  Note that such ->next
functions are buggy and should be fixed.  A simple demonstration is

  dd if=/proc/swaps bs=1000 skip=1

Choose any block size larger than the size of /proc/swaps.  This will
always show the whole last line of /proc/swaps"

Described problem is still actual.  If you make lseek into middle of
last output line following read will output end of last line and whole
last line once again.

  $ dd if=/proc/swaps bs=1  # usual output
  Filename				Type		Size	Used	Priority
  /dev/dm-0                               partition	4194812	97536	-2
  104+0 records in
  104+0 records out
  104 bytes copied

  $ dd if=/proc/swaps bs=40 skip=1    # last line was generated twice
  dd: /proc/swaps: cannot skip to specified offset
  v/dm-0                               partition	4194812	97536	-2
  /dev/dm-0                               partition	4194812	97536	-2
  3+1 records in
  3+1 records out
  131 bytes copied

https://bugzilla.kernel.org/show_bug.cgi?id=206283

Link: http://lkml.kernel.org/r/bd8cfd7b-ac95-9b91-f9e7-e8438bd5047d@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jann Horn <jannh@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:14:32 +02:00
Jia He
8579a04403 mm: fix double page fault on arm64 if PTE_AF is cleared
[ Upstream commit 83d116c53058d505ddef051e90ab27f57015b025 ]

When we tested pmdk unit test [1] vmmalloc_fork TEST3 on arm64 guest, there
will be a double page fault in __copy_from_user_inatomic of cow_user_page.

To reproduce the bug, the cmd is as follows after you deployed everything:
make -C src/test/vmmalloc_fork/ TEST_TIME=60m check

Below call trace is from arm64 do_page_fault for debugging purpose:
[  110.016195] Call trace:
[  110.016826]  do_page_fault+0x5a4/0x690
[  110.017812]  do_mem_abort+0x50/0xb0
[  110.018726]  el1_da+0x20/0xc4
[  110.019492]  __arch_copy_from_user+0x180/0x280
[  110.020646]  do_wp_page+0xb0/0x860
[  110.021517]  __handle_mm_fault+0x994/0x1338
[  110.022606]  handle_mm_fault+0xe8/0x180
[  110.023584]  do_page_fault+0x240/0x690
[  110.024535]  do_mem_abort+0x50/0xb0
[  110.025423]  el0_da+0x20/0x24

The pte info before __copy_from_user_inatomic is (PTE_AF is cleared):
[ffff9b007000] pgd=000000023d4f8003, pud=000000023da9b003,
               pmd=000000023d4b3003, pte=360000298607bd3

As told by Catalin: "On arm64 without hardware Access Flag, copying from
user will fail because the pte is old and cannot be marked young. So we
always end up with zeroed page after fork() + CoW for pfn mappings. we
don't always have a hardware-managed access flag on arm64."

This patch fixes it by calling pte_mkyoung. Also, the parameter is
changed because vmf should be passed to cow_user_page()

Add a WARN_ON_ONCE when __copy_from_user_inatomic() returns error
in case there can be some obscure use-case (by Kirill).

[1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork

Signed-off-by: Jia He <justin.he@arm.com>
Reported-by: Yibo Cai <Yibo.Cai@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:14:24 +02:00
Greg Kroah-Hartman
5d5e582aab This is the 4.19.148 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl9vZgoACgkQONu9yGCS
 aT6CSBAAqsWMXD4MWA35hVDaFmbMOQkZzdDPTIYzRSDIDMqkbtL+4XAPzqfP7yJH
 CvQwqSi5XxqBFnlZwIMvY9cbhEumYUyWbFT0rK07ZJApT2uDBcfdZiFxDP/I0vQk
 Fj5uuFFORs39Kv2CUYWHgkyS0KcrSul5kwfxWblJADZNq5eJp6vnOBHMc59uRP14
 SaWWXac9ltGecIn2yKkVz28lW0iI8gSnIf5Bj1/kWC8vcRcaWP2biwflwz6PjFi6
 dfKWYN4t7wNK/cM0RigRp7suuIcpvcNv5yu+al2Ul0vi4zn5Po8Kt8Bancn/iWR5
 5aCtvhT6OyuQfD67HxSMLZLcEH9X/C7pv+0sEIYlOb1j1IiEec4XrEs8ivlFpl0/
 weG+fEM6gyd4+IxTBkLAHQah8s4kSM9fWmyrLFqMyw3O1ImvImjrQmWMnZdDWBtO
 BGe5IuCMaFbCzbTGZRNZ7PG8EF3ayGYl16rE2FsHawDACp97p6Tih5Y6suwp9vYM
 Ph4IhNWNYTs1DcgOhqKLTnhbbF1IeJ8kJiVU3nOA7xyLwzdAxwHWk8NEN2yxhz5D
 aM8YS75gAqRp6km9OPKWG+M9THgT4RtI1O/fAA26CRMc505LRi+w4qPtEMJUPocU
 v6oJwB3T5BJdkdHPGcJi84hjqea2VGzG5x/cTI2a985DaoNtvc8=
 =uvGG
 -----END PGP SIGNATURE-----

Merge 4.19.148 into android-4.19-stable

Changes in 4.19.148
	af_key: pfkey_dump needs parameter validation
	KVM: fix memory leak in kvm_io_bus_unregister_dev()
	kprobes: fix kill kprobe which has been marked as gone
	mm/thp: fix __split_huge_pmd_locked() for migration PMD
	cxgb4: Fix offset when clearing filter byte counters
	geneve: add transport ports in route lookup for geneve
	hdlc_ppp: add range checks in ppp_cp_parse_cr()
	ip: fix tos reflection in ack and reset packets
	ipv6: avoid lockdep issue in fib6_del()
	net: DCB: Validate DCB_ATTR_DCB_BUFFER argument
	net: dsa: rtl8366: Properly clear member config
	net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC
	net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
	nfp: use correct define to return NONE fec
	tipc: Fix memory leak in tipc_group_create_member()
	tipc: fix shutdown() of connection oriented socket
	tipc: use skb_unshare() instead in tipc_buf_append()
	bnxt_en: return proper error codes in bnxt_show_temp
	bnxt_en: Protect bnxt_set_eee() and bnxt_set_pauseparam() with mutex.
	net: phy: Avoid NPD upon phy_detach() when driver is unbound
	net: qrtr: check skb_put_padto() return value
	net: add __must_check to skb_put_padto()
	ipv4: Update exception handling for multipath routes via same device
	MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info
	kbuild: add OBJSIZE variable for the size tool
	Documentation/llvm: add documentation on building w/ Clang/LLVM
	Documentation/llvm: fix the name of llvm-size
	net: wan: wanxl: use allow to pass CROSS_COMPILE_M68k for rebuilding firmware
	net: wan: wanxl: use $(M68KCC) instead of $(M68KAS) for rebuilding firmware
	x86/boot: kbuild: allow readelf executable to be specified
	kbuild: remove AS variable
	kbuild: replace AS=clang with LLVM_IAS=1
	kbuild: support LLVM=1 to switch the default tools to Clang/LLVM
	mm: memcg: fix memcg reclaim soft lockup
	tcp_bbr: refactor bbr_target_cwnd() for general inflight provisioning
	tcp_bbr: adapt cwnd based on ack aggregation estimation
	serial: 8250: Avoid error message on reprobe
	Linux 4.19.148

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5077093f58ae882547e3b23c5bbd4dfe78116071
2020-09-27 07:58:48 +02:00
Xunlei Pang
1aa7a9e5ee mm: memcg: fix memcg reclaim soft lockup
commit e3336cab2579012b1e72b5265adf98e2d6e244ad upstream.

We've met softlockup with "CONFIG_PREEMPT_NONE=y", when the target memcg
doesn't have any reclaimable memory.

It can be easily reproduced as below:

  watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204]
  CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12
  Call Trace:
    shrink_lruvec+0x49f/0x640
    shrink_node+0x2a6/0x6f0
    do_try_to_free_pages+0xe9/0x3e0
    try_to_free_mem_cgroup_pages+0xef/0x1f0
    try_charge+0x2c1/0x750
    mem_cgroup_charge+0xd7/0x240
    __add_to_page_cache_locked+0x2fd/0x370
    add_to_page_cache_lru+0x4a/0xc0
    pagecache_get_page+0x10b/0x2f0
    filemap_fault+0x661/0xad0
    ext4_filemap_fault+0x2c/0x40
    __do_fault+0x4d/0xf9
    handle_mm_fault+0x1080/0x1790

It only happens on our 1-vcpu instances, because there's no chance for
oom reaper to run to reclaim the to-be-killed process.

Add a cond_resched() at the upper shrink_node_memcgs() to solve this
issue, this will mean that we will get a scheduling point for each memcg
in the reclaimed hierarchy without any dependency on the reclaimable
memory in that memcg thus making it more predictable.

Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: http://lkml.kernel.org/r/1598495549-67324-1-git-send-email-xlpang@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Julius Hemanth Pitti <jpitti@cisco.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-26 18:01:32 +02:00
Ralph Campbell
ec56646e3b mm/thp: fix __split_huge_pmd_locked() for migration PMD
[ Upstream commit ec0abae6dcdf7ef88607c869bf35a4b63ce1b370 ]

A migrating transparent huge page has to already be unmapped.  Otherwise,
the page could be modified while it is being copied to a new page and data
could be lost.  The function __split_huge_pmd() checks for a PMD migration
entry before calling __split_huge_pmd_locked() leading one to think that
__split_huge_pmd_locked() can handle splitting a migrating PMD.

However, the code always increments the page->_mapcount and adjusts the
memory control group accounting assuming the page is mapped.

Also, if the PMD entry is a migration PMD entry, the call to
is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead
of migration_entry_to_pfn(pmd_to_swp_entry(pmd)).  Fix these problems by
checking for a PMD migration entry.

Fixes: 84c3fc4e9c ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>	[4.14+]
Link: https://lkml.kernel.org/r/20200903183140.19055-1-rcampbell@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-26 18:01:28 +02:00
Greg Kroah-Hartman
0b8c61c48e This is the 4.19.147 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl9rH4IACgkQONu9yGCS
 aT4wsw/9E4DOfXrQvUvScgV6weqomHrJ4oEEh/hckog4lHcym6//DxRhXXgd1OVL
 CoNdIciv5ixqMjENMjGxM9Ob/GiTihhl7Zivu+DsLmCHAovcUsUxBGjg4Oh6nJHt
 BCh4Dy2umFi6dzohkWRCdLyJie6hlXoGtpycRZ+3KlOV3x6/A4lLeiqTBh1bwJxq
 30VBFimykIcjSzGSpu/HuMFXuDlT9TCaY20JSnKTW1Z2hapTG3Q6wERNMfRyaUhH
 JwDXWVPN+1zLecNQGODSsJrapaQCqWgVKaQjJoB6Pn3TMTMQox0bRCXErjq8YZJr
 NPF54m0flHj1hIQhsKRM5Zp/3Dd2kGXNzOq8HN2sRb0jz2L9WsnVTvWYPyPV9TEt
 yaytDy7XwSASVp1IFzGe4uFVevezw7OF0bbNJm2dMB0+O6pTaHyZ2KAlVa9CWiSy
 h/kGdu1JiV4jgaRDz4WdEVUvojSrQonC3CsmMZKOt2klAgPCKE6LOws5ZyCr1haK
 TgsajqNwNZcfqMGnCxOwA5nZEzxSxGdAnFZ+1slzDQzDOqwGYTQZr3sa4pY+7V7r
 eIGYIC1VL3Vjnq3KavDdBbUiQvskEYUMnjq7NUCMZVW2IJ0ULm05MO4nHXzKfT7o
 dw/H5EUUJ5ta+8f2XTS0p2ZRy+F0bkW0evmh15Cvt5dQLkxF35A=
 =gyKc
 -----END PGP SIGNATURE-----

Merge 4.19.147 into android-4.19-stable

Changes in 4.19.147
	dsa: Allow forwarding of redirected IGMP traffic
	scsi: qla2xxx: Update rscn_rcvd field to more meaningful scan_needed
	scsi: qla2xxx: Move rport registration out of internal work_list
	scsi: qla2xxx: Reduce holding sess_lock to prevent CPU lock-up
	gfs2: initialize transaction tr_ailX_lists earlier
	RDMA/bnxt_re: Restrict the max_gids to 256
	net: handle the return value of pskb_carve_frag_list() correctly
	hv_netvsc: Remove "unlikely" from netvsc_select_queue
	NFSv4.1 handle ERR_DELAY error reclaiming locking state on delegation recall
	scsi: pm8001: Fix memleak in pm8001_exec_internal_task_abort
	scsi: libfc: Fix for double free()
	scsi: lpfc: Fix FLOGI/PLOGI receive race condition in pt2pt discovery
	regulator: pwm: Fix machine constraints application
	spi: spi-loopback-test: Fix out-of-bounds read
	NFS: Zero-stateid SETATTR should first return delegation
	SUNRPC: stop printk reading past end of string
	rapidio: Replace 'select' DMAENGINES 'with depends on'
	openrisc: Fix cache API compile issue when not inlining
	nvme-fc: cancel async events before freeing event struct
	nvme-rdma: cancel async events before freeing event struct
	f2fs: fix indefinite loop scanning for free nid
	f2fs: Return EOF on unaligned end of file DIO read
	i2c: algo: pca: Reapply i2c bus settings after reset
	spi: Fix memory leak on splited transfers
	KVM: MIPS: Change the definition of kvm type
	clk: davinci: Use the correct size when allocating memory
	clk: rockchip: Fix initialization of mux_pll_src_4plls_p
	ASoC: qcom: Set card->owner to avoid warnings
	Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload
	perf test: Fix the "signal" test inline assembly
	MIPS: SNI: Fix MIPS_L1_CACHE_SHIFT
	perf test: Free formats for perf pmu parse test
	fbcon: Fix user font detection test at fbcon_resize().
	MIPS: SNI: Fix spurious interrupts
	drm/mediatek: Add exception handing in mtk_drm_probe() if component init fail
	drm/mediatek: Add missing put_device() call in mtk_hdmi_dt_parse_pdata()
	USB: quirks: Add USB_QUIRK_IGNORE_REMOTE_WAKEUP quirk for BYD zhaoxin notebook
	USB: UAS: fix disconnect by unplugging a hub
	usblp: fix race between disconnect() and read()
	i2c: i801: Fix resume bug
	Revert "ALSA: hda - Fix silent audio output and corrupted input on MSI X570-A PRO"
	percpu: fix first chunk size calculation for populated bitmap
	Input: trackpoint - add new trackpoint variant IDs
	Input: i8042 - add Entroware Proteus EL07R4 to nomux and reset lists
	serial: 8250_pci: Add Realtek 816a and 816b
	x86/boot/compressed: Disable relocation relaxation
	ehci-hcd: Move include to keep CRC stable
	powerpc/dma: Fix dma_map_ops::get_required_mask
	x86/defconfig: Enable CONFIG_USB_XHCI_HCD=y
	Linux 4.19.147

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1c512021698db2701c51491a813bec79bda6bbf5
2020-09-24 12:48:04 +02:00
Sunghyun Jin
5afd52f302 percpu: fix first chunk size calculation for populated bitmap
commit b3b33d3c43bbe0177d70653f4e889c78cc37f097 upstream.

Variable populated, which is a member of struct pcpu_chunk, is used as a
unit of size of unsigned long.
However, size of populated is miscounted. So, I fix this minor part.

Fixes: 8ab16c43ea ("percpu: change the number of pages marked in the first_chunk pop bitmap")
Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Sunghyun Jin <mcsmonk@gmail.com>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-23 12:11:01 +02:00
Greg Kroah-Hartman
009b982d9c This is the 4.19.144 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl9ZCzcACgkQONu9yGCS
 aT6VCw//aRDPlJMx+LHqasSImMs3lCHnMcHVxE09knu30mQNpJXv05piJ70/Ct/S
 ukE4e6NEv2L9wgE/Ti4q7Hk5ahTeMqE3fxewtsJcoTnswa6LwL7X0iPtdFs3aVSa
 u6nwlkkz5uXE4xFD1fE8WudizglSqFc6cSD+3AJvtI4DAM+it920EGvAX6yzk9Gi
 7wF5lsqX0xm6Jn2XZ+ZnI8f49cKE+7n8aKpkXFGyrPVqrVotXdJUcovz2eEXxO3E
 vo95Z1FksBqY7gMOCbrLiBXspMujaIduphKmWUIeNccAsexMVJjfJqO5GTZA/siW
 GxdauwbMWWAOcw9RAnjt5crmPU7gtUcaXr32ST42BmZtDWW0frj4hN6jcsvfW6KO
 uyWKIi6SidQJui/dcDyzTwhJlUzUhxY1bj/hWMwLmJznfNMqeS1wFF/5xewWShwG
 dxmhuZAsoI8CrpHG4kvJiZ2vvHvS7zNDdXWQyHE9GOh6xcAdhRc9nhkPd9ugubDf
 3wuHuSpQg7fbsq98QxTKM1irlsBXNXBpw/VbiYhhbfN5n9VCFj82KSJZf321/BVk
 PoETRPFnrYU3/85xDxvEbAX9EYCWHQaJWq49kZRBAQ9yMOUFYrcRmzfRtDvNdNzs
 dE+kGJhgu90wrJYkywOqBHsi/7jNIqjRG6/lDYxICaRI9NEbaa4=
 =aozR
 -----END PGP SIGNATURE-----

Merge 4.19.144 into android-4.19-stable

Changes in 4.19.144
	HID: core: Correctly handle ReportSize being zero
	HID: core: Sanitize event code and type when mapping input
	perf record/stat: Explicitly call out event modifiers in the documentation
	scsi: target: tcmu: Fix size in calls to tcmu_flush_dcache_range
	scsi: target: tcmu: Optimize use of flush_dcache_page
	tty: serial: qcom_geni_serial: Drop __init from qcom_geni_console_setup
	drm/msm: add shutdown support for display platform_driver
	hwmon: (applesmc) check status earlier.
	nvmet: Disable keep-alive timer when kato is cleared to 0h
	drm/msm/a6xx: fix gmu start on newer firmware
	ceph: don't allow setlease on cephfs
	cpuidle: Fixup IRQ state
	s390: don't trace preemption in percpu macros
	xen/xenbus: Fix granting of vmalloc'd memory
	dmaengine: of-dma: Fix of_dma_router_xlate's of_dma_xlate handling
	batman-adv: Avoid uninitialized chaddr when handling DHCP
	batman-adv: Fix own OGM check in aggregated OGMs
	batman-adv: bla: use netif_rx_ni when not in interrupt context
	dmaengine: at_hdmac: check return value of of_find_device_by_node() in at_dma_xlate()
	MIPS: mm: BMIPS5000 has inclusive physical caches
	MIPS: BMIPS: Also call bmips_cpu_setup() for secondary cores
	netfilter: nf_tables: add NFTA_SET_USERDATA if not null
	netfilter: nf_tables: incorrect enum nft_list_attributes definition
	netfilter: nf_tables: fix destination register zeroing
	net: hns: Fix memleak in hns_nic_dev_probe
	net: systemport: Fix memleak in bcm_sysport_probe
	ravb: Fixed to be able to unload modules
	net: arc_emac: Fix memleak in arc_mdio_probe
	dmaengine: pl330: Fix burst length if burst size is smaller than bus width
	gtp: add GTPA_LINK info to msg sent to userspace
	bnxt_en: Don't query FW when netif_running() is false.
	bnxt_en: Check for zero dir entries in NVRAM.
	bnxt_en: Fix PCI AER error recovery flow
	bnxt_en: fix HWRM error when querying VF temperature
	xfs: fix boundary test in xfs_attr_shortform_verify
	bnxt: don't enable NAPI until rings are ready
	selftests/bpf: Fix massive output from test_maps
	netfilter: nfnetlink: nfnetlink_unicast() reports EAGAIN instead of ENOBUFS
	nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()'
	perf tools: Correct SNOOPX field offset
	net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()
	fix regression in "epoll: Keep a reference on files added to the check list"
	net: gemini: Fix another missing clk_disable_unprepare() in probe
	xfs: fix xfs_bmap_validate_extent_raw when checking attr fork of rt files
	perf jevents: Fix suspicious code in fixregex()
	tg3: Fix soft lockup when tg3_reset_task() fails.
	x86, fakenuma: Fix invalid starting node ID
	iommu/vt-d: Serialize IOMMU GCMD register modifications
	thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430
	include/linux/log2.h: add missing () around n in roundup_pow_of_two()
	ext2: don't update mtime on COW faults
	xfs: don't update mtime on COW faults
	btrfs: drop path before adding new uuid tree entry
	vfio/type1: Support faulting PFNMAP vmas
	vfio-pci: Fault mmaps to enable vma tracking
	vfio-pci: Invalidate mmaps and block MMIO access on disabled memory
	btrfs: Remove redundant extent_buffer_get in get_old_root
	btrfs: Remove extraneous extent_buffer_get from tree_mod_log_rewind
	btrfs: set the lockdep class for log tree extent buffers
	uaccess: Add non-pagefault user-space read functions
	uaccess: Add non-pagefault user-space write function
	btrfs: fix potential deadlock in the search ioctl
	net: usb: qmi_wwan: add Telit 0x1050 composition
	usb: qmi_wwan: add D-Link DWM-222 A2 device ID
	ALSA: ca0106: fix error code handling
	ALSA: pcm: oss: Remove superfluous WARN_ON() for mulaw sanity check
	ALSA: hda/hdmi: always check pin power status in i915 pin fixup
	ALSA: firewire-digi00x: exclude Avid Adrenaline from detection
	ALSA: hda - Fix silent audio output and corrupted input on MSI X570-A PRO
	media: rc: do not access device via sysfs after rc_unregister_device()
	media: rc: uevent sysfs file races with rc_unregister_device()
	affs: fix basic permission bits to actually work
	block: allow for_each_bvec to support zero len bvec
	libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to Sandisks
	dm writecache: handle DAX to partitions on persistent memory correctly
	dm cache metadata: Avoid returning cmd->bm wild pointer on error
	dm thin metadata: Avoid returning cmd->bm wild pointer on error
	mm: slub: fix conversion of freelist_corrupted()
	KVM: arm64: Add kvm_extable for vaxorcism code
	KVM: arm64: Defer guest entry when an asynchronous exception is pending
	KVM: arm64: Survive synchronous exceptions caused by AT instructions
	KVM: arm64: Set HCR_EL2.PTW to prevent AT taking synchronous exception
	vfio/pci: Fix SR-IOV VF handling with MMIO blocking
	checkpatch: fix the usage of capture group ( ... )
	mm/hugetlb: fix a race between hugetlb sysctl handlers
	cfg80211: regulatory: reject invalid hints
	net: usb: Fix uninit-was-stored issue in asix_read_phy_addr()
	Linux 4.19.144

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I81d6b3f044fe0dd919d1ece16d131c2185c00bb3
2020-09-09 19:48:58 +02:00
Muchun Song
221ea9a3da mm/hugetlb: fix a race between hugetlb sysctl handlers
commit 17743798d81238ab13050e8e2833699b54e15467 upstream.

There is a race between the assignment of `table->data` and write value
to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on
the other thread.

  CPU0:                                 CPU1:
                                        proc_sys_write
  hugetlb_sysctl_handler                  proc_sys_call_handler
  hugetlb_sysctl_handler_common             hugetlb_sysctl_handler
    table->data = &tmp;                       hugetlb_sysctl_handler_common
                                                table->data = &tmp;
      proc_doulongvec_minmax
        do_proc_doulongvec_minmax           sysctl_head_finish
          __do_proc_doulongvec_minmax         unuse_table
            i = table->data;
            *i = val;  // corrupt CPU1's stack

Fix this by duplicating the `table`, and only update the duplicate of
it.  And introduce a helper of proc_hugetlb_doulongvec_minmax() to
simplify the code.

The following oops was seen:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    #PF: supervisor instruction fetch in kernel mode
    #PF: error_code(0x0010) - not-present page
    Code: Bad RIP value.
    ...
    Call Trace:
     ? set_max_huge_pages+0x3da/0x4f0
     ? alloc_pool_huge_page+0x150/0x150
     ? proc_doulongvec_minmax+0x46/0x60
     ? hugetlb_sysctl_handler_common+0x1c7/0x200
     ? nr_hugepages_store+0x20/0x20
     ? copy_fd_bitmaps+0x170/0x170
     ? hugetlb_sysctl_handler+0x1e/0x20
     ? proc_sys_call_handler+0x2f1/0x300
     ? unregister_sysctl_table+0xb0/0xb0
     ? __fd_install+0x78/0x100
     ? proc_sys_write+0x14/0x20
     ? __vfs_write+0x4d/0x90
     ? vfs_write+0xef/0x240
     ? ksys_write+0xc0/0x160
     ? __ia32_sys_read+0x50/0x50
     ? __close_fd+0x129/0x150
     ? __x64_sys_write+0x43/0x50
     ? do_syscall_64+0x6c/0x200
     ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: e5ff215941 ("hugetlb: multiple hstates for multiple page sizes")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20200828031146.43035-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-09 19:04:32 +02:00
Eugeniu Rosca
af2cf2c5a2 mm: slub: fix conversion of freelist_corrupted()
commit dc07a728d49cf025f5da2c31add438d839d076c0 upstream.

Commit 52f23478081ae0 ("mm/slub.c: fix corrupted freechain in
deactivate_slab()") suffered an update when picked up from LKML [1].

Specifically, relocating 'freelist = NULL' into 'freelist_corrupted()'
created a no-op statement.  Fix it by sticking to the behavior intended
in the original patch [1].  In addition, make freelist_corrupted()
immune to passing NULL instead of &freelist.

The issue has been spotted via static analysis and code review.

[1] https://lore.kernel.org/linux-mm/20200331031450.12182-1-dongli.zhang@oracle.com/

Fixes: 52f23478081ae0 ("mm/slub.c: fix corrupted freechain in deactivate_slab()")
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dongli Zhang <dongli.zhang@oracle.com>
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200824130643.10291-1-erosca@de.adit-jv.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-09 19:04:31 +02:00
Daniel Borkmann
cfb4721fce uaccess: Add non-pagefault user-space write function
[ Upstream commit 1d1585ca0f48fe7ed95c3571f3e4a82b2b5045dc ]

Commit 3d7081822f7f ("uaccess: Add non-pagefault user-space read functions")
missed to add probe write function, therefore factor out a probe_write_common()
helper with most logic of probe_kernel_write() except setting KERNEL_DS, and
add a new probe_user_write() helper so it can be used from BPF side.

Again, on some archs, the user address space and kernel address space can
co-exist and be overlapping, so in such case, setting KERNEL_DS would mean
that the given address is treated as being in kernel address space.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/bpf/9df2542e68141bfa3addde631441ee45503856a8.1572649915.git.daniel@iogearbox.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-09 19:04:29 +02:00
Masami Hiramatsu
61135a9c74 uaccess: Add non-pagefault user-space read functions
[ Upstream commit 3d7081822f7f9eab867d9bcc8fd635208ec438e0 ]

Add probe_user_read(), strncpy_from_unsafe_user() and
strnlen_unsafe_user() which allows caller to access user-space
in IRQ context.

Current probe_kernel_read() and strncpy_from_unsafe() are
not available for user-space memory, because it sets
KERNEL_DS while accessing data. On some arch, user address
space and kernel address space can be co-exist, but others
can not. In that case, setting KERNEL_DS means given
address is treated as a kernel address space.
Also strnlen_user() is only available from user context since
it can sleep if pagefault is enabled.

To access user-space memory without pagefault, we need
these new functions which sets USER_DS while accessing
the data.

Link: http://lkml.kernel.org/r/155789869802.26965.4940338412595759063.stgit@devnote2

Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-09 19:04:29 +02:00
Greg Kroah-Hartman
599bf02de4 This is the 4.19.142 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl9GHdsACgkQONu9yGCS
 aT4AoA//RHH+8srJoIL7iz4HMcXbSTqom//BkZKhDLMvDoOHt7GE3t571kM4Bx99
 cY+oJCxsfUgbSLGE2eBRmfr0i+kcyT/Ke1Jyp0/3+lrqZeFxhtda8z1TYz0PC0E6
 V/M9OaKKpKFW2tsGxsiKsomE4wNZExhKl2yti6QWS6jl+1ngAKZEg0LLMjDDSC3G
 CGtnk9yYjdExxky0XYN15B7I4RfIFLmHprT++Ctrgxq6wlrOiZyB2LqNJeZdJmsx
 7tieTxC0rAsyMG5w1j6kFy5+6e+5t81B5yk5IfHNH17ZUU+L8p15fC172GEi3rwn
 UOYPZxIEJs4wRImJTur3JwfQbt2ySt45GNJBTVtOt/dUvS141NgpBVTSaQ60Zv4Y
 4aXi4GucVr3nApTnTfAM5nRjtnRrHPXg49qzM0CqOAzdlyuUpzpvQsyek1ml8Etl
 Vdgn7iLyUbV7Cb/aVVEAwvkT+EAPdrzqSK8Q3nonl8R4pZy35CrxlPkdFPVSIKmH
 KGLZP+xg3wJSHdjVuLAtMAYcREau/Yo+i3W8Pz4niU3MUnskPqdPQyp8XzY+hwfp
 4OgJatcUPdB9782b242WmrVJ4b4Ts4ZOuM6hrIrSqdvOkuzQQ9vyDmfHHlEEfH4F
 6tSEA96MZ1bG7uIyMwgx+11lbBC48UYhm/dKcXmyX/yV60N8oPw=
 =/u20
 -----END PGP SIGNATURE-----

Merge 4.19.142 into android-4.19-stable

Changes in 4.19.142
	drm/vgem: Replace opencoded version of drm_gem_dumb_map_offset()
	perf probe: Fix memory leakage when the probe point is not found
	khugepaged: khugepaged_test_exit() check mmget_still_valid()
	khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
	btrfs: export helpers for subvolume name/id resolution
	btrfs: don't show full path of bind mounts in subvol=
	btrfs: Move free_pages_out label in inline extent handling branch in compress_file_range
	btrfs: inode: fix NULL pointer dereference if inode doesn't need compression
	btrfs: sysfs: use NOFS for device creation
	romfs: fix uninitialized memory leak in romfs_dev_read()
	kernel/relay.c: fix memleak on destroy relay channel
	mm: include CMA pages in lowmem_reserve at boot
	mm, page_alloc: fix core hung in free_pcppages_bulk()
	ext4: fix checking of directory entry validity for inline directories
	jbd2: add the missing unlock_buffer() in the error path of jbd2_write_superblock()
	scsi: zfcp: Fix use-after-free in request timeout handlers
	drm/amd/display: fix pow() crashing when given base 0
	kthread: Do not preempt current task if it is going to call schedule()
	spi: Prevent adding devices below an unregistering controller
	scsi: ufs: Add DELAY_BEFORE_LPM quirk for Micron devices
	scsi: target: tcmu: Fix crash in tcmu_flush_dcache_range on ARM
	media: budget-core: Improve exception handling in budget_register()
	rtc: goldfish: Enable interrupt in set_alarm() when necessary
	media: vpss: clean up resources in init
	Input: psmouse - add a newline when printing 'proto' by sysfs
	m68knommu: fix overwriting of bits in ColdFire V3 cache control
	svcrdma: Fix another Receive buffer leak
	xfs: fix inode quota reservation checks
	jffs2: fix UAF problem
	ceph: fix use-after-free for fsc->mdsc
	cpufreq: intel_pstate: Fix cpuinfo_max_freq when MSR_TURBO_RATIO_LIMIT is 0
	scsi: libfc: Free skb in fc_disc_gpn_id_resp() for valid cases
	virtio_ring: Avoid loop when vq is broken in virtqueue_poll
	tools/testing/selftests/cgroup/cgroup_util.c: cg_read_strcmp: fix null pointer dereference
	xfs: Fix UBSAN null-ptr-deref in xfs_sysfs_init
	alpha: fix annotation of io{read,write}{16,32}be()
	fs/signalfd.c: fix inconsistent return codes for signalfd4
	ext4: fix potential negative array index in do_split()
	ext4: don't allow overlapping system zones
	ASoC: q6routing: add dummy register read/write function
	i40e: Set RX_ONLY mode for unicast promiscuous on VLAN
	i40e: Fix crash during removing i40e driver
	net: fec: correct the error path for regulator disable in probe
	bonding: show saner speed for broadcast mode
	bonding: fix a potential double-unregister
	s390/runtime_instrumentation: fix storage key handling
	s390/ptrace: fix storage key handling
	ASoC: msm8916-wcd-analog: fix register Interrupt offset
	ASoC: intel: Fix memleak in sst_media_open
	vfio/type1: Add proper error unwind for vfio_iommu_replay()
	kvm: x86: Toggling CR4.SMAP does not load PDPTEs in PAE mode
	kvm: x86: Toggling CR4.PKE does not load PDPTEs in PAE mode
	kconfig: qconf: do not limit the pop-up menu to the first row
	kconfig: qconf: fix signal connection to invalid slots
	efi: avoid error message when booting under Xen
	Fix build error when CONFIG_ACPI is not set/enabled:
	RDMA/bnxt_re: Do not add user qps to flushlist
	afs: Fix NULL deref in afs_dynroot_depopulate()
	bonding: fix active-backup failover for current ARP slave
	net: ena: Prevent reset after device destruction
	net: gemini: Fix missing free_netdev() in error path of gemini_ethernet_port_probe()
	hv_netvsc: Fix the queue_mapping in netvsc_vf_xmit()
	net: dsa: b53: check for timeout
	powerpc/pseries: Do not initiate shutdown when system is running on UPS
	efi: add missed destroy_workqueue when efisubsys_init fails
	epoll: Keep a reference on files added to the check list
	do_epoll_ctl(): clean the failure exits up a bit
	mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
	xen: don't reschedule in preemption off sections
	clk: Evict unregistered clks from parent caches
	KVM: Pass MMU notifier range flags to kvm_unmap_hva_range()
	KVM: arm64: Only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is not set
	Linux 4.19.142

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ibfe4a0a4249f76ab35076f4b003e32cd6f9788a5
2020-08-26 11:07:03 +02:00
Peter Xu
734654ae79 mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
commit 75802ca66354a39ab8e35822747cd08b3384a99a upstream.

This is found by code observation only.

Firstly, the worst case scenario should assume the whole range was covered
by pmd sharing.  The old algorithm might not work as expected for ranges
like (1g-2m, 1g+2m), where the adjusted range should be (0, 1g+2m) but the
expected range should be (0, 2g).

Since at it, remove the loop since it should not be required.  With that,
the new code should be faster too when the invalidating range is huge.

Mike said:

: With range (1g-2m, 1g+2m) within a vma (0, 2g) the existing code will only
: adjust to (0, 1g+2m) which is incorrect.
:
: We should cc stable.  The original reason for adjusting the range was to
: prevent data corruption (getting wrong page).  Since the range is not
: always adjusted correctly, the potential for corruption still exists.
:
: However, I am fairly confident that adjust_range_if_pmd_sharing_possible
: is only gong to be called in two cases:
:
: 1) for a single page
: 2) for range == entire vma
:
: In those cases, the current code should produce the correct results.
:
: To be safe, let's just cc stable.

Fixes: 017b1660df ("mm: migration: fix migration of huge PMD shared pages")
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200730201636.74778-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-26 10:31:06 +02:00
Charan Teja Reddy
c666936d8d mm, page_alloc: fix core hung in free_pcppages_bulk()
commit 88e8ac11d2ea3acc003cf01bb5a38c8aa76c3cfd upstream.

The following race is observed with the repeated online, offline and a
delay between two successive online of memory blocks of movable zone.

P1						P2

Online the first memory block in
the movable zone. The pcp struct
values are initialized to default
values,i.e., pcp->high = 0 &
pcp->batch = 1.

					Allocate the pages from the
					movable zone.

Try to Online the second memory
block in the movable zone thus it
entered the online_pages() but yet
to call zone_pcp_update().
					This process is entered into
					the exit path thus it tries
					to release the order-0 pages
					to pcp lists through
					free_unref_page_commit().
					As pcp->high = 0, pcp->count = 1
					proceed to call the function
					free_pcppages_bulk().
Update the pcp values thus the
new pcp values are like, say,
pcp->high = 378, pcp->batch = 63.
					Read the pcp's batch value using
					READ_ONCE() and pass the same to
					free_pcppages_bulk(), pcp values
					passed here are, batch = 63,
					count = 1.

					Since num of pages in the pcp
					lists are less than ->batch,
					then it will stuck in
					while(list_empty(list)) loop
					with interrupts disabled thus
					a core hung.

Avoid this by ensuring free_pcppages_bulk() is called with proper count of
pcp list pages.

The mentioned race is some what easily reproducible without [1] because
pcp's are not updated for the first memory block online and thus there is
a enough race window for P2 between alloc+free and pcp struct values
update through onlining of second memory block.

With [1], the race still exists but it is very narrow as we update the pcp
struct values for the first memory block online itself.

This is not limited to the movable zone, it could also happen in cases
with the normal zone (e.g., hotplug to a node that only has DMA memory, or
no other memory yet).

[1]: https://patchwork.kernel.org/patch/11696389/

Fixes: 5f8dcc2121 ("page-allocator: split per-cpu list into one-list-per-migrate-type")
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: <stable@vger.kernel.org> [2.6+]
Link: http://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-26 10:30:59 +02:00
Doug Berger
84b8dc232a mm: include CMA pages in lowmem_reserve at boot
commit e08d3fdfe2dafa0331843f70ce1ff6c1c4900bf4 upstream.

The lowmem_reserve arrays provide a means of applying pressure against
allocations from lower zones that were targeted at higher zones.  Its
values are a function of the number of pages managed by higher zones and
are assigned by a call to the setup_per_zone_lowmem_reserve() function.

The function is initially called at boot time by the function
init_per_zone_wmark_min() and may be called later by accesses of the
/proc/sys/vm/lowmem_reserve_ratio sysctl file.

The function init_per_zone_wmark_min() was moved up from a module_init to
a core_initcall to resolve a sequencing issue with khugepaged.
Unfortunately this created a sequencing issue with CMA page accounting.

The CMA pages are added to the managed page count of a zone when
cma_init_reserved_areas() is called at boot also as a core_initcall.  This
makes it uncertain whether the CMA pages will be added to the managed page
counts of their zones before or after the call to
init_per_zone_wmark_min() as it becomes dependent on link order.  With the
current link order the pages are added to the managed count after the
lowmem_reserve arrays are initialized at boot.

This means the lowmem_reserve values at boot may be lower than the values
used later if /proc/sys/vm/lowmem_reserve_ratio is accessed even if the
ratio values are unchanged.

In many cases the difference is not significant, but for example
an ARM platform with 1GB of memory and the following memory layout

  cma: Reserved 256 MiB at 0x0000000030000000
  Zone ranges:
    DMA      [mem 0x0000000000000000-0x000000002fffffff]
    Normal   empty
    HighMem  [mem 0x0000000030000000-0x000000003fffffff]

would result in 0 lowmem_reserve for the DMA zone.  This would allow
userspace to deplete the DMA zone easily.

Funnily enough

  $ cat /proc/sys/vm/lowmem_reserve_ratio

would fix up the situation because as a side effect it forces
setup_per_zone_lowmem_reserve.

This commit breaks the link order dependency by invoking
init_per_zone_wmark_min() as a postcore_initcall so that the CMA pages
have the chance to be properly accounted in their zone(s) and allowing
the lowmem_reserve arrays to receive consistent values.

Fixes: bc22af74f2 ("mm: update min_free_kbytes from khugepaged after core initialization")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1597423766-27849-1-git-send-email-opendmb@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-26 10:30:59 +02:00
Hugh Dickins
2ef7ebb143 khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
[ Upstream commit f3f99d63a8156c7a4a6b20aac22b53c5579c7dc1 ]

syzbot crashes on the VM_BUG_ON_MM(khugepaged_test_exit(mm), mm) in
__khugepaged_enter(): yes, when one thread is about to dump core, has set
core_state, and is waiting for others, another might do something calling
__khugepaged_enter(), which now crashes because I lumped the core_state
test (known as "mmget_still_valid") into khugepaged_test_exit().  I still
think it's best to lump them together, so just in this exceptional case,
check mm->mm_users directly instead of khugepaged_test_exit().

Fixes: bbe98f9cadff ("khugepaged: khugepaged_test_exit() check mmget_still_valid()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Yang Shi <shy828301@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: <stable@vger.kernel.org>	[4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008141503370.18085@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-26 10:30:58 +02:00
Hugh Dickins
17c08ee00b khugepaged: khugepaged_test_exit() check mmget_still_valid()
[ Upstream commit bbe98f9cadff58cdd6a4acaeba0efa8565dabe65 ]

Move collapse_huge_page()'s mmget_still_valid() check into
khugepaged_test_exit() itself.  collapse_huge_page() is used for anon THP
only, and earned its mmget_still_valid() check because it inserts a huge
pmd entry in place of the page table's pmd entry; whereas
collapse_file()'s retract_page_tables() or collapse_pte_mapped_thp()
merely clears the page table's pmd entry.  But core dumping without mmap
lock must have been as open to mistaking a racily cleared pmd entry for a
page table at physical page 0, as exit_mmap() was.  And we certainly have
no interest in mapping as a THP once dumping core.

Fixes: 59ea6d06cfa9 ("coredump: fix race condition between collapse_huge_page() and core dumping")
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>	[4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021217020.27773@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-26 10:30:58 +02:00
Greg Kroah-Hartman
369c9d2963 This is the 4.19.141 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl8/jnoACgkQONu9yGCS
 aT54fA//azItNUOsY1HujeNfINHWCqCLV7OHpdxa9MEeixSpP/ufsGcgyZBTslNw
 WOENkdUPGYxUQt9yyZjSY5CEneH6a007idCfUWIuHRZ9nxKbDZm312xDDcDkeZI7
 P4TGvIdpDq7Czk2c+OCSUnmp/+fCJdPCpCYJZp0kBDVbUsKeUwpBJ42Dca8f/2iM
 lWVlGR2KwMIV+NSVArpu8EUOpw7X4rPsGz72kEvVhCkcXa9GFxGbs65AVNG5NTzt
 9sHBlja7PZTqt844/6UBM5EgTR43uJT5z8sSV5N5s6j2d07m/T+2f73PyKqr6+jQ
 SXKpIp/J6Po7tCej5u4B9LO+ePpasuxbNAXmn1GLuiP7qzKRAriFxK2RfXXxqIuE
 aP9DB6P/wbr/MszFjIFFg9nrr9G/biriRNPWtnzD2hbUk1mfM8WNCCSIt90MZh0f
 CT85JiEBFlU5cZhgUJfqJcfZcckE8gbdUGOBvZ5NOq0hxqN2S6+/phespwkd4h/a
 A4QyhER6eI9zT/StBoSLejs8c/lHKHjqyMARNjXLPF+bkkR90L9WDgocB1KiV0jn
 YOY+j4tjXGnn/QAsuW/uhYVvzETtkQ5oSyeV5uTcgYvU3iw+QFo9H//y83yB5Q0o
 pdDRNmMTtdYwrwkzt73xsjKGlVXaA8kB5kRCuBwGb5kzr0G8baE=
 =Txdz
 -----END PGP SIGNATURE-----

Merge 4.19.141 into android-4.19-stable

Changes in 4.19.141
	smb3: warn on confusing error scenario with sec=krb5
	genirq/affinity: Make affinity setting if activated opt-in
	PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context()
	PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken
	PCI: Add device even if driver attach failed
	PCI: qcom: Define some PARF params needed for ipq8064 SoC
	PCI: qcom: Add support for tx term offset for rev 2.1.0
	PCI: Probe bridge window attributes once at enumeration-time
	btrfs: free anon block device right after subvolume deletion
	btrfs: don't allocate anonymous block device for user invisible roots
	btrfs: ref-verify: fix memory leak in add_block_entry
	btrfs: don't traverse into the seed devices in show_devname
	btrfs: open device without device_list_mutex
	btrfs: fix messages after changing compression level by remount
	btrfs: only search for left_info if there is no right_info in try_merge_free_space
	btrfs: fix memory leaks after failure to lookup checksums during inode logging
	btrfs: fix return value mixup in btrfs_get_extent
	dt-bindings: iio: io-channel-mux: Fix compatible string in example code
	iio: dac: ad5592r: fix unbalanced mutex unlocks in ad5592r_read_raw()
	xtensa: fix xtensa_pmu_setup prototype
	cifs: Fix leak when handling lease break for cached root fid
	powerpc: Allow 4224 bytes of stack expansion for the signal frame
	powerpc: Fix circular dependency between percpu.h and mmu.h
	media: vsp1: dl: Fix NULL pointer dereference on unbind
	net: ethernet: stmmac: Disable hardware multicast filter
	net: stmmac: dwmac1000: provide multicast filter fallback
	net/compat: Add missing sock updates for SCM_RIGHTS
	md/raid5: Fix Force reconstruct-write io stuck in degraded raid5
	bcache: allocate meta data pages as compound pages
	bcache: fix overflow in offset_to_stripe()
	mac80211: fix misplaced while instead of if
	driver core: Avoid binding drivers to dead devices
	MIPS: CPU#0 is not hotpluggable
	ext2: fix missing percpu_counter_inc
	ocfs2: change slot number type s16 to u16
	mm/page_counter.c: fix protection usage propagation
	ftrace: Setup correct FTRACE_FL_REGS flags for module
	kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler
	tracing/hwlat: Honor the tracing_cpumask
	tracing: Use trace_sched_process_free() instead of exit() for pid tracing
	watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.options
	watchdog: f71808e_wdt: remove use of wrong watchdog_info option
	watchdog: f71808e_wdt: clear watchdog timeout occurred flag
	pseries: Fix 64 bit logical memory block panic
	module: Correctly truncate sysfs sections output
	perf intel-pt: Fix FUP packet state
	remoteproc: qcom: q6v5: Update running state before requesting stop
	drm/imx: imx-ldb: Disable both channels for split mode in enc->disable()
	mfd: arizona: Ensure 32k clock is put on driver unbind and error
	RDMA/ipoib: Return void from ipoib_ib_dev_stop()
	RDMA/ipoib: Fix ABBA deadlock with ipoib_reap_ah()
	media: rockchip: rga: Introduce color fmt macros and refactor CSC mode logic
	media: rockchip: rga: Only set output CSC mode for RGB input
	USB: serial: ftdi_sio: make process-packet buffer unsigned
	USB: serial: ftdi_sio: clean up receive processing
	mmc: renesas_sdhi_internal_dmac: clean up the code for dma complete
	gpu: ipu-v3: image-convert: Combine rotate/no-rotate irq handlers
	dm rq: don't call blk_mq_queue_stopped() in dm_stop_queue()
	selftests/powerpc: ptrace-pkey: Rename variables to make it easier to follow code
	selftests/powerpc: ptrace-pkey: Update the test to mark an invalid pkey correctly
	selftests/powerpc: ptrace-pkey: Don't update expected UAMOR value
	iommu/omap: Check for failure of a call to omap_iommu_dump_ctx
	iommu/vt-d: Enforce PASID devTLB field mask
	i2c: rcar: slave: only send STOP event when we have been addressed
	clk: clk-atlas6: fix return value check in atlas6_clk_init()
	pwm: bcm-iproc: handle clk_get_rate() return
	tools build feature: Use CC and CXX from parent
	i2c: rcar: avoid race when unregistering slave
	openrisc: Fix oops caused when dumping stack
	scsi: lpfc: nvmet: Avoid hang / use-after-free again when destroying targetport
	watchdog: initialize device before misc_register
	Input: sentelic - fix error return when fsp_reg_write fails
	drm/vmwgfx: Use correct vmw_legacy_display_unit pointer
	drm/vmwgfx: Fix two list_for_each loop exit tests
	net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init
	nfs: Fix getxattr kernel panic and memory overflow
	fs/minix: set s_maxbytes correctly
	fs/minix: fix block limit check for V1 filesystems
	fs/minix: remove expected error message in block_to_path()
	fs/ufs: avoid potential u32 multiplication overflow
	test_kmod: avoid potential double free in trigger_config_run_type()
	mfd: dln2: Run event handler loop under spinlock
	ALSA: echoaudio: Fix potential Oops in snd_echo_resume()
	perf bench mem: Always memset source before memcpy
	tools build feature: Quote CC and CXX for their arguments
	sh: landisk: Add missing initialization of sh_io_port_base
	khugepaged: retract_page_tables() remember to test exit
	arm64: dts: marvell: espressobin: add ethernet alias
	drm: Added orientation quirk for ASUS tablet model T103HAF
	drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume
	Linux 4.19.141

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0800f8e03919fd8f054c1bcda87efd70a6e5db6b
2020-08-21 13:01:46 +02:00
Hugh Dickins
2406c45db3 khugepaged: retract_page_tables() remember to test exit
commit 18e77600f7a1ed69f8ce46c9e11cad0985712dfa upstream.

Only once have I seen this scenario (and forgot even to notice what forced
the eventual crash): a sequence of "BUG: Bad page map" alerts from
vm_normal_page(), from zap_pte_range() servicing exit_mmap();
pmd:00000000, pte values corresponding to data in physical page 0.

The pte mappings being zapped in this case were supposed to be from a huge
page of ext4 text (but could as well have been shmem): my belief is that
it was racing with collapse_file()'s retract_page_tables(), found *pmd
pointing to a page table, locked it, but *pmd had become 0 by the time
start_pte was decided.

In most cases, that possibility is excluded by holding mmap lock; but
exit_mmap() proceeds without mmap lock.  Most of what's run by khugepaged
checks khugepaged_test_exit() after acquiring mmap lock:
khugepaged_collapse_pte_mapped_thps() and hugepage_vma_revalidate() do so,
for example.  But retract_page_tables() did not: fix that.

The fix is for retract_page_tables() to check khugepaged_test_exit(),
after acquiring mmap lock, before doing anything to the page table.
Getting the mmap lock serializes with __mmput(), which briefly takes and
drops it in __khugepaged_exit(); then the khugepaged_test_exit() check on
mm_users makes sure we don't touch the page table once exit_mmap() might
reach it, since exit_mmap() will be proceeding without mmap lock, not
expecting anyone to be racing with it.

Fixes: f3f0e1d215 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: <stable@vger.kernel.org>	[4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021215400.27773@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-21 11:05:39 +02:00
Michal Koutný
e88a72e86b mm/page_counter.c: fix protection usage propagation
commit a6f23d14ec7d7d02220ad8bb2774be3322b9aeec upstream.

When workload runs in cgroups that aren't directly below root cgroup and
their parent specifies reclaim protection, it may end up ineffective.

The reason is that propagate_protected_usage() is not called in all
hierarchy up.  All the protected usage is incorrectly accumulated in the
workload's parent.  This means that siblings_low_usage is overestimated
and effective protection underestimated.  Even though it is transitional
phenomenon (uncharge path does correct propagation and fixes the wrong
children_low_usage), it can undermine the intended protection
unexpectedly.

We have noticed this problem while seeing a swap out in a descendant of a
protected memcg (intermediate node) while the parent was conveniently
under its protection limit and the memory pressure was external to that
hierarchy.  Michal has pinpointed this down to the wrong
siblings_low_usage which led to the unwanted reclaim.

The fix is simply updating children_low_usage in respective ancestors also
in the charging path.

Fixes: 230671533d ("mm: memory.low hierarchical behavior")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>	[4.18+]
Link: http://lkml.kernel.org/r/20200803153231.15477-1-mhocko@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-21 11:05:33 +02:00
Greg Kroah-Hartman
7086849b9c This is the 4.19.140 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl88w3UACgkQONu9yGCS
 aT4fZQ/+NYyfbsgFARqld2HbOIDSYyua90k42Xj7nlHXa3UcbPCsNEWIWn2k5SXU
 kvzwXSUuI14AOyqBOp/swKkEZh7Dh5c6q8QBHA6YJnNaJBQQxv2tjMpA4TngMifG
 oUSnxgHKNHtiD/6D7ZZ36l8u83sXfE6qPiginJAECdC2bVjpdfT7EqK5bY2lFd1s
 yiz9RxEDXSlrVMXqew75XBEj+304RYhZcJEVPQrqFb/q2Q+rSYs1mAkCazFvAkX+
 6Yooyp0tYlfUlkF2ItDpWmuKDcbGtWDd/I9LVGwZ0J67uAN86ZhNGbqlI8bpig9o
 qNW0FXAN2TNjpBvKXwg1qavfs5xYQu2E0OrRpCUleL1yD/kWu2vfK4HqyIardsVq
 63ffUvMJnJaWPnIvB2gx5f5tRt3Ca7uqvoM4LlYR1fwNZwVaU1fyWNEfeID4MAkr
 jBhC8x3n40TF1ngdaZ/XETiAJbjiYve2uEVuvdCtnp1fFbQ892QD5A8MQYsOVFuh
 6aR4f6bsR23F/h+tOMJc89wZTRYsCrFxbjwjye+tsWPcBm2GR7hgvCxo1JFqHgrz
 elY15u+AWj4pjVhiQcsnXLL8pGKkZTvPrq+iwg12AE23gvE4ww1lpYbxO46GUWuw
 q6L8oaHYA6cZiEnIde6yTUpkm6zag0MK+HDutiFUrEAmJTmeWds=
 =9oFP
 -----END PGP SIGNATURE-----

Merge 4.19.140 into android-4.19-stable

Changes in 4.19.140
	tracepoint: Mark __tracepoint_string's __used
	HID: input: Fix devices that return multiple bytes in battery report
	cgroup: add missing skcd->no_refcnt check in cgroup_sk_clone()
	x86/mce/inject: Fix a wrong assignment of i_mce.status
	sched/fair: Fix NOHZ next idle balance
	sched: correct SD_flags returned by tl->sd_flags()
	arm64: dts: rockchip: fix rk3368-lion gmac reset gpio
	arm64: dts: rockchip: fix rk3399-puma vcc5v0-host gpio
	arm64: dts: rockchip: fix rk3399-puma gmac reset gpio
	EDAC: Fix reference count leaks
	arm64: dts: qcom: msm8916: Replace invalid bias-pull-none property
	crypto: ccree - fix resource leak on error path
	firmware: arm_scmi: Fix SCMI genpd domain probing
	arm64: dts: exynos: Fix silent hang after boot on Espresso
	clk: scmi: Fix min and max rate when registering clocks with discrete rates
	m68k: mac: Don't send IOP message until channel is idle
	m68k: mac: Fix IOP status/control register writes
	platform/x86: intel-hid: Fix return value check in check_acpi_dev()
	platform/x86: intel-vbtn: Fix return value check in check_acpi_dev()
	ARM: dts: gose: Fix ports node name for adv7180
	ARM: dts: gose: Fix ports node name for adv7612
	ARM: at91: pm: add missing put_device() call in at91_pm_sram_init()
	spi: lantiq: fix: Rx overflow error in full duplex mode
	ARM: socfpga: PM: add missing put_device() call in socfpga_setup_ocram_self_refresh()
	drm/tilcdc: fix leak & null ref in panel_connector_get_modes
	soc: qcom: rpmh-rsc: Set suppress_bind_attrs flag
	Bluetooth: add a mutex lock to avoid UAF in do_enale_set
	loop: be paranoid on exit and prevent new additions / removals
	fs/btrfs: Add cond_resched() for try_release_extent_mapping() stalls
	drm/amdgpu: avoid dereferencing a NULL pointer
	drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync
	crypto: aesni - Fix build with LLVM_IAS=1
	video: fbdev: neofb: fix memory leak in neo_scan_monitor()
	md-cluster: fix wild pointer of unlock_all_bitmaps()
	arm64: dts: hisilicon: hikey: fixes to comply with adi, adv7533 DT binding
	drm/etnaviv: fix ref count leak via pm_runtime_get_sync
	drm/nouveau: fix multiple instances of reference count leaks
	usb: mtu3: clear dual mode of u3port when disable device
	drm/debugfs: fix plain echo to connector "force" attribute
	drm/radeon: disable AGP by default
	irqchip/irq-mtk-sysirq: Replace spinlock with raw_spinlock
	mm/mmap.c: Add cond_resched() for exit_mmap() CPU stalls
	brcmfmac: keep SDIO watchdog running when console_interval is non-zero
	brcmfmac: To fix Bss Info flag definition Bug
	brcmfmac: set state of hanger slot to FREE when flushing PSQ
	iwlegacy: Check the return value of pcie_capability_read_*()
	gpu: host1x: debug: Fix multiple channels emitting messages simultaneously
	usb: gadget: net2280: fix memory leak on probe error handling paths
	bdc: Fix bug causing crash after multiple disconnects
	usb: bdc: Halt controller on suspend
	dyndbg: fix a BUG_ON in ddebug_describe_flags
	bcache: fix super block seq numbers comparision in register_cache_set()
	ACPICA: Do not increment operation_region reference counts for field units
	drm/msm: ratelimit crtc event overflow error
	agp/intel: Fix a memory leak on module initialisation failure
	video: fbdev: sm712fb: fix an issue about iounmap for a wrong address
	console: newport_con: fix an issue about leak related system resources
	video: pxafb: Fix the function used to balance a 'dma_alloc_coherent()' call
	ath10k: Acquire tx_lock in tx error paths
	iio: improve IIO_CONCENTRATION channel type description
	drm/etnaviv: Fix error path on failure to enable bus clk
	drm/arm: fix unintentional integer overflow on left shift
	leds: lm355x: avoid enum conversion warning
	media: omap3isp: Add missed v4l2_ctrl_handler_free() for preview_init_entities()
	ASoC: Intel: bxt_rt298: add missing .owner field
	scsi: cumana_2: Fix different dev_id between request_irq() and free_irq()
	drm/mipi: use dcs write for mipi_dsi_dcs_set_tear_scanline
	cxl: Fix kobject memleak
	drm/radeon: fix array out-of-bounds read and write issues
	scsi: powertec: Fix different dev_id between request_irq() and free_irq()
	scsi: eesox: Fix different dev_id between request_irq() and free_irq()
	ipvs: allow connection reuse for unconfirmed conntrack
	media: firewire: Using uninitialized values in node_probe()
	media: exynos4-is: Add missed check for pinctrl_lookup_state()
	xfs: don't eat an EIO/ENOSPC writeback error when scrubbing data fork
	xfs: fix reflink quota reservation accounting error
	RDMA/rxe: Skip dgid check in loopback mode
	PCI: Fix pci_cfg_wait queue locking problem
	leds: core: Flush scheduled work for system suspend
	drm: panel: simple: Fix bpc for LG LB070WV8 panel
	phy: exynos5-usbdrd: Calibrating makes sense only for USB2.0 PHY
	drm/bridge: sil_sii8620: initialize return of sii8620_readb
	scsi: scsi_debug: Add check for sdebug_max_queue during module init
	mwifiex: Prevent memory corruption handling keys
	powerpc/vdso: Fix vdso cpu truncation
	RDMA/qedr: SRQ's bug fixes
	RDMA/rxe: Prevent access to wr->next ptr afrer wr is posted to send queue
	staging: rtl8192u: fix a dubious looking mask before a shift
	PCI/ASPM: Add missing newline in sysfs 'policy'
	powerpc/book3s64/pkeys: Use PVR check instead of cpu feature
	drm/imx: tve: fix regulator_disable error path
	USB: serial: iuu_phoenix: fix led-activity helpers
	usb: core: fix quirks_param_set() writing to a const pointer
	thermal: ti-soc-thermal: Fix reversed condition in ti_thermal_expose_sensor()
	coresight: tmc: Fix TMC mode read in tmc_read_unprepare_etb()
	MIPS: OCTEON: add missing put_device() call in dwc3_octeon_device_init()
	usb: dwc2: Fix error path in gadget registration
	scsi: mesh: Fix panic after host or bus reset
	net: dsa: mv88e6xxx: MV88E6097 does not support jumbo configuration
	PCI: cadence: Fix updating Vendor ID and Subsystem Vendor ID register
	RDMA/core: Fix return error value in _ib_modify_qp() to negative
	Smack: fix another vsscanf out of bounds
	Smack: prevent underflow in smk_set_cipso()
	power: supply: check if calc_soc succeeded in pm860x_init_battery
	Bluetooth: hci_h5: Set HCI_UART_RESET_ON_INIT to correct flags
	Bluetooth: hci_serdev: Only unregister device if it was registered
	net: dsa: rtl8366: Fix VLAN semantics
	net: dsa: rtl8366: Fix VLAN set-up
	powerpc/boot: Fix CONFIG_PPC_MPC52XX references
	selftests/powerpc: Fix CPU affinity for child process
	PCI: Release IVRS table in AMD ACS quirk
	selftests/powerpc: Fix online CPU selection
	ASoC: meson: axg-tdm-interface: fix link fmt setup
	s390/qeth: don't process empty bridge port events
	wl1251: fix always return 0 error
	tools, build: Propagate build failures from tools/build/Makefile.build
	net: ethernet: aquantia: Fix wrong return value
	liquidio: Fix wrong return value in cn23xx_get_pf_num()
	net: spider_net: Fix the size used in a 'dma_free_coherent()' call
	fsl/fman: use 32-bit unsigned integer
	fsl/fman: fix dereference null return value
	fsl/fman: fix unreachable code
	fsl/fman: check dereferencing null pointer
	fsl/fman: fix eth hash table allocation
	dlm: Fix kobject memleak
	ocfs2: fix unbalanced locking
	pinctrl-single: fix pcs_parse_pinconf() return value
	svcrdma: Fix page leak in svc_rdma_recv_read_chunk()
	x86/fsgsbase/64: Fix NULL deref in 86_fsgsbase_read_task
	crypto: aesni - add compatibility with IAS
	af_packet: TPACKET_V3: fix fill status rwlock imbalance
	drivers/net/wan/lapbether: Added needed_headroom and a skb->len check
	net/nfc/rawsock.c: add CAP_NET_RAW check.
	net: Set fput_needed iff FDPUT_FPUT is set
	net/tls: Fix kmap usage
	net: refactor bind_bucket fastreuse into helper
	net: initialize fastreuse on inet_inherit_port
	USB: serial: cp210x: re-enable auto-RTS on open
	USB: serial: cp210x: enable usb generic throttle/unthrottle
	ALSA: hda - fix the micmute led status for Lenovo ThinkCentre AIO
	ALSA: usb-audio: Creative USB X-Fi Pro SB1095 volume knob support
	ALSA: usb-audio: fix overeager device match for MacroSilicon MS2109
	ALSA: usb-audio: work around streaming quirk for MacroSilicon MS2109
	pstore: Fix linking when crypto API disabled
	crypto: hisilicon - don't sleep of CRYPTO_TFM_REQ_MAY_SLEEP was not specified
	crypto: qat - fix double free in qat_uclo_create_batch_init_list
	crypto: ccp - Fix use of merged scatterlists
	crypto: cpt - don't sleep of CRYPTO_TFM_REQ_MAY_SLEEP was not specified
	bitfield.h: don't compile-time validate _val in FIELD_FIT
	fs/minix: check return value of sb_getblk()
	fs/minix: don't allow getting deleted inodes
	fs/minix: reject too-large maximum file size
	ALSA: usb-audio: add quirk for Pioneer DDJ-RB
	9p: Fix memory leak in v9fs_mount
	drm/ttm/nouveau: don't call tt destroy callback on alloc failure.
	NFS: Don't move layouts to plh_return_segs list while in use
	NFS: Don't return layout segments that are in use
	cpufreq: dt: fix oops on armada37xx
	include/asm-generic/vmlinux.lds.h: align ro_after_init
	spi: spidev: Align buffers for DMA
	mtd: rawnand: qcom: avoid write to unavailable register
	parisc: Implement __smp_store_release and __smp_load_acquire barriers
	parisc: mask out enable and reserved bits from sba imask
	ARM: 8992/1: Fix unwind_frame for clang-built kernels
	irqdomain/treewide: Free firmware node after domain removal
	xen/balloon: fix accounting in alloc_xenballooned_pages error path
	xen/balloon: make the balloon wait interruptible
	xen/gntdev: Fix dmabuf import with non-zero sgt offset
	Linux 4.19.140

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I6b0d8dcf9ded022f62d9c62605388f1c1e9112d1
2020-08-19 08:43:22 +02:00
Paul E. McKenney
abfa9c47ec mm/mmap.c: Add cond_resched() for exit_mmap() CPU stalls
[ Upstream commit 0a3b3c253a1eb2c7fe7f34086d46660c909abeb3 ]

A large process running on a heavily loaded system can encounter the
following RCU CPU stall warning:

  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu: 	3-....: (20998 ticks this GP) idle=4ea/1/0x4000000000000002 softirq=556558/556558 fqs=5190
  	(t=21013 jiffies g=1005461 q=132576)
  NMI backtrace for cpu 3
  CPU: 3 PID: 501900 Comm: aio-free-ring-w Kdump: loaded Not tainted 5.2.9-108_fbk12_rc3_3858_gb83b75af7909 #1
  Hardware name: Wiwynn   HoneyBadger/PantherPlus, BIOS HBM6.71 02/03/2016
  Call Trace:
   <IRQ>
   dump_stack+0x46/0x60
   nmi_cpu_backtrace.cold.3+0x13/0x50
   ? lapic_can_unplug_cpu.cold.27+0x34/0x34
   nmi_trigger_cpumask_backtrace+0xba/0xca
   rcu_dump_cpu_stacks+0x99/0xc7
   rcu_sched_clock_irq.cold.87+0x1aa/0x397
   ? tick_sched_do_timer+0x60/0x60
   update_process_times+0x28/0x60
   tick_sched_timer+0x37/0x70
   __hrtimer_run_queues+0xfe/0x270
   hrtimer_interrupt+0xf4/0x210
   smp_apic_timer_interrupt+0x5e/0x120
   apic_timer_interrupt+0xf/0x20
   </IRQ>
  RIP: 0010:kmem_cache_free+0x223/0x300
  Code: 88 00 00 00 0f 85 ca 00 00 00 41 8b 55 18 31 f6 f7 da 41 f6 45 0a 02 40 0f 94 c6 83 c6 05 9c 41 5e fa e8 a0 a7 01 00 41 56 9d <49> 8b 47 08 a8 03 0f 85 87 00 00 00 65 48 ff 08 e9 3d fe ff ff 65
  RSP: 0018:ffffc9000e8e3da8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
  RAX: 0000000000020000 RBX: ffff88861b9de960 RCX: 0000000000000030
  RDX: fffffffffffe41e8 RSI: 000060777fe3a100 RDI: 000000000001be18
  RBP: ffffea00186e7780 R08: ffffffffffffffff R09: ffffffffffffffff
  R10: ffff88861b9dea28 R11: ffff88887ffde000 R12: ffffffff81230a1f
  R13: ffff888854684dc0 R14: 0000000000000206 R15: ffff8888547dbc00
   ? remove_vma+0x4f/0x60
   remove_vma+0x4f/0x60
   exit_mmap+0xd6/0x160
   mmput+0x4a/0x110
   do_exit+0x278/0xae0
   ? syscall_trace_enter+0x1d3/0x2b0
   ? handle_mm_fault+0xaa/0x1c0
   do_group_exit+0x3a/0xa0
   __x64_sys_exit_group+0x14/0x20
   do_syscall_64+0x42/0x100
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

And on a PREEMPT=n kernel, the "while (vma)" loop in exit_mmap() can run
for a very long time given a large process.  This commit therefore adds
a cond_resched() to this loop, providing RCU any needed quiescent states.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19 08:14:52 +02:00
Greg Kroah-Hartman
bcf9517454 This is the 4.19.135 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl8hMIIACgkQONu9yGCS
 aT4+fQ/+LvplGwblkH8vuttKsz17BHXD8/FcL6LIbLKTyaWJMWp/raMWQyQLrZpL
 B58dFVZxenmpljLvupB9a1qifZk00U8M+aoU45Pf6PHAww2sGoU1h5sZexqRqOmN
 FAh3hLMq1qyJ6qDIugZ0sbtDaO7t5GwvT1YKKo6V9hqi0XamTVrppI/EVVDDA0ve
 /WigyxjT51DIPd3bmjJ3Xn920artrW+fydA+jTyMBBME/qFi5s2yN7Rui0ViNz44
 crwGxAN1v3+MboulHFsnCdLAlh9hyI4VNXpvpNhKIoVE9BMHgBmmnWA+0KUIjAeA
 8GfL2TcspjElNnz9T4f957Rj6Ft7qlStYIyJ45rcGRMXkyNs1lw5CDfkJcy8giVD
 7yImkQZ2c8jCgkr/Vor/MfHOPtg1KzpAuNrWZnobTdnaBGxgcC61pnKHxF5Vx40h
 78hOFXqunGNMwNBR4EEjmP4B3zapeHVo4GXBPtwY8M878Uj28z2pL4Vx6MhbvmTf
 1i8xipclcgpV5ZyN+zv8XA55pcw8ahQOuUknEx+3yH0chlf5cIxXhr92g1DrDwoF
 YvNYJQA7qJpgx/k582u6bJYkBdNa+XJaBLjQUhI/Z9UVS33S/CouGHpFyIMpVMx9
 vo3ujFpuUP4ZCeKENjINa7RfQhD7oHQQQrk5RcsFBYJaWgCdi3A=
 =ugxS
 -----END PGP SIGNATURE-----

Merge 4.19.135 into android-4.19-stable

Changes in 4.19.135
	soc: qcom: rpmh: Dirt can only make you dirtier, not cleaner
	gpio: arizona: handle pm_runtime_get_sync failure case
	gpio: arizona: put pm_runtime in case of failure
	pinctrl: amd: fix npins for uart0 in kerncz_groups
	mac80211: allow rx of mesh eapol frames with default rx key
	scsi: scsi_transport_spi: Fix function pointer check
	xtensa: fix __sync_fetch_and_{and,or}_4 declarations
	xtensa: update *pos in cpuinfo_op.next
	drivers/net/wan/lapbether: Fixed the value of hard_header_len
	net: sky2: initialize return of gm_phy_read
	drm/nouveau/i2c/g94-: increase NV_PMGR_DP_AUXCTL_TRANSACTREQ timeout
	drivers/firmware/psci: Fix memory leakage in alloc_init_cpu_groups()
	fuse: fix weird page warning
	irqdomain/treewide: Keep firmware node unconditionally allocated
	SUNRPC reverting d03727b248d0 ("NFSv4 fix CLOSE not waiting for direct IO compeletion")
	spi: spi-fsl-dspi: Exit the ISR with IRQ_NONE when it's not ours
	tipc: clean up skb list lock handling on send path
	IB/umem: fix reference count leak in ib_umem_odp_get()
	uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix GDB regression
	ALSA: info: Drop WARN_ON() from buffer NULL sanity check
	ASoC: rt5670: Correct RT5670_LDO_SEL_MASK
	btrfs: fix double free on ulist after backref resolution failure
	btrfs: fix mount failure caused by race with umount
	btrfs: fix page leaks after failure to lock page for delalloc
	bnxt_en: Fix race when modifying pause settings.
	fpga: dfl: fix bug in port reset handshake
	hippi: Fix a size used in a 'pci_free_consistent()' in an error handling path
	ax88172a: fix ax88172a_unbind() failures
	net: dp83640: fix SIOCSHWTSTAMP to update the struct with actual configuration
	ieee802154: fix one possible memleak in adf7242_probe
	drm: sun4i: hdmi: Fix inverted HPD result
	net: smc91x: Fix possible memory leak in smc_drv_probe()
	bonding: check error value of register_netdevice() immediately
	mlxsw: destroy workqueue when trap_register in mlxsw_emad_init
	qed: suppress "don't support RoCE & iWARP" flooding on HW init
	ipvs: fix the connection sync failed in some cases
	net: ethernet: ave: Fix error returns in ave_init
	i2c: rcar: always clear ICSAR to avoid side effects
	bonding: check return value of register_netdevice() in bond_newlink()
	serial: exar: Fix GPIO configuration for Sealevel cards based on XR17V35X
	scripts/decode_stacktrace: strip basepath from all paths
	scripts/gdb: fix lx-symbols 'gdb.error' while loading modules
	HID: i2c-hid: add Mediacom FlexBook edge13 to descriptor override
	HID: alps: support devices with report id 2
	HID: steam: fixes race in handling device list.
	HID: apple: Disable Fn-key key-re-mapping on clone keyboards
	dmaengine: tegra210-adma: Fix runtime PM imbalance on error
	Input: add `SW_MACHINE_COVER`
	spi: mediatek: use correct SPI_CFG2_REG MACRO
	regmap: dev_get_regmap_match(): fix string comparison
	hwmon: (aspeed-pwm-tacho) Avoid possible buffer overflow
	dmaengine: ioat setting ioat timeout as module parameter
	Input: synaptics - enable InterTouch for ThinkPad X1E 1st gen
	usb: gadget: udc: gr_udc: fix memleak on error handling path in gr_ep_init()
	hwmon: (adm1275) Make sure we are reading enough data for different chips
	hwmon: (scmi) Fix potential buffer overflow in scmi_hwmon_probe()
	arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEP
	x86: math-emu: Fix up 'cmp' insn for clang ias
	RISC-V: Upgrade smp_mb__after_spinlock() to iorw,iorw
	binder: Don't use mmput() from shrinker function.
	usb: xhci-mtk: fix the failure of bandwidth allocation
	usb: xhci: Fix ASM2142/ASM3142 DMA addressing
	Revert "cifs: Fix the target file was deleted when rename failed."
	staging: wlan-ng: properly check endpoint types
	staging: comedi: addi_apci_1032: check INSN_CONFIG_DIGITAL_TRIG shift
	staging: comedi: ni_6527: fix INSN_CONFIG_DIGITAL_TRIG support
	staging: comedi: addi_apci_1500: check INSN_CONFIG_DIGITAL_TRIG shift
	staging: comedi: addi_apci_1564: check INSN_CONFIG_DIGITAL_TRIG shift
	serial: 8250: fix null-ptr-deref in serial8250_start_tx()
	serial: 8250_mtk: Fix high-speed baud rates clamping
	fbdev: Detect integer underflow at "struct fbcon_ops"->clear_margins.
	vt: Reject zero-sized screen buffer size.
	Makefile: Fix GCC_TOOLCHAIN_DIR prefix for Clang cross compilation
	mm/memcg: fix refcount error while moving and swapping
	mm: memcg/slab: synchronize access to kmem_cache dying flag using a spinlock
	mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
	io-mapping: indicate mapping failure
	drm/amdgpu: Fix NULL dereference in dpm sysfs handlers
	drm/amd/powerplay: fix a crash when overclocking Vega M
	parisc: Add atomic64_set_release() define to avoid CPU soft lockups
	x86, vmlinux.lds: Page-align end of ..page_aligned sections
	ASoC: rt5670: Add new gpio1_is_ext_spk_en quirk and enable it on the Lenovo Miix 2 10
	ASoC: qcom: Drop HAS_DMA dependency to fix link failure
	dm integrity: fix integrity recalculation that is improperly skipped
	ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb
	ath9k: Fix regression with Atheros 9271
	Linux 4.19.135

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0bbcde83e7c810352d998f28d3484efa2b9ede8e
2020-07-29 13:22:30 +02:00
Muchun Song
d87ddcdb2d mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
commit d38a2b7a9c939e6d7329ab92b96559ccebf7b135 upstream.

If the kmem_cache refcount is greater than one, we should not mark the
root kmem_cache as dying.  If we mark the root kmem_cache dying
incorrectly, the non-root kmem_cache can never be destroyed.  It
resulted in memory leak when memcg was destroyed.  We can use the
following steps to reproduce.

  1) Use kmem_cache_create() to create a new kmem_cache named A.
  2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
     so the refcount of B is just increased.
  3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
     decrease the B's refcount but mark the B as dying.
  4) Create a new memory cgroup and alloc memory from the kmem_cache
     B. It leads to create a non-root kmem_cache for allocating memory.
  5) When destroy the memory cgroup created in the step 4), the
     non-root kmem_cache can never be destroyed.

If we repeat steps 4) and 5), this will cause a lot of memory leak.  So
only when refcount reach zero, we mark the root kmem_cache as dying.

Fixes: 92ee383f6d ("mm: fix race between kmem_cache destroy, create and deactivate")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200716165103.83462-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-29 10:16:57 +02:00
Roman Gushchin
763b04c6b2 mm: memcg/slab: synchronize access to kmem_cache dying flag using a spinlock
[ Upstream commit 63b02ef7dc4ec239df45c018ac0adbd02ba30a0c ]

Currently the memcg_params.dying flag and the corresponding workqueue used
for the asynchronous deactivation of kmem_caches is synchronized using the
slab_mutex.

It makes impossible to check this flag from the irq context, which will be
required in order to implement asynchronous release of kmem_caches.

So let's switch over to the irq-save flavor of the spinlock-based
synchronization.

Link: http://lkml.kernel.org/r/20190611231813.3148843-8-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Waiman Long <longman@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-29 10:16:57 +02:00
Hugh Dickins
91404e91eb mm/memcg: fix refcount error while moving and swapping
commit 8d22a9351035ef2ff12ef163a1091b8b8cf1e49c upstream.

It was hard to keep a test running, moving tasks between memcgs with
move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s
refcount is discovered to be 0 (supposedly impossible), so it is then
forced to REFCOUNT_SATURATED, and after thousands of warnings in quick
succession, the test is at last put out of misery by being OOM killed.

This is because of the way moved_swap accounting was saved up until the
task move gets completed in __mem_cgroup_clear_mc(), deferred from when
mem_cgroup_move_swap_account() actually exchanged old and new ids.
Concurrent activity can free up swap quicker than the task is scanned,
bringing id refcount down 0 (which should only be possible when
offlining).

Just skip that optimization: do that part of the accounting immediately.

Fixes: 615d66c37c ("mm: memcontrol: fix memcg id ref counter on swap charge move")
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2007071431050.4726@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-29 10:16:57 +02:00
Greg Kroah-Hartman
a6c20c0442 This is the 4.19.132 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl8GySkACgkQONu9yGCS
 aT5AJxAAhkThJ1JxpJ/Ci5KASCkyXxqghdD+EQ/dwVp9fWjFcUZT4Pp+PZzGk02d
 6DW9F8sNsCJOepeXmzJnFDf2rIuFCCGWgkX/wD+zTHDzY+ThmaOBy9hW2k2tM98B
 ii40ui4YYUYHKh0nPF1uPWsAGEdipqv2QWCaDszKtpH/mjcodWJmlJpXTl1Lh6+H
 dWT1CEYhLJBG5Z6A9NVeKn7PMtX7hMoYEF8DRyUg59a2oNKc2ys8FsjeRfJ9EvYN
 Qd1ZKWUvkzxQmL14RIvkA3bcU7bdIAB5Qm9lV0WKs/kVD2+zMOZb3j4HR2+0ey6U
 ieyX6iM07la1cMEEHxtHtIZoghgId7XgB2UJu7oFGH/4Y98ocwKa/8iA+qtXEDVj
 7mHa3dnC/gKMISydKsLN8AswVJXTy+RrcOnOf4qNfRUe0hhRZhfniusio5Hp8s/x
 vV+k+/PH8kPvVOhM7Jw3wGgms4wMzgKR8zcZm4wf9qnZMyLn/e902w04d+xkEs3Z
 ShSZtd/9CWed7FENKEb0mCCJrE2c6kHCAExNfP6Xx2NSdB7Nr1vFNqYhXIynGRko
 Hpf746FaQ2J4jr1IHDaGvfxeD64EJqpH8M2Ti3/jkQG45FrYRGirX4cry0Kx655r
 SmSENRH4H5alnGL/1m/8xYHBDsA3u2hY7+7ZfKBgK3B0ikxPia8=
 =NI9a
 -----END PGP SIGNATURE-----

Merge 4.19.132 into android-4.19-stable

Changes in 4.19.132
	btrfs: fix a block group ref counter leak after failure to remove block group
	mm: fix swap cache node allocation mask
	EDAC/amd64: Read back the scrub rate PCI register on F15h
	usbnet: smsc95xx: Fix use-after-free after removal
	mm/slub.c: fix corrupted freechain in deactivate_slab()
	mm/slub: fix stack overruns with SLUB_STATS
	usb: usbtest: fix missing kfree(dev->buf) in usbtest_disconnect
	s390/debug: avoid kernel warning on too large number of pages
	nvme-multipath: set bdi capabilities once
	nvme-multipath: fix deadlock between ana_work and scan_work
	kgdb: Avoid suspicious RCU usage warning
	crypto: af_alg - fix use-after-free in af_alg_accept() due to bh_lock_sock()
	drm/msm/dpu: fix error return code in dpu_encoder_init
	cxgb4: use unaligned conversion for fetching timestamp
	cxgb4: parse TC-U32 key values and masks natively
	cxgb4: use correct type for all-mask IP address comparison
	cxgb4: fix SGE queue dump destination buffer context
	hwmon: (max6697) Make sure the OVERT mask is set correctly
	hwmon: (acpi_power_meter) Fix potential memory leak in acpi_power_meter_add()
	drm: sun4i: hdmi: Remove extra HPD polling
	virtio-blk: free vblk-vqs in error path of virtblk_probe()
	SMB3: Honor 'posix' flag for multiuser mounts
	nvme: fix a crash in nvme_mpath_add_disk
	i2c: algo-pca: Add 0x78 as SCL stuck low status for PCA9665
	i2c: mlxcpld: check correct size of maximum RECV_LEN packet
	nfsd: apply umask on fs without ACL support
	Revert "ALSA: usb-audio: Improve frames size computation"
	SMB3: Honor 'seal' flag for multiuser mounts
	SMB3: Honor persistent/resilient handle flags for multiuser mounts
	SMB3: Honor lease disabling for multiuser mounts
	cifs: Fix the target file was deleted when rename failed.
	MIPS: Add missing EHB in mtc0 -> mfc0 sequence for DSPen
	irqchip/gic: Atomically update affinity
	dm zoned: assign max_io_len correctly
	efi: Make it possible to disable efivar_ssdt entirely
	Linux 4.19.132

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I394bc3ee947bdf92bad83297e7bd579e92fed8a9
2020-07-09 11:20:59 +02:00
Qian Cai
3e632652e3 mm/slub: fix stack overruns with SLUB_STATS
[ Upstream commit a68ee0573991e90af2f1785db309206408bad3e5 ]

There is no need to copy SLUB_STATS items from root memcg cache to new
memcg cache copies.  Doing so could result in stack overruns because the
store function only accepts 0 to clear the stat and returns an error for
everything else while the show method would print out the whole stat.

Then, the mismatch of the lengths returns from show and store methods
happens in memcg_propagate_slab_attrs():

	else if (root_cache->max_attr_size < ARRAY_SIZE(mbuf))
		buf = mbuf;

max_attr_size is only 2 from slab_attr_store(), then, it uses mbuf[64]
in show_stat() later where a bounch of sprintf() would overrun the stack
variable.  Fix it by always allocating a page of buffer to be used in
show_stat() if SLUB_STATS=y which should only be used for debug purpose.

  # echo 1 > /sys/kernel/slab/fs_cache/shrink
  BUG: KASAN: stack-out-of-bounds in number+0x421/0x6e0
  Write of size 1 at addr ffffc900256cfde0 by task kworker/76:0/53251

  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
  Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
  Call Trace:
    number+0x421/0x6e0
    vsnprintf+0x451/0x8e0
    sprintf+0x9e/0xd0
    show_stat+0x124/0x1d0
    alloc_slowpath_show+0x13/0x20
    __kmem_cache_create+0x47a/0x6b0

  addr ffffc900256cfde0 is located in stack of task kworker/76:0/53251 at offset 0 in frame:
   process_one_work+0x0/0xb90

  this frame has 1 object:
   [32, 72) 'lockdep_map'

  Memory state around the buggy address:
   ffffc900256cfc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   ffffc900256cfd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  >ffffc900256cfd80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
                                                         ^
   ffffc900256cfe00: 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00 00
   ffffc900256cfe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ==================================================================
  Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: __kmem_cache_create+0x6ac/0x6b0
  Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
  Call Trace:
    __kmem_cache_create+0x6ac/0x6b0

Fixes: 107dab5c92 ("slub: slub-specific propagation changes")
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Glauber Costa <glauber@scylladb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/20200429222356.4322-1-cai@lca.pw
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-09 09:37:09 +02:00
Dongli Zhang
6c09755c02 mm/slub.c: fix corrupted freechain in deactivate_slab()
[ Upstream commit 52f23478081ae0dcdb95d1650ea1e7d52d586829 ]

The slub_debug is able to fix the corrupted slab freelist/page.
However, alloc_debug_processing() only checks the validity of current
and next freepointer during allocation path.  As a result, once some
objects have their freepointers corrupted, deactivate_slab() may lead to
page fault.

Below is from a test kernel module when 'slub_debug=PUF,kmalloc-128
slub_nomerge'.  The test kernel corrupts the freepointer of one free
object on purpose.  Unfortunately, deactivate_slab() does not detect it
when iterating the freechain.

  BUG: unable to handle page fault for address: 00000000123456f8
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  ... ...
  RIP: 0010:deactivate_slab.isra.92+0xed/0x490
  ... ...
  Call Trace:
   ___slab_alloc+0x536/0x570
   __slab_alloc+0x17/0x30
   __kmalloc+0x1d9/0x200
   ext4_htree_store_dirent+0x30/0xf0
   htree_dirblock_to_tree+0xcb/0x1c0
   ext4_htree_fill_tree+0x1bc/0x2d0
   ext4_readdir+0x54f/0x920
   iterate_dir+0x88/0x190
   __x64_sys_getdents+0xa6/0x140
   do_syscall_64+0x49/0x170
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Therefore, this patch adds extra consistency check in deactivate_slab().
Once an object's freepointer is corrupted, all following objects
starting at this object are isolated.

[akpm@linux-foundation.org: fix build with CONFIG_SLAB_DEBUG=n]
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/20200331031450.12182-1-dongli.zhang@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-09 09:37:09 +02:00
Hugh Dickins
fa11088c6f mm: fix swap cache node allocation mask
[ Upstream commit 243bce09c91b0145aeaedd5afba799d81841c030 ]

Chris Murphy reports that a slightly overcommitted load, testing swap
and zram along with i915, splats and keeps on splatting, when it had
better fail less noisily:

  gnome-shell: page allocation failure: order:0,
  mode:0x400d0(__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_RECLAIMABLE),
  nodemask=(null),cpuset=/,mems_allowed=0
  CPU: 2 PID: 1155 Comm: gnome-shell Not tainted 5.7.0-1.fc33.x86_64 #1
  Call Trace:
    dump_stack+0x64/0x88
    warn_alloc.cold+0x75/0xd9
    __alloc_pages_slowpath.constprop.0+0xcfa/0xd30
    __alloc_pages_nodemask+0x2df/0x320
    alloc_slab_page+0x195/0x310
    allocate_slab+0x3c5/0x440
    ___slab_alloc+0x40c/0x5f0
    __slab_alloc+0x1c/0x30
    kmem_cache_alloc+0x20e/0x220
    xas_nomem+0x28/0x70
    add_to_swap_cache+0x321/0x400
    __read_swap_cache_async+0x105/0x240
    swap_cluster_readahead+0x22c/0x2e0
    shmem_swapin+0x8e/0xc0
    shmem_swapin_page+0x196/0x740
    shmem_getpage_gfp+0x3a2/0xa60
    shmem_read_mapping_page_gfp+0x32/0x60
    shmem_get_pages+0x155/0x5e0 [i915]
    __i915_gem_object_get_pages+0x68/0xa0 [i915]
    i915_vma_pin+0x3fe/0x6c0 [i915]
    eb_add_vma+0x10b/0x2c0 [i915]
    i915_gem_do_execbuffer+0x704/0x3430 [i915]
    i915_gem_execbuffer2_ioctl+0x1ea/0x3e0 [i915]
    drm_ioctl_kernel+0x86/0xd0 [drm]
    drm_ioctl+0x206/0x390 [drm]
    ksys_ioctl+0x82/0xc0
    __x64_sys_ioctl+0x16/0x20
    do_syscall_64+0x5b/0xf0
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported on 5.7, but it goes back really to 3.1: when
shmem_read_mapping_page_gfp() was implemented for use by i915, and
allowed for __GFP_NORETRY and __GFP_NOWARN flags in most places, but
missed swapin's "& GFP_KERNEL" mask for page tree node allocation in
__read_swap_cache_async() - that was to mask off HIGHUSER_MOVABLE bits
from what page cache uses, but GFP_RECLAIM_MASK is now what's needed.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=208085
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2006151330070.11064@eggly.anvils
Fixes: 68da9f0557 ("tmpfs: pass gfp to shmem_getpage_gfp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Chris Murphy <lists@colorremedies.com>
Analyzed-by: Vlastimil Babka <vbabka@suse.cz>
Analyzed-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Chris Murphy <lists@colorremedies.com>
Cc: <stable@vger.kernel.org>	[3.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-09 09:37:09 +02:00
Minchan Kim
1cf3a7305e ANDROID: GKI: fix ABI diffs caused by GPU heap and pool vmstat additions
New nr_gpu_heap fields in vmstat file cause ABI differences.
Fix them by adding the fields and we don't print it to vmstat now
as it's out-of-tree.

Bug: 128254966
Bug: 130198686

Test: boot and check /proc/vmstat has nr_gpu_heap
Change-Id: Idb322f80dbfde785720f01b3ad9954eb4feb7326
Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Kyle Lin <kylelin@google.com>
Signed-off-by: Martin Liu <liumartin@google.com>
2020-07-07 23:32:09 +08:00
Greg Kroah-Hartman
b3293788b9 Linux 4.19.131
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAl78APEACgkQ3qZv95d3
 LNyt+w/+PkFP8++ZsiI6GegXraxVbGuY4ndXroXAiTYa0uZjdsIhqJpgyVsJ/pbq
 jU/Hcfv8a0UGme7Hqy61KwN6aaCpM27zxE3aV/N9othtJWn59hiB51CyCcKMrjxK
 Mj6PN+yHxLPzCNBszEvOsICsBQt9HtJB11gcbJQPJ2skriVxSER0QrZi2s5jJuoS
 vVbxfRngXCnzTsxmpbYjMh1sE9/z/dNpCuyQ13f1MPAPpWFP1SxmMUfknXEO8gkF
 ThRIhI6uHDucAQxhP42McBsuoP64KfB90fKzFEuWmlit4OCmqW9subTeaI8V1muK
 CxkPqwRnyYmqbAM9auRwbJxtYfT0ONtDZj4zbLulq4qMTJF650968RQNIW+B1K3C
 jika93Am0YbNPOyq3m9Ac96NaTFjjhpIzu13P6xUQNf3/ydPKY9PHif2CnWCHPsX
 BO9fap7gsWHa88khjEGYXwcQCOC+UzQlcsT6CsWPTUTmcLObHiv863Rqm7LpXjit
 9gjKlNHdP6U0q+bz5aiiEtoNJ/2ZDwoz1I+srbrk7QMdVzAn+uRRtLRQxmJtryw1
 oTnJJu0iv9Zspn/PFXwlrpsYDDEBFfXFWvC+izfz8nm8CPFKgH9G96XNefcXlI9e
 3qxjDpkFb74R6ovnWKtJY8pR1qX/5TRC0/+/WpbZBILqW4Z0k5w=
 =YVa/
 -----END PGP SIGNATURE-----

Merge 4.19.131 into android-4.19-stable

Changes in 4.19.131
	net: be more gentle about silly gso requests coming from user
	block/bio-integrity: don't free 'buf' if bio_integrity_add_page() failed
	fanotify: fix ignore mask logic for events on child and on dir
	mtd: rawnand: marvell: Fix the condition on a return code
	net: bcmgenet: remove HFB_CTRL access
	net: sched: export __netdev_watchdog_up()
	EDAC/amd64: Add Family 17h Model 30h PCI IDs
	i2c: tegra: Cleanup kerneldoc comments
	i2c: tegra: Add missing kerneldoc for some fields
	i2c: tegra: Fix Maximum transfer size
	fix a braino in "sparc32: fix register window handling in genregs32_[gs]et()"
	ALSA: hda/realtek - Enable the headset of ASUS B9450FA with ALC294
	ALSA: hda/realtek: Enable mute LED on an HP system
	ALSA: hda/realtek - Enable micmute LED on and HP system
	apparmor: don't try to replace stale label in ptraceme check
	ibmveth: Fix max MTU limit
	mld: fix memory leak in ipv6_mc_destroy_dev()
	net: bridge: enfore alignment for ethernet address
	net: fix memleak in register_netdevice()
	net: place xmit recursion in softnet data
	net: use correct this_cpu primitive in dev_recursion_level
	net: increment xmit_recursion level in dev_direct_xmit()
	net: usb: ax88179_178a: fix packet alignment padding
	rocker: fix incorrect error handling in dma_rings_init
	rxrpc: Fix notification call on completion of discarded calls
	sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket
	tcp: don't ignore ECN CWR on pure ACK
	tcp: grow window for OOO packets only for SACK flows
	tg3: driver sleeps indefinitely when EEH errors exceed eeh_max_freezes
	ip6_gre: fix use-after-free in ip6gre_tunnel_lookup()
	net: phy: Check harder for errors in get_phy_id()
	ip_tunnel: fix use-after-free in ip_tunnel_lookup()
	sch_cake: don't try to reallocate or unshare skb unconditionally
	sch_cake: fix a few style nits
	tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
	sch_cake: don't call diffserv parsing code when it is not needed
	net: Fix the arp error in some cases
	net: Do not clear the sock TX queue in sk_set_socket()
	net: core: reduce recursion limit value
	USB: ohci-sm501: Add missed iounmap() in remove
	usb: dwc2: Postponed gadget registration to the udc class driver
	usb: add USB_QUIRK_DELAY_INIT for Logitech C922
	USB: ehci: reopen solution for Synopsys HC bug
	usb: host: xhci-mtk: avoid runtime suspend when removing hcd
	xhci: Poll for U0 after disabling USB2 LPM
	usb: host: ehci-exynos: Fix error check in exynos_ehci_probe()
	usb: typec: tcpci_rt1711h: avoid screaming irq causing boot hangs
	ALSA: usb-audio: add quirk for Denon DCD-1500RE
	ALSA: usb-audio: add quirk for Samsung USBC Headset (AKG)
	ALSA: usb-audio: Fix OOB access of mixer element list
	scsi: zfcp: Fix panic on ERP timeout for previously dismissed ERP action
	xhci: Fix incorrect EP_STATE_MASK
	xhci: Fix enumeration issue when setting max packet size for FS devices.
	xhci: Return if xHCI doesn't support LPM
	cdc-acm: Add DISABLE_ECHO quirk for Microchip/SMSC chip
	loop: replace kill_bdev with invalidate_bdev
	IB/mad: Fix use after free when destroying MAD agent
	cifs/smb3: Fix data inconsistent when punch hole
	cifs/smb3: Fix data inconsistent when zero file range
	xfrm: Fix double ESP trailer insertion in IPsec crypto offload.
	ASoC: q6asm: handle EOS correctly
	efi/esrt: Fix reference count leak in esre_create_sysfs_entry.
	regualtor: pfuze100: correct sw1a/sw2 on pfuze3000
	ASoC: fsl_ssi: Fix bclk calculation for mono channel
	ARM: dts: Fix duovero smsc interrupt for suspend
	x86/resctrl: Fix a NULL vs IS_ERR() static checker warning in rdt_cdp_peer_get()
	regmap: Fix memory leak from regmap_register_patch
	ARM: dts: NSP: Correct FA2 mailbox node
	rxrpc: Fix handling of rwind from an ACK packet
	RDMA/qedr: Fix KASAN: use-after-free in ucma_event_handler+0x532
	RDMA/cma: Protect bind_list and listen_list while finding matching cm id
	ASoC: rockchip: Fix a reference count leak.
	RDMA/mad: Fix possible memory leak in ib_mad_post_receive_mads()
	net: qed: fix left elements count calculation
	net: qed: fix NVMe login fails over VFs
	net: qed: fix excessive QM ILT lines consumption
	cxgb4: move handling L2T ARP failures to caller
	ARM: imx5: add missing put_device() call in imx_suspend_alloc_ocram()
	usb: gadget: udc: Potential Oops in error handling code
	netfilter: ipset: fix unaligned atomic access
	net: bcmgenet: use hardware padding of runt frames
	i2c: fsi: Fix the port number field in status register
	i2c: core: check returned size of emulated smbus block read
	sched/deadline: Initialize ->dl_boosted
	sched/core: Fix PI boosting between RT and DEADLINE tasks
	sata_rcar: handle pm_runtime_get_sync failure cases
	ata/libata: Fix usage of page address by page_address in ata_scsi_mode_select_xlat function
	drm/amd/display: Use kfree() to free rgb_user in calculate_user_regamma_ramp()
	riscv/atomic: Fix sign extension for RV64I
	hwrng: ks-sa - Fix runtime PM imbalance on error
	ibmvnic: Harden device login requests
	net: alx: fix race condition in alx_remove
	s390/ptrace: fix setting syscall number
	s390/vdso: fix vDSO clock_getres()
	arm64: sve: Fix build failure when ARM64_SVE=y and SYSCTL=n
	kbuild: improve cc-option to clean up all temporary files
	blktrace: break out of blktrace setup on concurrent calls
	RISC-V: Don't allow write+exec only page mapping request in mmap
	ALSA: hda: Add NVIDIA codec IDs 9a & 9d through a0 to patch table
	ALSA: hda/realtek - Add quirk for MSI GE63 laptop
	ACPI: sysfs: Fix pm_profile_attr type
	erofs: fix partially uninitialized misuse in z_erofs_onlinepage_fixup
	KVM: X86: Fix MSR range of APIC registers in X2APIC mode
	KVM: nVMX: Plumb L2 GPA through to PML emulation
	x86/asm/64: Align start of __clear_user() loop to 16-bytes
	btrfs: fix data block group relocation failure due to concurrent scrub
	btrfs: fix failure of RWF_NOWAIT write into prealloc extent beyond eof
	mm/slab: use memzero_explicit() in kzfree()
	ocfs2: avoid inode removal while nfsd is accessing it
	ocfs2: load global_inode_alloc
	ocfs2: fix value of OCFS2_INVALID_SLOT
	ocfs2: fix panic on nfs server over ocfs2
	arm64: perf: Report the PC value in REGS_ABI_32 mode
	tracing: Fix event trigger to accept redundant spaces
	ring-buffer: Zero out time extend if it is nested and not absolute
	drm: rcar-du: Fix build error
	drm/radeon: fix fb_div check in ni_init_smc_spll_table()
	Staging: rtl8723bs: prevent buffer overflow in update_sta_support_rate()
	sunrpc: fixed rollback in rpc_gssd_dummy_populate()
	SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment()
	pNFS/flexfiles: Fix list corruption if the mirror count changes
	NFSv4 fix CLOSE not waiting for direct IO compeletion
	dm writecache: correct uncommitted_block when discarding uncommitted entry
	dm writecache: add cond_resched to loop in persistent_memory_claim()
	xfs: add agf freeblocks verify in xfs_agf_verify
	Revert "tty: hvc: Fix data abort due to race in hvc_open"
	Linux 4.19.131

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2c5abdfc2979e50d441bb0e0bcd499e03c61cefd
2020-07-01 13:11:06 +02:00
Waiman Long
9ac47ed7c9 mm/slab: use memzero_explicit() in kzfree()
commit 8982ae527fbef170ef298650c15d55a9ccd33973 upstream.

The kzfree() function is normally used to clear some sensitive
information, like encryption keys, in the buffer before freeing it back to
the pool.  Memset() is currently used for buffer clearing.  However
unlikely, there is still a non-zero probability that the compiler may
choose to optimize away the memory clearing especially if LTO is being
used in the future.

To make sure that this optimization will never happen,
memzero_explicit(), which is introduced in v3.18, is now used in
kzfree() to future-proof it.

Link: http://lkml.kernel.org/r/20200616154311.12314-2-longman@redhat.com
Fixes: 3ef0e5ba46 ("slab: introduce kzfree()")
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Joe Perches <joe@perches.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30 23:17:16 -04:00
Greg Kroah-Hartman
4d01d462e6 This is the 4.19.129 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl7wXf4ACgkQONu9yGCS
 aT6rDBAAg6jIYJhhb9lpK59hpMxNsaFnPfXdA3Z6qARqH7iIQa9TTP9JF5eFndS0
 +2wV8t/8Nz/3BWq9NQAF525QJdqyY6Ahcj5QQXzIzEZyb/p5fRVCBOUcBP7uaBCu
 gdORR7OhHI9+7aGLr05Svb7pVWPLi0Mk5vjvthEIkojEOIREGuGlERRZNlL1SN3y
 cYDBCCJtD2XiuhyZNLNxtwE/2/d/1xuIG7T3VRDS6oBtqfOXdsy5xoU9lpbbmZQg
 s1i3cjWgxEYjJOJqONwzfUSu9Zj4GUZfLTx3gtXG7iEiuUfEw3ljEvIrqSqtNxB5
 aTysoOu4MSdJTALHkA7Szhk2Q8Pecmo+NdKLfgMCxAWwIEbn1X9seea7QC5M5/lr
 Q1z150M2+Lcs6z9I/vR+vmPh9YKn1yGV4RbTeMXwiQLlWcRh7vh7jN+YJvrpmJSL
 BGbsRLB02J4i58CLW7n2rgeq5ycO41bJeWdXSSZjJg7KiZMvuD7mnDv1nUoj3Ad0
 lFxgfBRYYZzGCe53xLBXKnjua1lxp8rStUK4iotqkXyhaZqHo0J52okDxSqbP4VZ
 DYMfgyiFDufITd7l7qK5H6OeWQJ2IPtaude0HiMQf00bdOIrIsl+xXCtFHo6kx6z
 VxwFUAUZWIKZT9ZWo2DNmbbDSRmij3Pqm6ZiakDSPT+kZFqvkBo=
 =u5pA
 -----END PGP SIGNATURE-----

Merge 4.19.129 into android-4.19-stable

Changes in 4.19.129
	ipv6: fix IPV6_ADDRFORM operation logic
	net_failover: fixed rollback in net_failover_open()
	bridge: Avoid infinite loop when suppressing NS messages with invalid options
	vxlan: Avoid infinite loop when suppressing NS messages with invalid options
	tun: correct header offsets in napi frags mode
	selftests: bpf: fix use of undeclared RET_IF macro
	make 'user_access_begin()' do 'access_ok()'
	Fix 'acccess_ok()' on alpha and SH
	arch/openrisc: Fix issues with access_ok()
	x86: uaccess: Inhibit speculation past access_ok() in user_access_begin()
	lib: Reduce user_access_begin() boundaries in strncpy_from_user() and strnlen_user()
	btrfs: merge btrfs_find_device and find_device
	btrfs: Detect unbalanced tree with empty leaf before crashing btree operations
	crypto: talitos - fix ECB and CBC algs ivsize
	Input: mms114 - fix handling of mms345l
	ARM: 8977/1: ptrace: Fix mask for thumb breakpoint hook
	sched/fair: Don't NUMA balance for kthreads
	Input: synaptics - add a second working PNP_ID for Lenovo T470s
	drivers/net/ibmvnic: Update VNIC protocol version reporting
	powerpc/xive: Clear the page tables for the ESB IO mapping
	ath9k_htc: Silence undersized packet warnings
	RDMA/uverbs: Make the event_queue fds return POLLERR when disassociated
	x86/cpu/amd: Make erratum #1054 a legacy erratum
	perf probe: Accept the instance number of kretprobe event
	mm: add kvfree_sensitive() for freeing sensitive data objects
	aio: fix async fsync creds
	btrfs: tree-checker: Check level for leaves and nodes
	x86_64: Fix jiffies ODR violation
	x86/PCI: Mark Intel C620 MROMs as having non-compliant BARs
	x86/speculation: Prevent rogue cross-process SSBD shutdown
	x86/reboot/quirks: Add MacBook6,1 reboot quirk
	efi/efivars: Add missing kobject_put() in sysfs entry creation error path
	ALSA: es1688: Add the missed snd_card_free()
	ALSA: hda/realtek - add a pintbl quirk for several Lenovo machines
	ALSA: usb-audio: Fix inconsistent card PM state after resume
	ALSA: usb-audio: Add vendor, product and profile name for HP Thunderbolt Dock
	ACPI: sysfs: Fix reference count leak in acpi_sysfs_add_hotplug_profile()
	ACPI: CPPC: Fix reference count leak in acpi_cppc_processor_probe()
	ACPI: GED: add support for _Exx / _Lxx handler methods
	ACPI: PM: Avoid using power resources if there are none for D0
	cgroup, blkcg: Prepare some symbols for module and !CONFIG_CGROUP usages
	nilfs2: fix null pointer dereference at nilfs_segctor_do_construct()
	spi: dw: Fix controller unregister order
	spi: bcm2835aux: Fix controller unregister order
	spi: bcm-qspi: when tx/rx buffer is NULL set to 0
	PM: runtime: clk: Fix clk_pm_runtime_get() error path
	crypto: cavium/nitrox - Fix 'nitrox_get_first_device()' when ndevlist is fully iterated
	ALSA: pcm: disallow linking stream to itself
	x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned
	KVM: x86: Fix APIC page invalidation race
	kvm: x86: Fix L1TF mitigation for shadow MMU
	KVM: x86/mmu: Consolidate "is MMIO SPTE" code
	KVM: x86: only do L1TF workaround on affected processors
	x86/speculation: Change misspelled STIPB to STIBP
	x86/speculation: Add support for STIBP always-on preferred mode
	x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.
	x86/speculation: PR_SPEC_FORCE_DISABLE enforcement for indirect branches.
	spi: No need to assign dummy value in spi_unregister_controller()
	spi: Fix controller unregister order
	spi: pxa2xx: Fix controller unregister order
	spi: bcm2835: Fix controller unregister order
	spi: pxa2xx: Balance runtime PM enable/disable on error
	spi: pxa2xx: Fix runtime PM ref imbalance on probe error
	crypto: virtio: Fix use-after-free in virtio_crypto_skcipher_finalize_req()
	crypto: virtio: Fix src/dst scatterlist calculation in __virtio_crypto_skcipher_do_req()
	crypto: virtio: Fix dest length calculation in __virtio_crypto_skcipher_do_req()
	selftests/net: in rxtimestamp getopt_long needs terminating null entry
	ovl: initialize error in ovl_copy_xattr
	proc: Use new_inode not new_inode_pseudo
	video: fbdev: w100fb: Fix a potential double free.
	KVM: nSVM: fix condition for filtering async PF
	KVM: nSVM: leave ASID aside in copy_vmcb_control_area
	KVM: nVMX: Consult only the "basic" exit reason when routing nested exit
	KVM: MIPS: Define KVM_ENTRYHI_ASID to cpu_asid_mask(&boot_cpu_data)
	KVM: MIPS: Fix VPN2_MASK definition for variable cpu_vmbits
	KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts
	scsi: megaraid_sas: TM command refire leads to controller firmware crash
	ath9k: Fix use-after-free Read in ath9k_wmi_ctrl_rx
	ath9k: Fix use-after-free Write in ath9k_htc_rx_msg
	ath9x: Fix stack-out-of-bounds Write in ath9k_hif_usb_rx_cb
	ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb
	Smack: slab-out-of-bounds in vsscanf
	drm/vkms: Hold gem object while still in-use
	mm/slub: fix a memory leak in sysfs_slab_add()
	fat: don't allow to mount if the FAT length == 0
	perf: Add cond_resched() to task_function_call()
	agp/intel: Reinforce the barrier after GTT updates
	mmc: sdhci-msm: Clear tuning done flag while hs400 tuning
	ARM: dts: at91: sama5d2_ptc_ek: fix sdmmc0 node description
	mmc: sdio: Fix potential NULL pointer error in mmc_sdio_init_card()
	xen/pvcalls-back: test for errors when calling backend_connect()
	KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception
	ACPI: GED: use correct trigger type field in _Exx / _Lxx handling
	drm: bridge: adv7511: Extend list of audio sample rates
	crypto: ccp -- don't "select" CONFIG_DMADEVICES
	media: si2157: Better check for running tuner in init
	objtool: Ignore empty alternatives
	spi: pxa2xx: Apply CS clk quirk to BXT
	net: atlantic: make hw_get_regs optional
	net: ena: fix error returning in ena_com_get_hash_function()
	efi/libstub/x86: Work around LLVM ELF quirk build regression
	arm64: cacheflush: Fix KGDB trap detection
	spi: dw: Zero DMA Tx and Rx configurations on stack
	arm64: insn: Fix two bugs in encoding 32-bit logical immediates
	ixgbe: Fix XDP redirect on archs with PAGE_SIZE above 4K
	MIPS: Loongson: Build ATI Radeon GPU driver as module
	Bluetooth: Add SCO fallback for invalid LMP parameters error
	kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
	kgdb: Prevent infinite recursive entries to the debugger
	spi: dw: Enable interrupts in accordance with DMA xfer mode
	clocksource: dw_apb_timer: Make CPU-affiliation being optional
	clocksource: dw_apb_timer_of: Fix missing clockevent timers
	btrfs: do not ignore error from btrfs_next_leaf() when inserting checksums
	ARM: 8978/1: mm: make act_mm() respect THREAD_SIZE
	batman-adv: Revert "disable ethtool link speed detection when auto negotiation off"
	mmc: meson-mx-sdio: trigger a soft reset after a timeout or CRC error
	spi: dw: Fix Rx-only DMA transfers
	x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit
	net: vmxnet3: fix possible buffer overflow caused by bad DMA value in vmxnet3_get_rss()
	staging: android: ion: use vmap instead of vm_map_ram
	brcmfmac: fix wrong location to get firmware feature
	tools api fs: Make xxx__mountpoint() more scalable
	e1000: Distribute switch variables for initialization
	dt-bindings: display: mediatek: control dpi pins mode to avoid leakage
	audit: fix a net reference leak in audit_send_reply()
	media: dvb: return -EREMOTEIO on i2c transfer failure.
	media: platform: fcp: Set appropriate DMA parameters
	MIPS: Make sparse_init() using top-down allocation
	Bluetooth: btbcm: Add 2 missing models to subver tables
	audit: fix a net reference leak in audit_list_rules_send()
	netfilter: nft_nat: return EOPNOTSUPP if type or flags are not supported
	selftests/bpf: Fix memory leak in extract_build_id()
	net: bcmgenet: set Rx mode before starting netif
	lib/mpi: Fix 64-bit MIPS build with Clang
	exit: Move preemption fixup up, move blocking operations down
	sched/core: Fix illegal RCU from offline CPUs
	drivers/perf: hisi: Fix typo in events attribute array
	net: lpc-enet: fix error return code in lpc_mii_init()
	media: cec: silence shift wrapping warning in __cec_s_log_addrs()
	net: allwinner: Fix use correct return type for ndo_start_xmit()
	powerpc/spufs: fix copy_to_user while atomic
	xfs: clean up the error handling in xfs_swap_extents
	Crypto/chcr: fix for ccm(aes) failed test
	MIPS: Truncate link address into 32bit for 32bit kernel
	mips: cm: Fix an invalid error code of INTVN_*_ERR
	kgdb: Fix spurious true from in_dbg_master()
	xfs: reset buffer write failure state on successful completion
	xfs: fix duplicate verification from xfs_qm_dqflush()
	platform/x86: intel-vbtn: Use acpi_evaluate_integer()
	platform/x86: intel-vbtn: Split keymap into buttons and switches parts
	platform/x86: intel-vbtn: Do not advertise switches to userspace if they are not there
	platform/x86: intel-vbtn: Also handle tablet-mode switch on "Detachable" and "Portable" chassis-types
	nvme: refine the Qemu Identify CNS quirk
	ath10k: Remove msdu from idr when management pkt send fails
	wcn36xx: Fix error handling path in 'wcn36xx_probe()'
	net: qed*: Reduce RX and TX default ring count when running inside kdump kernel
	mt76: avoid rx reorder buffer overflow
	md: don't flush workqueue unconditionally in md_open
	veth: Adjust hard_start offset on redirect XDP frames
	net/mlx5e: IPoIB, Drop multicast packets that this interface sent
	rtlwifi: Fix a double free in _rtl_usb_tx_urb_setup()
	mwifiex: Fix memory corruption in dump_station
	x86/boot: Correct relocation destination on old linkers
	mips: MAAR: Use more precise address mask
	mips: Add udelay lpj numbers adjustment
	crypto: stm32/crc32 - fix ext4 chksum BUG_ON()
	crypto: stm32/crc32 - fix run-time self test issue.
	crypto: stm32/crc32 - fix multi-instance
	x86/mm: Stop printing BRK addresses
	m68k: mac: Don't call via_flush_cache() on Mac IIfx
	btrfs: qgroup: mark qgroup inconsistent if we're inherting snapshot to a new qgroup
	macvlan: Skip loopback packets in RX handler
	PCI: Don't disable decoding when mmio_always_on is set
	MIPS: Fix IRQ tracing when call handle_fpe() and handle_msa_fpe()
	bcache: fix refcount underflow in bcache_device_free()
	mmc: sdhci-msm: Set SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 quirk
	staging: greybus: sdio: Respect the cmd->busy_timeout from the mmc core
	mmc: via-sdmmc: Respect the cmd->busy_timeout from the mmc core
	ixgbe: fix signed-integer-overflow warning
	mmc: sdhci-esdhc-imx: fix the mask for tuning start point
	spi: dw: Return any value retrieved from the dma_transfer callback
	cpuidle: Fix three reference count leaks
	platform/x86: hp-wmi: Convert simple_strtoul() to kstrtou32()
	platform/x86: intel-hid: Add a quirk to support HP Spectre X2 (2015)
	platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type
	string.h: fix incompatibility between FORTIFY_SOURCE and KASAN
	btrfs: include non-missing as a qualifier for the latest_bdev
	btrfs: send: emit file capabilities after chown
	mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()
	mm: initialize deferred pages with interrupts enabled
	ima: Fix ima digest hash table key calculation
	ima: Directly assign the ima_default_policy pointer to ima_rules
	evm: Fix possible memory leak in evm_calc_hmac_or_hash()
	ext4: fix EXT_MAX_EXTENT/INDEX to check for zeroed eh_max
	ext4: fix error pointer dereference
	ext4: fix race between ext4_sync_parent() and rename()
	PCI: Avoid Pericom USB controller OHCI/EHCI PME# defect
	PCI: Avoid FLR for AMD Matisse HD Audio & USB 3.0
	PCI: Avoid FLR for AMD Starship USB 3.0
	PCI: Add ACS quirk for iProc PAXB
	PCI: Add ACS quirk for Intel Root Complex Integrated Endpoints
	PCI: Remove unused NFP32xx IDs
	pci:ipmi: Move IPMI PCI class id defines to pci_ids.h
	hwmon/k10temp, x86/amd_nb: Consolidate shared device IDs
	x86/amd_nb: Add PCI device IDs for family 17h, model 30h
	PCI: add USR vendor id and use it in r8169 and w6692 driver
	PCI: Move Synopsys HAPS platform device IDs
	PCI: Move Rohm Vendor ID to generic list
	misc: pci_endpoint_test: Add the layerscape EP device support
	misc: pci_endpoint_test: Add support to test PCI EP in AM654x
	PCI: Add Synopsys endpoint EDDA Device ID
	PCI: Add NVIDIA GPU multi-function power dependencies
	PCI: Enable NVIDIA HDA controllers
	PCI: mediatek: Add controller support for MT7629
	x86/amd_nb: Add PCI device IDs for family 17h, model 70h
	ALSA: lx6464es - add support for LX6464ESe pci express variant
	PCI: Add Genesys Logic, Inc. Vendor ID
	PCI: Add Amazon's Annapurna Labs vendor ID
	PCI: vmd: Add device id for VMD device 8086:9A0B
	x86/amd_nb: Add Family 19h PCI IDs
	PCI: Add Loongson vendor ID
	serial: 8250_pci: Move Pericom IDs to pci_ids.h
	PCI: Make ACS quirk implementations more uniform
	PCI: Unify ACS quirk desired vs provided checking
	PCI: Generalize multi-function power dependency device links
	btrfs: fix error handling when submitting direct I/O bio
	btrfs: fix wrong file range cleanup after an error filling dealloc range
	ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init()
	PCI: Program MPS for RCiEP devices
	e1000e: Disable TSO for buffer overrun workaround
	e1000e: Relax condition to trigger reset for ME workaround
	carl9170: remove P2P_GO support
	media: go7007: fix a miss of snd_card_free
	Bluetooth: hci_bcm: fix freeing not-requested IRQ
	b43legacy: Fix case where channel status is corrupted
	b43: Fix connection problem with WPA3
	b43_legacy: Fix connection problem with WPA3
	media: ov5640: fix use of destroyed mutex
	igb: Report speed and duplex as unknown when device is runtime suspended
	power: vexpress: add suppress_bind_attrs to true
	pinctrl: samsung: Correct setting of eint wakeup mask on s5pv210
	pinctrl: samsung: Save/restore eint_mask over suspend for EINT_TYPE GPIOs
	gnss: sirf: fix error return code in sirf_probe()
	sparc32: fix register window handling in genregs32_[gs]et()
	sparc64: fix misuses of access_process_vm() in genregs32_[sg]et()
	dm crypt: avoid truncating the logical block size
	alpha: fix memory barriers so that they conform to the specification
	kernel/cpu_pm: Fix uninitted local in cpu_pm
	ARM: tegra: Correct PL310 Auxiliary Control Register initialization
	ARM: dts: exynos: Fix GPIO polarity for thr GalaxyS3 CM36651 sensor's bus
	ARM: dts: at91: sama5d2_ptc_ek: fix vbus pin
	ARM: dts: s5pv210: Set keep-power-in-suspend for SDHCI1 on Aries
	drivers/macintosh: Fix memleak in windfarm_pm112 driver
	powerpc/64s: Don't let DT CPU features set FSCR_DSCR
	powerpc/64s: Save FSCR to init_task.thread.fscr after feature init
	kbuild: force to build vmlinux if CONFIG_MODVERSION=y
	sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
	sunrpc: clean up properly in gss_mech_unregister()
	mtd: rawnand: brcmnand: fix hamming oob layout
	mtd: rawnand: pasemi: Fix the probe error path
	w1: omap-hdq: cleanup to add missing newline for some dev_dbg
	perf probe: Do not show the skipped events
	perf probe: Fix to check blacklist address correctly
	perf probe: Check address correctness by map instead of _etext
	perf symbols: Fix debuginfo search for Ubuntu
	Linux 4.19.129

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I7b1108d90ee1109a28fe488a4358b7a3e101d9c9
2020-06-22 10:50:54 +02:00