Commit graph

594 commits

Author SHA1 Message Date
Oleg Nesterov
caf7caca0b UPSTREAM: cgroup: freezer: call cgroup_enter_frozen() with preemption disabled in ptrace_stop()
ptrace_stop() does preempt_enable_no_resched() to avoid the preemption,
but after that cgroup_enter_frozen() does spin_lock/unlock and this adds
another preemption point.

Reported-and-tested-by: Bruce Ashfield <bruce.ashfield@gmail.com>
Fixes: 76f969e8948d ("cgroup: cgroup v2 freezer")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

Change-Id: Ic53e0f2d6624b0bb90817b0c57060fb7db971348
(cherry picked from commit 937c6b27c73e02cd4114f95f5c37ba2c29fadba1)
Bug: 154548692
Signed-off-by: Marco Ballesio <balejs@google.com>
2020-08-26 15:35:17 -07:00
Roman Gushchin
f7840a0a07 UPSTREAM: signal: unconditionally leave the frozen state in ptrace_stop()
Alex Xu reported a regression in strace, caused by the introduction of
the cgroup v2 freezer. The regression can be reproduced by stracing
the following simple program:

  #include <unistd.h>

  int main() {
      write(1, "a", 1);
      return 0;
  }

An attempt to run strace ./a.out leads to the infinite loop:
  [ pre-main omitted ]
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  [ repeats forever ]

The problem occurs because the traced task leaves ptrace_stop()
(and the signal handling loop) with the frozen bit set. So let's
call cgroup_leave_frozen(true) unconditionally after sleeping
in ptrace_stop().

With this patch applied, strace works as expected:
  [ pre-main omitted ]
  write(1, "a", 1)                        = 1
  exit_group(0)                           = ?
  +++ exited with 0 +++

Reported-by: Alex Xu <alex_y_xu@yahoo.ca>
Fixes: 76f969e8948d ("cgroup: cgroup v2 freezer")
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>

Change-Id: If644b15ead36ce13f0c2c3dd57eebe3658e3edf7
(cherry picked from commit 05b289263772b0698589abc47771264a685cd365)
Bug: 154548692
Signed-off-by: Marco Ballesio <balejs@google.com>
2020-08-26 15:35:17 -07:00
Roman Gushchin
5318f3163c BACKPORT: cgroup: cgroup v2 freezer
Cgroup v1 implements the freezer controller, which provides an ability
to stop the workload in a cgroup and temporarily free up some
resources (cpu, io, network bandwidth and, potentially, memory)
for some other tasks. Cgroup v2 lacks this functionality.

This patch implements freezer for cgroup v2.

Cgroup v2 freezer tries to put tasks into a state similar to jobctl
stop. This means that tasks can be killed, ptraced (using
PTRACE_SEIZE*), and interrupted. It is possible to attach to
a frozen task, get some information (e.g. read registers) and detach.
It's also possible to migrate a frozen tasks to another cgroup.

This differs cgroup v2 freezer from cgroup v1 freezer, which mostly
tried to imitate the system-wide freezer. However uninterruptible
sleep is fine when all tasks are going to be frozen (hibernation case),
it's not the acceptable state for some subset of the system.

Cgroup v2 freezer is not supporting freezing kthreads.
If a non-root cgroup contains kthread, the cgroup still can be frozen,
but the kthread will remain running, the cgroup will be shown
as non-frozen, and the notification will not be delivered.

* PTRACE_ATTACH is not working because non-fatal signal delivery
is blocked in frozen state.

There are some interface differences between cgroup v1 and cgroup v2
freezer too, which are required to conform the cgroup v2 interface
design principles:
1) There is no separate controller, which has to be turned on:
the functionality is always available and is represented by
cgroup.freeze and cgroup.events cgroup control files.
2) The desired state is defined by the cgroup.freeze control file.
Any hierarchical configuration is allowed.
3) The interface is asynchronous. The actual state is available
using cgroup.events control file ("frozen" field). There are no
dedicated transitional states.
4) It's allowed to make any changes with the cgroup hierarchy
(create new cgroups, remove old cgroups, move tasks between cgroups)
no matter if some cgroups are frozen.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
No-objection-from-me-by: Oleg Nesterov <oleg@redhat.com>
Cc: kernel-team@fb.com
Change-Id: I3404119678cbcd7410aa56e9334055cee79d02fa
(cherry picked from commit 76f969e8948d82e78e1bc4beb6b9465908e74873)
cgroup-defs.h: use the struct cgroup_freezer_state and the
freezer field from definitions in I6221a975c04f06249a4f8d693852776ae08a8d8e
sched.h: use the frozen field defined in
I6221a975c04f06249a4f8d693852776ae08a8d8e
Bug: 154548692
Signed-off-by: Marco Ballesio <balejs@google.com>
2020-08-26 15:35:17 -07:00
Greg Kroah-Hartman
95bff4cdab This is the 4.19.116 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl6ZbYYACgkQONu9yGCS
 aT76ohAAn4lIjSuMRCILy/lq0DXVWDy7q6YdfyzNBITxc86tVfnfjMeQxUBviE/1
 OzShWgMRXeqrb0xJTJ5Rv6mt5Kf9a3DpPWt2jwo1iqWkl4AihDtDV7Z2Bh+QdnSX
 +lQ1xGPqDi4MMgoYlpMtlFc3wq/pJV0i8Q7amXC/KbsDkt5dlDrQYeEZHe2P7pR9
 ZljKLHEdGRE3uGqXmEM8qb6aLjQudnHmH/9uChP4UX6b+ZADDCc05DMhEkhEoCZT
 jdxiqVZvRdiiXTc1r6ckGv0xae77s0IAAZMQAd+24zFK94QByi6d9Cw0y6qyyDi7
 1rfHIWSjvetY3+4DCQDOu/k2/pLt/Vqh9zuvtaf8Tu8cKM9rxow0Hl9FlL3fZpBN
 btpqeCY6twFxApHoAp9ZDK6otaVEOtbg1MCsmpUbVxWIF9IR8cPqMGyYK3lR2Ao1
 HgdKEFkYOycAOu51ujuHsDLx/9k2ZqeSPyh0yrdVpFUVvMV/YqoYP9X3jzGRVllL
 hgYfFcywgrVgxK4c02/6cPiJNbFskTpLllDPVVXGIjO+9R4vTRUgJ74CNrqL25aT
 ioSFWJA00UvXObnbCDdA+otYYWAmYOJX7HVvEieb0oDqPYHZHa1UW6+1WlYSAQLm
 WAsHiejOv6PwzRmCDI6RyuZKQjjX6bppAWFq0/RLPO0uEqjXlxc=
 =Iq3k
 -----END PGP SIGNATURE-----

Merge 4.19.116 into android-4.19

Changes in 4.19.116
	ARM: dts: sun8i-a83t-tbs-a711: HM5065 doesn't like such a high voltage
	bus: sunxi-rsb: Return correct data when mixing 16-bit and 8-bit reads
	net: vxge: fix wrong __VA_ARGS__ usage
	hinic: fix a bug of waitting for IO stopped
	hinic: fix wrong para of wait_for_completion_timeout
	cxgb4/ptp: pass the sign of offset delta in FW CMD
	qlcnic: Fix bad kzalloc null test
	i2c: st: fix missing struct parameter description
	cpufreq: imx6q: Fixes unwanted cpu overclocking on i.MX6ULL
	media: venus: hfi_parser: Ignore HEVC encoding for V1
	firmware: arm_sdei: fix double-lock on hibernate with shared events
	null_blk: Fix the null_add_dev() error path
	null_blk: Handle null_add_dev() failures properly
	null_blk: fix spurious IO errors after failed past-wp access
	xhci: bail out early if driver can't accress host in resume
	x86: Don't let pgprot_modify() change the page encryption bit
	block: keep bdi->io_pages in sync with max_sectors_kb for stacked devices
	irqchip/versatile-fpga: Handle chained IRQs properly
	sched: Avoid scale real weight down to zero
	selftests/x86/ptrace_syscall_32: Fix no-vDSO segfault
	PCI/switchtec: Fix init_completion race condition with poll_wait()
	media: i2c: video-i2c: fix build errors due to 'imply hwmon'
	libata: Remove extra scsi_host_put() in ata_scsi_add_hosts()
	pstore/platform: fix potential mem leak if pstore_init_fs failed
	gfs2: Don't demote a glock until its revokes are written
	x86/boot: Use unsigned comparison for addresses
	efi/x86: Ignore the memory attributes table on i386
	genirq/irqdomain: Check pointer in irq_domain_alloc_irqs_hierarchy()
	block: Fix use-after-free issue accessing struct io_cq
	media: i2c: ov5695: Fix power on and off sequences
	usb: dwc3: core: add support for disabling SS instances in park mode
	irqchip/gic-v4: Provide irq_retrigger to avoid circular locking dependency
	md: check arrays is suspended in mddev_detach before call quiesce operations
	firmware: fix a double abort case with fw_load_sysfs_fallback
	locking/lockdep: Avoid recursion in lockdep_count_{for,back}ward_deps()
	block, bfq: fix use-after-free in bfq_idle_slice_timer_body
	btrfs: qgroup: ensure qgroup_rescan_running is only set when the worker is at least queued
	btrfs: remove a BUG_ON() from merge_reloc_roots()
	btrfs: track reloc roots based on their commit root bytenr
	IB/mlx5: Replace tunnel mpls capability bits for tunnel_offloads
	uapi: rename ext2_swab() to swab() and share globally in swab.h
	slub: improve bit diffusion for freelist ptr obfuscation
	ASoC: fix regwmask
	ASoC: dapm: connect virtual mux with default value
	ASoC: dpcm: allow start or stop during pause for backend
	ASoC: topology: use name_prefix for new kcontrol
	usb: gadget: f_fs: Fix use after free issue as part of queue failure
	usb: gadget: composite: Inform controller driver of self-powered
	ALSA: usb-audio: Add mixer workaround for TRX40 and co
	ALSA: hda: Add driver blacklist
	ALSA: hda: Fix potential access overflow in beep helper
	ALSA: ice1724: Fix invalid access for enumerated ctl items
	ALSA: pcm: oss: Fix regression by buffer overflow fix
	ALSA: doc: Document PC Beep Hidden Register on Realtek ALC256
	ALSA: hda/realtek - Set principled PC Beep configuration for ALC256
	ALSA: hda/realtek - Remove now-unnecessary XPS 13 headphone noise fixups
	ALSA: hda/realtek - Add quirk for MSI GL63
	media: ti-vpe: cal: fix disable_irqs to only the intended target
	acpi/x86: ignore unspecified bit positions in the ACPI global lock field
	thermal: devfreq_cooling: inline all stubs for CONFIG_DEVFREQ_THERMAL=n
	nvme-fc: Revert "add module to ops template to allow module references"
	nvme: Treat discovery subsystems as unique subsystems
	PCI: pciehp: Fix indefinite wait on sysfs requests
	PCI/ASPM: Clear the correct bits when enabling L1 substates
	PCI: Add boot interrupt quirk mechanism for Xeon chipsets
	PCI: endpoint: Fix for concurrent memory allocation in OB address region
	tpm: Don't make log failures fatal
	tpm: tpm1_bios_measurements_next should increase position index
	tpm: tpm2_bios_measurements_next should increase position index
	KEYS: reaching the keys quotas correctly
	irqchip/versatile-fpga: Apply clear-mask earlier
	pstore: pstore_ftrace_seq_next should increase position index
	MIPS/tlbex: Fix LDDIR usage in setup_pw() for Loongson-3
	MIPS: OCTEON: irq: Fix potential NULL pointer dereference
	ath9k: Handle txpower changes even when TPC is disabled
	signal: Extend exec_id to 64bits
	x86/entry/32: Add missing ASM_CLAC to general_protection entry
	KVM: nVMX: Properly handle userspace interrupt window request
	KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks
	KVM: s390: vsie: Fix delivery of addressing exceptions
	KVM: x86: Allocate new rmap and large page tracking when moving memslot
	KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support
	KVM: x86: Gracefully handle __vmalloc() failure during VM allocation
	KVM: VMX: fix crash cleanup when KVM wasn't used
	CIFS: Fix bug which the return value by asynchronous read is error
	mtd: spinand: Stop using spinand->oobbuf for buffering bad block markers
	mtd: spinand: Do not erase the block before writing a bad block marker
	Btrfs: fix crash during unmount due to race with delayed inode workers
	btrfs: set update the uuid generation as soon as possible
	btrfs: drop block from cache on error in relocation
	btrfs: fix missing file extent item for hole after ranged fsync
	btrfs: fix missing semaphore unlock in btrfs_sync_file
	crypto: mxs-dcp - fix scatterlist linearization for hash
	erofs: correct the remaining shrink objects
	powerpc/pseries: Drop pointless static qualifier in vpa_debugfs_init()
	x86/speculation: Remove redundant arch_smt_update() invocation
	tools: gpio: Fix out-of-tree build regression
	mm: Use fixed constant in page_frag_alloc instead of size + 1
	net: qualcomm: rmnet: Allow configuration updates to existing devices
	arm64: dts: allwinner: h6: Fix PMU compatible
	dm writecache: add cond_resched to avoid CPU hangs
	dm verity fec: fix memory leak in verity_fec_dtr
	scsi: zfcp: fix missing erp_lock in port recovery trigger for point-to-point
	arm64: armv8_deprecated: Fix undef_hook mask for thumb setend
	selftests: vm: drop dependencies on page flags from mlock2 tests
	rtc: omap: Use define directive for PIN_CONFIG_ACTIVE_HIGH
	drm/etnaviv: rework perfmon query infrastructure
	powerpc/pseries: Avoid NULL pointer dereference when drmem is unavailable
	NFS: Fix a page leak in nfs_destroy_unlinked_subrequests()
	ext4: fix a data race at inode->i_blocks
	fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
	ocfs2: no need try to truncate file beyond i_size
	perf tools: Support Python 3.8+ in Makefile
	s390/diag: fix display of diagnose call statistics
	Input: i8042 - add Acer Aspire 5738z to nomux list
	clk: ingenic/jz4770: Exit with error if CGU init failed
	kmod: make request_module() return an error when autoloading is disabled
	cpufreq: powernv: Fix use-after-free
	hfsplus: fix crash and filesystem corruption when deleting files
	libata: Return correct status in sata_pmp_eh_recover_pm() when ATA_DFLAG_DETACH is set
	ipmi: fix hung processes in __get_guid()
	xen/blkfront: fix memory allocation flags in blkfront_setup_indirect()
	powerpc/powernv/idle: Restore AMR/UAMOR/AMOR after idle
	powerpc/64/tm: Don't let userspace set regs->trap via sigreturn
	powerpc/hash64/devmap: Use H_PAGE_THP_HUGE when setting up huge devmap PTE entries
	powerpc/xive: Use XIVE_BAD_IRQ instead of zero to catch non configured IPIs
	powerpc/kprobes: Ignore traps that happened in real mode
	scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug
	powerpc: Add attributes for setjmp/longjmp
	powerpc: Make setjmp/longjmp signature standard
	btrfs: use nofs allocations for running delayed items
	dm zoned: remove duplicate nr_rnd_zones increase in dmz_init_zone()
	crypto: caam - update xts sector size for large input length
	crypto: ccree - improve error handling
	crypto: ccree - zero out internal struct before use
	crypto: ccree - don't mangle the request assoclen
	crypto: ccree - dec auth tag size from cryptlen map
	crypto: ccree - only try to map auth tag if needed
	Revert "drm/dp_mst: Remove VCPI while disabling topology mgr"
	drm/dp_mst: Fix clearing payload state on topology disable
	drm: Remove PageReserved manipulation from drm_pci_alloc
	ftrace/kprobe: Show the maxactive number on kprobe_events
	powerpc/fsl_booke: Avoid creating duplicate tlb1 entry
	misc: echo: Remove unnecessary parentheses and simplify check for zero
	etnaviv: perfmon: fix total and idle HI cyleces readout
	mfd: dln2: Fix sanity checking for endpoints
	efi/x86: Fix the deletion of variables in mixed mode
	Linux 4.19.116

Change-Id: If09fbb53fcb11ea01eaaa7fee7ed21ed6234f352
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2020-04-18 13:33:51 +02:00
Eric W. Biederman
a2a1be2de7 signal: Extend exec_id to 64bits
commit d1e7fd6462ca9fc76650fbe6ca800e35b24267da upstream.

Replace the 32bit exec_id with a 64bit exec_id to make it impossible
to wrap the exec_id counter.  With care an attacker can cause exec_id
wrap and send arbitrary signals to a newly exec'd parent.  This
bypasses the signal sending checks if the parent changes their
credentials during exec.

The severity of this problem can been seen that in my limited testing
of a 32bit exec_id it can take as little as 19s to exec 65536 times.
Which means that it can take as little as 14 days to wrap a 32bit
exec_id.  Adam Zabrocki has succeeded wrapping the self_exe_id in 7
days.  Even my slower timing is in the uptime of a typical server.
Which means self_exec_id is simply a speed bump today, and if exec
gets noticably faster self_exec_id won't even be a speed bump.

Extending self_exec_id to 64bits introduces a problem on 32bit
architectures where reading self_exec_id is no longer atomic and can
take two read instructions.  Which means that is is possible to hit
a window where the read value of exec_id does not match the written
value.  So with very lucky timing after this change this still
remains expoiltable.

I have updated the update of exec_id on exec to use WRITE_ONCE
and the read of exec_id in do_notify_parent to use READ_ONCE
to make it clear that there is no locking between these two
locations.

Link: https://lore.kernel.org/kernel-hardening/20200324215049.GA3710@pi3.com.pl
Fixes: 2.3.23pre2
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-17 10:48:47 +02:00
Greg Kroah-Hartman
417d28a44d This is the 4.19.112 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl50oUYACgkQONu9yGCS
 aT5w2Q//TEz14rFZh31L+QyAmW2lPGb9UnukLgUjs5RQKbRL/Bl3h5hU0EbYMl4i
 fdb09WsU3Ns2twelCnzWqsXPgc3GiDrSMgOfDR1jJpELzAk4WGgxdYoVj2Wkyi6x
 Uus1KDH5tEgwvHMMjwszDxi8N3behr3IidcvxN6EnTRbmSmLck2rlfp1Y0q4hqF9
 /CDIHxYfArXFVS0sxG3tGduf2wK2XieHr2YLTetcXs8W0GV7KmrSxmcEOsQOjtnv
 aifFuuRY7T4XLfkxoWephKl/YZfVsCAlgOpLn5BgwfSQyXHr/X4GUG+MI7/rsQ6l
 X5cZiqoKAM8K4olnYJB23PHTmSK3AEp8sWQBRS6WxZPcZlu/eq6qnwxhnVBrDNKP
 l40CarJECD8bc3cZnKRZuQjPC2s7ay8KpsLFfSSn0bAmPNVaTF65oqfccA1OMtJd
 nuyBOQNQ4LATemtkloDI+Xxpkhng/klXH+/yVyhbaNFXkCE8XsPaAIClpLf5rVWx
 ojREnXSNfCr3GhPPskVLfjiYhBEEdwlmOB+dBUUqxSNL3IZkdRXPy2Hpv2KFmuUR
 9VNyndsFUFjKraZR29+CBIgFjgsAInvdqvyJkBAtqaAgH3u0nTK/zA/mx6yOGA6o
 ayFotBONmvfGDSRZUNoSkSHcEXHUSNucEzrR/1aMEAb+zqCEgjk=
 =nZnf
 -----END PGP SIGNATURE-----

Merge 4.19.112 into android-4.19

Changes in 4.19.112
	perf/amd/uncore: Replace manual sampling check with CAP_NO_INTERRUPT flag
	mmc: sdhci-omap: Add platform specific reset callback
	mmc: sdhci-omap: Workaround errata regarding SDR104/HS200 tuning failures (i929)
	mmc: host: Fix Kconfig warnings on keystone_defconfig
	ACPI: watchdog: Allow disabling WDAT at boot
	HID: apple: Add support for recent firmware on Magic Keyboards
	HID: i2c-hid: add Trekstor Surfbook E11B to descriptor override
	cfg80211: check reg_rule for NULL in handle_channel_custom()
	scsi: libfc: free response frame from GPN_ID
	net: usb: qmi_wwan: restore mtu min/max values after raw_ip switch
	net: ks8851-ml: Fix IRQ handling and locking
	mac80211: rx: avoid RCU list traversal under mutex
	signal: avoid double atomic counter increments for user accounting
	slip: not call free_netdev before rtnl_unlock in slip_open
	hinic: fix a irq affinity bug
	hinic: fix a bug of setting hw_ioctxt
	net: rmnet: fix NULL pointer dereference in rmnet_newlink()
	net: rmnet: fix NULL pointer dereference in rmnet_changelink()
	net: rmnet: fix suspicious RCU usage
	net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device()
	net: rmnet: do not allow to change mux id if mux id is duplicated
	net: rmnet: use upper/lower device infrastructure
	net: rmnet: fix bridge mode bugs
	net: rmnet: fix packet forwarding in rmnet bridge mode
	sfc: fix timestamp reconstruction at 16-bit rollover points
	jbd2: fix data races at struct journal_head
	wimax: i2400: fix memory leak
	wimax: i2400: Fix memory leak in i2400m_op_rfkill_sw_toggle
	mmc: sdhci-omap: Don't finish_mrq() on a command error during tuning
	mmc: sdhci-omap: Fix Tuning procedure for temperatures < -20C
	driver core: Remove the link if there is no driver with AUTO flag
	driver core: Fix adding device links to probing suppliers
	driver core: Make driver core own stateful device links
	driver core: Add device link flag DL_FLAG_AUTOPROBE_CONSUMER
	driver core: Remove device link creation limitation
	driver core: Fix creation of device links with PM-runtime flags
	net: qrtr: fix len of skb_put_padto in qrtr_node_enqueue
	ARM: 8957/1: VDSO: Match ARMv8 timer in cntvct_functional()
	ARM: 8958/1: rename missed uaccess .fixup section
	mm: slub: add missing TID bump in kmem_cache_alloc_bulk()
	HID: google: add moonball USB id
	efi: Fix debugobjects warning on 'efi_rts_work'
	ipv4: ensure rcu_read_lock() in cipso_v4_error()
	Linux 4.19.112

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I68bb3ea9d74f698994a1b958d112827a0873a0f7
2020-03-21 08:37:27 +01:00
Linus Torvalds
797479da0a signal: avoid double atomic counter increments for user accounting
[ Upstream commit fda31c50292a5062332fa0343c084bd9f46604d9 ]

When queueing a signal, we increment both the users count of pending
signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount
of the user struct itself (because we keep a reference to the user in
the signal structure in order to correctly account for it when freeing).

That turns out to be fairly expensive, because both of them are atomic
updates, and particularly under extreme signal handling pressure on big
machines, you can get a lot of cache contention on the user struct.
That can then cause horrid cacheline ping-pong when you do these
multiple accesses.

So change the reference counting to only pin the user for the _first_
pending signal, and to unpin it when the last pending signal is
dequeued.  That means that when a user sees a lot of concurrent signal
queuing - which is the only situation when this matters - the only
atomic access needed is generally the 'sigpending' count update.

This was noticed because of a particularly odd timing artifact on a
dual-socket 96C/192T Cascade Lake platform: when you get into bad
contention, on that machine for some reason seems to be much worse when
the contention happens in the upper 32-byte half of the cacheline.

As a result, the kernel test robot will-it-scale 'signal1' benchmark had
an odd performance regression simply due to random alignment of the
'struct user_struct' (and pointed to a completely unrelated and
apparently nonsensical commit for the regression).

Avoiding the double increments (and decrements on the dequeueing side,
of course) makes for much less contention and hugely improved
performance on that will-it-scale microbenchmark.

Quoting Feng Tang:

 "It makes a big difference, that the performance score is tripled! bump
  from original 17000 to 54000. Also the gap between 5.0-rc6 and
  5.0-rc6+Jiri's patch is reduced to around 2%"

[ The "2% gap" is the odd cacheline placement difference on that
  platform: under the extreme contention case, the effect of which half
  of the cacheline was hot was 5%, so with the reduced contention the
  odd timing artifact is reduced too ]

It does help in the non-contended case too, but is not nearly as
noticeable.

Reported-and-tested-by: Feng Tang <feng.tang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Philip Li <philip.li@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-20 11:55:53 +01:00
Greg Kroah-Hartman
1fca2c99f4 This is the 4.19.99 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4u6tsACgkQONu9yGCS
 aT693A//TExeDRnNnf+2v4TJorylyRr17BMxk/Ie2L5E6d2n/RWodsrOThAPU9tx
 5alNUkXCT8Jd31BUVnUoPoAQ4zSymSVi++XEf05wDeO0tQ982IESGaLmu9EC1uMF
 nnM5y4IdRYmFI1Zji4h5vRJckoYUlB6Mdg4BgMr4Q1KX7RkZYfe6bjs7DwM/uyMx
 jVXdFaQBD1H6F5W6A+GmgUZ36g9uNqzcBxxWwv5URj+q816NdI4bsxIJMF0v0WC+
 S54fmpS07QWIYKKsQBUepeSgEF4ECESOE2VoF1ICcnfakdPnDBmNgyPJPSrLmVf+
 itRUxoH1MewaOvoJrv+xsGBPmM29LcKH2oBmj5DR2Xstp7ACPs+OtXJEU9dUTDN4
 NhaSts5fIp0f4Y5mMn508pDUwYDAWDt99ZJWdx6aK/TRyUsHBgpxBQDt37BE3U5W
 PCBnObNe2b2KDAsVXLjX5iDYoA0+usFreveMo8uEP+ohfh0ANvJlRkzedYw7NquI
 ZCcT+I1P9q8aa0528tR332VLrQeYg+kG6LVi2kAabmRA/VtEsT0w90MY/eo2vuTU
 WlPmbs2yerv2HTm050e6MOgBZfPh7wP/FpbjsSXufj7EDywlfxF+1hXdwfrpPJeN
 fN3g0kepeUp7+kLzO40FLam/z5ndjAUhyN2SBaPzGsXjMkZdETk=
 =zvlh
 -----END PGP SIGNATURE-----

Merge 4.19.99 into android-4.19

Changes in 4.19.99
	Revert "efi: Fix debugobjects warning on 'efi_rts_work'"
	xfs: Sanity check flags of Q_XQUOTARM call
	i2c: stm32f7: rework slave_id allocation
	i2c: i2c-stm32f7: fix 10-bits check in slave free id search loop
	mfd: intel-lpss: Add default I2C device properties for Gemini Lake
	SUNRPC: Fix svcauth_gss_proxy_init()
	powerpc/pseries: Enable support for ibm,drc-info property
	powerpc/archrandom: fix arch_get_random_seed_int()
	tipc: update mon's self addr when node addr generated
	tipc: fix wrong timeout input for tipc_wait_for_cond()
	mt7601u: fix bbp version check in mt7601u_wait_bbp_ready
	crypto: sun4i-ss - fix big endian issues
	perf map: No need to adjust the long name of modules
	soc: aspeed: Fix snoop_file_poll()'s return type
	watchdog: sprd: Fix the incorrect pointer getting from driver data
	ipmi: Fix memory leak in __ipmi_bmc_register
	drm/sti: do not remove the drm_bridge that was never added
	ARM: dts: at91: nattis: set the PRLUD and HIPOW signals low
	ARM: dts: at91: nattis: make the SD-card slot work
	ixgbe: don't clear IPsec sa counters on HW clearing
	drm/virtio: fix bounds check in virtio_gpu_cmd_get_capset()
	iio: fix position relative kernel version
	apparmor: Fix network performance issue in aa_label_sk_perm
	ALSA: hda: fix unused variable warning
	apparmor: don't try to replace stale label in ptrace access check
	ARM: qcom_defconfig: Enable MAILBOX
	firmware: coreboot: Let OF core populate platform device
	PCI: iproc: Remove PAXC slot check to allow VF support
	bridge: br_arp_nd_proxy: set icmp6_router if neigh has NTF_ROUTER
	drm/hisilicon: hibmc: Don't overwrite fb helper surface depth
	signal/ia64: Use the generic force_sigsegv in setup_frame
	signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn
	ASoC: wm9712: fix unused variable warning
	mailbox: mediatek: Add check for possible failure of kzalloc
	IB/rxe: replace kvfree with vfree
	IB/hfi1: Add mtu check for operational data VLs
	genirq/debugfs: Reinstate full OF path for domain name
	usb: dwc3: add EXTCON dependency for qcom
	usb: gadget: fsl_udc_core: check allocation return value and cleanup on failure
	cfg80211: regulatory: make initialization more robust
	mei: replace POLL* with EPOLL* for write queues.
	drm/msm: fix unsigned comparison with less than zero
	of: Fix property name in of_node_get_device_type
	ALSA: usb-audio: update quirk for B&W PX to remove microphone
	iwlwifi: nvm: get num of hw addresses from firmware
	staging: comedi: ni_mio_common: protect register write overflow
	netfilter: nft_osf: usage from output path is not valid
	pwm: lpss: Release runtime-pm reference from the driver's remove callback
	powerpc/pseries/memory-hotplug: Fix return value type of find_aa_index
	rtlwifi: rtl8821ae: replace _rtl8821ae_mrate_idx_to_arfr_id with generic version
	RDMA/bnxt_re: Add missing spin lock initialization
	netfilter: nf_flow_table: do not remove offload when other netns's interface is down
	powerpc/kgdb: add kgdb_arch_set/remove_breakpoint()
	tipc: eliminate message disordering during binding table update
	net: socionext: Add dummy PHY register read in phy_write()
	drm/sun4i: hdmi: Fix double flag assignation
	net: hns3: add error handler for hns3_nic_init_vector_data()
	mlxsw: reg: QEEC: Add minimum shaper fields
	mlxsw: spectrum: Set minimum shaper on MC TCs
	NTB: ntb_hw_idt: replace IS_ERR_OR_NULL with regular NULL checks
	ASoC: wm97xx: fix uninitialized regmap pointer problem
	ARM: dts: bcm283x: Correct mailbox register sizes
	pcrypt: use format specifier in kobject_add
	ASoC: sun8i-codec: add missing route for ADC
	pinctrl: meson-gxl: remove invalid GPIOX tsin_a pins
	bus: ti-sysc: Add mcasp optional clocks flag
	exportfs: fix 'passing zero to ERR_PTR()' warning
	drm: rcar-du: Fix the return value in case of error in 'rcar_du_crtc_set_crc_source()'
	drm: rcar-du: Fix vblank initialization
	net: always initialize pagedlen
	drm/dp_mst: Skip validating ports during destruction, just ref
	arm64: dts: meson-gx: Add hdmi_5v regulator as hdmi tx supply
	arm64: dts: renesas: r8a7795-es1: Add missing power domains to IPMMU nodes
	net: phy: Fix not to call phy_resume() if PHY is not attached
	IB/hfi1: Correctly process FECN and BECN in packets
	OPP: Fix missing debugfs supply directory for OPPs
	IB/rxe: Fix incorrect cache cleanup in error flow
	mailbox: ti-msgmgr: Off by one in ti_msgmgr_of_xlate()
	staging: bcm2835-camera: Abort probe if there is no camera
	staging: bcm2835-camera: fix module autoloading
	switchtec: Remove immediate status check after submitting MRPC command
	ipv6: add missing tx timestamping on IPPROTO_RAW
	pinctrl: sh-pfc: r8a7740: Add missing REF125CK pin to gether_gmii group
	pinctrl: sh-pfc: r8a7740: Add missing LCD0 marks to lcd0_data24_1 group
	pinctrl: sh-pfc: r8a7791: Remove bogus ctrl marks from qspi_data4_b group
	pinctrl: sh-pfc: r8a7791: Remove bogus marks from vin1_b_data18 group
	pinctrl: sh-pfc: sh73a0: Add missing TO pin to tpu4_to3 group
	pinctrl: sh-pfc: r8a7794: Remove bogus IPSR9 field
	pinctrl: sh-pfc: r8a77970: Add missing MOD_SEL0 field
	pinctrl: sh-pfc: r8a77980: Add missing MOD_SEL0 field
	pinctrl: sh-pfc: sh7734: Add missing IPSR11 field
	pinctrl: sh-pfc: r8a77995: Remove bogus SEL_PWM[0-3]_3 configurations
	pinctrl: sh-pfc: sh7269: Add missing PCIOR0 field
	pinctrl: sh-pfc: sh7734: Remove bogus IPSR10 value
	net: hns3: fix error handling int the hns3_get_vector_ring_chain
	vxlan: changelink: Fix handling of default remotes
	Input: nomadik-ske-keypad - fix a loop timeout test
	fork,memcg: fix crash in free_thread_stack on memcg charge fail
	clk: highbank: fix refcount leak in hb_clk_init()
	clk: qoriq: fix refcount leak in clockgen_init()
	clk: ti: fix refcount leak in ti_dt_clocks_register()
	clk: socfpga: fix refcount leak
	clk: samsung: exynos4: fix refcount leak in exynos4_get_xom()
	clk: imx6q: fix refcount leak in imx6q_clocks_init()
	clk: imx6sx: fix refcount leak in imx6sx_clocks_init()
	clk: imx7d: fix refcount leak in imx7d_clocks_init()
	clk: vf610: fix refcount leak in vf610_clocks_init()
	clk: armada-370: fix refcount leak in a370_clk_init()
	clk: kirkwood: fix refcount leak in kirkwood_clk_init()
	clk: armada-xp: fix refcount leak in axp_clk_init()
	clk: mv98dx3236: fix refcount leak in mv98dx3236_clk_init()
	clk: dove: fix refcount leak in dove_clk_init()
	MIPS: BCM63XX: drop unused and broken DSP platform device
	arm64: defconfig: Re-enable bcm2835-thermal driver
	remoteproc: qcom: q6v5-mss: Add missing clocks for MSM8996
	remoteproc: qcom: q6v5-mss: Add missing regulator for MSM8996
	drm: Fix error handling in drm_legacy_addctx
	ARM: dts: r8a7743: Remove generic compatible string from iic3
	drm/etnaviv: fix some off by one bugs
	drm/fb-helper: generic: Fix setup error path
	fork, memcg: fix cached_stacks case
	IB/usnic: Fix out of bounds index check in query pkey
	RDMA/ocrdma: Fix out of bounds index check in query pkey
	RDMA/qedr: Fix out of bounds index check in query pkey
	drm/shmob: Fix return value check in shmob_drm_probe
	arm64: dts: apq8016-sbc: Increase load on l11 for SDCARD
	spi: cadence: Correct initialisation of runtime PM
	RDMA/iw_cxgb4: Fix the unchecked ep dereference
	net: phy: micrel: set soft_reset callback to genphy_soft_reset for KSZ9031
	memory: tegra: Don't invoke Tegra30+ specific memory timing setup on Tegra20
	drm/etnaviv: NULL vs IS_ERR() buf in etnaviv_core_dump()
	media: s5p-jpeg: Correct step and max values for V4L2_CID_JPEG_RESTART_INTERVAL
	kbuild: mark prepare0 as PHONY to fix external module build
	crypto: brcm - Fix some set-but-not-used warning
	crypto: tgr192 - fix unaligned memory access
	ASoC: imx-sgtl5000: put of nodes if finding codec fails
	IB/iser: Pass the correct number of entries for dma mapped SGL
	net: hns3: fix wrong combined count returned by ethtool -l
	media: tw9910: Unregister subdevice with v4l2-async
	IB/mlx5: Don't override existing ip_protocol
	rtc: cmos: ignore bogus century byte
	spi/topcliff_pch: Fix potential NULL dereference on allocation error
	net: hns3: fix bug of ethtool_ops.get_channels for VF
	ARM: dts: sun8i-a23-a33: Move NAND controller device node to sort by address
	clk: sunxi-ng: sun8i-a23: Enable PLL-MIPI LDOs when ungating it
	iwlwifi: mvm: avoid possible access out of array.
	net/mlx5: Take lock with IRQs disabled to avoid deadlock
	ip_tunnel: Fix route fl4 init in ip_md_tunnel_xmit
	arm64: dts: allwinner: h6: Move GIC device node fix base address ordering
	iwlwifi: mvm: fix A-MPDU reference assignment
	bus: ti-sysc: Fix timer handling with drop pm_runtime_irq_safe()
	tty: ipwireless: Fix potential NULL pointer dereference
	driver: uio: fix possible memory leak in __uio_register_device
	driver: uio: fix possible use-after-free in __uio_register_device
	crypto: crypto4xx - Fix wrong ppc4xx_trng_probe()/ppc4xx_trng_remove() arguments
	driver core: Fix DL_FLAG_AUTOREMOVE_SUPPLIER device link flag handling
	driver core: Avoid careless re-use of existing device links
	driver core: Do not resume suppliers under device_links_write_lock()
	driver core: Fix handling of runtime PM flags in device_link_add()
	driver core: Do not call rpm_put_suppliers() in pm_runtime_drop_link()
	ARM: dts: lpc32xx: add required clocks property to keypad device node
	ARM: dts: lpc32xx: reparent keypad controller to SIC1
	ARM: dts: lpc32xx: fix ARM PrimeCell LCD controller variant
	ARM: dts: lpc32xx: fix ARM PrimeCell LCD controller clocks property
	ARM: dts: lpc32xx: phy3250: fix SD card regulator voltage
	drm/xen-front: Fix mmap attributes for display buffers
	iwlwifi: mvm: fix RSS config command
	staging: most: cdev: add missing check for cdev_add failure
	clk: ingenic: jz4740: Fix gating of UDC clock
	rtc: ds1672: fix unintended sign extension
	thermal: mediatek: fix register index error
	arm64: dts: msm8916: remove bogus argument to the cpu clock
	ath10k: fix dma unmap direction for management frames
	net: phy: fixed_phy: Fix fixed_phy not checking GPIO
	rtc: ds1307: rx8130: Fix alarm handling
	net/smc: original socket family in inet_sock_diag
	rtc: 88pm860x: fix unintended sign extension
	rtc: 88pm80x: fix unintended sign extension
	rtc: pm8xxx: fix unintended sign extension
	fbdev: chipsfb: remove set but not used variable 'size'
	iw_cxgb4: use tos when importing the endpoint
	iw_cxgb4: use tos when finding ipv6 routes
	ipmi: kcs_bmc: handle devm_kasprintf() failure case
	xsk: add missing smp_rmb() in xsk_mmap
	drm/etnaviv: potential NULL dereference
	ntb_hw_switchtec: debug print 64bit aligned crosslink BAR Numbers
	ntb_hw_switchtec: NT req id mapping table register entry number should be 512
	pinctrl: sh-pfc: emev2: Add missing pinmux functions
	pinctrl: sh-pfc: r8a7791: Fix scifb2_data_c pin group
	pinctrl: sh-pfc: r8a7792: Fix vin1_data18_b pin group
	pinctrl: sh-pfc: sh73a0: Fix fsic_spdif pin groups
	RDMA/mlx5: Fix memory leak in case we fail to add an IB device
	driver core: Fix possible supplier PM-usage counter imbalance
	PCI: endpoint: functions: Use memcpy_fromio()/memcpy_toio()
	usb: phy: twl6030-usb: fix possible use-after-free on remove
	block: don't use bio->bi_vcnt to figure out segment number
	keys: Timestamp new keys
	net: dsa: b53: Fix default VLAN ID
	net: dsa: b53: Properly account for VLAN filtering
	net: dsa: b53: Do not program CPU port's PVID
	mt76: usb: fix possible memory leak in mt76u_buf_free
	media: sh: migor: Include missing dma-mapping header
	vfio_pci: Enable memory accesses before calling pci_map_rom
	hwmon: (pmbus/tps53679) Fix driver info initialization in probe routine
	mdio_bus: Fix PTR_ERR() usage after initialization to constant
	KVM: PPC: Release all hardware TCE tables attached to a group
	staging: r8822be: check kzalloc return or bail
	dmaengine: mv_xor: Use correct device for DMA API
	cdc-wdm: pass return value of recover_from_urb_loss
	brcmfmac: create debugfs files for bus-specific layer
	regulator: pv88060: Fix array out-of-bounds access
	regulator: pv88080: Fix array out-of-bounds access
	regulator: pv88090: Fix array out-of-bounds access
	net: dsa: qca8k: Enable delay for RGMII_ID mode
	net/mlx5: Delete unused FPGA QPN variable
	drm/nouveau/bios/ramcfg: fix missing parentheses when calculating RON
	drm/nouveau/pmu: don't print reply values if exec is false
	drm/nouveau: fix missing break in switch statement
	driver core: Fix PM-runtime for links added during consumer probe
	ASoC: qcom: Fix of-node refcount unbalance in apq8016_sbc_parse_of()
	net: dsa: fix unintended change of bridge interface STP state
	fs/nfs: Fix nfs_parse_devname to not modify it's argument
	staging: rtlwifi: Use proper enum for return in halmac_parse_psd_data_88xx
	powerpc/64s: Fix logic when handling unknown CPU features
	NFS: Fix a soft lockup in the delegation recovery code
	perf: Copy parent's address filter offsets on clone
	perf, pt, coresight: Fix address filters for vmas with non-zero offset
	clocksource/drivers/sun5i: Fail gracefully when clock rate is unavailable
	clocksource/drivers/exynos_mct: Fix error path in timer resources initialization
	platform/x86: wmi: fix potential null pointer dereference
	NFS/pnfs: Bulk destroy of layouts needs to be safe w.r.t. umount
	mmc: sdhci-brcmstb: handle mmc_of_parse() errors during probe
	iommu: Fix IOMMU debugfs fallout
	ARM: 8847/1: pm: fix HYP/SVC mode mismatch when MCPM is used
	ARM: 8848/1: virt: Align GIC version check with arm64 counterpart
	ARM: 8849/1: NOMMU: Fix encodings for PMSAv8's PRBAR4/PRLAR4
	regulator: wm831x-dcdc: Fix list of wm831x_dcdc_ilim from mA to uA
	ath10k: Fix length of wmi tlv command for protected mgmt frames
	netfilter: nft_set_hash: fix lookups with fixed size hash on big endian
	netfilter: nft_set_hash: bogus element self comparison from deactivation path
	net: sched: act_csum: Fix csum calc for tagged packets
	hwrng: bcm2835 - fix probe as platform device
	iommu/vt-d: Fix NULL pointer reference in intel_svm_bind_mm()
	NFS: Add missing encode / decode sequence_maxsz to v4.2 operations
	NFSv4/flexfiles: Fix invalid deref in FF_LAYOUT_DEVID_NODE()
	net: aquantia: fixed instack structure overflow
	powerpc/mm: Check secondary hash page table
	media: dvb/earth-pt1: fix wrong initialization for demod blocks
	rbd: clear ->xferred on error from rbd_obj_issue_copyup()
	PCI: Fix "try" semantics of bus and slot reset
	nios2: ksyms: Add missing symbol exports
	x86/mm: Remove unused variable 'cpu'
	scsi: megaraid_sas: reduce module load time
	nfp: fix simple vNIC mailbox length
	drivers/rapidio/rio_cm.c: fix potential oops in riocm_ch_listen()
	xen, cpu_hotplug: Prevent an out of bounds access
	net/mlx5: Fix multiple updates of steering rules in parallel
	net/mlx5e: IPoIB, Fix RX checksum statistics update
	net: sh_eth: fix a missing check of of_get_phy_mode
	regulator: lp87565: Fix missing register for LP87565_BUCK_0
	soc: amlogic: gx-socinfo: Add mask for each SoC packages
	media: ivtv: update *pos correctly in ivtv_read_pos()
	media: cx18: update *pos correctly in cx18_read_pos()
	media: wl128x: Fix an error code in fm_download_firmware()
	media: cx23885: check allocation return
	regulator: tps65086: Fix tps65086_ldoa1_ranges for selector 0xB
	crypto: ccree - reduce kernel stack usage with clang
	jfs: fix bogus variable self-initialization
	tipc: tipc clang warning
	m68k: mac: Fix VIA timer counter accesses
	ARM: dts: sun8i: a33: Reintroduce default pinctrl muxing
	arm64: dts: allwinner: a64: Add missing PIO clocks
	ARM: dts: sun9i: optimus: Fix fixed-regulators
	net: phy: don't clear BMCR in genphy_soft_reset
	ARM: OMAP2+: Fix potentially uninitialized return value for _setup_reset()
	net: dsa: Avoid null pointer when failing to connect to PHY
	soc: qcom: cmd-db: Fix an error code in cmd_db_dev_probe()
	media: davinci-isif: avoid uninitialized variable use
	media: tw5864: Fix possible NULL pointer dereference in tw5864_handle_frame
	spi: tegra114: clear packed bit for unpacked mode
	spi: tegra114: fix for unpacked mode transfers
	spi: tegra114: terminate dma and reset on transfer timeout
	spi: tegra114: flush fifos
	spi: tegra114: configure dma burst size to fifo trig level
	bus: ti-sysc: Fix sysc_unprepare() when no clocks have been allocated
	soc/fsl/qe: Fix an error code in qe_pin_request()
	spi: bcm2835aux: fix driver to not allow 65535 (=-1) cs-gpios
	drm/fb-helper: generic: Call drm_client_add() after setup is done
	arm64/vdso: don't leak kernel addresses
	rtc: Fix timestamp value for RTC_TIMESTAMP_BEGIN_1900
	rtc: mt6397: Don't call irq_dispose_mapping.
	ehea: Fix a copy-paste err in ehea_init_port_res
	bpf: Add missed newline in verifier verbose log
	drm/vmwgfx: Remove set but not used variable 'restart'
	scsi: qla2xxx: Unregister chrdev if module initialization fails
	of: use correct function prototype for of_overlay_fdt_apply()
	net/sched: cbs: fix port_rate miscalculation
	clk: qcom: Skip halt checks on gcc_pcie_0_pipe_clk for 8998
	ACPI: button: reinitialize button state upon resume
	firmware: arm_scmi: fix of_node leak in scmi_mailbox_check
	rxrpc: Fix detection of out of order acks
	scsi: target/core: Fix a race condition in the LUN lookup code
	brcmfmac: fix leak of mypkt on error return path
	ARM: pxa: ssp: Fix "WARNING: invalid free of devm_ allocated data"
	PCI: rockchip: Fix rockchip_pcie_ep_assert_intx() bitwise operations
	net: hns3: fix for vport->bw_limit overflow problem
	hwmon: (w83627hf) Use request_muxed_region for Super-IO accesses
	perf/core: Fix the address filtering fix
	staging: android: vsoc: fix copy_from_user overrun
	PCI: dwc: Fix dw_pcie_ep_find_capability() to return correct capability offset
	soc: amlogic: meson-gx-pwrc-vpu: Fix power on/off register bitmask
	platform/x86: alienware-wmi: fix kfree on potentially uninitialized pointer
	tipc: set sysctl_tipc_rmem and named_timeout right range
	usb: typec: tcpm: Notify the tcpc to start connection-detection for SRPs
	selftests/ipc: Fix msgque compiler warnings
	net: hns3: fix loop condition of hns3_get_tx_timeo_queue_info()
	powerpc: vdso: Make vdso32 installation conditional in vdso_install
	ARM: dts: ls1021: Fix SGMII PCS link remaining down after PHY disconnect
	media: ov2659: fix unbalanced mutex_lock/unlock
	6lowpan: Off by one handling ->nexthdr
	dmaengine: axi-dmac: Don't check the number of frames for alignment
	ALSA: usb-audio: Handle the error from snd_usb_mixer_apply_create_quirk()
	afs: Fix AFS file locking to allow fine grained locks
	afs: Further fix file locking
	NFS: Don't interrupt file writeout due to fatal errors
	coresight: catu: fix clang build warning
	s390/kexec_file: Fix potential segment overlap in ELF loader
	irqchip/gic-v3-its: fix some definitions of inner cacheability attributes
	scsi: qla2xxx: Fix a format specifier
	scsi: qla2xxx: Fix error handling in qlt_alloc_qfull_cmd()
	scsi: qla2xxx: Avoid that qlt_send_resp_ctio() corrupts memory
	KVM: PPC: Book3S HV: Fix lockdep warning when entering the guest
	netfilter: nft_flow_offload: add entry to flowtable after confirmation
	PCI: iproc: Enable iProc config read for PAXBv2
	ARM: dts: logicpd-som-lv: Fix MMC1 card detect
	packet: in recvmsg msg_name return at least sizeof sockaddr_ll
	ASoC: fix valid stream condition
	usb: gadget: fsl: fix link error against usb-gadget module
	dwc2: gadget: Fix completed transfer size calculation in DDMA
	IB/mlx5: Add missing XRC options to QP optional params mask
	RDMA/rxe: Consider skb reserve space based on netdev of GID
	iommu/vt-d: Make kernel parameter igfx_off work with vIOMMU
	net: ena: fix swapped parameters when calling ena_com_indirect_table_fill_entry
	net: ena: fix: Free napi resources when ena_up() fails
	net: ena: fix incorrect test of supported hash function
	net: ena: fix ena_com_fill_hash_function() implementation
	dmaengine: tegra210-adma: restore channel status
	watchdog: rtd119x_wdt: Fix remove function
	mmc: core: fix possible use after free of host
	lightnvm: pblk: fix lock order in pblk_rb_tear_down_check
	ath10k: Fix encoding for protected management frames
	afs: Fix the afs.cell and afs.volume xattr handlers
	vfio/mdev: Avoid release parent reference during error path
	vfio/mdev: Follow correct remove sequence
	vfio/mdev: Fix aborting mdev child device removal if one fails
	l2tp: Fix possible NULL pointer dereference
	ALSA: aica: Fix a long-time build breakage
	media: omap_vout: potential buffer overflow in vidioc_dqbuf()
	media: davinci/vpbe: array underflow in vpbe_enum_outputs()
	platform/x86: alienware-wmi: printing the wrong error code
	crypto: caam - fix caam_dump_sg that iterates through scatterlist
	netfilter: ebtables: CONFIG_COMPAT: reject trailing data after last rule
	pwm: meson: Consider 128 a valid pre-divider
	pwm: meson: Don't disable PWM when setting duty repeatedly
	ARM: riscpc: fix lack of keyboard interrupts after irq conversion
	nfp: bpf: fix static check error through tightening shift amount adjustment
	kdb: do a sanity check on the cpu in kdb_per_cpu()
	netfilter: nf_tables: correct NFT_LOGLEVEL_MAX value
	backlight: lm3630a: Return 0 on success in update_status functions
	thermal: rcar_gen3_thermal: fix interrupt type
	thermal: cpu_cooling: Actually trace CPU load in thermal_power_cpu_get_power
	EDAC/mc: Fix edac_mc_find() in case no device is found
	afs: Fix key leak in afs_release() and afs_evict_inode()
	afs: Don't invalidate callback if AFS_VNODE_DIR_VALID not set
	afs: Fix lock-wait/callback-break double locking
	afs: Fix double inc of vnode->cb_break
	ARM: dts: sun8i-h3: Fix wifi in Beelink X2 DT
	clk: meson: gxbb: no spread spectrum on mpll0
	clk: meson: axg: spread spectrum is on mpll2
	dmaengine: tegra210-adma: Fix crash during probe
	arm64: dts: meson: libretech-cc: set eMMC as removable
	RDMA/qedr: Fix incorrect device rate.
	spi: spi-fsl-spi: call spi_finalize_current_message() at the end
	crypto: ccp - fix AES CFB error exposed by new test vectors
	crypto: ccp - Fix 3DES complaint from ccp-crypto module
	serial: stm32: fix word length configuration
	serial: stm32: fix rx error handling
	serial: stm32: fix rx data length when parity enabled
	serial: stm32: fix transmit_chars when tx is stopped
	serial: stm32: Add support of TC bit status check
	serial: stm32: fix wakeup source initialization
	misc: sgi-xp: Properly initialize buf in xpc_get_rsvd_page_pa
	iommu: Add missing new line for dma type
	iommu: Use right function to get group for device
	signal/bpfilter: Fix bpfilter_kernl to use send_sig not force_sig
	signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig
	inet: frags: call inet_frags_fini() after unregister_pernet_subsys()
	net: hns3: fix a memory leak issue for hclge_map_unmap_ring_to_vf_vector
	crypto: talitos - fix AEAD processing.
	netvsc: unshare skb in VF rx handler
	net: core: support XDP generic on stacked devices.
	RDMA/uverbs: check for allocation failure in uapi_add_elm()
	net: don't clear sock->sk early to avoid trouble in strparser
	phy: qcom-qusb2: fix missing assignment of ret when calling clk_prepare_enable
	cpufreq: brcmstb-avs-cpufreq: Fix initial command check
	cpufreq: brcmstb-avs-cpufreq: Fix types for voltage/frequency
	clk: sunxi-ng: sun50i-h6-r: Fix incorrect W1 clock gate register
	media: vivid: fix incorrect assignment operation when setting video mode
	crypto: inside-secure - fix zeroing of the request in ahash_exit_inv
	crypto: inside-secure - fix queued len computation
	arm64: dts: renesas: ebisu: Remove renesas, no-ether-link property
	mpls: fix warning with multi-label encap
	serial: stm32: fix a recursive locking in stm32_config_rs485
	arm64: dts: meson-gxm-khadas-vim2: fix gpio-keys-polled node
	arm64: dts: meson-gxm-khadas-vim2: fix Bluetooth support
	iommu/vt-d: Duplicate iommu_resv_region objects per device list
	phy: usb: phy-brcm-usb: Remove sysfs attributes upon driver removal
	firmware: arm_scmi: fix bitfield definitions for SENSOR_DESC attributes
	firmware: arm_scmi: update rate_discrete in clock_describe_rates_get
	ntb_hw_switchtec: potential shift wrapping bug in switchtec_ntb_init_sndev()
	ASoC: meson: axg-tdmin: right_j is not supported
	ASoC: meson: axg-tdmout: right_j is not supported
	qed: iWARP - Use READ_ONCE and smp_store_release to access ep->state
	qed: iWARP - fix uninitialized callback
	powerpc/cacheinfo: add cacheinfo_teardown, cacheinfo_rebuild
	powerpc/pseries/mobility: rebuild cacheinfo hierarchy post-migration
	bpf: fix the check that forwarding is enabled in bpf_ipv6_fib_lookup
	IB/hfi1: Handle port down properly in pio
	drm/msm/mdp5: Fix mdp5_cfg_init error return
	net: netem: fix backlog accounting for corrupted GSO frames
	net/udp_gso: Allow TX timestamp with UDP GSO
	net/af_iucv: build proper skbs for HiperTransport
	net/af_iucv: always register net_device notifier
	ASoC: ti: davinci-mcasp: Fix slot mask settings when using multiple AXRs
	rtc: pcf8563: Fix interrupt trigger method
	rtc: pcf8563: Clear event flags and disable interrupts before requesting irq
	ARM: dts: iwg20d-q7-common: Fix SDHI1 VccQ regularor
	net/sched: cbs: Fix error path of cbs_module_init
	arm64: dts: allwinner: h6: Pine H64: Add interrupt line for RTC
	drm/msm/a3xx: remove TPL1 regs from snapshot
	ip6_fib: Don't discard nodes with valid routing information in fib6_locate_1()
	perf/ioctl: Add check for the sample_period value
	dmaengine: hsu: Revert "set HSU_CH_MTSR to memory width"
	clk: qcom: Fix -Wunused-const-variable
	nvmem: imx-ocotp: Ensure WAIT bits are preserved when setting timing
	nvmem: imx-ocotp: Change TIMING calculation to u-boot algorithm
	tools: bpftool: use correct argument in cgroup errors
	backlight: pwm_bl: Fix heuristic to determine number of brightness levels
	fork,memcg: alloc_thread_stack_node needs to set tsk->stack
	bnxt_en: Fix ethtool selftest crash under error conditions.
	bnxt_en: Suppress error messages when querying DSCP DCB capabilities.
	iommu/amd: Make iommu_disable safer
	mfd: intel-lpss: Release IDA resources
	rxrpc: Fix uninitialized error code in rxrpc_send_data_packet()
	xprtrdma: Fix use-after-free in rpcrdma_post_recvs
	um: Fix IRQ controller regression on console read
	PM: ACPI/PCI: Resume all devices during hibernation
	ACPI: PM: Simplify and fix PM domain hibernation callbacks
	ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS
	fsi/core: Fix error paths on CFAM init
	devres: allow const resource arguments
	fsi: sbefifo: Don't fail operations when in SBE IPL state
	RDMA/hns: Fixs hw access invalid dma memory error
	PCI: mobiveil: Remove the flag MSI_FLAG_MULTI_PCI_MSI
	PCI: mobiveil: Fix devfn check in mobiveil_pcie_valid_device()
	PCI: mobiveil: Fix the valid check for inbound and outbound windows
	ceph: fix "ceph.dir.rctime" vxattr value
	net: pasemi: fix an use-after-free in pasemi_mac_phy_init()
	net/tls: fix socket wmem accounting on fallback with netem
	x86/pgtable/32: Fix LOWMEM_PAGES constant
	xdp: fix possible cq entry leak
	ARM: stm32: use "depends on" instead of "if" after prompt
	scsi: libfc: fix null pointer dereference on a null lport
	xfrm interface: ifname may be wrong in logs
	drm/panel: make drm_panel.h self-contained
	clk: sunxi-ng: v3s: add the missing PLL_DDR1
	PM: sleep: Fix possible overflow in pm_system_cancel_wakeup()
	libertas_tf: Use correct channel range in lbtf_geo_init
	qed: reduce maximum stack frame size
	usb: host: xhci-hub: fix extra endianness conversion
	media: rcar-vin: Clean up correct notifier in error path
	mic: avoid statically declaring a 'struct device'.
	x86/kgbd: Use NMI_VECTOR not APIC_DM_NMI
	crypto: ccp - Reduce maximum stack usage
	ALSA: aoa: onyx: always initialize register read value
	arm64: dts: renesas: r8a77995: Fix register range of display node
	tipc: reduce risk of wakeup queue starvation
	ARM: dts: stm32: add missing vdda-supply to adc on stm32h743i-eval
	net/mlx5: Fix mlx5_ifc_query_lag_out_bits
	cifs: fix rmmod regression in cifs.ko caused by force_sig changes
	iio: tsl2772: Use devm_add_action_or_reset for tsl2772_chip_off
	net: fix bpf_xdp_adjust_head regression for generic-XDP
	spi: bcm-qspi: Fix BSPI QUAD and DUAL mode support when using flex mode
	cxgb4: smt: Add lock for atomic_dec_and_test
	crypto: caam - free resources in case caam_rng registration failed
	ext4: set error return correctly when ext4_htree_store_dirent fails
	RDMA/hns: Bugfix for slab-out-of-bounds when unloading hip08 driver
	RDMA/hns: bugfix for slab-out-of-bounds when loading hip08 driver
	ASoC: es8328: Fix copy-paste error in es8328_right_line_controls
	ASoC: cs4349: Use PM ops 'cs4349_runtime_pm'
	ASoC: wm8737: Fix copy-paste error in wm8737_snd_controls
	net/rds: Add a few missing rds_stat_names entries
	tools: bpftool: fix arguments for p_err() in do_event_pipe()
	tools: bpftool: fix format strings and arguments for jsonw_printf()
	drm: rcar-du: lvds: Fix bridge_to_rcar_lvds
	bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails
	signal: Allow cifs and drbd to receive their terminating signals
	powerpc/64s/radix: Fix memory hot-unplug page table split
	ASoC: sun4i-i2s: RX and TX counter registers are swapped
	dmaengine: dw: platform: Switch to acpi_dma_controller_register()
	rtc: rv3029: revert error handling patch to rv3029_eeprom_write()
	mac80211: minstrel_ht: fix per-group max throughput rate initialization
	i40e: reduce stack usage in i40e_set_fc
	media: atmel: atmel-isi: fix timeout value for stop streaming
	ARM: 8896/1: VDSO: Don't leak kernel addresses
	rtc: pcf2127: bugfix: read rtc disables watchdog
	mips: avoid explicit UB in assignment of mips_io_port_base
	media: em28xx: Fix exception handling in em28xx_alloc_urbs()
	iommu/mediatek: Fix iova_to_phys PA start for 4GB mode
	ahci: Do not export local variable ahci_em_messages
	rxrpc: Fix lack of conn cleanup when local endpoint is cleaned up [ver #2]
	Partially revert "kfifo: fix kfifo_alloc() and kfifo_init()"
	hwmon: (lm75) Fix write operations for negative temperatures
	net/sched: cbs: Set default link speed to 10 Mbps in cbs_set_port_rate
	power: supply: Init device wakeup after device_add()
	x86, perf: Fix the dependency of the x86 insn decoder selftest
	staging: greybus: light: fix a couple double frees
	irqdomain: Add the missing assignment of domain->fwnode for named fwnode
	bcma: fix incorrect update of BCMA_CORE_PCI_MDIO_DATA
	usb: typec: tps6598x: Fix build error without CONFIG_REGMAP_I2C
	bcache: Fix an error code in bch_dump_read()
	iio: dac: ad5380: fix incorrect assignment to val
	netfilter: ctnetlink: honor IPS_OFFLOAD flag
	ath9k: dynack: fix possible deadlock in ath_dynack_node_{de}init
	wcn36xx: use dynamic allocation for large variables
	tty: serial: fsl_lpuart: Use appropriate lpuart32_* I/O funcs
	ARM: dts: aspeed-g5: Fixe gpio-ranges upper limit
	xsk: avoid store-tearing when assigning queues
	xsk: avoid store-tearing when assigning umem
	led: triggers: Fix dereferencing of null pointer
	net: sonic: return NETDEV_TX_OK if failed to map buffer
	net: hns3: fix error VF index when setting VLAN offload
	rtlwifi: Fix file release memory leak
	ARM: dts: logicpd-som-lv: Fix i2c2 and i2c3 Pin mux
	f2fs: fix wrong error injection path in inc_valid_block_count()
	f2fs: fix error path of f2fs_convert_inline_page()
	scsi: fnic: fix msix interrupt allocation
	Btrfs: fix hang when loading existing inode cache off disk
	Btrfs: fix inode cache waiters hanging on failure to start caching thread
	Btrfs: fix inode cache waiters hanging on path allocation failure
	btrfs: use correct count in btrfs_file_write_iter()
	ixgbe: sync the first fragment unconditionally
	hwmon: (shtc1) fix shtc1 and shtw1 id mask
	net: sonic: replace dev_kfree_skb in sonic_send_packet
	pinctrl: iproc-gpio: Fix incorrect pinconf configurations
	gpio/aspeed: Fix incorrect number of banks
	ath10k: adjust skb length in ath10k_sdio_mbox_rx_packet
	RDMA/cma: Fix false error message
	net/rds: Fix 'ib_evt_handler_call' element in 'rds_ib_stat_names'
	um: Fix off by one error in IRQ enumeration
	bnxt_en: Increase timeout for HWRM_DBG_COREDUMP_XX commands
	f2fs: fix to avoid accessing uninitialized field of inode page in is_alive()
	mailbox: qcom-apcs: fix max_register value
	clk: actions: Fix factor clk struct member access
	powerpc/mm/mce: Keep irqs disabled during lockless page table walk
	bpf: fix BTF limits
	crypto: hisilicon - Matching the dma address for dma_pool_free()
	iommu/amd: Wait for completion of IOTLB flush in attach_device
	net: aquantia: Fix aq_vec_isr_legacy() return value
	cxgb4: Signedness bug in init_one()
	net: hisilicon: Fix signedness bug in hix5hd2_dev_probe()
	net: broadcom/bcmsysport: Fix signedness in bcm_sysport_probe()
	net: netsec: Fix signedness bug in netsec_probe()
	net: socionext: Fix a signedness bug in ave_probe()
	net: stmmac: dwmac-meson8b: Fix signedness bug in probe
	net: axienet: fix a signedness bug in probe
	of: mdio: Fix a signedness bug in of_phy_get_and_connect()
	net: nixge: Fix a signedness bug in nixge_probe()
	net: ethernet: stmmac: Fix signedness bug in ipq806x_gmac_of_parse()
	net: sched: cbs: Avoid division by zero when calculating the port rate
	nvme: retain split access workaround for capability reads
	net: stmmac: gmac4+: Not all Unicast addresses may be available
	rxrpc: Fix trace-after-put looking at the put connection record
	mac80211: accept deauth frames in IBSS mode
	llc: fix another potential sk_buff leak in llc_ui_sendmsg()
	llc: fix sk_buff refcounting in llc_conn_state_process()
	ip6erspan: remove the incorrect mtu limit for ip6erspan
	net: stmmac: fix length of PTP clock's name string
	net: stmmac: fix disabling flexible PPS output
	sctp: add chunks to sk_backlog when the newsk sk_socket is not set
	s390/qeth: Fix error handling during VNICC initialization
	s390/qeth: Fix initialization of vnicc cmd masks during set online
	act_mirred: Fix mirred_init_module error handling
	net: avoid possible false sharing in sk_leave_memory_pressure()
	net: add {READ|WRITE}_ONCE() annotations on ->rskq_accept_head
	tcp: annotate lockless access to tcp_memory_pressure
	net/smc: receive returns without data
	net/smc: receive pending data after RCV_SHUTDOWN
	drm/msm/dsi: Implement reset correctly
	vhost/test: stop device before reset
	dmaengine: imx-sdma: fix size check for sdma script_number
	firmware: dmi: Fix unlikely out-of-bounds read in save_mem_devices
	arm64: hibernate: check pgd table allocation
	net: netem: fix error path for corrupted GSO frames
	net: netem: correct the parent's backlog when corrupted packet was dropped
	xsk: Fix registration of Rx-only sockets
	bpf, offload: Unlock on error in bpf_offload_dev_create()
	afs: Fix missing timeout reset
	net: qca_spi: Move reset_count to struct qcaspi
	hv_netvsc: Fix offset usage in netvsc_send_table()
	hv_netvsc: Fix send_table offset in case of a host bug
	afs: Fix large file support
	drm: panel-lvds: Potential Oops in probe error handling
	hwrng: omap3-rom - Fix missing clock by probing with device tree
	dpaa_eth: perform DMA unmapping before read
	dpaa_eth: avoid timestamp read on error paths
	MIPS: Loongson: Fix return value of loongson_hwmon_init
	hv_netvsc: flag software created hash value
	net: neigh: use long type to store jiffies delta
	packet: fix data-race in fanout_flow_is_huge()
	i2c: stm32f7: report dma error during probe
	mmc: sdio: fix wl1251 vendor id
	mmc: core: fix wl1251 sdio quirks
	affs: fix a memory leak in affs_remount
	afs: Remove set but not used variables 'before', 'after'
	dmaengine: ti: edma: fix missed failure handling
	drm/radeon: fix bad DMA from INTERRUPT_CNTL2
	arm64: dts: juno: Fix UART frequency
	samples/bpf: Fix broken xdp_rxq_info due to map order assumptions
	usb: dwc3: Allow building USB_DWC3_QCOM without EXTCON
	IB/iser: Fix dma_nents type definition
	serial: stm32: fix clearing interrupt error flags
	arm64: dts: meson-gxm-khadas-vim2: fix uart_A bluetooth node
	m68k: Call timer_interrupt() with interrupts disabled
	Linux 4.19.99

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ieabeab79ea5c8cb4b6b1552702fa5d6100cea5db
2020-01-27 15:55:44 +01:00
Eric W. Biederman
6db0e28b89 signal: Allow cifs and drbd to receive their terminating signals
[ Upstream commit 33da8e7c814f77310250bb54a9db36a44c5de784 ]

My recent to change to only use force_sig for a synchronous events
wound up breaking signal reception cifs and drbd.  I had overlooked
the fact that by default kthreads start out with all signals set to
SIG_IGN.  So a change I thought was safe turned out to have made it
impossible for those kernel thread to catch their signals.

Reverting the work on force_sig is a bad idea because what the code
was doing was very much a misuse of force_sig.  As the way force_sig
ultimately allowed the signal to happen was to change the signal
handler to SIG_DFL.  Which after the first signal will allow userspace
to send signals to these kernel threads.  At least for
wake_ack_receiver in drbd that does not appear actively wrong.

So correct this problem by adding allow_kernel_signal that will allow
signals whose siginfo reports they were sent by the kernel through,
but will not allow userspace generated signals, and update cifs and
drbd to call allow_kernel_signal in an appropriate place so that their
thread can receive this signal.

Fixing things this way ensures that userspace won't be able to send
signals and cause problems, that it is clear which signals the
threads are expecting to receive, and it guarantees that nothing
else in the system will be affected.

This change was partly inspired by similar cifs and drbd patches that
added allow_signal.

Reported-by: ronnie sahlberg <ronniesahlberg@gmail.com>
Reported-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Tested-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: Steve French <smfrench@gmail.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Fixes: 247bc9470b1e ("cifs: fix rmmod regression in cifs.ko caused by force_sig changes")
Fixes: 72abe3bcf091 ("signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig")
Fixes: fee109901f39 ("signal/drbd: Use send_sig not force_sig")
Fixes: 3cf5d076fb4d ("signal: Remove task parameter from force_sig")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27 14:51:05 +01:00
Greg Kroah-Hartman
44b82a3d1b This is the 4.19.85 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl3VfEoACgkQONu9yGCS
 aT7vHRAAv3fZQ5+Rn0zn0cgYsgG5OGbtHL01aJB99g2Dgf/VmB3OrB2rx+ZF7WVw
 Uakab5XZp6rLSxG4LNQy7jjIuxADdDab5xWTlhqpEHVydsFC9IOktT91DW2luf8Y
 Xyr8q7sQIS7eV67NkUnUSqri1IdsRNB5qeWmhC0l6+PSuQrk+WF0y5B4TtrjF5Er
 GjYTq9RTJh7/luFKUSmxN8+TIwo4uY15b3oqX75LMPObzbH+c5iqp5QiHJh/BQ7/
 awf7kxlMay0V/hPRmGomHxX70TgHTF2er0b+HyJwf1OX0zgKycsztWZT+p7qN+DT
 yjPWwYJ3kGs/7GwZL7HNhk8p/3aDf9HFHFvbVSty63wgZ8dfo4EuXZ9YfWa+lfI8
 Kn4wKeynUvrvNLH9iYug/XuEPjXysQeSlBaL4pZTPTWtipu1MP0OpR05l8UzO2cO
 lqWgf0Y7wsunZBeyCLkWd9TCO7gd1s7csdkJAy37rG7mCjN3p83NeMznLlj+H4I8
 MHlcAWdlxlWWitKohi0kr/VYiHmhBVsOZu4rQmuCBWuo++HrWwn7XaGBzYsP8Eku
 7ZNaS5oJFAjBzKnQxp8i3mgE8ifODuokgPISImyyRWidedfoHcv6Kr+pdEoQ+gjk
 nL5xwqKAMsh/vMyxVmetzytULHtvBqJelquzQcfnanyEvBoS46Q=
 =EUxi
 -----END PGP SIGNATURE-----

Merge 4.19.85 into android-4.19

Changes in 4.19.85
	KVM: x86: introduce is_pae_paging
	MIPS: BCM63XX: fix switch core reset on BCM6368
	scsi: core: Handle drivers which set sg_tablesize to zero
	ax88172a: fix information leak on short answers
	ipmr: Fix skb headroom in ipmr_get_route().
	net: gemini: add missed free_netdev
	net: usb: qmi_wwan: add support for Foxconn T77W968 LTE modules
	slip: Fix memory leak in slip_open error path
	ALSA: usb-audio: Fix missing error check at mixer resolution test
	ALSA: usb-audio: not submit urb for stopped endpoint
	ALSA: usb-audio: Fix incorrect NULL check in create_yamaha_midi_quirk()
	ALSA: usb-audio: Fix incorrect size check for processing/extension units
	Btrfs: fix log context list corruption after rename exchange operation
	Input: ff-memless - kill timer in destroy()
	Input: synaptics-rmi4 - fix video buffer size
	Input: synaptics-rmi4 - disable the relative position IRQ in the F12 driver
	Input: synaptics-rmi4 - do not consume more data than we have (F11, F12)
	Input: synaptics-rmi4 - clear IRQ enables for F54
	Input: synaptics-rmi4 - destroy F54 poller workqueue when removing
	IB/hfi1: Ensure full Gen3 speed in a Gen4 system
	IB/hfi1: Use a common pad buffer for 9B and 16B packets
	i2c: acpi: Force bus speed to 400KHz if a Silead touchscreen is present
	ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stable
	ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either
	net: ethernet: dwmac-sun8i: Use the correct function in exit path
	iommu/vt-d: Fix QI_DEV_IOTLB_PFSID and QI_DEV_EIOTLB_PFSID macros
	mm: mempolicy: fix the wrong return value and potential pages leak of mbind
	mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()
	mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()
	mmc: sdhci-of-at91: fix quirk2 overwrite
	iio: adc: max9611: explicitly cast gain_selectors
	tee: optee: take DT status property into account
	ath10k: fix kernel panic by moving pci flush after napi_disable
	iio: dac: mcp4922: fix error handling in mcp4922_write_raw
	clk: sunxi-ng: h6: fix PWM gate/reset offset
	soundwire: Initialize completion for defer messages
	soundwire: intel: Fix uninitialized adev deref
	arm64: dts: allwinner: a64: Orange Pi Win: Fix SD card node
	arm64: dts: allwinner: a64: Olinuxino: fix DRAM voltage
	arm64: dts: allwinner: a64: NanoPi-A64: Fix DCDC1 voltage
	ALSA: pcm: signedness bug in snd_pcm_plug_alloc()
	soc/tegra: pmc: Fix pad voltage configuration for Tegra186
	arm64: dts: tegra210-p2180: Correct sdmmc4 vqmmc-supply
	y2038: make do_gettimeofday() and get_seconds() inline
	ARM: dts: rcar: Correct SATA device sizes to 2 MiB
	ARM: dts: at91/trivial: Fix USART1 definition for at91sam9g45
	rtc: sysfs: fix NULL check in rtc_add_groups()
	rtc: rv8803: fix the rv8803 id in the OF table
	remoteproc/davinci: Use %zx for formating size_t
	extcon: cht-wc: Return from default case to avoid warnings
	cfg80211: Avoid regulatory restore when COUNTRY_IE_IGNORE is set
	ALSA: seq: Do error checks at creating system ports
	ath10k: skip resetting rx filter for WCN3990
	ath9k: fix tx99 with monitor mode interface
	wil6210: drop Rx multicast packets that are looped-back to STA
	wil6210: set edma variables only for Talyn-MB devices
	wil6210: prevent usage of tx ring 0 for eDMA
	wil6210: fix invalid memory access for rx_buff_mgmt debugfs
	ath10k: limit available channels via DT ieee80211-freq-limit
	ice: Update request resource command to latest specification
	ice: Prevent control queue operations during reset
	gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated
	ice: Fix and update driver version string
	ASoC: dapm: Don't fail creating new DAPM control on NULL pinctrl
	ASoC: dpcm: Properly initialise hw->rate_max
	ASoC: meson: axg-fifo: report interrupt request failure
	ASoC: AMD: Change MCLK to 48Mhz
	pinctrl: ingenic: Probe driver at subsys_initcall
	MIPS: BCM47XX: Enable USB power on Netgear WNDR3400v3
	ARM: dts: exynos: Use i2c-gpio for HDMI-DDC on Arndale
	ARM: dts: exynos: Fix HDMI-HPD line handling on Arndale
	ARM: dts: exynos: Fix sound in Snow-rev5 Chromebook
	liquidio: fix race condition in instruction completion processing
	arm64: dts: stratix10: i2c clock running out of spec
	ARM: dts: exynos: Fix regulators configuration on Peach Pi/Pit Chromebooks
	i40evf: Validate the number of queues a PF sends
	i40e: use correct length for strncpy
	i40evf: set IFF_UNICAST_FLT flag for the VF
	i40e: Check and correct speed values for link on open
	i40evf: Don't enable vlan stripping when rx offload is turned on
	i40e: hold the rtnl lock on clearing interrupt scheme
	i40evf: cancel workqueue sync for adminq when a VF is removed
	i40e: Prevent deleting MAC address from VF when set by PF
	IB/rxe: avoid back-to-back retries
	IB/rxe: fixes for rdma read retry
	iwlwifi: drop packets with bad status in CD
	iwlwifi: don't WARN on trying to dump dead firmware
	iwlwifi: mvm: avoid sending too many BARs
	media: vicodec: fix out-of-range values when decoding
	media: i2c: Fix pm_runtime_get_if_in_use() usage in sensor drivers
	media: ov772x: Disable clk on error path
	ARM: dts: pxa: fix the rtc controller
	ARM: dts: pxa: fix power i2c base address
	rtl8187: Fix warning generated when strncpy() destination length matches the sixe argument
	mwifiex: do no submit URB in suspended state
	mwifex: free rx_cmd skb in suspended state
	brcmfmac: fix wrong strnchr usage
	mt76: Fix comparisons with invalid hardware key index
	soc: imx: gpc: fix PDN delay
	ASoC: rsnd: ssi: Fix issue in dma data address assignment
	net: hns3: Fix for multicast failure
	net: hns3: Fix error of checking used vlan id
	net: hns3: Fix for loopback selftest failed problem
	net: hns3: Change the dst mac addr of loopback packet
	net/mlx5: Fix atomic_mode enum values
	net: phy: mscc: read 'vsc8531,vddmac' as an u32
	net: phy: mscc: read 'vsc8531, edge-slowdown' as an u32
	ARM: dts: meson8: fix the clock controller register size
	ARM: dts: meson8b: fix the clock controller register size
	mtd: rawnand: marvell: use regmap_update_bits() for syscon access
	mtd: rawnand: fsl_ifc: check result of SRAM initialization
	mtd: rawnand: fsl_ifc: fixup SRAM init for newer ctrl versions
	mtd: rawnand: qcom: don't include dma-direct.h
	IB/mlx5: Change TX affinity assignment in RoCE LAG mode
	qxl: fix null-pointer crash during suspend
	mac80211: fix saving a few HE values
	cfg80211: validate wmm rule when setting
	f2fs: avoid wrong decrypted data from disk
	net: lan78xx: Bail out if lan78xx_get_endpoints fails
	rtnetlink: move type calculation out of loop
	ASoC: sgtl5000: avoid division by zero if lo_vag is zero
	ath10k: avoid possible memory access violation
	ARM: dts: exynos: Disable pull control for S5M8767 PMIC
	ath10k: wmi: disable softirq's while calling ieee80211_rx
	i2c: mediatek: Use DMA safe buffers for i2c transactions
	IB/mlx5: Don't hold spin lock while checking device state
	IB/ipoib: Ensure that MTU isn't less than minimum permitted
	RDMA/core: Rate limit MAD error messages
	RDMA/core: Follow correct unregister order between sysfs and cgroup
	mips: txx9: fix iounmap related issue
	udf: Fix crash during mount
	ASoC: dapm: Avoid uninitialised variable warning
	ASoC: Intel: hdac_hdmi: Limit sampling rates at dai creation
	ata: Disable AHCI ALPM feature for Ampere Computing eMAG SATA
	of: make PowerMac cache node search conditional on CONFIG_PPC_PMAC
	ARM: dts: omap3-gta04: give spi_lcd node a label so that we can overwrite in other DTS files
	ARM: dts: omap3-gta04: fixes for tvout / venc
	ARM: dts: omap3-gta04: tvout: enable as display1 alias
	ARM: dts: omap3-gta04: fix touchscreen tsc2007
	ARM: dts: omap3-gta04: make NAND partitions compatible with recent U-Boot
	ARM: dts: omap3-gta04: keep vpll2 always on
	f2fs: submit bio after shutdown
	failover: Fix error return code in net_failover_create
	sched/debug: Explicitly cast sched_feat() to bool
	sched/debug: Use symbolic names for task state constants
	firmware: arm_scmi: use strlcpy to ensure NULL-terminated strings
	arm64: dts: rockchip: Fix VCC5V0_HOST_EN on rk3399-sapphire
	ARM: dts: exynos: Disable pull control for PMIC IRQ line on Artik5 board
	usb: mtu3: disable vbus rise/fall interrupts of ltssm
	dmaengine: dma-jz4780: Don't depend on MACH_JZ4780
	dmaengine: dma-jz4780: Further residue status fix
	EDAC, sb_edac: Return early on ADDRV bit and address type test
	rtc: mt6397: fix possible race condition
	rtc: pl030: fix possible race condition
	ath9k: add back support for using active monitor interfaces for tx99
	dmaengine: at_xdmac: remove a stray bottom half unlock
	RDMA/hns: Fix an error code in hns_roce_v2_init_eq_table()
	IB/hfi1: Missing return value in error path for user sdma
	signal: Always ignore SIGKILL and SIGSTOP sent to the global init
	signal: Properly deliver SIGILL from uprobes
	signal: Properly deliver SIGSEGV from x86 uprobes
	f2fs: fix memory leak of write_io in fill_super()
	f2fs: fix memory leak of percpu counter in fill_super()
	f2fs: fix setattr project check upon fssetxattr ioctl
	scsi: qla2xxx: Use correct qpair for ABTS/CMD
	scsi: qla2xxx: Fix iIDMA error
	scsi: qla2xxx: Defer chip reset until target mode is enabled
	scsi: qla2xxx: Terminate Plogi/PRLI if WWN is 0
	scsi: qla2xxx: Fix deadlock between ATIO and HW lock
	scsi: qla2xxx: Increase abort timeout value
	scsi: qla2xxx: Check for Register disconnect
	scsi: qla2xxx: Fix port speed display on chip reset
	scsi: qla2xxx: Fix dropped srb resource.
	scsi: qla2xxx: Fix duplicate switch's Nport ID entries
	scsi: lpfc: Fix GFT_ID and PRLI logic for RSCN
	scsi: lpfc: Correct invalid EQ doorbell write on if_type=6
	scsi: lpfc: Fix errors in log messages.
	scsi: sym53c8xx: fix NULL pointer dereference panic in sym_int_sir()
	ARM: imx6: register pm_power_off handler if "fsl,pmic-stby-poweroff" is set
	scsi: pm80xx: Corrected dma_unmap_sg() parameter
	scsi: pm80xx: Fixed system hang issue during kexec boot
	kprobes: Don't call BUG_ON() if there is a kprobe in use on free list
	net: aquantia: fix hw_atl_utils_fw_upload_dwords
	Drivers: hv: vmbus: Fix synic per-cpu context initialization
	nvmem: core: return error code instead of NULL from nvmem_device_get
	media: dt-bindings: adv748x: Fix decimal unit addresses
	ALSA: hda: Fix implicit definition of pci_iomap() on SH
	media: fix: media: pci: meye: validate offset to avoid arbitrary access
	media: dvb: fix compat ioctl translation
	net: bcmgenet: Fix speed selection for reverse MII
	arm64: dts: meson: libretech: update board model
	arm64: dts: meson-axg: use the proper compatible for ethmac
	ALSA: intel8x0m: Register irq handler after register initializations
	arm64: dts: renesas: salvator-common: adv748x: Override secondary addresses
	arm64: dts: renesas: r8a77965: Attach the SYS-DMAC to the IPMMU
	arm64: dts: renesas: r8a77965: Fix HS-USB compatible
	arm64: dts: renesas: r8a77965: Fix clock/reset for usb2_phy1
	pinctrl: at91-pio4: fix has_config check in atmel_pctl_dt_subnode_to_map()
	llc: avoid blocking in llc_sap_close()
	ARM: dts: qcom: ipq4019: fix cpu0's qcom,saw2 reg value
	soc: qcom: geni: Don't ignore clk_round_rate() errors in geni_se_clk_tbl_get()
	soc: qcom: geni: geni_se_clk_freq_match() should always accept multiples
	soc: qcom: wcnss_ctrl: Avoid string overflow
	soc: qcom: apr: Avoid string overflow
	drivers: qcom: rpmh-rsc: clear wait_for_compl after use
	arm64: dts: broadcom: Fix I2C and SPI bus warnings
	ARM: dts: bcm: Fix SPI bus warnings
	ARM: dts: aspeed: Fix I2C bus warnings
	powerpc/vdso: Correct call frame information
	ARM: dts: socfpga: Fix I2C bus unit-address error
	ARM: dts: sunxi: Fix I2C bus warnings
	pinctrl: at91: don't use the same irqchip with multiple gpiochips
	ARM: dts: sun9i: Fix I2C bus warnings
	android: binder: no outgoing transaction when thread todo has transaction
	cxgb4: Fix endianness issue in t4_fwcache()
	arm64: fix for bad_mode() handler to always result in panic
	block, bfq: inject other-queue I/O into seeky idle queues on NCQ flash
	blok, bfq: do not plug I/O if all queues are weight-raised
	arm64: dts: meson: Fix erroneous SPI bus warnings
	power: supply: ab8500_fg: silence uninitialized variable warnings
	power: reset: at91-poweroff: do not procede if at91_shdwc is allocated
	power: supply: max8998-charger: Fix platform data retrieval
	component: fix loop condition to call unbind() if bind() fails
	kernfs: Fix range checks in kernfs_get_target_path
	ip_gre: fix parsing gre header in ipgre_err
	scsi: ufshcd: Fix NULL pointer dereference for in ufshcd_init
	ARM: dts: rockchip: Fix erroneous SPI bus dtc warnings on rk3036
	arm64: dts: rockchip: Fix I2C bus unit-address error on rk3399-puma-haikou
	ACPI / LPSS: Exclude I2C busses shared with PUNIT from pmc_atom_d3_mask
	netfilter: nf_tables: avoid BUG_ON usage
	ath9k: Fix a locking bug in ath9k_add_interface()
	s390/qeth: uninstall IRQ handler on device removal
	s390/qeth: invoke softirqs after napi_schedule()
	media: vsp1: Fix vsp1_regs.h license header
	media: vsp1: Fix YCbCr planar formats pitch calculation
	media: ov2680: don't register the v4l2 subdevice before checking chip ID
	PCI/ACPI: Correct error message for ASPM disabling
	net: socionext: Fix two sleep-in-atomic-context bugs in ave_rxfifo_reset()
	PCI: mediatek: Fix unchecked return value
	ARM: dts: xilinx: Fix I2C and SPI bus warnings
	serial: uartps: Fix suspend functionality
	serial: samsung: Enable baud clock for UART reset procedure in resume
	serial: mxs-auart: Fix potential infinite loop
	tty: serial: qcom_geni_serial: Fix serial when not used as console
	arm64: dts: ti: k3-am65: Change #address-cells and #size-cells of interconnect to 2
	samples/bpf: fix a compilation failure
	spi/bcm63xx-hsspi: keep pll clk enabled
	spi: mediatek: Don't modify spi_transfer when transfer.
	ASoC: rt5682: Fix the boost volume at the begining of playback
	ipmi_si_pci: fix NULL device in ipmi_si error message
	ipmi_si: fix potential integer overflow on large shift
	ipmi:dmi: Ignore IPMI SMBIOS entries with a zero base address
	ipmi: fix return value of ipmi_set_my_LUN
	net: hns3: fix return type of ndo_start_xmit function
	net: cavium: fix return type of ndo_start_xmit function
	net: ibm: fix return type of ndo_start_xmit function
	powerpc/iommu: Avoid derefence before pointer check
	selftests/powerpc: Do not fail with reschedule
	powerpc/64s/hash: Fix stab_rr off by one initialization
	powerpc/pseries/memory-hotplug: Only update DT once per memory DLPAR request
	powerpc/pseries: Disable CPU hotplug across migrations
	powerpc: Fix duplicate const clang warning in user access code
	RDMA/i40iw: Fix incorrect iterator type
	ARM: dts: atmel: Fix I2C and SPI bus warnings
	OPP: Protect dev_list with opp_table lock
	of/unittest: Fix I2C bus unit-address error
	libfdt: Ensure INT_MAX is defined in libfdt_env.h
	power: supply: twl4030_charger: fix charging current out-of-bounds
	power: supply: twl4030_charger: disable eoc interrupt on linear charge
	net: mvpp2: fix the number of queues per cpu for PPv2.2
	net: marvell: fix return type of ndo_start_xmit function
	net: toshiba: fix return type of ndo_start_xmit function
	net: xilinx: fix return type of ndo_start_xmit function
	net: broadcom: fix return type of ndo_start_xmit function
	net: amd: fix return type of ndo_start_xmit function
	net: sun: fix return type of ndo_start_xmit function
	net: hns3: Fix for setting speed for phy failed problem
	net: hns3: Fix cmdq registers initialization issue for vf
	net: hns3: Clear client pointer when initialize client failed or unintialize finished
	net: hns3: Fix client initialize state issue when roce client initialize failed
	net: hns3: Fix parameter type for q_id in hclge_tm_q_to_qs_map_cfg()
	nfp: provide a better warning when ring allocation fails
	usb: chipidea: imx: enable OTG overcurrent in case USB subsystem is already started
	usb: chipidea: Fix otg event handler
	usb: usbtmc: Fix ioctl USBTMC_IOCTL_ABORT_BULK_OUT
	s390/zcrypt: enable AP bus scan without a valid default domain
	s390/vdso: avoid 64-bit vdso mapping for compat tasks
	s390/vdso: correct CFI annotations of vDSO functions
	brcmfmac: increase buffer for obtaining firmware capabilities
	brcmsmac: Use kvmalloc() for ucode allocations
	mlxsw: spectrum: Init shaper for TCs 8..15
	PCI: portdrv: Initialize service drivers directly
	ARM: dts: am335x-evm: fix number of cpsw
	ARM: dts: ti: Fix SPI and I2C bus warnings
	f2fs: avoid infinite loop in f2fs_alloc_nid
	f2fs: fix to recover inode's uid/gid during POR
	ARM: dts: ux500: Correct SCU unit address
	ARM: dts: ux500: Fix LCDA clock line muxing
	ARM: dts: ste: Fix SPI controller node names
	spi: pic32: Use proper enum in dmaengine_prep_slave_rg
	crypto: chacha20 - Fix chacha20_block() keystream alignment (again)
	cpufeature: avoid warning when compiling with clang
	crypto: arm/crc32 - avoid warning when compiling with Clang
	ARM: dts: marvell: Fix SPI and I2C bus warnings
	x86/mce-inject: Reset injection struct after injection
	ARM: dts: stm32: enable display on stm32mp157c-ev1 board
	ARM: dts: clearfog: fix sdhci supply property name
	ARM: dts: stm32: Fix SPI controller node names
	bnx2x: Ignore bandwidth attention in single function mode
	PCI/AER: Take reference on error devices
	PCI/AER: Don't read upstream ports below fatal errors
	PCI/ERR: Use slot reset if available
	samples/bpf: fix compilation failure
	net: phy: mdio-bcm-unimac: Allow configuring MDIO clock divider
	net: micrel: fix return type of ndo_start_xmit function
	net: freescale: fix return type of ndo_start_xmit function
	x86/CPU: Use correct macros for Cyrix calls
	x86/CPU: Change query logic so CPUID is enabled before testing
	EDAC: Correct DIMM capacity unit symbol
	MIPS: kexec: Relax memory restriction
	arm64: dts: rockchip: Fix microSD in rk3399 sapphire board
	mlxsw: Make MLXSW_SP1_FWREV_MINOR a hard requirement
	media: imx: work around false-positive warning, again
	media: pci: ivtv: Fix a sleep-in-atomic-context bug in ivtv_yuv_init()
	media: au0828: Fix incorrect error messages
	media: davinci: Fix implicit enum conversion warning
	ARM: dts: rockchip: explicitly set vcc_sd0 pin to gpio on rk3188-radxarock
	usb: gadget: uvc: configfs: Drop leaked references to config items
	usb: gadget: uvc: configfs: Prevent format changes after linking header
	usb: gadget: uvc: configfs: Sort frame intervals upon writing
	ARM: dts: exynos: Correct audio subsystem parent clock on Peach Chromebooks
	i2c: aspeed: fix invalid clock parameters for very large divisors
	gpiolib: Fix gpio_direction_* for single direction GPIOs
	ARM: at91: pm: call put_device instead of of_node_put in at91_pm_config_ws
	phy: brcm-sata: allow PHY_BRCM_SATA driver to be built for DSL SoCs
	phy: renesas: rcar-gen3-usb2: fix vbus_ctrl for role sysfs
	phy: phy-twl4030-usb: fix denied runtime access
	ARM: dts: imx6ull: update vdd_soc voltage for 900MHz operating point
	usb: gadget: uvc: Factor out video USB request queueing
	usb: gadget: uvc: Only halt video streaming endpoint in bulk mode
	coresight: Use ERR_CAST instead of ERR_PTR
	coresight: Fix handling of sinks
	coresight: perf: Fix per cpu path management
	coresight: perf: Disable trace path upon source error
	coresight: tmc-etr: Handle driver mode specific ETR buffers
	coresight: etm4x: Configure EL2 exception level when kernel is running in HYP
	coresight: tmc: Fix byte-address alignment for RRP
	coresight: dynamic-replicator: Handle multiple connections
	slimbus: ngd: register ngd driver only once.
	slimbus: ngd: return proper error code instead of zero
	silmbus: ngd: register controller after power up.
	misc: kgdbts: Fix restrict error
	misc: genwqe: should return proper error value.
	vmbus: keep pointer to ring buffer page
	vfio/pci: Fix potential memory leak in vfio_msi_cap_len
	vfio/pci: Mask buggy SR-IOV VF INTx support
	iw_cxgb4: Use proper enumerated type in c4iw_bar2_addrs
	scsi: libsas: always unregister the old device if going to discover new
	f2fs: fix remount problem of option io_bits
	phy: lantiq: Fix compile warning
	arm64: dts: fsl: Fix I2C and SPI bus warnings
	ARM: dts: imx51-zii-rdu1: Fix the rtc compatible string
	arm64: tegra: I2C on Tegra194 is not compatible with Tegra114
	ARM: dts: tegra30: fix xcvr-setup-use-fuses
	ARM: dts: tegra20: restore address order
	ARM: tegra: apalis_t30: fix mmc1 cmd pull-up
	ARM: tegra: apalis_t30: fix mcp2515 can controller interrupt polarity
	ARM: tegra: colibri_t30: fix mcp2515 can controller interrupt polarity
	ARM: dts: paz00: fix wakeup gpio keycode
	net: smsc: fix return type of ndo_start_xmit function
	net: faraday: fix return type of ndo_start_xmit function
	PCI/ERR: Run error recovery callbacks for all affected devices
	f2fs: update i_size after DIO completion
	f2fs: fix to recover inode's project id during POR
	f2fs: mark inode dirty explicitly in recover_inode()
	RDMA: Fix dependencies for rdma_user_mmap_io
	EDAC: Raise the maximum number of memory controllers
	ARM: dts: realview: Fix SPI controller node names
	firmware: dell_rbu: Make payload memory uncachable
	Bluetooth: hci_serdev: clear HCI_UART_PROTO_READY to avoid closing proto races
	Bluetooth: L2CAP: Detect if remote is not able to use the whole MPS
	Bluetooth: btrsi: fix bt tx timeout issue
	x86/hyperv: Suppress "PCI: Fatal: No config space access function found"
	crypto: s5p-sss: Fix race in error handling
	crypto: s5p-sss: Fix Fix argument list alignment
	crypto: fix a memory leak in rsa-kcs1pad's encryption mode
	iwlwifi: dbg: don't crash if the firmware crashes in the middle of a debug dump
	iwlwifi: fix non_shared_ant for 22000 devices
	iwlwifi: pcie: read correct prph address for newer devices
	iwlwifi: api: annotate compressed BA notif array sizes
	iwlwifi: pcie: gen2: build A-MSDU only for GSO
	iwlwifi: pcie: fit reclaim msg to MAX_MSG_LEN
	iwlwifi: mvm: use correct FIFO length
	iwlwifi: mvm: Allow TKIP for AP mode
	scsi: NCR5380: Clear all unissued commands on host reset
	scsi: NCR5380: Have NCR5380_select() return a bool
	scsi: NCR5380: Withhold disconnect privilege for REQUEST SENSE
	scsi: NCR5380: Use DRIVER_SENSE to indicate valid sense data
	scsi: NCR5380: Check for invalid reselection target
	scsi: NCR5380: Don't clear busy flag when abort fails
	scsi: NCR5380: Don't call dsprintk() following reselection interrupt
	scsi: NCR5380: Handle BUS FREE during reselection
	scsi: NCR5380: Check for bus reset
	arm64: dts: amd: Fix SPI bus warnings
	arm64: dts: lg: Fix SPI controller node names
	ARM: dts: lpc32xx: Fix SPI controller node names
	rtc: isl1208: avoid possible sysfs race
	rtc: tx4939: fixup nvmem name and register size
	rtc: armada38x: fix possible race condition
	netfilter: masquerade: don't flush all conntracks if only one address deleted on device
	usb: xhci-mtk: fix ISOC error when interval is zero
	usb: usbtmc: uninitialized symbol 'actual' in usbtmc_ioctl_clear
	fuse: use READ_ONCE on congestion_threshold and max_background
	IB/iser: Fix possible NULL deref at iser_inv_desc()
	media: ov2680: fix null dereference at power on
	s390/vdso: correct vdso mapping for compat tasks
	net: phy: mdio-bcm-unimac: mark PM functions as __maybe_unused
	memfd: Use radix_tree_deref_slot_protected to avoid the warning.
	slcan: Fix memory leak in error path
	Linux 4.19.85

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0857e66ee2cdd412cd736548a1395bf764a8ab0a
2019-11-20 20:43:17 +01:00
Eric W. Biederman
3b5681d39f signal: Always ignore SIGKILL and SIGSTOP sent to the global init
[ Upstream commit 86989c41b5ea08776c450cb759592532314a4ed6 ]

If the first process started (aka /sbin/init) receives a SIGKILL it
will panic the system if it is delivered.  Making the system unusable
and undebugable.  It isn't much better if the first process started
receives SIGSTOP.

So always ignore SIGSTOP and SIGKILL sent to init.

This is done in a separate clause in sig_task_ignored as force_sig_info
can clear SIG_UNKILLABLE and this protection should work even then.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-20 18:46:21 +01:00
Joel Fernandes (Google)
b3481301f4 UPSTREAM: pidfd: add polling support
This patch adds polling support to pidfd.

Android low memory killer (LMK) needs to know when a process dies once
it is sent the kill signal. It does so by checking for the existence of
/proc/pid which is both racy and slow. For example, if a PID is reused
between when LMK sends a kill signal and checks for existence of the
PID, since the wrong PID is now possibly checked for existence.
Using the polling support, LMK will be able to get notified when a process
exists in race-free and fast way, and allows the LMK to do other things
(such as by polling on other fds) while awaiting the process being killed
to die.

For notification to polling processes, we follow the same existing
mechanism in the kernel used when the parent of the task group is to be
notified of a child's death (do_notify_parent). This is precisely when the
tasks waiting on a poll of pidfd are also awakened in this patch.

We have decided to include the waitqueue in struct pid for the following
reasons:
1. The wait queue has to survive for the lifetime of the poll. Including
   it in task_struct would not be option in this case because the task can
   be reaped and destroyed before the poll returns.

2. By including the struct pid for the waitqueue means that during
   de_thread(), the new thread group leader automatically gets the new
   waitqueue/pid even though its task_struct is different.

Appropriate test cases are added in the second patch to provide coverage of
all the cases the patch is handling.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Daniel Colascione <dancol@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Jonathan Kowalski <bl0pbl33p@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: kernel-team@android.com
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Co-developed-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Christian Brauner <christian@brauner.io>

(cherry picked from commit b53b0b9d9a613c418057f6cb921c2f40a6f78c24)

Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: I0391acb666b34e33ef60a778dffdf12ed3654393
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-08-12 13:36:37 -04:00
Christian Brauner
2471d6f992 UPSTREAM: signal: improve comments
Improve the comments for pidfd_send_signal().
First, the comment still referred to a file descriptor for a process as a
"task file descriptor" which stems from way back at the beginning of the
discussion. Replace this with "pidfd" for consistency.
Second, the wording for the explanation of the arguments to the syscall
was a bit inconsistent, e.g. some used the past tense some used present
tense. Make the wording more consistent.

Signed-off-by: Christian Brauner <christian@brauner.io>

(cherry picked from commit c732327f04a3818f35fa97d07b1d64d31b691d78)

Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: Iaa289e1bc2f96366102e51241accbb97099553d4
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-08-12 13:36:37 -04:00
Christian Brauner
5eecbc68ed UPSTREAM: signal: support CLONE_PIDFD with pidfd_send_signal
Let pidfd_send_signal() use pidfds retrieved via CLONE_PIDFD.  With this
patch pidfd_send_signal() becomes independent of procfs.  This fullfils
the request made when we merged the pidfd_send_signal() patchset.  The
pidfd_send_signal() syscall is now always available allowing for it to
be used by users without procfs mounted or even users without procfs
support compiled into the kernel.

Signed-off-by: Christian Brauner <christian@brauner.io>
Co-developed-by: Jann Horn <jannh@google.com>
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Howells <dhowells@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>

(cherry picked from commit 2151ad1b067275730de1b38c7257478cae47d29e)

Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: If6ee59279f41e0ae3c6d9e24f7b5481f23aab469
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-08-12 13:36:37 -04:00
Christian Brauner
c46b05132a UPSTREAM: signal: use fdget() since we don't allow O_PATH
As stated in the original commit for pidfd_send_signal() we don't allow
to signal processes through O_PATH file descriptors since it is
semantically equivalent to a write on the pidfd.

We already correctly error out right now and return EBADF if an O_PATH
fd is passed.  This is because we use file->f_op to detect whether a
pidfd is passed and O_PATH fds have their file->f_op set to empty_fops
in do_dentry_open() and thus fail the test.

Thus, there is no regression.  It's just semantically correct to use
fdget() and return an error right from there instead of taking a
reference and returning an error later.

Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jann@thejh.net>
Cc: David Howells <dhowells@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit 738a7832d21e3d911fcddab98ce260b79010b461)

Bug: 135608568
Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL
Change-Id: Ib102922f9793e8610940d34ad5fb1256d4b07476
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-08-12 13:36:37 -04:00
Jann Horn
74c14d6081 UPSTREAM: signal: don't silently convert SI_USER signals to non-current pidfd
The current sys_pidfd_send_signal() silently turns signals with explicit
SI_USER context that are sent to non-current tasks into signals with
kernel-generated siginfo.
This is unlike do_rt_sigqueueinfo(), which returns -EPERM in this case.
If a user actually wants to send a signal with kernel-provided siginfo,
they can do that with pidfd_send_signal(pidfd, sig, NULL, 0); so allowing
this case is unnecessary.

Instead of silently replacing the siginfo, just bail out with an error;
this is consistent with other interfaces and avoids special-casing behavior
based on security checks.

Fixes: 3eb39f47934f ("signal: add pidfd_send_signal() syscall")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Christian Brauner <christian@brauner.io>

(cherry picked from commit 556a888a14afe27164191955618990fb3ccc9aad)

Bug: 135608568
Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL
Change-Id: I004452c19c50296730a2c6852a5ef47abd69d819
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-08-12 13:36:37 -04:00
Christian Brauner
1f27ef8d9b BACKPORT: signal: add pidfd_send_signal() syscall
The kill() syscall operates on process identifiers (pid). After a process
has exited its pid can be reused by another process. If a caller sends a
signal to a reused pid it will end up signaling the wrong process. This
issue has often surfaced and there has been a push to address this problem [1].

This patch uses file descriptors (fd) from proc/<pid> as stable handles on
struct pid. Even if a pid is recycled the handle will not change. The fd
can be used to send signals to the process it refers to.
Thus, the new syscall pidfd_send_signal() is introduced to solve this
problem. Instead of pids it operates on process fds (pidfd).

/* prototype and argument /*
long pidfd_send_signal(int pidfd, int sig, siginfo_t *info, unsigned int flags);

/* syscall number 424 */
The syscall number was chosen to be 424 to align with Arnd's rework in his
y2038 to minimize merge conflicts (cf. [25]).

In addition to the pidfd and signal argument it takes an additional
siginfo_t and flags argument. If the siginfo_t argument is NULL then
pidfd_send_signal() is equivalent to kill(<positive-pid>, <signal>). If it
is not NULL pidfd_send_signal() is equivalent to rt_sigqueueinfo().
The flags argument is added to allow for future extensions of this syscall.
It currently needs to be passed as 0. Failing to do so will cause EINVAL.

/* pidfd_send_signal() replaces multiple pid-based syscalls */
The pidfd_send_signal() syscall currently takes on the job of
rt_sigqueueinfo(2) and parts of the functionality of kill(2), Namely, when a
positive pid is passed to kill(2). It will however be possible to also
replace tgkill(2) and rt_tgsigqueueinfo(2) if this syscall is extended.

/* sending signals to threads (tid) and process groups (pgid) */
Specifically, the pidfd_send_signal() syscall does currently not operate on
process groups or threads. This is left for future extensions.
In order to extend the syscall to allow sending signal to threads and
process groups appropriately named flags (e.g. PIDFD_TYPE_PGID, and
PIDFD_TYPE_TID) should be added. This implies that the flags argument will
determine what is signaled and not the file descriptor itself. Put in other
words, grouping in this api is a property of the flags argument not a
property of the file descriptor (cf. [13]). Clarification for this has been
requested by Eric (cf. [19]).
When appropriate extensions through the flags argument are added then
pidfd_send_signal() can additionally replace the part of kill(2) which
operates on process groups as well as the tgkill(2) and
rt_tgsigqueueinfo(2) syscalls.
How such an extension could be implemented has been very roughly sketched
in [14], [15], and [16]. However, this should not be taken as a commitment
to a particular implementation. There might be better ways to do it.
Right now this is intentionally left out to keep this patchset as simple as
possible (cf. [4]).

/* naming */
The syscall had various names throughout iterations of this patchset:
- procfd_signal()
- procfd_send_signal()
- taskfd_send_signal()
In the last round of reviews it was pointed out that given that if the
flags argument decides the scope of the signal instead of different types
of fds it might make sense to either settle for "procfd_" or "pidfd_" as
prefix. The community was willing to accept either (cf. [17] and [18]).
Given that one developer expressed strong preference for the "pidfd_"
prefix (cf. [13]) and with other developers less opinionated about the name
we should settle for "pidfd_" to avoid further bikeshedding.

The  "_send_signal" suffix was chosen to reflect the fact that the syscall
takes on the job of multiple syscalls. It is therefore intentional that the
name is not reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the
fomer because it might imply that pidfd_send_signal() is a replacement for
kill(2), and not the latter because it is a hassle to remember the correct
spelling - especially for non-native speakers - and because it is not
descriptive enough of what the syscall actually does. The name
"pidfd_send_signal" makes it very clear that its job is to send signals.

/* zombies */
Zombies can be signaled just as any other process. No special error will be
reported since a zombie state is an unreliable state (cf. [3]). However,
this can be added as an extension through the @flags argument if the need
ever arises.

/* cross-namespace signals */
The patch currently enforces that the signaler and signalee either are in
the same pid namespace or that the signaler's pid namespace is an ancestor
of the signalee's pid namespace. This is done for the sake of simplicity
and because it is unclear to what values certain members of struct
siginfo_t would need to be set to (cf. [5], [6]).

/* compat syscalls */
It became clear that we would like to avoid adding compat syscalls
(cf. [7]).  The compat syscall handling is now done in kernel/signal.c
itself by adding __copy_siginfo_from_user_generic() which lets us avoid
compat syscalls (cf. [8]). It should be noted that the addition of
__copy_siginfo_from_user_any() is caused by a bug in the original
implementation of rt_sigqueueinfo(2) (cf. 12).
With upcoming rework for syscall handling things might improve
significantly (cf. [11]) and __copy_siginfo_from_user_any() will not gain
any additional callers.

/* testing */
This patch was tested on x64 and x86.

/* userspace usage */
An asciinema recording for the basic functionality can be found under [9].
With this patch a process can be killed via:

 #define _GNU_SOURCE
 #include <errno.h>
 #include <fcntl.h>
 #include <signal.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <sys/stat.h>
 #include <sys/syscall.h>
 #include <sys/types.h>
 #include <unistd.h>

 static inline int do_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
                                         unsigned int flags)
 {
 #ifdef __NR_pidfd_send_signal
         return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
 #else
         return -ENOSYS;
 #endif
 }

 int main(int argc, char *argv[])
 {
         int fd, ret, saved_errno, sig;

         if (argc < 3)
                 exit(EXIT_FAILURE);

         fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
         if (fd < 0) {
                 printf("%s - Failed to open \"%s\"\n", strerror(errno), argv[1]);
                 exit(EXIT_FAILURE);
         }

         sig = atoi(argv[2]);

         printf("Sending signal %d to process %s\n", sig, argv[1]);
         ret = do_pidfd_send_signal(fd, sig, NULL, 0);

         saved_errno = errno;
         close(fd);
         errno = saved_errno;

         if (ret < 0) {
                 printf("%s - Failed to send signal %d to process %s\n",
                        strerror(errno), sig, argv[1]);
                 exit(EXIT_FAILURE);
         }

         exit(EXIT_SUCCESS);
 }

/* Q&A
 * Given that it seems the same questions get asked again by people who are
 * late to the party it makes sense to add a Q&A section to the commit
 * message so it's hopefully easier to avoid duplicate threads.
 *
 * For the sake of progress please consider these arguments settled unless
 * there is a new point that desperately needs to be addressed. Please make
 * sure to check the links to the threads in this commit message whether
 * this has not already been covered.
 */
Q-01: (Florian Weimer [20], Andrew Morton [21])
      What happens when the target process has exited?
A-01: Sending the signal will fail with ESRCH (cf. [22]).

Q-02:  (Andrew Morton [21])
       Is the task_struct pinned by the fd?
A-02:  No. A reference to struct pid is kept. struct pid - as far as I
       understand - was created exactly for the reason to not require to
       pin struct task_struct (cf. [22]).

Q-03: (Andrew Morton [21])
      Does the entire procfs directory remain visible? Just one entry
      within it?
A-03: The same thing that happens right now when you hold a file descriptor
      to /proc/<pid> open (cf. [22]).

Q-04: (Andrew Morton [21])
      Does the pid remain reserved?
A-04: No. This patchset guarantees a stable handle not that pids are not
      recycled (cf. [22]).

Q-05: (Andrew Morton [21])
      Do attempts to signal that fd return errors?
A-05: See {Q,A}-01.

Q-06: (Andrew Morton [22])
      Is there a cleaner way of obtaining the fd? Another syscall perhaps.
A-06: Userspace can already trivially retrieve file descriptors from procfs
      so this is something that we will need to support anyway. Hence,
      there's no immediate need to add another syscalls just to make
      pidfd_send_signal() not dependent on the presence of procfs. However,
      adding a syscalls to get such file descriptors is planned for a
      future patchset (cf. [22]).

Q-07: (Andrew Morton [21] and others)
      This fd-for-a-process sounds like a handy thing and people may well
      think up other uses for it in the future, probably unrelated to
      signals. Are the code and the interface designed to permit such
      future applications?
A-07: Yes (cf. [22]).

Q-08: (Andrew Morton [21] and others)
      Now I think about it, why a new syscall? This thing is looking
      rather like an ioctl?
A-08: This has been extensively discussed. It was agreed that a syscall is
      preferred for a variety or reasons. Here are just a few taken from
      prior threads. Syscalls are safer than ioctl()s especially when
      signaling to fds. Processes are a core kernel concept so a syscall
      seems more appropriate. The layout of the syscall with its four
      arguments would require the addition of a custom struct for the
      ioctl() thereby causing at least the same amount or even more
      complexity for userspace than a simple syscall. The new syscall will
      replace multiple other pid-based syscalls (see description above).
      The file-descriptors-for-processes concept introduced with this
      syscall will be extended with other syscalls in the future. See also
      [22], [23] and various other threads already linked in here.

Q-09: (Florian Weimer [24])
      What happens if you use the new interface with an O_PATH descriptor?
A-09:
      pidfds opened as O_PATH fds cannot be used to send signals to a
      process (cf. [2]). Signaling processes through pidfds is the
      equivalent of writing to a file. Thus, this is not an operation that
      operates "purely at the file descriptor level" as required by the
      open(2) manpage. See also [4].

/* References */
[1]:  https://lore.kernel.org/lkml/20181029221037.87724-1-dancol@google.com/
[2]:  https://lore.kernel.org/lkml/874lbtjvtd.fsf@oldenburg2.str.redhat.com/
[3]:  https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6fhva@brauner.io/
[4]:  https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru2ku@brauner.io/
[5]:  https://lore.kernel.org/lkml/20181121213946.GA10795@mail.hallyn.com/
[6]:  https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6nv4@brauner.io/
[7]:  https://lore.kernel.org/lkml/36323361-90BD-41AF-AB5B-EE0D7BA02C21@amacapital.net/
[8]:  https://lore.kernel.org/lkml/87tvjxp8pc.fsf@xmission.com/
[9]:  https://asciinema.org/a/IQjuCHew6bnq1cr78yuMv16cy
[11]: https://lore.kernel.org/lkml/F53D6D38-3521-4C20-9034-5AF447DF62FF@amacapital.net/
[12]: https://lore.kernel.org/lkml/87zhtjn8ck.fsf@xmission.com/
[13]: https://lore.kernel.org/lkml/871s6u9z6u.fsf@xmission.com/
[14]: https://lore.kernel.org/lkml/20181206231742.xxi4ghn24z4h2qki@brauner.io/
[15]: https://lore.kernel.org/lkml/20181207003124.GA11160@mail.hallyn.com/
[16]: https://lore.kernel.org/lkml/20181207015423.4miorx43l3qhppfz@brauner.io/
[17]: https://lore.kernel.org/lkml/CAGXu5jL8PciZAXvOvCeCU3wKUEB_dU-O3q0tDw4uB_ojMvDEew@mail.gmail.com/
[18]: https://lore.kernel.org/lkml/20181206222746.GB9224@mail.hallyn.com/
[19]: https://lore.kernel.org/lkml/20181208054059.19813-1-christian@brauner.io/
[20]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[21]: https://lore.kernel.org/lkml/20181228152012.dbf0508c2508138efc5f2bbe@linux-foundation.org/
[22]: https://lore.kernel.org/lkml/20181228233725.722tdfgijxcssg76@brauner.io/
[23]: https://lwn.net/Articles/773459/
[24]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[25]: https://lore.kernel.org/lkml/CAK8P3a0ej9NcJM8wXNPbcGUyOUZYX+VLoDFdbenW3s3114oQZw@mail.gmail.com/

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Aleksa Sarai <cyphar@cyphar.com>

(cherry picked from commit 3eb39f47934f9d5a3027fe00d906a45fe3a15fad)

Conflicts:
        include/linux/proc_fs.h - trivial manual merge
        include/uapi/asm-generic/unistd.h - trivial manual merge
        kernel/signal.c

(1. manual merges because of 4.19 differences
 2. change prepare_kill_siginfo() to use struct siginfo instead of
kernel_siginfo
 3. change copy_siginfo_from_user_any() to use struct siginfo instead of
kernel_siginfo
 4. change pidfd_send_signal() to use struct siginfo instead of
kernel_siginfo
 5. use copy_from_user() instead of copy_siginfo_from_user() in
copy_siginfo_from_user_any())

Bug: 135608568
Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL
Change-Id: I24e6298ecf036d1822f3fa6c5286984b4e195c16
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-08-12 13:36:37 -04:00
Zhenliang Wei
6f8d26270c kernel/signal.c: trace_signal_deliver when signal_group_exit
commit 98af37d624ed8c83f1953b1b6b2f6866011fc064 upstream.

In the fixes commit, removing SIGKILL from each thread signal mask and
executing "goto fatal" directly will skip the call to
"trace_signal_deliver".  At this point, the delivery tracking of the
SIGKILL signal will be inaccurate.

Therefore, we need to add trace_signal_deliver before "goto fatal" after
executing sigdelset.

Note: SEND_SIG_NOINFO matches the fact that SIGKILL doesn't have any info.

Link: http://lkml.kernel.org/r/20190425025812.91424-1-weizhenliang@huawei.com
Fixes: cf43a757fd4944 ("signal: Restore the stop PTRACE_EVENT_EXIT")
Signed-off-by: Zhenliang Wei <weizhenliang@huawei.com>
Reviewed-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ivan Delalande <colona@arista.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-09 09:17:20 +02:00
Eric W. Biederman
a2b3e2c0f5 signal: Restore the stop PTRACE_EVENT_EXIT
commit cf43a757fd49442bc38f76088b70c2299eed2c2f upstream.

In the middle of do_exit() there is there is a call
"ptrace_event(PTRACE_EVENT_EXIT, code);" That call places the process
in TACKED_TRACED aka "(TASK_WAKEKILL | __TASK_TRACED)" and waits for
for the debugger to release the task or SIGKILL to be delivered.

Skipping past dequeue_signal when we know a fatal signal has already
been delivered resulted in SIGKILL remaining pending and
TIF_SIGPENDING remaining set.  This in turn caused the
scheduler to not sleep in PTACE_EVENT_EXIT as it figured
a fatal signal was pending.  This also caused ptrace_freeze_traced
in ptrace_check_attach to fail because it left a per thread
SIGKILL pending which is what fatal_signal_pending tests for.

This difference in signal state caused strace to report
strace: Exit of unknown pid NNNNN ignored

Therefore update the signal handling state like dequeue_signal
would when removing a per thread SIGKILL, by removing SIGKILL
from the per thread signal mask and clearing TIF_SIGPENDING.

Acked-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Ivan Delalande <colona@arista.com>
Cc: stable@vger.kernel.org
Fixes: 35634ffa1751 ("signal: Always notice exiting tasks")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-20 10:25:48 +01:00
Eric W. Biederman
959e46afec signal: Better detection of synchronous signals
commit 7146db3317c67b517258cb5e1b08af387da0618b upstream.

Recently syzkaller was able to create unkillablle processes by
creating a timer that is delivered as a thread local signal on SIGHUP,
and receiving SIGHUP SA_NODEFERER.  Ultimately causing a loop failing
to deliver SIGHUP but always trying.

When the stack overflows delivery of SIGHUP fails and force_sigsegv is
called.  Unfortunately because SIGSEGV is numerically higher than
SIGHUP next_signal tries again to deliver a SIGHUP.

From a quality of implementation standpoint attempting to deliver the
timer SIGHUP signal is wrong.  We should attempt to deliver the
synchronous SIGSEGV signal we just forced.

We can make that happening in a fairly straight forward manner by
instead of just looking at the signal number we also look at the
si_code.  In particular for exceptions (aka synchronous signals) the
si_code is always greater than 0.

That still has the potential to pick up a number of asynchronous
signals as in a few cases the same si_codes that are used
for synchronous signals are also used for asynchronous signals,
and SI_KERNEL is also included in the list of possible si_codes.

Still the heuristic is much better and timer signals are definitely
excluded.  Which is enough to prevent all known ways for someone
sending a process signals fast enough to cause unexpected and
arguably incorrect behavior.

Cc: stable@vger.kernel.org
Fixes: a27341cd5f ("Prioritize synchronous signals over 'normal' signals")
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-15 08:10:11 +01:00
Eric W. Biederman
f681f2684f signal: Always notice exiting tasks
commit 35634ffa1751b6efd8cf75010b509dcb0263e29b upstream.

Recently syzkaller was able to create unkillablle processes by
creating a timer that is delivered as a thread local signal on SIGHUP,
and receiving SIGHUP SA_NODEFERER.  Ultimately causing a loop
failing to deliver SIGHUP but always trying.

Upon examination it turns out part of the problem is actually most of
the solution.  Since 2.5 signal delivery has found all fatal signals,
marked the signal group for death, and queued SIGKILL in every threads
thread queue relying on signal->group_exit_code to preserve the
information of which was the actual fatal signal.

The conversion of all fatal signals to SIGKILL results in the
synchronous signal heuristic in next_signal kicking in and preferring
SIGHUP to SIGKILL.  Which is especially problematic as all
fatal signals have already been transformed into SIGKILL.

Instead of dequeueing signals and depending upon SIGKILL to
be the first signal dequeued, first test if the signal group
has already been marked for death.  This guarantees that
nothing in the signal queue can prevent a process that needs
to exit from exiting.

Cc: stable@vger.kernel.org
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Ref: ebf5ebe31d2c ("[PATCH] signal-fixes-2.5.59-A4")
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-15 08:10:11 +01:00
Eric W. Biederman
04eb71942e signal: Guard against negative signal numbers in copy_siginfo_from_user32
commit a36700589b85443e28170be59fa11c8a104130a5 upstream.

While fixing an out of bounds array access in known_siginfo_layout
reported by the kernel test robot it became apparent that the same bug
exists in siginfo_layout and affects copy_siginfo_from_user32.

The straight forward fix that makes guards against making this mistake
in the future and should keep the code size small is to just take an
unsigned signal number instead of a signed signal number, as I did to
fix known_siginfo_layout.

Cc: stable@vger.kernel.org
Fixes: cc731525f2 ("signal: Remove kernel interal si_code magic")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13 11:08:45 -08:00
Eric W. Biederman
cd0852081d signal: Always deliver the kernel's SIGKILL and SIGSTOP to a pid namespace init
[ Upstream commit 3597dfe01d12f570bc739da67f857fd222a3ea66 ]

Instead of playing whack-a-mole and changing SEND_SIG_PRIV to
SEND_SIG_FORCED throughout the kernel to ensure a pid namespace init
gets signals sent by the kernel, stop allowing a pid namespace init to
ignore SIGKILL or SIGSTOP sent by the kernel.  A pid namespace init is
only supposed to be able to ignore signals sent from itself and
children with SIG_DFL.

Fixes: 921cf9f630 ("signals: protect cinit from unblocked SIG_DFL signals")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13 11:08:38 -08:00
Will Deacon
5445a4b0ff signal: Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack
[ Upstream commit 22839869f21ab3850fbbac9b425ccc4c0023926f ]

The sigaltstack(2) system call fails with -ENOMEM if the new alternative
signal stack is found to be smaller than SIGMINSTKSZ. On architectures
such as arm64, where the native value for SIGMINSTKSZ is larger than
the compat value, this can result in an unexpected error being reported
to a compat task. See, for example:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=904385

This patch fixes the problem by extending do_sigaltstack to take the
minimum signal stack size as an additional parameter, allowing the
native and compat system call entry code to pass in their respective
values. COMPAT_SIGMINSTKSZ is just defined as SIGMINSTKSZ if it has not
been defined by the architecture.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Reported-by: Steve McIntyre <steve.mcintyre@arm.com>
Tested-by: Steve McIntyre <93sam@debian.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13 11:08:25 -08:00
Linus Torvalds
cd9b44f907 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - the rest of MM

 - procfs updates

 - various misc things

 - more y2038 fixes

 - get_maintainer updates

 - lib/ updates

 - checkpatch updates

 - various epoll updates

 - autofs updates

 - hfsplus

 - some reiserfs work

 - fatfs updates

 - signal.c cleanups

 - ipc/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (166 commits)
  ipc/util.c: update return value of ipc_getref from int to bool
  ipc/util.c: further variable name cleanups
  ipc: simplify ipc initialization
  ipc: get rid of ids->tables_initialized hack
  lib/rhashtable: guarantee initial hashtable allocation
  lib/rhashtable: simplify bucket_table_alloc()
  ipc: drop ipc_lock()
  ipc/util.c: correct comment in ipc_obtain_object_check
  ipc: rename ipcctl_pre_down_nolock()
  ipc/util.c: use ipc_rcu_putref() for failues in ipc_addid()
  ipc: reorganize initialization of kern_ipc_perm.seq
  ipc: compute kern_ipc_perm.id under the ipc lock
  init/Kconfig: remove EXPERT from CHECKPOINT_RESTORE
  fs/sysv/inode.c: use ktime_get_real_seconds() for superblock stamp
  adfs: use timespec64 for time conversion
  kernel/sysctl.c: fix typos in comments
  drivers/rapidio/devices/rio_mport_cdev.c: remove redundant pointer md
  fork: don't copy inconsistent signal handler state to child
  signal: make get_signal() return bool
  signal: make sigkill_pending() return bool
  ...
2018-08-22 12:34:08 -07:00
Christian Brauner
20ab7218d2 signal: make get_signal() return bool
make get_signal() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-18-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:51 -07:00
Christian Brauner
f99e9d8c5c signal: make sigkill_pending() return bool
sigkill_pending() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-17-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:51 -07:00
Christian Brauner
a19e2c01f7 signal: make legacy_queue() return bool
legacy_queue() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-16-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:51 -07:00
Christian Brauner
acd14e62f0 signal: make wants_signal() return bool
wants_signal() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-15-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:51 -07:00
Christian Brauner
8f11351eee signal: make flush_sigqueue_mask() void
The return value of flush_sigqueue_mask() is never checked anywhere.

Link: http://lkml.kernel.org/r/20180602103653.18181-14-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:51 -07:00
Christian Brauner
67a48a2447 signal: make unhandled_signal() return bool
unhandled_signal() already behaves like a boolean function.  Let's
actually declare it as such too.  All callers treat it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-13-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:51 -07:00
Christian Brauner
09ae854edb signal: make recalc_sigpending_tsk() return bool
recalc_sigpending_tsk() already behaves like a boolean function.  Let's
actually declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-12-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:51 -07:00
Christian Brauner
938696a829 signal: make has_pending_signals() return bool
has_pending_signals() already behaves like a boolean function.  Let's
actually declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-11-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:51 -07:00
Christian Brauner
6a0cdcd788 signal: make sig_ignored() return bool
sig_ignored() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-10-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:51 -07:00
Christian Brauner
41aaa48119 signal: make sig_task_ignored() return bool
sig_task_ignored() already behaves like a boolean function.  Let's
actually declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-9-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:50 -07:00
Christian Brauner
e4a8b4efbf signal: make sig_handler_ignored() return bool
sig_handler_ignored() already behaves like a boolean function.  Let's
actually declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-8-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:50 -07:00
Christian Brauner
2a9b909409 signal: make kill_ok_by_cred() return bool
kill_ok_by_cred() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-7-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:50 -07:00
Christian Brauner
d8f993b3db signal: simplify rt_sigaction()
The goto is not needed and does not add any clarity.  Simply return
-EINVAL on unexpected sigset_t struct size directly.

Link: http://lkml.kernel.org/r/20180602103653.18181-6-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:50 -07:00
Christian Brauner
b1d294c803 signal: make do_sigpending() void
do_sigpending() returned 0 unconditionally so it doesn't make sense to
have it return at all.  This allows us to simplify a bunch of syscall
callers.

Link: http://lkml.kernel.org/r/20180602103653.18181-5-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:50 -07:00
Christian Brauner
6527de9533 signal: make may_ptrace_stop() return bool
may_ptrace_stop() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-4-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:50 -07:00
Christian Brauner
bb17fcca07 signal: make kill_as_cred_perm() return bool
kill_as_cred_perm() already behaves like a boolean function.  Let's
actually declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-3-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:50 -07:00
Christian Brauner
52cba1a274 signal: make force_sigsegv() void
Patch series "signal: refactor some functions", v3.

This series refactors a bunch of functions in signal.c to simplify parts
of the code.

The greatest single change is declaring the static do_sigpending() helper
as void which makes it possible to remove a bunch of unnecessary checks in
the syscalls later on.

This patch (of 17):

force_sigsegv() returned 0 unconditionally so it doesn't make sense to have
it return at all. In addition, there are no callers that check
force_sigsegv()'s return value.

Link: http://lkml.kernel.org/r/20180602103653.18181-2-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:50 -07:00
Eric W. Biederman
c3ad2c3b02 signal: Don't restart fork when signals come in.
Wen Yang <wen.yang99@zte.com.cn> and majiang <ma.jiang@zte.com.cn>
report that a periodic signal received during fork can cause fork to
continually restart preventing an application from making progress.

The code was being overly pessimistic.  Fork needs to guarantee that a
signal sent to multiple processes is logically delivered before the
fork and just to the forking process or logically delivered after the
fork to both the forking process and it's newly spawned child.  For
signals like periodic timers that are always delivered to a single
process fork can safely complete and let them appear to logically
delivered after the fork().

While examining this issue I also discovered that fork today will miss
signals delivered to multiple processes during the fork and handled by
another thread.  Similarly the current code will also miss blocked
signals that are delivered to multiple process, as those signals will
not appear pending during fork.

Add a list of each thread that is currently forking, and keep on that
list a signal set that records all of the signals sent to multiple
processes.  When fork completes initialize the new processes
shared_pending signal set with it.  The calculate_sigpending function
will see those signals and set TIF_SIGPENDING causing the new task to
take the slow path to userspace to handle those signals.  Making it
appear as if those signals were received immediately after the fork.

It is not possible to send real time signals to multiple processes and
exceptions don't go to multiple processes, which means that that are
no signals sent to multiple processes that require siginfo.  This
means it is safe to not bother collecting siginfo on signals sent
during fork.

The sigaction of a child of fork is initially the same as the
sigaction of the parent process.  So a signal the parent ignores the
child will also initially ignore.  Therefore it is safe to ignore
signals sent to multiple processes and ignored by the forking process.

Signals sent to only a single process or only a single thread and delivered
during fork are treated as if they are received after the fork, and generally
not dealt with.  They won't cause any problems.

V2: Added removal from the multiprocess list on failure.
V3: Use -ERESTARTNOINTR directly
V4: - Don't queue both SIGCONT and SIGSTOP
    - Initialize signal_struct.multiprocess in init_task
    - Move setting of shared_pending to before the new task
      is visible to signals.  This prevents signals from comming
      in before shared_pending.signal is set to delayed.signal
      and being lost.
V5: - rework list add and delete to account for idle threads
v6: - Use sigdelsetmask when removing stop signals

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200447
Reported-by: Wen Yang <wen.yang99@zte.com.cn> and
Reported-by: majiang <ma.jiang@zte.com.cn>
Fixes: 4a2c7a7837 ("[PATCH] make fork() atomic wrt pgrp/session signals")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-09 13:07:01 -05:00
Eric W. Biederman
924de3b8c9 fork: Have new threads join on-going signal group stops
There are only two signals that are delivered to every member of a
signal group: SIGSTOP and SIGKILL.  Signal delivery requires every
signal appear to be delivered either before or after a clone syscall.
SIGKILL terminates the clone so does not need to be considered.  Which
leaves only SIGSTOP that needs to be considered when creating new
threads.

Today in the event of a group stop TIF_SIGPENDING will get set and the
fork will restart ensuring the fork syscall participates in the group
stop.

A fork (especially of a process with a lot of memory) is one of the
most expensive system so we really only want to restart a fork when
necessary.

It is easy so check to see if a SIGSTOP is ongoing and have the new
thread join it immediate after the clone completes.  Making it appear
the clone completed happened just before the SIGSTOP.

The calculate_sigpending function will see the bits set in jobctl and
set TIF_SIGPENDING to ensure the new task takes the slow path to userspace.

V2: The call to task_join_group_stop was moved before the new task is
    added to the thread group list.  This should not matter as
    sighand->siglock is held over both the addition of the threads,
    the call to task_join_group_stop and do_signal_stop.  But the change
    is trivial and it is one less thing to worry about when reading
    the code.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-03 20:20:14 -05:00
Eric W. Biederman
088fe47ce9 signal: Add calculate_sigpending()
Add a function calculate_sigpending to test to see if any signals are
pending for a new task immediately following fork.  Signals have to
happen either before or after fork.  Today our practice is to push
all of the signals to before the fork, but that has the downside that
frequent or periodic signals can make fork take much much longer than
normal or prevent fork from completing entirely.

So we need move signals that we can after the fork to prevent that.

This updates the code to set TIF_SIGPENDING on a new task if there
are signals or other activities that have moved so that they appear
to happen after the fork.

As the code today restarts if it sees any such activity this won't
immediately have an effect, as there will be no reason for it
to set TIF_SIGPENDING immediately after the fork.

Adding calculate_sigpending means the code in fork can safely be
changed to not always restart if a signal is pending.

The new calculate_sigpending function sets sigpending if there
are pending bits in jobctl, pending signals, the freezer needs
to freeze the new task or the live kernel patching framework
need the new thread to take the slow path to userspace.

I have verified that setting TIF_SIGPENDING does make a new process
take the slow path to userspace before it executes it's first userspace
instruction.

I have looked at the callers of signal_wake_up and the code paths
setting TIF_SIGPENDING and I don't see anything else that needs to be
handled.  The code probably doesn't need to set TIF_SIGPENDING for the
kernel live patching as it uses a separate thread flag as well.  But
at this point it seems safer reuse the recalc_sigpending logic and get
the kernel live patching folks to sort out their story later.

V2: I have moved the test into schedule_tail where siglock can
    be grabbed and recalc_sigpending can be reused directly.
    Further as the last action of setting up a new task this
    guarantees that TIF_SIGPENDING will be properly set in the
    new process.

    The helper calculate_sigpending takes the siglock and
    uncontitionally sets TIF_SIGPENDING and let's recalc_sigpending
    clear TIF_SIGPENDING if it is unnecessary.  This allows reusing
    the existing code and keeps maintenance of the conditions simple.

    Oleg Nesterov <oleg@redhat.com>  suggested the movement
    and pointed out the need to take siglock if this code
    was going to be called while the new task is discoverable.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-03 20:10:31 -05:00
Eric W. Biederman
0729614992 signal: Push pid type down into complete_signal.
This is the bottom and by pushing this down it simplifies the callers
and otherwise leaves things as is.  This is in preparation for allowing
fork to implement better handling of signals set to groups of processes.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-21 12:57:35 -05:00
Eric W. Biederman
5a883cee74 signal: Push pid type down into __send_signal
This information is already available in the callers and by pushing it
down it makes the code a little clearer, and allows implementing
better handling of signales set to a group of processes in fork.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-21 12:57:35 -05:00
Eric W. Biederman
b213984bd3 signal: Push pid type down into send_signal
This information is already available in the callers and by pushing it
down it makes the code a little clearer, and allows better group
signal behavior in fork.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-21 12:57:35 -05:00
Eric W. Biederman
40b3b02535 signal: Pass pid type into do_send_sig_info
This passes the information we already have at the call sight into
do_send_sig_info.  Ultimately allowing for better handling of signals
sent to a group of processes during fork.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-21 12:57:35 -05:00
Eric W. Biederman
0102498083 signal: Pass pid type into group_send_sig_info
This passes the information we already have at the call sight
into group_send_sig_info.  Ultimatelly allowing for to better handle
signals sent to a group of processes.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-07-21 12:57:35 -05:00