kernel-fxtec-pro1x

Author	SHA1	Message	Date
Paul E. McKenney	9c85fc004e	locktorture: Print ratio of acquisitions, not failures commit 80c503e0e68fbe271680ab48f0fe29bc034b01b7 upstream. The __torture_print_stats() function in locktorture.c carefully initializes local variable "min" to statp[0].n_lock_acquired, but then compares it to statp[i].n_lock_fail. Given that the .n_lock_fail field should normally be zero, and given the initialization, it seems reasonable to display the maximum and minimum number acquisitions instead of miscomputing the maximum and minimum number of failures. This commit therefore switches from failures to acquisitions. And this turns out to be not only a day-zero bug, but entirely my own fault. I hate it when that happens! Fixes: `0af3fe1efa` ("locktorture: Add a lock-torture kernel module") Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-23 10:30:23 +02:00
Anil Kumar Mamidala	8660cb6674	ANDROID: GKI: qos: Register irq notify after adding the qos request Before adding the irq affinity based qos request to the list, if the affinity of the interrupt changes it will trigger notify call. This notifier call will try to update the qos request. Accessing the qos request which is not yet added to the list leads to a NULL pointer exception. Avoid this race by registering the notifier after adding the qos request. Test: build, boot Bug: 150901210 Change-Id: I99869cc233573b5db10e4f3224d65c29511050ea Signed-off-by: Anil Kumar Mamidala <amami@codeaurora.org> (cherry picked from commit `5db62557cb`) Signed-off-by: Hridya Valsaraju <hridya@google.com>	2020-04-23 00:35:53 +00:00
Greg Kroah-Hartman	fd8a9d61cf	This is the 4.19.117 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl6emrgACgkQONu9yGCS aT7KHhAAnWFfpGr89QEPUIDcdYNqmjnBlf7WRmVqQxbM+umD5AWo8fdLkKA43Fsx nNdMP6POYUwMqXahNOYwxCfRuw5sqsz/5bZO8O5p6fIXk1WhtW6Nzw78DHmDpQSj Cdfo92dJVhRcsCOElhrdsIypuBr7LoAOFjTGIzx4OZVXM3VJhWPpIgDEtU5yy/+S ym9TSU1RyQ9C/mIev3z6AXTAzAzWKdHXKtkWf3YW/7Mgr2QCcwmZxDlp9L1+L6e3 lLn2IMcFH91Wj0hJX98OhkmjA0EJ/LNU4LaaIe/DxGBEtzyLjn+aoxGIEREnU/Y6 36+3neWC3tJmUIzgyoRgVby+Jti3APEq3ncD0xzD8MAKitxihru7vKdTyfSWwmY0 xSz2UbCbbF1BeG3MZQNzgdSQCn4o21Iyxu+aQVGSvVd4k43x4jbtNedLqA6mHmkz 7I/V7UXyyzztDwlgT+DZa3LT6j4iv8VI6rPl7Evm3b5Iu9un3KLjnOEsXnvxjx9D o8dsPkK/pqbIW75bfThkoo8llmm/SsQ0n5GTKbITx9x0jU9E3VlQNHv+DUkT2CEn 1cY4hsVNql475RsOabhXbfOXI7+uwUCxKEOVN7DysT8UGARGIXZOkrGLr4UqjQHI B4J8oKBPPS5ZQKEIC7j/h4V/exqtSZYTQ1GWUNj4uo9X7KnJ+K4= =kytQ -----END PGP SIGNATURE----- Merge 4.19.117 into android-4.19 Changes in 4.19.117 amd-xgbe: Use __napi_schedule() in BH context hsr: check protocol version in hsr_newlink() net: ipv4: devinet: Fix crash when add/del multicast IP with autojoin net: ipv6: do not consider routes via gateways for anycast address check net: qrtr: send msgs from local of same id as broadcast net: revert default NAPI poll timeout to 2 jiffies net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode ovl: fix value of i_ino for lower hardlink corner case scsi: ufs: Fix ufshcd_hold() caused scheduling while atomic jbd2: improve comments about freeing data buffers whose page mapping is NULL pwm: pca9685: Fix PWM/GPIO inter-operation ext4: fix incorrect group count in ext4_fill_super error message ext4: fix incorrect inodes per group in error message ASoC: Intel: mrfld: fix incorrect check on p->sink ASoC: Intel: mrfld: return error codes when an error occurs ALSA: usb-audio: Filter error from connector kctl ops, too ALSA: usb-audio: Don't override ignore_ctl_error value from the map ALSA: usb-audio: Don't create jack controls for PCM terminals ALSA: usb-audio: Check mapping at creating connector controls, too keys: Fix proc_keys_next to increase position index tracing: Fix the race between registering 'snapshot' event trigger and triggering 'snapshot' operation btrfs: check commit root generation in should_ignore_root mac80211_hwsim: Use kstrndup() in place of kasprintf() usb: dwc3: gadget: don't enable interrupt when disabling endpoint usb: dwc3: gadget: Don't clear flags before transfer ended drm/amd/powerplay: force the trim of the mclk dpm_levels if OD is enabled ext4: do not zeroout extents beyond i_disksize kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD scsi: target: remove boilerplate code scsi: target: fix hang when multiple threads try to destroy the same iscsi session x86/microcode/AMD: Increase microcode PATCH_MAX_SIZE x86/resctrl: Preserve CDP enable over CPU hotplug x86/resctrl: Fix invalid attempt at removing the default resource group wil6210: check rx_buff_mgmt before accessing it wil6210: ignore HALP ICR if already handled wil6210: add general initialization/size checks wil6210: make sure Rx ring sizes are correlated wil6210: remove reset file from debugfs mm/vmalloc.c: move 'area->pages' after if statement Linux 4.19.117 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ib4ab9aa34c22c034887be15902a625ecc5622b35	2020-04-21 10:20:12 +02:00
Xiao Yang	57f2a2ad73	tracing: Fix the race between registering 'snapshot' event trigger and triggering 'snapshot' operation commit 0bbe7f719985efd9adb3454679ecef0984cb6800 upstream. Traced event can trigger 'snapshot' operation(i.e. calls snapshot_trigger() or snapshot_count_trigger()) when register_snapshot_trigger() has completed registration but doesn't allocate buffer for 'snapshot' event trigger. In the rare case, 'snapshot' operation always detects the lack of allocated buffer so make register_snapshot_trigger() allocate buffer first. trigger-snapshot.tc in kselftest reproduces the issue on slow vm: ----------------------------------------------------------- cat trace ... ftracetest-3028 [002] .... 236.784290: sched_process_fork: comm=ftracetest pid=3028 child_comm=ftracetest child_pid=3036 <...>-2875 [003] .... 240.460335: tracing_snapshot_instance_cond: * SNAPSHOT NOT ALLOCATED * <...>-2875 [003] .... 240.460338: tracing_snapshot_instance_cond: * stopping trace here! * ----------------------------------------------------------- Link: http://lkml.kernel.org/r/20200414015145.66236-1-yangx.jy@cn.fujitsu.com Cc: stable@vger.kernel.org Fixes: `93e31ffbf4` ("tracing: Add 'snapshot' event trigger command") Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:03:09 +02:00
Greg Kroah-Hartman	95bff4cdab	This is the 4.19.116 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl6ZbYYACgkQONu9yGCS aT76ohAAn4lIjSuMRCILy/lq0DXVWDy7q6YdfyzNBITxc86tVfnfjMeQxUBviE/1 OzShWgMRXeqrb0xJTJ5Rv6mt5Kf9a3DpPWt2jwo1iqWkl4AihDtDV7Z2Bh+QdnSX +lQ1xGPqDi4MMgoYlpMtlFc3wq/pJV0i8Q7amXC/KbsDkt5dlDrQYeEZHe2P7pR9 ZljKLHEdGRE3uGqXmEM8qb6aLjQudnHmH/9uChP4UX6b+ZADDCc05DMhEkhEoCZT jdxiqVZvRdiiXTc1r6ckGv0xae77s0IAAZMQAd+24zFK94QByi6d9Cw0y6qyyDi7 1rfHIWSjvetY3+4DCQDOu/k2/pLt/Vqh9zuvtaf8Tu8cKM9rxow0Hl9FlL3fZpBN btpqeCY6twFxApHoAp9ZDK6otaVEOtbg1MCsmpUbVxWIF9IR8cPqMGyYK3lR2Ao1 HgdKEFkYOycAOu51ujuHsDLx/9k2ZqeSPyh0yrdVpFUVvMV/YqoYP9X3jzGRVllL hgYfFcywgrVgxK4c02/6cPiJNbFskTpLllDPVVXGIjO+9R4vTRUgJ74CNrqL25aT ioSFWJA00UvXObnbCDdA+otYYWAmYOJX7HVvEieb0oDqPYHZHa1UW6+1WlYSAQLm WAsHiejOv6PwzRmCDI6RyuZKQjjX6bppAWFq0/RLPO0uEqjXlxc= =Iq3k -----END PGP SIGNATURE----- Merge 4.19.116 into android-4.19 Changes in 4.19.116 ARM: dts: sun8i-a83t-tbs-a711: HM5065 doesn't like such a high voltage bus: sunxi-rsb: Return correct data when mixing 16-bit and 8-bit reads net: vxge: fix wrong __VA_ARGS__ usage hinic: fix a bug of waitting for IO stopped hinic: fix wrong para of wait_for_completion_timeout cxgb4/ptp: pass the sign of offset delta in FW CMD qlcnic: Fix bad kzalloc null test i2c: st: fix missing struct parameter description cpufreq: imx6q: Fixes unwanted cpu overclocking on i.MX6ULL media: venus: hfi_parser: Ignore HEVC encoding for V1 firmware: arm_sdei: fix double-lock on hibernate with shared events null_blk: Fix the null_add_dev() error path null_blk: Handle null_add_dev() failures properly null_blk: fix spurious IO errors after failed past-wp access xhci: bail out early if driver can't accress host in resume x86: Don't let pgprot_modify() change the page encryption bit block: keep bdi->io_pages in sync with max_sectors_kb for stacked devices irqchip/versatile-fpga: Handle chained IRQs properly sched: Avoid scale real weight down to zero selftests/x86/ptrace_syscall_32: Fix no-vDSO segfault PCI/switchtec: Fix init_completion race condition with poll_wait() media: i2c: video-i2c: fix build errors due to 'imply hwmon' libata: Remove extra scsi_host_put() in ata_scsi_add_hosts() pstore/platform: fix potential mem leak if pstore_init_fs failed gfs2: Don't demote a glock until its revokes are written x86/boot: Use unsigned comparison for addresses efi/x86: Ignore the memory attributes table on i386 genirq/irqdomain: Check pointer in irq_domain_alloc_irqs_hierarchy() block: Fix use-after-free issue accessing struct io_cq media: i2c: ov5695: Fix power on and off sequences usb: dwc3: core: add support for disabling SS instances in park mode irqchip/gic-v4: Provide irq_retrigger to avoid circular locking dependency md: check arrays is suspended in mddev_detach before call quiesce operations firmware: fix a double abort case with fw_load_sysfs_fallback locking/lockdep: Avoid recursion in lockdep_count_{for,back}ward_deps() block, bfq: fix use-after-free in bfq_idle_slice_timer_body btrfs: qgroup: ensure qgroup_rescan_running is only set when the worker is at least queued btrfs: remove a BUG_ON() from merge_reloc_roots() btrfs: track reloc roots based on their commit root bytenr IB/mlx5: Replace tunnel mpls capability bits for tunnel_offloads uapi: rename ext2_swab() to swab() and share globally in swab.h slub: improve bit diffusion for freelist ptr obfuscation ASoC: fix regwmask ASoC: dapm: connect virtual mux with default value ASoC: dpcm: allow start or stop during pause for backend ASoC: topology: use name_prefix for new kcontrol usb: gadget: f_fs: Fix use after free issue as part of queue failure usb: gadget: composite: Inform controller driver of self-powered ALSA: usb-audio: Add mixer workaround for TRX40 and co ALSA: hda: Add driver blacklist ALSA: hda: Fix potential access overflow in beep helper ALSA: ice1724: Fix invalid access for enumerated ctl items ALSA: pcm: oss: Fix regression by buffer overflow fix ALSA: doc: Document PC Beep Hidden Register on Realtek ALC256 ALSA: hda/realtek - Set principled PC Beep configuration for ALC256 ALSA: hda/realtek - Remove now-unnecessary XPS 13 headphone noise fixups ALSA: hda/realtek - Add quirk for MSI GL63 media: ti-vpe: cal: fix disable_irqs to only the intended target acpi/x86: ignore unspecified bit positions in the ACPI global lock field thermal: devfreq_cooling: inline all stubs for CONFIG_DEVFREQ_THERMAL=n nvme-fc: Revert "add module to ops template to allow module references" nvme: Treat discovery subsystems as unique subsystems PCI: pciehp: Fix indefinite wait on sysfs requests PCI/ASPM: Clear the correct bits when enabling L1 substates PCI: Add boot interrupt quirk mechanism for Xeon chipsets PCI: endpoint: Fix for concurrent memory allocation in OB address region tpm: Don't make log failures fatal tpm: tpm1_bios_measurements_next should increase position index tpm: tpm2_bios_measurements_next should increase position index KEYS: reaching the keys quotas correctly irqchip/versatile-fpga: Apply clear-mask earlier pstore: pstore_ftrace_seq_next should increase position index MIPS/tlbex: Fix LDDIR usage in setup_pw() for Loongson-3 MIPS: OCTEON: irq: Fix potential NULL pointer dereference ath9k: Handle txpower changes even when TPC is disabled signal: Extend exec_id to 64bits x86/entry/32: Add missing ASM_CLAC to general_protection entry KVM: nVMX: Properly handle userspace interrupt window request KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks KVM: s390: vsie: Fix delivery of addressing exceptions KVM: x86: Allocate new rmap and large page tracking when moving memslot KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support KVM: x86: Gracefully handle __vmalloc() failure during VM allocation KVM: VMX: fix crash cleanup when KVM wasn't used CIFS: Fix bug which the return value by asynchronous read is error mtd: spinand: Stop using spinand->oobbuf for buffering bad block markers mtd: spinand: Do not erase the block before writing a bad block marker Btrfs: fix crash during unmount due to race with delayed inode workers btrfs: set update the uuid generation as soon as possible btrfs: drop block from cache on error in relocation btrfs: fix missing file extent item for hole after ranged fsync btrfs: fix missing semaphore unlock in btrfs_sync_file crypto: mxs-dcp - fix scatterlist linearization for hash erofs: correct the remaining shrink objects powerpc/pseries: Drop pointless static qualifier in vpa_debugfs_init() x86/speculation: Remove redundant arch_smt_update() invocation tools: gpio: Fix out-of-tree build regression mm: Use fixed constant in page_frag_alloc instead of size + 1 net: qualcomm: rmnet: Allow configuration updates to existing devices arm64: dts: allwinner: h6: Fix PMU compatible dm writecache: add cond_resched to avoid CPU hangs dm verity fec: fix memory leak in verity_fec_dtr scsi: zfcp: fix missing erp_lock in port recovery trigger for point-to-point arm64: armv8_deprecated: Fix undef_hook mask for thumb setend selftests: vm: drop dependencies on page flags from mlock2 tests rtc: omap: Use define directive for PIN_CONFIG_ACTIVE_HIGH drm/etnaviv: rework perfmon query infrastructure powerpc/pseries: Avoid NULL pointer dereference when drmem is unavailable NFS: Fix a page leak in nfs_destroy_unlinked_subrequests() ext4: fix a data race at inode->i_blocks fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() ocfs2: no need try to truncate file beyond i_size perf tools: Support Python 3.8+ in Makefile s390/diag: fix display of diagnose call statistics Input: i8042 - add Acer Aspire 5738z to nomux list clk: ingenic/jz4770: Exit with error if CGU init failed kmod: make request_module() return an error when autoloading is disabled cpufreq: powernv: Fix use-after-free hfsplus: fix crash and filesystem corruption when deleting files libata: Return correct status in sata_pmp_eh_recover_pm() when ATA_DFLAG_DETACH is set ipmi: fix hung processes in __get_guid() xen/blkfront: fix memory allocation flags in blkfront_setup_indirect() powerpc/powernv/idle: Restore AMR/UAMOR/AMOR after idle powerpc/64/tm: Don't let userspace set regs->trap via sigreturn powerpc/hash64/devmap: Use H_PAGE_THP_HUGE when setting up huge devmap PTE entries powerpc/xive: Use XIVE_BAD_IRQ instead of zero to catch non configured IPIs powerpc/kprobes: Ignore traps that happened in real mode scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug powerpc: Add attributes for setjmp/longjmp powerpc: Make setjmp/longjmp signature standard btrfs: use nofs allocations for running delayed items dm zoned: remove duplicate nr_rnd_zones increase in dmz_init_zone() crypto: caam - update xts sector size for large input length crypto: ccree - improve error handling crypto: ccree - zero out internal struct before use crypto: ccree - don't mangle the request assoclen crypto: ccree - dec auth tag size from cryptlen map crypto: ccree - only try to map auth tag if needed Revert "drm/dp_mst: Remove VCPI while disabling topology mgr" drm/dp_mst: Fix clearing payload state on topology disable drm: Remove PageReserved manipulation from drm_pci_alloc ftrace/kprobe: Show the maxactive number on kprobe_events powerpc/fsl_booke: Avoid creating duplicate tlb1 entry misc: echo: Remove unnecessary parentheses and simplify check for zero etnaviv: perfmon: fix total and idle HI cyleces readout mfd: dln2: Fix sanity checking for endpoints efi/x86: Fix the deletion of variables in mixed mode Linux 4.19.116 Change-Id: If09fbb53fcb11ea01eaaa7fee7ed21ed6234f352 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2020-04-18 13:33:51 +02:00
Minchan Kim	c27ea19ea4	ANDROID: GKI: attribute page lock and waitqueue functions as sched trace_sched_blocked_trace in CFS is really useful for debugging via trace because it tell where the process was stuck on callstack. For example, <...>-6143 ( 6136) [005] d..2 50.278987: sched_blocked_reason: pid=6136 iowait=0 caller=SyS_mprotect+0x88/0x208 <...>-6136 ( 6136) [005] d..2 50.278990: sched_blocked_reason: pid=6142 iowait=0 caller=do_page_fault+0x1f4/0x3b0 <...>-6142 ( 6136) [006] d..2 50.278996: sched_blocked_reason: pid=6144 iowait=0 caller=SyS_prctl+0x52c/0xb58 <...>-6144 ( 6136) [006] d..2 50.279007: sched_blocked_reason: pid=6136 iowait=0 caller=vm_mmap_pgoff+0x74/0x104 However, sometime it gives pointless information like this. RenderThread-2322 ( 1805) [006] d.s3 50.319046: sched_blocked_reason: pid=6136 iowait=1 caller=__lock_page_killable+0x17c/0x220 logd.writer-594 ( 587) [002] d.s3 50.334011: sched_blocked_reason: pid=6126 iowait=1 caller=wait_on_page_bit+0x194/0x208 kworker/u16:13-333 ( 333) [007] d.s4 50.343161: sched_blocked_reason: pid=6136 iowait=1 caller=__lock_page_killable+0x17c/0x220 Such wait_on_page_bit, __lock_page_killable are pointless because it doesn't carry on higher information to identify the callstack. The reason is page_lock and waitqueue are special synchronization method unlike other normal locks(mutex, spinlock). Let's mark them as "__sched" so get_wchan which used in trace_sched_blocked_trace could detect it and skip them. It will produce more meaningful callstack function like this. <...>-2867 ( 1068) [002] d.h4 124.209701: sched_blocked_reason: pid=329 iowait=0 caller=worker_thread+0x378/0x470 <...>-2867 ( 1068) [002] d.s3 124.209763: sched_blocked_reason: pid=8454 iowait=1 caller=__filemap_fdatawait_range+0xa0/0x104 <...>-2867 ( 1068) [002] d.s4 124.209803: sched_blocked_reason: pid=869 iowait=0 caller=worker_thread+0x378/0x470 ScreenDecoratio-2364 ( 1867) [002] d.s3 124.209973: sched_blocked_reason: pid=8454 iowait=1 caller=f2fs_wait_on_page_writeback+0x84/0xcc ScreenDecoratio-2364 ( 1867) [002] d.s4 124.209986: sched_blocked_reason: pid=869 iowait=0 caller=worker_thread+0x378/0x470 <...>-329 ( 329) [000] d..3 124.210435: sched_blocked_reason: pid=538 iowait=0 caller=worker_thread+0x378/0x470 kworker/u16:13-538 ( 538) [007] d..3 124.210450: sched_blocked_reason: pid=6 iowait=0 caller=worker_thread+0x378/0x470 Bug: 144961676 Bug: 144713689 Change-Id: I30397400c5d056946bdfbc86c9ef5f4d7e6c98fe Signed-off-by: Minchan Kim <minchan@google.com> Signed-off-by: Jimmy Shiu <jimmyshiu@google.com> Bug: 152417756 (cherry picked from commit 8a780c0eb6800cecbfce21362c2d2a3bcab14e1c) Signed-off-by: Saravana Kannan <saravanak@google.com>	2020-04-17 22:51:44 -07:00
Masami Hiramatsu	52f1c4257c	ftrace/kprobe: Show the maxactive number on kprobe_events [ Upstream commit 6a13a0d7b4d1171ef9b80ad69abc37e1daa941b3 ] Show maxactive parameter on kprobe_events. This allows user to save the current configuration and restore it without losing maxactive parameter. Link: http://lkml.kernel.org/r/4762764a-6df7-bc93-ed60-e336146dce1f@gmail.com Link: http://lkml.kernel.org/r/158503528846.22706.5549974121212526020.stgit@devnote2 Cc: stable@vger.kernel.org Fixes: `696ced4fb1` ("tracing/kprobes: expose maxactive for kretprobe in kprobe_events") Reported-by: Taeung Song <treeze.taeung@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:48:55 +02:00
Eric Biggers	2a87b491b7	kmod: make request_module() return an error when autoloading is disabled commit d7d27cfc5cf0766a26a8f56868c5ad5434735126 upstream. Patch series "module autoloading fixes and cleanups", v5. This series fixes a bug where request_module() was reporting success to kernel code when module autoloading had been completely disabled via 'echo > /proc/sys/kernel/modprobe'. It also addresses the issues raised on the original thread (https://lkml.kernel.org/lkml/20200310223731.126894-1-ebiggers@kernel.org/T/#u) bydocumenting the modprobe sysctl, adding a self-test for the empty path case, and downgrading a user-reachable WARN_ONCE(). This patch (of 4): It's long been possible to disable kernel module autoloading completely (while still allowing manual module insertion) by setting /proc/sys/kernel/modprobe to the empty string. This can be preferable to setting it to a nonexistent file since it avoids the overhead of an attempted execve(), avoids potential deadlocks, and avoids the call to security_kernel_module_request() and thus on SELinux-based systems eliminates the need to write SELinux rules to dontaudit module_request. However, when module autoloading is disabled in this way, request_module() returns 0. This is broken because callers expect 0 to mean that the module was successfully loaded. Apparently this was never noticed because this method of disabling module autoloading isn't used much, and also most callers don't use the return value of request_module() since it's always necessary to check whether the module registered its functionality or not anyway. But improperly returning 0 can indeed confuse a few callers, for example get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit: if (!fs && (request_module("fs-%.s", len, name) == 0)) { fs = __get_fs_type(name, len); WARN_ONCE(!fs, "request_module fs-%.s succeeded, but still no fs?\n", len, name); } This is easily reproduced with: echo > /proc/sys/kernel/modprobe mount -t NONEXISTENT none / It causes: request_module fs-NONEXISTENT succeeded, but still no fs? WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0 [...] This should actually use pr_warn_once() rather than WARN_ONCE(), since it's also user-reachable if userspace immediately unloads the module. Regardless, request_module() should correctly return an error when it fails. So let's make it return -ENOENT, which matches the error when the modprobe binary doesn't exist. I've also sent patches to document and test this case. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jessica Yu <jeyu@kernel.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Ben Hutchings <benh@debian.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org Link: http://lkml.kernel.org/r/20200312202552.241885-1-ebiggers@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:48:52 +02:00
Zhenzhong Duan	6209e0981b	x86/speculation: Remove redundant arch_smt_update() invocation commit 34d66caf251df91ff27b24a3a786810d29989eca upstream. With commit a74cfffb03b7 ("x86/speculation: Rework SMT state change"), arch_smt_update() is invoked from each individual CPU hotplug function. Therefore the extra arch_smt_update() call in the sysfs SMT control is redundant. Fixes: a74cfffb03b7 ("x86/speculation: Rework SMT state change") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <konrad.wilk@oracle.com> Cc: <dwmw@amazon.co.uk> Cc: <bp@suse.de> Cc: <srinivas.eeda@oracle.com> Cc: <peterz@infradead.org> Cc: <hpa@zytor.com> Link: https://lkml.kernel.org/r/e2e064f2-e8ef-42ca-bf4f-76b612964752@default Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:48:50 +02:00
Eric W. Biederman	a2a1be2de7	signal: Extend exec_id to 64bits commit d1e7fd6462ca9fc76650fbe6ca800e35b24267da upstream. Replace the 32bit exec_id with a 64bit exec_id to make it impossible to wrap the exec_id counter. With care an attacker can cause exec_id wrap and send arbitrary signals to a newly exec'd parent. This bypasses the signal sending checks if the parent changes their credentials during exec. The severity of this problem can been seen that in my limited testing of a 32bit exec_id it can take as little as 19s to exec 65536 times. Which means that it can take as little as 14 days to wrap a 32bit exec_id. Adam Zabrocki has succeeded wrapping the self_exe_id in 7 days. Even my slower timing is in the uptime of a typical server. Which means self_exec_id is simply a speed bump today, and if exec gets noticably faster self_exec_id won't even be a speed bump. Extending self_exec_id to 64bits introduces a problem on 32bit architectures where reading self_exec_id is no longer atomic and can take two read instructions. Which means that is is possible to hit a window where the read value of exec_id does not match the written value. So with very lucky timing after this change this still remains expoiltable. I have updated the update of exec_id on exec to use WRITE_ONCE and the read of exec_id in do_notify_parent to use READ_ONCE to make it clear that there is no locking between these two locations. Link: https://lore.kernel.org/kernel-hardening/20200324215049.GA3710@pi3.com.pl Fixes: 2.3.23pre2 Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:48:47 +02:00
Boqun Feng	c6090fe788	locking/lockdep: Avoid recursion in lockdep_count_{for,back}ward_deps() [ Upstream commit 25016bd7f4caf5fc983bbab7403d08e64cba3004 ] Qian Cai reported a bug when PROVE_RCU_LIST=y, and read on /proc/lockdep triggered a warning: [ ] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled) ... [ ] Call Trace: [ ] lock_is_held_type+0x5d/0x150 [ ] ? rcu_lockdep_current_cpu_online+0x64/0x80 [ ] rcu_read_lock_any_held+0xac/0x100 [ ] ? rcu_read_lock_held+0xc0/0xc0 [ ] ? __slab_free+0x421/0x540 [ ] ? kasan_kmalloc+0x9/0x10 [ ] ? __kmalloc_node+0x1d7/0x320 [ ] ? kvmalloc_node+0x6f/0x80 [ ] __bfs+0x28a/0x3c0 [ ] ? class_equal+0x30/0x30 [ ] lockdep_count_forward_deps+0x11a/0x1a0 The warning got triggered because lockdep_count_forward_deps() call __bfs() without current->lockdep_recursion being set, as a result a lockdep internal function (__bfs()) is checked by lockdep, which is unexpected, and the inconsistency between the irq-off state and the state traced by lockdep caused the warning. Apart from this warning, lockdep internal functions like __bfs() should always be protected by current->lockdep_recursion to avoid potential deadlocks and data inconsistency, therefore add the current->lockdep_recursion on-and-off section to protect __bfs() in both lockdep_count_forward_deps() and lockdep_count_backward_deps() Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200312151258.128036-1-boqun.feng@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:48:42 +02:00
Alexander Sverdlin	1b16ddb28b	genirq/irqdomain: Check pointer in irq_domain_alloc_irqs_hierarchy() [ Upstream commit 87f2d1c662fa1761359fdf558246f97e484d177a ] irq_domain_alloc_irqs_hierarchy() has 3 call sites in the compilation unit but only one of them checks for the pointer which is being dereferenced inside the called function. Move the check into the function. This allows for catching the error instead of the following crash: Unable to handle kernel NULL pointer dereference at virtual address 00000000 PC is at 0x0 LR is at gpiochip_hierarchy_irq_domain_alloc+0x11f/0x140 ... [<c06c23ff>] (gpiochip_hierarchy_irq_domain_alloc) [<c0462a89>] (__irq_domain_alloc_irqs) [<c0462dad>] (irq_create_fwspec_mapping) [<c06c2251>] (gpiochip_to_irq) [<c06c1c9b>] (gpiod_to_irq) [<bf973073>] (gpio_irqs_init [gpio_irqs]) [<bf974048>] (gpio_irqs_exit+0xecc/0xe84 [gpio_irqs]) Code: bad PC value Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200306174720.82604-1-alexander.sverdlin@nokia.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:48:41 +02:00
Michael Wang	2851621747	sched: Avoid scale real weight down to zero [ Upstream commit 26cf52229efc87e2effa9d788f9b33c40fb3358a ] During our testing, we found a case that shares no longer working correctly, the cgroup topology is like: /sys/fs/cgroup/cpu/A (shares=102400) /sys/fs/cgroup/cpu/A/B (shares=2) /sys/fs/cgroup/cpu/A/B/C (shares=1024) /sys/fs/cgroup/cpu/D (shares=1024) /sys/fs/cgroup/cpu/D/E (shares=1024) /sys/fs/cgroup/cpu/D/E/F (shares=1024) The same benchmark is running in group C & F, no other tasks are running, the benchmark is capable to consumed all the CPUs. We suppose the group C will win more CPU resources since it could enjoy all the shares of group A, but it's F who wins much more. The reason is because we have group B with shares as 2, since A->cfs_rq.load.weight == B->se.load.weight == B->shares/nr_cpus, so A->cfs_rq.load.weight become very small. And in calc_group_shares() we calculate shares as: load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg); shares = (tg_shares * load) / tg_weight; Since the 'cfs_rq->load.weight' is too small, the load become 0 after scale down, although 'tg_shares' is 102400, shares of the se which stand for group A on root cfs_rq become 2. While the se of D on root cfs_rq is far more bigger than 2, so it wins the battle. Thus when scale_load_down() scale real weight down to 0, it's no longer telling the real story, the caller will have the wrong information and the calculation will be buggy. This patch add check in scale_load_down(), so the real weight will be >= MIN_SHARES after scale, after applied the group C wins as expected. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/38e8e212-59a1-64b2-b247-b6d0b52d8dc1@linux.alibaba.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:48:40 +02:00
Aaro Koskinen	4a853c72f4	UPSTREAM: GKI: panic/reboot: allow specifying reboot_mode for panic only Allow specifying reboot_mode for panic only. This is needed on systems where ramoops is used to store panic logs, and user wants to use warm reset to preserve those, while still having cold reset on normal reboots. Link: http://lkml.kernel.org/r/20190322004735.27702-1-aaro.koskinen@iki.fi Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit b287a25a7148a89d977c819c1f7d6584f875b682) Signed-off-by: Mark Salyzyn <salyzyn@google.com> Bug: 154175554 Change-Id: Id1075f4d97eddb818aa495903a7643958e6c73d6	2020-04-17 05:00:40 +00:00
Mark Salyzyn	a6a25a9d07	ANDROID: GKI: qcom: Fix compile issue when setting msm_lmh_dcvs as a module Partial cherry picked from commit 5cb59eb5283f6f5a900c3c4971f7efbd83b7e43a ("qcom: Fix compile issue when setting msm_lmh_dcvs as a module") adjusted: kernel/trace/power-traces.c skipped: drivers/thermal/qcom/lmh_dbg.h drivers/thermal/qcom/msm_lmh_dcvs.c Export the trace symbol -- clock_set_rate -- for the msm_lmh_dcvs driver. Signed-off-by: Will McVicker <willmcvicker@google.com> Change-Id: Ic68bc07997d73ba55f9ba7deff7dc7eef320e4bf (cherry picked from commit 5cb59eb5283f6f5a900c3c4971f7efbd83b7e43a) Signed-off-by: Mark Salyzyn <salyzyn@google.com> Bug: 154153737	2020-04-17 03:24:20 +00:00
Masahiro Yamada	9cd9f31cf2	UPSTREAM: kheaders: include only headers into kheaders_data.tar.xz Currently, kheaders_data.tar.xz contains some build scripts as well as headers. None of them is needed in the header archive. For ARCH=x86, this commit excludes the following from the archive: arch/x86/include/asm/Kbuild arch/x86/include/uapi/asm/Kbuild include/asm-generic/Kbuild include/config/auto.conf include/config/kernel.release include/config/tristate.conf include/uapi/asm-generic/Kbuild include/uapi/Kbuild kernel/gen_kheaders.sh This change is actually motivated for the planned header compile-testing because it will generate more build artifacts, which should not be included in the archive. Change-Id: I688e041842740216cace0373ca9f358bc7704809 Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> (cherry picked from commit 7199ff7d74003b5aad1e6328bf6128cd8ceea735) Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>	2020-04-16 18:00:25 +00:00
Masahiro Yamada	667cbc0e2a	UPSTREAM: kheaders: remove meaningless -R option of 'ls' The -R option of 'ls' is supposed to be used for directories. -R, --recursive list subdirectories recursively Since 'find ... -type f' only matches to regular files, we do not expect directories passed to the 'ls' command here. Giving -R is harmless at least, but unneeded. Change-Id: I73588f18e40824ccecc4149fbc467015b5c5e142 Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> (cherry picked from commit b60b7c2ea9b7f854d457fefd592c77f621a86580) Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>	2020-04-16 18:00:19 +00:00
Saravana Kannan	cb2fe03684	ANDROID: GKI: Add vendor fields to root_domain This is needed for ABI compatibility Bug: 153905799 Change-Id: Idd802feeb29652e9f575faff8a0770af5697eedb Signed-off-by: Saravana Kannan <saravanak@google.com>	2020-04-14 16:21:06 +00:00
Mark Salyzyn	35be952ae0	Revert "BACKPORT: mm: reclaim small amounts of memory when an external fragmentation event occurs" This reverts commit `5cbbeadd5a`. Reason for revert: revert customized code Bug: 140544941 Test: boot Signed-off-by: Minchan Kim <minchan@google.com> Signed-off-by: Martin Liu <liumartin@google.com> Signed-off-by: Mark Salyzyn <salyzyn@google.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I65735f27f6a44a112957bcec07e2f63f2d8ccff6	2020-04-14 19:35:27 +08:00
Will McVicker	56ebfff5eb	ANDROID: GKI: panic: add vendor callback function in panic() Each vendor might want to implement some debug code when the kernel panics. So, add a vendor_panic_cb callback for vendors to implement. Bug: 149258398 Test: compile Change-Id: I7a374b0089f72c2511db6fe3b8cdd18f41a1eb6c Signed-off-by: Saravana Kannan <saravanak@google.com> (cherry picked from commit 911d9c70c2c50b0383ed0b652bb84ca8832e4a2b) Signed-off-by: Will McVicker <willmcvicker@google.com> [willmcvicker: only pulled in the ABI diffs]	2020-04-14 05:24:52 +00:00
Will McVicker	0a2394dc5a	ANDROID: GKI: export symbols from abi_gki_aarch64_qcom_whitelist Run the script, $ ../build/gki/add_EXPORT_SYMBOL_GPL < abi_gki_aarch64_qcom_whitelist This will export all the required symbols that are in this kernel. Signed-off-by: Will McVicker <willmcvicker@google.com> Bug: 153886473 Test: compile Change-Id: I703509d75104cd86f472481346e3efbd235121ab	2020-04-13 21:36:41 +00:00
Greg Kroah-Hartman	2d2af525a7	This is the 4.19.115 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl6USBUACgkQONu9yGCS aT6dNxAA1BJKHbO1TOMTYi21N8XNbMOOblVxrLDe+Y2nEj2KIqiehlsoreV34F/g IswNAuA3JXp7pU53RIKsTIWvx9CvNit55sJ1eKWTfFZGCotsBWH9Xzeh9Ao1wURG vhE5tX8PUwEzZ/sFphVmVvv5oUkQyHYKpEosyVOqL5eIQe5E430PxB/xvz4I0Vyq HHiXmrNekXi5kY156k1RqLQ/RhKMFPNi7swm1uFKLS1qrcIlQzgq5MFk5l59oEMo xob25EeeddVa/4roNSVk9IZGZjXpRPsvRM8kxjSXn2KVz1aO8TgYXF1RyWeNthsZ VXf6XkasSh3bwMX6imhV1fGmepG3OvSZg0k2EvRTpcY84kFlIrC1l2YuOHrCETgL GkptUtGK0a2FEiyBK/0nxvf2E6iaoT4NeTYlyTkL8iOgJ+xMvuSzCpFfQjfkOjGz h3AD+Twqu7lqY54nOvyAkA94joEFzVuzSoYCABAImFq4kvu4khhWBXTmkqUf47aI 1O3m4bMEMLDBRwiBpRsu5c0C+ghHHQtOWTH/UjyOI1aGEKFZyBe5CHYmRo2W9tDg rrlymg1iVMR1o9pvzzRroCokKCzBSirEWKxyyMIFWko5xQvTvae5fTIaAWlvBGjP oH3eIPDWw1ZD1WxiSGzM2Wx4AyumZ1y3pnOHV3uUnYb3cM0l9g8= =bfll -----END PGP SIGNATURE----- Merge 4.19.115 into android-4.19 Changes in 4.19.115 ipv4: fix a RCU-list lock in fib_triestat_seq_show net, ip_tunnel: fix interface lookup with no key sctp: fix refcount bug in sctp_wfree sctp: fix possibly using a bad saddr with a given dst nvme-rdma: Avoid double freeing of async event data drm/amd/display: Add link_rate quirk for Apple 15" MBP 2017 drm/bochs: downgrade pci_request_region failure from error to warning initramfs: restore default compression behavior drm/amdgpu: fix typo for vcn1 idle check tools/power turbostat: Fix gcc build warnings tools/power turbostat: Fix missing SYS_LPI counter on some Chromebooks drm/etnaviv: replace MMU flush marker with flush sequence media: rc: IR signal for Panasonic air conditioner too long misc: rtsx: set correct pcr_ops for rts522A misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices misc: pci_endpoint_test: Avoid using module parameter to determine irqtype coresight: do not use the BIT() macro in the UAPI header mei: me: add cedar fork device ids extcon: axp288: Add wakeup support power: supply: axp288_charger: Add special handling for HP Pavilion x2 10 ALSA: hda/ca0132 - Add Recon3Di quirk to handle integrated sound on EVGA X99 Classified motherboard rxrpc: Fix sendmsg(MSG_WAITALL) handling net: Fix Tx hash bound checking padata: always acquire cpu_hotplug_lock before pinst->lock bitops: protect variables in set_mask_bits() macro include/linux/notifier.h: SRCU: fix ctags mm: mempolicy: require at least one nodeid for MPOL_PREFERRED ipv6: don't auto-add link-local address to lag ports net: dsa: bcm_sf2: Do not register slave MDIO bus with OF net: dsa: bcm_sf2: Ensure correct sub-node is parsed net: phy: micrel: kszphy_resume(): add delay after genphy_resume() before accessing PHY registers net: stmmac: dwmac1000: fix out-of-bounds mac address reg setting slcan: Don't transmit uninitialized stack data in padding mlxsw: spectrum_flower: Do not stop at FLOW_ACTION_VLAN_MANGLE random: always use batched entropy for get_random_u{32,64} usb: dwc3: gadget: Wrap around when skip TRBs tools/accounting/getdelays.c: fix netlink attribute length hwrng: imx-rngc - fix an error path ASoC: jz4740-i2s: Fix divider written at incorrect offset in register IB/hfi1: Call kobject_put() when kobject_init_and_add() fails IB/hfi1: Fix memory leaks in sysfs registration and unregistration ceph: remove the extra slashes in the server path ceph: canonicalize server path in place RDMA/ucma: Put a lock around every call to the rdma_cm layer RDMA/cma: Teach lockdep about the order of rtnl and lock Bluetooth: RFCOMM: fix ODEBUG bug in rfcomm_dev_ioctl RDMA/cm: Update num_paths in cma_resolve_iboe_route error flow fbcon: fix null-ptr-deref in fbcon_switch clk: qcom: rcg: Return failure for RCG update drm/msm: stop abusing dma_map/unmap for cache arm64: Fix size of __early_cpu_boot_status rpmsg: glink: Remove chunk size word align warning usb: dwc3: don't set gadget->is_otg flag drm_dp_mst_topology: fix broken drm_dp_sideband_parse_remote_dpcd_read() drm/msm: Use the correct dma_sync calls in msm_gem Linux 4.19.115 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Idc17d8aa387491167efc60df0a9764b82e4344da	2020-04-13 13:09:17 +02:00
Daniel Jordan	bf498d6b8d	padata: always acquire cpu_hotplug_lock before pinst->lock commit 38228e8848cd7dd86ccb90406af32de0cad24be3 upstream. lockdep complains when padata's paths to update cpumasks via CPU hotplug and sysfs are both taken: # echo 0 > /sys/devices/system/cpu/cpu1/online # echo ff > /sys/kernel/pcrypt/pencrypt/parallel_cpumask ====================================================== WARNING: possible circular locking dependency detected 5.4.0-rc8-padata-cpuhp-v3+ #1 Not tainted ------------------------------------------------------ bash/205 is trying to acquire lock: ffffffff8286bcd0 (cpu_hotplug_lock.rw_sem){++++}, at: padata_set_cpumask+0x2b/0x120 but task is already holding lock: ffff8880001abfa0 (&pinst->lock){+.+.}, at: padata_set_cpumask+0x26/0x120 which lock already depends on the new lock. padata doesn't take cpu_hotplug_lock and pinst->lock in a consistent order. Which should be first? CPU hotplug calls into padata with cpu_hotplug_lock already held, so it should have priority. Fixes: `6751fb3c0e` ("padata: Use get_online_cpus/put_online_cpus") Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-13 10:45:05 +02:00
Tri Vo	e64d8dcc68	ANDROID: GKI: kernel/dma, mm/cma: Export symbols needed by vendor modules This allows them to work with a GKI kernel. Bug: 140651649 Bug: 140651863 Change-Id: I41ae14d90df31d552b2a0eab89a20d7ba8a9243d Signed-off-by: Tri Vo <trong@google.com> [saravanak partial cherry-pick and dropped a ton of changes] Signed-off-by: Saravana Kannan <saravanak@google.com>	2020-04-10 02:39:49 +00:00
Saravana Kannan	61a3356ad1	ANDROID: GKI: Add stub __cpu_isolated_mask symbol Some vendor modules might want to keep track of isolated CPUs. Add a stub symbol that never isolates any CPU. Bug: 149816871 Signed-off-by: Saravana Kannan <saravanak@google.com> Change-Id: Ia494314168e94d72b0c1e8b786c150b9403dbf1a	2020-04-10 00:48:38 +00:00
Quentin Perret	eead51495c	ANDROID: GKI: sched: stub sched_isolate symbols These are needed to let modules load during compliance testing, but the underlying core-isolation feature is not necessary for android-common. Bug: 149816871 Test: compiled, checked abi diff for missing sched_isolate symbols Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: Iaece1e98f821c50f2497b4a47b60714f49272750	2020-04-10 00:48:27 +00:00
Kelly Rossmoyer	e7b509cf04	ANDROID: power: wakeup_reason: wake reason enhancements These changes build upon the existing Android kernel wakeup reason code to: * improve the positioning of suspend abort logging calls in suspend flow * add logging of abnormal wakeup reasons like unexpected HW IRQs and IRQs configured as both wake-enabled and no-suspend * add support for capturing deferred-processing threaded nested IRQs as wakeup reasons rather than their synchronously-processed parents Bug: 150970830 Bug: 140217217 Signed-off-by: Kelly Rossmoyer <krossmo@google.com> Change-Id: I903b811a0fe11a605a25815c3a341668a23de700	2020-04-09 15:27:37 +00:00
Lina Iyer	723feab600	ANDROID: GKI: QoS: Enhance framework to support cpu/irq specific QoS requests QoS request for CPU_DMA_LATENCY can be better optimized if the request can be set only for the required cpus and not all cpus. This helps save power on other cores, while still gauranteeing the quality of service. Enhance the QoS constraints data structures to support target value for each core. Requests specify if the QoS is applicable to all cores (default) or to a selective subset of the cores or to a core(s), that the IRQ is affine to. QoS requests that need to track an IRQ can be set to apply only on the cpus to which the IRQ's smp_affinity attribute is set to. The QoS framework will automatically track IRQ migration between the cores. The QoS is updated to be applied only to the core(s) that the IRQ has been migrated to. Idle and interested drivers can request a PM QoS value for a constraint across all cpus, or a specific cpu or a set of cpus. Separate APIs have been added to request for individual cpu or a cpumask. The default behaviour of PM QoS is maintained i.e, requests that do not specify a type of the request will continue to be effected on all cores. Requests that want to specify an affinity of cpu(s) or an irq, can modify the PM QoS request data structures by specifying the type of the request and either the mask of the cpus or the IRQ number depending on the type. Updating the request does not reset the type of the request. The userspace sysfs interface does not support CPU/IRQ affinity. Signed-off-by: Lina Iyer <ilina@codeaurora.org> (cherry picked from commit `16cafdb44f`) Bug: 153463922 Test: build Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I1aad836a985fd303f2254fe607bb909a6b720dd5	2020-04-08 16:37:25 +00:00
Eric Biggers	a898a4764c	FROMLIST: kmod: make request_module() return an error when autoloading is disabled It's long been possible to disable kernel module autoloading completely (while still allowing manual module insertion) by setting /proc/sys/kernel/modprobe to the empty string. This can be preferable to setting it to a nonexistent file since it avoids the overhead of an attempted execve(), avoids potential deadlocks, and avoids the call to security_kernel_module_request() and thus on SELinux-based systems eliminates the need to write SELinux rules to dontaudit module_request. However, when module autoloading is disabled in this way, request_module() returns 0. This is broken because callers expect 0 to mean that the module was successfully loaded. Apparently this was never noticed because this method of disabling module autoloading isn't used much, and also most callers don't use the return value of request_module() since it's always necessary to check whether the module registered its functionality or not anyway. But improperly returning 0 can indeed confuse a few callers, for example get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit: if (!fs && (request_module("fs-%.s", len, name) == 0)) { fs = __get_fs_type(name, len); WARN_ONCE(!fs, "request_module fs-%.s succeeded, but still no fs?\n", len, name); } This is easily reproduced with: echo > /proc/sys/kernel/modprobe mount -t NONEXISTENT none / It causes: request_module fs-NONEXISTENT succeeded, but still no fs? WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0 [...] This should actually use pr_warn_once() rather than WARN_ONCE(), since it's also user-reachable if userspace immediately unloads the module. Regardless, request_module() should correctly return an error when it fails. So let's make it return -ENOENT, which matches the error when the modprobe binary doesn't exist. I've also sent patches to document and test this case. Acked-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Jessica Yu <jeyu@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: NeilBrown <neilb@suse.com> Link: https://lore.kernel.org/r/20200318230515.171692-2-ebiggers@kernel.org Bug: 151589316 Change-Id: I5e04f85e12a4f85da23e53bc11da1ade565abcd6 Signed-off-by: Eric Biggers <ebiggers@google.com>	2020-04-06 10:44:34 -07:00
Alistair Delva	182b38fc31	ANDROID: Fix wq fp check for CFI builds A previous change added a test on the wrong config flag; rename CFI to CFI_CLANG. Bug: 145210207 Change-Id: Id8aead2eb2c75ad6442d10165f6cb86ccfb9c2f9 Signed-off-by: Alistair Delva <adelva@google.com>	2020-04-04 16:30:40 +00:00
Greg Kroah-Hartman	6ca29140d7	This is the 4.19.114 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl6F6HkACgkQONu9yGCS aT7FkxAAgZOwRDVRkqjfSE+MBAqbE41sO3iAWmv9gQazdK+APGdQaasQ73gBdcuQ wliG5W9k9J0qkcnUIAnEgooAWXB9+7p4NF1BZHmpmYleXZckmXtaDK3cKgFWAOVD KMQgiEYHgdm6otlNf328uOmoaggN1wRqmMsW/PZys0AvQ183oTsidhQwfOofCt3k LwJiu5o+gJCIePrqKuHtkteKmjFR1KQ2RZHPmJ2ApoxVymBreJWKMl8ZVCRyteDx JoWZfprPnZZaqb83ylkpE/lXyut0etT2zmI+W/Bg4LFDZTVfqw+HPB7opvITfP0p 6H0YwH9Qn/BiOcP6JncVUPLe8/bEiOJ/jsJwPRCcl0C7PmDrn6uhBNVfrY4CreAL h38/vKSwK8iduyPpne6zq6hQDYBTdEpBDtXFsnElNBmyIE7yIH3ta8qDYsW13Fr7 x9U7F9KagIR1AH2b/uMzjlTDv85hvzGP8vS06S1gJn6RJP0WSDtpE7RNT6MkfMIw Ti16a9nEJ3H+Zn76vdvlLirmziETsIVpxHSDRu/X9QfxJmXHnXg7581bu8aGZ1zN 6xwWP9mWA8KJzbX5mxXChHoZ9qQ/o4D10MxS+7DXFYiya4prHWphyTS2MYbzMzIl TIOJ54FVg01QiQbh29X05hvd3RMOkdzJ9Tggq8oTSLvgTIUSmi0= =jtGQ -----END PGP SIGNATURE----- Merge 4.19.114 into android-4.19 Changes in 4.19.114 mmc: core: Allow host controllers to require R1B for CMD6 mmc: core: Respect MMC_CAP_NEED_RSP_BUSY for erase/trim/discard mmc: core: Respect MMC_CAP_NEED_RSP_BUSY for eMMC sleep command mmc: sdhci-omap: Fix busy detection by enabling MMC_CAP_NEED_RSP_BUSY mmc: sdhci-tegra: Fix busy detection by enabling MMC_CAP_NEED_RSP_BUSY Revert "drm/dp_mst: Skip validating ports during destruction, just ref" geneve: move debug check after netdev unregister hsr: fix general protection fault in hsr_addr_is_self() macsec: restrict to ethernet devices mlxsw: spectrum_mr: Fix list iteration in error path net: cbs: Fix software cbs to consider packet sending time net: dsa: Fix duplicate frames flooded by learning net: mvneta: Fix the case where the last poll did not process all rx net/packet: tpacket_rcv: avoid a producer race condition net: qmi_wwan: add support for ASKEY WWHC050 net_sched: cls_route: remove the right filter from hashtable net_sched: keep alloc_hash updated after hash allocation net: stmmac: dwmac-rk: fix error path in rk_gmac_probe NFC: fdp: Fix a signedness bug in fdp_nci_send_patch() slcan: not call free_netdev before rtnl_unlock in slcan_open bnxt_en: fix memory leaks in bnxt_dcbnl_ieee_getets() bnxt_en: Reset rings if ring reservation fails during open() net: ip_gre: Separate ERSPAN newlink / changelink callbacks net: ip_gre: Accept IFLA_INFO_DATA-less configuration net: dsa: mt7530: Change the LINK bit to reflect the link status net: phy: mdio-mux-bcm-iproc: check clk_prepare_enable() return value r8169: re-enable MSI on RTL8168c tcp: repair: fix TCP_QUEUE_SEQ implementation vxlan: check return value of gro_cells_init() hsr: use rcu_read_lock() in hsr_get_node_{list/status}() hsr: add restart routine into hsr_get_node_list() hsr: set .netnsok flag cgroup-v1: cgroup_pidlist_next should update position index nfs: add minor version to nfs_server_key for fscache cpupower: avoid multiple definition with gcc -fno-common drivers/of/of_mdio.c:fix of_mdiobus_register() cgroup1: don't call release_agent when it is "" dt-bindings: net: FMan erratum A050385 arm64: dts: ls1043a: FMan erratum A050385 fsl/fman: detect FMan erratum A050385 s390/qeth: handle error when backing RX buffer scsi: ipr: Fix softlockup when rescanning devices in petitboot mac80211: Do not send mesh HWMP PREQ if HWMP is disabled dpaa_eth: Remove unnecessary boolean expression in dpaa_get_headroom sxgbe: Fix off by one in samsung driver strncpy size arg ftrace/x86: Anotate text_mutex split between ftrace_arch_code_modify_post_process() and ftrace_arch_code_modify_prepare() i2c: hix5hd2: add missed clk_disable_unprepare in remove Input: raydium_i2c_ts - fix error codes in raydium_i2c_boot_trigger() Input: synaptics - enable RMI on HP Envy 13-ad105ng Input: avoid BIT() macro usage in the serio.h UAPI header ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULL ARM: dts: dra7: Add bus_dma_limit for L3 bus ARM: dts: omap5: Add bus_dma_limit for L3 bus perf probe: Do not depend on dwfl_module_addrsym() tools: Let O= makes handle a relative path with -C option scripts/dtc: Remove redundant YYLOC global declaration scsi: sd: Fix optimal I/O size for devices that change reported values nl80211: fix NL80211_ATTR_CHANNEL_WIDTH attribute type mac80211: mark station unauthorized before key removal gpiolib: acpi: Correct comment for HP x2 10 honor_wakeup quirk gpiolib: acpi: Rework honor_wakeup option into an ignore_wake option gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 BYT + AXP288 model RDMA/core: Ensure security pkey modify is not lost genirq: Fix reference leaks on irq affinity notifiers xfrm: handle NETDEV_UNREGISTER for xfrm device vti[6]: fix packet tx through bpf_redirect() in XinY cases RDMA/mlx5: Block delay drop to unprivileged users xfrm: fix uctx len check in verify_sec_ctx_len xfrm: add the missing verify_sec_ctx_len check in xfrm_add_acquire xfrm: policy: Fix doulbe free in xfrm_policy_timer afs: Fix some tracing details netfilter: flowtable: reload ip{v6}h in nf_flow_tuple_ip{v6} netfilter: nft_fwd_netdev: validate family and chain type bpf/btf: Fix BTF verification of enum members in struct/union vti6: Fix memory leak of skb if input policy check fails Revert "r8169: check that Realtek PHY driver module is loaded" mac80211: add option for setting control flags mac80211: set IEEE80211_TX_CTRL_PORT_CTRL_PROTO for nl80211 TX USB: serial: option: add support for ASKEY WWHC050 USB: serial: option: add BroadMobi BM806U USB: serial: option: add Wistron Neweb D19Q1 USB: cdc-acm: restore capability check order USB: serial: io_edgeport: fix slab-out-of-bounds read in edge_interrupt_callback usb: musb: fix crash with highmen PIO and usbmon media: flexcop-usb: fix endpoint sanity check media: usbtv: fix control-message timeouts staging: rtl8188eu: Add ASUS USB-N10 Nano B1 to device table staging: wlan-ng: fix ODEBUG bug in prism2sta_disconnect_usb staging: wlan-ng: fix use-after-free Read in hfa384x_usbin_callback ahci: Add Intel Comet Lake H RAID PCI ID libfs: fix infoleak in simple_attr_read() media: ov519: add missing endpoint sanity checks media: dib0700: fix rc endpoint lookup media: stv06xx: add missing descriptor sanity checks media: xirlink_cit: add missing descriptor sanity checks mac80211: Check port authorization in the ieee80211_tx_dequeue() case mac80211: fix authentication with iwlwifi/mvm vt: selection, introduce vc_is_sel vt: ioctl, switch VT_IS_IN_USE and VT_BUSY to inlines vt: switch vt_dont_switch to bool vt: vt_ioctl: remove unnecessary console allocation checks vt: vt_ioctl: fix VT_DISALLOCATE freeing in-use virtual console vt: vt_ioctl: fix use-after-free in vt_in_use() platform/x86: pmc_atom: Add Lex 2I385SW to critclk_systems DMI table bpf: Explicitly memset the bpf_attr structure bpf: Explicitly memset some bpf info structures declared on the stack gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 CHT + AXP288 model net: ks8851-ml: Fix IO operations, again arm64: alternative: fix build with clang integrated assembler perf map: Fix off by one in strncpy() size argument ARM: dts: oxnas: Fix clear-mask property ARM: bcm2835-rpi-zero-w: Add missing pinctrl name ARM: dts: imx6: phycore-som: fix arm and soc minimum voltage ARM: dts: N900: fix onenand timings arm64: dts: ls1043a-rdb: correct RGMII delay mode to rgmii-id arm64: dts: ls1046ardb: set RGMII interfaces to RGMII_ID mode Linux 4.19.114 Change-Id: Icc165d2e49aba750e1b5a8856d9774c149e59ce7 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2020-04-03 08:17:23 +02:00
Saravana Kannan	1d887ea976	ANDROID: GKI: kernel: Export task and IRQ affinity symbols A module uses these symbols. So, export them to allow loading of that module. Bug: 149816871 Bug: 149256712 Signed-off-by: Saravana Kannan <saravanak@google.com> Change-Id: I949da5d091894ea3d79a6c9244bfc2f8426eee71 (cherry picked from commit dc928ba3bdfb4527e0ffca7c491d946a02e5bd11) [ qperret: made changes to commit message for AOSP compliance ] Signed-off-by: Quentin Perret <qperret@google.com>	2020-04-02 16:27:12 -07:00
Greg Kroah-Hartman	638d8c748e	bpf: Explicitly memset some bpf info structures declared on the stack commit 5c6f25887963f15492b604dd25cb149c501bbabf upstream. Trying to initialize a structure with "= {};" will not always clean out all padding locations in a structure. So be explicit and call memset to initialize everything for a number of bpf information structures that are then copied from userspace, sometimes from smaller memory locations than the size of the structure. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200320162258.GA794295@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-02 15:28:23 +02:00
Greg Kroah-Hartman	aca6a9b098	bpf: Explicitly memset the bpf_attr structure commit 8096f229421f7b22433775e928d506f0342e5907 upstream. For the bpf syscall, we are relying on the compiler to properly zero out the bpf_attr union that we copy userspace data into. Unfortunately that doesn't always work properly, padding and other oddities might not be correctly zeroed, and in some tests odd things have been found when the stack is pre-initialized to other values. Fix this by explicitly memsetting the structure to 0 before using it. Reported-by: Maciej Żenczykowski <maze@google.com> Reported-by: John Stultz <john.stultz@linaro.org> Reported-by: Alexander Potapenko <glider@google.com> Reported-by: Alistair Delva <adelva@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://android-review.googlesource.com/c/kernel/common/+/1235490 Link: https://lore.kernel.org/bpf/20200320094813.GA421650@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-02 15:28:23 +02:00
Yoshiki Komachi	fb957d1003	bpf/btf: Fix BTF verification of enum members in struct/union commit da6c7faeb103c493e505e87643272f70be586635 upstream. btf_enum_check_member() was currently sure to recognize the size of "enum" type members in struct/union as the size of "int" even if its size was packed. This patch fixes BTF enum verification to use the correct size of member in BPF programs. Fixes: `179cde8cef` ("bpf: btf: Check members of struct/union") Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1583825550-18606-2-git-send-email-komachi.yoshiki@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-02 15:28:19 +02:00
Edward Cree	277db1b634	genirq: Fix reference leaks on irq affinity notifiers commit df81dfcfd6991d547653d46c051bac195cd182c1 upstream. The handling of notify->work did not properly maintain notify->kref in two cases: 1) where the work was already scheduled, another irq_set_affinity_locked() would get the ref and (no-op-ly) schedule the work. Thus when irq_affinity_notify() ran, it would drop the original ref but not the additional one. 2) when cancelling the (old) work in irq_set_affinity_notifier(), if there was outstanding work a ref had been got for it but was never put. Fix both by checking the return values of the work handling functions (schedule_work() for (1) and cancel_work_sync() for (2)) and put the extra ref if the return value indicates preexisting work. Fixes: `cd7eab44e9` ("genirq: Add IRQ affinity notifiers") Fixes: 59c39840f5ab ("genirq: Prevent use-after-free and work list corruption") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ben Hutchings <ben@decadent.org.uk> Link: https://lkml.kernel.org/r/24f5983f-2ab5-e83a-44ee-a45b5f9300f5@solarflare.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-02 15:28:18 +02:00
Tycho Andersen	5a8a69435d	cgroup1: don't call release_agent when it is "" [ Upstream commit 2e5383d7904e60529136727e49629a82058a5607 ] Older (and maybe current) versions of systemd set release_agent to "" when shutting down, but do not set notify_on_release to 0. Since `64e90a8acb` ("Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper()"), we filter out such calls when the user mode helper path is "". However, when used in conjunction with an actual (i.e. non "") STATIC_USERMODEHELPER, the path is never "", so the real usermode helper will be called with argv[0] == "". Let's avoid this by not invoking the release_agent when it is "". Signed-off-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-02 15:28:14 +02:00
Vasily Averin	967e97461e	cgroup-v1: cgroup_pidlist_next should update position index [ Upstream commit db8dd9697238be70a6b4f9d0284cd89f59c0e070 ] if seq_file .next fuction does not change position index, read after some lseek can generate unexpected output. # mount \| grep cgroup # dd if=/mnt/cgroup.procs bs=1 # normal output ... 1294 1295 1296 1304 1382 584+0 records in 584+0 records out 584 bytes copied dd: /mnt/cgroup.procs: cannot skip to specified offset 83 <<< generates end of last line 1383 <<< ... and whole last line once again 0+1 records in 0+1 records out 8 bytes copied dd: /mnt/cgroup.procs: cannot skip to specified offset 1386 <<< generates last line anyway 0+1 records in 0+1 records out 5 bytes copied https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-02 15:28:14 +02:00
Qais Yousef	b38208f02b	FROMGIT: sched/rt: cpupri_find: Trigger a full search as fallback If we failed to find a fitting CPU, in cpupri_find(), we only fallback to the level we found a hit at. But Steve suggested to fallback to a second full scan instead as this could be a better effort. https://lore.kernel.org/lkml/20200304135404.146c56eb@gandalf.local.home/ We trigger the 2nd search unconditionally since the argument about triggering a full search is that the recorded fall back level might have become empty by then. Which means storing any data about what happened would be meaningless and stale. I had a humble try at timing it and it seemed okay for the small 6 CPUs system I was running on https://lore.kernel.org/lkml/20200305124324.42x6ehjxbnjkklnh@e107158-lin.cambridge.arm.com/ On large system this second full scan could be expensive. But there are no users outside capacity awareness for this fitness function at the moment. Heterogeneous systems tend to be small with 8cores in total. Bug: 120440300 Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20200310142219.syxzn5ljpdxqtbgx@e107158-lin.cambridge.arm.com (cherry picked from commit e94f80f6c49020008e6fa0f3d4b806b8595d17d8 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ib20d400be47cd913a43a5c71fafee6a7fffb78aa	2020-03-26 12:04:38 +00:00
Qais Yousef	32061ff6b5	FROMGIT: sched/rt: Remove unnecessary push for unfit tasks In task_woken_rt() and switched_to_rto() we try trigger push-pull if the task is unfit. But the logic is found lacking because if the task was the only one running on the CPU, then rt_rq is not in overloaded state and won't trigger a push. The necessity of this logic was under a debate as well, a summary of the discussion can be found in the following thread: https://lore.kernel.org/lkml/20200226160247.iqvdakiqbakk2llz@e107158-lin.cambridge.arm.com/ Remove the logic for now until a better approach is agreed upon. Bug: 120440300 Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware") Link: https://lkml.kernel.org/r/20200302132721.8353-6-qais.yousef@arm.com (cherry picked from commit d94a9df49069ba8ff7c4aaeca1229e6471a01a15 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Id120ada4a89972b3feb8d8b022babb98db1a157f	2020-03-26 12:04:38 +00:00
Qais Yousef	8efce17187	BACKPORT: FROMGIT: sched/rt: Allow pulling unfitting task When implemented RT Capacity Awareness; the logic was done such that if a task was running on a fitting CPU, then it was sticky and we would try our best to keep it there. But as Steve suggested, to adhere to the strict priority rules of RT class; allow pulling an RT task to unfitting CPU to ensure it gets a chance to run ASAP. Bug: 120440300 LINK: https://lore.kernel.org/lkml/20200203111451.0d1da58f@oasis.local.home/ Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware") Link: https://lkml.kernel.org/r/20200302132721.8353-5-qais.yousef@arm.com (cherry picked from commit 98ca645f824301bde72e0a51cdc8bdbbea6774a5 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) [Trivial merge conflict] Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ie25fa5a4f3b0979ed06df8d156e5586b2928479e	2020-03-26 12:04:38 +00:00
Qais Yousef	2bf4a52c7a	FROMGIT: sched/rt: Optimize cpupri_find() on non-heterogenous systems By introducing a new cpupri_find_fitness() function that takes the fitness_fn as an argument and only called when asym_system static key is enabled. cpupri_find() is now a wrapper function that calls cpupri_find_fitness() passing NULL as a fitness_fn, hence disabling the logic that handles fitness by default. Bug: 120440300 LINK: https://lore.kernel.org/lkml/c0772fca-0a4b-c88d-fdf2-5715fcf8447b@arm.com/ Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware") Link: https://lkml.kernel.org/r/20200302132721.8353-4-qais.yousef@arm.com (cherry picked from commit a1bd02e1f28b1939cac8c64072a0e578c3cbc345 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I8ad4d9e391030ae499f7a1805485147de64abcdf	2020-03-26 12:04:38 +00:00
Qais Yousef	27d84b63b0	FROMGIT: sched/rt: Re-instate old behavior in select_task_rq_rt() When RT Capacity Aware support was added, the logic in select_task_rq_rt was modified to force a search for a fitting CPU if the task currently doesn't run on one. But if the search failed, and the search was only triggered to fulfill the fitness request; we could end up selecting a new CPU unnecessarily. Fix this and re-instate the original behavior by ensuring we bail out in that case. This behavior change only affected asymmetric systems that are using util_clamp to implement capacity aware. None asymmetric systems weren't affected. Bug: 120440300 LINK: https://lore.kernel.org/lkml/20200218041620.GD28029@codeaurora.org/ Reported-by: Pavan Kondeti <pkondeti@codeaurora.org> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware") Link: https://lkml.kernel.org/r/20200302132721.8353-3-qais.yousef@arm.com (cherry picked from commit b28bc1e002c23ff8a4999c4a2fb1d4d412bc6f5e https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I670ab7f95a3bd8b4790e1cafe89308ead524367e	2020-03-26 12:04:38 +00:00
Qais Yousef	ca79ac3cba	BACKPORT: FROMGIT: sched/rt: cpupri_find: Implement fallback mechanism for !fit case When searching for the best lowest_mask with a fitness_fn passed, make sure we record the lowest_level that returns a valid lowest_mask so that we can use that as a fallback in case we fail to find a fitting CPU at all levels. The intention in the original patch was not to allow a down migration to unfitting CPU. But this missed the case where we are already running on unfitting one. With this change now RT tasks can still move between unfitting CPUs when they're already running on such CPU. And as Steve suggested; to adhere to the strict priority rules of RT, if a task is already running on a fitting CPU but due to priority it can't run on it, allow it to downmigrate to unfitting CPU so it can run. Bug: 120440300 Reported-by: Pavan Kondeti <pkondeti@codeaurora.org> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware") Link: https://lkml.kernel.org/r/20200302132721.8353-2-qais.yousef@arm.com Link: https://lore.kernel.org/lkml/20200203142712.a7yvlyo2y3le5cpn@e107158-lin/ (cherry picked from commit d9cb236b9429044dc694ea70a50163ddd283cea6 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) [Trivial merge conflict] Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I3430e9624f8f7b11d3875c39c5765a51aec4a6f5	2020-03-26 12:04:38 +00:00
Greg Kroah-Hartman	248555d63c	This is the 4.19.113 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl57BzQACgkQONu9yGCS aT4sXg/9ERHCo0CNoF+KpzfPcH718NEzICFa5pHVE5OJnjGQ+W+AUx4ERkU9gMrk W4N+zAFH+v6D/ejNVUPKVB+XMXeZo+QXVBfLme/N4VlrxaSeA2pMxiRrXVY1UX3/ mb6eShhpBn5q942c8QkzcQJbA1TUKnrqGuqq6rfDQTnjm6OcU/PbPHWqOBVIuOVk Op9VSCcTC41N8sCdsjg411fRBue24+zRU05mw4leeqh0f/XaLjZ9xJyEAuW+6zIz Vu/8c/vzb4j9TPBg0FaEzgDrR7TUwde7F/O9eejY/GWdQngyfPxjOGpKdakrQOOA qRnPv0Fou8hFlKZWFKJhAYcPpTc7KP/y9o1aUsNDrIeA00t9R3nDKng+ewW09hfk K75Znyw6yrWlc5nHotd1pG2NEEeDvSKdXVbyarnaTwqunla9WAdQ0IDFwdaiTklt CfrJ+AJcd5Smnuo+JfljqF4oF88UJSzhI5hp0Zi9w0JROfbPOYFK0JM2DeEOE27J IFm1Z5lvTj4VEJmyLL7CvJSM23yjK5todlG3+zFJt2ZncY2Kw1eHEOIvIwbBtxBp 2AWRkco+hsf+GToJNQopxGYyTjMI3NDy/FocAVIJ8wMSEWZkyS8NkIlgjPjTC2dk ygJ0ZDDiPU0pKouZofQhzGR/Esv/phjWvTLPerFkKIIuiaG+uUo= =1W2f -----END PGP SIGNATURE----- Merge 4.19.113 into android-4.19 Changes in 4.19.113 drm/mediatek: Find the cursor plane instead of hard coding it spi: qup: call spi_qup_pm_resume_runtime before suspending powerpc: Include .BTF section ARM: dts: dra7: Add "dma-ranges" property to PCIe RC DT nodes spi: pxa2xx: Add CS control clock quirk spi/zynqmp: remove entry that causes a cs glitch drm/exynos: dsi: propagate error value and silence meaningless warning drm/exynos: dsi: fix workaround for the legacy clock name drivers/perf: arm_pmu_acpi: Fix incorrect checking of gicc pointer altera-stapl: altera_get_note: prevent write beyond end of 'key' dm bio record: save/restore bi_end_io and bi_integrity dm integrity: use dm_bio_record and dm_bio_restore riscv: avoid the PIC offset of static percpu data in module beyond 2G limits drm/amd/display: Clear link settings on MST disable connector drm/amd/display: fix dcc swath size calculations on dcn1 xenbus: req->body should be updated before req->state xenbus: req->err should be updated before req->state block, bfq: fix overwrite of bfq_group pointer in bfq_find_set_group() parse-maintainers: Mark as executable USB: Disable LPM on WD19's Realtek Hub usb: quirks: add NO_LPM quirk for RTL8153 based ethernet adapters USB: serial: option: add ME910G1 ECM composition 0x110b usb: host: xhci-plat: add a shutdown USB: serial: pl2303: add device-id for HP LD381 usb: xhci: apply XHCI_SUSPEND_DELAY to AMD XHCI controller 1022:145c ALSA: line6: Fix endless MIDI read loop ALSA: seq: virmidi: Fix running status after receiving sysex ALSA: seq: oss: Fix running status after receiving sysex ALSA: pcm: oss: Avoid plugin buffer overflow ALSA: pcm: oss: Remove WARNING from snd_pcm_plug_alloc() checks iio: st_sensors: remap SMO8840 to LIS2DH12 iio: trigger: stm32-timer: disable master mode when stopping iio: magnetometer: ak8974: Fix negative raw values in sysfs iio: adc: at91-sama5d2_adc: fix differential channels in triggered mode mmc: rtsx_pci: Fix support for speed-modes that relies on tuning mmc: sdhci-of-at91: fix cd-gpios for SAMA5D2 staging: rtl8188eu: Add device id for MERCUSYS MW150US v2 staging: greybus: loopback_test: fix poll-mask build breakage staging/speakup: fix get_word non-space look-ahead intel_th: Fix user-visible error codes intel_th: pci: Add Elkhart Lake CPU support rtc: max8907: add missing select REGMAP_IRQ xhci: Do not open code __print_symbolic() in xhci trace events btrfs: fix log context list corruption after rename whiteout error drm/amd/amdgpu: Fix GPR read from debugfs (v2) drm/lease: fix WARNING in idr_destroy memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event mm: slub: be more careful about the double cmpxchg of freelist mm, slub: prevent kmalloc_node crashes and memory leaks page-flags: fix a crash at SetPageError(THP_SWAP) x86/mm: split vmalloc_sync_all() USB: cdc-acm: fix close_delay and closing_wait units in TIOCSSERIAL USB: cdc-acm: fix rounding error in TIOCSSERIAL iio: light: vcnl4000: update sampling periods for vcnl4200 kbuild: Disable -Wpointer-to-enum-cast futex: Fix inode life-time issue futex: Unbreak futex hashing Revert "vrf: mark skb for multicast or link-local as enslaved to VRF" Revert "ipv6: Fix handling of LLA with VRF and sockets bound to VRF" ALSA: hda/realtek: Fix pop noise on ALC225 arm64: smp: fix smp_send_stop() behaviour arm64: smp: fix crash_smp_send_stop() behaviour drm/bridge: dw-hdmi: fix AVI frame colorimetry staging: greybus: loopback_test: fix potential path truncation staging: greybus: loopback_test: fix potential path truncations Linux 4.19.113 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I90c48cd7189a964e59d199ecc0f32c0a68688ec5	2020-03-25 09:50:38 +01:00
Thomas Gleixner	17a8ca79a5	futex: Unbreak futex hashing commit 8d67743653dce5a0e7aa500fcccb237cde7ad88e upstream. The recent futex inode life time fix changed the ordering of the futex key union struct members, but forgot to adjust the hash function accordingly, As a result the hashing omits the leading 64bit and even hashes beyond the futex key causing a bad hash distribution which led to a ~100% performance regression. Hand in the futex key pointer instead of a random struct member and make the size calculation based of the struct offset. Fixes: 8019ad13ef7f ("futex: Fix inode life-time issue") Reported-by: Rong Chen <rong.a.chen@intel.com> Decoded-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Rong Chen <rong.a.chen@intel.com> Link: https://lkml.kernel.org/r/87h7yy90ve.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-25 08:06:14 +01:00
Peter Zijlstra	e6d506cd22	futex: Fix inode life-time issue commit 8019ad13ef7f64be44d4f892af9c840179009254 upstream. As reported by Jann, ihold() does not in fact guarantee inode persistence. And instead of making it so, replace the usage of inode pointers with a per boot, machine wide, unique inode identifier. This sequence number is global, but shared (file backed) futexes are rare enough that this should not become a performance issue. Reported-by: Jann Horn <jannh@google.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-25 08:06:14 +01:00
Joerg Roedel	6c1051ffc7	x86/mm: split vmalloc_sync_all() commit 763802b53a427ed3cbd419dbba255c414fdd9e7c upstream. Commit 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in the vunmap() code-path. While this change was necessary to maintain correctness on x86-32-pae kernels, it also adds additional cycles for architectures that don't need it. Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported severe performance regressions in micro-benchmarks because it now also calls the x86-64 implementation of vmalloc_sync_all() on vunmap(). But the vmalloc_sync_all() implementation on x86-64 is only needed for newly created mappings. To avoid the unnecessary work on x86-64 and to gain the performance back, split up vmalloc_sync_all() into two functions: * vmalloc_sync_mappings(), and * vmalloc_sync_unmappings() Most call-sites to vmalloc_sync_all() only care about new mappings being synchronized. The only exception is the new call-site added in the above mentioned commit. Shile Zhang directed us to a report of an 80% regression in reaim throughput. Fixes: 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()") Reported-by: kernel test robot <oliver.sang@intel.com> Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Borislav Petkov <bp@suse.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [GHES] Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20191009124418.8286-1-joro@8bytes.org Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/ Link: http://lkml.kernel.org/r/20191113095530.228959-1-shile.zhang@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-25 08:06:13 +01:00
Mahesh Sivasubramanian	db6999554d	ANDROID: GKI: kernel: tick-sched: Add API to get the next wakeup for a CPU Add get_next_event_cpu to get the next wakeup time for the CPU. This is used by the sleep driver if it has to query the next wakeup for a CPU other than the thread that its running on. Test: make Bug: 150895657 Signed-off-by: Mahesh Sivasubramanian <msivasub@codeaurora.org> Change-Id: I0f0347f9648932a55cb64c630694d0a2e290b633 (cherry picked from commit `e948653a6a`) [hridya: also added an EXPORT_SYMBOL_GPL(get_next_event_cpu) to make the symbol available to kernel modules.] Signed-off-by: Hridya Valsaraju <hridya@google.com>	2020-03-23 12:21:04 -07:00
Greg Kroah-Hartman	cb57b8b85f	UPSTREAM: bpf: Explicitly memset some bpf info structures declared on the stack Trying to initialize a structure with "= {};" will not always clean out all padding locations in a structure. So be explicit and call memset to initialize everything for a number of bpf information structures that are then copied from userspace, sometimes from smaller memory locations than the size of the structure. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200320162258.GA794295@kroah.com (cherry picked from commit 269efb7fc478563a7e7b22590d8076823f4ac82a) Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I52a2cab20aa310085ec104bd811ac4f2b83657b6	2020-03-23 16:40:45 +01:00
Greg Kroah-Hartman	de2b205fa7	UPSTREAM: bpf: Explicitly memset the bpf_attr structure For the bpf syscall, we are relying on the compiler to properly zero out the bpf_attr union that we copy userspace data into. Unfortunately that doesn't always work properly, padding and other oddities might not be correctly zeroed, and in some tests odd things have been found when the stack is pre-initialized to other values. Fix this by explicitly memsetting the structure to 0 before using it. Reported-by: Maciej Żenczykowski <maze@google.com> Reported-by: John Stultz <john.stultz@linaro.org> Reported-by: Alexander Potapenko <glider@google.com> Reported-by: Alistair Delva <adelva@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://android-review.googlesource.com/c/kernel/common/+/1235490 Link: https://lore.kernel.org/bpf/20200320094813.GA421650@kroah.com (cherry picked from commit 8096f229421f7b22433775e928d506f0342e5907) Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I2dc28cd45024da5cc6861ff4a9b25fae389cc6d8	2020-03-23 16:39:13 +01:00
Greg Kroah-Hartman	417d28a44d	This is the 4.19.112 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl50oUYACgkQONu9yGCS aT5w2Q//TEz14rFZh31L+QyAmW2lPGb9UnukLgUjs5RQKbRL/Bl3h5hU0EbYMl4i fdb09WsU3Ns2twelCnzWqsXPgc3GiDrSMgOfDR1jJpELzAk4WGgxdYoVj2Wkyi6x Uus1KDH5tEgwvHMMjwszDxi8N3behr3IidcvxN6EnTRbmSmLck2rlfp1Y0q4hqF9 /CDIHxYfArXFVS0sxG3tGduf2wK2XieHr2YLTetcXs8W0GV7KmrSxmcEOsQOjtnv aifFuuRY7T4XLfkxoWephKl/YZfVsCAlgOpLn5BgwfSQyXHr/X4GUG+MI7/rsQ6l X5cZiqoKAM8K4olnYJB23PHTmSK3AEp8sWQBRS6WxZPcZlu/eq6qnwxhnVBrDNKP l40CarJECD8bc3cZnKRZuQjPC2s7ay8KpsLFfSSn0bAmPNVaTF65oqfccA1OMtJd nuyBOQNQ4LATemtkloDI+Xxpkhng/klXH+/yVyhbaNFXkCE8XsPaAIClpLf5rVWx ojREnXSNfCr3GhPPskVLfjiYhBEEdwlmOB+dBUUqxSNL3IZkdRXPy2Hpv2KFmuUR 9VNyndsFUFjKraZR29+CBIgFjgsAInvdqvyJkBAtqaAgH3u0nTK/zA/mx6yOGA6o ayFotBONmvfGDSRZUNoSkSHcEXHUSNucEzrR/1aMEAb+zqCEgjk= =nZnf -----END PGP SIGNATURE----- Merge 4.19.112 into android-4.19 Changes in 4.19.112 perf/amd/uncore: Replace manual sampling check with CAP_NO_INTERRUPT flag mmc: sdhci-omap: Add platform specific reset callback mmc: sdhci-omap: Workaround errata regarding SDR104/HS200 tuning failures (i929) mmc: host: Fix Kconfig warnings on keystone_defconfig ACPI: watchdog: Allow disabling WDAT at boot HID: apple: Add support for recent firmware on Magic Keyboards HID: i2c-hid: add Trekstor Surfbook E11B to descriptor override cfg80211: check reg_rule for NULL in handle_channel_custom() scsi: libfc: free response frame from GPN_ID net: usb: qmi_wwan: restore mtu min/max values after raw_ip switch net: ks8851-ml: Fix IRQ handling and locking mac80211: rx: avoid RCU list traversal under mutex signal: avoid double atomic counter increments for user accounting slip: not call free_netdev before rtnl_unlock in slip_open hinic: fix a irq affinity bug hinic: fix a bug of setting hw_ioctxt net: rmnet: fix NULL pointer dereference in rmnet_newlink() net: rmnet: fix NULL pointer dereference in rmnet_changelink() net: rmnet: fix suspicious RCU usage net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device() net: rmnet: do not allow to change mux id if mux id is duplicated net: rmnet: use upper/lower device infrastructure net: rmnet: fix bridge mode bugs net: rmnet: fix packet forwarding in rmnet bridge mode sfc: fix timestamp reconstruction at 16-bit rollover points jbd2: fix data races at struct journal_head wimax: i2400: fix memory leak wimax: i2400: Fix memory leak in i2400m_op_rfkill_sw_toggle mmc: sdhci-omap: Don't finish_mrq() on a command error during tuning mmc: sdhci-omap: Fix Tuning procedure for temperatures < -20C driver core: Remove the link if there is no driver with AUTO flag driver core: Fix adding device links to probing suppliers driver core: Make driver core own stateful device links driver core: Add device link flag DL_FLAG_AUTOPROBE_CONSUMER driver core: Remove device link creation limitation driver core: Fix creation of device links with PM-runtime flags net: qrtr: fix len of skb_put_padto in qrtr_node_enqueue ARM: 8957/1: VDSO: Match ARMv8 timer in cntvct_functional() ARM: 8958/1: rename missed uaccess .fixup section mm: slub: add missing TID bump in kmem_cache_alloc_bulk() HID: google: add moonball USB id efi: Fix debugobjects warning on 'efi_rts_work' ipv4: ensure rcu_read_lock() in cipso_v4_error() Linux 4.19.112 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I68bb3ea9d74f698994a1b958d112827a0873a0f7	2020-03-21 08:37:27 +01:00
Linus Torvalds	797479da0a	signal: avoid double atomic counter increments for user accounting [ Upstream commit fda31c50292a5062332fa0343c084bd9f46604d9 ] When queueing a signal, we increment both the users count of pending signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount of the user struct itself (because we keep a reference to the user in the signal structure in order to correctly account for it when freeing). That turns out to be fairly expensive, because both of them are atomic updates, and particularly under extreme signal handling pressure on big machines, you can get a lot of cache contention on the user struct. That can then cause horrid cacheline ping-pong when you do these multiple accesses. So change the reference counting to only pin the user for the _first_ pending signal, and to unpin it when the last pending signal is dequeued. That means that when a user sees a lot of concurrent signal queuing - which is the only situation when this matters - the only atomic access needed is generally the 'sigpending' count update. This was noticed because of a particularly odd timing artifact on a dual-socket 96C/192T Cascade Lake platform: when you get into bad contention, on that machine for some reason seems to be much worse when the contention happens in the upper 32-byte half of the cacheline. As a result, the kernel test robot will-it-scale 'signal1' benchmark had an odd performance regression simply due to random alignment of the 'struct user_struct' (and pointed to a completely unrelated and apparently nonsensical commit for the regression). Avoiding the double increments (and decrements on the dequeueing side, of course) makes for much less contention and hugely improved performance on that will-it-scale microbenchmark. Quoting Feng Tang: "It makes a big difference, that the performance score is tripled! bump from original 17000 to 54000. Also the gap between 5.0-rc6 and 5.0-rc6+Jiri's patch is reduced to around 2%" [ The "2% gap" is the odd cacheline placement difference on that platform: under the extreme contention case, the effect of which half of the cacheline was hot was 5%, so with the reduced contention the odd timing artifact is reduced too ] It does help in the non-contended case too, but is not nearly as noticeable. Reported-and-tested-by: Feng Tang <feng.tang@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Philip Li <philip.li@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-03-20 11:55:53 +01:00
Greg Kroah-Hartman	bfe2901c20	This is the 4.19.111 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl5xvNEACgkQONu9yGCS aT6VUg//SJSSC5IX7gulaIm8IzvVijE7EKkdkjukJ4TD672J1QqzXVlhKp8tSvAV ZBknOar0AP5sDNtvF3cgz0t6w6IJHrLWGyWqcMfUTC75M9HVZH6YUgHDkPmi0g8f dyTrVe20/lC5yBNAFmS0pnYB+UfL8biJEF6N++pULZQhOY0eRr6BMKdl2npxH7D3 YL/jipdGHmwkr/OgOtRaOBgEP6HIu1xKnZUkGzvhF0BOxAM/ib/5lQognOD6x4Hm 9vHzc8+nBXlWj6N7XkE+I3RiZumUx+vEr2kLljdrTE7cH7ALzJQl4GQ6Db6lbd0E q78Y44FhrfKiwxDeGPHKOX39sgzVwCsKhwTg3a4Rq4Aq0I7QQoPikAyCUj9kaeFq q8bI0Wub+4nQhzuyv6UgRWaQnIBZxXe56M8z3u4CTy6ljwvn4hXeZ9bkVRyXdQtS D4h3WtxFwBed0tQGb5ypv83Wg/lwK8bQHab4LDV9AZNZ3Jrbg70ldlea0GiA8Csc Y3MncS6zF9mnAU8ZdsYT3GNkRQS3OTFNeb7+V5MRgdCnG3xk5GltHTy0JYhZKmXH 8zMXlUgUeyihFx6f7LwFhYk8NTSg3+W700SKND/zd+VK8m7mqT7PB1bkny5zJ6aC teehBWmHlxZlL1ENXya8lUEEsOieAWxi3IMlhOEo2roidPW1N0o= =ryZW -----END PGP SIGNATURE----- Merge 4.19.111 into android-4.19 Changes in 4.19.111 phy: Revert toggling reset changes. net: phy: Avoid multiple suspends cgroup, netclassid: periodically release file_lock on classid updating gre: fix uninit-value in __iptunnel_pull_header inet_diag: return classid for all socket types ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface ipvlan: add cond_resched_rcu() while processing muticast backlog ipvlan: do not add hardware address of master to its unicast filter list ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast() ipvlan: don't deref eth hdr before checking it's set net/ipv6: use configured metric when add peer route netlink: Use netlink header as base to calculate bad attribute offset net: macsec: update SCI upon MAC address change. net: nfc: fix bounds checking bugs on "pipe" net/packet: tpacket_rcv: do not increment ring index on drop net: stmmac: dwmac1000: Disable ACS if enhanced descs are not used net: systemport: fix index check to avoid an array out of bounds access r8152: check disconnect status after long sleep sfc: detach from cb_page in efx_copy_channel() bnxt_en: reinitialize IRQs when MTU is modified cgroup: memcg: net: do not associate sock with unrelated cgroup net: memcg: late association of sock to memcg net: memcg: fix lockdep splat in inet_csk_accept() devlink: validate length of param values fib: add missing attribute validation for tun_id nl802154: add missing attribute validation nl802154: add missing attribute validation for dev_type can: add missing attribute validation for termination macsec: add missing attribute validation for port net: fq: add missing attribute validation for orphan mask team: add missing attribute validation for port ifindex team: add missing attribute validation for array index nfc: add missing attribute validation for SE API nfc: add missing attribute validation for deactivate target nfc: add missing attribute validation for vendor subcommand net: phy: fix MDIO bus PM PHY resuming selftests/net/fib_tests: update addr_metric_test for peer route testing net/ipv6: need update peer route when modify metric net/ipv6: remove the old peer route if change it to a new one tipc: add missing attribute validation for MTU property devlink: validate length of region addr/len bonding/alb: make sure arp header is pulled before accessing it slip: make slhc_compress() more robust against malicious packets net: fec: validate the new settings in fec_enet_set_coalesce() macvlan: add cond_resched() during multicast processing cgroup: cgroup_procs_next should increase position index cgroup: Iterate tasks that did not finish do_exit() iwlwifi: mvm: Do not require PHY_SKU NVM section for 3168 devices virtio-blk: fix hw_queue stopped on arbitrary error iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn + add_taint netfilter: nf_conntrack: ct_cpu_seq_next should increase position index netfilter: synproxy: synproxy_cpu_seq_next should increase position index netfilter: xt_recent: recent_seq_next should increase position index netfilter: x_tables: xt_mttg_seq_next should increase position index workqueue: don't use wq_select_unbound_cpu() for bound works drm/amd/display: remove duplicated assignment to grph_obj_type ktest: Add timeout for ssh sync testing cifs_atomic_open(): fix double-put on late allocation failure gfs2_atomic_open(): fix O_EXCL\|O_CREAT handling on cold dcache KVM: x86: clear stale x86_emulate_ctxt->intercept value ARC: define __ALIGN_STR and __ALIGN symbols for ARC macintosh: windfarm: fix MODINFO regression efi: Fix a race and a buffer overflow while reading efivars via sysfs efi: Make efi_rts_work accessible to efi page fault handler mt76: fix array overflow on receiving too many fragments for a packet x86/mce: Fix logic and comments around MSR_PPIN_CTL iommu/dma: Fix MSI reservation allocation iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page batman-adv: Don't schedule OGM for disabled interface pinctrl: meson-gxl: fix GPIOX sdio pins pinctrl: core: Remove extra kref_get which blocks hogs being freed drm/i915/gvt: Fix unnecessary schedule timer when no vGPU exits i2c: gpio: suppress error on probe defer nl80211: add missing attribute validation for critical protocol indication nl80211: add missing attribute validation for beacon report scanning nl80211: add missing attribute validation for channel switch perf bench futex-wake: Restore thread count default to online CPU count netfilter: cthelper: add missing attribute validation for cthelper netfilter: nft_payload: add missing attribute validation for payload csum flags netfilter: nft_tunnel: add missing attribute validation for tunnels iommu/vt-d: Fix the wrong printing in RHSA parsing iommu/vt-d: Ignore devices with out-of-spec domain number i2c: acpi: put device when verifying client fails ipv6: restrict IPV6_ADDRFORM operation net/smc: check for valid ib_client_data net/smc: cancel event worker during device removal efi: Add a sanity check to efivar_store_raw() batman-adv: Avoid free/alloc race when handling OGM2 buffer Linux 4.19.111 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ide220f0b6a12d291bda4a83f17cde25bbe64e2ff	2020-03-18 08:19:47 +01:00
Hillf Danton	3cd2a91a88	workqueue: don't use wq_select_unbound_cpu() for bound works commit aa202f1f56960c60e7befaa0f49c72b8fa11b0a8 upstream. wq_select_unbound_cpu() is designed for unbound workqueues only, but it's wrongly called when using a bound workqueue too. Fixing this ensures work queued to a bound workqueue with cpu=WORK_CPU_UNBOUND always runs on the local CPU. Before, that would happen only if wq_unbound_cpumask happened to include it (likely almost always the case), or was empty, or we got lucky with forced round-robin placement. So restricting /sys/devices/virtual/workqueue/cpumask to a small subset of a machine's CPUs would cause some bound work items to run unexpectedly there. Fixes: `ef55718044` ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Hillf Danton <hdanton@sina.com> [dj: massage changelog] Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-18 07:14:20 +01:00
Michal Koutný	ab3e3b23d8	cgroup: Iterate tasks that did not finish do_exit() commit 9c974c77246460fa6a92c18554c3311c8c83c160 upstream. PF_EXITING is set earlier than actual removal from css_set when a task is exitting. This can confuse cgroup.procs readers who see no PF_EXITING tasks, however, rmdir is checking against css_set membership so it can transitionally fail with EBUSY. Fix this by listing tasks that weren't unlinked from css_set active lists. It may happen that other users of the task iterator (without CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This is equal to the state before commit c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") but it may be reviewed later. Reported-by: Suren Baghdasaryan <surenb@google.com> Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-18 07:14:19 +01:00
Vasily Averin	ff79a4a75c	cgroup: cgroup_procs_next should increase position index commit 2d4ecb030dcc90fb725ecbfc82ce5d6c37906e0e upstream. If seq_file .next fuction does not change position index, read after some lseek can generate unexpected output: 1) dd bs=1 skip output of each 2nd elements $ dd if=/sys/fs/cgroup/cgroup.procs bs=8 count=1 2 3 4 5 1+0 records in 1+0 records out 8 bytes copied, 0,000267297 s, 29,9 kB/s [test@localhost ~]$ dd if=/sys/fs/cgroup/cgroup.procs bs=1 count=8 2 4 <<< NB! 3 was skipped 6 <<< ... and 5 too 8 <<< ... and 7 8+0 records in 8+0 records out 8 bytes copied, 5,2123e-05 s, 153 kB/s This happen because __cgroup_procs_start() makes an extra extra cgroup_procs_next() call 2) read after lseek beyond end of file generates whole last line. 3) read after lseek into middle of last line generates expected rest of last line and unexpected whole line once again. Additionally patch removes an extra position index changes in __cgroup_procs_start() Cc: stable@vger.kernel.org https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-18 07:14:19 +01:00
Shakeel Butt	941464dcbc	cgroup: memcg: net: do not associate sock with unrelated cgroup [ Upstream commit e876ecc67db80dfdb8e237f71e5b43bb88ae549c ] We are testing network memory accounting in our setup and noticed inconsistent network memory usage and often unrelated cgroups network usage correlates with testing workload. On further inspection, it seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in irq context specially for cgroup v1. mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context and kind of assumes that this can only happen from sk_clone_lock() and the source sock object has already associated cgroup. However in cgroup v1, where network memory accounting is opt-in, the source sock can be unassociated with any cgroup and the new cloned sock can get associated with unrelated interrupted cgroup. Cgroup v2 can also suffer if the source sock object was created by process in the root cgroup or if sk_alloc() is called in irq context. The fix is to just do nothing in interrupt. WARNING: Please note that about half of the TCP sockets are allocated from the IRQ context, so, memory used by such sockets will not be accouted by the memcg. The stack trace of mem_cgroup_sk_alloc() from IRQ-context: CPU: 70 PID: 12720 Comm: ssh Tainted: 5.6.0-smp-DEV #1 Hardware name: ... Call Trace: <IRQ> dump_stack+0x57/0x75 mem_cgroup_sk_alloc+0xe9/0xf0 sk_clone_lock+0x2a7/0x420 inet_csk_clone_lock+0x1b/0x110 tcp_create_openreq_child+0x23/0x3b0 tcp_v6_syn_recv_sock+0x88/0x730 tcp_check_req+0x429/0x560 tcp_v6_rcv+0x72d/0xa40 ip6_protocol_deliver_rcu+0xc9/0x400 ip6_input+0x44/0xd0 ? ip6_protocol_deliver_rcu+0x400/0x400 ip6_rcv_finish+0x71/0x80 ipv6_rcv+0x5b/0xe0 ? ip6_sublist_rcv+0x2e0/0x2e0 process_backlog+0x108/0x1e0 net_rx_action+0x26b/0x460 __do_softirq+0x104/0x2a6 do_softirq_own_stack+0x2a/0x40 </IRQ> do_softirq.part.19+0x40/0x50 __local_bh_enable_ip+0x51/0x60 ip6_finish_output2+0x23d/0x520 ? ip6table_mangle_hook+0x55/0x160 __ip6_finish_output+0xa1/0x100 ip6_finish_output+0x30/0xd0 ip6_output+0x73/0x120 ? __ip6_finish_output+0x100/0x100 ip6_xmit+0x2e3/0x600 ? ipv6_anycast_cleanup+0x50/0x50 ? inet6_csk_route_socket+0x136/0x1e0 ? skb_free_head+0x1e/0x30 inet6_csk_xmit+0x95/0xf0 __tcp_transmit_skb+0x5b4/0xb20 __tcp_send_ack.part.60+0xa3/0x110 tcp_send_ack+0x1d/0x20 tcp_rcv_state_process+0xe64/0xe80 ? tcp_v6_connect+0x5d1/0x5f0 tcp_v6_do_rcv+0x1b1/0x3f0 ? tcp_v6_do_rcv+0x1b1/0x3f0 __release_sock+0x7f/0xd0 release_sock+0x30/0xa0 __inet_stream_connect+0x1c3/0x3b0 ? prepare_to_wait+0xb0/0xb0 inet_stream_connect+0x3b/0x60 __sys_connect+0x101/0x120 ? __sys_getsockopt+0x11b/0x140 __x64_sys_connect+0x1a/0x20 do_syscall_64+0x51/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The stack trace of mem_cgroup_sk_alloc() from IRQ-context: Fixes: `2d75807383` ("mm: memcontrol: consolidate cgroup socket tracking") Fixes: `d979a39d72` ("cgroup: duplicate cgroup reference when cloning sockets") Signed-off-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Roman Gushchin <guro@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-18 07:14:14 +01:00
Michal Koutný	6f2a9a3536	UPSTREAM: cgroup: Iterate tasks that did not finish do_exit() PF_EXITING is set earlier than actual removal from css_set when a task is exitting. This can confuse cgroup.procs readers who see no PF_EXITING tasks, however, rmdir is checking against css_set membership so it can transitionally fail with EBUSY. Fix this by listing tasks that weren't unlinked from css_set active lists. It may happen that other users of the task iterator (without CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This is equal to the state before commit c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") but it may be reviewed later. Reported-by: Suren Baghdasaryan <surenb@google.com> Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") Signed-off-by: Michal Koutný <mkoutny@suse.com> (cherry picked from commit 9c974c77246460fa6a92c18554c3311c8c83c160) Bug: 141213848 Bug: 146758430 Test: test_cgcore_destroy from linux-kselftest Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: Iac57661b931129ed1e44b89675f8115bb89084ff (cherry picked from commit 21ee296526c70d6dc3c64639406f156f39b80fd0)	2020-03-16 21:15:43 +00:00
Laura Abbott	c9a574054d	ANDROID: GKI: drivers: Add dma removed ops The current DMA coherent pool assumes that there is a kernel mapping at all times for the entire pool. This may not be what we want for the entire times. Add the dma_removed ops to support this use case. Bug: 145617272 Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Patrick Daly <pdaly@codeaurora.org> Signed-off-by: Liam Mark <lmark@codeaurora.org> Signed-off-by: Swathi Sridhar <swatsrid@codeaurora.org> [surenb Squashed the following commits: `a478a8bf78` "drivers: Add dma removed ops" `8510985ae3` "dma: removed: Merge dma removed changes from 4.14" and removed-dma-pool driver changes from `9d4b7d6415` "iommu: arm-smmu: Merge for ..."] Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: Ie4f1e9bdf57b79699fa8fa7e7a6087e6d88ebbfa	2020-03-16 18:10:36 +00:00
Will Deacon	8e37367a32	FROMGIT: kallsyms: unexport kallsyms_lookup_name() and kallsyms_on_each_symbol() kallsyms_lookup_name() and kallsyms_on_each_symbol() are exported to modules despite having no in-tree users and being wide open to abuse by out-of-tree modules that can use them as a method to invoke arbitrary non-exported kernel functions. Unexport kallsyms_lookup_name() and kallsyms_on_each_symbol(). Bug: 149978696 Change-Id: I8f3c1b5222939c46901f4d149d4c7bb63916ff04 Link: http://lkml.kernel.org/r/20200221114404.14641-4-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Quentin Perret <qperret@google.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Petr Mladek <pmladek@suse.com> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit ab3e66797c7fddbf80fbba31c5bf4574ad52f320 https://github.com/hnaz/linux-mm.git master) Signed-off-by: Quentin Perret <qperret@google.com>	2020-03-12 11:18:50 +00:00
Greg Kroah-Hartman	ca0a95ff50	This is the 4.19.109 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl5o5HYACgkQONu9yGCS aT4/tQ//Xsg40emvKL+hfz22PB5OccTYr4LFGIiVFq5kOGbwM7/oXkVwzD3tk948 0gGbad65hzi5pKnuQOhNgOtkIEtheub/lf0lUJPf+TM9xUj6Vi0/KjkODfJ01O1+ OiUy/ZoqL1NB1GQxXMUdtZwayQJkIdVq/taralbfxFwrJlffhjjBg2/I7N+C/SPw LrlYzbtIqS8wu/d4xPwsEGm4vnhqb0jLiJ42/kb3+Ts21/FhUge8+lkwTGq8JyH1 QeRtPUHEJxJ2hA6H2T9CJg4fiJYhD96tLZUKYz57A95z20uqVszvFWVGApz5uLi1 n1BUkYFAOmJ+H4hWcT1kiYMhxA7iMk1JTbEs9EJOwq1CFfEK2LRfY7Xze3XqCDOd eujecPlpqji+7wxTCd5XkOKAwjVdBLo7faZypQCUai8A6ca9D/rrxerglAa/VSsj KfTaTThIxKHg8bskDSBJqe9rPZg92u7LEVMj+EE05CcfNevBvKDwV6I4F7cKZV2X Y7w76OaYgm8e+H6w6ryvCN8d/T5tSGNly2wJ+rHBl5kc3GD9NPUXRUJKKePU55/3 SuD6q/8gnGp2upqC6FFNTdFMzAar8vIbvHYh9vMA8ISMfM5ShibE/V06PxaupmV6 RWCiAOiBnOu5wi/hDQ8wrlWOI2+H8fUWUldy2LIp78HAS4OWnUw= =VaEt -----END PGP SIGNATURE----- Merge 4.19.109 into android-4.19 Changes in 4.19.109 EDAC/amd64: Set grain per DIMM ALSA: hda/realtek - Fix a regression for mute led on Lenovo Carbon X1 net: dsa: bcm_sf2: Forcibly configure IMP port for 1Gb/sec RDMA/core: Fix pkey and port assignment in get_new_pps RDMA/core: Fix use of logical OR in get_new_pps kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic ALSA: hda: do not override bus codec_mask in link_get() serial: ar933x_uart: set UART_CS_{RX,TX}_READY_ORIDE selftests: fix too long argument usb: gadget: composite: Support more than 500mA MaxPower usb: gadget: ffs: ffs_aio_cancel(): Save/restore IRQ flags usb: gadget: serial: fix Tx stall after buffer overflow drm/msm/mdp5: rate limit pp done timeout warnings drm: msm: Fix return type of dsi_mgr_connector_mode_valid for kCFI scsi: megaraid_sas: silence a warning drm/msm/dsi: save pll state before dsi host is powered off drm/msm/dsi/pll: call vco set rate explicitly selftests: forwarding: use proto icmp for {gretap, ip6gretap}_mac testing net: dsa: b53: Ensure the default VID is untagged net: ks8851-ml: Remove 8-bit bus accessors net: ks8851-ml: Fix 16-bit data access net: ks8851-ml: Fix 16-bit IO operation watchdog: da9062: do not ping the hw during stop() s390/cio: cio_ignore_proc_seq_next should increase position index s390: make 'install' not depend on vmlinux x86/boot/compressed: Don't declare __force_order in kaslr_64.c s390/qdio: fill SL with absolute addresses nvme: Fix uninitialized-variable warning ice: Don't tell the OS that link is going down x86/xen: Distribute switch variables for initialization net: thunderx: workaround BGX TX Underflow issue ALSA: hda/realtek - Add Headset Mic supported ALSA: hda/realtek - Fix silent output on Gigabyte X570 Aorus Master cifs: don't leak -EAGAIN for stat() during reconnect usb: storage: Add quirk for Samsung Fit flash usb: quirks: add NO_LPM quirk for Logitech Screen Share usb: dwc3: gadget: Update chain bit correctly when using sg list usb: core: hub: fix unhandled return by employing a void function usb: core: hub: do error out if usb_autopm_get_interface() fails usb: core: port: do error out if usb_autopm_get_interface() fails vgacon: Fix a UAF in vgacon_invert_region mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() fat: fix uninit-memory access for partial initialized inode arm: dts: dra76x: Fix mmc3 max-frequency tty:serial:mvebu-uart:fix a wrong return serial: 8250_exar: add support for ACCES cards vt: selection, close sel_buffer race vt: selection, push console lock down vt: selection, push sel_lock up media: v4l2-mem2mem.c: fix broken links x86/pkeys: Manually set X86_FEATURE_OSPKE to preserve existing changes dmaengine: tegra-apb: Fix use-after-free dmaengine: tegra-apb: Prevent race conditions of tasklet vs free list dm cache: fix a crash due to incorrect work item cancelling dm: report suspended device during destroy dm writecache: verify watermark during resume ARM: dts: ls1021a: Restore MDIO compatible to gianfar spi: bcm63xx-hsspi: Really keep pll clk enabled ASoC: topology: Fix memleak in soc_tplg_link_elems_load() ASoC: topology: Fix memleak in soc_tplg_manifest_load() ASoC: intel: skl: Fix pin debug prints ASoC: intel: skl: Fix possible buffer overflow in debug outputs dmaengine: imx-sdma: remove dma_slave_config direction usage and leave sdma_event_enable() ASoC: pcm: Fix possible buffer overflow in dpcm state sysfs output ASoC: pcm512x: Fix unbalanced regulator enable call in probe error path ASoC: dapm: Correct DAPM handling of active widgets during shutdown drm/sun4i: Fix DE2 VI layer format support drm/sun4i: de2/de3: Remove unsupported VI layer formats phy: mapphone-mdm6600: Fix timeouts by adding wake-up handling phy: mapphone-mdm6600: Fix write timeouts with shorter GPIO toggle interval ARM: dts: imx6: phycore-som: fix emmc supply RDMA/iwcm: Fix iwcm work deallocation RMDA/cm: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen() IB/hfi1, qib: Ensure RCU is locked when accessing list ARM: imx: build v7_cpu_resume() unconditionally ARM: dts: am437x-idk-evm: Fix incorrect OPP node names ARM: dts: imx7-colibri: Fix frequency for sd/mmc hwmon: (adt7462) Fix an error return in ADT7462_REG_VOLT() dmaengine: coh901318: Fix a double lock bug in dma_tc_handle() powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems efi/x86: Align GUIDs to their size in the mixed mode runtime wrapper efi/x86: Handle by-ref arguments covering multiple pages in mixed mode dm integrity: fix a deadlock due to offloading to an incorrect workqueue scsi: pm80xx: Fixed kernel panic during error recovery for SATA drive Linux 4.19.109 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Iae5cc72b8c7c96b0a15c76657b9c3bcc4341a7aa	2020-03-11 17:10:39 +01:00
Masami Hiramatsu	38d3707340	kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic [ Upstream commit e4add247789e4ba5e08ad8256183ce2e211877d4 ] optimize_kprobe() and unoptimize_kprobe() cancels if a given kprobe is on the optimizing_list or unoptimizing_list already. However, since the following commit: f66c0447cca1 ("kprobes: Set unoptimized flag after unoptimizing code") modified the update timing of the KPROBE_FLAG_OPTIMIZED, it doesn't work as expected anymore. The optimized_kprobe could be in the following states: - [optimizing]: Before inserting jump instruction op.kp->flags has KPROBE_FLAG_OPTIMIZED and op->list is not empty. - [optimized]: jump inserted op.kp->flags has KPROBE_FLAG_OPTIMIZED and op->list is empty. - [unoptimizing]: Before removing jump instruction (including unused optprobe) op.kp->flags has KPROBE_FLAG_OPTIMIZED and op->list is not empty. - [unoptimized]: jump removed op.kp->flags doesn't have KPROBE_FLAG_OPTIMIZED and op->list is empty. Current code mis-expects [unoptimizing] state doesn't have KPROBE_FLAG_OPTIMIZED, and that can cause incorrect results. To fix this, introduce optprobe_queued_unopt() to distinguish [optimizing] and [unoptimizing] states and fixes the logic in optimize_kprobe() and unoptimize_kprobe(). [ mingo: Cleaned up the changelog and the code a bit. ] Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bristot@redhat.com Fixes: f66c0447cca1 ("kprobes: Set unoptimized flag after unoptimizing code") Link: https://lkml.kernel.org/r/157840814418.7181.13478003006386303481.stgit@devnote2 Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-03-11 14:14:47 +01:00
Saravana Kannan	ebb43b6aeb	ANDROID: GKI: genirq: Export symbols to compile irqchip drivers as modules We want to allow compiling irqchip drivers as modules. So export the necessary symbols. Bug: 148105066 Change-Id: Id3de4b8451bed1af9b0afeb5863493697730acb6 Signed-off-by: Saravana Kannan <saravanak@google.com> Signed-off-by: Will McVicker <willmcvicker@google.com> (cherry picked from commit cfc69e9b2fe82a46addfcb1912bd642456548baa)	2020-03-09 11:32:04 -07:00
Will McVicker	e0bd5f70e2	ANDROID: GKI: genirq/irqdomain: add export symbols for modularizing These symbols are needed for modularizing pinctrl. Signed-off-by: Will McVicker <willmcvicker@google.com> Bug: 145771121 Test: compile, boot Change-Id: I8693c3a41b5fcab05b8e4a8a82f4057205bafd3b (cherry picked from commit 9d2cbb36a60747e885f77d776a3ec2bf7523e2e6)	2020-03-09 11:32:04 -07:00
Maulik Shah	657d3fdc70	ANDROID: GKI: genirq: Introduce irq_chip_get/set_parent_state calls On certain QTI chipsets some GPIOs are direct-connect interrupts to the GIC. Even when GPIOs are not used for interrupt generation and interrupt line is disabled, it does not prevent interrupt to get pending at GIC_ISPEND. When drivers call enable_irq unwanted interrupt occures. Introduce irq_chip_get/set_parent_state calls to clear pending irq which can get called within irq_enable of child irq chip to clear any pending irq before enabling. Signed-off-by: Maulik Shah <mkshah@codeaurora.org> Bug: 150233439 Change-Id: Ie8559657bd8da926cc741514809ffe9adbd73a80 Signed-off-by: Will McVicker <willmcvicker@google.com> (cherry picked from commit `d923314622`)	2020-03-09 11:32:04 -07:00
Greg Kroah-Hartman	8290fa4ad8	This is the 4.19.108 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl5hHeEACgkQONu9yGCS aT7uFQ/+KcS1brUUid3C+zewoJ7vvB7wspMRogdJk5/9Y/ty4uxolFRNxM7Fq2Sj 2Uq0jyt8TiiOQBguqpRfJN/GqPX7HBLPR6e9B4Pq67BR34VT6azqb5F7tKjhCa4I CGmA5XvCCQSGwqmwyP4biV0yOdN6Fy0B/9q7+7RSOsY/Mr86RyEHQgarRxaMy2QW v/dh/yIsdPXG27ZicETudKqIWaMiXL5k0zXr81HY4TcOzBrKW66nuqcI0uXZ6r54 RwqxfGVTGQeGIN4bBAFGTlEvvMDO0NAENGA0vOpt8Mqf7yRIye78GCmn8A/nOgd/ +ZsrS9y+baJun0O/7zmuYSFd37GDecRu6kNYI+fc1Hbf784wLj05A52kNZ5ndYPB CdHgcow63QV3DGTXsfQOi/yZEDm/YMUzhMoL2/KP/LlJzq8raXMf95pB3fgs6zmX HI3sA4AuyWQaQb/ogzW+8m8v1oHzT4+aNaBi9rBS1uqCg5q6AhTBRApUNpbybgsG vkiTwhIc2y74Y7M5wV0Fp29pQBPPn033smIq3V/qxgyMvoBbMXxNGZ7jTK882h5g UBjprtX/wyHgVLEXITiz1BPJTinweJarRCL6iGn5w7IOfd3enSamfph5wh5vuXR6 ea0SCw3Dni5G930BMldxubZRtiYTiqvDCeC/IpG7trP9mpczGeE= =/2bv -----END PGP SIGNATURE----- Merge 4.19.108 into android-4.19 Changes in 4.19.108 irqchip/gic-v3-its: Fix misuse of GENMASK macro iwlwifi: pcie: fix rb_allocator workqueue allocation ipmi:ssif: Handle a possible NULL pointer reference drm/msm: Set dma maximum segment size for mdss dax: pass NOWAIT flag to iomap_apply mac80211: consider more elements in parsing CRC cfg80211: check wiphy driver existence for drvinfo report s390/zcrypt: fix card and queue total counter wrap qmi_wwan: re-add DW5821e pre-production variant qmi_wwan: unconditionally reject 2 ep interfaces ARM: dts: sti: fixup sound frame-inversion for stihxxx-b2120.dtsi soc/tegra: fuse: Fix build with Tegra194 configuration net: ena: fix potential crash when rxfh key is NULL net: ena: fix uses of round_jiffies() net: ena: add missing ethtool TX timestamping indication net: ena: fix incorrect default RSS key net: ena: rss: fix failure to get indirection table net: ena: rss: store hash function as values and not bits net: ena: fix incorrectly saving queue numbers when setting RSS indirection table net: ena: ethtool: use correct value for crc32 hash net: ena: ena-com.c: prevent NULL pointer dereference cifs: Fix mode output in debugging statements cfg80211: add missing policy for NL80211_ATTR_STATUS_CODE sysrq: Restore original console_loglevel when sysrq disabled sysrq: Remove duplicated sysrq message net: fib_rules: Correctly set table field when table number exceeds 8 bits net: mscc: fix in frame extraction net: phy: restore mdio regs in the iproc mdio driver net: sched: correct flower port blocking nfc: pn544: Fix occasional HW initialization failure sctp: move the format error check out of __sctp_sf_do_9_1_abort ipv6: Fix route replacement with dev-only route ipv6: Fix nlmsg_flags when splitting a multipath route qede: Fix race between rdma destroy workqueue and link change event net/tls: Fix to avoid gettig invalid tls record ext4: potential crash on allocation error in ext4_alloc_flex_bg_array() audit: fix error handling in audit_data_to_entry() ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro ACPI: watchdog: Fix gas->access_width usage KVM: VMX: check descriptor table exits on instruction emulation HID: ite: Only bind to keyboard USB interface on Acer SW5-012 keyboard dock HID: core: fix off-by-one memset in hid_report_raw_event() HID: core: increase HID report buffer size to 8KiB macintosh: therm_windtunnel: fix regression when instantiating devices tracing: Disable trace_printk() on post poned tests Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs" amdgpu/gmc_v9: save/restore sdpif regs during S3 vhost: Check docket sk_family instead of call getname HID: alps: Fix an error handling path in 'alps_input_configured()' HID: hiddev: Fix race in in hiddev_disconnect() MIPS: VPE: Fix a double free and a memory leak in 'release_vpe()' i2c: altera: Fix potential integer overflow i2c: jz4780: silence log flood on txabrt drm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime drm/i915/gvt: Separate display reset from ALL_ENGINES reset hv_netvsc: Fix unwanted wakeup in netvsc_attach() usb: charger: assign specific number for enum value s390/qeth: vnicc Fix EOPNOTSUPP precedence net: netlink: cap max groups which will be considered in netlink_bind() net: atlantic: fix use after free kasan warn net: atlantic: fix potential error handling net/smc: no peer ID in CLC decline for SMCD net: ena: make ena rxfh support ETH_RSS_HASH_NO_CHANGE namei: only return -ECHILD from follow_dotdot_rcu() mwifiex: drop most magic numbers from mwifiex_process_tdls_action_frame() mwifiex: delete unused mwifiex_get_intf_num() KVM: SVM: Override default MMIO mask if memory encryption is enabled KVM: Check for a bad hva before dropping into the ghc slow path sched/fair: Optimize update_blocked_averages() sched/fair: Fix O(nr_cgroups) in the load balancing path perf stat: Use perf_evsel__is_clocki() for clock events perf stat: Fix shadow stats for clock events drivers: net: xgene: Fix the order of the arguments of 'alloc_etherdev_mqs()' kprobes: Set unoptimized flag after unoptimizing code pwm: omap-dmtimer: put_device() after of_find_device_by_node() perf hists browser: Restore ESC as "Zoom out" of DSO/thread/etc KVM: x86: Remove spurious kvm_mmu_unload() from vcpu destruction path KVM: x86: Remove spurious clearing of async #PF MSR thermal: brcmstb_thermal: Do not use DT coefficients netfilter: nft_tunnel: no need to call htons() when dumping ports netfilter: nf_flowtable: fix documentation mm/huge_memory.c: use head to check huge zero page mm, thp: fix defrag setting if newline is not used audit: always check the netlink payload length in audit_receive_msg() Linux 4.19.108 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ib98db500eded0a83d89c38900bbdf9ff5d6a37e0	2020-03-05 17:40:55 +01:00
Paul Moore	9d2fdc4c7e	audit: always check the netlink payload length in audit_receive_msg() [ Upstream commit 756125289285f6e55a03861bf4b6257aa3d19a93 ] This patch ensures that we always check the netlink payload length in audit_receive_msg() before we take any action on the payload itself. Cc: stable@vger.kernel.org Reported-by: syzbot+399c44bf1f43b8747403@syzkaller.appspotmail.com Reported-by: syzbot+e4b12d8d202701f08b6d@syzkaller.appspotmail.com Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-03-05 16:42:23 +01:00
Masami Hiramatsu	39af044d1c	kprobes: Set unoptimized flag after unoptimizing code commit f66c0447cca1281116224d474cdb37d6a18e4b5b upstream. Set the unoptimized flag after confirming the code is completely unoptimized. Without this fix, when a kprobe hits the intermediate modified instruction (the first byte is replaced by an INT3, but later bytes can still be a jump address operand) while unoptimizing, it can return to the middle byte of the modified code, which causes an invalid instruction exception in the kernel. Usually, this is a rare case, but if we put a probe on the function call while text patching, it always causes a kernel panic as below: # echo p text_poke+5 > kprobe_events # echo 1 > events/kprobes/enable # echo 0 > events/kprobes/enable invalid opcode: 0000 [#1] PREEMPT SMP PTI RIP: 0010:text_poke+0x9/0x50 Call Trace: arch_unoptimize_kprobe+0x22/0x28 arch_unoptimize_kprobes+0x39/0x87 kprobe_optimizer+0x6e/0x290 process_one_work+0x2a0/0x610 worker_thread+0x28/0x3d0 ? process_one_work+0x610/0x610 kthread+0x10d/0x130 ? kthread_park+0x80/0x80 ret_from_fork+0x3a/0x50 text_poke() is used for patching the code in optprobes. This can happen even if we blacklist text_poke() and other functions, because there is a small time window during which we show the intermediate code to other CPUs. [ mingo: Edited the changelog. ] Tested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bristot@redhat.com Fixes: `6274de4984` ("kprobes: Support delayed unoptimizing") Link: https://lkml.kernel.org/r/157483422375.25881.13508326028469515760.stgit@devnote2 Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-05 16:42:22 +01:00
Vincent Guittot	d71744b5c1	sched/fair: Fix O(nr_cgroups) in the load balancing path commit 039ae8bcf7a5f4476f4487e6bf816885fb3fb617 upstream. This re-applies the commit reverted here: commit c40f7d74c741 ("sched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c") I.e. now that cfs_rq can be safely removed/added in the list, we can re-apply: commit `a9e7f6544b` ("sched/fair: Fix O(nr_cgroups) in load balance path") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: sargun@sargun.me Cc: tj@kernel.org Cc: xiexiuqi@huawei.com Cc: xiezhipeng1@huawei.com Link: https://lkml.kernel.org/r/1549469662-13614-3-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Vishnu Rangayyan <vishnu.rangayyan@apple.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-05 16:42:21 +01:00
Vincent Guittot	a1f1a978a7	sched/fair: Optimize update_blocked_averages() commit 31bc6aeaab1d1de8959b67edbed5c7a4b3cdbe7c upstream. Removing a cfs_rq from rq->leaf_cfs_rq_list can break the parent/child ordering of the list when it will be added back. In order to remove an empty and fully decayed cfs_rq, we must remove its children too, so they will be added back in the right order next time. With a normal decay of PELT, a parent will be empty and fully decayed if all children are empty and fully decayed too. In such a case, we just have to ensure that the whole branch will be added when a new task is enqueued. This is default behavior since : commit f6783319737f ("sched/fair: Fix insertion in rq->leaf_cfs_rq_list") In case of throttling, the PELT of throttled cfs_rq will not be updated whereas the parent will. This breaks the assumption made above unless we remove the children of a cfs_rq that is throttled. Then, they will be added back when unthrottled and a sched_entity will be enqueued. As throttled cfs_rq are now removed from the list, we can remove the associated test in update_blocked_averages(). Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: sargun@sargun.me Cc: tj@kernel.org Cc: xiexiuqi@huawei.com Cc: xiezhipeng1@huawei.com Link: https://lkml.kernel.org/r/1549469662-13614-2-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Vishnu Rangayyan <vishnu.rangayyan@apple.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-05 16:42:21 +01:00
Steven Rostedt (VMware)	91495e01e8	tracing: Disable trace_printk() on post poned tests commit 78041c0c9e935d9ce4086feeff6c569ed88ddfd4 upstream. The tracing seftests checks various aspects of the tracing infrastructure, and one is filtering. If trace_printk() is active during a self test, it can cause the filtering to fail, which will disable that part of the trace. To keep the selftests from failing because of trace_printk() calls, trace_printk() checks the variable tracing_selftest_running, and if set, it does not write to the tracing buffer. As some tracers were registered earlier in boot, the selftest they triggered would fail because not all the infrastructure was set up for the full selftest. Thus, some of the tests were post poned to when their infrastructure was ready (namely file system code). The postpone code did not set the tracing_seftest_running variable, and could fail if a trace_printk() was added and executed during their run. Cc: stable@vger.kernel.org Fixes: `9afecfbb95` ("tracing: Postpone tracer start-up tests till the system is more robust") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-05 16:42:18 +01:00
Paul Moore	c24d457a82	audit: fix error handling in audit_data_to_entry() commit 2ad3e17ebf94b7b7f3f64c050ff168f9915345eb upstream. Commit `219ca39427` ("audit: use union for audit_field values since they are mutually exclusive") combined a number of separate fields in the audit_field struct into a single union. Generally this worked just fine because they are generally mutually exclusive. Unfortunately in audit_data_to_entry() the overlap can be a problem when a specific error case is triggered that causes the error path code to attempt to cleanup an audit_field struct and the cleanup involves attempting to free a stored LSM string (the lsm_str field). Currently the code always has a non-NULL value in the audit_field.lsm_str field as the top of the for-loop transfers a value into audit_field.val (both .lsm_str and .val are part of the same union); if audit_data_to_entry() fails and the audit_field struct is specified to contain a LSM string, but the audit_field.lsm_str has not yet been properly set, the error handling code will attempt to free the bogus audit_field.lsm_str value that was set with audit_field.val at the top of the for-loop. This patch corrects this by ensuring that the audit_field.val is only set when needed (it is cleared when the audit_field struct is allocated with kcalloc()). It also corrects a few other issues to ensure that in case of error the proper error code is returned. Cc: stable@vger.kernel.org Fixes: `219ca39427` ("audit: use union for audit_field values since they are mutually exclusive") Reported-by: syzbot+1f4d90ead370d72e450b@syzkaller.appspotmail.com Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-05 16:42:17 +01:00
Mark Salyzyn	5cbbeadd5a	BACKPORT: mm: reclaim small amounts of memory when an external fragmentation event occurs An external fragmentation event was previously described as When the page allocator fragments memory, it records the event using the mm_page_alloc_extfrag event. If the fallback_order is smaller than a pageblock order (order-9 on 64-bit x86) then it's considered an event that will cause external fragmentation issues in the future. The kernel reduces the probability of such events by increasing the watermark sizes by calling set_recommended_min_free_kbytes early in the lifetime of the system. This works reasonably well in general but if there are enough sparsely populated pageblocks then the problem can still occur as enough memory is free overall and kswapd stays asleep. This patch introduces a watermark_boost_factor sysctl that allows a zone watermark to be temporarily boosted when an external fragmentation causing events occurs. The boosting will stall allocations that would decrease free memory below the boosted low watermark and kswapd is woken if the calling context allows to reclaim an amount of memory relative to the size of the high watermark and the watermark_boost_factor until the boost is cleared. When kswapd finishes, it wakes kcompactd at the pageblock order to clean some of the pageblocks that may have been affected by the fragmentation event. kswapd avoids any writeback, slab shrinkage and swap from reclaim context during this operation to avoid excessive system disruption in the name of fragmentation avoidance. Care is taken so that kswapd will do normal reclaim work if the system is really low on memory. This was evaluated using the same workloads as "mm, page_alloc: Spread allocations across zones before introducing fragmentation". 1-socket Skylake machine config-global-dhp__workload_thpfioscale XFS (no special madvise) 4 fio threads, 1 THP allocating thread -------------------------------------- 4.20-rc3 extfrag events < order 9: 804694 4.20-rc3+patch: 408912 (49% reduction) 4.20-rc3+patch1-4: 18421 (98% reduction) 4.20.0-rc3 4.20.0-rc3 lowzone-v5r8 boost-v5r8 Amean fault-base-1 653.58 ( 0.00%) 652.71 ( 0.13%) Amean fault-huge-1 0.00 ( 0.00%) 178.93 * -99.00%* 4.20.0-rc3 4.20.0-rc3 lowzone-v5r8 boost-v5r8 Percentage huge-1 0.00 ( 0.00%) 5.12 ( 100.00%) Note that external fragmentation causing events are massively reduced by this path whether in comparison to the previous kernel or the vanilla kernel. The fault latency for huge pages appears to be increased but that is only because THP allocations were successful with the patch applied. 1-socket Skylake machine global-dhp__workload_thpfioscale-madvhugepage-xfs (MADV_HUGEPAGE) ----------------------------------------------------------------- 4.20-rc3 extfrag events < order 9: 291392 4.20-rc3+patch: 191187 (34% reduction) 4.20-rc3+patch1-4: 13464 (95% reduction) thpfioscale Fault Latencies 4.20.0-rc3 4.20.0-rc3 lowzone-v5r8 boost-v5r8 Min fault-base-1 912.00 ( 0.00%) 905.00 ( 0.77%) Min fault-huge-1 127.00 ( 0.00%) 135.00 ( -6.30%) Amean fault-base-1 1467.55 ( 0.00%) 1481.67 ( -0.96%) Amean fault-huge-1 1127.11 ( 0.00%) 1063.88 * 5.61%* 4.20.0-rc3 4.20.0-rc3 lowzone-v5r8 boost-v5r8 Percentage huge-1 77.64 ( 0.00%) 83.46 ( 7.49%) As before, massive reduction in external fragmentation events, some jitter on latencies and an increase in THP allocation success rates. 2-socket Haswell machine config-global-dhp__workload_thpfioscale XFS (no special madvise) 4 fio threads, 5 THP allocating threads ---------------------------------------------------------------- 4.20-rc3 extfrag events < order 9: 215698 4.20-rc3+patch: 200210 (7% reduction) 4.20-rc3+patch1-4: 14263 (93% reduction) 4.20.0-rc3 4.20.0-rc3 lowzone-v5r8 boost-v5r8 Amean fault-base-5 1346.45 ( 0.00%) 1306.87 ( 2.94%) Amean fault-huge-5 3418.60 ( 0.00%) 1348.94 ( 60.54%) 4.20.0-rc3 4.20.0-rc3 lowzone-v5r8 boost-v5r8 Percentage huge-5 0.78 ( 0.00%) 7.91 ( 910.64%) There is a 93% reduction in fragmentation causing events, there is a big reduction in the huge page fault latency and allocation success rate is higher. 2-socket Haswell machine global-dhp__workload_thpfioscale-madvhugepage-xfs (MADV_HUGEPAGE) ----------------------------------------------------------------- 4.20-rc3 extfrag events < order 9: 166352 4.20-rc3+patch: 147463 (11% reduction) 4.20-rc3+patch1-4: 11095 (93% reduction) thpfioscale Fault Latencies 4.20.0-rc3 4.20.0-rc3 lowzone-v5r8 boost-v5r8 Amean fault-base-5 6217.43 ( 0.00%) 7419.67 * -19.34%* Amean fault-huge-5 3163.33 ( 0.00%) 3263.80 ( -3.18%) 4.20.0-rc3 4.20.0-rc3 lowzone-v5r8 boost-v5r8 Percentage huge-5 95.14 ( 0.00%) 87.98 ( -7.53%) There is a large reduction in fragmentation events with some jitter around the latencies and success rates. As before, the high THP allocation success rate does mean the system is under a lot of pressure. However, as the fragmentation events are reduced, it would be expected that the long-term allocation success rate would be higher. Link: http://lkml.kernel.org/r/20181123114528.28802-5-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: Ied06272defcdbf3fff07b7ebccb46c68ce081e1e Git-commit: 1c30844d2dfe272d58c8fc000960b835d13aa2ac Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git [vinmenon@codeaurora.org: trivial merge conflict fixes] Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> (cherry picked from commit 1c30844d2dfe272d58c8fc000960b835d13aa2ac) Signed-off-by: Mark Salyzyn <salyzyn@google.com> Bug: 150378964	2020-03-04 12:01:17 -08:00
Qais Yousef	cb0a2cdd48	UPSTREAM: sched/uclamp: Reject negative values in cpu_uclamp_write() The check to ensure that the new written value into cpu.uclamp.{min,max} is within range, [0:100], wasn't working because of the signed comparison 7301 if (req.percent > UCLAMP_PERCENT_SCALE) { 7302 req.ret = -ERANGE; 7303 return req; 7304 } # echo -1 > cpu.uclamp.min # cat cpu.uclamp.min 42949671.96 Cast req.percent into u64 to force the comparison to be unsigned and work as intended in capacity_from_percent(). # echo -1 > cpu.uclamp.min sh: write error: Numerical result out of range Bug: 120440300 Fixes: 2480c093130f ("sched/uclamp: Extend CPU's cgroup controller") Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20200114210947.14083-1-qais.yousef@arm.com (cherry picked from commit b562d140649966d4daedd0483a8fe59ad3bb465a) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I17fc2b119dcbffb212e130ed2c37ae3a8d5bbb61	2020-03-03 11:41:08 +00:00
Greg Kroah-Hartman	7cd2c86c50	This is the 4.19.107 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl5ZNBUACgkQONu9yGCS aT6vMhAAoNfLw1JEqsOgplIUKuLJnIBOldyJeZ8HCrR9yhIEDgevHQzaWutyD6H4 2AzImhL8YBwAw+9UHq5Z1PT3PluKt78vRr1ZxDyNniHGJdDsoWTed9h+QjyRkDFl KZSV30GraO8/P6e9Ep5CgKLiCID7m2U9jYZkb6QL21wawprEi6dgSOb21prPyN1d SKCtcrhUQFqDPOgqU3Cyv9t/vxzrgBKSZRKOXZON5gBlwmFHuPk7lcSB80DKd+7S Um7oatwFBhQwKyuhJARXbrhIw2z6Y+xf1wJF+yNW9v/VpR4NE+SkzX2SaX7lercF JigVmtpth1KBa2wGw3N0XOdNG6NYrLtzeBW+o7mlZk4D2OKCeUoZEdM5RiVJNLCK Ze1soQtHoRFViqPx5Or06pOsMagKRNxzjkFPd1cfA7vpRw2KRNKCFXec/Ms8coUd /WslTHkyfryRfzFDtyyCATVXHPizkZqJyrR/3pes4sGITIpFczWVHiQ3mqUIrdXN d08CwsYS0ivQwvl5hZzxyqUlUWVhGccT1PpO6+SZp2IuGT3YWZzpQKDh0+IlIsv0 TUvEtz3xjzL5EDUmUFsRUy5hBINdzjE/iKb3KOHw0y8xik5Rp0LkHtMRPmro5+TT A4JqVfxTGdTprRXPeCS/7X1jOoOxnxm06QZ+HqbHCL5CUi6nZFk= =XhCO -----END PGP SIGNATURE----- Merge 4.19.107 into android-4.19 Changes in 4.19.107 iommu/qcom: Fix bogus detach logic ALSA: hda: Use scnprintf() for printing texts for sysfs/procfs ALSA: hda/realtek - Apply quirk for MSI GP63, too ALSA: hda/realtek - Apply quirk for yet another MSI laptop ASoC: sun8i-codec: Fix setting DAI data format ecryptfs: fix a memory leak bug in parse_tag_1_packet() ecryptfs: fix a memory leak bug in ecryptfs_init_messaging() thunderbolt: Prevent crash if non-active NVMem file is read USB: misc: iowarrior: add support for 2 OEMed devices USB: misc: iowarrior: add support for the 28 and 28L devices USB: misc: iowarrior: add support for the 100 device floppy: check FDC index for errors before assigning it vt: fix scrollback flushing on background consoles vt: selection, handle pending signals in paste_selection vt: vt_ioctl: fix race in VT_RESIZEX staging: android: ashmem: Disallow ashmem memory from being remapped staging: vt6656: fix sign of rx_dbm to bb_pre_ed_rssi. xhci: Force Maximum Packet size for Full-speed bulk devices to valid range. xhci: fix runtime pm enabling for quirky Intel hosts xhci: Fix memory leak when caching protocol extended capability PSI tables - take 2 usb: host: xhci: update event ring dequeue pointer on purpose USB: core: add endpoint-blacklist quirk USB: quirks: blacklist duplicate ep on Sound Devices USBPre2 usb: uas: fix a plug & unplug racing USB: Fix novation SourceControl XL after suspend USB: hub: Don't record a connect-change event during reset-resume USB: hub: Fix the broken detection of USB3 device in SMSC hub usb: dwc2: Fix SET/CLEAR_FEATURE and GET_STATUS flows usb: dwc3: gadget: Check for IOC/LST bit in TRB->ctrl fields staging: rtl8188eu: Fix potential security hole staging: rtl8188eu: Fix potential overuse of kernel memory staging: rtl8723bs: Fix potential security hole staging: rtl8723bs: Fix potential overuse of kernel memory powerpc/tm: Fix clearing MSR[TS] in current when reclaiming on signal delivery jbd2: fix ocfs2 corrupt when clearing block group bits x86/mce/amd: Publish the bank pointer only after setup has succeeded x86/mce/amd: Fix kobject lifetime x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF serial: 8250: Check UPF_IRQ_SHARED in advance tty/serial: atmel: manage shutdown in case of RS485 or ISO7816 mode tty: serial: imx: setup the correct sg entry for tx dma serdev: ttyport: restore client ops on deregistration MAINTAINERS: Update drm/i915 bug filing URL Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()" mm/memcontrol.c: lost css_put in memcg_expand_shrinker_maps() nvme-multipath: Fix memory leak with ana_log_buf genirq/irqdomain: Make sure all irq domain flags are distinct mm/vmscan.c: don't round up scan size for online memory cgroup drm/amdgpu/soc15: fix xclk for raven xhci: apply XHCI_PME_STUCK_QUIRK to Intel Comet Lake platforms KVM: nVMX: Don't emulate instructions in guest mode KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI tty: serial: qcom_geni_serial: Fix UART hang tty: serial: qcom_geni_serial: Remove interrupt storm tty: serial: qcom_geni_serial: Remove use of *_relaxed() and mb() tty: serial: qcom_geni_serial: Remove set_rfr_wm() and related variables tty: serial: qcom_geni_serial: Remove xfer_mode variable tty: serial: qcom_geni_serial: Fix RX cancel command failure lib/stackdepot.c: fix global out-of-bounds in stack_slabs drm/nouveau/kms/gv100-: Re-set LUT after clearing for modesets ext4: fix a data race in EXT4_I(inode)->i_disksize ext4: add cond_resched() to __ext4_find_entry() ext4: fix potential race between online resizing and write operations ext4: fix potential race between s_group_info online resizing and access ext4: fix potential race between s_flex_groups online resizing and access ext4: fix mount failure with quota configured as module ext4: rename s_journal_flag_rwsem to s_writepages_rwsem ext4: fix race between writepages and enabling EXT4_EXTENTS_FL KVM: nVMX: Refactor IO bitmap checks into helper function KVM: nVMX: Check IO instruction VM-exit conditions KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1 KVM: apic: avoid calculating pending eoi from an uninitialized val btrfs: fix bytes_may_use underflow in prealloc error condtition btrfs: reset fs_root to NULL on error in open_ctree btrfs: do not check delayed items are empty for single transaction cleanup Btrfs: fix btrfs_wait_ordered_range() so that it waits for all ordered extents Revert "dmaengine: imx-sdma: Fix memory leak" scsi: Revert "RDMA/isert: Fix a recently introduced regression related to logout" scsi: Revert "target: iscsi: Wait for all commands to finish before freeing a session" usb: gadget: composite: Fix bMaxPower for SuperSpeedPlus usb: dwc2: Fix in ISOC request length checking staging: rtl8723bs: fix copy of overlapping memory staging: greybus: use after free in gb_audio_manager_remove_all() ecryptfs: replace BUG_ON with error handling code iommu/vt-d: Fix compile warning from intel-svm.h genirq/proc: Reject invalid affinity masks (again) bpf, offload: Replace bitwise AND by logical AND in bpf_prog_offload_info_fill ALSA: rawmidi: Avoid bit fields for state flags ALSA: seq: Avoid concurrent access to queue flags ALSA: seq: Fix concurrent access to queue current tick/time netfilter: xt_hashlimit: limit the max size of hashtable rxrpc: Fix call RCU cleanup using non-bh-safe locks ata: ahci: Add shutdown to freeze hardware resources of ahci xen: Enable interrupts when calling _cond_resched() s390/mm: Explicitly compare PAGE_DEFAULT_KEY against zero in storage_key_init_range Revert "char/random: silence a lockdep splat with printk()" Linux 4.19.107 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I74e3d49c54d4afcfa4049042163cb879c3de3100	2020-03-03 07:33:01 +01:00
Johannes Krude	bf3043d277	bpf, offload: Replace bitwise AND by logical AND in bpf_prog_offload_info_fill commit e20d3a055a457a10a4c748ce5b7c2ed3173a1324 upstream. This if guards whether user-space wants a copy of the offload-jited bytecode and whether this bytecode exists. By erroneously doing a bitwise AND instead of a logical AND on user- and kernel-space buffer-size can lead to no data being copied to user-space especially when user-space size is a power of two and bigger then the kernel-space buffer. Fixes: `fcfb126def` ("bpf: add new jited info fields in bpf_dev_offload and bpf_prog_info") Signed-off-by: Johannes Krude <johannes@krude.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/20200212193227.GA3769@phlox.h.transitiv.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-28 16:38:59 +01:00
Thomas Gleixner	3132696dd7	genirq/proc: Reject invalid affinity masks (again) commit cba6437a1854fde5934098ec3bd0ee83af3129f5 upstream. Qian Cai reported that the WARN_ON() in the x86/msi affinity setting code, which catches cases where the affinity setting is not done on the CPU which is the current target of the interrupt, triggers during CPU hotplug stress testing. It turns out that the warning which was added with the commit addressing the MSI affinity race unearthed yet another long standing bug. If user space writes a bogus affinity mask, i.e. it contains no online CPUs, then it calls irq_select_affinity_usr(). This was introduced for ALPHA in `eee45269b0` ("[PATCH] Alpha: convert to generic irq framework (generic part)") and subsequently made available for all architectures in `1840475676` ("genirq: Expose default irq affinity mask (take 3)") which introduced the circumvention of the affinity setting restrictions for interrupt which cannot be moved in process context. The whole exercise is bogus in various aspects: 1) If the interrupt is already started up then there is absolutely no point to honour a bogus interrupt affinity setting from user space. The interrupt is already assigned to an online CPU and it does not make any sense to reassign it to some other randomly chosen online CPU. 2) If the interupt is not yet started up then there is no point either. A subsequent startup of the interrupt will invoke irq_setup_affinity() anyway which will chose a valid target CPU. So the only correct solution is to just return -EINVAL in case user space wrote an affinity mask which does not contain any online CPUs, except for ALPHA which has it's own magic sauce for this. Fixes: `1840475676` ("genirq: Expose default irq affinity mask (take 3)") Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Qian Cai <cai@lca.pw> Link: https://lkml.kernel.org/r/878sl8xdbm.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-28 16:38:59 +01:00
Suren Baghdasaryan	67e4408599	UPSTREAM: sched/psi: Fix OOB write when writing 0 bytes to PSI files Issuing write() with count parameter set to 0 on any file under /proc/pressure/ will cause an OOB write because of the access to buf[buf_size-1] when NUL-termination is performed. Fix this by checking for buf_size to be non-zero. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lkml.kernel.org/r/20200203212216.7076-1-surenb@google.com (cherry picked from commit 6fcca0fa48118e6d63733eb4644c6cd880c15b8f) Bug: 148159562 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I9ec7acfc6e1083c677a95b0ea1c559ab50152873	2020-02-28 15:30:25 +00:00
Johannes Weiner	cf46cf40bc	UPSTREAM: psi: Fix a division error in psi poll() The psi window size is a u64 an can be up to 10 seconds right now, which exceeds the lower 32 bits of the variable. We currently use div_u64 for it, which is meant only for 32-bit divisors. The result is garbage pressure sampling values and even potential div0 crashes. Use div64_u64. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Jingfeng Xie <xiejingfeng@linux.alibaba.com> Link: https://lkml.kernel.org/r/20191203183524.41378-3-hannes@cmpxchg.org Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit c3466952ca1514158d7c16c9cfc48c27d5c5dc0f) Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I49fdfd55751d1a2cde19666624c9c5d76dc78dad	2020-02-28 15:30:04 +00:00
Johannes Weiner	55013802e8	UPSTREAM: sched/psi: Fix sampling error and rare div0 crashes with cgroups and high uptime Jingfeng reports rare div0 crashes in psi on systems with some uptime: [58914.066423] divide error: 0000 [#1] SMP [58914.070416] Modules linked in: ipmi_poweroff ipmi_watchdog toa overlay fuse tcp_diag inet_diag binfmt_misc aisqos(O) aisqos_hotfixes(O) [58914.083158] CPU: 94 PID: 140364 Comm: kworker/94:2 Tainted: G W OE K 4.9.151-015.ali3000.alios7.x86_64 #1 [58914.093722] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 3.23.34 02/14/2019 [58914.102728] Workqueue: events psi_update_work [58914.107258] task: ffff8879da83c280 task.stack: ffffc90059dcc000 [58914.113336] RIP: 0010:[] [] psi_update_stats+0x1c1/0x330 [58914.122183] RSP: 0018:ffffc90059dcfd60 EFLAGS: 00010246 [58914.127650] RAX: 0000000000000000 RBX: ffff8858fe98be50 RCX: 000000007744d640 [58914.134947] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00003594f700648e [58914.142243] RBP: ffffc90059dcfdf8 R08: 0000359500000000 R09: 0000000000000000 [58914.149538] R10: 0000000000000000 R11: 0000000000000000 R12: 0000359500000000 [58914.156837] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8858fe98bd78 [58914.164136] FS: 0000000000000000(0000) GS:ffff887f7f380000(0000) knlGS:0000000000000000 [58914.172529] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [58914.178467] CR2: 00007f2240452090 CR3: 0000005d5d258000 CR4: 00000000007606f0 [58914.185765] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [58914.193061] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [58914.200360] PKRU: 55555554 [58914.203221] Stack: [58914.205383] ffff8858fe98bd48 00000000000002f0 0000002e81036d09 ffffc90059dcfde8 [58914.213168] ffff8858fe98bec8 0000000000000000 0000000000000000 0000000000000000 [58914.220951] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [58914.228734] Call Trace: [58914.231337] [] psi_update_work+0x22/0x60 [58914.237067] [] process_one_work+0x189/0x420 [58914.243063] [] worker_thread+0x4e/0x4b0 [58914.248701] [] ? process_one_work+0x420/0x420 [58914.254869] [] kthread+0xe6/0x100 [58914.259994] [] ? kthread_park+0x60/0x60 [58914.265640] [] ret_from_fork+0x39/0x50 [58914.271193] Code: 41 29 c3 4d 39 dc 4d 0f 42 dc <49> f7 f1 48 8b 13 48 89 c7 48 c1 [58914.279691] RIP [] psi_update_stats+0x1c1/0x330 The crashing instruction is trying to divide the observed stall time by the sampling period. The period, stored in R8, is not 0, but we are dividing by the lower 32 bits only, which are all 0 in this instance. We could switch to a 64-bit division, but the period shouldn't be that big in the first place. It's the time between the last update and the next scheduled one, and so should always be around 2s and comfortably fit into 32 bits. The bug is in the initialization of new cgroups: we schedule the first sampling event in a cgroup as an offset of sched_clock(), but fail to initialize the last_update timestamp, and it defaults to 0. That results in a bogusly large sampling period the first time we run the sampling code, and consequently we underreport pressure for the first 2s of a cgroup's life. But worse, if sched_clock() is sufficiently advanced on the system, and the user gets unlucky, the period's lower 32 bits can all be 0 and the sampling division will crash. Fix this by initializing the last update timestamp to the creation time of the cgroup, thus correctly marking the start of the first pressure sampling period in a new cgroup. Reported-by: Jingfeng Xie <xiejingfeng@linux.alibaba.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Link: https://lkml.kernel.org/r/20191203183524.41378-2-hannes@cmpxchg.org Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 3dfbe25c27eab7c90c8a7e97b4c354a9d24dd985) Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: Iaada5c2f1a03cf38cbb053adde478f762ce40843	2020-02-28 15:29:50 +00:00
Miles Chen	88a47f1659	UPSTREAM: sched/psi: Correct overly pessimistic size calculation When passing a equal or more then 32 bytes long string to psi_write(), psi_write() copies 31 bytes to its buf and overwrites buf[30] with '\0'. Which makes the input string 1 byte shorter than it should be. Fix it by copying sizeof(buf) bytes when nbytes >= sizeof(buf). This does not cause problems in normal use case like: "some 500000 10000000" or "full 500000 10000000" because they are less than 32 bytes in length. /* assuming nbytes == 35 / char buf[32]; buf_size = min(nbytes, (sizeof(buf) - 1)); / buf_size = 31 / if (copy_from_user(buf, user_buf, buf_size)) return -EFAULT; buf[buf_size - 1] = '\0'; / buf[30] = '\0' */ Before: %cd /proc/pressure/ %echo "123456789\|123456789\|123456789\|1234" > memory [ 22.473497] nbytes=35,buf_size=31 [ 22.473775] 123456789\|123456789\|123456789\| (print 30 chars) %sh: write error: Invalid argument %echo "123456789\|123456789\|123456789\|1" > memory [ 64.916162] nbytes=32,buf_size=31 [ 64.916331] 123456789\|123456789\|123456789\| (print 30 chars) %sh: write error: Invalid argument After: %cd /proc/pressure/ %echo "123456789\|123456789\|123456789\|1234" > memory [ 254.837863] nbytes=35,buf_size=32 [ 254.838541] 123456789\|123456789\|123456789\|1 (print 31 chars) %sh: write error: Invalid argument %echo "123456789\|123456789\|123456789\|1" > memory [ 9965.714935] nbytes=32,buf_size=32 [ 9965.715096] 123456789\|123456789\|123456789\|1 (print 31 chars) %sh: write error: Invalid argument Also remove the superfluous parentheses. Signed-off-by: Miles Chen <miles.chen@mediatek.com> Cc: <linux-mediatek@lists.infradead.org> Cc: <wsd_upstream@mediatek.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190912103452.13281-1-miles.chen@mediatek.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 4adcdcea717cb2d8436bef00dd689aa5bc76f11b) Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I9371b4d5e465bb8b84ff7adf5f40f30696c6ff70	2020-02-28 15:28:23 +00:00
Sami Tolvanen	8028f78053	ANDROID: Disable wq fp check in CFI builds With non-canonical CFI, LLVM generates jump table entries for external symbols in modules and as a result, a function pointer passed from a module to the core kernel will have a different address. Disable the warning for now. Bug: 145210207 Change-Id: Ifdcee3479280f7b97abdee6b4c746f447e0944e6 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Alistair Delva <adelva@google.com>	2020-02-27 00:09:59 +00:00
Todd Kjos	08256862e0	ANDROID: increase limit on sched-tune boost groups Some devices need an additional sched-tune boost group to optimize performance for key tasks Bug: 150302001 Change-Id: I392c8cc05a8851f1d416c381b4a27242924c2c27 Signed-off-by: Todd Kjos <tkjos@google.com>	2020-02-26 21:55:29 +00:00
Greg Kroah-Hartman	4dc4199770	This is the 4.19.106 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl5TfLwACgkQONu9yGCS aT5wlRAAhZELK39c78NMCTZKHtKGLsGb2os2IiI7zIRbqNNwnvJi+jAc3kgbS9jP +W+wnhYFtFisDvqdCQ009I6A0NA1p3Nqy166JplW0iIg1e7rgUKKUfabCN9sJmjh HGK913cJlHwGmkSxq//sBucBwWhYYGaHec28pZ7uCFATjWrTaH3G4VrvLStuicYR YgS9MH261tWJKJm5+V2MxnOOI0103+Uey+xVqwSnLlV+qmasxwDCMU5ae+SK7e7f cXIkNZwvDph1zunekHg+jd64GN3GYswXVcRighWP0n7Lr+0tGPN7SY5pvZIjZLv/ sdroyrqAxytTYP32hypIUgsToVvJr7zXD09LGdsgOCKVwFVn8yl1e4zgGKH3L9Xu OK2krI90v1MVevibyaNndZ4UDKilF75oE2YYDOFW/BU1lorFAIzk4hh15CfKc8s1 KHRjePfcgQREs/SGK8k2BAmf/JwxFN1/Ro5dl7MvKn07ZYqx6QOwUoMhgxspIntN 9TlFw6elu1RSwu2BFts9wvoHO1tr7GZBa1cVkNF8qV1rzaGVY68aLDvvHGdffD6W JgX+BCfr6vcN7R4izak1RxzAoqDrRxS0vWoC1vVsPqeIIZydSxpYDquaFnbZm+Wc MRuh5gpQ2PzTXuMLeBB+ig6UnzsAO3x+3yIG/l5ZmmYxJbMFBKU= =zE/i -----END PGP SIGNATURE----- Merge 4.19.106 into android-4.19 Changes in 4.19.106 core: Don't skip generic XDP program execution for cloned SKBs enic: prevent waking up stopped tx queues over watchdog reset net/smc: fix leak of kernel memory to user space net: dsa: tag_qca: Make sure there is headroom for tag net/sched: matchall: add missing validation of TCA_MATCHALL_FLAGS net/sched: flower: add missing validation of TCA_FLOWER_FLAGS Revert "KVM: nVMX: Use correct root level for nested EPT shadow page tables" Revert "KVM: VMX: Add non-canonical check on writes to RTIT address MSRs" KVM: nVMX: Use correct root level for nested EPT shadow page tables drm/gma500: Fixup fbdev stolen size usage evaluation cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order brcmfmac: Fix use after free in brcmf_sdio_readframes() leds: pca963x: Fix open-drain initialization ext4: fix ext4_dax_read/write inode locking sequence for IOCB_NOWAIT ALSA: ctl: allow TLV read operation for callback type of element in locked case gianfar: Fix TX timestamping with a stacked DSA driver pinctrl: sh-pfc: sh7264: Fix CAN function GPIOs pxa168fb: Fix the function used to release some memory in an error handling path media: i2c: mt9v032: fix enum mbus codes and frame sizes powerpc/powernv/iov: Ensure the pdn for VFs always contains a valid PE number gpio: gpio-grgpio: fix possible sleep-in-atomic-context bugs in grgpio_irq_map/unmap() iommu/vt-d: Fix off-by-one in PASID allocation char/random: silence a lockdep splat with printk() media: sti: bdisp: fix a possible sleep-in-atomic-context bug in bdisp_device_run() pinctrl: baytrail: Do not clear IRQ flags on direct-irq enabled pins efi/x86: Map the entire EFI vendor string before copying it MIPS: Loongson: Fix potential NULL dereference in loongson3_platform_init() sparc: Add .exit.data section. uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol() usb: gadget: udc: fix possible sleep-in-atomic-context bugs in gr_probe() usb: dwc2: Fix IN FIFO allocation clocksource/drivers/bcm2835_timer: Fix memory leak of timer kselftest: Minimise dependency of get_size on C library interfaces jbd2: clear JBD2_ABORT flag before journal_reset to update log tail info when load journal x86/sysfb: Fix check for bad VRAM size pwm: omap-dmtimer: Simplify error handling s390/pci: Fix possible deadlock in recover_store() powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov() tracing: Fix tracing_stat return values in error handling paths tracing: Fix very unlikely race of registering two stat tracers ARM: 8952/1: Disable kmemleak on XIP kernels ext4, jbd2: ensure panic when aborting with zero errno ath10k: Correct the DMA direction for management tx buffers drm/amd/display: Retrain dongles when SINK_COUNT becomes non-zero nbd: add a flush_workqueue in nbd_start_device KVM: s390: ENOTSUPP -> EOPNOTSUPP fixups kconfig: fix broken dependency in randconfig-generated .config clk: qcom: rcg2: Don't crash if our parent can't be found; return an error drm/amdgpu: remove 4 set but not used variable in amdgpu_atombios_get_connector_info_from_object_table drm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREG regulator: rk808: Lower log level on optional GPIOs being not available net/wan/fsl_ucc_hdlc: reject muram offsets above 64K NFC: port100: Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu(). selinux: fall back to ref-walk if audit is required arm64: dts: allwinner: H6: Add PMU mode arm: dts: allwinner: H3: Add PMU node selinux: ensure we cleanup the internal AVC counters on error in avc_insert() arm64: dts: qcom: msm8996: Disable USB2 PHY suspend by core ARM: dts: imx6: rdu2: Disable WP for USDHC2 and USDHC3 ARM: dts: imx6: rdu2: Limit USBH1 to Full Speed PCI: iproc: Apply quirk_paxc_bridge() for module as well as built-in media: cx23885: Add support for AVerMedia CE310B PCI: Add generic quirk for increasing D3hot delay PCI: Increase D3 delay for AMD Ryzen5/7 XHCI controllers media: v4l2-device.h: Explicitly compare grp{id,mask} to zero in v4l2_device macros reiserfs: Fix spurious unlock in reiserfs_fill_super() error handling r8169: check that Realtek PHY driver module is loaded fore200e: Fix incorrect checks of NULL pointer dereference netfilter: nft_tunnel: add the missing ERSPAN_VERSION nla_policy ALSA: usx2y: Adjust indentation in snd_usX2Y_hwdep_dsp_status b43legacy: Fix -Wcast-function-type ipw2x00: Fix -Wcast-function-type iwlegacy: Fix -Wcast-function-type rtlwifi: rtl_pci: Fix -Wcast-function-type orinoco: avoid assertion in case of NULL pointer ACPICA: Disassembler: create buffer fields in ACPI_PARSE_LOAD_PASS1 scsi: ufs: Complete pending requests in host reset and restore path scsi: aic7xxx: Adjust indentation in ahc_find_syncrate drm/mediatek: handle events when enabling/disabling crtc ARM: dts: r8a7779: Add device node for ARM global timer selinux: ensure we cleanup the internal AVC counters on error in avc_update() dmaengine: Store module owner in dma_device struct dmaengine: imx-sdma: Fix memory leak crypto: chtls - Fixed memory leak x86/vdso: Provide missing include file PM / devfreq: rk3399_dmc: Add COMPILE_TEST and HAVE_ARM_SMCCC dependency pinctrl: sh-pfc: sh7269: Fix CAN function GPIOs reset: uniphier: Add SCSSI reset control for each channel RDMA/rxe: Fix error type of mmap_offset clk: sunxi-ng: add mux and pll notifiers for A64 CPU clock ALSA: sh: Fix unused variable warnings clk: uniphier: Add SCSSI clock gate for each channel ALSA: sh: Fix compile warning wrt const tools lib api fs: Fix gcc9 stringop-truncation compilation error ACPI: button: Add DMI quirk for Razer Blade Stealth 13 late 2019 lid switch mlx5: work around high stack usage with gcc drm: remove the newline for CRC source name. ARM: dts: stm32: Add power-supply for DSI panel on stm32f469-disco usbip: Fix unsafe unaligned pointer usage udf: Fix free space reporting for metadata and virtual partitions staging: rtl8188: avoid excessive stack usage IB/hfi1: Add software counter for ctxt0 seq drop soc/tegra: fuse: Correct straps' address for older Tegra124 device trees efi/x86: Don't panic or BUG() on non-critical error conditions rcu: Use WRITE_ONCE() for assignments to ->pprev for hlist_nulls Input: edt-ft5x06 - work around first register access error x86/nmi: Remove irq_work from the long duration NMI handler wan: ixp4xx_hss: fix compile-testing on 64-bit ASoC: atmel: fix build error with CONFIG_SND_ATMEL_SOC_DMA=m tty: synclinkmp: Adjust indentation in several functions tty: synclink_gt: Adjust indentation in several functions visorbus: fix uninitialized variable access driver core: platform: Prevent resouce overflow from causing infinite loops driver core: Print device when resources present in really_probe() bpf: Return -EBADRQC for invalid map type in __bpf_tx_xdp_map vme: bridges: reduce stack usage drm/nouveau/secboot/gm20b: initialize pointer in gm20b_secboot_new() drm/nouveau/gr/gk20a,gm200-: add terminators to method lists read from fw drm/nouveau: Fix copy-paste error in nouveau_fence_wait_uevent_handler drm/nouveau/drm/ttm: Remove set but not used variable 'mem' drm/nouveau/fault/gv100-: fix memory leak on module unload drm/vmwgfx: prevent memory leak in vmw_cmdbuf_res_add usb: musb: omap2430: Get rid of musb .set_vbus for omap2430 glue iommu/arm-smmu-v3: Use WRITE_ONCE() when changing validity of an STE f2fs: set I_LINKABLE early to avoid wrong access by vfs f2fs: free sysfs kobject scsi: iscsi: Don't destroy session if there are outstanding connections arm64: fix alternatives with LLVM's integrated assembler drm/amd/display: fixup DML dependencies watchdog/softlockup: Enforce that timestamp is valid on boot f2fs: fix memleak of kobject x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd pwm: omap-dmtimer: Remove PWM chip in .remove before making it unfunctional cmd64x: potential buffer overflow in cmd64x_program_timings() ide: serverworks: potential overflow in svwks_set_pio_mode() pwm: Remove set but not set variable 'pwm' btrfs: fix possible NULL-pointer dereference in integrity checks btrfs: safely advance counter when looking up bio csums btrfs: device stats, log when stats are zeroed module: avoid setting info->name early in case we can fall back to info->mod->name remoteproc: Initialize rproc_class before use irqchip/mbigen: Set driver .suppress_bind_attrs to avoid remove problems ALSA: hda/hdmi - add retry logic to parse_intel_hdmi() kbuild: use -S instead of -E for precise cc-option test in Kconfig x86/decoder: Add TEST opcode to Group3-2 s390: adjust -mpacked-stack support check for clang 10 s390/ftrace: generate traced function stack frame driver core: platform: fix u32 greater or equal to zero comparison ALSA: hda - Add docking station support for Lenovo Thinkpad T420s drm/nouveau/mmu: fix comptag memory leak powerpc/sriov: Remove VF eeh_dev state when disabling SR-IOV bcache: cached_dev_free needs to put the sb page iommu/vt-d: Remove unnecessary WARN_ON_ONCE() selftests: bpf: Reset global state between reuseport test runs jbd2: switch to use jbd2_journal_abort() when failed to submit the commit record jbd2: make sure ESHUTDOWN to be recorded in the journal superblock ARM: 8951/1: Fix Kexec compilation issue. hostap: Adjust indentation in prism2_hostapd_add_sta iwlegacy: ensure loop counter addr does not wrap and cause an infinite loop cifs: fix NULL dereference in match_prepath bpf: map_seq_next should always increase position index ceph: check availability of mds cluster on mount after wait timeout rbd: work around -Wuninitialized warning irqchip/gic-v3: Only provision redistributors that are enabled in ACPI drm/nouveau/disp/nv50-: prevent oops when no channel method map provided ftrace: fpid_next() should increase position index trigger_next should increase position index radeon: insert 10ms sleep in dce5_crtc_load_lut ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans() lib/scatterlist.c: adjust indentation in __sg_alloc_table reiserfs: prevent NULL pointer dereference in reiserfs_insert_item() bcache: explicity type cast in bset_bkey_last() irqchip/gic-v3-its: Reference to its_invall_cmd descriptor when building INVALL iwlwifi: mvm: Fix thermal zone registration microblaze: Prevent the overflow of the start brd: check and limit max_part par drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_latency drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage NFS: Fix memory leaks help_next should increase position index cifs: log warning message (once) if out of disk space virtio_balloon: prevent pfn array overflow mlxsw: spectrum_dpipe: Add missing error path drm/amdgpu/display: handle multiple numbers of fclks in dcn_calcs.c (v2) Linux 4.19.106 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ia1032b50dd82b42e13973120dcbf94ae7b864648	2020-02-24 09:13:25 +01:00
Vasily Averin	9ed840b756	trigger_next should increase position index [ Upstream commit 6722b23e7a2ace078344064a9735fb73e554e9ef ] if seq_file .next fuction does not change position index, read after some lseek can generate unexpected output. Without patch: # dd bs=30 skip=1 if=/sys/kernel/tracing/events/sched/sched_switch/trigger dd: /sys/kernel/tracing/events/sched/sched_switch/trigger: cannot skip to specified offset n traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist # Available triggers: # traceon traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist 6+1 records in 6+1 records out 206 bytes copied, 0.00027916 s, 738 kB/s Notice the printing of "# Available triggers:..." after the line. With the patch: # dd bs=30 skip=1 if=/sys/kernel/tracing/events/sched/sched_switch/trigger dd: /sys/kernel/tracing/events/sched/sched_switch/trigger: cannot skip to specified offset n traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist 2+1 records in 2+1 records out 88 bytes copied, 0.000526867 s, 167 kB/s It only prints the end of the file, and does not restart. Link: http://lkml.kernel.org/r/3c35ee24-dd3a-8119-9c19-552ed253388a@virtuozzo.com https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:52 +01:00
Vasily Averin	ddb005d906	ftrace: fpid_next() should increase position index [ Upstream commit e4075e8bdffd93a9b6d6e1d52fabedceeca5a91b ] if seq_file .next fuction does not change position index, read after some lseek can generate unexpected output. Without patch: # dd bs=4 skip=1 if=/sys/kernel/tracing/set_ftrace_pid dd: /sys/kernel/tracing/set_ftrace_pid: cannot skip to specified offset id no pid 2+1 records in 2+1 records out 10 bytes copied, 0.000213285 s, 46.9 kB/s Notice the "id" followed by "no pid". With the patch: # dd bs=4 skip=1 if=/sys/kernel/tracing/set_ftrace_pid dd: /sys/kernel/tracing/set_ftrace_pid: cannot skip to specified offset id 0+1 records in 0+1 records out 3 bytes copied, 0.000202112 s, 14.8 kB/s Notice that it only prints "id" and not the "no pid" afterward. Link: http://lkml.kernel.org/r/4f87c6ad-f114-30bb-8506-c32274ce2992@virtuozzo.com https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:52 +01:00
Vasily Averin	ca2b459365	bpf: map_seq_next should always increase position index [ Upstream commit 90435a7891a2259b0f74c5a1bc5600d0d64cba8f ] If seq_file .next fuction does not change position index, read after some lseek can generate an unexpected output. See also: https://bugzilla.kernel.org/show_bug.cgi?id=206283 v1 -> v2: removed missed increment in end of function Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/eca84fdd-c374-a154-d874-6c7b55fc3bc4@virtuozzo.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:51 +01:00
Jessica Yu	c371b1e41f	module: avoid setting info->name early in case we can fall back to info->mod->name [ Upstream commit 708e0ada1916be765b7faa58854062f2bc620bbf ] In setup_load_info(), info->name (which contains the name of the module, mostly used for early logging purposes before the module gets set up) gets unconditionally assigned if .modinfo is missing despite the fact that there is an if (!info->name) check near the end of the function. Avoid assigning a placeholder string to info->name if .modinfo doesn't exist, so that we can fall back to info->mod->name later on. Fixes: `5fdc7db644` ("module: setup load info before module_sig_check()") Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:49 +01:00
Thomas Gleixner	c2913e2c50	watchdog/softlockup: Enforce that timestamp is valid on boot [ Upstream commit 11e31f608b499f044f24b20be73f1dcab3e43f8a ] Robert reported that during boot the watchdog timestamp is set to 0 for one second which is the indicator for a watchdog reset. The reason for this is that the timestamp is in seconds and the time is taken from sched clock and divided by ~1e9. sched clock starts at 0 which means that for the first second during boot the watchdog timestamp is 0, i.e. reset. Use ULONG_MAX as the reset indicator value so the watchdog works correctly right from the start. ULONG_MAX would only conflict with a real timestamp if the system reaches an uptime of 136 years on 32bit and almost eternity on 64bit. Reported-by: Robert Richter <rrichter@marvell.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/87o8v3uuzl.fsf@nanos.tec.linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:49 +01:00
Steven Rostedt (VMware)	56d3793229	tracing: Fix very unlikely race of registering two stat tracers [ Upstream commit dfb6cd1e654315168e36d947471bd2a0ccd834ae ] Looking through old emails in my INBOX, I came across a patch from Luis Henriques that attempted to fix a race of two stat tracers registering the same stat trace (extremely unlikely, as this is done in the kernel, and probably doesn't even exist). The submitted patch wasn't quite right as it needed to deal with clean up a bit better (if two stat tracers were the same, it would have the same files). But to make the code cleaner, all we needed to do is to keep the all_stat_sessions_mutex held for most of the registering function. Link: http://lkml.kernel.org/r/1410299375-20068-1-git-send-email-luis.henriques@canonical.com Fixes: `002bb86d8d` ("tracing/ftrace: separate events tracing and stats tracing engine") Reported-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:39 +01:00
Luis Henriques	fb0085070a	tracing: Fix tracing_stat return values in error handling paths [ Upstream commit afccc00f75bbbee4e4ae833a96c2d29a7259c693 ] tracing_stat_init() was always returning '0', even on the error paths. It now returns -ENODEV if tracing_init_dentry() fails or -ENOMEM if it fails to created the 'trace_stat' debugfs directory. Link: http://lkml.kernel.org/r/1410299381-20108-1-git-send-email-luis.henriques@canonical.com Fixes: `ed6f1c996b` ("tracing: Check return value of tracing_init_dentry()") Signed-off-by: Luis Henriques <luis.henriques@canonical.com> [ Pulled from the archeological digging of my INBOX ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:39 +01:00
Peter Zijlstra	b9dc4d61b5	cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order [ Upstream commit 45178ac0cea853fe0e405bf11e101bdebea57b15 ] Paul reported a very sporadic, rcutorture induced, workqueue failure. When the planets align, the workqueue rescuer's self-migrate fails and then triggers a WARN for running a work on the wrong CPU. Tejun then figured that set_cpus_allowed_ptr()'s stop_one_cpu() call could be ignored! When stopper->enabled is false, stop_machine will insta complete the work, without actually doing the work. Worse, it will not WARN about this (we really should fix this). It turns out there is a small window where a freshly online'ed CPU is marked 'online' but doesn't yet have the stopper task running: BP AP bringup_cpu() __cpu_up(cpu, idle) --> start_secondary() ... cpu_startup_entry() bringup_wait_for_ap() wait_for_ap_thread() <-- cpuhp_online_idle() while (1) do_idle() ... available to run kthreads ... stop_machine_unpark() stopper->enable = true; Close this by moving the stop_machine_unpark() into cpuhp_online_idle(), such that the stopper thread is ready before we start the idle loop and schedule. Reported-by: "Paul E. McKenney" <paulmck@kernel.org> Debugged-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: "Paul E. McKenney" <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:35 +01:00
Quentin Perret	2a557de670	UPSTREAM: sched/topology: Introduce a sysctl for Energy Aware Scheduling In its current state, Energy Aware Scheduling (EAS) starts automatically on asymmetric platforms having an Energy Model (EM). However, there are users who want to have an EM (for thermal management for example), but don't want EAS with it. In order to let users disable EAS explicitly, introduce a new sysctl called 'sched_energy_aware'. It is enabled by default so that EAS can start automatically on platforms where it makes sense. Flipping it to 0 rebuilds the scheduling domains and disables EAS. Bug: 120440300 Signed-off-by: Quentin Perret <quentin.perret@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: adharmap@codeaurora.org Cc: chris.redpath@arm.com Cc: currojerez@riseup.net Cc: dietmar.eggemann@arm.com Cc: edubezval@gmail.com Cc: gregkh@linuxfoundation.org Cc: javi.merino@kernel.org Cc: joel@joelfernandes.org Cc: juri.lelli@redhat.com Cc: morten.rasmussen@arm.com Cc: patrick.bellasi@arm.com Cc: pkondeti@codeaurora.org Cc: rjw@rjwysocki.net Cc: skannan@codeaurora.org Cc: smuckle@google.com Cc: srinivas.pandruvada@linux.intel.com Cc: thara.gopinath@linaro.org Cc: tkjos@google.com Cc: valentin.schneider@arm.com Cc: vincent.guittot@linaro.org Cc: viresh.kumar@linaro.org Link: https://lkml.kernel.org/r/20181203095628.11858-11-quentin.perret@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 8d5d0cfb63cbcb4005e19a332b31d687b1d01e58) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: I4ca842d07b82869cfab7542c8c4351f631e1024d	2020-02-19 10:50:59 +00:00
Greg Kroah-Hartman	4eee97caec	This is the 4.19.104 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl5HEigACgkQONu9yGCS aT7Gcw/6AkDGkK5U/aDpKMqWiRmZUqDIg8U9xR+44Gl57Q71vicrzq8NGPHxxbsF slWoCyXLVSD7bMWGsTD0qJR8muROAraMxDl8dCxojEXHnXFMx4A4Cf0h1E0lY0mu Jq/O9m33ZMSppjio88sCcLpo0pbXF+cCX1CY87NI5QUitUzHgRh18W8BtyFpMMI8 eC0Fc+hMWax3+qqHt/hFVpufaTKm35zLCpGjGAJiHd7GFvqUJnuAzBYCs1Cf8NO1 KrrL3l/IWk8z3Z0Wc9PbBz309a9H6FVpjrXSXj6URkxjtqJ0F0mBMaIYxhaUF8PD CHY5xLyqKodC8/7O5zNOrP80oT9nqJvsmKwUwlG34IJuMVaq/o+hZu+88JVB02Yw v9XVcaQda5aZgWF9cBWzFQEcNwHFDCQ9VNidLDcHJLGPyFo/BogvMo8T4yPM9tI0 O0PSFm/yYu0airZSCzIbPzuF2Iv+iilVtq+o10VRDsGtEYAOzTL7nA01MkdXFhwy 4V+Q51C90TGo13BnnZ6xpEqjspuDWgeOD71/xkQ5cnyFgam0XQq/5R6JJghJIHOP 7p8NMMyNhK2FnOGrFUgqvwBCp6Dap1ISZyKvie1Z8vuCJsZcwMVIw8fxAzoZWOjj MlmmePjlbC7XTFxjdo0jrQTdvBwq+gFgNitD7UAlfHAdqKJKKA4= =8ktI -----END PGP SIGNATURE----- Merge 4.19.104 into android-4.19 Changes in 4.19.104 ASoC: pcm: update FE/BE trigger order based on the command hv_sock: Remove the accept port restriction IB/mlx4: Fix memory leak in add_gid error flow RDMA/netlink: Do not always generate an ACK for some netlink operations RDMA/core: Fix locking in ib_uverbs_event_read RDMA/uverbs: Verify MR access flags scsi: ufs: Fix ufshcd_probe_hba() reture value in case ufshcd_scsi_add_wlus() fails PCI/IOV: Fix memory leak in pci_iov_add_virtfn() ath10k: pci: Only dump ATH10K_MEM_REGION_TYPE_IOREG when safe PCI/switchtec: Fix vep_vector_number ioread width PCI: Don't disable bridge BARs when assigning bus resources nfs: NFS_SWAP should depend on SWAP NFS: Revalidate the file size on a fatal write error NFS/pnfs: Fix pnfs_generic_prepare_to_resend_writes() NFSv4: try lease recovery on NFS4ERR_EXPIRED serial: uartps: Add a timeout to the tx empty wait gpio: zynq: Report gpio direction at boot spi: spi-mem: Add extra sanity checks on the op param spi: spi-mem: Fix inverted logic in op sanity check rtc: hym8563: Return -EINVAL if the time is known to be invalid rtc: cmos: Stop using shared IRQ ARC: [plat-axs10x]: Add missing multicast filter number to GMAC node platform/x86: intel_mid_powerbtn: Take a copy of ddata ARM: dts: at91: Reenable UART TX pull-ups ARM: dts: am43xx: add support for clkout1 clock ARM: dts: at91: sama5d3: fix maximum peripheral clock rates ARM: dts: at91: sama5d3: define clock rate range for tcb1 tools/power/acpi: fix compilation error powerpc/pseries/vio: Fix iommu_table use-after-free refcount warning powerpc/pseries: Allow not having ibm, hypertas-functions::hcall-multi-tce for DDW iommu/arm-smmu-v3: Populate VMID field for CMDQ_OP_TLBI_NH_VA KVM: arm/arm64: vgic-its: Fix restoration of unmapped collections ARM: 8949/1: mm: mark free_memmap as __init arm64: cpufeature: Fix the type of no FP/SIMD capability arm64: ptrace: nofpsimd: Fail FP/SIMD regset operations KVM: arm/arm64: Fix young bit from mmu notifier KVM: arm: Fix DFSR setting for non-LPAE aarch32 guests KVM: arm: Make inject_abt32() inject an external abort instead KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset mtd: onenand_base: Adjust indentation in onenand_read_ops_nolock mtd: sharpslpart: Fix unsigned comparison to zero crypto: artpec6 - return correct error code for failed setkey() crypto: atmel-sha - fix error handling when setting hmac key media: i2c: adv748x: Fix unsafe macros pinctrl: sh-pfc: r8a7778: Fix duplicate SDSELF_B and SD1_CLK_B mwifiex: Fix possible buffer overflows in mwifiex_ret_wmm_get_status() mwifiex: Fix possible buffer overflows in mwifiex_cmd_append_vsie_tlv() libertas: don't exit from lbs_ibss_join_existing() with RCU read lock held libertas: make lbs_ibss_join_existing() return error code on rates overflow scsi: megaraid_sas: Do not initiate OCR if controller is not in ready state x86/stackframe: Move ENCODE_FRAME_POINTER to asm/frame.h x86/stackframe, x86/ftrace: Add pt_regs frame annotations serial: uartps: Move the spinlock after the read of the tx empty padata: fix null pointer deref of pd->pinst Linux 4.19.104 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I42a465b140183dcc8cf49e19903d0e8f4b688930	2020-02-19 08:31:05 +01:00
Daniel Jordan	cad926f70b	padata: fix null pointer deref of pd->pinst The 4.19 backport `dc34710a7a` ("padata: Remove broken queue flushing") removed padata_alloc_pd()'s assignment to pd->pinst, resulting in: Unable to handle kernel NULL pointer dereference ... ... pc : padata_reorder+0x144/0x2e0 ... Call trace: padata_reorder+0x144/0x2e0 padata_do_serial+0xc8/0x128 pcrypt_aead_enc+0x60/0x70 [pcrypt] padata_parallel_worker+0xd8/0x138 process_one_work+0x1bc/0x4b8 worker_thread+0x164/0x580 kthread+0x134/0x138 ret_from_fork+0x10/0x18 This happened because the backport was based on an enhancement that moved this assignment but isn't in 4.19: bfde23ce200e ("padata: unbind parallel jobs from specific CPUs") Simply restore the assignment to fix the crash. Fixes: `dc34710a7a` ("padata: Remove broken queue flushing") Reported-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Sasha Levin <sashal@kernel.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-14 16:33:28 -05:00
Greg Kroah-Hartman	3389e56d31	This is the 4.19.103 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl5Cn0wACgkQONu9yGCS aT584xAAtePSlzTxst/jukREoyrpAfTM1BeovMdsZEBpKh+/F3n1udqHeo+iNAAN qSOig012aW2qP7b5/4CrEU9ZRTvd0AM4fog7ABLJVahMYMqoJgod8TRaE4v0nVut eRans6w3NbZJCZwdw2aiu5gwFfjwJLSUckBNmj4XVYdyfh7q0BgnZV5OY0V+zhuG 1MWXaylbRqjguR/ZFk0UPAmRaqNKHbwfCJ1V0ygL9xQkJM0cUn7hX9/CqM4aYnm6 m1oux4ektLAmF1XK4NiQEuRBMeFO74XlKcsZqQHf/b4FZfcPergcPwIj8ugtCHzJ kx2QgURDjgH4Tnu+Q0ScPrjj2kjU8rWmjqlcv1PcUyOWm+MR0OK9bW7TLEntMSF8 HOEe9j6SsjQNIOoYh1YcMnuGjKNIZjl2L3VbDzpVN2GxZxwAutY6G68tV7sbA2pu wtsrAVOqdcjoo0ruRmwognBqQAdNdsbiBx7bgcNjVEXWL0N3Ddiv6CNYwnehA5Hq cvQwVQpFGP9ZGYUcCMbdwR+7kJzVy6V2S615M8GkE9FouOwTfV60zM/sZ1rFVt1J 70zxfRX5ys19aTAVkbi6pHHCUJ0ZAiTgWujp5Hp4kPt7gEz01Ur0s1kI3b7b6iWh cuycRFULvqeXCApQacs//lOVDoUV20uFcL/zqOFM33v/+YzkyjA= =3D8z -----END PGP SIGNATURE----- Merge 4.19.103 into android-4.19 Changes in 4.19.103 Revert "drm/sun4i: dsi: Change the start delay calculation" ovl: fix lseek overflow on 32bit kernel/module: Fix memleak in module_add_modinfo_attrs() media: iguanair: fix endpoint sanity check ocfs2: fix oops when writing cloned file x86/cpu: Update cached HLE state on write to TSX_CTRL_CPUID_CLEAR udf: Allow writing to 'Rewritable' partitions printk: fix exclusive_console replaying iwlwifi: mvm: fix NVM check for 3168 devices sparc32: fix struct ipc64_perm type definition cls_rsvp: fix rsvp_policy gtp: use __GFP_NOWARN to avoid memalloc warning l2tp: Allow duplicate session creation with UDP net: hsr: fix possible NULL deref in hsr_handle_frame() net_sched: fix an OOB access in cls_tcindex net: stmmac: Delete txtimer in suspend() bnxt_en: Fix TC queue mapping. tcp: clear tp->total_retrans in tcp_disconnect() tcp: clear tp->delivered in tcp_disconnect() tcp: clear tp->data_segs{in\|out} in tcp_disconnect() tcp: clear tp->segs_{in\|out} in tcp_disconnect() rxrpc: Fix use-after-free in rxrpc_put_local() rxrpc: Fix insufficient receive notification generation rxrpc: Fix missing active use pinning of rxrpc_local object rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect media: uvcvideo: Avoid cyclic entity chains due to malformed USB descriptors mfd: dln2: More sanity checking for endpoints ipc/msg.c: consolidate all xxxctl_down() functions tracing: Fix sched switch start/stop refcount racy updates rcu: Avoid data-race in rcu_gp_fqs_check_wake() brcmfmac: Fix memory leak in brcmf_usbdev_qinit usb: typec: tcpci: mask event interrupts when remove driver usb: gadget: legacy: set max_speed to super-speed usb: gadget: f_ncm: Use atomic_t to track in-flight request usb: gadget: f_ecm: Use atomic_t to track in-flight request ALSA: usb-audio: Fix endianess in descriptor validation ALSA: dummy: Fix PCM format loop in proc output mm/memory_hotplug: fix remove_memory() lockdep splat mm: move_pages: report the number of non-attempted pages media/v4l2-core: set pages dirty upon releasing DMA buffers media: v4l2-core: compat: ignore native command codes media: v4l2-rect.h: fix v4l2_rect_map_inside() top/left adjustments lib/test_kasan.c: fix memory leak in kmalloc_oob_krealloc_more() irqdomain: Fix a memory leak in irq_domain_push_irq() platform/x86: intel_scu_ipc: Fix interrupt support ALSA: hda: Add Clevo W65_67SB the power_save blacklist KVM: arm64: Correct PSTATE on exception entry KVM: arm/arm64: Correct CPSR on exception entry KVM: arm/arm64: Correct AArch32 SPSR on exception entry KVM: arm64: Only sign-extend MMIO up to register width MIPS: fix indentation of the 'RELOCS' message MIPS: boot: fix typo in 'vmlinux.lzma.its' target s390/mm: fix dynamic pagetable upgrade for hugetlbfs powerpc/xmon: don't access ASDR in VMs powerpc/pseries: Advance pfn if section is not present in lmb_is_removable() smb3: fix signing verification of large reads PCI: tegra: Fix return value check of pm_runtime_get_sync() mmc: spi: Toggle SPI polarity, do not hardcode it ACPI: video: Do not export a non working backlight interface on MSI MS-7721 boards ACPI / battery: Deal with design or full capacity being reported as -1 ACPI / battery: Use design-cap for capacity calculations if full-cap is not available ACPI / battery: Deal better with neither design nor full capacity not being reported alarmtimer: Unregister wakeup source when module get fails ubifs: Reject unsupported ioctl flags explicitly ubifs: don't trigger assertion on invalid no-key filename ubifs: Fix FS_IOC_SETFLAGS unexpectedly clearing encrypt flag ubifs: Fix deadlock in concurrent bulk-read and writepage crypto: geode-aes - convert to skcipher API and make thread-safe PCI: keystone: Fix link training retries initiation mmc: sdhci-of-at91: fix memleak on clk_get failure hv_balloon: Balloon up according to request page number mfd: axp20x: Mark AXP20X_VBUS_IPSOUT_MGMT as volatile crypto: api - Check spawn->alg under lock in crypto_drop_spawn crypto: ccree - fix backlog memory leak crypto: ccree - fix pm wrongful error reporting crypto: ccree - fix PM race condition scripts/find-unused-docs: Fix massive false positives scsi: qla2xxx: Fix mtcp dump collection failure power: supply: ltc2941-battery-gauge: fix use-after-free ovl: fix wrong WARN_ON() in ovl_cache_update_ino() f2fs: choose hardlimit when softlimit is larger than hardlimit in f2fs_statfs_project() f2fs: fix miscounted block limit in f2fs_statfs_project() f2fs: code cleanup for f2fs_statfs_project() PM: core: Fix handling of devices deleted during system-wide resume of: Add OF_DMA_DEFAULT_COHERENT & select it on powerpc dm zoned: support zone sizes smaller than 128MiB dm space map common: fix to ensure new block isn't already in use dm crypt: fix benbi IV constructor crash if used in authenticated mode dm: fix potential for q->make_request_fn NULL pointer dm writecache: fix incorrect flush sequence when doing SSD mode commit padata: Remove broken queue flushing tracing: Annotate ftrace_graph_hash pointer with __rcu tracing: Annotate ftrace_graph_notrace_hash pointer with __rcu ftrace: Add comment to why rcu_dereference_sched() is open coded ftrace: Protect ftrace_graph_hash with ftrace_sync samples/bpf: Don't try to remove user's homedir on clean crypto: ccp - set max RSA modulus size for v3 platform devices as well crypto: pcrypt - Do not clear MAY_SLEEP flag in original request crypto: atmel-aes - Fix counter overflow in CTR mode crypto: api - Fix race condition in crypto_spawn_alg crypto: picoxcell - adjust the position of tasklet_init and fix missed tasklet_kill scsi: qla2xxx: Fix unbound NVME response length NFS: Fix memory leaks and corruption in readdir NFS: Directory page cache pages need to be locked when read jbd2_seq_info_next should increase position index Btrfs: fix missing hole after hole punching and fsync when using NO_HOLES btrfs: set trans->drity in btrfs_commit_transaction Btrfs: fix race between adding and putting tree mod seq elements and nodes ARM: tegra: Enable PLLP bypass during Tegra124 LP1 iwlwifi: don't throw error when trying to remove IGTK mwifiex: fix unbalanced locking in mwifiex_process_country_ie() sunrpc: expiry_time should be seconds not timeval gfs2: move setting current->backing_dev_info gfs2: fix O_SYNC write handling drm/rect: Avoid division by zero media: rc: ensure lirc is initialized before registering input device tools/kvm_stat: Fix kvm_exit filter name xen/balloon: Support xend-based toolstack take two watchdog: fix UAF in reboot notifier handling in watchdog core code bcache: add readahead cache policy options via sysfs interface eventfd: track eventfd_signal() recursion depth aio: prevent potential eventfd recursion on poll KVM: x86: Refactor picdev_write() to prevent Spectre-v1/L1TF attacks KVM: x86: Refactor prefix decoding to prevent Spectre-v1/L1TF attacks KVM: x86: Protect pmu_intel.c from Spectre-v1/L1TF attacks KVM: x86: Protect DR-based index computations from Spectre-v1/L1TF attacks KVM: x86: Protect kvm_lapic_reg_write() from Spectre-v1/L1TF attacks KVM: x86: Protect kvm_hv_msr_[get\|set]_crash_data() from Spectre-v1/L1TF attacks KVM: x86: Protect ioapic_write_indirect() from Spectre-v1/L1TF attacks KVM: x86: Protect MSR-based index computations in pmu.h from Spectre-v1/L1TF attacks KVM: x86: Protect ioapic_read_indirect() from Spectre-v1/L1TF attacks KVM: x86: Protect MSR-based index computations from Spectre-v1/L1TF attacks in x86.c KVM: x86: Protect x86_decode_insn from Spectre-v1/L1TF attacks KVM: x86: Protect MSR-based index computations in fixed_msr_to_seg_unit() from Spectre-v1/L1TF attacks KVM: x86: Fix potential put_fpu() w/o load_fpu() on MPX platform KVM: PPC: Book3S HV: Uninit vCPU if vcore creation fails KVM: PPC: Book3S PR: Free shared page if mmu initialization fails x86/kvm: Be careful not to clear KVM_VCPU_FLUSH_TLB bit KVM: x86: Don't let userspace set host-reserved cr4 bits KVM: x86: Free wbinvd_dirty_mask if vCPU creation fails KVM: s390: do not clobber registers during guest reset/store status clk: tegra: Mark fuse clock as critical drm/amd/dm/mst: Ignore payload update failures percpu: Separate decrypted varaibles anytime encryption can be enabled scsi: qla2xxx: Fix the endianness of the qla82xx_get_fw_size() return type scsi: csiostor: Adjust indentation in csio_device_reset scsi: qla4xxx: Adjust indentation in qla4xxx_mem_free scsi: ufs: Recheck bkops level if bkops is disabled phy: qualcomm: Adjust indentation in read_poll_timeout ext2: Adjust indentation in ext2_fill_super powerpc/44x: Adjust indentation in ibm4xx_denali_fixup_memsize drm: msm: mdp4: Adjust indentation in mdp4_dsi_encoder_enable NFC: pn544: Adjust indentation in pn544_hci_check_presence ppp: Adjust indentation into ppp_async_input net: smc911x: Adjust indentation in smc911x_phy_configure net: tulip: Adjust indentation in {dmfe, uli526x}_init_module IB/mlx5: Fix outstanding_pi index for GSI qps IB/core: Fix ODP get user pages flow nfsd: fix delay timer on 32-bit architectures nfsd: fix jiffies/time_t mixup in LRU list nfsd: Return the correct number of bytes written to the file ubi: fastmap: Fix inverted logic in seen selfcheck ubi: Fix an error pointer dereference in error handling code mfd: da9062: Fix watchdog compatible string mfd: rn5t618: Mark ADC control register volatile bonding/alb: properly access headers in bond_alb_xmit() net: dsa: bcm_sf2: Only 7278 supports 2Gb/sec IMP port net: mvneta: move rx_dropped and rx_errors in per-cpu stats net_sched: fix a resource leak in tcindex_set_parms() net: systemport: Avoid RBUF stuck in Wake-on-LAN mode net/mlx5: IPsec, Fix esp modify function attribute net/mlx5: IPsec, fix memory leak at mlx5_fpga_ipsec_delete_sa_ctx net: macb: Remove unnecessary alignment check for TSO net: macb: Limit maximum GEM TX length in TSO net: dsa: b53: Always use dev->vlan_enabled in b53_configure_vlan() ext4: fix deadlock allocating crypto bounce page from mempool btrfs: use bool argument in free_root_pointers() btrfs: free block groups after free'ing fs trees drm: atmel-hlcdc: enable clock before configuring timing engine drm/dp_mst: Remove VCPI while disabling topology mgr btrfs: flush write bio if we loop in extent_write_cache_pages KVM: x86/mmu: Apply max PA check for MMIO sptes to 32-bit KVM KVM: x86: Use gpa_t for cr2/gpa to fix TDP support on 32-bit KVM KVM: VMX: Add non-canonical check on writes to RTIT address MSRs KVM: nVMX: vmread should not set rflags to specify success in case of #PF KVM: Use vcpu-specific gva->hva translation when querying host page size KVM: Play nice with read-only memslots when querying host page size mm: zero remaining unavailable struct pages mm: return zero_resv_unavail optimization mm/page_alloc.c: fix uninitialized memmaps on a partially populated last section cifs: fail i/o on soft mounts if sessionsetup errors out x86/apic/msi: Plug non-maskable MSI affinity race clocksource: Prevent double add_timer_on() for watchdog_timer perf/core: Fix mlock accounting in perf_mmap() rxrpc: Fix service call disconnection Linux 4.19.103 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I0d7f09085c3541373e0fd6b2e3ffacc5e34f7d55	2020-02-11 15:05:03 -08:00
Song Liu	a3623db43a	perf/core: Fix mlock accounting in perf_mmap() commit 003461559ef7a9bd0239bae35a22ad8924d6e9ad upstream. Decreasing sysctl_perf_event_mlock between two consecutive perf_mmap()s of a perf ring buffer may lead to an integer underflow in locked memory accounting. This may lead to the undesired behaviors, such as failures in BPF map creation. Address this by adjusting the accounting logic to take into account the possibility that the amount of already locked memory may exceed the current limit. Fixes: c4b75479741c ("perf/core: Make the mlock accounting simple again") Suggested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lkml.kernel.org/r/20200123181146.2238074-1-songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:34:19 -08:00
Konstantin Khlebnikov	6284d30e96	clocksource: Prevent double add_timer_on() for watchdog_timer commit febac332a819f0e764aa4da62757ba21d18c182b upstream. Kernel crashes inside QEMU/KVM are observed: kernel BUG at kernel/time/timer.c:1154! BUG_ON(timer_pending(timer) \|\| !timer->function) in add_timer_on(). At the same time another cpu got: general protection fault: 0000 [#1] SMP PTI of poinson pointer 0xdead000000000200 in: __hlist_del at include/linux/list.h:681 (inlined by) detach_timer at kernel/time/timer.c:818 (inlined by) expire_timers at kernel/time/timer.c:1355 (inlined by) __run_timers at kernel/time/timer.c:1686 (inlined by) run_timer_softirq at kernel/time/timer.c:1699 Unfortunately kernel logs are badly scrambled, stacktraces are lost. Printing the timer->function before the BUG_ON() pointed to clocksource_watchdog(). The execution of clocksource_watchdog() can race with a sequence of clocksource_stop_watchdog() .. clocksource_start_watchdog(): expire_timers() detach_timer(timer, true); timer->entry.pprev = NULL; raw_spin_unlock_irq(&base->lock); call_timer_fn clocksource_watchdog() clocksource_watchdog_kthread() or clocksource_unbind() spin_lock_irqsave(&watchdog_lock, flags); clocksource_stop_watchdog(); del_timer(&watchdog_timer); watchdog_running = 0; spin_unlock_irqrestore(&watchdog_lock, flags); spin_lock_irqsave(&watchdog_lock, flags); clocksource_start_watchdog(); add_timer_on(&watchdog_timer, ...); watchdog_running = 1; spin_unlock_irqrestore(&watchdog_lock, flags); spin_lock(&watchdog_lock); add_timer_on(&watchdog_timer, ...); BUG_ON(timer_pending(timer) \|\| !timer->function); timer_pending() -> true BUG() I.e. inside clocksource_watchdog() watchdog_timer could be already armed. Check timer_pending() before calling add_timer_on(). This is sufficient as all operations are synchronized by watchdog_lock. Fixes: `75c5158f70` ("timekeeping: Update clocksource with stop_machine") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/158048693917.4378.13823603769948933793.stgit@buzz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:34:18 -08:00
Thomas Gleixner	032a2bf978	x86/apic/msi: Plug non-maskable MSI affinity race commit 6f1a4891a5928a5969c87fa5a584844c983ec823 upstream. Evan tracked down a subtle race between the update of the MSI message and the device raising an interrupt internally on PCI devices which do not support MSI masking. The update of the MSI message is non-atomic and consists of either 2 or 3 sequential 32bit wide writes to the PCI config space. - Write address low 32bits - Write address high 32bits (If supported by device) - Write data When an interrupt is migrated then both address and data might change, so the kernel attempts to mask the MSI interrupt first. But for MSI masking is optional, so there exist devices which do not provide it. That means that if the device raises an interrupt internally between the writes then a MSI message is sent built from half updated state. On x86 this can lead to spurious interrupts on the wrong interrupt vector when the affinity setting changes both address and data. As a consequence the device interrupt can be lost causing the device to become stuck or malfunctioning. Evan tried to handle that by disabling MSI accross an MSI message update. That's not feasible because disabling MSI has issues on its own: If MSI is disabled the PCI device is routing an interrupt to the legacy INTx mechanism. The INTx delivery can be disabled, but the disablement is not working on all devices. Some devices lose interrupts when both MSI and INTx delivery are disabled. Another way to solve this would be to enforce the allocation of the same vector on all CPUs in the system for this kind of screwed devices. That could be done, but it would bring back the vector space exhaustion problems which got solved a few years ago. Fortunately the high address (if supported by the device) is only relevant when X2APIC is enabled which implies interrupt remapping. In the interrupt remapping case the affinity setting is happening at the interrupt remapping unit and the PCI MSI message is programmed only once when the PCI device is initialized. That makes it possible to solve it with a two step update: 1) Target the MSI msg to the new vector on the current target CPU 2) Target the MSI msg to the new vector on the new target CPU In both cases writing the MSI message is only changing a single 32bit word which prevents the issue of inconsistency. After writing the final destination it is necessary to check whether the device issued an interrupt while the intermediate state #1 (new vector, current CPU) was in effect. This is possible because the affinity change is always happening on the current target CPU. The code runs with interrupts disabled, so the interrupt can be detected by checking the IRR of the local APIC. If the vector is pending in the IRR then the interrupt is retriggered on the new target CPU by sending an IPI for the associated vector on the target CPU. This can cause spurious interrupts on both the local and the new target CPU. 1) If the new vector is not in use on the local CPU and the device affected by the affinity change raised an interrupt during the transitional state (step #1 above) then interrupt entry code will ignore that spurious interrupt. The vector is marked so that the 'No irq handler for vector' warning is supressed once. 2) If the new vector is in use already on the local CPU then the IRR check might see an pending interrupt from the device which is using this vector. The IPI to the new target CPU will then invoke the handler of the device, which got the affinity change, even if that device did not issue an interrupt 3) If the new vector is in use already on the local CPU and the device affected by the affinity change raised an interrupt during the transitional state (step #1 above) then the handler of the device which uses that vector on the local CPU will be invoked. expose issues in device driver interrupt handlers which are not prepared to handle a spurious interrupt correctly. This not a regression, it's just exposing something which was already broken as spurious interrupts can happen for a lot of reasons and all driver handlers need to be able to deal with them. Reported-by: Evan Green <evgreen@chromium.org> Debugged-by: Evan Green <evgreen@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Evan Green <evgreen@chromium.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87imkr4s7n.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:34:18 -08:00
Steven Rostedt (VMware)	0948d6294d	ftrace: Protect ftrace_graph_hash with ftrace_sync [ Upstream commit 54a16ff6f2e50775145b210bcd94d62c3c2af117 ] As function_graph tracer can run when RCU is not "watching", it can not be protected by synchronize_rcu() it requires running a task on each CPU before it can be freed. Calling schedule_on_each_cpu(ftrace_sync) needs to be used. Link: https://lore.kernel.org/r/20200205131110.GT2935@paulmck-ThinkPad-P72 Cc: stable@vger.kernel.org Fixes: `b9b0c831be` ("ftrace: Convert graph filter to use hash tables") Reported-by: "Paul E. McKenney" <paulmck@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:34:05 -08:00
Steven Rostedt (VMware)	c03d235980	ftrace: Add comment to why rcu_dereference_sched() is open coded [ Upstream commit 16052dd5bdfa16dbe18d8c1d4cde2ddab9d23177 ] Because the function graph tracer can execute in sections where RCU is not "watching", the rcu_dereference_sched() for the has needs to be open coded. This is fine because the RCU "flavor" of the ftrace hash is protected by its own RCU handling (it does its own little synchronization on every CPU and does not rely on RCU sched). Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:34:04 -08:00
Amol Grover	30afa80b0f	tracing: Annotate ftrace_graph_notrace_hash pointer with __rcu [ Upstream commit fd0e6852c407dd9aefc594f54ddcc21d84803d3b ] Fix following instances of sparse error kernel/trace/ftrace.c:5667:29: error: incompatible types in comparison kernel/trace/ftrace.c:5813:21: error: incompatible types in comparison kernel/trace/ftrace.c:5868:36: error: incompatible types in comparison kernel/trace/ftrace.c:5870:25: error: incompatible types in comparison Use rcu_dereference_protected to dereference the newly annotated pointer. Link: http://lkml.kernel.org/r/20200205055701.30195-1-frextrite@gmail.com Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:34:04 -08:00
Amol Grover	f144ad2e84	tracing: Annotate ftrace_graph_hash pointer with __rcu [ Upstream commit 24a9729f831462b1d9d61dc85ecc91c59037243f ] Fix following instances of sparse error kernel/trace/ftrace.c:5664:29: error: incompatible types in comparison kernel/trace/ftrace.c:5785:21: error: incompatible types in comparison kernel/trace/ftrace.c:5864:36: error: incompatible types in comparison kernel/trace/ftrace.c:5866:25: error: incompatible types in comparison Use rcu_dereference_protected to access the __rcu annotated pointer. Link: http://lkml.kernel.org/r/20200201072703.17330-1-frextrite@gmail.com Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:34:04 -08:00
Herbert Xu	dc34710a7a	padata: Remove broken queue flushing [ Upstream commit 07928d9bfc81640bab36f5190e8725894d93b659 ] The function padata_flush_queues is fundamentally broken because it cannot force padata users to complete the request that is underway. IOW padata has to passively wait for the completion of any outstanding work. As it stands flushing is used in two places. Its use in padata_stop is simply unnecessary because nothing depends on the queues to be flushed afterwards. The other use in padata_replace is more substantial as we depend on it to free the old pd structure. This patch instead uses the pd->refcnt to dynamically free the pd structure once all requests are complete. Fixes: `2b73b07ab8` ("padata: Flush the padata queues actively") Cc: <stable@vger.kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:34:04 -08:00
Stephen Boyd	b522ff023e	alarmtimer: Unregister wakeup source when module get fails commit 6b6d188aae79a630957aefd88ff5c42af6553ee3 upstream. The alarmtimer_rtc_add_device() function creates a wakeup source and then tries to grab a module reference. If that fails the function returns early with an error code, but fails to remove the wakeup source. Cleanup this exit path so there is no dangling wakeup source, which is named 'alarmtime' left allocated which will conflict with another RTC device that may be registered later. Fixes: `51218298a2` ("alarmtimer: Ensure RTC module is not unloaded") Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Douglas Anderson <dianders@chromium.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200109155910.907-2-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:33:59 -08:00
Kevin Hao	4f7d834cec	irqdomain: Fix a memory leak in irq_domain_push_irq() commit 0f394daef89b38d58c91118a2b08b8a1b316703b upstream. Fix a memory leak reported by kmemleak: unreferenced object 0xffff000bc6f50e80 (size 128): comm "kworker/23:2", pid 201, jiffies 4294894947 (age 942.132s) hex dump (first 32 bytes): 00 00 00 00 41 00 00 00 86 c0 03 00 00 00 00 00 ....A........... 00 a0 b2 c6 0b 00 ff ff 40 51 fd 10 00 80 ff ff ........@Q...... backtrace: [<00000000e62d2240>] kmem_cache_alloc_trace+0x1a4/0x320 [<00000000279143c9>] irq_domain_push_irq+0x7c/0x188 [<00000000d9f4c154>] thunderx_gpio_probe+0x3ac/0x438 [<00000000fd09ec22>] pci_device_probe+0xe4/0x198 [<00000000d43eca75>] really_probe+0xdc/0x320 [<00000000d3ebab09>] driver_probe_device+0x5c/0xf0 [<000000005b3ecaa0>] __device_attach_driver+0x88/0xc0 [<000000004e5915f5>] bus_for_each_drv+0x7c/0xc8 [<0000000079d4db41>] __device_attach+0xe4/0x140 [<00000000883bbda9>] device_initial_probe+0x18/0x20 [<000000003be59ef6>] bus_probe_device+0x98/0xa0 [<0000000039b03d3f>] deferred_probe_work_func+0x74/0xa8 [<00000000870934ce>] process_one_work+0x1c8/0x470 [<00000000e3cce570>] worker_thread+0x1f8/0x428 [<000000005d64975e>] kthread+0xfc/0x128 [<00000000f0eaa764>] ret_from_fork+0x10/0x18 Fixes: `495c38d300` ("irqdomain: Add irq_domain_{push,pop}_irq() functions") Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200120043547.22271-1-haokexin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:33:57 -08:00
Eric Dumazet	00b13445f9	rcu: Avoid data-race in rcu_gp_fqs_check_wake() commit 6935c3983b246d5fbfebd3b891c825e65c118f2d upstream. The rcu_gp_fqs_check_wake() function uses rcu_preempt_blocked_readers_cgp() to read ->gp_tasks while other cpus might overwrite this field. We need READ_ONCE()/WRITE_ONCE() pairs to avoid compiler tricks and KCSAN splats like the following : BUG: KCSAN: data-race in rcu_gp_fqs_check_wake / rcu_preempt_deferred_qs_irqrestore write to 0xffffffff85a7f190 of 8 bytes by task 7317 on cpu 0: rcu_preempt_deferred_qs_irqrestore+0x43d/0x580 kernel/rcu/tree_plugin.h:507 rcu_read_unlock_special+0xec/0x370 kernel/rcu/tree_plugin.h:659 __rcu_read_unlock+0xcf/0xe0 kernel/rcu/tree_plugin.h:394 rcu_read_unlock include/linux/rcupdate.h:645 [inline] __ip_queue_xmit+0x3b0/0xa40 net/ipv4/ip_output.c:533 ip_queue_xmit+0x45/0x60 include/net/ip.h:236 __tcp_transmit_skb+0xdeb/0x1cd0 net/ipv4/tcp_output.c:1158 __tcp_send_ack+0x246/0x300 net/ipv4/tcp_output.c:3685 tcp_send_ack+0x34/0x40 net/ipv4/tcp_output.c:3691 tcp_cleanup_rbuf+0x130/0x360 net/ipv4/tcp.c:1575 tcp_recvmsg+0x633/0x1a30 net/ipv4/tcp.c:2179 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec net/socket.c:871 [inline] sock_recvmsg net/socket.c:889 [inline] sock_recvmsg+0x92/0xb0 net/socket.c:885 sock_read_iter+0x15f/0x1e0 net/socket.c:967 call_read_iter include/linux/fs.h:1864 [inline] new_sync_read+0x389/0x4f0 fs/read_write.c:414 read to 0xffffffff85a7f190 of 8 bytes by task 10 on cpu 1: rcu_gp_fqs_check_wake kernel/rcu/tree.c:1556 [inline] rcu_gp_fqs_check_wake+0x93/0xd0 kernel/rcu/tree.c:1546 rcu_gp_fqs_loop+0x36c/0x580 kernel/rcu/tree.c:1611 rcu_gp_kthread+0x143/0x220 kernel/rcu/tree.c:1768 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 10 Comm: rcu_preempt Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> [ paulmck: Added another READ_ONCE() for RCU CPU stall warnings. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:33:55 -08:00
Mathieu Desnoyers	62bfa26e4d	tracing: Fix sched switch start/stop refcount racy updates commit 64ae572bc7d0060429e40e1c8d803ce5eb31a0d6 upstream. Reading the sched_cmdline_ref and sched_tgid_ref initial state within tracing_start_sched_switch without holding the sched_register_mutex is racy against concurrent updates, which can lead to tracepoint probes being registered more than once (and thus trigger warnings within tracepoint.c). [ May be the fix for this bug ] Link: https://lore.kernel.org/r/000000000000ab6f84056c786b93@google.com Link: http://lkml.kernel.org/r/20190817141208.15226-1-mathieu.desnoyers@efficios.com Cc: stable@vger.kernel.org CC: Steven Rostedt (VMware) <rostedt@goodmis.org> CC: Joel Fernandes (Google) <joel@joelfernandes.org> CC: Peter Zijlstra <peterz@infradead.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Paul E. McKenney <paulmck@linux.ibm.com> Reported-by: syzbot+774fddf07b7ab29a1e55@syzkaller.appspotmail.com Fixes: `d914ba37d7` ("tracing: Add support for recording tgid of tasks") Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:33:55 -08:00
John Ogness	8360063bfa	printk: fix exclusive_console replaying [ Upstream commit def97da136515cb289a14729292c193e0a93bc64 ] Commit f92b070f2dc8 ("printk: Do not miss new messages when replaying the log") introduced a new variable @exclusive_console_stop_seq to store when an exclusive console should stop printing. It should be set to the @console_seq value at registration. However, @console_seq is previously set to @syslog_seq so that the exclusive console knows where to begin. This results in the exclusive console immediately reactivating all the other consoles and thus repeating the messages for those consoles. Set @console_seq after @exclusive_console_stop_seq has stored the current @console_seq value. Fixes: f92b070f2dc8 ("printk: Do not miss new messages when replaying the log") Link: http://lkml.kernel.org/r/20191219115322.31160-1-john.ogness@linutronix.de Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: John Ogness <john.ogness@linutronix.de> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:33:51 -08:00
YueHaibing	bdfaaf35ac	kernel/module: Fix memleak in module_add_modinfo_attrs() [ Upstream commit f6d061d617124abbd55396a3bc37b9bf7d33233c ] In module_add_modinfo_attrs() if sysfs_create_file() fails on the first iteration of the loop (so i = 0), we forget to free the modinfo_attrs. Fixes: bc6f2a757d52 ("kernel/module: Fix mem leak in module_add_modinfo_attrs") Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:33:50 -08:00
Greg Kroah-Hartman	83b584a64c	This is the 4.19.102 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl461NIACgkQONu9yGCS aT6Mqw//W5xIIcs0Ut+P+QYNN6lCTRJ0AvFUolz79M3pyK/rHUluwTvYJbDAeGE3 sckv96rE1pxj5ZSf6LegXIoALrA4RlYHS8xXkYnRrt6xfrb7UwpqsJtt4Mx+IrJ3 9uFfaWRSvuDfRCraZxLiE2Bl9xVYvaPfFJYBmH383VB+deYNfpwORFsqNDQT+gR6 PZLuV0x//Kerwmd4OvaaHR/fIl8YVKmIz5lu3+3WIuVKxTK6Bbd3YzVu13dhVaX2 mETflLEAO/sYsUQiS4SO22ejLAiWyD8LyMV8s9KeTFQXzML3JpibKnt3ySDfzsFE m8VRlaLcQwB0Ca2AVGHA5QV0+V+2+6qh/IcZl630feBueGQX59qLQkOurD4e/9lm Na6ZkLPTh9UipIfTu9fvA5HY5lPt2VcSWwG2nLluckfJIpKNFVQEB7vuk9zd7468 qkXmj/J1YDdJzt2YgD0WZuKu3f1/No7rXbNmT2Oj0+HNWWvIU9xFNFlIPAxo7pJy kwekd9+gHI0n1OhLRjzYUyf0pD+j0o75ZHsYYsUW0y6cGoWX/LmQ8JPFi+waHiov FOe8FJz/uDtfQnJ4+izAM5Jjbu1LE+L8uGoIExYAv4DuXgPZtI2wtHvP4HHM3Aov mDWLesMgizsroViv57aXC0C1ZPksPpGeHT+HcH7RnDQ0kQmpe3E= =2XGW -----END PGP SIGNATURE----- Merge 4.19.102 into android-4.19 Changes in 4.19.102 vfs: fix do_last() regression x86/resctrl: Fix use-after-free when deleting resource groups x86/resctrl: Fix use-after-free due to inaccurate refcount of rdtgroup x86/resctrl: Fix a deadlock due to inaccurate reference crypto: pcrypt - Fix user-after-free on module unload rsi: add hci detach for hibernation and poweroff rsi: fix use-after-free on failed probe and unbind perf c2c: Fix return type for histogram sorting comparision functions PM / devfreq: Add new name attribute for sysfs tools lib: Fix builds when glibc contains strlcpy() arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean' ext4: validate the debug_want_extra_isize mount option at parse time mm/mempolicy.c: fix out of bounds write in mpol_parse_str() reiserfs: Fix memory leak of journal device string media: digitv: don't continue if remote control state can't be read media: af9005: uninitialized variable printked media: vp7045: do not read uninitialized values if usb transfer fails media: gspca: zero usb_buf media: dvb-usb/dvb-usb-urb.c: initialize actlen to 0 tomoyo: Use atomic_t for statistics counter ttyprintk: fix a potential deadlock in interrupt context issue Bluetooth: Fix race condition in hci_release_sock() cgroup: Prevent double killing of css when enabling threaded cgroup media: si470x-i2c: Move free() past last use of 'radio' ARM: dts: sun8i: a83t: Correct USB3503 GPIOs polarity ARM: dts: am57xx-beagle-x15/am57xx-idk: Remove "gpios" for endpoint dt nodes ARM: dts: beagle-x15-common: Model 5V0 regulator soc: ti: wkup_m3_ipc: Fix race condition with rproc_boot tools lib traceevent: Fix memory leakage in filter_event rseq: Unregister rseq for clone CLONE_VM clk: sunxi-ng: h6-r: Fix AR100/R_APB2 parent order mac80211: mesh: restrict airtime metric to peered established plinks clk: mmp2: Fix the order of timer mux parents ASoC: rt5640: Fix NULL dereference on module unload ixgbevf: Remove limit of 10 entries for unicast filter list ixgbe: Fix calculation of queue with VFs and flow director on interface flap igb: Fix SGMII SFP module discovery for 100FX/LX. platform/x86: GPD pocket fan: Allow somewhat lower/higher temperature limits ASoC: sti: fix possible sleep-in-atomic qmi_wwan: Add support for Quectel RM500Q parisc: Use proper printk format for resource_size_t wireless: fix enabling channel 12 for custom regulatory domain cfg80211: Fix radar event during another phy CAC mac80211: Fix TKIP replay protection immediately after key setup wireless: wext: avoid gcc -O3 warning netfilter: nft_tunnel: ERSPAN_VERSION must not be null net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec bnxt_en: Fix ipv6 RFS filter matching logic. riscv: delete temporary files iwlwifi: Don't ignore the cap field upon mcc update ARM: dts: am335x-boneblack-common: fix memory size vti[6]: fix packet tx through bpf_redirect() xfrm interface: fix packet tx through bpf_redirect() xfrm: interface: do not confirm neighbor when do pmtu update scsi: fnic: do not queue commands during fwreset ARM: 8955/1: virt: Relax arch timer version check during early boot tee: optee: Fix compilation issue with nommu airo: Fix possible info leak in AIROOLDIOCTL/SIOCDEVPRIVATE airo: Add missing CAP_NET_ADMIN check in AIROOLDIOCTL/SIOCDEVPRIVATE r8152: get default setting of WOL before initializing ARM: dts: am43x-epos-evm: set data pin directions for spi0 and spi1 qlcnic: Fix CPU soft lockup while collecting firmware dump powerpc/fsl/dts: add fsl,erratum-a011043 net/fsl: treat fsl,erratum-a011043 net: fsl/fman: rename IF_MODE_XGMII to IF_MODE_10G seq_tab_next() should increase position index l2t_seq_next should increase position index net: Fix skb->csum update in inet_proto_csum_replace16(). btrfs: do not zero f_bavail if we have available space perf report: Fix no libunwind compiled warning break s390 issue mm/migrate.c: also overwrite error when it is bigger than zero Linux 4.19.102 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ia9b63c7932b66f469ab0e88467e1e07741408f0b	2020-02-05 19:20:26 +00:00
Michal Koutný	6d26630912	cgroup: Prevent double killing of css when enabling threaded cgroup commit 3bc0bb36fa30e95ca829e9cf480e1ef7f7638333 upstream. The test_cgcore_no_internal_process_constraint_on_threads selftest when running with subsystem controlling noise triggers two warnings: > [ 597.443115] WARNING: CPU: 1 PID: 28167 at kernel/cgroup/cgroup.c:3131 cgroup_apply_control_enable+0xe0/0x3f0 > [ 597.443413] WARNING: CPU: 1 PID: 28167 at kernel/cgroup/cgroup.c:3177 cgroup_apply_control_disable+0xa6/0x160 Both stem from a call to cgroup_type_write. The first warning was also triggered by syzkaller. When we're switching cgroup to threaded mode shortly after a subsystem was disabled on it, we can see the respective subsystem css dying there. The warning in cgroup_apply_control_enable is harmless in this case since we're not adding new subsys anyway. The warning in cgroup_apply_control_disable indicates an attempt to kill css of recently disabled subsystem repeatedly. The commit prevents these situations by making cgroup_type_write wait for all dying csses to go away before re-applying subtree controls. When at it, the locations of WARN_ON_ONCE calls are moved so that warning is triggered only when we are about to misuse the dying css. Reported-by: syzbot+5493b2a54d31d6aea629@syzkaller.appspotmail.com Reported-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-05 14:43:39 +00:00
Greg Kroah-Hartman	1b44c9bd91	This is the 4.19.101 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl41RsgACgkQONu9yGCS aT4P7A/+PZVt4c6phHZ9tj0OV4TjAWfu3IX9nLypzyBxjmBeJu8yt1pkNrfKj6fT +N3MjDlmAYss5CV6SOACPWXdhAQF3SsM6PR+CSrzwpS3+iAVTqNTaHpZqJFBgr3R cDe+MksbMLDpw3x+hXWV1E6WKcJZZJVeANuaD09HQDRVqKw1hRGxGEdyPChEjT71 Ml3o9a2TYzOvRClBtBHPRQNy/MP4cVv06kS7jefDNh1z9PMsD2w01W54ur44WFJb aujt6bLyJlcs0cPdSkU7D8pmgzs/0cxW8N+4gCpfW66P6FJL8SU4RDTujUARlyvC oP5d62XrARXAO0hh1NYdWyUqpQjOFJRTWfEqW+lFGo5s9yL9oPW8vcCBKBuZfg+u HlVCCTCyU/IJN0DMeqdneThDg8sxirlzHu/NllgGIf7rhyMRqRmruQZXc1W3/7e8 UgqqAEFkgVmJgq3mVWlHsV5Fmgb+PQlqj4rSB05wlAbXsQwF0nbSS/lsvwDR8qqE 8nO/PQoxpQyAOYJ+iyaCsq51IsJUCwWOto8L/RpdYSbFpLTn+BRmNdDr7jHOVnPl FshugoXijE6IrVGIJhJBGGy/E+eG8Dru7IZEsi2UZLsw+bBvucqv7raIHAJ2YRaL 8ZuwwmvpZpCOdYSWa7lIgqZb0qOTyR+b6UQ57X8hS5U3MZ2jMOE= =+bpt -----END PGP SIGNATURE----- Merge 4.19.101 into android-4.19 Changes in 4.19.101 orinoco_usb: fix interface sanity check rsi_91x_usb: fix interface sanity check usb: dwc3: pci: add ID for the Intel Comet Lake -V variant USB: serial: ir-usb: add missing endpoint sanity check USB: serial: ir-usb: fix link-speed handling USB: serial: ir-usb: fix IrLAP framing usb: dwc3: turn off VBUS when leaving host mode staging: most: net: fix buffer overflow staging: wlan-ng: ensure error return is actually returned staging: vt6656: correct packet types for CTS protect, mode. staging: vt6656: use NULLFUCTION stack on mac80211 staging: vt6656: Fix false Tx excessive retries reporting. serial: 8250_bcm2835aux: Fix line mismatch on driver unbind component: do not dereference opaque pointer in debugfs mei: me: add comet point (lake) H device ids iio: st_gyro: Correct data for LSM9DS0 gyro crypto: chelsio - fix writing tfm flags to wrong place cifs: Fix memory allocation in __smb2_handle_cancelled_cmd() ath9k: fix storage endpoint lookup brcmfmac: fix interface sanity check rtl8xxxu: fix interface sanity check zd1211rw: fix storage endpoint lookup net_sched: ematch: reject invalid TCF_EM_SIMPLE net_sched: fix ops->bind_class() implementations HID: multitouch: Add LG MELF0410 I2C touchscreen support arc: eznps: fix allmodconfig kconfig warning HID: Add quirk for Xin-Mo Dual Controller HID: ite: Add USB id match for Acer SW5-012 keyboard dock HID: Add quirk for incorrect input length on Lenovo Y720 drivers/hid/hid-multitouch.c: fix a possible null pointer access. phy: qcom-qmp: Increase PHY ready timeout phy: cpcap-usb: Prevent USB line glitches from waking up modem watchdog: max77620_wdt: fix potential build errors watchdog: rn5t618_wdt: fix module aliases spi: spi-dw: Add lock protect dw_spi rx/tx to prevent concurrent calls drivers/net/b44: Change to non-atomic bit operations on pwol_mask net: wan: sdla: Fix cast from pointer to integer of different size gpio: max77620: Add missing dependency on GPIOLIB_IRQCHIP atm: eni: fix uninitialized variable warning HID: steam: Fix input device disappearing platform/x86: dell-laptop: disable kbd backlight on Inspiron 10xx PCI: Add DMA alias quirk for Intel VCA NTB iommu/amd: Support multiple PCI DMA aliases in IRQ Remapping ARM: OMAP2+: SmartReflex: add omap_sr_pdata definition usb-storage: Disable UAS on JMicron SATA enclosure sched/fair: Add tmp_alone_branch assertion sched/fair: Fix insertion in rq->leaf_cfs_rq_list rsi: fix use-after-free on probe errors rsi: fix memory leak on failed URB submission rsi: fix non-atomic allocation in completion handler crypto: af_alg - Use bh_lock_sock in sk_destruct random: try to actively add entropy rather than passively wait for it block: cleanup __blkdev_issue_discard() block: fix 32 bit overflow in __blkdev_issue_discard() KVM: arm64: Write arch.mdcr_el2 changes since last vcpu_load on VHE Linux 4.19.101 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I801cd8d04eea35b4b53957cc69c0987d88094992	2020-02-02 20:22:38 +00:00
Patrick Bellasi	8b2fbd9076	UPSTREAM: sched/fair/util_est: Implement faster ramp-up EWMA on utilization increases The estimated utilization for a task: util_est = max(util_avg, est.enqueue, est.ewma) is defined based on: - util_avg: the PELT defined utilization - est.enqueued: the util_avg at the end of the last activation - est.ewma: a exponential moving average on the est.enqueued samples According to this definition, when a task suddenly changes its bandwidth requirements from small to big, the EWMA will need to collect multiple samples before converging up to track the new big utilization. This slow convergence towards bigger utilization values is not aligned to the default scheduler behavior, which is to optimize for performance. Moreover, the est.ewma component fails to compensate for temporarely utilization drops which spans just few est.enqueued samples. To let util_est do a better job in the scenario depicted above, change its definition by making util_est directly follow upward motion and only decay the est.ewma on downward. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@matbug.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Douglas Raillard <douglas.raillard@arm.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <qperret@google.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191023205630.14469-1-patrick.bellasi@matbug.net Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit b8c96361402aa3e74ad48ceef18aed99153d8da8) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: I5c0bdd401f3fe599a2b7b9215c9a3a621f91002d	2020-02-01 17:35:00 +00:00
Qais Yousef	f503db1178	ANDROID: Re-use SUGOV_RT_MAX_FREQ to control uclamp rt behavior By default uclamp RT tasks will use the max frequency, which is not the desired default behavior on mobile devices. Re-use the SUGOV_RT_MAX_FREQ sched_feat to control the default behavior. When SUGOV_RT_MAX_FREQ is NOT selected, the uclamp_min value of the RT tasks will be 0. Note, since now we use SUGOV_RT_MAX_FREQ to enforce the default max frequency for RT when uclamp is compiled in; the condition in schedutil_cpu_util() needs to be inverted so that max no longer unconditionally applied when uclamp is compiled in && SUGOV_RT_MAX_FREQ is true. This unconditional application means uclamp values are always ignored which is not what we want when uclamp is compiled in. Bug: 120440300 Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I3d36f1ebed6ef35a6299af32bbf4462d0353e783 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 16:14:12 +00:00
Valentin Schneider	ecce1cf84a	BACKPORT: sched/fair: Make EAS wakeup placement consider uclamp restrictions task_fits_capacity() has just been made uclamp-aware, and find_energy_efficient_cpu() needs to go through the same treatment. Things are somewhat different here however - using the task max clamp isn't sufficient. Consider the following setup: The target runqueue, rq: rq.cpu_capacity_orig = 512 rq.cfs.avg.util_avg = 200 rq.uclamp.max = 768 // the max p.uclamp.max of all enqueued p's is 768 The waking task, p (not yet enqueued on rq): p.util_est = 600 p.uclamp.max = 100 Now, consider the following code which doesn't use the rq clamps: util = uclamp_task_util(p); // Does the task fit in the spare CPU capacity? cpu = cpu_of(rq); fits_capacity(util, cpu_capacity(cpu) - cpu_util(cpu)) This would lead to: util = 100; fits_capacity(100, 512 - 200) fits_capacity() would return true. However, enqueuing p on that CPU will cause it to become overutilized since rq clamp values are max-aggregated, so we'd remain with rq.uclamp.max = 768 which comes from the other tasks already enqueued on rq. Thus, we could select a high enough frequency to reach beyond 0.8 * 512 utilization (== overutilized) after enqueuing p on rq. What find_energy_efficient_cpu() needs here is uclamp_rq_util_with() which lets us peek at the future utilization landscape, including rq-wide uclamp values. Make find_energy_efficient_cpu() use uclamp_rq_util_with() for its fits_capacity() check. This is in line with what compute_energy() ends up using for estimating utilization. [QP: moved changes to select_cpu_candidates(), which is the equivalent to the mainline path, and fix missing dependency on fits_capacity() by using the open coded version] Bug: 120440300 Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com> Suggested-by: Quentin Perret <qperret@google.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191211113851.24241-6-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 1d42509e475cdc8542aa5b3e03a7e845244f4f57) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: Ibe1643cd5e6c97daceceae9733344e54bf4a4857	2020-02-01 16:14:11 +00:00
Valentin Schneider	50262f741b	BACKPORT: sched/fair: Make task_fits_capacity() consider uclamp restrictions task_fits_capacity() drives CPU selection at wakeup time, and is also used to detect misfit tasks. Right now it does so by comparing task_util_est() with a CPU's capacity, but doesn't take into account uclamp restrictions. There's a few interesting uses that can come out of doing this. For instance, a low uclamp.max value could prevent certain tasks from being flagged as misfit tasks, so they could merrily remain on low-capacity CPUs. Similarly, a high uclamp.min value would steer tasks towards high capacity CPUs at wakeup (and, should that fail, later steered via misfit balancing), so such "boosted" tasks would favor CPUs of higher capacity. Introduce uclamp_task_util() and make task_fits_capacity() use it. [QP: fixed missing dependency on fits_capacity() by using the open coded alternative] Bug: 120440300 Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Quentin Perret <qperret@google.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191211113851.24241-5-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit a7008c07a568278ed2763436404752a98004c7ff) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: Iabde2eda7252c3bcc273e61260a7a12a7de991b1	2020-02-01 16:14:11 +00:00
Patrick Bellasi	f609a2239f	ANDROID: sched/core: Move SchedTune task API into UtilClamp wrappers The main SchedTune API calls realted to task tuning attributes are now wrapped by more generic and mainlinish UtilClamp calls. The new APIs are: - uclamp_task(p) <= boosted_task_util(p) - uclamp_boosted(p) <= schedtune_task_boost(p) > 0 - uclamp_latency_sensitive(p) <= schedtune_prefer_idle(p) Let's provide also an implementation of the same API based on the new uclamp.uclamp_latency_sensitive flag. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> [Modified the patch to use uclamp.latency_sensitive instead mainline attributes] Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ib1a6902e1c07a82a370e36bf1776d895b7528cbc Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 16:14:11 +00:00
Quentin Perret	752b47b84d	ANDROID: sched/core: Add a latency-sensitive flag to uclamp Add a 'latency_sensitive' flag to uclamp in order to express the need for some tasks to find a CPU where they can wake-up quickly. This is not expected to be used without cgroup support, so add solely a cgroup interface for it. As this flag represents a boolean attribute and not an amount of resources to be shared, it is not clear what the delegation logic should be. As such, it is kept simple: every new cgroup starts with latency_sensitive set to false, regardless of the parent. In essence, this is similar to SchedTune's prefer-idle flag which was used in android-4.19 and prior. Bug: 120440300 Change-Id: I722d8ecabb428bb7b95a5b54bc70a87f182dde2a Signed-off-by: Quentin Perret <quentin.perret@arm.com> (cherry picked from commit ad7dd648fc7dbe11f23673a3463af2468a274998) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:17 +00:00
Patrick Bellasi	9a05300da0	ANDROID: sched/tune: Move SchedTune cpu API into UtilClamp wrappers The SchedTune CPU boosting API is currently used from sugov_get_util() to get the boosted utilization and to pass it into schedutil_cpu_util(). When UtilClamp is in use instead we call schedutil_cpu_util() by passing in just the CFS utilization and the clamping is done internally on the aggregated CFS+RT utilization for FREQUENCY_UTIL calls. This asymmetry is not required moreover, schedutil code is polluted by non-mainline SchedTune code. Wrap SchedTune API call related to cpu utilization boosting with a more generic and mainlinish UtilClamp call: - uclamp_rq_util_with(cpu, util, p) <= boosted_cpu_util(cpu) This new API is already used in schedutil_cpu_util() to clamp the aggregated RT+CFS utilization on FREQUENCY_UTIL calls. Move the cpu boosting into uclamp_rq_util_with() so that we remove any SchedTune specific bit from kernel/sched/cpufreq_schedutil.c. Get rid of the no more required boosted_cpu_util(cpu) method and replace it with a stune_util(cpu, util) which signature is better aligned with its uclamp_rq_util_with(cpu, util, p) counterpart. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I45b0f0f54123fe0a2515fa9f1683842e6b99234f [Removed superfluous __maybe_unused for capacity_orig_of] Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:17 +00:00
Li Guanglei	7e1c333ed1	FROMGIT: sched/core: Fix size of rq::uclamp initialization rq::uclamp is an array of struct uclamp_rq, make sure we clear the whole thing. Bug: 120440300 Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcountinga") Signed-off-by: Li Guanglei <guanglei.li@unisoc.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Qais Yousef <qais.yousef@arm.com> Link: https://lkml.kernel.org/r/1577259844-12677-1-git-send-email-guangleix.li@gmail.com (cherry picked from commit dcd6dffb0a75741471297724640733fa4e958d72 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Id36a2b77c45e586535e8fadfb7d66868ca8fe8c7 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Qais Yousef	45b9d34bec	FROMGIT: sched/uclamp: Fix a bug in propagating uclamp value in new cgroups When a new cgroup is created, the effective uclamp value wasn't updated with a call to cpu_util_update_eff() that looks at the hierarchy and update to the most restrictive values. Fix it by ensuring to call cpu_util_update_eff() when a new cgroup becomes online. Without this change, the newly created cgroup uses the default root_task_group uclamp values, which is 1024 for both uclamp_{min, max}, which will cause the rq to to be clamped to max, hence cause the system to run at max frequency. The problem was observed on Ubuntu server and was reproduced on Debian and Buildroot rootfs. By default, Ubuntu and Debian create a cpu controller cgroup hierarchy and add all tasks to it - which creates enough noise to keep the rq uclamp value at max most of the time. Imitating this behavior makes the problem visible in Buildroot too which otherwise looks fine since it's a minimal userspace. Bug: 120440300 Fixes: 0b60ba2dd342 ("sched/uclamp: Propagate parent clamps") Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Doug Smythies <dsmythies@telus.net> Link: https://lore.kernel.org/lkml/000701d5b965$361b6c60$a2524520$@net/ (cherry picked from commit 7226017ad37a888915628e59a84a2d1e57b40707 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I9636c60e04d58bbfc5041df1305b34a12b5a3f46 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Valentin Schneider	f59dfad8f9	FROMGIT: sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with() The current helper returns (CPU) rq utilization with uclamp restrictions taken into account. A uclamp task utilization helper would be quite helpful, but this requires some renaming. Prepare the code for the introduction of a uclamp_task_util() by renaming the existing uclamp_util_with() to uclamp_rq_util_with(). Bug: 120440300 Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Quentin Perret <qperret@google.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191211113851.24241-4-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit d2b58a286e89824900d501db0be1d4f6aed474fc https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I3e7146b788e079e400167203df5e5dadee2fd232 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Valentin Schneider	254e090f3a	FROMGIT: sched/uclamp: Make uclamp util helpers use and return UL values Vincent pointed out recently that the canonical type for utilization values is 'unsigned long'. Internally uclamp uses 'unsigned int' values for cache optimization, but this doesn't have to be exported to its users. Make the uclamp helpers that deal with utilization use and return unsigned long values. Bug: 120440300 Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Quentin Perret <qperret@google.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191211113851.24241-3-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 686516b55e98edf18c2a02d36aaaa6f4c0f6c39c https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Id3837f12237e5b77eb3a236bd32457dcd7de743e Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Valentin Schneider	6477d90135	FROMGIT: sched/uclamp: Remove uclamp_util() The sole user of uclamp_util(), schedutil_cpu_util(), was made to use uclamp_util_with() instead in commit: af24bde8df20 ("sched/uclamp: Add uclamp support to energy_compute()") From then on, uclamp_util() has remained unused. Being a simple wrapper around uclamp_util_with(), we can get rid of it and win back a few lines. Bug: 120440300 Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com> Suggested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191211113851.24241-2-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 59fe675248ffc37d4167e9ec6920a2f3d5ec67bb https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I11dbff80c6c4be9666438800b2527aca8cd24025 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Qais Yousef	cdadd91444	BACKPORT: sched/rt: Make RT capacity-aware Capacity Awareness refers to the fact that on heterogeneous systems (like Arm big.LITTLE), the capacity of the CPUs is not uniform, hence when placing tasks we need to be aware of this difference of CPU capacities. In such scenarios we want to ensure that the selected CPU has enough capacity to meet the requirement of the running task. Enough capacity means here that capacity_orig_of(cpu) >= task.requirement. The definition of task.requirement is dependent on the scheduling class. For CFS, utilization is used to select a CPU that has >= capacity value than the cfs_task.util. capacity_orig_of(cpu) >= cfs_task.util DL isn't capacity aware at the moment but can make use of the bandwidth reservation to implement that in a similar manner CFS uses utilization. The following patchset implements that: https://lore.kernel.org/lkml/20190506044836.2914-1-luca.abeni@santannapisa.it/ capacity_orig_of(cpu)/SCHED_CAPACITY >= dl_deadline/dl_runtime For RT we don't have a per task utilization signal and we lack any information in general about what performance requirement the RT task needs. But with the introduction of uclamp, RT tasks can now control that by setting uclamp_min to guarantee a minimum performance point. ATM the uclamp value are only used for frequency selection; but on heterogeneous systems this is not enough and we need to ensure that the capacity of the CPU is >= uclamp_min. Which is what implemented here. capacity_orig_of(cpu) >= rt_task.uclamp_min Note that by default uclamp.min is 1024, which means that RT tasks will always be biased towards the big CPUs, which make for a better more predictable behavior for the default case. Must stress that the bias acts as a hint rather than a definite placement strategy. For example, if all big cores are busy executing other RT tasks we can't guarantee that a new RT task will be placed there. On non-heterogeneous systems the original behavior of RT should be retained. Similarly if uclamp is not selected in the config. [ mingo: Minor edits to comments. ] Bug: 120440300 Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191009104611.15363-1-qais.yousef@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 804d402fb6f6487b825aae8cf42fda6426c62867 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git) [Qais: resolved minor conflict in kernel/sched/cpupri.c] Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ifc9da1c47de1aec9b4d87be2614e4c8968366900 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Valentin Schneider	ea9ce42997	UPSTREAM: sched/uclamp: Fix overzealous type replacement Some uclamp helpers had their return type changed from 'unsigned int' to 'enum uclamp_id' by commit 0413d7f33e60 ("sched/uclamp: Always use 'enum uclamp_id' for clamp_id values") but it happens that some do return a value in the [0, SCHED_CAPACITY_SCALE] range, which should really be unsigned int. The affected helpers are uclamp_none(), uclamp_rq_max_value() and uclamp_eff_value(). Fix those up. Note that this doesn't lead to any obj diff using a relatively recent aarch64 compiler (8.3-2019.03). The current code of e.g. uclamp_eff_value() properly returns an 11 bit value (bits_per(1024)) and doesn't seem to do anything funny. I'm still marking this as fixing the above commit to be on the safe side. Bug: 120440300 Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Qais Yousef <qais.yousef@arm.com> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar.Eggemann@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: patrick.bellasi@matbug.net Cc: qperret@google.com Cc: surenb@google.com Cc: tj@kernel.org Fixes: 0413d7f33e60 ("sched/uclamp: Always use 'enum uclamp_id' for clamp_id values") Link: https://lkml.kernel.org/r/20191115103908.27610-1-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 7763baace1b738d65efa46d68326c9406311c6bf) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I924a99c125372a8fca81cb4bc0c82e6a7183fc8a Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:15 +00:00
Qais Yousef	7125c7cfca	UPSTREAM: sched/uclamp: Fix incorrect condition uclamp_update_active() should perform the update when p->uclamp[clamp_id].active is true. But when the logic was inverted in [1], the if condition wasn't inverted correctly too. [1] https://lore.kernel.org/lkml/20190902073836.GO2369@hirez.programming.kicks-ass.net/ Bug: 120440300 Reported-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Ben Segall <bsegall@google.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Patrick Bellasi <patrick.bellasi@matbug.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: babbe170e053 ("sched/uclamp: Update CPU's refcount on TG's clamp changes") Link: https://lkml.kernel.org/r/20191114211052.15116-1-qais.yousef@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 6e1ff0773f49c7d38e8b4a9df598def6afb9f415) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I51b58a6089290277e08a0aaa72b86f852eec1512 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:15 +00:00
Qais Yousef	64bf81cac2	UPSTREAM: sched/core: Fix compilation error when cgroup not selected When cgroup is disabled the following compilation error was hit kernel/sched/core.c: In function ‘uclamp_update_active_tasks’: kernel/sched/core.c:1081:23: error: storage size of ‘it’ isn’t known struct css_task_iter it; ^~ kernel/sched/core.c:1084:2: error: implicit declaration of function ‘css_task_iter_start’; did you mean ‘__sg_page_iter_start’? [-Werror=implicit-function-declaration] css_task_iter_start(css, 0, &it); ^~~~~~~~~~~~~~~~~~~ __sg_page_iter_start kernel/sched/core.c:1085:14: error: implicit declaration of function ‘css_task_iter_next’; did you mean ‘__sg_page_iter_next’? [-Werror=implicit-function-declaration] while ((p = css_task_iter_next(&it))) { ^~~~~~~~~~~~~~~~~~ __sg_page_iter_next kernel/sched/core.c:1091:2: error: implicit declaration of function ‘css_task_iter_end’; did you mean ‘get_task_cred’? [-Werror=implicit-function-declaration] css_task_iter_end(&it); ^~~~~~~~~~~~~~~~~ get_task_cred kernel/sched/core.c:1081:23: warning: unused variable ‘it’ [-Wunused-variable] struct css_task_iter it; ^~ cc1: some warnings being treated as errors make[2]: *** [kernel/sched/core.o] Error 1 Fix by protetion uclamp_update_active_tasks() with CONFIG_UCLAMP_TASK_GROUP Bug: 120440300 Fixes: babbe170e053 ("sched/uclamp: Update CPU's refcount on TG's clamp changes") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Patrick Bellasi <patrick.bellasi@matbug.net> Cc: Mel Gorman <mgorman@suse.de> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Ben Segall <bsegall@google.com> Link: https://lkml.kernel.org/r/20191105112212.596-1-qais.yousef@arm.com (cherry picked from commit e3b8b6a0d12cccf772113d6b5c1875192186fbd4) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ia4c0f801d68050526f9f117ec9189e448b01345a Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:15 +00:00
Ingo Molnar	7f682d7abc	UPSTREAM: sched/core: Fix uclamp ABI bug, clean up and robustify sched_read_attr() ABI logic and code Thadeu Lima de Souza Cascardo reported that 'chrt' broke on recent kernels: $ chrt -p $$ chrt: failed to get pid 26306's policy: Argument list too long and he has root-caused the bug to the following commit increasing sched_attr size and breaking sched_read_attr() into returning -EFBIG: a509a7cd7974 ("sched/uclamp: Extend sched_setattr() to support utilization clamping") The other, bigger bug is that the whole sched_getattr() and sched_read_attr() logic of checking non-zero bits in new ABI components is arguably broken, and pretty much any extension of the ABI will spuriously break the ABI. That's way too fragile. Instead implement the perf syscall's extensible ABI instead, which we already implement on the sched_setattr() side: - if user-attributes have the same size as kernel attributes then the logic is unchanged. - if user-attributes are larger than the kernel knows about then simply skip the extra bits, but set attr->size to the (smaller) kernel size so that tooling can (in principle) handle older kernel as well. - if user-attributes are smaller than the kernel knows about then just copy whatever user-space can accept. Also clean up the whole logic: - Simplify the code flow - there's no need for 'ret' for example. - Standardize on 'kattr/uattr' and 'ksize/usize' naming to make sure we always know which side we are dealing with. - Why is it called 'read' when what it does is to copy to user? This code is so far away from VFS read() semantics that the naming is actively confusing. Name it sched_attr_copy_to_user() instead, which mirrors other copy_to_user() functionality. - Move the attr->size assignment from the head of sched_getattr() to the sched_attr_copy_to_user() function. Nothing else within the kernel should care about the size of the structure. With these fixes the sched_getattr() syscall now nicely supports an extensible ABI in both a forward and backward compatible fashion, and will also fix the chrt bug. As an added bonus the bogus -EFBIG return is removed as well, which as Thadeu noted should have been -E2BIG to begin with. Bug: 120440300 Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Patrick Bellasi <patrick.bellasi@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: a509a7cd7974 ("sched/uclamp: Extend sched_setattr() to support utilization clamping") Link: https://lkml.kernel.org/r/20190904075532.GA26751@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 1251201c0d34fadf69d56efa675c2b7dd0a90eca) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I67e653c4f69db0140e9651c125b60e2b8cfd62f1 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:15 +00:00
Patrick Bellasi	53a73b1f35	UPSTREAM: sched/uclamp: Always use 'enum uclamp_id' for clamp_id values The supported clamp indexes are defined in 'enum clamp_id', however, because of the code logic in some of the first utilization clamping series version, sometimes we needed to use 'unsigned int' to represent indices. This is not more required since the final version of the uclamp_* APIs can always use the proper enum uclamp_id type. Fix it with a bulk rename now that we have all the bits merged. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-7-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 0413d7f33e60751570fd6c179546bde2f7d82dcb) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I0be680b2489fa07244bac63b5c6fe1a79a53bef7 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:15 +00:00
Patrick Bellasi	d286ec414a	UPSTREAM: sched/uclamp: Update CPU's refcount on TG's clamp changes On updates of task group (TG) clamp values, ensure that these new values are enforced on all RUNNABLE tasks of the task group, i.e. all RUNNABLE tasks are immediately boosted and/or capped as requested. Do that each time we update effective clamps from cpu_util_update_eff(). Use the *cgroup_subsys_state (css) to walk the list of tasks in each affected TG and update their RUNNABLE tasks. Update each task by using the same mechanism used for cpu affinity masks updates, i.e. by taking the rq lock. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-6-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit babbe170e053c6ec2343751749995b7b9fd5fd2c) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I5e48891bd48c266dd282e1bab8f60533e4e29b48 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:15 +00:00
Patrick Bellasi	a1f3376922	UPSTREAM: sched/uclamp: Use TG's clamps to restrict TASK's clamps When a task specific clamp value is configured via sched_setattr(2), this value is accounted in the corresponding clamp bucket every time the task is {en,de}qeued. However, when cgroups are also in use, the task specific clamp values could be restricted by the task_group (TG) clamp values. Update uclamp_cpu_inc() to aggregate task and TG clamp values. Every time a task is enqueued, it's accounted in the clamp bucket tracking the smaller clamp between the task specific value and its TG effective value. This allows to: 1. ensure cgroup clamps are always used to restrict task specific requests, i.e. boosted not more than its TG effective protection and capped at least as its TG effective limit. 2. implement a "nice-like" policy, where tasks are still allowed to request less than what enforced by their TG effective limits and protections Do this by exploiting the concept of "effective" clamp, which is already used by a TG to track parent enforced restrictions. Apply task group clamp restrictions only to tasks belonging to a child group. While, for tasks in the root group or in an autogroup, system defaults are still enforced. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-5-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 3eac870a324728e5d17118888840dad70bcd37f3) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I0215e0a68cc0fa7c441e33052757f8571b7c99b9 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:15 +00:00
Patrick Bellasi	c4c03cf9bf	UPSTREAM: sched/uclamp: Propagate system defaults to the root group The clamp values are not tunable at the level of the root task group. That's for two main reasons: - the root group represents "system resources" which are always entirely available from the cgroup standpoint. - when tuning/restricting "system resources" makes sense, tuning must be done using a system wide API which should also be available when control groups are not. When a system wide restriction is available, cgroups should be aware of its value in order to know exactly how much "system resources" are available for the subgroups. Utilization clamping supports already the concepts of: - system defaults: which define the maximum possible clamp values usable by tasks. - effective clamps: which allows a parent cgroup to constraint (maybe temporarily) its descendants without losing the information related to the values "requested" from them. Exploit these two concepts and bind them together in such a way that, whenever system default are tuned, the new values are propagated to (possibly) restrict or relax the "effective" value of nested cgroups. When cgroups are in use, force an update of all the RUNNABLE tasks. Otherwise, keep things simple and do just a lazy update next time each task will be enqueued. Do that since we assume a more strict resource control is required when cgroups are in use. This allows also to keep "effective" clamp values updated in case we need to expose them to user-space. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-4-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 7274a5c1bbec45f06f1fff4b8c8b5855b6cc189d) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ibf7ce5c46b67c79765b56b792ee22ed9595802c3 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:14 +00:00
Patrick Bellasi	77a413e758	UPSTREAM: sched/uclamp: Propagate parent clamps In order to properly support hierarchical resources control, the cgroup delegation model requires that attribute writes from a child group never fail but still are locally consistent and constrained based on parent's assigned resources. This requires to properly propagate and aggregate parent attributes down to its descendants. Implement this mechanism by adding a new "effective" clamp value for each task group. The effective clamp value is defined as the smaller value between the clamp value of a group and the effective clamp value of its parent. This is the actual clamp value enforced on tasks in a task group. Since it's possible for a cpu.uclamp.min value to be bigger than the cpu.uclamp.max value, ensure local consistency by restricting each "protection" (i.e. min utilization) with the corresponding "limit" (i.e. max utilization). Do that at effective clamps propagation to ensure all user-space write never fails while still always tracking the most restrictive values. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-3-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 0b60ba2dd342016e4e717dbaa4ca9af3a43f4434) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: If1cc136e1fb4a8f4c6ea15dc440b28d833a8d7e7 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:14 +00:00
Patrick Bellasi	19718921c3	UPSTREAM: sched/uclamp: Extend CPU's cgroup controller The cgroup CPU bandwidth controller allows to assign a specified (maximum) bandwidth to the tasks of a group. However this bandwidth is defined and enforced only on a temporal base, without considering the actual frequency a CPU is running on. Thus, the amount of computation completed by a task within an allocated bandwidth can be very different depending on the actual frequency the CPU is running that task. The amount of computation can be affected also by the specific CPU a task is running on, especially when running on asymmetric capacity systems like Arm's big.LITTLE. With the availability of schedutil, the scheduler is now able to drive frequency selections based on actual task utilization. Moreover, the utilization clamping support provides a mechanism to bias the frequency selection operated by schedutil depending on constraints assigned to the tasks currently RUNNABLE on a CPU. Giving the mechanisms described above, it is now possible to extend the cpu controller to specify the minimum (or maximum) utilization which should be considered for tasks RUNNABLE on a cpu. This makes it possible to better defined the actual computational power assigned to task groups, thus improving the cgroup CPU bandwidth controller which is currently based just on time constraints. Extend the CPU controller with a couple of new attributes uclamp.{min,max} which allow to enforce utilization boosting and capping for all the tasks in a group. Specifically: - uclamp.min: defines the minimum utilization which should be considered i.e. the RUNNABLE tasks of this group will run at least at a minimum frequency which corresponds to the uclamp.min utilization - uclamp.max: defines the maximum utilization which should be considered i.e. the RUNNABLE tasks of this group will run up to a maximum frequency which corresponds to the uclamp.max utilization These attributes: a) are available only for non-root nodes, both on default and legacy hierarchies, while system wide clamps are defined by a generic interface which does not depends on cgroups. This system wide interface enforces constraints on tasks in the root node. b) enforce effective constraints at each level of the hierarchy which are a restriction of the group requests considering its parent's effective constraints. Root group effective constraints are defined by the system wide interface. This mechanism allows each (non-root) level of the hierarchy to: - request whatever clamp values it would like to get - effectively get only up to the maximum amount allowed by its parent c) have higher priority than task-specific clamps, defined via sched_setattr(), thus allowing to control and restrict task requests. Add two new attributes to the cpu controller to collect "requested" clamp values. Allow that at each non-root level of the hierarchy. Keep it simple by not caring now about "effective" values computation and propagation along the hierarchy. Update sysctl_sched_uclamp_handler() to use the newly introduced uclamp_mutex so that we serialize system default updates with cgroup relate updates. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-2-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 2480c093130f64ac3a410504fa8b3db1fc4b87ce) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I0285c44910bf073b80d7996361e6698bc5aedfae Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:14 +00:00
Patrick Bellasi	9a843ff48d	BACKPORT: sched/uclamp: Add uclamp support to energy_compute() The Energy Aware Scheduler (EAS) estimates the energy impact of waking up a task on a given CPU. This estimation is based on: a) an (active) power consumption defined for each CPU frequency b) an estimation of which frequency will be used on each CPU c) an estimation of the busy time (utilization) of each CPU Utilization clamping can affect both b) and c). A CPU is expected to run: - on an higher than required frequency, but for a shorter time, in case its estimated utilization will be smaller than the minimum utilization enforced by uclamp - on a smaller than required frequency, but for a longer time, in case its estimated utilization is bigger than the maximum utilization enforced by uclamp While compute_energy() already accounts clamping effects on busy time, the clamping effects on frequency selection are currently ignored. Fix it by considering how CPU clamp values will be affected by a task waking up and being RUNNABLE on that CPU. Do that by refactoring schedutil_freq_util() to take an additional task_struct* which allows EAS to evaluate the impact on clamp values of a task being eventually queued in a CPU. Clamp values are applied to the RT+CFS utilization only when a FREQUENCY_UTIL is required by compute_energy(). Do note that switching from ENERGY_UTIL to FREQUENCY_UTIL in the computation of the cpu_util signal implies that we are more likely to estimate the highest OPP when a RT task is running in another CPU of the same performance domain. This can have an impact on energy estimation but: - it's not easy to say which approach is better, since it depends on the use case - the original approach could still be obtained by setting a smaller task-specific util_min whenever required Since we are at that: - rename schedutil_freq_util() into schedutil_cpu_util(), since it's not only used for frequency selection. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-12-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit af24bde8df2029f067dc46aff0393c8f18ff6e2f) Signed-off-by: Qais Yousef <qais.yousef@arm.com> [Moved cpu_util_cfs() outside of CONFIG_CPU_FREQ_GOV_SCHEDUTIL] Change-Id: Idc4933f44be746ce35c1181a9288e6cb5d9607b2 [Protect cpu_util_cfs() with CONFIG_SMP ifdefery] Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:14 +00:00
Patrick Bellasi	814a151015	UPSTREAM: sched/uclamp: Add uclamp_util_with() So far uclamp_util() allows to clamp a specified utilization considering the clamp values requested by RUNNABLE tasks in a CPU. For the Energy Aware Scheduler (EAS) it is interesting to test how clamp values will change when a task is becoming RUNNABLE on a given CPU. For example, EAS is interested in comparing the energy impact of different scheduling decisions and the clamp values can play a role on that. Add uclamp_util_with() which allows to clamp a given utilization by considering the possible impact on CPU clamp values of a specified task. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-11-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 9d20ad7dfc9a5cc64e33d725902d3863d350a66a) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ida153a3526b87f5674a6e037d4725d99eec7b478 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:38 +00:00
Patrick Bellasi	61d44b22d9	BACKPORT: sched/cpufreq, sched/uclamp: Add clamps for FAIR and RT tasks Each time a frequency update is required via schedutil, a frequency is selected to (possibly) satisfy the utilization reported by each scheduling class and irqs. However, when utilization clamping is in use, the frequency selection should consider userspace utilization clamping hints. This will allow, for example, to: - boost tasks which are directly affecting the user experience by running them at least at a minimum "requested" frequency - cap low priority tasks not directly affecting the user experience by running them only up to a maximum "allowed" frequency These constraints are meant to support a per-task based tuning of the frequency selection thus supporting a fine grained definition of performance boosting vs energy saving strategies in kernel space. Add support to clamp the utilization of RUNNABLE FAIR and RT tasks within the boundaries defined by their aggregated utilization clamp constraints. Do that by considering the max(min_util, max_util) to give boosted tasks the performance they need even when they happen to be co-scheduled with other capped tasks. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-10-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 982d9cdc22c9f6df5ad790caa229ff74fb1d95e7) Conflicts: kernel/sched/cpufreq_schedutil.c 1. Merged the if condition to include the non-upstream sched_feat(SUGOV_RT_MAX_FREQ) check 2. Change the function signature to pass util_cfs and define util as an automatic variable. Bug: 120440300 Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ie222c9ad84776fc2948e30c116eee876df697a17 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:38 +00:00
Patrick Bellasi	fa167475e1	UPSTREAM: sched/uclamp: Set default clamps for RT tasks By default FAIR tasks start without clamps, i.e. neither boosted nor capped, and they run at the best frequency matching their utilization demand. This default behavior does not fit RT tasks which instead are expected to run at the maximum available frequency, if not otherwise required by explicitly capping them. Enforce the correct behavior for RT tasks by setting util_min to max whenever: 1. the task is switched to the RT class and it does not already have a user-defined clamp value assigned. 2. an RT task is forked from a parent with RESET_ON_FORK set. NOTE: utilization clamp values are cross scheduling class attributes and thus they are never changed/reset once a value has been explicitly defined from user-space. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-9-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 1a00d999971c78ab024a17b0efc37d78404dd120) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I81fcadaea34f557e531fa5ac6aab84fcb0ee37c7 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:38 +00:00
Patrick Bellasi	341f61099d	UPSTREAM: sched/uclamp: Reset uclamp values on RESET_ON_FORK A forked tasks gets the same clamp values of its parent however, when the RESET_ON_FORK flag is set on parent, e.g. via: sys_sched_setattr() sched_setattr() __sched_setscheduler(attr::SCHED_FLAG_RESET_ON_FORK) the new forked task is expected to start with all attributes reset to default values. Do that for utilization clamp values too by checking the reset request from the existing uclamp_fork() call which already provides the required initialization for other uclamp related bits. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-8-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit a87498ace58e23b62a572dc7267579ede4c8495c) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: If7bda202707aac3a2696a42f8146f607cdd36905 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:38 +00:00
Patrick Bellasi	e6056b2a5b	UPSTREAM: sched/uclamp: Extend sched_setattr() to support utilization clamping The SCHED_DEADLINE scheduling class provides an advanced and formal model to define tasks requirements that can translate into proper decisions for both task placements and frequencies selections. Other classes have a more simplified model based on the POSIX concept of priorities. Such a simple priority based model however does not allow to exploit most advanced features of the Linux scheduler like, for example, driving frequencies selection via the schedutil cpufreq governor. However, also for non SCHED_DEADLINE tasks, it's still interesting to define tasks properties to support scheduler decisions. Utilization clamping exposes to user-space a new set of per-task attributes the scheduler can use as hints about the expected/required utilization for a task. This allows to implement a "proactive" per-task frequency control policy, a more advanced policy than the current one based just on "passive" measured task utilization. For example, it's possible to boost interactive tasks (e.g. to get better performance) or cap background tasks (e.g. to be more energy/thermal efficient). Introduce a new API to set utilization clamping values for a specified task by extending sched_setattr(), a syscall which already allows to define task specific properties for different scheduling classes. A new pair of attributes allows to specify a minimum and maximum utilization the scheduler can consider for a task. Do that by validating the required clamp values before and then applying the required changes using _the_ same pattern already in use for __setscheduler(). This ensures that the task is re-enqueued with the new clamp values. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-7-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit a509a7cd79747074a2c018a45bbbc52d1f4aed44) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I420e7ece5628bc639811a79654c35135a65bfd02 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:37 +00:00
Patrick Bellasi	258e6b82dd	UPSTREAM: sched/core: Allow sched_setattr() to use the current policy The sched_setattr() syscall mandates that a policy is always specified. This requires to always know which policy a task will have when attributes are configured and this makes it impossible to add more generic task attributes valid across different scheduling policies. Reading the policy before setting generic tasks attributes is racy since we cannot be sure it is not changed concurrently. Introduce the required support to change generic task attributes without affecting the current task policy. This is done by adding an attribute flag (SCHED_FLAG_KEEP_POLICY) to enforce the usage of the current policy. Add support for the SETPARAM_POLICY policy, which is already used by the sched_setparam() POSIX syscall, to the sched_setattr() non-POSIX syscall. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-6-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 1d6362fa0cfc8c7b243fa92924429d826599e691) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I41cbe73d7aa30123adbd757fa30e346938651784 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:37 +00:00
Patrick Bellasi	613eecebf9	UPSTREAM: sched/uclamp: Add system default clamps Tasks without a user-defined clamp value are considered not clamped and by default their utilization can have any value in the [0..SCHED_CAPACITY_SCALE] range. Tasks with a user-defined clamp value are allowed to request any value in that range, and the required clamp is unconditionally enforced. However, a "System Management Software" could be interested in limiting the range of clamp values allowed for all tasks. Add a privileged interface to define a system default configuration via: /proc/sys/kernel/sched_uclamp_util_{min,max} which works as an unconditional clamp range restriction for all tasks. With the default configuration, the full SCHED_CAPACITY_SCALE range of values is allowed for each clamp index. Otherwise, the task-specific clamp is capped by the corresponding system default value. Do that by tracking, for each task, the "effective" clamp value and bucket the task has been refcounted in at enqueue time. This allows to lazy aggregate "requested" and "system default" values at enqueue time and simplifies refcounting updates at dequeue time. The cached bucket ids are used to avoid (relatively) more expensive integer divisions every time a task is enqueued. An active flag is used to report when the "effective" value is valid and thus the task is actually refcounted in the corresponding rq's bucket. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-5-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit e8f14172c6b11e9a86c65532497087f8eb0f91b1) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I4f014c5ec9c312aaad606518f6e205fd0cfbcaa2 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:37 +00:00
Patrick Bellasi	c659be787e	UPSTREAM: sched/uclamp: Enforce last task's UCLAMP_MAX When a task sleeps it removes its max utilization clamp from its CPU. However, the blocked utilization on that CPU can be higher than the max clamp value enforced while the task was running. This allows undesired CPU frequency increases while a CPU is idle, for example, when another CPU on the same frequency domain triggers a frequency update, since schedutil can now see the full not clamped blocked utilization of the idle CPU. Fix this by using: uclamp_rq_dec_id(p, rq, UCLAMP_MAX) uclamp_rq_max_value(rq, UCLAMP_MAX, clamp_value) to detect when a CPU has no more RUNNABLE clamped tasks and to flag this condition. Don't track any minimum utilization clamps since an idle CPU never requires a minimum frequency. The decay of the blocked utilization is good enough to reduce the CPU frequency. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-4-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit e496187da71070687b55ff455e7d8d7d7f0ae0b9) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ie9eab897eb654ec9d4fba5eda20f66a91a712817 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:37 +00:00
Patrick Bellasi	ad20939c13	UPSTREAM: sched/uclamp: Add bucket local max tracking Because of bucketization, different task-specific clamp values are tracked in the same bucket. For example, with 20% bucket size and assuming to have: Task1: util_min=25% Task2: util_min=35% both tasks will be refcounted in the [20..39]% bucket and always boosted only up to 20% thus implementing a simple floor aggregation normally used in histograms. In systems with only few and well-defined clamp values, it would be useful to track the exact clamp value required by a task whenever possible. For example, if a system requires only 23% and 47% boost values then it's possible to track the exact boost required by each task using only 3 buckets of ~33% size each. Introduce a mechanism to max aggregate the requested clamp values of RUNNABLE tasks in the same bucket. Keep it simple by resetting the bucket value to its base value only when a bucket becomes inactive. Allow a limited and controlled overboosting margin for tasks recounted in the same bucket. In systems where the boost values are not known in advance, it is still possible to control the maximum acceptable overboosting margin by tuning the number of clamp groups. For example, 20 groups ensure a 5% maximum overboost. Remove the rq bucket initialization code since a correct bucket value is now computed when a task is refcounted into a CPU's rq. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-3-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 60daf9c19410604f08c99e146bc378c8a64f4ccd) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I8782971f8867033cee5aaf981c96f9de33a5288c Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:37 +00:00
Patrick Bellasi	d96bd1d5fc	UPSTREAM: sched/uclamp: Add CPU's clamp buckets refcounting Utilization clamping allows to clamp the CPU's utilization within a [util_min, util_max] range, depending on the set of RUNNABLE tasks on that CPU. Each task references two "clamp buckets" defining its minimum and maximum (util_{min,max}) utilization "clamp values". A CPU's clamp bucket is active if there is at least one RUNNABLE tasks enqueued on that CPU and refcounting that bucket. When a task is {en,de}queued {on,from} a rq, the set of active clamp buckets on that CPU can change. If the set of active clamp buckets changes for a CPU a new "aggregated" clamp value is computed for that CPU. This is because each clamp bucket enforces a different utilization clamp value. Clamp values are always MAX aggregated for both util_min and util_max. This ensures that no task can affect the performance of other co-scheduled tasks which are more boosted (i.e. with higher util_min clamp) or less capped (i.e. with higher util_max clamp). A task has: task_struct::uclamp[clamp_id]::bucket_id to track the "bucket index" of the CPU's clamp bucket it refcounts while enqueued, for each clamp index (clamp_id). A runqueue has: rq::uclamp[clamp_id]::bucket[bucket_id].tasks to track how many RUNNABLE tasks on that CPU refcount each clamp bucket (bucket_id) of a clamp index (clamp_id). It also has a: rq::uclamp[clamp_id]::bucket[bucket_id].value to track the clamp value of each clamp bucket (bucket_id) of a clamp index (clamp_id). The rq::uclamp::bucket[clamp_id][] array is scanned every time it's needed to find a new MAX aggregated clamp value for a clamp_id. This operation is required only when it's dequeued the last task of a clamp bucket tracking the current MAX aggregated clamp value. In this case, the CPU is either entering IDLE or going to schedule a less boosted or more clamped task. The expected number of different clamp values configured at build time is small enough to fit the full unordered array into a single cache line, for configurations of up to 7 buckets. Add to struct rq the basic data structures required to refcount the number of RUNNABLE tasks for each clamp bucket. Add also the max aggregation required to update the rq's clamp value at each enqueue/dequeue event. Use a simple linear mapping of clamp values into clamp buckets. Pre-compute and cache bucket_id to avoid integer divisions at enqueue/dequeue time. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190621084217.8167-2-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 69842cba9ace84849bb9b8edcdf2cefccd97901c) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I2c2c23572fb82e004f815cc9c783881355df6836 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:37 +00:00
Tejun Heo	3703043afb	UPSTREAM: cgroup: add cgroup_parse_float() cgroup already uses floating point for percent[ile] numbers and there are several controllers which want to take them as input. Add a generic parse helper to handle inputs. Update the interface convention documentation about the use of percentage numbers. While at it, also clarify the default time unit. Bug: 120440300 Signed-off-by: Tejun Heo <tj@kernel.org> (cherry picked from commit a5e112e6424adb77d953eac20e6936b952fd6b32) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ic1fcf21d7955eb8edd2e8e91517bca6aef41694f Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 14:39:37 +00:00
Vincent Guittot	2d935df7b2	sched/fair: Fix insertion in rq->leaf_cfs_rq_list commit f6783319737f28e4436a69611853a5a098cbe974 upstream. Sargun reported a crash: "I picked up c40f7d74c741a907cfaeb73a7697081881c497d0 sched/fair: Fix infinite loop in update_blocked_averages() by reverting `a9e7f6544b` and put it on top of 4.19.13. In addition to this, I uninlined list_add_leaf_cfs_rq for debugging. This revealed a new bug that we didn't get to because we kept getting crashes from the previous issue. When we are running with cgroups that are rapidly changing, with CFS bandwidth control, and in addition using the cpusets cgroup, we see this crash. Specifically, it seems to occur with cgroups that are throttled and we change the allowed cpuset." The algorithm used to order cfs_rq in rq->leaf_cfs_rq_list assumes that it will walk down to root the 1st time a cfs_rq is used and we will finish to add either a cfs_rq without parent or a cfs_rq with a parent that is already on the list. But this is not always true in presence of throttling. Because a cfs_rq can be throttled even if it has never been used but other CPUs of the cgroup have already used all the bandwdith, we are not sure to go down to the root and add all cfs_rq in the list. Ensure that all cfs_rq will be added in the list even if they are throttled. [ mingo: Fix !CGROUPS build. ] Reported-by: Sargun Dhillon <sargun@sargun.me> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: tj@kernel.org Fixes: `9c2791f936` ("Fix hierarchical order in rq->leaf_cfs_rq_list") Link: https://lkml.kernel.org/r/1548825767-10799-1-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Janne Huttunen <janne.huttunen@nokia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-01 09:37:10 +00:00
Peter Zijlstra	6c11530ea4	sched/fair: Add tmp_alone_branch assertion commit 5d299eabea5a251fbf66e8277704b874bbba92dc upstream. The magic in list_add_leaf_cfs_rq() requires that at the end of enqueue_task_fair(): rq->tmp_alone_branch == &rq->lead_cfs_rq_list If this is violated, list integrity is compromised for list entries and the tmp_alone_branch pointer might dangle. Also, reflow list_add_leaf_cfs_rq() while there. This looses one indentation level and generates a form that's convenient for the next patch. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Janne Huttunen <janne.huttunen@nokia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-01 09:37:10 +00:00
Sami Tolvanen	4596eee0c8	ANDROID: kallsyms: strip hashes from function names with ThinLTO With CONFIG_THINLTO and CFI both enabled, LLVM appends a hash to the names of all static functions. This breaks userspace tools, so strip out the hash from output. Bug: 147422318 Change-Id: Ibea6be089d530e92dcd191481cb02549041203f6 Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2020-01-31 16:50:06 +00:00
Greg Kroah-Hartman	654c66e990	This is the 4.19.100 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4xqFEACgkQONu9yGCS aT4pWhAAnBOvHPDEBjQzrvrhQAEZkT421Plew1Z1E/RL0CgbYisuaUMmNhGppfAu MFt3TXt7sbQ9XfzfRGhiuY3jv2pOBiKlu3DdmKanLGjeCXGSPFPXf+UL/m4utD2F /XvtWTQwOakrghfJn93iF01nF6pSc7IIe7hBgptyc0C0TZXvPy7FC03JxCiepW/8 XEsXYbth6jTEaWwwFZf/QK9sYh7BThm/CmK8UsIdG1kZMW8I9jpAp+1m2DqCB4Je KACR4IEfWGEvipw8r0tCDjbSeo8LlKkbb3Kiz/yPZemECX/MEeN3ErGLQT+eut5a G6Bs5QJfgoYaq3/XjhRp3IQhM8OFEFe9Z0rm7mikRbKDmPp24f9cQ8OUVDEUSBWK zaTi5U7K5jEAI1/PNn2ZSKWMKya2AP5awX48jV2e6bHDo6AXK/JPr2omu2WqpT9f SbPa47cFmgR11oFWpCmLFG//sL5oB5djP1blAhnMxExVlzBpMOmJi4jCd461hTaA mQ+E5WDOMoRTxC2G+SoBoyYprsNK0PIPZilIs6hPz9kSJan7EXKeqfA5ZFjwY3mW tb+kS5XpllfWvajAkWskZpre2NITUV3ybIywm8pPaX2OSvC3zodaglHm76yj32Vk qkfS3psVnpQTKepip5a1y6pkB951jE4hx/zofmW3tMivQGOFSCU= =dyoL -----END PGP SIGNATURE----- Merge 4.19.100 into android-4.19 Changes in 4.19.100 can, slip: Protect tty->disc_data in write_wakeup and close with RCU firestream: fix memory leaks gtp: make sure only SOCK_DGRAM UDP sockets are accepted ipv6: sr: remove SKB_GSO_IPXIP6 on End.D* actions net: bcmgenet: Use netif_tx_napi_add() for TX NAPI net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM net: ip6_gre: fix moving ip6gre between namespaces net, ip6_tunnel: fix namespaces move net, ip_tunnel: fix namespaces move net: rtnetlink: validate IFLA_MTU attribute in rtnl_create_link() net_sched: fix datalen for ematch net-sysfs: Fix reference count leak in rx\|netdev_queue_add_kobject net-sysfs: fix netdev_queue_add_kobject() breakage net-sysfs: Call dev_hold always in netdev_queue_add_kobject net-sysfs: Call dev_hold always in rx_queue_add_kobject net-sysfs: Fix reference count leak net: usb: lan78xx: Add .ndo_features_check Revert "udp: do rmem bulk free even if the rx sk queue is empty" tcp_bbr: improve arithmetic division in bbr_update_bw() tcp: do not leave dangling pointers in tp->highest_sack tun: add mutex_unlock() call and napi.skb clearing in tun_get_user() afs: Fix characters allowed into cell names hwmon: (adt7475) Make volt2reg return same reg as reg2volt input hwmon: (core) Do not use device managed functions for memory allocations PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken tracing: trigger: Replace unneeded RCU-list traversals Input: keyspan-remote - fix control-message timeouts Revert "Input: synaptics-rmi4 - don't increment rmiaddr for SMBus transfers" ARM: 8950/1: ftrace/recordmcount: filter relocation types mmc: tegra: fix SDR50 tuning override mmc: sdhci: fix minimum clock rate for v3 controller Documentation: Document arm64 kpti control Input: pm8xxx-vib - fix handling of separate enable register Input: sur40 - fix interface sanity checks Input: gtco - fix endpoint sanity check Input: aiptek - fix endpoint sanity check Input: pegasus_notetaker - fix endpoint sanity check Input: sun4i-ts - add a check for devm_thermal_zone_of_sensor_register netfilter: nft_osf: add missing check for DREG attribute hwmon: (nct7802) Fix voltage limits to wrong registers scsi: RDMA/isert: Fix a recently introduced regression related to logout tracing: xen: Ordered comparison of function pointers do_last(): fetch directory ->i_mode and ->i_uid before it's too late net/sonic: Add mutual exclusion for accessing shared state net/sonic: Clear interrupt flags immediately net/sonic: Use MMIO accessors net/sonic: Fix interface error stats collection net/sonic: Fix receive buffer handling net/sonic: Avoid needless receive descriptor EOL flag updates net/sonic: Improve receive descriptor status flag check net/sonic: Fix receive buffer replenishment net/sonic: Quiesce SONIC before re-initializing descriptor memory net/sonic: Fix command register usage net/sonic: Fix CAM initialization net/sonic: Prevent tx watchdog timeout tracing: Use hist trigger's var_ref array to destroy var_refs tracing: Remove open-coding of hist trigger var_ref management tracing: Fix histogram code when expression has same var as value sd: Fix REQ_OP_ZONE_REPORT completion handling crypto: geode-aes - switch to skcipher for cbc(aes) fallback coresight: etb10: Do not call smp_processor_id from preemptible coresight: tmc-etf: Do not call smp_processor_id from preemptible libertas: Fix two buffer overflows at parsing bss descriptor media: v4l2-ioctl.c: zero reserved fields for S/TRY_FMT scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func netfilter: ipset: use bitmap infrastructure completely netfilter: nf_tables: add __nft_chain_type_get() net/x25: fix nonblocking connect mm/memory_hotplug: make remove_memory() take the device_hotplug_lock mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section() mm, sparse: pass nid instead of pgdat to sparse_add_one_section() drivers/base/memory.c: remove an unnecessary check on NR_MEM_SECTIONS mm, memory_hotplug: add nid parameter to arch_remove_memory mm/memory_hotplug: release memory resource after arch_remove_memory() drivers/base/memory.c: clean up relics in function parameters mm, memory_hotplug: update a comment in unregister_memory() mm/memory_hotplug: make unregister_memory_section() never fail mm/memory_hotplug: make __remove_section() never fail powerpc/mm: Fix section mismatch warning mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail s390x/mm: implement arch_remove_memory() mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE drivers/base/memory: pass a block_id to init_memory_block() mm/memory_hotplug: create memory block devices after arch_add_memory() mm/memory_hotplug: remove memory block devices before arch_remove_memory() mm/memory_hotplug: make unregister_memory_block_under_nodes() never fail mm/memory_hotplug: remove "zone" parameter from sparse_remove_one_section mm/hotplug: kill is_dev_zone() usage in __remove_pages() drivers/base/node.c: simplify unregister_memory_block_under_nodes() mm/memunmap: don't access uninitialized memmap in memunmap_pages() mm/memory_hotplug: fix try_offline_node() mm/memory_hotplug: shrink zones when offlining memory Linux 4.19.100 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I1664d6d4de9358bff5632c291a26e1401ec7b5f1	2020-01-29 17:10:45 +01:00
David Hildenbrand	86834898d5	mm/memory_hotplug: shrink zones when offlining memory commit feee6b2989165631b17ac6d4ccdbf6759254e85a upstream. -- snip -- - Missing arm64 hot(un)plug support - Missing some vmem_altmap_offset() cleanups - Missing sub-section hotadd support - Missing unification of mm/hmm.c and kernel/memremap.c -- snip -- We currently try to shrink a single zone when removing memory. We use the zone of the first page of the memory we are removing. If that memmap was never initialized (e.g., memory was never onlined), we will read garbage and can trigger kernel BUGs (due to a stale pointer): BUG: unable to handle page fault for address: 000000000000353d #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4 Workqueue: kacpi_hotplug acpi_hotplug_work_fn RIP: 0010:clear_zone_contiguous+0x5/0x10 Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840 RSP: 0018:ffffad2400043c98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000 RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40 RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000 R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680 FS: 0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __remove_pages+0x4b/0x640 arch_remove_memory+0x63/0x8d try_remove_memory+0xdb/0x130 __remove_memory+0xa/0x11 acpi_memory_device_remove+0x70/0x100 acpi_bus_trim+0x55/0x90 acpi_device_hotplug+0x227/0x3a0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x221/0x550 worker_thread+0x50/0x3b0 kthread+0x105/0x140 ret_from_fork+0x3a/0x50 Modules linked in: CR2: 000000000000353d Instead, shrink the zones when offlining memory or when onlining failed. Introduce and use remove_pfn_range_from_zone(() for that. We now properly shrink the zones, even if we have DIMMs whereby - Some memory blocks fall into no zone (never onlined) - Some memory blocks fall into multiple zones (offlined+re-onlined) - Multiple memory blocks that fall into different zones Drop the zone parameter (with a potential dubious value) from __remove_pages() and __remove_section(). Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com Fixes: `f1dd2cd13c` ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after `d0dc12e86b`] Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: <stable@vger.kernel.org> [5.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-29 16:43:27 +01:00
Aneesh Kumar K.V	f291080659	mm/memunmap: don't access uninitialized memmap in memunmap_pages() commit 77e080e7680e1e615587352f70c87b9e98126d03 upstream. -- snip -- - Missing mm/hmm.c and kernel/memremap.c unification. -- hmm code does not need fixes (no altmap) - Missing 7cc7867fb061 ("mm/devm_memremap_pages: enable sub-section remap") -- snip -- Patch series "mm/memory_hotplug: Shrink zones before removing memory", v6. This series fixes the access of uninitialized memmaps when shrinking zones/nodes and when removing memory. Also, it contains all fixes for crashes that can be triggered when removing certain namespace using memunmap_pages() - ZONE_DEVICE, reported by Aneesh. We stop trying to shrink ZONE_DEVICE, as it's buggy, fixing it would be more involved (we don't have SECTION_IS_ONLINE as an indicator), and shrinking is only of limited use (set_zone_contiguous() cannot detect the ZONE_DEVICE as contiguous). We continue shrinking !ZONE_DEVICE zones, however, I reduced the amount of code to a minimum. Shrinking is especially necessary to keep zone->contiguous set where possible, especially, on memory unplug of DIMMs at zone boundaries. -------------------------------------------------------------------------- Zones are now properly shrunk when offlining memory blocks or when onlining failed. This allows to properly shrink zones on memory unplug even if the separate memory blocks of a DIMM were onlined to different zones or re-onlined to a different zone after offlining. Example: :/# cat /proc/zoneinfo Node 1, zone Movable spanned 0 present 0 managed 0 :/# echo "online_movable" > /sys/devices/system/memory/memory41/state :/# echo "online_movable" > /sys/devices/system/memory/memory43/state :/# cat /proc/zoneinfo Node 1, zone Movable spanned 98304 present 65536 managed 65536 :/# echo 0 > /sys/devices/system/memory/memory43/online :/# cat /proc/zoneinfo Node 1, zone Movable spanned 32768 present 32768 managed 32768 :/# echo 0 > /sys/devices/system/memory/memory41/online :/# cat /proc/zoneinfo Node 1, zone Movable spanned 0 present 0 managed 0 This patch (of 10): With an altmap, the memmap falling into the reserved altmap space are not initialized and, therefore, contain a garbage NID and a garbage zone. Make sure to read the NID/zone from a memmap that was initialized. This fixes a kernel crash that is observed when destroying a namespace: kernel BUG at include/linux/mm.h:1107! cpu 0x1: Vector: 700 (Program Check) at [c000000274087890] pc: c0000000004b9728: memunmap_pages+0x238/0x340 lr: c0000000004b9724: memunmap_pages+0x234/0x340 ... pid = 3669, comm = ndctl kernel BUG at include/linux/mm.h:1107! devm_action_release+0x30/0x50 release_nodes+0x268/0x2d0 device_release_driver_internal+0x174/0x240 unbind_store+0x13c/0x190 drv_attr_store+0x44/0x60 sysfs_kf_write+0x70/0xa0 kernfs_fop_write+0x1ac/0x290 __vfs_write+0x3c/0x70 vfs_write+0xe4/0x200 ksys_write+0x7c/0x140 system_call+0x5c/0x68 The "page_zone(pfn_to_page(pfn)" was introduced by 69324b8f4833 ("mm, devm_memremap_pages: add MEMORY_DEVICE_PRIVATE support"), however, I think we will never have driver reserved memory with MEMORY_DEVICE_PRIVATE (no altmap AFAIKS). [david@redhat.com: minimze code changes, rephrase description] Link: http://lkml.kernel.org/r/20191006085646.5768-2-david@redhat.com Fixes: 2c2a5af6fed2 ("mm, memory_hotplug: add nid parameter to arch_remove_memory") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Damian Tometzki <damian.tometzki@gmail.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jun Yao <yaojun8558363@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pankaj Gupta <pagupta@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qian Cai <cai@lca.pw> Cc: Rich Felker <dalias@libc.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Steve Capper <steve.capper@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> [5.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-29 16:43:27 +01:00
Oscar Salvador	5c1f8f5358	mm, memory_hotplug: add nid parameter to arch_remove_memory commit 2c2a5af6fed20cf74401c9d64319c76c5ff81309 upstream. -- snip -- Missing unification of mm/hmm.c and kernel/memremap.c -- snip -- Patch series "Do not touch pages in hot-remove path", v2. This patchset aims for two things: 1) A better definition about offline and hot-remove stage 2) Solving bugs where we can access non-initialized pages during hot-remove operations [2] [3]. This is achieved by moving all page/zone handling to the offline stage, so we do not need to access pages when hot-removing memory. [1] https://patchwork.kernel.org/cover/10691415/ [2] https://patchwork.kernel.org/patch/10547445/ [3] https://www.spinics.net/lists/linux-mm/msg161316.html This patch (of 5): This is a preparation for the following-up patches. The idea of passing the nid is that it will allow us to get rid of the zone parameter afterwards. Link: http://lkml.kernel.org/r/20181127162005.15833-2-osalvador@suse.de Signed-off-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-29 16:43:25 +01:00
Steven Rostedt (VMware)	ce28d66405	tracing: Fix histogram code when expression has same var as value commit 8bcebc77e85f3d7536f96845a0fe94b1dddb6af0 upstream. While working on a tool to convert SQL syntex into the histogram language of the kernel, I discovered the following bug: # echo 'first u64 start_time u64 end_time pid_t pid u64 delta' >> synthetic_events # echo 'hist:keys=pid:start=common_timestamp' > events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:delta=common_timestamp-$start,start2=$start:onmatch(sched.sched_waking).trace(first,$start2,common_timestamp,next_pid,$delta)' > events/sched/sched_switch/trigger Would not display any histograms in the sched_switch histogram side. But if I were to swap the location of "delta=common_timestamp-$start" with "start2=$start" Such that the last line had: # echo 'hist:keys=next_pid:start2=$start,delta=common_timestamp-$start:onmatch(sched.sched_waking).trace(first,$start2,common_timestamp,next_pid,$delta)' > events/sched/sched_switch/trigger The histogram works as expected. What I found out is that the expressions clear out the value once it is resolved. As the variables are resolved in the order listed, when processing: delta=common_timestamp-$start The $start is cleared. When it gets to "start2=$start", it errors out with "unresolved symbol" (which is silent as this happens at the location of the trace), and the histogram is dropped. When processing the histogram for variable references, instead of adding a new reference for a variable used twice, use the same reference. That way, not only is it more efficient, but the order will no longer matter in processing of the variables. From Tom Zanussi: "Just to clarify some more about what the problem was is that without your patch, we would have two separate references to the same variable, and during resolve_var_refs(), they'd both want to be resolved separately, so in this case, since the first reference to start wasn't part of an expression, it wouldn't get the read-once flag set, so would be read normally, and then the second reference would do the read-once read and also be read but using read-once. So everything worked and you didn't see a problem: from: start2=$start,delta=common_timestamp-$start In the second case, when you switched them around, the first reference would be resolved by doing the read-once, and following that the second reference would try to resolve and see that the variable had already been read, so failed as unset, which caused it to short-circuit out and not do the trigger action to generate the synthetic event: to: delta=common_timestamp-$start,start2=$start With your patch, we only have the single resolution which happens correctly the one time it's resolved, so this can't happen." Link: https://lore.kernel.org/r/20200116154216.58ca08eb@gandalf.local.home Cc: stable@vger.kernel.org Fixes: `067fe038e7` ("tracing: Add variable reference handling to hist triggers") Reviewed-by: Tom Zanuss <zanussi@kernel.org> Tested-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-29 16:43:23 +01:00
Tom Zanussi	cbb042fd87	tracing: Remove open-coding of hist trigger var_ref management commit de40f033d4e84e843d6a12266e3869015ea9097c upstream. Have create_var_ref() manage the hist trigger's var_ref list, rather than having similar code doing it in multiple places. This cleans up the code and makes sure var_refs are always accounted properly. Also, document the var_ref-related functions to make what their purpose clearer. Link: http://lkml.kernel.org/r/05ddae93ff514e66fc03897d6665231892939913.1545161087.git.tom.zanussi@linux.intel.com Acked-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-29 16:43:23 +01:00
Tom Zanussi	836717841a	tracing: Use hist trigger's var_ref array to destroy var_refs commit 656fe2ba85e81d00e4447bf77b8da2be3c47acb2 upstream. Since every var ref for a trigger has an entry in the var_ref[] array, use that to destroy the var_refs, instead of piecemeal via the field expressions. This allows us to avoid having to keep and treat differently separate lists for the action-related references, which future patches will remove. Link: http://lkml.kernel.org/r/fad1a164f0e257c158e70d6eadbf6c586e04b2a2.1545161087.git.tom.zanussi@linux.intel.com Acked-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-29 16:43:23 +01:00
Masami Hiramatsu	47eb3574d0	tracing: trigger: Replace unneeded RCU-list traversals commit aeed8aa3874dc15b9d82a6fe796fd7cfbb684448 upstream. With CONFIG_PROVE_RCU_LIST, I had many suspicious RCU warnings when I ran ftracetest trigger testcases. ----- # dmesg -c > /dev/null # ./ftracetest test.d/trigger ... # dmesg \| grep "RCU-list traversed" \| cut -f 2 -d ] \| cut -f 2 -d " " kernel/trace/trace_events_hist.c:6070 kernel/trace/trace_events_hist.c:1760 kernel/trace/trace_events_hist.c:5911 kernel/trace/trace_events_trigger.c:504 kernel/trace/trace_events_hist.c:1810 kernel/trace/trace_events_hist.c:3158 kernel/trace/trace_events_hist.c:3105 kernel/trace/trace_events_hist.c:5518 kernel/trace/trace_events_hist.c:5998 kernel/trace/trace_events_hist.c:6019 kernel/trace/trace_events_hist.c:6044 kernel/trace/trace_events_trigger.c:1500 kernel/trace/trace_events_trigger.c:1540 kernel/trace/trace_events_trigger.c:539 kernel/trace/trace_events_trigger.c:584 ----- I investigated those warnings and found that the RCU-list traversals in event trigger and hist didn't need to use RCU version because those were called only under event_mutex. I also checked other RCU-list traversals related to event trigger list, and found that most of them were called from event_hist_trigger_func() or hist_unregister_trigger() or register/unregister functions except for a few cases. Replace these unneeded RCU-list traversals with normal list traversal macro and lockdep_assert_held() to check the event_mutex is held. Link: http://lkml.kernel.org/r/157680910305.11685.15110237954275915782.stgit@devnote2 Cc: stable@vger.kernel.org Fixes: `30350d65ac` ("tracing: Add variable support to hist triggers") Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-29 16:43:18 +01:00
Greg Kroah-Hartman	1fca2c99f4	This is the 4.19.99 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4u6tsACgkQONu9yGCS aT693A//TExeDRnNnf+2v4TJorylyRr17BMxk/Ie2L5E6d2n/RWodsrOThAPU9tx 5alNUkXCT8Jd31BUVnUoPoAQ4zSymSVi++XEf05wDeO0tQ982IESGaLmu9EC1uMF nnM5y4IdRYmFI1Zji4h5vRJckoYUlB6Mdg4BgMr4Q1KX7RkZYfe6bjs7DwM/uyMx jVXdFaQBD1H6F5W6A+GmgUZ36g9uNqzcBxxWwv5URj+q816NdI4bsxIJMF0v0WC+ S54fmpS07QWIYKKsQBUepeSgEF4ECESOE2VoF1ICcnfakdPnDBmNgyPJPSrLmVf+ itRUxoH1MewaOvoJrv+xsGBPmM29LcKH2oBmj5DR2Xstp7ACPs+OtXJEU9dUTDN4 NhaSts5fIp0f4Y5mMn508pDUwYDAWDt99ZJWdx6aK/TRyUsHBgpxBQDt37BE3U5W PCBnObNe2b2KDAsVXLjX5iDYoA0+usFreveMo8uEP+ohfh0ANvJlRkzedYw7NquI ZCcT+I1P9q8aa0528tR332VLrQeYg+kG6LVi2kAabmRA/VtEsT0w90MY/eo2vuTU WlPmbs2yerv2HTm050e6MOgBZfPh7wP/FpbjsSXufj7EDywlfxF+1hXdwfrpPJeN fN3g0kepeUp7+kLzO40FLam/z5ndjAUhyN2SBaPzGsXjMkZdETk= =zvlh -----END PGP SIGNATURE----- Merge 4.19.99 into android-4.19 Changes in 4.19.99 Revert "efi: Fix debugobjects warning on 'efi_rts_work'" xfs: Sanity check flags of Q_XQUOTARM call i2c: stm32f7: rework slave_id allocation i2c: i2c-stm32f7: fix 10-bits check in slave free id search loop mfd: intel-lpss: Add default I2C device properties for Gemini Lake SUNRPC: Fix svcauth_gss_proxy_init() powerpc/pseries: Enable support for ibm,drc-info property powerpc/archrandom: fix arch_get_random_seed_int() tipc: update mon's self addr when node addr generated tipc: fix wrong timeout input for tipc_wait_for_cond() mt7601u: fix bbp version check in mt7601u_wait_bbp_ready crypto: sun4i-ss - fix big endian issues perf map: No need to adjust the long name of modules soc: aspeed: Fix snoop_file_poll()'s return type watchdog: sprd: Fix the incorrect pointer getting from driver data ipmi: Fix memory leak in __ipmi_bmc_register drm/sti: do not remove the drm_bridge that was never added ARM: dts: at91: nattis: set the PRLUD and HIPOW signals low ARM: dts: at91: nattis: make the SD-card slot work ixgbe: don't clear IPsec sa counters on HW clearing drm/virtio: fix bounds check in virtio_gpu_cmd_get_capset() iio: fix position relative kernel version apparmor: Fix network performance issue in aa_label_sk_perm ALSA: hda: fix unused variable warning apparmor: don't try to replace stale label in ptrace access check ARM: qcom_defconfig: Enable MAILBOX firmware: coreboot: Let OF core populate platform device PCI: iproc: Remove PAXC slot check to allow VF support bridge: br_arp_nd_proxy: set icmp6_router if neigh has NTF_ROUTER drm/hisilicon: hibmc: Don't overwrite fb helper surface depth signal/ia64: Use the generic force_sigsegv in setup_frame signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn ASoC: wm9712: fix unused variable warning mailbox: mediatek: Add check for possible failure of kzalloc IB/rxe: replace kvfree with vfree IB/hfi1: Add mtu check for operational data VLs genirq/debugfs: Reinstate full OF path for domain name usb: dwc3: add EXTCON dependency for qcom usb: gadget: fsl_udc_core: check allocation return value and cleanup on failure cfg80211: regulatory: make initialization more robust mei: replace POLL* with EPOLL* for write queues. drm/msm: fix unsigned comparison with less than zero of: Fix property name in of_node_get_device_type ALSA: usb-audio: update quirk for B&W PX to remove microphone iwlwifi: nvm: get num of hw addresses from firmware staging: comedi: ni_mio_common: protect register write overflow netfilter: nft_osf: usage from output path is not valid pwm: lpss: Release runtime-pm reference from the driver's remove callback powerpc/pseries/memory-hotplug: Fix return value type of find_aa_index rtlwifi: rtl8821ae: replace _rtl8821ae_mrate_idx_to_arfr_id with generic version RDMA/bnxt_re: Add missing spin lock initialization netfilter: nf_flow_table: do not remove offload when other netns's interface is down powerpc/kgdb: add kgdb_arch_set/remove_breakpoint() tipc: eliminate message disordering during binding table update net: socionext: Add dummy PHY register read in phy_write() drm/sun4i: hdmi: Fix double flag assignation net: hns3: add error handler for hns3_nic_init_vector_data() mlxsw: reg: QEEC: Add minimum shaper fields mlxsw: spectrum: Set minimum shaper on MC TCs NTB: ntb_hw_idt: replace IS_ERR_OR_NULL with regular NULL checks ASoC: wm97xx: fix uninitialized regmap pointer problem ARM: dts: bcm283x: Correct mailbox register sizes pcrypt: use format specifier in kobject_add ASoC: sun8i-codec: add missing route for ADC pinctrl: meson-gxl: remove invalid GPIOX tsin_a pins bus: ti-sysc: Add mcasp optional clocks flag exportfs: fix 'passing zero to ERR_PTR()' warning drm: rcar-du: Fix the return value in case of error in 'rcar_du_crtc_set_crc_source()' drm: rcar-du: Fix vblank initialization net: always initialize pagedlen drm/dp_mst: Skip validating ports during destruction, just ref arm64: dts: meson-gx: Add hdmi_5v regulator as hdmi tx supply arm64: dts: renesas: r8a7795-es1: Add missing power domains to IPMMU nodes net: phy: Fix not to call phy_resume() if PHY is not attached IB/hfi1: Correctly process FECN and BECN in packets OPP: Fix missing debugfs supply directory for OPPs IB/rxe: Fix incorrect cache cleanup in error flow mailbox: ti-msgmgr: Off by one in ti_msgmgr_of_xlate() staging: bcm2835-camera: Abort probe if there is no camera staging: bcm2835-camera: fix module autoloading switchtec: Remove immediate status check after submitting MRPC command ipv6: add missing tx timestamping on IPPROTO_RAW pinctrl: sh-pfc: r8a7740: Add missing REF125CK pin to gether_gmii group pinctrl: sh-pfc: r8a7740: Add missing LCD0 marks to lcd0_data24_1 group pinctrl: sh-pfc: r8a7791: Remove bogus ctrl marks from qspi_data4_b group pinctrl: sh-pfc: r8a7791: Remove bogus marks from vin1_b_data18 group pinctrl: sh-pfc: sh73a0: Add missing TO pin to tpu4_to3 group pinctrl: sh-pfc: r8a7794: Remove bogus IPSR9 field pinctrl: sh-pfc: r8a77970: Add missing MOD_SEL0 field pinctrl: sh-pfc: r8a77980: Add missing MOD_SEL0 field pinctrl: sh-pfc: sh7734: Add missing IPSR11 field pinctrl: sh-pfc: r8a77995: Remove bogus SEL_PWM[0-3]_3 configurations pinctrl: sh-pfc: sh7269: Add missing PCIOR0 field pinctrl: sh-pfc: sh7734: Remove bogus IPSR10 value net: hns3: fix error handling int the hns3_get_vector_ring_chain vxlan: changelink: Fix handling of default remotes Input: nomadik-ske-keypad - fix a loop timeout test fork,memcg: fix crash in free_thread_stack on memcg charge fail clk: highbank: fix refcount leak in hb_clk_init() clk: qoriq: fix refcount leak in clockgen_init() clk: ti: fix refcount leak in ti_dt_clocks_register() clk: socfpga: fix refcount leak clk: samsung: exynos4: fix refcount leak in exynos4_get_xom() clk: imx6q: fix refcount leak in imx6q_clocks_init() clk: imx6sx: fix refcount leak in imx6sx_clocks_init() clk: imx7d: fix refcount leak in imx7d_clocks_init() clk: vf610: fix refcount leak in vf610_clocks_init() clk: armada-370: fix refcount leak in a370_clk_init() clk: kirkwood: fix refcount leak in kirkwood_clk_init() clk: armada-xp: fix refcount leak in axp_clk_init() clk: mv98dx3236: fix refcount leak in mv98dx3236_clk_init() clk: dove: fix refcount leak in dove_clk_init() MIPS: BCM63XX: drop unused and broken DSP platform device arm64: defconfig: Re-enable bcm2835-thermal driver remoteproc: qcom: q6v5-mss: Add missing clocks for MSM8996 remoteproc: qcom: q6v5-mss: Add missing regulator for MSM8996 drm: Fix error handling in drm_legacy_addctx ARM: dts: r8a7743: Remove generic compatible string from iic3 drm/etnaviv: fix some off by one bugs drm/fb-helper: generic: Fix setup error path fork, memcg: fix cached_stacks case IB/usnic: Fix out of bounds index check in query pkey RDMA/ocrdma: Fix out of bounds index check in query pkey RDMA/qedr: Fix out of bounds index check in query pkey drm/shmob: Fix return value check in shmob_drm_probe arm64: dts: apq8016-sbc: Increase load on l11 for SDCARD spi: cadence: Correct initialisation of runtime PM RDMA/iw_cxgb4: Fix the unchecked ep dereference net: phy: micrel: set soft_reset callback to genphy_soft_reset for KSZ9031 memory: tegra: Don't invoke Tegra30+ specific memory timing setup on Tegra20 drm/etnaviv: NULL vs IS_ERR() buf in etnaviv_core_dump() media: s5p-jpeg: Correct step and max values for V4L2_CID_JPEG_RESTART_INTERVAL kbuild: mark prepare0 as PHONY to fix external module build crypto: brcm - Fix some set-but-not-used warning crypto: tgr192 - fix unaligned memory access ASoC: imx-sgtl5000: put of nodes if finding codec fails IB/iser: Pass the correct number of entries for dma mapped SGL net: hns3: fix wrong combined count returned by ethtool -l media: tw9910: Unregister subdevice with v4l2-async IB/mlx5: Don't override existing ip_protocol rtc: cmos: ignore bogus century byte spi/topcliff_pch: Fix potential NULL dereference on allocation error net: hns3: fix bug of ethtool_ops.get_channels for VF ARM: dts: sun8i-a23-a33: Move NAND controller device node to sort by address clk: sunxi-ng: sun8i-a23: Enable PLL-MIPI LDOs when ungating it iwlwifi: mvm: avoid possible access out of array. net/mlx5: Take lock with IRQs disabled to avoid deadlock ip_tunnel: Fix route fl4 init in ip_md_tunnel_xmit arm64: dts: allwinner: h6: Move GIC device node fix base address ordering iwlwifi: mvm: fix A-MPDU reference assignment bus: ti-sysc: Fix timer handling with drop pm_runtime_irq_safe() tty: ipwireless: Fix potential NULL pointer dereference driver: uio: fix possible memory leak in __uio_register_device driver: uio: fix possible use-after-free in __uio_register_device crypto: crypto4xx - Fix wrong ppc4xx_trng_probe()/ppc4xx_trng_remove() arguments driver core: Fix DL_FLAG_AUTOREMOVE_SUPPLIER device link flag handling driver core: Avoid careless re-use of existing device links driver core: Do not resume suppliers under device_links_write_lock() driver core: Fix handling of runtime PM flags in device_link_add() driver core: Do not call rpm_put_suppliers() in pm_runtime_drop_link() ARM: dts: lpc32xx: add required clocks property to keypad device node ARM: dts: lpc32xx: reparent keypad controller to SIC1 ARM: dts: lpc32xx: fix ARM PrimeCell LCD controller variant ARM: dts: lpc32xx: fix ARM PrimeCell LCD controller clocks property ARM: dts: lpc32xx: phy3250: fix SD card regulator voltage drm/xen-front: Fix mmap attributes for display buffers iwlwifi: mvm: fix RSS config command staging: most: cdev: add missing check for cdev_add failure clk: ingenic: jz4740: Fix gating of UDC clock rtc: ds1672: fix unintended sign extension thermal: mediatek: fix register index error arm64: dts: msm8916: remove bogus argument to the cpu clock ath10k: fix dma unmap direction for management frames net: phy: fixed_phy: Fix fixed_phy not checking GPIO rtc: ds1307: rx8130: Fix alarm handling net/smc: original socket family in inet_sock_diag rtc: 88pm860x: fix unintended sign extension rtc: 88pm80x: fix unintended sign extension rtc: pm8xxx: fix unintended sign extension fbdev: chipsfb: remove set but not used variable 'size' iw_cxgb4: use tos when importing the endpoint iw_cxgb4: use tos when finding ipv6 routes ipmi: kcs_bmc: handle devm_kasprintf() failure case xsk: add missing smp_rmb() in xsk_mmap drm/etnaviv: potential NULL dereference ntb_hw_switchtec: debug print 64bit aligned crosslink BAR Numbers ntb_hw_switchtec: NT req id mapping table register entry number should be 512 pinctrl: sh-pfc: emev2: Add missing pinmux functions pinctrl: sh-pfc: r8a7791: Fix scifb2_data_c pin group pinctrl: sh-pfc: r8a7792: Fix vin1_data18_b pin group pinctrl: sh-pfc: sh73a0: Fix fsic_spdif pin groups RDMA/mlx5: Fix memory leak in case we fail to add an IB device driver core: Fix possible supplier PM-usage counter imbalance PCI: endpoint: functions: Use memcpy_fromio()/memcpy_toio() usb: phy: twl6030-usb: fix possible use-after-free on remove block: don't use bio->bi_vcnt to figure out segment number keys: Timestamp new keys net: dsa: b53: Fix default VLAN ID net: dsa: b53: Properly account for VLAN filtering net: dsa: b53: Do not program CPU port's PVID mt76: usb: fix possible memory leak in mt76u_buf_free media: sh: migor: Include missing dma-mapping header vfio_pci: Enable memory accesses before calling pci_map_rom hwmon: (pmbus/tps53679) Fix driver info initialization in probe routine mdio_bus: Fix PTR_ERR() usage after initialization to constant KVM: PPC: Release all hardware TCE tables attached to a group staging: r8822be: check kzalloc return or bail dmaengine: mv_xor: Use correct device for DMA API cdc-wdm: pass return value of recover_from_urb_loss brcmfmac: create debugfs files for bus-specific layer regulator: pv88060: Fix array out-of-bounds access regulator: pv88080: Fix array out-of-bounds access regulator: pv88090: Fix array out-of-bounds access net: dsa: qca8k: Enable delay for RGMII_ID mode net/mlx5: Delete unused FPGA QPN variable drm/nouveau/bios/ramcfg: fix missing parentheses when calculating RON drm/nouveau/pmu: don't print reply values if exec is false drm/nouveau: fix missing break in switch statement driver core: Fix PM-runtime for links added during consumer probe ASoC: qcom: Fix of-node refcount unbalance in apq8016_sbc_parse_of() net: dsa: fix unintended change of bridge interface STP state fs/nfs: Fix nfs_parse_devname to not modify it's argument staging: rtlwifi: Use proper enum for return in halmac_parse_psd_data_88xx powerpc/64s: Fix logic when handling unknown CPU features NFS: Fix a soft lockup in the delegation recovery code perf: Copy parent's address filter offsets on clone perf, pt, coresight: Fix address filters for vmas with non-zero offset clocksource/drivers/sun5i: Fail gracefully when clock rate is unavailable clocksource/drivers/exynos_mct: Fix error path in timer resources initialization platform/x86: wmi: fix potential null pointer dereference NFS/pnfs: Bulk destroy of layouts needs to be safe w.r.t. umount mmc: sdhci-brcmstb: handle mmc_of_parse() errors during probe iommu: Fix IOMMU debugfs fallout ARM: 8847/1: pm: fix HYP/SVC mode mismatch when MCPM is used ARM: 8848/1: virt: Align GIC version check with arm64 counterpart ARM: 8849/1: NOMMU: Fix encodings for PMSAv8's PRBAR4/PRLAR4 regulator: wm831x-dcdc: Fix list of wm831x_dcdc_ilim from mA to uA ath10k: Fix length of wmi tlv command for protected mgmt frames netfilter: nft_set_hash: fix lookups with fixed size hash on big endian netfilter: nft_set_hash: bogus element self comparison from deactivation path net: sched: act_csum: Fix csum calc for tagged packets hwrng: bcm2835 - fix probe as platform device iommu/vt-d: Fix NULL pointer reference in intel_svm_bind_mm() NFS: Add missing encode / decode sequence_maxsz to v4.2 operations NFSv4/flexfiles: Fix invalid deref in FF_LAYOUT_DEVID_NODE() net: aquantia: fixed instack structure overflow powerpc/mm: Check secondary hash page table media: dvb/earth-pt1: fix wrong initialization for demod blocks rbd: clear ->xferred on error from rbd_obj_issue_copyup() PCI: Fix "try" semantics of bus and slot reset nios2: ksyms: Add missing symbol exports x86/mm: Remove unused variable 'cpu' scsi: megaraid_sas: reduce module load time nfp: fix simple vNIC mailbox length drivers/rapidio/rio_cm.c: fix potential oops in riocm_ch_listen() xen, cpu_hotplug: Prevent an out of bounds access net/mlx5: Fix multiple updates of steering rules in parallel net/mlx5e: IPoIB, Fix RX checksum statistics update net: sh_eth: fix a missing check of of_get_phy_mode regulator: lp87565: Fix missing register for LP87565_BUCK_0 soc: amlogic: gx-socinfo: Add mask for each SoC packages media: ivtv: update pos correctly in ivtv_read_pos() media: cx18: update pos correctly in cx18_read_pos() media: wl128x: Fix an error code in fm_download_firmware() media: cx23885: check allocation return regulator: tps65086: Fix tps65086_ldoa1_ranges for selector 0xB crypto: ccree - reduce kernel stack usage with clang jfs: fix bogus variable self-initialization tipc: tipc clang warning m68k: mac: Fix VIA timer counter accesses ARM: dts: sun8i: a33: Reintroduce default pinctrl muxing arm64: dts: allwinner: a64: Add missing PIO clocks ARM: dts: sun9i: optimus: Fix fixed-regulators net: phy: don't clear BMCR in genphy_soft_reset ARM: OMAP2+: Fix potentially uninitialized return value for _setup_reset() net: dsa: Avoid null pointer when failing to connect to PHY soc: qcom: cmd-db: Fix an error code in cmd_db_dev_probe() media: davinci-isif: avoid uninitialized variable use media: tw5864: Fix possible NULL pointer dereference in tw5864_handle_frame spi: tegra114: clear packed bit for unpacked mode spi: tegra114: fix for unpacked mode transfers spi: tegra114: terminate dma and reset on transfer timeout spi: tegra114: flush fifos spi: tegra114: configure dma burst size to fifo trig level bus: ti-sysc: Fix sysc_unprepare() when no clocks have been allocated soc/fsl/qe: Fix an error code in qe_pin_request() spi: bcm2835aux: fix driver to not allow 65535 (=-1) cs-gpios drm/fb-helper: generic: Call drm_client_add() after setup is done arm64/vdso: don't leak kernel addresses rtc: Fix timestamp value for RTC_TIMESTAMP_BEGIN_1900 rtc: mt6397: Don't call irq_dispose_mapping. ehea: Fix a copy-paste err in ehea_init_port_res bpf: Add missed newline in verifier verbose log drm/vmwgfx: Remove set but not used variable 'restart' scsi: qla2xxx: Unregister chrdev if module initialization fails of: use correct function prototype for of_overlay_fdt_apply() net/sched: cbs: fix port_rate miscalculation clk: qcom: Skip halt checks on gcc_pcie_0_pipe_clk for 8998 ACPI: button: reinitialize button state upon resume firmware: arm_scmi: fix of_node leak in scmi_mailbox_check rxrpc: Fix detection of out of order acks scsi: target/core: Fix a race condition in the LUN lookup code brcmfmac: fix leak of mypkt on error return path ARM: pxa: ssp: Fix "WARNING: invalid free of devm_ allocated data" PCI: rockchip: Fix rockchip_pcie_ep_assert_intx() bitwise operations net: hns3: fix for vport->bw_limit overflow problem hwmon: (w83627hf) Use request_muxed_region for Super-IO accesses perf/core: Fix the address filtering fix staging: android: vsoc: fix copy_from_user overrun PCI: dwc: Fix dw_pcie_ep_find_capability() to return correct capability offset soc: amlogic: meson-gx-pwrc-vpu: Fix power on/off register bitmask platform/x86: alienware-wmi: fix kfree on potentially uninitialized pointer tipc: set sysctl_tipc_rmem and named_timeout right range usb: typec: tcpm: Notify the tcpc to start connection-detection for SRPs selftests/ipc: Fix msgque compiler warnings net: hns3: fix loop condition of hns3_get_tx_timeo_queue_info() powerpc: vdso: Make vdso32 installation conditional in vdso_install ARM: dts: ls1021: Fix SGMII PCS link remaining down after PHY disconnect media: ov2659: fix unbalanced mutex_lock/unlock 6lowpan: Off by one handling ->nexthdr dmaengine: axi-dmac: Don't check the number of frames for alignment ALSA: usb-audio: Handle the error from snd_usb_mixer_apply_create_quirk() afs: Fix AFS file locking to allow fine grained locks afs: Further fix file locking NFS: Don't interrupt file writeout due to fatal errors coresight: catu: fix clang build warning s390/kexec_file: Fix potential segment overlap in ELF loader irqchip/gic-v3-its: fix some definitions of inner cacheability attributes scsi: qla2xxx: Fix a format specifier scsi: qla2xxx: Fix error handling in qlt_alloc_qfull_cmd() scsi: qla2xxx: Avoid that qlt_send_resp_ctio() corrupts memory KVM: PPC: Book3S HV: Fix lockdep warning when entering the guest netfilter: nft_flow_offload: add entry to flowtable after confirmation PCI: iproc: Enable iProc config read for PAXBv2 ARM: dts: logicpd-som-lv: Fix MMC1 card detect packet: in recvmsg msg_name return at least sizeof sockaddr_ll ASoC: fix valid stream condition usb: gadget: fsl: fix link error against usb-gadget module dwc2: gadget: Fix completed transfer size calculation in DDMA IB/mlx5: Add missing XRC options to QP optional params mask RDMA/rxe: Consider skb reserve space based on netdev of GID iommu/vt-d: Make kernel parameter igfx_off work with vIOMMU net: ena: fix swapped parameters when calling ena_com_indirect_table_fill_entry net: ena: fix: Free napi resources when ena_up() fails net: ena: fix incorrect test of supported hash function net: ena: fix ena_com_fill_hash_function() implementation dmaengine: tegra210-adma: restore channel status watchdog: rtd119x_wdt: Fix remove function mmc: core: fix possible use after free of host lightnvm: pblk: fix lock order in pblk_rb_tear_down_check ath10k: Fix encoding for protected management frames afs: Fix the afs.cell and afs.volume xattr handlers vfio/mdev: Avoid release parent reference during error path vfio/mdev: Follow correct remove sequence vfio/mdev: Fix aborting mdev child device removal if one fails l2tp: Fix possible NULL pointer dereference ALSA: aica: Fix a long-time build breakage media: omap_vout: potential buffer overflow in vidioc_dqbuf() media: davinci/vpbe: array underflow in vpbe_enum_outputs() platform/x86: alienware-wmi: printing the wrong error code crypto: caam - fix caam_dump_sg that iterates through scatterlist netfilter: ebtables: CONFIG_COMPAT: reject trailing data after last rule pwm: meson: Consider 128 a valid pre-divider pwm: meson: Don't disable PWM when setting duty repeatedly ARM: riscpc: fix lack of keyboard interrupts after irq conversion nfp: bpf: fix static check error through tightening shift amount adjustment kdb: do a sanity check on the cpu in kdb_per_cpu() netfilter: nf_tables: correct NFT_LOGLEVEL_MAX value backlight: lm3630a: Return 0 on success in update_status functions thermal: rcar_gen3_thermal: fix interrupt type thermal: cpu_cooling: Actually trace CPU load in thermal_power_cpu_get_power EDAC/mc: Fix edac_mc_find() in case no device is found afs: Fix key leak in afs_release() and afs_evict_inode() afs: Don't invalidate callback if AFS_VNODE_DIR_VALID not set afs: Fix lock-wait/callback-break double locking afs: Fix double inc of vnode->cb_break ARM: dts: sun8i-h3: Fix wifi in Beelink X2 DT clk: meson: gxbb: no spread spectrum on mpll0 clk: meson: axg: spread spectrum is on mpll2 dmaengine: tegra210-adma: Fix crash during probe arm64: dts: meson: libretech-cc: set eMMC as removable RDMA/qedr: Fix incorrect device rate. spi: spi-fsl-spi: call spi_finalize_current_message() at the end crypto: ccp - fix AES CFB error exposed by new test vectors crypto: ccp - Fix 3DES complaint from ccp-crypto module serial: stm32: fix word length configuration serial: stm32: fix rx error handling serial: stm32: fix rx data length when parity enabled serial: stm32: fix transmit_chars when tx is stopped serial: stm32: Add support of TC bit status check serial: stm32: fix wakeup source initialization misc: sgi-xp: Properly initialize buf in xpc_get_rsvd_page_pa iommu: Add missing new line for dma type iommu: Use right function to get group for device signal/bpfilter: Fix bpfilter_kernl to use send_sig not force_sig signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig inet: frags: call inet_frags_fini() after unregister_pernet_subsys() net: hns3: fix a memory leak issue for hclge_map_unmap_ring_to_vf_vector crypto: talitos - fix AEAD processing. netvsc: unshare skb in VF rx handler net: core: support XDP generic on stacked devices. RDMA/uverbs: check for allocation failure in uapi_add_elm() net: don't clear sock->sk early to avoid trouble in strparser phy: qcom-qusb2: fix missing assignment of ret when calling clk_prepare_enable cpufreq: brcmstb-avs-cpufreq: Fix initial command check cpufreq: brcmstb-avs-cpufreq: Fix types for voltage/frequency clk: sunxi-ng: sun50i-h6-r: Fix incorrect W1 clock gate register media: vivid: fix incorrect assignment operation when setting video mode crypto: inside-secure - fix zeroing of the request in ahash_exit_inv crypto: inside-secure - fix queued len computation arm64: dts: renesas: ebisu: Remove renesas, no-ether-link property mpls: fix warning with multi-label encap serial: stm32: fix a recursive locking in stm32_config_rs485 arm64: dts: meson-gxm-khadas-vim2: fix gpio-keys-polled node arm64: dts: meson-gxm-khadas-vim2: fix Bluetooth support iommu/vt-d: Duplicate iommu_resv_region objects per device list phy: usb: phy-brcm-usb: Remove sysfs attributes upon driver removal firmware: arm_scmi: fix bitfield definitions for SENSOR_DESC attributes firmware: arm_scmi: update rate_discrete in clock_describe_rates_get ntb_hw_switchtec: potential shift wrapping bug in switchtec_ntb_init_sndev() ASoC: meson: axg-tdmin: right_j is not supported ASoC: meson: axg-tdmout: right_j is not supported qed: iWARP - Use READ_ONCE and smp_store_release to access ep->state qed: iWARP - fix uninitialized callback powerpc/cacheinfo: add cacheinfo_teardown, cacheinfo_rebuild powerpc/pseries/mobility: rebuild cacheinfo hierarchy post-migration bpf: fix the check that forwarding is enabled in bpf_ipv6_fib_lookup IB/hfi1: Handle port down properly in pio drm/msm/mdp5: Fix mdp5_cfg_init error return net: netem: fix backlog accounting for corrupted GSO frames net/udp_gso: Allow TX timestamp with UDP GSO net/af_iucv: build proper skbs for HiperTransport net/af_iucv: always register net_device notifier ASoC: ti: davinci-mcasp: Fix slot mask settings when using multiple AXRs rtc: pcf8563: Fix interrupt trigger method rtc: pcf8563: Clear event flags and disable interrupts before requesting irq ARM: dts: iwg20d-q7-common: Fix SDHI1 VccQ regularor net/sched: cbs: Fix error path of cbs_module_init arm64: dts: allwinner: h6: Pine H64: Add interrupt line for RTC drm/msm/a3xx: remove TPL1 regs from snapshot ip6_fib: Don't discard nodes with valid routing information in fib6_locate_1() perf/ioctl: Add check for the sample_period value dmaengine: hsu: Revert "set HSU_CH_MTSR to memory width" clk: qcom: Fix -Wunused-const-variable nvmem: imx-ocotp: Ensure WAIT bits are preserved when setting timing nvmem: imx-ocotp: Change TIMING calculation to u-boot algorithm tools: bpftool: use correct argument in cgroup errors backlight: pwm_bl: Fix heuristic to determine number of brightness levels fork,memcg: alloc_thread_stack_node needs to set tsk->stack bnxt_en: Fix ethtool selftest crash under error conditions. bnxt_en: Suppress error messages when querying DSCP DCB capabilities. iommu/amd: Make iommu_disable safer mfd: intel-lpss: Release IDA resources rxrpc: Fix uninitialized error code in rxrpc_send_data_packet() xprtrdma: Fix use-after-free in rpcrdma_post_recvs um: Fix IRQ controller regression on console read PM: ACPI/PCI: Resume all devices during hibernation ACPI: PM: Simplify and fix PM domain hibernation callbacks ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS fsi/core: Fix error paths on CFAM init devres: allow const resource arguments fsi: sbefifo: Don't fail operations when in SBE IPL state RDMA/hns: Fixs hw access invalid dma memory error PCI: mobiveil: Remove the flag MSI_FLAG_MULTI_PCI_MSI PCI: mobiveil: Fix devfn check in mobiveil_pcie_valid_device() PCI: mobiveil: Fix the valid check for inbound and outbound windows ceph: fix "ceph.dir.rctime" vxattr value net: pasemi: fix an use-after-free in pasemi_mac_phy_init() net/tls: fix socket wmem accounting on fallback with netem x86/pgtable/32: Fix LOWMEM_PAGES constant xdp: fix possible cq entry leak ARM: stm32: use "depends on" instead of "if" after prompt scsi: libfc: fix null pointer dereference on a null lport xfrm interface: ifname may be wrong in logs drm/panel: make drm_panel.h self-contained clk: sunxi-ng: v3s: add the missing PLL_DDR1 PM: sleep: Fix possible overflow in pm_system_cancel_wakeup() libertas_tf: Use correct channel range in lbtf_geo_init qed: reduce maximum stack frame size usb: host: xhci-hub: fix extra endianness conversion media: rcar-vin: Clean up correct notifier in error path mic: avoid statically declaring a 'struct device'. x86/kgbd: Use NMI_VECTOR not APIC_DM_NMI crypto: ccp - Reduce maximum stack usage ALSA: aoa: onyx: always initialize register read value arm64: dts: renesas: r8a77995: Fix register range of display node tipc: reduce risk of wakeup queue starvation ARM: dts: stm32: add missing vdda-supply to adc on stm32h743i-eval net/mlx5: Fix mlx5_ifc_query_lag_out_bits cifs: fix rmmod regression in cifs.ko caused by force_sig changes iio: tsl2772: Use devm_add_action_or_reset for tsl2772_chip_off net: fix bpf_xdp_adjust_head regression for generic-XDP spi: bcm-qspi: Fix BSPI QUAD and DUAL mode support when using flex mode cxgb4: smt: Add lock for atomic_dec_and_test crypto: caam - free resources in case caam_rng registration failed ext4: set error return correctly when ext4_htree_store_dirent fails RDMA/hns: Bugfix for slab-out-of-bounds when unloading hip08 driver RDMA/hns: bugfix for slab-out-of-bounds when loading hip08 driver ASoC: es8328: Fix copy-paste error in es8328_right_line_controls ASoC: cs4349: Use PM ops 'cs4349_runtime_pm' ASoC: wm8737: Fix copy-paste error in wm8737_snd_controls net/rds: Add a few missing rds_stat_names entries tools: bpftool: fix arguments for p_err() in do_event_pipe() tools: bpftool: fix format strings and arguments for jsonw_printf() drm: rcar-du: lvds: Fix bridge_to_rcar_lvds bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails signal: Allow cifs and drbd to receive their terminating signals powerpc/64s/radix: Fix memory hot-unplug page table split ASoC: sun4i-i2s: RX and TX counter registers are swapped dmaengine: dw: platform: Switch to acpi_dma_controller_register() rtc: rv3029: revert error handling patch to rv3029_eeprom_write() mac80211: minstrel_ht: fix per-group max throughput rate initialization i40e: reduce stack usage in i40e_set_fc media: atmel: atmel-isi: fix timeout value for stop streaming ARM: 8896/1: VDSO: Don't leak kernel addresses rtc: pcf2127: bugfix: read rtc disables watchdog mips: avoid explicit UB in assignment of mips_io_port_base media: em28xx: Fix exception handling in em28xx_alloc_urbs() iommu/mediatek: Fix iova_to_phys PA start for 4GB mode ahci: Do not export local variable ahci_em_messages rxrpc: Fix lack of conn cleanup when local endpoint is cleaned up [ver #2] Partially revert "kfifo: fix kfifo_alloc() and kfifo_init()" hwmon: (lm75) Fix write operations for negative temperatures net/sched: cbs: Set default link speed to 10 Mbps in cbs_set_port_rate power: supply: Init device wakeup after device_add() x86, perf: Fix the dependency of the x86 insn decoder selftest staging: greybus: light: fix a couple double frees irqdomain: Add the missing assignment of domain->fwnode for named fwnode bcma: fix incorrect update of BCMA_CORE_PCI_MDIO_DATA usb: typec: tps6598x: Fix build error without CONFIG_REGMAP_I2C bcache: Fix an error code in bch_dump_read() iio: dac: ad5380: fix incorrect assignment to val netfilter: ctnetlink: honor IPS_OFFLOAD flag ath9k: dynack: fix possible deadlock in ath_dynack_node_{de}init wcn36xx: use dynamic allocation for large variables tty: serial: fsl_lpuart: Use appropriate lpuart32_* I/O funcs ARM: dts: aspeed-g5: Fixe gpio-ranges upper limit xsk: avoid store-tearing when assigning queues xsk: avoid store-tearing when assigning umem led: triggers: Fix dereferencing of null pointer net: sonic: return NETDEV_TX_OK if failed to map buffer net: hns3: fix error VF index when setting VLAN offload rtlwifi: Fix file release memory leak ARM: dts: logicpd-som-lv: Fix i2c2 and i2c3 Pin mux f2fs: fix wrong error injection path in inc_valid_block_count() f2fs: fix error path of f2fs_convert_inline_page() scsi: fnic: fix msix interrupt allocation Btrfs: fix hang when loading existing inode cache off disk Btrfs: fix inode cache waiters hanging on failure to start caching thread Btrfs: fix inode cache waiters hanging on path allocation failure btrfs: use correct count in btrfs_file_write_iter() ixgbe: sync the first fragment unconditionally hwmon: (shtc1) fix shtc1 and shtw1 id mask net: sonic: replace dev_kfree_skb in sonic_send_packet pinctrl: iproc-gpio: Fix incorrect pinconf configurations gpio/aspeed: Fix incorrect number of banks ath10k: adjust skb length in ath10k_sdio_mbox_rx_packet RDMA/cma: Fix false error message net/rds: Fix 'ib_evt_handler_call' element in 'rds_ib_stat_names' um: Fix off by one error in IRQ enumeration bnxt_en: Increase timeout for HWRM_DBG_COREDUMP_XX commands f2fs: fix to avoid accessing uninitialized field of inode page in is_alive() mailbox: qcom-apcs: fix max_register value clk: actions: Fix factor clk struct member access powerpc/mm/mce: Keep irqs disabled during lockless page table walk bpf: fix BTF limits crypto: hisilicon - Matching the dma address for dma_pool_free() iommu/amd: Wait for completion of IOTLB flush in attach_device net: aquantia: Fix aq_vec_isr_legacy() return value cxgb4: Signedness bug in init_one() net: hisilicon: Fix signedness bug in hix5hd2_dev_probe() net: broadcom/bcmsysport: Fix signedness in bcm_sysport_probe() net: netsec: Fix signedness bug in netsec_probe() net: socionext: Fix a signedness bug in ave_probe() net: stmmac: dwmac-meson8b: Fix signedness bug in probe net: axienet: fix a signedness bug in probe of: mdio: Fix a signedness bug in of_phy_get_and_connect() net: nixge: Fix a signedness bug in nixge_probe() net: ethernet: stmmac: Fix signedness bug in ipq806x_gmac_of_parse() net: sched: cbs: Avoid division by zero when calculating the port rate nvme: retain split access workaround for capability reads net: stmmac: gmac4+: Not all Unicast addresses may be available rxrpc: Fix trace-after-put looking at the put connection record mac80211: accept deauth frames in IBSS mode llc: fix another potential sk_buff leak in llc_ui_sendmsg() llc: fix sk_buff refcounting in llc_conn_state_process() ip6erspan: remove the incorrect mtu limit for ip6erspan net: stmmac: fix length of PTP clock's name string net: stmmac: fix disabling flexible PPS output sctp: add chunks to sk_backlog when the newsk sk_socket is not set s390/qeth: Fix error handling during VNICC initialization s390/qeth: Fix initialization of vnicc cmd masks during set online act_mirred: Fix mirred_init_module error handling net: avoid possible false sharing in sk_leave_memory_pressure() net: add {READ\|WRITE}_ONCE() annotations on ->rskq_accept_head tcp: annotate lockless access to tcp_memory_pressure net/smc: receive returns without data net/smc: receive pending data after RCV_SHUTDOWN drm/msm/dsi: Implement reset correctly vhost/test: stop device before reset dmaengine: imx-sdma: fix size check for sdma script_number firmware: dmi: Fix unlikely out-of-bounds read in save_mem_devices arm64: hibernate: check pgd table allocation net: netem: fix error path for corrupted GSO frames net: netem: correct the parent's backlog when corrupted packet was dropped xsk: Fix registration of Rx-only sockets bpf, offload: Unlock on error in bpf_offload_dev_create() afs: Fix missing timeout reset net: qca_spi: Move reset_count to struct qcaspi hv_netvsc: Fix offset usage in netvsc_send_table() hv_netvsc: Fix send_table offset in case of a host bug afs: Fix large file support drm: panel-lvds: Potential Oops in probe error handling hwrng: omap3-rom - Fix missing clock by probing with device tree dpaa_eth: perform DMA unmapping before read dpaa_eth: avoid timestamp read on error paths MIPS: Loongson: Fix return value of loongson_hwmon_init hv_netvsc: flag software created hash value net: neigh: use long type to store jiffies delta packet: fix data-race in fanout_flow_is_huge() i2c: stm32f7: report dma error during probe mmc: sdio: fix wl1251 vendor id mmc: core: fix wl1251 sdio quirks affs: fix a memory leak in affs_remount afs: Remove set but not used variables 'before', 'after' dmaengine: ti: edma: fix missed failure handling drm/radeon: fix bad DMA from INTERRUPT_CNTL2 arm64: dts: juno: Fix UART frequency samples/bpf: Fix broken xdp_rxq_info due to map order assumptions usb: dwc3: Allow building USB_DWC3_QCOM without EXTCON IB/iser: Fix dma_nents type definition serial: stm32: fix clearing interrupt error flags arm64: dts: meson-gxm-khadas-vim2: fix uart_A bluetooth node m68k: Call timer_interrupt() with interrupts disabled Linux 4.19.99 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ieabeab79ea5c8cb4b6b1552702fa5d6100cea5db	2020-01-27 15:55:44 +01:00
Dan Carpenter	4622676d8f	bpf, offload: Unlock on error in bpf_offload_dev_create() [ Upstream commit d0fbb51dfaa612f960519b798387be436e8f83c5 ] We need to drop the bpf_devs_lock on error before returning. Fixes: `9fd7c55591` ("bpf: offload: aggregate offloads per-device") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Link: https://lore.kernel.org/bpf/20191104091536.GB31509@mwanda Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:51:20 +01:00
Dexuan Cui	3f929fe0ac	irqdomain: Add the missing assignment of domain->fwnode for named fwnode [ Upstream commit 711419e504ebd68c8f03656616829c8ad7829389 ] Recently device pass-through stops working for Linux VM running on Hyper-V. git-bisect shows the regression is caused by the recent commit 467a3bb97432 ("PCI: hv: Allocate a named fwnode ..."), but the root cause is that the commit `d59f6617ee` forgets to set the domain->fwnode for IRQCHIP_FWNODE_NAMED*, and as a result: 1. The domain->fwnode remains to be NULL. 2. irq_find_matching_fwspec() returns NULL since "h->fwnode == fwnode" is false, and pci_set_bus_msi_domain() sets the Hyper-V PCI root bus's msi_domain to NULL. 3. When the device is added onto the root bus, the device's dev->msi_domain is set to NULL in pci_set_msi_domain(). 4. When a device driver tries to enable MSI-X, pci_msi_setup_msi_irqs() calls arch_setup_msi_irqs(), which uses the native MSI chip (i.e. arch/x86/kernel/apic/msi.c: pci_msi_controller) to set up the irqs, but actually pci_msi_setup_msi_irqs() is supposed to call msi_domain_alloc_irqs() with the hbus->irq_domain, which is created in hv_pcie_init_irq_domain() and is associated with the Hyper-V chip hv_msi_irq_chip. Consequently, the irq line is not properly set up, and the device driver can not receive any interrupt. Fixes: `d59f6617ee` ("genirq: Allow fwnode to carry name information only") Fixes: 467a3bb97432 ("PCI: hv: Allocate a named fwnode instead of an address-based one") Reported-by: Lili Deng <v-lide@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/PU1P153MB01694D9AF625AC335C600C5FBFBE0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:51:09 +01:00
Eric W. Biederman	6db0e28b89	signal: Allow cifs and drbd to receive their terminating signals [ Upstream commit 33da8e7c814f77310250bb54a9db36a44c5de784 ] My recent to change to only use force_sig for a synchronous events wound up breaking signal reception cifs and drbd. I had overlooked the fact that by default kthreads start out with all signals set to SIG_IGN. So a change I thought was safe turned out to have made it impossible for those kernel thread to catch their signals. Reverting the work on force_sig is a bad idea because what the code was doing was very much a misuse of force_sig. As the way force_sig ultimately allowed the signal to happen was to change the signal handler to SIG_DFL. Which after the first signal will allow userspace to send signals to these kernel threads. At least for wake_ack_receiver in drbd that does not appear actively wrong. So correct this problem by adding allow_kernel_signal that will allow signals whose siginfo reports they were sent by the kernel through, but will not allow userspace generated signals, and update cifs and drbd to call allow_kernel_signal in an appropriate place so that their thread can receive this signal. Fixing things this way ensures that userspace won't be able to send signals and cause problems, that it is clear which signals the threads are expecting to receive, and it guarantees that nothing else in the system will be affected. This change was partly inspired by similar cifs and drbd patches that added allow_signal. Reported-by: ronnie sahlberg <ronniesahlberg@gmail.com> Reported-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Tested-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Cc: Steve French <smfrench@gmail.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: David Laight <David.Laight@ACULAB.COM> Fixes: 247bc9470b1e ("cifs: fix rmmod regression in cifs.ko caused by force_sig changes") Fixes: 72abe3bcf091 ("signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig") Fixes: fee109901f39 ("signal/drbd: Use send_sig not force_sig") Fixes: 3cf5d076fb4d ("signal: Remove task parameter from force_sig") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:51:05 +01:00
Andrea Arcangeli	fde68698dd	fork,memcg: alloc_thread_stack_node needs to set tsk->stack [ Upstream commit 1bf4580e00a248a2c86269125390eb3648e1877c ] Commit 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge fail") corrected two instances, but there was a third instance of this bug. Without setting tsk->stack, if memcg_charge_kernel_stack fails, it'll execute free_thread_stack() on a dangling pointer. Enterprise kernels are compiled with VMAP_STACK=y so this isn't critical, but custom VMAP_STACK=n builds should have some performance advantage, with the drawback of risking to fail fork because compaction didn't succeed. So as long as VMAP_STACK=n is a supported option it's worth fixing it upstream. Link: http://lkml.kernel.org/r/20190619011450.28048-1-aarcange@redhat.com Fixes: 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting") Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:50:58 +01:00
Ravi Bangoria	574fe4c9a3	perf/ioctl: Add check for the sample_period value [ Upstream commit 913a90bc5a3a06b1f04c337320e9aeee2328dd77 ] perf_event_open() limits the sample_period to 63 bits. See: `0819b2e30c` ("perf: Limit perf_event_attr::sample_period to 63 bits") Make ioctl() consistent with it. Also on PowerPC, negative sample_period could cause a recursive PMIs leading to a hang (reported when running perf-fuzzer). Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: maddy@linux.vnet.ibm.com Cc: mpe@ellerman.id.au Fixes: `0819b2e30c` ("perf: Limit perf_event_attr::sample_period to 63 bits") Link: https://lkml.kernel.org/r/20190604042953.914-1-ravi.bangoria@linux.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:50:57 +01:00
Dan Carpenter	9245e019e5	kdb: do a sanity check on the cpu in kdb_per_cpu() [ Upstream commit b586627e10f57ee3aa8f0cfab0d6f7dc4ae63760 ] The "whichcpu" comes from argv[3]. The cpu_online() macro looks up the cpu in a bitmap of online cpus, but if the value is too high then it could read beyond the end of the bitmap and possibly Oops. Fixes: `5d5314d679` ("kdb: core for kgdb back end (1 of 2)") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:50:48 +01:00
Alexander Shishkin	d6ef9a8fd8	perf/core: Fix the address filtering fix [ Upstream commit 52a44f83fc2d64a5e74d5d685fad2fecc7b7a321 ] The following recent commit: c60f83b813e5 ("perf, pt, coresight: Fix address filters for vmas with non-zero offset") changes the address filtering logic to communicate filter ranges to the PMU driver via a single address range object, instead of having the driver do the final bit of math. That change forgets to take into account kernel filters, which are not calculated the same way as DSO based filters. Fix that by passing the kernel filters the same way as file-based filters. This doesn't require any additional changes in the drivers. Reported-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: c60f83b813e5 ("perf, pt, coresight: Fix address filters for vmas with non-zero offset") Link: https://lkml.kernel.org/r/20190329091212.29870-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:50:39 +01:00
Andrey Ignatov	462c72919b	bpf: Add missed newline in verifier verbose log [ Upstream commit 1fbd20f8b77b366ea4aeb92ade72daa7f36a7e3b ] check_stack_access() that prints verbose log is used in adjust_ptr_min_max_vals() that prints its own verbose log and now they stick together, e.g.: variable stack access var_off=(0xfffffffffffffff0; 0x4) off=-16 size=1R2 stack pointer arithmetic goes out of range, prohibited for !root Add missing newline so that log is more readable: variable stack access var_off=(0xfffffffffffffff0; 0x4) off=-16 size=1 R2 stack pointer arithmetic goes out of range, prohibited for !root Fixes: `f1174f77b5` ("bpf/verifier: rework value tracking") Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:50:37 +01:00
Alexander Shishkin	b34abf24f2	perf, pt, coresight: Fix address filters for vmas with non-zero offset [ Upstream commit c60f83b813e5b25ccd5de7e8c8925c31b3aebcc1 ] Currently, the address range calculation for file-based filters works as long as the vma that maps the matching part of the object file starts from offset zero into the file (vm_pgoff==0). Otherwise, the resulting filter range would be off by vm_pgoff pages. Another related problem is that in case of a partially matching vma, that is, a vma that matches part of a filter region, the filter range size wouldn't be adjusted. Fix the arithmetics around address filter range calculations, taking into account vma offset, so that the entire calculation is done before the filter configuration is passed to the PMU drivers instead of having those drivers do the final bit of arithmetics. Based on the patch by Adrian Hunter <adrian.hunter.intel.com>. Reported-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Olsa <jolsa@redhat.com> Fixes: `375637bc52` ("perf/core: Introduce address range filtering") Link: http://lkml.kernel.org/r/20190215115655.63469-3-alexander.shishkin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:50:27 +01:00
Alexander Shishkin	673f190df0	perf: Copy parent's address filter offsets on clone [ Upstream commit 18736eef12137c59f60cc9f56dc5bea05c92e0eb ] When a child event is allocated in the inherit_event() path, the VMA based filter offsets are not copied from the parent, even though the address space mapping of the new task remains the same, which leads to no trace for the new task until exec. Reported-by: Mansour Alharthi <malharthi9@gatech.edu> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Olsa <jolsa@redhat.com> Fixes: `375637bc52` ("perf/core: Introduce address range filtering") Link: http://lkml.kernel.org/r/20190215115655.63469-2-alexander.shishkin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:50:27 +01:00
Shakeel Butt	3ed8ca4d29	fork, memcg: fix cached_stacks case [ Upstream commit ba4a45746c362b665e245c50b870615f02f34781 ] Commit 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge fail") fixes a crash caused due to failed memcg charge of the kernel stack. However the fix misses the cached_stacks case which this patch fixes. So, the same crash can happen if the memcg charge of a cached stack is failed. Link: http://lkml.kernel.org/r/20190102180145.57406-1-shakeelb@google.com Fixes: 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge fail") Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Rik van Riel <riel@surriel.com> Cc: Rik van Riel <riel@surriel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:50:11 +01:00
Rik van Riel	641164565b	fork,memcg: fix crash in free_thread_stack on memcg charge fail [ Upstream commit 5eed6f1dff87bfb5e545935def3843edf42800f2 ] Commit 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting") will result in fork failing if allocating a kernel stack for a task in dup_task_struct exceeds the kernel memory allowance for that cgroup. Unfortunately, it also results in a crash. This is due to the code jumping to free_stack and calling free_thread_stack when the memcg kernel stack charge fails, but without tsk->stack pointing at the freshly allocated stack. This in turn results in the vfree_atomic in free_thread_stack oopsing with a backtrace like this: #5 [ffffc900244efc88] die at ffffffff8101f0ab #6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86 #7 [ffffc900244efce0] general_protection at ffffffff818ff082 [exception RIP: llist_add_batch+7] RIP: ffffffff8150d487 RSP: ffffc900244efd98 RFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88085ef55980 RCX: 0000000000000000 RDX: ffff88085ef55980 RSI: 343834343531203a RDI: 343834343531203a RBP: ffffc900244efd98 R8: 0000000000000001 R9: ffff8808578c3600 R10: 0000000000000000 R11: 0000000000000001 R12: ffff88029f6c21c0 R13: 0000000000000286 R14: ffff880147759b00 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7 #9 [ffffc900244efdb8] copy_process at ffffffff81086e37 #10 [ffffc900244efe98] _do_fork at ffffffff810884e0 #11 [ffffc900244eff10] sys_vfork at ffffffff810887ff #12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43 RIP: 000000000049b948 RSP: 00007ffcdb307830 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 0000000000896030 RCX: 000000000049b948 RDX: 0000000000000000 RSI: 00007ffcdb307790 RDI: 00000000005d7421 RBP: 000000000067370f R8: 00007ffcdb3077b0 R9: 000000000001ed00 R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000040 R13: 000000000000000f R14: 0000000000000000 R15: 000000000088d018 ORIG_RAX: 000000000000003a CS: 0033 SS: 002b The simplest fix is to assign tsk->stack right where it is allocated. Link: http://lkml.kernel.org/r/20181214231726.7ee4843c@imladris.surriel.com Fixes: 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting") Signed-off-by: Rik van Riel <riel@surriel.com> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:50:08 +01:00
Marc Zyngier	c153dcfc29	genirq/debugfs: Reinstate full OF path for domain name [ Upstream commit 94967b55ebf3b603f2fe750ecedd896042585a1c ] On a DT based system, we use the of_node full name to name the corresponding irq domain. We expect that name to be unique, so so that domains with the same base name won't clash (this happens on multi-node topologies, for example). Since `a7e4cfb0a7` ("of/fdt: only store the device node basename in full_name"), of_node_full_name() lies and only returns the basename. This breaks the above requirement, and we end-up with only a subset of the domains in /sys/kernel/debug/irq/domains. Let's reinstate the feature by using the fancy new %pOF format specifier, which happens to do the right thing. Fixes: `a7e4cfb0a7` ("of/fdt: only store the device node basename in full_name") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20181001100522.180054-3-marc.zyngier@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-27 14:49:56 +01:00
Jeff Vander Stoep	025a1ee618	Revert "ANDROID: security,perf: Allow further restriction of perf_event_open" Unfork Android. This reverts commit `8e5e42d5ae`. Perf_event_paranoid=3 is no longer needed on Android. Access control of perf events is now done by selinux. See: https://patchwork.kernel.org/patch/11185793/ Bug: 120445712 Bug: 137092007 Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Change-Id: Iba493424174b30baff460caaa25a54a472c87bd4	2020-01-23 22:02:46 +00:00
Kusanagi Kouichi	3ccf82e899	BACKPORT: tracing: Remove unnecessary DEBUG_FS dependency Tracing replaced debugfs with tracefs. Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20191120104350753.EWCT.12796.ppp.dion.ne.jp@dmta0009.auone-net.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 0e4a459f56c32d3e52ae69a4b447db2f48a65f44) Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Id61dddcb804cf7a5d62d2d04a455d8b84097c967	2020-01-23 11:16:22 +01:00
Greg Kroah-Hartman	8cb4870403	This is the 4.19.98 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4pSYMACgkQONu9yGCS aT7Rkg/8C/AXaTp+2HxRj3ZO56uzpMBMb5duBzdzxnEnvFp+DIM7xxRX+NFI5CSK 4rjnxMd2tPsFtqiWo/bBCUcHh9gu5HJKOMFRZGaRYAXvJ/8hgahgzkBE00JiAB6r mrk9Y/pwcKxMFsAHtu3xM0oENeefXOmavVTHc9N3DQLd3hNuyTrPztBMFaDg8djR pSwh1uE2G+Z2UOdi2kXmHiEIG6NViIqp+qFYI5CUIyeKfvOEsR5nSQ97LyNQ+dUX qshARQFuk78+Ax+GNPTQXiWdzN7+SH5aw5frFtdhAN90F+XrRDj4ZXw+EkX+/M2J NZU9P/v41ESG8RWxbAZ6osAUkQ4Dgq2BQpdyRxNNjTchXc0Kr4K6BCKuhY6cGxS7 0PXPV7MsuAHYIrIvzG2lqif9gmknA0UrGVKuYJIZxBaWlHD2mEkFby0W0HIcBwir yKKK3fkFjmsGKYzh+VZVoGySWDbs7qYASWXHOCz0QCLb0CT8/ePbyxLdjY7u5KyX wDaDHXG9nm6Nu68HD/9CRnUkiK8dnsODZ0k+sBZfEa+xvHPJCdv3gnrf4SwU7dj7 ZyhO9XkFzncOJDoxYxiXTfI+zbU1ZhaDw7fk2PFvAI6P1xRS3m6rp8pDWp8iw/MX 92Sz1YzS68+otHLi+OBGxzu10PwMDtu2nUvqn68SYq6Rp0mZnnE= =2O94 -----END PGP SIGNATURE----- Merge 4.19.98 into android-4.19 Changes in 4.19.98 ARM: dts: meson8: fix the size of the PMU registers clk: qcom: gcc-sdm845: Add missing flag to votable GDSCs dt-bindings: reset: meson8b: fix duplicate reset IDs ARM: dts: imx6q-dhcom: fix rtc compatible clk: Don't try to enable critical clocks if prepare failed ASoC: msm8916-wcd-digital: Reset RX interpolation path after use iio: buffer: align the size of scan bytes to size of the largest element USB: serial: simple: Add Motorola Solutions TETRA MTP3xxx and MTP85xx USB: serial: option: Add support for Quectel RM500Q USB: serial: opticon: fix control-message timeouts USB: serial: option: add support for Quectel RM500Q in QDL mode USB: serial: suppress driver bind attributes USB: serial: ch341: handle unbound port at reset_resume USB: serial: io_edgeport: handle unbound ports on URB completion USB: serial: io_edgeport: add missing active-port sanity check USB: serial: keyspan: handle unbound ports USB: serial: quatech2: handle unbound ports scsi: fnic: fix invalid stack access scsi: mptfusion: Fix double fetch bug in ioctl ASoC: msm8916-wcd-analog: Fix selected events for MIC BIAS External1 ASoC: msm8916-wcd-analog: Fix MIC BIAS Internal1 ARM: dts: imx6q-dhcom: Fix SGTL5000 VDDIO regulator connection ALSA: dice: fix fallback from protocol extension into limited functionality ALSA: seq: Fix racy access for queue timer in proc read ALSA: usb-audio: fix sync-ep altsetting sanity check arm64: dts: allwinner: a64: olinuxino: Fix SDIO supply regulator Fix built-in early-load Intel microcode alignment block: fix an integer overflow in logical block size ARM: dts: am571x-idk: Fix gpios property to have the correct gpio number LSM: generalize flag passing to security_capable ptrace: reintroduce usage of subjective credentials in ptrace_has_cap() usb: core: hub: Improved device recognition on remote wakeup x86/resctrl: Fix an imbalance in domain_remove_cpu() x86/CPU/AMD: Ensure clearing of SME/SEV features is maintained x86/efistub: Disable paging at mixed mode entry drm/i915: Add missing include file <linux/math64.h> x86/resctrl: Fix potential memory leak perf hists: Fix variable name's inconsistency in hists__for_each() macro perf report: Fix incorrectly added dimensions as switch perf data file mm/shmem.c: thp, shmem: fix conflict of above-47bit hint address and PMD alignment mm: memcg/slab: call flush_memcg_workqueue() only if memcg workqueue is valid btrfs: rework arguments of btrfs_unlink_subvol btrfs: fix invalid removal of root ref btrfs: do not delete mismatched root refs btrfs: fix memory leak in qgroup accounting mm/page-writeback.c: avoid potential division by zero in wb_min_max_ratio() ARM: dts: imx6qdl: Add Engicam i.Core 1.5 MX6 ARM: dts: imx6q-icore-mipi: Use 1.5 version of i.Core MX6DL ARM: dts: imx7: Fix Toradex Colibri iMX7S 256MB NAND flash support net: stmmac: 16KB buffer must be 16 byte aligned net: stmmac: Enable 16KB buffer size mm/huge_memory.c: make __thp_get_unmapped_area static mm/huge_memory.c: thp: fix conflict of above-47bit hint address and PMD alignment arm64: dts: agilex/stratix10: fix pmu interrupt numbers bpf: Fix incorrect verifier simulation of ARSH under ALU32 cfg80211: fix deadlocks in autodisconnect work cfg80211: fix memory leak in cfg80211_cqm_rssi_update cfg80211: fix page refcount issue in A-MSDU decap netfilter: fix a use-after-free in mtype_destroy() netfilter: arp_tables: init netns pointer in xt_tgdtor_param struct netfilter: nft_tunnel: fix null-attribute check netfilter: nf_tables: remove WARN and add NLA_STRING upper limits netfilter: nf_tables: store transaction list locally while requesting module netfilter: nf_tables: fix flowtable list del corruption NFC: pn533: fix bulk-message timeout batman-adv: Fix DAT candidate selection on little endian systems macvlan: use skb_reset_mac_header() in macvlan_queue_xmit() hv_netvsc: Fix memory leak when removing rndis device net: dsa: tag_qca: fix doubled Tx statistics net: hns: fix soft lockup when there is not enough memory net: usb: lan78xx: limit size of local TSO packets net/wan/fsl_ucc_hdlc: fix out of bounds write on array utdm_info ptp: free ptp device pin descriptors properly r8152: add missing endpoint sanity check tcp: fix marked lost packets not being retransmitted sh_eth: check sh_eth_cpu_data::dual_port when dumping registers mlxsw: spectrum: Wipe xstats.backlog of down ports mlxsw: spectrum_qdisc: Include MC TCs in Qdisc counters xen/blkfront: Adjust indentation in xlvbd_alloc_gendisk tcp: refine rule to allow EPOLLOUT generation under mem pressure irqchip: Place CONFIG_SIFIVE_PLIC into the menu cw1200: Fix a signedness bug in cw1200_load_firmware() arm64: dts: meson-gxl-s905x-khadas-vim: fix gpio-keys-polled node cfg80211: check for set_wiphy_params tick/sched: Annotate lockless access to last_jiffies_update arm64: dts: marvell: Fix CP110 NAND controller node multi-line comment alignment Revert "arm64: dts: juno: add dma-ranges property" mtd: devices: fix mchp23k256 read and write drm/nouveau/bar/nv50: check bar1 vmm return value drm/nouveau/bar/gf100: ensure BAR is mapped drm/nouveau/mmu: qualify vmm during dtor reiserfs: fix handling of -EOPNOTSUPP in reiserfs_for_each_xattr scsi: esas2r: unlock on error in esas2r_nvram_read_direct() scsi: qla4xxx: fix double free bug scsi: bnx2i: fix potential use after free scsi: target: core: Fix a pr_debug() argument scsi: qla2xxx: Fix qla2x00_request_irqs() for MSI scsi: qla2xxx: fix rports not being mark as lost in sync fabric scan scsi: core: scsi_trace: Use get_unaligned_be*() perf probe: Fix wrong address verification clk: sprd: Use IS_ERR() to validate the return value of syscon_regmap_lookup_by_phandle() regulator: ab8500: Remove SYSCLKREQ from enum ab8505_regulator_id hwmon: (pmbus/ibm-cffps) Switch LEDs to blocking brightness call Linux 4.19.98 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I74a43a9e60734aec6d24b10374ba97de89172eca	2020-01-23 08:36:16 +01:00
Eric Dumazet	a31889a691	tick/sched: Annotate lockless access to last_jiffies_update commit de95a991bb72e009f47e0c4bbc90fc5f594588d5 upstream. syzbot (KCSAN) reported a data-race in tick_do_update_jiffies64(): BUG: KCSAN: data-race in tick_do_update_jiffies64 / tick_do_update_jiffies64 write to 0xffffffff8603d008 of 8 bytes by interrupt on cpu 1: tick_do_update_jiffies64+0x100/0x250 kernel/time/tick-sched.c:73 tick_sched_do_timer+0xd4/0xe0 kernel/time/tick-sched.c:138 tick_sched_timer+0x43/0xe0 kernel/time/tick-sched.c:1292 __run_hrtimer kernel/time/hrtimer.c:1514 [inline] __hrtimer_run_queues+0x274/0x5f0 kernel/time/hrtimer.c:1576 hrtimer_interrupt+0x22a/0x480 kernel/time/hrtimer.c:1638 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1110 [inline] smp_apic_timer_interrupt+0xdc/0x280 arch/x86/kernel/apic/apic.c:1135 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 arch_local_irq_restore arch/x86/include/asm/paravirt.h:756 [inline] kcsan_setup_watchpoint+0x1d4/0x460 kernel/kcsan/core.c:436 check_access kernel/kcsan/core.c:466 [inline] __tsan_read1 kernel/kcsan/core.c:593 [inline] __tsan_read1+0xc2/0x100 kernel/kcsan/core.c:593 kallsyms_expand_symbol.constprop.0+0x70/0x160 kernel/kallsyms.c:79 kallsyms_lookup_name+0x7f/0x120 kernel/kallsyms.c:170 insert_report_filterlist kernel/kcsan/debugfs.c:155 [inline] debugfs_write+0x14b/0x2d0 kernel/kcsan/debugfs.c:256 full_proxy_write+0xbd/0x100 fs/debugfs/file.c:225 __vfs_write+0x67/0xc0 fs/read_write.c:494 vfs_write fs/read_write.c:558 [inline] vfs_write+0x18a/0x390 fs/read_write.c:542 ksys_write+0xd5/0x1b0 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __x64_sys_write+0x4c/0x60 fs/read_write.c:620 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 read to 0xffffffff8603d008 of 8 bytes by task 0 on cpu 0: tick_do_update_jiffies64+0x2b/0x250 kernel/time/tick-sched.c:62 tick_nohz_update_jiffies kernel/time/tick-sched.c:505 [inline] tick_nohz_irq_enter kernel/time/tick-sched.c:1257 [inline] tick_irq_enter+0x139/0x1c0 kernel/time/tick-sched.c:1274 irq_enter+0x4f/0x60 kernel/softirq.c:354 entering_irq arch/x86/include/asm/apic.h:517 [inline] entering_ack_irq arch/x86/include/asm/apic.h:523 [inline] smp_apic_timer_interrupt+0x55/0x280 arch/x86/kernel/apic/apic.c:1133 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:60 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571 default_idle_call+0x1e/0x40 kernel/sched/idle.c:94 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x1af/0x280 kernel/sched/idle.c:263 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:355 rest_init+0xec/0xf6 init/main.c:452 arch_call_rest_init+0x17/0x37 start_kernel+0x838/0x85e init/main.c:786 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:490 x86_64_start_kernel+0x72/0x76 arch/x86/kernel/head64.c:471 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Use READ_ONCE() and WRITE_ONCE() to annotate this expected race. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191205045619.204946-1-edumazet@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-23 08:21:37 +01:00
Daniel Borkmann	042a3a6d93	bpf: Fix incorrect verifier simulation of ARSH under ALU32 commit 0af2ffc93a4b50948f9dad2786b7f1bd253bf0b9 upstream. Anatoly has been fuzzing with kBdysch harness and reported a hang in one of the outcomes: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (85) call bpf_get_socket_cookie#46 1: R0_w=invP(id=0) R10=fp0 1: (57) r0 &= 808464432 2: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 2: (14) w0 -= 810299440 3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 3: (c4) w0 s>>= 1 4: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 4: (76) if w0 s>= 0x30303030 goto pc+216 221: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 221: (95) exit processed 6 insns (limit 1000000) [...] Taking a closer look, the program was xlated as follows: # ./bpftool p d x i 12 0: (85) call bpf_get_socket_cookie#7800896 1: (bf) r6 = r0 2: (57) r6 &= 808464432 3: (14) w6 -= 810299440 4: (c4) w6 s>>= 1 5: (76) if w6 s>= 0x30303030 goto pc+216 6: (05) goto pc-1 7: (05) goto pc-1 8: (05) goto pc-1 [...] 220: (05) goto pc-1 221: (05) goto pc-1 222: (95) exit Meaning, the visible effect is very similar to f54c7898ed1c ("bpf: Fix precision tracking for unbounded scalars"), that is, the fall-through branch in the instruction 5 is considered to be never taken given the conclusion from the min/max bounds tracking in w6, and therefore the dead-code sanitation rewrites it as goto pc-1. However, real-life input disagrees with verification analysis since a soft-lockup was observed. The bug sits in the analysis of the ARSH. The definition is that we shift the target register value right by K bits through shifting in copies of its sign bit. In adjust_scalar_min_max_vals(), we do first coerce the register into 32 bit mode, same happens after simulating the operation. However, for the case of simulating the actual ARSH, we don't take the mode into account and act as if it's always 64 bit, but location of sign bit is different: dst_reg->smin_value >>= umin_val; dst_reg->smax_value >>= umin_val; dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val); Consider an unknown R0 where bpf_get_socket_cookie() (or others) would for example return 0xffff. With the above ARSH simulation, we'd see the following results: [...] 1: R1=ctx(id=0,off=0,imm=0) R2_w=invP65535 R10=fp0 1: (85) call bpf_get_socket_cookie#46 2: R0_w=invP(id=0) R10=fp0 2: (57) r0 &= 808464432 -> R0_runtime = 0x3030 3: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 3: (14) w0 -= 810299440 -> R0_runtime = 0xcfb40000 4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 (0xffffffff) 4: (c4) w0 s>>= 1 -> R0_runtime = 0xe7da0000 5: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 (0x67c00000) (0x7ffbfff8) [...] In insn 3, we have a runtime value of 0xcfb40000, which is '1100 1111 1011 0100 0000 0000 0000 0000', the result after the shift has 0xe7da0000 that is '1110 0111 1101 1010 0000 0000 0000 0000', where the sign bit is correctly retained in 32 bit mode. In insn4, the umax was 0xffffffff, and changed into 0x7ffbfff8 after the shift, that is, '0111 1111 1111 1011 1111 1111 1111 1000' and means here that the simulation didn't retain the sign bit. With above logic, the updates happen on the 64 bit min/max bounds and given we coerced the register, the sign bits of the bounds are cleared as well, meaning, we need to force the simulation into s32 space for 32 bit alu mode. Verification after the fix below. We're first analyzing the fall-through branch on 32 bit signed >= test eventually leading to rejection of the program in this specific case: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r2 = 808464432 1: R1=ctx(id=0,off=0,imm=0) R2_w=invP808464432 R10=fp0 1: (85) call bpf_get_socket_cookie#46 2: R0_w=invP(id=0) R10=fp0 2: (bf) r6 = r0 3: R0_w=invP(id=0) R6_w=invP(id=0) R10=fp0 3: (57) r6 &= 808464432 4: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 4: (14) w6 -= 810299440 5: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 5: (c4) w6 s>>= 1 6: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0 (0x67c00000) (0xfffbfff8) 6: (76) if w6 s>= 0x30303030 goto pc+216 7: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0 7: (30) r0 = (u8 )skb[808464432] BPF_LD_[ABS\|IND] uses reserved fields processed 8 insns (limit 1000000) [...] Fixes: `9cbe1f5a32` ("bpf/verifier: improve register value range tracking with ARSH") Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115204733.16648-1-daniel@iogearbox.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-23 08:21:32 +01:00
Christian Brauner	21cd79a27a	ptrace: reintroduce usage of subjective credentials in ptrace_has_cap() commit 6b3ad6649a4c75504edeba242d3fd36b3096a57f upstream. Commit `69f594a389` ("ptrace: do not audit capability check when outputing /proc/pid/stat") introduced the ability to opt out of audit messages for accesses to various proc files since they are not violations of policy. While doing so it somehow switched the check from ns_capable() to has_ns_capability{_noaudit}(). That means it switched from checking the subjective credentials of the task to using the objective credentials. This is wrong since. ptrace_has_cap() is currently only used in ptrace_may_access() And is used to check whether the calling task (subject) has the CAP_SYS_PTRACE capability in the provided user namespace to operate on the target task (object). According to the cred.h comments this would mean the subjective credentials of the calling task need to be used. This switches ptrace_has_cap() to use security_capable(). Because we only call ptrace_has_cap() in ptrace_may_access() and in there we already have a stable reference to the calling task's creds under rcu_read_lock() there's no need to go through another series of dereferences and rcu locking done in ns_capable{_noaudit}(). As one example where this might be particularly problematic, Jann pointed out that in combination with the upcoming IORING_OP_OPENAT feature, this bug might allow unprivileged users to bypass the capability checks while asynchronously opening files like /proc/*/mem, because the capability checks for this would be performed against kernel credentials. To illustrate on the former point about this being exploitable: When io_uring creates a new context it records the subjective credentials of the caller. Later on, when it starts to do work it creates a kernel thread and registers a callback. The callback runs with kernel creds for ktask->real_cred and ktask->cred. To prevent this from becoming a full-blown 0-day io_uring will call override_cred() and override ktask->cred with the subjective credentials of the creator of the io_uring instance. With ptrace_has_cap() currently looking at ktask->real_cred this override will be ineffective and the caller will be able to open arbitray proc files as mentioned above. Luckily, this is currently not exploitable but will turn into a 0-day once IORING_OP_OPENAT{2} land in v5.6. Fix it now! Cc: Oleg Nesterov <oleg@redhat.com> Cc: Eric Paris <eparis@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Jann Horn <jannh@google.com> Fixes: `69f594a389` ("ptrace: do not audit capability check when outputing /proc/pid/stat") Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-23 08:21:29 +01:00
Micah Morton	87ca9aaf0c	LSM: generalize flag passing to security_capable [ Upstream commit c1a85a00ea66cb6f0bd0f14e47c28c2b0999799f ] This patch provides a general mechanism for passing flags to the security_capable LSM hook. It replaces the specific 'audit' flag that is used to tell security_capable whether it should log an audit message for the given capability check. The reason for generalizing this flag passing is so we can add an additional flag that signifies whether security_capable is being called by a setid syscall (which is needed by the proposed SafeSetID LSM). Signed-off-by: Micah Morton <mortonm@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.morris@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-23 08:21:29 +01:00
Andrey Konovalov	a7ed6cc7b7	UPSTREAM: kcov: remote coverage support (Upstream commit eec028c9386ed1a692aa01a85b55952202b41619.) Patch series " kcov: collect coverage from usb and vhost", v3. This patchset extends kcov to allow collecting coverage from backgound kernel threads. This extension requires custom annotations for each of the places where coverage collection is desired. This patchset implements this for hub events in the USB subsystem and for vhost workers. See the first patch description for details about the kcov extension. The other two patches apply this kcov extension to USB and vhost. Examples of other subsystems that might potentially benefit from this when custom annotations are added (the list is based on process_one_work() callers for bugs recently reported by syzbot): 1. fs: writeback wb_workfn() worker, 2. net: addrconf_dad_work()/addrconf_verify_work() workers, 3. net: neigh_periodic_work() worker, 4. net/p9: p9_write_work()/p9_read_work() workers, 5. block: blk_mq_run_work_fn() worker. These patches have been used to enable coverage-guided USB fuzzing with syzkaller for the last few years, see the details here: https://github.com/google/syzkaller/blob/master/docs/linux/external_fuzzing_usb.md This patchset has been pushed to the public Linux kernel Gerrit instance: https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/1524 This patch (of 3): Add background thread coverage collection ability to kcov. With KCOV_ENABLE coverage is collected only for syscalls that are issued from the current process. With KCOV_REMOTE_ENABLE it's possible to collect coverage for arbitrary parts of the kernel code, provided that those parts are annotated with kcov_remote_start()/kcov_remote_stop(). This allows to collect coverage from two types of kernel background threads: the global ones, that are spawned during kernel boot in a limited number of instances (e.g. one USB hub_event() worker thread is spawned per USB HCD); and the local ones, that are spawned when a user interacts with some kernel interface (e.g. vhost workers). To enable collecting coverage from a global background thread, a unique global handle must be assigned and passed to the corresponding kcov_remote_start() call. Then a userspace process can pass a list of such handles to the KCOV_REMOTE_ENABLE ioctl in the handles array field of the kcov_remote_arg struct. This will attach the used kcov device to the code sections, that are referenced by those handles. Since there might be many local background threads spawned from different userspace processes, we can't use a single global handle per annotation. Instead, the userspace process passes a non-zero handle through the common_handle field of the kcov_remote_arg struct. This common handle gets saved to the kcov_handle field in the current task_struct and needs to be passed to the newly spawned threads via custom annotations. Those threads should in turn be annotated with kcov_remote_start()/kcov_remote_stop(). Internally kcov stores handles as u64 integers. The top byte of a handle is used to denote the id of a subsystem that this handle belongs to, and the lower 4 bytes are used to denote the id of a thread instance within that subsystem. A reserved value 0 is used as a subsystem id for common handles as they don't belong to a particular subsystem. The bytes 4-7 are currently reserved and must be zero. In the future the number of bytes used for the subsystem or handle ids might be increased. When a particular userspace process collects coverage by via a common handle, kcov will collect coverage for each code section that is annotated to use the common handle obtained as kcov_handle from the current task_struct. However non common handles allow to collect coverage selectively from different subsystems. Link: http://lkml.kernel.org/r/e90e315426a384207edbec1d6aa89e43008e4caf.1572366574.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: David Windsor <dwindsor@gmail.com> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Anders Roxell <anders.roxell@linaro.org> Cc: Alexander Potapenko <glider@google.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 147413187 Change-Id: I868c4846a412bfbae16086017e113813571df377	2020-01-15 14:51:23 +00:00
Elena Reshetova	7073504599	UPSTREAM: kcov: convert kcov.refcount to refcount_t (Upstream commit 39e07cb60860e3162fc377380b8a60409315681e.) atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable kcov.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the kcov.refcount it might make a difference in following places: - kcov_put(): decrement in refcount_dec_and_test() only provides RELEASE ordering and control dependency on success vs. fully ordered atomic counterpart Link: http://lkml.kernel.org/r/1547634429-772-1-git-send-email-elena.reshetova@intel.com Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Bug: 147413187 Change-Id: Ie22524d133af5ab86dcc5cadde4bdca931625d3a	2020-01-15 14:50:51 +00:00
Greg Kroah-Hartman	8e39dd1479	UPSTREAM: kcov: no need to check return value of debugfs_create functions (Upstream commit ec9672d57670d495404f36ab8b651bfefc0ea10b.) When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Link: http://lkml.kernel.org/r/20190122152151.16139-46-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Change-Id: I8ed5dc6aeba3dda8b91ceea4fed5cd9ef058461f Bug: 147413187	2020-01-15 14:50:31 +00:00
Greg Kroah-Hartman	21e3a71314	This is the 4.19.96 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4eEV4ACgkQONu9yGCS aT6D6Q//W205i6vBtK440pV659K29fw2+XYhJnG+3kkcXJ1GBEHlS7ddZCzV1lua uOLYxG2J0A3D0WC8S78XwE1ZXe1YTor647qW0H6pDeZIj/ID2x4TEN6rZq0YlCb7 JLx6d+yb7MkK9fHGXVyTZbKESLr7WimDnZckWbO4xWTqtlh195ygMYxdwTxa97ZP rMJXR/qS/n5rlgQE8BEFsPZBNcsSVKecIi8ibSWu8zDJerBb2h0TuUMKvn1fteG9 O69y8MeoJSWGTsorii+f7toGKhECFRWSrDj9GhK4SFTj3eV5evrkozAWJpXuzIOA 9ou9OHUQnJxpXaVyBxiUDkP58tDYYddsrSCHFSNPtPcsTJSxDUEwZBDXr7V8TyVq axjoXgHwXOg9qi5IvEOvOBIahTX9OdKvR578eLQIFinZTRmZGKl2XW08olybcUw3 ZpTBW6A97Xmhs4szFOQfgxRrc3n0F4+Mk9tFp2HFzSu23y6A/wwctIt4xoOWJ8Nq ouZbGwI3Em6rtTWDufNJbagm5QYLEcsS5F4Ala6uGIxBXs8mD7dAisILEXxFfSAn aPzPvsKh1vMhYBFmTStsowxy1pjeuUMHukrBQLDtmhzR4aBs5GXCJi/Iq2ojC123 LLAnG/ytLyYyVUxpYyNMsconLITZ1iS3iUAIcUHeYcEFZ3JAV3E= =D3a0 -----END PGP SIGNATURE----- Merge 4.19.96 into android-4.19 Changes in 4.19.96 chardev: Avoid potential use-after-free in 'chrdev_open()' i2c: fix bus recovery stop mode timing usb: chipidea: host: Disable port power only if previously enabled ALSA: usb-audio: Apply the sample rate quirk for Bose Companion 5 ALSA: hda/realtek - Add new codec supported for ALCS1200A ALSA: hda/realtek - Set EAPD control to default for ALC222 ALSA: hda/realtek - Add quirk for the bass speaker on Lenovo Yoga X1 7th gen kernel/trace: Fix do not unregister tracepoints when register sched_migrate_task fail tracing: Have stack tracer compile when MCOUNT_INSN_SIZE is not defined tracing: Change offset type to s32 in preempt/irq tracepoints HID: Fix slab-out-of-bounds read in hid_field_extract HID: uhid: Fix returning EPOLLOUT from uhid_char_poll HID: hid-input: clear unmapped usages Input: add safety guards to input_set_keycode() Input: input_event - fix struct padding on sparc64 drm/sun4i: tcon: Set RGB DCLK min. divider based on hardware model drm/fb-helper: Round up bits_per_pixel if possible drm/dp_mst: correct the shifting in DP_REMOTE_I2C_READ can: kvaser_usb: fix interface sanity check can: gs_usb: gs_usb_probe(): use descriptors of current altsetting can: mscan: mscan_rx_poll(): fix rx path lockup when returning from polling to irq mode can: can_dropped_invalid_skb(): ensure an initialized headroom in outgoing CAN sk_buffs gpiolib: acpi: Turn dmi_system_id table into a generic quirk table gpiolib: acpi: Add honor_wakeup module-option + quirk mechanism staging: vt6656: set usb_set_intfdata on driver fail. USB: serial: option: add ZLP support for 0x1bc7/0x9010 usb: musb: fix idling for suspend after disconnect interrupt usb: musb: Disable pullup at init usb: musb: dma: Correct parameter passed to IRQ handler staging: comedi: adv_pci1710: fix AI channels 16-31 for PCI-1713 staging: rtl8188eu: Add device code for TP-Link TL-WN727N v5.21 serdev: Don't claim unsupported ACPI serial devices tty: link tty and port before configuring it as console tty: always relink the port mwifiex: fix possible heap overflow in mwifiex_process_country_ie() mwifiex: pcie: Fix memory leak in mwifiex_pcie_alloc_cmdrsp_buf scsi: bfa: release allocated memory in case of error rtl8xxxu: prevent leaking urb ath10k: fix memory leak HID: hiddev: fix mess in hiddev_open() USB: Fix: Don't skip endpoint descriptors with maxpacket=0 phy: cpcap-usb: Fix error path when no host driver is loaded phy: cpcap-usb: Fix flakey host idling and enumerating of devices netfilter: arp_tables: init netns pointer in xt_tgchk_param struct netfilter: conntrack: dccp, sctp: handle null timeout argument netfilter: ipset: avoid null deref when IPSET_ATTR_LINENO is present drm/i915/gen9: Clear residual context state on context switch Linux 4.19.96 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I261d6a9e90a5461701f74e3ca1482e3c00939f3e	2020-01-15 08:57:09 +01:00
Steven Rostedt (VMware)	c919096552	tracing: Have stack tracer compile when MCOUNT_INSN_SIZE is not defined commit b8299d362d0837ae39e87e9019ebe6b736e0f035 upstream. On some archs with some configurations, MCOUNT_INSN_SIZE is not defined, and this makes the stack tracer fail to compile. Just define it to zero in this case. Link: https://lore.kernel.org/r/202001020219.zvE3vsty%lkp@intel.com Cc: stable@vger.kernel.org Fixes: `4df297129f` ("tracing: Remove most or all of stack tracer stack size from stack_max_size") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-14 20:06:59 +01:00
Kaitao Cheng	5ab4bb7b40	kernel/trace: Fix do not unregister tracepoints when register sched_migrate_task fail commit 50f9ad607ea891a9308e67b81f774c71736d1098 upstream. In the function, if register_trace_sched_migrate_task() returns error, sched_switch/sched_wakeup_new/sched_wakeup won't unregister. That is why fail_deprobe_sched_switch was added. Link: http://lkml.kernel.org/r/20191231133530.2794-1-pilgrimtao@gmail.com Cc: stable@vger.kernel.org Fixes: `478142c39c` ("tracing: do not grab lock in wakeup latency function tracing") Signed-off-by: Kaitao Cheng <pilgrimtao@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-14 20:06:59 +01:00
Greg Kroah-Hartman	5da11144c3	This is the 4.19.95 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4bAEoACgkQONu9yGCS aT70DQ//YgSd3JJ9/d7TLy+Mm8GrlZIZWgRrz8YZdKcJBG+l/c+m6FR0uLDw95nf zIFq1GY8DwS3djuNkxPoz8yvIgqoHuSjEUo+YF8v71ZdnGCXIt6SCoKbugAH+azp zTEJ4lU3/Wrc1Gh4w5zeSgdQVPIJBQ+d7pKccZHOJ0DFwbU3hQ69vqXcaxrFuhop vqKvNNEeCT2l1AxgAhKwNhceFL3Rb/2HxAjUED68ueY/EZ0/OsoinboBu2riSKNu NWZTPq7B2Kht19aV48FThwksEvkelz72gkSzKvrsT03tDtNewQ7C/mreUqpU6VZE lrcIWOHjXMQyk3g3XZfG3j8ppzHvZPaG/WEqmnJV8txjtaRU8hvj3Q95kHPheTv6 O/Ds4OpHBaS5g6+dmsUmtfSrbrhm3KKXMn0IJvyYkC/VY0RVdLKTZu0TM5ZS5id/ zW4AZWWzephUID9LAwRrTWvfGKMBK6gEiv2AtnY4XRiEMYEFw45uP97yWcUCg89L a3BCbhhiO4tMqL/lf7KD4vPbXZsRgDM3q1wLitvM2KCVVSA32XGRMT9x8A6uwTJl OLL39NSi8h8rqn5S22AowKcR/3VakRNw9pf5Y6AH5xOnP3R/OBHzYJNho2DUc8OI 6p+zeUZkDEIsFI1wdJSaaSWsNdpJj3Sw6SdB2QJI3c782F4pFUA= =RCVS -----END PGP SIGNATURE----- Merge 4.19.95 into android-4.19 Changes in 4.19.95 USB: dummy-hcd: use usb_urb_dir_in instead of usb_pipein USB: dummy-hcd: increase max number of devices to 32 bpf: Fix passing modified ctx to ld/abs/ind instruction regulator: fix use after free issue ASoC: max98090: fix possible race conditions locking/spinlock/debug: Fix various data races netfilter: ctnetlink: netns exit must wait for callbacks mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame() libtraceevent: Fix lib installation with O= x86/efi: Update e820 with reserved EFI boot services data to fix kexec breakage ASoC: Intel: bytcr_rt5640: Update quirk for Teclast X89 efi/gop: Return EFI_NOT_FOUND if there are no usable GOPs efi/gop: Return EFI_SUCCESS if a usable GOP was found efi/gop: Fix memory leak in __gop_query32/64() ARM: dts: imx6ul: imx6ul-14x14-evk.dtsi: Fix SPI NOR probing ARM: vexpress: Set-up shared OPP table instead of individual for each CPU netfilter: uapi: Avoid undefined left-shift in xt_sctp.h netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets netfilter: nf_tables: validate NFT_SET_ELEM_INTERVAL_END netfilter: nf_tables: validate NFT_DATA_VALUE after nft_data_init() ARM: dts: BCM5301X: Fix MDIO node address/size cells selftests/ftrace: Fix multiple kprobe testcase ARM: dts: Cygnus: Fix MDIO node address/size cells spi: spi-cavium-thunderx: Add missing pci_release_regions() ASoC: topology: Check return value for soc_tplg_pcm_create() ARM: dts: bcm283x: Fix critical trip point bnxt_en: Return error if FW returns more data than dump length bpf, mips: Limit to 33 tail calls spi: spi-ti-qspi: Fix a bug when accessing non default CS ARM: dts: am437x-gp/epos-evm: fix panel compatible samples: bpf: Replace symbol compare of trace_event samples: bpf: fix syscall_tp due to unused syscall powerpc: Ensure that swiotlb buffer is allocated from low memory btrfs: Fix error messages in qgroup_rescan_init bpf: Clear skb->tstamp in bpf_redirect when necessary bnx2x: Do not handle requests from VFs after parity bnx2x: Fix logic to get total no. of PFs per engine cxgb4: Fix kernel panic while accessing sge_info net: usb: lan78xx: Fix error message format specifier parisc: add missing __init annotation rfkill: Fix incorrect check to avoid NULL pointer dereference ASoC: wm8962: fix lambda value regulator: rn5t618: fix module aliases iommu/iova: Init the struct iova to fix the possible memleak kconfig: don't crash on NULL expressions in expr_eq() perf/x86/intel: Fix PT PMI handling fs: avoid softlockups in s_inodes iterators net: stmmac: Do not accept invalid MTU values net: stmmac: xgmac: Clear previous RX buffer size net: stmmac: RX buffer size must be 16 byte aligned net: stmmac: Always arm TX Timer at end of transmission start s390/purgatory: do not build purgatory with kcov, kasan and friends drm/exynos: gsc: add missed component_del s390/dasd/cio: Interpret ccw_device_get_mdc return value correctly s390/dasd: fix memleak in path handling error case block: fix memleak when __blk_rq_map_user_iov() is failed parisc: Fix compiler warnings in debug_core.c llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c) hv_netvsc: Fix unwanted rx_table reset powerpc/vcpu: Assume dedicated processors as non-preempt powerpc/spinlocks: Include correct header for static key cpufreq: imx6q: read OCOTP through nvmem for imx6ul/imx6ull ARM: dts: imx6ul: use nvmem-cells for cpu speed grading PCI/switchtec: Read all 64 bits of part_event_bitmap gtp: fix bad unlock balance in gtp_encap_enable_socket macvlan: do not assume mac_header is set in macvlan_broadcast() net: dsa: mv88e6xxx: Preserve priority when setting CPU port. net: stmmac: dwmac-sun8i: Allow all RGMII modes net: stmmac: dwmac-sunxi: Allow all RGMII modes net: usb: lan78xx: fix possible skb leak pkt_sched: fq: do not accept silly TCA_FQ_QUANTUM sch_cake: avoid possible divide by zero in cake_enqueue() sctp: free cmd->obj.chunk for the unprocessed SCTP_CMD_REPLY tcp: fix "old stuff" D-SACK causing SACK to be treated as D-SACK vxlan: fix tos value before xmit vlan: fix memory leak in vlan_dev_set_egress_priority vlan: vlan_changelink() should propagate errors mlxsw: spectrum_qdisc: Ignore grafting of invisible FIFO net: sch_prio: When ungrafting, replace with FIFO usb: dwc3: gadget: Fix request complete check USB: core: fix check for duplicate endpoints USB: serial: option: add Telit ME910G1 0x110a composition usb: missing parentheses in USE_NEW_SCHEME Linux 4.19.95 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I611f034c58f975a5d2b70eed0a0884f1ff5b09cc	2020-01-12 12:29:19 +01:00
Marco Elver	c7673f0160	locking/spinlock/debug: Fix various data races [ Upstream commit 1a365e822372ba24c9da0822bc583894f6f3d821 ] This fixes various data races in spinlock_debug. By testing with KCSAN, it is observable that the console gets spammed with data races reports, suggesting these are extremely frequent. Example data race report: read to 0xffff8ab24f403c48 of 4 bytes by task 221 on cpu 2: debug_spin_lock_before kernel/locking/spinlock_debug.c:85 [inline] do_raw_spin_lock+0x9b/0x210 kernel/locking/spinlock_debug.c:112 __raw_spin_lock include/linux/spinlock_api_smp.h:143 [inline] _raw_spin_lock+0x39/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] get_partial_node.isra.0.part.0+0x32/0x2f0 mm/slub.c:1873 get_partial_node mm/slub.c:1870 [inline] <snip> write to 0xffff8ab24f403c48 of 4 bytes by task 167 on cpu 3: debug_spin_unlock kernel/locking/spinlock_debug.c:103 [inline] do_raw_spin_unlock+0xc9/0x1a0 kernel/locking/spinlock_debug.c:138 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:159 [inline] _raw_spin_unlock_irqrestore+0x2d/0x50 kernel/locking/spinlock.c:191 spin_unlock_irqrestore include/linux/spinlock.h:393 [inline] free_debug_processing+0x1b3/0x210 mm/slub.c:1214 __slab_free+0x292/0x400 mm/slub.c:2864 <snip> As a side-effect, with KCSAN, this eventually locks up the console, most likely due to deadlock, e.g. .. -> printk lock -> spinlock_debug -> KCSAN detects data race -> kcsan_print_report() -> printk lock -> deadlock. This fix will 1) avoid the data races, and 2) allow using lock debugging together with KCSAN. Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Marco Elver <elver@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: https://lkml.kernel.org/r/20191120155715.28089-1-elver@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-01-12 12:17:05 +01:00
Daniel Borkmann	f70280ee89	bpf: Fix passing modified ctx to ld/abs/ind instruction commit 6d4f151acf9a4f6fab09b615f246c717ddedcf0c upstream. Anatoly has been fuzzing with kBdysch harness and reported a KASAN slab oob in one of the outcomes: [...] [ 77.359642] BUG: KASAN: slab-out-of-bounds in bpf_skb_load_helper_8_no_cache+0x71/0x130 [ 77.360463] Read of size 4 at addr ffff8880679bac68 by task bpf/406 [ 77.361119] [ 77.361289] CPU: 2 PID: 406 Comm: bpf Not tainted 5.5.0-rc2-xfstests-00157-g2187f215eba #1 [ 77.362134] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 77.362984] Call Trace: [ 77.363249] dump_stack+0x97/0xe0 [ 77.363603] print_address_description.constprop.0+0x1d/0x220 [ 77.364251] ? bpf_skb_load_helper_8_no_cache+0x71/0x130 [ 77.365030] ? bpf_skb_load_helper_8_no_cache+0x71/0x130 [ 77.365860] __kasan_report.cold+0x37/0x7b [ 77.366365] ? bpf_skb_load_helper_8_no_cache+0x71/0x130 [ 77.366940] kasan_report+0xe/0x20 [ 77.367295] bpf_skb_load_helper_8_no_cache+0x71/0x130 [ 77.367821] ? bpf_skb_load_helper_8+0xf0/0xf0 [ 77.368278] ? mark_lock+0xa3/0x9b0 [ 77.368641] ? kvm_sched_clock_read+0x14/0x30 [ 77.369096] ? sched_clock+0x5/0x10 [ 77.369460] ? sched_clock_cpu+0x18/0x110 [ 77.369876] ? bpf_skb_load_helper_8+0xf0/0xf0 [ 77.370330] ___bpf_prog_run+0x16c0/0x28f0 [ 77.370755] __bpf_prog_run32+0x83/0xc0 [ 77.371153] ? __bpf_prog_run64+0xc0/0xc0 [ 77.371568] ? match_held_lock+0x1b/0x230 [ 77.371984] ? rcu_read_lock_held+0xa1/0xb0 [ 77.372416] ? rcu_is_watching+0x34/0x50 [ 77.372826] sk_filter_trim_cap+0x17c/0x4d0 [ 77.373259] ? sock_kzfree_s+0x40/0x40 [ 77.373648] ? __get_filter+0x150/0x150 [ 77.374059] ? skb_copy_datagram_from_iter+0x80/0x280 [ 77.374581] ? do_raw_spin_unlock+0xa5/0x140 [ 77.375025] unix_dgram_sendmsg+0x33a/0xa70 [ 77.375459] ? do_raw_spin_lock+0x1d0/0x1d0 [ 77.375893] ? unix_peer_get+0xa0/0xa0 [ 77.376287] ? __fget_light+0xa4/0xf0 [ 77.376670] __sys_sendto+0x265/0x280 [ 77.377056] ? __ia32_sys_getpeername+0x50/0x50 [ 77.377523] ? lock_downgrade+0x350/0x350 [ 77.377940] ? __sys_setsockopt+0x2a6/0x2c0 [ 77.378374] ? sock_read_iter+0x240/0x240 [ 77.378789] ? __sys_socketpair+0x22a/0x300 [ 77.379221] ? __ia32_sys_socket+0x50/0x50 [ 77.379649] ? mark_held_locks+0x1d/0x90 [ 77.380059] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 77.380536] __x64_sys_sendto+0x74/0x90 [ 77.380938] do_syscall_64+0x68/0x2a0 [ 77.381324] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 77.381878] RIP: 0033:0x44c070 [...] After further debugging, turns out while in case of other helper functions we disallow passing modified ctx, the special case of ld/abs/ind instruction which has similar semantics (except r6 being the ctx argument) is missing such check. Modified ctx is impossible here as bpf_skb_load_helper_8_no_cache() and others are expecting skb fields in original position, hence, add check_ctx_reg() to reject any modified ctx. Issue was first introduced back in `f1174f77b5` ("bpf/verifier: rework value tracking"). Fixes: `f1174f77b5` ("bpf/verifier: rework value tracking") Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200106215157.3553-1-daniel@iogearbox.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-12 12:17:04 +01:00
Joel Fernandes (Google)	89ae5a7cad	BACKPORT: perf_event: Add support for LSM and SELinux checks In current mainline, the degree of access to perf_event_open(2) system call depends on the perf_event_paranoid sysctl. This has a number of limitations: 1. The sysctl is only a single value. Many types of accesses are controlled based on the single value thus making the control very limited and coarse grained. 2. The sysctl is global, so if the sysctl is changed, then that means all processes get access to perf_event_open(2) opening the door to security issues. This patch adds LSM and SELinux access checking which will be used in Android to access perf_event_open(2) for the purposes of attaching BPF programs to tracepoints, perf profiling and other operations from userspace. These operations are intended for production systems. 5 new LSM hooks are added: 1. perf_event_open: This controls access during the perf_event_open(2) syscall itself. The hook is called from all the places that the perf_event_paranoid sysctl is checked to keep it consistent with the systctl. The hook gets passed a 'type' argument which controls CPU, kernel and tracepoint accesses (in this context, CPU, kernel and tracepoint have the same semantics as the perf_event_paranoid sysctl). Additionally, I added an 'open' type which is similar to perf_event_paranoid sysctl == 3 patch carried in Android and several other distros but was rejected in mainline [1] in 2016. 2. perf_event_alloc: This allocates a new security object for the event which stores the current SID within the event. It will be useful when the perf event's FD is passed through IPC to another process which may try to read the FD. Appropriate security checks will limit access. 3. perf_event_free: Called when the event is closed. 4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event. 5. perf_event_write: Called from the ioctl(2) syscalls for the event. [1] https://lwn.net/Articles/696240/ Since Peter had suggest LSM hooks in 2016 [1], I am adding his Suggested-by tag below. To use this patch, we set the perf_event_paranoid sysctl to -1 and then apply selinux checking as appropriate (default deny everything, and then add policy rules to give access to domains that need it). In the future we can remove the perf_event_paranoid sysctl altogether. Suggested-by: Peter Zijlstra <peterz@infradead.org> Co-developed-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: James Morris <jmorris@namei.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: rostedt@goodmis.org Cc: Yonghong Song <yhs@fb.com> Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: jeffv@google.com Cc: Jiri Olsa <jolsa@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: primiano@google.com Cc: Song Liu <songliubraving@fb.com> Cc: rsavitski@google.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Matthew Garrett <matthewgarrett@google.com> Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org (cherry picked from commit da97e18458fb42d7c00fac5fd1c56a3896ec666e) [ Ryan Savitski: Resolved conflicts with existing code, and folded in upstream ae79d5588a04 (perf/core: Fix !CONFIG_PERF_EVENTS build warnings and failures). This should fix the build errors from the previous backport attempt, where certain configurations would end up with functions referring to the perf_event struct prior to its declaration (and therefore declaring it with a different scope). ] Bug: 137092007 Signed-off-by: Ryan Savitski <rsavitski@google.com> Change-Id: Ief8c669083c81f4ea2fa75d5c0d947d19ea741b3	2020-01-10 15:18:52 +00:00
Greg Kroah-Hartman	ff0e96e80f	This is the 4.19.94 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4W8A4ACgkQONu9yGCS aT5ZcBAAha0GMcpxm1ettNVMXUVD/Df2pntc3x2G1T+dtI89YwIilJcdQBpbDGB6 6oNRpnopc+/ynm820SMlhjBNE8KlDzHS3Tmsn1lplru0yOqZMFcFlHSESCAA0b4E T21KwQ4rLZTzW4LvTf//4WpJZD1RnVrwKkbgkci9kvCjZsdh2GMK3XkBeVBUdXuX 3gvW+9zsgmkU3Bhk5Mi8JUmOw2yY5sJt2tDmIyxOtBknAo1TK6n4kqd+NgjfsdcI cnNTstDU0Ikmi4UBOZGDmey0THtHdvi/oM3DUkzHtZ68W0rg/gPu4nDR+Fx3sKvo y5bI10j4W16PKXyxVehel+lD8XmIV/+zSerS0enGjijBPZKI9FTlGEuczk0x7sj+ wqMh3WkkPig2bQPrCOIjkA5VW4n/ZL07cjd1nNeZ48MkvA/3k47o4vDV/lPE88ZT 49qqaJvZ3kAdqtV1pfzpQtrG1Pp8YPcEHAMYIM/6jb6poCro5vFtuRX4tzj2fRSf u7jSVPDt7ED9SgHPe+RrGWVIx2/iVnr5mVdi53rjWTWfeTdNIY5bUs/wsTde1k99 9bnAhwD4ZbFrO240MMYPWpOCr8kl0LXAeyQ104m7ONbMRnLoRp4sQCae252jyHFD Qxgez5cDwDQnj2W4/SdXSWytioTnyVHsI89FkWw+f/IU8AsbBuw= =mmeT -----END PGP SIGNATURE----- Merge 4.19.94 into android-4.19 Changes in 4.19.94 nvme_fc: add module to ops template to allow module references nvme-fc: fix double-free scenarios on hw queues drm/amdgpu: add check before enabling/disabling broadcast mode drm/amdgpu: add cache flush workaround to gfx8 emit_fence drm/amd/display: Fixed kernel panic when booting with DP-to-HDMI dongle iio: adc: max9611: Fix too short conversion time delay PM / devfreq: Fix devfreq_notifier_call returning errno PM / devfreq: Set scaling_max_freq to max on OPP notifier error PM / devfreq: Don't fail devfreq_dev_release if not in list afs: Fix afs_find_server lookups for ipv4 peers afs: Fix SELinux setting security label on /afs RDMA/cma: add missed unregister_pernet_subsys in init failure rxe: correctly calculate iCRC for unaligned payloads scsi: lpfc: Fix memory leak on lpfc_bsg_write_ebuf_set func scsi: qla2xxx: Drop superfluous INIT_WORK of del_work scsi: qla2xxx: Don't call qlt_async_event twice scsi: qla2xxx: Fix PLOGI payload and ELS IOCB dump length scsi: qla2xxx: Configure local loop for N2N target scsi: qla2xxx: Send Notify ACK after N2N PLOGI scsi: qla2xxx: Ignore PORT UPDATE after N2N PLOGI scsi: iscsi: qla4xxx: fix double free in probe scsi: libsas: stop discovering if oob mode is disconnected drm/nouveau: Move the declaration of struct nouveau_conn_atom up a bit usb: gadget: fix wrong endpoint desc net: make socket read/write_iter() honor IOCB_NOWAIT afs: Fix creation calls in the dynamic root to fail with EOPNOTSUPP md: raid1: check rdev before reference in raid1_sync_request func s390/cpum_sf: Adjust sampling interval to avoid hitting sample limits s390/cpum_sf: Avoid SBD overflow condition in irq handler IB/mlx4: Follow mirror sequence of device add during device removal IB/mlx5: Fix steering rule of drop and count xen-blkback: prevent premature module unload xen/balloon: fix ballooned page accounting without hotplug enabled PM / hibernate: memory_bm_find_bit(): Tighten node optimisation ALSA: hda/realtek - Add Bass Speaker and fixed dac for bass speaker ALSA: hda/realtek - Enable the bass speaker of ASUS UX431FLC ALSA: hda - fixup for the bass speaker on Lenovo Carbon X1 7th gen xfs: fix mount failure crash on invalid iclog memory access taskstats: fix data-race drm: limit to INT_MAX in create_blob ioctl netfilter: nft_tproxy: Fix port selector on Big Endian ALSA: ice1724: Fix sleep-in-atomic in Infrasonic Quartet support code ALSA: usb-audio: fix set_format altsetting sanity check ALSA: usb-audio: set the interface format after resume on Dell WD19 ALSA: hda/realtek - Add headset Mic no shutup for ALC283 drm/sun4i: hdmi: Remove duplicate cleanup calls MIPS: Avoid VDSO ABI breakage due to global register variable media: pulse8-cec: fix lost cec_transmit_attempt_done() call media: cec: CEC 2.0-only bcast messages were ignored media: cec: avoid decrementing transmit_queue_sz if it is 0 media: cec: check 'transmit_in_progress', not 'transmitting' mm/zsmalloc.c: fix the migrated zspage statistics. memcg: account security cred as well to kmemcg mm: move_pages: return valid node id in status if the page is already on the target node pstore/ram: Write new dumps to start of recycled zones locks: print unsigned ino in /proc/locks dmaengine: Fix access to uninitialized dma_slave_caps compat_ioctl: block: handle Persistent Reservations compat_ioctl: block: handle BLKREPORTZONE/BLKRESETZONE ata: libahci_platform: Export again ahci_platform_<en/dis>able_phys() ata: ahci_brcm: Fix AHCI resources management ata: ahci_brcm: Allow optional reset controller to be used ata: ahci_brcm: Add missing clock management during recovery ata: ahci_brcm: BCM7425 AHCI requires AHCI_HFLAG_DELAY_ENGINE libata: Fix retrieving of active qcs gpiolib: fix up emulated open drain outputs riscv: ftrace: correct the condition logic in function graph tracer rseq/selftests: Fix: Namespace gettid() for compatibility with glibc 2.30 tracing: Fix lock inversion in trace_event_enable_tgid_record() tracing: Avoid memory leak in process_system_preds() tracing: Have the histogram compare functions convert to u64 first tracing: Fix endianness bug in histogram trigger apparmor: fix aa_xattrs_match() may sleep while holding a RCU lock ALSA: cs4236: fix error return comparison of an unsigned integer ALSA: firewire-motu: Correct a typo in the clock proc string exit: panic before exit_mm() on global init exit arm64: Revert support for execute-only user mappings ftrace: Avoid potential division by zero in function profiler drm/msm: include linux/sched/task.h PM / devfreq: Check NULL governor in available_governors_show nfsd4: fix up replay_matches_cache() HID: i2c-hid: Reset ALPS touchpads on resume ACPI: sysfs: Change ACPI_MASKABLE_GPE_MAX to 0x100 xfs: don't check for AG deadlock for realtime files in bunmapi platform/x86: pmc_atom: Add Siemens CONNECT X300 to critclk_systems DMI table Bluetooth: btusb: fix PM leak in error case of setup Bluetooth: delete a stray unlock Bluetooth: Fix memory leak in hci_connect_le_scan media: flexcop-usb: ensure -EIO is returned on error condition regulator: ab8500: Remove AB8505 USB regulator media: usb: fix memory leak in af9005_identify_state dt-bindings: clock: renesas: rcar-usb2-clock-sel: Fix typo in example arm64: dts: meson: odroid-c2: Disable usb_otg bus to avoid power failed warning tty: serial: msm_serial: Fix lockup for sysrq and oops fix compat handling of FICLONERANGE, FIDEDUPERANGE and FS_IOC_FIEMAP bdev: Factor out bdev revalidation into a common helper bdev: Refresh bdev size for disks without partitioning scsi: qedf: Do not retry ELS request if qedf_alloc_cmd fails drm/mst: Fix MST sideband up-reply failure handling powerpc/pseries/hvconsole: Fix stack overread via udbg selftests: rtnetlink: add addresses with fixed life time KVM: PPC: Book3S HV: use smp_mb() when setting/clearing host_ipi flag rxrpc: Fix possible NULL pointer access in ICMP handling tcp: annotate tp->rcv_nxt lockless reads net: core: limit nested device depth ath9k_htc: Modify byte order for an error message ath9k_htc: Discard undersized packets xfs: periodically yield scrub threads to the scheduler net: add annotations on hh->hh_len lockless accesses ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps s390/smp: fix physical to logical CPU map for SMT xen/blkback: Avoid unmapping unmapped grant pages perf/x86/intel/bts: Fix the use of page_private() Linux 4.19.94 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ic3d1a4e10565c38d0e82448f0fb7b6fd1822aab2	2020-01-09 16:14:43 +01:00
Greg Kroah-Hartman	58fd41cb2d	Revert "BACKPORT: perf_event: Add support for LSM and SELinux checks" This reverts commit `8af21ac176` as it breaks the build :( Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Ryan Savitski <rsavitski@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2020-01-09 11:07:21 +01:00
Wen Yang	010a7e846d	ftrace: Avoid potential division by zero in function profiler commit e31f7939c1c27faa5d0e3f14519eaf7c89e8a69d upstream. The ftrace_profile->counter is unsigned long and do_div truncates it to 32 bits, which means it can test non-zero and be truncated to zero for division. Fix this issue by using div64_ul() instead. Link: http://lkml.kernel.org/r/20200103030248.14516-1-wenyang@linux.alibaba.com Cc: stable@vger.kernel.org Fixes: `e330b3bcd8` ("tracing: Show sample std dev in function profiling") Fixes: `34886c8bc5` ("tracing: add average time in function to function profiler") Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:03 +01:00
chenqiwu	b9227aacdc	exit: panic before exit_mm() on global init exit commit 43cf75d96409a20ef06b756877a2e72b10a026fc upstream. Currently, when global init and all threads in its thread-group have exited we panic via: do_exit() -> exit_notify() -> forget_original_parent() -> find_child_reaper() This makes it hard to extract a useable coredump for global init from a kernel crashdump because by the time we panic exit_mm() will have already released global init's mm. This patch moves the panic futher up before exit_mm() is called. As was the case previously, we only panic when global init and all its threads in the thread-group have exited. Signed-off-by: chenqiwu <chenqiwu@xiaomi.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Oleg Nesterov <oleg@redhat.com> [christian.brauner@ubuntu.com: fix typo, rewrite commit message] Link: https://lore.kernel.org/r/1576736993-10121-1-git-send-email-qiwuchen55@gmail.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:02 +01:00
Sven Schnelle	1c662483c5	tracing: Fix endianness bug in histogram trigger commit fe6e096a5bbf73a142f09c72e7aa2835026eb1a3 upstream. At least on PA-RISC and s390 synthetic histogram triggers are failing selftests because trace_event_raw_event_synth() always writes a 64 bit values, but the reader expects a field->size sized value. On little endian machines this doesn't hurt, but on big endian this makes the reader always read zero values. Link: http://lore.kernel.org/linux-trace-devel/20191218074427.96184-4-svens@linux.ibm.com Cc: stable@vger.kernel.org Fixes: `4b147936fa` ("tracing: Add support for 'synthetic' events") Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:02 +01:00
Steven Rostedt (VMware)	0c81595930	tracing: Have the histogram compare functions convert to u64 first commit 106f41f5a302cb1f36c7543fae6a05de12e96fa4 upstream. The compare functions of the histogram code would be specific for the size of the value being compared (byte, short, int, long long). It would reference the value from the array via the type of the compare, but the value was stored in a 64 bit number. This is fine for little endian machines, but for big endian machines, it would end up comparing zeros or all ones (depending on the sign) for anything but 64 bit numbers. To fix this, first derference the value as a u64 then convert it to the type being compared. Link: http://lkml.kernel.org/r/20191211103557.7bed6928@gandalf.local.home Cc: stable@vger.kernel.org Fixes: `08d43a5fa0` ("tracing: Add lock-free tracing_map") Acked-by: Tom Zanussi <zanussi@kernel.org> Reported-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:02 +01:00
Keita Suzuki	8595e2aadd	tracing: Avoid memory leak in process_system_preds() commit 79e65c27f09683fbb50c33acab395d0ddf5302d2 upstream. When failing in the allocation of filter_item, process_system_preds() goes to fail_mem, where the allocated filter is freed. However, this leads to memory leak of filter->filter_string and filter->prog, which is allocated before and in process_preds(). This bug has been detected by kmemleak as well. Fix this by changing kfree to __free_fiter. unreferenced object 0xffff8880658007c0 (size 32): comm "bash", pid 579, jiffies 4295096372 (age 17.752s) hex dump (first 32 bytes): 63 6f 6d 6d 6f 6e 5f 70 69 64 20 20 3e 20 31 30 common_pid > 10 00 00 00 00 00 00 00 00 65 73 00 00 00 00 00 00 ........es...... backtrace: [<0000000067441602>] kstrdup+0x2d/0x60 [<00000000141cf7b7>] apply_subsystem_event_filter+0x378/0x932 [<000000009ca32334>] subsystem_filter_write+0x5a/0x90 [<0000000072da2bee>] vfs_write+0xe1/0x240 [<000000004f14f473>] ksys_write+0xb4/0x150 [<00000000a968b4a0>] do_syscall_64+0x6d/0x1e0 [<000000001a189f40>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 unreferenced object 0xffff888060c22d00 (size 64): comm "bash", pid 579, jiffies 4295096372 (age 17.752s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 e8 d7 41 80 88 ff ff ...........A.... 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000b8c1b109>] process_preds+0x243/0x1820 [<000000003972c7f0>] apply_subsystem_event_filter+0x3be/0x932 [<000000009ca32334>] subsystem_filter_write+0x5a/0x90 [<0000000072da2bee>] vfs_write+0xe1/0x240 [<000000004f14f473>] ksys_write+0xb4/0x150 [<00000000a968b4a0>] do_syscall_64+0x6d/0x1e0 [<000000001a189f40>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 unreferenced object 0xffff888041d7e800 (size 512): comm "bash", pid 579, jiffies 4295096372 (age 17.752s) hex dump (first 32 bytes): 70 bc 85 97 ff ff ff ff 0a 00 00 00 00 00 00 00 p............... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000001e04af34>] process_preds+0x71a/0x1820 [<000000003972c7f0>] apply_subsystem_event_filter+0x3be/0x932 [<000000009ca32334>] subsystem_filter_write+0x5a/0x90 [<0000000072da2bee>] vfs_write+0xe1/0x240 [<000000004f14f473>] ksys_write+0xb4/0x150 [<00000000a968b4a0>] do_syscall_64+0x6d/0x1e0 [<000000001a189f40>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Link: http://lkml.kernel.org/r/20191211091258.11310-1-keitasuzuki.park@sslab.ics.keio.ac.jp Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: `404a3add43` ("tracing: Only add filter list when needed") Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:02 +01:00
Prateek Sood	0e48d030e3	tracing: Fix lock inversion in trace_event_enable_tgid_record() commit 3a53acf1d9bea11b57c1f6205e3fe73f9d8a3688 upstream. Task T2 Task T3 trace_options_core_write() subsystem_open() mutex_lock(trace_types_lock) mutex_lock(event_mutex) set_tracer_flag() trace_event_enable_tgid_record() mutex_lock(trace_types_lock) mutex_lock(event_mutex) This gives a circular dependency deadlock between trace_types_lock and event_mutex. To fix this invert the usage of trace_types_lock and event_mutex in trace_options_core_write(). This keeps the sequence of lock usage consistent. Link: http://lkml.kernel.org/r/0101016eef175e38-8ca71caf-a4eb-480d-a1e6-6f0bbc015495-000000@us-west-2.amazonses.com Cc: stable@vger.kernel.org Fixes: `d914ba37d7` ("tracing: Add support for recording tgid of tasks") Signed-off-by: Prateek Sood <prsood@codeaurora.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:02 +01:00
Shakeel Butt	3b677f7543	memcg: account security cred as well to kmemcg commit 84029fd04c201a4c7e0b07ba262664900f47c6f5 upstream. The cred_jar kmem_cache is already memcg accounted in the current kernel but cred->security is not. Account cred->security to kmemcg. Recently we saw high root slab usage on our production and on further inspection, we found a buggy application leaking processes. Though that buggy application was contained within its memcg but we observe much more system memory overhead, couple of GiBs, during that period. This overhead can adversely impact the isolation on the system. One source of high overhead we found was cred->security objects, which have a lifetime of at least the life of the process which allocated them. Link: http://lkml.kernel.org/r/20191205223721.40034-1-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Chris Down <chris@chrisdown.name> Reviewed-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-09 10:19:00 +01:00

... 2 3 4 5 6 ...

29354 commits