17 commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Greg Kroah-Hartman
|
8ca5759502 |
This is the 4.19.73 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1/KiEACgkQONu9yGCS aT49JBAAy7b3wv1WXAtg9wsyS1JL4HbMXt3YjtokIX+UpkznoqII4B85QftPBbiD 9zDuTWPjhrqKv1GsMkFRCqBVp5wGVik1MIbjVuKdstFN5W8KQybpbYnSW4T52+wS cs6oOPkLydAfWzKeq+ekEeU8yr5dua+Ui3huundZ49wseJWQP3fh9T+ToUx8V/cr tsLiRRgI0djj7KQWVuM1j8YGKT/6qk/UL0HMVZyoIdLmsxpLap+LWe0+CRXn8rvs eJJlVQTVtYf/ySoHkpnwR12VsjRYjx6pNkm/GrebMCkM7wF/4RMqxk7j9EU0PENH VUdRrUd+j/YPp6QzjSFMK0+0eb7Gm3X0FEN0IGZshu1r/CDnoj/7hqnBmOlYIbhv pdteYaLqWq7JjAHu7vF+S4aNQRGpAZb05LsbTJ39Eu3FbdVTLXsAuUveZ7Y4/y0X ri2M3d/sF/cjc3C+V7Y7h422SM36jSAK6496VAoRyqqjX/3JyROhgfU9NAMzVr83 4uI904z9lH4TZGOd5YQgX2VuOtBcGwa7+g6fy97u1tp8UxSWFZRGDDLRysF/dIJO Wi51UK0Q7EWnqBTe0TFF6TjE5tC7R3ZgzqEQ1MU4eLI5mqokg82DAK4Ub2Wk5Qch CGs5/d16OOrLtG2RoaOGz9UdQR7IHUXLSqkKbaEdstc16MXNXns= =cmGh -----END PGP SIGNATURE----- Merge 4.19.73 into android-4.19 Changes in 4.19.73 ALSA: hda - Fix potential endless loop at applying quirks ALSA: hda/realtek - Fix overridden device-specific initialization ALSA: hda/realtek - Add quirk for HP Pavilion 15 ALSA: hda/realtek - Enable internal speaker & headset mic of ASUS UX431FL ALSA: hda/realtek - Fix the problem of two front mics on a ThinkCentre sched/fair: Don't assign runtime for throttled cfs_rq drm/vmwgfx: Fix double free in vmw_recv_msg() vhost/test: fix build for vhost test vhost/test: fix build for vhost test - again powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction batman-adv: fix uninit-value in batadv_netlink_get_ifindex() batman-adv: Only read OGM tvlv_len after buffer len check hv_sock: Fix hang when a connection is closed Blk-iolatency: warn on negative inflight IO counter blk-iolatency: fix STS_AGAIN handling {nl,mac}80211: fix interface combinations on crypto controlled devices timekeeping: Use proper ktime_add when adding nsecs in coarse offset selftests: fib_rule_tests: use pre-defined DEV_ADDR x86/ftrace: Fix warning and considate ftrace_jmp_replace() and ftrace_call_replace() powerpc/64: mark 
start_here_multiplatform as __ref media: stm32-dcmi: fix irq = 0 case arm64: dts: rockchip: enable usb-host regulators at boot on rk3328-rock64 scripts/decode_stacktrace: match basepath using shell prefix operator, not regex riscv: remove unused variable in ftrace nvme-fc: use separate work queue to avoid warning clk: s2mps11: Add used attribute to s2mps11_dt_match remoteproc: qcom: q6v5: shore up resource probe handling modules: always page-align module section allocations kernel/module: Fix mem leak in module_add_modinfo_attrs drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse" media: cec/v4l2: move V4L2 specific CEC functions to V4L2 media: cec: remove cec-edid.c scsi: qla2xxx: Move log messages before issuing command to firmware keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h Drivers: hv: kvp: Fix two "this statement may fall through" warnings x86, hibernate: Fix nosave_regions setup for hibernation remoteproc: qcom: q6v5-mss: add SCM probe dependency drm/amdgpu/gfx9: Update gfx9 golden settings. drm/amdgpu: Update gc_9_0 golden settings. 
KVM: x86: hyperv: enforce vp_index < KVM_MAX_VCPUS KVM: x86: hyperv: consistently use 'hv_vcpu' for 'struct kvm_vcpu_hv' variables KVM: x86: hyperv: keep track of mismatched VP indexes KVM: hyperv: define VP assist page helpers x86/kvm/lapic: preserve gfn_to_hva_cache len on cache reinit drm/i915: Fix intel_dp_mst_best_encoder() drm/i915: Rename PLANE_CTL_DECOMPRESSION_ENABLE drm/i915/gen9+: Fix initial readout for Y tiled framebuffers drm/atomic_helper: Disallow new modesets on unregistered connectors Drivers: hv: kvp: Fix the indentation of some "break" statements Drivers: hv: kvp: Fix the recent regression caused by incorrect clean-up powerplay: Respect units on max dcfclk watermark drm/amd/pp: Fix truncated clock value when set watermark drm/amd/dm: Understand why attaching path/tile properties are needed ARM: davinci: da8xx: define gpio interrupts as separate resources ARM: davinci: dm365: define gpio interrupts as separate resources ARM: davinci: dm646x: define gpio interrupts as separate resources ARM: davinci: dm355: define gpio interrupts as separate resources ARM: davinci: dm644x: define gpio interrupts as separate resources s390/zcrypt: reinit ap queue state machine during device probe media: vim2m: use workqueue media: vim2m: use cancel_delayed_work_sync instead of flush_schedule_work drm/i915: Restore sane defaults for KMS on GEM error load drm/i915: Cleanup gt powerstate from gem KVM: PPC: Book3S HV: Fix race between kvm_unmap_hva_range and MMU mode switch Btrfs: clean up scrub is_dev_replace parameter Btrfs: fix deadlock with memory reclaim during scrub btrfs: Remove extent_io_ops::fill_delalloc btrfs: Fix error handling in btrfs_cleanup_ordered_extents scsi: megaraid_sas: Fix combined reply queue mode detection scsi: megaraid_sas: Add check for reset adapter bit scsi: megaraid_sas: Use 63-bit DMA addressing powerpc/pkeys: Fix handling of pkey state across fork() btrfs: volumes: Make sure no dev extent is beyond device boundary btrfs: Use real device 
structure to verify dev extent media: vim2m: only cancel work if it is for right context ARC: show_regs: lockdep: re-enable preemption ARC: mm: do_page_fault fixes #1: relinquish mmap_sem if signal arrives while handle_mm_fault IB/uverbs: Fix OOPs upon device disassociation crypto: ccree - fix resume race condition on init crypto: ccree - add missing inline qualifier drm/vblank: Allow dynamic per-crtc max_vblank_count drm/i915/ilk: Fix warning when reading emon_status with no output mfd: Kconfig: Fix I2C_DESIGNWARE_PLATFORM dependencies tpm: Fix some name collisions with drivers/char/tpm.h bcache: replace hard coded number with BUCKET_GC_GEN_MAX bcache: treat stale && dirty keys as bad keys KVM: VMX: Compare only a single byte for VMCS' "launched" in vCPU-run iio: adc: exynos-adc: Add S5PV210 variant dt-bindings: iio: adc: exynos-adc: Add S5PV210 variant iio: adc: exynos-adc: Use proper number of channels for Exynos4x12 mt76: fix corrupted software generated tx CCMP PN drm/nouveau: Don't WARN_ON VCPI allocation failures iwlwifi: fix devices with PCI Device ID 0x34F0 and 11ac RF modules iwlwifi: add new card for 9260 series x86/kvmclock: set offset for kvm unstable clock spi: spi-gpio: fix SPI_CS_HIGH capability powerpc/kvm: Save and restore host AMR/IAMR/UAMOR mmc: renesas_sdhi: Fix card initialization failure in high speed mode btrfs: scrub: pass fs_info to scrub_setup_ctx btrfs: scrub: move scrub_setup_ctx allocation out of device_list_mutex btrfs: scrub: fix circular locking dependency warning btrfs: init csum_list before possible free PCI: qcom: Fix error handling in runtime PM support PCI: qcom: Don't deassert reset GPIO during probe drm: add __user attribute to ptr_to_compat() CIFS: Fix error paths in writeback code CIFS: Fix leaking locked VFS cache pages in writeback retry drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set drm/i915: Sanity check mmap length against object size usb: typec: tcpm: Try PD-2.0 if sink does not respond to 3.0 
source-caps arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's IB/mlx5: Reset access mask when looping inside page fault handler kvm: mmu: Fix overflow on kvm mmu page limit calculation x86/kvm: move kvm_load/put_guest_xcr0 into atomic context KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels cifs: Fix lease buffer length error media: i2c: tda1997x: select V4L2_FWNODE ext4: protect journal inode's blocks using block_validity ARM: dts: qcom: ipq4019: fix PCI range ARM: dts: qcom: ipq4019: Fix MSI IRQ type ARM: dts: qcom: ipq4019: enlarge PCIe BAR range dt-bindings: mmc: Add supports-cqe property dt-bindings: mmc: Add disable-cqe-dcmd property. PCI: Add macro for Switchtec quirk declarations PCI: Reset Lenovo ThinkPad P50 nvgpu at boot if necessary dm mpath: fix missing call of path selector type->end_io blk-mq: free hw queue's resource in hctx's release handler mmc: sdhci-pci: Add support for Intel CML PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code cifs: smbd: take an array of reqeusts when sending upper layer data dm crypt: move detailed message into debug level signal/arc: Use force_sig_fault where appropriate ARC: mm: fix uninitialised signal code in do_page_fault ARC: mm: SIGSEGV userspace trying to access kernel virtual memory drm/amdkfd: Add missing Polaris10 ID kvm: Check irqchip mode before assign irqfd drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2) drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc Btrfs: fix race between block group removal and block group allocation cifs: add spinlock for the openFileList to cifsInodeInfo clk: tegra: Fix maximum audio sync clock for Tegra124/210 clk: tegra210: Fix default rates for HDA clocks IB/hfi1: Avoid hardlockup with flushlist_lock apparmor: reset pos on failure to unpack for various functions scsi: target/core: Use the SECTOR_SHIFT constant scsi: target/iblock: Fix overrun in WRITE SAME emulation staging: wilc1000: fix error path cleanup in 
wilc_wlan_initialize() scsi: zfcp: fix request object use-after-free in send path causing wrong traces cifs: Properly handle auto disabling of serverino option ALSA: hda - Don't resume forcibly i915 HDMI/DP codec ceph: use ceph_evict_inode to cleanup inode's resource KVM: x86: optimize check for valid PAT value KVM: VMX: Always signal #GP on WRMSR to MSR_IA32_CR_PAT with bad value KVM: VMX: Fix handling of #MC that occurs during VM-Entry KVM: VMX: check CPUID before allowing read/write of IA32_XSS KVM: PPC: Use ccr field in pt_regs struct embedded in vcpu struct KVM: PPC: Book3S HV: Fix CR0 setting in TM emulation ARM: dts: gemini: Set DIR-685 SPI CS as active low RDMA/srp: Document srp_parse_in() arguments RDMA/srp: Accept again source addresses that do not have a port number btrfs: correctly validate compression type resource: Include resource end in walk_*() interfaces resource: Fix find_next_iomem_res() iteration issue resource: fix locking in find_next_iomem_res() pstore: Fix double-free in pstore_mkfile() failure path dm thin metadata: check if in fail_io mode when setting needs_check drm/panel: Add support for Armadeus ST0700 Adapt ALSA: hda - Fix intermittent CORB/RIRB stall on Intel chips powerpc/mm: Limit rma_size to 1TB when running without HV mode iommu/iova: Remove stale cached32_node gpio: don't WARN() on NULL descs if gpiolib is disabled i2c: at91: disable TXRDY interrupt after sending data i2c: at91: fix clk_offset for sama5d2 mm/migrate.c: initialize pud_entry in migrate_vma() iio: adc: gyroadc: fix uninitialized return code NFSv4: Fix delegation state recovery bcache: only clear BTREE_NODE_dirty bit when it is set bcache: add comments for mutex_lock(&b->write_lock) bcache: fix race in btree_flush_write() drm/i915: Make sure cdclk is high enough for DP audio on VLV/CHV virtio/s390: fix race on airq_areas[] drm/atomic_helper: Allow DPMS On<->Off changes for unregistered connectors ext4: don't perform block validity checks on the journal inode ext4: 
fix block validity checks for journal inodes using indirect blocks ext4: unsigned int compared against zero PCI: Reset both NVIDIA GPU and HDA in ThinkPad P50 workaround powerpc/tm: Remove msr_tm_active() powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts vhost: make sure log_num < in_num Linux 4.19.73 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I7bc57825aeb36759bb8e8726888da9af06392c09 |
||
Dennis Zhou
|
178d1337a5 |
blk-iolatency: fix STS_AGAIN handling
[ Upstream commit c9b3007feca018d3f7061f5d5a14cb00766ffe9b ]
The iolatency controller is based on rq_qos. It increments the inflight counter on rq_qos_throttle() and decrements it on either rq_qos_cleanup() or rq_qos_done_bio(). a3fb01ba5af0 fixes the double accounting issue where blk_mq_make_request() may call both rq_qos_cleanup() and rq_qos_done_bio() on REQ_NOWAIT. So checking STS_AGAIN prevents the double decrement.
The above works upstream as the only way we can get STS_AGAIN is from blk_mq_get_request() failing. The STS_AGAIN handling isn't a real problem as bio_endio() skipping only happens on reserved tag allocation failures, which can only be caused by driver bugs and already trigger a WARN. However, the fix creates a not-so-great dependency on how STS_AGAIN can be propagated. Internally, we (Facebook) carry a patch that kills readahead if a cgroup is io congested or a fatal signal is pending. This, combined with chained bios propagating their bi_status to the parent if it is not already set, can cause the parent bio to not clean up properly even though it was successful. This consequently leaks the inflight counter and can hang all IOs under that blkg.
To nip the adverse interaction early, this removes the rq_qos_cleanup() callback in iolatency in favor of always cleaning up on the rq_qos_done_bio() path.
Fixes: a3fb01ba5af0 ("blk-iolatency: only account submitted bios")
Debugged-by: Tejun Heo <tj@kernel.org>
Debugged-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Liu Bo
|
5f33e81250 |
Blk-iolatency: warn on negative inflight IO counter
[ Upstream commit 391f552af213985d3d324c60004475759a7030c5 ]
This is to catch any unexpected negative value of the inflight IO counter.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Greg Kroah-Hartman
|
71ce27c31a |
This is the 4.19.61 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl06qFcACgkQONu9yGCS aT6O9A/+JZqoVYnItpOnT8Hu//0mYEKvREWqsoTJNpZJhLWtGjPTT9ospHNpVgfC GUkFqngWzXHpzCgTYHUV3Mm+SIiVXCM3nkCU1+2YOsPzrKo/lJSfFt3wOYGpKO5V qratAQLra5TqR0teR00aQblqKqfmrux05uL9dNcVIwve813m00jFALcpjrXnanpP tx5cqCo3uHOou5XLraHx/CMPnfJI/mLegBUTM4DxAmN2vG4gQck2gnrU7s1eg4cy 1Fqh0Oo2Ycj5p9yoGss02JqR3wGZHOEmF55j2JcTZAPvW6/c55iPd52Trn8kPOHB Awq/VwJmP4p10a4TWoZpv7VqpL3PzO8/AW7QWOER8QnDzfOTHGae7YT8LVp5Xqj5 1NqowuP/Tm0yaZSaDLqkdvhVqTi0oGL8OCYLErpeR9PQ3P+p3paaswopsPqnXURj Q4Pahe1vm9WG2NpKh2bHVmmVkQmvwuxxxnaa31HI/IyLd5bYFV1/LbEa/XrSK36W VJtO+0AjERO9uTVP/YDloDkQ4R3+3W+m520jYsgf1OwY7v/Kc6iLb7cDwci/ZWMy YSMm8hrO0nzuT0SI25TKLDvxjGbANKvxytzOQMOTb8NsIWwaoEKWh+4r9XkdUXNa +dx72I5J2Be+3hk+eaDNzCdEae5pgVTxBpwJbzI4RfnK1Doa4uE= =hJdd -----END PGP SIGNATURE----- Merge 4.19.61 into android-4.19 Changes in 4.19.61 MIPS: ath79: fix ar933x uart parity mode MIPS: fix build on non-linux hosts arm64/efi: Mark __efistub_stext_offset as an absolute symbol explicitly scsi: iscsi: set auth_protocol back to NULL if CHAP_A value is not supported dmaengine: imx-sdma: fix use-after-free on probe error path wil6210: fix potential out-of-bounds read ath10k: Do not send probe response template for mesh ath9k: Check for errors when reading SREV register ath6kl: add some bounds checking ath10k: add peer id check in ath10k_peer_find_by_id wil6210: fix spurious interrupts in 3-msi ath: DFS JP domain W56 fixed pulse type 3 RADAR detection regmap: debugfs: Fix memory leak in regmap_debugfs_init batman-adv: fix for leaked TVLV handler. 
media: dvb: usb: fix use after free in dvb_usb_device_exit media: spi: IR LED: add missing of table registration crypto: talitos - fix skcipher failure due to wrong output IV media: ov7740: avoid invalid framesize setting media: marvell-ccic: fix DMA s/g desc number calculation media: vpss: fix a potential NULL pointer dereference media: media_device_enum_links32: clean a reserved field net: stmmac: dwmac1000: Clear unused address entries net: stmmac: dwmac4/5: Clear unused address entries qed: Set the doorbell address correctly signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig af_key: fix leaks in key_pol_get_resp and dump_sp. xfrm: Fix xfrm sel prefix length validation fscrypt: clean up some BUG_ON()s in block encryption/decryption perf annotate TUI browser: Do not use member from variable within its own initialization media: mc-device.c: don't memset __user pointer contents media: saa7164: fix remove_proc_entry warning media: staging: media: davinci_vpfe: - Fix for memory leak if decoder initialization fails. net: phy: Check against net_device being NULL crypto: talitos - properly handle split ICV. crypto: talitos - Align SEC1 accesses to 32 bits boundaries. tua6100: Avoid build warnings. 
batman-adv: Fix duplicated OGMs on NETDEV_UP locking/lockdep: Fix merging of hlocks with non-zero references media: wl128x: Fix some error handling in fm_v4l2_init_video_device() net: hns3: set ops to null when unregister ad_dev cpupower : frequency-set -r option misses the last cpu in related cpu list arm64: mm: make CONFIG_ZONE_DMA32 configurable perf jvmti: Address gcc string overflow warning for strncpy() net: stmmac: dwmac4: fix flow control issue net: stmmac: modify default value of tx-frames crypto: inside-secure - do not rely on the hardware last bit for result descriptors net: fec: Do not use netdev messages too early net: axienet: Fix race condition causing TX hang s390/qdio: handle PENDING state for QEBSM devices RAS/CEC: Fix pfn insertion net: sfp: add mutex to prevent concurrent state checks ipset: Fix memory accounting for hash types on resize perf cs-etm: Properly set the value of 'old' and 'head' in snapshot mode perf test 6: Fix missing kvm module load for s390 perf report: Fix OOM error in TUI mode on s390 irqchip/meson-gpio: Add support for Meson-G12A SoC media: uvcvideo: Fix access to uninitialized fields on probe error media: fdp1: Support M3N and E3 platforms iommu: Fix a leak in iommu_insert_resv_region gpio: omap: fix lack of irqstatus_raw0 for OMAP4 gpio: omap: ensure irq is enabled before wakeup regmap: fix bulk writes on paged registers bpf: silence warning messages in core media: s5p-mfc: fix reading min scratch buffer size on MFC v6/v7 selinux: fix empty write to keycreate file x86/cpu: Add Ice Lake NNPI to Intel family ASoC: meson: axg-tdm: fix sample clock inversion rcu: Force inlining of rcu_read_lock() x86/cpufeatures: Add FDP_EXCPTN_ONLY and ZERO_FCS_FDS qed: iWARP - Fix tc for MPA ll2 connection net: hns3: fix for skb leak when doing selftest block: null_blk: fix race condition for null_del_dev blkcg, writeback: dead memcgs shouldn't contribute to writeback ownership arbitration xfrm: fix sa selector validation sched/core: Add 
__sched tag for io_schedule() sched/fair: Fix "runnable_avg_yN_inv" not used warnings perf/x86/intel/uncore: Handle invalid event coding for free-running counter x86/atomic: Fix smp_mb__{before,after}_atomic() perf evsel: Make perf_evsel__name() accept a NULL argument vhost_net: disable zerocopy by default ipoib: correcly show a VF hardware address x86/cacheinfo: Fix a -Wtype-limits warning blk-iolatency: only account submitted bios ACPICA: Clear status of GPEs on first direct enable EDAC/sysfs: Fix memory leak when creating a csrow object nvme: fix possible io failures when removing multipathed ns nvme-pci: properly report state change failure in nvme_reset_work nvme-pci: set the errno on ctrl state change error lightnvm: pblk: fix freeing of merged pages arm64: Do not enable IRQs for ct_user_exit ipsec: select crypto ciphers for xfrm_algo ipvs: defer hook registration to avoid leaks media: s5p-mfc: Make additional clocks optional media: i2c: fix warning same module names ntp: Limit TAI-UTC offset timer_list: Guard procfs specific code acpi/arm64: ignore 5.1 FADTs that are reported as 5.0 media: coda: fix mpeg2 sequence number handling media: coda: fix last buffer handling in V4L2_ENC_CMD_STOP media: coda: increment sequence offset for the last returned frame media: vimc: cap: check v4l2_fill_pixfmt return value media: hdpvr: fix locking and a missing msleep net: stmmac: sun8i: force select external PHY when no internal one rtlwifi: rtl8192cu: fix error handle when usb probe failed mt7601u: do not schedule rx_tasklet when the device has been disconnected x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c mt7601u: fix possible memory leak when the device is disconnected ipvs: fix tinfo memory leak in start_sync_thread ath10k: add missing error handling ath10k: fix PCIE device wake up failed perf tools: Increase MAX_NR_CPUS and MAX_CACHES ASoC: Intel: hdac_hdmi: Set ops to NULL on remove libata: don't request sense data on !ZAC ATA devices 
clocksource/drivers/exynos_mct: Increase priority over ARM arch timer xsk: Properly terminate assignment in xskq_produce_flush_desc rslib: Fix decoding of shortened codes rslib: Fix handling of of caller provided syndrome ixgbe: Check DDM existence in transceiver before access crypto: serpent - mark __serpent_setkey_sbox noinline crypto: asymmetric_keys - select CRYPTO_HASH where needed wil6210: drop old event after wmi_call timeout EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec bcache: check CACHE_SET_IO_DISABLE in allocator code bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() bcache: acquire bch_register_lock later in cached_dev_free() bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush() bcache: fix potential deadlock in cached_def_free() net: hns3: fix a -Wformat-nonliteral compile warning net: hns3: add some error checking in hclge_tm module ath10k: destroy sdio workqueue while remove sdio module net: mvpp2: prs: Don't override the sign bit in SRAM parser shift igb: clear out skb->tstamp after reading the txtime iwlwifi: mvm: Drop large non sta frames bpf: fix uapi bpf_prog_info fields alignment perf stat: Make metric event lookup more robust perf stat: Fix group lookup for metric group bnx2x: Prevent ptp_task to be rescheduled indefinitely net: usb: asix: init MAC address buffers rxrpc: Fix oops in tracepoint bpf, libbpf, smatch: Fix potential NULL pointer dereference selftests: bpf: fix inlines in test_lwt_seg6local bonding: validate ip header before check IPPROTO_IGMP gpiolib: Fix references to gpiod_[gs]et_*value_cansleep() variants tools: bpftool: Fix json dump crash on powerpc Bluetooth: hci_bcsp: Fix memory leak in rx_skb Bluetooth: Add new 13d3:3491 QCA_ROME device Bluetooth: Add new 13d3:3501 QCA_ROME device Bluetooth: 6lowpan: search for destination address in all peers perf tests: Fix record+probe_libc_inet_pton.sh for powerpc64 Bluetooth: Check state in l2cap_disconnect_rsp gtp: add missing 
gtp_encap_disable_sock() in gtp_encap_enable() Bluetooth: validate BLE connection interval updates gtp: fix suspicious RCU usage gtp: fix Illegal context switch in RCU read-side critical section. gtp: fix use-after-free in gtp_encap_destroy() gtp: fix use-after-free in gtp_newlink() net: mvmdio: defer probe of orion-mdio if a clock is not ready iavf: fix dereference of null rx_buffer pointer floppy: fix div-by-zero in setup_format_params floppy: fix out-of-bounds read in next_valid_format floppy: fix invalid pointer dereference in drive_name floppy: fix out-of-bounds read in copy_buffer xen: let alloc_xenballooned_pages() fail if not enough memory free scsi: NCR5380: Reduce goto statements in NCR5380_select() scsi: NCR5380: Always re-enable reselection interrupt Revert "scsi: ncr5380: Increase register polling limit" scsi: core: Fix race on creating sense cache scsi: megaraid_sas: Fix calculation of target ID scsi: mac_scsi: Increase PIO/PDMA transfer length threshold scsi: mac_scsi: Fix pseudo DMA implementation, take 2 crypto: ghash - fix unaligned memory access in ghash_setkey() crypto: ccp - Validate the the error value used to index error messages crypto: arm64/sha1-ce - correct digest for empty data in finup crypto: arm64/sha2-ce - correct digest for empty data in finup crypto: chacha20poly1305 - fix atomic sleep when using async algorithm crypto: crypto4xx - fix AES CTR blocksize value crypto: crypto4xx - fix blocksize for cfb and ofb crypto: crypto4xx - block ciphers should only accept complete blocks crypto: ccp - memset structure fields to zero before reuse crypto: ccp/gcm - use const time tag comparison. 
crypto: crypto4xx - fix a potential double free in ppc4xx_trng_probe Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" bcache: Revert "bcache: fix high CPU occupancy during journal" bcache: Revert "bcache: free heap cache_set->flush_btree in bch_journal_free" bcache: ignore read-ahead request failure on backing device bcache: fix mistaken sysfs entry for io_error counter bcache: destroy dc->writeback_write_wq if failed to create dc->writeback_thread Input: gtco - bounds check collection indent level Input: alps - don't handle ALPS cs19 trackpoint-only device Input: synaptics - whitelist Lenovo T580 SMBus intertouch Input: alps - fix a mismatch between a condition check and its comment regulator: s2mps11: Fix buck7 and buck8 wrong voltages arm64: tegra: Update Jetson TX1 GPU regulator timings iwlwifi: pcie: don't service an interrupt that was masked iwlwifi: pcie: fix ALIVE interrupt handling for gen2 devices w/o MSI-X iwlwifi: don't WARN when calling iwl_get_shared_mem_conf with RF-Kill iwlwifi: fix RF-Kill interrupt while FW load for gen2 devices NFSv4: Handle the special Linux file open access mode pnfs/flexfiles: Fix PTR_ERR() dereferences in ff_layout_track_ds_error pNFS: Fix a typo in pnfs_update_layout pnfs: Fix a problem where we gratuitously start doing I/O through the MDS lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE ASoC: dapm: Adapt for debugfs API change raid5-cache: Need to do start() part job after adding journal device ALSA: seq: Break too long mutex context in the write loop ALSA: hda/realtek - Fixed Headphone Mic can't record on Dell platform ALSA: hda/realtek: apply ALC891 headset fixup to one Dell machine media: v4l2: Test type instead of cfg->type in v4l2_ctrl_new_custom() media: coda: Remove unbalanced and unneeded mutex unlock media: videobuf2-core: Prevent size alignment wrapping buffer size to 0 media: videobuf2-dma-sg: Prevent size from overflowing KVM: x86/vPMU: refine kvm_pmu err msg 
when event creation failed arm64: tegra: Fix AGIC register range fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes. kconfig: fix missing choice values in auto.conf drm/nouveau/i2c: Enable i2c pads & busses during preinit padata: use smp_mb in padata_reorder to avoid orphaned padata jobs dm zoned: fix zone state management race xen/events: fix binding user event channels to cpus 9p/xen: Add cleanup path in p9_trans_xen_init 9p/virtio: Add cleanup path in p9_virtio_init x86/boot: Fix memory leak in default_get_smp_config() perf/x86/intel: Fix spurious NMI on fixed counter perf/x86/amd/uncore: Do not set 'ThreadMask' and 'SliceMask' for non-L3 PMCs perf/x86/amd/uncore: Set the thread mask for F17h L3 PMCs drm/edid: parse CEA blocks embedded in DisplayID intel_th: pci: Add Ice Lake NNPI support PCI: hv: Fix a use-after-free bug in hv_eject_device_work() PCI: Do not poll for PME if the device is in D3cold PCI: qcom: Ensure that PERST is asserted for at least 100 ms Btrfs: fix data loss after inode eviction, renaming it, and fsync it Btrfs: fix fsync not persisting dentry deletions due to inode evictions Btrfs: add missing inode version, ctime and mtime updates when punching hole IB/mlx5: Report correctly tag matching rendezvous capability HID: wacom: generic: only switch the mode on devices with LEDs HID: wacom: generic: Correct pad syncing HID: wacom: correct touch resolution x/y typo libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields coda: pass the host file in vma->vm_file on mmap include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures xfs: fix pagecache truncation prior to reflink xfs: flush removing page cache in xfs_reflink_remap_prep xfs: don't overflow xattr listent buffer xfs: rename m_inotbt_nores to m_finobt_nores xfs: don't ever put nlink > 0 inodes on the unlinked list xfs: reserve blocks for ifree transaction during log recovery xfs: fix reporting supported extra file attributes for 
statx() xfs: serialize unaligned dio writes against all other dio writes xfs: abort unaligned nowait directio early gpu: ipu-v3: ipu-ic: Fix saturation bit offset in TPMEM crypto: caam - limit output IV to CBC to work around CTR mode DMA issue parisc: Ensure userspace privilege for ptraced processes in regset functions parisc: Fix kernel panic due invalid values in IAOQ0 or IAOQ1 powerpc/32s: fix suspend/resume when IBATs 4-7 are used powerpc/watchpoint: Restore NV GPRs while returning from exception powerpc/powernv/npu: Fix reference leak powerpc/pseries: Fix oops in hotplug memory notifier mmc: sdhci-msm: fix mutex while in spinlock eCryptfs: fix a couple type promotion bugs mtd: rawnand: mtk: Correct low level time calculation of r/w cycle mtd: spinand: read returns badly if the last page has bitflips intel_th: msu: Fix single mode with disabled IOMMU Bluetooth: Add SMP workaround Microsoft Surface Precision Mouse bug usb: Handle USB3 remote wakeup for LPM enabled devices correctly blk-throttle: fix zero wait time for iops throttled group blk-iolatency: clear use_delay when io.latency is set to zero blkcg: update blkcg_print_stat() to handle larger outputs net: mvmdio: allow up to four clocks to be specified for orion-mdio dt-bindings: allow up to four clocks for orion-mdio dm bufio: fix deadlock with loop device Linux 4.19.61 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I2f565111b1c16f369fa86e0481527fcc6357fe1b |
||
Tejun Heo
|
73efdc5d7d |
blk-iolatency: clear use_delay when io.latency is set to zero
commit 5de0073fcd50cc1f150895a7bb04d3cf8067b1d7 upstream.
If use_delay was non-zero when the latency target of a cgroup was set
to zero, it will stay stuck until io.latency is enabled on the cgroup
again. This keeps readahead disabled for the cgroup, impacting
performance negatively.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <jbacik@fb.com>
Fixes:
|
||
Dennis Zhou
|
3ae98dc2db |
blk-iolatency: only account submitted bios
[ Upstream commit a3fb01ba5af066521f3f3421839e501bb2c71805 ]
As is, iolatency recognizes done_bio and cleanup as ending paths. If a bio is marked REQ_NOWAIT and fails to get a request, the bio is cleaned up via rq_qos_cleanup() and ended in bio_wouldblock_error(). This results in underflowing the inflight counter. Fix this by only accounting bios that were actually submitted.
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
Greg Kroah-Hartman
|
10f41ccfc7 |
This is the 4.19.36 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAly6xzUACgkQONu9yGCS aT5sIA//b7nAk2zuhmbkonsBfzFq5uBJmqXcCrOgy3XHMs4fE+Q11kLd1wMAV7dx U7FNHe4PIJ8Rczxgqr2VP3VmFbV6UuTK+UTclJKfbV3ouIAQiQBuutABBmbDUj2p FInc/yAYyhVc9n7gX78czTiUxKnKi4+sisUYDCZPr3hr6jDPcLvm/WVWdyrcXJje rYFNmE/2MBH1NofG+MOpq+ILhKHXlf2APN2/spl+I42a8bwodiSl9g+dhuWr7wgT Ln2Ocf7BZ6BPCQKoveZdD1Gd56NNR/lJh4ulqpuhaZw4Yp+B/C7GmrBtdPzVSGka IwPWoSc9/9VSUl+ooSZHms78VLbqq0rNNclskL2bN6m962u04Eu7sB2Tg/bwUs52 Wkcw0DY4J/oMJtj/CMHcQOUPsk6vwHxqnjsj+LYJ1ZjHO68tUshnENxXrbAoDc45 2fuY3TCA+XqFvqNt5HbkLPtFR78u8QmZ1lP/Pkri6xoG/GA6O0EAxhS0Z9hncGK7 8wJNuxLMd2UX94wlajQ+DF7yyCU4HOFEdeSEOwlHHBid/fckXsGzL2tKJUAbbUPP ux3An8kJHni8nQrmUkyy1Nx29ROyAFxBLOQshWGpXgJrV3qRMYLyB2Icv0WYCGFk zZCTupPgvb46u81VzqxrLH4RZdy4Ar4uB3BQGPKs596rlYmvnSo= =CArs -----END PGP SIGNATURE----- Merge 4.19.36 into android-4.19 Changes in 4.19.36 ARC: u-boot args: check that magic number is correct arc: hsdk_defconfig: Enable CONFIG_BLK_DEV_RAM inotify: Fix fsnotify_mark refcount leak in inotify_update_existing_watch() perf/core: Restore mmap record type correctly ext4: avoid panic during forced reboot ext4: add missing brelse() in add_new_gdb_meta_bg() ext4: report real fs size after failed resize ALSA: echoaudio: add a check for ioremap_nocache ALSA: sb8: add a check for request_region auxdisplay: hd44780: Fix memory leak on ->remove() drm/udl: use drm_gem_object_put_unlocked. 
IB/mlx4: Fix race condition between catas error reset and aliasguid flows i40iw: Avoid panic when handling the inetdev event mmc: davinci: remove extraneous __init annotation ALSA: opl3: fix mismatch between snd_opl3_drum_switch definition and declaration thermal/intel_powerclamp: fix __percpu declaration of worker_data thermal: samsung: Fix incorrect check after code merge thermal: bcm2835: Fix crash in bcm2835_thermal_debugfs thermal/int340x_thermal: Add additional UUIDs thermal/int340x_thermal: fix mode setting thermal/intel_powerclamp: fix truncated kthread name scsi: iscsi: flush running unbind operations when removing a session sched/cpufreq: Fix 32-bit math overflow sched/core: Fix buffer overflow in cgroup2 property cpu.max x86/mm: Don't leak kernel addresses tools/power turbostat: return the exit status of a command perf list: Don't forget to drop the reference to the allocated thread_map perf config: Fix an error in the config template documentation perf config: Fix a memory leak in collect_config() perf build-id: Fix memory leak in print_sdt_events() perf top: Fix error handling in cmd_top() perf hist: Add missing map__put() in error case perf evsel: Free evsel->counts in perf_evsel__exit() perf tests: Fix a memory leak of cpu_map object in the openat_syscall_event_on_all_cpus test perf tests: Fix memory leak by expr__find_other() in test__expr() perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test() ACPI / utils: Drop reference in test for device presence PM / Domains: Avoid a potential deadlock blk-iolatency: #include "blk.h" drm/exynos/mixer: fix MIXER shadow registry synchronisation code irqchip/stm32: Don't clear rising/falling config registers at init irqchip/mbigen: Don't clear eventid when freeing an MSI x86/hpet: Prevent potential NULL pointer dereference x86/hyperv: Prevent potential NULL pointer dereference x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors drm/nouveau/debugfs: Fix check of pm_runtime_get_sync 
failure iommu/vt-d: Check capability before disabling protected memory x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error fix incorrect error code mapping for OBJECTID_NOT_FOUND x86/gart: Exclude GART aperture from kcore ext4: prohibit fstrim in norecovery mode drm/cirrus: Use drm_framebuffer_put to avoid kernel oops in clean-up gpio: pxa: handle corner case of unprobed device rsi: improve kernel thread handling to fix kernel panic f2fs: fix to avoid NULL pointer dereference on se->discard_map 9p: do not trust pdu content for stat item size 9p locks: add mount option for lock retry interval ASoC: Fix UBSAN warning at snd_soc_get/put_volsw_sx() f2fs: fix to do sanity check with current segment number netfilter: xt_cgroup: shrink size of v2 path serial: uartps: console_setup() can't be placed to init section powerpc/pseries: Remove prrn_work workqueue media: au0828: cannot kfree dev before usb disconnect Bluetooth: Fix debugfs NULL pointer dereference HID: i2c-hid: override HID descriptors for certain devices pinctrl: core: make sure strcmp() doesn't get a null parameter ARM: samsung: Limit SAMSUNG_PM_CHECK config option to non-Exynos platforms usbip: fix vhci_hcd controller counting ACPI / SBS: Fix GPE storm on recent MacBookPro's HID: usbhid: Add quirk for Redragon/Dragonrise Seymur 2 KVM: nVMX: restore host state in nested_vmx_vmexit for VMFail compiler.h: update definition of unreachable() netfilter: nf_flow_table: remove flowtable hook flush routine in netns exit routine f2fs: cleanup dirty pages if recover failed net: stmmac: Set OWN bit for jumbo frames cifs: fallback to older infolevels on findfirst queryinfo retry kernel: hung_task.c: disable on suspend platform/x86: Add Intel AtomISP2 dummy / power-management driver drm/ttm: Fix bo_global and mem_global kfree error ALSA: hda: fix front speakers on Huawei MBXP ACPI: EC / PM: Disable non-wakeup GPEs for suspend-to-idle net/rds: fix warn in rds_message_alloc_sgs xfrm: destroy 
xfrm_state synchronously on net exit path crypto: sha256/arm - fix crash bug in Thumb2 build crypto: sha512/arm - fix crash bug in Thumb2 build net: ip6_gre: fix possible NULL pointer dereference in ip6erspan_set_version iommu/dmar: Fix buffer overflow during PCI bus notification scsi: core: Avoid that system resume triggers a kernel warning soc/tegra: pmc: Drop locking from tegra_powergate_is_powered() lkdtm: Print real addresses lkdtm: Add tests for NULL pointer dereference drm/panel: panel-innolux: set display off in innolux_panel_unprepare crypto: axis - fix for recursive locking from bottom half Revert "ACPI / EC: Remove old CLEAR_ON_RESUME quirk" coresight: cpu-debug: Support for CA73 CPUs PCI: Blacklist power management of Gigabyte X299 DESIGNARE EX PCIe ports drm/nouveau/volt/gf117: fix speedo readout register ARM: 8839/1: kprobe: make patch_lock a raw_spinlock_t drm/amdkfd: use init_mqd function to allocate object for hid_mqd (CI) appletalk: Fix use-after-free in atalk_proc_exit lib/div64.c: off by one in shift rxrpc: Fix client call connect/disconnect race f2fs: fix to dirty inode for i_mode recovery include/linux/swap.h: use offsetof() instead of custom __swapoffset macro bpf: fix use after free in bpf_evict_inode IB/hfi1: Failed to drain send queue when QP is put into error state mm: hide incomplete nr_indirectly_reclaimable in /proc/zoneinfo mm: hide incomplete nr_indirectly_reclaimable in sysfs appletalk: Fix compile regression Linux 4.19.36 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Bart Van Assche
|
bde271d1ad |
blk-iolatency: #include "blk.h"
[ Upstream commit 373e915cd8e84544609eced57a44fbc084f8d60f ]
This patch avoids that the following warning is reported when building
with W=1:
block/blk-iolatency.c:734:5: warning: no previous prototype for 'blk_iolatency_init' [-Wmissing-prototypes]
Cc: Josef Bacik <jbacik@fb.com>
Fixes:
|
||
Johannes Weiner
|
2ba18b41d3 |
BACKPORT: sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
There are several definitions of those functions/macros in places that mess with fixed-point load averages. Provide an official version. [akpm@linux-foundation.org: fix missed conversion in block/blk-iolatency.c] Link: http://lkml.kernel.org/r/20180828172258.3185-5-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Suren Baghdasaryan <surenb@google.com> Tested-by: Daniel Drake <drake@endlessm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <jweiner@fb.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Enderborg <peter.enderborg@sony.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 8508cf3ffad4defa202b303e5b6379efc4cd9054) Conflicts: block/blk-iolatency.c (1. manual merge to replace stat->rqs.mean with stat.mean) Bug: 127712811 Test: lmkd in PSI mode Change-Id: I716b4874491cff75a2355c6d95c64cf02d05e7ee Signed-off-by: Suren Baghdasaryan <surenb@google.com> |
||
Liu Bo
|
6d482bc569 |
blk-iolatency: fix IO hang due to negative inflight counter
[ Upstream commit 8c772a9bfc7c07c76f4a58b58910452fbb20843b ]
Our test reported the following stack, and vmcore showed that the ->inflight counter is -1.
    [ffffc9003fcc38d0] __schedule at ffffffff8173d95d
    [ffffc9003fcc3958] schedule at ffffffff8173de26
    [ffffc9003fcc3970] io_schedule at ffffffff810bb6b6
    [ffffc9003fcc3988] blkcg_iolatency_throttle at ffffffff813911cb
    [ffffc9003fcc3a20] rq_qos_throttle at ffffffff813847f3
    [ffffc9003fcc3a48] blk_mq_make_request at ffffffff8137468a
    [ffffc9003fcc3b08] generic_make_request at ffffffff81368b49
    [ffffc9003fcc3b68] submit_bio at ffffffff81368d7d
    [ffffc9003fcc3bb8] ext4_io_submit at ffffffffa031be00 [ext4]
    [ffffc9003fcc3c00] ext4_writepages at ffffffffa03163de [ext4]
    [ffffc9003fcc3d68] do_writepages at ffffffff811c49ae
    [ffffc9003fcc3d78] __filemap_fdatawrite_range at ffffffff811b6188
    [ffffc9003fcc3e30] filemap_write_and_wait_range at ffffffff811b6301
    [ffffc9003fcc3e60] ext4_sync_file at ffffffffa030cee8 [ext4]
    [ffffc9003fcc3ea8] vfs_fsync_range at ffffffff8128594b
    [ffffc9003fcc3ee8] do_fsync at ffffffff81285abd
    [ffffc9003fcc3f18] sys_fsync at ffffffff81285d50
    [ffffc9003fcc3f28] do_syscall_64 at ffffffff81003c04
    [ffffc9003fcc3f50] entry_SYSCALL_64_after_swapgs at ffffffff81742b8e
The ->inflight counter may be negative (-1) if
1) blk-iolatency was disabled when the IO was issued,
2) blk-iolatency was enabled before this IO reached its endio,
3) the ->inflight counter is decreased from 0 to -1 in endio()
In fact the hang can be easily reproduced by the script below:
    H=/sys/fs/cgroup/unified/
    P=/sys/fs/cgroup/unified/test
    echo "+io" > $H/cgroup.subtree_control
    mkdir -p $P
    echo $$ > $P/cgroup.procs
    xfs_io -f -d -c "pwrite 0 4k" /dev/sdg
    echo "`cat /sys/block/sdg/dev` target=1000000" > $P/io.latency
    xfs_io -f -d -c "pwrite 0 4k" /dev/sdg
This fixes the problem by freezing the queue so that, while enabling/disabling iolatency, there is no inflight rq running.
Note that quiesce_queue is not needed, as this is only updating the iolatency configuration, which request_queue dispatching does not care about.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
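The three-step sequence in this commit message can be replayed in a tiny userspace model. This is only an illustration of the race, assuming that both the issue path and the completion path consult the current enabled state; `struct race_state` and the `race_*` functions are hypothetical names, not kernel symbols.

```c
#include <stdbool.h>

/* Hypothetical model of the accounting; all names are illustrative. */
struct race_state {
    bool enabled;   /* is latency accounting switched on? */
    int inflight;   /* IOs believed to be outstanding */
};

/* Issue path: counts the IO only while accounting is enabled. */
void race_issue(struct race_state *s)
{
    if (s->enabled)
        s->inflight++;
}

/* Completion path: also looks only at the current enabled state. */
void race_endio(struct race_state *s)
{
    if (s->enabled)
        s->inflight--;
}

/* Replays steps 1) to 3) and returns the final counter value. */
int race_replay(void)
{
    struct race_state s = { .enabled = false, .inflight = 0 };

    race_issue(&s);     /* 1) issued while disabled: not counted */
    s.enabled = true;   /* 2) enabled before the IO completes */
    race_endio(&s);     /* 3) completion decrements from 0 */
    return s.inflight;
}
```

In this model the replay ends at -1. Freezing the queue around the configuration change, as the commit describes, rules out step 2 happening while an IO is still in flight.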
||
Dennis Zhou (Facebook)
|
c480bcf97b |
block: make iolatency avg_lat exponentially decay
Currently, avg_lat is calculated by accumulating the mean of every window in a long running cumulative average. As time goes on, the metric becomes less and less useful due to the accumulated history. This patch reuses the same calculation done in load averages to make the avg_lat metric more lively. Unlike load averages, the avg only advances when a window elapses (due to an io). Idle periods extend the most recent window. Bucketing is used to limit the history of avg_lat by binding it to the window size. So, the window range for 1/exp (decay rate) is [1 min, 2.5 min) when windows elapse immediately. The current sample window size is exposed in the debug info to enable calculation of the window range. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
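The load-average style of decay mentioned above can be sketched as a fixed-point exponential moving average. This is a minimal illustration, not the code from the patch: the `ema_*` names and the 11-bit fractional precision are assumptions, and the decay factor is passed in rather than taken from any kernel table.

```c
#include <stdint.h>

#define EMA_FSHIFT  11                    /* bits of fractional precision */
#define EMA_FIXED_1 (1ULL << EMA_FSHIFT)  /* 1.0 in fixed point */

/*
 * One decay step: new = old * e + sample * (1 - e), where exp_fp is the
 * decay factor e scaled by EMA_FIXED_1 (0 < exp_fp < EMA_FIXED_1).
 * The result is truncated; inputs must be small enough that the
 * intermediate products fit in 64 bits.
 */
uint64_t ema_step(uint64_t avg, uint64_t exp_fp, uint64_t sample)
{
    uint64_t acc = avg * exp_fp + sample * (EMA_FIXED_1 - exp_fp);

    return acc >> EMA_FSHIFT;
}
```

Calling `ema_step()` once per elapsed window means older windows fade geometrically instead of weighing on the average forever, which is the property the commit is after.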
||
Josef Bacik
|
52a1199ccd |
blk-iolatency: fix blkg leak in timer_fn
At this point we have a ref on the blkg, so we need to drop it if we don't have an iolat. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Josef Bacik
|
71e9690b59 |
blk-iolatency: truncate our current time
In our longer tests we noticed that some boxes would degrade to the point of uselessness. This is because we truncate the current time when saving it in our bio, but I was using the raw current time to subtract from. So once the box had been up a certain amount of time it would appear as if our IO's were taking several years to complete. Fix this by truncating the current time so it matches the issue time. Verified this worked by running with this patch for a week on our test tier. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
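The mismatch described here, a truncated issue time subtracted from an untruncated current time, can be reproduced with plain integer arithmetic. The sketch below is an assumption-laden illustration: `ISSUE_TIME_BITS` is an arbitrary width picked for the example, and the function names are hypothetical.

```c
#include <stdint.h>

/* Arbitrary width for the stored issue time; chosen for illustration. */
#define ISSUE_TIME_BITS 51
#define ISSUE_TIME_MASK ((1ULL << ISSUE_TIME_BITS) - 1)

/* What gets saved at issue time: only the low bits of the clock. */
uint64_t issue_time_store(uint64_t now_ns)
{
    return now_ns & ISSUE_TIME_MASK;
}

/* Buggy variant: subtracts a truncated start from the raw clock. */
uint64_t latency_raw(uint64_t now_ns, uint64_t stored)
{
    return now_ns - stored;
}

/* Fixed variant: truncate "now" the same way before subtracting. */
uint64_t latency_truncated(uint64_t now_ns, uint64_t stored)
{
    return (now_ns & ISSUE_TIME_MASK) - stored;
}
```

Once the clock exceeds the mask, `latency_raw()` is off by a multiple of the mask range, while `latency_truncated()` stays correct as long as the clock does not wrap between issue and completion. That wrap case would need separate handling and is not covered by this sketch.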
||
Josef Bacik
|
d607eefa3b |
blk-iolatency: don't change the latency window
Early versions of these patches had us waiting for seconds at a time during submission, so we had to adjust the timing window we monitored for latency. Now we don't do things like that so this is unnecessary code. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Josef Bacik
|
a284390b39 |
blk-iolatency: fix max_depth comparisons
max_depth used to be a u64, but I changed it to an unsigned int and didn't convert my comparisons over everywhere. Fix by using UINT_MAX everywhere instead of (u64)-1. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
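Why the stale comparison matters can be shown in a few lines of standalone C. The sketch assumes `unsigned int` is narrower than 64 bits, and the function names are hypothetical; `uint64_t` stands in for the kernel's `u64`.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Stale check: written for a 64-bit field. The operand is widened to
 * 64 bits before comparing, so it can never equal all-ones. */
bool depth_unlimited_stale(unsigned int max_depth)
{
    return max_depth == (uint64_t)-1;
}

/* Corrected check: the sentinel matches the field's actual type. */
bool depth_unlimited(unsigned int max_depth)
{
    return max_depth == UINT_MAX;
}
```

A compiler may flag the stale version as always false, which is the kind of diagnostic a static checker would report.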
||
Arnd Bergmann
|
88b7210c81 |
block: iolatency: avoid 64-bit division
On 32-bit architectures, dividing a 64-bit number needs to use the
do_div() function or something like it to avoid a link failure:
block/blk-iolatency.o: In function `iolatency_prfill_limit':
blk-iolatency.c:(.text+0x8cc): undefined reference to `__aeabi_uldivmod'
Using div_u64() gives us the best output and avoids the need for an
explicit cast.
Fixes:
|
||
Josef Bacik
|
d706751215 |
block: introduce blk-iolatency io controller
Current IO controllers for the block layer are less than ideal for our use case. The io.max controller is great at hard limiting, but it is not work conserving. This patch introduces io.latency. You provide a latency target for your group and we monitor the io in short windows to make sure we are not exceeding those latency targets. This makes use of the rq-qos infrastructure and works much like the wbt stuff. There are a few differences from wbt:
- It's bio based, so the latency covers the whole block layer in addition to the actual io.
- We will throttle all IO types that come in here if we need to.
- We use the mean latency over the 100ms window. This is because writes can be particularly fast, which could give us a false sense of the impact of other workloads on our protected workload.
- By default there's no throttling; we set the queue_depth to INT_MAX so that we can have as many outstanding bio's as we're allowed to. Only at throttle time do we pay attention to the actual queue depth.
- We backcharge cgroups for root cg issued IO and induce artificial delays in order to deal with cases like metadata only or swap heavy workloads.
In testing this has worked out relatively well. Protected workloads will throttle noisy workloads down to 1 io at a time if they are doing normal IO on their own, or induce up to a 1 second delay per syscall if they are doing a lot of root issued IO (metadata/swap IO).
Our testing has revolved mostly around our production web servers where we have hhvm (the web server application) in a protected group and everything else in another group. We see slightly higher requests per second (RPS) on the test tier vs the control tier, and much more stable RPS across all machines in the test tier vs the control tier.
Another test we run is a slow memory allocator in the unprotected group. Before, this would eventually push us into swap and cause the whole box to die and not recover at all. With these patches we see slight RPS drops (usually 10-15%) before the memory consumer is properly killed and things recover within seconds.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
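The "mean latency over a window versus a target" check described in this commit can be sketched as follows. This is a simplified userspace model and not the controller's implementation: `struct win_stat` and the `win_*` helpers are hypothetical, and the sketch says nothing about how the real code reacts to a missed target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-window accumulator for completed bios. */
struct win_stat {
    uint64_t total_ns;  /* sum of bio latencies seen in this window */
    uint64_t count;     /* number of bios completed in this window */
};

/* Record one completed bio's latency in the current window. */
void win_add(struct win_stat *w, uint64_t lat_ns)
{
    w->total_ns += lat_ns;
    w->count++;
}

/* True when the window's mean latency exceeds the group's target.
 * An empty window is treated as meeting the target. */
bool win_missed_target(const struct win_stat *w, uint64_t target_ns)
{
    if (w->count == 0)
        return false;
    return w->total_ns / w->count > target_ns;
}
```

Using the mean means one slow bio among many fast ones does not immediately count as a miss; how any particular deployment should weigh that trade-off is outside this sketch.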