commit cb85f4d23f794e24127f3e562cb3b54b0803f456 upstream.
If EXT4_EXTENTS_FL is set on an inode while ext4_writepages() is running
on it, the following warning in ext4_add_complete_io() can be hit:
WARNING: CPU: 1 PID: 0 at fs/ext4/page-io.c:234 ext4_put_io_end_defer+0xf0/0x120
Here's a minimal reproducer (not 100% reliable) (root isn't required):
while true; do
sync
done &
while true; do
rm -f file
touch file
chattr -e file
echo X >> file
chattr +e file
done
The problem is that in ext4_writepages(), ext4_should_dioread_nolock()
(which only returns true on extent-based files) is checked once to set
the number of reserved journal credits, and also again later to select
the flags for ext4_map_blocks() and copy the reserved journal handle to
ext4_io_end::handle. But if EXT4_EXTENTS_FL is being concurrently set,
the first check can see dioread_nolock disabled while the later one can
see it enabled, causing the reserved handle to unexpectedly be NULL.
Since changing EXT4_EXTENTS_FL is uncommon, and there may be other races
related to doing so as well, fix this by synchronizing changing
EXT4_EXTENTS_FL with ext4_writepages() via the existing
s_writepages_rwsem (previously called s_journal_flag_rwsem).
This was originally reported by syzbot without a reproducer at
https://syzkaller.appspot.com/bug?extid=2202a584a00fffd19fbf,
but now that dioread_nolock is the default I also started seeing this
when running syzkaller locally.
Link: https://lore.kernel.org/r/20200219183047.47417-3-ebiggers@kernel.org
Reported-by: syzbot+2202a584a00fffd19fbf@syzkaller.appspotmail.com
Fixes: 6b523df4fb ("ext4: use transaction reservation for extent conversion in ext4_end_io")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bbd55937de8f2754adc5792b0f8e5ff7d9c0420e upstream.
In preparation for making s_journal_flag_rwsem synchronize
ext4_writepages() with changes to both the EXTENTS and JOURNAL_DATA
flags (rather than just JOURNAL_DATA as it does currently), rename it to
s_writepages_rwsem.
Link: https://lore.kernel.org/r/20200219183047.47417-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9db176bceb5c5df4990486709da386edadc6bd1d upstream.
When CONFIG_QFMT_V2 is configured as a module, the test in
ext4_feature_set_ok() fails and so mount of filesystems with quota or
project features fails. Fix the test to use IS_ENABLED macro which
works properly even for modules.
Link: https://lore.kernel.org/r/20200221100835.9332-1-jack@suse.cz
Fixes: d65d87a07476 ("ext4: improve explanation of a mount failure caused by a misconfigured kernel")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c990728b99ed6fbe9c75fc202fce1172d9916da upstream.
During an online resize an array of s_flex_groups structures gets replaced
so it can get enlarged. If there is a concurrent access to the array and
this memory has been reused then this can lead to an invalid memory access.
The s_flex_group array has been converted into an array of pointers rather
than an array of structures. This is to ensure that the information
contained in the structures cannot get out of sync during a resize due to
an accessor updating the value in the old structure after it has been
copied but before the array pointer is updated. Since the structures them-
selves are no longer copied but only the pointers to them this case is
mitigated.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206443
Link: https://lore.kernel.org/r/20200221053458.730016-4-tytso@mit.edu
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df3da4ea5a0fc5d115c90d5aa6caa4dd433750a7 upstream.
During an online resize an array of pointers to s_group_info gets replaced
so it can get enlarged. If there is a concurrent access to the array in
ext4_get_group_info() and this memory has been reused then this can lead to
an invalid memory access.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206443
Link: https://lore.kernel.org/r/20200221053458.730016-3-tytso@mit.edu
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Balbir Singh <sblbir@amazon.com>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d0c3924a92e69bfa91163bda83c12a994b4d106 upstream.
During an online resize an array of pointers to buffer heads gets
replaced so it can get enlarged. If there is a racing block
allocation or deallocation which uses the old array, and the old array
has gotten reused this can lead to a GPF or some other random kernel
memory getting modified.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206443
Link: https://lore.kernel.org/r/20200221053458.730016-2-tytso@mit.edu
Reported-by: Suraj Jitindar Singh <surajjs@amazon.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9424ef56e13a1f14c57ea161eed3ecfdc7b2770e upstream.
We tested a soft lockup problem in linux 4.19 which could also
be found in linux 5.x.
When dir inode takes up a large number of blocks, and if the
directory is growing when we are searching, it's possible the
restart branch could be called many times, and the do while loop
could hold cpu a long time.
Here is the call trace in linux 4.19.
[ 473.756186] Call trace:
[ 473.756196] dump_backtrace+0x0/0x198
[ 473.756199] show_stack+0x24/0x30
[ 473.756205] dump_stack+0xa4/0xcc
[ 473.756210] watchdog_timer_fn+0x300/0x3e8
[ 473.756215] __hrtimer_run_queues+0x114/0x358
[ 473.756217] hrtimer_interrupt+0x104/0x2d8
[ 473.756222] arch_timer_handler_virt+0x38/0x58
[ 473.756226] handle_percpu_devid_irq+0x90/0x248
[ 473.756231] generic_handle_irq+0x34/0x50
[ 473.756234] __handle_domain_irq+0x68/0xc0
[ 473.756236] gic_handle_irq+0x6c/0x150
[ 473.756238] el1_irq+0xb8/0x140
[ 473.756286] ext4_es_lookup_extent+0xdc/0x258 [ext4]
[ 473.756310] ext4_map_blocks+0x64/0x5c0 [ext4]
[ 473.756333] ext4_getblk+0x6c/0x1d0 [ext4]
[ 473.756356] ext4_bread_batch+0x7c/0x1f8 [ext4]
[ 473.756379] ext4_find_entry+0x124/0x3f8 [ext4]
[ 473.756402] ext4_lookup+0x8c/0x258 [ext4]
[ 473.756407] __lookup_hash+0x8c/0xe8
[ 473.756411] filename_create+0xa0/0x170
[ 473.756413] do_mkdirat+0x6c/0x140
[ 473.756415] __arm64_sys_mkdirat+0x28/0x38
[ 473.756419] el0_svc_common+0x78/0x130
[ 473.756421] el0_svc_handler+0x38/0x78
[ 473.756423] el0_svc+0x8/0xc
[ 485.755156] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [tmp:5149]
Add cond_resched() to avoid soft lockup and to provide a better
system responding.
Link: https://lore.kernel.org/r/20200215080206.13293-1-luoshijie1@huawei.com
Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 35df4299a6487f323b0aca120ea3f485dfee2ae3 upstream.
EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by
KCSAN,
BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4]
write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127:
ext4_write_end+0x4e3/0x750 [ext4]
ext4_update_i_disksize at fs/ext4/ext4.h:3032
(inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046
(inlined by) ext4_write_end at fs/ext4/inode.c:1287
generic_perform_write+0x208/0x2a0
ext4_buffered_write_iter+0x11f/0x210 [ext4]
ext4_file_write_iter+0xce/0x9e0 [ext4]
new_sync_write+0x29c/0x3b0
__vfs_write+0x92/0xa0
vfs_write+0x103/0x260
ksys_write+0x9d/0x130
__x64_sys_write+0x4c/0x60
do_syscall_64+0x91/0xb47
entry_SYSCALL_64_after_hwframe+0x49/0xbe
read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37:
ext4_writepages+0x10ac/0x1d00 [ext4]
mpage_map_and_submit_extent at fs/ext4/inode.c:2468
(inlined by) ext4_writepages at fs/ext4/inode.c:2772
do_writepages+0x5e/0x130
__writeback_single_inode+0xeb/0xb20
writeback_sb_inodes+0x429/0x900
__writeback_inodes_wb+0xc4/0x150
wb_writeback+0x4bd/0x870
wb_workfn+0x6b4/0x960
process_one_work+0x54c/0xbe0
worker_thread+0x80/0x650
kthread+0x1e0/0x200
ret_from_fork+0x27/0x50
Reported by Kernel Concurrency Sanitizer on:
CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G W O L 5.5.0-next-20200204+ #5
Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
Workqueue: writeback wb_workfn (flush-7:0)
Since only the read is operating as lockless (outside of the
"i_data_sem"), load tearing could introduce a logic bug. Fix it by
adding READ_ONCE() for the read and WRITE_ONCE() for the write.
Signed-off-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/1581085751-31793-1-git-send-email-cai@lca.pw
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f287d3d19769b1d22cba4e51fa0487f2697713c9 ]
While certain modeset operations on gv100+ need us to temporarily
disable the LUT, we make the mistake of sometimes neglecting to
reprogram the LUT after such modesets. In particular, moving a head from
one encoder to another seems to trigger this quite often. GV100+ is very
picky about having a LUT in most scenarios, so this causes the display
engine to hang with the following error code:
disp: chid 1 stat 00005080 reason 5 [INVALID_STATE] mthd 0200 data
00000001 code 0000002d)
So, fix this by always re-programming the LUT if we're clearing it in a
state where the wndw is still visible, and has a XLUT handle programmed.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: facaed62b4 ("drm/nouveau/kms/gv100: initial support")
Cc: <stable@vger.kernel.org> # v4.18+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 305e519ce48e935702c32241f07d393c3c8fed3e ]
Walter Wu has reported a potential case in which init_stack_slab() is
called after stack_slabs[STACK_ALLOC_MAX_SLABS - 1] has already been
initialized. In that case init_stack_slab() will overwrite
stack_slabs[STACK_ALLOC_MAX_SLABS], which may result in a memory
corruption.
Link: http://lkml.kernel.org/r/20200218102950.260263-1-glider@google.com
Fixes: cd11016e5f ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reported-by: Walter Wu <walter-zh.wu@mediatek.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 679aac5ead2f18d223554a52b543e1195e181811 ]
RX cancel command fails when BT is switched on and off multiple times.
To handle this, poll for the cancel bit in SE_GENI_S_IRQ_STATUS register
instead of SE_GENI_S_CMD_CTRL_REG.
As per the HPG update, handle the RX last bit after cancel command
and flush out the RX FIFO buffer.
Signed-off-by: satya priya <skakit@codeaurora.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1581415982-8793-1-git-send-email-skakit@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bdc05a8a3f822ca0662464055f902faf760da6be ]
The driver only supports FIFO mode so setting and checking this variable
is unnecessary. If DMA support is ever added then such checks can be
introduced.
Signed-off-by: Ryan Case <ryandcase@chromium.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a85fb9ce1fab34a3216fd4d769fede643dbc68d4 ]
The variables of tx_wm and rx_wm were set to the same define value in
all cases, never updated, and the define was sometimes used
interchangably. Remove the variables/function and use the fixed value.
Signed-off-by: Ryan Case <ryandcase@chromium.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9e06d55f7b856bfaf82036b50072600b21e52d20 ]
A frequent side comment has been to remove the use of writel_relaxed,
readl_relaxed, and mb. This reduces driver complexity and the _relaxed
variants were not known to provide any noticeable performance benefit.
Signed-off-by: Ryan Case <ryandcase@chromium.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 64a428077758383518c258641e81d57fcd454792 ]
Disable M_TX_FIFO_WATERMARK_EN after we've sent all data for a given
transaction so we don't continue to receive a flurry of free space
interrupts while waiting for the M_CMD_DONE notification. Re-enable the
watermark when establishing the next transaction.
Also clear the watermark interrupt after filling the FIFO so we do not
receive notification again prior to actually having free space.
Signed-off-by: Ryan Case <ryandcase@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 663abb1a7a7ff8fea9ab0145463de7fcff823755 ]
If a serial console write occured while a UART transmit command was
waiting for a done signal then no further data would be sent until
something new kicked the system into gear. If there is already data
waiting in the circular buffer we must re-enable the tx watermark so we
receive the expected interrupts.
Signed-off-by: Ryan Case <ryandcase@chromium.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 7455a8327674e1a7c9a1f5dd1b0743ab6713f6d1 upstream.
Commit 13db77347d ("KVM: x86: don't notify userspace IOAPIC on edge
EOI") said, edge-triggered interrupts don't set a bit in TMR, which means
that IOAPIC isn't notified on EOI. And var level indicates level-triggered
interrupt.
But commit 3159d36ad7 ("KVM: x86: use generic function for MSI parsing")
replace var level with irq.level by mistake. Fix it by changing irq.level
to irq.trig_mode.
Cc: stable@vger.kernel.org
Fixes: 3159d36ad7 ("KVM: x86: use generic function for MSI parsing")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07721feee46b4b248402133228235318199b05ec upstream.
vmx_check_intercept is not yet fully implemented. To avoid emulating
instructions disallowed by the L1 hypervisor, refuse to emulate
instructions by default.
Cc: stable@vger.kernel.org
[Made commit, added commit msg - Oliver]
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3ae87dce3a5abe0b57c811bab02b2564b574106 upstream.
Intel Comet Lake based platform require the XHCI_PME_STUCK_QUIRK
quirk as well. Without this xHC can not enter D3 in runtime suspend.
Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20200210134553.9144-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76073c646f5f4999d763f471df9e38a5a912d70d upstream.
Commit 68600f623d69 ("mm: don't miss the last page because of round-off
error") makes the scan size round up to @denominator regardless of the
memory cgroup's state, online or offline. This affects the overall
reclaiming behavior: the corresponding LRU list is eligible for
reclaiming only when its size logically right shifted by @sc->priority
is bigger than zero in the former formula.
For example, the inactive anonymous LRU list should have at least 0x4000
pages to be eligible for reclaiming when we have 60/12 for
swappiness/priority and without taking scan/rotation ratio into account.
After the roundup is applied, the inactive anonymous LRU list becomes
eligible for reclaiming when its size is bigger than or equal to 0x1000
in the same condition.
(0x4000 >> 12) * 60 / (60 + 140 + 1) = 1
((0x1000 >> 12) * 60) + 200) / (60 + 140 + 1) = 1
aarch64 has 512MB huge page size when the base page size is 64KB. The
memory cgroup that has a huge page is always eligible for reclaiming in
that case.
The reclaiming is likely to stop after the huge page is reclaimed,
meaing the further iteration on @sc->priority and the silbing and child
memory cgroups will be skipped. The overall behaviour has been changed.
This fixes the issue by applying the roundup to offlined memory cgroups
only, to give more preference to reclaim memory from offlined memory
cgroup. It sounds reasonable as those memory is unlikedly to be used by
anyone.
The issue was found by starting up 8 VMs on a Ampere Mustang machine,
which has 8 CPUs and 16 GB memory. Each VM is given with 2 vCPUs and
2GB memory. It took 264 seconds for all VMs to be completely up and
784MB swap is consumed after that. With this patch applied, it took 236
seconds and 60MB swap to do same thing. So there is 10% performance
improvement for my case. Note that KSM is disable while THP is enabled
in the testing.
total used free shared buff/cache available
Mem: 16196 10065 2049 16 4081 3749
Swap: 8175 784 7391
total used free shared buff/cache available
Mem: 16196 11324 3656 24 1215 2936
Swap: 8175 60 8115
Link: http://lkml.kernel.org/r/20200211024514.8730-1-gshan@redhat.com
Fixes: 68600f623d69 ("mm: don't miss the last page because of round-off error")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org> [4.20+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2546287c5fb363a0165933ae2181c92f03e701d0 upstream.
This was noticed when printing debugfs for MSIs on my ARM64 server. The
new dstate IRQD_MSI_NOMASK_QUIRK came out surprisingly while it should only
be the x86 stuff for the time being...
The new MSI quirk flag uses the same bit as IRQ_DOMAIN_NAME_ALLOCATED which
is oddly defined as bit 6 for no good reason.
Switch it to the non used bit 1.
Fixes: 6f1a4891a592 ("x86/apic/msi: Plug non-maskable MSI affinity race")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200221020725.2038-1-yuzenghui@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b7830904e17202524bad1974505a9bfc718d31f upstream.
kmemleak reports a memory leak with the ana_log_buf allocated by
nvme_mpath_init():
unreferenced object 0xffff888120e94000 (size 8208):
comm "nvme", pid 6884, jiffies 4295020435 (age 78786.312s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000e2360188>] kmalloc_order+0x97/0xc0
[<0000000079b18dd4>] kmalloc_order_trace+0x24/0x100
[<00000000f50c0406>] __kmalloc+0x24c/0x2d0
[<00000000f31a10b9>] nvme_mpath_init+0x23c/0x2b0
[<000000005802589e>] nvme_init_identify+0x75f/0x1600
[<0000000058ef911b>] nvme_loop_configure_admin_queue+0x26d/0x280
[<00000000673774b9>] nvme_loop_create_ctrl+0x2a7/0x710
[<00000000f1c7a233>] nvmf_dev_write+0xc66/0x10b9
[<000000004199f8d0>] __vfs_write+0x50/0xa0
[<0000000065466fef>] vfs_write+0xf3/0x280
[<00000000b0db9a8b>] ksys_write+0xc6/0x160
[<0000000082156b91>] __x64_sys_write+0x43/0x50
[<00000000c34fbb6d>] do_syscall_64+0x77/0x2f0
[<00000000bbc574c9>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
nvme_mpath_init() is called by nvme_init_identify() which is called in
multiple places (nvme_reset_work(), nvme_passthru_end(), etc). This
means nvme_mpath_init() may be called multiple times before
nvme_mpath_uninit() (which is only called on nvme_free_ctrl()).
When nvme_mpath_init() is called multiple times, it overwrites the
ana_log_buf pointer with a new allocation, thus leaking the previous
allocation.
To fix this, free ana_log_buf before allocating a new one.
Fixes: 0d0b660f21 ("nvme: add ANA support")
Cc: <stable@vger.kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit edf28f4061afe4c2d9eb1c3323d90e882c1d6800 upstream.
This reverts commit a979558448.
Commit a979558448 ("ipc,sem: remove uneeded sem_undo_list lock usage
in exit_sem()") removes a lock that is needed. This leads to a process
looping infinitely in exit_sem() and can also lead to a crash. There is
a reproducer available in [1] and with the commit reverted the issue
does not reproduce anymore.
Using the reproducer found in [1] is fairly easy to reach a point where
one of the child processes is looping infinitely in exit_sem between
for(;;) and if (semid == -1) block, while it's trying to free its last
sem_undo structure which has already been freed by freeary().
Each sem_undo struct is on two lists: one per semaphore set (list_id)
and one per process (list_proc). The list_id list tracks undos by
semaphore set, and the list_proc by process.
Undo structures are removed either by freeary() or by exit_sem(). The
freeary function is invoked when the user invokes a syscall to remove a
semaphore set. During this operation freeary() traverses the list_id
associated with the semaphore set and removes the undo structures from
both the list_id and list_proc lists.
For this case, exit_sem() is called at process exit. Each process
contains a struct sem_undo_list (referred to as "ulp") which contains
the head for the list_proc list. When the process exits, exit_sem()
traverses this list to remove each sem_undo struct. As in freeary(),
whenever a sem_undo struct is removed from list_proc, it is also removed
from the list_id list.
Removing elements from list_id is safe for both exit_sem() and freeary()
due to sem_lock(). Removing elements from list_proc is not safe;
freeary() locks &un->ulp->lock when it performs
list_del_rcu(&un->list_proc) but exit_sem() does not (locking was
removed by commit a979558448 ("ipc,sem: remove uneeded sem_undo_list
lock usage in exit_sem()").
This can result in the following situation while executing the
reproducer [1] : Consider a child process in exit_sem() and the parent
in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)).
- The list_proc for the child contains the last two undo structs A and
B (the rest have been removed either by exit_sem() or freeary()).
- The semid for A is 1 and semid for B is 2.
- exit_sem() removes A and at the same time freeary() removes B.
- Since A and B have different semid sem_lock() will acquire different
locks for each process and both can proceed.
The bug is that they remove A and B from the same list_proc at the same
time because only freeary() acquires the ulp lock. When exit_sem()
removes A it makes ulp->list_proc.next to point at B and at the same
time freeary() removes B setting B->semid=-1.
At the next iteration of for(;;) loop exit_sem() will try to remove B.
The only way to break from for(;;) is for (&un->list_proc ==
&ulp->list_proc) to be true which is not. Then exit_sem() will check if
B->semid=-1 which is and will continue looping in for(;;) until the
memory for B is reallocated and the value at B->semid is changed.
At that point, exit_sem() will crash attempting to unlink B from the
lists (this can be easily triggered by running the reproducer [1] a
second time).
To prove this scenario instrumentation was added to keep information
about each sem_undo (un) struct that is removed per process and per
semaphore set (sma).
CPU0 CPU1
[caller holds sem_lock(sma for A)] ...
freeary() exit_sem()
... ...
... sem_lock(sma for B)
spin_lock(A->ulp->lock) ...
list_del_rcu(un_A->list_proc) list_del_rcu(un_B->list_proc)
Undo structures A and B have different semid and sem_lock() operations
proceed. However they belong to the same list_proc list and they are
removed at the same time. This results into ulp->list_proc.next
pointing to the address of B which is already removed.
After reverting commit a979558448 ("ipc,sem: remove uneeded
sem_undo_list lock usage in exit_sem()") the issue was no longer
reproducible.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779
Link: http://lkml.kernel.org/r/20191211191318.11860-1-ioanna-maria.alifieraki@canonical.com
Fixes: a979558448 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()")
Signed-off-by: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Acked-by: Herton R. Krzesinski <herton@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: <malat@debian.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 96228b7df33f8eb9006f8ae96949400aed9bd303 upstream.
We've moved from bugzilla to gitlab.
Cc: stable@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200212160434.6437-1-jani.nikula@intel.com
(cherry picked from commit 3a6a4f0810c8ade6f1ff63c34aa9834176b9d88b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c5aae59270fb1f827acce182786094c9ccf598e upstream.
The serdev tty-port controller driver should reset the tty-port client
operations also on deregistration to avoid a NULL-pointer dereference in
case the port is later re-registered as a normal tty device.
Note that this can only happen with tty drivers such as 8250 which have
statically allocated port structures that can end up being reused and
where a later registration would not register a serdev controller (e.g.
due to registration errors or if the devicetree has been changed in
between).
Specifically, this can be an issue for any statically defined ports that
would be registered by 8250 core when an 8250 driver is being unbound.
Fixes: bed35c6dfa ("serdev: add a tty port controller driver")
Cc: stable <stable@vger.kernel.org> # 4.11
Reported-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20200210145730.22762-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 04b5bfe3dc94e64d0590c54045815cb5183fb095 upstream.
In atmel_shutdown() we call atmel_stop_rx() and atmel_stop_tx() functions.
Prevent the rx restart that is implemented in RS485 or ISO7816 modes when
calling atmel_stop_tx() by using the atomic information tasklet_shutdown
that is already in place for this purpose.
Fixes: 98f2082c3a ("tty/serial: atmel: enforce tasklet init and termination sequences")
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200210152053.8289-1-nicolas.ferre@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7febbcbc48fc92e3f33863b32ed715ba4aff18c4 upstream.
The commit 54e53b2e80
("tty: serial: 8250: pass IRQ shared flag to UART ports")
nicely explained the problem:
---8<---8<---
On some systems IRQ lines between multiple UARTs might be shared. If so, the
irqflags have to be configured accordingly. The reason is: The 8250 port startup
code performs IRQ tests *before* the IRQ handler for that particular port is
registered. This is performed in serial8250_do_startup(). This function checks
whether IRQF_SHARED is configured and only then disables the IRQ line while
testing.
This test is performed upon each open() of the UART device. Imagine two UARTs
share the same IRQ line: On is already opened and the IRQ is active. When the
second UART is opened, the IRQ line has to be disabled while performing IRQ
tests. Otherwise an IRQ might handler might be invoked, but the IRQ itself
cannot be handled, because the corresponding handler isn't registered,
yet. That's because the 8250 code uses a chain-handler and invokes the
corresponding port's IRQ handling routines himself.
Unfortunately this IRQF_SHARED flag isn't configured for UARTs probed via device
tree even if the IRQs are shared. This way, the actual and shared IRQ line isn't
disabled while performing tests and the kernel correctly detects a spurious
IRQ. So, adding this flag to the DT probe solves the issue.
Note: The UPF_SHARE_IRQ flag is configured unconditionally. Therefore, the
IRQF_SHARED flag can be set unconditionally as well.
Example stack trace by performing `echo 1 > /dev/ttyS2` on a non-patched system:
|irq 85: nobody cared (try booting with the "irqpoll" option)
| [...]
|handlers:
|[<ffff0000080fc628>] irq_default_primary_handler threaded [<ffff00000855fbb8>] serial8250_interrupt
|Disabling IRQ #85
---8<---8<---
But unfortunately didn't fix the root cause. Let's try again here by moving
IRQ flag assignment from serial_link_irq_chain() to serial8250_do_startup().
This should fix the similar issue reported for 8250_pnp case.
Since this change we don't need to have custom solutions in 8250_aspeed_vuart
and 8250_of drivers, thus, drop them.
Fixes: 1c2f04937b ("serial: 8250: add IRQ trigger support")
Reported-by: Li RongQing <lirongqing@baidu.com>
Cc: Kurt Kanzenbach <kurt@linutronix.de>
Cc: Vikram Pandita <vikram.pandita@ti.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20200211135559.85960-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21b5ee59ef18e27d85810584caf1f7ddc705ea83 upstream.
Commit
aaf248848d ("perf/x86/msr: Add AMD IRPERF (Instructions Retired)
performance counter")
added support for access to the free-running counter via 'perf -e
msr/irperf/', but when exercised, it always returns a 0 count:
BEFORE:
$ perf stat -e instructions,msr/irperf/ true
Performance counter stats for 'true':
624,833 instructions
0 msr/irperf/
Simply set its enable bit - HWCR bit 30 - to make it start counting.
Enablement is restricted to all machines advertising IRPERF capability,
except those susceptible to an erratum that makes the IRPERF return
bad values.
That erratum occurs in Family 17h models 00-1fh [1], but not in F17h
models 20h and above [2].
AFTER (on a family 17h model 31h machine):
$ perf stat -e instructions,msr/irperf/ true
Performance counter stats for 'true':
621,690 instructions
622,490 msr/irperf/
[1] Revision Guide for AMD Family 17h Models 00h-0Fh Processors
[2] Revision Guide for AMD Family 17h Models 30h-3Fh Processors
The revision guides are available from the bugzilla Link below.
[ bp: Massage commit message. ]
Fixes: aaf248848d ("perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Link: http://lkml.kernel.org/r/20200214201805.13830-1-kim.phillips@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 51dede9c05df2b78acd6dcf6a17d21f0877d2d7b upstream.
Accessing the MCA thresholding controls in sysfs concurrently with CPU
hotplug can lead to a couple of KASAN-reported issues:
BUG: KASAN: use-after-free in sysfs_file_ops+0x155/0x180
Read of size 8 at addr ffff888367578940 by task grep/4019
and
BUG: KASAN: use-after-free in show_error_count+0x15c/0x180
Read of size 2 at addr ffff888368a05514 by task grep/4454
for example. Both result from the fact that the threshold block
creation/teardown code frees the descriptor memory itself instead of
defining proper ->release function and leaving it to the driver core to
take care of that, after all sysfs accesses have completed.
Do that and get rid of the custom freeing code, fixing the above UAFs in
the process.
[ bp: write commit message. ]
Fixes: 9526866439 ("[PATCH] x86_64: mce_amd support for family 0x10 processors")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200214082801.13836-1-bp@alien8.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e5cf31fbe651bed7ba1df768f2e123531132417 upstream.
threshold_create_bank() creates a bank descriptor per MCA error
thresholding counter which can be controlled over sysfs. It publishes
the pointer to that bank in a per-CPU variable and then goes on to
create additional thresholding blocks if the bank has such.
However, that creation of additional blocks in
allocate_threshold_blocks() can fail, leading to a use-after-free
through the per-CPU pointer.
Therefore, publish that pointer only after all blocks have been setup
successfully.
Fixes: 019f34fccf ("x86, MCE, AMD: Move shared bank to node descriptor")
Reported-by: Saar Amar <Saar.Amar@microsoft.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200128140846.phctkvx5btiexvbx@kili.mountain
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2464cc4c345699adea52c7aef75707207cb8a2f6 upstream.
After a treclaim, we expect to be in non-transactional state. If we
don't clear the current thread's MSR[TS] before we get preempted, then
tm_recheckpoint_new_task() will recheckpoint and we get rescheduled in
suspended transaction state.
When handling a signal caught in transactional state,
handle_rt_signal64() calls get_tm_stackpointer() that treclaims the
transaction using tm_reclaim_current() but without clearing the
thread's MSR[TS]. This can cause the TM Bad Thing exception below if
later we pagefault and get preempted trying to access the user's
sigframe, using __put_user(). Afterwards, when we are rescheduled back
into do_page_fault() (but now in suspended state since the thread's
MSR[TS] was not cleared), upon executing 'rfid' after completion of
the page fault handling, the exception is raised because a transition
from suspended to non-transactional state is invalid.
Unexpected TM Bad Thing exception at c00000000000de44 (msr 0x8000000302a03031) tm_scratch=800000010280b033
Oops: Unrecoverable exception, sig: 6 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
CPU: 25 PID: 15547 Comm: a.out Not tainted 5.4.0-rc2 #32
NIP: c00000000000de44 LR: c000000000034728 CTR: 0000000000000000
REGS: c00000003fe7bd70 TRAP: 0700 Not tainted (5.4.0-rc2)
MSR: 8000000302a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[SE]> CR: 44000884 XER: 00000000
CFAR: c00000000000dda4 IRQMASK: 0
PACATMSCRATCH: 800000010280b033
GPR00: c000000000034728 c000000f65a17c80 c000000001662800 00007fffacf3fd78
GPR04: 0000000000001000 0000000000001000 0000000000000000 c000000f611f8af0
GPR08: 0000000000000000 0000000078006001 0000000000000000 000c000000000000
GPR12: c000000f611f84b0 c00000003ffcb200 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000f611f8140
GPR24: 0000000000000000 00007fffacf3fd68 c000000f65a17d90 c000000f611f7800
GPR28: c000000f65a17e90 c000000f65a17e90 c000000001685e18 00007fffacf3f000
NIP [c00000000000de44] fast_exception_return+0xf4/0x1b0
LR [c000000000034728] handle_rt_signal64+0x78/0xc50
Call Trace:
[c000000f65a17c80] [c000000000034710] handle_rt_signal64+0x60/0xc50 (unreliable)
[c000000f65a17d30] [c000000000023640] do_notify_resume+0x330/0x460
[c000000f65a17e20] [c00000000000dcc4] ret_from_except_lite+0x70/0x74
Instruction dump:
7c4ff120 e8410170 7c5a03a6 38400000 f8410060 e8010070 e8410080 e8610088
60000000 60000000 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed0989
---[ end trace 93094aa44b442f87 ]---
The simplified sequence of events that triggers the above exception is:
... # userspace in NON-TRANSACTIONAL state
tbegin # userspace in TRANSACTIONAL state
signal delivery # kernelspace in SUSPENDED state
handle_rt_signal64()
get_tm_stackpointer()
treclaim # kernelspace in NON-TRANSACTIONAL state
__put_user()
page fault happens. We will never get back here because of the TM Bad Thing exception.
page fault handling kicks in and we voluntarily preempt ourselves
do_page_fault()
__schedule()
__switch_to(other_task)
our task is rescheduled and we recheckpoint because the thread's MSR[TS] was not cleared
__switch_to(our_task)
switch_to_tm()
tm_recheckpoint_new_task()
trechkpt # kernelspace in SUSPENDED state
The page fault handling resumes, but now we are in suspended transaction state
do_page_fault() completes
rfid <----- trying to get back where the page fault happened (we were non-transactional back then)
TM Bad Thing # illegal transition from suspended to non-transactional
This patch fixes that issue by clearing the current thread's MSR[TS]
just after treclaim in get_tm_stackpointer() so that we stay in
non-transactional state in case we are preempted. In order to make
treclaim and clearing the thread's MSR[TS] atomic from a preemption
perspective when CONFIG_PREEMPT is set, preempt_disable/enable() is
used. It's also necessary to save the previous value of the thread's
MSR before get_tm_stackpointer() is called so that it can be exposed
to the signal handler later in setup_tm_sigcontexts() to inform the
userspace MSR at the moment of the signal delivery.
Found with tm-signal-context-force-tm kernel selftest.
Fixes: 2b0a576d15 ("powerpc: Add new transactional memory state to the signal context")
Cc: stable@vger.kernel.org # v3.9
Signed-off-by: Gustavo Luiz Duarte <gustavold@linux.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200211033831.11165-1-gustavold@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 23954cb078febfc63a755301fe77e06bccdb4d2a upstream.
In routine wpa_supplicant_ioctl(), the user-controlled p->length is
checked to be at least the size of struct ieee_param size, but the code
does not detect the case where p->length is greater than the size
of the struct, thus a malicious user could be wasting kernel memory.
Fixes commit 554c0a3abf ("staging: Add rtl8723bs sdio wifi driver").
Reported by: Pietro Oliva <pietroliva@gmail.com>
Cc: Pietro Oliva <pietroliva@gmail.com>
Cc: Stable <stable@vger.kernel.org>
Fixes: 554c0a3abf ("staging: Add rtl8723bs sdio wifi driver").
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Link: https://lore.kernel.org/r/20200210180235.21691-5-Larry.Finger@lwfinger.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac33597c0c0d1d819dccfe001bcd0acef7107e7c upstream.
In routine rtw_hostapd_ioctl(), the user-controlled p->length is assumed
to be at least the size of struct ieee_param size, but this assumption is
never checked. This could result in out-of-bounds read/write on kernel
heap in case a p->length less than the size of struct ieee_param is
specified by the user. If p->length is allowed to be greater than the size
of the struct, then a malicious user could be wasting kernel memory.
Fixes commit 554c0a3abf ("0taging: Add rtl8723bs sdio wifi driver").
Reported by: Pietro Oliva <pietroliva@gmail.com>
Cc: Pietro Oliva <pietroliva@gmail.com>
Cc: Stable <stable@vger.kernel.org>
Fixes 554c0a3abf ("0taging: Add rtl8723bs sdio wifi driver").
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Link: https://lore.kernel.org/r/20200210180235.21691-3-Larry.Finger@lwfinger.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4ddf8ab8d15ddbc52eefb44eb64e38466ce1f70f upstream.
In routine wpa_supplicant_ioctl(), the user-controlled p->length is
checked to be at least the size of struct ieee_param size, but the code
does not detect the case where p->length is greater than the size
of the struct, thus a malicious user could be wasting kernel memory.
Fixes commit a2c60d42d9 ("Add files for new driver - part 16").
Reported by: Pietro Oliva <pietroliva@gmail.com>
Cc: Pietro Oliva <pietroliva@gmail.com>
Cc: Stable <stable@vger.kernel.org>
Fixes commit a2c60d42d9 ("Add files for new driver - part 16").
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Link: https://lore.kernel.org/r/20200210180235.21691-4-Larry.Finger@lwfinger.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 499c405b2b80bb3a04425ba3541d20305e014d3e upstream.
In routine rtw_hostapd_ioctl(), the user-controlled p->length is assumed
to be at least the size of struct ieee_param size, but this assumption is
never checked. This could result in out-of-bounds read/write on kernel
heap in case a p->length less than the size of struct ieee_param is
specified by the user. If p->length is allowed to be greater than the size
of the struct, then a malicious user could be wasting kernel memory.
Fixes commit a2c60d42d9 ("Add files for new driver - part 16").
Reported by: Pietro Oliva <pietroliva@gmail.com>
Cc: Pietro Oliva <pietroliva@gmail.com>
Cc: Stable <stable@vger.kernel.org>
Fixes: a2c60d42d9 ("staging: r8188eu: Add files for new driver - part 16")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Link: https://lore.kernel.org/r/20200210180235.21691-2-Larry.Finger@lwfinger.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5ee858975b13a9b40db00f456989a689fdbb296c upstream.
The current code in dwc3_gadget_ep_reclaim_completed_trb() will
check for IOC/LST bit in the event->status and returns if
IOC/LST bit is set. This logic doesn't work if multiple TRBs
are queued per request and the IOC/LST bit is set on the last
TRB of that request.
Consider an example where a queued request has multiple queued
TRBs and IOC/LST bit is set only for the last TRB. In this case,
the core generates XferComplete/XferInProgress events only for
the last TRB (since IOC/LST are set only for the last TRB). As
per the logic in dwc3_gadget_ep_reclaim_completed_trb()
event->status is checked for IOC/LST bit and returns on the
first TRB. This leaves the remaining TRBs left unhandled.
Similarly, if the gadget function enqueues an unaligned request
with sglist already in it, it should fail the same way, since we
will append another TRB to something that already uses more than
one TRB.
To aviod this, this patch changes the code to check for IOC/LST
bits in TRB->ctrl instead.
At a practical level, this patch resolves USB transfer stalls seen
with adb on dwc3 based HiKey960 after functionfs gadget added
scatter-gather support around v4.20.
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Yang Fei <fei.yang@intel.com>
Cc: Thinh Nguyen <thinhn@synopsys.com>
Cc: Tejas Joglekar <tejas.joglekar@synopsys.com>
Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Cc: Jack Pham <jackp@codeaurora.org>
Cc: Todd Kjos <tkjos@google.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linux USB List <linux-usb@vger.kernel.org>
Cc: stable <stable@vger.kernel.org>
Tested-by: Tejas Joglekar <tejas.joglekar@synopsys.com>
Reviewed-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com>
[jstultz: forward ported to mainline, reworded commit log, reworked
to only check trb->ctrl as suggested by Felipe]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a0d6f7c0a83844baae1d6d85482863d2bf3b7a7 upstream.
SET/CLEAR_FEATURE for Remote Wakeup allowance not handled correctly.
GET_STATUS handling provided not correct data on DATA Stage.
Issue seen when gadget's dr_mode set to "otg" mode and connected
to MacOS.
Both are fixed and tested using USBCV Ch.9 tests.
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Fixes: fa389a6d77 ("usb: dwc2: gadget: Add remote_wakeup_allowed flag")
Tested-by: Jack Mitchell <ml@embed.me.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1208f9e1d758c991b0a46a1bd60c616b906bbe27 upstream.
Renesas R-Car H3ULCB + Kingfisher Infotainment Board is either not able
to detect the USB3.0 mass storage devices or is detecting those as
USB2.0 high speed devices.
The explanation given by Renesas is that, due to a HW issue, the XHCI
driver does not wake up after going to sleep on connecting a USB3.0
device.
In order to mitigate that, disable the auto-suspend feature
specifically for SMSC hubs from hub_probe() function, as a quirk.
Renesas Kingfisher Infotainment Board has two USB3.0 ports (CN2) which
are connected via USB5534B 4-port SuperSpeed/Hi-Speed, low-power,
configurable hub controller.
[1] SanDisk USB 3.0 device detected as USB-2.0 before the patch
[ 74.036390] usb 5-1.1: new high-speed USB device number 4 using xhci-hcd
[ 74.061598] usb 5-1.1: New USB device found, idVendor=0781, idProduct=5581, bcdDevice= 1.00
[ 74.069976] usb 5-1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 74.077303] usb 5-1.1: Product: Ultra
[ 74.080980] usb 5-1.1: Manufacturer: SanDisk
[ 74.085263] usb 5-1.1: SerialNumber: 4C530001110208116550
[2] SanDisk USB 3.0 device detected as USB-3.0 after the patch
[ 34.565078] usb 6-1.1: new SuperSpeed Gen 1 USB device number 3 using xhci-hcd
[ 34.588719] usb 6-1.1: New USB device found, idVendor=0781, idProduct=5581, bcdDevice= 1.00
[ 34.597098] usb 6-1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 34.604430] usb 6-1.1: Product: Ultra
[ 34.608110] usb 6-1.1: Manufacturer: SanDisk
[ 34.612397] usb 6-1.1: SerialNumber: 4C530001110208116550
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Hardik Gajjar <hgajjar@de.adit-jv.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1580989763-32291-1-git-send-email-hgajjar@de.adit-jv.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8099f58f1ecddf4f374f4828a3dff8397c7cbd74 upstream.
Paul Zimmerman reports that his USB Bluetooth adapter sometimes
crashes following system resume, when it receives a
Get-Device-Descriptor request while it is busy doing something else.
Such a request was added by commit a4f55d8b8c14 ("usb: hub: Check
device descriptor before resusciation"). It gets sent when the hub
driver's work thread checks whether a connect-change event on an
enabled port really indicates a new device has been connected, as
opposed to an old device momentarily disconnecting and then
reconnecting (which can happen with xHCI host controllers, since they
automatically enable connected ports).
The same kind of thing occurs when a port's power session is lost
during system suspend. When the system wakes up it sees a
connect-change event on the port, and if the child device's
persist_enabled flag was set then hub_activate() sets the device's
reset_resume flag as well as the port's bit in hub->change_bits. The
reset-resume code then takes responsibility for checking that the same
device is still attached to the port, and it does this as part of the
device's resume pathway. By the time the hub driver's work thread
starts up again, the device has already been fully reinitialized and
is busy doing its own thing. There's no need for the work thread to
do the same check a second time, and in fact this unnecessary check is
what caused the problem that Paul observed.
Note that performing the unnecessary check is not actually a bug.
Devices are supposed to be able to send descriptors back to the host
even when they are busy doing something else. The underlying cause of
Paul's problem lies in his Bluetooth adapter. Nevertheless, we
shouldn't perform the same check twice in a row -- and as a nice side
benefit, removing the extra check allows the Bluetooth adapter to work
more reliably.
The work thread performs its check when it sees that the port's bit is
set in hub->change_bits. In this situation that bit is interpreted as
though a connect-change event had occurred on the port _after_ the
reset-resume, which is not what actually happened.
One possible fix would be to make the reset-resume code clear the
port's bit in hub->change_bits. But it seems simpler to just avoid
setting the bit during hub_activate() in the first place. That's what
this patch does.
(Proving that the patch is correct when CONFIG_PM is disabled requires
a little thought. In that setting hub_activate() will be called only
for initialization and resets, since there won't be any resumes or
reset-resumes. During initialization and hub resets the hub doesn't
have any child devices, and so this code path never gets executed.)
Reported-and-tested-by: Paul Zimmerman <pauldzim@gmail.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://marc.info/?t=157949360700001&r=1&w=2
CC: David Heinzelmann <heinzelmann.david@gmail.com>
CC: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.2001311037460.1577-100000@iolanthe.rowland.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b692056db8ecc7f452b934f016c17348282b7699 upstream.
Currently, the SourceControl will stay in power-down mode after resuming
from suspend. This patch resets the device after suspend to power it up.
Signed-off-by: Richard Dodd <richard.o.dodd@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200212142220.36892-1-richard.o.dodd@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e99862c05a9caa5a27969f41566b428696f5a9a upstream.
When a uas disk is plugged into an external hub, uas_probe()
will be called by the hub thread to do the probe. It will
first create a SCSI host and then do the scan for this host.
During the scan, it will probe the LUN using SCSI INQUERY command
which will be packed in the URB and submitted to uas disk.
There might be a chance that this external hub with uas disk
attached is unplugged during the scan. In this case, uas driver
will fail to submit the URB (due to the NOTATTACHED state of uas
device) and try to put this SCSI command back to request queue
waiting for next chance to run.
In normal case, this cycle will terminate when hub thread gets
disconnection event and calls into uas_disconnect() accordingly.
But in this case, uas_disconnect() will not be called because
hub thread of external hub gets stuck waiting for the completion
of this SCSI command. A deadlock happened.
In this fix, uas will call scsi_scan_host() asynchronously to
avoid the blocking of hub thread.
Signed-off-by: EJ Hsu <ejh@nvidia.com>
Acked-by: Oliver Neukum <oneukum@suse.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200130092506.102760-1-ejh@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73f8bda9b5dc1c69df2bc55c0cbb24461a6391a9 upstream.
Add a new device quirk that can be used to blacklist endpoints.
Since commit 3e4f8e21c4f2 ("USB: core: fix check for duplicate
endpoints") USB core ignores any duplicate endpoints found during
descriptor parsing.
In order to handle devices where the first interfaces with duplicate
endpoints are the ones that should have their endpoints ignored, we need
to add a blacklist.
Tested-by: edes <edes@gmx.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20200203153830.26394-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc0ffbea5729a3abafa577ebfce87f18b79e294b upstream.
On some situations, the software handles TRB events slower
than adding TRBs, then xhci_handle_event can't return zero
long time, the xHC will consider the event ring is full,
and trigger "Event Ring Full" error, but in fact, the software
has already finished lots of events, just no chance to
update ERDP (event ring dequeue pointer).
In this commit, we force update ERDP if half of TRBS_PER_SEGMENT
events have handled to avoid "Event Ring Full" error.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/1573836603-10871-2-git-send-email-mathias.nyman@linux.intel.com
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cf0ee7c60c89641f6e4d1d3c7867fe32b9e30300 upstream.
xhci driver assumed that xHC controllers have at most one custom
supported speed table (PSI) for all usb 3.x ports.
Memory was allocated for one PSI table under the xhci hub structure.
Turns out this is not the case, some controllers have a separate
"supported protocol capability" entry with a PSI table for each port.
This means each usb3 roothub port can in theory support different custom
speeds.
To solve this, cache all supported protocol capabilities with their PSI
tables in an array, and add pointers to the xhci port structure so that
every port points to its capability entry in the array.
When creating the SuperSpeedPlus USB Device Capability BOS descriptor
for the xhci USB 3.1 roothub we for now will use only data from the
first USB 3.1 capable protocol capability entry in the array.
This could be improved later, this patch focuses resolving
the memory leak.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reported-by: Sajja Venkateswara Rao <VenkateswaraRao.Sajja@amd.com>
Fixes: 47189098f8 ("xhci: parse xhci protocol speed ID list for usb 3.1 usage")
Cc: stable <stable@vger.kernel.org> # v4.4+
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20200211150158.14475-1-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 024d411e9c5d49eb96c825af52a3ce2682895676 upstream.
Intel hosts that need the XHCI_PME_STUCK_QUIRK flag should enable
runtime pm by calling xhci_pme_acpi_rtd3_enable() before
usb_hcd_pci_probe() calls pci_dev_run_wake().
Otherwise usage count for the device won't be decreased, and runtime
suspend is prevented.
usb_hcd_pci_probe() only decreases the usage count if device can
generate run-time wake-up events, i.e. when pci_dev_run_wake()
returns true.
This issue was exposed by pci_dev_run_wake() change in
commit 8feaec33b9 ("PCI / PM: Always check PME wakeup capability for
runtime wakeup support")
and should be backported to kernels with that change
Cc: <stable@vger.kernel.org> # 4.13+
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20200210134553.9144-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>