[ Upstream commit ea956d8be91edc702a98b7fe1f9463e7ca8c42ab ]
Empty executable arguments were being skipped when printing out the list
of arguments in an EXECVE record, making it appear they were somehow
lost. Include empty arguments as an itemized empty string.
Reproducer:
autrace /bin/ls "" "/etc"
ausearch --start recent -m execve -i | grep EXECVE
type=EXECVE msg=audit(10/03/2018 13:04:03.208:1391) : argc=3 a0=/bin/ls a2=/etc
With fix:
type=EXECVE msg=audit(10/03/2018 21:51:38.290:194) : argc=3 a0=/bin/ls a1= a2=/etc
type=EXECVE msg=audit(1538617898.290:194): argc=3 a0="/bin/ls" a1="" a2="/etc"
Passes audit-testsuite. GH issue tracker at
https://github.com/linux-audit/audit-kernel/issues/99
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: cleaned up the commit metadata]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fb14ada11d62fb849fc357a25ef8016ba438ba10 ]
When the kernel is built with CONFIG_THUMB2_KERNEL, we would set the
kernel's resume entry point to be a function that is already built as
Thumb-2 code, while the boot agent doing the resume is in ARM mode, so
this does not work. There is a label defined in a header, cpu_resume_arm,
which we can use to do the switching for us.
Fixes: 0b741b8234 ("soc: bcm: brcmstb: Add support for S2/S3/S5 suspend states (ARM)")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 859783d1390035e29ba850963bded2b4ffdf43b5 ]
In the user manual of the A64 SoC, bits 22 and 23 of the pll-mipi control
register are called "LDO{1,2}_EN", and according to the BSP source code
from Allwinner, the LDOs are enabled during the clock's enabling
process.
The clock fails to generate output if the two LDOs are not enabled.
Add the two bits to the clock's gate bits, so that the LDOs are enabled
when the PLL is enabled.
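A rough sketch of what the combined gate mask looks like in a sunxi-ng
clock descriptor (the .enable field name and the BIT(31) PLL-enable bit
are assumptions based on the usual sunxi-ng layout, not the verbatim
patch):

  /* gate the two MIPI LDOs together with the PLL enable bit */
  .enable = BIT(31) | BIT(23) | BIT(22),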
Fixes: c6a0637460 ("clk: sunxi-ng: Add A64 clocks")
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1ad9fb750a104f51851c092edd7b3553f0218428 ]
Bindings for "fixed-regulator" only explicitly support "gpio" property,
not "gpios". Fix by correcting the property name.
The enet PHYs on imx6sx-sdb need to be explicitly reset after a power
cycle; this can be handled by the phy-reset-gpios property. Sadly this
is not handled on suspend: the fec driver turns phy-supply off but
doesn't assert phy-reset-gpios again on resume.
Since additional phy-level work is required to support powering off the
phy in suspend, fix the problem by just marking the regulator as
"boot-on" and "always-on" so that it's never turned off. This behavior is
equivalent to older releases.
Keep the phy-reset-gpios property on fec anyway because it is a correct
description of board design.
This issue was exposed by commit efdfeb079cc3 ("regulator: fixed:
Convert to use GPIO descriptor only") which causes the "gpios" property
to also be parsed. Before that commit the "gpios" property had no
effect; PHY reset was only handled in the bootloader.
This fixes linux-next boot failures previously reported here:
https://lore.kernel.org/patchwork/patch/982437/#1177900
https://lore.kernel.org/patchwork/patch/994091/#1178304
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a277d516de5f498c91d91189717ef7e01102ad27 ]
When CONFIG_CC_OPTIMIZE_FOR_DEBUGGING is enabled, the compiler
fails to optimize out a dead code path, which leads to a link failure:
net/openvswitch/conntrack.o: In function `ovs_ct_set_labels':
conntrack.c:(.text+0x2e60): undefined reference to `nf_connlabels_replace'
In this configuration, we can take a shortcut, and completely
remove the conntrack label code. This may also help the regular
optimization.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3f130a37c442d5c4d66531b240ebe9abfef426b5 ]
When load_balance() fails to move some load because of task affinity,
we end up increasing sd->balance_interval to delay the next periodic
balance in the hopes that next time we look, that annoying pinned
task(s) will be gone.
However, idle_balance() pays no attention to sd->balance_interval, yet
it will still lead to an increase in balance_interval in case of
pinned tasks.
If we're going through several newidle balances (e.g. we have a
periodic task), this can lead to a huge increase of the
balance_interval in a very small amount of time.
To prevent that, don't increase the balance interval when going
through a newidle balance.
This is a similar approach to what is done in commit 58b26c4c02
("sched: Increment cache_nice_tries only on periodic lb"), where we
disregard newidle balance and rely on periodic balance for more stable
results.
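A minimal sketch of the guard described above, assuming the usual
load_balance() identifiers (CPU_NEWLY_IDLE, LBF_ALL_PINNED); illustrative,
not the literal hunk:

  /* only widen the interval for pinned tasks on periodic balancing;
   * a newidle balance should leave sd->balance_interval untouched */
  if (idle != CPU_NEWLY_IDLE && (env.flags & LBF_ALL_PINNED))
          sd->balance_interval = min(sd->balance_interval * 2,
                                     sd->max_interval);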
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: patrick.bellasi@arm.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1537974727-30788-2-git-send-email-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 993f0b0510dad98b4e6e39506834dab0d13fd539 ]
With the addition of the NUMA identity level, we increased @level by
one and will run off the end of the array in the distance sort loop.
Fixes: 051f3ca02e ("sched/topology: Introduce NUMA identity node sched domain")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe60faa5063822f2d555f4f326c7dd72a60929bf ]
Before calling dev_hard_start_xmit(), upper layers tried
to cook an optimal skb list based on the BQL budget.
The problem is that GSO packets can end up consuming more than
the BQL budget.
Breaking the loop is not useful, since requeued packets
are ahead of any packets still in the qdisc.
It is also more expensive, since the next TX completion will
push these packets later, while the skbs are no longer in CPU caches.
It is also a behavioral difference from TSO packets, which can
break the BQL limit by a large amount.
Note that drivers should use __netdev_tx_sent_queue()
in order to have optimal xmit_more support, and avoid
useless atomic operations as shown in the following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6194ae4242dec0c9d604bc05df83aa9260a899e4 ]
ocfs2_defrag_extent() might leak allocated clusters. When the file
system has insufficient space, the number of claimed clusters might be
less than the caller wants. If that happens, the original code might
directly commit the transaction without returning clusters.
This patch is based on code in ocfs2_add_clusters_in_btree().
[akpm@linux-foundation.org: include localalloc.h, reduce scope of data_ac]
Link: http://lkml.kernel.org/r/20180904041621.16874-3-lchen@suse.com
Signed-off-by: Larry Chen <lchen@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cf76c78595ca87548ca5e45c862ac9e0949c4687 ]
ocfs2_read_blocks() and ocfs2_read_blocks_sync() are both used to read
several blocks from disk. Currently, the input argument *bhs* may or may
not be NULL, depending on the caller. If the functions fail to read a
block from disk, the corresponding bh is put and set to NULL.
Obviously, this is not appropriate for a non-NULL input bh, because the
caller doesn't even know its bhs have been put and reassigned. If the
buffer heads are managed by the caller, ocfs2_read_blocks() and
ocfs2_read_blocks_sync() should not set them to NULL; doing so makes the
caller access invalid memory and crash.
Link: http://lkml.kernel.org/r/HK2PR06MB045285E0F4FBB561F9F2F9B3D5680@HK2PR06MB0452.apcprd06.prod.outlook.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Guozhonghua <guozhonghua@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9e985787750db8aae87f02b67e908f28ac4d6b83 ]
When -EIOCBQUEUED is returned, it means that aio_complete() will be called
from dio_complete(), which is an asynchronous process running concurrently
with write_iter. Generally, IO is much slower than executing instructions,
but we still can't take the risk of accessing a freed iocb.
And we did face a BUG crash issue. Using the crash tool, the iocb is
obviously already freed.
crash> struct -x kiocb ffff881a350f5900
struct kiocb {
ki_filp = 0xffff881a350f5a80,
ki_pos = 0x0,
ki_complete = 0x0,
private = 0x0,
ki_flags = 0x0
}
And the backtrace shows:
ocfs2_file_write_iter+0xcaa/0xd00 [ocfs2]
aio_run_iocb+0x229/0x2f0
do_io_submit+0x291/0x540
SyS_io_submit+0x10/0x20
system_call_fastpath+0x16/0x75
Link: http://lkml.kernel.org/r/1523361653-14439-1-git-send-email-ge.changwei@h3c.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 21158ca85b73ddd0088076a5209cfd040513a8b5 ]
During one dead node's recovery by another node, quota recovery work
will be queued. We should avoid calling into the quota code when quota
is not supported, so check the quota flags first.
Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA401071AC9FB@H3CMLB12-EX.srv.huawei-3com.com
Signed-off-by: guozhonghua <guozhonghua@h3c.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e68599a3c3ad0f3171a7cb4e48aa6f9a69381902 ]
Mike Galbraith reported a regression caused by the commit 9b6f7e163cd0
("mm: rework memcg kernel stack accounting") on a system with
"cgroup_disable=memory" boot option: the system panics with the following
stack trace:
BUG: unable to handle kernel NULL pointer dereference at 00000000000000f8
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 0 PID: 1 Comm: systemd Not tainted 4.19.0-preempt+ #410
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fed4
RIP: 0010:page_counter_try_charge+0x22/0xc0
Code: 41 5d c3 c3 0f 1f 40 00 0f 1f 44 00 00 48 85 ff 0f 84 a7 00 00 00 41 56 48 89 f8 49 89 fe 49
Call Trace:
try_charge+0xcb/0x780
memcg_kmem_charge_memcg+0x28/0x80
memcg_kmem_charge+0x8b/0x1d0
copy_process.part.41+0x1ca/0x2070
_do_fork+0xd7/0x3d0
do_syscall_64+0x5a/0x180
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The problem occurs because get_mem_cgroup_from_current() returns a NULL
pointer if the memory controller is disabled. Let's check for this case
at the beginning of memcg_kmem_charge() and just return 0 if
mem_cgroup_disabled() returns true. This is how we handle this case in
many other places in the memory controller code.
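A minimal sketch of that early return (signature as in the 4.19-era code,
the rest of the function elided):

  int memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
  {
          if (mem_cgroup_disabled())
                  return 0;
          /* ... existing path via get_mem_cgroup_from_current() ... */
  }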
Link: http://lkml.kernel.org/r/20181029215123.17830-1-guro@fb.com
Fixes: 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Mike Galbraith <efault@gmx.de>
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3404155190ce09a1e5d8407e968fc19aac4493e3 ]
turbostat recently gained a feature adding APIC and X2APIC columns.
While they are disabled by default, they are enabled with --debug
or when explicitly requested, e.g.:
$ sudo turbostat --quiet --show Package,Node,Core,CPU,APIC,X2APIC date
But these columns erroneously showed zeros on AMD hardware.
This patch corrects the APIC and X2APIC columns on AMD.
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 98356eb0ae499c63e78073ccedd9a5fc5c563288 ]
After commit a66649dab350 ("arm64: fix vdso-offsets.h dependency"),
trying to build a .i file for an external kernel module fails,
complaining that the prepare0 target is missing. This issue came up
with SystemTap, which tries to build a variety of .i files for its own
generated kernel modules in order to figure out the given kernel's
features/capabilities.
The issue is that prepare0 is defined in the top-level Makefile only if
KBUILD_EXTMOD is not defined. The .i file rule depends on prepare, and
when KBUILD_EXTMOD is defined the top-level Makefile contains an empty
rule for prepare. But after the mentioned commit, arch/arm64/Makefile
would introduce a dependency on prepare0 through its own prepare target.
Fix it by putting a proper ifdef of KBUILD_EXTMOD around the code
introduced by the mentioned commit. This matches what the top-level
Makefile does.
Acked-by: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9fe5c59ff6a1e5e26a39b75489a1420e7eaaf0b1 ]
The nvme pci driver had been adding its CMB resource to the P2P DMA
subsystem every time on a controller reset. This results in the
following warning:
------------[ cut here ]------------
nvme 0000:00:03.0: Conflicting mapping in same section
WARNING: CPU: 7 PID: 81 at kernel/memremap.c:155 devm_memremap_pages+0xa6/0x380
...
Call Trace:
pci_p2pdma_add_resource+0x153/0x370
nvme_reset_work+0x28c/0x17b1 [nvme]
? add_timer+0x107/0x1e0
? dequeue_entity+0x81/0x660
? dequeue_entity+0x3b0/0x660
? pick_next_task_fair+0xaf/0x610
? __switch_to+0xbc/0x410
process_one_work+0x1cf/0x350
worker_thread+0x215/0x3d0
? process_one_work+0x350/0x350
kthread+0x107/0x120
? kthread_park+0x80/0x80
ret_from_fork+0x1f/0x30
---[ end trace f7ea76ac6ee72727 ]---
nvme nvme0: failed to register the CMB
This patch fixes this by registering the CMB with P2P only once.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 57f01796f14fecf00d330fe39c8d2477ced9cd79 ]
IRQ_MATRIX_SIZE is the number of longs needed for a bitmap, multiplied by
the size of a long, yielding a byte count. But it is used to size an array
of longs, which is way more memory than is needed.
Change IRQ_MATRIX_SIZE so it is just the number of longs needed and the
arrays come out the correct size.
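A standalone illustration of the sizing mix-up; the macro names below are
local to the demo, not the kernel's:

  #include <stdio.h>

  #define MATRIX_BITS      256
  #define BITS_PER_LONG    (8 * sizeof(unsigned long))
  #define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

  int main(void)
  {
          /* buggy: longs-needed times sizeof(long) is a byte count... */
          size_t wrong = BITS_TO_LONGS(MATRIX_BITS) * sizeof(unsigned long);
          /* fixed: just the number of longs the bitmap needs */
          size_t right = BITS_TO_LONGS(MATRIX_BITS);

          /* ...but both were used to declare "unsigned long map[SIZE]" */
          printf("array of %zu longs vs %zu longs\n", wrong, right);
          return 0;
  }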
Fixes: 2f75d9e1c9 ("genirq: Implement bitmap matrix allocator")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: KY Srinivasan <kys@microsoft.com>
Link: https://lkml.kernel.org/r/1541032428-10392-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7756e2b5d68c36e170a111dceea22f7365f83256 ]
ndev_vec_mask() should return a u64 mask value instead of an int.
Otherwise the mask value returned can be incorrect for larger
vectors.
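A standalone illustration of why the wider return type matters (the vector
number is made up; this is not the driver code):

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
          unsigned int vec = 40;                  /* hypothetical vector number */
          uint64_t mask = (uint64_t)1 << vec;     /* what a u64 return preserves */
          unsigned int truncated = (unsigned int)mask; /* what an int return keeps */

          printf("u64 mask = 0x%llx, truncated = 0x%x\n",
                 (unsigned long long)mask, truncated);
          return 0;
  }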
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Lucas Van <lucas.van@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a861594b1b7ffd630f335b351c4e9f938feadb8e ]
The tx_time should be in usecs (according to the comment above the
variable), but the setting of the timer during the rearming is done in
msecs. Change it to match the expected units.
Fixes: e74bfeedad ("NTB: Add flow control to the ntb_netdev")
Suggested-by: Gerd W. Haeussler <gerd.haeussler@cesys-it.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1c12493809924deda6c0834cb2f2c5a6dc786390 ]
When there is a PHY, the driver needs to complete some operations through
MDIO during reset reinitialization, so HCLGE_STATE_CMD_DISABLE is more
suitable than HCLGE_STATE_RST_HANDLING to prevent the MDIO operation from
being sent during the hardware reset.
Fixes: b50ae26c57 ("net: hns3: never send command queue message to IMP when reset")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6d71ec6cbf74ac9c2823ef751b1baa5b889bb3ac ]
The HEAD pointer of the hardware command queue may be equal to the command
queue's next_to_use in the driver, since the hardware may not have processed
the command in time and thus has not yet updated the HEAD pointer; this case
is not an invalid HEAD pointer. Also, the variable names in this function
are hard to read, so give them more readable ones.
Fixes: 3ff504908f ("net: hns3: fix a dead loop in hclge_cmd_csq_clean")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0d4411408a7fb9aad0645f23911d9bfdd2ce3177 ]
The current driver supports handling two vector0 interrupt sources, reset
and mailbox. When the hardware reports an interrupt from another type of
interrupt source and the driver re-enables the interrupt without processing
it, the hardware will repeatedly report the unknown interrupt.
Therefore, the driver now only re-enables the vector0 interrupt after
clearing a known type of interrupt source; otherwise it is left disabled.
Fixes: cd8c5c269b ("net: hns3: Fix for hclge_reset running repeatly problem")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 73b907a083b8a8c1c62cb494bc9fbe6ae086c460 ]
When hns3_get_ring_config(), hns3_queue_to_ring() or
hns3_get_vector_ring_chain() fails during resetting, the memory they
allocated has not been freed before these three functions return. So
this patch adds error handlers in these functions to fix it.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e330af788998b0de4da4f5bd7ddd087507999800 ]
VF drivers can trigger PCIe completer aborts any time they read a queue
that they don't own. Even in nominal circumstances, it is not possible
to prevent the VF driver from reading queues it doesn't own. A VF driver
may attempt to read queues it previously owned, but which it no longer
does due to a PF reset.
Normally these completer aborts aren't an issue. However, on some
platforms these trigger machine check errors. This is true even if we
lower their severity from fatal to non-fatal. Indeed, we already have
code for lowering the severity.
We could attempt to mask these errors conditionally around resets, which
is the most common time they would occur. However this would essentially
be a race between the PF and VF drivers, and we may still occasionally
see machine check exceptions on these strictly configured platforms.
Instead, mask the errors entirely any time we resume VFs. By doing so,
we prevent the completer aborts from being sent to the parent PCIe
device, and thus these strict platforms will not upgrade them into
machine check errors.
Additionally, we don't lose any information by masking these errors,
because we'll still report VFs which attempt to access queues via the
FUM_BAD_VF_QACCESS errors.
Without this change, on platforms where completer aborts cause machine
check exceptions, the VF reading queues it doesn't own could crash the
host system. Masking the completer abort prevents this, so we should
mask it for good, and not just around a PCIe reset. Otherwise malicious
or misconfigured VFs could cause the host system to crash.
Because we are masking the error entirely, there is little reason to
also keep setting the severity bit, so that code is also removed.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 094bf4d0e9657f6ea1ee3d7e07ce3970796949ce ]
The timecounter needs to be updated at least once per ~550 seconds in
order to avoid a 40-bit SYSTIM timestamp to be misinterpreted as an old
timestamp.
Since commit 500462a9d ("timers: Switch to a non-cascading wheel"),
scheduling of delayed work seems to be less accurate and a requested
delay of 540 seconds may actually be longer than 550 seconds. Shorten
the delay to 480 seconds to be sure the timecounter is updated in time.
This fixes an issue with HW timestamps on 82580/I350/I354 being off by
~1100 seconds for few seconds every ~9 minutes.
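A quick back-of-the-envelope check, assuming a 1 ns SYSTIM tick (which
matches the ~1100 s figure above):

  #include <stdio.h>

  int main(void)
  {
          double wrap_s = (double)(1ULL << 40) / 1e9;  /* 40-bit ns counter */

          printf("wraps after ~%.0f s, update at least every ~%.0f s\n",
                 wrap_s, wrap_s / 2);
          return 0;
  }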
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cec1680591d6d5b10ecc10f370210089416e98af ]
device_online() should be called with device_hotplug_lock held.
Link: http://lkml.kernel.org/r/20180925091457.28651-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 381eab4a6ee81266f8dddc62e57376c7e584e5b8 ]
There seem to be some problems as a result of 30467e0b3b ("mm, hotplug:
fix concurrent memory hot-add deadlock"), which tried to fix a possible
lock inversion reported and discussed in [1] due to the two locks
a) device_lock()
b) mem_hotplug_lock
While add_memory() first takes b), followed by a) during
bus_probe_device(), onlining of memory from user space first took a),
followed by b), exposing a possible deadlock.
In [1], it was decided not to make use of device_hotplug_lock, but
rather to enforce a locking order.
The problems I spotted related to this:
1. Memory block device attributes: While .state first calls
mem_hotplug_begin() and then calls device_online() - which takes
device_lock() - .online no longer calls mem_hotplug_begin(), so it
effectively calls online_pages() without mem_hotplug_lock.
2. device_online() should be called under device_hotplug_lock, however
onlining memory during add_memory() does not take care of that.
In addition, I think there is also something wrong about the locking in
3. arch/powerpc/platforms/powernv/memtrace.c, which calls offline_pages()
without locks. This was introduced after 30467e0b3b. And skimming over
the code, I assume it could need some more care with regard to locking
(e.g. device_online() called without device_hotplug_lock). This will
be addressed in the following patches.
Now that we hold the device_hotplug_lock when
- adding memory (e.g. via add_memory()/add_memory_resource())
- removing memory (e.g. via remove_memory())
- device_online()/device_offline()
We can move mem_hotplug_lock usage back into
online_pages()/offline_pages().
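As a rough sketch, the resulting nesting looks like this (illustrative
only, not literal kernel code):

  lock_device_hotplug();   /* held by add_memory(), device_online(), ... */
  mem_hotplug_begin();     /* now taken inside online_pages()/offline_pages() */
  /* online or offline the pages */
  mem_hotplug_done();
  unlock_device_hotplug();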
Why is mem_hotplug_lock still needed? Essentially to make
get_online_mems()/put_online_mems() be very fast (relying on
device_hotplug_lock would be very slow), and to serialize against
addition of memory that does not create memory block devices (hmm).
[1] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2015-February/065324.html
This patch is partly based on a patch by Vitaly Kuznetsov.
Link: http://lkml.kernel.org/r/20180925091457.28651-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8df1d0e4a265f25dc1e7e7624ccdbcb4a6630c89 ]
add_memory() currently does not take the device_hotplug_lock; however,
it is already called under the lock from
arch/powerpc/platforms/pseries/hotplug-memory.c
drivers/acpi/acpi_memhotplug.c
to synchronize against CPU hot-remove and similar.
In general, we should hold the device_hotplug_lock when adding memory to
synchronize against online/offline request (e.g. from user space) - which
already resulted in lock inversions due to device_lock() and
mem_hotplug_lock - see 30467e0b3b ("mm, hotplug: fix concurrent memory
hot-add deadlock"). add_memory()/add_memory_resource() will create memory
block devices, so this really feels like the right thing to do.
Holding the device_hotplug_lock makes sure that a memory block device
can really only be accessed (e.g. via .online/.state) from user space,
once the memory has been fully added to the system.
The lock is not held yet in
drivers/xen/balloon.c
arch/powerpc/platforms/powernv/memtrace.c
drivers/s390/char/sclp_cmd.c
drivers/hv/hv_balloon.c
So, let's either use the locked variants or take the lock.
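A sketch of the locked-wrapper pattern this describes (signature abridged;
treat it as illustrative rather than the exact diff):

  /* lock-taking wrapper for callers that do not hold device_hotplug_lock */
  int add_memory(int nid, u64 start, u64 size)
  {
          int rc;

          lock_device_hotplug();
          rc = __add_memory(nid, start, size);    /* unlocked variant */
          unlock_device_hotplug();

          return rc;
  }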
Don't export add_memory_resource(), as it once was exported to be used by
XEN, which is never built as a module. If somebody requires it, we also
have to export a locked variant (as device_hotplug_lock is never
exported).
Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6c9a3f843a29d6894dfc40df338b91dbd78f0ae3 ]
Currently both extent and index i are being incremented, causing an
out-of-bounds array read on extent[i]. Fix this by removing the extraneous
increment of extent.
Ernesto said:
: This is only triggered when deleting a file with a resource fork. I
: may be wrong because the documentation isn't clear, but I don't think
: you can create those under linux. So I guess nobody was testing them.
:
: > A disk space leak, perhaps?
:
: That's what it looks like in general. hfs_free_extents() won't do
: anything if the block count doesn't add up, and the error will be
: ignored. Now, if the block count randomly does add up, we could see
: some corruption.
Detected by CoverityScan, CID#711541 ("Out of bounds read")
Link: http://lkml.kernel.org/r/20180831140538.31566-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8cd3cb5061730af085a3f9890a3352f162b4e20c ]
The vfs takes care of updating mtime on ftruncate(), but on truncate() it
must be done by the module.
Link: http://lkml.kernel.org/r/e1611eda2985b672ed2d8677350b4ad8c2d07e8a.1539316825.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dc8844aada735890a6de109bef327f5df36a982e ]
The vfs takes care of updating ctime and mtime on ftruncate(), but on
truncate() it must be done by the module.
This patch can be tested with xfstests generic/313.
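A minimal sketch of the idea in the hfsplus ->setattr() path (assumed
shape, not the verbatim hunk):

  if ((attr->ia_valid & ATTR_SIZE) &&
      attr->ia_size != i_size_read(inode)) {
          truncate_setsize(inode, attr->ia_size);
          hfsplus_file_truncate(inode);
          /* the vfs only does this for ftruncate(); do it here for truncate() */
          inode->i_mtime = inode->i_ctime = current_time(inode);
  }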
Link: http://lkml.kernel.org/r/9beb0913eea37288599e8e1b7cec8768fb52d1b8.1539316825.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1267a07be5ebbff2d2739290f3d043ae137c15b4 ]
Direct writes to empty inodes fail with EIO. The generic direct-io code
is in part to blame (a patch has been submitted as "direct-io: allow
direct writes to empty inodes"), but hfs is worse affected than the other
filesystems because the fallback to buffered I/O doesn't happen.
The problem is the return value of hfs_get_block() when called with
!create. Change it to be more consistent with the other modules.
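Roughly, the convention being adopted is the following (a sketch under
that assumption, not the exact fix):

  /* a read past the allocated blocks with !create is a hole, not an error:
   * leave the bh unmapped and return 0 so the caller can fall back to
   * buffered I/O instead of failing with -EIO */
  if (!create)
          return 0;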
Link: http://lkml.kernel.org/r/4538ab8c35ea37338490525f0f24cbc37227528c.1539195310.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 839c3a6a5e1fbc8542d581911b35b2cb5cd29304 ]
Direct writes to empty inodes fail with EIO. The generic direct-io code
is in part to blame (a patch has been submitted as "direct-io: allow
direct writes to empty inodes"), but hfsplus is worse affected than the
other filesystems because the fallback to buffered I/O doesn't happen.
The problem is the return value of hfsplus_get_block() when called with
!create. Change it to be more consistent with the other modules.
Link: http://lkml.kernel.org/r/2cd1301404ec7cf1e39c8f11a01a4302f1460ad6.1539195310.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 54640c7502e5ed41fbf4eedd499e85f9acc9698f ]
Inserting a new record in a btree may require splitting several of its
nodes. If we hit ENOSPC halfway through, the new nodes will be left
orphaned and their records will be lost. This could mean lost inodes or
extents.
Henceforth, check the available disk space before making any changes.
This still leaves the potential problem of corruption on ENOMEM.
There is no need to reserve space before deleting a catalog record, as we
do for hfsplus. This difference is because hfs index nodes have fixed
length keys.
Link: http://lkml.kernel.org/r/ab5fc8a7d5ffccfd5f27b1cf2cb4ceb6c110da74.1536269131.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d92915c35bfaf763d78bf1d5ac7f183420e3bd99 ]
Inserting or deleting a record in a btree may require splitting several of
its nodes. If we hit ENOSPC halfway through, the new nodes will be left
orphaned and their records will be lost. This could mean lost inodes,
extents or xattrs.
Henceforth, check the available disk space before making any changes.
This still leaves the potential problem of corruption on ENOMEM.
The patch can be tested with xfstests generic/027.
Link: http://lkml.kernel.org/r/4596eef22fbda137b4ffa0272d92f0da15364421.1536269129.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef75bcc5763d130451a99825f247d301088b790b ]
hfs_brec_update_parent() may hit BUG_ON() if the first record of both a
leaf node and its parent are changed, and if this forces the parent to
be split. It is not possible for this to happen on a valid hfs
filesystem because the index nodes have fixed length keys.
For reasons unknown to me, the hfs module does have support for a number of
hfsplus features. A corrupt btree header may report variable length
keys and trigger this BUG, so it's better to fix it.
Link: http://lkml.kernel.org/r/cf9b02d57f806217a2b1bf5db8c3e39730d8f603.1535682463.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 19a9d0f1acf75e8be8cfba19c1a34e941846fa2b ]
Creating, renaming or deleting a file may hit BUG_ON() if the first
record of both a leaf node and its parent are changed, and if this
forces the parent to be split. This bug is triggered by xfstests
generic/027, somewhat rarely; here is a more reliable reproducer:
  truncate -s 50M fs.iso
  mkfs.hfsplus fs.iso
  mount fs.iso /mnt
  i=1000
  while [ $i -le 2400 ]; do
    touch /mnt/$i &>/dev/null
    ((++i))
  done
  i=2400
  while [ $i -ge 1000 ]; do
    mv /mnt/$i /mnt/$(perl -e "print $i x61") &>/dev/null
    ((--i))
  done
The issue is that a newly created bnode is being put twice. Reset
new_node to NULL in hfs_brec_update_parent() before reaching goto again.
Link: http://lkml.kernel.org/r/5ee1db09b60373a15890f6a7c835d00e76bf601d.1535682461.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ce1091d471107dbf6f91db66a480a25950c9b9ff ]
For various alignments of buf, the current expression computes
4096 ok
4095 ok
8190
8189
...
4097
i.e., if the caller has already written two bytes into the page buffer,
len is 8190 rather than 4094, because PTR_ALIGN aligns up to the next
boundary. So if the printed version of the bitmap is huge, scnprintf()
ends up writing beyond the page boundary.
I don't think any current callers actually write anything before
bitmap_print_to_pagebuf, but the API seems to be designed to allow it.
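A standalone demonstration of the arithmetic above; PAGE_SIZE and the
align-up helper are redefined locally for the demo and only mimic the
kernel macros:

  #include <stdio.h>
  #include <stdint.h>

  #define PAGE_SIZE 4096UL
  /* align pointer p up to the next multiple of a (mimics PTR_ALIGN) */
  #define PTR_ALIGN_UP(p, a) \
          ((char *)((((uintptr_t)(p)) + (a) - 1) & ~((uintptr_t)(a) - 1)))

  static char page[2 * PAGE_SIZE] __attribute__((aligned(4096)));

  int main(void)
  {
          char *buf = page + 2;   /* caller already wrote two bytes */

          /* aligning up to the next page boundary overshoots... */
          long bad  = PTR_ALIGN_UP(buf + PAGE_SIZE - 1, PAGE_SIZE) - buf;
          /* ...while the room actually left in this page is: */
          long good = PAGE_SIZE - ((uintptr_t)buf & (PAGE_SIZE - 1));

          printf("len = %ld vs %ld\n", bad, good);  /* 8190 vs 4094 */
          return 0;
  }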
[akpm@linux-foundation.org: use offset_in_page(), per Andy]
[akpm@linux-foundation.org: include mm.h for offset_in_page()]
Link: http://lkml.kernel.org/r/20180818131623.8755-7-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Yury Norov <ynorov@caviumnetworks.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d9873969fa8725dc6a5a21ab788c057fd8719751 ]
Most other bitmap APIs, including the out-of-line version
__bitmap_shift_right(), take an unsigned nbits. This was accidentally left
out of 2fbad29917.
Link: http://lkml.kernel.org/r/20180818131623.8755-5-linux@rasmusvillemoes.dk
Fixes: 2fbad29917 ("lib: bitmap: change bitmap_shift_right to take unsigned parameters")
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reported-by: Yury Norov <ynorov@caviumnetworks.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7275b097851a5e2e0dd4da039c7e96b59ac5314e ]
The static inlines in bitmap.h do not handle a compile-time constant
nbits==0 correctly (they dereference the passed src or dst pointers,
despite only 0 words being valid to access). I had the 0-day buildbot
chew on a patch [1] that would cause build failures for such cases without
complaining, suggesting that we don't have any such users currently, at
least for the 70 .config/arch combinations that was built. Should any
turn up, make sure they use the out-of-line versions, which do handle
nbits==0 correctly.
This is of course not the most efficient, but it's much less churn than
teaching all the static inlines an "if (zero_const_nbits())", and since we
don't have any current instances, this doesn't affect existing code at
all.
[1] lkml.kernel.org/r/20180815085539.27485-1-linux@rasmusvillemoes.dk
Link: http://lkml.kernel.org/r/20180818131623.8755-3-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Yury Norov <ynorov@caviumnetworks.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4b408c74ee5a0b74fc9265c2fe39b0e7dec7c056 ]
The concern here is that "gup->size" is a u64 and "nr_pages" is unsigned
long. On 32 bit systems we could trick the kernel into allocating fewer
pages than expected.
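A standalone illustration of the truncation concern, with uint32_t standing
in for a 32-bit unsigned long and a made-up size:

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
          uint64_t size = (1ULL << 44) + (100ULL << 12); /* huge user-supplied size */
          uint64_t want = size >> 12;             /* pages actually requested */
          uint32_t got  = (uint32_t)(size >> 12); /* what a 32-bit nr_pages keeps */

          printf("requested %llu pages, 32-bit nr_pages = %u\n",
                 (unsigned long long)want, got);
          return 0;
  }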
Link: http://lkml.kernel.org/r/20181025061546.hnhkv33diogf2uis@kili.mountain
Fixes: 64c349f4ae ("mm: add infrastructure for get_user_pages_fast() benchmarking")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c57cdf7a9e51d97a43e29b8f4a04157875104000 ]
rq_qos_exit() removes the current q->rq_qos; this action has to be
done after the queue is frozen, otherwise the IO queue path may never
be woken up and an IO hang is caused.
So fix this issue by moving rq_qos_exit() to after the queue is frozen.
Cc: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 69f8117f17b332a68cd8f4bf8c2d0d3d5b84efc5 ]
Use TEST_GEN_PROGS and don't redefine the all target; this makes the
out-of-tree build work. We need to move the extra dependencies below the
include of lib.mk, because it adds the $(OUTPUT) prefix if it's defined.
We can also drop the clean rule; lib.mk does it for us.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 266bac361d5677e61a6815bd29abeb3bdced2b07 ]
For the out-of-tree build to work we need to tell switch_endian_test
to look for check-reversed.S in $(OUTPUT).
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27825349d7b238533a47e3d98b8bb0efd886b752 ]
We should use TEST_GEN_PROGS, not TEST_PROGS. That tells the selftests
makefile (lib.mk) that those tests are generated (built), and so it
adds the $(OUTPUT) prefix for us, making the out-of-tree build work
correctly.
It also means we don't need our own clean rule, lib.mk does it.
We also have to update the signal_tm rule to use $(OUTPUT).
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c39b79082a38a4f8c801790edecbbb4d62ed2992 ]
We should use TEST_GEN_PROGS, not TEST_PROGS. That tells the selftests
makefile (lib.mk) that those tests are generated (built), and so it
adds the $(OUTPUT) prefix for us, making the out-of-tree build work
correctly.
It also means we don't need our own clean rule, lib.mk does it.
We also have to update the ptrace-pkey and core-pkey rules to use
$(OUTPUT).
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9c87156cce5a63735d1218f0096a65c50a7a32aa ]
When building with clang (8 trunk, 7.0 release) the frame size limit is
hit:
arch/powerpc/xmon/xmon.c:452:12: warning: stack frame size of 2576
bytes in function 'xmon_core' [-Wframe-larger-than=]
Some investigation by Naveen indicates this is due to clang saving the
addresses to printf format strings on the stack.
While this issue is investigated, bump up the frame size limit for xmon
when building with clang.
Link: https://github.com/ClangBuiltLinux/linux/issues/252
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 966c37f2d77eb44d47af8e919267b1ba675b2eca ]
Similar to ipv6 mcast commit 89225d1ce6 ("net: ipv6: mld: fix v1/v2
switchback timeout to rfc3810, 9.12.")
i) RFC3376 8.12. Older Version Querier Present Timeout says:
The Older Version Querier Interval is the time-out for transitioning
a host back to IGMPv3 mode once an older version query is heard.
When an older version query is received, hosts set their Older
Version Querier Present Timer to Older Version Querier Interval.
This value MUST be ((the Robustness Variable) times (the Query
Interval in the last Query received)) plus (one Query Response
Interval).
Currently we only use a hardcoded value, IGMP_V1/V2_ROUTER_PRESENT_TIMEOUT.
Fix it by adding two new fields, mr_qi (Query Interval) and mr_qri (Query
Response Interval), to struct in_device.
Now we can calculate the switchback time via (mr_qrv * mr_qi) + mr_qri.
We need to update these values when we receive IGMPv3 queries.
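With the RFC 3376 defaults (Robustness Variable 2, Query Interval 125 s,
Query Response Interval 10 s) the formula works out to 260 s; a tiny
standalone check:

  #include <stdio.h>

  int main(void)
  {
          unsigned int qrv = 2, qi = 125, qri = 10; /* RFC 3376 defaults, seconds */

          printf("older version querier present timeout = %u s\n",
                 qrv * qi + qri);  /* (mr_qrv * mr_qi) + mr_qri = 260 s */
          return 0;
  }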
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 07d19dc9fbe9128378b9e226abe886fd8fd473df ]
A deduplication data corruption is exposed in XFS and btrfs. It is
caused by extending the block match range to include the partial EOF
block, but then allowing unknown data beyond EOF to be considered a
"match" to data in the destination file because the comparison is only
made to the end of the source file. This corrupts the destination file
when the source extent is shared with it.
The VFS remapping prep functions only support whole block dedupe, but
we still need to appear to support whole file dedupe correctly. Hence
if the dedupe request includes the last block of the source file, don't
include it in the actual dedupe operation. If the rest of the range
dedupes successfully, then reject the entire request. A subsequent
patch will enable us to shorten dedupe requests correctly.
When reflinking sub-file ranges, a data corruption can occur when the
source file range includes a partial EOF block. This shares the unknown
data beyond EOF into the second file at a position inside EOF, exposing
stale data in the second file.
If the reflink request includes the last block of the source file, only
proceed with the reflink operation if it lands at or past the
destination file's current EOF. If it lands within the destination file
EOF, reject the entire request with -EINVAL and make the caller go the
hard way. A subsequent patch will enable us to shorten reflink requests
correctly.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>