Commit graph

737415 commits

Author SHA1 Message Date
Chuck Lever
38a7031559 NFSD: Clean up legacy NFS SYMLINK argument XDR decoders
Move common code in NFSD's legacy SYMLINK decoders into a helper.
The immediate benefits include:

 - one fewer data copies on transports that support DDP
 - consistent error checking across all versions
 - reduction of code duplication
 - support for both legal forms of SYMLINK requests on RDMA
   transports for all versions of NFS (in particular, NFSv2, for
   completeness)

In the long term, this helper is an appropriate spot to perform a
per-transport call-out to fill the pathname argument using, say,
RDMA Reads.

Filling the pathname in the proc function also means that eventually
the incoming filehandle can be interpreted so that filesystem-
specific memory can be allocated as a sink for the pathname
argument, rather than using anonymous pages.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:16 -04:00
Chuck Lever
8154ef2776 NFSD: Clean up legacy NFS WRITE argument XDR decoders
Move common code in NFSD's legacy NFS WRITE decoders into a helper.
The immediate benefit is reduction of code duplication and some nice
micro-optimizations (see below).

In the long term, this helper can perform a per-transport call-out
to fill the rq_vec (say, using RDMA Reads).

The legacy WRITE decoders and procs are changed to work like NFSv4,
which constructs the rq_vec just before it is about to call
vfs_writev.

Why? Calling a transport call-out from the proc instead of the XDR
decoder means that the incoming FH can be resolved to a particular
filesystem and file. This would allow pages from the backing file to
be presented to the transport to be filled, rather than presenting
anonymous pages and copying or flipping them into the file's page
cache later.

I also prefer using the pages in rq_arg.pages, instead of pulling
the data pages directly out of the rqstp::rq_pages array. This is
currently the way the NFSv3 write decoder works, but the other two
do not seem to take this approach. Fixing this removes the only
reference to rq_pages found in NFSD, eliminating an NFSD assumption
about how transports use the pages in rq_pages.

Lastly, avoid setting up the first element of rq_vec as a zero-
length buffer. This happens with an RDMA transport when a normal
Read chunk is present because the data payload is in rq_arg's
page list (none of it is in the head buffer).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:16 -04:00
Chuck Lever
fff4080b2f nfsd: Trace NFSv4 COMPOUND execution
This helps record the identity and timing of the ops in each NFSv4
COMPOUND, replacing dprintk calls that did much the same thing.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:15 -04:00
Chuck Lever
87c5942e8f nfsd: Add I/O trace points in the NFSv4 read proc
NFSv4 read compound processing invokes nfsd_splice_read and
nfs_readv directly, so the trace points currently in nfsd_read are
not invoked for NFSv4 reads.

Move the NFSD READ trace points to common helpers so that NFSv4
reads are captured.

Also, record any local I/O error that occurs, the total count of
bytes that were actually returned, and whether splice or vectored
read was used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:15 -04:00
Chuck Lever
d890be159a nfsd: Add I/O trace points in the NFSv4 write path
NFSv4 write compound processing invokes nfsd_vfs_write directly. The
trace points currently in nfsd_write are not effective for NFSv4
writes.

Move the trace points into the shared nfsd_vfs_write() helper.

After the I/O, we also want to record any local I/O error that
might have occurred, and the total count of bytes that were actually
moved (rather than the requested number).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:15 -04:00
Chuck Lever
f394b62b7b nfsd: Add "nfsd_" to trace point names
Follow naming convention used in client and in sunrpc layers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:14 -04:00
Chuck Lever
79e0b4e247 nfsd: Record request byte count, not count of vectors
Byte count is more helpful to know than vector count.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:14 -04:00
Chuck Lever
afa720a091 nfsd: Fix NFSD trace points
nfsd-1915  [003] 77915.780959: write_opened:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1
nfsd-1915  [003] 77915.780960: write_io_done:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1
nfsd-1915  [003] 77915.780964: write_done:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1

Byte swapping and knfsd_fh_hash() are not available in "trace-cmd
report", where the print format string is actually used. These
data transformations have to be done during the TP_fast_assign step.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:13 -04:00
Chuck Lever
55f5088c22 svc: Report xprt dequeue latency
Record the time between when a rqstp is enqueued on a transport
and when it is dequeued. This includes how long the rqstp waits on
the queue and how long it takes the kernel scheduler to wake a
nfsd thread to service it.

The svc_xprt_dequeue trace point is altered to include the number
of microseconds between xprt_enqueue and xprt_dequeue.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:13 -04:00
Chuck Lever
aaba72cd4e sunrpc: Report per-RPC execution stats
Introduce a mechanism to report the server-side execution latency of
each RPC. The goal is to enable user space to filter the trace
record for latency outliers, build histograms, etc.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:12 -04:00
Chuck Lever
0b9547bf6b sunrpc: Re-purpose trace_svc_process
Currently, trace_svc_process has two call sites:

1. Just after a call to svc_send. svc_send already invokes
   trace_svc_send with the same arguments just before returning

2. Just before a call to svc_drop. svc_drop already invokes
   trace_svc_drop with the same arguments just after it is called

Therefore trace_svc_process does not provide any additional
information not already provided by these other trace points.

However, it would be useful to record the incoming RPC procedure.
So reuse trace_svc_process for this purpose.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:12 -04:00
Chuck Lever
ece200ddd5 sunrpc: Save remote presentation address in svc_xprt for trace events
TP_printk defines a format string that is passed to user space for
converting raw trace event records to something human-readable.

My user space's printf (Oracle Linux 7), however, does not have a
%pI format specifier. The result is that what is supposed to be an
IP address in the output of "trace-cmd report" is just a string that
says the field couldn't be displayed.

To fix this, adopt the same approach as the client: maintain a pre-
formated presentation address for occasions when %pI is not
available.

The location of the trace_svc_send trace point is adjusted so that
rqst->rq_xprt is not NULL when the trace event is recorded.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:11 -04:00
Chuck Lever
41f306d0c2 sunrpc: Simplify trace_svc_recv
There doesn't seem to be a lot of value in calling trace_svc_recv
in the failing case.

1. There are two very common cases: one is the transport is not
ready, and the other is shutdown. Neither is terribly interesting.

2. The trace record for the failing case contains nothing but
the status code.

Therefore the trace point call site in the error exit is removed.
Since the trace point is now recording a length instead of a
status, rename the status field and remove the case that records a
zero XID.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:11 -04:00
Chuck Lever
7dbb53baed sunrpc: Simplify do_enqueue tracing
There are three cases where svc_xprt_do_enqueue() returns without
waking an nfsd thread:

1. There is no work to do

2. The transport is already busy

3. There are no available nfsd threads

Only 3. is truly interesting. Move the trace point so it records
that there was work to do and either an nfsd thread was awoken, or
a free one could not found.

As an additional clean up, remove a redundant comment and a couple
of dprintk call sites.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:11 -04:00
Chuck Lever
caa3e106dc sunrpc: Move trace_svc_xprt_dequeue()
Reduce the amount of noise generated by trace_svc_xprt_dequeue by
moving it to the end of svc_get_next_xprt. This generates exactly
one trace event when a ready xprt is found, rather than spurious
events when there is no work to do. The empty events contain no
information that can't be obtained simply by tracing function calls
to svc_xprt_dequeue.

A small additional benefit is simplification of the svc_xprt_event
trace class, which no longer has to handle the case when the @xprt
parameter is NULL.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:10 -04:00
Chuck Lever
03edb90f57 sunrpc: Update show_svc_xprt_flags() to include recently added flags
XPT_KILL_TEMP was added by commit 546125d161 ("sunrpc: don't call
sleeping functions from the notifier block callbacks"), and
XPT_CONG_CTRL was added by commit 362142b258 ("sunrpc: flag
transports as having congestion control") .

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:10 -04:00
Chuck Lever
989f881ebf svc: Simplify ->xpo_secure_port
Clean up: Instead of returning a value that is used to set or clear
a bit, just make ->xpo_secure_port mangle that bit, and return void.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:09 -04:00
Chuck Lever
63a1b15693 sunrpc: Remove unneeded pointer dereference
Clean up: Noticed during code inspection that there is already a
local automatic variable "xprt" so dereferencing rqst->rq_xprt
again is unnecessary.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:09 -04:00
Stefan Agner
47299f79ea nfsd: use correct enum type in decode_cb_op_status
Use enum nfs_cb_opnum4 in decode_cb_op_status. This fixes warnings
seen with clang:
  fs/nfsd/nfs4callback.c:451:36: warning: implicit conversion from
      enumeration type 'enum nfs_cb_opnum4' to different enumeration
      type 'enum nfs_opnum4' [-Wenum-conversion]
        status = decode_cb_op_status(xdr, OP_CB_SEQUENCE, &cb->cb_seq_status);
                 ~~~~~~~~~~~~~~~~~~~      ^~~~~~~~~~~~~~

Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:08 -04:00
Fengguang Wu
51d87bc2bf nfsd: fix boolreturn.cocci warnings
fs/nfsd/nfs4state.c:926:8-9: WARNING: return of 0/1 in function 'nfs4_delegation_exists' with return type bool
fs/nfsd/nfs4state.c:2955:9-10: WARNING: return of 0/1 in function 'nfsd4_compound_in_session' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci

Fixes: 68b18f5294 ("nfsd: make nfs4_get_existing_delegation less confusing")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
[bfields: also fix -EAGAIN]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:08 -04:00
J. Bruce Fields
353601e7d3 nfsd: create a separate lease for each delegation
Currently we only take one vfs-level delegation (lease) for each file,
no matter how many clients hold delegations on that file.

Let's instead keep a one-to-one mapping between NFSv4 delegations and
VFS delegations.  This turns out to be simpler.

There is still a many-to-one mapping of NFS opens to NFS files, and the
delegations on one file are all associated with one struct file.  The
VFS can still distinguish between these delegations since we're setting
fl_owner to the struct nfs4_delegation now, not to the shared file.

I'm replacing at least one complicated function wholesale, which I don't
like to do, but I haven't figured out how to do this more incrementally.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2018-03-20 17:51:14 -04:00
J. Bruce Fields
86d29b10eb nfsd: move sc_file assignment into alloc_init_deleg
Take an easy chance to simplify the caller a little.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2018-03-20 17:51:13 -04:00
J. Bruce Fields
0af6e690f0 nfsd: factor out common delegation-destruction code
Pull some duplicated code into a common helper.

This changes the order in destroy_delegation a little, but it looks to
me like that shouldn't matter.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2018-03-20 17:51:13 -04:00
J. Bruce Fields
68b18f5294 nfsd: make nfs4_get_existing_delegation less confusing
This doesn't "get" anything.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2018-03-20 17:51:12 -04:00
J. Bruce Fields
0c911f5408 nfsd4: dp->dl_stid.sc_file doesn't need locking
The delegation isn't visible to anyone yet.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2018-03-20 17:51:12 -04:00
J. Bruce Fields
653e514e9e nfsd4: set fl_owner to delegation, not file pointer
For now this makes no difference, as for files having delegations,
there's a one-to-one relationship between an nfs4_file and its
nfs4_delegation.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2018-03-20 17:51:11 -04:00
J. Bruce Fields
cba7b3d150 nfsd: simplify nfs4_put_deleg_lease calls
Every single caller gets the file out of the delegation, so let's do
that once in nfs4_put_deleg_lease.

Plus we'll need it there for other reasons.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2018-03-20 17:51:11 -04:00
J. Bruce Fields
b8232d3315 nfsd: simplify put of fi_deleg_file
fi_delegees is basically just a reference count on users of
fi_deleg_file, which is cleared when fi_delegees goes to zero.  The
fi_deleg_file check here is redundant.  Also add an assertion to make
sure we don't have unbalanced puts.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2018-03-20 17:51:10 -04:00
Chuck Lever
6f29d07ca4 svcrdma: Clean up rdma_build_arg_xdr
Clean up: The value of the byte_count parameter is already passed
to rdma_build_arg_xdr as part of the svc_rdma_op_ctxt structure.

Further, without the parameter called "byte_count" there is no need
to have the abbreviated "bc" automatic variable. "bc" can now be
called something more intuitive.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-20 17:32:13 -04:00
Chuck Lever
97cc326450 svcrdma: Consult max_qp_init_rd_atom when accepting connections
The target needs to return the lesser of the client's Inbound RDMA
Read Queue Depth (IRD), provided in the connection parameters, and
the local device's Outbound RDMA Read Queue Depth (ORD). The latter
limit is max_qp_init_rd_atom, not max_qp_rd_atom.

The svcrdma_ord value caps the ORD value for iWARP transports, which
do not exchange ORD/IRD values at connection time. Since no other
Linux kernel RDMA-enabled storage target sees fit to provide this
cap, I'm removing it here too.

initiator_depth is a u8, so ensure the computed ORD value does not
overflow that field.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-20 17:32:13 -04:00
Chuck Lever
0c4398ff8b svcrdma: Use pr_err to report Receive errors
Clean up: Other completion handlers use pr_err, not pr_warn.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-20 17:32:12 -04:00
Jeff Layton
9258a2d5cd nfsd: move nfs4_client allocation to dedicated slabcache
On x86_64, it's 1152 bytes, so we can avoid wasting 896 bytes each.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-19 16:38:13 -04:00
J. Bruce Fields
9d7ed1355d nfsd: don't require low ports for gss requests
In a traditional NFS deployment using auth_unix, the clients are trusted
to correctly report the credentials of their logged-in users.  The
server assumes that only root on client machines is allowed to send
requests from low-numbered ports, so it can use the originating port
number to distinguish "real" NFS clients from NFS clients run by
ordinary users, to prevent ordinary users from spoofing credentials.

The originating port number on a gss-authenticated request is less
important.  The authentication ties the request to a user, and we take
it as proof that that user authorized the request.  The low port number
check no longer adds much.

So, don't enforce low port numbers in the auth_gss case.

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-19 16:38:13 -04:00
J. Bruce Fields
edcc8452a0 nfsd: remove unsused "cp_consecutive" field
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-19 16:38:13 -04:00
Colin Ian King
554faf2819 lockd: make nlm_ntf_refcnt and nlm_ntf_wq static
The variables nlm_ntf_refcnt and nlm_ntf_wq are local to the source and
do not need to be in global scope, so make them static.

Cleans up sparse warnings:
fs/lockd/svc.c:60:10: warning: symbol 'nlm_ntf_refcnt' was not declared.
Should it be static?
fs/lockd/svc.c:61:1: warning: symbol 'nlm_ntf_wq' was not declared.
Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-19 16:38:13 -04:00
Jeff Layton
bd2decac5e nfsd4: send the special close_stateid in v4.0 replies as well
We already send it for v4.1, but RFC7530 also notes that the stateid in
the close reply is bogus.

Always send the special close stateid, even in v4.0 responses. No client
should put any meaning on it whatsoever. For now, we continue to
increment the stateid value, though that might not be necessary either.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-19 16:38:12 -04:00
NeilBrown
3b68e6ee3c SUNRPC: cache: ignore timestamp written to 'flush' file.
The interface for flushing the sunrpc auth cache was poorly
designed and has caused problems a number of times.

The design is that you write a timestamp, and all entries
created before that time are discarded.
The most obvious problem is that this is not what people
actually want.  They want to just flush the whole cache.
The 1-second granularity can be a problem, as can the use
of wall-clock time.

A current problem is that code will write the current time to
this file - expecting it to clear everything - and if the
seconds number ticks over before this timestamp is checked,
the test "then >= now" fails, and a full flush isn't forced.

So lets just drop the subtleties and always flush the whole
cache.  The worst this could do is impose an extra cost
refilling it, but that would require someone to be using
non-standard tools.

We still report an error if the string written is not a number,
but we cause any valid number to flush the whole cache.

Reported-by: "Wang, Alan 1. (NSB - CN/Hangzhou)" <alan.1.wang@nokia-sbell.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-19 16:38:12 -04:00
James Ettle
90a9b1473d sunrpc: Fix unaligned access on sparc64
Fix unaligned access in gss_{get,verify}_mic_v2() on sparc64

Signed-off-by: James Ettle <james@ettle.org.uk>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-19 16:38:12 -04:00
Jeff Layton
68ef3bc316 nfsd: remove blocked locks on client teardown
We had some reports of panics in nfsd4_lm_notify, and that showed a
nfs4_lockowner that had outlived its so_client.

Ensure that we walk any leftover lockowners after tearing down all of
the stateids, and remove any blocked locks that they hold.

With this change, we also don't need to walk the nbl_lru on nfsd_net
shutdown, as that will happen naturally when we tear down the clients.

Fixes: 76d348fadf (nfsd: have nfsd4_lock use blocking locks for v4.1+ locks)
Reported-by: Frank Sorenson <fsorenso@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org # 4.9
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-03-19 16:37:21 -04:00
Linus Torvalds
91ab883eb2 Linux 4.16-rc2 2018-02-18 17:29:42 -08:00
Linus Torvalds
0e06fb5b9a Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 Kconfig fixes from Thomas Gleixner:
 "Three patchlets to correct HIGHMEM64G and CMPXCHG64 dependencies in
  Kconfig when CPU selections are explicitely set to M586 or M686"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig
  x86/Kconfig: Exclude i586-class CPUs lacking PAE support from the HIGHMEM64G Kconfig group
  x86/Kconfig: Add missing i586-class CPUs to the X86_CMPXCHG64 Kconfig group
2018-02-18 12:56:41 -08:00
Linus Torvalds
9ca2c16f3b Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Thomas Gleixner:
 "Perf tool updates and kprobe fixes:

   - perf_mmap overwrite mode fixes/overhaul, prep work to get 'perf
     top' using it, making it bearable to use it in large core count
     systems such as Knights Landing/Mill Intel systems (Kan Liang)

   - s/390 now uses syscall.tbl, just like x86-64 to generate the
     syscall table id -> string tables used by 'perf trace' (Hendrik
     Brueckner)

   - Use strtoull() instead of home grown function (Andy Shevchenko)

   - Synchronize kernel ABI headers, v4.16-rc1 (Ingo Molnar)

   - Document missing 'perf data --force' option (Sangwon Hong)

   - Add perf vendor JSON metrics for ARM Cortex-A53 Processor (William
     Cohen)

   - Improve error handling and error propagation of ftrace based
     kprobes so failures when installing kprobes are not silently
     ignored and create disfunctional tracepoints"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  kprobes: Propagate error from disarm_kprobe_ftrace()
  kprobes: Propagate error from arm_kprobe_ftrace()
  Revert "tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h"
  perf s390: Rework system call table creation by using syscall.tbl
  perf s390: Grab a copy of arch/s390/kernel/syscall/syscall.tbl
  tools/headers: Synchronize kernel ABI headers, v4.16-rc1
  perf test: Fix test trace+probe_libc_inet_pton.sh for s390x
  perf data: Document missing --force option
  perf tools: Substitute yet another strtoull()
  perf top: Check the latency of perf_top__mmap_read()
  perf top: Switch default mode to overwrite mode
  perf top: Remove lost events checking
  perf hists browser: Add parameter to disable lost event warning
  perf top: Add overwrite fall back
  perf evsel: Expose the perf_missing_features struct
  perf top: Check per-event overwrite term
  perf mmap: Discard legacy interface for mmap read
  perf test: Update mmap read functions for backward-ring-buffer test
  perf mmap: Introduce perf_mmap__read_event()
  perf mmap: Introduce perf_mmap__read_done()
  ...
2018-02-18 12:38:40 -08:00
Linus Torvalds
2d6c4e40ab Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "A small set of updates mostly for irq chip drivers:

   - MIPS GIC fix for spurious, masked interrupts

   - fix for a subtle IPI bug in GICv3

   - do not probe GICv3 ITSs that are marked as disabled

   - multi-MSI support for GICv2m

   - various small cleanups"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqdomain: Re-use DEFINE_SHOW_ATTRIBUTE() macro
  irqchip/bcm: Remove hashed address printing
  irqchip/gic-v2m: Add PCI Multi-MSI support
  irqchip/gic-v3: Ignore disabled ITS nodes
  irqchip/gic-v3: Use wmb() instead of smb_wmb() in gic_raise_softirq()
  irqchip/gic-v3: Change pr_debug message to pr_devel
  irqchip/mips-gic: Avoid spuriously handling masked interrupts
2018-02-18 12:22:04 -08:00
Linus Torvalds
59e4721544 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fix from Thomas Gleixner:
 "A small fix which adds the missing for_each_cpu_wrap() stub for the UP
  case to avoid build failures"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpumask: Make for_each_cpu_wrap() available on UP as well
2018-02-18 11:54:22 -08:00
Linus Torvalds
c786427f57 for-linus-20180217
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJaiF9dAAoJEPfTWPspceCmkS8P/1bLQUbCKuby+aKG52ik80Xb
 ao+CM0Ytn1vKxDRnk3rcZyN35++0c2rLzRlK7SCYQ006ivFFGBBrdPJlJq2WismK
 06dMnkqGQGr1I6cIryFsUzi3dSk/uc9S3afgYuc6Ga3tvYvM90q1JA4PNUf4u463
 pjJoDwL1ZgXeACtG7r8Bmbjb2LxoWODDqeNe3nTUdZLrdRPROn/mkjqOB+NhsTcL
 47nIic+U1+QT8A3+gZgmDRz9TKXgLU+5BdUMGavOi3V3d8ZIsBijY20Inr3ovsCc
 rSO6WIipk2u3kTIZr3nXhZs2WfDEo+q/G+7vKz+F0ICf4luPScwpPJk0rv9Uf838
 LYKn97uucAssV3+tNTWHprCdOBpG1w2fX7a1oSTczYZztWY6CNJzbOQ9w9WFXxUc
 cskF7jBShC5l9XmgwoKOFGrnSsuOG5TOzadNcuW5IDBFGOizAEKiIHyQOobYxIHT
 ZwipUgVZFbiK7vlxLssYihgrO5rMgpWz4o54OPmCzpD04d1We+Yf1VSOpMFdpR05
 h3YQ3y8tj1Ndicnw4P0aj0wDPh4wFd9vuxVtLuvYryh4dffIeU/GWSfmmedakn+0
 uc/7QiOxbXj0NcPBd9cJpTNCEGttfKLbedMyGj9ztMoQJAuRnbwWwY4VUJwY+Lvn
 hXBr/UYqkwYXkJLy2uKK
 =n5RP
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20180217' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - NVMe pull request from Keith, with fixes all over the map for nvme.
   From various folks.

 - Classic polling fix, that avoids a latency issue where we still end
   up waiting for an interrupt in some cases. From Nitesh Shetty.

 - Comment typo fix from Minwoo Im.

* tag 'for-linus-20180217' of git://git.kernel.dk/linux-block:
  block: fix a typo in comment of BLK_MQ_POLL_STATS_BKTS
  nvme-rdma: fix sysfs invoked reset_ctrl error flow
  nvmet: Change return code of discard command if not supported
  nvme-pci: Fix timeouts in connecting state
  nvme-pci: Remap CMB SQ entries on every controller reset
  nvme: fix the deadlock in nvme_update_formats
  blk: optimization for classic polling
  nvme: Don't use a stack buffer for keep-alive command
  nvme_fc: cleanup io completion
  nvme_fc: correct abort race condition on resets
  nvme: Fix discard buffer overrun
  nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition
  nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process
  nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
2018-02-17 10:20:47 -08:00
Linus Torvalds
fa2139ef9c MMC host:
- meson-gx: Revert to earlier tuning process
  - bcm2835: Don't overwrite max frequency unconditionally
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJah/1EAAoJEP4mhCVzWIwp7CsQAMWYAyzQTNJ6WjI5CRcSq+QM
 1Cykv98XarR6s3WhUA6xd+p3C6d5PDHsXdrkS0RYLqDHtfOiKdodbSjlh9oo/Uub
 KnjCIM8mDRwoJs6TxvBjqIyfpGQYHVF0rf0W71Ik607NGcosoHr7vgYZo2o/3V6g
 lu3xXKam9Uc691VFcTcD2ub4H0qI/RsoZaqIEApTnL1m0d6mAPvxald5AV/tJvKc
 5R+l4Wb4dXB64q3uQvcyxQRSaWWbWEETHA79qS04W5+RJdyy/TtpPbCnPs3VqCHn
 WUEJpoqmFeWNhkg1lmq9MHQvjaweKyLWFKawlpft3mjhImdQBQNjHaLAWW0spsdN
 ifqHCs/9NMJOo0fSWwPjJNeC9UxY0HG1s8yznY+CUnESnMhZ3+h5qToFwI3iMaQF
 BjJCjxmNvvRAqlptcR3hqVOdrWmzkwuqTKdh/5LHuYOrrvVbDiFM1EQbCT3pe0B+
 pHdlB10+wCz1PlNOvi8OUlqW+Co3/lSasWvkS6hKb8SQgZvWfmtyxewR/Mq/zAqT
 WKoYDjsyJCHbDXdZDKLcab5Eo8E67ogEFlkIqMvDTwaVEb1xizydxFFCVMUb8SER
 mROOGCS9OhJwugjOvA8HAy8ueOuTrSMWZI5XvjoBIaB8mQLEJGEVkc0IneDiC6P+
 5wopD/VKGjYqg6xz1EKC
 =+wYZ
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:

 - meson-gx: Revert to earlier tuning process

 - bcm2835: Don't overwrite max frequency unconditionally

* tag 'mmc-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: bcm2835: Don't overwrite max frequency unconditionally
  Revert "mmc: meson-gx: include tx phase in the tuning process"
2018-02-17 10:08:28 -08:00
Linus Torvalds
4b6415f9f9 * Add missing dependency to NAND_MARVELL Kconfig entry
* Use the appropriate OOB layout in the VF610 driver
 -----BEGIN PGP SIGNATURE-----
 
 iQI5BAABCAAjBQJah+/3HBxib3Jpcy5icmV6aWxsb25AYm9vdGxpbi5jb20ACgkQ
 Ze02AX4ItwCMoQ/+NQjeBPDslHM7FdnrPSF6UXAhtuw6NaXnIp6vmnTnkgV9dru/
 nNsiWMZCWD8VAXrtSDRrHUbmuHS+qzG/yxUtTkmnDaOG4YaxBHYPaiWg3C1+9ioV
 1EzAX/XEb8EK3xbeFVnr1gMxr2E+/43mJF8b0DXpcX340+AQaMxfKYnHlNl0KqCS
 qQV4rhqZ4Zm3j+pXcqt1S21EIQTVcmq/iIziLCMRn+T6zESheTMRwh32kh2E8mea
 B8L3cmocf34QdPbJZLULgdjaNbHQHSK8WkYI6x2w+1lJvUo7RHmhrU2di41UAfD+
 uTeNhratFWrAvBa2kI60cA/SQe2ZqXXuoAKC1WGir9qxLqHmdx54x6rGQZWar+0K
 n93deqQyTef20kOiQCUT/Ps4cA8zveZCKpt0CG8Znx4Z7Znjm8J6HMe1yTSVBbQh
 lrevi7hcxDnGn61/2zEUld+BmmDDgDLYoEdV1ofenMRbL4cb4WaKDdwdFUP5mMCM
 hC6NlrwT+7zZVtdoEUe6QbTeiottDNovow2L5BAv6c3NyYzV48f7KxTb3Rwyv0Yp
 Apje7ZUQpNux+NeuJVCQUrerY91rDk8WtnUEI2PIcL32AjR60brtY/MWBqBWclCk
 4/iwHihAttz/J5GuDVWcFcF/QId9zYENT4jJLZjDT7MPAvmFcw5rmI4e4AM=
 =MaSC
 -----END PGP SIGNATURE-----

Merge tag 'mtd/fixes-for-4.16-rc2' of git://git.infradead.org/linux-mtd

Pull mtd fixes from Boris Brezillon:

 - add missing dependency to NAND_MARVELL Kconfig entry

 - use the appropriate OOB layout in the VF610 driver

* tag 'mtd/fixes-for-4.16-rc2' of git://git.infradead.org/linux-mtd:
  mtd: nand: MTD_NAND_MARVELL should depend on HAS_DMA
  mtd: nand: vf610: set correct ooblayout
2018-02-17 10:06:13 -08:00
Linus Torvalds
ee78ad7848 powerpc fixes for 4.16 #3
The main attraction is a fix for a bug in the new drmem code, which was causing
 an oops on boot on some versions of Qemu.
 
 There's also a fix for XIVE (Power9 interrupt controller) on KVM, as well as a
 few other minor fixes.
 
 Thanks to:
   Corentin Labbe, Cyril Bur, Cédric Le Goater, Daniel Black, Nathan Fontenot,
   Nicholas Piggin.
 -----BEGIN PGP SIGNATURE-----
 
 iQIwBAABCAAaBQJah/esExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYCi
 KA/+IaDlvxKezRBNQnj6GElBrgfUzICH6MtG6qo+rCKsTMgbAiZJkk3vz/JgqKY/
 EusTNCwcqLaPBDgwoSmbazdtnj7YOwBGdIQOq+rC/qeSV0/gpdo02dPUWaMMOE/x
 nj+zASrOsv/o9XWX4XmJeuYWhW/8a/nWXKa+oLt3g/5pIHHP5TXTzMHvHH0Rn23D
 1ejwDHDwMNL3p2jHlcf+v1DDol51/Kaa8e8KwJJMf00HVfFVXtdnH7do6I1qBeC0
 t7PLDeWnpyY+3M1fNJ303EXIqc9DArUCn6tdhy6om96rGvBddORFuRkS4kkXbx76
 pnTRPWnPa9aeC2rU+C84sJDQJgeBCpMYOvw96Yr1SxFhE9z0T+9YYTnZxiB7GISK
 5BAf3EzE9dc0RtStrfTKnvcuz2OffPq2VZi3sqjiHFDit2TsF+i7ZkX/CR9UAmaH
 HPk4Fbi/IzlSfx/RffrOXYrpsTNcUmzvA/Tj83qGhM30LCMQ24o84eGTZN36a/eg
 Z+7/MZawtNElsNNpJz6MYtvEQkHZyrTUS+9iyqRwLXCIy/JIHZYwb4aRtu41SET8
 lWwuaLjfwdpVPCeEkiNQwCswtppt4j2XS8Ggqef9GkqElsk6JKyxuZWTv2hRfJUK
 DTQvPV4PIhvhrB6qvT11qOm5yDSSV9f6yaLJ5dm3BiXulF0=
 =8QPF
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "The main attraction is a fix for a bug in the new drmem code, which
  was causing an oops on boot on some versions of Qemu.

  There's also a fix for XIVE (Power9 interrupt controller) on KVM, as
  well as a few other minor fixes.

  Thanks to: Corentin Labbe, Cyril Bur, Cédric Le Goater, Daniel Black,
  Nathan Fontenot, Nicholas Piggin"

* tag 'powerpc-4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries: Check for zero filled ibm,dynamic-memory property
  powerpc/pseries: Add empty update_numa_cpu_lookup_table() for NUMA=n
  powerpc/powernv: IMC fix out of bounds memory access at shutdown
  powerpc/xive: Use hw CPU ids when configuring the CPU queues
  powerpc: Expose TSCR via sysfs only on powernv
2018-02-17 09:48:26 -08:00
Linus Torvalds
74688a02fa arm64 fixes:
- Updated the page table accessors to use READ/WRITE_ONCE and prevent
   compiler transformation that could lead to an apparent loss of
   coherency
 
 - Enabled branch predictor hardening for the Falkor CPU
 
 - Fix interaction between kpti enabling and KASan causing the recursive
   page table walking to take a significant time
 
 - Fix some sparse warnings
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlqH6zoACgkQa9axLQDI
 XvHwLhAAnnsscis7BVmUCKo0yGQbPAgJaTaWDQGADkSi8N+h3GmdQYWu+PxVMrwh
 eLyASZJ+ioYiWfXQ4rvZcOaWeBzJ7/FCxJb/W+RkZoPIkGJl/myY5UEvm+YefbTT
 7ilGae0dpNZBcqiO4OPv36Cqc47AnVw+HsjGcBjMrnUHv7sykqdy8VknLHY9VXdu
 l3N0XJ2zc8e29yZFq4i+1ryx/pO42Ps+oKGuPVCB6U5wIYRleTU4dSoiKYRMp1jI
 4giL0f1Ho8Dw7DTQp/l/KIlFoMPRGCsda5/miByn1MGXYSoyf2mlso5jyegiKX5f
 PSRqPIwwKwIYmC/FR3Lb+P+zZo4YltPnfGfXV5VnzUlYLXAm64qDCPiDbG3I/8wL
 5VhpTR40XYB456i4w9+/hu8giMpRt19jTbG/Ugk7M+i6OQ2ZD9L/olCn7OlhL0Np
 Ra0Uw8CHyQbU8psyN9zZ6EeYhucA41YC2lMxNNPFupwhoB68VvLDBrSPBaisA0Pu
 C2fXJEvsahFYxMqHpw/DAOmb8xokVcdHONiMaDV5ovtRXRlxaTZdPEOHhf/I8hF4
 n7ng2GBOzOnIAc+GcYboDVtVbdnoCPcijKSROI+XeIeDpfD6gv/iR1HmRJXTK+eV
 lnqqXHRJ7M/U2XpJzGDAQulSc9iOxLV8H2tnsxwxMfI8c7T6Kw0=
 =O6of
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
 "The bulk of this is the pte accessors annotation to READ/WRITE_ONCE
  (we tried to avoid pushing this during the merge window to avoid
  conflicts)

   - Updated the page table accessors to use READ/WRITE_ONCE and prevent
     compiler transformation that could lead to an apparent loss of
     coherency

   - Enabled branch predictor hardening for the Falkor CPU

   - Fix interaction between kpti enabling and KASan causing the
     recursive page table walking to take a significant time

   - Fix some sparse warnings"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: cputype: Silence Sparse warnings
  arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables
  arm64: proc: Set PTE_NG for table entries to avoid traversing them twice
  arm64: Add missing Falkor part number for branch predictor hardening
2018-02-17 09:46:18 -08:00
Linus Torvalds
f73f047dd7 xen: fixes for 4.16-rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJah+tbAAoJELDendYovxMvHIYH/38IC3Nd3IWTVsLvHXUCxFrn
 fNPKgSyC5/igLbmwjPQ+kbr7bha6Vi3uZwmovoC/9i03gYfzmuuMhhOvOVByYHXg
 HHC+kqegB7tZ2GFeR2hrIba4UxBz4ZC0R5+qQYHZMx5dRt0/Llby663mkcK7WEWr
 Na8jT32AbIOiCKWHgsmTC7h2ZiSXeY+WVj1B3Re7ovLHMTYoMDQhVi5I7w34bcch
 bgTLx4TokC8Z3kCNotPAwrL0rQggEJ+PR91j2mL52uEWv80Q4hgR+QIFtEwiYXmG
 jDbx4Y+jQUAu4+r6/2z4S/gFTV93lB+dWbXYuHoFv5Mv1A+Ve7Al744RvrNh3gM=
 =UVdB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.16a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - fixes for the Xen pvcalls frontend driver

 - fix for booting Xen pv domains

 - fix for the xenbus driver user interface

* tag 'for-linus-4.16a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  pvcalls-front: wait for other operations to return when release passive sockets
  pvcalls-front: introduce a per sock_mapping refcount
  x86/xen: Calculate __max_logical_packages on PV domains
  xenbus: track caller request id
2018-02-17 09:16:09 -08:00