Commit graph

635844 commits

Author SHA1 Message Date
Thomas Gleixner
1be5d4fa0a locking/rtmutex: Use READ_ONCE() in rt_mutex_owner()
While debugging the rtmutex unlock vs. dequeue race Will suggested to use
READ_ONCE() in rt_mutex_owner() as it might race against the
cmpxchg_release() in unlock_rt_mutex_safe().

Will: "It's a minor thing which will most likely not matter in practice"

Careful search did not unearth an actual problem in todays code, but it's
better to be safe than surprised.

Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20161130210030.431379999@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-02 11:13:26 +01:00
Thomas Gleixner
dbb26055de locking/rtmutex: Prevent dequeue vs. unlock race
David reported a futex/rtmutex state corruption. It's caused by the
following problem:

CPU0		CPU1		CPU2

l->owner=T1
		rt_mutex_lock(l)
		lock(l->wait_lock)
		l->owner = T1 | HAS_WAITERS;
		enqueue(T2)
		boost()
		  unlock(l->wait_lock)
		schedule()

				rt_mutex_lock(l)
				lock(l->wait_lock)
				l->owner = T1 | HAS_WAITERS;
				enqueue(T3)
				boost()
				  unlock(l->wait_lock)
				schedule()
		signal(->T2)	signal(->T3)
		lock(l->wait_lock)
		dequeue(T2)
		deboost()
		  unlock(l->wait_lock)
				lock(l->wait_lock)
				dequeue(T3)
				  ===> wait list is now empty
				deboost()
				 unlock(l->wait_lock)
		lock(l->wait_lock)
		fixup_rt_mutex_waiters()
		  if (wait_list_empty(l)) {
		    owner = l->owner & ~HAS_WAITERS;
		    l->owner = owner
		     ==> l->owner = T1
		  }

				lock(l->wait_lock)
rt_mutex_unlock(l)		fixup_rt_mutex_waiters()
				  if (wait_list_empty(l)) {
				    owner = l->owner & ~HAS_WAITERS;
cmpxchg(l->owner, T1, NULL)
 ===> Success (l->owner = NULL)
				    l->owner = owner
				     ==> l->owner = T1
				  }

That means the problem is caused by fixup_rt_mutex_waiters() which does the
RMW to clear the waiters bit unconditionally when there are no waiters in
the rtmutexes rbtree.

This can be fatal: A concurrent unlock can release the rtmutex in the
fastpath because the waiters bit is not set. If the cmpxchg() gets in the
middle of the RMW operation then the previous owner, which just unlocked
the rtmutex is set as the owner again when the write takes place after the
successfull cmpxchg().

The solution is rather trivial: verify that the owner member of the rtmutex
has the waiters bit set before clearing it. This does not require a
cmpxchg() or other atomic operations because the waiters bit can only be
set and cleared with the rtmutex wait_lock held. It's also safe against the
fast path unlock attempt. The unlock attempt via cmpxchg() will either see
the bit set and take the slowpath or see the bit cleared and release it
atomically in the fastpath.

It's remarkable that the test program provided by David triggers on ARM64
and MIPS64 really quick, but it refuses to reproduce on x86-64, while the
problem exists there as well. That refusal might explain that this got not
discovered earlier despite the bug existing from day one of the rtmutex
implementation more than 10 years ago.

Thanks to David for meticulously instrumenting the code and providing the
information which allowed to decode this subtle problem.

Reported-by: David Daney <ddaney@caviumnetworks.com>
Tested-by: David Daney <david.daney@cavium.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Fixes: 23f78d4a03 ("[PATCH] pi-futex: rt mutex core")
Link: http://lkml.kernel.org/r/20161130210030.351136722@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-02 11:13:26 +01:00
Sven Eckelmann
c2d0f48a13 batman-adv: Check for alloc errors when preparing TT local data
batadv_tt_prepare_tvlv_local_data can fail to allocate the memory for the
new TVLV block. The caller is informed about this problem with the returned
length of 0. Not checking this value results in an invalid memory access
when either tt_data or tt_change is accessed.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 7ea7b4a142 ("batman-adv: make the TT CRC logic VLAN specific")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-12-02 10:46:59 +01:00
Linus Torvalds
4db5e636dd pci-v4.9-fixes-4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYQKe0AAoJEFmIoMA60/r8dmsQAJ1BjfcgWunT8cyBjh9DW8MT
 mFj4w4qEtN8JthecXYKDHYY1zTRocuuKYQTCdX6qKnnx37amJwfiEtPsLqzoio3U
 HqIx0Nyereh6ir3VHJgITa2C0317pw6ti2rEZS+oMfQyWUDWVXMKOo3nsCKYtqLJ
 fO0K1ubYSUwNr1ph3rxTbJaycRUZsXK1PAdaROVeDjiw6IPgSNd9eboQCQAg3WQm
 JFsENhhCDM7qlFpwgbjtjv2IkzK0zpxs6vkVKRUJ1x8D2OAfg0j+rxYEVaOU23bO
 isj7rnbM1fFuC3WrAB1uexPfISLuzqUSIceB46EItoTJ7x3wmQGs4BIIt9LlmUte
 Z6RNAMbUx+K/5p2+xCVJAnbhfnCQv/vLkYEKpr2uPx43PywALYJq/8I4p/qh0zIW
 562ulb7HUqh8jNMvFj/7kqCijnkFHw0iddL0zwC6VD5/lYiTeYN19/T00gUGLtB6
 YWunN1G/fl/SdtI29oo8e+xVKuWraAsyKVX7LZIl2XaZhVBTy9vTC2wC/hdZqiMg
 yXK4/lE+Fr0tnHt8vVRgEicTHTmlQYQnRKNcy9PyDQWyYndg4ExacmsafQ61u0EE
 bUKoPPT7zJT/TVDp54cWk4t/AHc4TONNONNUH2xZKAMElsAiQrHd4GwFHUAQgz/C
 MiwbEXvfYTBcPCRP4cqD
 =DhJD
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.9-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "PCI fixes:

   - Fix Read Completion Boundary setting, which fixes a boot failure on
     IBM x3850 with Mellanox MT27500 ConnectX-3

   - Update some MAINTAINERS entries and email addresses"

* tag 'pci-v4.9-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)
  PCI: Export pcie_find_root_port
  PCI: designware-plat: Update author email
  PCI: designware: Change maintainer to Joao Pinto
  MAINTAINERS: Add devicetree binding to PCI i.MX6 entry
  MAINTAINERS: Update Richard Zhu's email address
2016-12-01 16:44:42 -08:00
Sylwester Nawrocki
1bfbc260a5 ASoC: samsung: Add machine driver for Exynos5433 based TM2 board
This patch adds the sound machine driver for the TM2 and TM2E boards.
Speaker and headphone playback, Main Mic capture, Bluetooth, Voice
call and external accessory are supported.

Signed-off-by: Inha Song <ideal.song@samsung.com>
[k.kozlowski: rebased on 4.1]
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
[s.nawrocki: rebased to 4.7, adjustment to the ASoC core changes,
 removed unused ops and direct calls to the max98504 function,
 added parsing of "audio-amplifier" and "audio-codec"
 properties, added TDM API calls, switched to gpiod API]
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-12-01 21:54:27 +00:00
Sylwester Nawrocki
48a760279b ASoC: samsung: Add DT bindings documentation for TM2 sound subsystem
This patch adds DT binding documentation for Exnos5433 based TM2
and TM2E boards sound subsystem.

Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-12-01 21:54:27 +00:00
Mark Brown
5203a3ca09 Merge branch 'topic/component' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into asoc-intel 2016-12-01 21:47:57 +00:00
Alexander Duyck
c54cdc316d ixgbe/ixgbevf: Don't use lco_csum to compute IPv4 checksum
In the case of IPIP and SIT tunnel frames the outer transport header
offset is actually set to the same offset as the inner transport header.
This results in the lco_csum call not doing any checksum computation over
the inner IPv4/v6 header data.

In order to account for that I am updating the code so that we determine
the location to start the checksum ourselves based on the location of the
IPv4 header and the length.

Fixes: b83e30104b ("ixgbe/ixgbevf: Add support for GSO partial")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-01 15:41:36 -05:00
Alexander Duyck
516165a1e2 igb/igbvf: Don't use lco_csum to compute IPv4 checksum
In the case of IPIP and SIT tunnel frames the outer transport header
offset is actually set to the same offset as the inner transport header.
This results in the lco_csum call not doing any checksum computation over
the inner IPv4/v6 header data.

In order to account for that I am updating the code so that we determine
the location to start the checksum ourselves based on the location of the
IPv4 header and the length.

Fixes: e10715d3e9 ("igb/igbvf: Add support for GSO partial")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-01 15:41:35 -05:00
Richard Fitzgerald
931afc4114 Input: arizona-haptics - Use SoC component pin control functions
The name of a codec pin can have an optional prefix string, which is
defined by the SoC machine driver. The snd_soc_dapm_x_pin functions
take the fully-specified name including the prefix and so the existing
code would fail to find the pin if the audio machine driver had added
a prefix.

Switch to using the snd_soc_component_x_pin equivalent functions that
take a specified SoC component and automatically add the name prefix to
the provided pin name.

Signed-off-by: Richard Fitzgerald <rf@opensource.wolfsonmicro.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-12-01 20:19:13 +00:00
Richard Fitzgerald
efd95c71f4 extcon: arizona: Use SoC component pin control functions
The name of a codec pin can have an optional prefix string, which is
defined by the SoC machine driver. The snd_soc_dapm_x_pin functions
take the fully-specified name including the prefix and so the existing
code would fail to find the pin if the audio machine driver had added
a prefix.

Switch to using the snd_soc_component_x_pin equivalent functions that
take a specified SoC component and automatically add the name prefix to
the provided pin name.

Signed-off-by: Richard Fitzgerald <rf@opensource.wolfsonmicro.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-12-01 20:18:44 +00:00
Kuninori Morimoto
9178feb453 ASoC: add Component level suspend/resume
In current ALSA SoC, Codec only has suspend/resume feature,
but it should be supported on Component level. This patch adds it.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-12-01 20:09:58 +00:00
Kuninori Morimoto
1a653aa447 ASoC: core: replace aux_comp_list to component_dev_list
Now, Card has component_dev_list, we can replace aux_comp_list
to component_dev_list with new auxiliary flags

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-12-01 20:09:50 +00:00
Kuninori Morimoto
d9fc40639d ASoC: core: replace codec_dev_list to component_dev_list on Card
Current Card has Codec list (= codec_dev_list), but Codec will be
removed in the future. Because of this reason, this patch adds
new Component list in Card, and replace Codec list.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-12-01 20:09:34 +00:00
allan
fadf3a2805 net: asix: Fix AX88772_suspend() USB vendor commands failure issues
The change fixes AX88772_suspend() USB vendor commands failure issues.

Signed-off-by: Allan Chou <allan@asix.com.tw>
Tested-by: Allan Chou <allan@asix.com.tw>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-01 14:26:56 -05:00
Linus Torvalds
2caceb3294 Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fix from Miklos Szeredi:
 "This fixes a regression introduced in 4.8"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: fix d_real() for stacked fs
2016-12-01 10:31:53 -08:00
Linus Torvalds
92cf44e284 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov: "We are disabling automatic
  probing of BYD touchpads as it results in too many false positives,
  and the hardware is not terribly popular and having the protocol
  support does not result in significantly improved user experience.

  We also change keycode for KEY_DATA to avoid clashing with
  KEY_FASTREVERSE. Luckily this newish code is used by CEC framework
  that is still in staging, so it is extremely unlikely that someone has
  already started using this keycode"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: change KEY_DATA from 0x275 to 0x277
  Input: psmouse - disable automatic probing of BYD touchpads
2016-12-01 10:29:41 -08:00
Nicolas Pitre
d3fc425e81 kbuild: make sure autoksyms.h exists early
Some people are able to trigger a race where autoksyms.h is used before
its empty version is even created.  Let's create it at the same time as
the directory holding it is created.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Prarit Bhargava <prarit@redhat.com>
Tested-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-01 10:19:22 -08:00
David S. Miller
7bbf91ce27 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2016-12-01

1) Change the error value when someone tries to run 32bit
   userspace on a 64bit host from -ENOTSUPP to the userspace
   exported -EOPNOTSUPP. Fix from Yi Zhao.

2) On inbound, ESN sequence numbers are already in network
   byte order. So don't try to convert it again, this fixes
   integrity verification for ESN. Fixes from Tobias Brunner.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-01 11:35:49 -05:00
David S. Miller
3d2dd617fb Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

This is a large batch of Netfilter fixes for net, they are:

1) Three patches to fix NAT conversion to rhashtable: Switch to rhlist
   structure that allows to have several objects with the same key.
   Moreover, fix wrong comparison logic in nf_nat_bysource_cmp() as this is
   expecting a return value similar to memcmp(). Change location of
   the nat_bysource field in the nf_conn structure to avoid zeroing
   this as it breaks interaction with SLAB_DESTROY_BY_RCU and lead us
   to crashes. From Florian Westphal.

2) Don't allow malformed fragments go through in IPv6, drop them,
   otherwise we hit GPF, patch from Florian Westphal.

3) Fix crash if attributes are missing in nft_range, from Liping Zhang.

4) Fix arptables 32-bits userspace 64-bits kernel compat, from Hongxu Jia.

5) Two patches from David Ahern to fix netfilter interaction with vrf.
   From David Ahern.

6) Fix element timeout calculation in nf_tables, we take milliseconds
   from userspace, but we use jiffies from kernelspace. Patch from
   Anders K.  Pedersen.

7) Missing validation length netlink attribute for nft_hash, from
   Laura Garcia.

8) Fix nf_conntrack_helper documentation, we don't default to off
   anymore for a bit of time so let's get this in sync with the code.

I know is late but I think these are important, specifically the NAT
bits, as they are mostly addressing fallout from recent changes. I also
read there are chances to have -rc8, if that is the case, that would
also give us a bit more time to test this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-01 11:04:41 -05:00
Dan Carpenter
a0f1d21c1c KVM: use after free in kvm_ioctl_create_device()
We should move the ops->destroy(dev) after the list_del(&dev->vm_node)
so that we don't use "dev" after freeing it.

Fixes: a28ebea2ad ("KVM: Protect device ops->create and list_add with kvm->lock")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-01 16:10:50 +01:00
Radim Krčmář
0f4828a1da KVM/ARM updates for v4.9-rc7
- Do not call kvm_notify_acked for PPIs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYNzGYAAoJECPQ0LrRPXpDwC8P/3SlsYK9ickZfxoX05tfwbmy
 H5IVmMvnhqQwi2ALe1PycKU9a9c5MISEvFyzGtr/SVkwZdiGRztGCQsYgxAyL0Tr
 mJDttavNU8B9YKC/d+pNNl18uue1Ny297aPDwL6eo3i9s7MX7EZRdRG3U0MiGlbB
 MFVCOLCAd8eUGI68eE5CsRC5+3OFqbkh2JlgtZJPV1BDu/K1ojViijUnpv/CJX52
 8g8qKU9xTgHnd1pTAaE22u5+odgOvOa62rGqVAF8T9eOMpVHxUDeAvzaFLXQAgty
 tVwYlEtoglLKXFa/B0dqBX639J8hLKBC3gBM/1sEbUU4Ii026iPuCbWLjDGju7Ra
 ggaeFp9X8IK9wcwyT88yUAFLwk/neApm5YemzdD7VWSb/5Np3mJpuIH7McwoJp3p
 cvXrTV4P+XBSYgYSdBsGKSQo38dynW8m8Gqq3D5DEAJc33P/kvwBMFRuzj/F3GwZ
 5w1uTDJx+tTdGhpEvxY+Mwb17XDid9WPKyYdgI5Xy662g904m7WmQvP08VezxVcw
 woMlqqSpJvsNxOphj3xRb00W61MTu7zcfYQlwiDwtEqXgIPlpk3tBZO651eMMaSF
 bQmP2qPDKw5UQHtRfcDq4SmcyvaDn6j9BMYCR/XvXmtlFi7+zyglhkIn+wkJF0Dz
 J/hmZNTPVN6rtRv9wY/2
 =1IXI
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm

KVM/ARM updates for v4.9-rc7

- Do not call kvm_notify_acked for PPIs
2016-12-01 14:56:34 +01:00
Stephane Grosjean
f00b534ded can: peak: Add support for PCAN-USB X6 USB interface
This adds support for PEAK-System PCAN-USB X6 USB to CAN interface.

The CAN FD adapter PCAN-USB X6 allows the connection of up to 6 CAN FD
or CAN networks to a computer via USB. The interface is installed in an
aluminum profile casing and is shipped in versions with D-Sub connectors
or M12 circular connectors.

The PCAN-USB X6 registers in the USB sub-system as if 3x PCAN-USB-Pro FD
adapters were plugged. So, this patch:

- updates the PEAK_USB entry of the corresponding Kconfig file
- defines and adds the device id. of the PCAN-USB X6 (0x0014) into the
  table of supported device ids
- defines and adds the new software structure implementing the PCAN-USB X6,
  which is obviously a clone of the software structure implementing the
  PCAN-USB Pro FD.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-12-01 14:12:20 +01:00
Stephane Grosjean
fe5b40642f can: peak: Fix bittiming fields size in bits
This fixes the bitimings fields ranges supported by all the CAN-FD USB
interfaces of the PEAK-System CAN-FD adapters.

Very first development versions of the IP core API defined smaller TSGEx
and SJW fields for both nominal and data bittimings records than the
production versions. This patch fixes them by enlarging their sizes to
the actual values:

field:           old size:    fixed size:
nominal TSGEG1   6            8
nominal TSGEG2   4            7
nominal SJW      4            7
data TSGEG1      4            5
data TSGEG2      3            4
data SJW         2            4

Note that this has no other consequences than offering larger choice to
bitrate encoding.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-12-01 14:11:25 +01:00
Nicholas Piggin
dadc4a1bb9 powerpc/64: Fix placement of .text to be immediately following .head.text
Do not introduce any additional alignment. Placement of text section
will be set by fixed section macros. Without this, output section
alignment defaults to 4096, which makes BookE text section start at
0x1000 when it is expected to start at 0x100.

This was introduced by commit 57f266497d ("powerpc: Use gas sections
for arranging exception vectors") and was caught with the scripted head
section checker (not yet merged).

Fixes: 57f266497d ("powerpc: Use gas sections for arranging exception vectors")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-12-01 22:26:31 +11:00
Andrew Donnellan
409bf7f8a0 powerpc/eeh: Fix deadlock when PE frozen state can't be cleared
In eeh_reset_device(), we take the pci_rescan_remove_lock immediately after
after we call eeh_reset_pe() to reset the PCI controller. We then call
eeh_clear_pe_frozen_state(), which can return an error. In this case, we
bail out of eeh_reset_device() without calling pci_unlock_rescan_remove().

Add a call to pci_unlock_rescan_remove() in the eeh_clear_pe_frozen_state()
error path so that we don't cause a deadlock later on.

Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Fixes: 7895470063 ("powerpc/eeh: Avoid I/O access during PE reset")
Cc: stable@vger.kernel.org # v3.16+
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-12-01 22:26:27 +11:00
Linus Torvalds
43c4f67c96 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "7 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: fix false-positive WARN_ON() in truncate/invalidate for hugetlb
  kasan: support use-after-scope detection
  kasan: update kasan_global for gcc 7
  lib/debugobjects: export for use in modules
  zram: fix unbalanced idr management at hot removal
  thp: fix corner case of munlock() of PTE-mapped THPs
  mm, thp: propagation of conditional compilation in khugepaged.c
2016-11-30 16:33:41 -08:00
Kirill A. Shutemov
5cbc198ae0 mm: fix false-positive WARN_ON() in truncate/invalidate for hugetlb
Hugetlb pages have ->index in size of the huge pages (PMD_SIZE or
PUD_SIZE), not in PAGE_SIZE as other types of pages.  This means we
cannot user page_to_pgoff() to check whether we've got the right page
for the radix-tree index.

Let's introduce page_to_index() which would return radix-tree index for
given page.

We will be able to get rid of this once hugetlb will be switched to
multi-order entries.

Fixes: fc127da085 ("truncate: handle file thp")
Link: http://lkml.kernel.org/r/20161123093053.mjbnvn5zwxw5e6lk@black.fi.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Doug Nelson <doug.nelson@intel.com>
Tested-by: Doug Nelson <doug.nelson@intel.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Dmitry Vyukov
828347f8f9 kasan: support use-after-scope detection
Gcc revision 241896 implements use-after-scope detection.  Will be
available in gcc 7.  Support it in KASAN.

Gcc emits 2 new callbacks to poison/unpoison large stack objects when
they go in/out of scope.  Implement the callbacks and add a test.

[dvyukov@google.com: v3]
  Link: http://lkml.kernel.org/r/1479998292-144502-1-git-send-email-dvyukov@google.com
Link: http://lkml.kernel.org/r/1479226045-145148-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Dmitry Vyukov
045d599a28 kasan: update kasan_global for gcc 7
kasan_global struct is part of compiler/runtime ABI.  gcc revision
241983 has added a new field to kasan_global struct.  Update kernel
definition of kasan_global struct to include the new field.

Without this patch KASAN is broken with gcc 7.

Link: http://lkml.kernel.org/r/1479219743-28682-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Chris Wilson
f8ff04e2be lib/debugobjects: export for use in modules
Drivers, or other modules, that use a mixture of objects (especially
objects embedded within other objects) would like to take advantage of
the debugobjects facilities to help catch misuse.  Currently, the
debugobjects interface is only available to builtin drivers and requires
a set of EXPORT_SYMBOL_GPL for use by modules.

I am using the debugobjects in i915.ko to try and catch some invalid
operations on embedded objects.  The problem currently only presents
itself across module unload so forcing i915 to be builtin is not an
option.

Link: http://lkml.kernel.org/r/20161122143039.6433-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Du, Changbin" <changbin.du@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Takashi Iwai
529e71e164 zram: fix unbalanced idr management at hot removal
The zram hot removal code calls idr_remove() even when zram_remove()
returns an error (typically -EBUSY).  This results in a leftover at the
device release, eventually leading to a crash when the module is
reloaded.

As described in the bug report below, the following procedure would
cause an Oops with zram:

 - provision three zram devices via modprobe zram num_devices=3
 - configure a size for each device
   + echo "1G" > /sys/block/$zram_name/disksize
 - mkfs and mount zram0 only
 - attempt to hot remove all three devices
   + echo 2 > /sys/class/zram-control/hot_remove
   + echo 1 > /sys/class/zram-control/hot_remove
   + echo 0 > /sys/class/zram-control/hot_remove
     - zram0 removal fails with EBUSY, as expected
 - unmount zram0
 - try zram0 hot remove again
   + echo 0 > /sys/class/zram-control/hot_remove
     - fails with ENODEV (unexpected)
 - unload zram kernel module
   + completes successfully
 - zram0 device node still exists
 - attempt to mount /dev/zram0
   + mount command is killed
   + following BUG is encountered

 BUG: unable to handle kernel paging request at ffffffffa0002ba0
 IP: get_disk+0x16/0x50
 Oops: 0000 [#1] SMP
 CPU: 0 PID: 252 Comm: mount Not tainted 4.9.0-rc6 #176
 Call Trace:
   exact_lock+0xc/0x20
   kobj_lookup+0xdc/0x160
   get_gendisk+0x2f/0x110
   __blkdev_get+0x10c/0x3c0
   blkdev_get+0x19d/0x2e0
   blkdev_open+0x56/0x70
   do_dentry_open.isra.19+0x1ff/0x310
   vfs_open+0x43/0x60
   path_openat+0x2c9/0xf30
   do_filp_open+0x79/0xd0
   do_sys_open+0x114/0x1e0
   SyS_open+0x19/0x20
   entry_SYSCALL_64_fastpath+0x13/0x94

This patch adds the proper error check in hot_remove_store() not to call
idr_remove() unconditionally.

Fixes: 17ec4cd985 ("zram: don't call idr_remove() from zram_remove()")
Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1010970
Link: http://lkml.kernel.org/r/20161121132140.12683-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Reported-by: David Disseldorp <ddiss@suse.de>
Tested-by: David Disseldorp <ddiss@suse.de>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: <stable@vger.kernel.org>    [4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Kirill A. Shutemov
655548bf62 thp: fix corner case of munlock() of PTE-mapped THPs
The following program triggers BUG() in munlock_vma_pages_range():

	// autogenerated by syzkaller (http://github.com/google/syzkaller)
	#include <sys/mman.h>

	int main()
	{
	  mmap((void*)0x20105000ul, 0xc00000ul, 0x2ul, 0x2172ul, -1, 0);
	  mremap((void*)0x201fd000ul, 0x4000ul, 0xc00000ul, 0x3ul, 0x203f0000ul);
	  return 0;
	}

The test-case constructs the situation when munlock_vma_pages_range()
finds PTE-mapped THP-head in the middle of page table and, by mistake,
skips HPAGE_PMD_NR pages after that.

As result, on the next iteration it hits the middle of PMD-mapped THP
and gets upset seeing mlocked tail page.

The solution is only skip HPAGE_PMD_NR pages if the THP was mlocked
during munlock_vma_page().  It would guarantee that the page is
PMD-mapped as we never mlock PTE-mapeed THPs.

Fixes: e90309c9f7 ("thp: allow mlocked THP again")
Link: http://lkml.kernel.org/r/20161115132703.7s7rrgmwttegcdh4@black.fi.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Jérémy Lefaure
e1465d125d mm, thp: propagation of conditional compilation in khugepaged.c
Commit b46e756f5e ("thp: extract khugepaged from mm/huge_memory.c")
moved code from huge_memory.c to khugepaged.c.  Some of this code should
be compiled only when CONFIG_SYSFS is enabled but the condition around
this code was not moved into khugepaged.c.

The result is a compilation error when CONFIG_SYSFS is disabled:

  mm/built-in.o: In function `khugepaged_defrag_store': khugepaged.c:(.text+0x2d095): undefined reference to `single_hugepage_flag_store'
  mm/built-in.o: In function `khugepaged_defrag_show': khugepaged.c:(.text+0x2d0ab): undefined reference to `single_hugepage_flag_show'

This commit adds the #ifdef CONFIG_SYSFS around the code related to
sysfs.

Link: http://lkml.kernel.org/r/20161114203448.24197-1-jeremy.lefaure@lse.epita.fr
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Dave Airlie
83fb8b0555 Merge tag 'drm-misc-fixes-2016-11-30' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
single drm fix.

* tag 'drm-misc-fixes-2016-11-30' of git://anongit.freedesktop.org/git/drm-misc:
  drm: Don't call drm_for_each_crtc with a non-KMS driver
2016-12-01 10:00:14 +10:00
Linus Torvalds
f513581c35 Two small fixes for MIPI PLLs on sunxi devices and a build fix
for a Broadcom clk driver having unmet dependencies.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJYNhi4AAoJEK0CiJfG5JUlRc8P+wYZRyhm/rCVpMnqCk/2ZaZU
 Oe9hJQe6x85LEeh3+Q/IdgxGYQ3nDklNRtmFQJ62wkCv7WrrIXO7zLvaTJ2JjsGd
 RU8tZTc9v1IsBhZhKlqv8fx3lvTCGf8aCeHg/vGczDfbkwHEwdsgFjwD9115WTdR
 6UenybklZmU+p9k0Th/DrqYN4xVIlbr+So9dy+KbaRdM9ZUvg3pTb2wX3KH7V0qG
 Wytz90FQ9WZChjxR8kQRLyBsJXtgNGgLDe7KTraYUizSE0Do/oVGM/FhQWEWnmJ4
 i1WgkS5Bve9AoOdWWkLPlSz295OwAbW2uQ+U8KYEYsalT2eSM/ppTnX0Mo5CopKh
 vwSjNL/kWTAtwc5majTf8fuDworX6QhCHo9FseF38D/xDYPSiXZsaCYCENY7Hjpt
 ggiFe6KkdSWgMPcns+vvgRRVcNQT+I2kKPUGN2IDfu5r/brHWJvbstnqsmHq75gB
 qlNIVcq5o69B4jr6Qumh/unI9JMER/zo7m9PLVG6LVGU0SqumzAU3XDN4Ifk5dea
 /xsQF1H6M9d5vulEAqToxG5dl3dmIXndLnAYuHHQoUzVsFSxfLfTKxlksZsOgyza
 xdbpzQ3NNZAvfTTxjpsNKDlcrik4Je9bbsXypsvt3QvpLGAz18yirxAr00pmvdbV
 R6RO12nye3q94+8u4uPt
 =a/l6
 -----END PGP SIGNATURE-----

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "Two small fixes for MIPI PLLs on sunxi devices and a build fix for a
  Broadcom clk driver having unmet dependencies"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: bcm: Fix unmet Kconfig dependencies for CLK_BCM_63XX
  clk: sunxi-ng: enable so-said LDOs for A33 SoC's pll-mipi clock
  clk: sunxi-ng: sun6i-a31: Enable PLL-MIPI LDOs when ungating it
2016-11-30 15:15:49 -08:00
Jeremy Linton
4c9456df88 arm64: dts: juno: Correct PCI IO window
The PCIe root complex on Juno translates the MMIO mapped
at 0x5f800000 to the PIO address range starting at 0
(which is common because PIO addresses are generally < 64k).
Correct the DT to reflect this.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-11-30 23:49:16 +01:00
Jason Wang
aa196eed3d macvtap: handle ubuf refcount correctly when meet errors
We trigger uarg->callback() immediately after we decide do datacopy
even if caller want to do zerocopy. This will cause the callback
(vhost_net_zerocopy_callback) decrease the refcount. But when we meet
an error afterwards, the error handling in vhost handle_tx() will try
to decrease it again. This is wrong and fix this by delay the
uarg->callback() until we're sure there's no errors.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 15:06:02 -05:00
Jason Wang
af1cc7a2b8 tun: handle ubuf refcount correctly when meet errors
We trigger uarg->callback() immediately after we decide do datacopy
even if caller want to do zerocopy. This will cause the callback
(vhost_net_zerocopy_callback) decrease the refcount. But when we meet
an error afterwards, the error handling in vhost handle_tx() will try
to decrease it again. This is wrong and fix this by delay the
uarg->callback() until we're sure there's no errors.

Reported-by: wangyunjian <wangyunjian@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 15:06:01 -05:00
Grygorii Strashko
4ccfd6383a net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during resume
netif_set_real_num_tx/rx_queues() are required to be called with rtnl_lock
taken, otherwise ASSERT_RTNL() warning will be triggered - which happens
now during System resume from suspend:
cpsw_resume()
|- cpsw_ndo_open()
  |- netif_set_real_num_tx/rx_queues()
     |- ASSERT_RTNL();

Hence, fix it by surrounding cpsw_ndo_open() by rtnl_lock/unlock() calls.

Cc: Dave Gerlach <d-gerlach@ti.com>
Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Fixes: commit e05107e6b7 ("net: ethernet: ti: cpsw: add multi queue support")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Tested-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:59:08 -05:00
Linus Torvalds
e2b588ab60 pwm: Fixes for v4.9
This contains two one-line fixes for issues that were introduced in
 v4.9-rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iQI2BAABCAAgBQJYPx27GRx0aGllcnJ5LnJlZGluZ0BnbWFpbC5jb20ACgkQ3SOs
 138+s6EcRg//bcdAO9u28pAe9FnL/ZRTtnDBW+Xwd6kFijaIDfu5ZRJTz7ZDtIlc
 JxMc6NOqvbxL3pJnxxzZSlLPwI7Su8gwFOaB+FrHhsNn5ocT5izzrBeHQTRgMFN+
 ziJBWMphwFU0y08ie0soiGK/B10X/HfzgzSKnafQc5i06GlpRKft32Mdqq/PgO5z
 qmKhscT3b6vvPKreJ5lNjlwP79N+J151bRrcQmV9Uh4oBW1OGVJoHFhaRDAHYmmF
 bnS9Y1iX4gwtjIUEDwzFEpf3/TJ0cjBe06mW9I9OtTe22Jzo2Rc2Hyi0+JMrKrUF
 sVmP3R07qVVgIe1QJcrvIxaGlDO9ck1HS1PycnYgAb6naonLLXNsrGFn9N7a+tQ1
 rEcwA8i2c7jKBNbZ/xL5qDE60WEZY9px8gQuu8Z3+T16qpd2BYBVwlJOxDYf6Dfe
 qKvOsNg1d5x7pye5GotGhbddCrSstahMfK+NlzMH463P/5/sdymHwRNnjAL9rM0X
 +6AHPC5aRJvgvbf4ku5hO0AI+cPU0y0T7S5dhoFKW4Mfl7F3N+bcmiE0xFiNnXrU
 e9/VsdPyhVDxx4Qfpd3vMudYApoE2dvae0Eyn37hbq0Y9F5ZVE1KAwhcfHU3I4BN
 jbQAozOFogHaJoIjRudqXo3TmZkKZ9pI8/CJ9AfDHNpJH4bdLv3sF+Y=
 =Yr7I
 -----END PGP SIGNATURE-----

Merge tag 'pwm/for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm fixes from Thierry Reding:
 "This contains two one-line fixes for issues that were introduced in
  v4.9-rc1"

* tag 'pwm/for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
  pwm: Fix device reference leak
  pwm: meson: Add missing spin_lock_init()
2016-11-30 11:53:50 -08:00
Josef Bacik
e2d2afe15e bpf: fix states equal logic for varlen access
If we have a branch that looks something like this

int foo = map->value;
if (condition) {
  foo += blah;
} else {
  foo = bar;
}
map->array[foo] = baz;

We will incorrectly assume that the !condition branch is equal to the condition
branch as the register for foo will be UNKNOWN_VALUE in both cases.  We need to
adjust this logic to only do this if we didn't do a varlen access after we
processed the !condition branch, otherwise we have different ranges and need to
check the other branch as well.

Fixes: 484611357c ("bpf: allow access into map value arrays")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:50:52 -05:00
Hongxu Jia
17a49cd549 netfilter: arp_tables: fix invoking 32bit "iptable -P INPUT ACCEPT" failed in 64bit kernel
Since 09d9686047 ("netfilter: x_tables: do compat validation via
translate_table"), it used compatr structure to assign newinfo
structure.  In translate_compat_table of ip_tables.c and ip6_tables.c,
it used compatr->hook_entry to replace info->hook_entry and
compatr->underflow to replace info->underflow, but not do the same
replacement in arp_tables.c.

It caused invoking 32-bit "arptbale -P INPUT ACCEPT" failed in 64bit
kernel.
--------------------------------------
root@qemux86-64:~# arptables -P INPUT ACCEPT
root@qemux86-64:~# arptables -P INPUT ACCEPT
ERROR: Policy for `INPUT' offset 448 != underflow 0
arptables: Incompatible with this kernel
--------------------------------------

Fixes: 09d9686047 ("netfilter: x_tables: do compat validation via translate_table")
Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-30 20:50:23 +01:00
David S. Miller
0fcba2894c wireless-drivers fixes for 4.9
mwifiex
 
 * properly terminate SSIDs so that uninitalised memory is not printed
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJYPZM9AAoJEG4XJFUm622bnTwH/j7KbWbTLE2H9abYJxne3sWQ
 FOCrGNICgG9HYyLn33k7+dCBHGYa1f5qWO7dIeWhe6LEtXqrWBsxUFZbMrJVU7Te
 sDr3s2364iIPhdLtYl5mM7M75Y2h2pt1XhpErmldCpFpnYad5vZEbIR1n96F3cz6
 0ft6iUJpd3bf+KWxDUc707Vln42optvbcp7gjF+6mdShb0jlFkV9eOa85aJH6v38
 5kKPhLfiv1Qs1sZXPrWc2oQUIc0LDY19sXtw/5DTLe4+r6ybsKlF1o4+b2yOVeiu
 nrm1F/2D/829w3+4iYE63wACPGvyVaKYROtYgquyYkrI+6xyh1fmnout6SiwLe8=
 =oEuI
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-for-davem-2016-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for 4.9

mwifiex

* properly terminate SSIDs so that uninitalised memory is not printed
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:33:44 -05:00
David S. Miller
7752f72748 Merge branch 'l2tp-fixes'
Guillaume Nault says:

====================
l2tp: fixes for l2tp_ip and l2tp_ip6 socket handling

This series addresses problems found while working on commit 32c231164b
("l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()").

The first three patches fix races in socket's connect, recv and bind
operations. The last two ones fix scenarios where l2tp fails to
correctly lookup its userspace sockets.

Apart from the last patch, which is l2tp_ip6 specific, every patch
fixes the same problem in the L2TP IPv4 and IPv6 code.

All problems fixed by this series exist since the creation of the
l2tp_ip and l2tp_ip6 modules.

Changes since v1:
  * Patch #3: fix possible uninitialised use of 'ret' in l2tp_ip_bind().
====================

Acked-by: James Chapman <jchapman@katalix.com>
2016-11-30 14:14:09 -05:00
Guillaume Nault
31e2f21fb3 l2tp: fix address test in __l2tp_ip6_bind_lookup()
The '!(addr && ipv6_addr_equal(addr, laddr))' part of the conditional
matches if addr is NULL or if addr != laddr.
But the intend of __l2tp_ip6_bind_lookup() is to find a sockets with
the same address, so the ipv6_addr_equal() condition needs to be
inverted.

For better clarity and consistency with the rest of the expression, the
(!X || X == Y) notation is used instead of !(X && X != Y).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:14:08 -05:00
Guillaume Nault
df90e68861 l2tp: fix lookup for sockets not bound to a device in l2tp_ip
When looking up an l2tp socket, we must consider a null netdevice id as
wild card. There are currently two problems caused by
__l2tp_ip_bind_lookup() not considering 'dif' as wild card when set to 0:

  * A socket bound to a device (i.e. with sk->sk_bound_dev_if != 0)
    never receives any packet. Since __l2tp_ip_bind_lookup() is called
    with dif == 0 in l2tp_ip_recv(), sk->sk_bound_dev_if is always
    different from 'dif' so the socket doesn't match.

  * Two sockets, one bound to a device but not the other, can be bound
    to the same address. If the first socket binding to the address is
    the one that is also bound to a device, the second socket can bind
    to the same address without __l2tp_ip_bind_lookup() noticing the
    overlap.

To fix this issue, we need to consider that any null device index, be
it 'sk->sk_bound_dev_if' or 'dif', matches with any other value.
We also need to pass the input device index to __l2tp_ip_bind_lookup()
on reception so that sockets bound to a device never receive packets
from other devices.

This patch fixes l2tp_ip6 in the same way.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:14:08 -05:00
Guillaume Nault
d5e3a19093 l2tp: fix racy socket lookup in l2tp_ip and l2tp_ip6 bind()
It's not enough to check for sockets bound to same address at the
beginning of l2tp_ip{,6}_bind(): even if no socket is found at that
time, a socket with the same address could be bound before we take
the l2tp lock again.

This patch moves the lookup right before inserting the new socket, so
that no change can ever happen to the list between address lookup and
socket insertion.

Care is taken to avoid side effects on the socket in case of failure.
That is, modifications of the socket are done after the lookup, when
binding is guaranteed to succeed, and before releasing the l2tp lock,
so that concurrent lookups will always see fully initialised sockets.

For l2tp_ip, 'ret' is set to -EINVAL before checking the SOCK_ZAPPED
bit. Error code was mistakenly set to -EADDRINUSE on error by commit
32c231164b ("l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()").
Using -EINVAL restores original behaviour.

For l2tp_ip6, the lookup is now always done with the correct bound
device. Before this patch, when binding to a link-local address, the
lookup was done with the original sk->sk_bound_dev_if, which was later
overwritten with addr->l2tp_scope_id. Lookup is now performed with the
final sk->sk_bound_dev_if value.

Finally, the (addr_len >= sizeof(struct sockaddr_in6)) check has been
dropped: addr is a sockaddr_l2tpip6 not sockaddr_in6 and addr_len has
already been checked at this point (this part of the code seems to have
been copy-pasted from net/ipv6/raw.c).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:14:08 -05:00
Guillaume Nault
a3c18422a4 l2tp: hold socket before dropping lock in l2tp_ip{, 6}_recv()
Socket must be held while under the protection of the l2tp lock; there
is no guarantee that sk remains valid after the read_unlock_bh() call.

Same issue for l2tp_ip and l2tp_ip6.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:14:07 -05:00
Guillaume Nault
0382a25af3 l2tp: lock socket before checking flags in connect()
Socket flags aren't updated atomically, so the socket must be locked
while reading the SOCK_ZAPPED flag.

This issue exists for both l2tp_ip and l2tp_ip6. For IPv6, this patch
also brings error handling for __ip6_datagram_connect() failures.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:14:07 -05:00