Pull userns fixes from Eric W Biederman:
"The bulk of the changes are fixing the worst consequences of the user
namespace design oversight in not considering what happens when one
namespace starts off as a clone of another namespace, as happens with
the mount namespace.
The rest of the changes are just plain bug fixes.
Many thanks to Andy Lutomirski for pointing out many of these issues."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
userns: Restrict when proc and sysfs can be mounted
ipc: Restrict mounting the mqueue filesystem
vfs: Carefully propogate mounts across user namespaces
vfs: Add a mount flag to lock read only bind mounts
userns: Don't allow creation if the user is chrooted
yama: Better permission check for ptraceme
pid: Handle the exit of a multi-threaded init.
scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.
Submitted by me since BenH is on vacation.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABCAAGBQJRUMdFAAoJEIn5HApB1cB6QzEP/1O0KL7QCbWiCnT92hvExo8a
qFHdaAk/O9qasOv/+R1ObuFOVB0oSxlYFKKz6OE5XvHYzK6406Dn2xBDdBYo+SRO
dkF6Uq7ZAzbaNXOOFa9yG6R43DDHdJgiXvQa0a8J8dJZX3OgaPSmtMMkoQIn39Fi
xLsDMjEWfOVsOe4yebi/Btmwokqd9SRlKOyPeRRHsok9vyEqyteyNPNhCXtT9dTp
s6noshnp3f7Yjk2al2UOoh74jUEc6oZahPK393FG93McuXXBZPRLQjh0JOpbHMeS
12/LoyLYlqxqAEM5bUnEICkDQZCwUShpHtoMFrufr1LF+zRcqX4sN51iCahQnlNO
XDZfAe+4UvkZGs0MbGJSgUiq/tG5D7uRoKkcgtEUmxd+gemDLVwrCnAITc9qUk3r
D+pOJB8ZNKezG/n5e/KzQurymrj2isvAB4MOXSn0OrvShukFC+ST4civXQYUMHnr
sJK9uas9MjPxjWqdXwHuaknV5EObdSAh4O5JEw5ZFZ8L1lYXXNaJrhnmPtHCD69Q
Hw58ytQ8S5tcQMylxlshJGnRYTni1/8n20HAvyyw/bDWcSbo7usrsfCYQz2czPMM
H1c+pcz/bWJrO1xxD0eX4n8EHC+JqEte1z/RNJhThZvNXvym+VdmhC3sAluM9U3H
PYPmaOeMQifvRoSh5ZSt
=rm4T
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes
Pull powerpc build fixes from Stephen Rothwell:
"Just a couple of build fixes for powerpc all{mod,yes}config.
Submitted by me since BenH is on vacation."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes:
powerpc: define the conditions where the ePAPR idle hcall can be supported
powerpc: make additional room in exception vector area
- Regression fixes for C-and-P states not being parsed properly.
- Fix possible security issue with guests triggering DoS via non-assigned MSI-Xs.
- Fix regression (introduced in v3.7) with raising an event (v2).
- Fix hastily introduced band-aid during c0 for the CR3 blowup.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJRUxlVAAoJEFjIrFwIi8fJiUsH/2a3A8EVqS7OYDNgT0ZFb1VI
rMLNiA50sRJNDsq0NbGl1Y+Lubus1czc0c7HXFQ557OakN6WqcmPPjCKp4JT6NnV
Jz/IZ0iimdoHiPru1Qe4ah3fSgzUtht2LB48Z/a0Is4k3LsRP2W3/niVC3ypnyuJ
52HjjuxeFAfXIkNeqsrO2a6cUXZeXzUyR4g9GNxDozi4jHpoPQ4j9okZbo218xH+
/pRnFeMD7t7dFkgNeyeGXUiJn2AkNPHi3Hx+RH5nN9KXQ1eem9R4p7Qpez1dUEWF
YEc/bs7MyOYezzTVHPYk77Yt8baOHJt7UbHjM6jfi1aGYYINTRr3m5mORd3rCmc=
=61IX
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
"This is mostly just the last stragglers of the regression bugs that
this merge window had. There are also two bug-fixes: one that adds an
extra layer of security, and a regression fix for a change that was
added in v3.7 (the v1 was faulty, the v2 works).
- Regression fixes for C-and-P states not being parsed properly.
- Fix possible security issue with guests triggering DoS via
non-assigned MSI-Xs.
- Fix regression (introduced in v3.7) with raising an event (v2).
- Fix hastily introduced band-aid during c0 for the CR3 blowup."
* tag 'stable/for-linus-3.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/events: avoid race with raising an event in unmask_evtchn()
xen/mmu: Move the setting of pvops.write_cr3 to later phase in bootup.
xen/acpi-stub: Disable it b/c the acpi_processor_add is no longer called.
xen-pciback: notify hypervisor about devices intended to be assigned to guests
xen/acpi-processor: Don't dereference struct acpi_processor on all CPUs.
Pull HID fixes from Jiri Kosina:
- fix for potential 3.9 regression in handling of buttons for touchpads
following HID mt specification; potential because reportedly there is
no retail product on the market that would be using this feature, but
nevertheless we'd better follow the spec. Fix by Benjamin Tissoires.
- support for two quirky devices added by Josh Boyer.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: multitouch: fix touchpad buttons
HID: usbhid: fix build problem
HID: usbhid: quirk for MSI GX680R led panel
HID: usbhid: quirk for Realtek Multi-card reader
Here are some fixes which have collected since Linux v3.9-rc1. The most
important one fixes a long-standing regressen which make re-hotplugged
devices unusable when AMD IOMMU is used. The other patches fix build
issues (build regression on OMAP and a section mismatch). One patch just
removes a duplicate header include.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRUrpyAAoJECvwRC2XARrjOQcQAMN2Rza2Ios/MgPmIPI5e+76
ogSYAnl3biSeYe7pLYzOOwmzIIGx8N/GOzM+XsoV8HF4y8dbPNrrNCKpAwweJrcS
RQUUw22+HZglg3nR8gVWuwx1BU6Gy8AukzjrAgBIIuKpxOloFGYDJD+NXn3ctq/v
rULPXc7M6bZDIoXu4Mjv0otCdh+CpTYl24TNs59vEJIibipQC5PKLzqCT6AhA6qP
4G3irtrlju63B2ItVAWDI1Ge2JF00qpXMhJKojCimv8bqooaG+XXOUg3ip4G61hp
6HgYkfozf081Y4vy4ojpu7RfysemA10gYy60f/g+uaMLvjCG5eLITlghUTzo7l30
8sTD6zB4V98tPK2V5CQ6S6WoVyd0Hq7u0ugt2nmVmBHk7VVKcc4JmViKxeNTWeOa
N/uKNoKD+Kb+ma0mSG/ig5hOHpBeOXrPDyDSDtbR+s7G6TRpmd5JKOs+Mjv0HhC1
Ny+NanFJlSWECpK3uhHQ68wQ9e75XppbEJfrUfdrsMBFbQlpE2lwvakeiuzqHLmO
/ZVyyAKXe0R4NEVNlqf2kOVQznBKIX+88Jn4NeCPR2go70kHLrWKCVrH4flv0qD9
nj08Ko8IY+jW+oXpcH7jZLTxdqVo031BAPV0hHoJvxbDAl0QyGu/B4VFWrVnesnM
mpx0vCbObLUdlkp9bXWK
=/K5U
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Here are some fixes which have collected since Linux v3.9-rc1.
The most important one fixes a long-standing regressen which make
re-hotplugged devices unusable when AMD IOMMU is used.
The other patches fix build issues (build regression on OMAP and a
section mismatch). One patch just removes a duplicate header include."
* tag 'iommu-fixes-v3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Make sure dma_ops are set for hotplug devices
x86, io_apic: remove duplicated include from irq_remapping.c
iommu: OMAP: build only on OMAP2+
amd_iommu_init: remove __init from amd_iommu_erratum_746_workaround
Commit 06ae43f34b ("Don't bother with redoing rw_verify_area() from
default_file_splice_from()") lost the checks to test existence of the
write/aio_write methods. My apologies ;-/
Eventually, we want that in fs/splice.c side of things (no point
repeating it for every buffer, after all), but for now this is the
obvious minimal fix.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In unmask_evtchn(), when the mask bit is cleared after testing for
pending and the event becomes pending between the test and clear, then
the upcall will not become pending and the event may be lost or
delayed.
Avoid this by always clearing the mask bit before checking for
pending. If a hypercall is needed, remask the event as
EVTCHNOP_unmask will only retrigger pending events if they were
masked.
This fixes a regression introduced in 3.7 by
b5e579232d (xen/events: fix
unmask_evtchn for PV on HVM guests) which reordered the clear mask and
check pending operations.
Changes in v2:
- set mask before hypercall.
Cc: stable@vger.kernel.org
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We move the setting of write_cr3 from the early bootup variant
(see git commit 0cc9129d75
"x86-64, xen, mmu: Provide an early version of write_cr3.")
to a more appropiate location.
This new location sets all of the other non-early variants
of pvops calls - and most importantly is before the
alternative_asm mechanism kicks in.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
With the Xen ACPI stub code (CONFIG_XEN_STUB=y) enabled, the power
C and P states are no longer uploaded to the hypervisor.
The reason is that the Xen CPU hotplug code: xen-acpi-cpuhotplug.c
and the xen-acpi-stub.c register themselves as the "processor" type object.
That means the generic processor (processor_driver.c) stops
working and it does not call (acpi_processor_add) which populates the
per_cpu(processors, pr->id) = pr;
structure. The 'pr' is gathered from the acpi_processor_get_info function
which does the job of finding the C-states and figuring out PBLK address.
The 'processors->pr' is then later used by xen-acpi-processor.c (the one that
uploads C and P states to the hypervisor). Since it is NULL, we end
skip the gathering of _PSD, _PSS, _PCT, etc and never upload the power
management data.
The end result is that enabling the CONFIG_XEN_STUB in the build means that
xen-acpi-processor is not working anymore.
This temporary patch fixes it by marking the XEN_STUB driver as
BROKEN until this can be properly fixed.
CC: jinsong.liu@intel.com
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Only allow unprivileged mounts of proc and sysfs if they are already
mounted when the user namespace is created.
proc and sysfs are interesting because they have content that is
per namespace, and so fresh mounts are needed when new namespaces
are created while at the same time proc and sysfs have content that
is shared between every instance.
Respect the policy of who may see the shared content of proc and sysfs
by only allowing new mounts if there was an existing mount at the time
the user namespace was created.
In practice there are only two interesting cases: proc and sysfs are
mounted at their usual places, proc and sysfs are not mounted at all
(some form of mount namespace jail).
Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Only allow mounting the mqueue filesystem if the caller has CAP_SYS_ADMIN
rights over the ipc namespace. The principle here is if you create
or have capabilities over it you can mount it, otherwise you get to live
with what other people have mounted.
This information is not particularly sensitive and mqueue essentially
only reports which posix messages queues exist. Still when creating a
restricted environment for an application to live any extra
information may be of use to someone with sufficient creativity. The
historical if imperfect way this information has been restricted has
been not to allow mounts and restricting this to ipc namespace
creators maintains the spirit of the historical restriction.
Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
As a matter of policy MNT_READONLY should not be changable if the
original mounter had more privileges than creator of the mount
namespace.
Add the flag CL_UNPRIVILEGED to note when we are copying a mount from
a mount namespace that requires more privileges to a mount namespace
that requires fewer privileges.
When the CL_UNPRIVILEGED flag is set cause clone_mnt to set MNT_NO_REMOUNT
if any of the mnt flags that should never be changed are set.
This protects both mount propagation and the initial creation of a less
privileged mount namespace.
Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
When a read-only bind mount is copied from mount namespace in a higher
privileged user namespace to a mount namespace in a lesser privileged
user namespace, it should not be possible to remove the the read-only
restriction.
Add a MNT_LOCK_READONLY mount flag to indicate that a mount must
remain read-only.
CC: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Guarantee that the policy of which files may be access that is
established by setting the root directory will not be violated
by user namespaces by verifying that the root directory points
to the root of the mount namespace at the time of user namespace
creation.
Changing the root is a privileged operation, and as a matter of policy
it serves to limit unprivileged processes to files below the current
root directory.
For reasons of simplicity and comprehensibility the privilege to
change the root directory is gated solely on the CAP_SYS_CHROOT
capability in the user namespace. Therefore when creating a user
namespace we must ensure that the policy of which files may be access
can not be violated by changing the root directory.
Anyone who runs a processes in a chroot and would like to use user
namespace can setup the same view of filesystems with a mount
namespace instead. With this result that this is not a practical
limitation for using user namespaces.
Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Commit "HID: multitouch: use the callback "report" instead..." breaks the
buttons of touchpads following the HID multitouch specification.
The buttons were emmitted through hid-input, but as now the events
are generated only in hid-multitouch, the buttons are not emmitted anymore.
The input_event() call is far much simpler than the hid-input one as
many of the different tests do not apply to multitouch touchpads.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
There is a bug introduced with commit 27c2127 that causes
devices which are hot unplugged and then hot-replugged to
not have per-device dma_ops set. This causes these devices
to not function correctly. Fixed with this patch.
Cc: stable@vger.kernel.org
Reported-by: Andreas Degert <andreas.degert@googlemail.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Pull vfs fixes from Al Viro:
"stable fodder; assorted deadlock fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vt: synchronize_rcu() under spinlock is not nice...
Nest rename_lock inside vfsmount_lock
Don't bother with redoing rw_verify_area() from default_file_splice_from()
vcs_poll_data_free() calls unregister_vt_notifier(), which calls
atomic_notifier_chain_unregister(), which calls synchronize_rcu().
Do it *after* we'd dropped ->f_lock.
Cc: stable@vger.kernel.org (all kernels since 2.6.37)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
... lest we get livelocks between path_is_under() and d_path() and friends.
The thing is, wrt fairness lglocks are more similar to rwsems than to rwlocks;
it is possible to have thread B spin on attempt to take lock shared while thread
A is already holding it shared, if B is on lower-numbered CPU than A and there's
a thread C spinning on attempt to take the same lock exclusive.
As the result, we need consistent ordering between vfsmount_lock (lglock) and
rename_lock (seq_lock), even though everything that takes both is going to take
vfsmount_lock only shared.
Spotted-by: Brad Spengler <spender@grsecurity.net>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull networking fixes from David Miller:
1) Always increment IPV4 ID field in encapsulated GSO packets, even
when DF is set. Regression fix from Pravin B Shelar.
2) Fix per-net subsystem initialization in netfilter conntrack,
otherwise we may access dynamically allocated memory before it is
actually allocated. From Gao Feng.
3) Fix DMA buffer lengths in iwl3945 driver, from Stanislaw Gruszka.
4) Fix race between submission of sync vs async commands in mwifiex
driver, from Amitkumar Karwar.
5) Add missing cancel of command timer in mwifiex driver, from Bing
Zhao.
6) Missing SKB free in rtlwifi USB driver, from Jussi Kivilinna.
7) Thermal layer tries to use a genetlink multicast string that is
longer than the 16 character limit. Fix it and add a BUG check to
prevent this kind of thing from happening in the future.
From Masatake YAMATO.
8) Fix many bugs in the handling of the teardown of L2TP connections,
UDP encapsulation instances, and sockets. From Tom Parkin.
9) Missing socket release in IRDA, from Kees Cook.
10) Fix fec driver modular build, from Fabio Estevam.
11) Erroneous use of kfree() instead of free_netdev() in lantiq_etop,
from Wei Yongjun.
12) Fix bugs in handling of queue numbers and steering rules in mlx4
driver, from Moshe Lazer, Hadar Hen Zion, and Or Gerlitz.
13) Some FOO_DIAG_MAX constants were defined off by one, fix from Andrey
Vagin.
14) TCP segmentation deferral is unintentionally done too strongly,
breaking ACK clocking. Fix from Eric Dumazet.
15) net_enable_timestamp() can legitimately be invoked from software
interrupts, and in a way that is safe, so remove the WARN_ON().
Also from Eric Dumazet.
16) Fix use after free in VLANs, from Cong Wang.
17) Fix TCP slow start retransmit storms after SACK reneging, from
Yuchung Cheng.
18) Unix socket release should mark a socket dead before NULL'ing out
sock->sk, otherwise we can race. Fix from Paul Moore.
19) IPV6 addrconf code can try to free static memory, from Hong Zhiguo.
20) Fix register mis-programming, NULL pointer derefs, and wrong PHC
clock frequency in IGB driver. From Lior LevyAlex Williamson, Jiri
Benc, and Jeff Kirsher.
21) skb->ip_summed logic in pch_gbe driver is reversed, breaking packet
forwarding. Fix from Veaceslav Falico.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
ipv4: Fix ip-header identification for gso packets.
bonding: remove already created master sysfs link on failure
af_unix: dont send SCM_CREDENTIAL when dest socket is NULL
pch_gbe: fix ip_summed checksum reporting on rx
igb: fix PHC stopping on max freq
igb: make sensor info static
igb: SR-IOV init reordering
igb: Fix null pointer dereference
igb: fix i350 anti spoofing config
ixgbevf: don't release the soft entries
ipv6: fix bad free of addrconf_init_net
unix: fix a race condition in unix_release()
tcp: undo spurious timeout after SACK reneging
bnx2x: fix assignment of signed expression to unsigned variable
bridge: fix crash when set mac address of br interface
8021q: fix a potential use-after-free
net: remove a WARN_ON() in net_enable_timestamp()
tcp: preserve ACK clocking in TSO
net: fix *_DIAG_MAX constants
net/mlx4_core: Disallow releasing VF QPs which have steering rules
...
- Fix an NFSv4 idmapper regression
- Fix an Oops in the pNFS blocks client
- Fix up various issues with pNFS layoutcommit
- Ensure correct read ordering of variables in rpc_wake_up_task_queue_locked
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJRUedyAAoJEGcL54qWCgDyar0P/2pTT/yxX8ejTu5DmY7e4PYJ
jhPG2AEqY/yMLn9GvB375VIs1L8tuY50+3NFhWZFjyNbEU3GV+5Y+kPpBtAgYiSI
VyIXiJ/xMtXdYJMYuE/nh5jbcqJsHwGjpcIaSd5BuWzQUaoUYvLulxWd4QN8mmaT
5SuzmgV+7WIqV6RjlaYF82srcOKAjwemcrfRkCNzzJr6aT39gH2YdYFbDaTr7qhU
fw0x3QlI7887vSNQcfaGbC1+jr6oe8wRCneOR0tceU/8bcj6zlUDk5HxqSOc28mA
jUQieoVRggcM4s5DFpNcuwW6qCPZOmzv/OFD6oqnhyyonPOrue+7zaoujZmGNmjx
dT2V/jQehanYD25WpDO8OyFXUeYE4x9bgHKsszhBTwr4x5D8ceEJ1sugcOPiTTxu
tflbbuWbt+BguvXp4p8QayUj0V2cplM/nOovWyUG+BH46sz3Dtv46NOgJeO2a29g
T6jayxmKCxvtPKtG0j34BzLngiKabZTSEhFms6Qarp9lwWvHWrR9KWGuDBNvy1Ts
GMBN8P6Ib40yVi6Pwlj5Jpy6yLKVklHtJQpactr63AZmYrF4bBBSom+MWAh3X1iO
QtF0x9Z1bBkXY2Q/u+3vWMxQtEPeW+pSiloj8aiceFAt33zKM+1bLofDhEw0s2fI
wJEHYsGyGtDQINgP0v1e
=OPbZ
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Fix an NFSv4 idmapper regression
- Fix an Oops in the pNFS blocks client
- Fix up various issues with pNFS layoutcommit
- Ensure correct read ordering of variables in
rpc_wake_up_task_queue_locked
* tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked
NFSv4.1: Add a helper pnfs_commit_and_return_layout
NFSv4.1: Always clear the NFS_INO_LAYOUTCOMMIT in layoutreturn
NFSv4.1: Fix a race in pNFS layoutcommit
pnfs-block: removing DM device maybe cause oops when call dev_remove
NFSv4: Fix the string length returned by the idmapper
Change the permission check for yama_ptrace_ptracee to the standard
ptrace permission check, testing if the traceer has CAP_SYS_PTRACE
in the tracees user namespace.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
ip-header id needs to be incremented even if IP_DF flag is set.
This behaviour was changed in commit 490ab08127
(IP_GRE: Fix IP-Identification).
Following patch fixes it so that identification is always
incremented.
Reported-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If slave sysfs symlink failes to be created - we end up without removing
the master sysfs symlink. Remove it in case of failure.
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SCM_SCREDENTIALS should apply to write() syscalls only either source or destination
socket asserted SOCK_PASSCRED. The original implememtation in maybe_add_creds is wrong,
and breaks several LSB testcases ( i.e. /tset/LSB.os/netowkr/recvfrom/T.recvfrom).
Origionally-authored-by: Karel Srot <ksrot@redhat.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
This series contains updates to ixgbevf and igb.
The ixgbevf calls to pci_disable_msix() and to free the msix_entries
memory should not occur if device open fails. Instead they should be
called during device driver removal to balance with the call to
pci_enable_msix() and the call to allocate msix_entries memory
during the device probe and driver load.
The remaining 4 of 5 igb patches are simple 1-3 line patches to fix
several issues such as possible null pointer dereference, PHC stopping
on max frequency, make sensor info static and SR-IOV initialization
reordering.
The remaining igb patch to fix anti-spoofing config fixes a problem
in i350 where anti spoofing configuration was written into a wrong
register.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
skb->ip_summed should be CHECKSUM_UNNECESSARY when the driver reports that
checksums were correct and CHECKSUM_NONE in any other case. They're
currently placed vice versa, which breaks the forwarding scenario. Fix it
by placing them as described above.
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a multi-threaded init exits and the initial thread is not the
last thread to exit the initial thread hangs around as a zombie
until the last thread exits. In that case zap_pid_ns_processes
needs to wait until there are only 2 hashed pids in the pid
namespace not one.
v2. Replace thread_pid_vnr(me) == 1 with the test thread_group_leader(me)
as suggested by Oleg.
Cc: stable@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Reported-by: Caj Larsson <caj@omnicloud.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
For 82576 MAC type, max_adj is reported as 1000000000 ppb. However, if
this value is passed to igb_ptp_adjfreq_82576, incvalue overflows out of
INCVALUE_82576_MASK, resulting in setting of zero TIMINCA.incvalue, stopping
the PHC (instead of going at twice the nominal speed).
Fix the advertised max_adj value to the largest value hardware can handle.
As there is no min_adj value available (-max_adj is used instead), this will
also prevent stopping the clock intentionally. It's probably not a big deal,
other igb MAC types don't support stopping the clock, either.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Trivial sparse warning.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
igb is ineffective at setting a lower total VFs because:
int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs)
{
...
/* Shouldn't change if VFs already enabled */
if (dev->sriov->ctrl & PCI_SRIOV_CTRL_VFE)
return -EBUSY;
Swap init ordering.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The max_vfs= option has always been self limiting to the number of VFs
supported by the device. fa44f2f1 added SR-IOV configuration via
sysfs, but in the process broke this self correction factor. The
failing path is:
igb_probe
igb_sw_init
if (max_vfs > 7) {
adapter->vfs_allocated_count = 7;
...
igb_probe_vfs
igb_enable_sriov(, max_vfs)
if (num_vfs > 7) {
err = -EPERM;
...
This leaves vfs_allocated_count = 7 and vf_data = NULL, so we bomb out
when igb_probe finally calls igb_reset. It seems like a really bad
idea, and somewhat pointless, to set vfs_allocated_count separate from
vf_data, but limiting max_vfs is enough to avoid the null pointer.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix a problem in i350 where anti spoofing configuration was written into a
wrong register.
Signed-off-by: Lior Levy <lior.levy@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the ixgbevf driver is opened the request to allocate MSIX irq
vectors may fail. In that case the driver will call ixgbevf_down()
which will call ixgbevf_irq_disable() to clear the HW interrupt
registers and calls synchronize_irq() using the msix_entries pointer in
the adapter structure. However, when the function to request the MSIX
irq vectors failed it had already freed the msix_entries which causes
an OOPs from using the NULL pointer in synchronize_irq().
The calls to pci_disable_msix() and to free the msix_entries memory
should not occur if device open fails. Instead they should be called
during device driver removal to balance with the call to
pci_enable_msix() and the call to allocate msix_entries memory
during the device probe and driver load.
Signed-off-by: Li Xun <xunleer.li@huawei.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Pull timer fix from Thomas Gleixner:
"A single bugfix which prevents that a non functional timer device is
selected to provide the fallback device, which is supposed to serve
timer interrupts on behalf of non functional devices ..."
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clockevents: Don't allow dummy broadcast timers
For 32-bit, CONFIG_EPAPR_PARAVIRT pulls in both epapr_paravirt.c
and epapr_hcalls.c which contains the 32-bit paravirt idle loop.
For 64-bit, the paravirt idle loop is in idle_book3e.S and that
source file is included only if CONFIG_PPC_BOOK3E_64 defined.
This patch makes that dependency for 64-bit explicit.
Fixes these build errors:
arch/powerpc/kernel/built-in.o: In function `restore_pblist_ptr':
ftrace.c:(.toc+0xdc0): undefined reference to `epapr_ev_idle_start'
ftrace.c:(.toc+0xdd0): undefined reference to `epapr_ev_idle'
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
As reported by Jan, and others over the past few years, there is a
race condition caused by unix_release setting the sock->sk pointer
to NULL before properly marking the socket as dead/orphaned. This
can cause a problem with the LSM hook security_unix_may_send() if
there is another socket attempting to write to this partially
released socket in between when sock->sk is set to NULL and it is
marked as dead/orphaned. This patch fixes this by only setting
sock->sk to NULL after the socket has been marked as dead; I also
take the opportunity to make unix_release_sock() a void function
as it only ever returned 0/success.
Dave, I think this one should go on the -stable pile.
Special thanks to Jan for coming up with a reproducer for this
problem.
Reported-by: Jan Stancek <jan.stancek@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix for TX lockup in IPoIB
- QLogic -> Intel update for qib driver
- Small static checker fix for qib
- Fix error path return value in cxgb4
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABCAAGBQJRUH4BAAoJEENa44ZhAt0hP7gP/1SAVyeL8H/QNG6FDki7WUJC
qoW4bnezuNb35/H9NSPUQ/XA0QY2XKdJamkSnDJgE5WVjWZS80hKUBcdUi3S2GBl
Zbbs7H0Z2c/fiDvULrxRwqkZ/MlW4a6NnyKRHfSswKY+bAi+uybGoYGKmczXV4gC
LlscYagb4WYVTNVlD5JG9sxo5ChueAx51IbzuOWq/94iyKtocFB87ZyK2FhbQaua
n6af7lijAcTk02ad4JY2+ZZOOjGgOxP5+iigExzkXuS28dgNvNtN8FrrzLbiDHHy
aZ7miLSdPawO790nvXzn1NGL/nZiUmRI46d7qv6K/X38IX7GUFtMNEDUFCCZknwg
4VtNXzFExQy4YLCMpZ6di6rxMNVQ/EGj7QLN8jfD/pBEfiH/BMISdITL2KbgzcUv
5TUQGeQFzzyY+2HwUCqb0RCskbhsIUwLDpEnhnyHI2sZKbdRrpI07ZpL3iAhTcqK
5RDyXHNaVudVaKzaxq96vuw2OhH5fvACp3SdpaoCgZ54XuqoDoVW2q54vlj7anQI
E0/f+EDyBf8Nu5YydQri76HE+BqG+PzCqU5Rq9866itLEG2DMsmAj9KJJUcDTiCn
ANUIHoOpriGBAm1NP5tau/C0isvTw+lD/pBh26sdum1FwhoeqVf55PMvhxMG4Q/N
aC3g+RMj3BPkdhewWpY+
=SOee
-----END PGP SIGNATURE-----
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband/rdma fixes from Roland Dreier:
"Small batch of InfiniBand/RDMA fixes for 3.9:
- Fix for TX lockup in IPoIB
- QLogic -> Intel update for qib driver
- Small static checker fix for qib
- Fix error path return value in cxgb4"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/qib: change QLogic to Intel
IB/ipath: Silence a static checker warning
IPoIB: Fix send lockup due to missed TX completion
RDMA/cxgb4: Fix error return code in create_qp()
Four patches for arm-soc this week:
- Kevin Hilman is no longer reachable under his previous email address.
He submitted the patch earlier, but nobody felt responsible to pick
it up.
- One Tegra fix for an incorect register address in device tree.
- IMX multiplatform support exposes a configuration option that
leads to unbootable kernels on all other machines and that needs
to depend on that platform.
- A nontrivial bug fix for the setup of the mxs video output.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAUVBQf2CrR//JCVInAQJwyRAA0qzt+jbn+ezvLpeWvrgdEVLikiAieVyt
HmSex/pEWh2ytIkrX2maHv5Oov9QmX7v4ZEU2WtdvMv5YERsT7/y1hjHaZTLcdmH
11ogfVwTRbrNVWOb41ofdd3rUVgZfgzCGQ0rfEua3wLRK6AetZJxkqsuGXRaqjdm
BBbmgKAmLsLeM3/aBzeuFint1+EDY74WBMxgqkwUretefKFMxzcBaqhoR+FNDIdV
YaYbOocq45LsOa44gxlF6pmJkZsOsB2pqAjoANm5KtZlphTEpDD1C/wXvBaVAOBh
8mCuk2mHEyZsyLrufh/ZywaPcDaUMDwpO1zidATwaRCf6qWOr3jtWiCtQo4FeNYS
o+kkYtELyAEvwDQuljghviq0p+z2vpnk52sYdXkYW8Pgz5TqhK+Gu2ywaeiqeT2B
cbLGG32lVgJnmWOOXI7Z6MjekgKx5arx7z6Br+1pTT/fE44DgE+CtabEsCdcpFWG
ftC7FdxZabDUhfynSaO43tgKhdVv2XpVobG3iW2RYJOWm2dJSxulZg9+jypdxITm
VX3kPar+mjvwyf3svsGIc65SVaayR6pfiLzV8qBtR3trFRbIlrI7vo21d2tFtuS8
PfAR+9VHkeOdjVKDbu1sl7YycWz03xq4cM9XPFhvrobeFXb5OFwDwLWn4DZR5ZWY
iJSMJkaBSww=
=F6UW
-----END PGP SIGNATURE-----
Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC bug fixes from Arnd Bergmann:
"Four patches for arm-soc this week:
- Kevin Hilman is no longer reachable under his previous email
address. He submitted the patch earlier, but nobody felt
responsible to pick it up.
- One Tegra fix for an incorect register address in device tree.
- IMX multiplatform support exposes a configuration option that leads
to unbootable kernels on all other machines and that needs to
depend on that platform.
- A nontrivial bug fix for the setup of the mxs video output."
* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
MAINTAINERS: update email address for Kevin Hilman
ARM: tegra: fix register address of slink controller
ARM: imx: add dependency check for DEBUG_IMX_UART_PORT
ARM: video: mxs: Fix mxsfb misconfiguring VDCTRL0
Pull nfsd bugfixes from J Bruce Fields:
"Fixes for a couple mistakes in the new DRC code. And thanks to Kent
Overstreet for noticing we've been sync'ing the wrong range on stable
writes since 3.8."
* 'for-3.9' of git://linux-nfs.org/~bfields/linux:
nfsd: fix bad offset use
nfsd: fix startup order in nfsd_reply_cache_init
nfsd: only unhash DRC entries that are in the hashtable
We need to be careful when testing task->tk_waitqueue in
rpc_wake_up_task_queue_locked, because it can be changed while we
are holding the queue->lock.
By adding appropriate memory barriers, we can ensure that it is safe to
test task->tk_waitqueue for equality if the RPC_TASK_QUEUED bit is set.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Pull drm fixes from Dave Airlie:
"Exynos and Intel fixes.
The intel fixes are fairly straightforward, mostly reverts due to bugs
found. The exynos one is a big larger since they found some issues
with the G2D engine and iommu interaction, and needed to verify the
operations a lot better than they were previously, otherwise a user
app can just crash the kernel with an iommu fault."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
Revert "drm/i915: write backlight harder"
drm/i915: don't disable the power well yet
Revert "drm/i915: set TRANSCODER_EDP even earlier"
drm/exynos: Check g2d cmd list for g2d restrictions
drm/exynos: Add a new function to get gem buffer size
drm/exynos: Deal with g2d buffer info more efficiently
drm/exynos: Clean up some G2D codes for readability
drm/exynos: Fix G2D core malfunctioning issue
drm/exynos: clear node object type at gem unmap
drm/exynos: Fix error routine to getting dma addr.
drm/exynos: Replaced kzalloc & memcpy with kmemdup
drm/exynos: fimd: calculate the correct address offset
drm/exynos: Make mixer_check_timing static
drm/exynos: modify the compatible string for exynos fimd
The FWNMI region is fixed at 0x7000 and the vector are now overflowing
that with allmodconfig. Fix that by moving slb_miss_realmode code out
of that region as it doesn't need to be that close to the call sites
(it is a _GLOBAL function)
Fixes this build error:
arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
arch/powerpc/kernel/exceptions-64s.S:1304: Error: attempt to move .org backwards
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Daniel writes:
"Just three revert/disable by default patches, one of them cc: stable
(since the offending commit was cc: stable, too)."
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
Revert "drm/i915: write backlight harder"
drm/i915: don't disable the power well yet
Revert "drm/i915: set TRANSCODER_EDP even earlier"
Inki writes:
Includes bug fixes and code cleanups.
And it considers some restrictions to G2D hardware.
With this, the malfunction and page fault issues to g2d driver
would be fixed.
* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm/exynos: Check g2d cmd list for g2d restrictions
drm/exynos: Add a new function to get gem buffer size
drm/exynos: Deal with g2d buffer info more efficiently
drm/exynos: Clean up some G2D codes for readability
drm/exynos: Fix G2D core malfunctioning issue
drm/exynos: clear node object type at gem unmap
drm/exynos: Fix error routine to getting dma addr.
drm/exynos: Replaced kzalloc & memcpy with kmemdup
drm/exynos: fimd: calculate the correct address offset
drm/exynos: Make mixer_check_timing static
drm/exynos: modify the compatible string for exynos fimd
On SACK reneging the sender immediately retransmits and forces a
timeout but disables Eifel (undo). If the (buggy) receiver does not
drop any packet this can trigger a false slow-start retransmit storm
driven by the ACKs of the original packets. This can be detected with
undo and TCP timestamps.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fix for incorrect assignment of signed expression to unsigned variable.
Signed-off-by: Kumar Amit Mehta <gmate.amit@gmail.com>
Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When I tried to set mac address of a bridge interface to a mac
address which already learned on this bridge, I got system hang.
The cause is straight forward: function br_fdb_change_mac_address
calls fdb_insert with NULL source nbp. Then an fdb lookup is
performed. If an fdb entry is found and it's local, it's OK. But
if it's not local, source is dereferenced for printk without NULL
check.
Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>