Instead of calling 3 times ntohs(random->param_hdr.length), 2 times
ntohs(hmacs->param_hdr.length), and 3 times ntohs(chunks->param_hdr.length)
within the same function, we only call each once and store it in a
variable.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.8.0-rc5+ #82 Not tainted
---------------------------------------------------------
swapper/0/0 just changed the state of lock:
(&(&fep->hw_lock)->rlock){..-...}, at: [<8034e2f8>] fec_enet_start_xmit+0x48/0x 2cc
but this lock took another, SOFTIRQ-unsafe lock in the past:
(prepare_lock){+.+.+.}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(prepare_lock);
local_irq_disable()
lock(&(&fep->hw_lock)->rlock);
lock(prepare_lock);
<Interrupt>
lock(&(&fep->hw_lock)->rlock);
*** DEADLOCK ***
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Old method will cause init spinlock twice.
New method will avoid init spinlock twice and fix miss init spinlock
at fec_restart.
BUG: spinlock bad magic on CPU#1, swapper/0/1
lock: 0xbfae0f8c, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
Backtrace:
[<80011d54>] (dump_backtrace+0x0/0x10c) from [<804e7800>] (dump_stack+0x18/0x1c)
r6:bfae0000 r5:bfae0f8c r4:00000000 r3:806c1310
[<804e77e8>] (dump_stack+0x0/0x1c) from [<804e9f20>] (spin_dump+0x80/0x94)
[<804e9ea0>] (spin_dump+0x0/0x94) from [<804e9f60>] (spin_bug+0x2c/0x30)
r5:805f6f8c r4:bfae0f8c
[<804e9f34>] (spin_bug+0x0/0x30) from [<80257984>] (do_raw_spin_lock+0x170/0x1b0 )
r5:806b4950 r4:bfae0f8c
[<80257814>] (do_raw_spin_lock+0x0/0x1b0) from [<804ed15c>] (_raw_spin_lock_irqs ave+0x18/0x20)
[<804ed144>] (_raw_spin_lock_irqsave+0x0/0x20) from [<8033c694>] (fec_ptp_start_ cyclecounter+0x3c/0x120)
r4:bfae0f8c r3:00000002
[<8033c658>] (fec_ptp_start_cyclecounter+0x0/0x120) from [<80339e08>] (fec_resta rt+0x56c/0x5f8)
r8:00000000 r7:806e6f48 r6:00000112 r5:806b4950 r4:bfae0000
[<8033989c>] (fec_restart+0x0/0x5f8) from [<8033b9e4>] (fec_probe+0x508/0xa48)
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix issue in Mellanox driver related to BQL. netdev_tx_reset_queue
was not being called in certain situations where the device was
being start and stopped. Moved netdev_tx_reset_queue from the reset
device path to mlx4_en_free_tx_buf which is where the rings are
cleaned in a reset (specifically from device being stopped).
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amir Vadai says:
====================
This series from Yan Burman adds support for unicast MAC address filtering and
ndo FDB operations. It also includes some optimizations to loopback related
decisions and checks in the TX/RX fast path and one cleanup, all in separate
patches.
Today, when adding macvlan devices, the NIC goes into promiscuous mode, since
unicast MAC filtering is not supported. With these changes, macvlan devices can
be added without the penalty of promiscuous mode.
If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).
Also, now it is possible to have bridge under multi-function configuration that include
PF and VFs. In order to use bridge over PF/VFs, VM MAC fdb entries must be added e.g.
using 'bridge fdb add' command.
Changes from v1 - based on more comments from Eric Dumazet:
* added failure handling when adding unicast address filter
Changes from v0 - based on comments from Eric Dumazet:
* Removed unneeded synchronize_rcu()
* Use kfree_rcu() instead of synchronize_rcu() + kfree()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for setting embedded switch fdb in case of SRIOV, by
implementing ndo_fdb_{add, del, dump}. This will allow to use
bridged configuration with multi-function. In order to add VM MAC
to the eSwitch fdb, the following command may be used over the relevant function interface:
bridge fdb add <MAC> permanent self dev <IFACE>
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement and advertise unicast MAC filtering, such that setting macvlan
instance over mlx4_en interfaces will not require the networking core
to put mlx4_en devices in promiscuous mode.
If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As a preparation step for supporting multiple unicast addresses, store MAC addresses in hash table.
Remove the radix tree for MAC addresses per QP, as it's not in use.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation to having more than one unicast MAC per port, we need to keep track
of the previous MAC address in the flow of ndo_set_mac_address,
so that mlx4_en_replace_mac will know what to replace.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, mlx4_en_do_set_multicast serves as the ndo_set_rx_mode entry for mlx4_en,
doing all related work. Split it to few calls, one per required functionality
(e.g multicast, promiscuous, etc) and rename some structures and calls
to use rx_mode notation instead of multicast.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move low level code that deals with management of Ethernet MACs and QPs from mlx4_core to mlx4_en.
Also convert the new functions to deal with MACs in form of char array instead of u64.
Actual functions moved:
mlx4_replace_mac
mlx4_get_eth_qp
mlx4_put_eth_qp
To conduct this change, some functionality had to be exported from the core,
the following functions were added:
mlx4_get_base_qp
__mlx4_replace_mac (low level function for CX1/A0 compatibility)
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the code consistent in regard to error messages
not spanning multiple lines.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, RX path code that does RX filtering is not optimized
and does an expensive conversion. In order to use ether_addr_equal_64bits
which is optimized for such cases, we need the MAC address kept by the device
to be in the form of unsigned char array instead of u64. Store the MAC address
as unsigned char array and convert to/from u64 out of the fast path when needed.
Side effect of this is that we no longer need priv->mac, since it's the same
as dev->dev_addr.
This optimization was suggested by Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there are relatively complex conditional checks in the fast path,
for TX loopback enabling and resulting RX filter logic.
Move elaborate if's out of data path, replace them with a single flag
for each state and update that state from appropriate places.
Also, in native (non SRIOV) mode and not in loopback or in selftest,
there is no need to try and filter out packets that HW loopback-ed,
as in native mode we do not loopback packets anymore.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When changing the device from or to promisc mode this only affects the
device after the device is bought up the next time. For bridging it is
needed to change the device to promisc mode while it is up, which is
possible with this patch.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The generic implementation just changes the netdev struct and does not
write the new mac address to the hardware or issues some command to do
so.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting up IPv6 addresses on configurations with many macvlans
is not really working, as many multicast messages are dropped.
Add a multicast filter to macvlan to reduce the amount of cloned
skbs and overhead.
Successfully tested with 1024 macvlans on one ethernet device.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On 64 bit arches :
There is a off-by-one error in qdisc_pkt_len_init() because
mac_header is not set in xmit path.
skb_mac_header() returns an out of bound value that was
harmless because hdr_len is an 'unsigned int'
On 32bit arches, the error is abysmal.
This patch is also a prereq for "macvlan: add multicast filter"
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_gso_segment() is almost always called in tx path,
except for openvswitch. It calls this function when
it receives the packet and tries to queue it to user-space.
In this special case, the ->ip_summed check inside
skb_gso_segment() is no longer true, as ->ip_summed value
has different meanings on rx path.
This patch adjusts skb_gso_segment() so that we can at least
avoid such warnings on checksum.
Cc: Jesse Gross <jesse@nicira.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
head buffer is only temporary available in mac802154_header_create.
So it's not necessary to put it on the heap.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
head buffer is only temporary available in lowpan_header_create.
So it's not necessary to put it on the heap.
Also fixed a comment codestyle issue.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some modes don't require any special carrier handling so
in these cases, the kernel can control the carrier as for
any other interface. However, some other modes, e.g. lacp,
requires more than just that, so userspace needs to control
the carrier itself.
The daemon today is ready to control it, but the kernel
still can change it based on events.
This fix so that either kernel or userspace is controlling
the carrier.
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
adding support for VLAN interface for cpsw.
CPSW VLAN Capability
* Can filter VLAN packets in Hardware
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add helper functions for VLAN ALE implementations for Add, Delete
Dump VLAN related ALE entries
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vercera was recently backporting commit
9c13cb8bb4 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initalization, it does nothing for the driver
during shutdown. I.e. it would be entirely possible to have the
ndo_poll_controller method (or subsequently the ndo_poll) routine called for a
driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or
ndo_open routine could be called. Given that the two latter routines tend to
initizlize and free many data structures that the former two rely on, the result
can easily be data corruption or various other crashes. Furthermore, it seems
that this is potentially a problem with all net drivers that support netpoll,
and so this should ideally be fixed in a common path.
As Ben H Pointed out to me, we can't preform dev_open/dev_close in atomic
context, so I've come up with this solution. We can use a mutex to sleep in
open/close paths and just do a mutex_trylock in the napi poll path and abandon
the poll attempt if we're locked, as we'll just retry the poll on the next send
anyway.
I've tested this here by flooding netconsole with messages on a system whos nic
driver I modfied to periodically return NETDEV_TX_BUSY, so that the netpoll tx
workqueue would be forced to send frames and poll the device. While this was
going on I rapidly ifdown/up'ed the interface and watched for any problems.
I've not found any.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ivan Vecera <ivecera@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Francois Romieu <romieu@fr.zoreil.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling icmpv6_send() on a local message size error leads to an
incorrect update of the path mtu in the case when IPsec is used.
So use ipv6_local_error() instead to notify the socket about the
error.
Reported-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commits 9d11bd159
("wimax: Remove unnecessary alloc/OOM messages, alloc cleanups")
and b2adaca92
("ethernet: Remove unnecessary alloc/OOM messages, alloc cleanups")
added a couple of unused variable warnings.
Remove the now unused variables.
Noticed-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alloc failures already get standardized OOM
messages and a dump_stack.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
This series contains updates to e1000e and ixgbe. Majority of the patches
are against e1000e, where Bruce makes several cosmetic #define moves into
header files. In addition, Bruce does a cleanup of braces to resolve
checkpatch warnings (when using the strict option).
Ixgbe patches contain several fixes as well as updating the copyright. The
fixes from Josh Hay, resolved a possible NULL pointer dereference and
resolved Smatch warnings by fixing return values and memcpy parameters.
Alex provides 2 fixes, the first is to replace rmb() with
read_barrier_depends() in the Tx cleanup. The second fixes an MTU
warning when using SR-IOV which corrects the fact that we were using 1522
to test for the max frame size in ixgbe_change_mtu and 1518 in
ixgbe_set_vf_lpe. The difference was the addition of VLAN_HLEN, which we
only need to add in the case of computing a buffer size, but not a filter
size. Lastly, a patch from Emil which is based on a community patch from
Aurélien Guillaume which adds functions needed for reading SFF-8472
diagnostic data from SFP modules.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP Appropriate Byte Count was added by me, but later disabled.
There is no point in maintaining it since it is a potential source
of bugs and Linux already implements other better window protection
heuristics.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
All in-tree ipv4 protocol implementations are now namespace
aware. Therefore all the run-time checks are superfluous.
Reject registry of any non-namespace aware ipv4 protocol.
Eventually we'll remove prot->netns_ok and this registry
time check as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
The infrastructure is already pretty much entirely there
to allow this conversion.
The tunnel and session lookups have per-namespace tables,
and the ipv4 bind lookup includes the namespace in the
lookup key.
Set netns_ok in l2tp_ip_protocol.
Signed-off-by: David S. Miller <davem@davemloft.net>
When creating unmanaged tunnel sockets we should honour the network namespace
passed to l2tp_tunnel_create. Furthermore, unmanaged tunnel sockets should
not hold a reference to the network namespace lest they accidentally keep
alive a namespace which should otherwise have been released.
Unmanaged tunnel sockets now drop their namespace reference via sk_change_net,
and are released in a new pernet exit callback, l2tp_exit_net.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
l2tp_tunnel_create is passed a pointer to the network namespace for the
tunnel, along with an optional file descriptor for the tunnel which may
be passed in from userspace via. netlink.
In the case where the file descriptor is defined, ensure that the namespace
associated with that socket matches the namespace explicitly passed to
l2tp_tunnel_create.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The L2TP netlink code can run in namespaces. Set the netnsok flag in
genl_family to true to reflect that fact.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To allow l2tp_tunnel_delete to be called from an atomic context, place the
tunnel socket release calls on a workqueue for asynchronous execution.
Tunnel memory is eventually freed in the tunnel socket destructor.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/intel/e1000e/ethtool.c
drivers/net/vmxnet3/vmxnet3_drv.c
drivers/net/wireless/iwlwifi/dvm/tx.c
net/ipv6/route.c
The ipv6 route.c conflict is simple, just ignore the 'net' side change
as we fixed the same problem in 'net-next' by eliminating cached
neighbours from ipv6 routes.
The e1000e conflict is an addition of a new statistic in the ethtool
code, trivial.
The vmxnet3 conflict is about one change in 'net' removing a guarding
conditional, whilst in 'net-next' we had a netdev_info() conversion.
The iwlwifi conflict is dealing with a WARN_ON() conversion in
'net-next' vs. a revert happening in 'net'.
Signed-off-by: David S. Miller <davem@davemloft.net>
This change corrects the fact that we were using 1522 to test for the
max frame size in ixgbe_change_mtu and 1518 in ixgbe_set_vf_lpe. The
difference was the addition of VLAN_HLEN which we only need to add in the case
of computing a buffer size, but not a filter size.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Sibai Li <Sibai.li@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The rmb in the Tx cleanup path is a much stronger barrier than we really need.
All that is really needed is a read_barrier_depends since the location of the
EOP descriptor is dependent on the eop_desc value.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch removes the rval variable returns from function and replaces
them with direct returns in ixgbe_dcbnl_getnumtcs. It also changes how
ixgbe_gstrings_test is copied into data with memcpy in ixgbe_get_strings
because "*ixgbe_gstrings_test too small (32 vs 160)".
Signed-off-by: Josh Hay <joshua.a.hay@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a default case which goes to the next loop iteration
in the case where p is not set, preventing p from being dereferenced.
Signed-off-by: Josh Hay <joshua.a.hay@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds functions needed for reading SFF-8472 diagnostic data
from SFP modules.
Based on original patch from Aurélien Guillaume <footplus@gmail.com>
CC: Aurélien Guillaume <footplus@gmail.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Resolve the following strict checkpatch checks:
CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
CHECK:BRACES: braces {} should be used on all arms of this statement
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There are enough register offsets to warrant being in their own header
file, and doing so logically separates them from other header file content.
They have been converted from an enumerated data type to #defines as is
done in all the other Intel wired ethernet drivers.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move #defines, function prototypes and data types which are applicable to
all/most devices supported by the driver but are specific to the
manageability component of each device to the new manage.h header file.
These #defines, function prototypes and data types can be used by other
files in the driver and moving them to the manageability-specific file
makes it clearer to which component they are applicable.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>