Commit graph

427401 commits

Author SHA1 Message Date
Steffen Klassert
895de9a348 vti4: Enable namespace changing
vti4 is now fully namespace aware, so allow namespace changing
for vti devices

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:19 +01:00
Steffen Klassert
6e2de802af vti4: Check the tunnel endpoints of the xfrm state and the vti interface
The tunnel endpoints of the xfrm_state we got from the xfrm_lookup
must match the tunnel endpoints of the vti interface. This patch
ensures this matching.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:19 +01:00
Steffen Klassert
78a010cca0 vti4: Support inter address family tunneling.
With this patch we can tunnel ipv6 traffic via a vti4
interface. A vti4 interface can now have an ipv6 address
and ipv6 traffic can be routed via a vti4 interface.
The resulting traffic is xfrm transformed and tunneled
throuhg ipv4 if matching IPsec policies and states are
present.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:19 +01:00
Steffen Klassert
a34cd4f319 vti4: Use the on xfrm_lookup returned dst_entry directly
We need to be protocol family indepenent to support
inter addresss family tunneling with vti. So use a
dst_entry instead of the ipv4 rtable in vti_tunnel_xmit.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:18 +01:00
Steffen Klassert
9994bb8e1e xfrm4: Remove xfrm_tunnel_notifier
This was used from vti and is replaced by the IPsec protocol
multiplexer hooks. It is now unused, so remove it.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:18 +01:00
Steffen Klassert
df3893c176 vti: Update the ipv4 side to use it's own receive hook.
With this patch, vti uses the IPsec protocol multiplexer to
register it's own receive side hooks for ESP, AH and IPCOMP.

Vti now does the following on receive side:

1. Do an input policy check for the IPsec packet we received.
   This is required because this packet could be already
   prosecces by IPsec, so an inbuond policy check is needed.

2. Mark the packet with the i_key. The policy and the state
   must match this key now. Policy and state belong to the outer
   namespace and policy enforcement is done at the further layers.

3. Call the generic xfrm layer to do decryption and decapsulation.

4. Wait for a callback from the xfrm layer to properly clean the
   skb to not leak informations on namespace and to update the
   device statistics.

On transmit side:

1. Mark the packet with the o_key. The policy and the state
   must match this key now.

2. Do a xfrm_lookup on the original packet with the mark applied.

3. Check if we got an IPsec route.

4. Clean the skb to not leak informations on namespace
   transitions.

5. Attach the dst_enty we got from the xfrm_lookup to the skb.

6. Call dst_output to do the IPsec processing.

7. Do the device statistics.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:18 +01:00
Steffen Klassert
6d608f06e3 ip_tunnel: Make vti work with i_key set
Vti uses the o_key to mark packets that were transmitted or received
by a vti interface. Unfortunately we can't apply different marks
to in and outbound packets with only one key availabe. Vti interfaces
typically use wildcard selectors for vti IPsec policies. On forwarding,
the same output policy will match for both directions. This generates
a loop between the IPsec gateways until the ttl of the packet is
exceeded.

The gre i_key/o_key are usually there to find the right gre tunnel
during a lookup. When vti uses the i_key to mark packets, the tunnel
lookup does not work any more because vti does not use the gre keys
as a hash key for the lookup.

This patch workarounds this my not including the i_key when comupting
the hash for the tunnel lookup in case of vti tunnels.

With this we have separate keys available for the transmitting and
receiving side of the vti interface.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:18 +01:00
Steffen Klassert
70be6c91c8 xfrm: Add xfrm_tunnel_skb_cb to the skb common buffer
IPsec vti_rcv needs to remind the tunnel pointer to
check it later at the vti_rcv_cb callback. So add
this pointer to the IPsec common buffer, initialize
it and check it to avoid transport state matching of
a tunneled packet.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:17 +01:00
Steffen Klassert
d099160e02 ipcomp4: Use the IPsec protocol multiplexer API
Switch ipcomp4 to use the new IPsec protocol multiplexer.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:17 +01:00
Steffen Klassert
e5b56454e0 ah4: Use the IPsec protocol multiplexer API
Switch ah4 to use the new IPsec protocol multiplexer.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:17 +01:00
Steffen Klassert
827789cbd7 esp4: Use the IPsec protocol multiplexer API
Switch esp4 to use the new IPsec protocol multiplexer.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:17 +01:00
Steffen Klassert
3328715e6c xfrm4: Add IPsec protocol multiplexer
This patch add an IPsec protocol multiplexer. With this
it is possible to add alternative protocol handlers as
needed for IPsec virtual tunnel interfaces.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:16 +01:00
Florian Fainelli
51adfcc333 net: bcmgenet: remove unused bh_lock member
bh_lock spinlock is unused, remove it from the private driver structure.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 20:26:37 -05:00
Florian Fainelli
da56bbf71d net: bcmgenet: remove commented code in bcmgenet_xmit()
This code is commented since it is unused, left-over from the very first
time this driver was merged.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 20:26:36 -05:00
Florian Fainelli
80d8e96d12 net: bcmgenet: drop checks on priv->phydev
Drop all the checks on priv->phydev since we will refuse probing the
driver if we cannot attach to a PHY device. Drop all checks on
priv->phydev. This also fixes some smatch issues reported by Dan
Carpenter.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 20:26:36 -05:00
David S. Miller
432c5b3a10 Merge branch 'gianfar'
Claudiu Manoil says:

====================
gianfar: Device reset and reconfig fixes

These patches end up fixing some notable device reset & reconfig
related problems.  One issue is on-the-fly (Rx/Tx on) programming
of interrupt coalescing (IC) registers on the processing path,
against HW recommendation.  This is an old issue that became visible
after BQL introduction, as under certain conditions (low traffic)
one TX interrupt gets lost and BQL fires Tx timeout as a result.
Another notable issue is a race on the Tx path (xmit, clean_tx)
during device reset (i.e. during Tx timeout watchdog firing)
that leads to NULL access.
Fixing the problematic on-thy-fly register writes (i.e. the IC regs)
required the implementation of a MAC soft reset procedure.
The race leading to NULL access was addressed by fixing the
stop_gfar()/startup_gfar() pair (disable/enable napi a.s.o.)
and adding the device state DOWN to sync with the TX path.

v2: Refactored if() clauses from gfar_set_features(), PATCH 2.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:38:53 -05:00
Claudiu Manoil
f19015baa2 gianfar: Fix Tx int miss, dont write IC on-the-fly
Programming the interrupt coalescing (IC) registers while
the controller/DMA is on may incur the loss of one Tx
confirmation interrupt, under certain conditions.  This is
a subtle hw race because it does not occur during a burst
of Tx packets.  It has been observed on p2020 devices that,
if just one packet is being xmit'ed, the Tx confirmation
doesn't trigger and BQL evetually blocks the Tx queues,
followed by Tx timeout and an un-responsive device.
This issue was not apparent prior to introducing BQL
support, as a late Tx confirmation was not an issue back then
and the next burst of Tx frames would have triggered the
Tx confirmation/ Tx ring cleanup anyway.

Bottom line, the hw specifications state that the IC registers
should not be programmed while the Rx/Tx blocks (the DMA) are
enabled. Further more, these registers are currently re-written
with the same values on the processing path, over and over again.
To fix this, rewriting the IC registers has been removed from
the processing path (napi poll).  A complete MAC reset procedure
has been implemented for the ethtool -c option instead, to
reliably update these registers while the controller is stopped.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:38:20 -05:00
Claudiu Manoil
0851133bb5 gianfar: Fix device reset races (oops) for Tx
The device reset procedure, stop_gfar()/startup_gfar(), has
concurrency issues.
"Kernel access of bad area" oopses show up during Tx timeout
device reset or other reset cases (like changing MTU) that
happen while the interface still has traffic. The oopses
happen in start_xmit and clean_tx_ring when accessing tx_queue->
tx_skbuff which is NULL. The race comes from de-allocating the
tx_skbuff while transmission and napi processing are still
active. Though the Tx queues get temoprarily stopped when Tx
timeout occurs, they get re-enabled as a result of Tx congestion
handling inside the napi context (see clean_tx_ring()). Not
disabling the napi during reset is also a bug, because
clean_tx_ring() will try to access tx_skbuff while it is being
de-alloc'ed and re-alloc'ed.

To fix this, stop_gfar() needs to disable napi processing
after stopping the Tx queues. However, in order to prevent
clean_tx_ring() to re-enable the Tx queue before the napi
gets disabled, the device state DOWN has been introduced.
It prevents the Tx congestion management from re-enabling the
de-congested Tx queue while the device is brought down.
An additional locking state, RESETTING, has been introduced
to prevent simultaneous resets or to prevent configuring the
device while it is resetting.
The bogus 'rxlock's (for each Rx queue) have been removed since
their purpose is not justified, as they don't prevent nor are
suited to prevent device reset/reconfig races (such as this one).

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:38:20 -05:00
Claudiu Manoil
80ec396cb6 gianfar: Don't free/request irqs on device reset
Resetting the device (stop_gfar()/startup_gfar()) should
be fast and to the point, in order to timely recover
from an error condition (like Tx timeout) or during
device reconfig.  The irq free/ request routines are just
redundant here, and they should be part of the device
close/ open routines instead.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:38:20 -05:00
Claudiu Manoil
88302648be gianfar: Fix on-the-fly vlan and mtu updates
The RCTRL and TCTRL registers should not be changed
on-the-fly, while the controller is running, otherwise
unexpected behaviour occurs.  But that's exactly what
gfar_vlan_mode() does, updating the VLAN acceleration
bits inside RCTRL/TCTRL.  The attempt to lock these
operations doesn't help, but only adds to the confusion.
There's also a dependency for Rx FCB insertion (activating
/de-activating the TOE offload block on Rx) which might
change the required rx buffer size.  This makes matters
worse as gfar_vlan_mode() ends up calling gfar_change_mtu(),
though the MTU size remains the same.  Note that there are
other situations that may affect the required rx buffer size,
like changing RXCSUM or rx hw timestamping, but errorneously
the rx buffer size is not recomputed/ updated in the process.

To fix this, do the vlan updates properly inside the MAC
reset and reconfiguration procedure, which takes care of
the rx buffer size dependecy and the rx TOE block (PRSDEP)
activation/deactivation as well (in the correct order).
As a consequence, MTU/ rx buff size updates are done now
by the same MAC reset and reconfig procedure, so that out
of context updates to MAXFRM, MRBLR, and MACCFG inside
change_mtu() are no longer needed.  The rx buffer size
dependecy to Rx FCB is now handled for the other cases too
(RXCSUM and rx hw timestamping).

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:38:20 -05:00
Claudiu Manoil
a328ac92d3 gianfar: Implement MAC reset and reconfig procedure
The main MAC config registers like: RCTRL/TCTRL, MRBLR,
MAXFRM, RXIC/TXIC, most fields of MACCFG1/2, should not
be changed on-the-fly, but at least after stopping the
DMA and disabling the Rx/Tx blocks and, for increased
reliability, after a MAC soft reset.

Impelement a complete MAC soft reset and reconfig procedure
following the latest HW advisories - gfar_mac_reset() - to
replace gfar_mac_init() and (the confusing) init_registers()
functions.

Factor out separate config functions for RCTRL and TCTRL,
insure programming order of the relevant config regs after
MAC soft reset.

Split gfar_hw_init() into gfar_mac_reset() and the remaining
global regs that don't need to be reconfigured after MAC soft
reset (FIFOCFG, ATTRELI, HW counters a.s.o).

As gfar_hw_init() now makes all the register writes @probe()
time, based on all the device flags and config options, it
must be moved further down, just before register_netdev(),
as the last config step when the config values are comitted
to HW.  Also, move netif_carrier_off() after register_netdev(),
because it has no effect if called before.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:38:20 -05:00
David S. Miller
225a9a2566 bcmgenet: Deleted unnecessary select_queue method.
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:33:27 -05:00
Fabio Estevam
5343a10d15 net: bcmgenet: Use devm_ioremap_resource()
According to Documentation/driver-model/devres.txt, devm_request_and_ioremap()
is deprecated, so use devm_ioremap_resource() instead.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:28:56 -05:00
Joe Perches
0409114282 bridge: netfilter: Use ether_addr_copy
Convert the uses of memcpy to ether_addr_copy because
for some architectures it is smaller and faster.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:16:44 -05:00
Joe Perches
e5a727f663 bridge: Use ether_addr_copy and ETH_ALEN
Convert the more obvious uses of memcpy to ether_addr_copy.

There are still uses of memcpy that could be converted but
these addresses are __aligned(2).

Convert a couple uses of 6 in gr_private.h to ETH_ALEN.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:16:43 -05:00
Ben Hutchings
e8b39015b5 cgxb4: Stop using ethtool SPEED_* constants
ethtool speed values are just numbers of megabits and there is no need
to add SPEED_40000.  To be consistent, use integer constants directly
for all speeds.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:04:08 -05:00
Daniel Borkmann
7debf7806e tools: bpf_dbg: various misc code cleanups
Lets clean up bpf_dbg a bit and improve its code slightly
in various areas: i) Get rid of some macros as there's no
good reason for keeping them, ii) remove one unused variable
and reduce scope of various variables found by cppcheck,
iii) Close non-default file descriptors when exiting the shell.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:02:10 -05:00
Daniel Borkmann
b17c706987 loopback: sctp: add NETIF_F_SCTP_CSUM to device features
Drivers are allowed to set NETIF_F_SCTP_CSUM if they have
hardware crc32c checksumming support for the SCTP protocol.
Currently, NETIF_F_SCTP_CSUM flag is available in igb,
ixgbe, i40e/i40evf drivers and for vlan devices.

If we don't have NETIF_F_SCTP_CSUM then crc32c is done
through CPU instructions, invoked from crypto layer, or
if not available as slow-path fallback in software.

Currently, loopback device propagates checksum offloading
feature flags in dev->features, but is missing SCTP checksum
offloading. Therefore, account for NETIF_F_SCTP_CSUM as
well.

Before patch:

./netperf_sctp -H 192.168.0.100 -t SCTP_STREAM_MANY
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.100 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

4194304 4194304   4096    10.00    4683.50

After patch:

./netperf_sctp -H 192.168.0.100 -t SCTP_STREAM_MANY
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.100 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

4194304 4194304   4096    10.00    15348.26

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:00:08 -05:00
Mathias Krause
72f8e06f3e pktgen: document all supported flags
The documentation misses a few of the supported flags. Fix this. Also
respect the dependency to CONFIG_XFRM for the IPSEC flag.

Cc: Fan Du <fan.du@windriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:54:26 -05:00
Mathias Krause
0945574750 pktgen: simplify error handling in pgctrl_write()
The 'out' label is just a relict from previous times as pgctrl_write()
had multiple error paths. Get rid of it and simply return right away
on errors.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:54:26 -05:00
Mathias Krause
20b0c718c3 pktgen: fix out-of-bounds access in pgctrl_write()
If a privileged user writes an empty string to /proc/net/pktgen/pgctrl
the code for stripping the (then non-existent) '\n' actually writes the
zero byte at index -1 of data[]. The then still uninitialized array will
very likely fail the command matching tests and the pr_warning() at the
end will therefore leak stack bytes to the kernel log.

Fix those issues by simply ensuring we're passed a non-empty string as
the user API apparently expects a trailing '\n' for all commands.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:54:25 -05:00
David S. Miller
8bfdfbc188 Merge branch 'qlcnic-next'
Shahed Shaikh says:

====================
qlcnic: Re-factoring and enhancements

This patch series includes following changes -
* Re-factored firmware minidump template header handling
* Support to make 8 vNIC mode application to work with 16 vNIC mode
* Enhance error message logging when adapter is in failed state and
  when adapter lock access fails.
* Allow vlan0 traffic
* update MAINTAINERS

Please apply this series to net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:44:05 -05:00
Shahed Shaikh
e6b0b01979 Update MAINTAINERS for qlcnic driver
Keep myself as only maintainer for qlcnic driver and update
group email alias to Dept-HSGLinuxNICDev@qlogic.com

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:43:19 -05:00
Shahed Shaikh
3dd4705698 qlcnic: Update version to 5.3.56
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:43:19 -05:00
Harish Patil
1a51042bb8 qlcnic: Enhance semaphore lock access failure error message
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:43:19 -05:00
Rajesh Borundia
cecd59d84d qlcnic: Allow vlan0 traffic
o Adapter allows vlan0 traffic in case of SR-IOV after setting
  QLC_SRIOV_ALLOW_VLAN0 bit even though we do not add vlan0 filters.

Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:43:19 -05:00
Sucheta Chakraborty
2a355aecd2 qlcnic: Enhance driver message in failed state.
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:43:19 -05:00
Jitendra Kalsaria
d91abf903b qlcnic: Updates to QLogic application/driver interface for virtual NIC configuration
Qlogic application interface in the driver which has larger than 8 vNIC
configuration support has been updated to handle the following cases:

o Only 8 or lower total vNICs were enabled within the vNIC 0-7 range
o vNICs were enabled in the vNIC 0-15 range such that enabled vNICs were
  not contiguous and only 8 or lower number of total VNICs were enabled
o Disconnect in the vNIC mapping between application and driver when the
  enabled VNICs were dis contiguous

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:43:19 -05:00
Shahed Shaikh
225837a076 qlcnic: Re-factor firmware minidump template header handling
Treat firmware minidump template headers for 82xx and 83xx/84xx adapters separately,
as it may change for 82xx and 83xx/84xx adapter type independently.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:43:19 -05:00
David S. Miller
a1991c749a Merge branch 'mlx4'
Amir Vadai says:

====================
net/mlx4: Mellanox driver update 01-01-2014

This small patchset has a fix to a bogus usage of
netif_get_num_default_rss_queues() in mlx4_en driver.

Changes from V1:
- Removed affinity_hint patch, to make it a generic instead of mlx specific

Changes from V0:
- Instead of reverting the netif_get_num_default_rss_queues() in mlx4_en,
  fixing it to limit the actual number of receive queues instead of limiting
  the number of IRQ's.

Patchset was applied and tested against commit: cb6e926 "ipv6:fix checkpatch
errors with assignment in if condition"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:38:27 -05:00
Ido Shamay
bb2146bc88 net/mlx4: Fix limiting number of IRQ's instead of RSS queues
This fix a performance bug introduced by commit 90b1ebe "mlx4: set
maximal number of default RSS queues", which limits the numbers of IRQs
opened by core module.
The limit should be on the number of queues in the indirection table -
rx_rings, and not on the number of IRQ's. Also, limiting on mlx4_core
initialization instead of in mlx4_en, prevented using "ethtool -L" to
utilize all the CPU's, when performance mode is prefered, since limiting
this number to 8 reduces overall packet rate by 15%-50% in multiple TCP
streams applications.

For example, after running ethtool -L <ethx> rx 16

          Packet rate
Before the fix  897799
After the fix   1142070

Results were obtained using netperf:

S=200 ; ( for i in $(seq 1 $S) ; do ( \
  netperf -H 11.7.13.55 -t TCP_RR -l 30 &) ; \
  wait ; done | grep "1        1" | awk '{SUM+=$6} END {print SUM}' )

CC: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:38:14 -05:00
Ido Shamay
0251248232 net/mlx4: Set number of RX rings in a utility function
mlx4_en_add() is too long.
Moving set number of RX rings to a utiltity function to improve
readability and modulization of the code.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:38:01 -05:00
dingtianhong
7a4ddcd92e bonding: remove no longer needed lock for bond_xxx_info_query()
The bond_xxx_info_query() was already in RTNL, so no need to use
bond lock to protect the bond slave list, so remove it.

Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:28:23 -05:00
dingtianhong
4335d60e5e bonding: use rcu_dereference() to access curr_active_slave
The bond_info_show_master already in RCU read-side critical section,
and the we access curr_active_slave without the curr_slave_lock, we
could not sure whether the curr_active_slave will be changed during
the processing, so use RCU to protected the pointer.

Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:28:23 -05:00
dingtianhong
827418081a bonding: netpoll: remove unwanted slave_dev_support_netpoll()
The __netpoll_setup() will check the slave's flag and ndo_poll_controller just
like the slave_dev_support_netpoll() does, and slave_dev_support_netpoll() was
not used by any place, so remove it.

Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:28:23 -05:00
David S. Miller
1f5a7407e4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
1) Introduce skb_to_sgvec_nomark function to add further data to the sg list
   without calling sg_unmark_end first. Needed to add extended sequence
   number informations. From Fan Du.

2) Add IPsec extended sequence numbers support to the Authentication Header
   protocol for ipv4 and ipv6. From Fan Du.

3) Make the IPsec flowcache namespace aware, from Fan Du.

4) Avoid creating temporary SA for every packet when no key manager is
   registered. From Horia Geanta.

5) Support filtering of SA dumps to show only the SAs that match a
   given filter. From Nicolas Dichtel.

6) Remove caching of xfrm_policy_sk_bundles. The cached socket policy bundles
   are never used, instead we create a new cache entry whenever xfrm_lookup()
   is called on a socket policy. Most protocols cache the used routes to the
   socket, so this caching is not needed.

7)  Fix a forgotten SADB_X_EXT_FILTER length check in pfkey, from Nicolas
    Dichtel.

8) Cleanup error handling of xfrm_state_clone.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:13:33 -05:00
David S. Miller
3b5c8ab115 Merge branch 'i40evf'
Aaron Brown says:

====================
Intel Wired LAN Driver Updates

This series contains updates to i40e and (mostly to) i40evf.

Mitch provides most the work for this series.  For the vf driver he
requests a reset on a tx hang, removes vlan filtes on close since we
already remove the MAC filters, fixes some crashes, gets rid of PCI DAC
as it does not mean much on virtualized PCIe parts, skips assigning the
device name that just gets renamed anyway, stores the descriptor ring
size in a manner that allows the use of common tx and rx code with the
PF driver and makes a handful of cosmetic fixes.  For i40e he removes
a delay left over from debugging and changes a do/while loop to a for
loop to avoid hitting another delay each time.

Catherine fixes inconsistent MSI and MSI-X messages and bumps the
driver version.

v2: Removed unnecessary periods and redundant OOM message.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-21 12:39:23 -05:00
Catherine Sullivan
acbc3eb5f8 i40e and i40evf: Bump driver versions
Update the driver versions.

Change-ID: I3fe23024d17da0e614ce126edb365bb2c428d482
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-21 12:38:26 -05:00
Catherine Sullivan
77fa28befc i40e: Change MSIX to MSI-X
Fix inconsistent use of MSIX and MSI-X in messages.

Change-ID: Iae9ffb42819677c34544719044ed77632e06147d
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-21 12:38:26 -05:00
Mitch Williams
6c5ef6209d i40e: tighten up ring enable/disable flow
Change the do/while to a for loop, so we don't hit the delay each
time, even when the register is ready for action.
Don't bother to set or clear the QENA_STAT bit as it is
read-only.

Change-ID: Ie464718804dd79f6d726f291caa9b0c872b49978
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-21 12:38:26 -05:00