This patch is inspired by patch recently posted by Johannes Berg. Basically what
my patch does is to group list and a count of addresses into newly introduced
structure netdev_hw_addr_list. This brings us two benefits:
1) struct net_device becames a bit nicer.
2) in the future there will be a possibility to operate with lists independently
on netdevices (with exporting right functions).
I wanted to introduce this patch before I'll post a multicast lists conversion.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
drivers/net/bnx2.c | 4 +-
drivers/net/e1000/e1000_main.c | 4 +-
drivers/net/ixgbe/ixgbe_main.c | 6 +-
drivers/net/mv643xx_eth.c | 2 +-
drivers/net/niu.c | 4 +-
drivers/net/virtio_net.c | 10 ++--
drivers/s390/net/qeth_l2_main.c | 2 +-
include/linux/netdevice.h | 17 +++--
net/core/dev.c | 130 ++++++++++++++++++--------------------
9 files changed, 89 insertions(+), 90 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
Add interface and functions to support a new CNIC driver to drive
the Broadcom bnx2 hardware for iSCSI offload.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There is no need to check if a pointer is NULL before calling
vfree(), since vfree() function already check for it.
Signed-off-by: Breno Leitão <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_dma_unmap() is quite expensive for small packets,
because we use two different cache lines from skb_shared_info.
One to access nr_frags, one to access dma_maps[0]
Instead of dma_maps being an array of MAX_SKB_FRAGS + 1 elements,
let dma_head alone in a new dma_head field, close to nr_frags,
to reduce cache lines misses.
Tested on my dev machine (bnx2 & tg3 adapters), nice speedup !
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch converts unicast address list to standard list_head using
previously introduced struct netdev_hw_addr. It also relaxes the
locking. Original spinlock (still used for multicast addresses) is not
needed and is no longer used for a protection of this list. All
reading and writing takes place under rtnl (with no changes).
I also removed a possibility to specify the length of the address
while adding or deleting unicast address. It's always dev->addr_len.
The convertion touched especially e1000 and ixgbe codes when the
change is not so trivial.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
drivers/net/bnx2.c | 13 +--
drivers/net/e1000/e1000_main.c | 24 +++--
drivers/net/ixgbe/ixgbe_common.c | 14 ++--
drivers/net/ixgbe/ixgbe_common.h | 4 +-
drivers/net/ixgbe/ixgbe_main.c | 6 +-
drivers/net/ixgbe/ixgbe_type.h | 4 +-
drivers/net/macvlan.c | 11 +-
drivers/net/mv643xx_eth.c | 11 +-
drivers/net/niu.c | 7 +-
drivers/net/virtio_net.c | 7 +-
drivers/s390/net/qeth_l2_main.c | 6 +-
drivers/scsi/fcoe/fcoe.c | 16 ++--
include/linux/netdevice.h | 18 ++--
net/8021q/vlan.c | 4 +-
net/8021q/vlan_dev.c | 10 +-
net/core/dev.c | 195 +++++++++++++++++++++++++++-----------
net/dsa/slave.c | 10 +-
net/packet/af_packet.c | 4 +-
18 files changed, 227 insertions(+), 137 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
Second round of drivers for Gb cards (and NIU one I forgot in the 10GB round)
Now that core network takes care of trans_start updates, dont do it
in drivers themselves, if possible. Drivers can avoid one cache miss
(on dev->trans_start) in their start_xmit() handler.
Exceptions are NETIF_F_LLTX drivers
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using bnx2 in a high transmit load, bnx2_tx_int() cost is pretty high.
There are two reasons.
One is an expensive call to bnx2_get_hw_tx_cons(bnapi) for each freed skb
One is cpu stalls when accessing skb_is_gso(skb) / skb_shinfo(skb)->nr_frags
because of two cache line misses.
(One to get skb->end/head to compute skb_shinfo(skb),
one to get is_gso/nr_frags)
This patch :
1) avoids calling bnx2_get_hw_tx_cons(bnapi) too many times.
2) makes bnx2_start_xmit() cache is_gso & nr_frags into sw_tx_bd descriptor.
This uses a litle bit more ram (256 longs per device on x86), but helps a lot.
3) uses a prefetch(&skb->end) to speedup dev_kfree_skb(), bringing
cache line that will be needed in skb_release_data()
result is 5 % bandwidth increase in benchmarks, involving UDP or TCP receive
& transmits, when a cpu is dedicated to ksoftirqd for bnx2.
bnx2_tx_int going from 3.33 % cpu to 0.5 % cpu in oprofile
Note : skb_dma_unmap() still very expensive but this is for another patch,
not related to bnx2 (2.9 % of cpu, while it does nothing on x86_32)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add barrier() to bnx2_get_hw_{tx|rx}_cons() to fix this issue:
http://bugzilla.kernel.org/show_bug.cgi?id=12698
This issue was reported by multiple i386 users. Without barrier(),
the compiled code looks like the following where %eax contains the
address of the tx_cons or rx_cons in the DMA status block. The
status block contents can change between the cmpb and the movzwl
instruction. The driver would crash if the value was not 0xff during
the cmpb instruction, but changed to 0xff during the movzwl
instruction.
6828: 80 38 ff cmpb $0xff,(%eax)
682b: 0f b7 10 movzwl (%eax),%edx
With the added barrier(), the compiled code now looks correct:
683d: 0f b7 10 movzwl (%eax),%edx
6840: 0f b6 c2 movzbl %dl,%eax
6843: 3d ff 00 00 00 cmp $0xff,%eax
Thanks to Pascal de Bruijn <pmjdebruijn@pcode.nl> for reporting the
problem and Holger Noefer <hnoefer@pironet-ndh.com> for patiently
testing test patches for us.
Also updated version to 2.0.1.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mips identifier is reserved by gcc on mips plattforms. Don't use it
in the code.
Signed-off-by: Bastian Blank <waldi@debian.org>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace all DMA_40BIT_MASK macro with DMA_BIT_MASK(40)
Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)
Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Based on original patch by Ben Hutchings <ben@decadent.org.uk> and
Bastian Blank <waldi@debian.org>, with the following main changes:
Separated the mips firmware and rv2p firmware into different files
to make it easier to update them separately.
Added some code to fixup the rv2p code with run-time information
such as PAGE_SIZE.
Update version to 2.0.0.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MSI-X handler was chosen before the call to pci_enable_msix().
If MSI-X was not available, the wrong MSI-X handler would be used in
INTA mode. This would cause a screaming interrupt problem because
INTA would not be cleared by the MSI-X handler.
Fixed by assigning MSI-X handler after pci_enable_msix() returns
successfully. Also update version to 1.9.3.
Thomas Chenault <thomas_chenault@dell.com> helped us find this problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If errors are reported on a frame descriptor, we need to
account for the buffer pages that may have been used for this
error packet and recycle them. Otherwise, we may get the wrong
pages for the next packet.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like the locking is OK as the locks were being taken before the
various phy setup functions, add the annotations as they release and
reacquire the phy_lock.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following the removal of the unused struct net_device * parameter from
the NAPI functions named *netif_rx_* in commit 908a7a1, they are
exactly equivalent to the corresponding *napi_* functions and are
therefore redundant.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the napi api was changed to separate its 1:1 binding to the net_device
struct, the netif_rx_[prep|schedule|complete] api failed to remove the now
vestigual net_device structure parameter. This patch cleans up that api by
properly removing it..
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DMA memory for the jumbo rx page rings was freed incorrectly using the
wrong local variable as the array index.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And fix the 5716S pci_device_id entry to point to the proper string.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change MSI-X vector names to "ethx-%d".
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bnx2 chips do not support per MSI vector masking. On 5706/5708, new MSI
address/data are stored only when the MSI enable bit is toggled. As a result,
SMP affinity no longer works in the latest kernel. A more serious problem is
that the driver will no longer receive interrupts when the MSI receiving CPU
goes offline.
The workaround in this patch only addresses the problem of CPU going offline.
When that happens, the driver's timer function will detect that it is making
no forward progress on pending interrupt events and will recover from it.
Eric Dumazet reported the problem.
We also found that if an interrupt is internally asserted while MSI and INTA
are disabled, the chip will end up in the same state after MSI is re-enabled.
The same workaround is needed for this problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Tested-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert driver to new net_device_ops. Compile tested only.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix bnx2 so that netpoll works properly. Specifically:
1) Fix parameters to bnx2_interrupt to be a struct bnx2_napi rather than a
struct net_device
2) Fix poll_controller method to check every queue in the rx case so frames
aren't missed
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move all related timeout constants to the same location. BNX2
prefix is also added to make them more consistent.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The default rx buffer water marks for XOFF/XON are for 1500 MTU. At
larger MTUs, these water marks need to be adjusted for effective
flow control.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some quad-port cards that cannot support WoL on all ports due
to excessive power consumption, the driver needs to restrict WoL
on some ports by checking VAUX_PRESET bit.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The generic packet receive code takes care of setting
netdev->last_rx when necessary, for the sake of the
bonding ARP monitor.
Drivers need not do it any more.
Some cases had to be skipped over because the drivers
were making use of the ->last_rx value themselves.
Signed-off-by: David S. Miller <davem@davemloft.net>
This converts pretty much everything to print_mac. There were
a few things that had conflicts which I have just dropped for
now, no harm done.
I've built an allyesconfig with this and looked at the files
that weren't built very carefully, but it's a huge patch.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before, the driver would not care about the return codes from pci_map_*
functions. This could be potentially dangerous if a mapping failed.
Now, we will check all pci_map_* calls. On the transmit side, we switch
to use the new function skb_dma_map(). On the receive side, we add
pci_dma_mapping_error().
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to check netif_running() state in most ethtool operations
and properly handle the !netif_running() state where the chip is
in an uninitailzed state or low power state that may not accept
any MMIO.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This logic is used in bnx2_close() and bnx2_suspend() and
so should be separated out into a separate function.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The timer_interval field is only assigned once, and never reassigned.
We can safely replace all instances of the timer_interval with a
constant value.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The name of the board is only used during the initialization of
the adapter. We can save the space of a pointer by not storing
this information.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2_set_mac_link() doesn't need to return any error codes. And
all the callers don't check the return code. It is safe to
change the return type to a void.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>