last_synq_overflow eats 4 or 8 bytes in struct tcp_sock, even
though it is only used when a listening sockets syn queue
is full.
We can (ab)use rx_opt.ts_recent_stamp to store the same information;
it is not used otherwise as long as a socket is in listen state.
Move linger2 around to avoid splitting struct mtu_probe
across cacheline boundary on 32 bit arches.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can in some situations drop packets in netif_rx()
loopback driver does not report these (unlikely) drops to its stats,
and incorrectly change packets/bytes counts.
After this patch applied, "ifconfig lo" can reports these drops as in :
# ifconfig lo
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:692562900 errors:3228 dropped:3228 overruns:0 frame:0
TX packets:692562900 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
I initialy chose to reflect those errors only in tx_dropped/tx_errors, but David
convinced me that it was really RX errors, as loopback_xmit() really starts
a RX process. (calling eth_type_trans() for example, that itself pulls the ethernet header)
These errors are accounted in rx_dropped/rx_errors.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This loop over fragments in napi_fraginfo_skb() was "interesting".
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just noticed while doing some new work that the recent
mid-wq adjustment logic will misbehave when FACK is not
in use (happens either due sysctl'ed off or auto-detected
reordering) because I forgot the relevant TCPCB tagbit.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Sidorenko reported:
"while experimenting with 'netem' we have found some strange behaviour. It
seemed that ingress delay as measured by 'ping' command shows up on some
hosts but not on others.
After some investigation I have found that the problem is that skbuff->tstamp
field value depends on whether there are any packet sniffers enabled. That
is:
- if any ptype_all handler is registered, the tstamp field is as expected
- if there are no ptype_all handlers, the tstamp field does not show the delay"
This patch prevents unnecessary update of tstamp in dev_queue_xmit_nit()
on ingress path (with act_mirred) adding a check, so minimal overhead on
the fast path, but only when sniffers etc. are active.
Since netem at ingress seems to logically emulate a network before a host,
tstamp is zeroed to trigger the update and pretend delays are from the
outside.
Reported-by: Alex Sidorenko <alexandre.sidorenko@hp.com>
Tested-by: Alex Sidorenko <alexandre.sidorenko@hp.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This has been broken for a while. I happened to catch it testing because one
app "knew" that the top line of the calls data was the policy line and got
confused.
Put the header back.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
EEH attempts to recover up 6 times.
The last attempt leaves all the ports and adapter down.hen
The driver is then unloaded, bringing the adapter down again
unconditionally. The unload will hang.
Check if the adapter is already down before trying to bring it down again.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fatal error task can be scheduled while processing an offload packet
in NAPI context when the connection handle is bogus. this can race
with the ports being brought down and the cxgb3 workqueue being flushed.
Stop napi processing before flushing the work queue.
The ULP drivers (iSCSI, iWARP) might also schedule a task on keventd_wk
while releasing a connection handle (cxgb3_offload.c::cxgb3_queue_tid_release()).
The driver however does not flush any work on keventd_wq while being unloaded.
This patch also fixes this.
Also call cancel_delayed_work_sync in place of the the deprecated
cancel_rearming_delayed_workqueue.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the existing periodic task to handle link faults.
The link fault interrupt handler is also called in work queue context,
which is wrong and might cause potential deadlocks.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Andrew Lutomirski <amluto@gmail.com>
All the intel wired ethernet drivers were calling netif_carrier_off
and netif_stop_queue (or variants) before calling register_netdevice
This is incorrect behavior as was pointed out by davem, and causes
ifconfig and friends to report a strange state before first link
after the driver was loaded.
This apparently confused *some* versions of networkmanager.
Andy tested this for e1000e and confirmed it was working for him.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reported-by: Andrew Lutomirski <amluto@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Andrew Lutomirski <amluto@gmail.com>
All the intel wired ethernet drivers were calling netif_carrier_off
and netif_stop_queue (or variants) before calling register_netdevice
This is incorrect behavior as was pointed out by davem, and causes
ifconfig and friends to report a strange state before first link
after the driver was loaded, since without a netif_carrier_off, the stack
assumes carrier_on, but before register_netdev, netlink messages are not
sent out telling link state.
This apparently confused *some* versions of networkmanager.
Andy tested this for e1000e and confirmed it was working for him.
see thread: http://marc.info/?l=linux-netdev&m=123946479705636&w=2
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andy Lutomirski <amluto@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Andrew Lutomirski <amluto@gmail.com>
All the intel wired ethernet drivers were calling netif_carrier_off
and netif_stop_queue (or variants) before calling register_netdevice
This is incorrect behavior as was pointed out by davem, and causes
ifconfig and friends to report a strange state before first link
after the driver was loaded, since without a netif_carrier_off, the stack
assumes carrier_on, but before register_netdev, netlink messages are not
sent out telling link state.
This apparently confused *some* versions of networkmanager.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reported-by: Andrew Lutomirski <amluto@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Andrew Lutomirski <amluto@gmail.com>
All the intel wired ethernet drivers were calling netif_carrier_off
and netif_stop_queue (or variants) before calling register_netdevice
This is incorrect behavior as was pointed out by davem, and causes
ifconfig and friends to report a strange state before first link
after the driver was loaded, since without a netif_carrier_off, the stack
assumes carrier_on, but before register_netdev, netlink messages are not
sent out telling link state.
This apparently confused *some* versions of networkmanager.
in addition this driver appeared to need a netif_start_queue at
the end of open.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reported-by: Andrew Lutomirski <amluto@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Andrew Lutomirski <amluto@gmail.com>
All the intel wired ethernet drivers were calling netif_carrier_off
and netif_stop_queue (or variants) before calling register_netdevice
This is incorrect behavior as was pointed out by davem, and causes
ifconfig and friends to report a strange state before first link
after the driver was loaded, since without a netif_carrier_off, the stack
assumes carrier_on, but before register_netdev, netlink messages are not
sent out telling link state.
This apparently confused *some* versions of networkmanager.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reported-by: Andrew Lutomirski <amluto@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Broadcom chips with 2.1 firmware handle the fallback case to a SCO
link wrongly when setting up eSCO connections.
< HCI Command: Setup Synchronous Connection (0x01|0x0028) plen 17
handle 11 voice setting 0x0060
> HCI Event: Command Status (0x0f) plen 4
Setup Synchronous Connection (0x01|0x0028) status 0x00 ncmd 1
> HCI Event: Connect Complete (0x03) plen 11
status 0x00 handle 1 bdaddr 00:1E:3A:xx:xx:xx type SCO encrypt 0x01
The Link Manager negotiates the fallback to SCO, but then sends out
a Connect Complete event. This is wrong and the Link Manager should
actually send a Synchronous Connection Complete event if the Setup
Synchronous Connection has been used. Only the remote side is allowed
to use Connect Complete to indicate the missing support for eSCO in
the host stack.
This patch adds a workaround for this which clearly should not be
needed, but reality is that broken Broadcom devices are deployed.
Based on a report by Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Marcel Holtman <marcel@holtmann.org>
Some Bluetooth chips (like the ones from Texas Instruments) don't do
proper eSCO negotiations inside the Link Manager. They just return an
error code and in case of the Kyocera ED-8800 headset it is just a
random error.
< HCI Command: Setup Synchronous Connection 0x01|0x0028) plen 17
handle 1 voice setting 0x0060
> HCI Event: Command Status (0x0f) plen 4
Setup Synchronous Connection (0x01|0x0028) status 0x00 ncmd 1
> HCI Event: Synchronous Connect Complete (0x2c) plen 17
status 0x1f handle 257 bdaddr 00:14:0A:xx:xx:xx type eSCO
Error: Unspecified Error
In these cases it is up to the host stack to fallback to a SCO setup
and so retry with SCO parameters.
Based on a report by Nick Pelly <npelly@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
There is a missing call to rfcomm_dlc_clear_timer in the case that
DEFER_SETUP is used and so the connection gets disconnected after the
timeout even if it was successfully accepted previously.
This patch adds a call to rfcomm_dlc_clear_timer to rfcomm_dlc_accept
which will get called when the user accepts the connection by calling
read() on the socket.
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Check whether the underlying device provides a set of ethtool ops before
checking for individual handlers to avoid NULL pointer dereferences.
Reported-by: Art van Breemen <ard@telegraafnet.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
A compilation error snuck into
2d90b0aa3b
due to an over-zealous indent script removing spaces around array
initialization ellipsis. The attached patch fixes the myri10ge
compilation in net-next.
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TIM IE must not be shorter than 4 bytes, so verify that
when parsing it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead, allocate extra IE memory if necessary. Normally,
this isn't even necessary since there's enough space.
This is a better way of correcting the "held BSS can
disappear" issue, but also a lot more code. It is also
necessary for proper auth/assoc BSS handling in the
future.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When we receive a probe response frame we can replace the
BSS struct in our list -- but if that struct is held then
we need to hold the new one as well.
We really should fix this completely and not replace the
struct, but this is a bandaid for now.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Using the scan_sdata variable here is terribly wrong,
if there has never been a scan then we fail. However,
we need a bandaid...
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
With this patch, nfnetlink returns -ENOMEM instead of -EPERM if we
fail to create the nfnetlink netlink socket during the module
loading. This is exactly what rtnetlink does in this case.
Ideally, it would be better if we propagate the error that has
happened in netlink_kernel_create(), however, this function still
does not implement this yet.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch fixes an inconsistency that results in no error reports
to user-space listeners if we fail to allocate the event message.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
inet_register_protosw() function is responsible for adding a new
inet protocol into a global table (inetsw[]) that is used with RCU rules.
As soon as the store of the pointer is done, other cpus might see
this new protocol in inetsw[], so we have to make sure new protocol
is ready for use. All pending memory updates should thus be committed
to memory before setting the pointer.
This is correctly done using rcu_assign_pointer()
synchronize_net() is typically used at unregister time, after
unsetting the pointer, to make sure no other cpu is still using
the object we want to dismantle. Using it at register time
is only adding an artificial delay that could hide a real bug,
and this bug could popup if/when synchronize_rcu() can proceed
faster than now.
This saves about 13 ms on boot time on a HZ=1000 8 cpus machine ;)
(4 calls to inet_register_protosw(), and about 3200 us per call)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After calling skb_gro_receive skb->len can no longer be relied
on since if the skb was merged using frags, then its pages will
have been removed and the length reduced.
This caused tcp_gro_receive to prematurely end merging which
resulted in suboptimal performance with ixgbe.
The fix is to store skb->len on the stack.
Reported-by: Mark Wagner <mwagner@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
EPERM means that disconnect() is runnung. It should be treated like
ENODEV
Signed-off-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit ead2ceb0ec ("Network Drop
Monitor: Adding kfree_skb_clean for non-drops and modifying
end-of-line points for skbs") so called end-of-line points for skb's
should use consume_skb() to free the socket buffer.
In opposite to consume_skb() the function kfree_skb() is intended to
be used for unexpected skb drops e.g. in error conditions that now can
trigger the network drop monitor if enabled.
This patch moves the skb end-of-line point in af_can.c to use
consume_skb().
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suppose that we receive lots of frames, start processing them, but
exhaust our budget so that we return before we had a chance to look
at all of them.
Then, when the network layer calls us again, we will only continue
processing the buffers if the REC bit was set in the mean time, which it
might not be if there was a brief pause in the flow of packets. If this
happens, we'll simply display a warning and call netif_rx_complete()
with potentially lots of unprocessed packets in the RX ring...
Fix this by scanning the ring no matter what flags are set in the
interrupt status register.
Signed-off-by: Erik Waling <erik.waling@konftel.com>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When transfering large amounts of data we sometimes experienced that the
Retry Limit Exceeded (RLE) bit got set in TSR during transmission
attempts. When this happened the driver would stall in a state that
prevented any more data from being sent.
Signed-off-by: Erik Waling <erik.waling@konftel.com>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The thresholds for the DCB priority flow control are incorrect for 82599.
This fixes the thresholds to be correct.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The traffic classes in hardware are not symmetrical for Rx and Tx. Rx
is every 16 descriptor queues, Tx is not. It runs 32-32-16-16-8-8-8 when
running with 8 traffic classes, and runs 64-32-16 when running with 4
traffic classes. This patch fixes the mapping.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the e1000 transmit cleanup inner loop exited early, then
cleaned might not be true. This could cause tx hangs or other
badness. Use count to track the total number of descriptors
cleaned instead of basing a tx queue restart off of a temporary
working state variable.
This code now makes the flow the same for e1000/e1000e/igb/ixgbe
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the e1000e transmit cleanup inner loop exited early, then
cleaned might not be true. This could cause tx hangs or other
badness. Use count to track the total number of descriptors
cleaned instead of basing a tx queue restart off of a temporary
working state variable.
This code now makes the flow the same for e1000/e1000e/igb/ixgbe
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add MODULE_DEVICE_TABLE so that modinfo reports pci device id aliases.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add myri10ge_fw_names to override the default firmware
on a per-board basis.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Force a statistics update when our ethtool gstats routine
is called. Otherwise, ethtool will continue to read stale
stats until something forces an update by reading /proc/net/dev
This requires putting a lock around the stats update to guard
against 2 threads (one via ethtool, and one via procfs)
updating the stats at once.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>