In "cfg80211: no cookies in cfg80211_send_XXX()"
Holger Schurig removed the cookies in the calls
from mac80211 to cfg80211, but the ones in the
other direction were left in. Remove them now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
That's a lot longer than open-coding it and
doesn't really add value, so just remove it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The comment for sta_info_flush() states
"Returns the number of removed STA entries"
but that isn't actually true. Consequently,
the warning when a station is still around
on interface removal can never trigger and
this delayed finding the timer issue the
previous patch fixed. Fix the return value
here to make that warning useful again.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When removing an interface while it is in the
process of authenticating or associating, we
leak the auth_data or assoc_data, and leave
the timer pending. The timer then crashes the
system when it fires as its data is gone.
Fix this by explicitly deleting all the data
when the interface is removed. This uncovered
another bug -- this problem should have been
detected by the sta_info_flush() warning but
that function doesn't ever return non-zero,
I'll fix that in a separate patch.
Reported-by: Hieu Nguyen <hieux.c.nguyen@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Use interface data from sta instead of invalid pointer
to list head in calls to drv_sta_state.
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Eliad reports that if a scan finishes in the
middle of processing associated (however it
happens), the interface can go idle. This is
because we set assoc_data to NULL before we
set associated. Change the order so any idle
check will find either one of them.
Doing this requires duplicating the TX sync
processing, but I already have a patch to
delete that completely and will submit that
as soon as my driver changes to no longer
require it are submitted.
Reported-by: Eliad Peller <eliad@wizery.com>
Tested-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some files implicitly get this via mesh.h
which itself doesn't need it, so move the
inclusion into the right files. Some other
files don't need it at all but include it,
so remove it from there.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ieee80211_restart_sta_timer() takes care for enqueueing
monitor_work if needed, so no need to do it again.
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Devices that monitor the connection in the hw don't need
the monitor work in the driver.
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The AP/GO mode API isn't very clearly defined, it
has "set beacon" and "new beacon" etc.
Modify the API to the following:
* start AP -- all settings
* change beacon -- new beacon data
* stop AP -- stop AP mode operation
This also reflects in the nl80211 API, rename
the commands there correspondingly (but keep
the old names for compatibility.)
Overall, this makes it much clearer what's going
on in the API.
Kalle developed the ath6kl changes, I created
the rest of the patch.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Most rate control implementations assume .get_rate and .tx_status are only
called once the per-station data has been fully initialized.
minstrel_ht crashes if this assumption is violated.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Tested-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There are situations where we don't have the
necessary rate control information yet for
station entries, e.g. when associating. This
currently doesn't really happen due to the
dummy station handling; explicitly disabling
rate control when it's not initialised will
allow us to remove dummy stations.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We need to use the _sync() version for cancelling the info and security
timer in the L2CAP connection delete path. Otherwise the delayed work
handler might run after the connection object is freed.
Signed-off-by: Ulisses Furquim <ulisses@profusion.mobi>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
__cancel_delayed_work() is being used in some paths where we cannot
sleep waiting for the delayed work to finish. However, that function
might return while the timer is running and the work will be queued
again. Replace the calls with safer cancel_delayed_work() version
which spins until the timer handler finishes on other CPUs and
cancels the delayed work.
Signed-off-by: Ulisses Furquim <ulisses@profusion.mobi>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
We should only perform a reset in hci_dev_do_close if the
HCI_QUIRK_NO_RESET flag is set (since in such a case a reset will not be
performed when initializing the device).
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
There is an imbalance in the rfcomm_session_hold / rfcomm_session_put
operations which causes the following crash:
[ 685.010159] BUG: unable to handle kernel paging request at 6b6b6b6b
[ 685.010169] IP: [<c149d76d>] rfcomm_process_dlcs+0x1b/0x15e
[ 685.010181] *pdpt = 000000002d665001 *pde = 0000000000000000
[ 685.010191] Oops: 0000 [#1] PREEMPT SMP
[ 685.010247]
[ 685.010255] Pid: 947, comm: krfcommd Tainted: G C 3.0.16-mid8-dirty #44
[ 685.010266] EIP: 0060:[<c149d76d>] EFLAGS: 00010246 CPU: 1
[ 685.010274] EIP is at rfcomm_process_dlcs+0x1b/0x15e
[ 685.010281] EAX: e79f551c EBX: 6b6b6b6b ECX: 00000007 EDX: e79f40b4
[ 685.010288] ESI: e79f4060 EDI: ed4e1f70 EBP: ed4e1f68 ESP: ed4e1f50
[ 685.010295] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 685.010303] Process krfcommd (pid: 947, ti=ed4e0000 task=ed43e5e0 task.ti=ed4e0000)
[ 685.010308] Stack:
[ 685.010312] ed4e1f68 c149eb53 e5925150 e79f4060 ed500000 ed4e1f70 ed4e1f80 c149ec10
[ 685.010331] 00000000 ed43e5e0 00000000 ed4e1f90 ed4e1f9c c149ec87 0000bf54 00000000
[ 685.010348] 00000000 ee03bf54 c149ec37 ed4e1fe4 c104fe01 00000000 00000000 00000000
[ 685.010367] Call Trace:
[ 685.010376] [<c149eb53>] ? rfcomm_process_rx+0x6e/0x74
[ 685.010387] [<c149ec10>] rfcomm_process_sessions+0xb7/0xde
[ 685.010398] [<c149ec87>] rfcomm_run+0x50/0x6d
[ 685.010409] [<c149ec37>] ? rfcomm_process_sessions+0xde/0xde
[ 685.010419] [<c104fe01>] kthread+0x63/0x68
[ 685.010431] [<c104fd9e>] ? __init_kthread_worker+0x42/0x42
[ 685.010442] [<c14dae82>] kernel_thread_helper+0x6/0xd
This issue has been brought up earlier here:
https://lkml.org/lkml/2011/5/21/127
The issue appears to be the rfcomm_session_put in rfcomm_recv_ua. This
operation doesn't seem be to required as for the non-initiator case we
have the rfcomm_process_rx doing an explicit put and in the initiator
case the last dlc_unlink will drive the reference counter to 0.
There have been several attempts to fix these issue:
6c2718d Bluetooth: Do not call rfcomm_session_put() for RFCOMM UA on closed socket
683d949 Bluetooth: Never deallocate a session when some DLC points to it
but AFAICS they do not fix the issue just make it harder to reproduce.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: Gopala Krishna Murala <gopala.krishna.murala@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
After moving L2CAP timers to workqueues l2cap_set_timer expects timeout
value to be specified in jiffies but constants defined in miliseconds
are used. This makes timeouts unreliable when CONFIG_HZ is not set to
1000.
__set_chan_timer macro still uses jiffies as input to avoid multiple
conversions from/to jiffies for sk_sndtimeo value which is already
specified in jiffies.
Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
Ackec-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
sk_sndtime value should be specified in jiffies thus initial value
needs to be converted from miliseconds. Otherwise this timeout is
unreliable when CONFIG_HZ is not set to 1000.
Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
As reported by Dan Carpenter this function causes a Sparse warning and
shouldn't be declared inline:
include/net/bluetooth/l2cap.h:837:30 error: marked inline, but without a
definition"
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Commit 330605423c fixed l2cap conn establishment for non-ssp remote
devices by not setting HCI_CONN_ENCRYPT_PEND every time conn security
is tested (which was always returning failure on any subsequent
security checks).
However, this broke l2cap conn establishment for ssp remote devices
when an ACL link was already established at SDP-level security. This
fix ensures that encryption must be pending whenever authentication
is also pending.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
read_lock(&tpt_trig->trig.leddev_list_lock) is accessed via the path
ieee80211_open (->) ieee80211_do_open (->) ieee80211_mod_tpt_led_trig
(->) ieee80211_start_tpt_led_trig (->) tpt_trig_timer before initializing
it.
the intilization of this read/write lock happens via the path
ieee80211_led_init (->) led_trigger_register, but we are doing
'ieee80211_led_init' after 'ieeee80211_if_add' where we
register netdev_ops.
so we access leddev_list_lock before initializing it and causes the
following bug in chrome laptops with AR928X cards with the following
script
while true
do
sudo modprobe -v ath9k
sleep 3
sudo modprobe -r ath9k
sleep 3
done
BUG: rwlock bad magic on CPU#1, wpa_supplicant/358, f5b9eccc
Pid: 358, comm: wpa_supplicant Not tainted 3.0.13 #1
Call Trace:
[<8137b9df>] rwlock_bug+0x3d/0x47
[<81179830>] do_raw_read_lock+0x19/0x29
[<8137f063>] _raw_read_lock+0xd/0xf
[<f9081957>] tpt_trig_timer+0xc3/0x145 [mac80211]
[<f9081f3a>] ieee80211_mod_tpt_led_trig+0x152/0x174 [mac80211]
[<f9076a3f>] ieee80211_do_open+0x11e/0x42e [mac80211]
[<f9075390>] ? ieee80211_check_concurrent_iface+0x26/0x13c [mac80211]
[<f9076d97>] ieee80211_open+0x48/0x4c [mac80211]
[<812dbed8>] __dev_open+0x82/0xab
[<812dc0c9>] __dev_change_flags+0x9c/0x113
[<812dc1ae>] dev_change_flags+0x18/0x44
[<8132144f>] devinet_ioctl+0x243/0x51a
[<81321ba9>] inet_ioctl+0x93/0xac
[<812cc951>] sock_ioctl+0x1c6/0x1ea
[<812cc78b>] ? might_fault+0x20/0x20
[<810b1ebb>] do_vfs_ioctl+0x46e/0x4a2
[<810a6ebb>] ? fget_light+0x2f/0x70
[<812ce549>] ? sys_recvmsg+0x3e/0x48
[<810b1f35>] sys_ioctl+0x46/0x69
[<8137fa77>] sysenter_do_call+0x12/0x2
Cc: <stable@vger.kernel.org>
Cc: Gary Morain <gmorain@google.com>
Cc: Paul Stewart <pstew@google.com>
Cc: Abhijit Pradhan <abhijit@qca.qualcomm.com>
Cc: Vasanthakumar Thiagarajan <vthiagar@qca.qualcomm.com>
Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Acked-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Most rate control implementations assume .get_rate and .tx_status are only
called once the per-station data has been fully initialized.
minstrel_ht crashes if this assumption is violated.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Tested-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If you want to use mesh support from mac80211 on a recent
kernel on 2.6.24 you'll run into a name clash when compiling
against include/linux/namei.h, so rename this routine.
/home/mcgrof/tmp/compat-wireless-3.2.5-1/net/mac80211/mesh_pathtbl.c: At top level:
/home/mcgrof/tmp/compat-wireless-3.2.5-1/net/mac80211/mesh_pathtbl.c:342:26: error: conflicting types for ‘path_lookup’
include/linux/namei.h:71:12: note: previous declaration of ‘path_lookup’ was here
Although this could sit as a separate patch in compat-wireless it seems
best to just merge upstream.
Cc: Javier Cardona <javier@cozybit.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@frijolero.org>
Acked-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When not debugging mac80211 code, station state transitions do not need to
show up in the kernel log.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There are situations where we don't have the
necessary rate control information yet for
station entries, e.g. when associating. This
currently doesn't really happen due to the
dummy station handling; explicitly disabling
rate control when it's not initialised will
allow us to remove dummy stations.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Currently, mac80211 goes to idle-off before starting a scan.
However, some devices that implement hw scan might not
need going idle-off in order to perform a hw scan, and
thus saving some energy and simplifying their state machine.
(Note that this is also the case for sched scan - it
currently doesn't make mac80211 go idle-off)
Add a new flag to indicate support for hw scan while idle.
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
"ridx" is used as an index into the mcs_mask[] array which has
IEEE80211_HT_MCS_MASK_LEN elements.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If the IBSS network is RSN-protected, let userspace authorize the stations
instead of adding them as AUTHORIZED by default.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This is the second part of the auth/assoc redesign,
the mac80211 part. This moves the auth/assoc code
out of the work abstraction and into the MLME, so
that we don't flip channels all the time etc.
The only downside is that when we are associated,
we need to drop the association in order to create
a connection to another AP, but for most drivers
this is actually desirable and the ability to do
was never used by any applications. If we want to
implement resource reservation with FT-OTA, we'd
probably best do it with explicit R-O-C in wpa_s.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This is needed by mac80211 to keep a reference
to a BSS alive for the auth process. Remove the
old version of cfg80211_ref_bss() since it's
not actually used.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
To track authenticated state seems to have been
a design mistake in cfg80211. It is possible to
have out of band authentication (FT), tracking
multiple authentications caused more problems
than it ever helped, and the implementation in
mac80211 is too complex.
Remove all this complexity, and let userspace
do whatever it wants to, mac80211 can deal with
that just fine. Association is still tracked of
course, but authentication no longer is. Local
auth state changes are thus no longer of value,
so ignore them completely.
This will also help implement SAE -- asking the
driver to do an authentication is now almost
equivalent to sending an authentication frame,
with the exception of shared key authentication
which is still handled completely.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The dummy STA support was added because I didn't
want to change the driver API at the time. Now
that we have state transitions triggering station
add/remove in the driver, we only call add once a
station reaches ASSOCIATED, so we can remove the
dummy station stuff again.
While at it, tighten the RX check and accept only
port control (EAP) frames from the AP station if
it's not associated yet -- in other cases there's
no race.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead of maintaining separate sta_add/sta_remove
callsites, implement it in sta_state when the driver
has no sta_state implementation.
The only behavioural change this should cause is in
secure mesh mode: with this the station entries will
only be created after the stations are set to AUTH.
Given which drivers support mesh, this seems to not
be a problem.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
(based on Eliad's patch)
Add a callback to notify the low-level driver whenever
the state of a station changes. The driver is only
notified when the station is actually in the mac80211
hash table, not for pre-insert state transitions.
To allow the driver to replace sta_add/remove calls
with this, call extra transitions with the NOTEXIST
state.
This callback can fail, so we need to be careful in
handling it when a station is inserted, particularly
in the IBSS case where we still keep the station entry
around for mac80211 purposes.
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This will be used by drivers later if they
need to have stations inserted all the time,
in mac80211 has no purpose, is never used
and sta_state starts out in NONE.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If a station couldn't be uploaded to the driver but
is still kept (only in IBSS mode) we still shouldn't
try to program the keys for it into hardware; fix
this bug by skipping the key upload in this case.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Setting keys and updating TKIP keys must use the
BSS sdata (not AP_VLAN), so we translate. Move
the translation into driver-ops wrappers instead
of having it inline in the code to simplify the
normal code flow.
The same can be done for sta_add/remove which
already does the translation in the wrapper.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Move the station state modification right before insert,
this just makes the current code more readable (you can
tell that it's before insertion looking at a single
screenful of code) right now, but some upcoming changes
will require this.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The current code checks for stored_mpdu_num > 1, causing
the reorder_timer to be triggered indefinitely, but the
frame is never timed-out (until the next packet is received)
Signed-off-by: Eliad Peller <eliad@wizery.com>
Cc: <stable@vger.kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Unted the assumption that the sta struct is still accessible before the
synchronize_rcu call we should move the num_sta_ps counter decrement
after synchronize_rcu to avoid incorrect decrements if num_sta_ps.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* Handle MCS masks set by the user.
* Match rates provided by the rate control algorithm to the mask set,
also in HT mode, and switch back to legacy mode if necessary.
* add debugfs files to observate the rate selection
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Allow to set mcs masks through nl80211. We also allow to set MCS
rates but no legacy rates (and vice versa).
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If the driver blocked this specific STA with the help of
ieee80211_sta_block_awake we won't clear WLAN_STA_PS_STA later but
still decrement num_sta_ps. Hence, the next data frame from this
STA will trigger ap_sta_ps_end again and also decrement num_sta_ps
again leading to an incorrect num_sta_ps counter.
This can result in problems with powersaving clients not waking up
from PS because the TIM calculation might be skipped due to the
incorrect num_sta_ps counter.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When WLAN_STA_PS_DRIVER is set by ieee80211_sta_block_awake the
num_sta_ps counter is not incremented. Hence, we shouldn't decrement
it in __sta_info_destroy if only WLAN_STA_PS_DRIVER is set. This
could result in an incorrect num_sta_ps counter leading to strange side
effects with associated powersaving clients.
Fix this by only decrementing num_sta_ps when WLAN_STA_PS_STA was set
before.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In the future, when we start notifying drivers,
state transitions could potentially fail. To make
it easier to distinguish between programming bugs
and driver failures:
* rename sta_info_move_state() to
sta_info_pre_move_state() which can only be
called before the station is inserted (and
check this with a new station flag).
* rename sta_info_move_state_checked() to just
plain sta_info_move_state(), as it will be
the regular function that can fail for more
than just one reason (bad transition or an
error from the driver)
This makes the programming model easier -- one of
the functions can only be called before insertion
and can't fail, the other can fail.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This reverts commit f1e3be1561.
Johannes Berg <johannes@sipsolutions.net> thinks that this patch is
incorrect. I'll defer to his judgment.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Commit 0884d7aa24 (AF_UNIX: Fix poll blocking problem when reading from
a stream socket) added a regression for epoll() in Edge Triggered mode
(EPOLLET)
Appropriate fix is to use skb_peek()/skb_unlink() instead of
skb_dequeue(), and only call skb_unlink() when skb is fully consumed.
This remove the need to requeue a partial skb into sk_receive_queue head
and the extra sk->sk_data_ready() calls that added the regression.
This is safe because once skb is given to sk_receive_queue, it is not
modified by a writer, and readers are serialized by u->readlock mutex.
This also reduce number of spinlock acquisition for small reads or
MSG_PEEK users so should improve overall performance.
Reported-by: Nick Mathewson <nickm@freehaven.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Alexey Moiseytsev <himeraster@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>