If the 'driver_initiated' function argument to
__cfg80211_stop_sched_scan() is not 0 then we'll return an
uninitialized 'err' from the function.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Based on inputs from Johannes Berg <johannes@sipsolutions.net>
from http://article.gmane.org/gmane.linux.kernel.wireless.general/68193
and http://article.gmane.org/gmane.linux.kernel.wireless.general/71702
In xmit path, devices that do full hardware crypto (including
MMIC and ICV) need no tailroom. For such devices, tailroom
reservation can be skipped if all the keys are programmed into
the hardware (i.e software crypto is not used for any of the
keys) and none of the keys wants software to generate Michael
MIC and IV.
v2: Added check for IV along with MMIC.
Reported-by: Fabio Rossi <rossi.f@inwind.it>
Tested-by: Fabio Rossi <rossi.f@inwind.it>
Signed-off-by: Mohammed Shafi Shajakhan <mshajakhan@atheros.com>
Cc: Mohammed Shafi Shajakhan <mshajakhan@atheros.com>
v3: Fixing races to avoid WARNING: at net/mac80211/wpa.c:397
ccmp_encrypt_skb+0xc4/0x1f0
Reported-by: Andreas Hartmann <andihartmann@01019freenet.de>
Tested-by: Andreas Hartmann <andihartmann@01019freenet.de>
v4: Added links with message ID
Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There was a deadlock when rfkill-blocking a wireless interface,
because we were locking the rdev mutex on NETDEV_GOING_DOWN to stop
sched_scans that were eventually running. The rfkill block code was
already holding a mutex under rdev:
kernel: =======================================================
kernel: [ INFO: possible circular locking dependency detected ]
kernel: 3.0.0-rc1-00049-g1fa7b6a #57
kernel: -------------------------------------------------------
kernel: kworker/0:1/4525 is trying to acquire lock:
kernel: (&rdev->mtx){+.+.+.}, at: [<ffffffff8164c831>] cfg80211_netdev_notifier_call+0x131/0x5b0
kernel:
kernel: but task is already holding lock:
kernel: (&rdev->devlist_mtx){+.+.+.}, at: [<ffffffff8164dcef>] cfg80211_rfkill_set_block+0x4f/0xa0
kernel:
kernel: which lock already depends on the new lock.
To fix this, add a new mutex specifically for sched_scan, to protect
the sched_scan_req element in the rdev struct, instead of using the
global rdev mutex.
Reported-by: Duane Griffin <duaneg@dghda.com>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The version number of modules build outside of the tree can get revision
numbers added. This is useful to give hints about the revision of a
distribution package and the used patchset. The prepended source number or
branch name doesn't add any additional information which would help to identify
problems and can therefore be omitted.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
The packet aggregation needs to ensure that only compatible packets
are aggregated. Some of the checks are based on the interface number
while assuming that the first interface also is the primary interface
which is not always the case.
This patch addresses the issue by using the primary_if pointer.
Reported-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
The primary interface OGM has to be broadcasted on all hard-interfaces
even if the primary interface is not the first interface (if_num = 0).
Therefore the code has to compare the originating interface with the
primary interface instead of checking the if_num.
Reported-by: Linus Luessing <linus.luessing@web.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
now tt_local_event() takes a flags argument instead of a sequence of
boolean values which would grow up with the time.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
In order to make possible to use the broadcast list for delayed sendings
the "delay" parameter is now provided instead of using 1 as hardcoded
value.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
The tt_local_entry structure now has a 'flags' field. This helps to
unify the flags format to all the client related structures (tt_global_entry
and tt_change). The 'never_purge' field is now encoded in the 'flags' one.
To optimise the usage of this field, its length has been increased to 16bit
in order to use the eight leading bits (from 0 to 7) to store flags that
have to be sent on the wire, while the eight ending ones are used for local
computation only.
Moreover 'enum tt_change_flags' is now called 'enum tt_client_flags' and the
defined values apply to the tt_local_entry, tt_global_entry and the tt_change
'flags' field.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Hi,
Reinhard Max also pointed out that the error should EAFNOSUPPORT according
to POSIX.
The Linux manpages have it as EINVAL, some other OSes (Minix, HPUX, perhaps BSD) use
EAFNOSUPPORT. Windows uses WSAEFAULT according to MSDN.
Other protocols error values in their af bind() methods in current mainline git as far
as a brief look shows:
EAFNOSUPPORT: atm, appletalk, l2tp, llc, phonet, rxrpc
EINVAL: ax25, bluetooth, decnet, econet, ieee802154, iucv, netlink, netrom, packet, rds, rose, unix, x25,
No check?: can/raw, ipv6/raw, irda, l2tp/l2tp_ip
Ciao, Marcus
Signed-off-by: Marcus Meissner <meissner@suse.de>
Cc: Reinhard Max <max@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
CCID-2's cwnd increases like TCP during slow-start, which has implications for
* the local Sequence Window value (should be > cwnd),
* the Ack Ratio value.
Hence an exponential growth, if it does not reflect the actual network
conditions, can quickly lead to instability.
This patch adds congestion-window validation (RFC2861) to CCID-2:
* cwnd is constrained if the sender is application limited;
* cwnd is reduced after a long idle period, as suggested in the '90 paper
by Van Jacobson, in RFC 2581 (sec. 4.1);
* cwnd is never reduced below the RFC 3390 initial window.
As marked in the comments, the code is actually almost a direct copy of the
TCP congestion-window-validation algorithms. By continuing this work, it may
in future be possible to use the TCP code (not possible at the moment).
The mechanism can be turned off using a module parameter. Sampling of the
currently-used window (moving-maximum) is however done constantly; this is
used to determine the expected window, which can be exploited to regulate
DCCP's Sequence Window value.
This patch also sets slow-start-after-idle (RFC 4341, 5.1), i.e. it behaves like
TCP when net.ipv4.tcp_slow_start_after_idle = 1.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This replaces a switch statement with a test, using the equivalent
function dccp_data_packet(skb). It also doubles the range of the field
`rx_num_data_pkts' by changing the type from `int' to `u32', avoiding
signed/unsigned comparison with the u16 field `dccps_r_ack_ratio'.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This moves CCID-2's initial window function into the header file, since several
parts throughout the CCID-2 code need to call it (CCID-2 still uses RFC 3390).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Leandro Melo de Sales <leandro@ic.ufal.br>
Change the CCID (de)activation message to start with the
protocol name, as 'CCID' is already in there.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Realising the following call pattern,
* first dccp_entail() is called to enqueue a new skb and
* then skb_clone() is called to transmit a clone of that skb,
this patch integrates both into the same function.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This patch rearranges the order of statements of the slow-path input processing
(i.e. any other state than OPEN), to resolve the following issues.
1. Dependencies: the order of statements now better matches RFC 4340, 8.5, i.e.
step 7 is before step 9 (previously 9 was before 7), and parsing options in
step 8 (which may consume resources) now comes after step 7.
2. Sequence number checks are omitted if in state LISTEN/REQUEST, due to the
note underneath the table in RFC 4340, 7.5.3.
As a result, CCID processing is now indeed confined to OPEN/PARTOPEN states,
i.e. congestion control is performed only on the flow of data packets. This
avoids pathological cases of doing congestion control on those messages
which set up and terminate the connection.
3. Packets are now passed on to Ack Vector / CCID processing only after
- step 7 (receive unexpected packets),
- step 9 (receive Reset),
- step 13 (receive CloseReq),
- step 14 (receive Close)
and only if the state is PARTOPEN. This simplifies CCID processing:
- in LISTEN/CLOSED the CCIDs are non-existent;
- in RESPOND/REQUEST the CCIDs have not yet been negotiated;
- in CLOSEREQ and active-CLOSING the node has already closed this socket;
- in passive-CLOSING the client is waiting for its Reset.
In the last case, RFC 4340, 8.3 leaves it open to ignore further incoming
data, which is the approach taken here.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Use pr_fmt() without KBUILD_MODNAME to allow AUN and econet prefixes.
Convert printks with KERN_DEBUG to pr_debug.
Hoist assigns from if.
80 column wrapping.
Move open braces to end of line.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Too trivial to live.
cc: WANG Cong <amwang@redhat.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unused symbols waste space.
Commit 0e34e93177
"(netpoll: add generic support for bridge and bonding devices)"
added the symbol more than a year ago with the promise of "future use".
Because it is so far unused, remove it for now.
It can be easily readded if or when it actually needs to be used.
cc: WANG Cong <amwang@redhat.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling icmp_send() on a local message size error leads to
an incorrect update of the path mtu. So use ip_local_error()
instead to notify the socket about the error.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We might call ip_ufo_append_data() for packets that will be IPsec
transformed later. This function should be used just for real
udp packets. So we check for rt->dst.header_len which is only
nonzero on IPsec handling and call ip_ufo_append_data() just
if rt->dst.header_len is zero.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The family arg is not used any more, so remove it.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPV6, unlike IPV4, doesn't have a routing cache.
Routing table entries, as well as clones made in response
to route lookup requests, all live in the same table. And
all of these things are together collected in the destination
cache table for ipv6.
This means that routing table entries count against the garbage
collection limits, even though such entries cannot ever be reclaimed
and are added explicitly by the administrator (rather than being
created in response to lookups).
Therefore it makes no sense to count ipv6 routing table entries
against the GC limits.
Add a DST_NOCOUNT destination cache entry flag, and skip the counting
if it is set. Use this flag bit in ipv6 when adding routing table
entries.
Signed-off-by: David S. Miller <davem@davemloft.net>
Correct the syntax so that both array and pointer are const.
Signed-off-by: Greg Dietsche <Gregory.Dietsche@cuw.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows 80 column line reflowing.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows 80 column line reflowing.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows useless break;s removed after returns
and a comment added to an unnecessary default: break;
because of a dubious gcc warning.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows code on same line as case moved
to separate lines and a "/* Fallthrough */" comment
added where appropriate.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows 80 column reflowing.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows 80 column reflowing,
removal of a useless break after return, and moving
open brace after case instead of separate line.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows no difference.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows miscellaneous 80 column wrapping,
comment reflowing and a comment for a useless gcc
warning for an otherwise unused default: case.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows miscellaneous 80 column wrapping.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
Simplify a default case / return (unreached)
Remove unnecessary break after return.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
git diff -w shows differences for line wrapping.
(fit multiple lines to 80 columns, join where possible)
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the case labels the same indent as the switch.
(git diff -w net/appletalk shows no difference)
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a change sequence counter to each net namespace
which is bumped whenever a netdevice is added or removed from
the list. If such a change occurred while a link dump took place,
the dump will have the NLM_F_DUMP_INTR flag set in the first
message which has been interrupted and in all subsequent messages
of the same dump.
Note that links may still be modified or renamed while a dump is
taking place but we can guarantee for userspace to receive a
complete list of links and not miss any.
Testing:
I have added 500 VLAN netdevices to make sure the dump is split
over multiple messages. Then while continuously dumping links in
one process I also continuously deleted and re-added a dummy
netdevice in another process. Multiple dumps per seconds have
had the NLM_F_DUMP_INTR flag set.
I guess we can wait for Johannes patch to hit net-next via the
wireless tree. I just wanted to give this some testing right away.
Signed-off-by: Thomas Graf <tgraf@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even when the received tx_seq is expected, the frame still needs to be
dropped if the TX window is exceeded or the receiver is in the local
busy state.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Add a local logging function to emit bluetooth specific
messages. Using vsprintf extension %pV saves code/text
space.
Convert the current BT_INFO and BT_ERR macros to use bt_printk.
Remove __func__ from BT_ERR macro (and the uses).
Prefix "Bluetooth: " to BT_ERR
Remove __func__ from BT_DBG as function can be prefixed when
using dynamic_debug.
With allyesconfig:
text data bss dec hex filename
129956 8632 36096 174684 2aa5c drivers/bluetooth/built-in.o.new2
134402 8632 36064 179098 2bb9a drivers/bluetooth/built-in.o.old
14778 1012 3408 19198 4afe net/bluetooth/bnep/built-in.o.new2
15067 1012 3408 19487 4c1f net/bluetooth/bnep/built-in.o.old
346595 19163 86080 451838 6e4fe net/bluetooth/built-in.o.new2
353751 19163 86064 458978 700e2 net/bluetooth/built-in.o.old
18483 1172 4264 23919 5d6f net/bluetooth/cmtp/built-in.o.new2
18927 1172 4264 24363 5f2b net/bluetooth/cmtp/built-in.o.old
19237 1172 5152 25561 63d9 net/bluetooth/hidp/built-in.o.new2
19581 1172 5152 25905 6531 net/bluetooth/hidp/built-in.o.old
59461 3884 14464 77809 12ff1 net/bluetooth/rfcomm/built-in.o.new2
61206 3884 14464 79554 136c2 net/bluetooth/rfcomm/built-in.o.old
with x86 defconfig (and just bluetooth):
$ size net/bluetooth/built-in.o.defconfig.*
text data bss dec hex filename
66358 933 100 67391 1073f net/bluetooth/built-in.o.defconfig.new
66643 933 100 67676 1085c net/bluetooth/built-in.o.defconfig.old
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Make it easier to use more normal logging styles later.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
ERTM timeouts are defined in milliseconds, but need to be converted
to jiffies when passed to mod_timer().
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
If the remote device is not present, the connections attemp fails and
the struct hci_conn was not freed
Signed-off-by: Tomas Targownik <ttargownik@geicp.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
PTS test A2DP/SRC/SRC_SET/TC_SRC_SET_BV_02_I revealed that
( probably after the df3c3931e commit ) the l2cap connection
could not be established in case when the "Auth Complete" HCI
event does not arive before the initiator send "Configuration
request", in which case l2cap replies with "Command rejected"
since the channel is still in BT_CONNECT2 state.
Based on patch from: Ilia Kolomisnky <iliak@ti.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Partial revert of commit aabf6f89. When the hidp session thread
was converted from kernel_thread to kthread, the atomic/wakeups
were replaced with kthread_stop. kthread_stop has blocking semantics
which are inappropriate for the hidp session kthread. In addition,
the kthread signals itself to terminate in hidp_process_hid_control()
- it cannot do this with kthread_stop().
Lastly, a wakeup can be lost if the wakeup happens between checking
for the loop exit condition and setting the current state to
TASK_INTERRUPTIBLE. (Without appropriate synchronization mechanisms,
the task state should not be changed between the condition test and
the yield - via schedule() - as this creates a race between the
wakeup and resetting the state back to interruptible.)
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
usbnet: Remove over-broad module alias from zaurus.
MAINTAINERS: drop Michael from bfin_mac driver
net/can: activate bit-timing calculation and netlink based drivers by default
rionet: fix NULL pointer dereference in rionet_remove
net+crypto: Use vmalloc for zlib inflate buffers.
netfilter: Fix ip_route_me_harder triggering ip_rt_bug
ipv4: Fix IPsec slowpath fragmentation problem
ipv4: Fix packet size calculation in __ip_append_data
cxgb3: skb_record_rx_queue now records the queue index relative to the net_device.
bridge: Only flood unregistered groups to routers
qlge: Add maintainer.
MAINTAINERS: mark socketcan-core lists as subscribers-only
MAINTAINERS: Remove Sven Eckelmann from BATMAN ADVANCED
r8169: fix wrong register use.
net/usb/kalmia: signedness bug in kalmia_bind()
net/usb: kalmia: Various fixes for better support of non-x86 architectures.
rtl8192cu: Fix missing firmware load
udp/recvmsg: Clear MSG_TRUNC flag when starting over for a new packet
ipv6/udp: Use the correct variable to determine non-blocking condition
netconsole: fix build when CONFIG_NETCONSOLE_DYNAMIC is turned on
...
In this revision the conversion of secid to SELinux context and adding it
to the audit log is moved from xt_AUDIT.c to audit.c with the aid of a
separate helper function - audit_log_secctx - which does both the conversion
and logging of SELinux context, thus also preventing internal secid number
being leaked to userspace. If conversion is not successful an error is raised.
With the introduction of this helper function the work done in xt_AUDIT.c is
much more simplified. It also opens the possibility of this helper function
being used by other modules (including auditd itself), if desired. With this
addition, typical (raw auditd) output after applying the patch would be:
type=NETFILTER_PKT msg=audit(1305852240.082:31012): action=0 hook=1 len=52 inif=? outif=eth0 saddr=10.1.1.7 daddr=10.1.2.1 ipid=16312 proto=6 sport=56150 dport=22 obj=system_u:object_r:ssh_client_packet_t:s0
type=NETFILTER_PKT msg=audit(1306772064.079:56): action=0 hook=3 len=48 inif=eth0 outif=? smac=00:05:5d:7c:27:0b dmac=00:02:b3:0a:7f:81 macproto=0x0800 saddr=10.1.2.1 daddr=10.1.1.7 ipid=462 proto=6 sport=22 dport=3561 obj=system_u:object_r:ssh_server_packet_t:s0
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Mr Dash Four <mr.dash.four@googlemail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Add a memeber to the ieee80211_sta structure to indicate whether the STA
supports WME.
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch enables the 6131 family of chips to forward DSA
packets to other switch chips. This is needed if multiple
DSA chips are used in a device. Without this patch the
chip will drop any DSA packets not destined for it.
This patch only enables the forwarding of DSA packets if
multiple chips are used in the switch configuration.
Signed-off-by: Barry Grussling <barry@grussling.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid creating input routes with ip_route_me_harder.
It does not work for locally generated packets. Instead,
restrict sockets to provide valid saddr for output route (or
unicast saddr for transparent proxy). For other traffic
allow saddr to be unicast or local but if callers forget
to check saddr type use 0 for the output route.
The resulting handling should be:
- REJECT TCP:
- in INPUT we can provide addr_type = RTN_LOCAL but
better allow rejecting traffic delivered with
local route (no IP address => use RTN_UNSPEC to
allow also RTN_UNICAST).
- FORWARD: RTN_UNSPEC => allow RTN_LOCAL/RTN_UNICAST
saddr, add fix to ignore RTN_BROADCAST and RTN_MULTICAST
- OUTPUT: RTN_UNSPEC
- NAT, mangle, ip_queue, nf_ip_reroute: RTN_UNSPEC in LOCAL_OUT
- IPVS:
- use RTN_LOCAL in LOCAL_OUT and FORWARD after SNAT
to restrict saddr to be local
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
A remote user can provide a small value for the command size field in
the command header of an l2cap configuration request, resulting in an
integer underflow when subtracting the size of the configuration request
header. This results in copying a very large amount of data via
memcpy() and destroying the kernel heap. Check for underflow.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
ip_append_data() builds packets based on the mtu from dst_mtu(rt->dst.path).
On IPsec the effective mtu is lower because we need to add the protocol
headers and trailers later when we do the IPsec transformations. So after
the IPsec transformations the packet might be too big, which leads to a
slowpath fragmentation then. This patch fixes this by building the packets
based on the lower IPsec mtu from dst_mtu(&rt->dst) and adapts the exthdr
handling to this.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Git commit 59104f06 (ip: take care of last fragment in ip_append_data)
added a check to see if we exceed the mtu when we add trailer_len.
However, the mtu is already subtracted by the trailer length when the
xfrm transfomation bundles are set up. So IPsec packets with mtu
size get fragmented, or if the DF bit is set the packets will not
be send even though they match the mtu perfectly fine. This patch
actually reverts commit 59104f06.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the tx_frames_pending() driver callback to determine if Tx frames are
pending for its internal queues. If so postpone the dynamic PS timeout
to avoid interrupting Tx traffic.
The commit e8306f9894 enabled this
behavior for drivers with IEEE80211_HW_PS_NULLFUNC_STACK. We enable this
for all drivers supporting dynamic PS.
This patch helps improve performance in noisy environments.
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Do not send DS Channel parameter for directed probe requests
in order to maximize the chance that we get a response. Some
badly-behaved APs don't respond when this parameter is included.
Signed-off-by: Paul Stewart <pstew@chromium.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When forming a Rx BA session, sometimes the ADDBA response gets lost.
This leads to a situation where the session is configured locally, but
doesn't exist on the remote side. Subsequent ADDBA requests are declined
by mac80211.
Fix this by assuming the session state of the initiator is the correct
one. When receiving an unexpected ADDBA request on a TID with an active
Rx BA session, delete the existing one and establish a new session.
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Recent changes to hci_core.c use crypto interfaces, so select CRYPTO
to make sure that those interfaces are present.
Fixes these build errors when CRYPTO is not enabled:
net/built-in.o: In function `hci_register_dev':
(.text+0x4cf86): undefined reference to `crypto_alloc_base'
net/built-in.o: In function `hci_unregister_dev':
(.text+0x4f912): undefined reference to `crypto_destroy_tfm'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Advertise only user-requested bitrates in a HW scan.
Note that the hw_scan API doesn't currently have a
way of asking for a specific probe request bitrate,
so we might end up using a bitrate that we don't
advertise as supported. I'll fix that later.
Also add a hexdump printk to hwsim to verify this.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Move all that mac80211 has into the generic
ieee80211.h header file and use them. At the
same time move them from mask+shift to just
bits and rename them for consistent names.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Sometimes when reporting a MIC failure rx->key may be unset. This
code path is hit when receiving a packet meant for a multicast
address, and decryption is performed in HW.
Fortunately, the failing key_idx is not used for anything up to
(and including) usermode, so we allow ourselves to drop it on the
way up when a key cannot be retrieved.
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Recent changes to hci_core.c use crypto interfaces, so select CRYPTO
to make sure that those interfaces are present.
Fixes these build errors when CRYPTO is not enabled:
net/built-in.o: In function `hci_register_dev':
(.text+0x4cf86): undefined reference to `crypto_alloc_base'
net/built-in.o: In function `hci_unregister_dev':
(.text+0x4f912): undefined reference to `crypto_destroy_tfm'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Results on dummy device can be seen in my netconf 2011
slides. These results are for a 10Gige IXGBE intel
nic - on another i5 machine, very similar specs to
the one used in the netconf2011 results.
It turns out - this is a hell lot worse than dummy
and so this patch is even more beneficial for 10G.
Test setup:
----------
System under test sending packets out.
Additional box connected directly dropping packets.
Installed prio qdisc on the eth device and default
netdev default length of 1000 used as is.
The 3 prio bands each were set to 100 (didnt factor in
the results).
5 packet runs were made and the middle 3 picked.
results
-------
The "cpu" column indicates the which cpu the sample
was taken on,
The "Pkt runx" carries the number of packets a cpu
dequeued when forced to be in the "dequeuer" role.
The "avg" for each run is the number of times each
cpu should be a "dequeuer" if the system was fair.
3.0-rc4 (plain)
cpu Pkt run1 Pkt run2 Pkt run3
================================================
cpu0 21853354 21598183 22199900
cpu1 431058 473476 393159
cpu2 481975 477529 458466
cpu3 23261406 23412299 22894315
avg 11506948 11490372 11486460
3.0-rc4 with patch and default weight 64
cpu Pkt run1 Pkt run2 Pkt run3
================================================
cpu0 13205312 13109359 13132333
cpu1 10189914 10159127 10122270
cpu2 10213871 10124367 10168722
cpu3 13165760 13164767 13096705
avg 11693714 11639405 11630008
As you can see the system is still not perfect but
is a lot better than what it was before...
At the moment we use the old backlog weight, weight_p
which is 64 packets. It seems to be reasonably fine
with that value.
The system could be made more fair if we reduce the
weight_p (as per my presentation), but we are going
to affect the shared backlog weight. Unless deemed
necessary, I think the default value is fine. If not
we could add yet another knob.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bridge currently floods packets to groups that we have never
seen before to all ports. This is not required by RFC4541 and
in fact it is not desirable in environment where traffic to
unregistered group is always present.
This patch changes the behaviour so that we only send traffic
to unregistered groups to ports marked as routers.
The user can always force flooding behaviour to any given port
by marking it as a router.
Note that this change does not apply to traffic to 224.0.0.X
as traffic to those groups must always be flooded to all ports.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
File net/TUNABLE has never be updated since git age.
For some tunable parameters which user can control with proc file-system,
They are all in ip-sysctl.txt doc.
For tunable parameters that only at compile time, no meaning to note them.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simplifies the creation of connection protocol messages by eliminating
the passing of information that is no longer required, is constant,
or is contained within the port structure that is issuing the message.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Modifies the logic that creates a connection termination payload
message so that it no longer (mis)uses a routine that creates a
connection protocol message. The revised code is now more easily
understood, and avoids setting several fields that are either not
present in payload messages or were being set more than once.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Restructures the logic used in tipc_port_recv_proto_msg() to ensure
that incoming connection protocol messages are handled properly. The
routine now uses a two-stage process that first ensures the message
applies on an existing connection and then processes the request.
This corrects a loophole that allowed a connection probe request to
be processed if it was sent to an unconnected port that had no names
bound to it.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Speeds up the creation of the FIN message that terminates a TIPC
connection. The typical peer termination message is now created by
duplicating the terminating port's standard payload message header
and adjusting the message size, importance, and error code fields,
rather than building all fields of the message from scratch. A FIN
message that is directed to the port itself is created the same way.
but also requires swapping the origin and destination address fields.
In addition to reducing the work required to create FIN messages,
these changes eliminate several instances of duplicated code,
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Performs cosmetic cleanup of the symbolic names used to specify TIPC
payload message header sizes. The revised names now more accurately
reflect the payload messages in which they can appear. In addition,
several places where these payload message symbol names were being used
to create non-payload messages have been updated to use the proper
internal message symbolic name.
No functional changes are introduced by this rework.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Gets rid of code that allows tipc_msg_init() to create a short
payload message header. This optimization is possible because
there are no longer any callers who require this capability.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates a pair of #include statements for files that are brought in
automatically by including core.h.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Gets rid of counter that records the number of times a bearer has
resumed after congestion or blocking, since the value is never
referenced anywhere.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Fixes a minor error in the title of one of the message size profiling
values printed as part of TIPC's link statistics.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Gets rid of a pair of checks to see if a name sequence entry in
TIPC's name table has an empty zone list. These checks are pointless
since the zone list can never be empty (i.e. as soon as the list
becomes empty the associated name sequence entry is deleted).
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Modifies the main circular linked lists of publications used in TIPC's
name table to use the standard kernel linked list type. This change
simplifies the deletion of an existing publication by eliminating
the need to search up to three lists to locate the publication.
The use of standard list routines also helps improve the readability
of the name table code by make it clearer what each list operation
being performed is actually doing.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Modifies the name table array structure that contains the name
sequence instances for a given name type so that the publication
lists associated with a given instance are stored in a dynamically
allocated structure, rather than being embedded within the array
entry itself. This change is being done for several reasons:
1) It reduces the amount of data that needs to be copied whenever
a given array is expanded or contracted to accommodate the first
publication of a new name sequence or the removal of the last
publication of an existing name sequence.
2) It reduces the amount of memory associated with array entries that
are currently unused.
3) It facilitates the upcoming conversion of the publication lists
from TIPC-specific circular lists to standard kernel lists. (Standard
lists cannot be used with the former array structure because the
relocation of array entries during array expansion and contraction
would corrupt the lists.)
Note that, aside from introducing a small amount of code to dynamically
allocate and free the structure that now holds publication list info,
this change is largely a simple renaming exercise that replaces
references to "sseq->LIST" with "sseq->info->LIST" (or "info->LIST").
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Gets rid of unnecessary masking in two routines that set TIPC message
header fields. (The msg_set_bits() routine already takes care of
masking the new value to the correct size.)
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Gets rid of a pair of routines that provide support for temporarily
caching the destination node for a message in the associated message
buffer's application handle, since this capability is no longer used.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Optimizes the creation of a returned payload message by duplicating
the original message and then updating the small number of fields
that need to be adjusted, rather than building the new message header
from scratch. In addition, certain operations that are not always
required are relocated so that they are only done if needed.
These optimizations also have the effect of addressing other issues
that were present previously:
1) Fixes a bug that caused the socket send routines to return the
size of the returned message, rather than the size of the sent
message, when a returnable payload message was sent to a non-existent
destination port.
2) The message header of the returned message now matches that of
the original message more closely. The header is now always the same
size as the original header, and some message header fields that
weren't being initialized in the returned message header are now
populated correctly -- namely the "d" and "s" bits, and the upper
bound of a multicast name instance (where present).
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Reduces the work involved in transmitting a returned payload message
by doing only the work necessary to route such a message directly to
the specified destination port, rather than invoking the code used
to route an arbitrary message to an arbitrary destination.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Introduces an internal sanity check to ensure that the only undeliverable
messages TIPC attempts to return to their origin are application payload
messages.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Modifies the routine that handles the rejection of payload messages
so that it has a single exit point that frees up the rejected message,
thereby eliminating some duplicated code.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates a TIPC-specific assert() macro that is no longer used.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Modifies the existing broadcast link sanity check that detects an
attempt to send a message off-node when there are no available
destinations so that it no longer causes a kernel panic; instead,
the check now issues a warning and stack trace and then returns
without sending the message anywhere.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
net/bluetooth/smp.c: In function 'smp_e':
net/bluetooth/smp.c:49:21: error: storage size of 'sg' isn't known
net/bluetooth/smp.c:67:2: error: implicit declaration of function 'sg_init_one'
net/bluetooth/smp.c:49:21: warning: unused variable 'sg'
Caused by commit d22ef0bc83 ("Bluetooth: Add LE SMP Cryptoolbox
functions"). Missing include file, presumably. This batch has been in
the bluetooth tree since June 14, so it may have been exposed by the
removal of linux/mm.h from netdevice.h ...
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
These sk_buff structs were allocated with nlmsg_new() so they should
be freed with nlmsg_free().
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new consistent dump feature from (generic) netlink
to advertise when dumps are incomplete.
Readers may note that this does not initialize the
rdev->bss_generation counter to a non-zero value. This is
still OK since the value is modified only under spinlock
when the list is modified. Since the dump code holds the
spinlock, the value will either be > 0 already, or the
list will still be empty in which case a consistent dump
will actually be made (and be empty).
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Consider the following situation:
* a dump that would show 8 entries, four in the first
round, and four in the second
* between the first and second rounds, 6 entries are
removed
* now the second round will not show any entry, and
even if there is a sequence/generation counter the
application will not know
To solve this problem, add a new flag NLM_F_DUMP_INTR
to the netlink header that indicates the dump wasn't
consistent, this flag can also be set on the MSG_DONE
message that terminates the dump, and as such above
situation can be detected.
To achieve this, add a sequence counter to the netlink
callback struct. Of course, netlink code still needs
to use this new functionality. The correct way to do
that is to always set cb->seq when a dumpit callback
is invoked and call nl_dump_check_consistent() for
each new message. The core code will also call this
function for the final MSG_DONE message.
To make it usable with generic netlink, a new function
genlmsg_nlhdr() is needed to obtain the netlink header
from the genetlink user header.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Consider this scenario: When the size of the first received udp packet
is bigger than the receive buffer, MSG_TRUNC bit is set in msg->msg_flags.
However, if checksum error happens and this is a blocking socket, it will
goto try_again loop to receive the next packet. But if the size of the
next udp packet is smaller than receive buffer, MSG_TRUNC flag should not
be set, but because MSG_TRUNC bit is not cleared in msg->msg_flags before
receive the next packet, MSG_TRUNC is still set, which is wrong.
Fix this problem by clearing MSG_TRUNC flag when starting over for a
new packet.
Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
udpv6_recvmsg() function is not using the correct variable to determine
whether or not the socket is in non-blocking operation, this will lead
to unexpected behavior when a UDP checksum error occurs.
Consider a non-blocking udp receive scenario: when udpv6_recvmsg() is
called by sock_common_recvmsg(), MSG_DONTWAIT bit of flags variable in
udpv6_recvmsg() is cleared by "flags & ~MSG_DONTWAIT" in this call:
err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT,
flags & ~MSG_DONTWAIT, &addr_len);
i.e. with udpv6_recvmsg() getting these values:
int noblock = flags & MSG_DONTWAIT
int flags = flags & ~MSG_DONTWAIT
So, when udp checksum error occurs, the execution will go to
csum_copy_err, and then the problem happens:
csum_copy_err:
...............
if (flags & MSG_DONTWAIT)
return -EAGAIN;
goto try_again;
...............
But it will always go to try_again as MSG_DONTWAIT has been cleared
from flags at call time -- only noblock contains the original value
of MSG_DONTWAIT, so the test should be:
if (noblock)
return -EAGAIN;
This is also consistent with what the ipv4/udp code does.
Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are enough instances of this:
iph->frag_off & htons(IP_MF | IP_OFFSET)
that a helper function is probably warranted.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually).
To prevent mm.h inclusion via other channels also extract "enum dma_data_direction"
definition into separate header. This tiny piece is what gluing netdevice.h with mm.h
via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h".
Removal of mm.h from scatterlist.h was tried and was found not feasible
on most archs, so the link was cutoff earlier.
Hope people are OK with tiny include file.
Note, that mm_types.h is still dragged in, but it is a separate story.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Fix decode_secinfo_maxsz
NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test
NFSv4.1: Fix some issues with pnfs_generic_pg_test
NFSv4.1: file layout must consider pg_bsize for coalescing
pnfs-obj: No longer needed to take an extra ref at add_device
SUNRPC: Ensure the RPC client only quits on fatal signals
NFSv4: Fix a readdir regression
nfs4.1: mark layout as bad on error path in _pnfs_return_layout
nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout
NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path
NFS: (d)printks should use %zd for ssize_t arguments
NFSv4.1: fix break condition in pnfs_find_lseg
nfs4.1: fix several problems with _pnfs_return_layout
NFSv4.1: allow zero fh array in filelayout decode layout
NFSv4.1: allow nfs_fhget to succeed with mounted on fileid
NFSv4.1: Fix a refcounting issue in the pNFS device id cache
NFSv4.1: deprecate headerpadsz in CREATE_SESSION
NFS41: do not update isize if inode needs layoutcommit
NLM: Don't hang forever on NLM unlock requests
NFS: fix umount of pnfs filesystems
Missing error checking before nla_parse_nested().
Reported-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Incorrect return type on dcb_setapp() this routine
returns negative error codes. All call sites of
dcb_setapp() assign the return value to an int already
so no need to update drivers.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With multiple APP entries per selector and protocol drivers
or stacks may want to pick a specific value or stripe traffic
across many priorities. Also if an APP entry in use is
deleted the stack/driver may want to choose from the existing
APP entries.
To facilitate this and avoid having duplicate code to walk
the APP ring provide a routine dcb_ieee_getapp_mask() to
return a u8 bitmask of all priorities set for the specified
selector and protocol. This routine and bitmask is a helper
for DCB kernel users.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we allow multiple IEEE App entries we need a way
to remove specific entries. To do this add the ieee_dcb_delapp()
routine.
Additionaly drivers may need to remove the APP entry from
their firmware tables. Add dcb ops routine to handle this.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a setapp routine for IEEE802.1Qaz encoded APP data types.
The IEEE 802.1Qaz spec encodes the priority bits differently and
allows for multiple APP data entries of the same selector and
protocol. Trying to force these to use the same set routines was
becoming tedious. Furthermore, userspace could probably enforce
the correct semantics, but expecting drivers to do this seems
error prone in the firmware case.
For these reasons add ieee_dcb_setapp() that understands the
IEEE 802.1Qaz encoded form.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that dcbnl is being used in many cases by more
than a single agent it is beneficial to be notified
when some entity either driver or user space has
changed the DCB attributes.
Today applications either end up polling the interface
or relying on a user space database to maintain the DCB
state and post events. Polling is a poor solution for
obvious reasons. And relying on a user space database
has its own downside. Namely it has created strange
boot dependencies requiring the database be populated
before any applications dependent on DCB attributes
starts or the application goes into a polling loop.
Populating the database requires negotiating link
setting with the peer and can take anywhere from less
than a second up to a few seconds depending on the switch
implementation.
Perhaps more importantly if another application or an
embedded agent sets a DCB link attribute the database
has no way of knowing other than polling the kernel.
This prevents applications from responding quickly to
changes in link events which at least in the FCoE case
and probably any other protocols expecting a lossless
link may result in IO errors.
By adding a multicast group for DCB we have clean way
to disseminate kernel DCB link attributes up to user
space. Avoiding the need for user space to maintain
a coherant database and disperse events that potentially
do not reflect the current link state.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding the capabilities bitmask to the get_ieee response allows
user space to determine the current DCBX mode. Either CEE or IEEE
this is useful with devices that support switching between modes
where knowing the current state is relevant.
Derived from work by Mark Rustad
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds 2 tracepoints to get a status of a socket receive queue
and related parameter.
One tracepoint is added to sock_queue_rcv_skb. It records rcvbuf size
and its usage. The other tracepoint is added to __sk_mem_schedule and
it records limitations of memory for sockets and current usage.
By using these tracepoints we're able to know detailed reason why kernel
drop the packet.
Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a tracepoint to __udp_queue_rcv_skb to get the
return value of ip_queue_rcv_skb. It indicates why kernel drops
a packet at this point.
ip_queue_rcv_skb returns following values in the packet drop case:
rcvbuf is full : -ENOMEM
sk_filter returns error : -EINVAL, -EACCESS, -ENOMEM, etc.
__sk_mem_schedule returns error: -ENOBUF
Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was suggested by "make versioncheck" that the follwing includes of
linux/version.h are redundant:
/home/jj/src/linux-2.6/net/caif/caif_dev.c: 14 linux/version.h not needed.
/home/jj/src/linux-2.6/net/caif/chnl_net.c: 10 linux/version.h not needed.
/home/jj/src/linux-2.6/net/ipv4/gre.c: 19 linux/version.h not needed.
/home/jj/src/linux-2.6/net/netfilter/ipset/ip_set_core.c: 20 linux/version.h not needed.
/home/jj/src/linux-2.6/net/netfilter/xt_set.c: 16 linux/version.h not needed.
and it seems that it is right.
Beyond manually inspecting the source files I also did a few build
tests with various configs to confirm that including the header in
those files is indeed not needed.
Here's a patch to remove the pointless includes.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the connection is ready we should set the connection
to CONNECTED so userspace can use it.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
* 'for-2.6.40' of git://linux-nfs.org/~bfields/linux:
nfsd4: fix break_lease flags on nfsd open
nfsd: link returns nfserr_delay when breaking lease
nfsd: v4 support requires CRYPTO
nfsd: fix dependency of nfsd on auth_rpcgss
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits)
pxa168_eth: fix race in transmit path.
ipv4, ping: Remove duplicate icmp.h include
netxen: fix race in skb->len access
sgi-xp: fix a use after free
hp100: fix an skb->len race
netpoll: copy dev name of slaves to struct netpoll
ipv4: fix multicast losses
r8169: fix static initializers.
inet_diag: fix inet_diag_bc_audit()
gigaset: call module_put before restart of if_open()
farsync: add module_put to error path in fst_open()
net: rfs: enable RFS before first data packet is received
fs_enet: fix freescale FCC ethernet dp buffer alignment
netdev: bfin_mac: fix memory leak when freeing dma descriptors
vlan: don't call ndo_vlan_rx_register on hardware that doesn't have vlan support
caif: Bugfix - XOFF removed channel from caif-mux
tun: teach the tun/tap driver to support netpoll
dp83640: drop PHY status frames in the driver.
dp83640: fix phy status frame event parsing
phylib: Allow BCM63XX PHY to be selected only on BCM63XX.
...
Ethernet MAC drivers based on phylib (but not using NAPI) can
enable hardware time stamping in phy devices by calling netif_rx()
conditionally based on a call to skb_defer_rx_timestamp().
This commit exports that function so that drivers calling it may
be compiled as modules.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the duplicate inclusion of net/icmp.h from net/ipv4/ping.c
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
We already have access to the chan, we don't have to access the
socket to get its imtu.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
We should not try to do any other type of configuration for
LE links when they become ready.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Many stupid corrections of duplicated includes based on the output of
scripts/checkincludes.pl.
Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
If a client issues a DHCPREQUEST for renewal, the packet is dropped
if the old destination (the old gateway for the client) TQ is smaller
than the current best gateway TQ less GW_THRESHOLD
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
In case of new default gw, changing the default gw or deleting the default gw a
uevent is triggered with type=gw, action=add/change/del and
data={GW_ORIG_ADDRESS} (if any).
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
The gateway election mechanism has been a little revised. Now the
gw_election is trigered by an atomic_t flag (gw_reselect) which is set
to 1 in case of election needed, avoding to set curr_gw to NULL.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Using throw_uevent() is now possible to trigger uevent signal that can
be recognised in userspace. Uevents will be triggered through the
/devices/virtual/net/{MESH_IFACE} kobject.
A triggered uevent has three properties:
- type: the event class. Who generates the event (only 'gw' is currently
defined). Corresponds to the BATTYPE uevent variable.
- action: the associated action with the event ('add'/'change'/'del' are
currently defined). Corresponds to the BATACTION uevent variable.
- data: any useful data for the userspace. Corresponds to the BATDATA
uevent variable.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
The local and the global translation-tables are now lock free and rcu
protected.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
With the current client announcement implementation, in case of roaming,
an update is triggered on the new AP serving the client. At that point
the new information is spread around by means of the OGM broadcasting
mechanism. Until this operations is not executed, no node is able to
correctly route traffic towards the client. This obviously causes packet
drops and introduces a delay in the time needed by the client to recover
its connections.
A new packet type called ROAMING_ADVERTISEMENT is added to account this
issue.
This message is sent in case of roaming from the new AP serving the
client to the old one and will contain the client MAC address. In this
way an out-of-OGM update is immediately committed, so that the old node
can update its global translation table. Traffic reaching this node will
then be redirected to the correct destination utilising the fresher
information. Thus reducing the packet drops and the connection recovery
delay.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
The client announcement mechanism informs every mesh node in the network
of any connected non-mesh client, in order to find the path towards that
client from any given point in the mesh.
The old implementation was based on the simple idea of appending a data
buffer to each OGM containing all the client MAC addresses the node is
serving. All other nodes can populate their global translation tables
(table which links client MAC addresses to node addresses) using this
MAC address buffer and linking it to the node's address contained in the
OGM. A node that wants to contact a client has to lookup the node the
client is connected to and its address in the global translation table.
It is easy to understand that this implementation suffers from several
issues:
- big overhead (each and every OGM contains the entire list of
connected clients)
- high latencies for client route updates due to long OGM trip time and
OGM losses
The new implementation addresses these issues by appending client
changes (new client joined or a client left) to the OGM instead of
filling it with all the client addresses each time. In this way nodes
can modify their global tables by means of "updates", thus reducing the
overhead within the OGMs.
To keep the entire network in sync each node maintains a translation
table version number (ttvn) and a translation table checksum. These
values are spread with the OGM to allow all the network participants to
determine whether or not they need to update their translation table
information.
When a translation table lookup is performed in order to send a packet
to a client attached to another node, the destination's ttvn is added to
the payload packet. Forwarding nodes can compare the packet's ttvn with
their destination's ttvn (this node could have a fresher information
than the source) and re-route the packet if necessary. This greatly
reduces the packet loss of clients roaming from one AP to the next.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
The amount of duplicated code in the receive and routing code can be
reduced when all headers provide the packet type, version and ttl in the
same first bytes.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
char was used in different places to store information without really
using the characteristics of that data type or by ignoring the fact that
char has not a well defined signedness.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
count_real_packets() in batman-adv assumes char is signed, and returns -1
through it:
net/batman-adv/routing.c: In function 'receive_bat_packet':
net/batman-adv/routing.c:739: warning: comparison is always false due to limited range of data type
Use int instead.
Signed-off-by: David Howells <dhowells@redhat.com>
[sven@narfation.org: Rebase on top of current version]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
interface_tx is not used outside of soft-interface.c and thus doesn't
need to be declared inside soft-interface.h
Signed-off-by: Sven Eckelmann <sven@narfation.org>
compare_orig is only used in context of orig_node which is managed
inside originator.c. It is not necessary to keep that function inside
the header originator.h.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Otherwise we will not see the name of the slave dev in error
message:
[ 388.469446] (null): doesn't support polling, aborting.
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Knut Tidemann found that first packet of a multicast flow was not
correctly received, and bisected the regression to commit b23dd4fe42
(Make output route lookup return rtable directly.)
Special thanks to Knut, who provided a very nice bug report, including
sample programs to demonstrate the bug.
Reported-and-bisectedby: Knut Tidemann <knut.andre.tidemann@jotron.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A malicious user or buggy application can inject code and trigger an
infinite loop in inet_diag_bc_audit()
Also make sure each instruction is aligned on 4 bytes boundary, to avoid
unaligned accesses.
Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Le jeudi 16 juin 2011 à 23:38 -0400, David Miller a écrit :
> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Fri, 17 Jun 2011 00:50:46 +0100
>
> > On Wed, 2011-06-15 at 04:15 +0200, Eric Dumazet wrote:
> >> @@ -1594,6 +1594,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
> >> goto discard;
> >>
> >> if (nsk != sk) {
> >> + sock_rps_save_rxhash(nsk, skb->rxhash);
> >> if (tcp_child_process(sk, nsk, skb)) {
> >> rsk = nsk;
> >> goto reset;
> >>
> >
> > I haven't tried this, but it looks reasonable to me.
> >
> > What about IPv6? The logic in tcp_v6_do_rcv() looks very similar.
>
> Indeed ipv6 side needs the same fix.
>
> Eric please add that part and resubmit. And in fact I might stick
> this into net-2.6 instead of net-next-2.6
>
OK, here is the net-2.6 based one then, thanks !
[PATCH v2] net: rfs: enable RFS before first data packet is received
First packet received on a passive tcp flow is not correctly RFS
steered.
One sock_rps_record_flow() call is missing in inet_accept()
But before that, we also must record rxhash when child socket is setup.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
When suspending, __ieee80211_suspend() calls ieee80211_scan_cancel(),
which will only cancel sw scan. In order to cancel hw scan, the
low-level driver has to cancel it in the suspend() callback. however,
this is too late, as a new scan_work will be enqueued (while the driver
is going into suspend).
Add a new cancel_hw_scan() callback, asking the driver to cancel an
active hw scan, and call it in ieee80211_scan_cancel().
Signed-off-by: Eliad Peller <eliad@wizery.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Trigger connection monitor on resume from suspend. Since we
have been sleeping, there is reason to suspect that we might
not still be associated. The speed of detecting loss of
{connection,authentication} is worth the cost of the small
additional traffic at resume.
Signed-off-by: Paul Stewart <pstew@chromium.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fix a couple of instances where we were exiting the RPC client on
arbitrary signals. We should only do so on fatal signals.
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch removes the call to ndo_vlan_rx_register if the underlying
device doesn't have hardware support for VLAN.
Signed-off-by: Antoine Reversat <a.reversat@gmail.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Since printk_ratelimit() shouldn't be used anymore (see comment in
include/linux/printk.h), replace it with printk_ratelimited()
Signed-off-by: Manuel Zerpies <manuel.f.zerpies@ww.stud.uni-erlangen.de>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Since printk_ratelimit() shouldn't be used anymore (see comment in
include/linux/printk.h), replace it with printk_ratelimited().
Signed-off-by: Manuel Zerpies <manuel.f.zerpies@ww.stud.uni-erlangen.de>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
XOFF was mixed up with DOWN indication, causing causing CAIF channel to be
removed from mux and all incoming traffic to be lost after receiving flow-off.
Fix this by replacing FLOW_OFF with DOWN notification.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
In c7ac8679be "rtnetlink: Compute and store minimum ifinfo dump
size", we moved the allocation under the lock so we need to unlock
on error path.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Unnecessary casts of void * clutter the code.
These are the remainder casts after several specific
patches to remove netdev_priv and dev_priv.
Done via coccinelle script:
$ cat cast_void_pointer.cocci
@@
type T;
T *pt;
void *pv;
@@
- pt = (T *)pv;
+ pt = pv;
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Upon reception of a MGM report packet the kernel sets the mrouters_only flag
in a skb that is a clone of the original skb, which means that the bridge
loses track of MGM packets (cb buffers are tied to a specific skb and not
shared) and it ends up forwading join requests to the bridge interface.
This can cause unexpected membership timeouts and intermitent/permanent loss
of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]:
A snooping switch should forward IGMP Membership Reports only to
those ports where multicast routers are attached.
[...]
Sending membership reports to other hosts can result, for IGMPv1
and IGMPv2, in unintentionally preventing a host from joining a
specific multicast group.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Upon reception of a IGMP/IGMPv2 membership report the kernel sets the
mrouters_only flag in a skb that may be a clone of the original skb, which
means that sometimes the bridge loses track of membership report packets (cb
buffers are tied to a specific skb and not shared) and it ends up forwading
join requests to the bridge interface.
This can cause unexpected membership timeouts and intermitent/permanent loss
of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]:
A snooping switch should forward IGMP Membership Reports only to
those ports where multicast routers are attached.
[...]
Sending membership reports to other hosts can result, for IGMPv1
and IGMPv2, in unintentionally preventing a host from joining a
specific multicast group.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Tested-by: Hayato Kakuta <kakuta.hayato@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Instead of setting bits manually we use set_bit, test_bit, etc.
Also remove L2CAP_ prefix from macros.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Management interface commands for blocking and unblocking devices.
Signed-off-by: Antti Julku <antti.julku@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Move blacklisting functions to hci_core.c, so that they can
be used by both management interface and hci socket interface.
Signed-off-by: Antti Julku <antti.julku@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
AFS: Use i_generation not i_version for the vnode uniquifier
AFS: Set s_id in the superblock to the volume name
vfs: Fix data corruption after failed write in __block_write_begin()
afs: afs_fill_page reads too much, or wrong data
VFS: Fix vfsmount overput on simultaneous automount
fix wrong iput on d_inode introduced by e6bc45d65d
Delay struct net freeing while there's a sysfs instance refering to it
afs: fix sget() races, close leak on umount
ubifs: fix sget races
ubifs: split allocation of ubifs_info into a separate function
fix leak in proc_set_super()
The hash:net,iface type makes possible to store network address and
interface name pairs in a set. It's mostly suitable for egress
and ingress filtering. Examples:
# ipset create test hash:net,iface
# ipset add test 192.168.0.0/16,eth0
# ipset add test 192.168.0.0/24,eth1
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
With the change the sets can use any parameter available for the match
and target extensions, like input/output interface. It's required for
the hash:net,iface set type.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
When creating a set from a range expressed as a network like
10.1.1.172/29, the from address was taken as the IP address part and
not masked with the netmask from the cidr.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The range internally is converted to the network(s) equal to the range.
Example:
# ipset new test hash:net
# ipset add test 10.2.0.0-10.2.1.12
# ipset list test
Name: test
Type: hash:net
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 16888
References: 0
Members:
10.2.1.12
10.2.1.0/29
10.2.0.0/24
10.2.1.8/30
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
A set type may have multiple revisions, for example when syntax is
extended. Support continuous revision ranges in set types.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
When ranges are added to hash types, the elements may trigger rehashing
the set. However, the last successfully added element was not kept track
so the adding started again with the first element after the rehashing.
Bug reported by Mr Dash Four.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Current listing makes possible to list sets with full content only.
The patch adds support partial listings, i.e. listing just
the existing setnames or listing set headers, without set members.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The support makes possible to specify the timeout value for
the SET target and a flag to reset the timeout for already existing
entries.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
When an element to a set with timeout added, one can change the timeout
by "readding" the element with the "-exist" flag. That means the timeout
value is reset to the specified one (or to the default from the set
specification if the "timeout n" option is not used). Example
ipset add foo 1.2.3.4 timeout 10
ipset add foo 1.2.3.4 timeout 600 -exist
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Avoid double seq adjustment for loopback traffic
because it causes silent repetition of TCP data. One
example is passive FTP with DNAT rule and difference in the
length of IP addresses.
This patch adds check if packet is sent and
received via loopback device. As the same conntrack is
used both for outgoing and incoming direction, we restrict
seq adjustment to happen only in POSTROUTING.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
By default, when broadcast or multicast packet are sent from a local
application, they are sent to the interface then looped by the kernel
to other local applications, going throught netfilter hooks in the
process.
These looped packet have their MAC header removed from the skb by the
kernel looping code. This confuse various netfilter's netlink queue,
netlink log and the legacy ip_queue, because they try to extract a
hardware address from these packets, but extracts a part of the IP
header instead.
This patch prevent NFQUEUE, NFLOG and ip_QUEUE to include a MAC header
if there is none in the packet.
Signed-off-by: Nicolas Cavallari <cavallar@lri.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Userspace allows to specify inversion for IP header ECN matches, the
kernel silently accepts it, but doesn't invert the match result.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Check for protocol inversion in ecn_mt_check() and remove the
unnecessary runtime check for IPPROTO_TCP in ecn_mt().
Signed-off-by: Patrick McHardy <kaber@trash.net>
In hci_conn_security ( which is used during L2CAP connection
establishment ) test for HCI_CONN_ENCRYPT_PEND state also
sets this state, which is bogus and leads to connection time-out
on L2CAP sockets in certain situations (especially when
using non-ssp devices )
Signed-off-by: Ilia Kolomisnky <iliak@ti.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
If the NLM daemon is killed on the NFS server, we can currently end up
hanging forever on an 'unlock' request, instead of aborting. Basically,
if the rpcbind request fails, or the server keeps returning garbage, we
really want to quit instead of retrying.
Tested-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
This patch implements a check in smp cmd pairing request and pairing
response to verify if encryption key maximum size is compatible in both
slave and master when SMP Pairing is requested. Keys are also masked to
the correct negotiated size.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Anderson Briglia <anderson.briglia@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch adds support for disconnecting the link when SMP procedure
takes more than 30 seconds.
SMP begins when either the Pairing Request command is sent or the
Pairing Response is received, and it ends when the link is encrypted
(or terminated). Vol 3, Part H Section 3.4.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
When authentication completes we shouldn't blindly accept any pending
L2CAP connect requests. If the socket has the defer_setup feature
enabled it should still wait for user space acceptance of the connect
request. The issue only happens for non-SSP connections since with SSP
the L2CAP Connect request may not be sent for non-SDP PSMs before
authentication has completed successfully.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
After restructuring, there is some unused or empty functions
left to be removed.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Put goto labels at the beginig of row
acording to coding style example.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Set the page count correctly for non-page-aligned IO. We were already
doing this correctly for alignment, but not the page count. Fixes
DIRECT_IO writes from unaligned pages.
Signed-off-by: Sage Weil <sage@newdream.net>
In net/ieee802154/nl-phy.c::ieee802154_nl_fill_phy() I see two small
issues.
1) If the allocation of 'buf' fails we may just as well return -EMSGSIZE
directly rather than jumping to 'out:' and do a pointless kfree(0).
2) We do not free 'buf' unless we jump to one of the error labels and this
leaks memory.
This patch should address both.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
l2tp_ip_sendmsg() in non connected mode incorrectly calls
sk_setup_caps(). Subsequent send() calls send data to wrong destination.
We can also avoid changing dst refcount in connected mode, using
appropriate rcu locking. Once output route lookups can also be done
under rcu, sendto() calls wont change dst refcounts too.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
The network stack provides the function, skb_clone_tx_timestamp().
Ethernet MAC drivers can call this via the transmit time stamping
hook, skb_tx_timestamp(). This commit exports the clone function so
that drivers using it can be compiled as modules.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
The "dc" variable is initialized but not passed to hci_send_cmd().
Signed-off-by: Anderson Lizardo <anderson.lizardo@openbossa.org>
Signed-off-by: Bruna Moreira <bruna.moreira@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch implements a simple version of the SMP Pairing Features
exchange procedure (Vol. 3 Part H, Section 2.3.5.1).
For now, everything that would cause a Pairing Method different of
Just Works to be chosen is rejected.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Before we are able to do a proper exchange of pairing parameters,
we need a unified way of building pairing requests and responses.
For IO Capability we use the value that was set by userspace,
using the management interface.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
As the default security level (BT_SECURITY_SDP) doesn't make sense for
LE links, initialize LE links with something that makes sense.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This adds support for resuming the user space traffic when SMP
negotiation is complete.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Now that these commands are sent to the controller we can use hcidump
to verify that the correct values are produced.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This adds support for starting SMP Phase 2 Encryption, when the initial
SMP negotiation is successful. This adds the LE Start Encryption and LE
Long Term Key Request commands and related events.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch includes support for generating and sending the random value
used to produce the confirmation value.
Signed-off-by: Anderson Briglia <anderson.briglia@openbossa.org>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch adds initial support for verifying the confirmation value
that the remote side has sent.
Signed-off-by: Anderson Briglia <anderson.briglia@openbossa.org>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch implements SMP crypto functions called ah, c1, s1 and e.
It also implements auxiliary functions. All These functions are needed
for SMP keys generation.
Signed-off-by: Anderson Briglia <anderson.briglia@openbossa.org>
Signed-off-by: Anderson Lizardo <anderson.lizardo@openbossa.org>
Signed-off-by: Bruna Moreira <bruna.moreira@openbossa.org>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This will allow using the crypto subsystem for encrypting data. As SMP
(Security Manager Protocol) is implemented almost entirely on the host
side and the crypto module already implements the needed methods
(AES-128), it makes sense to use it.
There's now a new module option to enable/disable SMP support.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Anderson Briglia <anderson.briglia@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This implementation only exchanges SMP messages between the Host and the
Remote. No keys are being generated. TK and STK generation will be
provided in further patches.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Start SMP procedure for LE connections. This modification intercepts
l2cap received frames and call proper SMP functions to start the SMP
procedure. By now, no keys are being used.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Anderson Briglia <anderson.briglia@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: unwind canceled flock state
ceph: fix ENOENT logic in striped_read
ceph: fix short sync reads from the OSD
ceph: fix sync vs canceled write
ceph: use ihold when we already have an inode ref
These simple commands will allow the SMP procedure to be started
and terminated with a not supported error. This is the first step
toward something useful.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Anderson Briglia <anderson.briglia@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
ERTM use the generic L2CAP timer functions to keep a reference to the
channel. This is useful for avoiding crashes.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
struct l2cap_chan has now its own refcnt that is compatible with the
socket refcnt, i.e., we won't see sk_refcnt = 0 and chan->refcnt > 0.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Now socket state is tracked by struct sock and channel state is tracked by
chan->state. At this point both says the same, but this is going to change
when we add AMP Support for example.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Add an abstraction layer between L2CAP core and its users (only
l2cap_sock.c now). The first function implemented is new_connection() that
replaces calls to l2cap_sock_alloc() in l2cap_core.c
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
With older userspace versions (using hciops) it might not have the
key type to check if the key has sufficient security for any security
level so it is necessary to check the return of hci_conn_auth to make
sure the connection is authenticated
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Quote from Patric Mc Hardy
"This looks like nfnetlink.c excited and destroyed the nfnl socket, but
ip_vs was still holding a reference to a conntrack. When the conntrack
got destroyed it created a ctnetlink event, causing an oops in
netlink_has_listeners when trying to use the destroyed nfnetlink
socket."
If nf_conntrack_netlink is loaded before ip_vs this is not a problem.
This patch simply avoids calling ip_vs_conn_drop_conntrack()
when netns is dying as suggested by Julian.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Make it more clear what the functions does,
on request by Julian.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Change the parsing of FTP commands and responses to
support skip character. It allows to detect variations in
the 227 PASV response.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
* new refcount in struct net, controlling actual freeing of the memory
* new method in kobj_ns_type_operations (->drop_ns())
* ->current_ns() semantics change - it's supposed to be followed by
corresponding ->drop_ns(). For struct net in case of CONFIG_NET_NS it bumps
the new refcount; net_drop_ns() decrements it and calls net_free() if the
last reference has been dropped. Method renamed to ->grab_current_ns().
* old net_free() callers call net_drop_ns() instead.
* sysfs_exit_ns() is gone, along with a large part of callchain
leading to it; now that the references stored in ->ns[...] stay valid we
do not need to hunt them down and replace them with NULL. That fixes
problems in sysfs_lookup() and sysfs_readdir(), along with getting rid
of sb->s_instances abuse.
Note that struct net *shutdown* logics has not changed - net_cleanup()
is called exactly when it used to be called. The only thing postponed by
having a sysfs instance refering to that struct net is actual freeing of
memory occupied by struct net.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
There is a dev_put(ndev) missing on an error path. This was
introduced in 0c1ad04aec "netpoll: prevent netpoll setup on slave
devices".
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SNMP mibs use two percpu arrays, one used in BH context, another in USER
context. With increasing number of cpus in machines, and fact that ipv6
uses per network device ipstats_mib, this is consuming a lot of memory
if many network devices are registered.
commit be281e554e (ipv6: reduce per device ICMP mib sizes) shrinked
percpu needs for ipv6, but we can reduce memory use a bit more.
With recent percpu infrastructure (irqsafe_cpu_inc() ...), we no longer
need this BH/USER separation since we can update counters in a single
x86 instruction, regardless of the BH/USER context.
Other arches than x86 might need to disable irq in their
irqsafe_cpu_inc() implementation : If this happens to be a problem, we
can make SNMP_ARRAY_SZ arch dependent, but a previous poll
( https://lkml.org/lkml/2011/3/17/174 ) to arch maintainers did not
raise strong opposition.
Only on 32bit arches, we need to disable BH for 64bit counters updates
done from USER context (currently used for IP MIB)
This also reduces vmlinux size :
1) x86_64 build
$ size vmlinux.before vmlinux.after
text data bss dec hex filename
7853650 1293772 1896448 11043870 a8841e vmlinux.before
7850578 1293772 1896448 11040798 a8781e vmlinux.after
2) i386 build
$ size vmlinux.before vmlinux.afterpatch
text data bss dec hex filename
6039335 635076 3670016 10344427 9dd7eb vmlinux.before
6037342 635076 3670016 10342434 9dd022 vmlinux.afterpatch
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Tejun Heo <tj@kernel.org>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org
CC: linux-arch@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Testing of VLAN_FLAG_REORDER_HDR does not belong in vlan_untag
but rather in vlan_do_receive. Otherwise the vlan header
will not be properly put on the packet in the case of
vlan header accelleration.
As we remove the check from vlan_check_reorder_header
rename it vlan_reorder_header to keep the naming clean.
Fix up the skb->pkt_type early so we don't look at the packet
after adding the vlan tag, which guarantees we don't goof
and look at the wrong field.
Use a simple if statement instead of a complicated switch
statement to decided that we need to increment rx_stats
for a multicast packet.
Hopefully at somepoint we will just declare the case where
VLAN_FLAG_REORDER_HDR is cleared as unsupported and remove
the code. Until then this keeps it working correctly.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no need for the guest to validate the checksum if it have been
validated by host nics. So this patch introduces a new flag -
VIRTIO_NET_HDR_F_DATA_VALID which is used to bypass the checksum
examing in guest. The backend (tap/macvtap) may set this flag when
met skbs with CHECKSUM_UNNECESSARY to save cpu utilization.
No feature negotiation is needed as old driver just ignore this flag.
Iperf shows 12%-30% performance improvement for UDP traffic. For TCP,
when gro is on no difference as it produces skb with partial
checksum. But when gro is disabled, 20% or even higher improvement
could be measured by netperf.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Function ieee80211_reconfig in net/mac80211/util.c contains label wake_up
which is defined unconditionally, but only used with CONFIG_PM. Gcc
warns about this when CONFIG_PM is not defined.
This patch makes the label's definition dependent on CONFIG_PM too,
eliminating the warning.
The issue was apparently introduced in git commit
eecc48000a.
Signed-off-by: Vincent Zweije <vincent@zweije.nl>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Downsteram DEAUTH messages do not refer to a current authentication
attempt -- AUTH responses do. Therefore we should not allow DEAUTH
from an AP to void state for an AUTH attempt in progress.
Signed-off-by: Paul Stewart <pstew@chromium.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add ieee80211_get_operstate() function to get the operstate
of the netdevice.
This is needed for drivers that need to know when the interface
is IF_OPER_UP (e.g. wl12xx), and block notifiers can't be used
(e.g. because the interface is already IF_OPER_UP, like after
resuming from suspend)
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some old hci controllers do not accept any mask so leave the
default mask on for these devices.
< HCI Command: Set Event Mask (0x03|0x0001) plen 8
Mask: 0xfffffbff00000000
> HCI Event: Command Complete (0x0e) plen 4
Set Event Mask (0x03|0x0001) ncmd 1
status 0x12
Error: Invalid HCI Command Parameters
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Tested-by: Corey Boyle <corey@kansanian.com>
Tested-by: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
shutdown should wait for SCO link to be properly disconnected before
detroying the socket, otherwise an application using the socket may
assume link is properly disconnected before it really happens which
can be a problem when e.g synchronizing profile switch.
Signed-off-by: Luiz Augusto von Dentz <luiz.dentz-von@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
The message size allocated for rtnl ifinfo dumps was limited to
a single page. This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.
Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We assume that transhdrlen is positive on the first fragment
which is wrong for raw packets. So we don't add exthdrlen to the
packet size for raw packets. This leads to a reallocation on IPsec
because we have not enough headroom on the skb to place the IPsec
headers. This patch fixes this by adding exthdrlen to the packet
size whenever the send queue of the socket is empty. This issue was
introduced with git commit 1470ddf7 (inet: Remove explicit write
references to sk/inet in ip_append_data)
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The definition NO_FLAGS was introduced to make the code more
readable and shall be used to initialize flag fields.
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
CodingStyle "Chapter 12: Macros, Enums and RTL" recommends to use enums
for several related constants. Internal states can be used without
defining the actual value, but all values which are visible to the
outside must be defined as before. Normal values are assigned as usual
and flags are defined by shifts of a bit.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
CodingStyle "Chapter 12: Macros, Enums and RTL" highly recommends to use
functions instead of macros were possible. This ensures type safety and
prevents shadowing of other variables.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
strict_strtoul as used in parse_gw_bandwidth is defined for unsigned
long and strict_strtol should be used instead for long.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
gw_node_delete is defined with "void" as return type, but still tries to
return a value. The called function gw_node_delete is also return as
void and thus doesn't provide a value for us.
Signed-off-by: Sven Eckelmann <sven@narfation.org>