Commit graph

22790 commits

Author SHA1 Message Date
Javier Cardona
dbf498fbaf mac80211: Implement mesh synchronization framework
This patch adds MBSS extensible synchronization framework (Sec.
13.13.2 of IEEE Std. 802.11-2012).

The framework is implemented via an ops table which defines the
following functions:

    rx_bcn_presp() - this is called every time a mesh beacon is
received.
    adjust_tbtt() - this is called immediately before a beacon is about
to be transmitted.

The default neighbor offset synchronization defined in the standard is
implemented.  We also provide template functions for vendor specific
methods.

When neighbor offset synchronization is active (which is the default)
mesh neighbors in the same MBSS will track timing offsets to each other
and compensate clock drift.

In our tests we observed that this mesh synchronization implementation
successfully corrected drifts between stations of ~2PPM while
introducing a jitter of ~20us.

It is also possible to test this framework on mac80211_hwsim simulated
phys to see how it behaves under different topologies, over poor links,
etc.

Signed-off-by: Marco Porsch <marco.porsch@s2005.tu-chemnitz.de>
Signed-off-by: Pavel Zubarev <pavel.zubarev@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 15:20:31 -04:00
Javier Cardona
9bdd3a6bf8 mac80211: Allow tsf increments via debugfs
Reading and writing back the tsf value via tsf is too slow if one wants
to make small increments to this timer.  With this change you can use
the syntax "+=<some value>" or "-=<some value>" to add or substract a
value from the tsf counter.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 15:20:30 -04:00
Stanislaw Gruszka
88c868c43b mac80211: sanity check for null SSID
While associated we should never have empty SSID, but life can be full
of surprises, and is allways better to print a warning than crash.

Before memcpy() in ieee80211_probereq_get() check ssid_len instead of
ssid pointer, sice pointer it always passed by "ssidie + 2" expression
to send probe functions, so practically never can be NULL.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 15:20:28 -04:00
Johannes Berg
32c5057b22 mac80211: use IEEE80211_NUM_ACS
When comparing hw->queues to determine if the
device is QoS capable, use IEEE80211_NUM_ACS
instead of just 4.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:56:10 -04:00
Johannes Berg
4644ae8903 mac80211: lazily stop queues in add_pending
When adding pending SKBs there's no need to
stop all queues, we only need to stop those
that we're adding frames to. Implement that
by lazily stopping a queue as we add an SKB.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:11 -04:00
Johannes Berg
ada1512526 mac80211: debounce queue stop/wake
When the queue status changes we need to do a fair
bit of work, so ignore no-op changes early.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:11 -04:00
Johannes Berg
ded81f6ba9 mac80211: decouple # of netdev queues from HW queues
When we get more hardware queues, we'll still want
to only have netdev queues per AC, so set it up in
that way. If the hardware doesn't support QoS (by
not supporting at least 4 queues) the netdevs get
a single queue only (this is no change in behavior
as there are no drivers with 2 or 3 queues today.)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:10 -04:00
Johannes Berg
54bcbc695e mac80211: refuse TX queue configuration on non-QoS HW
Drivers that don't support QoS also don't support
setting up their ACs, catch that early. While at
it, remove the input check since cfg80211 does it
now.

Also fix up the restart code to not try to set up
the queues in this case.

Finally also change the tx_conf array to have
IEEE80211_NUM_ACS entries instead of # of queues
since that's what it really needs.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:10 -04:00
Johannes Berg
a3304b0a17 cfg80211/nl80211: clarify TX queue API
With the plan to change mac80211's queue API to
not map ACs to queues 1:1, it seems necessary to
clarify some APIs that act on ACs rather than on
queues to spell that out explicitly. Do this.

Also verify that the AC number given is valid.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:09 -04:00
Johannes Berg
8f727ef3c4 mac80211: notify driver of rate control updates
Devices that have internal rate control need to be
notified when the bandwidth or SMPS state changes
just like external rate control algorithms get a
notification now.

Add this notification and clarify the change bits
while at it, the HT_CHANGED bit really meant only
bandwidth changed.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:08 -04:00
Johannes Berg
7213cf2cb0 mac80211: remove queue stop on rate control update
We currently stop the queue when changing the rate
control between 20/40 MHz in the BSS. This seems to
have been necessary when we actually changed the
channel, but now that we just update the station it
doesn't seem right any more. Remove it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:08 -04:00
Johannes Berg
64f68e5d15 mac80211: remove channel type argument from rate_update
The channel type argument to the rate_update()
callback isn't really the correct way to give
the rate control algorithm about the desired
RX bandwidth of the peer.

Remove this argument, and instead update the
STA capabilities with 20/40 appropriately. The
SMPS update done by this callback works in the
same way, so this makes the callback cleaner.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:08 -04:00
Johannes Berg
24398e39c8 mac80211: set HT channel before association
Changing the channel type during operation is
confusing to some drivers and will be hard to
handle in multi-channel scenarios. Instead of
changing the channel, set it to the right HT
channel before authenticating/associating and
don't change it -- just update the 20/40 MHz
restrictions in rate control as needed when
changed by the AP.

This also fixes a problem that Paul missed in
his fix for the "regulatory makes us deaf"
issue -- when we couldn't use 40 MHz we still
associated saying we were using 40 MHz, which
could in similarly broken APs make us never
even connect successfully.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:07 -04:00
Johannes Berg
1d98fb122d mac80211: use AC constants
Use the AC constants instead of hard-coding
the numbers with comments.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:07 -04:00
Johannes Berg
78307daadf mac80211: inline ieee80211_add_pending_skbs
This is a trivial wrapper function, inline it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:06 -04:00
Johannes Berg
4670cf7a84 mac80211: make ieee80211_downgrade_queue static
There's no reason for it to not be static.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:06 -04:00
Johannes Berg
4875d30df5 mac80211: clean up uAPSD TX code
Clean up the code formatting and also replace
the constant 0 by IEEE80211_AC_VO.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:06 -04:00
Johannes Berg
98aed9fd01 mac80211: fix mesh TX coding style
Fix bad indentation & pointless if nesting.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:05 -04:00
Johannes Berg
81ddbb5c11 mac80211: don't always advertise remain-on-channel
Not all devices are really capable of implementing
remain-on-channel, even if it is implemented in SW,
as they can't necessarily deal with channel changes
while associated.

Remove the WIPHY_FLAG_HAS_REMAIN_ON_CHANNEL and add
it only if either the driver has remain_on_channel
implemented in the driver/device.

Also add it to all drivers that advertise P2P right
now since those definitely have to have it working.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-10 14:54:04 -04:00
Neal Cardwell
18a223e0b9 tcp: fix tcp_rcv_rtt_update() use of an unscaled RTT sample
Fix a code path in tcp_rcv_rtt_update() that was comparing scaled and
unscaled RTT samples.

The intent in the code was to only use the 'm' measurement if it was a
new minimum.  However, since 'm' had not yet been shifted left 3 bits
but 'new_sample' had, this comparison would nearly always succeed,
leading us to erroneously set our receive-side RTT estimate to the 'm'
sample when that sample could be nearly 8x too high to use.

The overall effect is to often cause the receive-side RTT estimate to
be significantly too large (up to 40% too large for brief periods in
my tests).

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10 14:47:09 -04:00
Eric Dumazet
5fb84b1428 tcp: restore correct limit
Commit c43b874d5d (tcp: properly initialize tcp memory limits) tried
to fix a regression added in commits 4acb4190 & 3dc43e3,
but still get it wrong.

Result is machines with low amount of memory have too small tcp_rmem[2]
value and slow tcp receives : Per socket limit being 1/1024 of memory
instead of 1/128 in old kernels, so rcv window is capped to small
values.

Fix this to match comment and previous behavior.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10 14:39:26 -04:00
David S. Miller
ecd159fc5f Merge branch 'master' of git://1984.lsi.us.es/net 2012-04-10 14:38:31 -04:00
David S. Miller
06eb4eafbd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-10 14:30:45 -04:00
Gao feng
6ba900676b netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_net
in function nf_conntrack_init_net,when nf_conntrack_timeout_init falied,
we should call nf_conntrack_ecache_fini to do rollback.
but the current code calls nf_conntrack_timeout_fini.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-10 13:00:38 +02:00
Jozsef Kadlecsik
07153c6ec0 netfilter: nf_ct_ipv4: packets with wrong ihl are invalid
It was reported that the Linux kernel sometimes logs:

klogd: [2629147.402413] kernel BUG at net / netfilter /
nf_conntrack_proto_tcp.c: 447!
klogd: [1072212.887368] kernel BUG at net / netfilter /
nf_conntrack_proto_tcp.c: 392

ipv4_get_l4proto() in nf_conntrack_l3proto_ipv4.c and tcp_error() in
nf_conntrack_proto_tcp.c should catch malformed packets, so the errors
at the indicated lines - TCP options parsing - should not happen.
However, tcp_error() relies on the "dataoff" offset to the TCP header,
calculated by ipv4_get_l4proto().  But ipv4_get_l4proto() does not check
bogus ihl values in IPv4 packets, which then can slip through tcp_error()
and get caught at the TCP options parsing routines.

The patch fixes ipv4_get_l4proto() by invalidating packets with bogus
ihl value.

The patch closes netfilter bugzilla id 771.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-10 12:50:49 +02:00
Jozsef Kadlecsik
8430eac2f6 netfilter: nf_ct_ipv4: handle invalid IPv4 and IPv6 packets consistently
IPv6 conntrack marked invalid packets as INVALID and let the user
drop those by an explicit rule, while IPv4 conntrack dropped such
packets itself.

IPv4 conntrack is changed so that it marks INVALID packets and let
the user to drop them.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-10 00:38:34 +02:00
Luis R. Rodriguez
80007efeff cfg80211: warn if db.txt is empty with CONFIG_CFG80211_INTERNAL_REGDB
It has happened twice now where elaborate troubleshooting has
undergone on systems where CONFIG_CFG80211_INTERNAL_REGDB [0]
has been set but yet net/wireless/db.txt was not updated.

Despite the documentation on this it seems system integrators could
use some more help with this, so throw out a kernel warning at boot time
when their database is empty.

This does mean that the error-prone system integrator won't likely
realize the issue until they boot the machine but -- it does not seem
to make sense to enable a build bug breaking random build testing.

[0] http://wireless.kernel.org/en/developers/Regulatory/CRDA#CONFIG_CFG80211_INTERNAL_REGDB

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Youngsin Lee <youngsin@qualcomm.com>
Cc: Raja Mani <rmani@qca.qualcomm.com>
Cc: Senthil Kumar Balasubramanian <senthilb@qca.qualcomm.com>
Cc: Vipin Mehta <vipimeht@qca.qualcomm.com>
Cc: yahuan@qca.qualcomm.com
Cc: jjan@qca.qualcomm.com
Cc: vthiagar@qca.qualcomm.com
Cc: henrykim@qualcomm.com
Cc: jouni@qca.qualcomm.com
Cc: athiruve@qca.qualcomm.com
Cc: cjkim@qualcomm.com
Cc: philipk@qca.qualcomm.com
Cc: sunnykim@qualcomm.com
Cc: sskwak@qualcomm.com
Cc: kkim@qualcomm.com
Cc: mattbyun@qualcomm.com
Cc: ryanlee@qualcomm.com
Cc: simbap@qualcomm.com
Cc: krislee@qualcomm.com
Cc: conner@qualcomm.com
Cc: hojinkim@qualcomm.com
Cc: honglee@qualcomm.com
Cc: johnwkim@qualcomm.com
Cc: jinyong@qca.qualcomm.com
Cc: stable@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@frijolero.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:37:11 -04:00
Chun-Yeow Yeoh
d2a079fd48 mac80211: fix the RANN propagation issues
This patch is intended to solve the follwing issues in RANN propagation:
[1] The interval in propagated RANN should be based on the interval of received RANN.
[2] The aggregated path metric for propagated RANN is as received plus own link metric
    towards the transmitting mesh STA (not root mesh STA).
[3] The comparison of path metric for RANN with same sequence number should be done
    before deciding whether to propagate or not.

Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:37:10 -04:00
Chun-Yeow Yeoh
292c41acdd mac80211: fix the sparse warnings on endian handling in RANN propagation
The HWMP sequence number of received RANN element is compared to decide whether to be
propagated. The sequence number is required to covert from 32bit little endian data into
CPUs endianness for comparison. The same applies to the RANN metric.

Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:12:30 -04:00
Ronald Wahl
70b12f2612 mac80211: when receiving DTIM disable power-save mode only if it was enabled
When receiving DTIM we currently disable power save mode in the
hardware unconditionally, i.e. also when the hardware was not sleeping.
This causes trouble with at least one wireless chipset (Ralink RT3572).
When the hardware is not sleeping and we send a wakeup command (e.g.
this happens after a scan) then a significant decrease of the link
quality or a disconnect may occur.
Disabling power save mode only when it was enabled prevents this issue.

Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>
Reviewed-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:12:30 -04:00
Felix Fietkau
12d3952fc4 mac80211: optimize aggregation session timeout handling
Calling mod_timer from the rx/tx hotpath is somewhat expensive, and the
timeout doesn't need to be so precise.

Switch to a different strategy: Schedule the timer initially, store jiffies
of all last rx/tx activity which would previously modify the timer, and
let the timer re-arm itself after checking the last rx/tx timestamp.
Make the session timers deferrable to avoid causing extra wakeups on systems
running on battery.
This visibly reduces CPU load under high network load on small embedded
systems.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:09:36 -04:00
Felix Fietkau
fcb2c9e102 mac80211: reduce code duplication in debugfs code
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:09:35 -04:00
Felix Fietkau
c6fb08aaa8 cfg80211: use compare_ether_addr on MAC addresses instead of memcmp
Because of the constant size and guaranteed 16 bit alignment, the inline
compare_ether_addr function is much cheaper than calling memcmp.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:09:35 -04:00
Marco Porsch
52a3f20c09 mac80211: end service period only after sending last buffered frame
Signed-off-by: Marco Porsch <marco.porsch@etit.tu-chemnitz.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:06:00 -04:00
Ben Greear
d17087e78d mac80211: Add iface name when calling WARN-ON.
This lets the user know which interface has failed
the check_sdata_in_driver check.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:05:57 -04:00
Johannes Berg
074d46d1d2 wireless: rename ht_info to ht_operation
Since some of the HT code pre-dates 802.11n-2009
some names are wrong. The one that bothers me most
is that "HT operation" is called "HT information"
in our code and that causes confusion.

Rename "HT information" to "HT operation" and also
the control_chan field to primary_chan to match
the name used in the spec.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:05:55 -04:00
Rajkumar Manoharan
f69b9c79c9 mac80211: flush to get the tx status of nullfunc frame immediately
Sometimes the probe frame (nullfunc) is stuck at the hw queue. so that
the mac80211 terminates the connection as it wont see the tx status.
Instead of waiting for long period for ack status, lets call flush
to get nullfunc status immediately. It also helps to send the nullfunc
till max tries reached.

Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:05:54 -04:00
Rajkumar Manoharan
6f0756a38f mac80211: do not send pspoll when powersave is disabled
There might be latency at AP side to update TIM IE which could cause the
station to send pspoll frame even after the wakeup. If the powersave is
disabled, the nullfunc notification alone is sufficient to receive
frames from the AP. And if the pspoll frame was already sent, no need to
resend the frame till it was acked by AP.

Cc: Jouni Malinen <jouni@qca.qualcomm.com>
Cc: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 16:05:54 -04:00
Julia Lawall
7ab2485b69 net/wireless/wext-core.c: add missing kfree
Free extra as done in the error-handling code just above.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 15:54:50 -04:00
Johannes Berg
2b5f8b0b44 nl80211: ensure interface is up in various APIs
The nl80211 handling code should ensure as much as
it can that the interface is in a valid state, it
can certainly ensure the interface is running.

Not doing so can cause calls through mac80211 into
the driver that result in warnings and unspecified
behaviour in the driver.

Cc: stable@vger.kernel.org
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 15:54:46 -04:00
Johannes Berg
3f9768a5d2 mac80211: fix association beacon wait timeout
The TU_TO_EXP_TIME() macro already includes the
"jiffies +" piece of the calculation, so don't
add jiffies again.

Reported-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 15:54:46 -04:00
John W. Linville
41833af713 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2012-04-09 15:47:49 -04:00
Pablo Neira Ayuso
95ad2f873d netfilter: ip6_tables: ip6t_ext_hdr is now static inline
We may hit this in xt_LOG:

net/built-in.o:xt_LOG.c:function dump_ipv6_packet:
	error: undefined reference to 'ip6t_ext_hdr'

happens with these config options:

CONFIG_NETFILTER_XT_TARGET_LOG=y
CONFIG_IP6_NF_IPTABLES=m

ip6t_ext_hdr is fairly small and it is called in the packet path.
Make it static inline.

Reported-by: Simon Kirby <sim@netnation.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-09 16:29:34 +02:00
Changli Gao
6ee0b693bd netfilter: nf_ct_tcp: don't scale the size of the window up twice
For a picked up connection, the window win is scaled twice: one is by the
initialization code, and the other is by the sender updating code.

I use the temporary variable swin instead of modifying the variable win.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-09 16:29:30 +02:00
Linus Torvalds
23f347ef63 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking updates from David Miller:

 1) Fix inaccuracies in network driver interface documentation, from Ben
    Hutchings.

 2) Fix handling of negative offsets in BPF JITs, from Jan Seiffert.

 3) Compile warning, locking, and refcounting fixes in netfilter's
    xt_CT, from Pablo Neira Ayuso.

 4) phonet sendmsg needs to validate user length just like any other
    datagram protocol, fix from Sasha Levin.

 5) Ipv6 multicast code uses wrong loop index, from RongQing Li.

 6) Link handling and firmware fixes in bnx2x driver from Yaniv Rosner
    and Yuval Mintz.

 7) mlx4 erroneously allocates 4 pages at a time, regardless of page
    size, fix from Thadeu Lima de Souza Cascardo.

 8) SCTP socket option wasn't extended in a backwards compatible way,
    fix from Thomas Graf.

 9) Add missing address change event emissions to bonding, from Shlomo
    Pongratz.

10) /proc/net/dev regressed because it uses a private offset to track
    where we are in the hash table, but this doesn't track the offset
    pullback that the seq_file code does resulting in some entries being
    missed in large dumps.

    Fix from Eric Dumazet.

11) do_tcp_sendpage() unloads the send queue way too fast, because it
    invokes tcp_push() when it shouldn't.  Let the natural sequence
    generated by the splice paths, and the assosciated MSG_MORE
    settings, guide the tcp_push() calls.

    Otherwise what goes out of TCP is spaghetti and doesn't batch
    effectively into GSO/TSO clusters.

    From Eric Dumazet.

12) Once we put a SKB into either the netlink receiver's queue or a
    socket error queue, it can be consumed and freed up, therefore we
    cannot touch it after queueing it like that.

    Fixes from Eric Dumazet.

13) PPP has this annoying behavior in that for every transmit call it
    immediately stops the TX queue, then calls down into the next layer
    to transmit the PPP frame.

    But if that next layer can take it immediately, it just un-stops the
    TX queue right before returning from the transmit method.

    Besides being useless work, it makes several facilities unusable, in
    particular things like the equalizers.  Well behaved devices should
    only stop the TX queue when they really are full, and in PPP's case
    when it gets backlogged to the downstream device.

    David Woodhouse therefore fixed PPP to not stop the TX queue until
    it's downstream can't take data any more.

14) IFF_UNICAST_FLT got accidently lost in some recent stmmac driver
    changes, re-add.  From Marc Kleine-Budde.

15) Fix link flaps in ixgbe, from Eric W. Multanen.

16) Descriptor writeback fixes in e1000e from Matthew Vick.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  net: fix a race in sock_queue_err_skb()
  netlink: fix races after skb queueing
  doc, net: Update ndo_start_xmit return type and values
  doc, net: Remove instruction to set net_device::trans_start
  doc, net: Update netdev operation names
  doc, net: Update documentation of synchronisation for TX multiqueue
  doc, net: Remove obsolete reference to dev->poll
  ethtool: Remove exception to the requirement of holding RTNL lock
  MAINTAINERS: update for Marvell Ethernet drivers
  bonding: properly unset current_arp_slave on slave link up
  phonet: Check input from user before allocating
  tcp: tcp_sendpages() should call tcp_push() once
  ipv6: fix array index in ip6_mc_add_src()
  mlx4: allocate just enough pages instead of always 4 pages
  stmmac: re-add IFF_UNICAST_FLT for dwmac1000
  bnx2x: Clear MDC/MDIO warning message
  bnx2x: Fix BCM57711+BCM84823 link issue
  bnx2x: Clear BCM84833 LED after fan failure
  bnx2x: Fix BCM84833 PHY FW version presentation
  bnx2x: Fix link issue for BCM8727 boards.
  ...
2012-04-06 10:37:38 -07:00
Eric Dumazet
110c43304d net: fix a race in sock_queue_err_skb()
As soon as an skb is queued into socket error queue, another thread
can consume it, so we are not allowed to reference skb anymore, or risk
use after free.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06 05:07:21 -04:00
Eric Dumazet
4a7e7c2ad5 netlink: fix races after skb queueing
As soon as an skb is queued into socket receive_queue, another thread
can consume it, so we are not allowed to reference skb anymore, or risk
use after free.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06 04:21:06 -04:00
Sasha Levin
bcf1b70ac6 phonet: Check input from user before allocating
A phonet packet is limited to USHRT_MAX bytes, this is never checked during
tx which means that the user can specify any size he wishes, and the kernel
will attempt to allocate that size.

In the good case, it'll lead to the following warning, but it may also cause
the kernel to kick in the OOM and kill a random task on the server.

[ 8921.744094] WARNING: at mm/page_alloc.c:2255 __alloc_pages_slowpath+0x65/0x730()
[ 8921.749770] Pid: 5081, comm: trinity Tainted: G        W    3.4.0-rc1-next-20120402-sasha #46
[ 8921.756672] Call Trace:
[ 8921.758185]  [<ffffffff810b2ba7>] warn_slowpath_common+0x87/0xb0
[ 8921.762868]  [<ffffffff810b2be5>] warn_slowpath_null+0x15/0x20
[ 8921.765399]  [<ffffffff8117eae5>] __alloc_pages_slowpath+0x65/0x730
[ 8921.769226]  [<ffffffff81179c8a>] ? zone_watermark_ok+0x1a/0x20
[ 8921.771686]  [<ffffffff8117d045>] ? get_page_from_freelist+0x625/0x660
[ 8921.773919]  [<ffffffff8117f3a8>] __alloc_pages_nodemask+0x1f8/0x240
[ 8921.776248]  [<ffffffff811c03e0>] kmalloc_large_node+0x70/0xc0
[ 8921.778294]  [<ffffffff811c4bd4>] __kmalloc_node_track_caller+0x34/0x1c0
[ 8921.780847]  [<ffffffff821b0e3c>] ? sock_alloc_send_pskb+0xbc/0x260
[ 8921.783179]  [<ffffffff821b3c65>] __alloc_skb+0x75/0x170
[ 8921.784971]  [<ffffffff821b0e3c>] sock_alloc_send_pskb+0xbc/0x260
[ 8921.787111]  [<ffffffff821b002e>] ? release_sock+0x7e/0x90
[ 8921.788973]  [<ffffffff821b0ff0>] sock_alloc_send_skb+0x10/0x20
[ 8921.791052]  [<ffffffff824cfc20>] pep_sendmsg+0x60/0x380
[ 8921.792931]  [<ffffffff824cb4a6>] ? pn_socket_bind+0x156/0x180
[ 8921.794917]  [<ffffffff824cb50f>] ? pn_socket_autobind+0x3f/0x90
[ 8921.797053]  [<ffffffff824cb63f>] pn_socket_sendmsg+0x4f/0x70
[ 8921.798992]  [<ffffffff821ab8e7>] sock_aio_write+0x187/0x1b0
[ 8921.801395]  [<ffffffff810e325e>] ? sub_preempt_count+0xae/0xf0
[ 8921.803501]  [<ffffffff8111842c>] ? __lock_acquire+0x42c/0x4b0
[ 8921.805505]  [<ffffffff821ab760>] ? __sock_recv_ts_and_drops+0x140/0x140
[ 8921.807860]  [<ffffffff811e07cc>] do_sync_readv_writev+0xbc/0x110
[ 8921.809986]  [<ffffffff811958e7>] ? might_fault+0x97/0xa0
[ 8921.811998]  [<ffffffff817bd99e>] ? security_file_permission+0x1e/0x90
[ 8921.814595]  [<ffffffff811e17e2>] do_readv_writev+0xe2/0x1e0
[ 8921.816702]  [<ffffffff810b8dac>] ? do_setitimer+0x1ac/0x200
[ 8921.818819]  [<ffffffff810e2ec1>] ? get_parent_ip+0x11/0x50
[ 8921.820863]  [<ffffffff810e325e>] ? sub_preempt_count+0xae/0xf0
[ 8921.823318]  [<ffffffff811e1926>] vfs_writev+0x46/0x60
[ 8921.825219]  [<ffffffff811e1a3f>] sys_writev+0x4f/0xb0
[ 8921.827127]  [<ffffffff82658039>] system_call_fastpath+0x16/0x1b
[ 8921.829384] ---[ end trace dffe390f30db9eb7 ]---

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05 19:05:56 -04:00
Eric Dumazet
35f9c09fe9 tcp: tcp_sendpages() should call tcp_push() once
commit 2f53384424 (tcp: allow splice() to build full TSO packets) added
a regression for splice() calls using SPLICE_F_MORE.

We need to call tcp_flush() at the end of the last page processed in
tcp_sendpages(), or else transmits can be deferred and future sends
stall.

Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but
with different semantic.

For all sendpage() providers, its a transparent change. Only
sock_sendpage() and tcp_sendpages() can differentiate the two different
flags provided by pipe_to_sendpage()

Reported-by: Tom Herbert <therbert@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail>com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05 19:04:27 -04:00
Linus Torvalds
5d32c88f0b Merge branch 'akpm' (Andrew's patch-bomb)
Merge batch of fixes from Andrew Morton:
 "The simple_open() cleanup was held back while I wanted for laggards to
  merge things.

  I still need to send a few checkpoint/restore patches.  I've been
  wobbly about merging them because I'm wobbly about the overall
  prospects for success of the project.  But after speaking with Pavel
  at the LSF conference, it sounds like they're further toward
  completion than I feared - apparently davem is at the "has stopped
  complaining" stage regarding the net changes.  So I need to go back
  and re-review those patchs and their (lengthy) discussion."

* emailed from Andrew Morton <akpm@linux-foundation.org>: (16 patches)
  memcg swap: use mem_cgroup_uncharge_swap fix
  backlight: add driver for DA9052/53 PMIC v1
  C6X: use set_current_blocked() and block_sigmask()
  MAINTAINERS: add entry for sparse checker
  MAINTAINERS: fix REMOTEPROC F: typo
  alpha: use set_current_blocked() and block_sigmask()
  simple_open: automatically convert to simple_open()
  scripts/coccinelle/api/simple_open.cocci: semantic patch for simple_open()
  libfs: add simple_open()
  hugetlbfs: remove unregister_filesystem() when initializing module
  drivers/rtc/rtc-88pm860x.c: fix rtc irq enable callback
  fs/xattr.c:setxattr(): improve handling of allocation failures
  fs/xattr.c:listxattr(): fall back to vmalloc() if kmalloc() failed
  fs/xattr.c: suppress page allocation failure warnings from sys_listxattr()
  sysrq: use SEND_SIG_FORCED instead of force_sig()
  proc: fix mount -t proc -o AAA
2012-04-05 15:30:34 -07:00