Commit graph

3829 commits

Author SHA1 Message Date
Florian Westphal
36c7778218 inet: frag: constify match, hashfn and constructor arguments
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-27 22:34:35 -07:00
Li RongQing
ac3d2e5a9e ipv6: remove obsolete comment in ip6_append_data()
After 11878b40e[net-timestamp: SOCK_RAW and PING timestamping], this comment
becomes obsolete since the codes check not only UDP socket, but also RAW sock;
and the codes are clear, not need the comments

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-24 23:47:04 -07:00
Himangi Saraogi
56ec0fb10c neigh: remove exceptional & on function name
In this file, function names are otherwise used as pointers without &.

A simplified version of the Coccinelle semantic patch that makes this
change is as follows:

// <smpl>
@r@
identifier f;
@@

f(...) { ... }

@@
identifier r.f;
@@

- &f
+ f
// </smpl>

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-24 23:23:31 -07:00
Sorin Dumitru
274f482d33 sock: remove skb argument from sk_rcvqueues_full
It hasn't been used since commit 0fd7bac(net: relax rcvbuf limits).

Signed-off-by: Sorin Dumitru <sorin@returnze.ro>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-23 13:23:06 -07:00
David S. Miller
8fd90bb889 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/infiniband/hw/cxgb4/device.c

The cxgb4 conflict was simply overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22 00:44:59 -07:00
David S. Miller
a8138f42d4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains updates for your net-next tree,
they are:

1) Use kvfree() helper function from x_tables, from Eric Dumazet.

2) Remove extra timer from the conntrack ecache extension, use a
   workqueue instead to redeliver lost events to userspace instead,
   from Florian Westphal.

3) Removal of the ulog targets for ebtables and iptables. The nflog
   infrastructure superseded this almost 9 years ago, time to get rid
   of this code.

4) Replace the list of loggers by an array now that we can only have
   two possible non-overlapping logger flavours, ie. kernel ring buffer
   and netlink logging.

5) Move Eric Dumazet's log buffer code to nf_log to reuse it from
   all of the supported per-family loggers.

6) Consolidate nf_log_packet() as an unified interface for packet logging.
   After this patch, if the struct nf_loginfo is available, it explicitly
   selects the logger that is used.

7) Move ip and ip6 logging code from xt_LOG to the corresponding
   per-family loggers. Thus, x_tables and nf_tables share the same code
   for packet logging.

8) Add generic ARP packet logger, which is used by nf_tables. The
   format aims to be consistent with the output of xt_LOG.

9) Add generic bridge packet logger. Again, this is used by nf_tables
   and it routes the packets to the real family loggers. As a result,
   we get consistent logging format for the bridge family. The ebt_log
   logging code has been intentionally left in place not to break
   backward compatibility since the logging output differs from xt_LOG.

10) Update nft_log to explicitly request the required family logger when
    needed.

11) Finish nft_log so it supports arp, ip, ip6, bridge and inet families.
    Allowing selection between netlink and kernel buffer ring logging.

12) Several fixes coming after the netfilter core logging changes spotted
    by robots.

13) Use IS_ENABLED() macros whenever possible in the netfilter tree,
    from Duan Jiong.

14) Removal of a couple of unnecessary branch before kfree, from Fabian
    Frederick.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-20 21:01:43 -07:00
David Held
2dc41cff75 udp: Use hash2 for long hash1 chains in __udp*_lib_mcast_deliver.
Many multicast sources can have the same port which can result in a very
large list when hashing by port only. Hash by address and port instead
if this is the case. This makes multicast more similar to unicast.

On a 24-core machine receiving from 500 multicast sockets on the same
port, before this patch 80% of system CPU was used up by spin locking
and only ~25% of packets were successfully delivered.

With this patch, all packets are delivered and kernel overhead is ~8%
system CPU on spinlocks.

Signed-off-by: David Held <drheld@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 23:29:52 -07:00
David Held
5cf3d46192 udp: Simplify __udp*_lib_mcast_deliver.
Switch to using sk_nulls_for_each which shortens the code and makes it
easier to update.

Signed-off-by: David Held <drheld@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 23:29:52 -07:00
Jerry Chu
c3caf1192f net-gre-gro: Fix a bug that breaks the forwarding path
Fixed a bug that was introduced by my GRE-GRO patch
(bf5a755f5e net-gre-gro: Add GRE
support to the GRO stack) that breaks the forwarding path
because various GSO related fields were not set. The bug will
cause on the egress path either the GSO code to fail, or a
GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
fix has been tested for both cases.

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 14:45:26 -07:00
David S. Miller
1a98c69af1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 14:09:34 -07:00
Willem de Bruijn
11878b40ed net-timestamp: SOCK_RAW and PING timestamping
Add SO_TIMESTAMPING to sockets of type PF_INET[6]/SOCK_RAW:

Add the necessary sock_tx_timestamp calls to the datapath for RAW
sockets (ping sockets already had these calls).

Fix the IP output path to pass the timestamp flags on the first
fragment also for these sockets. The existing code relies on
transhdrlen != 0 to indicate a first fragment. For these sockets,
that assumption does not hold.

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=77221

Tested SOCK_RAW on IPv4 and IPv6, not PING.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15 16:32:45 -07:00
Fabian Frederick
66635dc121 ipv6: remove unnecessary break after return
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15 16:27:01 -07:00
Fabian Frederick
d518825eab netfilter: remove unnecessary break after return
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15 16:27:00 -07:00
Tom Gundersen
c835a67733 net: set name_assign_type in alloc_netdev()
Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
all users to pass NET_NAME_UNKNOWN.

Coccinelle patch:

@@
expression sizeof_priv, name, setup, txqs, rxqs, count;
@@

(
-alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
+alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
|
-alloc_netdev_mq(sizeof_priv, name, setup, count)
+alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
|
-alloc_netdev(sizeof_priv, name, setup)
+alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
)

v9: move comments here from the wrong commit

Signed-off-by: Tom Gundersen <teg@jklm.no>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15 16:12:48 -07:00
Himangi Saraogi
e3f0b86b99 ipv6: Use BUG_ON
The semantic patch that makes this transformation is as follows:

// <smpl>
@@ expression e; @@
-if (e) BUG();
+BUG_ON(e);
// </smpl>

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-11 15:06:38 -07:00
Himangi Saraogi
8242fc3392 net: ipv6: Use BUG_ON
The semantic patch that makes the transformation is as follows:

// <smpl>
@@ expression e; @@
-if (e) BUG();
+BUG_ON(e);
// </smpl>

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-11 15:06:38 -07:00
Jiri Pirko
bc91b0f07a ipv6: addrconf: implement address generation modes
This patch introduces a possibility for userspace to set various (so far
two) modes of generating addresses. This is useful for example for
NetworkManager because it can set the mode to NONE and take care of link
local addresses itself. That allow it to have the interface up,
monitoring carrier but still don't have any addresses on it.

One more use-case by Dan Williams:
<quote>
WWAN devices often have their LL address provided by the firmware of the
device, which sometimes refuses to respond to incorrect LL addresses
when doing DHCPv6 or IPv6 ND.  The kernel cannot generate the correct LL
address for two reasons:

1) WWAN pseudo-ethernet interfaces often construct a fake MAC address,
or read a meaningless MAC address from the firmware.  Thus the EUI64 and
the IPv6LL address the kernel assigns will be wrong.  The real LL
address is often retrieved from the firmware with AT or proprietary
commands.

2) WWAN PPP interfaces receive their LL address from IPV6CP, not from
kernel assignments.  Only after IPV6CP has completed do we know the LL
address of the PPP interface and its peer.  But the kernel has already
assigned an incorrect LL address to the interface.

So being able to suppress the kernel LL address generation and assign
the one retrieved from the firmware is less complicated and more robust.
</quote>

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-11 15:05:45 -07:00
Li RongQing
b642881719 ipv6: fix the check when handle RA
d9333196572(ipv6:  Allow accepting RA from local IP addresses.) made the wrong
check, whether or not to accept RA with source-addr found on local machine, when
accept_ra_from_local is 0.

Fixes: d9333196572(ipv6:  Allow accepting RA from local IP addresses.)
Cc: Ben Greear <greearb@candelatech.com>
Cc: Hannes Frederic Sowa <hannes@redhat.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-10 16:56:33 -07:00
Tom Herbert
cb1ce2ef38 ipv6: Implement automatic flow label generation on transmit
Automatically generate flow labels for IPv6 packets on transmit.
The flow label is computed based on skb_get_hash. The flow label will
only automatically be set when it is zero otherwise (i.e. flow label
manager hasn't set one). This supports the transmit side functionality
of RFC 6438.

Added an IPv6 sysctl auto_flowlabels to enable/disable this behavior
system wide, and added IPV6_AUTOFLOWLABEL socket option to enable this
functionality per socket.

By default, auto flowlabels are disabled to avoid possible conflicts
with flow label manager, however if this feature proves useful we
may want to enable it by default.

It should also be noted that FreeBSD has already implemented automatic
flow labels (including the sysctl and socket option). In FreeBSD,
automatic flow labels default to enabled.

Performance impact:

Running super_netperf with 200 flows for TCP_RR and UDP_RR for
IPv6. Note that in UDP case, __skb_get_hash will be called for
every packet with explains slight regression. In the TCP case
the hash is saved in the socket so there is no regression.

Automatic flow labels disabled:

  TCP_RR:
    86.53% CPU utilization
    127/195/322 90/95/99% latencies
    1.40498e+06 tps

  UDP_RR:
    90.70% CPU utilization
    118/168/243 90/95/99% latencies
    1.50309e+06 tps

Automatic flow labels enabled:

  TCP_RR:
    85.90% CPU utilization
    128/199/337 90/95/99% latencies
    1.40051e+06

  UDP_RR
    92.61% CPU utilization
    115/164/236 90/95/99% latencies
    1.4687e+06

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert
b73c3d0e4f net: Save TX flow hash in sock and set in skbuf on xmit
For a connected socket we can precompute the flow hash for setting
in skb->hash on output. This is a performance advantage over
calculating the skb->hash for every packet on the connection. The
computation is done using the common hash algorithm to be consistent
with computations done for packets of the connection in other states
where thers is no socket (e.g. time-wait, syn-recv, syn-cookies).

This patch adds sk_txhash to the sock structure. inet_set_txhash and
ip6_set_txhash functions are added which are called from points in
TCP and UDP where socket moves to established state.

skb_set_hash_from_sk is a function which sets skb->hash from the
sock txhash value. This is called in UDP and TCP transmit path when
transmitting within the context of a socket.

Tested: ran super_netperf with 200 TCP_RR streams over a vxlan
interface (in this case skb_get_hash called on every TX packet to
create a UDP source port).

Before fix:

  95.02% CPU utilization
  154/256/505 90/95/99% latencies
  1.13042e+06 tps

  Time in functions:
    0.28% skb_flow_dissect
    0.21% __skb_get_hash

After fix:

  94.95% CPU utilization
  156/254/485 90/95/99% latencies
  1.15447e+06

  Neither __skb_get_hash nor skb_flow_dissect appear in perf

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Neal Cardwell
86c6a2c75a tcp: switch snt_synack back to measuring transmit time of first SYNACK
Always store in snt_synack the time at which the server received the
first client SYN and attempted to send the first SYNACK.

Recent commit aa27fc501 ("tcp: tcp_v[46]_conn_request: fix snt_synack
initialization") resolved an inconsistency between IPv4 and IPv6 in
the initialization of snt_synack. This commit brings back the idea
from 843f4a55e (tcp: use tcp_v4_send_synack on first SYN-ACK), which
was going for the original behavior of snt_synack from the commit
where it was added in 9ad7c049f0 ("tcp: RFC2988bis + taking RTT
sample from 3WHS for the passive open side") in v3.1.

In addition to being simpler (and probably a tiny bit faster),
unconditionally storing the time of the first SYNACK attempt has been
useful because it allows calculating a performance metric quantifying
how long it took to establish a passive TCP connection.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Cc: Octavian Purdila <octavian.purdila@intel.com>
Cc: Jerry Chu <hkchu@google.com>
Acked-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 19:26:37 -07:00
Eric Dumazet
9fe516ba3f inet: move ipv6only in sock_common
When an UDP application switches from AF_INET to AF_INET6 sockets, we
have a small performance degradation for IPv4 communications because of
extra cache line misses to access ipv6only information.

This can also be noticed for TCP listeners, as ipv6_only_sock() is also
used from __inet_lookup_listener()->compute_score()

This is magnified when SO_REUSEPORT is used.

Move ipv6only into struct sock_common so that it is available at
no extra cost in lookups.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-01 23:46:21 -07:00
Ben Greear
d933319657 ipv6: Allow accepting RA from local IP addresses.
This can be used in virtual networking applications, and
may have other uses as well.  The option is disabled by
default.

A specific use case is setting up virtual routers, bridges, and
hosts on a single OS without the use of network namespaces or
virtual machines.  With proper use of ip rules, routing tables,
veth interface pairs and/or other virtual interfaces,
and applications that can bind to interfaces and/or IP addresses,
it is possibly to create one or more virtual routers with multiple
hosts attached.  The host interfaces can act as IPv6 systems,
with radvd running on the ports in the virtual routers.  With the
option provided in this patch enabled, those hosts can now properly
obtain IPv6 addresses from the radvd.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-01 12:16:24 -07:00
Ben Greear
f2a762d8a9 ipv6: Add more debugging around accept-ra logic.
This is disabled by default, just like similar debug info
already in this module.  But, makes it easier to find out
why RA is not being accepted when debugging strange behaviour.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-01 12:16:24 -07:00
Duan Jiong
24de3d3775 netfilter: use IS_ENABLED() macro
replace:
 #if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
with
 #if IS_ENABLED(CONFIG_NF_CT_NETLINK)

replace:
 #if !defined(CONFIG_NF_NAT) && !defined(CONFIG_NF_NAT_MODULE)
with
 #if !IS_ENABLED(CONFIG_NF_NAT)

replace:
 #if !defined(CONFIG_NF_CONNTRACK) && !defined(CONFIG_NF_CONNTRACK_MODULE)
with
 #if !IS_ENABLED(CONFIG_NF_CONNTRACK)

And add missing:
 IS_ENABLED(CONFIG_NF_CT_NETLINK)

in net/ipv{4,6}/netfilter/nf_nat_l3proto_ipv{4,6}.c

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-30 11:38:03 +02:00
Pablo Neira Ayuso
c1878869c0 netfilter: fix several Kconfig problems in NF_LOG_*
warning: (NETFILTER_XT_TARGET_LOG) selects NF_LOG_IPV6 which has unmet direct dependencies (NET && INET && IPV6 && NETFILTER && IP6_NF_IPTABLES && NETFILTER_ADVANCED)
warning: (NF_LOG_IPV4 && NF_LOG_IPV6) selects NF_LOG_COMMON which has unmet direct dependencies (NET && INET && NETFILTER && NF_CONNTRACK)

Fixes: 83e96d4 ("netfilter: log: split family specific code to nf_log_{ip,ip6,common}.c files")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-28 18:49:49 +02:00
Octavian Purdila
1fb6f159fd tcp: add tcp_conn_request
Create tcp_conn_request and remove most of the code from
tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:37 -07:00
Octavian Purdila
695da14eb0 tcp: add queue_add_hash to tcp_request_sock_ops
Add queue_add_hash member to tcp_request_sock_ops so that we can later
unify tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
2aec4a297b tcp: add mss_clamp to tcp_request_sock_ops
Add mss_clamp member to tcp_request_sock_ops so that we can later
unify tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
5db92c9949 tcp: unify tcp_v4_rtx_synack and tcp_v6_rtx_synack
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
d6274bd8d6 tcp: add send_synack method to tcp_request_sock_ops
Create a new tcp_request_sock_ops method to unify the IPv4/IPv6
signature for tcp_v[46]_send_synack. This allows us to later unify
tcp_v4_rtx_synack with tcp_v6_rtx_synack and tcp_v4_conn_request with
tcp_v4_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
936b8bdb53 tcp: add init_seq method to tcp_request_sock_ops
More work in preparation of unifying tcp_v4_conn_request and
tcp_v6_conn_request: indirect the init sequence calls via the
tcp_request_sock_ops.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
9403715977 tcp: move around a few calls in tcp_v6_conn_request
Make the tcp_v6_conn_request calls flow similar with that of
tcp_v4_conn_request.

Note that want_cookie can be true only if isn is zero and that is why
we can move the if (want_cookie) block out of the if (!isn) block.

Moving security_inet_conn_request() has a couple of side effects:
missing inet_rsk(req)->ecn_ok update and the req->cookie_ts
update. However, neither SELinux nor Smack security hooks seems to
check them. This change should also avoid future different behaviour
for IPv4 and IPv6 in the security hooks.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
d94e0417ad tcp: add route_req method to tcp_request_sock_ops
Create wrappers with same signature for the IPv4/IPv6 request routing
calls and use these wrappers (via route_req method from
tcp_request_sock_ops) in tcp_v4_conn_request and tcp_v6_conn_request
with the purpose of unifying the two functions in a later patch.

We can later drop the wrapper functions and modify inet_csk_route_req
and inet6_cks_route_req to use the same signature.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
fb7b37a7f3 tcp: add init_cookie_seq method to tcp_request_sock_ops
Move the specific IPv4/IPv6 cookie sequence initialization to a new
method in tcp_request_sock_ops in preparation for unifying
tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:35 -07:00
Octavian Purdila
16bea70aa7 tcp: add init_req method to tcp_request_sock_ops
Move the specific IPv4/IPv6 intializations to a new method in
tcp_request_sock_ops in preparation for unifying tcp_v4_conn_request
and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:35 -07:00
Octavian Purdila
476eab8251 net: remove inet6_reqsk_alloc
Since pktops is only used for IPv6 only and opts is used for IPv4
only, we can move these fields into a union and this allows us to drop
the inet6_reqsk_alloc function as after this change it becomes
equivalent with inet_reqsk_alloc.

This patch also fixes a kmemcheck issue in the IPv6 stack: the flags
field was not annotated after a request_sock was allocated.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:35 -07:00
Octavian Purdila
aa27fc5018 tcp: tcp_v[46]_conn_request: fix snt_synack initialization
Commit 016818d07 (tcp: TCP Fast Open Server - take SYNACK RTT after
completing 3WHS) changes the code to only take a snt_synack timestamp
when a SYNACK transmit or retransmit succeeds. This behaviour is later
broken by commit 843f4a55e (tcp: use tcp_v4_send_synack on first
SYN-ACK), as snt_synack is now updated even if tcp_v4_send_synack
fails.

Also, commit 3a19ce0ee (tcp: IPv6 support for fastopen server) misses
the required IPv6 updates for 016818d07.

This patch makes sure that snt_synack is updated only when the SYNACK
trasnmit/retransmit succeeds, for both IPv4 and IPv6.

Cc: Cardwell <ncardwell@google.com>
Cc: Daniel Lee <longinus00@gmail.com>
Cc: Yuchung Cheng <ycheng@google.com>

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:35 -07:00
Pablo Neira Ayuso
fab4085f4e netfilter: log: nf_log_packet() as real unified interface
Before this patch, the nf_loginfo parameter specified the logging
configuration in case the specified default logger was loaded. This
patch updates the semantics of the nf_loginfo parameter in
nf_log_packet() which now indicates the logger that you explicitly
want to use.

Thus, nf_log_packet() is exposed as an unified interface which
internally routes the log message to the corresponding logger type
by family.

The module dependencies are expressed by the new nf_logger_find_get()
and nf_logger_put() functions which bump the logger module refcount.
Thus, you can not remove logger modules that are used by rules anymore.

Another important effect of this change is that the family specific
module is only loaded when required. Therefore, xt_LOG and nft_log
will just trigger the autoload of the nf_log_{ip,ip6} modules
according to the family.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-27 13:20:13 +02:00
Pablo Neira Ayuso
83e96d443b netfilter: log: split family specific code to nf_log_{ip,ip6,common}.c files
The plain text logging is currently embedded into the xt_LOG target.
In order to be able to use the plain text logging from nft_log, as a
first step, this patch moves the family specific code to the following
files and Kconfig symbols:

1) net/ipv4/netfilter/nf_log_ip.c: CONFIG_NF_LOG_IPV4
2) net/ipv6/netfilter/nf_log_ip6.c: CONFIG_NF_LOG_IPV6
3) net/netfilter/nf_log_common.c: CONFIG_NF_LOG_COMMON

These new modules will be required by xt_LOG and nft_log. This patch
is based on original patch from Arturo Borrero Gonzalez.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-27 13:19:59 +02:00
Hangbin Liu
e940f5d6ba ipv6: Fix MLD Query message check
Based on RFC3810 6.2, we also need to check the hop limit and router alert
option besides source address.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 00:21:50 -07:00
James M Leddy
3e215c8d1b udp: Add MIB counters for rcvbuferrors
Add MIB counters for rcvbuferrors in UDP to help diagnose problems.

Signed-off-by: James M Leddy <james.leddy@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 00:20:55 -07:00
Octavian Purdila
e0f802fbca tcp: move ir_mark initialization to tcp_openreq_init
ir_mark initialization is done for both TCP v4 and v6, move it in the
common tcp_openreq_init function.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-17 15:30:54 -07:00
David S. Miller
902455e007 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/core/rtnetlink.c
	net/core/skbuff.c

Both conflicts were very simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 16:02:55 -07:00
huizhang
f6c20c596f net: ipv6: Fixed up ipsec packet be re-routing issue
Bug report on https://bugzilla.kernel.org/show_bug.cgi?id=75781

When a local output ipsec packet match the mangle table rule,
and be set mark value, the packet will be route again in
route_me_harder -> _session_decoder6

In this case, the nhoff in CB of skb was still the default
value 0. So the protocal match can't success and the packet can't match
correct SA rule,and then the packet be send out in plaintext.

To fixed up the issue. The CB->nhoff must be set.

Signed-off-by: Hui Zhang <huizhang@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:47:31 -07:00
Dmitry Popov
2346829e64 ipip, sit: fix ipv4_{update_pmtu,redirect} calls
ipv4_{update_pmtu,redirect} were called with tunnel's ifindex (t->dev is a
tunnel netdevice). It caused wrong route lookup and failure of pmtu update or
redirect. We should use the same ifindex that we use in ip_route_output_* in
*tunnel_xmit code. It is t->parms.link .

Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 23:35:52 -07:00
Sven Wegener
9e89fd8b7d ipv6: Shrink udp_v6_mcast_next() to one socket variable
To avoid the confusion of having two variables, shrink the function to
only use the parameter variable for looping.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-05 16:23:08 -07:00
David S. Miller
f666f87b94 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/xen-netback/netback.c
	net/core/filter.c

A filter bug fix overlapped some cleanups and a conversion
over to some new insn generation macros.

A xen-netback bug fix overlapped the addition of multi-queue
support.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-05 16:22:02 -07:00
Tom Herbert
4749c09c37 gre: Call gso_make_checksum
Call gso_make_checksum. This should have the benefit of using a
checksum that may have been previously computed for the packet.

This also adds NETIF_F_GSO_GRE_CSUM to differentiate devices that
offload GRE GSO with and without the GRE checksum offloaed.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04 22:46:38 -07:00
Tom Herbert
0f4f4ffa7b net: Add GSO support for UDP tunnels with checksum
Added a new netif feature for GSO_UDP_TUNNEL_CSUM. This indicates
that a device is capable of computing the UDP checksum in the
encapsulating header of a UDP tunnel.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04 22:46:38 -07:00