Commit graph

196 commits

Author SHA1 Message Date
Al Viro
1c7d1fc149 [SCTP]: Switch sctp_endpoint_is_match() to net-endian.
The only caller (__sctp_rcv_lookup_endpoint()) also switched,
its caller adjusted

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:40 -08:00
Al Viro
c9a08505ec [SCTP]: Switch sctp_del_bind_addr() to net-endian.
Callers adjusted.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:39 -08:00
Al Viro
63de08f45b [SCTP]: Switch address inside the heartbeat opaque data to net-endian.
Its only use happens on the same host, when it gets quoted back to
us.  So we are free to flip to net-endian and avoid extra PITA.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:38 -08:00
Al Viro
be29681edf [SCTP]: Switch sctp_assoc_lookup_paddr() to net-endian.
Callers updated.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:37 -08:00
Al Viro
38a03145ef [SCTP]: sctp_assoc_del_peer() switched to net-endian.
Callers adjusted.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:36 -08:00
Al Viro
854d43a465 [SCTP]: Annotate ->dst_saddr()
switched to taking a pointer to net-endian sctp_addr
and a net-endian port number.  Instances and callers
adjusted; interestingly enough, the only calls are
direct calls of specific instances - the method is not
used at all.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:35 -08:00
Al Viro
acd2bc96e1 [SCTP]: Switch ->primary_addr to net-endian.
Users adjusted.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:34 -08:00
Al Viro
7e1e4a2b9d [SCTP]: Switch sctp_bind_addr_match() to net-endian.
Callers adjusted.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:33 -08:00
Al Viro
5f242a13e8 [SCTP]: Switch ->cmp_addr() and sctp_cmp_addr_exact() to net-endian.
instances of ->cmp_addr() are fine with switching both arguments
to net-endian; callers other than in sctp_cmp_addr_exact() (both
as ->cmp_addr(...) and direct calls of instances) adjusted;
sctp_cmp_addr_exact() switched to net-endian itself and adjustment
is done in its callers

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:32 -08:00
Al Viro
c604e368a4 [SCTP]: Pass net-endian to ->seq_dump_addr()
No actual modifications of method instances are needed -
they don't look at port numbers.  Switch callers...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:31 -08:00
Al Viro
2a6fd78ade [SCTP] embedded sctp_addr: net-endian mirrors
Add sctp_chunk->source, sctp_sockaddr_entry->a, sctp_transport->ipaddr
and sctp_transport->saddr, maintain them as net-endian mirrors of
their host-endian counterparts.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:30 -08:00
Al Viro
09ef7fecea [SCTP]: Beginning of conversion to net-endian for embedded sctp_addr.
Part 1: rename sctp_chunk->source, sctp_sockaddr_entry->a,
sctp_transport->ipaddr and sctp_transport->saddr (to ..._h)

The next patch will reintroduce these fields and keep them as
net-endian mirrors of the original (renamed) ones.  Split in
two patches to make sure that we hadn't forgotten any instanes.

Later in the series we'll eliminate uses of host-endian variants
(basically switching users to net-endian counterparts as we
progress through that mess).  Then host-endian ones will die.

Other embedded host-endian sctp_addr will be easier to switch
directly, so we leave them alone for now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:29 -08:00
Al Viro
30330ee00c [SCTP] bug: endianness problem in sctp_getsockopt_sctp_status()
Again, invalid sockaddr passed to userland - host-endiand sin_port.
Potential leak, again, but less dramatic than in previous case.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:28 -08:00
Al Viro
0906e20fa0 [SCTP] bug: sctp_assoc_control_transport() breakage
a) struct sockaddr_storage * passed to sctp_ulpevent_make_peer_addr_change()
actually points at union sctp_addr field in a structure.  Then that sucker
gets copied to userland, with whatever junk we might have there.

b) it's actually having host-endian sin_port.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:27 -08:00
Al Viro
d5c747f6ef [SCTP] bug: sctp_find_unmatch_addr() compares net-endian to host-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:26 -08:00
Al Viro
39940a48c4 [SCTP] bug: sctp_assoc_lookup_laddr() is broken with ipv6.
It expects (and gets) laddr with net-endian sin_port.  And then it calls
sctp_bind_addr_match(), which *does* care about port numbers in case of
ipv6 and expects them to be host-endian.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:25 -08:00
Al Viro
04afd8b282 [SCTP]: Beginning of sin_port fixes.
That's going to be a long series.  Introduced temporary helpers
doing copy-and-convert for sctp_addr; they are used to kill
flip-in-place in global data structures and will be used
to gradually push host-endian uses of sctp_addr out of existence.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:24 -08:00
Al Viro
dbc16db1e5 [SCTP]: Trivial sctp endianness annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:23 -08:00
Al Viro
5be291fe2d [SCTP]: SCTP_CMD_ASSOC_FAILED annotations.
also always get __be16 protocol error; switch to SCTP_PERR()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:21 -08:00
Al Viro
dc251b2b1c [SCTP]: SCTP_CMD_INIT_FAILED annotations.
argument stored for SCTP_CMD_INIT_FAILED is always __be16
(protocol error).  Introduced new field and accessor for
it (SCTP_PERR()); switched to their use (from SCTP_U32() and
.u32)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:20 -08:00
Al Viro
f94c0198dd [SCTP]: sctp_stop_t1_and_abort() annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:19 -08:00
Al Viro
63706c5c6f [SCTP]: sctp_make_op_error() annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:18 -08:00
Al Viro
5bf2db0390 [SCTP]: Annotate sctp_init_cause().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:17 -08:00
Peter Zijlstra
1ed176a801 [SCTP]: Cleanup of the sctp state table code.
I noticed an insane high density of repeated characters fixable by a
simple regular expression:

  % s/{.fn = \([^,]*\),[[:space:]]\+\(\\\n[[:space:]]\+\)\?.name = "\1"}/TYPE_SCTP_FUNC(\1)/g

(NOTE: the .name for .fn = sctp_sf_do_9_2_start_shutdown didn't match)

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:26 -08:00
David S. Miller
931731123a [TCP]: Don't set SKB owner in tcp_transmit_skb().
The data itself is already charged to the SKB, doing
the skb_set_owner_w() just generates a lot of noise and
extra atomics we don't really need.

Lmbench improvements on lat_tcp are minimal:

before:
TCP latency using localhost: 23.2701 microseconds
TCP latency using localhost: 23.1994 microseconds
TCP latency using localhost: 23.2257 microseconds

after:
TCP latency using localhost: 22.8380 microseconds
TCP latency using localhost: 22.9465 microseconds
TCP latency using localhost: 22.8462 microseconds

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:52 -08:00
Vlad Yasevich
b68dbcab1d [SCTP]: Fix warning
An alternate solution would be to make the digest a pointer, allocate
it in sctp_endpoint_init() and free it in sctp_endpoint_destroy().

I guess I should have originally done it this way...

  CC [M]  net/sctp/sm_make_chunk.o
net/sctp/sm_make_chunk.c: In function 'sctp_unpack_cookie':
net/sctp/sm_make_chunk.c:1358: warning: initialization discards qualifiers from pointer target type

The reason is that sctp_unpack_cookie() takes a const struct
sctp_endpoint and modifies the digest in it (digest being embedded in
the struct, not a pointer).  Make digest a pointer to fix this
warning.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:47 -08:00
Al Viro
04ce69093f [IPV6]: 'info' argument of ipv6 ->err_handler() is net-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:12 -08:00
David Howells
c4028958b6 WorkStruct: make allyesconfig
Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-11-22 14:57:56 +00:00
Vlad Yasevich
de76e695a5 [SCTP]: Remove temporary associations from backlog and hash.
Every time SCTP creates a temporary association, the stack hashes it,
puts it on a list of endpoint associations and increments the backlog.
However, the lifetime of a temporary association is the processing time
of a current packet and it's destroyed after that. In fact, we don't
really want anyone else finding this association. There is no reason to
do this extra work.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-30 18:55:11 -08:00
Vlad Yasevich
4f4443088b [SCTP]: Correctly set IP id for SCTP traffic
Make SCTP 1-1 style and peeled-off associations behave like TCP when
setting IP id. In both cases, we set the inet_sk(sk)->daddr and initialize
inet_sk(sk)->id to a random value.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-30 18:54:32 -08:00
Herbert Xu
28cd775273 [SCTP]: Always linearise packet on input
I was looking at a RHEL5 bug report involving Xen and SCTP
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212550).
It turns out that SCTP wasn't written to handle skb fragments at
all.  The absence of any calls to skb_may_pull is testament to
that.

It just so happens that Xen creates fragmented packets more often
than other scenarios (header & data split when going from domU to
dom0).  That's what caused this bug to show up.

Until someone has the time sits down and audits the entire net/sctp
directory, here is a conservative and safe solution that simply
linearises all packets on input.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-30 15:24:39 -08:00
Ville Nuorvala
4251320fa2 [IPV6]: Make sure error handling is done when calling ip6_route_output().
As ip6_route_output() never returns NULL, error checking must be done by
looking at dst->error in stead of comparing dst against NULL.

Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-18 19:55:27 -07:00
Ville Nuorvala
23c435f7ff [SCTP]: Fix minor typo
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-18 19:55:26 -07:00
Vlad Yasevich
6aa2551cf1 [SCTP]: Fix the RX queue size shown in /proc/net/sctp/assocs output.
Show the true receive buffer usage.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-11 23:59:46 -07:00
Vlad Yasevich
331c4ee7fa [SCTP]: Fix receive buffer accounting.
When doing receiver buffer accounting, we always used skb->truesize.
This is problematic when processing bundled DATA chunks because for
every DATA chunk that could be small part of one large skb, we would
charge the size of the entire skb.  The new approach is to store the
size of the DATA chunk we are accounting for in the sctp_ulpevent
structure and use that stored value for accounting.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-11 23:59:44 -07:00
Vlad Yasevich
f236218b72 [SCTP]: Do not timestamp every SCTP packet.
We only need the timestamp on COOKIE-ECHO chunks, so instead of always
timestamping every SCTP packet, let common code timestamp if the socket
option is set.  For COOKIE-ECHO, simply get the time of day if we don't
have a timestamp.  This introduces a small possibility that the cookie
may be considered expired, but it will be renegotiated.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-29 17:10:03 -07:00
Vlad Yasevich
b56bab46f3 [SCTP]: Use correct mask when disabling PMTUD.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-29 17:09:34 -07:00
Sridhar Samudrala
cd49788563 [SCTP]: Include sk_buff overhead while updating the peer's receive window.
Currently if the sender is sending small messages, it can cause a receiver
to run out of receive buffer space even when the advertised receive window
is still open and results in packet drops and retransmissions. Including
a overhead while updating the sender's view of peer receive window will
reduce the chances of receive buffer space overshooting the receive window.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-29 17:09:05 -07:00
Sridhar Samudrala
208edef6a5 [SCTP]: Enable Nagle algorithm by default.
This allows more aggressive bundling of chunks when sending small
messages.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-29 17:08:01 -07:00
YOSHIFUJI Hideaki
8814c4b533 [IPV6] ADDRCONF: Convert addrconf_lock to RCU.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:20:26 -07:00
Adrian Bunk
1616436601 [SCTP]: Cleanups
This patch contains the following cleanups:
- make the following needlessly global function static:
  - socket.c: sctp_apply_peer_addr_params()
- add proper prototypes for the several global functions in
  include/net/sctp/sctp.h

Note that this fixes wrong prototypes for the following functions:
- sctp_snmp_proc_exit()
- sctp_eps_proc_exit()
- sctp_assocs_proc_exit()

The latter was spotted by the GNU C compiler and reported
by David Woodhouse.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:19:03 -07:00
Brian Haley
4cbf1cae9f [SCTP]: Change globals to __read_mostly
Change sctp globals to __read_mostly.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:18:53 -07:00
Dmitry Mishin
fda9ef5d67 [NET]: Fix sk->sk_filter field access
Function sk_filter() is called from tcp_v{4,6}_rcv() functions with arg
needlock = 0, while socket is not locked at that moment. In order to avoid
this and similar issues in the future, use rcu for sk->sk_filter field read
protection.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
2006-09-22 15:18:47 -07:00
Vladislav Yasevich
3fd091e73b [SCTP]: Remove multiple levels of msecs to jiffies conversions.
The SCTP sysctl entries are displayed in milliseconds, but stored
internally in jiffies. This results in multiple levels of msecs to
jiffies conversion and as a result produces a truncation error. This
patch makes things consistent in that we store and display defaults
in milliseconds and only convert once for use by association.
This patch also adds some sane min/max values so that we don't go off
the deep end.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:55:39 -07:00
Sridhar Samudrala
8abfedd889 [SCTP]: Use the flags value that is passed as an arg to sctp_accept.
No need to do multiple dereferences - sk->sk_socket->file->f_flags

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:55:19 -07:00
Vladislav Yasevich
eb5fa39f5e [SCTP]: Fix IPv6 address flag setting when doing peel-off/accept.
During accept/peeloff we try to copy the list of bound addresses from
the original endpoint to the new one. However, we forgot to set the flag
to say that IPv6 is allowed on the new endpoint.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:55:18 -07:00
Vladislav Yasevich
df7deeb540 [SCTP]: Cleanup nomem handling in the state functions.
This patch cleans up the "nomem" conditions that may occur during the
processing by the state machine functions. In most cases we delay adding
side-effect commands until all memory allocations are done.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:55:17 -07:00
Sridhar Samudrala
ac0b046272 [SCTP]: Extend /proc/net/sctp/snmp to provide more statistics.
This patch adds more statistics info under /proc/net/sctp/snmp
that should be useful for debugging. The additional events that
are counted now include timer expirations, retransmits, packet
and data chunk discards.

The Data chunk discards include all the cases where a data chunk
is discarded including high tsn, bad stream, dup tsn and the most
useful one(out of receive buffer/rwnd).

Also moved the SCTP MIB data structures from the generic include
directories to include/sctp/sctp.h.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:55:16 -07:00
Herbert Xu
1b489e11d4 [SCTP]: Use HMAC template and hash interface
This patch converts SCTP to use the new HMAC template and hash interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-21 11:46:19 +10:00
Sridhar Samudrala
b9ac86727f [SCTP]: Fix sctp_primitive_ABORT() call in sctp_close().
With the recent fix, the callers of sctp_primitive_ABORT()
need to create an ABORT chunk and pass it as an argument rather
than msghdr that was passed earlier.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-29 21:22:13 -07:00
Sridhar Samudrala
c164a9ba0a Fix sctp privilege elevation (CVE-2006-3745)
sctp_make_abort_user() now takes the msg_len along with the msg
so that we don't have to recalculate the bytes in iovec.
It also uses memcpy_fromiovec() so that we don't go beyond the
length allocated.

It is good to have this fix even if verify_iovec() is fixed to
return error on overflow.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-22 12:52:23 -07:00
Sridhar Samudrala
dc022a9874 [SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed
This implements Rules D1 and D4 of Sec 4.3 in the ADDIP draft.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:49:25 -07:00
Sridhar Samudrala
9faa730f1c [SCTP]: Set chunk->data_accepted only if we are going to accept it.
Currently there is a code path in sctp_eat_data() where it is possible
to set this flag even when we are dropping this chunk.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:49:07 -07:00
Sridhar Samudrala
ad8fec1720 [SCTP]: Verify all the paths to a peer via heartbeat before using them.
This patch implements Path Initialization procedure as described in
Sec 2.36 of RFC4460.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:48:50 -07:00
Vlad Yasevich
cfdeef3282 [SCTP]: Unhash the endpoint in sctp_endpoint_free().
This prevents a race between the close of a socket and receive of an
incoming packet.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:48:26 -07:00
Sridhar Samudrala
37fa6878bc [SCTP]: Check for NULL arg to sctp_bucket_destroy().
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:45:47 -07:00
Jörn Engel
6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Al Viro
8ca84481b6 [SCTP]: sctp_unpack_cookie() fix
sizeof(pointer) != sizeof(array)...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 03:26:14 -07:00
Neil Horman
d5b9f4c083 [SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.
In the event that our entire receive buffer is full with a series of
chunks that represent a single gap-ack, and then we accept a chunk
(or chunks) that fill in the gap between the ctsn and the first gap,
we renege chunks from the end of the buffer, which effectively does
nothing but move our gap to the end of our received tsn stream. This
does little but move our missing tsns down stream a little, and, if the
sender is sending sufficiently large retransmit frames, the result is a
perpetual slowdown which can never be recovered from, since the only
chunk that can be accepted to allow progress in the tsn stream necessitates
that a new gap be created to make room for it. This leads to a constant
need for retransmits, and subsequent receiver stalls. The fix I've come up
with is to deliver the frame without reneging if we have a full receive
buffer and the receiving sockets sk_receive_queue is empty(indicating that
the receive buffer is being blocked by a missing tsn).

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:59:03 -07:00
Tsutomu Fujii
d7c2c9e397 [SCTP]: Send only 1 window update SACK per message.
Right now, every time we increase our rwnd by more then MTU bytes, we
trigger a SACK.  When processing large messages, this will generate a
SACK for almost every other SCTP fragment. However since we are freeing
the entire message at the same time, we might as well collapse the SACK
generation to 1.

Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:58:28 -07:00
Sridhar Samudrala
503b55fd77 [SCTP]: Don't do CRC32C checksum over loopback.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:57:28 -07:00
Vlad Yasevich
4c9f5d5305 [SCTP] Reset rtt_in_progress for the chunk when processing its sack.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:56:08 -07:00
Vlad Yasevich
5636bef732 [SCTP]: Reject sctp packets with broadcast addresses.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:55:35 -07:00
Vlad Yasevich
402d68c433 [SCTP]: Limit association max_retrans setting in setsockopt.
When using ASSOCINFO socket option, we need to limit the number of
maximum association retransmissions to be no greater than the sum
of all the path retransmissions. This is specified in Section 7.1.2
of the SCTP socket API draft.
However, we only do this if the association has multiple paths. If
there is only one path, the protocol stack will use the
assoc_max_retrans setting when trying to retransmit packets.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:54:51 -07:00
Vladislav Yasevich
b89498a1c2 [SCTP]: Allow linger to abort 1-N style sockets.
Enable SO_LINGER functionality for 1-N style sockets. The socket API
draft will be clarfied to allow for this functionality. The linger
settings will apply to all associations on a given socket.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 14:32:06 -07:00
Vladislav Yasevich
a601266e4f [SCTP]: Validate the parameter length in HB-ACK chunk.
If SCTP receives a badly formatted HB-ACK chunk, it is possible
that we may access invalid memory and potentially have a buffer
overflow.  We should really make sure that the chunk format is
what we expect, before attempting to touch the data.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 14:25:53 -07:00
Vladislav Yasevich
61c9fed416 [SCTP]: A better solution to fix the race between sctp_peeloff() and
sctp_rcv().

The goal is to hold the ref on the association/endpoint throughout the
state-machine process.  We accomplish like this:

  /* ref on the assoc/ep is taken during lookup */

  if owned_by_user(sk)
 	sctp_add_backlog(skb, sk);
  else
 	inqueue_push(skb, sk);

  /* drop the ref on the assoc/ep */

However, in sctp_add_backlog() we take the ref on assoc/ep and hold it
while the skb is on the backlog queue.  This allows us to get rid of the
sock_hold/sock_put in the lookup routines.

Now sctp_backlog_rcv() needs to account for potential association move.
In the unlikely event that association moved, we need to retest if the
new socket is locked by user.  If we don't this, we may have two packets
racing up the stack toward the same socket and we can't deal with it.
If the new socket is still locked, we'll just add the skb to its backlog
continuing to hold the ref on the association.  This get's rid of the
need to move packets from one backlog to another and it also safe in
case new packets arrive on the same backlog queue.

The last step, is to lock the new socket when we are moving the
association to it.  This is needed in case any new packets arrive on
the association when it moved.  We want these to go to the backlog since
we would like to avoid the race between this new packet and a packet
that may be sitting on the backlog queue of the old socket toward the
same association.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 11:01:18 -07:00
Sridhar Samudrala
8de8c87380 [SCTP]: Set sk_err so that poll wakes up after a non-blocking connect failure.
Also fix some other cases where sk_err is not set for 1-1 style sockets.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 10:58:12 -07:00
Sridhar Samudrala
35d63edb1c [SCTP]: Fix state table entries for chunks received in CLOSED state.
Discard an unexpected chunk in CLOSED state rather can calling BUG().

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:05:23 -07:00
Sridhar Samudrala
62b08083ec [SCTP]: Fix panic's when receiving fragmented SCTP control chunks.
Use pskb_pull() to handle incoming COOKIE_ECHO and HEARTBEAT chunks that
are received as skb's with fragment list.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:04:43 -07:00
Vladislav Yasevich
672e7cca17 [SCTP]: Prevent possible infinite recursion with multiple bundled DATA.
There is a rare situation that causes lksctp to go into infinite recursion
and crash the system.  The trigger is a packet that contains at least the
first two DATA fragments of a message bundled together. The recursion is
triggered when the user data buffer is smaller that the full data message.
The problem is that we clone the skb for every fragment in the message.
When reassembling the full message, we try to link skbs from the "first
fragment" clone using the frag_list. However, since the frag_list is shared
between two clones in this rare situation, we end up setting the frag_list
pointer of the second fragment to point to itself.  This causes
sctp_skb_pull() to potentially recurse indefinitely.

Proposed solution is to make a copy of the skb when attempting to link
things using frag_list.

Signed-off-by: Vladislav Yasevich <vladsilav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:03:49 -07:00
Neil Horman
7c3ceb4fb9 [SCTP]: Allow spillover of receive buffer to avoid deadlock.
This patch fixes a deadlock situation in the receive path by allowing
temporary spillover of the receive buffer.

- If the chunk we receive has a tsn that immediately follows the ctsn,
  accept it even if we run out of receive buffer space and renege data with
  higher TSNs.
- Once we accept one chunk in a packet, accept all the remaining chunks
  even if we run out of receive buffer space.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Mark Butler <butlerm@middle.net>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:02:09 -07:00
KAMEZAWA Hiroyuki
6f91204225 [PATCH] for_each_possible_cpu: network codes
for_each_cpu() actually iterates across all possible CPUs.  We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs.  This is inefficient and
possibly buggy.

We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.

This patch replaces for_each_cpu with for_each_possible_cpu under /net

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:31 -07:00
Linus Torvalds
b55813a2e5 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NETFILTER] x_table.c: sem2mutex
  [IPV4]: Aggregate route entries with different TOS values
  [TCP]: Mark tcp_*mem[] __read_mostly.
  [TCP]: Set default max buffers from memory pool size
  [SCTP]: Fix up sctp_rcv return value
  [NET]: Take RTNL when unregistering notifier
  [WIRELESS]: Fix config dependencies.
  [NET]: Fill in a 32-bit hole in struct sock on 64-bit platforms.
  [NET]: Ensure device name passed to SO_BINDTODEVICE is NULL terminated.
  [MODULES]: Don't allow statically declared exports
  [BRIDGE]: Unaligned accesses in the ethernet bridge
2006-03-25 08:39:20 -08:00
Davide Libenzi
f348d70a32 [PATCH] POLLRDHUP/EPOLLRDHUP handling for half-closed devices notifications
Implement the half-closed devices notifiation, by adding a new POLLRDHUP
(and its alias EPOLLRDHUP) bit to the existing poll/select sets.  Since the
existing POLLHUP handling, that does not report correctly half-closed
devices, was feared to be changed, this implementation leaves the current
POLLHUP reporting unchanged and simply add a new bit that is set in the few
places where it makes sense.  The same thing was discussed and conceptually
agreed quite some time ago:

http://lkml.org/lkml/2003/7/12/116

Since this new event bit is added to the existing Linux poll infrastruture,
even the existing poll/select system calls will be able to use it.  As far
as the existing POLLHUP handling, the patch leaves it as is.  The
pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing
archs and sets the bit in the six relevant files.  The other attached diff
is the simple change required to sys/epoll.h to add the EPOLLRDHUP
definition.

There is "a stupid program" to test POLLRDHUP delivery here:

 http://www.xmailserver.org/pollrdhup-test.c

It tests poll(2), but since the delivery is same epoll(2) will work equally.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
Herbert Xu
2babf9daae [SCTP]: Fix up sctp_rcv return value
I was working on the ipip/xfrm problem and as usual I get side-tracked by
other problems.

As part of an attempt to change the IPv4 protocol handler calling
convention I found that SCTP violated the existing convention.

It's returning non-zero values after freeing the skb.  This is doubly bad
as 1) the skb gets resubmitted; 2) the return value is interpreted as a
protocol number.

This patch changes those return values to zero.

IPv6 doesn't suffer from this problem because it uses a positive return
value as an indication for resubmission.  So the only effect of this patch
there is to increment the IPSTATS_MIB_INDELIVERS counter which IMHO is
the right thing to do.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-25 01:25:29 -08:00
Arnaldo Carvalho de Melo
543d9cfeec [NET]: Identation & other cleanups related to compat_[gs]etsockopt cset
No code changes, just tidying up, in some cases moving EXPORT_SYMBOLs
to just after the function exported, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:48:35 -08:00
Dmitry Mishin
3fdadf7d27 [NET]: {get|set}sockopt compatibility layer
This patch extends {get|set}sockopt compatibility layer in order to
move protocol specific parts to their place and avoid huge universal
net/compat.c file in the future.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:45:21 -08:00
Vlad Yasevich
27852c26ba [SCTP]: Fix 'fast retransmit' to send a TSN only once.
SCTP used to "fast retransmit" a TSN every time we hit the number
of missing reports for the TSN.  However the Implementers Guide
specifies that we should only "fast retransmit" a given TSN once.
Subsequent retransmits should be timeouts only. Also change the
number of missing reports to 3 as per the latest IG(similar to TCP).

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-02 16:57:31 -08:00
Vlad Yasevich
e2c2fc2c8f [SCTP]: heartbeats exceed maximum retransmssion limit
The number of HEARTBEAT chunks that an association may transmit is
limited by Association.Max.Retrans count; however, the code allows
us to send one extra heartbeat.

This patch limits the number of heartbeats to the maximum count.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-30 16:00:40 -08:00
Vlad Yasevich
81845c21dc [SCTP]: correct the number of INIT retransmissions
We currently count the initial INIT/COOKIE_ECHO chunk toward the
retransmit count and thus sends a total of sctp_max_retrans_init chunks.
The correct behavior is to retransmit the chunk sctp_max_retrans_init in
addition to sending the original.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-30 15:59:54 -08:00
Tsutomu Fujii
a7d1f1b66c [SCTP]: Fix sctp_rcv_ootb() to handle the last chunk of a packet correctly.
Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:57:09 -08:00
Sridhar Samudrala
c4d2444e99 [SCTP]: Fix couple of races between sctp_peeloff() and sctp_rcv().
Validate and update the sk in sctp_rcv() to avoid the race where an
assoc/ep could move to a different socket after we get the sk, but before
the skb is added to the backlog.

Also migrate the skb's in backlog queue to new sk when doing a peeloff.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:56:26 -08:00
Vlad Yasevich
313e7b4d25 [SCTP]: Fix machine check/connection hang on IA64.
sctp_unpack_cookie used an on-stack array called digest as a result/out
parameter in the call to crypto_hmac. However, hmac code
(crypto_hmac_final)
assumes that the 'out' argument is in virtual memory (identity mapped
region)
and can use virt_to_page call on it.  This does not work with the on-stack
declared digest.  The problems observed so far have been:
 a) incorrect hmac digest
 b) machine check and hardware reset.

Solution is to define the digest in an identity mapped region by
kmalloc'ing
it.  We can do this once as part of the endpoint structure and re-use it
when
verifying the SCTP cookie.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:55:57 -08:00
Vlad Yasevich
8116ffad41 [SCTP]: Fix bad sysctl formatting of SCTP timeout values on 64-bit m/cs.
Change all the structure members that hold jiffies to be of type
unsigned long.  This also corrects bad sysctl formating on 64 bit
architectures.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:55:17 -08:00
Vlad Yasevich
38b0e42aba [SCTP]: Fix sctp_assoc_seq_show() panics on big-endian systems.
This patch corrects the panic by casting the argument to the
pointer of correct size.  On big-endian systems we ended up loading
only 32 bits of data because we are treating the pointer as an int*.
By treating this pointer as loff_t*, we'll load the full 64 bits
and then let regular integer demotion take place which will give us
the correct value.

Signed-off-by: Vlad Yaseivch <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:54:06 -08:00
Vlad Yasevich
49392e5ecf [SCTP]: sctp doesn't show all associations/endpoints in /proc
When creating a very large number of associations (and endpoints),
/proc/assocs and /proc/eps will not show all of them.  As a result
netstat will not show all of the either.  This is particularly evident
when creating 1000+ associations (or endpoints).  As an example with
1500 tcp style associations over loopback, netstat showed 1420 on my
system instead of 3000.

The reason for this is that the seq_operations start method is invoked
multiple times bacause of the amount of data that is provided.  The
start method always increments the position parameter and since we use
the position as the hash bucket id, we end up skipping hash buckets.

This patch corrects this situation and get's rid of the silly hash-1
decrement.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:53:06 -08:00
Vlad Yasevich
9834a2bb49 [SCTP]: Fix sctp_cookie alignment in the packet.
On 64 bit architectures, sctp_cookie sent as part of INIT-ACK is not
aligned on a 64 bit boundry and thus causes unaligned access exceptions.

The layout of the cookie prameter is this:
|<----- Parameter Header --------------------|<--- Cookie DATA --------
-----------------------------------------------------------------------
| param type (16 bits) | param len (16 bits) | sig [32 bytes] | cookie..
-----------------------------------------------------------------------

The cookie data portion contains 64 bit values on 64 bit architechtures
(timeval) that fall on a 32 bit alignment boundry when used as part of
the on-wire format, but align correctly when used in internal
structures.  This patch explicitely pads the on-wire format so that
it is properly aligned.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:52:12 -08:00
Sridhar Samudrala
7a48f923b8 [SCTP]: Fix potential race condition between sctp_close() and sctp_rcv().
Do not release the reference to association/endpoint if an incoming skb is
added to backlog. Instead release it after the chunk is processed in
sctp_backlog_rcv().

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2006-01-17 11:51:28 -08:00
Joe Perches
46b86a2da0 [NET]: Use NIP6_FMT in kernel.h
There are errors and inconsistency in the display of NIP6 strings.
	ie: net/ipv6/ip6_flowlabel.c

There are errors and inconsistency in the display of NIPQUAD strings too.
	ie: net/netfilter/nf_conntrack_ftp.c

This patch:
	adds NIP6_FMT to kernel.h
	changes all code to use NIP6_FMT
	fixes net/ipv6/ip6_flowlabel.c
	adds NIPQUAD_FMT to kernel.h
	fixes net/netfilter/nf_conntrack_ftp.c
	changes a few uses of "%u.%u.%u.%u" to NIPQUAD_FMT for symmetry to NIP6_FMT

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 14:29:07 -08:00
Randy Dunlap
4fc268d24c [PATCH] capable/capability.h (net/)
net: Use <linux/capability.h> where capable() is used.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 18:42:14 -08:00
Kris Katterjohn
8b3a70058b [NET]: Remove more unneeded typecasts on *malloc()
This removes more unneeded casts on the return value for kmalloc(),
sock_kmalloc(), and vmalloc().

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:14 -08:00
Kris Katterjohn
09a626600b [NET]: Change some "if (x) BUG();" to "BUG_ON(x);"
This changes some simple "if (x) BUG();" statements to "BUG_ON(x);"

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:18 -08:00
Patrick McHardy
b59c270104 [NETFILTER]: Keep conntrack reference until IPsec policy checks are done
Keep the conntrack reference until policy checks have been performed for
IPsec NAT support. The reference needs to be dropped before a packet is
queued to avoid having the conntrack module unloadable.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:36 -08:00
Patrick McHardy
951dbc8ac7 [IPV6]: Move nextheader offset to the IP6CB
Move nextheader offset to the IP6CB to make it possible to pass a
packet to ip6_input_finish multiple times and have it skip already
parsed headers. As a nice side effect this gets rid of the manual
hopopts skipping in ip6_input_finish.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:29 -08:00
Arnaldo Carvalho de Melo
14c850212e [INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h
To help in reducing the number of include dependencies, several files were
touched as they were getting needed headers indirectly for stuff they use.

Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had
linux/dccp.h include twice.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:21 -08:00
Eric Dumazet
90ddc4f047 [NET]: move struct proto_ops to const
I noticed that some of 'struct proto_ops' used in the kernel may share
a cache line used by locks or other heavily modified data. (default
linker alignement is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
least)

This patch makes sure a 'struct proto_ops' can be declared as const,
so that all cpus can share all parts of it without false sharing.

This is not mandatory : a driver can still use a read/write structure
if it needs to (and eventually a __read_mostly)

I made a global stubstitute to change all existing occurences to make
them const.

This should reduce the possibility of false sharing on SMP, and
speedup some socket system calls.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:15 -08:00
Frank Filz
7708610b1b [SCTP]: Add support for SCTP_DELAYED_ACK_TIME socket option.
Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:13 -08:00
Frank Filz
52ccb8e90c [SCTP]: Update SCTP_PEER_ADDR_PARAMS socket option to the latest api draft.
This patch adds support to set/get heartbeat interval, maximum number of
retransmissions, pathmtu, sackdelay time for a particular transport/
association/socket as per the latest SCTP sockets api draft11.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:11 -08:00
Neil Horman
9bffc4ace1 [SCTP]: Fix sctp to not return erroneous POLLOUT events.
Make sctp_writeable() use sk_wmem_alloc rather than sk_wmem_queued to
determine the sndbuf space available. It also removes all the modifications
to sk_wmem_queued as it is not currently used in SCTP.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-19 14:24:40 -08:00