When creating a very large number of associations (and endpoints),
/proc/assocs and /proc/eps will not show all of them. As a result
netstat will not show all of the either. This is particularly evident
when creating 1000+ associations (or endpoints). As an example with
1500 tcp style associations over loopback, netstat showed 1420 on my
system instead of 3000.
The reason for this is that the seq_operations start method is invoked
multiple times bacause of the amount of data that is provided. The
start method always increments the position parameter and since we use
the position as the hash bucket id, we end up skipping hash buckets.
This patch corrects this situation and get's rid of the silly hash-1
decrement.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
On 64 bit architectures, sctp_cookie sent as part of INIT-ACK is not
aligned on a 64 bit boundry and thus causes unaligned access exceptions.
The layout of the cookie prameter is this:
|<----- Parameter Header --------------------|<--- Cookie DATA --------
-----------------------------------------------------------------------
| param type (16 bits) | param len (16 bits) | sig [32 bytes] | cookie..
-----------------------------------------------------------------------
The cookie data portion contains 64 bit values on 64 bit architechtures
(timeval) that fall on a 32 bit alignment boundry when used as part of
the on-wire format, but align correctly when used in internal
structures. This patch explicitely pads the on-wire format so that
it is properly aligned.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Do not release the reference to association/endpoint if an incoming skb is
added to backlog. Instead release it after the chunk is processed in
sctp_backlog_rcv().
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Some versions of gcc generate incorrect code for the inet_check_attr()
function, apparently due to a totally bogus index -> pointer comparison
transformation.
At least "gcc version 4.0.1 20050727 (Red Hat 4.0.1-5)" from FC4 is
affected, possibly others too.
This changes the function subtly so that the buggy gcc transformation
doesn't trigger.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove the "inline" keyword from a bunch of big functions in the kernel with
the goal of shrinking it by 30kb to 40kb
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The default of using jiffies is very bad and results in
underutilization except with very low bandwidth.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the source address of a tunnel is given as 0.0.0.0 do a routing lookup
to get the real source address for the destination and fill that into the
acquire message. This allows to specify policies like this:
spdadd 172.16.128.13/32 172.16.0.0/20 any -P out ipsec
esp/tunnel/0.0.0.0-x.x.x.x/require;
spdadd 172.16.0.0/20 172.16.128.13/32 any -P in ipsec
esp/tunnel/x.x.x.x-0.0.0.0/require;
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes redundant comments, and moves one comment to a better
location.
Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are errors and inconsistency in the display of NIP6 strings.
ie: net/ipv6/ip6_flowlabel.c
There are errors and inconsistency in the display of NIPQUAD strings too.
ie: net/netfilter/nf_conntrack_ftp.c
This patch:
adds NIP6_FMT to kernel.h
changes all code to use NIP6_FMT
fixes net/ipv6/ip6_flowlabel.c
adds NIPQUAD_FMT to kernel.h
fixes net/netfilter/nf_conntrack_ftp.c
changes a few uses of "%u.%u.%u.%u" to NIPQUAD_FMT for symmetry to NIP6_FMT
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increasing the module ref count at registration will block the module from
ever being unloaded. In fact, genetlink should not care about the owner at
all. This patch removes the owner field from the struct registered with
genetlink.
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
This monster-patch tries to do the best job for unifying the data
structures and backend interfaces for the three evil clones ip_tables,
ip6_tables and arp_tables. In an ideal world we would never have
allowed this kind of copy+paste programming... but well, our world
isn't (yet?) ideal.
o introduce a new x_tables module
o {ip,arp,ip6}_tables depend on this x_tables module
o registration functions for tables, matches and targets are only
wrappers around x_tables provided functions
o all matches/targets that are used from ip_tables and ip6_tables
are now implemented as xt_FOOBAR.c files and provide module aliases
to ipt_FOOBAR and ip6t_FOOBAR
o header files for xt_matches are in include/linux/netfilter/,
include/linux/netfilter_{ipv4,ipv6} contains compatibility wrappers
around the xt_FOOBAR.h headers
Based on this patchset we're going to further unify the code,
gradually getting rid of all the layer 3 specific assumptions.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Updated copyright notice to include the year the file was
actually created. Information about file creation dates
was extracted from the files in the old CVS repository
at tipc.sourceforge.net.
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
The license header in each file now more clearly state that this
code is licensed under a dual BSD/GPL. Before this was only
evident if you looked at the MODULE_LICENSE line in core.c.
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
Restored the old tipc_config.h to get a cleaner division between the
interfaces used by normal TIPC users and TIPC administration utilities.
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
TIPC (Transparent Inter Process Communication) is a protocol designed for
intra cluster communication. For more information see
http://tipc.sourceforge.net
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
net: Use <linux/capability.h> where capable() is used.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This removes more unneeded casts on the return value for kmalloc(),
sock_kmalloc(), and vmalloc().
Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ip6_xmit() function now assumes that its sk argument is non-NULL,
which isn't currently true when TCPv6 code is sending RST or ACK
packets. This fixes that code to use a socket of its own for sending
such packets, as TCPv4 does. (Thanks Andi for the pointer).
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also, drop __exit marker from ipv6_netfilter_fini() as this
can be invoked from inet6_init() error handling paths.
Based upon a report from Stephen Hemminger.
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of needless casting of kmalloc() return value in net/
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Decrease the number of pointer derefs in net/rxrpc/connection.c
Benefits of the patch:
- Fewer pointer dereferences should make the code slightly faster.
- Size of generated code is smaller
- improved readability
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Martin Murray <murrayma@citi.umich.edu>
Sanity check nlmsg_len during netlink_rcv_skb. An nlmsg_len == 0 can
cause infinite loop in kernel, effectively DoSing machine. Noted by
Matin Murray.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The connection tracking timeout variables are unsigned long, but
proc_dointvec_jiffies is used with sizeof(unsigned int) in the sysctl
tables. Since there is no proc_doulongvec_jiffies function, change the
timeout variables to unsigned int.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
->print and ->print_range are not used (and apparently never were).
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip_nat_mangle_tcp_packet doesn't return NF_* values but 0/1 for
failure/success.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PPTP NAT helper calculates the offset at which the packet needs
to be mangled as difference between two pointers to the header. With
non-linear skbs however the pointers may point to two seperate buffers
on the stack and the calculation results in a wrong offset beeing
used.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an inbound PPTP_IN_CALL_REQUEST packet is received the
PPTP NAT helper uses a NULL pointer in pointer arithmentic to
calculate the offset in the packet which needs to be mangled
and corrupts random memory or crashes.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't wrap entire file in #ifdef CONFIG_NETFILTER, remove a few
unneccessary includes.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This changes some memcmp(one,two,ETH_ALEN) to compare_ether_addr(one,two).
Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
We want to wait for the cl_users to go down to zero, not for it to stay
positive. Quoth Trond (who wasn't even the author, but acked the wrong
version): "Argh! I need to increase my daily caffeine dosages."
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The API and code have been through various bits of initial review by
serial driver people but they definitely need to live somewhere for a
while so the unconverted drivers can get knocked into shape, existing
drivers that have been updated can be better tuned and bugs whacked out.
This replaces the tty flip buffers with kmalloc objects in rings. In the
normal situation for an IRQ driven serial port at typical speeds the
behaviour is pretty much the same, two buffers end up allocated and the
kernel cycles between them as before.
When there are delays or at high speed we now behave far better as the
buffer pool can grow a bit rather than lose characters. This also means
that we can operate at higher speeds reliably.
For drivers that receive characters in blocks (DMA based, USB and
especially virtualisation) the layer allows a lot of driver specific
code that works around the tty layer with private secondary queues to be
removed. The IBM folks need this sort of layer, the smart serial port
people do, the virtualisers do (because a virtualised tty typically
operates at infinite speed rather than emulating 9600 baud).
Finally many drivers had invalid and unsafe attempts to avoid buffer
overflows by directly invoking tty methods extracted out of the innards
of work queue structs. These are no longer needed and all go away. That
fixes various random hangs with serial ports on overflow.
The other change in here is to optimise the receive_room path that is
used by some callers. It turns out that only one ldisc uses receive room
except asa constant and it updates it far far less than the value is
read. We thus make it a variable not a function call.
I expect the code to contain bugs due to the size alone but I'll be
watching and squashing them and feeding out new patches as it goes.
Because the buffers now dynamically expand you should only run out of
buffering when the kernel runs out of memory for real. That means a lot of
the horrible hacks high performance drivers used to do just aren't needed any
more.
Description:
tty_insert_flip_char is an old API and continues to work as before, as does
tty_flip_buffer_push() [this is why many drivers dont need modification]. It
does now also return the number of chars inserted
There are also
tty_buffer_request_room(tty, len)
which asks for a buffer block of the length requested and returns the space
found. This improves efficiency with hardware that knows how much to
transfer.
and tty_insert_flip_string_flags(tty, str, flags, len)
to insert a string of characters and flags
For a smart interface the usual code is
len = tty_request_buffer_room(tty, amount_hardware_says);
tty_insert_flip_string(tty, buffer_from_card, len);
More description!
At the moment tty buffers are attached directly to the tty. This is causing a
lot of the problems related to tty layer locking, also problems at high speed
and also with bursty data (such as occurs in virtualised environments)
I'm working on ripping out the flip buffers and replacing them with a pool of
dynamically allocated buffers. This allows both for old style "byte I/O"
devices and also helps virtualisation and smart devices where large blocks of
data suddenely materialise and need storing.
So far so good. Lots of drivers reference tty->flip.*. Several of them also
call directly and unsafely into function pointers it provides. This will all
break. Most drivers can use tty_insert_flip_char which can be kept as an API
but others need more.
At the moment I've added the following interfaces, if people think more will
be needed now is a good time to say
int tty_buffer_request_room(tty, size)
Try and ensure at least size bytes are available, returns actual room (may be
zero). At the moment it just uses the flipbuf space but that will change.
Repeated calls without characters being added are not cumulative. (ie if you
call it with 1, 1, 1, and then 4 you'll have four characters of space. The
other functions will also try and grow buffers in future but this will be a
more efficient way when you know block sizes.
int tty_insert_flip_char(tty, ch, flag)
As before insert a character if there is room. Now returns 1 for success, 0
for failure.
int tty_insert_flip_string(tty, str, len)
Insert a block of non error characters. Returns the number inserted.
int tty_prepare_flip_string(tty, strptr, len)
Adjust the buffer to allow len characters to be added. Returns a buffer
pointer in strptr and the length available. This allows for hardware that
needs to use functions like insl or mencpy_fromio.
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Convert sleep_on() to wait_event_timeout(). Probably safe with the BKL but
could be racy once BKL use in NFS-client is gone.
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
dev->get_wireless_stats is deprecated but removing it also removes wireless
subdirectory in sysfs. This patch puts it back.
akpm: I don't know what's happening here. This might be appropriate as a
2.6.15.x compatibility backport. Waiting to hear from Jeff.
Signed-off-by: Andrey Borzenkov <arvidjaar@mail.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch converts the inode semaphore to a mutex. I have tested it on
XFS and compiled as much as one can consider on an ia64. Anyway your
luck with it might be different.
Modified-by: Ingo Molnar <mingo@elte.hu>
(finished the conversion)
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>