In clip_mkip(), skb->dev is dereferenced after clip_push(),
which frees up skb.
Advisory: AD_LAB-06009 (<adlab@venustech.com.cn>).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix lockdep warning with GRE, iptables and Speedtouch ADSL, PPP over ATM.
On Sat, Sep 02, 2006 at 08:39:28PM +0000, Krzysztof Halasa wrote:
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> -------------------------------------------------------
> swapper/0 is trying to acquire lock:
> (&dev->queue_lock){-+..}, at: [<c02c8c46>] dev_queue_xmit+0x56/0x290
>
> but task is already holding lock:
> (&dev->_xmit_lock){-+..}, at: [<c02c8e14>] dev_queue_xmit+0x224/0x290
>
> which lock already depends on the new lock.
This turns out to be a genuine bug. The queue lock and xmit lock are
intentionally taken out of order. Two things are supposed to prevent
dead-locks from occuring:
1) When we hold the queue_lock we're supposed to only do try_lock on the
tx_lock.
2) We always drop the queue_lock after taking the tx_lock and before doing
anything else.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&dev->_xmit_lock){-+..}:
> [<c012e7b6>] lock_acquire+0x76/0xa0
> [<c0336241>] _spin_lock_bh+0x31/0x40
> [<c02d25a9>] dev_activate+0x69/0x120
This path obviously breaks assumption 1) and therefore can lead to ABBA
dead-locks.
I've looked at the history and there seems to be no reason for the lock
to be held at all in dev_watchdog_up. The lock appeared in day one and
even there it was unnecessary. In fact, people added __dev_watchdog_up
precisely in order to get around the tx lock there.
The function dev_watchdog_up is already serialised by rtnl_lock since
its only caller dev_activate is always called under it.
So here is a simple patch to remove the tx lock from dev_watchdog_up.
In 2.6.19 we can eliminate the unnecessary __dev_watchdog_up and
replace it with dev_watchdog_up.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Non-linear skbs are truncated to their linear part with mmaped IO.
Fix by using skb_copy_bits instead of memcpy.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code for frame diverter is unmaintained and has bitrotted.
The number of users is very small and the code has lots of problems.
If anyone is using it, they maybe exposing themselves to bad packet attacks.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sorry that the patch submited yesterday still contain a small bug.
This version have already been test for hours with BT connections. The
oops is now difficult to reproduce.
Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We seem to send 3 extra bytes in a TCN, which will be whatever happens
to be on the stack. Thanks to Aji_Srinivas@emc.com for seeing.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch should add support for -1 as "default" IPv6 traffic class,
as specified in IETF RFC3542 §6.5. Within the kernel, it seems tclass
< 0 is already handled, but setsockopt, getsockopt and recvmsg calls
won't accept it from userland.
Signed-off-by: Remi Denis-Courmont <rdenis@simphalempin.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
np->cork.tclass is used only in cork'ed context.
Otherwise, np->tclass should be used.
Bug#7096 reported by Remi Denis-Courmont <rdenis@simphalempin.com>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes the debuging behaviour of this code more consistent
with the rest of IPVS.
Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
I'm not entirely sure what happens in the case of a valid port,
at best it'll be silently ignored. This patch ignores them a little
more verbosely.
Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fill in a help message for the ports option to ip_vs_ftp
Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Turn Appropriate Byte Count off by default because it unfairly
penalizes applications that do small writes. Add better documentation
to describe what it is so users will understand why they might want to
turn it on.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
neigh_table_clear() doesn't free tbl->stats.
Found by Alexey Kuznetsov. Though Alexey considers this
leak minor for mainstream, I still believe that cleanup
code should not forget to free some of the resources :)
At least, this is critical for OpenVZ with virtualized
neighbour tables.
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When I tested Linux kernel 2.6.17.7 about statistics
"ipFragFails",found that this counter couldn't increase correctly. The
criteria is RFC2011:
RFC2011
ipFragFails OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
DESCRIPTION
"The number of IP datagrams that have been discarded because
they needed to be fragmented at this entity but could not
be, e.g., because their Don't Fragment flag was set."
::= { ip 18 }
When I send big IP packet to a router with DF bit set to 1 which need to
be fragmented, and router just sends an ICMP error message
ICMP_FRAG_NEEDED but no increments for this counter(in the function
ip_fragment).
Signed-off-by: Wei Dong <weid@nanjing-fnst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch limits the warning messages when socket allocation failures
happen. It happens under memory pressure.
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug noticed by Remi Denis-Courmont <rdenis@simphalempin.com>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6_add_addr allocates a struct inet6_ifaddr and a dstentry, but it
doesn't install the dstentry in ifa->rt until after it releases the
addrconf_hash_lock. This means other CPUs will be able to see the new
address while it hasn't been initialized completely yet.
One possible fix would be to grab the ifp->lock spinlock when
creating the address struct; a simpler fix is to just move the
assignment.
Acked-by: jbeulich@novell.com
Acked-by: okir@suse.de
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes crash happen if initialization of nl_table fails
in initcalls. It is better than getting use after free crash later.
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) fix slow start after retransmit timeout
2) fix case of L=2*SMSS acked bytes comparison
Signed-off-by: Daikichi Osuga <osugad@s1.nttdocomo.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
When I tested Linux kernel 2.6.17.7 about statistics
"ipv6IfStatsInAddrErrors", found that this counter couldn't increase
correctly. The criteria is RFC2465:
ipv6IfStatsInAddrErrors OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
DESCRIPTION
"The number of input datagrams discarded because
the IPv6 address in their IPv6 header's destination
field was not a valid address to be received at
this entity. This count includes invalid
addresses (e.g., ::0) and unsupported addresses
(e.g., addresses with unallocated prefixes). For
entities which are not IPv6 routers and therefore
do not forward datagrams, this counter includes
datagrams discarded because the destination address
was not a local address."
::= { ipv6IfStatsEntry 5 }
When I send packet to host with destination that is ether invalid
address(::0) or unsupported addresses(1::1), the Linux kernel just
discard the packet, and the counter doesn't increase(in the function
ip6_pkt_discard).
Signed-off-by: Lv Liangying <lvly@nanjing-fnst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the recent fix, the callers of sctp_primitive_ABORT()
need to create an ABORT chunk and pass it as an argument rather
than msghdr that was passed earlier.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes CCID3 to give much closer performance to RFC4342.
CCID3 is meant to alter sending rate based on RTT and loss.
The performance was verified against:
http://wand.net.nz/~perry/max_download.php
For example I tested with netem and had the following parameters:
Delayed Acks 1, MSS 256 bytes, RTT 105 ms, packet loss 5%.
This gives a theoretical speed of 71.9 Kbits/s. I measured across three
runs with this patch set and got 70.1 Kbits/s. Without this patchset the
average was 232 Kbits/s which means Linux can't be used for CCID3 research
properly.
I also tested with netem turned off so box just acting as router with 1.2
msec RTT. The performance with this is the same with or without the patch
at around 30 Mbit/s.
Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bridge-netfilter code will overwrite memory if there is not
headroom in the skb to save the header. This first showed up when
using Xen with sky2 driver that doesn't allocate the extra space.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a new function dccp_rx_hist_find_entry.
Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a new function to see if two sequence numbers follow each
other.
Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a small typo in net/dccp/libs/packet_history.c
Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP over IPV6 would incorrectly inherit the GSO settings.
This would cause kernel to send Tcp Segmentation Offload packets for
IPV6 data to devices that can't handle it. It caused the sky2 driver
to lock http://bugzilla.kernel.org/show_bug.cgi?id=7050
and the e1000 would generate bogus packets. I can't blame the
hardware for gagging if the upper layers feed it garbage.
This was a new bug in 2.6.18 introduced with GSO support.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check the bounds of length specifiers more thoroughly in the XDR decoding of
NFS4 readdir reply data.
Currently, if the server returns a bitmap or attr length that causes the
current decode point pointer to wrap, this could go undetected (consider a
small "negative" length on a 32-bit machine).
Also add a check into the main XDR decode handler to make sure that the amount
of data is a multiple of four bytes (as specified by RFC-1014). This makes
sure that we can do u32* pointer subtraction in the NFS client without risking
an undefined result (the result is undefined if the pointers are not correctly
aligned with respect to one another).
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
(cherry picked from 5861fddd64a7eaf7e8b1a9997455a24e7f688092 commit)
rpc_unlink() and rpc_rmdir() will dput the dentry reference for you.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
(cherry picked from a05a57effa71a1f67ccbfc52335c10c8b85f3f6a commit)
A prior call to rpc_depopulate() by rpc_rmdir() on the parent directory may
have already called simple_unlink() on this entry.
Add the same check to rpc_rmdir(). Also remove a redundant call to
rpc_close_pipes() in rpc_rmdir.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
(cherry picked from 0bbfb9d20f6437c4031aa3bf9b4d311a053e58e3 commit)
Make it take a dentry argument instead of a path
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
(cherry picked from 648d4116eb2509f010f7f34704a650150309b3e7 commit)
This small change allows for easy per-route workarounds for broken hosts or
middleboxes that are not compliant with TCP standards for window scaling.
Rather than having to turn off window scaling globally. This patch allows
reducing or disabling window scaling if window clamp is present.
Example: Mark Lord reported a problem with 2.6.17 kernel being unable to
access http://www.everymac.com
# ip route add 216.145.246.23/32 via 10.8.0.1 window 65535
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
table->private might change because of ruleset changes, don't use it
without holding the lock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
sctp_make_abort_user() now takes the msg_len along with the msg
so that we don't have to recalculate the bytes in iovec.
It also uses memcpy_fromiovec() so that we don't go beyond the
length allocated.
It is good to have this fix even if verify_iovec() is fixed to
return error on overflow.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When the bridge recomputes features, it does not maintain the
constraint that SG/GSO must be off if TX checksum is off.
This patch adds that constraint.
On a completely unrelated note, I've also added TSO6 and TSO_ECN
feature bits if GSO is enabled on the underlying device through
the new NETIF_F_GSO_SOFTWARE macro.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
table->private might change because of ruleset changes, don't use it without
holding the lock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip_conntrack_put must not be called while holding ip_conntrack_lock
since destroy_conntrack takes it again.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Found in 2.4 by Yixin Pan <yxpan@hotmail.com>.
> When I read fib_semantics.c of Linux-2.4.32, write_lock(&fib_info_lock) =
> is used in fib_release_info() instead of write_lock_bh(&fib_info_lock). =
> Is the following case possible: a BH interrupts fib_release_info() while =
> holding the write lock, and calls ip_check_fib_default() which calls =
> read_lock(&fib_info_lock), and spin forever.
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes source filter leakage when a device is removed and a
process leaves the group thereafter.
This also includes corresponding fixes for IPv6 multicast source
filters on device removal.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
atm_proc_exit() is declared as __exit, and thus in .exit.text. On
some architectures (ARM) .exit.text is discarded at compile time, and
since atm_proc_exit() is called by some other __init functions, it
results in a link error.
Signed-off-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a leak of a socket's multicast source filter list structure
on closing a socket with a multicast source filter set on an interface
that does not exist any more.
Signed-off-by: Michal Ruzicka <michal.ruzicka@comstar.cz>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split off __icmpv6_socket's sk->sk_dst_lock class, because it gets
used from softirqs, which is safe for __icmpv6_sockets (because they
never get directly used via userspace syscalls), but unsafe for normal
sockets.
Has no effect on non-lockdep kernels.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
On High end systems (1024 or so cpus) this can potentially cause stack
overflow. Fix the stack usage.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since __vlan_hwaccel_rx() is essentially bypassing the
netif_receive_skb() call that would have occurred if we did the VLAN
decapsulation in software, we are missing the skb_bond() call and the
assosciated checks it does.
Export those checks via an inline function, skb_bond_should_drop(),
and use this in __vlan_hwaccel_rx().
Signed-off-by: David S. Miller <davem@davemloft.net>