kernel-fxtec-pro1x

Author	SHA1	Message	Date
Daniel Lezcano	1675c7b254	[IPV6]: remove duplicate call to proc_net_remove The file /proc/net/if_inet6 is removed twice. First time in: inet6_exit ->addrconf_cleanup And followed a few lines after by: inet6_exit -> if6_proc_exit Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 21:16:24 -07:00
Daniel Lezcano	310928d963	[NETNS]: fix net released by rcu callback When a network namespace reference is held by a network subsystem, and when this reference is decremented in a rcu update callback, we must ensure that there is no more outstanding rcu update before trying to free the network namespace. In the normal case, the rcu_barrier is called when the network namespace is exiting in the cleanup_net function. But when a network namespace creation fails, and the subsystems are undone (like the cleanup), the rcu_barrier is missing. This patch adds the missing rcu_barrier. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 21:16:21 -07:00
Daniel Lezcano	93ee31f14f	[NET]: Fix free_netdev on register_netdev failure. Point 1: The unregistering of a network device schedule a netdev_run_todo. This function calls dev->destructor when it is set and the destructor calls free_netdev. Point 2: In the case of an initialization of a network device the usual code is: * alloc_netdev * register_netdev -> if this one fails, call free_netdev and exit with error. Point 3: In the register_netdevice function at the later state, when the device is at the registered state, a call to the netdevice_notifiers is made. If one of the notification falls into an error, a rollback to the registered state is done using unregister_netdevice. Conclusion: When a network device fails to register during initialization because one network subsystem returned an error during a notification call chain, the network device is freed twice because of fact 1 and fact 2. The second free_netdev will be done with an invalid pointer. Proposed solution: The following patch move all the code of unregister_netdevice except the call to net_set_todo, to a new function "rollback_registered". The following functions are changed in this way: * register_netdevice: calls rollback_registered when a notification fails * unregister_netdevice: calls rollback_register + net_set_todo, the call order to net_set_todo is changed because it is the latest now. Since it justs add an element to a list that should not break anything. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 21:16:18 -07:00
Dirk Hohndel	e403149c92	Kbuild/doc: fix links to Documentation files Fix links to files in Documentation/* in various Kconfig files Signed-off-by: Dirk Hohndel <hohndel@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-30 14:26:30 -07:00
J. Bruce Fields	521c2a43b2	[SUNRPC]: fix rpc debugging Commit `baa3a2a0d2`, by removing initialization of the ctl_name field, broke this conditional, preventing the display of rpc_tasks that you previously got when turning on rpc debugging. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 01:07:15 -07:00
Jean Delvare	0ccfe61803	[TCP]: Saner thash_entries default with much memory. On systems with a very large amount of memory, the heuristics in alloc_large_system_hash() result in a very large TCP established hash table: 16 millions of entries for a 128 GB ia64 system. This makes reading from /proc/net/tcp pretty slow (well over a second) and as a result netstat is slow on these machines. I know that /proc/net/tcp is deprecated in favor of tcp_diag, however at the moment netstat only knows of the former. I am skeptical that such a large TCP established hash is often needed. Just because a system has a lot of memory doesn't imply that it will have several millions of concurrent TCP connections. Thus I believe that we should put an arbitrary high limit to the size of the TCP established hash by default. Users who really need a bigger hash can always use the thash_entries boot parameter to get more. I propose 2 millions of entries as the arbitrary high limit. This makes /proc/net/tcp reasonably fast on the system in question (0.2 s) while being still large enough for me to be confident that network performance won't suffer. This is just one way to limit the hash size, there are others; I am not familiar enough with the TCP code to decide which is best. Thus, I would welcome the proposals of alternatives. [ 2 million is still too large, thus I've modified the limit in the change to be '512 * 1024'. -DaveM ] Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 00:59:25 -07:00
Stephen Rothwell	e08a132b0e	[SUNRPC] rpc_rdma: we need to cast u64 to unsigned long long for printing as some architectures have unsigned long for u64. net/sunrpc/xprtrdma/rpc_rdma.c: In function 'rpcrdma_create_chunks': net/sunrpc/xprtrdma/rpc_rdma.c:222: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/sunrpc/xprtrdma/rpc_rdma.c:234: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/sunrpc/xprtrdma/rpc_rdma.c: In function 'rpcrdma_count_chunks': net/sunrpc/xprtrdma/rpc_rdma.c:577: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64 Noticed on PowerPC pseries_defconfig build. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 00:44:32 -07:00
Mitsuru Chinen	064f3605be	[IPv4] SNMP: Refer correct memory location to display ICMP out-going statistics While displaying ICMP out-going statistics as Out<name> counters in /proc/net/snmp, the memory location for ICMP in-coming statistics was referred by mistake. Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-29 22:37:36 -07:00
David S. Miller	bf3c23d171	[NET]: Fix error reporting in sys_socketpair(). If either of the two sock_alloc_fd() calls fail, we forget to update 'err' and thus we'll erroneously return zero in these cases. Based upon a report and patch from Rich Paul, and commentary from Chuck Ebbert. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-29 22:37:34 -07:00
Andrew Morton	29b67497f2	[NETFILTER]: nf_ct_alloc_hashtable(): use __GFP_NOWARN This allocation is expected to fail and we handle it by fallback to vmalloc(). So don't scare people with nasty messages like http://bugzilla.kernel.org/show_bug.cgi?id=9190 Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-29 22:37:31 -07:00
David S. Miller	0a7606c121	[NET]: Fix race between poll_napi() and net_rx_action() netpoll_poll_lock() synchronizes the ->poll() invocation code paths, but once we have the lock we have to make sure that NAPI_STATE_SCHED is still set. Otherwise we get: cpu 0 cpu 1 net_rx_action() poll_napi() netpoll_poll_lock() ... spin on ->poll_lock ->poll() netif_rx_complete netpoll_poll_unlock() acquire ->poll_lock() ->poll() netif_rx_complete() CRASH Based upon a bug report from Tina Yang. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-29 22:37:28 -07:00
Matthias M. Dellweg	b0a713e9e6	[TCP] MD5: Remove some more unnecessary casting. while reviewing the tcp_md5-related code further i came across with another two of these casts which you probably have missed. I don't actually think that they impose a problem by now, but as you said we should remove them. Signed-off-by: Matthias M. Dellweg <2500@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-29 22:37:27 -07:00
Xiaoliang (David) Wei	c940587bf6	[TCP] vegas: Fix a bug in disabling slow start by gamma parameter. TCP Vegas implementation has a bug in the process of disabling slow-start with gamma parameter. The bug may lead to extreme unfairness in the presence of early packet loss. See details in: http://www.cs.caltech.edu/~weixl/technical/ns2linux/known_linux/index.html#vegas Switch the order of "if (tp->snd_cwnd <= tp->snd_ssthresh)" statement and "if (diff > gamma)" statement to eliminate the problem. Signed-off-by: Xiaoliang (David) Wei <davidwei79@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-29 22:37:25 -07:00
Andy Gospodarek	5c81833c2f	[IPVS]: use proper timeout instead of fixed value Instead of using the default timeout of 3 minutes, this uses the timeout specific to the protocol used for the connection. The 3 minute timeout seems somewhat arbitrary (though I know it is used other places in the ipvs code) and when failing over it would be much nicer to use one of the configured timeout values. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-29 22:37:23 -07:00
YOSHIFUJI Hideaki	ad02ac145d	[IPV6] NDISC: Fix setting base_reachable_time_ms variable. This bug was introduced by the commit `d12af679bc` (sysctl: fix neighbour table sysctls). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-29 22:37:22 -07:00
Al Viro	d06f608265	SCTP endianness annotations regression Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-29 07:41:32 -07:00
Al Viro	2d8a972661	SUNRPC endianness annotations rpcrdma stuff lacks endianness annotations for on-the-wire data. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-29 07:41:32 -07:00
Herbert Xu	68e3f5dd4d	[CRYPTO] users: Fix up scatterlist conversion errors This patch fixes the errors made in the users of the crypto layer during the sg_init_table conversion. It also adds a few conversions that were missing altogether. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-27 00:52:07 -07:00
Eric W. Biederman	ceaa79c434	[NETNS]: Fix get_net_ns_by_pid The pid namespace patches changed the semantics of find_task_by_pid without breaking the compile resulting in get_net_ns_by_pid doing the wrong thing. So switch to using the intended find_task_by_vpid. Combined with Denis' earlier patch to make netlink traffic fully synchronous the inadvertent race I introduced with accessing current is actually removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 22:56:12 -07:00
Eric W. Biederman	2b008b0a8e	[NET]: Marking struct pernet_operations __net_initdata was inappropriate It is not safe to to place struct pernet_operations in a special section. We need struct pernet_operations to last until we call unregister_pernet_subsys. Which doesn't happen until module unload. So marking struct pernet_operations is a disaster for modules in two ways. - We discard it before we call the exit method it points to. - Because I keep struct pernet_operations on a linked list discarding it for compiled in code removes elements in the middle of a linked list and does horrible things for linked insert. So this looks safe assuming __exit_refok is not discarded for modules. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 22:54:53 -07:00
Adrian Bunk	72998d8c84	[INET] ESP: Must #include <linux/scatterlist.h> This patch fixes the following compile errors in some configurations: <-- snip --> ... CC net/ipv4/esp4.o /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv4/esp4.c: In function 'esp_output': /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv4/esp4.c:113: error: implicit declaration of function 'sg_init_table' make[3]: * [net/ipv4/esp4.o] Error 1 ... /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv6/esp6.c: In function 'esp6_output': /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv6/esp6.c:112: error: implicit declaration of function 'sg_init_table' make[3]: * [net/ipv6/esp6.o] Error 1 <-- snip --> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 22:53:58 -07:00
Jeff Garzik	18134bed02	[TCP] IPV6: fix softnet build breakage net/ipv6/tcp_ipv6.c: In function 'tcp_v6_rcv': net/ipv6/tcp_ipv6.c:1736: error: implicit declaration of function 'get_softnet_dma' net/ipv6/tcp_ipv6.c:1736: warning: assignment makes pointer from integer without a cast Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 22:53:14 -07:00
Paul Moore	4be2700fb7	[NetLabel]: correct usage of RCU locking This fixes some awkward, and perhaps even problematic, RCU lock usage in the NetLabel code as well as some other related trivial cleanups found when looking through the RCU locking. Most of the changes involve removing the redundant RCU read locks wrapping spinlocks in the case of a RCU writer. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 04:29:08 -07:00
Ryousei Takano	94d3b1e586	[TCP]: fix D-SACK cwnd handling In the current net-2.6 kernel, handling FLAG_DSACKING_ACK is broken. The flag is cleared to 1 just after FLAG_DSACKING_ACK is set. if (found_dup_sack) flag \|= FLAG_DSACKING_ACK; : flag = 1; To fix it, this patch introduces a part of the tcp_sacktag_state patch: http://marc.info/?l=linux-netdev&m=119210560431519&w=2 Signed-off-by: Ryousei Takano <takano-ryousei@aist.go.jp> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 04:27:59 -07:00
Adrian Bunk	8ad7c62b75	[SCTP] net/sctp/auth.c: make 3 functions static This patch makes three needlessly global functions static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 04:21:23 -07:00
David S. Miller	b4caea8aa8	[TCP]: Add missing I/O AT code to ipv6 side. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 04:20:13 -07:00
Adrian Bunk	d84d64dcb3	[SCTP]: #if 0 sctp_update_copy_cksum() sctp_update_copy_cksum() is no longer used. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 04:07:20 -07:00
Adrian Bunk	39296ed669	[INET]: Unexport icmpmsg_statistics This patch removes the unused EXPORT_SYMBOL(icmpmsg_statistics). Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 04:06:08 -07:00
Adrian Bunk	bbbb1a812d	[NET]: Unexport sock_enable_timestamp(). sock_enable_timestamp() no longer has any modular users. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 03:59:45 -07:00
Adrian Bunk	0f79efdc23	[TCP]: Make tcp_match_skb_to_sack() static. tcp_match_skb_to_sack() can become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 03:57:36 -07:00
Adrian Bunk	d76081f875	[IRDA]: Make ircomm_tty static. ircomm_tty can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 03:56:43 -07:00
Stephen Hemminger	c8d90dca32	[NET] dev_change_name: ignore changes to same name Prevent error/backtrace from dev_rename() when changing name of network device to the same name. This is a common situation with udev and other scripts that bind addr to device. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 03:53:42 -07:00
David S. Miller	8c56a347c1	Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6	2007-10-26 03:50:02 -07:00
Jamal Hadi Salim	a057ae3c10	[NET_CLS_ACT]: Use skb_act_clone clean skb_clone of any signs of CONFIG_NET_CLS_ACT and have mirred us skb_act_clone() Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 02:47:54 -07:00
David S. Miller	c7da57a183	[TCP]: Fix scatterlist handling in MD5 signature support. Use sg_init_table() and sg_mark_end() as needed. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 00:41:21 -07:00
David S. Miller	0e0940d4bb	[IPSEC]: Fix scatterlist handling in skb_icv_walk(). Use sg_init_one() and sg_init_table() as needed. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 00:39:27 -07:00
David S. Miller	ed0e7e0ca3	[IPSEC]: Add missing sg_init_table() calls to ESP. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 00:38:39 -07:00
Ryousei Takano	564262c1f0	[TCP]: Fix inconsistency of terms. Fix inconsistency of terms: 1) D-SACK 2) F-RTO Signed-off-by: Ryousei Takano <takano-ryousei@aist.go.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-25 23:03:52 -07:00
David S. Miller	4bc3e17cce	Merge branch 'fixes-davem' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6	2007-10-25 22:49:43 -07:00
Johannes Berg	ddd68587d0	[PATCH] mac80211: fix printk warning on 64-bit My AID message patch introduced a warning on 64-bit machines because ~ extends to unsigned long: \| net/mac80211/ieee80211_sta.c: In function ‘ieee80211_rx_mgmt_assoc_resp’: \| net/mac80211/ieee80211_sta.c:1187: warning: format ‘%d’ expects type ‘int’, but argument 7 has type ‘long unsigned int’ This fixes it by explicitly casting the result to u16 (which 'aid' is). Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-26 00:14:29 -04:00
Michael Wu	48225709be	[PATCH] mac80211: Fix SSID matching in AP selection The length of the SSID desired should also be compared in addition to the memcmp of the SSIDs. Thanks to Andrea Merello <andreamrl@tiscali.it> for finding this issue. Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-25 22:32:05 -04:00
Vlad Yasevich	fee9dee730	[UDP]: Make use of inet_iif() when doing socket lookups. UDP currently uses skb->dev->ifindex which may provide the wrong information when the socket bound to a specific interface. This patch makes inet_iif() accessible to UDP and makes UDP use it. The scenario we are trying to fix is when a client is running on the same system and the server and both client and server bind to a non-loopback device. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-25 18:54:46 -07:00
David S. Miller	8a6911b12f	[IPV4]: Remove no longer used snmp4_icmp_list. This was obsoleted by a previous change, but the removal was forgotten. Reported by David Howells and David Stevens. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-25 18:40:05 -07:00
Linus Torvalds	06dbbfef82	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [IPV4]: Explicitly call fib_get_table() in fib_frontend.c [NET]: Use BUILD_BUG_ON in net/core/flowi.c [NET]: Remove in-code externs for some functions from net/core/dev.c [NET]: Don't declare extern variables in net/core/sysctl_net_core.c [TCP]: Remove unneeded implicit type cast when calling tcp_minshall_update() [NET]: Treat the sign of the result of skb_headroom() consistently [9P]: Fix missing unlock before return in p9_mux_poll_start [PKT_SCHED]: Fix sch_prio.c build with CONFIG_NETDEVICES_MULTIQUEUE [IPV4] ip_gre: sendto/recvfrom NBMA address [SCTP]: Consolidate sctp_ulpq_renege_xxx functions [NETLINK]: Fix ACK processing after netlink_dump_start [VLAN]: MAINTAINERS update [DCCP]: Implement SIOCINQ/FIONREAD [NET]: Validate device addr prior to interface-up	2007-10-25 15:50:32 -07:00
Gerrit Renker	24c667db59	[CCID2/3]: Initialisation assignments of 0 are redundant Assigning initial values of `0' is redundant when loading a new CCID structure, since in net/dccp/ccid.c the entire CCID structure is zeroed out prior to initialisation in ccid_new(): struct ccid { struct ccid_operations ccid_ops; char ccid_priv[0]; }; // ... if (rx) { memset(ccid + 1, 0, ccid_ops->ccid_hc_rx_obj_size); if (ccid->ccid_ops->ccid_hc_rx_init != NULL && ccid->ccid_ops->ccid_hc_rx_init(ccid, sk) != 0) goto out_free_ccid; } else { memset(ccid + 1, 0, ccid_ops->ccid_hc_tx_obj_size); / analogous to the rx case */ } This patch therefore removes the redundant assignments. Thanks to Arnaldo for the inspiration. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:53:01 -02:00
Gerrit Renker	76fd1e87d9	[DCCP]: Unaligned pointer access This fixes `unaligned (read) access' errors of the type Kernel unaligned access at TPC[100f970c] dccp_parse_options+0x4f4/0x7e0 [dccp] Kernel unaligned access at TPC[1011f2e4] ccid3_hc_tx_parse_options+0x1ac/0x380 [dccp_ccid3] Kernel unaligned access at TPC[100f9898] dccp_parse_options+0x680/0x880 [dccp] by using the get_unaligned macro for parsing options. Commiter note: Preserved the sparse __be{16,32} annotations. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:46:58 -02:00
Gerrit Renker	d8ef2c29a0	[DCCP]: Convert Reset code into socket error number This adds support for converting the 11 currently defined Reset codes into system error numbers, which are stored in sk_err for further interpretation. This makes the externally visible API behaviour similar to TCP, since a client connecting to a non-existing port will experience ECONNREFUSED. * Code 0, Unspecified, is interpreted as non-error (0); * Code 1, Closed (normal termination), also maps into 0; * Code 2, Aborted, maps into "Connection reset by peer" (ECONNRESET); * Code 3, No Connection and Code 7, Connection Refused, map into "Connection refused" (ECONNREFUSED); * Code 4, Packet Error, maps into "No message of desired type" (ENOMSG); * Code 5, Option Error, maps into "Illegal byte sequence" (EILSEQ); * Code 6, Mandatory Error, maps into "Operation not supported on transport endpoint" (EOPNOTSUPP); * Code 8, Bad Service Code, maps into "Invalid request code" (EBADRQC); * Code 9, Too Busy, maps into "Too many users" (EUSERS); * Code 10, Bad Init Cookie, maps into "Invalid request descriptor" (EBADR); * Code 11, Aggression Penalty, maps into "Quota exceeded" (EDQUOT) which makes sense in terms of using more than the `fair share' of bandwidth. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:27:48 -02:00
Gerrit Renker	1238d0873b	[DCCP]: One more exemption from full sequence number checks This fixes the following problem: client connects to peer which has no DCCP enabled or loaded; ICMP error messages ("Protocol Unavailable") can be seen on the wire, but the application hangs. Reason: ICMP packets don't get through to dccp_v4_err. When reporting errors, a sequence number check is made for the DCCP packet that had caused an ICMP error to arrive. Such checks can not be made if the socket is in state LISTEN, RESPOND (which in the implementation is the same as LISTEN), or REQUEST, since update_gsr() has not been called in these states, hence the sequence window is 0..0. This patch fixes the problem by adding the REQUEST state as another exemption to the window check. The error reporting now works as expected on connecting. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-24 10:18:06 -02:00
Gerrit Renker	fde20105f3	[DCCP]: Retrieve packet sequence number for error reporting This fixes a problem when analysing erroneous packets in dccp_v{4,6}_err: * dccp_hdr_seq currently takes an skb * however, the transport headers in the skb are shifted, due to the preceding IPv4/v6 header. Fixed for v4 and v6 by changing dccp_hdr_seq to take a struct dccp_hdr as argument. Verified that the correct sequence number is now reported in the error handler. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:12:09 -02:00
Jens Axboe	642f149031	SG: Change sg_set_page() to take length and offset argument Most drivers need to set length and offset as well, so may as well fold those three lines into one. Add sg_assign_page() for those two locations that only needed to set the page, where the offset/length is set outside of the function context. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-24 11:20:47 +02:00
Geert Uytterhoeven	5a1cb47ff4	m68k: sg fallout Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>	2007-10-24 08:55:40 +02:00
Pavel Emelyanov	03cf786c4e	[IPV4]: Explicitly call fib_get_table() in fib_frontend.c In case the "multiple tables" config option is y, the ip_fib_local_table is not a variable, but a macro, that calls fib_get_table(RT_TABLE_LOCAL). Some code uses this "variable" 3 times in one place, thus implicitly making 3 calls. Fix it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:58 -07:00
Pavel Emelyanov	f0fe91ded3	[NET]: Use BUILD_BUG_ON in net/core/flowi.c Instead of ugly extern not-existing function. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:57 -07:00
Pavel Emelyanov	342709efc7	[NET]: Remove in-code externs for some functions from net/core/dev.c Inconsistent prototype and real type for functions may have worse consequences, than those for variables, so move them into a header. Since they are used privately in net/core, make this file reside in the same place. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:56 -07:00
Pavel Emelyanov	a37ae4086e	[NET]: Don't declare extern variables in net/core/sysctl_net_core.c Some are already declared in include/linux/netdevice.h, while some others (xfrm ones) need to be declared. The driver/net/rrunner.c just uses same extern as well, so cleanup it also. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:56 -07:00
Chuck Lever	c2636b4d9e	[NET]: Treat the sign of the result of skb_headroom() consistently In some places, the result of skb_headroom() is compared to an unsigned integer, and in others, the result is compared to a signed integer. Make the comparisons consistent and correct. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:55 -07:00
Roel Kluin	0ffdd58149	[9P]: Fix missing unlock before return in p9_mux_poll_start Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:54 -07:00
Pavel Emelyanov	0034622693	[PKT_SCHED]: Fix sch_prio.c build with CONFIG_NETDEVICES_MULTIQUEUE Fix one more user of netiff_subqueue_stopped. To check for the queue id one must use the __netiff_subqueue_stoped call. This run out of my sight when I made the: `668f895a85` [NET]: Hide the queue_mapping field inside netif_subqueue_stopped commit :( Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:53 -07:00
Timo Teras	6a5f44d7a0	[IPV4] ip_gre: sendto/recvfrom NBMA address When GRE tunnel is in NBMA mode, this patch allows an application to use a PF_PACKET socket to: - send a packet to specific NBMA address with sendto() - use recvfrom() to receive packet and check which NBMA address it came from This is required to implement properly NHRP over GRE tunnel. Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:53 -07:00
Pavel Emelyanov	16d14ef9f2	[SCTP]: Consolidate sctp_ulpq_renege_xxx functions Both are equal, except for the list to be traversed. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:52 -07:00
Denis V. Lunev	5c58298c25	[NETLINK]: Fix ACK processing after netlink_dump_start Revert to original netlink behavior. Do not reply with ACK if the netlink dump has bees successfully started. libnl has been broken by the `cd40b7d398` The following command reproduce the problem: /nl-route-get 192.168.1.1 Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:51 -07:00
Arnaldo Carvalho de Melo	6273172e17	[DCCP]: Implement SIOCINQ/FIONREAD Just like UDP. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Leandro Melo de Sales <leandroal@gmail.com> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:50 -07:00
Jeff Garzik	bada339ba2	[NET]: Validate device addr prior to interface-up Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:50 -07:00
Linus Torvalds	01e7ae8c13	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: v9fs_vfs_rename incorrect clunk order 9p: fix memleak in fs/9p/v9fs.c 9p: add virtio transport	2007-10-23 12:04:01 -07:00
Ralf Baechle	117636092a	[PATCH] Fix breakage after SG cleanups Commits `58b053e4ce` ("Update arch/ to use sg helpers") `45711f1af6` ("[SG] Update drivers to use sg helpers") `fa05f1286b` ("Update net/ to use sg helpers") converted many files to use the scatter gather helpers without ensuring that the necessary headerfile <linux/scatterlist> is included. This happened to work for ia64, powerpc, sparc64 and x86 because they happened to drag in that file via their <asm/dma-mapping.h>. On most of the others this probably broke. Instead of increasing the header file spider web I choose to include <linux/scatterlist.h> directly into the affectes files. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-23 12:02:39 -07:00
Eric Van Hensbergen	b530cc7940	9p: add virtio transport This adds a transport to 9p for communicating between guests and a host using a virtio based transport. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-23 13:47:31 -05:00
Christian Borntraeger	ebc3bbcfcf	Fix sctp compile sctp fails to compile with net/sctp/sm_make_chunk.c: In function 'sctp_pack_cookie': net/sctp/sm_make_chunk.c:1516: error: implicit declaration of function 'sg_init_table' net/sctp/sm_make_chunk.c:1517: error: implicit declaration of function 'sg_set_page' use the proper include file. SCTP maintainers Vlad Yasevich and Sridhar Samudrala are CCed. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-23 12:46:32 +02:00
Heiko Carstens	b3b724f48c	net: fix xfrm build - missing scatterlist.h include net/xfrm/xfrm_algo.c: In function 'skb_icv_walk': net/xfrm/xfrm_algo.c:555: error: implicit declaration of function 'sg_set_page' make[2]: *** [net/xfrm/xfrm_algo.o] Error 1 Cc: David Miller <davem@davemloft.net> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-23 09:49:27 +02:00
Linus Torvalds	f09cc910fe	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits) [IPSEC] IPV6: Fix to add tunnel mode SA correctly. [NET]: Cut off the queue_mapping field from sk_buff [NET]: Hide the queue_mapping field inside netif_subqueue_stopped [NET]: Make and use skb_get_queue_mapping [NET]: Use the skb_set_queue_mapping where appropriate [INET]: Use MODULE_ALIAS_NET_PF_PROTO_TYPE where possible. [INET]: Let inet_diag and friends autoload [NIU]: Cleanup PAGE_SIZE checks a bit [NET]: Fix SKB_WITH_OVERHEAD calculation [ATM]: Fix clip module reload crash. [TG3]: Update version to 3.85 [TG3]: PCI command adjustment [TG3]: Add management FW version to ethtool report [TG3]: Add 5723 support [Bluetooth] Convert RFCOMM to use kthread API [Bluetooth] Add constant for Bluetooth socket options level [Bluetooth] Add support for handling simple eSCO links [Bluetooth] Add address and channel attribute to RFCOMM TTY device [Bluetooth] Fix wrong argument in debug code of HIDP [Bluetooth] Add generic driver for Bluetooth USB devices ...	2007-10-22 19:22:33 -07:00
Jens Axboe	fa05f1286b	Update net/ to use sg helpers Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-22 21:19:56 +02:00
Masahide NAKAMURA	ea2c47b42f	[IPSEC] IPV6: Fix to add tunnel mode SA correctly. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:58 -07:00
Pavel Emelyanov	668f895a85	[NET]: Hide the queue_mapping field inside netif_subqueue_stopped Many places get the queue_mapping field from skb to pass it to the netif_subqueue_stopped() which will be 0 in any case. Make the helper that works with sk_buff Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:56 -07:00
Pavel Emelyanov	4e3ab47a54	[NET]: Make and use skb_get_queue_mapping Make the helper for getting the field, symmetrical to the "set" one. Return 0 if CONFIG_NETDEVICES_MULTIQUEUE=n Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:56 -07:00
Pavel Emelyanov	dfa4091129	[NET]: Use the skb_set_queue_mapping where appropriate There's already such a helper to initialize this field. Use it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:55 -07:00
Jean Delvare	7131c6c736	[INET]: Use MODULE_ALIAS_NET_PF_PROTO_TYPE where possible. Now that we have this new macro, use it where possible. Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:54 -07:00
Jean Delvare	305e1e9691	[INET]: Let inet_diag and friends autoload By adding module aliases to inet_diag, tcp_diag and dccp_diag, we let them load automatically as needed. This makes tools like "ss" run faster. Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:54 -07:00
Randy Dunlap	bfb85c9f75	[ATM]: Fix clip module reload crash. net/atm/clip.c crashes the kernel if it (module) is loaded, removed, and then loaded again. Its exit call to neigh_table_clear() should destroy the cache after freeing it. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:52 -07:00
Marcel Holtmann	a524eccc73	[Bluetooth] Convert RFCOMM to use kthread API This patch does the full kthread conversion for the RFCOMM protocol. It makes the code slightly simpler and more maintainable. Based on a patch from Christoph Hellwig <hch@lst.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:49 -07:00
Marcel Holtmann	b6a0dc8224	[Bluetooth] Add support for handling simple eSCO links With the Bluetooth 1.2 specification the Extended SCO feature for better audio connections was introduced. So far the Bluetooth core wasn't able to handle any eSCO connections correctly. This patch adds simple eSCO support while keeping backward compatibility with older devices. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:47 -07:00
Marcel Holtmann	dae6a0f663	[Bluetooth] Add address and channel attribute to RFCOMM TTY device Export the remote device address and channel of RFCOMM TTY device via sysfs attributes. This allows udev to create better naming rules for configured RFCOMM devices. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:47 -07:00
Dave Young	6792b5ec8d	[Bluetooth] Fix wrong argument in debug code of HIDP In the debug code of the hidp_queue_report function, the device variable does not exist, replace it with session->hid. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:46 -07:00
Marcel Holtmann	6464f35f37	[Bluetooth] Fall back to L2CAP in basic mode In case the remote entity tries to negogiate retransmission or flow control mode, reject it and fall back to basic mode. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:43 -07:00
Marcel Holtmann	f0709e03ac	[Bluetooth] Advertise L2CAP features mask support Indicate the support for the L2CAP features mask value when the remote entity tries to negotiate Bluetooth 1.2 specific features. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:42 -07:00
Marcel Holtmann	4e8402a3f8	[Bluetooth] Retrieve L2CAP features mask on connection setup The Bluetooth 1.2 specification introduced a specific features mask value to interoperate with newer versions of the specification. So far this piece of information was never needed, but future extensions will rely on it. This patch adds a generic way to retrieve this information only once per connection setup. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:41 -07:00
Marcel Holtmann	861d6882b3	[Bluetooth] Remove global conf_mtu variable from L2CAP After the change to the L2CAP configuration parameter handling the global conf_mtu variable is no longer needed and so remove it. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:41 -07:00
Marcel Holtmann	876d9484ed	[Bluetooth] Finish L2CAP configuration only with acceptable settings The parameters of the L2CAP output configuration might not be accepted after the first configuration round. So only indicate a finished output configuration when acceptable settings are provided. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:40 -07:00
Marcel Holtmann	a9de924806	[Bluetooth] Switch from OGF+OCF to using only opcodes The Bluetooth HCI commands are divided into logical OGF groups for easier identification of their purposes. While this still makes sense for the written specification, its makes the code only more complex and harder to read. So instead of using separate OGF and OCF values to identify the commands, use a common 16-bit opcode that combines both values. As a side effect this also reduces the complexity of OGF and OCF calculations during command header parsing. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:40 -07:00
Matt LaPlante	01dd2fbf0d	typo fixes Most of these fixes were already submitted for old kernel versions, and were approved, but for some reason they never made it into the releases. Because this is a consolidation of a couple old missed patches, it touches both Kconfigs and documentation texts. Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-20 01:34:40 +02:00
Jean Delvare	c03983ac9b	Spelling fix: explicitly From: Jean Delvare <khali@linux-fr.org> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-19 23:22:55 +02:00
Marcin Garski	db955170d4	more UTF-8 conversions Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-19 23:22:11 +02:00
Jan Engelhardt	96de0e252c	Convert files to UTF-8 and some cleanups * Convert files to UTF-8. * Also correct some people's names (one example is Eißfeldt, which was found in a source file. Given that the author used an ß at all in a source file indicates that the real name has in fact a 'ß' and not an 'ss', which is commonly used as a substitute for 'ß' when limited to 7bit.) * Correct town names (Goettingen -> Göttingen) * Update Eberhard Mönkeberg's address (http://lkml.org/lkml/2007/1/8/313) Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-19 23:21:04 +02:00
Robert P. J. Day	3a4fa0a25d	Fix misspellings of "system", "controller", "interrupt" and "necessary". Fix the various misspellings of "system", controller", "interrupt" and "[un]necessary". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-19 23:10:43 +02:00
Linus Torvalds	804b908adf	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: Fix possible dev_deactivate race condition [INET]: Justification for local port range robustness. [PACKET]: Kill unused pg_vec_endpage() function [NET]: QoS/Sched as menuconfig [NET]: Fix bug in sk_filter race cures. [PATCH] mac80211: make ieee802_11_parse_elems return void	2007-10-19 11:54:39 -07:00
Pavel Emelyanov	6651fd561b	Use task_pid_nr() in ip_vs_sync.c The sync_master_pid and sync_backup_pid are set in set_sync_pid() and are used later for set/not-set checks and in printk. So it is safe to use the global pid value in this case. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:43 -07:00
Pavel Emelyanov	ba25f9dcc4	Use helpers to obtain task pid in printks The task_struct->pid member is going to be deprecated, so start using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in the kernel. The first thing to start with is the pid, printed to dmesg - in this case we may safely use task_pid_nr(). Besides, printks produce more (much more) than a half of all the explicit pid usage. [akpm@linux-foundation.org: git-drm went and changed lots of stuff] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:43 -07:00
Jiri Slaby	93043ece03	define global BIT macro define global BIT macro move all local BIT defines to the new globally define macro. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@gate.crashing.org> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: Jeff Garzik <jeff@garzik.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Russell King <rmk@arm.linux.org.uk> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: "John W. Linville" <linville@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:42 -07:00
Jiri Slaby	7b19ada2ed	get rid of input BIT* duplicate defines get rid of input BIT* duplicate defines use newly global defined macros for input layer. Also remove includes of input.h from non-input sources only for BIT macro definiton. Define the macro temporarily in local manner, all those local definitons will be removed further in this patchset (to not break bisecting). BIT macro will be globally defined (1<<x) Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: <dtor@mail.ru> Acked-by: Jiri Kosina <jkosina@suse.cz> Cc: <lenb@kernel.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Cc: <perex@suse.cz> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: <vernux@us.ibm.com> Cc: <malattia@linux.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:42 -07:00
Jiri Slaby	1977f03272	remove asm/bitops.h includes remove asm/bitops.h includes including asm/bitops directly may cause compile errors. don't include it and include linux/bitops instead. next patch will deny including asm header directly. Cc: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:41 -07:00
Pavel Emelyanov	b488893a39	pid namespaces: changes to show virtual ids to user This is the largest patch in the set. Make all (I hope) the places where the pid is shown to or get from user operate on the virtual pids. The idea is: - all in-kernel data structures must store either struct pid itself or the pid's global nr, obtained with pid_nr() call; - when seeking the task from kernel code with the stored id one should use find_task_by_pid() call that works with global pids; - when showing pid's numerical value to the user the virtual one should be used, but however when one shows task's pid outside this task's namespace the global one is to be used; - when getting the pid from userspace one need to consider this as the virtual one and use appropriate task/pid-searching functions. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: nuther build fix] [akpm@linux-foundation.org: yet nuther build fix] [akpm@linux-foundation.org: remove unneeded casts] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:40 -07:00
Pavel Emelyanov	cf7b708c8d	Make access to task's nsproxy lighter When someone wants to deal with some other taks's namespaces it has to lock the task and then to get the desired namespace if the one exists. This is slow on read-only paths and may be impossible in some cases. E.g. Oleg recently noticed a race between unshare() and the (sent for review in cgroups) pid namespaces - when the task notifies the parent it has to know the parent's namespace, but taking the task_lock() is impossible there - the code is under write locked tasklist lock. On the other hand switching the namespace on task (daemonize) and releasing the namespace (after the last task exit) is rather rare operation and we can sacrifice its speed to solve the issues above. The access to other task namespaces is proposed to be performed like this: rcu_read_lock(); nsproxy = task_nsproxy(tsk); if (nsproxy != NULL) { / * * work with the namespaces here * e.g. get the reference on one of them * / } / * * NULL task_nsproxy() means that this task is * almost dead (zombie) * / rcu_read_unlock(); This patch has passed the review by Eric and Oleg :) and, of course, tested. [clg@fr.ibm.com: fix unshare()] [ebiederm@xmission.com: Update get_net_ns_by_pid] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Herbert Xu	ce0e32e65f	[NET]: Fix possible dev_deactivate race condition The function dev_deactivate is supposed to only return when all outstanding transmissions have completed. Unfortunately it is possible for store operations in the driver's transmit function to only become visible after dev_deactivate returns. This patch fixes this by taking the queue lock after we see the end of the queue run. This ensures that all effects of any previous transmit calls are visible. If however we detect that there is another queue run occuring, then we'll warn about it because this should never happen as we have pointed dev->qdisc to noop_qdisc within the same queue lock earlier in the functino. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 22:37:58 -07:00
Anton Arapov	a25de534f8	[INET]: Justification for local port range robustness. There is a justifying patch for Stephen's patches. Stephen's patches disallows using a port range of one single port and brakes the meaning of the 'remaining' variable, in some places it has different meaning. My patch gives back the sense of 'remaining' variable. It should mean how many ports are remaining and nothing else. Also my patch allows using a single port. I sure we must be able to use mentioned port range, this does not restricted by documentation and does not brake current behavior. usefull links: Patches posted by Stephen Hemminger http://marc.info/?l=linux-netdev&m=119206106218187&w=2 http://marc.info/?l=linux-netdev&m=119206109918235&w=2 Andrew Morton's comment http://marc.info/?l=linux-kernel&m=119248225007737&w=2 1. Allows using a port range of one single port. 2. Gives back sense of 'remaining' variable. Signed-off-by: Anton Arapov <aarapov@redhat.com> Acked-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 22:00:17 -07:00
Patrick McHardy	be702d5e38	[PACKET]: Kill unused pg_vec_endpage() function The conversion to vm_insert_page() left this unused function behind, remove it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 21:58:19 -07:00
David S. Miller	70180659a4	Merge branch 'fixes-davem' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6	2007-10-18 21:57:27 -07:00
Randy Dunlap	85ef3e5cad	[NET]: QoS/Sched as menuconfig Convert "QoS and/or fair queueing" to menuconfig. This makes it easy for someone to disable all sub-options with one config symbol. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 21:56:38 -07:00
Olof Johansson	9b013e05e0	[NET]: Fix bug in sk_filter race cures. Looks like this might be causing problems, at least for me on ppc. This happened during a normal boot, right around first interface config/dhcp run.. cpu 0x0: Vector: 300 (Data Access) at [c00000000147b820] pc: c000000000435e5c: .sk_filter_delayed_uncharge+0x1c/0x60 lr: c0000000004360d0: .sk_attach_filter+0x170/0x180 sp: c00000000147baa0 msr: 9000000000009032 dar: 4 dsisr: 40000000 current = 0xc000000004780fa0 paca = 0xc000000000650480 pid = 1295, comm = dhclient3 0:mon> t [c00000000147bb20] c0000000004360d0 .sk_attach_filter+0x170/0x180 [c00000000147bbd0] c000000000418988 .sock_setsockopt+0x788/0x7f0 [c00000000147bcb0] c000000000438a74 .compat_sys_setsockopt+0x4e4/0x5a0 [c00000000147bd90] c00000000043955c .compat_sys_socketcall+0x25c/0x2b0 [c00000000147be30] c000000000007508 syscall_exit+0x0/0x40 --- Exception: c01 (System Call) at 000000000ff618d8 SP (fffdf040) is in userspace 0:mon> I.e. null pointer deref at sk_filter_delayed_uncharge+0x1c: 0:mon> di $.sk_filter_delayed_uncharge c000000000435e40 7c0802a6 mflr r0 c000000000435e44 fbc1fff0 std r30,-16(r1) c000000000435e48 7c8b2378 mr r11,r4 c000000000435e4c ebc2cdd0 ld r30,-12848(r2) c000000000435e50 f8010010 std r0,16(r1) c000000000435e54 f821ff81 stdu r1,-128(r1) c000000000435e58 380300a4 addi r0,r3,164 c000000000435e5c 81240004 lwz r9,4(r4) That's the deref of fp: static void sk_filter_delayed_uncharge(struct sock sk, struct sk_filter fp) { unsigned int size = sk_filter_len(fp); ... That is called from sk_attach_filter(): ... rcu_read_lock_bh(); old_fp = rcu_dereference(sk->sk_filter); rcu_assign_pointer(sk->sk_filter, fp); rcu_read_unlock_bh(); sk_filter_delayed_uncharge(sk, old_fp); return 0; ... So, looks like rcu_dereference() returned NULL. I don't know the filter code at all, but it seems like it might be a valid case? sk_detach_filter() seems to handle a NULL sk_filter, at least. So, this needs review by someone who knows the filter, but it fixes the problem for me: Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 21:48:39 -07:00
Linus Torvalds	a57793651f	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (51 commits) [IPV6]: Fix again the fl6_sock_lookup() fixed locking [NETFILTER]: nf_conntrack_tcp: fix connection reopening fix [IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labels [IPV6]: Lost locking in fl6_sock_lookup [IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_list [NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required [NET]: Fix OOPS due to missing check in dev_parse_header(). [TCP]: Remove lost_retrans zero seqno special cases [NET]: fix carrier-on bug? [NET]: Fix uninitialised variable in ip_frag_reasm() [IPSEC]: Rename mode to outer_mode and add inner_mode [IPSEC]: Disallow combinations of RO and AH/ESP/IPCOMP [IPSEC]: Use the top IPv4 route's peer instead of the bottom [IPSEC]: Store afinfo pointer in xfrm_mode [IPSEC]: Add missing BEET checks [IPSEC]: Move type and mode map into xfrm_state.c [IPSEC]: Fix length check in xfrm_parse_spi [IPSEC]: Move ip_summed zapping out of xfrm6_rcv_spi [IPSEC]: Get nexthdr from caller in xfrm6_rcv_spi [IPSEC]: Move tunnel parsing for IPv4 out of xfrm4_input ...	2007-10-18 14:40:30 -07:00
Eric W. Biederman	f429cd37a2	sysctl: properly register the irda binary sysctl numbers Grumble. These numbers should have been in sysctl.h from the beginning if we ever expected anyone to use them. Oh well put them there now so we can find them and make maintenance easier. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:23 -07:00
Eric W. Biederman	064b5bba0c	sysctl: remove broken netfilter binary sysctls No one has bothered to set strategy routine for the the netfilter sysctls that return jiffies to be sysctl_jiffies. So it appears the sys_sysctl path is unused and untested, so this patch removes the binary sysctl numbers. Which fixes the netfilter oops in 2.6.23-rc2-mm2 for me. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:23 -07:00
Eric W. Biederman	49641b58a7	sysctl: ipv4 remove binary sysctl paths where they are broken Currently tcp_available_congestion_control does not even attempt being read from sys_sysctl, and ipfrag_max_dist while it works allows setting of invalid values using sys_sysctl. So just kill the binary sys_sysctl support for these sysctls. If the support is not important enough to test and get right it probably isn't important enough to keep. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:23 -07:00
Eric W. Biederman	baa3a2a0d2	sysctl: remove broken sunrpc debug binary sysctls This is debug code so no need to support binary sysctl, and the binary sysctls as they were written were not consistent with what showed up in /proc so remove the binary sysctl support. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: "David S. Miller" <davem@davemloft.net> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:22 -07:00
Eric W. Biederman	428b367bff	sysctl: ipv6 route flushing (kill binary path) We don't preoperly support the sysctl binary path for flushing the ipv6 routes. So remove support for a binary path. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:22 -07:00
Eric W. Biederman	d12af679bc	sysctl: fix neighbour table sysctls. - In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary sysctl names for a function that works with proc. - In neighbour.c reorder the table to put the possibly unused entries at the end so we can remove them by terminating the table early. - In neighbour.c kill the entries with questionable binary sysctl handling behavior. - In neighbour.c if we don't have a strategy routine remove the binary path. So we don't the default sysctl strategy routine on data that is not ready for it. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:22 -07:00
John W. Linville	67a4cce4a8	[PATCH] mac80211: make ieee802_11_parse_elems return void Some APs send management frames with junk padding after the last IE. We already account for a similar problem with some Apple Airport devices, but at least one device is known to send more than a single extra byte. The device in question is the Draytek Vigor2900: http://www.draytek.com.au/products/Vigor2900.php The junk in question looks like an IE that runs off the end of the frame. This cause us to return ParseFailed. Since the frame in question is an association response, this causes us to fail to associate with this AP. The return code from ieee802_11_parse_elems is superfluous. All callers still check for the presence of the specific IEs that interest them anyway. So, remove the return code so the parse never "fails". Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-18 14:36:18 -04:00
Pavel Emelyanov	52f095ee88	[IPV6]: Fix again the fl6_sock_lookup() fixed locking YOSHIFUJI fairly pointed out, that the users increment should be done under the ip6_sk_fl_lock not to give IPV6_FL_A_PUT a chance to put this count to zero and release the flowlabel. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:38:48 -07:00
Jozsef Kadlecsik	bc34b84155	[NETFILTER]: nf_conntrack_tcp: fix connection reopening fix If one side aborts an established connection, the entry still lingers for 10s in conntrack for the late packets. Allow to open up the connection again for the party which sent the RST packet. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:20:12 -07:00
Pavel Emelyanov	78c2e50253	[IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labels In the IPV6_FL_A_GET case the hash is checked for flowlabels with the given label. If it is not found, the lock, protecting the hash, is dropped to be re-get for writing. After this a newly allocated entry is inserted, but no checks are performed to catch a classical SMP race, when the conflicting label may be inserted on another cpu. Use the (currently unused) return value from fl_intern() to return the conflicting entry (if found) and re-check, whether we can reuse it (IPV6_FL_F_EXCL) or return -EEXISTS. Also add the comment, about why not re-lookup the current sock for conflicting flowlabel entry. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:18:56 -07:00
Pavel Emelyanov	bd0bf57700	[IPV6]: Lost locking in fl6_sock_lookup This routine scans the ipv6_fl_list whose update is protected with the socket lock and the ip6_sk_fl_lock. Since the socket lock is not taken in the lookup, use the other one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:15:57 -07:00
Pavel Emelyanov	04028045a1	[IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_list The new flowlabels should be inserted into the sock list under the ip6_sk_fl_lock. This was lost in one place. This list is naturally protected with the socket lock, but the fl6_sock_lookup() is called without it, so another protection is required. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:14:58 -07:00
Li Zefan	009e8c965f	[NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required Macros like SCTP_CHUNKMAP_XXX(chukmap) require chukmap to be an array, but match_packet() passes a pointer to these macros. Also remove the ELEMCOUNT macro and fix a bug in SCTP_CHUNKMAP_COPY. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:12:21 -07:00
Ilpo Järvinen	df2e014bfb	[TCP]: Remove lost_retrans zero seqno special cases Both high-sack detection and new lowest seq variables have unnecessary zero special case which are now removed by setting safe initial seqnos. This also fixes problem which caused zero received_upto being passed to tcp_mark_lost_retrans which confused after relations within the marker loop causing incorrect TCPCB_SACKED_RETRANS clearing. The problem was noticed because of a performance report from TAKANO Ryousei <takano@axe-inc.co.jp>. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Acked-by: Ryousei Takano <takano-ryousei@aist.go.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:07:57 -07:00
Jeff Garzik	bfaae0f04c	[NET]: fix carrier-on bug? While looking at a net driver with the following construct, if (!netif_carrier_ok(dev)) netif_carrier_on(dev); it stuck me that the netif_carrier_ok() check was redundant, since netif_carrier_on() checks bit __LINK_STATE_NOCARRIER anyway. This is the same reason why netif_queue_stopped() need not be called prior to netif_wake_queue(). This is true, but there is however an unwanted side effect from assuming that netif_carrier_on() can be called multiple times: it touches the watchdog, regardless of pre-existing carrier state. The fix: move watchdog-up inside the bit-cleared code path. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 23:26:43 -07:00
David Howells	45542479fb	[NET]: Fix uninitialised variable in ip_frag_reasm() Fix uninitialised variable in ip_frag_reasm(). err should be set to -ENOMEM if the initial call of skb_clone() fails. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:37:22 -07:00
Herbert Xu	13996378e6	[IPSEC]: Rename mode to outer_mode and add inner_mode This patch adds a new field to xfrm states called inner_mode. The existing mode object is renamed to outer_mode. This is the first part of an attempt to fix inter-family transforms. As it is we always use the outer family when determining which mode to use. As a result we may end up shoving IPv4 packets into netfilter6 and vice versa. What we really want is to use the inner family for the first part of outbound processing and the outer family for the second part. For inbound processing we'd use the opposite pairing. I've also added a check to prevent silly combinations such as transport mode with inter-family transforms. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:35:51 -07:00
Herbert Xu	ca68145f16	[IPSEC]: Disallow combinations of RO and AH/ESP/IPCOMP Combining RO and AH/ESP/IPCOMP does not make sense. So this patch adds a check in the state initialisation function to prevent this. This allows us to safely remove the mode input function of RO since it can never be called anymore. Indeed, if somehow it does get called we'll know about it through an OOPS instead of it slipping past silently. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:35:15 -07:00
Herbert Xu	ed3e37ddb0	[IPSEC]: Use the top IPv4 route's peer instead of the bottom For IPv4 we were using the bottom route's peer instead of the top one. This is wrong because the peer is only used by TCP to keep track of information about the TCP destination address which certainly does not live in the bottom route. This patch fixes that which allows us to get rid of the family check since the bottom route could be IPv6 while the top one must always be IPv4. I've also changed the other fields which are IPv4-specific to get the info from the top route instead of potentially bogus data from the bottom route. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:34:46 -07:00
Herbert Xu	17c2a42a24	[IPSEC]: Store afinfo pointer in xfrm_mode It is convenient to have a pointer from xfrm_state to address-specific functions such as the output function for a family. Currently the address-specific policy code calls out to the xfrm state code to get those pointers when we could get it in an easier way via the state itself. This patch adds an xfrm_state_afinfo to xfrm_mode (since they're address-specific) and changes the policy code to use it. I've also added an owner field to do reference counting on the module providing the afinfo even though it isn't strictly necessary today since IPv6 can't be unloaded yet. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:33:12 -07:00
Herbert Xu	1bfcb10f67	[IPSEC]: Add missing BEET checks Currently BEET mode does not reinject the packet back into the stack like tunnel mode does. Since BEET should behave just like tunnel mode this is incorrect. This patch fixes this by introducing a flags field to xfrm_mode that tells the IPsec code whether it should terminate and reinject the packet back into the stack. It then sets the flag for BEET and tunnel mode. I've also added a number of missing BEET checks elsewhere where we check whether a given mode is a tunnel or not. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:31:50 -07:00
Herbert Xu	aa5d62cc87	[IPSEC]: Move type and mode map into xfrm_state.c The type and mode maps are only used by SAs, not policies. So it makes sense to move them from xfrm_policy.c into xfrm_state.c. This also allows us to mark xfrm_get_type/xfrm_put_type/xfrm_get_mode/xfrm_put_mode as static. The only other change I've made in the move is to get rid of the casts on the request_module call for types. They're unnecessary because C will promote them to ints anyway. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:31:12 -07:00
Herbert Xu	440725000c	[IPSEC]: Fix length check in xfrm_parse_spi Currently xfrm_parse_spi requires there to be 16 bytes for AH and ESP. In contrived cases there may not actually be 16 bytes there since the respective header sizes are less than that (8 and 12 currently). This patch changes the test to use the actual header length instead of 16. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:30:34 -07:00
Herbert Xu	7aa68cb906	[IPSEC]: Move ip_summed zapping out of xfrm6_rcv_spi Not every transform needs to zap ip_summed. For example, a pure tunnel mode encapsulation does not affect the hardware checksum at all. In fact, every algorithm (that needs this) other than AH6 already does its own ip_summed zapping. This patch moves the zapping into AH6 which is in line with what IPv4 does. Possible future optimisation: Checksum the data as we copy them in IPComp. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:30:07 -07:00
Herbert Xu	33b5ecb8f6	[IPSEC]: Get nexthdr from caller in xfrm6_rcv_spi Currently xfrm6_rcv_spi gets the nexthdr value itself from the packet. This means that we need to fix up the value in case we have a 4-on-6 tunnel. Moving this logic into the caller simplifies things and allows us to merge the code with IPv4. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:29:25 -07:00
Herbert Xu	c4541b41c0	[IPSEC]: Move tunnel parsing for IPv4 out of xfrm4_input This patch moves the tunnel parsing for IPv4 out of xfrm4_input and into xfrm4_tunnel. This change is in line with what IPv6 does and will allow us to merge the two input functions. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:28:53 -07:00
Herbert Xu	04663d0b8b	[IPSEC]: Fix pure tunnel modes involving IPv6 I noticed that my recent patch broke 6-on-4 pure IPsec tunnels (the ones that are only used for incompressible IPsec packets). Subsequent reviews show that I broke 6-on-6 pure tunnels more than three years ago and nobody ever noticed. I suppose every must be testing 6-on-6 IPComp with large pings which are very compressible :) This patch fixes both cases. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:28:06 -07:00
Pavel Emelyanov	aaf70ec7fd	[IPV6]: Cleanup snmp6_alloc_dev() This functions is never called with NULL or not setup argument, so the checks inside are redundant. Also, the return value is always -ENOMEM, so no need in additional variable for this. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:25:32 -07:00
Pavel Emelyanov	16910b9829	[IPV6]: Fix return type for snmp6_free_dev() This call is essentially void. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:23:43 -07:00
Pavel Emelyanov	47e958eac2	[NET]: Fix the race between sk_filter_(de\|at)tach and sk_clone() The proposed fix is to delay the reference counter decrement until the quiescent state pass. This will give sk_clone() a chance to get the reference on the cloned filter. Regular sk_filter_uncharge can happen from the sk_free() only and there's no need in delaying the put - the socket is dead anyway and is to be release itself. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:22:42 -07:00
Pavel Emelyanov	d3904b7399	[NET]: Cleanup the error path in sk_attach_filter The sk_filter_uncharge is called for error handling and for releasing the former filter, but this will have to be done in a bit different manner, so cleanup the error path a bit. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:22:17 -07:00
Pavel Emelyanov	309dd5fc87	[NET]: Move the filter releasing into a separate call This is done merely as a preparation for the fix. The sk_filter_uncharge() unaccounts the filter memory and calls the sk_filter_release(), which in turn decrements the refcount anf frees the filter. The latter function will be required separately. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:21:51 -07:00
Pavel Emelyanov	55b333253d	[NET]: Introduce the sk_detach_filter() call Filter is attached in a separate function, so do the same for filter detaching. This also removes one variable sock_setsockopt(). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:21:26 -07:00
John W. Linville	d114f399b4	[MAC80211]: only honor IW_SCAN_THIS_ESSID in STA, IBSS, and AP modes The previous IW_SCAN_THIS_ESSID patch left a hole allowing scan requests on interfaces in inappropriate modes. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:16:16 -07:00
David S. Miller	fe537c0ee8	Merge branch 'fixes-davem' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6	2007-10-17 21:14:35 -07:00
Pavel Emelyanov	c95477090a	[INET]: Consolidate frag queues freeing Since we now allocate the queues in inet_fragment.c, we can safely free it in the same place. The ->destructor callback thus becomes optional for inet_frags. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:48:26 -07:00
Pavel Emelyanov	48d6005638	[INET]: Remove no longer needed ->equal callback Since this callback is used to check for conflicts in hashtable when inserting a newly created frag queue, we can do the same by checking for matching the queue with the argument, used to create one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:47:56 -07:00
Pavel Emelyanov	abd6523d15	[INET]: Consolidate xxx_find() in fragment management Here we need another callback ->match to check whether the entry found in hash matches the key passed. The key used is the same as the creation argument for inet_frag_create. Yet again, this ->match is the same for netfilter and ipv6. Running a frew steps forward - this callback will later replace the ->equal one. Since the inet_frag_find() uses the already consolidated inet_frag_create() remove the xxx_frag_create from protocol codes. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:47:21 -07:00
Pavel Emelyanov	c6fda28229	[INET]: Consolidate xxx_frag_create() This one uses the xxx_frag_intern() and xxx_frag_alloc() routines, which are already consolidated, so remove them from protocol code (as promised). The ->constructor callback is used to init the rest of the frag queue and it is the same for netfilter and ipv6. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:46:47 -07:00
Pavel Emelyanov	e521db9d79	[INET]: Consolidate xxx_frag_alloc() Just perform the kzalloc() allocation and setup common fields in the inet_frag_queue(). Then return the result to the caller to initialize the rest. The inet_frag_alloc() may return NULL, so check the return value before doing the container_of(). This looks ugly, but the xxx_frag_alloc() will be removed soon. The xxx_expire() timer callbacks are patches, because the argument is now the inet_frag_queue, not the protocol specific queue. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:45:23 -07:00
Pavel Emelyanov	2588fe1d78	[INET]: Consolidate xxx_frag_intern This routine checks for the existence of a given entry in the hash table and inserts the new one if needed. The ->equal callback is used to compare two frag_queue-s together, but this one is temporary and will be removed later. The netfilter code and the ipv6 one use the same routine to compare frags. The inet_frag_intern() always returns non-NULL pointer, so convert the inet_frag_queue into protocol specific one (with the container_of) without any checks. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:44:34 -07:00
Pavel Emelyanov	fd9e63544c	[INET]: Omit double hash calculations in xxx_frag_intern Since the hash value is already calculated in xxx_find, we can simply use it later. This is already done in netfilter code, so make the same in ipv4 and ipv6. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:43:37 -07:00
Stephen Hemminger	be07664599	[BR2684]: get rid of broken header code. Recent header_ops change would break the following dead code in br2684. Maintaining conditonal code in mainline is wrong. "Do, or do not. There is no 'try.'" Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:39:22 -07:00
Ryan Reading	c310f099be	[IRDA]: IrCOMM discovery indication simplification From: Ryan Reading <ryanr23@gmail.com> Every IrCOMM socket is registered with the discovery subsystem, so we don't need to loop over all of them for every discovery event. We just need to do it for the registered IrCOMM socket. Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:34:11 -07:00
Ingo Molnar	bd5435e76a	[DCCP]: fix link error with !CONFIG_SYSCTL Do not define the sysctl_dccp_sync_ratelimit sysctl variable in the CONFIG_SYSCTL dependent sysctl.c module - move it to input.c instead. This fixes the following build bug: net/built-in.o: In function `dccp_check_seqno': input.c:(.text+0xbd859): undefined reference to `sysctl_dccp_sync_ratelimit' distcc[29953] ERROR: compile (null) on localhost failed make: *** [vmlinux] Error 1 Found via 'make randconfig' build testing. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:33:06 -07:00
Pavel Emelyanov	dc8a82ad28	[IPV6]: Fix memory leak in cleanup_ipv6_mibs() The icmpv6msg mib statistics is not freed. This is almost not critical for current kernel, since ipv6 module is unloadable, but this can happen on load error and will happen every time we stop the network namespace (when we have one, of course). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:30:40 -07:00
Eric Van Hensbergen	982c37cfb6	9p: remove sysctl A sysctl method was added to enable and disable debugging levels. After further review, it was decided that there are better approaches to doing this and the sysctl methodology isn't really desirable. This patch removes the sysctl code from 9p. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-17 14:35:15 -05:00
Eric Van Hensbergen	fb0466c3ae	9p: fix bad kconfig cross-dependency This patch moves transport dynamic registration and matching to the net module to prevent a bad Kconfig dependency between the net and fs 9p modules. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-17 14:31:07 -05:00
Latchesar Ionkov	ba17674fe0	9p: attach-per-user The 9P2000 protocol requires the authentication and permission checks to be done in the file server. For that reason every user that accesses the file server tree has to authenticate and attach to the server separately. Multiple users can share the same connection to the server. Currently v9fs does a single attach and executes all I/O operations as a single user. This makes using v9fs in multiuser environment unsafe as it depends on the client doing the permission checking. This patch improves the 9P2000 support by allowing every user to attach separately. The patch defines three modes of access (new mount option 'access'): - attach-per-user (access=user) (default mode for 9P2000.u) If a user tries to access a file served by v9fs for the first time, v9fs sends an attach command to the server (Tattach) specifying the user. If the attach succeeds, the user can access the v9fs tree. As there is no uname->uid (string->integer) mapping yet, this mode works only with the 9P2000.u dialect. - allow only one user to access the tree (access=<uid>) Only the user with uid can access the v9fs tree. Other users that attempt to access it will get EPERM error. - do all operations as a single user (access=any) (default for 9P2000) V9fs does a single attach and all operations are done as a single user. If this mode is selected, the v9fs behavior is identical with the current one. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-17 14:31:07 -05:00
Eric Van Hensbergen	a80d923e13	9p: Make transports dynamic This patch abstracts out the interfaces to underlying transports so that new transports can be added as modules. This should also allow kernel configuration of transports without ifdef-hell. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-17 14:31:07 -05:00
Dave Hansen	ce8d2cdf3d	r/o bind mounts: filesystem helpers for custom 'struct file's Why do we need r/o bind mounts? This feature allows a read-only view into a read-write filesystem. In the process of doing that, it also provides infrastructure for keeping track of the number of writers to any given mount. This has a number of uses. It allows chroots to have parts of filesystems writable. It will be useful for containers in the future because users may have root inside a container, but should not be allowed to write to somefilesystems. This also replaces patches that vserver has had out of the tree for several years. It allows security enhancement by making sure that parts of your filesystem read-only (such as when you don't trust your FTP server), when you don't want to have entire new filesystems mounted, or when you want atime selectively updated. I've been using the following script to test that the feature is working as desired. It takes a directory and makes a regular bind and a r/o bind mount of it. It then performs some normal filesystem operations on the three directories, including ones that are expected to fail, like creating a file on the r/o mount. This patch: Some filesystems forego the vfs and may_open() and create their own 'struct file's. This patch creates a couple of helper functions which can be used by these filesystems, and will provide a unified place which the r/o bind mount code may patch. Also, rename an existing, static-scope init_file() to a less generic name. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
David Howells	76181c134f	KEYS: Make request_key() and co fundamentally asynchronous Make request_key() and co fundamentally asynchronous to make it easier for NFS to make use of them. There are now accessor functions that do asynchronous constructions, a wait function to wait for construction to complete, and a completion function for the key type to indicate completion of construction. Note that the construction queue is now gone. Instead, keys under construction are linked in to the appropriate keyring in advance, and that anyone encountering one must wait for it to be complete before they can use it. This is done automatically for userspace. The following auxiliary changes are also made: (1) Key type implementation stuff is split from linux/key.h into linux/key-type.h. (2) AF_RXRPC provides a way to allocate null rxrpc-type keys so that AFS does not need to call key_instantiate_and_link() directly. (3) Adjust the debugging macros so that they're -Wformat checked even if they are disabled, and make it so they can be enabled simply by defining __KDEBUG to be consistent with other code of mine. (3) Documentation. [alan@lxorguk.ukuu.org.uk: keys: missing word in documentation] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:57 -07:00
Rusty Russell	af49d9248f	Remove "unsafe" from module struct Adrian Bunk points out that "unsafe" was used to mark modules touched by the deprecated MOD_INC_USE_COUNT interface, which has long gone. It's time to remove the member from the module structure, as well. If you want a module which can't unload, don't register an exit function. (Vlad Yasevich says SCTP is now safe to unload, so just remove the __unsafe there). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Shannon Nelson <shannon.nelson@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: Sridhar Samudrala <sri@us.ibm.com> Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:49 -07:00
Christoph Lameter	4ba9b9d0ba	Slab API: remove useless ctor parameter and reorder parameters Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void object, struct kmem_cache s, unsigned long flags) to ctor(struct kmem_cache s, void object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Bill Moss	107acb23ba	[PATCH] mac80211: honor IW_SCAN_THIS_ESSID in siwscan ioctl This patch fixes the problem of associating with wpa_secured hidden AP. Please try out. The original author of this patch is Bill Moss <bmoss@clemson.edu> Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-16 21:32:37 -04:00
John W. Linville	cffdd30d20	[PATCH] mac80211: store SSID in sta_bss_list Some AP equipment "in the wild" services multiple SSIDs using the same BSSID. This patch changes the key of sta_bss_list to include the SSID as well as the BSSID and the channel so as to prevent one SSID from eclipsing another SSID with the same BSSID. Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-16 21:19:18 -04:00
John W. Linville	65c107ab3b	[PATCH] mac80211: store channel info in sta_bss_list Some AP equipment "in the wild" uses the same BSSID on multiple channels (particularly "a" vs. "b/g"). This patch changes the key of sta_bss_list to include both the BSSID and the channel so as to prevent a BSSID on one channel from eclipsing the same BSSID on another channel. Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-16 21:17:26 -04:00
Johannes Berg	1dd84aa213	[PATCH] mac80211: reorder association debug output There's no reason to warn about an invalid AID field when the association was denied. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-16 21:11:56 -04:00
Johannes Berg	58a9ac17ed	[PATCH] mac80211: fix set_channel regression Adam Baker reported that the prism2 ioctl removal changed behaviour in that now the selection order was the other way around as before. New API is planned but not done yet, so for now just use the first matching channel in any mode as was previous behaviour with an unset next_mode. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-16 20:58:12 -04:00
Johannes Berg	e797aa1b7d	[PATCH] ieee80211: fix TKIP QoS bug The commit `65b6a277` titled "ieee80211: Fix header->qos_ctl endian issue" introduced an endianness bug. Partially revert it. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-16 20:58:12 -04:00
Andrew Morton	a56daeb7d5	net/sunrpc/xprtrdma/verbs.c printk warning fix sparc64: net/sunrpc/xprtrdma/verbs.c:1264: warning: long long unsigned int format, u64 arg (arg 3) net/sunrpc/xprtrdma/verbs.c:1264: warning: long long unsigned int format, u64 arg (arg 4) Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "David S. Miller" <davem@davemloft.net> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:43:23 -07:00
Pavel Emelyanov	4acad72ded	[IPV6]: Consolidate the ip6_pol_route_(input\|output) pair The difference in both functions is in the "id" passed to the rt6_select, so just pass it as an extra argument from two outer helpers. This is minus 60 lines of code and 360 bytes of .text Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 13:02:51 -07:00
Pavel Emelyanov	4ae289444b	[NEIGH]: Ensure that pneigh_lookup is protected with RTNL The pnigh_lookup is used to lookup proxy entries and to create them in case lookup failed. However, the "creation" code does not perform the re-lookup after GFP_KERNEL allocation. This is done because the code is expected to be protected with the RTNL lock, so add the assertion (mainly to address future questions from new network developers like me :) ). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:54:15 -07:00
Denis V. Lunev	f1673ca52c	[INET]: kmalloc+memset -> kzalloc in frag_alloc_queue kmalloc + memset -> kzalloc in frag_alloc_queue Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:53:13 -07:00
Herbert Xu	e5bbef20e0	[IPV6]: Replace sk_buff ** with sk_buff * in input handlers With all the users of the double pointers removed from the IPv6 input path, this patch converts all occurances of sk_buff ** to sk_buff * in IPv6 input handlers. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:50:28 -07:00
Pavel Emelyanov	762cc40801	[INET]: Consolidate the xxx_put These ones use the generic data types too, so move them in one place. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:43 -07:00
Pavel Emelyanov	4b6cb5d8e3	[INET]: Small cleanup for xxx_put after evictor consolidation After the evictor code is consolidated there is no need in passing the extra pointer to the xxx_put() functions. The only place when it made sense was the evictor code itself. Maybe this change must got with the previous (or with the next) patch, but I try to make them shorter as much as possible to simplify the review (but they are still large anyway), so this change goes in a separate patch. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:43 -07:00
Pavel Emelyanov	8e7999c44e	[INET]: Consolidate the xxx_evictor The evictors collect some statistics for ipv4 and ipv6, so make it return the number of evicted queues and account them all at once in the caller. The XXX_ADD_STATS_BH() macros are just for this case, but maybe there are places in code, that can make use of them as well. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:42 -07:00
Pavel Emelyanov	1e4b82873a	[INET]: Consolidate the xxx_frag_destroy To make in possible we need to know the exact frag queue size for inet_frags->mem management and two callbacks: * to destoy the skb (optional, used in conntracks only) * to free the queue itself (mandatory, but later I plan to move the allocation and the destruction of frag_queues into the common place, so this callback will most likely be optional too). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:42 -07:00
Pavel Emelyanov	321a3a99e4	[INET]: Consolidate xxx_the secret_rebuild This code works with the generic data types as well, so move this into inet_fragment.c This move makes it possible to hide the secret_timer management and the secret_rebuild routine completely in the inet_fragment.c Introduce the ->hashfn() callback in inet_frags() to get the hashfun for a given inet_frag_queue() object. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:41 -07:00
Pavel Emelyanov	277e650ddf	[INET]: Consolidate the xxx_frag_kill Since now all the xxx_frag_kill functions now work with the generic inet_frag_queue data type, this can be moved into a common place. The xxx_unlink() code is moved as well. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:41 -07:00
Pavel Emelyanov	04128f233f	[INET]: Collect common frag sysctl variables together Some sysctl variables are used to tune the frag queues management and it will be useful to work with them in a common way in the future, so move them into one structure, moreover they are the same for all the frag management codes. I don't place them in the existing inet_frags object, introduced in the previous patch for two reasons: 1. to keep them in the __read_mostly section; 2. not to export the whole inet_frags objects outside. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:40 -07:00
Pavel Emelyanov	7eb95156d9	[INET]: Collect frag queues management objects together There are some objects that are common in all the places which are used to keep track of frag queues, they are: * hash table * LRU list * rw lock * rnd number for hash function * the number of queues * the amount of memory occupied by queues * secret timer Move all this stuff into one structure (struct inet_frags) to make it possible use them uniformly in the future. Like with the previous patch this mostly consists of hunks like - write_lock(&ipfrag_lock); + write_lock(&ip4_frags.lock); To address the issue with exporting the number of queues and the amount of memory occupied by queues outside the .c file they are declared in, I introduce a couple of helpers. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:39 -07:00
Pavel Emelyanov	5ab11c98d3	[INET]: Move common fields from frag_queues in one place. Introduce the struct inet_frag_queue in include/net/inet_frag.h file and place there all the common fields from three structs: * struct ipq in ipv4/ip_fragment.c * struct nf_ct_frag6_queue in nf_conntrack_reasm.c * struct frag_queue in ipv6/reassembly.c After this, replace these fields on appropriate structures with this structure instance and fix the users to use correct names i.e. hunks like - atomic_dec(&fq->refcnt); + atomic_dec(&fq->q.refcnt); (these occupy most of the patch) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:38 -07:00
Ilpo Järvinen	f885c5b08e	[TCP]: high_seq parameter removed (all callers use tp->high_seq) Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:37 -07:00
Patrick McHardy	ad643a793b	[IPV6]: Uninline netfilter okfns Uninline netfilter okfns for those cases where gcc can generate tail-calls. Before: text data bss dec hex filename 8994153 1016524 524652 10535329 a0c1a1 vmlinux After: text data bss dec hex filename 8992761 1016524 524652 10533937 a0bc31 vmlinux ------------------------------------------------------- -1392 All cases have been verified to generate tail-calls with and without netfilter. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:36 -07:00
Patrick McHardy	9c2842bd94	[BRIDGE]: Remove SKB share checks in br_nf_pre_routing(). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:35 -07:00
Patrick McHardy	861d048607	[IPV4]: Uninline netfilter okfns Now that we don't pass double skb pointers to nf_hook_slow anymore, gcc can generate tail calls for some of the netfilter hook okfn invocations, so there is no need to inline the functions anymore. This caused huge code bloat since we ended up with one inlined version and one out-of-line version since we pass the address to nf_hook_slow. Before: text data bss dec hex filename 8997385 1016524 524652 10538561 a0ce41 vmlinux After: text data bss dec hex filename 8994009 1016524 524652 10535185 a0c111 vmlinux ------------------------------------------------------- -3376 All cases have been verified to generate tail-calls with and without netfilter. The okfns in ipmr and xfrm4_input still remain inline because gcc can't generate tail-calls for them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:35 -07:00
Herbert Xu	a030847e9f	[NET]: Avoid copying TCP packets unnecessarily TCP packets all have writable heads, that is, even though it's cloned, it is writable up to the end of the TCP header. This patch makes skb_checksum_help aware of this fact by using skb_clone_writable and avoiding a copy for TCP. I've also modified the BUG_ON tests to be unsigned. The only case where this makes a difference is if csum_start points to a location before skb->data. Since skb->data should always include the header where the checksum field is (and all currently callers adhere to that), this change is safe and may uncover bugs later. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:34 -07:00
Herbert Xu	172a863f2b	[NET]: Fix csum_start update in pskb_expand_head I got confused by the dual nature of the off variable in the function pskb_expand_head. The csum_start offset should use nhead instead of off which can change depending on whether we are using offsets or pointers. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:33 -07:00
Jesper Juhl	f937f1f46b	[NETLINK]: Don't leak 'listeners' in netlink_kernel_create() The Coverity checker spotted that we'll leak the storage allocated to 'listeners' in netlink_kernel_create() when the if (!nl_table[unit].registered) check is false. This patch avoids the leak. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:32 -07:00
Adrian Bunk	1dff92e09e	[IPV6] __inet6_csk_dst_store(): fix check-after-use The Coverity checker spotted that we have already oops'ed if "dst" was NULL. Since "dst" being NULL doesn't seem to be possible at this point this patch removes the NULL check. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:32 -07:00
Herbert Xu	65c8846660	[IPV6]: Avoid skb_copy/pskb_copy/skb_realloc_headroom on input This patch replaces unnecessary uses of skb_copy by pskb_expand_head on the IPv6 input path. This allows us to remove the double pointers later. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:31 -07:00
Herbert Xu	f61944efdf	[IPV6]: Make ipv6_frag_rcv return the same packet This patch implements the same change taht was done to ip_defrag. It makes ipv6_frag_rcv return the last packet received of a train of fragments rather than the head of that sequence. This allows us to get rid of the sk_buff ** argument later. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:30 -07:00
Herbert Xu	3db05fea51	[NETFILTER]: Replace sk_buff ** with sk_buff * With all the users of the double pointers removed, this patch mops up by finally replacing all occurances of sk_buff ** in the netfilter API by sk_buff *. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:29 -07:00
Herbert Xu	2ca7b0ac02	[NETFILTER]: Avoid skb_copy/pskb_copy/skb_realloc_headroom This patch replaces unnecessary uses of skb_copy, pskb_copy and skb_realloc_headroom by functions such as skb_make_writable and pskb_expand_head. This allows us to remove the double pointers later. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:28 -07:00
Herbert Xu	af1e1cf073	[IPVS]: Replace local version of skb_make_writable This patch removes the IPVS-specific version of skb_make_writable and replaces it with the netfilter one. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:28 -07:00
Herbert Xu	37d4187922	[NETFILTER]: Do not copy skb in skb_make_writable Now that all callers of netfilter can guarantee that the skb is not shared, we no longer have to copy the skb in skb_make_writable. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:27 -07:00
Herbert Xu	7b995651e3	[BRIDGE]: Unshare skb upon entry Due to the special location of the bridging hook, it should never see a shared packet anyway (certainly not with any in-kernel code). So it makes sense to unshare the skb there if necessary as that will greatly simplify the code below it (in particular, netfilter). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:27 -07:00
Herbert Xu	f697c3e8b3	[NET]: Avoid unnecessary cloning for ingress filtering As it is we always invoke pt_prev before ing_filter, even if there are no ingress filters attached. This can cause unnecessary cloning in pt_prev. This patch changes it so that we only invoke pt_prev if there are ingress filters attached. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:26 -07:00
Herbert Xu	776c729e8d	[IPV4]: Change ip_defrag to return an integer Now that ip_frag always returns the packet given to it on input, we can change it to return an integer indicating error instead. This patch does that and updates all its callers accordingly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:25 -07:00
Herbert Xu	1706d58763	[IPV4]: Make ip_defrag return the same packet This patch is a bit of a hack. However it is worth it if you consider that this is the only reason why we have to carry around the struct sk_buff ** pointers in netfilter. It makes ip_defrag always return the packet that was given to it on input. It does this by cloning the packet and replacing its original contents with the head fragment if necessary. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:25 -07:00
Herbert Xu	e0053ec07e	[SKBUFF]: Add skb_morph This patch creates a new function skb_morph that's just like skb_clone except that it lets user provide the spare skb that will be overwritten by the one that's to be cloned. This will be used by IP fragment reassembly so that we get back the same skb that went in last (rather than the head skb that we get now which requires us to carry around double pointers all over the place). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:24 -07:00
Herbert Xu	dec18810c5	[SKBUFF]: Merge common code between copy_skb_header and skb_clone This patch creates a new function __copy_skb_header to merge the common code between copy_skb_header and skb_clone. Having two functions which are largely the same is a source of wasted labour as well as confusion. In fact the tc_verd stuff is almost certainly a bug since it's treated differently in skb_clone compared to the callers of copy_skb_header (skb_copy/pskb_copy/skb_copy_expand). I've kept that difference in tact with a comment added asking for clarification. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:24 -07:00
Linus Torvalds	f4921aff5b	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: (131 commits) NFSv4: Fix a typo in nfs_inode_reclaim_delegation NFS: Add a boot parameter to disable 64 bit inode numbers NFS: nfs_refresh_inode should clear cache_validity flags on success NFS: Fix a connectathon regression in NFSv3 and NFSv4 NFS: Use nfs_refresh_inode() in ops that aren't expected to change the inode SUNRPC: Don't call xprt_release in call refresh SUNRPC: Don't call xprt_release() if call_allocate fails SUNRPC: Fix buggy UDP transmission [23/37] Clean up duplicate includes in [2.6 patch] net/sunrpc/rpcb_clnt.c: make struct rpcb_program static SUNRPC: Use correct type in buffer length calculations SUNRPC: Fix default hostname created in rpc_create() nfs: add server port to rpc_pipe info file NFS: Get rid of some obsolete macros NFS: Simplify filehandle revalidation NFS: Ensure that nfs_link() returns a hashed dentry NFS: Be strict about dentry revalidation when doing exclusive create NFS: Don't zap the readdir caches upon error NFS: Remove the redundant nfs_reval_fsid() NFSv3: Always use directory post-op attributes in nfs3_proc_lookup ... Fix up trivial conflict due to sock_owned_by_user() cleanup manually in net/sunrpc/xprtsock.c	2007-10-15 10:47:35 -07:00
Linus Torvalds	b5869ce7f6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: (140 commits) sched: sync wakeups preempt too sched: affine sync wakeups sched: guest CPU accounting: maintain guest state in KVM sched: guest CPU accounting: maintain stats in account_system_time() sched: guest CPU accounting: add guest-CPU /proc/<pid>/stat fields sched: guest CPU accounting: add guest-CPU /proc/stat field sched: domain sysctl fixes: add terminator comment sched: domain sysctl fixes: do not crash on allocation failure sched: domain sysctl fixes: unregister the sysctl table before domains sched: domain sysctl fixes: use for_each_online_cpu() sched: domain sysctl fixes: use kcalloc() Make scheduler debug file operations const sched: enable wake-idle on CONFIG_SCHED_MC=y sched: reintroduce topology.h tunings sched: allow the immediate migration of cache-cold tasks sched: debug, improve migration statistics sched: debug: increase width of debug line sched: activate task_hot() only on fair-scheduled tasks sched: reintroduce cache-hot affinity sched: speed up context-switches a bit ...	2007-10-15 08:22:16 -07:00
Linus Torvalds	37ca506adc	Merge branch 'nfs-server-stable' of git://linux-nfs.org/~bfields/linux * 'nfs-server-stable' of git://linux-nfs.org/~bfields/linux: knfsd: query filesystem for NFSv4 getattr of FATTR4_MAXNAME knfsd: nfsv4 delegation recall should take reference on client knfsd: don't shutdown callbacks until nfsv4 client is freed knfsd: let nfsd manage timing out its own leases knfsd: Add source address to sunrpc svc errors knfsd: 64 bit ino support for NFS server svcgss: move init code into separate function knfsd: remove code duplication in nfsd4_setclientid() nfsd warning fix knfsd: fix callback rpc cred knfsd: move nfsv4 slab creation/destruction to module init/exit knfsd: spawn kernel thread to probe callback channel knfsd: nfs4 name->id mapping not correctly parsing negative downcall knfsd: demote some printk()s to dprintk()s knfsd: cleanup of nfsd4 cmp_* functions knfsd: delete code made redundant by map_new_errors nfsd: fix horrible indentation in nfsd_setattr nfsd: remove unused cache_for_each macro nfsd: tone down inaccurate dprintk	2007-10-15 08:16:53 -07:00
Ingo Molnar	71e20f1873	sched: affine sync wakeups make sync wakeups affine for cache-cold tasks: if a cache-cold task is woken up by a sync wakeup then use the opportunity to migrate it straight away. (the two tasks are 'related' because they communicate) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-15 17:00:19 +02:00
Al Viro	f53f4137ba	fix endianness bug in inet_lro all uses of and almost all assignments to lro_desc->tcp_ack assume that it's net-endian; one converts net-endian to host-endian and sticks it in lro_desc->tcp_ack. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-14 12:41:52 -07:00
Al Viro	9df7c98a0f	inet_lro: trivial endianness annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-14 12:41:52 -07:00
Al Viro	411223c01a	fix breakage in sctp getsockopt copy_to_user() into on-stack array Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-14 12:41:51 -07:00
Randy Dunlap	c4ea43c552	net core: fix kernel-doc for new function parameters Fix networking code kernel-doc for newly added parameters. Warning(linux-2.6.23-git2//net/core/sock.c:879): No description found for parameter 'net' Warning(linux-2.6.23-git2//net/core/dev.c:570): No description found for parameter 'net' Warning(linux-2.6.23-git2//net/core/dev.c:594): No description found for parameter 'net' Warning(linux-2.6.23-git2//net/core/dev.c:617): No description found for parameter 'net' Warning(linux-2.6.23-git2//net/core/dev.c:641): No description found for parameter 'net' Warning(linux-2.6.23-git2//net/core/dev.c:667): No description found for parameter 'net' Warning(linux-2.6.23-git2//net/core/dev.c:722): No description found for parameter 'net' Warning(linux-2.6.23-git2//net/core/dev.c:959): No description found for parameter 'net' Warning(linux-2.6.23-git2//net/core/dev.c:1195): No description found for parameter 'dev' Warning(linux-2.6.23-git2//net/core/dev.c:2105): No description found for parameter 'n' Warning(linux-2.6.23-git2//net/core/dev.c:3272): No description found for parameter 'net' Warning(linux-2.6.23-git2//net/core/dev.c:3445): No description found for parameter 'net' Warning(linux-2.6.23-git2//include/linux/netdevice.h:1301): No description found for parameter 'cpu' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-13 09:52:26 -07:00
Greg Kroah-Hartman	19c38de88a	kobjects: fix up improper use of the kobject name field A number of different drivers incorrect access the kobject name field directly. This is not correct as the name might not be in the array. Use the proper accessor function instead.	2007-10-12 14:51:02 -07:00
Kay Sievers	7eff2e7a8b	Driver core: change add_uevent_var to use a struct This changes the uevent buffer functions to use a struct instead of a long list of parameters. It does no longer require the caller to do the proper buffer termination and size accounting, which is currently wrong in some places. It fixes a known bug where parts of the uevent environment are overwritten because of wrong index calculations. Many thanks to Mathieu Desnoyers for finding bugs and improving the error handling. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-10-12 14:51:01 -07:00
Ilpo Järvinen	b08d6cb22c	[TCP]: Limit processing lost_retrans loop to work-to-do cases This addition of lost_retrans_low to tcp_sock might be unnecessary, it's not clear how often lost_retrans worker is executed when there wasn't work to do. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 17:36:13 -07:00
Ilpo Järvinen	f785a8e28b	[TCP]: Fix lost_retrans loop vs fastpath problems Detection implemented with lost_retrans must work also when fastpath is taken, yet most of the queue is skipped including (very likely) those retransmitted skb's we're interested in. This problem appeared when the hints got added, which removed a need to always walk over the whole write queue head. Therefore decicion for the lost_retrans worker loop entry must be separated from the sacktag processing more than it was necessary before. It turns out to be problematic to optimize the worker loop very heavily because ack_seqs of skb may have a number of discontinuity points. Maybe similar approach as currently is implemented could be attempted but that's becoming more and more complex because the trend is towards less skb walking in sacktag marker. Trying a simple work until all rexmitted skbs heve been processed approach. Maybe after(highest_sack_end_seq, tp->high_seq) checking is not sufficiently accurate and causes entry too often in no-work-to-do cases. Since that's not known, I've separated solution to that from this patch. Noticed because of report against a related problem from TAKANO Ryousei <takano@axe-inc.co.jp>. He also provided a patch to that part of the problem. This patch includes solution to it (though this patch has to use somewhat different placement). TAKANO's description and patch is available here: http://marc.info/?l=linux-netdev&m=119149311913288&w=2 ...In short, TAKANO's problem is that end_seq the loop is using not necessarily the largest SACK block's end_seq because the current ACK may still have higher SACK blocks which are later by the loop. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 17:35:41 -07:00
Ilpo Järvinen	4cd829995b	[TCP]: No need to re-count fackets_out/sacked_out at RTO Both sacked_out and fackets_out are directly known from how parameter. Since fackets_out is accurate, there's no need for recounting (sacked_out was previously unnecessarily counted in the loop anyway). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 17:34:57 -07:00
Ilpo Järvinen	d193594299	[TCP]: Extract tcp_match_queue_to_sack from sacktag code This is necessary for upcoming DSACK bugfix. Reduces sacktag length which is not very sad thing at all... :-) Notice that there's a need to handle out-of-mem at caller's place. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 17:34:25 -07:00
Ilpo Järvinen	f6fb128d27	[TCP]: Kill almost unused variable pcount from sacktag It's on the way for future cutting of that function. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 17:33:55 -07:00
Ilpo Järvinen	3eec0047d9	[TCP]: Fix mark_head_lost to ignore R-bit when trying to mark L This condition (plain R) can arise at least in recovery that is triggered after tcp_undo_loss. There isn't any reason why they should not be marked as lost, not marking makes in_flight estimator to return too large values. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 17:33:11 -07:00
Ilpo Järvinen	16e906812f	[TCP]: Add bytes_acked (ABC) clearing to FRTO too I was reading tcp_enter_loss while looking for Cedric's bug and noticed bytes_acked adjustment is missing from FRTO side. Since bytes_acked will only be used in tcp_cong_avoid, I think it's safe to assume RTO would be spurious. During FRTO cwnd will be not controlled by tcp_cong_avoid and if FRTO calls for conventional recovery, cwnd is adjusted and the result of wrong assumption is cleared from bytes_acked. If RTO was in fact spurious, we did normal ABC already and can continue without any additional adjustments. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 17:32:31 -07:00
Brian Haley	4953f0fcc0	[IPv6]: Update setsockopt(IPV6_MULTICAST_IF) to support RFC 3493, try2 From RFC 3493, Section 5.2: IPV6_MULTICAST_IF Set the interface to use for outgoing multicast packets. The argument is the index of the interface to use. If the interface index is specified as zero, the system selects the interface (for example, by looking up the address in a routing table and using the resulting interface). This patch adds support for (index == 0) to reset the value to it's original state, allowing the system to choose the best interface. IPv4 already behaves this way. Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 14:39:29 -07:00
Jan Engelhardt	73aaf9355b	[NETFILTER]: x_tables: add missing ip6t_modulename aliases The patch will add MODULE_ALIAS("ip6t_<modulename>") where missing, otherwise you will get ip6tables: No chain/target/match by that name when xt_<modulename> is not already loaded. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 14:36:40 -07:00
Jozsef Kadlecsik	17311393f9	[NETFILTER]: nf_conntrack_tcp: fix connection reopening With your description I could reproduce the bug and actually you were completely right: the code above is incorrect. Somehow I was able to misread RFC1122 and mixed the roles :-(: When a connection is >>closed actively<<, it MUST linger in TIME-WAIT state for a time 2xMSL (Maximum Segment Lifetime). However, it MAY >>accept<< a new SYN from the remote TCP to reopen the connection directly from TIME-WAIT state, if it: [...] The fix is as follows: if the receiver initiated an active close, then the sender may reopen the connection - otherwise try to figure out if we hold a dead connection. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 14:35:52 -07:00
David S. Miller	28f7b0360f	[NETLINK]: fib_frontend build fixes 1) fibnl needs to be declared outside of config ifdefs, and also should not be explicitly initialized to NULL 2) nl_fib_input() args are wrong for netlink_kernel_create() input method Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:32:39 -07:00
Pierre Ynard	31910575a9	[IPv6]: Export userland ND options through netlink (RDNSS support) As discussed before, this patch provides userland with a way to access relevant options in Router Advertisements, after they are processed and validated by the kernel. Extra options are processed in a generic way; this patch only exports RDNSS options described in RFC5006, but support to control which options are exported could be easily added. A new rtnetlink message type is defined, to transport Neighbor Discovery options, along with optional context information. At the moment only the address of the router sending an RDNSS option is included, but additional attributes may be later defined, if needed by new use cases. Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:22:05 -07:00
Denis V. Lunev	cd40b7d398	[NET]: make netlink user -> kernel interface synchronious This patch make processing netlink user -> kernel messages synchronious. This change was inspired by the talk with Alexey Kuznetsov about current netlink messages processing. He says that he was badly wrong when introduced asynchronious user -> kernel communication. The call netlink_unicast is the only path to send message to the kernel netlink socket. But, unfortunately, it is also used to send data to the user. Before this change the user message has been attached to the socket queue and sk->sk_data_ready was called. The process has been blocked until all pending messages were processed. The bad thing is that this processing may occur in the arbitrary process context. This patch changes nlk->data_ready callback to get 1 skb and force packet processing right in the netlink_unicast. Kernel -> user path in netlink_unicast remains untouched. EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock drop, but the process remains in the cycle until the message will be fully processed. So, there is no need to use this kludges now. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:15:29 -07:00
Denis V. Lunev	aed815601f	[NET]: unify netlink kernel socket recognition There are currently two ways to determine whether the netlink socket is a kernel one or a user one. This patch creates a single inline call for this purpose and unifies all the calls in the af_netlink.c No similar calls are found outside af_netlink.c. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:14:32 -07:00
Denis V. Lunev	7ee015e0fa	[NET]: cleanup 3rd argument in netlink_sendskb netlink_sendskb does not use third argument. Clean it and save a couple of bytes. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:14:03 -07:00
Denis V. Lunev	3b71535f35	[NET]: Make netlink processing routines semi-synchronious (inspired by rtnl) v2 The code in netfilter/nfnetlink.c and in ./net/netlink/genetlink.c looks like outdated copy/paste from rtnetlink.c. Push them into sync with the original. Changes from v1: - deleted comment in nfnetlink_rcv_msg by request of Patrick McHardy Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:13:32 -07:00
Denis V. Lunev	1536cc0d55	[NET]: rtnl_unlock cleanups There is no need to process outstanding netlink user->kernel packets during rtnl_unlock now. There is no rtnl_trylock in the rtnetlink_rcv anymore. Normal code path is the following: netlink_sendmsg netlink_unicast netlink_sendskb skb_queue_tail netlink_data_ready rtnetlink_rcv mutex_lock(&rtnl_mutex); netlink_run_queue(sk, qlen, &rtnetlink_rcv_msg); mutex_unlock(&rtnl_mutex); So, it is possible, that packets can be present in the rtnl->sk_receive_queue during rtnl_unlock, but there is no need to process them at that moment as rtnetlink_rcv for that packet is pending. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:12:58 -07:00
Tony Battersby	fa8705b00a	[NET]: sanitize kernel_accept() error path If kernel_accept() returns an error, it may pass back a pointer to freed memory (which the caller should ignore). Make it pass back NULL instead for better safety. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:09:04 -07:00
Stephen Hemminger	227b60f510	[INET]: local port range robustness Expansion of original idea from Denis V. Lunev <den@openvz.org> Add robustness and locking to the local_port_range sysctl. 1. Enforce that low < high when setting. 2. Use seqlock to ensure atomic update. The locking might seem like overkill, but there are cases where sysadmin might want to change value in the middle of a DoS attack. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 17:30:46 -07:00
Stephen Hemminger	0639300900	[SCTP]: port randomization Add port randomization rather than a simple fixed rover for use with SCTP. This makes it act similar to TCP, UDP, DCCP when allocating ports. No longer need port_alloc_lock as well (suggestion by Brian Haley). Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 17:30:18 -07:00
Patrick McHardy	3c0cfc1358	[NET_SCHED]: Show timer resolution instead of clock resolution in /proc/net/psched The fourth parameter of /proc/net/psched is supposed to show the timer resultion and is used by HTB userspace to calculate the necessary burst rate. Currently we show the clock resolution, which results in a too low burst rate when the two differ. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:59 -07:00
Herbert Xu	631a6698d0	[IPSEC]: Move IP protocol setting from transforms into xfrm4_input.c This patch makes the IPv4 x->type->input functions return the next protocol instead of setting it directly. This is identical to how we do things in IPv6 and will help us merge common code on the input path. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:56 -07:00
Herbert Xu	ceb1eec829	[IPSEC]: Move IP length/checksum setting out of transforms This patch moves the setting of the IP length and checksum fields out of the transforms and into the xfrmX_output functions. This would help future efforts in merging the transforms themselves. It also adds an optimisation to ipcomp due to the fact that the transport offset is guaranteed to be zero. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:56 -07:00
Herbert Xu	87bdc48d30	[IPSEC]: Get rid of ipv6_{auth,esp,comp}_hdr This patch removes the duplicate ipv6_{auth,esp,comp}_hdr structures since they're identical to the IPv4 versions. Duplicating them would only create problems for ourselves later when we need to add things like extended sequence numbers. I've also added transport header type conversion headers for these types which are now used by the transforms. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:55 -07:00
Herbert Xu	37fedd3aab	[IPSEC]: Use IPv6 calling convention as the convention for x->mode->output The IPv6 calling convention for x->mode->output is more general and could help an eventual protocol-generic x->type->output implementation. This patch adopts it for IPv4 as well and modifies the IPv4 type output functions accordingly. It also rewrites the IPv6 mac/transport header calculation to be based off the network header where practical. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:54 -07:00
Herbert Xu	7b277b1a5f	[IPSEC]: Set skb->data to payload in x->mode->output This patch changes the calling convention so that on entry from x->mode->output and before entry into x->type->output skb->data will point to the payload instead of the IP header. This is essentially a redistribution of skb_push/skb_pull calls with the aim of minimising them on the common path of tunnel + ESP. It'll also let us use the same calling convention between IPv4 and IPv6 with the next patch. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:54 -07:00
Herbert Xu	bee0b40c06	[IPSEC] beet: Fix extension header support on output The beet output function completely kills any extension headers by replacing them with the IPv6 header. This is because it essentially ignores the result of ip6_find_1stfragopt by simply acting as if there aren't any extension headers. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:53 -07:00
Herbert Xu	8bd1707504	[IPSEC] esp: Remove NAT-T checksum invalidation for BEET I pointed this out back when this patch was first proposed but it looks like it got lost along the way. The checksum only needs to be ignored for NAT-T in transport mode where we lose the original inner addresses due to NAT. With BEET the inner addresses will be intact so the checksum remains valid. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:53 -07:00
Mitsuru Chinen	f24e3d658c	[IPV6]: Defer IPv6 device initialization until a valid qdisc is specified To judge the timing for DAD, netif_carrier_ok() is used. However, there is a possibility that dev->qdisc stays noop_qdisc even if netif_carrier_ok() returns true. In that case, DAD NS is not sent out. We need to defer the IPv6 device initialization until a valid qdisc is specified. Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:52 -07:00
Pavel Emelyanov	9b77265235	[NET]: Remove double dev->flags checking when calling dev_close() The unregister_netdevice() and dev_change_net_namespace() both check for dev->flags to be IFF_UP before calling the dev_close(), but the dev_close() checks for IFF_UP itself, so remove those unneeded checks. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:52 -07:00
Ilpo Järvinen	1c1e87edb9	[TCP]: Separate lost_retrans loop into own function Follows own function for each task principle, this is really somewhat separate task being done in sacktag. Also reduces indentation. In addition, added ack_seq local var to break some long lines & fixed coding style things. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:51 -07:00
Pavel Emelyanov	ec93103519	[SUNRPC]: Make the sunrpc use the seq_open_private() Just switch to the consolidated code. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:36 -07:00
Pavel Emelyanov	a662d4cb50	[IRDA]: Make the IRDA use the seq_open_private() Just switch to the consolidated code Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:35 -07:00
Pavel Emelyanov	31164088d7	[DECNET]: Make decnet code use the seq_open_private() Just switch to the consolidated code. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:34 -07:00
Pavel Emelyanov	e2da591338	[NETFILTER]: Make netfilter code use the seq_open_private Just switch to the consolidated calls. ipt_recent() has to initialize the private, so use the __seq_open_private() helper. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:34 -07:00
Pavel Emelyanov	cf7732e4cc	[NET]: Make core networking code use seq_open_private This concerns the ipv4 and ipv6 code mostly, but also the netlink and unix sockets. The netlink code is an example of how to use the __seq_open_private() call - it saves the net namespace on this private. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:33 -07:00
Mattias Nissler	e2f036da2f	[PATCH] mac80211: Defer setting of RX_FLAG_DECRYPTED. The decryption handlers will skip the frame if the RX_FLAG_DECRYPTED flag is set, so the early flag setting introduced by Johannes breaks decryption. To work around this, call the handlers first and then set the flag. Signed-off-by: Mattias Nissler <mattias.nissler@gmx.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:55:23 -07:00
John W. Linville	0654ff055c	[PATCH] ieee80211_if_set_type: make check for master dev more explicit Problem description by Daniel Drake <dsd@gentoo.org>: "This sequence of events causes loss of connectivity: <plug in> <associate as normal in managed mode> ifconfig eth7 down iwconfig eth7 mode monitor ifconfig eth7 up ifconfig eth7 down iwconfig eth7 mode managed <associate as normal> At this point you are associated but TX does not work. This is because the eth7 hard_start_xmit is still ieee80211_monitor_start_xmit." The problem is caused by ieee80211_if_set_type checking for a non-zero hard_start_xmit pointer value in order to avoid changing that value for master devices. The fix is to make that check more explicitly linked to master devices rather than simply checking if the value has been previously set. CC: Daniel Drake <dsd@gentoo.org> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:55:23 -07:00
Herbert Xu	b7c6538cd8	[IPSEC]: Move state lock into x->type->output This patch releases the lock on the state before calling x->type->output. It also adds the lock to the spots where they're currently needed. Most of those places (all except mip6) are expected to disappear with async crypto. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:03 -07:00

... 3 4 5 6 7 ...

6394 commits