Commit graph

22585 commits

Author SHA1 Message Date
Andy Gospodarek
eaddcd7690 bonding: remove entries for master_ip and vlan_ip and query devices instead
The following patch aimed to resolve an issue where secondary, tertiary,
etc. addresses added to bond interfaces could overwrite the
bond->master_ip and vlan_ip values.

        commit 917fbdb32f
        Author: Henrik Saavedra Persson <henrik.e.persson@ericsson.com>
        Date:   Wed Nov 23 23:37:15 2011 +0000

            bonding: only use primary address for ARP

That patch was good because it prevented bonds using ARP monitoring from
sending frames with an invalid source IP address.  Unfortunately, it
didn't always work as expected.

When using an ioctl (like ifconfig does) to set the IP address and
netmask, 2 separate ioctls are actually called to set the IP and netmask
if the mask chosen doesn't match the standard mask for that class of
address.  The first ioctl did not have a mask that matched the one in
the primary address and would still cause the device address to be
overwritten.  The second ioctl that was called to set the mask would
then detect as secondary and ignored, but the damage was already done.

This was not an issue when using an application that used netlink
sockets as the setting of IP and netmask came down at once.  The
inconsistent behavior between those two interfaces was something that
needed to be resolved.

While I was thinking about how I wanted to resolve this, Ralf Zeidler
came with a patch that resolved this on a RHEL kernel by keeping a full
shadow of the entries in dev->ifa_list for the bonding device and vlan
devices in the bonding driver.  I didn't like the duplication of the
list as I want to see the 'bonding' struct and code shrink rather than
grow, but liked the general idea.

As the Subject indicates this patch drops the master_ip and vlan_ip
elements from the 'bonding' and 'vlan_entry' structs, respectively.
This can be done because a device's address-list is now traversed to
determine the optimal source IP address for ARP requests and for checks
to see if the bonding device has a particular IP address.  This code
could have all be contained inside the bonding driver, but it made more
sense to me to EXPORT and call inet_confirm_addr since it did exactly
what was needed.

I tested this and a backported patch and everything works as expected.
Ralf also helped with verification of the backported patch.

Thanks to Ralf for all his help on this.

v2: Whitespace and organizational changes based on suggestions from Jay
Vosburgh and Dave Miller.

v3: Fixup incorrect usage of rcu_read_unlock based on Dave Miller's
suggestion.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
CC: Ralf Zeidler <ralf.zeidler@nsn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-22 22:36:17 -04:00
Rusty Russell
523f610e1b netfilter: remove forward module param confusion.
It used to be an int, and it got changed to a bool parameter at least
7 years ago.  It happens that NF_ACCEPT and NF_DROP are 0 and 1, so
this works, but it's unclear, and the check that it's in range is not
required.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-22 22:36:17 -04:00
Linus Torvalds
db14179679 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 patches from Martin Schwidefsky:
 "The biggest patch is the rework of the smp code, something I wanted to
  do for some time.  There are some patches for our various dump methods
  and one new thing: z/VM LGR detection.  LGR stands for linux-guest-
  relocation and is the guest migration feature of z/VM.  For debugging
  purposes we keep a log of the systems where a specific guest has lived."

Fix up trivial conflict in arch/s390/kernel/smp.c due to the scheduler
cleanup having removed some code next to removed s390 code.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  [S390] kernel: Pass correct stack for smp_call_ipl_cpu()
  [S390] Ensure that vmcore_info pointer is never accessed directly
  [S390] dasd: prevent validate server for offline devices
  [S390] Remove monolithic build option for zcrypt driver.
  [S390] stack dump: fix indentation in output
  [S390] kernel: Add OS info memory interface
  [S390] Use block_sigmask()
  [S390] kernel: Add z/VM LGR detection
  [S390] irq: external interrupt code passing
  [S390] irq: set __ARCH_IRQ_EXIT_IRQS_DISABLED
  [S390] zfcpdump: Implement async sdias event processing
  [S390] Use copy_to_absolute_zero() instead of "stura/sturg"
  [S390] rework idle code
  [S390] rework smp code
  [S390] rename lowcore field
  [S390] Fix gcc 4.6.0 compile warning
2012-03-22 18:15:32 -07:00
Pablo Neira Ayuso
60b5f8f745 netfilter: nf_conntrack: permanently attach timeout policy to conntrack
We need to permanently attach the timeout policy to the conntrack,
otherwise we may apply the custom timeout policy inconsistently.

Without this patch, the following example:

 nfct timeout add test inet icmp timeout 100
 iptables -I PREROUTING -t raw -p icmp -s 1.1.1.1 -j CT --timeout test

Will only apply the custom timeout policy to outgoing packets from
1.1.1.1, but not to reply packets from 2.2.2.2 going to 1.1.1.1.

To fix this issue, this patch modifies the current logic to attach the
timeout policy when the first packet is seen (which is when the
conntrack entry is created). Then, we keep using the attached timeout
policy until the conntrack entry is destroyed.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-23 00:52:08 +01:00
Pablo Neira Ayuso
eeb4cb9523 netfilter: xt_CT: fix assignation of the generic protocol tracker
`iptables -p all' uses 0 to match all protocols, while the conntrack
subsystem uses 255. We still need `-p all' to attach the custom
timeout policies for the generic protocol tracker.

Moreover, we may use `iptables -p sctp' while the SCTP tracker is
not loaded. In that case, we have to default on the generic protocol
tracker.

Another possibility is `iptables -p ip' that should be supported
as well. This patch makes sure we validate all possible scenarios.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-23 00:52:08 +01:00
Pablo Neira Ayuso
1ac0bf9926 netfilter: xt_CT: missing rcu_read_lock section in timeout assignment
Fix a dereference to pointer without rcu_read_lock held.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-23 00:52:07 +01:00
Pablo Neira Ayuso
c1ebd7dff7 netfilter: cttimeout: fix dependency with l4protocol conntrack module
This patch introduces nf_conntrack_l4proto_find_get() and
nf_conntrack_l4proto_put() to fix module dependencies between
timeout objects and l4-protocol conntrack modules.

Thus, we make sure that the module cannot be removed if it is
used by any of the cttimeout objects.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-23 00:52:01 +01:00
Steffen Klassert
1265fd6167 xfrm: Access the replay notify functions via the registered callbacks
We call the wrong replay notify function when we use ESN replay
handling. This leads to the fact that we don't send notifications
if we use ESN. Fix this by calling the registered callbacks instead
of xfrm_replay_notify().

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-22 19:29:58 -04:00
Steffen Klassert
26b2072e75 xfrm: Remove unused xfrm_state from xfrm_state_check_space
The xfrm_state argument is unused in this function, so remove it.
Also the name xfrm_state_check_space does not really match what this
function does. It actually checks if we have enough head and tailroom
on the skb. So we rename the function to xfrm_skb_check_space.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-22 19:29:58 -04:00
Dan Carpenter
f0229eaaf3 RDS: use gfp flags from caller in conn_alloc()
We should be using the gfp flags the caller specified here, instead of
GFP_KERNEL.  I think this might be a bugfix, depending on the value of
"sock->sk->sk_allocation" when we call rds_conn_create_outgoing() in
rds_sendmsg().  Otherwise, it's just a cleanup.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-22 19:29:58 -04:00
Dan Carpenter
64b5fad526 netlabel: use GFP flags from caller instead of GFP_ATOMIC
This function takes a GFP flags as a parameter, but they are never used.
We don't take a lock in this function so there is no reason to prefer
GFP_ATOMIC over the caller's GFP flags.

There is only one caller, cipso_v4_map_cat_rng_ntoh(), and it passes
GFP_ATOMIC as the GFP flags so this doesn't change how the code works.
It's just a cleanup.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-22 19:29:57 -04:00
Alex Elder
8d63e318c4 libceph: isolate kmap() call in write_partial_msg_pages()
In write_partial_msg_pages(), every case now does an identical call
to kmap(page).  Instead, just call it once inside the CRC-computing
block where it's needed.  Move the definition of kaddr inside that
block, and make it a (char *) to ensure portable pointer arithmetic.

We still don't kunmap() it until after the sendpage() call, in case
that also ends up needing to use the mapping.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:52 -05:00
Alex Elder
9bd1966344 libceph: rename "page_shift" variable to something sensible
In write_partial_msg_pages() there is a local variable used to
track the starting offset within a bio segment to use.  Its name,
"page_shift" defies the Linux convention of using that name for
log-base-2(page size).

Since it's only used in the bio case rename it "bio_offset".  Use it
along with the page_pos field to compute the memory offset when
computing CRC's in that function.  This makes the bio case match the
others more closely.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:52 -05:00
Alex Elder
0cdf9e6018 libceph: get rid of zero_page_address
There's not a lot of benefit to zero_page_address, which basically
holds a mapping of the zero page through the life of the messenger
module.  Even with our own mapping, the sendpage interface where
it's used may need to kmap() it again.  It's almost certain to
be in low memory anyway.

So stop treating the zero page specially in write_partial_msg_pages()
and just get rid of zero_page_address entirely.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:52 -05:00
Alex Elder
e36b13cceb libceph: only call kernel_sendpage() via helper
Make ceph_tcp_sendpage() be the only place kernel_sendpage() is
used, by using this helper in write_partial_msg_pages().

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:52 -05:00
Alex Elder
31739139f3 libceph: use kernel_sendpage() for sending zeroes
If a message queued for send gets revoked, zeroes are sent over the
wire instead of any unsent data.  This is done by constructing a
message and passing it to kernel_sendmsg() via ceph_tcp_sendmsg().

Since we are already working with a page in this case we can use
the sendpage interface instead.  Create a new ceph_tcp_sendpage()
helper that sets up flags to match the way ceph_tcp_sendmsg()
does now.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
37675b0f42 libceph: fix inverted crc option logic
CRC's are computed for all messages between ceph entities.  The CRC
computation for the data portion of message can optionally be
disabled using the "nocrc" (common) ceph option.  The default is
for CRC computation for the data portion to be enabled.

Unfortunately, the code that implements this feature interprets the
feature flag wrong, meaning that by default the CRC's have *not*
been computed (or checked) for the data portion of messages unless
the "nocrc" option was supplied.

Fix this, in write_partial_msg_pages() and read_partial_message().
Also change the flag variable in write_partial_msg_pages() to be
"no_datacrc" to match the usage elsewhere in the file.

This fixes http://tracker.newdream.net/issues/2064

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
84495f4961 libceph: some simple changes
Nothing too big here.
    - define the size of the buffer used for consuming ignored
      incoming data using a symbolic constant
    - simplify the condition determining whether to unmap the page
      in write_partial_msg_pages(): do it for crc but not if the
      page is the zero page

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
f42299e6c3 libceph: small refactor in write_partial_kvec()
Make a small change in the code that counts down kvecs consumed by
a ceph_tcp_sendmsg() call.  Same functionality, just blocked out
a little differently.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
fe3ad593e2 libceph: do crc calculations outside loop
Move blocks of code out of loops in read_partial_message_section()
and read_partial_message().  They were only was getting called at
the end of the last iteration of the loop anyway.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
a9a0c51af4 libceph: separate CRC calculation from byte swapping
Calculate CRC in a separate step from rearranging the byte order
of the result, to improve clarity and readability.

Use offsetof() to determine the number of bytes to include in the
CRC calculation.

In read_partial_message(), switch which value gets byte-swapped,
since the just-computed CRC is already likely to be in a register.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
bca064d236 libceph: use "do" in CRC-related Boolean variables
Change the name (and type) of a few CRC-related Boolean local
variables so they contain the word "do", to distingish their purpose
from variables used for holding an actual CRC value.

Note that in the process of doing this I identified a fairly serious
logic error in write_partial_msg_pages():  the value of "do_crc"
assigned appears to be the opposite of what it should be.  No
attempt to fix this is made here; this change preserves the
erroneous behavior.  The problem I found is documented here:
    http://tracker.newdream.net/issues/2064

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
cffaba15cd ceph: ensure Boolean options support both senses
Many ceph-related Boolean options offer the ability to both enable
and disable a feature.  For all those that don't offer this, add
a new option so that they do.

Note that ceph_show_options()--which reports mount options currently
in effect--only reports the option if it is different from the
default value.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
d3002b974c libceph: a few small changes
This gathers a number of very minor changes:
    - use %hu when formatting the a socket address's address family
    - null out the ceph_msgr_wq pointer after the queue has been
      destroyed
    - drop a needless cast in ceph_write_space()
    - add a WARN() call in ceph_state_change() in the event an
      unrecognized socket state is encountered
    - rearrange the logic in ceph_con_get() and ceph_con_put() so
      that:
        - the reference counts are only atomically read once
	- the values displayed via dout() calls are known to
	  be meaningful at the time they are formatted

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
41617d0c9c libceph: make ceph_tcp_connect() return int
There is no real need for ceph_tcp_connect() to return the socket
pointer it creates, since it already assigns it to con->sock, which
is visible to the caller.  Instead, have it return an error code,
which tidies things up a bit.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
6173d1f02f libceph: encapsulate some messenger cleanup code
Define a helper function to perform various cleanup operations.  Use
it both in the exit routine and in the init routine in the event of
an error.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:50 -05:00
Alex Elder
e0f43c9419 libceph: make ceph_msgr_wq private
The messenger workqueue has no need to be public.  So give it static
scope.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:50 -05:00
Alex Elder
859eb79948 libceph: encapsulate connection kvec operations
Encapsulate the operation of adding a new chunk of data to the next
open slot in a ceph_connection's out_kvec array.  Also add a "reset"
operation to make subsequent add operations start at the beginning
of the array again.

Use these routines throughout, avoiding duplicate code and ensuring
all calls are handled consistently.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:50 -05:00
Alex Elder
963be4d770 libceph: move prepare_write_banner()
One of the arguments to prepare_write_connect() indicates whether it
is being called immediately after a call to prepare_write_banner().
Move the prepare_write_banner() call inside prepare_write_connect(),
and reinterpret (and rename) the "after_banner" argument so it
indicates that prepare_write_connect() should *make* the call
rather than should know it has already been made.

This was split out from the next patch to highlight this change in
logic.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:50 -05:00
Alex Elder
ee57741c52 rbd: make ceph_parse_options() return a pointer
ceph_parse_options() takes the address of a pointer as an argument
and uses it to return the address of an allocated structure if
successful.  With this interface is not evident at call sites that
the pointer is always initialized.  Change the interface to return
the address instead (or a pointer-coded error code) to make the
validity of the returned pointer obvious.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Alex Elder
99f0f3b2c4 ceph: eliminate some abusive casts
This fixes some spots where a type cast to (void *) was used as
as a universal type hiding mechanism.  Instead, properly cast the
type to the intended target type.

Signed-off-by: Alex Elder <elder@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:45 -05:00
Alex Elder
bd40614512 ceph: eliminate some needless casts
This eliminates type casts in some places where they are not
required.

Signed-off-by: Alex Elder <elder@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:45 -05:00
Alex Elder
f64a93172b ceph: kill addr_str_lock spinlock; use atomic instead
A spinlock is used to protect a value used for selecting an array
index for a string used for formatting a socket address for human
consumption.  The index is reset to 0 if it ever reaches the maximum
index value.

Instead, use an ever-increasing atomic variable as a sequence
number, and compute the array index by masking off all but the
sequence number's lowest bits.  Make the number of entries in the
array a power of two to allow the use of such a mask (to avoid jumps
in the index value when the sequence number wraps).

The length of these strings is somewhat arbitrarily set at 60 bytes.
The worst-case length of a string produced is 54 bytes, for an IPv6
address that can't be shortened, e.g.:
    [1234:5678:9abc:def0:1111:2222:123.234.210.100]:32767
Change it so we arbitrarily use 64 bytes instead; if nothing else
it will make the array of these line up better in hex dumps.

Rename a few things to reinforce the distinction between the number
of strings in the array and the length of individual strings.

Signed-off-by: Alex Elder <elder@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:45 -05:00
Alex Elder
a5bc3129a2 ceph: make use of "else" where appropriate
Rearrange ceph_tcp_connect() a bit, making use of "else" rather than
re-testing a value with consecutive "if" statements.  Don't record a
connection's socket pointer unless the connect operation is
successful.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:45 -05:00
Alex Elder
5766651971 ceph: use a shared zero page rather than one per messenger
Each messenger allocates a page to be used when writing zeroes
out in the event of error or other abnormal condition.  Instead,
use the kernel ZERO_PAGE() for that purpose.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:45 -05:00
Xi Wang
6448669777 libceph: fix overflow check in crush_decode()
The existing overflow check (n > ULONG_MAX / b) didn't work, because
n = ULONG_MAX / b would both bypass the check and still overflow the
allocation size a + n * b.

The correct check should be (n > (ULONG_MAX - a) / b).

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:45 -05:00
Jim Schutt
182fac2689 net/ceph: Only clear SOCK_NOSPACE when there is sufficient space in the socket buffer
The Ceph messenger would sometimes queue multiple work items to write
data to a socket when the socket buffer was full.

Fix this problem by making ceph_write_space() use SOCK_NOSPACE in the
same way that net/core/stream.c:sk_stream_write_space() does, i.e.,
clearing it only when sufficient space is available in the socket buffer.

Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Reviewed-by: Alex Elder <elder@dreamhost.com>
2012-03-22 10:47:45 -05:00
Pablo Neira Ayuso
a0f65a267d netfilter: xt_LOG: use CONFIG_IP6_NF_IPTABLES instead of CONFIG_IPV6
This fixes the following linking error:

xt_LOG.c:(.text+0x789b1): undefined reference to `ip6t_ext_hdr'

ifdefs have to use CONFIG_IP6_NF_IPTABLES instead of CONFIG_IPV6.

Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-22 11:50:56 +01:00
Benjamin LaHaise
9395a09d05 l2tp: enable automatic module loading for l2tp_ppp
When L2TP is configured as a module, requests for L2TP sockets do not result
in the l2tp_ppp module being loaded.  Fix this by adding the appropriate
MODULE_ALIAS to be recognized by pppox's request_module() call.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-21 22:14:56 -04:00
Eric Dumazet
2a2a459eee net: fix napi_reuse_skb() skb reserve
napi->skb is allocated in napi_get_frags() using
netdev_alloc_skb_ip_align(), with a reserve of NET_SKB_PAD +
NET_IP_ALIGN bytes.

However, when such skb is recycled in napi_reuse_skb(), it ends with a
reserve of NET_IP_ALIGN which is suboptimal.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-21 16:52:09 -04:00
Linus Torvalds
e2a0883e40 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile 1 from Al Viro:
 "This is _not_ all; in particular, Miklos' and Jan's stuff is not there
  yet."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
  ext4: initialization of ext4_li_mtx needs to be done earlier
  debugfs-related mode_t whack-a-mole
  hfsplus: add an ioctl to bless files
  hfsplus: change finder_info to u32
  hfsplus: initialise userflags
  qnx4: new helper - try_extent()
  qnx4: get rid of qnx4_bread/qnx4_getblk
  take removal of PF_FORKNOEXEC to flush_old_exec()
  trim includes in inode.c
  um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it
  um: embed ->stub_pages[] into mmu_context
  gadgetfs: list_for_each_safe() misuse
  ocfs2: fix leaks on failure exits in module_init
  ecryptfs: make register_filesystem() the last potential failure exit
  ntfs: forgets to unregister sysctls on register_filesystem() failure
  logfs: missing cleanup on register_filesystem() failure
  jfs: mising cleanup on register_filesystem() failure
  make configfs_pin_fs() return root dentry on success
  configfs: configfs_create_dir() has parent dentry in dentry->d_parent
  configfs: sanitize configfs_create()
  ...
2012-03-21 13:36:41 -07:00
Linus Torvalds
3556485f15 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates for 3.4 from James Morris:
 "The main addition here is the new Yama security module from Kees Cook,
  which was discussed at the Linux Security Summit last year.  Its
  purpose is to collect miscellaneous DAC security enhancements in one
  place.  This also marks a departure in policy for LSM modules, which
  were previously limited to being standalone access control systems.
  Chromium OS is using Yama, and I believe there are plans for Ubuntu,
  at least.

  This patchset also includes maintenance updates for AppArmor, TOMOYO
  and others."

Fix trivial conflict in <net/sock.h> due to the jumo_label->static_key
rename.

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (38 commits)
  AppArmor: Fix location of const qualifier on generated string tables
  TOMOYO: Return error if fails to delete a domain
  AppArmor: add const qualifiers to string arrays
  AppArmor: Add ability to load extended policy
  TOMOYO: Return appropriate value to poll().
  AppArmor: Move path failure information into aa_get_name and rename
  AppArmor: Update dfa matching routines.
  AppArmor: Minor cleanup of d_namespace_path to consolidate error handling
  AppArmor: Retrieve the dentry_path for error reporting when path lookup fails
  AppArmor: Add const qualifiers to generated string tables
  AppArmor: Fix oops in policy unpack auditing
  AppArmor: Fix error returned when a path lookup is disconnected
  KEYS: testing wrong bit for KEY_FLAG_REVOKED
  TOMOYO: Fix mount flags checking order.
  security: fix ima kconfig warning
  AppArmor: Fix the error case for chroot relative path name lookup
  AppArmor: fix mapping of META_READ to audit and quiet flags
  AppArmor: Fix underflow in xindex calculation
  AppArmor: Fix dropping of allowed operations that are force audited
  AppArmor: Add mising end of structure test to caps unpacking
  ...
2012-03-21 13:25:04 -07:00
Linus Torvalds
9f3938346a Merge branch 'kmap_atomic' of git://github.com/congwang/linux
Pull kmap_atomic cleanup from Cong Wang.

It's been in -next for a long time, and it gets rid of the (no longer
used) second argument to k[un]map_atomic().

Fix up a few trivial conflicts in various drivers, and do an "evil
merge" to catch some new uses that have come in since Cong's tree.

* 'kmap_atomic' of git://github.com/congwang/linux: (59 commits)
  feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal
  highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename]
  drbd: remove the second argument of k[un]map_atomic()
  zcache: remove the second argument of k[un]map_atomic()
  gma500: remove the second argument of k[un]map_atomic()
  dm: remove the second argument of k[un]map_atomic()
  tomoyo: remove the second argument of k[un]map_atomic()
  sunrpc: remove the second argument of k[un]map_atomic()
  rds: remove the second argument of k[un]map_atomic()
  net: remove the second argument of k[un]map_atomic()
  mm: remove the second argument of k[un]map_atomic()
  lib: remove the second argument of k[un]map_atomic()
  power: remove the second argument of k[un]map_atomic()
  kdb: remove the second argument of k[un]map_atomic()
  udf: remove the second argument of k[un]map_atomic()
  ubifs: remove the second argument of k[un]map_atomic()
  squashfs: remove the second argument of k[un]map_atomic()
  reiserfs: remove the second argument of k[un]map_atomic()
  ocfs2: remove the second argument of k[un]map_atomic()
  ntfs: remove the second argument of k[un]map_atomic()
  ...
2012-03-21 09:40:26 -07:00
Tom Tucker
9b78145c0f xprtrdma: Remove assumption that each segment is <= PAGE_SIZE
The xprtrdma FRMR mapping logic assumes that a segment is <= PAGE_SIZE.
This is not true for NFS4.

Signed-off-by: Tom Tucker <tom@ogc.us>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-21 09:31:47 -04:00
Tom Tucker
4a6862b364 xprtrdma: The transport should not bug-check when a dup reply is received
The client side RDMA transport will bug check if it receives a duplicate
reply, instead we should simply drop the duplicate reply.

Signed-off-by: Tom Tucker <tom@ogc.us>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-21 09:31:47 -04:00
Trond Myklebust
ffa94db604 SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined
Stephen Rothwell reports:
net/sunrpc/rpcb_clnt.c: In function 'rpcb_enc_mapping':
net/sunrpc/rpcb_clnt.c:820:19: warning: unused variable 'task' [-Wunused-variable]
net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_getport':
net/sunrpc/rpcb_clnt.c:837:19: warning: unused variable 'task' [-Wunused-variable]
net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_set':
net/sunrpc/rpcb_clnt.c:860:19: warning: unused variable 'task' [-Wunused-variable]
net/sunrpc/rpcb_clnt.c: In function 'rpcb_enc_getaddr':
net/sunrpc/rpcb_clnt.c:892:19: warning: unused variable 'task' [-Wunused-variable]
net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_getaddr':
net/sunrpc/rpcb_clnt.c:914:19: warning: unused variable 'task' [-Wunused-variable]
fs/lockd/svclock.c:49:20: warning: 'nlmdbg_cookie2a' declared 'static' but never defined [-Wunused-function]

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-21 09:31:44 -04:00
Linus Torvalds
69a7aebcf0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree from Jiri Kosina:
 "It's indeed trivial -- mostly documentation updates and a bunch of
  typo fixes from Masanari.

  There are also several linux/version.h include removals from Jesper."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (101 commits)
  kcore: fix spelling in read_kcore() comment
  constify struct pci_dev * in obvious cases
  Revert "char: Fix typo in viotape.c"
  init: fix wording error in mm_init comment
  usb: gadget: Kconfig: fix typo for 'different'
  Revert "power, max8998: Include linux/module.h just once in drivers/power/max8998_charger.c"
  writeback: fix fn name in writeback_inodes_sb_nr_if_idle() comment header
  writeback: fix typo in the writeback_control comment
  Documentation: Fix multiple typo in Documentation
  tpm_tis: fix tis_lock with respect to RCU
  Revert "media: Fix typo in mixer_drv.c and hdmi_drv.c"
  Doc: Update numastat.txt
  qla4xxx: Add missing spaces to error messages
  compiler.h: Fix typo
  security: struct security_operations kerneldoc fix
  Documentation: broken URL in libata.tmpl
  Documentation: broken URL in filesystems.tmpl
  mtd: simplify return logic in do_map_probe()
  mm: fix comment typo of truncate_inode_pages_range
  power: bq27x00: Fix typos in comment
  ...
2012-03-20 21:12:50 -07:00
Linus Torvalds
3b59bf0816 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking merge from David Miller:
 "1) Move ixgbe driver over to purely page based buffering on receive.
     From Alexander Duyck.

  2) Add receive packet steering support to e1000e, from Bruce Allan.

  3) Convert TCP MD5 support over to RCU, from Eric Dumazet.

  4) Reduce cpu usage in handling out-of-order TCP packets on modern
     systems, also from Eric Dumazet.

  5) Support the IP{,V6}_UNICAST_IF socket options, making the wine
     folks happy, from Erich Hoover.

  6) Support VLAN trunking from guests in hyperv driver, from Haiyang
     Zhang.

  7) Support byte-queue-limtis in r8169, from Igor Maravic.

  8) Outline code intended for IP_RECVTOS in IP_PKTOPTIONS existed but
     was never properly implemented, Jiri Benc fixed that.

  9) 64-bit statistics support in r8169 and 8139too, from Junchang Wang.

  10) Support kernel side dump filtering by ctmark in netfilter
      ctnetlink, from Pablo Neira Ayuso.

  11) Support byte-queue-limits in gianfar driver, from Paul Gortmaker.

  12) Add new peek socket options to assist with socket migration, from
      Pavel Emelyanov.

  13) Add sch_plug packet scheduler whose queue is controlled by
      userland daemons using explicit freeze and release commands.  From
      Shriram Rajagopalan.

  14) Fix FCOE checksum offload handling on transmit, from Yi Zou."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1846 commits)
  Fix pppol2tp getsockname()
  Remove printk from rds_sendmsg
  ipv6: fix incorrent ipv6 ipsec packet fragment
  cpsw: Hook up default ndo_change_mtu.
  net: qmi_wwan: fix build error due to cdc-wdm dependecy
  netdev: driver: ethernet: Add TI CPSW driver
  netdev: driver: ethernet: add cpsw address lookup engine support
  phy: add am79c874 PHY support
  mlx4_core: fix race on comm channel
  bonding: send igmp report for its master
  fs_enet: Add MPC5125 FEC support and PHY interface selection
  net: bpf_jit: fix BPF_S_LDX_B_MSH compilation
  net: update the usage of CHECKSUM_UNNECESSARY
  fcoe: use CHECKSUM_UNNECESSARY instead of CHECKSUM_PARTIAL on tx
  net: do not do gso for CHECKSUM_UNNECESSARY in netif_needs_gso
  ixgbe: Fix issues with SR-IOV loopback when flow control is disabled
  net/hyperv: Fix the code handling tx busy
  ixgbe: fix namespace issues when FCoE/DCB is not enabled
  rtlwifi: Remove unused ETH_ADDR_LEN defines
  igbvf: Use ETH_ALEN
  ...

Fix up fairly trivial conflicts in drivers/isdn/gigaset/interface.c and
drivers/net/usb/{Kconfig,qmi_wwan.c} as per David.
2012-03-20 21:04:47 -07:00
Al Viro
68ac1234fb switch touch_atime to struct path
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:41 -04:00
Al Viro
40ffe67d2e switch unix_sock to struct path
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:41 -04:00
Al Viro
48fde701af switch open-coded instances of d_make_root() to new helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:35 -04:00
Linus Torvalds
0d9cabdcce Merge branch 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup changes from Tejun Heo:
 "Out of the 8 commits, one fixes a long-standing locking issue around
  tasklist walking and others are cleanups."

* 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Walk task list under tasklist_lock in cgroup_enable_task_cg_list
  cgroup: Remove wrong comment on cgroup_enable_task_cg_list()
  cgroup: remove cgroup_subsys argument from callbacks
  cgroup: remove extra calls to find_existing_css_set
  cgroup: replace tasklist_lock with rcu_read_lock
  cgroup: simplify double-check locking in cgroup_attach_proc
  cgroup: move struct cgroup_pidlist out from the header file
  cgroup: remove cgroup_attach_task_current_cg()
2012-03-20 18:11:21 -07:00
Benjamin LaHaise
bbdb32cb5b Fix pppol2tp getsockname()
While testing L2TP functionality, I came across a bug in getsockname().  The
IP address returned within the pppol2tp_addr's addr memember was not being
set to the IP  address in use.  This bug is caused by using inet_sk() on the
wrong socket (the L2TP socket rather than the underlying UDP socket), and was
likely introduced during the addition of L2TPv3 support.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-20 16:12:11 -04:00
Dave Jones
a6506e1486 Remove printk from rds_sendmsg
no socket layer outputs a message for this error and neither should rds.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-20 16:12:11 -04:00
Linus Torvalds
843ec558f9 tty and serial merge for 3.4-rc1
Here's the big serial and tty merge for the 3.4-rc1 tree.
 
 There's loads of fixes and reworks in here from Jiri for the tty layer,
 and a number of patches from Alan to help try to wrestle the vt layer
 into a sane model.
 
 Other than that, lots of driver updates and fixes, and other minor
 stuff, all detailed in the shortlog.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iEYEABECAAYFAk9nihQACgkQMUfUDdst+ylXTQCdFuwVuZgjCts+xDVa1jX2ac84
 UogAn3Wr+P7NYFN6gvaGm52KbGbZs405
 =2b/l
 -----END PGP SIGNATURE-----

Merge tag 'tty-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull TTY/serial patches from Greg KH:
 "tty and serial merge for 3.4-rc1

  Here's the big serial and tty merge for the 3.4-rc1 tree.

  There's loads of fixes and reworks in here from Jiri for the tty
  layer, and a number of patches from Alan to help try to wrestle the vt
  layer into a sane model.

  Other than that, lots of driver updates and fixes, and other minor
  stuff, all detailed in the shortlog."

* tag 'tty-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (132 commits)
  serial: pxa: add clk_prepare/clk_unprepare calls
  TTY: Wrong unicode value copied in con_set_unimap()
  serial: PL011: clear pending interrupts
  serial: bfin-uart: Don't access tty circular buffer in TX DMA interrupt after it is reset.
  vt: NULL dereference in vt_do_kdsk_ioctl()
  tty: serial: vt8500: fix annotations for probe/remove
  serial: remove back and forth conversions in serial_out_sync
  serial: use serial_port_in/out vs serial_in/out in 8250
  serial: introduce generic port in/out helpers
  serial: reduce number of indirections in 8250 code
  serial: delete useless void casts in 8250.c
  serial: make 8250's serial_in shareable to other drivers.
  serial: delete last unused traces of pausing I/O in 8250
  pch_uart: Add module parameter descriptions
  pch_uart: Use existing default_baud in setup_console
  pch_uart: Add user_uartclk parameter
  pch_uart: Add Fish River Island II uart clock quirks
  pch_uart: Use uartclk instead of base_baud
  mpc5200b/uart: select more tolerant uart prescaler on low baudrates
  tty: moxa: fix bit test in moxa_start()
  ...
2012-03-20 11:24:39 -07:00
Linus Torvalds
9c2b957db1 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events changes for v3.4 from Ingo Molnar:

 - New "hardware based branch profiling" feature both on the kernel and
   the tooling side, on CPUs that support it.  (modern x86 Intel CPUs
   with the 'LBR' hardware feature currently.)

   This new feature is basically a sophisticated 'magnifying glass' for
   branch execution - something that is pretty difficult to extract from
   regular, function histogram centric profiles.

   The simplest mode is activated via 'perf record -b', and the result
   looks like this in perf report:

	$ perf record -b any_call,u -e cycles:u branchy

	$ perf report -b --sort=symbol
	    52.34%  [.] main                   [.] f1
	    24.04%  [.] f1                     [.] f3
	    23.60%  [.] f1                     [.] f2
	     0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
	     0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
	     0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
	     0.01%  [k] __printf               [k] _IO_vfprintf_internal
	     0.01%  [k] main                   [k] __printf

   This output shows from/to branch columns and shows the highest
   percentage (from,to) jump combinations - i.e.  the most likely taken
   branches in the system.  "branches" can also include function calls
   and any other synchronous and asynchronous transitions of the
   instruction pointer that are not 'next instruction' - such as system
   calls, traps, interrupts, etc.

   This feature comes with (hopefully intuitive) flat ascii and TUI
   support in perf report.

 - Various 'perf annotate' visual improvements for us assembly junkies.
   It will now recognize function calls in the TUI and by hitting enter
   you can follow the call (recursively) and back, amongst other
   improvements.

 - Multiple threads/processes recording support in perf record, perf
   stat, perf top - which is activated via a comma-list of PIDs:

	perf top -p 21483,21485
	perf stat -p 21483,21485 -ddd
	perf record -p 21483,21485

 - Support for per UID views, via the --uid paramter to perf top, perf
   report, etc.  For example 'perf top --uid mingo' will only show the
   tasks that I am running, excluding other users, root, etc.

 - Jump label restructurings and improvements - this includes the
   factoring out of the (hopefully much clearer) include/linux/static_key.h
   generic facility:

	struct static_key key = STATIC_KEY_INIT_FALSE;

	...

	if (static_key_false(&key))
	        do unlikely code
	else
	        do likely code

	...
	static_key_slow_inc();
	...
	static_key_slow_inc();
	...

   The static_key_false() branch will be generated into the code with as
   little impact to the likely code path as possible.  the
   static_key_slow_*() APIs flip the branch via live kernel code patching.

   This facility can now be used more widely within the kernel to
   micro-optimize hot branches whose likelihood matches the static-key
   usage and fast/slow cost patterns.

 - SW function tracer improvements: perf support and filtering support.

 - Various hardenings of the perf.data ABI, to make older perf.data's
   smoother on newer tool versions, to make new features integrate more
   smoothly, to support cross-endian recording/analyzing workflows
   better, etc.

 - Restructuring of the kprobes code, the splitting out of 'optprobes',
   and a corner case bugfix.

 - Allow the tracing of kernel console output (printk).

 - Improvements/fixes to user-space RDPMC support, allowing user-space
   self-profiling code to extract PMU counts without performing any
   system calls, while playing nice with the kernel side.

 - 'perf bench' improvements

 - ... and lots of internal restructurings, cleanups and fixes that made
   these features possible.  And, as usual this list is incomplete as
   there were also lots of other improvements

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits)
  perf report: Fix annotate double quit issue in branch view mode
  perf report: Remove duplicate annotate choice in branch view mode
  perf/x86: Prettify pmu config literals
  perf report: Enable TUI in branch view mode
  perf report: Auto-detect branch stack sampling mode
  perf record: Add HEADER_BRANCH_STACK tag
  perf record: Provide default branch stack sampling mode option
  perf tools: Make perf able to read files from older ABIs
  perf tools: Fix ABI compatibility bug in print_event_desc()
  perf tools: Enable reading of perf.data files from different ABI rev
  perf: Add ABI reference sizes
  perf report: Add support for taken branch sampling
  perf record: Add support for sampling taken branch
  perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
  x86/kprobes: Split out optprobe related code to kprobes-opt.c
  x86/kprobes: Fix a bug which can modify kernel code permanently
  x86/kprobes: Fix instruction recovery on optimized path
  perf: Add callback to flush branch_stack on context switch
  perf: Disable PERF_SAMPLE_BRANCH_* when not supported
  perf/x86: Add LBR software filter support for Intel CPUs
  ...
2012-03-20 10:29:15 -07:00
Linus Torvalds
5928a2b60c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes for v3.4 from Ingo Molnar.  The major features of this
series are:

 - making RCU more aggressive about entering dyntick-idle mode in order
   to improve energy efficiency

 - converting a few more call_rcu()s to kfree_rcu()s

 - applying a number of rcutree fixes and cleanups to rcutiny

 - removing CONFIG_SMP #ifdefs from treercu

 - allowing RCU CPU stall times to be set via sysfs

 - adding CPU-stall capability to rcutorture

 - adding more RCU-abuse diagnostics

 - updating documentation

 - fixing yet more issues located by the still-ongoing top-to-bottom
   inspection of RCU, this time with a special focus on the CPU-hotplug
   code path.

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
  rcu: Stop spurious warnings from synchronize_sched_expedited
  rcu: Hold off RCU_FAST_NO_HZ after timer posted
  rcu: Eliminate softirq-mediated RCU_FAST_NO_HZ idle-entry loop
  rcu: Add RCU_NONIDLE() for idle-loop RCU read-side critical sections
  rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit()
  rcu: Remove redundant check for rcu_head misalignment
  PTR_ERR should be called before its argument is cleared.
  rcu: Convert WARN_ON_ONCE() in rcu_lock_acquire() to lockdep
  rcu: Trace only after NULL-pointer check
  rcu: Call out dangers of expedited RCU primitives
  rcu: Rework detection of use of RCU by offline CPUs
  lockdep: Add CPU-idle/offline warning to lockdep-RCU splat
  rcu: No interrupt disabling for rcu_prepare_for_idle()
  rcu: Move synchronize_sched_expedited() to rcutree.c
  rcu: Check for illegal use of RCU from offlined CPUs
  rcu: Update stall-warning documentation
  rcu: Add CPU-stall capability to rcutorture
  rcu: Make documentation give more realistic rcutorture duration
  rcutorture: Permit holding off CPU-hotplug operations during boot
  rcu: Print scheduling-clock information on RCU CPU stall-warning messages
  ...
2012-03-20 10:10:18 -07:00
Trond Myklebust
e27d359e9b SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG
This allows us to turn on/off the dprintk() debugging interfaces for
those distributions that don't ship the 'rpcdebug' utility.
It also allows us to add Kbuild dependencies. Specifically, we already
know that dprintk() in general relies on CONFIG_SYSCTL. Now it turns out
that the NFS dprintks depend on CONFIG_CRC32 after we added support
for the filehandle hash.

Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20 13:08:26 -04:00
Cong Wang
b854178601 sunrpc: remove the second argument of k[un]map_atomic()
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20 21:48:28 +08:00
Cong Wang
6114eab535 rds: remove the second argument of k[un]map_atomic()
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20 21:48:28 +08:00
Cong Wang
0352bc550c net: remove the second argument of k[un]map_atomic()
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20 21:48:27 +08:00
Gao feng
1f85851e17 ipv6: fix incorrent ipv6 ipsec packet fragment
Since commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
In func ip6_append_data,after call skb_put(skb, fraglen + dst_exthdrlen)
the skb->len contains dst_exthdrlen,and we don't reduce dst_exthdrlen at last
This will make fraggap>0 in next "while cycle",and cause the size of skb incorrent

Fix this by reserve headroom for dst_exthdrlen.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-20 05:39:34 -04:00
Eric Dumazet
c8628155ec tcp: reduce out_of_order memory use
With increasing receive window sizes, but speed of light not improved
that much, out of order queue can contain a huge number of skbs, waiting
to be moved to receive_queue when missing packets can fill the holes.

Some devices happen to use fat skbs (truesize of 4096 + sizeof(struct
sk_buff)) to store regular (MTU <= 1500) frames. This makes highly
probable sk_rmem_alloc hits sk_rcvbuf limit, which can be 4Mbytes in
many cases.

When limit is hit, tcp stack calls tcp_collapse_ofo_queue(), a true
latency killer and cpu cache blower.

Doing the coalescing attempt each time we add a frame in ofo queue
permits to keep memory use tight and in many cases avoid the
tcp_collapse() thing later.

Tested on various wireless setups (b43, ath9k, ...) known to use big skb
truesize, this patch removed the "packets collapsed in receive queue due
to low socket buffer" I had before.

This also reduced average memory used by tcp sockets.

With help from Neal Cardwell.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-19 16:53:08 -04:00
Eric Dumazet
e86b291962 tcp: introduce tcp_data_queue_ofo
Split tcp_data_queue() in two parts for better readability.

tcp_data_queue_ofo() is responsible for queueing incoming skb into out
of order queue.

Change code layout so that the skb_set_owner_r() is performed only if
skb is not dropped.

This is a preliminary patch before "reduce out_of_order memory use"
following patch.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-19 16:53:07 -04:00
Trond Myklebust
540a0f7584 SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()
The problem is that for the case of priority queues, we
have to assume that __rpc_remove_wait_queue_priority will move new
elements from the tk_wait.links lists into the queue->tasks[] list.
We therefore cannot use list_for_each_entry_safe() on queue->tasks[],
since that will skip these new tasks that __rpc_remove_wait_queue_priority
is adding.

Without this fix, rpc_wake_up and rpc_wake_up_status will both fail
to wake up all functions on priority wait queues, which can result
in some nasty hangs.

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-03-19 14:15:02 -04:00
David S. Miller
4da0bd7365 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-03-18 23:29:41 -04:00
Pablo Neira Ayuso
a16a1647fa netfilter: ctnetlink: fix race between delete and timeout expiration
Kerin Millar reported hardlockups while running `conntrackd -c'
in a busy firewall. That system (with several processors) was
acting as backup in a primary-backup setup.

After several tries, I found a race condition between the deletion
operation of ctnetlink and timeout expiration. This patch fixes
this problem.

Tested-by: Kerin Millar <kerframil@gmail.com>
Reported-by: Kerin Millar <kerframil@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-17 01:47:08 -07:00
Neil Horman
124d37e9f0 arp: allow arp processing to honor per interface arp_accept sysctl
I found recently that the arp_process function which handles all of our received
arp frames, is using IPV4_DEVCONF_ALL macro to check the state of the arp_process
flag.  This seems wrong, as it implies that either none or all of the network
interfaces accept gratuitous arps.  This patch corrects that, allowing
per-interface arp_accept configuration to deviate from the all setting.  Note
this also brings us into line with the way the arp_filter setting is handled
during arp_process execution.

Tested this myself on my home network, and confirmed it works as expected.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 23:00:20 -07:00
RongQing.Li
c577923756 ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.
ip6_mc_find_dev_rcu() is called with rcu_read_lock(), so don't
need to dev_hold().
With dev_hold(), not corresponding dev_put(), will lead to leak.

[ bug introduced in 96b52e61be (ipv6: mcast: RCU conversions) ]

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 21:56:42 -07:00
John W. Linville
01a2829809 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
	drivers/net/wireless/ath/ath9k/hw.c
2012-03-16 13:45:25 -04:00
Eric Dumazet
cc34eb672e sch_sfq: revert dont put new flow at the end of flows
This reverts commit d47a0ac7b6 (sch_sfq: dont put new flow at the end of
flows)

As Jesper found out, patch sounded great but has bad side effects.

In stress situation, pushing new flows in front of the queue can prevent
old flows doing any progress. Packets can stay in SFQ queue for
unlimited amount of time.

It's possible to add heuristics to limit this problem, but this would
add complexity outside of SFQ scope.

A more sensible answer to Dave Taht concerns (who reported the issued I
tried to solve in original commit) is probably to use a qdisc hierarchy
so that high prio packets dont enter a potentially crowded SFQ qdisc.

Reported-by: Jesper Dangaard Brouer <jdb@comx.dk>
Cc: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 01:55:25 -07:00
Eric Dumazet
122bdf67f1 ipv6: fix icmp6_dst_alloc()
commit 87a115783 ( ipv6: Move xfrm_lookup() call down into
icmp6_dst_alloc().) forgot to convert one error path, leading
to crashes in mld_sendpack()

Many thanks to Dave Jones for providing a very complete bug report.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 01:53:42 -07:00
Eliad Peller
dc41e4d474 mac80211: make uapsd_* keys per-vif
uapsd_queues and uapsd_max_sp_len are relevant only for managed
interfaces, and can be configured differently for each vif.

Move them from the local struct to sdata->u.mgd, and update
the debugfs functions accordingly.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-15 13:43:12 -04:00
Eliad Peller
ada577c12f mac80211: add NULL terminator to debugfs_netdev write buf
Some debugfs write functions call kstrto* functions, which
assume the string is null-terminated. Make it valid by changing
ieee80211_if_write() to use static buffer instead of allocating
one, and set the last char to NULL.

(The write functions try to parse some integer/mac address,
so 64 bytes buffer should be enough)

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-15 13:40:34 -04:00
Helmut Schaa
ba6fa29c6d mac80211: Don't sample max throughput rate in minstrel_ht
The current max throughput rate is known to be good as otherwise it
wouldn't be the max throughput rate. Since rate sampling can introduce
some overhead (by adding RTS for example or due to not aggregating the
frame) don't sample the max throughput rate.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-15 13:40:33 -04:00
Paul Stewart
3117bbdb78 mac80211: Don't let regulatory make us deaf
When regulatory information changes our HT behavior (e.g,
when we get a country code from the AP we have just associated
with), we should use this information to change the power with
which we transmit, and what channels we transmit.  Sometimes
the channel parameters we derive from regulatory information
contradicts the parameters we used in association.  For example,
we could have associated specifying HT40, but the regulatory
rules we apply may forbid HT40 operation.

In the situation above, we should reconfigure ourselves to
transmit in HT20 only, however it makes no sense for us to
disable receive in HT40, since if we associated with these
parameters, the AP has every reason to expect we can and
will receive packets this way.  The code in mac80211 does
not have the capability of sending the appropriate action
frames to signal a change in HT behaviour so the AP has
no clue we can no longer receive frames encoded this way.
In some broken AP implementations, this can leave us
effectively deaf if the AP never retries in lower HT rates.

This change breaks up the channel_type parameter in the
ieee80211_enable_ht function into a separate receive and
transmit part.  It honors the channel flags set by regulatory
in order to configure the rate control algorithm, but uses
the capability flags to configure the channel on the radio,
since these were used in association to set the AP's transmit
rate.

Signed-off-by: Paul Stewart <pstew@chromium.org>
Cc: Sam Leffler <sleffler@chromium.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Luis R Rodriguez <mcgrof@frijolero.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-13 14:55:53 -04:00
Johannes Berg
e9ac0745c7 mac80211: rename bss_conf timestamp to last_tsf
This value is not really very useful by itself,
yet some drivers (including iwlwifi until I can
figure out what it should do) use it. At least
rename it to "last_tsf" to indicate the meaning
and add a note that it may be really old.

I suspect the value may become useful combined
with the rx_status->mactime, but we don't (yet)
store that value and pass it to the driver.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-13 14:54:20 -04:00
Johannes Berg
7b8bcff2e0 cfg80211: clarify timestamp in cfg80211_inform_bss
This is intended to be the timestamp sent by the
peer in the beacon/probe response, not any form
of host timestamp. Clarify the documentation and
variable names.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-13 14:54:20 -04:00
Johannes Berg
a828691188 mac80211: linearize SKBs as needed for crypto
Not linearizing every SKB will help actually pass
non-linear SKBs all the way up when on an encrypted
connection. For now, linearize TKIP completely as
it is lower performance and I don't quite grok all
the details.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-13 14:54:17 -04:00
Johannes Berg
617bbde878 mac80211: move RX WEP weak IV counting
This is better done inside the WEP decrypt
function where it doesn't have to check all
the conditions any more since they've been
tested already.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-13 14:54:16 -04:00
Joe Perches
afd465030a net: ipv4: Standardize prefixes for message logging
Add #define pr_fmt(fmt) as appropriate.

Add "IPv4: ", "TCP: ", and "IPsec: " to appropriate files.
Standardize on "UDPLite: " for appropriate uses.
Some prefixes were previously "UDPLITE: " and "UDP-Lite: ".

Add KBUILD_MODNAME ": " to icmp and gre.
Remove embedded prefixes as appropriate.

Add missing "\n" to pr_info in gre.c.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-12 17:05:21 -07:00
Ingo Molnar
35239e23c6 Merge branch 'perf/urgent' into perf/core
Merge reason: We are going to queue up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:44:11 +01:00
Johannes Berg
5d6a1b069b mac80211: set basic rates earlier
The authentication and association handshake
already happens in the context of the new BSS,
and the basic rates are needed at least for
the ACK response frame to the authentication
or association response frames. Therefore the
basic rates should already be configured into
the driver when those frames are sent.

Change the logic to set up the basic rates in
the connection preparation that happens for
authentication and association (if needed).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:22:16 -04:00
Johannes Berg
a1cf775dea mac80211: refactor common auth/assoc setup code
As associating is possible without first authenticating
(for FT over DS) association also has to be able to
switch to the right channel, insert the station entry
etc. Factor out this common code into a new function
called ieee80211_prep_connection().

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:22:15 -04:00
Johannes Berg
0775f9f90c mac80211: remove spurious BSSID change flag
The BSSID has been set a lot earlier already and
didn't change again in ieee80211_set_associated().

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:22:14 -04:00
Johannes Berg
76f0303d61 mac80211: simplify wmm check during association
Instead of setting assoc_data->wmm_used solely
based on the BSS also take into account our own
capabilities and later check those.

Also rename "wmm_used" and "uapsd_used" to just
"wmm" and "uapsd".

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:22:13 -04:00
Johannes Berg
4e74bfdb30 mac80211: simplify HT checks
Always set/use IEEE80211_STA_DISABLE_11N instead
of duplicating the queue, WMM and HT checks in
all places.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:22:12 -04:00
Johannes Berg
de5036aae6 mac80211: move misplaced comment
Looks like some changes in this area moved
the code but not the comment that belongs
to the code, move it to the right place.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:22:11 -04:00
Helmut Schaa
e9219779f9 mac80211: Disable MCS > 7 in minstrel_ht when STA uses static SMPS
Disable multi stream rates (MCS > 7) when a STA is in static SMPS mode
since it has only one active rx chain. Hence, it doesn't even make
sense to sample multi stream rates.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:19:39 -04:00
Johannes Berg
3cc5240b5e mac80211: set channel back after disassociating
As we've discussed, we want to avoid channel changes
while associated. While the part when we actually
associate needs a bit more work, the bit that happens
on disassociating can be changed quite easily. Move
the channel type change later in the disassociate
process to set the channel only after the driver was
told that it's now disassociated.

As the driver could expect powersave to be enabled
only when associated, this thus results in splitting
the config call, but overall what happens makes more
sense this way.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:19:38 -04:00
Johannes Berg
177958e967 mac80211: remove tx_sync
When the station state callback was added, this
was no longer needed in theory. With the iwlwifi
changes to remove use of it landing, we can kill
the entire tx-sync framework again, RIP.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:19:38 -04:00
Helmut Schaa
aa45458060 mac80211: Limit TID buffering during BA session setup/teardown
While setting up or tearing down a BA session mac80211 is buffering
pending frames for the according TID. However, there's currently no
limit on how many frames are buffered possibly leading to an out-of-
memory situation. This can happen on systems with little memory when
the CPU is fully loaded since the BA session work is executed in
process context while frames can still come via softirq.

Apply a limitation to the TIDs pending queue to avoid consuming
too much memory in this situation.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:19:35 -04:00
Bala Shanmugam
4486ea987e cfg80211: Add background scan period attribute.
Receive background scan period as part of connect
command and pass the same to the driver.

Signed-off-by: Bala Shanmugam <bkamatch@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-12 14:19:34 -04:00
Trond Myklebust
0097143c12 SUNRPC: Don't use variable length automatic arrays in kernel code
Replace the variable length array in the RPCSEC_GSS crypto code with
a fixed length one. The size should be bounded by the variable
GSS_KRB5_MAX_BLOCKSIZE, so use that.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-12 13:37:16 -04:00
Joe Perches
058bd4d2a4 net: Convert printks to pr_<level>
Use a more current kernel messaging style.

Convert a printk block to print_hex_dump.
Coalesce formats, align arguments.
Use %s, __func__ instead of embedding function names.

Some messages that were prefixed with <foo>_close are
now prefixed with <foo>_fini.  Some ah4 and esp messages
are now not prefixed with "ip ".

The intent of this patch is to later add something like
  #define pr_fmt(fmt) "IPv4: " fmt.
to standardize the output messages.

Text size is trivially reduced. (x86-32 allyesconfig)

$ size net/ipv4/built-in.o*
   text	   data	    bss	    dec	    hex	filename
 887888	  31558	 249696	1169142	 11d6f6	net/ipv4/built-in.o.new
 887934	  31558	 249800	1169292	 11d78c	net/ipv4/built-in.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-11 23:42:51 -07:00
Maciej Żenczykowski
43db362d3a net: get rid of some pointless casts to sockaddr
The following 4 functions:
  move_addr_to_kernel
  move_addr_to_user
  verify_iovec
  verify_compat_iovec
are always effectively called with a sockaddr_storage.

Make this explicit by changing their signature.

This removes a large number of casts from sockaddr_storage to sockaddr.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-11 19:11:22 -07:00
Trond Myklebust
09acfea5d8 SUNRPC: Fix a few sparse warnings
net/sunrpc/svcsock.c:412:22: warning: incorrect type in assignment
(different address spaces)
 - svc_partial_recvfrom now takes a struct kvec, so the variable
   save_iovbase needs to be an ordinary (void *)

Make a bunch of variables in net/sunrpc/xprtsock.c static

Fix a couple of "warning: symbol 'foo' was not declared. Should it be
static?" reports.

Fix a couple of conflicting function declarations.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-11 19:30:02 -04:00
Li Wei
8b2aaedee4 ipv6: Fix Smatch warning.
With commit d6ddef9e641d(IPv6: Fix not join all-router mcast group
when forwarding set.) I check 'dev' after it's dereference that
leads to a Smatch complaint:

net/ipv6/addrconf.c:438 ipv6_add_dev()
	 warn: variable dereferenced before check 'dev' (see line 432)

net/ipv6/addrconf.c
   431		/* protected by rtnl_lock */
   432		rcu_assign_pointer(dev->ip6_ptr, ndev);
                                   ^^^^^^^^^^^^
Old dereference.

   433
   434		/* Join all-node multicast group */
   435		ipv6_dev_mc_inc(dev, &in6addr_linklocal_allnodes);
   436
   437		/* Join all-router multicast group if forwarding is set
*/
   438		if (ndev->cnf.forwarding && dev && (dev->flags &
IFF_MULTICAST))
                                            ^^^

Remove the check to avoid the complaint as 'dev' can't be NULL.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-11 15:57:11 -07:00
Eric Dumazet
dfd25ffffc tcp: fix syncookie regression
commit ea4fc0d619 (ipv4: Don't use rt->rt_{src,dst} in ip_queue_xmit())
added a serious regression on synflood handling.

Simon Kirby discovered a successful connection was delayed by 20 seconds
before being responsive.

In my tests, I discovered that xmit frames were lost, and needed ~4
retransmits and a socket dst rebuild before being really sent.

In case of syncookie initiated connection, we use a different path to
initialize the socket dst, and inet->cork.fl.u.ip4 is left cleared.

As ip_queue_xmit() now depends on inet flow being setup, fix this by
copying the temp flowi4 we use in cookie_v4_check().

Reported-by: Simon Kirby <sim@netnation.com>
Bisected-by: Simon Kirby <sim@netnation.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-11 15:52:12 -07:00
sjur.brandeland@stericsson.com
e3abcc2a85 caif: make zero a legal caif connetion id.
Connection ID configured through RTNL must allow zero as
connection id. If connection-id is not given when creating the
interface, configure a loopback interface using ifindex as
connection-id.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-11 15:38:16 -07:00
Dmitry Tarnyagin
374458b3fe caif: Fix for a race in socket transmit with flow control.
Kill faulty checks on flow-off leading to connection drop at race conditions.
caif_socket checks for flow-on before transmitting and goes to sleep or
return -EAGAIN upon flow stop. Remove faulty subsequent checks on flow-off
leading to connection drop. Also fix memory leaks on some of the errors paths.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-11 15:38:16 -07:00
David S. Miller
e8abbe0d02 Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge 2012-03-11 15:36:34 -07:00
Paul Gortmaker
51990e8254 device.h: cleanup users outside of linux/include (C files)
For files that are actively using linux/device.h, make sure
that they call it out.  This will allow us to clean up some
of the implicit uses of linux/device.h within include/*
without introducing build regressions.

Yes, this was created by "cheating" -- i.e. the headers were
cleaned up, and then the fallout was found and fixed, and then
the two commits were reordered.  This ensures we don't introduce
build regressions into the git history.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-03-11 14:27:37 -04:00
Heiko Carstens
fde15c3a3a [S390] irq: external interrupt code passing
The external interrupt handlers have a parameter called ext_int_code.
Besides the name this paramter does not only contain the ext_int_code
but in addition also the "cpu address" (POP) which caused the external
interrupt.
To make the code a bit more obvious pass a struct instead so the called
function can easily distinguish between external interrupt code and
cpu address. The cpu address field however is named "subcode" since
some external interrupt sources do not pass a cpu address but a
different parameter (or none at all).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-03-11 11:59:29 -04:00
Sven Eckelmann
40e0c4f51d batman-adv: Remove spaces after a cast
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2012-03-11 06:29:44 +08:00
Sven Eckelmann
96741ade15 batman-adv: Use {} braces consistent on the arms of a statement
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2012-03-11 06:29:44 +08:00
Sven Eckelmann
21a1236bc3 batman-adv: Don't begin block comments with only a /* line
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2012-03-11 06:29:44 +08:00
Sven Eckelmann
86ceb36056 batman-adv: Ignore 80-chars per line limits for strings
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2012-03-11 06:29:44 +08:00
David S. Miller
6a91395f20 ipv4: Make ip_rcv_options() return bool.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-09 14:34:50 -08:00
David S. Miller
ba57b4db26 ipv4: Make ip_call_ra_chain() return bool.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-09 14:34:50 -08:00
David S. Miller
b2d3298e09 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-03-09 14:34:20 -08:00
John W. Linville
74dd1521d0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2012-03-09 14:57:30 -05:00
Benjamin Poirier
0343c5543b sctp: Export sctp_do_peeloff
lookup sctp_association within sctp_do_peeloff() to enable its use outside of
the sctp code with minimal knowledge of the former.

Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-08 13:52:08 -08:00
Jiri Slaby
410235fd4d TTY: remove unneeded tty->index checks
Checking if tty->index is in bounds is not needed. The tty has the
index set in the initial open. This is done in get_tty_driver. And it
can be only in interval <0,driver->num).

So remove the tests which check exactly this interval. Some are
left untouched as they check against the current backing device count.
(Leaving apart that the check is racy in most of the cases.)

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-08 11:42:21 -08:00
Jiri Slaby
2f16669d32 TTY: remove re-assignments to tty_driver members
All num, magic and owner are set by alloc_tty_driver. No need to
re-set them on each allocation site.

pti driver sets something different to what it passes to
alloc_tty_driver. It is not a bug, since we don't use the lines
parameter in any way. Anyway this is fixed, and now we do the right
thing.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-08 11:37:58 -08:00
Steffen Klassert
ac3f48de09 route: Remove redirect_genid
As we invalidate the inetpeer tree along with the routing cache now,
we don't need a genid to reset the redirect handling when the routing
cache is flushed.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-08 00:30:32 -08:00
Steffen Klassert
5faa5df1fa inetpeer: Invalidate the inetpeer tree along with the routing cache
We initialize the routing metrics with the values cached on the
inetpeer in rt_init_metrics(). So if we have the metrics cached on the
inetpeer, we ignore the user configured fib_metrics.

To fix this issue, we replace the old tree with a fresh initialized
inet_peer_base. The old tree is removed later with a delayed work queue.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-08 00:30:24 -08:00
Paulius Zaleckas
5200959b83 bridge: fix state reporting when port is disabled
Now we have:
eth0: link *down*
br0: port 1(eth0) entered *forwarding* state

br_log_state(p) should be called *after* p->state is set
to BR_STATE_DISABLED.

Reported-by: Zilvinas Valinskas <zilvinas@wilibox.com>
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-08 00:25:25 -08:00
Paulius Zaleckas
d9e179ecec bridge: br_log_state() s/entering/entered/
When br_log_state() is reporting state it should say "entered"
istead of "entering" since state at this point is already
changed.

Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-08 00:25:25 -08:00
David S. Miller
0111ad823e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-next 2012-03-07 22:53:48 -08:00
Ursula Braun
82492a355f af_iucv: add shutdown for HS transport
AF_IUCV sockets offer a shutdown function. This patch makes sure
shutdown works for HS transport as well.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-07 22:52:24 -08:00
Ursula Braun
9fbd87d413 af_iucv: handle netdev events
In case of transport through HiperSockets the underlying network
interface may switch to DOWN state or the underlying network device
may recover. In both cases the socket must change to IUCV_DISCONN
state. If the interface goes down, af_iucv has a chance to notify
its connection peer in addition.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-07 22:52:24 -08:00
David S. Miller
9259c483a3 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch 2012-03-07 22:49:01 -08:00
David S. Miller
c75a312d8b Merge branch 'master' of git://1984.lsi.us.es/net-next 2012-03-07 22:27:56 -08:00
Ido Yariv
fdde0a26a2 Bluetooth: Set security level on incoming pairing request
If a master would like to raise the security level, it will send a
pairing request. While the pending security level is set on an incoming
security request (from a slave), it is not set on a pairing request. As
a result, the security level would not be raised on the slave in such
case.

Fix this by setting the pending security when receiving pairing
requests according to the requested authorization.

Signed-off-by: Ido Yariv <ido@wizery.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2012-03-08 02:26:04 -03:00
Ido Yariv
b3ff53ff00 Bluetooth: Fix access to the STK generation methods matrix
The major index of the table is actually the remote I/O capabilities, not
the local ones. As a result, devices with different I/O capabilities
could have used wrong or even unsupported generation methods.

Signed-off-by: Ido Yariv <ido@wizery.com>
CC: Brian Gix <bgix@codeaurora.org>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2012-03-08 02:25:00 -03:00
Luiz Augusto von Dentz
e57d758ae8 Bluetooth: Fix using uninitialized variable
+ src/net/bluetooth/rfcomm/tty.c: warning: 'p' is used uninitialized in this
       function:  => 218
+ src/net/bluetooth/rfcomm/tty.c: warning: 'p' may be used uninitialized in
       this function:  => 218

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2012-03-08 02:20:22 -03:00
Gustavo F. Padovan
04124681f1 Bluetooth: fix conding style issues all over the tree
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2012-03-08 02:02:26 -03:00
Jesse Gross
81e5d41d7e openvswitch: Fix checksum update for actions on UDP packets.
When modifying IP addresses or ports on a UDP packet we don't
correctly follow the rules for unchecksummed packets.  This meant
that packets without a checksum can be given a incorrect new checksum
and packets with a checksum can become marked as being unchecksummed.
This fixes it to handle those requirements.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-03-07 14:36:57 -08:00
Eric Dumazet
b4fb05ea40 tcp: md5: correct a RCU lockdep splat
commit a8afca0329 (tcp: md5: protects md5sig_info with RCU) added a
lockdep splat in tcp_md5_do_lookup() in case a timer fires a tcp
retransmit.

At this point, socket lock is owned by the sofirq handler, not the user,
so we should adjust a bit the lockdep condition, as we dont hold
rcu_read_lock().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-07 15:13:42 -05:00
Thomas Pedersen
f06c7885c3 mac80211: fix smatch lock errors in mesh
smatch was complaining:

CHECK   net/mac80211/mesh_pathtbl.c
net/mac80211/mesh_pathtbl.c:562 mesh_path_add() error: double lock
'bottom_half:'
net/mac80211/mesh_pathtbl.c:580 mesh_path_add() error: double unlock
'bottom_half:'
net/mac80211/mesh_pathtbl.c:589 mesh_path_add() error: double unlock
'bottom_half:'
net/mac80211/mesh_pathtbl.c:691 mpp_path_add() error: double lock
'bottom_half:'
net/mac80211/mesh_pathtbl.c:707 mpp_path_add() error: double unlock
'bottom_half:'
net/mac80211/mesh_pathtbl.c:716 mpp_path_add() error: double unlock
'bottom_half:'
net/mac80211/mesh_pathtbl.c:814 mesh_path_flush_by_nexthop() error:
double lock 'bottom_half:'
net/mac80211/mesh_pathtbl.c:819 mesh_path_flush_by_nexthop() error:
double unlock 'bottom_half:'
net/mac80211/mesh_pathtbl.c:887 mesh_path_del() error: double lock
'bottom_half:'
net/mac80211/mesh_pathtbl.c:901 mesh_path_del() error: double unlock
'bottom_half:'

So don't lock / unlock with _bh() while bottom halves are already
disabled.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-07 13:56:36 -05:00
Ashok Nagarajan
3d4f969972 mac80211: Fix potential null pointer dereferencing
The patch "{nl,cfg,mac}80211: Implement RSSI threshold for mesh peering"
has a potential null pointer dereferencing problem. Thanks to Dan Carpenter
for pointing out. This patch will fix the issue.

Signed-off-by: Ashok Nagarajan <ashok@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-07 13:51:47 -05:00
Paul Stewart
fcff4f108d mac80211: Filter duplicate IE ids
mac80211 is lenient with respect to reception of corrupted beacons.
Even if the frame is corrupted as a whole, the available IE elements
are still passed back and accepted, sometimes replacing legitimate
data.  It is unknown to what extent this "feature" is made use of,
but it is clear that in some cases, this is detrimental.  One such
case is reported in http://crosbug.com/26832 where an AP corrupts
its beacons but not its probe responses.

One approach would be to completely reject frames with invaid data
(for example, if the last tag extends beyond the end of the enclosing
PDU).  The enclosed approach is much more conservative: we simply
prevent later IEs from overwriting the state from previous ones.
This approach hopes that there might be some salient data in the
IE stream before the corruption, and seeks to at least prevent that
data from being overwritten.  This approach will fix the case above.

Further, we flag element structures that contain data we think might
be corrupted, so that as we fill the mac80211 BSS structure, we try
not to replace data from an un-corrupted probe response with that
of a corrupted beacon, for example.

Short of any statistics gathering in the various forms of AP breakage,
it's not possible to ascertain the side effects of more stringent
discarding of data.

Signed-off-by: Paul Stewart <pstew@chromium.org>
Cc: Sam Leffler <sleffler@chromium.org>
Cc: Eliad Peller <eliad@wizery.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-07 13:51:37 -05:00
Pablo Neira Ayuso
24de58f465 netfilter: xt_CT: allow to attach timeout policy + glue code
This patch allows you to attach the timeout policy via the
CT target, it adds a new revision of the target to ensure
backward compatibility. Moreover, it also contains the glue
code to stick the timeout object defined via nfnetlink_cttimeout
to the given flow.

Example usage (it requires installing the nfct tool and
libnetfilter_cttimeout):

1) create the timeout policy:

 nfct timeout add tcp-policy0 inet tcp \
	established 1000 close 10 time_wait 10 last_ack 10

2) attach the timeout policy to the packet:

 iptables -I PREROUTING -t raw -p tcp -j CT --timeout tcp-policy0

You have to install the following user-space software:

a) libnetfilter_cttimeout:
   git://git.netfilter.org/libnetfilter_cttimeout

b) nfct:
   git://git.netfilter.org/nfct

You also have to get iptables with -j CT --timeout support.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:41:28 +01:00
Pablo Neira Ayuso
dd70507241 netfilter: nf_ct_ext: add timeout extension
This patch adds the timeout extension, which allows you to attach
specific timeout policies to flows.

This extension is only used by the template conntrack.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:41:25 +01:00
Pablo Neira Ayuso
5097846230 netfilter: add cttimeout infrastructure for fine timeout tuning
This patch adds the infrastructure to add fine timeout tuning
over nfnetlink. Now you can use the NFNL_SUBSYS_CTNETLINK_TIMEOUT
subsystem to create/delete/dump timeout objects that contain some
specific timeout policy for one flow.

The follow up patches will allow you attach timeout policy object
to conntrack via the CT target and the conntrack extension
infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:41:22 +01:00
Pablo Neira Ayuso
2c8503f55f netfilter: nf_conntrack: pass timeout array to l4->new and l4->packet
This patch defines a new interface for l4 protocol trackers:

unsigned int *(*get_timeouts)(struct net *net);

that is used to return the array of unsigned int that contains
the timeouts that will be applied for this flow. This is passed
to the l4proto->new(...) and l4proto->packet(...) functions to
specify the timeout policy.

This interface allows per-net global timeout configuration
(although only DCCP supports this by now) and it will allow
custom custom timeout configuration by means of follow-up
patches.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:41:19 +01:00
Pablo Neira Ayuso
b888341c7f netfilter: nf_ct_gre: add unsigned int array to define timeouts
This patch adds an array to define the default GRE timeouts.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:41:16 +01:00
Pablo Neira Ayuso
33ee44643f netfilter: nf_ct_tcp: move retransmission and unacknowledged timeout to array
This patch moves the retransmission and unacknowledged timeouts
to the tcp_timeouts array. This change is required by follow-up
patches.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:41:15 +01:00
Pablo Neira Ayuso
5a41db94c6 netfilter: nf_ct_udp[lite]: convert UDP[lite] timeouts to array
Use one array to store the UDP timeouts instead of two variables.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:41:13 +01:00
Hans Schillstrom
3b988ece9b netfilter: ctnetlink: fix lockep splats
net/netfilter/nf_conntrack_proto.c:70 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
3 locks held by conntrack/3235:
nfnl_lock+0x17/0x20
netlink_dump+0x32/0x240
ctnetlink_dump_table+0x3e/0x170 [nf_conntrack_netlink]

stack backtrace:
Pid: 3235, comm: conntrack Tainted: G W  3.2.0+ #511
Call Trace:
[<ffffffff8108ce45>] lockdep_rcu_suspicious+0xe5/0x100
[<ffffffffa00ec6e1>] __nf_ct_l4proto_find+0x81/0xb0 [nf_conntrack]
[<ffffffffa0115675>] ctnetlink_fill_info+0x215/0x5f0 [nf_conntrack_netlink]
[<ffffffffa0115dc1>] ctnetlink_dump_table+0xd1/0x170 [nf_conntrack_netlink]
[<ffffffff815fbdbf>] netlink_dump+0x7f/0x240
[<ffffffff81090f9d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff815fd34f>] netlink_dump_start+0xdf/0x190
[<ffffffffa0111490>] ? ctnetlink_change_nat_seq_adj+0x160/0x160 [nf_conntrack_netlink]
[<ffffffffa0115cf0>] ? ctnetlink_get_conntrack+0x2a0/0x2a0 [nf_conntrack_netlink]
[<ffffffffa0115ad9>] ctnetlink_get_conntrack+0x89/0x2a0 [nf_conntrack_netlink]
[<ffffffff81603a47>] nfnetlink_rcv_msg+0x467/0x5f0
[<ffffffff81603a7c>] ? nfnetlink_rcv_msg+0x49c/0x5f0
[<ffffffff81603922>] ? nfnetlink_rcv_msg+0x342/0x5f0
[<ffffffff81071b21>] ? get_parent_ip+0x11/0x50
[<ffffffff816035e0>] ? nfnetlink_subsys_register+0x60/0x60
[<ffffffff815fed49>] netlink_rcv_skb+0xa9/0xd0
[<ffffffff81603475>] nfnetlink_rcv+0x15/0x20
[<ffffffff815fe70e>] netlink_unicast+0x1ae/0x1f0
[<ffffffff815fea16>] netlink_sendmsg+0x2c6/0x320
[<ffffffff815b2a87>] sock_sendmsg+0x117/0x130
[<ffffffff81125093>] ? might_fault+0x53/0xb0
[<ffffffff811250dc>] ? might_fault+0x9c/0xb0
[<ffffffff81125093>] ? might_fault+0x53/0xb0
[<ffffffff815b5991>] ? move_addr_to_kernel+0x71/0x80
[<ffffffff815b644e>] sys_sendto+0xfe/0x130
[<ffffffff815b5c94>] ? sys_bind+0xb4/0xd0
[<ffffffff817a8a0e>] ? retint_swapgs+0xe/0x13
[<ffffffff817afcd2>] system_call_fastpath+0x16/0x1b

Reported-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
2012-03-07 17:41:10 +01:00
Richard Weinberger
417e02bf42 netfilter: xt_LOG: fix bogus extra layer-4 logging information
In 16059b5 netfilter: merge ipt_LOG and ip6_LOG into xt_LOG, we have
merged ipt_LOG and ip6t_LOG.

However:

IN=wlan0 OUT= MAC=xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx
SRC=213.150.61.61 DST=192.168.1.133 LEN=40 TOS=0x00 PREC=0x00 TTL=117
ID=10539 DF PROTO=TCP SPT=80 DPT=49013 WINDOW=0 RES=0x00 ACK RST
URGP=0 PROTO=UDPLITE SPT=80 DPT=49013 LEN=45843 PROTO=ICMP TYPE=0
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Several missing break in the code led to including bogus layer-4
information. This patch fixes this problem.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:40:59 +01:00
Tony Zelenoff
58020f7761 netfilter: nf_ct_ecache: refactor nf_ct_deliver_cached_events
* identation lowered
* some CPU cycles saved at delayed item variable initialization

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:40:53 +01:00
Tony Zelenoff
93326ae312 netfilter: nf_ct_ecache: trailing whitespace removed
Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:40:51 +01:00
Richard Weinberger
6939c33a75 netfilter: merge ipt_LOG and ip6_LOG into xt_LOG
ipt_LOG and ip6_LOG have a lot of common code, merge them
to reduce duplicate code.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:40:49 +01:00
Pablo Neira Ayuso
544d5c7d9f netfilter: ctnetlink: allow to set expectfn for expectations
This patch allows you to set expectfn which is specifically used
by the NAT side of most of the existing conntrack helpers.

I have added a symbol map that uses a string as key to look up for
the function that is attached to the expectation object. This is
the best solution I came out with to solve this issue.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:40:46 +01:00
Pablo Neira Ayuso
076a0ca026 netfilter: ctnetlink: add NAT support for expectations
This patch adds the missing bits to create expectations that
are created in NAT setups.
2012-03-07 17:40:44 +01:00
Pablo Neira Ayuso
b8c5e52c13 netfilter: ctnetlink: allow to set expectation class
This patch allows you to set the expectation class.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:40:42 +01:00
Pablo Neira Ayuso
660fdb2a0f netfilter: ctnetlink: allow to set helper for new expectations
This patch allow you to set the helper for newly created
expectations based of the CTA_EXPECT_HELP_NAME attribute.
Before this, the helper set was NULL.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:40:40 +01:00
Jozsef Kadlecsik
2a7cef2a4b netfilter: ipset: Exceptions support added to hash:*net* types
The "nomatch" keyword and option is added to the hash:*net* types,
by which one can add exception entries to sets. Example:

        ipset create test hash:net
        ipset add test 192.168.0/24
        ipset add test 192.168.0/30 nomatch

In this case the IP addresses from 192.168.0/24 except 192.168.0/30
match the elements of the set.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07 17:40:35 +01:00