kernel-fxtec-pro1x/include
Josh Hunt 32b293a53d IPv6: Avoid taking write lock for /proc/net/ipv6_route
During some debugging I needed to look into how /proc/net/ipv6_route
operated, and in my digging I found it calls fib6_clean_all(), which uses
"write_lock_bh(&table->tb6_lock)" before doing the walk of the table. I
found this on 2.6.32, but reading the code I believe the same basic idea
exists currently. The rtnetlink code, by contrast, only calls
"read_lock_bh(&table->tb6_lock);" via fib6_dump_table(). While I realize
reading from proc isn't the recommended way of fetching the ipv6 route
table, taking a write lock seems unnecessary and would probably cause
network performance issues.

To verify this I loaded up the ipv6 route table and then ran iperf in 3
cases:
  * doing nothing
  * reading ipv6 route table via proc
    (while :; do cat /proc/net/ipv6_route > /dev/null; done)
  * reading ipv6 route table via rtnetlink
    (while :; do ip -6 route show table all > /dev/null; done)

* Load the ipv6 route table up with:
  * for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done

* iperf commands:
  * client: iperf -i 1 -V -c <ipv6 addr>
  * server: iperf -V -s

* iperf results - 3 runs each (in Mbits/sec)
  * nothing: client: 927,927,927 server: 927,927,927
  * proc: client: 179,97,96,113 server: 142,112,133
  * iproute: client: 928,927,928 server: 927,927,927

lock_stat shows taking the write lock is causing the slowdown. Using this
info I decided to write a version of fib6_clean_all() which replaces
write_lock_bh(&table->tb6_lock) with read_lock_bh(&table->tb6_lock). With
this new function I see the same results as with my rtnetlink iperf test.

Signed-off-by: Josh Hunt <joshhunt00@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-30 17:07:33 -05:00
acpi Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux 2011-11-07 10:13:52 -08:00
asm-generic Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2011-12-30 13:04:14 -05:00
crypto crypto: Add a report function pointer to crypto_type 2011-10-21 14:24:03 +02:00
drm drm/radeon/kms: add some new pci ids 2011-12-14 12:29:03 +00:00
keys
linux unix_diag: Fixup RQLEN extension report 2011-12-30 16:46:02 -05:00
math-emu
media Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2011-12-20 10:49:39 -08:00
misc [media] altera-stapl: it is time to move out from staging 2011-09-23 15:00:57 -03:00
mtd
net IPv6: Avoid taking write lock for /proc/net/ipv6_route 2011-12-30 17:07:33 -05:00
pcmcia
rdma Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband 2011-11-01 10:51:38 -07:00
rxrpc
scsi [SCSI] fcoe: fix fcoe in a DCB environment by adding DCB notifiers to set skb priority 2011-12-15 11:02:07 +04:00
sound Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
target target: remove the unused se_dev_list 2011-12-06 06:00:57 +00:00
trace writeback: show writeback reason with __print_symbolic 2011-12-18 14:20:17 +08:00
video ARM: OMAP: HWMOD: Unify DSS resets for OMAPs 2011-11-08 03:16:13 -07:00
xen Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel" 2011-12-19 09:30:35 -05:00
Kbuild