Commit graph

65379 commits

Author SHA1 Message Date
Roland Dreier
2be8e3ee8e IB/umad: Add P_Key index support
Add support for setting the P_Key index of sent MADs and getting the
P_Key index of received MADs.  This requires a change to the layout of
the ABI structure struct ib_user_mad_hdr, so to avoid breaking
compatibility, we default to the old (unchanged) ABI and add a new
ioctl IB_USER_MAD_ENABLE_PKEY that allows applications that are aware
of the new ABI to opt into using it.

We plan on switching to the new ABI by default in a year or so, and
this patch adds a warning that is printed when an application uses the
old ABI, to push people towards converting to the new ABI.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Hal Rosenstock <hal@xsigo.com>
2007-10-09 19:59:15 -07:00
Joachim Fenkes
c01759cee9 IB/ehca: Return srq_attr->max_sge in ehca_query_srq()
Totally forgot this.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:15 -07:00
Hoang-Nam Nguyen
a660722375 IB/ehca: Adjust 64-bit alignment of create QP response for userspace
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:14 -07:00
Hoang-Nam Nguyen
03f72a51cb IB/ehca: Fix mem leak of firmware ctrlblock in ehca_create_srq()
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:14 -07:00
Jack Morgenstein
cd9281d873 IB/mlx4: Display misc device information under /sys/class/infiniband/
Display the following device information under /sys/class/infiniband/mlx4_X:
board_id, fw_ver, hw_rev, hca_type.

This patch makes this information available to userspace utilities
such as ibstat and ibv_devinfo.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:14 -07:00
Ralph Campbell
57cb61d587 IB/core: Fix handling of multicast response failures
I was looking at the code for multicast.c and noticed that
ib_sa_join_multicast() calls queue_join() which puts the
request at the front of the group->pending_list.  If this
is a second request, it seems like it would interfere with
process_join_error() since group->last_join won't point
to the member at the head of the pending_list. The sequence
would thus be:

1. ib_sa_join_multicast()
   puts member1 on head of pending_list and starts work thread
2. mcast_work_handler()
   calls send_join() which sets group->last_join to member1
3. ib_sa_join_multicast()
   puts member2 on head of pending_list
4. join operation for member1 receives a failure response from the SA.
5. join_handler() is called with error status
6. process_join_error() fails to process member1 since
   it doesn't match the first entry in the group->pending_list.

The impact is that the failed join request is tossed.  The second
request is processed, and after it completes, the original request ends
up being retried.

This change also results in join requests being processed in FIFO
order.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:14 -07:00
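The ordering problem described in the IB/core multicast commit above can be illustrated with a small userspace sketch. All names here (`sketch_group`, `queue_join_head`, `queue_join_tail`, and so on) are hypothetical stand-ins, not the kernel's data structures; the sketch only assumes that the error handler expects the in-flight request to sit at the head of the pending list.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PENDING 8

/* Hypothetical, simplified stand-in for the multicast group state. */
struct sketch_group {
    int pending[SKETCH_MAX_PENDING]; /* member ids; index 0 is the head */
    size_t count;
    int last_join;                   /* member whose join is in flight, -1 if none */
};

void group_init(struct sketch_group *g)
{
    g->count = 0;
    g->last_join = -1;
}

/* Behavior described as buggy: new requests go to the head of the list. */
void queue_join_head(struct sketch_group *g, int member)
{
    assert(g->count < SKETCH_MAX_PENDING);
    for (size_t i = g->count; i > 0; i--)
        g->pending[i] = g->pending[i - 1];
    g->pending[0] = member;
    g->count++;
}

/* FIFO behavior: new requests go to the tail of the list. */
void queue_join_tail(struct sketch_group *g, int member)
{
    assert(g->count < SKETCH_MAX_PENDING);
    g->pending[g->count++] = member;
}

/* Models send_join(): the head of the list becomes the in-flight request. */
void send_join(struct sketch_group *g)
{
    assert(g->count > 0);
    g->last_join = g->pending[0];
}

/* Returns 1 if an error handler could match the failed join to the head. */
int join_error_matches_head(const struct sketch_group *g)
{
    return g->count > 0 && g->pending[0] == g->last_join;
}
```

Replaying steps 1 to 3 with `queue_join_head()` leaves member2 at the head while `last_join` still names member1, so the match fails; with `queue_join_tail()` the head and `last_join` agree.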
Satyam Sharma
9faa559c01 IB/ehca: Misc cpuinit section annotations and #ifdef cleanups
* Replace {un}register_cpu_notifier with {un}register_hotcpu_notifier
  thereby losing a couple of #ifdef HOTPLUG_CPU pairs.
* Move comp_pool_callback_nb declaration to below that of callback
  function so that initialization of .notifier_call and .priority can
  occur at build time itself and not runtime.
* Mark the notifier_block (and callback function, and another static
  function used by it) as __cpuinit{data} for the sake of consistency
  and remove enclosing #ifdef. (This may increase size for modular
  build of this module, however, because these are no longer dropped
  unconditionally now.)

Signed-off-by: Satyam Sharma <satyam@infradead.org>
Acked-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:14 -07:00
Roland Dreier
ea98054fef mlx4_core: Change capability decoding: SRC->XRC
The SRC ("scalable RC") transport has been renamed to XRC ("extended 
RC"), to avoid having an abbreviation that is so easily confused with an 
abbreviation for "source."  Update the HCA capability decoding output to 
use the new name.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:13 -07:00
Roland Dreier
ec2a1344ad IB/iser: Remove unnecessary includes
<asm/scatterlist.h> is not needed because everywhere it appears,
<linux/scatterlist.h> also appears.  <asm/io.h> is not needed because
nothing seems to be using device IO anyway.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:13 -07:00
Steve Wise
935ef2d7a2 RDMA/cma: Use neigh_event_send() to start neighbour discovery
Calling arp_send() to initiate neighbour discovery (ND) doesn't do the
full ND protocol.  Namely, it doesn't handle retransmitting the ARP
request if it is dropped. The function neigh_event_send() does all
this.  Without doing full ND, RDMA address resolution fails in the
presence of dropped ARP broadcast packets.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:13 -07:00
Joachim Fenkes
3a31c41901 IB/ehca: Only use MR large pages for hugetlb regions
...because, on virtualized hardware like System p, we can't be sure
that the physical pages behind them are contiguous otherwise.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:13 -07:00
Joachim Fenkes
c8d8beea03 IB/umem: Add hugetlb flag to struct ib_umem
During ib_umem_get(), determine whether all pages from the memory
region are hugetlb pages and report this in the "hugetlb" member.
Low-level drivers can use this information if they need it.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:13 -07:00
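A minimal sketch of the "all pages are hugetlb" check from the IB/umem commit above. The function name and the per-page boolean array are assumptions made for illustration; the real ib_umem_get() inspects the memory region itself rather than a precomputed array.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: the flag starts out true and is cleared as soon as
 * one page in the region is not a hugetlb page.
 */
bool region_is_all_hugetlb(const bool *page_is_hugetlb, size_t npages)
{
    bool hugetlb = true;

    for (size_t i = 0; i < npages; i++) {
        if (!page_is_hugetlb[i]) {
            hugetlb = false;
            break;
        }
    }
    return hugetlb;
}
```

A low-level driver could then consult the resulting flag before choosing a large page size for a memory region. Note that this sketch reports true for an empty region, which is a choice of the sketch and not something the commit message specifies.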
Sean Hefty
247e020ee5 IB/srp: Add QoS support through service ID
Provide the target service ID when performing a path record query to
support optional QoS capability.  QoS requires support from the SA.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:12 -07:00
Sean Hefty
7ce86409ad RDMA/ucma: Allow user space to set service type
Export the ability to set the type of service to user space.  Model
the interface after setsockopt.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:12 -07:00
Sean Hefty
a81c994d5e RDMA/cma: Add ability to specify type of service
Provide support to specify a type of service for a communication
identifier.  A new function call is used when dealing with IPv4
addresses.  For IPv6 addresses, the ToS is specified through the
traffic class field in the sockaddr_in6 structure.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>

[ The comments Eitan Zahavi and I made on the v1 post at
  <http://lists.openfabrics.org/pipermail/general/2007-August/039247.html>
  were fully addressed. ]
 
Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com> 
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:12 -07:00
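For the IPv6 case in the RDMA/cma commit above, the following sketch shows one way a traffic class could be read out of a flow-information word. It assumes the word follows the IPv6 header layout (4 bits of version, 8 bits of traffic class, 20 bits of flow label) and has already been converted to host byte order; the function name is hypothetical and the commit message itself does not spell out the bit positions.

```c
#include <stdint.h>

/*
 * Hypothetical helper: extract the 8-bit traffic class from a flow-info
 * word given in host byte order. Bits 20..27 hold the traffic class under
 * the assumed IPv6 header layout.
 */
uint8_t traffic_class_from_flowinfo(uint32_t flowinfo_host_order)
{
    return (uint8_t)((flowinfo_host_order >> 20) & 0xffu);
}
```

For example, a host-order value of 0x0B812345 carries traffic class 0xB8 under this layout.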
Sean Hefty
733d65fe33 IB/sa: Add new QoS fields to path record
The QoS annex defines new fields for path records.  Add them to the
ib_sa for consumers that want to use them.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:12 -07:00
Sean Hefty
81668838c4 IPoIB: Specify Traffic Class with path record queries for QoS support
To support QoS within and between subnets, modify IPoIB to request
specific Traffic Class values with path record queries, using
the value associated with the IPoIB broadcast group.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>

[ See some comments I made on this at v1 and v2 of the posts
  <http://lists.openfabrics.org/pipermail/general/2007-August/039275.html>
  <http://lists.openfabrics.org/pipermail/general/2007-September/040312.html> ]

Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:11 -07:00
Hoang-Nam Nguyen
08c283ac26 IB/ehca: Fix large page HW cap defines
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:11 -07:00
Joachim Fenkes
39089e7774 IB/ehca: Bump version number and change its format
Nobody needed the SVNEHCA_ prefix anyway.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:11 -07:00
Joachim Fenkes
5110e4de49 IB/ehca: Replace get_paca()->paca_index by the more portable raw_smp_processor_id()
We can use raw_smp_processor_id() here because the processor ID is
only used for debug output and therefore need not be preemption-safe.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:11 -07:00
Joachim Fenkes
0b5de96858 IB/ehca: Serialize MR alloc and MR free hvCalls
Some firmware levels exhibit a race condition between H_ALLOC_RESOURCE(MR)
and H_FREE_RESOURCE(MR).  Work around this problem by locking these hvCalls
against each other.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:11 -07:00
Joachim Fenkes
e90d0b3dae IB/ehca: Path migration support
Fix some modify_qp() issues related to path migration.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:10 -07:00
Joachim Fenkes
b708fba3c2 IB/ehca: Add check for max #SGE to create_qp()
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:10 -07:00
Joachim Fenkes
86dce445e0 IB/ehca: ehca_gen_warn() should always print
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:10 -07:00
Joachim Fenkes
e37221928b IB/ehca: Print return codes as signed decimal integers
...because -12 is easier to read than FFFFFFF4.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:10 -07:00
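The readability point in the commit above can be seen by formatting the same 32-bit return code both ways. This is a userspace sketch with a hypothetical helper name, not the ehca driver's tracing code.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write a return code as unsigned hex and as signed
 * decimal into caller-provided buffers.
 */
void format_rc(int32_t rc, char *hex, size_t hex_len, char *dec, size_t dec_len)
{
    snprintf(hex, hex_len, "%08" PRIX32, (uint32_t)rc);
    snprintf(dec, dec_len, "%" PRId32, rc);
}
```

Calling `format_rc(-12, ...)` yields "FFFFFFF4" in the hex buffer and "-12" in the decimal buffer, matching the two forms quoted in the commit message.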
Joachim Fenkes
2863ad4bdd IB/ehca: Refactor hvcall tracing
Change hvcall trace output towards better readability: reg numbers
instead of argument numbers, return code as signed decimal instead of
unsigned hex.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:10 -07:00
Hoang-Nam Nguyen
e390d3b52f IB/ehca: Use remap_4k_pfn() to map firmware contexts to user space
Use Paul's new remap_4k_pfn() function to map our 4K firmware contexts
into user space on 64K-page machines without exposing neighboring
firmware contexts. Return the context's offset within a 64K page to
user space so it can determine the proper virtual address.

For details about remap_4k_pfn(), see commit 721151d0 or
http://patchwork.ozlabs.org/linuxppc/patch?id=10281

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:09 -07:00
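The address arithmetic implied by the remap_4k_pfn() commit above might look like the following sketch. It assumes 64K system pages and 4K firmware contexts as stated in the message; the helper names and the idea of passing the mapping base explicitly are illustrative assumptions, not the driver's actual interface.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE_64K 0x10000u
#define SKETCH_CTX_SIZE_4K   0x1000u

/* Offset of a 4K firmware context within its enclosing 64K page. */
uint64_t ctx_offset_in_page(uint64_t ctx_addr)
{
    return ctx_addr & (SKETCH_PAGE_SIZE_64K - 1);
}

/*
 * User space side: combine the base of the mapping with the offset that
 * was handed back to find the context's virtual address.
 */
uint64_t ctx_user_address(uint64_t mapped_base, uint64_t offset)
{
    assert(offset < SKETCH_PAGE_SIZE_64K);
    assert(offset % SKETCH_CTX_SIZE_4K == 0);
    return mapped_base + offset;
}
```

For a context at address 0x12343000, the offset within its 64K page is 0x3000, which user space adds to the base of its mapping.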
Stefan Roscher
5281a4b8a0 IB/ehca: Support more than 4k QPs for userspace and kernelspace
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:08 -07:00
Stefan Roscher
441633b968 IB/ehca: Small QP userspace support
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:07 -07:00
Peter Oruba
a855b1a742 IB/mthca: Use PCI-X/PCI-Express read control interfaces
These driver changes incorporate the proposed PCI-X / PCI-Express read
byte count interface.  Reading and setting those values doesn't take
place "manually", instead wrapping functions are called to allow
quirks for some PCI bridges.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Based on work by Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:07 -07:00
Ali Ayoub
3c10c7c929 IB/sa: Error handling thinko fix
ib_create_send_mad() returns an error code pointer on error, not NULL.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:07 -07:00
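The class of bug fixed in the IB/sa commit above is easy to reproduce outside the kernel. The sketch below re-creates the error-pointer convention with userspace stand-ins (`err_ptr`, `is_err`, `ptr_err`, modeled on the kernel's ERR_PTR, IS_ERR, and PTR_ERR) and a hypothetical allocator, `create_send_buf`, in place of ib_create_send_mad().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's error-pointer helpers. */
void *err_ptr(long error) { return (void *)(intptr_t)error; }
long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static int sketch_buf;

/* Hypothetical allocator that reports failure as an error pointer. */
void *create_send_buf(int fail)
{
    return fail ? err_ptr(-ENOMEM) : (void *)&sketch_buf;
}

/* Correct caller: a NULL check here would treat the failure as success. */
int setup_send(int fail)
{
    void *buf = create_send_buf(fail);

    if (is_err(buf))
        return (int)ptr_err(buf);
    return 0;
}
```

The failing call returns a non-NULL pointer, which is why testing against NULL silently misses the error.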
Anton Blanchard
339e2640a9 IB/ehca: Export module parameters in sysfs
At the moment the ehca module parameters are not exported in sysfs.
Export them with 0444 permissions.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:06 -07:00
Anton Blanchard
1f79448302 IB/ehca: Make output clearer by removing some debug messages
ehca spits out a lot of debugging information. I had to look closely to
see the "Port 1 is not active" message within all the debug:

eHCA Infiniband Device Driver (Rel.: SVNEHCA_0022)
eHCA scaling code enabled
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_define_sqp Port 1 is not active.
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_create_qp ehca_define_sqp() failed rc=ffffffffffffffff
ib_mad: Couldn't create ib_mad QP1
ib_mad: Couldn't open ehca0 port 1
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_alloc_fmr unsupported fmr_attr->page_shift=9
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_alloc_fmr rc=ffffffffffffffea pd=c000000b4b5b2420 mr_access_flags=7 fmr_attr=c0000005afd37394
fmr_create failed for FMR 0

Remove a few debug statements so that things are clearer:

eHCA Infiniband Device Driver (Rel.: SVNEHCA_0022)
eHCA scaling code enabled
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_define_sqp Port 1 is not active.
ib_mad: Couldn't create ib_mad QP1
ib_mad: Couldn't open ehca0 port 1
ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_alloc_fmr unsupported fmr_attr->page_shift=9
fmr_create failed for FMR 0

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:06 -07:00
Roland Dreier
d7dc3ccbe4 IB/mlx4: Fix up SRQ limit_watermark endianness
mlx4_srq_query() returns a big-endian 16-bit value through an int *,
which screws up sparse checking.  Fix this so that a CPU-endian value
is returned.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:06 -07:00
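As a rough illustration of the endianness fix above, the sketch below converts a 16-bit big-endian value to a CPU-endian integer before returning it through an `int *`. The function names and the byte-array input are assumptions for this userspace sketch; in the kernel the conversion would typically be done with be16_to_cpu().

```c
#include <stdint.h>

/* Decode a 16-bit big-endian value from its bytes; host-independent. */
uint16_t be16_decode(const uint8_t bytes[2])
{
    return (uint16_t)(((unsigned int)bytes[0] << 8) | bytes[1]);
}

/*
 * Hypothetical query function: the caller receives a CPU-endian value,
 * never the raw big-endian representation.
 */
void query_limit_watermark(const uint8_t wire[2], int *limit_watermark)
{
    *limit_watermark = be16_decode(wire);
}
```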
Eli Cohen
ca6de177ac IPoIB: Fix error path memory leak
Clean up properly if ib_query_pkey() or ib_query_gid() fail.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:06 -07:00
Eli Cohen
b3ac60fc24 IPoIB: Fix typo to end statement with ';' instead of ','
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:06 -07:00
Michael S. Tsirkin
017aadc4b5 IB/mthca: Enable MSI-X by default
Recover from MSI-X errors by automatically falling back on regular
interrupts, instead of asking the user to do this manually.  This makes
it possible to enable MSI-X by default, and will make it possible to
get rid of the msi_x module option in the future.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:06 -07:00
Michael S. Tsirkin
08fb105540 mlx4_core: Enable MSI-X by default
Recover from MSI-X errors by automatically falling back on regular
interrupts, instead of asking the user to do this manually.  This makes
it possible to enable MSI-X by default, and will make it possible to
get rid of the msi_x module option in the future.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:05 -07:00
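The fallback described in the two MSI-X commits above follows a common pattern, sketched here in plain C. Every name is hypothetical: the setup callbacks stand in for whatever the drivers actually do to enable MSI-X or regular interrupts, and `msi_x` mirrors the module option mentioned in the message.

```c
/* Result of interrupt setup in this sketch. */
enum sketch_irq_mode { SKETCH_IRQ_MSIX, SKETCH_IRQ_LEGACY, SKETCH_IRQ_FAILED };

/* Setup callbacks return 0 on success, nonzero on failure. */
typedef int (*sketch_setup_fn)(void);

enum sketch_irq_mode sketch_init_irqs(int msi_x,
                                      sketch_setup_fn setup_msix,
                                      sketch_setup_fn setup_legacy)
{
    /* Try MSI-X first when enabled; on failure fall through silently. */
    if (msi_x && setup_msix() == 0)
        return SKETCH_IRQ_MSIX;

    /* Automatic fallback: no user intervention required. */
    if (setup_legacy() == 0)
        return SKETCH_IRQ_LEGACY;

    return SKETCH_IRQ_FAILED;
}

/* Trivial callbacks for trying the sketch out. */
int sketch_setup_ok(void) { return 0; }
int sketch_setup_fail(void) { return -1; }
```

Because a failed MSI-X attempt no longer leaves the device unusable, enabling it by default becomes safe under this scheme.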
Anton Blanchard
8a68bbe31d IB/fmr_pool: Clean up some error messages in fmr_pool.c
A number of printks in fmr_pool.c don't have newlines, e.g.:

    fmr_create failed for FMR 0<5>FS-Cache: Loaded

Fix them up.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:05 -07:00
Roland Dreier
1fea391039 IB/ehca: Include <linux/mutex.h> from ehca_classes.h
ehca_classes.h uses struct mutex, so while <linux/mutex.h> seems to be
pulled in indirectly by one of the headers it includes, the right
thing is to include <linux/mutex.h> directly.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Acked-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:05 -07:00
Roland Dreier
2242fa4f04 IB/mlx4: Use __set_data_seg() in mlx4_ib_post_recv()
Use a __set_data_seg() helper in mlx4_ib_post_recv() too; in addition
to making the code easier to read, this also allows gcc to generate
better code -- on x86_64:

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-8 (-8)
function                                     old     new   delta
mlx4_ib_post_recv                            359     351      -8

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:05 -07:00
Roland Dreier
eaf559bf56 mlx4_core: Don't free special QPs in QP number bitmap
Special QPs are not allocated using the regular QP number bitmap, so
when they are destroyed, their QP number should not be freed in the
bitmap.

Found by Dotan Barak of Mellanox.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:05 -07:00
Dotan Barak
36ce10d3e8 mlx4_core: Use enum value GO_BIT_TIMEOUT_MSECS
Rename GO_BIT_TIMEOUT to GO_BIT_TIMEOUT_MSECS for clarity, and
actually use it as the go bit timeout (instead of having the define
but then ignoring it and using a hard-coded 10 * HZ for the actual
timeout).

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:04 -07:00
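A sketch of using a named millisecond constant instead of a hard-coded tick count, as the GO_BIT_TIMEOUT_MSECS commit above describes. The value 10000 is an assumption chosen to match the "10 * HZ" (ten seconds) mentioned in the message, and `SKETCH_HZ` and `sketch_msecs_to_jiffies` are userspace stand-ins for the kernel's HZ and msecs_to_jiffies().

```c
/* Assumed value: ten seconds, matching the old hard-coded 10 * HZ. */
enum { GO_BIT_TIMEOUT_MSECS = 10000 };

/* Stand-in tick rate for this sketch; the real HZ is a kernel config value. */
#define SKETCH_HZ 250

/* Convert milliseconds to ticks, rounding up. */
unsigned long sketch_msecs_to_jiffies(unsigned int msecs)
{
    return ((unsigned long)msecs * SKETCH_HZ + 999) / 1000;
}

/* Deadline for the go bit poll, derived from the named constant. */
unsigned long go_bit_deadline(unsigned long now)
{
    return now + sketch_msecs_to_jiffies(GO_BIT_TIMEOUT_MSECS);
}
```

With the constant actually used, changing the timeout means editing one definition instead of hunting for a literal.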
Roland Dreier
65d470b3ea IB: find_first_zero_bit() takes unsigned pointer
Fix sparse warning

    drivers/infiniband/core/device.c:142:6: warning: incorrect type in argument 1 (different signedness)
    drivers/infiniband/core/device.c:142:6:    expected unsigned long const *addr
    drivers/infiniband/core/device.c:142:6:    got long *[assigned] inuse

by making the local variable inuse unsigned.  Does not affect generated
code at all.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:04 -07:00
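The signedness point in the sparse fix above can be illustrated with a userspace stand-in for find_first_zero_bit(). The function below is a simplified sketch with a hypothetical name, not the kernel implementation; it only shows why the bitmap is naturally an array of unsigned long.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Return the index of the first zero bit among the first 'size' bits of
 * the bitmap, or 'size' if every bit is set.
 */
size_t sketch_find_first_zero_bit(const unsigned long *addr, size_t size)
{
    const size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;

    for (size_t i = 0; i < size; i++) {
        if (!(addr[i / bits_per_long] & (1UL << (i % bits_per_long))))
            return i;
    }
    return size;
}
```

Declaring the caller's bitmap as `unsigned long` matches the parameter type directly, so no conversion between signed and unsigned pointers is involved.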
Roland Dreier
ce423ef50e IPoIB: Make sure no receives are handled when stopping device
The current IPoIB code might process receive completions from
ipoib_drain_cq() when bringing down the interface.  This could cause
packets to be passed up the stack without the device's poll method
being called.  Avoid this by setting the status of any successful
completions to IB_WC_WR_FLUSH_ERR.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:04 -07:00
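The approach in the IPoIB commit above, rewriting successful completions as flushed while draining, can be sketched as follows. The types and names are hypothetical simplifications of the verbs work-completion structures; only the status rewrite is modeled.

```c
#include <stddef.h>

/* Simplified stand-ins for work-completion status values. */
enum sketch_wc_status { SKETCH_WC_SUCCESS, SKETCH_WC_WR_FLUSH_ERR, SKETCH_WC_OTHER_ERR };

struct sketch_wc {
    enum sketch_wc_status status;
};

/*
 * While the device is being stopped, mark every successful completion as
 * flushed so the normal handlers drop it instead of passing a packet up
 * the stack. Returns the number of completions that were rewritten.
 */
size_t drain_mark_flushed(struct sketch_wc *wc, size_t n)
{
    size_t rewritten = 0;

    for (size_t i = 0; i < n; i++) {
        if (wc[i].status == SKETCH_WC_SUCCESS) {
            wc[i].status = SKETCH_WC_WR_FLUSH_ERR;
            rewritten++;
        }
    }
    return rewritten;
}
```

Completions that already carry an error status are left untouched in this sketch.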
Steve Wise
e54664c095 RDMA/cxgb3: Make the iw_cxgb3 module parameters writable
Allow changing parameter values without having to reload the module.
This is safe because these parameters are only looked at when a new
connection is established.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09 19:59:04 -07:00
Linus Torvalds
bbf25010f1 Linux 2.6.23
2007-10-09 13:31:38 -07:00
Linus Torvalds
5df3e0d953 Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  [MIPS] Au1000: set the PCI controller IO base
  [MIPS] Alchemy: Fix USB initialization.
  [MIPS] IP32: Fix fatal typo in address computation.
2007-10-09 12:38:44 -07:00
Trond Myklebust
a6d8543042 NLM: Fix a memory leak in nlmsvc_testlock
The recent fix for a circular lock dependency unfortunately introduced a
potential memory leak in the event where the call to nlmsvc_lookup_host
fails for some reason.

Thanks to Roel Kluin for spotting this.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-09 12:38:26 -07:00
Jeff Garzik
baf14aa14e sata_mv: correct S/G table limits
The recent mv_fill_sg() rewrite, to fix a data corruption problem
related to IOMMU virtual merging, forgot to account for the
potentially-increased size of the scatter/gather table after its run.

Additionally, the DMA boundary is reduced from 0xffffffff to 0xffff
to more closely match the needs of mv_fill_sg().

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-09 12:38:26 -07:00