Commit graph

46306 commits

Author SHA1 Message Date
Steven Whitehouse
6bd9c8c2fb [GFS2] Remove unused go_callback operation
This is never used, so we might as well remove it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:17 -05:00
Steven Whitehouse
e5dab552c8 [GFS2] Remove the "greedy" function from glock.[ch]
The "greedy" code was an attempt to retain glocks for a minimum length
of time when they relate to mmap()ed files. The current implementation
of this feature is not, however, ideal in that it required allocating
memory in order to do this and its overly complicated.

It also misses the mark by ignoring the other I/O operations which are
just as likely to suffer from the same problem. So the plan is to remove
this now and then add the functionality back as part of the glock state
machine at a later date (and thus take into account all the possible
users of this feature)

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:14 -05:00
Steven Whitehouse
fee852e374 [GFS2] Shrink gfs2_inode memory by half
Here is something I spotted (while looking for something entirely
different) the other day.

Rather than using a completion in each and every struct gfs2_holder,
this removes it in favour of hashed wait queues, thus saving a
considerable amount of memory both on the stack (where a number of
gfs2_holder structures are allocated) and in particular in the
gfs2_inode which has 8 gfs2_holder structures embedded within it.

As a result on x86_64 the gfs2_inode shrinks from 2488 bytes to
1912 bytes, a saving of 576 bytes per inode (no thats not a typo!).
In actual practice we get a much better result than that since
now that a gfs2_inode is under the 2048 byte barrier, we get two
per 4k slab page effectively halving the amount of memory required
to store gfs2_inodes.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:11 -05:00
Steven Whitehouse
330005c2b2 [GFS2] Remove max_atomic_write tunable
This removes an unused sysfs tunable parameter.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:08 -05:00
Steven Whitehouse
3699e3a44b [GFS2] Clean up/speed up readdir
This removes the extra filldir callback which gfs2 was using to
enclose an attempt at readahead for inodes during readdir. The
code was too complicated and also hurts performance badly in the
case that the getdents64/readdir call isn't being followed by
stat() and it wasn't even getting it right all the time when it
was.

As a result, on my test box an "ls" of a directory containing 250000
files fell from about 7mins (freshly mounted, so nothing cached) to
between about 15 to 25 seconds. When the directory content was cached,
the time taken fell from about 3mins to about 4 or 5 seconds.

Interestingly in the cached case, running "ls -l" once reduced the time
taken for subsequent runs of "ls" to about 6 secs even without this
patch. Now it turns out that there was a special case of glocks being
used for prefetching the metadata, but because of the timeouts for these
locks (set to 10 secs) the metadata was being timed out before it was
being used and this the prefetch code was constantly trying to prefetch
the same data over and over.

Calling "ls -l" meant that the inodes were brought into memory and once
the inodes are cached, the glocks are not disposed of until the inodes
are pushed out of the cache, thus extending the lifetime of the glocks,
and thus bringing down the time for subsequent runs of "ls"
considerably.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:04 -05:00
Steven Whitehouse
a8d638e30e [GFS2] Add writepages for "data=writeback" mounts
It occurred to me that although a gfs2 specific writepages for ordered
writes and journaled data would be tricky, by hooking writepages only
for "data=writeback" mounts we could take advantage of not needing
buffer heads (we don't use them on the read side, nor have we for some
time) and create much larger I/Os for the block layer.

Using blktrace both before and after, its possible to see that for large
I/Os, most of the requests generated through writepages are now 1024
sectors after this patch is applied as opposed to 8 sectors before.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:01 -05:00
David Teigland
222d396092 [DLM] fix master recovery
If master recovery happens on an rsb in one recovery sequence, then that
sequence is aborted before lock recovery happens, then in the next
sequence, we rely on the previous master recovery (which may now be
invalid due to another node ignoring a lookup result) and go on do to the
lock recovery where we get stuck due to an invalid master value.

 recovery cycle begins: master of rsb X has left
 nodes A and B send node C an rcom lookup for X to find the new master
 C gets lookup from B first, sets B as new master, and sends reply back to B
 C gets lookup from A next, and sends reply back to A saying B is master
 A gets lookup reply from C and sets B as the new master in the rsb
 recovery cycle on A, B and C is aborted to start a new recovery
 B gets lookup reply from C and ignores it since there's a new recovery
 recovery cycle begins: some other node has joined
 B doesn't think it's the master of X so it doesn't rebuild it in the directory
 C looks up the master of X, no one is master, so it becomes new master
 B looks up the master of X, finds it's C
 A believes that B is the master of X, so it sends its lock to B
 B sends an error back to A
 A resends
 this repeats forever, the incorrect master value on A is never corrected

The fix is to do master recovery on an rsb that still has the NEW_MASTER
flag set from an earlier recovery sequence, and therefore didn't complete
lock recovery.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:58 -05:00
David Teigland
a1bc86e6bd [DLM] fix user unlocking
When a user process exits, we clear all the locks it holds.  There is a
problem, though, with locks that the process had begun unlocking before it
exited.  We couldn't find the lkb's that were in the process of being
unlocked remotely, to flag that they are DEAD.  To solve this, we move
lkb's being unlocked onto a new list in the per-process structure that
tracks what locks the process is holding.  We can then go through this
list to flag the necessary lkb's when clearing locks for a process when it
exits.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:55 -05:00
Patrick Caulfield
1d6e8131cf [DLM] Use workqueues for dlm lowcomms
This patch converts the DLM TCP lowcomms to use workqueues rather than using its
own daemon functions. Simultaneously removing a lot of code and making it more
scalable on multi-processor machines.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:52 -05:00
Adrian Bunk
03dc6a538e [GFS2] make gfs2_change_nlink_i() static
On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote:
>...
> Changes since 2.6.20-rc3-mm1:
>...
>  git-gfs2-nmw.patch
>...
>  git trees
>...

This patch makes the needlessly globlal gfs2_change_nlink_i() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:49 -05:00
Robert Peterson
7083146564 [GFS2] gfs2 knows of directories which it chooses not to display
This is for Red Hat bugzilla bug bz #222302:

Moving a virtual IP from node to node between two NFS-over-GFS2
servers was causing one of the GFS2 servers to become confused and
reference a deleted inode.  The problem was due to vfs dentries that did
not reference the gfs2_dops and therefore didn't call the gfs2 revalidate
code to revalidate a dentry after a directory had been deleted & recreated.
This patch is a crosswrite from a RHEL4 bug found in GFS1 as
bz #190756 and it is against the latest -nmw git tree.

Signed-off-by: Robert Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:46 -05:00
David Teigland
d200778e12 [DLM] expose dlm_config_info fields in configfs
Make the dlm_config_info values readable and writeable via configfs
entries.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:43 -05:00
David Teigland
99fc64874a [DLM] add config entry to enable log_debug
Add a new dlm_config_info field to enable log_debug output and change
log_debug() to use it.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:40 -05:00
David Teigland
68c817a1c4 [DLM] rename dlm_config_info fields
Add a "ci_" prefix to the fields in the dlm_config_info struct so that we
can use macros to add configfs functions to access them (in a later
patch).  No functional changes in this patch, just naming changes.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:37 -05:00
David Teigland
8ec6886748 [DLM] change some log_error to log_debug
Some common, non-error messages should use log_debug instead of log_error
so they can be turned off.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:34 -05:00
S. Wendy Cheng
87d21e07f3 [GFS2] Fix gfs2_rename deadlock
Second round of gfs2_rename lock re-ordering to allow Anaconda adding
root partition on top of gfs2. Previous to this patch the recursive
lock detector in glock.c can be triggered due to attempting to lock
the rgrp twice. This fixes it by checking to see whether the rgrp
is already locked.

This fixes Red Hat bugzilla #221237

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:31 -05:00
Russell Cattelan
6c93fd1e57 [GFS2] BZ 217008 fsfuzzer fix.
Update the quilt header comments to match the
code changes.

Change gfs2_lookup_simple to return an error in the case
of a NULL inode.
The callers of gfs2_lookup_simple do not check for NULL
in the no entry case and such would end up dereferencing a NULL ptr.

This fixes:
http://projects.info-pull.com/mokb/MOKB-15-11-2006.html

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:28 -05:00
Steven Whitehouse
49686f7106 [GFS2] Fix ordering of page disposal vs. glock_dq
In case of unlinked files with dirty pages GFS2 wasn't clearing
the pages in quite the right order. This patch clears the pages
earlier (before the qlock_dq) to avoid the situation that the
release of the glock results in attempting to write back data that
has already been deallocated.

This fixes Red Hat bugzilla: #220117

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:24 -05:00
Patrick Caulfield
4edde74eed [DLM] Fix spin lock already unlocked bug
I just noticed this message when testing some other changes I'd made to
lowcomms (to use workqueues) but the problem seems to be in the current
git trees too. I'm amazed no-one has seen it.

    BUG: spinlock already unlocked on CPU#1, dlm_recoverd/16868

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:21 -05:00
Patrick Caulfield
3fb4a251fe [DLM] Fix schedule() calls
I was a little over-enthusiastic turning schedule() calls int cond_sched() when fixing the DLM for Andrew Morton.

These four should really be calls to schedule() or the dlm can busy-wait.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:18 -05:00
S. Wendy Cheng
5509826f1e [GFS2] Fix change nlink deadlock
Bugzilla 215088

Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2
partition. The gfs2_rename() apparently needs block allocation for the
new name (into the directory) where it requires rg locks. At the same
time, while updating the nlink count for the replaced file,
gfs2_change_nlink() tries to return the inode meta-data back to resource
group where it needs rg locks too. Our logic doesn't allow process to
acquire these locks recursively by the same process  (RHEL installer)
that results a BUG call. This only happens within rename code path and
only if the destination file exists before the rename operation.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:15 -05:00
Steven Whitehouse
e1d5b18ae9 [GFS2] Fail over to readpage for stuffed files
This is partially derrived from a patch written by Russell Cattelan.
It fixes a bug where there is a race between readpages and truncate
by ignoring readpages for stuffed files. This is ok because a stuffed
file will never be more than one block (minus sizeof(struct gfs2_dinode))
in size and block size is always less than page size, so we do not lose
anything efficiency-wise by not doing readahead for stuffed files. They
will have already been "read ahead" by the action of reading the inode
in, in the first place.

This is the remaining part of the fix for Red Hat bugzilla #218966
which had not yet made it upstream.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Russell Cattelan <cattelan@redhat.com>
2007-02-05 13:36:12 -05:00
Steven Whitehouse
c7b3383437 [GFS2] Fix DIO deadlock
This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs
due to trying to take the i_mutex while holding a glock. The correct
locking order is defined as i_mutex -> glock in all cases.

I've left dealing with allocating writes. I know that we need to do
that, but for now this should do the trick. We don't need to take the
i_mutex on write, because the VFS has already taken it for us. On read
we don't need it since the glock is enough protection. The reason that
I've made some of the checks into a separate function is that we'll need
to do the checks again in the allocating write case eventually, so this
is partly in preparation for this. Likewise the return value test of !=
1 might look a bit odd and thats because we'll need a third return value
in case of requiring an allocation.

I've made the change to deferred mode on the glock to ensure flushing
read caches on other nodes. I notice that (using blktrace to look at
whats going on) we appear to do a better job of large I/Os than ext3
after this patch (in terms of not splitting up the I/Os).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Wendy Cheng <wcheng@redhat.com>
2007-02-05 13:36:09 -05:00
Adrian Bunk
927255f038 [DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions
Remove the following unused functions:

- lowcomms_send_message()
- lowcomms_max_buffer_size()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:05 -05:00
David Teigland
075529b5e1 [DLM] fix lost flags in stub replies
When the dlm fakes an unlock/cancel reply from a failed node using a stub
message struct, it wasn't setting the flags in the stub message.  So, in
the process of receiving the fake message the lkb flags would be updated
and cleared from the zero flags in the message.  The problem observed in
tests was the loss of the USER flag which caused the dlm to think a user
lock was a kernel lock and subsequently fail an assertion checking the
validity of the ast/callback field.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:02 -05:00
David Teigland
8d07fd509e [DLM] fix receive_request() lvb copying
LVB's are not sent as part of new requests, but the code receiving the
request was copying data into the lvb anyway.  The space in the message
where it mistakenly thought the lvb lived actually contained the resource
name, so it wound up incorrectly copying this name data into the lvb.  Fix
is to just create the lvb, not copy junk into it.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:35:59 -05:00
David Teigland
da49f36f4f [DLM] fix send_args() lvb copying
The send_args() function is used to copy parameters into a message for a
number different message types.  Only some of those types are set up
beforehand (in create_message) to include space for sending lvb data.
send_args was wrongly copying the lvb for all message types as long as the
lock had an lvb.  This means that the lvb data was being written past the
end of the message into unknown space.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:35:56 -05:00
David Teigland
9e971b715d [DLM] add version check
Check if we receive a message from another lockspace member running a
version of the dlm with an incompatible inter-node message protocol.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:35:53 -05:00
David Teigland
38aa8b0c59 [DLM] fix old rcom messages
A reply to a recovery message will often be received after the relevant
recovery sequence has aborted and the next recovery sequence has begun.
We need to ignore replies to these old messages from the previous
recovery.  There's already a way to do this for synchronous recovery
requests using the rc_id number, but not for async.

Each recovery sequence already has a locally unique sequence number
associated with it.  This patch adds a field to the rcom (recovery
message) structure where this recovery sequence number can be placed,
rc_seq.  When a node sends a reply to a recovery request, it copies the
rc_seq number it received into rc_seq_reply.  When the first node receives
the reply to its recovery message, it will check whether rc_seq_reply
matches the current recovery sequence number, ls_recover_seq, and if not
then it ignores the old reply.

An old, inadequate approach to filtering out old replies (checking if the
current stage of recovery has moved back to the start) has been removed
from two spots.

The protocol version number is changed to reflect the different rcom
structures.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:35:50 -05:00
David Teigland
dc200a8848 [DLM] fix resend rcom lock
There's a chance the new master of resource hasn't learned it's the new
master before another node sends it a lock during recovery.  The node
sending the lock needs to resend if this happens.

- A sends a master lookup for resource R to C
- B sends a master lookup for resource R to C
- C receives A's lookup, assigns A to be master of R and
  sends a reply back to A
- C receives B's lookup and sends a reply back to B saying
  that A is the master
- B receives lookup reply from C and sends its lock for R to A
- A receives lock from B, doesn't think it's the master of R
  and sends an error back to B
- A receives lookup reply from C and becomes master of R
- B gets error back from A and resends its lock back to A
  (this resending is what this patch does)
- A receives lock from B, it now sees it's the master of R
  and takes the lock

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:35:47 -05:00
David Teigland
c378051177 [GFS2] don't try to lockfs after shutdown
If an fs has already been shut down, a lockfs callback should do nothing.
An fs that's been shut down can't acquire locks or do anything with
respect to the cluster.

Also, remove FIXME comment in withdraw function.  The missing bits of the
withdraw procedure are now all done by user space.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:35:44 -05:00
Soeren Sonnenburg
a417a21e10 USB HID: handle multi-interface devices for Apple macbook pro properly
Some HID devices by Apple have both keyboard and mouse interfaces; the
keyboard interface is handled by usbhid, but the mouse (really
touchpad) interface must be handled by the separate 'appletouch'
driver.  Using HID_QUIRK_IGNORE will make hiddev ignore both
interfaces, therefore a new quirk flag to ignore only the mouse
interface is required.

Signed-off-by: Soeren Sonnenburg <kernel@nn7.de>
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2007-02-05 10:06:01 +01:00
Jiri Kosina
dd64c151b9 HID: move away from DEBUG defines in favor of CONFIG_HID_DEBUG
CONFIG_INPUT_DEBUG is non-existent option, so remove anything depending
on it.

Also, as we have new CONFIG_HID_DEBUG, this should be used on places
where ifdef DEBUG was used before.

Suggested by Adrian Bunk.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2007-02-05 10:00:45 +01:00
Jiri Kosina
43c7bf0472 USB HID: fix bogus comment in hid_get_class_descriptor()
The comment in hid_get_class_descriptor() says a very obvious thing
and is also violating codingstyle. Just remove it.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2007-02-05 10:00:43 +01:00
Jiri Kosina
8235ca3c05 USB HID: remove hid_find_field_by_usage()
The unused hid_find_field_by_usage() function has been commented out for
a pretty long time. Remove it completely.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2007-02-05 10:00:42 +01:00
Jiri Kosina
7c37914600 HID: API - fix leftovers of hidinput API in USB HID
hidinput_{open,close}() functions do not belong to usbhid, but
to the generic HID layer. Move them, and fix hooks in struct
hid_device, so that now the callbacks are done to transport-specific
_open() functions, but not input_open() functions.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2007-02-05 10:00:40 +01:00
Jiri Kosina
c080d89ad9 HID: hid debug from hid-debug.h to hid layer
hid-debug.h contains a lot of code, and should not therefore
be a header.

This patch moves the code to generic hid layer as .c source, and
introduces CONFIG_HID_DEBUG to conditionally compile it, instead
of playing with #define DEBUG and including hid-debug.h.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2007-02-05 10:00:38 +01:00
Anssi Hannula
20eb127906 hid: force feedback driver for PantherLord USB/PS2 2in1 Adapter
Add a force feedback driver for PantherLord USB/PS2 2in1 Adapter,
0810:0001. The device identifies itself as "Twin USB Joystick".

Signed-off-by: Anssi Hannula <anssi.hannula@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2007-02-05 10:00:05 +01:00
Anssi Hannula
5556feae1c hid: quirk for multi-input devices with unneeded output reports
Add new quirk HID_QUIRK_SKIP_OUTPUT_REPORTS to skip output reports
when enumerating reports on a hid-input device. Add this quirk and
HID_QUIRK_MULTI_INPUT to 0810:0001.

PantherLord Twin USB Joystick, 0810:0001 has separate input reports
for 2 distinct game controllers in the same interface, so it needs
HID_QUIRK_MULTI_INPUT. However, the device also contains one output
report per controller which is used to control the force feedback
function, and we do not want those to appear as separate input
devices as well. The simplest approach seems to be to add a quirk to
skip output reports on 0810:0001, and allow the force feedback
driver to handle those.

Signed-off-by: Anssi Hannula <anssi.hannula@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2007-02-05 10:00:04 +01:00
Anssi Hannula
c4146067fd hid: allow force feedback for multi-input devices
Allow hid devices with HID_QUIRK_MULTI_INPUT to have force feedback.
This was previously disabled because there were not any force
feedback drivers for such devices. This will change with my upcoming
patch.

Signed-off-by: Anssi Hannula <anssi.hannula@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2007-02-05 10:00:02 +01:00
Hoang-Nam Nguyen
b45bfcc1ae IB/ehca: Remove obsolete prototypes
Remove prototypes for functions that don't exist.

Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-04 14:11:58 -08:00
Hoang-Nam Nguyen
4c34bdf58c IB/ehca: Remove use of do_mmap()
This patch removes do_mmap() from ehca:
 - Call remap_pfn_range() for hardware register block
 - Use vm_insert_page() to register memory allocated for completion
   queues and queue pairs
 - The actual mmap() call/trigger is now controlled by user space,
   ie. libehca

Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-04 14:11:57 -08:00
Steve Wise
1f12667021 RDMA/addr: Handle ethernet neighbour updates during route resolution
The iWARP connection manager uses the ib_addr services to do route
resolution (neighbour discovery in the IP world).  The ib_addr
netevent callback routine, however, currently only acts on InfiniBand
neighbour updates.  It needs to act on ethernet neighbour updates as
well.

This patch just removes filtering on device type altogether and will
trigger on any neighour updates where the nud_type is valid.  This
simplifies the code some.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-04 14:11:57 -08:00
Jason Gunthorpe
fa7252ed4d IB: Make sure struct ib_user_mad.data is aligned
Make the untyped data region in ib_user_mad have type u64 so that it
gets aligned properly.  This avoids alignment faults in ib_umad when
casting the data field to an rmpp_mad and accessing the 64-bit tid
field on architectures like ia64.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-04 14:11:56 -08:00
Ishai Rabinovitz
1033ff670d IB/srp: Don't wait for response when QP is in error state.
When there is a call to send_tsk_mgmt SRP posts a send and waits for 5
seconds to get a response.

When the QP is in the error state it is obvious that there will be no
response so it is quite useless to wait.  In fact, the timeout causes
SRP to wait a long time to reconnect when a QP error occurs. (Each
abort and each reset_device calls send_tsk_mgmt, which waits for the
timeout).  The following patch solves this problem by identifying the
failure and returning an immediate error code.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-04 14:11:56 -08:00
Michael S. Tsirkin
062dbb69f3 IB: Return qp pointer as part of ib_wc
struct ib_wc currently only includes the local QP number: this matches
the IB spec, but seems mostly useless. The following patch replaces
this with the pointer to qp itself, and updates all low level drivers
and all users.

This has the following advantages:
- Ability to get a per-qp context through wc->qp->qp_context
- Existing drivers already have the qp pointer ready in poll cq, so
  this change actually saves a tiny bit (extra memory read) on data path
  (for ehca it would actually be expensive to find the QP pointer when
  polling a CQ, but ehca does not support SRQ so we can leave wc->qp as
  NULL for ehca)
- Users that need the QP number can still get it through wc->qp->qp_num

Use case:

In IPoIB connected mode code, I have a common CQ shared by multiple
QPs.  To track connection usage, I need a way to get at some per-QP
context upon the completion, and I would like to avoid allocating
context object per work request just to stick a QP pointer into it.
With this code, I can just use wc->qp->qp_context.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-04 14:11:55 -08:00
Michael S. Tsirkin
459d6e2a54 IB: Include <linux/kref.h> explicitly in <rdma/ib_verbs.h>
<rdma/ib_verbs.h> uses struct kref, so it should include <linux/kref.h>
explicitly to avoid hidden include dependencies.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-04 14:11:55 -08:00
Pierre Ossman
f9d429a2e5 mmc: tifm: replace kmap with page_address
Since we actively avoid highmem, calling kmap_atomic() instead
of page_address() is effectively only obfuscation.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2007-02-04 20:54:12 +01:00
Pierre Ossman
c70840e819 mmc: sdhci: fix voltage ocr
Some bad if-clauses caused the driver to just report the highest
supported voltage, not all.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2007-02-04 20:54:12 +01:00
Pierre Ossman
2a22b14edf mmc: sdhci: replace kmap with page_address
Since we actively avoid highmem, calling kmap_atomic() instead
of page_address() is effectively only obfuscation.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2007-02-04 20:54:12 +01:00