Commit graph

178 commits

Author SHA1 Message Date
James Bottomley
02bd3499a3 [SCSI] scsi_lib: only call scsi_unprep_request() under queue lock
It's called under that lock everywhere else and it does alter the
request state, so it should be.

This one occurance in scsi_requeue_command() could open a window where
req->special is set to NULL while the requests is going through either
timeout or completion processing leading to NULL pointer derefs of the
sort complained of in bugzillas 12020 and 12195.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-13 14:31:03 -06:00
Mike Christie
2a3a59e5c9 [SCSI] Fix hang in starved list processing
Close possible infinite loop with interrupts off when devices are
added back to the starved list.

Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=11898

Reported-by: <alex.shi@intel.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-16 08:13:58 -06:00
Kiyoshi Ueda
6c5121b78b [SCSI] export busy state via q->lld_busy_fn()
This patch implements q->lld_busy_fn() for scsi mid layer to export
its busy state for request stacking drivers.

For efficiency, no lock is taken to check the busy state of
shost/starget/sdev, since the returned value is not guaranteed and
may be changed after request stacking drivers call the function,
regardless of taking lock or not.

When scsi can't dispatch I/Os anymore and needs to kill I/Os
(e.g. !sdev), scsi needs to return 'not busy'.
Otherwise, request stacking drivers may hold requests forever.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-23 11:42:16 -05:00
Kiyoshi Ueda
9d11251709 [SCSI] refactor sdev/starget/shost busy checking
This patch refactors the busy checking codes of scsi_device,
Scsi_Host and scsi_target.  There should be no functional change.

This is a preparation for another patch which exports scsi's busy
state to the block layer for request stacking drivers.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-23 11:42:16 -05:00
James Bottomley
32c356d76d [SCSI] fix removable device inability to detect disk changes
On Tue, 12 Aug 2008 15:08:14 +0200
Giuliano Pochini <pochini@shiny.it> wrote:

> Fujitsu magneto-optical drive, Adaptec 29160 and
> Linux Jay 2.6.26 #7 SMP Sun Aug 10 18:34:22 CEST 2008 ppc 7455, altivec supported PowerMac3,6 GNU/Linux
>
> When I insert a disk and I mount it, scsi_test_unit_ready() is called and
> the do-while loop gets sshdr->sense_key == UNIT_ATTENTION in the first
> cycle and 0 in the second one. So the if below misses the UNIT_ATTENTION
> and sdev->changed = 1 is not executed. At this point bad things can
> happen... I'm not sure how to fix this. Any clue ?

The problem is essentially caused by us eating UNIT_ATTENTION
conditions in scsi_test_unit_ready().  Fix by updating the ->changed
flag when this happens if the media is removable.

[pochini@shiny.it: updates to tidy up patch]
Signed-off-by: Giuliano Pochini <pochini@shiny.it>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-23 11:41:16 -05:00
Mike Christie
4a27446f3e [SCSI] modify scsi to handle new fail fast flags.
This checks the errors the scsi-ml determined were retryable
and returns if we should fast fail it based on the request
fail fast flags.

Without the patch, drivers like lpfc, qla2xxx and fcoe would return
DID_ERROR for what it determines is a temporary communication problem.
There is no loss of connectivity at that time and the driver thinks
that it would be fast to retry at the driver level. SCSI-ml will however
sees fast fail on the request and DID_ERROR and will fast fail the io.
This will then cause dm-multipath to fail the path and possibley switch
target controllers when we should be retrying at the scsi layer.

We also were fast failing device errors to dm multiapth when
unless the scsi_dh modules think otherwis we want to retry at
the scsi layer because multipath can only retry the IO like scsi
should have done. multipath is a little dumber though because it
does not what the error was for and assumes that it should fail
the paths.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-13 09:28:52 -04:00
Mike Christie
f0c0a376d0 [SCSI] Add helper code so transport classes/driver can control queueing (v3)
SCSI-ml manages the queueing limits for the device and host, but
does not do so at the target level. However something something similar
can come in userful when a driver is transitioning a transport object to
the the blocked state, becuase at that time we do not want to queue
io and we do not want the queuecommand to be called again.

The patch adds code similar to the exisiting SCSI_ML_*BUSY handlers.
You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit
a transport level queueing issue like the hw cannot allocate some
resource at the iscsi session/connection level, or the target has temporarily
closed or shrunk the queueing window, or if we are transitioning
to the blocked state.

bnx2i, when they rework their firmware according to netdev
developers requests, will also need to be able to limit queueing at this
level. bnx2i will hook into libiscsi, but will allocate a scsi host per
netdevice/hba, so unlike pure software iscsi/iser which is allocating
a host per session, it cannot set the scsi_host->can_queue and return
SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport.

The iscsi class/driver can also set a scsi_target->can_queue value which
reflects the max commands the driver/class can support. For iscsi this
reflects the number of commands we can support for each session due to
session/connection hw limits, driver limits, and to also reflect the
session/targets's queueing window.

Changes:
v1 - initial patch.
v2 - Fix scsi_run_queue handling of multiple blocked targets.
Previously we would break from the main loop if a device was added back on
the starved list. We now run over the list and check if any target is
blocked.
v3 - Rediff for scsi-misc.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-13 09:28:46 -04:00
Linus Torvalds
ef5bef357c Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (37 commits)
  [SCSI] zfcp: fix double dbf id usage
  [SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev
  [SCSI] zfcp: fix erp list usage without using locks
  [SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport
  [SCSI] zfcp: fix deadlock caused by shared work queue tasks
  [SCSI] zfcp: put threshold data in hba trace
  [SCSI] zfcp: Simplify zfcp data structures
  [SCSI] zfcp: Simplify get_adapter_by_busid
  [SCSI] zfcp: remove all typedefs and replace them with standards
  [SCSI] zfcp: attach and release SAN nameserver port on demand
  [SCSI] zfcp: remove unused references, declarations and flags
  [SCSI] zfcp: Update message with input from review
  [SCSI] zfcp: add queue_full sysfs attribute
  [SCSI] scsi_dh: suppress comparison warning
  [SCSI] scsi_dh: add Dell product information into rdac device handler
  [SCSI] qla2xxx: remove the unused SCSI_QLOGIC_FC_FIRMWARE option
  [SCSI] qla2xxx: fix printk format warnings
  [SCSI] qla2xxx: Update version number to 8.02.01-k8.
  [SCSI] qla2xxx: Ignore payload reserved-bits during RSCN processing.
  [SCSI] qla2xxx: Additional residual-count corrections during UNDERRUN handling.
  ...
2008-10-10 10:53:26 -07:00
Jens Axboe
242f9dcb8b block: unify request timeout handling
Right now SCSI and others do their own command timeout handling.
Move those bits to the block layer.

Instead of having a timer per command, we try to be a bit more clever
and simply have one per-queue. This avoids the overhead of having to
tear down and setup a timer for each command, so it will result in a lot
less timer fiddling.

Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:13 +02:00
James Bottomley
6f4267e3bd [SCSI] Update the SCSI state model to allow blocking in the created state
Brian King <brking@linux.vnet.ibm.com> reported that fibre channel
devices can oops during scanning if their ports block (because the
device goes from CREATED -> BLOCK -> RUNNING rather than CREATED ->
BLOCK -> CREATED).

Fix this by adding a new state: CREATED_BLOCK which can only transition
back to CREATED and disallow the CREATED -> BLOCK transition.  Now both
the created and blocked states that the mid-layer recognises can include
CREATED_BLOCK.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-03 11:46:13 -05:00
James Bottomley
44ea91c597 [SCSI] Fix hang with split requests
Sometimes, particularly for USB devices with the last sector bug,
requests get completed in chunks.  There's a bug in this in that if
one of the chunks gets an error, we complete that chunk with an error
but never move on to the remaining ones, leading to the request
hanging (because it's not fully completed).

Fix this by completing all remaining chunks if an error is encountered.

Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-09-23 12:29:01 -07:00
Harvey Harrison
cadbd4a5e3 [SCSI] replace __FUNCTION__ with __func__
[jejb: fixed up a ton of missed conversions.

 All of you are on notice this has happened, driver trees will now
 need to be rebased]

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: SCSI List <linux-scsi@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-27 10:31:49 -04:00
Mike Christie
6bd522f6a2 [SCSI] scsi_lib: use blk_rq_tagged in scsi_request_fn
I goofed and did not see the macro for checking if a request is tagged.
This patch has us use blk_rq_tagged instead of digging into the req->tag.

Patch was made over scsi-misc.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:59 -04:00
Martin K. Petersen
511e44f42e [SCSI] Do not retry a request whose data integrity check failed
If initiator or target reject the I/O due to DIF errors there is no
point in retrying.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:55 -04:00
Martin K. Petersen
7027ad72a6 [SCSI] Support devices with protection information
Implement support for DMA of protection information for devices that
are data integrity capable.

 - Add support for mapping an extra scatter-gather list containing
   the protection information.

 - Allocate protection scsi_data_buffer if host is DIX (integrity DMA)
   capable.

 - Accessor function for checking whether a device has protection
   enabled.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:55 -04:00
Mike Christie
ecefe8a975 [SCSI] fix shared tag map tag allocation
When drivers use a shared tag map we can end up with more requests
than tags, because the tag map is shost->can_queue tags and there
can be sdevs * sdev->queue_depth requests. In scsi_request_fn
if tag allocation fails we just drop down to just dequeueing the
tag without a tag. The problem is that drivers using the shared tag
map rely on a valid tag always being set, because it will use the
tag number to lookup commands later.

This patch has us check if we got a valid tag when the host lock
is held right before we check if the host queue is ready. We do the
check here because to allocate the tag we need the q lock, but
if the tag is bad we want to add the device/q onto the starved list
which requires the host lock.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:50 -04:00
Linus Torvalds
89a93f2f48 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (102 commits)
  [SCSI] scsi_dh: fix kconfig related build errors
  [SCSI] sym53c8xx: Fix bogus sym_que_entry re-implementation of container_of
  [SCSI] scsi_cmnd.h: remove double inclusion of linux/blkdev.h
  [SCSI] make struct scsi_{host,target}_type static
  [SCSI] fix locking in host use of blk_plug_device()
  [SCSI] zfcp: Cleanup external header file
  [SCSI] zfcp: Cleanup code in zfcp_erp.c
  [SCSI] zfcp: zfcp_fsf cleanup.
  [SCSI] zfcp: consolidate sysfs things into one file.
  [SCSI] zfcp: Cleanup of code in zfcp_aux.c
  [SCSI] zfcp: Cleanup of code in zfcp_scsi.c
  [SCSI] zfcp: Move status accessors from zfcp to SCSI include file.
  [SCSI] zfcp: Small QDIO cleanups
  [SCSI] zfcp: Adapter reopen for large number of unsolicited status
  [SCSI] zfcp: Fix error checking for ELS ADISC requests
  [SCSI] zfcp: wait until adapter is finished with ERP during auto-port
  [SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver
  [SCSI] sg: Add target reset support
  [SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC
  [SCSI] sd: Move scsi_disk() accessor function to sd.h
  ...
2008-07-15 18:58:04 -07:00
James Bottomley
2476b4d042 [SCSI] fix locking in host use of blk_plug_device()
scsi_lib.c:scsi_host_queue_ready() plugs the device with incorrect
locking.  It should actually have the queue lock held, but it's
holding the host lock.  Fix this by eliminating the call.  The host
ready has no need to plug the queue because if it returns 0 in
scsi_request_function control transfers to not_ready which acquires
the queue lock and plugs the device if its at zero depth.

Reported-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-12 08:22:36 -05:00
Martin K. Petersen
6362abd3e0 [SCSI] Rename scsi_bidi_sdb_cache
The data integrity changes need to dynamically allocate
scsi_data_buffers too.  Rename scsi_bidi_sdb_cache for clarity.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-12 08:22:24 -05:00
Alan Stern
bdb2b8cab4 [SCSI] erase invalid data returned by device
This patch (as1108) fixes a problem that can occur with certain USB
mass-storage devices: They return invalid data together with a residue
indicating that the data should be ignored.  Rather than leave the
invalid data in a transfer buffer, where it can get misinterpreted,
the patch clears the invalid portion of the buffer.

This solves a problem (wrong write-protect setting detected) reported
by Maciej Rutecki and Peter Teoh.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Peter Teoh <htmldeveloper@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-06 11:33:08 -05:00
Chandra Seetharaman
a6a8d9f87e [SCSI] scsi_dh: add infrastructure for SCSI Device Handlers
Some of the storage devices (that can be accessed through multiple paths),
do need some special handling for
	1. Activating the passive path of the storage access.
	2. Decode and handle the special sense codes returned by the devices.
	3. Handle the I/Os being sent to the passive path, especially
           during the device probe time.
when accessed through multiple paths.

As of today this special device handling is done at the dm-multipath
layer using dm-handlers. That works well for (1); for (2) to be handled
at dm layer, scsi sense information need to be exported from SCSI to dm-layer,
which is not very attractive; (3) cannot be done at all at the dm layer.

Device handler has been moved to SCSI mainly to handle (2) and (3) properly.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-06-05 09:23:40 -05:00
Linus Torvalds
d626e3bf72 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6:
  [SCSI] aic94xx: fix section mismatch
  [SCSI] u14-34f: Fix 32bit only problem
  [SCSI] dpt_i2o: sysfs code
  [SCSI] dpt_i2o: 64 bit support
  [SCSI] dpt_i2o: move from virt_to_bus/bus_to_virt to dma_alloc_coherent
  [SCSI] dpt_i2o: use standard __init / __exit code
  [SCSI] megaraid_sas: fix suspend/resume sections
  [SCSI] aacraid: Add Power Management support
  [SCSI] aacraid: Fix jbod operations scan issues
  [SCSI] aacraid: Fix warning about macro side-effects
  [SCSI] add support for variable length extended commands
  [SCSI] Let scsi_cmnd->cmnd use request->cmd buffer
  [SCSI] bsg: add large command support
  [SCSI] aacraid: Fix down_interruptible() to check the return value correctly
  [SCSI] megaraid_sas; Update the Version and Changelog
  [SCSI] ibmvscsi: Handle non SCSI error status
  [SCSI] bug fix for free list handling
  [SCSI] ipr: Rename ipr's state scsi host attribute to prevent collisions
  [SCSI] megaraid_mbox: fix Dell CERC firmware problem
2008-05-02 13:52:35 -07:00
Boaz Harrosh
db4742dd8f [SCSI] add support for variable length extended commands
Add support for variable-length, extended, and vendor specific
CDBs to scsi-ml. It is now possible for initiators and ULD's
to issue these types of commands. LLDs need not change much.
All they need is to raise the .max_cmd_len to the longest command
they support (see iscsi patch).

- clean-up some code paths that did not expect commands to be
  larger than 16, and change cmd_len members' type to short as
  char is not enough.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-05-02 11:33:25 -05:00
Boaz Harrosh
64a87b244b [SCSI] Let scsi_cmnd->cmnd use request->cmd buffer
- struct scsi_cmnd had a 16 bytes command buffer of its own.
   This is an unnecessary duplication and copy of request's
   cmd. It is probably left overs from the time that scsi_cmnd
   could function without a request attached. So clean that up.

 - Once above is done, few places, apart from scsi-ml, needed
   adjustments due to changing the data type of scsi_cmnd->cmnd.

 - Lots of drivers still use MAX_COMMAND_SIZE. So I have left
   that #define but equate it to BLK_MAX_CDB. The way I see it
   and is reflected in the patch below is.
   MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB
                      as per the SCSI standard and is not related
                      to the implementation.
   BLK_MAX_CDB.     - The allocated space at the request level

 - I have audit all ISA drivers and made sure none use ->cmnd in a DMA
   Operation. Same audit was done by Andi Kleen.

(*)fixed-length here means commands that their size can be determined
   by their opcode and the CDB does not carry a length specifier, (unlike
   the VARIABLE_LENGTH_CMD(0x7f) command). This is actually not exactly
   true and the SCSI standard also defines extended commands and
   vendor specific commands that can be bigger than 16 bytes. The kernel
   will support these using the same infrastructure used for VARLEN CDB's.
   So in effect MAX_COMMAND_SIZE means the maximum size command
   scsi-ml supports without specifying a cmd_len by ULD's

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-05-02 10:18:22 -05:00
Nick Piggin
75ad23bc0f block: make queue flags non-atomic
We can save some atomic ops in the IO path, if we clearly define
the rules of how to modify the queue flags.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-29 14:48:33 +02:00
James Bottomley
fa8e36c39b [SCSI] fix barrier failure issue
Currently, if the barrier command fails, the error return isn't seen
by the block layer and it proceeds on regardless.  The problem is that
SCSI always returns no error for REQ_TYPE_BLOCK_PC ... it expects the
submitter to pick the errors out of req->errors, which the block
barrier functions don't do.

Since it appears that the way SG_IO and scsi_execute_request() work
they discard the block error return and always use req->errors, the
best fix for this is to have the SCSI layer return an error to block
if one actually occurred (this also allows us to filter out spurious
errors, like deferred sense).

This patch is a bug fix that will need backporting to stable, but it's
also quite a big change and in need of testing, so we'll incubate in
the main kernel tree and backport at the -rc2 or so stage if no
problems turn up.

Acked-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:10 -05:00
Adrian Bunk
8c5e03d3cf [SCSI] make scsi_end_bidi_request() static
This patch makes the needlessly global scsi_end_bidi_request() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:08 -05:00
Kay Sievers
4d1566ed21 [SCSI] fix media change events for polled devices
Commit:
  a341cd0f (SCSI: add asynchronous event notification API)
breaks:
  285e9670 (sr,sd: send media state change modification events)
by introducing an event filter, which is removed here, to make
events, we are depending on, happen again.

Fix this by removing the event filter.  It's pretty much broken at the
moment, since a user can't set it (the attribute being read only).  A
proper fix will be to make the event discriminator distinguish between
AN and Polled media change events.

Cc: David Zeuthen <david@fubar.dk>
Cc: kristen accardi <kaccardi@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-03-19 11:51:28 -05:00
Tejun Heo
6b00769fe1 block: add request->raw_data_len
With padding and draining moved into it, block layer now may extend
requests as directed by queue parameters, so now a request has two
sizes - the original request size and the extended size which matches
the size of area pointed to by bios and later by sgs.  The latter size
is what lower layers are primarily interested in when allocating,
filling up DMA tables and setting up the controller.

Both padding and draining extend the data area to accomodate
controller characteristics.  As any controller which speaks SCSI can
handle underflows, feeding larger data area is safe.

So, this patch makes the primary data length field, request->data_len,
indicate the size of full data area and add a separate length field,
request->raw_data_len, for the unmodified request size.  The latter is
used to report to higher layer (userland) and where the original
request size should be fed to the controller or device.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-19 11:36:35 +01:00
Tony Battersby
4d2de3a50c [SCSI] fix BUG when sum(scatterlist) > bufflen
When sending a SCSI command to a tape drive via the SCSI Generic (sg)
driver, if the command has a data transfer length more than
scatter_elem_sz (32 KB default) and not a multiple of 512, then I either
hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else
the command never completes (depending on the LLDD).

When constructing scatterlists, the sg driver rounds up the scatterlist
element sizes to be a multiple of 512.  This can result in
sum(scatterlist lengths) > bufflen.  In this case, scsi_req_map_sg()
incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to
bufflen.  When the command completes, req_bio_endio() detects that
bio->bi_size != 0, and so it doesn't call bio_endio().  This causes the
command to be resubmitted, resulting in BUG_ON or the command never
completing.

This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather
than to sum(scatterlist lengths), which fixes the problem.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-02-07 18:02:44 -06:00
FUJITA Tomonori
99c84dbdc7 iommu sg merging: call dma_set_seg_boundary in __scsi_alloc_queue()
This is a one-line patch to add the following to __scsi_alloc_queue():

dma_set_seg_boundary(dev, shost->dma_boundary);

This is the simplest approach but the result looks odd,
__scsi_alloc_queue() does:

blk_queue_segment_boundary(q, shost->dma_boundary);
dma_set_seg_boundary(dev, shost->dma_boundary);
blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));

I think that it would be better to set up segment boundary in the same
way as we did for the maximum segment size. That is, removing
shost->dma_boundary and LLDs call pci_set_dma_seg_boundary (or its
friends).

Then __scsi_alloc_queue() can set up both limits in the same way:

blk_queue_segment_boundary(q, dma_get_seg_boundary(dev));
blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));

killing dma_boundary in scsi_host_template needs a large patch for
libata (dma_boundary is used by only libata and sym53c8xx). I'll send
a patch to do that if it is acceptable. James and Jeff?

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:12 -08:00
FUJITA Tomonori
860ac568e8 iommu sg merging: call blk_queue_segment_boundary in __scsi_alloc_queue
request_queue and device struct must have the same value of a segment
size limit. This patch adds blk_queue_segment_boundary in
__scsi_alloc_queue so LLDs don't need to call both
blk_queue_segment_boundary and set_dma_max_seg_size. A LLD can change
the default value (64KB) can call device_dma_parameters accessors like
pci_set_dma_max_seg_size when allocating scsi_host.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:11 -08:00
FUJITA Tomonori
3d9dd6eef8 [SCSI] handle scsi_init_queue failure properly
scsi_init_queue is expected to clean up allocated things when it
fails.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:25 -06:00
FUJITA Tomonori
b172b6e99e [SCSI] destroy scsi_bidi_sdb_cache in scsi_exit_queue
Needs to call kmem_cache_destroy for scsi_bidi_sdb_cache in
scsi_exit_queue.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:25 -06:00
James Bottomley
d3f46f39b7 [SCSI] remove use_sg_chaining
With the sg table code, every SCSI driver is now either chain capable
or broken (or has sg_tablesize set so chaining is never activated), so
there's no need to have a check in the host template.

Also tidy up the code by moving the scatterlist size defines into the
SCSI includes and permit the last entry of the scatterlist pools not
to be a power of two.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:02 -06:00
Kiyoshi Ueda
b8de163184 [SCSI] bidirectional: fix up for the new blk_end_request code
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:41 -06:00
Boaz Harrosh
6f9a35e2da [SCSI] bidirectional command support
At the block level bidi request uses req->next_rq pointer for a second
bidi_read request.
At Scsi-midlayer a second scsi_data_buffer structure is used for the
bidi_read part. This bidi scsi_data_buffer is put on
request->next_rq->special. Struct scsi_cmnd is not changed.

- Define scsi_bidi_cmnd() to return true if it is a bidi request and a
  second sgtable was allocated.

- Define scsi_in()/scsi_out() to return the in or out scsi_data_buffer
  from this command This API is to isolate users from the mechanics of
  bidi.

- Define scsi_end_bidi_request() to do what scsi_end_request() does but
  for a bidi request. This is necessary because bidi commands are a bit
  tricky here. (See comments in body)

- scsi_release_buffers() will also release the bidi_read scsi_data_buffer

- scsi_io_completion() on bidi commands will now call
  scsi_end_bidi_request() and return.

- The previous work done in scsi_init_io() is now done in a new
  scsi_init_sgtable() (which is 99% identical to old scsi_init_io())
  The new scsi_init_io() will call the above twice if needed also for
  the bidi_read command. Only at this point is a command bidi.

- In scsi_error.c at scsi_eh_prep/restore_cmnd() make sure bidi-lld is not
  confused by a get-sense command that looks like bidi. This is done
  by puting NULL at request->next_rq, and restoring.

[jejb: update to sg_table and resolve conflicts
also update to blk-end-request and resolve conflicts]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:41 -06:00
Boaz Harrosh
30b0c37b27 [SCSI] implement scsi_data_buffer
In preparation for bidi we abstract all IO members of scsi_cmnd,
that will need to duplicate, into a substructure.

- Group all IO members of scsi_cmnd into a scsi_data_buffer
  structure.
- Adjust accessors to new members.
- scsi_{alloc,free}_sgtable receive a scsi_data_buffer instead of
  scsi_cmnd. And work on it.
- Adjust scsi_init_io() and  scsi_release_buffers() for above
  change.
- Fix other parts of scsi_lib/scsi.c to members migration. Use
  accessors where appropriate.

- fix Documentation about scsi_cmnd in scsi_host.h

- scsi_error.c
  * Changed needed members of struct scsi_eh_save.
  * Careful considerations in scsi_eh_prep/restore_cmnd.

- sd.c and sr.c
  * sd and sr would adjust IO size to align on device's block
    size so code needs to change once we move to scsi_data_buff
    implementation.
  * Convert code to use scsi_for_each_sg
  * Use data accessors where appropriate.

- tgt: convert libsrp to use scsi_data_buffer

- isd200: This driver still bangs on scsi_cmnd IO members,
  so need changing

[jejb: rebased on top of sg_table patches fixed up conflicts
and used the synergy to eliminate use_sg and sg_count]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:40 -06:00
Boaz Harrosh
bb52d82f45 [SCSI] tgt: use scsi_init_io instead of scsi_alloc_sgtable
If we export scsi_init_io()/scsi_release_buffers() instead of
scsi_{alloc,free}_sgtable() from scsi_lib than tgt code is much more
insulated from scsi_lib changes. As a bonus it will also gain bidi
capability when it comes.

[jejb: rebase on to sg_table and fix up rejections]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:40 -06:00
Linus Torvalds
f0f0052069 Merge branch 'blk-end-request' of git://git.kernel.dk/linux-2.6-block
* 'blk-end-request' of git://git.kernel.dk/linux-2.6-block: (30 commits)
  blk_end_request: changing xsysace (take 4)
  blk_end_request: changing ub (take 4)
  blk_end_request: cleanup of request completion (take 4)
  blk_end_request: cleanup 'uptodate' related code (take 4)
  blk_end_request: remove/unexport end_that_request_* (take 4)
  blk_end_request: changing scsi (take 4)
  blk_end_request: add bidi completion interface (take 4)
  blk_end_request: changing ide-cd (take 4)
  blk_end_request: add callback feature (take 4)
  blk_end_request: changing ide normal caller (take 4)
  blk_end_request: changing cpqarray (take 4)
  blk_end_request: changing cciss (take 4)
  blk_end_request: changing ide-scsi (take 4)
  blk_end_request: changing s390 (take 4)
  blk_end_request: changing mmc (take 4)
  blk_end_request: changing i2o_block (take 4)
  blk_end_request: changing viocd (take 4)
  blk_end_request: changing xen-blkfront (take 4)
  blk_end_request: changing viodasd (take 4)
  blk_end_request: changing sx8 (take 4)
  ...
2008-01-29 08:51:32 +11:00
James Bottomley
7cedb1f17f SG: work with the SCSI fixed maximum allocations.
SCSI sg table allocation has a maximum size (of SCSI_MAX_SG_SEGMENTS,
currently 128) and this will cause a BUG_ON() in SCSI if something
tries an allocation over it.  This patch adds a size limit to the
chaining allocator to allow the specification of the maximum
allocation size for chaining, so we always chain in units of the
maximum SCSI allocation size.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:54:49 +01:00
Kiyoshi Ueda
610d8b0c97 blk_end_request: changing scsi (take 4)
This patch converts scsi mid-layer to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

As a result, the interface of internal function, scsi_end_request(),
is changed.

Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:37:09 +01:00
Jens Axboe
5ed7959ede SG: Convert SCSI to use scatterlist helpers for sg chaining
Also change scsi_alloc_sgtable() to just return 0/failure, since it
maps to the command passed in. ->request_buffer is now no longer needed,
once drivers are adapted to use scsi_sglist() it can be killed.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:05:27 +01:00
FUJITA Tomonori
b80ca4f7ee [SCSI] replace sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE
This replaces sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE in
several LLDs. It's a preparation for the future changes to remove
sense_buffer array in scsi_cmnd structure.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-23 11:29:27 -06:00
James Bottomley
465ff3185e [SCSI] relax scsi dma alignment
This patch relaxes the default SCSI DMA alignment from 512 bytes to 4
bytes.  I remember from previous discussions that usb and firewire have
sector size alignment requirements, so I upped their alignments in the
respective slave allocs.

The reason for doing this is so that we don't get such a huge amount of
copy overhead in bio_copy_user() for udev.  (basically all inquiries it
issues can now be directly mapped).

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:29:22 -06:00
James Bottomley
001aac257c [SCSI] sd,sr: add early detection of medium not present
The current scsi_test_unit_ready() is updated to return sense code
information (in struct scsi_sense_hdr).  The sd and sr drivers are
changed to interpret the sense code return asc 0x3a as no media and
adjust the device status accordingly.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:22:50 -06:00
Rusty Russell
4a03d90e35 [SCSI] BUG_ON() impossible condition in sg list counting
If blk_rq_map_sg wrote more than was allocated in the scatterlist,
BUG_ON() is probably the right thing to do.

[jejb: rejections fixed up]

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:22:50 -06:00
Tony Battersby
25d7c363f2 [SCSI] move single_lun flag from scsi_device to scsi_target
Some SCSI tape medium changers that need the BLIST_SINGLELUN flag have
the medium changer at one LUN and the tape drive at a different LUN.
The inquiry string of the tape drive may be different from that of the
medium changer.  In order for single_lun to be effective, every
scsi_device under a given scsi_target must have it set.  This means that
there needs to be a blacklist entry for BOTH the medium changer AND the
tape drive, which is impractical because some medium changers may be
paired with a variety of different tape drive models.  It makes more
sense to put the single_lun flag in scsi_target instead of scsi_device,
which causes every device at a given target ID to inherit the single_lun
flag from one LUN.  This makes it possible to blacklist just the medium
changer and not the tape drive.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:22:44 -06:00
Rob Landley
eb44820c28 [SCSI] Add Documentation and integrate into docbook build
Add Documentation/DocBook/scsi_midlayer.tmpl, add to Makefile, and update
lots of kerneldoc comments in drivers/scsi/*.

Updated with comments from Stefan Richter, Stephen M. Cameron,
 James Bottomley and Randy Dunlap.

Signed-off-by: Rob Landley <rob@landley.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-11 18:22:40 -06:00
Linus Torvalds
7b3d9545f9 Revert "scsi: revert "[SCSI] Get rid of scsi_cmnd->done""
This reverts commit ac40532ef0, which gets
us back the original cleanup of 6f5391c283.

It turns out that the bug that was triggered by that commit was
apparently not actually triggered by that commit at all, and just the
testing conditions had changed enough to make it appear to be due to it.

The real problem seems to have been found by Peter Osterlund:

  "pktcdvd sets it [block device size] when opening the /dev/pktcdvd
   device, but when the drive is later opened as /dev/scd0, there is
   nothing that sets it back.  (Btw, 40944 is possible if the disk is a
   CDRW that was formatted with "cdrwtool -m 10236".)

   The problem is that pktcdvd opens the cd device in non-blocking mode
   when pktsetup is run, and doesn't close it again until pktsetup -d is
   run.  The effect is that if you meanwhile open the cd device,
   blkdev.c:do_open() doesn't call bd_set_size() because
   bdev->bd_openers is non-zero."

In particular, to repeat the bug (regardless of whether commit
6f5391c283 is applied or not):

  " 1. Start with an empty drive.
    2. pktsetup 0 /dev/scd0
    3. Insert a CD containing an isofs filesystem.
    4. mount /dev/pktcdvd/0 /mnt/tmp
    5. umount /mnt/tmp
    6. Press the eject button.
    7. Insert a DVD containing a non-writable filesystem.
    8. mount /dev/scd0 /mnt/tmp
    9. find /mnt/tmp -type f -print0 | xargs -0 sha1sum >/dev/null
    10. If the DVD contains data beyond the physical size of a CD, you
        get I/O errors in the terminal, and dmesg reports lots of
        "attempt to access beyond end of device" errors."

which in turn is because the nested open after the media change won't
cause the size to be set properly (because the original open still holds
the block device, and we only do the bd_set_size() when we don't have
other people holding the device open).

The proper fix for that is probably to just do something like

	bdev->bd_inode->i_size = (loff_t)get_capacity(disk)<<9;

in fs/block_dev.c:do_open() even for the cases where we're not the
original opener (but *not* call bd_set_size(), since that will also
change the block size of the device).

Cc: Peter Osterlund <petero2@telia.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-06 10:17:12 -08:00