Commit graph

159 commits

Author SHA1 Message Date
Tejun Heo
672b2d65ba libata: kill ATA_EHI_RESUME_LINK
ATA_EHI_RESUME_LINK has two functions - promote reset to hardreset if
ATA_LFLAG_HRST_TO_RESUME is set and preventing EH from shortcutting
reset action when probing is requested.  The former is gone now and
the latter can easily be achieved by making EH to perform at least one
reset if reset is requested, which also makes more sense than
depending on RESUME_LINK flag.

As ATA_EHI_RESUME_LINK was the only EHI reset modifier, this also
kills reset modifier handling.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
cf48062658 libata: prefer hardreset
When both soft and hard resets are available, libata preferred
softreset till now.  The logic behind it was to be softer to devices;
however, this doesn't really help much.  Rationales for the change:

* BIOS may freeze lock certain things during boot and softreset can't
  unlock those.  This by itself is okay but during operation PHY event
  or other error conditions can trigger hardreset and the device may
  end up with different configuration.

  For example, after a hardreset, previously unlockable HPA can be
  unlocked resulting in different device size and thus revalidation
  failure.  Similar condition can occur during or after resume.

* Certain ATAPI devices require hardreset to recover after certain
  error conditions.  On PATA, this is done by issuing the DEVICE RESET
  command.  On SATA, COMRESET has equivalent effect.  The problem is
  that DEVICE RESET needs its own execution protocol.

  For SFF controllers with bare TF access, it can be easily
  implemented but more advanced controllers (e.g. ahci and sata_sil24)
  require specialized implementations.  Simply using hardreset solves
  the problem nicely.

* COMRESET initialization sequence is the norm in SATA land and many
  SATA devices don't work properly if only SRST is used.  For example,
  some PMPs behave this way and libata works around by always issuing
  hardreset if the host supports PMP.

  Like the above example, libata has developed a number of mechanisms
  aiming to promote softreset to hardreset if softreset is not going
  to work.  This approach is time consuming and error prone.

  Also, note that, dependingon how you read the specs, it could be
  argued that PMP fan-out ports require COMRESET to start operation.
  In fact, all the PMPs on the market except one don't work properly
  if COMRESET is not issued to fan-out ports after PMP reset.

* COMRESET is an integral part of SATA connection and any working
  device should be able to handle COMRESET properly.  After all, it's
  the way to signal hardreset during reboot.  This is the most used
  and recommended (at least by the ahci spec) method of resetting
  devices.

So, this patch makes libata prefer hardreset over softreset by making
the following changes.

* Rename ATA_EH_RESET_MASK to ATA_EH_RESET and use it whereever
  ATA_EH_{SOFT|HARD}RESET used to be used.  ATA_EH_{SOFT|HARD}RESET is
  now only used to tell prereset whether soft or hard reset will be
  issued.

* Strip out now unneeded promote-to-hardreset logics from
  ata_eh_reset(), ata_std_prereset(), sata_pmp_std_prereset() and
  other places.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:15 -04:00
Tejun Heo
3ec25ebd69 libata: ATA_EHI_LPM should be ATA_EH_LPM
EH actions are ATA_EH_* not ATA_EHI_*.  Rename ATA_EHI_LPM to
ATA_EH_LPM.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-29 12:21:31 -04:00
Tejun Heo
eec59f76e9 libata: allow LLDs w/o any reset method
Some old SFF controllers don't have any way to reset the channel.
Currently, this isn't supported and libata EH causes an oops.  Allow
LLDs w/o any reset method and just assume ATA class in such cases.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-10 20:50:34 -04:00
Tejun Heo
3326732570 libata: implement libata.force module parameter
This patch implements libata.force module parameter which can
selectively override ATA port, link and device configurations
including cable type, SATA PHY SPD limit, transfer mode and NCQ.

For example, you can say "use 1.5Gbps for all fan-out ports attached
to the second port but allow 3.0Gbps for the PMP device itself, oh,
the device attached to the third fan-out port chokes on NCQ and
shouldn't go over UDMA4" by the following.

 libata.force=2:1.5g,2.15:3.0g,2.03:noncq,udma4

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-20 12:12:28 -05:00
Tejun Heo
75f9cafc2d libata: fix off-by-one in error categorization
ATA_ECAT_DUBIOUS_BASE was too high by one and thus all DUBIOUS error
categorizations were wrong.  This passed test because only ATA_BUS and
UNK_DEV were used during testing and the ones after them - ATA_BUS and
an overflowed entry - behaved similarly.

This patch fixes the problem by adding DUBIOUS_NONE category and use
it as base.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:17 -05:00
Tejun Heo
0dc36888d4 libata: rename ATA_PROT_ATAPI_* to ATAPI_PROT_*
ATA_PROT_ATAPI_* are ugly and naming schemes between ATA_PROT_* and
ATA_PROT_ATAPI_* are inconsistent causing confusion.  Rename them to
ATAPI_PROT_* and make them consistent with ATA counterpart.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-01-23 05:24:14 -05:00
Andrew Morton
e6a73ab1c8 drivers/ata/libata-eh.c: fix printk warning
drivers/ata/libata-eh.c: In function `ata_port_pbar_desc':
drivers/ata/libata-eh.c:215: warning: long long unsigned int format, long unsigned int arg (arg 4)

Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:13 -05:00
Jeff Garzik
e39eec13ff [libata] Build fix WRT ata_is_xxx() new API introduction
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-01-23 05:24:11 -05:00
Tejun Heo
76326ac1ac libata: implement fast speed down for unverified data transfer mode
It's very likely that the configured data transfer mode is the wrong
one if device fails data transfers right after initial data transfer
mode configuration (including NCQ on/off and xfermode).  libata EH
needs to speed down fast before upper layers give up on probing.

This patch implement fast speed down rules to handle such cases
better.  Error occured while data transfer hasn't been verified
trigger fast back-to-back speed down actions until data transfer
works.

This change will make cable mis-detection and other initial
configuration problems corrected before partition scanning code gives
up.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:11 -05:00
Tejun Heo
00115e0f5b libata: implement ATA_DFLAG_DUBIOUS_XFER
ATA_DFLAG_DUBIOUS_XFER is set whenever data transfer speed or method
changes and gets cleared when data transfer command succeeds in the
newly configured transfer mode.

This will be used to improve speed down logic.

Signed-off-by: Tejun Heo <htejun@gmail.com<
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:11 -05:00
Tejun Heo
663f99b86a libata: adjust speed down rules
Speed down rules were too conservative.  Adjust them a bit.

* More than 10 timeouts can't happen in 5 minutes as command timeout
  is 30secs.  Lower the limit for rule #1 to 6.

* 10 timeouts is too high for rule #3 too.  Lower it to 6.

* SATAPI can benefit from falling back to PIO too.  Allow SATAPI
  devices to fall back to PIO.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:11 -05:00
Tejun Heo
3884f7b0a8 libata: clean up EH speed down implementation
Clean up EH speed down implementation.

* is_io boolean variable is replaced eflags.  is_io is ATA_EFLAG_IS_IO.

* Error categories now have names.

* Better comments.

* Reorder 5min and 10min rules in ata_eh_speed_down_verdict()

* Use local variable @link to cache @dev->link in ata_eh_speed_down()

These changes are to improve readability and ease further changes.
This patch doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:10 -05:00
Tejun Heo
6f1d1e3a03 libata: move ata_set_mode() to libata-eh.c
Move ata_set_mode() to libata-eh.c.  ata_set_mode() is surely an EH
action and will be more tightly coupled with the rest of error
handling.  Move it to libata-eh.c.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:10 -05:00
Tejun Heo
02c05a27e8 libata: factor out ata_eh_schedule_probe()
Factor out ata_eh_schedule_probe() from ata_eh_handle_dev_fail() and
ata_eh_recover().  This is to improve maintainability and make future
changes easier.

In the previous revision, ata_dev_enabled() test was accidentally
dropped while factoring out.  This problem was spotted by Bartlomiej.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:10 -05:00
Shaohua Li
bd3adca52b libata-acpi: add ACPI _PSx method
ACPI spec (ver 3.0a, p289) requires IDE power on/off executes ACPI _PSx
methods. As recently most PATA drivers use libata, this patch adds _PSx
method support in libata. ACPI spec doesn't mention if SATA requires the
same _PSx method.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-01-23 05:24:09 -05:00
Tejun Heo
4ccd3329a2 libata: don't normalize UNKNOWN to NONE after reset
After non-classifying reset, ehc->classes[] could contain
ATA_DEV_UNKNOWN which used to be normalized to ATA_DEV_NONE for
consistency.  However, this causes unfortunate side effect for drivers
which have non-classifying hardresets (e.g. sata_nv) by making
hardreset report ATA_DEV_NONE for non-classifying resets and thus
makes EH believe that the port is unoccupied and recovery can be
skipped.  The end result is that after a device is swapped with
another one, the new device isn't attached after the old one is
detached.

This patch makes ata_eh_reset() not normalize UNKNOWN to NONE after
non-classifying resets.  This fixes the above problem.  As UNKNOWN and
NONE are handled differently by only EH hotplug logic, this doesn't
cause other behavior changes.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-10 16:53:22 -05:00
Tejun Heo
2695e36616 libata-pmp: propagate timeout to host link
Timeout on downstream command may indicate transmission problem on
host link.  Propagate timeouts to host link.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-10 16:53:16 -05:00
Tejun Heo
f2dfc1a12b libata: update atapi_eh_request_sense() such that lbam/lbah contains buffer size
While updating lbam/h for ATAPI commands, atapi_eh_request_sense() was
left out.  Update it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:33:15 -05:00
Tejun Heo
abb6a88974 libata: report protocol and full CDB on error
Protocol and CDB allocation size field are important in determining
what went wrong with ATAPI commands.  Report them on failure.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-01 17:35:58 -05:00
Adrian Bunk
21bef6dd2b libata: remove unused functions
This patch removes the following obsolete functions:
- libata-core.c: __sata_phy_reset()
- libata-core.c: sata_phy_reset()
- libata-eh.c: ata_qc_timeout()
- libata-eh.c: ata_eng_timeout()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Tejun Heo <htejun@gmail.com>
2007-11-19 12:28:09 +09:00
Tejun Heo
dfcc173d71 libata: consider errors not associated with commands for speed down
libata EH used to ignore errors not associated with commands when
determining whether speed down is necessary or not.  This leads to the
following problems.

* Errors not associated with commands can occur indefinitely without
  libata EH taking corrective actions.

* Upstream link errors don't trigger speed down when PMP is attached
  to it and commands issued to downstream device trigger errors on the
  upstream link.

This patch makes ata_eh_link_autopsy() consider errors not associated
with command for speed down.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-11-03 08:47:26 -04:00
Tejun Heo
08cf69d005 libata: more robust reset failure handling
Reset failure is a critical error.  It results in disabling the link
requiring user intervention to re-enable it.  Make reset failure
handling more robust such that libata EH doesn't give up too early.

* Temporary glitches during hardreset may lead to classification
  failure when there's no softreset available.  Retry instead of
  giving up.

* Initial softreset or follow up softreset may fail classification.
  Move classification error handling block out of followup softreset
  block such that both cases are handled and retry instead of giving
  up.  Also, on the last try, give ATA class a blind shot.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-11-03 08:46:54 -04:00
Tejun Heo
416dc9ed20 libata: cosmetic clean up / reorganization of ata_eh_reset()
Clean up and reorganize ata_eh_reset() to ease further changes.

* Cache ARRAY_SIZE(ata_eh_reset_timeouts) in @max_tries.
* Cache link->flags in @lflags.
* Move failure handling block to the end of the function and unnest
  both success and failure handling blocks.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-11-03 08:46:54 -04:00
Tejun Heo
cd955463bb libata: fix timing computation in ata_eh_reset()
As jiffies changes asynchronously, it needs to be cached if unchanging
timestamp is needed.  The code in ata_eh_reset() intended to do that
with @now but never actually did it.  Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-11-03 08:46:54 -04:00
Tejun Heo
e027bd36c1 libata: implement and use ATA_QCFLAG_QUIET
Implement ATA_QCFLAG_QUIET which indicates that there's no need to
report if the command fails with AC_ERR_DEV and set it for passthrough
commands.

Combined with previous changes, this now makes device errors for all
direct commands reported directly to the issuer without going through
EH actions and reporting.

Note that EH is still invoked after non-IO device errors to determine
the nature of the error and resume command execution (some controller
requires special care after error to continue).  It just performs
default maintenance after error, examines what's going on, realizes
that it's none of its business and reports the command failure without
logging any error messages.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-30 09:59:43 -04:00
Tejun Heo
f90f0828e5 libata: stop being overjealous about non-IO commands
libata EH always revalidated device and retried failed command after
error except for ATAPI CCs.  This is unnecessary and hinders with
users issuing direct commands.  This patch makes the following
changes.

* Make sata_sil24 not request ATA_EH_REVALIDATE on device errors.
  sil24 is the only driver which does this.  All others let libata EH
  core code decide.

* Don't request revalidation after device error of non-IO command.
  Revalidation doesn't really help anybody.  As ATA_EH_REVALIDATE
  isn't set by default, there's no reason to clear it after sense data
  is read.  Kill ATA_EH_REVALIDATE clearing code while at it.

* Don't retry non-IO command after device error.  Device has rejected
  the command.  There's no point in retrying.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-30 09:59:42 -04:00
Kristen Carlson Accardi
ca77329fb7 [libata] Link power management infrastructure
Device Initiated Power Management, which is defined
in SATA 2.5 can be enabled for disks which support it.
This patch enables DIPM when the user sets the link
power management policy to "min_power".

Additionally, libata drivers can define a function
(enable_pm) that will perform hardware specific actions to
enable whatever power management policy the user set up
for Host Initiated Power management (HIPM).
This power management policy will be activated after all
disks have been enumerated and intialized.  Drivers should
also define disable_pm, which will turn off link power
management, but not change link power management policy.

Documentation/scsi/link_power_management_policy.txt has additional
information.

Signed-off-by:  Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-29 11:00:35 -04:00
Tejun Heo
4fb4615bc9 libata: no need to speed down if already at PIO0
After reset, transfer mode is always PIO0 regardless of
dev->xfer_mask.  Check dev->pio_mode before trying to slow down after
configuration failure.  This prevents bogus speed down before device
is actually configured.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-29 06:21:33 -04:00
Tejun Heo
cdeab11407 libata: relocate forcing PIO0 on reset
Forcing PIO0 on reset was done inside ata_bus_softreset(), which is a
bit out of place as it should be applied to all resets - hard, soft
and implementation which don't use ata_bus_softreset().  Relocate it
such that...

* For new EH, it's done in ata_eh_reset() before calling prereset.

* For old EH, it's done before calling ap->ops->phy_reset() in
  ata_bus_probe().

This makes PIO0 forced after all resets.  Another difference is that
reset itself is done after PIO0 is forced.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-29 06:21:33 -04:00
Tejun Heo
054a5fbace libata: track SLEEP state and issue SRST to wake it up
ATA devices in SLEEP mode don't respond to any commands.  SRST is
necessary to wake it up.  Till now, when a command is issued to a
device in SLEEP mode, the command times out, which makes EH reset the
device and retry the command after that, causing a long delay.

This patch makes libata track SLEEP state and issue SRST automatically
if a command is about to be issued to a device in SLEEP.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Bruce Allen <ballen@gravity.phys.uwm.edu>
Cc: Andrew Paprocki <andrew@ishiboo.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-29 06:15:25 -04:00
Tejun Heo
0e06d9ce7a libata: cosmetic clean up in ata_eh_reset()
Local variable @action usage in ata_eh_reset() is a bit confusing.
It's used only to cache ehc->i.action to test reset masks after
clearing it; however, due to the generic name "action", it's easy to
misinterpret the local variable as containing the selected reset
method later.  Also, the reason for caching the original value is easy
to miss.

This patch renames @action to @tmp_action and make it buffer newly
selected value instead to improve readability.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-25 02:02:02 -04:00
Jeff Garzik
2dcb407e61 [libata] checkpatch-inspired cleanups
Tackle the relatively sane complaints of checkpatch --file.

The vast majority is indentation and whitespace changes, the rest are

* #include fixes
* printk KERN_xxx prefix addition
* BSS/initializer cleanups

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-23 20:59:42 -04:00
Jeff Garzik
2855568b1e [libata] struct pci_dev related cleanups
* remove pointless pci_dev_to_dev() wrapper.  Just directly reference
  the embedded struct device like everyone else does.

* pata_cs5520: delete cs5520_remove_one(), it was a duplicate of
  ata_pci_remove_one()

* linux/libata.h: don't bother including linux/pci.h, we don't need it.
  Simply declare 'struct pci_dev' and assume interested parties will
  include the header, as they should be doing anyway.

* linux/libata.h: consolidate all CONFIG_PCI declarations into a
  single location in the header.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-12 14:55:47 -04:00
Tejun Heo
b06ce3e51e libata: use ata_exec_internal() for PMP register access
PMP registers used to be accessed with dedicated accessors ->pmp_read
and ->pmp_write.  During reset, those callbacks are called with the
port frozen so they should be able to run without depending on
interrupt delivery.  To achieve this, they were implemented polling.

However, as resetting the host port makes the PMP to isolate fan-out
ports until SError.X is cleared, resetting fan-out ports while port is
frozen doesn't buy much additional safety.

This patch updates libata PMP support such that PMP registers are
accessed using regular ata_exec_internal() mechanism and kills
->pmp_read/write() callbacks.  The following changes are made.

* PMP access helpers - sata_pmp_read_init_tf(), sata_pmp_read_val(),
  sata_pmp_write_init_tf() are folded into sata_pmp_read/write() which
  are now standalone PMP register access functions.

* sata_pmp_read/write() returns err_mask instead of rc.  This is
  consistent with other functions which issue internal commands and
  allows more detailed error reporting.

* ahci interrupt handler is modified to ignore BAD_PMP and
  spurious/illegal completion IRQs while reset is in progress.  These
  conditions are expected during reset.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:47 -04:00
Tejun Heo
afaa5c373d libata: implement ATA_PFLAG_RESETTING
Implement ATA_PFLAG_RESETTING.  This flag is set while reset is in
progress.  It's set before prereset is called and cleared after reset
fails or postreset is finished.

This flag itself doesn't have any function.  It will be used by LLDs
to tell whether reset is in progress if it needs to behave differently
during reset.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:47 -04:00
Tejun Heo
2b789108fc libata: add @timeout to ata_exec_internal[_sg]()
Add @timeout argument to ata_exec_internal[_sg]().  If 0, default
timeout ata_probe_timeout is used.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:47 -04:00
Tejun Heo
9073868376 libata: wrap schedule_timeout_uninterruptible() in loop
Tasks in uninterruptible sleep might be woken up by unrelated events
and should check whether the condition it was waiting for has actually
triggered.  Wrap schedule_timeout_uninterruptible() in loop to achieve
it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:46 -04:00
Tejun Heo
94ff3d5408 libata: skip suppress reporting if ATA_EHI_QUIET
ATA_EHI_NO_AUTOPSY and ATA_EHI_QUIET are used during initial probing
to skip exception analysis and reporting.  Usually, there's nothing to
report but on some allowed but rare corner cases (e.g. phy status
changed interrupt when IRQ is enabled on frozen port - this happens if
IRQ pending status isn't cleared in the IRQ router or controller)
exception messages get printed.

Skip reporting if ATA_EHI_QUIET is set.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:46 -04:00
Robert Hancock
1333e19434 libata: add human-readable error value decoding
This adds human-readable decoding of the ATA status and error registers
(similar to what drivers/ide does) as well as the SATA Serror register
to libata error handling output.  This prevents the need to pore
through standards documents to figure out the meaning of the bits
in these registers when looking at error reports.  Some bits that
drivers/ide decoded are not decoded here, since the bits are either
command-dependent or obsolete, and properly parsing them would add
too much complexity.

Signed-off-by: Robert Hancock <hancockr@shaw.ca>

[edited slightly to make output a bit more symmetric]
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:45 -04:00
Tejun Heo
633273a3ed libata-pmp: hook PMP support and enable it
Hook PMP support into libata and enable it.  Connect SCR and probing
functions, and update ata_dev_classify() to detect PMP.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:44 -04:00
Tejun Heo
3495de7336 libata-pmp: update ata_eh_reset() for PMP
PMP always requires SRST to be enabled.  Also, hardreset reports
classification code from the first device when PMP is attached, not
from the PMP.  Update ata_eh_reset() such that followup softreset is
performed if the controller is PMP capable and the host link is being
reset.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:43 -04:00
Tejun Heo
7d77b24708 libata-pmp-prep: implement sata_async_notification()
AN serves multiple purposes.  For ATAPI, it's used for media change
notification.  For PMP, for downstream PHY status change notification.
Implement sata_async_notification() which demultiplexes AN.

To avoid unnecessary port events, ATAPI AN is not enabled if PMP is
attached but SNTF is not available.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Kriten Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:42 -04:00
Tejun Heo
668108d73b libata-pmp-prep: implement EH fast-fail path
If PMP itself becomes inaccessible while trying to link a downstream
link, spending time to recover the downstream link doesn't make any
sense.  Make EH skip retry and fail fast if -ERESTART is received.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:41 -04:00
Tejun Heo
f9df58cb27 libata-pmp-prep: implement ATA_LFLAG_DISABLED
Implement ATA_LFLAG_DISABLED.  The flag indicates the link is disabled
due to EH recovery failure.  While a link is disabled, no EH action is
taken on the link and suspend/resume become noop too.

This will be used by PMP links to manage failed links.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:41 -04:00
Tejun Heo
fd995f7039 libata-pmp-prep: implement ATA_LFLAG_NO_RETRY
Some PMP links are connected to internal pseudo devices which may come
and go depending on situation.  There's no reason to try hard to
recover them.  ATA_LFLAG_NO_RETRY tells EH to not retry if the device
attached to the link fails.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:41 -04:00
Tejun Heo
ae791c0569 libata-pmp-prep: implement ATA_LFLAG_NO_SRST, ASSUME_ATA and ASSUME_SEMB
Some links on some PMPs locks up on SRST and/or report incorrect
device signature.  Implement ATA_LFLAG_NO_SRST, ASSUME_ATA and
ASSUME_SEMB to handle these quirky links.  NO_SRST makes EH avoid
SRST.  ASSUME_ATA and SEMB forces class code to ATA and SEMB_UNSUP
respectively.  Note that SEMB isn't currently supported yet so the
_UNSUP variant is used.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:41 -04:00
Tejun Heo
da917d69d0 libata-pmp-prep: implement qc_defer helpers
Implement ap->nr_active_links (the number of links with active qcs),
ap->excl_link (pointer to link which can be used by ->qc_defer and is
cleared when a qc with ATA_QCFLAG_CLEAR_EXCL completes), and
ata_link_active().

These can be used by ->qc_defer() to implement proper command
exclusion.  This set of helpers seem enough for both sil24 (ATAPI
exclusion needed) and cmd-switching PMP.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:41 -04:00
Tejun Heo
fb7fd61454 libata-pmp-prep: make a number of functions global to libata
Make a number of functions from libata-core.c and libata-eh.c global
to libata (drivers/ata/libata.h).  These will be used by PMP.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:41 -04:00
Tejun Heo
422c9daa8b libata-pmp-prep: add @new_class to ata_dev_revalidate()
Consider newly found class code while revalidating.  PMP resetting
always results in valid class code and issuing PMP commands to
ATA/ATAPI device isn't very attractive.  Add @new_class to
ata_dev_revalidate() and check class code for revalidation.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:40 -04:00
Tejun Heo
a1e10f7e68 libata: move EH repeat reporting into ata_eh_report()
EH is sometimes repeated without any error or action.  For example,
this happens when probing IDENTIFY fails because of a phantom device.
In these cases, all the repeated EH does is making sure there is no
unhandled error or pending action and return.  This repeation is
necessary to avoid losing any event which occurred while EH was in
progress.

Unfortunately, this dry run causes annonying "EH pending after
completion" message.  This patch moves the repeat reporting into
ata_eh_report() such that it's more compact and skipped on dry runs.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Mikael Pettersson <mikep@it.uu.se>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:37 -04:00
Tejun Heo
cbcdd87593 libata: implement and use ata_port_desc() to report port configuration
Currently, port configuration reporting has the following problems.

* iomapped address is reported instead of raw address
* report contains irrelevant fields or lacks necessary fields for
  non-SFF controllers.
* host->irq/irq2 are there just for reporting and hacky.

This patch implements and uses ata_port_desc() and
ata_port_pbar_desc().  ata_port_desc() is almost identical to
ata_ehi_push_desc() except that it takes @ap instead of @ehi, has no
locking requirement, can only be used during host initialization and "
" is used as separator instead of ", ".  ata_port_pbar_desc() is a
helper to ease reporting of a PCI BAR or an offsetted address into it.

LLD pushes whatever description it wants using the above two
functions.  The accumulated description is printed on host
registration after "[S/P]ATA max MAX_XFERMODE ".

SFF init helpers and ata_host_activate() automatically add
descriptions for addresses and irq respectively, so only LLDs which
isn't standard SFF need to add custom descriptions.  In many cases,
such controllers need to report different things anyway.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:37 -04:00
Tejun Heo
9b1e2658fa libata-link: update EH to deal with PMP links
Update ata_eh_autopsy(), ata_eh_report(),
ata_eh_revalidate_and_attach() and ata_eh_recover() to deal with PMP
links.  ata_eh_autopsy() and ata_eh_report() updates are
straightforward.  They just repeat the same operation over all
configured links.  The only change to ata_eh_revalidate_and_attach()
is avoiding calling ->cable_select() on non-host ports.

ata_eh_recover() update is more complex as it first processes all
resets and then performs the rest.  This is necessary as thawing with
some links in unknown state can be dangerous.  ehi->action is cleared
on successful recovery of a link to avoid repeating recovery due to
failures in other links.

ata_eh_recover() iterates over only PMP links if PMP is attached, and,
on failure, the failing link is returned in @failed_link instead of
disabling devices directly.  These are to integrate ata_eh_recover()
into PMP EH later.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:32 -04:00
Tejun Heo
cf1b86c8ab libata-link: update ata_scsi_error() to handle PMP links
Update ata_scsi_error() to handle PMP links.  As error conditions can
occur on both host and PMP links, __ata_port_for_each_link() is used.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:32 -04:00
Tejun Heo
dbd826168d libata-link: implement ata_link_abort()
Implement ata_link_abort().

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:31 -04:00
Tejun Heo
0260731f01 libata-link: linkify config/EH related functions
Make the following functions deal with ata_link instead of ata_port.

* ata_set_mode()
* ata_eh_autopsy() and related functions
* ata_eh_report() and related functions
* suspend/resume related functions
* ata_eh_recover() and related functions

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:31 -04:00
Tejun Heo
cc0680a580 libata-link: linkify reset
Make reset methods and related functions deal with ata_link instead of
ata_port.

* ata_do_reset()
* ata_eh_reset()
* all prereset/reset/postreset methods and related functions

This patch introduces no behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:31 -04:00
Tejun Heo
955e57dfde libata-link: linkify EH action helpers
Make ata_eh_about_to_do() and ata_eh_done() deal with ata_link instead
of ata_port.

This patch introduces no behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:30 -04:00
Tejun Heo
936fd73286 libata-link: linkify PHY-related functions
Make the following PHY-related functions to deal with ata_link instead
of ata_port.

* sata_print_link_status()
* sata_down_spd_limit()
* ata_set_sata_spd_limit() and friends
* sata_link_debounce/resume()
* sata_scr_valid/read/write/write_flush()
* ata_link_on/offline()

This patch introduces no behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:30 -04:00
Tejun Heo
f58229f806 libata-link: implement and use link/device iterators
Multiple links and different number of devices per link should be
considered to iterate over links and devices.  This patch implements
and uses link and device iterators - ata_port_for_each_link() and
ata_link_for_each_dev() - and ata_link_max_devices().

This change makes a lot of functions iterate over only possible
devices instead of from dev 0 to dev ATA_MAX_DEVICES.  All such
changes have been examined and nothing should be broken.

While at it, add a separating comment before device helpers to
distinguish them better from link helpers and others.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:30 -04:00
Tejun Heo
9af5c9c97d libata-link: introduce ata_link
Introduce ata_link.  It abstracts PHY and sits between ata_port and
ata_device.  This new level of abstraction is necessary to support
SATA Port Multiplier, which basically adds a bunch of links (PHYs) to
a ATA host port.  Fields related to command execution, spd_limit and
EH are per-link and thus moved to ata_link.

This patch only defines the host link.  Multiple link handling will be
added later.  Also, a lot of ap->link derefences are added but many of
them will be removed as each part is converted to deal directly with
ata_link instead of ata_port.

This patch introduces no behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12 14:55:30 -04:00
Tejun Heo
5ddf24c5ea libata: implement EH fast drain
In most cases, when EH is scheduled, all in-flight commands are
aborted causing EH to kick in immediately.  However, in some cases
(especially with PMP), it's unclear which commands are affected by the
error condition and although aborting all in-flight commands work, it
isn't optimal and may cause unnecessary disruption.  On the other
hand, waiting for in-flight commands to drain themselves can take up
to 30seconds.

This patch implements EH fast drain to handle such situations.  It
gives in-flight commands some time to finish up but doesn't wait for
too long.  After EH is scheduled, fast drain timer is started and if
no other completion occurs in ATA_EH_FASTDRAIN_INTERVAL all in-flight
commands are aborted.  If any completion occurred in the interval, the
port is given another interval to finish up itself.

Currently ATA_EH_FASTDRAIN_INTERVAL is 3 secs which should be enough
for finishing up most commands.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20 08:26:26 -04:00
Tejun Heo
4e57c517b3 libata: schedule probing after SError access failure during autopsy
If SError isn't accessible, EH can't tell whether hotplug has happened
or not.  Report SError read failure with AC_ERR_OTHER and schedule
probing with hardreset.  This will be mainly useful for PMPs.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20 08:26:26 -04:00
Tejun Heo
fccb6ea5c2 libata: clear HOTPLUG flag after a reset
ATA_EHI_HOTPLUGGED is a hint for reset functions indicating the the
port might have gone through hotplug/unplug just before entering EH.
Reset functions modify their behaviors a bit to handle the situation
better - e.g. using longer debouncing delay.

Currently, once HOTPLUG is set, it isn't cleared till the end of EH.
This is unnecessary and makes EH take longer.  Clear the HOTPLUGGED
flag after a reset try (successful or not).

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20 08:26:25 -04:00
Tejun Heo
f1545154a5 libata: quickly trigger SATA SPD down after debouncing failed
Debouncing failure is a good indicator of basic link problem.  Use
-EPIPE to indicate debouncing failure and make ata_eh_reset() invoke
sata_down_spd_limit() if the error occurs during reset.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20 08:19:06 -04:00
Tejun Heo
008a78961e libata: improve SATA PHY speed down logic
sata_down_spd_limit() first reads the current SPD from SStatus and
limit the speed to the lower one of one below the current limit or one
below the current SPD in SStatus.  SPD may not be accessible or valid
when SPD down is requested making sata_down_spd_limit() fail when it's
most needed.

This patch makes the current SPD cached after each successful reset
and forces GEN I speed (1.5Gbps) if neither of SStatus or the cached
value is valid, so sata_down_spd_limit() is now guaranteed to lower
the speed limit if lower speed is available.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20 08:19:05 -04:00
Tejun Heo
5335b72906 libata: implement AC_ERR_NCQ
When an NCQ command fails, all commands in flight are aborted and the
offending one is reported using log page 10h.  Depending on controller
characteristics and LLD implementation, all commands may appear as
having a device error due to shared TF status making it hard to
determine what's actually going on.

This patch adds AC_ERR_NCQ, marks the command reported by log page 10h
with it and print extra "<F>" after the error report for the command
to help distinguishing the offending command.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20 08:02:11 -04:00
Tejun Heo
b64bbc39f2 libata: improve EH report formatting
Requiring LLDs to format multiple error description messages properly
doesn't work too well.  Help LLDs a bit by making ata_ehi_push_desc()
insert ", " on each invocation.  __ata_ehi_push_desc() is the raw
version without the automatic separator.

While at it, make ehi_desc interface proper functions instead of
macros.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20 08:02:11 -04:00
Tejun Heo
fee7ca72d3 libata-link: separate out ata_eh_handle_dev_fail()
Separate out ata_eh_handle_dev_fail() from ata_eh_recover().  This is
in preparation of ata_link and PMP support.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-10 21:46:03 -04:00
Tejun Heo
64578a3de7 libata-acpi: implement _GTM/_STM support
Implement _GTM/_STM support.  acpi_gtm is added to ata_port which
stores _GTM parameters over suspend/resume cycle.  A new hook
ata_acpi_on_suspend() is responsible for storing _GTM parameters
during suspend.  _STM is executed in ata_acpi_on_resume().  With this
change, invoking _GTF is safe on IDE hierarchy and acpi_sata check
before _GTF is removed.

ata_acpi_gtm() and ata_acpi_stm() implementation is taken from Alan
Cox's pata_acpi implementation.  ata_acpi_gtm() is fixed such that the
result parameter is not shifted by sizeof(union acpi_object).

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-09 12:17:32 -04:00
Tejun Heo
6746544c3b libata: reimplement ACPI invocation
This patch reimplements ACPI invocation such that, instead of
exporting ACPI details to the rest of libata, ACPI event handlers -
ata_acpi_on_resume() and ata_acpi_on_devcfg() - are used.  These two
functions are responsible for determining whether specific ACPI method
is used and when.

On resume, _GTF is scheduled by setting ATA_DFLAG_ACPI_PENDING device
flag.  This is done this way to avoid performing the action on wrong
device device (device swapping while suspended).

On every ata_dev_configure(), ata_acpi_on_devcfg() is called, which
performs _SDD and _GTF.  _GTF is performed only after resuming and, if
SATA, hardreset as the ACPI spec specifies.  As _GTF may contain
arbitrary commands, IDENTIFY page is re-read after _GTF taskfiles are
executed.

If one of ACPI methods fails, ata_acpi_on_devcfg() retries on the
first failure.  If it fails again on the second try, ACPI is disabled
on the device.  Note that successful configuration clears ACPI failed
status.

With all feature checks moved to the above two functions,
do_drive_set_taskfiles() is trivial and thus collapsed into
ata_acpi_exec_tfs(), which is now static and converted to return the
number of executed taskfiles to be used by ata_acpi_on_resume().  As
failures are handled properly, ata_acpi_push_id() now returns -errno
on errors instead of unconditional zero.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-09 12:17:31 -04:00
Tejun Heo
914616a3c2 libata: fix infinite EH waiting bug
When EH gives up after repeated exceptions, it doesn't't clear the
PENDING bit on exit which leaves PENDING bit set without EH actually
scheduled.  This makes ata_port_wait_eh() to wait forever makes rmmod
hang on such port.  Fix it by clearing the flag.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-06-27 02:44:21 -04:00
Tejun Heo
8b5bb2fa3d libata: remove unused variable from ata_eh_reset()
Removed unused variable did_followup_srst from ata_eh_reset().

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-06-27 02:44:20 -04:00
Tejun Heo
8af500bc7f libata: kill non-sense warning message
prereset() is now allowed to set flag for unsupported reset method.
EH layer is responsible for selecting the fallback.  Remove non-sense
warning message.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-06-27 02:44:19 -04:00
Jeff Garzik
a617c09f6d libata: Trim trailing whitespace
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-21 20:14:23 -04:00
Tejun Heo
8575b81409 libata: give devices one last chance even if recovery failed with -EINVAL
After certain errors, some devices report complete garbage on
IDENTIFY.  This can cause ata_dev_read_id() to fail with -EINVAL
resulting in immediate disabling of the device.  Give the device one
last chance after -EINVAL to allow recovery from such situations.  As
-EINVAL is triggered very rarely, this shouldn't cause any noticeable
affect on more common error paths.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Harald Dunkel <harald.dunkel@t-online.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-11 18:09:18 -04:00
Tejun Heo
f4d6d00466 libata: ignore EH scheduling during initialization
libata enables SCSI host during ATA host activation which happens
after IRQ handler is registered and IRQ is enabled.  All ATA ports are
in frozen state when IRQ is enabled but frozen ports may raise limited
number of IRQs after being frozen - IOW, ->freeze() is not responsible
for clearing pending IRQs.  During normal operation, the IRQ handler
is responsible for clearing spurious IRQs on frozen ports and it
usually doesn't require any extra code.

Unfortunately, during host initialization, the IRQ handler can end up
scheduling EH for a port whose SCSI host isn't initialized yet.  This
results in OOPS in the SCSI midlayer.  This is relatively short window
and scheduling EH for probing is the first thing libata does after
initialization, so ignoring EH scheduling until initialization is
complete solves the problem nicely.

This problem was spotted by Berck E. Nash in the following thread.

  http://thread.gmane.org/gmane.linux.kernel/519412

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Berck E. Nash <flyboy@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-11 18:09:18 -04:00
Tejun Heo
9666f4009c libata: reimplement suspend/resume support using sdev->manage_start_stop
Reimplement suspend/resume support using sdev->manage_start_stop.

* Device suspend/resume is now SCSI layer's responsibility and the
  code is simplified a lot.

* DPM is dropped.  This also simplifies code a lot.  Suspend/resume
  status is port-wide now.

* ata_scsi_device_suspend/resume() and ata_dev_ready() removed.

* Resume now has to wait for disk to spin up before proceeding.  I
  couldn't find easy way out as libata is in EH waiting for the
  disk to be ready and sd is waiting for EH to complete to issue
  START_STOP.

* sdev->manage_start_stop is set to 1 in ata_scsi_slave_config().
  This fixes spindown on shutdown and suspend-to-disk.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-11 18:01:03 -04:00
Tejun Heo
31daabda16 libata: reimplement reset sequencing
libata previously depended upon waits in prereset to get resets after
hotplug right for both spin up and device ready wait.  This was
necessary both for reliablity and speed as reset was likely to fail if
initiated too early and each try usually took more than 30secs to
fail.  Previous patches fixed the reliability part by fixing status
and SCR handling in resets.  This patch remedies the speed part by
improving reset sequencing.

Prereset waiting timeout is adjusted to 10s because spinup wait is
replaced by reset sequencing and !BSY wait is not as important as
before.  During boot or module loading where the drive is already
fully spun up, !BSY wait succeeds immediately, so 10s should be enough
in most cases.  It matters after hotplugging or other error
conditions, but in those cases, !BSY wait in prereset simply can't be
relied upon due to the varied and weird behaviors ATA controllers and
devices show.

Reset is now driven by ata_eh_reset_timeouts[] table which contains
timeouts for each reset try.  The first reset can be softreset but the
following ones are always hardreset if available.  Each timeout
defines deadline for the reset try.  If a reset try fails, reset is
retried with the next timeout till the end of the timeout table is
reached.  If a reset try fails before the timeout with error, libata
waits till the deadline of the failed try before retrying.

IOW, the timeout table defines timetable of reset tries such that the
n'th try always begins at least after the sum of all previous timeouts
has passed.  The current timetable defines 4 tries and takes around 1
minute.

@0	: First try.  This should succeed most of the time during boot.
@10	: 10s is enough to spin up most consumer harddrives.  Give it
	  another shot.
@20	: 20s should spin up > 99% of working drives.  This has 30s
	  timeout for retarded devices needing long idleness post reset.
@55	: Final try with 5s timeout just in case.

The above timetable is trade off between not annoying the device too
much with frequent resets and taking reasonable amount of time in most
cases.  Some controllers may do better with shorter timeouts while
others may fare better with longer but we just can't rely upon LLD
writers to test each controller with wide variety of devices using
various scenarios.  We need default behavior which reasonably fits
most cases.

I've tested the above timetable on a dozen SATA controllers and a few
PATA controllers with about a dozen different drives from all major
vendors and 4 different ODDs from three different vendors for both
boot and hotplug (if available) cases.

Boot probing is not affected unless the device is broken in which
cases new code gives up on the port after a minute rather than five or
nine minutes.  When hotplugging, most devices get detected on the
first or second try.  Multi-platter drives with long spin up time
which sometimes took > 40 secs with the original code, now usually
comes up during the second try and at least right after the third try
@20.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-01 07:49:54 -04:00
Tejun Heo
d4b2bab4f2 libata: add deadline support to prereset and reset methods
Add @deadline to prereset and reset methods and make them honor it.
ata_wait_ready() which directly takes @deadline is implemented to be
used as the wait function.  This patch is in preparation for EH timing
improvements.

* ata_wait_ready() never does busy sleep.  It's only used from EH and
  no wait in EH is that urgent.  This function also prints 'be
  patient' message automatically after 5 secs of waiting if more than
  3 secs is remaining till deadline.

* ata_bus_post_reset() now fails with error code if any of its wait
  fails.  This is important because earlier reset tries will have
  shorter timeout than the spec requires.  If a device fails to
  respond before the short timeout, reset should be retried with
  longer timeout rather than silently ignoring the device.

  There are three behavior differences.

  1. Timeout is applied to both devices at once, not separately.  This
     is more consistent with what the spec says.

  2. When a device passes devchk but fails to become ready before
     deadline.  Previouly, post_reset would just succeed and let
     device classification remove the device.  New code fails the
     reset thus causing reset retry.  After a few times, EH will give
     up disabling the port.

  3. When slave device passes devchk but fails to become accessible
     (TF-wise) after reset.  Original code disables dev1 after 30s
     timeout and continues as if the device doesn't exist, while the
     patched code fails reset.  When this happens, new code fails
     reset on whole port rather than proceeding with only the primary
     device.

  If the failing device is suffering transient problems, new code
  retries reset which is a better behavior.  If the failing device is
  actually broken, the net effect is identical to it, but not to the
  other device sharing the channel.  In the previous code, reset would
  have succeeded after 30s thus detecting the working one.  In the new
  code, reset fails and whole port gets disabled.  IMO, it's a
  pathological case anyway (broken device sharing bus with working
  one) and doesn't really matter.

* ata_bus_softreset() is changed to return error code from
  ata_bus_post_reset().  It used to return 0 unconditionally.

* Spin up waiting is to be removed and not converted to honor
  deadline.

* To be on the safe side, deadline is set to 40s for the time being.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-01 07:49:53 -04:00
Tejun Heo
0d64a233fe libata: separate ATA_EHI_DID_RESET into DID_SOFTRESET and DID_HARDRESET
Separate ATA_EHI_DID_RESET into ATA_EHI_DID_SOFTRESET and
ATA_EHI_DID_HARDRESET.  ATA_EHI_DID_RESET is redefined as OR of the
two flags.  This patch doesn't introduce any behavior change.  This
will be used later to determine whether _SDD is necessary or not.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-28 14:51:33 -04:00
Tejun Heo
c1c4e8d557 libata: add missing call to ->cable_detect() in new EH path
->cable_detect() used to be called on by the old ata_bus_probe() path.
Add invocation to ata_eh_revalidate_and_attach() right after IDENTIFYs
are done.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-28 14:46:18 -04:00
Tejun Heo
a51d644af6 libata: improve AC_ERR_DEV handling for ->post_internal_cmd
->post_internal_cmd is simplified EH for internal commands.  Its
primary mission is to stop the controller such that no rogue memory
access or other activities occur after the internal command is
released.  It may provide error diagnostics by setting qc->err_mask
but this hasn't been a requirement.

To ignore SETXFER failure for CFA devices, libata needs to know
whether a command was failed by the device or for any other reason.
ie. internal command needs to get AC_ERR_DEV right.

This patch makes the following changes to AC_ERR_DEV handling and
->post_internal_cmd semantics to accomodate this need and simplify
callback implementation.

1. As long as the correct bits in the result TF registers are set,
   there is no need to set AC_ERR_DEV explicitly.  libata EH core
   takes care of that for both normal and internal commands.

2. The only requirement for ->post_internal_cmd() is to put the
   controller into quiescent state.  It needs not to set any err_mask.

3. ata_exec_internal_sg() performs minimal error analysis such that
   AC_ERR_DEV is automatically set as long as result_tf is filled
   correctly.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-28 14:16:02 -04:00
Tejun Heo
771b8dad96 libata: hardreset on SERR_INTERNAL
There was a rare report where SB600 reported SERR_INTERNAL and SRST
couldn't get it out of the failure mode.  Hardreset on SERR_INTERNAL.
As the problem is intermittent, whether this fixes the problem or not
hasn't been verified yet, but hardresetting the channel on internal
error is a good idea anyway.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-28 14:15:59 -04:00
Albert Lee
56287768e3 libata: Clear tf before doing request sense (take 3)
patch 2/4:
  Clear tf before doing request sense.

This fixes the AOpen 56X/AKH timeout problem.
(http://bugzilla.kernel.org/show_bug.cgi?id=8244)

Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-04 02:12:27 -04:00
Tejun Heo
8c3c52a8f0 libata: IDENTIFY backwards for drive side cable detection
For drive side cable detection to work correctly, drives need to be
identified backwards such that the slave device releases PDIAG- before
the mater drive tries to detect cable type.  ata_bus_probe() was fixed
by commit f31f0cc2f0 but the new EH path
wasn't fixed.  This patch makes new EH path do IDENTIFY backwards.

ata_dev_configure() for new devices are still performed master first.
This is to keep the detection messages in forward order.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-03-28 02:04:27 -04:00
Tejun Heo
4aa9ab67fb libata: don't whine if ->prereset() returns -ENOENT
->prereset() returns -ENOENT to tell libata that the port is empty and
reset sequencing should be stopped.  This is not an error condition.
Update ata_eh_reset() such that it sets device classes to ATA_DEV_NONE
and return success in on -ENOENT.  This makes spurious error message
go away.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-03-19 11:55:43 -04:00
Tejun Heo
6ffa01d88c libata: add CONFIG_PM to libata core layer
Conditionalize all PM related stuff in libata core layer using
CONFIG_PM.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-03-02 18:30:35 -05:00
Tejun Heo
44877b4e22 libata: s/ap->id/ap->print_id/g
ata_port has two different id fields - id and port_no.  id is
system-wide 1-based unique id for the port while port_no is 0-based
host-wide port number.  The former is primarily used to identify the
ATA port to the user in printk messages while the latter is used in
various places in libata core and LLDs to index the port inside the
host.

The two fields feel quite similar and sometimes ap->id is used in
place of ap->port_no, which is very difficult to spot.  This patch
renames ap->id to ap->print_id to reduce the possibility of such bugs.

Some printk messages are adjusted such that id string (ata%u[.%u])
isn't printed twice and/or to use ata_*_printk() instead of hardcoded
id format.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-21 04:58:20 -05:00
Tejun Heo
7d47e8d4d4 libata: put some intelligence into EH speed down sequence
The current EH speed down code is more of a proof that the EH
framework is capable of adjusting transfer speed in response to error.
This patch puts some intelligence into EH speed down sequence.  The
rules are..

* If there have been more than three timeout, HSM violation or
  unclassified DEV errors for known supported commands during last 10
  mins, NCQ is turned off.

* If there have been more than three timeout or HSM violation for known
  supported command, transfer mode is slowed down.  If DMA is active,
  it is first slowered by one grade (e.g. UDMA133->100).  If that
  doesn't help, it's slowered to 40c limit (UDMA33).  If PIO is
  active, it's slowered by one grade first.  If that doesn't help,
  PIO0 is forced.  Note that this rule does not change transfer mode.
  DMA is never degraded into PIO by this rule.

* If there have been more than ten ATA bus, timeout, HSM violation or
  unclassified device errors for known supported commands && speeding
  down DMA mode didn't help, the device is forced into PIO mode.  Note
  that this rule is considered only for PATA devices and is pretty
  difficult to trigger.

One error can only trigger one rule at a time.  After a rule is
triggered, error history is cleared such that the next speed down
happens only after some number of errors are accumulated.  This makes
sense because now speed down is done in bigger stride.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-21 04:58:16 -05:00
Tejun Heo
4ae72a1e46 libata: improve probe failure handling
* Move forcing device to PIO0 on device disable into
  ata_dev_disable().  This makes both old and new EHs act the same
  way.

* Speed down only PIO mode on probe failure.  All commands used during
  probing are PIO commands.  There's no point in speeding down DMA.

* Retry at least once after -ENODEV.  Some devices report garbled
  IDENTIFY data after certain events.  This shouldn't cause device
  detach and re-attach.

* Rearrange EH failure path for simplicity.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-21 04:58:16 -05:00
Tejun Heo
458337dbb1 libata: improve ata_down_xfermask_limit()
Make ata_down_xfermask_limit() accept @sel instead of @force_pio0.
@sel selects how the xfermask limit will be adjusted.  The following
selectors are defined.

* ATA_DNXFER_PIO	: only speed down PIO
* ATA_DNXFER_DMA	: only speed down DMA, don't cause transfer mode change
* ATA_DNXFER_40C	: apply 40c cable limit
* ATA_DNXFER_FORCE_PIO	: force PIO
* ATA_DNXFER_FORCE_PIO0	: force PIO0 (same as original with @force_pio0 == 1)
* ATA_DNXFER_ANY	: same as original with @force_pio0 == 0

Currently, only ANY and FORCE_PIO0 are used to maintain the original
behavior.  Other selectors will be used later to improve EH speed down
sequence.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-21 04:58:16 -05:00
Tejun Heo
726f0785b6 libata: kill qc->nsect and cursect
libata used two separate sets of variables to record request size and
current offset for ATA and ATAPI.  This is confusing and fragile.
This patch replaces qc->nsect/cursect with qc->nbytes/curbytes and
kills them.  Also, ata_pio_sector() is updated to use bytes for
qc->cursg_ofs instead of sectors.  The field used to be used in bytes
for ATAPI and in sectors for ATA.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:31 -05:00
Tejun Heo
03ee5b1cdd libata: fix ata_eh_suspend() return value
ata_eh_suspend() was returning 0 regardless of failure.  This bug has
potential to lose data on suspend.  Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-01-27 03:21:26 -05:00
Tejun Heo
79a55b72a1 libata: fix handling of port actions in per-dev action mask
libata EH ignores port-wide actions in per-dev action mask.  However,
device resume requests EH_SOFTRESET using per-dev action mask.  Under
certain circumstances, this results in not resetting frozen port after
resuming which causes failure of all commands.

This patch allows port-wide actions to be requested in per-dev action
mask.  Before EH recovery starts, port-wide actions will be collected.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-01-19 19:22:45 -05:00
David Howells
9db7372445 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/ata/libata-scsi.c
	include/linux/libata.h

Futher merge of Linus's head and compilation fixups.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-05 17:01:28 +00:00
Tejun Heo
800b399669 [PATCH] libata: always use polling IDENTIFY
libata switched to IRQ-driven IDENTIFY when IRQ-driven PIO was
introduced.  This has caused a lot of problems including device
misdetection and phantom device.

ATA_FLAG_DETECT_POLLING was added recently to selectively use polling
IDENTIFY on problemetic drivers but many controllers and devices are
affected by this problem and trying to adding ATA_FLAG_DETECT_POLLING
for each such case is diffcult and not very rewarding.

This patch makes libata always use polling IDENTIFY.  This is
consistent with libata's original behavior and drivers/ide's behavior.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-03 07:58:10 -05:00
Tejun Heo
a569a30d30 [PATCH] libata: don't request sense if the port is frozen
If EH command is issued to a frozen port, it fails with AC_ERR_SYSTEM.
libata used to request sense even when the port is frozen needlessly
adding AC_ERR_SYSTEM to err_mask.  Don't do it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-12-03 17:56:23 +09:00
Tejun Heo
664e8503fe [PATCH] libata: print cdb[0] in failed qc report
Print cdb[0] in failed qc report.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-01 22:47:03 -05:00
Tejun Heo
8a93758170 [PATCH] libata: improve failed qc reporting
Improve failed qc reporting.  The original message didn't include the
actual command nor full error status and it was necessary to
temporarily patch the code to find out exactly which command is
causing problem.  This patch makes EH report full command and result
TFs along with data direction and length.  This change will make bug
reports more useful.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-01 22:45:55 -05:00