This patch (as1098) changes the way ehci-hcd schedules its periodic
Iso transfers. That the current scheduling code is wrong is clear on
the face of it: Sometimes it returns -EL2NSYNC (meaning that an URB
couldn't be scheduled because it was submitted too late), but it does
this even when the URB_ISO_ASAP flag is set (meaning the URB should be
scheduled as soon as possible).
The new code properly implements as-soon-as-possible scheduling,
assigning the next unexpired slot as the URB's starting point. It
also is more careful about checking for Iso URB completion: It doesn't
bother to check for activity during frames that are already over,
and it allows for the possibility that some of the URB's packets may
have raced the hardware when they were submitted and so never got used
(the packet status is set to -EXDEV).
This fixes problems several people have experienced with USB video
applications.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1097) fixes a bug in the remote-wakeup handling in
ehci-hcd. The driver currently does not keep track of whether the
change-suspend feature is enabled for each port; the feature is
automatically reset the first time it is read. But recent changes to
the hub driver require that the feature be read at least twice in
order to work properly.
A bit-vector is added for storing the change-suspend feature values.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1096) fixes an annoying problem: When a full-speed or
low-speed device is plugged into an EHCI controller, it fails to
enumerate at high speed and then is handed over to the companion
controller. But usbcore logs a misleading and unwanted error message
when the high-speed enumeration fails.
The patch adds a new HCD method, port_handed_over, which asks whether
a port has been handed over to a companion controller. If it has, the
error message is suppressed.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: David Brownell <david-b@pacbell.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1095) cleans up the HCD glue and several of the EHCI
bus-glue files. The ehci->is_tdi_rh_tt flag is redundant, since it
means the same thing as the hcd->has_tt flag, so it is removed and the
other flag used in its place.
Some of the bus-glue files didn't get the relinquish_port method added
to their hc_driver structures. Although that routine currently
doesn't do anything for controllers with an integrated TT, in the
future it might. So the patch adds it where it is missing.
Lastly, some of the bus-glue files have erroneous entries for their
hc_driver's suspend and resume methods. These method pointers are
specific to PCI and shouldn't be used otherwise.
(The patch also includes an invisible whitespace fix.)
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
This patch (as1094) changes the output of the "descriptors" binary
attribute. Now it will contain the device descriptor followed by all
the configuration descriptors, not just the descriptor for the current
config.
Userspace libraries want to have access to the kernel's cached
descriptor information, so they can learn about device characteristics
without having to wake up suspended devices. So far the only user of
this attribute is the new libusb-1.0 library; thus changing its
contents shouldn't cause any problems.
This should be considered for 2.6.26, if for no other reason than to
minimize the range of releases in which the attribute contains only the
current config descriptor.
Also, it doesn't hurt that the patch removes the device locking --
which was formerly needed in order to know for certain which config was
indeed current.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is a potential deadlock when the usb_generic driver is unbound
from a device. The problem is that generic_disconnect() is called
with the device lock held, and it removes a bunch of device attributes
from sysfs. If a user task happens to be running an attribute method
at the time, the removal will block until the method returns. But at
least one of the attribute methods (the store routine for power/level)
needs to acquire the device lock!
This patch (as1093) eliminates the deadlock by moving the calls to
create and remove the sysfs attributes from the usb_generic driver
into usb_new_device() and usb_disconnect(), where they can be invoked
without holding the device lock.
Besides, the other sysfs attributes are created when the device is
registered and removed when the device is unregistered. So it seems
only fitting for the extra attributes to be created and removed at the
same time.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Uninitialised Apple iSight drivers present with a distinctive USB ID.
Once firmware has been uploaded, they disconnect and reconnect with a
new ID. At this point they can be driven by the uvcvideo driver. As this
is unique to the Apple cameras and not functionality shared by any other
UVC devices, it makes sense to provide the firmware loading
functionality in a separate driver. This driver will read an isight.fw
file extracted from the Apple driver using the tools at
http://bersace03.free.fr/ift/ and upload it to the camera. It will also
handle the case where the device loses its firmware during hibernation
and must have it reloaded.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds support for the range of PIDs
that have been allocated for FTDI based devices
at Matrix Orbital.
A small number of units have been shipped early 2008
with a faulty USB Descriptor. Products that may have
this issue have been marked with the existing quirk to
work around the problem.
Signed-off-by: R. Molenkamp <rmolenkamp@matrixorbital.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
pciehp: add message about pciehp_slot_with_bus option
pci hotplug core: add check of duplicate slot name
pciehp: move msleep after power off
pciehp: poll cmd completion if hotplug interrupt is disabled
pciehp: fix slow probing
pciehp: fix NULL dereference in interrupt handler
shpchp: add message about shpchp_slot_with_bus option
PCI: don't enable ASPM on devices with mixed PCIe/PCI functions
Some (broken?) platform assign the same slot name to multiple hotplug
slots. On such system, slot initialization would fail because of name
collision. The pciehp driver already have a "slot_with_bus" module
option which adds the bus number into the slot name. This patch adds
the message about this module option that will be displayed when slot
name collision is detected.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix the following errors reported by Jan C. Nordholz in
http://bugzilla.kernel.org/show_bug.cgi?id=10751.
kobject_add_internal failed for 2 with -EEXIST, don't try to register things with the same name in the same directory.
Pid: 1, comm: swapper Tainted: G W 2.6.26-rc3 #1
[<c0266980>] kobject_add_internal+0x140/0x190
[<c0266afd>] kobject_init_and_add+0x2d/0x40
[<c027bc91>] pci_hp_register+0x81/0x2f0
[<c027fd07>] pciehp_probe+0x1a7/0x470
[<c01b3b84>] sysfs_add_one+0x44/0xa0
[<c01b3c1f>] sysfs_addrm_start+0x3f/0xb0
[<c01b497a>] sysfs_create_link+0x8a/0xf0
[<c0279570>] pcie_port_probe_service+0x50/0x80
[<c02e0545>] driver_sysfs_add+0x55/0x70
[<c02e0662>] driver_probe_device+0x82/0x180
[<c02e07cc>] __driver_attach+0x6c/0x70
[<c02dfe0a>] bus_for_each_dev+0x3a/0x60
[<c05db2d0>] pcied_init+0x0/0x80
[<c02e04e6>] driver_attach+0x16/0x20
[<c02e0760>] __driver_attach+0x0/0x70
[<c02e0341>] bus_add_driver+0x1a1/0x220
[<c05db2d0>] pcied_init+0x0/0x80
[<c02e09cd>] driver_register+0x4d/0x120
[<c05db050>] ibm_acpiphp_init+0x0/0x190
[<c0125aab>] printk+0x1b/0x20
[<c05db2d0>] pcied_init+0x0/0x80
[<c05db2de>] pcied_init+0xe/0x80
[<c05c751a>] kernel_init+0x10a/0x300
[<c0120138>] schedule_tail+0x18/0x50
[<c0103b9a>] ret_from_fork+0x6/0x1c
[<c05c7410>] kernel_init+0x0/0x300
[<c05c7410>] kernel_init+0x0/0x300
[<c010485b>] kernel_thread_helper+0x7/0x1c
=======================
pci_hotplug: Unable to register kobject '2'<3>pciehp: pci_hp_register failed with error -22
Slot with the same name can be registered multiple times if shpchp or
pciehp driver is loaded after acpiphp is loaded because ACPI based
hotplug driver and Native OS hotplug driver trying to handle the same
physical slot. In this case, current pci_hotplug core will call
kobject_init_and_add() muliple time with the same name. This is the
cause of this problem. To fix this problem, this patch adds the check
into pci_hp_register() to see if the slot with the same name.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
According to the PCI Express specification, we must wait for at least
1 second after turning power off before taking any action that relies
on power having been removed from the slot/adapter. For this, current
pciehp wait for 1 second after issuing the power off command in
hpc_power_off_slot() function. But waiting for 1 second in
hpc_power_off_slot() can make pciehp probing slow-down because pciehp
probe code calls hpc_power_off_slot() if the slot is not occupied just
in case. We don't need to wait for 1 second at the pciehp probe time
because there is no action on that empty slot. So move 1 second wait
from hpc_power_off_slot() to the caller of hpc_power_off_slot().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix improper long wait for command completion in pciehp probing.
As described in PCI Express specification, software notification is
not generated if the command that occurs as a result of a write to the
Slot Control register that disables software notification of command
completed events. Since pciehp driver doesn't take it into account,
such command is issued in pciehp probing, and it causes improper long
wait for command completion.
This patch changes the pciehp driver to take such command into
account.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix the "pciehp probing slow" problem reported from Jan C. Nordholz in
http://bugzilla.kernel.org/show_bug.cgi?id=10751.
The command completed bit in Slot Status register applies only to
commands issued to control the attention indicator, power indicator,
power controller, or electromechanical interlock. However, writes to
other parts of the Slot Control register would end up writing to the
control fields. Hence, any write to Slot Control register is
considered as a command. However, if the controller doesn't support
any of attention indicator, power indicator, power controller and
electromechanical interlock, command completed bit would not set in
writing to Slot Control register. In this case, we should not wait for
command completed bit set, otherwise all commands would be considered
not completed in timeout seconds (1 sec.).
The cause of the problem is pciehp driver didn't take this situation
into account. This patch changes pciehp to take it into account. This
patch also add the check for "No Command Completed Support" bit in
Slot Capability register. If it is set, we should not wait for command
completed bit set as well.
This problem seems to be revealed by the commit
c27fb883df that fixed the bug that
pciehp did not wait for command completed properly (pciehp just
ignored the command completion event).
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some (broken?) platform assign the same slot name to multiple hotplug
slots. On such system, slot initialization would fail because of name
collision. The shpchp driver already have a "slot_with_bus" module
option which adds the bus number into the slot name. This patch adds
the message about this module option that will be displayed when slot
name collision is detected.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
There's a reason why using C99 initialisers even in the supposedly
trivial structs is a good idea.
Signed-off-by: Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (52 commits)
vlan: Use bitmask of feature flags instead of seperate feature bits
fmvj18x_cs: add NextCom NC5310 rev B support
xirc2ps_cs: re-initialize the multicast address in do_reset
3C509: rx_bytes should not be increased when alloc_skb failed
NETFRONT: Use __skb_queue_purge()
VIRTIO: Use __skb_queue_purge()
phylib: do EXPORT_SYMBOL on get_phy_id
netlink: Fix nla_parse_nested_compat() to call nla_parse() directly
WAN: protect HDLC proto list while insmod/rmmod
drivers/net/fs_enet: remove null pointer dereference
S2io: Version update for napi and MSI-X patches
S2io: Added napi support when MSIX is enabled.
S2io: Move all the transmit completions to a single msi-x (alarm) vector
drivers/net/ehea - remove unnecessary memset after kzalloc
au1000_eth: remove useless check
Blackfin EMAC Driver: Removed duplicated include <linux/ethtool.h>
cpmac bugfixes and enhancements
e1000e: use resource_size_t, not unsigned long, for phys addrs
net/usb: add support for Apple USB Ethernet Adapter
uli526x: add support for netpoll
...
The tuner driver used to change i2c_client.name for its own needs, but
it really shouldn't, as this field is used by i2c-core to do the
device/driver matching. So, create and use a separate field for the
tuner driver needs.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Add the Intel ICH9DO controller ID's for the iTCO_wdt kernel driver and bump
the driver version.
Tested on an P5E-VM DO ASUS motherboard.
Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
On Book-E SMP systems each core has its own private watchdog. If only one
watchdog is enabled, when the core that doesn't enable the watchdog is hung,
system can't reset because no watchdog is running on it. That's bad. It
means we must enable watchdogs on both cores.
We can use smp_call_function() to send appropriate messages to all the other
cores to enable and update the watchdog.
Signed-off-by: Chen Gong <g.chen@freescale.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a watchdog timer based on the MFGPT timers in the CS5535/CS5536
companion chips to the AMD Geode GX and LX processors. Only caveat
is that the BIOS must provide at least a one free timer, and most
do not.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
I need to just return in case it's not my NMI so someone else can take a look
at it (and reset die_nmi_called to 0 in case I actually do get one that's mine
to handle).
Signed-off-by: Thomas Mingarelli <thomas.mingarelli@hp.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
- split platform device/driver registering from actual watchdog device/driver
registering so that we can cleanly load/unload
- fixup __initdata with __initconst and __devinitdata with __devinitconst
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Pádraig Brady requested the possibility of not disabling the watchdog
at module load time or kernel boot time if it had been previously enabled
in the bios. It may help rebooting the machine if it freezes before the
userland daemon kicks in.
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Cc: Pádraig Brady <P@draigBrady.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Some non-exported functions always returned 0. Mark them void instead.
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Somehow the spidev code forgot to include a critical mechanism: when the
underlying device is removed (e.g. spi_master rmmod), open file
descriptors must be prevented from issuing new I/O requests to that
device. On penalty of the oopsing reported by Sebastian Siewior
<bigeasy@tglx.de> ...
This is a partial fix, adding handshaking between the lower level (SPI
messaging) and the file operations using the spi_dev. (It also fixes an
issue where reads and writes didn't return the number of bytes sent or
received.)
There's still a refcounting issue to be addressed (separately).
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Reported-by: Sebastian Siewior <bigeasy@tglx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
including of <asm/mpc85xx.h> causes build problems since it doesn't exist.
Also removed warning:
drivers/edac/mpc85xx_edac.c:45: warning: 'mpc85xx_ctl_name' defined but not used
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Acked-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a correct MODULE_ALIAS() entry for this driver to enable udev module
loading.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the old changelog entries which are now out of date and should be
extractable from git anyway. Also tidy up the copyright for the driver.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the following warning by checking the result of device_create_file and
printing an error but not removing the device (loss of debug registers is
not fatal).
drivers/video/s3c2410fb.c:905: warning: ignoring return value of 'device_create_file', declared with attribute warn_unused_result
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a blank level of FB_BLANK_POWERDOWN is used, we should shut down the
controller so that it no longer tries to produce any panel signals or
data, and shuts down the DMA which is not needed.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To keep backwards compatibility, reverse the meanings of these flags so
that when they are not set, the driver uses the original behvaiour.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
enable_irq_wake() and disable_irq_wake() need to be balanced. However,
serial_core.c calls these for different conditions during the suspend and
resume functions...
This is causing a regular WARN_ON() as found at
http://www.kerneloops.org/search.php?search=set_irq_wake
This patch makes the conditions for triggering the _wake enable/disable
sequence identical.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In 2.6.25, ramdisk devices show up in /proc/partitions, which is a
behaviour change from the old rd.c. Add GENHD_FL_SUPPRESS_PARTITION_INFO,
which was present in rd.c.
All kernels prior to 2.6.25 weren't displaying ramdisks in
/proc/partitions. Since there are many userspace tools using information
from /proc/partitions some of them may now behave incorrectly (I didn't
tested any though). For example before 2.6.25 /proc/partitions was empty
if no block devices like hard disks and such were detected by kernel. Now
all 16 ramdisks are always visible there. Some software may rely on such
information (I mean, on empty /proc/partitions).
There was quite similar situation back in 2004, and ramdisks were excluded
back from displaying. Thats why I called this a regression (maybe a bit
unfortunate). See this patch for info:
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.3-rc2/2.6.3-rc2-mm1/broken-out/nbd-proc-partitions-fix.patch
I also think that someone somewhere (long time ago) excluded ramdisks from
/proc/partitions for good reasons. It is possible that now such new
"feature" is harmless, but I think there are more chances that someone
will say "hey, /proc/partitions has changed, now my software doesn't work"
then "hey where did my new 2.6.25 feature go". nbd devices are also
excluded, maybe for very same (unknown to me) reasons.
Signed-off-by: Marcin Krol <hawk@pld-linux.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This doesn't need to be two modules, and making it one cleans up the
problem
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The last gpio belonging to a chip is chip->base + chip->ngpios - 1. Some
places in the code, but not all, forgot the critical minus one.
Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The return value of mcp23s08_read_regs() can only be evaluated when signed
Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Teach drivers/gpio/pca953x.c about PCA9554, another compatible chip.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we get any IO error during a recovery (rebuilding a spare), we abort
the recovery and restart it.
For RAID6 (and multi-drive RAID1) it may not be best to restart at the
beginning: when multiple failures can be tolerated, the recovery may be
able to continue and re-doing all that has already been done doesn't make
sense.
We already have the infrastructure to record where a recovery is up to
and restart from there, but it is not being used properly.
This is because:
- We sometimes abort with MD_RECOVERY_ERR rather than just MD_RECOVERY_INTR,
which causes the recovery not be be checkpointed.
- We remove spares and then re-added them which loses important state
information.
The distinction between MD_RECOVERY_ERR and MD_RECOVERY_INTR really isn't
needed. If there is an error, the relevant drive will be marked as
Faulty, and that is enough to ensure correct handling of the error. So we
first remove MD_RECOVERY_ERR, changing some of the uses of it to
MD_RECOVERY_INTR.
Then we cause the attempt to remove a non-faulty device from an array to
fail (unless recovery is impossible as the array is too degraded). Then
when remove_and_add_spares attempts to remove the devices on which
recovery can continue, it will fail, they will remain in place, and
recovery will continue on them as desired.
Issue: If we are halfway through rebuilding a spare and another drive
fails, and a new spare is immediately available, do we want to:
1/ complete the current rebuild, then go back and rebuild the new spare or
2/ restart the rebuild from the start and rebuild both devices in
parallel.
Both options can be argued for. The code currently takes option 2 as
a/ this requires least code change
b/ this results in a minimally-degraded array in minimal time.
Cc: "Eivind Sarto" <ivan@kasenna.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In some configurations, a raid6 resync can be limited by CPU speed
(Calculating P and Q and moving data) rather than by device speed. In
these cases there is nothing to be gained byt serialising resync of arrays
that share a device, and doing the resync in parallel can provide benefit.
So add a sysfs tunable to flag an array as being allowed to resync in
parallel with other arrays that use (a different part of) the same device.
Signed-off-by: Bernd Schubert <bs@q-leap.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This additional notification to 'array_state' is needed to allow the
monitor application to learn about stop events via sysfs. The
sysfs_notify("sync_action") call that comes at the end of do_md_stop()
(via md_new_event) is insufficient since the 'sync_action' attribute has
been removed by this point.
(Seems like a sysfs-notify-on-removal patch is a better fix. Currently
removal updates the event count but does not wake up waiters)
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When an array enters write pending, 'array_state' changes, so we must be
sure to sysfs_notify.
Also, when waiting for user-space to acknowledge 'write-pending' by
marking the metadata as dirty, we don't want to wait for MD_CHANGE_DEVS to
be cleared as that might not happen. So explicity test for the bits that
we are really interested in.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When performing a "recovery" or "check" pass on a RAID1 array, we read
from each device and possible, if there is a difference or a read error,
write back to some devices.
We use the same 'bio' for both read and write, resetting various fields
between the two operations.
We forgot to reset bv_offset and bv_len however. These are often left
unchanged, but in the case where there is an IO error one or two sectors
into a page, they are changed.
This results in correctable errors not being corrected properly. It does
not result in any data corruption.
Cc: "Fairbanks, David" <David.Fairbanks@stratus.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Last night we had scsi problems and a hardware raid unit was offlined
during heavy i/o. While this happened we got for about 3 minutes a huge
number messages like these
Apr 12 03:36:07 pfs1n14 kernel: [197510.696595] raid5:md7: read error not correctable (sector 2993096568 on sdj2).
I guess the high error rate is responsible for not scheduling other events
- during this time the system was not pingable and in the end also other
devices run into scsi command timeouts causing problems on these unrelated
devices as well.
Signed-off-by: Bernd Schubert <bernd-schubert@gmx.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kill the trivial and rather pointless file_path wrapper around d_path.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is possible to add a write-intent bitmap to an active array, or remove
the bitmap that is there.
When we do with the 'quiesce' the array, which causes make_request to
block in "wait_barrier()".
However we are sampling the value of "mddev->bitmap" before the
wait_barrier call, and using it afterwards. This can result in using a
bitmap structure that has been freed.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>