md: improve the is_mddev_idle test

During a 'resync' or similar activity, md checks if the devices in the
array are otherwise active and winds back resync activity when they are.
This test in done in is_mddev_idle, and it is somewhat fragile - it
sometimes thinks there is non-sync io when there isn't.

The test compares the total sectors of io (disk_stat_read) with the sectors
of resync io (disk->sync_io).  This has problems because total sectors gets
updated when a request completes, while resync io gets updated when the
request is submitted.  The time difference can cause large differenced
between the two which do not actually imply non-resync activity.  The test
currently allows for some fuzz (+/- 4096) but there are some cases when it
is not enough.

The test currently looks for any (non-fuzz) difference, either positive or
negative.  This clearly is not needed.  Any non-sync activity will cause
the total sectors to grow faster than the sync_io count (never slower) so
we only need to look for a positive differences.

If we do this then the amount of in-flight sync io will never cause the
appearance of non-sync IO.  Once enough non-sync IO to worry about starts
happening, resync will be slowed down and the measurements will thus be
more precise (as there is less in-flight) and control of resync will still
be suitably responsive.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
NeilBrown 2007-05-10 22:23:31 -07:00 committed by Linus Torvalds
parent 11fe250d89
commit 435b71be20

View file

@ -5103,7 +5103,7 @@ static int is_mddev_idle(mddev_t *mddev)
* *
* Note: the following is an unsigned comparison. * Note: the following is an unsigned comparison.
*/ */
if ((curr_events - rdev->last_events + 4096) > 8192) { if ((long)curr_events - (long)rdev->last_events > 4096) {
rdev->last_events = curr_events; rdev->last_events = curr_events;
idle = 0; idle = 0;
} }