[PATCH] drivers/md/dm-raid1.c: Fix inconsistent mirroring after interrupted recovery

dm-mirror has a potential data corruption problem: the on-disk log can show
that all disk contents are in sync while the actual contents of the disks
are not synchronized.  This problem occurs if the initial recovery (syncing)
is interrupted and resumed.

The attached patch fixes this problem.

Background:

rh_dec() changes the region state from RH_NOSYNC (out-of-sync) to RH_CLEAN
(in-sync), which results in the corresponding bit of clean_bits being set.

This is harmful if an on-disk log is used and the map is removed/suspended
before the initial sync is completed.  The clean_bits are written to the
on-disk log at map removal and, upon resume, are read back and copied to
sync_bits.  Since the recovery process consults sync_bits to find a region
to recover, a region whose state was changed from RH_NOSYNC to RH_CLEAN is
never recovered.
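
The log round trip described above can be sketched as a small standalone
model.  This is only an illustration under simplifying assumptions: the
toy_log struct, NR_REGIONS and the toy_* helpers are hypothetical names, not
the kernel's dirty-log code.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_REGIONS 4	/* hypothetical tiny mirror, for illustration only */

/* Simplified stand-in for the dirty log; not the kernel's structure. */
struct toy_log {
	bool clean_bits[NR_REGIONS];	/* saved to the on-disk log at removal */
	bool sync_bits[NR_REGIONS];	/* consulted by the recovery process */
};

/* On resume, the saved clean_bits are copied into sync_bits. */
static void toy_log_resume(struct toy_log *log)
{
	for (size_t i = 0; i < NR_REGIONS; i++)
		log->sync_bits[i] = log->clean_bits[i];
}

/* Recovery only visits regions whose sync bit is clear. */
static int toy_next_region_to_recover(const struct toy_log *log)
{
	for (size_t i = 0; i < NR_REGIONS; i++)
		if (!log->sync_bits[i])
			return (int)i;
	return -1;	/* nothing left to recover */
}
```

In this model, if a write completion sets the clean bit of a region that was
never recovered, that region is skipped after the next resume.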

If you haven't applied the dm-raid1-read-balancing.patch proposed on dm-devel
some time ago, the contents of the mirrored disk are just silently corrupted.
If you have, balanced reads may get bogus data from out-of-sync disks.

The patch keeps the RH_NOSYNC state unchanged.  The region will be changed to
RH_RECOVERING when recovery starts and will be reclaimed when the recovery
completes, so the region hash entry is not leaked.
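
The change in behaviour can be sketched as a standalone model of the state
transition taken when the last pending I/O on a region completes.  This is a
simplified illustration, not the driver code: the toy_* types and the boolean
list flags are hypothetical stand-ins for the region hash and its lists.

```c
#include <stdbool.h>

/* Hypothetical mirror of the RH_* region states. */
enum toy_state { TOY_CLEAN, TOY_DIRTY, TOY_NOSYNC, TOY_RECOVERING };

struct toy_region {
	enum toy_state state;
	int pending;		/* outstanding I/Os on the region */
	bool on_clean_list;	/* stand-in for rh->clean_regions */
	bool on_quiesced_list;	/* stand-in for rh->quiesced_regions */
};

/* Pre-patch behaviour: everything except RECOVERING becomes CLEAN. */
static void toy_dec_old(struct toy_region *reg)
{
	if (--reg->pending != 0)
		return;
	if (reg->state == TOY_RECOVERING) {
		reg->on_quiesced_list = true;
	} else {
		reg->state = TOY_CLEAN;
		reg->on_clean_list = true;
	}
}

/* Patched behaviour: only DIRTY becomes CLEAN; NOSYNC is left alone. */
static void toy_dec_new(struct toy_region *reg)
{
	if (--reg->pending != 0)
		return;
	if (reg->state == TOY_RECOVERING) {
		reg->on_quiesced_list = true;
	} else if (reg->state == TOY_DIRTY) {
		reg->state = TOY_CLEAN;
		reg->on_clean_list = true;
	}
	/* TOY_NOSYNC: do nothing, the region stays out of sync. */
}
```

With the old transition, a NOSYNC region ends up on the clean list after a
single write completes; with the new one it stays NOSYNC until recovery
picks it up.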

Description:

Keep RH_NOSYNC state unchanged when I/O on the region completes.

The RH_NOSYNC region will be changed to RH_RECOVERING when recovery starts
on the region and will be reclaimed when the recovery completes, so the
region hash entry is not leaked.

Alasdair said:

  I've analysed the relevant part of the state machine and I believe that
  the patch is correct.

  (Further work on this code is still needed - this patch has the
  side-effect of holding onto memory unnecessarily for long periods of time
  under certain workloads - but better that than corrupting data.)

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Author: Jun'ichi Nomura
Date: 2006-03-27 01:17:47 -08:00
Committed by: Linus Torvalds
Parent: 76df1c651b
Commit: 930d332a23

1 changed file with 13 additions and 1 deletion

drivers/md/dm-raid1.c
@@ -402,9 +402,21 @@ static void rh_dec(struct region_hash *rh, region_t region)

 	spin_lock_irqsave(&rh->region_lock, flags);
 	if (atomic_dec_and_test(&reg->pending)) {
+		/*
+		 * There is no pending I/O for this region.
+		 * We can move the region to corresponding list for next action.
+		 * At this point, the region is not yet connected to any list.
+		 *
+		 * If the state is RH_NOSYNC, the region should be kept off
+		 * from clean list.
+		 * The hash entry for RH_NOSYNC will remain in memory
+		 * until the region is recovered or the map is reloaded.
+		 */
+
+		/* do nothing for RH_NOSYNC */
 		if (reg->state == RH_RECOVERING) {
 			list_add_tail(&reg->list, &rh->quiesced_regions);
-		} else {
+		} else if (reg->state == RH_DIRTY) {
 			reg->state = RH_CLEAN;
 			list_add(&reg->list, &rh->clean_regions);
 		}
