dm raid1: handle resync failures

Device-mapper mirroring currently takes a best effort approach to
recovery - failures during mirror synchronization are completely ignored.
This means that regions are marked 'in-sync' and 'clean' and removed
from the hash list.  Future reads and writes that query the region
will incorrectly interpret the region as in-sync.

This patch handles failures during the recovery process.  If a failure
occurs, the region is marked as 'not-in-sync' (aka RH_NOSYNC) and added
to a new list 'failed_recovered_regions'.

Regions on the 'failed_recovered_regions' list are not marked as 'clean'
upon removal from the list.  Furthermore, if the DM_RAID1_HANDLE_ERRORS
flag is set, the region is marked as 'not-in-sync'.  This action prevents
any future read-balancing from choosing an invalid device because of the
'not-in-sync' status.

If "handle_errors" is not specified when creating a mirror (leaving the
DM_RAID1_HANDLE_ERRORS flag unset), failures will be ignored exactly as they
would be without this patch.  This is to preserve backwards compatibility with
user-space tools, such as 'pvmove'.  However, since future read-balancing
policies will rely on the correct sync status of a region, a user must choose
"handle_errors" when using read-balancing.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 file changed