btrfs: reada: avoid undone reada extents in btrfs_reada_wait

The reada background works are not designed to finish all jobs
completely; a work will break out in the following cases:
1: When a device reaches its workload limit (MAX_IN_FLIGHT)
2: When total reads reach the max limit (10000)
3: When no device has any more queued jobs, which often happens
   in the DUP case

If all background works exit while jobs remain,
btrfs_reada_wait() will wait indefinitely.

This problem rarely happened in the old code, because:
1: Every work queued 2x new works
   So many works reduced the chance of undone jobs.
2: One work would keep looping up to 10000 times when there were no jobs
   This reduced the window of time with no thread running.

But after we fixed the above cases, "undone reada extents" happened
frequently.

Fix:
In btrfs_reada_wait(), check that at least one thread is running
whenever there are undone jobs.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>

commit: 4fe7a0e13864238fe5b4cc2640e963581f96429e
author: Zhao Lei <zhaolei@cn.fujitsu.com> Tue Jan 26 18:42:40 2016 +0800
committer: David Sterba <dsterba@suse.com> Thu Feb 18 10:27:23 2016 +0100
tree: 6bbeac8bc14b6fa2683c454cafea277f1be88022
parent: 2fefd5583f8b86171c898f90cadac7c09ccf9d73
diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c
index e97bc8e..5bcd567 100644
--- a/fs/btrfs/reada.c
+++ b/fs/btrfs/reada.c

@@ -953,8 +953,11 @@
 int btrfs_reada_wait(void *handle)
 {
 	struct reada_control *rc = handle;
+	struct btrfs_fs_info *fs_info = rc->root->fs_info;
 
 	while (atomic_read(&rc->elems)) {
+		if (!atomic_read(&fs_info->reada_works_cnt))
+			reada_start_machine(fs_info);
 		wait_event_timeout(rc->wait, atomic_read(&rc->elems) == 0,
 				   5 * HZ);
 		dump_devs(rc->root->fs_info,
@@ -971,9 +974,13 @@
 int btrfs_reada_wait(void *handle)
 {
 	struct reada_control *rc = handle;
+	struct btrfs_fs_info *fs_info = rc->root->fs_info;
 
 	while (atomic_read(&rc->elems)) {
-		wait_event(rc->wait, atomic_read(&rc->elems) == 0);
+		if (!atomic_read(&fs_info->reada_works_cnt))
+			reada_start_machine(fs_info);
+		wait_event_timeout(rc->wait, atomic_read(&rc->elems) == 0,
+				   (HZ + 9) / 10);
 	}
 
 	kref_put(&rc->refcnt, reada_control_release);
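
The stall described in the commit message, and the effect of the added check, can be illustrated with a small single-threaded userspace model. This is only a sketch under stated assumptions, not kernel code: `struct sim`, `sim_init()`, `start_machine()`, `tick()` and `sim_wait()` are hypothetical names invented for this example, one call to `tick()` stands in for one `wait_event_timeout()` period, and `budget` stands in for whatever limit makes a background work exit early.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; field names only loosely mirror the kernel's. */
struct sim {
	int elems;     /* outstanding extents, in the role of rc->elems */
	int works_cnt; /* live background works, in the role of reada_works_cnt */
	int budget;    /* extents one work completes before it exits early */
	int starts;    /* number of works started so far */
};

/* Begin with one background work already running. */
struct sim sim_init(int elems, int budget)
{
	struct sim s = { .elems = elems, .works_cnt = 1,
			 .budget = budget, .starts = 1 };
	return s;
}

/* Stand-in for reada_start_machine(): launch one more background work. */
void start_machine(struct sim *s)
{
	s->works_cnt++;
	s->starts++;
}

/* One timeout period: every live work handles up to `budget` extents, then exits. */
void tick(struct sim *s)
{
	assert(s->budget > 0);
	while (s->works_cnt > 0) {
		int done = s->elems < s->budget ? s->elems : s->budget;

		s->elems -= done;
		s->works_cnt--;
	}
}

/*
 * Model of the wait loop. With `restart` set, a work is started whenever
 * none is alive and extents remain, as in the patched loop. Returns the
 * number of periods waited, or -1 if `max_ticks` elapse with work left.
 */
int sim_wait(struct sim *s, bool restart, int max_ticks)
{
	int ticks = 0;

	while (s->elems > 0) {
		if (ticks == max_ticks)
			return -1;
		if (restart && s->works_cnt == 0)
			start_machine(s);
		tick(s);
		ticks++;
	}
	return ticks;
}
```

In this model, with 25 extents and a budget of 10, `sim_wait(&s, false, 100)` gives up with 15 extents still pending because the only work has exited, whereas `sim_wait(&s, true, 100)` finishes after three periods. The real code differs in that the works run concurrently and the waiter is also woken as soon as `rc->elems` reaches zero.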