Btrfs: fix race waiting for qgroup rescan worker

We were initializing the completion (fs_info->qgroup_rescan_completion)
object after releasing the qgroup rescan lock, which gives a small time
window for a rescan waiter to not actually wait for the rescan worker
to finish. Example:

         CPU 1                                                     CPU 2

 fs_info->qgroup_rescan_completion->done is 0

 btrfs_qgroup_rescan_worker()
   complete_all(&fs_info->qgroup_rescan_completion)
     sets fs_info->qgroup_rescan_completion->done
     to UINT_MAX / 2

 ... do some other stuff ....

 qgroup_rescan_init()
   mutex_lock(&fs_info->qgroup_rescan_lock)
   set flag BTRFS_QGROUP_STATUS_FLAG_RESCAN
     in fs_info->qgroup_flags
   mutex_unlock(&fs_info->qgroup_rescan_lock)

                                                       btrfs_qgroup_wait_for_completion()
                                                         mutex_lock(&fs_info->qgroup_rescan_lock)
                                                         sees flag BTRFS_QGROUP_STATUS_FLAG_RESCAN
                                                           in fs_info->qgroup_flags
                                                         mutex_unlock(&fs_info->qgroup_rescan_lock)

                                                         wait_for_completion_interruptible(
                                                           &fs_info->qgroup_rescan_completion)

                                                           fs_info->qgroup_rescan_completion->done
                                                           is > 0 so it returns immediately

  init_completion(&fs_info->qgroup_rescan_completion)
    sets fs_info->qgroup_rescan_completion->done to 0

So fix this by initializing the completion object while holding the mutex
fs_info->qgroup_rescan_lock.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
1 file changed