blk-mq: fix waitqueue_active without memory barrier in block/blk-mq-tag.c

blk_mq_tag_update_depth() seems to be missing a memory barrier which
might cause the waker to not notice the waiter and fail to send a
wake_up as in the following figure.

	blk_mq_tag_update_depth			bt_get
------------------------------------------------------------------------
if (waitqueue_active(&bs->wait))
/* The CPU might reorder the test for
   the waitqueue up here, before
   prior writes complete */
					prepare_to_wait(&bs->wait, &wait,
					  TASK_UNINTERRUPTIBLE);
					tag = __bt_get(hctx, bt, last_tag,
					  tags);
					/* Value set in bt_update_count not
					   visible yet */
bt_update_count(&tags->bitmap_tags, tdepth);
/* blk_mq_tag_wakeup_all(tags, false); */
 bt = &tags->bitmap_tags;
 wake_index = atomic_read(&bt->wake_index);
					...
					io_schedule();
------------------------------------------------------------------------

This patch adds the missing memory barrier.

I found this issue when I was looking through the linux source code
for places calling waitqueue_active() before wake_up*(), but without
preceding memory barriers, after sending a patch to fix a similar
issue in drivers/tty/n_tty.c  (Details about the original issue can be
found here: https://lkml.org/lkml/2015/9/28/849).

Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index ed96474..7a6b6e2 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -75,6 +75,10 @@
 	struct blk_mq_bitmap_tags *bt;
 	int i, wake_index;
 
+	/*
+	 * Make sure all changes prior to this are visible from other CPUs.
+	 */
+	smp_mb();
 	bt = &tags->bitmap_tags;
 	wake_index = atomic_read(&bt->wake_index);
 	for (i = 0; i < BT_WAIT_QUEUES; i++) {