blk-throttle: make sure expire time isn't too big

[ Upstream commit 06cceedcca67a93ac7f7aa93bbd9980c7496d14e ]

cgroup could be throttled to a limit but when all cgroups cross high
limit, queue enters a higher state and so the group should be throttled
to a higher limit. It's possible the cgroup is sleeping because of
throttle and other cgroups don't dispatch IO any more. In this case,
nobody can trigger current downgrade/upgrade logic. To fix this issue,
we could either set up a timer to wakeup the cgroup if other cgroups are
idle or make sure this cgroup doesn't sleep too long. Setting up a timer
means we must change the timer very frequently. This patch chooses the
latter. Making cgroup sleep time not too big wouldn't change cgroup
bps/iops, but could make it wakeup more frequently, which isn't a big
issue because throtl_slice * 8 is already quite big.

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit: 94b3df54ad6d451a0158681758a0996947b5134d [log] [tgz]
author: Shaohua Li <shli@fb.com> Mon Mar 27 10:51:36 2017 -0700
committer: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Thu Mar 22 09:17:44 2018 +0100
tree: 182aacc82b2b8f264ad12c32e913688d906bbf0a
parent: 0030b37be54786e156df0ce8ad99b4c1ead682aa [diff] [blame]
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index a3ea826..3a4c9a3 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c

@@ -499,6 +499,17 @@
 static void throtl_schedule_pending_timer(struct throtl_service_queue *sq,
 					  unsigned long expires)
 {
+	unsigned long max_expire = jiffies + 8 * throtl_slice;
+
+	/*
+	 * Since we are adjusting the throttle limit dynamically, the sleep
+	 * time calculated according to previous limit might be invalid. It's
+	 * possible the cgroup sleep time is very long and no other cgroups
+	 * have IO running so notify the limit changes. Make sure the cgroup
+	 * doesn't sleep too long to avoid the missed notification.
+	 */
+	if (time_after(expires, max_expire))
+		expires = max_expire;
 	mod_timer(&sq->pending_timer, expires);
 	throtl_log(sq, "schedule timer. delay=%lu jiffies=%lu",
 		   expires - jiffies, jiffies);
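
The clamp added in the hunk above can be sketched in userspace C as follows. This is an illustration only, not kernel code: `time_after_sketch` and `clamp_expires` are hypothetical names, `now` stands in for `jiffies`, and `slice` stands in for `throtl_slice`. The comparison mirrors the wraparound-safe signed-difference idiom behind the kernel's `time_after()`, which is why the patch uses that macro rather than a plain `>`.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical userspace stand-in for the kernel's time_after(a, b):
 * true if tick count a is later than b. The signed difference keeps the
 * result correct across counter wraparound, provided the two values are
 * less than LONG_MAX ticks apart.
 */
static int time_after_sketch(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Clamp an absolute expiry so the timer fires no later than
 * now + 8 * slice, as the patch does before calling mod_timer().
 */
static unsigned long clamp_expires(unsigned long now, unsigned long expires,
				   unsigned long slice)
{
	unsigned long max_expire = now + 8 * slice;

	if (time_after_sketch(expires, max_expire))
		expires = max_expire;
	return expires;
}
```

With `now = 1000` and an assumed `slice` of 25 ticks, an expiry of 5000 is pulled in to 1200, while an expiry of 1050 is left alone. The throttled group therefore wakes at least once every eight slices and can observe a limit change even when no other cgroup dispatches IO.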