sched: Fix SCHED_HRTICK bug leading to late preemption of tasks
SCHED_HRTICK feature is useful to preempt SCHED_FAIR tasks on-the-dot
(just when they would have exceeded their ideal_runtime). It makes use
of a a per-cpu hrtimer resource and hence alarming that hrtimer should
be based on total SCHED_FAIR tasks a cpu has across its various cfs_rqs,
rather than being based on number of tasks in a particular cfs_rq (as
implemented currently). As a result, with current code, its possible for
a running task (which is the sole task in its cfs_rq) to be preempted
much after its ideal_runtime has elapsed, resulting in increased latency
for tasks in other cfs_rq on same cpu.
Fix this by alarming sched hrtimer based on total number of SCHED_FAIR
tasks a CPU has across its various cfs_rqs.
Change-Id: I1f23680a64872f8ce0f451ac4bcae28e8967918f
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b175073..bce2817 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2126,11 +2126,11 @@
static void hrtick_start_fair(struct rq *rq, struct task_struct *p)
{
struct sched_entity *se = &p->se;
- struct cfs_rq *cfs_rq = cfs_rq_of(se);
+ struct cfs_rq *cfs_rq = &rq->cfs;
WARN_ON(task_rq(p) != rq);
- if (cfs_rq->nr_running > 1) {
+ if (cfs_rq->h_nr_running > 1) {
u64 slice = sched_slice(cfs_rq, se);
u64 ran = se->sum_exec_runtime - se->prev_sum_exec_runtime;
s64 delta = slice - ran;
@@ -2154,8 +2154,7 @@
/*
* called from enqueue/dequeue and updates the hrtick when the
- * current task is from our class and nr_running is low enough
- * to matter.
+ * current task is from our class.
*/
static void hrtick_update(struct rq *rq)
{
@@ -2164,8 +2163,7 @@
if (!hrtick_enabled(rq) || curr->sched_class != &fair_sched_class)
return;
- if (cfs_rq_of(&curr->se)->nr_running < sched_nr_latency)
- hrtick_start_fair(rq, curr);
+ hrtick_start_fair(rq, curr);
}
#else /* !CONFIG_SCHED_HRTICK */
static inline void