81a167c8ebeb5d14ad79a9c0b030453b86912a8a - kernel/msm-4.9

commit	81a167c8ebeb5d14ad79a9c0b030453b86912a8a
author	Vikram Mulukutla <markivx@codeaurora.org>	Tue Jan 10 10:43:48 2017 -0800
committer	Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	Tue Feb 07 14:51:42 2017 -0800
tree	9ae0d16f59d4f07fb7d9a03dc12799610166a863
parent	7763f34a8878d46a8b62813dade87d89d49ef6b2

sched: hmp: Ensure that best_cluster() never returns NULL

There are certain conditions under which group_will_fit() may return 0 for
all clusters in the system, especially under changing thermal conditions.
This may result in crashes such as this one:

        CPU 0                    |               CPU 1
====================================================================
select_best_cpu()                |
 -> env.rtg = rtgA               |
    rtgA.pref_cluster=C_big      |
                                 |   set_pref_cluster() for rtgA
                                 |     -> best_cluster()
                                 |        C_little doesn't fit
                                 |
                                 |   IRQ: thermal mitigation
                                 |   C_big capacity now less
                                 |   than C_little capacity
                                 |
                                 |     -> best_cluster() continues
                                 |        C_big doesn't fit
                                 |   set_pref_cluster() sets
                                 |   rtgA.pref_cluster = NULL
                                 |
select_least_power_cluster()     |
  -> cluster_first_cpu()         |
     -> BUG()                    |

Adding lock protection around accesses to the group's preferred cluster
would be expensive and would defeat the purpose of using RCU to protect
access to the related_thread_group structure. Therefore, ensure that
best_cluster() can never return NULL. In the worst case, we select the
wrong cluster for a related_thread_group's demand, but this is expected
to be corrected at the next tick or wakeup. Locking would still have
allowed the same momentary wrong decision, at additional expense.
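
The idea can be illustrated with a minimal user-space sketch. It is not
the code in kernel/sched/hmp.c: struct cluster, demand_fits() and
pick_cluster() are hypothetical stand-ins for the scheduler's cluster
type, group_will_fit() and best_cluster(), and falling back to the
first cluster is an assumption made for illustration only.

```c
#include <stddef.h>

/* Hypothetical stand-in for the scheduler's cluster descriptor. */
struct cluster {
	int id;
	unsigned long capacity;	/* may shrink under thermal mitigation */
};

/* Hypothetical fit check, standing in for group_will_fit(). */
static int demand_fits(const struct cluster *c, unsigned long demand)
{
	return demand <= c->capacity;
}

/*
 * Return the first cluster that fits the demand. If no cluster fits,
 * fall back to clusters[0] instead of returning NULL, so that a
 * lockless reader never sees a NULL preferred cluster.
 * The caller must pass n >= 1.
 */
static const struct cluster *pick_cluster(const struct cluster *clusters,
					  size_t n, unsigned long demand)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (demand_fits(&clusters[i], demand))
			return &clusters[i];
	}

	/* Possibly the wrong cluster, but never NULL. */
	return &clusters[0];
}
```

With two clusters of capacity 100 and 50 and a demand of 500, no cluster
fits and pick_cluster() returns the first one; a later call with updated
capacities can then correct the choice.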

Also, don't set preferred cluster to NULL when colocation is disabled.
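
A rough sketch of this second point, again in user space and with
hypothetical names (struct thread_group, update_preferred() and the
colocation_enabled flag are assumptions, not the kernel's actual
identifiers): when colocation is disabled, the update returns early and
leaves the previous pointer in place rather than clearing it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the scheduler's cluster descriptor. */
struct cluster {
	int id;
};

/* Hypothetical stand-in for related_thread_group. */
struct thread_group {
	const struct cluster *preferred;	/* read locklessly elsewhere */
};

/*
 * Update the group's preferred cluster. When colocation is disabled,
 * leave the existing pointer untouched instead of writing NULL.
 * A NULL candidate is also ignored, so this function never stores NULL.
 */
static void update_preferred(struct thread_group *grp,
			     const struct cluster *candidate,
			     bool colocation_enabled)
{
	if (!colocation_enabled)
		return;

	if (candidate != NULL)
		grp->preferred = candidate;
}
```

A reader that loaded the group before the update still dereferences a
valid cluster, even if it is no longer the ideal one.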

Change-Id: Id3f514b149add9b3ed33d104fa6a9bd57bec27e2
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>

kernel/sched/hmp.c

1 file changed
