     CPU frequency and voltage scaling code in the Linux(TM) kernel


                     L i n u x    C P U F r e q

                       C P U   D r i v e r s

                   - information for developers -


                Dominik Brodowski  <linux@brodo.de>



   Clock scaling allows you to change the clock speed of the CPUs on the
    fly. This is a nice method to save battery power, because the lower
            the clock speed, the less power the CPU consumes.


Contents:
---------
1.   What To Do?
1.1  Initialization
1.2  Per-CPU Initialization
1.3  verify
1.4  target/target_index or setpolicy?
1.5  target/target_index
1.6  setpolicy
1.7  get_intermediate and target_intermediate
2.   Frequency Table Helpers



1. What To Do?
==============

So, you just got a brand-new CPU / chipset with datasheets and want to
add cpufreq support for this CPU / chipset? Great. Here are some hints
on what is necessary:


1.1 Initialization
------------------

First of all, in an __initcall level 7 (module_init()) or later
function, check whether this kernel runs on the right CPU and the right
chipset. If so, register a struct cpufreq_driver with the CPUfreq core
using cpufreq_register_driver().

What shall this struct cpufreq_driver contain?

cpufreq_driver.name -           The name of this driver.

cpufreq_driver.init -           A pointer to the per-CPU initialization
                                function.

cpufreq_driver.verify -         A pointer to a "verification" function.

cpufreq_driver.setpolicy _or_
cpufreq_driver.target/
target_index -                  See below on the differences.

And optionally

cpufreq_driver.exit -           A pointer to a per-CPU cleanup
                                function called during the CPU_POST_DEAD
                                phase of the CPU hotplug process.

cpufreq_driver.stop_cpu -       A pointer to a per-CPU stop function
                                called during the CPU_DOWN_PREPARE phase
                                of the CPU hotplug process.

cpufreq_driver.resume -         A pointer to a per-CPU resume function
                                which is called with interrupts disabled
                                and _before_ the pre-suspend frequency
                                and/or policy is restored by a call to
                                ->target/target_index or ->setpolicy.

cpufreq_driver.attr -           A pointer to a NULL-terminated list of
                                "struct freq_attr" which allows values
                                to be exported to sysfs.

cpufreq_driver.get_intermediate
and target_intermediate -       Used to switch to a stable frequency
                                while changing the CPU frequency.

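The rule above (name, init and verify are required, plus either
setpolicy or target/target_index) can be modelled in user space as
follows. This is only a sketch: struct sketch_policy, struct
sketch_driver, sketch_register_driver() and the demo_* callbacks are
hypothetical stand-ins, not the kernel's definitions, and the real
cpufreq_register_driver() performs further checks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct sketch_policy {
	unsigned int min, max, cur;
};

struct sketch_driver {
	const char *name;
	int (*init)(struct sketch_policy *policy);
	int (*verify)(struct sketch_policy *policy);
	/* Exactly one of the two switching styles below is expected. */
	int (*setpolicy)(struct sketch_policy *policy);
	int (*target_index)(struct sketch_policy *policy, unsigned int index);
};

/*
 * Model of a registration-time sanity check: returns 0 if the driver
 * provides all mandatory callbacks, -1 otherwise.
 */
int sketch_register_driver(const struct sketch_driver *drv)
{
	if (!drv || !drv->name || !drv->init || !drv->verify)
		return -1;
	/* Reject drivers that provide neither or both switching styles. */
	if (!drv->setpolicy == !drv->target_index)
		return -1;
	return 0;
}

/* Trivial demo callbacks so that the sketch can be exercised. */
static int demo_init(struct sketch_policy *p)
{
	assert(p != NULL);
	p->min = p->max = p->cur = 0;
	return 0;
}

static int demo_verify(struct sketch_policy *p)
{
	return p->min <= p->max ? 0 : -1;
}

static int demo_setpolicy(struct sketch_policy *p)
{
	(void)p;
	return 0;
}

static int demo_target_index(struct sketch_policy *p, unsigned int index)
{
	(void)p;
	(void)index;
	return 0;
}
```

A driver that fills in name, init, verify and target_index passes the
check; setting setpolicy as well, or dropping verify, makes it fail.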

1.2 Per-CPU Initialization
--------------------------

Whenever a new CPU is registered with the device model, or after the
cpufreq driver registers itself, the per-CPU initialization function
cpufreq_driver.init is called. It takes a struct cpufreq_policy
*policy as argument. What to do now?

If necessary, activate the CPUfreq support on your CPU.

Then, the driver must fill in the following values:

policy->cpuinfo.min_freq _and_
policy->cpuinfo.max_freq -         the minimum and maximum frequencies
                                   (in kHz) which are supported by
                                   this CPU
policy->cpuinfo.transition_latency the time it takes on this CPU to
                                   switch between two frequencies in
                                   nanoseconds (if appropriate, else
                                   specify CPUFREQ_ETERNAL)

policy->cur                        The current operating frequency of
                                   this CPU (if appropriate)
policy->min,
policy->max,
policy->policy and, if necessary,
policy->governor                   must contain the "default policy" for
                                   this CPU. A few moments later,
                                   cpufreq_driver.verify and either
                                   cpufreq_driver.setpolicy or
                                   cpufreq_driver.target/target_index is
                                   called with these values.

For setting some of these values (cpuinfo.min[max]_freq, policy->min[max]),
the frequency table helpers might be helpful. See section 2 for more
information on them.

SMP systems normally have the same clock source for a group of CPUs. For
these, .init() is called only once, for the first online CPU. Here the
.init() routine must initialize policy->cpus with the mask of all possible
CPUs (online and offline) that share the clock. The core then copies this
mask onto policy->related_cpus and resets policy->cpus to carry only the
online CPUs.

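As an illustration of the values listed above, here is a user-space
sketch of a per-CPU init routine. The struct layout, the DEMO_*
constants and demo_cpu_init() are hypothetical; a real driver would
read its limits from the hardware or from a frequency table, and would
also set policy->policy or policy->governor and policy->cpus where
needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct sketch_cpuinfo {
	unsigned int min_freq;			/* kHz */
	unsigned int max_freq;			/* kHz */
	unsigned int transition_latency;	/* ns */
};

struct sketch_policy {
	struct sketch_cpuinfo cpuinfo;
	unsigned int min, max, cur;		/* kHz */
};

/* Made-up hardware limits, for illustration only. */
#define DEMO_MIN_KHZ	 600000u
#define DEMO_MAX_KHZ	1800000u
#define DEMO_LATENCY_NS	 100000u

int demo_cpu_init(struct sketch_policy *policy)
{
	assert(policy != NULL);

	/* What the hardware can do at all. */
	policy->cpuinfo.min_freq = DEMO_MIN_KHZ;
	policy->cpuinfo.max_freq = DEMO_MAX_KHZ;
	policy->cpuinfo.transition_latency = DEMO_LATENCY_NS;

	/* Assumption: this imaginary CPU boots at its maximum frequency. */
	policy->cur = DEMO_MAX_KHZ;

	/* Default policy: allow the full hardware range. */
	policy->min = policy->cpuinfo.min_freq;
	policy->max = policy->cpuinfo.max_freq;
	return 0;
}
```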

1.3 verify
----------

When the user decides a new policy (consisting of
"policy,governor,min,max") shall be set, this policy must be validated
so that incompatible values can be corrected. For verifying these
values, a frequency table helper and/or the
cpufreq_verify_within_limits(struct cpufreq_policy *policy, unsigned
int min_freq, unsigned int max_freq) function might be helpful. See
section 2 for details on frequency table helpers.

You need to make sure that at least one valid frequency (or operating
range) is within policy->min and policy->max. If necessary, increase
policy->max first, and only if this is no solution, decrease policy->min.

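The clamping that a limits check of this kind performs can be sketched
in user space as follows. sketch_verify_within_limits() and struct
sketch_policy are hypothetical names; the logic is modelled on the
behaviour described for cpufreq_verify_within_limits() and is not a
copy of the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct cpufreq_policy. */
struct sketch_policy {
	unsigned int min, max;	/* kHz */
};

/*
 * Force policy->min and policy->max into [min_freq, max_freq] and make
 * sure that min does not end up above max.
 */
void sketch_verify_within_limits(struct sketch_policy *policy,
				 unsigned int min_freq,
				 unsigned int max_freq)
{
	assert(policy != NULL && min_freq <= max_freq);

	if (policy->min < min_freq)
		policy->min = min_freq;
	if (policy->max < min_freq)
		policy->max = min_freq;
	if (policy->min > max_freq)
		policy->min = max_freq;
	if (policy->max > max_freq)
		policy->max = max_freq;
	if (policy->min > policy->max)
		policy->min = policy->max;
}
```

For example, a request of min=100000, max=5000000 against hardware
limits of 600000 and 1800000 kHz is corrected to min=600000,
max=1800000.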

1.4 target/target_index or setpolicy?
-------------------------------------

Most cpufreq drivers or even most cpu frequency scaling algorithms
only allow the CPU to be set to one frequency. For these, you use the
->target/target_index call.

Some cpufreq-capable processors switch the frequency between certain
limits on their own. These shall use the ->setpolicy call.


1.5 target/target_index
-----------------------

The target_index call has two arguments: struct cpufreq_policy *policy,
and unsigned int index (into the exposed frequency table).

The CPUfreq driver must set the new frequency when called here. The
actual frequency must be determined by freq_table[index].frequency.

It should always restore the earlier frequency (i.e. policy->restore_freq)
in case of errors, even if we switched to an intermediate frequency earlier.

Deprecated:
-----------
The target call has three arguments: struct cpufreq_policy *policy,
unsigned int target_freq, unsigned int relation.

The CPUfreq driver must set the new frequency when called here. The
actual frequency must be determined using the following rules:

- keep close to "target_freq"
- policy->min <= new_freq <= policy->max (THIS MUST BE VALID!!!)
- if relation==CPUFREQ_REL_L, try to select a new_freq higher than or equal
  to target_freq. ("L for lowest, but no lower than")
- if relation==CPUFREQ_REL_H, try to select a new_freq lower than or equal
  to target_freq. ("H for highest, but no higher than")

Here again the frequency table helper might assist you - see section 2
for details.

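The selection rules for the deprecated ->target call can be sketched
in user space like this. sketch_pick_freq(), the SKETCH_REL_* names
and the fallback to the nearest frequency on the other side of
target_freq are assumptions made for illustration; only the "L" and
"H" rules themselves are taken from the text above. Frequencies are
assumed to be non-zero, and 0 is returned when nothing fits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for CPUFREQ_REL_L and CPUFREQ_REL_H. */
enum sketch_relation { SKETCH_REL_L, SKETCH_REL_H };

/*
 * Pick a frequency (kHz) from an unsorted list so that
 * min <= result <= max, following the relation rules.
 * Returns 0 if no entry lies within [min, max].
 */
unsigned int sketch_pick_freq(const unsigned int *freqs, size_t n,
			      unsigned int min, unsigned int max,
			      unsigned int target_freq,
			      enum sketch_relation relation)
{
	unsigned int best = 0, fallback = 0;
	size_t i;

	assert(freqs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		unsigned int f = freqs[i];

		if (f < min || f > max)
			continue;	/* the policy limits always win */

		if (relation == SKETCH_REL_L) {
			/* lowest frequency at or above the target */
			if (f >= target_freq) {
				if (!best || f < best)
					best = f;
			} else if (f > fallback) {
				fallback = f;	/* highest one below */
			}
		} else {
			/* highest frequency at or below the target */
			if (f <= target_freq) {
				if (f > best)
					best = f;
			} else if (!fallback || f < fallback) {
				fallback = f;	/* lowest one above */
			}
		}
	}
	return best ? best : fallback;
}
```

With the frequencies 600000, 900000, 1200000 and 1800000 kHz and a
target of 1000000 kHz, SKETCH_REL_L selects 1200000 and SKETCH_REL_H
selects 900000.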

1.6 setpolicy
-------------

The setpolicy call only takes a struct cpufreq_policy *policy as
argument. You need to set the lower limit of the in-processor or
in-chipset dynamic frequency switching to policy->min, the upper limit
to policy->max, and -if supported- select a performance-oriented
setting when policy->policy is CPUFREQ_POLICY_PERFORMANCE, and a
powersaving-oriented setting when CPUFREQ_POLICY_POWERSAVE. Also check
the reference implementation in drivers/cpufreq/longrun.c.

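A user-space sketch of what a ->setpolicy implementation has to do is
given here. Everything in it is hypothetical: struct demo_hw stands
for whatever registers the processor or chipset exposes, and the
SKETCH_POLICY_* values stand in for CPUFREQ_POLICY_POWERSAVE and
CPUFREQ_POLICY_PERFORMANCE. It is not derived from longrun.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's policy constants and struct. */
enum sketch_mode { SKETCH_POLICY_POWERSAVE, SKETCH_POLICY_PERFORMANCE };

struct sketch_policy {
	unsigned int min, max;		/* kHz */
	enum sketch_mode policy;
};

/* Imaginary hardware state: the range and bias the chip is told to use. */
struct demo_hw {
	unsigned int low_khz;
	unsigned int high_khz;
	int prefer_performance;
};

static struct demo_hw demo_hw_state;

int demo_setpolicy(struct sketch_policy *policy)
{
	assert(policy != NULL);

	if (policy->min > policy->max)
		return -1;

	/* Hand the limits over; the hardware scales within them itself. */
	demo_hw_state.low_khz = policy->min;
	demo_hw_state.high_khz = policy->max;
	demo_hw_state.prefer_performance =
		(policy->policy == SKETCH_POLICY_PERFORMANCE);
	return 0;
}
```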

1.7 get_intermediate and target_intermediate
--------------------------------------------

Only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION unset.

get_intermediate should return a stable intermediate frequency that the
platform wants to switch to, and target_intermediate() should set the CPU
to that frequency, before jumping to the frequency corresponding to 'index'.
The core will take care of sending notifications, and the driver doesn't
have to handle them in target_intermediate() or target_index().

Drivers can return '0' from get_intermediate() in case they don't wish to
switch to an intermediate frequency for some target frequency. In that case
the core will directly call ->target_index().

NOTE: ->target_index() should restore policy->restore_freq in case of
failures, as the core would send notifications for that.

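The order of calls described above can be modelled in user space as
follows. sketch_core_switch() only imitates the sequence the text
describes (get_intermediate, then target_intermediate, then
target_index, with a restore on failure); it is not the kernel's code,
it sends no notifications, and all struct, function and demo_* names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct sketch_policy {
	unsigned int cur;		/* kHz */
	unsigned int restore_freq;	/* kHz */
};

struct sketch_driver {
	const unsigned int *freq_table;	/* kHz, indexed by 'index' */
	unsigned int (*get_intermediate)(struct sketch_policy *, unsigned int);
	int (*target_intermediate)(struct sketch_policy *, unsigned int);
	int (*target_index)(struct sketch_policy *, unsigned int);
};

/* Model of the core's switching sequence. */
int sketch_core_switch(const struct sketch_driver *drv,
		       struct sketch_policy *policy, unsigned int index)
{
	unsigned int intermediate = 0;
	int ret;

	assert(drv != NULL && policy != NULL && drv->target_index != NULL);

	policy->restore_freq = policy->cur;

	if (drv->get_intermediate)
		intermediate = drv->get_intermediate(policy, index);

	if (intermediate) {
		ret = drv->target_intermediate(policy, index);
		if (ret)
			return ret;	/* still at the old frequency */
		policy->cur = intermediate;
	}

	ret = drv->target_index(policy, index);
	if (ret) {
		/* The driver is expected to have restored restore_freq. */
		policy->cur = policy->restore_freq;
		return ret;
	}

	policy->cur = drv->freq_table[index];
	return 0;
}

/* Demo driver: two frequencies and a made-up 800 MHz stable clock. */
static const unsigned int demo_table[] = { 600000u, 1200000u };
static unsigned int demo_hw_khz = 600000u;
static int demo_fail;	/* set to 1 to simulate a failing switch */

static unsigned int demo_get_intermediate(struct sketch_policy *p,
					  unsigned int index)
{
	(void)p;
	return index == 0 ? 0 : 800000u;	/* none needed for index 0 */
}

static int demo_target_intermediate(struct sketch_policy *p,
				    unsigned int index)
{
	(void)p;
	(void)index;
	demo_hw_khz = 800000u;
	return 0;
}

static int demo_target_index(struct sketch_policy *p, unsigned int index)
{
	if (demo_fail) {
		demo_hw_khz = p->restore_freq;	/* restore on error */
		return -1;
	}
	demo_hw_khz = demo_table[index];
	return 0;
}

const struct sketch_driver demo_driver = {
	demo_table,
	demo_get_intermediate,
	demo_target_intermediate,
	demo_target_index,
};
```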

2. Frequency Table Helpers
==========================

As most cpufreq processors only allow for being set to a few specific
frequencies, a "frequency table" with some functions might assist in
some work of the processor driver. Such a "frequency table" consists
of an array of struct cpufreq_frequency_table entries, with any value in
"driver_data" you want to use, and the corresponding frequency in
"frequency". At the end of the table, you need to add a
cpufreq_frequency_table entry with frequency set to CPUFREQ_TABLE_END. And
if you want to skip one entry in the table, set the frequency to
CPUFREQ_ENTRY_INVALID. The entries don't need to be in ascending
order.

By calling cpufreq_table_validate_and_show(struct cpufreq_policy *policy,
                                        struct cpufreq_frequency_table *table);
the cpuinfo.min_freq and cpuinfo.max_freq values are detected, and
policy->min and policy->max are set to the same values. This is
helpful for the per-CPU initialization stage.

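A user-space sketch of such a table, together with the min/max
detection described above, could look like this. struct
sketch_freq_entry is a reduced, hypothetical stand-in for struct
cpufreq_frequency_table, the numeric values of the SKETCH_* sentinels
are assumptions, and sketch_table_limits() only imitates the detection
step; it is not cpufreq_table_validate_and_show().

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sentinel values; use the kernel's constants in real code. */
#define SKETCH_ENTRY_INVALID	(~0u)
#define SKETCH_TABLE_END	(~1u)

/* Hypothetical, reduced stand-in for struct cpufreq_frequency_table. */
struct sketch_freq_entry {
	unsigned int driver_data;	/* free for the driver to use */
	unsigned int frequency;		/* kHz, or one of the sentinels */
};

/* Unsorted on purpose: ascending order is not required. */
const struct sketch_freq_entry demo_table[] = {
	{ 0x2, 1200000u },
	{ 0x0, SKETCH_ENTRY_INVALID },	/* skipped entry */
	{ 0x3, 1800000u },
	{ 0x1,  600000u },
	{ 0x0, SKETCH_TABLE_END },
};

/*
 * Find the lowest and highest valid frequency in the table.
 * Returns 0 on success, -1 if the table has no valid entry.
 */
int sketch_table_limits(const struct sketch_freq_entry *table,
			unsigned int *min_khz, unsigned int *max_khz)
{
	unsigned int lo = ~0u, hi = 0;
	int found = 0;

	assert(table != NULL && min_khz != NULL && max_khz != NULL);

	for (; table->frequency != SKETCH_TABLE_END; table++) {
		if (table->frequency == SKETCH_ENTRY_INVALID)
			continue;
		found = 1;
		if (table->frequency < lo)
			lo = table->frequency;
		if (table->frequency > hi)
			hi = table->frequency;
	}
	if (!found)
		return -1;

	*min_khz = lo;
	*max_khz = hi;
	return 0;
}
```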

int cpufreq_frequency_table_verify(struct cpufreq_policy *policy,
                                   struct cpufreq_frequency_table *table);
assures that at least one valid frequency is within policy->min and
policy->max, and all other criteria are met. This is helpful for the
->verify call.

int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
                                   unsigned int target_freq,
                                   unsigned int relation);

is the corresponding frequency table helper for the ->target
stage. Just pass the values to this function, and this function
returns the number of the frequency table entry which contains
the frequency the CPU shall be set to.

The following macros can be used as iterators over cpufreq_frequency_table:

cpufreq_for_each_entry(pos, table) - iterates over all entries of the
frequency table.

cpufreq_for_each_valid_entry(pos, table) - iterates over all entries,
excluding CPUFREQ_ENTRY_INVALID frequencies.

Use the arguments "pos" - a cpufreq_frequency_table * as a loop cursor and
"table" - the cpufreq_frequency_table * you want to iterate over.

For example:

	struct cpufreq_frequency_table *pos, *driver_freq_table;

	cpufreq_for_each_entry(pos, driver_freq_table) {
		/* Do something with pos */
		pos->frequency = ...
	}

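To show the difference between the two iterators, here is a user-space
sketch with equivalent macros. The sketch_* macros, the SKETCH_*
sentinel values and struct sketch_freq_entry are hypothetical
stand-ins written for this example; in a driver, use the kernel's own
macros and types.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sentinel values; use the kernel's constants in real code. */
#define SKETCH_ENTRY_INVALID	(~0u)
#define SKETCH_TABLE_END	(~1u)

/* Hypothetical, reduced stand-in for struct cpufreq_frequency_table. */
struct sketch_freq_entry {
	unsigned int driver_data;
	unsigned int frequency;		/* kHz, or one of the sentinels */
};

/* Visit every entry up to, but not including, the end marker. */
#define sketch_for_each_entry(pos, table) \
	for (pos = table; pos->frequency != SKETCH_TABLE_END; pos++)

/* Same, but skip entries marked as invalid. */
#define sketch_for_each_valid_entry(pos, table) \
	for (pos = table; pos->frequency != SKETCH_TABLE_END; pos++) \
		if (pos->frequency == SKETCH_ENTRY_INVALID) \
			continue; \
		else

const struct sketch_freq_entry demo_table[] = {
	{ 0, 600000u },
	{ 1, SKETCH_ENTRY_INVALID },
	{ 2, 1200000u },
	{ 3, 1800000u },
	{ 0, SKETCH_TABLE_END },
};

/* Count all entries, including invalid ones. */
unsigned int sketch_count_all(const struct sketch_freq_entry *table)
{
	const struct sketch_freq_entry *pos;
	unsigned int n = 0;

	assert(table != NULL);
	sketch_for_each_entry(pos, table)
		n++;
	return n;
}

/* Count only the entries that carry a usable frequency. */
unsigned int sketch_count_valid(const struct sketch_freq_entry *table)
{
	const struct sketch_freq_entry *pos;
	unsigned int n = 0;

	assert(table != NULL);
	sketch_for_each_valid_entry(pos, table)
		n++;
	return n;
}
```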