     CPU frequency and voltage scaling code in the Linux(TM) kernel


                         L i n u x    C P U F r e q

                      C P U F r e q   G o v e r n o r s

                   - information for users and developers -


                    Dominik Brodowski  <linux@brodo.de>
            some additions and corrections by Nico Golde <nico@ngolde.de>



Clock scaling allows you to change the clock speed of the CPUs on the
fly. This is a nice method to save battery power, because the lower
the clock speed, the less power the CPU consumes.


Contents:
---------
1.   What Is A CPUFreq Governor?

2.   Governors In the Linux Kernel
2.1  Performance
2.2  Powersave
2.3  Userspace
2.4  Ondemand

3.   The Governor Interface in the CPUfreq Core



1. What Is A CPUFreq Governor?
==============================

Most cpufreq drivers (in fact, all except one, longrun) and even most
CPU frequency scaling algorithms only allow the CPU to be set to one
frequency. In order to offer dynamic frequency scaling, the cpufreq
core must be able to tell these drivers a "target frequency". So
these specific drivers will be transformed to offer a "->target"
call instead of the existing "->setpolicy" call. For "longrun", all
stays the same, though.

How is the frequency to be used within the CPUfreq policy decided?
That's done using "cpufreq governors". Two are already in this patch
-- they're the already existing "powersave" and "performance" which
set the frequency statically to the lowest or highest frequency,
respectively. At least two more such governors will be ready for
addition in the near future, and likely many more will follow, as
there are various different theories and models about dynamic
frequency scaling around. Because cpufreq offers such a generic
interface to scaling governors, these can be tested extensively, and
the best one can be selected for each specific use.

Basically, it's the following flow graph:

CPU can be set to switch independently   |         CPU can only be set
      within specific "limits"           |       to specific frequencies

                                 "CPUfreq policy"
                consists of frequency limits (policy->{min,max})
                     and CPUfreq governor to be used
                         /                    \
                        /                      \
                       /                       the cpufreq governor decides
                      /                        (dynamically or statically)
                     /                         what target_freq to set within
                    /                          the limits of policy->{min,max}
                   /                                \
                  /                                  \
        Using the ->setpolicy call,              Using the ->target call,
            the limits and the                    the frequency closest
             "policy" is set.                     to target_freq is set.
                                                  It is assured that it
                                                  is within policy->{min,max}

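The right-hand branch of the graph can be illustrated with a small
userspace sketch. It is not kernel code: struct policy_limits and
pick_closest_freq() are hypothetical names, and the frequency table
stands in for whatever a driver can actually set. It only shows the
idea of picking the available frequency closest to target_freq while
staying within the policy limits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the limits part of a cpufreq policy (kHz). */
struct policy_limits {
	unsigned int min;
	unsigned int max;
};

/*
 * Return the entry of freqs[] that lies within [lim->min, lim->max] and
 * is closest to target_freq. Returns 0 if no entry is within the limits
 * (0 is assumed never to be a valid frequency). On a tie, the entry that
 * appears first in freqs[] wins.
 */
unsigned int pick_closest_freq(const unsigned int *freqs, size_t n,
			       const struct policy_limits *lim,
			       unsigned int target_freq)
{
	unsigned int best = 0, best_dist = 0;
	size_t i;

	assert(lim->min <= lim->max);

	for (i = 0; i < n; i++) {
		unsigned int f = freqs[i];
		unsigned int dist;

		if (f < lim->min || f > lim->max)
			continue;	/* outside policy->{min,max} */

		dist = f > target_freq ? f - target_freq : target_freq - f;
		if (best == 0 || dist < best_dist) {
			best = f;
			best_dist = dist;
		}
	}
	return best;
}
```

With a table of 600000, 800000, 1000000 and 1200000 kHz and limits of
800000 to 1000000 kHz, a target of 1150000 kHz yields 1000000 kHz: the
1200000 kHz entry is closer but lies outside the limits.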

2. Governors In the Linux Kernel
================================

2.1 Performance
---------------

The CPUfreq governor "performance" sets the CPU statically to the
highest frequency within the borders of scaling_min_freq and
scaling_max_freq.


2.2 Powersave
-------------

The CPUfreq governor "powersave" sets the CPU statically to the
lowest frequency within the borders of scaling_min_freq and
scaling_max_freq.

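The two static governors described above reduce to a one-line decision
each. The following userspace sketch only models that decision; the
function names are hypothetical and the parameters simply reuse the
names scaling_min_freq and scaling_max_freq from the text.

```c
#include <assert.h>

/* "performance": always the upper border (hypothetical helper). */
unsigned int performance_target(unsigned int scaling_min_freq,
				unsigned int scaling_max_freq)
{
	assert(scaling_min_freq <= scaling_max_freq);
	return scaling_max_freq;
}

/* "powersave": always the lower border (hypothetical helper). */
unsigned int powersave_target(unsigned int scaling_min_freq,
			      unsigned int scaling_max_freq)
{
	assert(scaling_min_freq <= scaling_max_freq);
	return scaling_min_freq;
}
```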

2.3 Userspace
-------------

The CPUfreq governor "userspace" allows the user, or any userspace
program running with UID "root", to set the CPU to a specific frequency
by making a sysfs file "scaling_setspeed" available in the CPU-device
directory.

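A program using this interface only has to write a number to that
file. The helper below is a minimal sketch; write_setspeed() is a
hypothetical name, and the value is assumed to be a frequency in kHz
written as decimal text.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write freq_khz as decimal text to the file at path.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_setspeed(const char *path, unsigned int freq_khz)
{
	FILE *f;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fprintf(f, "%u\n", freq_khz) < 0) {
		fclose(f);
		return -1;
	}
	/* fclose() flushes the buffer, so a failed write shows up here. */
	return fclose(f) == 0 ? 0 : -1;
}
```

On a typical system the path would be something like
/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed, and the write is
expected to succeed only when the caller is root and the "userspace"
governor is active for that CPU.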

2.4 Ondemand
------------

The CPUfreq governor "ondemand" sets the CPU frequency depending on the
current usage. To do this, the CPU must have the capability to
switch the frequency very quickly.

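To give an idea of what "depending on the current usage" can mean, here
is a deliberately simplified userspace model. It is not the algorithm
used by the kernel's ondemand governor: the function name, the 80%
threshold and the linear scaling below it are all assumptions made for
illustration.

```c
#include <assert.h>

/* Hypothetical threshold: at or above this load, jump to the maximum. */
#define SKETCH_UP_THRESHOLD 80

/*
 * Map a load percentage (0..100) to a frequency between min and max.
 * High load selects max directly; lower load scales linearly.
 */
unsigned int usage_based_target(unsigned int load_percent,
				unsigned int min, unsigned int max)
{
	assert(load_percent <= 100);
	assert(min <= max);

	if (load_percent >= SKETCH_UP_THRESHOLD)
		return max;

	/* 64-bit intermediate avoids overflow for large kHz ranges. */
	return min + (unsigned int)
		(((unsigned long long)(max - min) * load_percent) / 100);
}
```

Because such a governor re-evaluates the load frequently, every
decision may trigger a frequency change, which is why fast switching
hardware matters.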


3. The Governor Interface in the CPUfreq Core
=============================================

A new governor must register itself with the CPUfreq core using
"cpufreq_register_governor". The struct cpufreq_governor, which has to
be passed to that function, must contain the following values:

governor->name -            A unique name for this governor
governor->governor -        The governor callback function
governor->owner -           THIS_MODULE for the governor module (if
                            appropriate)

The governor->governor callback is called with the current (or to-be-set)
cpufreq_policy struct for that CPU, and an unsigned int event. The
following events are currently defined:

   CPUFREQ_GOV_START:   This governor shall start its duty for the CPU
                        policy->cpu
   CPUFREQ_GOV_STOP:    This governor shall end its duty for the CPU
                        policy->cpu
   CPUFREQ_GOV_LIMITS:  The limits for CPU policy->cpu have changed to
                        policy->min and policy->max.

If you need other "events" outside of your driver, _only_ use the
cpufreq_governor_l(unsigned int cpu, unsigned int event) call to the
CPUfreq core to ensure proper locking.

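The shape of a governor callback that handles the three events listed
above can be sketched in plain userspace C. Everything here is a
stand-in so that the example compiles outside the kernel: struct
sketch_policy is not the real struct cpufreq_policy, the numeric event
values are placeholders, and the int return type is an assumption. A
real governor would ask the driver to change the frequency instead of
writing to a "cur" field.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder values; the real constants come from the kernel headers. */
enum {
	CPUFREQ_GOV_START  = 1,
	CPUFREQ_GOV_STOP   = 2,
	CPUFREQ_GOV_LIMITS = 3
};

/* Hypothetical, reduced stand-in for struct cpufreq_policy (kHz). */
struct sketch_policy {
	unsigned int cpu;
	unsigned int min;
	unsigned int max;
	unsigned int cur;
};

/* Hypothetical governor state: non-zero while the governor is on duty. */
int sketch_active;

int sketch_governor(struct sketch_policy *policy, unsigned int event)
{
	assert(policy != NULL);

	switch (event) {
	case CPUFREQ_GOV_START:
		sketch_active = 1;	/* start duty for policy->cpu */
		return 0;
	case CPUFREQ_GOV_STOP:
		sketch_active = 0;	/* end duty for policy->cpu */
		return 0;
	case CPUFREQ_GOV_LIMITS:
		/* Limits changed: pull the frequency back into range. */
		if (policy->cur > policy->max)
			policy->cur = policy->max;
		else if (policy->cur < policy->min)
			policy->cur = policy->min;
		return 0;
	default:
		return -EINVAL;		/* unknown event */
	}
}
```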

The CPUfreq governor may call the CPU processor driver using one of
these two functions:

int cpufreq_driver_target(struct cpufreq_policy *policy,
                          unsigned int target_freq,
                          unsigned int relation);

int __cpufreq_driver_target(struct cpufreq_policy *policy,
                            unsigned int target_freq,
                            unsigned int relation);

target_freq must be within policy->min and policy->max, of course.
What's the difference between these two functions? When your governor
is still in a direct code path of a call to governor->governor, the
per-CPU cpufreq lock is still held in the cpufreq core, and there's
no need to lock it again (in fact, this would cause a deadlock). So
use __cpufreq_driver_target only in these cases. In all other cases
(for example, when there's a "daemonized" function that wakes up
every second), use cpufreq_driver_target to lock the cpufreq per-CPU
lock before the command is passed to the cpufreq processor driver.

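The locking rule can be modelled in userspace with a flag in place of
the real per-CPU lock. All names below are hypothetical, and the flag
merely makes a double lock visible as a failed assertion where the
kernel would deadlock. The point is the pattern: the locking variant
takes the lock and calls the non-locking variant, while code already
running under the lock calls the non-locking variant directly.

```c
#include <assert.h>

static int model_lock_held;		/* stands in for the per-CPU lock */
static unsigned int model_cur_freq;	/* frequency "set" by the driver */

static void model_lock(void)
{
	assert(!model_lock_held);	/* locking twice would deadlock */
	model_lock_held = 1;
}

static void model_unlock(void)
{
	assert(model_lock_held);
	model_lock_held = 0;
}

/* Models __cpufreq_driver_target: the caller must hold the lock. */
int model_driver_target_nolock(unsigned int target_freq)
{
	assert(model_lock_held);
	model_cur_freq = target_freq;
	return 0;
}

/* Models cpufreq_driver_target: takes the lock itself. */
int model_driver_target(unsigned int target_freq)
{
	int ret;

	model_lock();
	ret = model_driver_target_nolock(target_freq);
	model_unlock();
	return ret;
}

/*
 * Models the core invoking the governor callback with the lock held;
 * the "callback" part therefore uses the non-locking variant.
 */
int model_core_calls_governor(unsigned int new_freq)
{
	int ret;

	model_lock();
	ret = model_driver_target_nolock(new_freq);
	model_unlock();
	return ret;
}

unsigned int model_current_freq(void)
{
	return model_cur_freq;
}
```

A periodic worker outside the callback path would call
model_driver_target() in this model; calling it from inside
model_core_calls_governor() would trip the assertion in model_lock().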