| CPU frequency and voltage scaling code in the Linux(TM) kernel |
| |
| |
| L i n u x C P U F r e q |
| |
| C P U F r e q G o v e r n o r s |
| |
| - information for users and developers - |
| |
| |
| Dominik Brodowski <linux@brodo.de> |
| some additions and corrections by Nico Golde <nico@ngolde.de> |
| |
| |
| |
| Clock scaling allows you to change the clock speed of the CPUs on the |
| fly. This is a nice method to save battery power, because the lower |
| the clock speed, the less power the CPU consumes. |
| |
| |
| Contents: |
| --------- |
| 1. What Is A CPUFreq Governor? |
| |
| 2. Governors In the Linux Kernel |
| 2.1 Performance |
| 2.2 Powersave |
| 2.3 Userspace |
| 2.4 Ondemand |
| |
| 3. The Governor Interface in the CPUfreq Core |
| |
| |
| |
| 1. What Is A CPUFreq Governor? |
| ============================== |
| |
| Most cpufreq drivers (in fact, all except one, longrun) and most CPU |
| frequency scaling algorithms only allow the CPU to be set to one |
| frequency at a time. To offer dynamic frequency scaling, the cpufreq |
| core must be able to tell these drivers a "target frequency". These |
| drivers will therefore be converted to offer a "->target" call |
| instead of the existing "->setpolicy" call. For "longrun", everything |
| stays the same. |
| |
| How is the frequency within the CPUfreq policy chosen? That is the |
| job of "cpufreq governors". Two are already in this patch: the |
| existing "powersave" and "performance" governors, which set the |
| frequency statically to the lowest or highest frequency, |
| respectively. At least two more governors will be ready for addition |
| in the near future, and likely many more will follow, as there are |
| various theories and models about dynamic frequency scaling. With the |
| generic interface that cpufreq offers to scaling governors, these can |
| be tested extensively, and the best one can be selected for each |
| specific use. |
| |
| Basically, it's the following flow graph: |
| |
| CPU can be set to switch independently   |     CPU can only be set |
|       within specific "limits"           |   to specific frequencies |
| |
|                         "CPUfreq policy" |
|         consists of frequency limits (policy->{min,max}) |
|               and CPUfreq governor to be used |
|                    /                  \ |
|                   /                    \ |
|                  /              the cpufreq governor decides |
|                 /               (dynamically or statically) |
|                /                what target_freq to set within |
|               /                 the limits of policy->{min,max} |
|              /                         \ |
|             /                           \ |
|  Using the ->setpolicy call,     Using the ->target call, |
|      the limits and the           the frequency closest |
|       "policy" is set.            to target_freq is set. |
|                                   It is assured that it |
|                                   is within policy->{min,max} |
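| |
| The right-hand branch of the flow graph can be sketched in user space |
| as follows. This is only a model, not kernel code: struct |
| policy_limits, clamp_to_policy() and closest_supported() are |
| hypothetical names, the frequency table is assumed to be a plain |
| array of kHz values, and the "relation" argument of the real |
| ->target call is ignored. |
| |

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the limits held in policy->{min,max}. */
struct policy_limits {
	unsigned int min;	/* lower limit in kHz */
	unsigned int max;	/* upper limit in kHz */
};

/* Force the governor's request into the range [min, max]. */
static unsigned int clamp_to_policy(const struct policy_limits *p,
				    unsigned int target_freq)
{
	assert(p != NULL && p->min <= p->max);
	if (target_freq < p->min)
		return p->min;
	if (target_freq > p->max)
		return p->max;
	return target_freq;
}

/*
 * Return the table entry inside the limits that is closest to the
 * clamped request, or 0 if no entry lies inside the limits.
 * On a tie, the entry that appears first in the table wins.
 */
static unsigned int closest_supported(const struct policy_limits *p,
				      const unsigned int *table, size_t n,
				      unsigned int target_freq)
{
	unsigned int want = clamp_to_policy(p, target_freq);
	unsigned int best = 0, best_dist = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int dist;

		if (table[i] < p->min || table[i] > p->max)
			continue;
		dist = table[i] > want ? table[i] - want : want - table[i];
		if (best == 0 || dist < best_dist) {
			best = table[i];
			best_dist = dist;
		}
	}
	return best;
}
```

| |
| With limits of 1000000..1800000 kHz and a table of 800000, 1200000, |
| 1600000 and 2000000 kHz, a request for 2500000 kHz is first clamped |
| to 1800000 kHz and then resolved to 1600000 kHz. |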
| |
| |
| 2. Governors In the Linux Kernel |
| ================================ |
| |
| 2.1 Performance |
| --------------- |
| |
| The CPUfreq governor "performance" sets the CPU statically to the |
| highest frequency within the borders of scaling_min_freq and |
| scaling_max_freq. |
| |
| |
| 2.2 Powersave |
| ------------- |
| |
| The CPUfreq governor "powersave" sets the CPU statically to the |
| lowest frequency within the borders of scaling_min_freq and |
| scaling_max_freq. |
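| |
| The two static governors can be modelled in a few lines. The sketch |
| below is a user-space illustration under stated assumptions: struct |
| policy_limits and static_governor_target() are hypothetical, and the |
| min and max fields stand for scaling_min_freq and scaling_max_freq. |
| |

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for scaling_min_freq and scaling_max_freq (kHz). */
struct policy_limits {
	unsigned int min;
	unsigned int max;
};

/*
 * Return the frequency a static governor would choose:
 * "performance" picks the upper limit, "powersave" the lower limit.
 * Any other name yields 0, meaning "not a static governor" here.
 */
static unsigned int static_governor_target(const char *name,
					   const struct policy_limits *p)
{
	assert(name != NULL && p != NULL);
	if (strcmp(name, "performance") == 0)
		return p->max;
	if (strcmp(name, "powersave") == 0)
		return p->min;
	return 0;
}
```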
| |
| |
| 2.3 Userspace |
| ------------- |
| |
| The CPUfreq governor "userspace" allows the user, or any userspace |
| program running with UID "root", to set the CPU to a specific frequency |
| by making a sysfs file "scaling_setspeed" available in the CPU-device |
| directory. |
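| |
| A userspace program could set the speed roughly as sketched below. |
| write_setspeed() is a hypothetical helper. The path is passed in by |
| the caller; on a typical system it is assumed to be |
| /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed, and the value |
| is assumed to be a frequency in kHz written as decimal text. |
| |

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write freq_khz as decimal text to the file at path.
 * Returns 0 on success and -1 if the file cannot be opened or written
 * (for example, when the caller lacks the required privileges).
 */
static int write_setspeed(const char *path, unsigned int freq_khz)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (f == NULL)
		return -1;
	ok = fprintf(f, "%u\n", freq_khz) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

| |
| The write only has an effect while the "userspace" governor is the |
| active governor for that CPU. |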
| |
| |
| 2.4 Ondemand |
| ------------ |
| |
| The CPUfreq governor "ondemand" sets the CPU frequency depending on |
| the current usage. To do this, the CPU must be able to switch the |
| frequency very quickly. |
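| |
| The idea of choosing a frequency from the current usage can be |
| illustrated as below. This is a deliberately simple model and not |
| the algorithm used by the kernel: usage_based_target(), the |
| up_threshold parameter and the linear scaling rule are all |
| assumptions made for the example. |
| |

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the policy limits (kHz). */
struct policy_limits {
	unsigned int min;
	unsigned int max;
};

/*
 * Map a load percentage (0..100) to a frequency within the limits.
 * Above up_threshold the upper limit is chosen outright; otherwise
 * the frequency grows linearly with the load, starting at the lower
 * limit.
 */
static unsigned int usage_based_target(const struct policy_limits *p,
				       unsigned int load_percent,
				       unsigned int up_threshold)
{
	unsigned long long span;

	assert(p != NULL && p->min <= p->max && load_percent <= 100);
	if (load_percent > up_threshold)
		return p->max;
	span = (unsigned long long)(p->max - p->min) * load_percent;
	return p->min + (unsigned int)(span / 100);
}
```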
| |
| |
| |
| 3. The Governor Interface in the CPUfreq Core |
| ============================================= |
| |
| A new governor must register itself with the CPUfreq core using |
| "cpufreq_register_governor". The struct cpufreq_governor, which has to |
| be passed to that function, must contain the following values: |
| |
| governor->name     - A unique name for this governor |
| governor->governor - The governor callback function |
| governor->owner    - THIS_MODULE for the governor module (if |
|                      appropriate) |
| |
| The governor->governor callback is called with the current (or to-be-set) |
| cpufreq_policy struct for that CPU, and an unsigned int event. The |
| following events are currently defined: |
| |
| CPUFREQ_GOV_START:  This governor shall start its duty for the CPU |
|                     policy->cpu |
| CPUFREQ_GOV_STOP:   This governor shall end its duty for the CPU |
|                     policy->cpu |
| CPUFREQ_GOV_LIMITS: The limits for CPU policy->cpu have changed to |
|                     policy->min and policy->max. |
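| |
| The shape of such a callback can be sketched with user-space mock |
| types. Everything below is an assumption made for illustration: the |
| two structs are cut-down stand-ins for the kernel's, the numeric |
| event values are placeholders, "demo" and demo_governor() are |
| hypothetical, and the callback writes policy->cur directly where a |
| real governor would ask the driver to change the frequency. |
| |

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder event codes; the kernel headers define the real values. */
enum {
	CPUFREQ_GOV_START = 1,
	CPUFREQ_GOV_STOP = 2,
	CPUFREQ_GOV_LIMITS = 3
};

/* Cut-down mock of the policy passed to the callback. */
struct cpufreq_policy {
	unsigned int cpu, min, max, cur;
};

/* Cut-down mock holding the three values named in the text. */
struct cpufreq_governor {
	const char *name;
	int (*governor)(struct cpufreq_policy *policy, unsigned int event);
	void *owner;	/* stands in for THIS_MODULE */
};

#define DEMO_MAX_CPUS 4
static int demo_active[DEMO_MAX_CPUS];	/* CPUs this governor is serving */

static int demo_governor(struct cpufreq_policy *policy, unsigned int event)
{
	assert(policy != NULL && policy->cpu < DEMO_MAX_CPUS);
	switch (event) {
	case CPUFREQ_GOV_START:
		demo_active[policy->cpu] = 1;
		policy->cur = policy->max;	/* hypothetical static choice */
		return 0;
	case CPUFREQ_GOV_STOP:
		demo_active[policy->cpu] = 0;
		return 0;
	case CPUFREQ_GOV_LIMITS:
		/* Pull the current frequency back inside the new limits. */
		if (policy->cur < policy->min)
			policy->cur = policy->min;
		if (policy->cur > policy->max)
			policy->cur = policy->max;
		return 0;
	default:
		return -1;	/* unknown event */
	}
}

static struct cpufreq_governor demo = {
	.name = "demo",
	.governor = demo_governor,
	.owner = NULL,
};
```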
| |
| If you need other "events" outside of your driver, _only_ use the |
| cpufreq_governor_l(unsigned int cpu, unsigned int event) call to the |
| CPUfreq core to ensure proper locking. |
| |
| |
| The CPUfreq governor may call the CPU processor driver using one of |
| these two functions: |
| |
| int cpufreq_driver_target(struct cpufreq_policy *policy, |
| unsigned int target_freq, |
| unsigned int relation); |
| |
| int __cpufreq_driver_target(struct cpufreq_policy *policy, |
| unsigned int target_freq, |
| unsigned int relation); |
| |
| target_freq must be within policy->min and policy->max, of course. |
| What's the difference between these two functions? When your governor |
| is still in a direct code path of a call to governor->governor, the |
| per-CPU cpufreq lock is still held in the cpufreq core, and there's |
| no need to lock it again (in fact, this would cause a deadlock). So |
| use __cpufreq_driver_target only in these cases. In all other cases |
| (for example, when there's a "daemonized" function that wakes up |
| every second), use cpufreq_driver_target, which takes the per-CPU |
| cpufreq lock before the command is passed to the cpufreq processor |
| driver. |
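| |
| The locking rule can be modelled in user space as below. This is a |
| sketch under stated assumptions: target_unlocked() and |
| target_locked() are hypothetical names standing for |
| __cpufreq_driver_target and cpufreq_driver_target, the lock is a |
| plain flag instead of a real lock, the "relation" argument is left |
| out, and the return value -2 marks the case where a real lock would |
| deadlock. |
| |

```c
#include <assert.h>
#include <stddef.h>

/* Mock policy with a flag that plays the role of the per-CPU lock. */
struct mock_policy {
	unsigned int min, max, cur;
	int lock_held;
};

/* Models __cpufreq_driver_target: the caller must already hold the lock. */
static int target_unlocked(struct mock_policy *p, unsigned int target_freq)
{
	assert(p != NULL);
	if (!p->lock_held)
		return -1;	/* misuse: called without the lock */
	if (target_freq < p->min || target_freq > p->max)
		return -1;	/* outside policy->{min,max} */
	p->cur = target_freq;
	return 0;
}

/* Models cpufreq_driver_target: takes the lock, delegates, releases. */
static int target_locked(struct mock_policy *p, unsigned int target_freq)
{
	int ret;

	assert(p != NULL);
	if (p->lock_held)
		return -2;	/* a real lock would deadlock here */
	p->lock_held = 1;
	ret = target_unlocked(p, target_freq);
	p->lock_held = 0;
	return ret;
}
```

| |
| Code running inside the governor callback corresponds to the state |
| lock_held == 1 and must use the unlocked variant; a periodic worker |
| outside the callback corresponds to lock_held == 0 and must use the |
| locked variant. |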
| |