| Operating Performance Points (OPP) Library |
| ========================================== |
| |
| (C) 2009-2010 Nishanth Menon <nm@ti.com>, Texas Instruments Incorporated |
| |
| Contents |
| -------- |
| 1. Introduction |
| 2. Initial OPP List Registration |
| 3. OPP Search Functions |
| 4. OPP Availability Control Functions |
| 5. OPP Data Retrieval Functions |
| 6. Cpufreq Table Generation |
| 7. Data Structures |
| |
| 1. Introduction |
| =============== |
| 1.1 What is an Operating Performance Point (OPP)? |
| |
| Complex SoCs of today consist of multiple sub-modules working in conjunction. |
| In an operational system executing varied use cases, not all modules in the SoC |
| need to function at their highest performing frequency all the time. To |
| facilitate this, sub-modules in a SoC are grouped into domains, allowing some |
| domains to run at lower voltage and frequency while other domains run at |
| voltage/frequency pairs that are higher. |
| |
| The set of discrete tuples consisting of frequency and voltage pairs that |
| the device will support per domain are called Operating Performance Points or |
| OPPs. |
| |
| As an example: |
| Let us consider an MPU device which supports the following: |
| {300MHz at minimum voltage of 1V}, {800MHz at minimum voltage of 1.2V}, |
| {1GHz at minimum voltage of 1.3V} |
| |
| We can represent these as three OPPs as the following {Hz, uV} tuples: |
| {300000000, 1000000} |
| {800000000, 1200000} |
| {1000000000, 1300000} |
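
As a rough illustration, the three tuples above could be held in a small C table such as the one below. This is a standalone sketch and not the library's internal representation; struct example_opp, example_mpu_opps and example_mpu_opp_count are hypothetical names chosen for this example.

```c
#include <stddef.h>

/* Hypothetical standalone type; the OPP library keeps its own structure private. */
struct example_opp {
	unsigned long freq_hz;	/* frequency in Hz */
	unsigned long volt_uv;	/* minimum voltage in uV */
};

/* The three MPU OPPs from the example above, in increasing frequency order. */
static const struct example_opp example_mpu_opps[] = {
	{  300000000UL, 1000000UL },
	{  800000000UL, 1200000UL },
	{ 1000000000UL, 1300000UL },
};

/* Number of entries in the table. */
static const size_t example_mpu_opp_count =
	sizeof(example_mpu_opps) / sizeof(example_mpu_opps[0]);
```

Keeping the units explicit in the field names (Hz and uV) makes it harder to mix them up with the kHz values that cpufreq works with.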
| |
| 1.2 Operating Performance Points Library |
| |
| The OPP library provides a set of helper functions to organize and query the OPP |
| information. The library is located in drivers/base/power/opp.c and the header |
| is located in include/linux/pm_opp.h. The OPP library can be enabled by selecting |
| CONFIG_PM_OPP from the power management menuconfig menu. The OPP library depends on |
| CONFIG_PM because certain SoC frameworks, such as Texas Instruments' OMAP framework, |
| allow optionally booting at a certain OPP without needing cpufreq. |
| |
| Typical usage of the OPP library is as follows: |
| (users) -> registers a set of default OPPs -> (library) |
| SoC framework -> modifies on required cases certain OPPs -> OPP layer |
| -> queries to search/retrieve information -> |
| |
| Architectures that provide a SoC framework for OPP should select ARCH_HAS_OPP |
| to make the OPP layer available. |
| |
| OPP layer expects each domain to be represented by a unique device pointer. SoC |
| framework registers a set of initial OPPs per device with the OPP layer. This |
| list is expected to be small, typically around 5 OPPs per device. |
| This initial list contains a set of OPPs that the framework expects to be safely |
| enabled by default in the system. |
| |
| Note on OPP Availability: |
| ------------------------ |
| As the system proceeds to operate, SoC framework may choose to make certain |
| OPPs available or not available on each device based on various external |
| factors. Example usage: Thermal management or other exceptional situations where |
| SoC framework might choose to disable a higher frequency OPP to safely continue |
| operations until that OPP could be re-enabled if possible. |
| |
| The OPP library facilitates this concept in its implementation. The following |
| operational functions operate only on available OPPs: |
| dev_pm_opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq, |
| dev_pm_opp_get_opp_count and dev_pm_opp_init_cpufreq_table |
| |
| dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer which can then |
| be used for dev_pm_opp_enable/disable functions to make an opp available as required. |
| |
| WARNING: Users of the OPP library should refresh their availability count using |
| dev_pm_opp_get_opp_count if dev_pm_opp_enable/disable functions are invoked for a |
| device. The exact mechanism to trigger these, or the notification mechanism to |
| other dependent subsystems such as cpufreq, is left to the discretion of the SoC |
| specific framework which uses the OPP library. Similar care needs to be taken |
| to refresh the cpufreq table in cases of these operations. |
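
The effect described in this warning can be seen in a small userspace model. The sketch below is not the OPP library: struct toy_opp, toy_get_opp_count and toy_set_availability are hypothetical stand-ins, and it only assumes that each entry carries an availability flag which the count respects.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy entry: frequency, voltage and an availability flag. */
struct toy_opp {
	unsigned long freq_hz;
	unsigned long volt_uv;
	bool available;
};

static struct toy_opp toy_opps[] = {
	{  300000000UL, 1000000UL, true },
	{  800000000UL, 1200000UL, true },
	{ 1000000000UL, 1300000UL, true },
};
#define TOY_OPP_COUNT (sizeof(toy_opps) / sizeof(toy_opps[0]))

/* Count only the entries currently marked available. */
static int toy_get_opp_count(void)
{
	int count = 0;
	size_t i;

	for (i = 0; i < TOY_OPP_COUNT; i++)
		if (toy_opps[i].available)
			count++;
	return count;
}

/*
 * Set the availability of the entry with exactly this frequency.
 * Returns 0 on success, -1 if no entry has that frequency.
 */
static int toy_set_availability(unsigned long freq_hz, bool available)
{
	size_t i;

	for (i = 0; i < TOY_OPP_COUNT; i++) {
		if (toy_opps[i].freq_hz == freq_hz) {
			toy_opps[i].available = available;
			return 0;
		}
	}
	return -1;
}
```

A caller that cached the result of toy_get_opp_count() before toy_set_availability(1000000000UL, false) would hold a stale value of 3 instead of 2, which is the situation the warning asks users to avoid.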
| |
| WARNING on OPP List locking mechanism: |
| ------------------------------------------------- |
| The OPP library uses RCU for exclusivity. RCU allows the query functions to operate |
| in multiple contexts, and this synchronization mechanism is optimal for the |
| read-intensive operations on data structures that the OPP library caters to. |
| |
| To ensure that the data retrieved are sane, users such as the SoC framework |
| should ensure that the sections of code operating on OPP queries are locked |
| using RCU read locks. The dev_pm_opp_find_freq_{exact,ceil,floor} and |
| dev_pm_opp_get_{voltage, freq, opp_count} functions fall into this category. |
| |
| dev_pm_opp_{add,enable,disable} are updaters which use a mutex and implement their own |
| RCU locking mechanisms. dev_pm_opp_init_cpufreq_table acts as an updater and uses a |
| mutex to implement the RCU updater strategy. These functions should *NOT* be called |
| under RCU locks or in other contexts that prevent blocking functions in RCU or |
| mutex operations from working. |
| |
| 2. Initial OPP List Registration |
| ================================ |
| The SoC implementation calls dev_pm_opp_add function iteratively to add OPPs per |
| device. It is expected that the SoC framework will register the OPP entries |
| optimally; typical numbers are fewer than 5 per device. The list generated by |
| registering the OPPs is maintained by OPP library throughout the device |
| operation. The SoC framework can subsequently control the availability of the |
| OPPs dynamically using the dev_pm_opp_enable / disable functions. |
| |
| dev_pm_opp_add - Add a new OPP for a specific domain represented by the device pointer. |
| The OPP is defined using the frequency and voltage. Once added, the OPP |
| is assumed to be available and control of its availability can be done |
| with the dev_pm_opp_enable/disable functions. OPP library internally stores |
| and manages this information in the opp struct. This function may be |
| used by SoC framework to define an optimal list as per the demands of |
| SoC usage environment. |
| |
| WARNING: Do not use this function in interrupt context. |
| |
| Example: |
| soc_pm_init() |
| { |
| /* Do things */ |
| r = dev_pm_opp_add(mpu_dev, 1000000, 900000); |
| if (r) { |
| pr_err("%s: unable to register mpu opp(%d)\n", __func__, r); |
| goto no_cpufreq; |
| } |
| /* Do cpufreq things */ |
| no_cpufreq: |
| /* Do remaining things */ |
| } |
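
The iterative registration described in this section can be modelled in userspace as a table-driven loop. The sketch below is only an illustration under stated assumptions: toy_opp_add is a hypothetical stand-in for an "add" call, and keeping the toy list sorted by increasing frequency is a design choice of this model, made so that later searches can walk the list in order.

```c
#include <stddef.h>

#define TOY_MAX_OPPS 8

/* One {Hz, uV} definition supplied by the (hypothetical) SoC code. */
struct toy_opp_def {
	unsigned long freq_hz;
	unsigned long volt_uv;
};

/* Toy per-device list; stands in for the list a library would maintain. */
struct toy_opp_list {
	struct toy_opp_def entries[TOY_MAX_OPPS];
	size_t count;
};

/*
 * Hypothetical stand-in for an "add" call: inserts the entry so that the
 * list stays sorted by increasing frequency.
 * Returns 0 on success, -1 when the list is full.
 */
static int toy_opp_add(struct toy_opp_list *list, unsigned long freq_hz,
		       unsigned long volt_uv)
{
	size_t pos, i;

	if (list->count >= TOY_MAX_OPPS)
		return -1;

	/* Find the first entry with a higher frequency. */
	for (pos = 0; pos < list->count; pos++)
		if (list->entries[pos].freq_hz > freq_hz)
			break;

	/* Shift the tail up by one slot and insert. */
	for (i = list->count; i > pos; i--)
		list->entries[i] = list->entries[i - 1];
	list->entries[pos].freq_hz = freq_hz;
	list->entries[pos].volt_uv = volt_uv;
	list->count++;
	return 0;
}

/* Register a whole table, stopping at the first failure. */
static int toy_register_opps(struct toy_opp_list *list,
			     const struct toy_opp_def *defs, size_t n)
{
	size_t i;
	int r;

	for (i = 0; i < n; i++) {
		r = toy_opp_add(list, defs[i].freq_hz, defs[i].volt_uv);
		if (r)
			return r;
	}
	return 0;
}
```

Note that the error check follows the same convention as the example above: a non-zero return value means the registration failed.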
| |
| 3. OPP Search Functions |
| ======================= |
| High level frameworks such as cpufreq operate on frequencies. To map the |
| frequency back to the corresponding OPP, OPP library provides handy functions |
| to search the OPP list that OPP library internally manages. These search |
| functions return the matching pointer representing the opp if a match is |
| found, else they return an error pointer. These errors are expected to be handled |
| by standard error checks such as IS_ERR() and appropriate actions taken by the caller. |
| |
| dev_pm_opp_find_freq_exact - Search for an OPP based on an *exact* frequency and |
| availability. This function is especially useful to enable an OPP which |
| is not available by default. |
| Example: In a case when SoC framework detects a situation where a |
| higher frequency could be made available, it can use this function to |
| find the OPP prior to calling dev_pm_opp_enable to actually make it available. |
| rcu_read_lock(); |
| opp = dev_pm_opp_find_freq_exact(dev, 1000000000, false); |
| rcu_read_unlock(); |
| /* don't operate on the pointer.. just do a sanity check.. */ |
| if (IS_ERR(opp)) { |
| pr_err("frequency not disabled!\n"); |
| /* trigger appropriate actions.. */ |
| } else { |
| dev_pm_opp_enable(dev, 1000000000); |
| } |
| |
| NOTE: This is the only search function that operates on OPPs which are |
| not available. |
| |
| dev_pm_opp_find_freq_floor - Search for an available OPP which is *at most* the |
| provided frequency. This function is useful while searching for a lesser |
| match OR operating on OPP information in the order of decreasing |
| frequency. |
| Example: To find the highest opp for a device: |
| freq = ULONG_MAX; |
| rcu_read_lock(); |
| dev_pm_opp_find_freq_floor(dev, &freq); |
| rcu_read_unlock(); |
| |
| dev_pm_opp_find_freq_ceil - Search for an available OPP which is *at least* the |
| provided frequency. This function is useful while searching for a |
| higher match OR operating on OPP information in the order of increasing |
| frequency. |
| Example 1: To find the lowest opp for a device: |
| freq = 0; |
| rcu_read_lock(); |
| dev_pm_opp_find_freq_ceil(dev, &freq); |
| rcu_read_unlock(); |
| Example 2: A simplified implementation of a SoC cpufreq_driver->target: |
| soc_cpufreq_target(..) |
| { |
| /* Do stuff like policy checks etc. */ |
| /* Find the best frequency match for the req */ |
| rcu_read_lock(); |
| opp = dev_pm_opp_find_freq_ceil(dev, &freq); |
| rcu_read_unlock(); |
| if (!IS_ERR(opp)) |
| soc_switch_to_freq_voltage(freq); |
| else |
| /* do something when we can't satisfy the req */ |
| /* do other stuff */ |
| } |
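
The "at most" and "at least" semantics of the floor and ceil searches, including the way the frequency argument is updated in place, can be illustrated with a userspace model. This is a sketch over a hypothetical sorted array, not the library code; in particular the toy functions return NULL on failure, whereas the library functions return an error pointer to be checked with IS_ERR().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy entry: frequency plus availability. */
struct toy_opp {
	unsigned long freq_hz;
	bool available;
};

/* Sorted by increasing frequency; the 800MHz entry is marked unavailable. */
static const struct toy_opp toy_opps[] = {
	{  300000000UL, true },
	{  800000000UL, false },
	{ 1000000000UL, true },
};
#define TOY_OPP_COUNT (sizeof(toy_opps) / sizeof(toy_opps[0]))

/*
 * Lowest available entry whose frequency is at least *freq.
 * On success *freq is updated to the matched frequency.
 * Returns NULL (and leaves *freq untouched) when nothing matches.
 */
static const struct toy_opp *toy_find_freq_ceil(unsigned long *freq)
{
	size_t i;

	for (i = 0; i < TOY_OPP_COUNT; i++) {
		if (toy_opps[i].available && toy_opps[i].freq_hz >= *freq) {
			*freq = toy_opps[i].freq_hz;
			return &toy_opps[i];
		}
	}
	return NULL;
}

/*
 * Highest available entry whose frequency is at most *freq.
 * On success *freq is updated to the matched frequency.
 * Returns NULL (and leaves *freq untouched) when nothing matches.
 */
static const struct toy_opp *toy_find_freq_floor(unsigned long *freq)
{
	size_t i;

	for (i = TOY_OPP_COUNT; i > 0; i--) {
		if (toy_opps[i - 1].available && toy_opps[i - 1].freq_hz <= *freq) {
			*freq = toy_opps[i - 1].freq_hz;
			return &toy_opps[i - 1];
		}
	}
	return NULL;
}
```

In this model a ceil search for 500MHz skips the unavailable 800MHz entry and settles on 1GHz, while a floor search for 900MHz falls back to 300MHz.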
| |
| 4. OPP Availability Control Functions |
| ===================================== |
| A default OPP list registered with the OPP library may not cater to all possible |
| situations. The OPP library provides a set of functions to modify the |
| availability of an OPP within the OPP list. This allows SoC frameworks to have |
| fine grained dynamic control of which sets of OPPs are operationally available. |
| These functions are intended to *temporarily* remove an OPP in conditions such |
| as thermal considerations (e.g. don't use OPPx until the temperature drops). |
| |
| WARNING: Do not use these functions in interrupt context. |
| |
| dev_pm_opp_enable - Make an OPP available for operation. |
| Example: Let's say that the 1GHz OPP is to be made available only if the |
| SoC temperature is lower than a certain threshold. The SoC framework |
| implementation might choose to do something as follows: |
| if (cur_temp < temp_low_thresh) { |
| /* Enable 1GHz if it was disabled */ |
| rcu_read_lock(); |
| opp = dev_pm_opp_find_freq_exact(dev, 1000000000, false); |
| rcu_read_unlock(); |
| /* just error check */ |
| if (!IS_ERR(opp)) |
| ret = dev_pm_opp_enable(dev, 1000000000); |
| else |
| goto try_something_else; |
| } |
| |
| dev_pm_opp_disable - Make an OPP unavailable for operation. |
| Example: Let's say that the 1GHz OPP is to be disabled if the temperature |
| exceeds a threshold value. The SoC framework implementation might |
| choose to do something as follows: |
| if (cur_temp > temp_high_thresh) { |
| /* Disable 1GHz if it was enabled */ |
| rcu_read_lock(); |
| opp = dev_pm_opp_find_freq_exact(dev, 1000000000, true); |
| rcu_read_unlock(); |
| /* just error check */ |
| if (!IS_ERR(opp)) |
| ret = dev_pm_opp_disable(dev, 1000000000); |
| else |
| goto try_something_else; |
| } |
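
The two examples above use separate low and high thresholds. One way to combine them, sketched here in userspace C, is a small decision helper with hysteresis. The threshold values, the millidegree unit and the function name toy_top_opp_allowed are all assumptions made for this illustration; how a real SoC framework decides is left to that framework.

```c
#include <stdbool.h>

/* Hypothetical thresholds in millidegrees Celsius. */
#define TOY_TEMP_HIGH_MC 95000
#define TOY_TEMP_LOW_MC  80000

/*
 * Decide whether the top OPP should be available, given the current
 * temperature and the current availability. Between the two thresholds
 * the previous state is kept, which avoids toggling the OPP repeatedly
 * when the temperature hovers around a single limit.
 */
static bool toy_top_opp_allowed(int cur_temp_mc, bool currently_available)
{
	if (cur_temp_mc > TOY_TEMP_HIGH_MC)
		return false;
	if (cur_temp_mc < TOY_TEMP_LOW_MC)
		return true;
	return currently_available;
}
```

A caller would compare the returned value with the current state and invoke the enable or disable function only when the two differ.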
| |
| 5. OPP Data Retrieval Functions |
| =============================== |
| Since OPP library abstracts away the OPP information, a set of functions to pull |
| information from the OPP structure is necessary. Once an OPP pointer is |
| retrieved using the search functions, the following functions can be used by SoC |
| framework to retrieve the information represented inside the OPP layer. |
| |
| dev_pm_opp_get_voltage - Retrieve the voltage represented by the opp pointer. |
| Example: At a cpufreq transition to a different frequency, the SoC |
| framework needs to set the voltage represented by the OPP, using |
| the regulator framework, on the Power Management chip providing the |
| voltage. |
| soc_switch_to_freq_voltage(freq) |
| { |
| /* do things */ |
| rcu_read_lock(); |
| opp = dev_pm_opp_find_freq_ceil(dev, &freq); |
| v = dev_pm_opp_get_voltage(opp); |
| rcu_read_unlock(); |
| if (v) |
| regulator_set_voltage(.., v); |
| /* do other things */ |
| } |
| |
| dev_pm_opp_get_freq - Retrieve the freq represented by the opp pointer. |
| Example: Let's say the SoC framework uses a couple of helper functions; |
| we could pass opp pointers instead of adding extra parameters to |
| handle quite a bit of data. |
| soc_cpufreq_target(..) |
| { |
| /* do things.. */ |
| max_freq = ULONG_MAX; |
| rcu_read_lock(); |
| max_opp = dev_pm_opp_find_freq_floor(dev, &max_freq); |
| requested_opp = dev_pm_opp_find_freq_ceil(dev, &freq); |
| if (!IS_ERR(max_opp) && !IS_ERR(requested_opp)) |
| r = soc_test_validity(max_opp, requested_opp); |
| rcu_read_unlock(); |
| /* do other things */ |
| } |
| soc_test_validity(..) |
| { |
| if (dev_pm_opp_get_voltage(max_opp) < dev_pm_opp_get_voltage(requested_opp)) |
| return -EINVAL; |
| if (dev_pm_opp_get_freq(max_opp) < dev_pm_opp_get_freq(requested_opp)) |
| return -EINVAL; |
| /* do things.. */ |
| } |
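
The idea of passing opaque opp pointers to helpers and reading them only through accessor functions can be shown in a self-contained userspace sketch. All names below (struct toy_opp, toy_opp_get_voltage, toy_opp_get_freq, toy_test_validity and the two sample entries) are hypothetical; the "library side" is placed in the same file only so that the example compiles on its own.

```c
#include <errno.h>

/* Callers only see an incomplete type and the accessor prototypes. */
struct toy_opp;
unsigned long toy_opp_get_voltage(const struct toy_opp *opp);
unsigned long toy_opp_get_freq(const struct toy_opp *opp);

/* Caller-side check written purely in terms of the accessors. */
static int toy_test_validity(const struct toy_opp *max_opp,
			     const struct toy_opp *requested_opp)
{
	if (toy_opp_get_voltage(max_opp) < toy_opp_get_voltage(requested_opp))
		return -EINVAL;
	if (toy_opp_get_freq(max_opp) < toy_opp_get_freq(requested_opp))
		return -EINVAL;
	return 0;
}

/* "Library side": in a real split this would live in a separate file. */
struct toy_opp {
	unsigned long freq_hz;
	unsigned long volt_uv;
};

unsigned long toy_opp_get_voltage(const struct toy_opp *opp)
{
	return opp->volt_uv;
}

unsigned long toy_opp_get_freq(const struct toy_opp *opp)
{
	return opp->freq_hz;
}

/* Two sample entries for experimentation. */
const struct toy_opp toy_opp_low  = {  300000000UL, 1000000UL };
const struct toy_opp toy_opp_high = { 1000000000UL, 1300000UL };
```

Because toy_test_validity never touches the structure members directly, the layout of struct toy_opp can change without affecting the caller.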
| |
| dev_pm_opp_get_opp_count - Retrieve the number of available opps for a device. |
| Example: Let's say a co-processor in the SoC needs to know the available |
| frequencies in a table; the main processor can notify it as follows: |
| soc_notify_coproc_available_frequencies() |
| { |
| /* Do things */ |
| rcu_read_lock(); |
| num_available = dev_pm_opp_get_opp_count(dev); |
| /* GFP_ATOMIC: sleeping is not allowed inside an RCU read-side section */ |
| speeds = kzalloc(sizeof(u32) * num_available, GFP_ATOMIC); |
| /* populate the table in increasing order */ |
| freq = 0; |
| i = 0; |
| while (!IS_ERR(opp = dev_pm_opp_find_freq_ceil(dev, &freq))) { |
| speeds[i] = freq; |
| freq++; |
| i++; |
| } |
| rcu_read_unlock(); |
| |
| soc_notify_coproc(AVAILABLE_FREQs, speeds, num_available); |
| /* Do other things */ |
| } |
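
The enumeration pattern used above, a ceil search followed by freq++, can be tried out in userspace. The sketch below uses a hypothetical fixed frequency array and a toy ceil function that returns 0 on success and -1 on failure; it is meant only to show why the increment is needed.

```c
#include <stddef.h>

/* Hypothetical frequencies in Hz, sorted in increasing order. */
static const unsigned long toy_freqs[] = {
	300000000UL, 800000000UL, 1000000000UL,
};
#define TOY_FREQ_COUNT (sizeof(toy_freqs) / sizeof(toy_freqs[0]))

/*
 * Toy ceil search: updates *freq to the lowest entry that is at least
 * *freq. Returns 0 on success, -1 when no such entry exists.
 */
static int toy_find_freq_ceil(unsigned long *freq)
{
	size_t i;

	for (i = 0; i < TOY_FREQ_COUNT; i++) {
		if (toy_freqs[i] >= *freq) {
			*freq = toy_freqs[i];
			return 0;
		}
	}
	return -1;
}

/* Fill out[] in increasing order; returns the number of entries written. */
static size_t toy_list_freqs(unsigned long *out, size_t max)
{
	unsigned long freq = 0;
	size_t n = 0;

	while (n < max && toy_find_freq_ceil(&freq) == 0) {
		out[n++] = freq;
		freq++;	/* step past the match so the next search moves on */
	}
	return n;
}
```

Without the increment, every search would return the same entry again and the loop would never advance. The bound on n also keeps the loop from writing past the end of the output array.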
| |
| 6. Cpufreq Table Generation |
| =========================== |
| dev_pm_opp_init_cpufreq_table - The cpufreq framework is typically initialized with |
| cpufreq_frequency_table_cpuinfo, which is provided with the list of |
| frequencies that are available for operation. This function provides |
| a ready to use conversion routine to translate the OPP layer's internal |
| information about the available frequencies into a format that can be |
| passed directly to cpufreq. |
| |
| WARNING: Do not use this function in interrupt context. |
| |
| Example: |
| soc_pm_init() |
| { |
| /* Do things */ |
| r = dev_pm_opp_init_cpufreq_table(dev, &freq_table); |
| if (!r) |
| cpufreq_frequency_table_cpuinfo(policy, freq_table); |
| /* Do other things */ |
| } |
| |
| NOTE: This function is available only if CONFIG_CPU_FREQ is enabled in |
| addition to CONFIG_PM as power management feature is required to |
| dynamically scale voltage and frequency in a system. |
| |
| dev_pm_opp_free_cpufreq_table - Free up the table allocated by dev_pm_opp_init_cpufreq_table |
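
The kind of conversion described in this section can be approximated in userspace as follows. This is a sketch under assumptions, not the library routine: the struct layouts, the TOY_TABLE_END marker and the function name are hypothetical, and the only facts relied upon are that OPP frequencies are in Hz, that cpufreq works in kHz, and that only available OPPs belong in the table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_TABLE_END (~1u)	/* hypothetical end marker for this sketch */

/* Hypothetical toy OPP: frequency in Hz plus availability. */
struct toy_opp {
	unsigned long freq_hz;
	bool available;
};

/* Hypothetical table entry: position and frequency in kHz. */
struct toy_freq_entry {
	unsigned int index;
	unsigned int frequency_khz;
};

/*
 * Build a terminated table from the available entries only.
 * The caller releases the table with free().
 * Returns NULL on allocation failure.
 */
static struct toy_freq_entry *toy_init_freq_table(const struct toy_opp *opps,
						  size_t n)
{
	struct toy_freq_entry *table;
	size_t i, used = 0;

	/* One extra slot for the terminating entry. */
	table = calloc(n + 1, sizeof(*table));
	if (!table)
		return NULL;

	for (i = 0; i < n; i++) {
		if (!opps[i].available)
			continue;
		table[used].index = (unsigned int)used;
		table[used].frequency_khz = (unsigned int)(opps[i].freq_hz / 1000);
		used++;
	}
	table[used].index = (unsigned int)used;
	table[used].frequency_khz = TOY_TABLE_END;
	return table;
}
```

Since the table is a snapshot, it has to be rebuilt whenever the availability of an OPP changes, in line with the earlier warning about refreshing the cpufreq table.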
| |
| 7. Data Structures |
| ================== |
| Typically an SoC contains multiple voltage domains which are variable. Each |
| domain is represented by a device pointer. The relationship to OPP can be |
| represented as follows: |
| SoC |
| |- device 1 |
| | |- opp 1 (availability, freq, voltage) |
| | |- opp 2 .. |
| ... ... |
| | `- opp n .. |
| |- device 2 |
| ... |
| `- device m |
| |
| The OPP library maintains an internal list that the SoC framework populates and |
| that is accessed by the various functions described above. However, the structures |
| representing the actual OPPs and domains are internal to the OPP library itself |
| to allow for suitable abstraction reusable across systems. |
| |
| struct dev_pm_opp - The internal data structure of OPP library which is used to |
| represent an OPP. In addition to the freq, voltage, availability |
| information, it also contains internal bookkeeping information required |
| for the OPP library to operate on. A pointer to this structure is |
| provided back to users such as the SoC framework to be used as an |
| identifier for the OPP in interactions with the OPP layer. |
| |
| WARNING: The struct dev_pm_opp pointer should not be parsed or modified by the |
| users. The defaults for an instance are populated by dev_pm_opp_add, but the |
| availability of the OPP can be modified by dev_pm_opp_enable/disable functions. |
| |
| struct device - This is used to identify a domain to the OPP layer. The |
| nature of the device and its implementation is left to the user of |
| OPP library such as the SoC framework. |
| |
| Overall, in a simplistic view, the data structure operations are represented as |
| follows: |
| |
| Initialization / modification: |
| +-----+ /- dev_pm_opp_enable |
| dev_pm_opp_add --> | opp | <------- |
| | +-----+ \- dev_pm_opp_disable |
| \-------> domain_info(device) |
| |
| Search functions: |
| /-- dev_pm_opp_find_freq_ceil ---\ +-----+ |
| domain_info<---- dev_pm_opp_find_freq_exact -----> | opp | |
| \-- dev_pm_opp_find_freq_floor ---/ +-----+ |
| |
| Retrieval functions: |
| +-----+ /- dev_pm_opp_get_voltage |
| | opp | <--- |
| +-----+ \- dev_pm_opp_get_freq |
| |
| domain_info <- dev_pm_opp_get_opp_count |