Mark Hecomovich | 543e027 | 2016-07-07 16:42:55 -0700 | 1 | page.title=RIL Refactoring |
| 2 | @jd:body |
| 3 | |
| 4 | <!-- |
| 5 | Copyright 2016 The Android Open Source Project |
| 6 | |
| 7 | Licensed under the Apache License, Version 2.0 (the "License"); |
| 8 | you may not use this file except in compliance with the License. |
| 9 | You may obtain a copy of the License at |
| 10 | |
| 11 | http://www.apache.org/licenses/LICENSE-2.0 |
| 12 | |
| 13 | Unless required by applicable law or agreed to in writing, software |
| 14 | distributed under the License is distributed on an "AS IS" BASIS, |
| 15 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| 16 | See the License for the specific language governing permissions and |
| 17 | limitations under the License. |
| 18 | --> |
| 19 | <div id="qv-wrapper"> |
| 20 | <div id="qv"> |
| 21 | <h2>In this document</h2> |
| 22 | <ol id="auto-toc"> |
| 23 | </ol> |
| 24 | </div> |
| 25 | </div> |
| 26 | |
| 27 | <h2 id="introduction">Introduction</h2> |
| 28 | |
| 29 | <p>The Radio Interface Layer (RIL) refactoring feature |
| 30 | of the Android 7.0 release is a set of subfeatures |
| 31 | that improves RIL functionality. Implementing the features is optional but |
| 32 | encouraged. Partner code changes are required to implement these features. The |
| 33 | refactoring changes are backward compatible, so prior implementations of |
| 34 | the refactored features will still work.</p> |
| 35 | |
| 36 | <p>The following subfeatures are included in the RIL refactoring feature. You |
| 37 | can implement any or all of the subfeatures:</p> |
| 38 | |
| 39 | <ul> |
| 40 | <li>Enhanced RIL error codes: Code can return more specific error codes |
| 41 | than the existing <code>GENERIC_FAILURE</code> code. This enhances error |
| 42 | troubleshooting by providing more specific information about the cause |
| 43 | of errors.</li> |
| 44 | |
| 45 | <li>Enhanced RIL versioning: The RIL versioning mechanism is enhanced to |
| 46 | provide more accurate and easier-to-configure version information.</li> |
| 47 | |
| 48 | <li>Redesigned RIL communication using wakelocks: RIL communication using |
| 49 | wakelocks is enhanced to improve device battery performance.</li> |
| 50 | </ul> |
| 51 | |
| 52 | <h2 id="examples">Examples and source</h2> |
| 53 | |
| 54 | <p>Documentation for RIL versioning is also in code comments in <a |
| 55 | href="https://android.googlesource.com/platform/hardware/ril/+/master/include/telephony/ril.h"><code>https://android.googlesource.com/platform/hardware/ril/+/master/include/telephony/ril.h</code></a>.</p> |
| 56 | |
| 57 | <h2 id="implementation">Implementation</h2> |
| 58 | |
| 59 | <p>The following sections describe how to implement the subfeatures of the |
| 60 | RIL refactoring feature.</p> |
| 61 | |
| 62 | <h3 id="errorcodes">Implementing enhanced RIL error codes</h3> |
| 63 | |
| 64 | <h4 id="errorcodes-problem">Problem</h4> |
| 65 | |
| 66 | <p>Almost all RIL request calls can return the <code>GENERIC_FAILURE</code> |
| 67 | error code in response to an error. This is an issue with all solicited |
| 68 | responses returned by the OEMs. It is difficult to debug an issue from |
| 69 | the bug report if the same <code>GENERIC_FAILURE</code> error code is |
| 70 | returned by RIL calls for different reasons. It can take considerable time |
| 71 | for vendors to even identify what part of the code could have returned a |
| 72 | <code>GENERIC_FAILURE</code> code.</p> |
| 73 | |
| 74 | <h4 id="errorcodes-solution">Solution</h4> |
| 75 | |
| 76 | <p>OEMs should return a distinct error code value associated |
| 77 | with each of the different errors that are currently categorized as |
| 78 | <code>GENERIC_FAILURE</code>.</p> |
| 79 | |
| 80 | <p>If OEMs do not want to publicly reveal their custom error codes, they may |
| 81 | return errors as a distinct set of integers (for example, from 1 to x) that |
| 82 | are mapped as <code>OEM_ERROR_1</code> to <code>OEM_ERROR_X</code>. The |
| 83 | vendor should make sure each such masked error code returned maps to a unique |
| 84 | error reason in their code. Doing so |
| 85 | speeds up debugging of RIL issues whenever generic errors are returned |
| 86 | by the OEM. Otherwise, identifying the exact cause of a |
| 87 | <code>GENERIC_FAILURE</code> can take too long and is sometimes impossible.</p> |
| 88 | |
| 89 | <p>In <code>ril.h</code>, more error codes are |
| 90 | added for enums <code>RIL_LastCallFailCause</code> and |
| 91 | <code>RIL_DataCallFailCause</code> so that vendor code avoids returning |
| 92 | generic errors like <code>CALL_FAIL_ERROR_UNSPECIFIED</code> and |
| 93 | <code>PDP_FAIL_ERROR_UNSPECIFIED</code>.</p> |
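<p>The masking scheme described above can be sketched as a lookup table. The
following is a minimal model written in Python for brevity (real vendor RIL code
is native code). The internal reason names and the helpers
<code>mask_error</code> and <code>mapping_is_unique</code> are hypothetical;
only <code>GENERIC_FAILURE</code> and the <code>OEM_ERROR_n</code> naming come
from the text above.</p>

```python
# Hypothetical vendor-internal failure reasons; the names are made up for illustration.
INTERNAL_REASONS = [
    "modem_timeout",
    "sim_io_busy",
    "at_channel_closed",
]

# Give each internal reason its own masked code: OEM_ERROR_1, OEM_ERROR_2, ...
MASKED_CODES = {
    reason: f"OEM_ERROR_{index}"
    for index, reason in enumerate(INTERNAL_REASONS, start=1)
}


def mask_error(reason):
    """Return the masked code for a known reason, else fall back to GENERIC_FAILURE."""
    return MASKED_CODES.get(reason, "GENERIC_FAILURE")


def mapping_is_unique(mapping):
    """Check that no two internal reasons share a masked code."""
    return len(set(mapping.values())) == len(mapping)
```

<p>A uniqueness check of this kind is one way to confirm that each masked code
still points to exactly one internal reason as the table grows.</p>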
| 94 | |
| 95 | <h3 id="version">Implementing enhanced RIL versioning</h3> |
| 96 | |
| 97 | <h4 id="version-problem">Problem</h4> |
| 98 | |
| 99 | <p>RIL versioning is not accurate enough. The mechanism for vendors to |
| 100 | report their RIL version is not clear, causing vendors to report an incorrect |
| 101 | version. A workaround method of estimating the version is used, but it can |
| 102 | be inaccurate.</p> |
| 103 | |
| 104 | <h4 id="version-solution">Solution</h4> |
| 105 | |
| 106 | <p>There is a documented section in <code>ril.h</code> describing what a |
| 107 | particular RIL version value corresponds to. Each |
| 108 | RIL version is documented, including what changes correspond |
| 109 | to that version. Vendors must update the version in their code when making |
| 110 | changes corresponding to that version, and report that version during |
| 111 | <code>RIL_REGISTER</code>.</p> |
| 112 | |
| 113 | <h3 id="wakelocks">Implementing redesigned RIL communication using |
| 114 | wakelocks</h3> |
| 115 | |
| 116 | <h4 id="wakelocks-prob-sum">Problem summary</h4> |
| 117 | |
| 118 | <p>Timed wakelocks are used in RIL communication in an imprecise way, |
| 119 | which negatively affects battery performance. RIL requests can be either |
| 120 | solicited or unsolicited. Solicited requests should be classified as one of |
| 121 | the following:</p> |
| 122 | |
| 123 | <ul> |
| 124 | <li>Synchronous: Requests that respond quickly. For |
| 125 | example, <code>RIL_REQUEST_GET_SIM_STATUS</code>.</li> |
| 126 | |
| 127 | <li>Asynchronous: Requests that take considerable time to respond. For |
| 128 | example, <code>RIL_REQUEST_QUERY_AVAILABLE_NETWORKS</code>.</li> |
| 129 | </ul> |
| 130 | |
| 131 | <p>Follow these steps to implement redesigned wakelocks:</p> |
| 132 | |
| 133 | <ol> |
| 134 | <li> |
| 135 | Classify solicited RIL commands as either synchronous or asynchronous |
| 136 | depending on how much time they take to respond. |
| 137 | <p>Here are some things to consider while making |
| 138 | that decision:</p> |
| 139 | |
| 140 | <ul> |
| 141 | <li>For asynchronous solicited RIL requests (see Scenario 1, solution part 1), |
| 142 | the response takes considerable time, so RIL Java releases the wakelock |
| 143 | after receiving the acknowledgement from vendor code. This might cause the application |
| 144 | processor to go from idle to suspend state. When the response is available |
| 145 | from vendor code, RIL Java (the application processor) re-acquires the |
| 146 | wakelock, processes the response, and later returns to idle state. This |
| 147 | process of moving from idle to suspend state and back to idle can consume |
| 148 | a lot of power.</li> |
| 149 | |
| 150 | <li>If the response time is short, holding the wakelock and |
| 151 | staying in idle state for the entire time it takes to respond can be more |
| 152 | power efficient than releasing the wakelock, going into suspend state, and |
| 153 | then waking up when the response arrives. Vendors should therefore use |
| 154 | platform-specific power measurements to find the threshold time 't' beyond which |
| 155 | staying in idle state for the entire time 't' consumes |
| 156 | more power than moving from idle to suspend and back to idle in the same time |
| 157 | 't'. Once that time 't' is known, RIL commands that take longer than |
| 158 | 't' can be classified as asynchronous, and the rest of the RIL commands can |
| 159 | be classified as synchronous.</li> |
| 160 | </ul> |
| 161 | </li> |
| 162 | |
| 163 | <li>Understand the RIL communications scenarios described in the <a |
| 164 | href="#ril-comm-scenarios">RIL communication scenarios</a> section.</li> |
| 165 | |
| 166 | <li>Follow the solutions in the scenarios by modifying your code to handle |
| 167 | RIL solicited and unsolicited requests.</li> |
| 168 | </ol> |
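<p>The threshold time 't' from step 1 can be estimated with a simple
break-even calculation. The sketch below is written in Python for brevity and
assumes a linear energy model that is not part of the original text: staying idle costs
<code>idle_power * t</code>, while suspending costs a fixed transition energy
plus <code>suspend_power * t</code>. The function names and all parameters are
hypothetical, and real values must come from platform-specific measurements.</p>

```python
def break_even_seconds(idle_power_mw, suspend_power_mw, transition_energy_mj):
    """Time t at which holding the wakelock costs as much as suspending and resuming.

    Assumed model (mW * s = mJ):
      idle energy    = idle_power_mw * t
      suspend energy = transition_energy_mj + suspend_power_mw * t
    """
    if idle_power_mw <= suspend_power_mw:
        raise ValueError("idle power must exceed suspend power")
    return transition_energy_mj / (idle_power_mw - suspend_power_mw)


def classify(response_seconds, threshold_seconds):
    """Commands slower than the threshold are asynchronous, the rest synchronous."""
    if response_seconds > threshold_seconds:
        return "asynchronous"
    return "synchronous"
```

<p>For example, with placeholder values of 50 mW idle, 5 mW suspended, and 90 mJ
per suspend/resume cycle, the break-even point under this model is 2 seconds, so
a command that typically answers in 0.2 seconds would be classified as
synchronous.</p>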
| 169 | |
| 170 | <h4 id="ril-comm-scenarios">RIL communication scenarios</h4> |
| 171 | |
| 172 | <p>For implementation details of the functions used in the |
| 173 | following diagrams, see the source code of <code>ril.cpp</code>: |
| 174 | <code>acquireWakeLock()</code>, <code>decrementWakeLock()</code>, |
| 175 | <code>clearWakeLock()</code>.</p> |
| 176 | |
| 177 | <h5>Scenario 1: RIL request from Java APIs and solicited asynchronous response |
| 178 | to that request</h5> |
| 179 | |
| 180 | <p><img src="images/ril-refactor-scenario-1.png"></p> |
| 181 | |
| 182 | <h6>Problem</h6> |
| 183 | |
| 184 | <p>If the RIL solicited response is expected to take considerable time (for |
| 185 | example, <code>RIL_REQUEST_QUERY_AVAILABLE_NETWORKS</code>), then the wakelock |
| 186 | is held for a long time on the application processor side, which wastes |
| 187 | power. Modem problems can also result in a long wait.</p> |
| 188 | |
| 189 | <h6>Solution part 1</h6> |
| 190 | |
| 191 | <p>In this scenario, the wakelock equivalent is held by the modem code (RIL request |
| 192 | and asynchronous response).</p> |
| 193 | |
| 194 | <p><img src="images/ril-refactor-scenario-1-solution-1.png"></p> |
| 195 | |
| 196 | <p>As shown in the above sequence diagram:</p> |
| 197 | |
| 198 | <ol> |
| 199 | <li>The RIL request is sent, and the modem needs to acquire a wakelock to process |
| 200 | the request.</li> |
| 201 | |
| 202 | <li>The modem code sends an acknowledgement that causes the Java side to decrement |
| 203 | the wakelock counter and release the wakelock if the counter value is 0.</li> |
| 204 | |
| 205 | <li>After the modem processes the request, it sends an interrupt to the |
| 206 | vendor code, which acquires a wakelock and sends a response to <code>ril.cpp</code>. <code>ril.cpp</code> |
| 207 | then acquires a wakelock and sends a response to the Java side.</li> |
| 208 | |
| 209 | <li>When the response reaches the Java side, a wakelock is acquired and the response |
| 210 | is sent back to the caller.</li> |
| 211 | |
| 212 | <li>After that response is processed by all modules, an acknowledgement is |
| 213 | sent back to <code>ril.cpp</code> over a socket. <code>ril.cpp</code> then |
| 214 | releases the wakelock that was acquired in step 3.</li> |
| 215 | </ol> |
| 216 | |
| 217 | <p>Note that the wakelock timeout duration for the request-ack sequence |
| 218 | would be shorter than the currently used timeout duration because the ack |
| 219 | should be received fairly quickly.</p> |
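<p>The sequence above can be replayed with two counted wakelocks, one for the
Java side and one for <code>ril.cpp</code>. This is a toy model in Python, not
the platform implementation: the <code>CountedWakeLock</code> class, the
<code>async_request_trace</code> function, and the event labels are
hypothetical, and the model assumes the Java side already holds its wakelock
when the request goes out, as step 2 implies.</p>

```python
class CountedWakeLock:
    """Toy counted wakelock: considered held while the counter is above zero."""

    def __init__(self):
        self.count = 0

    def acquire(self):
        self.count += 1

    def decrement(self):
        # Never go below zero; the lock is released when the counter reaches 0.
        if self.count > 0:
            self.count -= 1

    @property
    def held(self):
        return self.count > 0


def async_request_trace():
    """Replay the steps above and record (event, java_held, native_held)."""
    java, native = CountedWakeLock(), CountedWakeLock()
    trace = []

    def snapshot(event):
        trace.append((event, java.held, native.held))

    java.acquire()      # assumed: Java holds the lock while the request goes out
    snapshot("request sent")
    java.decrement()    # step 2: ack from the modem code
    snapshot("ack received")
    native.acquire()    # step 3: ril.cpp acquires before forwarding the response
    snapshot("response reaches ril.cpp")
    java.acquire()      # step 4: Java acquires to deliver the response
    snapshot("response reaches Java")
    java.decrement()    # response handled by the Java modules
    snapshot("response processed")
    native.decrement()  # step 5: ack over the socket releases the native lock
    snapshot("ack reaches ril.cpp")
    return trace
```

<p>In this model neither side holds a wakelock between the acknowledgement and
the arrival of the response, which is the window in which the application
processor is free to suspend.</p>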
| 220 | |
| 221 | <h6>Solution part 2</h6> |
| 222 | |
| 223 | <p>In this scenario, the wakelock is not held by the modem and the response is quick |
| 224 | (synchronous RIL request and response).</p> |
| 225 | |
| 226 | <p><img src="images/ril-refactor-scenario-1-solution-2.png"></p> |
| 227 | |
| 228 | <p>As shown in the above sequence diagram:</p> |
| 229 | |
| 230 | <ol> |
| 231 | <li>The Java side calls <code>acquireWakeLock()</code> and sends the RIL |
| 232 | request.</li> |
| 233 | |
| 234 | <li>Vendor code doesn't need to acquire a wakelock and can process the request |
| 235 | and respond quickly.</li> |
| 236 | |
| 237 | <li>When the response is received by the Java side, |
| 238 | <code>decrementWakeLock()</code> is called, which decrements the wakelock counter |
| 239 | and releases the wakelock if the counter value is 0.</li> |
| 240 | </ol> |
| 241 | |
| 242 | <p>Note that this synchronous vs. asynchronous behavior is hardcoded for a |
| 243 | particular RIL command and decided on a call-by-call basis.</p> |
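<p>One way to picture the hardcoded classification is a fixed table consulted
for each request. The sketch below is a Python model with hypothetical names
(<code>ASYNC_COMMANDS</code>, <code>is_asynchronous</code>,
<code>java_hold_seconds</code>); the two command names are the examples used
earlier in this document, and treating any unlisted command as synchronous is an
assumption of the sketch.</p>

```python
# Commands hardcoded as asynchronous; everything else is treated as synchronous
# (an assumption of this sketch).
ASYNC_COMMANDS = frozenset({"RIL_REQUEST_QUERY_AVAILABLE_NETWORKS"})


def is_asynchronous(command):
    """Look up the fixed classification for a RIL command name."""
    return command in ASYNC_COMMANDS


def java_hold_seconds(command, ack_delay_s, response_delay_s):
    """How long the Java side holds its wakelock for one outgoing request.

    Asynchronous: released when the ack arrives.
    Synchronous: released when the response arrives.
    """
    if is_asynchronous(command):
        return ack_delay_s
    return response_delay_s
```

<p>With placeholder delays of 0.01 seconds for the ack and 30 seconds for a
network scan response, the Java-side hold time for the asynchronous command in
this model is 0.01 seconds rather than 30.</p>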
| 244 | |
| 245 | <h5>Scenario 2: RIL unsolicited response</h5> |
| 246 | |
| 247 | <p><img src="images/ril-refactor-scenario-2.png"></p> |
| 248 | |
| 249 | <p>As shown in the above diagram, RIL unsolicited responses have a wakelock |
| 250 | type flag in the response that indicates whether a wakelock needs to be |
| 251 | acquired or not for the particular response received from the vendor. If |
| 252 | the flag is set, then a timed wakelock is set and the response is sent over a |
| 253 | socket to the Java side. When the timer expires, the wakelock is released.</p> |
| 254 | |
| 255 | <h6>Problem</h6> |
| 256 | |
| 257 | <p>The timed wakelock illustrated in Scenario 2 could be too long or too |
| 258 | short for different RIL unsolicited responses.</p> |
| 259 | |
| 260 | <h6>Solution</h6> |
| 261 | |
| 262 | <p><img src="images/ril-refactor-scenario-2-solution.png"></p> |
| 263 | |
| 264 | <p>As shown, the problem can be solved by sending an acknowledgement from |
| 265 | the Java code to the native side (<code>ril.cpp</code>), instead of holding |
| 266 | a timed wakelock on the native side while sending an unsolicited response.</p> |
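<p>The difference between the two approaches can be illustrated by comparing
how long the native wakelock is held. This is a simplified Python model with
hypothetical function names and placeholder durations: it assumes the Java side
needs a fixed processing time per unsolicited response and that the
acknowledgement arrives as soon as processing ends.</p>

```python
def timed_hold(timeout_s, processing_s):
    """Timed wakelock: held for the full timeout regardless of processing time."""
    return {
        "held": timeout_s,
        # Time the lock stays held after Java has already finished.
        "wasted": max(0.0, timeout_s - processing_s),
        # Time Java is still processing after the lock has expired.
        "uncovered": max(0.0, processing_s - timeout_s),
    }


def ack_hold(processing_s):
    """Ack-based wakelock: released when Java acknowledges the response."""
    return {"held": processing_s, "wasted": 0.0, "uncovered": 0.0}
```

<p>In this model, a 2-second timer wastes 1.5 seconds on a response that takes
0.5 seconds to process, and a 0.5-second timer leaves 1.5 seconds uncovered on a
response that takes 2 seconds, while the ack-based hold matches the processing
time in both cases.</p>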
| 267 | |
| 268 | <h2 id="validation">Validation</h2> |
| 269 | |
| 270 | <p>The following sections describe how to validate the implementation of |
| 271 | the RIL refactoring feature's subfeatures.</p> |
| 272 | |
| 273 | <h3 id="validate-error">Validating enhanced RIL error codes</h3> |
| 274 | |
| 275 | <p>After adding new error codes to replace the <code>GENERIC_FAILURE</code> |
| 276 | code, verify that the new error codes are returned by the RIL call instead |
| 277 | of <code>GENERIC_FAILURE</code>.</p> |
| 278 | |
| 279 | <h3 id="validate-version">Validating enhanced RIL versioning</h3> |
| 280 | |
| 281 | <p>Verify that the RIL version corresponding to your RIL code is returned |
| 282 | during <code>RIL_REGISTER</code> rather than the <code>RIL_VERSION</code> |
| 283 | defined in <code>ril.h</code>.</p> |
| 284 | |
| 285 | <h3 id="validate-wakelocks">Validating redesigned wakelocks</h3> |
| 286 | |
| 287 | <p>Verify that RIL calls are identified as synchronous or asynchronous.</p> |
| 288 | |
| 289 | <p>Because battery power consumption can be hardware/platform dependent, |
| 290 | vendors should do some internal testing to find out if using the new wakelock |
| 291 | semantics for asynchronous calls leads to battery power savings.</p> |