Inlining
========

There are several options that control which calls the analyzer will consider for
inlining. The major one is -analyzer-config ipa:

  -analyzer-config ipa=none - All inlining is disabled. This is the only mode
     available in LLVM 3.1 and earlier and in Xcode 4.3 and earlier.

  -analyzer-config ipa=basic-inlining - Turns on inlining for C functions, C++
     static member functions, and blocks -- essentially, the calls that behave
     like simple C function calls. This is roughly the mode used in
     Xcode 4.4.

  -analyzer-config ipa=inlining - Turns on inlining when we can confidently find
     the function/method body corresponding to the call (C functions, static
     functions, devirtualized C++ methods, Objective-C class methods, and
     Objective-C instance methods when ExprEngine is confident about the dynamic
     type of the instance).

  -analyzer-config ipa=dynamic - Inline instance methods for which the type is
     determined at runtime and we are not 100% sure that our type info is
     correct. For virtual calls, inline the most plausible definition.

  -analyzer-config ipa=dynamic-bifurcate - Same as -analyzer-config ipa=dynamic,
     but the path is split. We inline on one branch and do not inline on the
     other. This mode does not drop the coverage in cases when the parent class
     has code that is only exercised when some of its methods are overridden.

Currently, -analyzer-config ipa=dynamic-bifurcate is the default mode.
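
As a rough illustration of the call kinds these modes distinguish, the sketch
below annotates each call site with the least aggressive ipa mode under which,
going by the descriptions above, it would be considered for inlining. All names
(twice, Shape, Square, knownType, unknownType) are made up for this example,
and the analyzer may still decline to inline any particular call.

```cpp
#include <cassert>

// Plain C-style function: considered under ipa=basic-inlining and above.
static int twice(int x) { return 2 * x; }

struct Shape {
  virtual ~Shape() {}
  // C++ static member function: behaves like a simple C function call
  // (ipa=basic-inlining and above).
  static int sides() { return 0; }
  // Virtual method: the definition that runs depends on the dynamic type.
  virtual int area() const { return 0; }
};

struct Square : Shape {
  int side;
  explicit Square(int s) : side(s) {}
  int area() const override { return side * side; }
};

int knownType() {
  Square sq(3);
  // The dynamic type of 'sq' is known exactly, so the virtual call can be
  // devirtualized (ipa=inlining and above).
  return sq.area() + twice(Shape::sides());
}

int unknownType(const Shape &s) {
  // Only the static type is known here. ipa=dynamic would inline the most
  // plausible definition; ipa=dynamic-bifurcate would also explore a path
  // on which the call is not inlined.
  return s.area();
}

void demo() {
  Shape base;
  assert(knownType() == 9);
  assert(unknownType(base) == 0);
  assert(unknownType(Square(2)) == 4);
}
```

The two callers of unknownType in demo() show why a single guess at the
definition can be wrong: the same call site reaches different bodies.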

While -analyzer-config ipa determines in general how aggressively the analyzer
will try to inline functions, several additional options control which types of
functions can be inlined, in an all-or-nothing way. These options use the
analyzer's configuration table, so they are all specified as follows:

    -analyzer-config OPTION=VALUE

### c++-inlining ###

This option controls which C++ member functions may be inlined.

    -analyzer-config c++-inlining=[none | methods | constructors | destructors]

Each of these modes implies that all the previous member function kinds will be
inlined as well; it doesn't make sense to inline destructors without inlining
constructors, for example.

The default c++-inlining mode is 'destructors', meaning that all member
functions with visible definitions will be considered for inlining. In some
cases the analyzer may still choose not to inline the function.

Note that under 'constructors', constructors for types with non-trivial
destructors will not be inlined. Additionally, no C++ member functions will be
inlined under -analyzer-config ipa=none or -analyzer-config ipa=basic-inlining,
regardless of the setting of the c++-inlining mode.
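
The cumulative ordering of the c++-inlining values can be modeled as an ordered
enumeration. The following is a minimal sketch with hypothetical names
(MemberKind, mayInlineMember); it is not taken from the analyzer's sources, and
it deliberately ignores the ipa setting and the non-trivial-destructor
exception noted above.

```cpp
#include <cassert>

// Hypothetical model of the c++-inlining values, ordered so that each value
// implies all the previous ones.
enum class MemberKind { None = 0, Methods = 1, Constructors = 2, Destructors = 3 };

// Returns true if a member function of kind 'kind' may be considered for
// inlining when the option is set to 'mode'.
constexpr bool mayInlineMember(MemberKind kind, MemberKind mode) {
  return kind != MemberKind::None &&
         static_cast<int>(kind) <= static_cast<int>(mode);
}

void demo() {
  // The default mode 'destructors' covers every kind.
  assert(mayInlineMember(MemberKind::Methods, MemberKind::Destructors));
  assert(mayInlineMember(MemberKind::Destructors, MemberKind::Destructors));
  // 'constructors' covers methods and constructors, but not destructors.
  assert(mayInlineMember(MemberKind::Methods, MemberKind::Constructors));
  assert(!mayInlineMember(MemberKind::Destructors, MemberKind::Constructors));
  // 'none' covers nothing.
  assert(!mayInlineMember(MemberKind::Methods, MemberKind::None));
}
```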

### c++-template-inlining ###

This option controls whether C++ templated functions may be inlined.

    -analyzer-config c++-template-inlining=[true | false]

Currently, template functions are considered for inlining by default.

The motivation behind this option is that very generic code can be a source
of false positives, either by considering paths that the caller considers
impossible (by some unstated precondition), or by inlining some but not all
of a deep implementation of a function.
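
A small, made-up example of the "unstated precondition" case: the generic
helper below has a failure path that its caller never expects to take. The
names findOrNull and scaled are hypothetical. Whether the analyzer actually
reports anything here depends on the checkers enabled and on its inlining
decisions.

```cpp
#include <cassert>
#include <cstddef>

// Generic lookup: returns a pointer to the first element equal to 'key',
// or nullptr if there is none.
template <typename T, std::size_t N>
const T *findOrNull(const T (&items)[N], const T &key) {
  for (std::size_t i = 0; i < N; ++i)
    if (items[i] == key)
      return &items[i];
  return nullptr;
}

// The caller relies on an unstated precondition: 'key' is always one of the
// known codes, so the result is never null. If findOrNull is inlined, the
// 'return nullptr' path is explored as well, and the dereference below can
// look like a possible null dereference.
int scaled(int key) {
  static const int codes[] = {10, 20, 30};
  const int *hit = findOrNull(codes, key);
  return *hit / 10;
}

void demo() {
  assert(scaled(20) == 2);
}
```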

### c++-stdlib-inlining ###

This option controls whether functions from the C++ standard library, including
methods of the container classes in the Standard Template Library, should be
considered for inlining.

    -analyzer-config c++-stdlib-inlining=[true | false]

Currently, C++ standard library functions are considered for inlining by
default.

The standard library functions and the STL in particular are used ubiquitously
enough that our tolerance for false positives is even lower here. A false
positive due to poor modeling of the STL leads to a poor user experience, since
most users would not be comfortable adding assertions to system headers in order
to silence analyzer warnings.

### c++-container-inlining ###

This option controls whether constructors and destructors of "container" types
should be considered for inlining.

    -analyzer-config c++-container-inlining=[true | false]

Currently, these constructors and destructors are NOT considered for inlining
by default.

The current implementation of this setting checks whether a type has a member
named 'iterator' or a member named 'begin'; these names are idiomatic in C++,
with the latter specified in the C++11 standard. The analyzer currently does a
fairly poor job of modeling certain data structure invariants of container-like
objects. For example, these three expressions should be equivalent:

    std::distance(c.begin(), c.end()) == 0
    c.begin() == c.end()
    c.empty()

Many of these issues are avoided if containers always have unknown, symbolic
state, which is what happens when their constructors are treated as opaque.
In the future, we may decide specific containers are "safe" to model through
inlining, or choose to model them directly using checkers instead.
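
For reference, the equivalence of the three expressions is easy to check at
runtime for a standard container. The helper below (emptinessChecksAgree, a
name invented for this sketch) only shows the invariant the analyzer would
need to preserve; it says nothing about how the analyzer models containers.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Returns true when the three emptiness tests from the text agree on 'c'.
template <typename Container>
bool emptinessChecksAgree(const Container &c) {
  const bool byDistance = std::distance(c.begin(), c.end()) == 0;
  const bool byIterators = c.begin() == c.end();
  const bool byEmpty = c.empty();
  return byDistance == byIterators && byIterators == byEmpty;
}

void demo() {
  std::vector<int> v;
  assert(emptinessChecksAgree(v)); // empty container
  v.push_back(1);
  assert(emptinessChecksAgree(v)); // non-empty container
}
```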


Basics of Implementation
------------------------

The low-level mechanism of inlining a function is handled in
ExprEngine::inlineCall and ExprEngine::processCallExit.

If the conditions are right for inlining, a CallEnter node is created and added
to the analysis work list. The CallEnter node marks the change to a new
LocationContext representing the called function, and its state includes the
contents of the new stack frame. When the CallEnter node is actually processed,
its single successor will be an edge to the first CFG block in the function.

Exiting an inlined function is a bit more work, fortunately broken up into
reasonable steps:

1. The CoreEngine realizes we're at the end of an inlined call and generates a
   CallExitBegin node.

2. ExprEngine takes over (in processCallExit) and finds the return value of the
   function, if it has one. This is bound to the expression that triggered the
   call. (In the case of calls without origin expressions, such as destructors,
   this step is skipped.)

3. Dead symbols and bindings are cleaned out from the state, including any local
   bindings.

4. A CallExitEnd node is generated, which marks the transition back to the
   caller's LocationContext.

5. Custom post-call checks are processed and the final nodes are pushed back
   onto the work list, so that evaluation of the caller can continue.
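
The ordering of these program points can be summarized with a toy model. The
ToyEngine type below is purely illustrative: it records labels in the order
described above and does not use or mirror the real ExprEngine or CoreEngine
classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model that records the sequence of events for one inlined call.
struct ToyEngine {
  std::vector<std::string> trace;
  int depth = 0; // number of inlined frames currently active

  void inlineCall(bool hasReturnValue) {
    trace.push_back("CallEnter");       // switch to the callee's context
    ++depth;
    trace.push_back("body");            // the callee's CFG blocks are processed
    trace.push_back("CallExitBegin");   // step 1
    if (hasReturnValue)
      trace.push_back("bind return");   // step 2 (skipped for destructors)
    trace.push_back("clean dead");      // step 3
    --depth;
    trace.push_back("CallExitEnd");     // step 4: back in the caller's context
    trace.push_back("post-call");       // step 5
  }
};

void demo() {
  ToyEngine e;
  e.inlineCall(false);
  assert(e.trace.front() == "CallEnter");
  assert(e.trace.back() == "post-call");
  assert(e.trace.size() == 6);
  assert(e.depth == 0);
}
```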

Retry Without Inlining
----------------------

In some cases, we would like to retry analysis without inlining a particular
call.

Currently, we use this technique to recover coverage in case we stop
analyzing a path due to exceeding the maximum block count inside an inlined
function.

When this situation is detected, we walk up the path to find the first node
before inlining was started and enqueue it on the WorkList with a special
ReplayWithoutInlining bit added to it (ExprEngine::replayWithoutInlining). The
path is then re-analyzed from that point without inlining that particular call.
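
The backward walk can be pictured with a toy path representation. In the
sketch below, ToyNode, frameDepth, and markReplayPoint are invented for
illustration; the real analyzer works on ExplodedNodes and LocationContexts,
and its bookkeeping is more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy path node: 'frameDepth' counts the inlined frames active at the node.
struct ToyNode {
  int frameDepth;
  bool replayWithoutInlining;
};

// Walks backwards from the last node (where the block count was exceeded)
// to the closest earlier node outside the current inlined frame, marks it
// for replay, and returns its index. Returns path.size() if none exists.
std::size_t markReplayPoint(std::vector<ToyNode> &path) {
  if (path.empty())
    return path.size();
  const int stuckDepth = path.back().frameDepth;
  for (std::size_t i = path.size(); i-- > 0;) {
    if (path[i].frameDepth < stuckDepth) {
      path[i].replayWithoutInlining = true;
      return i;
    }
  }
  return path.size();
}

void demo() {
  // Two nodes in the caller, then three nodes inside the inlined callee.
  std::vector<ToyNode> path = {
      {0, false}, {0, false}, {1, false}, {1, false}, {1, false}};
  assert(markReplayPoint(path) == 1);
  assert(path[1].replayWithoutInlining);
  assert(!path[0].replayWithoutInlining);
}
```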

Deciding When to Inline
-----------------------

In general, the analyzer attempts to inline as much as possible, since it
provides a better summary of what actually happens in the program. There are
some cases, however, where the analyzer chooses not to inline:

- If there is no definition available for the called function or method. In
  this case, there is no opportunity to inline.

- If the CFG cannot be constructed for a called function, or the liveness
  cannot be computed. These are prerequisites for analyzing a function body,
  with or without inlining.

- If the LocationContext chain for a given ExplodedNode reaches a maximum cutoff
  depth. This prevents unbounded analysis due to infinite recursion, but also
  serves as a useful cutoff for performance reasons.

- If the function is variadic. This is not a hard limitation, but an engineering
  limitation.

  Tracked by: <rdar://problem/12147064> Support inlining of variadic functions

- In C++, constructors are not inlined unless the destructor call will be
  processed by the ExprEngine. Thus, if the CFG was built without nodes for
  implicit destructors, or if the destructors for the given object are not
  represented in the CFG, the constructor will not be inlined. (As an exception,
  constructors for objects with trivial destructors can still be inlined.)
  See "C++ Caveats" below.

- In C++, ExprEngine does not inline custom implementations of operator 'new'
  or operator 'delete', nor does it inline the constructors and destructors
  associated with these. See "C++ Caveats" below.

- Calls resulting in "dynamic dispatch" are specially handled. See more below.

- The FunctionSummaries map stores additional information about declarations,
  some of which is collected at runtime based on previous analyses.
  We do not inline functions which were not profitable to inline in a different
  context (for example, if the maximum block count was exceeded; see
  "Retry Without Inlining").
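
Two of the cases in the list above are easy to show in source form. The
functions below (sum and depthOf) are made-up examples of a variadic function
and a recursive function; the comments restate the rules from the list and are
not a description of analyzer output.

```cpp
#include <cassert>
#include <cstdarg>

// Variadic function: per the list above, currently not inlined, so a call
// to it is evaluated conservatively.
int sum(int count, ...) {
  va_list args;
  va_start(args, count);
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += va_arg(args, int);
  va_end(args);
  return total;
}

// Recursive function: each inlined call adds a LocationContext to the chain,
// so inlining stops once the maximum cutoff depth is reached.
int depthOf(int n) { return n <= 0 ? 0 : 1 + depthOf(n - 1); }

void demo() {
  assert(sum(3, 1, 2, 3) == 6);
  assert(depthOf(4) == 4);
}
```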


Dynamic Calls and Devirtualization
----------------------------------

"Dynamic" calls are those that are resolved at runtime, such as C++ virtual
method calls and Objective-C message sends. Due to the path-sensitive nature of
the analysis, the analyzer may be able to reason about the dynamic type of the
object whose method is being called and thus "devirtualize" the call.

This path-sensitive devirtualization occurs when the analyzer can determine what
method would actually be called at runtime. This is possible when the type
information is constrained enough for a simulated C++/Objective-C object that
the analyzer can make such a decision.

== DynamicTypeInfo ==

As the analyzer analyzes a path, it may accrue information to refine the
knowledge about the type of an object. This can then be used to make better
decisions about the target method of a call.

Such type information is tracked as DynamicTypeInfo. This is path-sensitive
data that is stored in ProgramState, which defines a mapping from MemRegions to
an (optional) DynamicTypeInfo.

If no DynamicTypeInfo has been explicitly set for a MemRegion, it will be lazily
inferred from the region's type or associated symbol. Information from symbolic
regions is weaker than from true typed regions.

  EXAMPLE: A C++ object declared "A obj" is known to have the class 'A', but a
           reference "A &ref" may dynamically be a subclass of 'A'.
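
The difference between the two declarations can be seen in a few lines of C++.
The classes and the id() method below are made up for this illustration.

```cpp
#include <cassert>

struct A {
  virtual ~A() {}
  virtual int id() const { return 1; }
};

struct B : A {
  int id() const override { return 2; }
};

int viaObject() {
  A obj;           // the dynamic type is exactly 'A'
  return obj.id(); // always A::id
}

int viaReference(const A &ref) {
  return ref.id(); // A::id or an override, depending on what the caller passes
}

void demo() {
  B b;
  assert(viaObject() == 1);
  assert(viaReference(A()) == 1);
  assert(viaReference(b) == 2);
}
```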

The DynamicTypePropagation checker gathers and propagates DynamicTypeInfo,
updating it as information is observed along a path that can refine that type
information for a region.

  WARNING: Not all of the existing analyzer code has been retrofitted to use
           DynamicTypeInfo, nor is it universally appropriate. In particular,
           DynamicTypeInfo always applies to a region with all casts stripped
           off, but sometimes the information provided by casts can be useful.


== RuntimeDefinition ==

The basis of devirtualization is CallEvent's getRuntimeDefinition() method,
which returns a RuntimeDefinition object. When asked to provide a definition,
the CallEvents for dynamic calls will use the DynamicTypeInfo in their
ProgramState to attempt to devirtualize the call. In the case of no dynamic
dispatch, or perfectly constrained devirtualization, the resulting
RuntimeDefinition contains a Decl corresponding to the definition of the called
function, and RuntimeDefinition::mayHaveOtherDefinitions will return FALSE.

In the case of dynamic dispatch where our information is not perfect, CallEvent
can make a guess, but RuntimeDefinition::mayHaveOtherDefinitions will return
TRUE. The RuntimeDefinition object will then also include a MemRegion
corresponding to the object being called (i.e., the "receiver" in Objective-C
parlance), which ExprEngine uses to decide whether or not the call should be
inlined.
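
One way to picture the two cases is a small value type that carries a
definition and, only when the definition is a guess, the receiver's region. The
ToyDecl, ToyRegion, and ToyRuntimeDefinition types below are stand-ins written
for this sketch. Tying mayHaveOtherDefinitions() to the presence of a region is
an assumption made here for simplicity, not a statement about the real class.

```cpp
#include <cassert>

struct ToyDecl {};   // stands in for a function or method definition
struct ToyRegion {}; // stands in for the receiver's memory region

// Simplified stand-in for the result of a runtime definition lookup.
class ToyRuntimeDefinition {
  const ToyDecl *decl;
  const ToyRegion *region; // set only when the definition is a guess

public:
  explicit ToyRuntimeDefinition(const ToyDecl *d, const ToyRegion *r = nullptr)
      : decl(d), region(r) {}

  const ToyDecl *getDecl() const { return decl; }
  const ToyRegion *getRegion() const { return region; }

  // Assumption of this sketch: a guess is exactly a result with a region.
  bool mayHaveOtherDefinitions() const { return region != nullptr; }
};

void demo() {
  ToyDecl method;
  ToyRegion receiver;
  ToyRuntimeDefinition exact(&method);           // perfectly constrained
  ToyRuntimeDefinition guess(&method, &receiver); // imprecise type info
  assert(!exact.mayHaveOtherDefinitions());
  assert(guess.mayHaveOtherDefinitions());
  assert(guess.getRegion() == &receiver);
}
```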

== Inlining Dynamic Calls ==

The -analyzer-config ipa option has five different modes: none, basic-inlining,
inlining, dynamic, and dynamic-bifurcate. Under -analyzer-config ipa=dynamic,
all dynamic calls are inlined, whether we are certain or not that this will
actually be the definition used at runtime. Under -analyzer-config ipa=inlining,
only "near-perfect" devirtualized calls are inlined*, and other dynamic calls
are evaluated conservatively (as if no definition were available).

* Currently, no Objective-C messages are inlined under
  -analyzer-config ipa=inlining, even if we are reasonably confident of the type
  of the receiver. We plan to enable this once we have tested our heuristics
  more thoroughly.

The last option, -analyzer-config ipa=dynamic-bifurcate, behaves similarly to
"dynamic", but performs a conservative invalidation in the general virtual case
in *addition* to inlining. The details of this are discussed below.

As stated above, -analyzer-config ipa=basic-inlining does not inline any C++
member functions or Objective-C method calls, even if they are non-virtual or
can be safely devirtualized.


Bifurcation
-----------

ExprEngine::BifurcateCall implements the -analyzer-config ipa=dynamic-bifurcate
mode.

When a call is made on an object with imprecise dynamic type information
(RuntimeDefinition::mayHaveOtherDefinitions() evaluates to TRUE), ExprEngine
bifurcates the path and marks the object's region (retrieved from the
RuntimeDefinition object) with a path-sensitive "mode" in the ProgramState.

Currently, there are two modes:

  DynamicDispatchModeInlined - Models the case where the dynamic type information
     of the receiver (MemRegion) is assumed to be perfectly constrained so
     that a given definition of a method is expected to be the code actually
     called. When this mode is set, ExprEngine uses the Decl from
     RuntimeDefinition to inline any dynamically dispatched call sent to this
     receiver because the function definition is considered to be fully resolved.

  DynamicDispatchModeConservative - Models the case where the dynamic type
     information is assumed to be incorrect; for example, the method definition
     may be overridden in a subclass. In such cases, ExprEngine does not
     inline the methods sent to the receiver (MemRegion), even if a candidate
     definition is available. This mode is conservative about simulating the
     effects of a call.

Going forward along the symbolic execution path, ExprEngine consults the mode
of the receiver's MemRegion to make decisions on whether the calls should be
inlined or not, which ensures that there is at most one split per region.

At a high level, "bifurcation mode" allows for increased semantic coverage in
cases where the parent method contains code which is only executed when the
class is subclassed. The disadvantages of this mode are a possibly considerable
performance hit and the possibility of false positives on the path where the
conservative mode is used.
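
The coverage argument is easiest to see with a parent class whose code is only
reachable when a hook is overridden. Parser and LenientParser below are
hypothetical classes written for this illustration. If the analyzer always
inlined Parser::wantsRecovery for a receiver of static type Parser, the
recover() branch would never be explored; on the conservative branch the
result of the hook is unknown, so that code is covered.

```cpp
#include <cassert>

struct Parser {
  virtual ~Parser() {}
  // Default hook: never requests recovery.
  virtual bool wantsRecovery() const { return false; }

  int parse(int input) const {
    if (input < 0 && wantsRecovery())
      return recover(input); // reachable only if a subclass overrides the hook
    return input;
  }

private:
  int recover(int input) const { return -input; }
};

struct LenientParser : Parser {
  bool wantsRecovery() const override { return true; }
};

void demo() {
  assert(Parser().parse(-5) == -5);       // hook not overridden
  assert(LenientParser().parse(-5) == 5); // parent code reached via override
}
```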

Objective-C Message Heuristics
------------------------------

ExprEngine relies on a set of heuristics to partition the set of Objective-C
method calls into those that require bifurcation and those that do not. Below
are the cases when the DynamicTypeInfo of the object is considered precise
(cannot be a subclass):

- If the object was created with +alloc or +new and initialized with an -init
  method.

- If the calls are property accesses using dot syntax. This is based on the
  assumption that children rarely override properties, or do so in an
  essentially compatible way.

- If the class interface is declared inside the main source file. In this case
  it is unlikely that it will be subclassed.

- If the method is not declared outside of the main source file, either by the
  receiver's class or by any superclasses.

C++ Caveats
-----------

C++11 [class.cdtor]p4 describes how the vtable of an object is modified as it is
being constructed or destructed; that is, the type of the object depends on
which base constructors have been completed. This is tracked using
DynamicTypeInfo in the DynamicTypePropagation checker.
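
The language rule itself can be observed directly: a virtual call made while a
base class constructor is running dispatches to the base class version. Base
and Derived below are made-up types for this illustration.

```cpp
#include <cassert>

struct Base {
  int seenDuringConstruction;
  Base() { seenDuringConstruction = kind(); } // virtual call in a constructor
  virtual ~Base() {}
  virtual int kind() const { return 1; }
};

struct Derived : Base {
  int kind() const override { return 2; }
};

void demo() {
  Derived d;
  assert(d.seenDuringConstruction == 1); // Base::kind ran during Base()
  assert(d.kind() == 2);                 // Derived::kind after construction
}
```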

There are several limitations in the current implementation:

- Temporaries are poorly modeled right now because we're not confident in the
  placement of their destructors in the CFG. We currently won't inline their
  constructors unless the destructor is trivial, and don't process their
  destructors at all, not even to invalidate the region.

- 'new' is poorly modeled due to some nasty CFG/design issues. This is tracked
  in PR12014. 'delete' is not modeled at all.

- Arrays of objects are modeled very poorly right now. ExprEngine currently
  only simulates the first constructor and first destructor. Because of this,
  ExprEngine does not inline any constructors or destructors for arrays.
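
For comparison with the limitations above, this is what actually happens at
runtime for an array and for a temporary. The Tracked type is a made-up
counter class; the sketch shows only the program's real behavior, which is
what a complete model would have to reproduce.

```cpp
#include <cassert>

struct Tracked {
  static int constructed;
  static int destroyed;
  Tracked() { ++constructed; }
  ~Tracked() { ++destroyed; }
};
int Tracked::constructed = 0;
int Tracked::destroyed = 0;

void useArray() {
  Tracked items[3]; // three constructor calls here, three destructor calls
  (void)items;      // when the array goes out of scope
}

void useTemporary() {
  Tracked(); // temporary: destroyed at the end of the full-expression
}

void demo() {
  useArray();
  assert(Tracked::constructed == 3 && Tracked::destroyed == 3);
  useTemporary();
  assert(Tracked::constructed == 4 && Tracked::destroyed == 4);
}
```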


CallEvent
=========

A CallEvent represents a specific call to a function, method, or other body of
code. It is path-sensitive, containing both the current state (ProgramStateRef)
and stack space (LocationContext), and provides uniform access to the argument
values and return type of a call, no matter how the call is written in the
source or what sort of code body is being invoked.

  NOTE: For those familiar with Cocoa, CallEvent is roughly equivalent to
        NSInvocation.

CallEvent should be used whenever there is logic dealing with function calls
that does not care how the call occurred.

Examples include checking that arguments satisfy preconditions (such as
__attribute__((nonnull))), and attempting to inline a call.

CallEvents are reference-counted objects managed by a CallEventManager. While
there is no inherent issue with persisting them (say, in a ProgramState's GDM),
they are intended for short-lived use, and can be recreated from CFGElements or
non-top-level StackFrameContexts fairly easily.