Inlining
========

There are several options that control which calls the analyzer will consider for
inlining. The major one is -analyzer-config ipa:

  -analyzer-config ipa=none - All inlining is disabled. This is the only mode
     available in LLVM 3.1 and earlier and in Xcode 4.3 and earlier.

  -analyzer-config ipa=basic-inlining - Turns on inlining for C functions, C++
     static member functions, and blocks -- essentially, the calls that behave
     like simple C function calls. This is essentially the mode used in
     Xcode 4.4.

  -analyzer-config ipa=inlining - Turns on inlining when we can confidently find
     the function/method body corresponding to the call (C functions, static
     functions, devirtualized C++ methods, Objective-C class methods, and
     Objective-C instance methods when ExprEngine is confident about the dynamic
     type of the instance).

  -analyzer-config ipa=dynamic - Inline instance methods for which the type is
     determined at runtime and we are not 100% sure that our type info is
     correct. For virtual calls, inline the most plausible definition.

  -analyzer-config ipa=dynamic-bifurcate - Same as -analyzer-config ipa=dynamic,
     but the path is split. We inline on one branch and do not inline on the
     other. This mode does not drop the coverage in cases when the parent class
     has code that is only exercised when some of its methods are overridden.

Currently, -analyzer-config ipa=dynamic-bifurcate is the default mode.
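
The way each mode widens the set of inlining candidates can be pictured with a
small toy model. This is only a sketch under stated assumptions: the names
IPAMode, CallKind, isInliningCandidate, and splitsPath are hypothetical and do
not come from the analyzer, and the three call kinds are a deliberate
simplification of the categories listed above.

```cpp
// Toy model of the -analyzer-config ipa levels; not the analyzer's own code.
// All names below are hypothetical and exist only for illustration.
enum class IPAMode { None, BasicInlining, Inlining, Dynamic, DynamicBifurcate };

enum class CallKind {
  CLikeCall,        // C function, C++ static member function, or block
  ResolvedMethod,   // method whose body can be found with confidence
  UncertainDynamic  // dynamic call whose target is only a best guess
};

// Returns true if a call of the given kind is a candidate for inlining
// under the given mode. Each mode includes everything the previous one allows.
bool isInliningCandidate(IPAMode mode, CallKind kind) {
  switch (kind) {
  case CallKind::CLikeCall:
    return mode >= IPAMode::BasicInlining;
  case CallKind::ResolvedMethod:
    return mode >= IPAMode::Inlining;
  case CallKind::UncertainDynamic:
    return mode >= IPAMode::Dynamic;
  }
  return false;
}

// Only dynamic-bifurcate also explores a second branch that does not inline.
bool splitsPath(IPAMode mode, CallKind kind) {
  return mode == IPAMode::DynamicBifurcate &&
         kind == CallKind::UncertainDynamic;
}
```

In this model, isInliningCandidate(IPAMode::BasicInlining,
CallKind::ResolvedMethod) is false, while the same call kind becomes a
candidate from IPAMode::Inlining upward.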

While -analyzer-config ipa determines in general how aggressively the analyzer
will try to inline functions, several additional options control which types of
functions can be inlined, in an all-or-nothing way. These options use the
analyzer's configuration table, so they are all specified as follows:

    -analyzer-config OPTION=VALUE

### c++-inlining ###

This option controls which C++ member functions may be inlined.

    -analyzer-config c++-inlining=[none | methods | constructors | destructors]

Each of these modes implies that all the previous member function kinds will be
inlined as well; it doesn't make sense to inline destructors without inlining
constructors, for example.

The default c++-inlining mode is 'methods', meaning only regular member
functions and overloaded operators will be inlined. Note that no C++ member
functions will be inlined under -analyzer-config ipa=none or
-analyzer-config ipa=basic-inlining.
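
The cumulative nature of the c++-inlining values can be sketched as an ordered
enumeration. The names CXXInlining, MemberKind, and mayInlineMember are
hypothetical, chosen for this illustration only; the sketch also ignores the
interaction with the ipa mode described above.

```cpp
// Hypothetical model of the cumulative c++-inlining levels.
// The enumerators are ordered so that each level includes the previous ones.
enum class CXXInlining { None, Methods, Constructors, Destructors };

enum class MemberKind { Method, Constructor, Destructor };

// Returns true if a member function of the given kind may be inlined
// at the given level.
bool mayInlineMember(CXXInlining level, MemberKind kind) {
  switch (kind) {
  case MemberKind::Method:
    return level >= CXXInlining::Methods;
  case MemberKind::Constructor:
    return level >= CXXInlining::Constructors;
  case MemberKind::Destructor:
    return level >= CXXInlining::Destructors;
  }
  return false;
}
```

With the default level, CXXInlining::Methods, only MemberKind::Method passes
the check; choosing CXXInlining::Destructors admits all three kinds.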

### c++-template-inlining ###

This option controls whether C++ templated functions may be inlined.

    -analyzer-config c++-template-inlining=[true | false]

Currently, template functions are considered for inlining by default.

The motivation behind this option is that very generic code can be a source
of false positives, either by considering paths that the caller considers
impossible (by some unstated precondition), or by inlining some but not all
of a deep implementation of a function.
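
As an illustration of the first kind of false positive, consider a generic
helper that defensively checks for null. The functions sizeOrZero and twiceSize
are invented for this example, and whether the analyzer actually reports
anything here depends on its version and configuration; the code only shows the
shape of the problem.

```cpp
#include <cstddef>
#include <string>

// Generic helper that defensively tolerates a null pointer.
template <typename T>
std::size_t sizeOrZero(const T *p) {
  if (!p)        // once inlined, this check suggests that p may be null
    return 0;
  return p->size();
}

// The caller's unstated precondition is that s is never null.
std::size_t twiceSize(const std::string *s) {
  std::size_t n = sizeOrZero(s);
  // On the inlined path where the check above assumed a null pointer,
  // this dereference looks like a bug even though the caller rules it out.
  return n + s->size();
}
```

Adding an assertion such as assert(s) at the top of twiceSize would state the
precondition explicitly.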

### c++-stdlib-inlining ###

This option controls whether functions from the C++ standard library, including
methods of the container classes in the Standard Template Library, should be
considered for inlining.

    -analyzer-config c++-stdlib-inlining=[true | false]

Currently, C++ standard library functions are NOT considered for inlining by
default.

The standard library functions and the STL in particular are used ubiquitously
enough that our tolerance for false positives is even lower here. A false
positive due to poor modeling of the STL leads to a poor user experience, since
most users would not be comfortable adding assertions to system headers in order
to silence analyzer warnings.


Basics of Implementation
------------------------

The low-level mechanism of inlining a function is handled in
ExprEngine::inlineCall and ExprEngine::processCallExit.

If the conditions are right for inlining, a CallEnter node is created and added
to the analysis work list. The CallEnter node marks the change to a new
LocationContext representing the called function, and its state includes the
contents of the new stack frame. When the CallEnter node is actually processed,
its single successor will be an edge to the first CFG block in the function.

Exiting an inlined function is a bit more work, fortunately broken up into
reasonable steps:

1. The CoreEngine realizes we're at the end of an inlined call and generates a
   CallExitBegin node.

2. ExprEngine takes over (in processCallExit) and finds the return value of the
   function, if it has one. This is bound to the expression that triggered the
   call. (In the case of calls without origin expressions, such as destructors,
   this step is skipped.)

3. Dead symbols and bindings are cleaned out from the state, including any local
   bindings.

4. A CallExitEnd node is generated, which marks the transition back to the
   caller's LocationContext.

5. Custom post-call checks are processed and the final nodes are pushed back
   onto the work list, so that evaluation of the caller can continue.
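
The enter and exit sequence above can be mimicked with a toy state that keeps a
stack of frames and a trace of program points. This is a loose sketch, not a
rendering of ExprEngine: Frame, ToyState, enterCall, and exitCall are
hypothetical, values are plain ints, and the "PostCall" entry merely stands in
for the post-call checks of step 5.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, greatly simplified stand-ins for analyzer concepts.
struct Frame {
  std::string callee;                 // function this frame belongs to
  std::map<std::string, int> locals;  // bindings local to the frame
};

struct ToyState {
  std::vector<Frame> stack;             // innermost frame is stack.back()
  std::map<std::string, int> exprVals;  // values bound to caller expressions
  std::vector<std::string> trace;       // program points visited, in order
};

// Entering: record CallEnter and push the contents of the new stack frame.
void enterCall(ToyState &s, const std::string &callee,
               std::map<std::string, int> args) {
  s.trace.push_back("CallEnter");
  s.stack.push_back(Frame{callee, std::move(args)});
}

// Exiting: follows steps 1 to 5 above. An empty originExpr models a call
// without an origin expression, for which the return value is not bound.
void exitCall(ToyState &s, const std::string &originExpr, int returnValue) {
  s.trace.push_back("CallExitBegin");     // step 1
  if (!originExpr.empty())                // step 2: bind the return value
    s.exprVals[originExpr] = returnValue;
  s.stack.back().locals.clear();          // step 3: drop dead local bindings
  s.stack.pop_back();                     // step 4: back to the caller's frame
  s.trace.push_back("CallExitEnd");
  s.trace.push_back("PostCall");          // step 5: post-call checks
}
```

A caller frame must already be on the stack before enterCall is used, so that
exitCall leaves the state in the caller's frame.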

Retry Without Inlining
----------------------

In some cases, we would like to retry analysis without inlining a particular
call.

Currently, we use this technique to recover coverage in case we stop
analyzing a path due to exceeding the maximum block count inside an inlined
function.

When this situation is detected, we walk up the path to find the first node
before inlining was started and enqueue it on the WorkList with a special
ReplayWithoutInlining bit added to it (ExprEngine::replayWithoutInlining). The
path is then re-analyzed from that point without inlining that particular call.
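
The walk up the path can be sketched on a miniature graph in which every node
records its predecessor and the number of inlined frames active at that point.
PathNode and findReplayPoint are hypothetical names, and using a depth counter
to recognize "before inlining was started" is an assumption of this sketch, not
a description of ExprEngine::replayWithoutInlining.

```cpp
#include <vector>

// Hypothetical miniature of a path through the exploded graph.
struct PathNode {
  int pred;                    // index of the predecessor, or -1 at the root
  int inlineDepth;             // number of inlined frames active at this node
  bool replayWithoutInlining;  // set on the node chosen for the retry
};

// Walks up from `from` (where the block-count limit was exceeded) to the first
// node that lies outside the innermost inlined call, flags it, and returns its
// index so that the caller can re-enqueue it. Returns -1 if there is none.
int findReplayPoint(std::vector<PathNode> &path, int from) {
  const int depth = path[from].inlineDepth;
  for (int i = path[from].pred; i != -1; i = path[i].pred) {
    if (path[i].inlineDepth < depth) {
      path[i].replayWithoutInlining = true;
      return i;
    }
  }
  return -1;
}
```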

Deciding When to Inline
-----------------------

In general, the analyzer attempts to inline as much as possible, since it
provides a better summary of what actually happens in the program. There are
some cases, however, where the analyzer chooses not to inline:

- If there is no definition available for the called function or method. In
  this case, there is no opportunity to inline.

- If the CFG cannot be constructed for a called function, or the liveness
  cannot be computed. These are prerequisites for analyzing a function body,
  with or without inlining.

- If the LocationContext chain for a given ExplodedNode reaches a maximum cutoff
  depth. This prevents unbounded analysis due to infinite recursion, but also
  serves as a useful cutoff for performance reasons.

- If the function is variadic. This is not a hard limitation, but an engineering
  limitation.

  Tracked by: <rdar://problem/12147064> Support inlining of variadic functions

- In C++, constructors are not inlined unless the destructor call will be
  processed by the ExprEngine. Thus, if the CFG was built without nodes for
  implicit destructors, or if the destructors for the given object are not
  represented in the CFG, the constructor will not be inlined. (As an exception,
  constructors for objects with trivial destructors can still be inlined.)
  See "C++ Caveats" below.

- In C++, ExprEngine does not inline custom implementations of operator 'new'
  or operator 'delete', nor does it inline the constructors and destructors
  associated with these. See "C++ Caveats" below.

- Calls resulting in "dynamic dispatch" are specially handled. See "Dynamic
  Calls and Devirtualization" below.

- The FunctionSummaries map stores additional information about declarations,
  some of which is collected at runtime based on previous analyses.
  We do not inline functions which were not profitable to inline in a different
  context (for example, if the maximum block count was exceeded; see
  "Retry Without Inlining").
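
The language-independent conditions in this list amount to a chain of early
exits. The sketch below is a toy predicate, not the analyzer's decision
procedure: CallFacts and mayInline are hypothetical, the C++-specific and
dynamic-dispatch cases are left out, and the cutoff depth is passed in rather
than read from any real option.

```cpp
// Hypothetical summary of the facts the conditions above depend on.
struct CallFacts {
  bool hasDefinition;          // a body is available for the callee
  bool hasCFGAndLiveness;      // CFG built and liveness computed
  unsigned stackDepth;         // length of the LocationContext chain
  bool isVariadic;             // callee is a variadic function
  bool wasUnprofitableBefore;  // recorded as not worth inlining earlier
};

// Returns true only if none of the modeled reasons against inlining applies.
bool mayInline(const CallFacts &c, unsigned maxDepth) {
  if (!c.hasDefinition)
    return false;  // nothing to inline
  if (!c.hasCFGAndLiveness)
    return false;  // prerequisites for analyzing the body are missing
  if (c.stackDepth >= maxDepth)
    return false;  // cutoff against unbounded recursion and for performance
  if (c.isVariadic)
    return false;  // engineering limitation
  if (c.wasUnprofitableBefore)
    return false;  // e.g. the block count was exceeded in another context
  return true;
}
```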


Dynamic Calls and Devirtualization
----------------------------------

"Dynamic" calls are those that are resolved at runtime, such as C++ virtual
method calls and Objective-C message sends. Due to the path-sensitive nature of
the analysis, the analyzer may be able to reason about the dynamic type of the
object whose method is being called and thus "devirtualize" the call.

This path-sensitive devirtualization occurs when the analyzer can determine what
method would actually be called at runtime. This is possible when the type
information is constrained enough for a simulated C++/Objective-C object that
the analyzer can make such a decision.

 == DynamicTypeInfo ==

As the analyzer analyzes a path, it may accrue information to refine the
knowledge about the type of an object. This can then be used to make better
decisions about the target method of a call.

Such type information is tracked as DynamicTypeInfo. This is path-sensitive
data that is stored in ProgramState, which defines a mapping from MemRegions to
an (optional) DynamicTypeInfo.

If no DynamicTypeInfo has been explicitly set for a MemRegion, it will be lazily
inferred from the region's type or associated symbol. Information from symbolic
regions is weaker than from true typed regions.

  EXAMPLE: A C++ object declared "A obj" is known to have the class 'A', but a
           reference "A &ref" may dynamically be a subclass of 'A'.
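
The difference between the two declarations in the example can be seen in
ordinary C++. The classes A and B and the two functions are made up for this
illustration.

```cpp
struct A {
  virtual ~A() = default;
  virtual int id() const { return 1; }
};

struct B : A {
  int id() const override { return 2; }
};

// The dynamic type of a declared object is exactly its declared type,
// so this call always resolves to A::id.
int callOnObject() {
  A obj;
  return obj.id();
}

// A reference may be bound to an object of any class derived from A,
// so the target of this call is known only at runtime.
int callOnReference(const A &ref) {
  return ref.id();
}
```

Passing a B to callOnReference dispatches to B::id, which is why type
information inferred for a reference is weaker than for a declared object.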

The DynamicTypePropagation checker gathers and propagates DynamicTypeInfo,
updating it as information is observed along a path that can refine that type
information for a region.

  WARNING: Not all of the existing analyzer code has been retrofitted to use
           DynamicTypeInfo, nor is it universally appropriate. In particular,
           DynamicTypeInfo always applies to a region with all casts stripped
           off, but sometimes the information provided by casts can be useful.


 == RuntimeDefinition ==

The basis of devirtualization is CallEvent's getRuntimeDefinition() method,
which returns a RuntimeDefinition object. When asked to provide a definition,
the CallEvents for dynamic calls will use the DynamicTypeInfo in their
ProgramState to attempt to devirtualize the call. In the case of no dynamic
dispatch, or perfectly constrained devirtualization, the resulting
RuntimeDefinition contains a Decl corresponding to the definition of the called
function, and RuntimeDefinition::mayHaveOtherDefinitions will return FALSE.

In the case of dynamic dispatch where our information is not perfect, CallEvent
can make a guess, but RuntimeDefinition::mayHaveOtherDefinitions will return
TRUE. The RuntimeDefinition object will then also include a MemRegion
corresponding to the object being called (i.e., the "receiver" in Objective-C
parlance), which ExprEngine uses to decide whether or not the call should be
inlined.
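
One way to picture the relationship between the Decl, the MemRegion, and
mayHaveOtherDefinitions is a small value type in which the presence of a region
is what signals an imprecise guess. This is a sketch under that assumption:
ToyDecl, ToyRegion, and ToyRuntimeDefinition are hypothetical stand-ins, and
getDispatchRegion is a name chosen for this example.

```cpp
// Hypothetical stand-ins; the analyzer's own classes are far richer.
struct ToyDecl {};
struct ToyRegion {};

class ToyRuntimeDefinition {
  const ToyDecl *D = nullptr;    // definition chosen for the call, if any
  const ToyRegion *R = nullptr;  // receiver region, set only for a guess

public:
  // No definition is available.
  ToyRuntimeDefinition() = default;

  // No dynamic dispatch, or perfectly constrained devirtualization.
  explicit ToyRuntimeDefinition(const ToyDecl *d) : D(d) {}

  // Best guess for a dynamic call on the object stored in region r.
  ToyRuntimeDefinition(const ToyDecl *d, const ToyRegion *r) : D(d), R(r) {}

  const ToyDecl *getDecl() const { return D; }
  const ToyRegion *getDispatchRegion() const { return R; }

  // True only when the definition is a guess tied to a receiver region.
  bool mayHaveOtherDefinitions() const { return R != nullptr; }
};
```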

 == Inlining Dynamic Calls ==

The -analyzer-config ipa option has five different modes: none, basic-inlining,
inlining, dynamic, and dynamic-bifurcate. Under -analyzer-config ipa=dynamic,
all dynamic calls are inlined, whether we are certain or not that this will
actually be the definition used at runtime. Under -analyzer-config ipa=inlining,
only "near-perfect" devirtualized calls are inlined*, and other dynamic calls
are evaluated conservatively (as if no definition were available).

* Currently, no Objective-C messages are inlined under
  -analyzer-config ipa=inlining, even if we are reasonably confident of the type
  of the receiver. We plan to enable this once we have tested our heuristics
  more thoroughly.

The last option, -analyzer-config ipa=dynamic-bifurcate, behaves similarly to
"dynamic", but performs a conservative invalidation in the general virtual case
in *addition* to inlining. The details of this are discussed below.

As stated above, -analyzer-config ipa=basic-inlining does not inline any C++
member functions or Objective-C method calls, even if they are non-virtual or
can be safely devirtualized.


Bifurcation
-----------

ExprEngine::BifurcateCall implements the -analyzer-config ipa=dynamic-bifurcate
mode.

When a call is made on an object with imprecise dynamic type information
(RuntimeDefinition::mayHaveOtherDefinitions() evaluates to TRUE), ExprEngine
bifurcates the path and marks the object's region (retrieved from the
RuntimeDefinition object) with a path-sensitive "mode" in the ProgramState.

Currently, there are two modes:

 DynamicDispatchModeInlined - Models the case where the dynamic type information
   of the receiver (MemRegion) is assumed to be perfectly constrained so
   that a given definition of a method is expected to be the code actually
   called. When this mode is set, ExprEngine uses the Decl from
   RuntimeDefinition to inline any dynamically dispatched call sent to this
   receiver because the function definition is considered to be fully resolved.

 DynamicDispatchModeConservative - Models the case where the dynamic type
   information is assumed to be incorrect, for example, because the method
   definition is overridden in a subclass. In such cases, ExprEngine does not
   inline the methods sent to the receiver (MemRegion), even if a candidate
   definition is available. This mode is conservative about simulating the
   effects of a call.

Going forward along the symbolic execution path, ExprEngine consults the mode
of the receiver's MemRegion to make decisions on whether the calls should be
inlined or not, which ensures that there is at most one split per region.

At a high level, "bifurcation mode" allows for increased semantic coverage in
cases where the parent method contains code which is only executed when the
class is subclassed. The disadvantages of this mode are a (possibly
considerable) performance hit and the possibility of false positives on the
path where the conservative mode is used.
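
The "at most one split per region" property can be demonstrated with a toy
state that holds nothing but the per-region mode. Everything here is
hypothetical: regions are plain integers, ToyModes stands in for the relevant
part of ProgramState, and evalImpreciseCall is not ExprEngine::BifurcateCall.

```cpp
#include <map>
#include <vector>

enum class DispatchMode { Inlined, Conservative };

// Toy "state": the dispatch mode recorded for each region (regions are ints).
using ToyModes = std::map<int, DispatchMode>;

// Evaluates one imprecise dynamic call on `region` and returns the successor
// states: two the first time the region is seen on a path, one afterwards.
std::vector<ToyModes> evalImpreciseCall(const ToyModes &state, int region) {
  if (state.count(region))
    return {state};  // mode already chosen on this path: no new split

  ToyModes inlined = state;
  ToyModes conservative = state;
  inlined[region] = DispatchMode::Inlined;
  conservative[region] = DispatchMode::Conservative;
  return {inlined, conservative};
}

// Returns true if calls on `region` are inlined in the given state.
bool inlinesOn(const ToyModes &state, int region) {
  auto it = state.find(region);
  return it != state.end() && it->second == DispatchMode::Inlined;
}
```

A second call on the same region yields a single successor on either branch,
while a call on a different region can still split the path once more.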

Objective-C Message Heuristics
------------------------------

ExprEngine relies on a set of heuristics to partition the set of Objective-C
method calls into those that require bifurcation and those that do not. Below
are the cases when the DynamicTypeInfo of the object is considered precise
(cannot be a subclass):

 - If the object was created with +alloc or +new and initialized with an -init
   method.

 - If the calls are property accesses using dot syntax. This is based on the
   assumption that children rarely override properties, or do so in an
   essentially compatible way.

 - If the class interface is declared inside the main source file. In this case
   it is unlikely that it will be subclassed.

 - If the method is not declared outside of the main source file, either by the
   receiver's class or by any superclasses.

C++ Caveats
-----------

C++11 [class.cdtor]p4 describes how the vtable of an object is modified as it is
being constructed or destructed; that is, the type of the object depends on
which base constructors have been completed. This is tracked using
DynamicTypeInfo in the DynamicTypePropagation checker.
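
The language rule being modeled is visible in plain C++: a virtual call made
while a base class constructor is running resolves to the base class's
override. Base and Derived are made-up classes for this illustration.

```cpp
struct Base {
  int tagSeenInCtor;

  // While this constructor runs, the object's dynamic type is Base,
  // so the virtual call below resolves to Base::tag.
  Base() : tagSeenInCtor(tag()) {}
  virtual ~Base() = default;
  virtual int tag() const { return 1; }
};

struct Derived : Base {
  int tag() const override { return 2; }
};
```

For a fully constructed Derived object, tag() returns 2, but tagSeenInCtor
holds 1 because the call was made before the Derived part existed.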

There are several limitations in the current implementation:

- Temporaries are poorly modeled right now because we're not confident in the
  placement of their destructors in the CFG. We currently won't inline their
  constructors unless the destructor is trivial, and don't process their
  destructors at all, not even to invalidate the region.

- 'new' is poorly modeled due to some nasty CFG/design issues. This is tracked
  in PR12014. 'delete' is not modeled at all.

- Arrays of objects are modeled very poorly right now. ExprEngine currently
  only simulates the first constructor and first destructor. Because of this,
  ExprEngine does not inline any constructors or destructors for arrays.


CallEvent
=========

A CallEvent represents a specific call to a function, method, or other body of
code. It is path-sensitive, containing both the current state (ProgramStateRef)
and stack space (LocationContext), and provides uniform access to the argument
values and return type of a call, no matter how the call is written in the
source or what sort of code body is being invoked.

  NOTE: For those familiar with Cocoa, CallEvent is roughly equivalent to
        NSInvocation.

CallEvent should be used whenever there is logic dealing with function calls
that does not care how the call occurred.

Examples include checking that arguments satisfy preconditions (such as
__attribute__((nonnull))), and attempting to inline a call.

CallEvents are reference-counted objects managed by a CallEventManager. While
there is no inherent issue with persisting them (say, in a ProgramState's GDM),
they are intended for short-lived use, and can be recreated from CFGElements or
non-top-level StackFrameContexts fairly easily.