<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Checker Developer Manual</title>
  <link type="text/css" rel="stylesheet" href="menu.css" />
  <link type="text/css" rel="stylesheet" href="content.css" />
  <script type="text/javascript" src="scripts/menu.js"></script>
</head>
<body>

<div id="page">
<!--#include virtual="menu.html.incl"-->

<div id="content">

<h1><font color=red>This Page Is Under Construction</font></h1>

<h1>Checker Developer Manual</h1>

<p>The static analyzer engine performs symbolic execution of the program and
relies on a set of checkers to implement the logic for detecting bugs and
constructing bug reports. This page provides hints and guidelines for anyone
who is interested in implementing their own checker. The static analyzer is
part of the Clang project, so consult <a href="http://clang.llvm.org/hacking.html">Hacking on Clang</a>
and the <a href="http://llvm.org/docs/ProgrammersManual.html">LLVM Programmer's Manual</a>
for general developer guidelines and information.</p>

  <ul>
  <li><a href="#start">Getting Started</a></li>
  <li><a href="#analyzer">Analyzer Overview</a></li>
  <li><a href="#idea">Idea for a Checker</a></li>
  <li><a href="#registration">Checker Registration</a></li>
  <li><a href="#skeleton">Checker Skeleton</a></li>
  <li><a href="#node">Exploded Node</a></li>
  <li><a href="#bugs">Bug Reports</a></li>
  <li><a href="#ast">AST Visitors</a></li>
  <li><a href="#testing">Testing</a></li>
  <li><a href="#commands">Useful Commands</a></li>
  </ul>

<h2 id=start>Getting Started</h2>
  <ul>
    <li>To check out the source code and build the project, follow steps 1-4 of
    the <a href="http://clang.llvm.org/get_started.html">Clang Getting Started</a>
    page.</li>

    <li>The analyzer source code is located under the Clang source tree:
    <br><tt>
    $ <b>cd llvm/tools/clang</b>
    </tt>
    <br>See: <tt>include/clang/StaticAnalyzer</tt>, <tt>lib/StaticAnalyzer</tt>,
     <tt>test/Analysis</tt>.</li>

    <li>The analyzer regression tests can be executed from Clang's build
    directory:
    <br><tt>
    $ <b>cd ../../../; cd build/tools/clang; TESTDIRS=Analysis make test</b>
    </tt></li>

    <li>Analyze a file with the specified checker:
    <br><tt>
    $ <b>clang -cc1 -analyze -analyzer-checker=core.DivideZero test.c</b>
    </tt></li>

    <li>List the available checkers:
    <br><tt>
    $ <b>clang -cc1 -analyzer-checker-help</b>
    </tt></li>

    <li>See the analyzer help for different output formats, fine tuning, and
    debug options:
    <br><tt>
    $ <b>clang -cc1 -help | grep "analyzer"</b>
    </tt></li>

  </ul>

<h2 id=analyzer>Static Analyzer Overview</h2>
  The analyzer core performs symbolic execution of the given program. All the
  input values are represented with symbolic values; the engine then deduces
  the values of all the expressions in the program based on the input symbols
  and the path. The execution is path sensitive and every possible path through
  the program is explored. The explored execution traces are represented with an
  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ExplodedGraph.html">ExplodedGraph</a> object.
  Each node of the graph is an
  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ExplodedNode.html">ExplodedNode</a>,
  which consists of a <tt>ProgramPoint</tt> and a <tt>ProgramState</tt>.
  <p>
  <a href="http://clang.llvm.org/doxygen/classclang_1_1ProgramPoint.html">ProgramPoint</a>
  represents the corresponding location in the program (or the CFG).
  <tt>ProgramPoint</tt> is also used to record additional information on
  when/how the state was added. For example, the <tt>PostPurgeDeadSymbolsKind</tt>
  kind means that the state is the result of purging dead symbols - the
  analyzer's equivalent of garbage collection.
  <p>
  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ProgramState.html">ProgramState</a>
  represents the abstract state of the program. It consists of:
  <ul>
  <li><tt>Environment</tt> - a mapping from source code expressions to symbolic
  values
  <li><tt>Store</tt> - a mapping from memory locations to symbolic values
  <li><tt>GenericDataMap</tt> - constraints on symbolic values
  </ul>

  <p>
  Checkers are not merely passive receivers of the analyzer core changes - they
  actively participate in the <tt>ProgramState</tt> construction through the
  <tt>GenericDataMap</tt>, which can be used to store the checker-defined part
  of the state. Each time the analyzer engine explores a new statement, it
  notifies each checker registered to listen for that statement, giving it an
  opportunity to either report a bug or modify the state. (As a rule of thumb,
  the checker itself should be stateless.) The checkers are called one after another
  in a predefined order; thus, calling all the checkers adds a chain to the
  <tt>ExplodedGraph</tt>.
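  <p>
  As a rough illustration of the description above, the following self-contained
  C++ sketch models an exploded graph in miniature. <tt>ToyState</tt>,
  <tt>ToyNode</tt>, <tt>ToyChecker</tt>, and the helper functions are hypothetical
  stand-ins invented for this example; they are not the Clang classes. The sketch
  also simplifies by adding one node per checker even when the state is unchanged.
  </p>

```cpp
// Toy model only: every type here is a hypothetical stand-in, not a Clang class.
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-in for ProgramState: never modified once built. The map plays the role
// of the checker-defined data kept in the GenericDataMap.
struct ToyState {
  std::map<std::string, int> CheckerData;
};
using ToyStateRef = std::shared_ptr<const ToyState>;

// Stand-in for ExplodedNode: a program point paired with a program state.
struct ToyNode {
  std::string ProgramPoint;
  ToyStateRef State;
  const ToyNode *Pred; // predecessor in the chain, null for the root
};

// A checker callback keeps no state of its own: it receives a state and
// returns either the same state (no change) or a new one.
using ToyChecker = std::function<ToyStateRef(const std::string &, ToyStateRef)>;

// Hypothetical checker that counts the statements seen on the current path.
inline ToyStateRef countStatements(const std::string &, ToyStateRef S) {
  ToyState Next = *S; // copy, then modify the copy
  ++Next.CheckerData["stmt-count"];
  return std::make_shared<const ToyState>(Next);
}

// Hypothetical checker that only observes and never changes the state.
inline ToyStateRef observeOnly(const std::string &, ToyStateRef S) { return S; }

// Runs the checkers in their fixed order for one statement. Each checker
// contributes one node, so a statement adds a chain to the graph.
inline const ToyNode *
visitStatement(std::vector<std::unique_ptr<ToyNode>> &Graph,
               const ToyNode *Pred, const std::string &Stmt,
               const std::vector<ToyChecker> &Checkers) {
  for (const ToyChecker &C : Checkers) {
    ToyStateRef Next = C(Stmt, Pred->State);
    Graph.push_back(std::unique_ptr<ToyNode>(new ToyNode{Stmt, Next, Pred}));
    Pred = Graph.back().get();
  }
  return Pred;
}
```

  <p>
  Visiting two statements with these two checkers from a root node yields a chain
  of five nodes. The root's state stays empty because states are copied rather
  than modified in place.
  </p>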
  <!--
  TODO: Add a picture.
  <br>
  Symbols<br>
  FunctionalObjects are used throughout.
  -->
<h2 id=idea>Idea for a Checker</h2>
  Here are several questions which you should consider when evaluating your
  checker idea:
  <ul>
    <li>Can the check be effectively implemented without path-sensitive
    analysis? See <a href="#ast">AST Visitors</a>.</li>

    <li>How high is the false positive rate going to be? Looking at the occurrences
    of the issue you want to write a checker for in existing code bases might
    give you some ideas.</li>

    <li>How will the current limitations of the analysis affect the false alarm
    rate? Currently, the analyzer only reasons about one procedure at a time (no
    inter-procedural analysis). Also, it uses a simple solver based on range
    tracking to model symbolic execution.</li>

    <li>Consult the <a href="http://llvm.org/bugs/buglist.cgi?query_format=advanced&bug_status=NEW&bug_status=REOPENED&version=trunk&component=Static%20Analyzer&product=clang">Bugzilla database</a>
    to get some ideas for new checkers and consider starting with improving/fixing
    bugs in the existing checkers.</li>
  </ul>
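  <p>
  To make the solver limitation mentioned in the list above more concrete, here
  is a small self-contained sketch of what range tracking can look like.
  <tt>ToyRange</tt>, <tt>ToyConstraints</tt>, <tt>ToyBranch</tt>, and
  <tt>assumeLess</tt> are hypothetical names made up for this example; this is
  not the analyzer's actual solver, only a model of the general idea.
  </p>

```cpp
// Toy model only: hypothetical types, not the analyzer's constraint manager.
#include <algorithm>
#include <cassert>
#include <climits>
#include <map>
#include <string>

// Inclusive range of values that a symbol may take on the current path.
struct ToyRange {
  long Lo;
  long Hi;
  bool empty() const { return Lo > Hi; }
};

using ToyConstraints = std::map<std::string, ToyRange>;

// Result of following one side of a branch.
struct ToyBranch {
  bool Feasible;              // false if the range became empty
  ToyConstraints Constraints; // narrowed constraints for that path
};

// Narrows the range of Sym along the path where "Sym < Bound" holds
// (Taken == true) or does not hold (Taken == false).
inline ToyBranch assumeLess(ToyConstraints C, const std::string &Sym,
                            long Bound, bool Taken) {
  // An unconstrained symbol starts with the full int range.
  ToyRange R = C.count(Sym) ? C[Sym] : ToyRange{INT_MIN, INT_MAX};
  if (Taken)
    R.Hi = std::min(R.Hi, Bound - 1);
  else
    R.Lo = std::max(R.Lo, Bound);
  C[Sym] = R;
  return ToyBranch{!R.empty(), C};
}
```

  <p>
  In this model, assuming <tt>x &lt; 10</tt> and then following the false branch
  of <tt>x &lt; 20</tt> leaves an empty range, so that path is infeasible and
  can be dropped. A model of this shape cannot relate two symbols to each other
  (for example <tt>x &lt; y</tt>), which is one way a range-based solver can
  contribute to false alarms.
  </p>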

<h2 id=registration>Checker Registration</h2>
  All checker implementation files are located in the <tt>clang/lib/StaticAnalyzer/Checkers</tt>
  folder. Follow the steps below to register a new checker with the analyzer.
<ol>
  <li>Create a new checker implementation file, for example <tt>./lib/StaticAnalyzer/Checkers/NewChecker.cpp</tt>
<pre class="code_example">
// These header names are assumptions; they may differ between Clang versions.
#include "ClangSACheckers.h"
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"

using namespace clang;
using namespace ento;

namespace {
// Called before the analyzer processes each call expression.
class NewChecker: public Checker&lt; check::PreStmt&lt;CallExpr&gt; &gt; {
public:
  void checkPreStmt(const CallExpr *CE, CheckerContext &amp;Ctx) const {}
};
}
void ento::registerNewChecker(CheckerManager &amp;mgr) {
  mgr.registerChecker&lt;NewChecker&gt;();
}
</pre>

<li>Pick the package name for your checker and add the registration code to
<tt>./lib/StaticAnalyzer/Checkers/Checkers.td</tt>. Note that all checkers should
first be developed as experimental. If the new checker performs
security-related checks, add the following lines under the
<tt>SecurityExperimental</tt> package:
<pre class="code_example">
let ParentPackage = SecurityExperimental in {
...
def NewChecker : Checker&lt;"NewChecker"&gt;,
  HelpText&lt;"This text should give a short description of the checks performed."&gt;,
  DescFile&lt;"NewChecker.cpp"&gt;;
...
} // end "security.experimental"
</pre>

<li>Make the source code file visible to CMake by adding it to
<tt>./lib/StaticAnalyzer/Checkers/CMakeLists.txt</tt>.

<li>Compile and see your checker in the list of available checkers by running:<br>
<tt>$ <b>clang -cc1 -analyzer-checker-help</b></tt>
</ol>


<h2 id=skeleton>Checker Skeleton</h2>
  There are two main decisions you need to make:
  <ul>
    <li>Which events the checker should be tracking.</li>
    <li>What data you want to store as part of the checker-specific program
    state. Try to minimize the checker state as much as possible.</li>
  </ul>

<h2 id=bugs>Bug Reports</h2>

<h2 id=ast>AST Visitors</h2>
  Some checks might not require path-sensitivity to be effective. A simple AST walk
  might be sufficient. If that is the case, consider implementing a Clang
  compiler warning. On the other hand, a check might not be acceptable as a compiler
  warning, for example because of a relatively high false positive rate. In this
  situation, the AST callbacks <tt><b>checkASTDecl</b></tt> and
  <tt><b>checkASTCodeBody</b></tt> are your best friends.
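  <p>
  The following self-contained sketch shows what a check that needs no path
  information can look like. It walks a toy syntax tree and flags every
  <tt>if</tt> statement whose condition is an assignment. <tt>ToyAstNode</tt>,
  <tt>makeNode</tt>, and <tt>findAssignInCondition</tt> are hypothetical names
  for this example; the sketch does not use the Clang AST classes or the
  callbacks named above.
  </p>

```cpp
// Toy model only: a hypothetical syntax tree, not the Clang AST.
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct ToyAstNode {
  std::string Kind;   // e.g. "IfStmt", "BinaryOperator", "CompoundStmt"
  std::string Detail; // e.g. the operator spelling, empty if unused
  int Line;
  std::vector<std::shared_ptr<ToyAstNode>> Children;
};

// Convenience constructor for building small trees.
inline std::shared_ptr<ToyAstNode>
makeNode(std::string Kind, std::string Detail, int Line,
         std::vector<std::shared_ptr<ToyAstNode>> Children = {}) {
  return std::make_shared<ToyAstNode>(ToyAstNode{
      std::move(Kind), std::move(Detail), Line, std::move(Children)});
}

// Recursively walks the tree and records the line of every IfStmt whose
// condition (taken to be the first child) is an assignment. The walk looks at
// syntax only; no execution paths or symbolic values are involved.
inline void findAssignInCondition(const ToyAstNode &N,
                                  std::vector<int> &Lines) {
  if (N.Kind == "IfStmt" && !N.Children.empty()) {
    const ToyAstNode &Cond = *N.Children.front();
    if (Cond.Kind == "BinaryOperator" && Cond.Detail == "=")
      Lines.push_back(N.Line);
  }
  for (const auto &Child : N.Children)
    findAssignInCondition(*Child, Lines);
}
```

  <p>
  Because the walk sees every matching node regardless of whether the code is
  reachable, a check of this kind is cheap but can be noisier than a
  path-sensitive one.
  </p>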

<h2 id=testing>Testing</h2>
  Every patch should be well tested with Clang regression tests. The checker tests
  live in the <tt>clang/test/Analysis</tt> folder. To run all of the analyzer tests,
  execute the following from the <tt>clang</tt> build directory:
    <pre class="code">
    $ <b>TESTDIRS=Analysis make test</b>
    </pre>

<h2 id=commands>Useful Commands/Debugging Hints</h2>
<ul>
<li>
While investigating a checker-related issue, instruct the analyzer to execute
only a single checker:
<br><tt>
$ <b>clang -cc1 -analyze -analyzer-checker=osx.KeychainAPI test.c</b>
</tt>
</li>
<li>
To dump the AST:
<br><tt>
$ <b>clang -cc1 -ast-dump test.c</b>
</tt>
</li>
<li>
To view or dump the CFG, use the <tt>debug.ViewCFG</tt> or <tt>debug.DumpCFG</tt> checkers:
<br><tt>
$ <b>clang -cc1 -analyze -analyzer-checker=debug.ViewCFG test.c</b>
</tt>
</li>
<li>
To see all available debug checkers:
<br><tt>
$ <b>clang -cc1 -analyzer-checker-help | grep "debug"</b>
</tt>
</li>
<li>
To see which function is failing while processing a large file, use the
<tt>-analyzer-display-progress</tt> option.
</li>
<li>
While debugging, execute <tt>clang -cc1 -analyze -analyzer-checker=core</tt>
instead of <tt>clang --analyze</tt>, as the latter would call the compiler
in a separate process.
</li>
<li>
To view the <tt>ExplodedGraph</tt> (the state graph explored by the analyzer) while
debugging, go to a frame that has a <tt>clang::ento::ExprEngine</tt> object and
execute:
<br><tt>
(gdb) <b>p ViewGraph(0)</b>
</tt>
</li>
<li>
To inspect a <tt>clang::Expr</tt> while debugging, use the following command. If you
pass in a SourceManager object, it also dumps the corresponding line in the
source code.
<br><tt>
(gdb) <b>p E-&gt;dump()</b>
</tt>
</li>
<li>
To dump the AST of the method that the current <tt>ExplodedNode</tt> belongs to:
<br><tt>
(gdb) <b>p ENode-&gt;getCodeDecl().getBody()-&gt;dump(getContext().getSourceManager())</b>
</tt>
</li>
</ul>

</div>
</div>
</body>
</html>