www/analyzer/checker_dev_manual.html - platform/external/clang - Gitiles

 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
           "http://www.w3.org/TR/html4/strict.dtd">
 <html>
 <head>
   <title>Checker Developer Manual</title>
   <link type="text/css" rel="stylesheet" href="menu.css">
   <link type="text/css" rel="stylesheet" href="content.css">
   <script type="text/javascript" src="scripts/menu.js"></script>
 </head>
 <body>

 <div id="page">
 <!--#include virtual="menu.html.incl"-->

 <div id="content">

 <h1 style="color:red">This Page Is Under Construction</h1>

 <h1>Checker Developer Manual</h1>

 <p>The static analyzer engine performs symbolic execution of the program and
 relies on a set of checkers to implement the logic for detecting and
 constructing bug reports. This page provides hints and guidelines for anyone
 who is interested in implementing their own checker. The static analyzer is a
 part of the Clang project, so consult <a href="http://clang.llvm.org/hacking.html">Hacking on Clang</a>
 and <a href="http://llvm.org/docs/ProgrammersManual.html">LLVM Programmer's Manual</a>
 for general developer guidelines and information. </p>

     <ul>
       <li><a href="#start">Getting Started</a></li>
       <li><a href="#analyzer">Analyzer Overview</a></li>
       <li><a href="#idea">Idea for a Checker</a></li>
       <li><a href="#registration">Checker Registration</a></li>
       <li><a href="#skeleton">Checker Skeleton</a></li>
       <li><a href="#node">Exploded Node</a></li>
       <li><a href="#bugs">Bug Reports</a></li>
       <li><a href="#ast">AST Visitors</a></li>
       <li><a href="#testing">Testing</a></li>
       <li><a href="#commands">Useful Commands</a></li>
     </ul>

 <h2 id=start>Getting Started</h2>
   <ul>
     <li>To check out the source code and build the project, follow steps 1-4 of
     the <a href="http://clang.llvm.org/get_started.html">Clang Getting Started</a>
   page.</li>

     <li>The analyzer source code is located under the Clang source tree:
     <br><tt>
     $ <b>cd llvm/tools/clang</b>
     </tt>
     <br>See: <tt>include/clang/StaticAnalyzer</tt>, <tt>lib/StaticAnalyzer</tt>,
      <tt>test/Analysis</tt>.</li>

     <li>The analyzer regression tests can be executed from the Clang's build
     directory:
     <br><tt>
     $ <b>cd ../../../; cd build/tools/clang; TESTDIRS=Analysis make test</b>
     </tt></li>

     <li>Analyze a file with the specified checker:
     <br><tt>
     $ <b>clang -cc1 -analyze -analyzer-checker=core.DivideZero test.c</b>
     </tt></li>

     <li>List the available checkers:
     <br><tt>
     $ <b>clang -cc1 -analyzer-checker-help</b>
     </tt></li>

     <li>See the analyzer help for different output formats, fine tuning, and
     debug options:
     <br><tt>
     $ <b>clang -cc1 -help | grep "analyzer"</b>
     </tt></li>

   </ul>

 <h2 id=analyzer>Static Analyzer Overview</h2>
   The analyzer core performs symbolic execution of the given program. All the
   input values are represented with symbolic values; further, the engine deduces
   the values of all the expressions in the program based on the input symbols
   and the path. The execution is path sensitive and every possible path through
   the program is explored. The explored execution traces are represented with
   <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ExplodedGraph.html">ExplidedGraph</a> object.
   Each node of the graph is
   <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ExplodedNode.html">ExplodedNode</a>,
   which consists of a <tt>ProgramPoint</tt> and a <tt>ProgramState</tt>.
   <p>
   <a href="http://clang.llvm.org/doxygen/classclang_1_1ProgramPoint.html">ProgramPoint</a>
   represents the corresponding location in the program (or the CFG graph).
   <tt>ProgramPoint</tt> is also used to record additional information on
   when/how the state was added. For example, <tt>PostPurgeDeadSymbolsKind</tt>
   kind means that the state is the result of purging dead symbols - the
   analyzer's equivalent of garbage collection.
   <p>
   <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ProgramState.html">ProgramState</a>
   represents abstract state of the program. It consists of:
   <ul>
     <li><tt>Environment</tt> - a mapping from source code expressions to symbolic
     values
     <li><tt>Store</tt> - a mapping from memory locations to symbolic values
     <li><tt>GenericDataMap</tt> - constraints on symbolic values
   </ul>

   <h3>Interaction with Checkers</h3>
   Checkers are not merely passive receivers of the analyzer core changes - they
   actively participate in the <tt>ProgramState</tt> construction through the
   <tt>GenericDataMap</tt> which can be used to store the checker-defined part
   of the state. Each time the analyzer engine explores a new statement, it
   notifies each checker registered to listen for that statement, giving it an
   opportunity to either report a bug or modify the state. (As a rule of thumb,
   the checker itself should be stateless.) The checkers are called one after another
   in the predefined order; thus, calling all the checkers adds a chain to the
   <tt>ExplodedGraph</tt>.

   <h3>Representing Values</h3>
   During symbolic execution, <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SVal.html">SVal</a>
   objects are used to represent the semantic evaluation of expressions. They can
   represent things like concrete integers, symbolic values, or memory locations
   (which are memory regions). They are a discriminated union of "values",
   symbolic and otherwise.
   <p>
   <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymExpr.html">SymExpr</a> (symbol)
   is meant to represent abstract, but named, symbolic value.
   Symbolic values can have constraints associated with them. Symbols represent
   an actual (immutable) value. We might not know what its specific value is, but
   we can associate constraints with that value as we analyze a path.
   <p>

   <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1MemRegion.html">MemRegion</a> is similar to a symbol.
   It is used to provide a lexicon of how to describe abstract memory. Regions can
   layer on top of other regions, providing a layered approach to representing memory.
   For example, a struct object on the stack might be represented by a <tt>VarRegion</tt>,
   but a <tt>FieldRegion</tt> which is a subregion of the <tt>VarRegion</tt> could
   be used to represent the memory associated with a specific field of that object.
   So how do we represent symbolic memory regions? That's what <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymbolicRegion.html">SymbolicRegion</a>
   is for.  It is a <tt>MemRegion</tt> that has an associated symbol. Since the
   symbol is unique and has a unique name; that symbol names the region.
   <p>
   Let's see how the analyzer processes the expressions in the following example:
   <p>
   <pre class="code_example">
   int foo(int x) {
      int y = x * 2;
      int z = x;
      ...
   }
   </pre>
   <p>
 Let's look at how <tt>x*2</tt> gets evaluated. When <tt>x</tt> is evaluated,
 we first construct an <tt>SVal</tt> that represents the lvalue of <tt>x</tt>, in
 this case it is an <tt>SVal</tt> that references the <tt>MemRegion</tt> for <tt>x</tt>.
 Afterwards, when we do the lvalue-to-rvalue conversion, we get a new <tt>SVal</tt>,
 which references the value <b>currently bound</b> to <tt>x</tt>. That value is
 symbolic; it's whatever <tt>x</tt> was bound to at the start of the function.
 Let's call that symbol <tt>$0</tt>. Similarly, we evaluate the expression for <tt>2</tt>,
 and get an <tt>SVal</tt> that references the concrete number <tt>2</tt>. When
 we evaluate <tt>x*2</tt>, we take the two <tt>SVals</tt> of the subexpressions,
 and create a new <tt>SVal</tt> that represents their multiplication (which in
 this case is a new symbolic expression, which we might call <tt>$1</tt>). When we
 evaluate the assignment to <tt>y</tt>, we again compute its lvalue (a <tt>MemRegion</tt>),
 and then bind the <tt>SVal</tt> for the RHS (which references the symbolic value <tt>$1</tt>)
 to the <tt>MemRegion</tt> in the symbolic store.
 <br>
 The second line is similar. When we evaluate <tt>x</tt> again, we do the same
 dance, and create an <tt>SVal</tt> that references the symbol <tt>$0</tt>. Note, two <tt>SVals</tt>
 might reference the same underlying values.

 <p>
 To summarize, MemRegions are unique names for blocks of memory. Symbols are
 unique names for abstract symbolic values. Some MemRegions represents abstract
 symbolic chunks of memory, and thus are also based on symbols. SVals are just
 references to values, and can reference either MemRegions, Symbols, or concrete
 values (e.g., the number 1).

   <!--
   TODO: Add a picture.
   <br>
   Symbols<br>
   FunctionalObjects are used throughout.
   -->
 <h2 id=idea>Idea for a Checker</h2>
   Here are several questions which you should consider when evaluating your
   checker idea:
   <ul>
     <li>Can the check be effectively implemented without path-sensitive
     analysis? See <a href="#ast">AST Visitors</a>.</li>

     <li>How high the false positive rate is going to be? Looking at the occurrences
     of the issue you want to write a checker for in the existing code bases might
     give you some ideas. </li>

     <li>How the current limitations of the analysis will effect the false alarm
     rate? Currently, the analyzer only reasons about one procedure at a time (no
     inter-procedural analysis). Also, it uses a simple range tracking based
     solver to model symbolic execution.</li>

     <li>Consult the <a
     href="http://llvm.org/bugs/buglist.cgi?query_format=advanced&amp;bug_status=NEW&amp;bug_status=REOPENED&amp;version=trunk&amp;component=Static%20Analyzer&amp;product=clang">Bugzilla database</a>
     to get some ideas for new checkers and consider starting with improving/fixing
     bugs in the existing checkers.</li>
   </ul>

 <h2 id=registration>Checker Registration</h2>
   All checker implementation files are located in <tt>clang/lib/StaticAnalyzer/Checkers</tt>
   folder. Follow the steps below to register a new checker with the analyzer.
 <ol>
   <li>Create a new checker implementation file, for example <tt>./lib/StaticAnalyzer/Checkers/NewChecker.cpp</tt>
 <pre class="code_example">
 using namespace clang;
 using namespace ento;

 namespace {
 class NewChecker: public Checker< check::PreStmt&lt;CallExpr> > {
 public:
   void checkPreStmt(const CallExpr *CE, CheckerContext &amp;Ctx) const {}
 }
 }
 void ento::registerNewChecker(CheckerManager &amp;mgr) {
   mgr.registerChecker&lt;NewChecker>();
 }
 </pre>

 <li>Pick the package name for your checker and add the registration code to
 <tt>./lib/StaticAnalyzer/Checkers/Checkers.td</tt>. Note, all checkers should
 first be developed as experimental. Suppose our new checker performs security
 related checks, then we should add the following lines under
 <tt>SecurityExperimental</tt> package:
 <pre class="code_example">
 let ParentPackage = SecurityExperimental in {
 ...
 def NewChecker : Checker<"NewChecker">,
   HelpText<"This text should give a short description of the checks performed.">,
   DescFile<"NewChecker.cpp">;
 ...
 } // end "security.experimental"
 </pre>

 <li>Make the source code file visible to CMake by adding it to
 <tt>./lib/StaticAnalyzer/Checkers/CMakeLists.txt</tt>.

 <li>Compile and see your checker in the list of available checkers by running:<br>
 <tt><b>$clang -cc1 -analyzer-checker-help</b></tt>
 </ol>


 <h2 id=skeleton>Checker Skeleton</h2>
   There are two main decisions you need to make:
   <ul>
     <li> Which events the checker should be tracking.
     See <a href="http://clang.llvm.org/doxygen/classento_1_1CheckerDocumentation.html">CheckerDocumentation</a>
     for the list of available checker callbacks.</li>
     <li> What data you want to store as part of the checker-specific program
     state. Try to minimize the checker state as much as possible. </li>
   </ul>

 <h2 id=bugs>Bug Reports</h2>

 <h2 id=ast>AST Visitors</h2>
   Some checks might not require path-sensitivity to be effective. Simple AST walk
   might be sufficient. If that is the case, consider implementing a Clang
   compiler warning. On the other hand, a check might not be acceptable as a compiler
   warning; for example, because of a relatively high false positive rate. In this
   situation, AST callbacks <tt><b>checkASTDecl</b></tt> and
   <tt><b>checkASTCodeBody</b></tt> are your best friends.

 <h2 id=testing>Testing</h2>
   Every patch should be well tested with Clang regression tests. The checker tests
   live in <tt>clang/test/Analysis</tt> folder. To run all of the analyzer tests,
   execute the following from the <tt>clang</tt> build directory:
     <pre class="code">
     $ <b>TESTDIRS=Analysis make test</b>
     </pre>

 <h2 id=commands>Useful Commands/Debugging Hints</h2>
 <ul>
 <li>
 While investigating a checker-related issue, instruct the analyzer to only
 execute a single checker:
 <br><tt>
 $ <b>clang -cc1 -analyze -analyzer-checker=osx.KeychainAPI test.c</b>
 </tt>
 </li>
 <li>
 To dump AST:
 <br><tt>
 $ <b>clang -cc1 -ast-dump test.c</b>
 </tt>
 </li>
 <li>
 To view/dump CFG use <tt>debug.ViewCFG</tt> or <tt>debug.DumpCFG</tt> checkers:
 <br><tt>
 $ <b>clang -cc1 -analyze -analyzer-checker=debug.ViewCFG test.c</b>
 </tt>
 </li>
 <li>
 To see all available debug checkers:
 <br><tt>
 $ <b>clang -cc1 -analyzer-checker-help | grep "debug"</b>
 </tt>
 </li>
 <li>
 To see which function is failing while processing a large file use
 <tt>-analyzer-display-progress</tt> option.
 </li>
 <li>
 While debugging execute <tt>clang -cc1 -analyze -analyzer-checker=core</tt>
 instead of <tt>clang --analyze</tt>, as the later would call the compiler
 in a separate process.
 </li>
 <li>
 To view <tt>ExplodedGraph</tt> (the state graph explored by the analyzer) while
 debugging, goto a frame that has <tt>clang::ento::ExprEngine</tt> object and
 execute:
 <br><tt>
 (gdb) <b>p ViewGraph(0)</b>
 </tt>
 </li>
 <li>
 To see the <tt>ProgramState</tt> while debugging use the following command.
 <br><tt>
 (gdb) <b>p State->dump()</b>
 </tt>
 </li>
 <li>
 To see <tt>clang::Expr</tt> while debugging use the following command. If you
 pass in a SourceManager object, it will also dump the corresponding line in the
 source code.
 <br><tt>
 (gdb) <b>p E->dump()</b>
 </tt>
 </li>
 <li>
 To dump AST of a method that the current <tt>ExplodedNode</tt> belongs to:
 <br><tt>
 (gdb) <b>p ENode->getCodeDecl().getBody()->dump(getContext().getSourceManager())</b>
 </tt>
 </li>
 </ul>

 </div>
 </div>
 </body>
 </html>
	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
	"http://www.w3.org/TR/html4/strict.dtd">
	<html>
	<head>
	<title>Checker Developer Manual</title>
	<link type="text/css" rel="stylesheet" href="menu.css">
	<link type="text/css" rel="stylesheet" href="content.css">
	<script type="text/javascript" src="scripts/menu.js"></script>
	</head>
	<body>

	<div id="page">
	<!--#include virtual="menu.html.incl"-->

	<div id="content">

	<h1 style="color:red">This Page Is Under Construction</h1>

	<h1>Checker Developer Manual</h1>

	<p>The static analyzer engine performs symbolic execution of the program and
	relies on a set of checkers to implement the logic for detecting and
	constructing bug reports. This page provides hints and guidelines for anyone
	who is interested in implementing their own checker. The static analyzer is a
	part of the Clang project, so consult <a href="http://clang.llvm.org/hacking.html">Hacking on Clang</a>
	and <a href="http://llvm.org/docs/ProgrammersManual.html">LLVM Programmer's Manual</a>
	for general developer guidelines and information. </p>

	<ul>
	<li><a href="#start">Getting Started</a></li>
	<li><a href="#analyzer">Analyzer Overview</a></li>
	<li><a href="#idea">Idea for a Checker</a></li>
	<li><a href="#registration">Checker Registration</a></li>
	<li><a href="#skeleton">Checker Skeleton</a></li>
	<li><a href="#node">Exploded Node</a></li>
	<li><a href="#bugs">Bug Reports</a></li>
	<li><a href="#ast">AST Visitors</a></li>
	<li><a href="#testing">Testing</a></li>
	<li><a href="#commands">Useful Commands</a></li>
	</ul>

	<h2 id=start>Getting Started</h2>
	<ul>
	<li>To check out the source code and build the project, follow steps 1-4 of
	the <a href="http://clang.llvm.org/get_started.html">Clang Getting Started</a>
	page.</li>

	<li>The analyzer source code is located under the Clang source tree:
	<br><tt>
	$ <b>cd llvm/tools/clang</b>
	</tt>
	<br>See: <tt>include/clang/StaticAnalyzer</tt>, <tt>lib/StaticAnalyzer</tt>,
	<tt>test/Analysis</tt>.</li>

	<li>The analyzer regression tests can be executed from the Clang's build
	directory:
	<br><tt>
	$ <b>cd ../../../; cd build/tools/clang; TESTDIRS=Analysis make test</b>
	</tt></li>

	<li>Analyze a file with the specified checker:
	<br><tt>
	$ <b>clang -cc1 -analyze -analyzer-checker=core.DivideZero test.c</b>
	</tt></li>

	<li>List the available checkers:
	<br><tt>
	$ <b>clang -cc1 -analyzer-checker-help</b>
	</tt></li>

	<li>See the analyzer help for different output formats, fine tuning, and
	debug options:
	<br><tt>
	$ <b>clang -cc1 -help \| grep "analyzer"</b>
	</tt></li>

	</ul>

	<h2 id=analyzer>Static Analyzer Overview</h2>
	The analyzer core performs symbolic execution of the given program. All the
	input values are represented with symbolic values; further, the engine deduces
	the values of all the expressions in the program based on the input symbols
	and the path. The execution is path sensitive and every possible path through
	the program is explored. The explored execution traces are represented with
	<a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ExplodedGraph.html">ExplidedGraph</a> object.
	Each node of the graph is
	<a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ExplodedNode.html">ExplodedNode</a>,
	which consists of a <tt>ProgramPoint</tt> and a <tt>ProgramState</tt>.
	<p>
	<a href="http://clang.llvm.org/doxygen/classclang_1_1ProgramPoint.html">ProgramPoint</a>
	represents the corresponding location in the program (or the CFG graph).
	<tt>ProgramPoint</tt> is also used to record additional information on
	when/how the state was added. For example, <tt>PostPurgeDeadSymbolsKind</tt>
	kind means that the state is the result of purging dead symbols - the
	analyzer's equivalent of garbage collection.
	<p>
	<a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ProgramState.html">ProgramState</a>
	represents abstract state of the program. It consists of:
	<ul>
	<li><tt>Environment</tt> - a mapping from source code expressions to symbolic
	values
	<li><tt>Store</tt> - a mapping from memory locations to symbolic values
	<li><tt>GenericDataMap</tt> - constraints on symbolic values
	</ul>

	<h3>Interaction with Checkers</h3>
	Checkers are not merely passive receivers of the analyzer core changes - they
	actively participate in the <tt>ProgramState</tt> construction through the
	<tt>GenericDataMap</tt> which can be used to store the checker-defined part
	of the state. Each time the analyzer engine explores a new statement, it
	notifies each checker registered to listen for that statement, giving it an
	opportunity to either report a bug or modify the state. (As a rule of thumb,
	the checker itself should be stateless.) The checkers are called one after another
	in the predefined order; thus, calling all the checkers adds a chain to the
	<tt>ExplodedGraph</tt>.

	<h3>Representing Values</h3>
	During symbolic execution, <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SVal.html">SVal</a>
	objects are used to represent the semantic evaluation of expressions. They can
	represent things like concrete integers, symbolic values, or memory locations
	(which are memory regions). They are a discriminated union of "values",
	symbolic and otherwise.
	<p>
	<a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymExpr.html">SymExpr</a> (symbol)
	is meant to represent abstract, but named, symbolic value.
	Symbolic values can have constraints associated with them. Symbols represent
	an actual (immutable) value. We might not know what its specific value is, but
	we can associate constraints with that value as we analyze a path.
	<p>

	<a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1MemRegion.html">MemRegion</a> is similar to a symbol.
	It is used to provide a lexicon of how to describe abstract memory. Regions can
	layer on top of other regions, providing a layered approach to representing memory.
	For example, a struct object on the stack might be represented by a <tt>VarRegion</tt>,
	but a <tt>FieldRegion</tt> which is a subregion of the <tt>VarRegion</tt> could
	be used to represent the memory associated with a specific field of that object.
	So how do we represent symbolic memory regions? That's what <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymbolicRegion.html">SymbolicRegion</a>
	is for. It is a <tt>MemRegion</tt> that has an associated symbol. Since the
	symbol is unique and has a unique name; that symbol names the region.
	<p>
	Let's see how the analyzer processes the expressions in the following example:
	<p>
	<pre class="code_example">
	int foo(int x) {
	int y = x * 2;
	int z = x;
	...
	}
	</pre>
	<p>
	Let's look at how <tt>x*2</tt> gets evaluated. When <tt>x</tt> is evaluated,
	we first construct an <tt>SVal</tt> that represents the lvalue of <tt>x</tt>, in
	this case it is an <tt>SVal</tt> that references the <tt>MemRegion</tt> for <tt>x</tt>.
	Afterwards, when we do the lvalue-to-rvalue conversion, we get a new <tt>SVal</tt>,
	which references the value <b>currently bound</b> to <tt>x</tt>. That value is
	symbolic; it's whatever <tt>x</tt> was bound to at the start of the function.
	Let's call that symbol <tt>$0</tt>. Similarly, we evaluate the expression for <tt>2</tt>,
	and get an <tt>SVal</tt> that references the concrete number <tt>2</tt>. When
	we evaluate <tt>x*2</tt>, we take the two <tt>SVals</tt> of the subexpressions,
	and create a new <tt>SVal</tt> that represents their multiplication (which in
	this case is a new symbolic expression, which we might call <tt>$1</tt>). When we
	evaluate the assignment to <tt>y</tt>, we again compute its lvalue (a <tt>MemRegion</tt>),
	and then bind the <tt>SVal</tt> for the RHS (which references the symbolic value <tt>$1</tt>)
	to the <tt>MemRegion</tt> in the symbolic store.
	<br>
	The second line is similar. When we evaluate <tt>x</tt> again, we do the same
	dance, and create an <tt>SVal</tt> that references the symbol <tt>$0</tt>. Note, two <tt>SVals</tt>
	might reference the same underlying values.

	<p>
	To summarize, MemRegions are unique names for blocks of memory. Symbols are
	unique names for abstract symbolic values. Some MemRegions represents abstract
	symbolic chunks of memory, and thus are also based on symbols. SVals are just
	references to values, and can reference either MemRegions, Symbols, or concrete
	values (e.g., the number 1).

	<!--
	TODO: Add a picture.
	<br>
	Symbols<br>
	FunctionalObjects are used throughout.
	-->
	<h2 id=idea>Idea for a Checker</h2>
	Here are several questions which you should consider when evaluating your
	checker idea:
	<ul>
	<li>Can the check be effectively implemented without path-sensitive
	analysis? See <a href="#ast">AST Visitors</a>.</li>

	<li>How high the false positive rate is going to be? Looking at the occurrences
	of the issue you want to write a checker for in the existing code bases might
	give you some ideas. </li>

	<li>How the current limitations of the analysis will effect the false alarm
	rate? Currently, the analyzer only reasons about one procedure at a time (no
	inter-procedural analysis). Also, it uses a simple range tracking based
	solver to model symbolic execution.</li>

	<li>Consult the <a
	href="http://llvm.org/bugs/buglist.cgi?query_format=advanced&bug_status=NEW&bug_status=REOPENED&version=trunk&component=Static%20Analyzer&product=clang">Bugzilla database</a>
	to get some ideas for new checkers and consider starting with improving/fixing
	bugs in the existing checkers.</li>
	</ul>

	<h2 id=registration>Checker Registration</h2>
	All checker implementation files are located in <tt>clang/lib/StaticAnalyzer/Checkers</tt>
	folder. Follow the steps below to register a new checker with the analyzer.
	<ol>
	<li>Create a new checker implementation file, for example <tt>./lib/StaticAnalyzer/Checkers/NewChecker.cpp</tt>
	<pre class="code_example">
	using namespace clang;
	using namespace ento;

	namespace {
	class NewChecker: public Checker< check::PreStmt<CallExpr> > {
	public:
	void checkPreStmt(const CallExpr *CE, CheckerContext &Ctx) const {}
	}
	}
	void ento::registerNewChecker(CheckerManager &mgr) {
	mgr.registerChecker<NewChecker>();
	}
	</pre>

	<li>Pick the package name for your checker and add the registration code to
	<tt>./lib/StaticAnalyzer/Checkers/Checkers.td</tt>. Note, all checkers should
	first be developed as experimental. Suppose our new checker performs security
	related checks, then we should add the following lines under
	<tt>SecurityExperimental</tt> package:
	<pre class="code_example">
	let ParentPackage = SecurityExperimental in {
	...
	def NewChecker : Checker<"NewChecker">,
	HelpText<"This text should give a short description of the checks performed.">,
	DescFile<"NewChecker.cpp">;
	...
	} // end "security.experimental"
	</pre>

	<li>Make the source code file visible to CMake by adding it to
	<tt>./lib/StaticAnalyzer/Checkers/CMakeLists.txt</tt>.

	<li>Compile and see your checker in the list of available checkers by running:<br>
	<tt><b>$clang -cc1 -analyzer-checker-help</b></tt>
	</ol>


	<h2 id=skeleton>Checker Skeleton</h2>
	There are two main decisions you need to make:
	<ul>
	<li> Which events the checker should be tracking.
	See <a href="http://clang.llvm.org/doxygen/classento_1_1CheckerDocumentation.html">CheckerDocumentation</a>
	for the list of available checker callbacks.</li>
	<li> What data you want to store as part of the checker-specific program
	state. Try to minimize the checker state as much as possible. </li>
	</ul>

	<h2 id=bugs>Bug Reports</h2>

	<h2 id=ast>AST Visitors</h2>
	Some checks might not require path-sensitivity to be effective. Simple AST walk
	might be sufficient. If that is the case, consider implementing a Clang
	compiler warning. On the other hand, a check might not be acceptable as a compiler
	warning; for example, because of a relatively high false positive rate. In this
	situation, AST callbacks <tt><b>checkASTDecl</b></tt> and
	<tt><b>checkASTCodeBody</b></tt> are your best friends.

	<h2 id=testing>Testing</h2>
	Every patch should be well tested with Clang regression tests. The checker tests
	live in <tt>clang/test/Analysis</tt> folder. To run all of the analyzer tests,
	execute the following from the <tt>clang</tt> build directory:
	<pre class="code">
	$ <b>TESTDIRS=Analysis make test</b>
	</pre>

	<h2 id=commands>Useful Commands/Debugging Hints</h2>
	<ul>
	<li>
	While investigating a checker-related issue, instruct the analyzer to only
	execute a single checker:
	<br><tt>
	$ <b>clang -cc1 -analyze -analyzer-checker=osx.KeychainAPI test.c</b>
	</tt>
	</li>
	<li>
	To dump AST:
	<br><tt>
	$ <b>clang -cc1 -ast-dump test.c</b>
	</tt>
	</li>
	<li>
	To view/dump CFG use <tt>debug.ViewCFG</tt> or <tt>debug.DumpCFG</tt> checkers:
	<br><tt>
	$ <b>clang -cc1 -analyze -analyzer-checker=debug.ViewCFG test.c</b>
	</tt>
	</li>
	<li>
	To see all available debug checkers:
	<br><tt>
	$ <b>clang -cc1 -analyzer-checker-help \| grep "debug"</b>
	</tt>
	</li>
	<li>
	To see which function is failing while processing a large file use
	<tt>-analyzer-display-progress</tt> option.
	</li>
	<li>
	While debugging execute <tt>clang -cc1 -analyze -analyzer-checker=core</tt>
	instead of <tt>clang --analyze</tt>, as the later would call the compiler
	in a separate process.
	</li>
	<li>
	To view <tt>ExplodedGraph</tt> (the state graph explored by the analyzer) while
	debugging, goto a frame that has <tt>clang::ento::ExprEngine</tt> object and
	execute:
	<br><tt>
	(gdb) <b>p ViewGraph(0)</b>
	</tt>
	</li>
	<li>
	To see the <tt>ProgramState</tt> while debugging use the following command.
	<br><tt>
	(gdb) <b>p State->dump()</b>
	</tt>
	</li>
	<li>
	To see <tt>clang::Expr</tt> while debugging use the following command. If you
	pass in a SourceManager object, it will also dump the corresponding line in the
	source code.
	<br><tt>
	(gdb) <b>p E->dump()</b>
	</tt>
	</li>
	<li>
	To dump AST of a method that the current <tt>ExplodedNode</tt> belongs to:
	<br><tt>
	(gdb) <b>p ENode->getCodeDecl().getBody()->dump(getContext().getSourceManager())</b>
	</tt>
	</li>
	</ul>

	</div>
	</div>
	</body>
	</html>