=================
SanitizerCoverage
=================

.. contents::
   :local:

Introduction
============

Sanitizer tools have a very simple code coverage tool built in. It provides
function-level, basic-block-level, and edge-level coverage at a very low
cost.

How to build and run
====================

SanitizerCoverage can be used with :doc:`AddressSanitizer`,
:doc:`LeakSanitizer`, :doc:`MemorySanitizer`,
UndefinedBehaviorSanitizer, or without any sanitizer. Pass one of the
following compile-time flags:

* ``-fsanitize-coverage=func`` for function-level coverage (very fast).
* ``-fsanitize-coverage=bb`` for basic-block-level coverage (may add up to 30%
  **extra** slowdown).
* ``-fsanitize-coverage=edge`` for edge-level coverage (up to 40% slowdown).

You may also specify ``-fsanitize-coverage=indirect-calls`` for
additional `caller-callee coverage`_.

At run time, pass ``coverage=1`` in ``ASAN_OPTIONS``,
``LSAN_OPTIONS``, ``MSAN_OPTIONS`` or ``UBSAN_OPTIONS``, as
appropriate. For the standalone coverage mode, use ``UBSAN_OPTIONS``.

To get `Coverage counters`_, add ``-fsanitize-coverage=8bit-counters``
to one of the above compile-time flags. At run time, use
``*SAN_OPTIONS=coverage=1:coverage_counters=1``.

Example:

.. code-block:: console

    % cat -n cov.cc
         1  #include <stdio.h>
         2  __attribute__((noinline))
         3  void foo() { printf("foo\n"); }
         4
         5  int main(int argc, char **argv) {
         6    if (argc == 2)
         7      foo();
         8    printf("main\n");
         9  }
    % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=func
    % ASAN_OPTIONS=coverage=1 ./a.out; ls -l *sancov
    main
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    % ASAN_OPTIONS=coverage=1 ./a.out foo ; ls -l *sancov
    foo
    main
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov

Every time you run an executable instrumented with SanitizerCoverage,
one ``*.sancov`` file is created during process shutdown.
If the executable is dynamically linked against instrumented DSOs,
one ``*.sancov`` file will also be created for every DSO.

Postprocessing
==============

The format of ``*.sancov`` files is very simple: the first 8 bytes are the
magic, either ``0xC0BFFFFFFFFFFF64`` or ``0xC0BFFFFFFFFFFF32``. The last byte
of the magic defines the size of the following offsets. The rest of the data
is the offsets in the corresponding binary/DSO that were executed during the
run.

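The layout described above can be decoded with a few lines of C++. The
following is a minimal sketch, not the code used by ``sancov.py``: the helper
name ``ParseSancov`` is hypothetical, and the sketch assumes that the file was
written in the byte order of the machine that reads it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Magic values from the format description; the last byte (0x64 or 0x32)
// selects the width of the offsets that follow.
constexpr uint64_t kMagic64 = 0xC0BFFFFFFFFFFF64ULL;
constexpr uint64_t kMagic32 = 0xC0BFFFFFFFFFFF32ULL;

// Hypothetical helper: decodes the raw bytes of one .sancov file.
// Assumption: the file uses the host byte order.
// Returns false if the magic is unknown or the payload size is inconsistent.
bool ParseSancov(const std::vector<uint8_t> &data,
                 std::vector<uint64_t> *offsets) {
  uint64_t magic;
  if (data.size() < sizeof(magic)) return false;
  std::memcpy(&magic, data.data(), sizeof(magic));

  size_t width;
  if (magic == kMagic64) width = 8;
  else if (magic == kMagic32) width = 4;
  else return false;

  if ((data.size() - sizeof(magic)) % width != 0) return false;

  offsets->clear();
  for (size_t pos = sizeof(magic); pos < data.size(); pos += width) {
    uint64_t value = 0;
    if (width == 8) {
      std::memcpy(&value, data.data() + pos, 8);
    } else {
      uint32_t value32;
      std::memcpy(&value32, data.data() + pos, 4);
      value = value32;
    }
    offsets->push_back(value);
  }
  return true;
}
```

Each decoded value is an offset into the corresponding binary or DSO, so it
can be passed to a symbolizer together with the path of that module.
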
A simple script
``$LLVM/projects/compiler-rt/lib/sanitizer_common/scripts/sancov.py`` is
provided to dump these offsets.

.. code-block:: console

    % sancov.py print a.out.22679.sancov a.out.22673.sancov
    sancov.py: read 2 PCs from a.out.22679.sancov
    sancov.py: read 1 PCs from a.out.22673.sancov
    sancov.py: 2 files merged; 2 PCs total
    0x465250
    0x4652a0

You can then filter the output of ``sancov.py`` through ``addr2line --exe
ObjectFile`` or ``llvm-symbolizer --obj ObjectFile`` to get file names and line
numbers:

.. code-block:: console

    % sancov.py print a.out.22679.sancov a.out.22673.sancov 2> /dev/null | llvm-symbolizer --obj a.out
    cov.cc:3
    cov.cc:5

Sancov Tool
===========

A new experimental ``sancov`` tool is being developed to process coverage files.
The tool is part of the LLVM project and is currently supported only on Linux.
It can handle symbolization tasks autonomously without any extra support
from the environment. You need to pass ``.sancov`` files (named
``<module_name>.<pid>.sancov``) and paths to all corresponding ELF binaries.
Sancov matches these files using module names and binary file names.

.. code-block:: console

    USAGE: sancov [options] <action> (<binary file>|<.sancov file>)...

    Action (required)
      -print                    - Print coverage addresses
      -covered-functions        - Print all covered functions.
      -not-covered-functions    - Print all not covered functions.
      -symbolize                - Symbolizes the report.

    Options
      -blacklist=<string>         - Blacklist file (sanitizer blacklist format).
      -demangle                   - Print demangled function name.
      -strip_path_prefix=<string> - Strip this prefix from file paths in reports


Coverage Reports (Experimental)
================================

``.sancov`` files do not contain enough information to generate a source-level
coverage report. The missing information is contained
in the debug info of the binary. Thus the ``.sancov`` file has to be symbolized
to produce a ``.symcov`` file first:

.. code-block:: console

    sancov -symbolize my_program.123.sancov my_program > my_program.123.symcov

The ``.symcov`` file can be browsed overlaid on the source code by
running the ``tools/sancov/sancov-report-server.py`` script, which starts
an HTTP server.


How good is the coverage?
=========================

It is possible to find out which PCs are not covered by subtracting the covered
set from the set of all instrumented PCs. The latter can be obtained by listing
all callsites of ``__sanitizer_cov()`` in the binary. On Linux, ``sancov.py``
can do this for you. Just supply the path to the binary and a list of covered
PCs:

.. code-block:: console

    % sancov.py print a.out.12345.sancov > covered.txt
    sancov.py: read 2 64-bit PCs from a.out.12345.sancov
    sancov.py: 1 file merged; 2 PCs total
    % sancov.py missing a.out < covered.txt
    sancov.py: found 3 instrumented PCs in a.out
    sancov.py: read 2 PCs from stdin
    sancov.py: 1 PCs missing from coverage
    0x4cc61c

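The subtraction itself is an ordinary set difference. The sketch below
illustrates the idea in C++; the function name ``MissingPCs`` is hypothetical
and the sketch assumes that both sets have already been loaded into memory
(it does not reproduce how ``sancov.py`` finds the instrumented PCs).

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <set>
#include <vector>

// Returns the instrumented PCs that do not appear in the covered set,
// in ascending order.
std::vector<uint64_t> MissingPCs(const std::set<uint64_t> &instrumented,
                                 const std::set<uint64_t> &covered) {
  std::vector<uint64_t> missing;
  // std::set iterates in sorted order, which std::set_difference requires.
  std::set_difference(instrumented.begin(), instrumented.end(),
                      covered.begin(), covered.end(),
                      std::back_inserter(missing));
  return missing;
}
```
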
Edge coverage
=============

Consider this code:

.. code-block:: c++

    void foo(int *a) {
      if (a)
        *a = 0;
    }

It contains 3 basic blocks, let's name them A, B, C:

.. code-block:: none

    A
    |\
    | \
    |  B
    | /
    |/
    C

If blocks A, B, and C are all covered we know for certain that the edges A=>B
and B=>C were executed, but we still don't know if the edge A=>C was executed.
Such edges of the control flow graph are called
`critical <http://en.wikipedia.org/wiki/Control_flow_graph#Special_edges>`_. The
edge-level coverage (``-fsanitize-coverage=edge``) simply splits all critical
edges by introducing new dummy blocks and then instruments those blocks:

.. code-block:: none

    A
    |\
    | \
    D  B
    | /
    |/
    C

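To make the notion of a critical edge concrete, here is a small C++ sketch
that finds such edges in a control flow graph. It uses the usual definition
(the source block has more than one successor and the destination block has
more than one predecessor). The ``CFG`` type and the ``CriticalEdges``
function are hypothetical names used for illustration only and are unrelated
to the compiler's own data structures.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// A control flow graph as an adjacency list: succ[b] lists the successors
// of block b. Blocks are numbered 0..N-1.
using CFG = std::vector<std::vector<int>>;

// Returns all critical edges: edges whose source has more than one
// successor and whose destination has more than one predecessor.
std::vector<std::pair<int, int>> CriticalEdges(const CFG &succ) {
  std::vector<int> num_preds(succ.size(), 0);
  for (const auto &targets : succ)
    for (int t : targets) ++num_preds[t];

  std::vector<std::pair<int, int>> critical;
  for (size_t b = 0; b < succ.size(); ++b) {
    if (succ[b].size() < 2) continue;  // A single way out is never critical.
    for (int t : succ[b])
      if (num_preds[t] > 1) critical.emplace_back(static_cast<int>(b), t);
  }
  return critical;
}
```

For the graph of ``foo`` above, with A, B, C numbered 0, 1, 2, the only
critical edge reported is A=>C, which is the edge that the dummy block D
splits.
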
Bitset
======

When the ``coverage_bitset=1`` run-time flag is given, the coverage will also
be dumped as a bitset (a text file with 1 for blocks that have been executed
and 0 for blocks that have not).

.. code-block:: console

    % clang++ -fsanitize=address -fsanitize-coverage=edge cov.cc
    % ASAN_OPTIONS="coverage=1:coverage_bitset=1" ./a.out
    main
    % ASAN_OPTIONS="coverage=1:coverage_bitset=1" ./a.out 1
    foo
    main
    % head *bitset*
    ==> a.out.38214.bitset-sancov <==
    01101
    ==> a.out.6128.bitset-sancov <==
    11011%

For a given executable the length of the bitset is always the same (well,
unless dlopen/dlclose come into play), so the bitset coverage can be
easily used for bitset-based corpus distillation.

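One simple way to distill a corpus from such bitsets is a greedy pass that
keeps an input only if it sets a bit that no earlier input has set. The
following C++ sketch shows that idea; ``DistillCorpus`` is a hypothetical
name, the greedy strategy is just one possible choice, and the sketch assumes
that the textual bitsets have already been read from the ``*.bitset-sancov``
files.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Greedy corpus distillation over textual bitsets ('1' = executed).
// Returns the indices of the inputs that add at least one new bit to the
// accumulated coverage, in the order they are visited.
// Assumption: all bitsets have the same length.
std::vector<size_t> DistillCorpus(const std::vector<std::string> &bitsets) {
  std::vector<size_t> kept;
  std::string seen;  // Union of all bitsets kept so far.
  for (size_t i = 0; i < bitsets.size(); ++i) {
    const std::string &bits = bitsets[i];
    if (seen.empty()) seen.assign(bits.size(), '0');
    bool adds_coverage = false;
    for (size_t j = 0; j < bits.size() && j < seen.size(); ++j) {
      if (bits[j] == '1' && seen[j] == '0') {
        seen[j] = '1';
        adds_coverage = true;
      }
    }
    if (adds_coverage) kept.push_back(i);
  }
  return kept;
}
```
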
Caller-callee coverage
======================

(Experimental!)
Every indirect function call is instrumented with a run-time function call that
captures caller and callee. At shutdown time the process dumps a separate
file called ``caller-callee.PID.sancov`` which contains caller/callee pairs as
pairs of lines (odd lines are callers, even lines are callees):

.. code-block:: console

    a.out 0x4a2e0c
    a.out 0x4a6510
    a.out 0x4a2e0c
    a.out 0x4a87f0

Current limitations:

* Only the first 14 callees for every caller are recorded, the rest are silently
  ignored.
* The output format is not very compact since caller and callee may reside in
  different modules and we need to spell out the module names.
* The routine that dumps the output is not optimized for speed.
* Only Linux x86_64 is tested so far.
* Sandboxes are not supported.

Coverage counters
=================

This experimental feature is inspired by
`AFL <http://lcamtuf.coredump.cx/afl/technical_details.txt>`__'s coverage
instrumentation. With additional compile-time and run-time flags you can get
more sensitive coverage information. In addition to boolean values assigned to
every basic block (edge) the instrumentation will collect imprecise counters.
On exit, every counter will be mapped to an 8-bit bitset representing counter
ranges: ``1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+`` and those 8-bit bitsets will
be dumped to disk.

.. code-block:: console

    % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=edge,8bit-counters
    % ASAN_OPTIONS="coverage=1:coverage_counters=1" ./a.out
    % ls -l *counters-sancov
    ... a.out.17110.counters-sancov
    % xxd *counters-sancov
    0000000: 0001 0100 01

These counters may also be used for in-process coverage-guided fuzzers. See
``include/sanitizer/coverage_interface.h``:

.. code-block:: c++

    // The coverage instrumentation may optionally provide imprecise counters.
    // Rather than exposing the counter values to the user we instead map
    // the counters to a bitset.
    // Every counter is associated with 8 bits in the bitset.
    // We define 8 value ranges: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+
    // The i-th bit is set to 1 if the counter value is in the i-th range.
    // This counter-based coverage implementation is *not* thread-safe.

    // Returns the number of registered coverage counters.
    uintptr_t __sanitizer_get_number_of_counters();
    // Updates the counter 'bitset', clears the counters and returns the number of
    // new bits in 'bitset'.
    // If 'bitset' is nullptr, only clears the counters.
    // Otherwise 'bitset' should be at least
    // __sanitizer_get_number_of_counters bytes long and 8-aligned.
    uintptr_t
    __sanitizer_update_counter_bitset_and_clear_counters(uint8_t *bitset);

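The mapping from a counter value to its range bit can be written down
directly from the list of ranges. The sketch below is an illustration only,
not the run-time's implementation: ``CounterToBitset`` is a hypothetical
name, ranges are assumed to be numbered from the least significant bit, and a
counter of zero is assumed to map to an empty bitset.

```cpp
#include <cstdint>

// Maps a raw counter value to an 8-bit bitset with exactly one bit set.
// Bit i corresponds to the i-th range (counted from the least significant
// bit): 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+.
// Assumption: a counter that was never incremented maps to 0.
uint8_t CounterToBitset(uint64_t counter) {
  if (counter == 0) return 0;
  if (counter >= 128) return 1u << 7;
  if (counter >= 32) return 1u << 6;
  if (counter >= 16) return 1u << 5;
  if (counter >= 8) return 1u << 4;
  if (counter >= 4) return 1u << 3;
  // Remaining values 1, 2, 3 map to bits 0, 1, 2.
  return static_cast<uint8_t>(1u << (counter - 1));
}
```

Keeping one bit per range means that two runs can be merged with a bitwise
OR, and a new bit in the merged value signals that an edge was executed a
noticeably different number of times.
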
Tracing basic blocks
====================

Experimental support for basic block (or edge) tracing.
With ``-fsanitize-coverage=trace-bb`` the compiler will insert
``__sanitizer_cov_trace_basic_block(s32 *id)`` before every function, basic
block, or edge (depending on the value of
``-fsanitize-coverage=[func,bb,edge]``).
Example:

.. code-block:: console

    % clang -g -fsanitize=address -fsanitize-coverage=edge,trace-bb foo.cc
    % ASAN_OPTIONS=coverage=1 ./a.out

This will produce two files after the process exits:
``trace-points.PID.sancov`` and ``trace-events.PID.sancov``.
The first file will contain a textual description of all the instrumented
points in the program in a form that you can feed into llvm-symbolizer
(e.g. ``a.out 0x4dca89``), one per line.
The second file will contain the actual execution trace as a sequence of
4-byte integers; these integers are indices into the array of instrumented
points (the first file).

Basic block tracing is currently supported only for single-threaded applications.


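Joining the two files is a matter of looking up each 4-byte index in the list
of points. The C++ sketch below shows one way to do it; ``ReplayTrace`` is a
hypothetical helper, and the sketch assumes that both files have already been
read into memory and that the indices are stored in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Replays a trace: 'points' holds the lines of trace-points.PID.sancov and
// 'events' holds the raw bytes of trace-events.PID.sancov.
// Assumption: the 4-byte indices use the host byte order.
// Indices that fall outside 'points' are skipped.
std::vector<std::string> ReplayTrace(const std::vector<std::string> &points,
                                     const std::vector<uint8_t> &events) {
  std::vector<std::string> trace;
  for (size_t pos = 0; pos + sizeof(uint32_t) <= events.size();
       pos += sizeof(uint32_t)) {
    uint32_t index;
    std::memcpy(&index, events.data() + pos, sizeof(index));
    if (index < points.size()) trace.push_back(points[index]);
  }
  return trace;
}
```

The resulting lines are in execution order and can be fed to llvm-symbolizer
as they are.
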
Tracing PCs
===========

*Experimental* feature similar to tracing basic blocks, but with a different API.
With ``-fsanitize-coverage=trace-pc`` the compiler will insert
``__sanitizer_cov_trace_pc()`` on every edge.
With an additional ``...=trace-pc,indirect-calls`` flag
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.
These callbacks are not implemented in the Sanitizer run-time and should be
defined by the user, so these flags do not require any other sanitizer.
This mechanism is used for fuzzing the Linux kernel (https://github.com/google/syzkaller)
and can be used with `AFL <http://lcamtuf.coredump.cx/afl>`__.

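A user-defined callback can be as small as the following C++ sketch. It is an
illustration under stated assumptions: the global ``g_seen_pcs`` and the
helper ``NumCoveredPCs`` are hypothetical names, the code is not thread-safe,
and the file that defines the callback is assumed to be compiled without
``-fsanitize-coverage`` so that the callback itself is not instrumented.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical storage for the PCs seen so far (not thread-safe).
static std::set<uintptr_t> g_seen_pcs;

// User-provided callback. With -fsanitize-coverage=trace-pc the compiler
// inserts a call to it on every edge of the instrumented code.
extern "C" void __sanitizer_cov_trace_pc() {
  // The return address identifies the instrumented edge that called us.
  g_seen_pcs.insert(
      reinterpret_cast<uintptr_t>(__builtin_return_address(0)));
}

// Hypothetical helper: number of distinct PCs observed so far.
size_t NumCoveredPCs() { return g_seen_pcs.size(); }
```

A fuzzer could compare ``NumCoveredPCs()`` before and after running an input
to decide whether the input reached new code.
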
Tracing PCs with guards
=======================

Another *experimental* feature that tries to combine the functionality of
``trace-pc``, ``8bit-counters`` and boolean coverage.

With ``-fsanitize-coverage=trace-pc-guard`` the compiler will insert the following code
on every edge:

.. code-block:: none

    if (guard_variable)
      __sanitizer_cov_trace_pc_guard(&guard_variable)

Every edge will have its own ``guard_variable`` (``uint32_t``).

The compiler will also insert a module constructor that will call

.. code-block:: c++

    // The guards are [start, stop).
    // This function may be called multiple times with the same values of start/stop.
    __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop);

Similarly to ``trace-pc,indirect-calls``, with ``trace-pc-guard,indirect-calls``
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.

The functions ``__sanitizer_cov_trace_pc_*`` should be defined by the user.

360
361.. code-block:: c++
362
363 // trace-pc-guard-cb.cc
364 #include <stdint.h>
365 #include <stdio.h>
366 #include <sanitizer/coverage_interface.h>
367
368 // This callback is inserted by the compiler as a module constructor
369 // into every compilation unit. 'start' and 'stop' correspond to the
370 // beginning and end of the section with the guards for the entire
371 // binary (executable or DSO) and so it will be called multiple times
372 // with the same parameters.
373 extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
374 uint32_t *stop) {
375 static uint64_t N; // Counter for the guards.
376 if (start == stop || *start) return; // Initialize only once.
377 printf("INIT: %p %p\n", start, stop);
378 for (uint32_t *x = start; x < stop; x++)
379 *x = ++N; // Guards should start from 1.
380 }
381
382 // This callback is inserted by the compiler on every edge in the
383 // control flow (some optimizations apply).
384 // Typically, the compiler will emit the code like this:
385 // if(*guard)
386 // __sanitizer_cov_trace_pc_guard(guard);
387 // But for large functions it will emit a simple call:
388 // __sanitizer_cov_trace_pc_guard(guard);
389 extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
390 if (!*guard) return; // Duplicate the guard check.
391 // If you set *guard to 0 this code will not be called again for this edge.
392 // Now you can get the PC and do whatever you want:
393 // store it somewhere or symbolize it and print right away.
394 // The values of `*guard` are as you set them in
Kostya Serebryany851cb982016-09-29 19:06:09 +0000395 // __sanitizer_cov_trace_pc_guard_init and so you can make them consecutive
Kostya Serebryanyd6ae22a2016-09-29 18:58:17 +0000396 // and use them to dereference an array or a bit vector.
397 void *PC = __builtin_return_address(0);
398 char PcDescr[1024];
399 // This function is a part of the sanitizer run-time.
400 // To use it, link with AddressSanitizer or other sanitizer.
401 __sanitizer_symbolize_pc(PC, "%p %F %L", PcDescr, sizeof(PcDescr));
402 printf("guard: %p %x PC %s\n", guard, *guard, PcDescr);
403 }
404
405.. code-block:: c++
406
407 // trace-pc-guard-example.cc
408 void foo() { }
409 int main(int argc, char **argv) {
410 if (argc > 1) foo();
411 }
412
413.. code-block:: console
414
415 clang++ -g -fsanitize-coverage=trace-pc-guard trace-pc-guard-example.cc -c
416 clang++ trace-pc-guard-cb.cc trace-pc-guard-example.o -fsanitize=address
417 ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out
418
419.. code-block:: console
420
421 INIT: 0x71bcd0 0x71bce0
422 guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:2
423 guard: 0x71bcd8 3 PC 0x4ecd9e in main trace-pc-guard-example.cc:3:7
424
Kostya Serebryany851cb982016-09-29 19:06:09 +0000425.. code-block:: console
426
427 ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out with-foo
428
429
430.. code-block:: console
431
432 INIT: 0x71bcd0 0x71bce0
433 guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:3
434 guard: 0x71bcdc 4 PC 0x4ecdc7 in main trace-pc-guard-example.cc:4:17
435 guard: 0x71bcd0 1 PC 0x4ecd20 in foo() trace-pc-guard-example.cc:2:14
436
Kostya Serebryanyd6ae22a2016-09-29 18:58:17 +0000437
Tracing data flow
=================

Support for data-flow-guided fuzzing.
With ``-fsanitize-coverage=trace-cmp`` the compiler will insert extra instrumentation
around comparison instructions and switch statements.
Similarly, with ``-fsanitize-coverage=trace-div`` the compiler will instrument
integer division instructions (to capture the right argument of division)
and with ``-fsanitize-coverage=trace-gep`` it will instrument
the `LLVM GEP instructions <http://llvm.org/docs/GetElementPtr.html>`_
(to capture array indices).

.. code-block:: c++

    // Called before a comparison instruction.
    // Arg1 and Arg2 are arguments of the comparison.
    void __sanitizer_cov_trace_cmp1(uint8_t Arg1, uint8_t Arg2);
    void __sanitizer_cov_trace_cmp2(uint16_t Arg1, uint16_t Arg2);
    void __sanitizer_cov_trace_cmp4(uint32_t Arg1, uint32_t Arg2);
    void __sanitizer_cov_trace_cmp8(uint64_t Arg1, uint64_t Arg2);

    // Called before a switch statement.
    // Val is the switch operand.
    // Cases[0] is the number of case constants.
    // Cases[1] is the size of Val in bits.
    // Cases[2:] are the case constants.
    void __sanitizer_cov_trace_switch(uint64_t Val, uint64_t *Cases);

    // Called before a division statement.
    // Val is the second argument of division.
    void __sanitizer_cov_trace_div4(uint32_t Val);
    void __sanitizer_cov_trace_div8(uint64_t Val);

    // Called before a GetElementPtr (GEP) instruction
    // for every non-constant array index.
    void __sanitizer_cov_trace_gep(uintptr_t Idx);


This interface is subject to change.
The current implementation is not thread-safe and thus can be safely used only for single-threaded targets.

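The ``Cases`` array passed to ``__sanitizer_cov_trace_switch`` packs a small
header and the case constants into one buffer. The C++ sketch below unpacks
it according to the layout given in the comments above; the ``SwitchInfo``
struct and the ``DecodeSwitchCases`` helper are hypothetical names introduced
for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical container for the decoded contents of a Cases array.
struct SwitchInfo {
  uint64_t value_bits;              // Size of the switch operand in bits.
  std::vector<uint64_t> constants;  // The case constants.
};

// Unpacks a Cases array laid out as
//   [count, bits, constant_0, ..., constant_{count-1}].
SwitchInfo DecodeSwitchCases(const uint64_t *cases) {
  SwitchInfo info;
  const uint64_t count = cases[0];
  info.value_bits = cases[1];
  info.constants.assign(cases + 2, cases + 2 + count);
  return info;
}
```

A user-defined ``__sanitizer_cov_trace_switch`` could call such a helper and,
for example, record the case constants closest to ``Val`` as hints for the
fuzzer.
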
Output directory
================

By default, ``.sancov`` files are created in the current working directory.
This can be changed with ``ASAN_OPTIONS=coverage_dir=/path``:

.. code-block:: console

    % ASAN_OPTIONS="coverage=1:coverage_dir=/tmp/cov" ./a.out foo
    % ls -l /tmp/cov/*sancov
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov

Sudden death
============

Normally, coverage data is collected in memory and saved to disk when the
program exits (with an ``atexit()`` handler), when a SIGSEGV is caught, or when
``__sanitizer_cov_dump()`` is called.

If the program ends with a signal that ASan does not handle (or cannot handle
at all, like SIGKILL), coverage data will be lost. This is a big problem on
Android, where SIGKILL is a normal way of evicting applications from memory.

With ``ASAN_OPTIONS=coverage=1:coverage_direct=1`` coverage data is written to a
memory-mapped file as soon as it is collected.

.. code-block:: console

    % ASAN_OPTIONS="coverage=1:coverage_direct=1" ./a.out
    main
    % ls
    7036.sancov.map  7036.sancov.raw  a.out
    % sancov.py rawunpack 7036.sancov.raw
    sancov.py: reading map 7036.sancov.map
    sancov.py: unpacking 7036.sancov.raw
    writing 1 PCs to a.out.7036.sancov
    % sancov.py print a.out.7036.sancov
    sancov.py: read 1 PCs from a.out.7036.sancov
    sancov.py: 1 files merged; 1 PCs total
    0x4b2bae

Note that on 64-bit platforms, this method writes 2x more data than the default,
because it stores full PC values instead of 32-bit offsets.

In-process fuzzing
==================

Coverage data can be useful for fuzzers, and sometimes it is preferable to run
a fuzzer in the same process as the code being fuzzed (in-process fuzzer).

You can use ``__sanitizer_get_total_unique_coverage()`` from
``<sanitizer/coverage_interface.h>``, which returns the number of currently
covered entities in the program. Calling it after testing every new input
tells the fuzzer whether the coverage has increased.

If a fuzzer finds a bug in the ASan run, you will need to save the reproducer
before exiting the process. Use ``__asan_set_death_callback`` from
``<sanitizer/asan_interface.h>`` to do that.

An example of such a fuzzer can be found in `the LLVM tree
<http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Fuzzer/README.txt?view=markup>`_.

Performance
===========

This coverage implementation is **fast**. With function-level coverage
(``-fsanitize-coverage=func``) the overhead is not measurable. With
basic-block-level coverage (``-fsanitize-coverage=bb``) the overhead varies
between 0 and 25%.

==============  =========  =========  =========  =========  =========  =========
     benchmark       cov0       cov1   diff 0-1       cov2   diff 0-2   diff 1-2
==============  =========  =========  =========  =========  =========  =========
 400.perlbench    1296.00    1307.00       1.01    1465.00       1.13       1.12
     401.bzip2     858.00     854.00       1.00    1010.00       1.18       1.18
       403.gcc     613.00     617.00       1.01     683.00       1.11       1.11
       429.mcf     605.00     582.00       0.96     610.00       1.01       1.05
     445.gobmk     896.00     880.00       0.98    1050.00       1.17       1.19
     456.hmmer     892.00     892.00       1.00     918.00       1.03       1.03
     458.sjeng     995.00    1009.00       1.01    1217.00       1.22       1.21
462.libquantum     497.00     492.00       0.99     534.00       1.07       1.09
   464.h264ref    1461.00    1467.00       1.00    1543.00       1.06       1.05
   471.omnetpp     575.00     590.00       1.03     660.00       1.15       1.12
     473.astar     658.00     652.00       0.99     715.00       1.09       1.10
 483.xalancbmk     471.00     491.00       1.04     582.00       1.24       1.19
      433.milc     616.00     627.00       1.02     627.00       1.02       1.00
      444.namd     602.00     601.00       1.00     654.00       1.09       1.09
    447.dealII     630.00     634.00       1.01     653.00       1.04       1.03
    450.soplex     365.00     368.00       1.01     395.00       1.08       1.07
    453.povray     427.00     434.00       1.02     495.00       1.16       1.14
       470.lbm     357.00     375.00       1.05     370.00       1.04       0.99
   482.sphinx3     927.00     928.00       1.00    1000.00       1.08       1.08
==============  =========  =========  =========  =========  =========  =========

Why another coverage?
=====================

Why did we implement yet another code coverage?

* We needed something that is lightning fast, plays well with
  AddressSanitizer, and does not significantly increase the binary size.
* Traditional coverage implementations based on global counters
  `suffer from contention on counters
  <https://groups.google.com/forum/#!topic/llvm-dev/cDqYgnxNEhY>`_.