=================
SanitizerCoverage
=================

.. contents::
   :local:

Introduction
============

Sanitizer tools have a very simple code coverage tool built in. It provides
function-level, basic-block-level, and edge-level coverage at a very low
cost.

How to build and run
====================

SanitizerCoverage can be used with :doc:`AddressSanitizer`,
:doc:`LeakSanitizer`, :doc:`MemorySanitizer`,
UndefinedBehaviorSanitizer, or without any sanitizer. Pass one of the
following compile-time flags:

* ``-fsanitize-coverage=func`` for function-level coverage (very fast).
* ``-fsanitize-coverage=bb`` for basic-block-level coverage (may add up to 30%
  extra slowdown).
* ``-fsanitize-coverage=edge`` for edge-level coverage (up to 40% slowdown).

You may also specify ``-fsanitize-coverage=indirect-calls`` for
additional `caller-callee coverage`_.

At run time, pass ``coverage=1`` in ``ASAN_OPTIONS``,
``LSAN_OPTIONS``, ``MSAN_OPTIONS`` or ``UBSAN_OPTIONS``, as
appropriate. For the standalone coverage mode, use ``UBSAN_OPTIONS``.

To get `Coverage counters`_, add ``-fsanitize-coverage=8bit-counters``
to one of the above compile-time flags. At run time, use
``*SAN_OPTIONS=coverage=1:coverage_counters=1``.

Example:

.. code-block:: console

    % cat -n cov.cc
         1  #include <stdio.h>
         2  __attribute__((noinline))
         3  void foo() { printf("foo\n"); }
         4
         5  int main(int argc, char **argv) {
         6    if (argc == 2)
         7      foo();
         8    printf("main\n");
         9  }
    % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=func
    % ASAN_OPTIONS=coverage=1 ./a.out; ls -l *sancov
    main
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    % ASAN_OPTIONS=coverage=1 ./a.out foo ; ls -l *sancov
    foo
    main
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov

Every time you run an executable instrumented with SanitizerCoverage,
one ``*.sancov`` file is created during process shutdown.
If the executable is dynamically linked against instrumented DSOs,
one ``*.sancov`` file will also be created for every DSO.

Postprocessing
==============

The format of ``*.sancov`` files is very simple: the first 8 bytes are the magic
number, either ``0xC0BFFFFFFFFFFF64`` or ``0xC0BFFFFFFFFFFF32``. The last byte of
the magic number defines the size of the offsets that follow (``64`` for 64-bit
offsets, ``32`` for 32-bit offsets). The rest of the data is the offsets in the
corresponding binary/DSO that were executed during the run.
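
The layout described above can be decoded in a few lines. The following is a
minimal sketch, not the code used by ``sancov.py``: ``ParseSancov`` is a
hypothetical helper name, and the sketch assumes the file was written on a host
with the same byte order as the one reading it.

.. code-block:: c++

    #include <cstdint>
    #include <cstring>
    #include <vector>

    // Hypothetical helper (not part of LLVM): decodes the raw bytes of a
    // .sancov file into offsets. Returns false if the magic number is not
    // recognized or the payload is not a whole number of offsets.
    // Assumption: the file uses the byte order of the current host.
    bool ParseSancov(const std::vector<uint8_t> &data,
                     std::vector<uint64_t> *offsets) {
      const uint64_t kMagic64 = 0xC0BFFFFFFFFFFF64ULL;
      const uint64_t kMagic32 = 0xC0BFFFFFFFFFFF32ULL;
      if (data.size() < sizeof(uint64_t)) return false;

      uint64_t magic;
      std::memcpy(&magic, data.data(), sizeof(magic));
      size_t width;  // Size of one offset in bytes.
      if (magic == kMagic64)
        width = 8;
      else if (magic == kMagic32)
        width = 4;
      else
        return false;

      if ((data.size() - sizeof(magic)) % width != 0) return false;

      offsets->clear();
      for (size_t pos = sizeof(magic); pos < data.size(); pos += width) {
        uint64_t value = 0;
        if (width == 8) {
          std::memcpy(&value, &data[pos], 8);
        } else {
          uint32_t value32;
          std::memcpy(&value32, &data[pos], 4);
          value = value32;
        }
        offsets->push_back(value);
      }
      return true;
    }

Widening 32-bit offsets to ``uint64_t`` lets callers handle both file flavors
with one container type.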

A simple script
``$LLVM/projects/compiler-rt/lib/sanitizer_common/scripts/sancov.py`` is
provided to dump these offsets.

.. code-block:: console

    % sancov.py print a.out.22679.sancov a.out.22673.sancov
    sancov.py: read 2 PCs from a.out.22679.sancov
    sancov.py: read 1 PCs from a.out.22673.sancov
    sancov.py: 2 files merged; 2 PCs total
    0x465250
    0x4652a0

You can then filter the output of ``sancov.py`` through ``addr2line --exe
ObjectFile`` or ``llvm-symbolizer --obj ObjectFile`` to get file names and line
numbers:

.. code-block:: console

    % sancov.py print a.out.22679.sancov a.out.22673.sancov 2> /dev/null | llvm-symbolizer --obj a.out
    cov.cc:3
    cov.cc:5

Sancov Tool
===========

A new experimental ``sancov`` tool is being developed to process coverage files.
The tool is part of the LLVM project and is currently supported only on Linux.
It can handle symbolization tasks autonomously without any extra support
from the environment. You need to pass ``.sancov`` files (named
``<module_name>.<pid>.sancov``) and paths to all corresponding binary ELF files.
Sancov matches these files using module names and binary file names.

.. code-block:: console

    USAGE: sancov [options] <action> (<binary file>|<.sancov file>)...

    Action (required)
      -print                    - Print coverage addresses
      -covered-functions        - Print all covered functions.
      -not-covered-functions    - Print all not covered functions.
      -symbolize                - Symbolizes the report.

    Options
      -blacklist=<string>         - Blacklist file (sanitizer blacklist format).
      -demangle                   - Print demangled function name.
      -strip_path_prefix=<string> - Strip this prefix from file paths in reports

Coverage Reports (Experimental)
================================

``.sancov`` files do not contain enough information to generate a source-level
coverage report. The missing information is contained
in the debug info of the binary. Thus the ``.sancov`` file has to be symbolized
to produce a ``.symcov`` file first:

.. code-block:: console

    sancov -symbolize my_program.123.sancov my_program > my_program.123.symcov

The ``.symcov`` file can be browsed overlaid on the source code by
running the ``tools/sancov/coverage-report-server.py`` script, which starts
an HTTP server.

How good is the coverage?
=========================

It is possible to find out which PCs are not covered by subtracting the covered
set from the set of all instrumented PCs. The latter can be obtained by listing
all callsites of ``__sanitizer_cov()`` in the binary. On Linux, ``sancov.py``
can do this for you. Just supply the path to the binary and a list of covered PCs:

.. code-block:: console

    % sancov.py print a.out.12345.sancov > covered.txt
    sancov.py: read 2 64-bit PCs from a.out.12345.sancov
    sancov.py: 1 file merged; 2 PCs total
    % sancov.py missing a.out < covered.txt
    sancov.py: found 3 instrumented PCs in a.out
    sancov.py: read 2 PCs from stdin
    sancov.py: 1 PCs missing from coverage
    0x4cc61c
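
The subtraction itself is an ordinary set difference. As a rough illustration
(``MissingPCs`` is a hypothetical name, and this is not how ``sancov.py`` is
implemented), it could be written as:

.. code-block:: c++

    #include <algorithm>
    #include <cstdint>
    #include <iterator>
    #include <vector>

    // Hypothetical helper: returns the instrumented PCs that do not appear in
    // the covered list. Both lists are taken by value so they can be sorted.
    std::vector<uint64_t> MissingPCs(std::vector<uint64_t> instrumented,
                                     std::vector<uint64_t> covered) {
      std::sort(instrumented.begin(), instrumented.end());
      std::sort(covered.begin(), covered.end());
      std::vector<uint64_t> missing;
      std::set_difference(instrumented.begin(), instrumented.end(),
                          covered.begin(), covered.end(),
                          std::back_inserter(missing));
      return missing;
    }

The result is sorted by address, which makes it convenient to feed into a
symbolizer.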

Edge coverage
=============

Consider this code:

.. code-block:: c++

    void foo(int *a) {
      if (a)
        *a = 0;
    }

It contains 3 basic blocks, let's name them A, B, C:

.. code-block:: none

    A
    |\
    | \
    |  B
    | /
    |/
    C

If blocks A, B, and C are all covered we know for certain that the edges A=>B
and B=>C were executed, but we still don't know if the edge A=>C was executed.
Such edges of the control flow graph are called
`critical <http://en.wikipedia.org/wiki/Control_flow_graph#Special_edges>`_. The
edge-level coverage (``-fsanitize-coverage=edge``) simply splits all critical
edges by introducing new dummy blocks and then instruments those blocks:

.. code-block:: none

    A
    |\
    | \
    D  B
    | /
    |/
    C
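
To make the notion of a critical edge concrete: an edge is critical when its
source block has more than one successor and its destination block has more
than one predecessor. The sketch below applies that definition to a graph given
as a list of edges. It is an illustration only; ``Edge`` and ``CriticalEdges``
are hypothetical names and this is not the compiler's implementation.

.. code-block:: c++

    #include <map>
    #include <utility>
    #include <vector>

    // An edge from block 'first' to block 'second'; blocks are named by a char.
    using Edge = std::pair<char, char>;

    // Hypothetical helper: returns the edges whose source has several
    // successors and whose destination has several predecessors.
    std::vector<Edge> CriticalEdges(const std::vector<Edge> &edges) {
      std::map<char, int> successors, predecessors;
      for (const Edge &e : edges) {
        ++successors[e.first];
        ++predecessors[e.second];
      }
      std::vector<Edge> critical;
      for (const Edge &e : edges)
        if (successors[e.first] > 1 && predecessors[e.second] > 1)
          critical.push_back(e);
      return critical;
    }

For the first graph above (edges A=>B, A=>C, B=>C) this reports only A=>C. For
the second graph, where D has been inserted on that edge, it reports nothing.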

Bitset
======

When the ``coverage_bitset=1`` run-time flag is given, the coverage will also be
dumped as a bitset (a text file with 1 for blocks that have been executed and 0
for blocks that were not).

.. code-block:: console

    % clang++ -fsanitize=address -fsanitize-coverage=edge cov.cc
    % ASAN_OPTIONS="coverage=1:coverage_bitset=1" ./a.out
    main
    % ASAN_OPTIONS="coverage=1:coverage_bitset=1" ./a.out 1
    foo
    main
    % head *bitset*
    ==> a.out.38214.bitset-sancov <==
    01101
    ==> a.out.6128.bitset-sancov <==
    11011%

For a given executable the length of the bitset is always the same (well,
unless dlopen/dlclose come into play), so the bitset coverage can be
easily used for bitset-based corpus distillation.
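
One simple way to distill a corpus from such bitsets is a greedy pass that
keeps a run only if it sets a bit that no previously kept run has set. The
sketch below assumes each bitset has already been read into a string of ``0``
and ``1`` characters; ``DistillCorpus`` is a hypothetical name, and other
distillation strategies are possible.

.. code-block:: c++

    #include <cstddef>
    #include <string>
    #include <vector>

    // Hypothetical helper: returns the indices of the runs that contribute at
    // least one bit not seen in any earlier kept run.
    std::vector<size_t> DistillCorpus(const std::vector<std::string> &bitsets) {
      std::vector<size_t> kept;
      std::string seen;  // Union of all bits covered so far.
      for (size_t i = 0; i < bitsets.size(); ++i) {
        const std::string &bits = bitsets[i];
        if (seen.size() < bits.size()) seen.resize(bits.size(), '0');
        bool adds_coverage = false;
        for (size_t j = 0; j < bits.size(); ++j) {
          if (bits[j] == '1' && seen[j] == '0') {
            seen[j] = '1';
            adds_coverage = true;
          }
        }
        if (adds_coverage) kept.push_back(i);
      }
      return kept;
    }

With the two bitsets from the example, ``01101`` and ``11011``, both runs are
kept because the second one sets bits that the first one does not.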

Caller-callee coverage
======================

(Experimental!)
Every indirect function call is instrumented with a run-time function call that
captures caller and callee. At shutdown the process dumps a separate
file called ``caller-callee.PID.sancov`` which contains caller/callee pairs as
pairs of lines (odd lines are callers, even lines are callees):

.. code-block:: console

    a.out 0x4a2e0c
    a.out 0x4a6510
    a.out 0x4a2e0c
    a.out 0x4a87f0
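
Reading this file back amounts to consuming two lines at a time. A minimal
sketch follows; ``CallSite`` and ``ParseCallerCallee`` are hypothetical names,
and the sketch assumes module names contain no whitespace.

.. code-block:: c++

    #include <sstream>
    #include <string>
    #include <utility>
    #include <vector>

    // One line of the file: a module name followed by an address.
    struct CallSite {
      std::string module;
      std::string address;
    };

    // Hypothetical helper: groups consecutive lines into (caller, callee)
    // pairs. Assumption: module names contain no whitespace.
    std::vector<std::pair<CallSite, CallSite>> ParseCallerCallee(
        const std::string &text) {
      std::istringstream in(text);
      std::vector<std::pair<CallSite, CallSite>> pairs;
      CallSite caller, callee;
      while (in >> caller.module >> caller.address >> callee.module >>
             callee.address)
        pairs.emplace_back(caller, callee);
      return pairs;
    }

For the four lines shown above this yields two pairs that share the caller
``0x4a2e0c``.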

Current limitations:

* Only the first 14 callees for every caller are recorded; the rest are silently
  ignored.
* The output format is not very compact since caller and callee may reside in
  different modules and we need to spell out the module names.
* The routine that dumps the output is not optimized for speed.
* Only Linux x86_64 is tested so far.
* Sandboxes are not supported.

Coverage counters
=================

This experimental feature is inspired by
`AFL <http://lcamtuf.coredump.cx/afl/technical_details.txt>`__'s coverage
instrumentation. With additional compile-time and run-time flags you can get
more sensitive coverage information. In addition to boolean values assigned to
every basic block (edge) the instrumentation will collect imprecise counters.
On exit, every counter will be mapped to an 8-bit bitset representing counter
ranges: ``1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+`` and those 8-bit bitsets will
be dumped to disk.

.. code-block:: console

    % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=edge,8bit-counters
    % ASAN_OPTIONS="coverage=1:coverage_counters=1" ./a.out
    % ls -l *counters-sancov
    ... a.out.17110.counters-sancov
    % xxd *counters-sancov
    0000000: 0001 0100 01
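
The range mapping can be illustrated with a small function. This is a sketch
rather than the run-time's code: ``CounterToRangeBit`` is a hypothetical name,
and it assumes that the first range (``1``) corresponds to the least
significant bit, which is consistent with the ``01`` bytes in the dump above.

.. code-block:: c++

    #include <cstdint>

    // Hypothetical helper: maps a counter value to the single bit of its
    // range. Ranges: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+.
    // Assumption: range i is stored in bit i, counting from the least
    // significant bit. A counter of 0 sets no bit.
    uint8_t CounterToRangeBit(uint32_t counter) {
      if (counter == 0) return 0;
      if (counter >= 128) return 1 << 7;
      if (counter >= 32) return 1 << 6;
      if (counter >= 16) return 1 << 5;
      if (counter >= 8) return 1 << 4;
      if (counter >= 4) return 1 << 3;
      return 1 << (counter - 1);  // Counters 1, 2, 3 map to bits 0, 1, 2.
    }

Because each call yields at most one bit, OR-ing the results of successive runs
into the same byte records every range that a counter has ever reached.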

These counters may also be used for in-process coverage-guided fuzzers. See
``include/sanitizer/coverage_interface.h``:

.. code-block:: c++

    // The coverage instrumentation may optionally provide imprecise counters.
    // Rather than exposing the counter values to the user we instead map
    // the counters to a bitset.
    // Every counter is associated with 8 bits in the bitset.
    // We define 8 value ranges: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+
    // The i-th bit is set to 1 if the counter value is in the i-th range.
    // This counter-based coverage implementation is not thread-safe.

    // Returns the number of registered coverage counters.
    uintptr_t __sanitizer_get_number_of_counters();
    // Updates the counter 'bitset', clears the counters and returns the number of
    // new bits in 'bitset'.
    // If 'bitset' is nullptr, only clears the counters.
    // Otherwise 'bitset' should be at least
    // __sanitizer_get_number_of_counters bytes long and 8-aligned.
    uintptr_t
    __sanitizer_update_counter_bitset_and_clear_counters(uint8_t *bitset);

Tracing basic blocks
====================

Experimental support for basic block (or edge) tracing.
With ``-fsanitize-coverage=trace-bb`` the compiler will insert
``__sanitizer_cov_trace_basic_block(s32 *id)`` before every function, basic block, or edge
(depending on the value of ``-fsanitize-coverage=[func,bb,edge]``).
Example:

.. code-block:: console

    % clang -g -fsanitize=address -fsanitize-coverage=edge,trace-bb foo.cc
    % ASAN_OPTIONS=coverage=1 ./a.out

This will produce two files after the process exits:
``trace-points.PID.sancov`` and ``trace-events.PID.sancov``.
The first file will contain a textual description of all the instrumented points in the program
in a form that you can feed into llvm-symbolizer (e.g. ``a.out 0x4dca89``), one per line.
The second file will contain the actual execution trace as a sequence of 4-byte integers;
these integers are indices into the array of instrumented points (the first file).

Basic block tracing is currently supported only for single-threaded applications.
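
Combining the two files is a matter of looking up each 4-byte index in the
list of points. The sketch below assumes the indices are zero-based and stored
in host byte order, neither of which is stated above, and ``DecodeTrace`` is a
hypothetical name.

.. code-block:: c++

    #include <cstddef>
    #include <cstdint>
    #include <cstring>
    #include <string>
    #include <vector>

    // Hypothetical helper: 'points' holds the lines of the trace-points file
    // and 'events' the raw bytes of the trace-events file.
    // Assumptions: indices are zero-based and use the host byte order.
    // Indices that are out of range are skipped.
    std::vector<std::string> DecodeTrace(const std::vector<std::string> &points,
                                         const std::vector<uint8_t> &events) {
      std::vector<std::string> trace;
      for (size_t pos = 0; pos + 4 <= events.size(); pos += 4) {
        uint32_t index;
        std::memcpy(&index, &events[pos], 4);
        if (index < points.size()) trace.push_back(points[index]);
      }
      return trace;
    }

Each element of the result has the ``module address`` form, so the decoded
trace can be passed to llvm-symbolizer line by line.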

Tracing PCs
===========

Experimental feature similar to tracing basic blocks, but with a different API.
With ``-fsanitize-coverage=trace-pc`` the compiler will insert
``__sanitizer_cov_trace_pc()`` on every edge.
With an additional ``...=trace-pc,indirect-calls`` flag
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.
These callbacks are not implemented in the Sanitizer run-time and should be defined
by the user. So, these flags do not require any other sanitizer to be used.
This mechanism is used for fuzzing the Linux kernel (https://github.com/google/syzkaller)
and can be used with `AFL <http://lcamtuf.coredump.cx/afl>`__.
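
A user-provided pair of callbacks might look like the sketch below. The buffer
names and size are assumptions made for illustration; only the two callback
signatures come from the text above. The file defining the callbacks should be
compiled without ``-fsanitize-coverage`` so that the callbacks do not
instrument themselves, and the sketch is not thread-safe.

.. code-block:: c++

    #include <cstddef>
    #include <cstdint>

    // Fixed-size storage (hypothetical names and size) so that the callbacks
    // never allocate memory.
    static const size_t kMaxPCs = 1 << 16;
    static uintptr_t g_pcs[kMaxPCs];
    static size_t g_num_pcs;

    // Called on every edge: records the address the callback returns to,
    // which lies inside the instrumented code.
    extern "C" void __sanitizer_cov_trace_pc() {
      if (g_num_pcs < kMaxPCs)
        g_pcs[g_num_pcs++] =
            reinterpret_cast<uintptr_t>(__builtin_return_address(0));
    }

    // Called on every indirect call: records the callee address instead.
    extern "C" void __sanitizer_cov_trace_pc_indirect(void *callee) {
      if (g_num_pcs < kMaxPCs)
        g_pcs[g_num_pcs++] = reinterpret_cast<uintptr_t>(callee);
    }

Once the buffer is full, further events are dropped. A real tool would
typically flush or deduplicate the buffer instead.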

Tracing PCs with guards
=======================

Another experimental feature that tries to combine the functionality of ``trace-pc``,
``8bit-counters`` and boolean coverage.

With ``-fsanitize-coverage=trace-pc-guard`` the compiler will insert the following code
on every edge:

.. code-block:: none

    if (guard_variable)
      __sanitizer_cov_trace_pc_guard(&guard_variable)

Every edge will have its own ``guard_variable`` (``uint32_t``).

The compiler will also insert a module constructor that will call

.. code-block:: c++

    // The guards are [start, stop).
    // This function may be called multiple times with the same values of start/stop.
    __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop);

Similarly to ``trace-pc,indirect-calls``, with ``trace-pc-guard,indirect-calls``
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.

The functions ``__sanitizer_cov_trace_pc_*`` should be defined by the user.

Example:

.. code-block:: c++

    // trace-pc-guard-cb.cc
    #include <stdint.h>
    #include <stdio.h>
    #include <sanitizer/coverage_interface.h>

    // This callback is inserted by the compiler as a module constructor
    // into every compilation unit. 'start' and 'stop' correspond to the
    // beginning and end of the section with the guards for the entire
    // binary (executable or DSO) and so it will be called multiple times
    // with the same parameters.
    extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
                                                        uint32_t *stop) {
      static uint64_t N;  // Counter for the guards.
      if (start == stop || *start) return;  // Initialize only once.
      printf("INIT: %p %p\n", start, stop);
      for (uint32_t *x = start; x < stop; x++)
        *x = ++N;  // Guards should start from 1.
    }

    // This callback is inserted by the compiler on every edge in the
    // control flow (some optimizations apply).
    // Typically, the compiler will emit the code like this:
    //    if(*guard)
    //      __sanitizer_cov_trace_pc_guard(guard);
    // But for large functions it will emit a simple call:
    //    __sanitizer_cov_trace_pc_guard(guard);
    extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
      if (!*guard) return;  // Duplicate the guard check.
      // If you set *guard to 0 this code will not be called again for this edge.
      // Now you can get the PC and do whatever you want:
      //   store it somewhere or symbolize it and print right away.
      // The values of `*guard` are as you set them in
      // __sanitizer_cov_trace_pc_guard_init and so you can make them consecutive
      // and use them to dereference an array or a bit vector.
      void *PC = __builtin_return_address(0);
      char PcDescr[1024];
      // This function is a part of the sanitizer run-time.
      // To use it, link with AddressSanitizer or other sanitizer.
      __sanitizer_symbolize_pc(PC, "%p %F %L", PcDescr, sizeof(PcDescr));
      printf("guard: %p %x PC %s\n", guard, *guard, PcDescr);
    }

.. code-block:: c++

    // trace-pc-guard-example.cc
    void foo() { }
    int main(int argc, char **argv) {
      if (argc > 1) foo();
    }

.. code-block:: console

    clang++ -g -fsanitize-coverage=trace-pc-guard trace-pc-guard-example.cc -c
    clang++ trace-pc-guard-cb.cc trace-pc-guard-example.o -fsanitize=address
    ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out

.. code-block:: console

    INIT: 0x71bcd0 0x71bce0
    guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:2
    guard: 0x71bcd8 3 PC 0x4ecd9e in main trace-pc-guard-example.cc:3:7

.. code-block:: console

    ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out with-foo

.. code-block:: console

    INIT: 0x71bcd0 0x71bce0
    guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:3
    guard: 0x71bcdc 4 PC 0x4ecdc7 in main trace-pc-guard-example.cc:4:17
    guard: 0x71bcd0 1 PC 0x4ecd20 in foo() trace-pc-guard-example.cc:2:14

Tracing data flow
=================

Support for data-flow-guided fuzzing.
With ``-fsanitize-coverage=trace-cmp`` the compiler will insert extra instrumentation
around comparison instructions and switch statements.
Similarly, with ``-fsanitize-coverage=trace-div`` the compiler will instrument
integer division instructions (to capture the right argument of division)
and with ``-fsanitize-coverage=trace-gep`` it will instrument
the `LLVM GEP instructions <http://llvm.org/docs/GetElementPtr.html>`_
(to capture array indices).

.. code-block:: c++

    // Called before a comparison instruction.
    // Arg1 and Arg2 are arguments of the comparison.
    void __sanitizer_cov_trace_cmp1(uint8_t Arg1, uint8_t Arg2);
    void __sanitizer_cov_trace_cmp2(uint16_t Arg1, uint16_t Arg2);
    void __sanitizer_cov_trace_cmp4(uint32_t Arg1, uint32_t Arg2);
    void __sanitizer_cov_trace_cmp8(uint64_t Arg1, uint64_t Arg2);

    // Called before a switch statement.
    // Val is the switch operand.
    // Cases[0] is the number of case constants.
    // Cases[1] is the size of Val in bits.
    // Cases[2:] are the case constants.
    void __sanitizer_cov_trace_switch(uint64_t Val, uint64_t *Cases);

    // Called before a division statement.
    // Val is the second argument of division.
    void __sanitizer_cov_trace_div4(uint32_t Val);
    void __sanitizer_cov_trace_div8(uint64_t Val);

    // Called before a GetElementPtr (GEP) instruction
    // for every non-constant array index.
    void __sanitizer_cov_trace_gep(uintptr_t Idx);

This interface is subject to change.
The current implementation is not thread-safe and thus can be safely used only for single-threaded targets.
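
As an illustration of how these callbacks could be consumed, the sketch below
logs every comparison whose operands differ, including one entry per
non-matching ``switch`` case. ``Comparison``, ``Record`` and the log buffer are
hypothetical names. The sketch defines only three of the callbacks listed
above, so it is a starting point rather than a complete implementation, and it
shares the single-threaded limitation.

.. code-block:: c++

    #include <cstddef>
    #include <cstdint>

    // Hypothetical fixed-size log of operand pairs that compared unequal.
    struct Comparison {
      uint64_t lhs, rhs;
    };
    static const size_t kMaxLog = 4096;
    static Comparison g_log[kMaxLog];
    static size_t g_log_size;

    // Stores the pair if the operands differ and there is room left.
    static void Record(uint64_t lhs, uint64_t rhs) {
      if (lhs != rhs && g_log_size < kMaxLog) g_log[g_log_size++] = {lhs, rhs};
    }

    extern "C" void __sanitizer_cov_trace_cmp4(uint32_t Arg1, uint32_t Arg2) {
      Record(Arg1, Arg2);
    }

    extern "C" void __sanitizer_cov_trace_cmp8(uint64_t Arg1, uint64_t Arg2) {
      Record(Arg1, Arg2);
    }

    // Cases[0] is the number of case constants; the constants start at Cases[2].
    extern "C" void __sanitizer_cov_trace_switch(uint64_t Val, uint64_t *Cases) {
      const uint64_t num_cases = Cases[0];
      for (uint64_t i = 0; i < num_cases; ++i) Record(Val, Cases[2 + i]);
    }

A fuzzer can use the logged pairs as hints, for example by trying inputs in
which the bytes matching one operand are replaced by the other.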

Output directory
================

By default, ``.sancov`` files are created in the current working directory.
This can be changed with ``ASAN_OPTIONS=coverage_dir=/path``:

.. code-block:: console

    % ASAN_OPTIONS="coverage=1:coverage_dir=/tmp/cov" ./a.out foo
    % ls -l /tmp/cov/*sancov
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov

Sudden death
============

Normally, coverage data is collected in memory and saved to disk when the
program exits (with an ``atexit()`` handler), when a SIGSEGV is caught, or when
``__sanitizer_cov_dump()`` is called.

If the program ends with a signal that ASan does not handle (or cannot handle
at all, like SIGKILL), coverage data will be lost. This is a big problem on
Android, where SIGKILL is a normal way of evicting applications from memory.

With ``ASAN_OPTIONS=coverage=1:coverage_direct=1`` coverage data is written to a
memory-mapped file as soon as it is collected.

.. code-block:: console

    % ASAN_OPTIONS="coverage=1:coverage_direct=1" ./a.out
    main
    % ls
    7036.sancov.map  7036.sancov.raw  a.out
    % sancov.py rawunpack 7036.sancov.raw
    sancov.py: reading map 7036.sancov.map
    sancov.py: unpacking 7036.sancov.raw
    writing 1 PCs to a.out.7036.sancov
    % sancov.py print a.out.7036.sancov
    sancov.py: read 1 PCs from a.out.7036.sancov
    sancov.py: 1 files merged; 1 PCs total
    0x4b2bae

Note that on 64-bit platforms, this method writes 2x more data than the default,
because it stores full PC values instead of 32-bit offsets.

In-process fuzzing
==================

Coverage data could be useful for fuzzers and sometimes it is preferable to run
a fuzzer in the same process as the code being fuzzed (in-process fuzzer).

You can use ``__sanitizer_get_total_unique_coverage()`` from
``<sanitizer/coverage_interface.h>`` which returns the number of currently
covered entities in the program. This will tell the fuzzer if the coverage has
increased after testing every new input.
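
The resulting feedback loop can be sketched as follows. To keep the example
independent of the sanitizer run-time, the coverage query is passed in as a
callable. In a real fuzzer that callable would presumably wrap
``__sanitizer_get_total_unique_coverage()``. ``Input`` and
``KeepInterestingInputs`` are hypothetical names.

.. code-block:: c++

    #include <cstdint>
    #include <functional>
    #include <vector>

    using Input = std::vector<uint8_t>;

    // Hypothetical helper: runs every candidate through 'target' and keeps
    // the candidates after which 'total_coverage' reports a larger value.
    std::vector<Input> KeepInterestingInputs(
        const std::vector<Input> &candidates,
        const std::function<void(const Input &)> &target,
        const std::function<uintptr_t()> &total_coverage) {
      std::vector<Input> corpus;
      uintptr_t best = total_coverage();
      for (const Input &input : candidates) {
        target(input);
        const uintptr_t now = total_coverage();
        if (now > best) {  // The input reached something new.
          best = now;
          corpus.push_back(input);
        }
      }
      return corpus;
    }

Because the total only ever grows, comparing it before and after each run is
enough to decide whether an input is worth keeping.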

If a fuzzer finds a bug in the ASan run, you will need to save the reproducer
before exiting the process. Use ``__asan_set_death_callback`` from
``<sanitizer/asan_interface.h>`` to do that.

An example of such a fuzzer can be found in `the LLVM tree
<http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Fuzzer/README.txt?view=markup>`_.

Performance
===========

This coverage implementation is fast. With function-level coverage
(``-fsanitize-coverage=func``) the overhead is not measurable. With
basic-block-level coverage (``-fsanitize-coverage=bb``) the overhead varies
between 0 and 25%.

==============  =======  =======  ========  =======  ========  ========
benchmark       cov0     cov1     diff 0-1  cov2     diff 0-2  diff 1-2
==============  =======  =======  ========  =======  ========  ========
400.perlbench   1296.00  1307.00  1.01      1465.00  1.13      1.12
401.bzip2       858.00   854.00   1.00      1010.00  1.18      1.18
403.gcc         613.00   617.00   1.01      683.00   1.11      1.11
429.mcf         605.00   582.00   0.96      610.00   1.01      1.05
445.gobmk       896.00   880.00   0.98      1050.00  1.17      1.19
456.hmmer       892.00   892.00   1.00      918.00   1.03      1.03
458.sjeng       995.00   1009.00  1.01      1217.00  1.22      1.21
462.libquantum  497.00   492.00   0.99      534.00   1.07      1.09
464.h264ref     1461.00  1467.00  1.00      1543.00  1.06      1.05
471.omnetpp     575.00   590.00   1.03      660.00   1.15      1.12
473.astar       658.00   652.00   0.99      715.00   1.09      1.10
483.xalancbmk   471.00   491.00   1.04      582.00   1.24      1.19
433.milc        616.00   627.00   1.02      627.00   1.02      1.00
444.namd        602.00   601.00   1.00      654.00   1.09      1.09
447.dealII      630.00   634.00   1.01      653.00   1.04      1.03
450.soplex      365.00   368.00   1.01      395.00   1.08      1.07
453.povray      427.00   434.00   1.02      495.00   1.16      1.14
470.lbm         357.00   375.00   1.05      370.00   1.04      0.99
482.sphinx3     927.00   928.00   1.00      1000.00  1.08      1.08
==============  =======  =======  ========  =======  ========  ========
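
Why another coverage?
=====================

Why did we implement yet another code coverage tool?

* We needed something that is lightning fast, plays well with
  AddressSanitizer, and does not significantly increase the binary size.
* Traditional coverage implementations based on global counters
  `suffer from contention on counters
  <https://groups.google.com/forum/#!topic/llvm-dev/cDqYgnxNEhY>`_.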
				580	AddressSanitizer, and does not significantly increase the binary size.
				581	* Traditional coverage implementations based in global counters
				582	`suffer from contention on counters
				583	<https://groups.google.com/forum/#!topic/llvm-dev/cDqYgnxNEhY>`_.