tools/trace_example.txt - platform/external/bcc - Gitiles

 Demonstrations of trace.


 trace probes functions you specify and displays trace messages if a particular
 condition is met. You can control the message format to display function
 arguments and return values.

 For example, suppose you want to trace all commands being exec'd across the
 system:

 # trace 'sys_execve "%s", arg1'
 TIME     PID    COMM         FUNC             -
 05:11:51 4402   bash         sys_execve       /usr/bin/man
 05:11:51 4411   man          sys_execve       /usr/local/bin/less
 05:11:51 4411   man          sys_execve       /usr/bin/less
 05:11:51 4410   man          sys_execve       /usr/local/bin/nroff
 05:11:51 4410   man          sys_execve       /usr/bin/nroff
 05:11:51 4409   man          sys_execve       /usr/local/bin/tbl
 05:11:51 4409   man          sys_execve       /usr/bin/tbl
 05:11:51 4408   man          sys_execve       /usr/local/bin/preconv
 05:11:51 4408   man          sys_execve       /usr/bin/preconv
 05:11:51 4415   nroff        sys_execve       /usr/bin/locale
 05:11:51 4416   nroff        sys_execve       /usr/bin/groff
 05:11:51 4418   groff        sys_execve       /usr/bin/grotty
 05:11:51 4417   groff        sys_execve       /usr/bin/troff
 ^C

 The ::sys_execve syntax specifies that you want an entry probe (which is the
 default), in a kernel function (which is the default) called sys_execve. Next,
 the format string to print is simply "%s", which prints a string. Finally, the
 value to print is the first argument to the sys_execve function, which happens
 to be the command that is exec'd. The above trace was generated by executing
 "man ls" in a separate shell. As you see, man executes a number of additional
 programs to finally display the man page.

 Next, suppose you are looking for large reads across the system. Let's trace
 the read system call and inspect the third argument, which is the number of
 bytes to be read:

 # trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
 TIME     PID    COMM         FUNC             -
 05:18:23 4490   dd           sys_read         read 1048576 bytes
 05:18:23 4490   dd           sys_read         read 1048576 bytes
 05:18:23 4490   dd           sys_read         read 1048576 bytes
 05:18:23 4490   dd           sys_read         read 1048576 bytes
 ^C

 During the trace, I executed "dd if=/dev/zero of=/dev/null bs=1M count=4".
 The individual reads are visible, with the custom format message printed for
 each read. The parenthesized expression "(arg3 > 20000)" is a filter that is
 evaluated for each invocation of the probe before printing anything.

 You can also trace user functions. For example, let's simulate the bashreadline
 script, which attaches to the readline function in bash and prints its return
 value, effectively snooping all bash shell input across the system:

 # trace 'r:bash:readline "%s", retval'
 TIME     PID    COMM         FUNC             -
 05:24:50 2740   bash         readline         echo hi!
 05:24:53 2740   bash         readline         man ls
 ^C

 The special retval keywords stands for the function's return value, and can
 be used only in a retprobe, specified by the 'r' prefix. The next component
 of the probe is the library that contains the desired function. It's OK to
 specify executables too, as long as they can be found in the PATH. Or, you
 can specify the full path to the executable (e.g. "/usr/bin/bash").

 Multiple probes can be combined on the same command line. For example, let's
 trace failed read and write calls on the libc level:

 # trace 'r:c:read ((int)retval < 0) "read failed: %d", retval' \
         'r:c:write ((int)retval < 0) "write failed: %d", retval'
 TIME     PID    COMM         FUNC             -
 05:31:57 3388   bash         write            write failed: -1
 05:32:00 3388   bash         write            write failed: -1
 ^C

 Note that the retval variable must be cast to int before comparing to zero.
 The reason is that the default type for argN and retval is an unsigned 64-bit
 integer, which can never be smaller than 0.

 trace has also some basic support for kernel tracepoints. For example, let's
 trace the block:block_rq_complete tracepoint and print out the number of sectors
 transferred:

 # trace 't:block:block_rq_complete "sectors=%d", tp.nr_sector'
 TIME     PID    COMM         FUNC             -
 01:23:51 0      swapper/0    block_rq_complete sectors=8
 01:23:55 10017  kworker/u64: block_rq_complete sectors=1
 01:23:55 0      swapper/0    block_rq_complete sectors=8
 ^C

 To discover the tracepoint structure format (which you can refer to as the "tp"
 variable), use the tplist tool. For example:

 # tplist -v block:block_rq_complete
 block:block_rq_complete
     dev_t dev;
     sector_t sector;
     unsigned int nr_sector;
     int errors;
     char rwbs[8];

 This output tells you that you can use "tp.dev", "tp.sector", etc. in your
 predicate and trace arguments.

 As a final example, let's trace open syscalls for a specific process. By
 default, tracing is system-wide, but the -p switch overrides this:

 # trace -p 2740 'do_sys_open "%s", arg2'
 TIME     PID    COMM         FUNC             -
 05:36:16 15872  ls           do_sys_open      /etc/ld.so.cache
 05:36:16 15872  ls           do_sys_open      /lib64/libselinux.so.1
 05:36:16 15872  ls           do_sys_open      /lib64/libcap.so.2
 05:36:16 15872  ls           do_sys_open      /lib64/libacl.so.1
 05:36:16 15872  ls           do_sys_open      /lib64/libc.so.6
 05:36:16 15872  ls           do_sys_open      /lib64/libpcre.so.1
 05:36:16 15872  ls           do_sys_open      /lib64/libdl.so.2
 05:36:16 15872  ls           do_sys_open      /lib64/libattr.so.1
 05:36:16 15872  ls           do_sys_open      /lib64/libpthread.so.0
 05:36:16 15872  ls           do_sys_open      /usr/lib/locale/locale-archive
 05:36:16 15872  ls           do_sys_open      /home/vagrant
 ^C

 In this example, we traced the "ls ~" command as it was opening its shared
 libraries and then accessing the /home/vagrant directory listing.


 USAGE message:

 # trace -h
 usage: trace.py [-h] [-p PID] [-v] [-Z STRING_SIZE] [-S] [-M MAX_EVENTS] [-o]
                 probe [probe ...]

 Attach to functions and print trace messages.

 positional arguments:
   probe                 probe specifier (see examples)

 optional arguments:
   -h, --help            show this help message and exit
   -p PID, --pid PID     id of the process to trace (optional)
   -v, --verbose         print resulting BPF program code before executing
   -Z STRING_SIZE, --string-size STRING_SIZE
                         maximum size to read from strings
   -S, --include-self    do not filter trace's own pid from the trace
   -M MAX_EVENTS, --max-events MAX_EVENTS
                         number of events to print before quitting
   -o, --offset          use relative time from first traced message

 EXAMPLES:

 trace do_sys_open
         Trace the open syscall and print a default trace message when entered
 trace 'do_sys_open "%s", arg2'
         Trace the open syscall and print the filename being opened
 trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
         Trace the read syscall and print a message for reads >20000 bytes
 trace r::do_sys_return
         Trace the return from the open syscall
 trace 'c:open (arg2 == 42) "%s %d", arg1, arg2'
         Trace the open() call from libc only if the flags (arg2) argument is 42
 trace 'c:malloc "size = %d", arg1'
         Trace malloc calls and print the size being allocated
 trace 'p:c:write (arg1 == 1) "writing %d bytes to STDOUT", arg3'
         Trace the write() call from libc to monitor writes to STDOUT
 trace 'r::__kmalloc (retval == 0) "kmalloc failed!"
         Trace returns from __kmalloc which returned a null pointer
 trace 'r:c:malloc (retval) "allocated = %p", retval
         Trace returns from malloc and print non-NULL allocated buffers
 trace 't:block:block_rq_complete "sectors=%d", tp.nr_sector'
         Trace the block_rq_complete kernel tracepoint and print # of tx sectors
 trace 'u:pthread:pthread_create (arg4 != 0)'
         Trace the USDT probe pthread_create when its 4th argument is non-zero
	Demonstrations of trace.


	trace probes functions you specify and displays trace messages if a particular
	condition is met. You can control the message format to display function
	arguments and return values.

	For example, suppose you want to trace all commands being exec'd across the
	system:

	# trace 'sys_execve "%s", arg1'
	TIME PID COMM FUNC -
	05:11:51 4402 bash sys_execve /usr/bin/man
	05:11:51 4411 man sys_execve /usr/local/bin/less
	05:11:51 4411 man sys_execve /usr/bin/less
	05:11:51 4410 man sys_execve /usr/local/bin/nroff
	05:11:51 4410 man sys_execve /usr/bin/nroff
	05:11:51 4409 man sys_execve /usr/local/bin/tbl
	05:11:51 4409 man sys_execve /usr/bin/tbl
	05:11:51 4408 man sys_execve /usr/local/bin/preconv
	05:11:51 4408 man sys_execve /usr/bin/preconv
	05:11:51 4415 nroff sys_execve /usr/bin/locale
	05:11:51 4416 nroff sys_execve /usr/bin/groff
	05:11:51 4418 groff sys_execve /usr/bin/grotty
	05:11:51 4417 groff sys_execve /usr/bin/troff
	^C

	The ::sys_execve syntax specifies that you want an entry probe (which is the
	default), in a kernel function (which is the default) called sys_execve. Next,
	the format string to print is simply "%s", which prints a string. Finally, the
	value to print is the first argument to the sys_execve function, which happens
	to be the command that is exec'd. The above trace was generated by executing
	"man ls" in a separate shell. As you see, man executes a number of additional
	programs to finally display the man page.

	Next, suppose you are looking for large reads across the system. Let's trace
	the read system call and inspect the third argument, which is the number of
	bytes to be read:

	# trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
	TIME PID COMM FUNC -
	05:18:23 4490 dd sys_read read 1048576 bytes
	05:18:23 4490 dd sys_read read 1048576 bytes
	05:18:23 4490 dd sys_read read 1048576 bytes
	05:18:23 4490 dd sys_read read 1048576 bytes
	^C

	During the trace, I executed "dd if=/dev/zero of=/dev/null bs=1M count=4".
	The individual reads are visible, with the custom format message printed for
	each read. The parenthesized expression "(arg3 > 20000)" is a filter that is
	evaluated for each invocation of the probe before printing anything.

	You can also trace user functions. For example, let's simulate the bashreadline
	script, which attaches to the readline function in bash and prints its return
	value, effectively snooping all bash shell input across the system:

	# trace 'r:bash:readline "%s", retval'
	TIME PID COMM FUNC -
	05:24:50 2740 bash readline echo hi!
	05:24:53 2740 bash readline man ls
	^C

	The special retval keywords stands for the function's return value, and can
	be used only in a retprobe, specified by the 'r' prefix. The next component
	of the probe is the library that contains the desired function. It's OK to
	specify executables too, as long as they can be found in the PATH. Or, you
	can specify the full path to the executable (e.g. "/usr/bin/bash").

	Multiple probes can be combined on the same command line. For example, let's
	trace failed read and write calls on the libc level:

	# trace 'r:c:read ((int)retval < 0) "read failed: %d", retval' \
	'r:c:write ((int)retval < 0) "write failed: %d", retval'
	TIME PID COMM FUNC -
	05:31:57 3388 bash write write failed: -1
	05:32:00 3388 bash write write failed: -1
	^C

	Note that the retval variable must be cast to int before comparing to zero.
	The reason is that the default type for argN and retval is an unsigned 64-bit
	integer, which can never be smaller than 0.

	trace has also some basic support for kernel tracepoints. For example, let's
	trace the block:block_rq_complete tracepoint and print out the number of sectors
	transferred:

	# trace 't:block:block_rq_complete "sectors=%d", tp.nr_sector'
	TIME PID COMM FUNC -
	01:23:51 0 swapper/0 block_rq_complete sectors=8
	01:23:55 10017 kworker/u64: block_rq_complete sectors=1
	01:23:55 0 swapper/0 block_rq_complete sectors=8
	^C

	To discover the tracepoint structure format (which you can refer to as the "tp"
	variable), use the tplist tool. For example:

	# tplist -v block:block_rq_complete
	block:block_rq_complete
	dev_t dev;
	sector_t sector;
	unsigned int nr_sector;
	int errors;
	char rwbs[8];

	This output tells you that you can use "tp.dev", "tp.sector", etc. in your
	predicate and trace arguments.

	As a final example, let's trace open syscalls for a specific process. By
	default, tracing is system-wide, but the -p switch overrides this:

	# trace -p 2740 'do_sys_open "%s", arg2'
	TIME PID COMM FUNC -
	05:36:16 15872 ls do_sys_open /etc/ld.so.cache
	05:36:16 15872 ls do_sys_open /lib64/libselinux.so.1
	05:36:16 15872 ls do_sys_open /lib64/libcap.so.2
	05:36:16 15872 ls do_sys_open /lib64/libacl.so.1
	05:36:16 15872 ls do_sys_open /lib64/libc.so.6
	05:36:16 15872 ls do_sys_open /lib64/libpcre.so.1
	05:36:16 15872 ls do_sys_open /lib64/libdl.so.2
	05:36:16 15872 ls do_sys_open /lib64/libattr.so.1
	05:36:16 15872 ls do_sys_open /lib64/libpthread.so.0
	05:36:16 15872 ls do_sys_open /usr/lib/locale/locale-archive
	05:36:16 15872 ls do_sys_open /home/vagrant
	^C

	In this example, we traced the "ls ~" command as it was opening its shared
	libraries and then accessing the /home/vagrant directory listing.


	USAGE message:

	# trace -h
	usage: trace.py [-h] [-p PID] [-v] [-Z STRING_SIZE] [-S] [-M MAX_EVENTS] [-o]
	probe [probe ...]

	Attach to functions and print trace messages.

	positional arguments:
	probe probe specifier (see examples)

	optional arguments:
	-h, --help show this help message and exit
	-p PID, --pid PID id of the process to trace (optional)
	-v, --verbose print resulting BPF program code before executing
	-Z STRING_SIZE, --string-size STRING_SIZE
	maximum size to read from strings
	-S, --include-self do not filter trace's own pid from the trace
	-M MAX_EVENTS, --max-events MAX_EVENTS
	number of events to print before quitting
	-o, --offset use relative time from first traced message

	EXAMPLES:

	trace do_sys_open
	Trace the open syscall and print a default trace message when entered
	trace 'do_sys_open "%s", arg2'
	Trace the open syscall and print the filename being opened
	trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
	Trace the read syscall and print a message for reads >20000 bytes
	trace r::do_sys_return
	Trace the return from the open syscall
	trace 'c:open (arg2 == 42) "%s %d", arg1, arg2'
	Trace the open() call from libc only if the flags (arg2) argument is 42
	trace 'c:malloc "size = %d", arg1'
	Trace malloc calls and print the size being allocated
	trace 'p:c:write (arg1 == 1) "writing %d bytes to STDOUT", arg3'
	Trace the write() call from libc to monitor writes to STDOUT
	trace 'r::__kmalloc (retval == 0) "kmalloc failed!"
	Trace returns from __kmalloc which returned a null pointer
	trace 'r:c:malloc (retval) "allocated = %p", retval
	Trace returns from malloc and print non-NULL allocated buffers
	trace 't:block:block_rq_complete "sectors=%d", tp.nr_sector'
	Trace the block_rq_complete kernel tracepoint and print # of tx sectors
	trace 'u:pthread:pthread_create (arg4 != 0)'
	Trace the USDT probe pthread_create when its 4th argument is non-zero