@brief This HOWTO explains how to use a perf build that integrates the openCSD library.
This HOWTO explains how to use the perf command line tools and the openCSD library to collect and extract program flow traces generated by the CoreSight IP blocks on a Linux system. The examples were generated on an aarch64 Juno-r1 platform. All information is considered accurate and was tested using library branch opencsd-bkk16 (decode library only) and perf update branch perf-opencsd-4.5-rc6-bkk16 (decode library + perf tools) on the OpenCSD GitHub repository.
The enhancements to the perf tools that support the new cs_etm PMU have not been upstreamed yet. To get the required functionality, branch perf-opencsd-4.5-rc6-bkk16 needs to be downloaded to the target system where traces are to be collected. This branch is an upstream v4.5-rc6 kernel supplemented with modifications that make the CoreSight framework and drivers usable by the perf core. Some of those patches have been queued for merging in the 4.6 cycle, others have been submitted, and some have yet to be posted for review. The upstreaming is being done incrementally.
From there, compiling the perf tools with make -C tools/perf will yield a perf executable that supports CoreSight trace collection. Note that if traces are to be decoded off target, there is no need to download and compile the openCSD library on the target.
Before launching a trace run, a sink that will collect trace data needs to be identified. All CoreSight blocks identified by the framework are registered in sysfs:
linaro@linaro-nano:~$ ls /sys/bus/coresight/devices/
20010000.etf   20040000.main_funnel  22040000.etm         22140000.etm  230c0000.A53_funnel  23240000.etm  replicator@20020000
20030000.tpiu  20070000.etr          220c0000.A57_funnel  23040000.etm  23140000.etm         23340000.etm
CoreSight blocks are listed in the device tree for a specific system and discovered at boot time. Since tracers can be linked to more than one sink, the sink that will receive trace data needs to be identified manually. In this example the ETR block is selected:
root@linaro-nano:~# echo 1 > /sys/bus/coresight/devices/20070000.etr/enable_sink
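The sink selection above can be wrapped in two small helpers. This is only a sketch: the function names list_sinks and select_sink and the CS_DEV_DIR override are hypothetical and not part of perf or the CoreSight framework. It assumes that every sink exposes an enable_sink file, as the ETR does above, and that leaving exactly one sink enabled is what is wanted.

```shell
#!/bin/sh
# Hypothetical helpers, not part of perf or the kernel.
# CS_DEV_DIR (or an explicit directory argument) lets the functions be
# tried on a machine without CoreSight hardware.

# Print every device that has an enable_sink file, with its current value.
list_sinks() {
    dir="${1:-${CS_DEV_DIR:-/sys/bus/coresight/devices}}"
    for f in "$dir"/*/enable_sink; do
        [ -e "$f" ] || continue      # glob did not match: no sinks found
        dev=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}

# select_sink NAME [DIR]: disable every sink, then enable NAME.
# On a real target this has to run as root.
select_sink() {
    dir="${2:-${CS_DEV_DIR:-/sys/bus/coresight/devices}}"
    if [ ! -e "$dir/$1/enable_sink" ]; then
        echo "no such sink: $1" >&2
        return 1
    fi
    for f in "$dir"/*/enable_sink; do
        echo 0 > "$f"
    done
    echo 1 > "$dir/$1/enable_sink"
}
```

On the Juno board used here, select_sink 20070000.etr followed by list_sinks would show which block is going to receive the trace data.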
Once a sink has been identified, trace collection can start. An easy and yet interesting example is the uname command:
linaro@linaro-nano:~/kernel$ ./tools/perf/perf record -e cs_etm// --per-thread uname
This will generate a perf.data file where execution has been traced for both user and kernel space. To narrow the field to either user or kernel space, the u and k options can be specified. For example, the following will limit traces to user space:
linaro@linaro-nano:~/kernel$ ./tools/perf/perf record -vvv -e cs_etm//u --per-thread uname
Problems setting modules path maps, continuing anyway...
-----------------------------------------------------------
perf_event_attr:
  type                             8
  size                             112
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|IDENTIFIER
  read_format                      ID
  disabled                         1
  exclude_kernel                   1
  exclude_hv                       1
  enable_on_exec                   1
  sample_id_all                    1
------------------------------------------------------------
sys_perf_event_open: pid 11375 cpu -1 group_fd -1 flags 0x8
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             112
  config                           0x9
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|IDENTIFIER
  read_format                      ID
  disabled                         1
  exclude_kernel                   1
  exclude_hv                       1
  mmap                             1
  comm                             1
  enable_on_exec                   1
  task                             1
  sample_id_all                    1
  mmap2                            1
  comm_exec                        1
------------------------------------------------------------
sys_perf_event_open: pid 11375 cpu -1 group_fd -1 flags 0x8
mmap size 266240B
AUX area mmap length 131072
perf event ring buffer mmapped per thread
Synthesizing auxtrace information
Linux
auxtrace idx 0 old 0 head 0x11ea0 diff 0x11ea0
[ perf record: Woken up 1 times to write data ]
overlapping maps:
 7f99daf000-7f99db0000 0 [vdso]
 7f99d84000-7f99db3000 0 /lib/aarch64-linux-gnu/ld-2.21.so
 7f99d84000-7f99daf000 0 /lib/aarch64-linux-gnu/ld-2.21.so
 7f99db0000-7f99db3000 0 /lib/aarch64-linux-gnu/ld-2.21.so
failed to write feature 8
failed to write feature 9
failed to write feature 14
[ perf record: Captured and wrote 0.072 MB perf.data ]
linaro@linaro-nano:~/kernel$ ls -l ~/.debug/ perf.data
-rw------- 1 linaro linaro 77888 Mar  2 20:41 perf.data

/home/linaro/.debug/:
total 16
drwxr-xr-x 2 linaro linaro 4096 Mar  2 20:40 [kernel.kallsyms]
drwxr-xr-x 2 linaro linaro 4096 Mar  2 20:40 [vdso]
drwxr-xr-x 3 linaro linaro 4096 Mar  2 20:40 bin
drwxr-xr-x 3 linaro linaro 4096 Mar  2 20:40 lib
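When several runs with different scopes are scripted, the event string can be built from a readable keyword. The helper below is a hypothetical convenience, not a perf feature; it only produces the three event strings used in this HOWTO (cs_etm//, cs_etm//u and cs_etm//k).

```shell
#!/bin/sh
# Hypothetical helper: map a scope keyword to the cs_etm event string.
cs_event() {
    case "${1:-both}" in
        user)   echo "cs_etm//u" ;;   # user space only
        kernel) echo "cs_etm//k" ;;   # kernel space only
        both)   echo "cs_etm//"  ;;   # no modifier: user and kernel
        *)      echo "unknown scope: $1" >&2; return 1 ;;
    esac
}

# Usage on the target (not executed here):
#   ./tools/perf/perf record -e "$(cs_event user)" --per-thread uname
```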
The entire program flow will have been recorded in the perf.data file. Information about libraries and executables is stored under $HOME/.debug. All this information needs to be collected in order to successfully decode traces off target:
linaro@linaro-nano:~/kernel$ tar czf uname.trace.tgz perf.data ~/.debug
Note that the vmlinux file should also be added to the bundle if kernel traces have been collected.
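The bundling step, including the optional vmlinux, could be scripted along these lines. This is a sketch under stated assumptions: the function name make_bundle is hypothetical, it expects perf.data in the current directory as in the example above, and the vmlinux path is whatever the caller passes in.

```shell
#!/bin/sh
# Hypothetical helper: make_bundle OUTPUT.tgz [VMLINUX]
# Packs perf.data, $HOME/.debug and, when given, a vmlinux file.
make_bundle() {
    out="$1"
    vmlinux="${2:-}"
    if [ ! -f perf.data ]; then
        echo "perf.data not found in $(pwd)" >&2
        return 1
    fi
    if [ ! -d "$HOME/.debug" ]; then
        echo "$HOME/.debug not found" >&2
        return 1
    fi
    if [ -n "$vmlinux" ]; then
        # Kernel space was traced: the decoder also needs vmlinux.
        tar czf "$out" perf.data "$HOME/.debug" "$vmlinux"
    else
        tar czf "$out" perf.data "$HOME/.debug"
    fi
}

# Usage on the target (not executed here):
#   make_bundle uname.trace.tgz            # user space trace
#   make_bundle uname.trace.tgz vmlinux    # kernel space traced as well
```

As with the plain tar command above, tar drops the leading "/" from the ~/.debug member names and prints a notice about it.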
As of this writing the openCSD library is not part of the perf tools source. It is available on GitHub and needs to be compiled before perf:
linaro@t430:~/linaro/coresight/bkk16/$ git clone -b opencsd-bkk16 https://github.com/Linaro/OpenCSD.git opencsd-bkk16
Cloning into 'OpenCSD'...
remote: Counting objects: 2063, done.
remote: Total 2063 (delta 0), reused 0 (delta 0), pack-reused 2063
Receiving objects: 100% (2063/2063), 2.51 MiB | 1.24 MiB/s, done.
Resolving deltas: 100% (1399/1399), done.
Checking connectivity... done.
linaro@t430:~/linaro/coresight/bkk16/$ ls opencsd-bkk16
decoder LICENSE README.md
Once the source code has been acquired, compilation of the openCSD library can take place. For Linux two options are available, LINUX and LINUX64, selected according to the architecture of the host doing the decoding (which has nothing to do with the target):
linaro@t430:~/linaro/coresight/bkk16/$ cd opencsd-bkk16/decoder/build/linux/
linaro@t430:~/linaro/coresight/bkk16/opencsd-bkk16/decoder/build/linux/$ ls
makefile rctdl_c_api_lib ref_trace_decode_lib
linaro@t430:~/linaro/coresight/bkk16/opencsd-bkk16/decoder/build/linux/$ make LINUX64=1 DEBUG=1
...
...
linaro@t430:~/linaro/coresight//bkk16/opencsd-bkk16/decoder/build/linux/$ ls ../../lib/linux64/dbg/
libcstraced.a libcstraced_c_api.a libcstraced_c_api.so libcstraced.so
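The choice between the two options can be derived from the host architecture. The sketch below is an illustration only: the helper name opencsd_make_flag is hypothetical, and the mapping (64-bit hosts get LINUX64=1, everything else gets LINUX=1) is an assumption inferred from the example above, not something taken from the openCSD makefile. In particular, the LINUX=1 spelling is assumed by analogy with LINUX64=1.

```shell
#!/bin/sh
# Hypothetical helper: print the make variable for a host architecture.
# With no argument, the architecture reported by `uname -m` is used.
opencsd_make_flag() {
    case "${1:-$(uname -m)}" in
        x86_64|amd64|aarch64|arm64)
            echo "LINUX64=1" ;;   # 64-bit host, as in the example above
        *)
            echo "LINUX=1" ;;     # assumption: any other host is 32-bit
    esac
}

# Usage on the host (not executed here):
#   make "$(opencsd_make_flag)" DEBUG=1
```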
As stated above, not all the pieces of the solution have been upstreamed. To get all the components, branch perf-opencsd-4.5-rc6-bkk16 needs to be obtained:
linaro@t430:~/linaro/coresight/bkk16/$ git clone -b perf-opencsd-4.5-rc6-bkk16 https://github.com/Linaro/OpenCSD.git perf-opencsd-4.5-rc6-bkk16
...
...
linaro@t430:~/linaro/coresight/bkk16/$ ls perf-opencsd-4.5-rc6-bkk16/
arch   certs    CREDITS  Documentation  firmware  include  ipc     Kconfig  lib          Makefile  net     REPORTING-BUGS  scripts   sound  usr
block  COPYING  crypto   drivers        fs        init     Kbuild  kernel   MAINTAINERS  mm        README  samples         security  tools  virt
At this point the openCSD library files (the .a and .so files built above) need to be copied into the cs-etm-decoder directory of the perf tools. After that a new perf tool binary can be compiled:
linaro@t430:~/linaro/coresight/bkk16/$ mkdir perf-opencsd-4.5-rc6-bkk16/tools/perf/util/cs-etm-decoder/lib
linaro@t430:~/linaro/coresight/bkk16/$ cp opencsd-bkk16/decoder/lib/linux64/dbg/* perf-opencsd-4.5-rc6-bkk16/tools/perf/util/cs-etm-decoder/lib/
linaro@t430:~/linaro/coresight/bkk16/$ cd perf-opencsd-4.5-rc6-bkk16
linaro@t430:~/linaro/coresight/bkk16/perf-opencsd-4.5-rc6-bkk16/$ export CSTRACE_PATH=~/linaro/coresight/bkk16/opencsd-bkk16/decoder
linaro@t430:~/linaro/coresight/bkk16/perf-opencsd-4.5-rc6-bkk16/$ make -C tools/perf ARCH=arm DEBUG=1 NO_LIBPERL=1
...
...
linaro@t430:~/linaro/coresight/bkk16/perf-opencsd-4.5-rc6-bkk16/$ ls -l tools/perf/perf
-rwxrwxr-x 1 linaro linaro 6276360 Mar  3 10:05 tools/perf/perf
Since the openCSD library is not part of the perf tools, an environment variable telling the build scripts where to find the library is needed. If the CSTRACE_PATH variable is not defined, the compilation will still succeed, but handling of CoreSight trace data won't be supported.
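Because a missing CSTRACE_PATH does not break the build, it is easy to end up with a perf binary that cannot decode CoreSight traces. A small guard run before make catches this early. The function name check_cstrace_path is hypothetical, and the check is deliberately minimal: it only verifies that the variable is set and points to an existing directory.

```shell
#!/bin/sh
# Hypothetical pre-build guard: fail when CSTRACE_PATH is unset or does
# not point to a directory, since the perf build would not complain.
check_cstrace_path() {
    if [ -z "${CSTRACE_PATH:-}" ]; then
        echo "CSTRACE_PATH is not set: perf would build without CoreSight decoding" >&2
        return 1
    fi
    if [ ! -d "$CSTRACE_PATH" ]; then
        echo "CSTRACE_PATH=$CSTRACE_PATH is not a directory" >&2
        return 1
    fi
}

# Usage on the host (not executed here):
#   check_cstrace_path && make -C tools/perf ARCH=arm DEBUG=1 NO_LIBPERL=1
```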
At the end of the compilation a new perf binary is available in tools/perf/.
Before working with custom traces it is suggested to use a trace bundle that is known to be working properly. A sample bundle has been made available here [2]. Trace bundles can be extracted anywhere and have no dependencies on where the perf tools and openCSD library have been compiled.
linaro@t430:~/linaro/coresight/bkk16/$ mkdir feb24
linaro@t430:~/linaro/coresight/bkk16/$ cd feb24
linaro@t430:~/linaro/coresight/bkk16/feb24/$ wget http://people.linaro.org/~mathieu.poirier/openCSD/uname.v4.user.feb24.tgz
linaro@t430:~/linaro/coresight/bkk16/feb24/$ md5sum uname.v4.user.feb24.tgz
f53f11d687ce72bdbe9de2e67e960ec6 uname.v4.user.feb24.tgz
linaro@t430:~/linaro/coresight/bkk16/feb24/$ tar xf uname.v4.user.feb24.tgz
linaro@t430:~/linaro/coresight/bkk16/feb24/$ ls -la
total 1312
drwxrwxr-x 3 linaro linaro    4096 Mar  3 10:26 .
drwxrwxr-x 5 linaro linaro    4096 Mar  3 10:13 ..
drwxr-xr-x 7 linaro linaro    4096 Feb 24 12:21 .debug
-rw------- 1 linaro linaro   78016 Feb 24 12:21 perf.data
-rw-rw-r-- 1 linaro linaro 1245881 Feb 24 12:25 uname.v4.user.feb24.tgz
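The checksum comparison can be automated so that a corrupted download is caught before extraction. The helper name verify_md5 is hypothetical; the sketch assumes the md5sum utility used in the transcript above is available on the host.

```shell
#!/bin/sh
# Hypothetical helper: verify_md5 FILE EXPECTED_HEX
# Succeeds only when the MD5 digest of FILE matches EXPECTED_HEX.
verify_md5() {
    file="$1"
    expected="$2"
    # md5sum prints "<digest>  <file>"; keep only the digest.
    actual=$(md5sum "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}

# Usage with the digest shown above (not executed here):
#   verify_md5 uname.v4.user.feb24.tgz f53f11d687ce72bdbe9de2e67e960ec6 \
#       && tar xf uname.v4.user.feb24.tgz
```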
Perf expects the files that accompany a trace capture (perf.data) to be located under ~/.debug [3]. This example removes the current ~/.debug directory to be sure everything is clean.
linaro@t430:~/linaro/coresight/bkk16/feb24/$ rm -rf ~/.debug
linaro@t430:~/linaro/coresight/bkk16/feb24/$ cp -dpR .debug ~/
linaro@t430:~/linaro/coresight/bkk16/feb24/$ export LD_LIBRARY_PATH=~/linaro/coresight/bkk16/perf-opencsd-4.5-rc6-bkk16/tools/perf/util/cs-etm-decoder/lib
linaro@t430:~/linaro/coresight/bkk16/feb24/$ ../perf-opencsd-4.5-rc6-bkk16/tools/perf/perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 0 of event 'cs_etm//u'
# Event count (approx.): 0
#
# Children      Self  Command  Shared Object  Symbol
# ........  ........  .......  .............  ......
#
# Samples: 0 of event 'dummy:u'
# Event count (approx.): 0
#
# Children      Self  Command  Shared Object  Symbol
# ........  ........  .......  .............  ......
#
# Samples: 115K of event 'instructions:u'
# Event count (approx.): 522009
#
# Children      Self  Command  Shared Object     Symbol
# ........  ........  .......  ................  ......................
#
     4.13%     4.13%  uname    libc-2.21.so      [.] 0x0000000000078758
     3.81%     3.81%  uname    libc-2.21.so      [.] 0x0000000000078e50
     2.06%     2.06%  uname    libc-2.21.so      [.] 0x00000000000fcaf4
     1.65%     1.65%  uname    libc-2.21.so      [.] 0x00000000000fcae4
     1.59%     1.59%  uname    ld-2.21.so        [.] 0x000000000000a7f4
     1.50%     1.50%  uname    libc-2.21.so      [.] 0x0000000000078e40
     1.43%     1.43%  uname    libc-2.21.so      [.] 0x00000000000fcac4
     1.31%     1.31%  uname    libc-2.21.so      [.] 0x000000000002f0c0
     1.26%     1.26%  uname    ld-2.21.so        [.] 0x0000000000016888
     1.24%     1.24%  uname    libc-2.21.so      [.] 0x0000000000078e7c
     1.24%     1.24%  uname    libc-2.21.so      [.] 0x00000000000fcab8
...
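Removing ~/.debug also discards the symbol cache of any earlier perf session on the host. A less destructive variant moves the existing directory aside before installing the one from the bundle. This is a sketch only: the function name install_debug_dir and the .debug.bak.PID backup naming are hypothetical choices, and cp -dpR is used as in the transcript above.

```shell
#!/bin/sh
# Hypothetical helper: install_debug_dir SRC
# Moves an existing ~/.debug aside, then copies SRC in its place.
install_debug_dir() {
    src="$1"
    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    if [ -e "$HOME/.debug" ]; then
        # Keep the previous cache; $$ (shell PID) makes the name unique enough.
        mv "$HOME/.debug" "$HOME/.debug.bak.$$" || return 1
    fi
    cp -dpR "$src" "$HOME/.debug"
}

# Usage from the directory where the bundle was extracted (not executed here):
#   install_debug_dir .debug
```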
Working with perf scripts needs more command line options but yields interesting results.
linaro@t430:~/linaro/coresight/bkk16/feb24/$ export EXEC_PATH=/home/linaro/linaro/coresight/bkk16/perf-opencsd-4.5-rc6-bkk16/tools/perf/
linaro@t430:~/linaro/coresight/bkk16/feb24/$ export SCRIPT_PATH=$EXEC_PATH/scripts/python/
linaro@t430:~/linaro/coresight/bkk16/feb24/$ export XTOOL_PATH=/your/aarch64/toolchain/path/bin/
linaro@t430:~/linaro/coresight/bkk16/feb24/$ ../perf-opencsd-4.5-rc6-bkk16/tools/perf/perf --exec-path=${EXEC_PATH} script --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump
7f89f24d80: 910003e0 mov x0, sp
7f89f24d84: 94000d53 bl 7f89f282d0 <free@plt+0x3790>
7f89f282d0: d11203ff sub sp, sp, #0x480
7f89f282d4: a9ba7bfd stp x29, x30, [sp,#-96]!
7f89f282d8: 910003fd mov x29, sp
7f89f282dc: a90363f7 stp x23, x24, [sp,#48]
7f89f282e0: 9101e3b7 add x23, x29, #0x78
7f89f282e4: a90573fb stp x27, x28, [sp,#80]
7f89f282e8: a90153f3 stp x19, x20, [sp,#16]
7f89f282ec: aa0003fb mov x27, x0
7f89f282f0: 910a82e1 add x1, x23, #0x2a0
7f89f282f4: a9025bf5 stp x21, x22, [sp,#32]
7f89f282f8: a9046bf9 stp x25, x26, [sp,#64]
7f89f282fc: 910102e0 add x0, x23, #0x40
7f89f28300: f800841f str xzr, [x0],#8
7f89f28304: eb01001f cmp x0, x1
7f89f28308: 54ffffc1 b.ne 7f89f28300 <free@plt+0x37c0>
7f89f28300: f800841f str xzr, [x0],#8
7f89f28304: eb01001f cmp x0, x1
7f89f28308: 54ffffc1 b.ne 7f89f28300 <free@plt+0x37c0>
7f89f28300: f800841f str xzr, [x0],#8
7f89f28304: eb01001f cmp x0, x1
7f89f28308: 54ffffc1 b.ne 7f89f28300 <free@plt+0x37c0>
We welcome help on this project. If you would like to add features or help improve the way things work, we want to hear from you.
Best regards,
The Linaro CoreSight Team
[2] http://people.linaro.org/~mathieu.poirier/openCSD/uname.v4.user.feb24.tgz
[3] Get in touch with us if you know a way to change this.