perf report: Show branch type statistics for stdio mode
Show the branch type statistics at the end of perf report --stdio.
For example:
perf report --stdio
COND_FWD: 28.5%
COND_BWD: 9.4%
CROSS_4K: 0.7%
CROSS_2M: 14.1%
COND: 37.9%
UNCOND: 0.2%
IND: 6.7%
CALL: 26.5%
RET: 28.7%
SYSRET: 0.0%
The branch types are:
COND_FWD: conditional forward
COND_BWD: conditional backward
COND: conditional branch
UNCOND: unconditional branch
IND: indirect
CALL: function call
IND_CALL: indirect function call
RET: function return
SYSCALL: syscall
SYSRET: syscall return
COND_CALL: conditional function call
COND_RET: conditional function return
CROSS_4K and CROSS_2M:
These metrics count branches that cross a 4K or 2MB page boundary.
The computation is approximate. We don't know whether the area is
backed by 4K or 2MB pages, so both checks are always made.
To keep the output simple, if a branch crosses a 2M area, CROSS_4K
is not incremented.
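The counting rule described above can be sketched roughly as follows.
This is an illustrative sketch only, not the code in branch.c: the
struct cross_stat_sketch type and the crosses_area() and
count_branch_sketch() helpers are hypothetical names, and treating
"to > from" as a forward branch is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AREA_4K 4096ULL
#define AREA_2M (2ULL * 1024 * 1024)

/* Hypothetical counters for this sketch; not struct branch_type_stat. */
struct cross_stat_sketch {
	uint64_t cond_fwd;
	uint64_t cond_bwd;
	uint64_t cross_4k;
	uint64_t cross_2m;
};

/* True if 'from' and 'to' lie in different size-aligned areas. */
static bool crosses_area(uint64_t from, uint64_t to, uint64_t size)
{
	/* The mask trick below only works for power-of-two sizes. */
	assert(size && (size & (size - 1)) == 0);

	return (from & ~(size - 1)) != (to & ~(size - 1));
}

/* Update the counters for one branch record (hypothetical helper). */
static void count_branch_sketch(struct cross_stat_sketch *st,
				uint64_t from, uint64_t to, bool is_cond)
{
	if (is_cond) {
		/* Assumption: a larger target address means "forward". */
		if (to > from)
			st->cond_fwd++;
		else
			st->cond_bwd++;
	}

	/* A 2M crossing takes precedence, so CROSS_4K is not bumped too. */
	if (crosses_area(from, to, AREA_2M))
		st->cross_2m++;
	else if (crosses_area(from, to, AREA_4K))
		st->cross_4k++;
}
```

Because every 2M boundary is also a 4K boundary, checking 2M first and
using "else if" for 4K is what keeps the two counters disjoint.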
Change log
v7: Since the common branch type definitions have changed, some
tags/strings are updated accordingly.
v6: Remove branch_type_stat_display() since it's moved to branch.c.
v5: Remove the unnecessary sort__mode check in
hist_iter__branch_callback().
v4: Compared to the previous version, the major change is:
Compute JCC forward/JCC backward and the cross-page checks
by using the from and to addresses.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500379995-6449-7-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8e752ba..cea25d0 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -38,6 +38,7 @@
#include "util/time-utils.h"
#include "util/auxtrace.h"
#include "util/units.h"
+#include "util/branch.h"
#include <dlfcn.h>
#include <errno.h>
@@ -73,6 +74,7 @@
u64 queue_size;
int socket_filter;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
+ struct branch_type_stat brtype_stat;
};
static int report__config(const char *var, const char *value, void *cb)
@@ -150,6 +152,22 @@
return err;
}
+static int hist_iter__branch_callback(struct hist_entry_iter *iter,
+ struct addr_location *al __maybe_unused,
+ bool single __maybe_unused,
+ void *arg)
+{
+ struct hist_entry *he = iter->he;
+ struct report *rep = arg;
+ struct branch_info *bi;
+
+ bi = he->branch_info;
+ branch_type_count(&rep->brtype_stat, &bi->flags,
+ bi->from.addr, bi->to.addr);
+
+ return 0;
+}
+
static int process_sample_event(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
@@ -188,6 +206,8 @@
*/
if (!sample->branch_stack)
goto out_put;
+
+ iter.add_entry_cb = hist_iter__branch_callback;
iter.ops = &hist_iter_branch;
} else if (rep->mem_mode) {
iter.ops = &hist_iter_mem;
@@ -410,6 +430,9 @@
perf_read_values_destroy(&rep->show_threads_values);
}
+ if (sort__mode == SORT_MODE__BRANCH)
+ branch_type_stat_display(stdout, &rep->brtype_stat);
+
return 0;
}
@@ -944,6 +967,8 @@
if (has_br_stack && branch_call_mode)
symbol_conf.show_branchflag_count = true;
+ memset(&report.brtype_stat, 0, sizeof(struct branch_type_stat));
+
/*
* Branch mode is a tristate:
* -1 means default, so decide based on the file having branch data.