Using traces allows computation of reproducible metrics in a wide range of situations; examples include benchmarks, lab tests and on large corpuses of traces. In these cases, these metrics allow for direct root-causing when a regression is detected.
The metrics platform (powered by the trace processor) allows metrics authors to write SQL queries to generate metrics in the form of protobuf messages or proto text.
We strongly encourage all metrics derived on Perfetto traces to be added to the Perfetto repo unless there is a clear usecase (e.g. confidentiality) why these metrics should not be publicly available.
In return for upstreaming metrics, authors will have first class support for running metrics locally and the confidence that their metrics will remain stable as trace processor is developed.
For example, generating the full (human readable) set of Android memory metrics on a trace is as simple as:
trace_processor_shell --run-metrics android_mem <trace>
As well as scaling upwards while developing from running on a single trace locally to running on a large set of traces, the reverse is also very useful. When an anomaly is observed in the metrics of a lab benchmark, you can simply download a representative trace and run the same metric locally in shell.
Since the same code is running locally and remotely, you can be confident in reproducing the issue and use the power of trace processor and/or the Perfetto UI to identify the problem!
To begin, all you need is some familiarity with SQL and you're ready to start!
Suppose that want a write a metric which computes the CPU time for every process in the trace and lists the names of the top 5 processes (by CPU time) and the number of threads which were associated with those processes over its lifetime.
Note:
As a setup step, you'll want to create a folder to act as a scratch workspace; this folder will be referred to using the env variable $WORKSPACE
in Step 4.
The other thing you'll need is trace processor shell. You can download this here or you can build from source using the instructions here. Whichever method is chosen, $TRACE_PROCESSOR env variable will be used to refer to the location of the binary in Step 4.
As all metrics in the metrics platform are defined using protos, the metric needs to be strctured as a proto. For this metric, there needs to be some notion of a process name along with its CPU time and number of threads.
Starting off, in a file named top_five_processes.proto
in our workspace, let's create a basic proto message called ProcessInfo with those three fields:
message ProcessInfo { optional string process_name = 1; optional uint64 cpu_time_ms = 2; optional uint32 num_threads = 3; }
Next up is a wrapping message which will hold the repeated field containing the top 5 processes.
message TopProcesses { repeated ProcessInfo process_info = 1; }
Finally, let's define an extension to the root proto for all metrics - the TraceMetrics proto).
extend TraceMetrics { optional TopProcesses top_processes = 450; }
Adding this extension field allows trace processor to link the newly defined metric to the TraceMetrics
proto.
Notes:
Putting everything together, along with some boilerplate header information gives:
syntax = "proto2"; package perfetto.protos; import "protos/perfetto/metrics/metrics.proto"; message ProcessInfo { optional string process_name = 1; optional int64 cpu_time_ms = 2; optional uint32 num_threads = 3; } message TopProcesses { repeated ProcessInfo process_info = 1; } extend TraceMetrics { optional TopProcesses top_processes = 450; }
Let's write the SQL to generate the table of the top 5 processes ordered by the sum of the CPU time they ran for and the number of threads which were associated with the process. The following SQL should be to a file called top_five_processes.sql
in your workspace:
CREATE VIEW top_five_processes_by_cpu SELECT process.name as process_name, CAST(SUM(sched.dur) / 1e6 as INT64) as cpu_time_ms, COUNT(DISTINCT utid) as num_threads FROM sched INNER JOIN thread USING(utid) INNER JOIN process USING(upid) GROUP BY process.name ORDER BY cpu_time_ms DESC LIMIT 5;
Let's break this query down:
sched
table. This contains all the scheduling data available in the trace. Each scheduling "slice" is associated with a thread which is uniquely identified in Perfetto traces using its utid
. The two pieces of information which needed from the sched table is the dur
- short for duration, this is the amount of time the slice lasted - and the utid
which will be use to join with the thread table.upid
. Similar to utid
, upid
is the unique identifier for a process in a Perfetto trace. In this case, upid
will refer to the process which hosts the thread given by utid
.Now that the result of the metric has been expressed as an SQL table, it needs to be converted a proto. The metrics platform has built-in support for emitting protos using SQL functions; something which is used extensively in this step.
Let's look at how it works for our table above.
CREATE VIEW top_processes_output AS SELECT TopProcesses( 'process_info', ( SELECT RepeatedField( ProcessInfo( 'process_name', process_name, 'cpu_time_ms', cpu_time_ms, 'num_threads', num_threads ) ) FROM top_five_processes_by_cpu ) );
Let's break this down again:
Starting from the inner-most SELECT statement, there is what looks like a function call to the ProcessInfo function; in face this is no conincidence. For each proto that the metrics platform knows about, it generates a SQL function with the same name as the proto. This function takes key value pairs with the key as the name of the proto field to fill and the value being the data to store in the field. The output is the proto created by writing the fields described in the function! (*)
In this case, this function is called once for each row in the top_five_processes_by_cpu
table. The output of will be the fully filled ProcessInfo proto.
The call to the RepeatedField
function is the most interesting part and also the most important. In technical terms, RepeatedField
is an aggregate function; practically, this means that it takes a full table of values and generates a single array which contains all the values passed to it.
Therefore, the output of this whole SELECT statement is an array of 5 ProcessInfo protos.
Next is creation of the TopProcesses
proto. By now, the syntax should already feel somewhat familiar; the proto builder function is called to fill in the process_info
field with the array of protos from the inner funciton.
The output of this SELECT is a single TopProcesses
proto containing the ProcessInfos as a repeated field.
Finally, the view is created. This view is specially named to allow the metrics platform to query it to obtain the root proto for each metric (in this case TopProcesses
). See the note below as to the pattern behind this view's name.
(*) - side note: this is not strictly true. To type-check the protos, we also return some metadata about the type of the proto but this is unimportant for metric authors
Note:
And that's all the SQL we need to write! Our final file should look like so:
CREATE VIEW top_five_processes_by_cpu AS SELECT process.name as process_name, CAST(SUM(sched.dur) / 1e6 as INT64) as cpu_time_ms, COUNT(DISTINCT utid) as num_threads FROM sched INNER JOIN thread USING(utid) INNER JOIN process USING(upid) GROUP BY process.name ORDER BY cpu_time_ms DESC LIMIT 5; CREATE top_processes_output AS SELECT TopProcesses( 'process_info', ( SELECT RepeatedField( ProcessInfo( 'process_name', process_name, 'cpu_time_ms', cpu_time_ms, 'num_threads', num_threads ) ) FROM top_five_processes_by_cpu ) );
Notes:
This is the last step and where we get to see the results of our work!
For this step, all we need is a one-liner, invoking trace processor shell (see Step 0 for downloading it):
$TRACE_PROCESSOR --run-metrics $WORKSPACE/top_five_processes.sql $TRACE 2> /dev/null
(If you want a example trace to test this on, see the Notes section below.)
By passing the SQL file for the metric we want to compute, trace processor uses the name of this file to both find the proto and also to figure out the name of the output table for the proto and the name of the extension field for TraceMetrics
; this is why it was important to choose the names of these other objects carefully.
Notes:
If everything went successfully, you should see something like the following (this is specifically the output for the Android example trace linked above):
[perfetto.protos.top_five_processes] { process_info { process_name: "com.google.android.GoogleCamera" cpu_time_ms: 15154 num_threads: 125 } process_info { process_name: "sugov:4" cpu_time_ms: 6846 num_threads: 1 } process_info { process_name: "system_server" cpu_time_ms: 6809 num_threads: 66 } process_info { process_name: "cds_ol_rx_threa" cpu_time_ms: 6684 num_threads: 1 } process_info { process_name: "com.android.chrome" cpu_time_ms: 5125 num_threads: 49 } }
That finishes the introductory guide to writing an metric using the Perfetto metrics platform! For more information about where to go next, the following links may be useful:
Coming soon!
Coming soon!
Coming soon!
Coming soon!