commit	e1b3c47620254a7ea7bb299e34564cc5f8724230
author	Lalit Maganti <lalitm@google.com>	Tue Mar 12 23:07:42 2019 +0000
committer	Lalit Maganti <lalitm@google.com>	Tue Mar 12 23:07:42 2019 +0000
tree	359d119bbb1ddd8eae093eec9be0a1f1dffb3e58
parent	48092a436b70f70b0ee1f3da28e65e0c8cb5dd36

trace_processor: improve query performance involving filter operations

This change makes 3 changes which improve performance on queries which are very frequently performed by the UI:

1) Change from always writing into the bitvector when filtering in all rows mode to only writing when the query returns true. This is important because indexing into the bitvector costs more than a branch. This is especially true when the first constraint to filter is likely to decrease the dataset a lot (think a constraint on cpu or on utid).

2) Change the predicate from using std::function to using a functor. The virtual dispatch of std::function and the possibility of memory allocation (though unlikely) were causing large slowdowns. As this predicate is called in hot loops, we want this code to be inlined as much as possible - this is less than the cost of the switch.

3) Force the row predicate lambdas to be always inlined. This is really important as not inlining doubles the cost of these functions, and generally lambdas are expected to be inlined. Because the filter switch has more than 6 branches (the magic number under which inlining seems to happen on Clang), this function was not getting inlined.

The net result of these changes is the following perf numbers on the trace from b/124495829:

Query 1:
  select ts, lead(ts) over (partition by ref_type order by ts) as ts_end, value from counters where name = 'SwapCached' and ref = 0

  Old code: 107.088 ms
  New code: 57.127 ms (1.87x speedup)

Query 2:
  create view mem_rss as select *, lead(ts) over (order by ts) - ts as dur from counters where name="mem.rss.file" and ref=10 and ref_type="upid"
  create virtual table span_49 USING span_join(mem_rss, window)
  select ts, dur, value from span_49

  Old code: 114.136 ms
  New code: 67.882 ms (1.68x speedup)

Change-Id: Ic81b3f711a2f9b28c9fb35c0cf0c624bae98da92
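The three changes described in the commit message can be illustrated with a minimal sketch. This is not the code from the change itself: `Column`, `FilterAllRows`, `FilterEquals`, `CountMatches` and the `SKETCH_ALWAYS_INLINE` macro are hypothetical names, and the sketch assumes a bitvector that starts all-false and has a bit set only when a row matches.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical macro: force inlining on GCC/Clang, no-op elsewhere.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE
#endif

// Hypothetical stand-in for one column of a table.
struct Column {
  std::vector<std::int64_t> values;
};

// Change 2: the predicate is a template parameter (a functor) instead of a
// std::function, so the compiler knows the concrete callable type and can
// inline the call inside the hot loop.
template <typename Predicate>
std::vector<bool> FilterAllRows(const Column& col, Predicate pred) {
  // Assumption: every bit starts as false ("row not selected").
  std::vector<bool> matches(col.values.size(), false);
  for (std::size_t i = 0; i < col.values.size(); ++i) {
    // Change 1: only touch the bitvector when the predicate is true, trading
    // an unconditional bitvector write for a branch.
    if (pred(col.values[i]))
      matches[i] = true;
  }
  return matches;
}

// Change 3: the row predicate lambda is marked always-inline so it is inlined
// even when the surrounding dispatch code is large.
inline std::vector<bool> FilterEquals(const Column& col, std::int64_t target) {
  auto pred = [target](std::int64_t v) SKETCH_ALWAYS_INLINE {
    return v == target;
  };
  return FilterAllRows(col, pred);
}

// Small helper that counts the rows selected by a filter.
inline std::size_t CountMatches(const std::vector<bool>& matches) {
  std::size_t count = 0;
  for (bool m : matches)
    count += m ? 1 : 0;
  return count;
}
```

Calling `FilterEquals(col, 10)` on a column holding `{0, 10, 0, 7, 10}` selects the second and fifth rows. Whether the branch really beats the unconditional write depends on how selective the constraint is, so any such change should be measured on representative traces, as the commit does.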

README.md

Perfetto - Performance instrumentation and tracing

Perfetto is an open-source project for performance instrumentation and tracing of Linux/Android/Chrome platforms and user-space apps.

See www.perfetto.dev for docs.

Bugs

For bugs affecting Android or the tracing internals, use the internal bug tracker (go/perfetto-bugs).
For bugs affecting Chrome, use http://crbug.com with Component:Speed>Tracing and label:Perfetto.