trace_processor: improve query performance involving filter operations

This makes three changes that improve the performance of queries
which the UI issues very frequently:

1) Change from always writing into the bitvector when filtering in
all rows mode to only writing when the predicate returns true.

This is important because indexing into the bitvector costs more than
a branch. This is especially true because the first constraint to
filter is likely to shrink the dataset a lot (think of a constraint
on cpu or on utid).
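
A minimal sketch of the difference, not the actual trace_processor code:
std::vector<bool> stands in for the bitvector, the function names are
hypothetical, and the bitvector is assumed to start with all bits cleared.

```cpp
#include <cstddef>
#include <vector>

// "Before" (hypothetical): stores into the bitvector for every row,
// whether or not the row matches.
template <typename Predicate>
void FilterAlwaysWrite(std::vector<bool>* bv, Predicate pred) {
  for (size_t i = 0; i < bv->size(); ++i)
    (*bv)[i] = pred(i);
}

// "After" (hypothetical): branches on the predicate and only indexes
// into the bitvector for matching rows. Assumes every bit starts cleared.
template <typename Predicate>
void FilterWriteOnMatch(std::vector<bool>* bv, Predicate pred) {
  for (size_t i = 0; i < bv->size(); ++i) {
    if (pred(i))
      (*bv)[i] = true;
  }
}
```

Both variants produce the same bits on a cleared bitvector; the second
skips the store for every row a selective constraint rejects.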

2) Change predicate from using std::function to using a Functor.

The indirect dispatch of std::function and the possibility of memory
allocation (though unlikely) were causing large slowdowns. As this predicate
is called in hot loops, we want this code to be inlined as much as possible;
this is less than the cost of the switch.
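
As a rough illustration (the names EqualsValue, CountMatchesErased and
CountMatches are hypothetical, not taken from the change), the two styles
of passing a predicate look like this:

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical functor: matches rows whose column value equals `value`.
struct EqualsValue {
  const std::vector<int64_t>* column;
  int64_t value;
  bool operator()(size_t row) const { return (*column)[row] == value; }
};

// Type-erased predicate: each call in the loop goes through
// std::function's indirect call, which the compiler usually cannot inline.
inline size_t CountMatchesErased(size_t rows,
                                 const std::function<bool(size_t)>& pred) {
  size_t count = 0;
  for (size_t i = 0; i < rows; ++i)
    count += pred(i) ? 1 : 0;
  return count;
}

// Templated predicate: the concrete functor type is known at compile
// time, so operator() can be inlined into the loop body.
template <typename Predicate>
size_t CountMatches(size_t rows, Predicate pred) {
  size_t count = 0;
  for (size_t i = 0; i < rows; ++i)
    count += pred(i) ? 1 : 0;
  return count;
}
```

The results are identical; only the call mechanism inside the hot loop
differs.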

3) Force the row predicate lambdas to be always inlined.

This is really important as not inlining doubles the cost of these
functions and generally lambdas are expected to be inlined. Because the
filter switch has more than 6 branches (the magic number under which
inlining seems to happen on Clang), this function was not getting inlined.
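
A sketch of what forcing inlining on a lambda can look like with the
GCC/Clang always_inline attribute. The macro SKETCH_ALWAYS_INLINE, the Op
enum and both functions are hypothetical and only illustrate the pattern:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical macro: expands to the GNU attribute where supported.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE
#endif

enum class Op { kEq, kNe, kLt, kLe, kGt, kGe };

// Hot loop taking the predicate as a template parameter.
template <typename Predicate>
size_t CountRows(size_t rows, Predicate pred) {
  size_t count = 0;
  for (size_t i = 0; i < rows; ++i)
    count += pred(i) ? 1 : 0;
  return count;
}

// One lambda per switch case; the attribute sits after the parameter
// list and asks the compiler to inline the lambda's call operator.
inline size_t FilterColumn(const std::vector<int64_t>& col, Op op, int64_t v) {
  switch (op) {
    case Op::kEq:
      return CountRows(col.size(), [&](size_t i) SKETCH_ALWAYS_INLINE { return col[i] == v; });
    case Op::kNe:
      return CountRows(col.size(), [&](size_t i) SKETCH_ALWAYS_INLINE { return col[i] != v; });
    case Op::kLt:
      return CountRows(col.size(), [&](size_t i) SKETCH_ALWAYS_INLINE { return col[i] < v; });
    case Op::kLe:
      return CountRows(col.size(), [&](size_t i) SKETCH_ALWAYS_INLINE { return col[i] <= v; });
    case Op::kGt:
      return CountRows(col.size(), [&](size_t i) SKETCH_ALWAYS_INLINE { return col[i] > v; });
    case Op::kGe:
      return CountRows(col.size(), [&](size_t i) SKETCH_ALWAYS_INLINE { return col[i] >= v; });
  }
  return 0;
}
```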

The net result of these changes is the following perf numbers
on the trace from b/124495829:
Query 1:
select
  ts,
  lead(ts) over (partition by ref_type order by ts) as ts_end,
  value
from counters
where name = 'SwapCached' and ref = 0

Old code: 107.088 ms
New code: 57.127 ms (1.87x speedup)

Query 2:
create view mem_rss as
select *, lead(ts) over (order by ts) - ts as dur
from counters
where name = 'mem.rss.file' and ref = 10 and ref_type = 'upid';
create virtual table span_49 using span_join(mem_rss, window);
select ts, dur, value from span_49;

Old code: 114.136 ms
New code: 67.882 ms (1.68x speedup)

Change-Id: Ic81b3f711a2f9b28c9fb35c0cf0c624bae98da92
5 files changed
tree: 359d119bbb1ddd8eae093eec9be0a1f1dffb3e58
README.md

Perfetto - Performance instrumentation and tracing

Perfetto is an open-source project for performance instrumentation and tracing of Linux/Android/Chrome platforms and user-space apps.

See www.perfetto.dev for docs.

Bugs

  • For bugs affecting Android or the tracing internals use the internal bug tracker (go/perfetto-bugs).
  • For bugs affecting Chrome use http://crbug.com, Component:Speed>Tracing label:Perfetto.