trace_processor: make filtering of non-null columns faster

This CL adds a fast-path when filtering non-null columns by skipping the
lookup of the bitvector inside SparseVector when we know that every entry
of the column is non-null.

As most performance sensitive columns will be non-null, it is important
that this is efficient.

Before:
----------------------------------------------------------------------------------------------
Benchmark                                                       Time           CPU Iterations
----------------------------------------------------------------------------------------------
BM_TableFilterRootNonNullEqMatchMany/1024                   13329 ns      13329 ns      52480
BM_TableFilterRootNonNullEqMatchMany/4096                   71338 ns      71338 ns       9873
BM_TableFilterRootNonNullEqMatchMany/32768                 456830 ns     456832 ns       1532
BM_TableFilterRootNonNullEqMatchMany/262144               3555187 ns    3555110 ns        194
BM_TableFilterRootNonNullEqMatchMany/2097152             27917977 ns   27918146 ns         25
BM_TableFilterChildNonNullEqMatchMany/1024                  18743 ns      18743 ns      37067
BM_TableFilterChildNonNullEqMatchMany/4096                  98913 ns      98914 ns       7084
BM_TableFilterChildNonNullEqMatchMany/32768                697157 ns     697161 ns       1031
BM_TableFilterChildNonNullEqMatchMany/262144              5231207 ns    5230762 ns        133
BM_TableFilterChildNonNullEqMatchMany/2097152            41994637 ns   41994799 ns         17
BM_TableFilterChildNonNullEqMatchManyInParent/1024          22546 ns      22546 ns      30982
BM_TableFilterChildNonNullEqMatchManyInParent/4096         111175 ns     111176 ns       6320
BM_TableFilterChildNonNullEqMatchManyInParent/32768        790285 ns     790285 ns        884
BM_TableFilterChildNonNullEqMatchManyInParent/262144      6088922 ns    6088676 ns        114
BM_TableFilterChildNonNullEqMatchManyInParent/2097152    55033399 ns   55032881 ns         10

After:
----------------------------------------------------------------------------------------------
Benchmark                                                       Time           CPU Iterations
----------------------------------------------------------------------------------------------
BM_TableFilterRootNonNullEqMatchMany/1024                    9661 ns       9661 ns      72434
BM_TableFilterRootNonNullEqMatchMany/4096                   55301 ns      55301 ns      12180
BM_TableFilterRootNonNullEqMatchMany/32768                 336006 ns     336009 ns       2098
BM_TableFilterRootNonNullEqMatchMany/262144               2502308 ns    2502299 ns        277
BM_TableFilterRootNonNullEqMatchMany/2097152             19694850 ns   19694393 ns         35
BM_TableFilterChildNonNullEqMatchMany/1024                  15019 ns      15019 ns      46872
BM_TableFilterChildNonNullEqMatchMany/4096                  83278 ns      83278 ns       8416
BM_TableFilterChildNonNullEqMatchMany/32768                549981 ns     549987 ns       1281
BM_TableFilterChildNonNullEqMatchMany/262144              4190223 ns    4190245 ns        166
BM_TableFilterChildNonNullEqMatchMany/2097152            33867350 ns   33867514 ns         20
BM_TableFilterChildNonNullEqMatchManyInParent/1024          19062 ns      19062 ns      36507
BM_TableFilterChildNonNullEqMatchManyInParent/4096          99900 ns      99898 ns       6967
BM_TableFilterChildNonNullEqMatchManyInParent/32768        684101 ns     684103 ns       1028
BM_TableFilterChildNonNullEqMatchManyInParent/262144      5280275 ns    5280263 ns        132
BM_TableFilterChildNonNullEqMatchManyInParent/2097152    42715263 ns   42715753 ns         16

Analysis:
Across all the non-null benchmarks, the new code is 20-30% faster with no regression on any
of the other codepaths. This is a good speed up given that we are reaching the limits of how
optimized filtering can be.

Context: go/perfetto-tp-refactor
Bug: 135177627
Change-Id: Ic49c582f406f60d6d24599db7b0eda5b0f1bc6eb
6 files changed
tree: 28c97222e304f35312801d2a8bb6e0ccb7c97573
  1. bazel/
  2. build_overrides/
  3. buildtools/
  4. debian/
  5. docs/
  6. gn/
  7. include/
  8. infra/
  9. protos/
  10. src/
  11. test/
  12. tools/
  13. ui/
  14. .clang-format
  15. .gitignore
  16. .gn
  17. .style.yapf
  18. Android.bp
  19. Android.bp.extras
  20. BUILD
  21. BUILD.extras
  22. BUILD.gn
  23. codereview.settings
  24. heapprofd.rc
  25. MODULE_LICENSE_APACHE2
  26. NOTICE
  27. OWNERS
  28. perfetto.rc
  29. PRESUBMIT.py
  30. README.chromium
  31. README.md
  32. TEST_MAPPING
  33. WORKSPACE
README.md

Perfetto - Performance instrumentation and tracing

Perfetto is an open-source project for performance instrumentation and tracing of Linux/Android/Chrome platforms and user-space apps.

See www.perfetto.dev for docs.

Bugs

  • For bugs affecting Android or the tracing internals use the internal bug tracker (go/perfetto-bugs).
  • For bugs affecting Chrome use http://crbug.com, Component:Speed>Tracing label:Perfetto.

Community

You can reach us on our Discord channel. If you prefer using IRC we have an experimental Discord <> IRC bridge synced with #perfetto-dev on Freenode.