commit | fbf4b7b9e5d1f02536b54734e81a482498d08de0 | [log] [tgz] |
---|---|---|
author | Lalit Maganti <lalitm@google.com> | Tue Nov 05 15:44:07 2019 +0000 |
committer | Lalit Maganti <lalitm@google.com> | Tue Nov 05 15:44:07 2019 +0000 |
tree | 25f2a212636366497e736e5772bbf85c09c1a7dc | |
parent | eedcc90bd5ec4cd3449ba202513576e46962956a [diff] |
trace_processor: optimize filtering a range into a BitVector This CL improves the performance of filtering a ranged RowMap into a BitVector. Instead of appending true/false on each filter, we can pre-reserve the entire memory we will need and just set the bits which should be retained. Before: ---------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------------------------------------- BM_TableFilterRootNonNullEqMatchMany/1024 6180 ns 6179 ns 120812 BM_TableFilterRootNonNullEqMatchMany/4096 30564 ns 30565 ns 22876 BM_TableFilterRootNonNullEqMatchMany/32768 158012 ns 158013 ns 4331 BM_TableFilterRootNonNullEqMatchMany/262144 1155470 ns 1155446 ns 604 BM_TableFilterRootNonNullEqMatchMany/2097152 9337566 ns 9337575 ns 75 BM_TableFilterRootNullableEqMatchMany/1024 16055 ns 16055 ns 43675 BM_TableFilterRootNullableEqMatchMany/4096 78053 ns 78053 ns 9175 BM_TableFilterRootNullableEqMatchMany/32768 621340 ns 621348 ns 1121 BM_TableFilterRootNullableEqMatchMany/262144 4922984 ns 4923000 ns 141 BM_TableFilterRootNullableEqMatchMany/2097152 39401533 ns 39401762 ns 18 BM_TableFilterChildNonNullEqMatchMany/1024 11568 ns 11569 ns 61246 BM_TableFilterChildNonNullEqMatchMany/4096 57972 ns 57973 ns 11997 BM_TableFilterChildNonNullEqMatchMany/32768 381860 ns 381852 ns 1814 BM_TableFilterChildNonNullEqMatchMany/262144 2886247 ns 2886148 ns 241 BM_TableFilterChildNonNullEqMatchMany/2097152 23302410 ns 23301894 ns 30 BM_TableFilterChildNullableEqMatchMany/1024 22658 ns 22658 ns 31472 BM_TableFilterChildNullableEqMatchMany/4096 110301 ns 110288 ns 6197 BM_TableFilterChildNullableEqMatchMany/32768 849353 ns 849322 ns 814 BM_TableFilterChildNullableEqMatchMany/262144 6780529 ns 6778431 ns 104 BM_TableFilterChildNullableEqMatchMany/2097152 53963182 ns 53961542 ns 13 BM_TableFilterChildNonNullEqMatchManyInParent/1024 15736 ns 15735 ns 44072 BM_TableFilterChildNonNullEqMatchManyInParent/4096 76295 ns 76296 ns 9029 BM_TableFilterChildNonNullEqMatchManyInParent/32768 520852 ns 520809 ns 1335 BM_TableFilterChildNonNullEqMatchManyInParent/262144 4008311 ns 4008363 ns 175 BM_TableFilterChildNonNullEqMatchManyInParent/2097152 32729769 ns 32727231 ns 21 BM_TableFilterChildNullableEqMatchManyInParent/1024 37850 ns 37851 ns 18440 BM_TableFilterChildNullableEqMatchManyInParent/4096 133827 ns 133814 ns 5192 BM_TableFilterChildNullableEqMatchManyInParent/32768 1004359 ns 1004337 ns 701 BM_TableFilterChildNullableEqMatchManyInParent/262144 8020932 ns 8019976 ns 88 BM_TableFilterChildNullableEqMatchManyInParent/2097152 64175276 ns 64170894 ns 11 After: ---------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------------------------------------- BM_TableFilterRootNonNullEqMatchMany/1024 3749 ns 3749 ns 188571 BM_TableFilterRootNonNullEqMatchMany/4096 19442 ns 19442 ns 35815 BM_TableFilterRootNonNullEqMatchMany/32768 85956 ns 85956 ns 8189 BM_TableFilterRootNonNullEqMatchMany/262144 581350 ns 581358 ns 1227 BM_TableFilterRootNonNullEqMatchMany/2097152 4620079 ns 4619961 ns 150 BM_TableFilterRootNullableEqMatchMany/1024 13398 ns 13398 ns 52325 BM_TableFilterRootNullableEqMatchMany/4096 69576 ns 69573 ns 10022 BM_TableFilterRootNullableEqMatchMany/32768 541768 ns 541775 ns 1302 BM_TableFilterRootNullableEqMatchMany/262144 4344103 ns 4344096 ns 158 BM_TableFilterRootNullableEqMatchMany/2097152 34370272 ns 34363056 ns 20 BM_TableFilterChildNonNullEqMatchMany/1024 9126 ns 9126 ns 77589 BM_TableFilterChildNonNullEqMatchMany/4096 46091 ns 46092 ns 15183 BM_TableFilterChildNonNullEqMatchMany/32768 304340 ns 304328 ns 2275 BM_TableFilterChildNonNullEqMatchMany/262144 2277972 ns 2278002 ns 308 BM_TableFilterChildNonNullEqMatchMany/2097152 18347501 ns 18347477 ns 38 BM_TableFilterChildNullableEqMatchMany/1024 19414 ns 19414 ns 36206 BM_TableFilterChildNullableEqMatchMany/4096 97416 ns 97417 ns 7242 BM_TableFilterChildNullableEqMatchMany/32768 754388 ns 754398 ns 939 BM_TableFilterChildNullableEqMatchMany/262144 5954874 ns 5954900 ns 118 BM_TableFilterChildNullableEqMatchMany/2097152 47767792 ns 47767683 ns 15 BM_TableFilterChildNonNullEqMatchManyInParent/1024 13175 ns 13175 ns 53010 BM_TableFilterChildNonNullEqMatchManyInParent/4096 59696 ns 59696 ns 11751 BM_TableFilterChildNonNullEqMatchManyInParent/32768 438009 ns 438012 ns 1586 BM_TableFilterChildNonNullEqMatchManyInParent/262144 3381966 ns 3381854 ns 206 BM_TableFilterChildNonNullEqMatchManyInParent/2097152 27470848 ns 27470916 ns 26 BM_TableFilterChildNullableEqMatchManyInParent/1024 29386 ns 29386 ns 23405 BM_TableFilterChildNullableEqMatchManyInParent/4096 119400 ns 119400 ns 5852 BM_TableFilterChildNullableEqMatchManyInParent/32768 897180 ns 897185 ns 779 BM_TableFilterChildNullableEqMatchManyInParent/262144 7110136 ns 7110223 ns 99 BM_TableFilterChildNullableEqMatchManyInParent/2097152 57180514 ns 57180682 ns 12 Analysis: We have a good speedup across the board of 1.1-1.7x; the less other things the codepath is doing, the greater the speedup of this optimization (e.g. the non-null columns are sped up more because they don't have to do SparseVector::IndexOf calls). Also, the first constraint of each filter will always cause this codepath to be hit so this is an important one to optimize. Context: go/perfetto-tp-refactor Bug: 135177627 Change-Id: I7df4811fa520a43b933b61c00fddb8832f0803e7
Perfetto is an open-source project for performance instrumentation and tracing of Linux/Android/Chrome platforms and user-space apps.
See www.perfetto.dev for docs.
You can reach us on our Discord channel. If you prefer using IRC we have an experimental Discord <> IRC bridge synced with #perfetto-dev
on Freenode.