commit | eedcc90bd5ec4cd3449ba202513576e46962956a | [log] [tgz] |
---|---|---|
author | Lalit Maganti <lalitm@google.com> | Tue Nov 05 15:44:00 2019 +0000 |
committer | Lalit Maganti <lalitm@google.com> | Tue Nov 05 15:44:00 2019 +0000 |
tree | 5ce0ce6728320d35f90b48881f57b720dab0edc5 | |
parent | d9c62a63dce55a5aaa1b60582fc66a3082109fe1 [diff] |
trace_processor: improve SelectRangeWithBv for start == 0 case This CL massively improves the performance of SelectRows in RowMap when the source RowMap is a range and the selector is a BitVector when start == 0. This situation happens on every filter for the self RowMap and thus is important to optimize. Before: ---------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------------------------------------- BM_TableFilterRootNonNullEqMatchMany/1024 14017 ns 14017 ns 51401 BM_TableFilterRootNonNullEqMatchMany/4096 73479 ns 73480 ns 9309 BM_TableFilterRootNonNullEqMatchMany/32768 496632 ns 496623 ns 1435 BM_TableFilterRootNonNullEqMatchMany/262144 3549895 ns 3549940 ns 191 BM_TableFilterRootNonNullEqMatchMany/2097152 28100216 ns 28099776 ns 25 BM_TableFilterRootNullableEqMatchMany/1024 21251 ns 21251 ns 32768 BM_TableFilterRootNullableEqMatchMany/4096 105824 ns 105826 ns 6612 BM_TableFilterRootNullableEqMatchMany/32768 788526 ns 788535 ns 887 BM_TableFilterRootNullableEqMatchMany/262144 6240487 ns 6240360 ns 111 BM_TableFilterRootNullableEqMatchMany/2097152 50471154 ns 50469643 ns 14 BM_TableFilterChildNonNullEqMatchMany/1024 19044 ns 19044 ns 36624 BM_TableFilterChildNonNullEqMatchMany/4096 99624 ns 99625 ns 7061 BM_TableFilterChildNonNullEqMatchMany/32768 696208 ns 696179 ns 1040 BM_TableFilterChildNonNullEqMatchMany/262144 5212354 ns 5212334 ns 133 BM_TableFilterChildNonNullEqMatchMany/2097152 42609752 ns 42610240 ns 17 BM_TableFilterChildNullableEqMatchMany/1024 27721 ns 27722 ns 25215 BM_TableFilterChildNullableEqMatchMany/4096 135093 ns 135092 ns 5215 BM_TableFilterChildNullableEqMatchMany/32768 1016299 ns 1016305 ns 689 BM_TableFilterChildNullableEqMatchMany/262144 8002628 ns 8002516 ns 88 BM_TableFilterChildNullableEqMatchMany/2097152 64936808 ns 64936918 ns 11 BM_TableFilterChildNonNullEqMatchManyInParent/1024 22831 ns 22831 ns 30633 BM_TableFilterChildNonNullEqMatchManyInParent/4096 111175 ns 111173 ns 6117 BM_TableFilterChildNonNullEqMatchManyInParent/32768 798577 ns 798581 ns 887 BM_TableFilterChildNonNullEqMatchManyInParent/262144 6176321 ns 6176198 ns 114 BM_TableFilterChildNonNullEqMatchManyInParent/2097152 50297905 ns 50297723 ns 14 BM_TableFilterChildNullableEqMatchManyInParent/1024 42450 ns 42450 ns 16486 BM_TableFilterChildNullableEqMatchManyInParent/4096 153979 ns 153980 ns 4614 BM_TableFilterChildNullableEqMatchManyInParent/32768 1162992 ns 1162998 ns 606 BM_TableFilterChildNullableEqMatchManyInParent/262144 9142348 ns 9142392 ns 77 BM_TableFilterChildNullableEqMatchManyInParent/2097152 74198801 ns 74199115 ns 10 After: ---------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------------------------------------- BM_TableFilterRootNonNullEqMatchMany/1024 5837 ns 5837 ns 118181 BM_TableFilterRootNonNullEqMatchMany/4096 30729 ns 30729 ns 22756 BM_TableFilterRootNonNullEqMatchMany/32768 158072 ns 158071 ns 4425 BM_TableFilterRootNonNullEqMatchMany/262144 1159558 ns 1159346 ns 601 BM_TableFilterRootNonNullEqMatchMany/2097152 9182664 ns 9182692 ns 76 BM_TableFilterRootNullableEqMatchMany/1024 16180 ns 16177 ns 43336 BM_TableFilterRootNullableEqMatchMany/4096 76866 ns 76866 ns 8990 BM_TableFilterRootNullableEqMatchMany/32768 628123 ns 628130 ns 1118 BM_TableFilterRootNullableEqMatchMany/262144 4969477 ns 4969536 ns 139 BM_TableFilterRootNullableEqMatchMany/2097152 39868032 ns 39860352 ns 17 BM_TableFilterChildNonNullEqMatchMany/1024 11336 ns 11335 ns 59754 BM_TableFilterChildNonNullEqMatchMany/4096 57529 ns 57529 ns 11924 BM_TableFilterChildNonNullEqMatchMany/32768 380772 ns 380777 ns 1848 BM_TableFilterChildNonNullEqMatchMany/262144 2889063 ns 2888570 ns 244 BM_TableFilterChildNonNullEqMatchMany/2097152 23695917 ns 23696042 ns 30 BM_TableFilterChildNullableEqMatchMany/1024 22467 ns 22463 ns 30861 BM_TableFilterChildNullableEqMatchMany/4096 108554 ns 108554 ns 6393 BM_TableFilterChildNullableEqMatchMany/32768 841629 ns 841496 ns 822 BM_TableFilterChildNullableEqMatchMany/262144 6618146 ns 6618177 ns 105 BM_TableFilterChildNullableEqMatchMany/2097152 53374738 ns 53366203 ns 13 BM_TableFilterChildNonNullEqMatchManyInParent/1024 15683 ns 15683 ns 44851 BM_TableFilterChildNonNullEqMatchManyInParent/4096 77420 ns 77408 ns 9037 BM_TableFilterChildNonNullEqMatchManyInParent/32768 527354 ns 527360 ns 1339 BM_TableFilterChildNonNullEqMatchManyInParent/262144 4054892 ns 4054258 ns 176 BM_TableFilterChildNonNullEqMatchManyInParent/2097152 32535702 ns 32536055 ns 22 BM_TableFilterChildNullableEqMatchManyInParent/1024 37799 ns 37800 ns 18364 BM_TableFilterChildNullableEqMatchManyInParent/4096 136581 ns 136582 ns 5123 BM_TableFilterChildNullableEqMatchManyInParent/32768 1019327 ns 1019163 ns 679 BM_TableFilterChildNullableEqMatchManyInParent/262144 8087503 ns 8087336 ns 87 BM_TableFilterChildNullableEqMatchManyInParent/2097152 65028977 ns 65029724 ns 11 Analysis: There is improvement across the board with 1.1x-2.4 speed-up depending on the usecase. The filters where there is no other overhead apart from the filtering itself obviously improves more than the cases where we need to also iterate over set bits/lookup nullability in SparseVector. Context: go/perfetto-tp-refactor Bug: 135177627 Change-Id: I43e80f54a4c737c7efcc3a7bd184cadc267a5463
Perfetto is an open-source project for performance instrumentation and tracing of Linux/Android/Chrome platforms and user-space apps.
See www.perfetto.dev for docs.
You can reach us on our Discord channel. If you prefer using IRC we have an experimental Discord <> IRC bridge synced with #perfetto-dev
on Freenode.