TraceProcessor: optimize std::sort<TimestampedTracePiece>

TraceSorter::Sort() is a very hot path in trace processor.
It runs std::sort on a CircularQueue<TTP>.
std::sort uses std::swap, which falls back on the generic
move-based implementation (three moves through a temporary)
when no specialized swap is provided.
TTP happens to be exactly 512 bits (64 bytes) wide. This makes
it possible to implement a very efficient swap that leverages
XMM registers on x86_64.
This saves 7-10% of trace ingestion time on a large trace.
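
As a rough illustration of the idea (not the code from this change), the
sketch below gives a 64-byte, trivially copyable struct its own swap so that
std::sort picks it up through argument-dependent lookup. The names
sketch::Piece and SortedTimestamps are hypothetical stand-ins, and the real
TimestampedTracePiece layout is not reproduced here. The fixed-size memcpy
calls are something compilers can typically lower to a few 128-bit loads and
stores on x86_64, but that is an assumption about code generation, not a
guarantee.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a 512-bit sortable element.
struct Piece {
  int64_t timestamp;
  uint64_t payload[7];
  bool operator<(const Piece& o) const { return timestamp < o.timestamp; }
};
static_assert(sizeof(Piece) == 64, "Piece must be exactly 512 bits");
static_assert(std::is_trivially_copyable<Piece>::value,
              "memcpy-based swap requires a trivially copyable type");

// Found via ADL by swap-based steps inside std::sort.
inline void swap(Piece& a, Piece& b) noexcept {
  if (&a == &b)
    return;  // memcpy must not be given overlapping ranges.
  unsigned char tmp[sizeof(Piece)];
  std::memcpy(tmp, &a, sizeof(Piece));
  std::memcpy(&a, &b, sizeof(Piece));
  std::memcpy(&b, tmp, sizeof(Piece));
}

// Sorts a copy of the input and returns the timestamps in order.
inline std::vector<int64_t> SortedTimestamps(std::vector<Piece> pieces) {
  std::sort(pieces.begin(), pieces.end());
  std::vector<int64_t> out;
  for (const Piece& p : pieces)
    out.push_back(p.timestamp);
  return out;
}

}  // namespace sketch
```

Whether a custom swap pays off depends on the element type and the standard
library in use, so a benchmark on a representative trace is the only reliable
check.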

Bug: 205302474
Change-Id: Ie57fc26e79599b2e27945dc5cef176c1d03d95cc
2 files changed
tree: 3a51b1c586ae6dbf057cad1e4417788d2a28e1ff
  1. .github/
  2. bazel/
  3. build_overrides/
  4. buildtools/
  5. debian/
  6. docs/
  7. examples/
  8. gn/
  9. include/
  10. infra/
  11. protos/
  12. src/
  13. test/
  14. tools/
  15. ui/
  16. .clang-format
  17. .clang-tidy
  18. .gitattributes
  19. .gitignore
  20. .gn
  21. .style.yapf
  22. Android.bp
  23. Android.bp.extras
  24. BUILD
  25. BUILD.extras
  26. BUILD.gn
  27. CHANGELOG
  28. codereview.settings
  29. DIR_METADATA
  30. heapprofd.rc
  31. LICENSE
  32. meson.build
  33. METADATA
  34. MODULE_LICENSE_APACHE2
  35. OWNERS
  36. perfetto.rc
  37. PerfettoIntegrationTests.xml
  38. PRESUBMIT.py
  39. README.chromium
  40. README.md
  41. TEST_MAPPING
  42. traced_perf.rc
  43. WORKSPACE
README.md

Perfetto - System profiling, app tracing and trace analysis

Perfetto is a production-grade open-source stack for performance instrumentation and trace analysis. It offers services and libraries for recording system-level and app-level traces, native and Java heap profiling, a library for analyzing traces using SQL, and a web-based UI to visualize and explore multi-GB traces.

See https://perfetto.dev/docs or the /docs/ directory for documentation.