Introduce base::CircularQueue and improve trace processor sorter

This CL introduces a queue container backed by a circular array.
This vastly improves TraceSorter's performance, because it makes
delete-front O(1) (modulo invoking dtors) while still allowing
std::sort() of the elements at the same cost as for a vector.
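
For illustration only, a minimal sketch of the idea (this is NOT the
code added by this CL): IntRingQueue and its members are hypothetical
names, it stores plain ints, and it assumes a power-of-two capacity.
Sort() linearizes the storage first to keep the sketch short; that is
a simplification of the sketch, not a description of how this CL
makes the elements sortable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified ring-buffer queue (not base::CircularQueue).
// The capacity is a power of two so wrapping is a cheap bitmask.
class IntRingQueue {
 public:
  explicit IntRingQueue(size_t capacity_pow2 = 8) : buf_(capacity_pow2) {
    assert(capacity_pow2 && (capacity_pow2 & (capacity_pow2 - 1)) == 0);
  }

  void push_back(int v) {
    if (size_ == buf_.size())
      Grow();
    buf_[(begin_ + size_) & (buf_.size() - 1)] = v;
    ++size_;
  }

  // O(1): only the start index advances, no element is shifted.
  // (A std::vector erase at the front moves every remaining element.)
  void pop_front() {
    assert(size_ > 0);
    begin_ = (begin_ + 1) & (buf_.size() - 1);
    --size_;
  }

  // Random access by logical index, wrapping around the array end.
  int& operator[](size_t i) { return buf_[(begin_ + i) & (buf_.size() - 1)]; }
  int& front() { return (*this)[0]; }
  size_t size() const { return size_; }

  // Sketch-only shortcut: rotate the live elements to index 0 (O(n)),
  // then sort them as a contiguous range, exactly like a vector.
  void Sort() {
    std::rotate(buf_.begin(),
                buf_.begin() + static_cast<std::ptrdiff_t>(begin_),
                buf_.end());
    begin_ = 0;
    std::sort(buf_.begin(),
              buf_.begin() + static_cast<std::ptrdiff_t>(size_));
  }

 private:
  // Doubles the capacity, copying elements in logical order.
  void Grow() {
    std::vector<int> bigger(buf_.size() * 2);
    for (size_t i = 0; i < size_; ++i)
      bigger[i] = (*this)[i];
    buf_.swap(bigger);
    begin_ = 0;
  }

  std::vector<int> buf_;
  size_t begin_ = 0;  // Index of the logical front element.
  size_t size_ = 0;   // Number of live elements.
};
```

In this sketch, popping the front is a single index increment plus a
mask, whereas erasing from the front of a std::vector shifts all the
remaining elements.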

This solution is also faster than using a std::deque for |events_|
instead of the std::vector, because CircularQueue is much simpler
than a std::deque.

Benchmark of the sorting stage, obtained by commenting out the parsing
stage and running on a 7GB trace [1] on a MacBook Pro 15 2018.

std::vector: 70 MB/s
std::deque: 260 MB/s
CircularQueue: 285 MB/s

More CLs will further improve the sorter, but for now this CL alone
makes the trace processor cope better with large traces.

[1] SHA1: ad54e7de3a7,
    trace-walleye-QP1A.190211.001-2019-02-15-20-27-27.perfetto-trace
    from www/~primiano/perfetto/sample_traces

Test: perfetto_unittests --gtest_filter=CircularBufferTest.*
Change-Id: Ie804f54e08d7a9d374aefbe6141cbba65ddf25ec
diff --git a/Android.bp b/Android.bp
index 72ca25d..4d8e723 100644
--- a/Android.bp
+++ b/Android.bp
@@ -2585,6 +2585,7 @@
     ":perfetto_src_traced_probes_ftrace_test_messages_lite_gen",
     ":perfetto_src_traced_probes_ftrace_test_messages_zero_gen",
     "src/base/android_task_runner.cc",
+    "src/base/circular_queue_unittest.cc",
     "src/base/event.cc",
     "src/base/file_utils.cc",
     "src/base/metatrace.cc",