traced_perf: rework reading granularity, add unwind queue

The implementation is kept single-threaded, but the organization roughly
follows the intended unwinder-on-a-dedicated-thread approach.

This is a mix of smaller changes, summarized in no particular order:
* Reading now consumes individual records instead of chunks of the ring buffer.
  (Each record is either a sample, or a PERF_RECORD_LOST if the kernel has
  lost samples due to ring buffer capacity.) I didn't want to bubble up the
  record type decision all the way to the PerfProducer, so I'm proposing the
  callback+return format of ReadUntilSample to abstract most of it away
  (though note that it ends up hiding a nested loop).
* Reading parses the event directly out of the ring buffer (copying the
  populated stack bytes onto the heap, etc.).
* The wrapped ring buffer case is handled by reconstructing the event in a
  dedicated buffer (with a pair of memcpy calls), and returning a pointer to
  that buffer.
* Added boot clock timestamps to samples.
* Added proc-fd interface, and direct/signal-based implementations.
* Added per-datasource unwinder queues (for now simply a std::deque while we're
  single-threaded; imagine a ring buffer in the future).
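
To illustrate the wrapped ring buffer case from the list above, here is a
minimal sketch. It is not the code in this change: ContiguousRecord, its
parameters, and the std::vector scratch buffer are hypothetical names chosen
for the example, and it assumes the record length never exceeds the ring size.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (illustration only). Returns a pointer to |len|
// contiguous bytes of the record that starts at |offset| within a ring
// buffer of |size| bytes. Assumes offset < size and len <= size.
const uint8_t* ContiguousRecord(const uint8_t* ring,
                                size_t size,
                                size_t offset,
                                size_t len,
                                std::vector<uint8_t>* scratch) {
  // Common case: the record does not wrap, so it can be read in place.
  if (offset + len <= size)
    return ring + offset;

  // Wrapped case: stitch the two halves together in the scratch buffer.
  size_t first = size - offset;  // Bytes before the end of the ring.
  scratch->resize(len);
  memcpy(scratch->data(), ring + offset, first);
  memcpy(scratch->data() + first, ring, len - first);
  return scratch->data();
}
```

The returned pointer is only valid until the scratch buffer is next reused,
so a caller would need to finish parsing the record before reading another.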

On unwinding queues specifically: the slots are filled in order (from the
perspective of a given per-cpu buffer reader). The parsing, however, can be
out of order, as samples are kept in the queue until their proc-fds are ready
(or assumed to be unobtainable for the remainder of the data source's
lifetime). The "unwind" tick keeps walking the queue in order, only releasing
completed entries once they reach the "oldest" end of the queue.
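
The queue behaviour described above can be sketched roughly as follows. All
names here (FdStatus, QueuedSample, UnwindTick) are hypothetical and the
actual unwinding is replaced by a flag flip; this only illustrates the
"complete out of order, release from the oldest end" idea.

```cpp
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical types, for illustration only.
enum class FdStatus { kPending, kReady, kUnavailable };

struct QueuedSample {
  uint64_t id;
  FdStatus fd_status;
  bool done;
};

// One "unwind" tick. Walks the queue from oldest to newest and completes
// every entry whose proc-fds are resolved (either ready, or known to be
// unobtainable). Completed entries are then released only from the front,
// so an older pending entry holds back newer completed ones.
// Returns the ids released during this tick.
std::vector<uint64_t> UnwindTick(std::deque<QueuedSample>* queue) {
  for (QueuedSample& sample : *queue) {
    if (!sample.done && sample.fd_status != FdStatus::kPending)
      sample.done = true;  // Stand-in for unwinding or giving up.
  }

  std::vector<uint64_t> released;
  while (!queue->empty() && queue->front().done) {
    released.push_back(queue->front().id);
    queue->pop_front();
  }
  return released;
}
```

With samples 1 (pending) and 2 (ready) queued, the first tick completes 2 but
releases nothing; once 1 is resolved, the next tick releases both in order.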

Bug: 144281346
Change-Id: Ic59c153c1c80e04b5e5bfb25656c10bc5e80dd11
diff --git a/Android.bp b/Android.bp
index 0b1651b..dc0bc6c 100644
--- a/Android.bp
+++ b/Android.bp
@@ -5696,6 +5696,14 @@
   ],
 }
 
+// GN: //src/profiling/perf:proc_descriptors
+filegroup {
+  name: "perfetto_src_profiling_perf_proc_descriptors",
+  srcs: [
+    "src/profiling/perf/proc_descriptors.cc",
+  ],
+}
+
 // GN: //src/profiling/perf:producer
 filegroup {
   name: "perfetto_src_profiling_perf_producer",
@@ -5713,6 +5721,14 @@
   ],
 }
 
+// GN: //src/profiling/perf:regs_parsing
+filegroup {
+  name: "perfetto_src_profiling_perf_regs_parsing",
+  srcs: [
+    "src/profiling/perf/regs_parsing.cc",
+  ],
+}
+
 // GN: //src/profiling/perf:traced_perf_main
 filegroup {
   name: "perfetto_src_profiling_perf_traced_perf_main",
@@ -5721,14 +5737,6 @@
   ],
 }
 
-// GN: //src/profiling/perf:unwind_support
-filegroup {
-  name: "perfetto_src_profiling_perf_unwind_support",
-  srcs: [
-    "src/profiling/perf/unwind_support.cc",
-  ],
-}
-
 // GN: //src/profiling/symbolizer:symbolize_database
 filegroup {
   name: "perfetto_src_profiling_symbolizer_symbolize_database",
@@ -6962,9 +6970,10 @@
     ":perfetto_src_profiling_memory_scoped_spinlock",
     ":perfetto_src_profiling_memory_unittests",
     ":perfetto_src_profiling_memory_wire_protocol",
+    ":perfetto_src_profiling_perf_proc_descriptors",
     ":perfetto_src_profiling_perf_producer",
     ":perfetto_src_profiling_perf_producer_unittests",
-    ":perfetto_src_profiling_perf_unwind_support",
+    ":perfetto_src_profiling_perf_regs_parsing",
     ":perfetto_src_profiling_unittests",
     ":perfetto_src_protozero_protozero",
     ":perfetto_src_protozero_testing_messages_cpp_gen",
@@ -7447,9 +7456,10 @@
     ":perfetto_src_base_unix_socket",
     ":perfetto_src_ipc_client",
     ":perfetto_src_ipc_common",
+    ":perfetto_src_profiling_perf_proc_descriptors",
     ":perfetto_src_profiling_perf_producer",
+    ":perfetto_src_profiling_perf_regs_parsing",
     ":perfetto_src_profiling_perf_traced_perf_main",
-    ":perfetto_src_profiling_perf_unwind_support",
     ":perfetto_src_protozero_protozero",
     ":perfetto_src_tracing_common",
     ":perfetto_src_tracing_core_core",