Use a separate hwcomposer hidl instance for vr flinger

Improve robustness of vr flinger <--> surface flinger switching by
having vr flinger use a separate hardware composer hidl instance instead
of sharing the instance with surface flinger. Sharing the hardware
composer instance has proven to be error-prone, with situations where
both the vr flinger thread and surface flinger main thread would write
to the composer at the same time, causing hard-to-diagnose
crashes (b/62925812).

Instead of sharing the hardware composer instance, when switching to vr
flinger we now delete the existing instance, create a new instance
directed to the vr hardware composer shim, and vr flinger creates its
own composer instance connected to the real hardware composer. By
creating a separate composer instance for vr flinger, crashes like the
ones found in b/62925812 are no longer possible.

Most of the changes in this commit are related to enabling surface
flinger to delete HWComposer instances cleanly. In particular:

- Previously the hardware composer callbacks (which come in on a
  hwbinder thread) would land in HWC2::Device and bubble up to the
  SurfaceFlinger object. But with the new behavior the HWC2::Device
  might be dead or in the process of being destroyed, so instead we have
  SurfaceFlinger receive the composer callbacks directly, and forward
  them to HWComposer and HWC2::Device. We include a composer id field in
  the callbacks so surface flinger can ignore stale callbacks from dead
  composer instances.

- Object ownership for HWC2::Display and HWC2::Layer was shared by
  passing around shared_ptrs to these objects. This was problematic
  because they referenced and used the HWC2::Device, which can now be
  destroyed when switching to vr flinger. Simplify the ownership model
  by having HWC2::Device own (via unique_ptr<>) instances of
  HWC2::Display, which owns (again via unique_ptr<>) instances of
  HWC2::Layer. In cases where we previously passed std::shared_ptr<> to
  HWC2::Display or HWC2::Layer, instead pass non-owning HWC2::Display*
  and HWC2::Layer* pointers. This ensures clean composer instance
  teardown with no stale references to the deleted HWC2::Device.

- When the hardware composer instance is destroyed and the HWC2::Layers
  are removed, notify the android::Layer via a callback, so it can
  remove the HWC2::Layer from its internal table of hardware composer
  layers. This removes the burden to explicitly clear out all hardware
  composer layers when switching to vr flinger, which has been a source
  of bugs.

- We were missing an mStateLock lock in
  SurfaceFlinger::setVsyncEnabled(), which was necessary to ensure we
  were setting vsync on the correct hardware composer instance. Once
  that lock was added, surface flinger would sometimes deadlock when
  transitioning to vr flinger, because the surface flinger main thread
  would acquire mStateLock and then EventControlThread::mMutex, whereas
  the event control thread would acquire the locks in the opposite
  order. The changes in EventControlThread.cpp are to ensure it doesn't
  hold a lock on EventControlThread::mMutex while calling
  setVsyncEnabled(), to avoid the deadlock.
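
The locking change in the last item can be illustrated with a minimal
sketch. This is not the code in EventControlThread.cpp: VsyncController,
RequestVsync() and ProcessOnce() are hypothetical names, and the
callback stands in for SurfaceFlinger::setVsyncEnabled(). The point is
only that the pending state is copied while holding the internal mutex
and the callback runs after the mutex is released, so the callback may
take another lock (such as mStateLock) without inverting the lock order.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the event control thread's state. The names
// are illustrative and are not taken from EventControlThread.cpp.
class VsyncController {
 public:
  explicit VsyncController(std::function<void(bool)> set_vsync_enabled)
      : set_vsync_enabled_(std::move(set_vsync_enabled)) {
    assert(set_vsync_enabled_ != nullptr);
  }

  // Called from any thread to request a new vsync state.
  void RequestVsync(bool enabled) {
    std::lock_guard<std::mutex> lock(mutex_);
    requested_ = enabled;
  }

  // One iteration of the worker loop. Returns true if the callback ran.
  bool ProcessOnce() {
    bool enabled;
    {
      // Hold mutex_ only long enough to snapshot and update the state.
      std::lock_guard<std::mutex> lock(mutex_);
      if (requested_ == applied_) return false;
      enabled = requested_;
      applied_ = enabled;
    }
    // mutex_ is released here, so the callback can acquire other locks
    // without this thread holding two locks at once.
    set_vsync_enabled_(enabled);
    return true;
  }

 private:
  std::mutex mutex_;
  bool requested_ = false;
  bool applied_ = false;
  std::function<void(bool)> set_vsync_enabled_;
};
```

In this sketch a caller would construct VsyncController with a callback
that takes its own lock, call RequestVsync(true) from one thread, and
run ProcessOnce() in a loop on the worker thread.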

I found that without a composer callback registered in vr flinger the
vsync_event file wasn't getting vsync timestamps written, so vr flinger
would get stuck in an infinite loop trying to parse a vsync
timestamp. Since we need to have a callback anyway I changed the code in
hardware_composer.cpp to get the vsync timestamp from the callback, as
surface flinger does. I confirmed the timestamps are the same with
either method, and this lets us remove some extra code for extracting
the vsync timestamp that (probably) wasn't compatible with all devices
we want to run on anyway. I also added a timeout to the vsync wait so
we'll see an error message in the log if we fail to wait for vsync,
instead of looping forever.
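
A rough sketch of the idea, under stated assumptions: the change itself
has the composer callback signal an event fd that the post thread polls
with a timeout, whereas this sketch swaps in a std::condition_variable
to stay portable. VsyncSource, OnVsync() and the bool-returning
WaitForVsync() are hypothetical names, not the ones in
hardware_composer.cpp.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical sketch. The real ComposerCallback signals a file
// descriptor instead of a condition variable.
class VsyncSource {
 public:
  // Called on the composer callback thread when a vsync arrives.
  void OnVsync(std::int64_t timestamp_ns) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      vsync_time_ns_ = timestamp_ns;
      pending_ = true;
    }
    cv_.notify_one();
  }

  // Waits up to |timeout| for a vsync. Returns true and stores the
  // timestamp on success. Returns false on timeout so the caller can
  // log an error instead of waiting forever.
  bool WaitForVsync(std::chrono::milliseconds timeout,
                    std::int64_t* out_timestamp_ns) {
    assert(out_timestamp_ns != nullptr);
    std::unique_lock<std::mutex> lock(mutex_);
    if (!cv_.wait_for(lock, timeout, [this] { return pending_; })) {
      return false;
    }
    pending_ = false;
    *out_timestamp_ns = vsync_time_ns_;
    return true;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool pending_ = false;
  std::int64_t vsync_time_ns_ = -1;
};
```

The timestamp is stored under the same mutex that guards the pending
flag, so the waiter always reads the value that belongs to the vsync it
was woken for.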

Bug: 62925812

Test: - Confirmed surface flinger <--> vr flinger switching is robust by
        switching devices on and off hundreds of times and observing no
        hardware composer related issues, surface flinger crashes, or
        hardware composer service crashes.

- Confirmed 2d in vr works as before by going through the OOBE flow on a
  standalone. This also exercises virtual display creation and usage
  through surface flinger.

- Added logs to confirm complete layer/display cleanup when destroying
  hardware composer instances.

- Tested normal 2d phone usage to confirm basic layer create/destroy
  functionality works as before.

- Monitored surface flinger file descriptor usage across dozens of
  surface flinger <--> vr flinger transitions and observed no file
  descriptor leaks.

- Confirmed the HWC1 code path still compiles.

- Ran the surface flinger tests and confirmed there are no new test
  failures.

- Ran the hardware composer hidl in passthrough mode on a Marlin and
  confirmed it works.

- Ran CTS tests for virtual displays and confirmed they all pass.

- Tested Android Auto and confirmed basic graphics functionality still
  works.

Change-Id: I17dc0e060bfb5cb447ffbaa573b279fc6d2d8bd1
Merged-In: I17dc0e060bfb5cb447ffbaa573b279fc6d2d8bd1
diff --git a/libs/vr/libvrflinger/hardware_composer.h b/libs/vr/libvrflinger/hardware_composer.h
index a0c50e1..fc0efee 100644
--- a/libs/vr/libvrflinger/hardware_composer.h
+++ b/libs/vr/libvrflinger/hardware_composer.h
@@ -54,11 +54,6 @@
  public:
   Layer() {}
 
-  // Sets up the global state used by all Layer instances. This must be called
-  // before using any Layer methods.
-  static void InitializeGlobals(Hwc2::Composer* hwc2_hidl,
-                                const HWCDisplayMetrics* metrics);
-
   // Releases any shared pointers and fence handles held by this instance.
   void Reset();
 
@@ -72,6 +67,7 @@
   // HWC_FRAMEBUFFER_TARGET (unless you know what you are doing).
   // |index| is the index of this surface in the DirectDisplaySurface array.
   void Setup(const std::shared_ptr<DirectDisplaySurface>& surface,
+             const HWCDisplayMetrics& display_metrics, Hwc2::Composer* hidl,
              HWC::BlendMode blending, HWC::Transform transform,
              HWC::Composition composition_type, size_t z_roder);
 
@@ -83,9 +79,10 @@
   // |transform| receives HWC_TRANSFORM_* values.
   // |composition_type| receives either HWC_FRAMEBUFFER for most layers or
   // HWC_FRAMEBUFFER_TARGET (unless you know what you are doing).
-  void Setup(const std::shared_ptr<IonBuffer>& buffer, HWC::BlendMode blending,
-             HWC::Transform transform, HWC::Composition composition_type,
-             size_t z_order);
+  void Setup(const std::shared_ptr<IonBuffer>& buffer,
+             const HWCDisplayMetrics& display_metrics, Hwc2::Composer* hidl,
+             HWC::BlendMode blending, HWC::Transform transform,
+             HWC::Composition composition_type, size_t z_order);
 
   // Layers that use a direct IonBuffer should call this each frame to update
   // which buffer will be used for the next PostLayers.
@@ -121,7 +118,7 @@
   bool IsLayerSetup() const { return !source_.empty(); }
 
   // Applies all of the settings to this layer using the hwc functions
-  void UpdateLayerSettings();
+  void UpdateLayerSettings(const HWCDisplayMetrics& display_metrics);
 
   int GetSurfaceId() const {
     int surface_id = -1;
@@ -142,10 +139,9 @@
   }
 
  private:
-  void CommonLayerSetup();
+  void CommonLayerSetup(const HWCDisplayMetrics& display_metrics);
 
-  static Hwc2::Composer* hwc2_hidl_;
-  static const HWCDisplayMetrics* display_metrics_;
+  Hwc2::Composer* hidl_ = nullptr;
 
   // The hardware composer layer and metrics to use during the prepare cycle.
   hwc2_layer_t hardware_composer_layer_ = 0;
@@ -263,11 +259,10 @@
   static constexpr size_t kMaxHardwareLayers = 4;
 
   HardwareComposer();
-  HardwareComposer(Hwc2::Composer* hidl,
-                   RequestDisplayCallback request_display_callback);
   ~HardwareComposer();
 
-  bool Initialize();
+  bool Initialize(Hwc2::Composer* hidl,
+                  RequestDisplayCallback request_display_callback);
 
   bool IsInitialized() const { return initialized_; }
 
@@ -281,11 +276,6 @@
   // Get the HMD display metrics for the current display.
   display::Metrics GetHmdDisplayMetrics() const;
 
-  HWC::Error GetDisplayAttribute(hwc2_display_t display, hwc2_config_t config,
-                                 hwc2_attribute_t attributes,
-                                 int32_t* out_value) const;
-  HWC::Error GetDisplayMetrics(hwc2_display_t display, hwc2_config_t config,
-                               HWCDisplayMetrics* out_metrics) const;
   std::string Dump();
 
   void SetVSyncCallback(VSyncCallback callback);
@@ -308,34 +298,31 @@
   int OnNewGlobalBuffer(DvrGlobalBufferKey key, IonBuffer& ion_buffer);
   void OnDeletedGlobalBuffer(DvrGlobalBufferKey key);
 
-  void OnHardwareComposerRefresh();
-
  private:
-  int32_t EnableVsync(bool enabled);
+  HWC::Error GetDisplayAttribute(Hwc2::Composer* hidl, hwc2_display_t display,
+                                 hwc2_config_t config,
+                                 hwc2_attribute_t attributes,
+                                 int32_t* out_value) const;
+  HWC::Error GetDisplayMetrics(Hwc2::Composer* hidl, hwc2_display_t display,
+                               hwc2_config_t config,
+                               HWCDisplayMetrics* out_metrics) const;
+
+  HWC::Error EnableVsync(bool enabled);
 
   class ComposerCallback : public Hwc2::IComposerCallback {
    public:
-    ComposerCallback() {}
-
-    hardware::Return<void> onHotplug(Hwc2::Display /*display*/,
-                                     Connection /*connected*/) override {
-      // TODO(skiazyk): depending on how the server is implemented, we might
-      // have to set it up to synchronize with receiving this event, as it can
-      // potentially be a critical event for setting up state within the
-      // hwc2 module. That is, we (technically) should not call any other hwc
-      // methods until this method has been called after registering the
-      // callbacks.
-      return hardware::Void();
-    }
-
-    hardware::Return<void> onRefresh(Hwc2::Display /*display*/) override {
-      return hardware::Void();
-    }
-
-    hardware::Return<void> onVsync(Hwc2::Display /*display*/,
-                                   int64_t /*timestamp*/) override {
-      return hardware::Void();
-    }
+    ComposerCallback();
+    hardware::Return<void> onHotplug(Hwc2::Display display,
+                                     Connection conn) override;
+    hardware::Return<void> onRefresh(Hwc2::Display display) override;
+    hardware::Return<void> onVsync(Hwc2::Display display,
+                                   int64_t timestamp) override;
+    const pdx::LocalHandle& GetVsyncEventFd() const;
+    int64_t GetVsyncTime();
+   private:
+    std::mutex vsync_mutex_;
+    pdx::LocalHandle vsync_event_fd_;
+    int64_t vsync_time_ = -1;
   };
 
   HWC::Error Validate(hwc2_display_t display);
@@ -364,17 +351,18 @@
   void UpdatePostThreadState(uint32_t state, bool suspend);
 
   // Blocks until either event_fd becomes readable, or we're interrupted by a
-  // control thread. Any errors are returned as negative errno values. If we're
-  // interrupted, kPostThreadInterrupted will be returned.
+  // control thread, or timeout_ms is reached before any events occur. Any
+  // errors are returned as negative errno values, with -ETIMEDOUT returned in
+  // the case of a timeout. If we're interrupted, kPostThreadInterrupted will be
+  // returned.
   int PostThreadPollInterruptible(const pdx::LocalHandle& event_fd,
-                                  int requested_events);
+                                  int requested_events,
+                                  int timeout_ms);
 
-  // BlockUntilVSync, WaitForVSync, and SleepUntil are all blocking calls made
-  // on the post thread that can be interrupted by a control thread. If
-  // interrupted, these calls return kPostThreadInterrupted.
+  // WaitForVSync and SleepUntil are blocking calls made on the post thread that
+  // can be interrupted by a control thread. If interrupted, these calls return
+  // kPostThreadInterrupted.
   int ReadWaitPPState();
-  int BlockUntilVSync();
-  int ReadVSyncTimestamp(int64_t* timestamp);
   int WaitForVSync(int64_t* timestamp);
   int SleepUntil(int64_t wakeup_timestamp);
 
@@ -398,11 +386,9 @@
 
   bool initialized_;
 
-  // Hardware composer HAL device from SurfaceFlinger. VrFlinger does not own
-  // this pointer.
-  Hwc2::Composer* hwc2_hidl_;
+  std::unique_ptr<Hwc2::Composer> hidl_;
+  sp<ComposerCallback> hidl_callback_;
   RequestDisplayCallback request_display_callback_;
-  sp<ComposerCallback> callbacks_;
 
   // Display metrics of the physical display.
   HWCDisplayMetrics native_display_metrics_;
@@ -433,7 +419,8 @@
   std::thread post_thread_;
 
   // Post thread state machine and synchronization primitives.
-  PostThreadStateType post_thread_state_{PostThreadState::Idle};
+  PostThreadStateType post_thread_state_{
+      PostThreadState::Idle | PostThreadState::Suspended};
   std::atomic<bool> post_thread_quiescent_{true};
   bool post_thread_resumed_{false};
   pdx::LocalHandle post_thread_event_fd_;
@@ -444,9 +431,6 @@
   // Backlight LED brightness sysfs node.
   pdx::LocalHandle backlight_brightness_fd_;
 
-  // Primary display vsync event sysfs node.
-  pdx::LocalHandle primary_display_vsync_event_fd_;
-
   // Primary display wait_pingpong state sysfs node.
   pdx::LocalHandle primary_display_wait_pp_fd_;
 
@@ -478,12 +462,6 @@
 
   static constexpr int kPostThreadInterrupted = 1;
 
-  static void HwcRefresh(hwc2_callback_data_t data, hwc2_display_t display);
-  static void HwcVSync(hwc2_callback_data_t data, hwc2_display_t display,
-                       int64_t timestamp);
-  static void HwcHotplug(hwc2_callback_data_t callbackData,
-                         hwc2_display_t display, hwc2_connection_t connected);
-
   HardwareComposer(const HardwareComposer&) = delete;
   void operator=(const HardwareComposer&) = delete;
 };