Fix memory order and race bugs in RefBase.h & RefBase.cpp

Convert to use std::atomic directly.

Consistently use relaxed ordering for increments, release ordering
for decrements, and an acquire fence when the count drops to
zero.
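
The intended pattern, as a minimal sketch (a hypothetical standalone
Counted type, assumed heap-allocated; not the actual weakref_impl
code):

    #include <atomic>
    #include <cstdint>

    struct Counted {
        std::atomic<int32_t> mCount{1};

        void inc() {
            // Increments only need atomicity, no ordering.
            mCount.fetch_add(1, std::memory_order_relaxed);
        }

        void dec() {
            // Release: this thread's accesses to the object happen
            // before the decrement becomes visible to other threads.
            if (mCount.fetch_sub(1, std::memory_order_release) == 1) {
                // Acquire fence: the deleting thread sees every access
                // that preceded any earlier decrement (via the release
                // sequence), so deletion cannot race with them.
                std::atomic_thread_fence(std::memory_order_acquire);
                delete this;
            }
        }
    };

Using acq_rel on every decrement would also be correct; the fence
variant only pays the acquire cost on the path that actually
deallocates.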

Fix what looks like another race in attemptIncStrong:
It seems entirely possible that the final adjustment for
INITIAL_STRONG_VALUE would see e.g. INITIAL_STRONG_VALUE + 1,
since we could be running in the middle of another initial
increment.
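
One possible interleaving (illustrative only), starting from
mStrong == INITIAL_STRONG_VALUE with only weak references held:

    Thread A (incStrong):        increments; its increment returned
                                 INITIAL_STRONG_VALUE, so it will
                                 subtract the constant, but hasn't yet.
    Thread B (attemptIncStrong): increments via the easy path, then the
                                 old fix-up loop reads a value >=
                                 INITIAL_STRONG_VALUE and subtracts it.
    Thread A:                    performs its own subtraction; the
                                 constant is removed twice and mStrong
                                 goes negative.

With this change, only the thread whose own increment returned exactly
INITIAL_STRONG_VALUE performs the subtraction.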

Attempt to somewhat document what this actually does, and
what's expected from the client. Hide the documentation in
the .cpp file for now.

Remove a confusing redundant test in decWeak. OBJECT_LIFETIME_STRONG
and OBJECT_LIFETIME_WEAK are the only options, in spite of some
of the original comments.

It's conceivable that either of these issues has resulted in
actual crashes, though I would guess the probability is small.
It's hard enough to reason about this code without the bugs.

Bug: 28705989
Change-Id: I4107a56c3fc0fdb7ee17fc8a8f0dd7fb128af9d8
(cherry picked from commit e263e6c6337a24d56dc803792206e54981ad53a5)
diff --git a/libutils/RefBase.cpp b/libutils/RefBase.cpp
index 02907ad..f90e28b 100644
--- a/libutils/RefBase.cpp
+++ b/libutils/RefBase.cpp
@@ -27,7 +27,6 @@
 
 #include <utils/RefBase.h>
 
-#include <utils/Atomic.h>
 #include <utils/CallStack.h>
 #include <utils/Log.h>
 #include <utils/threads.h>
@@ -57,6 +56,68 @@
 
 namespace android {
 
+// Usage, invariants, etc:
+
+// It is normally OK just to keep weak pointers to an object.  The object will
+// be deallocated by decWeak when the last weak reference disappears.
+// Once a strong reference has been created, the object will disappear once
+// the last strong reference does (decStrong).
+// AttemptIncStrong will succeed if the object has a strong reference, or if it
+// has a weak reference and has never had a strong reference.
+// AttemptIncWeak really does succeed only if there is already a WEAK
+// reference, and thus may fail when attemptIncStrong would succeed.
+// OBJECT_LIFETIME_WEAK changes this behavior to retain the object
+// unconditionally until the last reference of either kind disappears.  The
+// client ensures that the extendObjectLifetime call happens before the dec
+// call that would otherwise have deallocated the object, or before an
+// attemptIncStrong call that might rely on it.  We do not worry about
+// concurrent changes to the object lifetime.
+// mStrong is the strong reference count.  mWeak is the weak reference count.
+// Between calls, and ignoring memory ordering effects, mWeak includes strong
+// references, and is thus >= mStrong.
+//
+// A weakref_impl is allocated as the value of mRefs in a RefBase object on
+// construction.
+// In the OBJECT_LIFETIME_STRONG case, it is deallocated in the RefBase
+// destructor iff the strong reference count was never incremented. The
+// destructor can be invoked either from decStrong, or from decWeak if there
+// was never a strong reference. If the reference count had been incremented,
+// it is deallocated directly in decWeak, and hence still lives as long as
+// the last weak reference.
+// In the OBJECT_LIFETIME_WEAK case, it is always deallocated from the RefBase
+// destructor, which is always invoked by decWeak. DecStrong explicitly avoids
+// the deletion in this case.
+//
+// Memory ordering:
+// The client must ensure that every inc() call, together with all other
+// accesses to the object, happens before the corresponding dec() call.
+//
+// We try to keep memory ordering constraints on atomics as weak as possible,
+// since memory fences or ordered memory accesses are likely to be a major
+// performance cost for this code. All accesses to mStrong, mWeak, and mFlags
+// explicitly relax memory ordering in some way.
+//
+// The only operations that are not memory_order_relaxed are reference count
+// decrements. All reference count decrements are release operations.  In
+// addition, the final decrement leading to the deallocation is followed by an
+// acquire fence, which we can view informally as also turning it into an
+// acquire operation.  (See 29.8p4 [atomics.fences] for details. We could
+// alternatively use acq_rel operations for all decrements. This is probably
+// slower on most current (2016) hardware, especially on ARMv7, but that may
+// not be true indefinitely.)
+//
+// This convention ensures that the second-to-last decrement synchronizes with
+// (in the language of 1.10 in the C++ standard) the final decrement of a
+// reference count. Since reference counts are only updated using atomic
+// read-modify-write operations, this also extends to any earlier decrements.
+// (See "release sequence" in 1.10.)
+//
+// Since all operations on an object happen before the corresponding reference
+// count decrement, and all reference count decrements happen before the final
+// one, we are guaranteed that all other object accesses happen before the
+// object is destroyed.
+
+
 #define INITIAL_STRONG_VALUE (1<<28)
 
 // ---------------------------------------------------------------------------
@@ -64,10 +125,10 @@
 class RefBase::weakref_impl : public RefBase::weakref_type
 {
 public:
-    volatile int32_t    mStrong;
-    volatile int32_t    mWeak;
-    RefBase* const      mBase;
-    volatile int32_t    mFlags;
+    std::atomic<int32_t>    mStrong;
+    std::atomic<int32_t>    mWeak;
+    RefBase* const          mBase;
+    std::atomic<int32_t>    mFlags;
 
 #if !DEBUG_REFS
 
@@ -141,7 +202,7 @@
     void addStrongRef(const void* id) {
         //ALOGD_IF(mTrackEnabled,
         //        "addStrongRef: RefBase=%p, id=%p", mBase, id);
-        addRef(&mStrongRefs, id, mStrong);
+        addRef(&mStrongRefs, id, mStrong.load(std::memory_order_relaxed));
     }
 
     void removeStrongRef(const void* id) {
@@ -150,7 +211,7 @@
         if (!mRetain) {
             removeRef(&mStrongRefs, id);
         } else {
-            addRef(&mStrongRefs, id, -mStrong);
+            addRef(&mStrongRefs, id, -mStrong.load(std::memory_order_relaxed));
         }
     }
 
@@ -162,14 +223,14 @@
     }
 
     void addWeakRef(const void* id) {
-        addRef(&mWeakRefs, id, mWeak);
+        addRef(&mWeakRefs, id, mWeak.load(std::memory_order_relaxed));
     }
 
     void removeWeakRef(const void* id) {
         if (!mRetain) {
             removeRef(&mWeakRefs, id);
         } else {
-            addRef(&mWeakRefs, id, -mWeak);
+            addRef(&mWeakRefs, id, -mWeak.load(std::memory_order_relaxed));
         }
     }
 
@@ -325,7 +386,7 @@
     refs->incWeak(id);
     
     refs->addStrongRef(id);
-    const int32_t c = android_atomic_inc(&refs->mStrong);
+    const int32_t c = refs->mStrong.fetch_add(1, std::memory_order_relaxed);
     ALOG_ASSERT(c > 0, "incStrong() called on %p after last strong ref", refs);
 #if PRINT_REFS
     ALOGD("incStrong of %p from %p: cnt=%d\n", this, id, c);
@@ -334,7 +395,10 @@
         return;
     }
 
-    android_atomic_add(-INITIAL_STRONG_VALUE, &refs->mStrong);
+    int32_t old = refs->mStrong.fetch_sub(INITIAL_STRONG_VALUE,
+            std::memory_order_relaxed);
+    // A decStrong() must still happen after us.
+    ALOG_ASSERT(old > INITIAL_STRONG_VALUE, "0x%x too small", old);
     refs->mBase->onFirstRef();
 }
 
@@ -342,27 +406,39 @@
 {
     weakref_impl* const refs = mRefs;
     refs->removeStrongRef(id);
-    const int32_t c = android_atomic_dec(&refs->mStrong);
+    const int32_t c = refs->mStrong.fetch_sub(1, std::memory_order_release);
 #if PRINT_REFS
     ALOGD("decStrong of %p from %p: cnt=%d\n", this, id, c);
 #endif
     ALOG_ASSERT(c >= 1, "decStrong() called on %p too many times", refs);
     if (c == 1) {
+        std::atomic_thread_fence(std::memory_order_acquire);
         refs->mBase->onLastStrongRef(id);
-        if ((refs->mFlags&OBJECT_LIFETIME_MASK) == OBJECT_LIFETIME_STRONG) {
+        int32_t flags = refs->mFlags.load(std::memory_order_relaxed);
+        if ((flags&OBJECT_LIFETIME_MASK) == OBJECT_LIFETIME_STRONG) {
             delete this;
+            // Since mStrong had been incremented, the destructor did not
+            // delete refs.
         }
     }
+    // Note that even with only strong reference operations, the thread
+    // deallocating this may not be the same as the thread deallocating refs.
+    // That's OK: all accesses to this happen before its deletion here,
+    // and all accesses to refs happen before its deletion in the final decWeak.
+    // The destructor can safely access mRefs because either it's deleting
+    // mRefs itself, or it's running entirely before the final mWeak decrement.
     refs->decWeak(id);
 }
 
 void RefBase::forceIncStrong(const void* id) const
 {
+    // Allows initial mStrong of 0 in addition to INITIAL_STRONG_VALUE.
+    // TODO: Better document assumptions.
     weakref_impl* const refs = mRefs;
     refs->incWeak(id);
     
     refs->addStrongRef(id);
-    const int32_t c = android_atomic_inc(&refs->mStrong);
+    const int32_t c = refs->mStrong.fetch_add(1, std::memory_order_relaxed);
     ALOG_ASSERT(c >= 0, "forceIncStrong called on %p after ref count underflow",
                refs);
 #if PRINT_REFS
@@ -371,7 +447,8 @@
 
     switch (c) {
     case INITIAL_STRONG_VALUE:
-        android_atomic_add(-INITIAL_STRONG_VALUE, &refs->mStrong);
+        refs->mStrong.fetch_sub(INITIAL_STRONG_VALUE,
+                std::memory_order_relaxed);
         // fall through...
     case 0:
         refs->mBase->onFirstRef();
@@ -380,7 +457,8 @@
 
 int32_t RefBase::getStrongCount() const
 {
-    return mRefs->mStrong;
+    // Debugging only; no memory ordering guarantees.
+    return mRefs->mStrong.load(std::memory_order_relaxed);
 }
 
 RefBase* RefBase::weakref_type::refBase() const
@@ -392,7 +470,8 @@
 {
     weakref_impl* const impl = static_cast<weakref_impl*>(this);
     impl->addWeakRef(id);
-    const int32_t c __unused = android_atomic_inc(&impl->mWeak);
+    const int32_t c __unused = impl->mWeak.fetch_add(1,
+            std::memory_order_relaxed);
     ALOG_ASSERT(c >= 0, "incWeak called on %p after last weak ref", this);
 }
 
@@ -401,16 +480,19 @@
 {
     weakref_impl* const impl = static_cast<weakref_impl*>(this);
     impl->removeWeakRef(id);
-    const int32_t c = android_atomic_dec(&impl->mWeak);
+    const int32_t c = impl->mWeak.fetch_sub(1, std::memory_order_release);
     ALOG_ASSERT(c >= 1, "decWeak called on %p too many times", this);
     if (c != 1) return;
+    std::atomic_thread_fence(std::memory_order_acquire);
 
-    if ((impl->mFlags&OBJECT_LIFETIME_WEAK) == OBJECT_LIFETIME_STRONG) {
+    int32_t flags = impl->mFlags.load(std::memory_order_relaxed);
+    if ((flags&OBJECT_LIFETIME_MASK) == OBJECT_LIFETIME_STRONG) {
         // This is the regular lifetime case. The object is destroyed
         // when the last strong reference goes away. Since weakref_impl
         // outlive the object, it is not destroyed in the dtor, and
         // we'll have to do it here.
-        if (impl->mStrong == INITIAL_STRONG_VALUE) {
+        if (impl->mStrong.load(std::memory_order_relaxed)
+                == INITIAL_STRONG_VALUE) {
             // Special case: we never had a strong reference, so we need to
             // destroy the object now.
             delete impl->mBase;
@@ -419,13 +501,10 @@
             delete impl;
         }
     } else {
-        // less common case: lifetime is OBJECT_LIFETIME_{WEAK|FOREVER}
+        // This is the OBJECT_LIFETIME_WEAK case. The last weak reference
+        // is gone, so we can destroy the object.
         impl->mBase->onLastWeakRef(id);
-        if ((impl->mFlags&OBJECT_LIFETIME_MASK) == OBJECT_LIFETIME_WEAK) {
-            // this is the OBJECT_LIFETIME_WEAK case. The last weak-reference
-            // is gone, we can destroy the object.
-            delete impl->mBase;
-        }
+        delete impl->mBase;
     }
 }
 
@@ -434,7 +513,7 @@
     incWeak(id);
     
     weakref_impl* const impl = static_cast<weakref_impl*>(this);
-    int32_t curCount = impl->mStrong;
+    int32_t curCount = impl->mStrong.load(std::memory_order_relaxed);
 
     ALOG_ASSERT(curCount >= 0,
             "attemptIncStrong called on %p after underflow", this);
@@ -442,19 +521,20 @@
     while (curCount > 0 && curCount != INITIAL_STRONG_VALUE) {
         // we're in the easy/common case of promoting a weak-reference
         // from an existing strong reference.
-        if (android_atomic_cmpxchg(curCount, curCount+1, &impl->mStrong) == 0) {
+        if (impl->mStrong.compare_exchange_weak(curCount, curCount+1,
+                std::memory_order_relaxed)) {
             break;
         }
         // the strong count has changed on us, we need to re-assert our
-        // situation.
-        curCount = impl->mStrong;
+        // situation. curCount was updated by compare_exchange_weak.
     }
     
     if (curCount <= 0 || curCount == INITIAL_STRONG_VALUE) {
         // we're now in the harder case of either:
         // - there never was a strong reference on us
         // - or, all strong references have been released
-        if ((impl->mFlags&OBJECT_LIFETIME_WEAK) == OBJECT_LIFETIME_STRONG) {
+        int32_t flags = impl->mFlags.load(std::memory_order_relaxed);
+        if ((flags&OBJECT_LIFETIME_MASK) == OBJECT_LIFETIME_STRONG) {
             // this object has a "normal" life-time, i.e.: it gets destroyed
             // when the last strong reference goes away
             if (curCount <= 0) {
@@ -468,13 +548,13 @@
             // there never was a strong-reference, so we can try to
             // promote this object; we need to do that atomically.
             while (curCount > 0) {
-                if (android_atomic_cmpxchg(curCount, curCount + 1,
-                        &impl->mStrong) == 0) {
+                if (impl->mStrong.compare_exchange_weak(curCount, curCount+1,
+                        std::memory_order_relaxed)) {
                     break;
                 }
                 // the strong count has changed on us, we need to re-assert our
                 // situation (e.g.: another thread has inc/decStrong'ed us)
-                curCount = impl->mStrong;
+                // curCount has been updated.
             }
 
             if (curCount <= 0) {
@@ -494,7 +574,7 @@
             }
             // grab a strong-reference, which is always safe due to the
             // extended life-time.
-            curCount = android_atomic_inc(&impl->mStrong);
+            curCount = impl->mStrong.fetch_add(1, std::memory_order_relaxed);
         }
 
         // If the strong reference count has already been incremented by
@@ -513,21 +593,16 @@
     ALOGD("attemptIncStrong of %p from %p: cnt=%d\n", this, id, curCount);
 #endif
 
-    // now we need to fix-up the count if it was INITIAL_STRONG_VALUE
-    // this must be done safely, i.e.: handle the case where several threads
+    // curCount is the value of mStrong before we incremented it.
+    // Now we need to fix-up the count if it was INITIAL_STRONG_VALUE.
+    // This must be done safely, i.e.: handle the case where several threads
     // were here in attemptIncStrong().
-    curCount = impl->mStrong;
-    while (curCount >= INITIAL_STRONG_VALUE) {
-        ALOG_ASSERT(curCount > INITIAL_STRONG_VALUE,
-                "attemptIncStrong in %p underflowed to INITIAL_STRONG_VALUE",
-                this);
-        if (android_atomic_cmpxchg(curCount, curCount-INITIAL_STRONG_VALUE,
-                &impl->mStrong) == 0) {
-            break;
-        }
-        // the strong-count changed on us, we need to re-assert the situation,
-        // for e.g.: it's possible the fix-up happened in another thread.
-        curCount = impl->mStrong;
+    // curCount > INITIAL_STRONG_VALUE is OK, and can happen if we're doing
+    // this in the middle of another incStrong.  The subtraction is handled
+    // by the thread that started with INITIAL_STRONG_VALUE.
+    if (curCount == INITIAL_STRONG_VALUE) {
+        impl->mStrong.fetch_sub(INITIAL_STRONG_VALUE,
+                std::memory_order_relaxed);
     }
 
     return true;
@@ -537,14 +612,15 @@
 {
     weakref_impl* const impl = static_cast<weakref_impl*>(this);
 
-    int32_t curCount = impl->mWeak;
+    int32_t curCount = impl->mWeak.load(std::memory_order_relaxed);
     ALOG_ASSERT(curCount >= 0, "attemptIncWeak called on %p after underflow",
                this);
     while (curCount > 0) {
-        if (android_atomic_cmpxchg(curCount, curCount+1, &impl->mWeak) == 0) {
+        if (impl->mWeak.compare_exchange_weak(curCount, curCount+1,
+                std::memory_order_relaxed)) {
             break;
         }
-        curCount = impl->mWeak;
+        // curCount has been updated.
     }
 
     if (curCount > 0) {
@@ -556,7 +632,9 @@
 
 int32_t RefBase::weakref_type::getWeakCount() const
 {
-    return static_cast<const weakref_impl*>(this)->mWeak;
+    // Debug only!
+    return static_cast<const weakref_impl*>(this)->mWeak
+            .load(std::memory_order_relaxed);
 }
 
 void RefBase::weakref_type::printRefs() const
@@ -587,17 +665,19 @@
 
 RefBase::~RefBase()
 {
-    if (mRefs->mStrong == INITIAL_STRONG_VALUE) {
+    if (mRefs->mStrong.load(std::memory_order_relaxed)
+            == INITIAL_STRONG_VALUE) {
         // we never acquired a strong (and/or weak) reference on this object.
         delete mRefs;
     } else {
-        // life-time of this object is extended to WEAK or FOREVER, in
+        // life-time of this object is extended to WEAK, in
         // which case weakref_impl doesn't out-live the object and we
         // can free it now.
-        if ((mRefs->mFlags & OBJECT_LIFETIME_MASK) != OBJECT_LIFETIME_STRONG) {
+        int32_t flags = mRefs->mFlags.load(std::memory_order_relaxed);
+        if ((flags & OBJECT_LIFETIME_MASK) != OBJECT_LIFETIME_STRONG) {
             // It's possible that the weak count is not 0 if the object
             // re-acquired a weak reference in its destructor
-            if (mRefs->mWeak == 0) {
+            if (mRefs->mWeak.load(std::memory_order_relaxed) == 0) {
                 delete mRefs;
             }
         }
@@ -608,7 +688,9 @@
 
 void RefBase::extendObjectLifetime(int32_t mode)
 {
-    android_atomic_or(mode, &mRefs->mFlags);
+    // Must be happens-before ordered with respect to construction or any
+    // operation that could destroy the object.
+    mRefs->mFlags.fetch_or(mode, std::memory_order_relaxed);
 }
 
 void RefBase::onFirstRef()