Make SchedulerWorkerPoolImpl::DisallowWorkerPoolCleanupForTesting() synchronized with SchedulerWorkers' cleanup step.
Atomics didn't guarantee synchronization as a worker could observe it
was allowed to cleanup, then be preempted, and later actually cleanup
after cleanup had been disallowed. Synchronizing on the pool's lock
prevents this.
I believe this is the source of most TaskScheduler flakes we've seen on
Fuschia. The actual test being blamed were also random because a leaky
worker would result in a user-after-free in the following test, not
the test at fault.
While this CL synchronizes cleanups with DisallowWorkerPoolCleanupForTesting(),
a follow-up CL will be required to ensure detached workers do not touch pool
state after cleanup (since they removed themselves from |workers_| they will
not be joined and hence using pool state can result in a use-after-free).
R=fdoray@chromium.org, robliao@chromium.org
Bug: 813278
Change-Id: I455a2c8cff3aa9b1176567cc4a6409ff6ca6807d
Reviewed-on: https://chromium-review.googlesource.com/929061
Commit-Queue: Gabriel Charette <gab@chromium.org>
Reviewed-by: Robert Liao <robliao@chromium.org>
Cr-Commit-Position: refs/heads/master@{#538388}
CrOS-Libchrome-Original-Commit: fc40b1417b669d7df75b4d61491c521048027fda
2 files changed