Add cleanup pathway for execution_subdir crosbug.com/31595.
For some as-yet-unknown reason, the scheduler ends up trying to recover
a job that has no valid execution_jobdir path.
This patch adds a pathway to clean up such entries by aborting them and logging
the database's current state along with the id of each host queue entry that
was updated.
When the scheduler finds these issues it will _still_ crash. After some
investigation, bailing out and waiting for the scheduler babysitter to restart
it seemed to be the sanest solution until the nature of this bug is better
understood. This avoids potentially creating a snowballing problem by mucking
with the database's state. It is also preferred because it will still inform
us via email that the problem occurred, rather than silently logging the
issues locally on the server.
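The flow described above can be sketched roughly as follows. This is only an
illustration of the abort, log, then bail out sequence, not the code in this
patch: QueueEntry, SchedulerError and clean_up_invalid_entries are hypothetical
names, and the in-memory list stands in for the host_queue_entries table.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("scheduler_sketch")


class SchedulerError(Exception):
    """Hypothetical error raised so the process exits and can be restarted."""


@dataclass
class QueueEntry:
    # Hypothetical stand-in for a row in host_queue_entries.
    id: int
    execution_subdir: str
    aborted: bool = False


def clean_up_invalid_entries(entries):
    """Abort entries with no execution subdir, log them, then bail out."""
    # Entries aborted on an earlier run are skipped, so a restart can proceed.
    bad = [e for e in entries if not e.execution_subdir and not e.aborted]
    if not bad:
        return
    for entry in bad:
        entry.aborted = True
        logger.error("Aborted host queue entry %d: no valid execution path",
                     entry.id)
    # Still crash after cleaning up, so the failure is reported rather than
    # only logged locally.
    raise SchedulerError("Aborted entries: %s" % [e.id for e in bad])
```

In this sketch the first call raises after aborting the bad entries, and a
second call over the same entries returns normally, which mirrors the
crash-then-recover behavior described above.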
BUG=chromium-os:31595
TEST=Loaded a known bad database state and started the scheduler.
Observed the scheduler eventually recovering properly.
Added fake data into the host_queue_entries table that causes this sort of
behavior randomly during the scheduler's execution, ensuring that it properly
recovered.
Change-Id: I8d22c1964dc60dc119c9d1a90815543c12ad7a94
Reviewed-on: https://gerrit.chromium.org/gerrit/27450
Tested-by: Scott Zawalski <scottz@chromium.org>
Reviewed-by: Chris Sosa <sosa@chromium.org>
Commit-Ready: Scott Zawalski <scottz@chromium.org>
Reviewed-by: Scott Zawalski <scottz@chromium.org>
1 file changed