[autotest] Fix parent/child and multiple aborts.
The queue_entry_id is a primary key in afe_aborted_host_queue_entries, which
should preclude us from trying to abort the same job multiple times. A duplicate
abort can still happen if we abort a parent job (which aborts its children),
and then abort a child in that set before the scheduler sets the complete bit.
The complete bit is set when the hqe is actually aborted, so depending on where
the scheduler is in the tick, this leaves a ~30 second window within which the
second abort of the child will lead to an IntegrityError. This window is much
longer if the HQE is already in PARSING, since the complete bit is not set
until the epilog of the PostJobTask.
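
The race can be reproduced outside autotest with a small sqlite3 sketch. The
tables `hqe` and `aborted_hqe` and the helpers `make_db` and `abort` are
hypothetical stand-ins for afe_host_queue_entries,
afe_aborted_host_queue_entries and the abort RPC; they are not the real schema
or code. The sketch assumes the abort path inserts one row per entry into the
aborted table and marks the entry as aborted.

```python
import sqlite3


def make_db():
    """Build a simplified, hypothetical version of the two tables."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE hqe ("
        "id INTEGER PRIMARY KEY, "
        "complete INTEGER NOT NULL DEFAULT 0, "
        "aborted INTEGER NOT NULL DEFAULT 0)"
    )
    # queue_entry_id is the primary key, so a second insert for the same
    # entry raises IntegrityError.
    db.execute("CREATE TABLE aborted_hqe (queue_entry_id INTEGER PRIMARY KEY)")
    db.executemany("INSERT INTO hqe (id) VALUES (?)", [(1,), (2,)])
    return db


def abort(db, ids, skip_aborted):
    """Abort the given entries; skip_aborted=True mimics the new filter."""
    placeholders = ",".join("?" * len(ids))
    sql = "SELECT id FROM hqe WHERE complete = 0 AND id IN (%s)" % placeholders
    if skip_aborted:
        sql += " AND aborted = 0"
    sql += " ORDER BY id"
    targets = [row[0] for row in db.execute(sql, list(ids))]
    for hqe_id in targets:
        db.execute(
            "INSERT INTO aborted_hqe (queue_entry_id) VALUES (?)", (hqe_id,)
        )
        db.execute("UPDATE hqe SET aborted = 1 WHERE id = ?", (hqe_id,))
    return targets
```

With skip_aborted=False (the old complete=False filter alone), aborting the
parent's entries and then one child again, before the complete bit is set, hits
the primary key. With skip_aborted=True the second call selects nothing.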
TEST=Aborted suites, then their children. Tried to abort complete jobs.
BUG=chromium:308010
DEPLOY=Apache
Change-Id: Ib112502e966f5575c523ac053e94d44bd92359d6
Reviewed-on: https://chromium-review.googlesource.com/174988
Tested-by: Prashanth B <beeps@chromium.org>
Reviewed-by: Alex Miller <milleral@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
diff --git a/frontend/afe/rpc_interface.py b/frontend/afe/rpc_interface.py
index 3702ad5..198f120 100644
--- a/frontend/afe/rpc_interface.py
+++ b/frontend/afe/rpc_interface.py
@@ -586,7 +586,11 @@
Abort a set of host queue entries.
"""
query = models.HostQueueEntry.query_objects(filter_data)
- query = query.filter(complete=False)
+
+ # Don't allow aborts on:
+ # 1. Jobs that have already completed (whether or not they were aborted)
+ # 2. Jobs that have already been aborted (but may not have completed)
+ query = query.filter(complete=False).filter(aborted=False)
models.AclGroup.check_abort_permissions(query)
host_queue_entries = list(query.select_related())
rpc_utils.check_abort_synchronous_jobs(host_queue_entries)