[autotest] Include the time created as a uid in task results.
Since tasks will run directly on shards, there is some chance
that a shard will fail over and its successor will clobber the
dead shard's special task logs in Google Storage.
To avoid this scenario, include the creation time of the
task as a uid in the path to the logs. Unless one of the
shards is living in the past, these timestamps should never
repeat. Moreover, we take measures against clock drift (ntpd),
and if a shard has the wrong system time all its jobs will
time out anyway, since each job has a TTL based on
time_created, which is set on the master.
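
For illustration, a minimal sketch of the resulting
disambiguation (the hostname, task id, and timestamps below are
hypothetical):

    # Two shards that reuse task id 42 for the same host would
    # collide under the old scheme; appending the request
    # timestamp as a uid keeps their result paths distinct.
    old_path = 'hosts/%s/%s-%s' % ('host1', 42, 'verify')
    first_shard = '%s/%s' % (old_path, '20141104093000')
    second_shard = '%s/%s' % (old_path, '20141104101500')
    assert first_shard != second_shard
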
TEST=Ran suites.
BUG=chromium:423225,chromium:425347
DEPLOY=scheduler,apache
Change-Id: Ia23ac8fd721f53fbb9b475c8eb9f8d25e4fd1c2f
Reviewed-on: https://chromium-review.googlesource.com/228781
Tested-by: Prashanth B <beeps@chromium.org>
Reviewed-by: Dan Shi <dshi@chromium.org>
Commit-Queue: Prashanth B <beeps@chromium.org>
diff --git a/frontend/afe/models.py b/frontend/afe/models.py
index 305d331..ff8c28b 100644
--- a/frontend/afe/models.py
+++ b/frontend/afe/models.py
@@ -1792,9 +1792,46 @@
     def execution_path(self):
-        """@see HostQueueEntry.execution_path()"""
-        return 'hosts/%s/%s-%s' % (self.host.hostname, self.id,
-                                   self.task.lower())
+ """Get the execution path of the SpecialTask.
+
+ This method returns different paths depending on where a
+ the task ran:
+ * Master: hosts/hostname/task_id-task_type
+ * Shard: Master_path/time_created
+ This is to work around the fact that a shard can fail independent
+ of the master, and be replaced by another shard that has the same
+ hosts. Without the time_created stamp the logs of the tasks running
+ on the second shard will clobber the logs from the first in google
+ storage, because task ids are not globally unique.
+
+ @return: An execution path for the task.
+ """
+        results_path = 'hosts/%s/%s-%s' % (self.host.hostname, self.id,
+                                           self.task.lower())
+
+        # If we do this on the master it will break backward
+        # compatibility, as there are tasks that currently don't have
+        # timestamps. If a host or job has been sent to a shard, the
+        # rpc for that host/job will be redirected to the shard, so
+        # this global_config check will happen on the shard the logs
+        # are on.
+        is_shard = global_config.global_config.get_config_value(
+                'SHARD', 'shard_hostname', type=str, default='')
+        if not is_shard:
+            return results_path
+
+        # Generate a uid to disambiguate special task result
+        # directories in case this shard fails. The simplest uid is
+        # the job_id; however, in rare cases tasks do not have jobs
+        # associated with them (e.g. frontend verify), so just use the
+        # request timestamp (time_requested). The clocks between a
+        # shard and the master should always be in sync; any
+        # discrepancies will be brought to our attention in the form
+        # of job timeouts.
+        uid = self.time_requested.strftime('%Y%m%d%H%M%S')
+
+        # TODO: This is a hack, but it is the easiest way to achieve
+        # correctness. There is currently some debate over the future
+        # of tasks in our infrastructure, and refactoring everything
+        # right now isn't worth the time.
+        return '%s/%s' % (results_path, uid)
     # property to emulate HostQueueEntry.status