Fix a race condition between autotestd and autotestd_monitor. If the
monitor starts up faster than autotestd, it can try to open the
exit_code file before the file exists (or, in theory, it could lock the
file after autotestd creates it but before autotestd locks it).
To resolve this, have autotestd touch a "started" file after it creates
and locks the exit_code file, and have autotestd_monitor wait up to
thirty seconds for that file to exist before it goes ahead and opens
and locks the exit_code file. If the file never appears, the monitor
raises an exception. In practice a wait of a couple of seconds is
sufficient to avoid the race.
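For illustration, the autotestd side of the handshake might look roughly
like the sketch below. This is not the actual autotestd change (which is
not part of this diff); the helper name announce_started is hypothetical,
and the only details taken from this commit are the exit_code and
started file names and the ordering (create, lock, then touch).

```python
import fcntl
import os


def announce_started(logdir):
    """Hypothetical helper: lock exit_code, then signal readiness.

    Returns the open exit_code file so the caller keeps the lock alive
    for as long as the file object stays open.
    """
    # Create the exit_code file and take an exclusive lock on it first.
    exit_code_file = open(os.path.join(logdir, 'exit_code'), 'w')
    fcntl.flock(exit_code_file, fcntl.LOCK_EX)

    # Only once the lock is held, touch the "started" marker. A monitor
    # that waits for this marker cannot see a missing or unlocked
    # exit_code file.
    open(os.path.join(logdir, 'started'), 'w').close()
    return exit_code_file
```

The ordering is the whole point: if the marker were touched before the
lock was taken, the monitor could still win the race for the lock.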
Risk: Low
Visibility: Fixes a race condition between autotestd and
autotestd_monitor during client startup.
Signed-off-by: John Admanski <jadmanski@google.com>
git-svn-id: http://test.kernel.org/svn/autotest/trunk@2816 592f7852-d20e-0410-864c-8624ca9c26a4
diff --git a/client/bin/autotestd_monitor b/client/bin/autotestd_monitor
index 0393d79..59602c0 100644
--- a/client/bin/autotestd_monitor
+++ b/client/bin/autotestd_monitor
@@ -21,6 +21,15 @@
stdout_pump = launch_tail('stdout', sys.stdout, stdout_start)
stderr_pump = launch_tail('stderr', sys.stderr, stderr_start)
+# wait for logdir/started to exist to be sure autotestd is started
+start_time = time.time()
+started_file_path = os.path.join(logdir, 'started')
+while not os.path.exists(started_file_path):
+    time.sleep(1)
+    if time.time() - start_time >= 30:
+        raise Exception("autotestd failed to start in %s" % logdir)
+os.remove(started_file_path)
+
# watch the exit code file for an exit
exit_code_file = open(os.path.join(logdir, 'exit_code'))
fcntl.flock(exit_code_file, fcntl.LOCK_EX)