[autotest] Force collection of /var/log if a test fails with a device error.

When a test running on the DUT is aborted and does not get a chance to run its
post-test hooks, the diff of /var/log cannot be copied to the results folder,
and autoserv is not able to collect any logs from the DUT.

This CL saves the device error failure flag to job.failed_with_device_error.
Autoserv uses this flag to determine whether to collect crash info (through
server/control_segments/crashinfo), which will collect all files in /var/log.
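
The flag-based gating described above can be sketched roughly as follows. This
is only an illustration, not the actual autotest code: DeviceError, Job,
run_test and needs_crashinfo are hypothetical names, and only the attribute
failed_with_device_error comes from this CL.

```python
class DeviceError(Exception):
    """Hypothetical stand-in for a DUT-level failure, e.g. an unexpected reboot."""


class Job(object):
    """Hypothetical, heavily simplified job object (an assumption for this sketch)."""

    def __init__(self):
        # Attribute name taken from this CL; defaults to "no device error seen".
        self.failed_with_device_error = False

    def run_test(self, test):
        """Run a test callable and record whether it died with a device error."""
        try:
            test()
        except DeviceError:
            # Remember the failure so a later step can decide to collect logs.
            self.failed_with_device_error = True
            raise


def needs_crashinfo(job):
    """Return True when crash info (including /var/log) should be collected."""
    return job.failed_with_device_error
```

The point of keeping the flag on the job object is that the decision to
collect can be made after the test has already been aborted, when the
post-test hooks on the DUT never ran.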

BUG=chromium:271703
TEST=Run autoserv in a local setup. Manually reboot the DUT in the middle of a
test, then confirm that the collected results have content in
crashinfo.[DUT name].

Change-Id: I1a3757b8933fe60deea75728e867033eeb86c7cd
Reviewed-on: https://gerrit.chromium.org/gerrit/66013
Commit-Queue: Dan Shi <dshi@chromium.org>
Reviewed-by: Dan Shi <dshi@chromium.org>
Tested-by: Dan Shi <dshi@chromium.org>
diff --git a/server/crashcollect.py b/server/crashcollect.py
index 38a3b26..a8ac8a5 100644
--- a/server/crashcollect.py
+++ b/server/crashcollect.py
@@ -1,7 +1,9 @@
-import os, time, pickle, logging, shutil
+import os, time, logging, shutil
 
 from autotest_lib.client.common_lib import global_config
+from autotest_lib.client.cros import constants
 from autotest_lib.server import utils
+from autotest_lib.site_utils.graphite import stats
 
 
 # import any site hooks for the crashdump and crashinfo collection
@@ -32,6 +34,11 @@
         collect_command(host, "dmesg", os.path.join(crashinfo_dir, "dmesg"))
         collect_uncollected_logs(host)
 
+        # Collect everything in /var/log.
+        log_path = os.path.join(crashinfo_dir, 'var')
+        os.makedirs(log_path)
+        collect_log_file(host, constants.LOG_DIR, log_path)
+
 
 # Load default for number of hours to wait before giving up on crash collection.
 HOURS_TO_WAIT = global_config.global_config.get_config_value(
@@ -54,6 +61,7 @@
     logging.info("Waiting %s hours for %s to come up (%s)",
                  hours_to_wait, host.hostname, current_time)
     if not host.wait_up(timeout=hours_to_wait * 3600):
+        stats.Counter('collect_crashinfo_timeout').increment()
         logging.warning("%s down, unable to collect crash info",
                         host.hostname)
         return False