1. dd77e01 by jamesren · 14 years ago
  2. 2566356 Fix process counting for SelfThrottledPostJobTask. Would previously by jamesren · 14 years ago
  3. 76fcf19 Add ability to associate drone sets with jobs. This restricts a job to by jamesren · 14 years ago
  4. b7c5d27 monitor_db.py: Fix SyntaxWarning by lmr · 14 years ago
  5. 3bc70a1 Reset host status to READY on aborting a WAITING entry by jamesren · 14 years ago
  6. 37b5045 Fixes to drone_manager behavior. by jamesren · 14 years ago
  7. 47bd737 Set hostless queue entries to STARTING upon scheduling the agent. This by jamesren · 14 years ago
  8. e0cbc91 Add support to autoserv for a --control-filename parameter, to allow users to by mbligh · 14 years ago
  9. dd85524 Abstract out common models used in the frontend's models.py so that django is not required to interact with non Django portions of the code. by jamesren · 14 years ago
  10. a3a2841 Adding "executable" property to scheduler unittests by jamesren · 14 years ago
  11. b55378a Part of http://patchwork.test.kernel.org/patch/1850/ didn't make it into the by jamesren · 14 years ago
  12. e21bf41 Minor fix to new metahost handlers code in scheduler to ensure handlers get a tick every cycle, even if there are no queued metahost jobs. by jamesren · 14 years ago
  13. 675bfe7 Change email of hosts without platforms to a warning in the logs. We don't do anything about this and it is just extra spam for us at this point in time. by jamesren · 14 years ago
  14. 138785a Add a site_monitor_db_babysitter module. If found, it will use its by jamesren · 14 years ago
  15. c44ae99 Refactor scheduler models into a separate module, scheduler_models. This module doesn't depend on monitor_db, only the other way around. The separation and isolation of dependencies should help us organize the scheduler code a bit better. by jamesren · 14 years ago
  16. 883492a First iteration of pluggable metahost handlers. This change adds the basic framework and moves the default, label-based metahost assignment code into a handler. It includes some refactorings to the basic scheduling code to make things a bit cleaner. by jamesren · 14 years ago
  17. 4b0eb53 When archiving results, we need to append a slash to the path to ensure it gets correctly handled as a directory. by showard · 14 years ago
  18. c6fb604 Ensure we reset pidfile age when the pidfile is read. I had dropped the call to register_pidfile() from get_pidfile_info() in my previous change, but now I realize the purpose of it was to reset the pidfile age. by showard · 15 years ago
  19. 5c114c7 Fix scheduler functional test for recent change to parse hostless jobs. by showard · 15 years ago
  20. 0164be3 Don't implicitly register pidfiles when get_pidfile_contents() is called. The scheduler is now registering and unregistering pidfiles correctly on its own, and this was causing files to get accidentally re-registered after being unregistered, causing pidfile leaks. by showard · 15 years ago
  21. cc92936 Basic support for "summary results" -- artificial test results that are explicitly recorded by a server-side control file or code that it calls. This CL just adds the record_summary() method to the server_job object. It lacks any special parser support or TKO DB changes; those will come later. by showard · 15 years ago
  22. fd8b89f don't set the current user to my_user in frontend_test_utils. let it default to the new autotest_system user. by showard · 15 years ago
  23. 7e67b43 New code for performing explicit joins with custom join conditions. by showard · 15 years ago
  24. 4076c63 In scheduler check for existence of results before trying to write the .archiver_failed file. by showard · 15 years ago
  25. c1a98d1 Support for job keyvals by showard · 15 years ago
  26. 1b7142d * fix a bug with restricted drone users config parsing by showard · 15 years ago
  27. be030fb In periodic reverification, use schedule_special_task() instead of straight object creation. This is the right path to use for creating tasks -- it includes duplication avoidance and automatic owner tagging. by showard · 15 years ago
  28. e1575b5 When the archiver fails for any reason, write a .archiver_failed file to the results dir. by showard · 15 years ago
  29. 948eb30 Construct an absolute path to the archiving control file when running the Archiving stage. Using a relative path was just silly and lazy and prone to breakage. by showard · 15 years ago
  30. 64a9595 When using Django models from a script, make the current user default to an actual database user named "autotest_system". This allows for simpler, more consistent code. by showard · 15 years ago
  31. 38b28bf Don't try to offload results if the results_host is localhost. This was causing duplicate results for normal (single-host) setups. by showard · 15 years ago
  32. 8dbd05a Implement periodic reverification of dead hosts, configurable in global_config. Implemented as part of the periodic cleanup, so the frequency of reverification is bounded by the periodic cleanup interval. I felt this would be acceptable and putting this in the existing cleanup class makes things more nicely organized. by showard · 15 years ago
  33. 12b4558 Massive permission fix by lmr · 15 years ago
  34. 4608b00 Add a new Archiving stage to the scheduler, which runs after Parsing. This stage is responsible for copying results to the results server in a drone setup, a task currently performed directly by the scheduler, and allows for site-specific archiving functionality, replacing the site_parse functionality. It does this by running autoserv with a special control file (scheduler/archive_results.control.srv), which loads and runs code from the new scheduler.archive_results module. The implementation was mostly straightforward, as the archiving stage is fully analogous to the parser stage. I did make a couple of refactorings: by mbligh · 15 years ago
  35. 2b38f67 Add test case for aborting a synchronous job while it's throttled in the Starting state. Was trying to repro a bug. It doesn't repro, indicating that maybe the bug has already been fixed (or maybe this test case is missing something). Either way, it's good to have another test case around. by showard · 15 years ago
  36. 78f5b01 Update to Django 1.1.1. I want to use a new feature for my RESTful interface prototyping (direct inclusion of URL patterns in URLconfs). by showard · 15 years ago
  37. eab66ce Rename the tables in the databases, by prefixing the app name. This is by showard · 15 years ago
  38. 402934a Clear the Django connection query log after each tick. This was a major memory leak. by showard · 15 years ago
  39. f13a9e2 Add periodic CPython garbage collector statistics logging to aid in by showard · 15 years ago
  40. 493beaa fix a bug with pre-job keyvals, introduced in recent refactorings, and added new test to check it by showard · 15 years ago
  41. f65b740 Fix a rather brittle scheduler unit test by showard · 15 years ago
  42. a9545c0 backend support for hostless jobs by showard · 15 years ago
  43. d349624 Fix DroneManager._drop_old_pidfiles() -- use items() instead of iteritems() to avoid concurrent modification exception. Added unit test. by showard · 15 years ago
  44. 2ca64c9 * add a couple simple test cases to the scheduler functional test for metahosts by showard · 15 years ago
  45. d119565 Make drone_manager track running processes counts using only the information passed in from the scheduler. Currently it also uses process counts derived from "ps", but that is an unreliable source of information. This improves accuracy and consistency and gives us full control over the process. by showard · 15 years ago
  46. b21b8c8 Fix handling of database reconnects in the scheduler by enhancing the "django" database_connection backend and having the scheduler use it. This eliminates the duplicate connection that the scheduler was setting up -- now it uses only a single connection (the Django one). by showard · 15 years ago
  47. d07a5f3 The check for enough pending hosts after the delay to wait for others to by showard · 15 years ago
  48. 418785b Some improvements to process tracking in the scheduler. by showard · 15 years ago
  49. 9bb960b Support restricting access to drones by user. Administrators can put lines like by showard · 15 years ago
  50. e60e44e Special tasks show "Failed" as their status instead of "Completed" if by showard · 15 years ago
  51. 1b0ffc3 Address shutil.copy() failure when running a scheduler instance without by showard · 15 years ago
  52. 7ca9e01 Remove the synch_job_start_timeout_minutes scheduler "feature" as it is by showard · 15 years ago
  53. a21b949 Added functional test for recovering jobs with atomic hosts, with HQEs by showard · 15 years ago
  54. 65db393 * impose prioritization on SpecialTasks based on task type: Repair, then Cleanup, then Verify. remove prioritization of STs with queue entry over those without. this leads to more sane ordering of execution in certain unusual contexts -- the added functional test cases illustrate a few (in some cases, it's not just more sane, it eliminates bugs as well). by showard · 15 years ago
  55. 7b2d7cb We never considered the handling of DO_NOT_VERIFY hosts in certain situations. This adds handling of those cases to the scheduler and adds tests to the scheduler functional test. by showard · 15 years ago
  56. 4a60479 add a bunch of tests to the scheduler functional test to cover pre- and post-job cleanup, including failure cases by showard · 15 years ago
  57. 37757f3 Change "unrecovered active host queue entries" to be a more accurate by showard · 15 years ago
  58. ac5b000 * get rid of the code to create the drone temp dir in drones.py. This used to be necessary because we needed that directory just to run drone_utility (so we could put the pickle file there). But now we use stdin, so we don't need this anymore. (drone_utility still initializes the temp dir for its own use.) by showard · 15 years ago
  59. 202343e On the results drone, execute code from the results dir. by showard · 15 years ago
  60. 2aafd90 Need to get the drone temporary directory under the results dir as well. Added unit tests to check this and to check the behavior of attach_file_to_execution, which was being affected by this bug (but wasn't actually buggy itself). by showard · 15 years ago
  61. c75fded Fix the drone results dir computation. I forgot that the results don't just go under the drone_installation_directory, they go under "results" in there. by showard · 15 years ago
  62. 093a068 Added string stdin support to utils.BgJob and all its users that give it by jadmanski · 15 years ago
  63. 8375ce0 Fix unindexable object error raised on the error path within by showard · 15 years ago
  64. 42d4498 Use drone_installation_dir for all activities on drones, including results dirs and temp dirs. Previously it would use the drone_installation_dir for executing drone_utility, but would use the scheduler results dir for everything else. by showard · 15 years ago
  65. 786da9a Escalate to a SIGKILL in DroneUtility.kill_process() if the SIGTERM didn't work by showard · 15 years ago
  66. b890045 In scheduler recovery, allow Running HQEs with no process. The tick code already handles them fine (by re-executing Autoserv), but the recovery code was explicitly disallowing them. With this change, it turns out there's only one status that's not allowed to go unrecovered -- Verifying -- so I changed the code to reflect that and I made the failure conditions more accurate. by showard · 15 years ago
  67. 5682407 Added more logging, and fixed logging in HostQueueEntry.set_status() by showard · 15 years ago
  68. 0db3d43 Recheck queue entry status in Dispatcher._get_unassigned_entries() by showard · 15 years ago
  69. d201482 When a delayed call task finishes waiting for extra hosts to enter by showard · 15 years ago
  70. dae680a Ignore microsecond differences in datetimes when checking existing in by showard · 15 years ago
  71. e55955f Rewrite a conditional that was very confusing to me. by showard · 15 years ago
  72. f85a0b7 Explicitly release pidfiles after we're done with them. This does it in a kind of lazy way, but it should work just fine. Also extended the new scheduler functional test with a few more cases and added a test to check pidfile release under these various cases. In the process, I changed how some of the code works to allow the tests to more cleanly express their intentions. by showard · 15 years ago
  73. 34ab099 beginnings of a new scheduler functional test. this aims to test the entire monitor_db.py file holistically, made possible by the fact that monitor_db.py is already isolated from all direct system access through drone_manager (this was a necessary separation for distributed scheduling). by mocking out the entire drone_manager, as well as other major dependencies (email manager, global config), and filling a test database, we can allow the dispatcher to execute normally and allow it to interact with all the other code in monitor_db. at the end, we can check the state of the database and the drone_manager, and (probably most importantly, given the usual failure mode of the scheduler) we can ensure no exceptions get raised from monitor_db. by showard · 15 years ago
  74. 8d3dbca Make the maximum number of refreshes before forgetting a pidfile by showard · 15 years ago
  75. ec6a3b9 Make the pidfile timeout in the scheduler configurable. Raise the by showard · 15 years ago
  76. 0c5c18d Changed error message to be more useful by showard · 15 years ago
  77. d791dcb Give all scheduler launched child processes a mark and check for by showard · 15 years ago
  78. 828fc4c Make assertion in _choose_group_to_run non-fatal and log an error by showard · 15 years ago
  79. b6a186f Email notification currently relies on an MTA installed by showard · 15 years ago
  80. db50276 Write host keyvals for all verify/cleanup/repair tasks. by showard · 15 years ago
  81. 775300b Cleanups on hosts marked DO_NOT_VERIFY should continue to run as if they by showard · 15 years ago
  82. dabf6cf It is okay for hosts to have multiple atomic group labels so long as all by showard · 15 years ago
  83. b593fa8 Prevent email_manager from hiding exceptions when sending email fails. by showard · 15 years ago
  84. 8cc058f Make scheduler more stateless. Agents are now scheduled only by the by showard · 15 years ago
  85. 8de3713 Renamed process_is_alive to program_is_alive. by showard · 15 years ago
  86. cdaeae8 Fixed bug where scheduler would crash if the autoserv process is lost by showard · 15 years ago
  87. 4ac4754 Don't mark HQEs as Failed before the GatherLogsTask and the by showard · 15 years ago
  88. 6631273 Make a bunch of stuff executable by mbligh · 15 years ago
  89. 549afad Added pid file checks to monitor_db and monitor_db_babysitter, so that by showard · 15 years ago
  90. 70a294f Don't expect aborted "Pending" entries to be recovered. They'll be immediately picked up by _find_aborting() so they don't need to be recovered. by showard · 15 years ago
  91. 58721a8 One-off fix to address the issue where a scheduler shutdown immediately by showard · 15 years ago
  92. 3739978 Instrument the drone manager to allow debugging why it lost track of by showard · 15 years ago
  93. 6bba3d1 Don't assert if we were unable to load the pidfile in num_tests_failed. by showard · 15 years ago
  94. e8e3707 Treat unrecoverable host queue entries as a fatal error. Their existence by showard · 15 years ago
  95. 6d1c143 Fix scheduler's handling of jobs when the PID file can't be found. by showard · 15 years ago
  96. 708b352 Do not go through a DelayedCallTask on atomic group jobs when all Hosts by showard · 15 years ago
  97. 9b6ec50 Turn an assertion into a more useful error message. by showard · 15 years ago
  98. 1ef218d This is the result of a batch reindent.py across our tree. by mbligh · 15 years ago
  99. 5fa9e11 By default, only warn when orphaned autoservs are found by mbligh · 15 years ago
  100. 6fbdb80 Change print msg to logging.error(msg) so that we actually get the error in the scheduler log about the scheduler not being enabled. by mbligh · 15 years ago