1. b2e2c32 -refactor Job.run in monitor_db, one of the most important and most confusing methods in the scheduler. it's now broken into separate synchronous and asynchronous paths with common methods extracted. by showard · 16 years ago
  2. 12bc8a8 The scheduler unit test needs to pass in a created_on time. by showard · 16 years ago
  3. b1e5187 Get the scheduler unittest to run against SQLite! by showard · 16 years ago
  4. 442e71e Move migration system into database/ directory. by showard · 16 years ago
  5. 0e73c85 Add a generic database wrapper, supporting different database backends, to be used by migrate, scheduler, parser (eventually), and maybe others. This will consolidate the multiple database wrappers we have throughout the code and allow us to swap in SQLite for MySQL for unit testing purposes. by showard · 16 years ago
  6. c993bee The scheduler has some overly verbose (debug) logging...kill it. by mbligh · 16 years ago
  7. c0e24fb A script for automatically restarting the scheduler when it dies or becomes unresponsive. This script will start a monitor_db.py instance and watch its logs. If monitor_db stalls for an amount of time defined in the top of the file (2 hours in this case), the babysitter will kick it. Also, if the process dies, the babysitter will restart it. by mbligh · 16 years ago
  8. f7fa2cc Update the scheduler and the parser to use the new aborted_* attributes that by jadmanski · 16 years ago
  9. 989f25d two new major features: by showard · 16 years ago
  10. 50c0e71 -add --force option to migrations to disable user confirmation because this can make migrations unscriptable by showard · 16 years ago
  11. 7d182aa Handled exceptions caused by email sending functions. Prints log messages to by showard · 16 years ago
  12. 542e840 Added email_list field to front end. On job completion emails on this by showard · 16 years ago
  13. d8e548a make scheduler write host keyval files at the beginning of the job. presently the only keyval that's written is a list of host labels. by showard · 16 years ago
  14. 4c5374f -modify scheduler throttling code to track number of running processes rather than just number of running agents. note this is only an estimate of running processes - it counts all agents as one process unless the agent is a synchronous autoserv execution, in which case it uses the number of hosts being run. by showard · 16 years ago
  15. 970a6db Rate limit the final parse of the scheduler. If more than 100 or so run at a time, it will bring mysql to its knees (for no good reason...all actions are on different jobs). by showard · 16 years ago
  16. 849a0f6 Invalid SQL is created if you have one-time hosts but no 'real' hosts by mbligh · 16 years ago
  17. ccb86d7 revert an earlier change to when exactly we set the 'Starting' status. this was breaking synchronous jobs and I'm not sure what the reason for it was. it doesn't seem to have been necessary. by showard · 16 years ago
  18. 1f27e99 Remove another old & obsolete file. by jadmanski · 16 years ago
  19. 63a3477 -Refactor new monitor_db scheduling algorithm into its own class by showard · 16 years ago
  20. b95b1bd Rewrite the scheduling algorithm yet again. This time, we make separate DB queries to get all the queued host queue entries and all the ready hosts, and then match them up in Python. We could still do the non-metahosts the old way, but we might as well just do it all uniformly, so I've completely eliminated the old code. by showard · 16 years ago
  21. 56193bb -add basic abort functionality test to scheduler unit tests. this by showard · 16 years ago
  22. 3d9899a Provides a mechanism in the UI to choose to skip the verification stage. by showard · 17 years ago
  23. 7e26d62 Fixed the job timeouts. Jobs should no longer time out early. by mbligh · 17 years ago
  24. dd70371 I left some debugging code in monitor_db_unittest.py. This goes with a patch I sent out a few minutes ago, so it should be applied after it. It was a patch to monitor_db_unittest as well. by mbligh · 17 years ago
  25. 3e0f7e0 Need changes to fix the monitor_db unittest by mbligh · 17 years ago
  26. 542537f Normalize the --host-protection name, since autoserv is somewhat by jadmanski · 17 years ago
  27. c160352 Fixed the logic in the scheduler unit tests. Checks that the command by mbligh · 17 years ago
  28. fb2a7fa Adding new columns "locked_by_id" and "lock_time" to the hosts table, to by showard · 17 years ago
  29. 909c7a6 Initial release of test auto importer by showard · 17 years ago
  30. fb7cfb1 Add support to the scheduler to pass in the host.protection value as by jadmanski · 17 years ago
  31. df06256 Adding protection levels to hosts. Allows the user to specify how much by showard · 17 years ago
  32. 5df2b19 Updating the RPC interface and scheduler unit tests to match up with by showard · 17 years ago
  33. b8471e3 Added a new input that allows users to specify a one-time host when by showard · 17 years ago
  34. 3bb499f Adding a timeout field to the "Create Job" tab, modified the create_job by showard · 17 years ago
  35. f8c624d If a job is marked as Abort/Aborting/Aborted, do not change its status by mbligh · 17 years ago
  36. f40cf53 Fixed the monitor_db_unittest to be more robust. When checking that the command line is correct should by mbligh · 17 years ago
  37. b376bc5 Fix a bug introduced into recovery code in my refactoring for testability. PidfileRunMonitor.run() was being called on a path when it shouldn't have been. by showard · 17 years ago
  38. 70feeee Needed to fix problems caused by the use of the old import usage which has by mbligh · 17 years ago
  39. 4eaaf52 Add distinct to query to cut time spent in half by showard · 17 years ago
  40. 0afbb63 Convert all python code to use four-space indents instead of eight-space tabs. by jadmanski · 17 years ago
  41. 3182b33 minor refactorings to scheduler to make it more testable. the corresponding unit test changes seem to have gone in with some other change. by showard · 17 years ago
  42. 3d161b0 Move the mock libraries from client/unittest into client/common_lib/test_utils. by jadmanski · 17 years ago
  43. 20f4706 -check ACLs directly in the scheduler (bypassing ineligible_host_queues) by showard · 17 years ago
  44. b751751 Add __init__.py file. by showard · 17 years ago
  45. 04c82c5 Rewrite scheduling algorithm to use two queries + some data processing, rather than a separate query for each "idle" host. This should be considerably faster. It also gives us the opportunity to eliminate the whole ACL checking with ineligible_host_queues thing, which has been a nightmare. But one step at a time... by showard · 17 years ago
  46. ce38e0c The beginning of a unit test for the scheduler. Right now it only tests the job scheduling algorithm (i.e. Dispatcher._find_more_work() and the methods it uses). by showard · 17 years ago
  47. 30eed1f A bit of refactoring to monitor_db.py to clean up some code and make it more testable. by showard · 17 years ago
  48. 93ff7ea Rename monitor_db to monitor_db.py. This makes it import-able, which is necessary for unit testing. by showard · 17 years ago
  49. 5492748 Add distinct to query to cut time spent in half by showard · 17 years ago
  50. 57881ee It occurred to me that because of the change to batch up emails, if an exception occurs that kills the scheduler, it wouldn't send out the email. Fixed that. by showard · 17 years ago
  51. c2ac77f Risk: Medium by jadmanski · 17 years ago
  52. 7cf9a9b Batch up notification emails within a single tick, and send em out all together. by showard · 17 years ago
  53. ec11316 -make scheduler monitor number of running tasks and keep it limited to some maximum, set in global config by showard · 17 years ago
  54. a093972 Every time we modify ACLs we have to recompute ineligible host queues. We can't do that by deleting the old ones and then writing the new ones, since there would be a moment when the hosts are unprotected. So instead we write the new ones and then delete the old ones, which leaves a moment when there might be duplicate ineligible_host_queues. This is harmless, but the scheduler was asserting that there were never duplicates (just for safety I guess, since that used to be true), so I removed the assertion and made the code handle duplicates. by showard · 17 years ago
  55. e44a46d notify_email is a global config parameter which monitor_db reads out. by mbligh · 17 years ago
  56. 62ba2ed -include acl-inaccessible hosts in ineligible_host_queues blocks. by mbligh · 17 years ago
  57. 5244cbb Never delete hosts or labels. Instead, mark them as invalid. by mbligh · 17 years ago
  58. 6437ff5 Use the new parser library directly inside of autoserv, instead of by mbligh · 17 years ago
  59. cadb353 Fix bug in logging to host logs. by mbligh · 17 years ago
  60. d64e570 Tested by scheduling jobs against machines that had Repair Failed status and aborting jobs to cause RebootTask to be called and observing that tmp directories get created and deleted, and no directories are created in root autotest directory. by mbligh · 17 years ago
  61. 1b87bc5 Modify all the common.py to set up an autotest_lib.* namespace as well by mbligh · 17 years ago
  62. 90a549d -when no pidfile is found by PidfileRunMonitor, just wait, and after a timeout, send email and act as if process failed by mbligh · 17 years ago
  63. b03ba64 Patch to reduce the rate of reparse, and to make the parser locking by mbligh · 17 years ago
  64. 4eb2df2 Specify a boolean string converter for MySQLdb. Some older versions of MySQLdb do not include this, and it breaks monitor_db. by mbligh · 17 years ago
  65. bb42185 This patch enables the scheduler to pick up jobs that were left running after it crashes, and see them to completion. by mbligh · 17 years ago
  66. 104e9ce This fixes some issues with global_config relating to the fact by mbligh · 17 years ago
  67. 38c2d03 Changed sighandler in autoserv to call SIGKILL on its children instead of SIGTERM. by mbligh · 17 years ago
  68. dbdac6c Continuously reparse the status logs whenever new logs are written out by mbligh · 17 years ago
  69. 16c722d Remove ReverifyTask altogether, and trust the return code of autoserv repair (which does a reverify itself). The flowchart on the wiki is updated. by mbligh · 17 years ago
  70. d5c9580 Implemented abort functionality in scheduler. by mbligh · 17 years ago
  71. dffd637 scheduler release hosts by mbligh · 17 years ago
  72. 48c10a5 We don't want to pass -n to autoserv for host-specific tasks (verify, repair, re by mbligh · 17 years ago
  73. e258668 Verify repair fixes for scheduler by mbligh · 17 years ago
  74. 6f8bab4 Catch any errors due to mysql losing its connection. If it does lose by mbligh · 17 years ago
  75. 8ce2c4a On verify failure for a synch job, stop all other queue_entries, not just active ones. by mbligh · 17 years ago
  76. 4314a71 Testing mode support for the scheduler. In testing mode, the scheduler runs a dummy autoserv script and doesn't try to parse results. This is part of an ongoing project to create an automated scheduler test. by mbligh · 17 years ago
  77. b090f14 more on global config by mbligh · 17 years ago
  78. 36768f0 add missing monitor_db code by mbligh · 17 years ago
  79. c40fa92 Change ConmuxSSHHost to SSHHost by mbligh · 17 years ago
  80. 6203ace Split stdout and stderr for monitor_queue into separate files, by mbligh · 17 years ago
  81. b7ef301 Comment fix to monitor queue plus ignore non-dirs in queue dir by mbligh · 17 years ago
  82. af0b811 Make scheduler stuff executable by mbligh · 17 years ago
  83. 8dcb745 Various fixes and updates for monitor_queue by mbligh · 17 years ago
  84. a4649af kill existing monitor queues before restarting by mbligh · 17 years ago
  85. 88d3256 Queues now have a .machines file by mbligh · 17 years ago
  86. dcc0499 The new monitor_queue script works much like the old one but now by mbligh · 18 years ago