  1. f3294cc Move clean up functions into separate file/classes by mbligh · 15 years ago
  2. 27f3387 Ensure exception information from monitor_db goes to logs. by showard · 15 years ago
  3. 50e463b Add a check for AUTOTEST_SCHEDULER_LOG_DIR by showard · 15 years ago
  4. f2839f6 Change killing %d to %s by showard · 15 years ago
  5. c9895aa Move monitor_db_babysitter to use utils.run to start monitor_db with an environment variable for monitor_db's logs. by mbligh · 15 years ago
  6. fb67603 Add write_pid to common code; call write_pid from scheduler and babysitter by mbligh · 15 years ago
  7. 7629f14 by showard · 15 years ago
  8. 205fd60 by showard · 16 years ago
  9. ccbd6c5 Ensure RepairTasks aren't associated with the queue entries that spawned them, so that if the QE is aborted during repair the repair task will continue running (and just leave the QE alone from then on). by showard · 16 years ago
  10. b18134f As discussed on the mailing list, we implemented logging with a single by showard · 16 years ago
  11. 89f84db by showard · 16 years ago
  12. cca334f by showard · 16 years ago
  13. a3c5857 a) Reduce the number of instances of DBObject classes created for the same row by showard · 16 years ago
  14. 35162b0 by showard · 16 years ago
  15. de700d3 by showard · 16 years ago
  16. 6ae5ea9 by showard · 16 years ago
  17. 25cbdbd by showard · 16 years ago
  18. a5cb406 by mbligh · 16 years ago
  19. a038235 by showard · 16 years ago
  20. 73ec044 by showard · 16 years ago
  21. d9ac445 by showard · 16 years ago
  22. 678df4f by showard · 16 years ago
  23. 8bcd23a Move all MySQLdb imports after the 'import common' so that a MySQLdb by mbligh · 16 years ago
  24. 6bb7c29 by showard · 16 years ago
  25. de634ee by showard · 16 years ago
  26. c9ae178 by showard · 16 years ago
  27. 6adf837 Fail quickly if we are accidentally started as root by mbligh · 16 years ago
  28. ade14e2 by showard · 16 years ago
  29. 324bf81 by showard · 16 years ago
  30. 67831ae by showard · 16 years ago
  31. 78d4d97 by showard · 16 years ago
  32. 0205a3e by showard · 16 years ago
  33. 2fa5169 by showard · 16 years ago
  34. 4fd61be by showard · 16 years ago
  35. c5afc46 by showard · 16 years ago
  36. c408c5e by showard · 16 years ago
  37. 55b4b54 by showard · 16 years ago
  38. 4f9e537 by showard · 16 years ago
  39. d1ee1dd * move some scheduler config options into a separate module, scheduler_config by showard · 16 years ago
  40. 170873e Attached is a very large patch that adds support for running a by showard · 16 years ago
  41. 37eceaa Add entries to the config file to control which server is used rather by mbligh · 16 years ago
  42. 6355f6b by showard · 16 years ago
  43. ac9ce22 Only schedule jobs that are "Queued". Now that state "Parsing" is an active=complete=0 state, we need to explicitly check for this. by showard · 16 years ago
  44. ff059d7 Don't abort running entries from synch start timeout (only queued/starting/verifying/pending ones). by showard · 16 years ago
  45. d876f45 gps pointed out that "== and != work in most cases but it's better to use is by mbligh · 16 years ago
  46. c85c21b * allow scheduler email "from" address to be specified in global config by showard · 16 years ago
  47. e58e3f8 Set HQEs to "Verifying" instead of "Starting" when we're about to run verify on them. We need to set them to an active status, but if we use "Starting" then we can't tell which stage they're in, and we need that information to know when to "stop" synchronous jobs. by showard · 16 years ago
  48. cbd7461 When aborting a running job, write an INFO line to the status.log. by showard · 16 years ago
  49. 8fe93b5 Make CleanupTask copy results to job dir on failure. Did this by extracting code from VerifyTask into a common superclass. by showard · 16 years ago
  50. e788ea6 -make get_group_entries() return a list instead of a generator, since all callers want it that way anyway by showard · 16 years ago
  51. e77ac67 Set queue entries to "Starting" when the VerifyTask is created for them. This perennial source of problems cropped up again in the latest change to the job.run() code (as part of the synch_count changes). by showard · 16 years ago
  52. 2bab8f4 Implement synch_count. The primary change here is replacing the job.synch_type field with a synch_count field. There is no longer just a distinction between synchronous and asynchronous jobs. Instead, every job has a synch_count, with synch_count = 1 corresponding to the old concept of asynchronous jobs. This required: by showard · 16 years ago
  53. 036e4be This file no longer serves a purpose. The rest of the old scheduler by mbligh · 16 years ago
  54. 9d9ffd5 don't reboot hosts when aborting inactive jobs. by showard · 16 years ago
  55. 6198f1d When a synch job fails and we stop other entries, set the host back to "Ready" if it was "Pending". Otherwise it'll sit in state "Pending" forever. by showard · 16 years ago
  56. 45ae819 Add a formal cleanup phase to the scheduler flow. by showard · 16 years ago
  57. 8ebca79 -fix running process accounting in scheduler. Dispatcher.num_running_processes() already excludes Agents that are done, so we don't need to subtract their processes off. by showard · 16 years ago
  58. fa8629c -ensure Django connection is autocommit enabled, when used from monitor_db by showard · 16 years ago
  59. 97aed50 Rewrite final reparse code in scheduler. The final reparse is now handled by a separate AgentTask, and there's a "Parsing" status for queue entries. This is a cleaner implementation that allows us to still implement parse throttling with ease and get proper recovery of reparses after a system crash fairly easily. by showard · 16 years ago
  60. a3ab0d5 -change AFE abort code to always set to "Abort" status and never skip straight to "Aborted". Doing so is prone to a race condition with the scheduler. The scheduler handles non-active "Abort" entries perfectly already, setting them immediately to "Aborted" without trying to kill anything. by showard · 16 years ago
  61. 9886397 Add job start timeout for synchronous jobs. This timeout applies to synchronous jobs that are holding a public pool machine (i.e. in the Everyone ACL) as "Pending". This includes a new global config option, scheduler code to enforce the timeout and a unit test. by showard · 16 years ago
  62. b2ccdda Change location of set_status('Starting') line. This just got put in the wrong place when I refactored the job.run() code, and it wasn't getting run at all for asynchronous jobs. by showard · 16 years ago
  63. e05654d Ensure results directories always get created for asynchronous multimachine jobs (previously they wouldn't for jobs with run_verify=False). by showard · 16 years ago
  64. 3dd6b88 Two simple scheduler fixes: by showard · 16 years ago
  65. 0fc3830 Add user preferences for reboot options, including simple user preferences tab which could later be expanded to include more options. by showard · 16 years ago
  66. 21baa45 Add options to control reboots before and after a job. by showard · 16 years ago
  67. 1be9743 -fix bug with handling abort on unassigned host queue entries by showard · 16 years ago
  68. 364fe86 Refactor the basic environment setup code out of django_test_utils.py into setup_django_environment.py, and rename django_test_utils.py to setup_test_environment.py. Also changed the environment setup code to run at import time. This makes it easy for scripts, both test and non-test, to use Django models without running through manage.py. The idea is that scripts will import setup_django_environment before importing Django code (somewhat akin to common.py), and test code will subsequently import setup_test_environment. by showard · 16 years ago
  69. cfd66a3 Make scheduler set host status to "Pending" when there's a pending queue entry against the host. by showard · 16 years ago
  70. 9976ce9 -make monitor_db implement "skip verify" properly, and add unit tests for it by showard · 16 years ago
  71. b2e2c32 -refactor Job.run in monitor_db, one of the most important and most confusing methods in the scheduler. it's now broken into separate synchronous and asynchronous paths with common methods extracted. by showard · 16 years ago
  72. 12bc8a8 The scheduler unit test needs to pass in a created_on time. by showard · 16 years ago
  73. b1e5187 Get the scheduler unittest to run against SQLite! by showard · 16 years ago
  74. 442e71e Move migration system into database/ directory. by showard · 16 years ago
  75. 0e73c85 Add a generic database wrapper, supporting different database backends, to be used by migrate, scheduler, parser (eventually), and maybe others. This will consolidate the multiple database wrappers we have throughout the code and allow us to swap in SQLite for MySQL for unit testing purposes. by showard · 16 years ago
  76. c993bee The scheduler has some overly verbose (debug) logging...kill it. by mbligh · 16 years ago
  77. c0e24fb A script for automatically restarting the scheduler when it dies or becomes unresponsive. This script will start a monitor_db.py instance and watch its logs. If monitor_db stalls for an amount of time defined at the top of the file (2 hours in this case), the babysitter will kick it. Also, if the process dies, the babysitter will restart it. by mbligh · 16 years ago
  78. f7fa2cc Update the scheduler and the parser to use the new aborted_* attributes that by jadmanski · 16 years ago
  79. 989f25d two new major features: by showard · 16 years ago
  80. 50c0e71 -add --force option to migrations to disable user confirmation, because the confirmation prompt can make migrations unscriptable by showard · 16 years ago
  81. 7d182aa Handled exceptions caused by email sending functions. Prints log messages to by showard · 16 years ago
  82. 542e840 Added email_list field to front end. On job completion emails on this by showard · 16 years ago
  83. d8e548a make scheduler write host keyval files at the beginning of the job. presently the only keyval that's written is a list of host labels. by showard · 16 years ago
  84. 4c5374f -modify scheduler throttling code to track number of running processes rather than just number of running agents. note this is only an estimate of running processes - it counts all agents as one process unless the agent is a synchronous autoserv execution, in which case it uses the number of hosts being run. by showard · 16 years ago
  85. 970a6db Rate limit the final parse of the scheduler. If more than 100 or so run at a time, it will bring mysql to its knees (for no good reason...all actions are on different jobs). by showard · 16 years ago
  86. 849a0f6 Invalid SQL is created if you have one-time hosts but no 'real' hosts by mbligh · 16 years ago
  87. ccb86d7 revert an earlier change to when exactly we set the 'Starting' status. this was breaking synchronous jobs and I'm not sure what the reason for it was. it doesn't seem to have been necessary. by showard · 16 years ago
  88. 1f27e99 Remove another old & obsolete file. by jadmanski · 16 years ago
  89. 63a3477 -Refactor new monitor_db scheduling algorithm into its own class by showard · 16 years ago
  90. b95b1bd Rewrite the scheduling algorithm yet again. This time, we make separate DB queries to get all the queued host queue entries and all the ready hosts, and then match them up in Python. We could still do the non-metahosts the old way, but we might as well just do it all uniformly, so I've completely eliminated the old code. by showard · 16 years ago
  91. 56193bb -add basic abort functionality test to scheduler unit tests. this by showard · 16 years ago
  92. 3d9899a Provides a mechanism in the UI to choose to skip the verification stage. by showard · 16 years ago
  93. 7e26d62 Fixed the job timeouts. Jobs should no longer time out early. by mbligh · 16 years ago
  94. dd70371 I left some debugging code in monitor_db_unittest.py. This goes with a patch I sent out a few minutes ago, so it should be applied after it. It was a patch to monitor_db_unittest as well. by mbligh · 16 years ago
  95. 3e0f7e0 Need changes to fix the monitor_db unittest by mbligh · 16 years ago
  96. 542537f Normalize the --host-protection name, since autoserv is somewhat by jadmanski · 16 years ago
  97. c160352 Fixed the logic in the scheduler unit tests. Checks that the command by mbligh · 16 years ago
  98. fb2a7fa Adding new columns "locked_by_id" and "lock_time" to the hosts table, to by showard · 16 years ago
  99. 909c7a6 Initial release of test auto importer by showard · 16 years ago
  100. fb7cfb1 Add support to the scheduler to pass in the host.protection value as by jadmanski · 16 years ago