1. 4608b00 Add a new Archiving stage to the scheduler, which runs after Parsing. This stage is responsible for copying results to the results server in a drone setup, a task currently performed directly by the scheduler, and allows for site-specific archiving functionality, replacing the site_parse functionality. It does this by running autoserv with a special control file (scheduler/archive_results.control.srv), which loads and runs code from the new scheduler.archive_results module. The implementation was mostly straightforward, as the archiving stage is fully analogous to the parser stage. I did make a couple of refactorings: by mbligh · 15 years ago
  2. 2b38f67 Add test case for aborting a synchronous job while it's throttled in the Starting state. Was trying to repro a bug. It doesn't repro, indicating that maybe the bug has already been fixed (or maybe this test case is missing something). Either way, it's good to have another test case around. by showard · 15 years ago
  3. 78f5b01 Update to Django 1.1.1. I want to use a new feature for my RESTful interface prototyping (direct inclusion of URL patterns in URLconfs). by showard · 15 years ago
  4. 493beaa fix a bug with pre-job keyvals, introduced in recent refactorings, and added new test to check it by showard · 15 years ago
  5. a9545c0 backend support for hostless jobs by showard · 15 years ago
  6. 2ca64c9 * add a couple simple test cases to the scheduler functional test for metahosts by showard · 15 years ago
  7. d119565 Make drone_manager track running processes counts using only the information passed in from the scheduler. Currently it also uses process counts derived from "ps", but that is an unreliable source of information. This improves accuracy and consistency and gives us full control over the process. by showard · 15 years ago
  8. 418785b Some improvements to process tracking in the scheduler. by showard · 15 years ago
  9. 9bb960b Support restricting access to drones by user. Administrators can put lines like by showard · 15 years ago
  10. a21b949 Added functional test for recovering jobs with atomic hosts, with HQEs by showard · 15 years ago
  11. 65db393 * impose prioritization on SpecialTasks based on task type: Repair, then Cleanup, then Verify. remove prioritization of STs with queue entry over those without. this leads to more sane ordering of execution in certain unusual contexts -- the added functional test cases illustrate a few (in some cases, it's not just more sane, it eliminates bugs as well). by showard · 15 years ago
  12. 7b2d7cb We never considered the handling of DO_NOT_VERIFY hosts in certain situations. This adds handling of those cases to the scheduler and adds tests to the scheduler functional test. by showard · 15 years ago
  13. 4a60479 add a bunch of tests to the scheduler functional test to cover pre- and post-job cleanup, including failure cases by showard · 15 years ago
  14. b890045 In scheduler recovery, allow Running HQEs with no process. The tick code already handles them fine (by re-executing Autoserv), but the recovery code was explicitly disallowing them. With this change, it turns out there's only one status that's not allowed to go unrecovered -- Verifying -- so I changed the code to reflect that and I made the failure conditions more accurate. by showard · 15 years ago
  15. f85a0b7 Explicitly release pidfiles after we're done with them. This does it in a kind of lazy way, but it should work just fine. Also extended the new scheduler functional test with a few more cases and added a test to check pidfile release under these various cases. In the process, I changed how some of the code works to allow the tests to more cleanly express their intentions. by showard · 15 years ago
  16. 34ab099 beginnings of a new scheduler functional test. this aims to test the entire monitor_db.py file holistically, made possible by the fact that monitor_db.py is already isolated from all direct system access through drone_manager (this was a necessary separation for distributed scheduling). by mocking out the entire drone_manager, as well as other major dependencies (email manager, global config), and filling a test database, we can allow the dispatcher to execute normally and allow it to interact with all the other code in monitor_db. at the end, we can check the state of the database and the drone_manager, and (probably most importantly, given the usual failure mode of the scheduler) we can ensure no exceptions get raised from monitor_db. by showard · 15 years ago
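
The SpecialTask ordering described in commit 65db393 (Repair, then Cleanup, then Verify, with no preference for tasks that have a queue entry) might look roughly like the sketch below. This is an illustration only, not the scheduler's code: the `SpecialTask` tuple, `TASK_PRIORITY`, and `order_special_tasks` are hypothetical names, and breaking ties by id is an assumption the commit message does not state.

```python
from collections import namedtuple

# Hypothetical stand-in for a SpecialTask row; the real model has more fields.
SpecialTask = namedtuple("SpecialTask", ["id", "task", "queue_entry_id"])

# Lower number runs first: Repair, then Cleanup, then Verify.
TASK_PRIORITY = {"Repair": 0, "Cleanup": 1, "Verify": 2}


def order_special_tasks(tasks):
    """Order tasks by type only; queue_entry_id is deliberately ignored.

    Ties within a type fall back to id order (an assumption for this sketch).
    """
    return sorted(tasks, key=lambda t: (TASK_PRIORITY[t.task], t.id))


pending = [
    SpecialTask(1, "Verify", 10),     # has a queue entry, gets no preference
    SpecialTask(2, "Cleanup", None),
    SpecialTask(3, "Repair", None),
    SpecialTask(4, "Verify", None),
]
ordered = order_special_tasks(pending)
print([(t.id, t.task) for t in ordered])
```

Because the sort key ignores `queue_entry_id`, the Verify task attached to a queue entry is not promoted ahead of the Repair and Cleanup tasks.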