| This document describes the operation of the test scheduling framework in |
| the pounder30 package. This document reflects pounder30 as of 2011-08-09. |
| |
| Authors: |
| Darrick Wong <djwong@us.ibm.com> |
| Lucy Liang <lgliang@us.ibm.com> |
| |
| Copyright (C) 2011 IBM. |
| |
| Contents |
| ======== |
| 1. Overview |
| 2. Test Files |
| 3. Build Scripts |
| 4. Test Scripts |
| 5. Scheduling Tests |
| 6. Running Tests Repeatedly |
| 7. The Provided Test Schedulers |
| 8. Creating Your Own Test Schedulers |
| 9. Including and Excluding Tests |
| |
| Overview |
| ======== |
| The scheduler in the original pounder release was too simplistic--it would kick |
| off every test simultaneously. There was no attempt to ramp up the |
| machine's stress levels test by test, or to run only certain combinations, or |
| even run the tests one by one before beginning the real load testing. |
| |
| In addition, the test scripts had a very simple pass/fail mechanism--failure |
| was defined by a kernel panic/oops/bug, and passing was defined by the lack of |
| that condition. There was no attempt to find soft failures--situations where |
| a test program would fail, but without bringing the machine down. The test |
| suite would not alert the user that these failures had occurred. |
| |
| Consequently, Darrick Wong rewrote the test scheduling framework to achieve |
| several goals--first, to separate the test automation code from the tests |
| themselves, to allow for more intelligent scheduling of tests, to give better |
| summary reports of what passed (and what didn't), and finally to improve the |
| load testing that this suite could do. |
| |
| Test Files |
| ========== |
| Each test should only need to provide three files: |
| |
| 1) build_scripts/<testname> |
| - The build_scripts/ directory contains scripts that take care of checking for |
| system requirements, downloading the relevant packages and binaries, and building |
| any code necessary to run the subtests. See the "Build Scripts" section below for |
| more information. |
| |
| 2) test_scripts/<testname> |
| - The test_scripts/ directory contains scripts that take care of running the actual tests. |
| See the "Test Scripts" section below for more information. |
| |
| 3) tests/.../[T|D]XX<testname> |
| - The tests/ directory holds our unpacked "test scheduler". If your tests/ |
| directory is empty, you haven't unpacked any test schedulers yet and will |
| need to run "make install" to unpack one (see "The Provided Test Schedulers" |
| section for more information). The test_repo/ directory also provides an example of what |
| an unpacked test scheduler should look like. The files in the tests/ directory are |
| usually symlinks that point to files in test_scripts/. The order in which the subtests are |
| run upon starting pounder depends on how the files in tests/ are named and organized. |
| See the "Scheduling Tests" section below for more information. |
| |
| Note: <testname> should be the same in the build_scripts/, test_scripts/, and tests/ folders. |
| (Example: build_scripts/subtest1, test_scripts/subtest1, and tests/D99subtest1 would be valid. |
| build_scripts/subtest1, test_scripts/subtest1_different, and tests/D99subtest1 would not.) |
| See "Scheduling Tests" below for a detailed description of naming rules for files in the tests/ |
| directory. |
| |
| Build Scripts |
| ============= |
| As the name implies, a script in build_scripts/ is in charge of downloading |
| and building whatever bits of code are necessary to make the test run. |
| |
| Temporary files needed to run a test should go in $POUNDER_TMPDIR. Third-party source, |
| packages, and binaries should go in $POUNDER_OPTDIR. Third-party packages can be fetched |
| from the web or from a user-created cache: a web-accessible directory containing |
| cached tarballs and any other files the build needs. |
| (See "$POUNDER_CACHE" in doc/CONFIGURATION for more information.) |
| |
| Should there be a failure in the build script that is essential to the ability |
| to run a test, the build script should exit with error to halt the main build |
| process immediately. |
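| |
| As a minimal sketch of that rule, a build script might check its requirements up |
| front and fail early. The check_tools helper and the tool names below are |
| illustrative assumptions, not part of pounder: |

```shell
#!/bin/sh
# Sketch of the requirement check in a build_scripts/<testname> file.
# check_tools is a hypothetical helper; the tool names are placeholders.

# Return non-zero if any required tool is missing from $PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "build: missing required tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# A real build script would stop the main build on failure, e.g.:
#   check_tools gcc make || exit 1
```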
| |
| Also, be aware that distributing pre-built binary tarballs is not always a good |
| idea. One could cache pre-built binaries rather than source, but distros are not |
| always good at ABI/library path compatibility, despite the efforts of LSB, FHS, |
| etc. It is always safest to build your subtests from source on your target system. |
| |
| The build_scripts/ directory provides some examples. |
| |
| Test Scripts |
| ============ |
| A script in test_scripts/ is in charge of running the actual test. |
| |
| The requirements on test scripts are pretty light. First, the building of the |
| test ought to go in the build script unless it's absolutely necessary to build |
| a test component at run time. Any checking for system requirements should also |
| go in the build script. |
| |
| Second, the script must catch SIGTERM and clean up after itself. SIGTERM is |
| used by the test scheduler to stop tests. |
| |
| The third requirement is much more stringent: return codes. The script should |
| return 0 to indicate success, 1-254 to indicate failure (the common use is to |
| signify the number of failures), and -1 or 255 to indicate that there was |
| a failure that cannot be fixed. |
| |
| Note: If a test is being run in a timed or infinite loop (see the |
| "Running Tests Repeatedly" section below for details), returning -1 or 255 |
| has the effect of cancelling all subsequent loops. |
| |
| Quick map of return codes to what gets reported: |
| 0 = "PASS" |
| -1 = "ABORT" |
| 255 = "ABORT" |
| anything else = "FAIL" |
| |
| Also note: If a test is killed by an unhandled signal, the test is reported as |
| failing. |
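| |
| The mapping can be restated as a small shell function. This is only an illustration: |
| report_status is a hypothetical name, and the scheduler's own reporting code may be |
| organized differently. |

```shell
# Hypothetical helper: map a test's exit status to the label that gets reported.
report_status() {
    case "$1" in
        0)      echo "PASS" ;;
        # A process that exits with -1 is seen by its parent as 255,
        # because only the low 8 bits of the status are kept.
        -1|255) echo "ABORT" ;;
        *)      echo "FAIL" ;;
    esac
}

report_status 0      # prints PASS
report_status 3      # prints FAIL
report_status 255    # prints ABORT
```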
| |
| Put any temporary files created during test run in $POUNDER_TMPDIR. |
| |
| The test_scripts/ directory provides some examples. |
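| |
| The requirements above can be put together in a skeleton like the one below. It is a |
| sketch under assumptions: every function and file name is hypothetical, the /tmp |
| fallback exists only so the sketch runs outside pounder, and the exit status used |
| after SIGTERM is a guess because this document does not specify one. |

```shell
#!/bin/sh
# Sketch of a test_scripts/<testname> file. All names here are hypothetical.

# Scratch file in $POUNDER_TMPDIR; /tmp is a fallback for trying the sketch.
SCRATCH="${POUNDER_TMPDIR:-/tmp}/example_test.$$"

cleanup() {
    rm -f "$SCRATCH"
}

# Turn a failure count into an exit status: 0 = pass, 1-254 = failures.
# 255 is reserved for "abort", so larger counts are capped at 254.
failures_to_status() {
    if [ "$1" -gt 254 ]; then
        echo 254
    else
        echo "$1"
    fi
}

run_test() {
    failures=0
    # Stand-in subtests: replace "true" and "false" with the real workload.
    for cmd in true false true; do
        "$cmd" > "$SCRATCH" 2>&1 || failures=$((failures + 1))
    done
    cleanup
    return "$(failures_to_status "$failures")"
}

# The scheduler stops tests with SIGTERM, so clean up before exiting.
# Assumption: exiting 0 here; the source does not say which status to use.
trap 'cleanup; exit 0' TERM

# A real script would end with:  run_test; exit $?
```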
| |
| Scheduling Tests |
| ================ |
| Everything under the tests/ directory is used for scheduling purposes. The current |
| test scheduler borrows a System V rc script-like structure for specifying how and |
| when tests should be run. Files under tests/ should have names that follow this |
| standard: |
| |
| [type][sequence number][name] |
| |
| "type" is the type of test. Currently, there are two types, 'D' and 'T'. 'T' |
| signifies a test, which means that the scheduler starts the test, waits for the |
| test to complete, and reports on its exit status. 'D' signifies a daemon |
| "test", which is to say that the scheduler will start the test, let it run in |
| the background, and kill it when it's done running all the tests in that |
| directory. |
| |
| The "sequence number" dictates the order in which the tests are run. 00 goes |
| first, 99 goes last. Tests with the same number are started simultaneously, |
| regardless of the type. |
| |
| "name" is just a convenient mnemonic to distinguish between tests. However, |
| it should be the same as the corresponding name used in build_scripts/ and |
| test_scripts/. (A test with build script "build_scripts/subtest" and |
| test script "test_scripts/subtest" should be defined as something like |
| "tests/T00subtest" as opposed to "tests/T00whatever_i_feel_like") |
| |
| Test names must be unique! |
| |
| File system objects under the tests/ directory can be nearly anything-- |
| directories, symbolic links, or files. The test scheduler will not run |
| anything that doesn't have the execute bit set. If a FS object is a |
| directory, then the contents of the directory are executed sequentially. |
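| |
| To make the naming rule concrete, the sketch below splits a file name into its three |
| parts. parse_test_name is a hypothetical helper, and it assumes the sequence number |
| is always exactly two digits, as the 00-99 range above suggests. |

```shell
# Hypothetical helper: split "[type][sequence number][name]" into its parts.
# Prints "type sequence name", or returns 1 if the name does not fit the rule.
parse_test_name() {
    base=${1##*/}                    # strip any leading directories
    case "$base" in
        [DT][0-9][0-9]?*) ;;         # D or T, two digits, non-empty name
        *) return 1 ;;
    esac
    rest=${base#?}                   # everything after the type letter
    name=${rest#??}                  # everything after the two digits
    echo "${base%"$rest"} ${rest%"$name"} $name"
}

parse_test_name tests/T02dir/T00gav   # prints: T 00 gav
parse_test_name D00stats              # prints: D 00 stats
```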
| |
| Example: |
| |
| Let's examine the following test scheduler hierarchy: |
| |
| tests/ |
| D00stats |
| T01foo |
| T01bar |
| T02dir/ |
| T00gav -> ../../test_scripts/gav |
| T01hic -> ../../test_scripts/hic |
| T03lat |
| |
| Let's see how the tests are run. The test scheduler will start off by scanning |
| the tests/ directory. First it spawns D00stats and lets it run in the |
| background. Next, T01foo and T01bar are launched at the same time; the |
| scheduler will wait for both of them to complete before proceeding. Since T01foo |
| is a file and not just a symbolic link, there is a fair chance that T01foo runs |
| some test in a loop for a certain amount of time. In any case, the scheduler |
| next sees T02dir and proceeds into it. |
| |
| In T02dir/, we find two test scripts. First T00gav runs, followed by |
| T01hic. Now there are no more tests to run in T02dir/, so the scheduler heads |
| back up to the parent directory. T03lat is forked and allowed to run to |
| completion, after which D00stats is killed, and the test suite exits. |
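| |
| The launch order within a single directory can be previewed with a short sketch. |
| launch_groups is a hypothetical helper that only groups names by sequence number; it |
| does not run anything and does not descend into subdirectories. |

```shell
# Hypothetical helper: group entry names by their two-digit sequence number.
# Entries printed on the same line would be started at the same time.
launch_groups() {
    printf '%s\n' "$@" | LC_ALL=C sort -k1.2,1.3 | awk '
        {
            seq = substr($0, 2, 2)          # characters 2-3 hold the sequence
            if (seq != prev) {
                if (NR > 1) printf "\n"
                printf "%s:", seq
                prev = seq
            }
            printf " %s", $0
        }
        END { if (NR > 0) printf "\n" }'
}

launch_groups T03lat T01foo D00stats T01bar T02dir
# prints:
# 00: D00stats
# 01: T01bar T01foo
# 02: T02dir
# 03: T03lat
```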
| |
| Running Tests Repeatedly |
| ======================== |
| Two helper programs are provided to run tests repeatedly: timed_loop and infinite_loop. |
| (This version of pounder also includes a fancy_timed_loop.c file, but it is only |
| meant to be used for the random_syscall test and will most likely be merged with |
| timed_loop.c in the future, so we ignore it here.) |
| |
| 1. timed_loop |
| |
| timed_loop [-m max_failures] duration_in_seconds command [arguments] |
| |
| This program will run "command" with the given arguments repeatedly |
| until the number of seconds given as "duration_in_seconds" has passed or the |
| command has failed a total of "max_failures" times, whichever comes first. |
| If the -m option is not given and the $MAX_FAILURES variable is set (defined in |
| config, see CONFIGURATION for details), the failure limit is $MAX_FAILURES |
| instead. |
| |
| 2. infinite_loop |
| |
| infinite_loop [-m max_failures] command [arguments] |
| |
| This program runs "command" repeatedly until sent SIGTERM or the |
| command has failed a total of "max_failures" times. If the -m option is not |
| given and the $MAX_FAILURES variable is set (defined in config, see |
| CONFIGURATION for details), the failure limit is $MAX_FAILURES instead. |
| |
| Examples: |
| |
| 1. test_repo/T90ramp/D02build_kernel contains the following line: |
| |
| "$POUNDER_HOME/infinite_loop $POUNDER_HOME/test_scripts/build_kernel" |
| |
| which will run the build_kernel test script repeatedly until sent SIGTERM |
| or until it has failed a total of $MAX_FAILURES times. |
| |
| "$POUNDER_HOME/infinite_loop -m 10 $POUNDER_HOME/test_scripts/build_kernel" |
| |
| would run the build_kernel test script repeatedly until sent SIGTERM or |
| until it has failed 10 times, regardless of what $MAX_FAILURES is. |
| |
| 2. test_scripts/time_drift contains the following line: |
| |
| "$POUNDER_HOME/timed_loop 900 "$POUNDER_SRCDIR/time_tests/drift-test.py" $NTP_SERVER $FREQ" |
| |
| which will run the drift-test.py script ($NTP_SERVER and $FREQ are some args passed to drift-test.py) |
| for 15 minutes or until it has failed a total of $MAX_FAILURES times. |
| |
| "$POUNDER_HOME/timed_loop -m 10 900 "$POUNDER_SRCDIR/time_tests/drift-test.py" $NTP_SERVER $FREQ" |
| |
| would run the drift-test.py script for 15 minutes or until it has failed 10 times, regardless of |
| what $MAX_FAILURES is. |
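| |
| A wrapper of this kind can be generated rather than written by hand. The sketch below |
| is an assumption-laden example: make_timed_wrapper is a hypothetical helper, and the |
| use of "exec" (so that SIGTERM reaches timed_loop directly) is a design choice of the |
| sketch, not something the shipped wrappers are known to do. |

```shell
# Hypothetical helper: write an executable wrapper that runs a test script
# under timed_loop.
#   $1 = wrapper path (e.g. tests/T05subtest)
#   $2 = duration in seconds
#   $3 = script name under test_scripts/
make_timed_wrapper() {
    {
        printf '#!/bin/sh\n'
        # $POUNDER_HOME stays unexpanded here; it is resolved when the
        # wrapper runs.
        printf 'exec "$POUNDER_HOME/timed_loop" %s "$POUNDER_HOME/test_scripts/%s"\n' "$2" "$3"
    } > "$1"
    # The scheduler skips anything without the execute bit set.
    chmod +x "$1"
}

# Example: make_timed_wrapper tests/T05subtest 900 subtest
```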
| |
| The Provided Test Schedulers |
| ============================ |
| This version of pounder provides 3 test schedulers: the "default," "fast," and "test" test schedulers. |
| The tarred versions can be found in the schedulers/ directory as default-tests.tar.gz, fast-tests.tar.gz, |
| and test-tests.tar.gz respectively. |
| |
| To unpack a test scheduler, run "make install" in the pounder/ directory and enter the name of the |
| scheduler you would like to unpack at the first prompt. |
| |
| Example of unpacking the "fast" test scheduler: |
| |
| # make install |
| ./Install |
| Looking for tools...make g++ lex gcc python wget sudo diff patch egrep rm echo test which cp mkdir . |
| All tools were found. |
| WHICH TEST SCHEDULER SETUP DO YOU WANT TO UNPACK? |
| [Choose from: |
| default-tests.tar.gz |
| fast-tests.tar.gz |
| test-tests.tar.gz] |
| [Or simply press ENTER for the default scheduler] |
| Scheduler selection: fast |
| |
| Descriptions of the provided test schedulers: |
| |
| 1. default - provides a general purpose stress test, runs for 48 hours unless the -d option |
| is used when starting pounder. |
| 2. fast - basically the same as default, except it runs for 12 hours by default. |
| 3. test - provides a set of useless tests. Each test simply passes, fails, aborts, or sleeps for |
| some period of time. They don't do anything useful but can be used to see how |
| the test scheduling setup works. |
| |
| Creating Your Own Test Schedulers |
| ================================= |
| From the pounder directory, place the desired tests in the tests/ directory according to |
| the rules described in the "Scheduling Tests" section above. Then run the following command: |
| |
| ./pounder -c name_of_scheduler |
| |
| to create a new test scheduler, which will be tarred as name_of_scheduler-tests.tar.gz and |
| placed in the schedulers/ directory. |
| |
| Example Usage: |
| |
| # ls ./schedulers |
| default-tests.tar.gz fast-tests.tar.gz test-tests.tar.gz |
| |
| # ls ./tests |
| T00hwinfo |
| |
| # ./pounder -c new_sched |
| |
| # ls ./schedulers |
| default-tests.tar.gz fast-tests.tar.gz new_sched-tests.tar.gz test-tests.tar.gz |
| |
| After unpacking the "new_sched" test scheduler during install, the tests/ directory should |
| contain the T00hwinfo subtest along with a tests/excluded/ directory (see the "Including and |
| Excluding Tests" section below for details regarding the tests/excluded directory). |
| |
| Including and Excluding Tests |
| ============================= |
| After unpacking the test scheduler and building each individual test, running |
| "./pounder" will automatically run every test included in the tests folder. If you |
| would like to run only ONE test, run "./pounder ./tests/<some subtest>". If you would |
| like to run only a subset of the tests, you can use the "./pounder -e" option to |
| exclude certain subtests from subsequent pounder runs: |
| |
| Example: |
| |
| Suppose you have already run "make install" and unpacked the default test scheduler. |
| The tests/ directory should now contain the subtests to be run. |
| |
| 1) ./pounder -l |
| - lists all of the subtests that came with the currently active test scheduler. |
| The output should look something like: |
| |
| ------------------ |
| #./pounder -l |
| Included subtests: |
| ... |
| .../ltp-full-xxxxxxxx/tools/pounder/tests/T10single/T00xterm_stress |
| .../ltp-full-xxxxxxxx/tools/pounder/tests/T00hwinfo |
| ... |
| |
| Excluded subtests: |
| [NONE] |
| ------------------ |
| |
| 2) ./pounder -e "tests/T10single/T00xterm_stress tests/T00hwinfo" |
| - will exclude T00xterm_stress and T00hwinfo from any subsequent pounder runs. |
| This command essentially moves the two tests from the "tests" folder to the |
| "tests/excluded" folder for temporary storage, where they will remain until |
| re-included back into the test scheduler (this is also why all test names |
| should be unique). A file "tests/excluded/testlist" keeps track of which tests |
| have been excluded from the test scheduler and what their original paths were. |
| |
| 3) ./pounder -l |
| - should now output something like: |
| |
| ------------------ |
| #./pounder -l |
| Included subtests: |
| ... |
| |
| Excluded subtests: |
| T00xterm_stress |
| T00hwinfo |
| ------------------ |
| |
| 4) ./pounder -i "T00xterm_stress T00hwinfo" - will re-include these subtests back into |
| the test scheduler. They will be moved from the tests/excluded folder back into |
| the tests folder under their original paths. |