| Writing Python Regression Tests |
| ------------------------------- |
| Skip Montanaro |
| (skip@mojam.com) |
| |
| |
| Introduction |
| |
| If you add a new module to Python or modify the functionality of an existing |
| module, you should write one or more test cases to exercise that new |
| functionality. There are different ways to do this within the regression |
| testing facility provided with Python; any particular test should use only |
| one of these options. Each option requires writing a test module using the |
conventions of the selected option:
| |
| - PyUnit based tests |
| - doctest based tests |
| - "traditional" Python test modules |
| |
| Regardless of the mechanics of the testing approach you choose, |
| you will be writing unit tests (isolated tests of functions and objects |
| defined by the module) using white box techniques. Unlike black box |
| testing, where you only have the external interfaces to guide your test case |
| writing, in white box testing you can see the code being tested and tailor |
| your test cases to exercise it more completely. In particular, you will be |
| able to refer to the C and Python code in the CVS repository when writing |
| your regression test cases. |
| |
| |
| PyUnit based tests |
| |
| The PyUnit framework is based on the ideas of unit testing as espoused |
| by Kent Beck and the Extreme Programming (XP) movement. The specific |
| interface provided by the framework is tightly based on the JUnit |
Java implementation of Beck's original Smalltalk test framework. Please
| see the documentation of the unittest module for detailed information on |
| the interface and general guidelines on writing PyUnit based tests. |
| |
| The test_support helper module provides a single function for use by |
| PyUnit based tests in the Python regression testing framework: |
| run_unittest() takes a unittest.TestCase derived class as a parameter |
| and runs the tests defined in that class. All test methods in the |
| Python regression framework have names that start with "test_" and use |
| lower-case names with words separated with underscores. |
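
For example, a minimal PyUnit based test module might be shaped like the
sketch below (the spam module and the SpamTests class are hypothetical):

    import unittest
    import test_support

    class SpamTests(unittest.TestCase):
        # each test method name starts with "test_"
        def test_ham_count(self):
            import spam                  # hypothetical module under test
            self.assertEqual(spam.ham_count('abc'), 0)

    def test_main():
        # run_unittest() runs every test method defined on the class
        test_support.run_unittest(SpamTests)

    if __name__ == '__main__':
        test_main()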
| |
| |
| doctest based tests |
| |
| Tests written to use doctest are actually part of the docstrings for |
| the module being tested. Each test is written as a display of an |
| interactive session, including the Python prompts, statements that would |
| be typed by the user, and the output of those statements (including |
tracebacks, although only the exception message needs to be retained
in that case).
| The module in the test package is simply a wrapper that causes doctest |
to run over the tests in the module being tested. The test for the
difflib module provides a convenient example:
| |
| import difflib, test_support |
| test_support.run_doctest(difflib) |
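
The doctests themselves are simply a transcript of an interactive
session embedded in a docstring. For instance, the (hypothetical)
module under test might contain:

    def ham_count(s):
        """Return the number of times 'ham' occurs in s.

        >>> ham_count('ham and spam')
        1
        >>> ham_count('eggs')
        0
        """
        return s.count('ham')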
| |
| If the test is successful, nothing is written to stdout (so you should not |
| create a corresponding output/test_difflib file), but running regrtest |
| with -v will give a detailed report, the same as if passing -v to doctest. |
| |
| A second argument can be passed to run_doctest to tell doctest to search |
| sys.argv for -v instead of using test_support's idea of verbosity. This |
| is useful for writing doctest-based tests that aren't simply running a |
doctest'ed Lib module, but contain the doctests themselves. At times
you may then want to run such a test directly as a doctest, independent
| of the regrtest framework. The tail end of test_descrtut.py is a good |
| example: |
| |
| def test_main(verbose=None): |
| import test_support, test.test_descrtut |
| test_support.run_doctest(test.test_descrtut, verbose) |
| |
| if __name__ == "__main__": |
| test_main(1) |
| |
If run via regrtest, test_main() is called without specifying verbose,
and then test_support's idea of verbosity is used. But when
| run directly, test_main(1) is called, and then doctest's idea of verbosity |
| is used. |
| |
| See the documentation for the doctest module for information on |
| writing tests using the doctest framework. |
| |
| |
| "traditional" Python test modules |
| |
| The mechanics of how the "traditional" test system operates are fairly |
| straightforward. When a test case is run, the output is compared with the |
| expected output that is stored in .../Lib/test/output. If the test runs to |
completion and the actual and expected outputs match, the test succeeds; if
| not, it fails. If an ImportError or test_support.TestSkipped error is |
| raised, the test is not run. |
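
A skeletal "traditional" test module might be shaped like the sketch
below (the spam module and its functions are hypothetical); the
unconditional print statements produce the output that is compared
with the expected-output file:

    from test_support import verbose, TestSkipped, TestFailed
    import spam                   # hypothetical; ImportError skips the test

    if not spam.has_ham():        # hypothetical capability check
        raise TestSkipped, 'spam was built without ham support'

    print 'spam.ham'              # compared against output/test_spam
    if spam.ham(2) != 4:
        raise TestFailed, 'spam.ham(2) returned the wrong value'
    if verbose:
        # extra detail, only shown when regrtest is run with -v
        print 'spam.ham(2) =', spam.ham(2)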
| |
| |
| Executing Test Cases |
| |
| If you are writing test cases for module spam, you need to create a file |
| in .../Lib/test named test_spam.py. In addition, if the tests are expected |
| to write to stdout during a successful run, you also need to create an |
| expected output file in .../Lib/test/output named test_spam ("..." |
| represents the top-level directory in the Python source tree, the directory |
| containing the configure script). If needed, generate the initial version |
| of the test output file by executing: |
| |
| ./python Lib/test/regrtest.py -g test_spam.py |
| |
| from the top-level directory. |
| |
| Any time you modify test_spam.py you need to generate a new expected |
| output file. Don't forget to desk check the generated output to make sure |
| it's really what you expected to find! All in all it's usually better |
not to have an expected-output file (note that doctest- and unittest-based
| tests do not). |
| |
| To run a single test after modifying a module, simply run regrtest.py |
| without the -g flag: |
| |
| ./python Lib/test/regrtest.py test_spam.py |
| |
| While debugging a regression test, you can of course execute it |
| independently of the regression testing framework and see what it prints: |
| |
| ./python Lib/test/test_spam.py |
| |
| To run the entire test suite: |
| |
| [UNIX, + other platforms where "make" works] Make the "test" target at the |
| top level: |
| |
| make test |
| |
[WINDOWS] Run rt.bat from your PCBuild directory. Read the comments at
| the top of rt.bat for the use of special -d, -O and -q options processed |
| by rt.bat. |
| |
| [OTHER] You can simply execute the two runs of regrtest (optimized and |
| non-optimized) directly: |
| |
| ./python Lib/test/regrtest.py |
| ./python -O Lib/test/regrtest.py |
| |
| But note that this way picks up whatever .pyc and .pyo files happen to be |
| around. The makefile and rt.bat ways run the tests twice, the first time |
| removing all .pyc and .pyo files from the subtree rooted at Lib/. |
| |
| Test cases generate output based upon values computed by the test code. |
| When executed, regrtest.py compares the actual output generated by executing |
| the test case with the expected output and reports success or failure. It |
| stands to reason that if the actual and expected outputs are to match, they |
| must not contain any machine dependencies. This means your test cases |
| should not print out absolute machine addresses (e.g. the return value of |
| the id() builtin function) or floating point numbers with large numbers of |
| significant digits (unless you understand what you are doing!). |
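
For instance, a sketch of how machine-dependent values might be kept
out of the printed output (the spam module and its functions are
hypothetical):

    import spam                   # hypothetical module under test

    obj = spam.make_ham()
    # print the class name rather than repr(obj), which may embed an address
    print obj.__class__.__name__
    # limit the number of digits printed for a computed float
    print '%.6f' % spam.ham_ratio()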
| |
| |
| Test Case Writing Tips |
| |
| Writing good test cases is a skilled task and is too complex to discuss in |
| detail in this short document. Many books have been written on the subject. |
| I'll show my age by suggesting that Glenford Myers' "The Art of Software |
| Testing", published in 1979, is still the best introduction to the subject |
| available. It is short (177 pages), easy to read, and discusses the major |
| elements of software testing, though its publication predates the |
| object-oriented software revolution, so doesn't cover that subject at all. |
| Unfortunately, it is very expensive (about $100 new). If you can borrow it |
| or find it used (around $20), I strongly urge you to pick up a copy. |
| |
| The most important goal when writing test cases is to break things. A test |
| case that doesn't uncover a bug is much less valuable than one that does. |
| In designing test cases you should pay attention to the following: |
| |
| * Your test cases should exercise all the functions and objects defined |
| in the module, not just the ones meant to be called by users of your |
| module. This may require you to write test code that uses the module |
| in ways you don't expect (explicitly calling internal functions, for |
| example - see test_atexit.py). |
| |
| * You should consider any boundary values that may tickle exceptional |
| conditions (e.g. if you were writing regression tests for division, |
| you might well want to generate tests with numerators and denominators |
| at the limits of floating point and integer numbers on the machine |
performing the tests as well as a denominator of zero). A brief
sketch of a couple of such checks appears after this list.
| |
| * You should exercise as many paths through the code as possible. This |
| may not always be possible, but is a goal to strive for. In |
| particular, when considering if statements (or their equivalent), you |
| want to create test cases that exercise both the true and false |
| branches. For loops, you should create test cases that exercise the |
| loop zero, one and multiple times. |
| |
| * You should test with obviously invalid input. If you know that a |
| function requires an integer input, try calling it with other types of |
| objects to see how it responds. |
| |
| * You should test with obviously out-of-range input. If the domain of a |
| function is only defined for positive integers, try calling it with a |
| negative integer. |
| |
| * If you are going to fix a bug that wasn't uncovered by an existing |
| test, try to write a test case that exposes the bug (preferably before |
| fixing it). |
| |
| * If you need to create a temporary file, you can use the filename in |
| test_support.TESTFN to do so. It is important to remove the file |
| when done; other tests should be able to use the name without cleaning |
| up after your test. |
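
As an illustration of the boundary-value and invalid-input points
above, a test exercising division might include checks along these
lines:

    from test_support import TestFailed

    # exceptional boundary: a zero denominator must raise ZeroDivisionError
    try:
        1 / 0
    except ZeroDivisionError:
        pass
    else:
        raise TestFailed, '1 / 0 did not raise ZeroDivisionError'

    # obviously invalid input: dividing by a string must raise TypeError
    try:
        1 / 'spam'
    except TypeError:
        pass
    else:
        raise TestFailed, "1 / 'spam' did not raise TypeError"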
| |
| |
| Regression Test Writing Rules |
| |
| Each test case is different. There is no "standard" form for a Python |
| regression test case, though there are some general rules (note that |
| these mostly apply only to the "classic" tests; unittest- and doctest- |
| based tests should follow the conventions natural to those frameworks): |
| |
| * If your test case detects a failure, raise TestFailed (found in |
| test_support). |
| |
| * Import everything you'll need as early as possible. |
| |
| * If you'll be importing objects from a module that is at least |
| partially platform-dependent, only import those objects you need for |
| the current test case to avoid spurious ImportError exceptions that |
| prevent the test from running to completion. |
| |
| * Print all your test case results using the print statement. For |
| non-fatal errors, print an error message (or omit a successful |
completion print) to indicate the failure, but proceed instead of
raising TestFailed. A brief sketch illustrating these rules appears
after this list.
| |
| * Use "assert" sparingly, if at all. It's usually better to just print |
| what you got, and rely on regrtest's got-vs-expected comparison to |
| catch deviations from what you expect. assert statements aren't |
executed at all when regrtest is run in -O mode, and, because they
cause the test to stop immediately, they can lead to a long & tedious
| test-fix, test-fix, test-fix, ... cycle when things are badly broken |
| (and note that "badly broken" often includes running the test suite |
| for the first time on new platforms or under new implementations of |
| the language). |
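
Put together, a classic test following these rules might be shaped
like this sketch (the spam module and its functions are hypothetical):

    from test_support import verbose, TestFailed    # import early
    import spam                                     # hypothetical module

    # fatal problem: raise TestFailed and stop
    if spam.ham(2) != 4:
        raise TestFailed, 'spam.ham(2)'

    # non-fatal problem: print a message and keep going; regrtest's
    # got-vs-expected comparison will flag the deviation
    if spam.eggs():
        print 'spam.eggs() ok'
    else:
        print 'spam.eggs() returned a false value'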
| |
| |
| Miscellaneous |
| |
| There is a test_support module you can import from your test case. It |
| provides the following useful objects: |
| |
| * TestFailed - raise this exception when your regression test detects a |
| failure. |
| |
| * TestSkipped - raise this if the test could not be run because the |
| platform doesn't offer all the required facilities (like large |
| file support), even if all the required modules are available. |
| |
| * verbose - you can use this variable to control print output. Many |
| modules use it. Search for "verbose" in the test_*.py files to see |
| lots of examples. |
| |
| * verify(condition, reason='test failed'). Use this instead of |
| |
| assert condition[, reason] |
| |
| verify() has two advantages over assert: it works even in -O mode, |
| and it raises TestFailed on failure instead of AssertionError. |
| |
| * TESTFN - a string that should always be used as the filename when you |
| need to create a temp file. Also use try/finally to ensure that your |
| temp files are deleted before your test completes. Note that you |
cannot unlink an open file on all operating systems, so also be sure
to close temp files before trying to unlink them (see the sketch after
this list).
| |
| * sortdict(dict) - acts like repr(dict.items()), but sorts the items |
| first. This is important when printing a dict value, because the |
| order of items produced by dict.items() is not defined by the |
| language. |
| |
| * findfile(file) - you can call this function to locate a file somewhere |
| along sys.path or in the Lib/test tree - see test_linuxaudiodev.py for |
| an example of its use. |
| |
* use_large_resources - true iff tests requiring a lot of time or space
should be run.
| |
| * fcmp(x,y) - you can call this function to compare two floating point |
numbers when you expect them to be only approximately equal within a
| fuzz factor (test_support.FUZZ, which defaults to 1e-6). |
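
The sketch below pulls several of these helpers together (the file
contents and values are made up, and it assumes fcmp() follows cmp()'s
convention of returning 0 when its arguments are fuzzily equal):

    import os
    import test_support
    from test_support import TESTFN, verify, fcmp

    # TESTFN: create the temp file inside try/finally so it is always
    # closed and removed when the test finishes
    f = open(TESTFN, 'w')
    try:
        f.write('spam\n')
        f.close()
        f = open(TESTFN)
        verify(f.read() == 'spam\n', 'TESTFN roundtrip failed')
        f.close()
    finally:
        if os.path.exists(TESTFN):
            os.unlink(TESTFN)

    # sortdict: print dicts in a reproducible order
    print test_support.sortdict({'ham': 1, 'eggs': 2})

    # fcmp: compare floats that are only approximately equal
    verify(fcmp(1.0 / 3.0, 0.333333) == 0, 'fcmp comparison failed')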
| |
| NOTE: Always import something from test_support like so: |
| |
| from test_support import verbose |
| |
| or like so: |
| |
| import test_support |
| ... use test_support.verbose in the code ... |
| |
| Never import anything from test_support like this: |
| |
| from test.test_support import verbose |
| |
| "test" is a package already, so can refer to modules it contains without |
| "test." qualification. If you do an explicit "test.xxx" qualification, that |
| can fool Python into believing test.xxx is a module distinct from the xxx |
| in the current package, and you can end up importing two distinct copies of |
| xxx. This is especially bad if xxx=test_support, as regrtest.py can (and |
| routinely does) overwrite its "verbose" and "use_large_resources" |
| attributes: if you get a second copy of test_support loaded, it may not |
| have the same values for those as regrtest intended. |
| |
| |
| Python and C statement coverage results are currently available at |
| |
| http://www.musi-cal.com/~skip/python/Python/dist/src/ |
| |
| As of this writing (July, 2000) these results are being generated nightly. |
| You can refer to the summaries and the test coverage output files to see |
| where coverage is adequate or lacking and write test cases to beef up the |
| coverage. |
| |
| |
| Some Non-Obvious regrtest Features |
| |
| * Automagic test detection: When you create a new test file |
| test_spam.py, you do not need to modify regrtest (or anything else) |
| to advertise its existence. regrtest searches for and runs all |
| modules in the test directory with names of the form test_xxx.py. |
| |
| * Miranda output: If, when running test_spam.py, regrtest does not |
| find an expected-output file test/output/test_spam, regrtest |
| pretends that it did find one, containing the single line |
| |
| test_spam |
| |
| This allows new tests that don't expect to print anything to stdout |
| to not bother creating expected-output files. |
| |
| * Two-stage testing: To run test_spam.py, regrtest imports test_spam |
| as a module. Most tests run to completion as a side-effect of |
| getting imported. After importing test_spam, regrtest also executes |
| test_spam.test_main(), if test_spam has a "test_main" attribute. |
This is rarely needed, and you shouldn't create a module global
named test_main unless you're specifically exploiting this
| gimmick. In such cases, please put a comment saying so near your |
| def test_main, because this feature is so rarely used it's not |
| obvious when reading the test code. |