\section{\module{doctest} ---
         Test docstrings represent reality}

\declaremodule{standard}{doctest}
\moduleauthor{Tim Peters}{tim_one@users.sourceforge.net}
\sectionauthor{Tim Peters}{tim_one@users.sourceforge.net}
\sectionauthor{Moshe Zadka}{moshez@debian.org}

\modulesynopsis{A framework for verifying examples in docstrings.}

The \module{doctest} module searches a module's docstrings for text that looks
like an interactive Python session, then executes all such sessions to verify
they still work exactly as shown.  Here's a complete but small example:

\begin{verbatim}
"""
This is module example.

Example supplies one function, factorial.  For example,

>>> factorial(5)
120
"""

def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    If the result is small enough to fit in an int, return an int.
    Else return a long.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> [factorial(long(n)) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(30)
    265252859812191058636308480000000L
    >>> factorial(30L)
    265252859812191058636308480000000L
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0

    Factorials of floats are OK, but the float must be an exact integer:
    >>> factorial(30.1)
    Traceback (most recent call last):
        ...
    ValueError: n must be exact integer
    >>> factorial(30.0)
    265252859812191058636308480000000L

    It must also not be ridiculously large:
    >>> factorial(1e100)
    Traceback (most recent call last):
        ...
    OverflowError: n too large
    """

\end{verbatim}
% allow LaTeX to break here.
\begin{verbatim}

    import math
    if not n >= 0:
        raise ValueError("n must be >= 0")
    if math.floor(n) != n:
        raise ValueError("n must be exact integer")
    if n+1 == n:  # e.g., 1e300
        raise OverflowError("n too large")
    result = 1
    factor = 2
    while factor <= n:
        try:
            result *= factor
        except OverflowError:
            result *= long(factor)
        factor += 1
    return result

def _test():
    import doctest, example
    return doctest.testmod(example)

if __name__ == "__main__":
    _test()
\end{verbatim}

If you run \file{example.py} directly from the command line, doctest works
its magic:

\begin{verbatim}
$ python example.py
$
\end{verbatim}

There's no output!  That's normal, and it means all the examples worked.
Pass \programopt{-v} to the script, and doctest prints a detailed log
of what it's trying, followed by a summary at the end:

\begin{verbatim}
$ python example.py -v
Running example.__doc__
Trying: factorial(5)
Expecting: 120
ok
0 of 1 examples failed in example.__doc__
Running example.factorial.__doc__
Trying: [factorial(n) for n in range(6)]
Expecting: [1, 1, 2, 6, 24, 120]
ok
Trying: [factorial(long(n)) for n in range(6)]
Expecting: [1, 1, 2, 6, 24, 120]
ok
Trying: factorial(30)
Expecting: 265252859812191058636308480000000L
ok
\end{verbatim}

And so on, eventually ending with:

\begin{verbatim}
Trying: factorial(1e100)
Expecting:
Traceback (most recent call last):
    ...
OverflowError: n too large
ok
0 of 8 examples failed in example.factorial.__doc__
2 items passed all tests:
   1 tests in example
   8 tests in example.factorial
9 tests in 2 items.
9 passed and 0 failed.
Test passed.
$
\end{verbatim}

That's all you need to know to start making productive use of doctest!  Jump
in.  The docstrings in \file{doctest.py} contain detailed information about
all aspects of doctest; this section covers only the more important points.

\subsection{Normal Usage}

In normal use, end each module \module{M} with:

\begin{verbatim}
def _test():
    import doctest, M           # replace M with your module's name
    return doctest.testmod(M)   # ditto

if __name__ == "__main__":
    _test()
\end{verbatim}

Then running the module as a script causes the examples in the docstrings
to get executed and verified:

\begin{verbatim}
python M.py
\end{verbatim}

This won't display anything unless an example fails, in which case the
failing example(s) and the cause(s) of the failure(s) are printed to
\code{stdout}, and the final line of output is \code{'Test failed.'}.

Run it with the \programopt{-v} switch instead:

\begin{verbatim}
python M.py -v
\end{verbatim}

and a detailed report of all examples tried is printed to \code{stdout},
along with assorted summaries at the end.

You can force verbose mode by passing \code{verbose=1} to
\function{testmod()}, or prohibit it by passing \code{verbose=0}.  In either
of those cases, \code{sys.argv} is not examined by \function{testmod()}.

In any case, \function{testmod()} returns a 2-tuple of ints
\code{(\var{f}, \var{t})}, where \var{f} is the number of docstring
examples that failed and \var{t} is the total number of docstring examples
attempted.

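As a minimal sketch of using that return value, the following builds a
throwaway module object in place of a real \module{M} and unpacks the tuple.
It assumes a current Python 3 interpreter rather than the release this section
was written for; the module contents, including the deliberately wrong second
example, are hypothetical.

```python
import doctest
import types

# Hypothetical stand-in for a real module M: a module object whose
# docstring holds two examples, the second one deliberately wrong.
M = types.ModuleType("M", """
>>> 2 + 2
4
>>> 2 * 3
7
""")

# verbose=False means sys.argv is not consulted for -v.
# report=False suppresses the closing summary; the failing example
# itself is still printed to stdout.
failed, attempted = doctest.testmod(M, verbose=False, report=False)
print("%d of %d examples failed" % (failed, attempted))
```

A script can use \var{f} to decide its exit status, for instance by calling
\code{sys.exit(1)} when it is nonzero.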
\subsection{Which Docstrings Are Examined?}

See \file{doctest.py} for all the details.  They're unsurprising: the
module docstring, and all function, class and method docstrings are
searched, with the exception of docstrings attached to objects with private
names.  Objects imported into the module are not searched.

In addition, if \code{M.__test__} exists and ``is true'', it must be a
dict, and each entry maps a (string) name to a function object, class
object, or string.  Function and class object docstrings found from
\code{M.__test__} are searched even if the name is private, and
strings are searched directly as if they were docstrings.  In output,
a key \code{K} in \code{M.__test__} appears with name

\begin{verbatim}
      <name of M>.__test__.K
\end{verbatim}

Any classes found are recursively searched similarly, to test docstrings in
their contained methods and nested classes.  While private names reached
from \module{M}'s globals are skipped, all names reached from
\code{M.__test__} are searched.

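A sketch of a \code{__test__} dict, assuming a current Python 3 interpreter.
The module object, the helper \code{_double} and the key names are all
hypothetical. The sketch does not depend on how private names are treated when
the module's globals are scanned; it only shows that entries reached from
\code{__test__} are searched and named as described.

```python
import doctest
import types


def _double(x):
    """Hypothetical private helper.

    >>> _double(21)
    42
    """
    return 2 * x


# A string entry is searched directly, as if it were a docstring.
ARITHMETIC = """
>>> 1 + 1
2
"""

# Hypothetical stand-in for a real module M.
M = types.ModuleType("M")
M._double = _double  # make the name visible to the examples
M.__test__ = {"_double": _double, "arithmetic": ARITHMETIC}

# Names under which the two entries are reported.
names = sorted(test.name for test in doctest.DocTestFinder().find(M))
print(names)

failed, attempted = doctest.testmod(M, verbose=False)
print(failed, attempted)
```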
\subsection{What's the Execution Context?}

By default, each time \function{testmod()} finds a docstring to test, it
uses a \emph{copy} of \module{M}'s globals, so that running tests on a module
doesn't change the module's real globals, and so that one test in
\module{M} can't leave behind crumbs that accidentally allow another test
to work.  This means examples can freely use any names defined at top-level
in \module{M}, and names defined earlier in the docstring being run.

You can force use of your own dict as the execution context by passing
\code{globs=your_dict} to \function{testmod()} instead.  Presumably this
would be a copy of \code{M.__dict__} merged with the globals from other
imported modules.

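One way to build such a dict is sketched below, assuming a current Python 3
interpreter. The module object, its \code{area} function and the choice of
\module{math} as the ``other'' module are hypothetical; the example in the
docstring works only because \code{floor} is supplied through \code{globs}.

```python
import doctest
import math
import types

# Hypothetical stand-in for a real module M. Its docstring uses floor(),
# a name that M itself does not define.
M = types.ModuleType("M", """
>>> floor(area(2))
12
""")
M.area = lambda r: math.pi * r * r

# A copy of M's namespace merged with the public names of another module.
your_dict = M.__dict__.copy()
your_dict.update(
    (name, value)
    for name, value in vars(math).items()
    if not name.startswith("_")
)

failed, attempted = doctest.testmod(M, globs=your_dict, verbose=False)
print(failed, attempted)
```

Because the merge is done on a copy, \module{M}'s own namespace is left
without the extra names.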
\subsection{What About Exceptions?}

No problem, as long as the only output generated by the example is the
traceback itself.  For example:

\begin{verbatim}
>>> [1, 2, 3].remove(42)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: list.remove(x): x not in list
>>>
\end{verbatim}

Note that only the exception type and value are compared (specifically,
only the last line in the traceback).  The various ``File'' lines in
between can be left out (unless they add significantly to the documentation
value of the example).

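The following sketch, assuming a current Python 3 interpreter, runs the same
failing call twice: once with a pasted ``File'' line and once with the stack
replaced by \code{...}. The module object and the \code{check} function are
hypothetical; both examples are expected to pass because only the final
traceback line is compared.

```python
import doctest
import types


def check(n):
    """Hypothetical validator that rejects negative input."""
    if not n >= 0:
        raise ValueError("n must be >= 0")
    return n


# Hypothetical stand-in for a real module M.
M = types.ModuleType("M", """
>>> check(-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: n must be >= 0
>>> check(-1)
Traceback (most recent call last):
    ...
ValueError: n must be >= 0
""")
M.check = check

failed, attempted = doctest.testmod(M, verbose=False)
print(failed, attempted)
```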
\subsection{Advanced Usage}

\function{testmod()} actually creates a local instance of class
\class{Tester}, runs appropriate methods of that class, and merges
the results into the global \class{Tester} instance \code{master}.

You can create your own instances of \class{Tester}, and so build your
own policies, or even run methods of \code{master} directly.  See
\code{Tester.__doc__} for details.


\subsection{How are Docstring Examples Recognized?}

In most cases a copy-and-paste of an interactive console session works fine;
just make sure the leading whitespace is rigidly consistent (you can mix
tabs and spaces if you're too lazy to do it right, but doctest is not in
the business of guessing what you think a tab means).

\begin{verbatim}
>>> # comments are ignored
>>> x = 12
>>> x
12
>>> if x == 13:
...     print "yes"
... else:
...     print "no"
...     print "NO"
...     print "NO!!!"
...
no
NO
NO!!!
>>>
\end{verbatim}

Any expected output must immediately follow the final
\code{'>\code{>}>~'} or \code{'...~'} line containing the code, and
the expected output (if any) extends to the next \code{'>\code{>}>~'}
or all-whitespace line.

The fine print:

\begin{itemize}

\item Expected output cannot contain an all-whitespace line, since such a
  line is taken to signal the end of expected output.

\item Output to stdout is captured, but not output to stderr (exception
  tracebacks are captured via a different means).

\item If you continue a line via backslashing in an interactive session, or
  for any other reason use a backslash, you need to double the backslash in
  the docstring version.  This is simply because you're in a string, and so
  the backslash must be escaped for it to survive intact.  Like:

\begin{verbatim}
>>> if "yes" == \\
...     "y" +   \\
...     "es":
...     print 'yes'
yes
\end{verbatim}

\item The starting column doesn't matter:

\begin{verbatim}
  >>> assert "Easy!"
      >>> import math
          >>> math.floor(1.9)
          1.0
\end{verbatim}

and as many leading whitespace characters are stripped from the
expected output as appeared in the initial \code{'>\code{>}>~'} line
that triggered it.
\end{itemize}

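The backslash rule can be checked with a short sketch, assuming a current
Python 3 interpreter. The function \code{total} is hypothetical, and the
sketch drives the example through \class{DocTestParser} and
\class{DocTestRunner}, classes of the present-day module rather than anything
described in this section. The docstring doubles the backslash, so the source
that doctest extracts contains a single backslash followed by a newline.

```python
import doctest


def total():
    """Hypothetical function whose example continues a line.

    The backslash is doubled so that one survives in the docstring.

    >>> 1 + \\
    ...     2
    3
    """


# Inspect what doctest extracts from the docstring.
examples = doctest.DocTestParser().get_examples(total.__doc__)
print(repr(examples[0].source))

# Run the extracted example with an empty execution context.
test = doctest.DocTestParser().get_doctest(total.__doc__, {}, "total", None, 0)
failed, attempted = doctest.DocTestRunner(verbose=False).run(test)
print(failed, attempted)
```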
\subsection{Warnings}

\begin{enumerate}

\item \module{doctest} is serious about requiring exact matches in expected
  output.  If even a single character doesn't match, the test fails.  This
  will probably surprise you a few times, as you learn exactly what Python
  does and doesn't guarantee about output.  For example, when printing a
  dict, Python doesn't guarantee that the key-value pairs will be printed
  in any particular order, so a test like

% Hey! What happened to Monty Python examples?
% Tim: ask Guido -- it's his example!
\begin{verbatim}
>>> foo()
{"Hermione": "hippogryph", "Harry": "broomstick"}
>>>
\end{verbatim}

is vulnerable!  One workaround is to do

\begin{verbatim}
>>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
1
>>>
\end{verbatim}

instead.  Another is to do

\begin{verbatim}
>>> d = foo().items()
>>> d.sort()
>>> d
[('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
\end{verbatim}

There are others, but you get the idea.

Another bad idea is to print things that embed an object address, like

\begin{verbatim}
>>> id(1.0) # certain to fail some of the time
7948648
>>>
\end{verbatim}

Floating-point numbers are also subject to small output variations across
platforms, because Python defers to the platform C library for float
formatting, and C libraries vary widely in quality here.

\begin{verbatim}
>>> 1./7  # risky
0.14285714285714285
>>> print 1./7 # safer
0.142857142857
>>> print round(1./7, 6) # much safer
0.142857
\end{verbatim}

Numbers of the form \code{I/2.**J} are safe across all platforms, and I
often contrive doctest examples to produce numbers of that form:

\begin{verbatim}
>>> 3./4  # utterly safe
0.75
\end{verbatim}

Simple fractions are also easier for people to understand, and that makes
for better documentation.

\item Be careful if you have code that must only execute once.

Running \file{M.py} as a script and then executing \code{import M} inside
\function{_test()} loads the module a second time, under the name
\module{M} in addition to \module{__main__}, so its module-level code runs
twice.  If you have module-level code that must only execute once, a more
foolproof definition of \function{_test()} is

\begin{verbatim}
def _test():
    import doctest, sys
    doctest.testmod(sys.modules["__main__"])
\end{verbatim}
\end{enumerate}


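The workarounds from the first warning can be collected into one sketch. It
assumes a current Python 3 interpreter, where a comparison prints \code{True}
rather than \code{1} and \code{print} is a function; the module object and the
body of \code{foo} are hypothetical.

```python
import doctest
import types


def foo():
    """Hypothetical function returning a dict."""
    return {"Hermione": "hippogryph", "Harry": "broomstick"}


# Hypothetical stand-in for a real module M. Each example avoids output
# whose exact form is not guaranteed.
M = types.ModuleType("M", """
>>> foo() == {"Harry": "broomstick", "Hermione": "hippogryph"}
True
>>> sorted(foo().items())
[('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
>>> print(round(1./7, 6))
0.142857
>>> 3./4
0.75
""")
M.foo = foo

failed, attempted = doctest.testmod(M, verbose=False)
print(failed, attempted)
```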
\subsection{Soapbox}

The first word in doctest is ``doc'', and that's why the author wrote
doctest:  to keep documentation up to date.  It so happens that doctest
makes a pleasant unit testing environment, but that's not its primary
purpose.

Choose docstring examples with care.  There's an art to this that needs to
be learned; it may not be natural at first.  Examples should add genuine
value to the documentation.  A good example can often be worth many words.
If possible, show just a few normal cases, show endcases, show interesting
subtle cases, and show an example of each kind of exception that can be
raised.  You're probably testing for endcases and subtle cases anyway in an
interactive shell:  doctest wants to make it as easy as possible to capture
those sessions, and will verify they continue to work as designed forever
after.

If done with care, the examples will be invaluable for your users, and will
pay back the time it takes to collect them many times over as the years go
by and ``things change''.  I'm still amazed at how often one of my doctest
examples stops working after a ``harmless'' change.

For exhaustive testing, or testing boring cases that add no value to the
docs, define a \code{__test__} dict instead.  That's what it's for.