blob: a8b20993e73820bfdcad6d1cc17949b0eaef9204 [file] [log] [blame]
Guido van Rossumdf804f81995-03-02 12:38:39 +00001\chapter{The Python Profiler}
Fred Drake31ecd501998-02-18 15:40:11 +00002\label{profile}
Guido van Rossumdf804f81995-03-02 12:38:39 +00003\stmodindex{profile}
4\stmodindex{pstats}
5
Fred Drake4b3f0311996-12-13 22:04:31 +00006Copyright \copyright{} 1994, by InfoSeek Corporation, all rights reserved.
Guido van Rossumdf804f81995-03-02 12:38:39 +00007
Fred Drakeeb72a271998-02-22 20:42:11 +00008Written by James Roskind.%
Guido van Rossumdf804f81995-03-02 12:38:39 +00009\footnote{
Guido van Rossum6c4f0031995-03-07 10:14:09 +000010Updated and converted to \LaTeX\ by Guido van Rossum. The references to
Guido van Rossumdf804f81995-03-02 12:38:39 +000011the old profiler are left in the text, although it no longer exists.
12}
13
14Permission to use, copy, modify, and distribute this Python software
15and its associated documentation for any purpose (subject to the
16restriction in the following sentence) without fee is hereby granted,
17provided that the above copyright notice appears in all copies, and
18that both that copyright notice and this permission notice appear in
19supporting documentation, and that the name of InfoSeek not be used in
20advertising or publicity pertaining to distribution of the software
21without specific, written prior permission. This permission is
22explicitly restricted to the copying and modification of the software
23to remain in Python, compiled Python, or other languages (such as C)
24wherein the modified or derived code is exclusively imported into a
25Python module.
26
27INFOSEEK CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS
28SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
29FITNESS. IN NO EVENT SHALL INFOSEEK CORPORATION BE LIABLE FOR ANY
30SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
31RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
32CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
33CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
34
35
36The profiler was written after only programming in Python for 3 weeks.
37As a result, it is probably clumsy code, but I don't know for sure yet
38'cause I'm a beginner :-). I did work hard to make the code run fast,
39so that profiling would be a reasonable thing to do. I tried not to
40repeat code fragments, but I'm sure I did some stuff in really awkward
41ways at times. Please send suggestions for improvements to:
Fred Drake8fa5eb81998-02-27 05:23:37 +000042\email{jar@netscape.com}. I won't promise \emph{any} support. ...but
Guido van Rossumdf804f81995-03-02 12:38:39 +000043I'd appreciate the feedback.
44
45
Guido van Rossum470be141995-03-17 16:07:09 +000046\section{Introduction to the profiler}
Guido van Rossum86cb0921995-03-20 12:59:56 +000047\nodename{Profiler Introduction}
Guido van Rossumdf804f81995-03-02 12:38:39 +000048
49A \dfn{profiler} is a program that describes the run time performance
50of a program, providing a variety of statistics. This documentation
51describes the profiler functionality provided in the modules
Fred Drake8fa5eb81998-02-27 05:23:37 +000052\module{profile} and \module{pstats}. This profiler provides
Guido van Rossumdf804f81995-03-02 12:38:39 +000053\dfn{deterministic profiling} of any Python programs. It also
54provides a series of report generation tools to allow users to rapidly
55examine the results of a profile operation.
Fred Drake8fa5eb81998-02-27 05:23:37 +000056\index{deterministic profiling}
57\index{profiling, deterministic}
Guido van Rossumdf804f81995-03-02 12:38:39 +000058
59
60\section{How Is This Profiler Different From The Old Profiler?}
Guido van Rossum86cb0921995-03-20 12:59:56 +000061\nodename{Profiler Changes}
Guido van Rossumdf804f81995-03-02 12:38:39 +000062
Guido van Rossum364e6431997-11-18 15:28:46 +000063(This section is of historical importance only; the old profiler
64discussed here was last seen in Python 1.1.)
65
Guido van Rossumdf804f81995-03-02 12:38:39 +000066The big changes from old profiling module are that you get more
67information, and you pay less CPU time. It's not a trade-off, it's a
68trade-up.
69
70To be specific:
71
72\begin{description}
73
74\item[Bugs removed:]
75Local stack frame is no longer molested, execution time is now charged
76to correct functions.
77
78\item[Accuracy increased:]
79Profiler execution time is no longer charged to user's code,
80calibration for platform is supported, file reads are not done \emph{by}
81profiler \emph{during} profiling (and charged to user's code!).
82
83\item[Speed increased:]
84Overhead CPU cost was reduced by more than a factor of two (perhaps a
85factor of five), lightweight profiler module is all that must be
Fred Drake8fa5eb81998-02-27 05:23:37 +000086loaded, and the report generating module (\module{pstats}) is not needed
Guido van Rossumdf804f81995-03-02 12:38:39 +000087during profiling.
88
89\item[Recursive functions support:]
90Cumulative times in recursive functions are correctly calculated;
91recursive entries are counted.
92
93\item[Large growth in report generating UI:]
94Distinct profiles runs can be added together forming a comprehensive
95report; functions that import statistics take arbitrary lists of
96files; sorting criteria is now based on keywords (instead of 4 integer
97options); reports shows what functions were profiled as well as what
98profile file was referenced; output format has been improved.
99
100\end{description}
101
102
103\section{Instant Users Manual}
104
105This section is provided for users that ``don't want to read the
106manual.'' It provides a very brief overview, and allows a user to
107rapidly perform profiling on an existing application.
108
109To profile an application with a main entry point of \samp{foo()}, you
110would add the following to your module:
111
Fred Drake19479911998-02-13 06:58:54 +0000112\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000113import profile
114profile.run("foo()")
Fred Drake19479911998-02-13 06:58:54 +0000115\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000116%
Guido van Rossumdf804f81995-03-02 12:38:39 +0000117The above action would cause \samp{foo()} to be run, and a series of
118informative lines (the profile) to be printed. The above approach is
119most useful when working with the interpreter. If you would like to
120save the results of a profile into a file for later examination, you
Fred Drake8fa5eb81998-02-27 05:23:37 +0000121can supply a file name as the second argument to the \function{run()}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000122function:
123
Fred Drake19479911998-02-13 06:58:54 +0000124\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000125import profile
126profile.run("foo()", 'fooprof')
Fred Drake19479911998-02-13 06:58:54 +0000127\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000128%
Fred Drake8fa5eb81998-02-27 05:23:37 +0000129The file \file{profile.py} can also be invoked as
Guido van Rossumbac80021997-06-02 17:29:12 +0000130a script to profile another script. For example:
Fred Drake8fa5eb81998-02-27 05:23:37 +0000131
132\begin{verbatim}
133python /usr/local/lib/python1.4/profile.py myscript.py
134\end{verbatim}
Guido van Rossumbac80021997-06-02 17:29:12 +0000135
Guido van Rossumdf804f81995-03-02 12:38:39 +0000136When you wish to review the profile, you should use the methods in the
Fred Drake8fa5eb81998-02-27 05:23:37 +0000137\module{pstats} module. Typically you would load the statistics data as
Guido van Rossumdf804f81995-03-02 12:38:39 +0000138follows:
139
Fred Drake19479911998-02-13 06:58:54 +0000140\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000141import pstats
142p = pstats.Stats('fooprof')
Fred Drake19479911998-02-13 06:58:54 +0000143\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000144%
Fred Drake8fa5eb81998-02-27 05:23:37 +0000145The class \class{Stats} (the above code just created an instance of
Guido van Rossumdf804f81995-03-02 12:38:39 +0000146this class) has a variety of methods for manipulating and printing the
147data that was just read into \samp{p}. When you ran
Fred Drake8fa5eb81998-02-27 05:23:37 +0000148\function{profile.run()} above, what was printed was the result of three
Guido van Rossumdf804f81995-03-02 12:38:39 +0000149method calls:
150
Fred Drake19479911998-02-13 06:58:54 +0000151\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000152p.strip_dirs().sort_stats(-1).print_stats()
Fred Drake19479911998-02-13 06:58:54 +0000153\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000154%
Guido van Rossumdf804f81995-03-02 12:38:39 +0000155The first method removed the extraneous path from all the module
156names. The second method sorted all the entries according to the
157standard module/line/name string that is printed (this is to comply
158with the semantics of the old profiler). The third method printed out
159all the statistics. You might try the following sort calls:
160
Fred Drake19479911998-02-13 06:58:54 +0000161\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000162p.sort_stats('name')
163p.print_stats()
Fred Drake19479911998-02-13 06:58:54 +0000164\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000165%
Guido van Rossumdf804f81995-03-02 12:38:39 +0000166The first call will actually sort the list by function name, and the
167second call will print out the statistics. The following are some
168interesting calls to experiment with:
169
Fred Drake19479911998-02-13 06:58:54 +0000170\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000171p.sort_stats('cumulative').print_stats(10)
Fred Drake19479911998-02-13 06:58:54 +0000172\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000173%
Guido van Rossumdf804f81995-03-02 12:38:39 +0000174This sorts the profile by cumulative time in a function, and then only
175prints the ten most significant lines. If you want to understand what
176algorithms are taking time, the above line is what you would use.
177
178If you were looking to see what functions were looping a lot, and
179taking a lot of time, you would do:
180
Fred Drake19479911998-02-13 06:58:54 +0000181\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000182p.sort_stats('time').print_stats(10)
Fred Drake19479911998-02-13 06:58:54 +0000183\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000184%
Guido van Rossumdf804f81995-03-02 12:38:39 +0000185to sort according to time spent within each function, and then print
186the statistics for the top ten functions.
187
188You might also try:
189
Fred Drake19479911998-02-13 06:58:54 +0000190\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000191p.sort_stats('file').print_stats('__init__')
Fred Drake19479911998-02-13 06:58:54 +0000192\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000193%
Guido van Rossumdf804f81995-03-02 12:38:39 +0000194This will sort all the statistics by file name, and then print out
195statistics for only the class init methods ('cause they are spelled
Fred Drake8fa5eb81998-02-27 05:23:37 +0000196with \samp{__init__} in them). As one final example, you could try:
Guido van Rossumdf804f81995-03-02 12:38:39 +0000197
Fred Drake19479911998-02-13 06:58:54 +0000198\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000199p.sort_stats('time', 'cum').print_stats(.5, 'init')
Fred Drake19479911998-02-13 06:58:54 +0000200\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000201%
Guido van Rossumdf804f81995-03-02 12:38:39 +0000202This line sorts statistics with a primary key of time, and a secondary
203key of cumulative time, and then prints out some of the statistics.
204To be specific, the list is first culled down to 50\% (re: \samp{.5})
205of its original size, then only lines containing \code{init} are
206maintained, and that sub-sub-list is printed.
207
208If you wondered what functions called the above functions, you could
209now (\samp{p} is still sorted according to the last criteria) do:
210
Fred Drake19479911998-02-13 06:58:54 +0000211\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000212p.print_callers(.5, 'init')
Fred Drake19479911998-02-13 06:58:54 +0000213\end{verbatim}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000214
Guido van Rossumdf804f81995-03-02 12:38:39 +0000215and you would get a list of callers for each of the listed functions.
216
217If you want more functionality, you're going to have to read the
218manual, or guess what the following functions do:
219
Fred Drake19479911998-02-13 06:58:54 +0000220\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000221p.print_callees()
222p.add('fooprof')
Fred Drake19479911998-02-13 06:58:54 +0000223\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000224%
Guido van Rossumdf804f81995-03-02 12:38:39 +0000225\section{What Is Deterministic Profiling?}
Guido van Rossum86cb0921995-03-20 12:59:56 +0000226\nodename{Deterministic Profiling}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000227
228\dfn{Deterministic profiling} is meant to reflect the fact that all
229\dfn{function call}, \dfn{function return}, and \dfn{exception} events
230are monitored, and precise timings are made for the intervals between
231these events (during which time the user's code is executing). In
232contrast, \dfn{statistical profiling} (which is not done by this
233module) randomly samples the effective instruction pointer, and
234deduces where time is being spent. The latter technique traditionally
235involves less overhead (as the code does not need to be instrumented),
236but provides only relative indications of where time is being spent.
237
238In Python, since there is an interpreter active during execution, the
239presence of instrumented code is not required to do deterministic
240profiling. Python automatically provides a \dfn{hook} (optional
241callback) for each event. In addition, the interpreted nature of
242Python tends to add so much overhead to execution, that deterministic
243profiling tends to only add small processing overhead in typical
244applications. The result is that deterministic profiling is not that
245expensive, yet provides extensive run time statistics about the
246execution of a Python program.
247
248Call count statistics can be used to identify bugs in code (surprising
249counts), and to identify possible inline-expansion points (high call
250counts). Internal time statistics can be used to identify ``hot
251loops'' that should be carefully optimized. Cumulative time
252statistics should be used to identify high level errors in the
253selection of algorithms. Note that the unusual handling of cumulative
254times in this profiler allows statistics for recursive implementations
255of algorithms to be directly compared to iterative implementations.
256
257
258\section{Reference Manual}
259
Fred Drake8fa5eb81998-02-27 05:23:37 +0000260\setindexsubitem{(in module profile)}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000261
262The primary entry point for the profiler is the global function
Fred Drake8fa5eb81998-02-27 05:23:37 +0000263\function{profile.run()}. It is typically used to create any profile
Guido van Rossumdf804f81995-03-02 12:38:39 +0000264information. The reports are formatted and printed using methods of
Fred Drake8fa5eb81998-02-27 05:23:37 +0000265the class \class{pstats.Stats}. The following is a description of all
Guido van Rossumdf804f81995-03-02 12:38:39 +0000266of these standard entry points and functions. For a more in-depth
267view of some of the code, consider reading the later section on
268Profiler Extensions, which includes discussion of how to derive
269``better'' profilers from the classes presented, or reading the source
270code for these modules.
271
Guido van Rossum470be141995-03-17 16:07:09 +0000272\begin{funcdesc}{profile.run}{string\optional{\, filename\optional{\, ...}}}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000273
274This function takes a single argument that has can be passed to the
Fred Drake8fa5eb81998-02-27 05:23:37 +0000275\keyword{exec} statement, and an optional file name. In all cases this
276routine attempts to \keyword{exec} its first argument, and gather profiling
Guido van Rossumdf804f81995-03-02 12:38:39 +0000277statistics from the execution. If no file name is present, then this
278function automatically prints a simple profiling report, sorted by the
279standard name string (file/line/function-name) that is presented in
280each line. The following is a typical output from such a call:
281
Fred Drake19479911998-02-13 06:58:54 +0000282\begin{verbatim}
Guido van Rossum96628a91995-04-10 11:34:00 +0000283 main()
284 2706 function calls (2004 primitive calls) in 4.504 CPU seconds
Guido van Rossumdf804f81995-03-02 12:38:39 +0000285
Guido van Rossum96628a91995-04-10 11:34:00 +0000286Ordered by: standard name
Guido van Rossumdf804f81995-03-02 12:38:39 +0000287
Guido van Rossum96628a91995-04-10 11:34:00 +0000288ncalls tottime percall cumtime percall filename:lineno(function)
289 2 0.006 0.003 0.953 0.477 pobject.py:75(save_objects)
290 43/3 0.533 0.012 0.749 0.250 pobject.py:99(evaluate)
291 ...
Fred Drake19479911998-02-13 06:58:54 +0000292\end{verbatim}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000293
294The first line indicates that this profile was generated by the call:\\
295\code{profile.run('main()')}, and hence the exec'ed string is
296\code{'main()'}. The second line indicates that 2706 calls were
297monitored. Of those calls, 2004 were \dfn{primitive}. We define
298\dfn{primitive} to mean that the call was not induced via recursion.
299The next line: \code{Ordered by:\ standard name}, indicates that
300the text string in the far right column was used to sort the output.
301The column headings include:
302
303\begin{description}
304
305\item[ncalls ]
306for the number of calls,
307
308\item[tottime ]
309for the total time spent in the given function (and excluding time
310made in calls to sub-functions),
311
312\item[percall ]
313is the quotient of \code{tottime} divided by \code{ncalls}
314
315\item[cumtime ]
316is the total time spent in this and all subfunctions (i.e., from
317invocation till exit). This figure is accurate \emph{even} for recursive
318functions.
319
320\item[percall ]
321is the quotient of \code{cumtime} divided by primitive calls
322
323\item[filename:lineno(function) ]
324provides the respective data of each function
325
326\end{description}
327
328When there are two numbers in the first column (e.g.: \samp{43/3}),
329then the latter is the number of primitive calls, and the former is
330the actual number of calls. Note that when the function does not
331recurse, these two values are the same, and only the single figure is
332printed.
Guido van Rossum96628a91995-04-10 11:34:00 +0000333
Guido van Rossumdf804f81995-03-02 12:38:39 +0000334\end{funcdesc}
335
Fred Drake8fa5eb81998-02-27 05:23:37 +0000336Analysis of the profiler data is done using this class from the
337\module{pstats} module:
338
339\setindexsubitem{(in module pstats)}
340
341\begin{classdesc}{Stats}{filename\optional{\, ...}}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000342This class constructor creates an instance of a ``statistics object''
Fred Drake8fa5eb81998-02-27 05:23:37 +0000343from a \var{filename} (or set of filenames). \class{Stats} objects are
Guido van Rossumdf804f81995-03-02 12:38:39 +0000344manipulated by methods, in order to print useful reports.
345
346The file selected by the above constructor must have been created by
Fred Drake8fa5eb81998-02-27 05:23:37 +0000347the corresponding version of \module{profile}. To be specific, there is
348\emph{no} file compatibility guaranteed with future versions of this
Guido van Rossumdf804f81995-03-02 12:38:39 +0000349profiler, and there is no compatibility with files produced by other
350profilers (e.g., the old system profiler).
351
352If several files are provided, all the statistics for identical
353functions will be coalesced, so that an overall view of several
354processes can be considered in a single report. If additional files
Fred Drake8fa5eb81998-02-27 05:23:37 +0000355need to be combined with data in an existing \class{Stats} object, the
356\method{add()} method can be used.
357\end{classdesc}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000358
359
Guido van Rossum470be141995-03-17 16:07:09 +0000360\subsection{The \sectcode{Stats} Class}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000361
Fred Drake19479911998-02-13 06:58:54 +0000362\setindexsubitem{(Stats method)}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000363
364\begin{funcdesc}{strip_dirs}{}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000365This method for the \class{Stats} class removes all leading path
366information from file names. It is very useful in reducing the size
367of the printout to fit within (close to) 80 columns. This method
368modifies the object, and the stripped information is lost. After
369performing a strip operation, the object is considered to have its
370entries in a ``random'' order, as it was just after object
371initialization and loading. If \method{strip_dirs()} causes two
372function names to be indistinguishable (i.e., they are on the same
373line of the same filename, and have the same function name), then the
374statistics for these two entries are accumulated into a single entry.
Guido van Rossumdf804f81995-03-02 12:38:39 +0000375\end{funcdesc}
376
377
378\begin{funcdesc}{add}{filename\optional{\, ...}}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000379This method of the \class{Stats} class accumulates additional
380profiling information into the current profiling object. Its
381arguments should refer to filenames created by the corresponding
382version of \function{profile.run()}. Statistics for identically named
383(re: file, line, name) functions are automatically accumulated into
384single function statistics.
Guido van Rossumdf804f81995-03-02 12:38:39 +0000385\end{funcdesc}
386
387\begin{funcdesc}{sort_stats}{key\optional{\, ...}}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000388This method modifies the \class{Stats} object by sorting it according
389to the supplied criteria. The argument is typically a string
390identifying the basis of a sort (example: \code{"time"} or
391\code{"name"}).
Guido van Rossumdf804f81995-03-02 12:38:39 +0000392
393When more than one key is provided, then additional keys are used as
394secondary criteria when the there is equality in all keys selected
Fred Drake8fa5eb81998-02-27 05:23:37 +0000395before them. For example, \samp{sort_stats('name', 'file')} will sort
396all the entries according to their function name, and resolve all ties
Guido van Rossumdf804f81995-03-02 12:38:39 +0000397(identical function names) by sorting by file name.
398
399Abbreviations can be used for any key names, as long as the
400abbreviation is unambiguous. The following are the keys currently
401defined:
402
403\begin{tableii}{|l|l|}{code}{Valid Arg}{Meaning}
404\lineii{"calls"}{call count}
405\lineii{"cumulative"}{cumulative time}
406\lineii{"file"}{file name}
407\lineii{"module"}{file name}
408\lineii{"pcalls"}{primitive call count}
409\lineii{"line"}{line number}
410\lineii{"name"}{function name}
411\lineii{"nfl"}{name/file/line}
412\lineii{"stdname"}{standard name}
413\lineii{"time"}{internal time}
414\end{tableii}
415
416Note that all sorts on statistics are in descending order (placing
417most time consuming items first), where as name, file, and line number
418searches are in ascending order (i.e., alphabetical). The subtle
419distinction between \code{"nfl"} and \code{"stdname"} is that the
420standard name is a sort of the name as printed, which means that the
421embedded line numbers get compared in an odd way. For example, lines
4223, 20, and 40 would (if the file names were the same) appear in the
423string order 20, 3 and 40. In contrast, \code{"nfl"} does a numeric
424compare of the line numbers. In fact, \code{sort_stats("nfl")} is the
425same as \code{sort_stats("name", "file", "line")}.
426
427For compatibility with the old profiler, the numeric arguments
428\samp{-1}, \samp{0}, \samp{1}, and \samp{2} are permitted. They are
429interpreted as \code{"stdname"}, \code{"calls"}, \code{"time"}, and
430\code{"cumulative"} respectively. If this old style format (numeric)
431is used, only one sort key (the numeric key) will be used, and
432additional arguments will be silently ignored.
433\end{funcdesc}
434
435
436\begin{funcdesc}{reverse_order}{}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000437This method for the \class{Stats} class reverses the ordering of the basic
Guido van Rossumdf804f81995-03-02 12:38:39 +0000438list within the object. This method is provided primarily for
439compatibility with the old profiler. Its utility is questionable
440now that ascending vs descending order is properly selected based on
441the sort key of choice.
442\end{funcdesc}
443
444\begin{funcdesc}{print_stats}{restriction\optional{\, ...}}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000445This method for the \class{Stats} class prints out a report as described
446in the \function{profile.run()} definition.
Guido van Rossumdf804f81995-03-02 12:38:39 +0000447
Fred Drake8fa5eb81998-02-27 05:23:37 +0000448The order of the printing is based on the last \method{sort_stats()}
449operation done on the object (subject to caveats in \method{add()} and
450\method{strip_dirs()}.
Guido van Rossumdf804f81995-03-02 12:38:39 +0000451
452The arguments provided (if any) can be used to limit the list down to
453the significant entries. Initially, the list is taken to be the
454complete set of profiled functions. Each restriction is either an
455integer (to select a count of lines), or a decimal fraction between
4560.0 and 1.0 inclusive (to select a percentage of lines), or a regular
Guido van Rossum364e6431997-11-18 15:28:46 +0000457expression (to pattern match the standard name that is printed; as of
458Python 1.5b1, this uses the Perl-style regular expression syntax
Fred Drake8fa5eb81998-02-27 05:23:37 +0000459defined by the \module{re} module). If several restrictions are
Guido van Rossum364e6431997-11-18 15:28:46 +0000460provided, then they are applied sequentially. For example:
Guido van Rossumdf804f81995-03-02 12:38:39 +0000461
Fred Drake19479911998-02-13 06:58:54 +0000462\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000463print_stats(.1, "foo:")
Fred Drake19479911998-02-13 06:58:54 +0000464\end{verbatim}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000465
Guido van Rossumdf804f81995-03-02 12:38:39 +0000466would first limit the printing to first 10\% of list, and then only
467print functions that were part of filename \samp{.*foo:}. In
468contrast, the command:
469
Fred Drake19479911998-02-13 06:58:54 +0000470\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000471print_stats("foo:", .1)
Fred Drake19479911998-02-13 06:58:54 +0000472\end{verbatim}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000473
Guido van Rossumdf804f81995-03-02 12:38:39 +0000474would limit the list to all functions having file names \samp{.*foo:},
475and then proceed to only print the first 10\% of them.
476\end{funcdesc}
477
478
479\begin{funcdesc}{print_callers}{restrictions\optional{\, ...}}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000480This method for the \class{Stats} class prints a list of all functions
Guido van Rossumdf804f81995-03-02 12:38:39 +0000481that called each function in the profiled database. The ordering is
Fred Drake8fa5eb81998-02-27 05:23:37 +0000482identical to that provided by \method{print_stats()}, and the definition
Guido van Rossumdf804f81995-03-02 12:38:39 +0000483of the restricting argument is also identical. For convenience, a
484number is shown in parentheses after each caller to show how many
485times this specific call was made. A second non-parenthesized number
486is the cumulative time spent in the function at the right.
487\end{funcdesc}
488
489\begin{funcdesc}{print_callees}{restrictions\optional{\, ...}}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000490This method for the \class{Stats} class prints a list of all function
Guido van Rossumdf804f81995-03-02 12:38:39 +0000491that were called by the indicated function. Aside from this reversal
492of direction of calls (re: called vs was called by), the arguments and
Fred Drake8fa5eb81998-02-27 05:23:37 +0000493ordering are identical to the \method{print_callers()} method.
Guido van Rossumdf804f81995-03-02 12:38:39 +0000494\end{funcdesc}
495
496\begin{funcdesc}{ignore}{}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000497This method of the \class{Stats} class is used to dispose of the value
Guido van Rossumdf804f81995-03-02 12:38:39 +0000498returned by earlier methods. All standard methods in this class
499return the instance that is being processed, so that the commands can
500be strung together. For example:
501
Fred Drake19479911998-02-13 06:58:54 +0000502\begin{verbatim}
Guido van Rossum96628a91995-04-10 11:34:00 +0000503pstats.Stats('foofile').strip_dirs().sort_stats('cum') \
504 .print_stats().ignore()
Fred Drake19479911998-02-13 06:58:54 +0000505\end{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000506%
Guido van Rossumdf804f81995-03-02 12:38:39 +0000507would perform all the indicated functions, but it would not return
Fred Drake8fa5eb81998-02-27 05:23:37 +0000508the final reference to the \class{Stats} instance.%
Guido van Rossumdf804f81995-03-02 12:38:39 +0000509\footnote{
510This was once necessary, when Python would print any unused expression
511result that was not \code{None}. The method is still defined for
512backward compatibility.
513}
514\end{funcdesc}
515
516
517\section{Limitations}
518
519There are two fundamental limitations on this profiler. The first is
520that it relies on the Python interpreter to dispatch \dfn{call},
Fred Drake8fa5eb81998-02-27 05:23:37 +0000521\dfn{return}, and \dfn{exception} events. Compiled \C{} code does not
Guido van Rossumdf804f81995-03-02 12:38:39 +0000522get interpreted, and hence is ``invisible'' to the profiler. All time
Fred Drake8fa5eb81998-02-27 05:23:37 +0000523spent in \C{} code (including builtin functions) will be charged to the
524Python function that invoked the \C{} code. If the \C{} code calls out
Guido van Rossumdf804f81995-03-02 12:38:39 +0000525to some native Python code, then those calls will be profiled
526properly.
527
528The second limitation has to do with accuracy of timing information.
529There is a fundamental problem with deterministic profilers involving
530accuracy. The most obvious restriction is that the underlying ``clock''
531is only ticking at a rate (typically) of about .001 seconds. Hence no
532measurements will be more accurate that that underlying clock. If
533enough measurements are taken, then the ``error'' will tend to average
534out. Unfortunately, removing this first error induces a second source
535of error...
536
537The second problem is that it ``takes a while'' from when an event is
538dispatched until the profiler's call to get the time actually
539\emph{gets} the state of the clock. Similarly, there is a certain lag
540when exiting the profiler event handler from the time that the clock's
541value was obtained (and then squirreled away), until the user's code
542is once again executing. As a result, functions that are called many
543times, or call many functions, will typically accumulate this error.
544The error that accumulates in this fashion is typically less than the
545accuracy of the clock (i.e., less than one clock tick), but it
546\emph{can} accumulate and become very significant. This profiler
547provides a means of calibrating itself for a given platform so that
548this error can be probabilistically (i.e., on the average) removed.
549After the profiler is calibrated, it will be more accurate (in a least
550square sense), but it will sometimes produce negative numbers (when
551call counts are exceptionally low, and the gods of probability work
552against you :-). ) Do \emph{NOT} be alarmed by negative numbers in
553the profile. They should \emph{only} appear if you have calibrated
554your profiler, and the results are actually better than without
555calibration.
556
557
558\section{Calibration}
559
560The profiler class has a hard coded constant that is added to each
561event handling time to compensate for the overhead of calling the time
562function, and socking away the results. The following procedure can
563be used to obtain this constant for a given platform (see discussion
564in section Limitations above).
565
Fred Drake19479911998-02-13 06:58:54 +0000566\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000567import profile
568pr = profile.Profile()
569pr.calibrate(100)
570pr.calibrate(100)
571pr.calibrate(100)
Fred Drake19479911998-02-13 06:58:54 +0000572\end{verbatim}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000573
574The argument to \method{calibrate()} is the number of times to try to
575do the sample calls to get the CPU times. If your computer is
576\emph{very} fast, you might have to do:
Guido van Rossumdf804f81995-03-02 12:38:39 +0000577
Fred Drake19479911998-02-13 06:58:54 +0000578\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000579pr.calibrate(1000)
Fred Drake19479911998-02-13 06:58:54 +0000580\end{verbatim}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000581
Guido van Rossumdf804f81995-03-02 12:38:39 +0000582or even:
583
Fred Drake19479911998-02-13 06:58:54 +0000584\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000585pr.calibrate(10000)
Fred Drake19479911998-02-13 06:58:54 +0000586\end{verbatim}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000587
Guido van Rossumdf804f81995-03-02 12:38:39 +0000588The object of this exercise is to get a fairly consistent result.
589When you have a consistent answer, you are ready to use that number in
590the source code. For a Sun Sparcstation 1000 running Solaris 2.3, the
591magical number is about .00053. If you have a choice, you are better
592off with a smaller constant, and your results will ``less often'' show
593up as negative in profile statistics.
594
595The following shows how the trace_dispatch() method in the Profile
596class should be modified to install the calibration constant on a Sun
597Sparcstation 1000:
598
Fred Drake19479911998-02-13 06:58:54 +0000599\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000600def trace_dispatch(self, frame, event, arg):
601 t = self.timer()
602 t = t[0] + t[1] - self.t - .00053 # Calibration constant
603
604 if self.dispatch[event](frame,t):
Guido van Rossumdf804f81995-03-02 12:38:39 +0000605 t = self.timer()
Guido van Rossume47da0a1997-07-17 16:34:52 +0000606 self.t = t[0] + t[1]
607 else:
608 r = self.timer()
609 self.t = r[0] + r[1] - t # put back unrecorded delta
610 return
Fred Drake19479911998-02-13 06:58:54 +0000611\end{verbatim}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000612
Guido van Rossumdf804f81995-03-02 12:38:39 +0000613Note that if there is no calibration constant, then the line
614containing the callibration constant should simply say:
615
Fred Drake19479911998-02-13 06:58:54 +0000616\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000617t = t[0] + t[1] - self.t # no calibration constant
Fred Drake19479911998-02-13 06:58:54 +0000618\end{verbatim}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000619
Guido van Rossumdf804f81995-03-02 12:38:39 +0000620You can also achieve the same results using a derived class (and the
621profiler will actually run equally fast!!), but the above method is
622the simplest to use. I could have made the profiler ``self
623calibrating'', but it would have made the initialization of the
624profiler class slower, and would have required some \emph{very} fancy
625coding, or else the use of a variable where the constant \samp{.00053}
626was placed in the code shown. This is a \strong{VERY} critical
627performance section, and there is no reason to use a variable lookup
628at this point, when a constant can be used.
629
630
Guido van Rossum86cb0921995-03-20 12:59:56 +0000631\section{Extensions --- Deriving Better Profilers}
632\nodename{Profiler Extensions}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000633
Fred Drake8fa5eb81998-02-27 05:23:37 +0000634The \class{Profile} class of module \module{profile} was written so that
Guido van Rossumdf804f81995-03-02 12:38:39 +0000635derived classes could be developed to extend the profiler. Rather
636than describing all the details of such an effort, I'll just present
637the following two examples of derived classes that can be used to do
638profiling. If the reader is an avid Python programmer, then it should
639be possible to use these as a model and create similar (and perchance
640better) profile classes.
641
642If all you want to do is change how the timer is called, or which
643timer function is used, then the basic class has an option for that in
644the constructor for the class. Consider passing the name of a
645function to call into the constructor:
646
Fred Drake19479911998-02-13 06:58:54 +0000647\begin{verbatim}
Guido van Rossume47da0a1997-07-17 16:34:52 +0000648pr = profile.Profile(your_time_func)
Fred Drake19479911998-02-13 06:58:54 +0000649\end{verbatim}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000650
Guido van Rossumdf804f81995-03-02 12:38:39 +0000651The resulting profiler will call \code{your_time_func()} instead of
Fred Drake8fa5eb81998-02-27 05:23:37 +0000652\function{os.times()}. The function should return either a single number
653or a list of numbers (like what \function{os.times()} returns). If the
Guido van Rossumdf804f81995-03-02 12:38:39 +0000654function returns a single time number, or the list of returned numbers
655has length 2, then you will get an especially fast version of the
656dispatch routine.
657
658Be warned that you \emph{should} calibrate the profiler class for the
659timer function that you choose. For most machines, a timer that
660returns a lone integer value will provide the best results in terms of
Fred Drake8fa5eb81998-02-27 05:23:37 +0000661low overhead during profiling. (\function{os.times()} is
662\emph{pretty} bad, 'cause it returns a tuple of floating point values,
663so all arithmetic is floating point in the profiler!). If you want to
664substitute a better timer in the cleanest fashion, you should derive a
665class, and simply put in the replacement dispatch method that better
666handles your timer call, along with the appropriate calibration
667constant :-).
Guido van Rossumdf804f81995-03-02 12:38:39 +0000668
669
670\subsection{OldProfile Class}
671
672The following derived profiler simulates the old style profiler,
673providing errant results on recursive functions. The reason for the
674usefulness of this profiler is that it runs faster (i.e., less
675overhead) than the old profiler. It still creates all the caller
676stats, and is quite useful when there is \emph{no} recursion in the
677user's code. It is also a lot more accurate than the old profiler, as
678it does not charge all its overhead time to the user's code.
679
Fred Drake19479911998-02-13 06:58:54 +0000680\begin{verbatim}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000681class OldProfile(Profile):
682
683 def trace_dispatch_exception(self, frame, t):
684 rt, rtt, rct, rfn, rframe, rcur = self.cur
685 if rcur and not rframe is frame:
686 return self.trace_dispatch_return(rframe, t)
687 return 0
688
689 def trace_dispatch_call(self, frame, t):
690 fn = `frame.f_code`
691
692 self.cur = (t, 0, 0, fn, frame, self.cur)
693 if self.timings.has_key(fn):
694 tt, ct, callers = self.timings[fn]
695 self.timings[fn] = tt, ct, callers
696 else:
697 self.timings[fn] = 0, 0, {}
698 return 1
699
700 def trace_dispatch_return(self, frame, t):
701 rt, rtt, rct, rfn, frame, rcur = self.cur
702 rtt = rtt + t
703 sft = rtt + rct
704
705 pt, ptt, pct, pfn, pframe, pcur = rcur
706 self.cur = pt, ptt+rt, pct+sft, pfn, pframe, pcur
707
708 tt, ct, callers = self.timings[rfn]
709 if callers.has_key(pfn):
710 callers[pfn] = callers[pfn] + 1
711 else:
712 callers[pfn] = 1
713 self.timings[rfn] = tt+rtt, ct + sft, callers
714
715 return 1
716
717
718 def snapshot_stats(self):
719 self.stats = {}
720 for func in self.timings.keys():
721 tt, ct, callers = self.timings[func]
722 nor_func = self.func_normalize(func)
723 nor_callers = {}
724 nc = 0
725 for func_caller in callers.keys():
726 nor_callers[self.func_normalize(func_caller)]=\
727 callers[func_caller]
728 nc = nc + callers[func_caller]
729 self.stats[nor_func] = nc, nc, tt, ct, nor_callers
Fred Drake19479911998-02-13 06:58:54 +0000730\end{verbatim}
Fred Drake8fa5eb81998-02-27 05:23:37 +0000731
Guido van Rossumdf804f81995-03-02 12:38:39 +0000732\subsection{HotProfile Class}
733
734This profiler is the fastest derived profile example. It does not
735calculate caller-callee relationships, and does not calculate
736cumulative time under a function. It only calculates time spent in a
737function, so it runs very quickly (re: very low overhead). In truth,
738the basic profiler is so fast, that is probably not worth the savings
739to give up the data, but this class still provides a nice example.
740
Fred Drake19479911998-02-13 06:58:54 +0000741\begin{verbatim}
Guido van Rossumdf804f81995-03-02 12:38:39 +0000742class HotProfile(Profile):
743
744 def trace_dispatch_exception(self, frame, t):
745 rt, rtt, rfn, rframe, rcur = self.cur
746 if rcur and not rframe is frame:
747 return self.trace_dispatch_return(rframe, t)
748 return 0
749
750 def trace_dispatch_call(self, frame, t):
751 self.cur = (t, 0, frame, self.cur)
752 return 1
753
754 def trace_dispatch_return(self, frame, t):
755 rt, rtt, frame, rcur = self.cur
756
757 rfn = `frame.f_code`
758
759 pt, ptt, pframe, pcur = rcur
760 self.cur = pt, ptt+rt, pframe, pcur
761
762 if self.timings.has_key(rfn):
763 nc, tt = self.timings[rfn]
764 self.timings[rfn] = nc + 1, rt + rtt + tt
765 else:
766 self.timings[rfn] = 1, rt + rtt
767
768 return 1
769
770
771 def snapshot_stats(self):
772 self.stats = {}
773 for func in self.timings.keys():
774 nc, tt = self.timings[func]
775 nor_func = self.func_normalize(func)
776 self.stats[nor_func] = nc, nc, tt, 0, {}
Fred Drake19479911998-02-13 06:58:54 +0000777\end{verbatim}