% Doc/lib/libprofile.tex
\chapter{The Python Profiler \label{profile}}

\sectionauthor{James Roskind}{}

Copyright \copyright{} 1994, by InfoSeek Corporation, all rights reserved.
\index{InfoSeek Corporation}

Written by James Roskind.\footnote{
  Updated and converted to \LaTeX\ by Guido van Rossum. The references to
  the old profiler are left in the text, although it no longer exists.}

Permission to use, copy, modify, and distribute this Python software
and its associated documentation for any purpose (subject to the
restriction in the following sentence) without fee is hereby granted,
provided that the above copyright notice appears in all copies, and
that both that copyright notice and this permission notice appear in
supporting documentation, and that the name of InfoSeek not be used in
advertising or publicity pertaining to distribution of the software
without specific, written prior permission. This permission is
explicitly restricted to the copying and modification of the software
to remain in Python, compiled Python, or other languages (such as C)
wherein the modified or derived code is exclusively imported into a
Python module.

INFOSEEK CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS
SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS. IN NO EVENT SHALL INFOSEEK CORPORATION BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.


The profiler was written after only programming in Python for 3 weeks.
As a result, it is probably clumsy code, but I don't know for sure yet
'cause I'm a beginner :-). I did work hard to make the code run fast,
so that profiling would be a reasonable thing to do. I tried not to
repeat code fragments, but I'm sure I did some stuff in really awkward
ways at times. Please send suggestions for improvements to:
\email{jar@netscape.com}. I won't promise \emph{any} support. ...but
I'd appreciate the feedback.


\section{Introduction to the profiler}
\nodename{Profiler Introduction}

A \dfn{profiler} is a program that describes the run-time performance
of a program, providing a variety of statistics. This documentation
describes the profiler functionality provided in the modules
\module{profile} and \module{pstats}. This profiler provides
\dfn{deterministic profiling} of any Python program. It also
provides a series of report generation tools to allow users to rapidly
examine the results of a profile operation.
\index{deterministic profiling}
\index{profiling, deterministic}


\section{How Is This Profiler Different From The Old Profiler?}
\nodename{Profiler Changes}

(This section is of historical importance only; the old profiler
discussed here was last seen in Python 1.1.)

The big changes from the old profiling module are that you get more
information, and you pay less CPU time. It's not a trade-off, it's a
trade-up.

To be specific:

\begin{description}

\item[Bugs removed:]
The local stack frame is no longer modified, and execution time is now
charged to the correct functions.

\item[Accuracy increased:]
Profiler execution time is no longer charged to the user's code,
calibration for the platform is supported, and file reads are no longer
done \emph{by} the profiler \emph{during} profiling (and charged to the
user's code!).

\item[Speed increased:]
Overhead CPU cost was reduced by more than a factor of two (perhaps a
factor of five), a lightweight profiler module is all that must be
loaded, and the report generating module (\module{pstats}) is not needed
during profiling.

\item[Recursive functions support:]
Cumulative times in recursive functions are correctly calculated;
recursive entries are counted.

\item[Large growth in report generating UI:]
Distinct profile runs can be added together to form a comprehensive
report; functions that import statistics take arbitrary lists of
files; sorting criteria are now based on keywords (instead of 4 integer
options); reports show what functions were profiled as well as what
profile file was referenced; and the output format has been improved.

\end{description}


\section{Instant User's Manual \label{profile-instant}}

This section is provided for users who ``don't want to read the
manual.'' It provides a very brief overview, and allows a user to
rapidly perform profiling on an existing application.

To profile an application with a main entry point of \samp{foo()}, you
would add the following to your module:

\begin{verbatim}
import profile
profile.run('foo()')
\end{verbatim}

The above action would cause \samp{foo()} to be run, and a series of
informative lines (the profile) to be printed. The above approach is
most useful when working with the interpreter. If you would like to
save the results of a profile into a file for later examination, you
can supply a file name as the second argument to the \function{run()}
function:

\begin{verbatim}
import profile
profile.run('foo()', 'fooprof')
\end{verbatim}

The file \file{profile.py} can also be invoked as
a script to profile another script. For example:

\begin{verbatim}
python /usr/local/lib/python1.5/profile.py myscript.py
\end{verbatim}

When you wish to review the profile, you should use the methods in the
\module{pstats} module. Typically you would load the statistics data as
follows:

\begin{verbatim}
import pstats
p = pstats.Stats('fooprof')
\end{verbatim}
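
The save-and-load steps above can be sketched as one self-contained
script. This is only an illustration under stated assumptions: it
targets a current Python 3 interpreter, \code{foo()} and the temporary
file location are hypothetical stand-ins, and it calls
\function{profile.runctx()} (a sibling of \function{run()} that takes
explicit namespaces) so that the statement can find \code{foo} even when
the code is not running as the main module.

```python
import os
import profile
import pstats
import tempfile


def foo():
    """Hypothetical workload standing in for the application entry point."""
    return sum(i * i for i in range(1000))


# Assumption: a throwaway directory is an acceptable place for the data file.
stats_path = os.path.join(tempfile.mkdtemp(), 'fooprof')

# Like profile.run('foo()', 'fooprof'), but the statement is evaluated in
# the namespaces passed here instead of in __main__.
profile.runctx('foo()', globals(), locals(), stats_path)

# Load the saved statistics for later examination.
p = pstats.Stats(stats_path)
print(p.total_calls, 'calls recorded in', stats_path)
```
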

The class \class{Stats} (the above code just created an instance of
this class) has a variety of methods for manipulating and printing the
data that was just read into \samp{p}. When you ran
\function{profile.run()} above, what was printed was the result of three
method calls:

\begin{verbatim}
p.strip_dirs().sort_stats(-1).print_stats()
\end{verbatim}

The first method removed the extraneous path from all the module
names. The second method sorted all the entries according to the
standard module/line/name string that is printed (this is to comply
with the semantics of the old profiler). The third method printed out
all the statistics. You might try the following sort calls:

\begin{verbatim}
p.sort_stats('name')
p.print_stats()
\end{verbatim}

The first call will actually sort the list by function name, and the
second call will print out the statistics. The following are some
interesting calls to experiment with:

\begin{verbatim}
p.sort_stats('cumulative').print_stats(10)
\end{verbatim}

This sorts the profile by cumulative time in a function, and then only
prints the ten most significant lines. If you want to understand what
algorithms are taking time, the above line is what you would use.

If you were looking to see what functions were looping a lot, and
taking a lot of time, you would do:

\begin{verbatim}
p.sort_stats('time').print_stats(10)
\end{verbatim}

to sort according to time spent within each function, and then print
the statistics for the top ten functions.

You might also try:

\begin{verbatim}
p.sort_stats('file').print_stats('__init__')
\end{verbatim}

This will sort all the statistics by file name, and then print out
statistics for only the class init methods (because they are spelled
with \samp{__init__} in them). As one final example, you could try:

\begin{verbatim}
p.sort_stats('time', 'cum').print_stats(.5, 'init')
\end{verbatim}

This line sorts statistics with a primary key of time, and a secondary
key of cumulative time, and then prints out some of the statistics.
To be specific, the list is first culled down to 50\% (re: \samp{.5})
of its original size, then only lines containing \code{init} are
retained, and that sub-sub-list is printed.

If you wondered what functions called the above functions, you could
now (\samp{p} is still sorted according to the last criteria) do:

\begin{verbatim}
p.print_callers(.5, 'init')
\end{verbatim}

and you would get a list of callers for each of the listed functions.

If you want more functionality, you're going to have to read the
manual, or guess what the following functions do:

\begin{verbatim}
p.print_callees()
p.add('fooprof')
\end{verbatim}

\section{What Is Deterministic Profiling?}
\nodename{Deterministic Profiling}

\dfn{Deterministic profiling} is meant to reflect the fact that all
\dfn{function call}, \dfn{function return}, and \dfn{exception} events
are monitored, and precise timings are made for the intervals between
these events (during which time the user's code is executing). In
contrast, \dfn{statistical profiling} (which is not done by this
module) randomly samples the effective instruction pointer, and
deduces where time is being spent. The latter technique traditionally
involves less overhead (as the code does not need to be instrumented),
but provides only relative indications of where time is being spent.

In Python, since there is an interpreter active during execution, the
presence of instrumented code is not required to do deterministic
profiling. Python automatically provides a \dfn{hook} (optional
callback) for each event. In addition, the interpreted nature of
Python tends to add so much overhead to execution that deterministic
profiling tends to add only a small processing overhead in typical
applications. The result is that deterministic profiling is not that
expensive, yet provides extensive run-time statistics about the
execution of a Python program.
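
The hook mentioned above can be seen directly with
\function{sys.setprofile()}. The sketch below is not how
\module{profile} itself is written; it only assumes a current Python 3
interpreter and uses a hypothetical function \code{work()} to show that
the interpreter reports a call event and a return event without any
instrumentation of the profiled code.

```python
import sys

events = []


def hook(frame, event, arg):
    # Record only the events that belong to the hypothetical work() function.
    if frame.f_code.co_name == 'work':
        events.append(event)


def work():
    return 42


sys.setprofile(hook)        # install the per-event callback
try:
    work()
finally:
    sys.setprofile(None)    # always remove the hook again

print(events)
```

A deterministic profiler built on this hook would also read a clock in
each callback and charge the elapsed interval to the function that was
running.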

Call count statistics can be used to identify bugs in code (surprising
counts), and to identify possible inline-expansion points (high call
counts). Internal time statistics can be used to identify ``hot
loops'' that should be carefully optimized. Cumulative time
statistics should be used to identify high level errors in the
selection of algorithms. Note that the unusual handling of cumulative
times in this profiler allows statistics for recursive implementations
of algorithms to be directly compared to iterative implementations.


\section{Reference Manual}

\declaremodule{standard}{profile}
\modulesynopsis{Python profiler}


The primary entry point for the profiler is the global function
\function{profile.run()}. It is typically used to create any profile
information. The reports are formatted and printed using methods of
the class \class{pstats.Stats}. The following is a description of all
of these standard entry points and functions. For a more in-depth
view of some of the code, consider reading the later section on
Profiler Extensions, which includes discussion of how to derive
``better'' profilers from the classes presented, or reading the source
code for these modules.

\begin{funcdesc}{run}{string\optional{, filename\optional{, ...}}}

This function takes a single argument that can be passed to the
\keyword{exec} statement, and an optional file name. In all cases this
routine attempts to \keyword{exec} its first argument, and gather profiling
statistics from the execution. If no file name is present, then this
function automatically prints a simple profiling report, sorted by the
standard name string (file/line/function-name) that is presented in
each line. The following is a typical output from such a call:

\begin{verbatim}
      main()
      2706 function calls (2004 primitive calls) in 4.504 CPU seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2    0.006    0.003    0.953    0.477 pobject.py:75(save_objects)
  43/3    0.533    0.012    0.749    0.250 pobject.py:99(evaluate)
 ...
\end{verbatim}

The first line indicates that this profile was generated by the call:\\
\code{profile.run('main()')}, and hence the exec'ed string is
\code{'main()'}. The second line indicates that 2706 calls were
monitored. Of those calls, 2004 were \dfn{primitive}. We define
\dfn{primitive} to mean that the call was not induced via recursion.
The next line: \code{Ordered by:\ standard name}, indicates that
the text string in the far right column was used to sort the output.
The column headings include:

\begin{description}

\item[ncalls ]
is the number of calls.

\item[tottime ]
is the total time spent in the given function (excluding time spent
in calls to sub-functions).

\item[percall ]
is the quotient of \code{tottime} divided by \code{ncalls}.

\item[cumtime ]
is the total time spent in this and all subfunctions (i.e., from
invocation till exit). This figure is accurate \emph{even} for recursive
functions.

\item[percall ]
is the quotient of \code{cumtime} divided by primitive calls.

\item[filename:lineno(function) ]
provides the respective data of each function.

\end{description}

When there are two numbers in the first column (e.g., \samp{43/3}),
then the latter is the number of primitive calls, and the former is
the actual number of calls. Note that when the function does not
recurse, these two values are the same, and only the single figure is
printed.
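
The two-number form can be reproduced with a small recursive function.
The sketch below assumes a current Python 3 interpreter; \code{fact()}
is a hypothetical workload, and \method{runcall()} on a
\class{profile.Profile} instance is used in place of
\function{profile.run()} so that no \code{__main__} lookup is needed.

```python
import io
import profile
import pstats


def fact(n):
    """Hypothetical recursive workload."""
    return 1 if n <= 1 else n * fact(n - 1)


prof = profile.Profile()
prof.runcall(fact, 5)    # one primitive call plus four recursive ones

# Send the report to a string instead of standard output, and restrict
# it to entries whose standard name matches 'fact'.
buf = io.StringIO()
pstats.Stats(prof, stream=buf).strip_dirs().print_stats('fact')
report = buf.getvalue()
print(report)
```

In the printed row for \code{fact}, the first column should read
\samp{5/1}: five actual calls, of which one is primitive.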

\end{funcdesc}

Analysis of the profiler data is done using this class from the
\module{pstats} module:

% now switch modules....
% (This \stmodindex use may be hard to change ;-( )
\stmodindex{pstats}

\begin{classdesc}{Stats}{filename\optional{, ...}}
This class constructor creates an instance of a ``statistics object''
from a \var{filename} (or set of filenames). \class{Stats} objects are
manipulated by methods, in order to print useful reports.

The file selected by the above constructor must have been created by
the corresponding version of \module{profile}. To be specific, there is
\emph{no} file compatibility guaranteed with future versions of this
profiler, and there is no compatibility with files produced by other
profilers (e.g., the old system profiler).

If several files are provided, all the statistics for identical
functions will be coalesced, so that an overall view of several
processes can be considered in a single report. If additional files
need to be combined with data in an existing \class{Stats} object, the
\method{add()} method can be used.
\end{classdesc}


\subsection{The \class{Stats} Class \label{profile-stats}}

\class{Stats} objects have the following methods:

\begin{methoddesc}[Stats]{strip_dirs}{}
This method for the \class{Stats} class removes all leading path
information from file names. It is very useful in reducing the size
of the printout to fit within (close to) 80 columns. This method
modifies the object, and the stripped information is lost. After
performing a strip operation, the object is considered to have its
entries in a ``random'' order, as it was just after object
initialization and loading. If \method{strip_dirs()} causes two
function names to be indistinguishable (i.e., they are on the same
line of the same filename, and have the same function name), then the
statistics for these two entries are accumulated into a single entry.
\end{methoddesc}


\begin{methoddesc}[Stats]{add}{filename\optional{, ...}}
This method of the \class{Stats} class accumulates additional
profiling information into the current profiling object. Its
arguments should refer to filenames created by the corresponding
version of \function{profile.run()}. Statistics for identically named
(re: file, line, name) functions are automatically accumulated into
single function statistics.
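
A minimal sketch of this accumulation, assuming a current Python 3
interpreter: the function \code{work()} and the two file names are
hypothetical, and \function{profile.runctx()} is used only so that the
example does not depend on the \code{__main__} namespace. The
\code{total_calls} attribute is read purely to make the accumulation
visible.

```python
import os
import profile
import pstats
import tempfile


def work(n):
    """Hypothetical workload profiled in two separate runs."""
    return sum(range(n))


tmpdir = tempfile.mkdtemp()
first = os.path.join(tmpdir, 'run1.prof')
second = os.path.join(tmpdir, 'run2.prof')

# Two distinct profile runs, each saved to its own file.
profile.runctx('work(10)', globals(), locals(), first)
profile.runctx('work(20)', globals(), locals(), second)

calls_first = pstats.Stats(first).total_calls
calls_second = pstats.Stats(second).total_calls

# Start from the first file, then fold the second one in with add().
combined = pstats.Stats(first)
combined.add(second)
print(combined.total_calls == calls_first + calls_second)
```
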
\end{methoddesc}

\begin{methoddesc}[Stats]{sort_stats}{key\optional{, ...}}
This method modifies the \class{Stats} object by sorting it according
to the supplied criteria. The argument is typically a string
identifying the basis of a sort (example: \code{'time'} or
\code{'name'}).

When more than one key is provided, then additional keys are used as
secondary criteria when there is equality in all keys selected
before them. For example, \samp{sort_stats('name', 'file')} will sort
all the entries according to their function name, and resolve all ties
(identical function names) by sorting by file name.

Abbreviations can be used for any key names, as long as the
abbreviation is unambiguous. The following are the keys currently
defined:

\begin{tableii}{l|l}{code}{Valid Arg}{Meaning}
  \lineii{'calls'}{call count}
  \lineii{'cumulative'}{cumulative time}
  \lineii{'file'}{file name}
  \lineii{'module'}{file name}
  \lineii{'pcalls'}{primitive call count}
  \lineii{'line'}{line number}
  \lineii{'name'}{function name}
  \lineii{'nfl'}{name/file/line}
  \lineii{'stdname'}{standard name}
  \lineii{'time'}{internal time}
\end{tableii}

Note that all sorts on statistics are in descending order (placing
most time consuming items first), whereas name, file, and line number
sorts are in ascending order (i.e., alphabetical). The subtle
distinction between \code{'nfl'} and \code{'stdname'} is that the
standard name is a sort of the name as printed, which means that the
embedded line numbers get compared in an odd way. For example, lines
3, 20, and 40 would (if the file names were the same) appear in the
string order 20, 3 and 40. In contrast, \code{'nfl'} does a numeric
compare of the line numbers. In fact, \code{sort_stats('nfl')} is the
same as \code{sort_stats('name', 'file', 'line')}.
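
The odd ordering comes from comparing digits as characters. The short
illustration below uses plain Python and does not involve
\module{pstats} at all; the three line numbers are the ones from the
text.

```python
line_numbers = [3, 20, 40]

# Compared as text, '20' sorts before '3' because '2' < '3'.
as_strings = sorted(str(n) for n in line_numbers)

# Compared as numbers, the natural order is kept.
as_numbers = sorted(line_numbers)

print(as_strings)   # ['20', '3', '40']
print(as_numbers)   # [3, 20, 40]
```
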

For compatibility with the old profiler, the numeric arguments
\code{-1}, \code{0}, \code{1}, and \code{2} are permitted. They are
interpreted as \code{'stdname'}, \code{'calls'}, \code{'time'}, and
\code{'cumulative'} respectively. If this old style format (numeric)
is used, only one sort key (the numeric key) will be used, and
additional arguments will be silently ignored.
\end{methoddesc}


\begin{methoddesc}[Stats]{reverse_order}{}
This method for the \class{Stats} class reverses the ordering of the basic
list within the object. This method is provided primarily for
compatibility with the old profiler. Its utility is questionable
now that ascending vs descending order is properly selected based on
the sort key of choice.
\end{methoddesc}

\begin{methoddesc}[Stats]{print_stats}{restriction\optional{, ...}}
This method for the \class{Stats} class prints out a report as described
in the \function{profile.run()} definition.

The order of the printing is based on the last \method{sort_stats()}
operation done on the object (subject to caveats in \method{add()} and
\method{strip_dirs()}).

The arguments provided (if any) can be used to limit the list down to
the significant entries. Initially, the list is taken to be the
complete set of profiled functions. Each restriction is either an
integer (to select a count of lines), or a decimal fraction between
0.0 and 1.0 inclusive (to select a percentage of lines), or a regular
expression (to pattern match the standard name that is printed; as of
Python 1.5b1, this uses the Perl-style regular expression syntax
defined by the \refmodule{re} module). If several restrictions are
provided, then they are applied sequentially. For example:

\begin{verbatim}
print_stats(.1, 'foo:')
\end{verbatim}

would first limit the printing to the first 10\% of the list, and then
only print functions that were part of filename \samp{.*foo:}. In
contrast, the command:

\begin{verbatim}
print_stats('foo:', .1)
\end{verbatim}

would limit the list to all functions having file names \samp{.*foo:},
and then proceed to only print the first 10\% of them.
Fred Drake	8fe533e	1998-03-27 05:27:08 +0000	[diff] [blame]	479	\end{methoddesc}
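
The effect of restriction order can be tried out with a small sketch
such as the following. It is only an illustration, and it assumes a
current Python~3 interpreter in which \class{pstats.Stats} accepts a
\class{Profile} instance and a \code{stream} argument. The names
\code{foo_helper}, \code{bar_helper}, \code{workload} and
\code{report} are invented for the example.

\begin{verbatim}
import io
import profile
import pstats


def foo_helper():
    return sum(range(100))


def bar_helper():
    return sum(range(100))


def workload():
    for _ in range(5):
        foo_helper()
        bar_helper()


def report(*restrictions):
    """Profile workload() and return the text printed by print_stats()."""
    pr = profile.Profile()
    pr.runcall(workload)
    out = io.StringIO()
    stats = pstats.Stats(pr, stream=out)
    stats.sort_stats('cumulative').print_stats(*restrictions)
    return out.getvalue()


# Keep the first half of the full list, then the names matching 'helper'.
trim_then_filter = report(.5, 'helper')

# Keep the names matching 'helper' first, then the first half of those.
filter_then_trim = report('helper', .5)
\end{verbatim}

In the second call the pattern leaves two entries and the fraction
then keeps one of them. What the first call prints depends on which
entries happen to fall in the first half of the complete list.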


\begin{methoddesc}[Stats]{print_callers}{restrictions\optional{, ...}}
This method for the \class{Stats} class prints a list of all functions
that called each function in the profiled database. The ordering is
identical to that provided by \method{print_stats()}, and the definition
of the restricting argument is also identical. For convenience, a
number is shown in parentheses after each caller to show how many
times this specific call was made. A second non-parenthesized number
is the cumulative time spent in the function at the right.
\end{methoddesc}

\begin{methoddesc}[Stats]{print_callees}{restrictions\optional{, ...}}
This method for the \class{Stats} class prints a list of all functions
that were called by the indicated function. Aside from this reversal
of direction of calls (i.e., called vs.\ was called by), the arguments and
ordering are identical to the \method{print_callers()} method.
\end{methoddesc}
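
The two directions can be compared with a short sketch like the one
below. It assumes a current Python~3 interpreter and the pure-Python
\module{profile} module; \code{leaf}, \code{parent} and
\code{capture} are hypothetical names used only here.

\begin{verbatim}
import io
import profile
import pstats


def leaf():
    return sum(range(50))


def parent():
    total = 0
    for _ in range(3):
        total += leaf()
    return total


pr = profile.Profile()
pr.runcall(parent)
stats = pstats.Stats(pr, stream=io.StringIO())


def capture(method, pattern):
    """Run one of the Stats print methods and return what it printed."""
    stats.stream = io.StringIO()
    method(pattern)
    return stats.stream.getvalue()


# Which functions called leaf()?
callers_text = capture(stats.print_callers, 'leaf')

# Which functions did parent() call?
callees_text = capture(stats.print_callees, 'parent')
\end{verbatim}

The first report lists \code{parent} against \code{leaf}; the second
lists \code{leaf} against \code{parent}.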

\begin{methoddesc}[Stats]{ignore}{}
\deprecated{1.5.1}{This is not needed in modern versions of
Python.\footnote{
This was once necessary, when Python would print any unused expression
result that was not \code{None}. The method is still defined for
backward compatibility.}}
\end{methoddesc}


\section{Limitations \label{profile-limits}}

There are two fundamental limitations on this profiler. The first is
that it relies on the Python interpreter to dispatch \dfn{call},
\dfn{return}, and \dfn{exception} events. Compiled \C{} code does not
get interpreted, and hence is ``invisible'' to the profiler. All time
spent in \C{} code (including built-in functions) will be charged to the
Python function that invoked the \C{} code. If the \C{} code calls out
to some native Python code, then those calls will be profiled
properly.

The second limitation has to do with accuracy of timing information.
There is a fundamental problem with deterministic profilers involving
accuracy. The most obvious restriction is that the underlying ``clock''
is only ticking at a rate (typically) of about .001 seconds. Hence no
measurements will be more accurate than that underlying clock. If
enough measurements are taken, then the ``error'' will tend to average
out. Unfortunately, removing this first error induces a second source
of error...

The second problem is that it ``takes a while'' from when an event is
dispatched until the profiler's call to get the time actually
\emph{gets} the state of the clock. Similarly, there is a certain lag
when exiting the profiler event handler from the time that the clock's
value was obtained (and then squirreled away), until the user's code
is once again executing. As a result, functions that are called many
times, or call many functions, will typically accumulate this error.
The error that accumulates in this fashion is typically less than the
accuracy of the clock (i.e., less than one clock tick), but it
\emph{can} accumulate and become very significant. This profiler
provides a means of calibrating itself for a given platform so that
this error can be probabilistically (i.e., on the average) removed.
After the profiler is calibrated, it will be more accurate (in a least
squares sense), but it will sometimes produce negative numbers (when
call counts are exceptionally low, and the gods of probability work
against you :-).) Do \emph{not} be alarmed by negative numbers in
the profile. They should \emph{only} appear if you have calibrated
your profiler, and the results are actually better than without
calibration.
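
The clock granularity mentioned above can be observed directly. The
following sketch is not part of the profiler; it assumes a current
Python~3 interpreter with \function{time.perf_counter()}, and
\code{smallest_tick} is a hypothetical helper. It spins until the
timer value changes and records the smallest step seen.

\begin{verbatim}
import time


def smallest_tick(timer=time.perf_counter, samples=20):
    """Return the smallest nonzero step seen between timer readings."""
    best = None
    for _ in range(samples):
        start = timer()
        now = timer()
        while now == start:      # spin until the clock visibly advances
            now = timer()
        step = now - start
        if best is None or step < best:
            best = step
    return best


tick = smallest_tick()
\end{verbatim}

An event shorter than \code{tick} cannot be timed individually with
that timer; only the average over many events is meaningful.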


\section{Calibration \label{profile-calibration}}

The profiler class has a hard-coded constant that is subtracted from each
event handling time to compensate for the overhead of calling the time
function, and socking away the results. The following procedure can
be used to obtain this constant for a given platform (see the discussion
in section~\ref{profile-limits} above).

\begin{verbatim}
import profile
pr = profile.Profile()
print pr.calibrate(100)
print pr.calibrate(100)
print pr.calibrate(100)
\end{verbatim}

The argument to \method{calibrate()} is the number of times to try to
do the sample calls to get the CPU times. If your computer is
\emph{very} fast, you might have to do:

\begin{verbatim}
pr.calibrate(1000)
\end{verbatim}

or even:

\begin{verbatim}
pr.calibrate(10000)
\end{verbatim}

The object of this exercise is to get a fairly consistent result.
When you have a consistent answer, you are ready to use that number in
the source code. For a Sun Sparcstation 1000 running Solaris 2.3, the
magical number is about .00053. If you have a choice, you are better
off with a smaller constant, and your results will ``less often'' show
up as negative in profile statistics.
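
Checking for a ``fairly consistent result'' can be automated along the
following lines. This is a sketch only, written for a current
Python~3 interpreter; \code{estimate_bias} and its default arguments
are invented for the example.

\begin{verbatim}
import profile


def estimate_bias(trials=3, m=2000):
    """Call calibrate() several times; return the runs and their mean."""
    pr = profile.Profile()
    runs = [pr.calibrate(m) for _ in range(trials)]
    return runs, sum(runs) / len(runs)


runs, mean_bias = estimate_bias()
\end{verbatim}

If the values in \code{runs} disagree widely, repeat with a larger
\code{m}, as described above for fast machines.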

The following shows how the \method{trace_dispatch()} method in the
\class{Profile} class should be modified to install the calibration
constant on a Sun Sparcstation 1000:

\begin{verbatim}
def trace_dispatch(self, frame, event, arg):
    t = self.timer()
    t = t[0] + t[1] - self.t - .00053 # Calibration constant

    if self.dispatch[event](frame,t):
        t = self.timer()
        self.t = t[0] + t[1]
    else:
        r = self.timer()
        self.t = r[0] + r[1] - t # put back unrecorded delta
    return
\end{verbatim}

Note that if there is no calibration constant, then the line
containing the calibration constant should simply say:

\begin{verbatim}
    t = t[0] + t[1] - self.t # no calibration constant
\end{verbatim}

You can also achieve the same results using a derived class (and the
profiler will actually run equally fast!!), but the above method is
the simplest to use. I could have made the profiler ``self
calibrating'', but it would have made the initialization of the
profiler class slower, and would have required some \emph{very} fancy
coding, or else the use of a variable where the constant \samp{.00053}
was placed in the code shown. This is a \strong{VERY} critical
performance section, and there is no reason to use a variable lookup
at this point, when a constant can be used.
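
For comparison, in current Python~3 releases (an assumption relative
to the version this chapter describes) the \class{Profile} class reads
the constant from a \code{bias} attribute, so a derived class can
supply it without any change to the library source. A minimal sketch,
in which the class name \code{CalibratedProfile} and the value
\code{.00053} are purely illustrative:

\begin{verbatim}
import profile


class CalibratedProfile(profile.Profile):
    # Hypothetical value; substitute the number that calibrate()
    # reports on your own machine.
    bias = .00053


def work():
    return sum(range(1000))


pr = CalibratedProfile()
result = pr.runcall(work)
\end{verbatim}

As noted in the Limitations section, a constant that is too large for
the machine will show up as negative times in the report.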


\section{Extensions --- Deriving Better Profilers}
\nodename{Profiler Extensions}

The \class{Profile} class of module \module{profile} was written so that
derived classes could be developed to extend the profiler. Rather
than describing all the details of such an effort, I'll just present
the following two examples of derived classes that can be used to do
profiling. If the reader is an avid Python programmer, then it should
be possible to use these as a model and create similar (and perchance
better) profile classes.

If all you want to do is change how the timer is called, or which
timer function is used, then the basic class has an option for that in
the constructor for the class. Consider passing the name of a
function to call into the constructor:

\begin{verbatim}
pr = profile.Profile(your_time_func)
\end{verbatim}

The resulting profiler will call \code{your_time_func()} instead of
\function{os.times()}. The function should return either a single number
or a list of numbers (like what \function{os.times()} returns). If the
function returns a single time number, or the list of returned numbers
has length 2, then you will get an especially fast version of the
dispatch routine.
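
As a concrete but hedged illustration of passing a timer, the sketch
below uses \function{time.perf_counter()}, which returns a single
number. It assumes a current Python~3 interpreter; \code{busy} is a
made-up workload.

\begin{verbatim}
import io
import profile
import pstats
import time


def busy(n=2000):
    total = 0
    for i in range(n):
        total += i * i
    return total


# The timer is passed as the first constructor argument.
pr = profile.Profile(time.perf_counter)
result = pr.runcall(busy)

out = io.StringIO()
pstats.Stats(pr, stream=out).print_stats('busy')
report = out.getvalue()
\end{verbatim}

The times in \code{report} are then wall-clock readings from the
chosen timer rather than CPU times.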

Be warned that you \emph{should} calibrate the profiler class for the
timer function that you choose. For most machines, a timer that
returns a lone integer value will provide the best results in terms of
low overhead during profiling. (\function{os.times()} is
\emph{pretty} bad, 'cause it returns a tuple of floating point values,
so all arithmetic is floating point in the profiler!) If you want to
substitute a better timer in the cleanest fashion, you should derive a
class, and simply put in the replacement dispatch method that better
handles your timer call, along with the appropriate calibration
constant :-).


\subsection{OldProfile Class \label{profile-old}}

The following derived profiler simulates the old style profiler,
providing errant results on recursive functions. The reason for the
usefulness of this profiler is that it runs faster (i.e., less
overhead) than the old profiler. It still creates all the caller
stats, and is quite useful when there is \emph{no} recursion in the
user's code. It is also a lot more accurate than the old profiler, as
it does not charge all its overhead time to the user's code.

\begin{verbatim}
class OldProfile(Profile):

    def trace_dispatch_exception(self, frame, t):
        rt, rtt, rct, rfn, rframe, rcur = self.cur
        if rcur and not rframe is frame:
            return self.trace_dispatch_return(rframe, t)
        return 0

    def trace_dispatch_call(self, frame, t):
        fn = `frame.f_code`

        self.cur = (t, 0, 0, fn, frame, self.cur)
        if self.timings.has_key(fn):
            tt, ct, callers = self.timings[fn]
            self.timings[fn] = tt, ct, callers
        else:
            self.timings[fn] = 0, 0, {}
        return 1

    def trace_dispatch_return(self, frame, t):
        rt, rtt, rct, rfn, frame, rcur = self.cur
        rtt = rtt + t
        sft = rtt + rct

        pt, ptt, pct, pfn, pframe, pcur = rcur
        self.cur = pt, ptt+rt, pct+sft, pfn, pframe, pcur

        tt, ct, callers = self.timings[rfn]
        if callers.has_key(pfn):
            callers[pfn] = callers[pfn] + 1
        else:
            callers[pfn] = 1
        self.timings[rfn] = tt+rtt, ct + sft, callers

        return 1


    def snapshot_stats(self):
        self.stats = {}
        for func in self.timings.keys():
            tt, ct, callers = self.timings[func]
            nor_func = self.func_normalize(func)
            nor_callers = {}
            nc = 0
            for func_caller in callers.keys():
                nor_callers[self.func_normalize(func_caller)] = \
                    callers[func_caller]
                nc = nc + callers[func_caller]
            self.stats[nor_func] = nc, nc, tt, ct, nor_callers
\end{verbatim}

\subsection{HotProfile Class \label{profile-HotProfile}}

This profiler is the fastest derived profile example. It does not
calculate caller-callee relationships, and does not calculate
cumulative time under a function. It only calculates time spent in a
function, so it runs very quickly (i.e., with very low overhead). In truth,
the basic profiler is so fast that it is probably not worth giving up
the data for the savings, but this class still provides a nice example.

\begin{verbatim}
class HotProfile(Profile):

    def trace_dispatch_exception(self, frame, t):
        # self.cur holds four items here: (t, 0, frame, previous cur)
        rt, rtt, rframe, rcur = self.cur
        if rcur and not rframe is frame:
            return self.trace_dispatch_return(rframe, t)
        return 0

    def trace_dispatch_call(self, frame, t):
        self.cur = (t, 0, frame, self.cur)
        return 1

    def trace_dispatch_return(self, frame, t):
        rt, rtt, frame, rcur = self.cur

        rfn = `frame.f_code`

        pt, ptt, pframe, pcur = rcur
        self.cur = pt, ptt+rt, pframe, pcur

        if self.timings.has_key(rfn):
            nc, tt = self.timings[rfn]
            self.timings[rfn] = nc + 1, rt + rtt + tt
        else:
            self.timings[rfn] = 1, rt + rtt

        return 1


    def snapshot_stats(self):
        self.stats = {}
        for func in self.timings.keys():
            nc, tt = self.timings[func]
            nor_func = self.func_normalize(func)
            self.stats[nor_func] = nc, nc, tt, 0, {}
\end{verbatim}
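
The idea behind \class{HotProfile} (count calls and the time spent
inside each function, and nothing else) can also be sketched without
deriving from \class{Profile}. The following stand-alone example is
an assumption-laden illustration for a current Python~3 interpreter:
it installs its own handler with \function{sys.setprofile()}, and the
class name \code{SelfTimeProfiler} and its attributes are invented
here. It is not the profiler's actual implementation.

\begin{verbatim}
import sys
import time


class SelfTimeProfiler:
    """Record call counts and time spent inside each Python function."""

    def __init__(self, timer=time.perf_counter):
        self.timer = timer
        self.timings = {}    # function name -> [call count, own time]
        self.stack = []      # one [name, own time so far] per active call
        self.last = 0.0      # clock reading when the handler last exited

    def dispatch(self, frame, event, arg):
        now = self.timer()
        if self.stack:
            # Charge the elapsed time to the function that was running.
            self.stack[-1][1] += now - self.last
        if event == 'call':
            self.stack.append([frame.f_code.co_name, 0.0])
        elif event == 'return' and self.stack:
            name, own = self.stack.pop()
            entry = self.timings.setdefault(name, [0, 0.0])
            entry[0] += 1
            entry[1] += own
        self.last = self.timer()

    def runcall(self, func, *args, **kwargs):
        self.last = self.timer()
        sys.setprofile(self.dispatch)
        try:
            return func(*args, **kwargs)
        finally:
            sys.setprofile(None)


def leaf():
    return sum(range(100))


def parent():
    total = 0
    for _ in range(3):
        total += leaf()
    return total


prof = SelfTimeProfiler()
result = prof.runcall(parent)
\end{verbatim}

Time spent in built-in functions such as \function{sum()} is charged
to the Python function that called them, as described in the
Limitations section. No calibration constant is applied in this
sketch.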