________________________________________________________________________

PYBENCH - A Python Benchmark Suite
________________________________________________________________________

     Extendable suite of low-level benchmarks for measuring
          the performance of the Python implementation
                 (interpreter, compiler or VM).

pybench is a collection of tests that provides a standardized way to
measure the performance of Python implementations. It takes a very
close look at different aspects of Python programs and lets you
decide which factors are more important to you than others, rather
than wrapping everything up in one number, as other performance
tests do (e.g. pystone, which is included in the Python Standard
Library).

pybench has been used in the past by several Python developers to
track down performance bottlenecks or to demonstrate the impact of
optimizations and new features in Python.

The command line interface for pybench is the file pybench.py. Run
this script with option '--help' to get a listing of the possible
options. Without options, pybench will simply execute the benchmark
and then print out a report to stdout.


Micro-Manual
------------

Run 'pybench.py -h' to see the help screen. Run 'pybench.py' to run
the benchmark suite using default settings and 'pybench.py -f <file>'
to have it store the results in a file too.

It is usually a good idea to run pybench.py multiple times to see
whether the environment, timers and benchmark run-times are suitable
for doing benchmark tests.

You can use the comparison feature of pybench.py ('pybench.py -c
<file>') to check how well the system behaves in comparison to a
reference run.

If the differences are well below 10% for each test, then you have a
system that is good for doing benchmark testing. If you get random
differences of more than 10% or significant differences between the
values for minimum and average time, then you likely have some
background processes running which cause the readings to become
inconsistent. Examples include: web-browsers, email clients, RSS
readers, music players, backup programs, etc.
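
The 10% rule of thumb above can be sketched in a few lines of Python.
This is only an illustration, not pybench's own comparison code: the
function name unstable_tests, the dictionary layout (test name mapped
to a (minimum, average) pair in milliseconds), the sample numbers and
the choice to apply the same 10% threshold to the minimum/average
spread are all assumptions made for this example.

```python
def unstable_tests(reference, current, threshold=0.10):
    """Return names of tests whose timings look too noisy to trust.

    Both arguments map a test name to a (minimum, average) pair of
    timings in milliseconds (a hypothetical layout for this sketch).
    """
    noisy = []
    for name in sorted(current):
        if name not in reference:
            continue
        ref_min, _ = reference[name]
        cur_min, cur_avg = current[name]
        # Difference between the two runs, relative to the reference.
        run_diff = abs(cur_min - ref_min) / float(ref_min)
        # Spread between minimum and average within the current run.
        spread = (cur_avg - cur_min) / float(cur_min)
        if run_diff > threshold or spread > threshold:
            noisy.append(name)
    return noisy


# Made-up numbers, for illustration only.
reference = {"ForLoops": (110, 111), "TryExcept": (125, 126)}
current = {"ForLoops": (112, 113), "TryExcept": (150, 171)}
print(unstable_tests(reference, current))
```

A test that shows up in the returned list would be a hint to close
background programs and repeat the run before trusting the numbers.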

If you are only interested in a few tests of the whole suite, you can
use the filtering option, e.g. 'pybench.py -t string' will only
run/show the tests that have 'string' in their name.

This is the current output of pybench.py --help:

"""
------------------------------------------------------------------------
PYBENCH - a benchmark test suite for Python interpreters/compilers.
------------------------------------------------------------------------

Synopsis:
 pybench.py [option] files...

Options and default settings:
  -n arg           number of rounds (10)
  -f arg           save benchmark to file arg ()
  -c arg           compare benchmark with the one in file arg ()
  -s arg           show benchmark in file arg, then exit ()
  -w arg           set warp factor to arg (10)
  -t arg           run only tests with names matching arg ()
  -C arg           set the number of calibration runs to arg (20)
  -d               hide noise in comparisons (0)
  -v               verbose output (not recommended) (0)
  --with-gc        enable garbage collection (0)
  --with-syscheck  use default sys check interval (0)
  --timer arg      use given timer (time.time)
  -h               show this help text
  --help           show this help text
  --debug          enable debugging
  --copyright      show copyright
  --examples       show examples of usage

Version:
 2.0

The normal operation is to run the suite and display the
results. Use -f to save them for later reuse or comparisons.

Available timers:

   time.time
   time.clock
   systimes.processtime

Examples:

python2.1 pybench.py -f p21.pybench
python2.5 pybench.py -f p25.pybench
python pybench.py -s p25.pybench -c p21.pybench
"""

License
-------

See LICENSE file.


Sample output
-------------

"""
-------------------------------------------------------------------------------
PYBENCH 2.0
-------------------------------------------------------------------------------
* using Python 2.4.2
* disabled garbage collection
* system check interval set to maximum: 2147483647
* using timer: time.time

Calibrating tests. Please wait...

Running 10 round(s) of the suite at warp factor 10:

* Round 1 done in 6.388 seconds.
* Round 2 done in 6.485 seconds.
* Round 3 done in 6.786 seconds.
...
* Round 10 done in 6.546 seconds.

-------------------------------------------------------------------------------
Benchmark: 2006-06-12 12:09:25
-------------------------------------------------------------------------------

    Rounds: 10
    Warp:   10
    Timer:  time.time

    Machine Details:
       Platform ID:  Linux-2.6.8-24.19-default-x86_64-with-SuSE-9.2-x86-64
       Processor:    x86_64

    Python:
       Executable:   /usr/local/bin/python
       Version:      2.4.2
       Compiler:     GCC 3.3.4 (pre 3.3.5 20040809)
       Bits:         64bit
       Build:        Oct 1 2005 15:24:35 (#1)
       Unicode:      UCS2


Test                        minimum  average  operation  overhead
-------------------------------------------------------------------------------
BuiltinFunctionCalls:         126ms    145ms     0.28us   0.274ms
BuiltinMethodLookup:          124ms    130ms     0.12us   0.316ms
CompareFloats:                109ms    110ms     0.09us   0.361ms
CompareFloatsIntegers:        100ms    104ms     0.12us   0.271ms
CompareIntegers:              137ms    138ms     0.08us   0.542ms
CompareInternedStrings:       124ms    127ms     0.08us   1.367ms
CompareLongs:                 100ms    104ms     0.10us   0.316ms
CompareStrings:               111ms    115ms     0.12us   0.929ms
CompareUnicode:               108ms    128ms     0.17us   0.693ms
ConcatStrings:                142ms    155ms     0.31us   0.562ms
ConcatUnicode:                119ms    127ms     0.42us   0.384ms
CreateInstances:              123ms    128ms     1.14us   0.367ms
CreateNewInstances:           121ms    126ms     1.49us   0.335ms
CreateStringsWithConcat:      130ms    135ms     0.14us   0.916ms
CreateUnicodeWithConcat:      130ms    135ms     0.34us   0.361ms
DictCreation:                 108ms    109ms     0.27us   0.361ms
DictWithFloatKeys:            149ms    153ms     0.17us   0.678ms
DictWithIntegerKeys:          124ms    126ms     0.11us   0.915ms
DictWithStringKeys:           114ms    117ms     0.10us   0.905ms
ForLoops:                     110ms    111ms     4.46us   0.063ms
IfThenElse:                   118ms    119ms     0.09us   0.685ms
ListSlicing:                  116ms    120ms     8.59us   0.103ms
NestedForLoops:               125ms    137ms     0.09us   0.019ms
NormalClassAttribute:         124ms    136ms     0.11us   0.457ms
NormalInstanceAttribute:      110ms    117ms     0.10us   0.454ms
PythonFunctionCalls:          107ms    113ms     0.34us   0.271ms
PythonMethodCalls:            140ms    149ms     0.66us   0.141ms
Recursion:                    156ms    166ms     3.32us   0.452ms
SecondImport:                 112ms    118ms     1.18us   0.180ms
SecondPackageImport:          118ms    127ms     1.27us   0.180ms
SecondSubmoduleImport:        140ms    151ms     1.51us   0.180ms
SimpleComplexArithmetic:      128ms    139ms     0.16us   0.361ms
SimpleDictManipulation:       134ms    136ms     0.11us   0.452ms
SimpleFloatArithmetic:        110ms    113ms     0.09us   0.571ms
SimpleIntFloatArithmetic:     106ms    111ms     0.08us   0.548ms
SimpleIntegerArithmetic:      106ms    109ms     0.08us   0.544ms
SimpleListManipulation:       103ms    113ms     0.10us   0.587ms
SimpleLongArithmetic:         112ms    118ms     0.18us   0.271ms
SmallLists:                   105ms    116ms     0.17us   0.366ms
SmallTuples:                  108ms    128ms     0.24us   0.406ms
SpecialClassAttribute:        119ms    136ms     0.11us   0.453ms
SpecialInstanceAttribute:     143ms    155ms     0.13us   0.454ms
StringMappings:               115ms    121ms     0.48us   0.405ms
StringPredicates:             120ms    129ms     0.18us   2.064ms
StringSlicing:                111ms    127ms     0.23us   0.781ms
TryExcept:                    125ms    126ms     0.06us   0.681ms
TryRaiseExcept:               133ms    137ms     2.14us   0.361ms
TupleSlicing:                 117ms    120ms     0.46us   0.066ms
UnicodeMappings:              156ms    160ms     4.44us   0.429ms
UnicodePredicates:            117ms    121ms     0.22us   2.487ms
UnicodeProperties:            115ms    153ms     0.38us   2.070ms
UnicodeSlicing:               126ms    129ms     0.26us   0.689ms
-------------------------------------------------------------------------------
Totals:                      6283ms   6673ms
"""
Marc-André Lemburg	c311f64	2006-04-19 15:27:33 +0000	[diff] [blame]	209	________________________________________________________________________
				210
				211	Writing New Tests
				212	________________________________________________________________________
				213
				214	pybench tests are simple modules defining one or more pybench.Test
				215	subclasses.
				216
				217	Writing a test essentially boils down to providing two methods:
				218	.test() which runs .rounds number of .operations test operations each
				219	and .calibrate() which does the same except that it doesn't actually
				220	execute the operations.
				221
				222
				223	Here's an example:
				224	------------------
				225
				226	from pybench import Test
				227
				228	class IntegerCounting(Test):
				229
				230	# Version number of the test as float (x.yy); this is important
				231	# for comparisons of benchmark runs - tests with unequal version
				232	# number will not get compared.
				233	version = 1.0
				234
				235	# The number of abstract operations done in each round of the
				236	# test. An operation is the basic unit of what you want to
				237	# measure. The benchmark will output the amount of run-time per
				238	# operation. Note that in order to raise the measured timings
				239	# significantly above noise level, it is often required to repeat
				240	# sets of operations more than once per test round. The measured
				241	# overhead per test round should be less than 1 second.
				242	operations = 20
				243
				244	# Number of rounds to execute per test run. This should be
				245	# adjusted to a figure that results in a test run-time of between
Marc-André Lemburg	7d9743d	2006-06-13 18:56:56 +0000	[diff] [blame]	246	# 1-2 seconds (at warp 1).
Marc-André Lemburg	c311f64	2006-04-19 15:27:33 +0000	[diff] [blame]	247	rounds = 100000
				248
				249	def test(self):
				250
				251	""" Run the test.
				252
				253	The test needs to run self.rounds executing
				254	self.operations number of operations each.
				255
				256	"""
				257	# Init the test
				258	a = 1
				259
				260	# Run test rounds
				261	#
				262	# NOTE: Use xrange() for all test loops unless you want to face
				263	# a 20MB process !
				264	#
				265	for i in xrange(self.rounds):
				266
				267	# Repeat the operations per round to raise the run-time
				268	# per operation significantly above the noise level of the
				269	# for-loop overhead.
				270
				271	# Execute 20 operations (a += 1):
				272	a += 1
				273	a += 1
				274	a += 1
				275	a += 1
				276	a += 1
				277	a += 1
				278	a += 1
				279	a += 1
				280	a += 1
				281	a += 1
				282	a += 1
				283	a += 1
				284	a += 1
				285	a += 1
				286	a += 1
				287	a += 1
				288	a += 1
				289	a += 1
				290	a += 1
				291	a += 1
				292
				293	def calibrate(self):
				294
				295	""" Calibrate the test.
				296
				297	This method should execute everything that is needed to
				298	setup and run the test - except for the actual operations
				299	that you intend to measure. pybench uses this method to
				300	measure the test implementation overhead.
				301
				302	"""
				303	# Init the test
				304	a = 1
				305
				306	# Run test rounds (without actually doing any operation)
				307	for i in xrange(self.rounds):
				308
				309	# Skip the actual execution of the operations, since we
				310	# only want to measure the test's administration overhead.
				311	pass
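
To make the roles of .test() and .calibrate() more concrete, the
following sketch times both methods, takes the minimum over a few
runs and subtracts the calibration overhead to arrive at a time per
operation. It is a simplified stand-in, not pybench's measurement
code: the stub Test class, the elapsed() and measure() helpers, the
reduced round count and the clamping at zero are assumptions made for
this example. range() is used so that the sketch also runs on
Python 3; under Python 2, xrange() remains the better choice for the
reason given in the example above.

```python
import time


class Test(object):
    """Minimal stand-in for pybench.Test (hypothetical, for this sketch)."""
    operations = 1
    rounds = 1


class IntegerCounting(Test):
    operations = 5
    rounds = 20000   # kept small so the sketch finishes quickly

    def test(self):
        a = 1
        for i in range(self.rounds):
            # Five operations per round, matching .operations above.
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1

    def calibrate(self):
        a = 1
        for i in range(self.rounds):
            pass


def elapsed(func, timer=time.time):
    """Return the time taken by one call to func(), as seen by timer."""
    start = timer()
    func()
    return timer() - start


def measure(test, runs=5, timer=time.time):
    """Return (minimum, overhead, per-operation) times in seconds."""
    overhead = min(elapsed(test.calibrate, timer) for _ in range(runs))
    minimum = min(elapsed(test.test, timer) for _ in range(runs))
    # Time attributable to the operations themselves; clamp at zero
    # in case timer noise makes the overhead look larger.
    effective = max(minimum - overhead, 0.0)
    per_operation = effective / (test.rounds * test.operations)
    return minimum, overhead, per_operation


minimum, overhead, per_op = measure(IntegerCounting())
print("minimum %.3fms  overhead %.3fms  per operation %.3fus"
      % (minimum * 1e3, overhead * 1e3, per_op * 1e6))
```

The timer is passed in as a parameter so that a different clock can
be substituted without touching the measurement logic.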

Registering a new test module
-----------------------------

To register a test module with pybench, the classes need to be
imported into the pybench.Setup module. pybench will then scan all the
symbols defined in that module for subclasses of pybench.Test and
automatically add them to the benchmark suite.
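
The scanning step can be pictured roughly as follows. This sketch is
not pybench's actual loader: the stand-in Test class, the find_tests()
helper and the synthetic Setup module built with types.ModuleType are
all assumptions made for illustration, and the real implementation
may detect test classes differently. In a real installation the
registration itself would simply be an import statement placed in the
Setup module, for example 'from MyTests import *' for a hypothetical
module MyTests.

```python
import types


class Test(object):
    """Stand-in for pybench.Test (hypothetical)."""


class IntegerCounting(Test):
    version = 1.0


class NotATest(object):
    pass


def find_tests(module, base=Test):
    """Return {name: class} for subclasses of base found in module."""
    found = {}
    for name, obj in vars(module).items():
        # Keep classes derived from base, but not the base class itself.
        if isinstance(obj, type) and issubclass(obj, base) and obj is not base:
            found[name] = obj
    return found


# Simulate a Setup module into which some names were imported.
setup = types.ModuleType("Setup")
setup.Test = Test
setup.IntegerCounting = IntegerCounting
setup.NotATest = NotATest

print(sorted(find_tests(setup)))
```

Only IntegerCounting is picked up here; the base class and the
unrelated class are skipped.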


Breaking Comparability
----------------------

If a change is made to any individual test that means it is no
longer strictly comparable with previous runs, the '.version' class
variable should be updated. Thereafter, comparisons with previous
versions of the test will list as "n/a" to reflect the change.
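
As an illustration of this rule, a comparison routine could check the
version numbers before reporting a difference. The function below is
a hypothetical sketch, not pybench's reporting code; the name
compare_entry, its arguments and the percentage format are
assumptions made for this example.

```python
def compare_entry(cur_version, cur_min, ref_version, ref_min):
    """Return a comparison string for one test, or "n/a".

    Timings are only compared when both runs used the same test
    version; otherwise the results are not strictly comparable.
    """
    if cur_version != ref_version or not ref_min:
        return "n/a"
    change = (cur_min - ref_min) / float(ref_min) * 100.0
    return "%+.1f%%" % change


print(compare_entry(1.0, 110.0, 1.0, 100.0))   # same test version
print(compare_entry(1.1, 110.0, 1.0, 100.0))   # version was bumped
```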


Version History
---------------

  2.0: rewrote parts of pybench which resulted in more repeatable
       timings:
        - made timer a parameter
        - changed the platform default timer to use high-resolution
          timers rather than process timers (which have a much lower
          resolution)
        - added option to select timer
        - added process time timer (using systimes.py)
        - changed to use min() as timing estimator (average
          is still taken as well to provide an idea of the difference)
        - garbage collection is turned off per default
        - sys check interval is set to the highest possible value
        - calibration is now a separate step and done using
          a different strategy that allows measuring the test
          overhead more accurately
        - modified the tests to each give a run-time of between
          100-200ms using warp 10
        - changed default warp factor to 10 (from 20)
        - compared results with timeit.py and confirmed measurements
        - bumped all test versions to 2.0
        - updated platform.py to the latest version
        - changed the output format a bit to make it look
          nicer
        - refactored the APIs somewhat
  1.3+: Steve Holden added the NewInstances test and the filtering
       option during the NeedForSpeed sprint; this also triggered a long
       discussion on how to improve benchmark timing and finally
       resulted in the release of 2.0
  1.3: initial checkin into the Python SVN repository


Have fun,
--
Marc-Andre Lemburg
mal@lemburg.com
Marc-André Lemburg	c311f64	2006-04-19 15:27:33 +0000	[diff] [blame]	365	Have fun,
				366	--
				367	Marc-Andre Lemburg
				368	mal@lemburg.com