________________________________________________________________________

PYBENCH - A Python Benchmark Suite
________________________________________________________________________

   Extendable suite of low-level benchmarks for measuring
   the performance of the Python implementation
   (interpreter, compiler or VM).

pybench is a collection of tests that provides a standardized way to
measure the performance of Python implementations. It takes a very
close look at different aspects of Python programs and lets you decide
which factors are more important to you than others, rather than
wrapping everything up in one number, as other performance tests do
(e.g. pystone, which is included in the Python Standard Library).

pybench has been used in the past by several Python developers to
track down performance bottlenecks or to demonstrate the impact of
optimizations and new features in Python.

The command line interface for pybench is the file pybench.py. Run
this script with option '--help' to get a listing of the possible
options. Without options, pybench will simply execute the benchmark
and then print out a report to stdout.


Micro-Manual
------------

Run 'pybench.py -h' to see the help screen. Run 'pybench.py' to run
the benchmark suite using default settings and 'pybench.py -f <file>'
to have it store the results in a file too.

It is usually a good idea to run pybench.py multiple times to see
whether the environment, timers and benchmark run-times are suitable
for doing benchmark tests.

You can use the comparison feature of pybench.py ('pybench.py -c
<file>') to check how well the system behaves in comparison to a
reference run.

If the differences are well below 10% for each test, then you have a
system that is well suited to benchmark testing. If you get random
differences of more than 10%, or significant differences between the
minimum and average times, then you likely have background processes
running which cause the readings to become inconsistent. Examples
include: web-browsers, email clients, RSS readers, music players,
backup programs, etc.

If you are only interested in a few tests of the whole suite, you can
use the filtering option, e.g. 'pybench.py -t string' will only
run/show the tests that have 'string' in their name.

This is the current output of pybench.py --help:

"""
------------------------------------------------------------------------
PYBENCH - a benchmark test suite for Python interpreters/compilers.
------------------------------------------------------------------------

Synopsis:
 pybench.py [option] files...

Options and default settings:
  -n arg           number of rounds (10)
  -f arg           save benchmark to file arg ()
  -c arg           compare benchmark with the one in file arg ()
  -s arg           show benchmark in file arg, then exit ()
  -w arg           set warp factor to arg (10)
  -t arg           run only tests with names matching arg ()
  -C arg           set the number of calibration runs to arg (20)
  -d               hide noise in comparisons (0)
  -v               verbose output (not recommended) (0)
  --with-gc        enable garbage collection (0)
  --with-syscheck  use default sys check interval (0)
  --timer arg      use given timer (time.time)
  -h               show this help text
  --help           show this help text
  --debug          enable debugging
  --copyright      show copyright
  --examples       show examples of usage

Version:
 2.1

The normal operation is to run the suite and display the
results. Use -f to save them for later reuse or comparisons.

Available timers:

   time.time
   time.clock
   systimes.processtime

Examples:

python3.0 pybench.py -f p30.pybench
python3.1 pybench.py -f p31.pybench
python pybench.py -s p31.pybench -c p30.pybench
"""

License
-------

See LICENSE file.


Sample output
-------------

"""
-------------------------------------------------------------------------------
PYBENCH 2.1
-------------------------------------------------------------------------------
* using CPython 3.0
* disabled garbage collection
* system check interval set to maximum: 2147483647
* using timer: time.time

Calibrating tests. Please wait...

Running 10 round(s) of the suite at warp factor 10:

* Round 1 done in 6.388 seconds.
* Round 2 done in 6.485 seconds.
* Round 3 done in 6.786 seconds.
...
* Round 10 done in 6.546 seconds.

-------------------------------------------------------------------------------
Benchmark: 2006-06-12 12:09:25
-------------------------------------------------------------------------------

    Rounds: 10
    Warp:   10
    Timer:  time.time

    Machine Details:
        Platform ID:    Linux-2.6.8-24.19-default-x86_64-with-SuSE-9.2-x86-64
        Processor:      x86_64

    Python:
        Implementation: CPython
        Executable:     /usr/local/bin/python
        Version:        3.0
        Compiler:       GCC 3.3.4 (pre 3.3.5 20040809)
        Bits:           64bit
        Build:          Oct  1 2005 15:24:35 (#1)
        Unicode:        UCS2


Test                             minimum  average  operation  overhead
-------------------------------------------------------------------------------
          BuiltinFunctionCalls:    126ms    145ms    0.28us    0.274ms
           BuiltinMethodLookup:    124ms    130ms    0.12us    0.316ms
                 CompareFloats:    109ms    110ms    0.09us    0.361ms
         CompareFloatsIntegers:    100ms    104ms    0.12us    0.271ms
               CompareIntegers:    137ms    138ms    0.08us    0.542ms
        CompareInternedStrings:    124ms    127ms    0.08us    1.367ms
                  CompareLongs:    100ms    104ms    0.10us    0.316ms
                CompareStrings:    111ms    115ms    0.12us    0.929ms
                CompareUnicode:    108ms    128ms    0.17us    0.693ms
                 ConcatStrings:    142ms    155ms    0.31us    0.562ms
                 ConcatUnicode:    119ms    127ms    0.42us    0.384ms
               CreateInstances:    123ms    128ms    1.14us    0.367ms
            CreateNewInstances:    121ms    126ms    1.49us    0.335ms
       CreateStringsWithConcat:    130ms    135ms    0.14us    0.916ms
       CreateUnicodeWithConcat:    130ms    135ms    0.34us    0.361ms
                  DictCreation:    108ms    109ms    0.27us    0.361ms
             DictWithFloatKeys:    149ms    153ms    0.17us    0.678ms
           DictWithIntegerKeys:    124ms    126ms    0.11us    0.915ms
            DictWithStringKeys:    114ms    117ms    0.10us    0.905ms
                      ForLoops:    110ms    111ms    4.46us    0.063ms
                    IfThenElse:    118ms    119ms    0.09us    0.685ms
                   ListSlicing:    116ms    120ms    8.59us    0.103ms
                NestedForLoops:    125ms    137ms    0.09us    0.019ms
          NormalClassAttribute:    124ms    136ms    0.11us    0.457ms
       NormalInstanceAttribute:    110ms    117ms    0.10us    0.454ms
           PythonFunctionCalls:    107ms    113ms    0.34us    0.271ms
             PythonMethodCalls:    140ms    149ms    0.66us    0.141ms
                     Recursion:    156ms    166ms    3.32us    0.452ms
                  SecondImport:    112ms    118ms    1.18us    0.180ms
           SecondPackageImport:    118ms    127ms    1.27us    0.180ms
         SecondSubmoduleImport:    140ms    151ms    1.51us    0.180ms
       SimpleComplexArithmetic:    128ms    139ms    0.16us    0.361ms
        SimpleDictManipulation:    134ms    136ms    0.11us    0.452ms
         SimpleFloatArithmetic:    110ms    113ms    0.09us    0.571ms
      SimpleIntFloatArithmetic:    106ms    111ms    0.08us    0.548ms
       SimpleIntegerArithmetic:    106ms    109ms    0.08us    0.544ms
        SimpleListManipulation:    103ms    113ms    0.10us    0.587ms
          SimpleLongArithmetic:    112ms    118ms    0.18us    0.271ms
                    SmallLists:    105ms    116ms    0.17us    0.366ms
                   SmallTuples:    108ms    128ms    0.24us    0.406ms
         SpecialClassAttribute:    119ms    136ms    0.11us    0.453ms
      SpecialInstanceAttribute:    143ms    155ms    0.13us    0.454ms
                StringMappings:    115ms    121ms    0.48us    0.405ms
              StringPredicates:    120ms    129ms    0.18us    2.064ms
                 StringSlicing:    111ms    127ms    0.23us    0.781ms
                     TryExcept:    125ms    126ms    0.06us    0.681ms
                TryRaiseExcept:    133ms    137ms    2.14us    0.361ms
                  TupleSlicing:    117ms    120ms    0.46us    0.066ms
               UnicodeMappings:    156ms    160ms    4.44us    0.429ms
             UnicodePredicates:    117ms    121ms    0.22us    2.487ms
             UnicodeProperties:    115ms    153ms    0.38us    2.070ms
                UnicodeSlicing:    126ms    129ms    0.26us    0.689ms
-------------------------------------------------------------------------------
Totals:                             6283ms   6673ms
"""
________________________________________________________________________

Writing New Tests
________________________________________________________________________

pybench tests are simple modules defining one or more pybench.Test
subclasses.

Writing a test essentially boils down to providing two methods:
.test(), which runs .rounds rounds of .operations operations each,
and .calibrate(), which does the same except that it doesn't actually
execute the operations.


Here's an example:
------------------

from pybench import Test

class IntegerCounting(Test):

    # Version number of the test as float (x.yy); this is important
    # for comparisons of benchmark runs - tests with unequal version
    # number will not get compared.
    version = 1.0

    # The number of abstract operations done in each round of the
    # test. An operation is the basic unit of what you want to
    # measure. The benchmark will output the amount of run-time per
    # operation. Note that in order to raise the measured timings
    # significantly above noise level, it is often required to repeat
    # sets of operations more than once per test round. The measured
    # overhead per test round should be less than 1 second.
    operations = 20

    # Number of rounds to execute per test run. This should be
    # adjusted to a figure that results in a test run-time of between
    # 1-2 seconds (at warp 1).
    rounds = 100000

    def test(self):

        """ Run the test.

            The test needs to run self.rounds rounds executing
            self.operations operations each.

        """
        # Init the test
        a = 1

        # Run test rounds
        #
        for i in range(self.rounds):

            # Repeat the operations per round to raise the run-time
            # per operation significantly above the noise level of the
            # for-loop overhead.

            # Execute 20 operations (a += 1):
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1
            a += 1

    def calibrate(self):

        """ Calibrate the test.

            This method should execute everything that is needed to
            setup and run the test - except for the actual operations
            that you intend to measure. pybench uses this method to
            measure the test implementation overhead.

        """
        # Init the test
        a = 1

        # Run test rounds (without actually doing any operation)
        for i in range(self.rounds):

            # Skip the actual execution of the operations, since we
            # only want to measure the test's administration overhead.
            pass

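Putting the pieces above together, the per-operation figure that
pybench reports can be approximated as follows. This is a simplified
sketch using illustrative names, not pybench's actual internals:

```python
def per_operation_time(test_time, calibration_time, rounds, operations):
    """Estimate the run-time per abstract operation in seconds.

    test_time        -- measured duration of Test.test()
    calibration_time -- measured duration of Test.calibrate()
    rounds, operations -- as defined on the Test subclass
    """
    # Subtract the administrative overhead measured by .calibrate(),
    # then divide by the total number of operations executed.
    return (test_time - calibration_time) / (rounds * operations)

# 0.150s test run, 0.010s calibration overhead, 100000 rounds of 20 ops:
print(per_operation_time(0.150, 0.010, 100000, 20))
```

This also shows why the docs above insist on repeating operations per
round: the larger rounds * operations is, the less the timer noise in
the two measured durations matters.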
Registering a new test module
-----------------------------

To register a test module with pybench, the classes need to be
imported into the pybench.Setup module. pybench will then scan all the
symbols defined in that module for subclasses of pybench.Test and
automatically add them to the benchmark suite.

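
The discovery step can be sketched like this. This is a minimal
illustration of the scanning logic described above, with a stand-in
Test base class; it is not pybench's actual code:

```python
import inspect

class Test:
    """Stand-in for the pybench.Test base class."""

def find_tests(namespace):
    """Collect all Test subclasses defined in a module namespace."""
    return [obj for obj in namespace.values()
            if inspect.isclass(obj) and issubclass(obj, Test)
            and obj is not Test]

# Simulate a test module whose symbols were imported into Setup:
class IntegerCounting(Test):
    pass

print([cls.__name__ for cls in find_tests(globals())])
```

Because discovery is by subclass scan, simply importing your module's
names into Setup is enough; no explicit registration call is needed.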

Breaking Comparability
----------------------

If a change is made to any individual test that means it is no
longer strictly comparable with previous runs, the '.version' class
variable should be updated. Thereafter, comparisons with previous
versions of the test will list as "n/a" to reflect the change.

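The effect of a version bump on comparisons can be illustrated with a
hypothetical helper (this is not the actual pybench code):

```python
def compare_entry(old, new):
    """Compare one test across two runs; old/new are (version, minimum_ms).

    Returns 'n/a' when the test versions differ, otherwise the
    percentage change of the minimum time relative to the reference run.
    """
    old_version, old_min = old
    new_version, new_min = new
    if old_version != new_version:
        # The test changed between the two runs: not comparable.
        return "n/a"
    return "%+.1f%%" % ((new_min - old_min) / old_min * 100.0)

print(compare_entry((1.0, 126.0), (1.0, 113.4)))  # -10.0%
print(compare_entry((1.0, 126.0), (2.0, 113.4)))  # n/a
```
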

Version History
---------------

 2.1: made some minor changes for compatibility with Python 3.0:
      - replaced cmp with divmod and range with max in Calls.py
        (cmp no longer exists in 3.0, and range is a list in
        Python 2.x and an iterator in Python 3.x)

 2.0: rewrote parts of pybench which resulted in more repeatable
      timings:
      - made timer a parameter
      - changed the platform default timer to use high-resolution
        timers rather than process timers (which have a much lower
        resolution)
      - added option to select timer
      - added process time timer (using systimes.py)
      - changed to use min() as timing estimator (average
        is still taken as well to provide an idea of the difference)
      - garbage collection is turned off per default
      - sys check interval is set to the highest possible value
      - calibration is now a separate step and done using
        a different strategy that allows measuring the test
        overhead more accurately
      - modified the tests to each give a run-time of between
        100-200ms using warp 10
      - changed default warp factor to 10 (from 20)
      - compared results with timeit.py and confirmed measurements
      - bumped all test versions to 2.0
      - updated platform.py to the latest version
      - changed the output format a bit to make it look
        nicer
      - refactored the APIs somewhat
 1.3+: Steve Holden added the NewInstances test and the filtering
       option during the NeedForSpeed sprint; this also triggered a long
       discussion on how to improve benchmark timing and finally
       resulted in the release of 2.0
 1.3: initial checkin into the Python SVN repository


Have fun,
--
Marc-Andre Lemburg
mal@lemburg.com