stringbench is a set of performance tests comparing byte string
operations with unicode operations. The two string implementations
are loosely based on each other and sometimes the algorithm for one is
faster than the other.

This test set was started at the Need For Speed sprint in Reykjavik
to identify which string methods could be sped up quickly and to
identify obvious places for improvement.

Here is an example of a benchmark:


@bench('"Andrew".startswith("A")', 'startswith single character', 1000)
def startswith_single(STR):
    # STR is the string class being tested (bytes or unicode)
    s1 = STR("Andrew")
    s2 = STR("A")
    # Bind the method once so the loop measures the call, not the lookup.
    s1_startswith = s1.startswith
    for x in _RANGE_1000:
        s1_startswith(s2)

The bench decorator takes three parameters. The first is a short
description of how the code works. In most cases this is a Python
code snippet. It is not the code which is actually run, because the
real code is hand-optimized to focus on the method being tested.

The second parameter is a group title. All benchmarks with the same
group title are listed together. This lets you compare different
implementations of the same algorithm, such as "t in s"
vs. "s.find(t)", as in the sketch below.

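For instance, two benchmarks sharing a group title might look like
this (a sketch in the style of the example above; the function names
are illustrative, not necessarily those used in stringbench):

@bench('"A" in "A"*1000', 'early match, single character', 1000)
def in_early_match_single(STR):
    s1 = STR("A" * 1000)
    s2 = STR("A")
    for x in _RANGE_1000:
        s2 in s1

@bench('("A"*1000).find("A")', 'early match, single character', 1000)
def find_early_match_single(STR):
    s1 = STR("A" * 1000)
    s2 = STR("A")
    s1_find = s1.find
    for x in _RANGE_1000:
        s1_find(s2)
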
The last parameter is a loop count. Each benchmark loops over the
algorithm either 100 or 1000 times, depending on its performance.
The reported time is the time per benchmark call, so the reader needs
the count in order to scale the result to a single operation.

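For example, to get the time per individual operation, divide the
reported time by the count (numbers taken from the sample output
further below):

time_per_call_ms = 1.14   # reported for ("A"*1000).find("A") (*1000)
count = 1000              # the benchmark's inner loop count
time_per_op_us = time_per_call_ms / count * 1000
# -> 1.14 microseconds per individual find() call
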
These parameters become function attributes.

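A minimal sketch of how such a decorator could work (attribute names
like is_bench are assumptions; the actual implementation in
stringbench.py may differ in detail):

def bench(title, group, count):
    # Attach the parameters as attributes on the benchmark function
    # so the test runner can discover and group the benchmarks.
    def wrapper(func):
        func.is_bench = True
        func.title = title
        func.group = group
        func.count = count
        return func
    return wrapper
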
Here is an example of the output:


========== count newlines
38.54   41.60   92.7    ...text.with.2000.newlines.count("\n") (*100)
========== early match, single character
1.14    1.18    96.8    ("A"*1000).find("A") (*1000)
0.44    0.41    105.6   "A" in "A"*1000 (*1000)
1.15    1.17    98.1    ("A"*1000).index("A") (*1000)

The first column is the run time in milliseconds for byte strings.
The second is the run time for unicode strings. The third is a
percentage: byte time / unicode time, times 100. Values above 100
mean the unicode implementation is faster; values below 100 mean the
byte string implementation is faster.

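For the first row above, that works out as follows (the table shows
92.7, presumably because it is computed from the unrounded times):

byte_ms, unicode_ms = 38.54, 41.60
percent = byte_ms / unicode_ms * 100
# -> 92.6; below 100, so the byte string version is faster here
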
The last column contains the code snippet and the repeat count for the
internal benchmark loop.

The times are computed with 'timeit.py', which repeats the test more
and more times until the total run time exceeds 0.2 seconds, and
returns the best time for a single iteration.

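The measurement strategy resembles the standard timeit pattern,
roughly like this (a sketch of the idea only, not stringbench's
actual code):

import timeit

# Increase the loop count until one run takes longer than 0.2 seconds,
# then report the best (smallest) per-iteration time over a few runs.
timer = timeit.Timer("s.startswith('A')", "s = 'Andrew'")
number = 1
while timer.timeit(number) < 0.2:
    number *= 10
best = min(timer.repeat(repeat=3, number=number)) / number
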
The final line of the output is the cumulative time for byte and
unicode strings, and the overall performance of unicode relative to
bytes. For example:

4079.83 5432.25 75.1    TOTAL

However, this total has no real meaning, since it weights every test
equally.