<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
  <title>Clang - Performance</title>
  <link type="text/css" rel="stylesheet" href="menu.css" />
  <link type="text/css" rel="stylesheet" href="content.css" />
  <style type="text/css">
  </style>
</head>
<body>

<!--#include virtual="menu.html.incl"-->

<div id="content">

<!--*************************************************************************-->
<h1>Clang - Performance</h1>
<!--*************************************************************************-->

<p>This page tracks the compile-time performance of Clang on two
interesting benchmarks:</p>
<ul>
  <li><i>Sketch</i>: The Objective-C example application shipped on
    Mac OS X as part of Xcode. <i>Sketch</i> is indicative of a
    "typical" Objective-C app. The source itself is relatively
    small (~7,500 lines of source code), but it relies
    on the extensive Cocoa APIs to build its functionality. Like many
    Objective-C applications, it includes
    <tt>Cocoa/Cocoa.h</tt> in all of its source files, which makes it a
    significant stress test of the front-end's performance on lexing,
    preprocessing, parsing, and semantic analysis.</li>
  <li><i>176.gcc</i>: This is the gcc-2.7.2.2 code base as present in
    SPECINT 2000. In contrast to <i>Sketch</i>, <i>176.gcc</i> consists of a
    large amount of C source code (~220,000 lines) with few system
    dependencies. This stresses the back-end's performance on generating
    assembly code and debug information.</li>
</ul>

<!--*************************************************************************-->
<h2><a name="enduser">Experiments</a></h2>
<!--*************************************************************************-->

<p>Measurements are taken by serially processing each file in the
respective benchmark, using Clang, gcc, and llvm-gcc as compilers. In
order to track the performance of various subsystems, the timings have
been broken down into separate stages where possible:</p>

<ul>
  <li><tt>-Eonly</tt>: This option runs the preprocessor but does not
    produce any output. For gcc and llvm-gcc, the <tt>-MM</tt> option is used
    as a rough equivalent to this step.</li>
  <li><tt>-parse-noop</tt>: This option runs the parser on the input,
    but without semantic analysis or any output. gcc and llvm-gcc have
    no equivalent for this option.</li>
  <li><tt>-fsyntax-only</tt>: This option runs the parser with semantic
    analysis.</li>
  <li><tt>-emit-llvm -O0</tt>: For Clang and llvm-gcc, this option
    converts to the LLVM intermediate representation but doesn't
    generate native code.</li>
  <li><tt>-S -O0</tt>: Perform actual code generation to produce a
    native assembler file.</li>
  <li><tt>-S -O0 -g</tt>: This adds emission of debug information to
    the assembly output.</li>
</ul>

<p>This set of stages is chosen to be approximately additive; that is,
each subsequent stage simply adds some additional processing. The
timings measure the delta of the given stage from the previous
one. For example, the timings for <tt>-fsyntax-only</tt> below show
the difference between running with <tt>-fsyntax-only</tt> and running
with <tt>-parse-noop</tt> (for Clang) or <tt>-MM</tt> (for gcc and
llvm-gcc). This amounts to a fairly accurate measure of only the time
to perform semantic analysis (and parsing, in the case of gcc and llvm-gcc).</p>
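
<p>As a rough illustration of the delta computation described above, the
following Python sketch subtracts the previous measured stage's cumulative time
from each stage's cumulative time. The function name <tt>stage_deltas</tt>, the
dictionary layout, and all numbers are hypothetical; they are not taken from
the measurements on this page or from the scripts used to produce them.</p>

```python
# Stage order as listed on this page. For gcc and llvm-gcc the "-Eonly" slot
# is assumed to hold the -MM timing, since that is the rough equivalent.
STAGES = [
    "-Eonly",
    "-parse-noop",
    "-fsyntax-only",
    "-emit-llvm -O0",
    "-S -O0",
    "-S -O0 -g",
]


def stage_deltas(cumulative):
    """Return the per-stage cost from cumulative timings.

    `cumulative` maps a stage name to the total time (seconds) of a run that
    stops at that stage. Stages absent from the mapping (for example
    -parse-noop for gcc) are skipped, so the next stage's delta is taken
    against the last stage that was actually measured.
    """
    deltas = {}
    previous = 0.0
    for stage in STAGES:
        if stage not in cumulative:
            continue
        deltas[stage] = cumulative[stage] - previous
        previous = cumulative[stage]
    return deltas


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the arithmetic.
    clang_like = {"-Eonly": 1.0, "-parse-noop": 1.5, "-fsyntax-only": 2.5}
    gcc_like = {"-Eonly": 2.0, "-fsyntax-only": 5.0}  # no -parse-noop stage
    print(stage_deltas(clang_like))
    print(stage_deltas(gcc_like))
```

<p>With this layout, the <tt>-fsyntax-only</tt> delta for a compiler without a
<tt>-parse-noop</tt> stage includes parsing as well as semantic analysis, which
matches the caveat stated for gcc and llvm-gcc.</p>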

<p>Note that we already know that the LLVM optimizers are substantially (30-40%)
faster than the GCC optimizers at a given <tt>-O</tt> level, so we only focus on
<tt>-O0</tt> compile time here.</p>

<!--*************************************************************************-->
<h2><a name="enduser">Timing Results</a></h2>
<!--*************************************************************************-->

<!--=======================================================================-->
<h3><a name="2008-10-31">2008-10-31</a></h3>
<!--=======================================================================-->

<center><h4>Sketch</h4></center>
<img class="img_slide"
     src="timing-data/2008-10-31/sketch.png" alt="Sketch Timings"/>

<p>This shows Clang's substantial performance improvements in
preprocessing and semantic analysis: it is over 90% faster on
<tt>-fsyntax-only</tt>. As expected, time spent in code generation for this
benchmark is relatively small. One caveat: Clang's debug information
generation for Objective-C is very incomplete, so the <tt>-S
-O0 -g</tt> numbers are unfair because Clang is generating substantially
less output.</p>

<p>This chart also shows the effect of using precompiled headers (PCH)
on compile time. gcc and llvm-gcc see a large performance improvement
with PCH: about 4x in wall time. Unfortunately, Clang does not yet
have an implementation of PCH-style optimizations, but we are actively
working to address this.</p>

<center><h4>176.gcc</h4></center>
<img class="img_slide"
     src="timing-data/2008-10-31/176.gcc.png" alt="176.gcc Timings"/>

<p>Unlike the <i>Sketch</i> timings, compilation of <i>176.gcc</i>
involves a large amount of code generation. The time spent in Clang's
LLVM IR generation and code generation is on par with gcc's code
generation time, but the improved parsing and semantic analysis
performance means Clang still comes in ~29% faster than gcc
on <tt>-S -O0 -g</tt> and ~20% faster than llvm-gcc.</p>

<p>These numbers indicate that Clang still has room for improvement in
several areas. Notably, our LLVM IR generation is significantly slower
than that of llvm-gcc, and both Clang and llvm-gcc incur a
significantly higher cost for adding debugging information than
gcc does.</p>

</div>
</body>
</html>