<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
  <title>Clang - Performance</title>
  <link type="text/css" rel="stylesheet" href="menu.css" />
  <link type="text/css" rel="stylesheet" href="content.css" />
  <style type="text/css">
</style>
</head>
<body>

<!--#include virtual="menu.html.incl"-->

<div id="content">

<!--*************************************************************************-->
<h1>Clang - Performance</h1>
<!--*************************************************************************-->

<p>This page tracks the compile-time performance of Clang on two
interesting benchmarks:</p>
<ul>
  <li><i>Sketch</i>: The Objective-C example application shipped on
    Mac OS X as part of Xcode. <i>Sketch</i> is indicative of a
    "typical" Objective-C app. The source itself is relatively small
    (~7,500 lines of source code), but it relies
    on the extensive Cocoa APIs to build its functionality. Like many
    Objective-C applications, it includes
    <tt>Cocoa/Cocoa.h</tt> in all of its source files, which makes it a
    significant stress test of the front-end's performance on lexing,
    preprocessing, parsing, and semantic analysis.</li>
  <li><i>176.gcc</i>: This is the gcc-2.7.2.2 code base as present in
    SPECINT 2000. In contrast to <i>Sketch</i>, <i>176.gcc</i> consists of a
    large amount of C source code (~220,000 lines) with few system
    dependencies. This stresses the back-end's performance on generating
    assembly code and debug information.</li>
</ul>

<!--*************************************************************************-->
<h2><a name="enduser">Experiments</a></h2>
<!--*************************************************************************-->

<p>Measurements are done by serially processing each file in the
respective benchmark, using Clang, gcc, and llvm-gcc as compilers. In
order to track the performance of various subsystems, the timings have
been broken down into separate stages where possible:</p>

<ul>
  <li><tt>-Eonly</tt>: This option runs the preprocessor but does not
    produce any output. For gcc and llvm-gcc, the <tt>-MM</tt> option is used
    as a rough equivalent to this step.</li>
  <li><tt>-parse-noop</tt>: This option runs the parser on the input,
    but without semantic analysis or any output. gcc and llvm-gcc have
    no equivalent for this option.</li>
  <li><tt>-fsyntax-only</tt>: This option runs the parser with semantic
    analysis.</li>
  <li><tt>-emit-llvm -O0</tt>: For Clang and llvm-gcc, this option
    converts to the LLVM intermediate representation but doesn't
    generate native code.</li>
  <li><tt>-S -O0</tt>: This performs actual code generation to produce a
    native assembler file.</li>
  <li><tt>-S -O0 -g</tt>: This adds emission of debug information to
    the assembly output.</li>
</ul>
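<p>As a rough illustration of the serial measurement described above, the
following Python sketch times one stage over a list of source files. It is
not the harness used to produce the results on this page. The helper names
(<tt>stage_command</tt>, <tt>time_stage</tt>), the compiler name passed in,
and the plain "compiler, flags, file" command layout are all assumptions; the
exact invocation depends on the Clang version in use.</p>

```python
import subprocess
import time

# Stage flags as listed above. How they are passed to the compiler is an
# assumption of this sketch, not a documented invocation.
CLANG_STAGES = [
    ["-Eonly"],
    ["-parse-noop"],
    ["-fsyntax-only"],
    ["-emit-llvm", "-O0"],
    ["-S", "-O0"],
    ["-S", "-O0", "-g"],
]


def stage_command(compiler, stage_flags, source_file):
    """Build the command line for one stage applied to one source file."""
    return [compiler, *stage_flags, source_file]


def time_stage(compiler, stage_flags, source_files, run=subprocess.run):
    """Return the wall-clock seconds needed to process every file serially.

    The run parameter exists only so the sketch can be exercised without a
    real compiler; by default it launches the command with subprocess.run.
    """
    start = time.perf_counter()
    for source_file in source_files:
        run(stage_command(compiler, stage_flags, source_file), check=True)
    return time.perf_counter() - start
```

<p>Calling <tt>time_stage</tt> once per entry in <tt>CLANG_STAGES</tt> yields
one total per stage, each of which includes the work of the stages before
it.</p>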

<p>This set of stages is chosen to be approximately additive; that is,
each subsequent stage simply adds some additional processing. The
timings measure the delta of the given stage from the previous
one. For example, the timings for <tt>-fsyntax-only</tt> below show
the difference between running with <tt>-fsyntax-only</tt> and running
with <tt>-parse-noop</tt> (for Clang) or <tt>-MM</tt> (for gcc and
llvm-gcc). This amounts to a fairly accurate measure of only the time
to perform semantic analysis (and parsing, in the case of gcc and llvm-gcc).</p>
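<p>The delta computation can be sketched in a few lines of Python. The
function name <tt>stage_deltas</tt> and the example timings are hypothetical
and chosen only for illustration; they are not measurements from these
benchmarks.</p>

```python
def stage_deltas(cumulative_times):
    """Turn per-stage total times into the incremental cost of each stage.

    cumulative_times is an ordered list of (stage_name, seconds) pairs,
    each measured by running the compiler up to and including that stage.
    """
    deltas = []
    previous = 0.0
    for stage_name, seconds in cumulative_times:
        deltas.append((stage_name, seconds - previous))
        previous = seconds
    return deltas


# Hypothetical timings in seconds, for illustration only (not measured data).
example = [
    ("-Eonly", 1.0),
    ("-parse-noop", 1.5),
    ("-fsyntax-only", 2.5),
]
print(stage_deltas(example))
```

<p>For gcc and llvm-gcc, which have no <tt>-parse-noop</tt> equivalent, the
list simply omits that entry, so the <tt>-fsyntax-only</tt> delta is taken
against <tt>-MM</tt> and therefore includes parsing time.</p>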

<p>Note that we already know that the LLVM optimizers are substantially (30-40%)
faster than the GCC optimizers at a given <tt>-O</tt> level, so we focus only on
<tt>-O0</tt> compile time here.</p>

<!--*************************************************************************-->
<h2><a name="enduser">Timing Results</a></h2>
<!--*************************************************************************-->

<!--=======================================================================-->
<h3><a name="2008-10-31">2008-10-31</a></h3>
<!--=======================================================================-->

<center><h4>Sketch</h4></center>
<img class="img_slide"
     src="timing-data/2008-10-31/sketch.png" alt="Sketch Timings"/>

<p>This shows Clang's substantial performance improvements in
preprocessing and semantic analysis; it is over 90% faster on
<tt>-fsyntax-only</tt>. As expected, time spent in code generation for this
benchmark is relatively small. One caveat: Clang's debug information
generation for Objective-C is very incomplete, so the
<tt>-S -O0 -g</tt> numbers are unfair, since Clang is generating substantially
less output.</p>

<p>This chart also shows the effect of using precompiled headers (PCH)
on compile time. gcc and llvm-gcc see a large performance improvement
with PCH: about 4x in wall time. Unfortunately, Clang does not yet
have an implementation of PCH-style optimizations, but we are actively
working to address this.</p>

<center><h4>176.gcc</h4></center>
<img class="img_slide"
     src="timing-data/2008-10-31/176.gcc.png" alt="176.gcc Timings"/>

<p>Unlike the <i>Sketch</i> timings, compilation of <i>176.gcc</i>
involves a large amount of code generation. The time spent in Clang's
LLVM IR generation and code generation is on par with gcc's code
generation time, but the improved parsing and semantic analysis
performance means Clang still comes in at ~29% faster than gcc
on <tt>-S -O0 -g</tt> and ~20% faster than llvm-gcc.</p>

<p>These numbers indicate that Clang still has room for improvement in
several areas. Notably, our LLVM IR generation is significantly slower
than that of llvm-gcc, and both Clang and llvm-gcc incur a
significantly higher cost for adding debugging information compared to
gcc.</p>

</div>
</body>
</html>