.. raw:: html

   <style> .red {color:red} </style>

.. role:: red

======================
LLVM 3.2 Release Notes
======================

.. contents::
   :local:

Written by the `LLVM Team <http://llvm.org/>`_

:red:`These are in-progress notes for the upcoming LLVM 3.2 release. You may
prefer the` `LLVM 3.1 Release Notes <http://llvm.org/releases/3.1/docs
/ReleaseNotes.html>`_.

Introduction
============

This document contains the release notes for the LLVM Compiler Infrastructure,
release 3.2. Here we describe the status of LLVM, including major improvements
from the previous release, improvements in various subprojects of LLVM, and
some of the current users of the code. All LLVM releases may be downloaded
from the `LLVM releases web site <http://llvm.org/releases/>`_.

For more information about LLVM, including information about the latest
release, please check out the `main LLVM web site <http://llvm.org/>`_. If you
have questions or comments, the `LLVM Developer's Mailing List
<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>`_ is a good place to send
them.

Note that if you are reading this file from a Subversion checkout or the main
LLVM web page, this document applies to the *next* release, not the current
one. To see the release notes for a specific release, please see the `releases
page <http://llvm.org/releases/>`_.

Sub-project Status Update
=========================

The LLVM 3.2 distribution currently consists of code from the core LLVM
repository, which roughly includes the LLVM optimizers, code generators and
supporting tools, and the Clang repository. In addition to this code, the LLVM
Project includes other sub-projects that are in development. Here we include
updates on these subprojects.

Clang: C/C++/Objective-C Frontend Toolkit
-----------------------------------------

`Clang <http://clang.llvm.org/>`_ is an LLVM front end for the C, C++, and
Objective-C languages. Clang aims to provide a better user experience through
expressive diagnostics, a high level of conformance to language standards, fast
compilation, and low memory use. Like LLVM, Clang provides a modular,
library-based architecture that makes it suitable for creating or integrating
with other development tools. Clang is considered a production-quality
compiler for C, Objective-C, C++ and Objective-C++ on x86 (32- and 64-bit), and
for Darwin/ARM targets.

In the LLVM 3.2 time-frame, the Clang team has made many improvements.
Highlights include:

#. More powerful warnings, especially ``-Wuninitialized``
#. Template type diffing in diagnostic messages
#. Higher quality and more efficient debug info generation

For more details about the changes to Clang since the 3.1 release, see the
`Clang release notes <http://clang.llvm.org/docs/ReleaseNotes.html>`_.

If Clang rejects your code but another compiler accepts it, please take a look
at the `language compatibility <http://clang.llvm.org/compatibility.html>`_
guide to make sure this is not intentional or a known issue.

DragonEgg: GCC front-ends, LLVM back-end
----------------------------------------

`DragonEgg <http://dragonegg.llvm.org/>`_ is a `gcc plugin
<http://gcc.gnu.org/wiki/plugins>`_ that replaces GCC's optimizers and code
generators with LLVM's. It works with gcc-4.5 and gcc-4.6 (and partially with
gcc-4.7), can target the x86-32/x86-64 and ARM processor families, and has been
successfully used on the Darwin, FreeBSD, KFreeBSD, Linux and OpenBSD
platforms. It fully supports Ada, C, C++ and Fortran. It has partial support
for Go, Java, Obj-C and Obj-C++.

The 3.2 release has the following notable changes:

#. Able to load LLVM plugins such as Polly.
#. Supports thread-local storage models.
#. Passes knowledge of variable lifetimes to the LLVM optimizers.
#. No longer requires GCC to be built with LTO support.

compiler-rt: Compiler Runtime Library
-------------------------------------

The new LLVM `compiler-rt project <http://compiler-rt.llvm.org/>`_ is a simple
library that provides an implementation of the low-level target-specific hooks
required by code generation and other runtime components. For example, when
compiling for a 32-bit target, converting a double to a 64-bit unsigned integer
is compiled into a runtime call to the ``__fixunsdfdi`` function. The
``compiler-rt`` library provides highly optimized implementations of this and
other low-level routines (some are 3x faster than the equivalent libgcc
routines).
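
As a rough illustration (not taken from the compiler-rt sources), the
conversion below is the kind of operation that can be lowered to a
``__fixunsdfdi`` call on a 32-bit target. The function name ``to_u64`` is made
up for this sketch, and on a 64-bit target the same code is usually compiled
to inline instructions instead of a runtime call.

.. code-block:: c++

  // Hypothetical helper: converts a double to a 64-bit unsigned integer.
  // Many 32-bit targets have no single instruction for this conversion,
  // so the compiler may emit a call to the runtime routine __fixunsdfdi.
  unsigned long long to_u64(double d) {
    return static_cast<unsigned long long>(d);
  }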

The 3.2 release has the following notable changes:

#. ...

LLDB: Low Level Debugger
------------------------

`LLDB <http://lldb.llvm.org>`_ is a ground-up implementation of a command line
debugger, as well as a debugger API that can be used from other applications.
LLDB makes use of the Clang parser to provide high-fidelity expression parsing
(particularly for C++) and uses the LLVM JIT for target support.

The 3.2 release has the following notable changes:

#. ...

libc++: C++ Standard Library
----------------------------

Like compiler-rt, libc++ is now :ref:`dual licensed
<copyright-license-patents>` under the MIT and UIUC licenses, allowing it to be
used more permissively.

Within the LLVM 3.2 time-frame there were the following highlights:

#. ...

VMKit
-----

The `VMKit project <http://vmkit.llvm.org/>`_ is an implementation of a Java
Virtual Machine (Java VM or JVM) that uses LLVM for static and just-in-time
compilation.

The 3.2 release has the following notable changes:

#. ...

Polly: Polyhedral Optimizer
---------------------------

`Polly <http://polly.llvm.org/>`_ is an *experimental* optimizer for data
locality and parallelism. It provides high-level loop optimizations and
automatic parallelisation.

Within the LLVM 3.2 time-frame there were the following highlights:

#. isl, the integer set library used by Polly, was relicensed to the MIT license
#. isl based code generation
#. MIT licensed replacement for CLooG (LGPLv2)
#. Fine grained option handling (separation of core and border computations,
   control overhead vs. code size)
#. Support for FORTRAN and dragonegg
#. OpenMP code generation fixes

External Open Source Projects Using LLVM 3.2
============================================

An exciting aspect of LLVM is that it is used as an enabling technology for a
lot of other language and tools projects. This section lists some of the
projects that have already been updated to work with LLVM 3.2.

Crack
-----

`Crack <http://code.google.com/p/crack-language/>`_ aims to provide the ease of
development of a scripting language with the performance of a compiled
language. The language derives concepts from C++, Java and Python,
incorporating object-oriented programming, operator overloading and strong
typing.

FAUST
-----

`FAUST <http://faust.grame.fr/>`_ is a compiled language for real-time audio
signal processing. The name FAUST stands for Functional AUdio STream. Its
programming model combines two approaches: functional programming and block
diagram composition. In addition to the C, C++, Java, and JavaScript output
formats, the Faust compiler can generate LLVM bitcode, and works with LLVM
2.7-3.1.

Glasgow Haskell Compiler (GHC)
------------------------------

`GHC <http://www.haskell.org/ghc/>`_ is an open source compiler and programming
suite for Haskell, a lazy functional programming language. It includes an
optimizing static compiler generating good code for a variety of platforms,
together with an interactive system for convenient, quick development.

GHC 7.0 and onwards include an LLVM code generator, supporting LLVM 2.8 and
later.

Julia
-----

`Julia <https://github.com/JuliaLang/julia>`_ is a high-level, high-performance
dynamic language for technical computing. It provides a sophisticated
compiler, distributed parallel execution, numerical accuracy, and an extensive
mathematical function library. The compiler uses type inference to generate
fast code without any type declarations, and uses LLVM's optimization passes
and JIT compiler. The `Julia Language <http://julialang.org/>`_ is designed
around multiple dispatch, giving programs a large degree of flexibility. It is
ready for use on many kinds of problems.

LLVM D Compiler
---------------

`LLVM D Compiler <https://github.com/ldc-developers/ldc>`_ (LDC) is a compiler
for the D programming language. It is based on the DMD frontend and uses LLVM
as its backend.

Open Shading Language
---------------------

`Open Shading Language (OSL)
<https://github.com/imageworks/OpenShadingLanguage/>`_ is a small but rich
language for programmable shading in advanced global illumination renderers and
other applications, ideal for describing materials, lights, displacement, and
pattern generation. It uses LLVM to JIT complex shader networks to x86 code at
runtime.

OSL was developed by Sony Pictures Imageworks for use in its in-house renderer
used for feature film animation and visual effects, and is distributed as open
source software with the "New BSD" license.

Portable OpenCL (pocl)
----------------------

In addition to producing an easily portable open source OpenCL implementation,
another major goal of `pocl <http://pocl.sourceforge.net/>`_ is improving
performance portability of OpenCL programs with compiler optimizations,
reducing the need for target-dependent manual optimizations. An important part
of pocl is a set of LLVM passes used to statically parallelize multiple
work-items with the kernel compiler, even in the presence of work-group
barriers. This enables static parallelization of the fine-grained static
concurrency in the work groups in multiple ways (SIMD, VLIW, superscalar, ...).

Pure
----

`Pure <http://pure-lang.googlecode.com/>`_ is an algebraic/functional
programming language based on term rewriting. Programs are collections of
equations which are used to evaluate expressions in a symbolic fashion. The
interpreter uses LLVM as a backend to JIT-compile Pure programs to fast native
code. Pure offers dynamic typing, eager and lazy evaluation, lexical closures,
a hygienic macro system (also based on term rewriting), built-in list and
matrix support (including list and matrix comprehensions) and an easy-to-use
interface to C and other programming languages (including the ability to load
LLVM bitcode modules, and inline C, C++, Fortran and Faust code in Pure
programs if the corresponding LLVM-enabled compilers are installed).

Pure version 0.54 has been tested and is known to work with LLVM 3.1 (and
continues to work with older LLVM releases >= 2.5).

TTA-based Co-design Environment (TCE)
-------------------------------------

`TCE <http://tce.cs.tut.fi/>`_ is a toolset for designing application-specific
processors (ASP) based on the Transport triggered architecture (TTA). The
toolset provides a complete co-design flow from C/C++ programs down to
synthesizable VHDL/Verilog and parallel program binaries. Processor
customization points include the register files, function units, supported
operations, and the interconnection network.

TCE uses Clang and LLVM for C/C++ language support, target independent
optimizations and also for parts of code generation. It generates new
LLVM-based code generators "on the fly" for the designed TTA processors and
loads them into the compiler backend as runtime libraries to avoid per-target
recompilation of larger parts of the compiler chain.

Installation Instructions
=========================

See :doc:`GettingStarted`.

What's New in LLVM 3.2?
=======================

This release includes a huge number of bug fixes, performance tweaks and minor
improvements. Some of the major improvements and new features are listed in
this section.

Major New Features
------------------

..

  Features that need text if they're finished for 3.2:
   ARM EHABI
   combiner-aa?
   strong phi elim
   loop dependence analysis
   CorrelatedValuePropagation
   Integrated assembler on by default for arm/thumb?

  Near dead:
   Analysis/RegionInfo.h + Dom Frontiers
   SparseBitVector: used in LiveVar.
   llvm/lib/Archive - replace with lib object?


LLVM 3.2 includes several major changes and big features:

#. New NVPTX back-end (replacing existing PTX back-end) based on NVIDIA sources
#. ...

LLVM IR and Core Improvements
-----------------------------

LLVM IR has several new features that better support new targets and expose
new optimization opportunities:

#. Thread local variables may have a specified TLS model. See the :ref:`Language
   Reference Manual <globalvars>`.
#. ...
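
To make the TLS model item above more concrete, here is a small source-level
sketch. It assumes a GCC-compatible front end (such as Clang or GCC on Linux)
that accepts the ``tls_model`` attribute; the variable and function names are
invented for the example. With such a front end, the requested model is
expected to show up on the thread-local global in the generated IR.

.. code-block:: c++

  // Assumption: a GCC-compatible compiler on a platform with native TLS.
  // The attribute asks for the "initial-exec" model instead of the default.
  __thread int counter __attribute__((tls_model("initial-exec"))) = 0;

  // Hypothetical helper: increments the calling thread's private counter.
  int bump_counter() {
    return ++counter;
  }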

Optimizer Improvements
----------------------

In addition to many minor performance tweaks and bug fixes, this release
includes a few major enhancements and additions to the optimizers:

Loop Vectorizer - We've added a loop vectorizer and we are now able to
vectorize small loops. The loop vectorizer is disabled by default and can be
enabled using the ``-mllvm -vectorize-loops`` flag. The SIMD vector width can
be specified using the flag ``-mllvm -force-vector-width=4``. The default
value is ``0`` which means auto-select.

We can now vectorize this function:

.. code-block:: c++

  unsigned sum_arrays(int *A, int *B, int start, int end) {
    unsigned sum = 0;
    for (int i = start; i < end; ++i)
      sum += A[i] + B[i] + i;
    return sum;
  }

We vectorize loops under the following conditions:

#. The innermost loops must have a single basic block.
#. The number of iterations is known before the loop starts to execute.
#. The loop counter needs to be incremented by one.
#. The loop trip count **can** be a variable.
#. Loops do **not** need to start at zero.
#. The induction variable can be used inside the loop.
#. Loop reductions are supported.
#. Arrays with affine access pattern do **not** need to be marked as
   '``noalias``' and are checked at runtime.
#. ...
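
The conditions listed above can be illustrated with a second, store-based
loop. This is only a sketch written for these notes: the function name
``scale_add`` is hypothetical, and whether a particular build vectorizes it
(for example when compiled with ``-mllvm -vectorize-loops``) is not guaranteed.
The loop has a single basic block, a unit-stride counter that starts at a
variable offset, uses the induction variable in the body, and writes through a
pointer that is not marked ``noalias``, so a runtime overlap check would be
needed.

.. code-block:: c++

  // Hypothetical example of a loop shape that matches the listed conditions.
  // Out, A and B may alias as far as the compiler knows, so any vectorized
  // version has to be guarded by a runtime check.
  void scale_add(int *Out, const int *A, const int *B, int start, int end) {
    for (int i = start; i < end; ++i)
      Out[i] = A[i] * 2 + B[i] + i;
  }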

SROA - We've re-written SROA to be significantly more powerful and generate
code which is much more friendly to the rest of the optimization pipeline.
Previously this pass had scaling problems that required it to only operate on
relatively small aggregates, and at times it would mistakenly replace a large
aggregate with a single very large integer in order to make it a scalar SSA
value. The result was a large number of i1024 and i2048 values representing any
small stack buffer. These in turn slowed down many subsequent optimization
paths.

The new SROA pass uses a different algorithm that allows it to only promote to
scalars the pieces of the aggregate actively in use. Because of this it doesn't
require any thresholds. It also always deduces the scalar values from the uses
of the aggregate rather than the specific LLVM type of the aggregate. These
features combine to both optimize more code with the pass and improve the
compile time of many functions dramatically.
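
As an informal illustration of the kind of code this targets (the struct and
function below are invented for this sketch and do not come from the LLVM
test suite), consider a local aggregate of which only a few pieces are ever
read. A pass that promotes only the pieces in use can, in principle, turn
``rgba[0]``, ``rgba[1]`` and ``flags`` into separate scalars and ignore the
rest, rather than treating the whole struct as one wide value.

.. code-block:: c++

  // Hypothetical aggregate: 20 bytes, but only three members are ever read.
  struct Pixel {
    float rgba[4];
    int flags;
  };

  // Only p.rgba[0], p.rgba[1] and p.flags are used; rgba[2] and rgba[3]
  // are written once and never read again.
  float luminance_hint(float r, float g, int flags) {
    Pixel p = {{r, g, 0.0f, 1.0f}, flags};
    if (p.flags & 1)
      return p.rgba[0];
    return p.rgba[0] * 0.5f + p.rgba[1] * 0.5f;
  }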

#. Branch weight metadata is preserved through more of the optimizer.
#. ...

MC Level Improvements
---------------------

The LLVM Machine Code (aka MC) subsystem was created to solve a number of
problems in the realm of assembly, disassembly, object file format handling,
and a number of other related areas that CPU instruction-set level tools work
in. For more information, please see the `Intro to the LLVM MC Project Blog
Post <http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html>`_.

#. ...

.. _codegen:

Target Independent Code Generator Improvements
----------------------------------------------

We have put a significant amount of work into the code generator
infrastructure, which allows us to implement more aggressive algorithms and
make it run faster:

#. ...

Stack Coloring - We have implemented a new optimization pass to merge stack
objects which are used in disjoint areas of the code. This optimization reduces
the required stack space significantly, in cases where it is clear to the
optimizer that the stack slot is not shared. We use the lifetime markers to
tell the codegen that a certain alloca is used within a region.
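
A minimal sketch of the pattern stack coloring is aimed at, using a made-up
function name: the two buffers below live in separate scopes, so their
lifetimes do not overlap and they are candidates for sharing one stack slot.
Whether the slots are actually merged depends on the front end emitting
lifetime markers and on the optimization level, which this example does not
demonstrate.

.. code-block:: c++

  #include <cstring>

  // Hypothetical example: 'a' and 'b' are never live at the same time, so
  // in principle one 256-byte stack slot is enough for both of them.
  int checksum_two_phases(const char *src, std::size_t n) {
    int sum = 0;
    {
      char a[256];
      std::size_t len = n < sizeof(a) ? n : sizeof(a);
      std::memcpy(a, src, len);
      for (std::size_t i = 0; i < len; ++i)
        sum += a[i];
    }
    {
      char b[256];
      std::memset(b, 1, sizeof(b));
      for (std::size_t i = 0; i < sizeof(b); ++i)
        sum += b[i];
    }
    return sum;
  }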

We now merge consecutive loads and stores.
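
As a hedged example of what merging consecutive stores can mean in practice
(the type and function are hypothetical, and the exact code produced depends
on the target), four adjacent one-byte stores like the ones below could be
combined by the code generator into a single wider store.

.. code-block:: c++

  // Hypothetical four-byte record with adjacent one-byte fields.
  struct Header {
    unsigned char b0, b1, b2, b3;
  };

  // Four consecutive byte stores to adjacent addresses; a code generator
  // that merges consecutive stores may emit one 32-bit store instead.
  void clear_header(Header *h) {
    h->b0 = 0;
    h->b1 = 0;
    h->b2 = 0;
    h->b3 = 0;
  }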

X86-32 and X86-64 Target Improvements
-------------------------------------

New features and major changes in the X86 target include:

#. ...

.. _ARM:

ARM Target Improvements
-----------------------

New features of the ARM target include:

#. ...

.. _armintegratedassembler:

MIPS Target Improvements
------------------------

New features and major changes in the MIPS target include:

#. ...

PowerPC Target Improvements
---------------------------

Many fixes and changes across LLVM (and Clang) for better compliance with the
64-bit PowerPC ELF Application Binary Interface, interoperability with GCC, and
overall 64-bit PowerPC support. Some highlights include:

#. MCJIT support added.
#. PPC64 relocation support and (small code model) TOC handling added.
#. Parameter passing and return value fixes (alignment issues, padding, varargs
   support, proper register usage, odd-sized structure support, float support,
   extension of return values for i32 return values).
#. Fixes in spill and reload code for vector registers.
#. C++ exception handling enabled.
#. Changes to remediate double-rounding compatibility issues with respect to
   GCC behavior.
#. Refactoring to disentangle ``ppc64-elf-linux`` ABI from Darwin ppc64 ABI
   support.
#. Assorted new test cases and test case fixes (endian and word size issues).
#. Fixes for big-endian codegen bugs, instruction encodings, and instruction
   constraints.
#. Implemented ``-integrated-as`` support.
#. Additional support for Altivec compare operations.
#. IBM long double support.

There have also been code generation improvements for both 32- and 64-bit code.
Instruction scheduling support for the Freescale e500mc and e5500 cores has
been added.

PTX/NVPTX Target Improvements
-----------------------------

The PTX back-end has been replaced by the NVPTX back-end, which is based on the
LLVM back-end used by NVIDIA in their CUDA (nvcc) and OpenCL compiler. Some
highlights include:

#. Compatibility with PTX 3.1 and SM 3.5.
#. Support for NVVM intrinsics as defined in the NVIDIA Compiler SDK.
#. Full compatibility with old PTX back-end, with much greater coverage of LLVM
   IR.

Please submit any back-end bugs to the LLVM Bugzilla site.

Other Target Specific Improvements
----------------------------------

#. ...

Major Changes and Removed Features
----------------------------------

If you're already an LLVM user or developer with out-of-tree changes based on
LLVM 3.1, this section lists some "gotchas" that you may run into upgrading
from the previous release.

#. The CellSPU port has been removed. It can still be found in older versions.
#. ...

Internal API Changes
--------------------

In addition, many APIs have changed in this release. Some of the major LLVM
API changes are:

We've added a new interface for allowing IR-level passes to access
target-specific information. A new IR-level pass, called
``TargetTransformInfo``, provides a number of low-level interfaces. LSR and
LowerInvoke already use the new interface.

The ``TargetData`` structure has been renamed to ``DataLayout`` and moved to
``VMCore`` to remove a dependency on ``Target``.

#. The IR-level extended linker APIs (for example, to link bitcode files out of
   archives) have been removed. Any existing clients of these features should
   move to using a linker with integrated LTO support.

Tools Changes
-------------

In addition, some tools have changed in this release. Some of the changes are:

#. ...

Python Bindings
---------------

Officially supported Python bindings have been added! Feature support is far
from complete. The current bindings support interfaces to:

#. ...

Known Problems
==============

LLVM is generally a production quality compiler, and is used by a broad range
of applications and shipping in many products. That said, not every subsystem
is as mature as the aggregate, particularly the more obscure targets. If you
run into a problem, please check the `LLVM bug database
<http://llvm.org/bugs/>`_ and submit a bug if there isn't already one or ask on
the `LLVMdev list <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>`_.

Known problem areas include:

#. The MSP430 and XCore backends are experimental.

#. The integrated assembler, disassembler, and JIT are not supported by several
   targets. If an integrated assembler is not supported, then a system
   assembler is required. For more details, see the
   :ref:`target-feature-matrix`.

Additional Information
======================

A wide variety of additional information is available on the `LLVM web page
<http://llvm.org/>`_, in particular in the `documentation
<http://llvm.org/docs/>`_ section. The web page also contains versions of the
API documentation which are up-to-date with the Subversion version of the source
code. You can access versions of these documents specific to this release by
going into the ``llvm/docs/`` directory in the LLVM tree.

If you have any questions or comments about LLVM, please feel free to contact
us via the `mailing lists <http://llvm.org/docs/#maillist>`_.