| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
| "http://www.w3.org/TR/html4/strict.dtd"> |
| <html> |
| <head> |
| <title>Exception Handling in LLVM</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| <meta name="description" |
| content="Exception Handling in LLVM."> |
| <link rel="stylesheet" href="llvm.css" type="text/css"> |
| </head> |
| |
| <body> |
| |
| <h1>Exception Handling in LLVM</h1> |
| |
| <table class="layout" style="width:100%"> |
| <tr class="layout"> |
| <td class="left"> |
| <ul> |
| <li><a href="#introduction">Introduction</a> |
| <ol> |
| <li><a href="#itanium">Itanium ABI Zero-cost Exception Handling</a></li> |
| <li><a href="#sjlj">Setjmp/Longjmp Exception Handling</a></li> |
| <li><a href="#overview">Overview</a></li> |
| </ol></li> |
| <li><a href="#codegen">LLVM Code Generation</a> |
| <ol> |
| <li><a href="#throw">Throw</a></li> |
| <li><a href="#try_catch">Try/Catch</a></li> |
| <li><a href="#cleanups">Cleanups</a></li> |
| <li><a href="#throw_filters">Throw Filters</a></li> |
| <li><a href="#restrictions">Restrictions</a></li> |
| </ol></li> |
| <li><a href="#format_common_intrinsics">Exception Handling Intrinsics</a> |
| <ol> |
| <li><a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a></li> |
| <li><a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a></li> |
| <li><a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a></li> |
| <li><a href="#llvm_eh_sjlj_lsda"><tt>llvm.eh.sjlj.lsda</tt></a></li> |
| <li><a href="#llvm_eh_sjlj_callsite"><tt>llvm.eh.sjlj.callsite</tt></a></li> |
| </ol></li> |
| <li><a href="#asm">Asm Table Formats</a> |
| <ol> |
| <li><a href="#unwind_tables">Exception Handling Frame</a></li> |
| <li><a href="#exception_tables">Exception Tables</a></li> |
| </ol></li> |
| </ul> |
| </td> |
| </tr></table> |
| |
| <div class="doc_author"> |
| <p>Written by the <a href="http://llvm.org/">LLVM Team</a></p> |
| </div> |
| |
| |
| <!-- *********************************************************************** --> |
| <h2><a name="introduction">Introduction</a></h2> |
| <!-- *********************************************************************** --> |
| |
| <div> |
| |
| <p>This document is the central repository for all information pertaining to |
| exception handling in LLVM. It describes the format that LLVM exception |
| handling information takes, which is useful for those interested in creating |
| front-ends or dealing directly with the information. Further, this document |
| provides specific examples of what exception handling information is used for |
| in C and C++.</p> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="itanium">Itanium ABI Zero-cost Exception Handling</a> |
| </h3> |
| |
| <div> |
| |
| <p>Exception handling for most programming languages is designed to recover from |
| conditions that rarely occur during general use of an application. To that |
| end, exception handling should not interfere with the main flow of an |
| application's algorithm by performing checkpointing tasks, such as saving the |
| current pc or register state.</p> |
| |
| <p>The Itanium ABI Exception Handling Specification defines a methodology for |
| providing outlying data in the form of exception tables without inlining |
| speculative exception handling code in the flow of an application's main |
| algorithm. Thus, the specification is said to add "zero-cost" to the normal |
| execution of an application.</p> |
| |
| <p>A more complete description of the Itanium ABI exception handling runtime |
| support of can be found at |
| <a href="http://www.codesourcery.com/cxx-abi/abi-eh.html">Itanium C++ ABI: |
| Exception Handling</a>. A description of the exception frame format can be |
| found at |
| <a href="http://refspecs.freestandards.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html">Exception |
| Frames</a>, with details of the DWARF 4 specification at |
| <a href="http://dwarfstd.org/Dwarf4Std.php">DWARF 4 Standard</a>. |
| A description for the C++ exception table formats can be found at |
| <a href="http://www.codesourcery.com/cxx-abi/exceptions.pdf">Exception Handling |
| Tables</a>.</p> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="sjlj">Setjmp/Longjmp Exception Handling</a> |
| </h3> |
| |
| <div> |
| |
| <p>Setjmp/Longjmp (SJLJ) based exception handling uses LLVM intrinsics |
| <a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a> and |
| <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a> to |
| handle control flow for exception handling.</p> |
| |
| <p>For each function which does exception processing — be |
| it <tt>try</tt>/<tt>catch</tt> blocks or cleanups — that function |
| registers itself on a global frame list. When exceptions are unwinding, the |
| runtime uses this list to identify which functions need processing.<p> |
| |
| <p>Landing pad selection is encoded in the call site entry of the function |
| context. The runtime returns to the function via |
| <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a>, where |
| a switch table transfers control to the appropriate landing pad based on |
| the index stored in the function context.</p> |
| |
| <p>In contrast to DWARF exception handling, which encodes exception regions |
| and frame information in out-of-line tables, SJLJ exception handling |
| builds and removes the unwind frame context at runtime. This results in |
| faster exception handling at the expense of slower execution when no |
| exceptions are thrown. As exceptions are, by their nature, intended for |
| uncommon code paths, DWARF exception handling is generally preferred to |
| SJLJ.</p> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="overview">Overview</a> |
| </h3> |
| |
| <div> |
| |
| <p>When an exception is thrown in LLVM code, the runtime does its best to find a |
| handler suited to processing the circumstance.</p> |
| |
| <p>The runtime first attempts to find an <i>exception frame</i> corresponding to |
| the function where the exception was thrown. If the programming language |
| supports exception handling (e.g. C++), the exception frame contains a |
| reference to an exception table describing how to process the exception. If |
| the language does not support exception handling (e.g. C), or if the |
| exception needs to be forwarded to a prior activation, the exception frame |
| contains information about how to unwind the current activation and restore |
| the state of the prior activation. This process is repeated until the |
| exception is handled. If the exception is not handled and no activations |
| remain, then the application is terminated with an appropriate error |
| message.</p> |
| |
| <p>Because different programming languages have different behaviors when |
| handling exceptions, the exception handling ABI provides a mechanism for |
| supplying <i>personalities</i>. An exception handling personality is defined |
| by way of a <i>personality function</i> (e.g. <tt>__gxx_personality_v0</tt> |
| in C++), which receives the context of the exception, an <i>exception |
| structure</i> containing the exception object type and value, and a reference |
| to the exception table for the current function. The personality function |
| for the current compile unit is specified in a <i>common exception |
| frame</i>.</p> |
| |
| <p>The organization of an exception table is language dependent. For C++, an |
| exception table is organized as a series of code ranges defining what to do |
| if an exception occurs in that range. Typically, the information associated |
| with a range defines which types of exception objects (using C++ <i>type |
| info</i>) that are handled in that range, and an associated action that |
| should take place. Actions typically pass control to a <i>landing |
| pad</i>.</p> |
| |
| <p>A landing pad corresponds roughly to the code found in the <tt>catch</tt> |
| portion of a <tt>try</tt>/<tt>catch</tt> sequence. When execution resumes at |
| a landing pad, it receives an <i>exception structure</i> and a |
| <i>selector value</i> corresponding to the <i>type</i> of exception |
| thrown. The selector is then used to determine which <i>catch</i> should |
| actually process the exception.</p> |
| |
| </div> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h2> |
| <a name="codegen">LLVM Code Generation</a> |
| </h2> |
| |
| <div> |
| |
| <p>From a C++ developer's perspective, exceptions are defined in terms of the |
| <tt>throw</tt> and <tt>try</tt>/<tt>catch</tt> statements. In this section |
| we will describe the implementation of LLVM exception handling in terms of |
| C++ examples.</p> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="throw">Throw</a> |
| </h3> |
| |
| <div> |
| |
| <p>Languages that support exception handling typically provide a <tt>throw</tt> |
| operation to initiate the exception process. Internally, a <tt>throw</tt> |
| operation breaks down into two steps.</p> |
| |
| <ol> |
| <li>A request is made to allocate exception space for an exception structure. |
| This structure needs to survive beyond the current activation. This |
| structure will contain the type and value of the object being thrown.</li> |
| |
| <li>A call is made to the runtime to raise the exception, passing the |
| exception structure as an argument.</li> |
| </ol> |
| |
| <p>In C++, the allocation of the exception structure is done by the |
| <tt>__cxa_allocate_exception</tt> runtime function. The exception raising is |
| handled by <tt>__cxa_throw</tt>. The type of the exception is represented |
| using a C++ RTTI structure.</p> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="try_catch">Try/Catch</a> |
| </h3> |
| |
| <div> |
| |
| <p>A call within the scope of a <i>try</i> statement can potentially raise an |
| exception. In those circumstances, the LLVM C++ front-end replaces the call |
| with an <tt>invoke</tt> instruction. Unlike a call, the <tt>invoke</tt> has |
| two potential continuation points:</p> |
| |
| <ol> |
| <li>where to continue when the call succeeds as per normal, and</li> |
| |
| <li>where to continue if the call raises an exception, either by a throw or |
| the unwinding of a throw</li> |
| </ol> |
| |
| <p>The term used to define a the place where an <tt>invoke</tt> continues after |
| an exception is called a <i>landing pad</i>. LLVM landing pads are |
| conceptually alternative function entry points where an exception structure |
| reference and a type info index are passed in as arguments. The landing pad |
| saves the exception structure reference and then proceeds to select the catch |
| block that corresponds to the type info of the exception object.</p> |
| |
| <p>The LLVM <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> |
| instruction</a> is used to convey information about the landing pad to the |
| back end. For C++, the <tt>landingpad</tt> instruction returns a pointer and |
| integer pair corresponding to the pointer to the <i>exception structure</i> |
| and the <i>selector value</i> respectively.</p> |
| |
| <p>The <tt>landingpad</tt> instruction takes a reference to the personality |
| function to be used for this <tt>try</tt>/<tt>catch</tt> sequence. The |
| remainder of the instruction is a list of <i>cleanup</i>, <i>catch</i>, |
| and <i>filter</i> clauses. The exception is tested against the clauses |
| sequentially from first to last. The selector value is a positive number if |
| the exception matched a type info, a negative number if it matched a filter, |
| and zero if it matched a cleanup. If nothing is matched, the behavior of |
| the program is <a href="#restrictions">undefined</a>. If a type info matched, |
| then the selector value is the index of the type info in the exception table, |
| which can be obtained using the |
| <a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a> intrinsic.</p> |
| |
| <p>Once the landing pad has the type info selector, the code branches to the |
| code for the first catch. The catch then checks the value of the type info |
| selector against the index of type info for that catch. Since the type info |
| index is not known until all the type infos have been gathered in the |
| backend, the catch code must call the |
| <a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a> intrinsic to |
| determine the index for a given type info. If the catch fails to match the |
| selector then control is passed on to the next catch.</p> |
| |
| <p>Finally, the entry and exit of catch code is bracketed with calls to |
| <tt>__cxa_begin_catch</tt> and <tt>__cxa_end_catch</tt>.</p> |
| |
| <ul> |
| <li><tt>__cxa_begin_catch</tt> takes an exception structure reference as an |
| argument and returns the value of the exception object.</li> |
| |
| <li><tt>__cxa_end_catch</tt> takes no arguments. This function:<br><br> |
| <ol> |
| <li>Locates the most recently caught exception and decrements its handler |
| count,</li> |
| <li>Removes the exception from the <i>caught</i> stack if the handler |
| count goes to zero, and</li> |
| <li>Destroys the exception if the handler count goes to zero and the |
| exception was not re-thrown by throw.</li> |
| </ol> |
| <p><b>Note:</b> a rethrow from within the catch may replace this call with |
| a <tt>__cxa_rethrow</tt>.</p></li> |
| </ul> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="cleanups">Cleanups</a> |
| </h3> |
| |
| <div> |
| |
| <p>A cleanup is extra code which needs to be run as part of unwinding a scope. |
| C++ destructors are a typical example, but other languages and language |
| extensions provide a variety of different kinds of cleanups. In general, a |
| landing pad may need to run arbitrary amounts of cleanup code before actually |
| entering a catch block. To indicate the presence of cleanups, a |
| <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a> |
| should have a <i>cleanup</i> clause. Otherwise, the unwinder will not stop at |
| the landing pad if there are no catches or filters that require it to.</p> |
| |
| <p><b>Note:</b> Do not allow a new exception to propagate out of the execution |
| of a cleanup. This can corrupt the internal state of the unwinder. |
| Different languages describe different high-level semantics for these |
| situations: for example, C++ requires that the process be terminated, whereas |
| Ada cancels both exceptions and throws a third.</p> |
| |
| <p>When all cleanups are finished, if the exception is not handled by the |
| current function, resume unwinding by calling the |
| <a href="LangRef.html#i_resume"><tt>resume</tt> instruction</a>, passing in |
| the result of the <tt>landingpad</tt> instruction for the original landing |
| pad.</p> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="throw_filters">Throw Filters</a> |
| </h3> |
| |
| <div> |
| |
| <p>C++ allows the specification of which exception types may be thrown from a |
| function. To represent this, a top level landing pad may exist to filter out |
| invalid types. To express this in LLVM code the |
| <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a> will |
| have a filter clause. The clause consists of an array of type infos. |
| <tt>landingpad</tt> will return a negative value if the exception does not |
| match any of the type infos. If no match is found then a call |
| to <tt>__cxa_call_unexpected</tt> should be made, otherwise |
| <tt>_Unwind_Resume</tt>. Each of these functions requires a reference to the |
| exception structure. Note that the most general form of a |
| <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a> can |
| have any number of catch, cleanup, and filter clauses (though having more |
| than one cleanup is pointless). The LLVM C++ front-end can generate such |
| <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instructions</a> due |
| to inlining creating nested exception handling scopes.</p> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="restrictions">Restrictions</a> |
| </h3> |
| |
| <div> |
| |
| <p>The unwinder delegates the decision of whether to stop in a call frame to |
| that call frame's language-specific personality function. Not all unwinders |
| guarantee that they will stop to perform cleanups. For example, the GNU C++ |
| unwinder doesn't do so unless the exception is actually caught somewhere |
| further up the stack.</p> |
| |
| <p>In order for inlining to behave correctly, landing pads must be prepared to |
| handle selector results that they did not originally advertise. Suppose that |
| a function catches exceptions of type <tt>A</tt>, and it's inlined into a |
| function that catches exceptions of type <tt>B</tt>. The inliner will update |
| the <tt>landingpad</tt> instruction for the inlined landing pad to include |
| the fact that <tt>B</tt> is also caught. If that landing pad assumes that it |
| will only be entered to catch an <tt>A</tt>, it's in for a rude awakening. |
| Consequently, landing pads must test for the selector results they understand |
| and then resume exception propagation with the |
| <a href="LangRef.html#i_resume"><tt>resume</tt> instruction</a> if none of |
| the conditions match.</p> |
| |
| </div> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h2> |
| <a name="format_common_intrinsics">Exception Handling Intrinsics</a> |
| </h2> |
| |
| <div> |
| |
| <p>In addition to the |
| <a href="LangRef.html#i_landingpad"><tt>landingpad</tt></a> and |
| <a href="LangRef.html#i_resume"><tt>resume</tt></a> instructions, LLVM uses |
| several intrinsic functions (name prefixed with <i><tt>llvm.eh</tt></i>) to |
| provide exception handling information at various points in generated |
| code.</p> |
| |
| <!-- ======================================================================= --> |
| <h4> |
| <a name="llvm_eh_typeid_for">llvm.eh.typeid.for</a> |
| </h4> |
| |
| <div> |
| |
| <pre> |
| i32 @llvm.eh.typeid.for(i8* %type_info) |
| </pre> |
| |
| <p>This intrinsic returns the type info index in the exception table of the |
| current function. This value can be used to compare against the result |
| of <a href="LangRef.html#i_landingpad"><tt>landingpad</tt> instruction</a>. |
| The single argument is a reference to a type info.</p> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h4> |
| <a name="llvm_eh_sjlj_setjmp">llvm.eh.sjlj.setjmp</a> |
| </h4> |
| |
| <div> |
| |
| <pre> |
| i32 @llvm.eh.sjlj.setjmp(i8* %setjmp_buf) |
| </pre> |
| |
| <p>For SJLJ based exception handling, this intrinsic forces register saving for |
| the current function and stores the address of the following instruction for |
| use as a destination address |
| by <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a>. The |
| buffer format and the overall functioning of this intrinsic is compatible |
| with the GCC <tt>__builtin_setjmp</tt> implementation allowing code built |
| with the clang and GCC to interoperate.</p> |
| |
| <p>The single parameter is a pointer to a five word buffer in which the calling |
| context is saved. The front end places the frame pointer in the first word, |
| and the target implementation of this intrinsic should place the destination |
| address for a |
| <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a> in the |
| second word. The following three words are available for use in a |
| target-specific manner.</p> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h4> |
| <a name="llvm_eh_sjlj_longjmp">llvm.eh.sjlj.longjmp</a> |
| </h4> |
| |
| <div> |
| |
| <pre> |
| void @llvm.eh.sjlj.longjmp(i8* %setjmp_buf) |
| </pre> |
| |
| <p>For SJLJ based exception handling, the <tt>llvm.eh.sjlj.longjmp</tt> |
| intrinsic is used to implement <tt>__builtin_longjmp()</tt>. The single |
| parameter is a pointer to a buffer populated |
| by <a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a>. The frame |
| pointer and stack pointer are restored from the buffer, then control is |
| transferred to the destination address.</p> |
| |
| </div> |
| <!-- ======================================================================= --> |
| <h4> |
| <a name="llvm_eh_sjlj_lsda">llvm.eh.sjlj.lsda</a> |
| </h4> |
| |
| <div> |
| |
| <pre> |
| i8* @llvm.eh.sjlj.lsda() |
| </pre> |
| |
| <p>For SJLJ based exception handling, the <tt>llvm.eh.sjlj.lsda</tt> intrinsic |
| returns the address of the Language Specific Data Area (LSDA) for the current |
| function. The SJLJ front-end code stores this address in the exception |
| handling function context for use by the runtime.</p> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h4> |
| <a name="llvm_eh_sjlj_callsite">llvm.eh.sjlj.callsite</a> |
| </h4> |
| |
| <div> |
| |
| <pre> |
| void @llvm.eh.sjlj.callsite(i32 %call_site_num) |
| </pre> |
| |
| <p>For SJLJ based exception handling, the <tt>llvm.eh.sjlj.callsite</tt> |
| intrinsic identifies the callsite value associated with the |
| following <tt>invoke</tt> instruction. This is used to ensure that landing |
| pad entries in the LSDA are generated in matching order.</p> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h2> |
| <a name="asm">Asm Table Formats</a> |
| </h2> |
| |
| <div> |
| |
| <p>There are two tables that are used by the exception handling runtime to |
| determine which actions should be taken when an exception is thrown.</p> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="unwind_tables">Exception Handling Frame</a> |
| </h3> |
| |
| <div> |
| |
| <p>An exception handling frame <tt>eh_frame</tt> is very similar to the unwind |
| frame used by DWARF debug info. The frame contains all the information |
| necessary to tear down the current frame and restore the state of the prior |
| frame. There is an exception handling frame for each function in a compile |
| unit, plus a common exception handling frame that defines information common |
| to all functions in the unit.</p> |
| |
| <!-- Todo - Table details here. --> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="exception_tables">Exception Tables</a> |
| </h3> |
| |
| <div> |
| |
| <p>An exception table contains information about what actions to take when an |
| exception is thrown in a particular part of a function's code. There is one |
| exception table per function, except leaf functions and functions that have |
| calls only to non-throwing functions. They do not need an exception |
| table.</p> |
| |
| <!-- Todo - Table details here. --> |
| |
| </div> |
| |
| </div> |
| |
| <!-- *********************************************************************** --> |
| |
| <hr> |
| <address> |
| <a href="http://jigsaw.w3.org/css-validator/check/referer"><img |
| src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> |
| <a href="http://validator.w3.org/check/referer"><img |
| src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> |
| |
| <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> |
| Last modified: $Date$ |
| </address> |
| |
| </body> |
| </html> |