 Meeting notes: Implementation idea: Exception Handling in C++/Java

 The 5/18/01 meeting discussed ideas for implementing exceptions in LLVM.
 We decided that the best solution requires a set of library calls provided by
 the VM, as well as an extension to the LLVM function invocation syntax.

 The LLVM function invocation instruction currently looks like this (ignoring
 types):

   call func(arg1, arg2, arg3)

 The extension discussed today adds an optional "with" clause that
 associates a label with the call site.  The new syntax looks like this:

   call func(arg1, arg2, arg3) with funcCleanup

 The funcCleanup label always stays tightly associated with the call site (being
 encoded directly into the call opcode itself), and should be used whenever
 there is cleanup work that needs to be done for the current function if
 an exception is thrown by func (or if we are in a try block).

 To support this, the VM/Runtime provides the following simple library
 functions (all syntax in this document is very abstract):

 typedef struct { something } %frame;
   The VM must export a "frame type", which is an opaque structure used to
   implement different types of stack walking that may be used by various
   language runtime libraries. We imagine that it would be typical to
   represent a frame with a PC and frame pointer pair, although that is not
   required.

 %frame getStackCurrentFrame();
   Get a frame object for the current function.  Note that if the current
   function was inlined into its caller, the "current" frame will belong to
   the "caller".

 bool isFirstFrame(%frame f);
   Returns true if the specified frame is the top level (first activated) frame
   for this thread.  For the main thread, this corresponds to the main()
   function; for a spawned thread, it corresponds to the thread function.

 %frame getNextFrame(%frame f);
   Return the previous frame on the stack, i.e. the frame of f's caller.  This
   function is undefined if f satisfies the predicate isFirstFrame(f).

 Label *getFrameLabel(%frame f);
   If a label was associated with f (as discussed below), this function returns
   it.  Otherwise, it returns a null pointer.

 doNonLocalBranch(Label *L);
   At this point, it is not clear whether this should be a function or
   intrinsic.  It should probably be an intrinsic in LLVM, but we'll deal with
   this issue later.
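
 The stack-walking interface above can be modelled on a simulated call stack.
 The following is only a sketch: the SimStack, FrameRecord and Frame types are
 hypothetical, Label is reduced to a name, and getStackCurrentFrame takes the
 simulated stack as an argument (the real call would take none).
 doNonLocalBranch is left out because it cannot be expressed portably.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a code label; a real VM would use a code address.
struct Label {
    const char *name;
};

// One simulated activation record. The notes suggest a PC and frame pointer
// pair for a real VM; a function name is enough for this sketch.
struct FrameRecord {
    const char *function;  // name of the active function (illustration only)
    const Label *label;    // label from the pending call's "with" clause, or nullptr
};

// Simulated call stack: index 0 is the first activated frame.
struct SimStack {
    std::vector<FrameRecord> records;
};

// The "frame type": opaque to clients, here a (stack, index) pair.
struct Frame {
    const SimStack *stack;
    std::size_t index;
};

// Frame for the innermost (most recently activated) function.
Frame getStackCurrentFrame(const SimStack &s) {
    assert(!s.records.empty());
    return Frame{&s, s.records.size() - 1};
}

// True for the first activated frame of the thread, e.g. main().
bool isFirstFrame(Frame f) { return f.index == 0; }

// Step outward to the caller's frame; undefined on the first frame.
Frame getNextFrame(Frame f) {
    assert(!isFirstFrame(f));
    return Frame{f.stack, f.index - 1};
}

// Label associated with the frame's pending call, or nullptr if none.
const Label *getFrameLabel(Frame f) { return f.stack->records[f.index].label; }
```

 With a stack of main, TestFunction (calling foo with fooCleanup) and foo,
 walking outward from the current frame finds no label on foo, the fooCleanup
 label on TestFunction, and reaches the first frame at main.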


 Here is a motivating example that illustrates how these facilities could be
 used to implement the C++ exception model:

 void TestFunction(...) {
   A a; B b;
   foo();        // Any function call may throw
   bar();
   C c;

   try {
     D d;
     baz();
   } catch (int) {
     ...int Stuff...
     // execution continues after the try block: the exception is consumed
   } catch (double) {
     ...double stuff...
     throw;            // Exception is propagated
   }
 }

 This function would compile to approximately the following code (heavy
 pseudo code follows):

 Func:
   %a = alloca A
   A::A(%a)        // These ctors & dtors could throw, but we ignore this
   %b = alloca B   // minor detail for this example
   B::B(%b)

   call foo() with fooCleanup // An exception in foo is propagated to fooCleanup
   call bar() with barCleanup // An exception in bar is propagated to barCleanup

   %c = alloca C
   C::C(%c)
   %d = alloca D
   D::D(%d)
   call baz() with bazCleanup // An exception in baz is propagated to bazCleanup
   d->~D();
 EndTry:                   // This label corresponds to the end of the try block
   c->~C()       // These could also throw; this is also ignored
   b->~B()
   a->~A()
   return

 Note that this is a very straightforward and literal translation: exactly
 what we want for zero cost (when unused) exception handling.  Especially on
 platforms with many registers (e.g., the IA64) setjmp/longjmp style exception
 handling is *very* impractical.  Also, the "with" clauses describe the
 control flow paths explicitly so that analysis is not adversely affected.

 The foo/barCleanup labels are implemented as:

 TryCleanup:          // Executed if an exception escapes the try block
   c->~C()
 barCleanup:          // Executed if an exception escapes from bar()
   // fall through
 fooCleanup:          // Executed if an exception escapes from foo()
   b->~B()
   a->~A()
   Exception *E = getThreadLocalException()
   call throw(E)      // Implemented by the C++ runtime, described below

 This does the work one would expect.  getThreadLocalException is a function
 implemented by the C++ support library.  It returns the current exception
 object for the current thread.  Note that we do not attempt to recycle the
 shutdown code from before, because performance of the mainline code is
 critically important.  Also, obviously fooCleanup and barCleanup may be
 merged and one of them eliminated.  This just shows how the code generator
 would most likely emit code.
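
 The fall-through structure of the cleanup labels can be sketched as a switch
 with one case per entry point.  This is an illustration only: CleanupEntry
 and runCleanup are hypothetical names, destructor calls are recorded as
 strings, and the final rethrow through throw(E) is omitted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Entry points into the cleanup chain, named after the labels in the notes.
enum class CleanupEntry { TryCleanup, BarCleanup, FooCleanup };

// Returns the destructors that run, in order, when control enters the chain
// at `entry`. The final step would be the rethrow via throw(E), which this
// sketch leaves out.
std::vector<std::string> runCleanup(CleanupEntry entry) {
    std::vector<std::string> ran;
    switch (entry) {
    case CleanupEntry::TryCleanup:
        ran.push_back("~C");  // c exists only once the try block was reached
        // fall through
    case CleanupEntry::BarCleanup:
        // fall through
    case CleanupEntry::FooCleanup:
        ran.push_back("~B");
        ran.push_back("~A");
        break;
    }
    assert(ran.size() >= 2);  // every entry point destroys at least b and a
    return ran;
}
```

 Entering at FooCleanup or BarCleanup destroys b and then a; entering at
 TryCleanup destroys c first and then falls into the same tail, which is why
 the two outer labels can be merged.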

 The bazCleanup label is more interesting.  Because the exception may be caught
 by the try block, we must dispatch to its handler... but it does not exist
 on the call stack (it does not have a VM Call->Label mapping installed), so
 we must dispatch statically with a goto.  The bazCleanup code thus appears as:

 bazCleanup:
   d->~D();    // destruct D as it goes out of scope when entering catch clauses
   goto TryHandler

 In general, TryHandler is not the same as bazCleanup, because multiple
 function calls could be made from the try block.  In this case, trivial
 optimization could merge the two basic blocks.  TryHandler is the code
 that actually determines the type of exception, based on the Exception object
 itself.  For this discussion, assume that the exception object contains *at
 least*:

 1. A pointer to the RTTI info for the contained object
 2. A pointer to the dtor for the contained object
 3. The contained object itself

 Note that it is necessary to maintain #1 & #2 in the exception object itself
 because objects without virtual function tables may be thrown (as in this
 example).  Assuming this, TryHandler would look something like this:

 TryHandler:
   Exception *E = getThreadLocalException();
   switch (E->RTTIType) {
   case IntRTTIInfo:
     ...int Stuff...       // The action to perform from the catch block
     break;
   case DoubleRTTIInfo:
     ...double Stuff...    // The action to perform from the catch block
     goto TryCleanup       // This catch block rethrows the exception
     break;                // Redundant, eliminated by the optimizer
   default:
     goto TryCleanup       // Exception not caught, rethrow
   }

   // Exception was consumed
   if (E->dtor)
     E->dtor(E->object)    // Invoke the dtor on the object if it exists
   goto EndTry             // Continue mainline code...

 And that is all there is to it.
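
 As a rough C++ rendering of the exception object and the TryHandler
 dispatch, consider the sketch below.  The field names, the Outcome enum and
 the tryHandler function are assumptions made for illustration; the thrown
 object is held by pointer rather than embedded, and std::type_info stands in
 for whatever RTTI representation the runtime would actually use.

```cpp
#include <cassert>
#include <typeinfo>

// Assumed layout, following the three fields listed in the notes.
struct Exception {
    const std::type_info *rtti;  // 1. RTTI info for the thrown object
    void (*dtor)(void *);        // 2. destructor for the object; may be null
    void *object;                // 3. the object (held by pointer here for brevity)
};

enum class Outcome { Consumed, Rethrown };

// Mirrors TryHandler: returns Consumed where the pseudocode does
// "goto EndTry" and Rethrown where it does "goto TryCleanup".
Outcome tryHandler(const Exception &e) {
    assert(e.rtti != nullptr);
    if (*e.rtti == typeid(int)) {
        // ...int stuff... would run here
    } else if (*e.rtti == typeid(double)) {
        // ...double stuff... would run here, then the block rethrows
        return Outcome::Rethrown;
    } else {
        return Outcome::Rethrown;  // no matching catch clause
    }
    // Exception was consumed: destroy the thrown object if it has a dtor.
    if (e.dtor)
        e.dtor(e.object);
    return Outcome::Consumed;
}
```

 Only the consumed path invokes the destructor; on a rethrow the object must
 stay alive for whichever outer handler eventually catches it.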

 The throw(E) function would then be implemented like this (which may be
 inlined into the caller through standard optimization):

 function throw(Exception *E) {
   // Get the start of the stack trace...
   %frame %f = call getStackCurrentFrame()

   // Get the label information that corresponds to it
   label * %L = call getFrameLabel(%f)
   while (%L == 0 && !isFirstFrame(%f)) {
     // Loop until a cleanup handler is found
     %f = call getNextFrame(%f)
     %L = call getFrameLabel(%f)
   }

   if (%L != 0) {
     call setThreadLocalException(E)   // Allow handlers access to this...
     call doNonLocalBranch(%L)
   }
   // No handler found!
   call BlowUp()         // Ends up calling the terminate() method in use
 }
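
 The search loop in throw(E) can be exercised on a simulated stack.  In the
 sketch below the stack is reduced to a list of label names (an empty string
 meaning "no label"), and instead of performing the non-local branch the
 function reports which label it would branch to.  ThrowResult and
 simulateThrow are hypothetical names, not part of any proposed runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Exception {
    std::string what;
};

// Per-thread "current exception" slot, as the C++ support library would keep.
thread_local const Exception *currentException = nullptr;
void setThreadLocalException(const Exception *e) { currentException = e; }
const Exception *getThreadLocalException() { return currentException; }

// Result of the simulated throw: which label would be branched to.
struct ThrowResult {
    bool handlerFound;         // false means BlowUp() would be called
    std::string label;         // target of doNonLocalBranch when handlerFound
    std::size_t framesWalked;  // how many getNextFrame steps were taken
};

// `frameLabels` lists one entry per active frame, first activated frame at
// index 0; an empty string means no label is associated with that frame.
ThrowResult simulateThrow(const Exception *e,
                          const std::vector<std::string> &frameLabels) {
    assert(!frameLabels.empty());
    std::size_t f = frameLabels.size() - 1;  // getStackCurrentFrame()
    std::size_t walked = 0;
    // Loop until a cleanup handler is found or the first frame is reached.
    while (frameLabels[f].empty() && f != 0) {
        --f;  // getNextFrame(f)
        ++walked;
    }
    if (!frameLabels[f].empty()) {
        setThreadLocalException(e);  // allow handlers access to the exception
        return ThrowResult{true, frameLabels[f], walked};
    }
    return ThrowResult{false, "", walked};  // no handler found
}
```

 For frames main, TestFunction (label fooCleanup), foo and throw, the walk
 takes two steps and selects fooCleanup; if no frame carries a label, the
 walk stops at the first frame and reports that no handler was found.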

 That's a brief rundown of how C++ exception handling could be implemented in
 LLVM.  Java would be very similar, except it only uses destructors to unlock
 synchronized blocks, not to destroy data.  Also, it uses two stack walks: a
 nondestructive walk that builds a stack trace, then a destructive walk that
 unwinds the stack as shown here.
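
 The two Java stack walks might be sketched as follows, again on a simulated
 stack.  Activation, buildStackTrace and unwindToHandler are hypothetical
 names chosen for this illustration, and the label "unlockMonitor" in the
 usage note is an invented example of a cleanup that releases a synchronized
 block.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical activation record: function name plus the label associated
// with its pending call (empty when there is none).
struct Activation {
    std::string function;
    std::string label;
};

// Walk 1 (nondestructive): record function names, innermost frame first.
// The stack itself is left untouched.
std::vector<std::string> buildStackTrace(const std::vector<Activation> &stack) {
    std::vector<std::string> trace;
    for (auto it = stack.rbegin(); it != stack.rend(); ++it)
        trace.push_back(it->function);
    assert(trace.size() == stack.size());
    return trace;
}

// Walk 2 (destructive): pop frames that have no label. Returns the label of
// the first labelled frame found, which stays on the stack so its handler
// can run, or an empty string if the whole stack was unwound.
std::string unwindToHandler(std::vector<Activation> &stack) {
    while (!stack.empty()) {
        if (!stack.back().label.empty())
            return stack.back().label;
        stack.pop_back();
    }
    return "";
}
```

 Given frames main, run (label unlockMonitor) and helper, the first walk
 yields the trace helper, run, main without changing the stack; the second
 walk pops helper and stops at run, whose label identifies the cleanup code.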

 It would be trivial to get exception interoperability between C++ and Java.