|  | ====================================================== | 
|  | How to set up LLVM-style RTTI for your class hierarchy | 
|  | ====================================================== | 
|  |  | 
|  | .. contents:: | 
|  |  | 
|  | Background | 
|  | ========== | 
|  |  | 
|  | LLVM avoids using C++'s built in RTTI. Instead, it  pervasively uses its | 
|  | own hand-rolled form of RTTI which is much more efficient and flexible, | 
|  | although it requires a bit more work from you as a class author. | 
|  |  | 
|  | A description of how to use LLVM-style RTTI from a client's perspective is | 
|  | given in the `Programmer's Manual <ProgrammersManual.html#isa>`_. This | 
|  | document, in contrast, discusses the steps you need to take as a class | 
|  | hierarchy author to make LLVM-style RTTI available to your clients. | 
|  |  | 
|  | Before diving in, make sure that you are familiar with the Object Oriented | 
|  | Programming concept of "`is-a`_". | 
|  |  | 
|  | .. _is-a: http://en.wikipedia.org/wiki/Is-a | 
|  |  | 
|  | Basic Setup | 
|  | =========== | 
|  |  | 
|  | This section describes how to set up the most basic form of LLVM-style RTTI | 
|  | (which is sufficient for 99.9% of the cases). We will set up LLVM-style | 
|  | RTTI for this class hierarchy: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | class Shape { | 
|  | public: | 
|  | Shape() {} | 
|  | virtual double computeArea() = 0; | 
|  | }; | 
|  |  | 
|  | class Square : public Shape { | 
|  | double SideLength; | 
|  | public: | 
|  | Square(double S) : SideLength(S) {} | 
|  | double computeArea() override; | 
|  | }; | 
|  |  | 
|  | class Circle : public Shape { | 
|  | double Radius; | 
|  | public: | 
|  | Circle(double R) : Radius(R) {} | 
|  | double computeArea() override; | 
|  | }; | 
|  |  | 
|  | The most basic working setup for LLVM-style RTTI requires the following | 
|  | steps: | 
|  |  | 
|  | #. In the header where you declare ``Shape``, you will want to ``#include | 
|  | "llvm/Support/Casting.h"``, which declares LLVM's RTTI templates. That | 
|  | way your clients don't even have to think about it. | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | #include "llvm/Support/Casting.h" | 
|  |  | 
|  | #. In the base class, introduce an enum which discriminates all of the | 
|  | different concrete classes in the hierarchy, and stash the enum value | 
|  | somewhere in the base class. | 
|  |  | 
|  | Here is the code after introducing this change: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | class Shape { | 
|  | public: | 
|  | +  /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.) | 
|  | +  enum ShapeKind { | 
|  | +    SK_Square, | 
|  | +    SK_Circle | 
|  | +  }; | 
|  | +private: | 
|  | +  const ShapeKind Kind; | 
|  | +public: | 
|  | +  ShapeKind getKind() const { return Kind; } | 
|  | + | 
|  | Shape() {} | 
|  | virtual double computeArea() = 0; | 
|  | }; | 
|  |  | 
|  | You will usually want to keep the ``Kind`` member encapsulated and | 
|  | private, but let the enum ``ShapeKind`` be public along with providing a | 
|  | ``getKind()`` method. This is convenient for clients so that they can do | 
|  | a ``switch`` over the enum. | 
|  |  | 
|  | A common naming convention is that these enums are "kind"s, to avoid | 
|  | ambiguity with the words "type" or "class" which have overloaded meanings | 
|  | in many contexts within LLVM. Sometimes there will be a natural name for | 
|  | it, like "opcode". Don't bikeshed over this; when in doubt use ``Kind``. | 
|  |  | 
|  | You might wonder why the ``Kind`` enum doesn't have an entry for | 
|  | ``Shape``. The reason for this is that since ``Shape`` is abstract | 
|  | (``computeArea() = 0;``), you will never actually have non-derived | 
|  | instances of exactly that class (only subclasses). See `Concrete Bases | 
|  | and Deeper Hierarchies`_ for information on how to deal with | 
|  | non-abstract bases. It's worth mentioning here that unlike | 
|  | ``dynamic_cast<>``, LLVM-style RTTI can be used (and is often used) for | 
|  | classes that don't have v-tables. | 
|  |  | 
|  | #. Next, you need to make sure that the ``Kind`` gets initialized to the | 
|  | value corresponding to the dynamic type of the class. Typically, you will | 
|  | want to have it be an argument to the constructor of the base class, and | 
|  | then pass in the respective ``XXXKind`` from subclass constructors. | 
|  |  | 
|  | Here is the code after that change: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | class Shape { | 
|  | public: | 
|  | /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.) | 
|  | enum ShapeKind { | 
|  | SK_Square, | 
|  | SK_Circle | 
|  | }; | 
|  | private: | 
|  | const ShapeKind Kind; | 
|  | public: | 
|  | ShapeKind getKind() const { return Kind; } | 
|  |  | 
|  | -  Shape() {} | 
|  | +  Shape(ShapeKind K) : Kind(K) {} | 
|  | virtual double computeArea() = 0; | 
|  | }; | 
|  |  | 
|  | class Square : public Shape { | 
|  | double SideLength; | 
|  | public: | 
|  | -  Square(double S) : SideLength(S) {} | 
|  | +  Square(double S) : Shape(SK_Square), SideLength(S) {} | 
|  | double computeArea() override; | 
|  | }; | 
|  |  | 
|  | class Circle : public Shape { | 
|  | double Radius; | 
|  | public: | 
|  | -  Circle(double R) : Radius(R) {} | 
|  | +  Circle(double R) : Shape(SK_Circle), Radius(R) {} | 
|  | double computeArea() override; | 
|  | }; | 
|  |  | 
|  | #. Finally, you need to inform LLVM's RTTI templates how to dynamically | 
|  | determine the type of a class (i.e. whether the ``isa<>``/``dyn_cast<>`` | 
|  | should succeed). The default "99.9% of use cases" way to accomplish this | 
|  | is through a small static member function ``classof``. In order to have | 
|  | proper context for an explanation, we will display this code first, and | 
|  | then below describe each part: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | class Shape { | 
|  | public: | 
|  | /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.) | 
|  | enum ShapeKind { | 
|  | SK_Square, | 
|  | SK_Circle | 
|  | }; | 
|  | private: | 
|  | const ShapeKind Kind; | 
|  | public: | 
|  | ShapeKind getKind() const { return Kind; } | 
|  |  | 
|  | Shape(ShapeKind K) : Kind(K) {} | 
|  | virtual double computeArea() = 0; | 
|  | }; | 
|  |  | 
|  | class Square : public Shape { | 
|  | double SideLength; | 
|  | public: | 
|  | Square(double S) : Shape(SK_Square), SideLength(S) {} | 
|  | double computeArea() override; | 
|  | + | 
|  | +  static bool classof(const Shape *S) { | 
|  | +    return S->getKind() == SK_Square; | 
|  | +  } | 
|  | }; | 
|  |  | 
|  | class Circle : public Shape { | 
|  | double Radius; | 
|  | public: | 
|  | Circle(double R) : Shape(SK_Circle), Radius(R) {} | 
|  | double computeArea() override; | 
|  | + | 
|  | +  static bool classof(const Shape *S) { | 
|  | +    return S->getKind() == SK_Circle; | 
|  | +  } | 
|  | }; | 
|  |  | 
|  | The job of ``classof`` is to dynamically determine whether an object of | 
|  | a base class is in fact of a particular derived class.  In order to | 
|  | downcast a type ``Base`` to a type ``Derived``, there needs to be a | 
|  | ``classof`` in ``Derived`` which will accept an object of type ``Base``. | 
|  |  | 
|  | To be concrete, consider the following code: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | Shape *S = ...; | 
|  | if (isa<Circle>(S)) { | 
|  | /* do something ... */ | 
|  | } | 
|  |  | 
|  | The code of the ``isa<>`` test in this code will eventually boil | 
|  | down---after template instantiation and some other machinery---to a | 
|  | check roughly like ``Circle::classof(S)``. For more information, see | 
|  | :ref:`classof-contract`. | 
|  |  | 
|  | The argument to ``classof`` should always be an *ancestor* class because | 
|  | the implementation has logic to allow and optimize away | 
|  | upcasts/up-``isa<>``'s automatically. It is as though every class | 
|  | ``Foo`` automatically has a ``classof`` like: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | class Foo { | 
|  | [...] | 
|  | template <class T> | 
|  | static bool classof(const T *, | 
|  | ::std::enable_if< | 
|  | ::std::is_base_of<Foo, T>::value | 
|  | >::type* = 0) { return true; } | 
|  | [...] | 
|  | }; | 
|  |  | 
|  | Note that this is the reason that we did not need to introduce a | 
|  | ``classof`` into ``Shape``: all relevant classes derive from ``Shape``, | 
|  | and ``Shape`` itself is abstract (has no entry in the ``Kind`` enum), | 
|  | so this notional inferred ``classof`` is all we need. See `Concrete | 
|  | Bases and Deeper Hierarchies`_ for more information about how to extend | 
|  | this example to more general hierarchies. | 
|  |  | 
|  | Although for this small example setting up LLVM-style RTTI seems like a lot | 
|  | of "boilerplate", if your classes are doing anything interesting then this | 
|  | will end up being a tiny fraction of the code. | 
|  |  | 
|  | Concrete Bases and Deeper Hierarchies | 
|  | ===================================== | 
|  |  | 
|  | For concrete bases (i.e. non-abstract interior nodes of the inheritance | 
|  | tree), the ``Kind`` check inside ``classof`` needs to be a bit more | 
|  | complicated. The situation differs from the example above in that | 
|  |  | 
|  | * Since the class is concrete, it must itself have an entry in the ``Kind`` | 
|  | enum because it is possible to have objects with this class as a dynamic | 
|  | type. | 
|  |  | 
|  | * Since the class has children, the check inside ``classof`` must take them | 
|  | into account. | 
|  |  | 
|  | Say that ``SpecialSquare`` and ``OtherSpecialSquare`` derive | 
|  | from ``Square``, and so ``ShapeKind`` becomes: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | enum ShapeKind { | 
|  | SK_Square, | 
|  | +  SK_SpecialSquare, | 
|  | +  SK_OtherSpecialSquare, | 
|  | SK_Circle | 
|  | } | 
|  |  | 
|  | Then in ``Square``, we would need to modify the ``classof`` like so: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | -  static bool classof(const Shape *S) { | 
|  | -    return S->getKind() == SK_Square; | 
|  | -  } | 
|  | +  static bool classof(const Shape *S) { | 
|  | +    return S->getKind() >= SK_Square && | 
|  | +           S->getKind() <= SK_OtherSpecialSquare; | 
|  | +  } | 
|  |  | 
|  | The reason that we need to test a range like this instead of just equality | 
|  | is that both ``SpecialSquare`` and ``OtherSpecialSquare`` "is-a" | 
|  | ``Square``, and so ``classof`` needs to return ``true`` for them. | 
|  |  | 
|  | This approach can be made to scale to arbitrarily deep hierarchies. The | 
|  | trick is that you arrange the enum values so that they correspond to a | 
|  | preorder traversal of the class hierarchy tree. With that arrangement, all | 
|  | subclass tests can be done with two comparisons as shown above. If you just | 
|  | list the class hierarchy like a list of bullet points, you'll get the | 
|  | ordering right:: | 
|  |  | 
|  | | Shape | 
|  | | Square | 
|  | | SpecialSquare | 
|  | | OtherSpecialSquare | 
|  | | Circle | 
|  |  | 
|  | A Bug to be Aware Of | 
|  | -------------------- | 
|  |  | 
|  | The example just given opens the door to bugs where the ``classof``\s are | 
|  | not updated to match the ``Kind`` enum when adding (or removing) classes to | 
|  | (from) the hierarchy. | 
|  |  | 
|  | Continuing the example above, suppose we add a ``SomewhatSpecialSquare`` as | 
|  | a subclass of ``Square``, and update the ``ShapeKind`` enum like so: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | enum ShapeKind { | 
|  | SK_Square, | 
|  | SK_SpecialSquare, | 
|  | SK_OtherSpecialSquare, | 
|  | +  SK_SomewhatSpecialSquare, | 
|  | SK_Circle | 
|  | } | 
|  |  | 
|  | Now, suppose that we forget to update ``Square::classof()``, so it still | 
|  | looks like: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | static bool classof(const Shape *S) { | 
|  | // BUG: Returns false when S->getKind() == SK_SomewhatSpecialSquare, | 
|  | // even though SomewhatSpecialSquare "is a" Square. | 
|  | return S->getKind() >= SK_Square && | 
|  | S->getKind() <= SK_OtherSpecialSquare; | 
|  | } | 
|  |  | 
|  | As the comment indicates, this code contains a bug. A straightforward and | 
|  | non-clever way to avoid this is to introduce an explicit ``SK_LastSquare`` | 
|  | entry in the enum when adding the first subclass(es). For example, we could | 
|  | rewrite the example at the beginning of `Concrete Bases and Deeper | 
|  | Hierarchies`_ as: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | enum ShapeKind { | 
|  | SK_Square, | 
|  | +  SK_SpecialSquare, | 
|  | +  SK_OtherSpecialSquare, | 
|  | +  SK_LastSquare, | 
|  | SK_Circle | 
|  | } | 
|  | ... | 
|  | // Square::classof() | 
|  | -  static bool classof(const Shape *S) { | 
|  | -    return S->getKind() == SK_Square; | 
|  | -  } | 
|  | +  static bool classof(const Shape *S) { | 
|  | +    return S->getKind() >= SK_Square && | 
|  | +           S->getKind() <= SK_LastSquare; | 
|  | +  } | 
|  |  | 
|  | Then, adding new subclasses is easy: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | enum ShapeKind { | 
|  | SK_Square, | 
|  | SK_SpecialSquare, | 
|  | SK_OtherSpecialSquare, | 
|  | +  SK_SomewhatSpecialSquare, | 
|  | SK_LastSquare, | 
|  | SK_Circle | 
|  | } | 
|  |  | 
|  | Notice that ``Square::classof`` does not need to be changed. | 
|  |  | 
|  | .. _classof-contract: | 
|  |  | 
|  | The Contract of ``classof`` | 
|  | --------------------------- | 
|  |  | 
|  | To be more precise, let ``classof`` be inside a class ``C``.  Then the | 
|  | contract for ``classof`` is "return ``true`` if the dynamic type of the | 
|  | argument is-a ``C``".  As long as your implementation fulfills this | 
|  | contract, you can tweak and optimize it as much as you want. | 
|  |  | 
|  | For example, LLVM-style RTTI can work fine in the presence of | 
|  | multiple-inheritance by defining an appropriate ``classof``. | 
|  | An example of this in practice is | 
|  | `Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_ vs. | 
|  | `DeclContext <http://clang.llvm.org/doxygen/classclang_1_1DeclContext.html>`_ | 
|  | inside Clang. | 
|  | The ``Decl`` hierarchy is done very similarly to the example setup | 
|  | demonstrated in this tutorial. | 
|  | The key part is how to then incorporate ``DeclContext``: all that is needed | 
|  | is in ``bool DeclContext::classof(const Decl *)``, which asks the question | 
|  | "Given a ``Decl``, how can I determine if it is-a ``DeclContext``?". | 
|  | It answers this with a simple switch over the set of ``Decl`` "kinds", and | 
|  | returning true for ones that are known to be ``DeclContext``'s. | 
|  |  | 
|  | .. TODO:: | 
|  |  | 
|  | Touch on some of the more advanced features, like ``isa_impl`` and | 
|  | ``simplify_type``. However, those two need reference documentation in | 
|  | the form of doxygen comments as well. We need the doxygen so that we can | 
|  | say "for full details, see http://llvm.org/doxygen/..." | 
|  |  | 
|  | Rules of Thumb | 
|  | ============== | 
|  |  | 
|  | #. The ``Kind`` enum should have one entry per concrete class, ordered | 
|  | according to a preorder traversal of the inheritance tree. | 
|  | #. The argument to ``classof`` should be a ``const Base *``, where ``Base`` | 
|  | is some ancestor in the inheritance hierarchy. The argument should | 
|  | *never* be a derived class or the class itself: the template machinery | 
|  | for ``isa<>`` already handles this case and optimizes it. | 
|  | #. For each class in the hierarchy that has no children, implement a | 
|  | ``classof`` that checks only against its ``Kind``. | 
|  | #. For each class in the hierarchy that has children, implement a | 
|  | ``classof`` that checks a range of the first child's ``Kind`` and the | 
|  | last child's ``Kind``. |