=============================================
Building a JIT: Per-function Lazy Compilation
=============================================

.. contents::
   :local:

**This tutorial is under active development. It is incomplete and details may
change frequently.** Nonetheless we invite you to try it out as it stands, and
we welcome any feedback.

Chapter 3 Introduction
======================

**Warning: This text is currently out of date due to ORC API updates.**

**The example code has been updated and can be used. The text will be updated
once the API churn dies down.**

Welcome to Chapter 3 of the "Building an ORC-based JIT in LLVM" tutorial. This
chapter discusses lazy JITing and shows you how to enable it by adding an ORC
CompileOnDemand layer to the JIT from `Chapter 2 <BuildingAJIT2.html>`_.

Lazy Compilation
================

When we add a module to the KaleidoscopeJIT class from Chapter 2 it is
immediately optimized, compiled and linked for us by the IRTransformLayer,
IRCompileLayer and RTDyldObjectLinkingLayer respectively. This scheme, where all
the work to make a Module executable is done up front, is simple to understand
and its performance characteristics are easy to reason about. However, it will
lead to very high startup times if the amount of code to be compiled is large,
and may also do a lot of unnecessary compilation if only a few compiled
functions are ever called at runtime. A truly "just-in-time" compiler should
allow us to defer the compilation of any given function until the moment that
function is first called, improving launch times and eliminating redundant work.
In fact, the ORC APIs provide us with a layer to lazily compile LLVM IR:
*CompileOnDemandLayer*.

The CompileOnDemandLayer class conforms to the layer interface described in
Chapter 2, but its addModule method behaves quite differently from the layers
we have seen so far: rather than doing any work up front, it just scans the
module being added and arranges for each function in it to be compiled the
first time it is called. To do this, the CompileOnDemandLayer creates two small
utilities for each function that it scans: a *stub* and a *compile callback*.

The stub is a pair of a function pointer (which will be pointed at the
function's implementation once the function has been compiled) and an indirect
jump through the pointer. By fixing the address of the indirect jump for the
lifetime of the program we can give the function a permanent "effective
address", one that can be safely used for indirection and function pointer
comparison even if the function's implementation is never compiled, or if it is
compiled more than once (due to, for example, recompiling the function at a
higher optimization level) and changes address.

The second utility, the compile callback, represents a re-entry point from the
program into the compiler that will trigger compilation and then execution of a
function. By initializing the function's stub to point at the function's compile
callback, we enable lazy compilation: the first attempted call to the function
will follow the function pointer and trigger the compile callback instead. The
compile callback will compile the function, update the function pointer for the
stub, then execute the function. On all subsequent calls to the function, the
function pointer will point at the already-compiled function, so there is no
further overhead from the compiler.

We will look at this process in more detail in the next chapter of this
tutorial, but for now we'll trust the CompileOnDemandLayer to set all the stubs
and callbacks up for us. All we need to do is to add the CompileOnDemandLayer to
the top of our stack and we'll get the benefits of lazy compilation. We just
need a few changes to the source:

.. code-block:: c++

  ...
  #include "llvm/ExecutionEngine/SectionMemoryManager.h"
  #include "llvm/ExecutionEngine/Orc/CompileOnDemandLayer.h"
  #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
  ...

  ...
  class KaleidoscopeJIT {
  private:
    std::unique_ptr<TargetMachine> TM;
    const DataLayout DL;
    RTDyldObjectLinkingLayer ObjectLayer;
    IRCompileLayer<decltype(ObjectLayer), SimpleCompiler> CompileLayer;

    using OptimizeFunction =
        std::function<std::shared_ptr<Module>(std::shared_ptr<Module>)>;

    IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer;

    // New in this chapter: the callback manager and the lazy layer itself.
    std::unique_ptr<JITCompileCallbackManager> CompileCallbackManager;
    CompileOnDemandLayer<decltype(OptimizeLayer)> CODLayer;

  public:
    using ModuleHandle = decltype(CODLayer)::ModuleHandleT;

First we need to include the CompileOnDemandLayer.h header, then add two new
members to our class: a std::unique_ptr<JITCompileCallbackManager> and a
CompileOnDemandLayer. The CompileCallbackManager member is used by the
CompileOnDemandLayer to create the compile callback needed for each function.
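
The stub and compile callback mechanism described in the previous section can
be modeled in plain C++ without any LLVM dependency. The sketch below is only
an illustration under simplifying assumptions: ``Stub``, ``addImpl`` and
``compileCallback`` are hypothetical names invented for this example,
"compilation" is simulated by switching to an already-built function, and the
real ORC stubs are small pieces of machine code rather than C++ objects.

.. code-block:: c++

  // Toy model of a stub: a function pointer plus an indirect call through it.
  // The Stub object's own address never changes, which is what gives the
  // function a permanent "effective address" in this model.
  struct Stub {
    using Fn = int (*)(Stub &, int, int);

    explicit Stub(Fn Initial) : Ptr(Initial) {}

    // The "indirect jump": always go through whatever Ptr currently holds.
    int call(int A, int B) { return Ptr(*this, A, B); }

    Fn Ptr;              // Points at the callback first, the body later.
    int NumCompiles = 0; // Counts how often the callback ran (demo only).
  };

  // Hypothetical stand-in for the compiled function body.
  static int addImpl(Stub &, int A, int B) { return A + B; }

  // Stand-in for the compile callback: "compile" the function, update the
  // stub's pointer, then execute the function for the original caller.
  static int compileCallback(Stub &S, int A, int B) {
    ++S.NumCompiles;       // Simulated compilation work.
    S.Ptr = &addImpl;      // Later calls bypass the callback entirely.
    return S.Ptr(S, A, B); // Run the freshly "compiled" function.
  }

A stub created with ``Stub S(&compileCallback)`` runs the callback on the first
``S.call(...)`` only; afterwards ``S.Ptr`` refers to ``addImpl`` and
``S.NumCompiles`` stays at one no matter how many more calls are made.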

.. code-block:: c++

  KaleidoscopeJIT()
      : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
        ObjectLayer([]() { return std::make_shared<SectionMemoryManager>(); }),
        CompileLayer(ObjectLayer, SimpleCompiler(*TM)),
        OptimizeLayer(CompileLayer,
                      [this](std::shared_ptr<Module> M) {
                        return optimizeModule(std::move(M));
                      }),
        CompileCallbackManager(
            orc::createLocalCompileCallbackManager(TM->getTargetTriple(), 0)),
        CODLayer(OptimizeLayer,
                 [this](Function &F) { return std::set<Function*>({&F}); },
                 *CompileCallbackManager,
                 orc::createLocalIndirectStubsManagerBuilder(
                   TM->getTargetTriple())) {
    llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
  }

Next we have to update our constructor to initialize the new members. To create
an appropriate compile callback manager we use the
createLocalCompileCallbackManager function, which takes the target triple and a
JITTargetAddress to call if it receives a request to compile an unknown
function. In our simple JIT this situation is unlikely to come up, so we'll
cheat and just pass '0' here. In a production quality JIT you could give the
address of a function that throws an exception in order to unwind the JIT'd
code's stack.

Now we can construct our CompileOnDemandLayer. Following the pattern from
previous layers we start by passing a reference to the next layer down in our
stack -- the OptimizeLayer. Next we need to supply a 'partitioning function':
when a not-yet-compiled function is called, the CompileOnDemandLayer will call
this function to ask us what we would like to compile. At a minimum we need to
compile the function being called (given by the argument to the partitioning
function), but we could also request that the CompileOnDemandLayer compile other
functions that are unconditionally called (or highly likely to be called) from
the function being called. For KaleidoscopeJIT we'll keep it simple and just
request compilation of the function that was called. Next we pass a reference to
our CompileCallbackManager.

Finally, we need to supply an "indirect stubs manager builder": a utility
function that constructs IndirectStubsManagers, which are in turn used to build
the stubs for the functions in each module. The CompileOnDemandLayer will call
the indirect stubs manager builder once for each call to addModule, and use the
resulting indirect stubs manager to create stubs for all functions in the module
being added. If/when the module is removed from the JIT the indirect stubs
manager will be deleted, freeing any memory allocated to the stubs. We supply
this function by using the createLocalIndirectStubsManagerBuilder utility.
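
To make the partitioning choice more concrete, the following sketch models two
policies on a toy call graph instead of LLVM IR. It is an illustration only:
``ToyFunction``, ``partitionSingle`` and ``partitionWithCallees`` are
hypothetical names that are not part of the ORC API, and the list of
unconditionally called functions is assumed to be known in advance rather than
computed from IR.

.. code-block:: c++

  #include <set>
  #include <string>
  #include <vector>

  // Hypothetical stand-in for llvm::Function: a name plus the functions that
  // this function is assumed to call unconditionally.
  struct ToyFunction {
    std::string Name;
    std::vector<ToyFunction *> AlwaysCalls;
  };

  // Same policy as the lambda passed to CODLayer above: compile only the
  // function that was actually called.
  std::set<ToyFunction *> partitionSingle(ToyFunction &F) {
    return {&F};
  }

  // A more eager policy: also request every function reachable through
  // unconditional calls, so they are all compiled in one batch.
  std::set<ToyFunction *> partitionWithCallees(ToyFunction &F) {
    std::set<ToyFunction *> Partition;
    std::vector<ToyFunction *> Worklist{&F};
    while (!Worklist.empty()) {
      ToyFunction *Cur = Worklist.back();
      Worklist.pop_back();
      if (!Partition.insert(Cur).second)
        continue; // Already visited; this also terminates recursion cycles.
      for (ToyFunction *Callee : Cur->AlwaysCalls)
        Worklist.push_back(Callee);
    }
    return Partition;
  }

The single-function policy keeps each compile as small as possible, while the
eager policy trades a larger first compile for fewer re-entries into the
compiler later on. Which one is preferable depends on the workload.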

.. code-block:: c++

  // ...
  if (auto Sym = CODLayer.findSymbol(Name, false))
  // ...
  return cantFail(CODLayer.addModule(std::move(Ms),
                                     std::move(Resolver)));
  // ...

  // ...
  return CODLayer.findSymbol(MangledNameStream.str(), true);
  // ...

  // ...
  CODLayer.removeModule(H);
  // ...

Finally, we need to replace the references to OptimizeLayer in our addModule,
findSymbol, and removeModule methods with references to CODLayer. With that,
we're up and running.

**To be done:**

**Chapter conclusion.**

Full Code Listing
=================

Here is the complete code listing for our running example with a CompileOnDemand
layer added to enable lazy function-at-a-time compilation. To build this example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter3/KaleidoscopeJIT.h
   :language: c++

`Next: Extreme Laziness -- Using Compile Callbacks to JIT directly from ASTs <BuildingAJIT4.html>`_