=======================================================
Building a JIT: Starting out with KaleidoscopeJIT
=======================================================

.. contents::
   :local:

Chapter 1 Introduction
======================
| 10 | |
| 11 | Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This |
| 12 | tutorial runs through the implementation of a JIT compiler using LLVM's |
| 13 | On-Request-Compilation (ORC) APIs. It begins with a simplified version of the |
| 14 | KaleidoscopeJIT class used in the |
Kirill Bobyrev | e436483 | 2017-07-10 09:07:23 +0000 | [diff] [blame] | 15 | `Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 16 | introduces new features like optimization, lazy compilation and remote |
| 17 | execution. |
| 18 | |
| 19 | The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how |
| 20 | these APIs interact with other parts of LLVM, and to teach you how to recombine |
| 21 | them to build a custom JIT that is suited to your use-case. |
| 22 | |
| 23 | The structure of the tutorial is: |
| 24 | |
| 25 | - Chapter #1: Investigate the simple KaleidoscopeJIT class. This will |
| 26 | introduce some of the basic concepts of the ORC JIT APIs, including the |
| 27 | idea of an ORC *Layer*. |
| 28 | |
| 29 | - `Chapter #2 <BuildingAJIT2.html>`_: Extend the basic KaleidoscopeJIT by adding |
| 30 | a new layer that will optimize IR and generated code. |
| 31 | |
| 32 | - `Chapter #3 <BuildingAJIT3.html>`_: Further extend the JIT by adding a |
| 33 | Compile-On-Demand layer to lazily compile IR. |
| 34 | |
| 35 | - `Chapter #4 <BuildingAJIT4.html>`_: Improve the laziness of our JIT by |
| 36 | replacing the Compile-On-Demand layer with a custom layer that uses the ORC |
| 37 | Compile Callbacks API directly to defer IR-generation until functions are |
| 38 | called. |
| 39 | |
| 40 | - `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into |
| 41 | a remote process with reduced privileges using the JIT Remote APIs. |
| 42 | |
To provide input for our JIT we will use the Kaleidoscope REPL from
`Chapter 7 <LangImpl07.html>`_ of the "Implementing a language with LLVM"
tutorial, with one minor modification: We will remove the FunctionPassManager
from the code for that chapter and replace it with optimization support in our
JIT class in Chapter #2.
| 48 | |
| 49 | Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API. |
Sylvestre Ledru | 7d54050 | 2016-07-02 19:28:40 +0000 | [diff] [blame] | 50 | It was preceded by MCJIT, and before that by the (now deleted) legacy JIT. |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 51 | These tutorials don't assume any experience with these earlier APIs, but |
| 52 | readers acquainted with them will see many familiar elements. Where appropriate |
| 53 | we will make this connection with the earlier APIs explicit to help people who |
| 54 | are transitioning from them to ORC. |
| 55 | |
| 56 | JIT API Basics |
| 57 | ============== |
| 58 | |
| 59 | The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed, |
| 60 | rather than compiling whole programs to disk ahead of time as a traditional |
| 61 | compiler does. To support that aim our initial, bare-bones JIT API will be: |
| 62 | |
| 63 | 1. Handle addModule(Module &M) -- Make the given IR module available for |
| 64 | execution. |
| 65 | 2. JITSymbol findSymbol(const std::string &Name) -- Search for pointers to |
| 66 | symbols (functions or variables) that have been added to the JIT. |
| 67 | 3. void removeModule(Handle H) -- Remove a module from the JIT, releasing any |
| 68 | memory that had been used for the compiled code. |
| 69 | |
| 70 | A basic use-case for this API, executing the 'main' function from a module, |
| 71 | will look like: |
| 72 | |
| 73 | .. code-block:: c++ |
| 74 | |
  std::unique_ptr<Module> M = buildModule();
  JIT J;
  Handle H = J.addModule(*M);
  // getSymbolAddress is a convenience wrapper around findSymbol.
  int (*Main)(int, char*[]) = (int(*)(int, char*[]))J.getSymbolAddress("main");
  // Main takes (argc, argv), so pass placeholder arguments.
  int Result = Main(0, nullptr);
  J.removeModule(H);
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 81 | |
Don Hinton | 4b93d23 | 2017-09-17 00:24:43 +0000 | [diff] [blame] | 82 | The APIs that we build in these tutorials will all be variations on this simple |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 83 | theme. Behind the API we will refine the implementation of the JIT to add |
| 84 | support for optimization and lazy compilation. Eventually we will extend the |
| 85 | API itself to allow higher-level program representations (e.g. ASTs) to be |
| 86 | added to the JIT. |
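
To make the shape of this API concrete before we look at the real
implementation, here is a minimal sketch that uses only the C++ standard
library and no LLVM at all. The names ``ToyJIT``, ``ToyModule``, ``toyMain`` and
``runMainExample`` are hypothetical, and a "module" is assumed to be nothing
more than a table of already-compiled function pointers.

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <string>
  #include <utility>

  // Hypothetical stand-in for an IR module: a table of named, already-compiled
  // functions. A real JIT would hold LLVM IR here instead.
  using ToyFunction = int (*)(int, char *[]);
  using ToyModule = std::map<std::string, ToyFunction>;

  class ToyJIT {
  public:
    using Handle = unsigned;

    // Make the module's symbols available and return a handle for removal.
    Handle addModule(ToyModule M) {
      Handle H = NextHandle++;
      Modules.emplace(H, std::move(M));
      return H;
    }

    // Search every module that has been added; return nullptr if not found.
    ToyFunction findSymbol(const std::string &Name) const {
      for (const auto &KV : Modules) {
        auto I = KV.second.find(Name);
        if (I != KV.second.end())
          return I->second;
      }
      return nullptr;
    }

    // Drop the module and everything it owns.
    void removeModule(Handle H) { Modules.erase(H); }

  private:
    std::map<Handle, ToyModule> Modules;
    Handle NextHandle = 0;
  };

  // A hypothetical "main" that reports how many arguments it received.
  inline int toyMain(int argc, char *[]) { return argc; }

  // Add a module, look up and call "main", then remove the module again.
  inline int runMainExample() {
    ToyJIT J;
    ToyModule M;
    M["main"] = &toyMain;
    ToyJIT::Handle H = J.addModule(std::move(M));
    ToyFunction Main = J.findSymbol("main");
    assert(Main && "main should be visible after addModule");
    int Result = Main(0, nullptr);
    J.removeModule(H);
    assert(!J.findSymbol("main") && "main should be gone after removeModule");
    return Result;
  }

The add, look up, call, remove sequence in ``runMainExample`` is the same one
the REPL will drive through the real KaleidoscopeJIT class.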
| 87 | |
| 88 | KaleidoscopeJIT |
| 89 | =============== |
| 90 | |
In the previous section we described our API. Now we examine a simple
implementation of it: the KaleidoscopeJIT class [1]_ that was used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials. We will use
the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the
input for our JIT: Each time the user enters an expression the REPL will add a
new IR module containing the code for that expression to the JIT. If the
expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
use the findSymbol method of our JIT class to find and execute the code for the
expression, and then use the removeModule method to remove the code again
(since there's no way to re-invoke an anonymous expression). In later chapters
of this tutorial we'll modify the REPL to enable new interactions with our JIT
class, but for now we will take this setup for granted and focus our attention on
the implementation of our JIT itself.
| 104 | |
| 105 | Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the |
| 106 | usual include guards and #includes [2]_, we get to the definition of our class: |
| 107 | |
| 108 | .. code-block:: c++ |
| 109 | |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 110 | #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H |
| 111 | #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 112 | |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 113 | #include "llvm/ADT/STLExtras.h" |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 114 | #include "llvm/ExecutionEngine/ExecutionEngine.h" |
Don Hinton | 4b93d23 | 2017-09-17 00:24:43 +0000 | [diff] [blame] | 115 | #include "llvm/ExecutionEngine/JITSymbol.h" |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 116 | #include "llvm/ExecutionEngine/RTDyldMemoryManager.h" |
Don Hinton | 4b93d23 | 2017-09-17 00:24:43 +0000 | [diff] [blame] | 117 | #include "llvm/ExecutionEngine/SectionMemoryManager.h" |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 118 | #include "llvm/ExecutionEngine/Orc/CompileUtils.h" |
| 119 | #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h" |
| 120 | #include "llvm/ExecutionEngine/Orc/LambdaResolver.h" |
Don Hinton | 4b93d23 | 2017-09-17 00:24:43 +0000 | [diff] [blame] | 121 | #include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h" |
| 122 | #include "llvm/IR/DataLayout.h" |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 123 | #include "llvm/IR/Mangler.h" |
| 124 | #include "llvm/Support/DynamicLibrary.h" |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 125 | #include "llvm/Support/raw_ostream.h" |
| 126 | #include "llvm/Target/TargetMachine.h" |
| 127 | #include <algorithm> |
| 128 | #include <memory> |
| 129 | #include <string> |
| 130 | #include <vector> |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 131 | |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 132 | namespace llvm { |
| 133 | namespace orc { |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 134 | |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 135 | class KaleidoscopeJIT { |
| 136 | private: |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 137 | std::unique_ptr<TargetMachine> TM; |
| 138 | const DataLayout DL; |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 139 | RTDyldObjectLinkingLayer ObjectLayer; |
| 140 | IRCompileLayer<decltype(ObjectLayer), SimpleCompiler> CompileLayer; |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 141 | |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 142 | public: |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 143 | using ModuleHandle = decltype(CompileLayer)::ModuleHandleT; |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 144 | |
Our class begins with four members: a TargetMachine, TM, which will be used to
build our LLVM compiler instance; a DataLayout, DL, which will be used for
symbol mangling (more on that later); and two ORC *layers*: an
RTDyldObjectLinkingLayer and a CompileLayer. We'll be talking more about layers
in the next chapter, but for now you can think of them as analogous to LLVM
Passes: they wrap up useful JIT utilities behind an easy-to-compose interface.
The first layer, ObjectLayer, is the foundation of our JIT: it takes in-memory
object files produced by a compiler and links them on the fly to make them
executable. This JIT-on-top-of-a-linker design was introduced in MCJIT, but
there the linker was hidden inside the MCJIT class. In ORC we expose the linker
so that clients can access and configure it directly if they need to. In this
tutorial our ObjectLayer will just be used to support the next layer in our
stack: the CompileLayer, which will be responsible for taking LLVM IR, compiling
it, and passing the resulting in-memory object files down to the object linking
layer below.
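
The way one layer stacks on another can be sketched without LLVM. In the
following illustration "IR" and "object files" are assumed to be plain strings,
and ``ToyObjectLayer``, ``ToyCompileLayer``, ``ToyCompiler`` and
``runLayerExample`` are hypothetical names. It only mirrors the shape of
``IRCompileLayer<decltype(ObjectLayer), SimpleCompiler>``: an upper layer holds
a reference to the layer below plus a compile functor, and forwards its output
downwards.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <string>
  #include <utility>
  #include <vector>

  // Hypothetical bottom layer: accepts "object files" (plain strings here) and
  // records them as linked. The index of the object serves as its handle.
  class ToyObjectLayer {
  public:
    using HandleT = std::size_t;

    HandleT addObject(std::string Obj) {
      Linked.push_back(std::move(Obj));
      return Linked.size() - 1;
    }

    const std::string &getObject(HandleT H) const { return Linked[H]; }

  private:
    std::vector<std::string> Linked;
  };

  // Hypothetical upper layer: turns "IR" into an "object file" with the
  // supplied compile functor, then hands the result to the layer below.
  template <typename BaseLayerT, typename CompileFtor> class ToyCompileLayer {
  public:
    using ModuleHandleT = typename BaseLayerT::HandleT;

    ToyCompileLayer(BaseLayerT &BaseLayer, CompileFtor Compile)
        : BaseLayer(BaseLayer), Compile(std::move(Compile)) {}

    ModuleHandleT addModule(const std::string &IR) {
      return BaseLayer.addObject(Compile(IR));
    }

  private:
    BaseLayerT &BaseLayer;
    CompileFtor Compile;
  };

  // Hypothetical compiler: wraps the input text to mark it as compiled.
  struct ToyCompiler {
    std::string operator()(const std::string &IR) const {
      return "obj(" + IR + ")";
    }
  };

  // Stack the two layers and push one "module" through them.
  inline std::string runLayerExample() {
    ToyObjectLayer ObjectLayer;
    ToyCompileLayer<decltype(ObjectLayer), ToyCompiler> CompileLayer(
        ObjectLayer, ToyCompiler{});
    auto H = CompileLayer.addModule("define @f");
    return ObjectLayer.getObject(H);
  }

Because the upper layer reuses the lower layer's handle type, a handle returned
from the top of the stack identifies the same module all the way down.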
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 160 | |
That's it for member variables. After that we have a single typedef:
ModuleHandle. This is the handle type that will be returned from our JIT's
addModule method, and can be passed to the removeModule method to remove a
module. The IRCompileLayer class already provides a convenient handle type
(IRCompileLayer::ModuleHandleT), so we just alias our ModuleHandle to this.
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 166 | |
| 167 | .. code-block:: c++ |
| 168 | |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 169 | KaleidoscopeJIT() |
| 170 | : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()), |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 171 | ObjectLayer([]() { return std::make_shared<SectionMemoryManager>(); }), |
Mehdi Amini | bb6805d | 2017-02-11 21:26:52 +0000 | [diff] [blame] | 172 | CompileLayer(ObjectLayer, SimpleCompiler(*TM)) { |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 173 | llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr); |
| 174 | } |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 175 | |
Lang Hames | 59a5ad8 | 2016-05-25 22:33:25 +0000 | [diff] [blame] | 176 | TargetMachine &getTargetMachine() { return *TM; } |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 177 | |
| 178 | Next up we have our class constructor. We begin by initializing TM using the |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 179 | EngineBuilder::selectTarget helper method which constructs a TargetMachine for |
| 180 | the current process. Then we use our newly created TargetMachine to initialize |
| 181 | DL, our DataLayout. After that we need to initialize our ObjectLayer. The |
| 182 | ObjectLayer requires a function object that will build a JIT memory manager for |
| 183 | each module that is added (a JIT memory manager manages memory allocations, |
| 184 | memory permissions, and registration of exception handlers for JIT'd code). For |
| 185 | this we use a lambda that returns a SectionMemoryManager, an off-the-shelf |
| 186 | utility that provides all the basic memory management functionality required for |
Don Hinton | 4b93d23 | 2017-09-17 00:24:43 +0000 | [diff] [blame] | 187 | this chapter. Next we initialize our CompileLayer. The CompileLayer needs two |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 188 | things: (1) A reference to our object layer, and (2) a compiler instance to use |
| 189 | to perform the actual compilation from IR to object files. We use the |
| 190 | off-the-shelf SimpleCompiler instance for now. Finally, in the body of the |
| 191 | constructor, we call the DynamicLibrary::LoadLibraryPermanently method with a |
| 192 | nullptr argument. Normally the LoadLibraryPermanently method is called with the |
| 193 | path of a dynamic library to load, but when passed a null pointer it will 'load' |
| 194 | the host process itself, making its exported symbols available for execution. |
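
The idea of handing the ObjectLayer a function object that builds one memory
manager per module can be illustrated with a small standard-library sketch.
``ToyMemoryManager``, ``ToyLinkingLayer`` and ``eachModuleGetsOwnManager`` are
hypothetical names, and the memory manager is assumed to be an empty
placeholder type rather than anything resembling SectionMemoryManager.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <functional>
  #include <memory>
  #include <utility>
  #include <vector>

  // Hypothetical placeholder for a JIT memory manager.
  struct ToyMemoryManager {};

  class ToyLinkingLayer {
  public:
    using MemoryManagerGetter =
        std::function<std::shared_ptr<ToyMemoryManager>()>;

    explicit ToyLinkingLayer(MemoryManagerGetter GetMemMgr)
        : GetMemMgr(std::move(GetMemMgr)) {}

    // Called once per module: ask the factory for a fresh memory manager.
    std::shared_ptr<ToyMemoryManager> addObject() {
      Managers.push_back(GetMemMgr());
      return Managers.back();
    }

    std::size_t numManagers() const { return Managers.size(); }

  private:
    MemoryManagerGetter GetMemMgr;
    std::vector<std::shared_ptr<ToyMemoryManager>> Managers;
  };

  // Add two "modules" and check that each received its own manager.
  inline bool eachModuleGetsOwnManager() {
    ToyLinkingLayer Layer(
        []() { return std::make_shared<ToyMemoryManager>(); });
    auto A = Layer.addObject();
    auto B = Layer.addObject();
    return A != B && Layer.numManagers() == 2;
  }

Passing a factory rather than a single manager lets the layer decide when a new
manager is needed, which is why the constructor above takes a lambda.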
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 195 | |
| 196 | .. code-block:: c++ |
| 197 | |
Lang Hames | e0fc5ae | 2016-05-25 22:27:25 +0000 | [diff] [blame] | 198 | ModuleHandle addModule(std::unique_ptr<Module> M) { |
| 199 | // Build our symbol resolver: |
| 200 | // Lambda 1: Look back into the JIT itself to find symbols that are part of |
| 201 | // the same "logical dylib". |
| 202 | // Lambda 2: Search for external symbols in the host process. |
| 203 | auto Resolver = createLambdaResolver( |
| 204 | [&](const std::string &Name) { |
| 205 | if (auto Sym = CompileLayer.findSymbol(Name, false)) |
Lang Hames | ad4a911 | 2016-08-01 20:49:11 +0000 | [diff] [blame] | 206 | return Sym; |
| 207 | return JITSymbol(nullptr); |
Lang Hames | e0fc5ae | 2016-05-25 22:27:25 +0000 | [diff] [blame] | 208 | }, |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 209 | [](const std::string &Name) { |
Lang Hames | e0fc5ae | 2016-05-25 22:27:25 +0000 | [diff] [blame] | 210 | if (auto SymAddr = |
| 211 | RTDyldMemoryManager::getSymbolAddressInProcess(Name)) |
Lang Hames | ad4a911 | 2016-08-01 20:49:11 +0000 | [diff] [blame] | 212 | return JITSymbol(SymAddr, JITSymbolFlags::Exported); |
| 213 | return JITSymbol(nullptr); |
Lang Hames | e0fc5ae | 2016-05-25 22:27:25 +0000 | [diff] [blame] | 214 | }); |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 215 | |
    // Add the module to the JIT with the resolver we created above. The
    // ObjectLayer supplies the memory manager for it.
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 218 | return cantFail(CompileLayer.addModule(std::move(M), |
| 219 | std::move(Resolver))); |
Lang Hames | e0fc5ae | 2016-05-25 22:27:25 +0000 | [diff] [blame] | 220 | } |
| 221 | |
Lang Hames | db0551e | 2016-05-30 19:03:26 +0000 | [diff] [blame] | 222 | Now we come to the first of our JIT API methods: addModule. This method is |
| 223 | responsible for adding IR to the JIT and making it available for execution. In |
| 224 | this initial implementation of our JIT we will make our modules "available for |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 225 | execution" by adding them straight to the CompileLayer, which will immediately |
Don Hinton | 4b93d23 | 2017-09-17 00:24:43 +0000 | [diff] [blame] | 226 | compile them. In later chapters we will teach our JIT to defer compilation |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 227 | of individual functions until they're actually called. |
Lang Hames | e0fc5ae | 2016-05-25 22:27:25 +0000 | [diff] [blame] | 228 | |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 229 | To add our module to the CompileLayer we need to supply both the module and a |
| 230 | symbol resolver. The symbol resolver is responsible for supplying the JIT with |
an address for each *external symbol* in the module we are adding. External
symbols are any symbols not defined within the module itself, including calls to
| 233 | functions outside the JIT and calls to functions defined in other modules that |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 234 | have already been added to the JIT. (It may seem as though modules added to the |
| 235 | JIT should know about one another by default, but since we would still have to |
Lang Hames | db0551e | 2016-05-30 19:03:26 +0000 | [diff] [blame] | 236 | supply a symbol resolver for references to code outside the JIT it turns out to |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 237 | be easier to re-use this one mechanism for all symbol resolution.) This has the |
| 238 | added benefit that the user has full control over the symbol resolution |
Lang Hames | db0551e | 2016-05-30 19:03:26 +0000 | [diff] [blame] | 239 | process. Should we search for definitions within the JIT first, then fall back |
| 240 | on external definitions? Or should we prefer external definitions where |
| 241 | available and only JIT code if we don't already have an available |
| 242 | implementation? By using a single symbol resolution scheme we are free to choose |
| 243 | whatever makes the most sense for any given use case. |
Lang Hames | e0fc5ae | 2016-05-25 22:27:25 +0000 | [diff] [blame] | 244 | |
Lang Hames | db0551e | 2016-05-30 19:03:26 +0000 | [diff] [blame] | 245 | Building a symbol resolver is made especially easy by the *createLambdaResolver* |
Lang Hames | ad4a911 | 2016-08-01 20:49:11 +0000 | [diff] [blame] | 246 | function. This function takes two lambdas [3]_ and returns a JITSymbolResolver |
| 247 | instance. The first lambda is used as the implementation of the resolver's |
| 248 | findSymbolInLogicalDylib method, which searches for symbol definitions that |
| 249 | should be thought of as being part of the same "logical" dynamic library as this |
| 250 | Module. If you are familiar with static linking: this means that |
| 251 | findSymbolInLogicalDylib should expose symbols with common linkage and hidden |
| 252 | visibility. If all this sounds foreign you can ignore the details and just |
| 253 | remember that this is the first method that the linker will use to try to find a |
| 254 | symbol definition. If the findSymbolInLogicalDylib method returns a null result |
| 255 | then the linker will call the second symbol resolver method, called findSymbol, |
which searches for symbols that should be thought of as external to (but
visible from) the module and its logical dylib. In this tutorial we will adopt
| 258 | the following simple scheme: All modules added to the JIT will behave as if they |
| 259 | were linked into a single, ever-growing logical dylib. To implement this our |
| 260 | first lambda (the one defining findSymbolInLogicalDylib) will just search for |
| 261 | JIT'd code by calling the CompileLayer's findSymbol method. If we don't find a |
| 262 | symbol in the JIT itself we'll fall back to our second lambda, which implements |
Mehdi Amini | bb6805d | 2017-02-11 21:26:52 +0000 | [diff] [blame] | 263 | findSymbol. This will use the RTDyldMemoryManager::getSymbolAddressInProcess |
Lang Hames | ad4a911 | 2016-08-01 20:49:11 +0000 | [diff] [blame] | 264 | method to search for the symbol within the program itself. If we can't find a |
Mehdi Amini | bb6805d | 2017-02-11 21:26:52 +0000 | [diff] [blame] | 265 | symbol definition via either of these paths, the JIT will refuse to accept our |
Lang Hames | ad4a911 | 2016-08-01 20:49:11 +0000 | [diff] [blame] | 266 | module, returning a "symbol not found" error. |
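
The two-stage lookup described above can be sketched with the standard library
alone. This is not createLambdaResolver itself: ``ToyResolver``, ``lookupIn``
and ``resolveExample`` are hypothetical names, symbol tables are assumed to be
plain maps, the addresses are made up, and an address of 0 is assumed to mean
"not found".

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <functional>
  #include <map>
  #include <string>
  #include <utility>

  using ToyAddress = std::uint64_t; // 0 means "not found" in this sketch.
  using ToyLookup = std::function<ToyAddress(const std::string &)>;

  // Hypothetical resolver built from two callables, tried in order.
  class ToyResolver {
  public:
    ToyResolver(ToyLookup InDylib, ToyLookup External)
        : InDylib(std::move(InDylib)), External(std::move(External)) {}

    ToyAddress resolve(const std::string &Name) const {
      // Stage 1: symbols in the same "logical dylib" (the JIT'd code).
      if (ToyAddress Addr = InDylib(Name))
        return Addr;
      // Stage 2: fall back to symbols outside the JIT.
      return External(Name);
    }

  private:
    ToyLookup InDylib;
    ToyLookup External;
  };

  // Build a lookup callable over a copy of the given table.
  inline ToyLookup lookupIn(std::map<std::string, ToyAddress> Table) {
    return [Table](const std::string &Name) -> ToyAddress {
      auto I = Table.find(Name);
      return I == Table.end() ? 0 : I->second;
    };
  }

  // Resolve Name against a JIT table first and a host table second.
  inline ToyAddress resolveExample(const std::string &Name) {
    std::map<std::string, ToyAddress> JITSymbols;
    JITSymbols["foo"] = 0x1000;
    std::map<std::string, ToyAddress> HostSymbols;
    HostSymbols["foo"] = 0x2000;
    HostSymbols["sin"] = 0x3000;
    ToyResolver R(lookupIn(JITSymbols), lookupIn(HostSymbols));
    return R.resolve(Name);
  }

Swapping the two arguments to ``ToyResolver`` would make external definitions
win over JIT'd ones, which is the kind of policy choice the single resolution
scheme leaves open.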
Lang Hames | e0fc5ae | 2016-05-25 22:27:25 +0000 | [diff] [blame] | 267 | |
Mehdi Amini | bb6805d | 2017-02-11 21:26:52 +0000 | [diff] [blame] | 268 | Now that we've built our symbol resolver, we're ready to add our module to the |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 269 | JIT. We do this by calling the CompileLayer's addModule method. The addModule |
method returns an ``Expected<CompileLayer::ModuleHandleT>``, since in more
| 271 | advanced JIT configurations it could fail. In our basic configuration we know |
| 272 | that it will always succeed so we use the cantFail utility to assert that no |
| 273 | error occurred, and extract the handle value. Since we have already typedef'd |
| 274 | our ModuleHandle type to be the same as the CompileLayer's handle type, we can |
| 275 | return the unwrapped handle directly. |
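
The "value or error" pattern behind ``Expected`` and ``cantFail`` can be
illustrated with a deliberately simplified sketch. LLVM's real classes are more
elaborate; ``ToyExpected``, ``toyCantFail`` and ``toyAddModule`` below are
hypothetical names, the error is assumed to be a plain string, and the value
type is assumed to be default constructible.

.. code-block:: c++

  #include <cassert>
  #include <cstdlib>
  #include <iostream>
  #include <string>
  #include <utility>

  // Hypothetical, simplified "value or error" wrapper.
  template <typename T> class ToyExpected {
  public:
    // Success: wrap a value.
    ToyExpected(T Value) : Value(std::move(Value)), HasValue(true) {}

    // Failure: carry an error message instead of a value.
    static ToyExpected makeError(std::string Msg) {
      ToyExpected E;
      E.ErrMsg = std::move(Msg);
      return E;
    }

    explicit operator bool() const { return HasValue; }
    T &get() { return Value; }
    const std::string &error() const { return ErrMsg; }

  private:
    ToyExpected() = default;
    T Value{};
    std::string ErrMsg;
    bool HasValue = false;
  };

  // Unwrap a result that the caller knows cannot be an error. If it is an
  // error anyway, treat that as a programming bug and abort.
  template <typename T> T toyCantFail(ToyExpected<T> E) {
    if (!E) {
      std::cerr << "unexpected failure: " << E.error() << "\n";
      std::abort();
    }
    return std::move(E.get());
  }

  // Hypothetical operation that returns a handle or an error.
  inline ToyExpected<unsigned> toyAddModule(bool Succeed) {
    if (!Succeed)
      return ToyExpected<unsigned>::makeError("could not add module");
    return 7u;
  }

A caller that cannot rule out failure would test the result and inspect
``error()`` instead of calling ``toyCantFail``.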
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 276 | |
| 277 | .. code-block:: c++ |
| 278 | |
| 279 | JITSymbol findSymbol(const std::string Name) { |
| 280 | std::string MangledName; |
| 281 | raw_string_ostream MangledNameStream(MangledName); |
| 282 | Mangler::getNameWithPrefix(MangledNameStream, Name, DL); |
| 283 | return CompileLayer.findSymbol(MangledNameStream.str(), true); |
| 284 | } |
| 285 | |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 286 | JITTargetAddress getSymbolAddress(const std::string Name) { |
| 287 | return cantFail(findSymbol(Name).getAddress()); |
| 288 | } |
| 289 | |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 290 | void removeModule(ModuleHandle H) { |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 291 | cantFail(CompileLayer.removeModule(H)); |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 292 | } |
| 293 | |
Lang Hames | db0551e | 2016-05-30 19:03:26 +0000 | [diff] [blame] | 294 | Now that we can add code to our JIT, we need a way to find the symbols we've |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 295 | added to it. To do that we call the findSymbol method on our CompileLayer, but |
| 296 | with a twist: We have to *mangle* the name of the symbol we're searching for |
| 297 | first. The ORC JIT components use mangled symbols internally the same way a |
| 298 | static compiler and linker would, rather than using plain IR symbol names. This |
| 299 | allows JIT'd code to interoperate easily with precompiled code in the |
| 300 | application or shared libraries. The kind of mangling will depend on the |
| 301 | DataLayout, which in turn depends on the target platform. To allow us to remain |
portable and search based on the un-mangled name, we just reproduce this
| 303 | mangling ourselves. |
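
As a rough illustration of why the mangled name depends on the platform,
consider just one part of mangling: the global symbol prefix. Mach-O targets
prefix C symbol names with an underscore, while ELF targets use no prefix. The
sketch below models only that one rule; ``ToyObjectFormat``,
``globalPrefixFor`` and ``toyMangle`` are hypothetical names, and the real
Mangler handles more cases than this.

.. code-block:: c++

  #include <cassert>
  #include <string>

  // Hypothetical subset of object formats, enough to show the difference.
  enum class ToyObjectFormat { ELF, MachO };

  // Return the global symbol prefix, or '\0' if the format has none.
  inline char globalPrefixFor(ToyObjectFormat F) {
    return F == ToyObjectFormat::MachO ? '_' : '\0';
  }

  // Apply the prefix (if any) to an un-mangled symbol name.
  inline std::string toyMangle(const std::string &Name, ToyObjectFormat F) {
    std::string Mangled;
    if (char Prefix = globalPrefixFor(F))
      Mangled += Prefix;
    Mangled += Name;
    return Mangled;
  }

With this rule a lookup for ``main`` must search for ``_main`` on a Mach-O
target and for ``main`` on an ELF target, which is why findSymbol mangles the
name before passing it to the CompileLayer.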
| 304 | |
Next we have a convenience function, getSymbolAddress, which returns the address
of a given symbol. Like CompileLayer's addModule function, JITSymbol's getAddress
function is allowed to fail [4]_. However, we know that it will not in our simple
example, so we wrap it in a call to cantFail.
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 309 | |
Lang Hames | db0551e | 2016-05-30 19:03:26 +0000 | [diff] [blame] | 310 | We now come to the last method in our JIT API: removeModule. This method is |
| 311 | responsible for destructing the MemoryManager and SymbolResolver that were |
| 312 | added with a given module, freeing any resources they were using in the |
| 313 | process. In our Kaleidoscope demo we rely on this method to remove the module |
| 314 | representing the most recent top-level expression, preventing it from being |
| 315 | treated as a duplicate definition when the next top-level expression is |
| 316 | entered. It is generally good to free any module that you know you won't need |
| 317 | to call further, just to free up the resources dedicated to it. However, you |
| 318 | don't strictly need to do this: All resources will be cleaned up when your |
Lang Hames | e815bf3 | 2017-08-15 19:20:10 +0000 | [diff] [blame] | 319 | JIT class is destructed, if they haven't been freed before then. Like |
| 320 | ``CompileLayer::addModule`` and ``JITSymbol::getAddress``, removeModule may |
| 321 | fail in general but will never fail in our example, so we wrap it in a call to |
| 322 | cantFail. |
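
The resource lifetime rule described here (free on removal, or at the latest
when the JIT is destroyed) can be sketched with standard smart pointers.
``ToyResources``, ``ToyOwner`` and ``resourcesAreReleased`` are hypothetical
names, and the resources are assumed to be an empty placeholder type standing
in for a module's memory manager and resolver.

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <memory>
  #include <utility>

  // Hypothetical placeholder for the resources attached to one module.
  struct ToyResources {};

  class ToyOwner {
  public:
    using Handle = unsigned;

    // Take ownership of a module's resources and return a handle.
    Handle add(std::shared_ptr<ToyResources> R) {
      Handle H = Next++;
      Owned[H] = std::move(R);
      return H;
    }

    // Release the resources for one module immediately.
    void remove(Handle H) { Owned.erase(H); }

  private:
    std::map<Handle, std::shared_ptr<ToyResources>> Owned;
    Handle Next = 0;
  };

  // Check both release paths: explicit removal and owner destruction.
  inline bool resourcesAreReleased() {
    std::weak_ptr<ToyResources> Removed, Kept;
    {
      ToyOwner J;
      auto R1 = std::make_shared<ToyResources>();
      auto R2 = std::make_shared<ToyResources>();
      Removed = R1;
      Kept = R2;
      ToyOwner::Handle H1 = J.add(std::move(R1));
      J.add(std::move(R2));
      J.remove(H1);
      // Removed module is freed at once; the other one is still alive.
      if (!Removed.expired() || Kept.expired())
        return false;
    }
    // The owner has been destroyed, so the remaining module is freed too.
    return Kept.expired();
  }

The weak pointers only observe the resources, so they show when the owner has
actually let go of them.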
Lang Hames | db0551e | 2016-05-30 19:03:26 +0000 | [diff] [blame] | 323 | |
| 324 | This brings us to the end of Chapter 1 of Building a JIT. You now have a basic |
| 325 | but fully functioning JIT stack that you can use to take LLVM IR and make it |
| 326 | executable within the context of your JIT process. In the next chapter we'll |
| 327 | look at how to extend this JIT to produce better quality code, and in the |
| 328 | process take a deeper look at the ORC layer concept. |
| 329 | |
| 330 | `Next: Extending the KaleidoscopeJIT <BuildingAJIT2.html>`_ |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 331 | |
| 332 | Full Code Listing |
| 333 | ================= |
| 334 | |
Lang Hames | 9ed5f00 | 2016-05-25 23:42:48 +0000 | [diff] [blame] | 335 | Here is the complete code listing for our running example. To build this |
| 336 | example, use: |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 337 | |
| 338 | .. code-block:: bash |
| 339 | |
| 340 | # Compile |
Don Hinton | 4b93d23 | 2017-09-17 00:24:43 +0000 | [diff] [blame] | 341 | clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 342 | # Run |
| 343 | ./toy |
| 344 | |
| 345 | Here is the code: |
| 346 | |
| 347 | .. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h |
| 348 | :language: c++ |
| 349 | |
Lang Hames | 7331cc3 | 2016-05-23 20:34:19 +0000 | [diff] [blame] | 350 | .. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a |
| 351 | simplifying assumption: symbols cannot be re-defined. This will make it |
| 352 | impossible to re-define symbols in the REPL, but will make our symbol |
| 353 | lookup logic simpler. Re-introducing support for symbol redefinition is |
| 354 | left as an exercise for the reader. (The KaleidoscopeJIT.h used in the |
| 355 | original tutorials will be a helpful reference). |
| 356 | |
.. [2] +-----------------------------+-----------------------------------------------+
       | File                        | Reason for inclusion                          |
       +=============================+===============================================+
       | STLExtras.h                 | LLVM utilities that are useful when working   |
       |                             | with the STL.                                 |
       +-----------------------------+-----------------------------------------------+
       | ExecutionEngine.h           | Access to the EngineBuilder::selectTarget     |
       |                             | method.                                       |
       +-----------------------------+-----------------------------------------------+
       | RTDyldMemoryManager.h       | Access to the                                 |
       |                             | RTDyldMemoryManager::getSymbolAddressInProcess|
       |                             | method.                                       |
       +-----------------------------+-----------------------------------------------+
       | CompileUtils.h              | Provides the SimpleCompiler class.            |
       +-----------------------------+-----------------------------------------------+
       | IRCompileLayer.h            | Provides the IRCompileLayer class.            |
       +-----------------------------+-----------------------------------------------+
       | LambdaResolver.h            | Access the createLambdaResolver function,     |
       |                             | which provides easy construction of symbol    |
       |                             | resolvers.                                    |
       +-----------------------------+-----------------------------------------------+
       | RTDyldObjectLinkingLayer.h  | Provides the RTDyldObjectLinkingLayer class.  |
       +-----------------------------+-----------------------------------------------+
       | Mangler.h                   | Provides the Mangler class for platform       |
       |                             | specific name-mangling.                       |
       +-----------------------------+-----------------------------------------------+
       | DynamicLibrary.h            | Provides the DynamicLibrary class, which      |
       |                             | makes symbols in the host process searchable. |
       +-----------------------------+-----------------------------------------------+
       | raw_ostream.h               | A fast output stream class. We use the        |
       |                             | raw_string_ostream subclass for symbol        |
       |                             | mangling.                                     |
       +-----------------------------+-----------------------------------------------+
       | TargetMachine.h             | LLVM target machine description class.        |
       +-----------------------------+-----------------------------------------------+
Lang Hames | e0fc5ae | 2016-05-25 22:27:25 +0000 | [diff] [blame] | 392 | |
.. [3] Actually they don't have to be lambdas: any object with a call operator
       will do, including plain old functions or std::functions.
| 395 | |
.. [4] ``JITSymbol::getAddress`` will force the JIT to compile the definition of
       the symbol if it hasn't already been compiled, and since the compilation
       process could fail, getAddress must be able to return this failure.