=======================================================
Building a JIT: Starting out with KaleidoscopeJIT
=======================================================

.. contents::
   :local:

Chapter 1 Introduction
======================

**Warning: This tutorial is currently being updated to account for ORC API
changes. Only Chapters 1 and 2 are up-to-date.**

**Example code from Chapters 3 to 5 will compile and run, but has not been
updated**

Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This
tutorial runs through the implementation of a JIT compiler using LLVM's
On-Request-Compilation (ORC) APIs. It begins with a simplified version of the
KaleidoscopeJIT class used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then
introduces new features like concurrent compilation, optimization, lazy
compilation and remote execution.

The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how
these APIs interact with other parts of LLVM, and teach you how to recombine
them to build a custom JIT that is suited to your use-case.

The structure of the tutorial is:

- Chapter #1: Investigate the simple KaleidoscopeJIT class. This will
  introduce some of the basic concepts of the ORC JIT APIs, including the
  idea of an ORC *Layer*.

- `Chapter #2 <BuildingAJIT2.html>`_: Extend the basic KaleidoscopeJIT by adding
  a new layer that will optimize IR and generated code.

- `Chapter #3 <BuildingAJIT3.html>`_: Further extend the JIT by adding a
  Compile-On-Demand layer to lazily compile IR.

- `Chapter #4 <BuildingAJIT4.html>`_: Improve the laziness of our JIT by
  replacing the Compile-On-Demand layer with a custom layer that uses the ORC
  Compile Callbacks API directly to defer IR-generation until functions are
  called.

- `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into
  a remote process with reduced privileges using the JIT Remote APIs.

To provide input for our JIT we will use a lightly modified version of the
Kaleidoscope REPL from `Chapter 7 <LangImpl07.html>`_ of the "Implementing a
language with LLVM" tutorial.

Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API.
It was preceded by MCJIT, and before that by the (now deleted) legacy JIT.
These tutorials don't assume any experience with these earlier APIs, but
readers acquainted with them will see many familiar elements. Where appropriate
we will make this connection with the earlier APIs explicit to help people who
are transitioning from them to ORC.

JIT API Basics
==============

The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed,
rather than compiling whole programs to disk ahead of time as a traditional
compiler does. To support that aim our initial, bare-bones JIT API will have
just two functions:

1. ``Error addModule(std::unique_ptr<Module> M)``: Make the given IR module
   available for execution.
2. ``Expected<JITEvaluatedSymbol> lookup(StringRef Name)``: Search for pointers
   to symbols (functions or variables) that have been added to the JIT.

A basic use-case for this API, executing the 'main' function from a module,
will look like:

.. code-block:: c++

  JIT J;
  cantFail(J.addModule(buildModule()));
  // lookup returns an Expected value; cantFail assumes the lookup succeeded.
  auto MainSym = cantFail(J.lookup("main"));
  auto *Main = (int(*)(int, char*[]))MainSym.getAddress();
  // argc and argv are assumed to come from the enclosing program.
  int Result = Main(argc, argv);

The APIs that we build in these tutorials will all be variations on this simple
theme. Behind this API we will refine the implementation of the JIT to add
support for concurrent compilation, optimization and lazy compilation.
Eventually we will extend the API itself to allow higher-level program
representations (e.g. ASTs) to be added to the JIT.

KaleidoscopeJIT
===============

In the previous section we described our API; now we examine a simple
implementation of it: the KaleidoscopeJIT class [1]_ that was used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials. We will use
the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the
input for our JIT: each time the user enters an expression the REPL will add a
new IR module containing the code for that expression to the JIT. If the
expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
use the lookup method of our JIT class to find and execute the code for the
expression. In later chapters of this tutorial we will modify the REPL to enable
new interactions with our JIT class, but for now we will take this setup for
granted and focus our attention on the implementation of our JIT itself.

Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the
usual include guards and #includes [2]_, we get to the definition of our class:

.. code-block:: c++

  #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
  #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H

  #include "llvm/ADT/StringRef.h"
  #include "llvm/ExecutionEngine/JITSymbol.h"
  #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
  #include "llvm/ExecutionEngine/Orc/Core.h"
  #include "llvm/ExecutionEngine/Orc/ExecutionUtils.h"
  #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
  #include "llvm/ExecutionEngine/Orc/JITTargetMachineBuilder.h"
  #include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"
  #include "llvm/ExecutionEngine/SectionMemoryManager.h"
  #include "llvm/IR/DataLayout.h"
  #include "llvm/IR/LLVMContext.h"
  #include <memory>

  namespace llvm {
  namespace orc {

  class KaleidoscopeJIT {
  private:
    ExecutionSession ES;
    RTDyldObjectLinkingLayer ObjectLayer;
    IRCompileLayer CompileLayer;

    DataLayout DL;
    MangleAndInterner Mangle;
    ThreadSafeContext Ctx;

  public:
    KaleidoscopeJIT(JITTargetMachineBuilder JTMB, DataLayout DL)
        : ObjectLayer(ES,
                      []() { return llvm::make_unique<SectionMemoryManager>(); }),
          CompileLayer(ES, ObjectLayer, ConcurrentIRCompiler(std::move(JTMB))),
          DL(std::move(DL)), Mangle(ES, this->DL),
          Ctx(llvm::make_unique<LLVMContext>()) {
      ES.getMainJITDylib().setGenerator(
          cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(DL)));
    }

Our class begins with six member variables: an ExecutionSession member, ``ES``,
which provides context for our running JIT'd code (including the string pool,
global mutex, and error reporting facilities); an RTDyldObjectLinkingLayer,
``ObjectLayer``, that can be used to add object files to our JIT (though we will
not use it directly); an IRCompileLayer, ``CompileLayer``, that can be used to
add LLVM Modules to our JIT (and which builds on the ObjectLayer); a DataLayout
and a MangleAndInterner, ``DL`` and ``Mangle``, that will be used for symbol
mangling (more on that later); and finally a ThreadSafeContext, ``Ctx``, which
wraps the LLVMContext that clients will use when building IR modules for the
JIT.

Next up we have our class constructor, which takes a ``JITTargetMachineBuilder``
that will be used by our compiler, and a ``DataLayout`` that we will use to
initialize our DL member. The constructor begins by initializing our
ObjectLayer. The ObjectLayer requires a reference to the ExecutionSession, and
a function object that will build a JIT memory manager for each module that is
added (a JIT memory manager manages memory allocations, memory permissions, and
registration of exception handlers for JIT'd code). For this we use a lambda
that returns a SectionMemoryManager, an off-the-shelf utility that provides all
the basic memory management functionality required for this chapter.

Next we initialize our CompileLayer. The CompileLayer needs three things: (1) a
reference to the ExecutionSession, (2) a reference to our object layer, and (3)
a compiler instance to use to perform the actual compilation from IR to object
files. We use the off-the-shelf ConcurrentIRCompiler utility as our compiler,
which we construct using this constructor's JITTargetMachineBuilder argument.
The ConcurrentIRCompiler utility will use the JITTargetMachineBuilder to build
LLVM TargetMachines (which are not thread safe) as needed for compiles.

After this, we initialize our supporting members: ``DL``, ``Mangle`` and ``Ctx``
with the input DataLayout, the ExecutionSession and DL member, and a new
default-constructed LLVMContext respectively. Now that our members have been
initialized, the one thing that remains to do is to tweak the configuration of
the *JITDylib* that we will store our code in. We want this dylib to contain not
only the symbols that we add to it, but also the symbols from our REPL process.
We do this by attaching a ``DynamicLibrarySearchGenerator`` instance using the
``DynamicLibrarySearchGenerator::GetForCurrentProcess`` method.

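The effect of attaching a generator can be pictured with a small standalone
model. The sketch below does not use the ORC API: ``ToyDylib``,
``toyProcessSymbols`` and the addresses are hypothetical stand-ins. It assumes
only that a lookup consults the dylib's own symbol table first and falls back to
the generator on a miss.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <functional>
  #include <optional>
  #include <string>
  #include <unordered_map>
  #include <utility>

  // Hypothetical stand-in for a JITDylib: a symbol table plus an optional
  // fallback "generator" that is consulted when a lookup misses.
  class ToyDylib {
  public:
    using Address = std::uint64_t;
    using Generator =
        std::function<std::optional<Address>(const std::string &)>;

    void define(const std::string &Name, Address Addr) {
      assert(!Name.empty() && "symbol names must be non-empty");
      Symbols[Name] = Addr;
    }

    void setGenerator(Generator G) { Gen = std::move(G); }

    std::optional<Address> lookup(const std::string &Name) {
      auto It = Symbols.find(Name);
      if (It != Symbols.end())
        return It->second; // Defined in the dylib itself.
      if (Gen) {
        if (auto Addr = Gen(Name)) {
          Symbols[Name] = *Addr; // Remember the generated definition.
          return Addr;
        }
      }
      return std::nullopt; // Not known to the dylib or the generator.
    }

  private:
    std::unordered_map<std::string, Address> Symbols;
    Generator Gen;
  };

  // Hypothetical stand-in for a search of the host process. The names and
  // addresses are made up for illustration.
  inline std::optional<ToyDylib::Address>
  toyProcessSymbols(const std::string &Name) {
    static const std::unordered_map<std::string, ToyDylib::Address> Host = {
        {"sin", 0x1000}, {"printf", 0x2000}};
    auto It = Host.find(Name);
    if (It == Host.end())
      return std::nullopt;
    return It->second;
  }

In this model a lookup of ``sin`` misses until ``setGenerator(toyProcessSymbols)``
is called, after which it resolves to the stand-in host address. This mirrors
what the REPL relies on when JIT'd code calls library functions.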

.. code-block:: c++

    static Expected<std::unique_ptr<KaleidoscopeJIT>> Create() {
      auto JTMB = JITTargetMachineBuilder::detectHost();

      if (!JTMB)
        return JTMB.takeError();

      auto DL = JTMB->getDefaultDataLayoutForTarget();
      if (!DL)
        return DL.takeError();

      return llvm::make_unique<KaleidoscopeJIT>(std::move(*JTMB), std::move(*DL));
    }

    const DataLayout &getDataLayout() const { return DL; }

    LLVMContext &getContext() { return *Ctx.getContext(); }

Next we have a named constructor, ``Create``, which will build a KaleidoscopeJIT
instance that is configured to generate code for our host process. It does this
by first generating a JITTargetMachineBuilder instance using that class's
detectHost method and then using that instance to generate a DataLayout for
the target process. Each of these operations can fail, so each returns its
result wrapped in an Expected value [3]_ that we must check for error before
continuing. If both operations succeed we can unwrap their results (using the
dereference operator) and pass them into KaleidoscopeJIT's constructor on the
last line of the function.

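The check-then-unwrap pattern used by ``Create`` can be tried out without LLVM.
The following is a minimal sketch under stated assumptions: ``ToyExpected``,
``ToyError`` and ``ToyJIT`` are hypothetical stand-ins for ``llvm::Expected``,
``llvm::Error`` and KaleidoscopeJIT, and the target and layout strings are made
up. It shows only the control flow: each fallible step is tested, errors are
forwarded with ``takeError()``, and values are unwrapped with ``operator*``.

.. code-block:: c++

  #include <cassert>
  #include <memory>
  #include <string>
  #include <utility>
  #include <variant>

  // Hypothetical error payload. llvm::Error carries richer information.
  struct ToyError {
    std::string Message;
  };

  // Hypothetical stand-in for llvm::Expected<T>: holds a T or a ToyError.
  template <typename T> class ToyExpected {
  public:
    ToyExpected(T Value) : Storage(std::in_place_index<0>, std::move(Value)) {}
    ToyExpected(ToyError Err)
        : Storage(std::in_place_index<1>, std::move(Err)) {}

    bool hasValue() const { return Storage.index() == 0; }
    explicit operator bool() const { return hasValue(); }

    T &operator*() {
      assert(hasValue() && "dereferencing an error value");
      return std::get<0>(Storage);
    }

    ToyError takeError() {
      assert(!hasValue() && "no error to take");
      return std::move(std::get<1>(Storage));
    }

  private:
    std::variant<T, ToyError> Storage;
  };

  class ToyJIT {
  public:
    // Named constructor: every fallible step is checked before the next runs.
    static ToyExpected<std::unique_ptr<ToyJIT>>
    Create(const std::string &Triple) {
      auto Target = detectTarget(Triple);
      if (!Target)
        return Target.takeError();

      auto Layout = layoutFor(*Target);
      if (!Layout)
        return Layout.takeError();

      return std::unique_ptr<ToyJIT>(
          new ToyJIT(std::move(*Target), std::move(*Layout)));
    }

    const std::string &layout() const { return LayoutString; }

  private:
    ToyJIT(std::string T, std::string L)
        : TargetName(std::move(T)), LayoutString(std::move(L)) {}

    static ToyExpected<std::string> detectTarget(const std::string &Triple) {
      if (Triple.empty())
        return ToyError{"no target triple"};
      return Triple;
    }

    static ToyExpected<std::string> layoutFor(const std::string &Target) {
      if (Target != "toy-target")
        return ToyError{"no layout for " + Target};
      return std::string("toy-layout");
    }

    std::string TargetName;
    std::string LayoutString;
  };

Calling ``ToyJIT::Create("toy-target")`` yields a value, while an empty or
unknown triple yields an error whose message names the step that failed. Keeping
the real constructor private to the named constructor is a choice made for this
sketch; the KaleidoscopeJIT constructor shown above is public.
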
Following the named constructor we have the ``getDataLayout()`` and
``getContext()`` methods. These are used to make data structures created and
managed by the JIT (especially the LLVMContext) available to the REPL code that
will build our IR modules.

.. code-block:: c++

    // Returns Error, matching the API described in "JIT API Basics".
    Error addModule(std::unique_ptr<Module> M) {
      return CompileLayer.add(ES.getMainJITDylib(),
                              ThreadSafeModule(std::move(M), Ctx));
    }

    Expected<JITEvaluatedSymbol> lookup(StringRef Name) {
      return ES.lookup({&ES.getMainJITDylib()}, Mangle(Name.str()));
    }

Now we come to the first of our JIT API methods: addModule. This method is
responsible for adding IR to the JIT and making it available for execution. In
this initial implementation of our JIT we will make our modules "available for
execution" by adding them to the CompileLayer, which will in turn store the
Module in the main JITDylib. This process will create new symbol table entries
in the JITDylib for each definition in the module, and will defer compilation of
the module until any of its definitions is looked up. Note that this is not lazy
compilation: just referencing a definition, even if it is never used, will be
enough to trigger compilation. In later chapters we will teach our JIT to defer
compilation of functions until they're actually called.

To add our Module we must first wrap it in a ThreadSafeModule instance, which
manages the lifetime of the Module's LLVMContext (our Ctx member) in a
thread-friendly way. In our example, all modules will share the Ctx member,
which will exist for the duration of the JIT. Once we switch to concurrent
compilation in later chapters we will use a new context per module.

Our last method is ``lookup``, which allows us to look up addresses for
function and variable definitions added to the JIT based on their symbol names.
As noted above, lookup will implicitly trigger compilation for any symbol
that has not already been compiled. Our lookup method calls through to
``ExecutionSession::lookup``, passing in a list of dylibs to search (in our case
just the main dylib), and the symbol name to search for, with a twist: we have
to *mangle* the name of the symbol we're searching for first. The ORC JIT
components use mangled symbols internally the same way a static compiler and
linker would, rather than using plain IR symbol names. This allows JIT'd code
to interoperate easily with precompiled code in the application or shared
libraries. The kind of mangling will depend on the DataLayout, which in turn
depends on the target platform. To allow us to remain portable and search based
on the un-mangled name, we just reproduce this mangling ourselves using our
``Mangle`` member function object.

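To make the mangling step concrete, the sketch below models one common
platform difference: Mach-O targets prepend an underscore to global symbol
names, while typical ELF targets add no prefix. ``ToyMangleAndInterner`` is a
hypothetical, heavily simplified stand-in for ``MangleAndInterner``. It takes
the prefix directly instead of reading it from a DataLayout, and it uses a
private pool where the real class uses the ExecutionSession's string pool.

.. code-block:: c++

  #include <cassert>
  #include <string>
  #include <unordered_set>
  #include <utility>

  class ToyMangleAndInterner {
  public:
    // GlobalPrefix is '\0' when the platform adds no prefix.
    explicit ToyMangleAndInterner(char GlobalPrefix) : Prefix(GlobalPrefix) {}

    // Mangle Name, then return a pointer to the single pooled copy of the
    // mangled string, so equal names can be compared by pointer.
    const std::string *operator()(const std::string &Name) {
      assert(!Name.empty() && "cannot mangle an empty name");
      std::string Mangled = Name;
      if (Prefix != '\0')
        Mangled.insert(Mangled.begin(), Prefix);
      return &*Pool.insert(std::move(Mangled)).first;
    }

  private:
    char Prefix;
    std::unordered_set<std::string> Pool;
  };

With a prefix of ``'_'`` the name ``main`` becomes ``_main``; with no prefix it
is left unchanged. Mangling the same name twice returns the same pointer, which
is the point of interning.
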
This brings us to the end of Chapter 1 of Building a JIT. You now have a basic
but fully functioning JIT stack that you can use to take LLVM IR and make it
executable within the context of your JIT process. In the next chapter we'll
look at how to extend this JIT to produce better quality code, and in the
process take a deeper look at the ORC layer concept.

`Next: Extending the KaleidoscopeJIT <BuildingAJIT2.html>`_

Full Code Listing
=================

Here is the complete code listing for our running example. To build this
example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h
   :language: c++

.. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a
       simplifying assumption: symbols cannot be re-defined. This will make it
       impossible to re-define symbols in the REPL, but will make our symbol
       lookup logic simpler. Re-introducing support for symbol redefinition is
       left as an exercise for the reader. (The KaleidoscopeJIT.h used in the
       original tutorials will be a helpful reference).

.. [2] +-----------------------------+-----------------------------------------------+
       | File                        | Reason for inclusion                          |
       +=============================+===============================================+
       | JITSymbol.h                 | Defines the lookup result type                |
       |                             | JITEvaluatedSymbol.                           |
       +-----------------------------+-----------------------------------------------+
       | CompileUtils.h              | Provides the ConcurrentIRCompiler class.      |
       +-----------------------------+-----------------------------------------------+
       | Core.h                      | Core utilities such as ExecutionSession and   |
       |                             | JITDylib.                                     |
       +-----------------------------+-----------------------------------------------+
       | ExecutionUtils.h            | Provides the DynamicLibrarySearchGenerator    |
       |                             | class.                                        |
       +-----------------------------+-----------------------------------------------+
       | IRCompileLayer.h            | Provides the IRCompileLayer class.            |
       +-----------------------------+-----------------------------------------------+
       | JITTargetMachineBuilder.h   | Provides the JITTargetMachineBuilder class.   |
       +-----------------------------+-----------------------------------------------+
       | RTDyldObjectLinkingLayer.h  | Provides the RTDyldObjectLinkingLayer class.  |
       +-----------------------------+-----------------------------------------------+
       | SectionMemoryManager.h      | Provides the SectionMemoryManager class.      |
       +-----------------------------+-----------------------------------------------+
       | DataLayout.h                | Provides the DataLayout class.                |
       +-----------------------------+-----------------------------------------------+
       | LLVMContext.h               | Provides the LLVMContext class.               |
       +-----------------------------+-----------------------------------------------+

.. [3] See the ErrorHandling section in the LLVM Programmer's Manual
       (http://llvm.org/docs/ProgrammersManual.html#error-handling)