blob: 7665647736a73ae54fd2d902462bce5982b6dab7 [file] [log] [blame]
Sean Silvaee47edf2012-12-05 00:26:32 +00001============================================================
2Kaleidoscope: Extending the Language: User-defined Operators
3============================================================
4
5.. contents::
6 :local:
7
8Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick
9Tryzelaar <mailto:idadesub@users.sourceforge.net>`_
10
11Chapter 6 Introduction
12======================
13
14Welcome to Chapter 6 of the "`Implementing a language with
15LLVM <index.html>`_" tutorial. At this point in our tutorial, we now
16have a fully functional language that is fairly minimal, but also
17useful. There is still one big problem with it, however. Our language
18doesn't have many useful operators (like division, logical negation, or
19even any comparisons besides less-than).
20
21This chapter of the tutorial takes a wild digression into adding
22user-defined operators to the simple and beautiful Kaleidoscope
23language. This digression now gives us a simple and ugly language in
24some ways, but also a powerful one at the same time. One of the great
25things about creating your own language is that you get to decide what
26is good or bad. In this tutorial we'll assume that it is okay to use
27this as a way to show some interesting parsing techniques.
28
29At the end of this tutorial, we'll run through an example Kaleidoscope
30application that `renders the Mandelbrot set <#example>`_. This gives an
31example of what you can build with Kaleidoscope and its feature set.
32
33User-defined Operators: the Idea
34================================
35
36The "operator overloading" that we will add to Kaleidoscope is more
37general than languages like C++. In C++, you are only allowed to
38redefine existing operators: you can't programatically change the
39grammar, introduce new operators, change precedence levels, etc. In this
40chapter, we will add this capability to Kaleidoscope, which will let the
41user round out the set of operators that are supported.
42
43The point of going into user-defined operators in a tutorial like this
44is to show the power and flexibility of using a hand-written parser.
45Thus far, the parser we have been implementing uses recursive descent
46for most parts of the grammar and operator precedence parsing for the
47expressions. See `Chapter 2 <OCamlLangImpl2.html>`_ for details. Without
48using operator precedence parsing, it would be very difficult to allow
49the programmer to introduce new operators into the grammar: the grammar
50is dynamically extensible as the JIT runs.
51
52The two specific features we'll add are programmable unary operators
53(right now, Kaleidoscope has no unary operators at all) as well as
54binary operators. An example of this is:
55
56::
57
58 # Logical unary not.
59 def unary!(v)
60 if v then
61 0
62 else
63 1;
64
65 # Define > with the same precedence as <.
66 def binary> 10 (LHS RHS)
67 RHS < LHS;
68
69 # Binary "logical or", (note that it does not "short circuit")
70 def binary| 5 (LHS RHS)
71 if LHS then
72 1
73 else if RHS then
74 1
75 else
76 0;
77
78 # Define = with slightly lower precedence than relationals.
79 def binary= 9 (LHS RHS)
80 !(LHS < RHS | LHS > RHS);
81
82Many languages aspire to being able to implement their standard runtime
83library in the language itself. In Kaleidoscope, we can implement
84significant parts of the language in the library!
85
86We will break down implementation of these features into two parts:
87implementing support for user-defined binary operators and adding unary
88operators.
89
90User-defined Binary Operators
91=============================
92
93Adding support for user-defined binary operators is pretty simple with
94our current framework. We'll first add support for the unary/binary
95keywords:
96
97.. code-block:: ocaml
98
99 type token =
100 ...
101 (* operators *)
102 | Binary | Unary
103
104 ...
105
106 and lex_ident buffer = parser
107 ...
108 | "for" -> [< 'Token.For; stream >]
109 | "in" -> [< 'Token.In; stream >]
110 | "binary" -> [< 'Token.Binary; stream >]
111 | "unary" -> [< 'Token.Unary; stream >]
112
113This just adds lexer support for the unary and binary keywords, like we
114did in `previous chapters <OCamlLangImpl5.html#iflexer>`_. One nice
115thing about our current AST, is that we represent binary operators with
116full generalisation by using their ASCII code as the opcode. For our
117extended operators, we'll use this same representation, so we don't need
118any new AST or parser support.
119
120On the other hand, we have to be able to represent the definitions of
121these new operators, in the "def binary\| 5" part of the function
122definition. In our grammar so far, the "name" for the function
123definition is parsed as the "prototype" production and into the
124``Ast.Prototype`` AST node. To represent our new user-defined operators
125as prototypes, we have to extend the ``Ast.Prototype`` AST node like
126this:
127
128.. code-block:: ocaml
129
130 (* proto - This type represents the "prototype" for a function, which captures
131 * its name, and its argument names (thus implicitly the number of arguments the
132 * function takes). *)
133 type proto =
134 | Prototype of string * string array
135 | BinOpPrototype of string * string array * int
136
137Basically, in addition to knowing a name for the prototype, we now keep
138track of whether it was an operator, and if it was, what precedence
139level the operator is at. The precedence is only used for binary
140operators (as you'll see below, it just doesn't apply for unary
141operators). Now that we have a way to represent the prototype for a
142user-defined operator, we need to parse it:
143
144.. code-block:: ocaml
145
146 (* prototype
147 * ::= id '(' id* ')'
148 * ::= binary LETTER number? (id, id)
149 * ::= unary LETTER number? (id) *)
150 let parse_prototype =
151 let rec parse_args accumulator = parser
152 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
153 | [< >] -> accumulator
154 in
155 let parse_operator = parser
156 | [< 'Token.Unary >] -> "unary", 1
157 | [< 'Token.Binary >] -> "binary", 2
158 in
159 let parse_binary_precedence = parser
160 | [< 'Token.Number n >] -> int_of_float n
161 | [< >] -> 30
162 in
163 parser
164 | [< 'Token.Ident id;
165 'Token.Kwd '(' ?? "expected '(' in prototype";
166 args=parse_args [];
167 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
168 (* success. *)
169 Ast.Prototype (id, Array.of_list (List.rev args))
170 | [< (prefix, kind)=parse_operator;
171 'Token.Kwd op ?? "expected an operator";
172 (* Read the precedence if present. *)
173 binary_precedence=parse_binary_precedence;
174 'Token.Kwd '(' ?? "expected '(' in prototype";
175 args=parse_args [];
176 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
177 let name = prefix ^ (String.make 1 op) in
178 let args = Array.of_list (List.rev args) in
179
180 (* Verify right number of arguments for operator. *)
181 if Array.length args != kind
182 then raise (Stream.Error "invalid number of operands for operator")
183 else
184 if kind == 1 then
185 Ast.Prototype (name, args)
186 else
187 Ast.BinOpPrototype (name, args, binary_precedence)
188 | [< >] ->
189 raise (Stream.Error "expected function name in prototype")
190
191This is all fairly straightforward parsing code, and we have already
192seen a lot of similar code in the past. One interesting part about the
193code above is the couple lines that set up ``name`` for binary
194operators. This builds names like "binary@" for a newly defined "@"
195operator. This then takes advantage of the fact that symbol names in the
196LLVM symbol table are allowed to have any character in them, including
197embedded nul characters.
198
199The next interesting thing to add, is codegen support for these binary
200operators. Given our current structure, this is a simple addition of a
201default case for our existing binary operator node:
202
203.. code-block:: ocaml
204
205 let codegen_expr = function
206 ...
207 | Ast.Binary (op, lhs, rhs) ->
208 let lhs_val = codegen_expr lhs in
209 let rhs_val = codegen_expr rhs in
210 begin
211 match op with
212 | '+' -> build_add lhs_val rhs_val "addtmp" builder
213 | '-' -> build_sub lhs_val rhs_val "subtmp" builder
214 | '*' -> build_mul lhs_val rhs_val "multmp" builder
215 | '<' ->
216 (* Convert bool 0/1 to double 0.0 or 1.0 *)
217 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
218 build_uitofp i double_type "booltmp" builder
219 | _ ->
220 (* If it wasn't a builtin binary operator, it must be a user defined
221 * one. Emit a call to it. *)
222 let callee = "binary" ^ (String.make 1 op) in
223 let callee =
224 match lookup_function callee the_module with
225 | Some callee -> callee
226 | None -> raise (Error "binary operator not found!")
227 in
228 build_call callee [|lhs_val; rhs_val|] "binop" builder
229 end
230
231As you can see above, the new code is actually really simple. It just
232does a lookup for the appropriate operator in the symbol table and
233generates a function call to it. Since user-defined operators are just
234built as normal functions (because the "prototype" boils down to a
235function with the right name) everything falls into place.
236
237The final piece of code we are missing, is a bit of top level magic:
238
239.. code-block:: ocaml
240
241 let codegen_func the_fpm = function
242 | Ast.Function (proto, body) ->
243 Hashtbl.clear named_values;
244 let the_function = codegen_proto proto in
245
246 (* If this is an operator, install it. *)
247 begin match proto with
248 | Ast.BinOpPrototype (name, args, prec) ->
249 let op = name.[String.length name - 1] in
250 Hashtbl.add Parser.binop_precedence op prec;
251 | _ -> ()
252 end;
253
254 (* Create a new basic block to start insertion into. *)
255 let bb = append_block context "entry" the_function in
256 position_at_end bb builder;
257 ...
258
259Basically, before codegening a function, if it is a user-defined
260operator, we register it in the precedence table. This allows the binary
261operator parsing logic we already have in place to handle it. Since we
262are working on a fully-general operator precedence parser, this is all
263we need to do to "extend the grammar".
264
265Now we have useful user-defined binary operators. This builds a lot on
266the previous framework we built for other operators. Adding unary
267operators is a bit more challenging, because we don't have any framework
268for it yet - lets see what it takes.
269
270User-defined Unary Operators
271============================
272
273Since we don't currently support unary operators in the Kaleidoscope
274language, we'll need to add everything to support them. Above, we added
275simple support for the 'unary' keyword to the lexer. In addition to
276that, we need an AST node:
277
278.. code-block:: ocaml
279
280 type expr =
281 ...
282 (* variant for a unary operator. *)
283 | Unary of char * expr
284 ...
285
286This AST node is very simple and obvious by now. It directly mirrors the
287binary operator AST node, except that it only has one child. With this,
288we need to add the parsing logic. Parsing a unary operator is pretty
289simple: we'll add a new function to do it:
290
291.. code-block:: ocaml
292
293 (* unary
294 * ::= primary
295 * ::= '!' unary *)
296 and parse_unary = parser
297 (* If this is a unary operator, read it. *)
298 | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] ->
299 Ast.Unary (op, operand)
300
301 (* If the current token is not an operator, it must be a primary expr. *)
302 | [< stream >] -> parse_primary stream
303
304The grammar we add is pretty straightforward here. If we see a unary
305operator when parsing a primary operator, we eat the operator as a
306prefix and parse the remaining piece as another unary operator. This
307allows us to handle multiple unary operators (e.g. "!!x"). Note that
308unary operators can't have ambiguous parses like binary operators can,
309so there is no need for precedence information.
310
311The problem with this function, is that we need to call ParseUnary from
312somewhere. To do this, we change previous callers of ParsePrimary to
313call ``parse_unary`` instead:
314
315.. code-block:: ocaml
316
317 (* binoprhs
318 * ::= ('+' primary)* *)
319 and parse_bin_rhs expr_prec lhs stream =
320 ...
321 (* Parse the unary expression after the binary operator. *)
322 let rhs = parse_unary stream in
323 ...
324
325 ...
326
327 (* expression
328 * ::= primary binoprhs *)
329 and parse_expr = parser
330 | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
331
332With these two simple changes, we are now able to parse unary operators
333and build the AST for them. Next up, we need to add parser support for
334prototypes, to parse the unary operator prototype. We extend the binary
335operator code above with:
336
337.. code-block:: ocaml
338
339 (* prototype
340 * ::= id '(' id* ')'
341 * ::= binary LETTER number? (id, id)
342 * ::= unary LETTER number? (id) *)
343 let parse_prototype =
344 let rec parse_args accumulator = parser
345 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
346 | [< >] -> accumulator
347 in
348 let parse_operator = parser
349 | [< 'Token.Unary >] -> "unary", 1
350 | [< 'Token.Binary >] -> "binary", 2
351 in
352 let parse_binary_precedence = parser
353 | [< 'Token.Number n >] -> int_of_float n
354 | [< >] -> 30
355 in
356 parser
357 | [< 'Token.Ident id;
358 'Token.Kwd '(' ?? "expected '(' in prototype";
359 args=parse_args [];
360 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
361 (* success. *)
362 Ast.Prototype (id, Array.of_list (List.rev args))
363 | [< (prefix, kind)=parse_operator;
364 'Token.Kwd op ?? "expected an operator";
365 (* Read the precedence if present. *)
366 binary_precedence=parse_binary_precedence;
367 'Token.Kwd '(' ?? "expected '(' in prototype";
368 args=parse_args [];
369 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
370 let name = prefix ^ (String.make 1 op) in
371 let args = Array.of_list (List.rev args) in
372
373 (* Verify right number of arguments for operator. *)
374 if Array.length args != kind
375 then raise (Stream.Error "invalid number of operands for operator")
376 else
377 if kind == 1 then
378 Ast.Prototype (name, args)
379 else
380 Ast.BinOpPrototype (name, args, binary_precedence)
381 | [< >] ->
382 raise (Stream.Error "expected function name in prototype")
383
384As with binary operators, we name unary operators with a name that
385includes the operator character. This assists us at code generation
386time. Speaking of, the final piece we need to add is codegen support for
387unary operators. It looks like this:
388
389.. code-block:: ocaml
390
391 let rec codegen_expr = function
392 ...
393 | Ast.Unary (op, operand) ->
394 let operand = codegen_expr operand in
395 let callee = "unary" ^ (String.make 1 op) in
396 let callee =
397 match lookup_function callee the_module with
398 | Some callee -> callee
399 | None -> raise (Error "unknown unary operator")
400 in
401 build_call callee [|operand|] "unop" builder
402
403This code is similar to, but simpler than, the code for binary
404operators. It is simpler primarily because it doesn't need to handle any
405predefined operators.
406
407Kicking the Tires
408=================
409
410It is somewhat hard to believe, but with a few simple extensions we've
411covered in the last chapters, we have grown a real-ish language. With
412this, we can do a lot of interesting things, including I/O, math, and a
413bunch of other things. For example, we can now add a nice sequencing
414operator (printd is defined to print out the specified value and a
415newline):
416
417::
418
419 ready> extern printd(x);
420 Read extern: declare double @printd(double)
421 ready> def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.
422 ..
423 ready> printd(123) : printd(456) : printd(789);
424 123.000000
425 456.000000
426 789.000000
427 Evaluated to 0.000000
428
429We can also define a bunch of other "primitive" operations, such as:
430
431::
432
433 # Logical unary not.
434 def unary!(v)
435 if v then
436 0
437 else
438 1;
439
440 # Unary negate.
441 def unary-(v)
442 0-v;
443
444 # Define > with the same precedence as <.
445 def binary> 10 (LHS RHS)
446 RHS < LHS;
447
448 # Binary logical or, which does not short circuit.
449 def binary| 5 (LHS RHS)
450 if LHS then
451 1
452 else if RHS then
453 1
454 else
455 0;
456
457 # Binary logical and, which does not short circuit.
458 def binary& 6 (LHS RHS)
459 if !LHS then
460 0
461 else
462 !!RHS;
463
464 # Define = with slightly lower precedence than relationals.
465 def binary = 9 (LHS RHS)
466 !(LHS < RHS | LHS > RHS);
467
468Given the previous if/then/else support, we can also define interesting
469functions for I/O. For example, the following prints out a character
470whose "density" reflects the value passed in: the lower the value, the
471denser the character:
472
473::
474
475 ready>
476
477 extern putchard(char)
478 def printdensity(d)
479 if d > 8 then
480 putchard(32) # ' '
481 else if d > 4 then
482 putchard(46) # '.'
483 else if d > 2 then
484 putchard(43) # '+'
485 else
486 putchard(42); # '*'
487 ...
488 ready> printdensity(1): printdensity(2): printdensity(3) :
489 printdensity(4): printdensity(5): printdensity(9): putchard(10);
490 *++..
491 Evaluated to 0.000000
492
493Based on these simple primitive operations, we can start to define more
494interesting things. For example, here's a little function that solves
495for the number of iterations it takes a function in the complex plane to
496converge:
497
498::
499
500 # determine whether the specific location diverges.
501 # Solve for z = z^2 + c in the complex plane.
502 def mandleconverger(real imag iters creal cimag)
503 if iters > 255 | (real*real + imag*imag > 4) then
504 iters
505 else
506 mandleconverger(real*real - imag*imag + creal,
507 2*real*imag + cimag,
508 iters+1, creal, cimag);
509
510 # return the number of iterations required for the iteration to escape
511 def mandleconverge(real imag)
512 mandleconverger(real, imag, 0, real, imag);
513
514This "z = z\ :sup:`2`\ + c" function is a beautiful little creature
515that is the basis for computation of the `Mandelbrot
516Set <http://en.wikipedia.org/wiki/Mandelbrot_set>`_. Our
517``mandelconverge`` function returns the number of iterations that it
518takes for a complex orbit to escape, saturating to 255. This is not a
519very useful function by itself, but if you plot its value over a
520two-dimensional plane, you can see the Mandelbrot set. Given that we are
521limited to using putchard here, our amazing graphical output is limited,
522but we can whip together something using the density plotter above:
523
524::
525
526 # compute and plot the mandlebrot set with the specified 2 dimensional range
527 # info.
528 def mandelhelp(xmin xmax xstep ymin ymax ystep)
529 for y = ymin, y < ymax, ystep in (
530 (for x = xmin, x < xmax, xstep in
531 printdensity(mandleconverge(x,y)))
532 : putchard(10)
533 )
534
535 # mandel - This is a convenient helper function for plotting the mandelbrot set
536 # from the specified position with the specified Magnification.
537 def mandel(realstart imagstart realmag imagmag)
538 mandelhelp(realstart, realstart+realmag*78, realmag,
539 imagstart, imagstart+imagmag*40, imagmag);
540
541Given this, we can try plotting out the mandlebrot set! Lets try it out:
542
543::
544
545 ready> mandel(-2.3, -1.3, 0.05, 0.07);
546 *******************************+++++++++++*************************************
547 *************************+++++++++++++++++++++++*******************************
548 **********************+++++++++++++++++++++++++++++****************************
549 *******************+++++++++++++++++++++.. ...++++++++*************************
550 *****************++++++++++++++++++++++.... ...+++++++++***********************
551 ***************+++++++++++++++++++++++..... ...+++++++++*********************
552 **************+++++++++++++++++++++++.... ....+++++++++********************
553 *************++++++++++++++++++++++...... .....++++++++*******************
554 ************+++++++++++++++++++++....... .......+++++++******************
555 ***********+++++++++++++++++++.... ... .+++++++*****************
556 **********+++++++++++++++++....... .+++++++****************
557 *********++++++++++++++........... ...+++++++***************
558 ********++++++++++++............ ...++++++++**************
559 ********++++++++++... .......... .++++++++**************
560 *******+++++++++..... .+++++++++*************
561 *******++++++++...... ..+++++++++*************
562 *******++++++....... ..+++++++++*************
563 *******+++++...... ..+++++++++*************
564 *******.... .... ...+++++++++*************
565 *******.... . ...+++++++++*************
566 *******+++++...... ...+++++++++*************
567 *******++++++....... ..+++++++++*************
568 *******++++++++...... .+++++++++*************
569 *******+++++++++..... ..+++++++++*************
570 ********++++++++++... .......... .++++++++**************
571 ********++++++++++++............ ...++++++++**************
572 *********++++++++++++++.......... ...+++++++***************
573 **********++++++++++++++++........ .+++++++****************
574 **********++++++++++++++++++++.... ... ..+++++++****************
575 ***********++++++++++++++++++++++....... .......++++++++*****************
576 ************+++++++++++++++++++++++...... ......++++++++******************
577 **************+++++++++++++++++++++++.... ....++++++++********************
578 ***************+++++++++++++++++++++++..... ...+++++++++*********************
579 *****************++++++++++++++++++++++.... ...++++++++***********************
580 *******************+++++++++++++++++++++......++++++++*************************
581 *********************++++++++++++++++++++++.++++++++***************************
582 *************************+++++++++++++++++++++++*******************************
583 ******************************+++++++++++++************************************
584 *******************************************************************************
585 *******************************************************************************
586 *******************************************************************************
587 Evaluated to 0.000000
588 ready> mandel(-2, -1, 0.02, 0.04);
589 **************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
590 ***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
591 *********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
592 *******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
593 *****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
594 ***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
595 **************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
596 ************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
597 ***********++++++++++++++++++++++++++++++++++++++++++++++++++........ .
598 **********++++++++++++++++++++++++++++++++++++++++++++++.............
599 ********+++++++++++++++++++++++++++++++++++++++++++..................
600 *******+++++++++++++++++++++++++++++++++++++++.......................
601 ******+++++++++++++++++++++++++++++++++++...........................
602 *****++++++++++++++++++++++++++++++++............................
603 *****++++++++++++++++++++++++++++...............................
604 ****++++++++++++++++++++++++++...... .........................
605 ***++++++++++++++++++++++++......... ...... ...........
606 ***++++++++++++++++++++++............
607 **+++++++++++++++++++++..............
608 **+++++++++++++++++++................
609 *++++++++++++++++++.................
610 *++++++++++++++++............ ...
611 *++++++++++++++..............
612 *+++....++++................
613 *.......... ...........
614 *
615 *.......... ...........
616 *+++....++++................
617 *++++++++++++++..............
618 *++++++++++++++++............ ...
619 *++++++++++++++++++.................
620 **+++++++++++++++++++................
621 **+++++++++++++++++++++..............
622 ***++++++++++++++++++++++............
623 ***++++++++++++++++++++++++......... ...... ...........
624 ****++++++++++++++++++++++++++...... .........................
625 *****++++++++++++++++++++++++++++...............................
626 *****++++++++++++++++++++++++++++++++............................
627 ******+++++++++++++++++++++++++++++++++++...........................
628 *******+++++++++++++++++++++++++++++++++++++++.......................
629 ********+++++++++++++++++++++++++++++++++++++++++++..................
630 Evaluated to 0.000000
631 ready> mandel(-0.9, -1.4, 0.02, 0.03);
632 *******************************************************************************
633 *******************************************************************************
634 *******************************************************************************
635 **********+++++++++++++++++++++************************************************
636 *+++++++++++++++++++++++++++++++++++++++***************************************
637 +++++++++++++++++++++++++++++++++++++++++++++**********************************
638 ++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
639 ++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
640 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
641 +++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
642 +++++++++++++++++++++++++++++++.... ......+++++++++++++++++++****************
643 +++++++++++++++++++++++++++++....... ........+++++++++++++++++++**************
644 ++++++++++++++++++++++++++++........ ........++++++++++++++++++++************
645 +++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++**********
646 ++++++++++++++++++++++++++........... ....++++++++++++++++++++++********
647 ++++++++++++++++++++++++............. .......++++++++++++++++++++++******
648 +++++++++++++++++++++++............. ........+++++++++++++++++++++++****
649 ++++++++++++++++++++++........... ..........++++++++++++++++++++++***
650 ++++++++++++++++++++........... .........++++++++++++++++++++++*
651 ++++++++++++++++++............ ...........++++++++++++++++++++
652 ++++++++++++++++............... .............++++++++++++++++++
653 ++++++++++++++................. ...............++++++++++++++++
654 ++++++++++++.................. .................++++++++++++++
655 +++++++++.................. .................+++++++++++++
656 ++++++........ . ......... ..++++++++++++
657 ++............ ...... ....++++++++++
658 .............. ...++++++++++
659 .............. ....+++++++++
660 .............. .....++++++++
661 ............. ......++++++++
662 ........... .......++++++++
663 ......... ........+++++++
664 ......... ........+++++++
665 ......... ....+++++++
666 ........ ...+++++++
667 ....... ...+++++++
668 ....+++++++
669 .....+++++++
670 ....+++++++
671 ....+++++++
672 ....+++++++
673 Evaluated to 0.000000
674 ready> ^D
675
676At this point, you may be starting to realize that Kaleidoscope is a
677real and powerful language. It may not be self-similar :), but it can be
678used to plot things that are!
679
680With this, we conclude the "adding user-defined operators" chapter of
681the tutorial. We have successfully augmented our language, adding the
682ability to extend the language in the library, and we have shown how
683this can be used to build a simple but interesting end-user application
684in Kaleidoscope. At this point, Kaleidoscope can build a variety of
685applications that are functional and can call functions with
686side-effects, but it can't actually define and mutate a variable itself.
687
688Strikingly, variable mutation is an important feature of some languages,
689and it is not at all obvious how to `add support for mutable
690variables <OCamlLangImpl7.html>`_ without having to add an "SSA
691construction" phase to your front-end. In the next chapter, we will
692describe how you can add variable mutation without building SSA in your
693front-end.
694
695Full Code Listing
696=================
697
698Here is the complete code listing for our running example, enhanced with
699the if/then/else and for expressions.. To build this example, use:
700
701.. code-block:: bash
702
703 # Compile
704 ocamlbuild toy.byte
705 # Run
706 ./toy.byte
707
708Here is the code:
709
710\_tags:
711 ::
712
713 <{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
714 <*.{byte,native}>: g++, use_llvm, use_llvm_analysis
715 <*.{byte,native}>: use_llvm_executionengine, use_llvm_target
716 <*.{byte,native}>: use_llvm_scalar_opts, use_bindings
717
718myocamlbuild.ml:
719 .. code-block:: ocaml
720
721 open Ocamlbuild_plugin;;
722
723 ocaml_lib ~extern:true "llvm";;
724 ocaml_lib ~extern:true "llvm_analysis";;
725 ocaml_lib ~extern:true "llvm_executionengine";;
726 ocaml_lib ~extern:true "llvm_target";;
727 ocaml_lib ~extern:true "llvm_scalar_opts";;
728
729 flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);;
730 dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
731
732token.ml:
733 .. code-block:: ocaml
734
735 (*===----------------------------------------------------------------------===
736 * Lexer Tokens
737 *===----------------------------------------------------------------------===*)
738
739 (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
740 * these others for known things. *)
741 type token =
742 (* commands *)
743 | Def | Extern
744
745 (* primary *)
746 | Ident of string | Number of float
747
748 (* unknown *)
749 | Kwd of char
750
751 (* control *)
752 | If | Then | Else
753 | For | In
754
755 (* operators *)
756 | Binary | Unary
757
758lexer.ml:
759 .. code-block:: ocaml
760
761 (*===----------------------------------------------------------------------===
762 * Lexer
763 *===----------------------------------------------------------------------===*)
764
765 let rec lex = parser
766 (* Skip any whitespace. *)
767 | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
768
769 (* identifier: [a-zA-Z][a-zA-Z0-9] *)
770 | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
771 let buffer = Buffer.create 1 in
772 Buffer.add_char buffer c;
773 lex_ident buffer stream
774
775 (* number: [0-9.]+ *)
776 | [< ' ('0' .. '9' as c); stream >] ->
777 let buffer = Buffer.create 1 in
778 Buffer.add_char buffer c;
779 lex_number buffer stream
780
781 (* Comment until end of line. *)
782 | [< ' ('#'); stream >] ->
783 lex_comment stream
784
785 (* Otherwise, just return the character as its ascii value. *)
786 | [< 'c; stream >] ->
787 [< 'Token.Kwd c; lex stream >]
788
789 (* end of stream. *)
790 | [< >] -> [< >]
791
792 and lex_number buffer = parser
793 | [< ' ('0' .. '9' | '.' as c); stream >] ->
794 Buffer.add_char buffer c;
795 lex_number buffer stream
796 | [< stream=lex >] ->
797 [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
798
799 and lex_ident buffer = parser
800 | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
801 Buffer.add_char buffer c;
802 lex_ident buffer stream
803 | [< stream=lex >] ->
804 match Buffer.contents buffer with
805 | "def" -> [< 'Token.Def; stream >]
806 | "extern" -> [< 'Token.Extern; stream >]
807 | "if" -> [< 'Token.If; stream >]
808 | "then" -> [< 'Token.Then; stream >]
809 | "else" -> [< 'Token.Else; stream >]
810 | "for" -> [< 'Token.For; stream >]
811 | "in" -> [< 'Token.In; stream >]
812 | "binary" -> [< 'Token.Binary; stream >]
813 | "unary" -> [< 'Token.Unary; stream >]
814 | id -> [< 'Token.Ident id; stream >]
815
816 and lex_comment = parser
817 | [< ' ('\n'); stream=lex >] -> stream
818 | [< 'c; e=lex_comment >] -> e
819 | [< >] -> [< >]
820
821ast.ml:
822 .. code-block:: ocaml
823
824 (*===----------------------------------------------------------------------===
825 * Abstract Syntax Tree (aka Parse Tree)
826 *===----------------------------------------------------------------------===*)
827
828 (* expr - Base type for all expression nodes. *)
829 type expr =
830 (* variant for numeric literals like "1.0". *)
831 | Number of float
832
833 (* variant for referencing a variable, like "a". *)
834 | Variable of string
835
836 (* variant for a unary operator. *)
837 | Unary of char * expr
838
839 (* variant for a binary operator. *)
840 | Binary of char * expr * expr
841
842 (* variant for function calls. *)
843 | Call of string * expr array
844
845 (* variant for if/then/else. *)
846 | If of expr * expr * expr
847
848 (* variant for for/in. *)
849 | For of string * expr * expr * expr option * expr
850
851 (* proto - This type represents the "prototype" for a function, which captures
852 * its name, and its argument names (thus implicitly the number of arguments the
853 * function takes). *)
854 type proto =
855 | Prototype of string * string array
856 | BinOpPrototype of string * string array * int
857
858 (* func - This type represents a function definition itself. *)
859 type func = Function of proto * expr
860
861parser.ml:
862 .. code-block:: ocaml
863
864 (*===---------------------------------------------------------------------===
865 * Parser
866 *===---------------------------------------------------------------------===*)
867
868 (* binop_precedence - This holds the precedence for each binary operator that is
869 * defined *)
870 let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
871
872 (* precedence - Get the precedence of the pending binary operator token. *)
873 let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
874
875 (* primary
876 * ::= identifier
877 * ::= numberexpr
878 * ::= parenexpr
879 * ::= ifexpr
880 * ::= forexpr *)
881 let rec parse_primary = parser
882 (* numberexpr ::= number *)
883 | [< 'Token.Number n >] -> Ast.Number n
884
885 (* parenexpr ::= '(' expression ')' *)
886 | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
887
888 (* identifierexpr
889 * ::= identifier
890 * ::= identifier '(' argumentexpr ')' *)
891 | [< 'Token.Ident id; stream >] ->
892 let rec parse_args accumulator = parser
893 | [< e=parse_expr; stream >] ->
894 begin parser
895 | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
896 | [< >] -> e :: accumulator
897 end stream
898 | [< >] -> accumulator
899 in
900 let rec parse_ident id = parser
901 (* Call. *)
902 | [< 'Token.Kwd '(';
903 args=parse_args [];
904 'Token.Kwd ')' ?? "expected ')'">] ->
905 Ast.Call (id, Array.of_list (List.rev args))
906
907 (* Simple variable ref. *)
908 | [< >] -> Ast.Variable id
909 in
910 parse_ident id stream
911
912 (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
913 | [< 'Token.If; c=parse_expr;
914 'Token.Then ?? "expected 'then'"; t=parse_expr;
915 'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
916 Ast.If (c, t, e)
917
918 (* forexpr
919 ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
920 | [< 'Token.For;
921 'Token.Ident id ?? "expected identifier after for";
922 'Token.Kwd '=' ?? "expected '=' after for";
923 stream >] ->
924 begin parser
925 | [<
926 start=parse_expr;
927 'Token.Kwd ',' ?? "expected ',' after for";
928 end_=parse_expr;
929 stream >] ->
930 let step =
931 begin parser
932 | [< 'Token.Kwd ','; step=parse_expr >] -> Some step
933 | [< >] -> None
934 end stream
935 in
936 begin parser
937 | [< 'Token.In; body=parse_expr >] ->
938 Ast.For (id, start, end_, step, body)
939 | [< >] ->
940 raise (Stream.Error "expected 'in' after for")
941 end stream
942 | [< >] ->
943 raise (Stream.Error "expected '=' after for")
944 end stream
945
946 | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
947
948 (* unary
949 * ::= primary
950 * ::= '!' unary *)
951 and parse_unary = parser
952 (* If this is a unary operator, read it. *)
953 | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] ->
954 Ast.Unary (op, operand)
955
956 (* If the current token is not an operator, it must be a primary expr. *)
957 | [< stream >] -> parse_primary stream
958
959 (* binoprhs
960 * ::= ('+' primary)* *)
961 and parse_bin_rhs expr_prec lhs stream =
962 match Stream.peek stream with
963 (* If this is a binop, find its precedence. *)
964 | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
965 let token_prec = precedence c in
966
967 (* If this is a binop that binds at least as tightly as the current binop,
968 * consume it, otherwise we are done. *)
969 if token_prec < expr_prec then lhs else begin
970 (* Eat the binop. *)
971 Stream.junk stream;
972
973 (* Parse the unary expression after the binary operator. *)
974 let rhs = parse_unary stream in
975
976 (* Okay, we know this is a binop. *)
977 let rhs =
978 match Stream.peek stream with
979 | Some (Token.Kwd c2) ->
980 (* If BinOp binds less tightly with rhs than the operator after
981 * rhs, let the pending operator take rhs as its lhs. *)
982 let next_prec = precedence c2 in
983 if token_prec < next_prec
984 then parse_bin_rhs (token_prec + 1) rhs stream
985 else rhs
986 | _ -> rhs
987 in
988
989 (* Merge lhs/rhs. *)
990 let lhs = Ast.Binary (c, lhs, rhs) in
991 parse_bin_rhs expr_prec lhs stream
992 end
993 | _ -> lhs
994
995 (* expression
996 * ::= primary binoprhs *)
997 and parse_expr = parser
998 | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
999
1000 (* prototype
1001 * ::= id '(' id* ')'
1002 * ::= binary LETTER number? (id, id)
1003 * ::= unary LETTER number? (id) *)
1004 let parse_prototype =
1005 let rec parse_args accumulator = parser
1006 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
1007 | [< >] -> accumulator
1008 in
1009 let parse_operator = parser
1010 | [< 'Token.Unary >] -> "unary", 1
1011 | [< 'Token.Binary >] -> "binary", 2
1012 in
1013 let parse_binary_precedence = parser
1014 | [< 'Token.Number n >] -> int_of_float n
1015 | [< >] -> 30
1016 in
1017 parser
1018 | [< 'Token.Ident id;
1019 'Token.Kwd '(' ?? "expected '(' in prototype";
1020 args=parse_args [];
1021 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
1022 (* success. *)
1023 Ast.Prototype (id, Array.of_list (List.rev args))
1024 | [< (prefix, kind)=parse_operator;
1025 'Token.Kwd op ?? "expected an operator";
1026 (* Read the precedence if present. *)
1027 binary_precedence=parse_binary_precedence;
1028 'Token.Kwd '(' ?? "expected '(' in prototype";
1029 args=parse_args [];
1030 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
1031 let name = prefix ^ (String.make 1 op) in
1032 let args = Array.of_list (List.rev args) in
1033
1034 (* Verify right number of arguments for operator. *)
1035 if Array.length args != kind
1036 then raise (Stream.Error "invalid number of operands for operator")
1037 else
1038 if kind == 1 then
1039 Ast.Prototype (name, args)
1040 else
1041 Ast.BinOpPrototype (name, args, binary_precedence)
1042 | [< >] ->
1043 raise (Stream.Error "expected function name in prototype")
1044
1045 (* definition ::= 'def' prototype expression *)
1046 let parse_definition = parser
1047 | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
1048 Ast.Function (p, e)
1049
1050 (* toplevelexpr ::= expression *)
1051 let parse_toplevel = parser
1052 | [< e=parse_expr >] ->
1053 (* Make an anonymous proto. *)
1054 Ast.Function (Ast.Prototype ("", [||]), e)
1055
1056 (* external ::= 'extern' prototype *)
1057 let parse_extern = parser
1058 | [< 'Token.Extern; e=parse_prototype >] -> e
1059
1060codegen.ml:
1061 .. code-block:: ocaml
1062
1063 (*===----------------------------------------------------------------------===
1064 * Code Generation
1065 *===----------------------------------------------------------------------===*)
1066
1067 open Llvm
1068
1069 exception Error of string
1070
1071 let context = global_context ()
1072 let the_module = create_module context "my cool jit"
1073 let builder = builder context
1074 let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
1075 let double_type = double_type context
1076
1077 let rec codegen_expr = function
1078 | Ast.Number n -> const_float double_type n
1079 | Ast.Variable name ->
1080 (try Hashtbl.find named_values name with
1081 | Not_found -> raise (Error "unknown variable name"))
1082 | Ast.Unary (op, operand) ->
1083 let operand = codegen_expr operand in
1084 let callee = "unary" ^ (String.make 1 op) in
1085 let callee =
1086 match lookup_function callee the_module with
1087 | Some callee -> callee
1088 | None -> raise (Error "unknown unary operator")
1089 in
1090 build_call callee [|operand|] "unop" builder
1091 | Ast.Binary (op, lhs, rhs) ->
1092 let lhs_val = codegen_expr lhs in
1093 let rhs_val = codegen_expr rhs in
1094 begin
1095 match op with
1096 | '+' -> build_add lhs_val rhs_val "addtmp" builder
1097 | '-' -> build_sub lhs_val rhs_val "subtmp" builder
1098 | '*' -> build_mul lhs_val rhs_val "multmp" builder
1099 | '<' ->
1100 (* Convert bool 0/1 to double 0.0 or 1.0 *)
1101 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
1102 build_uitofp i double_type "booltmp" builder
1103 | _ ->
1104 (* If it wasn't a builtin binary operator, it must be a user defined
1105 * one. Emit a call to it. *)
1106 let callee = "binary" ^ (String.make 1 op) in
1107 let callee =
1108 match lookup_function callee the_module with
1109 | Some callee -> callee
1110 | None -> raise (Error "binary operator not found!")
1111 in
1112 build_call callee [|lhs_val; rhs_val|] "binop" builder
1113 end
1114 | Ast.Call (callee, args) ->
1115 (* Look up the name in the module table. *)
1116 let callee =
1117 match lookup_function callee the_module with
1118 | Some callee -> callee
1119 | None -> raise (Error "unknown function referenced")
1120 in
1121 let params = params callee in
1122
1123 (* If argument mismatch error. *)
1124 if Array.length params == Array.length args then () else
1125 raise (Error "incorrect # arguments passed");
1126 let args = Array.map codegen_expr args in
1127 build_call callee args "calltmp" builder
1128 | Ast.If (cond, then_, else_) ->
1129 let cond = codegen_expr cond in
1130
1131 (* Convert condition to a bool by comparing equal to 0.0 *)
1132 let zero = const_float double_type 0.0 in
1133 let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
1134
1135 (* Grab the first block so that we might later add the conditional branch
1136 * to it at the end of the function. *)
1137 let start_bb = insertion_block builder in
1138 let the_function = block_parent start_bb in
1139
1140 let then_bb = append_block context "then" the_function in
1141
1142 (* Emit 'then' value. *)
1143 position_at_end then_bb builder;
1144 let then_val = codegen_expr then_ in
1145
1146 (* Codegen of 'then' can change the current block, update then_bb for the
1147 * phi. We create a new name because one is used for the phi node, and the
1148 * other is used for the conditional branch. *)
1149 let new_then_bb = insertion_block builder in
1150
1151 (* Emit 'else' value. *)
1152 let else_bb = append_block context "else" the_function in
1153 position_at_end else_bb builder;
1154 let else_val = codegen_expr else_ in
1155
1156 (* Codegen of 'else' can change the current block, update else_bb for the
1157 * phi. *)
1158 let new_else_bb = insertion_block builder in
1159
1160 (* Emit merge block. *)
1161 let merge_bb = append_block context "ifcont" the_function in
1162 position_at_end merge_bb builder;
1163 let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
1164 let phi = build_phi incoming "iftmp" builder in
1165
1166 (* Return to the start block to add the conditional branch. *)
1167 position_at_end start_bb builder;
1168 ignore (build_cond_br cond_val then_bb else_bb builder);
1169
1170 (* Set a unconditional branch at the end of the 'then' block and the
1171 * 'else' block to the 'merge' block. *)
1172 position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
1173 position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
1174
1175 (* Finally, set the builder to the end of the merge block. *)
1176 position_at_end merge_bb builder;
1177
1178 phi
1179 | Ast.For (var_name, start, end_, step, body) ->
1180 (* Emit the start code first, without 'variable' in scope. *)
1181 let start_val = codegen_expr start in
1182
1183 (* Make the new basic block for the loop header, inserting after current
1184 * block. *)
1185 let preheader_bb = insertion_block builder in
1186 let the_function = block_parent preheader_bb in
1187 let loop_bb = append_block context "loop" the_function in
1188
1189 (* Insert an explicit fall through from the current block to the
1190 * loop_bb. *)
1191 ignore (build_br loop_bb builder);
1192
1193 (* Start insertion in loop_bb. *)
1194 position_at_end loop_bb builder;
1195
1196 (* Start the PHI node with an entry for start. *)
1197 let variable = build_phi [(start_val, preheader_bb)] var_name builder in
1198
1199 (* Within the loop, the variable is defined equal to the PHI node. If it
1200 * shadows an existing variable, we have to restore it, so save it
1201 * now. *)
1202 let old_val =
1203 try Some (Hashtbl.find named_values var_name) with Not_found -> None
1204 in
1205 Hashtbl.add named_values var_name variable;
1206
1207 (* Emit the body of the loop. This, like any other expr, can change the
1208 * current BB. Note that we ignore the value computed by the body, but
1209 * don't allow an error *)
1210 ignore (codegen_expr body);
1211
1212 (* Emit the step value. *)
1213 let step_val =
1214 match step with
1215 | Some step -> codegen_expr step
1216 (* If not specified, use 1.0. *)
1217 | None -> const_float double_type 1.0
1218 in
1219
1220 let next_var = build_add variable step_val "nextvar" builder in
1221
1222 (* Compute the end condition. *)
1223 let end_cond = codegen_expr end_ in
1224
1225 (* Convert condition to a bool by comparing equal to 0.0. *)
1226 let zero = const_float double_type 0.0 in
1227 let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
1228
1229 (* Create the "after loop" block and insert it. *)
1230 let loop_end_bb = insertion_block builder in
1231 let after_bb = append_block context "afterloop" the_function in
1232
1233 (* Insert the conditional branch into the end of loop_end_bb. *)
1234 ignore (build_cond_br end_cond loop_bb after_bb builder);
1235
1236 (* Any new code will be inserted in after_bb. *)
1237 position_at_end after_bb builder;
1238
1239 (* Add a new entry to the PHI node for the backedge. *)
1240 add_incoming (next_var, loop_end_bb) variable;
1241
1242 (* Restore the unshadowed variable. *)
1243 begin match old_val with
1244 | Some old_val -> Hashtbl.add named_values var_name old_val
1245 | None -> ()
1246 end;
1247
1248 (* for expr always returns 0.0. *)
1249 const_null double_type
1250
1251 let codegen_proto = function
1252 | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) ->
1253 (* Make the function type: double(double,double) etc. *)
1254 let doubles = Array.make (Array.length args) double_type in
1255 let ft = function_type double_type doubles in
1256 let f =
1257 match lookup_function name the_module with
1258 | None -> declare_function name ft the_module
1259
1260 (* If 'f' conflicted, there was already something named 'name'. If it
1261 * has a body, don't allow redefinition or reextern. *)
1262 | Some f ->
1263 (* If 'f' already has a body, reject this. *)
1264 if block_begin f <> At_end f then
1265 raise (Error "redefinition of function");
1266
1267 (* If 'f' took a different number of arguments, reject. *)
1268 if element_type (type_of f) <> ft then
1269 raise (Error "redefinition of function with different # args");
1270 f
1271 in
1272
1273 (* Set names for all arguments. *)
1274 Array.iteri (fun i a ->
1275 let n = args.(i) in
1276 set_value_name n a;
1277 Hashtbl.add named_values n a;
1278 ) (params f);
1279 f
1280
1281 let codegen_func the_fpm = function
1282 | Ast.Function (proto, body) ->
1283 Hashtbl.clear named_values;
1284 let the_function = codegen_proto proto in
1285
1286 (* If this is an operator, install it. *)
1287 begin match proto with
1288 | Ast.BinOpPrototype (name, args, prec) ->
1289 let op = name.[String.length name - 1] in
1290 Hashtbl.add Parser.binop_precedence op prec;
1291 | _ -> ()
1292 end;
1293
1294 (* Create a new basic block to start insertion into. *)
1295 let bb = append_block context "entry" the_function in
1296 position_at_end bb builder;
1297
1298 try
1299 let ret_val = codegen_expr body in
1300
1301 (* Finish off the function. *)
1302 let _ = build_ret ret_val builder in
1303
1304 (* Validate the generated code, checking for consistency. *)
1305 Llvm_analysis.assert_valid_function the_function;
1306
1307 (* Optimize the function. *)
1308 let _ = PassManager.run_function the_function the_fpm in
1309
1310 the_function
1311 with e ->
1312 delete_function the_function;
1313 raise e
1314
1315toplevel.ml:
1316 .. code-block:: ocaml
1317
1318 (*===----------------------------------------------------------------------===
1319 * Top-Level parsing and JIT Driver
1320 *===----------------------------------------------------------------------===*)
1321
1322 open Llvm
1323 open Llvm_executionengine
1324
1325 (* top ::= definition | external | expression | ';' *)
1326 let rec main_loop the_fpm the_execution_engine stream =
1327 match Stream.peek stream with
1328 | None -> ()
1329
1330 (* ignore top-level semicolons. *)
1331 | Some (Token.Kwd ';') ->
1332 Stream.junk stream;
1333 main_loop the_fpm the_execution_engine stream
1334
1335 | Some token ->
1336 begin
1337 try match token with
1338 | Token.Def ->
1339 let e = Parser.parse_definition stream in
1340 print_endline "parsed a function definition.";
1341 dump_value (Codegen.codegen_func the_fpm e);
1342 | Token.Extern ->
1343 let e = Parser.parse_extern stream in
1344 print_endline "parsed an extern.";
1345 dump_value (Codegen.codegen_proto e);
1346 | _ ->
1347 (* Evaluate a top-level expression into an anonymous function. *)
1348 let e = Parser.parse_toplevel stream in
1349 print_endline "parsed a top-level expr";
1350 let the_function = Codegen.codegen_func the_fpm e in
1351 dump_value the_function;
1352
1353 (* JIT the function, returning a function pointer. *)
1354 let result = ExecutionEngine.run_function the_function [||]
1355 the_execution_engine in
1356
1357 print_string "Evaluated to ";
1358 print_float (GenericValue.as_float Codegen.double_type result);
1359 print_newline ();
1360 with Stream.Error s | Codegen.Error s ->
1361 (* Skip token for error recovery. *)
1362 Stream.junk stream;
1363 print_endline s;
1364 end;
1365 print_string "ready> "; flush stdout;
1366 main_loop the_fpm the_execution_engine stream
1367
1368toy.ml:
1369 .. code-block:: ocaml
1370
1371 (*===----------------------------------------------------------------------===
1372 * Main driver code.
1373 *===----------------------------------------------------------------------===*)
1374
1375 open Llvm
1376 open Llvm_executionengine
1377 open Llvm_target
1378 open Llvm_scalar_opts
1379
1380 let main () =
1381 ignore (initialize_native_target ());
1382
1383 (* Install standard binary operators.
1384 * 1 is the lowest precedence. *)
1385 Hashtbl.add Parser.binop_precedence '<' 10;
1386 Hashtbl.add Parser.binop_precedence '+' 20;
1387 Hashtbl.add Parser.binop_precedence '-' 20;
1388 Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *)
1389
1390 (* Prime the first token. *)
1391 print_string "ready> "; flush stdout;
1392 let stream = Lexer.lex (Stream.of_channel stdin) in
1393
1394 (* Create the JIT. *)
1395 let the_execution_engine = ExecutionEngine.create Codegen.the_module in
1396 let the_fpm = PassManager.create_function Codegen.the_module in
1397
1398 (* Set up the optimizer pipeline. Start with registering info about how the
1399 * target lays out data structures. *)
1400 DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
1401
1402 (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
1403 add_instruction_combination the_fpm;
1404
1405 (* reassociate expressions. *)
1406 add_reassociation the_fpm;
1407
1408 (* Eliminate Common SubExpressions. *)
1409 add_gvn the_fpm;
1410
1411 (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
1412 add_cfg_simplification the_fpm;
1413
1414 ignore (PassManager.initialize the_fpm);
1415
1416 (* Run the main "interpreter loop" now. *)
1417 Toplevel.main_loop the_fpm the_execution_engine stream;
1418
1419 (* Print out all the generated code. *)
1420 dump_module Codegen.the_module
1421 ;;
1422
1423 main ()
1424
1425bindings.c
1426 .. code-block:: c
1427
1428 #include <stdio.h>
1429
1430 /* putchard - putchar that takes a double and returns 0. */
1431 extern double putchard(double X) {
1432 putchar((char)X);
1433 return 0;
1434 }
1435
1436 /* printd - printf that takes a double prints it as "%f\n", returning 0. */
1437 extern double printd(double X) {
1438 printf("%f\n", X);
1439 return 0;
1440 }
1441
1442`Next: Extending the language: mutable variables / SSA
1443construction <OCamlLangImpl7.html>`_
1444