=====================================
Coroutines in LLVM
=====================================

.. contents::
   :local:
   :depth: 3

.. warning::
   This is a work in progress. Compatibility across LLVM releases is not
   guaranteed.

Introduction
============

.. _coroutine handle:

LLVM coroutines are functions that have one or more `suspend points`_.
When a suspend point is reached, the execution of a coroutine is suspended and
control is returned back to its caller. A suspended coroutine can be resumed
to continue execution from the last suspend point or it can be destroyed.

In the following example, we call function `f` (which may or may not be a
coroutine itself) that returns a handle to a suspended coroutine
(**coroutine handle**) that is used by `main` to resume the coroutine twice and
then destroy it:

.. code-block:: llvm

  define i32 @main() {
  entry:
    %hdl = call i8* @f(i32 4)
    call void @llvm.coro.resume(i8* %hdl)
    call void @llvm.coro.resume(i8* %hdl)
    call void @llvm.coro.destroy(i8* %hdl)
    ret i32 0
  }

.. _coroutine frame:

In addition to the function stack frame which exists when a coroutine is
executing, there is an additional region of storage that contains objects that
keep the coroutine state when a coroutine is suspended. This region of storage
is called the **coroutine frame**. It is created when a coroutine is called
and destroyed when a coroutine either runs to completion or is destroyed
while suspended.

LLVM currently supports two styles of coroutine lowering. These styles
support substantially different sets of features, have substantially
different ABIs, and expect substantially different patterns of frontend
code generation. However, the styles also have a great deal in common.

In all cases, an LLVM coroutine is initially represented as an ordinary LLVM
function that has calls to `coroutine intrinsics`_ defining the structure of
the coroutine. The coroutine function is then, in the most general case,
rewritten by the coroutine lowering passes to become the "ramp function",
the initial entrypoint of the coroutine, which executes until a suspend point
is first reached. The remainder of the original coroutine function is split
out into some number of "resume functions". Any state which must persist
across suspensions is stored in the coroutine frame. The resume functions
must somehow be able to handle either a "normal" resumption, which continues
the normal execution of the coroutine, or an "abnormal" resumption, which
must unwind the coroutine without attempting to suspend it.

Switched-Resume Lowering
------------------------

In LLVM's standard switched-resume lowering, signaled by the use of
`llvm.coro.id`, the coroutine frame is stored as part of a "coroutine
object" which represents a handle to a particular invocation of the
coroutine. All coroutine objects support a common ABI allowing certain
features to be used without knowing anything about the coroutine's
implementation:

- A coroutine object can be queried to see if it has reached completion
  with `llvm.coro.done`.

- A coroutine object can be resumed normally if it has not already reached
  completion with `llvm.coro.resume`.

- A coroutine object can be destroyed, invalidating the coroutine object,
  with `llvm.coro.destroy`. This must be done separately even if the
  coroutine has reached completion normally.

- "Promise" storage, which is known to have a certain size and alignment,
  can be projected out of the coroutine object with `llvm.coro.promise`.
  The coroutine implementation must have been compiled to define a promise
  of the same size and alignment.

In general, interacting with a coroutine object in any of these ways while
it is running has undefined behavior.
92
93The coroutine function is split into three functions, representing three
94different ways that control can enter the coroutine:
95
961. the ramp function that is initially invoked, which takes arbitrary
97 arguments and returns a pointer to the coroutine object;
98
992. a coroutine resume function that is invoked when the coroutine is resumed,
100 which takes a pointer to the coroutine object and returns `void`;
101
1023. a coroutine destroy function that is invoked when the coroutine is
103 destroyed, which takes a pointer to the coroutine object and returns
104 `void`.
105
106Because the resume and destroy functions are shared across all suspend
107points, suspend points must store the index of the active suspend in
108the coroutine object, and the resume/destroy functions must switch over
109that index to get back to the correct point. Hence the name of this
110lowering.
111
112Pointers to the resume and destroy functions are stored in the coroutine
113object at known offsets which are fixed for all coroutines. A completed
114coroutine is represented with a null resume function.
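
The common object ABI described above can be modeled in plain C++. This is a
minimal sketch, not LLVM's implementation: the names `CoroHeader`,
`model_coro_done`, `model_coro_resume`, `model_coro_destroy` and `OneShot` are
hypothetical, and the only facts taken from the text are that the resume and
destroy pointers sit at fixed offsets and that a null resume pointer marks a
completed coroutine.

.. code-block:: c++

  #include <cassert>

  // Hypothetical model of the prefix shared by all switched-resume coroutine
  // objects: two function pointers at fixed offsets.
  struct CoroHeader {
    void (*resume)(CoroHeader *);   // null once the coroutine has completed
    void (*destroy)(CoroHeader *);
  };

  // Stand-ins for llvm.coro.done, llvm.coro.resume and llvm.coro.destroy.
  inline bool model_coro_done(const CoroHeader *h) {
    return h->resume == nullptr;
  }

  inline void model_coro_resume(CoroHeader *h) {
    assert(h->resume != nullptr && "must not resume a completed coroutine");
    h->resume(h);
  }

  inline void model_coro_destroy(CoroHeader *h) { h->destroy(h); }

  // Toy coroutine object that runs to completion on its first resumption.
  struct OneShot {
    CoroHeader header;  // first member, so a handle can point at the object
    bool destroyed;
  };

  inline void one_shot_resume(CoroHeader *h) {
    h->resume = nullptr;  // mark completion
  }

  inline void one_shot_destroy(CoroHeader *h) {
    // Valid because `header` is the first member of a standard-layout struct.
    reinterpret_cast<OneShot *>(h)->destroyed = true;
  }

  inline OneShot make_one_shot() {
    return OneShot{{one_shot_resume, one_shot_destroy}, false};
  }

Because the caller only ever touches the header, it can query, resume, and
destroy the object without knowing anything else about the coroutine. Note that
destruction is still a separate step after completion.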
115
116There is a somewhat complex protocol of intrinsics for allocating and
117deallocating the coroutine object. It is complex in order to allow the
118allocation to be elided due to inlining. This protocol is discussed
119in further detail below.
120
The frontend may generate code to call the coroutine function directly;
this will become a call to the ramp function and will return a pointer
to the coroutine object. The frontend should always resume or destroy
the coroutine using the corresponding intrinsics.
125
126Returned-Continuation Lowering
127------------------------------
128
129In returned-continuation lowering, signaled by the use of
130`llvm.coro.id.retcon` or `llvm.coro.id.retcon.once`, some aspects of
131the ABI must be handled more explicitly by the frontend.
132
133In this lowering, every suspend point takes a list of "yielded values"
134which are returned back to the caller along with a function pointer,
135called the continuation function. The coroutine is resumed by simply
136calling this continuation function pointer. The original coroutine
137is divided into the ramp function and then an arbitrary number of
138these continuation functions, one for each suspend point.
139
140LLVM actually supports two closely-related returned-continuation
141lowerings:
142
143- In normal returned-continuation lowering, the coroutine may suspend
144 itself multiple times. This means that a continuation function
145 itself returns another continuation pointer, as well as a list of
146 yielded values.
147
  The coroutine indicates that it has run to completion by returning
  a null continuation pointer. Any yielded values will be `undef` and
  should be ignored.
151
152- In yield-once returned-continuation lowering, the coroutine must
153 suspend itself exactly once (or throw an exception). The ramp
154 function returns a continuation function pointer and yielded
155 values, but the continuation function simply returns `void`
156 when the coroutine has run to completion.
157
158The coroutine frame is maintained in a fixed-size buffer that is
159passed to the `coro.id` intrinsic, which guarantees a certain size
160and alignment statically. The same buffer must be passed to the
161continuation function(s). The coroutine will allocate memory if the
162buffer is insufficient, in which case it will need to store at
163least that pointer in the buffer; therefore the buffer must always
164be at least pointer-sized. How the coroutine uses the buffer may
165vary between suspend points.
166
167In addition to the buffer pointer, continuation functions take an
168argument indicating whether the coroutine is being resumed normally
169(zero) or abnormally (non-zero).
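
The calling convention of normal returned-continuation lowering can be sketched
in C++ as follows. Everything here is an illustrative assumption rather than
LLVM output: the `YieldResult`, `Continuation` and `Buffer` types and the
`countdown_*` functions are hypothetical, the coroutine yields a single `int`,
and its state is small enough to fit in the pointer-sized buffer, so no
allocation is modeled.

.. code-block:: c++

  #include <cassert>
  #include <cstring>

  struct YieldResult;
  // A continuation takes the buffer and a flag that is true for an abnormal
  // resumption, and returns the next continuation plus the yielded value.
  using Continuation = YieldResult (*)(void *buffer, bool abnormal);

  struct YieldResult {
    Continuation next;  // null once the coroutine has run to completion
    int value;          // yielded value; must be ignored when next is null
  };

  // Fixed-size, pointer-aligned buffer supplied by the caller.
  struct alignas(void *) Buffer {
    unsigned char bytes[sizeof(void *)];
  };
  static_assert(sizeof(int) <= sizeof(Buffer), "state must fit in the buffer");

  inline int load_state(void *buffer) {
    int v;
    std::memcpy(&v, buffer, sizeof v);
    return v;
  }

  inline void store_state(void *buffer, int v) {
    std::memcpy(buffer, &v, sizeof v);
  }

  // Continuation function: yields the next smaller value, or completes.
  YieldResult countdown_continue(void *buffer, bool abnormal) {
    int remaining = load_state(buffer) - 1;
    if (abnormal || remaining == 0)
      return {nullptr, 0};  // finished or unwound without suspending again
    store_state(buffer, remaining);
    return {countdown_continue, remaining};
  }

  // Ramp function: runs until the first suspend point and yields n.
  YieldResult countdown_ramp(void *buffer, int n) {
    assert(n > 0);
    store_state(buffer, n);
    return {countdown_continue, n};
  }

A caller drives the coroutine by repeatedly calling the returned pointer with
the same buffer until it receives a null continuation.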
170
171LLVM is currently ineffective at statically eliminating allocations
172after fully inlining returned-continuation coroutines into a caller.
173This may be acceptable if LLVM's coroutine support is primarily being
174used for low-level lowering and inlining is expected to be applied
175earlier in the pipeline.
David Majnemera6539272016-07-23 04:05:08 +0000176
177Coroutines by Example
178=====================
179
John McCall94010b22019-08-14 03:53:17 +0000180The examples below are all of switched-resume coroutines.
181
David Majnemera6539272016-07-23 04:05:08 +0000182Coroutine Representation
183------------------------
184
185Let's look at an example of an LLVM coroutine with the behavior sketched
186by the following pseudo-code.
187
Sanjoy Das77a9c792016-07-26 21:03:41 +0000188.. code-block:: c++
David Majnemera6539272016-07-23 04:05:08 +0000189
190 void *f(int n) {
191 for(;;) {
192 print(n++);
193 <suspend> // returns a coroutine handle on first suspend
194 }
195 }
196
This coroutine calls some function `print` with value `n` as an argument and
suspends execution. Every time this coroutine resumes, it calls `print` again
with an argument one bigger than the last time. This coroutine never completes
by itself and must be destroyed explicitly. If we use this coroutine with the
`main` shown in the previous section, it will call `print` with values 4, 5,
and 6, after which the coroutine will be destroyed.
201
202The LLVM IR for this coroutine looks like this:
203
Gor Nishanov06fdf482017-04-05 05:26:26 +0000204.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000205
206 define i8* @f(i32 %n) {
207 entry:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000208 %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
David Majnemera6539272016-07-23 04:05:08 +0000209 %size = call i32 @llvm.coro.size.i32()
210 %alloc = call i8* @malloc(i32 %size)
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000211 %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
David Majnemera6539272016-07-23 04:05:08 +0000212 br label %loop
213 loop:
214 %n.val = phi i32 [ %n, %entry ], [ %inc, %loop ]
215 %inc = add nsw i32 %n.val, 1
216 call void @print(i32 %n.val)
217 %0 = call i8 @llvm.coro.suspend(token none, i1 false)
218 switch i8 %0, label %suspend [i8 0, label %loop
219 i8 1, label %cleanup]
220 cleanup:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000221 %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
David Majnemera6539272016-07-23 04:05:08 +0000222 call void @free(i8* %mem)
223 br label %suspend
224 suspend:
Gor Nishanovc52006a2017-03-07 21:00:54 +0000225 %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
David Majnemera6539272016-07-23 04:05:08 +0000226 ret i8* %hdl
227 }
228
The `entry` block establishes the coroutine frame. The `coro.size`_ intrinsic is
lowered to a constant representing the size required for the coroutine frame.
The `coro.begin`_ intrinsic initializes the coroutine frame and returns the
coroutine handle. The second parameter of `coro.begin` is given a block of memory
to be used if the coroutine frame needs to be allocated dynamically.
The `coro.id`_ intrinsic serves as the coroutine identity, which is useful in
cases when the `coro.begin`_ intrinsic gets duplicated by optimization passes
such as jump-threading.

The `cleanup` block destroys the coroutine frame. The `coro.free`_ intrinsic,
given the coroutine handle, returns a pointer to the memory block to be freed or
`null` if the coroutine frame was not allocated dynamically. The `cleanup`
block is entered when the coroutine runs to completion by itself or is destroyed
via a call to the `coro.destroy`_ intrinsic.

The `suspend` block contains code to be executed when the coroutine runs to
completion or is suspended. The `coro.end`_ intrinsic marks the point where
a coroutine needs to return control back to the caller if it is not an initial
invocation of the coroutine.

The `loop` block represents the body of the coroutine. The `coro.suspend`_
intrinsic in combination with the following switch indicates what happens to
control flow when a coroutine is suspended (default case), resumed (case 0) or
destroyed (case 1).
253
254Coroutine Transformation
255------------------------
256
One of the steps of coroutine lowering is building the coroutine frame. The
def-use chains are analyzed to determine which objects need to be kept alive
across suspend points. In the coroutine shown in the previous section, the use
of virtual register `%n.val` is separated from its definition by a suspend
point. It therefore cannot reside in the stack frame, since the latter goes
away once the coroutine is suspended and control is returned to the caller. An
i32 slot is allocated in the coroutine frame and `%n.val` is spilled to and
reloaded from that slot as needed.
265
266We also store addresses of the resume and destroy functions so that the
267`coro.resume` and `coro.destroy` intrinsics can resume and destroy the coroutine
268when its identity cannot be determined statically at compile time. For our
269example, the coroutine frame will be:
270
Gor Nishanov06fdf482017-04-05 05:26:26 +0000271.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000272
273 %f.frame = type { void (%f.frame*)*, void (%f.frame*)*, i32 }
274
275After resume and destroy parts are outlined, function `f` will contain only the
276code responsible for creation and initialization of the coroutine frame and
277execution of the coroutine until a suspend point is reached:
278
.. code-block:: llvm

  define i8* @f(i32 %n) {
  entry:
    %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
    %alloc = call noalias i8* @malloc(i32 24)
    %0 = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
    %frame = bitcast i8* %0 to %f.frame*
    %1 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 0
    store void (%f.frame*)* @f.resume, void (%f.frame*)** %1
    %2 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 1
    store void (%f.frame*)* @f.destroy, void (%f.frame*)** %2

    %inc = add nsw i32 %n, 1
    %inc.spill.addr = getelementptr inbounds %f.frame, %f.frame* %frame, i32 0, i32 2
    store i32 %inc, i32* %inc.spill.addr
    call void @print(i32 %n)

    ret i8* %0
  }

300Outlined resume part of the coroutine will reside in function `f.resume`:
301
.. code-block:: llvm

  define internal fastcc void @f.resume(%f.frame* %frame.ptr.resume) {
  entry:
    %inc.spill.addr = getelementptr %f.frame, %f.frame* %frame.ptr.resume, i64 0, i32 2
    ; Reload the value spilled before the last suspend; it plays the role of
    ; %n.val for this iteration.
    %inc.spill = load i32, i32* %inc.spill.addr, align 4
    %inc = add i32 %inc.spill, 1
    store i32 %inc, i32* %inc.spill.addr, align 4
    tail call void @print(i32 %inc.spill)
    ret void
  }

314Whereas function `f.destroy` will contain the cleanup code for the coroutine:
315
316.. code-block:: llvm
317
318 define internal fastcc void @f.destroy(%f.frame* %frame.ptr.destroy) {
319 entry:
320 %0 = bitcast %f.frame* %frame.ptr.destroy to i8*
321 tail call void @free(i8* %0)
322 ret void
323 }
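
For readers more comfortable with C++ than with IR, the three outlined functions
can be approximated by hand. This is only a sketch of the idea under stated
assumptions: `FFrame`, `f_resume`, `f_destroy` and the recording `print` helper
are hypothetical names, `new` and `delete` stand in for `malloc` and `free`,
and the frame holds exactly the two function pointers and one `int` spill slot
shown in `%f.frame`.

.. code-block:: c++

  #include <cassert>
  #include <vector>

  // Records what the coroutine "prints" so the behavior can be checked.
  inline std::vector<int> &printed() {
    static std::vector<int> log;
    return log;
  }
  inline void print(int v) { printed().push_back(v); }

  // Mirrors %f.frame = type { resume fn, destroy fn, i32 }.
  struct FFrame {
    void (*resume)(FFrame *);
    void (*destroy)(FFrame *);
    int inc;  // spill slot for the value that is live across the suspend
  };

  // Resume part: reload the spilled value, spill the next one, print.
  inline void f_resume(FFrame *frame) {
    int n_val = frame->inc;
    frame->inc = n_val + 1;
    print(n_val);
  }

  // Destroy part: only the cleanup code.
  inline void f_destroy(FFrame *frame) { delete frame; }

  // Ramp function: creates the frame and runs to the first suspend point.
  inline FFrame *f(int n) {
    FFrame *frame = new FFrame{f_resume, f_destroy, n + 1};
    print(n);
    return frame;
  }

Calling `f(4)`, resuming twice through the stored `resume` pointer, and then
calling the stored `destroy` pointer reproduces the sequence performed by the
`main` function from the introduction.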
324
325Avoiding Heap Allocations
326-------------------------
327
A particular coroutine usage pattern, illustrated by the `main` function in
the overview section, is one where a coroutine is created, manipulated, and
destroyed by the same calling function. This pattern is common for coroutines
implementing the RAII idiom and is suitable for the allocation elision
optimization, which avoids dynamic allocation by storing the coroutine frame as
a static `alloca` in its caller.

In the entry block, we will call the `coro.alloc`_ intrinsic, which returns
`true` when dynamic allocation is required and `false` if dynamic allocation
is elided.
David Majnemera6539272016-07-23 04:05:08 +0000338
Gor Nishanov06fdf482017-04-05 05:26:26 +0000339.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000340
341 entry:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000342 %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000343 %need.dyn.alloc = call i1 @llvm.coro.alloc(token %id)
344 br i1 %need.dyn.alloc, label %dyn.alloc, label %coro.begin
David Majnemera6539272016-07-23 04:05:08 +0000345 dyn.alloc:
346 %size = call i32 @llvm.coro.size.i32()
347 %alloc = call i8* @CustomAlloc(i32 %size)
348 br label %coro.begin
349 coro.begin:
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000350 %phi = phi i8* [ null, %entry ], [ %alloc, %dyn.alloc ]
351 %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %phi)
David Majnemera6539272016-07-23 04:05:08 +0000352
353In the cleanup block, we will make freeing the coroutine frame conditional on
354`coro.free`_ intrinsic. If allocation is elided, `coro.free`_ returns `null`
355thus skipping the deallocation code:
356
Gor Nishanov06fdf482017-04-05 05:26:26 +0000357.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000358
359 cleanup:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000360 %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
David Majnemera6539272016-07-23 04:05:08 +0000361 %need.dyn.free = icmp ne i8* %mem, null
362 br i1 %need.dyn.free, label %dyn.free, label %if.end
363 dyn.free:
364 call void @CustomFree(i8* %mem)
365 br label %if.end
366 if.end:
367 ...
368
With allocations and deallocations represented as described above, after the
coroutine heap allocation elision optimization, the resulting `main` will be:
David Majnemera6539272016-07-23 04:05:08 +0000371
372.. code-block:: llvm
373
374 define i32 @main() {
375 entry:
376 call void @print(i32 4)
377 call void @print(i32 5)
378 call void @print(i32 6)
379 ret i32 0
380 }
381
382Multiple Suspend Points
383-----------------------
384
Let's consider a coroutine that has more than one suspend point:
386
Sanjoy Das77a9c792016-07-26 21:03:41 +0000387.. code-block:: c++
David Majnemera6539272016-07-23 04:05:08 +0000388
389 void *f(int n) {
390 for(;;) {
391 print(n++);
392 <suspend>
393 print(-n);
394 <suspend>
395 }
396 }
397
398Matching LLVM code would look like (with the rest of the code remaining the same
399as the code in the previous section):
400
Gor Nishanov06fdf482017-04-05 05:26:26 +0000401.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000402
403 loop:
404 %n.addr = phi i32 [ %n, %entry ], [ %inc, %loop.resume ]
405 call void @print(i32 %n.addr) #4
406 %2 = call i8 @llvm.coro.suspend(token none, i1 false)
407 switch i8 %2, label %suspend [i8 0, label %loop.resume
408 i8 1, label %cleanup]
409 loop.resume:
410 %inc = add nsw i32 %n.addr, 1
411 %sub = xor i32 %n.addr, -1
412 call void @print(i32 %sub)
413 %3 = call i8 @llvm.coro.suspend(token none, i1 false)
414 switch i8 %3, label %suspend [i8 0, label %loop
415 i8 1, label %cleanup]
416
417In this case, the coroutine frame would include a suspend index that will
418indicate at which suspend point the coroutine needs to resume. The resume
419function will use an index to jump to an appropriate basic block and will look
420as follows:
421
422.. code-block:: llvm
423
424 define internal fastcc void @f.Resume(%f.Frame* %FramePtr) {
425 entry.Resume:
426 %index.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i64 0, i32 2
427 %index = load i8, i8* %index.addr, align 1
428 %switch = icmp eq i8 %index, 0
429 %n.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i64 0, i32 3
430 %n = load i32, i32* %n.addr, align 4
431 br i1 %switch, label %loop.resume, label %loop
432
433 loop.resume:
434 %sub = xor i32 %n, -1
435 call void @print(i32 %sub)
436 br label %suspend
437 loop:
438 %inc = add nsw i32 %n, 1
439 store i32 %inc, i32* %n.addr, align 4
440 tail call void @print(i32 %inc)
441 br label %suspend
442
443 suspend:
444 %storemerge = phi i8 [ 0, %loop ], [ 1, %loop.resume ]
445 store i8 %storemerge, i8* %index.addr, align 1
446 ret void
447 }
448
449If different cleanup code needs to get executed for different suspend points,
450a similar switch will be in the `f.destroy` function.
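
The role of the suspend index can also be shown with a hand-written C++ model of
the two-suspend-point coroutine. This is a sketch under assumptions, not the
code LLVM generates: `FFrame`, `f_resume`, `f_destroy` and the recording `print`
helper are hypothetical, and the index values 0 and 1 are chosen here for
readability rather than copied from the IR above.

.. code-block:: c++

  #include <cassert>
  #include <vector>

  // Records what the coroutine "prints" so the behavior can be checked.
  inline std::vector<int> &printed() {
    static std::vector<int> log;
    return log;
  }
  inline void print(int v) { printed().push_back(v); }

  struct FFrame {
    void (*resume)(FFrame *);
    void (*destroy)(FFrame *);
    unsigned char index;  // which suspend point the coroutine is parked at
    int n;                // value of n that is live across the suspends
  };

  // A single resume function serves both suspend points by switching on the
  // stored index, then records the index of the next suspend point.
  inline void f_resume(FFrame *frame) {
    switch (frame->index) {
    case 0:  // parked after print(n++): continue with print(-n)
      print(-frame->n);
      frame->index = 1;
      break;
    case 1:  // parked after print(-n): loop back to print(n++)
      print(frame->n++);
      frame->index = 0;
      break;
    }
  }

  inline void f_destroy(FFrame *frame) { delete frame; }

  // Ramp function: executes print(n++) and parks at the first suspend point.
  inline FFrame *f(int n) {
    FFrame *frame = new FFrame{f_resume, f_destroy, 0, n + 1};
    print(n);
    return frame;
  }
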
451
.. note ::

  Using a suspend index in the coroutine state and having a switch in `f.resume`
  and `f.destroy` is one of the possible implementation strategies. We explored
  another option where distinct functions `f.resume1`, `f.resume2`, etc. are
  created for every suspend point and, instead of storing an index, the resume
  and destroy function pointers are updated at every suspend. Early testing
  showed that the current approach is easier on the optimizer than the latter,
  so it is the lowering strategy implemented at the moment.
461
462Distinct Save and Suspend
463-------------------------
464
In the previous example, setting a resume index (or some other state change that
needs to happen to prepare a coroutine for resumption) happens at the same time as
the suspension of the coroutine. However, in certain cases, it is necessary to
control separately when a coroutine is prepared for resumption and when it is
suspended.
469
470In the following example, a coroutine represents some activity that is driven
471by completions of asynchronous operations `async_op1` and `async_op2` which get
472a coroutine handle as a parameter and resume the coroutine once async
473operation is finished.
474
.. code-block:: text

  void g() {
     for (;;) {
       if (cond()) {
          async_op1(<coroutine-handle>); // will resume once async_op1 completes
          <suspend>
          do_one();
       }
       else {
          async_op2(<coroutine-handle>); // will resume once async_op2 completes
          <suspend>
          do_two();
       }
     }
  }

In this case, the coroutine should be ready for resumption prior to a call to
`async_op1` or `async_op2`. The `coro.save`_ intrinsic is used to indicate the
point at which the coroutine should be ready for resumption (namely, when a
resume index should be stored in the coroutine frame, so that it can be resumed
at the correct resume point):
497
.. code-block:: llvm

  if.true:
    %save1 = call token @llvm.coro.save(i8* %hdl)
    call void @async_op1(i8* %hdl)
    %suspend1 = call i8 @llvm.coro.suspend(token %save1, i1 false)
    switch i8 %suspend1, label %suspend [i8 0, label %resume1
                                         i8 1, label %cleanup]
  if.false:
    %save2 = call token @llvm.coro.save(i8* %hdl)
    call void @async_op2(i8* %hdl)
    %suspend2 = call i8 @llvm.coro.suspend(token %save2, i1 false)
    switch i8 %suspend2, label %suspend [i8 0, label %resume2
                                         i8 1, label %cleanup]

513.. _coroutine promise:
514
515Coroutine Promise
516-----------------
517
A coroutine author or a frontend may designate a distinguished `alloca` that can
be used to communicate with the coroutine. This distinguished alloca is called
the **coroutine promise** and is provided as the second parameter to the
`coro.id`_ intrinsic.

The following coroutine designates a 32-bit integer `promise` and uses it to
store the current value produced by a coroutine.
525
Gor Nishanov06fdf482017-04-05 05:26:26 +0000526.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000527
528 define i8* @f(i32 %n) {
529 entry:
530 %promise = alloca i32
531 %pv = bitcast i32* %promise to i8*
Gor Nishanovdce9b022016-08-29 14:34:12 +0000532 %id = call token @llvm.coro.id(i32 0, i8* %pv, i8* null, i8* null)
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000533 %need.dyn.alloc = call i1 @llvm.coro.alloc(token %id)
534 br i1 %need.dyn.alloc, label %dyn.alloc, label %coro.begin
David Majnemer78557192016-07-27 05:12:35 +0000535 dyn.alloc:
David Majnemera6539272016-07-23 04:05:08 +0000536 %size = call i32 @llvm.coro.size.i32()
537 %alloc = call i8* @malloc(i32 %size)
David Majnemer78557192016-07-27 05:12:35 +0000538 br label %coro.begin
539 coro.begin:
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000540 %phi = phi i8* [ null, %entry ], [ %alloc, %dyn.alloc ]
541 %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %phi)
David Majnemera6539272016-07-23 04:05:08 +0000542 br label %loop
543 loop:
David Majnemer78557192016-07-27 05:12:35 +0000544 %n.val = phi i32 [ %n, %coro.begin ], [ %inc, %loop ]
David Majnemera6539272016-07-23 04:05:08 +0000545 %inc = add nsw i32 %n.val, 1
546 store i32 %n.val, i32* %promise
547 %0 = call i8 @llvm.coro.suspend(token none, i1 false)
548 switch i8 %0, label %suspend [i8 0, label %loop
549 i8 1, label %cleanup]
550 cleanup:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000551 %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
David Majnemera6539272016-07-23 04:05:08 +0000552 call void @free(i8* %mem)
553 br label %suspend
554 suspend:
Gor Nishanovc52006a2017-03-07 21:00:54 +0000555 %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
David Majnemera6539272016-07-23 04:05:08 +0000556 ret i8* %hdl
557 }
558
559A coroutine consumer can rely on the `coro.promise`_ intrinsic to access the
560coroutine promise.
561
562.. code-block:: llvm
563
564 define i32 @main() {
565 entry:
566 %hdl = call i8* @f(i32 4)
567 %promise.addr.raw = call i8* @llvm.coro.promise(i8* %hdl, i32 4, i1 false)
568 %promise.addr = bitcast i8* %promise.addr.raw to i32*
569 %val0 = load i32, i32* %promise.addr
570 call void @print(i32 %val0)
571 call void @llvm.coro.resume(i8* %hdl)
572 %val1 = load i32, i32* %promise.addr
573 call void @print(i32 %val1)
574 call void @llvm.coro.resume(i8* %hdl)
575 %val2 = load i32, i32* %promise.addr
576 call void @print(i32 %val2)
577 call void @llvm.coro.destroy(i8* %hdl)
578 ret i32 0
579 }
580
After the example in this section is compiled, the result of the compilation
will be:
David Majnemera6539272016-07-23 04:05:08 +0000582
583.. code-block:: llvm
584
585 define i32 @main() {
586 entry:
587 tail call void @print(i32 4)
588 tail call void @print(i32 5)
589 tail call void @print(i32 6)
590 ret i32 0
591 }
592
593.. _final:
594.. _final suspend:
595
596Final Suspend
597-------------
598
599A coroutine author or a frontend may designate a particular suspend to be final,
600by setting the second argument of the `coro.suspend`_ intrinsic to `true`.
601Such a suspend point has two properties:
602
603* it is possible to check whether a suspended coroutine is at the final suspend
604 point via `coro.done`_ intrinsic;
605
606* a resumption of a coroutine stopped at the final suspend point leads to
607 undefined behavior. The only possible action for a coroutine at a final
608 suspend point is destroying it via `coro.destroy`_ intrinsic.
609
610From the user perspective, the final suspend point represents an idea of a
611coroutine reaching the end. From the compiler perspective, it is an optimization
612opportunity for reducing number of resume points (and therefore switch cases) in
613the resume function.
614
615The following is an example of a function that keeps resuming the coroutine
616until the final suspend point is reached after which point the coroutine is
617destroyed:
618
619.. code-block:: llvm
620
621 define i32 @main() {
622 entry:
623 %hdl = call i8* @f(i32 4)
624 br label %while
625 while:
626 call void @llvm.coro.resume(i8* %hdl)
627 %done = call i1 @llvm.coro.done(i8* %hdl)
628 br i1 %done, label %end, label %while
629 end:
630 call void @llvm.coro.destroy(i8* %hdl)
631 ret i32 0
632 }
633
Usually, the final suspend point is a frontend-injected suspend point that does
not correspond to any explicitly authored suspend point of the high-level
language. For example, consider a Python generator that has only one suspend
point:
637
638.. code-block:: python
639
640 def coroutine(n):
641 for i in range(n):
642 yield i
643
A Python frontend would inject two more suspend points, so that the actual code
looks like this:
646
Sanjoy Das77a9c792016-07-26 21:03:41 +0000647.. code-block:: c
David Majnemera6539272016-07-23 04:05:08 +0000648
649 void* coroutine(int n) {
650 int current_value;
651 <designate current_value to be coroutine promise>
652 <SUSPEND> // injected suspend point, so that the coroutine starts suspended
653 for (int i = 0; i < n; ++i) {
654 current_value = i; <SUSPEND>; // corresponds to "yield i"
655 }
656 <SUSPEND final=true> // injected final suspend point
657 }
658
and the Python iterator `__next__` would look like:
660
Sanjoy Das77a9c792016-07-26 21:03:41 +0000661.. code-block:: c++
David Majnemera6539272016-07-23 04:05:08 +0000662
663 int __next__(void* hdl) {
664 coro.resume(hdl);
665 if (coro.done(hdl)) throw StopIteration();
666 return *(int*)coro.promise(hdl, 4, false);
667 }
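
The generator and its `__next__` can be approximated by a self-contained C++
model. This is a sketch under assumptions rather than what a real frontend
emits: `GenFrame`, `gen_resume`, `gen_destroy` and `gen_next` are hypothetical
names, `gen_next` returns `false` instead of throwing `StopIteration`, and the
loop counter `i` doubles as the suspend index because the initial suspend and
the `yield` resume into the same loop test. Reaching the final suspend point is
modeled by nulling the resume pointer, which is what the done check looks at.

.. code-block:: c++

  #include <cassert>

  struct GenFrame {
    void (*resume)(GenFrame *);   // set to null at the final suspend point
    void (*destroy)(GenFrame *);
    int promise;                  // plays the role of current_value
    int i;
    int n;
  };

  inline void gen_resume(GenFrame *g) {
    if (g->i < g->n) {
      g->promise = g->i++;  // corresponds to "yield i"
      return;               // suspend
    }
    g->resume = nullptr;    // final suspend: the coroutine is now done
  }

  inline void gen_destroy(GenFrame *g) { delete g; }

  // Ramp function: the injected initial suspend means no body code runs yet.
  inline GenFrame *coroutine(int n) {
    return new GenFrame{gen_resume, gen_destroy, 0, 0, n};
  }

  // Model of __next__: resume, check for completion, then read the promise.
  inline bool gen_next(GenFrame *g, int &out) {
    assert(g->resume != nullptr && "must not resume past the final suspend");
    g->resume(g);
    if (g->resume == nullptr)
      return false;
    out = g->promise;
    return true;
  }

After `gen_next` reports completion, the only valid operation left on the
handle is destroying it.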
668
John McCall94010b22019-08-14 03:53:17 +0000669
David Majnemera6539272016-07-23 04:05:08 +0000670Intrinsics
671==========
672
673Coroutine Manipulation Intrinsics
674---------------------------------
675
Intrinsics described in this section are used to manipulate an existing
coroutine. They can be used in any function which happens to have a pointer
to a `coroutine frame`_ or a pointer to a `coroutine promise`_.
679
680.. _coro.destroy:
681
682'llvm.coro.destroy' Intrinsic
683^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
684
685Syntax:
686"""""""
687
688::
689
690 declare void @llvm.coro.destroy(i8* <handle>)
691
692Overview:
693"""""""""
694
695The '``llvm.coro.destroy``' intrinsic destroys a suspended
John McCall94010b22019-08-14 03:53:17 +0000696switched-resume coroutine.
David Majnemera6539272016-07-23 04:05:08 +0000697
698Arguments:
699""""""""""
700
701The argument is a coroutine handle to a suspended coroutine.
702
703Semantics:
704""""""""""
705
706When possible, the `coro.destroy` intrinsic is replaced with a direct call to
707the coroutine destroy function. Otherwise it is replaced with an indirect call
708based on the function pointer for the destroy function stored in the coroutine
709frame. Destroying a coroutine that is not suspended leads to undefined behavior.
710
711.. _coro.resume:
712
713'llvm.coro.resume' Intrinsic
714^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
715
716::
717
718 declare void @llvm.coro.resume(i8* <handle>)
719
720Overview:
721"""""""""
722
John McCall94010b22019-08-14 03:53:17 +0000723The '``llvm.coro.resume``' intrinsic resumes a suspended switched-resume coroutine.
David Majnemera6539272016-07-23 04:05:08 +0000724
725Arguments:
726""""""""""
727
728The argument is a handle to a suspended coroutine.
729
730Semantics:
731""""""""""
732
733When possible, the `coro.resume` intrinsic is replaced with a direct call to the
734coroutine resume function. Otherwise it is replaced with an indirect call based
735on the function pointer for the resume function stored in the coroutine frame.
736Resuming a coroutine that is not suspended leads to undefined behavior.
737
738.. _coro.done:
739
740'llvm.coro.done' Intrinsic
741^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
742
743::
744
745 declare i1 @llvm.coro.done(i8* <handle>)
746
747Overview:
748"""""""""
749
The '``llvm.coro.done``' intrinsic checks whether a suspended
switched-resume coroutine is at the final suspend point or not.

Arguments:
""""""""""
755
756The argument is a handle to a suspended coroutine.
757
758Semantics:
759""""""""""
760
761Using this intrinsic on a coroutine that does not have a `final suspend`_ point
762or on a coroutine that is not suspended leads to undefined behavior.
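
A typical use is a driver loop that resumes a coroutine until it reaches its
final suspend point. The following is a minimal sketch; the function name
`@drain` is hypothetical, and it assumes that the coroutine has a final suspend
point and is suspended, but not at that final suspend point, on entry.

.. code-block:: llvm

  declare void @llvm.coro.resume(i8*)
  declare void @llvm.coro.destroy(i8*)
  declare i1 @llvm.coro.done(i8*)

  define void @drain(i8* %hdl) {
  entry:
    br label %loop
  loop:
    call void @llvm.coro.resume(i8* %hdl)
    ; the coroutine is suspended again here, so coro.done may be queried
    %done = call i1 @llvm.coro.done(i8* %hdl)
    br i1 %done, label %exit, label %loop
  exit:
    call void @llvm.coro.destroy(i8* %hdl)
    ret void
  }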
763
764.. _coro.promise:
765
766'llvm.coro.promise' Intrinsic
767^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
768
769::
770
771 declare i8* @llvm.coro.promise(i8* <ptr>, i32 <alignment>, i1 <from>)
772
773Overview:
774"""""""""
775
The '``llvm.coro.promise``' intrinsic obtains a pointer to a
`coroutine promise`_ given a switched-resume coroutine handle and vice versa.

Arguments:
""""""""""

The first argument is a handle to a coroutine if `from` is false. Otherwise,
it is a pointer to a coroutine promise.

The second argument is the alignment requirement of the promise.
If a frontend designated `%promise = alloca i32` as a promise, the alignment
argument to `coro.promise` should be the alignment of `i32` on the target
platform. If a frontend designated `%promise = alloca i32, align 16` as a
promise, the alignment argument should be 16.
This argument only accepts constants.

The third argument is a boolean indicating the direction of the transformation.
If `from` is true, the intrinsic returns a coroutine handle given a pointer
to a promise. If `from` is false, the intrinsic returns a pointer to a promise
from a coroutine handle. This argument only accepts constants.
796
797Semantics:
798""""""""""
799
Using this intrinsic on a coroutine that does not have a coroutine promise
leads to undefined behavior. It is possible to read and modify the coroutine
promise of the coroutine that is currently executing. The coroutine author and
the coroutine user are responsible for making sure there are no data races.
804
805Example:
806""""""""
807
.. code-block:: llvm

  define i8* @f(i32 %n) {
  entry:
    %promise = alloca i32
    %pv = bitcast i32* %promise to i8*
    ; the second argument to coro.id points to the coroutine promise.
    %id = call token @llvm.coro.id(i32 0, i8* %pv, i8* null, i8* null)
    ...
    %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
    ...
    store i32 42, i32* %promise ; store something into the promise
    ...
    ret i8* %hdl
  }

  define i32 @main() {
  entry:
    %hdl = call i8* @f(i32 4) ; starts the coroutine and returns its handle
    %promise.addr.raw = call i8* @llvm.coro.promise(i8* %hdl, i32 4, i1 false)
    %promise.addr = bitcast i8* %promise.addr.raw to i32*
    %val = load i32, i32* %promise.addr ; load a value from the promise
    call void @print(i32 %val)
    call void @llvm.coro.destroy(i8* %hdl)
    ret i32 0
  }
833 }
834
835.. _coroutine intrinsics:
836
837Coroutine Structure Intrinsics
838------------------------------
839Intrinsics described in this section are used within a coroutine to describe
840the coroutine structure. They should not be used outside of a coroutine.
841
842.. _coro.size:
843
844'llvm.coro.size' Intrinsic
845^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
846::
847
848 declare i32 @llvm.coro.size.i32()
849 declare i64 @llvm.coro.size.i64()
850
851Overview:
852"""""""""
853
The '``llvm.coro.size``' intrinsic returns the number of bytes
required to store a `coroutine frame`_. This is only supported for
switched-resume coroutines.

Arguments:
""""""""""
860
861None
862
863Semantics:
864""""""""""
865
866The `coro.size` intrinsic is lowered to a constant representing the size of
867the coroutine frame.
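
For illustration, a frontend targeting a 64-bit platform might size a heap
allocation as in the following sketch. `@malloc` stands in for whatever
allocation function the frontend chooses, and the constant mentioned in the
comment is a made-up value.

.. code-block:: llvm

    ; after lowering, %size is a constant such as i64 24 (hypothetical value)
    %size = call i64 @llvm.coro.size.i64()
    %alloc = call i8* @malloc(i64 %size)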
868
869.. _coro.begin:
870
871'llvm.coro.begin' Intrinsic
872^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
873::
874
  declare i8* @llvm.coro.begin(token <id>, i8* <mem>)

Overview:
"""""""""

The '``llvm.coro.begin``' intrinsic returns an address of the coroutine frame.

Arguments:
""""""""""

The first argument is a token returned by a call to '``llvm.coro.id``'
identifying the coroutine.

The second argument is a pointer to a block of memory where the coroutine frame
will be stored if it is allocated dynamically. This pointer is ignored
for returned-continuation coroutines.

Semantics:
""""""""""

Depending on the alignment requirements of the objects in the coroutine frame,
or for reasons of code-generation compactness, the pointer returned from
`coro.begin` may be at an offset from the `%mem` argument. (This can be
beneficial if instructions that express relative access to data can be encoded
more compactly with small positive and negative offsets.)

A frontend should emit exactly one `coro.begin` intrinsic per coroutine.

.. _coro.free:
904
905'llvm.coro.free' Intrinsic
906^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
907::
908
  declare i8* @llvm.coro.free(token <id>, i8* <frame>)

Overview:
"""""""""

The '``llvm.coro.free``' intrinsic returns a pointer to a block of memory where
the coroutine frame is stored or `null` if this instance of a coroutine did not
use dynamically allocated memory for its coroutine frame. This intrinsic is not
supported for returned-continuation coroutines.

Arguments:
""""""""""

The first argument is a token returned by a call to '``llvm.coro.id``'
identifying the coroutine.

The second argument is a pointer to the coroutine frame. This should be the same
pointer that was returned by a prior `coro.begin` call.

Example (custom deallocation function):
"""""""""""""""""""""""""""""""""""""""

.. code-block:: llvm

  cleanup:
    %mem = call i8* @llvm.coro.free(token %id, i8* %frame)
    %mem_not_null = icmp ne i8* %mem, null
    br i1 %mem_not_null, label %if.then, label %if.end
  if.then:
    call void @CustomFree(i8* %mem)
    br label %if.end
  if.end:
    ret void

Example (standard deallocation functions):
""""""""""""""""""""""""""""""""""""""""""

.. code-block:: llvm

  cleanup:
    %mem = call i8* @llvm.coro.free(token %id, i8* %frame)
    call void @free(i8* %mem)
    ret void
952
953.. _coro.alloc:
954
955'llvm.coro.alloc' Intrinsic
956^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
957::
958
  declare i1 @llvm.coro.alloc(token <id>)

Overview:
"""""""""

The '``llvm.coro.alloc``' intrinsic returns `true` if dynamic allocation is
required to obtain memory for the coroutine frame and `false` otherwise.
This is not supported for returned-continuation coroutines.

Arguments:
""""""""""

The first argument is a token returned by a call to '``llvm.coro.id``'
identifying the coroutine.

Semantics:
""""""""""

A frontend should emit at most one `coro.alloc` intrinsic per coroutine.
The intrinsic is used to suppress dynamic allocation of the coroutine frame
when possible.

Example:
""""""""

.. code-block:: llvm

  entry:
    %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
    %dyn.alloc.required = call i1 @llvm.coro.alloc(token %id)
    br i1 %dyn.alloc.required, label %coro.alloc, label %coro.begin

  coro.alloc:
    %frame.size = call i32 @llvm.coro.size.i32()
    %alloc = call i8* @MyAlloc(i32 %frame.size)
    br label %coro.begin

  coro.begin:
    %phi = phi i8* [ null, %entry ], [ %alloc, %coro.alloc ]
    %frame = call i8* @llvm.coro.begin(token %id, i8* %phi)

.. _coro.noop:
1001
1002'llvm.coro.noop' Intrinsic
1003^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1004::
1005
1006 declare i8* @llvm.coro.noop()
1007
1008Overview:
1009"""""""""
1010
1011The '``llvm.coro.noop``' intrinsic returns an address of the coroutine frame of
1012a coroutine that does nothing when resumed or destroyed.
1013
1014Arguments:
1015""""""""""
1016
1017None
1018
1019Semantics:
1020""""""""""
1021
This intrinsic is lowered to refer to a private constant coroutine frame. The
resume and destroy handlers for this frame are empty functions that do nothing.
Note that in different translation units ``llvm.coro.noop`` may return different
pointers.
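
A minimal usage sketch, following directly from the semantics above (the value
name is hypothetical):

.. code-block:: llvm

    %noop = call i8* @llvm.coro.noop()
    call void @llvm.coro.resume(i8* %noop)    ; does nothing
    call void @llvm.coro.destroy(i8* %noop)   ; does nothing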
1025
.. _coro.frame:
1027
1028'llvm.coro.frame' Intrinsic
1029^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1030::
1031
1032 declare i8* @llvm.coro.frame()
1033
1034Overview:
1035"""""""""
1036
1037The '``llvm.coro.frame``' intrinsic returns an address of the coroutine frame of
1038the enclosing coroutine.
1039
1040Arguments:
1041""""""""""
1042
None

Semantics:
""""""""""

This intrinsic is lowered to refer to the `coro.begin`_ instruction. This is
a frontend convenience intrinsic that makes it easier to refer to the
coroutine frame.
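
In other words, inside a coroutine the two values in the following sketch refer
to the same address once the intrinsic has been lowered. `%id` and `%mem` are
assumed to come from earlier `coro.id` and allocation code.

.. code-block:: llvm

    %hdl = call i8* @llvm.coro.begin(token %id, i8* %mem)
    ...
    %frame = call i8* @llvm.coro.frame()   ; lowered to refer to %hdl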
1051
1052.. _coro.id:
1053
1054'llvm.coro.id' Intrinsic
1055^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1056::
1057
  declare token @llvm.coro.id(i32 <align>, i8* <promise>, i8* <coroaddr>,
                              i8* <fnaddrs>)

Overview:
"""""""""

The '``llvm.coro.id``' intrinsic returns a token identifying a
switched-resume coroutine.

Arguments:
""""""""""

The first argument provides information on the alignment of the memory returned
by the allocation function and given to `coro.begin` as its second argument. If
this argument is 0, the memory is assumed to be aligned to 2 * sizeof(i8*).
This argument only accepts constants.

The second argument, if not `null`, designates a particular alloca instruction
to be a `coroutine promise`_.

The third argument is `null` coming out of the frontend. The CoroEarly pass sets
this argument to point to the function this `coro.id` belongs to.

The fourth argument is `null` before the coroutine is split, and is later
replaced to point to a private global constant array containing function
pointers to the outlined resume and destroy parts of the coroutine.

Semantics:
""""""""""

The purpose of this intrinsic is to tie together `coro.id`, `coro.alloc` and
`coro.begin` belonging to the same coroutine to prevent optimization passes from
duplicating any of these instructions unless the entire body of the coroutine is
duplicated.

A frontend should emit exactly one `coro.id` intrinsic per coroutine.

.. _coro.id.retcon:
1097
1098'llvm.coro.id.retcon' Intrinsic
1099^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1100::
1101
1102 declare token @llvm.coro.id.retcon(i32 <size>, i32 <align>, i8* <buffer>,
1103 i8* <continuation prototype>,
1104 i8* <alloc>, i8* <dealloc>)
1105
1106Overview:
1107"""""""""
1108
1109The '``llvm.coro.id.retcon``' intrinsic returns a token identifying a
1110multiple-suspend returned-continuation coroutine.
1111
1112The 'result-type sequence' of the coroutine is defined as follows:
1113
1114- if the return type of the coroutine function is ``void``, it is the
1115 empty sequence;
1116
1117- if the return type of the coroutine function is a ``struct``, it is the
1118 element types of that ``struct`` in order;
1119
1120- otherwise, it is just the return type of the coroutine function.
1121
1122The first element of the result-type sequence must be a pointer type;
1123continuation functions will be coerced to this type. The rest of
1124the sequence are the 'yield types', and any suspends in the coroutine
1125must take arguments of these types.
1126
1127Arguments:
1128""""""""""
1129
1130The first and second arguments are the expected size and alignment of
1131the buffer provided as the third argument. They must be constant.
1132
1133The fourth argument must be a reference to a global function, called
1134the 'continuation prototype function'. The type, calling convention,
1135and attributes of any continuation functions will be taken from this
1136declaration. The return type of the prototype function must match the
1137return type of the current function. The first parameter type must be
1138a pointer type. The second parameter type must be an integer type;
1139it will be used only as a boolean flag.
1140
1141The fifth argument must be a reference to a global function that will
1142be used to allocate memory. It may not fail, either by returning null
1143or throwing an exception. It must take an integer and return a pointer.
1144
1145The sixth argument must be a reference to a global function that will
1146be used to deallocate memory. It must take a pointer and return ``void``.
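
The pieces fit together roughly as in the following sketch of a coroutine that
yields an `i32` at each suspend. The names `@prototype`, `@allocate`,
`@deallocate` and `@g` are hypothetical, as are the buffer size and alignment
of 8.

.. code-block:: llvm

  ; The result-type sequence of @g is (i8*, i32): a continuation pointer
  ; followed by one yield type.
  declare { i8*, i32 } @prototype(i8*, i1 zeroext)
  declare noalias i8* @allocate(i32)
  declare void @deallocate(i8*)

  define { i8*, i32 } @g(i8* %buffer, i32 %n) {
  entry:
    %id = call token @llvm.coro.id.retcon(i32 8, i32 8, i8* %buffer,
              i8* bitcast ({ i8*, i32 } (i8*, i1)* @prototype to i8*),
              i8* bitcast (i8* (i32)* @allocate to i8*),
              i8* bitcast (void (i8*)* @deallocate to i8*))
    ; the memory argument is ignored for returned-continuation coroutines
    %hdl = call i8* @llvm.coro.begin(token %id, i8* null)
    ...
  }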
1147
1148'llvm.coro.id.retcon.once' Intrinsic
1149^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1150::
1151
1152 declare token @llvm.coro.id.retcon.once(i32 <size>, i32 <align>, i8* <buffer>,
1153 i8* <prototype>,
1154 i8* <alloc>, i8* <dealloc>)
1155
1156Overview:
1157"""""""""
1158
1159The '``llvm.coro.id.retcon.once``' intrinsic returns a token identifying a
1160unique-suspend returned-continuation coroutine.
1161
1162Arguments:
1163""""""""""
1164
As for ``llvm.coro.id.retcon``, except that the return type of the
continuation prototype must be `void` instead of matching the
coroutine's return type.
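
For comparison, a yield-once coroutine would pass a prototype that returns
`void`. In this sketch `@prototype.once`, `@allocate` and `@deallocate` are
hypothetical names, and the size and alignment values are placeholders.

.. code-block:: llvm

  declare void @prototype.once(i8*, i1 zeroext)

    %id = call token @llvm.coro.id.retcon.once(i32 8, i32 8, i8* %buffer,
              i8* bitcast (void (i8*, i1)* @prototype.once to i8*),
              i8* bitcast (i8* (i32)* @allocate to i8*),
              i8* bitcast (void (i8*)* @deallocate to i8*))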
1168
.. _coro.end:
1170
1171'llvm.coro.end' Intrinsic
1172^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1173::
1174
  declare i1 @llvm.coro.end(i8* <handle>, i1 <unwind>)

Overview:
"""""""""

The '``llvm.coro.end``' marks the point where execution of the resume part of
the coroutine should end and control should return to the caller.

Arguments:
""""""""""

The first argument should refer to the coroutine handle of the enclosing
coroutine. A frontend is allowed to supply null as the first parameter, in which
case the `coro-early` pass will replace the null with an appropriate coroutine
handle value.

The second argument should be `true` if this `coro.end` is in the block that is
part of the unwind sequence leaving the coroutine body due to an exception and
`false` otherwise.

Semantics:
""""""""""

The purpose of this intrinsic is to allow frontends to mark the cleanup and
other code that is only relevant during the initial invocation of the coroutine
and should not be present in resume and destroy parts.

In returned-continuation lowering, ``llvm.coro.end`` fully destroys the
coroutine frame. If the second argument is `false`, it also returns from
the coroutine with a null continuation pointer, and the next instruction
will be unreachable. If the second argument is `true`, it falls through
so that the following logic can resume unwinding. In a yield-once
coroutine, reaching a non-unwind ``llvm.coro.end`` without having first
reached a ``llvm.coro.suspend.retcon`` has undefined behavior.

The remainder of this section describes the behavior under switched-resume
lowering.

This intrinsic is lowered when a coroutine is split into
the start, resume and destroy parts. In the start part, it is a no-op;
in the resume and destroy parts, it is replaced with a `ret void` instruction
and the rest of the block containing the `coro.end` instruction is discarded.
In landing pads it is replaced with an appropriate instruction to unwind to
the caller. The handling of `coro.end` differs depending on whether the target
is using the landingpad or the WinEH exception model.

For the landingpad-based exception model, the frontend is expected to use the
`coro.end`_ intrinsic as follows:
1223
1224.. code-block:: llvm
1225
1226 ehcleanup:
1227 %InResumePart = call i1 @llvm.coro.end(i8* null, i1 true)
1228 br i1 %InResumePart, label %eh.resume, label %cleanup.cont
1229
1230 cleanup.cont:
1231 ; rest of the cleanup
1232
1233 eh.resume:
1234 %exn = load i8*, i8** %exn.slot, align 8
1235 %sel = load i32, i32* %ehselector.slot, align 4
1236 %lpad.val = insertvalue { i8*, i32 } undef, i8* %exn, 0
1237 %lpad.val29 = insertvalue { i8*, i32 } %lpad.val, i32 %sel, 1
1238 resume { i8*, i32 } %lpad.val29
1239
The `CoroSplit` pass replaces `coro.end` with ``True`` in the resume functions,
thus leading to immediate unwind to the caller, whereas in the start function it
is replaced with ``False``, thus allowing execution to proceed to the rest of
the cleanup code that is only needed during the initial invocation of the
coroutine.

For the Windows exception handling model, a frontend should attach a funclet
bundle referring to an enclosing cleanuppad as follows:

.. code-block:: llvm

  ehcleanup:
    %tok = cleanuppad within none []
    %unused = call i1 @llvm.coro.end(i8* null, i1 true) [ "funclet"(token %tok) ]
    cleanupret from %tok unwind label %RestOfTheCleanup
1254
1255The `CoroSplit` pass, if the funclet bundle is present, will insert
1256``cleanupret from %tok unwind to caller`` before
1257the `coro.end`_ intrinsic and will remove the rest of the block.
1258
1259The following table summarizes the handling of `coro.end`_ intrinsic.
1260
+--------------------------+-------------------+-------------------------------+
|                          | In Start Function | In Resume/Destroy Functions   |
+--------------------------+-------------------+-------------------------------+
|unwind=false              | nothing           |``ret void``                   |
+------------+-------------+-------------------+-------------------------------+
|            | WinEH       | nothing           |``cleanupret unwind to caller``|
|unwind=true +-------------+-------------------+-------------------------------+
|            | Landingpad  | nothing           | nothing                       |
+------------+-------------+-------------------+-------------------------------+
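
For the `unwind=false` row, the effect of the split can be sketched as follows.
The block and value names are hypothetical and the snippets are simplified.

.. code-block:: llvm

  ; As written by the frontend:
  suspend:
    %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
    ret i8* %hdl

  ; In the start function after the split, coro.end is a no-op:
  suspend:
    ret i8* %hdl

  ; In the resume and destroy functions, coro.end becomes the return and
  ; the rest of the block is discarded:
  suspend:
    ret void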

.. _coro.suspend:
.. _suspend points:
1273
1274'llvm.coro.suspend' Intrinsic
1275^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1276::
1277
1278 declare i8 @llvm.coro.suspend(token <save>, i1 <final>)
1279
1280Overview:
1281"""""""""
1282
The '``llvm.coro.suspend``' marks the point where execution of a
switched-resume coroutine is suspended and control is returned back
to the caller. Conditional branches consuming the result of this
intrinsic lead to basic blocks where the coroutine should proceed when
suspended (-1), resumed (0) or destroyed (1).

Arguments:
""""""""""
1291
1292The first argument refers to a token of `coro.save` intrinsic that marks the
1293point when coroutine state is prepared for suspension. If `none` token is passed,
1294the intrinsic behaves as if there were a `coro.save` immediately preceding
1295the `coro.suspend` intrinsic.
1296
1297The second argument indicates whether this suspension point is `final`_.
1298The second argument only accepts constants. If more than one suspend point is
1299designated as final, the resume and destroy branches should lead to the same
1300basic blocks.
1301
1302Example (normal suspend point):
1303"""""""""""""""""""""""""""""""
1304
.. code-block:: llvm

    %0 = call i8 @llvm.coro.suspend(token none, i1 false)
    switch i8 %0, label %suspend [i8 0, label %resume
                                  i8 1, label %cleanup]
1310
1311Example (final suspend point):
1312""""""""""""""""""""""""""""""
1313
.. code-block:: llvm

  while.end:
    %s.final = call i8 @llvm.coro.suspend(token none, i1 true)
    switch i8 %s.final, label %suspend [i8 0, label %trap
                                        i8 1, label %cleanup]
  trap:
    call void @llvm.trap()
    unreachable
1323
1324Semantics:
1325""""""""""
1326
If a coroutine that was suspended at the suspend point marked by this intrinsic
is resumed via `coro.resume`_, control transfers to the basic block
of the 0-case. If it is resumed via `coro.destroy`_, it proceeds to the
basic block indicated by the 1-case. To suspend, the coroutine proceeds to the
default label.

If the suspend intrinsic is marked as final, the optimizer can consider the
resume branch (the 0-case) unreachable and can perform optimizations that take
advantage of that fact.
1335
1336.. _coro.save:
1337
1338'llvm.coro.save' Intrinsic
1339^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1340::
1341
1342 declare token @llvm.coro.save(i8* <handle>)
1343
1344Overview:
1345"""""""""
1346
The '``llvm.coro.save``' marks the point where a coroutine needs to update its
state so that it can be considered suspended (and thus eligible for
resumption).
1350
1351Arguments:
1352""""""""""
1353
1354The first argument points to a coroutine handle of the enclosing coroutine.
1355
1356Semantics:
1357""""""""""
1358
1359Whatever coroutine state changes are required to enable resumption of
1360the coroutine from the corresponding suspend point should be done at the point
1361of `coro.save` intrinsic.
1362
1363Example:
1364""""""""
1365
1366Separate save and suspend points are necessary when a coroutine is used to
1367represent an asynchronous control flow driven by callbacks representing
1368completions of asynchronous operations.
1369
1370In such a case, a coroutine should be ready for resumption prior to a call to
1371`async_op` function that may trigger resumption of a coroutine from the same or
1372a different thread possibly prior to `async_op` call returning control back
1373to the coroutine:
1374
.. code-block:: llvm

    %save1 = call token @llvm.coro.save(i8* %hdl)
    call void @async_op1(i8* %hdl)
    %suspend1 = call i8 @llvm.coro.suspend(token %save1, i1 false)
    switch i8 %suspend1, label %suspend [i8 0, label %resume1
                                         i8 1, label %cleanup]

.. _coro.suspend.retcon:
1384
1385'llvm.coro.suspend.retcon' Intrinsic
1386^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1387::
1388
1389 declare i1 @llvm.coro.suspend.retcon(...)
1390
1391Overview:
1392"""""""""
1393
1394The '``llvm.coro.suspend.retcon``' intrinsic marks the point where
1395execution of a returned-continuation coroutine is suspended and control
1396is returned back to the caller.
1397
``llvm.coro.suspend.retcon`` does not support separate save points;
they are not useful when the continuation function is not locally
accessible. That would be a more appropriate feature for a ``passcon``
lowering that is not yet implemented.
1402
1403Arguments:
1404""""""""""
1405
1406The types of the arguments must exactly match the yielded-types sequence
1407of the coroutine. They will be turned into return values from the ramp
1408and continuation functions, along with the next continuation function.
1409
1410Semantics:
1411""""""""""
1412
1413The result of the intrinsic indicates whether the coroutine should resume
1414abnormally (non-zero).
1415
1416In a normal coroutine, it is undefined behavior if the coroutine executes
1417a call to ``llvm.coro.suspend.retcon`` after resuming abnormally.
1418
1419In a yield-once coroutine, it is undefined behavior if the coroutine
1420executes a call to ``llvm.coro.suspend.retcon`` after resuming in any way.
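
A loop that suspends once per iteration might look like the following sketch.
It assumes a coroutine with no yield types (so the intrinsic takes no
arguments), a handle `%hdl` obtained from `coro.begin`, and a hypothetical
`@print` function. The ``.i1`` suffix reflects the intrinsic name being mangled
on its result type, which is an assumption relative to the declaration shown
above.

.. code-block:: llvm

  loop:
    %n.val = phi i32 [ %n, %entry ], [ %inc, %resume ]
    call void @print(i32 %n.val)
    %unwind = call i1 (...) @llvm.coro.suspend.retcon.i1()
    br i1 %unwind, label %cleanup, label %resume

  resume:                       ; normal resumption (result is zero)
    %inc = add i32 %n.val, 1
    br label %loop

  cleanup:                      ; abnormal resumption (result is non-zero)
    %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
    unreachable

Note that the `cleanup` path never reaches another suspend, in line with the
rule about suspending after an abnormal resumption.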
1421
.. _coro.param:
1423
1424'llvm.coro.param' Intrinsic
1425^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1426::
1427
1428 declare i1 @llvm.coro.param(i8* <original>, i8* <copy>)
1429
1430Overview:
1431"""""""""
1432
The '``llvm.coro.param``' is used by a frontend to mark up the code used to
construct and destruct copies of the parameters. If the optimizer discovers that
a particular parameter copy is not used after any suspends, it can remove the
construction and destruction of the copy by replacing the corresponding
`coro.param` with `i1 false` and replacing any use of the `copy` with the
`original`.
1438
1439Arguments:
1440""""""""""
1441
1442The first argument points to an `alloca` storing the value of a parameter to a
1443coroutine.
1444
1445The second argument points to an `alloca` storing the value of the copy of that
1446parameter.
1447
1448Semantics:
1449""""""""""
1450
1451The optimizer is free to always replace this intrinsic with `i1 true`.
1452
1453The optimizer is also allowed to replace it with `i1 false` provided that the
1454parameter copy is only used prior to control flow reaching any of the suspend
1455points. The code that would be DCE'd if the `coro.param` is replaced with
1456`i1 false` is not considered to be a use of the parameter copy.
1457
1458The frontend can emit this intrinsic if its language rules allow for this
1459optimization.
1460
1461Example:
1462""""""""
Consider the following example. A coroutine takes two parameters `a` and `b`
of a type that has a destructor and a move constructor.

.. code-block:: c++

  struct A { ~A(); A(A&&); bool foo(); void bar(); };

  task<int> f(A a, A b) {
    if (a.foo())
      co_return 42;

    a.bar();
    co_await read_async(); // introduces suspend point
    b.bar();
  }

Note that `b` is used after a suspend point and thus must be copied
into a coroutine frame, whereas `a` does not have to be, since it is never used
after a suspend.
1482
1483A frontend can create parameter copies for `a` and `b` as follows:
1484
.. code-block:: text

  task<int> f(A a', A b') {
    a = alloca A;
    b = alloca A;
    // move parameters to their copies
    if (coro.param(a', a)) A::A(a, A&& a');
    if (coro.param(b', b)) A::A(b, A&& b');
    ...
    // destroy parameter copies
    if (coro.param(a', a)) A::~A(a);
    if (coro.param(b', b)) A::~A(b);
  }
1498
The optimizer can replace `coro.param(a', a)` with `i1 false` and replace all
uses of `a` with `a'`, since it is not used after suspend.

The optimizer must replace `coro.param(b', b)` with `i1 true`, since `b` is used
after suspend and therefore has to reside in the coroutine frame.
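
At the IR level, the guarded copy construction for `b` could look roughly like
the following sketch. The type `%struct.A`, the constructor symbol
`@A_move_ctor`, and the block names are hypothetical placeholders, and `%b` is
assumed to point to the storage of the incoming parameter.

.. code-block:: llvm

  entry:
    %b.copy = alloca %struct.A
    %b.orig.raw = bitcast %struct.A* %b to i8*
    %b.copy.raw = bitcast %struct.A* %b.copy to i8*
    %need.copy = call i1 @llvm.coro.param(i8* %b.orig.raw, i8* %b.copy.raw)
    br i1 %need.copy, label %copy, label %cont

  copy:
    ; move-construct the copy from the original parameter
    call void @A_move_ctor(%struct.A* %b.copy, %struct.A* %b)
    br label %cont

  cont:
    ...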
1504
1505Coroutine Transformation Passes
1506===============================
1507CoroEarly
1508---------
The pass CoroEarly lowers coroutine intrinsics that hide the details of the
structure of the coroutine frame but otherwise do not need to be preserved to
help later coroutine passes. This pass lowers the `coro.frame`_, `coro.done`_,
and `coro.promise`_ intrinsics.
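
As an illustration of what such a lowering can look like, `coro.done` could be
expanded into a null check on the resume function pointer. This sketch assumes
that the resume function pointer is the first field of the coroutine frame and
that it is set to null when the coroutine reaches its final suspend point; the
value names are hypothetical.

.. code-block:: llvm

    ; Sketch of lowering: %done = call i1 @llvm.coro.done(i8* %hdl)
    %resume.addr = bitcast i8* %hdl to i8**
    %resume.fn = load i8*, i8** %resume.addr
    %done = icmp eq i8* %resume.fn, null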
1513
1514.. _CoroSplit:
1515
1516CoroSplit
1517---------
The pass CoroSplit builds the coroutine frame and outlines the resume and
destroy parts into separate functions.
1520
1521CoroElide
1522---------
The pass CoroElide examines whether the inlined coroutine is eligible for the
heap allocation elision optimization. If so, it replaces the
`coro.begin` intrinsic with the address of a coroutine frame placed in its
caller and replaces the `coro.alloc` and `coro.free` intrinsics with `false` and
`null` respectively to remove the allocation and deallocation code.
This pass also replaces `coro.resume` and `coro.destroy` intrinsics with direct
calls to resume and destroy functions for a particular coroutine where possible.
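
The effect of the elision can be sketched on ramp code that has been inlined
into a caller. The frame type `%f.frame` and all value names are hypothetical,
and the snippets are simplified.

.. code-block:: llvm

  ; Before CoroElide (inlined ramp code):
    %need.alloc = call i1 @llvm.coro.alloc(token %id)
    ...
    %hdl = call i8* @llvm.coro.begin(token %id, i8* %mem)
    ...
    %to.free = call i8* @llvm.coro.free(token %id, i8* %hdl)

  ; After CoroElide, when the frame can live in the caller:
    %frame.storage = alloca %f.frame
    %hdl = bitcast %f.frame* %frame.storage to i8*
    ; uses of %need.alloc see false, so the allocation branch is dead
    ; uses of %to.free see null, so the deallocation branch is dead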
1530
1531CoroCleanup
1532-----------
1533This pass runs late to lower all coroutine related intrinsics not replaced by
1534earlier passes.
1535
Areas Requiring Attention
=========================
#. A coroutine frame is bigger than it could be. Adding stack packing and
   stack-coloring-like optimizations on the coroutine frame will result in
   tighter coroutine frames.

#. Take advantage of the lifetime intrinsics for the data that goes into the
   coroutine frame. Leave lifetime intrinsics as is for the data that stays in
   allocas.

#. The CoroElide optimization pass relies on the coroutine ramp function being
   inlined. It would be beneficial to split the ramp function further to
   increase the chance that it will get inlined into its caller.
1549
1550#. Design a convention that would make it possible to apply coroutine heap
1551 elision optimization across ABI boundaries.
1552
1553#. Cannot handle coroutines with `inalloca` parameters (used in x86 on Windows).
1554
1555#. Alignment is ignored by coro.begin and coro.free intrinsics.
1556
1557#. Make required changes to make sure that coroutine optimizations work with
1558 LTO.
1559
1560#. More tests, more tests, more tests