blob: f3667585c6c26213f8e043161b81e20602ce3e80 [file] [log] [blame]
David Majnemera6539272016-07-23 04:05:08 +00001=====================================
2Coroutines in LLVM
3=====================================
4
5.. contents::
6 :local:
7 :depth: 3
8
9.. warning::
10 This is a work in progress. Compatibility across LLVM releases is not
11 guaranteed.
12
13Introduction
14============
15
16.. _coroutine handle:
17
18LLVM coroutines are functions that have one or more `suspend points`_.
19When a suspend point is reached, the execution of a coroutine is suspended and
20control is returned back to its caller. A suspended coroutine can be resumed
21to continue execution from the last suspend point or it can be destroyed.
22
23In the following example, we call function `f` (which may or may not be a
24coroutine itself) that returns a handle to a suspended coroutine
25(**coroutine handle**) that is used by `main` to resume the coroutine twice and
26then destroy it:
27
28.. code-block:: llvm
29
30 define i32 @main() {
31 entry:
32 %hdl = call i8* @f(i32 4)
33 call void @llvm.coro.resume(i8* %hdl)
34 call void @llvm.coro.resume(i8* %hdl)
35 call void @llvm.coro.destroy(i8* %hdl)
36 ret i32 0
37 }
38
39.. _coroutine frame:
40
41In addition to the function stack frame which exists when a coroutine is
42executing, there is an additional region of storage that contains objects that
43keep the coroutine state when a coroutine is suspended. This region of storage
44is called **coroutine frame**. It is created when a coroutine is called and
45destroyed when a coroutine runs to completion or destroyed by a call to
46the `coro.destroy`_ intrinsic.
47
48An LLVM coroutine is represented as an LLVM function that has calls to
49`coroutine intrinsics`_ defining the structure of the coroutine.
50After lowering, a coroutine is split into several
51functions that represent three different ways of how control can enter the
52coroutine:
53
541. a ramp function, which represents an initial invocation of the coroutine that
55 creates the coroutine frame and executes the coroutine code until it
56 encounters a suspend point or reaches the end of the function;
57
582. a coroutine resume function that is invoked when the coroutine is resumed;
59
603. a coroutine destroy function that is invoked when the coroutine is destroyed.
61
62.. note:: Splitting out resume and destroy functions are just one of the
63 possible ways of lowering the coroutine. We chose it for initial
64 implementation as it matches closely the mental model and results in
65 reasonably nice code.
66
67Coroutines by Example
68=====================
69
70Coroutine Representation
71------------------------
72
73Let's look at an example of an LLVM coroutine with the behavior sketched
74by the following pseudo-code.
75
Sanjoy Das77a9c792016-07-26 21:03:41 +000076.. code-block:: c++
David Majnemera6539272016-07-23 04:05:08 +000077
78 void *f(int n) {
79 for(;;) {
80 print(n++);
81 <suspend> // returns a coroutine handle on first suspend
82 }
83 }
84
85This coroutine calls some function `print` with value `n` as an argument and
86suspends execution. Every time this coroutine resumes, it calls `print` again with an argument one bigger than the last time. This coroutine never completes by itself and must be destroyed explicitly. If we use this coroutine with
87a `main` shown in the previous section. It will call `print` with values 4, 5
88and 6 after which the coroutine will be destroyed.
89
90The LLVM IR for this coroutine looks like this:
91
Gor Nishanov06fdf482017-04-05 05:26:26 +000092.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +000093
94 define i8* @f(i32 %n) {
95 entry:
Gor Nishanovdce9b022016-08-29 14:34:12 +000096 %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
David Majnemera6539272016-07-23 04:05:08 +000097 %size = call i32 @llvm.coro.size.i32()
98 %alloc = call i8* @malloc(i32 %size)
Gor Nishanov0f303ac2016-08-12 05:45:49 +000099 %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
David Majnemera6539272016-07-23 04:05:08 +0000100 br label %loop
101 loop:
102 %n.val = phi i32 [ %n, %entry ], [ %inc, %loop ]
103 %inc = add nsw i32 %n.val, 1
104 call void @print(i32 %n.val)
105 %0 = call i8 @llvm.coro.suspend(token none, i1 false)
106 switch i8 %0, label %suspend [i8 0, label %loop
107 i8 1, label %cleanup]
108 cleanup:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000109 %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
David Majnemera6539272016-07-23 04:05:08 +0000110 call void @free(i8* %mem)
111 br label %suspend
112 suspend:
Gor Nishanovc52006a2017-03-07 21:00:54 +0000113 %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
David Majnemera6539272016-07-23 04:05:08 +0000114 ret i8* %hdl
115 }
116
117The `entry` block establishes the coroutine frame. The `coro.size`_ intrinsic is
118lowered to a constant representing the size required for the coroutine frame.
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000119The `coro.begin`_ intrinsic initializes the coroutine frame and returns the
120coroutine handle. The second parameter of `coro.begin` is given a block of memory
121to be used if the coroutine frame needs to be allocated dynamically.
122The `coro.id`_ intrinsic serves as coroutine identity useful in cases when the
123`coro.begin`_ intrinsic get duplicated by optimization passes such as
124jump-threading.
David Majnemera6539272016-07-23 04:05:08 +0000125
126The `cleanup` block destroys the coroutine frame. The `coro.free`_ intrinsic,
127given the coroutine handle, returns a pointer of the memory block to be freed or
128`null` if the coroutine frame was not allocated dynamically. The `cleanup`
129block is entered when coroutine runs to completion by itself or destroyed via
130call to the `coro.destroy`_ intrinsic.
131
132The `suspend` block contains code to be executed when coroutine runs to
133completion or suspended. The `coro.end`_ intrinsic marks the point where
134a coroutine needs to return control back to the caller if it is not an initial
135invocation of the coroutine.
136
137The `loop` blocks represents the body of the coroutine. The `coro.suspend`_
138intrinsic in combination with the following switch indicates what happens to
139control flow when a coroutine is suspended (default case), resumed (case 0) or
140destroyed (case 1).
141
142Coroutine Transformation
143------------------------
144
145One of the steps of coroutine lowering is building the coroutine frame. The
146def-use chains are analyzed to determine which objects need be kept alive across
147suspend points. In the coroutine shown in the previous section, use of virtual register
148`%n.val` is separated from the definition by a suspend point, therefore, it
149cannot reside on the stack frame since the latter goes away once the coroutine
150is suspended and control is returned back to the caller. An i32 slot is
151allocated in the coroutine frame and `%n.val` is spilled and reloaded from that
152slot as needed.
153
154We also store addresses of the resume and destroy functions so that the
155`coro.resume` and `coro.destroy` intrinsics can resume and destroy the coroutine
156when its identity cannot be determined statically at compile time. For our
157example, the coroutine frame will be:
158
Gor Nishanov06fdf482017-04-05 05:26:26 +0000159.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000160
161 %f.frame = type { void (%f.frame*)*, void (%f.frame*)*, i32 }
162
163After resume and destroy parts are outlined, function `f` will contain only the
164code responsible for creation and initialization of the coroutine frame and
165execution of the coroutine until a suspend point is reached:
166
Gor Nishanov06fdf482017-04-05 05:26:26 +0000167.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000168
169 define i8* @f(i32 %n) {
170 entry:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000171 %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
David Majnemera6539272016-07-23 04:05:08 +0000172 %alloc = call noalias i8* @malloc(i32 24)
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000173 %0 = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
Mehdi Aminibe1cb222016-07-27 06:03:47 +0000174 %frame = bitcast i8* %0 to %f.frame*
David Majnemera6539272016-07-23 04:05:08 +0000175 %1 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 0
176 store void (%f.frame*)* @f.resume, void (%f.frame*)** %1
177 %2 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 1
178 store void (%f.frame*)* @f.destroy, void (%f.frame*)** %2
179
180 %inc = add nsw i32 %n, 1
181 %inc.spill.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 2
182 store i32 %inc, i32* %inc.spill.addr
183 call void @print(i32 %n)
184
185 ret i8* %frame
186 }
187
188Outlined resume part of the coroutine will reside in function `f.resume`:
189
190.. code-block:: llvm
191
192 define internal fastcc void @f.resume(%f.frame* %frame.ptr.resume) {
193 entry:
194 %inc.spill.addr = getelementptr %f.frame, %f.frame* %frame.ptr.resume, i64 0, i32 2
195 %inc.spill = load i32, i32* %inc.spill.addr, align 4
196 %inc = add i32 %n.val, 1
197 store i32 %inc, i32* %inc.spill.addr, align 4
198 tail call void @print(i32 %inc)
199 ret void
200 }
201
202Whereas function `f.destroy` will contain the cleanup code for the coroutine:
203
204.. code-block:: llvm
205
206 define internal fastcc void @f.destroy(%f.frame* %frame.ptr.destroy) {
207 entry:
208 %0 = bitcast %f.frame* %frame.ptr.destroy to i8*
209 tail call void @free(i8* %0)
210 ret void
211 }
212
213Avoiding Heap Allocations
214-------------------------
215
216A particular coroutine usage pattern, which is illustrated by the `main`
217function in the overview section, where a coroutine is created, manipulated and
218destroyed by the same calling function, is common for coroutines implementing
219RAII idiom and is suitable for allocation elision optimization which avoid
220dynamic allocation by storing the coroutine frame as a static `alloca` in its
221caller.
222
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000223In the entry block, we will call `coro.alloc`_ intrinsic that will return `true`
224when dynamic allocation is required, and `false` if dynamic allocation is
225elided.
David Majnemera6539272016-07-23 04:05:08 +0000226
Gor Nishanov06fdf482017-04-05 05:26:26 +0000227.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000228
229 entry:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000230 %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000231 %need.dyn.alloc = call i1 @llvm.coro.alloc(token %id)
232 br i1 %need.dyn.alloc, label %dyn.alloc, label %coro.begin
David Majnemera6539272016-07-23 04:05:08 +0000233 dyn.alloc:
234 %size = call i32 @llvm.coro.size.i32()
235 %alloc = call i8* @CustomAlloc(i32 %size)
236 br label %coro.begin
237 coro.begin:
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000238 %phi = phi i8* [ null, %entry ], [ %alloc, %dyn.alloc ]
239 %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %phi)
David Majnemera6539272016-07-23 04:05:08 +0000240
241In the cleanup block, we will make freeing the coroutine frame conditional on
242`coro.free`_ intrinsic. If allocation is elided, `coro.free`_ returns `null`
243thus skipping the deallocation code:
244
Gor Nishanov06fdf482017-04-05 05:26:26 +0000245.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000246
247 cleanup:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000248 %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
David Majnemera6539272016-07-23 04:05:08 +0000249 %need.dyn.free = icmp ne i8* %mem, null
250 br i1 %need.dyn.free, label %dyn.free, label %if.end
251 dyn.free:
252 call void @CustomFree(i8* %mem)
253 br label %if.end
254 if.end:
255 ...
256
257With allocations and deallocations represented as described as above, after
David Majnemer78557192016-07-27 05:12:35 +0000258coroutine heap allocation elision optimization, the resulting main will be:
David Majnemera6539272016-07-23 04:05:08 +0000259
260.. code-block:: llvm
261
262 define i32 @main() {
263 entry:
264 call void @print(i32 4)
265 call void @print(i32 5)
266 call void @print(i32 6)
267 ret i32 0
268 }
269
270Multiple Suspend Points
271-----------------------
272
273Let's consider the coroutine that has more than one suspend point:
274
Sanjoy Das77a9c792016-07-26 21:03:41 +0000275.. code-block:: c++
David Majnemera6539272016-07-23 04:05:08 +0000276
277 void *f(int n) {
278 for(;;) {
279 print(n++);
280 <suspend>
281 print(-n);
282 <suspend>
283 }
284 }
285
286Matching LLVM code would look like (with the rest of the code remaining the same
287as the code in the previous section):
288
Gor Nishanov06fdf482017-04-05 05:26:26 +0000289.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000290
291 loop:
292 %n.addr = phi i32 [ %n, %entry ], [ %inc, %loop.resume ]
293 call void @print(i32 %n.addr) #4
294 %2 = call i8 @llvm.coro.suspend(token none, i1 false)
295 switch i8 %2, label %suspend [i8 0, label %loop.resume
296 i8 1, label %cleanup]
297 loop.resume:
298 %inc = add nsw i32 %n.addr, 1
299 %sub = xor i32 %n.addr, -1
300 call void @print(i32 %sub)
301 %3 = call i8 @llvm.coro.suspend(token none, i1 false)
302 switch i8 %3, label %suspend [i8 0, label %loop
303 i8 1, label %cleanup]
304
305In this case, the coroutine frame would include a suspend index that will
306indicate at which suspend point the coroutine needs to resume. The resume
307function will use an index to jump to an appropriate basic block and will look
308as follows:
309
310.. code-block:: llvm
311
312 define internal fastcc void @f.Resume(%f.Frame* %FramePtr) {
313 entry.Resume:
314 %index.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i64 0, i32 2
315 %index = load i8, i8* %index.addr, align 1
316 %switch = icmp eq i8 %index, 0
317 %n.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i64 0, i32 3
318 %n = load i32, i32* %n.addr, align 4
319 br i1 %switch, label %loop.resume, label %loop
320
321 loop.resume:
322 %sub = xor i32 %n, -1
323 call void @print(i32 %sub)
324 br label %suspend
325 loop:
326 %inc = add nsw i32 %n, 1
327 store i32 %inc, i32* %n.addr, align 4
328 tail call void @print(i32 %inc)
329 br label %suspend
330
331 suspend:
332 %storemerge = phi i8 [ 0, %loop ], [ 1, %loop.resume ]
333 store i8 %storemerge, i8* %index.addr, align 1
334 ret void
335 }
336
337If different cleanup code needs to get executed for different suspend points,
338a similar switch will be in the `f.destroy` function.
339
340.. note ::
341
342 Using suspend index in a coroutine state and having a switch in `f.resume` and
343 `f.destroy` is one of the possible implementation strategies. We explored
344 another option where a distinct `f.resume1`, `f.resume2`, etc. are created for
345 every suspend point, and instead of storing an index, the resume and destroy
346 function pointers are updated at every suspend. Early testing showed that the
347 current approach is easier on the optimizer than the latter so it is a
348 lowering strategy implemented at the moment.
349
350Distinct Save and Suspend
351-------------------------
352
353In the previous example, setting a resume index (or some other state change that
354needs to happen to prepare a coroutine for resumption) happens at the same time as
355a suspension of a coroutine. However, in certain cases, it is necessary to control
356when coroutine is prepared for resumption and when it is suspended.
357
358In the following example, a coroutine represents some activity that is driven
359by completions of asynchronous operations `async_op1` and `async_op2` which get
360a coroutine handle as a parameter and resume the coroutine once async
361operation is finished.
362
Aaron Ballmanbc7c2d02016-07-23 20:11:21 +0000363.. code-block:: text
David Majnemera6539272016-07-23 04:05:08 +0000364
365 void g() {
366 for (;;)
367 if (cond()) {
368 async_op1(<coroutine-handle>); // will resume once async_op1 completes
369 <suspend>
370 do_one();
371 }
372 else {
373 async_op2(<coroutine-handle>); // will resume once async_op2 completes
374 <suspend>
375 do_two();
376 }
377 }
378 }
379
380In this case, coroutine should be ready for resumption prior to a call to
381`async_op1` and `async_op2`. The `coro.save`_ intrinsic is used to indicate a
382point when coroutine should be ready for resumption (namely, when a resume index
383should be stored in the coroutine frame, so that it can be resumed at the
384correct resume point):
385
Gor Nishanov06fdf482017-04-05 05:26:26 +0000386.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000387
388 if.true:
389 %save1 = call token @llvm.coro.save(i8* %hdl)
Gor Nishanov06fdf482017-04-05 05:26:26 +0000390 call void @async_op1(i8* %hdl)
David Majnemera6539272016-07-23 04:05:08 +0000391 %suspend1 = call i1 @llvm.coro.suspend(token %save1, i1 false)
392 switch i8 %suspend1, label %suspend [i8 0, label %resume1
393 i8 1, label %cleanup]
394 if.false:
395 %save2 = call token @llvm.coro.save(i8* %hdl)
Gor Nishanov06fdf482017-04-05 05:26:26 +0000396 call void @async_op2(i8* %hdl)
David Majnemera6539272016-07-23 04:05:08 +0000397 %suspend2 = call i1 @llvm.coro.suspend(token %save2, i1 false)
398 switch i8 %suspend1, label %suspend [i8 0, label %resume2
399 i8 1, label %cleanup]
400
401.. _coroutine promise:
402
403Coroutine Promise
404-----------------
405
406A coroutine author or a frontend may designate a distinguished `alloca` that can
407be used to communicate with the coroutine. This distinguished alloca is called
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000408**coroutine promise** and is provided as the second parameter to the
409`coro.id`_ intrinsic.
David Majnemera6539272016-07-23 04:05:08 +0000410
411The following coroutine designates a 32 bit integer `promise` and uses it to
412store the current value produced by a coroutine.
413
Gor Nishanov06fdf482017-04-05 05:26:26 +0000414.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000415
416 define i8* @f(i32 %n) {
417 entry:
418 %promise = alloca i32
419 %pv = bitcast i32* %promise to i8*
Gor Nishanovdce9b022016-08-29 14:34:12 +0000420 %id = call token @llvm.coro.id(i32 0, i8* %pv, i8* null, i8* null)
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000421 %need.dyn.alloc = call i1 @llvm.coro.alloc(token %id)
422 br i1 %need.dyn.alloc, label %dyn.alloc, label %coro.begin
David Majnemer78557192016-07-27 05:12:35 +0000423 dyn.alloc:
David Majnemera6539272016-07-23 04:05:08 +0000424 %size = call i32 @llvm.coro.size.i32()
425 %alloc = call i8* @malloc(i32 %size)
David Majnemer78557192016-07-27 05:12:35 +0000426 br label %coro.begin
427 coro.begin:
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000428 %phi = phi i8* [ null, %entry ], [ %alloc, %dyn.alloc ]
429 %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %phi)
David Majnemera6539272016-07-23 04:05:08 +0000430 br label %loop
431 loop:
David Majnemer78557192016-07-27 05:12:35 +0000432 %n.val = phi i32 [ %n, %coro.begin ], [ %inc, %loop ]
David Majnemera6539272016-07-23 04:05:08 +0000433 %inc = add nsw i32 %n.val, 1
434 store i32 %n.val, i32* %promise
435 %0 = call i8 @llvm.coro.suspend(token none, i1 false)
436 switch i8 %0, label %suspend [i8 0, label %loop
437 i8 1, label %cleanup]
438 cleanup:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000439 %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
David Majnemera6539272016-07-23 04:05:08 +0000440 call void @free(i8* %mem)
441 br label %suspend
442 suspend:
Gor Nishanovc52006a2017-03-07 21:00:54 +0000443 %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
David Majnemera6539272016-07-23 04:05:08 +0000444 ret i8* %hdl
445 }
446
447A coroutine consumer can rely on the `coro.promise`_ intrinsic to access the
448coroutine promise.
449
450.. code-block:: llvm
451
452 define i32 @main() {
453 entry:
454 %hdl = call i8* @f(i32 4)
455 %promise.addr.raw = call i8* @llvm.coro.promise(i8* %hdl, i32 4, i1 false)
456 %promise.addr = bitcast i8* %promise.addr.raw to i32*
457 %val0 = load i32, i32* %promise.addr
458 call void @print(i32 %val0)
459 call void @llvm.coro.resume(i8* %hdl)
460 %val1 = load i32, i32* %promise.addr
461 call void @print(i32 %val1)
462 call void @llvm.coro.resume(i8* %hdl)
463 %val2 = load i32, i32* %promise.addr
464 call void @print(i32 %val2)
465 call void @llvm.coro.destroy(i8* %hdl)
466 ret i32 0
467 }
468
David Majnemer78557192016-07-27 05:12:35 +0000469After example in this section is compiled, result of the compilation will be:
David Majnemera6539272016-07-23 04:05:08 +0000470
471.. code-block:: llvm
472
473 define i32 @main() {
474 entry:
475 tail call void @print(i32 4)
476 tail call void @print(i32 5)
477 tail call void @print(i32 6)
478 ret i32 0
479 }
480
481.. _final:
482.. _final suspend:
483
484Final Suspend
485-------------
486
487A coroutine author or a frontend may designate a particular suspend to be final,
488by setting the second argument of the `coro.suspend`_ intrinsic to `true`.
489Such a suspend point has two properties:
490
491* it is possible to check whether a suspended coroutine is at the final suspend
492 point via `coro.done`_ intrinsic;
493
494* a resumption of a coroutine stopped at the final suspend point leads to
495 undefined behavior. The only possible action for a coroutine at a final
496 suspend point is destroying it via `coro.destroy`_ intrinsic.
497
498From the user perspective, the final suspend point represents an idea of a
499coroutine reaching the end. From the compiler perspective, it is an optimization
500opportunity for reducing number of resume points (and therefore switch cases) in
501the resume function.
502
503The following is an example of a function that keeps resuming the coroutine
504until the final suspend point is reached after which point the coroutine is
505destroyed:
506
507.. code-block:: llvm
508
509 define i32 @main() {
510 entry:
511 %hdl = call i8* @f(i32 4)
512 br label %while
513 while:
514 call void @llvm.coro.resume(i8* %hdl)
515 %done = call i1 @llvm.coro.done(i8* %hdl)
516 br i1 %done, label %end, label %while
517 end:
518 call void @llvm.coro.destroy(i8* %hdl)
519 ret i32 0
520 }
521
522Usually, final suspend point is a frontend injected suspend point that does not
523correspond to any explicitly authored suspend point of the high level language.
524For example, for a Python generator that has only one suspend point:
525
526.. code-block:: python
527
528 def coroutine(n):
529 for i in range(n):
530 yield i
531
532Python frontend would inject two more suspend points, so that the actual code
533looks like this:
534
Sanjoy Das77a9c792016-07-26 21:03:41 +0000535.. code-block:: c
David Majnemera6539272016-07-23 04:05:08 +0000536
537 void* coroutine(int n) {
538 int current_value;
539 <designate current_value to be coroutine promise>
540 <SUSPEND> // injected suspend point, so that the coroutine starts suspended
541 for (int i = 0; i < n; ++i) {
542 current_value = i; <SUSPEND>; // corresponds to "yield i"
543 }
544 <SUSPEND final=true> // injected final suspend point
545 }
546
547and python iterator `__next__` would look like:
548
Sanjoy Das77a9c792016-07-26 21:03:41 +0000549.. code-block:: c++
David Majnemera6539272016-07-23 04:05:08 +0000550
551 int __next__(void* hdl) {
552 coro.resume(hdl);
553 if (coro.done(hdl)) throw StopIteration();
554 return *(int*)coro.promise(hdl, 4, false);
555 }
556
557Intrinsics
558==========
559
560Coroutine Manipulation Intrinsics
561---------------------------------
562
563Intrinsics described in this section are used to manipulate an existing
564coroutine. They can be used in any function which happen to have a pointer
565to a `coroutine frame`_ or a pointer to a `coroutine promise`_.
566
567.. _coro.destroy:
568
569'llvm.coro.destroy' Intrinsic
570^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
571
572Syntax:
573"""""""
574
575::
576
577 declare void @llvm.coro.destroy(i8* <handle>)
578
579Overview:
580"""""""""
581
582The '``llvm.coro.destroy``' intrinsic destroys a suspended
583coroutine.
584
585Arguments:
586""""""""""
587
588The argument is a coroutine handle to a suspended coroutine.
589
590Semantics:
591""""""""""
592
593When possible, the `coro.destroy` intrinsic is replaced with a direct call to
594the coroutine destroy function. Otherwise it is replaced with an indirect call
595based on the function pointer for the destroy function stored in the coroutine
596frame. Destroying a coroutine that is not suspended leads to undefined behavior.
597
598.. _coro.resume:
599
600'llvm.coro.resume' Intrinsic
601^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
602
603::
604
605 declare void @llvm.coro.resume(i8* <handle>)
606
607Overview:
608"""""""""
609
610The '``llvm.coro.resume``' intrinsic resumes a suspended coroutine.
611
612Arguments:
613""""""""""
614
615The argument is a handle to a suspended coroutine.
616
617Semantics:
618""""""""""
619
620When possible, the `coro.resume` intrinsic is replaced with a direct call to the
621coroutine resume function. Otherwise it is replaced with an indirect call based
622on the function pointer for the resume function stored in the coroutine frame.
623Resuming a coroutine that is not suspended leads to undefined behavior.
624
625.. _coro.done:
626
627'llvm.coro.done' Intrinsic
628^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
629
630::
631
632 declare i1 @llvm.coro.done(i8* <handle>)
633
634Overview:
635"""""""""
636
637The '``llvm.coro.done``' intrinsic checks whether a suspended coroutine is at
638the final suspend point or not.
639
640Arguments:
641""""""""""
642
643The argument is a handle to a suspended coroutine.
644
645Semantics:
646""""""""""
647
648Using this intrinsic on a coroutine that does not have a `final suspend`_ point
649or on a coroutine that is not suspended leads to undefined behavior.
650
651.. _coro.promise:
652
653'llvm.coro.promise' Intrinsic
654^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
655
656::
657
658 declare i8* @llvm.coro.promise(i8* <ptr>, i32 <alignment>, i1 <from>)
659
660Overview:
661"""""""""
662
663The '``llvm.coro.promise``' intrinsic obtains a pointer to a
664`coroutine promise`_ given a coroutine handle and vice versa.
665
666Arguments:
667""""""""""
668
669The first argument is a handle to a coroutine if `from` is false. Otherwise,
670it is a pointer to a coroutine promise.
671
672The second argument is an alignment requirements of the promise.
673If a frontend designated `%promise = alloca i32` as a promise, the alignment
674argument to `coro.promise` should be the alignment of `i32` on the target
675platform. If a frontend designated `%promise = alloca i32, align 16` as a
676promise, the alignment argument should be 16.
677This argument only accepts constants.
678
679The third argument is a boolean indicating a direction of the transformation.
680If `from` is true, the intrinsic returns a coroutine handle given a pointer
681to a promise. If `from` is false, the intrinsics return a pointer to a promise
682from a coroutine handle. This argument only accepts constants.
683
684Semantics:
685""""""""""
686
687Using this intrinsic on a coroutine that does not have a coroutine promise
688leads to undefined behavior. It is possible to read and modify coroutine
689promise of the coroutine which is currently executing. The coroutine author and
690a coroutine user are responsible to makes sure there is no data races.
691
692Example:
693""""""""
694
Gor Nishanov06fdf482017-04-05 05:26:26 +0000695.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000696
697 define i8* @f(i32 %n) {
698 entry:
699 %promise = alloca i32
700 %pv = bitcast i32* %promise to i8*
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000701 ; the second argument to coro.id points to the coroutine promise.
Gor Nishanovdce9b022016-08-29 14:34:12 +0000702 %id = call token @llvm.coro.id(i32 0, i8* %pv, i8* null, i8* null)
David Majnemera6539272016-07-23 04:05:08 +0000703 ...
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000704 %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
David Majnemera6539272016-07-23 04:05:08 +0000705 ...
706 store i32 42, i32* %promise ; store something into the promise
707 ...
708 ret i8* %hdl
709 }
710
711 define i32 @main() {
712 entry:
713 %hdl = call i8* @f(i32 4) ; starts the coroutine and returns its handle
714 %promise.addr.raw = call i8* @llvm.coro.promise(i8* %hdl, i32 4, i1 false)
715 %promise.addr = bitcast i8* %promise.addr.raw to i32*
716 %val = load i32, i32* %promise.addr ; load a value from the promise
717 call void @print(i32 %val)
718 call void @llvm.coro.destroy(i8* %hdl)
719 ret i32 0
720 }
721
722.. _coroutine intrinsics:
723
724Coroutine Structure Intrinsics
725------------------------------
726Intrinsics described in this section are used within a coroutine to describe
727the coroutine structure. They should not be used outside of a coroutine.
728
729.. _coro.size:
730
731'llvm.coro.size' Intrinsic
732^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
733::
734
735 declare i32 @llvm.coro.size.i32()
736 declare i64 @llvm.coro.size.i64()
737
738Overview:
739"""""""""
740
741The '``llvm.coro.size``' intrinsic returns the number of bytes
742required to store a `coroutine frame`_.
743
744Arguments:
745""""""""""
746
747None
748
749Semantics:
750""""""""""
751
752The `coro.size` intrinsic is lowered to a constant representing the size of
753the coroutine frame.
754
755.. _coro.begin:
756
757'llvm.coro.begin' Intrinsic
758^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
759::
760
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000761 declare i8* @llvm.coro.begin(token <id>, i8* <mem>)
David Majnemera6539272016-07-23 04:05:08 +0000762
763Overview:
764"""""""""
765
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000766The '``llvm.coro.begin``' intrinsic returns an address of the coroutine frame.
David Majnemera6539272016-07-23 04:05:08 +0000767
768Arguments:
769""""""""""
770
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000771The first argument is a token returned by a call to '``llvm.coro.id``'
772identifying the coroutine.
David Majnemera6539272016-07-23 04:05:08 +0000773
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000774The second argument is a pointer to a block of memory where coroutine frame
775will be stored if it is allocated dynamically.
David Majnemera6539272016-07-23 04:05:08 +0000776
777Semantics:
778""""""""""
779
780Depending on the alignment requirements of the objects in the coroutine frame
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000781and/or on the codegen compactness reasons the pointer returned from `coro.begin`
782may be at offset to the `%mem` argument. (This could be beneficial if
783instructions that express relative access to data can be more compactly encoded
784with small positive and negative offsets).
David Majnemera6539272016-07-23 04:05:08 +0000785
David Majnemer78557192016-07-27 05:12:35 +0000786A frontend should emit exactly one `coro.begin` intrinsic per coroutine.
David Majnemera6539272016-07-23 04:05:08 +0000787
788.. _coro.free:
789
790'llvm.coro.free' Intrinsic
791^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
792::
793
Gor Nishanovdce9b022016-08-29 14:34:12 +0000794 declare i8* @llvm.coro.free(token %id, i8* <frame>)
David Majnemera6539272016-07-23 04:05:08 +0000795
796Overview:
797"""""""""
798
799The '``llvm.coro.free``' intrinsic returns a pointer to a block of memory where
800coroutine frame is stored or `null` if this instance of a coroutine did not use
801dynamically allocated memory for its coroutine frame.
802
803Arguments:
804""""""""""
805
Gor Nishanovdce9b022016-08-29 14:34:12 +0000806The first argument is a token returned by a call to '``llvm.coro.id``'
807identifying the coroutine.
808
809The second argument is a pointer to the coroutine frame. This should be the same
810pointer that was returned by prior `coro.begin` call.
David Majnemera6539272016-07-23 04:05:08 +0000811
812Example (custom deallocation function):
813"""""""""""""""""""""""""""""""""""""""
814
Gor Nishanov06fdf482017-04-05 05:26:26 +0000815.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000816
817 cleanup:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000818 %mem = call i8* @llvm.coro.free(token %id, i8* %frame)
David Majnemera6539272016-07-23 04:05:08 +0000819 %mem_not_null = icmp ne i8* %mem, null
820 br i1 %mem_not_null, label %if.then, label %if.end
821 if.then:
822 call void @CustomFree(i8* %mem)
823 br label %if.end
824 if.end:
825 ret void
826
827Example (standard deallocation functions):
828""""""""""""""""""""""""""""""""""""""""""
829
Gor Nishanov06fdf482017-04-05 05:26:26 +0000830.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000831
832 cleanup:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000833 %mem = call i8* @llvm.coro.free(token %id, i8* %frame)
David Majnemera6539272016-07-23 04:05:08 +0000834 call void @free(i8* %mem)
835 ret void
836
837.. _coro.alloc:
838
839'llvm.coro.alloc' Intrinsic
840^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
841::
842
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000843 declare i1 @llvm.coro.alloc(token <id>)
David Majnemera6539272016-07-23 04:05:08 +0000844
845Overview:
846"""""""""
847
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000848The '``llvm.coro.alloc``' intrinsic returns `true` if dynamic allocation is
Hiroshi Inoueb93daec2017-07-02 12:44:27 +0000849required to obtain a memory for the coroutine frame and `false` otherwise.
David Majnemera6539272016-07-23 04:05:08 +0000850
851Arguments:
852""""""""""
853
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000854The first argument is a token returned by a call to '``llvm.coro.id``'
855identifying the coroutine.
David Majnemera6539272016-07-23 04:05:08 +0000856
857Semantics:
858""""""""""
859
David Majnemer78557192016-07-27 05:12:35 +0000860A frontend should emit at most one `coro.alloc` intrinsic per coroutine.
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000861The intrinsic is used to suppress dynamic allocation of the coroutine frame
862when possible.
Gor Nishanovb2a9c022016-08-10 16:40:39 +0000863
David Majnemera6539272016-07-23 04:05:08 +0000864Example:
865""""""""
866
Gor Nishanov06fdf482017-04-05 05:26:26 +0000867.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +0000868
869 entry:
Gor Nishanovdce9b022016-08-29 14:34:12 +0000870 %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000871 %dyn.alloc.required = call i1 @llvm.coro.alloc(token %id)
872 br i1 %dyn.alloc.required, label %coro.alloc, label %coro.begin
David Majnemera6539272016-07-23 04:05:08 +0000873
874 coro.alloc:
875 %frame.size = call i32 @llvm.coro.size()
876 %alloc = call i8* @MyAlloc(i32 %frame.size)
877 br label %coro.begin
878
879 coro.begin:
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000880 %phi = phi i8* [ null, %entry ], [ %alloc, %coro.alloc ]
881 %frame = call i8* @llvm.coro.begin(token %id, i8* %phi)
David Majnemera6539272016-07-23 04:05:08 +0000882
Gor Nishanovb0316d92018-04-02 16:55:12 +0000883.. _coro.noop:
884
885'llvm.coro.noop' Intrinsic
886^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
887::
888
889 declare i8* @llvm.coro.noop()
890
891Overview:
892"""""""""
893
894The '``llvm.coro.noop``' intrinsic returns an address of the coroutine frame of
895a coroutine that does nothing when resumed or destroyed.
896
897Arguments:
898""""""""""
899
900None
901
902Semantics:
903""""""""""
904
905This intrinsic is lowered to refer to a private constant coroutine frame. The
906resume and destroy handlers for this frame are empty functions that do nothing.
907Note that in different translation units llvm.coro.noop may return different pointers.
908
David Majnemera6539272016-07-23 04:05:08 +0000909.. _coro.frame:
910
911'llvm.coro.frame' Intrinsic
912^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
913::
914
915 declare i8* @llvm.coro.frame()
916
917Overview:
918"""""""""
919
920The '``llvm.coro.frame``' intrinsic returns an address of the coroutine frame of
921the enclosing coroutine.
922
923Arguments:
924""""""""""
925
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000926None
David Majnemera6539272016-07-23 04:05:08 +0000927
928Semantics:
929""""""""""
930
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000931This intrinsic is lowered to refer to the `coro.begin`_ instruction. This is
932a frontend convenience intrinsic that makes it easier to refer to the
933coroutine frame.
934
935.. _coro.id:
936
937'llvm.coro.id' Intrinsic
938^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
939::
940
Gor Nishanovdce9b022016-08-29 14:34:12 +0000941 declare token @llvm.coro.id(i32 <align>, i8* <promise>, i8* <coroaddr>,
942 i8* <fnaddrs>)
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000943
944Overview:
945"""""""""
946
947The '``llvm.coro.id``' intrinsic returns a token identifying a coroutine.
948
949Arguments:
950""""""""""
951
952The first argument provides information on the alignment of the memory returned
953by the allocation function and given to `coro.begin` by the first argument. If
954this argument is 0, the memory is assumed to be aligned to 2 * sizeof(i8*).
955This argument only accepts constants.
956
957The second argument, if not `null`, designates a particular alloca instruction
958to be a `coroutine promise`_.
959
Gor Nishanovdce9b022016-08-29 14:34:12 +0000960The third argument is `null` coming out of the frontend. The CoroEarly pass sets
961this argument to point to the function this coro.id belongs to.
962
963The fourth argument is `null` before coroutine is split, and later is replaced
Gor Nishanov0f303ac2016-08-12 05:45:49 +0000964to point to a private global constant array containing function pointers to
965outlined resume and destroy parts of the coroutine.
966
967
968Semantics:
969""""""""""
970
971The purpose of this intrinsic is to tie together `coro.id`, `coro.alloc` and
972`coro.begin` belonging to the same coroutine to prevent optimization passes from
973duplicating any of these instructions unless entire body of the coroutine is
974duplicated.
975
976A frontend should emit exactly one `coro.id` intrinsic per coroutine.
David Majnemera6539272016-07-23 04:05:08 +0000977
978.. _coro.end:
979
980'llvm.coro.end' Intrinsic
981^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
982::
983
Gor Nishanovc52006a2017-03-07 21:00:54 +0000984 declare i1 @llvm.coro.end(i8* <handle>, i1 <unwind>)
David Majnemera6539272016-07-23 04:05:08 +0000985
986Overview:
987"""""""""
988
989The '``llvm.coro.end``' marks the point where execution of the resume part of
Gor Nishanovc52006a2017-03-07 21:00:54 +0000990the coroutine should end and control should return to the caller.
David Majnemera6539272016-07-23 04:05:08 +0000991
992
993Arguments:
994""""""""""
995
Gor Nishanovc52006a2017-03-07 21:00:54 +0000996The first argument should refer to the coroutine handle of the enclosing
997coroutine. A frontend is allowed to supply null as the first parameter, in this
998case `coro-early` pass will replace the null with an appropriate coroutine
999handle value.
David Majnemera6539272016-07-23 04:05:08 +00001000
1001The second argument should be `true` if this coro.end is in the block that is
Gor Nishanovc52006a2017-03-07 21:00:54 +00001002part of the unwind sequence leaving the coroutine body due to an exception and
1003`false` otherwise.
David Majnemera6539272016-07-23 04:05:08 +00001004
1005Semantics:
1006""""""""""
Gor Nishanovc52006a2017-03-07 21:00:54 +00001007The purpose of this intrinsic is to allow frontends to mark the cleanup and
1008other code that is only relevant during the initial invocation of the coroutine
1009and should not be present in resume and destroy parts.
David Majnemera6539272016-07-23 04:05:08 +00001010
Gor Nishanovc52006a2017-03-07 21:00:54 +00001011This intrinsic is lowered when a coroutine is split into
1012the start, resume and destroy parts. In the start part, it is a no-op,
1013in resume and destroy parts, it is replaced with `ret void` instruction and
David Majnemera6539272016-07-23 04:05:08 +00001014the rest of the block containing `coro.end` instruction is discarded.
David Majnemera6539272016-07-23 04:05:08 +00001015In landing pads it is replaced with an appropriate instruction to unwind to
Gor Nishanovc52006a2017-03-07 21:00:54 +00001016caller. The handling of coro.end differs depending on whether the target is
1017using landingpad or WinEH exception model.
David Majnemera6539272016-07-23 04:05:08 +00001018
Gor Nishanovc52006a2017-03-07 21:00:54 +00001019For landingpad based exception model, it is expected that frontend uses the
1020`coro.end`_ intrinsic as follows:
1021
1022.. code-block:: llvm
1023
1024 ehcleanup:
1025 %InResumePart = call i1 @llvm.coro.end(i8* null, i1 true)
1026 br i1 %InResumePart, label %eh.resume, label %cleanup.cont
1027
1028 cleanup.cont:
1029 ; rest of the cleanup
1030
1031 eh.resume:
1032 %exn = load i8*, i8** %exn.slot, align 8
1033 %sel = load i32, i32* %ehselector.slot, align 4
1034 %lpad.val = insertvalue { i8*, i32 } undef, i8* %exn, 0
1035 %lpad.val29 = insertvalue { i8*, i32 } %lpad.val, i32 %sel, 1
1036 resume { i8*, i32 } %lpad.val29
1037
1038The `CoroSpit` pass replaces `coro.end` with ``True`` in the resume functions,
1039thus leading to immediate unwind to the caller, whereas in start function it
1040is replaced with ``False``, thus allowing to proceed to the rest of the cleanup
1041code that is only needed during initial invocation of the coroutine.
1042
1043For Windows Exception handling model, a frontend should attach a funclet bundle
1044referring to an enclosing cleanuppad as follows:
1045
Gor Nishanov06fdf482017-04-05 05:26:26 +00001046.. code-block:: llvm
Gor Nishanovc52006a2017-03-07 21:00:54 +00001047
1048 ehcleanup:
1049 %tok = cleanuppad within none []
1050 %unused = call i1 @llvm.coro.end(i8* null, i1 true) [ "funclet"(token %tok) ]
1051 cleanupret from %tok unwind label %RestOfTheCleanup
1052
1053The `CoroSplit` pass, if the funclet bundle is present, will insert
1054``cleanupret from %tok unwind to caller`` before
1055the `coro.end`_ intrinsic and will remove the rest of the block.
1056
1057The following table summarizes the handling of `coro.end`_ intrinsic.
1058
1059+--------------------------+-------------------+-------------------------------+
1060| | In Start Function | In Resume/Destroy Functions |
1061+--------------------------+-------------------+-------------------------------+
1062|unwind=false | nothing |``ret void`` |
1063+------------+-------------+-------------------+-------------------------------+
1064| | WinEH | nothing |``cleanupret unwind to caller``|
1065|unwind=true +-------------+-------------------+-------------------------------+
1066| | Landingpad | nothing | nothing |
1067+------------+-------------+-------------------+-------------------------------+
David Majnemera6539272016-07-23 04:05:08 +00001068
1069.. _coro.suspend:
1070.. _suspend points:
1071
1072'llvm.coro.suspend' Intrinsic
1073^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1074::
1075
1076 declare i8 @llvm.coro.suspend(token <save>, i1 <final>)
1077
1078Overview:
1079"""""""""
1080
1081The '``llvm.coro.suspend``' marks the point where execution of the coroutine
1082need to get suspended and control returned back to the caller.
1083Conditional branches consuming the result of this intrinsic lead to basic blocks
1084where coroutine should proceed when suspended (-1), resumed (0) or destroyed
1085(1).
1086
1087Arguments:
1088""""""""""
1089
1090The first argument refers to a token of `coro.save` intrinsic that marks the
1091point when coroutine state is prepared for suspension. If `none` token is passed,
1092the intrinsic behaves as if there were a `coro.save` immediately preceding
1093the `coro.suspend` intrinsic.
1094
1095The second argument indicates whether this suspension point is `final`_.
1096The second argument only accepts constants. If more than one suspend point is
1097designated as final, the resume and destroy branches should lead to the same
1098basic blocks.
1099
1100Example (normal suspend point):
1101"""""""""""""""""""""""""""""""
1102
Gor Nishanov06fdf482017-04-05 05:26:26 +00001103.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +00001104
1105 %0 = call i8 @llvm.coro.suspend(token none, i1 false)
1106 switch i8 %0, label %suspend [i8 0, label %resume
1107 i8 1, label %cleanup]
1108
1109Example (final suspend point):
1110""""""""""""""""""""""""""""""
1111
Gor Nishanov06fdf482017-04-05 05:26:26 +00001112.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +00001113
1114 while.end:
1115 %s.final = call i8 @llvm.coro.suspend(token none, i1 true)
1116 switch i8 %s.final, label %suspend [i8 0, label %trap
1117 i8 1, label %cleanup]
1118 trap:
1119 call void @llvm.trap()
1120 unreachable
1121
1122Semantics:
1123""""""""""
1124
1125If a coroutine that was suspended at the suspend point marked by this intrinsic
1126is resumed via `coro.resume`_ the control will transfer to the basic block
1127of the 0-case. If it is resumed via `coro.destroy`_, it will proceed to the
1128basic block indicated by the 1-case. To suspend, coroutine proceed to the
1129default label.
1130
1131If suspend intrinsic is marked as final, it can consider the `true` branch
1132unreachable and can perform optimizations that can take advantage of that fact.
1133
1134.. _coro.save:
1135
1136'llvm.coro.save' Intrinsic
1137^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1138::
1139
1140 declare token @llvm.coro.save(i8* <handle>)
1141
1142Overview:
1143"""""""""
1144
1145The '``llvm.coro.save``' marks the point where a coroutine need to update its
1146state to prepare for resumption to be considered suspended (and thus eligible
1147for resumption).
1148
1149Arguments:
1150""""""""""
1151
1152The first argument points to a coroutine handle of the enclosing coroutine.
1153
1154Semantics:
1155""""""""""
1156
1157Whatever coroutine state changes are required to enable resumption of
1158the coroutine from the corresponding suspend point should be done at the point
1159of `coro.save` intrinsic.
1160
1161Example:
1162""""""""
1163
1164Separate save and suspend points are necessary when a coroutine is used to
1165represent an asynchronous control flow driven by callbacks representing
1166completions of asynchronous operations.
1167
1168In such a case, a coroutine should be ready for resumption prior to a call to
1169`async_op` function that may trigger resumption of a coroutine from the same or
1170a different thread possibly prior to `async_op` call returning control back
1171to the coroutine:
1172
Gor Nishanov06fdf482017-04-05 05:26:26 +00001173.. code-block:: llvm
David Majnemera6539272016-07-23 04:05:08 +00001174
1175 %save1 = call token @llvm.coro.save(i8* %hdl)
Gor Nishanov06fdf482017-04-05 05:26:26 +00001176 call void @async_op1(i8* %hdl)
David Majnemera6539272016-07-23 04:05:08 +00001177 %suspend1 = call i1 @llvm.coro.suspend(token %save1, i1 false)
1178 switch i8 %suspend1, label %suspend [i8 0, label %resume1
1179 i8 1, label %cleanup]
1180
1181.. _coro.param:
1182
1183'llvm.coro.param' Intrinsic
1184^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1185::
1186
1187 declare i1 @llvm.coro.param(i8* <original>, i8* <copy>)
1188
1189Overview:
1190"""""""""
1191
David Majnemer78557192016-07-27 05:12:35 +00001192The '``llvm.coro.param``' is used by a frontend to mark up the code used to
David Majnemera6539272016-07-23 04:05:08 +00001193construct and destruct copies of the parameters. If the optimizer discovers that
1194a particular parameter copy is not used after any suspends, it can remove the
1195construction and destruction of the copy by replacing corresponding coro.param
1196with `i1 false` and replacing any use of the `copy` with the `original`.
1197
1198Arguments:
1199""""""""""
1200
1201The first argument points to an `alloca` storing the value of a parameter to a
1202coroutine.
1203
1204The second argument points to an `alloca` storing the value of the copy of that
1205parameter.
1206
1207Semantics:
1208""""""""""
1209
1210The optimizer is free to always replace this intrinsic with `i1 true`.
1211
1212The optimizer is also allowed to replace it with `i1 false` provided that the
1213parameter copy is only used prior to control flow reaching any of the suspend
1214points. The code that would be DCE'd if the `coro.param` is replaced with
1215`i1 false` is not considered to be a use of the parameter copy.
1216
1217The frontend can emit this intrinsic if its language rules allow for this
1218optimization.
1219
1220Example:
1221""""""""
1222Consider the following example. A coroutine takes two parameters `a` and `b`
1223that has a destructor and a move constructor.
1224
Sanjoy Das77a9c792016-07-26 21:03:41 +00001225.. code-block:: c++
David Majnemera6539272016-07-23 04:05:08 +00001226
1227 struct A { ~A(); A(A&&); bool foo(); void bar(); };
1228
1229 task<int> f(A a, A b) {
1230 if (a.foo())
1231 return 42;
1232
1233 a.bar();
1234 co_await read_async(); // introduces suspend point
1235 b.bar();
1236 }
1237
1238Note that, uses of `b` is used after a suspend point and thus must be copied
1239into a coroutine frame, whereas `a` does not have to, since it never used
1240after suspend.
1241
1242A frontend can create parameter copies for `a` and `b` as follows:
1243
Aaron Ballmanbc7c2d02016-07-23 20:11:21 +00001244.. code-block:: text
David Majnemera6539272016-07-23 04:05:08 +00001245
1246 task<int> f(A a', A b') {
1247 a = alloca A;
1248 b = alloca A;
1249 // move parameters to its copies
1250 if (coro.param(a', a)) A::A(a, A&& a');
1251 if (coro.param(b', b)) A::A(b, A&& b');
1252 ...
1253 // destroy parameters copies
1254 if (coro.param(a', a)) A::~A(a);
1255 if (coro.param(b', b)) A::~A(b);
1256 }
1257
1258The optimizer can replace coro.param(a',a) with `i1 false` and replace all uses
1259of `a` with `a'`, since it is not used after suspend.
1260
1261The optimizer must replace coro.param(b', b) with `i1 true`, since `b` is used
1262after suspend and therefore, it has to reside in the coroutine frame.
1263
1264Coroutine Transformation Passes
1265===============================
1266CoroEarly
1267---------
1268The pass CoroEarly lowers coroutine intrinsics that hide the details of the
1269structure of the coroutine frame, but, otherwise not needed to be preserved to
1270help later coroutine passes. This pass lowers `coro.frame`_, `coro.done`_,
1271and `coro.promise`_ intrinsics.
1272
1273.. _CoroSplit:
1274
1275CoroSplit
1276---------
1277The pass CoroSplit buides coroutine frame and outlines resume and destroy parts
1278into separate functions.
1279
1280CoroElide
1281---------
1282The pass CoroElide examines if the inlined coroutine is eligible for heap
Gor Nishanov0f303ac2016-08-12 05:45:49 +00001283allocation elision optimization. If so, it replaces
1284`coro.begin` intrinsic with an address of a coroutine frame placed on its caller
1285and replaces `coro.alloc` and `coro.free` intrinsics with `false` and `null`
1286respectively to remove the deallocation code.
David Majnemera6539272016-07-23 04:05:08 +00001287This pass also replaces `coro.resume` and `coro.destroy` intrinsics with direct
1288calls to resume and destroy functions for a particular coroutine where possible.
1289
1290CoroCleanup
1291-----------
1292This pass runs late to lower all coroutine related intrinsics not replaced by
1293earlier passes.
1294
David Majnemera6539272016-07-23 04:05:08 +00001295Areas Requiring Attention
1296=========================
1297#. A coroutine frame is bigger than it could be. Adding stack packing and stack
1298 coloring like optimization on the coroutine frame will result in tighter
1299 coroutine frames.
1300
1301#. Take advantage of the lifetime intrinsics for the data that goes into the
1302 coroutine frame. Leave lifetime intrinsics as is for the data that stays in
1303 allocas.
1304
1305#. The CoroElide optimization pass relies on coroutine ramp function to be
1306 inlined. It would be beneficial to split the ramp function further to
1307 increase the chance that it will get inlined into its caller.
1308
1309#. Design a convention that would make it possible to apply coroutine heap
1310 elision optimization across ABI boundaries.
1311
1312#. Cannot handle coroutines with `inalloca` parameters (used in x86 on Windows).
1313
1314#. Alignment is ignored by coro.begin and coro.free intrinsics.
1315
1316#. Make required changes to make sure that coroutine optimizations work with
1317 LTO.
1318
1319#. More tests, more tests, more tests