//===-- README.txt - Notes for WebAssembly code gen -----------------------===//

This WebAssembly backend is presently in a very early stage of development.
The code should build and not break anything else, but don't expect a lot more
at this point.

For more information on WebAssembly itself, see the design documents:
  * https://github.com/WebAssembly/design/blob/master/README.md

The following documents contain some information on the planned semantics and
binary encoding of WebAssembly itself:
  * https://github.com/WebAssembly/design/blob/master/Semantics.md
  * https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md

The backend is built, tested and archived on the following waterfall:
  https://wasm-stat.us

The backend's bringup is done using the GCC torture test suite first since it
doesn't require C library support. Current known failures are listed in
known_gcc_test_failures.txt; all other tests should pass, and the waterfall
will turn red if they do not. Once most of these pass, further testing will
use LLVM's own test suite. The tests can be run locally using:
  https://github.com/WebAssembly/waterfall/blob/master/src/compile_torture_tests.py

//===---------------------------------------------------------------------===//

Br, br_if, and br_table instructions can support having a value on the value
stack across the jump (sometimes). We should (a) model this, and (b) extend
the stackifier to utilize it.

//===---------------------------------------------------------------------===//

The min/max instructions aren't exactly a<b?a:b because of NaN and negative zero
behavior. The ARM target has the same kind of min/max instructions and has
implemented optimizations for them; we should do similar optimizations for
WebAssembly.
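
To make the difference concrete, here is a minimal standalone C++ sketch. It
is not backend code: selectMin and wasmStyleMin are hypothetical names used
only for this note, and wasmStyleMin assumes the min behavior referred to
above (the result is NaN if either operand is NaN, and -0 is ordered below
+0).

```cpp
#include <cmath>
#include <limits>

// The naive select-based minimum: a < b ? a : b.
// Any comparison involving NaN is false, so this returns b whenever a or b
// is NaN, and it treats -0 and +0 as equal (again returning b).
inline double selectMin(double a, double b) { return a < b ? a : b; }

// Hypothetical model of a WebAssembly-style floating-point min:
//  - if either operand is NaN, the result is NaN;
//  - -0 is considered smaller than +0.
inline double wasmStyleMin(double a, double b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<double>::quiet_NaN();
  if (a == b)                        // includes the +0 / -0 case
    return std::signbit(a) ? a : b;  // prefer the operand with the sign bit
  return a < b ? a : b;
}
```

With these definitions, selectMin(-0.0, 0.0) yields +0 while
wasmStyleMin(-0.0, 0.0) yields -0, and selectMin(NaN, 1.0) yields 1.0 while
wasmStyleMin(NaN, 1.0) yields NaN. This is why a select cannot be blindly
rewritten as a min.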

//===---------------------------------------------------------------------===//

AArch64 runs SeparateConstOffsetFromGEPPass, followed by EarlyCSE and LICM.
Would these be useful to run for WebAssembly too? Also, it has an option to
run SimplifyCFG after running the AtomicExpand pass. Would this be useful for
us too?

//===---------------------------------------------------------------------===//

Register stackification uses the VALUE_STACK physical register to impose
ordering dependencies on instructions with stack operands. This is pessimistic;
we should consider alternate ways to model stack dependencies.

//===---------------------------------------------------------------------===//

Lots of things could be done in WebAssemblyTargetTransformInfo.cpp. Similarly,
there are numerous optimization-related hooks that can be overridden in
WebAssemblyTargetLowering.

//===---------------------------------------------------------------------===//

Instead of the OptimizeReturned pass, we should consider preserving the
"returned" attribute through to MachineInstrs and extending the StoreResults
pass to do this optimization on calls too. That would also let the
WebAssemblyPeephole pass clean up dead defs for such calls, as it does for
stores.

//===---------------------------------------------------------------------===//

Consider implementing optimizeSelect, optimizeCompareInstr, optimizeCondBranch,
optimizeLoadInstr, and/or getMachineCombinerPatterns.

//===---------------------------------------------------------------------===//

Find a clean way to fix the problem which leads to the Shrink Wrapping pass
being run after the WebAssembly PEI pass.

//===---------------------------------------------------------------------===//

When setting multiple local variables to the same constant, we currently get
code like this:

    i32.const $4=, 0
    i32.const $3=, 0

It could be done with a smaller encoding like this:

    i32.const $push5=, 0
    tee_local $push6=, $4=, $pop5
    copy_local $3=, $pop6

//===---------------------------------------------------------------------===//

WebAssembly registers are implicitly initialized to zero. Explicit zeroing is
therefore often redundant and could be optimized away.
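
As a rough illustration of the idea, the following standalone C++ sketch
drops redundant zero stores from a toy straight-line instruction list. It is
not backend code: SetLocalConst and dropRedundantZeroInits are hypothetical
names, and the model assumes that the first numParams registers are
parameters (which hold caller-provided values rather than zero) and that the
code has no control flow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy instruction: "set register `local` to the constant `value`".
struct SetLocalConst {
  std::size_t local;
  std::int64_t value;
};

// Removes "local = 0" instructions that are redundant because the register
// still holds its implicit initial zero. Only valid for straight-line code
// at function entry; parameters are treated as already written.
inline std::vector<SetLocalConst>
dropRedundantZeroInits(const std::vector<SetLocalConst> &code,
                       std::size_t numParams, std::size_t numLocals) {
  assert(numParams <= numLocals);
  std::vector<bool> written(numLocals, false);
  for (std::size_t i = 0; i < numParams; ++i)
    written[i] = true;

  std::vector<SetLocalConst> result;
  for (const SetLocalConst &inst : code) {
    assert(inst.local < numLocals);
    bool redundant = inst.value == 0 && !written[inst.local];
    if (redundant)
      continue;  // the register is still zero, so the store changes nothing
    result.push_back(inst);
    written[inst.local] = true;
  }
  return result;
}
```

A real pass would also have to account for control flow, since a register
zeroed inside a loop may have been written on an earlier iteration.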

//===---------------------------------------------------------------------===//

Small indices may use smaller encodings than large indices.
WebAssemblyRegColoring and/or WebAssemblyRegRenumbering should sort registers
according to their usage frequency to maximize the usage of smaller encodings.
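
The following standalone C++ sketch shows why the ordering matters. It is not
backend code: uleb128Size, renumberByUseCount and totalIndexBytes are
hypothetical helpers, and the sketch assumes that register indices are
written as unsigned LEB128 values (7 payload bits per byte), so indices below
128 take one byte and larger ones take two or more.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Number of bytes needed to encode `value` as unsigned LEB128.
inline std::size_t uleb128Size(std::uint64_t value) {
  std::size_t size = 0;
  do {
    value >>= 7;
    ++size;
  } while (value != 0);
  return size;
}

// Given the use count of each register (indexed by its current number),
// returns a mapping old index -> new index that gives the most frequently
// used registers the smallest indices. Ties keep their original order.
inline std::vector<std::size_t>
renumberByUseCount(const std::vector<std::uint64_t> &useCounts) {
  std::vector<std::size_t> order(useCounts.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return useCounts[a] > useCounts[b];
                   });
  std::vector<std::size_t> newIndex(useCounts.size());
  for (std::size_t rank = 0; rank < order.size(); ++rank)
    newIndex[order[rank]] = rank;
  return newIndex;
}

// Total bytes spent on register indices under a given numbering.
inline std::uint64_t
totalIndexBytes(const std::vector<std::uint64_t> &useCounts,
                const std::vector<std::size_t> &newIndex) {
  assert(useCounts.size() == newIndex.size());
  std::uint64_t total = 0;
  for (std::size_t reg = 0; reg < useCounts.size(); ++reg)
    total += useCounts[reg] * uleb128Size(newIndex[reg]);
  return total;
}
```

For instance, with 200 registers where only register 199 is hot, moving that
register to index 0 shortens every one of its references from two bytes to
one.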

//===---------------------------------------------------------------------===//

Many cases of irreducible control flow could be transformed more optimally
than via the transform in WebAssemblyFixIrreducibleControlFlow.cpp.

It may also be worthwhile to do transforms before register coloring,
particularly when duplicating code, to allow register coloring to be aware of
the duplication.

//===---------------------------------------------------------------------===//

WebAssemblyRegStackify could use AliasAnalysis to reorder loads and stores more
aggressively.

//===---------------------------------------------------------------------===//

WebAssemblyRegStackify is currently a greedy algorithm. This means that, for
example, a binary operator will stackify with its user before its operands.
However, if moving the binary operator to its user moves it to a place where
its operands can't be moved to, it would be better to leave it in place, or
perhaps move it up, so that it can stackify its operands. A binary operator
has two operands and one result, so in such cases there could be a net win by
preferring the operands.

//===---------------------------------------------------------------------===//

Instruction ordering has a significant influence on register stackification and
coloring. Consider experimenting with the MachineScheduler (enable via
enableMachineScheduler) and determine if it can be configured to schedule
instructions advantageously for this purpose.

//===---------------------------------------------------------------------===//

WebAssembly is now officially a stack machine, rather than an AST, and this
comes with additional opportunities for WebAssemblyRegStackify. Specifically,
the stack doesn't need to be empty after an instruction with no return values.
WebAssemblyRegStackify could be extended, or possibly rewritten, to take
advantage of the new opportunities.

//===---------------------------------------------------------------------===//