======================
Using Polly with Clang
======================

This documentation discusses how Polly can be used in Clang to automatically
optimize C/C++ code during compilation.


.. warning::

  clang, LLVM, and Polly need to be in sync (compiled from the same SVN
  revision).

Make Polly available from Clang
===============================

Polly is available through clang, opt, and bugpoint, if Polly was checked out
into tools/polly before compilation. No further configuration is needed.

Optimizing with Polly
=====================

Optimizing with Polly is as easy as adding -O3 -mllvm -polly to your compiler
flags (Polly is only available at -O3).

.. code-block:: console

  clang -O3 -mllvm -polly file.c

Automatic OpenMP code generation
================================

To automatically detect parallel loops and generate OpenMP code for them, also
add -mllvm -polly-parallel -lgomp to your CFLAGS.

.. code-block:: console

  clang -O3 -mllvm -polly -mllvm -polly-parallel -lgomp file.c

Automatic Vector code generation
================================

Automatic vector code generation can be enabled by adding -mllvm
-polly-vectorizer=stripmine to your CFLAGS.

.. code-block:: console

  clang -O3 -mllvm -polly -mllvm -polly-vectorizer=stripmine file.c

Isolate the Polly passes
========================

Polly's analysis and transformation passes are run with many other
passes of the pass manager's pipeline. Some of the passes that run before
Polly are essential for it to work, for instance the canonicalization
of loops. Therefore, Polly is unable to optimize code straight out of
clang's -O0 output.

To get the LLVM-IR that Polly sees in the optimization pipeline, use the
command:

.. code-block:: console

  clang file.c -c -O3 -mllvm -polly -mllvm -polly-dump-before-file=before-polly.ll

This writes a file 'before-polly.ll' containing the LLVM-IR as passed to
Polly, after SSA transformation, loop canonicalization, inlining and
other passes.

Thereafter, any Polly pass can be run over 'before-polly.ll' using the
'opt' tool. To find out which Polly passes are active in the standard
pipeline, see the output of

.. code-block:: console

  clang file.c -c -O3 -mllvm -polly -mllvm -debug-pass=Arguments

Polly's passes are those between '-polly-detect' and
'-polly-codegen'. Analysis passes can be omitted. At the time of this
writing, the default Polly pass pipeline is:

.. code-block:: console

  opt before-polly.ll -polly-simplify -polly-optree -polly-delicm -polly-simplify -polly-prune-unprofitable -polly-opt-isl -polly-codegen

Note that this uses LLVM's old/legacy pass manager.

For completeness, here are some other methods that generate IR
suitable for processing with Polly from C/C++/Objective C source code.
The previous method is the recommended one.

The following generates unoptimized LLVM-IR ('-O0', which is the
default) and runs the canonicalizing passes on it
('-polly-canonicalize'). This does *not* include all the passes that run
before Polly in the default pass pipeline. The '-disable-O0-optnone'
option is required because otherwise clang adds an 'optnone' attribute
to all functions so that they are skipped by most optimization passes.
This is meant to stop LTO builds from optimizing these functions in the
linking phase anyway.

.. code-block:: console

  clang file.c -c -O0 -Xclang -disable-O0-optnone -emit-llvm -S -o - | opt -polly-canonicalize -S

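One way to check whether generated IR still carries the attribute is to search
its text. The following is only a sketch under stated assumptions: the IR was
written to a file (the hypothetical 'file.ll') instead of standard output, the
attribute appears in the textual IR as the word 'optnone', and the helper name
'has_optnone' is made up for illustration.

.. code-block:: bash

  # Hypothetical helper: succeed if the textual IR in $1 mentions optnone.
  has_optnone() {
    grep -q 'optnone' "$1"
  }

  # Report the result for an IR file, if one is present.
  if [ -f file.ll ]; then
    if has_optnone file.ll; then
      echo "file.ll: functions are marked optnone"
    else
      echo "file.ll: no optnone attribute found"
    fi
  fi

If the attribute is reported, most optimization passes will likely skip the
affected functions, as described above.
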
The option '-disable-llvm-passes' disables all LLVM passes, even those
that run at -O0. Passing -O1 (or any optimization level other than -O0)
prevents the 'optnone' attribute from being added.

.. code-block:: console

  clang file.c -c -O1 -Xclang -disable-llvm-passes -emit-llvm -S -o - | opt -polly-canonicalize -S

As another alternative, Polly can be pushed in front of the pass
pipeline, and then its output dumped. This implicitly runs the
'-polly-canonicalize' passes.

.. code-block:: console

  clang file.c -c -O3 -mllvm -polly -mllvm -polly-position=early -mllvm -polly-dump-before-file=before-polly.ll

Further options
===============
Polly supports further options that are mainly useful for the development or the
analysis of Polly. The relevant options can be added to clang by appending
-mllvm -option-name to the CFLAGS or the clang command line.

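Because every such option needs its own -mllvm prefix, longer command lines
become repetitive. As a minimal sketch, a small shell helper can add the
prefixes. The name 'polly_mllvm_flags' is hypothetical and not part of Polly or
clang, and the example only prints the resulting command line instead of
running it.

.. code-block:: bash

  # Hypothetical helper: put -mllvm in front of every option passed in.
  polly_mllvm_flags() {
    _flags=""
    for _opt in "$@"; do
      # Separate entries with a single space, without a leading space.
      _flags="${_flags:+$_flags }-mllvm $_opt"
    done
    printf '%s\n' "$_flags"
  }

  # Print (not run) a clang command that enables Polly for one function.
  echo "clang -O3 $(polly_mllvm_flags -polly -polly-only-func=functionname) file.c"

The same helper could be used when assembling CFLAGS, provided that none of the
options contains whitespace.
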
Limit Polly to a single function
--------------------------------

To limit the execution of Polly to a single function, use the option
-polly-only-func=functionname.

Disable LLVM-IR generation
--------------------------

Polly normally regenerates LLVM-IR from the polyhedral representation. To see
only the effects of the preparing transformations while disabling Polly's code
generation, add the option -polly-no-codegen.

Graphical view of the SCoPs
---------------------------
Polly can use graphviz to show the SCoPs it detects in a program. The relevant
options are -polly-show, -polly-show-only, -polly-dot and -polly-dot-only. The
'show' options automatically run dotty or another graphviz viewer to show the
SCoPs graphically. The 'dot' options store, for each function, a dot file that
highlights the detected SCoPs. If 'only' is appended at the end of the option,
the basic blocks are shown without the statements they contain.

Change/Disable the Optimizer
----------------------------

By default, Polly uses the isl scheduling optimizer. The isl optimizer optimizes
for data-locality and parallelism using the Pluto algorithm.
To disable the optimizer entirely, use the option -polly-optimizer=none.

Disable tiling in the optimizer
-------------------------------

By default, the optimizer performs tiling where possible. If this is not
wanted, use the option -polly-tiling=false to disable it.

Import / Export
---------------

The flags -polly-import and -polly-export allow the export and reimport of the
polyhedral representation. By exporting, modifying, and reimporting the
polyhedral representation, externally calculated transformations can be
applied. This enables external optimizers or the manual optimization of
specific SCoPs.

Viewing Polly Diagnostics with opt-viewer
-----------------------------------------

The flag -fsave-optimization-record generates .opt.yaml files when compiling
your program. These YAML files contain information about each emitted remark.
Ensure that you have Python 2.7 with the PyYAML and Pygments packages installed.
To run opt-viewer:

.. code-block:: console

  llvm/tools/opt-viewer/opt-viewer.py -source-dir /path/to/program/src/ \
     /path/to/program/src/foo.opt.yaml \
     /path/to/program/src/bar.opt.yaml \
     -o ./output

Include all YAML files (use \*.opt.yaml when specifying which YAML files to view)
to view all diagnostics from your program in opt-viewer. Compile with `PGO
<https://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation>`_ to view
hotness information in opt-viewer. The resulting HTML files can be viewed in a web browser.