Dan Gohman | dfecbe9 | 2010-02-17 22:47:06 +0000 | [diff] [blame] | 1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
| 2 | "http://www.w3.org/TR/html4/strict.dtd"> |
| 3 | <html> |
| 4 | <head> |
| 5 | <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| 6 | <title>The Revenge Of The Often Misunderstood GEP Instruction</title> |
| 7 | <link rel="stylesheet" href="llvm.css" type="text/css"> |
| 8 | <style type="text/css"> |
| 9 | TABLE { text-align: left; border: 1px solid black; border-collapse: collapse; margin: 0 0 0 0; } |
| 10 | </style> |
| 11 | </head> |
| 12 | <body> |
| 13 | |
| 14 | <div class="doc_title"> |
| 15 | The Revenge Of The Often Misunderstood GEP Instruction |
| 16 | </div> |
| 17 | |
| 18 | <!-- *********************************************************************** --> |
| 19 | <div class="doc_section"><a name="intro"><b>Introduction</b></a></div> |
| 20 | <!-- *********************************************************************** --> |
| 21 | <div class="doc_text"> |
| 22 | <p>GEP was mysterious and wily at first, but it turned out that the basic |
| 23 | workings were fairly comprehensible. However the dragon was merely subdued; |
| 24 | now it's back, and it has more fundamental complexity to confront. This |
| 25 | document seeks to uncover misunderstandings of the GEP operator that tend |
| 26 | to persist past initial confusion about the funky "extra 0" thing. Here we |
| 27 | show that the GEP instruction is really not quite as simple as it seems, |
| 28 | even after the initial confusion is overcome.</p> |
| 29 | </div> |
| 30 | |
| 31 | <!-- *********************************************************************** --> |
| 32 | <div class="doc_subsection"> |
| 33 | <a name="lead0"><b>How is GEP different from ptrtoint, arithmetic, |
| 34 | and inttoptr?</b></a> |
| 35 | </div> |
| 36 | <div class="doc_text"> |
| 37 | <p>It's very similar; there are only subtle differences.</p> |
| 38 | |
| 39 | <p>With ptrtoint, you have to pick an integer type. One approach is to pick i64; |
| 40 | this is safe on everything LLVM supports (LLVM internally assumes pointers |
| 41 | are never wider than 64 bits in many places), and the optimizer will actually |
| 42 | narrow the i64 arithmetic down to the actual pointer size on targets which |
| 43 | don't support 64-bit arithmetic in most cases. However, there are some cases |
| 44 | where it doesn't do this. With GEP you can avoid this problem. |
| 45 | |
| 46 | <p>Also, GEP carries additional pointer aliasing rules. It's invalid to take a |
| 47 | GEP from one object and address into a different separately allocated |
| 48 | object. IR producers (front-ends) must follow this rule, and consumers |
| 49 | (optimizers, specifically alias analysis) benefit from being able to rely |
| 50 | on it.</p> |
| 51 | |
| 52 | <p>And, GEP is more concise in common cases.</p> |
| 53 | |
| 54 | <p>However, for of the underlying integer computation implied, there |
| 55 | is no difference.</p> |
| 56 | |
| 57 | </div> |
| 58 | |
| 59 | <!-- *********************************************************************** --> |
| 60 | <div class="doc_subsection"> |
| 61 | <a name="lead0"><b>I'm writing a backend for a target which needs custom |
| 62 | lowering for GEP. How do I do this?</b></a> |
| 63 | </div> |
| 64 | <div class="doc_text"> |
| 65 | <p>You don't. The integer computation implied by a GEP is target-independent. |
| 66 | Typically what you'll need to do is make your backend pattern-match |
| 67 | expressions trees involving ADD, MUL, etc., which are what GEP is lowered |
| 68 | into. This has the advantage of letting your code work correctly in more |
| 69 | cases.</p> |
| 70 | |
| 71 | <p>GEP does use target-dependent parameters for the size and layout of data |
| 72 | types, which targets can customize.</p> |
| 73 | |
| 74 | <p>If you require support for addressing units which are not 8 bits, you'll |
| 75 | need to fix a lot of code in the backend, with GEP lowering being only a |
| 76 | small piece of the overall picture.</p> |
| 77 | |
| 78 | </div> |
| 79 | |
| 80 | <!-- *********************************************************************** --> |
| 81 | <div class="doc_subsection"> |
| 82 | <a name="lead0"><b>Why do struct member indices always use i32?</b></a> |
| 83 | </div> |
| 84 | <div class="doc_text"> |
| 85 | <p>The specific type i32 is probably just a historical artifact, however it's |
| 86 | wide enough for all practical purposes, so there's been no need to change it. |
| 87 | It doesn't necessarily imply i32 address arithmetic; it's just an identifier |
| 88 | which identifies a field in a struct. Requiring that all struct indices be |
| 89 | the same reduces the range of possibilities for cases where two GEPs are |
| 90 | effectively the same but have distinct operand types.</p> |
| 91 | |
| 92 | </div> |
| 93 | |
| 94 | <!-- *********************************************************************** --> |
| 95 | <div class="doc_subsection"> |
| 96 | <a name="lead0"><b>How does VLA addressing work with GEPs?</b></a> |
| 97 | </div> |
| 98 | <div class="doc_text"> |
| 99 | <p>GEPs don't natively support VLAs. LLVM's type system is entirely static, |
| 100 | and GEP address computations are guided by an LLVM type.</p> |
| 101 | |
| 102 | <p>VLA indices can be implemented as linearized indices. For example, an |
| 103 | expression like X[a][b][c], must be effectively lowered into a form |
| 104 | like X[a*m+b*n+c], so that it appears to the GEP as a single-dimensional |
| 105 | array reference.</p> |
| 106 | |
| 107 | <p>This means if you want to write an analysis which understands array |
| 108 | indices and you want to support VLAs, your code will have to be |
| 109 | prepared to reverse-engineer the linearization. One way to solve this |
| 110 | problem is to use the ScalarEvolution library, which always presents |
| 111 | VLA and non-VLA indexing in the same manner.</p> |
| 112 | |
| 113 | </div> |
| 114 | |
| 115 | <!-- *********************************************************************** --> |
| 116 | <div class="doc_subsection"> |
| 117 | <a name="lead0"><b>What happens if an array index is out of bounds?</b></a> |
| 118 | </div> |
| 119 | <div class="doc_text"> |
| 120 | <p>There are two senses in which an array index can be out of bounds.</p> |
| 121 | |
| 122 | <p>First, there's the array type which comes from the (static) type of |
| 123 | the first operand to the GEP. Indices greater than the number of elements |
| 124 | in the corresponding static array type are valid. There is no problem with |
| 125 | out of bounds indices in this sense. Indexing into an array only depends |
| 126 | on the size of the array element, not the number of elements.</p> |
| 127 | |
| 128 | <p>A common example of how this is used is arrays where the size is not known. |
| 129 | It's common to use array types with zero length to represent these. The |
| 130 | fact that the static type says there are zero elements is irrelevant; it's |
| 131 | perfectly valid to compute arbitrary element indices, as the computation |
| 132 | only depends on the size of the array element, not the number of |
| 133 | elements. Note that zero-sized arrays are not a special case here.</p> |
| 134 | |
| 135 | <p>This sense is unconnected with <tt>inbounds</tt> keyword. The |
| 136 | <tt>inbounds</tt> keyword is designed to describe low-level pointer |
| 137 | arithmetic overflow conditions, rather than high-level array |
| 138 | indexing rules. |
| 139 | |
| 140 | <p>Analysis passes which wish to understand array indexing should not |
| 141 | assume that the static array type bounds are respected.</p> |
| 142 | |
| 143 | <p>The second sense of being out of bounds is computing an address that's |
| 144 | beyond of the actual underlying allocated object.</p> |
| 145 | |
| 146 | <p>With the <tt>inbounds</tt> keyword, the result value of the GEP is |
| 147 | undefined if the address is outside the actual underlying allocated |
| 148 | object and not the address one-past-the-end.</p> |
| 149 | |
| 150 | <p>Without the <tt>inbounds</tt> keyword, there are no restrictions |
| 151 | on computing out-of-bounds addresses. Obviously, performing a load or |
| 152 | a store requires an address of allocated and sufficiently aligned |
| 153 | memory. But the GEP itself is only concerned with computing addresses.</p> |
| 154 | |
| 155 | </div> |
| 156 | |
| 157 | <!-- *********************************************************************** --> |
| 158 | <div class="doc_subsection"> |
| 159 | <a name="lead0"><b>Can array indices be negative?</b></a> |
| 160 | </div> |
| 161 | <div class="doc_text"> |
| 162 | <p>Yes. This is basically a special case of array indices being out |
| 163 | of bounds.</p> |
| 164 | |
| 165 | </div> |
| 166 | |
| 167 | <!-- *********************************************************************** --> |
| 168 | <div class="doc_subsection"> |
| 169 | <a name="lead0"><b>Can I compare two values computed with GEPs?</b></a> |
| 170 | </div> |
| 171 | <div class="doc_text"> |
| 172 | <p>Yes. If both addresses are within the same allocated object, or |
| 173 | one-past-the-end, you'll get the comparison result you expect. If either |
| 174 | is outside of it, integer arithmetic wrapping may occur, so the |
| 175 | comparison may not be meaningful.</p> |
| 176 | |
| 177 | </div> |
| 178 | |
| 179 | <!-- *********************************************************************** --> |
| 180 | <div class="doc_subsection"> |
| 181 | <a name="lead0"><b>Can I do GEP with a different pointer type than the type of |
| 182 | the underlying object?</b></a> |
| 183 | </div> |
| 184 | <div class="doc_text"> |
| 185 | <p>Yes. There are no restrictions on bitcasting a pointer value to an arbitrary |
| 186 | pointer type. The types in a GEP serve only to define the parameters for the |
| 187 | underlying integer computation. They need not correspond with the actual |
| 188 | type of the underlying object.</p> |
| 189 | |
| 190 | <p>Furthermore, loads and stores don't have to use the same types as the type |
| 191 | of the underlying object. Types in this context serve only to specify |
| 192 | memory size and alignment. Beyond that there are merely a hint to the |
| 193 | optimizer indicating how the value will likely be used.</p> |
| 194 | |
| 195 | </div> |
| 196 | |
| 197 | <!-- *********************************************************************** --> |
| 198 | <div class="doc_subsection"> |
| 199 | <a name="lead0"><b>Can I cast an object's address to integer and add it |
| 200 | to null?</b></a> |
| 201 | </div> |
| 202 | <div class="doc_text"> |
| 203 | <p>You can compute an address that way, but you can't use that pointer to |
| 204 | actually access the object if you do, unless the object is managed |
| 205 | outside of LLVM.</p> |
| 206 | |
| 207 | <p>The underlying integer computation is sufficiently defined; null has a |
| 208 | defined value -- zero -- and you can add whatever value you want to it.</p> |
| 209 | |
| 210 | <p>However, it's invalid to access (load from or store to) an LLVM-aware |
| 211 | object with such a pointer. This includes GlobalVariables, Allocas, and |
| 212 | objects pointed to by noalias pointers.</p> |
| 213 | |
| 214 | </div> |
| 215 | |
| 216 | <!-- *********************************************************************** --> |
| 217 | <div class="doc_subsection"> |
| 218 | <a name="lead0"><b>Can I compute the distance between two objects, and add |
| 219 | that value to one address to compute the other address?</b></a> |
| 220 | </div> |
| 221 | <div class="doc_text"> |
| 222 | <p>As with arithmetic on null, You can compute an address that way, but |
| 223 | you can't use that pointer to actually access the object if you do, |
| 224 | unless the object is managed outside of LLVM.</p> |
| 225 | |
| 226 | </div> |
| 227 | |
| 228 | <!-- *********************************************************************** --> |
| 229 | <div class="doc_subsection"> |
| 230 | <a name="lead0"><b>Can I do type-based alias analysis on LLVM IR?</b></a> |
| 231 | </div> |
| 232 | <div class="doc_text"> |
| 233 | <p>You can't do type-based alias analysis using LLVM's built-in type system, |
| 234 | because LLVM has no restrictions on mixing types in addressing, loads or |
| 235 | stores.</p> |
| 236 | |
| 237 | <p>It would be possible to add special annotations to the IR, probably using |
| 238 | metadata, to describe a different type system (such as the C type system), |
| 239 | and do type-based aliasing on top of that. This is a much bigger |
| 240 | undertaking though.</p> |
| 241 | |
| 242 | </div> |
| 243 | |
| 244 | <!-- *********************************************************************** --> |
| 245 | |
| 246 | <div class="doc_subsection"> |
| 247 | <a name="lead0"><b>What's an uglygep?</b></a> |
| 248 | </div> |
| 249 | <div class="doc_text"> |
| 250 | <p>Some LLVM optimizers operate on GEPs by internally lowering them into |
| 251 | more primitive integer expressions, which allows them to be combined |
| 252 | with other integer expressions and/or split into multiple separate |
| 253 | integer expressions. If they've made non-trivial changes, translating |
| 254 | back into LLVM IR can involve reverse-engineering the structure of |
| 255 | the addressing in order to fit it into the static type of the original |
| 256 | first operand. It isn't always possibly to fully reconstruct this |
| 257 | structure; sometimes the underlying addressing doesn't correspond with |
| 258 | the static type at all. In such cases the optimizer instead will emit |
| 259 | a GEP with the base pointer casted to a simple address-unit pointer, |
| 260 | using the name "uglygep". This isn't pretty, but it's just as |
| 261 | valid, and it's sufficient to preserve the pointer aliasing guarantees |
| 262 | that GEP provides.</p> |
| 263 | |
| 264 | </div> |
| 265 | |
| 266 | <!-- *********************************************************************** --> |
| 267 | |
| 268 | <div class="doc_subsection"> |
| 269 | <a name="lead0"><b>Can GEP index into vector elements?</b></a> |
| 270 | </div> |
| 271 | <div class="doc_text"> |
| 272 | <p>Sort of. This hasn't always been forcefully disallowed, though it's |
| 273 | not recommended. It leads to awkward special cases in the optimizers. |
| 274 | In the future, it may be outright disallowed.</p> |
| 275 | |
| 276 | <p>Instead, you should cast your pointer types and use arrays instead of |
| 277 | vectors for addressing. Arrays have the same in-memory representation |
| 278 | as vectors, so the addressing is interchangeable.</p> |
| 279 | |
| 280 | </div> |
| 281 | |
| 282 | <!-- *********************************************************************** --> |
| 283 | |
| 284 | <div class="doc_subsection"> |
| 285 | <a name="lead0"><b>Can GEP index into unions?</b></a> |
| 286 | </div> |
| 287 | <div class="doc_text"> |
| 288 | <p>Unknown.</p> |
| 289 | |
| 290 | </div> |
| 291 | |
| 292 | <!-- *********************************************************************** --> |
| 293 | |
| 294 | <div class="doc_subsection"> |
| 295 | <a name="lead0"><b>What happens if a GEP computation overflows?</b></a> |
| 296 | </div> |
| 297 | <div class="doc_text"> |
| 298 | <p>If the GEP has the <tt>inbounds</tt> keyword, the result value is |
| 299 | undefined.</p> |
| 300 | |
| 301 | <p>Otherwise, the result value is the result from evaluating the implied |
| 302 | two's complement integer computation. However, since there's no |
| 303 | guarantee of where an object will be allocated in the address space, |
| 304 | such values have limited meaning.</p> |
| 305 | |
| 306 | </div> |
| 307 | |
| 308 | <!-- *********************************************************************** --> |
| 309 | |
| 310 | <div class="doc_subsection"> |
| 311 | <a name="lead0"><b>What effect do address spaces have on GEPs?</b></a> |
| 312 | </div> |
| 313 | <div class="doc_text"> |
| 314 | <p>None, except that the address space qualifier on the first operand pointer |
| 315 | type always matches the address space qualifier on the result type.</p> |
| 316 | |
| 317 | </div> |
| 318 | |
| 319 | <!-- *********************************************************************** --> |
| 320 | |
| 321 | <div class="doc_subsection"> |
| 322 | <a name="lead0"><b>Why is GEP designed this way?</b></a> |
| 323 | </div> |
| 324 | <div class="doc_text"> |
| 325 | <p>The design of GEP has the following goals, in rough unofficial |
| 326 | order of priority:</p> |
Dan Gohman | 52658d1 | 2010-02-17 22:50:12 +0000 | [diff] [blame^] | 327 | <ol> |
| 328 | <li>Support C, C-like languages, and languages which can be |
| 329 | conceptually lowered into C (this covers a lot).</li> |
| 330 | <li>Support optimizations such as those that are common in |
| 331 | C compilers.</li> |
| 332 | <li>Provide a consistent method for computing addresses so that |
| 333 | address computations don't need to be a part of load and |
| 334 | store instructions in the IR.</li> |
| 335 | <li>Support non-C-like languages, to the extent that it doesn't |
| 336 | interfere with other goals.</li> |
| 337 | <li>Minimize target-specific information in the IR.</li> |
| 338 | </ol> |
Dan Gohman | dfecbe9 | 2010-02-17 22:47:06 +0000 | [diff] [blame] | 339 | </div> |
| 340 | |
| 341 | <!-- *********************************************************************** --> |
| 342 | |
| 343 | <hr> |
| 344 | <address> |
| 345 | <a href="http://jigsaw.w3.org/css-validator/check/referer"><img |
| 346 | src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> |
| 347 | <a href="http://validator.w3.org/check/referer"><img |
| 348 | src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> |
| 349 | <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br/> |
| 350 | Last modified: $Date$ |
| 351 | </address> |
| 352 | </body> |
| 353 | </html> |
| 354 | |