Target Independent Opportunities:

===-------------------------------------------------------------------------===

FreeBench/mason contains code like this:

static p_type m0u(p_type p) {
  int m[]={0, 8, 1, 2, 16, 5, 13, 7, 14, 9, 3, 4, 11, 12, 15, 10, 17, 6};
  p_type pu;
  pu.a = m[p.a];
  pu.b = m[p.b];
  pu.c = m[p.c];
  return pu;
}

We currently compile this into a memcpy from a static array into 'm', then
a bunch of loads from m.  It would be better to avoid the memcpy and just do
loads from the static array.
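
As a rough illustration of the desired result, here is a hand-written
source-level equivalent in which the table has static storage, so there is no
local copy to initialize. The p_type definition and the names m0u_table and
m0u_direct are assumptions made for this sketch; the real p_type from
FreeBench/mason is not shown in this note.

```c
/* Hypothetical stand-in for p_type; the real definition is not shown here. */
typedef struct { int a, b, c; } p_type;

/* Same contents as 'm' in m0u, but with static storage: no memcpy is needed
   to set it up on each call. */
static const int m0u_table[] = {0, 8, 1, 2, 16, 5, 13, 7, 14,
                                9, 3, 4, 11, 12, 15, 10, 17, 6};

/* Each field is loaded straight from the static array. */
static p_type m0u_direct(p_type p) {
  p_type pu;
  pu.a = m0u_table[p.a];
  pu.b = m0u_table[p.b];
  pu.c = m0u_table[p.c];
  return pu;
}
```

The optimizer would ideally reach this form on its own, since 'm' is never
written after initialization and never escapes the function.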

===-------------------------------------------------------------------------===

Get the C front-end to expand hypot(x,y) -> llvm.sqrt(x*x+y*y) when errno and
precision don't matter (-ffast-math).  Misc/mandel will like this. :)

===-------------------------------------------------------------------------===

For all targets, not just X86:
When llvm.memcpy, llvm.memset, or llvm.memmove are lowered, they should be
optimized to a few store instructions if the source is constant and the length
is smallish (< 8). This will greatly help some tests like Shootout/strcat.c
and fldry.
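
The effect can be sketched at the source level as below. Both function names
are made up for the illustration. The second function writes one byte per
store to stay independent of endianness; an actual lowering would presumably
pick wider stores where alignment allows.

```c
#include <string.h>

/* What the source asks for: a copy of a short constant string. */
static void copy_with_memcpy(char *dst) {
  memcpy(dst, "abc", 4);  /* 3 characters plus the terminating NUL */
}

/* What the lowering could produce instead: a few explicit stores of
   known constants, with no call and no load from the source. */
static void copy_with_stores(char *dst) {
  dst[0] = 'a';
  dst[1] = 'b';
  dst[2] = 'c';
  dst[3] = '\0';
}
```

Both functions leave the same four bytes in dst.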

//===---------------------------------------------------------------------===//

Solve this DAG isel folding deficiency:

int X, Y;

void fn1(void)
{
  X = X | (Y << 3);
}

compiles to

fn1:
        movl Y, %eax
        shll $3, %eax
        orl X, %eax
        movl %eax, X
        ret

The problem is the store's chain operand is not the load X but rather
a TokenFactor of the load X and load Y, which prevents the folding.

There are two ways to fix this:

1. The dag combiner can start using alias analysis to realize that X and Y
   don't alias, making the store to X not dependent on the load from Y.
2. The generated isel could be made smarter in the case it can't
   disambiguate the pointers.

Number 1 is the preferred solution.
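
For reference, the comment in the sketch below shows the kind of output one
would hope for once the load of X and the store to X can be folded into a
single read-modify-write instruction. The assembly in the comment is an
assumption about the intended result, not output taken from the compiler.

```c
int X, Y;

/* Same function as above. Hoped-for X86 code once the fold works
   (an assumption, not measured output):

     fn1:
             movl Y, %eax
             shll $3, %eax
             orl %eax, X
             ret

   The 'orl %eax, X' form does the load of X, the OR, and the store to X
   in one instruction, saving the separate movl back to X. */
void fn1(void)
{
  X = X | (Y << 3);
}
```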

//===---------------------------------------------------------------------===//

| 67 | |