Chris Lattner | 0095054 | 2001-06-06 20:29:01 +0000

Ok, here are my comments and suggestions about the LLVM instruction set.
We should discuss some now, but can discuss many of them later, when we
revisit synchronization, type inference, and other issues.
(We have discussed some of the comments already.)


o  We should consider eliminating the type annotation in cases where it is
   essentially obvious from the instruction type, e.g., in br, it is obvious
   that the first arg. should be a bool and the other args should be labels:

        br bool <cond>, label <iftrue>, label <iffalse>

   I think your point was that making all types explicit improves clarity
   and readability.  I agree to some extent, but it also comes at the cost
   of verbosity.  And when the types are obvious from people's experience
   (e.g., in the br instruction), it doesn't seem to help as much.


o  On reflection, I really like your idea of having the two different switch
   types (even though they encode implementation techniques rather than
   semantics).  It should simplify building the CFG and my guess is it could
   enable some significant optimizations, though we should think about which.


o  In the lookup-indirect form of the switch, is there a reason not to make
   the val-type uint?  Most HLL switch statements (including Java and C++)
   require that anyway.  And it would also make the val-type uniform
   in the two forms of the switch.

   I did see the switch-on-bool examples and, while cute, we can just use
   the branch instructions in that particular case.
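
   To make the two switch forms concrete, here is a rough C++ sketch of the
   two implementation techniques as I understand them: a directly indexed
   table for dense case values and a sorted lookup table for sparse ones.
   Both take an unsigned value, matching the uint suggestion.  All names
   (dispatch_indexed, dispatch_lookup, Case, Target) are hypothetical and
   are not part of LLVM; targets are plain integers standing in for labels.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical illustration only: a Target is a small integer standing in
// for a basic-block label.
using Target = int;
constexpr Target kDefault = -1;

// Indexed form: the value selects a table slot directly (dense case values).
inline Target dispatch_indexed(std::uint32_t val) {
    static constexpr std::array<Target, 4> table = {{10, 11, 12, 13}};
    return val < table.size() ? table[val] : kDefault;
}

// Lookup form: (value, target) pairs sorted by value and searched at run
// time (sparse case values).
struct Case {
    std::uint32_t value;
    Target target;
};

inline Target dispatch_lookup(std::uint32_t val) {
    static constexpr std::array<Case, 3> cases = {{{5, 20}, {100, 21}, {4000, 22}}};
    // Binary search for the first entry whose value is not less than val.
    auto it = std::lower_bound(
        cases.begin(), cases.end(), val,
        [](const Case& c, std::uint32_t v) { return c.value < v; });
    return (it != cases.end() && it->value == val) ? it->target : kDefault;
}
```

   With a uniform unsigned val-type, the same value can be handed to either
   form, and only the table layout differs.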


o  I agree with your comment that we don't need 'neg'.


o  There's a trade-off with the cast instruction:
   +  it avoids having to define all the upcasts and downcasts that are
      valid for the operands of each instruction (you probably have thought
      of other benefits also)
   -  it could make the bytecode significantly larger because there could
      be a lot of cast operations


o  Making the second arg. to 'shl' a ubyte seems good enough to me.
   255 positions seems adequate for several generations of machines
   and is more compact than uint.
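
   As a quick sanity check of the ubyte argument, here is a minimal C++
   sketch, assuming ubyte corresponds to an unsigned 8-bit integer.  The
   helper name shl_u64 is hypothetical.  Even for a 64-bit operand, every
   meaningful shift amount (0 through 63) fits comfortably in one byte.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// One byte can express shift amounts 0..255.
static_assert(std::numeric_limits<std::uint8_t>::max() == 255,
              "an unsigned byte holds values up to 255");

// Shift a 64-bit value left by an amount carried in a single byte.
// Precondition (an assumption of this sketch): amount < 64, because
// shifting by the operand width or more is undefined behavior in C++.
inline std::uint64_t shl_u64(std::uint64_t value, std::uint8_t amount) {
    assert(amount < 64);
    return value << amount;
}
```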


o  I still have some major concerns about including malloc and free in the
   language (either as builtin functions or instructions).  LLVM must be
   able to represent code from many different languages.  Languages such as
   C, C++, Java, and Fortran 90 would not be able to use our malloc anyway
   because each of them will want to provide a library implementation of it.

   This gets even worse when code from different languages is linked
   into a single executable (which is fairly common in large apps).
   Having a single malloc would just not suffice, and instead would simply
   complicate the picture further because it adds an extra variant in
   addition to the one each language provides.

   Instead, providing a default library version of malloc and free
   (and perhaps a malloc_gc with garbage collection instead of free)
   would make a good implementation available to anyone who wants it.

   I don't recall all your arguments in favor so let's discuss this again,
   and soon.


o  'alloca' on the other hand sounds like a good idea, and the
   implementation seems fairly language-independent so it doesn't have the
   problems with malloc listed above.


o  About indirect call:
   Your option #2 sounded good to me.  I'm not sure I understand your
   concern about an explicit 'icall' instruction?


o  A pair of important synchronization instructions to think about:
        load-linked
        store-conditional
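
   For reference, the usual use of such a pair is a retry loop.  The sketch
   below expresses that loop with standard C++ atomics (which postdate this
   note) rather than with any LLVM instruction; the function name
   fetch_add_retry is hypothetical.  compare_exchange_weak is allowed to
   fail spuriously, which is what lets compilers map it onto a
   load-linked/store-conditional pair on machines that provide one.

```cpp
#include <atomic>

// Atomically add `delta` to `counter` and return the previous value.
inline int fetch_add_retry(std::atomic<int>& counter, int delta) {
    // Plays the role of the load-linked step: read the current value.
    int observed = counter.load(std::memory_order_relaxed);
    // Plays the role of the store-conditional step: the store succeeds only
    // if the location still holds `observed`; otherwise retry.
    while (!counter.compare_exchange_weak(observed, observed + delta,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
        // On failure, `observed` has been refreshed with the current value.
    }
    return observed;
}
```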


o  Other classes of instructions that are valuable for pipeline performance:
        conditional-move
        predicated instructions
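
   To illustrate what a conditional move buys, here is a small C++ sketch
   of the same selection written with and without a branch.  The names
   select_branch and select_mask are hypothetical, and whether a given
   compiler emits a branch or a conditional move for either form is up to
   that compiler; the point is only that the second form has no control
   dependence in the source.

```cpp
#include <cstdint>

// Branching form: control flow depends on cond.
inline std::int32_t select_branch(bool cond, std::int32_t a, std::int32_t b) {
    if (cond) return a;
    return b;
}

// Branch-free form: build an all-ones or all-zeros mask from cond and blend
// the two inputs, which is the effect a conditional move has in hardware.
// The final conversion back to a signed value assumes two's complement
// (guaranteed since C++20, implementation-defined before that).
inline std::int32_t select_mask(bool cond, std::int32_t a, std::int32_t b) {
    const std::uint32_t mask = 0u - static_cast<std::uint32_t>(cond);
    const std::uint32_t ua = static_cast<std::uint32_t>(a);
    const std::uint32_t ub = static_cast<std::uint32_t>(b);
    return static_cast<std::int32_t>((ua & mask) | (ub & ~mask));
}
```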


o  I believe tail calls are relatively easy to identify; do you know why
   .NET has a tailcall instruction?
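
   As a reminder of what "easy to identify" means here, a call is in tail
   position when its result is returned unchanged and nothing else remains
   to be done in the caller.  A minimal C++ sketch with hypothetical
   function names follows; whether a particular compiler actually reuses
   the stack frame for the second form is a separate question.

```cpp
#include <cstdint>

// Not a tail call: the multiplication runs after the recursive call returns.
inline std::uint64_t factorial_plain(std::uint64_t n) {
    return n <= 1 ? 1 : n * factorial_plain(n - 1);
}

// Tail call: the recursive call's result is returned as-is, so the call is
// the last action of the function and can be recognized syntactically.
inline std::uint64_t factorial_acc(std::uint64_t n, std::uint64_t acc = 1) {
    return n <= 1 ? acc : factorial_acc(n - 1, acc * n);
}
```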


o  I agree that we need a static data space.  Otherwise, emulating global
   data gets unnecessarily complex.


o  About explicit parallelism:

   We once talked about adding a symbolic thread-id field to each
   instruction.  (It could be optional so single-threaded codes are
   not penalized.)  This could map well to multi-threaded architectures
   while providing easy ILP for single-threaded ones.  But it is probably
   too radical an idea to include in a base version of LLVM.  Instead, it
   could be a great topic for a separate study.

   What are the semantics of the IA64 stop bit?


o  And finally, another thought about the syntax for arrays :-)

   Although this syntax:
        array <dimension-list> of <type>
   is verbose, it will be used only in the human-readable assembly code so
   size should not matter.  I think we should consider it because I find it
   to be the clearest syntax.  It could even make arrays of function
   pointers somewhat readable.
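
   For comparison, this is what an array of function pointers looks like in
   C-style declarator syntax, next to a spelling that names the element
   type first and so reads left to right, closer in spirit to
   "array ... of <type>".  This is a C++ sketch only; the names add_one,
   twice, table_c, IntFn and table_alias are hypothetical.

```cpp
#include <array>

inline int add_one(int x) { return x + 1; }
inline int twice(int x) { return 2 * x; }

// C declarator syntax, read from the inside out: table_c is an array of
// 2 const pointers to functions taking int and returning int.
static int (*const table_c[2])(int) = {add_one, twice};

// Naming the element type first gives "array of 2 IntFn".
using IntFn = int (*)(int);
static const std::array<IntFn, 2> table_alias = {{add_one, twice}};
```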