Chris Lattner | 0095054 | 2001-06-06 20:29:01 +0000 | [diff] [blame] | 1 | Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST) |
| 2 | From: Chris Lattner <sabre@nondot.org> |
| 3 | To: Vikram Adve <vadve@cs.uiuc.edu> |
| 4 | Subject: Re: a few thoughts |
| 5 | |
| 6 | Okay... here are a few of my thoughts on this (it's good to know that we |
| 7 | think so alike!): |
| 8 | |
| 9 | > 1. We need to be clear on our goals for the VM. Do we want to emphasize |
| 10 | > portability and safety like the Java VM? Or shall we focus on the |
| 11 | > architecture interface first (i.e., consider the code generation and |
| 12 | > processor issues), since the architecture interface question is also |
| 13 | > important for portable Java-type VMs? |
| 14 | |
| 15 | I forsee the architecture looking kinda like this: (which is completely |
| 16 | subject to change) |
| 17 | |
| 18 | 1. The VM code is NOT guaranteed safe in a java sense. Doing so makes it |
| 19 | basically impossible to support C like languages. Besides that, |
| 20 | certifying a register based language as safe at run time would be a |
| 21 | pretty expensive operation to have to do. Additionally, we would like |
| 22 | to be able to statically eliminate many bounds checks in Java |
| 23 | programs... for example. |
| 24 | |
| 25 | 2. Instead, we can do the following (eventually): |
| 26 | * Java bytecode is used as our "safe" representation (to avoid |
| 27 | reinventing something that we don't add much value to). When the |
| 28 | user chooses to execute Java bytecodes directly (ie, not |
| 29 | precompiled) the runtime compiler can do some very simple |
| 30 | transformations (JIT style) to convert it into valid input for our |
| 31 | VM. Performance is not wonderful, but it works right. |
| 32 | * The file is scheduled to be compiled (rigorously) at a later |
| 33 | time. This could be done by some background process or by a second |
| 34 | processor in the system during idle time or something... |
| 35 | * To keep things "safe" ie to enforce a sandbox on Java/foreign code, |
| 36 | we could sign the generated VM code with a host specific private |
| 37 | key. Then before the code is executed/loaded, we can check to see if |
| 38 | the trusted compiler generated the code. This would be much quicker |
| 39 | than having to validate consistency (especially if bounds checks have |
| 40 | been removed, for example) |
| 41 | |
| 42 | > This is important because the audiences for these two goals are very |
| 43 | > different. Architects and many compiler people care much more about |
| 44 | > the second question. The Java compiler and OS community care much more |
| 45 | > about the first one. |
| 46 | |
| 47 | 3. By focusing on a more low level virtual machine, we have much more room |
| 48 | for value add. The nice safe "sandbox" VM can be provided as a layer |
| 49 | on top of it. It also lets us focus on the more interesting compilers |
| 50 | related projects. |
| 51 | |
| 52 | > 2. Design issues to consider (an initial list that we should continue |
| 53 | > to modify). Note that I'm not trying to suggest actual solutions here, |
| 54 | > but just various directions we can pursue: |
| 55 | |
| 56 | Understood. :) |
| 57 | |
| 58 | > a. A single-assignment VM, which we've both already been thinking |
| 59 | > about. |
| 60 | |
| 61 | Yup, I think that this makes a lot of sense. I am still intrigued, |
| 62 | however, by the prospect of a minimally allocated VM representation... I |
| 63 | think that it could have definate advantages for certain applications |
| 64 | (think very small machines, like PDAs). I don't, however, think that our |
| 65 | initial implementations should focus on this. :) |
| 66 | |
| 67 | Here are some other auxilliary goals that I think we should consider: |
| 68 | |
| 69 | 1. Primary goal: Support a high performance dynamic compilation |
| 70 | system. This means that we have an "ideal" division of labor between |
| 71 | the runtime and static compilers. Of course, the other goals of the |
| 72 | system somewhat reduce the importance of this point (f.e. portability |
| 73 | reduces performance, but hopefully not much) |
| 74 | 2. Portability to different processors. Since we are most familiar with |
| 75 | x86 and solaris, I think that these two are excellent candidates when |
| 76 | we get that far... |
| 77 | 3. Support for all languages & styles of programming (general purpose |
| 78 | VM). This is the point that disallows java style bytecodes, where all |
| 79 | array refs are checked for bounds, etc... |
| 80 | 4. Support linking between different language families. For example, call |
| 81 | C functions directly from Java without using the nasty/slow/gross JNI |
| 82 | layer. This involves several subpoints: |
| 83 | A. Support for languages that require garbage collectors and integration |
| 84 | with languages that don't. As a base point, we could insist on |
| 85 | always using a conservative GC, but implement free as a noop, f.e. |
| 86 | |
| 87 | > b. A strongly-typed VM. One question is do we need the types to be |
| 88 | > explicitly declared or should they be inferred by the dynamic |
| 89 | > compiler? |
| 90 | |
| 91 | B. This is kind of similar to another idea that I have: make OOP |
| 92 | constructs (virtual function tables, class heirarchies, etc) explicit |
| 93 | in the VM representation. I believe that the number of additional |
| 94 | constructs would be fairly low, but would give us lots of important |
| 95 | information... something else that would/could be important is to |
| 96 | have exceptions as first class types so that they would be handled in |
| 97 | a uniform way for the entire VM... so that C functions can call Java |
| 98 | functions for example... |
| 99 | |
| 100 | > c. How do we get more high-level information into the VM while keeping |
| 101 | > to a low-level VM design? |
| 102 | > o Explicit array references as operands? An alternative is |
| 103 | > to have just an array type, and let the index computations be |
| 104 | > separate 3-operand instructions. |
| 105 | |
| 106 | C. In the model I was thinking of (subject to change of course), we |
| 107 | would just have an array type (distinct from the pointer |
| 108 | types). This would allow us to have arbitrarily complex index |
| 109 | expressions, while still distinguishing "load" from "Array load", |
| 110 | for example. Perhaps also, switch jump tables would be first class |
| 111 | types as well? This would allow better reasoning about the program. |
| 112 | |
| 113 | 5. Support dynamic loading of code from various sources. Already |
| 114 | mentioned above was the example of loading java bytecodes, but we want |
| 115 | to support dynamic loading of VM code as well. This makes the job of |
| 116 | the runtime compiler much more interesting: it can do interprocedural |
| 117 | optimizations that the static compiler can't do, because it doesn't |
| 118 | have all of the required information (for example, inlining from |
| 119 | shared libraries, etc...) |
| 120 | |
| 121 | 6. Define a set of generally useful annotations to add to the VM |
| 122 | representation. For example, a function can be analysed to see if it |
| 123 | has any sideeffects when run... also, the MOD/REF sets could be |
| 124 | calculated, etc... we would have to determine what is reasonable. This |
| 125 | would generally be used to make IP optimizations cheaper for the |
| 126 | runtime compiler... |
| 127 | |
| 128 | > o Explicit instructions to handle aliasing, e.g.s: |
| 129 | > -- an instruction to say "I speculate that these two values are not |
| 130 | > aliased, but check at runtime", like speculative execution in |
| 131 | > EPIC? |
| 132 | > -- or an instruction to check whether two values are aliased and |
| 133 | > execute different code depending on the answer, somewhat like |
| 134 | > predicated code in EPIC |
| 135 | |
| 136 | These are also very good points... if this can be determined at compile |
| 137 | time. I think that an epic style of representation (not the instruction |
| 138 | packing, just the information presented) could be a very interesting model |
| 139 | to use... more later... |
| 140 | |
| 141 | > o (This one is a difficult but powerful idea.) |
| 142 | > A "thread-id" field on every instruction that allows the static |
| 143 | > compiler to generate a set of parallel threads, and then have |
| 144 | > the runtime compiler and hardware do what they please with it. |
| 145 | > This has very powerful uses, but thread-id on every instruction |
| 146 | > is expensive in terms of instruction size and code size. |
| 147 | > We would need to compactly encode it somehow. |
| 148 | |
| 149 | Yes yes yes! :) I think it would be *VERY* useful to include this kind |
| 150 | of information (which EPIC architectures *implicitly* encode. The trend |
| 151 | that we are seeing supports this greatly: |
| 152 | |
| 153 | 1. Commodity processors are getting massive SIMD support: |
| 154 | * Intel/Amd MMX/MMX2 |
| 155 | * AMD's 3Dnow! |
| 156 | * Intel's SSE/SSE2 |
| 157 | * Sun's VIS |
| 158 | 2. SMP is becoming much more common, especially in the server space. |
| 159 | 3. Multiple processors on a die are right around the corner. |
| 160 | |
| 161 | If nothing else, not designing this in would severely limit our future |
| 162 | expansion of the project... |
| 163 | |
| 164 | > Also, this will require some reading on at least two other |
| 165 | > projects: |
| 166 | > -- Multiscalar architecture from Wisconsin |
| 167 | > -- Simultaneous multithreading architecture from Washington |
| 168 | > |
| 169 | > o Or forget all this and stick to a traditional instruction set? |
| 170 | |
| 171 | Heh... :) Well, from a pure research point of view, it is almost more |
| 172 | attactive to go with the most extreme/different ISA possible. On one axis |
| 173 | you get safety and conservatism, and on the other you get degree of |
| 174 | influence that the results have. Of course the problem with pure research |
| 175 | is that often times there is no concrete product of the research... :) |
| 176 | |
| 177 | > BTW, on an unrelated note, after the meeting yesterday, I did remember |
| 178 | > that you had suggested doing instruction scheduling on SSA form instead |
| 179 | > of a dependence DAG earlier in the semester. When we talked about |
| 180 | > it yesterday, I didn't remember where the idea had come from but I |
| 181 | > remembered later. Just giving credit where its due... |
| 182 | |
| 183 | :) Thanks. |
| 184 | |
| 185 | > Perhaps you can save the above as a file under RCS so you and I can |
| 186 | > continue to expand on this. |
| 187 | |
| 188 | I think it makes sense to do so when we get our ideas more formalized and |
| 189 | bounce it back and forth a couple of times... then I'll do a more formal |
| 190 | writeup of our goals and ideas. Obviously our first implementation will |
| 191 | not want to do all of the stuff that I pointed out above... be we will |
| 192 | want to design the project so that we do not artificially limit ourselves |
| 193 | at sometime in the future... |
| 194 | |
| 195 | Anyways, let me know what you think about these ideas... and if they sound |
| 196 | reasonable... |
| 197 | |
| 198 | -Chris |
| 199 | |