| Chris Lattner | 2f7c963 | 2001-06-06 20:29:01 +0000 | [diff] [blame] | 1 | Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST) | 
|  | 2 | From: Chris Lattner <sabre@nondot.org> | 
|  | 3 | To: Vikram Adve <vadve@cs.uiuc.edu> | 
|  | 4 | Subject: Re: a few thoughts | 
|  | 5 |  | 
|  | 6 | Okay... here are a few of my thoughts on this (it's good to know that we | 
|  | 7 | think so alike!): | 
|  | 8 |  | 
|  | 9 | > 1. We need to be clear on our goals for the VM.  Do we want to emphasize | 
|  | 10 | >    portability and safety like the Java VM?  Or shall we focus on the | 
|  | 11 | >    architecture interface first (i.e., consider the code generation and | 
|  | 12 | >    processor issues), since the architecture interface question is also | 
|  | 13 | >    important for portable Java-type VMs? | 
|  | 14 |  | 
|  | 15 | I forsee the architecture looking kinda like this: (which is completely | 
|  | 16 | subject to change) | 
|  | 17 |  | 
|  | 18 | 1. The VM code is NOT guaranteed safe in a java sense.  Doing so makes it | 
|  | 19 | basically impossible to support C like languages.  Besides that, | 
|  | 20 | certifying a register based language as safe at run time would be a | 
|  | 21 | pretty expensive operation to have to do.  Additionally, we would like | 
|  | 22 | to be able to statically eliminate many bounds checks in Java | 
|  | 23 | programs... for example. | 
|  | 24 |  | 
|  | 25 | 2. Instead, we can do the following (eventually): | 
|  | 26 | * Java bytecode is used as our "safe" representation (to avoid | 
|  | 27 | reinventing something that we don't add much value to).  When the | 
|  | 28 | user chooses to execute Java bytecodes directly (ie, not | 
|  | 29 | precompiled) the runtime compiler can do some very simple | 
|  | 30 | transformations (JIT style) to convert it into valid input for our | 
|  | 31 | VM.  Performance is not wonderful, but it works right. | 
|  | 32 | * The file is scheduled to be compiled (rigorously) at a later | 
|  | 33 | time.  This could be done by some background process or by a second | 
|  | 34 | processor in the system during idle time or something... | 
|  | 35 | * To keep things "safe" ie to enforce a sandbox on Java/foreign code, | 
|  | 36 | we could sign the generated VM code with a host specific private | 
|  | 37 | key.  Then before the code is executed/loaded, we can check to see if | 
|  | 38 | the trusted compiler generated the code.  This would be much quicker | 
|  | 39 | than having to validate consistency (especially if bounds checks have | 
|  | 40 | been removed, for example) | 
|  | 41 |  | 
|  | 42 | >    This is important because the audiences for these two goals are very | 
|  | 43 | >    different.  Architects and many compiler people care much more about | 
|  | 44 | >    the second question.  The Java compiler and OS community care much more | 
|  | 45 | >    about the first one. | 
|  | 46 |  | 
|  | 47 | 3. By focusing on a more low level virtual machine, we have much more room | 
|  | 48 | for value add.  The nice safe "sandbox" VM can be provided as a layer | 
|  | 49 | on top of it.  It also lets us focus on the more interesting compilers | 
|  | 50 | related projects. | 
|  | 51 |  | 
|  | 52 | > 2. Design issues to consider (an initial list that we should continue | 
|  | 53 | >    to modify).  Note that I'm not trying to suggest actual solutions here, | 
|  | 54 | >    but just various directions we can pursue: | 
|  | 55 |  | 
|  | 56 | Understood.  :) | 
|  | 57 |  | 
|  | 58 | >    a. A single-assignment VM, which we've both already been thinking | 
|  | 59 | >       about. | 
|  | 60 |  | 
|  | 61 | Yup, I think that this makes a lot of sense.  I am still intrigued, | 
|  | 62 | however, by the prospect of a minimally allocated VM representation... I | 
| Chris Lattner | 0ab5e2c | 2011-04-15 05:18:47 +0000 | [diff] [blame] | 63 | think that it could have definite advantages for certain applications | 
| Chris Lattner | 2f7c963 | 2001-06-06 20:29:01 +0000 | [diff] [blame] | 64 | (think very small machines, like PDAs).  I don't, however, think that our | 
|  | 65 | initial implementations should focus on this.  :) | 
|  | 66 |  | 
| Chris Lattner | 0ab5e2c | 2011-04-15 05:18:47 +0000 | [diff] [blame] | 67 | Here are some other auxiliary goals that I think we should consider: | 
| Chris Lattner | 2f7c963 | 2001-06-06 20:29:01 +0000 | [diff] [blame] | 68 |  | 
|  | 69 | 1. Primary goal: Support a high performance dynamic compilation | 
|  | 70 | system.  This means that we have an "ideal" division of labor between | 
|  | 71 | the runtime and static compilers.  Of course, the other goals of the | 
|  | 72 | system somewhat reduce the importance of this point (f.e. portability | 
|  | 73 | reduces performance, but hopefully not much) | 
|  | 74 | 2. Portability to different processors.  Since we are most familiar with | 
|  | 75 | x86 and solaris, I think that these two are excellent candidates when | 
|  | 76 | we get that far... | 
|  | 77 | 3. Support for all languages & styles of programming (general purpose | 
|  | 78 | VM).  This is the point that disallows java style bytecodes, where all | 
|  | 79 | array refs are checked for bounds, etc... | 
|  | 80 | 4. Support linking between different language families.  For example, call | 
|  | 81 | C functions directly from Java without using the nasty/slow/gross JNI | 
|  | 82 | layer.  This involves several subpoints: | 
|  | 83 | A. Support for languages that require garbage collectors and integration | 
|  | 84 | with languages that don't.  As a base point, we could insist on | 
|  | 85 | always using a conservative GC, but implement free as a noop, f.e. | 
|  | 86 |  | 
|  | 87 | >    b. A strongly-typed VM.  One question is do we need the types to be | 
|  | 88 | >       explicitly declared or should they be inferred by the dynamic | 
|  | 89 | >       compiler? | 
|  | 90 |  | 
|  | 91 | B. This is kind of similar to another idea that I have: make OOP | 
|  | 92 | constructs (virtual function tables, class heirarchies, etc) explicit | 
|  | 93 | in the VM representation.  I believe that the number of additional | 
|  | 94 | constructs would be fairly low, but would give us lots of important | 
|  | 95 | information... something else that would/could be important is to | 
|  | 96 | have exceptions as first class types so that they would be handled in | 
|  | 97 | a uniform way for the entire VM... so that C functions can call Java | 
|  | 98 | functions for example... | 
|  | 99 |  | 
|  | 100 | >    c. How do we get more high-level information into the VM while keeping | 
|  | 101 | >       to a low-level VM design? | 
|  | 102 | >       o  Explicit array references as operands?  An alternative is | 
|  | 103 | >          to have just an array type, and let the index computations be | 
|  | 104 | >          separate 3-operand instructions. | 
|  | 105 |  | 
|  | 106 | C. In the model I was thinking of (subject to change of course), we | 
|  | 107 | would just have an array type (distinct from the pointer | 
|  | 108 | types).  This would allow us to have arbitrarily complex index | 
|  | 109 | expressions, while still distinguishing "load" from "Array load", | 
|  | 110 | for example.  Perhaps also, switch jump tables would be first class | 
|  | 111 | types as well?  This would allow better reasoning about the program. | 
|  | 112 |  | 
|  | 113 | 5. Support dynamic loading of code from various sources.  Already | 
|  | 114 | mentioned above was the example of loading java bytecodes, but we want | 
|  | 115 | to support dynamic loading of VM code as well.  This makes the job of | 
|  | 116 | the runtime compiler much more interesting:  it can do interprocedural | 
|  | 117 | optimizations that the static compiler can't do, because it doesn't | 
|  | 118 | have all of the required information (for example, inlining from | 
|  | 119 | shared libraries, etc...) | 
|  | 120 |  | 
|  | 121 | 6. Define a set of generally useful annotations to add to the VM | 
|  | 122 | representation.  For example, a function can be analysed to see if it | 
|  | 123 | has any sideeffects when run... also, the MOD/REF sets could be | 
|  | 124 | calculated, etc... we would have to determine what is reasonable.  This | 
|  | 125 | would generally be used to make IP optimizations cheaper for the | 
|  | 126 | runtime compiler... | 
|  | 127 |  | 
|  | 128 | >       o  Explicit instructions to handle aliasing, e.g.s: | 
|  | 129 | >            -- an instruction to say "I speculate that these two values are not | 
|  | 130 | >               aliased, but check at runtime", like speculative execution in | 
|  | 131 | >             EPIC? | 
|  | 132 | >          -- or an instruction to check whether two values are aliased and | 
|  | 133 | >             execute different code depending on the answer, somewhat like | 
|  | 134 | >             predicated code in EPIC | 
|  | 135 |  | 
|  | 136 | These are also very good points... if this can be determined at compile | 
|  | 137 | time.  I think that an epic style of representation (not the instruction | 
|  | 138 | packing, just the information presented) could be a very interesting model | 
|  | 139 | to use... more later... | 
|  | 140 |  | 
|  | 141 | >         o  (This one is a difficult but powerful idea.) | 
|  | 142 | >          A "thread-id" field on every instruction that allows the static | 
|  | 143 | >          compiler to generate a set of parallel threads, and then have | 
|  | 144 | >          the runtime compiler and hardware do what they please with it. | 
|  | 145 | >          This has very powerful uses, but thread-id on every instruction | 
|  | 146 | >          is expensive in terms of instruction size and code size. | 
|  | 147 | >          We would need to compactly encode it somehow. | 
|  | 148 |  | 
|  | 149 | Yes yes yes!  :)  I think it would be *VERY* useful to include this kind | 
|  | 150 | of information (which EPIC architectures *implicitly* encode.  The trend | 
|  | 151 | that we are seeing supports this greatly: | 
|  | 152 |  | 
|  | 153 | 1. Commodity processors are getting massive SIMD support: | 
|  | 154 | * Intel/Amd MMX/MMX2 | 
|  | 155 | * AMD's 3Dnow! | 
|  | 156 | * Intel's SSE/SSE2 | 
|  | 157 | * Sun's VIS | 
|  | 158 | 2. SMP is becoming much more common, especially in the server space. | 
|  | 159 | 3. Multiple processors on a die are right around the corner. | 
|  | 160 |  | 
|  | 161 | If nothing else, not designing this in would severely limit our future | 
|  | 162 | expansion of the project... | 
|  | 163 |  | 
|  | 164 | >          Also, this will require some reading on at least two other | 
|  | 165 | >          projects: | 
|  | 166 | >               -- Multiscalar architecture from Wisconsin | 
|  | 167 | >               -- Simultaneous multithreading architecture from Washington | 
|  | 168 | > | 
|  | 169 | >       o  Or forget all this and stick to a traditional instruction set? | 
|  | 170 |  | 
|  | 171 | Heh... :)  Well, from a pure research point of view, it is almost more | 
|  | 172 | attactive to go with the most extreme/different ISA possible.  On one axis | 
|  | 173 | you get safety and conservatism, and on the other you get degree of | 
|  | 174 | influence that the results have.  Of course the problem with pure research | 
|  | 175 | is that often times there is no concrete product of the research... :) | 
|  | 176 |  | 
|  | 177 | > BTW, on an unrelated note, after the meeting yesterday, I did remember | 
|  | 178 | > that you had suggested doing instruction scheduling on SSA form instead | 
|  | 179 | > of a dependence DAG earlier in the semester.  When we talked about | 
|  | 180 | > it yesterday, I didn't remember where the idea had come from but I | 
|  | 181 | > remembered later.  Just giving credit where its due... | 
|  | 182 |  | 
|  | 183 | :) Thanks. | 
|  | 184 |  | 
|  | 185 | > Perhaps you can save the above as a file under RCS so you and I can | 
|  | 186 | > continue to expand on this. | 
|  | 187 |  | 
|  | 188 | I think it makes sense to do so when we get our ideas more formalized and | 
|  | 189 | bounce it back and forth a couple of times... then I'll do a more formal | 
|  | 190 | writeup of our goals and ideas.  Obviously our first implementation will | 
|  | 191 | not want to do all of the stuff that I pointed out above... be we will | 
|  | 192 | want to design the project so that we do not artificially limit ourselves | 
|  | 193 | at sometime in the future... | 
|  | 194 |  | 
|  | 195 | Anyways, let me know what you think about these ideas... and if they sound | 
|  | 196 | reasonable... | 
|  | 197 |  | 
|  | 198 | -Chris | 
|  | 199 |  |