blob: 5e3f3384db892081d434f542faea8f96d48dfab6 [file] [log] [blame]
sewardja1a370f2004-08-17 13:31:55 +00001
2/*---------------------------------------------------------------*/
3/*--- ---*/
4/*--- This file (ir/iropt.c) is ---*/
5/*--- Copyright (c) 2004 OpenWorks LLP. All rights reserved. ---*/
6/*--- ---*/
7/*---------------------------------------------------------------*/
8
sewardjf8ed9d82004-11-12 17:40:23 +00009/*
10 This file is part of LibVEX, a library for dynamic binary
11 instrumentation and translation.
12
13 Copyright (C) 2004 OpenWorks, LLP.
14
15 This program is free software; you can redistribute it and/or modify
16 it under the terms of the GNU General Public License as published by
17 the Free Software Foundation; Version 2 dated June 1991 of the
18 license.
19
20 This program is distributed in the hope that it will be useful,
21 but WITHOUT ANY WARRANTY; without even the implied warranty of
22 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE, or liability
23 for damages. See the GNU General Public License for more details.
24
25 Neither the names of the U.S. Department of Energy nor the
26 University of California nor the names of its contributors may be
27 used to endorse or promote products derived from this software
28 without prior written permission.
29
30 You should have received a copy of the GNU General Public License
31 along with this program; if not, write to the Free Software
32 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
33 USA.
34*/
35
sewardja1a370f2004-08-17 13:31:55 +000036#include "libvex_basictypes.h"
sewardjedf4d692004-08-17 13:52:58 +000037#include "libvex_ir.h"
sewardja1a370f2004-08-17 13:31:55 +000038#include "libvex.h"
39
40#include "main/vex_util.h"
sewardj08613742004-10-25 13:01:45 +000041#include "main/vex_globals.h"
sewardjedf4d692004-08-17 13:52:58 +000042#include "ir/iropt.h"
sewardja1a370f2004-08-17 13:31:55 +000043
sewardjd0863ff2004-10-23 00:22:32 +000044
sewardj088bcb42004-08-19 17:16:52 +000045/* Set to 1 for lots of debugging output. */
46#define DEBUG_IROPT 0
47
sewardja1a370f2004-08-17 13:31:55 +000048
sewardj08210532004-12-29 17:09:11 +000049/* What iropt does, 29 Dec 04.
50
51 It takes an IRBB and produces a new one with the same meaning,
52 defined thus:
53
54 After execution of the new BB, all guest state and guest memory is
55 the same as after execution of the original. This is true
56 regardless of how the block was exited (at the end vs side exit).
57
58 In addition, parts of the guest state will be identical to that
59 created by execution of the original at the following observation
60 points:
61
62 * In a dirty helper call, any parts of the guest state that the
63 helper states that it reads or modifies will be up to date.
64 Also, guest memory will be up to date. Parts of the guest state
65 not marked as being read or modified by the helper cannot be
66 assumed to be up-to-date at the point where the helper is called.
67
68 * Immediately prior to any load or store, those parts of the guest
69 state marked as requiring precise exceptions will be up to date.
70 Also, guest memory will be up to date. Parts of the guest state
71 not marked as requiring precise exceptions cannot be assumed to
72 be up-to-date at the point of the load/store.
73
74 The relative order of loads and stores (including loads/stores of
75 guest memory done by dirty helpers annotated as such) is not
76 changed. However, the relative order of loads with no intervening
77 stores/modifies may be changed.
78
79 Transformation order
80 ~~~~~~~~~~~~~~~~~~~~
81
82 There are three levels of optimisation, controlled by
83 vex_control.iropt_level. Define first:
84
85 "Cheap transformations" are the following sequence:
86 * Redundant-Get removal
87 * Redundant-Put removal
88 * Constant propagation/folding
89 * Dead code removal
90 * Specialisation of clean helper functions
91 * Dead code removal
92
93 "Expensive transformations" are the following sequence:
94 * CSE
95 * Folding of add/sub chains
96 * Redundant-GetI removal
97 * Redundant-PutI removal
98 * Dead code removal
sewardj08210532004-12-29 17:09:11 +000099
100 Then the transformations are as follows, as defined by
101 vex_control.iropt_level:
102
103 Level 0:
104 * Flatten into atomic form.
105
106 Level 1: the following sequence:
107 * Flatten into atomic form.
108 * Cheap transformations.
sewardj08210532004-12-29 17:09:11 +0000109
110 Level 2: the following sequence
111 * Flatten into atomic form.
112 * Cheap transformations.
113 * If block contains GetI or PutI, Expensive transformations.
114 * Try unrolling loops. Three possible outcomes:
115 - No effect: do nothing more.
116 - Unrolled a loop, and block does not contain GetI or PutI:
117 Do: * CSE
118 * Dead code removal
sewardj08210532004-12-29 17:09:11 +0000119 - Unrolled a loop, and block contains GetI or PutI:
120 Do: * Expensive transformations
121 * Cheap transformations
sewardj08210532004-12-29 17:09:11 +0000122*/
123
sewardj98430292004-12-29 17:34:50 +0000124/* Implementation notes, 29 Dec 04.
125
126 TODO (important): I think rPutI removal ignores precise exceptions
127 and is therefore in a sense, wrong. In the sense that PutIs are
128 assumed not to write parts of the guest state that we need to have
129 up-to-date at loads/stores. So far on x86 guest that has not
130 mattered since indeed only the x87 FP registers and tags are
131 accessed using GetI/PutI, and there is no need so far for them to
132 be up to date at mem exception points. The rPutI pass should be
133 fixed.
sewardjfb44d552004-10-25 09:48:47 +0000134
sewardj4c5f6d52004-10-26 13:25:33 +0000135 TODO: improve pessimistic handling of precise exceptions
136 in the tree builder.
137
sewardjfb44d552004-10-25 09:48:47 +0000138 TODO: check interaction of rGetI and dirty helpers.
sewardjc0b42282004-10-12 13:44:12 +0000139
140 F64i constants are treated differently from other constants.
141 They are not regarded as atoms, and instead lifted off and
142 bound to temps. This allows them to participate in CSE, which
143 is important for getting good performance for x86 guest code.
sewardj695cff92004-10-13 14:50:14 +0000144
sewardja5aa9cf2004-10-15 22:56:38 +0000145 CSE up F64 literals (already doing F64is)
sewardj4c5f6d52004-10-26 13:25:33 +0000146
147 CSE: consider carefully the requirement for precise exns
sewardj98430292004-12-29 17:34:50 +0000148 prior to making CSE any more aggressive. */
sewardjc0b42282004-10-12 13:44:12 +0000149
150
sewardja1a370f2004-08-17 13:31:55 +0000151/*---------------------------------------------------------------*/
152/*--- Finite mappery, of a sort ---*/
153/*---------------------------------------------------------------*/
154
sewardj08210532004-12-29 17:09:11 +0000155/* General map from HWord-sized thing HWord-sized thing. Could be by
156 hashing, but it's not clear whether or not this would really be any
157 faster. */
sewardja1a370f2004-08-17 13:31:55 +0000158
159typedef
160 struct {
161 Bool* inuse;
sewardjea602bc2004-10-14 21:40:12 +0000162 HWord* key;
163 HWord* val;
sewardja1a370f2004-08-17 13:31:55 +0000164 Int size;
165 Int used;
166 }
sewardjea602bc2004-10-14 21:40:12 +0000167 HashHW;
sewardja1a370f2004-08-17 13:31:55 +0000168
sewardjea602bc2004-10-14 21:40:12 +0000169static HashHW* newHHW ( void )
sewardja1a370f2004-08-17 13:31:55 +0000170{
sewardjea602bc2004-10-14 21:40:12 +0000171 HashHW* h = LibVEX_Alloc(sizeof(HashHW));
sewardj29632392004-08-22 02:38:11 +0000172 h->size = 8;
sewardja1a370f2004-08-17 13:31:55 +0000173 h->used = 0;
174 h->inuse = LibVEX_Alloc(h->size * sizeof(Bool));
sewardjea602bc2004-10-14 21:40:12 +0000175 h->key = LibVEX_Alloc(h->size * sizeof(HWord));
176 h->val = LibVEX_Alloc(h->size * sizeof(HWord));
sewardja1a370f2004-08-17 13:31:55 +0000177 return h;
178}
179
180
sewardj84be7372004-08-18 13:59:33 +0000181/* Look up key in the map. */
sewardja1a370f2004-08-17 13:31:55 +0000182
sewardjea602bc2004-10-14 21:40:12 +0000183static Bool lookupHHW ( HashHW* h, /*OUT*/HWord* val, HWord key )
sewardja1a370f2004-08-17 13:31:55 +0000184{
185 Int i;
sewardj08210532004-12-29 17:09:11 +0000186 /* vex_printf("lookupHHW(%llx)\n", key ); */
sewardja1a370f2004-08-17 13:31:55 +0000187 for (i = 0; i < h->used; i++) {
188 if (h->inuse[i] && h->key[i] == key) {
sewardj39e3f242004-08-18 16:54:52 +0000189 if (val)
190 *val = h->val[i];
sewardja1a370f2004-08-17 13:31:55 +0000191 return True;
192 }
193 }
194 return False;
195}
196
197
sewardja1a370f2004-08-17 13:31:55 +0000198/* Add key->val to the map. Replaces any existing binding for key. */
199
sewardjea602bc2004-10-14 21:40:12 +0000200static void addToHHW ( HashHW* h, HWord key, HWord val )
sewardja1a370f2004-08-17 13:31:55 +0000201{
202 Int i, j;
sewardj08210532004-12-29 17:09:11 +0000203 /* vex_printf("addToHHW(%llx, %llx)\n", key, val); */
sewardja1a370f2004-08-17 13:31:55 +0000204
205 /* Find and replace existing binding, if any. */
206 for (i = 0; i < h->used; i++) {
207 if (h->inuse[i] && h->key[i] == key) {
208 h->val[i] = val;
209 return;
210 }
211 }
212
213 /* Ensure a space is available. */
214 if (h->used == h->size) {
215 /* Copy into arrays twice the size. */
216 Bool* inuse2 = LibVEX_Alloc(2 * h->size * sizeof(Bool));
sewardjea602bc2004-10-14 21:40:12 +0000217 HWord* key2 = LibVEX_Alloc(2 * h->size * sizeof(HWord));
218 HWord* val2 = LibVEX_Alloc(2 * h->size * sizeof(HWord));
sewardja1a370f2004-08-17 13:31:55 +0000219 for (i = j = 0; i < h->size; i++) {
220 if (!h->inuse[i]) continue;
221 inuse2[j] = True;
222 key2[j] = h->key[i];
223 val2[j] = h->val[i];
224 j++;
225 }
226 h->used = j;
227 h->size *= 2;
228 h->inuse = inuse2;
229 h->key = key2;
230 h->val = val2;
231 }
232
233 /* Finally, add it. */
234 vassert(h->used < h->size);
235 h->inuse[h->used] = True;
236 h->key[h->used] = key;
sewardj84be7372004-08-18 13:59:33 +0000237 h->val[h->used] = val;
sewardja1a370f2004-08-17 13:31:55 +0000238 h->used++;
239}
240
sewardj84be7372004-08-18 13:59:33 +0000241
sewardjd7cb8532004-08-17 23:59:23 +0000242/*---------------------------------------------------------------*/
sewardj08210532004-12-29 17:09:11 +0000243/*--- Flattening out a BB into atomic SSA form ---*/
sewardjd7cb8532004-08-17 23:59:23 +0000244/*---------------------------------------------------------------*/
245
sewardje80679a2004-09-21 23:00:11 +0000246/* Non-critical helper, heuristic for reducing the number of tmp-tmp
247 copies made by flattening. If in doubt return False. */
248
249static Bool isFlat ( IRExpr* e )
250{
sewardj695cff92004-10-13 14:50:14 +0000251 if (e->tag == Iex_Get)
252 return True;
sewardje80679a2004-09-21 23:00:11 +0000253 if (e->tag == Iex_Binop)
sewardj9d2e7692005-02-07 01:11:31 +0000254 return toBool( isAtom(e->Iex.Binop.arg1)
255 && isAtom(e->Iex.Binop.arg2) );
sewardje80679a2004-09-21 23:00:11 +0000256 if (e->tag == Iex_LDle)
257 return isAtom(e->Iex.LDle.addr);
258 return False;
259}
260
sewardjd7cb8532004-08-17 23:59:23 +0000261/* Flatten out 'ex' so it is atomic, returning a new expression with
262 the same value, after having appended extra IRTemp assignments to
263 the end of 'bb'. */
264
265static IRExpr* flatten_Expr ( IRBB* bb, IRExpr* ex )
266{
267 Int i;
268 IRExpr** newargs;
269 IRType ty = typeOfIRExpr(bb->tyenv, ex);
270 IRTemp t1;
271
272 switch (ex->tag) {
273
sewardjd7217032004-08-19 10:49:10 +0000274 case Iex_GetI:
275 t1 = newIRTemp(bb->tyenv, ty);
276 addStmtToIRBB(bb, IRStmt_Tmp(t1,
sewardj2d3f77c2004-09-22 23:49:09 +0000277 IRExpr_GetI(ex->Iex.GetI.descr,
sewardjeeac8412004-11-02 00:26:55 +0000278 flatten_Expr(bb, ex->Iex.GetI.ix),
sewardj2d3f77c2004-09-22 23:49:09 +0000279 ex->Iex.GetI.bias)));
sewardjd7217032004-08-19 10:49:10 +0000280 return IRExpr_Tmp(t1);
281
sewardjd7cb8532004-08-17 23:59:23 +0000282 case Iex_Get:
283 t1 = newIRTemp(bb->tyenv, ty);
284 addStmtToIRBB(bb,
285 IRStmt_Tmp(t1, ex));
286 return IRExpr_Tmp(t1);
287
288 case Iex_Binop:
289 t1 = newIRTemp(bb->tyenv, ty);
290 addStmtToIRBB(bb, IRStmt_Tmp(t1,
291 IRExpr_Binop(ex->Iex.Binop.op,
292 flatten_Expr(bb, ex->Iex.Binop.arg1),
293 flatten_Expr(bb, ex->Iex.Binop.arg2))));
294 return IRExpr_Tmp(t1);
295
296 case Iex_Unop:
297 t1 = newIRTemp(bb->tyenv, ty);
298 addStmtToIRBB(bb, IRStmt_Tmp(t1,
299 IRExpr_Unop(ex->Iex.Unop.op,
300 flatten_Expr(bb, ex->Iex.Unop.arg))));
301 return IRExpr_Tmp(t1);
302
303 case Iex_LDle:
304 t1 = newIRTemp(bb->tyenv, ty);
305 addStmtToIRBB(bb, IRStmt_Tmp(t1,
306 IRExpr_LDle(ex->Iex.LDle.ty,
307 flatten_Expr(bb, ex->Iex.LDle.addr))));
308 return IRExpr_Tmp(t1);
309
310 case Iex_CCall:
sewardj695cff92004-10-13 14:50:14 +0000311 newargs = sopyIRExprVec(ex->Iex.CCall.args);
sewardjd7cb8532004-08-17 23:59:23 +0000312 for (i = 0; newargs[i]; i++)
313 newargs[i] = flatten_Expr(bb, newargs[i]);
314 t1 = newIRTemp(bb->tyenv, ty);
315 addStmtToIRBB(bb, IRStmt_Tmp(t1,
sewardj8ea867b2004-10-30 19:03:02 +0000316 IRExpr_CCall(ex->Iex.CCall.cee,
sewardjd7cb8532004-08-17 23:59:23 +0000317 ex->Iex.CCall.retty,
318 newargs)));
319 return IRExpr_Tmp(t1);
320
321 case Iex_Mux0X:
322 t1 = newIRTemp(bb->tyenv, ty);
323 addStmtToIRBB(bb, IRStmt_Tmp(t1,
324 IRExpr_Mux0X(flatten_Expr(bb, ex->Iex.Mux0X.cond),
325 flatten_Expr(bb, ex->Iex.Mux0X.expr0),
326 flatten_Expr(bb, ex->Iex.Mux0X.exprX))));
327 return IRExpr_Tmp(t1);
328
329 case Iex_Const:
sewardjc0b42282004-10-12 13:44:12 +0000330 /* Lift F64i constants out onto temps so they can be CSEd
331 later. */
332 if (ex->Iex.Const.con->tag == Ico_F64i) {
333 t1 = newIRTemp(bb->tyenv, ty);
334 addStmtToIRBB(bb, IRStmt_Tmp(t1,
335 IRExpr_Const(ex->Iex.Const.con)));
336 return IRExpr_Tmp(t1);
337 } else {
338 /* Leave all other constants alone. */
339 return ex;
340 }
341
sewardjd7cb8532004-08-17 23:59:23 +0000342 case Iex_Tmp:
343 return ex;
344
345 default:
346 vex_printf("\n");
347 ppIRExpr(ex);
348 vex_printf("\n");
349 vpanic("flatten_Expr");
350 }
351}
352
353
354/* Append a completely flattened form of 'st' to the end of 'bb'. */
355
356static void flatten_Stmt ( IRBB* bb, IRStmt* st )
357{
sewardj17442fe2004-09-20 14:54:28 +0000358 Int i;
359 IRExpr *e1, *e2;
360 IRDirty *d, *d2;
sewardjd7cb8532004-08-17 23:59:23 +0000361 switch (st->tag) {
362 case Ist_Put:
sewardj49651f42004-10-28 22:11:04 +0000363 if (isAtom(st->Ist.Put.data)) {
364 /* optimisation to reduce the amount of heap wasted
365 by the flattener */
366 addStmtToIRBB(bb, st);
367 } else {
368 /* general case, always correct */
369 e1 = flatten_Expr(bb, st->Ist.Put.data);
370 addStmtToIRBB(bb, IRStmt_Put(st->Ist.Put.offset, e1));
371 }
sewardjd7cb8532004-08-17 23:59:23 +0000372 break;
sewardjd7cb8532004-08-17 23:59:23 +0000373 case Ist_PutI:
sewardjeeac8412004-11-02 00:26:55 +0000374 e1 = flatten_Expr(bb, st->Ist.PutI.ix);
sewardj2d3f77c2004-09-22 23:49:09 +0000375 e2 = flatten_Expr(bb, st->Ist.PutI.data);
376 addStmtToIRBB(bb, IRStmt_PutI(st->Ist.PutI.descr,
377 e1,
378 st->Ist.PutI.bias,
379 e2));
sewardjd7217032004-08-19 10:49:10 +0000380 break;
sewardjd7cb8532004-08-17 23:59:23 +0000381 case Ist_Tmp:
sewardj6d076362004-09-23 11:06:17 +0000382 if (isFlat(st->Ist.Tmp.data)) {
sewardje80679a2004-09-21 23:00:11 +0000383 /* optimisation, to reduce the number of tmp-tmp
384 copies generated */
385 addStmtToIRBB(bb, st);
386 } else {
387 /* general case, always correct */
sewardj6d076362004-09-23 11:06:17 +0000388 e1 = flatten_Expr(bb, st->Ist.Tmp.data);
sewardje80679a2004-09-21 23:00:11 +0000389 addStmtToIRBB(bb, IRStmt_Tmp(st->Ist.Tmp.tmp, e1));
390 }
sewardjd7cb8532004-08-17 23:59:23 +0000391 break;
392 case Ist_STle:
393 e1 = flatten_Expr(bb, st->Ist.STle.addr);
394 e2 = flatten_Expr(bb, st->Ist.STle.data);
395 addStmtToIRBB(bb, IRStmt_STle(e1,e2));
396 break;
sewardj17442fe2004-09-20 14:54:28 +0000397 case Ist_Dirty:
398 d = st->Ist.Dirty.details;
399 d2 = emptyIRDirty();
400 *d2 = *d;
sewardj695cff92004-10-13 14:50:14 +0000401 d2->args = sopyIRExprVec(d2->args);
sewardj17442fe2004-09-20 14:54:28 +0000402 if (d2->mFx != Ifx_None) {
403 d2->mAddr = flatten_Expr(bb, d2->mAddr);
404 } else {
405 vassert(d2->mAddr == NULL);
406 }
sewardjb8385d82004-11-02 01:34:15 +0000407 d2->guard = flatten_Expr(bb, d2->guard);
sewardj17442fe2004-09-20 14:54:28 +0000408 for (i = 0; d2->args[i]; i++)
409 d2->args[i] = flatten_Expr(bb, d2->args[i]);
410 addStmtToIRBB(bb, IRStmt_Dirty(d2));
411 break;
sewardj3e838932005-01-07 12:09:15 +0000412 case Ist_MFence:
413 addStmtToIRBB(bb, st);
414 break;
sewardjd7cb8532004-08-17 23:59:23 +0000415 case Ist_Exit:
sewardj0276d4b2004-11-15 15:30:21 +0000416 e1 = flatten_Expr(bb, st->Ist.Exit.guard);
sewardj893aada2004-11-29 19:57:54 +0000417 addStmtToIRBB(bb, IRStmt_Exit(e1, st->Ist.Exit.jk,
418 st->Ist.Exit.dst));
sewardjd7cb8532004-08-17 23:59:23 +0000419 break;
420 default:
421 vex_printf("\n");
422 ppIRStmt(st);
423 vex_printf("\n");
424 vpanic("flatten_Stmt");
425 }
426}
427
sewardj08210532004-12-29 17:09:11 +0000428
sewardjd7cb8532004-08-17 23:59:23 +0000429static IRBB* flatten_BB ( IRBB* in )
430{
431 Int i;
432 IRBB* out;
433 out = emptyIRBB();
sewardj695cff92004-10-13 14:50:14 +0000434 out->tyenv = dopyIRTypeEnv( in->tyenv );
sewardjd7cb8532004-08-17 23:59:23 +0000435 for (i = 0; i < in->stmts_used; i++)
sewardj4345f7a2004-09-22 19:49:27 +0000436 if (in->stmts[i])
437 flatten_Stmt( out, in->stmts[i] );
sewardjd7cb8532004-08-17 23:59:23 +0000438 out->next = flatten_Expr( out, in->next );
439 out->jumpkind = in->jumpkind;
440 return out;
441}
442
sewardjedf4d692004-08-17 13:52:58 +0000443
sewardj08210532004-12-29 17:09:11 +0000444/*---------------------------------------------------------------*/
445/*--- In-place removal of redundant GETs ---*/
446/*---------------------------------------------------------------*/
447
448/* Scan forwards, building up an environment binding (min offset, max
449 offset) pairs to values, which will either be temps or constants.
450
451 On seeing 't = Get(minoff,maxoff)', look up (minoff,maxoff) in the
452 env and if it matches, replace the Get with the stored value. If
453 there is no match, add a (minoff,maxoff) :-> t binding.
454
455 On seeing 'Put (minoff,maxoff) = t or c', first remove in the env
456 any binding which fully or partially overlaps with (minoff,maxoff).
457 Then add a new (minoff,maxoff) :-> t or c binding. */
458
459/* Extract the min/max offsets from a guest state array descriptor. */
460
461inline
462static void getArrayBounds ( IRArray* descr, UInt* minoff, UInt* maxoff )
463{
464 *minoff = descr->base;
465 *maxoff = *minoff + descr->nElems*sizeofIRType(descr->elemTy) - 1;
466 vassert((*minoff & ~0xFFFF) == 0);
467 vassert((*maxoff & ~0xFFFF) == 0);
468 vassert(*minoff <= *maxoff);
469}
470
471/* Create keys, of the form ((minoffset << 16) | maxoffset). */
472
473static UInt mk_key_GetPut ( Int offset, IRType ty )
474{
475 /* offset should fit in 16 bits. */
476 UInt minoff = offset;
477 UInt maxoff = minoff + sizeofIRType(ty) - 1;
478 vassert((minoff & ~0xFFFF) == 0);
479 vassert((maxoff & ~0xFFFF) == 0);
480 return (minoff << 16) | maxoff;
481}
482
483static UInt mk_key_GetIPutI ( IRArray* descr )
484{
485 UInt minoff, maxoff;
486 getArrayBounds( descr, &minoff, &maxoff );
487 vassert((minoff & ~0xFFFF) == 0);
488 vassert((maxoff & ~0xFFFF) == 0);
489 return (minoff << 16) | maxoff;
490}
491
492/* Supposing h has keys of the form generated by mk_key_GetPut and
493 mk_key_GetIPutI, invalidate any key which overlaps (k_lo
494 .. k_hi).
495*/
496static void invalidateOverlaps ( HashHW* h, UInt k_lo, UInt k_hi )
497{
498 Int j;
499 UInt e_lo, e_hi;
500 vassert(k_lo <= k_hi);
501 /* invalidate any env entries which in any way overlap (k_lo
502 .. k_hi) */
503 /* vex_printf("invalidate %d .. %d\n", k_lo, k_hi ); */
504
505 for (j = 0; j < h->used; j++) {
506 if (!h->inuse[j])
507 continue;
508 e_lo = (((UInt)h->key[j]) >> 16) & 0xFFFF;
509 e_hi = ((UInt)h->key[j]) & 0xFFFF;
510 vassert(e_lo <= e_hi);
511 if (e_hi < k_lo || k_hi < e_lo)
512 continue; /* no overlap possible */
513 else
514 /* overlap; invalidate */
515 h->inuse[j] = False;
516 }
517}
518
519
520static void redundant_get_removal_BB ( IRBB* bb )
521{
522 HashHW* env = newHHW();
523 UInt key = 0; /* keep gcc -O happy */
524 Int i, j;
525 HWord val;
526
527 for (i = 0; i < bb->stmts_used; i++) {
528 IRStmt* st = bb->stmts[i];
529
530 if (!st)
531 continue;
532
533 /* Deal with Gets */
534 if (st->tag == Ist_Tmp
535 && st->Ist.Tmp.data->tag == Iex_Get) {
536 /* st is 't = Get(...)'. Look up in the environment and see
537 if the Get can be replaced. */
538 IRExpr* get = st->Ist.Tmp.data;
539 key = (HWord)mk_key_GetPut( get->Iex.Get.offset,
540 get->Iex.Get.ty );
541 if (lookupHHW(env, &val, (HWord)key)) {
542 /* found it */
543 /* Note, we could do better here. If the types are
544 different we don't do the substitution, since doing so
545 could lead to invalidly-typed IR. An improvement would
546 be to stick in a reinterpret-style cast, although that
547 would make maintaining flatness more difficult. */
548 IRExpr* valE = (IRExpr*)val;
sewardj9d2e7692005-02-07 01:11:31 +0000549 Bool typesOK = toBool( typeOfIRExpr(bb->tyenv,valE)
550 == st->Ist.Tmp.data->Iex.Get.ty );
sewardj08210532004-12-29 17:09:11 +0000551 if (typesOK && DEBUG_IROPT) {
552 vex_printf("rGET: "); ppIRExpr(get);
553 vex_printf(" -> "); ppIRExpr(valE);
554 vex_printf("\n");
555 }
556 if (typesOK)
557 bb->stmts[i] = IRStmt_Tmp(st->Ist.Tmp.tmp, valE);
558 } else {
559 /* Not found, but at least we know that t and the Get(...)
560 are now associated. So add a binding to reflect that
561 fact. */
562 addToHHW( env, (HWord)key,
sewardj59c07782005-01-21 21:23:07 +0000563 (HWord)(void*)(IRExpr_Tmp(st->Ist.Tmp.tmp)) );
sewardj08210532004-12-29 17:09:11 +0000564 }
565 }
566
567 /* Deal with Puts: invalidate any env entries overlapped by this
568 Put */
569 if (st->tag == Ist_Put || st->tag == Ist_PutI) {
570 UInt k_lo, k_hi;
571 if (st->tag == Ist_Put) {
572 key = mk_key_GetPut( st->Ist.Put.offset,
573 typeOfIRExpr(bb->tyenv,st->Ist.Put.data) );
574 } else {
575 vassert(st->tag == Ist_PutI);
576 key = mk_key_GetIPutI( st->Ist.PutI.descr );
577 }
578
579 k_lo = (key >> 16) & 0xFFFF;
580 k_hi = key & 0xFFFF;
581 invalidateOverlaps(env, k_lo, k_hi);
582 }
583 else
584 if (st->tag == Ist_Dirty) {
585 /* Deal with dirty helpers which write or modify guest state.
586 Invalidate the entire env. We could do a lot better
587 here. */
588 IRDirty* d = st->Ist.Dirty.details;
589 Bool writes = False;
590 for (j = 0; j < d->nFxState; j++) {
591 if (d->fxState[j].fx == Ifx_Modify
592 || d->fxState[j].fx == Ifx_Write)
593 writes = True;
594 }
595 if (writes) {
596 /* dump the entire env (not clever, but correct ...) */
597 for (j = 0; j < env->used; j++)
598 env->inuse[j] = False;
599 if (0) vex_printf("rGET: trash env due to dirty helper\n");
600 }
601 }
602
603 /* add this one to the env, if appropriate */
604 if (st->tag == Ist_Put) {
605 vassert(isAtom(st->Ist.Put.data));
606 addToHHW( env, (HWord)key, (HWord)(st->Ist.Put.data));
607 }
608
609 } /* for (i = 0; i < bb->stmts_used; i++) */
610
611}
612
613
614/*---------------------------------------------------------------*/
615/*--- In-place removal of redundant PUTs ---*/
616/*---------------------------------------------------------------*/
617
618/* Find any Get uses in st and invalidate any partially or fully
619 overlapping ranges listed in env. Due to the flattening phase, the
620 only stmt kind we expect to find a Get on is IRStmt_Tmp. */
621
622static void handle_gets_Stmt (
623 HashHW* env,
624 IRStmt* st,
625 Bool (*preciseMemExnsFn)(Int,Int)
626 )
627{
628 Int j;
629 UInt key = 0; /* keep gcc -O happy */
630 Bool isGet;
631 Bool memRW = False;
632 IRExpr* e;
633
634 switch (st->tag) {
635
636 /* This is the only interesting case. Deal with Gets in the RHS
637 expression. */
638 case Ist_Tmp:
639 e = st->Ist.Tmp.data;
640 switch (e->tag) {
641 case Iex_Get:
642 isGet = True;
643 key = mk_key_GetPut ( e->Iex.Get.offset, e->Iex.Get.ty );
644 break;
645 case Iex_GetI:
646 isGet = True;
647 key = mk_key_GetIPutI ( e->Iex.GetI.descr );
648 break;
649 case Iex_LDle:
650 isGet = False;
651 memRW = True;
652 break;
653 default:
654 isGet = False;
655 }
656 if (isGet) {
657 UInt k_lo, k_hi;
658 k_lo = (key >> 16) & 0xFFFF;
659 k_hi = key & 0xFFFF;
660 invalidateOverlaps(env, k_lo, k_hi);
661 }
662 break;
663
664 /* Be very conservative for dirty helper calls; dump the entire
665 environment. The helper might read guest state, in which
666 case it needs to be flushed first. Also, the helper might
667 access guest memory, in which case all parts of the guest
668 state requiring precise exceptions needs to be flushed. The
669 crude solution is just to flush everything; we could easily
670 enough do a lot better if needed. */
sewardj3e838932005-01-07 12:09:15 +0000671 /* Probably also overly-conservative, but also dump everything
672 if we hit a memory fence. */
673 case Ist_MFence:
sewardj08210532004-12-29 17:09:11 +0000674 case Ist_Dirty:
675 for (j = 0; j < env->used; j++)
676 env->inuse[j] = False;
677 break;
678
679 /* all other cases are boring. */
680 case Ist_STle:
681 vassert(isAtom(st->Ist.STle.addr));
682 vassert(isAtom(st->Ist.STle.data));
683 memRW = True;
684 break;
685
686 case Ist_Exit:
687 vassert(isAtom(st->Ist.Exit.guard));
688 break;
689
690 case Ist_PutI:
691 vassert(isAtom(st->Ist.PutI.ix));
692 vassert(isAtom(st->Ist.PutI.data));
693 break;
694
695 default:
696 vex_printf("\n");
697 ppIRStmt(st);
698 vex_printf("\n");
699 vpanic("handle_gets_Stmt");
700 }
701
702 if (memRW) {
703 /* This statement accesses memory. So we need to dump all parts
704 of the environment corresponding to guest state that may not
705 be reordered with respect to memory references. That means
706 at least the stack pointer. */
707 for (j = 0; j < env->used; j++) {
708 if (!env->inuse[j])
709 continue;
710 if (vex_control.iropt_precise_memory_exns) {
711 /* Precise exceptions required. Flush all guest state. */
712 env->inuse[j] = False;
713 } else {
714 /* Just flush the minimal amount required, as computed by
715 preciseMemExnsFn. */
716 HWord k_lo = (env->key[j] >> 16) & 0xFFFF;
717 HWord k_hi = env->key[j] & 0xFFFF;
718 if (preciseMemExnsFn( k_lo, k_hi ))
719 env->inuse[j] = False;
720 }
721 }
722 } /* if (memRW) */
723
724}
725
726
727/* Scan backwards, building up a set of (min offset, max
728 offset) pairs, indicating those parts of the guest state
729 for which the next event is a write.
730
731 On seeing a conditional exit, empty the set.
732
733 On seeing 'Put (minoff,maxoff) = t or c', if (minoff,maxoff) is
734 completely within the set, remove the Put. Otherwise, add
735 (minoff,maxoff) to the set.
736
737 On seeing 'Get (minoff,maxoff)', remove any part of the set
sewardj98430292004-12-29 17:34:50 +0000738 overlapping (minoff,maxoff). The same has to happen for any events
739 which implicitly read parts of the guest state: dirty helper calls
740 and loads/stores.
sewardj08210532004-12-29 17:09:11 +0000741*/
742
743static void redundant_put_removal_BB (
744 IRBB* bb,
745 Bool (*preciseMemExnsFn)(Int,Int)
746 )
747{
748 Int i, j;
749 Bool isPut;
750 IRStmt* st;
751 UInt key = 0; /* keep gcc -O happy */
752
753 HashHW* env = newHHW();
754 for (i = bb->stmts_used-1; i >= 0; i--) {
755 st = bb->stmts[i];
756 if (!st)
757 continue;
758
759 /* Deal with conditional exits. */
760 if (st->tag == Ist_Exit) {
761 /* Since control may not get beyond this point, we must empty
762 out the set, since we can no longer claim that the next
763 event for any part of the guest state is definitely a
764 write. */
765 vassert(isAtom(st->Ist.Exit.guard));
766 for (j = 0; j < env->used; j++)
767 env->inuse[j] = False;
768 continue;
769 }
770
771 /* Deal with Puts */
772 switch (st->tag) {
773 case Ist_Put:
774 isPut = True;
775 key = mk_key_GetPut( st->Ist.Put.offset,
776 typeOfIRExpr(bb->tyenv,st->Ist.Put.data) );
777 vassert(isAtom(st->Ist.Put.data));
778 break;
779 case Ist_PutI:
780 isPut = True;
781 key = mk_key_GetIPutI( st->Ist.PutI.descr );
782 vassert(isAtom(st->Ist.PutI.ix));
783 vassert(isAtom(st->Ist.PutI.data));
784 break;
785 default:
786 isPut = False;
787 }
788 if (isPut && st->tag != Ist_PutI) {
789 /* See if any single entry in env overlaps this Put. This is
790 simplistic in that the transformation is valid if, say, two
791 or more entries in the env overlap this Put, but the use of
792 lookupHHW will only find a single entry which exactly
793 overlaps this Put. This is suboptimal but safe. */
794 if (lookupHHW(env, NULL, (HWord)key)) {
795 /* This Put is redundant because a later one will overwrite
796 it. So NULL (nop) it out. */
797 if (DEBUG_IROPT) {
798 vex_printf("rPUT: "); ppIRStmt(st);
799 vex_printf("\n");
800 }
801 bb->stmts[i] = NULL;
802 } else {
803 /* We can't demonstrate that this Put is redundant, so add it
804 to the running collection. */
805 addToHHW(env, (HWord)key, 0);
806 }
807 continue;
808 }
809
810 /* Deal with Gets. These remove bits of the environment since
811 appearance of a Get means that the next event for that slice
sewardj98430292004-12-29 17:34:50 +0000812 of the guest state is no longer a write, but a read. Also
813 deals with implicit reads of guest state needed to maintain
814 precise exceptions. */
sewardj08210532004-12-29 17:09:11 +0000815 handle_gets_Stmt( env, st, preciseMemExnsFn );
816 }
817}
818
sewardj84be7372004-08-18 13:59:33 +0000819
820/*---------------------------------------------------------------*/
821/*--- Constant propagation and folding ---*/
822/*---------------------------------------------------------------*/
823
sewardj62617ef2004-10-13 23:29:22 +0000824/* The env in this section is a map from IRTemp to IRExpr*,
825 that is, an array indexed by IRTemp. */
sewardjf6501992004-08-27 11:58:24 +0000826
sewardjf6729012004-08-25 12:45:13 +0000827/* Are both expressions simply the same IRTemp ? */
828static Bool sameIRTemps ( IRExpr* e1, IRExpr* e2 )
829{
sewardj9d2e7692005-02-07 01:11:31 +0000830 return toBool( e1->tag == Iex_Tmp
831 && e2->tag == Iex_Tmp
832 && e1->Iex.Tmp.tmp == e2->Iex.Tmp.tmp );
sewardjf6729012004-08-25 12:45:13 +0000833}
834
sewardje1d45da2004-11-12 00:13:21 +0000835static Bool notBool ( Bool b )
836{
837 if (b == True) return False;
838 if (b == False) return True;
839 vpanic("notBool");
840}
841
sewardj84be7372004-08-18 13:59:33 +0000842static IRExpr* fold_Expr ( IRExpr* e )
843{
sewardj278c44c2004-08-20 00:28:13 +0000844 Int shift;
sewardj84be7372004-08-18 13:59:33 +0000845 IRExpr* e2 = e; /* e2 is the result of folding e, if possible */
846
847 /* UNARY ops */
848 if (e->tag == Iex_Unop
849 && e->Iex.Unop.arg->tag == Iex_Const) {
850 switch (e->Iex.Unop.op) {
sewardjae27ab62004-10-15 21:21:46 +0000851 case Iop_1Uto8:
sewardj9d2e7692005-02-07 01:11:31 +0000852 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardjba999312004-11-15 15:21:17 +0000853 e->Iex.Unop.arg->Iex.Const.con->Ico.U1
sewardj9d2e7692005-02-07 01:11:31 +0000854 ? 1 : 0)));
sewardjae27ab62004-10-15 21:21:46 +0000855 break;
sewardjf4a821d2004-10-09 00:58:19 +0000856 case Iop_1Uto32:
857 e2 = IRExpr_Const(IRConst_U32(
sewardjba999312004-11-15 15:21:17 +0000858 e->Iex.Unop.arg->Iex.Const.con->Ico.U1
sewardjf4a821d2004-10-09 00:58:19 +0000859 ? 1 : 0));
860 break;
sewardje6b39932004-11-06 17:01:15 +0000861
sewardjd9997882004-11-04 19:42:59 +0000862 case Iop_1Sto32:
863 e2 = IRExpr_Const(IRConst_U32(
sewardjba999312004-11-15 15:21:17 +0000864 e->Iex.Unop.arg->Iex.Const.con->Ico.U1
sewardjd9997882004-11-04 19:42:59 +0000865 ? 0xFFFFFFFF : 0));
866 break;
sewardje6b39932004-11-06 17:01:15 +0000867 case Iop_1Sto64:
868 e2 = IRExpr_Const(IRConst_U64(
sewardjba999312004-11-15 15:21:17 +0000869 e->Iex.Unop.arg->Iex.Const.con->Ico.U1
sewardje6b39932004-11-06 17:01:15 +0000870 ? 0xFFFFFFFFFFFFFFFFULL : 0));
871 break;
872
sewardj883b00b2004-09-11 09:30:24 +0000873 case Iop_8Sto32: {
874 /* signed */ Int s32 = e->Iex.Unop.arg->Iex.Const.con->Ico.U8;
875 s32 <<= 24;
876 s32 >>= 24;
877 e2 = IRExpr_Const(IRConst_U32((UInt)s32));
878 break;
879 }
sewardj84be7372004-08-18 13:59:33 +0000880 case Iop_8Uto32:
881 e2 = IRExpr_Const(IRConst_U32(
882 0xFF & e->Iex.Unop.arg->Iex.Const.con->Ico.U8));
883 break;
884 case Iop_16Uto32:
885 e2 = IRExpr_Const(IRConst_U32(
886 0xFFFF & e->Iex.Unop.arg->Iex.Const.con->Ico.U16));
887 break;
sewardj73017432004-10-14 19:33:25 +0000888 case Iop_32to16:
sewardj9d2e7692005-02-07 01:11:31 +0000889 e2 = IRExpr_Const(IRConst_U16(toUShort(
890 0xFFFF & e->Iex.Unop.arg->Iex.Const.con->Ico.U32)));
sewardj73017432004-10-14 19:33:25 +0000891 break;
sewardj4345f7a2004-09-22 19:49:27 +0000892 case Iop_32to8:
sewardj9d2e7692005-02-07 01:11:31 +0000893 e2 = IRExpr_Const(IRConst_U8(toUChar(
894 0xFF & e->Iex.Unop.arg->Iex.Const.con->Ico.U32)));
sewardj4345f7a2004-09-22 19:49:27 +0000895 break;
sewardj7447b5b2004-10-16 11:30:42 +0000896 case Iop_32to1:
sewardj9d2e7692005-02-07 01:11:31 +0000897 e2 = IRExpr_Const(IRConst_U1(toBool(
898 1 == (1 & e->Iex.Unop.arg->Iex.Const.con->Ico.U32)
899 )));
sewardj7447b5b2004-10-16 11:30:42 +0000900 break;
sewardje6b39932004-11-06 17:01:15 +0000901
sewardj883b00b2004-09-11 09:30:24 +0000902 case Iop_Not32:
903 e2 = IRExpr_Const(IRConst_U32(
904 ~ (e->Iex.Unop.arg->Iex.Const.con->Ico.U32)));
905 break;
sewardje6b39932004-11-06 17:01:15 +0000906 case Iop_Not16:
sewardj9d2e7692005-02-07 01:11:31 +0000907 e2 = IRExpr_Const(IRConst_U16(toUShort(
908 ~ (e->Iex.Unop.arg->Iex.Const.con->Ico.U16))));
sewardje6b39932004-11-06 17:01:15 +0000909 break;
sewardjd9997882004-11-04 19:42:59 +0000910 case Iop_Not8:
sewardj9d2e7692005-02-07 01:11:31 +0000911 e2 = IRExpr_Const(IRConst_U8(toUChar(
912 ~ (e->Iex.Unop.arg->Iex.Const.con->Ico.U8))));
sewardjd9997882004-11-04 19:42:59 +0000913 break;
sewardje6b39932004-11-06 17:01:15 +0000914
sewardje1d45da2004-11-12 00:13:21 +0000915 case Iop_Not1:
sewardjba999312004-11-15 15:21:17 +0000916 e2 = IRExpr_Const(IRConst_U1(
917 notBool(e->Iex.Unop.arg->Iex.Const.con->Ico.U1)));
sewardje1d45da2004-11-12 00:13:21 +0000918 break;
919
sewardj1d8ce202004-12-13 14:14:16 +0000920 case Iop_64to32: {
921 ULong w64 = e->Iex.Unop.arg->Iex.Const.con->Ico.U64;
922 w64 &= 0x00000000FFFFFFFFULL;
923 e2 = IRExpr_Const(IRConst_U32( (UInt)w64 ));
sewardj37010592004-12-13 10:47:15 +0000924 break;
sewardj1d8ce202004-12-13 14:14:16 +0000925 }
sewardj1d8ce202004-12-13 14:14:16 +0000926 case Iop_64HIto32: {
927 ULong w64 = e->Iex.Unop.arg->Iex.Const.con->Ico.U64;
928 w64 >>= 32;
929 e2 = IRExpr_Const(IRConst_U32( (UInt)w64 ));
930 break;
931 }
sewardjb5710b82005-01-27 16:05:09 +0000932 case Iop_32Uto64:
933 e2 = IRExpr_Const(IRConst_U64(
934 0xFFFFFFFFULL
935 & e->Iex.Unop.arg->Iex.Const.con->Ico.U32));
936 break;
sewardj37010592004-12-13 10:47:15 +0000937
sewardj84be7372004-08-18 13:59:33 +0000938 default:
939 goto unhandled;
940 }
941 }
942
943 /* BINARY ops */
944 if (e->tag == Iex_Binop) {
945 if (e->Iex.Binop.arg1->tag == Iex_Const
946 && e->Iex.Binop.arg2->tag == Iex_Const) {
947 /* cases where both args are consts */
948 switch (e->Iex.Binop.op) {
sewardje6b39932004-11-06 17:01:15 +0000949
950 /* --- Iop_Or --- */
sewardjd9997882004-11-04 19:42:59 +0000951 case Iop_Or8:
sewardj9d2e7692005-02-07 01:11:31 +0000952 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardjd9997882004-11-04 19:42:59 +0000953 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U8
sewardj9d2e7692005-02-07 01:11:31 +0000954 | e->Iex.Binop.arg2->Iex.Const.con->Ico.U8))));
sewardjd9997882004-11-04 19:42:59 +0000955 break;
sewardje6b39932004-11-06 17:01:15 +0000956 case Iop_Or16:
sewardj9d2e7692005-02-07 01:11:31 +0000957 e2 = IRExpr_Const(IRConst_U16(toUShort(
sewardje6b39932004-11-06 17:01:15 +0000958 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U16
sewardj9d2e7692005-02-07 01:11:31 +0000959 | e->Iex.Binop.arg2->Iex.Const.con->Ico.U16))));
sewardje6b39932004-11-06 17:01:15 +0000960 break;
961 case Iop_Or32:
962 e2 = IRExpr_Const(IRConst_U32(
963 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
964 | e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
965 break;
966
sewardj883b00b2004-09-11 09:30:24 +0000967 case Iop_Xor8:
sewardj9d2e7692005-02-07 01:11:31 +0000968 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardj883b00b2004-09-11 09:30:24 +0000969 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U8
sewardj9d2e7692005-02-07 01:11:31 +0000970 ^ e->Iex.Binop.arg2->Iex.Const.con->Ico.U8))));
sewardj883b00b2004-09-11 09:30:24 +0000971 break;
sewardj84be7372004-08-18 13:59:33 +0000972 case Iop_And8:
sewardj9d2e7692005-02-07 01:11:31 +0000973 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardj84be7372004-08-18 13:59:33 +0000974 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U8
sewardj9d2e7692005-02-07 01:11:31 +0000975 & e->Iex.Binop.arg2->Iex.Const.con->Ico.U8))));
sewardj84be7372004-08-18 13:59:33 +0000976 break;
sewardj4345f7a2004-09-22 19:49:27 +0000977 case Iop_Add8:
sewardj9d2e7692005-02-07 01:11:31 +0000978 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardj4345f7a2004-09-22 19:49:27 +0000979 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U8
sewardj9d2e7692005-02-07 01:11:31 +0000980 + e->Iex.Binop.arg2->Iex.Const.con->Ico.U8))));
sewardj4345f7a2004-09-22 19:49:27 +0000981 break;
sewardj84be7372004-08-18 13:59:33 +0000982 case Iop_Sub8:
sewardj9d2e7692005-02-07 01:11:31 +0000983 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardj84be7372004-08-18 13:59:33 +0000984 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U8
sewardj9d2e7692005-02-07 01:11:31 +0000985 - e->Iex.Binop.arg2->Iex.Const.con->Ico.U8))));
sewardj84be7372004-08-18 13:59:33 +0000986 break;
sewardjd7217032004-08-19 10:49:10 +0000987 case Iop_Sub32:
988 e2 = IRExpr_Const(IRConst_U32(
989 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
990 - e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
991 break;
sewardjc2bcb6f2005-02-07 00:17:12 +0000992
sewardjd7217032004-08-19 10:49:10 +0000993 case Iop_Add32:
994 e2 = IRExpr_Const(IRConst_U32(
995 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
996 + e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
997 break;
sewardjc2bcb6f2005-02-07 00:17:12 +0000998 case Iop_Add64:
999 e2 = IRExpr_Const(IRConst_U64(
1000 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
1001 + e->Iex.Binop.arg2->Iex.Const.con->Ico.U64)));
1002 break;
1003
sewardj883b00b2004-09-11 09:30:24 +00001004 case Iop_Xor32:
1005 e2 = IRExpr_Const(IRConst_U32(
1006 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1007 ^ e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
1008 break;
sewardjd7217032004-08-19 10:49:10 +00001009 case Iop_And32:
1010 e2 = IRExpr_Const(IRConst_U32(
1011 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1012 & e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
1013 break;
sewardje6b39932004-11-06 17:01:15 +00001014
sewardjb9c5cf62004-08-24 15:10:38 +00001015 case Iop_Mul32:
1016 e2 = IRExpr_Const(IRConst_U32(
1017 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1018 * e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
1019 break;
sewardjd7217032004-08-19 10:49:10 +00001020 case Iop_Shl32:
sewardj61348472004-08-20 01:01:04 +00001021 vassert(e->Iex.Binop.arg2->Iex.Const.con->tag == Ico_U8);
1022 shift = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U8);
sewardj29632392004-08-22 02:38:11 +00001023 if (shift >= 0 && shift <= 31)
sewardj278c44c2004-08-20 00:28:13 +00001024 e2 = IRExpr_Const(IRConst_U32(
1025 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1026 << shift)));
sewardjd7217032004-08-19 10:49:10 +00001027 break;
sewardj278c44c2004-08-20 00:28:13 +00001028 case Iop_Sar32: {
1029 /* paranoid ... */
1030 /*signed*/ Int s32;
sewardj61348472004-08-20 01:01:04 +00001031 vassert(e->Iex.Binop.arg2->Iex.Const.con->tag == Ico_U8);
sewardj278c44c2004-08-20 00:28:13 +00001032 s32 = (Int)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32);
sewardj61348472004-08-20 01:01:04 +00001033 shift = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U8);
sewardj278c44c2004-08-20 00:28:13 +00001034 if (shift >= 0 && shift <= 31) {
1035 s32 >>=/*signed*/ shift;
1036 e2 = IRExpr_Const(IRConst_U32((UInt)s32));
1037 }
1038 break;
1039 }
sewardj61348472004-08-20 01:01:04 +00001040 case Iop_Shr32: {
1041 /* paranoid ... */
1042 /*unsigned*/ UInt s32;
1043 vassert(e->Iex.Binop.arg2->Iex.Const.con->tag == Ico_U8);
1044 s32 = (Int)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32);
1045 shift = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U8);
1046 if (shift >= 0 && shift <= 31) {
1047 s32 >>=/*unsigned*/ shift;
1048 e2 = IRExpr_Const(IRConst_U32((UInt)s32));
1049 }
1050 break;
1051 }
sewardjb8e75862004-08-19 17:58:45 +00001052 case Iop_CmpEQ32:
sewardj9d2e7692005-02-07 01:11:31 +00001053 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardjb8e75862004-08-19 17:58:45 +00001054 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
sewardj9d2e7692005-02-07 01:11:31 +00001055 == e->Iex.Binop.arg2->Iex.Const.con->Ico.U32))));
sewardjb8e75862004-08-19 17:58:45 +00001056 break;
sewardjae27ab62004-10-15 21:21:46 +00001057 case Iop_CmpNE32:
sewardj9d2e7692005-02-07 01:11:31 +00001058 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardjae27ab62004-10-15 21:21:46 +00001059 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
sewardj9d2e7692005-02-07 01:11:31 +00001060 != e->Iex.Binop.arg2->Iex.Const.con->Ico.U32))));
sewardjae27ab62004-10-15 21:21:46 +00001061 break;
sewardj7447b5b2004-10-16 11:30:42 +00001062
sewardje6b39932004-11-06 17:01:15 +00001063 case Iop_CmpNE64:
sewardj9d2e7692005-02-07 01:11:31 +00001064 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardje6b39932004-11-06 17:01:15 +00001065 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
sewardj9d2e7692005-02-07 01:11:31 +00001066 != e->Iex.Binop.arg2->Iex.Const.con->Ico.U64))));
sewardje6b39932004-11-06 17:01:15 +00001067 break;
1068
sewardjd9997882004-11-04 19:42:59 +00001069 case Iop_CmpNE8:
sewardj9d2e7692005-02-07 01:11:31 +00001070 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardjd9997882004-11-04 19:42:59 +00001071 ((0xFF & e->Iex.Binop.arg1->Iex.Const.con->Ico.U8)
sewardj9d2e7692005-02-07 01:11:31 +00001072 != (0xFF & e->Iex.Binop.arg2->Iex.Const.con->Ico.U8)))));
sewardjd9997882004-11-04 19:42:59 +00001073 break;
1074
sewardj7447b5b2004-10-16 11:30:42 +00001075 case Iop_CmpLE32U:
sewardj9d2e7692005-02-07 01:11:31 +00001076 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardj7447b5b2004-10-16 11:30:42 +00001077 ((UInt)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32)
sewardj9d2e7692005-02-07 01:11:31 +00001078 <= (UInt)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)))));
sewardj7447b5b2004-10-16 11:30:42 +00001079 break;
sewardj088e4f72004-10-19 01:25:02 +00001080 case Iop_CmpLE32S:
sewardj9d2e7692005-02-07 01:11:31 +00001081 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardj088e4f72004-10-19 01:25:02 +00001082 ((Int)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32)
sewardj9d2e7692005-02-07 01:11:31 +00001083 <= (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)))));
sewardj088e4f72004-10-19 01:25:02 +00001084 break;
sewardje1d45da2004-11-12 00:13:21 +00001085
sewardj9bdd2652004-10-19 12:56:33 +00001086 case Iop_CmpLT32S:
sewardj9d2e7692005-02-07 01:11:31 +00001087 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardj9bdd2652004-10-19 12:56:33 +00001088 ((Int)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32)
sewardj9d2e7692005-02-07 01:11:31 +00001089 < (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)))));
sewardj9bdd2652004-10-19 12:56:33 +00001090 break;
sewardje1d45da2004-11-12 00:13:21 +00001091 case Iop_CmpLT32U:
sewardj9d2e7692005-02-07 01:11:31 +00001092 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardje1d45da2004-11-12 00:13:21 +00001093 ((UInt)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32)
sewardj9d2e7692005-02-07 01:11:31 +00001094 < (UInt)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)))));
sewardje1d45da2004-11-12 00:13:21 +00001095 break;
sewardj7447b5b2004-10-16 11:30:42 +00001096
sewardj088bcb42004-08-19 17:16:52 +00001097 case Iop_32HLto64:
1098 e2 = IRExpr_Const(IRConst_U64(
1099 (((ULong)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32)) << 32)
1100 | ((ULong)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32))
1101 ));
1102 break;
sewardj607dd4f2004-09-08 18:20:19 +00001103 default:
1104 goto unhandled;
sewardjd7217032004-08-19 10:49:10 +00001105 }
sewardjf6729012004-08-25 12:45:13 +00001106
sewardj84be7372004-08-18 13:59:33 +00001107 } else {
sewardjf6729012004-08-25 12:45:13 +00001108
sewardj84be7372004-08-18 13:59:33 +00001109 /* other cases (identities, etc) */
sewardj4afab822005-01-20 10:04:05 +00001110 /* Shl32/Shr32(x,0) ==> x */
1111 if ((e->Iex.Binop.op == Iop_Shl32 || e->Iex.Binop.op == Iop_Shr32)
sewardj52345402004-10-13 16:08:14 +00001112 && e->Iex.Binop.arg2->tag == Iex_Const
sewardj4980c6b2004-10-13 16:16:27 +00001113 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U8 == 0) {
sewardj52345402004-10-13 16:08:14 +00001114 e2 = e->Iex.Binop.arg1;
1115 } else
1116
sewardjd9997882004-11-04 19:42:59 +00001117 /* Or32/Add32(x,0) ==> x */
1118 if ((e->Iex.Binop.op == Iop_Add32 || e->Iex.Binop.op == Iop_Or32)
sewardj84be7372004-08-18 13:59:33 +00001119 && e->Iex.Binop.arg2->tag == Iex_Const
1120 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U32 == 0) {
1121 e2 = e->Iex.Binop.arg1;
sewardjf6729012004-08-25 12:45:13 +00001122 } else
1123
sewardjdcd6c882004-12-16 11:41:06 +00001124 /* Or64/Add64(x,0) ==> x */
1125 if ((e->Iex.Binop.op == Iop_Add64 || e->Iex.Binop.op == Iop_Or64)
1126 && e->Iex.Binop.arg2->tag == Iex_Const
1127 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U64 == 0) {
1128 e2 = e->Iex.Binop.arg1;
1129 } else
1130
sewardj832853b2004-11-06 12:10:04 +00001131 /* And32(x,0xFFFFFFFF) ==> x */
1132 if (e->Iex.Binop.op == Iop_And32
1133 && e->Iex.Binop.arg2->tag == Iex_Const
1134 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U32 == 0xFFFFFFFF) {
1135 e2 = e->Iex.Binop.arg1;
1136 } else
1137
sewardjd9997882004-11-04 19:42:59 +00001138 /* Or32(0,x) ==> x */
1139 if (e->Iex.Binop.op == Iop_Or32
1140 && e->Iex.Binop.arg1->tag == Iex_Const
1141 && e->Iex.Binop.arg1->Iex.Const.con->Ico.U32 == 0) {
1142 e2 = e->Iex.Binop.arg2;
1143 } else
1144
sewardjdcd6c882004-12-16 11:41:06 +00001145 /* Or8/16/32/64(t,t) ==> t, for some IRTemp t */
1146 /* And8/16/32/64(t,t) ==> t, for some IRTemp t */
1147 if ( (e->Iex.Binop.op == Iop_And64
1148 || e->Iex.Binop.op == Iop_And32
sewardjf6729012004-08-25 12:45:13 +00001149 || e->Iex.Binop.op == Iop_And16
sewardjdcd6c882004-12-16 11:41:06 +00001150 || e->Iex.Binop.op == Iop_And8
1151 || e->Iex.Binop.op == Iop_Or64
1152 || e->Iex.Binop.op == Iop_Or32
1153 || e->Iex.Binop.op == Iop_Or16
1154 || e->Iex.Binop.op == Iop_Or8)
sewardjf6729012004-08-25 12:45:13 +00001155 && sameIRTemps(e->Iex.Binop.arg1, e->Iex.Binop.arg2)) {
1156 e2 = e->Iex.Binop.arg1;
sewardjaba4fff2004-10-08 21:37:15 +00001157 }
sewardjf6729012004-08-25 12:45:13 +00001158
sewardj84be7372004-08-18 13:59:33 +00001159 }
1160 }
1161
1162 /* Mux0X */
1163 if (e->tag == Iex_Mux0X
1164 && e->Iex.Mux0X.cond->tag == Iex_Const) {
1165 Bool zero;
1166 /* assured us by the IR type rules */
1167 vassert(e->Iex.Mux0X.cond->Iex.Const.con->tag == Ico_U8);
sewardj9d2e7692005-02-07 01:11:31 +00001168 zero = toBool(0 == (0xFF & e->Iex.Mux0X.cond
1169 ->Iex.Const.con->Ico.U8));
sewardj84be7372004-08-18 13:59:33 +00001170 e2 = zero ? e->Iex.Mux0X.expr0 : e->Iex.Mux0X.exprX;
1171 }
1172
sewardj088bcb42004-08-19 17:16:52 +00001173 if (DEBUG_IROPT && e2 != e) {
1174 vex_printf("FOLD: ");
sewardj84be7372004-08-18 13:59:33 +00001175 ppIRExpr(e); vex_printf(" -> ");
1176 ppIRExpr(e2); vex_printf("\n");
1177 }
1178
1179 return e2;
1180
1181 unhandled:
sewardj883b00b2004-09-11 09:30:24 +00001182# if 0
sewardj84be7372004-08-18 13:59:33 +00001183 vex_printf("\n\n");
1184 ppIRExpr(e);
1185 vpanic("fold_Expr: no rule for the above");
sewardj883b00b2004-09-11 09:30:24 +00001186# else
1187 vex_printf("vex iropt: fold_Expr: no rule for: ");
1188 ppIRExpr(e);
1189 vex_printf("\n");
1190 return e2;
1191# endif
sewardj84be7372004-08-18 13:59:33 +00001192}
1193
1194
sewardj84be7372004-08-18 13:59:33 +00001195/* Apply the subst to a simple 1-level expression -- guaranteed to be
1196 1-level due to previous flattening pass. */
1197
sewardj62617ef2004-10-13 23:29:22 +00001198static IRExpr* subst_Expr ( IRExpr** env, IRExpr* ex )
sewardj84be7372004-08-18 13:59:33 +00001199{
sewardj62617ef2004-10-13 23:29:22 +00001200 switch (ex->tag) {
1201 case Iex_Tmp:
1202 if (env[(Int)ex->Iex.Tmp.tmp] != NULL) {
1203 return env[(Int)ex->Iex.Tmp.tmp];
1204 } else {
1205 /* not bound in env */
1206 return ex;
1207 }
1208
1209 case Iex_Const:
1210 case Iex_Get:
sewardj84be7372004-08-18 13:59:33 +00001211 return ex;
sewardj62617ef2004-10-13 23:29:22 +00001212
1213 case Iex_GetI:
sewardjeeac8412004-11-02 00:26:55 +00001214 vassert(isAtom(ex->Iex.GetI.ix));
sewardj62617ef2004-10-13 23:29:22 +00001215 return IRExpr_GetI(
1216 ex->Iex.GetI.descr,
sewardjeeac8412004-11-02 00:26:55 +00001217 subst_Expr(env, ex->Iex.GetI.ix),
sewardj62617ef2004-10-13 23:29:22 +00001218 ex->Iex.GetI.bias
1219 );
1220
1221 case Iex_Binop:
1222 vassert(isAtom(ex->Iex.Binop.arg1));
1223 vassert(isAtom(ex->Iex.Binop.arg2));
1224 return IRExpr_Binop(
1225 ex->Iex.Binop.op,
1226 subst_Expr(env, ex->Iex.Binop.arg1),
1227 subst_Expr(env, ex->Iex.Binop.arg2)
1228 );
1229
1230 case Iex_Unop:
1231 vassert(isAtom(ex->Iex.Unop.arg));
1232 return IRExpr_Unop(
1233 ex->Iex.Unop.op,
1234 subst_Expr(env, ex->Iex.Unop.arg)
1235 );
1236
1237 case Iex_LDle:
1238 vassert(isAtom(ex->Iex.LDle.addr));
1239 return IRExpr_LDle(
1240 ex->Iex.LDle.ty,
1241 subst_Expr(env, ex->Iex.LDle.addr)
1242 );
1243
1244 case Iex_CCall: {
1245 Int i;
1246 IRExpr** args2 = sopyIRExprVec(ex->Iex.CCall.args);
1247 for (i = 0; args2[i]; i++) {
1248 vassert(isAtom(args2[i]));
1249 args2[i] = subst_Expr(env, args2[i]);
1250 }
1251 return IRExpr_CCall(
sewardj8ea867b2004-10-30 19:03:02 +00001252 ex->Iex.CCall.cee,
sewardj62617ef2004-10-13 23:29:22 +00001253 ex->Iex.CCall.retty,
1254 args2
1255 );
sewardj84be7372004-08-18 13:59:33 +00001256 }
sewardj62617ef2004-10-13 23:29:22 +00001257
1258 case Iex_Mux0X:
1259 vassert(isAtom(ex->Iex.Mux0X.cond));
1260 vassert(isAtom(ex->Iex.Mux0X.expr0));
1261 vassert(isAtom(ex->Iex.Mux0X.exprX));
1262 return IRExpr_Mux0X(
1263 subst_Expr(env, ex->Iex.Mux0X.cond),
1264 subst_Expr(env, ex->Iex.Mux0X.expr0),
1265 subst_Expr(env, ex->Iex.Mux0X.exprX)
1266 );
1267
1268 default:
1269 vex_printf("\n\n"); ppIRExpr(ex);
1270 vpanic("subst_Expr");
1271
sewardj84be7372004-08-18 13:59:33 +00001272 }
sewardj84be7372004-08-18 13:59:33 +00001273}
1274
1275
1276/* Apply the subst to stmt, then fold the result as much as possible.
sewardjb8e75862004-08-19 17:58:45 +00001277 Much simplified due to stmt being previously flattened. Returning
1278 NULL means the statement has been turned into a no-op. */
sewardj84be7372004-08-18 13:59:33 +00001279
sewardj62617ef2004-10-13 23:29:22 +00001280static IRStmt* subst_and_fold_Stmt ( IRExpr** env, IRStmt* st )
sewardj84be7372004-08-18 13:59:33 +00001281{
1282# if 0
1283 vex_printf("\nsubst and fold stmt\n");
1284 ppIRStmt(st);
1285 vex_printf("\n");
1286# endif
1287
sewardj62617ef2004-10-13 23:29:22 +00001288 switch (st->tag) {
1289 case Ist_Put:
1290 vassert(isAtom(st->Ist.Put.data));
1291 return IRStmt_Put(
1292 st->Ist.Put.offset,
1293 fold_Expr(subst_Expr(env, st->Ist.Put.data))
1294 );
sewardj84be7372004-08-18 13:59:33 +00001295
sewardj62617ef2004-10-13 23:29:22 +00001296 case Ist_PutI:
sewardjeeac8412004-11-02 00:26:55 +00001297 vassert(isAtom(st->Ist.PutI.ix));
sewardj62617ef2004-10-13 23:29:22 +00001298 vassert(isAtom(st->Ist.PutI.data));
1299 return IRStmt_PutI(
1300 st->Ist.PutI.descr,
sewardjeeac8412004-11-02 00:26:55 +00001301 fold_Expr(subst_Expr(env, st->Ist.PutI.ix)),
sewardj62617ef2004-10-13 23:29:22 +00001302 st->Ist.PutI.bias,
1303 fold_Expr(subst_Expr(env, st->Ist.PutI.data))
1304 );
sewardjd7217032004-08-19 10:49:10 +00001305
sewardj62617ef2004-10-13 23:29:22 +00001306 case Ist_Tmp:
1307 /* This is the one place where an expr (st->Ist.Tmp.data) is
1308 allowed to be more than just a constant or a tmp. */
1309 return IRStmt_Tmp(
1310 st->Ist.Tmp.tmp,
1311 fold_Expr(subst_Expr(env, st->Ist.Tmp.data))
1312 );
sewardj84be7372004-08-18 13:59:33 +00001313
sewardj62617ef2004-10-13 23:29:22 +00001314 case Ist_STle:
1315 vassert(isAtom(st->Ist.STle.addr));
1316 vassert(isAtom(st->Ist.STle.data));
1317 return IRStmt_STle(
1318 fold_Expr(subst_Expr(env, st->Ist.STle.addr)),
1319 fold_Expr(subst_Expr(env, st->Ist.STle.data))
1320 );
sewardj84be7372004-08-18 13:59:33 +00001321
sewardj62617ef2004-10-13 23:29:22 +00001322 case Ist_Dirty: {
1323 Int i;
1324 IRDirty *d, *d2;
1325 d = st->Ist.Dirty.details;
1326 d2 = emptyIRDirty();
1327 *d2 = *d;
1328 d2->args = sopyIRExprVec(d2->args);
1329 if (d2->mFx != Ifx_None) {
1330 vassert(isAtom(d2->mAddr));
1331 d2->mAddr = fold_Expr(subst_Expr(env, d2->mAddr));
sewardjb8e75862004-08-19 17:58:45 +00001332 }
sewardjb8385d82004-11-02 01:34:15 +00001333 vassert(isAtom(d2->guard));
1334 d2->guard = fold_Expr(subst_Expr(env, d2->guard));
sewardj62617ef2004-10-13 23:29:22 +00001335 for (i = 0; d2->args[i]; i++) {
1336 vassert(isAtom(d2->args[i]));
1337 d2->args[i] = fold_Expr(subst_Expr(env, d2->args[i]));
1338 }
1339 return IRStmt_Dirty(d2);
sewardjb8e75862004-08-19 17:58:45 +00001340 }
sewardj84be7372004-08-18 13:59:33 +00001341
sewardj3e838932005-01-07 12:09:15 +00001342 case Ist_MFence:
1343 return IRStmt_MFence();
1344
sewardj62617ef2004-10-13 23:29:22 +00001345 case Ist_Exit: {
1346 IRExpr* fcond;
sewardj0276d4b2004-11-15 15:30:21 +00001347 vassert(isAtom(st->Ist.Exit.guard));
1348 fcond = fold_Expr(subst_Expr(env, st->Ist.Exit.guard));
sewardj62617ef2004-10-13 23:29:22 +00001349 if (fcond->tag == Iex_Const) {
1350 /* Interesting. The condition on this exit has folded down to
1351 a constant. */
sewardjba999312004-11-15 15:21:17 +00001352 vassert(fcond->Iex.Const.con->tag == Ico_U1);
sewardj98430292004-12-29 17:34:50 +00001353 vassert(fcond->Iex.Const.con->Ico.U1 == False
1354 || fcond->Iex.Const.con->Ico.U1 == True);
sewardjba999312004-11-15 15:21:17 +00001355 if (fcond->Iex.Const.con->Ico.U1 == False) {
sewardj62617ef2004-10-13 23:29:22 +00001356 /* exit is never going to happen, so dump the statement. */
1357 return NULL;
1358 } else {
sewardjba999312004-11-15 15:21:17 +00001359 vassert(fcond->Iex.Const.con->Ico.U1 == True);
sewardj62617ef2004-10-13 23:29:22 +00001360 /* Hmmm. The exit has become unconditional. Leave it as
1361 it is for now, since we'd have to truncate the BB at
1362 this point, which is tricky. */
1363 /* fall out into the reconstruct-the-exit code. */
sewardj08613742004-10-25 13:01:45 +00001364 if (vex_control.iropt_verbosity > 0)
1365 /* really a misuse of vex_control.iropt_verbosity */
sewardj8c2c10b2004-10-16 20:51:52 +00001366 vex_printf("vex iropt: IRStmt_Exit became unconditional\n");
sewardj62617ef2004-10-13 23:29:22 +00001367 }
1368 }
sewardj893aada2004-11-29 19:57:54 +00001369 return IRStmt_Exit(fcond, st->Ist.Exit.jk, st->Ist.Exit.dst);
sewardj62617ef2004-10-13 23:29:22 +00001370 }
1371
1372 default:
1373 vex_printf("\n"); ppIRStmt(st);
1374 vpanic("subst_and_fold_Stmt");
1375 }
sewardj84be7372004-08-18 13:59:33 +00001376}
1377
1378
sewardjd9997882004-11-04 19:42:59 +00001379IRBB* cprop_BB ( IRBB* in )
sewardj84be7372004-08-18 13:59:33 +00001380{
sewardj62617ef2004-10-13 23:29:22 +00001381 Int i;
1382 IRBB* out;
1383 IRStmt* st2;
1384 Int n_tmps = in->tyenv->types_used;
1385 IRExpr** env = LibVEX_Alloc(n_tmps * sizeof(IRExpr*));
sewardj84be7372004-08-18 13:59:33 +00001386
1387 out = emptyIRBB();
sewardj695cff92004-10-13 14:50:14 +00001388 out->tyenv = dopyIRTypeEnv( in->tyenv );
sewardj84be7372004-08-18 13:59:33 +00001389
1390 /* Set up the env with which travels forward. This holds a
1391 substitution, mapping IRTemps to atoms, that is, IRExprs which
1392 are either IRTemps or IRConsts. Thus, copy and constant
1393 propagation is done. The environment is to be applied as we
1394 move along. Keys are IRTemps. Values are IRExpr*s.
1395 */
sewardj62617ef2004-10-13 23:29:22 +00001396 for (i = 0; i < n_tmps; i++)
1397 env[i] = NULL;
sewardj84be7372004-08-18 13:59:33 +00001398
1399 /* For each original SSA-form stmt ... */
1400 for (i = 0; i < in->stmts_used; i++) {
1401
1402 /* First apply the substitution to the current stmt. This
1403 propagates in any constants and tmp-tmp assignments
1404 accumulated prior to this point. As part of the subst_Stmt
1405 call, also then fold any constant expressions resulting. */
1406
sewardjf0e43162004-08-20 00:11:12 +00001407 st2 = in->stmts[i];
1408
1409 /* perhaps st2 is already a no-op? */
1410 if (!st2) continue;
1411
1412 st2 = subst_and_fold_Stmt( env, st2 );
sewardj84be7372004-08-18 13:59:33 +00001413
sewardjb8e75862004-08-19 17:58:45 +00001414 /* If the statement has been folded into a no-op, forget it. */
sewardjf0e43162004-08-20 00:11:12 +00001415 if (!st2) continue;
sewardjb8e75862004-08-19 17:58:45 +00001416
sewardj84be7372004-08-18 13:59:33 +00001417 /* Now consider what the stmt looks like. If it's of the form
1418 't = const' or 't1 = t2', add it to the running environment
1419 and not to the output BB. Otherwise, add it to the output
sewardjc0b42282004-10-12 13:44:12 +00001420 BB. Note, we choose not to propagate const when const is an
1421 F64i, so that F64i literals can be CSE'd later. This helps
1422 x86 floating point code generation. */
sewardj84be7372004-08-18 13:59:33 +00001423
sewardjc0b42282004-10-12 13:44:12 +00001424 if (st2->tag == Ist_Tmp
1425 && st2->Ist.Tmp.data->tag == Iex_Const
1426 && st2->Ist.Tmp.data->Iex.Const.con->tag != Ico_F64i) {
sewardj84be7372004-08-18 13:59:33 +00001427 /* 't = const' -- add to env.
1428 The pair (IRTemp, IRExpr*) is added. */
sewardj62617ef2004-10-13 23:29:22 +00001429 env[(Int)(st2->Ist.Tmp.tmp)] = st2->Ist.Tmp.data;
sewardj84be7372004-08-18 13:59:33 +00001430 }
1431 else
sewardj6d076362004-09-23 11:06:17 +00001432 if (st2->tag == Ist_Tmp && st2->Ist.Tmp.data->tag == Iex_Tmp) {
sewardj84be7372004-08-18 13:59:33 +00001433 /* 't1 = t2' -- add to env.
1434 The pair (IRTemp, IRExpr*) is added. */
sewardj62617ef2004-10-13 23:29:22 +00001435 env[(Int)(st2->Ist.Tmp.tmp)] = st2->Ist.Tmp.data;
sewardj84be7372004-08-18 13:59:33 +00001436 }
1437 else {
1438 /* Not interesting, copy st2 into the output block. */
1439 addStmtToIRBB( out, st2 );
1440 }
1441 }
1442
sewardj84be7372004-08-18 13:59:33 +00001443 out->next = subst_Expr( env, in->next );
1444 out->jumpkind = in->jumpkind;
1445 return out;
1446}
1447
1448
1449
sewardjedf4d692004-08-17 13:52:58 +00001450/*---------------------------------------------------------------*/
sewardj39e3f242004-08-18 16:54:52 +00001451/*--- Dead code (t = E) removal ---*/
1452/*---------------------------------------------------------------*/
1453
sewardjea602bc2004-10-14 21:40:12 +00001454/* The type of the HashHW map is: a map from IRTemp to nothing
sewardj39e3f242004-08-18 16:54:52 +00001455 -- really just operating a set or IRTemps.
1456*/
1457
sewardjd503a322004-10-13 22:41:16 +00001458inline
1459static void addUses_Temp ( Bool* set, IRTemp tmp )
sewardj17442fe2004-09-20 14:54:28 +00001460{
sewardjd503a322004-10-13 22:41:16 +00001461 set[(Int)tmp] = True;
sewardj17442fe2004-09-20 14:54:28 +00001462}
1463
sewardjd503a322004-10-13 22:41:16 +00001464static void addUses_Expr ( Bool* set, IRExpr* e )
sewardj39e3f242004-08-18 16:54:52 +00001465{
1466 Int i;
1467 switch (e->tag) {
sewardjd7217032004-08-19 10:49:10 +00001468 case Iex_GetI:
sewardjeeac8412004-11-02 00:26:55 +00001469 addUses_Expr(set, e->Iex.GetI.ix);
sewardjd7217032004-08-19 10:49:10 +00001470 return;
sewardj39e3f242004-08-18 16:54:52 +00001471 case Iex_Mux0X:
1472 addUses_Expr(set, e->Iex.Mux0X.cond);
1473 addUses_Expr(set, e->Iex.Mux0X.expr0);
1474 addUses_Expr(set, e->Iex.Mux0X.exprX);
1475 return;
1476 case Iex_CCall:
1477 for (i = 0; e->Iex.CCall.args[i]; i++)
1478 addUses_Expr(set, e->Iex.CCall.args[i]);
1479 return;
1480 case Iex_LDle:
1481 addUses_Expr(set, e->Iex.LDle.addr);
1482 return;
1483 case Iex_Binop:
1484 addUses_Expr(set, e->Iex.Binop.arg1);
1485 addUses_Expr(set, e->Iex.Binop.arg2);
1486 return;
1487 case Iex_Unop:
1488 addUses_Expr(set, e->Iex.Unop.arg);
1489 return;
1490 case Iex_Tmp:
sewardj17442fe2004-09-20 14:54:28 +00001491 addUses_Temp(set, e->Iex.Tmp.tmp);
sewardj39e3f242004-08-18 16:54:52 +00001492 return;
1493 case Iex_Const:
1494 case Iex_Get:
1495 return;
1496 default:
1497 vex_printf("\n");
1498 ppIRExpr(e);
1499 vpanic("addUses_Expr");
1500 }
1501}
1502
sewardjd503a322004-10-13 22:41:16 +00001503static void addUses_Stmt ( Bool* set, IRStmt* st )
sewardj39e3f242004-08-18 16:54:52 +00001504{
sewardj17442fe2004-09-20 14:54:28 +00001505 Int i;
1506 IRDirty* d;
sewardj39e3f242004-08-18 16:54:52 +00001507 switch (st->tag) {
sewardjd7217032004-08-19 10:49:10 +00001508 case Ist_PutI:
sewardjeeac8412004-11-02 00:26:55 +00001509 addUses_Expr(set, st->Ist.PutI.ix);
sewardj2d3f77c2004-09-22 23:49:09 +00001510 addUses_Expr(set, st->Ist.PutI.data);
sewardjd7217032004-08-19 10:49:10 +00001511 return;
sewardj39e3f242004-08-18 16:54:52 +00001512 case Ist_Tmp:
sewardj6d076362004-09-23 11:06:17 +00001513 addUses_Expr(set, st->Ist.Tmp.data);
sewardj39e3f242004-08-18 16:54:52 +00001514 return;
1515 case Ist_Put:
sewardj6d076362004-09-23 11:06:17 +00001516 addUses_Expr(set, st->Ist.Put.data);
sewardj39e3f242004-08-18 16:54:52 +00001517 return;
1518 case Ist_STle:
1519 addUses_Expr(set, st->Ist.STle.addr);
1520 addUses_Expr(set, st->Ist.STle.data);
1521 return;
sewardj17442fe2004-09-20 14:54:28 +00001522 case Ist_Dirty:
1523 d = st->Ist.Dirty.details;
1524 if (d->mFx != Ifx_None)
1525 addUses_Expr(set, d->mAddr);
sewardjb8385d82004-11-02 01:34:15 +00001526 addUses_Expr(set, d->guard);
sewardj17442fe2004-09-20 14:54:28 +00001527 for (i = 0; d->args[i] != NULL; i++)
1528 addUses_Expr(set, d->args[i]);
1529 return;
sewardj3e838932005-01-07 12:09:15 +00001530 case Ist_MFence:
1531 return;
sewardj17442fe2004-09-20 14:54:28 +00001532 case Ist_Exit:
sewardj0276d4b2004-11-15 15:30:21 +00001533 addUses_Expr(set, st->Ist.Exit.guard);
sewardj17442fe2004-09-20 14:54:28 +00001534 return;
sewardj39e3f242004-08-18 16:54:52 +00001535 default:
1536 vex_printf("\n");
1537 ppIRStmt(st);
1538 vpanic("addUses_Stmt");
sewardjd503a322004-10-13 22:41:16 +00001539 }
sewardj39e3f242004-08-18 16:54:52 +00001540}
1541
1542
sewardjba999312004-11-15 15:21:17 +00001543/* Is this literally IRExpr_Const(IRConst_U1(False)) ? */
1544static Bool isZeroU1 ( IRExpr* e )
sewardjd9997882004-11-04 19:42:59 +00001545{
sewardj9d2e7692005-02-07 01:11:31 +00001546 return toBool( e->tag == Iex_Const
1547 && e->Iex.Const.con->tag == Ico_U1
1548 && e->Iex.Const.con->Ico.U1 == False );
sewardjd9997882004-11-04 19:42:59 +00001549}
1550
sewardj39e3f242004-08-18 16:54:52 +00001551
1552/* Note, this destructively modifies the given IRBB. */
1553
1554/* Scan backwards through statements, carrying a set of IRTemps which
1555 are known to be used after the current point. On encountering 't =
1556 E', delete the binding if it is not used. Otherwise, add any temp
1557 uses to the set and keep on moving backwards. */
1558
sewardj49651f42004-10-28 22:11:04 +00001559/* notstatic */ void do_deadcode_BB ( IRBB* bb )
sewardj39e3f242004-08-18 16:54:52 +00001560{
1561 Int i;
sewardjd503a322004-10-13 22:41:16 +00001562 Int n_tmps = bb->tyenv->types_used;
1563 Bool* set = LibVEX_Alloc(n_tmps * sizeof(Bool));
sewardj39e3f242004-08-18 16:54:52 +00001564 IRStmt* st;
1565
sewardjd503a322004-10-13 22:41:16 +00001566 for (i = 0; i < n_tmps; i++)
1567 set[i] = False;
1568
sewardj39e3f242004-08-18 16:54:52 +00001569 /* start off by recording IRTemp uses in the next field. */
1570 addUses_Expr(set, bb->next);
1571
1572 /* Work backwards through the stmts */
1573 for (i = bb->stmts_used-1; i >= 0; i--) {
1574 st = bb->stmts[i];
sewardj84ff0652004-08-23 16:16:08 +00001575 if (!st)
1576 continue;
sewardj39e3f242004-08-18 16:54:52 +00001577 if (st->tag == Ist_Tmp
sewardjd503a322004-10-13 22:41:16 +00001578 && set[(Int)(st->Ist.Tmp.tmp)] == False) {
sewardj39e3f242004-08-18 16:54:52 +00001579 /* it's an IRTemp which never got used. Delete it. */
sewardj088bcb42004-08-19 17:16:52 +00001580 if (DEBUG_IROPT) {
sewardj39e3f242004-08-18 16:54:52 +00001581 vex_printf("DEAD: ");
1582 ppIRStmt(st);
1583 vex_printf("\n");
1584 }
1585 bb->stmts[i] = NULL;
sewardjd9997882004-11-04 19:42:59 +00001586 }
1587 else
1588 if (st->tag == Ist_Dirty
1589 && st->Ist.Dirty.details->guard
sewardjba999312004-11-15 15:21:17 +00001590 && isZeroU1(st->Ist.Dirty.details->guard)) {
sewardjd9997882004-11-04 19:42:59 +00001591 /* This is a dirty helper which will never get called. Delete it. */
1592 bb->stmts[i] = NULL;
1593 }
1594 else {
sewardj39e3f242004-08-18 16:54:52 +00001595 /* Note any IRTemp uses made by the current statement. */
1596 addUses_Stmt(set, st);
1597 }
1598 }
1599}
1600
sewardj84ff0652004-08-23 16:16:08 +00001601/*---------------------------------------------------------------*/
1602/*--- Specialisation of helper function calls, in ---*/
1603/*--- collaboration with the front end ---*/
1604/*---------------------------------------------------------------*/
1605
1606static
sewardjb9230752004-12-29 19:25:06 +00001607IRBB* spec_helpers_BB ( IRBB* bb,
sewardj9d2e7692005-02-07 01:11:31 +00001608 IRExpr* (*specHelper) ( HChar*, IRExpr**) )
sewardj84ff0652004-08-23 16:16:08 +00001609{
sewardjb9230752004-12-29 19:25:06 +00001610 Int i;
sewardj84ff0652004-08-23 16:16:08 +00001611 IRStmt* st;
1612 IRExpr* ex;
sewardjb9230752004-12-29 19:25:06 +00001613 Bool any = False;
sewardj84ff0652004-08-23 16:16:08 +00001614
1615 for (i = bb->stmts_used-1; i >= 0; i--) {
1616 st = bb->stmts[i];
1617
1618 if (!st
1619 || st->tag != Ist_Tmp
sewardj6d076362004-09-23 11:06:17 +00001620 || st->Ist.Tmp.data->tag != Iex_CCall)
sewardjaba4fff2004-10-08 21:37:15 +00001621 continue;
sewardj84ff0652004-08-23 16:16:08 +00001622
sewardj8ea867b2004-10-30 19:03:02 +00001623 ex = (*specHelper)( st->Ist.Tmp.data->Iex.CCall.cee->name,
sewardj6d076362004-09-23 11:06:17 +00001624 st->Ist.Tmp.data->Iex.CCall.args );
sewardj84ff0652004-08-23 16:16:08 +00001625 if (!ex)
sewardjaba4fff2004-10-08 21:37:15 +00001626 /* the front end can't think of a suitable replacement */
1627 continue;
sewardj84ff0652004-08-23 16:16:08 +00001628
1629 /* We got something better. Install it in the bb. */
sewardjb9230752004-12-29 19:25:06 +00001630 any = True;
sewardj84ff0652004-08-23 16:16:08 +00001631 bb->stmts[i]
1632 = IRStmt_Tmp(st->Ist.Tmp.tmp, ex);
1633
1634 if (0) {
1635 vex_printf("SPEC: ");
sewardj6d076362004-09-23 11:06:17 +00001636 ppIRExpr(st->Ist.Tmp.data);
sewardj84ff0652004-08-23 16:16:08 +00001637 vex_printf(" --> ");
1638 ppIRExpr(ex);
1639 vex_printf("\n");
1640 }
1641 }
sewardjb9230752004-12-29 19:25:06 +00001642
1643 if (any)
1644 bb = flatten_BB(bb);
1645 return bb;
sewardj84ff0652004-08-23 16:16:08 +00001646}
1647
sewardj29632392004-08-22 02:38:11 +00001648
1649/*---------------------------------------------------------------*/
sewardj08210532004-12-29 17:09:11 +00001650/*--- Common Subexpression Elimination ---*/
1651/*---------------------------------------------------------------*/
1652
1653/* Expensive in time and space. */
1654
1655/* Uses two environments:
1656 a IRTemp -> IRTemp mapping
1657 a mapping from AvailExpr* to IRTemp
1658*/
1659
1660typedef
1661 struct {
1662 enum { Ut, Btt, Btc, Bct, Cf64i } tag;
1663 union {
1664 /* unop(tmp) */
1665 struct {
1666 IROp op;
1667 IRTemp arg;
1668 } Ut;
1669 /* binop(tmp,tmp) */
1670 struct {
1671 IROp op;
1672 IRTemp arg1;
1673 IRTemp arg2;
1674 } Btt;
1675 /* binop(tmp,const) */
1676 struct {
1677 IROp op;
1678 IRTemp arg1;
1679 IRConst con2;
1680 } Btc;
1681 /* binop(const,tmp) */
1682 struct {
1683 IROp op;
1684 IRConst con1;
1685 IRTemp arg2;
1686 } Bct;
1687 /* F64i-style const */
1688 struct {
1689 ULong f64i;
1690 } Cf64i;
1691 } u;
1692 }
1693 AvailExpr;
1694
1695static Bool eq_AvailExpr ( AvailExpr* a1, AvailExpr* a2 )
1696{
1697 if (a1->tag != a2->tag)
1698 return False;
1699 switch (a1->tag) {
1700 case Ut:
sewardj9d2e7692005-02-07 01:11:31 +00001701 return toBool(
1702 a1->u.Ut.op == a2->u.Ut.op
1703 && a1->u.Ut.arg == a2->u.Ut.arg);
sewardj08210532004-12-29 17:09:11 +00001704 case Btt:
sewardj9d2e7692005-02-07 01:11:31 +00001705 return toBool(
1706 a1->u.Btt.op == a2->u.Btt.op
sewardj08210532004-12-29 17:09:11 +00001707 && a1->u.Btt.arg1 == a2->u.Btt.arg1
sewardj9d2e7692005-02-07 01:11:31 +00001708 && a1->u.Btt.arg2 == a2->u.Btt.arg2);
sewardj08210532004-12-29 17:09:11 +00001709 case Btc:
sewardj9d2e7692005-02-07 01:11:31 +00001710 return toBool(
1711 a1->u.Btc.op == a2->u.Btc.op
sewardj08210532004-12-29 17:09:11 +00001712 && a1->u.Btc.arg1 == a2->u.Btc.arg1
sewardj9d2e7692005-02-07 01:11:31 +00001713 && eqIRConst(&a1->u.Btc.con2, &a2->u.Btc.con2));
sewardj08210532004-12-29 17:09:11 +00001714 case Bct:
sewardj9d2e7692005-02-07 01:11:31 +00001715 return toBool(
1716 a1->u.Bct.op == a2->u.Bct.op
sewardj08210532004-12-29 17:09:11 +00001717 && a1->u.Bct.arg2 == a2->u.Bct.arg2
sewardj9d2e7692005-02-07 01:11:31 +00001718 && eqIRConst(&a1->u.Bct.con1, &a2->u.Bct.con1));
sewardj08210532004-12-29 17:09:11 +00001719 case Cf64i:
sewardj9d2e7692005-02-07 01:11:31 +00001720 return toBool(a1->u.Cf64i.f64i == a2->u.Cf64i.f64i);
sewardj08210532004-12-29 17:09:11 +00001721 default: vpanic("eq_AvailExpr");
1722 }
1723}
1724
1725static IRExpr* availExpr_to_IRExpr ( AvailExpr* ae )
1726{
1727 IRConst* con;
1728 switch (ae->tag) {
1729 case Ut:
1730 return IRExpr_Unop( ae->u.Ut.op, IRExpr_Tmp(ae->u.Ut.arg) );
1731 case Btt:
1732 return IRExpr_Binop( ae->u.Btt.op,
1733 IRExpr_Tmp(ae->u.Btt.arg1),
1734 IRExpr_Tmp(ae->u.Btt.arg2) );
1735 case Btc:
1736 con = LibVEX_Alloc(sizeof(IRConst));
1737 *con = ae->u.Btc.con2;
1738 return IRExpr_Binop( ae->u.Btc.op,
1739 IRExpr_Tmp(ae->u.Btc.arg1), IRExpr_Const(con) );
1740 case Bct:
1741 con = LibVEX_Alloc(sizeof(IRConst));
1742 *con = ae->u.Bct.con1;
1743 return IRExpr_Binop( ae->u.Bct.op,
1744 IRExpr_Const(con), IRExpr_Tmp(ae->u.Bct.arg2) );
1745 case Cf64i:
1746 return IRExpr_Const(IRConst_F64i(ae->u.Cf64i.f64i));
1747 default:
1748 vpanic("availExpr_to_IRExpr");
1749 }
1750}
1751
1752inline
1753static IRTemp subst_AvailExpr_Temp ( HashHW* env, IRTemp tmp )
1754{
1755 HWord res;
1756 /* env :: IRTemp -> IRTemp */
1757 if (lookupHHW( env, &res, (HWord)tmp ))
1758 return (IRTemp)res;
1759 else
1760 return tmp;
1761}
1762
1763static void subst_AvailExpr ( HashHW* env, AvailExpr* ae )
1764{
1765 /* env :: IRTemp -> IRTemp */
1766 switch (ae->tag) {
1767 case Ut:
1768 ae->u.Ut.arg = subst_AvailExpr_Temp( env, ae->u.Ut.arg );
1769 break;
1770 case Btt:
1771 ae->u.Btt.arg1 = subst_AvailExpr_Temp( env, ae->u.Btt.arg1 );
1772 ae->u.Btt.arg2 = subst_AvailExpr_Temp( env, ae->u.Btt.arg2 );
1773 break;
1774 case Btc:
1775 ae->u.Btc.arg1 = subst_AvailExpr_Temp( env, ae->u.Btc.arg1 );
1776 break;
1777 case Bct:
1778 ae->u.Bct.arg2 = subst_AvailExpr_Temp( env, ae->u.Bct.arg2 );
1779 break;
1780 case Cf64i:
1781 break;
1782 default:
1783 vpanic("subst_AvailExpr");
1784 }
1785}
1786
1787static AvailExpr* irExpr_to_AvailExpr ( IRExpr* e )
1788{
1789 AvailExpr* ae;
1790
1791 if (e->tag == Iex_Unop
1792 && e->Iex.Unop.arg->tag == Iex_Tmp) {
1793 ae = LibVEX_Alloc(sizeof(AvailExpr));
1794 ae->tag = Ut;
1795 ae->u.Ut.op = e->Iex.Unop.op;
1796 ae->u.Ut.arg = e->Iex.Unop.arg->Iex.Tmp.tmp;
1797 return ae;
1798 }
1799
1800 if (e->tag == Iex_Binop
1801 && e->Iex.Binop.arg1->tag == Iex_Tmp
1802 && e->Iex.Binop.arg2->tag == Iex_Tmp) {
1803 ae = LibVEX_Alloc(sizeof(AvailExpr));
1804 ae->tag = Btt;
1805 ae->u.Btt.op = e->Iex.Binop.op;
1806 ae->u.Btt.arg1 = e->Iex.Binop.arg1->Iex.Tmp.tmp;
1807 ae->u.Btt.arg2 = e->Iex.Binop.arg2->Iex.Tmp.tmp;
1808 return ae;
1809 }
1810
1811 if (e->tag == Iex_Binop
1812 && e->Iex.Binop.arg1->tag == Iex_Tmp
1813 && e->Iex.Binop.arg2->tag == Iex_Const) {
1814 ae = LibVEX_Alloc(sizeof(AvailExpr));
1815 ae->tag = Btc;
1816 ae->u.Btc.op = e->Iex.Binop.op;
1817 ae->u.Btc.arg1 = e->Iex.Binop.arg1->Iex.Tmp.tmp;
1818 ae->u.Btc.con2 = *(e->Iex.Binop.arg2->Iex.Const.con);
1819 return ae;
1820 }
1821
1822 if (e->tag == Iex_Binop
1823 && e->Iex.Binop.arg1->tag == Iex_Const
1824 && e->Iex.Binop.arg2->tag == Iex_Tmp) {
1825 ae = LibVEX_Alloc(sizeof(AvailExpr));
1826 ae->tag = Bct;
1827 ae->u.Bct.op = e->Iex.Binop.op;
1828 ae->u.Bct.arg2 = e->Iex.Binop.arg2->Iex.Tmp.tmp;
1829 ae->u.Bct.con1 = *(e->Iex.Binop.arg1->Iex.Const.con);
1830 return ae;
1831 }
1832
1833 if (e->tag == Iex_Const
1834 && e->Iex.Const.con->tag == Ico_F64i) {
1835 ae = LibVEX_Alloc(sizeof(AvailExpr));
1836 ae->tag = Cf64i;
1837 ae->u.Cf64i.f64i = e->Iex.Const.con->Ico.F64i;
1838 return ae;
1839 }
1840
1841 return NULL;
1842}
1843
1844
1845/* The BB is modified in-place. */
1846
1847void do_cse_BB ( IRBB* bb )
1848{
1849 Int i, j;
1850 IRTemp t, q;
1851 IRStmt* st;
1852 AvailExpr* eprime;
1853
1854 HashHW* tenv = newHHW(); /* :: IRTemp -> IRTemp */
1855 HashHW* aenv = newHHW(); /* :: AvailExpr* -> IRTemp */
1856
1857 vassert(sizeof(IRTemp) <= sizeof(HWord));
1858
1859 //ppIRBB(bb);
1860 //vex_printf("\n\n");
1861
1862 /* Iterate forwards over the stmts.
1863 On seeing "t = E", where E is one of the 3 AvailExpr forms:
1864 let E' = apply tenv substitution to E
1865 search aenv for E'
1866 if a mapping E' -> q is found,
1867 replace this stmt by "t = q"
1868 and add binding t -> q to tenv
1869 else
1870 add binding E' -> t to aenv
1871 replace this stmt by "t = E'"
1872 Ignore any other kind of stmt.
1873 */
1874 for (i = 0; i < bb->stmts_used; i++) {
1875 st = bb->stmts[i];
1876
1877 /* ignore not-interestings */
1878 if ((!st) || st->tag != Ist_Tmp)
1879 continue;
1880
1881 t = st->Ist.Tmp.tmp;
1882 eprime = irExpr_to_AvailExpr(st->Ist.Tmp.data);
1883 /* ignore if not of AvailExpr form */
1884 if (!eprime)
1885 continue;
1886
1887 /* vex_printf("considering: " ); ppIRStmt(st); vex_printf("\n"); */
1888
1889 /* apply tenv */
1890 subst_AvailExpr( tenv, eprime );
1891
1892 /* search aenv for eprime, unfortunately the hard way */
1893 for (j = 0; j < aenv->used; j++)
1894 if (aenv->inuse[j] && eq_AvailExpr(eprime, (AvailExpr*)aenv->key[j]))
1895 break;
1896
1897 if (j < aenv->used) {
1898 /* A binding E' -> q was found. Replace stmt by "t = q" and
1899 note the t->q binding in tenv. */
1900 /* (this is the core of the CSE action) */
1901 q = (IRTemp)aenv->val[j];
1902 bb->stmts[i] = IRStmt_Tmp( t, IRExpr_Tmp(q) );
1903 addToHHW( tenv, (HWord)t, (HWord)q );
1904 } else {
1905 /* No binding was found, so instead we add E' -> t to our
1906 collection of available expressions, replace this stmt
1907 with "t = E'", and move on. */
1908 bb->stmts[i] = IRStmt_Tmp( t, availExpr_to_IRExpr(eprime) );
1909 addToHHW( aenv, (HWord)eprime, (HWord)t );
1910 }
1911 }
1912
1913 //ppIRBB(bb);
1914 //sanityCheckIRBB(bb, Ity_I32);
1915 //vex_printf("\n\n");
1916
1917}
1918
1919
1920/*---------------------------------------------------------------*/
1921/*--- Add32/Sub32 chain collapsing ---*/
1922/*---------------------------------------------------------------*/
1923
1924/* ----- Helper functions for Add32/Sub32 chain collapsing ----- */
1925
1926/* Is this expression "Add32(tmp,const)" or "Sub32(tmp,const)" ? If
1927 yes, set *tmp and *i32 appropriately. *i32 is set as if the
1928 root node is Add32, not Sub32. */
1929
1930static Bool isAdd32OrSub32 ( IRExpr* e, IRTemp* tmp, Int* i32 )
1931{
1932 if (e->tag != Iex_Binop)
1933 return False;
1934 if (e->Iex.Binop.op != Iop_Add32 && e->Iex.Binop.op != Iop_Sub32)
1935 return False;
1936 if (e->Iex.Binop.arg1->tag != Iex_Tmp)
1937 return False;
1938 if (e->Iex.Binop.arg2->tag != Iex_Const)
1939 return False;
1940 *tmp = e->Iex.Binop.arg1->Iex.Tmp.tmp;
1941 *i32 = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32);
1942 if (e->Iex.Binop.op == Iop_Sub32)
1943 *i32 = -*i32;
1944 return True;
1945}
1946
1947
1948/* Figure out if tmp can be expressed as tmp2 +32 const, for some
1949 other tmp2. Scan backwards from the specified start point -- an
1950 optimisation. */
1951
1952static Bool collapseChain ( IRBB* bb, Int startHere,
1953 IRTemp tmp,
1954 IRTemp* tmp2, Int* i32 )
1955{
1956 Int j, ii;
1957 IRTemp vv;
1958 IRStmt* st;
1959 IRExpr* e;
1960
1961 /* the (var, con) pair contain the current 'representation' for
1962 'tmp'. We start with 'tmp + 0'. */
1963 IRTemp var = tmp;
1964 Int con = 0;
1965
1966 /* Scan backwards to see if tmp can be replaced by some other tmp
1967 +/- a constant. */
1968 for (j = startHere; j >= 0; j--) {
1969 st = bb->stmts[j];
1970 if (!st || st->tag != Ist_Tmp)
1971 continue;
1972 if (st->Ist.Tmp.tmp != var)
1973 continue;
1974 e = st->Ist.Tmp.data;
1975 if (!isAdd32OrSub32(e, &vv, &ii))
1976 break;
1977 var = vv;
1978 con += ii;
1979 }
1980 if (j == -1)
1981 /* no earlier binding for var .. ill-formed IR */
1982 vpanic("collapseChain");
1983
1984 /* so, did we find anything interesting? */
1985 if (var == tmp)
1986 return False; /* no .. */
1987
1988 *tmp2 = var;
1989 *i32 = con;
1990 return True;
1991}
1992
1993
1994/* ------- Main function for Add32/Sub32 chain collapsing ------ */
1995
1996static void collapse_AddSub_chains_BB ( IRBB* bb )
1997{
1998 IRStmt *st;
1999 IRTemp var, var2;
2000 Int i, con, con2;
2001
2002 for (i = bb->stmts_used-1; i >= 0; i--) {
2003 st = bb->stmts[i];
2004 if (!st)
2005 continue;
2006
2007 /* Try to collapse 't1 = Add32/Sub32(t2, con)'. */
2008
2009 if (st->tag == Ist_Tmp
2010 && isAdd32OrSub32(st->Ist.Tmp.data, &var, &con)) {
2011
2012 /* So e1 is of the form Add32(var,con) or Sub32(var,-con).
2013 Find out if var can be expressed as var2 + con2. */
2014 if (collapseChain(bb, i-1, var, &var2, &con2)) {
2015 if (DEBUG_IROPT) {
2016 vex_printf("replacing1 ");
2017 ppIRStmt(st);
2018 vex_printf(" with ");
2019 }
2020 con2 += con;
2021 bb->stmts[i]
2022 = IRStmt_Tmp(
2023 st->Ist.Tmp.tmp,
2024 (con2 >= 0)
2025 ? IRExpr_Binop(Iop_Add32,
2026 IRExpr_Tmp(var2),
2027 IRExpr_Const(IRConst_U32(con2)))
2028 : IRExpr_Binop(Iop_Sub32,
2029 IRExpr_Tmp(var2),
2030 IRExpr_Const(IRConst_U32(-con2)))
2031 );
2032 if (DEBUG_IROPT) {
2033 ppIRStmt(bb->stmts[i]);
2034 vex_printf("\n");
2035 }
2036 }
2037
2038 continue;
2039 }
2040
2041 /* Try to collapse 't1 = GetI[t2, con]'. */
2042
2043 if (st->tag == Ist_Tmp
2044 && st->Ist.Tmp.data->tag == Iex_GetI
2045 && st->Ist.Tmp.data->Iex.GetI.ix->tag == Iex_Tmp
2046 && collapseChain(bb, i-1, st->Ist.Tmp.data->Iex.GetI.ix
2047 ->Iex.Tmp.tmp, &var2, &con2)) {
2048 if (DEBUG_IROPT) {
2049 vex_printf("replacing3 ");
2050 ppIRStmt(st);
2051 vex_printf(" with ");
2052 }
2053 con2 += st->Ist.Tmp.data->Iex.GetI.bias;
2054 bb->stmts[i]
2055 = IRStmt_Tmp(
2056 st->Ist.Tmp.tmp,
2057 IRExpr_GetI(st->Ist.Tmp.data->Iex.GetI.descr,
2058 IRExpr_Tmp(var2),
2059 con2));
2060 if (DEBUG_IROPT) {
2061 ppIRStmt(bb->stmts[i]);
2062 vex_printf("\n");
2063 }
2064 continue;
2065 }
2066
2067 /* Perhaps st is PutI[t, con] ? */
2068
2069 if (st->tag == Ist_PutI
2070 && st->Ist.PutI.ix->tag == Iex_Tmp
2071 && collapseChain(bb, i-1, st->Ist.PutI.ix->Iex.Tmp.tmp,
2072 &var2, &con2)) {
2073 if (DEBUG_IROPT) {
2074 vex_printf("replacing2 ");
2075 ppIRStmt(st);
2076 vex_printf(" with ");
2077 }
2078 con2 += st->Ist.PutI.bias;
2079 bb->stmts[i]
2080 = IRStmt_PutI(st->Ist.PutI.descr,
2081 IRExpr_Tmp(var2),
2082 con2,
2083 st->Ist.PutI.data);
2084 if (DEBUG_IROPT) {
2085 ppIRStmt(bb->stmts[i]);
2086 vex_printf("\n");
2087 }
2088 continue;
2089 }
2090
2091 } /* for */
2092}
2093
2094
2095/*---------------------------------------------------------------*/
2096/*--- PutI/GetI transformations ---*/
2097/*---------------------------------------------------------------*/
2098
2099/* ------- Helper functions for PutI/GetI transformations ------ */
2100
2101/* Do a1 and a2 denote identical values? Safe answer: False
2102*/
2103static Bool identicalAtoms ( IRExpr* a1, IRExpr* a2 )
2104{
2105 vassert(isAtom(a1));
2106 vassert(isAtom(a2));
2107 if (a1->tag == Iex_Tmp && a2->tag == Iex_Tmp)
sewardj9d2e7692005-02-07 01:11:31 +00002108 return toBool(a1->Iex.Tmp.tmp == a2->Iex.Tmp.tmp);
sewardj08210532004-12-29 17:09:11 +00002109 if (a1->tag == Iex_Const && a2->tag == Iex_Const)
2110 return eqIRConst(a1->Iex.Const.con, a2->Iex.Const.con);
2111 return False;
2112}
2113
2114
2115/* Determine, to the extent possible, the relationship between two
2116 guest state accesses. The possible outcomes are:
2117
2118 * Exact alias. These two accesses denote precisely the same
2119 piece of the guest state.
2120
2121 * Definitely no alias. These two accesses are guaranteed not to
2122 overlap any part of the guest state.
2123
2124 * Unknown -- if neither of the above can be established.
2125
2126 If in doubt, return Unknown. */
2127
2128typedef
2129 enum { ExactAlias, NoAlias, UnknownAlias }
2130 GSAliasing;
2131
2132
2133/* Produces the alias relation between an indexed guest
2134 state access and a non-indexed access. */
2135
2136static
2137GSAliasing getAliasingRelation_IC ( IRArray* descr1, IRExpr* ix1,
2138 Int offset2, IRType ty2 )
2139{
2140 UInt minoff1, maxoff1, minoff2, maxoff2;
2141
2142 getArrayBounds( descr1, &minoff1, &maxoff1 );
2143 minoff2 = offset2;
2144 maxoff2 = minoff2 + sizeofIRType(ty2) - 1;
2145
2146 if (maxoff1 < minoff2 || maxoff2 < minoff1)
2147 return NoAlias;
2148
2149 /* Could probably do better here if required. For the moment
2150 however just claim not to know anything more. */
2151 return UnknownAlias;
2152}
2153
2154
2155/* Produces the alias relation between two indexed guest state
2156 accesses. */
2157
2158static
2159GSAliasing getAliasingRelation_II (
2160 IRArray* descr1, IRExpr* ix1, Int bias1,
2161 IRArray* descr2, IRExpr* ix2, Int bias2
2162 )
2163{
2164 UInt minoff1, maxoff1, minoff2, maxoff2;
2165 Int iters;
2166
2167 /* First try hard to show they don't alias. */
2168 getArrayBounds( descr1, &minoff1, &maxoff1 );
2169 getArrayBounds( descr2, &minoff2, &maxoff2 );
2170 if (maxoff1 < minoff2 || maxoff2 < minoff1)
2171 return NoAlias;
2172
2173 /* So the two arrays at least partially overlap. To get any
2174 further we'll have to be sure that the descriptors are
2175 identical. */
2176 if (!eqIRArray(descr1, descr2))
2177 return UnknownAlias;
2178
2179 /* The descriptors are identical. Now the only difference can be
2180 in the index expressions. If they cannot be shown to be
2181 identical, we have to say we don't know what the aliasing
2182 relation will be. Now, since the IR is flattened, the index
2183 expressions should be atoms -- either consts or tmps. So that
2184 makes the comparison simple. */
2185 vassert(isAtom(ix1));
2186 vassert(isAtom(ix2));
2187 if (!identicalAtoms(ix1,ix2))
2188 return UnknownAlias;
2189
2190 /* Ok, the index expressions are identical. So now the only way
2191 they can be different is in the bias. Normalise this
2192 paranoidly, to reliably establish equality/non-equality. */
2193
2194 /* So now we know that the GetI and PutI index the same array
2195 with the same base. Are the offsets the same, modulo the
2196 array size? Do this paranoidly. */
2197 vassert(descr1->nElems == descr2->nElems);
2198 vassert(descr1->elemTy == descr2->elemTy);
2199 vassert(descr1->base == descr2->base);
2200 iters = 0;
2201 while (bias1 < 0 || bias2 < 0) {
2202 bias1 += descr1->nElems;
2203 bias2 += descr1->nElems;
2204 iters++;
2205 if (iters > 10)
2206 vpanic("getAliasingRelation: iters");
2207 }
2208 vassert(bias1 >= 0 && bias2 >= 0);
2209 bias1 %= descr1->nElems;
2210 bias2 %= descr1->nElems;
2211 vassert(bias1 >= 0 && bias1 < descr1->nElems);
2212 vassert(bias2 >= 0 && bias2 < descr1->nElems);
2213
2214 /* Finally, biasP and biasG are normalised into the range
2215 0 .. descrP/G->nElems - 1. And so we can establish
2216 equality/non-equality. */
2217
2218 return bias1==bias2 ? ExactAlias : NoAlias;
2219}
2220
2221
2222/* Given the parts (descr, tmp, bias) for a GetI, scan backwards from
2223 the given starting point to find, if any, a PutI which writes
2224 exactly the same piece of guest state, and so return the expression
2225 that the PutI writes. This is the core of PutI-GetI forwarding. */
2226
2227static
2228IRExpr* findPutI ( IRBB* bb, Int startHere,
2229 IRArray* descrG, IRExpr* ixG, Int biasG )
2230{
2231 Int j;
2232 IRStmt* st;
2233 GSAliasing relation;
2234
2235 if (0) {
2236 vex_printf("\nfindPutI ");
2237 ppIRArray(descrG);
2238 vex_printf(" ");
2239 ppIRExpr(ixG);
2240 vex_printf(" %d\n", biasG);
2241 }
2242
2243 /* Scan backwards in bb from startHere to find a suitable PutI
2244 binding for (descrG, ixG, biasG), if any. */
2245
2246 for (j = startHere; j >= 0; j--) {
2247 st = bb->stmts[j];
2248 if (!st) continue;
2249
2250 if (st->tag == Ist_Put) {
2251 /* Non-indexed Put. This can't give a binding, but we do
2252 need to check it doesn't invalidate the search by
2253 overlapping any part of the indexed guest state. */
2254
2255 relation
2256 = getAliasingRelation_IC(
2257 descrG, ixG,
2258 st->Ist.Put.offset,
2259 typeOfIRExpr(bb->tyenv,st->Ist.Put.data) );
2260
2261 if (relation == NoAlias) {
2262 /* we're OK; keep going */
2263 continue;
2264 } else {
2265 /* relation == UnknownAlias || relation == ExactAlias */
2266 /* If this assertion fails, we've found a Put which writes
2267 an area of guest state which is read by a GetI. Which
2268 is unlikely (although not per se wrong). */
2269 vassert(relation != ExactAlias);
2270 /* This Put potentially writes guest state that the GetI
2271 reads; we must fail. */
2272 return NULL;
2273 }
2274 }
2275
2276 if (st->tag == Ist_PutI) {
2277
2278 relation = getAliasingRelation_II(
2279 descrG, ixG, biasG,
2280 st->Ist.PutI.descr,
2281 st->Ist.PutI.ix,
2282 st->Ist.PutI.bias
2283 );
2284
2285 if (relation == NoAlias) {
2286 /* This PutI definitely doesn't overlap. Ignore it and
2287 keep going. */
2288 continue; /* the for j loop */
2289 }
2290
2291 if (relation == UnknownAlias) {
2292 /* We don't know if this PutI writes to the same guest
2293 state that the GetI, or not. So we have to give up. */
2294 return NULL;
2295 }
2296
2297 /* Otherwise, we've found what we're looking for. */
2298 vassert(relation == ExactAlias);
2299 return st->Ist.PutI.data;
2300
2301 } /* if (st->tag == Ist_PutI) */
2302
2303 if (st->tag == Ist_Dirty) {
2304 /* Be conservative. If the dirty call has any guest effects at
2305 all, give up. We could do better -- only give up if there
2306 are any guest writes/modifies. */
2307 if (st->Ist.Dirty.details->nFxState > 0)
2308 return NULL;
2309 }
2310
2311 } /* for */
2312
2313 /* No valid replacement was found. */
2314 return NULL;
2315}
2316
2317
2318
2319/* Assuming pi is a PutI stmt, is s2 identical to it (in the sense
2320 that it writes exactly the same piece of guest state) ? Safe
2321 answer: False. */
2322
2323static Bool identicalPutIs ( IRStmt* pi, IRStmt* s2 )
2324{
2325 vassert(pi->tag == Ist_PutI);
2326 if (s2->tag != Ist_PutI)
2327 return False;
2328
sewardj9d2e7692005-02-07 01:11:31 +00002329 return toBool(
2330 getAliasingRelation_II(
sewardj08210532004-12-29 17:09:11 +00002331 pi->Ist.PutI.descr, pi->Ist.PutI.ix, pi->Ist.PutI.bias,
2332 s2->Ist.PutI.descr, s2->Ist.PutI.ix, s2->Ist.PutI.bias
2333 )
sewardj9d2e7692005-02-07 01:11:31 +00002334 == ExactAlias
2335 );
sewardj08210532004-12-29 17:09:11 +00002336}
2337
2338
2339/* Assuming pi is a PutI stmt, is s2 a Get/GetI/Put/PutI which might
2340 overlap it? Safe answer: True. Note, we could do a lot better
2341 than this if needed. */
2342
2343static
2344Bool guestAccessWhichMightOverlapPutI (
2345 IRTypeEnv* tyenv, IRStmt* pi, IRStmt* s2
2346 )
2347{
2348 GSAliasing relation;
2349 UInt minoffP, maxoffP;
2350
2351 vassert(pi->tag == Ist_PutI);
2352 getArrayBounds(pi->Ist.PutI.descr, &minoffP, &maxoffP);
2353 switch (s2->tag) {
2354
sewardjbb3f52d2005-01-07 14:14:50 +00002355 case Ist_MFence:
2356 /* just be paranoid ... this should be rare. */
2357 return True;
2358
sewardj08210532004-12-29 17:09:11 +00002359 case Ist_Dirty:
2360 /* If the dirty call has any guest effects at all, give up.
2361 Probably could do better. */
2362 if (s2->Ist.Dirty.details->nFxState > 0)
2363 return True;
2364 return False;
2365
2366 case Ist_Put:
2367 vassert(isAtom(s2->Ist.Put.data));
2368 relation
2369 = getAliasingRelation_IC(
2370 pi->Ist.PutI.descr, pi->Ist.PutI.ix,
2371 s2->Ist.Put.offset,
2372 typeOfIRExpr(tyenv,s2->Ist.Put.data)
2373 );
2374 goto have_relation;
2375
2376 case Ist_PutI:
2377 vassert(isAtom(s2->Ist.PutI.ix));
2378 vassert(isAtom(s2->Ist.PutI.data));
2379 relation
2380 = getAliasingRelation_II(
2381 pi->Ist.PutI.descr, pi->Ist.PutI.ix, pi->Ist.PutI.bias,
2382 s2->Ist.PutI.descr, s2->Ist.PutI.ix, s2->Ist.PutI.bias
2383 );
2384 goto have_relation;
2385
2386 case Ist_Tmp:
2387 if (s2->Ist.Tmp.data->tag == Iex_GetI) {
2388 relation
2389 = getAliasingRelation_II(
2390 pi->Ist.PutI.descr, pi->Ist.PutI.ix,
2391 pi->Ist.PutI.bias,
2392 s2->Ist.Tmp.data->Iex.GetI.descr,
2393 s2->Ist.Tmp.data->Iex.GetI.ix,
2394 s2->Ist.Tmp.data->Iex.GetI.bias
2395 );
2396 goto have_relation;
2397 }
2398 if (s2->Ist.Tmp.data->tag == Iex_Get) {
2399 relation
2400 = getAliasingRelation_IC(
2401 pi->Ist.PutI.descr, pi->Ist.PutI.ix,
2402 s2->Ist.Tmp.data->Iex.Get.offset,
2403 s2->Ist.Tmp.data->Iex.Get.ty
2404 );
2405 goto have_relation;
2406 }
2407 return False;
2408
2409 case Ist_STle:
2410 vassert(isAtom(s2->Ist.STle.addr));
2411 vassert(isAtom(s2->Ist.STle.data));
2412 return False;
2413
2414 default:
2415 vex_printf("\n"); ppIRStmt(s2); vex_printf("\n");
2416 vpanic("guestAccessWhichMightOverlapPutI");
2417 }
2418
2419 have_relation:
2420 if (relation == NoAlias)
2421 return False;
2422 else
2423 return True; /* ExactAlias or UnknownAlias */
2424}
2425
2426
2427
2428/* ---------- PutI/GetI transformations main functions --------- */
2429
2430/* Remove redundant GetIs, to the extent that they can be detected.
2431 bb is modified in-place. */
2432
2433static
2434void do_redundant_GetI_elimination ( IRBB* bb )
2435{
2436 Int i;
2437 IRStmt* st;
2438
2439 for (i = bb->stmts_used-1; i >= 0; i--) {
2440 st = bb->stmts[i];
2441 if (!st)
2442 continue;
2443
2444 if (st->tag == Ist_Tmp
2445 && st->Ist.Tmp.data->tag == Iex_GetI
2446 && st->Ist.Tmp.data->Iex.GetI.ix->tag == Iex_Tmp) {
2447 IRArray* descr = st->Ist.Tmp.data->Iex.GetI.descr;
2448 IRExpr* ix = st->Ist.Tmp.data->Iex.GetI.ix;
2449 Int bias = st->Ist.Tmp.data->Iex.GetI.bias;
2450 IRExpr* replacement = findPutI(bb, i-1, descr, ix, bias);
2451 if (replacement
2452 && isAtom(replacement)
2453 /* Make sure we're doing a type-safe transformation! */
2454 && typeOfIRExpr(bb->tyenv, replacement) == descr->elemTy) {
2455 if (DEBUG_IROPT) {
2456 vex_printf("rGI: ");
2457 ppIRExpr(st->Ist.Tmp.data);
2458 vex_printf(" -> ");
2459 ppIRExpr(replacement);
2460 vex_printf("\n");
2461 }
2462 bb->stmts[i] = IRStmt_Tmp(st->Ist.Tmp.tmp, replacement);
2463 }
2464 }
2465 }
2466
2467}
2468
2469
2470/* Remove redundant PutIs, to the extent which they can be detected.
2471 bb is modified in-place. */
2472
2473static
2474void do_redundant_PutI_elimination ( IRBB* bb )
2475{
2476 Int i, j;
2477 Bool delete;
2478 IRStmt *st, *stj;
2479
2480 for (i = 0; i < bb->stmts_used; i++) {
2481 st = bb->stmts[i];
2482 if (!st || st->tag != Ist_PutI)
2483 continue;
2484 /* Ok, search forwards from here to see if we can find another
2485 PutI which makes this one redundant, and dodging various
2486 hazards. Search forwards:
2487 * If conditional exit, give up (because anything after that
2488 does not postdominate this put).
2489 * If a Get which might overlap, give up (because this PutI
2490 not necessarily dead).
2491 * If a Put which is identical, stop with success.
2492 * If a Put which might overlap, but is not identical, give up.
2493 * If a dirty helper call which might write guest state, give up.
2494 * If a Put which definitely doesn't overlap, or any other
2495 kind of stmt, continue.
2496 */
2497 delete = False;
2498 for (j = i+1; j < bb->stmts_used; j++) {
2499 stj = bb->stmts[j];
2500 if (!stj)
2501 continue;
2502 if (identicalPutIs(st, stj)) {
2503 /* success! */
2504 delete = True;
2505 break;
2506 }
2507 if (stj->tag == Ist_Exit)
2508 /* give up */
2509 break;
2510 if (st->tag == Ist_Dirty)
2511 /* give up; could do better here */
2512 break;
2513 if (guestAccessWhichMightOverlapPutI(bb->tyenv, st, stj))
2514 /* give up */
2515 break;
2516 }
2517
2518 if (delete) {
2519 if (DEBUG_IROPT) {
2520 vex_printf("rPI: ");
2521 ppIRStmt(st);
2522 vex_printf("\n");
2523 }
2524 bb->stmts[i] = NULL;
2525 }
2526
2527 }
2528}
2529
2530
2531/*---------------------------------------------------------------*/
2532/*--- Loop unrolling ---*/
2533/*---------------------------------------------------------------*/
2534
2535/* Adjust all tmp values (names) in e by delta. e is destructively
2536 modified. */
2537
2538static void deltaIRExpr ( IRExpr* e, Int delta )
2539{
2540 Int i;
2541 switch (e->tag) {
2542 case Iex_Tmp:
2543 e->Iex.Tmp.tmp += delta;
2544 break;
2545 case Iex_Get:
2546 case Iex_Const:
2547 break;
2548 case Iex_GetI:
2549 deltaIRExpr(e->Iex.GetI.ix, delta);
2550 break;
2551 case Iex_Binop:
2552 deltaIRExpr(e->Iex.Binop.arg1, delta);
2553 deltaIRExpr(e->Iex.Binop.arg2, delta);
2554 break;
2555 case Iex_Unop:
2556 deltaIRExpr(e->Iex.Unop.arg, delta);
2557 break;
2558 case Iex_LDle:
2559 deltaIRExpr(e->Iex.LDle.addr, delta);
2560 break;
2561 case Iex_CCall:
2562 for (i = 0; e->Iex.CCall.args[i]; i++)
2563 deltaIRExpr(e->Iex.CCall.args[i], delta);
2564 break;
2565 case Iex_Mux0X:
2566 deltaIRExpr(e->Iex.Mux0X.cond, delta);
2567 deltaIRExpr(e->Iex.Mux0X.expr0, delta);
2568 deltaIRExpr(e->Iex.Mux0X.exprX, delta);
2569 break;
2570 default:
2571 vex_printf("\n"); ppIRExpr(e); vex_printf("\n");
2572 vpanic("deltaIRExpr");
2573 }
2574}
2575
2576/* Adjust all tmp values (names) in st by delta. st is destructively
2577 modified. */
2578
2579static void deltaIRStmt ( IRStmt* st, Int delta )
2580{
2581 Int i;
2582 IRDirty* d;
2583 switch (st->tag) {
2584 case Ist_Put:
2585 deltaIRExpr(st->Ist.Put.data, delta);
2586 break;
2587 case Ist_PutI:
2588 deltaIRExpr(st->Ist.PutI.ix, delta);
2589 deltaIRExpr(st->Ist.PutI.data, delta);
2590 break;
2591 case Ist_Tmp:
2592 st->Ist.Tmp.tmp += delta;
2593 deltaIRExpr(st->Ist.Tmp.data, delta);
2594 break;
2595 case Ist_Exit:
2596 deltaIRExpr(st->Ist.Exit.guard, delta);
2597 break;
2598 case Ist_STle:
2599 deltaIRExpr(st->Ist.STle.addr, delta);
2600 deltaIRExpr(st->Ist.STle.data, delta);
2601 break;
2602 case Ist_Dirty:
2603 d = st->Ist.Dirty.details;
2604 deltaIRExpr(d->guard, delta);
2605 for (i = 0; d->args[i]; i++)
2606 deltaIRExpr(d->args[i], delta);
2607 if (d->tmp != IRTemp_INVALID)
2608 d->tmp += delta;
2609 if (d->mAddr)
2610 deltaIRExpr(d->mAddr, delta);
2611 break;
2612 default:
2613 vex_printf("\n"); ppIRStmt(st); vex_printf("\n");
2614 vpanic("deltaIRStmt");
2615 }
2616}
2617
2618
2619/* If possible, return a loop-unrolled version of bb0. The original
2620 is changed. If not possible, return NULL. */
2621
2622/* The two schemas considered are:
2623
2624 X: BODY; goto X
2625
2626 which unrolls to (eg) X: BODY;BODY; goto X
2627
2628 and
2629
2630 X: BODY; if (c) goto X; goto Y
2631 which trivially transforms to
2632 X: BODY; if (!c) goto Y; goto X;
2633 so it falls in the scope of the first case.
2634
2635 X and Y must be literal (guest) addresses.
2636*/
2637
2638static Int calc_unroll_factor( IRBB* bb )
2639{
2640 Int n_stmts, i;
2641
2642 n_stmts = 0;
2643 for (i = 0; i < bb->stmts_used; i++)
2644 if (bb->stmts[i])
2645 n_stmts++;
2646
2647 if (n_stmts <= vex_control.iropt_unroll_thresh/8) {
2648 if (vex_control.iropt_verbosity > 0)
2649 vex_printf("vex iropt: 8 x unrolling (%d sts -> %d sts)\n",
2650 n_stmts, 8* n_stmts);
2651 return 8;
2652 }
2653 if (n_stmts <= vex_control.iropt_unroll_thresh/4) {
2654 if (vex_control.iropt_verbosity > 0)
2655 vex_printf("vex iropt: 4 x unrolling (%d sts -> %d sts)\n",
2656 n_stmts, 4* n_stmts);
2657 return 4;
2658 }
2659
2660 if (n_stmts <= vex_control.iropt_unroll_thresh/2) {
2661 if (vex_control.iropt_verbosity > 0)
2662 vex_printf("vex iropt: 2 x unrolling (%d sts -> %d sts)\n",
2663 n_stmts, 2* n_stmts);
2664 return 2;
2665 }
2666
2667 if (vex_control.iropt_verbosity > 0)
2668 vex_printf("vex iropt: not unrolling (%d sts)\n", n_stmts);
2669
2670 return 1;
2671}
2672
2673
2674static IRBB* maybe_loop_unroll_BB ( IRBB* bb0, Addr64 my_addr )
2675{
2676 Int i, j, jmax, n_vars;
2677 Bool xxx_known;
2678 Addr64 xxx_value, yyy_value;
2679 IRExpr* udst;
2680 IRStmt* st;
2681 IRConst* con;
2682 IRBB *bb1, *bb2;
2683 Int unroll_factor;
2684
2685 if (vex_control.iropt_unroll_thresh <= 0)
2686 return NULL;
2687
2688 /* First off, figure out if we can unroll this loop. Do this
2689 without modifying bb0. */
2690
2691 if (bb0->jumpkind != Ijk_Boring)
2692 return NULL;
2693
2694 xxx_known = False;
2695 xxx_value = 0;
2696
2697 /* Extract the next-guest address. If it isn't a literal, we
2698 have to give up. */
2699
2700 udst = bb0->next;
2701 if (udst->tag == Iex_Const
2702 && (udst->Iex.Const.con->tag == Ico_U32
2703 || udst->Iex.Const.con->tag == Ico_U64)) {
2704 /* The BB ends in a jump to a literal location. */
2705 xxx_known = True;
2706 xxx_value = udst->Iex.Const.con->tag == Ico_U64
2707 ? udst->Iex.Const.con->Ico.U64
2708 : (Addr64)(udst->Iex.Const.con->Ico.U32);
2709 }
2710
2711 if (!xxx_known)
2712 return NULL;
2713
2714 /* Now we know the BB ends to a jump to a literal location. If
2715 it's a jump to itself (viz, idiom #1), move directly to the
2716 unrolling stage, first cloning the bb so the original isn't
2717 modified. */
2718 if (xxx_value == my_addr) {
2719 unroll_factor = calc_unroll_factor( bb0 );
2720 if (unroll_factor < 2)
2721 return NULL;
2722 bb1 = dopyIRBB( bb0 );
2723 bb0 = NULL;
2724 udst = NULL; /* is now invalid */
2725 goto do_unroll;
2726 }
2727
2728 /* Search for the second idiomatic form:
2729 X: BODY; if (c) goto X; goto Y
2730 We know Y, but need to establish that the last stmt
2731 is 'if (c) goto X'.
2732 */
2733 yyy_value = xxx_value;
2734 for (i = bb0->stmts_used-1; i >= 0; i--)
2735 if (bb0->stmts[i])
2736 break;
2737
2738 if (i < 0)
2739 return NULL; /* block with no stmts. Strange. */
2740
2741 st = bb0->stmts[i];
2742 if (st->tag != Ist_Exit)
2743 return NULL;
2744 if (st->Ist.Exit.jk != Ijk_Boring)
2745 return NULL;
2746
2747 con = st->Ist.Exit.dst;
2748 vassert(con->tag == Ico_U32 || con->tag == Ico_U64);
2749
2750 xxx_value = con->tag == Ico_U64
2751 ? st->Ist.Exit.dst->Ico.U64
2752 : (Addr64)(st->Ist.Exit.dst->Ico.U32);
2753
2754 /* If this assertion fails, we have some kind of type error. */
2755 vassert(con->tag == udst->Iex.Const.con->tag);
2756
2757 if (xxx_value != my_addr)
2758 /* We didn't find either idiom. Give up. */
2759 return NULL;
2760
2761 /* Ok, we found idiom #2. Copy the BB, switch around the xxx and
2762 yyy values (which makes it look like idiom #1), and go into
2763 unrolling proper. This means finding (again) the last stmt, in
2764 the copied BB. */
2765
2766 unroll_factor = calc_unroll_factor( bb0 );
2767 if (unroll_factor < 2)
2768 return NULL;
2769
2770 bb1 = dopyIRBB( bb0 );
2771 bb0 = NULL;
2772 udst = NULL; /* is now invalid */
2773 for (i = bb1->stmts_used-1; i >= 0; i--)
2774 if (bb1->stmts[i])
2775 break;
2776
2777 /* The next bunch of assertions should be true since we already
2778 found and checked the last stmt in the original bb. */
2779
2780 vassert(i >= 0);
2781
2782 st = bb1->stmts[i];
2783 vassert(st->tag == Ist_Exit);
2784
2785 con = st->Ist.Exit.dst;
2786 vassert(con->tag == Ico_U32 || con->tag == Ico_U64);
2787
2788 udst = bb1->next;
2789 vassert(udst->tag == Iex_Const);
2790 vassert(udst->Iex.Const.con->tag == Ico_U32
2791 || udst->Iex.Const.con->tag == Ico_U64);
2792 vassert(con->tag == udst->Iex.Const.con->tag);
2793
2794 /* switch the xxx and yyy fields around */
2795 if (con->tag == Ico_U64) {
2796 udst->Iex.Const.con->Ico.U64 = xxx_value;
2797 con->Ico.U64 = yyy_value;
2798 } else {
2799 udst->Iex.Const.con->Ico.U32 = (UInt)xxx_value;
2800 con->Ico.U64 = (UInt)yyy_value;
2801 }
2802
2803 /* negate the test condition */
2804 st->Ist.Exit.guard
2805 = IRExpr_Unop(Iop_Not1,dopyIRExpr(st->Ist.Exit.guard));
2806
2807 /* --- The unroller proper. Both idioms are by now --- */
2808 /* --- now converted to idiom 1. --- */
2809
2810 do_unroll:
2811
2812 vassert(unroll_factor == 2
2813 || unroll_factor == 4
2814 || unroll_factor == 8);
2815
2816 jmax = unroll_factor==8 ? 3 : (unroll_factor==4 ? 2 : 1);
2817 for (j = 1; j <= jmax; j++) {
2818
2819 n_vars = bb1->tyenv->types_used;
2820
2821 bb2 = dopyIRBB(bb1);
2822 for (i = 0; i < n_vars; i++)
2823 (void)newIRTemp(bb1->tyenv, bb2->tyenv->types[i]);
2824
2825 for (i = 0; i < bb2->stmts_used; i++) {
2826 if (bb2->stmts[i] == NULL)
2827 continue;
2828 /* deltaIRStmt destructively modifies the stmt, but
2829 that's OK since bb2 is a complete fresh copy of bb1. */
2830 deltaIRStmt(bb2->stmts[i], n_vars);
2831 addStmtToIRBB(bb1, bb2->stmts[i]);
2832 }
2833 }
2834
2835 if (DEBUG_IROPT) {
2836 vex_printf("\nUNROLLED (%llx)\n", my_addr);
2837 ppIRBB(bb1);
2838 vex_printf("\n");
2839 }
2840
2841 /* Flattening; sigh. The unroller succeeds in breaking flatness
2842 by negating the test condition. This should be fixed properly.
2843 For the moment use this shotgun approach. */
2844 return flatten_BB(bb1);
2845}
2846
2847
2848/*---------------------------------------------------------------*/
sewardj29632392004-08-22 02:38:11 +00002849/*--- The tree builder ---*/
2850/*---------------------------------------------------------------*/
2851
sewardj08210532004-12-29 17:09:11 +00002852/* This isn't part of IR optimisation. Really it's a pass done prior
2853 to instruction selection, which improves the code that the
2854 instruction selector can produce. */
2855
sewardj29632392004-08-22 02:38:11 +00002856typedef
2857 struct {
2858 Int occ; /* occurrence count for this tmp */
sewardj84ff0652004-08-23 16:16:08 +00002859 IRExpr* expr; /* expr it is bound to,
2860 or NULL if already 'used' */
sewardj29632392004-08-22 02:38:11 +00002861 Bool eDoesLoad; /* True <=> expr reads mem */
2862 Bool eDoesGet; /* True <=> expr reads guest state */
2863 Bool invalidateMe; /* used when dumping bindings */
2864 Int origPos; /* posn of the binder in the original bb */
2865 }
2866 TmpInfo;
2867
2868/* Given env :: IRTemp -> TmpInfo*
2869 Add the use-occurrences of temps in this expression
2870 to the environment.
2871*/
sewardjc9ad1152004-10-14 00:08:21 +00002872static void occCount_Temp ( TmpInfo** env, IRTemp tmp )
sewardj17442fe2004-09-20 14:54:28 +00002873{
sewardjc9ad1152004-10-14 00:08:21 +00002874 TmpInfo* ti = env[(Int)tmp];
2875 if (ti) {
sewardj17442fe2004-09-20 14:54:28 +00002876 ti->occ++;
2877 } else {
2878 ti = LibVEX_Alloc(sizeof(TmpInfo));
2879 ti->occ = 1;
2880 ti->expr = NULL;
2881 ti->eDoesLoad = False;
2882 ti->eDoesGet = False;
2883 ti->invalidateMe = False;
2884 ti->origPos = -1; /* filed in properly later */
sewardjc9ad1152004-10-14 00:08:21 +00002885 env[(Int)tmp] = ti;
sewardj17442fe2004-09-20 14:54:28 +00002886 }
2887}
2888
sewardjc9ad1152004-10-14 00:08:21 +00002889static void occCount_Expr ( TmpInfo** env, IRExpr* e )
sewardj29632392004-08-22 02:38:11 +00002890{
sewardj17442fe2004-09-20 14:54:28 +00002891 Int i;
sewardj29632392004-08-22 02:38:11 +00002892
2893 switch (e->tag) {
2894
2895 case Iex_Tmp: /* the only interesting case */
sewardj17442fe2004-09-20 14:54:28 +00002896 occCount_Temp(env, e->Iex.Tmp.tmp);
sewardj29632392004-08-22 02:38:11 +00002897 return;
2898
2899 case Iex_Mux0X:
2900 occCount_Expr(env, e->Iex.Mux0X.cond);
2901 occCount_Expr(env, e->Iex.Mux0X.expr0);
2902 occCount_Expr(env, e->Iex.Mux0X.exprX);
2903 return;
2904
2905 case Iex_Binop:
2906 occCount_Expr(env, e->Iex.Binop.arg1);
2907 occCount_Expr(env, e->Iex.Binop.arg2);
2908 return;
2909
2910 case Iex_Unop:
2911 occCount_Expr(env, e->Iex.Unop.arg);
2912 return;
2913
2914 case Iex_LDle:
2915 occCount_Expr(env, e->Iex.LDle.addr);
2916 return;
2917
2918 case Iex_CCall:
2919 for (i = 0; e->Iex.CCall.args[i]; i++)
2920 occCount_Expr(env, e->Iex.CCall.args[i]);
2921 return;
2922
sewardjf6501992004-08-27 11:58:24 +00002923 case Iex_GetI:
sewardjeeac8412004-11-02 00:26:55 +00002924 occCount_Expr(env, e->Iex.GetI.ix);
sewardjf6501992004-08-27 11:58:24 +00002925 return;
2926
sewardj29632392004-08-22 02:38:11 +00002927 case Iex_Const:
2928 case Iex_Get:
2929 return;
2930
2931 default:
2932 vex_printf("\n"); ppIRExpr(e); vex_printf("\n");
2933 vpanic("occCount_Expr");
2934 }
2935}
2936
2937
2938/* Given env :: IRTemp -> TmpInfo*
2939 Add the use-occurrences of temps in this expression
2940 to the environment.
2941*/
sewardjc9ad1152004-10-14 00:08:21 +00002942static void occCount_Stmt ( TmpInfo** env, IRStmt* st )
sewardj29632392004-08-22 02:38:11 +00002943{
sewardj17442fe2004-09-20 14:54:28 +00002944 Int i;
2945 IRDirty* d;
sewardj29632392004-08-22 02:38:11 +00002946 switch (st->tag) {
2947 case Ist_Tmp:
sewardj6d076362004-09-23 11:06:17 +00002948 occCount_Expr(env, st->Ist.Tmp.data);
sewardj29632392004-08-22 02:38:11 +00002949 return;
2950 case Ist_Put:
sewardj6d076362004-09-23 11:06:17 +00002951 occCount_Expr(env, st->Ist.Put.data);
sewardj29632392004-08-22 02:38:11 +00002952 return;
sewardjf6501992004-08-27 11:58:24 +00002953 case Ist_PutI:
sewardjeeac8412004-11-02 00:26:55 +00002954 occCount_Expr(env, st->Ist.PutI.ix);
sewardj2d3f77c2004-09-22 23:49:09 +00002955 occCount_Expr(env, st->Ist.PutI.data);
sewardjf6501992004-08-27 11:58:24 +00002956 return;
sewardj29632392004-08-22 02:38:11 +00002957 case Ist_STle:
2958 occCount_Expr(env, st->Ist.STle.addr);
2959 occCount_Expr(env, st->Ist.STle.data);
2960 return;
sewardj17442fe2004-09-20 14:54:28 +00002961 case Ist_Dirty:
2962 d = st->Ist.Dirty.details;
2963 if (d->mFx != Ifx_None)
2964 occCount_Expr(env, d->mAddr);
sewardjb8385d82004-11-02 01:34:15 +00002965 occCount_Expr(env, d->guard);
sewardj17442fe2004-09-20 14:54:28 +00002966 for (i = 0; d->args[i]; i++)
2967 occCount_Expr(env, d->args[i]);
2968 return;
sewardj3e838932005-01-07 12:09:15 +00002969 case Ist_MFence:
2970 return;
sewardj29632392004-08-22 02:38:11 +00002971 case Ist_Exit:
sewardj0276d4b2004-11-15 15:30:21 +00002972 occCount_Expr(env, st->Ist.Exit.guard);
sewardj29632392004-08-22 02:38:11 +00002973 return;
2974 default:
2975 vex_printf("\n"); ppIRStmt(st); vex_printf("\n");
2976 vpanic("occCount_Stmt");
2977 }
2978}
2979
sewardj17442fe2004-09-20 14:54:28 +00002980/* Look up a binding for tmp in the env. If found, return the bound
2981 expression, and set the env's binding to NULL so it is marked as
2982 used. If not found, return NULL. */
2983
sewardjc9ad1152004-10-14 00:08:21 +00002984static IRExpr* tbSubst_Temp ( TmpInfo** env, IRTemp tmp )
sewardj17442fe2004-09-20 14:54:28 +00002985{
2986 TmpInfo* ti;
sewardj17442fe2004-09-20 14:54:28 +00002987 IRExpr* e;
sewardjc9ad1152004-10-14 00:08:21 +00002988 ti = env[(Int)tmp];
2989 if (ti){
2990 e = ti->expr;
sewardj17442fe2004-09-20 14:54:28 +00002991 if (e) {
2992 ti->expr = NULL;
2993 return e;
2994 } else {
2995 return NULL;
2996 }
2997 } else {
2998 return NULL;
2999 }
3000}
sewardj29632392004-08-22 02:38:11 +00003001
3002/* Traverse e, looking for temps. For each observed temp, see if env
3003 contains a binding for the temp, and if so return the bound value.
3004 The env has the property that any binding it holds is
3005 'single-shot', so once a binding is used, it is marked as no longer
3006 available, by setting its .expr field to NULL. */
3007
sewardjc9ad1152004-10-14 00:08:21 +00003008static IRExpr* tbSubst_Expr ( TmpInfo** env, IRExpr* e )
sewardj29632392004-08-22 02:38:11 +00003009{
sewardjc41f1fb2004-08-22 09:48:08 +00003010 IRExpr* e2;
3011 IRExpr** args2;
3012 Int i;
sewardj29632392004-08-22 02:38:11 +00003013
3014 switch (e->tag) {
3015
sewardjc41f1fb2004-08-22 09:48:08 +00003016 case Iex_CCall:
sewardj695cff92004-10-13 14:50:14 +00003017 args2 = sopyIRExprVec(e->Iex.CCall.args);
sewardjc41f1fb2004-08-22 09:48:08 +00003018 for (i = 0; args2[i]; i++)
3019 args2[i] = tbSubst_Expr(env,args2[i]);
sewardj8ea867b2004-10-30 19:03:02 +00003020 return IRExpr_CCall(e->Iex.CCall.cee,
sewardjc41f1fb2004-08-22 09:48:08 +00003021 e->Iex.CCall.retty,
3022 args2
3023 );
sewardj29632392004-08-22 02:38:11 +00003024 case Iex_Tmp:
sewardj17442fe2004-09-20 14:54:28 +00003025 e2 = tbSubst_Temp(env, e->Iex.Tmp.tmp);
3026 return e2 ? e2 : e;
sewardjc41f1fb2004-08-22 09:48:08 +00003027 case Iex_Mux0X:
3028 return IRExpr_Mux0X(
3029 tbSubst_Expr(env, e->Iex.Mux0X.cond),
3030 tbSubst_Expr(env, e->Iex.Mux0X.expr0),
3031 tbSubst_Expr(env, e->Iex.Mux0X.exprX)
3032 );
3033 case Iex_Binop:
3034 return IRExpr_Binop(
3035 e->Iex.Binop.op,
3036 tbSubst_Expr(env, e->Iex.Binop.arg1),
3037 tbSubst_Expr(env, e->Iex.Binop.arg2)
3038 );
3039 case Iex_Unop:
3040 return IRExpr_Unop(
3041 e->Iex.Unop.op,
3042 tbSubst_Expr(env, e->Iex.Unop.arg)
3043 );
3044 case Iex_LDle:
3045 return IRExpr_LDle(
3046 e->Iex.LDle.ty,
sewardjf6501992004-08-27 11:58:24 +00003047 tbSubst_Expr(env, e->Iex.LDle.addr)
sewardjc41f1fb2004-08-22 09:48:08 +00003048 );
sewardjf6501992004-08-27 11:58:24 +00003049 case Iex_GetI:
3050 return IRExpr_GetI(
sewardj2d3f77c2004-09-22 23:49:09 +00003051 e->Iex.GetI.descr,
sewardjeeac8412004-11-02 00:26:55 +00003052 tbSubst_Expr(env, e->Iex.GetI.ix),
sewardj2d3f77c2004-09-22 23:49:09 +00003053 e->Iex.GetI.bias
3054 );
sewardjc41f1fb2004-08-22 09:48:08 +00003055 case Iex_Const:
sewardj29632392004-08-22 02:38:11 +00003056 case Iex_Get:
3057 return e;
3058 default:
3059 vex_printf("\n"); ppIRExpr(e); vex_printf("\n");
3060 vpanic("tbSubst_Expr");
3061 }
3062}
3063
3064/* Same deal as tbSubst_Expr, except for stmts. */
3065
sewardjc9ad1152004-10-14 00:08:21 +00003066static IRStmt* tbSubst_Stmt ( TmpInfo** env, IRStmt* st )
sewardj29632392004-08-22 02:38:11 +00003067{
sewardj17442fe2004-09-20 14:54:28 +00003068 Int i;
3069 IRDirty* d;
3070 IRDirty* d2;
sewardj29632392004-08-22 02:38:11 +00003071 switch (st->tag) {
sewardj17442fe2004-09-20 14:54:28 +00003072 case Ist_STle:
3073 return IRStmt_STle(
3074 tbSubst_Expr(env, st->Ist.STle.addr),
3075 tbSubst_Expr(env, st->Ist.STle.data)
3076 );
sewardj29632392004-08-22 02:38:11 +00003077 case Ist_Tmp:
3078 return IRStmt_Tmp(
3079 st->Ist.Tmp.tmp,
sewardj6d076362004-09-23 11:06:17 +00003080 tbSubst_Expr(env, st->Ist.Tmp.data)
sewardj29632392004-08-22 02:38:11 +00003081 );
3082 case Ist_Put:
3083 return IRStmt_Put(
3084 st->Ist.Put.offset,
sewardj6d076362004-09-23 11:06:17 +00003085 tbSubst_Expr(env, st->Ist.Put.data)
sewardj29632392004-08-22 02:38:11 +00003086 );
sewardjf6501992004-08-27 11:58:24 +00003087 case Ist_PutI:
3088 return IRStmt_PutI(
sewardj2d3f77c2004-09-22 23:49:09 +00003089 st->Ist.PutI.descr,
sewardjeeac8412004-11-02 00:26:55 +00003090 tbSubst_Expr(env, st->Ist.PutI.ix),
sewardj2d3f77c2004-09-22 23:49:09 +00003091 st->Ist.PutI.bias,
3092 tbSubst_Expr(env, st->Ist.PutI.data)
3093 );
sewardjf6501992004-08-27 11:58:24 +00003094
sewardj29632392004-08-22 02:38:11 +00003095 case Ist_Exit:
3096 return IRStmt_Exit(
sewardj0276d4b2004-11-15 15:30:21 +00003097 tbSubst_Expr(env, st->Ist.Exit.guard),
sewardj893aada2004-11-29 19:57:54 +00003098 st->Ist.Exit.jk,
sewardj29632392004-08-22 02:38:11 +00003099 st->Ist.Exit.dst
3100 );
sewardj3e838932005-01-07 12:09:15 +00003101 case Ist_MFence:
3102 return IRStmt_MFence();
sewardj17442fe2004-09-20 14:54:28 +00003103 case Ist_Dirty:
3104 d = st->Ist.Dirty.details;
3105 d2 = emptyIRDirty();
3106 *d2 = *d;
3107 if (d2->mFx != Ifx_None)
3108 d2->mAddr = tbSubst_Expr(env, d2->mAddr);
sewardjb8385d82004-11-02 01:34:15 +00003109 d2->guard = tbSubst_Expr(env, d2->guard);
sewardj17442fe2004-09-20 14:54:28 +00003110 for (i = 0; d2->args[i]; i++)
3111 d2->args[i] = tbSubst_Expr(env, d2->args[i]);
3112 return IRStmt_Dirty(d2);
sewardj29632392004-08-22 02:38:11 +00003113 default:
3114 vex_printf("\n"); ppIRStmt(st); vex_printf("\n");
3115 vpanic("tbSubst_Stmt");
3116 }
3117}
3118
3119
sewardj37d38012004-08-24 00:37:04 +00003120/* Traverse an expr, and detect if any part of it reads memory or does
3121 a Get. Be careful ... this really controls how much the
3122 tree-builder can reorder the code, so getting it right is critical.
3123*/
sewardj29632392004-08-22 02:38:11 +00003124static void setHints_Expr (Bool* doesLoad, Bool* doesGet, IRExpr* e )
3125{
sewardjc41f1fb2004-08-22 09:48:08 +00003126 Int i;
sewardj29632392004-08-22 02:38:11 +00003127 switch (e->tag) {
sewardjc41f1fb2004-08-22 09:48:08 +00003128 case Iex_CCall:
3129 for (i = 0; e->Iex.CCall.args[i]; i++)
3130 setHints_Expr(doesLoad, doesGet, e->Iex.CCall.args[i]);
3131 return;
3132 case Iex_Mux0X:
3133 setHints_Expr(doesLoad, doesGet, e->Iex.Mux0X.cond);
3134 setHints_Expr(doesLoad, doesGet, e->Iex.Mux0X.expr0);
3135 setHints_Expr(doesLoad, doesGet, e->Iex.Mux0X.exprX);
3136 return;
3137 case Iex_Binop:
3138 setHints_Expr(doesLoad, doesGet, e->Iex.Binop.arg1);
3139 setHints_Expr(doesLoad, doesGet, e->Iex.Binop.arg2);
3140 return;
3141 case Iex_Unop:
3142 setHints_Expr(doesLoad, doesGet, e->Iex.Unop.arg);
3143 return;
3144 case Iex_LDle:
sewardjc9ad1152004-10-14 00:08:21 +00003145 *doesLoad = True;
sewardjaba4fff2004-10-08 21:37:15 +00003146 setHints_Expr(doesLoad, doesGet, e->Iex.LDle.addr);
sewardjc41f1fb2004-08-22 09:48:08 +00003147 return;
sewardj29632392004-08-22 02:38:11 +00003148 case Iex_Get:
sewardjc9ad1152004-10-14 00:08:21 +00003149 *doesGet = True;
sewardj29632392004-08-22 02:38:11 +00003150 return;
sewardjf6501992004-08-27 11:58:24 +00003151 case Iex_GetI:
sewardjc9ad1152004-10-14 00:08:21 +00003152 *doesGet = True;
sewardjeeac8412004-11-02 00:26:55 +00003153 setHints_Expr(doesLoad, doesGet, e->Iex.GetI.ix);
sewardjf6501992004-08-27 11:58:24 +00003154 return;
sewardjc41f1fb2004-08-22 09:48:08 +00003155 case Iex_Tmp:
3156 case Iex_Const:
3157 return;
sewardj29632392004-08-22 02:38:11 +00003158 default:
3159 vex_printf("\n"); ppIRExpr(e); vex_printf("\n");
3160 vpanic("setHints_Expr");
3161 }
3162}
3163
3164
sewardjc9ad1152004-10-14 00:08:21 +00003165static void dumpInvalidated ( TmpInfo** env, IRBB* bb, /*INOUT*/Int* j )
sewardj29632392004-08-22 02:38:11 +00003166{
3167 Int k, oldest_op, oldest_k;
3168 TmpInfo* ti;
3169
3170 /* Dump all the bindings to marked as invalidated, in order. */
3171 while (True) {
3172
3173 /* find the oldest bind marked 'invalidateMe'. */
3174 oldest_op = 1<<30;
3175 oldest_k = 1<<30;
sewardjc9ad1152004-10-14 00:08:21 +00003176 for (k = 0; k < bb->tyenv->types_used; k++) {
3177 ti = env[k];
3178 if (!ti)
sewardj29632392004-08-22 02:38:11 +00003179 continue;
sewardj29632392004-08-22 02:38:11 +00003180 if (!ti->expr)
3181 continue;
3182 if (!ti->invalidateMe)
3183 continue;
sewardjc41f1fb2004-08-22 09:48:08 +00003184 /* vex_printf("FOUND INVAL %d %d\n", ti->origPos, oldest_op); */
sewardj29632392004-08-22 02:38:11 +00003185 if (ti->origPos < oldest_op) {
3186 oldest_op = ti->origPos;
3187 oldest_k = k;
3188 }
3189 }
3190
3191 /* No more binds to invalidate. */
3192 if (oldest_op == 1<<30)
sewardjc41f1fb2004-08-22 09:48:08 +00003193 return;
sewardj29632392004-08-22 02:38:11 +00003194
3195 /* the oldest bind to invalidate has been identified */
3196 vassert(oldest_op != 1<<31 && oldest_k != 1<<31);
sewardjc9ad1152004-10-14 00:08:21 +00003197 ti = env[oldest_k];
sewardj29632392004-08-22 02:38:11 +00003198 vassert(ti->expr && ti->invalidateMe);
3199
3200 /* and invalidate it ... */
sewardjc9ad1152004-10-14 00:08:21 +00003201 bb->stmts[*j] = IRStmt_Tmp( (IRTemp)oldest_k, ti->expr );
sewardj29632392004-08-22 02:38:11 +00003202 /* vex_printf("**1 "); ppIRStmt(bb->stmts[*j]); vex_printf("\n"); */
3203 (*j)++;
3204 ti->invalidateMe = False;
3205 ti->expr = NULL; /* no longer available for substitution */
3206
3207 } /* loop which dumps the binds marked for invalidation */
3208}
3209
3210
3211
sewardj49651f42004-10-28 22:11:04 +00003212/* notstatic */ void do_treebuild_BB ( IRBB* bb )
sewardj29632392004-08-22 02:38:11 +00003213{
3214 Int i, j, k;
sewardj29632392004-08-22 02:38:11 +00003215 Bool invPut, invStore;
3216 IRStmt* st;
3217 IRStmt* st2;
3218 TmpInfo* ti;
3219 IRExpr* next2;
3220
3221 /* Mapping from IRTemp to TmpInfo*. */
sewardjc9ad1152004-10-14 00:08:21 +00003222 Int n_tmps = bb->tyenv->types_used;
3223 TmpInfo** env = LibVEX_Alloc(n_tmps * sizeof(TmpInfo*));
3224
3225 for (i = 0; i < n_tmps; i++)
3226 env[i] = NULL;
sewardj29632392004-08-22 02:38:11 +00003227
3228 /* Phase 1. Scan forwards in bb, counting use occurrences of each
3229 temp. Also count occurrences in the bb->next field. */
3230
3231 for (i = 0; i < bb->stmts_used; i++) {
3232 st = bb->stmts[i];
3233 if (!st)
3234 continue;
3235 occCount_Stmt( env, st );
3236 }
3237 occCount_Expr(env, bb->next );
3238
sewardjc9ad1152004-10-14 00:08:21 +00003239# if 0
sewardj29632392004-08-22 02:38:11 +00003240 for (i = 0; i < env->used; i++) {
3241 if (!env->inuse[i])
3242 continue;
3243 ppIRTemp( (IRTemp)(env->key[i]) );
3244 vex_printf(" used %d\n", ((TmpInfo*)env->val[i])->occ );
3245 }
sewardjc9ad1152004-10-14 00:08:21 +00003246# endif
sewardj29632392004-08-22 02:38:11 +00003247
3248 /* Phase 2. Fill in the origPos fields. */
3249
3250 for (i = 0; i < bb->stmts_used; i++) {
3251 st = bb->stmts[i];
3252 if (!st)
3253 continue;
3254 if (st->tag != Ist_Tmp)
3255 continue;
3256
sewardjc9ad1152004-10-14 00:08:21 +00003257 ti = env[(Int)st->Ist.Tmp.tmp];
3258 if (!ti) {
sewardj84ff0652004-08-23 16:16:08 +00003259 vex_printf("\n");
3260 ppIRTemp(st->Ist.Tmp.tmp);
3261 vex_printf("\n");
sewardj29632392004-08-22 02:38:11 +00003262 vpanic("treebuild_BB (phase 2): unmapped IRTemp");
3263 }
sewardj29632392004-08-22 02:38:11 +00003264 ti->origPos = i;
3265 }
3266
3267 /* Phase 3. Scan forwards in bb.
3268
3269 On seeing 't = E', occ(t)==1,
3270 let E'=env(E), set t's binding to be E', and
3271 delete this stmt.
3272 Also examine E' and set the hints for E' appropriately
3273 (doesLoad? doesGet?)
3274
3275 On seeing any other stmt,
3276 let stmt' = env(stmt)
3277 remove from env any 't=E' binds invalidated by stmt
3278 emit the invalidated stmts
3279 emit stmt'
3280
3281 Apply env to bb->next.
3282 */
3283
3284 /* The stmts in bb are being reordered, and we are guaranteed to
3285 end up with no more than the number we started with. Use i to
3286 be the cursor of the current stmt examined and j <= i to be that
3287 for the current stmt being written.
3288 */
3289 j = 0;
3290 for (i = 0; i < bb->stmts_used; i++) {
3291 st = bb->stmts[i];
3292 if (!st)
3293 continue;
3294
3295 if (st->tag == Ist_Tmp) {
sewardje80679a2004-09-21 23:00:11 +00003296 /* vex_printf("acquire binding\n"); */
sewardjc9ad1152004-10-14 00:08:21 +00003297 ti = env[st->Ist.Tmp.tmp];
3298 if (!ti) {
sewardj29632392004-08-22 02:38:11 +00003299 vpanic("treebuild_BB (phase 2): unmapped IRTemp");
3300 }
sewardj29632392004-08-22 02:38:11 +00003301 if (ti->occ == 1) {
3302 /* ok, we have 't = E', occ(t)==1. Do the abovementioned actions. */
sewardj6d076362004-09-23 11:06:17 +00003303 IRExpr* e = st->Ist.Tmp.data;
sewardj29632392004-08-22 02:38:11 +00003304 IRExpr* e2 = tbSubst_Expr(env, e);
3305 ti->expr = e2;
3306 ti->eDoesLoad = ti->eDoesGet = False;
3307 setHints_Expr(&ti->eDoesLoad, &ti->eDoesGet, e2);
3308 /* don't advance j, as we are deleting this stmt and instead
3309 holding it temporarily in the env. */
3310 continue; /* the for (i = 0; i < bb->stmts_used; i++) loop */
3311 }
3312 }
3313
3314 /* we get here for any other kind of statement. */
3315 /* 'use up' any bindings required by the current statement. */
3316 st2 = tbSubst_Stmt(env, st);
3317
3318 /* Now, before this stmt, dump any bindings it invalidates.
3319 These need to be dumped in the order in which they originally
3320 appeared. (Stupid algorithm): first, mark all bindings which
3321 need to be dumped. Then, dump them in the order in which
3322 they were defined. */
sewardj3e838932005-01-07 12:09:15 +00003323
sewardj9d2e7692005-02-07 01:11:31 +00003324 invPut = toBool(st->tag == Ist_Put
3325 || st->tag == Ist_PutI
3326 || st->tag == Ist_Dirty);
sewardj3e838932005-01-07 12:09:15 +00003327
sewardj9d2e7692005-02-07 01:11:31 +00003328 invStore = toBool(st->tag == Ist_STle
3329 || st->tag == Ist_Dirty);
sewardj29632392004-08-22 02:38:11 +00003330
sewardjc9ad1152004-10-14 00:08:21 +00003331 for (k = 0; k < n_tmps; k++) {
3332 ti = env[k];
3333 if (!ti)
sewardj29632392004-08-22 02:38:11 +00003334 continue;
sewardj29632392004-08-22 02:38:11 +00003335 if (!ti->expr)
3336 continue;
3337
sewardj3e838932005-01-07 12:09:15 +00003338 /* Do we have to invalidate this binding? */
3339
sewardj29632392004-08-22 02:38:11 +00003340 ti->invalidateMe
sewardj9d2e7692005-02-07 01:11:31 +00003341 = toBool(
3342 /* a store invalidates loaded data */
sewardj4c5f6d52004-10-26 13:25:33 +00003343 (ti->eDoesLoad && invStore)
3344 /* a put invalidates get'd data */
3345 || (ti->eDoesGet && invPut)
3346 /* a put invalidates loaded data. Note, we could do
3347 much better here in the sense that we only need to
3348 invalidate trees containing loads if the Put in
3349 question is marked as requiring precise
3350 exceptions. */
sewardj3e838932005-01-07 12:09:15 +00003351 || (ti->eDoesLoad && invPut)
3352 /* probably overly conservative: a memory fence
3353 invalidates absolutely everything, so that all
3354 computation prior to it is forced to complete before
3355 proceeding with the fence. */
sewardj9d2e7692005-02-07 01:11:31 +00003356 || st->tag == Ist_MFence
3357 );
sewardjc41f1fb2004-08-22 09:48:08 +00003358 /*
3359 if (ti->invalidateMe)
sewardj3e838932005-01-07 12:09:15 +00003360 vex_printf("SET INVAL\n");
3361 */
sewardj29632392004-08-22 02:38:11 +00003362 }
3363
3364 dumpInvalidated ( env, bb, &j );
3365
3366 /* finally, emit the substituted statement */
3367 bb->stmts[j] = st2;
3368 /* vex_printf("**2 "); ppIRStmt(bb->stmts[j]); vex_printf("\n"); */
3369 j++;
3370
3371 vassert(j <= i+1);
3372 } /* for each stmt in the original bb ... */
3373
3374 /* Finally ... substitute the ->next field as much as possible, and
3375 dump any left-over bindings. Hmm. Perhaps there should be no
3376 left over bindings? Or any left-over bindings are
3377 by definition dead? */
3378 next2 = tbSubst_Expr(env, bb->next);
3379 bb->next = next2;
3380 bb->stmts_used = j;
3381}
3382
3383
sewardj695cff92004-10-13 14:50:14 +00003384/*---------------------------------------------------------------*/
sewardjedf4d692004-08-17 13:52:58 +00003385/*--- iropt main ---*/
3386/*---------------------------------------------------------------*/
3387
sewardj695cff92004-10-13 14:50:14 +00003388static Bool iropt_verbose = False; //True;
sewardj4345f7a2004-09-22 19:49:27 +00003389
3390
sewardj4345f7a2004-09-22 19:49:27 +00003391/* Do a simple cleanup pass on bb. This is: redundant Get removal,
3392 redundant Put removal, constant propagation, dead code removal,
3393 clean helper specialisation, and dead code removal (again).
sewardjb9230752004-12-29 19:25:06 +00003394*/
sewardj695cff92004-10-13 14:50:14 +00003395
sewardj4345f7a2004-09-22 19:49:27 +00003396
3397static
sewardj695cff92004-10-13 14:50:14 +00003398IRBB* cheap_transformations (
3399 IRBB* bb,
sewardj9d2e7692005-02-07 01:11:31 +00003400 IRExpr* (*specHelper) (HChar*, IRExpr**),
sewardj8d2291c2004-10-25 14:50:21 +00003401 Bool (*preciseMemExnsFn)(Int,Int)
3402 )
sewardj4345f7a2004-09-22 19:49:27 +00003403{
3404 redundant_get_removal_BB ( bb );
3405 if (iropt_verbose) {
3406 vex_printf("\n========= REDUNDANT GET\n\n" );
3407 ppIRBB(bb);
3408 }
sewardj044a2152004-10-21 10:22:10 +00003409
sewardj8d2291c2004-10-25 14:50:21 +00003410 redundant_put_removal_BB ( bb, preciseMemExnsFn );
sewardj4345f7a2004-09-22 19:49:27 +00003411 if (iropt_verbose) {
3412 vex_printf("\n========= REDUNDANT PUT\n\n" );
3413 ppIRBB(bb);
3414 }
sewardj044a2152004-10-21 10:22:10 +00003415
sewardj4345f7a2004-09-22 19:49:27 +00003416 bb = cprop_BB ( bb );
3417 if (iropt_verbose) {
3418 vex_printf("\n========= CPROPD\n\n" );
3419 ppIRBB(bb);
3420 }
3421
sewardj49651f42004-10-28 22:11:04 +00003422 do_deadcode_BB ( bb );
sewardj4345f7a2004-09-22 19:49:27 +00003423 if (iropt_verbose) {
3424 vex_printf("\n========= DEAD\n\n" );
3425 ppIRBB(bb);
3426 }
3427
sewardjb9230752004-12-29 19:25:06 +00003428 bb = spec_helpers_BB ( bb, specHelper );
sewardj49651f42004-10-28 22:11:04 +00003429 do_deadcode_BB ( bb );
sewardj4345f7a2004-09-22 19:49:27 +00003430 if (iropt_verbose) {
3431 vex_printf("\n========= SPECd \n\n" );
3432 ppIRBB(bb);
3433 }
3434
3435 return bb;
3436}
3437
sewardj695cff92004-10-13 14:50:14 +00003438
3439/* Do some more expensive transformations on bb, which are aimed at
3440 optimising as much as possible in the presence of GetI and PutI. */
3441
3442static
3443IRBB* expensive_transformations( IRBB* bb )
3444{
sewardjfe1ccfc2004-11-11 02:14:45 +00003445 do_cse_BB( bb );
sewardj695cff92004-10-13 14:50:14 +00003446 collapse_AddSub_chains_BB( bb );
sewardj08210532004-12-29 17:09:11 +00003447 do_redundant_GetI_elimination( bb );
sewardj695cff92004-10-13 14:50:14 +00003448 do_redundant_PutI_elimination( bb );
sewardj49651f42004-10-28 22:11:04 +00003449 do_deadcode_BB( bb );
sewardjb9230752004-12-29 19:25:06 +00003450 return bb;
sewardj695cff92004-10-13 14:50:14 +00003451}
3452
3453
sewardj4345f7a2004-09-22 19:49:27 +00003454/* Scan a flattened BB to see if it has any GetI or PutIs in it. Used
3455 as a heuristic hack to see if iropt needs to do expensive
sewardj695cff92004-10-13 14:50:14 +00003456 optimisations (CSE, PutI -> GetI forwarding, redundant PutI
3457 elimination) to improve code containing GetI or PutI. */
3458
sewardj4345f7a2004-09-22 19:49:27 +00003459static Bool hasGetIorPutI ( IRBB* bb )
3460{
3461 Int i, j;
3462 IRStmt* st;
3463 IRDirty* d;
3464
3465 for (i = 0; i < bb->stmts_used; i++) {
3466 st = bb->stmts[i];
3467 if (!st)
3468 continue;
3469
3470 switch (st->tag) {
3471 case Ist_PutI:
3472 return True;
3473 case Ist_Tmp:
sewardj6d076362004-09-23 11:06:17 +00003474 if (st->Ist.Tmp.data->tag == Iex_GetI)
sewardj4345f7a2004-09-22 19:49:27 +00003475 return True;
3476 break;
3477 case Ist_Put:
sewardj6d076362004-09-23 11:06:17 +00003478 vassert(isAtom(st->Ist.Put.data));
sewardj4345f7a2004-09-22 19:49:27 +00003479 break;
3480 case Ist_STle:
3481 vassert(isAtom(st->Ist.STle.addr));
3482 vassert(isAtom(st->Ist.STle.data));
3483 break;
sewardj4345f7a2004-09-22 19:49:27 +00003484 case Ist_Dirty:
3485 d = st->Ist.Dirty.details;
sewardjb8385d82004-11-02 01:34:15 +00003486 vassert(isAtom(d->guard));
sewardj4345f7a2004-09-22 19:49:27 +00003487 for (j = 0; d->args[j]; j++)
3488 vassert(isAtom(d->args[j]));
3489 if (d->mFx != Ifx_None)
3490 vassert(isAtom(d->mAddr));
3491 break;
sewardj3e838932005-01-07 12:09:15 +00003492 case Ist_MFence:
3493 break;
3494 case Ist_Exit:
3495 vassert(isAtom(st->Ist.Exit.guard));
3496 break;
sewardj4345f7a2004-09-22 19:49:27 +00003497 default:
3498 ppIRStmt(st);
3499 vpanic("hasGetIorPutI");
3500 }
3501
3502 }
3503 return False;
3504
3505}
3506
3507
sewardj695cff92004-10-13 14:50:14 +00003508/* ---------------- The main iropt entry point. ---------------- */
3509
sewardjedf4d692004-08-17 13:52:58 +00003510/* exported from this file */
sewardj695cff92004-10-13 14:50:14 +00003511/* Rules of the game:
3512
3513 - IRExpr/IRStmt trees should be treated as immutable, as they
3514 may get shared. So never change a field of such a tree node;
3515 instead construct and return a new one if needed.
3516*/
3517
sewardj4345f7a2004-09-22 19:49:27 +00003518
sewardj84ff0652004-08-23 16:16:08 +00003519IRBB* do_iropt_BB ( IRBB* bb0,
sewardj9d2e7692005-02-07 01:11:31 +00003520 IRExpr* (*specHelper) (HChar*, IRExpr**),
sewardj8d2291c2004-10-25 14:50:21 +00003521 Bool (*preciseMemExnsFn)(Int,Int),
sewardj695cff92004-10-13 14:50:14 +00003522 Addr64 guest_addr )
sewardjedf4d692004-08-17 13:52:58 +00003523{
sewardj9d2e7692005-02-07 01:11:31 +00003524 static Int n_total = 0;
3525 static Int n_expensive = 0;
sewardj29632392004-08-22 02:38:11 +00003526
sewardj695cff92004-10-13 14:50:14 +00003527 Bool do_expensive;
sewardj695cff92004-10-13 14:50:14 +00003528 IRBB *bb, *bb2;
sewardj8c2c10b2004-10-16 20:51:52 +00003529
sewardj4345f7a2004-09-22 19:49:27 +00003530 n_total++;
3531
3532 /* First flatten the block out, since all other
3533 phases assume flat code. */
3534
3535 bb = flatten_BB ( bb0 );
3536
3537 if (iropt_verbose) {
3538 vex_printf("\n========= FLAT\n\n" );
3539 ppIRBB(bb);
sewardj84be7372004-08-18 13:59:33 +00003540 }
sewardjd7217032004-08-19 10:49:10 +00003541
sewardj08210532004-12-29 17:09:11 +00003542 /* If at level 0, stop now. */
3543 if (vex_control.iropt_level <= 0) return bb;
3544
sewardj695cff92004-10-13 14:50:14 +00003545 /* Now do a preliminary cleanup pass, and figure out if we also
3546 need to do 'expensive' optimisations. Expensive optimisations
3547 are deemed necessary if the block contains any GetIs or PutIs.
3548 If needed, do expensive transformations and then another cheap
3549 cleanup pass. */
sewardj4345f7a2004-09-22 19:49:27 +00003550
sewardj8d2291c2004-10-25 14:50:21 +00003551 bb = cheap_transformations( bb, specHelper, preciseMemExnsFn );
sewardj695cff92004-10-13 14:50:14 +00003552
sewardj08613742004-10-25 13:01:45 +00003553 if (vex_control.iropt_level > 1) {
sewardj39555aa2004-10-24 22:29:19 +00003554 do_expensive = hasGetIorPutI(bb);
sewardj695cff92004-10-13 14:50:14 +00003555 if (do_expensive) {
sewardj39555aa2004-10-24 22:29:19 +00003556 n_expensive++;
sewardj39555aa2004-10-24 22:29:19 +00003557 if (DEBUG_IROPT)
3558 vex_printf("***** EXPENSIVE %d %d\n", n_total, n_expensive);
sewardj695cff92004-10-13 14:50:14 +00003559 bb = expensive_transformations( bb );
sewardj8d2291c2004-10-25 14:50:21 +00003560 bb = cheap_transformations( bb, specHelper, preciseMemExnsFn );
sewardj695cff92004-10-13 14:50:14 +00003561 }
sewardj39555aa2004-10-24 22:29:19 +00003562
3563 /* Now have a go at unrolling simple (single-BB) loops. If
3564 successful, clean up the results as much as possible. */
3565
3566 bb2 = maybe_loop_unroll_BB( bb, guest_addr );
3567 if (bb2) {
sewardj8d2291c2004-10-25 14:50:21 +00003568 bb = cheap_transformations( bb2, specHelper, preciseMemExnsFn );
sewardj39555aa2004-10-24 22:29:19 +00003569 if (do_expensive) {
3570 bb = expensive_transformations( bb );
sewardj8d2291c2004-10-25 14:50:21 +00003571 bb = cheap_transformations( bb, specHelper, preciseMemExnsFn );
sewardj39555aa2004-10-24 22:29:19 +00003572 } else {
3573 /* at least do CSE and dead code removal */
sewardjfe1ccfc2004-11-11 02:14:45 +00003574 do_cse_BB( bb );
sewardj49651f42004-10-28 22:11:04 +00003575 do_deadcode_BB( bb );
sewardj39555aa2004-10-24 22:29:19 +00003576 }
3577 if (0) vex_printf("vex iropt: unrolled a loop\n");
3578 }
3579
sewardjd7217032004-08-19 10:49:10 +00003580 }
3581
sewardj4345f7a2004-09-22 19:49:27 +00003582 return bb;
sewardjedf4d692004-08-17 13:52:58 +00003583}
3584
3585
sewardja1a370f2004-08-17 13:31:55 +00003586/*---------------------------------------------------------------*/
3587/*--- end ir/iropt.c ---*/
3588/*---------------------------------------------------------------*/