blob: 45df01c8375b8e596ad5149a839a0c06c3c74343 [file] [log] [blame]
sewardja1a370f2004-08-17 13:31:55 +00001
2/*---------------------------------------------------------------*/
3/*--- ---*/
4/*--- This file (ir/iropt.c) is ---*/
sewardj496a58d2005-03-20 18:44:44 +00005/*--- Copyright (c) OpenWorks LLP. All rights reserved. ---*/
sewardja1a370f2004-08-17 13:31:55 +00006/*--- ---*/
7/*---------------------------------------------------------------*/
8
sewardjf8ed9d82004-11-12 17:40:23 +00009/*
10 This file is part of LibVEX, a library for dynamic binary
11 instrumentation and translation.
12
sewardj496a58d2005-03-20 18:44:44 +000013 Copyright (C) 2004-2005 OpenWorks LLP.
sewardjf8ed9d82004-11-12 17:40:23 +000014
15 This program is free software; you can redistribute it and/or modify
16 it under the terms of the GNU General Public License as published by
17 the Free Software Foundation; Version 2 dated June 1991 of the
18 license.
19
20 This program is distributed in the hope that it will be useful,
21 but WITHOUT ANY WARRANTY; without even the implied warranty of
22 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE, or liability
23 for damages. See the GNU General Public License for more details.
24
25 Neither the names of the U.S. Department of Energy nor the
26 University of California nor the names of its contributors may be
27 used to endorse or promote products derived from this software
28 without prior written permission.
29
30 You should have received a copy of the GNU General Public License
31 along with this program; if not, write to the Free Software
32 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
33 USA.
34*/
35
sewardja1a370f2004-08-17 13:31:55 +000036#include "libvex_basictypes.h"
sewardjedf4d692004-08-17 13:52:58 +000037#include "libvex_ir.h"
sewardja1a370f2004-08-17 13:31:55 +000038#include "libvex.h"
39
40#include "main/vex_util.h"
sewardj08613742004-10-25 13:01:45 +000041#include "main/vex_globals.h"
sewardjedf4d692004-08-17 13:52:58 +000042#include "ir/iropt.h"
sewardja1a370f2004-08-17 13:31:55 +000043
sewardjd0863ff2004-10-23 00:22:32 +000044
sewardj088bcb42004-08-19 17:16:52 +000045/* Set to 1 for lots of debugging output. */
46#define DEBUG_IROPT 0
47
sewardja1a370f2004-08-17 13:31:55 +000048
sewardj08210532004-12-29 17:09:11 +000049/* What iropt does, 29 Dec 04.
50
51 It takes an IRBB and produces a new one with the same meaning,
52 defined thus:
53
54 After execution of the new BB, all guest state and guest memory is
55 the same as after execution of the original. This is true
56 regardless of how the block was exited (at the end vs side exit).
57
58 In addition, parts of the guest state will be identical to that
59 created by execution of the original at the following observation
60 points:
61
62 * In a dirty helper call, any parts of the guest state that the
63 helper states that it reads or modifies will be up to date.
64 Also, guest memory will be up to date. Parts of the guest state
65 not marked as being read or modified by the helper cannot be
66 assumed to be up-to-date at the point where the helper is called.
67
68 * Immediately prior to any load or store, those parts of the guest
69 state marked as requiring precise exceptions will be up to date.
70 Also, guest memory will be up to date. Parts of the guest state
71 not marked as requiring precise exceptions cannot be assumed to
72 be up-to-date at the point of the load/store.
73
74 The relative order of loads and stores (including loads/stores of
75 guest memory done by dirty helpers annotated as such) is not
76 changed. However, the relative order of loads with no intervening
77 stores/modifies may be changed.
78
79 Transformation order
80 ~~~~~~~~~~~~~~~~~~~~
81
82 There are three levels of optimisation, controlled by
83 vex_control.iropt_level. Define first:
84
85 "Cheap transformations" are the following sequence:
86 * Redundant-Get removal
87 * Redundant-Put removal
88 * Constant propagation/folding
89 * Dead code removal
90 * Specialisation of clean helper functions
91 * Dead code removal
92
93 "Expensive transformations" are the following sequence:
94 * CSE
95 * Folding of add/sub chains
96 * Redundant-GetI removal
97 * Redundant-PutI removal
98 * Dead code removal
sewardj08210532004-12-29 17:09:11 +000099
100 Then the transformations are as follows, as defined by
101 vex_control.iropt_level:
102
103 Level 0:
104 * Flatten into atomic form.
105
106 Level 1: the following sequence:
107 * Flatten into atomic form.
108 * Cheap transformations.
sewardj08210532004-12-29 17:09:11 +0000109
110 Level 2: the following sequence
111 * Flatten into atomic form.
112 * Cheap transformations.
113 * If block contains GetI or PutI, Expensive transformations.
114 * Try unrolling loops. Three possible outcomes:
115 - No effect: do nothing more.
116 - Unrolled a loop, and block does not contain GetI or PutI:
117 Do: * CSE
118 * Dead code removal
sewardj08210532004-12-29 17:09:11 +0000119 - Unrolled a loop, and block contains GetI or PutI:
120 Do: * Expensive transformations
121 * Cheap transformations
sewardj08210532004-12-29 17:09:11 +0000122*/
123
sewardj98430292004-12-29 17:34:50 +0000124/* Implementation notes, 29 Dec 04.
125
126 TODO (important): I think rPutI removal ignores precise exceptions
127 and is therefore in a sense, wrong. In the sense that PutIs are
128 assumed not to write parts of the guest state that we need to have
129 up-to-date at loads/stores. So far on x86 guest that has not
130 mattered since indeed only the x87 FP registers and tags are
131 accessed using GetI/PutI, and there is no need so far for them to
132 be up to date at mem exception points. The rPutI pass should be
133 fixed.
sewardjfb44d552004-10-25 09:48:47 +0000134
sewardj4c5f6d52004-10-26 13:25:33 +0000135 TODO: improve pessimistic handling of precise exceptions
136 in the tree builder.
137
sewardjfb44d552004-10-25 09:48:47 +0000138 TODO: check interaction of rGetI and dirty helpers.
sewardjc0b42282004-10-12 13:44:12 +0000139
140 F64i constants are treated differently from other constants.
141 They are not regarded as atoms, and instead lifted off and
142 bound to temps. This allows them to participate in CSE, which
143 is important for getting good performance for x86 guest code.
sewardj695cff92004-10-13 14:50:14 +0000144
sewardja5aa9cf2004-10-15 22:56:38 +0000145 CSE up F64 literals (already doing F64is)
sewardj4c5f6d52004-10-26 13:25:33 +0000146
147 CSE: consider carefully the requirement for precise exns
sewardj98430292004-12-29 17:34:50 +0000148 prior to making CSE any more aggressive. */
sewardjc0b42282004-10-12 13:44:12 +0000149
150
sewardja1a370f2004-08-17 13:31:55 +0000151/*---------------------------------------------------------------*/
152/*--- Finite mappery, of a sort ---*/
153/*---------------------------------------------------------------*/
154
sewardj08210532004-12-29 17:09:11 +0000155/* General map from HWord-sized thing HWord-sized thing. Could be by
156 hashing, but it's not clear whether or not this would really be any
157 faster. */
sewardja1a370f2004-08-17 13:31:55 +0000158
159typedef
160 struct {
161 Bool* inuse;
sewardjea602bc2004-10-14 21:40:12 +0000162 HWord* key;
163 HWord* val;
sewardja1a370f2004-08-17 13:31:55 +0000164 Int size;
165 Int used;
166 }
sewardjea602bc2004-10-14 21:40:12 +0000167 HashHW;
sewardja1a370f2004-08-17 13:31:55 +0000168
sewardjea602bc2004-10-14 21:40:12 +0000169static HashHW* newHHW ( void )
sewardja1a370f2004-08-17 13:31:55 +0000170{
sewardjea602bc2004-10-14 21:40:12 +0000171 HashHW* h = LibVEX_Alloc(sizeof(HashHW));
sewardj29632392004-08-22 02:38:11 +0000172 h->size = 8;
sewardja1a370f2004-08-17 13:31:55 +0000173 h->used = 0;
174 h->inuse = LibVEX_Alloc(h->size * sizeof(Bool));
sewardjea602bc2004-10-14 21:40:12 +0000175 h->key = LibVEX_Alloc(h->size * sizeof(HWord));
176 h->val = LibVEX_Alloc(h->size * sizeof(HWord));
sewardja1a370f2004-08-17 13:31:55 +0000177 return h;
178}
179
180
sewardj84be7372004-08-18 13:59:33 +0000181/* Look up key in the map. */
sewardja1a370f2004-08-17 13:31:55 +0000182
sewardjea602bc2004-10-14 21:40:12 +0000183static Bool lookupHHW ( HashHW* h, /*OUT*/HWord* val, HWord key )
sewardja1a370f2004-08-17 13:31:55 +0000184{
185 Int i;
sewardj08210532004-12-29 17:09:11 +0000186 /* vex_printf("lookupHHW(%llx)\n", key ); */
sewardja1a370f2004-08-17 13:31:55 +0000187 for (i = 0; i < h->used; i++) {
188 if (h->inuse[i] && h->key[i] == key) {
sewardj39e3f242004-08-18 16:54:52 +0000189 if (val)
190 *val = h->val[i];
sewardja1a370f2004-08-17 13:31:55 +0000191 return True;
192 }
193 }
194 return False;
195}
196
197
sewardja1a370f2004-08-17 13:31:55 +0000198/* Add key->val to the map. Replaces any existing binding for key. */
199
sewardjea602bc2004-10-14 21:40:12 +0000200static void addToHHW ( HashHW* h, HWord key, HWord val )
sewardja1a370f2004-08-17 13:31:55 +0000201{
202 Int i, j;
sewardj08210532004-12-29 17:09:11 +0000203 /* vex_printf("addToHHW(%llx, %llx)\n", key, val); */
sewardja1a370f2004-08-17 13:31:55 +0000204
205 /* Find and replace existing binding, if any. */
206 for (i = 0; i < h->used; i++) {
207 if (h->inuse[i] && h->key[i] == key) {
208 h->val[i] = val;
209 return;
210 }
211 }
212
213 /* Ensure a space is available. */
214 if (h->used == h->size) {
215 /* Copy into arrays twice the size. */
216 Bool* inuse2 = LibVEX_Alloc(2 * h->size * sizeof(Bool));
sewardjea602bc2004-10-14 21:40:12 +0000217 HWord* key2 = LibVEX_Alloc(2 * h->size * sizeof(HWord));
218 HWord* val2 = LibVEX_Alloc(2 * h->size * sizeof(HWord));
sewardja1a370f2004-08-17 13:31:55 +0000219 for (i = j = 0; i < h->size; i++) {
220 if (!h->inuse[i]) continue;
221 inuse2[j] = True;
222 key2[j] = h->key[i];
223 val2[j] = h->val[i];
224 j++;
225 }
226 h->used = j;
227 h->size *= 2;
228 h->inuse = inuse2;
229 h->key = key2;
230 h->val = val2;
231 }
232
233 /* Finally, add it. */
234 vassert(h->used < h->size);
235 h->inuse[h->used] = True;
236 h->key[h->used] = key;
sewardj84be7372004-08-18 13:59:33 +0000237 h->val[h->used] = val;
sewardja1a370f2004-08-17 13:31:55 +0000238 h->used++;
239}
240
sewardj84be7372004-08-18 13:59:33 +0000241
sewardjd7cb8532004-08-17 23:59:23 +0000242/*---------------------------------------------------------------*/
sewardj08210532004-12-29 17:09:11 +0000243/*--- Flattening out a BB into atomic SSA form ---*/
sewardjd7cb8532004-08-17 23:59:23 +0000244/*---------------------------------------------------------------*/
245
sewardje80679a2004-09-21 23:00:11 +0000246/* Non-critical helper, heuristic for reducing the number of tmp-tmp
247 copies made by flattening. If in doubt return False. */
248
249static Bool isFlat ( IRExpr* e )
250{
sewardj695cff92004-10-13 14:50:14 +0000251 if (e->tag == Iex_Get)
252 return True;
sewardje80679a2004-09-21 23:00:11 +0000253 if (e->tag == Iex_Binop)
sewardj496a58d2005-03-20 18:44:44 +0000254 return toBool( isIRAtom(e->Iex.Binop.arg1)
255 && isIRAtom(e->Iex.Binop.arg2) );
sewardje80679a2004-09-21 23:00:11 +0000256 if (e->tag == Iex_LDle)
sewardj496a58d2005-03-20 18:44:44 +0000257 return isIRAtom(e->Iex.LDle.addr);
sewardje80679a2004-09-21 23:00:11 +0000258 return False;
259}
260
sewardjd7cb8532004-08-17 23:59:23 +0000261/* Flatten out 'ex' so it is atomic, returning a new expression with
262 the same value, after having appended extra IRTemp assignments to
263 the end of 'bb'. */
264
265static IRExpr* flatten_Expr ( IRBB* bb, IRExpr* ex )
266{
267 Int i;
268 IRExpr** newargs;
269 IRType ty = typeOfIRExpr(bb->tyenv, ex);
270 IRTemp t1;
271
272 switch (ex->tag) {
273
sewardjd7217032004-08-19 10:49:10 +0000274 case Iex_GetI:
275 t1 = newIRTemp(bb->tyenv, ty);
276 addStmtToIRBB(bb, IRStmt_Tmp(t1,
sewardj2d3f77c2004-09-22 23:49:09 +0000277 IRExpr_GetI(ex->Iex.GetI.descr,
sewardjeeac8412004-11-02 00:26:55 +0000278 flatten_Expr(bb, ex->Iex.GetI.ix),
sewardj2d3f77c2004-09-22 23:49:09 +0000279 ex->Iex.GetI.bias)));
sewardjd7217032004-08-19 10:49:10 +0000280 return IRExpr_Tmp(t1);
281
sewardjd7cb8532004-08-17 23:59:23 +0000282 case Iex_Get:
283 t1 = newIRTemp(bb->tyenv, ty);
284 addStmtToIRBB(bb,
285 IRStmt_Tmp(t1, ex));
286 return IRExpr_Tmp(t1);
287
288 case Iex_Binop:
289 t1 = newIRTemp(bb->tyenv, ty);
290 addStmtToIRBB(bb, IRStmt_Tmp(t1,
291 IRExpr_Binop(ex->Iex.Binop.op,
292 flatten_Expr(bb, ex->Iex.Binop.arg1),
293 flatten_Expr(bb, ex->Iex.Binop.arg2))));
294 return IRExpr_Tmp(t1);
295
296 case Iex_Unop:
297 t1 = newIRTemp(bb->tyenv, ty);
298 addStmtToIRBB(bb, IRStmt_Tmp(t1,
299 IRExpr_Unop(ex->Iex.Unop.op,
300 flatten_Expr(bb, ex->Iex.Unop.arg))));
301 return IRExpr_Tmp(t1);
302
303 case Iex_LDle:
304 t1 = newIRTemp(bb->tyenv, ty);
305 addStmtToIRBB(bb, IRStmt_Tmp(t1,
306 IRExpr_LDle(ex->Iex.LDle.ty,
307 flatten_Expr(bb, ex->Iex.LDle.addr))));
308 return IRExpr_Tmp(t1);
309
310 case Iex_CCall:
sewardj695cff92004-10-13 14:50:14 +0000311 newargs = sopyIRExprVec(ex->Iex.CCall.args);
sewardjd7cb8532004-08-17 23:59:23 +0000312 for (i = 0; newargs[i]; i++)
313 newargs[i] = flatten_Expr(bb, newargs[i]);
314 t1 = newIRTemp(bb->tyenv, ty);
315 addStmtToIRBB(bb, IRStmt_Tmp(t1,
sewardj8ea867b2004-10-30 19:03:02 +0000316 IRExpr_CCall(ex->Iex.CCall.cee,
sewardjd7cb8532004-08-17 23:59:23 +0000317 ex->Iex.CCall.retty,
318 newargs)));
319 return IRExpr_Tmp(t1);
320
321 case Iex_Mux0X:
322 t1 = newIRTemp(bb->tyenv, ty);
323 addStmtToIRBB(bb, IRStmt_Tmp(t1,
324 IRExpr_Mux0X(flatten_Expr(bb, ex->Iex.Mux0X.cond),
325 flatten_Expr(bb, ex->Iex.Mux0X.expr0),
326 flatten_Expr(bb, ex->Iex.Mux0X.exprX))));
327 return IRExpr_Tmp(t1);
328
329 case Iex_Const:
sewardjc0b42282004-10-12 13:44:12 +0000330 /* Lift F64i constants out onto temps so they can be CSEd
331 later. */
332 if (ex->Iex.Const.con->tag == Ico_F64i) {
333 t1 = newIRTemp(bb->tyenv, ty);
334 addStmtToIRBB(bb, IRStmt_Tmp(t1,
335 IRExpr_Const(ex->Iex.Const.con)));
336 return IRExpr_Tmp(t1);
337 } else {
338 /* Leave all other constants alone. */
339 return ex;
340 }
341
sewardjd7cb8532004-08-17 23:59:23 +0000342 case Iex_Tmp:
343 return ex;
344
345 default:
346 vex_printf("\n");
347 ppIRExpr(ex);
348 vex_printf("\n");
349 vpanic("flatten_Expr");
350 }
351}
352
353
354/* Append a completely flattened form of 'st' to the end of 'bb'. */
355
356static void flatten_Stmt ( IRBB* bb, IRStmt* st )
357{
sewardj17442fe2004-09-20 14:54:28 +0000358 Int i;
359 IRExpr *e1, *e2;
360 IRDirty *d, *d2;
sewardjd7cb8532004-08-17 23:59:23 +0000361 switch (st->tag) {
362 case Ist_Put:
sewardj496a58d2005-03-20 18:44:44 +0000363 if (isIRAtom(st->Ist.Put.data)) {
sewardj49651f42004-10-28 22:11:04 +0000364 /* optimisation to reduce the amount of heap wasted
365 by the flattener */
366 addStmtToIRBB(bb, st);
367 } else {
368 /* general case, always correct */
369 e1 = flatten_Expr(bb, st->Ist.Put.data);
370 addStmtToIRBB(bb, IRStmt_Put(st->Ist.Put.offset, e1));
371 }
sewardjd7cb8532004-08-17 23:59:23 +0000372 break;
sewardjd7cb8532004-08-17 23:59:23 +0000373 case Ist_PutI:
sewardjeeac8412004-11-02 00:26:55 +0000374 e1 = flatten_Expr(bb, st->Ist.PutI.ix);
sewardj2d3f77c2004-09-22 23:49:09 +0000375 e2 = flatten_Expr(bb, st->Ist.PutI.data);
376 addStmtToIRBB(bb, IRStmt_PutI(st->Ist.PutI.descr,
377 e1,
378 st->Ist.PutI.bias,
379 e2));
sewardjd7217032004-08-19 10:49:10 +0000380 break;
sewardjd7cb8532004-08-17 23:59:23 +0000381 case Ist_Tmp:
sewardj6d076362004-09-23 11:06:17 +0000382 if (isFlat(st->Ist.Tmp.data)) {
sewardje80679a2004-09-21 23:00:11 +0000383 /* optimisation, to reduce the number of tmp-tmp
384 copies generated */
385 addStmtToIRBB(bb, st);
386 } else {
387 /* general case, always correct */
sewardj6d076362004-09-23 11:06:17 +0000388 e1 = flatten_Expr(bb, st->Ist.Tmp.data);
sewardje80679a2004-09-21 23:00:11 +0000389 addStmtToIRBB(bb, IRStmt_Tmp(st->Ist.Tmp.tmp, e1));
390 }
sewardjd7cb8532004-08-17 23:59:23 +0000391 break;
392 case Ist_STle:
393 e1 = flatten_Expr(bb, st->Ist.STle.addr);
394 e2 = flatten_Expr(bb, st->Ist.STle.data);
395 addStmtToIRBB(bb, IRStmt_STle(e1,e2));
396 break;
sewardj17442fe2004-09-20 14:54:28 +0000397 case Ist_Dirty:
398 d = st->Ist.Dirty.details;
399 d2 = emptyIRDirty();
400 *d2 = *d;
sewardj695cff92004-10-13 14:50:14 +0000401 d2->args = sopyIRExprVec(d2->args);
sewardj17442fe2004-09-20 14:54:28 +0000402 if (d2->mFx != Ifx_None) {
403 d2->mAddr = flatten_Expr(bb, d2->mAddr);
404 } else {
405 vassert(d2->mAddr == NULL);
406 }
sewardjb8385d82004-11-02 01:34:15 +0000407 d2->guard = flatten_Expr(bb, d2->guard);
sewardj17442fe2004-09-20 14:54:28 +0000408 for (i = 0; d2->args[i]; i++)
409 d2->args[i] = flatten_Expr(bb, d2->args[i]);
410 addStmtToIRBB(bb, IRStmt_Dirty(d2));
411 break;
sewardjd2445f62005-03-21 00:15:53 +0000412 case Ist_NoOp:
sewardj3e838932005-01-07 12:09:15 +0000413 case Ist_MFence:
sewardjd2445f62005-03-21 00:15:53 +0000414 case Ist_IMark:
sewardj3e838932005-01-07 12:09:15 +0000415 addStmtToIRBB(bb, st);
416 break;
sewardjd7cb8532004-08-17 23:59:23 +0000417 case Ist_Exit:
sewardj0276d4b2004-11-15 15:30:21 +0000418 e1 = flatten_Expr(bb, st->Ist.Exit.guard);
sewardj893aada2004-11-29 19:57:54 +0000419 addStmtToIRBB(bb, IRStmt_Exit(e1, st->Ist.Exit.jk,
420 st->Ist.Exit.dst));
sewardjd7cb8532004-08-17 23:59:23 +0000421 break;
422 default:
423 vex_printf("\n");
424 ppIRStmt(st);
425 vex_printf("\n");
426 vpanic("flatten_Stmt");
427 }
428}
429
sewardj08210532004-12-29 17:09:11 +0000430
sewardjd7cb8532004-08-17 23:59:23 +0000431static IRBB* flatten_BB ( IRBB* in )
432{
433 Int i;
434 IRBB* out;
435 out = emptyIRBB();
sewardj695cff92004-10-13 14:50:14 +0000436 out->tyenv = dopyIRTypeEnv( in->tyenv );
sewardjd7cb8532004-08-17 23:59:23 +0000437 for (i = 0; i < in->stmts_used; i++)
sewardj4345f7a2004-09-22 19:49:27 +0000438 if (in->stmts[i])
439 flatten_Stmt( out, in->stmts[i] );
sewardjd7cb8532004-08-17 23:59:23 +0000440 out->next = flatten_Expr( out, in->next );
441 out->jumpkind = in->jumpkind;
442 return out;
443}
444
sewardjedf4d692004-08-17 13:52:58 +0000445
sewardj08210532004-12-29 17:09:11 +0000446/*---------------------------------------------------------------*/
447/*--- In-place removal of redundant GETs ---*/
448/*---------------------------------------------------------------*/
449
450/* Scan forwards, building up an environment binding (min offset, max
451 offset) pairs to values, which will either be temps or constants.
452
453 On seeing 't = Get(minoff,maxoff)', look up (minoff,maxoff) in the
454 env and if it matches, replace the Get with the stored value. If
455 there is no match, add a (minoff,maxoff) :-> t binding.
456
457 On seeing 'Put (minoff,maxoff) = t or c', first remove in the env
458 any binding which fully or partially overlaps with (minoff,maxoff).
459 Then add a new (minoff,maxoff) :-> t or c binding. */
460
461/* Extract the min/max offsets from a guest state array descriptor. */
462
463inline
464static void getArrayBounds ( IRArray* descr, UInt* minoff, UInt* maxoff )
465{
466 *minoff = descr->base;
467 *maxoff = *minoff + descr->nElems*sizeofIRType(descr->elemTy) - 1;
468 vassert((*minoff & ~0xFFFF) == 0);
469 vassert((*maxoff & ~0xFFFF) == 0);
470 vassert(*minoff <= *maxoff);
471}
472
473/* Create keys, of the form ((minoffset << 16) | maxoffset). */
474
475static UInt mk_key_GetPut ( Int offset, IRType ty )
476{
477 /* offset should fit in 16 bits. */
478 UInt minoff = offset;
479 UInt maxoff = minoff + sizeofIRType(ty) - 1;
480 vassert((minoff & ~0xFFFF) == 0);
481 vassert((maxoff & ~0xFFFF) == 0);
482 return (minoff << 16) | maxoff;
483}
484
485static UInt mk_key_GetIPutI ( IRArray* descr )
486{
487 UInt minoff, maxoff;
488 getArrayBounds( descr, &minoff, &maxoff );
489 vassert((minoff & ~0xFFFF) == 0);
490 vassert((maxoff & ~0xFFFF) == 0);
491 return (minoff << 16) | maxoff;
492}
493
494/* Supposing h has keys of the form generated by mk_key_GetPut and
495 mk_key_GetIPutI, invalidate any key which overlaps (k_lo
496 .. k_hi).
497*/
498static void invalidateOverlaps ( HashHW* h, UInt k_lo, UInt k_hi )
499{
500 Int j;
501 UInt e_lo, e_hi;
502 vassert(k_lo <= k_hi);
503 /* invalidate any env entries which in any way overlap (k_lo
504 .. k_hi) */
505 /* vex_printf("invalidate %d .. %d\n", k_lo, k_hi ); */
506
507 for (j = 0; j < h->used; j++) {
508 if (!h->inuse[j])
509 continue;
510 e_lo = (((UInt)h->key[j]) >> 16) & 0xFFFF;
511 e_hi = ((UInt)h->key[j]) & 0xFFFF;
512 vassert(e_lo <= e_hi);
513 if (e_hi < k_lo || k_hi < e_lo)
514 continue; /* no overlap possible */
515 else
516 /* overlap; invalidate */
517 h->inuse[j] = False;
518 }
519}
520
521
522static void redundant_get_removal_BB ( IRBB* bb )
523{
524 HashHW* env = newHHW();
525 UInt key = 0; /* keep gcc -O happy */
526 Int i, j;
527 HWord val;
528
529 for (i = 0; i < bb->stmts_used; i++) {
530 IRStmt* st = bb->stmts[i];
531
sewardj8bee6d12005-03-22 02:24:05 +0000532 if (st->tag == Ist_NoOp)
sewardj08210532004-12-29 17:09:11 +0000533 continue;
534
535 /* Deal with Gets */
536 if (st->tag == Ist_Tmp
537 && st->Ist.Tmp.data->tag == Iex_Get) {
538 /* st is 't = Get(...)'. Look up in the environment and see
539 if the Get can be replaced. */
540 IRExpr* get = st->Ist.Tmp.data;
541 key = (HWord)mk_key_GetPut( get->Iex.Get.offset,
542 get->Iex.Get.ty );
543 if (lookupHHW(env, &val, (HWord)key)) {
544 /* found it */
545 /* Note, we could do better here. If the types are
546 different we don't do the substitution, since doing so
547 could lead to invalidly-typed IR. An improvement would
548 be to stick in a reinterpret-style cast, although that
549 would make maintaining flatness more difficult. */
550 IRExpr* valE = (IRExpr*)val;
sewardj9d2e7692005-02-07 01:11:31 +0000551 Bool typesOK = toBool( typeOfIRExpr(bb->tyenv,valE)
552 == st->Ist.Tmp.data->Iex.Get.ty );
sewardj08210532004-12-29 17:09:11 +0000553 if (typesOK && DEBUG_IROPT) {
554 vex_printf("rGET: "); ppIRExpr(get);
555 vex_printf(" -> "); ppIRExpr(valE);
556 vex_printf("\n");
557 }
558 if (typesOK)
559 bb->stmts[i] = IRStmt_Tmp(st->Ist.Tmp.tmp, valE);
560 } else {
561 /* Not found, but at least we know that t and the Get(...)
562 are now associated. So add a binding to reflect that
563 fact. */
564 addToHHW( env, (HWord)key,
sewardj59c07782005-01-21 21:23:07 +0000565 (HWord)(void*)(IRExpr_Tmp(st->Ist.Tmp.tmp)) );
sewardj08210532004-12-29 17:09:11 +0000566 }
567 }
568
569 /* Deal with Puts: invalidate any env entries overlapped by this
570 Put */
571 if (st->tag == Ist_Put || st->tag == Ist_PutI) {
572 UInt k_lo, k_hi;
573 if (st->tag == Ist_Put) {
574 key = mk_key_GetPut( st->Ist.Put.offset,
575 typeOfIRExpr(bb->tyenv,st->Ist.Put.data) );
576 } else {
577 vassert(st->tag == Ist_PutI);
578 key = mk_key_GetIPutI( st->Ist.PutI.descr );
579 }
580
581 k_lo = (key >> 16) & 0xFFFF;
582 k_hi = key & 0xFFFF;
583 invalidateOverlaps(env, k_lo, k_hi);
584 }
585 else
586 if (st->tag == Ist_Dirty) {
587 /* Deal with dirty helpers which write or modify guest state.
588 Invalidate the entire env. We could do a lot better
589 here. */
590 IRDirty* d = st->Ist.Dirty.details;
591 Bool writes = False;
592 for (j = 0; j < d->nFxState; j++) {
593 if (d->fxState[j].fx == Ifx_Modify
594 || d->fxState[j].fx == Ifx_Write)
595 writes = True;
596 }
597 if (writes) {
598 /* dump the entire env (not clever, but correct ...) */
599 for (j = 0; j < env->used; j++)
600 env->inuse[j] = False;
601 if (0) vex_printf("rGET: trash env due to dirty helper\n");
602 }
603 }
604
605 /* add this one to the env, if appropriate */
606 if (st->tag == Ist_Put) {
sewardj496a58d2005-03-20 18:44:44 +0000607 vassert(isIRAtom(st->Ist.Put.data));
sewardj08210532004-12-29 17:09:11 +0000608 addToHHW( env, (HWord)key, (HWord)(st->Ist.Put.data));
609 }
610
611 } /* for (i = 0; i < bb->stmts_used; i++) */
612
613}
614
615
616/*---------------------------------------------------------------*/
617/*--- In-place removal of redundant PUTs ---*/
618/*---------------------------------------------------------------*/
619
620/* Find any Get uses in st and invalidate any partially or fully
621 overlapping ranges listed in env. Due to the flattening phase, the
622 only stmt kind we expect to find a Get on is IRStmt_Tmp. */
623
624static void handle_gets_Stmt (
625 HashHW* env,
626 IRStmt* st,
627 Bool (*preciseMemExnsFn)(Int,Int)
628 )
629{
630 Int j;
631 UInt key = 0; /* keep gcc -O happy */
632 Bool isGet;
633 Bool memRW = False;
634 IRExpr* e;
635
636 switch (st->tag) {
637
638 /* This is the only interesting case. Deal with Gets in the RHS
639 expression. */
640 case Ist_Tmp:
641 e = st->Ist.Tmp.data;
642 switch (e->tag) {
643 case Iex_Get:
644 isGet = True;
645 key = mk_key_GetPut ( e->Iex.Get.offset, e->Iex.Get.ty );
646 break;
647 case Iex_GetI:
648 isGet = True;
649 key = mk_key_GetIPutI ( e->Iex.GetI.descr );
650 break;
651 case Iex_LDle:
652 isGet = False;
653 memRW = True;
654 break;
655 default:
656 isGet = False;
657 }
658 if (isGet) {
659 UInt k_lo, k_hi;
660 k_lo = (key >> 16) & 0xFFFF;
661 k_hi = key & 0xFFFF;
662 invalidateOverlaps(env, k_lo, k_hi);
663 }
664 break;
665
666 /* Be very conservative for dirty helper calls; dump the entire
667 environment. The helper might read guest state, in which
668 case it needs to be flushed first. Also, the helper might
669 access guest memory, in which case all parts of the guest
670 state requiring precise exceptions needs to be flushed. The
671 crude solution is just to flush everything; we could easily
672 enough do a lot better if needed. */
sewardj3e838932005-01-07 12:09:15 +0000673 /* Probably also overly-conservative, but also dump everything
674 if we hit a memory fence. */
675 case Ist_MFence:
sewardj08210532004-12-29 17:09:11 +0000676 case Ist_Dirty:
677 for (j = 0; j < env->used; j++)
678 env->inuse[j] = False;
679 break;
680
681 /* all other cases are boring. */
682 case Ist_STle:
sewardj496a58d2005-03-20 18:44:44 +0000683 vassert(isIRAtom(st->Ist.STle.addr));
684 vassert(isIRAtom(st->Ist.STle.data));
sewardj08210532004-12-29 17:09:11 +0000685 memRW = True;
686 break;
687
688 case Ist_Exit:
sewardj496a58d2005-03-20 18:44:44 +0000689 vassert(isIRAtom(st->Ist.Exit.guard));
sewardj08210532004-12-29 17:09:11 +0000690 break;
691
692 case Ist_PutI:
sewardj496a58d2005-03-20 18:44:44 +0000693 vassert(isIRAtom(st->Ist.PutI.ix));
694 vassert(isIRAtom(st->Ist.PutI.data));
sewardj08210532004-12-29 17:09:11 +0000695 break;
696
sewardjd2445f62005-03-21 00:15:53 +0000697 case Ist_NoOp:
sewardjf1689312005-03-16 18:19:10 +0000698 case Ist_IMark:
699 break;
700
sewardj08210532004-12-29 17:09:11 +0000701 default:
702 vex_printf("\n");
703 ppIRStmt(st);
704 vex_printf("\n");
705 vpanic("handle_gets_Stmt");
706 }
707
708 if (memRW) {
709 /* This statement accesses memory. So we need to dump all parts
710 of the environment corresponding to guest state that may not
711 be reordered with respect to memory references. That means
712 at least the stack pointer. */
713 for (j = 0; j < env->used; j++) {
714 if (!env->inuse[j])
715 continue;
716 if (vex_control.iropt_precise_memory_exns) {
717 /* Precise exceptions required. Flush all guest state. */
718 env->inuse[j] = False;
719 } else {
720 /* Just flush the minimal amount required, as computed by
721 preciseMemExnsFn. */
722 HWord k_lo = (env->key[j] >> 16) & 0xFFFF;
723 HWord k_hi = env->key[j] & 0xFFFF;
724 if (preciseMemExnsFn( k_lo, k_hi ))
725 env->inuse[j] = False;
726 }
727 }
728 } /* if (memRW) */
729
730}
731
732
733/* Scan backwards, building up a set of (min offset, max
734 offset) pairs, indicating those parts of the guest state
735 for which the next event is a write.
736
737 On seeing a conditional exit, empty the set.
738
739 On seeing 'Put (minoff,maxoff) = t or c', if (minoff,maxoff) is
740 completely within the set, remove the Put. Otherwise, add
741 (minoff,maxoff) to the set.
742
743 On seeing 'Get (minoff,maxoff)', remove any part of the set
sewardj98430292004-12-29 17:34:50 +0000744 overlapping (minoff,maxoff). The same has to happen for any events
745 which implicitly read parts of the guest state: dirty helper calls
746 and loads/stores.
sewardj08210532004-12-29 17:09:11 +0000747*/
748
749static void redundant_put_removal_BB (
750 IRBB* bb,
751 Bool (*preciseMemExnsFn)(Int,Int)
752 )
753{
754 Int i, j;
755 Bool isPut;
756 IRStmt* st;
757 UInt key = 0; /* keep gcc -O happy */
758
759 HashHW* env = newHHW();
760 for (i = bb->stmts_used-1; i >= 0; i--) {
761 st = bb->stmts[i];
sewardj8bee6d12005-03-22 02:24:05 +0000762
763 if (st->tag == Ist_NoOp)
sewardj08210532004-12-29 17:09:11 +0000764 continue;
765
766 /* Deal with conditional exits. */
767 if (st->tag == Ist_Exit) {
768 /* Since control may not get beyond this point, we must empty
769 out the set, since we can no longer claim that the next
770 event for any part of the guest state is definitely a
771 write. */
sewardj496a58d2005-03-20 18:44:44 +0000772 vassert(isIRAtom(st->Ist.Exit.guard));
sewardj08210532004-12-29 17:09:11 +0000773 for (j = 0; j < env->used; j++)
774 env->inuse[j] = False;
775 continue;
776 }
777
778 /* Deal with Puts */
779 switch (st->tag) {
780 case Ist_Put:
781 isPut = True;
782 key = mk_key_GetPut( st->Ist.Put.offset,
783 typeOfIRExpr(bb->tyenv,st->Ist.Put.data) );
sewardj496a58d2005-03-20 18:44:44 +0000784 vassert(isIRAtom(st->Ist.Put.data));
sewardj08210532004-12-29 17:09:11 +0000785 break;
786 case Ist_PutI:
787 isPut = True;
788 key = mk_key_GetIPutI( st->Ist.PutI.descr );
sewardj496a58d2005-03-20 18:44:44 +0000789 vassert(isIRAtom(st->Ist.PutI.ix));
790 vassert(isIRAtom(st->Ist.PutI.data));
sewardj08210532004-12-29 17:09:11 +0000791 break;
792 default:
793 isPut = False;
794 }
795 if (isPut && st->tag != Ist_PutI) {
796 /* See if any single entry in env overlaps this Put. This is
797 simplistic in that the transformation is valid if, say, two
798 or more entries in the env overlap this Put, but the use of
799 lookupHHW will only find a single entry which exactly
800 overlaps this Put. This is suboptimal but safe. */
801 if (lookupHHW(env, NULL, (HWord)key)) {
802 /* This Put is redundant because a later one will overwrite
803 it. So NULL (nop) it out. */
804 if (DEBUG_IROPT) {
805 vex_printf("rPUT: "); ppIRStmt(st);
806 vex_printf("\n");
807 }
sewardjd2445f62005-03-21 00:15:53 +0000808 bb->stmts[i] = IRStmt_NoOp();
sewardj08210532004-12-29 17:09:11 +0000809 } else {
810 /* We can't demonstrate that this Put is redundant, so add it
811 to the running collection. */
812 addToHHW(env, (HWord)key, 0);
813 }
814 continue;
815 }
816
817 /* Deal with Gets. These remove bits of the environment since
818 appearance of a Get means that the next event for that slice
sewardj98430292004-12-29 17:34:50 +0000819 of the guest state is no longer a write, but a read. Also
820 deals with implicit reads of guest state needed to maintain
821 precise exceptions. */
sewardj08210532004-12-29 17:09:11 +0000822 handle_gets_Stmt( env, st, preciseMemExnsFn );
823 }
824}
825
sewardj84be7372004-08-18 13:59:33 +0000826
827/*---------------------------------------------------------------*/
828/*--- Constant propagation and folding ---*/
829/*---------------------------------------------------------------*/
830
sewardj62617ef2004-10-13 23:29:22 +0000831/* The env in this section is a map from IRTemp to IRExpr*,
832 that is, an array indexed by IRTemp. */
sewardjf6501992004-08-27 11:58:24 +0000833
sewardjf6729012004-08-25 12:45:13 +0000834/* Are both expressions simply the same IRTemp ? */
835static Bool sameIRTemps ( IRExpr* e1, IRExpr* e2 )
836{
sewardj9d2e7692005-02-07 01:11:31 +0000837 return toBool( e1->tag == Iex_Tmp
838 && e2->tag == Iex_Tmp
839 && e1->Iex.Tmp.tmp == e2->Iex.Tmp.tmp );
sewardjf6729012004-08-25 12:45:13 +0000840}
841
sewardje1d45da2004-11-12 00:13:21 +0000842static Bool notBool ( Bool b )
843{
844 if (b == True) return False;
845 if (b == False) return True;
846 vpanic("notBool");
847}
848
sewardj0033ddc2005-04-26 23:34:34 +0000849/* Make a zero which has the same type as the result of the given
850 primop. */
851static IRExpr* mkZeroForXor ( IROp op )
852{
853 switch (op) {
854 case Iop_Xor8: return IRExpr_Const(IRConst_U8(0));
855 case Iop_Xor16: return IRExpr_Const(IRConst_U16(0));
856 case Iop_Xor32: return IRExpr_Const(IRConst_U32(0));
857 case Iop_Xor64: return IRExpr_Const(IRConst_U64(0));
858 default: vpanic("mkZeroForXor: bad primop");
859 }
860}
861
862
sewardj84be7372004-08-18 13:59:33 +0000863static IRExpr* fold_Expr ( IRExpr* e )
864{
sewardj278c44c2004-08-20 00:28:13 +0000865 Int shift;
sewardj84be7372004-08-18 13:59:33 +0000866 IRExpr* e2 = e; /* e2 is the result of folding e, if possible */
867
868 /* UNARY ops */
869 if (e->tag == Iex_Unop
870 && e->Iex.Unop.arg->tag == Iex_Const) {
871 switch (e->Iex.Unop.op) {
sewardjae27ab62004-10-15 21:21:46 +0000872 case Iop_1Uto8:
sewardj9d2e7692005-02-07 01:11:31 +0000873 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardjba999312004-11-15 15:21:17 +0000874 e->Iex.Unop.arg->Iex.Const.con->Ico.U1
sewardj9d2e7692005-02-07 01:11:31 +0000875 ? 1 : 0)));
sewardjae27ab62004-10-15 21:21:46 +0000876 break;
sewardjf4a821d2004-10-09 00:58:19 +0000877 case Iop_1Uto32:
878 e2 = IRExpr_Const(IRConst_U32(
sewardjba999312004-11-15 15:21:17 +0000879 e->Iex.Unop.arg->Iex.Const.con->Ico.U1
sewardjf4a821d2004-10-09 00:58:19 +0000880 ? 1 : 0));
881 break;
sewardje6b39932004-11-06 17:01:15 +0000882
sewardjd9997882004-11-04 19:42:59 +0000883 case Iop_1Sto32:
884 e2 = IRExpr_Const(IRConst_U32(
sewardjba999312004-11-15 15:21:17 +0000885 e->Iex.Unop.arg->Iex.Const.con->Ico.U1
sewardjd9997882004-11-04 19:42:59 +0000886 ? 0xFFFFFFFF : 0));
887 break;
sewardje6b39932004-11-06 17:01:15 +0000888 case Iop_1Sto64:
889 e2 = IRExpr_Const(IRConst_U64(
sewardjba999312004-11-15 15:21:17 +0000890 e->Iex.Unop.arg->Iex.Const.con->Ico.U1
sewardje6b39932004-11-06 17:01:15 +0000891 ? 0xFFFFFFFFFFFFFFFFULL : 0));
892 break;
893
sewardj883b00b2004-09-11 09:30:24 +0000894 case Iop_8Sto32: {
895 /* signed */ Int s32 = e->Iex.Unop.arg->Iex.Const.con->Ico.U8;
896 s32 <<= 24;
897 s32 >>= 24;
898 e2 = IRExpr_Const(IRConst_U32((UInt)s32));
899 break;
900 }
sewardj291a7e82005-04-27 11:42:44 +0000901 case Iop_8Uto64:
902 e2 = IRExpr_Const(IRConst_U64(
903 0xFFULL & e->Iex.Unop.arg->Iex.Const.con->Ico.U8));
904 break;
905 case Iop_16Uto64:
906 e2 = IRExpr_Const(IRConst_U64(
907 0xFFFFULL & e->Iex.Unop.arg->Iex.Const.con->Ico.U16));
908 break;
sewardj84be7372004-08-18 13:59:33 +0000909 case Iop_8Uto32:
910 e2 = IRExpr_Const(IRConst_U32(
911 0xFF & e->Iex.Unop.arg->Iex.Const.con->Ico.U8));
912 break;
913 case Iop_16Uto32:
914 e2 = IRExpr_Const(IRConst_U32(
915 0xFFFF & e->Iex.Unop.arg->Iex.Const.con->Ico.U16));
916 break;
sewardj73017432004-10-14 19:33:25 +0000917 case Iop_32to16:
sewardj9d2e7692005-02-07 01:11:31 +0000918 e2 = IRExpr_Const(IRConst_U16(toUShort(
919 0xFFFF & e->Iex.Unop.arg->Iex.Const.con->Ico.U32)));
sewardj73017432004-10-14 19:33:25 +0000920 break;
sewardj4345f7a2004-09-22 19:49:27 +0000921 case Iop_32to8:
sewardj9d2e7692005-02-07 01:11:31 +0000922 e2 = IRExpr_Const(IRConst_U8(toUChar(
923 0xFF & e->Iex.Unop.arg->Iex.Const.con->Ico.U32)));
sewardj4345f7a2004-09-22 19:49:27 +0000924 break;
sewardj7447b5b2004-10-16 11:30:42 +0000925 case Iop_32to1:
sewardj9d2e7692005-02-07 01:11:31 +0000926 e2 = IRExpr_Const(IRConst_U1(toBool(
927 1 == (1 & e->Iex.Unop.arg->Iex.Const.con->Ico.U32)
928 )));
sewardj7447b5b2004-10-16 11:30:42 +0000929 break;
sewardj291a7e82005-04-27 11:42:44 +0000930 case Iop_64to1:
931 e2 = IRExpr_Const(IRConst_U1(toBool(
932 1 == (1 & e->Iex.Unop.arg->Iex.Const.con->Ico.U64)
933 )));
934 break;
sewardje6b39932004-11-06 17:01:15 +0000935
sewardjf057afb2005-02-27 13:35:41 +0000936 case Iop_Not64:
937 e2 = IRExpr_Const(IRConst_U64(
938 ~ (e->Iex.Unop.arg->Iex.Const.con->Ico.U64)));
939 break;
sewardj883b00b2004-09-11 09:30:24 +0000940 case Iop_Not32:
941 e2 = IRExpr_Const(IRConst_U32(
942 ~ (e->Iex.Unop.arg->Iex.Const.con->Ico.U32)));
943 break;
sewardje6b39932004-11-06 17:01:15 +0000944 case Iop_Not16:
sewardj9d2e7692005-02-07 01:11:31 +0000945 e2 = IRExpr_Const(IRConst_U16(toUShort(
946 ~ (e->Iex.Unop.arg->Iex.Const.con->Ico.U16))));
sewardje6b39932004-11-06 17:01:15 +0000947 break;
sewardjd9997882004-11-04 19:42:59 +0000948 case Iop_Not8:
sewardj9d2e7692005-02-07 01:11:31 +0000949 e2 = IRExpr_Const(IRConst_U8(toUChar(
950 ~ (e->Iex.Unop.arg->Iex.Const.con->Ico.U8))));
sewardjd9997882004-11-04 19:42:59 +0000951 break;
sewardje6b39932004-11-06 17:01:15 +0000952
sewardje1d45da2004-11-12 00:13:21 +0000953 case Iop_Not1:
sewardjba999312004-11-15 15:21:17 +0000954 e2 = IRExpr_Const(IRConst_U1(
955 notBool(e->Iex.Unop.arg->Iex.Const.con->Ico.U1)));
sewardje1d45da2004-11-12 00:13:21 +0000956 break;
957
sewardj291a7e82005-04-27 11:42:44 +0000958 case Iop_Neg64:
959 e2 = IRExpr_Const(IRConst_U64(
960 - (e->Iex.Unop.arg->Iex.Const.con->Ico.U64)));
961 break;
sewardj0033ddc2005-04-26 23:34:34 +0000962 case Iop_Neg32:
963 e2 = IRExpr_Const(IRConst_U32(
964 - (e->Iex.Unop.arg->Iex.Const.con->Ico.U32)));
965 break;
966 case Iop_Neg8:
sewardj65b17c62005-05-02 15:52:44 +0000967 e2 = IRExpr_Const(IRConst_U8(toUChar(
968 - (e->Iex.Unop.arg->Iex.Const.con->Ico.U8))));
sewardj0033ddc2005-04-26 23:34:34 +0000969 break;
970
sewardj291a7e82005-04-27 11:42:44 +0000971 case Iop_64to8: {
972 ULong w64 = e->Iex.Unop.arg->Iex.Const.con->Ico.U64;
973 w64 &= 0xFFULL;
974 e2 = IRExpr_Const(IRConst_U8( (UChar)w64 ));
975 break;
976 }
977 case Iop_64to16: {
978 ULong w64 = e->Iex.Unop.arg->Iex.Const.con->Ico.U64;
979 w64 &= 0xFFFFULL;
980 e2 = IRExpr_Const(IRConst_U16( (UChar)w64 ));
981 break;
982 }
sewardj1d8ce202004-12-13 14:14:16 +0000983 case Iop_64to32: {
984 ULong w64 = e->Iex.Unop.arg->Iex.Const.con->Ico.U64;
985 w64 &= 0x00000000FFFFFFFFULL;
986 e2 = IRExpr_Const(IRConst_U32( (UInt)w64 ));
sewardj37010592004-12-13 10:47:15 +0000987 break;
sewardj1d8ce202004-12-13 14:14:16 +0000988 }
sewardj1d8ce202004-12-13 14:14:16 +0000989 case Iop_64HIto32: {
990 ULong w64 = e->Iex.Unop.arg->Iex.Const.con->Ico.U64;
991 w64 >>= 32;
992 e2 = IRExpr_Const(IRConst_U32( (UInt)w64 ));
993 break;
994 }
sewardjb5710b82005-01-27 16:05:09 +0000995 case Iop_32Uto64:
996 e2 = IRExpr_Const(IRConst_U64(
997 0xFFFFFFFFULL
998 & e->Iex.Unop.arg->Iex.Const.con->Ico.U32));
999 break;
sewardj37010592004-12-13 10:47:15 +00001000
sewardj0033ddc2005-04-26 23:34:34 +00001001 case Iop_CmpNEZ8:
1002 e2 = IRExpr_Const(IRConst_U1(toBool(
1003 0 !=
1004 (0xFF & e->Iex.Unop.arg->Iex.Const.con->Ico.U8)
1005 )));
1006 break;
1007 case Iop_CmpNEZ32:
1008 e2 = IRExpr_Const(IRConst_U1(toBool(
1009 0 !=
1010 (0xFFFFFFFF & e->Iex.Unop.arg->Iex.Const.con->Ico.U32)
1011 )));
1012 break;
1013 case Iop_CmpNEZ64:
1014 e2 = IRExpr_Const(IRConst_U1(toBool(
1015 0ULL != e->Iex.Unop.arg->Iex.Const.con->Ico.U64
1016 )));
1017 break;
1018
sewardj84be7372004-08-18 13:59:33 +00001019 default:
1020 goto unhandled;
1021 }
1022 }
1023
1024 /* BINARY ops */
1025 if (e->tag == Iex_Binop) {
1026 if (e->Iex.Binop.arg1->tag == Iex_Const
1027 && e->Iex.Binop.arg2->tag == Iex_Const) {
1028 /* cases where both args are consts */
1029 switch (e->Iex.Binop.op) {
sewardje6b39932004-11-06 17:01:15 +00001030
sewardj855dc722005-02-17 09:26:05 +00001031 /* -- Or -- */
sewardjd9997882004-11-04 19:42:59 +00001032 case Iop_Or8:
sewardj9d2e7692005-02-07 01:11:31 +00001033 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardjd9997882004-11-04 19:42:59 +00001034 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U8
sewardj9d2e7692005-02-07 01:11:31 +00001035 | e->Iex.Binop.arg2->Iex.Const.con->Ico.U8))));
sewardjd9997882004-11-04 19:42:59 +00001036 break;
sewardje6b39932004-11-06 17:01:15 +00001037 case Iop_Or16:
sewardj9d2e7692005-02-07 01:11:31 +00001038 e2 = IRExpr_Const(IRConst_U16(toUShort(
sewardje6b39932004-11-06 17:01:15 +00001039 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U16
sewardj9d2e7692005-02-07 01:11:31 +00001040 | e->Iex.Binop.arg2->Iex.Const.con->Ico.U16))));
sewardje6b39932004-11-06 17:01:15 +00001041 break;
1042 case Iop_Or32:
1043 e2 = IRExpr_Const(IRConst_U32(
1044 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1045 | e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
1046 break;
sewardjf057afb2005-02-27 13:35:41 +00001047 case Iop_Or64:
1048 e2 = IRExpr_Const(IRConst_U64(
1049 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
1050 | e->Iex.Binop.arg2->Iex.Const.con->Ico.U64)));
1051 break;
sewardje6b39932004-11-06 17:01:15 +00001052
sewardj855dc722005-02-17 09:26:05 +00001053 /* -- Xor -- */
sewardj883b00b2004-09-11 09:30:24 +00001054 case Iop_Xor8:
sewardj9d2e7692005-02-07 01:11:31 +00001055 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardj883b00b2004-09-11 09:30:24 +00001056 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U8
sewardj9d2e7692005-02-07 01:11:31 +00001057 ^ e->Iex.Binop.arg2->Iex.Const.con->Ico.U8))));
sewardj883b00b2004-09-11 09:30:24 +00001058 break;
sewardj82c9f2f2005-03-02 16:05:13 +00001059 case Iop_Xor16:
sewardjc7c098f2005-03-21 01:06:20 +00001060 e2 = IRExpr_Const(IRConst_U16(toUShort(
sewardj82c9f2f2005-03-02 16:05:13 +00001061 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U16
sewardjc7c098f2005-03-21 01:06:20 +00001062 ^ e->Iex.Binop.arg2->Iex.Const.con->Ico.U16))));
sewardj82c9f2f2005-03-02 16:05:13 +00001063 break;
sewardj855dc722005-02-17 09:26:05 +00001064 case Iop_Xor32:
1065 e2 = IRExpr_Const(IRConst_U32(
1066 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1067 ^ e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
1068 break;
sewardjf057afb2005-02-27 13:35:41 +00001069 case Iop_Xor64:
1070 e2 = IRExpr_Const(IRConst_U64(
1071 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
1072 ^ e->Iex.Binop.arg2->Iex.Const.con->Ico.U64)));
1073 break;
sewardj855dc722005-02-17 09:26:05 +00001074
1075 /* -- And -- */
sewardj84be7372004-08-18 13:59:33 +00001076 case Iop_And8:
sewardj9d2e7692005-02-07 01:11:31 +00001077 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardj84be7372004-08-18 13:59:33 +00001078 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U8
sewardj9d2e7692005-02-07 01:11:31 +00001079 & e->Iex.Binop.arg2->Iex.Const.con->Ico.U8))));
sewardj84be7372004-08-18 13:59:33 +00001080 break;
sewardj855dc722005-02-17 09:26:05 +00001081 case Iop_And32:
1082 e2 = IRExpr_Const(IRConst_U32(
1083 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1084 & e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
1085 break;
1086 case Iop_And64:
1087 e2 = IRExpr_Const(IRConst_U64(
1088 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
1089 & e->Iex.Binop.arg2->Iex.Const.con->Ico.U64)));
1090 break;
1091
1092 /* -- Add -- */
sewardj4345f7a2004-09-22 19:49:27 +00001093 case Iop_Add8:
sewardj9d2e7692005-02-07 01:11:31 +00001094 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardj4345f7a2004-09-22 19:49:27 +00001095 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U8
sewardj9d2e7692005-02-07 01:11:31 +00001096 + e->Iex.Binop.arg2->Iex.Const.con->Ico.U8))));
sewardj4345f7a2004-09-22 19:49:27 +00001097 break;
sewardj855dc722005-02-17 09:26:05 +00001098 case Iop_Add32:
1099 e2 = IRExpr_Const(IRConst_U32(
1100 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1101 + e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
1102 break;
1103 case Iop_Add64:
1104 e2 = IRExpr_Const(IRConst_U64(
1105 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
1106 + e->Iex.Binop.arg2->Iex.Const.con->Ico.U64)));
1107 break;
1108
1109 /* -- Sub -- */
sewardj84be7372004-08-18 13:59:33 +00001110 case Iop_Sub8:
sewardj9d2e7692005-02-07 01:11:31 +00001111 e2 = IRExpr_Const(IRConst_U8(toUChar(
sewardj84be7372004-08-18 13:59:33 +00001112 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U8
sewardj9d2e7692005-02-07 01:11:31 +00001113 - e->Iex.Binop.arg2->Iex.Const.con->Ico.U8))));
sewardj84be7372004-08-18 13:59:33 +00001114 break;
sewardjd7217032004-08-19 10:49:10 +00001115 case Iop_Sub32:
1116 e2 = IRExpr_Const(IRConst_U32(
1117 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1118 - e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
1119 break;
sewardj70a8ddf2005-02-13 02:24:26 +00001120 case Iop_Sub64:
1121 e2 = IRExpr_Const(IRConst_U64(
1122 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
1123 - e->Iex.Binop.arg2->Iex.Const.con->Ico.U64)));
1124 break;
sewardjc2bcb6f2005-02-07 00:17:12 +00001125
sewardj855dc722005-02-17 09:26:05 +00001126 /* -- Mul -- */
sewardjb9c5cf62004-08-24 15:10:38 +00001127 case Iop_Mul32:
1128 e2 = IRExpr_Const(IRConst_U32(
1129 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1130 * e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)));
1131 break;
sewardja34c0712005-03-30 23:19:46 +00001132 case Iop_Mul64:
1133 e2 = IRExpr_Const(IRConst_U64(
1134 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
1135 * e->Iex.Binop.arg2->Iex.Const.con->Ico.U64)));
1136 break;
1137
sewardjea6bccb2005-03-01 10:19:23 +00001138 case Iop_MullS32: {
1139 /* very paranoid */
1140 UInt u32a = e->Iex.Binop.arg1->Iex.Const.con->Ico.U32;
1141 UInt u32b = e->Iex.Binop.arg2->Iex.Const.con->Ico.U32;
1142 Int s32a = (Int)u32a;
1143 Int s32b = (Int)u32b;
1144 Long s64a = (Long)s32a;
1145 Long s64b = (Long)s32b;
1146 Long sres = s64a * s64b;
1147 ULong ures = (ULong)sres;
1148 e2 = IRExpr_Const(IRConst_U64(ures));
1149 break;
1150 }
sewardjb095fba2005-02-13 14:13:04 +00001151
sewardj855dc722005-02-17 09:26:05 +00001152 /* -- Shl -- */
sewardjd7217032004-08-19 10:49:10 +00001153 case Iop_Shl32:
sewardj61348472004-08-20 01:01:04 +00001154 vassert(e->Iex.Binop.arg2->Iex.Const.con->tag == Ico_U8);
1155 shift = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U8);
sewardj29632392004-08-22 02:38:11 +00001156 if (shift >= 0 && shift <= 31)
sewardj278c44c2004-08-20 00:28:13 +00001157 e2 = IRExpr_Const(IRConst_U32(
1158 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
1159 << shift)));
sewardjd7217032004-08-19 10:49:10 +00001160 break;
sewardjb095fba2005-02-13 14:13:04 +00001161 case Iop_Shl64:
1162 vassert(e->Iex.Binop.arg2->Iex.Const.con->tag == Ico_U8);
1163 shift = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U8);
1164 if (shift >= 0 && shift <= 63)
1165 e2 = IRExpr_Const(IRConst_U64(
1166 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
1167 << shift)));
1168 break;
1169
sewardj855dc722005-02-17 09:26:05 +00001170 /* -- Sar -- */
sewardj278c44c2004-08-20 00:28:13 +00001171 case Iop_Sar32: {
1172 /* paranoid ... */
1173 /*signed*/ Int s32;
sewardj61348472004-08-20 01:01:04 +00001174 vassert(e->Iex.Binop.arg2->Iex.Const.con->tag == Ico_U8);
sewardj278c44c2004-08-20 00:28:13 +00001175 s32 = (Int)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32);
sewardj61348472004-08-20 01:01:04 +00001176 shift = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U8);
sewardj278c44c2004-08-20 00:28:13 +00001177 if (shift >= 0 && shift <= 31) {
1178 s32 >>=/*signed*/ shift;
1179 e2 = IRExpr_Const(IRConst_U32((UInt)s32));
1180 }
1181 break;
1182 }
sewardj855dc722005-02-17 09:26:05 +00001183 case Iop_Sar64: {
1184 /* paranoid ... */
1185 /*signed*/ Long s64;
1186 vassert(e->Iex.Binop.arg2->Iex.Const.con->tag == Ico_U8);
1187 s64 = (Long)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U64);
1188 shift = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U8);
1189 if (shift >= 0 && shift <= 63) {
1190 s64 >>=/*signed*/ shift;
1191 e2 = IRExpr_Const(IRConst_U64((ULong)s64));
1192 }
1193 break;
1194 }
1195
1196 /* -- Shr -- */
sewardj61348472004-08-20 01:01:04 +00001197 case Iop_Shr32: {
1198 /* paranoid ... */
sewardj4add7142005-02-21 08:20:22 +00001199 /*unsigned*/ UInt u32;
sewardj61348472004-08-20 01:01:04 +00001200 vassert(e->Iex.Binop.arg2->Iex.Const.con->tag == Ico_U8);
sewardj4add7142005-02-21 08:20:22 +00001201 u32 = (UInt)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32);
sewardj61348472004-08-20 01:01:04 +00001202 shift = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U8);
1203 if (shift >= 0 && shift <= 31) {
sewardj4add7142005-02-21 08:20:22 +00001204 u32 >>=/*unsigned*/ shift;
1205 e2 = IRExpr_Const(IRConst_U32(u32));
1206 }
1207 break;
1208 }
1209 case Iop_Shr64: {
1210 /* paranoid ... */
1211 /*unsigned*/ ULong u64;
1212 vassert(e->Iex.Binop.arg2->Iex.Const.con->tag == Ico_U8);
1213 u64 = (ULong)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U64);
1214 shift = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U8);
1215 if (shift >= 0 && shift <= 63) {
1216 u64 >>=/*unsigned*/ shift;
1217 e2 = IRExpr_Const(IRConst_U64(u64));
sewardj61348472004-08-20 01:01:04 +00001218 }
1219 break;
1220 }
sewardj855dc722005-02-17 09:26:05 +00001221
1222 /* -- CmpEQ -- */
sewardjb8e75862004-08-19 17:58:45 +00001223 case Iop_CmpEQ32:
sewardj9d2e7692005-02-07 01:11:31 +00001224 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardjb8e75862004-08-19 17:58:45 +00001225 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
sewardj9d2e7692005-02-07 01:11:31 +00001226 == e->Iex.Binop.arg2->Iex.Const.con->Ico.U32))));
sewardjb8e75862004-08-19 17:58:45 +00001227 break;
sewardj855dc722005-02-17 09:26:05 +00001228 case Iop_CmpEQ64:
1229 e2 = IRExpr_Const(IRConst_U1(toBool(
1230 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
1231 == e->Iex.Binop.arg2->Iex.Const.con->Ico.U64))));
1232 break;
1233
1234 /* -- CmpNE -- */
1235 case Iop_CmpNE8:
1236 e2 = IRExpr_Const(IRConst_U1(toBool(
1237 ((0xFF & e->Iex.Binop.arg1->Iex.Const.con->Ico.U8)
1238 != (0xFF & e->Iex.Binop.arg2->Iex.Const.con->Ico.U8)))));
1239 break;
sewardjae27ab62004-10-15 21:21:46 +00001240 case Iop_CmpNE32:
sewardj9d2e7692005-02-07 01:11:31 +00001241 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardjae27ab62004-10-15 21:21:46 +00001242 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U32
sewardj9d2e7692005-02-07 01:11:31 +00001243 != e->Iex.Binop.arg2->Iex.Const.con->Ico.U32))));
sewardjae27ab62004-10-15 21:21:46 +00001244 break;
sewardje6b39932004-11-06 17:01:15 +00001245 case Iop_CmpNE64:
sewardj9d2e7692005-02-07 01:11:31 +00001246 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardje6b39932004-11-06 17:01:15 +00001247 (e->Iex.Binop.arg1->Iex.Const.con->Ico.U64
sewardj9d2e7692005-02-07 01:11:31 +00001248 != e->Iex.Binop.arg2->Iex.Const.con->Ico.U64))));
sewardje6b39932004-11-06 17:01:15 +00001249 break;
1250
sewardj855dc722005-02-17 09:26:05 +00001251 /* -- CmpLEU -- */
sewardj7447b5b2004-10-16 11:30:42 +00001252 case Iop_CmpLE32U:
sewardj9d2e7692005-02-07 01:11:31 +00001253 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardj7447b5b2004-10-16 11:30:42 +00001254 ((UInt)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32)
sewardj9d2e7692005-02-07 01:11:31 +00001255 <= (UInt)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)))));
sewardj7447b5b2004-10-16 11:30:42 +00001256 break;
sewardj855dc722005-02-17 09:26:05 +00001257
1258 /* -- CmpLES -- */
sewardj088e4f72004-10-19 01:25:02 +00001259 case Iop_CmpLE32S:
sewardj9d2e7692005-02-07 01:11:31 +00001260 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardj088e4f72004-10-19 01:25:02 +00001261 ((Int)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32)
sewardj9d2e7692005-02-07 01:11:31 +00001262 <= (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)))));
sewardj088e4f72004-10-19 01:25:02 +00001263 break;
sewardje1d45da2004-11-12 00:13:21 +00001264
sewardj855dc722005-02-17 09:26:05 +00001265 /* -- CmpLTS -- */
sewardj9bdd2652004-10-19 12:56:33 +00001266 case Iop_CmpLT32S:
sewardj9d2e7692005-02-07 01:11:31 +00001267 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardj9bdd2652004-10-19 12:56:33 +00001268 ((Int)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32)
sewardj9d2e7692005-02-07 01:11:31 +00001269 < (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)))));
sewardj9bdd2652004-10-19 12:56:33 +00001270 break;
sewardj855dc722005-02-17 09:26:05 +00001271
1272 /* -- CmpLTU -- */
sewardje1d45da2004-11-12 00:13:21 +00001273 case Iop_CmpLT32U:
sewardj9d2e7692005-02-07 01:11:31 +00001274 e2 = IRExpr_Const(IRConst_U1(toBool(
sewardje1d45da2004-11-12 00:13:21 +00001275 ((UInt)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32)
sewardj9d2e7692005-02-07 01:11:31 +00001276 < (UInt)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32)))));
sewardje1d45da2004-11-12 00:13:21 +00001277 break;
sewardj7447b5b2004-10-16 11:30:42 +00001278
sewardj855dc722005-02-17 09:26:05 +00001279 /* -- nHLto2n -- */
sewardj088bcb42004-08-19 17:16:52 +00001280 case Iop_32HLto64:
1281 e2 = IRExpr_Const(IRConst_U64(
1282 (((ULong)(e->Iex.Binop.arg1->Iex.Const.con->Ico.U32)) << 32)
1283 | ((ULong)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32))
1284 ));
1285 break;
sewardj3bc8a592005-05-02 10:47:22 +00001286 case Iop_64HLto128:
1287 /* We can't fold this, because there is no way to
1288 express he result in IR, but at least pretend to
1289 handle it, so as to stop getting blasted with
1290 no-rule-for-this-primop messages. */
1291 break;
sewardj855dc722005-02-17 09:26:05 +00001292
sewardj607dd4f2004-09-08 18:20:19 +00001293 default:
1294 goto unhandled;
sewardjd7217032004-08-19 10:49:10 +00001295 }
sewardjf6729012004-08-25 12:45:13 +00001296
sewardj84be7372004-08-18 13:59:33 +00001297 } else {
sewardjf6729012004-08-25 12:45:13 +00001298
sewardj84be7372004-08-18 13:59:33 +00001299 /* other cases (identities, etc) */
sewardj291a7e82005-04-27 11:42:44 +00001300
1301 /* Shl64/Shr64(x,0) ==> x */
1302 if ((e->Iex.Binop.op == Iop_Shl64 || e->Iex.Binop.op == Iop_Shr64)
1303 && e->Iex.Binop.arg2->tag == Iex_Const
1304 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U8 == 0) {
1305 e2 = e->Iex.Binop.arg1;
1306 } else
1307
sewardj4afab822005-01-20 10:04:05 +00001308 /* Shl32/Shr32(x,0) ==> x */
1309 if ((e->Iex.Binop.op == Iop_Shl32 || e->Iex.Binop.op == Iop_Shr32)
sewardj52345402004-10-13 16:08:14 +00001310 && e->Iex.Binop.arg2->tag == Iex_Const
sewardj4980c6b2004-10-13 16:16:27 +00001311 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U8 == 0) {
sewardj52345402004-10-13 16:08:14 +00001312 e2 = e->Iex.Binop.arg1;
1313 } else
1314
sewardj8ac9bc92005-04-21 01:35:48 +00001315 /* Or8(x,0) ==> x */
1316 if ((e->Iex.Binop.op == Iop_Or8)
1317 && e->Iex.Binop.arg2->tag == Iex_Const
1318 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U8 == 0) {
1319 e2 = e->Iex.Binop.arg1;
1320 } else
1321
sewardj0033ddc2005-04-26 23:34:34 +00001322 /* Or16(x,0) ==> x */
1323 if ((e->Iex.Binop.op == Iop_Or16)
1324 && e->Iex.Binop.arg2->tag == Iex_Const
1325 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U16 == 0) {
1326 e2 = e->Iex.Binop.arg1;
1327 } else
1328
sewardjd9997882004-11-04 19:42:59 +00001329 /* Or32/Add32(x,0) ==> x */
1330 if ((e->Iex.Binop.op == Iop_Add32 || e->Iex.Binop.op == Iop_Or32)
sewardj84be7372004-08-18 13:59:33 +00001331 && e->Iex.Binop.arg2->tag == Iex_Const
1332 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U32 == 0) {
1333 e2 = e->Iex.Binop.arg1;
sewardjf6729012004-08-25 12:45:13 +00001334 } else
1335
sewardjdcd6c882004-12-16 11:41:06 +00001336 /* Or64/Add64(x,0) ==> x */
1337 if ((e->Iex.Binop.op == Iop_Add64 || e->Iex.Binop.op == Iop_Or64)
1338 && e->Iex.Binop.arg2->tag == Iex_Const
1339 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U64 == 0) {
1340 e2 = e->Iex.Binop.arg1;
1341 } else
1342
sewardj832853b2004-11-06 12:10:04 +00001343 /* And32(x,0xFFFFFFFF) ==> x */
1344 if (e->Iex.Binop.op == Iop_And32
1345 && e->Iex.Binop.arg2->tag == Iex_Const
1346 && e->Iex.Binop.arg2->Iex.Const.con->Ico.U32 == 0xFFFFFFFF) {
1347 e2 = e->Iex.Binop.arg1;
1348 } else
1349
sewardjd9997882004-11-04 19:42:59 +00001350 /* Or32(0,x) ==> x */
1351 if (e->Iex.Binop.op == Iop_Or32
1352 && e->Iex.Binop.arg1->tag == Iex_Const
1353 && e->Iex.Binop.arg1->Iex.Const.con->Ico.U32 == 0) {
1354 e2 = e->Iex.Binop.arg2;
1355 } else
1356
sewardjdcd6c882004-12-16 11:41:06 +00001357 /* Or8/16/32/64(t,t) ==> t, for some IRTemp t */
1358 /* And8/16/32/64(t,t) ==> t, for some IRTemp t */
1359 if ( (e->Iex.Binop.op == Iop_And64
1360 || e->Iex.Binop.op == Iop_And32
sewardjf6729012004-08-25 12:45:13 +00001361 || e->Iex.Binop.op == Iop_And16
sewardjdcd6c882004-12-16 11:41:06 +00001362 || e->Iex.Binop.op == Iop_And8
1363 || e->Iex.Binop.op == Iop_Or64
1364 || e->Iex.Binop.op == Iop_Or32
1365 || e->Iex.Binop.op == Iop_Or16
1366 || e->Iex.Binop.op == Iop_Or8)
sewardjf6729012004-08-25 12:45:13 +00001367 && sameIRTemps(e->Iex.Binop.arg1, e->Iex.Binop.arg2)) {
1368 e2 = e->Iex.Binop.arg1;
sewardjaba4fff2004-10-08 21:37:15 +00001369 }
sewardjf6729012004-08-25 12:45:13 +00001370
sewardj0033ddc2005-04-26 23:34:34 +00001371 /* Xor8/16/32/64(t,t) ==> 0, for some IRTemp t */
1372 if ( (e->Iex.Binop.op == Iop_Xor64
1373 || e->Iex.Binop.op == Iop_Xor32
1374 || e->Iex.Binop.op == Iop_Xor16
1375 || e->Iex.Binop.op == Iop_Xor8)
1376 && sameIRTemps(e->Iex.Binop.arg1, e->Iex.Binop.arg2)) {
sewardj0033ddc2005-04-26 23:34:34 +00001377 e2 = mkZeroForXor(e->Iex.Binop.op);
1378 }
1379
sewardj84be7372004-08-18 13:59:33 +00001380 }
1381 }
1382
1383 /* Mux0X */
1384 if (e->tag == Iex_Mux0X
1385 && e->Iex.Mux0X.cond->tag == Iex_Const) {
1386 Bool zero;
1387 /* assured us by the IR type rules */
1388 vassert(e->Iex.Mux0X.cond->Iex.Const.con->tag == Ico_U8);
sewardj9d2e7692005-02-07 01:11:31 +00001389 zero = toBool(0 == (0xFF & e->Iex.Mux0X.cond
1390 ->Iex.Const.con->Ico.U8));
sewardj84be7372004-08-18 13:59:33 +00001391 e2 = zero ? e->Iex.Mux0X.expr0 : e->Iex.Mux0X.exprX;
1392 }
1393
sewardj088bcb42004-08-19 17:16:52 +00001394 if (DEBUG_IROPT && e2 != e) {
1395 vex_printf("FOLD: ");
sewardj84be7372004-08-18 13:59:33 +00001396 ppIRExpr(e); vex_printf(" -> ");
1397 ppIRExpr(e2); vex_printf("\n");
1398 }
1399
1400 return e2;
1401
1402 unhandled:
sewardj883b00b2004-09-11 09:30:24 +00001403# if 0
sewardj84be7372004-08-18 13:59:33 +00001404 vex_printf("\n\n");
1405 ppIRExpr(e);
1406 vpanic("fold_Expr: no rule for the above");
sewardj883b00b2004-09-11 09:30:24 +00001407# else
1408 vex_printf("vex iropt: fold_Expr: no rule for: ");
1409 ppIRExpr(e);
1410 vex_printf("\n");
1411 return e2;
1412# endif
sewardj84be7372004-08-18 13:59:33 +00001413}
1414
1415
sewardj84be7372004-08-18 13:59:33 +00001416/* Apply the subst to a simple 1-level expression -- guaranteed to be
1417 1-level due to previous flattening pass. */
1418
sewardj62617ef2004-10-13 23:29:22 +00001419static IRExpr* subst_Expr ( IRExpr** env, IRExpr* ex )
sewardj84be7372004-08-18 13:59:33 +00001420{
sewardj62617ef2004-10-13 23:29:22 +00001421 switch (ex->tag) {
1422 case Iex_Tmp:
1423 if (env[(Int)ex->Iex.Tmp.tmp] != NULL) {
1424 return env[(Int)ex->Iex.Tmp.tmp];
1425 } else {
1426 /* not bound in env */
1427 return ex;
1428 }
1429
1430 case Iex_Const:
1431 case Iex_Get:
sewardj84be7372004-08-18 13:59:33 +00001432 return ex;
sewardj62617ef2004-10-13 23:29:22 +00001433
1434 case Iex_GetI:
sewardj496a58d2005-03-20 18:44:44 +00001435 vassert(isIRAtom(ex->Iex.GetI.ix));
sewardj62617ef2004-10-13 23:29:22 +00001436 return IRExpr_GetI(
1437 ex->Iex.GetI.descr,
sewardjeeac8412004-11-02 00:26:55 +00001438 subst_Expr(env, ex->Iex.GetI.ix),
sewardj62617ef2004-10-13 23:29:22 +00001439 ex->Iex.GetI.bias
1440 );
1441
1442 case Iex_Binop:
sewardj496a58d2005-03-20 18:44:44 +00001443 vassert(isIRAtom(ex->Iex.Binop.arg1));
1444 vassert(isIRAtom(ex->Iex.Binop.arg2));
sewardj62617ef2004-10-13 23:29:22 +00001445 return IRExpr_Binop(
1446 ex->Iex.Binop.op,
1447 subst_Expr(env, ex->Iex.Binop.arg1),
1448 subst_Expr(env, ex->Iex.Binop.arg2)
1449 );
1450
1451 case Iex_Unop:
sewardj496a58d2005-03-20 18:44:44 +00001452 vassert(isIRAtom(ex->Iex.Unop.arg));
sewardj62617ef2004-10-13 23:29:22 +00001453 return IRExpr_Unop(
1454 ex->Iex.Unop.op,
1455 subst_Expr(env, ex->Iex.Unop.arg)
1456 );
1457
1458 case Iex_LDle:
sewardj496a58d2005-03-20 18:44:44 +00001459 vassert(isIRAtom(ex->Iex.LDle.addr));
sewardj62617ef2004-10-13 23:29:22 +00001460 return IRExpr_LDle(
1461 ex->Iex.LDle.ty,
1462 subst_Expr(env, ex->Iex.LDle.addr)
1463 );
1464
1465 case Iex_CCall: {
1466 Int i;
1467 IRExpr** args2 = sopyIRExprVec(ex->Iex.CCall.args);
1468 for (i = 0; args2[i]; i++) {
sewardj496a58d2005-03-20 18:44:44 +00001469 vassert(isIRAtom(args2[i]));
sewardj62617ef2004-10-13 23:29:22 +00001470 args2[i] = subst_Expr(env, args2[i]);
1471 }
1472 return IRExpr_CCall(
sewardj8ea867b2004-10-30 19:03:02 +00001473 ex->Iex.CCall.cee,
sewardj62617ef2004-10-13 23:29:22 +00001474 ex->Iex.CCall.retty,
1475 args2
1476 );
sewardj84be7372004-08-18 13:59:33 +00001477 }
sewardj62617ef2004-10-13 23:29:22 +00001478
1479 case Iex_Mux0X:
sewardj496a58d2005-03-20 18:44:44 +00001480 vassert(isIRAtom(ex->Iex.Mux0X.cond));
1481 vassert(isIRAtom(ex->Iex.Mux0X.expr0));
1482 vassert(isIRAtom(ex->Iex.Mux0X.exprX));
sewardj62617ef2004-10-13 23:29:22 +00001483 return IRExpr_Mux0X(
1484 subst_Expr(env, ex->Iex.Mux0X.cond),
1485 subst_Expr(env, ex->Iex.Mux0X.expr0),
1486 subst_Expr(env, ex->Iex.Mux0X.exprX)
1487 );
1488
1489 default:
1490 vex_printf("\n\n"); ppIRExpr(ex);
1491 vpanic("subst_Expr");
1492
sewardj84be7372004-08-18 13:59:33 +00001493 }
sewardj84be7372004-08-18 13:59:33 +00001494}
1495
1496
1497/* Apply the subst to stmt, then fold the result as much as possible.
sewardjd2445f62005-03-21 00:15:53 +00001498 Much simplified due to stmt being previously flattened. As a
1499 result of this, the stmt may wind up being turned into a no-op.
1500*/
sewardj62617ef2004-10-13 23:29:22 +00001501static IRStmt* subst_and_fold_Stmt ( IRExpr** env, IRStmt* st )
sewardj84be7372004-08-18 13:59:33 +00001502{
1503# if 0
1504 vex_printf("\nsubst and fold stmt\n");
1505 ppIRStmt(st);
1506 vex_printf("\n");
1507# endif
1508
sewardj62617ef2004-10-13 23:29:22 +00001509 switch (st->tag) {
1510 case Ist_Put:
sewardj496a58d2005-03-20 18:44:44 +00001511 vassert(isIRAtom(st->Ist.Put.data));
sewardj62617ef2004-10-13 23:29:22 +00001512 return IRStmt_Put(
1513 st->Ist.Put.offset,
1514 fold_Expr(subst_Expr(env, st->Ist.Put.data))
1515 );
sewardj84be7372004-08-18 13:59:33 +00001516
sewardj62617ef2004-10-13 23:29:22 +00001517 case Ist_PutI:
sewardj496a58d2005-03-20 18:44:44 +00001518 vassert(isIRAtom(st->Ist.PutI.ix));
1519 vassert(isIRAtom(st->Ist.PutI.data));
sewardj62617ef2004-10-13 23:29:22 +00001520 return IRStmt_PutI(
1521 st->Ist.PutI.descr,
sewardjeeac8412004-11-02 00:26:55 +00001522 fold_Expr(subst_Expr(env, st->Ist.PutI.ix)),
sewardj62617ef2004-10-13 23:29:22 +00001523 st->Ist.PutI.bias,
1524 fold_Expr(subst_Expr(env, st->Ist.PutI.data))
1525 );
sewardjd7217032004-08-19 10:49:10 +00001526
sewardj62617ef2004-10-13 23:29:22 +00001527 case Ist_Tmp:
1528 /* This is the one place where an expr (st->Ist.Tmp.data) is
1529 allowed to be more than just a constant or a tmp. */
1530 return IRStmt_Tmp(
1531 st->Ist.Tmp.tmp,
1532 fold_Expr(subst_Expr(env, st->Ist.Tmp.data))
1533 );
sewardj84be7372004-08-18 13:59:33 +00001534
sewardj62617ef2004-10-13 23:29:22 +00001535 case Ist_STle:
sewardj496a58d2005-03-20 18:44:44 +00001536 vassert(isIRAtom(st->Ist.STle.addr));
1537 vassert(isIRAtom(st->Ist.STle.data));
sewardj62617ef2004-10-13 23:29:22 +00001538 return IRStmt_STle(
1539 fold_Expr(subst_Expr(env, st->Ist.STle.addr)),
1540 fold_Expr(subst_Expr(env, st->Ist.STle.data))
1541 );
sewardj84be7372004-08-18 13:59:33 +00001542
sewardj62617ef2004-10-13 23:29:22 +00001543 case Ist_Dirty: {
1544 Int i;
1545 IRDirty *d, *d2;
1546 d = st->Ist.Dirty.details;
1547 d2 = emptyIRDirty();
1548 *d2 = *d;
1549 d2->args = sopyIRExprVec(d2->args);
1550 if (d2->mFx != Ifx_None) {
sewardj496a58d2005-03-20 18:44:44 +00001551 vassert(isIRAtom(d2->mAddr));
sewardj62617ef2004-10-13 23:29:22 +00001552 d2->mAddr = fold_Expr(subst_Expr(env, d2->mAddr));
sewardjb8e75862004-08-19 17:58:45 +00001553 }
sewardj496a58d2005-03-20 18:44:44 +00001554 vassert(isIRAtom(d2->guard));
sewardjb8385d82004-11-02 01:34:15 +00001555 d2->guard = fold_Expr(subst_Expr(env, d2->guard));
sewardj62617ef2004-10-13 23:29:22 +00001556 for (i = 0; d2->args[i]; i++) {
sewardj496a58d2005-03-20 18:44:44 +00001557 vassert(isIRAtom(d2->args[i]));
sewardj62617ef2004-10-13 23:29:22 +00001558 d2->args[i] = fold_Expr(subst_Expr(env, d2->args[i]));
1559 }
1560 return IRStmt_Dirty(d2);
sewardjb8e75862004-08-19 17:58:45 +00001561 }
sewardj84be7372004-08-18 13:59:33 +00001562
sewardjf1689312005-03-16 18:19:10 +00001563 case Ist_IMark:
1564 return IRStmt_IMark(st->Ist.IMark.addr, st->Ist.IMark.len);
1565
sewardjd2445f62005-03-21 00:15:53 +00001566 case Ist_NoOp:
1567 return IRStmt_NoOp();
1568
sewardj3e838932005-01-07 12:09:15 +00001569 case Ist_MFence:
1570 return IRStmt_MFence();
1571
sewardj62617ef2004-10-13 23:29:22 +00001572 case Ist_Exit: {
1573 IRExpr* fcond;
sewardj496a58d2005-03-20 18:44:44 +00001574 vassert(isIRAtom(st->Ist.Exit.guard));
sewardj0276d4b2004-11-15 15:30:21 +00001575 fcond = fold_Expr(subst_Expr(env, st->Ist.Exit.guard));
sewardj62617ef2004-10-13 23:29:22 +00001576 if (fcond->tag == Iex_Const) {
1577 /* Interesting. The condition on this exit has folded down to
1578 a constant. */
sewardjba999312004-11-15 15:21:17 +00001579 vassert(fcond->Iex.Const.con->tag == Ico_U1);
sewardj98430292004-12-29 17:34:50 +00001580 vassert(fcond->Iex.Const.con->Ico.U1 == False
1581 || fcond->Iex.Const.con->Ico.U1 == True);
sewardjba999312004-11-15 15:21:17 +00001582 if (fcond->Iex.Const.con->Ico.U1 == False) {
sewardj62617ef2004-10-13 23:29:22 +00001583 /* exit is never going to happen, so dump the statement. */
sewardjd2445f62005-03-21 00:15:53 +00001584 return IRStmt_NoOp();
sewardj62617ef2004-10-13 23:29:22 +00001585 } else {
sewardjba999312004-11-15 15:21:17 +00001586 vassert(fcond->Iex.Const.con->Ico.U1 == True);
sewardj62617ef2004-10-13 23:29:22 +00001587 /* Hmmm. The exit has become unconditional. Leave it as
1588 it is for now, since we'd have to truncate the BB at
1589 this point, which is tricky. */
1590 /* fall out into the reconstruct-the-exit code. */
sewardj08613742004-10-25 13:01:45 +00001591 if (vex_control.iropt_verbosity > 0)
1592 /* really a misuse of vex_control.iropt_verbosity */
sewardj8c2c10b2004-10-16 20:51:52 +00001593 vex_printf("vex iropt: IRStmt_Exit became unconditional\n");
sewardj62617ef2004-10-13 23:29:22 +00001594 }
1595 }
sewardj893aada2004-11-29 19:57:54 +00001596 return IRStmt_Exit(fcond, st->Ist.Exit.jk, st->Ist.Exit.dst);
sewardj62617ef2004-10-13 23:29:22 +00001597 }
1598
1599 default:
1600 vex_printf("\n"); ppIRStmt(st);
1601 vpanic("subst_and_fold_Stmt");
1602 }
sewardj84be7372004-08-18 13:59:33 +00001603}
1604
1605
sewardjd9997882004-11-04 19:42:59 +00001606IRBB* cprop_BB ( IRBB* in )
sewardj84be7372004-08-18 13:59:33 +00001607{
sewardj62617ef2004-10-13 23:29:22 +00001608 Int i;
1609 IRBB* out;
1610 IRStmt* st2;
1611 Int n_tmps = in->tyenv->types_used;
1612 IRExpr** env = LibVEX_Alloc(n_tmps * sizeof(IRExpr*));
sewardj84be7372004-08-18 13:59:33 +00001613
1614 out = emptyIRBB();
sewardj695cff92004-10-13 14:50:14 +00001615 out->tyenv = dopyIRTypeEnv( in->tyenv );
sewardj84be7372004-08-18 13:59:33 +00001616
1617 /* Set up the env with which travels forward. This holds a
1618 substitution, mapping IRTemps to atoms, that is, IRExprs which
1619 are either IRTemps or IRConsts. Thus, copy and constant
1620 propagation is done. The environment is to be applied as we
1621 move along. Keys are IRTemps. Values are IRExpr*s.
1622 */
sewardj62617ef2004-10-13 23:29:22 +00001623 for (i = 0; i < n_tmps; i++)
1624 env[i] = NULL;
sewardj84be7372004-08-18 13:59:33 +00001625
1626 /* For each original SSA-form stmt ... */
1627 for (i = 0; i < in->stmts_used; i++) {
1628
1629 /* First apply the substitution to the current stmt. This
1630 propagates in any constants and tmp-tmp assignments
1631 accumulated prior to this point. As part of the subst_Stmt
1632 call, also then fold any constant expressions resulting. */
1633
sewardjf0e43162004-08-20 00:11:12 +00001634 st2 = in->stmts[i];
1635
1636 /* perhaps st2 is already a no-op? */
sewardj8bee6d12005-03-22 02:24:05 +00001637 if (st2->tag == Ist_NoOp) continue;
sewardjf0e43162004-08-20 00:11:12 +00001638
1639 st2 = subst_and_fold_Stmt( env, st2 );
sewardj84be7372004-08-18 13:59:33 +00001640
sewardjb8e75862004-08-19 17:58:45 +00001641 /* If the statement has been folded into a no-op, forget it. */
sewardj8bee6d12005-03-22 02:24:05 +00001642 if (st2->tag == Ist_NoOp) continue;
sewardjb8e75862004-08-19 17:58:45 +00001643
sewardj84be7372004-08-18 13:59:33 +00001644 /* Now consider what the stmt looks like. If it's of the form
1645 't = const' or 't1 = t2', add it to the running environment
1646 and not to the output BB. Otherwise, add it to the output
sewardjc0b42282004-10-12 13:44:12 +00001647 BB. Note, we choose not to propagate const when const is an
1648 F64i, so that F64i literals can be CSE'd later. This helps
1649 x86 floating point code generation. */
sewardj84be7372004-08-18 13:59:33 +00001650
sewardjc0b42282004-10-12 13:44:12 +00001651 if (st2->tag == Ist_Tmp
1652 && st2->Ist.Tmp.data->tag == Iex_Const
1653 && st2->Ist.Tmp.data->Iex.Const.con->tag != Ico_F64i) {
sewardj84be7372004-08-18 13:59:33 +00001654 /* 't = const' -- add to env.
1655 The pair (IRTemp, IRExpr*) is added. */
sewardj62617ef2004-10-13 23:29:22 +00001656 env[(Int)(st2->Ist.Tmp.tmp)] = st2->Ist.Tmp.data;
sewardj84be7372004-08-18 13:59:33 +00001657 }
1658 else
sewardj6d076362004-09-23 11:06:17 +00001659 if (st2->tag == Ist_Tmp && st2->Ist.Tmp.data->tag == Iex_Tmp) {
sewardj84be7372004-08-18 13:59:33 +00001660 /* 't1 = t2' -- add to env.
1661 The pair (IRTemp, IRExpr*) is added. */
sewardj62617ef2004-10-13 23:29:22 +00001662 env[(Int)(st2->Ist.Tmp.tmp)] = st2->Ist.Tmp.data;
sewardj84be7372004-08-18 13:59:33 +00001663 }
1664 else {
1665 /* Not interesting, copy st2 into the output block. */
1666 addStmtToIRBB( out, st2 );
1667 }
1668 }
1669
sewardj84be7372004-08-18 13:59:33 +00001670 out->next = subst_Expr( env, in->next );
1671 out->jumpkind = in->jumpkind;
1672 return out;
1673}
1674
1675
1676
sewardjedf4d692004-08-17 13:52:58 +00001677/*---------------------------------------------------------------*/
sewardj39e3f242004-08-18 16:54:52 +00001678/*--- Dead code (t = E) removal ---*/
1679/*---------------------------------------------------------------*/
1680
sewardjea602bc2004-10-14 21:40:12 +00001681/* The type of the HashHW map is: a map from IRTemp to nothing
sewardj39e3f242004-08-18 16:54:52 +00001682 -- really just operating a set or IRTemps.
1683*/
1684
sewardjd503a322004-10-13 22:41:16 +00001685inline
1686static void addUses_Temp ( Bool* set, IRTemp tmp )
sewardj17442fe2004-09-20 14:54:28 +00001687{
sewardjd503a322004-10-13 22:41:16 +00001688 set[(Int)tmp] = True;
sewardj17442fe2004-09-20 14:54:28 +00001689}
1690
sewardjd503a322004-10-13 22:41:16 +00001691static void addUses_Expr ( Bool* set, IRExpr* e )
sewardj39e3f242004-08-18 16:54:52 +00001692{
1693 Int i;
1694 switch (e->tag) {
sewardjd7217032004-08-19 10:49:10 +00001695 case Iex_GetI:
sewardjeeac8412004-11-02 00:26:55 +00001696 addUses_Expr(set, e->Iex.GetI.ix);
sewardjd7217032004-08-19 10:49:10 +00001697 return;
sewardj39e3f242004-08-18 16:54:52 +00001698 case Iex_Mux0X:
1699 addUses_Expr(set, e->Iex.Mux0X.cond);
1700 addUses_Expr(set, e->Iex.Mux0X.expr0);
1701 addUses_Expr(set, e->Iex.Mux0X.exprX);
1702 return;
1703 case Iex_CCall:
1704 for (i = 0; e->Iex.CCall.args[i]; i++)
1705 addUses_Expr(set, e->Iex.CCall.args[i]);
1706 return;
1707 case Iex_LDle:
1708 addUses_Expr(set, e->Iex.LDle.addr);
1709 return;
1710 case Iex_Binop:
1711 addUses_Expr(set, e->Iex.Binop.arg1);
1712 addUses_Expr(set, e->Iex.Binop.arg2);
1713 return;
1714 case Iex_Unop:
1715 addUses_Expr(set, e->Iex.Unop.arg);
1716 return;
1717 case Iex_Tmp:
sewardj17442fe2004-09-20 14:54:28 +00001718 addUses_Temp(set, e->Iex.Tmp.tmp);
sewardj39e3f242004-08-18 16:54:52 +00001719 return;
1720 case Iex_Const:
1721 case Iex_Get:
1722 return;
1723 default:
1724 vex_printf("\n");
1725 ppIRExpr(e);
1726 vpanic("addUses_Expr");
1727 }
1728}
1729
sewardjd503a322004-10-13 22:41:16 +00001730static void addUses_Stmt ( Bool* set, IRStmt* st )
sewardj39e3f242004-08-18 16:54:52 +00001731{
sewardj17442fe2004-09-20 14:54:28 +00001732 Int i;
1733 IRDirty* d;
sewardj39e3f242004-08-18 16:54:52 +00001734 switch (st->tag) {
sewardjd7217032004-08-19 10:49:10 +00001735 case Ist_PutI:
sewardjeeac8412004-11-02 00:26:55 +00001736 addUses_Expr(set, st->Ist.PutI.ix);
sewardj2d3f77c2004-09-22 23:49:09 +00001737 addUses_Expr(set, st->Ist.PutI.data);
sewardjd7217032004-08-19 10:49:10 +00001738 return;
sewardj39e3f242004-08-18 16:54:52 +00001739 case Ist_Tmp:
sewardj6d076362004-09-23 11:06:17 +00001740 addUses_Expr(set, st->Ist.Tmp.data);
sewardj39e3f242004-08-18 16:54:52 +00001741 return;
1742 case Ist_Put:
sewardj6d076362004-09-23 11:06:17 +00001743 addUses_Expr(set, st->Ist.Put.data);
sewardj39e3f242004-08-18 16:54:52 +00001744 return;
1745 case Ist_STle:
1746 addUses_Expr(set, st->Ist.STle.addr);
1747 addUses_Expr(set, st->Ist.STle.data);
1748 return;
sewardj17442fe2004-09-20 14:54:28 +00001749 case Ist_Dirty:
1750 d = st->Ist.Dirty.details;
1751 if (d->mFx != Ifx_None)
1752 addUses_Expr(set, d->mAddr);
sewardjb8385d82004-11-02 01:34:15 +00001753 addUses_Expr(set, d->guard);
sewardj17442fe2004-09-20 14:54:28 +00001754 for (i = 0; d->args[i] != NULL; i++)
1755 addUses_Expr(set, d->args[i]);
1756 return;
sewardjd2445f62005-03-21 00:15:53 +00001757 case Ist_NoOp:
sewardjf1689312005-03-16 18:19:10 +00001758 case Ist_IMark:
sewardj3e838932005-01-07 12:09:15 +00001759 case Ist_MFence:
1760 return;
sewardj17442fe2004-09-20 14:54:28 +00001761 case Ist_Exit:
sewardj0276d4b2004-11-15 15:30:21 +00001762 addUses_Expr(set, st->Ist.Exit.guard);
sewardj17442fe2004-09-20 14:54:28 +00001763 return;
sewardj39e3f242004-08-18 16:54:52 +00001764 default:
1765 vex_printf("\n");
1766 ppIRStmt(st);
1767 vpanic("addUses_Stmt");
sewardjd503a322004-10-13 22:41:16 +00001768 }
sewardj39e3f242004-08-18 16:54:52 +00001769}
1770
1771
sewardjba999312004-11-15 15:21:17 +00001772/* Is this literally IRExpr_Const(IRConst_U1(False)) ? */
1773static Bool isZeroU1 ( IRExpr* e )
sewardjd9997882004-11-04 19:42:59 +00001774{
sewardj9d2e7692005-02-07 01:11:31 +00001775 return toBool( e->tag == Iex_Const
1776 && e->Iex.Const.con->tag == Ico_U1
1777 && e->Iex.Const.con->Ico.U1 == False );
sewardjd9997882004-11-04 19:42:59 +00001778}
1779
sewardj39e3f242004-08-18 16:54:52 +00001780
1781/* Note, this destructively modifies the given IRBB. */
1782
1783/* Scan backwards through statements, carrying a set of IRTemps which
1784 are known to be used after the current point. On encountering 't =
1785 E', delete the binding if it is not used. Otherwise, add any temp
1786 uses to the set and keep on moving backwards. */
1787
sewardj49651f42004-10-28 22:11:04 +00001788/* notstatic */ void do_deadcode_BB ( IRBB* bb )
sewardj39e3f242004-08-18 16:54:52 +00001789{
1790 Int i;
sewardjd503a322004-10-13 22:41:16 +00001791 Int n_tmps = bb->tyenv->types_used;
1792 Bool* set = LibVEX_Alloc(n_tmps * sizeof(Bool));
sewardj39e3f242004-08-18 16:54:52 +00001793 IRStmt* st;
1794
sewardjd503a322004-10-13 22:41:16 +00001795 for (i = 0; i < n_tmps; i++)
1796 set[i] = False;
1797
sewardj39e3f242004-08-18 16:54:52 +00001798 /* start off by recording IRTemp uses in the next field. */
1799 addUses_Expr(set, bb->next);
1800
1801 /* Work backwards through the stmts */
1802 for (i = bb->stmts_used-1; i >= 0; i--) {
1803 st = bb->stmts[i];
sewardj8bee6d12005-03-22 02:24:05 +00001804 if (st->tag == Ist_NoOp)
sewardj84ff0652004-08-23 16:16:08 +00001805 continue;
sewardj39e3f242004-08-18 16:54:52 +00001806 if (st->tag == Ist_Tmp
sewardjd503a322004-10-13 22:41:16 +00001807 && set[(Int)(st->Ist.Tmp.tmp)] == False) {
sewardj39e3f242004-08-18 16:54:52 +00001808 /* it's an IRTemp which never got used. Delete it. */
sewardj088bcb42004-08-19 17:16:52 +00001809 if (DEBUG_IROPT) {
sewardj39e3f242004-08-18 16:54:52 +00001810 vex_printf("DEAD: ");
1811 ppIRStmt(st);
1812 vex_printf("\n");
1813 }
sewardjd2445f62005-03-21 00:15:53 +00001814 bb->stmts[i] = IRStmt_NoOp();
sewardjd9997882004-11-04 19:42:59 +00001815 }
1816 else
1817 if (st->tag == Ist_Dirty
1818 && st->Ist.Dirty.details->guard
sewardjba999312004-11-15 15:21:17 +00001819 && isZeroU1(st->Ist.Dirty.details->guard)) {
sewardjd2445f62005-03-21 00:15:53 +00001820 /* This is a dirty helper which will never get called.
1821 Delete it. */
1822 bb->stmts[i] = IRStmt_NoOp();
sewardjd9997882004-11-04 19:42:59 +00001823 }
1824 else {
sewardj39e3f242004-08-18 16:54:52 +00001825 /* Note any IRTemp uses made by the current statement. */
1826 addUses_Stmt(set, st);
1827 }
1828 }
1829}
1830
sewardj84ff0652004-08-23 16:16:08 +00001831/*---------------------------------------------------------------*/
1832/*--- Specialisation of helper function calls, in ---*/
1833/*--- collaboration with the front end ---*/
1834/*---------------------------------------------------------------*/
1835
1836static
sewardjb9230752004-12-29 19:25:06 +00001837IRBB* spec_helpers_BB ( IRBB* bb,
sewardj9d2e7692005-02-07 01:11:31 +00001838 IRExpr* (*specHelper) ( HChar*, IRExpr**) )
sewardj84ff0652004-08-23 16:16:08 +00001839{
sewardjb9230752004-12-29 19:25:06 +00001840 Int i;
sewardj84ff0652004-08-23 16:16:08 +00001841 IRStmt* st;
1842 IRExpr* ex;
sewardjb9230752004-12-29 19:25:06 +00001843 Bool any = False;
sewardj84ff0652004-08-23 16:16:08 +00001844
1845 for (i = bb->stmts_used-1; i >= 0; i--) {
1846 st = bb->stmts[i];
1847
sewardj8bee6d12005-03-22 02:24:05 +00001848 if (st->tag != Ist_Tmp
sewardj6d076362004-09-23 11:06:17 +00001849 || st->Ist.Tmp.data->tag != Iex_CCall)
sewardj8bee6d12005-03-22 02:24:05 +00001850 continue;
sewardj84ff0652004-08-23 16:16:08 +00001851
sewardj8ea867b2004-10-30 19:03:02 +00001852 ex = (*specHelper)( st->Ist.Tmp.data->Iex.CCall.cee->name,
sewardj6d076362004-09-23 11:06:17 +00001853 st->Ist.Tmp.data->Iex.CCall.args );
sewardj84ff0652004-08-23 16:16:08 +00001854 if (!ex)
sewardjaba4fff2004-10-08 21:37:15 +00001855 /* the front end can't think of a suitable replacement */
1856 continue;
sewardj84ff0652004-08-23 16:16:08 +00001857
1858 /* We got something better. Install it in the bb. */
sewardjb9230752004-12-29 19:25:06 +00001859 any = True;
sewardj84ff0652004-08-23 16:16:08 +00001860 bb->stmts[i]
1861 = IRStmt_Tmp(st->Ist.Tmp.tmp, ex);
1862
1863 if (0) {
1864 vex_printf("SPEC: ");
sewardj6d076362004-09-23 11:06:17 +00001865 ppIRExpr(st->Ist.Tmp.data);
sewardj84ff0652004-08-23 16:16:08 +00001866 vex_printf(" --> ");
1867 ppIRExpr(ex);
1868 vex_printf("\n");
1869 }
1870 }
sewardjb9230752004-12-29 19:25:06 +00001871
1872 if (any)
1873 bb = flatten_BB(bb);
1874 return bb;
sewardj84ff0652004-08-23 16:16:08 +00001875}
1876
sewardj29632392004-08-22 02:38:11 +00001877
1878/*---------------------------------------------------------------*/
sewardj08210532004-12-29 17:09:11 +00001879/*--- Common Subexpression Elimination ---*/
1880/*---------------------------------------------------------------*/
1881
1882/* Expensive in time and space. */
1883
1884/* Uses two environments:
1885 a IRTemp -> IRTemp mapping
1886 a mapping from AvailExpr* to IRTemp
1887*/
1888
1889typedef
1890 struct {
1891 enum { Ut, Btt, Btc, Bct, Cf64i } tag;
1892 union {
1893 /* unop(tmp) */
1894 struct {
1895 IROp op;
1896 IRTemp arg;
1897 } Ut;
1898 /* binop(tmp,tmp) */
1899 struct {
1900 IROp op;
1901 IRTemp arg1;
1902 IRTemp arg2;
1903 } Btt;
1904 /* binop(tmp,const) */
1905 struct {
1906 IROp op;
1907 IRTemp arg1;
1908 IRConst con2;
1909 } Btc;
1910 /* binop(const,tmp) */
1911 struct {
1912 IROp op;
1913 IRConst con1;
1914 IRTemp arg2;
1915 } Bct;
1916 /* F64i-style const */
1917 struct {
1918 ULong f64i;
1919 } Cf64i;
1920 } u;
1921 }
1922 AvailExpr;
1923
1924static Bool eq_AvailExpr ( AvailExpr* a1, AvailExpr* a2 )
1925{
1926 if (a1->tag != a2->tag)
1927 return False;
1928 switch (a1->tag) {
1929 case Ut:
sewardj9d2e7692005-02-07 01:11:31 +00001930 return toBool(
1931 a1->u.Ut.op == a2->u.Ut.op
1932 && a1->u.Ut.arg == a2->u.Ut.arg);
sewardj08210532004-12-29 17:09:11 +00001933 case Btt:
sewardj9d2e7692005-02-07 01:11:31 +00001934 return toBool(
1935 a1->u.Btt.op == a2->u.Btt.op
sewardj08210532004-12-29 17:09:11 +00001936 && a1->u.Btt.arg1 == a2->u.Btt.arg1
sewardj9d2e7692005-02-07 01:11:31 +00001937 && a1->u.Btt.arg2 == a2->u.Btt.arg2);
sewardj08210532004-12-29 17:09:11 +00001938 case Btc:
sewardj9d2e7692005-02-07 01:11:31 +00001939 return toBool(
1940 a1->u.Btc.op == a2->u.Btc.op
sewardj08210532004-12-29 17:09:11 +00001941 && a1->u.Btc.arg1 == a2->u.Btc.arg1
sewardj9d2e7692005-02-07 01:11:31 +00001942 && eqIRConst(&a1->u.Btc.con2, &a2->u.Btc.con2));
sewardj08210532004-12-29 17:09:11 +00001943 case Bct:
sewardj9d2e7692005-02-07 01:11:31 +00001944 return toBool(
1945 a1->u.Bct.op == a2->u.Bct.op
sewardj08210532004-12-29 17:09:11 +00001946 && a1->u.Bct.arg2 == a2->u.Bct.arg2
sewardj9d2e7692005-02-07 01:11:31 +00001947 && eqIRConst(&a1->u.Bct.con1, &a2->u.Bct.con1));
sewardj08210532004-12-29 17:09:11 +00001948 case Cf64i:
sewardj9d2e7692005-02-07 01:11:31 +00001949 return toBool(a1->u.Cf64i.f64i == a2->u.Cf64i.f64i);
sewardj08210532004-12-29 17:09:11 +00001950 default: vpanic("eq_AvailExpr");
1951 }
1952}
1953
1954static IRExpr* availExpr_to_IRExpr ( AvailExpr* ae )
1955{
1956 IRConst* con;
1957 switch (ae->tag) {
1958 case Ut:
1959 return IRExpr_Unop( ae->u.Ut.op, IRExpr_Tmp(ae->u.Ut.arg) );
1960 case Btt:
1961 return IRExpr_Binop( ae->u.Btt.op,
1962 IRExpr_Tmp(ae->u.Btt.arg1),
1963 IRExpr_Tmp(ae->u.Btt.arg2) );
1964 case Btc:
1965 con = LibVEX_Alloc(sizeof(IRConst));
1966 *con = ae->u.Btc.con2;
1967 return IRExpr_Binop( ae->u.Btc.op,
1968 IRExpr_Tmp(ae->u.Btc.arg1), IRExpr_Const(con) );
1969 case Bct:
1970 con = LibVEX_Alloc(sizeof(IRConst));
1971 *con = ae->u.Bct.con1;
1972 return IRExpr_Binop( ae->u.Bct.op,
1973 IRExpr_Const(con), IRExpr_Tmp(ae->u.Bct.arg2) );
1974 case Cf64i:
1975 return IRExpr_Const(IRConst_F64i(ae->u.Cf64i.f64i));
1976 default:
1977 vpanic("availExpr_to_IRExpr");
1978 }
1979}
1980
1981inline
1982static IRTemp subst_AvailExpr_Temp ( HashHW* env, IRTemp tmp )
1983{
1984 HWord res;
1985 /* env :: IRTemp -> IRTemp */
1986 if (lookupHHW( env, &res, (HWord)tmp ))
1987 return (IRTemp)res;
1988 else
1989 return tmp;
1990}
1991
1992static void subst_AvailExpr ( HashHW* env, AvailExpr* ae )
1993{
1994 /* env :: IRTemp -> IRTemp */
1995 switch (ae->tag) {
1996 case Ut:
1997 ae->u.Ut.arg = subst_AvailExpr_Temp( env, ae->u.Ut.arg );
1998 break;
1999 case Btt:
2000 ae->u.Btt.arg1 = subst_AvailExpr_Temp( env, ae->u.Btt.arg1 );
2001 ae->u.Btt.arg2 = subst_AvailExpr_Temp( env, ae->u.Btt.arg2 );
2002 break;
2003 case Btc:
2004 ae->u.Btc.arg1 = subst_AvailExpr_Temp( env, ae->u.Btc.arg1 );
2005 break;
2006 case Bct:
2007 ae->u.Bct.arg2 = subst_AvailExpr_Temp( env, ae->u.Bct.arg2 );
2008 break;
2009 case Cf64i:
2010 break;
2011 default:
2012 vpanic("subst_AvailExpr");
2013 }
2014}
2015
2016static AvailExpr* irExpr_to_AvailExpr ( IRExpr* e )
2017{
2018 AvailExpr* ae;
2019
2020 if (e->tag == Iex_Unop
2021 && e->Iex.Unop.arg->tag == Iex_Tmp) {
2022 ae = LibVEX_Alloc(sizeof(AvailExpr));
2023 ae->tag = Ut;
2024 ae->u.Ut.op = e->Iex.Unop.op;
2025 ae->u.Ut.arg = e->Iex.Unop.arg->Iex.Tmp.tmp;
2026 return ae;
2027 }
2028
2029 if (e->tag == Iex_Binop
2030 && e->Iex.Binop.arg1->tag == Iex_Tmp
2031 && e->Iex.Binop.arg2->tag == Iex_Tmp) {
2032 ae = LibVEX_Alloc(sizeof(AvailExpr));
2033 ae->tag = Btt;
2034 ae->u.Btt.op = e->Iex.Binop.op;
2035 ae->u.Btt.arg1 = e->Iex.Binop.arg1->Iex.Tmp.tmp;
2036 ae->u.Btt.arg2 = e->Iex.Binop.arg2->Iex.Tmp.tmp;
2037 return ae;
2038 }
2039
2040 if (e->tag == Iex_Binop
2041 && e->Iex.Binop.arg1->tag == Iex_Tmp
2042 && e->Iex.Binop.arg2->tag == Iex_Const) {
2043 ae = LibVEX_Alloc(sizeof(AvailExpr));
2044 ae->tag = Btc;
2045 ae->u.Btc.op = e->Iex.Binop.op;
2046 ae->u.Btc.arg1 = e->Iex.Binop.arg1->Iex.Tmp.tmp;
2047 ae->u.Btc.con2 = *(e->Iex.Binop.arg2->Iex.Const.con);
2048 return ae;
2049 }
2050
2051 if (e->tag == Iex_Binop
2052 && e->Iex.Binop.arg1->tag == Iex_Const
2053 && e->Iex.Binop.arg2->tag == Iex_Tmp) {
2054 ae = LibVEX_Alloc(sizeof(AvailExpr));
2055 ae->tag = Bct;
2056 ae->u.Bct.op = e->Iex.Binop.op;
2057 ae->u.Bct.arg2 = e->Iex.Binop.arg2->Iex.Tmp.tmp;
2058 ae->u.Bct.con1 = *(e->Iex.Binop.arg1->Iex.Const.con);
2059 return ae;
2060 }
2061
2062 if (e->tag == Iex_Const
2063 && e->Iex.Const.con->tag == Ico_F64i) {
2064 ae = LibVEX_Alloc(sizeof(AvailExpr));
2065 ae->tag = Cf64i;
2066 ae->u.Cf64i.f64i = e->Iex.Const.con->Ico.F64i;
2067 return ae;
2068 }
2069
2070 return NULL;
2071}
2072
2073
2074/* The BB is modified in-place. */
2075
2076void do_cse_BB ( IRBB* bb )
2077{
2078 Int i, j;
2079 IRTemp t, q;
2080 IRStmt* st;
2081 AvailExpr* eprime;
2082
2083 HashHW* tenv = newHHW(); /* :: IRTemp -> IRTemp */
2084 HashHW* aenv = newHHW(); /* :: AvailExpr* -> IRTemp */
2085
2086 vassert(sizeof(IRTemp) <= sizeof(HWord));
2087
2088 //ppIRBB(bb);
2089 //vex_printf("\n\n");
2090
2091 /* Iterate forwards over the stmts.
2092 On seeing "t = E", where E is one of the 3 AvailExpr forms:
2093 let E' = apply tenv substitution to E
2094 search aenv for E'
2095 if a mapping E' -> q is found,
2096 replace this stmt by "t = q"
2097 and add binding t -> q to tenv
2098 else
2099 add binding E' -> t to aenv
2100 replace this stmt by "t = E'"
2101 Ignore any other kind of stmt.
2102 */
2103 for (i = 0; i < bb->stmts_used; i++) {
2104 st = bb->stmts[i];
2105
2106 /* ignore not-interestings */
sewardj8bee6d12005-03-22 02:24:05 +00002107 if (st->tag != Ist_Tmp)
sewardj08210532004-12-29 17:09:11 +00002108 continue;
2109
2110 t = st->Ist.Tmp.tmp;
2111 eprime = irExpr_to_AvailExpr(st->Ist.Tmp.data);
2112 /* ignore if not of AvailExpr form */
2113 if (!eprime)
2114 continue;
2115
2116 /* vex_printf("considering: " ); ppIRStmt(st); vex_printf("\n"); */
2117
2118 /* apply tenv */
2119 subst_AvailExpr( tenv, eprime );
2120
2121 /* search aenv for eprime, unfortunately the hard way */
2122 for (j = 0; j < aenv->used; j++)
2123 if (aenv->inuse[j] && eq_AvailExpr(eprime, (AvailExpr*)aenv->key[j]))
2124 break;
2125
2126 if (j < aenv->used) {
2127 /* A binding E' -> q was found. Replace stmt by "t = q" and
2128 note the t->q binding in tenv. */
2129 /* (this is the core of the CSE action) */
2130 q = (IRTemp)aenv->val[j];
2131 bb->stmts[i] = IRStmt_Tmp( t, IRExpr_Tmp(q) );
2132 addToHHW( tenv, (HWord)t, (HWord)q );
2133 } else {
2134 /* No binding was found, so instead we add E' -> t to our
2135 collection of available expressions, replace this stmt
2136 with "t = E'", and move on. */
2137 bb->stmts[i] = IRStmt_Tmp( t, availExpr_to_IRExpr(eprime) );
2138 addToHHW( aenv, (HWord)eprime, (HWord)t );
2139 }
2140 }
2141
2142 //ppIRBB(bb);
2143 //sanityCheckIRBB(bb, Ity_I32);
2144 //vex_printf("\n\n");
2145
2146}
2147
2148
2149/*---------------------------------------------------------------*/
2150/*--- Add32/Sub32 chain collapsing ---*/
2151/*---------------------------------------------------------------*/
2152
2153/* ----- Helper functions for Add32/Sub32 chain collapsing ----- */
2154
2155/* Is this expression "Add32(tmp,const)" or "Sub32(tmp,const)" ? If
2156 yes, set *tmp and *i32 appropriately. *i32 is set as if the
2157 root node is Add32, not Sub32. */
2158
2159static Bool isAdd32OrSub32 ( IRExpr* e, IRTemp* tmp, Int* i32 )
2160{
2161 if (e->tag != Iex_Binop)
2162 return False;
2163 if (e->Iex.Binop.op != Iop_Add32 && e->Iex.Binop.op != Iop_Sub32)
2164 return False;
2165 if (e->Iex.Binop.arg1->tag != Iex_Tmp)
2166 return False;
2167 if (e->Iex.Binop.arg2->tag != Iex_Const)
2168 return False;
2169 *tmp = e->Iex.Binop.arg1->Iex.Tmp.tmp;
2170 *i32 = (Int)(e->Iex.Binop.arg2->Iex.Const.con->Ico.U32);
2171 if (e->Iex.Binop.op == Iop_Sub32)
2172 *i32 = -*i32;
2173 return True;
2174}
2175
2176
2177/* Figure out if tmp can be expressed as tmp2 +32 const, for some
2178 other tmp2. Scan backwards from the specified start point -- an
2179 optimisation. */
2180
2181static Bool collapseChain ( IRBB* bb, Int startHere,
2182 IRTemp tmp,
2183 IRTemp* tmp2, Int* i32 )
2184{
2185 Int j, ii;
2186 IRTemp vv;
2187 IRStmt* st;
2188 IRExpr* e;
2189
2190 /* the (var, con) pair contain the current 'representation' for
2191 'tmp'. We start with 'tmp + 0'. */
2192 IRTemp var = tmp;
2193 Int con = 0;
2194
2195 /* Scan backwards to see if tmp can be replaced by some other tmp
2196 +/- a constant. */
2197 for (j = startHere; j >= 0; j--) {
2198 st = bb->stmts[j];
sewardj8bee6d12005-03-22 02:24:05 +00002199 if (st->tag != Ist_Tmp)
sewardj08210532004-12-29 17:09:11 +00002200 continue;
2201 if (st->Ist.Tmp.tmp != var)
2202 continue;
2203 e = st->Ist.Tmp.data;
2204 if (!isAdd32OrSub32(e, &vv, &ii))
2205 break;
2206 var = vv;
2207 con += ii;
2208 }
2209 if (j == -1)
2210 /* no earlier binding for var .. ill-formed IR */
2211 vpanic("collapseChain");
2212
2213 /* so, did we find anything interesting? */
2214 if (var == tmp)
2215 return False; /* no .. */
2216
2217 *tmp2 = var;
2218 *i32 = con;
2219 return True;
2220}
2221
2222
2223/* ------- Main function for Add32/Sub32 chain collapsing ------ */
2224
2225static void collapse_AddSub_chains_BB ( IRBB* bb )
2226{
2227 IRStmt *st;
2228 IRTemp var, var2;
2229 Int i, con, con2;
2230
2231 for (i = bb->stmts_used-1; i >= 0; i--) {
2232 st = bb->stmts[i];
sewardj8bee6d12005-03-22 02:24:05 +00002233 if (st->tag == Ist_NoOp)
sewardj08210532004-12-29 17:09:11 +00002234 continue;
2235
2236 /* Try to collapse 't1 = Add32/Sub32(t2, con)'. */
2237
2238 if (st->tag == Ist_Tmp
2239 && isAdd32OrSub32(st->Ist.Tmp.data, &var, &con)) {
2240
2241 /* So e1 is of the form Add32(var,con) or Sub32(var,-con).
2242 Find out if var can be expressed as var2 + con2. */
2243 if (collapseChain(bb, i-1, var, &var2, &con2)) {
2244 if (DEBUG_IROPT) {
2245 vex_printf("replacing1 ");
2246 ppIRStmt(st);
2247 vex_printf(" with ");
2248 }
2249 con2 += con;
2250 bb->stmts[i]
2251 = IRStmt_Tmp(
2252 st->Ist.Tmp.tmp,
2253 (con2 >= 0)
2254 ? IRExpr_Binop(Iop_Add32,
2255 IRExpr_Tmp(var2),
2256 IRExpr_Const(IRConst_U32(con2)))
2257 : IRExpr_Binop(Iop_Sub32,
2258 IRExpr_Tmp(var2),
2259 IRExpr_Const(IRConst_U32(-con2)))
2260 );
2261 if (DEBUG_IROPT) {
2262 ppIRStmt(bb->stmts[i]);
2263 vex_printf("\n");
2264 }
2265 }
2266
2267 continue;
2268 }
2269
2270 /* Try to collapse 't1 = GetI[t2, con]'. */
2271
2272 if (st->tag == Ist_Tmp
2273 && st->Ist.Tmp.data->tag == Iex_GetI
2274 && st->Ist.Tmp.data->Iex.GetI.ix->tag == Iex_Tmp
2275 && collapseChain(bb, i-1, st->Ist.Tmp.data->Iex.GetI.ix
2276 ->Iex.Tmp.tmp, &var2, &con2)) {
2277 if (DEBUG_IROPT) {
2278 vex_printf("replacing3 ");
2279 ppIRStmt(st);
2280 vex_printf(" with ");
2281 }
2282 con2 += st->Ist.Tmp.data->Iex.GetI.bias;
2283 bb->stmts[i]
2284 = IRStmt_Tmp(
2285 st->Ist.Tmp.tmp,
2286 IRExpr_GetI(st->Ist.Tmp.data->Iex.GetI.descr,
2287 IRExpr_Tmp(var2),
2288 con2));
2289 if (DEBUG_IROPT) {
2290 ppIRStmt(bb->stmts[i]);
2291 vex_printf("\n");
2292 }
2293 continue;
2294 }
2295
2296 /* Perhaps st is PutI[t, con] ? */
2297
2298 if (st->tag == Ist_PutI
2299 && st->Ist.PutI.ix->tag == Iex_Tmp
2300 && collapseChain(bb, i-1, st->Ist.PutI.ix->Iex.Tmp.tmp,
2301 &var2, &con2)) {
2302 if (DEBUG_IROPT) {
2303 vex_printf("replacing2 ");
2304 ppIRStmt(st);
2305 vex_printf(" with ");
2306 }
2307 con2 += st->Ist.PutI.bias;
2308 bb->stmts[i]
2309 = IRStmt_PutI(st->Ist.PutI.descr,
2310 IRExpr_Tmp(var2),
2311 con2,
2312 st->Ist.PutI.data);
2313 if (DEBUG_IROPT) {
2314 ppIRStmt(bb->stmts[i]);
2315 vex_printf("\n");
2316 }
2317 continue;
2318 }
2319
2320 } /* for */
2321}
2322
2323
2324/*---------------------------------------------------------------*/
2325/*--- PutI/GetI transformations ---*/
2326/*---------------------------------------------------------------*/
2327
2328/* ------- Helper functions for PutI/GetI transformations ------ */
2329
sewardj08210532004-12-29 17:09:11 +00002330/* Determine, to the extent possible, the relationship between two
2331 guest state accesses. The possible outcomes are:
2332
2333 * Exact alias. These two accesses denote precisely the same
2334 piece of the guest state.
2335
2336 * Definitely no alias. These two accesses are guaranteed not to
2337 overlap any part of the guest state.
2338
2339 * Unknown -- if neither of the above can be established.
2340
2341 If in doubt, return Unknown. */
2342
2343typedef
2344 enum { ExactAlias, NoAlias, UnknownAlias }
2345 GSAliasing;
2346
2347
2348/* Produces the alias relation between an indexed guest
2349 state access and a non-indexed access. */
2350
2351static
2352GSAliasing getAliasingRelation_IC ( IRArray* descr1, IRExpr* ix1,
2353 Int offset2, IRType ty2 )
2354{
2355 UInt minoff1, maxoff1, minoff2, maxoff2;
2356
2357 getArrayBounds( descr1, &minoff1, &maxoff1 );
2358 minoff2 = offset2;
2359 maxoff2 = minoff2 + sizeofIRType(ty2) - 1;
2360
2361 if (maxoff1 < minoff2 || maxoff2 < minoff1)
2362 return NoAlias;
2363
2364 /* Could probably do better here if required. For the moment
2365 however just claim not to know anything more. */
2366 return UnknownAlias;
2367}
2368
2369
2370/* Produces the alias relation between two indexed guest state
2371 accesses. */
2372
2373static
2374GSAliasing getAliasingRelation_II (
2375 IRArray* descr1, IRExpr* ix1, Int bias1,
2376 IRArray* descr2, IRExpr* ix2, Int bias2
2377 )
2378{
2379 UInt minoff1, maxoff1, minoff2, maxoff2;
2380 Int iters;
2381
2382 /* First try hard to show they don't alias. */
2383 getArrayBounds( descr1, &minoff1, &maxoff1 );
2384 getArrayBounds( descr2, &minoff2, &maxoff2 );
2385 if (maxoff1 < minoff2 || maxoff2 < minoff1)
2386 return NoAlias;
2387
2388 /* So the two arrays at least partially overlap. To get any
2389 further we'll have to be sure that the descriptors are
2390 identical. */
2391 if (!eqIRArray(descr1, descr2))
2392 return UnknownAlias;
2393
2394 /* The descriptors are identical. Now the only difference can be
2395 in the index expressions. If they cannot be shown to be
2396 identical, we have to say we don't know what the aliasing
2397 relation will be. Now, since the IR is flattened, the index
2398 expressions should be atoms -- either consts or tmps. So that
2399 makes the comparison simple. */
sewardj496a58d2005-03-20 18:44:44 +00002400 vassert(isIRAtom(ix1));
2401 vassert(isIRAtom(ix2));
2402 if (!eqIRAtom(ix1,ix2))
sewardj08210532004-12-29 17:09:11 +00002403 return UnknownAlias;
2404
2405 /* Ok, the index expressions are identical. So now the only way
2406 they can be different is in the bias. Normalise this
2407 paranoidly, to reliably establish equality/non-equality. */
2408
2409 /* So now we know that the GetI and PutI index the same array
2410 with the same base. Are the offsets the same, modulo the
2411 array size? Do this paranoidly. */
2412 vassert(descr1->nElems == descr2->nElems);
2413 vassert(descr1->elemTy == descr2->elemTy);
2414 vassert(descr1->base == descr2->base);
2415 iters = 0;
2416 while (bias1 < 0 || bias2 < 0) {
2417 bias1 += descr1->nElems;
2418 bias2 += descr1->nElems;
2419 iters++;
2420 if (iters > 10)
2421 vpanic("getAliasingRelation: iters");
2422 }
2423 vassert(bias1 >= 0 && bias2 >= 0);
2424 bias1 %= descr1->nElems;
2425 bias2 %= descr1->nElems;
2426 vassert(bias1 >= 0 && bias1 < descr1->nElems);
2427 vassert(bias2 >= 0 && bias2 < descr1->nElems);
2428
2429 /* Finally, biasP and biasG are normalised into the range
2430 0 .. descrP/G->nElems - 1. And so we can establish
2431 equality/non-equality. */
2432
2433 return bias1==bias2 ? ExactAlias : NoAlias;
2434}
2435
2436
2437/* Given the parts (descr, tmp, bias) for a GetI, scan backwards from
2438 the given starting point to find, if any, a PutI which writes
2439 exactly the same piece of guest state, and so return the expression
2440 that the PutI writes. This is the core of PutI-GetI forwarding. */
2441
2442static
2443IRExpr* findPutI ( IRBB* bb, Int startHere,
2444 IRArray* descrG, IRExpr* ixG, Int biasG )
2445{
2446 Int j;
2447 IRStmt* st;
2448 GSAliasing relation;
2449
2450 if (0) {
2451 vex_printf("\nfindPutI ");
2452 ppIRArray(descrG);
2453 vex_printf(" ");
2454 ppIRExpr(ixG);
2455 vex_printf(" %d\n", biasG);
2456 }
2457
2458 /* Scan backwards in bb from startHere to find a suitable PutI
2459 binding for (descrG, ixG, biasG), if any. */
2460
2461 for (j = startHere; j >= 0; j--) {
2462 st = bb->stmts[j];
sewardj8bee6d12005-03-22 02:24:05 +00002463 if (st->tag == Ist_NoOp)
2464 continue;
sewardj08210532004-12-29 17:09:11 +00002465
2466 if (st->tag == Ist_Put) {
2467 /* Non-indexed Put. This can't give a binding, but we do
2468 need to check it doesn't invalidate the search by
2469 overlapping any part of the indexed guest state. */
2470
2471 relation
2472 = getAliasingRelation_IC(
2473 descrG, ixG,
2474 st->Ist.Put.offset,
2475 typeOfIRExpr(bb->tyenv,st->Ist.Put.data) );
2476
2477 if (relation == NoAlias) {
2478 /* we're OK; keep going */
2479 continue;
2480 } else {
2481 /* relation == UnknownAlias || relation == ExactAlias */
2482 /* If this assertion fails, we've found a Put which writes
2483 an area of guest state which is read by a GetI. Which
2484 is unlikely (although not per se wrong). */
2485 vassert(relation != ExactAlias);
2486 /* This Put potentially writes guest state that the GetI
2487 reads; we must fail. */
2488 return NULL;
2489 }
2490 }
2491
2492 if (st->tag == Ist_PutI) {
2493
2494 relation = getAliasingRelation_II(
2495 descrG, ixG, biasG,
2496 st->Ist.PutI.descr,
2497 st->Ist.PutI.ix,
2498 st->Ist.PutI.bias
2499 );
2500
2501 if (relation == NoAlias) {
2502 /* This PutI definitely doesn't overlap. Ignore it and
2503 keep going. */
2504 continue; /* the for j loop */
2505 }
2506
2507 if (relation == UnknownAlias) {
2508 /* We don't know if this PutI writes to the same guest
2509 state that the GetI, or not. So we have to give up. */
2510 return NULL;
2511 }
2512
2513 /* Otherwise, we've found what we're looking for. */
2514 vassert(relation == ExactAlias);
2515 return st->Ist.PutI.data;
2516
2517 } /* if (st->tag == Ist_PutI) */
2518
2519 if (st->tag == Ist_Dirty) {
2520 /* Be conservative. If the dirty call has any guest effects at
2521 all, give up. We could do better -- only give up if there
2522 are any guest writes/modifies. */
2523 if (st->Ist.Dirty.details->nFxState > 0)
2524 return NULL;
2525 }
2526
2527 } /* for */
2528
2529 /* No valid replacement was found. */
2530 return NULL;
2531}
2532
2533
2534
2535/* Assuming pi is a PutI stmt, is s2 identical to it (in the sense
2536 that it writes exactly the same piece of guest state) ? Safe
2537 answer: False. */
2538
2539static Bool identicalPutIs ( IRStmt* pi, IRStmt* s2 )
2540{
2541 vassert(pi->tag == Ist_PutI);
2542 if (s2->tag != Ist_PutI)
2543 return False;
2544
sewardj9d2e7692005-02-07 01:11:31 +00002545 return toBool(
2546 getAliasingRelation_II(
sewardj08210532004-12-29 17:09:11 +00002547 pi->Ist.PutI.descr, pi->Ist.PutI.ix, pi->Ist.PutI.bias,
2548 s2->Ist.PutI.descr, s2->Ist.PutI.ix, s2->Ist.PutI.bias
2549 )
sewardj9d2e7692005-02-07 01:11:31 +00002550 == ExactAlias
2551 );
sewardj08210532004-12-29 17:09:11 +00002552}
2553
2554
2555/* Assuming pi is a PutI stmt, is s2 a Get/GetI/Put/PutI which might
2556 overlap it? Safe answer: True. Note, we could do a lot better
2557 than this if needed. */
2558
2559static
2560Bool guestAccessWhichMightOverlapPutI (
2561 IRTypeEnv* tyenv, IRStmt* pi, IRStmt* s2
2562 )
2563{
2564 GSAliasing relation;
2565 UInt minoffP, maxoffP;
2566
2567 vassert(pi->tag == Ist_PutI);
2568 getArrayBounds(pi->Ist.PutI.descr, &minoffP, &maxoffP);
2569 switch (s2->tag) {
2570
sewardjd2445f62005-03-21 00:15:53 +00002571 case Ist_NoOp:
sewardjf1689312005-03-16 18:19:10 +00002572 case Ist_IMark:
2573 return False;
2574
sewardjbb3f52d2005-01-07 14:14:50 +00002575 case Ist_MFence:
2576 /* just be paranoid ... this should be rare. */
2577 return True;
2578
sewardj08210532004-12-29 17:09:11 +00002579 case Ist_Dirty:
2580 /* If the dirty call has any guest effects at all, give up.
2581 Probably could do better. */
2582 if (s2->Ist.Dirty.details->nFxState > 0)
2583 return True;
2584 return False;
2585
2586 case Ist_Put:
sewardj496a58d2005-03-20 18:44:44 +00002587 vassert(isIRAtom(s2->Ist.Put.data));
sewardj08210532004-12-29 17:09:11 +00002588 relation
2589 = getAliasingRelation_IC(
2590 pi->Ist.PutI.descr, pi->Ist.PutI.ix,
2591 s2->Ist.Put.offset,
2592 typeOfIRExpr(tyenv,s2->Ist.Put.data)
2593 );
2594 goto have_relation;
2595
2596 case Ist_PutI:
sewardj496a58d2005-03-20 18:44:44 +00002597 vassert(isIRAtom(s2->Ist.PutI.ix));
2598 vassert(isIRAtom(s2->Ist.PutI.data));
sewardj08210532004-12-29 17:09:11 +00002599 relation
2600 = getAliasingRelation_II(
2601 pi->Ist.PutI.descr, pi->Ist.PutI.ix, pi->Ist.PutI.bias,
2602 s2->Ist.PutI.descr, s2->Ist.PutI.ix, s2->Ist.PutI.bias
2603 );
2604 goto have_relation;
2605
2606 case Ist_Tmp:
2607 if (s2->Ist.Tmp.data->tag == Iex_GetI) {
2608 relation
2609 = getAliasingRelation_II(
2610 pi->Ist.PutI.descr, pi->Ist.PutI.ix,
2611 pi->Ist.PutI.bias,
2612 s2->Ist.Tmp.data->Iex.GetI.descr,
2613 s2->Ist.Tmp.data->Iex.GetI.ix,
2614 s2->Ist.Tmp.data->Iex.GetI.bias
2615 );
2616 goto have_relation;
2617 }
2618 if (s2->Ist.Tmp.data->tag == Iex_Get) {
2619 relation
2620 = getAliasingRelation_IC(
2621 pi->Ist.PutI.descr, pi->Ist.PutI.ix,
2622 s2->Ist.Tmp.data->Iex.Get.offset,
2623 s2->Ist.Tmp.data->Iex.Get.ty
2624 );
2625 goto have_relation;
2626 }
2627 return False;
2628
2629 case Ist_STle:
sewardj496a58d2005-03-20 18:44:44 +00002630 vassert(isIRAtom(s2->Ist.STle.addr));
2631 vassert(isIRAtom(s2->Ist.STle.data));
sewardj08210532004-12-29 17:09:11 +00002632 return False;
2633
2634 default:
2635 vex_printf("\n"); ppIRStmt(s2); vex_printf("\n");
2636 vpanic("guestAccessWhichMightOverlapPutI");
2637 }
2638
2639 have_relation:
2640 if (relation == NoAlias)
2641 return False;
2642 else
2643 return True; /* ExactAlias or UnknownAlias */
2644}
2645
2646
2647
2648/* ---------- PutI/GetI transformations main functions --------- */
2649
2650/* Remove redundant GetIs, to the extent that they can be detected.
2651 bb is modified in-place. */
2652
2653static
2654void do_redundant_GetI_elimination ( IRBB* bb )
2655{
2656 Int i;
2657 IRStmt* st;
2658
2659 for (i = bb->stmts_used-1; i >= 0; i--) {
2660 st = bb->stmts[i];
sewardj8bee6d12005-03-22 02:24:05 +00002661 if (st->tag == Ist_NoOp)
sewardj08210532004-12-29 17:09:11 +00002662 continue;
2663
2664 if (st->tag == Ist_Tmp
2665 && st->Ist.Tmp.data->tag == Iex_GetI
2666 && st->Ist.Tmp.data->Iex.GetI.ix->tag == Iex_Tmp) {
2667 IRArray* descr = st->Ist.Tmp.data->Iex.GetI.descr;
2668 IRExpr* ix = st->Ist.Tmp.data->Iex.GetI.ix;
2669 Int bias = st->Ist.Tmp.data->Iex.GetI.bias;
2670 IRExpr* replacement = findPutI(bb, i-1, descr, ix, bias);
2671 if (replacement
sewardj496a58d2005-03-20 18:44:44 +00002672 && isIRAtom(replacement)
sewardj08210532004-12-29 17:09:11 +00002673 /* Make sure we're doing a type-safe transformation! */
2674 && typeOfIRExpr(bb->tyenv, replacement) == descr->elemTy) {
2675 if (DEBUG_IROPT) {
2676 vex_printf("rGI: ");
2677 ppIRExpr(st->Ist.Tmp.data);
2678 vex_printf(" -> ");
2679 ppIRExpr(replacement);
2680 vex_printf("\n");
2681 }
2682 bb->stmts[i] = IRStmt_Tmp(st->Ist.Tmp.tmp, replacement);
2683 }
2684 }
2685 }
2686
2687}
2688
2689
2690/* Remove redundant PutIs, to the extent which they can be detected.
2691 bb is modified in-place. */
2692
2693static
2694void do_redundant_PutI_elimination ( IRBB* bb )
2695{
2696 Int i, j;
2697 Bool delete;
2698 IRStmt *st, *stj;
2699
2700 for (i = 0; i < bb->stmts_used; i++) {
2701 st = bb->stmts[i];
sewardj8bee6d12005-03-22 02:24:05 +00002702 if (st->tag != Ist_PutI)
sewardj08210532004-12-29 17:09:11 +00002703 continue;
2704 /* Ok, search forwards from here to see if we can find another
2705 PutI which makes this one redundant, and dodging various
2706 hazards. Search forwards:
2707 * If conditional exit, give up (because anything after that
2708 does not postdominate this put).
2709 * If a Get which might overlap, give up (because this PutI
2710 not necessarily dead).
2711 * If a Put which is identical, stop with success.
2712 * If a Put which might overlap, but is not identical, give up.
2713 * If a dirty helper call which might write guest state, give up.
2714 * If a Put which definitely doesn't overlap, or any other
2715 kind of stmt, continue.
2716 */
2717 delete = False;
2718 for (j = i+1; j < bb->stmts_used; j++) {
2719 stj = bb->stmts[j];
sewardj8bee6d12005-03-22 02:24:05 +00002720 if (stj->tag == Ist_NoOp)
sewardj08210532004-12-29 17:09:11 +00002721 continue;
2722 if (identicalPutIs(st, stj)) {
2723 /* success! */
2724 delete = True;
2725 break;
2726 }
2727 if (stj->tag == Ist_Exit)
2728 /* give up */
2729 break;
2730 if (st->tag == Ist_Dirty)
2731 /* give up; could do better here */
2732 break;
2733 if (guestAccessWhichMightOverlapPutI(bb->tyenv, st, stj))
2734 /* give up */
2735 break;
2736 }
2737
2738 if (delete) {
2739 if (DEBUG_IROPT) {
2740 vex_printf("rPI: ");
2741 ppIRStmt(st);
2742 vex_printf("\n");
2743 }
sewardjd2445f62005-03-21 00:15:53 +00002744 bb->stmts[i] = IRStmt_NoOp();
sewardj08210532004-12-29 17:09:11 +00002745 }
2746
2747 }
2748}
2749
2750
2751/*---------------------------------------------------------------*/
2752/*--- Loop unrolling ---*/
2753/*---------------------------------------------------------------*/
2754
2755/* Adjust all tmp values (names) in e by delta. e is destructively
2756 modified. */
2757
2758static void deltaIRExpr ( IRExpr* e, Int delta )
2759{
2760 Int i;
2761 switch (e->tag) {
2762 case Iex_Tmp:
2763 e->Iex.Tmp.tmp += delta;
2764 break;
2765 case Iex_Get:
2766 case Iex_Const:
2767 break;
2768 case Iex_GetI:
2769 deltaIRExpr(e->Iex.GetI.ix, delta);
2770 break;
2771 case Iex_Binop:
2772 deltaIRExpr(e->Iex.Binop.arg1, delta);
2773 deltaIRExpr(e->Iex.Binop.arg2, delta);
2774 break;
2775 case Iex_Unop:
2776 deltaIRExpr(e->Iex.Unop.arg, delta);
2777 break;
2778 case Iex_LDle:
2779 deltaIRExpr(e->Iex.LDle.addr, delta);
2780 break;
2781 case Iex_CCall:
2782 for (i = 0; e->Iex.CCall.args[i]; i++)
2783 deltaIRExpr(e->Iex.CCall.args[i], delta);
2784 break;
2785 case Iex_Mux0X:
2786 deltaIRExpr(e->Iex.Mux0X.cond, delta);
2787 deltaIRExpr(e->Iex.Mux0X.expr0, delta);
2788 deltaIRExpr(e->Iex.Mux0X.exprX, delta);
2789 break;
2790 default:
2791 vex_printf("\n"); ppIRExpr(e); vex_printf("\n");
2792 vpanic("deltaIRExpr");
2793 }
2794}
2795
2796/* Adjust all tmp values (names) in st by delta. st is destructively
2797 modified. */
2798
2799static void deltaIRStmt ( IRStmt* st, Int delta )
2800{
2801 Int i;
2802 IRDirty* d;
2803 switch (st->tag) {
sewardjd2445f62005-03-21 00:15:53 +00002804 case Ist_NoOp:
sewardjf1689312005-03-16 18:19:10 +00002805 case Ist_IMark:
2806 break;
sewardj08210532004-12-29 17:09:11 +00002807 case Ist_Put:
2808 deltaIRExpr(st->Ist.Put.data, delta);
2809 break;
2810 case Ist_PutI:
2811 deltaIRExpr(st->Ist.PutI.ix, delta);
2812 deltaIRExpr(st->Ist.PutI.data, delta);
2813 break;
2814 case Ist_Tmp:
2815 st->Ist.Tmp.tmp += delta;
2816 deltaIRExpr(st->Ist.Tmp.data, delta);
2817 break;
2818 case Ist_Exit:
2819 deltaIRExpr(st->Ist.Exit.guard, delta);
2820 break;
2821 case Ist_STle:
2822 deltaIRExpr(st->Ist.STle.addr, delta);
2823 deltaIRExpr(st->Ist.STle.data, delta);
2824 break;
2825 case Ist_Dirty:
2826 d = st->Ist.Dirty.details;
2827 deltaIRExpr(d->guard, delta);
2828 for (i = 0; d->args[i]; i++)
2829 deltaIRExpr(d->args[i], delta);
2830 if (d->tmp != IRTemp_INVALID)
2831 d->tmp += delta;
2832 if (d->mAddr)
2833 deltaIRExpr(d->mAddr, delta);
2834 break;
2835 default:
2836 vex_printf("\n"); ppIRStmt(st); vex_printf("\n");
2837 vpanic("deltaIRStmt");
2838 }
2839}
2840
2841
2842/* If possible, return a loop-unrolled version of bb0. The original
2843 is changed. If not possible, return NULL. */
2844
2845/* The two schemas considered are:
2846
2847 X: BODY; goto X
2848
2849 which unrolls to (eg) X: BODY;BODY; goto X
2850
2851 and
2852
2853 X: BODY; if (c) goto X; goto Y
2854 which trivially transforms to
2855 X: BODY; if (!c) goto Y; goto X;
2856 so it falls in the scope of the first case.
2857
2858 X and Y must be literal (guest) addresses.
2859*/
2860
2861static Int calc_unroll_factor( IRBB* bb )
2862{
2863 Int n_stmts, i;
2864
2865 n_stmts = 0;
sewardj0ddbf792005-04-04 23:12:54 +00002866 for (i = 0; i < bb->stmts_used; i++) {
2867 if (bb->stmts[i]->tag != Ist_NoOp)
2868 n_stmts++;
2869 }
sewardj08210532004-12-29 17:09:11 +00002870
2871 if (n_stmts <= vex_control.iropt_unroll_thresh/8) {
2872 if (vex_control.iropt_verbosity > 0)
2873 vex_printf("vex iropt: 8 x unrolling (%d sts -> %d sts)\n",
2874 n_stmts, 8* n_stmts);
2875 return 8;
2876 }
2877 if (n_stmts <= vex_control.iropt_unroll_thresh/4) {
2878 if (vex_control.iropt_verbosity > 0)
2879 vex_printf("vex iropt: 4 x unrolling (%d sts -> %d sts)\n",
2880 n_stmts, 4* n_stmts);
2881 return 4;
2882 }
2883
2884 if (n_stmts <= vex_control.iropt_unroll_thresh/2) {
2885 if (vex_control.iropt_verbosity > 0)
2886 vex_printf("vex iropt: 2 x unrolling (%d sts -> %d sts)\n",
2887 n_stmts, 2* n_stmts);
2888 return 2;
2889 }
2890
2891 if (vex_control.iropt_verbosity > 0)
2892 vex_printf("vex iropt: not unrolling (%d sts)\n", n_stmts);
2893
2894 return 1;
2895}
2896
2897
2898static IRBB* maybe_loop_unroll_BB ( IRBB* bb0, Addr64 my_addr )
2899{
2900 Int i, j, jmax, n_vars;
2901 Bool xxx_known;
2902 Addr64 xxx_value, yyy_value;
2903 IRExpr* udst;
2904 IRStmt* st;
2905 IRConst* con;
2906 IRBB *bb1, *bb2;
2907 Int unroll_factor;
2908
2909 if (vex_control.iropt_unroll_thresh <= 0)
2910 return NULL;
2911
2912 /* First off, figure out if we can unroll this loop. Do this
2913 without modifying bb0. */
2914
2915 if (bb0->jumpkind != Ijk_Boring)
2916 return NULL;
2917
2918 xxx_known = False;
2919 xxx_value = 0;
2920
2921 /* Extract the next-guest address. If it isn't a literal, we
2922 have to give up. */
2923
2924 udst = bb0->next;
2925 if (udst->tag == Iex_Const
2926 && (udst->Iex.Const.con->tag == Ico_U32
2927 || udst->Iex.Const.con->tag == Ico_U64)) {
2928 /* The BB ends in a jump to a literal location. */
2929 xxx_known = True;
2930 xxx_value = udst->Iex.Const.con->tag == Ico_U64
2931 ? udst->Iex.Const.con->Ico.U64
2932 : (Addr64)(udst->Iex.Const.con->Ico.U32);
2933 }
2934
2935 if (!xxx_known)
2936 return NULL;
2937
2938 /* Now we know the BB ends to a jump to a literal location. If
2939 it's a jump to itself (viz, idiom #1), move directly to the
2940 unrolling stage, first cloning the bb so the original isn't
2941 modified. */
2942 if (xxx_value == my_addr) {
2943 unroll_factor = calc_unroll_factor( bb0 );
2944 if (unroll_factor < 2)
2945 return NULL;
2946 bb1 = dopyIRBB( bb0 );
2947 bb0 = NULL;
2948 udst = NULL; /* is now invalid */
2949 goto do_unroll;
2950 }
2951
2952 /* Search for the second idiomatic form:
2953 X: BODY; if (c) goto X; goto Y
2954 We know Y, but need to establish that the last stmt
2955 is 'if (c) goto X'.
2956 */
2957 yyy_value = xxx_value;
2958 for (i = bb0->stmts_used-1; i >= 0; i--)
2959 if (bb0->stmts[i])
2960 break;
2961
2962 if (i < 0)
2963 return NULL; /* block with no stmts. Strange. */
2964
2965 st = bb0->stmts[i];
2966 if (st->tag != Ist_Exit)
2967 return NULL;
2968 if (st->Ist.Exit.jk != Ijk_Boring)
2969 return NULL;
2970
2971 con = st->Ist.Exit.dst;
2972 vassert(con->tag == Ico_U32 || con->tag == Ico_U64);
2973
2974 xxx_value = con->tag == Ico_U64
2975 ? st->Ist.Exit.dst->Ico.U64
2976 : (Addr64)(st->Ist.Exit.dst->Ico.U32);
2977
2978 /* If this assertion fails, we have some kind of type error. */
2979 vassert(con->tag == udst->Iex.Const.con->tag);
2980
2981 if (xxx_value != my_addr)
2982 /* We didn't find either idiom. Give up. */
2983 return NULL;
2984
2985 /* Ok, we found idiom #2. Copy the BB, switch around the xxx and
2986 yyy values (which makes it look like idiom #1), and go into
2987 unrolling proper. This means finding (again) the last stmt, in
2988 the copied BB. */
2989
2990 unroll_factor = calc_unroll_factor( bb0 );
2991 if (unroll_factor < 2)
2992 return NULL;
2993
2994 bb1 = dopyIRBB( bb0 );
2995 bb0 = NULL;
2996 udst = NULL; /* is now invalid */
2997 for (i = bb1->stmts_used-1; i >= 0; i--)
2998 if (bb1->stmts[i])
2999 break;
3000
3001 /* The next bunch of assertions should be true since we already
3002 found and checked the last stmt in the original bb. */
3003
3004 vassert(i >= 0);
3005
3006 st = bb1->stmts[i];
3007 vassert(st->tag == Ist_Exit);
3008
3009 con = st->Ist.Exit.dst;
3010 vassert(con->tag == Ico_U32 || con->tag == Ico_U64);
3011
3012 udst = bb1->next;
3013 vassert(udst->tag == Iex_Const);
3014 vassert(udst->Iex.Const.con->tag == Ico_U32
3015 || udst->Iex.Const.con->tag == Ico_U64);
3016 vassert(con->tag == udst->Iex.Const.con->tag);
3017
3018 /* switch the xxx and yyy fields around */
3019 if (con->tag == Ico_U64) {
3020 udst->Iex.Const.con->Ico.U64 = xxx_value;
3021 con->Ico.U64 = yyy_value;
3022 } else {
3023 udst->Iex.Const.con->Ico.U32 = (UInt)xxx_value;
cerion57b4c6d2005-02-22 11:07:35 +00003024 con->Ico.U32 = (UInt)yyy_value;
sewardj08210532004-12-29 17:09:11 +00003025 }
3026
3027 /* negate the test condition */
3028 st->Ist.Exit.guard
3029 = IRExpr_Unop(Iop_Not1,dopyIRExpr(st->Ist.Exit.guard));
3030
3031 /* --- The unroller proper. Both idioms are by now --- */
3032 /* --- now converted to idiom 1. --- */
3033
3034 do_unroll:
3035
3036 vassert(unroll_factor == 2
3037 || unroll_factor == 4
3038 || unroll_factor == 8);
3039
3040 jmax = unroll_factor==8 ? 3 : (unroll_factor==4 ? 2 : 1);
3041 for (j = 1; j <= jmax; j++) {
3042
3043 n_vars = bb1->tyenv->types_used;
3044
3045 bb2 = dopyIRBB(bb1);
3046 for (i = 0; i < n_vars; i++)
3047 (void)newIRTemp(bb1->tyenv, bb2->tyenv->types[i]);
3048
3049 for (i = 0; i < bb2->stmts_used; i++) {
sewardj08210532004-12-29 17:09:11 +00003050 /* deltaIRStmt destructively modifies the stmt, but
3051 that's OK since bb2 is a complete fresh copy of bb1. */
3052 deltaIRStmt(bb2->stmts[i], n_vars);
3053 addStmtToIRBB(bb1, bb2->stmts[i]);
3054 }
3055 }
3056
3057 if (DEBUG_IROPT) {
3058 vex_printf("\nUNROLLED (%llx)\n", my_addr);
3059 ppIRBB(bb1);
3060 vex_printf("\n");
3061 }
3062
3063 /* Flattening; sigh. The unroller succeeds in breaking flatness
3064 by negating the test condition. This should be fixed properly.
3065 For the moment use this shotgun approach. */
3066 return flatten_BB(bb1);
3067}
3068
3069
3070/*---------------------------------------------------------------*/
sewardj29632392004-08-22 02:38:11 +00003071/*--- The tree builder ---*/
3072/*---------------------------------------------------------------*/
3073
sewardj08210532004-12-29 17:09:11 +00003074/* This isn't part of IR optimisation. Really it's a pass done prior
3075 to instruction selection, which improves the code that the
3076 instruction selector can produce. */
3077
sewardj29632392004-08-22 02:38:11 +00003078typedef
3079 struct {
3080 Int occ; /* occurrence count for this tmp */
sewardj84ff0652004-08-23 16:16:08 +00003081 IRExpr* expr; /* expr it is bound to,
3082 or NULL if already 'used' */
sewardj29632392004-08-22 02:38:11 +00003083 Bool eDoesLoad; /* True <=> expr reads mem */
3084 Bool eDoesGet; /* True <=> expr reads guest state */
3085 Bool invalidateMe; /* used when dumping bindings */
3086 Int origPos; /* posn of the binder in the original bb */
3087 }
3088 TmpInfo;
3089
3090/* Given env :: IRTemp -> TmpInfo*
3091 Add the use-occurrences of temps in this expression
3092 to the environment.
3093*/
sewardjc9ad1152004-10-14 00:08:21 +00003094static void occCount_Temp ( TmpInfo** env, IRTemp tmp )
sewardj17442fe2004-09-20 14:54:28 +00003095{
sewardjc9ad1152004-10-14 00:08:21 +00003096 TmpInfo* ti = env[(Int)tmp];
3097 if (ti) {
sewardj17442fe2004-09-20 14:54:28 +00003098 ti->occ++;
3099 } else {
3100 ti = LibVEX_Alloc(sizeof(TmpInfo));
3101 ti->occ = 1;
3102 ti->expr = NULL;
3103 ti->eDoesLoad = False;
3104 ti->eDoesGet = False;
3105 ti->invalidateMe = False;
3106 ti->origPos = -1; /* filed in properly later */
sewardjc9ad1152004-10-14 00:08:21 +00003107 env[(Int)tmp] = ti;
sewardj17442fe2004-09-20 14:54:28 +00003108 }
3109}
3110
sewardjc9ad1152004-10-14 00:08:21 +00003111static void occCount_Expr ( TmpInfo** env, IRExpr* e )
sewardj29632392004-08-22 02:38:11 +00003112{
sewardj17442fe2004-09-20 14:54:28 +00003113 Int i;
sewardj29632392004-08-22 02:38:11 +00003114
3115 switch (e->tag) {
3116
3117 case Iex_Tmp: /* the only interesting case */
sewardj17442fe2004-09-20 14:54:28 +00003118 occCount_Temp(env, e->Iex.Tmp.tmp);
sewardj29632392004-08-22 02:38:11 +00003119 return;
3120
3121 case Iex_Mux0X:
3122 occCount_Expr(env, e->Iex.Mux0X.cond);
3123 occCount_Expr(env, e->Iex.Mux0X.expr0);
3124 occCount_Expr(env, e->Iex.Mux0X.exprX);
3125 return;
3126
3127 case Iex_Binop:
3128 occCount_Expr(env, e->Iex.Binop.arg1);
3129 occCount_Expr(env, e->Iex.Binop.arg2);
3130 return;
3131
3132 case Iex_Unop:
3133 occCount_Expr(env, e->Iex.Unop.arg);
3134 return;
3135
3136 case Iex_LDle:
3137 occCount_Expr(env, e->Iex.LDle.addr);
3138 return;
3139
3140 case Iex_CCall:
3141 for (i = 0; e->Iex.CCall.args[i]; i++)
3142 occCount_Expr(env, e->Iex.CCall.args[i]);
3143 return;
3144
sewardjf6501992004-08-27 11:58:24 +00003145 case Iex_GetI:
sewardjeeac8412004-11-02 00:26:55 +00003146 occCount_Expr(env, e->Iex.GetI.ix);
sewardjf6501992004-08-27 11:58:24 +00003147 return;
3148
sewardj29632392004-08-22 02:38:11 +00003149 case Iex_Const:
3150 case Iex_Get:
3151 return;
3152
3153 default:
3154 vex_printf("\n"); ppIRExpr(e); vex_printf("\n");
3155 vpanic("occCount_Expr");
3156 }
3157}
3158
3159
3160/* Given env :: IRTemp -> TmpInfo*
3161 Add the use-occurrences of temps in this expression
3162 to the environment.
3163*/
sewardjc9ad1152004-10-14 00:08:21 +00003164static void occCount_Stmt ( TmpInfo** env, IRStmt* st )
sewardj29632392004-08-22 02:38:11 +00003165{
sewardj17442fe2004-09-20 14:54:28 +00003166 Int i;
3167 IRDirty* d;
sewardj29632392004-08-22 02:38:11 +00003168 switch (st->tag) {
3169 case Ist_Tmp:
sewardj6d076362004-09-23 11:06:17 +00003170 occCount_Expr(env, st->Ist.Tmp.data);
sewardj29632392004-08-22 02:38:11 +00003171 return;
3172 case Ist_Put:
sewardj6d076362004-09-23 11:06:17 +00003173 occCount_Expr(env, st->Ist.Put.data);
sewardj29632392004-08-22 02:38:11 +00003174 return;
sewardjf6501992004-08-27 11:58:24 +00003175 case Ist_PutI:
sewardjeeac8412004-11-02 00:26:55 +00003176 occCount_Expr(env, st->Ist.PutI.ix);
sewardj2d3f77c2004-09-22 23:49:09 +00003177 occCount_Expr(env, st->Ist.PutI.data);
sewardjf6501992004-08-27 11:58:24 +00003178 return;
sewardj29632392004-08-22 02:38:11 +00003179 case Ist_STle:
3180 occCount_Expr(env, st->Ist.STle.addr);
3181 occCount_Expr(env, st->Ist.STle.data);
3182 return;
sewardj17442fe2004-09-20 14:54:28 +00003183 case Ist_Dirty:
3184 d = st->Ist.Dirty.details;
3185 if (d->mFx != Ifx_None)
3186 occCount_Expr(env, d->mAddr);
sewardjb8385d82004-11-02 01:34:15 +00003187 occCount_Expr(env, d->guard);
sewardj17442fe2004-09-20 14:54:28 +00003188 for (i = 0; d->args[i]; i++)
3189 occCount_Expr(env, d->args[i]);
3190 return;
sewardjd2445f62005-03-21 00:15:53 +00003191 case Ist_NoOp:
sewardjf1689312005-03-16 18:19:10 +00003192 case Ist_IMark:
sewardj3e838932005-01-07 12:09:15 +00003193 case Ist_MFence:
3194 return;
sewardj29632392004-08-22 02:38:11 +00003195 case Ist_Exit:
sewardj0276d4b2004-11-15 15:30:21 +00003196 occCount_Expr(env, st->Ist.Exit.guard);
sewardj29632392004-08-22 02:38:11 +00003197 return;
3198 default:
3199 vex_printf("\n"); ppIRStmt(st); vex_printf("\n");
3200 vpanic("occCount_Stmt");
3201 }
3202}
3203
sewardj17442fe2004-09-20 14:54:28 +00003204/* Look up a binding for tmp in the env. If found, return the bound
3205 expression, and set the env's binding to NULL so it is marked as
3206 used. If not found, return NULL. */
3207
sewardjc9ad1152004-10-14 00:08:21 +00003208static IRExpr* tbSubst_Temp ( TmpInfo** env, IRTemp tmp )
sewardj17442fe2004-09-20 14:54:28 +00003209{
3210 TmpInfo* ti;
sewardj17442fe2004-09-20 14:54:28 +00003211 IRExpr* e;
sewardjc9ad1152004-10-14 00:08:21 +00003212 ti = env[(Int)tmp];
3213 if (ti){
3214 e = ti->expr;
sewardj17442fe2004-09-20 14:54:28 +00003215 if (e) {
3216 ti->expr = NULL;
3217 return e;
3218 } else {
3219 return NULL;
3220 }
3221 } else {
3222 return NULL;
3223 }
3224}
sewardj29632392004-08-22 02:38:11 +00003225
3226/* Traverse e, looking for temps. For each observed temp, see if env
3227 contains a binding for the temp, and if so return the bound value.
3228 The env has the property that any binding it holds is
3229 'single-shot', so once a binding is used, it is marked as no longer
3230 available, by setting its .expr field to NULL. */
3231
sewardjc9ad1152004-10-14 00:08:21 +00003232static IRExpr* tbSubst_Expr ( TmpInfo** env, IRExpr* e )
sewardj29632392004-08-22 02:38:11 +00003233{
sewardjc41f1fb2004-08-22 09:48:08 +00003234 IRExpr* e2;
3235 IRExpr** args2;
3236 Int i;
sewardj29632392004-08-22 02:38:11 +00003237
3238 switch (e->tag) {
3239
sewardjc41f1fb2004-08-22 09:48:08 +00003240 case Iex_CCall:
sewardj695cff92004-10-13 14:50:14 +00003241 args2 = sopyIRExprVec(e->Iex.CCall.args);
sewardjc41f1fb2004-08-22 09:48:08 +00003242 for (i = 0; args2[i]; i++)
3243 args2[i] = tbSubst_Expr(env,args2[i]);
sewardj8ea867b2004-10-30 19:03:02 +00003244 return IRExpr_CCall(e->Iex.CCall.cee,
sewardjc41f1fb2004-08-22 09:48:08 +00003245 e->Iex.CCall.retty,
3246 args2
3247 );
sewardj29632392004-08-22 02:38:11 +00003248 case Iex_Tmp:
sewardj17442fe2004-09-20 14:54:28 +00003249 e2 = tbSubst_Temp(env, e->Iex.Tmp.tmp);
3250 return e2 ? e2 : e;
sewardjc41f1fb2004-08-22 09:48:08 +00003251 case Iex_Mux0X:
3252 return IRExpr_Mux0X(
3253 tbSubst_Expr(env, e->Iex.Mux0X.cond),
3254 tbSubst_Expr(env, e->Iex.Mux0X.expr0),
3255 tbSubst_Expr(env, e->Iex.Mux0X.exprX)
3256 );
3257 case Iex_Binop:
3258 return IRExpr_Binop(
3259 e->Iex.Binop.op,
3260 tbSubst_Expr(env, e->Iex.Binop.arg1),
3261 tbSubst_Expr(env, e->Iex.Binop.arg2)
3262 );
3263 case Iex_Unop:
3264 return IRExpr_Unop(
3265 e->Iex.Unop.op,
3266 tbSubst_Expr(env, e->Iex.Unop.arg)
3267 );
3268 case Iex_LDle:
3269 return IRExpr_LDle(
3270 e->Iex.LDle.ty,
sewardjf6501992004-08-27 11:58:24 +00003271 tbSubst_Expr(env, e->Iex.LDle.addr)
sewardjc41f1fb2004-08-22 09:48:08 +00003272 );
sewardjf6501992004-08-27 11:58:24 +00003273 case Iex_GetI:
3274 return IRExpr_GetI(
sewardj2d3f77c2004-09-22 23:49:09 +00003275 e->Iex.GetI.descr,
sewardjeeac8412004-11-02 00:26:55 +00003276 tbSubst_Expr(env, e->Iex.GetI.ix),
sewardj2d3f77c2004-09-22 23:49:09 +00003277 e->Iex.GetI.bias
3278 );
sewardjc41f1fb2004-08-22 09:48:08 +00003279 case Iex_Const:
sewardj29632392004-08-22 02:38:11 +00003280 case Iex_Get:
3281 return e;
3282 default:
3283 vex_printf("\n"); ppIRExpr(e); vex_printf("\n");
3284 vpanic("tbSubst_Expr");
3285 }
3286}
3287
3288/* Same deal as tbSubst_Expr, except for stmts. */
3289
sewardjc9ad1152004-10-14 00:08:21 +00003290static IRStmt* tbSubst_Stmt ( TmpInfo** env, IRStmt* st )
sewardj29632392004-08-22 02:38:11 +00003291{
sewardj17442fe2004-09-20 14:54:28 +00003292 Int i;
3293 IRDirty* d;
3294 IRDirty* d2;
sewardj29632392004-08-22 02:38:11 +00003295 switch (st->tag) {
sewardj17442fe2004-09-20 14:54:28 +00003296 case Ist_STle:
3297 return IRStmt_STle(
3298 tbSubst_Expr(env, st->Ist.STle.addr),
3299 tbSubst_Expr(env, st->Ist.STle.data)
3300 );
sewardj29632392004-08-22 02:38:11 +00003301 case Ist_Tmp:
3302 return IRStmt_Tmp(
3303 st->Ist.Tmp.tmp,
sewardj6d076362004-09-23 11:06:17 +00003304 tbSubst_Expr(env, st->Ist.Tmp.data)
sewardj29632392004-08-22 02:38:11 +00003305 );
3306 case Ist_Put:
3307 return IRStmt_Put(
3308 st->Ist.Put.offset,
sewardj6d076362004-09-23 11:06:17 +00003309 tbSubst_Expr(env, st->Ist.Put.data)
sewardj29632392004-08-22 02:38:11 +00003310 );
sewardjf6501992004-08-27 11:58:24 +00003311 case Ist_PutI:
3312 return IRStmt_PutI(
sewardj2d3f77c2004-09-22 23:49:09 +00003313 st->Ist.PutI.descr,
sewardjeeac8412004-11-02 00:26:55 +00003314 tbSubst_Expr(env, st->Ist.PutI.ix),
sewardj2d3f77c2004-09-22 23:49:09 +00003315 st->Ist.PutI.bias,
3316 tbSubst_Expr(env, st->Ist.PutI.data)
3317 );
sewardjf6501992004-08-27 11:58:24 +00003318
sewardj29632392004-08-22 02:38:11 +00003319 case Ist_Exit:
3320 return IRStmt_Exit(
sewardj0276d4b2004-11-15 15:30:21 +00003321 tbSubst_Expr(env, st->Ist.Exit.guard),
sewardj893aada2004-11-29 19:57:54 +00003322 st->Ist.Exit.jk,
sewardj29632392004-08-22 02:38:11 +00003323 st->Ist.Exit.dst
3324 );
sewardjf1689312005-03-16 18:19:10 +00003325 case Ist_IMark:
3326 return IRStmt_IMark(st->Ist.IMark.addr, st->Ist.IMark.len);
sewardjd2445f62005-03-21 00:15:53 +00003327 case Ist_NoOp:
3328 return IRStmt_NoOp();
sewardj3e838932005-01-07 12:09:15 +00003329 case Ist_MFence:
3330 return IRStmt_MFence();
sewardj17442fe2004-09-20 14:54:28 +00003331 case Ist_Dirty:
3332 d = st->Ist.Dirty.details;
3333 d2 = emptyIRDirty();
3334 *d2 = *d;
3335 if (d2->mFx != Ifx_None)
3336 d2->mAddr = tbSubst_Expr(env, d2->mAddr);
sewardjb8385d82004-11-02 01:34:15 +00003337 d2->guard = tbSubst_Expr(env, d2->guard);
sewardj17442fe2004-09-20 14:54:28 +00003338 for (i = 0; d2->args[i]; i++)
3339 d2->args[i] = tbSubst_Expr(env, d2->args[i]);
3340 return IRStmt_Dirty(d2);
sewardj29632392004-08-22 02:38:11 +00003341 default:
3342 vex_printf("\n"); ppIRStmt(st); vex_printf("\n");
3343 vpanic("tbSubst_Stmt");
3344 }
3345}
3346
3347
sewardj37d38012004-08-24 00:37:04 +00003348/* Traverse an expr, and detect if any part of it reads memory or does
3349 a Get. Be careful ... this really controls how much the
3350 tree-builder can reorder the code, so getting it right is critical.
3351*/
sewardj29632392004-08-22 02:38:11 +00003352static void setHints_Expr (Bool* doesLoad, Bool* doesGet, IRExpr* e )
3353{
sewardjc41f1fb2004-08-22 09:48:08 +00003354 Int i;
sewardj29632392004-08-22 02:38:11 +00003355 switch (e->tag) {
sewardjc41f1fb2004-08-22 09:48:08 +00003356 case Iex_CCall:
3357 for (i = 0; e->Iex.CCall.args[i]; i++)
3358 setHints_Expr(doesLoad, doesGet, e->Iex.CCall.args[i]);
3359 return;
3360 case Iex_Mux0X:
3361 setHints_Expr(doesLoad, doesGet, e->Iex.Mux0X.cond);
3362 setHints_Expr(doesLoad, doesGet, e->Iex.Mux0X.expr0);
3363 setHints_Expr(doesLoad, doesGet, e->Iex.Mux0X.exprX);
3364 return;
3365 case Iex_Binop:
3366 setHints_Expr(doesLoad, doesGet, e->Iex.Binop.arg1);
3367 setHints_Expr(doesLoad, doesGet, e->Iex.Binop.arg2);
3368 return;
3369 case Iex_Unop:
3370 setHints_Expr(doesLoad, doesGet, e->Iex.Unop.arg);
3371 return;
3372 case Iex_LDle:
sewardjc9ad1152004-10-14 00:08:21 +00003373 *doesLoad = True;
sewardjaba4fff2004-10-08 21:37:15 +00003374 setHints_Expr(doesLoad, doesGet, e->Iex.LDle.addr);
sewardjc41f1fb2004-08-22 09:48:08 +00003375 return;
sewardj29632392004-08-22 02:38:11 +00003376 case Iex_Get:
sewardjc9ad1152004-10-14 00:08:21 +00003377 *doesGet = True;
sewardj29632392004-08-22 02:38:11 +00003378 return;
sewardjf6501992004-08-27 11:58:24 +00003379 case Iex_GetI:
sewardjc9ad1152004-10-14 00:08:21 +00003380 *doesGet = True;
sewardjeeac8412004-11-02 00:26:55 +00003381 setHints_Expr(doesLoad, doesGet, e->Iex.GetI.ix);
sewardjf6501992004-08-27 11:58:24 +00003382 return;
sewardjc41f1fb2004-08-22 09:48:08 +00003383 case Iex_Tmp:
3384 case Iex_Const:
3385 return;
sewardj29632392004-08-22 02:38:11 +00003386 default:
3387 vex_printf("\n"); ppIRExpr(e); vex_printf("\n");
3388 vpanic("setHints_Expr");
3389 }
3390}
3391
3392
sewardjc9ad1152004-10-14 00:08:21 +00003393static void dumpInvalidated ( TmpInfo** env, IRBB* bb, /*INOUT*/Int* j )
sewardj29632392004-08-22 02:38:11 +00003394{
3395 Int k, oldest_op, oldest_k;
3396 TmpInfo* ti;
3397
3398 /* Dump all the bindings to marked as invalidated, in order. */
3399 while (True) {
3400
3401 /* find the oldest bind marked 'invalidateMe'. */
3402 oldest_op = 1<<30;
3403 oldest_k = 1<<30;
sewardjc9ad1152004-10-14 00:08:21 +00003404 for (k = 0; k < bb->tyenv->types_used; k++) {
3405 ti = env[k];
3406 if (!ti)
sewardj29632392004-08-22 02:38:11 +00003407 continue;
sewardj29632392004-08-22 02:38:11 +00003408 if (!ti->expr)
3409 continue;
3410 if (!ti->invalidateMe)
3411 continue;
sewardjc41f1fb2004-08-22 09:48:08 +00003412 /* vex_printf("FOUND INVAL %d %d\n", ti->origPos, oldest_op); */
sewardj29632392004-08-22 02:38:11 +00003413 if (ti->origPos < oldest_op) {
3414 oldest_op = ti->origPos;
3415 oldest_k = k;
3416 }
3417 }
3418
3419 /* No more binds to invalidate. */
3420 if (oldest_op == 1<<30)
sewardj8bee6d12005-03-22 02:24:05 +00003421 return;
sewardj29632392004-08-22 02:38:11 +00003422
3423 /* the oldest bind to invalidate has been identified */
3424 vassert(oldest_op != 1<<31 && oldest_k != 1<<31);
sewardjc9ad1152004-10-14 00:08:21 +00003425 ti = env[oldest_k];
sewardj29632392004-08-22 02:38:11 +00003426 vassert(ti->expr && ti->invalidateMe);
3427
3428 /* and invalidate it ... */
sewardjc9ad1152004-10-14 00:08:21 +00003429 bb->stmts[*j] = IRStmt_Tmp( (IRTemp)oldest_k, ti->expr );
sewardj29632392004-08-22 02:38:11 +00003430 /* vex_printf("**1 "); ppIRStmt(bb->stmts[*j]); vex_printf("\n"); */
3431 (*j)++;
3432 ti->invalidateMe = False;
3433 ti->expr = NULL; /* no longer available for substitution */
3434
3435 } /* loop which dumps the binds marked for invalidation */
3436}
3437
3438
3439
sewardj49651f42004-10-28 22:11:04 +00003440/* notstatic */ void do_treebuild_BB ( IRBB* bb )
sewardj29632392004-08-22 02:38:11 +00003441{
3442 Int i, j, k;
sewardj29632392004-08-22 02:38:11 +00003443 Bool invPut, invStore;
3444 IRStmt* st;
3445 IRStmt* st2;
3446 TmpInfo* ti;
3447 IRExpr* next2;
3448
3449 /* Mapping from IRTemp to TmpInfo*. */
sewardjc9ad1152004-10-14 00:08:21 +00003450 Int n_tmps = bb->tyenv->types_used;
3451 TmpInfo** env = LibVEX_Alloc(n_tmps * sizeof(TmpInfo*));
3452
3453 for (i = 0; i < n_tmps; i++)
3454 env[i] = NULL;
sewardj29632392004-08-22 02:38:11 +00003455
3456 /* Phase 1. Scan forwards in bb, counting use occurrences of each
3457 temp. Also count occurrences in the bb->next field. */
3458
3459 for (i = 0; i < bb->stmts_used; i++) {
3460 st = bb->stmts[i];
sewardj8bee6d12005-03-22 02:24:05 +00003461 if (st->tag == Ist_NoOp)
sewardj29632392004-08-22 02:38:11 +00003462 continue;
3463 occCount_Stmt( env, st );
3464 }
3465 occCount_Expr(env, bb->next );
3466
sewardjc9ad1152004-10-14 00:08:21 +00003467# if 0
sewardj29632392004-08-22 02:38:11 +00003468 for (i = 0; i < env->used; i++) {
3469 if (!env->inuse[i])
3470 continue;
3471 ppIRTemp( (IRTemp)(env->key[i]) );
3472 vex_printf(" used %d\n", ((TmpInfo*)env->val[i])->occ );
3473 }
sewardjc9ad1152004-10-14 00:08:21 +00003474# endif
sewardj29632392004-08-22 02:38:11 +00003475
3476 /* Phase 2. Fill in the origPos fields. */
3477
3478 for (i = 0; i < bb->stmts_used; i++) {
3479 st = bb->stmts[i];
sewardj29632392004-08-22 02:38:11 +00003480 if (st->tag != Ist_Tmp)
3481 continue;
3482
sewardjc9ad1152004-10-14 00:08:21 +00003483 ti = env[(Int)st->Ist.Tmp.tmp];
3484 if (!ti) {
sewardj84ff0652004-08-23 16:16:08 +00003485 vex_printf("\n");
3486 ppIRTemp(st->Ist.Tmp.tmp);
3487 vex_printf("\n");
sewardj29632392004-08-22 02:38:11 +00003488 vpanic("treebuild_BB (phase 2): unmapped IRTemp");
3489 }
sewardj29632392004-08-22 02:38:11 +00003490 ti->origPos = i;
3491 }
3492
3493 /* Phase 3. Scan forwards in bb.
3494
3495 On seeing 't = E', occ(t)==1,
3496 let E'=env(E), set t's binding to be E', and
3497 delete this stmt.
3498 Also examine E' and set the hints for E' appropriately
3499 (doesLoad? doesGet?)
3500
3501 On seeing any other stmt,
3502 let stmt' = env(stmt)
3503 remove from env any 't=E' binds invalidated by stmt
3504 emit the invalidated stmts
3505 emit stmt'
3506
3507 Apply env to bb->next.
3508 */
3509
3510 /* The stmts in bb are being reordered, and we are guaranteed to
3511 end up with no more than the number we started with. Use i to
3512 be the cursor of the current stmt examined and j <= i to be that
3513 for the current stmt being written.
3514 */
3515 j = 0;
3516 for (i = 0; i < bb->stmts_used; i++) {
3517 st = bb->stmts[i];
sewardj8bee6d12005-03-22 02:24:05 +00003518 if (st->tag == Ist_NoOp)
sewardj29632392004-08-22 02:38:11 +00003519 continue;
3520
3521 if (st->tag == Ist_Tmp) {
sewardje80679a2004-09-21 23:00:11 +00003522 /* vex_printf("acquire binding\n"); */
sewardjc9ad1152004-10-14 00:08:21 +00003523 ti = env[st->Ist.Tmp.tmp];
3524 if (!ti) {
sewardj29632392004-08-22 02:38:11 +00003525 vpanic("treebuild_BB (phase 2): unmapped IRTemp");
3526 }
sewardj29632392004-08-22 02:38:11 +00003527 if (ti->occ == 1) {
3528 /* ok, we have 't = E', occ(t)==1. Do the abovementioned actions. */
sewardj6d076362004-09-23 11:06:17 +00003529 IRExpr* e = st->Ist.Tmp.data;
sewardj29632392004-08-22 02:38:11 +00003530 IRExpr* e2 = tbSubst_Expr(env, e);
3531 ti->expr = e2;
3532 ti->eDoesLoad = ti->eDoesGet = False;
3533 setHints_Expr(&ti->eDoesLoad, &ti->eDoesGet, e2);
3534 /* don't advance j, as we are deleting this stmt and instead
3535 holding it temporarily in the env. */
3536 continue; /* the for (i = 0; i < bb->stmts_used; i++) loop */
3537 }
3538 }
3539
3540 /* we get here for any other kind of statement. */
3541 /* 'use up' any bindings required by the current statement. */
3542 st2 = tbSubst_Stmt(env, st);
3543
3544 /* Now, before this stmt, dump any bindings it invalidates.
3545 These need to be dumped in the order in which they originally
3546 appeared. (Stupid algorithm): first, mark all bindings which
3547 need to be dumped. Then, dump them in the order in which
3548 they were defined. */
sewardj3e838932005-01-07 12:09:15 +00003549
sewardj9d2e7692005-02-07 01:11:31 +00003550 invPut = toBool(st->tag == Ist_Put
3551 || st->tag == Ist_PutI
3552 || st->tag == Ist_Dirty);
sewardj3e838932005-01-07 12:09:15 +00003553
sewardj9d2e7692005-02-07 01:11:31 +00003554 invStore = toBool(st->tag == Ist_STle
3555 || st->tag == Ist_Dirty);
sewardj29632392004-08-22 02:38:11 +00003556
sewardjc9ad1152004-10-14 00:08:21 +00003557 for (k = 0; k < n_tmps; k++) {
3558 ti = env[k];
3559 if (!ti)
sewardj29632392004-08-22 02:38:11 +00003560 continue;
sewardj29632392004-08-22 02:38:11 +00003561 if (!ti->expr)
3562 continue;
3563
sewardj3e838932005-01-07 12:09:15 +00003564 /* Do we have to invalidate this binding? */
3565
sewardj29632392004-08-22 02:38:11 +00003566 ti->invalidateMe
sewardj9d2e7692005-02-07 01:11:31 +00003567 = toBool(
3568 /* a store invalidates loaded data */
sewardj4c5f6d52004-10-26 13:25:33 +00003569 (ti->eDoesLoad && invStore)
3570 /* a put invalidates get'd data */
3571 || (ti->eDoesGet && invPut)
3572 /* a put invalidates loaded data. Note, we could do
3573 much better here in the sense that we only need to
3574 invalidate trees containing loads if the Put in
3575 question is marked as requiring precise
3576 exceptions. */
sewardj3e838932005-01-07 12:09:15 +00003577 || (ti->eDoesLoad && invPut)
3578 /* probably overly conservative: a memory fence
3579 invalidates absolutely everything, so that all
3580 computation prior to it is forced to complete before
3581 proceeding with the fence. */
sewardj9d2e7692005-02-07 01:11:31 +00003582 || st->tag == Ist_MFence
3583 );
sewardjc41f1fb2004-08-22 09:48:08 +00003584 /*
3585 if (ti->invalidateMe)
sewardj3e838932005-01-07 12:09:15 +00003586 vex_printf("SET INVAL\n");
3587 */
sewardj29632392004-08-22 02:38:11 +00003588 }
3589
3590 dumpInvalidated ( env, bb, &j );
3591
3592 /* finally, emit the substituted statement */
3593 bb->stmts[j] = st2;
3594 /* vex_printf("**2 "); ppIRStmt(bb->stmts[j]); vex_printf("\n"); */
3595 j++;
3596
3597 vassert(j <= i+1);
3598 } /* for each stmt in the original bb ... */
3599
3600 /* Finally ... substitute the ->next field as much as possible, and
3601 dump any left-over bindings. Hmm. Perhaps there should be no
3602 left over bindings? Or any left-over bindings are
3603 by definition dead? */
3604 next2 = tbSubst_Expr(env, bb->next);
3605 bb->next = next2;
3606 bb->stmts_used = j;
3607}
3608
3609
sewardj695cff92004-10-13 14:50:14 +00003610/*---------------------------------------------------------------*/
sewardjedf4d692004-08-17 13:52:58 +00003611/*--- iropt main ---*/
3612/*---------------------------------------------------------------*/
3613
sewardj695cff92004-10-13 14:50:14 +00003614static Bool iropt_verbose = False; //True;
sewardj4345f7a2004-09-22 19:49:27 +00003615
3616
sewardj4345f7a2004-09-22 19:49:27 +00003617/* Do a simple cleanup pass on bb. This is: redundant Get removal,
3618 redundant Put removal, constant propagation, dead code removal,
3619 clean helper specialisation, and dead code removal (again).
sewardjb9230752004-12-29 19:25:06 +00003620*/
sewardj695cff92004-10-13 14:50:14 +00003621
sewardj4345f7a2004-09-22 19:49:27 +00003622
3623static
sewardj695cff92004-10-13 14:50:14 +00003624IRBB* cheap_transformations (
3625 IRBB* bb,
sewardj9d2e7692005-02-07 01:11:31 +00003626 IRExpr* (*specHelper) (HChar*, IRExpr**),
sewardj8d2291c2004-10-25 14:50:21 +00003627 Bool (*preciseMemExnsFn)(Int,Int)
3628 )
sewardj4345f7a2004-09-22 19:49:27 +00003629{
3630 redundant_get_removal_BB ( bb );
3631 if (iropt_verbose) {
3632 vex_printf("\n========= REDUNDANT GET\n\n" );
3633 ppIRBB(bb);
3634 }
sewardj044a2152004-10-21 10:22:10 +00003635
sewardj8d2291c2004-10-25 14:50:21 +00003636 redundant_put_removal_BB ( bb, preciseMemExnsFn );
sewardj4345f7a2004-09-22 19:49:27 +00003637 if (iropt_verbose) {
3638 vex_printf("\n========= REDUNDANT PUT\n\n" );
3639 ppIRBB(bb);
3640 }
sewardj044a2152004-10-21 10:22:10 +00003641
sewardj4345f7a2004-09-22 19:49:27 +00003642 bb = cprop_BB ( bb );
3643 if (iropt_verbose) {
3644 vex_printf("\n========= CPROPD\n\n" );
3645 ppIRBB(bb);
3646 }
3647
sewardj49651f42004-10-28 22:11:04 +00003648 do_deadcode_BB ( bb );
sewardj4345f7a2004-09-22 19:49:27 +00003649 if (iropt_verbose) {
3650 vex_printf("\n========= DEAD\n\n" );
3651 ppIRBB(bb);
3652 }
3653
sewardjb9230752004-12-29 19:25:06 +00003654 bb = spec_helpers_BB ( bb, specHelper );
sewardj49651f42004-10-28 22:11:04 +00003655 do_deadcode_BB ( bb );
sewardj4345f7a2004-09-22 19:49:27 +00003656 if (iropt_verbose) {
3657 vex_printf("\n========= SPECd \n\n" );
3658 ppIRBB(bb);
3659 }
3660
3661 return bb;
3662}
3663
sewardj695cff92004-10-13 14:50:14 +00003664
3665/* Do some more expensive transformations on bb, which are aimed at
3666 optimising as much as possible in the presence of GetI and PutI. */
3667
3668static
3669IRBB* expensive_transformations( IRBB* bb )
3670{
sewardjfe1ccfc2004-11-11 02:14:45 +00003671 do_cse_BB( bb );
sewardj695cff92004-10-13 14:50:14 +00003672 collapse_AddSub_chains_BB( bb );
sewardj08210532004-12-29 17:09:11 +00003673 do_redundant_GetI_elimination( bb );
sewardj695cff92004-10-13 14:50:14 +00003674 do_redundant_PutI_elimination( bb );
sewardj49651f42004-10-28 22:11:04 +00003675 do_deadcode_BB( bb );
sewardjb9230752004-12-29 19:25:06 +00003676 return bb;
sewardj695cff92004-10-13 14:50:14 +00003677}
3678
3679
sewardj4345f7a2004-09-22 19:49:27 +00003680/* Scan a flattened BB to see if it has any GetI or PutIs in it. Used
3681 as a heuristic hack to see if iropt needs to do expensive
sewardj695cff92004-10-13 14:50:14 +00003682 optimisations (CSE, PutI -> GetI forwarding, redundant PutI
3683 elimination) to improve code containing GetI or PutI. */
3684
sewardj4345f7a2004-09-22 19:49:27 +00003685static Bool hasGetIorPutI ( IRBB* bb )
3686{
3687 Int i, j;
3688 IRStmt* st;
3689 IRDirty* d;
3690
3691 for (i = 0; i < bb->stmts_used; i++) {
3692 st = bb->stmts[i];
sewardj4345f7a2004-09-22 19:49:27 +00003693 switch (st->tag) {
3694 case Ist_PutI:
3695 return True;
3696 case Ist_Tmp:
sewardj6d076362004-09-23 11:06:17 +00003697 if (st->Ist.Tmp.data->tag == Iex_GetI)
sewardj4345f7a2004-09-22 19:49:27 +00003698 return True;
3699 break;
3700 case Ist_Put:
sewardj496a58d2005-03-20 18:44:44 +00003701 vassert(isIRAtom(st->Ist.Put.data));
sewardj4345f7a2004-09-22 19:49:27 +00003702 break;
3703 case Ist_STle:
sewardj496a58d2005-03-20 18:44:44 +00003704 vassert(isIRAtom(st->Ist.STle.addr));
3705 vassert(isIRAtom(st->Ist.STle.data));
sewardj4345f7a2004-09-22 19:49:27 +00003706 break;
sewardj4345f7a2004-09-22 19:49:27 +00003707 case Ist_Dirty:
3708 d = st->Ist.Dirty.details;
sewardj496a58d2005-03-20 18:44:44 +00003709 vassert(isIRAtom(d->guard));
sewardj4345f7a2004-09-22 19:49:27 +00003710 for (j = 0; d->args[j]; j++)
sewardj496a58d2005-03-20 18:44:44 +00003711 vassert(isIRAtom(d->args[j]));
sewardj4345f7a2004-09-22 19:49:27 +00003712 if (d->mFx != Ifx_None)
sewardj496a58d2005-03-20 18:44:44 +00003713 vassert(isIRAtom(d->mAddr));
sewardj4345f7a2004-09-22 19:49:27 +00003714 break;
sewardjd2445f62005-03-21 00:15:53 +00003715 case Ist_NoOp:
sewardjf1689312005-03-16 18:19:10 +00003716 case Ist_IMark:
sewardj3e838932005-01-07 12:09:15 +00003717 case Ist_MFence:
3718 break;
3719 case Ist_Exit:
sewardj496a58d2005-03-20 18:44:44 +00003720 vassert(isIRAtom(st->Ist.Exit.guard));
sewardj3e838932005-01-07 12:09:15 +00003721 break;
sewardj4345f7a2004-09-22 19:49:27 +00003722 default:
3723 ppIRStmt(st);
3724 vpanic("hasGetIorPutI");
3725 }
3726
3727 }
3728 return False;
3729
3730}
3731
3732
sewardj695cff92004-10-13 14:50:14 +00003733/* ---------------- The main iropt entry point. ---------------- */
3734
sewardjedf4d692004-08-17 13:52:58 +00003735/* exported from this file */
sewardj695cff92004-10-13 14:50:14 +00003736/* Rules of the game:
3737
3738 - IRExpr/IRStmt trees should be treated as immutable, as they
3739 may get shared. So never change a field of such a tree node;
3740 instead construct and return a new one if needed.
3741*/
3742
sewardj4345f7a2004-09-22 19:49:27 +00003743
sewardj84ff0652004-08-23 16:16:08 +00003744IRBB* do_iropt_BB ( IRBB* bb0,
sewardj9d2e7692005-02-07 01:11:31 +00003745 IRExpr* (*specHelper) (HChar*, IRExpr**),
sewardj8d2291c2004-10-25 14:50:21 +00003746 Bool (*preciseMemExnsFn)(Int,Int),
sewardj695cff92004-10-13 14:50:14 +00003747 Addr64 guest_addr )
sewardjedf4d692004-08-17 13:52:58 +00003748{
sewardj9d2e7692005-02-07 01:11:31 +00003749 static Int n_total = 0;
3750 static Int n_expensive = 0;
sewardj29632392004-08-22 02:38:11 +00003751
sewardj695cff92004-10-13 14:50:14 +00003752 Bool do_expensive;
sewardj695cff92004-10-13 14:50:14 +00003753 IRBB *bb, *bb2;
sewardj8c2c10b2004-10-16 20:51:52 +00003754
sewardj4345f7a2004-09-22 19:49:27 +00003755 n_total++;
3756
3757 /* First flatten the block out, since all other
3758 phases assume flat code. */
3759
3760 bb = flatten_BB ( bb0 );
3761
3762 if (iropt_verbose) {
3763 vex_printf("\n========= FLAT\n\n" );
3764 ppIRBB(bb);
sewardj84be7372004-08-18 13:59:33 +00003765 }
sewardjd7217032004-08-19 10:49:10 +00003766
sewardj08210532004-12-29 17:09:11 +00003767 /* If at level 0, stop now. */
3768 if (vex_control.iropt_level <= 0) return bb;
3769
sewardj695cff92004-10-13 14:50:14 +00003770 /* Now do a preliminary cleanup pass, and figure out if we also
3771 need to do 'expensive' optimisations. Expensive optimisations
3772 are deemed necessary if the block contains any GetIs or PutIs.
3773 If needed, do expensive transformations and then another cheap
3774 cleanup pass. */
sewardj4345f7a2004-09-22 19:49:27 +00003775
sewardj8d2291c2004-10-25 14:50:21 +00003776 bb = cheap_transformations( bb, specHelper, preciseMemExnsFn );
sewardj695cff92004-10-13 14:50:14 +00003777
sewardj08613742004-10-25 13:01:45 +00003778 if (vex_control.iropt_level > 1) {
sewardj39555aa2004-10-24 22:29:19 +00003779 do_expensive = hasGetIorPutI(bb);
sewardj695cff92004-10-13 14:50:14 +00003780 if (do_expensive) {
sewardj39555aa2004-10-24 22:29:19 +00003781 n_expensive++;
sewardj39555aa2004-10-24 22:29:19 +00003782 if (DEBUG_IROPT)
3783 vex_printf("***** EXPENSIVE %d %d\n", n_total, n_expensive);
sewardj695cff92004-10-13 14:50:14 +00003784 bb = expensive_transformations( bb );
sewardj8d2291c2004-10-25 14:50:21 +00003785 bb = cheap_transformations( bb, specHelper, preciseMemExnsFn );
sewardj695cff92004-10-13 14:50:14 +00003786 }
sewardj39555aa2004-10-24 22:29:19 +00003787
3788 /* Now have a go at unrolling simple (single-BB) loops. If
3789 successful, clean up the results as much as possible. */
3790
3791 bb2 = maybe_loop_unroll_BB( bb, guest_addr );
3792 if (bb2) {
sewardj8d2291c2004-10-25 14:50:21 +00003793 bb = cheap_transformations( bb2, specHelper, preciseMemExnsFn );
sewardj39555aa2004-10-24 22:29:19 +00003794 if (do_expensive) {
3795 bb = expensive_transformations( bb );
sewardj8d2291c2004-10-25 14:50:21 +00003796 bb = cheap_transformations( bb, specHelper, preciseMemExnsFn );
sewardj39555aa2004-10-24 22:29:19 +00003797 } else {
3798 /* at least do CSE and dead code removal */
sewardjfe1ccfc2004-11-11 02:14:45 +00003799 do_cse_BB( bb );
sewardj49651f42004-10-28 22:11:04 +00003800 do_deadcode_BB( bb );
sewardj39555aa2004-10-24 22:29:19 +00003801 }
3802 if (0) vex_printf("vex iropt: unrolled a loop\n");
3803 }
3804
sewardjd7217032004-08-19 10:49:10 +00003805 }
3806
sewardj4345f7a2004-09-22 19:49:27 +00003807 return bb;
sewardjedf4d692004-08-17 13:52:58 +00003808}
3809
3810
sewardja1a370f2004-08-17 13:31:55 +00003811/*---------------------------------------------------------------*/
3812/*--- end ir/iropt.c ---*/
3813/*---------------------------------------------------------------*/