
/*--------------------------------------------------------------------*/
/*--- Instrument IR to perform memory checking operations.         ---*/
/*---                                               mc_translate.c ---*/
/*--------------------------------------------------------------------*/

/*
   This file is part of MemCheck, a heavyweight Valgrind tool for
   detecting memory errors.

   Copyright (C) 2000-2015 Julian Seward
      jseward@acm.org

   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License as
   published by the Free Software Foundation; either version 2 of the
   License, or (at your option) any later version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
   02111-1307, USA.

   The GNU General Public License is contained in the file COPYING.
*/

#include "pub_tool_basics.h"
#include "pub_tool_poolalloc.h"     // For mc_include.h
#include "pub_tool_hashtable.h"     // For mc_include.h
#include "pub_tool_libcassert.h"
#include "pub_tool_libcprint.h"
#include "pub_tool_tooliface.h"
#include "pub_tool_machine.h"       // VG_(fnptr_to_fnentry)
#include "pub_tool_xarray.h"
#include "pub_tool_mallocfree.h"
#include "pub_tool_libcbase.h"

#include "mc_include.h"


/* FIXMEs JRS 2011-June-16.

   Check the interpretation for vector narrowing and widening ops,
   particularly the saturating ones.  I suspect they are either overly
   pessimistic and/or wrong.

   Iop_QandSQsh64x2 and friends (vector-by-vector bidirectional
   saturating shifts): the interpretation is overly pessimistic.
   See comments on the relevant cases below for details.

   Iop_Sh64Sx2 and friends (vector-by-vector bidirectional shifts,
   both rounding and non-rounding variants): ditto
*/

/* This file implements the Memcheck instrumentation, and in
   particular contains the core of its undefined value detection
   machinery.  For a comprehensive background of the terminology,
   algorithms and rationale used herein, read:

     Using Valgrind to detect undefined value errors with
     bit-precision

     Julian Seward and Nicholas Nethercote

     2005 USENIX Annual Technical Conference (General Track),
     Anaheim, CA, USA, April 10-15, 2005.

   ----

   Here is as good a place as any to record exactly when V bits are and
   should be checked, why, and what function is responsible.


   Memcheck complains when an undefined value is used:

   1. In the condition of a conditional branch.  Because it could cause
      incorrect control flow, and thus cause incorrect externally-visible
      behaviour.  [mc_translate.c:complainIfUndefined]

   2. As an argument to a system call, or as the value that specifies
      the system call number.  Because it could cause an incorrect
      externally-visible side effect.  [mc_translate.c:mc_pre_reg_read]

   3. As the address in a load or store.  Because it could cause an
      incorrect value to be used later, which could cause externally-visible
      behaviour (eg. via incorrect control flow or an incorrect system call
      argument)  [complainIfUndefined]

   4. As the target address of a branch.  Because it could cause incorrect
      control flow.  [complainIfUndefined]

   5. As an argument to setenv, unsetenv, or putenv.  Because it could put
      an incorrect value into the external environment.
      [mc_replace_strmem.c:VG_WRAP_FUNCTION_ZU(*, *env)]

   6. As the index in a GETI or PUTI operation.  I'm not sure why... (njn).
      [complainIfUndefined]

   7. As an argument to the VALGRIND_CHECK_MEM_IS_DEFINED and
      VALGRIND_CHECK_VALUE_IS_DEFINED client requests.  Because the user
      requested it.  [in memcheck.h]


   Memcheck also complains, but should not, when an undefined value is used:

   8. As the shift value in certain SIMD shift operations (but not in the
      standard integer shift operations).  This inconsistency is due to
      historical reasons.  [complainIfUndefined]


   Memcheck does not complain, but should, when an undefined value is used:

   9. As an input to a client request.  Because the client request may
      affect the visible behaviour -- see bug #144362 for an example
      involving the malloc replacements in vg_replace_malloc.c and
      VALGRIND_NON_SIMD_CALL* requests, where an uninitialised argument
      isn't identified.  That bug report also has some info on how to solve
      the problem.  [valgrind.h:VALGRIND_DO_CLIENT_REQUEST]


   In practice, 1 and 2 account for the vast majority of cases.
*/

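The V-bit scheme above can be pictured with a toy model (the struct and function names here are invented for illustration and are not Valgrind code): each value carries a shadow word in which a set bit means "this bit of the value is undefined".

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the V-bit scheme -- invented types, not Valgrind code.
   Each value carries a shadow word in which a 1 bit means "this bit
   of the value is undefined". */
typedef struct { uint8_t val; uint8_t vbits; } Shadowed;

/* Copying a value copies its shadow unchanged. */
static Shadowed copy8 ( Shadowed x ) { return x; }

/* Case 1 above: a conditional branch consults every bit of the
   condition, so it must complain if any V bit is set. */
static int branch_would_complain ( Shadowed cond ) {
   return cond.vbits != 0;
}
```

The real machinery tracks V bits through every IR operation rather than just at copies, but the complaint condition at a branch is exactly this "any V bit set" test.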
/* Generation of addr-definedness, addr-validity and
   guard-definedness checks pertaining to loads and stores (Iex_Load,
   Ist_Store, IRLoadG, IRStoreG, LLSC, CAS and Dirty memory
   loads/stores) was re-checked 11 May 2013. */

/*------------------------------------------------------------*/
/*--- Forward decls                                        ---*/
/*------------------------------------------------------------*/

struct _MCEnv;

static IRType  shadowTypeV ( IRType ty );
static IRExpr* expr2vbits ( struct _MCEnv* mce, IRExpr* e );
static IRTemp  findShadowTmpB ( struct _MCEnv* mce, IRTemp orig );

static IRExpr *i128_const_zero(void);

/*------------------------------------------------------------*/
/*--- Memcheck running state, and tmp management.          ---*/
/*------------------------------------------------------------*/

/* Carries info about a particular tmp.  The tmp's number is not
   recorded, as this is implied by (equal to) its index in the tmpMap
   in MCEnv.  The tmp's type is also not recorded, as this is present
   in MCEnv.sb->tyenv.

   When .kind is Orig, .shadowV and .shadowB may give the identities
   of the temps currently holding the associated definedness (shadowV)
   and origin (shadowB) values, or these may be IRTemp_INVALID if code
   to compute such values has not yet been emitted.

   When .kind is VSh or BSh then the tmp holds a V- or B- value,
   and so .shadowV and .shadowB must be IRTemp_INVALID, since it is
   illogical for a shadow tmp itself to be shadowed.
*/
typedef
   enum { Orig=1, VSh=2, BSh=3 }
   TempKind;

typedef
   struct {
      TempKind kind;
      IRTemp   shadowV;
      IRTemp   shadowB;
   }
   TempMapEnt;

/* Carries around state during memcheck instrumentation. */
typedef
   struct _MCEnv {
      /* MODIFIED: the superblock being constructed.  IRStmts are
         added. */
      IRSB* sb;
      Bool  trace;

      /* MODIFIED: a table [0 .. #temps_in_sb-1] which gives the
         current kind and possibly shadow temps for each temp in the
         IRSB being constructed.  Note that it does not contain the
         type of each tmp.  If you want to know the type, look at the
         relevant entry in sb->tyenv.  It follows that at all times
         during the instrumentation process, the valid indices for
         tmpMap and sb->tyenv are identical, being 0 .. N-1 where N is
         the total number of Orig, V- and B- temps allocated so far.

         The reason for this strange split (types in one place, all
         other info in another) is that we need the types to be
         attached to sb so as to make it possible to do
         "typeOfIRExpr(mce->bb->tyenv, ...)" at various places in the
         instrumentation process. */
      XArray* /* of TempMapEnt */ tmpMap;

      /* MODIFIED: indicates whether "bogus" literals have so far been
         found.  Starts off False, and may change to True. */
      Bool bogusLiterals;

      /* READONLY: indicates whether we should use expensive
         interpretations of integer adds, since unfortunately LLVM
         uses them to do ORs in some circumstances.  Defaulted to True
         on MacOS and False everywhere else. */
      Bool useLLVMworkarounds;

      /* READONLY: the guest layout.  This indicates which parts of
         the guest state should be regarded as 'always defined'. */
      const VexGuestLayout* layout;

      /* READONLY: the host word type.  Needed for constructing
         arguments of type 'HWord' to be passed to helper functions.
         Ity_I32 or Ity_I64 only. */
      IRType hWordTy;
   }
   MCEnv;

/* SHADOW TMP MANAGEMENT.  Shadow tmps are allocated lazily (on
   demand), as they are encountered.  This is for two reasons.

   (1) (less important reason): Many original tmps are unused due to
   initial IR optimisation, and we do not want to waste space in
   tables tracking them.

   Shadow IRTemps are therefore allocated on demand.  mce.tmpMap is a
   table indexed [0 .. n_types-1], which gives the current shadow for
   each original tmp, or INVALID_IRTEMP if none is so far assigned.
   It is necessary to support making multiple assignments to a shadow
   -- specifically, after testing a shadow for definedness, it needs
   to be made defined.  But IR's SSA property disallows this.

   (2) (more important reason): Therefore, when a shadow needs to get
   a new value, a new temporary is created, the value is assigned to
   that, and the tmpMap is updated to reflect the new binding.

   A corollary is that if the tmpMap maps a given tmp to
   IRTemp_INVALID and we are hoping to read that shadow tmp, it means
   there's a read-before-write error in the original tmps.  The IR
   sanity checker should catch all such anomalies, however.
*/

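The lazy-allocation-plus-rebinding discipline described above can be sketched in a few lines (invented names, not Valgrind code; a plain `int` array stands in for tmpMap and `-1` for IRTemp_INVALID):

```c
#include <assert.h>

/* Toy model of the lazy shadow map -- invented names, not Valgrind
   code.  -1 stands in for IRTemp_INVALID. */
enum { N_ORIG_TMPS = 4 };
static int shadow_of[N_ORIG_TMPS] = { -1, -1, -1, -1 };
static int next_tmp = N_ORIG_TMPS;   /* next fresh temp number */

/* Allocate a shadow on demand (cf. findShadowTmpV below). */
static int find_shadow ( int orig ) {
   if (shadow_of[orig] == -1)
      shadow_of[orig] = next_tmp++;
   return shadow_of[orig];
}

/* Rebind to a fresh temp (cf. newShadowTmpV below): SSA forbids
   assigning twice to the old shadow, so it is abandoned. */
static int new_shadow ( int orig ) {
   shadow_of[orig] = next_tmp++;
   return shadow_of[orig];
}
```

The key property is visible in the model: a second lookup returns the cached shadow, while a rebind always allocates afresh, leaving the abandoned temp with exactly one assignment.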
/* Create a new IRTemp of type 'ty' and kind 'kind', and add it to
   both the table in mce->sb and to our auxiliary mapping.  Note that
   newTemp may cause mce->tmpMap to resize, hence previous results
   from VG_(indexXA)(mce->tmpMap) are invalidated. */
static IRTemp newTemp ( MCEnv* mce, IRType ty, TempKind kind )
{
   Word       newIx;
   TempMapEnt ent;
   IRTemp     tmp = newIRTemp(mce->sb->tyenv, ty);
   ent.kind    = kind;
   ent.shadowV = IRTemp_INVALID;
   ent.shadowB = IRTemp_INVALID;
   newIx = VG_(addToXA)( mce->tmpMap, &ent );
   tl_assert(newIx == (Word)tmp);
   return tmp;
}


/* Find the tmp currently shadowing the given original tmp.  If none
   so far exists, allocate one.  */
static IRTemp findShadowTmpV ( MCEnv* mce, IRTemp orig )
{
   TempMapEnt* ent;
   /* VG_(indexXA) range-checks 'orig', hence no need to check
      here. */
   ent = (TempMapEnt*)VG_(indexXA)( mce->tmpMap, (Word)orig );
   tl_assert(ent->kind == Orig);
   if (ent->shadowV == IRTemp_INVALID) {
      IRTemp tmpV
         = newTemp( mce, shadowTypeV(mce->sb->tyenv->types[orig]), VSh );
      /* newTemp may cause mce->tmpMap to resize, hence previous results
         from VG_(indexXA) are invalid. */
      ent = (TempMapEnt*)VG_(indexXA)( mce->tmpMap, (Word)orig );
      tl_assert(ent->kind == Orig);
      tl_assert(ent->shadowV == IRTemp_INVALID);
      ent->shadowV = tmpV;
   }
   return ent->shadowV;
}

/* Allocate a new shadow for the given original tmp.  This means any
   previous shadow is abandoned.  This is needed because it is
   necessary to give a new value to a shadow once it has been tested
   for undefinedness, but unfortunately IR's SSA property disallows
   this.  Instead we must abandon the old shadow, allocate a new one
   and use that instead.

   This is the same as findShadowTmpV, except we don't bother to see
   if a shadow temp already existed -- we simply allocate a new one
   regardless. */
static void newShadowTmpV ( MCEnv* mce, IRTemp orig )
{
   TempMapEnt* ent;
   /* VG_(indexXA) range-checks 'orig', hence no need to check
      here. */
   ent = (TempMapEnt*)VG_(indexXA)( mce->tmpMap, (Word)orig );
   tl_assert(ent->kind == Orig);
   if (1) {
      IRTemp tmpV
         = newTemp( mce, shadowTypeV(mce->sb->tyenv->types[orig]), VSh );
      /* newTemp may cause mce->tmpMap to resize, hence previous results
         from VG_(indexXA) are invalid. */
      ent = (TempMapEnt*)VG_(indexXA)( mce->tmpMap, (Word)orig );
      tl_assert(ent->kind == Orig);
      ent->shadowV = tmpV;
   }
}


/*------------------------------------------------------------*/
/*--- IRAtoms -- a subset of IRExprs                       ---*/
/*------------------------------------------------------------*/

/* An atom is either an IRExpr_Const or an IRExpr_Tmp, as defined by
   isIRAtom() in libvex_ir.h.  Because this instrumenter expects flat
   input, most of this code deals in atoms.  Usefully, a value atom
   always has a V-value which is also an atom: constants are shadowed
   by constants, and temps are shadowed by the corresponding shadow
   temporary. */

typedef  IRExpr  IRAtom;

/* (used for sanity checks only): is this an atom which looks
   like it's from original code? */
static Bool isOriginalAtom ( MCEnv* mce, IRAtom* a1 )
{
   if (a1->tag == Iex_Const)
      return True;
   if (a1->tag == Iex_RdTmp) {
      TempMapEnt* ent = VG_(indexXA)( mce->tmpMap, a1->Iex.RdTmp.tmp );
      return ent->kind == Orig;
   }
   return False;
}

/* (used for sanity checks only): is this an atom which looks
   like it's from shadow code? */
static Bool isShadowAtom ( MCEnv* mce, IRAtom* a1 )
{
   if (a1->tag == Iex_Const)
      return True;
   if (a1->tag == Iex_RdTmp) {
      TempMapEnt* ent = VG_(indexXA)( mce->tmpMap, a1->Iex.RdTmp.tmp );
      return ent->kind == VSh || ent->kind == BSh;
   }
   return False;
}

/* (used for sanity checks only): check that both args are atoms and
   are identically-kinded. */
static Bool sameKindedAtoms ( IRAtom* a1, IRAtom* a2 )
{
   if (a1->tag == Iex_RdTmp && a2->tag == Iex_RdTmp)
      return True;
   if (a1->tag == Iex_Const && a2->tag == Iex_Const)
      return True;
   return False;
}


/*------------------------------------------------------------*/
/*--- Type management                                      ---*/
/*------------------------------------------------------------*/

/* Shadow state is always accessed using integer types.  This returns
   an integer type with the same size (as per sizeofIRType) as the
   given type.  The only valid shadow types are Bit, I8, I16, I32,
   I64, I128, V128, V256. */

static IRType shadowTypeV ( IRType ty )
{
   switch (ty) {
      case Ity_I1:
      case Ity_I8:
      case Ity_I16:
      case Ity_I32:
      case Ity_I64:
      case Ity_I128: return ty;
      case Ity_F16:  return Ity_I16;
      case Ity_F32:  return Ity_I32;
      case Ity_D32:  return Ity_I32;
      case Ity_F64:  return Ity_I64;
      case Ity_D64:  return Ity_I64;
      case Ity_F128: return Ity_I128;
      case Ity_D128: return Ity_I128;
      case Ity_V128: return Ity_V128;
      case Ity_V256: return Ity_V256;
      default: ppIRType(ty);
               VG_(tool_panic)("memcheck:shadowTypeV");
   }
}

/* Produce a 'defined' value of the given shadow type.  Should only be
   supplied shadow types (I1/I8/I16/I32/I64/I128/V128/V256). */
static IRExpr* definedOfType ( IRType ty ) {
   switch (ty) {
      case Ity_I1:   return IRExpr_Const(IRConst_U1(False));
      case Ity_I8:   return IRExpr_Const(IRConst_U8(0));
      case Ity_I16:  return IRExpr_Const(IRConst_U16(0));
      case Ity_I32:  return IRExpr_Const(IRConst_U32(0));
      case Ity_I64:  return IRExpr_Const(IRConst_U64(0));
      case Ity_I128: return i128_const_zero();
      case Ity_V128: return IRExpr_Const(IRConst_V128(0x0000));
      case Ity_V256: return IRExpr_Const(IRConst_V256(0x00000000));
      default: VG_(tool_panic)("memcheck:definedOfType");
   }
}


/*------------------------------------------------------------*/
/*--- Constructing IR fragments                            ---*/
/*------------------------------------------------------------*/

/* add stmt to a bb */
static inline void stmt ( HChar cat, MCEnv* mce, IRStmt* st ) {
   if (mce->trace) {
      VG_(printf)("  %c: ", cat);
      ppIRStmt(st);
      VG_(printf)("\n");
   }
   addStmtToIRSB(mce->sb, st);
}

/* assign value to tmp */
static inline
void assign ( HChar cat, MCEnv* mce, IRTemp tmp, IRExpr* expr ) {
   stmt(cat, mce, IRStmt_WrTmp(tmp,expr));
}

/* build various kinds of expressions */
#define triop(_op, _arg1, _arg2, _arg3) \
                                 IRExpr_Triop((_op),(_arg1),(_arg2),(_arg3))
#define binop(_op, _arg1, _arg2) IRExpr_Binop((_op),(_arg1),(_arg2))
#define unop(_op, _arg)          IRExpr_Unop((_op),(_arg))
#define mkU1(_n)                 IRExpr_Const(IRConst_U1(_n))
#define mkU8(_n)                 IRExpr_Const(IRConst_U8(_n))
#define mkU16(_n)                IRExpr_Const(IRConst_U16(_n))
#define mkU32(_n)                IRExpr_Const(IRConst_U32(_n))
#define mkU64(_n)                IRExpr_Const(IRConst_U64(_n))
#define mkV128(_n)               IRExpr_Const(IRConst_V128(_n))
#define mkexpr(_tmp)             IRExpr_RdTmp((_tmp))

/* Bind the given expression to a new temporary, and return the
   temporary.  This effectively converts an arbitrary expression into
   an atom.

   'ty' is the type of 'e' and hence the type that the new temporary
   needs to be.  But passing it in is redundant, since we can deduce
   the type merely by inspecting 'e'.  So at least use that fact to
   assert that the two types agree. */
static IRAtom* assignNew ( HChar cat, MCEnv* mce, IRType ty, IRExpr* e )
{
   TempKind k;
   IRTemp   t;
   IRType   tyE = typeOfIRExpr(mce->sb->tyenv, e);

   tl_assert(tyE == ty); /* so 'ty' is redundant (!) */
   switch (cat) {
      case 'V': k = VSh;  break;
      case 'B': k = BSh;  break;
      case 'C': k = Orig; break;
                /* happens when we are making up new "orig"
                   expressions, for IRCAS handling */
      default: tl_assert(0);
   }
   t = newTemp(mce, ty, k);
   assign(cat, mce, t, e);
   return mkexpr(t);
}


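What assignNew achieves can be seen in miniature (invented names, not Valgrind code): every compound expression is bound to a fresh temporary, so callers only ever build on atoms, keeping the emitted IR flat.

```c
#include <assert.h>
#include <stdio.h>

/* Miniature of assignNew's flattening -- invented names, not Valgrind
   code.  Binding an expression to a fresh temp turns it into an atom,
   so larger expressions are built atom by atom. */
static int next_temp_no = 0;

/* "Emit" tN = <expr> and return N, the number of the new atom. */
static int assign_new ( const char* expr ) {
   printf("t%d = %s\n", next_temp_no, expr);
   return next_temp_no++;
}
```

Building `(x + y) * z` this way emits `t0 = x + y` and then `t1 = t0 * z`: each statement's operands are atoms.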
/*------------------------------------------------------------*/
/*--- Helper functions for 128-bit ops                     ---*/
/*------------------------------------------------------------*/

static IRExpr *i128_const_zero(void)
{
   IRAtom* z64 = IRExpr_Const(IRConst_U64(0));
   return binop(Iop_64HLto128, z64, z64);
}

/* There are no I128-bit loads and/or stores [as generated by any
   current front ends].  So we do not need to worry about that in
   expr2vbits_Load */

/*------------------------------------------------------------*/
/*--- Constructing definedness primitive ops               ---*/
/*------------------------------------------------------------*/

/* --------- Defined-if-either-defined --------- */

static IRAtom* mkDifD8 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
   tl_assert(isShadowAtom(mce,a1));
   tl_assert(isShadowAtom(mce,a2));
   return assignNew('V', mce, Ity_I8, binop(Iop_And8, a1, a2));
}

static IRAtom* mkDifD16 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
   tl_assert(isShadowAtom(mce,a1));
   tl_assert(isShadowAtom(mce,a2));
   return assignNew('V', mce, Ity_I16, binop(Iop_And16, a1, a2));
}

static IRAtom* mkDifD32 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
   tl_assert(isShadowAtom(mce,a1));
   tl_assert(isShadowAtom(mce,a2));
   return assignNew('V', mce, Ity_I32, binop(Iop_And32, a1, a2));
}

static IRAtom* mkDifD64 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
   tl_assert(isShadowAtom(mce,a1));
   tl_assert(isShadowAtom(mce,a2));
   return assignNew('V', mce, Ity_I64, binop(Iop_And64, a1, a2));
}

static IRAtom* mkDifDV128 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
   tl_assert(isShadowAtom(mce,a1));
   tl_assert(isShadowAtom(mce,a2));
   return assignNew('V', mce, Ity_V128, binop(Iop_AndV128, a1, a2));
}

static IRAtom* mkDifDV256 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
   tl_assert(isShadowAtom(mce,a1));
   tl_assert(isShadowAtom(mce,a2));
   return assignNew('V', mce, Ity_V256, binop(Iop_AndV256, a1, a2));
}

535
536static IRAtom* mkUifU8 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
537 tl_assert(isShadowAtom(mce,a1));
538 tl_assert(isShadowAtom(mce,a2));
sewardj7cf4e6b2008-05-01 20:24:26 +0000539 return assignNew('V', mce, Ity_I8, binop(Iop_Or8, a1, a2));
sewardj95448072004-11-22 20:19:51 +0000540}
541
542static IRAtom* mkUifU16 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
543 tl_assert(isShadowAtom(mce,a1));
544 tl_assert(isShadowAtom(mce,a2));
sewardj7cf4e6b2008-05-01 20:24:26 +0000545 return assignNew('V', mce, Ity_I16, binop(Iop_Or16, a1, a2));
sewardj95448072004-11-22 20:19:51 +0000546}
547
548static IRAtom* mkUifU32 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
549 tl_assert(isShadowAtom(mce,a1));
550 tl_assert(isShadowAtom(mce,a2));
sewardj7cf4e6b2008-05-01 20:24:26 +0000551 return assignNew('V', mce, Ity_I32, binop(Iop_Or32, a1, a2));
sewardj95448072004-11-22 20:19:51 +0000552}
553
554static IRAtom* mkUifU64 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
555 tl_assert(isShadowAtom(mce,a1));
556 tl_assert(isShadowAtom(mce,a2));
sewardj7cf4e6b2008-05-01 20:24:26 +0000557 return assignNew('V', mce, Ity_I64, binop(Iop_Or64, a1, a2));
sewardj95448072004-11-22 20:19:51 +0000558}
559
sewardjb5b87402011-03-07 16:05:35 +0000560static IRAtom* mkUifU128 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
561 IRAtom *tmp1, *tmp2, *tmp3, *tmp4, *tmp5, *tmp6;
562 tl_assert(isShadowAtom(mce,a1));
563 tl_assert(isShadowAtom(mce,a2));
564 tmp1 = assignNew('V', mce, Ity_I64, unop(Iop_128to64, a1));
565 tmp2 = assignNew('V', mce, Ity_I64, unop(Iop_128HIto64, a1));
566 tmp3 = assignNew('V', mce, Ity_I64, unop(Iop_128to64, a2));
567 tmp4 = assignNew('V', mce, Ity_I64, unop(Iop_128HIto64, a2));
568 tmp5 = assignNew('V', mce, Ity_I64, binop(Iop_Or64, tmp1, tmp3));
569 tmp6 = assignNew('V', mce, Ity_I64, binop(Iop_Or64, tmp2, tmp4));
570
571 return assignNew('V', mce, Ity_I128, binop(Iop_64HLto128, tmp6, tmp5));
572}
573
sewardj20d38f22005-02-07 23:50:18 +0000574static IRAtom* mkUifUV128 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
sewardj3245c912004-12-10 14:58:26 +0000575 tl_assert(isShadowAtom(mce,a1));
576 tl_assert(isShadowAtom(mce,a2));
sewardj7cf4e6b2008-05-01 20:24:26 +0000577 return assignNew('V', mce, Ity_V128, binop(Iop_OrV128, a1, a2));
sewardj3245c912004-12-10 14:58:26 +0000578}
579
sewardj350e8f72012-06-25 07:52:15 +0000580static IRAtom* mkUifUV256 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
581 tl_assert(isShadowAtom(mce,a1));
582 tl_assert(isShadowAtom(mce,a2));
583 return assignNew('V', mce, Ity_V256, binop(Iop_OrV256, a1, a2));
584}
585
sewardje50a1b12004-12-17 01:24:54 +0000586static IRAtom* mkUifU ( MCEnv* mce, IRType vty, IRAtom* a1, IRAtom* a2 ) {
sewardj95448072004-11-22 20:19:51 +0000587 switch (vty) {
sewardje50a1b12004-12-17 01:24:54 +0000588 case Ity_I8: return mkUifU8(mce, a1, a2);
sewardja1d93302004-12-12 16:45:06 +0000589 case Ity_I16: return mkUifU16(mce, a1, a2);
590 case Ity_I32: return mkUifU32(mce, a1, a2);
591 case Ity_I64: return mkUifU64(mce, a1, a2);
sewardjb5b87402011-03-07 16:05:35 +0000592 case Ity_I128: return mkUifU128(mce, a1, a2);
sewardj20d38f22005-02-07 23:50:18 +0000593 case Ity_V128: return mkUifUV128(mce, a1, a2);
sewardja2f30952013-03-27 11:40:02 +0000594 case Ity_V256: return mkUifUV256(mce, a1, a2);
sewardj95448072004-11-22 20:19:51 +0000595 default:
596 VG_(printf)("\n"); ppIRType(vty); VG_(printf)("\n");
597 VG_(tool_panic)("memcheck:mkUifU");
njn25e49d8e72002-09-23 09:36:25 +0000598 }
599}
600
sewardj95448072004-11-22 20:19:51 +0000601/* --------- The Left-family of operations. --------- */
njn25e49d8e72002-09-23 09:36:25 +0000602
sewardj95448072004-11-22 20:19:51 +0000603static IRAtom* mkLeft8 ( MCEnv* mce, IRAtom* a1 ) {
604 tl_assert(isShadowAtom(mce,a1));
sewardj7cf4e6b2008-05-01 20:24:26 +0000605 return assignNew('V', mce, Ity_I8, unop(Iop_Left8, a1));
sewardj95448072004-11-22 20:19:51 +0000606}
607
608static IRAtom* mkLeft16 ( MCEnv* mce, IRAtom* a1 ) {
609 tl_assert(isShadowAtom(mce,a1));
sewardj7cf4e6b2008-05-01 20:24:26 +0000610 return assignNew('V', mce, Ity_I16, unop(Iop_Left16, a1));
sewardj95448072004-11-22 20:19:51 +0000611}
612
613static IRAtom* mkLeft32 ( MCEnv* mce, IRAtom* a1 ) {
614 tl_assert(isShadowAtom(mce,a1));
sewardj7cf4e6b2008-05-01 20:24:26 +0000615 return assignNew('V', mce, Ity_I32, unop(Iop_Left32, a1));
sewardj95448072004-11-22 20:19:51 +0000616}
617
sewardj681be302005-01-15 20:43:58 +0000618static IRAtom* mkLeft64 ( MCEnv* mce, IRAtom* a1 ) {
619 tl_assert(isShadowAtom(mce,a1));
sewardj7cf4e6b2008-05-01 20:24:26 +0000620 return assignNew('V', mce, Ity_I64, unop(Iop_Left64, a1));
sewardj681be302005-01-15 20:43:58 +0000621}
622
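Iop_LeftN computes x | -x, so every bit at or above the lowest set bit becomes 1; applied to vbits this smears undefinedness leftwards, matching how carries propagate in arithmetic. A standalone model of the 8-bit case:

```c
#include <assert.h>
#include <stdint.h>

/* Model of Iop_Left8: x | -x sets every bit at or above the lowest
   set bit.  On vbits this smears undefinedness towards the MSB. */
static uint8_t left8_model ( uint8_t x ) {
   return (uint8_t)(x | (uint8_t)-x);
}
```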
/* --------- 'Improvement' functions for AND/OR. --------- */

/* ImproveAND(data, vbits) = data OR vbits.  Defined (0) data 0s give
   defined (0); all other -> undefined (1).
*/
static IRAtom* mkImproveAND8 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew('V', mce, Ity_I8, binop(Iop_Or8, data, vbits));
}

static IRAtom* mkImproveAND16 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew('V', mce, Ity_I16, binop(Iop_Or16, data, vbits));
}

static IRAtom* mkImproveAND32 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew('V', mce, Ity_I32, binop(Iop_Or32, data, vbits));
}

static IRAtom* mkImproveAND64 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew('V', mce, Ity_I64, binop(Iop_Or64, data, vbits));
}

static IRAtom* mkImproveANDV128 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew('V', mce, Ity_V128, binop(Iop_OrV128, data, vbits));
}

static IRAtom* mkImproveANDV256 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew('V', mce, Ity_V256, binop(Iop_OrV256, data, vbits));
}

/* ImproveOR(data, vbits) = ~data OR vbits.  Defined (0) data 1s give
   defined (0); all other -> undefined (1).
*/
static IRAtom* mkImproveOR8 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew(
             'V', mce, Ity_I8,
             binop(Iop_Or8,
                   assignNew('V', mce, Ity_I8, unop(Iop_Not8, data)),
                   vbits) );
}

static IRAtom* mkImproveOR16 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew(
             'V', mce, Ity_I16,
             binop(Iop_Or16,
                   assignNew('V', mce, Ity_I16, unop(Iop_Not16, data)),
                   vbits) );
}

static IRAtom* mkImproveOR32 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew(
             'V', mce, Ity_I32,
             binop(Iop_Or32,
                   assignNew('V', mce, Ity_I32, unop(Iop_Not32, data)),
                   vbits) );
}

static IRAtom* mkImproveOR64 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew(
             'V', mce, Ity_I64,
             binop(Iop_Or64,
                   assignNew('V', mce, Ity_I64, unop(Iop_Not64, data)),
                   vbits) );
}

static IRAtom* mkImproveORV128 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew(
             'V', mce, Ity_V128,
             binop(Iop_OrV128,
                   assignNew('V', mce, Ity_V128, unop(Iop_NotV128, data)),
                   vbits) );
}

static IRAtom* mkImproveORV256 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
{
   tl_assert(isOriginalAtom(mce, data));
   tl_assert(isShadowAtom(mce, vbits));
   tl_assert(sameKindedAtoms(data, vbits));
   return assignNew(
             'V', mce, Ity_V256,
             binop(Iop_OrV256,
                   assignNew('V', mce, Ity_V256, unop(Iop_NotV256, data)),
                   vbits) );
}

/* --------- Pessimising casts. --------- */

/* The function returns an expression of type DST_TY.  If any of the VBITS
   are undefined (value == 1) the resulting expression has all bits set to
   1.  Otherwise, all bits are 0. */

static IRAtom* mkPCastTo( MCEnv* mce, IRType dst_ty, IRAtom* vbits )
{
   IRType  src_ty;
   IRAtom* tmp1;

   /* Note, dst_ty is a shadow type, not an original type. */
   tl_assert(isShadowAtom(mce,vbits));
   src_ty = typeOfIRExpr(mce->sb->tyenv, vbits);

   /* Fast-track some common cases */
   if (src_ty == Ity_I32 && dst_ty == Ity_I32)
      return assignNew('V', mce, Ity_I32, unop(Iop_CmpwNEZ32, vbits));

   if (src_ty == Ity_I64 && dst_ty == Ity_I64)
      return assignNew('V', mce, Ity_I64, unop(Iop_CmpwNEZ64, vbits));

   if (src_ty == Ity_I32 && dst_ty == Ity_I64) {
      /* PCast the arg, then clone it. */
      IRAtom* tmp = assignNew('V', mce, Ity_I32, unop(Iop_CmpwNEZ32, vbits));
      return assignNew('V', mce, Ity_I64, binop(Iop_32HLto64, tmp, tmp));
   }

   if (src_ty == Ity_I32 && dst_ty == Ity_V128) {
      /* PCast the arg, then clone it 4 times. */
      IRAtom* tmp = assignNew('V', mce, Ity_I32, unop(Iop_CmpwNEZ32, vbits));
      tmp = assignNew('V', mce, Ity_I64, binop(Iop_32HLto64, tmp, tmp));
      return assignNew('V', mce, Ity_V128, binop(Iop_64HLtoV128, tmp, tmp));
   }

   if (src_ty == Ity_I32 && dst_ty == Ity_V256) {
      /* PCast the arg, then clone it 8 times. */
      IRAtom* tmp = assignNew('V', mce, Ity_I32, unop(Iop_CmpwNEZ32, vbits));
      tmp = assignNew('V', mce, Ity_I64, binop(Iop_32HLto64, tmp, tmp));
      tmp = assignNew('V', mce, Ity_V128, binop(Iop_64HLtoV128, tmp, tmp));
      return assignNew('V', mce, Ity_V256, binop(Iop_V128HLtoV256, tmp, tmp));
   }

   if (src_ty == Ity_I64 && dst_ty == Ity_I32) {
      /* PCast the arg.  This gives all 0s or all 1s.  Then throw away
         the top half. */
      IRAtom* tmp = assignNew('V', mce, Ity_I64, unop(Iop_CmpwNEZ64, vbits));
      return assignNew('V', mce, Ity_I32, unop(Iop_64to32, tmp));
   }

   if (src_ty == Ity_V128 && dst_ty == Ity_I64) {
      /* Use InterleaveHI64x2 to copy the top half of the vector into
         the bottom half.  Then we can UifU it with the original, throw
         away the upper half of the result, and PCast-I64-to-I64
         the lower half. */
      // Generates vbits[127:64] : vbits[127:64]
      IRAtom* hi64hi64
         = assignNew('V', mce, Ity_V128,
                     binop(Iop_InterleaveHI64x2, vbits, vbits));
      // Generates
      //   UifU(vbits[127:64],vbits[127:64]) : UifU(vbits[127:64],vbits[63:0])
      //   == vbits[127:64] : UifU(vbits[127:64],vbits[63:0])
      IRAtom* lohi64
         = mkUifUV128(mce, hi64hi64, vbits);
      // Generates UifU(vbits[127:64],vbits[63:0])
      IRAtom* lo64
         = assignNew('V', mce, Ity_I64, unop(Iop_V128to64, lohi64));
      // Generates
      //   PCast-to-I64( UifU(vbits[127:64], vbits[63:0]) )
      //   == PCast-to-I64( vbits[127:0] )
      IRAtom* res
         = assignNew('V', mce, Ity_I64, unop(Iop_CmpwNEZ64, lo64));
      return res;
   }

   /* Else do it the slow way .. */
   /* First of all, collapse vbits down to a single bit. */
   tmp1 = NULL;
   switch (src_ty) {
      case Ity_I1:
         tmp1 = vbits;
         break;
      case Ity_I8:
         tmp1 = assignNew('V', mce, Ity_I1, unop(Iop_CmpNEZ8, vbits));
         break;
      case Ity_I16:
         tmp1 = assignNew('V', mce, Ity_I1, unop(Iop_CmpNEZ16, vbits));
         break;
      case Ity_I32:
         tmp1 = assignNew('V', mce, Ity_I1, unop(Iop_CmpNEZ32, vbits));
         break;
      case Ity_I64:
         tmp1 = assignNew('V', mce, Ity_I1, unop(Iop_CmpNEZ64, vbits));
         break;
      case Ity_I128: {
         /* Gah.  Chop it in half, OR the halves together, and compare
            that with zero. */
         IRAtom* tmp2 = assignNew('V', mce, Ity_I64, unop(Iop_128HIto64, vbits));
         IRAtom* tmp3 = assignNew('V', mce, Ity_I64, unop(Iop_128to64, vbits));
         IRAtom* tmp4 = assignNew('V', mce, Ity_I64, binop(Iop_Or64, tmp2, tmp3));
         tmp1         = assignNew('V', mce, Ity_I1,
                                  unop(Iop_CmpNEZ64, tmp4));
         break;
      }
      case Ity_V128: {
         /* Chop it in half, OR the halves together, and compare that
          * with zero.
          */
         IRAtom* tmp2 = assignNew('V', mce, Ity_I64, unop(Iop_V128HIto64, vbits));
         IRAtom* tmp3 = assignNew('V', mce, Ity_I64, unop(Iop_V128to64, vbits));
         IRAtom* tmp4 = assignNew('V', mce, Ity_I64, binop(Iop_Or64, tmp2, tmp3));
         tmp1         = assignNew('V', mce, Ity_I1,
                                  unop(Iop_CmpNEZ64, tmp4));
         break;
      }
      default:
         ppIRType(src_ty);
         VG_(tool_panic)("mkPCastTo(1)");
   }
   tl_assert(tmp1);
   /* Now widen up to the dst type. */
   switch (dst_ty) {
      case Ity_I1:
         return tmp1;
      case Ity_I8:
         return assignNew('V', mce, Ity_I8, unop(Iop_1Sto8, tmp1));
      case Ity_I16:
         return assignNew('V', mce, Ity_I16, unop(Iop_1Sto16, tmp1));
      case Ity_I32:
         return assignNew('V', mce, Ity_I32, unop(Iop_1Sto32, tmp1));
      case Ity_I64:
         return assignNew('V', mce, Ity_I64, unop(Iop_1Sto64, tmp1));
      case Ity_V128:
         tmp1 = assignNew('V', mce, Ity_I64, unop(Iop_1Sto64, tmp1));
         tmp1 = assignNew('V', mce, Ity_V128, binop(Iop_64HLtoV128, tmp1, tmp1));
         return tmp1;
      case Ity_I128:
         tmp1 = assignNew('V', mce, Ity_I64, unop(Iop_1Sto64, tmp1));
         tmp1 = assignNew('V', mce, Ity_I128, binop(Iop_64HLto128, tmp1, tmp1));
         return tmp1;
      case Ity_V256:
         tmp1 = assignNew('V', mce, Ity_I64, unop(Iop_1Sto64, tmp1));
         tmp1 = assignNew('V', mce, Ity_V128, binop(Iop_64HLtoV128,
                                                    tmp1, tmp1));
         tmp1 = assignNew('V', mce, Ity_V256, binop(Iop_V128HLtoV256,
                                                    tmp1, tmp1));
         return tmp1;
      default:
         ppIRType(dst_ty);
         VG_(tool_panic)("mkPCastTo(2)");
   }
}

/* This is a minor variant.  It takes an arg of some type and returns
   a value of the same type.  The result consists entirely of Defined
   (zero) bits except its least significant bit, which is a PCast of
   the entire argument down to a single bit. */
static IRAtom* mkPCastXXtoXXlsb ( MCEnv* mce, IRAtom* varg, IRType ty )
{
   if (ty == Ity_V128) {
      /* --- Case for V128 --- */
      IRAtom* varg128 = varg;
      // generates: PCast-to-I64(varg128)
      IRAtom* pcdTo64 = mkPCastTo(mce, Ity_I64, varg128);
      // Now introduce zeros (defined bits) in the top 63 places
      // generates: Def--(63)--Def PCast-to-I1(varg128)
      IRAtom* d63pc
         = assignNew('V', mce, Ity_I64, binop(Iop_And64, pcdTo64, mkU64(1)));
      // generates: Def--(64)--Def
      IRAtom* d64
         = definedOfType(Ity_I64);
      // generates: Def--(127)--Def PCast-to-I1(varg128)
      IRAtom* res
         = assignNew('V', mce, Ity_V128, binop(Iop_64HLtoV128, d64, d63pc));
      return res;
   }
   if (ty == Ity_I64) {
      /* --- Case for I64 --- */
      // PCast to 64
      IRAtom* pcd = mkPCastTo(mce, Ity_I64, varg);
      // Zero (Def) out the top 63 bits
      IRAtom* res
         = assignNew('V', mce, Ity_I64, binop(Iop_And64, pcd, mkU64(1)));
      return res;
   }
   /*NOTREACHED*/
   tl_assert(0);
}

/* --------- Accurate interpretation of CmpEQ/CmpNE. --------- */
/*
   Normally, we can do CmpEQ/CmpNE by doing UifU on the arguments, and
   PCasting to Ity_U1.  However, sometimes it is necessary to be more
   accurate.  The insight is that the result is defined if two
   corresponding bits can be found, one from each argument, so that
   both bits are defined but are different -- that makes EQ say "No"
   and NE say "Yes".  Hence, we compute an improvement term and DifD
   it onto the "normal" (UifU) result.

   The result is:

   PCastTo<1> (
      -- naive version
      PCastTo<sz>( UifU<sz>(vxx, vyy) )

      `DifD<sz>`

      -- improvement term
      PCastTo<sz>( PCast<sz>( CmpEQ<sz> ( vec, 1...1 ) ) )
   )

   where
     vec contains 0 (defined) bits where the corresponding arg bits
     are defined but different, and 1 bits otherwise.

     vec = Or<sz>( vxx,   // 0 iff bit defined
                   vyy,   // 0 iff bit defined
                   Not<sz>(Xor<sz>( xx, yy ))  // 0 iff bits different
                 )

     If any bit of vec is 0, the result is defined and so the
     improvement term should produce 0...0, else it should produce
     1...1.

     Hence require for the improvement term:

        if vec == 1...1 then 1...1 else 0...0
     ->
        PCast<sz>( CmpEQ<sz> ( vec, 1...1 ) )

   This was extensively re-analysed and checked on 6 July 05.
*/
static IRAtom* expensiveCmpEQorNE ( MCEnv*  mce,
                                    IRType  ty,
                                    IRAtom* vxx, IRAtom* vyy,
                                    IRAtom* xx,  IRAtom* yy )
{
   IRAtom *naive, *vec, *improvement_term;
   IRAtom *improved, *final_cast, *top;
   IROp   opDIFD, opUIFU, opXOR, opNOT, opCMP, opOR;

   tl_assert(isShadowAtom(mce,vxx));
   tl_assert(isShadowAtom(mce,vyy));
   tl_assert(isOriginalAtom(mce,xx));
   tl_assert(isOriginalAtom(mce,yy));
   tl_assert(sameKindedAtoms(vxx,xx));
   tl_assert(sameKindedAtoms(vyy,yy));

   switch (ty) {
      case Ity_I16:
         opOR   = Iop_Or16;
         opDIFD = Iop_And16;
         opUIFU = Iop_Or16;
         opNOT  = Iop_Not16;
         opXOR  = Iop_Xor16;
         opCMP  = Iop_CmpEQ16;
         top    = mkU16(0xFFFF);
         break;
      case Ity_I32:
         opOR   = Iop_Or32;
         opDIFD = Iop_And32;
         opUIFU = Iop_Or32;
         opNOT  = Iop_Not32;
         opXOR  = Iop_Xor32;
         opCMP  = Iop_CmpEQ32;
         top    = mkU32(0xFFFFFFFF);
         break;
      case Ity_I64:
         opOR   = Iop_Or64;
         opDIFD = Iop_And64;
         opUIFU = Iop_Or64;
         opNOT  = Iop_Not64;
         opXOR  = Iop_Xor64;
         opCMP  = Iop_CmpEQ64;
         top    = mkU64(0xFFFFFFFFFFFFFFFFULL);
         break;
      default:
         VG_(tool_panic)("expensiveCmpEQorNE");
   }

   naive
      = mkPCastTo(mce,ty,
                  assignNew('V', mce, ty, binop(opUIFU, vxx, vyy)));

   vec
      = assignNew(
           'V', mce,ty,
           binop( opOR,
                  assignNew('V', mce,ty, binop(opOR, vxx, vyy)),
                  assignNew(
                     'V', mce,ty,
                     unop( opNOT,
                           assignNew('V', mce,ty, binop(opXOR, xx, yy))))));

   improvement_term
      = mkPCastTo( mce,ty,
                   assignNew('V', mce,Ity_I1, binop(opCMP, vec, top)));

   improved
      = assignNew( 'V', mce,ty, binop(opDIFD, naive, improvement_term) );

   final_cast
      = mkPCastTo( mce, Ity_I1, improved );

   return final_cast;
}


/* --------- Semi-accurate interpretation of CmpORD. --------- */

/* CmpORD32{S,U} does PowerPC-style 3-way comparisons:

      CmpORD32S(x,y) = 1<<3   if  x <s y
                     = 1<<2   if  x >s y
                     = 1<<1   if  x == y

   and similarly the unsigned variant.  The default interpretation is:

      CmpORD32{S,U}#(x,y,x#,y#) = PCast(x# `UifU` y#)
                                  & (7<<1)

   The "& (7<<1)" reflects the fact that all result bits except 3,2,1
   are zero and therefore defined (viz, zero).

   Also deal with a special case better:

      CmpORD32S(x,0)

   Here, bit 3 (LT) of the result is a copy of the top bit of x and
   will be defined even if the rest of x isn't.  In which case we do:

      CmpORD32S#(x,x#,0,{impliedly 0}#)
         = PCast(x#) & (3<<1)      -- standard interp for GT#,EQ#
           | (x# >>u 31) << 3      -- LT# = x#[31]

   Analogous handling for CmpORD64{S,U}.
*/
static Bool isZeroU32 ( IRAtom* e )
{
   return
      toBool( e->tag == Iex_Const
              && e->Iex.Const.con->tag == Ico_U32
              && e->Iex.Const.con->Ico.U32 == 0 );
}

static Bool isZeroU64 ( IRAtom* e )
{
   return
      toBool( e->tag == Iex_Const
              && e->Iex.Const.con->tag == Ico_U64
              && e->Iex.Const.con->Ico.U64 == 0 );
}

static IRAtom* doCmpORD ( MCEnv*  mce,
                          IROp    cmp_op,
                          IRAtom* xxhash, IRAtom* yyhash,
                          IRAtom* xx, IRAtom* yy )
{
   Bool   m64    = cmp_op == Iop_CmpORD64S || cmp_op == Iop_CmpORD64U;
   Bool   syned  = cmp_op == Iop_CmpORD64S || cmp_op == Iop_CmpORD32S;
   IROp   opOR   = m64 ? Iop_Or64  : Iop_Or32;
   IROp   opAND  = m64 ? Iop_And64 : Iop_And32;
   IROp   opSHL  = m64 ? Iop_Shl64 : Iop_Shl32;
   IROp   opSHR  = m64 ? Iop_Shr64 : Iop_Shr32;
   IRType ty     = m64 ? Ity_I64   : Ity_I32;
   Int    width  = m64 ? 64        : 32;

   Bool (*isZero)(IRAtom*) = m64 ? isZeroU64 : isZeroU32;

   IRAtom* threeLeft1 = NULL;
   IRAtom* sevenLeft1 = NULL;

   tl_assert(isShadowAtom(mce,xxhash));
   tl_assert(isShadowAtom(mce,yyhash));
   tl_assert(isOriginalAtom(mce,xx));
   tl_assert(isOriginalAtom(mce,yy));
   tl_assert(sameKindedAtoms(xxhash,xx));
   tl_assert(sameKindedAtoms(yyhash,yy));
   tl_assert(cmp_op == Iop_CmpORD32S || cmp_op == Iop_CmpORD32U
             || cmp_op == Iop_CmpORD64S || cmp_op == Iop_CmpORD64U);

   if (0) {
      ppIROp(cmp_op); VG_(printf)(" ");
      ppIRExpr(xx); VG_(printf)(" "); ppIRExpr( yy ); VG_(printf)("\n");
   }

   if (syned && isZero(yy)) {
      /* fancy interpretation */
      /* if yy is zero, then it must be fully defined (zero#). */
      tl_assert(isZero(yyhash));
      threeLeft1 = m64 ? mkU64(3<<1) : mkU32(3<<1);
      return
         binop(
            opOR,
            assignNew(
               'V', mce,ty,
               binop(
                  opAND,
                  mkPCastTo(mce,ty, xxhash),
                  threeLeft1
               )),
            assignNew(
               'V', mce,ty,
               binop(
                  opSHL,
                  assignNew(
                     'V', mce,ty,
                     binop(opSHR, xxhash, mkU8(width-1))),
                  mkU8(3)
               ))
         );
   } else {
      /* standard interpretation */
      sevenLeft1 = m64 ? mkU64(7<<1) : mkU32(7<<1);
      return
         binop(
            opAND,
            mkPCastTo( mce,ty,
                       mkUifU(mce,ty, xxhash,yyhash)),
            sevenLeft1
         );
   }
}


/*------------------------------------------------------------*/
/*--- Emit a test and complaint if something is undefined. ---*/
/*------------------------------------------------------------*/

static IRAtom* schemeE ( MCEnv* mce, IRExpr* e ); /* fwds */


/* Set the annotations on a dirty helper to indicate that the stack
   pointer and instruction pointers might be read.  This is the
   behaviour of all 'emit-a-complaint' style functions we might
   call. */

static void setHelperAnns ( MCEnv* mce, IRDirty* di ) {
   di->nFxState = 2;
   di->fxState[0].fx        = Ifx_Read;
   di->fxState[0].offset    = mce->layout->offset_SP;
   di->fxState[0].size      = mce->layout->sizeof_SP;
   di->fxState[0].nRepeats  = 0;
   di->fxState[0].repeatLen = 0;
   di->fxState[1].fx        = Ifx_Read;
   di->fxState[1].offset    = mce->layout->offset_IP;
   di->fxState[1].size      = mce->layout->sizeof_IP;
   di->fxState[1].nRepeats  = 0;
   di->fxState[1].repeatLen = 0;
}


/* Check the supplied *original* |atom| for undefinedness, and emit a
   complaint if so.  Once that happens, mark it as defined.  This is
   possible because the atom is either a tmp or literal.  If it's a
   tmp, it will be shadowed by a tmp, and so we can set the shadow to
   be defined.  In fact as mentioned above, we will have to allocate a
   new tmp to carry the new 'defined' shadow value, and update the
   original->tmp mapping accordingly; we cannot simply assign a new
   value to an existing shadow tmp as this breaks SSAness.

   The checks are performed, any resulting complaint emitted, and
   |atom|'s shadow temp set to 'defined', ONLY in the case that
   |guard| evaluates to True at run-time.  If it evaluates to False
   then no action is performed.  If |guard| is NULL (the usual case)
   then it is assumed to be always-true, and hence these actions are
   performed unconditionally.

   This routine does not generate code to check the definedness of
   |guard|.  The caller is assumed to have taken care of that already.
*/
static void complainIfUndefined ( MCEnv* mce, IRAtom* atom, IRExpr *guard )
{
   IRAtom*  vatom;
   IRType   ty;
   Int      sz;
   IRDirty* di;
   IRAtom*  cond;
   IRAtom*  origin;
   void*    fn;
   const HChar* nm;
   IRExpr** args;
   Int      nargs;

   // Don't do V bit tests if we're not reporting undefined value errors.
   if (MC_(clo_mc_level) == 1)
      return;

   if (guard)
      tl_assert(isOriginalAtom(mce, guard));

   /* Since the original expression is atomic, there's no duplicated
      work generated by making multiple V-expressions for it.  So we
      don't really care about the possibility that someone else may
      also create a V-interpretation for it. */
   tl_assert(isOriginalAtom(mce, atom));
   vatom = expr2vbits( mce, atom );
   tl_assert(isShadowAtom(mce, vatom));
   tl_assert(sameKindedAtoms(atom, vatom));

   ty = typeOfIRExpr(mce->sb->tyenv, vatom);

   /* sz is only used for constructing the error message */
   sz = ty==Ity_I1 ? 0 : sizeofIRType(ty);

   cond = mkPCastTo( mce, Ity_I1, vatom );
   /* cond will be 0 if all defined, and 1 if any not defined. */

   /* Get the origin info for the value we are about to check.  At
      least, if we are doing origin tracking.  If not, use a dummy
      zero origin. */
   if (MC_(clo_mc_level) == 3) {
      origin = schemeE( mce, atom );
      if (mce->hWordTy == Ity_I64) {
         origin = assignNew( 'B', mce, Ity_I64, unop(Iop_32Uto64, origin) );
      }
   } else {
      origin = NULL;
   }

   fn    = NULL;
   nm    = NULL;
   args  = NULL;
   nargs = -1;

   switch (sz) {
      case 0:
         if (origin) {
            fn    = &MC_(helperc_value_check0_fail_w_o);
            nm    = "MC_(helperc_value_check0_fail_w_o)";
            args  = mkIRExprVec_1(origin);
            nargs = 1;
         } else {
            fn    = &MC_(helperc_value_check0_fail_no_o);
            nm    = "MC_(helperc_value_check0_fail_no_o)";
            args  = mkIRExprVec_0();
            nargs = 0;
         }
         break;
      case 1:
         if (origin) {
            fn    = &MC_(helperc_value_check1_fail_w_o);
            nm    = "MC_(helperc_value_check1_fail_w_o)";
            args  = mkIRExprVec_1(origin);
            nargs = 1;
         } else {
            fn    = &MC_(helperc_value_check1_fail_no_o);
            nm    = "MC_(helperc_value_check1_fail_no_o)";
            args  = mkIRExprVec_0();
            nargs = 0;
         }
         break;
      case 4:
         if (origin) {
            fn    = &MC_(helperc_value_check4_fail_w_o);
            nm    = "MC_(helperc_value_check4_fail_w_o)";
            args  = mkIRExprVec_1(origin);
            nargs = 1;
         } else {
            fn    = &MC_(helperc_value_check4_fail_no_o);
            nm    = "MC_(helperc_value_check4_fail_no_o)";
            args  = mkIRExprVec_0();
            nargs = 0;
         }
         break;
      case 8:
         if (origin) {
            fn    = &MC_(helperc_value_check8_fail_w_o);
            nm    = "MC_(helperc_value_check8_fail_w_o)";
            args  = mkIRExprVec_1(origin);
            nargs = 1;
         } else {
            fn    = &MC_(helperc_value_check8_fail_no_o);
            nm    = "MC_(helperc_value_check8_fail_no_o)";
            args  = mkIRExprVec_0();
            nargs = 0;
         }
         break;
      case 2:
      case 16:
         if (origin) {
            fn    = &MC_(helperc_value_checkN_fail_w_o);
            nm    = "MC_(helperc_value_checkN_fail_w_o)";
            args  = mkIRExprVec_2( mkIRExpr_HWord( sz ), origin);
            nargs = 2;
         } else {
            fn    = &MC_(helperc_value_checkN_fail_no_o);
            nm    = "MC_(helperc_value_checkN_fail_no_o)";
            args  = mkIRExprVec_1( mkIRExpr_HWord( sz ) );
            nargs = 1;
         }
         break;
      default:
         VG_(tool_panic)("unexpected szB");
   }

   tl_assert(fn);
   tl_assert(nm);
   tl_assert(args);
   tl_assert(nargs >= 0 && nargs <= 2);
   tl_assert( (MC_(clo_mc_level) == 3 && origin != NULL)
              || (MC_(clo_mc_level) == 2 && origin == NULL) );

   di = unsafeIRDirty_0_N( nargs/*regparms*/, nm,
                           VG_(fnptr_to_fnentry)( fn ), args );
   di->guard = cond; // and cond is PCast-to-1(atom#)

   /* If the complaint is to be issued under a guard condition, AND
      that into the guard condition for the helper call. */
   if (guard) {
      IRAtom *g1 = assignNew('V', mce, Ity_I32, unop(Iop_1Uto32, di->guard));
      IRAtom *g2 = assignNew('V', mce, Ity_I32, unop(Iop_1Uto32, guard));
      IRAtom *e  = assignNew('V', mce, Ity_I32, binop(Iop_And32, g1, g2));
      di->guard  = assignNew('V', mce, Ity_I1,  unop(Iop_32to1, e));
   }

   setHelperAnns( mce, di );
   stmt( 'V', mce, IRStmt_Dirty(di));

   /* If |atom| is shadowed by an IRTemp, set the shadow tmp to be
      defined -- but only in the case where the guard evaluates to
      True at run-time.  Do the update by setting the orig->shadow
      mapping for tmp to reflect the fact that this shadow is getting
      a new value. */
   tl_assert(isIRAtom(vatom));
   /* sameKindedAtoms ... */
   if (vatom->tag == Iex_RdTmp) {
      tl_assert(atom->tag == Iex_RdTmp);
      if (guard == NULL) {
         // guard is 'always True', hence update unconditionally
         newShadowTmpV(mce, atom->Iex.RdTmp.tmp);
         assign('V', mce, findShadowTmpV(mce, atom->Iex.RdTmp.tmp),
                definedOfType(ty));
      } else {
         // update the temp only conditionally.  Do this by copying
         // its old value when the guard is False.
         // The old value ..
         IRTemp old_tmpV = findShadowTmpV(mce, atom->Iex.RdTmp.tmp);
         newShadowTmpV(mce, atom->Iex.RdTmp.tmp);
         IRAtom* new_tmpV
            = assignNew('V', mce, shadowTypeV(ty),
                        IRExpr_ITE(guard, definedOfType(ty),
                                   mkexpr(old_tmpV)));
         assign('V', mce, findShadowTmpV(mce, atom->Iex.RdTmp.tmp), new_tmpV);
      }
   }
}


/*------------------------------------------------------------*/
/*--- Shadowing PUTs/GETs, and indexed variants thereof    ---*/
/*------------------------------------------------------------*/

/* Examine the always-defined sections declared in layout to see if
   the (offset,size) section is within one.  Note, it is an error to
   partially fall into such a region: (offset,size) should either be
   completely in such a region or completely not-in such a region.
*/
static Bool isAlwaysDefd ( MCEnv* mce, Int offset, Int size )
{
   Int minoffD, maxoffD, i;
   Int minoff = offset;
   Int maxoff = minoff + size - 1;
   tl_assert((minoff & ~0xFFFF) == 0);
   tl_assert((maxoff & ~0xFFFF) == 0);

   for (i = 0; i < mce->layout->n_alwaysDefd; i++) {
      minoffD = mce->layout->alwaysDefd[i].offset;
      maxoffD = minoffD + mce->layout->alwaysDefd[i].size - 1;
      tl_assert((minoffD & ~0xFFFF) == 0);
      tl_assert((maxoffD & ~0xFFFF) == 0);

      if (maxoff < minoffD || maxoffD < minoff)
         continue; /* no overlap */
      if (minoff >= minoffD && maxoff <= maxoffD)
         return True; /* completely contained in an always-defd section */

      VG_(tool_panic)("memcheck:isAlwaysDefd:partial overlap");
   }
   return False; /* could not find any containing section */
}
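
The interval test above has three possible outcomes per region: disjoint, fully contained, or a (fatal) partial overlap. The following is a minimal standalone sketch of that classification on plain ints; the helper name `classify` and the enum are invented for illustration and are not part of Memcheck.

```c
#include <assert.h>

/* Outcome of testing a (offset,size) slice against one always-defined
   region [rmin, rmax], mirroring the containment logic of
   isAlwaysDefd: disjoint and contained are fine, a partial overlap
   corresponds to the tool-panic case. */
enum Overlap { DISJOINT, CONTAINED, PARTIAL };

static enum Overlap classify ( int offset, int size, int rmin, int rmax )
{
   int minoff = offset;
   int maxoff = minoff + size - 1;
   if (maxoff < rmin || rmax < minoff)
      return DISJOINT;    /* no overlap */
   if (minoff >= rmin && maxoff <= rmax)
      return CONTAINED;   /* completely inside the region */
   return PARTIAL;        /* would trigger the tool panic */
}
```

For example, a 4-byte slice at offset 6 straddles a region starting at 8 and is therefore a partial overlap.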


/* Generate into bb suitable actions to shadow this Put.  If the state
   slice is marked 'always defined', do nothing.  Otherwise, write the
   supplied V bits to the shadow state.  We can pass in either an
   original atom or a V-atom, but not both.  In the former case the
   relevant V-bits are then generated from the original.
   We assume here that the definedness of GUARD has already been checked.
*/
static
void do_shadow_PUT ( MCEnv* mce, Int offset,
                     IRAtom* atom, IRAtom* vatom, IRExpr* guard )
{
   IRType ty;

   // Don't do shadow PUTs if we're not doing undefined value checking.
   // Their absence lets Vex's optimiser remove all the shadow computation
   // that they depend on, which includes GETs of the shadow registers.
   if (MC_(clo_mc_level) == 1)
      return;

   if (atom) {
      tl_assert(!vatom);
      tl_assert(isOriginalAtom(mce, atom));
      vatom = expr2vbits( mce, atom );
   } else {
      tl_assert(vatom);
      tl_assert(isShadowAtom(mce, vatom));
   }

   ty = typeOfIRExpr(mce->sb->tyenv, vatom);
   tl_assert(ty != Ity_I1);
   if (isAlwaysDefd(mce, offset, sizeofIRType(ty))) {
      /* later: no ... */
      /* emit code to emit a complaint if any of the vbits are 1. */
      /* complainIfUndefined(mce, atom); */
   } else {
      /* Do a plain shadow Put. */
      if (guard) {
         /* If the guard expression evaluates to false we simply Put the
            value that is already stored in the guest state slot */
         IRAtom *cond, *iffalse;

         cond    = assignNew('V', mce, Ity_I1, guard);
         iffalse = assignNew('V', mce, ty,
                             IRExpr_Get(offset + mce->layout->total_sizeB, ty));
         vatom   = assignNew('V', mce, ty, IRExpr_ITE(cond, vatom, iffalse));
      }
      stmt( 'V', mce, IRStmt_Put( offset + mce->layout->total_sizeB, vatom ));
   }
}


/* Generate into bb suitable actions to shadow this PutI.  If the
   state slice is marked 'always defined', do nothing.  Otherwise,
   write the supplied V bits to the shadow state.
*/
static
void do_shadow_PUTI ( MCEnv* mce, IRPutI* puti )
{
   IRAtom*     vatom;
   IRType      ty, tyS;
   Int         arrSize;
   IRRegArray* descr = puti->descr;
   IRAtom*     ix    = puti->ix;
   Int         bias  = puti->bias;
   IRAtom*     atom  = puti->data;

   // Don't do shadow PUTIs if we're not doing undefined value checking.
   // Their absence lets Vex's optimiser remove all the shadow computation
   // that they depend on, which includes GETIs of the shadow registers.
   if (MC_(clo_mc_level) == 1)
      return;

   tl_assert(isOriginalAtom(mce,atom));
   vatom = expr2vbits( mce, atom );
   tl_assert(sameKindedAtoms(atom, vatom));
   ty   = descr->elemTy;
   tyS  = shadowTypeV(ty);
   arrSize = descr->nElems * sizeofIRType(ty);
   tl_assert(ty != Ity_I1);
   tl_assert(isOriginalAtom(mce,ix));
   complainIfUndefined(mce, ix, NULL);
   if (isAlwaysDefd(mce, descr->base, arrSize)) {
      /* later: no ... */
      /* emit code to emit a complaint if any of the vbits are 1. */
      /* complainIfUndefined(mce, atom); */
   } else {
      /* Do a cloned version of the Put that refers to the shadow
         area. */
      IRRegArray* new_descr
         = mkIRRegArray( descr->base + mce->layout->total_sizeB,
                         tyS, descr->nElems);
      stmt( 'V', mce, IRStmt_PutI( mkIRPutI(new_descr, ix, bias, vatom) ));
   }
}


/* Return an expression which contains the V bits corresponding to the
   given GET (passed in in pieces).
*/
static
IRExpr* shadow_GET ( MCEnv* mce, Int offset, IRType ty )
{
   IRType tyS = shadowTypeV(ty);
   tl_assert(ty != Ity_I1);
   tl_assert(ty != Ity_I128);
   if (isAlwaysDefd(mce, offset, sizeofIRType(ty))) {
      /* Always defined, return all zeroes of the relevant type */
      return definedOfType(tyS);
   } else {
      /* return a cloned version of the Get that refers to the shadow
         area. */
      /* FIXME: this isn't an atom! */
      return IRExpr_Get( offset + mce->layout->total_sizeB, tyS );
   }
}


/* Return an expression which contains the V bits corresponding to the
   given GETI (passed in in pieces).
*/
static
IRExpr* shadow_GETI ( MCEnv* mce,
                      IRRegArray* descr, IRAtom* ix, Int bias )
{
   IRType ty   = descr->elemTy;
   IRType tyS  = shadowTypeV(ty);
   Int arrSize = descr->nElems * sizeofIRType(ty);
   tl_assert(ty != Ity_I1);
   tl_assert(isOriginalAtom(mce,ix));
   complainIfUndefined(mce, ix, NULL);
   if (isAlwaysDefd(mce, descr->base, arrSize)) {
      /* Always defined, return all zeroes of the relevant type */
      return definedOfType(tyS);
   } else {
      /* return a cloned version of the Get that refers to the shadow
         area. */
      IRRegArray* new_descr
         = mkIRRegArray( descr->base + mce->layout->total_sizeB,
                         tyS, descr->nElems);
      return IRExpr_GetI( new_descr, ix, bias );
   }
}


/*------------------------------------------------------------*/
/*--- Generating approximations for unknown operations,    ---*/
/*--- using lazy-propagate semantics                       ---*/
/*------------------------------------------------------------*/

/* Lazy propagation of undefinedness from two values, resulting in the
   specified shadow type.
*/
static
IRAtom* mkLazy2 ( MCEnv* mce, IRType finalVty, IRAtom* va1, IRAtom* va2 )
{
   IRAtom* at;
   IRType t1 = typeOfIRExpr(mce->sb->tyenv, va1);
   IRType t2 = typeOfIRExpr(mce->sb->tyenv, va2);
   tl_assert(isShadowAtom(mce,va1));
   tl_assert(isShadowAtom(mce,va2));

   /* The general case is inefficient because PCast is an expensive
      operation.  Here are some special cases which use PCast only
      once rather than twice. */

   /* I64 x I64 -> I64 */
   if (t1 == Ity_I64 && t2 == Ity_I64 && finalVty == Ity_I64) {
      if (0) VG_(printf)("mkLazy2: I64 x I64 -> I64\n");
      at = mkUifU(mce, Ity_I64, va1, va2);
      at = mkPCastTo(mce, Ity_I64, at);
      return at;
   }

   /* I64 x I64 -> I32 */
   if (t1 == Ity_I64 && t2 == Ity_I64 && finalVty == Ity_I32) {
      if (0) VG_(printf)("mkLazy2: I64 x I64 -> I32\n");
      at = mkUifU(mce, Ity_I64, va1, va2);
      at = mkPCastTo(mce, Ity_I32, at);
      return at;
   }

   if (0) {
      VG_(printf)("mkLazy2 ");
      ppIRType(t1);
      VG_(printf)("_");
      ppIRType(t2);
      VG_(printf)("_");
      ppIRType(finalVty);
      VG_(printf)("\n");
   }

   /* General case: force everything via 32-bit intermediaries. */
   at = mkPCastTo(mce, Ity_I32, va1);
   at = mkUifU(mce, Ity_I32, at, mkPCastTo(mce, Ity_I32, va2));
   at = mkPCastTo(mce, finalVty, at);
   return at;
}
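
The lazy scheme above can be modelled concretely: UifU is bitwise OR of V bits, and PCast smears any undefinedness across the whole word. This is an illustrative standalone sketch (helper names `pcast32`/`lazy2_32` are invented, not Memcheck functions), with a 1 V bit meaning "undefined":

```c
#include <assert.h>
#include <stdint.h>

/* Pessimising cast: if any V bit is 1 (undefined), the whole result
   becomes all-ones (fully undefined); otherwise fully defined. */
static uint32_t pcast32 ( uint32_t vbits )
{
   return vbits != 0 ? 0xFFFFFFFFu : 0u;
}

/* Concrete model of mkLazy2's general case for I32 operands: PCast
   each argument, merge with UifU (bitwise OR on V bits), then PCast
   to the final type. */
static uint32_t lazy2_32 ( uint32_t va1, uint32_t va2 )
{
   uint32_t at = pcast32(va1);
   at |= pcast32(va2);        /* UifU == OR on V bits */
   return pcast32(at);
}
```

Any single undefined bit in either input makes the entire result undefined, which is exactly the pessimistic approximation intended for unknown operations.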


/* 3-arg version of the above. */
static
IRAtom* mkLazy3 ( MCEnv* mce, IRType finalVty,
                  IRAtom* va1, IRAtom* va2, IRAtom* va3 )
{
   IRAtom* at;
   IRType t1 = typeOfIRExpr(mce->sb->tyenv, va1);
   IRType t2 = typeOfIRExpr(mce->sb->tyenv, va2);
   IRType t3 = typeOfIRExpr(mce->sb->tyenv, va3);
   tl_assert(isShadowAtom(mce,va1));
   tl_assert(isShadowAtom(mce,va2));
   tl_assert(isShadowAtom(mce,va3));

   /* The general case is inefficient because PCast is an expensive
      operation.  Here are some special cases which use PCast only
      twice rather than three times. */

   /* I32 x I64 x I64 -> I64 */
   /* Standard FP idiom: rm x FParg1 x FParg2 -> FPresult */
   if (t1 == Ity_I32 && t2 == Ity_I64 && t3 == Ity_I64
       && finalVty == Ity_I64) {
      if (0) VG_(printf)("mkLazy3: I32 x I64 x I64 -> I64\n");
      /* Widen 1st arg to I64.  Since 1st arg is typically a rounding
         mode indication which is fully defined, this should get
         folded out later. */
      at = mkPCastTo(mce, Ity_I64, va1);
      /* Now fold in 2nd and 3rd args. */
      at = mkUifU(mce, Ity_I64, at, va2);
      at = mkUifU(mce, Ity_I64, at, va3);
      /* and PCast once again. */
      at = mkPCastTo(mce, Ity_I64, at);
      return at;
   }

   /* I32 x I8 x I64 -> I64 */
   if (t1 == Ity_I32 && t2 == Ity_I8 && t3 == Ity_I64
       && finalVty == Ity_I64) {
      if (0) VG_(printf)("mkLazy3: I32 x I8 x I64 -> I64\n");
      /* Widen 1st and 2nd args to I64.  Since 1st arg is typically a
       * rounding mode indication which is fully defined, this should
       * get folded out later.
       */
      IRAtom* at1 = mkPCastTo(mce, Ity_I64, va1);
      IRAtom* at2 = mkPCastTo(mce, Ity_I64, va2);
      at = mkUifU(mce, Ity_I64, at1, at2);  // UifU(PCast(va1), PCast(va2))
      at = mkUifU(mce, Ity_I64, at, va3);
      /* and PCast once again. */
      at = mkPCastTo(mce, Ity_I64, at);
      return at;
   }

   /* I32 x I64 x I64 -> I32 */
   if (t1 == Ity_I32 && t2 == Ity_I64 && t3 == Ity_I64
       && finalVty == Ity_I32) {
      if (0) VG_(printf)("mkLazy3: I32 x I64 x I64 -> I32\n");
      at = mkPCastTo(mce, Ity_I64, va1);
      at = mkUifU(mce, Ity_I64, at, va2);
      at = mkUifU(mce, Ity_I64, at, va3);
      at = mkPCastTo(mce, Ity_I32, at);
      return at;
   }

   /* I32 x I32 x I32 -> I32 */
   /* 32-bit FP idiom, as (eg) happens on ARM */
   if (t1 == Ity_I32 && t2 == Ity_I32 && t3 == Ity_I32
       && finalVty == Ity_I32) {
      if (0) VG_(printf)("mkLazy3: I32 x I32 x I32 -> I32\n");
      at = va1;
      at = mkUifU(mce, Ity_I32, at, va2);
      at = mkUifU(mce, Ity_I32, at, va3);
      at = mkPCastTo(mce, Ity_I32, at);
      return at;
   }

   /* I32 x I128 x I128 -> I128 */
   /* Standard FP idiom: rm x FParg1 x FParg2 -> FPresult */
   if (t1 == Ity_I32 && t2 == Ity_I128 && t3 == Ity_I128
       && finalVty == Ity_I128) {
      if (0) VG_(printf)("mkLazy3: I32 x I128 x I128 -> I128\n");
      /* Widen 1st arg to I128.  Since 1st arg is typically a rounding
         mode indication which is fully defined, this should get
         folded out later. */
      at = mkPCastTo(mce, Ity_I128, va1);
      /* Now fold in 2nd and 3rd args. */
      at = mkUifU(mce, Ity_I128, at, va2);
      at = mkUifU(mce, Ity_I128, at, va3);
      /* and PCast once again. */
      at = mkPCastTo(mce, Ity_I128, at);
      return at;
   }

   /* I32 x I8 x I128 -> I128 */
   /* Standard FP idiom: rm x FParg1 x FParg2 -> FPresult */
   if (t1 == Ity_I32 && t2 == Ity_I8 && t3 == Ity_I128
       && finalVty == Ity_I128) {
      if (0) VG_(printf)("mkLazy3: I32 x I8 x I128 -> I128\n");
      /* Use I64 as an intermediate type, which means PCasting all 3
         args to I64 to start with.  1st arg is typically a rounding
         mode indication which is fully defined, so we hope that it
         will get folded out later. */
      IRAtom* at1 = mkPCastTo(mce, Ity_I64, va1);
      IRAtom* at2 = mkPCastTo(mce, Ity_I64, va2);
      IRAtom* at3 = mkPCastTo(mce, Ity_I64, va3);
      /* Now UifU all three together. */
      at = mkUifU(mce, Ity_I64, at1, at2);  // UifU(PCast(va1), PCast(va2))
      at = mkUifU(mce, Ity_I64, at, at3);   // ... `UifU` PCast(va3)
      /* and PCast once again. */
      at = mkPCastTo(mce, Ity_I128, at);
      return at;
   }
   if (1) {
      VG_(printf)("mkLazy3: ");
      ppIRType(t1);
      VG_(printf)(" x ");
      ppIRType(t2);
      VG_(printf)(" x ");
      ppIRType(t3);
      VG_(printf)(" -> ");
      ppIRType(finalVty);
      VG_(printf)("\n");
   }

   tl_assert(0);
   /* General case: force everything via 32-bit intermediaries. */
   /*
   at = mkPCastTo(mce, Ity_I32, va1);
   at = mkUifU(mce, Ity_I32, at, mkPCastTo(mce, Ity_I32, va2));
   at = mkUifU(mce, Ity_I32, at, mkPCastTo(mce, Ity_I32, va3));
   at = mkPCastTo(mce, finalVty, at);
   return at;
   */
}


/* 4-arg version of the above. */
static
IRAtom* mkLazy4 ( MCEnv* mce, IRType finalVty,
                  IRAtom* va1, IRAtom* va2, IRAtom* va3, IRAtom* va4 )
{
   IRAtom* at;
   IRType t1 = typeOfIRExpr(mce->sb->tyenv, va1);
   IRType t2 = typeOfIRExpr(mce->sb->tyenv, va2);
   IRType t3 = typeOfIRExpr(mce->sb->tyenv, va3);
   IRType t4 = typeOfIRExpr(mce->sb->tyenv, va4);
   tl_assert(isShadowAtom(mce,va1));
   tl_assert(isShadowAtom(mce,va2));
   tl_assert(isShadowAtom(mce,va3));
   tl_assert(isShadowAtom(mce,va4));

   /* The general case is inefficient because PCast is an expensive
      operation.  Here are some special cases which use PCast only
      twice rather than four times. */

   /* Standard FP idiom: rm x FParg1 x FParg2 x FParg3 -> FPresult */

   if (t1 == Ity_I32 && t2 == Ity_I128 && t3 == Ity_I128 && t4 == Ity_I128
       && finalVty == Ity_I128) {
      if (0) VG_(printf)("mkLazy4: I32 x I128 x I128 x I128 -> I128\n");
      /* Widen 1st arg to I128.  Since 1st arg is typically a rounding
         mode indication which is fully defined, this should get
         folded out later. */
      at = mkPCastTo(mce, Ity_I128, va1);
      /* Now fold in 2nd, 3rd, 4th args. */
      at = mkUifU(mce, Ity_I128, at, va2);
      at = mkUifU(mce, Ity_I128, at, va3);
      at = mkUifU(mce, Ity_I128, at, va4);
      /* and PCast once again. */
      at = mkPCastTo(mce, Ity_I128, at);
      return at;
   }

   /* I32 x I64 x I64 x I64 -> I64 */
   if (t1 == Ity_I32 && t2 == Ity_I64 && t3 == Ity_I64 && t4 == Ity_I64
       && finalVty == Ity_I64) {
      if (0) VG_(printf)("mkLazy4: I32 x I64 x I64 x I64 -> I64\n");
      /* Widen 1st arg to I64.  Since 1st arg is typically a rounding
         mode indication which is fully defined, this should get
         folded out later. */
      at = mkPCastTo(mce, Ity_I64, va1);
      /* Now fold in 2nd, 3rd, 4th args. */
      at = mkUifU(mce, Ity_I64, at, va2);
      at = mkUifU(mce, Ity_I64, at, va3);
      at = mkUifU(mce, Ity_I64, at, va4);
      /* and PCast once again. */
      at = mkPCastTo(mce, Ity_I64, at);
      return at;
   }
   /* I32 x I32 x I32 x I32 -> I32 */
   /* Standard FP idiom: rm x FParg1 x FParg2 x FParg3 -> FPresult */
   if (t1 == Ity_I32 && t2 == Ity_I32 && t3 == Ity_I32 && t4 == Ity_I32
       && finalVty == Ity_I32) {
      if (0) VG_(printf)("mkLazy4: I32 x I32 x I32 x I32 -> I32\n");
      at = va1;
      /* Now fold in 2nd, 3rd, 4th args. */
      at = mkUifU(mce, Ity_I32, at, va2);
      at = mkUifU(mce, Ity_I32, at, va3);
      at = mkUifU(mce, Ity_I32, at, va4);
      at = mkPCastTo(mce, Ity_I32, at);
      return at;
   }

   if (1) {
      VG_(printf)("mkLazy4: ");
      ppIRType(t1);
      VG_(printf)(" x ");
      ppIRType(t2);
      VG_(printf)(" x ");
      ppIRType(t3);
      VG_(printf)(" x ");
      ppIRType(t4);
      VG_(printf)(" -> ");
      ppIRType(finalVty);
      VG_(printf)("\n");
   }

   tl_assert(0);
}


/* Do the lazy propagation game from a null-terminated vector of
   atoms.  This is presumably the arguments to a helper call, so the
   IRCallee info is also supplied in order that we can know which
   arguments should be ignored (via the .mcx_mask field).
*/
static
IRAtom* mkLazyN ( MCEnv* mce,
                  IRAtom** exprvec, IRType finalVtype, IRCallee* cee )
{
   Int     i;
   IRAtom* here;
   IRAtom* curr;
   IRType  mergeTy;
   Bool    mergeTy64 = True;

   /* Decide on the type of the merge intermediary.  If all relevant
      args are I64, then it's I64.  In all other circumstances, use
      I32. */
   for (i = 0; exprvec[i]; i++) {
      tl_assert(i < 32);
      tl_assert(isOriginalAtom(mce, exprvec[i]));
      if (cee->mcx_mask & (1<<i))
         continue;
      if (typeOfIRExpr(mce->sb->tyenv, exprvec[i]) != Ity_I64)
         mergeTy64 = False;
   }

   mergeTy = mergeTy64 ? Ity_I64 : Ity_I32;
   curr    = definedOfType(mergeTy);

   for (i = 0; exprvec[i]; i++) {
      tl_assert(i < 32);
      tl_assert(isOriginalAtom(mce, exprvec[i]));
      /* Only take notice of this arg if the callee's mc-exclusion
         mask does not say it is to be excluded. */
      if (cee->mcx_mask & (1<<i)) {
         /* the arg is to be excluded from definedness checking.  Do
            nothing. */
         if (0) VG_(printf)("excluding %s(%d)\n", cee->name, i);
      } else {
         /* calculate the arg's definedness, and pessimistically merge
            it in. */
         here = mkPCastTo( mce, mergeTy, expr2vbits(mce, exprvec[i]) );
         curr = mergeTy64
                   ? mkUifU64(mce, here, curr)
                   : mkUifU32(mce, here, curr);
      }
   }
   return mkPCastTo(mce, finalVtype, curr );
}


/*------------------------------------------------------------*/
/*--- Generating expensive sequences for exact carry-chain ---*/
/*--- propagation in add/sub and related operations.       ---*/
/*------------------------------------------------------------*/

static
IRAtom* expensiveAddSub ( MCEnv*  mce,
                          Bool    add,
                          IRType  ty,
                          IRAtom* qaa, IRAtom* qbb,
                          IRAtom* aa,  IRAtom* bb )
{
   IRAtom *a_min, *b_min, *a_max, *b_max;
   IROp   opAND, opOR, opXOR, opNOT, opADD, opSUB;

   tl_assert(isShadowAtom(mce,qaa));
   tl_assert(isShadowAtom(mce,qbb));
   tl_assert(isOriginalAtom(mce,aa));
   tl_assert(isOriginalAtom(mce,bb));
   tl_assert(sameKindedAtoms(qaa,aa));
   tl_assert(sameKindedAtoms(qbb,bb));

   switch (ty) {
      case Ity_I32:
         opAND = Iop_And32;
         opOR  = Iop_Or32;
         opXOR = Iop_Xor32;
         opNOT = Iop_Not32;
         opADD = Iop_Add32;
         opSUB = Iop_Sub32;
         break;
      case Ity_I64:
         opAND = Iop_And64;
         opOR  = Iop_Or64;
         opXOR = Iop_Xor64;
         opNOT = Iop_Not64;
         opADD = Iop_Add64;
         opSUB = Iop_Sub64;
         break;
      default:
         VG_(tool_panic)("expensiveAddSub");
   }

   // a_min = aa & ~qaa
   a_min = assignNew('V', mce,ty,
                     binop(opAND, aa,
                           assignNew('V', mce,ty, unop(opNOT, qaa))));

   // b_min = bb & ~qbb
   b_min = assignNew('V', mce,ty,
                     binop(opAND, bb,
                           assignNew('V', mce,ty, unop(opNOT, qbb))));

   // a_max = aa | qaa
   a_max = assignNew('V', mce,ty, binop(opOR, aa, qaa));

   // b_max = bb | qbb
   b_max = assignNew('V', mce,ty, binop(opOR, bb, qbb));

   if (add) {
      // result = (qaa | qbb) | ((a_min + b_min) ^ (a_max + b_max))
      return
      assignNew('V', mce,ty,
         binop( opOR,
                assignNew('V', mce,ty, binop(opOR, qaa, qbb)),
                assignNew('V', mce,ty,
                   binop( opXOR,
                          assignNew('V', mce,ty, binop(opADD, a_min, b_min)),
                          assignNew('V', mce,ty, binop(opADD, a_max, b_max))
                   )
                )
         )
      );
   } else {
      // result = (qaa | qbb) | ((a_min - b_max) ^ (a_max - b_min))
      return
      assignNew('V', mce,ty,
         binop( opOR,
                assignNew('V', mce,ty, binop(opOR, qaa, qbb)),
                assignNew('V', mce,ty,
                   binop( opXOR,
                          assignNew('V', mce,ty, binop(opSUB, a_min, b_max)),
                          assignNew('V', mce,ty, binop(opSUB, a_max, b_min))
                   )
                )
         )
      );
   }

}
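
The scheme above bounds the true sum between the "all undefined bits are 0" and "all undefined bits are 1" extremes; any bit on which the two extreme sums disagree may have been touched by an unknown carry. This standalone sketch models the addition case on concrete 32-bit values (helper name `expensive_add32` is invented, not a Memcheck function; a 1 bit in qaa/qbb means "undefined"):

```c
#include <assert.h>
#include <stdint.h>

/* Concrete model of the expensiveAddSub scheme for addition.
   aa/bb are the actual values, qaa/qbb their V bits. */
static uint32_t expensive_add32 ( uint32_t qaa, uint32_t qbb,
                                  uint32_t aa,  uint32_t bb )
{
   uint32_t a_min = aa & ~qaa;   /* undefined bits forced to 0 */
   uint32_t b_min = bb & ~qbb;
   uint32_t a_max = aa | qaa;    /* undefined bits forced to 1 */
   uint32_t b_max = bb | qbb;
   /* A result bit is undefined if it was undefined in either input,
      or if the two extreme sums disagree on it. */
   return (qaa | qbb) | ((a_min + b_min) ^ (a_max + b_max));
}
```

For instance, adding 5 (with bit 1 undefined) to a fully defined 3 gives a result whose only undefined bit is bit 1, rather than the fully-undefined word the lazy scheme would produce.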


static
IRAtom* expensiveCountTrailingZeroes ( MCEnv* mce, IROp czop,
                                       IRAtom* atom, IRAtom* vatom )
{
   IRType ty;
   IROp xorOp, subOp, andOp;
   IRExpr *one;
   IRAtom *improver, *improved;
   tl_assert(isShadowAtom(mce,vatom));
   tl_assert(isOriginalAtom(mce,atom));
   tl_assert(sameKindedAtoms(atom,vatom));

   switch (czop) {
      case Iop_Ctz32:
         ty = Ity_I32;
         xorOp = Iop_Xor32;
         subOp = Iop_Sub32;
         andOp = Iop_And32;
         one = mkU32(1);
         break;
      case Iop_Ctz64:
         ty = Ity_I64;
         xorOp = Iop_Xor64;
         subOp = Iop_Sub64;
         andOp = Iop_And64;
         one = mkU64(1);
         break;
      default:
         ppIROp(czop);
         VG_(tool_panic)("memcheck:expensiveCountTrailingZeroes");
   }

   // improver = atom ^ (atom - 1)
   //
   // That is, improver has its low ctz(atom)+1 bits equal to one;
   // higher bits (if any) equal to zero.
   improver = assignNew('V', mce,ty,
                        binop(xorOp,
                              atom,
                              assignNew('V', mce, ty,
                                        binop(subOp, atom, one))));

   // improved = vatom & improver
   //
   // That is, treat any V bits above the lowest set bit of atom as
   // "defined", since they cannot affect the count of trailing zeroes.
   improved = assignNew('V', mce, ty,
                        binop(andOp, vatom, improver));

   // Return pessimizing cast of improved.
   return mkPCastTo(mce, ty, improved);
}
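
The `atom ^ (atom - 1)` trick can be checked on concrete values. The following standalone sketch (helper names `pcast32`/`ctz32_vbits` are invented for illustration) models the Ctz32 case:

```c
#include <assert.h>
#include <stdint.h>

/* Pessimising cast: any undefined bit poisons the whole word. */
static uint32_t pcast32 ( uint32_t vbits )
{
   return vbits != 0 ? 0xFFFFFFFFu : 0u;
}

/* Concrete model of the Ctz32 improvement: V bits strictly above the
   lowest set bit of |atom| cannot change ctz(atom), so mask them off
   before the pessimising cast. */
static uint32_t ctz32_vbits ( uint32_t atom, uint32_t vatom )
{
   uint32_t improver = atom ^ (atom - 1);  /* low ctz(atom)+1 bits set */
   uint32_t improved = vatom & improver;
   return pcast32(improved);
}
```

With `atom = 8` (ctz = 3), an undefined bit 4 is harmless and the result is fully defined, whereas an undefined bit 2 could change the count and poisons the result.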


/*------------------------------------------------------------*/
/*--- Scalar shifts.                                       ---*/
/*------------------------------------------------------------*/

/* Produce an interpretation for (aa << bb) (or >>s, >>u).  The basic
   idea is to shift the definedness bits by the original shift amount.
   This introduces 0s ("defined") in new positions for left shifts and
   unsigned right shifts, and copies the top definedness bit for
   signed right shifts.  So, conveniently, applying the original shift
   operator to the definedness bits for the left arg is exactly the
   right thing to do:

      (qaa << bb)

   However if the shift amount is undefined then the whole result
   is undefined.  Hence need:

      (qaa << bb) `UifU` PCast(qbb)

   If the shift amount bb is a literal then qbb will say 'all defined'
   and the UifU and PCast will get folded out by post-instrumentation
   optimisation.
*/
static IRAtom* scalarShift ( MCEnv*  mce,
                             IRType  ty,
                             IROp    original_op,
                             IRAtom* qaa, IRAtom* qbb,
                             IRAtom* aa,  IRAtom* bb )
{
   tl_assert(isShadowAtom(mce,qaa));
   tl_assert(isShadowAtom(mce,qbb));
   tl_assert(isOriginalAtom(mce,aa));
   tl_assert(isOriginalAtom(mce,bb));
   tl_assert(sameKindedAtoms(qaa,aa));
   tl_assert(sameKindedAtoms(qbb,bb));
   return
      assignNew(
         'V', mce, ty,
         mkUifU( mce, ty,
                 assignNew('V', mce, ty, binop(original_op, qaa, bb)),
                 mkPCastTo(mce, ty, qbb)
         )
      );
}
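
The comment's formula `(qaa << bb) \`UifU\` PCast(qbb)` is easy to model concretely. This standalone sketch (helper names `pcast32`/`shl32_vbits` are invented) shows the left-shift case on 32-bit values:

```c
#include <assert.h>
#include <stdint.h>

/* Pessimising cast: any undefined bit poisons the whole word. */
static uint32_t pcast32 ( uint32_t vbits )
{
   return vbits != 0 ? 0xFFFFFFFFu : 0u;
}

/* Concrete model of scalarShift for a 32-bit left shift: shift the
   left operand's V bits by the actual shift amount, then merge in a
   pessimised view of the shift amount's own V bits. */
static uint32_t shl32_vbits ( uint32_t qaa, uint32_t qbb, uint32_t bb )
{
   return (qaa << bb) | pcast32(qbb);   /* UifU == OR on V bits */
}
```

A defined shift amount moves the undefined bits along with the data; an undefined shift amount makes the entire result undefined.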


/*------------------------------------------------------------*/
/*--- Helpers for dealing with vector primops.             ---*/
/*------------------------------------------------------------*/

/* Vector pessimisation -- pessimise within each lane individually. */

static IRAtom* mkPCast8x16 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_V128, unop(Iop_CmpNEZ8x16, at));
}

static IRAtom* mkPCast16x8 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_V128, unop(Iop_CmpNEZ16x8, at));
}

static IRAtom* mkPCast32x4 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_V128, unop(Iop_CmpNEZ32x4, at));
}

static IRAtom* mkPCast64x2 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_V128, unop(Iop_CmpNEZ64x2, at));
}

static IRAtom* mkPCast64x4 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_V256, unop(Iop_CmpNEZ64x4, at));
}

static IRAtom* mkPCast32x8 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_V256, unop(Iop_CmpNEZ32x8, at));
}

static IRAtom* mkPCast32x2 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_I64, unop(Iop_CmpNEZ32x2, at));
}

static IRAtom* mkPCast16x16 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_V256, unop(Iop_CmpNEZ16x16, at));
}

static IRAtom* mkPCast16x4 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_I64, unop(Iop_CmpNEZ16x4, at));
}

static IRAtom* mkPCast8x32 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_V256, unop(Iop_CmpNEZ8x32, at));
}

static IRAtom* mkPCast8x8 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_I64, unop(Iop_CmpNEZ8x8, at));
}

static IRAtom* mkPCast16x2 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_I32, unop(Iop_CmpNEZ16x2, at));
}

static IRAtom* mkPCast8x4 ( MCEnv* mce, IRAtom* at )
{
   return assignNew('V', mce, Ity_I32, unop(Iop_CmpNEZ8x4, at));
}
2165
sewardja1d93302004-12-12 16:45:06 +00002166
sewardj3245c912004-12-10 14:58:26 +00002167/* Here's a simple scheme capable of handling ops derived from SSE1
2168 code and while only generating ops that can be efficiently
2169 implemented in SSE1. */
2170
2171/* All-lanes versions are straightforward:
2172
sewardj20d38f22005-02-07 23:50:18 +00002173 binary32Fx4(x,y) ==> PCast32x4(UifUV128(x#,y#))
sewardj3245c912004-12-10 14:58:26 +00002174
2175 unary32Fx4(x,y) ==> PCast32x4(x#)
2176
2177 Lowest-lane-only versions are more complex:
2178
sewardj20d38f22005-02-07 23:50:18 +00002179 binary32F0x4(x,y) ==> SetV128lo32(
sewardj3245c912004-12-10 14:58:26 +00002180 x#,
sewardj20d38f22005-02-07 23:50:18 +00002181 PCast32(V128to32(UifUV128(x#,y#)))
sewardj3245c912004-12-10 14:58:26 +00002182 )
2183
2184 This is perhaps not so obvious. In particular, it's faster to
sewardj20d38f22005-02-07 23:50:18 +00002185 do a V128-bit UifU and then take the bottom 32 bits than the more
sewardj3245c912004-12-10 14:58:26 +00002186 obvious scheme of taking the bottom 32 bits of each operand
2187 and doing a 32-bit UifU. Basically since UifU is fast and
2188 chopping lanes off vector values is slow.
2189
2190 Finally:
2191
sewardj20d38f22005-02-07 23:50:18 +00002192 unary32F0x4(x) ==> SetV128lo32(
sewardj3245c912004-12-10 14:58:26 +00002193 x#,
sewardj20d38f22005-02-07 23:50:18 +00002194 PCast32(V128to32(x#))
sewardj3245c912004-12-10 14:58:26 +00002195 )
2196
2197 Where:
2198
2199 PCast32(v#) = 1Sto32(CmpNE32(v#,0))
2200 PCast32x4(v#) = CmpNEZ32x4(v#)
2201*/

static
IRAtom* binary32Fx4 ( MCEnv* mce, IRAtom* vatomX, IRAtom* vatomY )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   tl_assert(isShadowAtom(mce, vatomY));
   at = mkUifUV128(mce, vatomX, vatomY);
   at = assignNew('V', mce, Ity_V128, mkPCast32x4(mce, at));
   return at;
}

static
IRAtom* unary32Fx4 ( MCEnv* mce, IRAtom* vatomX )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   at = assignNew('V', mce, Ity_V128, mkPCast32x4(mce, vatomX));
   return at;
}

static
IRAtom* binary32F0x4 ( MCEnv* mce, IRAtom* vatomX, IRAtom* vatomY )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   tl_assert(isShadowAtom(mce, vatomY));
   at = mkUifUV128(mce, vatomX, vatomY);
   at = assignNew('V', mce, Ity_I32, unop(Iop_V128to32, at));
   at = mkPCastTo(mce, Ity_I32, at);
   at = assignNew('V', mce, Ity_V128, binop(Iop_SetV128lo32, vatomX, at));
   return at;
}

static
IRAtom* unary32F0x4 ( MCEnv* mce, IRAtom* vatomX )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   at = assignNew('V', mce, Ity_I32, unop(Iop_V128to32, vatomX));
   at = mkPCastTo(mce, Ity_I32, at);
   at = assignNew('V', mce, Ity_V128, binop(Iop_SetV128lo32, vatomX, at));
   return at;
}

/* --- ... and ... 64Fx2 versions of the same ... --- */

static
IRAtom* binary64Fx2 ( MCEnv* mce, IRAtom* vatomX, IRAtom* vatomY )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   tl_assert(isShadowAtom(mce, vatomY));
   at = mkUifUV128(mce, vatomX, vatomY);
   at = assignNew('V', mce, Ity_V128, mkPCast64x2(mce, at));
   return at;
}

static
IRAtom* unary64Fx2 ( MCEnv* mce, IRAtom* vatomX )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   at = assignNew('V', mce, Ity_V128, mkPCast64x2(mce, vatomX));
   return at;
}

static
IRAtom* binary64F0x2 ( MCEnv* mce, IRAtom* vatomX, IRAtom* vatomY )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   tl_assert(isShadowAtom(mce, vatomY));
   at = mkUifUV128(mce, vatomX, vatomY);
   at = assignNew('V', mce, Ity_I64, unop(Iop_V128to64, at));
   at = mkPCastTo(mce, Ity_I64, at);
   at = assignNew('V', mce, Ity_V128, binop(Iop_SetV128lo64, vatomX, at));
   return at;
}

static
IRAtom* unary64F0x2 ( MCEnv* mce, IRAtom* vatomX )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   at = assignNew('V', mce, Ity_I64, unop(Iop_V128to64, vatomX));
   at = mkPCastTo(mce, Ity_I64, at);
   at = assignNew('V', mce, Ity_V128, binop(Iop_SetV128lo64, vatomX, at));
   return at;
}

/* --- --- ... and ... 32Fx2 versions of the same --- --- */

static
IRAtom* binary32Fx2 ( MCEnv* mce, IRAtom* vatomX, IRAtom* vatomY )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   tl_assert(isShadowAtom(mce, vatomY));
   at = mkUifU64(mce, vatomX, vatomY);
   at = assignNew('V', mce, Ity_I64, mkPCast32x2(mce, at));
   return at;
}

static
IRAtom* unary32Fx2 ( MCEnv* mce, IRAtom* vatomX )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   at = assignNew('V', mce, Ity_I64, mkPCast32x2(mce, vatomX));
   return at;
}

/* --- ... and ... 64Fx4 versions of the same ... --- */

static
IRAtom* binary64Fx4 ( MCEnv* mce, IRAtom* vatomX, IRAtom* vatomY )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   tl_assert(isShadowAtom(mce, vatomY));
   at = mkUifUV256(mce, vatomX, vatomY);
   at = assignNew('V', mce, Ity_V256, mkPCast64x4(mce, at));
   return at;
}

static
IRAtom* unary64Fx4 ( MCEnv* mce, IRAtom* vatomX )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   at = assignNew('V', mce, Ity_V256, mkPCast64x4(mce, vatomX));
   return at;
}

/* --- ... and ... 32Fx8 versions of the same ... --- */

static
IRAtom* binary32Fx8 ( MCEnv* mce, IRAtom* vatomX, IRAtom* vatomY )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   tl_assert(isShadowAtom(mce, vatomY));
   at = mkUifUV256(mce, vatomX, vatomY);
   at = assignNew('V', mce, Ity_V256, mkPCast32x8(mce, at));
   return at;
}

static
IRAtom* unary32Fx8 ( MCEnv* mce, IRAtom* vatomX )
{
   IRAtom* at;
   tl_assert(isShadowAtom(mce, vatomX));
   at = assignNew('V', mce, Ity_V256, mkPCast32x8(mce, vatomX));
   return at;
}

/* --- 64Fx2 binary FP ops, with rounding mode --- */

static
IRAtom* binary64Fx2_w_rm ( MCEnv* mce, IRAtom* vRM,
                           IRAtom* vatomX, IRAtom* vatomY )
{
   /* This is the same as binary64Fx2, except that we subsequently
      pessimise vRM (definedness of the rounding mode), widen to 128
      bits and UifU it into the result.  As with the scalar cases, if
      the RM is a constant then it is defined and so this extra bit
      will get constant-folded out later. */
   // "do" the vector args
   IRAtom* t1 = binary64Fx2(mce, vatomX, vatomY);
   // PCast the RM, and widen it to 128 bits
   IRAtom* t2 = mkPCastTo(mce, Ity_V128, vRM);
   // Roll it into the result
   t1 = mkUifUV128(mce, t1, t2);
   return t1;
}

/* --- ... and ... 32Fx4 versions of the same --- */

static
IRAtom* binary32Fx4_w_rm ( MCEnv* mce, IRAtom* vRM,
                           IRAtom* vatomX, IRAtom* vatomY )
{
   IRAtom* t1 = binary32Fx4(mce, vatomX, vatomY);
   // PCast the RM, and widen it to 128 bits
   IRAtom* t2 = mkPCastTo(mce, Ity_V128, vRM);
   // Roll it into the result
   t1 = mkUifUV128(mce, t1, t2);
   return t1;
}

/* --- ... and ... 64Fx4 versions of the same --- */

static
IRAtom* binary64Fx4_w_rm ( MCEnv* mce, IRAtom* vRM,
                           IRAtom* vatomX, IRAtom* vatomY )
{
   IRAtom* t1 = binary64Fx4(mce, vatomX, vatomY);
   // PCast the RM, and widen it to 256 bits
   IRAtom* t2 = mkPCastTo(mce, Ity_V256, vRM);
   // Roll it into the result
   t1 = mkUifUV256(mce, t1, t2);
   return t1;
}

/* --- ... and ... 32Fx8 versions of the same --- */

static
IRAtom* binary32Fx8_w_rm ( MCEnv* mce, IRAtom* vRM,
                           IRAtom* vatomX, IRAtom* vatomY )
{
   IRAtom* t1 = binary32Fx8(mce, vatomX, vatomY);
   // PCast the RM, and widen it to 256 bits
   IRAtom* t2 = mkPCastTo(mce, Ity_V256, vRM);
   // Roll it into the result
   t1 = mkUifUV256(mce, t1, t2);
   return t1;
}

/* --- 64Fx2 unary FP ops, with rounding mode --- */

static
IRAtom* unary64Fx2_w_rm ( MCEnv* mce, IRAtom* vRM, IRAtom* vatomX )
{
   /* Same scheme as binary64Fx2_w_rm. */
   // "do" the vector arg
   IRAtom* t1 = unary64Fx2(mce, vatomX);
   // PCast the RM, and widen it to 128 bits
   IRAtom* t2 = mkPCastTo(mce, Ity_V128, vRM);
   // Roll it into the result
   t1 = mkUifUV128(mce, t1, t2);
   return t1;
}

/* --- ... and ... 32Fx4 versions of the same --- */

static
IRAtom* unary32Fx4_w_rm ( MCEnv* mce, IRAtom* vRM, IRAtom* vatomX )
{
   /* Same scheme as unary64Fx2_w_rm. */
   IRAtom* t1 = unary32Fx4(mce, vatomX);
   // PCast the RM, and widen it to 128 bits
   IRAtom* t2 = mkPCastTo(mce, Ity_V128, vRM);
   // Roll it into the result
   t1 = mkUifUV128(mce, t1, t2);
   return t1;
}


/* --- --- Vector saturated narrowing --- --- */

/* We used to do something very clever here, but on closer inspection
   (2011-Jun-15), and in particular bug #279698, it turns out to be
   wrong.  Part of the problem came from the fact that for a long
   time, the IR primops to do with saturated narrowing were
   underspecified and managed to confuse multiple cases which needed
   to be separate: the op names had a signedness qualifier, but in
   fact the source and destination signednesses needed to be specified
   independently, so the op names really need two independent
   signedness specifiers.

   As of 2011-Jun-15 (ish) the underspecification was sorted out
   properly.  The incorrect instrumentation remained, though.  That
   has now (2011-Oct-22) been fixed.

   What we now do is simple:

   Let the original narrowing op be QNarrowBinXtoYxZ, where Z is a
   number of lanes, X is the source lane width and signedness, and Y
   is the destination lane width and signedness.  In all cases the
   destination lane width is half the source lane width, so the names
   have a bit of redundancy, but are at least easy to read.

   For example, Iop_QNarrowBin32Sto16Ux8 narrows 8 lanes of signed 32s
   to unsigned 16s.

   Let Vanilla(OP) be a function that takes OP, one of these
   saturating narrowing ops, and produces the same "shaped" narrowing
   op which is not saturating, but merely dumps the most significant
   bits.  "Same shape" means that the lane numbers and widths are the
   same as with OP.

   For example, Vanilla(Iop_QNarrowBin32Sto16Ux8)
                  = Iop_NarrowBin32to16x8,
   that is, narrow 8 lanes of 32 bits to 8 lanes of 16 bits, by
   dumping the top half of each lane.

   So, with that in place, the scheme is simple, and it is simple to
   pessimise each lane individually and then apply Vanilla(OP) so as
   to get the result in the right "shape".  If the original OP is
   QNarrowBinXtoYxZ then we produce

   Vanilla(OP)( PCast-X-to-X-x-Z(vatom1), PCast-X-to-X-x-Z(vatom2) )

   or for the case when OP is unary (Iop_QNarrowUn*)

   Vanilla(OP)( PCast-X-to-X-x-Z(vatom) )
*/
sewardjb5a29232011-10-22 09:29:41 +00002501IROp vanillaNarrowingOpOfShape ( IROp qnarrowOp )
2502{
2503 switch (qnarrowOp) {
2504 /* Binary: (128, 128) -> 128 */
2505 case Iop_QNarrowBin16Sto8Ux16:
2506 case Iop_QNarrowBin16Sto8Sx16:
2507 case Iop_QNarrowBin16Uto8Ux16:
carll62770672013-10-01 15:50:09 +00002508 case Iop_QNarrowBin64Sto32Sx4:
2509 case Iop_QNarrowBin64Uto32Ux4:
sewardjb5a29232011-10-22 09:29:41 +00002510 return Iop_NarrowBin16to8x16;
2511 case Iop_QNarrowBin32Sto16Ux8:
2512 case Iop_QNarrowBin32Sto16Sx8:
2513 case Iop_QNarrowBin32Uto16Ux8:
2514 return Iop_NarrowBin32to16x8;
2515 /* Binary: (64, 64) -> 64 */
2516 case Iop_QNarrowBin32Sto16Sx4:
2517 return Iop_NarrowBin32to16x4;
2518 case Iop_QNarrowBin16Sto8Ux8:
2519 case Iop_QNarrowBin16Sto8Sx8:
2520 return Iop_NarrowBin16to8x8;
2521 /* Unary: 128 -> 64 */
2522 case Iop_QNarrowUn64Uto32Ux2:
2523 case Iop_QNarrowUn64Sto32Sx2:
2524 case Iop_QNarrowUn64Sto32Ux2:
2525 return Iop_NarrowUn64to32x2;
2526 case Iop_QNarrowUn32Uto16Ux4:
2527 case Iop_QNarrowUn32Sto16Sx4:
2528 case Iop_QNarrowUn32Sto16Ux4:
Elliott Hughesa0664b92017-04-18 17:46:52 -07002529 case Iop_F32toF16x4:
sewardjb5a29232011-10-22 09:29:41 +00002530 return Iop_NarrowUn32to16x4;
2531 case Iop_QNarrowUn16Uto8Ux8:
2532 case Iop_QNarrowUn16Sto8Sx8:
2533 case Iop_QNarrowUn16Sto8Ux8:
2534 return Iop_NarrowUn16to8x8;
2535 default:
2536 ppIROp(qnarrowOp);
2537 VG_(tool_panic)("vanillaNarrowOpOfShape");
2538 }
2539}
2540
2541static
sewardj7ee7d852011-06-16 11:37:21 +00002542IRAtom* vectorNarrowBinV128 ( MCEnv* mce, IROp narrow_op,
2543 IRAtom* vatom1, IRAtom* vatom2)
sewardja1d93302004-12-12 16:45:06 +00002544{
2545 IRAtom *at1, *at2, *at3;
2546 IRAtom* (*pcast)( MCEnv*, IRAtom* );
2547 switch (narrow_op) {
carll62770672013-10-01 15:50:09 +00002548 case Iop_QNarrowBin64Sto32Sx4: pcast = mkPCast32x4; break;
2549 case Iop_QNarrowBin64Uto32Ux4: pcast = mkPCast32x4; break;
sewardj7ee7d852011-06-16 11:37:21 +00002550 case Iop_QNarrowBin32Sto16Sx8: pcast = mkPCast32x4; break;
2551 case Iop_QNarrowBin32Uto16Ux8: pcast = mkPCast32x4; break;
2552 case Iop_QNarrowBin32Sto16Ux8: pcast = mkPCast32x4; break;
2553 case Iop_QNarrowBin16Sto8Sx16: pcast = mkPCast16x8; break;
2554 case Iop_QNarrowBin16Uto8Ux16: pcast = mkPCast16x8; break;
2555 case Iop_QNarrowBin16Sto8Ux16: pcast = mkPCast16x8; break;
2556 default: VG_(tool_panic)("vectorNarrowBinV128");
sewardja1d93302004-12-12 16:45:06 +00002557 }
sewardjb5a29232011-10-22 09:29:41 +00002558 IROp vanilla_narrow = vanillaNarrowingOpOfShape(narrow_op);
sewardja1d93302004-12-12 16:45:06 +00002559 tl_assert(isShadowAtom(mce,vatom1));
2560 tl_assert(isShadowAtom(mce,vatom2));
sewardj7cf4e6b2008-05-01 20:24:26 +00002561 at1 = assignNew('V', mce, Ity_V128, pcast(mce, vatom1));
2562 at2 = assignNew('V', mce, Ity_V128, pcast(mce, vatom2));
sewardjb5a29232011-10-22 09:29:41 +00002563 at3 = assignNew('V', mce, Ity_V128, binop(vanilla_narrow, at1, at2));
sewardja1d93302004-12-12 16:45:06 +00002564 return at3;
2565}
2566
sewardjacd2e912005-01-13 19:17:06 +00002567static
sewardj7ee7d852011-06-16 11:37:21 +00002568IRAtom* vectorNarrowBin64 ( MCEnv* mce, IROp narrow_op,
2569 IRAtom* vatom1, IRAtom* vatom2)
sewardjacd2e912005-01-13 19:17:06 +00002570{
2571 IRAtom *at1, *at2, *at3;
2572 IRAtom* (*pcast)( MCEnv*, IRAtom* );
2573 switch (narrow_op) {
sewardj7ee7d852011-06-16 11:37:21 +00002574 case Iop_QNarrowBin32Sto16Sx4: pcast = mkPCast32x2; break;
2575 case Iop_QNarrowBin16Sto8Sx8: pcast = mkPCast16x4; break;
2576 case Iop_QNarrowBin16Sto8Ux8: pcast = mkPCast16x4; break;
2577 default: VG_(tool_panic)("vectorNarrowBin64");
sewardjacd2e912005-01-13 19:17:06 +00002578 }
sewardjb5a29232011-10-22 09:29:41 +00002579 IROp vanilla_narrow = vanillaNarrowingOpOfShape(narrow_op);
sewardjacd2e912005-01-13 19:17:06 +00002580 tl_assert(isShadowAtom(mce,vatom1));
2581 tl_assert(isShadowAtom(mce,vatom2));
sewardj7cf4e6b2008-05-01 20:24:26 +00002582 at1 = assignNew('V', mce, Ity_I64, pcast(mce, vatom1));
2583 at2 = assignNew('V', mce, Ity_I64, pcast(mce, vatom2));
sewardjb5a29232011-10-22 09:29:41 +00002584 at3 = assignNew('V', mce, Ity_I64, binop(vanilla_narrow, at1, at2));
sewardjacd2e912005-01-13 19:17:06 +00002585 return at3;
2586}
2587
sewardj57f92b02010-08-22 11:54:14 +00002588static
sewardjb5a29232011-10-22 09:29:41 +00002589IRAtom* vectorNarrowUnV128 ( MCEnv* mce, IROp narrow_op,
sewardj7ee7d852011-06-16 11:37:21 +00002590 IRAtom* vatom1)
sewardj57f92b02010-08-22 11:54:14 +00002591{
2592 IRAtom *at1, *at2;
2593 IRAtom* (*pcast)( MCEnv*, IRAtom* );
sewardjb5a29232011-10-22 09:29:41 +00002594 tl_assert(isShadowAtom(mce,vatom1));
2595 /* For vanilla narrowing (non-saturating), we can just apply
2596 the op directly to the V bits. */
2597 switch (narrow_op) {
2598 case Iop_NarrowUn16to8x8:
2599 case Iop_NarrowUn32to16x4:
2600 case Iop_NarrowUn64to32x2:
Elliott Hughesa0664b92017-04-18 17:46:52 -07002601 case Iop_F32toF16x4:
sewardjb5a29232011-10-22 09:29:41 +00002602 at1 = assignNew('V', mce, Ity_I64, unop(narrow_op, vatom1));
2603 return at1;
2604 default:
2605 break; /* Do Plan B */
2606 }
2607 /* Plan B: for ops that involve a saturation operation on the args,
2608 we must PCast before the vanilla narrow. */
2609 switch (narrow_op) {
sewardj7ee7d852011-06-16 11:37:21 +00002610 case Iop_QNarrowUn16Sto8Sx8: pcast = mkPCast16x8; break;
2611 case Iop_QNarrowUn16Sto8Ux8: pcast = mkPCast16x8; break;
2612 case Iop_QNarrowUn16Uto8Ux8: pcast = mkPCast16x8; break;
2613 case Iop_QNarrowUn32Sto16Sx4: pcast = mkPCast32x4; break;
2614 case Iop_QNarrowUn32Sto16Ux4: pcast = mkPCast32x4; break;
2615 case Iop_QNarrowUn32Uto16Ux4: pcast = mkPCast32x4; break;
2616 case Iop_QNarrowUn64Sto32Sx2: pcast = mkPCast64x2; break;
2617 case Iop_QNarrowUn64Sto32Ux2: pcast = mkPCast64x2; break;
2618 case Iop_QNarrowUn64Uto32Ux2: pcast = mkPCast64x2; break;
2619 default: VG_(tool_panic)("vectorNarrowUnV128");
sewardj57f92b02010-08-22 11:54:14 +00002620 }
sewardjb5a29232011-10-22 09:29:41 +00002621 IROp vanilla_narrow = vanillaNarrowingOpOfShape(narrow_op);
sewardj57f92b02010-08-22 11:54:14 +00002622 at1 = assignNew('V', mce, Ity_V128, pcast(mce, vatom1));
sewardjb5a29232011-10-22 09:29:41 +00002623 at2 = assignNew('V', mce, Ity_I64, unop(vanilla_narrow, at1));
sewardj57f92b02010-08-22 11:54:14 +00002624 return at2;
2625}
2626
2627static
sewardj7ee7d852011-06-16 11:37:21 +00002628IRAtom* vectorWidenI64 ( MCEnv* mce, IROp longen_op,
2629 IRAtom* vatom1)
sewardj57f92b02010-08-22 11:54:14 +00002630{
2631 IRAtom *at1, *at2;
2632 IRAtom* (*pcast)( MCEnv*, IRAtom* );
2633 switch (longen_op) {
sewardj7ee7d852011-06-16 11:37:21 +00002634 case Iop_Widen8Uto16x8: pcast = mkPCast16x8; break;
2635 case Iop_Widen8Sto16x8: pcast = mkPCast16x8; break;
2636 case Iop_Widen16Uto32x4: pcast = mkPCast32x4; break;
2637 case Iop_Widen16Sto32x4: pcast = mkPCast32x4; break;
2638 case Iop_Widen32Uto64x2: pcast = mkPCast64x2; break;
2639 case Iop_Widen32Sto64x2: pcast = mkPCast64x2; break;
Elliott Hughesa0664b92017-04-18 17:46:52 -07002640 case Iop_F16toF32x4: pcast = mkPCast32x4; break;
sewardj7ee7d852011-06-16 11:37:21 +00002641 default: VG_(tool_panic)("vectorWidenI64");
sewardj57f92b02010-08-22 11:54:14 +00002642 }
2643 tl_assert(isShadowAtom(mce,vatom1));
2644 at1 = assignNew('V', mce, Ity_V128, unop(longen_op, vatom1));
2645 at2 = assignNew('V', mce, Ity_V128, pcast(mce, at1));
2646 return at2;
2647}
2648
sewardja1d93302004-12-12 16:45:06 +00002649
2650/* --- --- Vector integer arithmetic --- --- */
2651
2652/* Simple ... UifU the args and per-lane pessimise the results. */
sewardjacd2e912005-01-13 19:17:06 +00002653
sewardja2f30952013-03-27 11:40:02 +00002654/* --- V256-bit versions --- */
2655
2656static
2657IRAtom* binary8Ix32 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2658{
2659 IRAtom* at;
2660 at = mkUifUV256(mce, vatom1, vatom2);
2661 at = mkPCast8x32(mce, at);
2662 return at;
2663}
2664
2665static
2666IRAtom* binary16Ix16 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2667{
2668 IRAtom* at;
2669 at = mkUifUV256(mce, vatom1, vatom2);
2670 at = mkPCast16x16(mce, at);
2671 return at;
2672}
2673
2674static
2675IRAtom* binary32Ix8 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2676{
2677 IRAtom* at;
2678 at = mkUifUV256(mce, vatom1, vatom2);
2679 at = mkPCast32x8(mce, at);
2680 return at;
2681}
2682
2683static
2684IRAtom* binary64Ix4 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2685{
2686 IRAtom* at;
2687 at = mkUifUV256(mce, vatom1, vatom2);
2688 at = mkPCast64x4(mce, at);
2689 return at;
2690}
2691
sewardj20d38f22005-02-07 23:50:18 +00002692/* --- V128-bit versions --- */
sewardjacd2e912005-01-13 19:17:06 +00002693
sewardja1d93302004-12-12 16:45:06 +00002694static
2695IRAtom* binary8Ix16 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2696{
2697 IRAtom* at;
sewardj20d38f22005-02-07 23:50:18 +00002698 at = mkUifUV128(mce, vatom1, vatom2);
sewardja1d93302004-12-12 16:45:06 +00002699 at = mkPCast8x16(mce, at);
2700 return at;
2701}
2702
2703static
2704IRAtom* binary16Ix8 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2705{
2706 IRAtom* at;
sewardj20d38f22005-02-07 23:50:18 +00002707 at = mkUifUV128(mce, vatom1, vatom2);
sewardja1d93302004-12-12 16:45:06 +00002708 at = mkPCast16x8(mce, at);
2709 return at;
2710}
2711
2712static
2713IRAtom* binary32Ix4 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2714{
2715 IRAtom* at;
sewardj20d38f22005-02-07 23:50:18 +00002716 at = mkUifUV128(mce, vatom1, vatom2);
sewardja1d93302004-12-12 16:45:06 +00002717 at = mkPCast32x4(mce, at);
2718 return at;
2719}
2720
2721static
2722IRAtom* binary64Ix2 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2723{
2724 IRAtom* at;
sewardj20d38f22005-02-07 23:50:18 +00002725 at = mkUifUV128(mce, vatom1, vatom2);
sewardja1d93302004-12-12 16:45:06 +00002726 at = mkPCast64x2(mce, at);
2727 return at;
2728}
sewardj3245c912004-12-10 14:58:26 +00002729
sewardjacd2e912005-01-13 19:17:06 +00002730/* --- 64-bit versions --- */
2731
2732static
2733IRAtom* binary8Ix8 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2734{
2735 IRAtom* at;
2736 at = mkUifU64(mce, vatom1, vatom2);
2737 at = mkPCast8x8(mce, at);
2738 return at;
2739}
2740
2741static
2742IRAtom* binary16Ix4 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2743{
2744 IRAtom* at;
2745 at = mkUifU64(mce, vatom1, vatom2);
2746 at = mkPCast16x4(mce, at);
2747 return at;
2748}
2749
2750static
2751IRAtom* binary32Ix2 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2752{
2753 IRAtom* at;
2754 at = mkUifU64(mce, vatom1, vatom2);
2755 at = mkPCast32x2(mce, at);
2756 return at;
2757}
2758
sewardj57f92b02010-08-22 11:54:14 +00002759static
2760IRAtom* binary64Ix1 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2761{
2762 IRAtom* at;
2763 at = mkUifU64(mce, vatom1, vatom2);
2764 at = mkPCastTo(mce, Ity_I64, at);
2765 return at;
2766}
2767
sewardjc678b852010-09-22 00:58:51 +00002768/* --- 32-bit versions --- */
2769
2770static
2771IRAtom* binary8Ix4 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2772{
2773 IRAtom* at;
2774 at = mkUifU32(mce, vatom1, vatom2);
2775 at = mkPCast8x4(mce, at);
2776 return at;
2777}
2778
2779static
2780IRAtom* binary16Ix2 ( MCEnv* mce, IRAtom* vatom1, IRAtom* vatom2 )
2781{
2782 IRAtom* at;
2783 at = mkUifU32(mce, vatom1, vatom2);
2784 at = mkPCast16x2(mce, at);
2785 return at;
2786}
2787
sewardj3245c912004-12-10 14:58:26 +00002788
2789/*------------------------------------------------------------*/
sewardj95448072004-11-22 20:19:51 +00002790/*--- Generate shadow values from all kinds of IRExprs. ---*/
2791/*------------------------------------------------------------*/
2792
2793static
sewardje91cea72006-02-08 19:32:02 +00002794IRAtom* expr2vbits_Qop ( MCEnv* mce,
2795 IROp op,
2796 IRAtom* atom1, IRAtom* atom2,
2797 IRAtom* atom3, IRAtom* atom4 )
2798{
2799 IRAtom* vatom1 = expr2vbits( mce, atom1 );
2800 IRAtom* vatom2 = expr2vbits( mce, atom2 );
2801 IRAtom* vatom3 = expr2vbits( mce, atom3 );
2802 IRAtom* vatom4 = expr2vbits( mce, atom4 );
2803
2804 tl_assert(isOriginalAtom(mce,atom1));
2805 tl_assert(isOriginalAtom(mce,atom2));
2806 tl_assert(isOriginalAtom(mce,atom3));
2807 tl_assert(isOriginalAtom(mce,atom4));
2808 tl_assert(isShadowAtom(mce,vatom1));
2809 tl_assert(isShadowAtom(mce,vatom2));
2810 tl_assert(isShadowAtom(mce,vatom3));
2811 tl_assert(isShadowAtom(mce,vatom4));
2812 tl_assert(sameKindedAtoms(atom1,vatom1));
2813 tl_assert(sameKindedAtoms(atom2,vatom2));
2814 tl_assert(sameKindedAtoms(atom3,vatom3));
2815 tl_assert(sameKindedAtoms(atom4,vatom4));
2816 switch (op) {
2817 case Iop_MAddF64:
2818 case Iop_MAddF64r32:
2819 case Iop_MSubF64:
2820 case Iop_MSubF64r32:
2821 /* I32(rm) x F64 x F64 x F64 -> F64 */
2822 return mkLazy4(mce, Ity_I64, vatom1, vatom2, vatom3, vatom4);
sewardjb5b87402011-03-07 16:05:35 +00002823
2824 case Iop_MAddF32:
2825 case Iop_MSubF32:
2826 /* I32(rm) x F32 x F32 x F32 -> F32 */
2827 return mkLazy4(mce, Ity_I32, vatom1, vatom2, vatom3, vatom4);
2828
Elliott Hughesa0664b92017-04-18 17:46:52 -07002829 case Iop_MAddF128:
2830 case Iop_MSubF128:
2831 case Iop_NegMAddF128:
2832 case Iop_NegMSubF128:
2833 /* I32(rm) x F128 x F128 x F128 -> F128 */
2834 return mkLazy4(mce, Ity_I128, vatom1, vatom2, vatom3, vatom4);
2835
sewardj350e8f72012-06-25 07:52:15 +00002836 /* V256-bit data-steering */
2837 case Iop_64x4toV256:
2838 return assignNew('V', mce, Ity_V256,
2839 IRExpr_Qop(op, vatom1, vatom2, vatom3, vatom4));
2840
sewardje91cea72006-02-08 19:32:02 +00002841 default:
2842 ppIROp(op);
2843 VG_(tool_panic)("memcheck:expr2vbits_Qop");
2844 }
2845}
2846
2847
2848static
sewardjed69fdb2006-02-03 16:12:27 +00002849IRAtom* expr2vbits_Triop ( MCEnv* mce,
2850 IROp op,
2851 IRAtom* atom1, IRAtom* atom2, IRAtom* atom3 )
2852{
sewardjed69fdb2006-02-03 16:12:27 +00002853 IRAtom* vatom1 = expr2vbits( mce, atom1 );
2854 IRAtom* vatom2 = expr2vbits( mce, atom2 );
2855 IRAtom* vatom3 = expr2vbits( mce, atom3 );
2856
2857 tl_assert(isOriginalAtom(mce,atom1));
2858 tl_assert(isOriginalAtom(mce,atom2));
2859 tl_assert(isOriginalAtom(mce,atom3));
2860 tl_assert(isShadowAtom(mce,vatom1));
2861 tl_assert(isShadowAtom(mce,vatom2));
2862 tl_assert(isShadowAtom(mce,vatom3));
2863 tl_assert(sameKindedAtoms(atom1,vatom1));
2864 tl_assert(sameKindedAtoms(atom2,vatom2));
2865 tl_assert(sameKindedAtoms(atom3,vatom3));
2866 switch (op) {
sewardjb5b87402011-03-07 16:05:35 +00002867 case Iop_AddF128:
2868 case Iop_SubF128:
2869 case Iop_MulF128:
2870 case Iop_DivF128:
Elliott Hughesa0664b92017-04-18 17:46:52 -07002871 case Iop_AddD128:
2872 case Iop_SubD128:
2873 case Iop_MulD128:
sewardjb0ccb4d2012-04-02 10:22:05 +00002874 case Iop_DivD128:
sewardj18c72fa2012-04-23 11:22:05 +00002875 case Iop_QuantizeD128:
2876 /* I32(rm) x F128/D128 x F128/D128 -> F128/D128 */
sewardjb5b87402011-03-07 16:05:35 +00002877 return mkLazy3(mce, Ity_I128, vatom1, vatom2, vatom3);
sewardjed69fdb2006-02-03 16:12:27 +00002878 case Iop_AddF64:
sewardjb0ccb4d2012-04-02 10:22:05 +00002879 case Iop_AddD64:
sewardjed69fdb2006-02-03 16:12:27 +00002880 case Iop_AddF64r32:
2881 case Iop_SubF64:
sewardjb0ccb4d2012-04-02 10:22:05 +00002882 case Iop_SubD64:
sewardjed69fdb2006-02-03 16:12:27 +00002883 case Iop_SubF64r32:
2884 case Iop_MulF64:
sewardjb0ccb4d2012-04-02 10:22:05 +00002885 case Iop_MulD64:
sewardjed69fdb2006-02-03 16:12:27 +00002886 case Iop_MulF64r32:
2887 case Iop_DivF64:
sewardjb0ccb4d2012-04-02 10:22:05 +00002888 case Iop_DivD64:
sewardjed69fdb2006-02-03 16:12:27 +00002889 case Iop_DivF64r32:
sewardj22ac5f42006-02-03 22:55:04 +00002890 case Iop_ScaleF64:
2891 case Iop_Yl2xF64:
2892 case Iop_Yl2xp1F64:
2893 case Iop_AtanF64:
sewardjd6075eb2006-02-04 15:25:23 +00002894 case Iop_PRemF64:
2895 case Iop_PRem1F64:
sewardj18c72fa2012-04-23 11:22:05 +00002896 case Iop_QuantizeD64:
2897 /* I32(rm) x F64/D64 x F64/D64 -> F64/D64 */
sewardjed69fdb2006-02-03 16:12:27 +00002898 return mkLazy3(mce, Ity_I64, vatom1, vatom2, vatom3);
sewardjd6075eb2006-02-04 15:25:23 +00002899 case Iop_PRemC3210F64:
2900 case Iop_PRem1C3210F64:
2901 /* I32(rm) x F64 x F64 -> I32 */
2902 return mkLazy3(mce, Ity_I32, vatom1, vatom2, vatom3);
sewardj59570ff2010-01-01 11:59:33 +00002903 case Iop_AddF32:
2904 case Iop_SubF32:
2905 case Iop_MulF32:
2906 case Iop_DivF32:
2907 /* I32(rm) x F32 x F32 -> I32 */
2908 return mkLazy3(mce, Ity_I32, vatom1, vatom2, vatom3);
sewardj18c72fa2012-04-23 11:22:05 +00002909 case Iop_SignificanceRoundD64:
florian733b4db2013-06-06 19:13:29 +00002910 /* IRRoundingMode(I32) x I8 x D64 -> D64 */
sewardj18c72fa2012-04-23 11:22:05 +00002911 return mkLazy3(mce, Ity_I64, vatom1, vatom2, vatom3);
2912 case Iop_SignificanceRoundD128:
florian733b4db2013-06-06 19:13:29 +00002913 /* IRRoundingMode(I32) x I8 x D128 -> D128 */
sewardj18c72fa2012-04-23 11:22:05 +00002914 return mkLazy3(mce, Ity_I128, vatom1, vatom2, vatom3);
sewardj7b7b1cb2014-09-01 11:34:32 +00002915 case Iop_SliceV128:
2916 /* (V128, V128, I8) -> V128 */
sewardjb9e6d242013-05-11 13:42:08 +00002917 complainIfUndefined(mce, atom3, NULL);
sewardj57f92b02010-08-22 11:54:14 +00002918 return assignNew('V', mce, Ity_V128, triop(op, vatom1, vatom2, atom3));
sewardj7b7b1cb2014-09-01 11:34:32 +00002919 case Iop_Slice64:
2920 /* (I64, I64, I8) -> I64 */
sewardjb9e6d242013-05-11 13:42:08 +00002921 complainIfUndefined(mce, atom3, NULL);
sewardj57f92b02010-08-22 11:54:14 +00002922 return assignNew('V', mce, Ity_I64, triop(op, vatom1, vatom2, atom3));
2923 case Iop_SetElem8x8:
2924 case Iop_SetElem16x4:
2925 case Iop_SetElem32x2:
sewardjb9e6d242013-05-11 13:42:08 +00002926 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00002927 return assignNew('V', mce, Ity_I64, triop(op, vatom1, atom2, vatom3));
carll24e40de2013-10-15 18:13:21 +00002928
sewardj1eb272f2014-01-26 18:36:52 +00002929 /* Vector FP with rounding mode as the first arg */
2930 case Iop_Add64Fx2:
2931 case Iop_Sub64Fx2:
2932 case Iop_Mul64Fx2:
2933 case Iop_Div64Fx2:
2934 return binary64Fx2_w_rm(mce, vatom1, vatom2, vatom3);
2935
2936 case Iop_Add32Fx4:
2937 case Iop_Sub32Fx4:
2938 case Iop_Mul32Fx4:
2939 case Iop_Div32Fx4:
2940 return binary32Fx4_w_rm(mce, vatom1, vatom2, vatom3);
2941
2942 case Iop_Add64Fx4:
2943 case Iop_Sub64Fx4:
2944 case Iop_Mul64Fx4:
2945 case Iop_Div64Fx4:
2946 return binary64Fx4_w_rm(mce, vatom1, vatom2, vatom3);
2947
2948 case Iop_Add32Fx8:
2949 case Iop_Sub32Fx8:
2950 case Iop_Mul32Fx8:
2951 case Iop_Div32Fx8:
2952 return binary32Fx8_w_rm(mce, vatom1, vatom2, vatom3);
2953
sewardjed69fdb2006-02-03 16:12:27 +00002954 default:
2955 ppIROp(op);
2956 VG_(tool_panic)("memcheck:expr2vbits_Triop");
2957 }
2958}
2959
2960
2961static
sewardj95448072004-11-22 20:19:51 +00002962IRAtom* expr2vbits_Binop ( MCEnv* mce,
2963 IROp op,
2964 IRAtom* atom1, IRAtom* atom2 )
2965{
2966 IRType and_or_ty;
2967 IRAtom* (*uifu) (MCEnv*, IRAtom*, IRAtom*);
2968 IRAtom* (*difd) (MCEnv*, IRAtom*, IRAtom*);
2969 IRAtom* (*improve) (MCEnv*, IRAtom*, IRAtom*);
2970
2971 IRAtom* vatom1 = expr2vbits( mce, atom1 );
2972 IRAtom* vatom2 = expr2vbits( mce, atom2 );
2973
2974 tl_assert(isOriginalAtom(mce,atom1));
2975 tl_assert(isOriginalAtom(mce,atom2));
2976 tl_assert(isShadowAtom(mce,vatom1));
2977 tl_assert(isShadowAtom(mce,vatom2));
2978 tl_assert(sameKindedAtoms(atom1,vatom1));
2979 tl_assert(sameKindedAtoms(atom2,vatom2));
2980 switch (op) {
2981
sewardjc678b852010-09-22 00:58:51 +00002982 /* 32-bit SIMD */
2983
2984 case Iop_Add16x2:
2985 case Iop_HAdd16Ux2:
2986 case Iop_HAdd16Sx2:
2987 case Iop_Sub16x2:
2988 case Iop_HSub16Ux2:
2989 case Iop_HSub16Sx2:
2990 case Iop_QAdd16Sx2:
2991 case Iop_QSub16Sx2:
sewardj9fb31092012-09-17 15:28:46 +00002992 case Iop_QSub16Ux2:
sewardj7a370652013-07-04 20:37:33 +00002993 case Iop_QAdd16Ux2:
sewardjc678b852010-09-22 00:58:51 +00002994 return binary16Ix2(mce, vatom1, vatom2);
2995
2996 case Iop_Add8x4:
2997 case Iop_HAdd8Ux4:
2998 case Iop_HAdd8Sx4:
2999 case Iop_Sub8x4:
3000 case Iop_HSub8Ux4:
3001 case Iop_HSub8Sx4:
3002 case Iop_QSub8Ux4:
3003 case Iop_QAdd8Ux4:
3004 case Iop_QSub8Sx4:
3005 case Iop_QAdd8Sx4:
3006 return binary8Ix4(mce, vatom1, vatom2);
3007
sewardjacd2e912005-01-13 19:17:06 +00003008 /* 64-bit SIMD */
3009
sewardj57f92b02010-08-22 11:54:14 +00003010 case Iop_ShrN8x8:
sewardjacd2e912005-01-13 19:17:06 +00003011 case Iop_ShrN16x4:
3012 case Iop_ShrN32x2:
sewardj03809ae2006-12-27 01:16:58 +00003013 case Iop_SarN8x8:
sewardjacd2e912005-01-13 19:17:06 +00003014 case Iop_SarN16x4:
3015 case Iop_SarN32x2:
3016 case Iop_ShlN16x4:
3017 case Iop_ShlN32x2:
sewardj114a9172008-02-09 01:49:32 +00003018 case Iop_ShlN8x8:
sewardjacd2e912005-01-13 19:17:06 +00003019 /* Same scheme as with all other shifts. */
sewardjb9e6d242013-05-11 13:42:08 +00003020 complainIfUndefined(mce, atom2, NULL);
sewardj7cf4e6b2008-05-01 20:24:26 +00003021 return assignNew('V', mce, Ity_I64, binop(op, vatom1, atom2));
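The "same scheme as with all other shifts" referred to above is: complain if the shift amount is at all undefined, then apply the very same shift to the operand's shadow, so shifted-in zero bits count as defined. A minimal model of that vbits rule (illustrative only, not Valgrind code; 1-bits mean undefined):

```c
#include <stdint.h>

/* Shift vbit propagation once the shift amount is known to be fully
   defined: the shadow is shifted exactly like the data. */
static uint64_t shl_vbits64(uint64_t vbits, unsigned amt)
{
   return vbits << amt;  /* defined (0) bits enter at the bottom */
}

static uint64_t shr_vbits64(uint64_t vbits, unsigned amt)
{
   return vbits >> amt;  /* defined (0) bits enter at the top */
}
```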
sewardjacd2e912005-01-13 19:17:06 +00003022
sewardj7ee7d852011-06-16 11:37:21 +00003023 case Iop_QNarrowBin32Sto16Sx4:
3024 case Iop_QNarrowBin16Sto8Sx8:
3025 case Iop_QNarrowBin16Sto8Ux8:
3026 return vectorNarrowBin64(mce, op, vatom1, vatom2);
sewardjacd2e912005-01-13 19:17:06 +00003027
3028 case Iop_Min8Ux8:
sewardj57f92b02010-08-22 11:54:14 +00003029 case Iop_Min8Sx8:
sewardjacd2e912005-01-13 19:17:06 +00003030 case Iop_Max8Ux8:
sewardj57f92b02010-08-22 11:54:14 +00003031 case Iop_Max8Sx8:
sewardjacd2e912005-01-13 19:17:06 +00003032 case Iop_Avg8Ux8:
3033 case Iop_QSub8Sx8:
3034 case Iop_QSub8Ux8:
3035 case Iop_Sub8x8:
3036 case Iop_CmpGT8Sx8:
sewardj57f92b02010-08-22 11:54:14 +00003037 case Iop_CmpGT8Ux8:
sewardjacd2e912005-01-13 19:17:06 +00003038 case Iop_CmpEQ8x8:
3039 case Iop_QAdd8Sx8:
3040 case Iop_QAdd8Ux8:
sewardj57f92b02010-08-22 11:54:14 +00003041 case Iop_QSal8x8:
3042 case Iop_QShl8x8:
sewardjacd2e912005-01-13 19:17:06 +00003043 case Iop_Add8x8:
sewardj57f92b02010-08-22 11:54:14 +00003044 case Iop_Mul8x8:
3045 case Iop_PolynomialMul8x8:
sewardjacd2e912005-01-13 19:17:06 +00003046 return binary8Ix8(mce, vatom1, vatom2);
3047
3048 case Iop_Min16Sx4:
sewardj57f92b02010-08-22 11:54:14 +00003049 case Iop_Min16Ux4:
sewardjacd2e912005-01-13 19:17:06 +00003050 case Iop_Max16Sx4:
sewardj57f92b02010-08-22 11:54:14 +00003051 case Iop_Max16Ux4:
sewardjacd2e912005-01-13 19:17:06 +00003052 case Iop_Avg16Ux4:
3053 case Iop_QSub16Ux4:
3054 case Iop_QSub16Sx4:
3055 case Iop_Sub16x4:
3056 case Iop_Mul16x4:
3057 case Iop_MulHi16Sx4:
3058 case Iop_MulHi16Ux4:
3059 case Iop_CmpGT16Sx4:
sewardj57f92b02010-08-22 11:54:14 +00003060 case Iop_CmpGT16Ux4:
sewardjacd2e912005-01-13 19:17:06 +00003061 case Iop_CmpEQ16x4:
3062 case Iop_QAdd16Sx4:
3063 case Iop_QAdd16Ux4:
sewardj57f92b02010-08-22 11:54:14 +00003064 case Iop_QSal16x4:
3065 case Iop_QShl16x4:
sewardjacd2e912005-01-13 19:17:06 +00003066 case Iop_Add16x4:
sewardj57f92b02010-08-22 11:54:14 +00003067 case Iop_QDMulHi16Sx4:
3068 case Iop_QRDMulHi16Sx4:
sewardjacd2e912005-01-13 19:17:06 +00003069 return binary16Ix4(mce, vatom1, vatom2);
3070
3071 case Iop_Sub32x2:
sewardj114a9172008-02-09 01:49:32 +00003072 case Iop_Mul32x2:
sewardj57f92b02010-08-22 11:54:14 +00003073 case Iop_Max32Sx2:
3074 case Iop_Max32Ux2:
3075 case Iop_Min32Sx2:
3076 case Iop_Min32Ux2:
sewardjacd2e912005-01-13 19:17:06 +00003077 case Iop_CmpGT32Sx2:
sewardj57f92b02010-08-22 11:54:14 +00003078 case Iop_CmpGT32Ux2:
sewardjacd2e912005-01-13 19:17:06 +00003079 case Iop_CmpEQ32x2:
3080 case Iop_Add32x2:
sewardj57f92b02010-08-22 11:54:14 +00003081 case Iop_QAdd32Ux2:
3082 case Iop_QAdd32Sx2:
3083 case Iop_QSub32Ux2:
3084 case Iop_QSub32Sx2:
3085 case Iop_QSal32x2:
3086 case Iop_QShl32x2:
3087 case Iop_QDMulHi32Sx2:
3088 case Iop_QRDMulHi32Sx2:
sewardjacd2e912005-01-13 19:17:06 +00003089 return binary32Ix2(mce, vatom1, vatom2);
3090
sewardj57f92b02010-08-22 11:54:14 +00003091 case Iop_QSub64Ux1:
3092 case Iop_QSub64Sx1:
3093 case Iop_QAdd64Ux1:
3094 case Iop_QAdd64Sx1:
3095 case Iop_QSal64x1:
3096 case Iop_QShl64x1:
3097 case Iop_Sal64x1:
3098 return binary64Ix1(mce, vatom1, vatom2);
3099
sewardje541e222014-08-15 09:12:28 +00003100 case Iop_QShlNsatSU8x8:
3101 case Iop_QShlNsatUU8x8:
3102 case Iop_QShlNsatSS8x8:
sewardjb9e6d242013-05-11 13:42:08 +00003103 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003104 return mkPCast8x8(mce, vatom1);
3105
sewardje541e222014-08-15 09:12:28 +00003106 case Iop_QShlNsatSU16x4:
3107 case Iop_QShlNsatUU16x4:
3108 case Iop_QShlNsatSS16x4:
sewardjb9e6d242013-05-11 13:42:08 +00003109 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003110 return mkPCast16x4(mce, vatom1);
3111
sewardje541e222014-08-15 09:12:28 +00003112 case Iop_QShlNsatSU32x2:
3113 case Iop_QShlNsatUU32x2:
3114 case Iop_QShlNsatSS32x2:
sewardjb9e6d242013-05-11 13:42:08 +00003115 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003116 return mkPCast32x2(mce, vatom1);
3117
sewardje541e222014-08-15 09:12:28 +00003118 case Iop_QShlNsatSU64x1:
3119 case Iop_QShlNsatUU64x1:
3120 case Iop_QShlNsatSS64x1:
sewardjb9e6d242013-05-11 13:42:08 +00003121 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003122         return mkPCastTo(mce, Ity_I64, vatom1); /* pessimise the whole 64-bit lane */
3123
3124 case Iop_PwMax32Sx2:
3125 case Iop_PwMax32Ux2:
3126 case Iop_PwMin32Sx2:
3127 case Iop_PwMin32Ux2:
3128 case Iop_PwMax32Fx2:
3129 case Iop_PwMin32Fx2:
sewardj350e8f72012-06-25 07:52:15 +00003130 return assignNew('V', mce, Ity_I64,
3131 binop(Iop_PwMax32Ux2,
3132 mkPCast32x2(mce, vatom1),
3133 mkPCast32x2(mce, vatom2)));
sewardj57f92b02010-08-22 11:54:14 +00003134
3135 case Iop_PwMax16Sx4:
3136 case Iop_PwMax16Ux4:
3137 case Iop_PwMin16Sx4:
3138 case Iop_PwMin16Ux4:
sewardj350e8f72012-06-25 07:52:15 +00003139 return assignNew('V', mce, Ity_I64,
3140 binop(Iop_PwMax16Ux4,
3141 mkPCast16x4(mce, vatom1),
3142 mkPCast16x4(mce, vatom2)));
sewardj57f92b02010-08-22 11:54:14 +00003143
3144 case Iop_PwMax8Sx8:
3145 case Iop_PwMax8Ux8:
3146 case Iop_PwMin8Sx8:
3147 case Iop_PwMin8Ux8:
sewardj350e8f72012-06-25 07:52:15 +00003148 return assignNew('V', mce, Ity_I64,
3149 binop(Iop_PwMax8Ux8,
3150 mkPCast8x8(mce, vatom1),
3151 mkPCast8x8(mce, vatom2)));
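The pairwise min/max handling above PCasts each lane of both shadow operands and then runs a genuine PwMax over the PCast'd vbits: every PCast'd lane is all-0s or all-1s, and all-1s (undefined) always wins an unsigned max, so undefinedness in either contributing lane survives into the result. A rough model at 8x8 (illustrative only, not Valgrind code; the real op pairs adjacent lanes rather than corresponding ones — the per-lane max here just shows why max is a sound combiner for PCast'd vbits):

```c
#include <stdint.h>

/* PCast each 8-bit lane: any undefinedness (nonzero vbits) makes the
   lane all-1s; fully defined lanes stay all-0s. */
static uint64_t pcast8x8(uint64_t v)
{
   uint64_t r = 0;
   for (int i = 0; i < 8; i++)
      if ((v >> (8*i)) & 0xFF)
         r |= (uint64_t)0xFF << (8*i);
   return r;
}

/* Per-lane unsigned max; on PCast'd inputs this behaves like a
   per-lane OR of undefinedness. */
static uint64_t max8x8(uint64_t a, uint64_t b)
{
   uint64_t r = 0;
   for (int i = 0; i < 8; i++) {
      uint64_t la = (a >> (8*i)) & 0xFF, lb = (b >> (8*i)) & 0xFF;
      r |= (la > lb ? la : lb) << (8*i);
   }
   return r;
}
```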
sewardj57f92b02010-08-22 11:54:14 +00003152
3153 case Iop_PwAdd32x2:
3154 case Iop_PwAdd32Fx2:
3155 return mkPCast32x2(mce,
sewardj350e8f72012-06-25 07:52:15 +00003156 assignNew('V', mce, Ity_I64,
3157 binop(Iop_PwAdd32x2,
3158 mkPCast32x2(mce, vatom1),
3159 mkPCast32x2(mce, vatom2))));
sewardj57f92b02010-08-22 11:54:14 +00003160
3161 case Iop_PwAdd16x4:
3162 return mkPCast16x4(mce,
sewardj350e8f72012-06-25 07:52:15 +00003163 assignNew('V', mce, Ity_I64,
3164 binop(op, mkPCast16x4(mce, vatom1),
3165 mkPCast16x4(mce, vatom2))));
sewardj57f92b02010-08-22 11:54:14 +00003166
3167 case Iop_PwAdd8x8:
3168 return mkPCast8x8(mce,
sewardj350e8f72012-06-25 07:52:15 +00003169 assignNew('V', mce, Ity_I64,
3170 binop(op, mkPCast8x8(mce, vatom1),
3171 mkPCast8x8(mce, vatom2))));
sewardj57f92b02010-08-22 11:54:14 +00003172
3173 case Iop_Shl8x8:
3174 case Iop_Shr8x8:
3175 case Iop_Sar8x8:
3176 case Iop_Sal8x8:
3177 return mkUifU64(mce,
3178 assignNew('V', mce, Ity_I64, binop(op, vatom1, atom2)),
3179 mkPCast8x8(mce,vatom2)
3180 );
3181
3182 case Iop_Shl16x4:
3183 case Iop_Shr16x4:
3184 case Iop_Sar16x4:
3185 case Iop_Sal16x4:
3186 return mkUifU64(mce,
3187 assignNew('V', mce, Ity_I64, binop(op, vatom1, atom2)),
3188 mkPCast16x4(mce,vatom2)
3189 );
3190
3191 case Iop_Shl32x2:
3192 case Iop_Shr32x2:
3193 case Iop_Sar32x2:
3194 case Iop_Sal32x2:
3195 return mkUifU64(mce,
3196 assignNew('V', mce, Ity_I64, binop(op, vatom1, atom2)),
3197 mkPCast32x2(mce,vatom2)
3198 );
3199
sewardjacd2e912005-01-13 19:17:06 +00003200 /* 64-bit data-steering */
3201 case Iop_InterleaveLO32x2:
3202 case Iop_InterleaveLO16x4:
3203 case Iop_InterleaveLO8x8:
3204 case Iop_InterleaveHI32x2:
3205 case Iop_InterleaveHI16x4:
3206 case Iop_InterleaveHI8x8:
sewardj57f92b02010-08-22 11:54:14 +00003207 case Iop_CatOddLanes8x8:
3208 case Iop_CatEvenLanes8x8:
sewardj114a9172008-02-09 01:49:32 +00003209 case Iop_CatOddLanes16x4:
3210 case Iop_CatEvenLanes16x4:
sewardj57f92b02010-08-22 11:54:14 +00003211 case Iop_InterleaveOddLanes8x8:
3212 case Iop_InterleaveEvenLanes8x8:
3213 case Iop_InterleaveOddLanes16x4:
3214 case Iop_InterleaveEvenLanes16x4:
sewardj7cf4e6b2008-05-01 20:24:26 +00003215 return assignNew('V', mce, Ity_I64, binop(op, vatom1, vatom2));
sewardjacd2e912005-01-13 19:17:06 +00003216
sewardj57f92b02010-08-22 11:54:14 +00003217 case Iop_GetElem8x8:
sewardjb9e6d242013-05-11 13:42:08 +00003218 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003219 return assignNew('V', mce, Ity_I8, binop(op, vatom1, atom2));
3220 case Iop_GetElem16x4:
sewardjb9e6d242013-05-11 13:42:08 +00003221 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003222 return assignNew('V', mce, Ity_I16, binop(op, vatom1, atom2));
3223 case Iop_GetElem32x2:
sewardjb9e6d242013-05-11 13:42:08 +00003224 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003225 return assignNew('V', mce, Ity_I32, binop(op, vatom1, atom2));
3226
sewardj114a9172008-02-09 01:49:32 +00003227 /* Perm8x8: rearrange values in left arg using steering values
3228 from right arg. So rearrange the vbits in the same way but
3229 pessimise wrt steering values. */
3230 case Iop_Perm8x8:
3231 return mkUifU64(
3232 mce,
sewardj7cf4e6b2008-05-01 20:24:26 +00003233 assignNew('V', mce, Ity_I64, binop(op, vatom1, atom2)),
sewardj114a9172008-02-09 01:49:32 +00003234 mkPCast8x8(mce, vatom2)
3235 );
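The Perm8x8 rule above combines two basic vbits operators: permute the data's shadow with the concrete steering bytes, then UifU (bitwise OR in the 1-means-undefined encoding) with the PCast of the steering value's shadow. A minimal model of the two pieces (illustrative only, not Valgrind code; the low-3-bit lane select is a simplifying assumption):

```c
#include <stdint.h>

/* UifU: a result bit is undefined if it is undefined in either
   contribution -- with 1 meaning undefined, that is OR. */
static uint64_t uifu64(uint64_t va, uint64_t vb)
{
   return va | vb;
}

/* Perm8x8 applied to vbits: lane i of the result takes the vbits of
   the data lane selected by steering byte i (low 3 bits here). */
static uint64_t perm8x8(uint64_t data, uint64_t sel)
{
   uint64_t r = 0;
   for (int i = 0; i < 8; i++) {
      unsigned idx = (unsigned)((sel >> (8*i)) & 7);
      r |= ((data >> (8*idx)) & 0xFF) << (8*i);
   }
   return r;
}
```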
3236
sewardj20d38f22005-02-07 23:50:18 +00003237 /* V128-bit SIMD */
sewardj0b070592004-12-10 21:44:22 +00003238
sewardj7222f642015-04-07 09:08:42 +00003239 case Iop_Sqrt32Fx4:
3240 return unary32Fx4_w_rm(mce, vatom1, vatom2);
3241 case Iop_Sqrt64Fx2:
3242 return unary64Fx2_w_rm(mce, vatom1, vatom2);
3243
sewardj57f92b02010-08-22 11:54:14 +00003244 case Iop_ShrN8x16:
sewardja1d93302004-12-12 16:45:06 +00003245 case Iop_ShrN16x8:
3246 case Iop_ShrN32x4:
3247 case Iop_ShrN64x2:
sewardj57f92b02010-08-22 11:54:14 +00003248 case Iop_SarN8x16:
sewardja1d93302004-12-12 16:45:06 +00003249 case Iop_SarN16x8:
3250 case Iop_SarN32x4:
sewardj57f92b02010-08-22 11:54:14 +00003251 case Iop_SarN64x2:
3252 case Iop_ShlN8x16:
sewardja1d93302004-12-12 16:45:06 +00003253 case Iop_ShlN16x8:
3254 case Iop_ShlN32x4:
3255 case Iop_ShlN64x2:
sewardj620eb5b2005-10-22 12:50:43 +00003256 /* Same scheme as with all other shifts. Note: 22 Oct 05:
3257 this is wrong now, scalar shifts are done properly lazily.
3258 Vector shifts should be fixed too. */
sewardjb9e6d242013-05-11 13:42:08 +00003259 complainIfUndefined(mce, atom2, NULL);
sewardj7cf4e6b2008-05-01 20:24:26 +00003260 return assignNew('V', mce, Ity_V128, binop(op, vatom1, atom2));
sewardja1d93302004-12-12 16:45:06 +00003261
sewardjcbf8be72005-11-10 18:34:41 +00003262   /* V x V shifts/rotates: shift the shadow by the original amounts, then UifU in the per-lane PCast of the shift amount's vbits. */
sewardjbfd03f82014-08-26 18:35:13 +00003263 /* For the non-rounding variants of bi-di vector x vector
3264 shifts (the Iop_Sh.. ops, that is) we use the lazy scheme.
3265 But note that this is overly pessimistic, because in fact only
3266 the bottom 8 bits of each lane of the second argument are taken
3267 into account when shifting. So really we ought to ignore
3268 undefinedness in bits 8 and above of each lane in the
3269 second argument. */
sewardj43d60752005-11-10 18:13:01 +00003270 case Iop_Shl8x16:
3271 case Iop_Shr8x16:
3272 case Iop_Sar8x16:
sewardj57f92b02010-08-22 11:54:14 +00003273 case Iop_Sal8x16:
sewardjcbf8be72005-11-10 18:34:41 +00003274 case Iop_Rol8x16:
sewardjbfd03f82014-08-26 18:35:13 +00003275 case Iop_Sh8Sx16:
3276 case Iop_Sh8Ux16:
sewardj43d60752005-11-10 18:13:01 +00003277 return mkUifUV128(mce,
sewardj7cf4e6b2008-05-01 20:24:26 +00003278 assignNew('V', mce, Ity_V128, binop(op, vatom1, atom2)),
sewardj43d60752005-11-10 18:13:01 +00003279 mkPCast8x16(mce,vatom2)
3280 );
3281
3282 case Iop_Shl16x8:
3283 case Iop_Shr16x8:
3284 case Iop_Sar16x8:
sewardj57f92b02010-08-22 11:54:14 +00003285 case Iop_Sal16x8:
sewardjcbf8be72005-11-10 18:34:41 +00003286 case Iop_Rol16x8:
sewardjbfd03f82014-08-26 18:35:13 +00003287 case Iop_Sh16Sx8:
3288 case Iop_Sh16Ux8:
sewardj43d60752005-11-10 18:13:01 +00003289 return mkUifUV128(mce,
sewardj7cf4e6b2008-05-01 20:24:26 +00003290 assignNew('V', mce, Ity_V128, binop(op, vatom1, atom2)),
sewardj43d60752005-11-10 18:13:01 +00003291 mkPCast16x8(mce,vatom2)
3292 );
3293
3294 case Iop_Shl32x4:
3295 case Iop_Shr32x4:
3296 case Iop_Sar32x4:
sewardj57f92b02010-08-22 11:54:14 +00003297 case Iop_Sal32x4:
sewardjcbf8be72005-11-10 18:34:41 +00003298 case Iop_Rol32x4:
sewardjbfd03f82014-08-26 18:35:13 +00003299 case Iop_Sh32Sx4:
3300 case Iop_Sh32Ux4:
sewardj43d60752005-11-10 18:13:01 +00003301 return mkUifUV128(mce,
sewardj7cf4e6b2008-05-01 20:24:26 +00003302 assignNew('V', mce, Ity_V128, binop(op, vatom1, atom2)),
sewardj43d60752005-11-10 18:13:01 +00003303 mkPCast32x4(mce,vatom2)
3304 );
3305
sewardj57f92b02010-08-22 11:54:14 +00003306 case Iop_Shl64x2:
3307 case Iop_Shr64x2:
3308 case Iop_Sar64x2:
3309 case Iop_Sal64x2:
sewardj147865c2014-08-26 17:30:07 +00003310 case Iop_Rol64x2:
sewardjbfd03f82014-08-26 18:35:13 +00003311 case Iop_Sh64Sx2:
3312 case Iop_Sh64Ux2:
sewardj57f92b02010-08-22 11:54:14 +00003313 return mkUifUV128(mce,
3314 assignNew('V', mce, Ity_V128, binop(op, vatom1, atom2)),
3315 mkPCast64x2(mce,vatom2)
3316 );
3317
sewardjbfd03f82014-08-26 18:35:13 +00003318 /* For the rounding variants of bi-di vector x vector shifts, the
3319 rounding adjustment can cause undefinedness to propagate through
3320 the entire lane, in the worst case. Too complex to handle
3321 properly .. just UifU the arguments and then PCast them.
3322 Suboptimal but safe. */
3323 case Iop_Rsh8Sx16:
3324 case Iop_Rsh8Ux16:
3325 return binary8Ix16(mce, vatom1, vatom2);
3326 case Iop_Rsh16Sx8:
3327 case Iop_Rsh16Ux8:
3328 return binary16Ix8(mce, vatom1, vatom2);
3329 case Iop_Rsh32Sx4:
3330 case Iop_Rsh32Ux4:
3331 return binary32Ix4(mce, vatom1, vatom2);
3332 case Iop_Rsh64Sx2:
3333 case Iop_Rsh64Ux2:
3334 return binary64Ix2(mce, vatom1, vatom2);
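The binaryNIxM helpers used for these rounding shifts implement exactly "UifU the arguments, then PCast each lane": any undefined bit anywhere in a lane of either operand makes the whole corresponding result lane undefined. A model at 16x4 (illustrative only, not Valgrind code; the helper name is made up for the example):

```c
#include <stdint.h>

/* UifU both shadows, then PCast each 16-bit lane: a lane with any
   undefined bit comes out all-undefined, otherwise all-defined. */
static uint64_t binary16Ix4_vbits(uint64_t va, uint64_t vb)
{
   uint64_t u = va | vb;   /* UifU */
   uint64_t r = 0;
   for (int i = 0; i < 4; i++)
      if ((u >> (16*i)) & 0xFFFF)
         r |= (uint64_t)0xFFFF << (16*i);   /* per-lane PCast */
   return r;
}
```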
3335
sewardj57f92b02010-08-22 11:54:14 +00003336 case Iop_F32ToFixed32Ux4_RZ:
3337 case Iop_F32ToFixed32Sx4_RZ:
3338 case Iop_Fixed32UToF32x4_RN:
3339 case Iop_Fixed32SToF32x4_RN:
sewardjb9e6d242013-05-11 13:42:08 +00003340 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003341 return mkPCast32x4(mce, vatom1);
3342
3343 case Iop_F32ToFixed32Ux2_RZ:
3344 case Iop_F32ToFixed32Sx2_RZ:
3345 case Iop_Fixed32UToF32x2_RN:
3346 case Iop_Fixed32SToF32x2_RN:
sewardjb9e6d242013-05-11 13:42:08 +00003347 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003348 return mkPCast32x2(mce, vatom1);
3349
sewardja1d93302004-12-12 16:45:06 +00003350 case Iop_QSub8Ux16:
3351 case Iop_QSub8Sx16:
3352 case Iop_Sub8x16:
3353 case Iop_Min8Ux16:
sewardj43d60752005-11-10 18:13:01 +00003354 case Iop_Min8Sx16:
sewardja1d93302004-12-12 16:45:06 +00003355 case Iop_Max8Ux16:
sewardj43d60752005-11-10 18:13:01 +00003356 case Iop_Max8Sx16:
sewardja1d93302004-12-12 16:45:06 +00003357 case Iop_CmpGT8Sx16:
sewardj43d60752005-11-10 18:13:01 +00003358 case Iop_CmpGT8Ux16:
sewardja1d93302004-12-12 16:45:06 +00003359 case Iop_CmpEQ8x16:
3360 case Iop_Avg8Ux16:
sewardj43d60752005-11-10 18:13:01 +00003361 case Iop_Avg8Sx16:
sewardja1d93302004-12-12 16:45:06 +00003362 case Iop_QAdd8Ux16:
3363 case Iop_QAdd8Sx16:
sewardjbfd03f82014-08-26 18:35:13 +00003364 case Iop_QAddExtUSsatSS8x16:
3365 case Iop_QAddExtSUsatUU8x16:
sewardj57f92b02010-08-22 11:54:14 +00003366 case Iop_QSal8x16:
3367 case Iop_QShl8x16:
sewardja1d93302004-12-12 16:45:06 +00003368 case Iop_Add8x16:
sewardj57f92b02010-08-22 11:54:14 +00003369 case Iop_Mul8x16:
3370 case Iop_PolynomialMul8x16:
carll24e40de2013-10-15 18:13:21 +00003371 case Iop_PolynomialMulAdd8x16:
sewardja1d93302004-12-12 16:45:06 +00003372 return binary8Ix16(mce, vatom1, vatom2);
3373
3374 case Iop_QSub16Ux8:
3375 case Iop_QSub16Sx8:
3376 case Iop_Sub16x8:
3377 case Iop_Mul16x8:
3378 case Iop_MulHi16Sx8:
3379 case Iop_MulHi16Ux8:
3380 case Iop_Min16Sx8:
sewardj43d60752005-11-10 18:13:01 +00003381 case Iop_Min16Ux8:
sewardja1d93302004-12-12 16:45:06 +00003382 case Iop_Max16Sx8:
sewardj43d60752005-11-10 18:13:01 +00003383 case Iop_Max16Ux8:
sewardja1d93302004-12-12 16:45:06 +00003384 case Iop_CmpGT16Sx8:
sewardj43d60752005-11-10 18:13:01 +00003385 case Iop_CmpGT16Ux8:
sewardja1d93302004-12-12 16:45:06 +00003386 case Iop_CmpEQ16x8:
3387 case Iop_Avg16Ux8:
sewardj43d60752005-11-10 18:13:01 +00003388 case Iop_Avg16Sx8:
sewardja1d93302004-12-12 16:45:06 +00003389 case Iop_QAdd16Ux8:
3390 case Iop_QAdd16Sx8:
sewardjbfd03f82014-08-26 18:35:13 +00003391 case Iop_QAddExtUSsatSS16x8:
3392 case Iop_QAddExtSUsatUU16x8:
sewardj57f92b02010-08-22 11:54:14 +00003393 case Iop_QSal16x8:
3394 case Iop_QShl16x8:
sewardja1d93302004-12-12 16:45:06 +00003395 case Iop_Add16x8:
sewardj57f92b02010-08-22 11:54:14 +00003396 case Iop_QDMulHi16Sx8:
3397 case Iop_QRDMulHi16Sx8:
carll24e40de2013-10-15 18:13:21 +00003398 case Iop_PolynomialMulAdd16x8:
sewardja1d93302004-12-12 16:45:06 +00003399 return binary16Ix8(mce, vatom1, vatom2);
3400
3401 case Iop_Sub32x4:
3402 case Iop_CmpGT32Sx4:
sewardj43d60752005-11-10 18:13:01 +00003403 case Iop_CmpGT32Ux4:
sewardja1d93302004-12-12 16:45:06 +00003404 case Iop_CmpEQ32x4:
sewardj43d60752005-11-10 18:13:01 +00003405 case Iop_QAdd32Sx4:
3406 case Iop_QAdd32Ux4:
3407 case Iop_QSub32Sx4:
3408 case Iop_QSub32Ux4:
sewardjbfd03f82014-08-26 18:35:13 +00003409 case Iop_QAddExtUSsatSS32x4:
3410 case Iop_QAddExtSUsatUU32x4:
sewardj57f92b02010-08-22 11:54:14 +00003411 case Iop_QSal32x4:
3412 case Iop_QShl32x4:
sewardj43d60752005-11-10 18:13:01 +00003413 case Iop_Avg32Ux4:
3414 case Iop_Avg32Sx4:
sewardja1d93302004-12-12 16:45:06 +00003415 case Iop_Add32x4:
sewardj43d60752005-11-10 18:13:01 +00003416 case Iop_Max32Ux4:
3417 case Iop_Max32Sx4:
3418 case Iop_Min32Ux4:
3419 case Iop_Min32Sx4:
sewardjb823b852010-06-18 08:18:38 +00003420 case Iop_Mul32x4:
sewardj57f92b02010-08-22 11:54:14 +00003421 case Iop_QDMulHi32Sx4:
3422 case Iop_QRDMulHi32Sx4:
carll24e40de2013-10-15 18:13:21 +00003423 case Iop_PolynomialMulAdd32x4:
sewardja1d93302004-12-12 16:45:06 +00003424 return binary32Ix4(mce, vatom1, vatom2);
3425
3426 case Iop_Sub64x2:
3427 case Iop_Add64x2:
carll62770672013-10-01 15:50:09 +00003428 case Iop_Max64Sx2:
3429 case Iop_Max64Ux2:
3430 case Iop_Min64Sx2:
3431 case Iop_Min64Ux2:
sewardj9a2afe92011-10-19 15:24:55 +00003432 case Iop_CmpEQ64x2:
sewardjb823b852010-06-18 08:18:38 +00003433 case Iop_CmpGT64Sx2:
carll62770672013-10-01 15:50:09 +00003434 case Iop_CmpGT64Ux2:
sewardj57f92b02010-08-22 11:54:14 +00003435 case Iop_QSal64x2:
3436 case Iop_QShl64x2:
3437 case Iop_QAdd64Ux2:
3438 case Iop_QAdd64Sx2:
3439 case Iop_QSub64Ux2:
3440 case Iop_QSub64Sx2:
sewardjbfd03f82014-08-26 18:35:13 +00003441 case Iop_QAddExtUSsatSS64x2:
3442 case Iop_QAddExtSUsatUU64x2:
carll24e40de2013-10-15 18:13:21 +00003443 case Iop_PolynomialMulAdd64x2:
3444 case Iop_CipherV128:
3445 case Iop_CipherLV128:
3446 case Iop_NCipherV128:
3447 case Iop_NCipherLV128:
Elliott Hughesa0664b92017-04-18 17:46:52 -07003448 case Iop_MulI128by10E:
3449 case Iop_MulI128by10ECarry:
carll24e40de2013-10-15 18:13:21 +00003450 return binary64Ix2(mce, vatom1, vatom2);
sewardja1d93302004-12-12 16:45:06 +00003451
carll62770672013-10-01 15:50:09 +00003452 case Iop_QNarrowBin64Sto32Sx4:
3453 case Iop_QNarrowBin64Uto32Ux4:
sewardj7ee7d852011-06-16 11:37:21 +00003454 case Iop_QNarrowBin32Sto16Sx8:
3455 case Iop_QNarrowBin32Uto16Ux8:
3456 case Iop_QNarrowBin32Sto16Ux8:
3457 case Iop_QNarrowBin16Sto8Sx16:
3458 case Iop_QNarrowBin16Uto8Ux16:
3459 case Iop_QNarrowBin16Sto8Ux16:
3460 return vectorNarrowBinV128(mce, op, vatom1, vatom2);
sewardja1d93302004-12-12 16:45:06 +00003461
sewardj0b070592004-12-10 21:44:22 +00003462 case Iop_Min64Fx2:
3463 case Iop_Max64Fx2:
sewardj0b070592004-12-10 21:44:22 +00003464 case Iop_CmpLT64Fx2:
3465 case Iop_CmpLE64Fx2:
3466 case Iop_CmpEQ64Fx2:
sewardj545663e2005-11-05 01:55:04 +00003467 case Iop_CmpUN64Fx2:
sewardj14350762015-02-24 12:24:35 +00003468 case Iop_RecipStep64Fx2:
3469 case Iop_RSqrtStep64Fx2:
sewardj0b070592004-12-10 21:44:22 +00003470 return binary64Fx2(mce, vatom1, vatom2);
3471
3472 case Iop_Sub64F0x2:
3473 case Iop_Mul64F0x2:
3474 case Iop_Min64F0x2:
3475 case Iop_Max64F0x2:
3476 case Iop_Div64F0x2:
3477 case Iop_CmpLT64F0x2:
3478 case Iop_CmpLE64F0x2:
3479 case Iop_CmpEQ64F0x2:
sewardj545663e2005-11-05 01:55:04 +00003480 case Iop_CmpUN64F0x2:
sewardj0b070592004-12-10 21:44:22 +00003481 case Iop_Add64F0x2:
3482 return binary64F0x2(mce, vatom1, vatom2);
3483
sewardj170ee212004-12-10 18:57:51 +00003484 case Iop_Min32Fx4:
3485 case Iop_Max32Fx4:
sewardj170ee212004-12-10 18:57:51 +00003486 case Iop_CmpLT32Fx4:
3487 case Iop_CmpLE32Fx4:
3488 case Iop_CmpEQ32Fx4:
sewardj545663e2005-11-05 01:55:04 +00003489 case Iop_CmpUN32Fx4:
cerione78ba2a2005-11-14 03:00:35 +00003490 case Iop_CmpGT32Fx4:
3491 case Iop_CmpGE32Fx4:
sewardjee6bb772014-08-24 14:02:22 +00003492 case Iop_RecipStep32Fx4:
3493 case Iop_RSqrtStep32Fx4:
sewardj3245c912004-12-10 14:58:26 +00003494 return binary32Fx4(mce, vatom1, vatom2);
3495
sewardj57f92b02010-08-22 11:54:14 +00003496 case Iop_Sub32Fx2:
3497 case Iop_Mul32Fx2:
3498 case Iop_Min32Fx2:
3499 case Iop_Max32Fx2:
3500 case Iop_CmpEQ32Fx2:
3501 case Iop_CmpGT32Fx2:
3502 case Iop_CmpGE32Fx2:
3503 case Iop_Add32Fx2:
sewardjee6bb772014-08-24 14:02:22 +00003504 case Iop_RecipStep32Fx2:
3505 case Iop_RSqrtStep32Fx2:
sewardj57f92b02010-08-22 11:54:14 +00003506 return binary32Fx2(mce, vatom1, vatom2);
3507
sewardj170ee212004-12-10 18:57:51 +00003508 case Iop_Sub32F0x4:
3509 case Iop_Mul32F0x4:
3510 case Iop_Min32F0x4:
3511 case Iop_Max32F0x4:
3512 case Iop_Div32F0x4:
3513 case Iop_CmpLT32F0x4:
3514 case Iop_CmpLE32F0x4:
3515 case Iop_CmpEQ32F0x4:
sewardj545663e2005-11-05 01:55:04 +00003516 case Iop_CmpUN32F0x4:
sewardj170ee212004-12-10 18:57:51 +00003517 case Iop_Add32F0x4:
3518 return binary32F0x4(mce, vatom1, vatom2);
3519
sewardje541e222014-08-15 09:12:28 +00003520 case Iop_QShlNsatSU8x16:
3521 case Iop_QShlNsatUU8x16:
3522 case Iop_QShlNsatSS8x16:
sewardjb9e6d242013-05-11 13:42:08 +00003523 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003524 return mkPCast8x16(mce, vatom1);
3525
sewardje541e222014-08-15 09:12:28 +00003526 case Iop_QShlNsatSU16x8:
3527 case Iop_QShlNsatUU16x8:
3528 case Iop_QShlNsatSS16x8:
sewardjb9e6d242013-05-11 13:42:08 +00003529 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003530 return mkPCast16x8(mce, vatom1);
3531
sewardje541e222014-08-15 09:12:28 +00003532 case Iop_QShlNsatSU32x4:
3533 case Iop_QShlNsatUU32x4:
3534 case Iop_QShlNsatSS32x4:
sewardjb9e6d242013-05-11 13:42:08 +00003535 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003536 return mkPCast32x4(mce, vatom1);
3537
sewardje541e222014-08-15 09:12:28 +00003538 case Iop_QShlNsatSU64x2:
3539 case Iop_QShlNsatUU64x2:
3540 case Iop_QShlNsatSS64x2:
sewardjb9e6d242013-05-11 13:42:08 +00003541 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003542         return mkPCast64x2(mce, vatom1); /* 64-bit lanes, so pessimise at 64-bit granularity */
3543
sewardjbfd03f82014-08-26 18:35:13 +00003544 /* Q-and-Qshift-by-imm-and-narrow of the form (V128, I8) -> V128.
3545 To make this simpler, do the following:
3546 * complain if the shift amount (the I8) is undefined
3547 * pcast each lane at the wide width
3548 * truncate each lane to half width
3549 * pcast the resulting 64-bit value to a single bit and use
3550 that as the least significant bit of the upper half of the
3551 result. */
3552 case Iop_QandQShrNnarrow64Uto32Ux2:
3553 case Iop_QandQSarNnarrow64Sto32Sx2:
3554 case Iop_QandQSarNnarrow64Sto32Ux2:
3555 case Iop_QandQRShrNnarrow64Uto32Ux2:
3556 case Iop_QandQRSarNnarrow64Sto32Sx2:
3557 case Iop_QandQRSarNnarrow64Sto32Ux2:
3558 case Iop_QandQShrNnarrow32Uto16Ux4:
3559 case Iop_QandQSarNnarrow32Sto16Sx4:
3560 case Iop_QandQSarNnarrow32Sto16Ux4:
3561 case Iop_QandQRShrNnarrow32Uto16Ux4:
3562 case Iop_QandQRSarNnarrow32Sto16Sx4:
3563 case Iop_QandQRSarNnarrow32Sto16Ux4:
3564 case Iop_QandQShrNnarrow16Uto8Ux8:
3565 case Iop_QandQSarNnarrow16Sto8Sx8:
3566 case Iop_QandQSarNnarrow16Sto8Ux8:
3567 case Iop_QandQRShrNnarrow16Uto8Ux8:
3568 case Iop_QandQRSarNnarrow16Sto8Sx8:
3569 case Iop_QandQRSarNnarrow16Sto8Ux8:
3570 {
3571 IRAtom* (*fnPessim) (MCEnv*, IRAtom*) = NULL;
3572 IROp opNarrow = Iop_INVALID;
3573 switch (op) {
3574 case Iop_QandQShrNnarrow64Uto32Ux2:
3575 case Iop_QandQSarNnarrow64Sto32Sx2:
3576 case Iop_QandQSarNnarrow64Sto32Ux2:
3577 case Iop_QandQRShrNnarrow64Uto32Ux2:
3578 case Iop_QandQRSarNnarrow64Sto32Sx2:
3579 case Iop_QandQRSarNnarrow64Sto32Ux2:
3580 fnPessim = mkPCast64x2;
3581 opNarrow = Iop_NarrowUn64to32x2;
3582 break;
3583 case Iop_QandQShrNnarrow32Uto16Ux4:
3584 case Iop_QandQSarNnarrow32Sto16Sx4:
3585 case Iop_QandQSarNnarrow32Sto16Ux4:
3586 case Iop_QandQRShrNnarrow32Uto16Ux4:
3587 case Iop_QandQRSarNnarrow32Sto16Sx4:
3588 case Iop_QandQRSarNnarrow32Sto16Ux4:
3589 fnPessim = mkPCast32x4;
3590 opNarrow = Iop_NarrowUn32to16x4;
3591 break;
3592 case Iop_QandQShrNnarrow16Uto8Ux8:
3593 case Iop_QandQSarNnarrow16Sto8Sx8:
3594 case Iop_QandQSarNnarrow16Sto8Ux8:
3595 case Iop_QandQRShrNnarrow16Uto8Ux8:
3596 case Iop_QandQRSarNnarrow16Sto8Sx8:
3597 case Iop_QandQRSarNnarrow16Sto8Ux8:
3598 fnPessim = mkPCast16x8;
3599 opNarrow = Iop_NarrowUn16to8x8;
3600 break;
3601 default:
3602 tl_assert(0);
3603 }
3604 complainIfUndefined(mce, atom2, NULL);
3605 // Pessimised shift result
3606 IRAtom* shV
3607 = fnPessim(mce, vatom1);
3608 // Narrowed, pessimised shift result
3609 IRAtom* shVnarrowed
3610 = assignNew('V', mce, Ity_I64, unop(opNarrow, shV));
3611 // Generates: Def--(63)--Def PCast-to-I1(narrowed)
3612 IRAtom* qV = mkPCastXXtoXXlsb(mce, shVnarrowed, Ity_I64);
3613 // and assemble the result
3614 return assignNew('V', mce, Ity_V128,
3615 binop(Iop_64HLtoV128, qV, shVnarrowed));
3616 }
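The mkPCastXXtoXXlsb step in the block above yields a value whose upper bits are all defined and whose least significant bit is the PCast-to-I1 of the input: undefined (1) iff the input has any undefined bit, matching the "Def--(63)--Def PCast-to-I1" comment. A 64-bit model (illustrative only, not Valgrind code):

```c
#include <stdint.h>

/* 63 defined (zero) vbits, with the lsb holding the one-bit PCast of
   the whole input: set iff the input has any undefined bit. */
static uint64_t pcast64_to_lsb(uint64_t v)
{
   return v != 0 ? 1 : 0;
}
```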
3617
sewardj57f92b02010-08-22 11:54:14 +00003618 case Iop_Mull32Sx2:
3619 case Iop_Mull32Ux2:
sewardj4d6ce842014-07-21 09:21:57 +00003620 case Iop_QDMull32Sx2:
sewardj7ee7d852011-06-16 11:37:21 +00003621 return vectorWidenI64(mce, Iop_Widen32Sto64x2,
3622 mkUifU64(mce, vatom1, vatom2));
sewardj57f92b02010-08-22 11:54:14 +00003623
3624 case Iop_Mull16Sx4:
3625 case Iop_Mull16Ux4:
sewardj4d6ce842014-07-21 09:21:57 +00003626 case Iop_QDMull16Sx4:
sewardj7ee7d852011-06-16 11:37:21 +00003627 return vectorWidenI64(mce, Iop_Widen16Sto32x4,
3628 mkUifU64(mce, vatom1, vatom2));
sewardj57f92b02010-08-22 11:54:14 +00003629
3630 case Iop_Mull8Sx8:
3631 case Iop_Mull8Ux8:
3632 case Iop_PolynomialMull8x8:
sewardj7ee7d852011-06-16 11:37:21 +00003633 return vectorWidenI64(mce, Iop_Widen8Sto16x8,
3634 mkUifU64(mce, vatom1, vatom2));
sewardj57f92b02010-08-22 11:54:14 +00003635
3636 case Iop_PwAdd32x4:
3637 return mkPCast32x4(mce,
3638 assignNew('V', mce, Ity_V128, binop(op, mkPCast32x4(mce, vatom1),
3639 mkPCast32x4(mce, vatom2))));
3640
3641 case Iop_PwAdd16x8:
3642 return mkPCast16x8(mce,
3643 assignNew('V', mce, Ity_V128, binop(op, mkPCast16x8(mce, vatom1),
3644 mkPCast16x8(mce, vatom2))));
3645
3646 case Iop_PwAdd8x16:
3647 return mkPCast8x16(mce,
3648 assignNew('V', mce, Ity_V128, binop(op, mkPCast8x16(mce, vatom1),
3649 mkPCast8x16(mce, vatom2))));
3650
sewardj20d38f22005-02-07 23:50:18 +00003651 /* V128-bit data-steering */
3652 case Iop_SetV128lo32:
3653 case Iop_SetV128lo64:
3654 case Iop_64HLtoV128:
sewardja1d93302004-12-12 16:45:06 +00003655 case Iop_InterleaveLO64x2:
3656 case Iop_InterleaveLO32x4:
3657 case Iop_InterleaveLO16x8:
3658 case Iop_InterleaveLO8x16:
3659 case Iop_InterleaveHI64x2:
3660 case Iop_InterleaveHI32x4:
3661 case Iop_InterleaveHI16x8:
3662 case Iop_InterleaveHI8x16:
sewardj57f92b02010-08-22 11:54:14 +00003663 case Iop_CatOddLanes8x16:
3664 case Iop_CatOddLanes16x8:
3665 case Iop_CatOddLanes32x4:
3666 case Iop_CatEvenLanes8x16:
3667 case Iop_CatEvenLanes16x8:
3668 case Iop_CatEvenLanes32x4:
3669 case Iop_InterleaveOddLanes8x16:
3670 case Iop_InterleaveOddLanes16x8:
3671 case Iop_InterleaveOddLanes32x4:
3672 case Iop_InterleaveEvenLanes8x16:
3673 case Iop_InterleaveEvenLanes16x8:
3674 case Iop_InterleaveEvenLanes32x4:
sewardj7cf4e6b2008-05-01 20:24:26 +00003675 return assignNew('V', mce, Ity_V128, binop(op, vatom1, vatom2));
sewardj57f92b02010-08-22 11:54:14 +00003676
3677 case Iop_GetElem8x16:
sewardjb9e6d242013-05-11 13:42:08 +00003678 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003679 return assignNew('V', mce, Ity_I8, binop(op, vatom1, atom2));
3680 case Iop_GetElem16x8:
sewardjb9e6d242013-05-11 13:42:08 +00003681 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003682 return assignNew('V', mce, Ity_I16, binop(op, vatom1, atom2));
3683 case Iop_GetElem32x4:
sewardjb9e6d242013-05-11 13:42:08 +00003684 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003685 return assignNew('V', mce, Ity_I32, binop(op, vatom1, atom2));
3686 case Iop_GetElem64x2:
sewardjb9e6d242013-05-11 13:42:08 +00003687 complainIfUndefined(mce, atom2, NULL);
sewardj57f92b02010-08-22 11:54:14 +00003688 return assignNew('V', mce, Ity_I64, binop(op, vatom1, atom2));
3689
      /* Perm8x16: rearrange values in left arg using steering values
         from right arg.  So rearrange the vbits in the same way but
         pessimise wrt steering values.  Perm32x4 ditto. */
      case Iop_Perm8x16:
         return mkUifUV128(
                   mce,
                   assignNew('V', mce, Ity_V128, binop(op, vatom1, atom2)),
                   mkPCast8x16(mce, vatom2)
                );
      case Iop_Perm32x4:
         return mkUifUV128(
                   mce,
                   assignNew('V', mce, Ity_V128, binop(op, vatom1, atom2)),
                   mkPCast32x4(mce, vatom2)
                );

      /* These two take the lower half of each 16-bit lane, sign/zero
         extend it to 32, and multiply together, producing a 32x4
         result (and implicitly ignoring half the operand bits).  So
         treat it as a bunch of independent 16x8 operations, but then
         do 32-bit shifts left-right to copy the lower half results
         (which are all 0s or all 1s due to PCasting in binary16Ix8)
         into the upper half of each result lane. */
      case Iop_MullEven16Ux8:
      case Iop_MullEven16Sx8: {
         IRAtom* at;
         at = binary16Ix8(mce,vatom1,vatom2);
         at = assignNew('V', mce, Ity_V128, binop(Iop_ShlN32x4, at, mkU8(16)));
         at = assignNew('V', mce, Ity_V128, binop(Iop_SarN32x4, at, mkU8(16)));
         return at;
      }

      /* Same deal as Iop_MullEven16{S,U}x8 */
      case Iop_MullEven8Ux16:
      case Iop_MullEven8Sx16: {
         IRAtom* at;
         at = binary8Ix16(mce,vatom1,vatom2);
         at = assignNew('V', mce, Ity_V128, binop(Iop_ShlN16x8, at, mkU8(8)));
         at = assignNew('V', mce, Ity_V128, binop(Iop_SarN16x8, at, mkU8(8)));
         return at;
      }

      /* Same deal as Iop_MullEven16{S,U}x8 */
      case Iop_MullEven32Ux4:
      case Iop_MullEven32Sx4: {
         IRAtom* at;
         at = binary32Ix4(mce,vatom1,vatom2);
         at = assignNew('V', mce, Ity_V128, binop(Iop_ShlN64x2, at, mkU8(32)));
         at = assignNew('V', mce, Ity_V128, binop(Iop_SarN64x2, at, mkU8(32)));
         return at;
      }

      /* narrow 2xV128 into 1xV128, hi half from left arg, in a 2 x
         32x4 -> 16x8 laneage, discarding the upper half of each lane.
         Simply apply the same op to the V bits, since this really is
         no more than a data-steering operation. */
      case Iop_NarrowBin32to16x8:
      case Iop_NarrowBin16to8x16:
      case Iop_NarrowBin64to32x4:
         return assignNew('V', mce, Ity_V128,
                          binop(op, vatom1, vatom2));

      case Iop_ShrV128:
      case Iop_ShlV128:
      case Iop_I128StoBCD128:
         /* Same scheme as with all other shifts.  Note: 10 Nov 05:
            this is wrong now, scalar shifts are done properly lazily.
            Vector shifts should be fixed too. */
         complainIfUndefined(mce, atom2, NULL);
         return assignNew('V', mce, Ity_V128, binop(op, vatom1, atom2));

      case Iop_BCDAdd:
      case Iop_BCDSub:
         return mkLazy2(mce, Ity_V128, vatom1, vatom2);

      /* SHA Iops */
      case Iop_SHA256:
      case Iop_SHA512:
         complainIfUndefined(mce, atom2, NULL);
         return assignNew('V', mce, Ity_V128, binop(op, vatom1, atom2));

      /* I128-bit data-steering */
      case Iop_64HLto128:
         return assignNew('V', mce, Ity_I128, binop(op, vatom1, vatom2));

      /* V256-bit SIMD */

      case Iop_Max64Fx4:
      case Iop_Min64Fx4:
         return binary64Fx4(mce, vatom1, vatom2);

      case Iop_Max32Fx8:
      case Iop_Min32Fx8:
         return binary32Fx8(mce, vatom1, vatom2);

      /* V256-bit data-steering */
      case Iop_V128HLtoV256:
         return assignNew('V', mce, Ity_V256, binop(op, vatom1, vatom2));

      /* Scalar floating point */

      case Iop_F32toI64S:
      case Iop_F32toI64U:
         /* I32(rm) x F32 -> I64 */
         return mkLazy2(mce, Ity_I64, vatom1, vatom2);

      case Iop_I64StoF32:
         /* I32(rm) x I64 -> F32 */
         return mkLazy2(mce, Ity_I32, vatom1, vatom2);

      case Iop_RoundF64toInt:
      case Iop_RoundF64toF32:
      case Iop_F64toI64S:
      case Iop_F64toI64U:
      case Iop_I64StoF64:
      case Iop_I64UtoF64:
      case Iop_SinF64:
      case Iop_CosF64:
      case Iop_TanF64:
      case Iop_2xm1F64:
      case Iop_SqrtF64:
      case Iop_RecpExpF64:
         /* I32(rm) x I64/F64 -> I64/F64 */
         return mkLazy2(mce, Ity_I64, vatom1, vatom2);

      case Iop_ShlD64:
      case Iop_ShrD64:
      case Iop_RoundD64toInt:
         /* I32(rm) x D64 -> D64 */
         return mkLazy2(mce, Ity_I64, vatom1, vatom2);

      case Iop_ShlD128:
      case Iop_ShrD128:
      case Iop_RoundD128toInt:
         /* I32(rm) x D128 -> D128 */
         return mkLazy2(mce, Ity_I128, vatom1, vatom2);

      case Iop_RoundF128toInt:
         /* I32(rm) x F128 -> F128 */
         return mkLazy2(mce, Ity_I128, vatom1, vatom2);

      case Iop_D64toI64S:
      case Iop_D64toI64U:
      case Iop_I64StoD64:
      case Iop_I64UtoD64:
         /* I32(rm) x I64/D64 -> D64/I64 */
         return mkLazy2(mce, Ity_I64, vatom1, vatom2);

      case Iop_F32toD32:
      case Iop_F64toD32:
      case Iop_F128toD32:
      case Iop_D32toF32:
      case Iop_D64toF32:
      case Iop_D128toF32:
         /* I32(rm) x F32/F64/F128/D32/D64/D128 -> D32/F32 */
         return mkLazy2(mce, Ity_I32, vatom1, vatom2);

      case Iop_F32toD64:
      case Iop_F64toD64:
      case Iop_F128toD64:
      case Iop_D32toF64:
      case Iop_D64toF64:
      case Iop_D128toF64:
         /* I32(rm) x F32/F64/F128/D32/D64/D128 -> D64/F64 */
         return mkLazy2(mce, Ity_I64, vatom1, vatom2);

      case Iop_F32toD128:
      case Iop_F64toD128:
      case Iop_F128toD128:
      case Iop_D32toF128:
      case Iop_D64toF128:
      case Iop_D128toF128:
         /* I32(rm) x F32/F64/F128/D32/D64/D128 -> D128/F128 */
         return mkLazy2(mce, Ity_I128, vatom1, vatom2);

      case Iop_RoundF32toInt:
      case Iop_SqrtF32:
      case Iop_RecpExpF32:
         /* I32(rm) x I32/F32 -> I32/F32 */
         return mkLazy2(mce, Ity_I32, vatom1, vatom2);

      case Iop_SqrtF128:
         /* I32(rm) x F128 -> F128 */
         return mkLazy2(mce, Ity_I128, vatom1, vatom2);

      case Iop_I32StoF32:
      case Iop_I32UtoF32:
      case Iop_F32toI32S:
      case Iop_F32toI32U:
         /* First arg is I32 (rounding mode), second is F32/I32 (data). */
         return mkLazy2(mce, Ity_I32, vatom1, vatom2);

      case Iop_F64toF16:
      case Iop_F32toF16:
         /* First arg is I32 (rounding mode), second is F64/F32 (data). */
         return mkLazy2(mce, Ity_I16, vatom1, vatom2);

      case Iop_F128toI32S: /* IRRoundingMode(I32) x F128 -> signed I32 */
      case Iop_F128toI32U: /* IRRoundingMode(I32) x F128 -> unsigned I32 */
      case Iop_F128toF32:  /* IRRoundingMode(I32) x F128 -> F32 */
      case Iop_D128toI32S: /* IRRoundingMode(I32) x D128 -> signed I32 */
      case Iop_D128toI32U: /* IRRoundingMode(I32) x D128 -> unsigned I32 */
         return mkLazy2(mce, Ity_I32, vatom1, vatom2);

      case Iop_F128toI128S: /* IRRoundingMode(I32) x F128 -> signed I128 */
      case Iop_RndF128:     /* IRRoundingMode(I32) x F128 -> F128 */
         return mkLazy2(mce, Ity_I128, vatom1, vatom2);

      case Iop_F128toI64S: /* IRRoundingMode(I32) x F128 -> signed I64 */
      case Iop_F128toI64U: /* IRRoundingMode(I32) x F128 -> unsigned I64 */
      case Iop_F128toF64:  /* IRRoundingMode(I32) x F128 -> F64 */
      case Iop_D128toD64:  /* IRRoundingMode(I64) x D128 -> D64 */
      case Iop_D128toI64S: /* IRRoundingMode(I64) x D128 -> signed I64 */
      case Iop_D128toI64U: /* IRRoundingMode(I32) x D128 -> unsigned I64 */
         return mkLazy2(mce, Ity_I64, vatom1, vatom2);

      case Iop_F64HLtoF128:
      case Iop_D64HLtoD128:
         return assignNew('V', mce, Ity_I128,
                          binop(Iop_64HLto128, vatom1, vatom2));

      case Iop_F64toI32U:
      case Iop_F64toI32S:
      case Iop_F64toF32:
      case Iop_I64UtoF32:
      case Iop_D64toI32U:
      case Iop_D64toI32S:
         /* First arg is I32 (rounding mode), second is F64/D64 (data). */
         return mkLazy2(mce, Ity_I32, vatom1, vatom2);

      case Iop_D64toD32:
         /* First arg is I32 (rounding mode), second is D64 (data). */
         return mkLazy2(mce, Ity_I32, vatom1, vatom2);

      case Iop_F64toI16S:
         /* First arg is I32 (rounding mode), second is F64 (data). */
         return mkLazy2(mce, Ity_I16, vatom1, vatom2);

      case Iop_InsertExpD64:
         /* I64 x I64 -> D64 */
         return mkLazy2(mce, Ity_I64, vatom1, vatom2);

      case Iop_InsertExpD128:
         /* I64 x I128 -> D128 */
         return mkLazy2(mce, Ity_I128, vatom1, vatom2);

      case Iop_CmpF32:
      case Iop_CmpF64:
      case Iop_CmpF128:
      case Iop_CmpD64:
      case Iop_CmpD128:
      case Iop_CmpExpD64:
      case Iop_CmpExpD128:
         return mkLazy2(mce, Ity_I32, vatom1, vatom2);

      /* non-FP after here */

      case Iop_DivModU64to32:
      case Iop_DivModS64to32:
         return mkLazy2(mce, Ity_I64, vatom1, vatom2);

      case Iop_DivModU128to64:
      case Iop_DivModS128to64:
         return mkLazy2(mce, Ity_I128, vatom1, vatom2);

      case Iop_8HLto16:
         return assignNew('V', mce, Ity_I16, binop(op, vatom1, vatom2));
      case Iop_16HLto32:
         return assignNew('V', mce, Ity_I32, binop(op, vatom1, vatom2));
      case Iop_32HLto64:
         return assignNew('V', mce, Ity_I64, binop(op, vatom1, vatom2));

      case Iop_DivModS64to64:
      case Iop_MullS64:
      case Iop_MullU64: {
         IRAtom* vLo64 = mkLeft64(mce, mkUifU64(mce, vatom1,vatom2));
         IRAtom* vHi64 = mkPCastTo(mce, Ity_I64, vLo64);
         return assignNew('V', mce, Ity_I128,
                          binop(Iop_64HLto128, vHi64, vLo64));
      }

      case Iop_MullS32:
      case Iop_MullU32: {
         IRAtom* vLo32 = mkLeft32(mce, mkUifU32(mce, vatom1,vatom2));
         IRAtom* vHi32 = mkPCastTo(mce, Ity_I32, vLo32);
         return assignNew('V', mce, Ity_I64,
                          binop(Iop_32HLto64, vHi32, vLo32));
      }

      case Iop_MullS16:
      case Iop_MullU16: {
         IRAtom* vLo16 = mkLeft16(mce, mkUifU16(mce, vatom1,vatom2));
         IRAtom* vHi16 = mkPCastTo(mce, Ity_I16, vLo16);
         return assignNew('V', mce, Ity_I32,
                          binop(Iop_16HLto32, vHi16, vLo16));
      }

      case Iop_MullS8:
      case Iop_MullU8: {
         IRAtom* vLo8 = mkLeft8(mce, mkUifU8(mce, vatom1,vatom2));
         IRAtom* vHi8 = mkPCastTo(mce, Ity_I8, vLo8);
         return assignNew('V', mce, Ity_I16, binop(Iop_8HLto16, vHi8, vLo8));
      }

      case Iop_Sad8Ux4: /* maybe we could do better?  ftm, do mkLazy2. */
      case Iop_DivS32:
      case Iop_DivU32:
      case Iop_DivU32E:
      case Iop_DivS32E:
      case Iop_QAdd32S: /* could probably do better */
      case Iop_QSub32S: /* could probably do better */
         return mkLazy2(mce, Ity_I32, vatom1, vatom2);

      case Iop_DivS64:
      case Iop_DivU64:
      case Iop_DivS64E:
      case Iop_DivU64E:
         return mkLazy2(mce, Ity_I64, vatom1, vatom2);

      case Iop_Add32:
         if (mce->bogusLiterals || mce->useLLVMworkarounds)
            return expensiveAddSub(mce,True,Ity_I32,
                                   vatom1,vatom2, atom1,atom2);
         else
            goto cheap_AddSub32;
      case Iop_Sub32:
         if (mce->bogusLiterals)
            return expensiveAddSub(mce,False,Ity_I32,
                                   vatom1,vatom2, atom1,atom2);
         else
            goto cheap_AddSub32;

      cheap_AddSub32:
      case Iop_Mul32:
         return mkLeft32(mce, mkUifU32(mce, vatom1,vatom2));

      case Iop_CmpORD32S:
      case Iop_CmpORD32U:
      case Iop_CmpORD64S:
      case Iop_CmpORD64U:
         return doCmpORD(mce, op, vatom1,vatom2, atom1,atom2);

      case Iop_Add64:
         if (mce->bogusLiterals || mce->useLLVMworkarounds)
            return expensiveAddSub(mce,True,Ity_I64,
                                   vatom1,vatom2, atom1,atom2);
         else
            goto cheap_AddSub64;
      case Iop_Sub64:
         if (mce->bogusLiterals)
            return expensiveAddSub(mce,False,Ity_I64,
                                   vatom1,vatom2, atom1,atom2);
         else
            goto cheap_AddSub64;

      cheap_AddSub64:
      case Iop_Mul64:
         return mkLeft64(mce, mkUifU64(mce, vatom1,vatom2));

      case Iop_Mul16:
      case Iop_Add16:
      case Iop_Sub16:
         return mkLeft16(mce, mkUifU16(mce, vatom1,vatom2));

      case Iop_Mul8:
      case Iop_Sub8:
      case Iop_Add8:
         return mkLeft8(mce, mkUifU8(mce, vatom1,vatom2));

      case Iop_CmpEQ64:
      case Iop_CmpNE64:
         if (mce->bogusLiterals)
            goto expensive_cmp64;
         else
            goto cheap_cmp64;

      expensive_cmp64:
      case Iop_ExpCmpNE64:
         return expensiveCmpEQorNE(mce,Ity_I64, vatom1,vatom2, atom1,atom2 );

      cheap_cmp64:
      case Iop_CmpLE64S: case Iop_CmpLE64U:
      case Iop_CmpLT64U: case Iop_CmpLT64S:
         return mkPCastTo(mce, Ity_I1, mkUifU64(mce, vatom1,vatom2));

      case Iop_CmpEQ32:
      case Iop_CmpNE32:
         if (mce->bogusLiterals)
            goto expensive_cmp32;
         else
            goto cheap_cmp32;

      expensive_cmp32:
      case Iop_ExpCmpNE32:
         return expensiveCmpEQorNE(mce,Ity_I32, vatom1,vatom2, atom1,atom2 );

      cheap_cmp32:
      case Iop_CmpLE32S: case Iop_CmpLE32U:
      case Iop_CmpLT32U: case Iop_CmpLT32S:
         return mkPCastTo(mce, Ity_I1, mkUifU32(mce, vatom1,vatom2));

      case Iop_CmpEQ16: case Iop_CmpNE16:
         return mkPCastTo(mce, Ity_I1, mkUifU16(mce, vatom1,vatom2));

      case Iop_ExpCmpNE16:
         return expensiveCmpEQorNE(mce,Ity_I16, vatom1,vatom2, atom1,atom2 );

      case Iop_CmpEQ8: case Iop_CmpNE8:
         return mkPCastTo(mce, Ity_I1, mkUifU8(mce, vatom1,vatom2));

      case Iop_CasCmpEQ8:  case Iop_CasCmpNE8:
      case Iop_CasCmpEQ16: case Iop_CasCmpNE16:
      case Iop_CasCmpEQ32: case Iop_CasCmpNE32:
      case Iop_CasCmpEQ64: case Iop_CasCmpNE64:
         /* Just say these all produce a defined result, regardless
            of their arguments.  See COMMENT_ON_CasCmpEQ in this file. */
         return assignNew('V', mce, Ity_I1, definedOfType(Ity_I1));

      case Iop_Shl64: case Iop_Shr64: case Iop_Sar64:
         return scalarShift( mce, Ity_I64, op, vatom1,vatom2, atom1,atom2 );

      case Iop_Shl32: case Iop_Shr32: case Iop_Sar32:
         return scalarShift( mce, Ity_I32, op, vatom1,vatom2, atom1,atom2 );

      case Iop_Shl16: case Iop_Shr16: case Iop_Sar16:
         return scalarShift( mce, Ity_I16, op, vatom1,vatom2, atom1,atom2 );

      case Iop_Shl8: case Iop_Shr8: case Iop_Sar8:
         return scalarShift( mce, Ity_I8, op, vatom1,vatom2, atom1,atom2 );

      case Iop_AndV256:
         uifu = mkUifUV256; difd = mkDifDV256;
         and_or_ty = Ity_V256; improve = mkImproveANDV256; goto do_And_Or;
      case Iop_AndV128:
         uifu = mkUifUV128; difd = mkDifDV128;
         and_or_ty = Ity_V128; improve = mkImproveANDV128; goto do_And_Or;
      case Iop_And64:
         uifu = mkUifU64; difd = mkDifD64;
         and_or_ty = Ity_I64; improve = mkImproveAND64; goto do_And_Or;
      case Iop_And32:
         uifu = mkUifU32; difd = mkDifD32;
         and_or_ty = Ity_I32; improve = mkImproveAND32; goto do_And_Or;
      case Iop_And16:
         uifu = mkUifU16; difd = mkDifD16;
         and_or_ty = Ity_I16; improve = mkImproveAND16; goto do_And_Or;
      case Iop_And8:
         uifu = mkUifU8; difd = mkDifD8;
         and_or_ty = Ity_I8; improve = mkImproveAND8; goto do_And_Or;

      case Iop_OrV256:
         uifu = mkUifUV256; difd = mkDifDV256;
         and_or_ty = Ity_V256; improve = mkImproveORV256; goto do_And_Or;
      case Iop_OrV128:
         uifu = mkUifUV128; difd = mkDifDV128;
         and_or_ty = Ity_V128; improve = mkImproveORV128; goto do_And_Or;
      case Iop_Or64:
         uifu = mkUifU64; difd = mkDifD64;
         and_or_ty = Ity_I64; improve = mkImproveOR64; goto do_And_Or;
      case Iop_Or32:
         uifu = mkUifU32; difd = mkDifD32;
         and_or_ty = Ity_I32; improve = mkImproveOR32; goto do_And_Or;
      case Iop_Or16:
         uifu = mkUifU16; difd = mkDifD16;
         and_or_ty = Ity_I16; improve = mkImproveOR16; goto do_And_Or;
      case Iop_Or8:
         uifu = mkUifU8; difd = mkDifD8;
         and_or_ty = Ity_I8; improve = mkImproveOR8; goto do_And_Or;

      do_And_Or:
         return
         assignNew(
            'V', mce,
            and_or_ty,
            difd(mce, uifu(mce, vatom1, vatom2),
                      difd(mce, improve(mce, atom1, vatom1),
                                improve(mce, atom2, vatom2) ) ) );

      case Iop_Xor8:
         return mkUifU8(mce, vatom1, vatom2);
      case Iop_Xor16:
         return mkUifU16(mce, vatom1, vatom2);
      case Iop_Xor32:
         return mkUifU32(mce, vatom1, vatom2);
      case Iop_Xor64:
         return mkUifU64(mce, vatom1, vatom2);
      case Iop_XorV128:
         return mkUifUV128(mce, vatom1, vatom2);
      case Iop_XorV256:
         return mkUifUV256(mce, vatom1, vatom2);

      /* V256-bit SIMD */

      case Iop_ShrN16x16:
      case Iop_ShrN32x8:
      case Iop_ShrN64x4:
      case Iop_SarN16x16:
      case Iop_SarN32x8:
      case Iop_ShlN16x16:
      case Iop_ShlN32x8:
      case Iop_ShlN64x4:
         /* Same scheme as with all other shifts.  Note: 22 Oct 05:
            this is wrong now, scalar shifts are done properly lazily.
            Vector shifts should be fixed too. */
         complainIfUndefined(mce, atom2, NULL);
         return assignNew('V', mce, Ity_V256, binop(op, vatom1, atom2));

      case Iop_QSub8Ux32:
      case Iop_QSub8Sx32:
      case Iop_Sub8x32:
      case Iop_Min8Ux32:
      case Iop_Min8Sx32:
      case Iop_Max8Ux32:
      case Iop_Max8Sx32:
      case Iop_CmpGT8Sx32:
      case Iop_CmpEQ8x32:
      case Iop_Avg8Ux32:
      case Iop_QAdd8Ux32:
      case Iop_QAdd8Sx32:
      case Iop_Add8x32:
         return binary8Ix32(mce, vatom1, vatom2);

      case Iop_QSub16Ux16:
      case Iop_QSub16Sx16:
      case Iop_Sub16x16:
      case Iop_Mul16x16:
      case Iop_MulHi16Sx16:
      case Iop_MulHi16Ux16:
      case Iop_Min16Sx16:
      case Iop_Min16Ux16:
      case Iop_Max16Sx16:
      case Iop_Max16Ux16:
      case Iop_CmpGT16Sx16:
      case Iop_CmpEQ16x16:
      case Iop_Avg16Ux16:
      case Iop_QAdd16Ux16:
      case Iop_QAdd16Sx16:
      case Iop_Add16x16:
         return binary16Ix16(mce, vatom1, vatom2);

      case Iop_Sub32x8:
      case Iop_CmpGT32Sx8:
      case Iop_CmpEQ32x8:
      case Iop_Add32x8:
      case Iop_Max32Ux8:
      case Iop_Max32Sx8:
      case Iop_Min32Ux8:
      case Iop_Min32Sx8:
      case Iop_Mul32x8:
         return binary32Ix8(mce, vatom1, vatom2);

      case Iop_Sub64x4:
      case Iop_Add64x4:
      case Iop_CmpEQ64x4:
      case Iop_CmpGT64Sx4:
         return binary64Ix4(mce, vatom1, vatom2);

      /* Perm32x8: rearrange values in left arg using steering values
         from right arg.  So rearrange the vbits in the same way but
         pessimise wrt steering values. */
      case Iop_Perm32x8:
         return mkUifUV256(
                   mce,
                   assignNew('V', mce, Ity_V256, binop(op, vatom1, atom2)),
                   mkPCast32x8(mce, vatom2)
                );

      /* Q-and-Qshift-by-vector of the form (V128, V128) -> V256.
         Handle the shifted results in the same way that other
         binary Q ops are handled, eg QSub: UifU the two args,
         then pessimise -- which is binaryNIxM.  But for the upper
         V128, we need to generate just 1 bit which is the
         pessimised shift result, with 127 defined zeroes above it.

         Note that this is overly pessimistic in that in fact only the
         bottom 8 bits of each lane of the second arg determine the shift
         amount.  Really we ought to ignore any undefinedness in the
         rest of the lanes of the second arg. */
      case Iop_QandSQsh64x2:  case Iop_QandUQsh64x2:
      case Iop_QandSQRsh64x2: case Iop_QandUQRsh64x2:
      case Iop_QandSQsh32x4:  case Iop_QandUQsh32x4:
      case Iop_QandSQRsh32x4: case Iop_QandUQRsh32x4:
      case Iop_QandSQsh16x8:  case Iop_QandUQsh16x8:
      case Iop_QandSQRsh16x8: case Iop_QandUQRsh16x8:
      case Iop_QandSQsh8x16:  case Iop_QandUQsh8x16:
      case Iop_QandSQRsh8x16: case Iop_QandUQRsh8x16:
      {
         // The function to generate the pessimised shift result
         IRAtom* (*binaryNIxM)(MCEnv*,IRAtom*,IRAtom*) = NULL;
         switch (op) {
            case Iop_QandSQsh64x2:
            case Iop_QandUQsh64x2:
            case Iop_QandSQRsh64x2:
            case Iop_QandUQRsh64x2:
               binaryNIxM = binary64Ix2;
               break;
            case Iop_QandSQsh32x4:
            case Iop_QandUQsh32x4:
            case Iop_QandSQRsh32x4:
            case Iop_QandUQRsh32x4:
               binaryNIxM = binary32Ix4;
               break;
            case Iop_QandSQsh16x8:
            case Iop_QandUQsh16x8:
            case Iop_QandSQRsh16x8:
            case Iop_QandUQRsh16x8:
               binaryNIxM = binary16Ix8;
               break;
            case Iop_QandSQsh8x16:
            case Iop_QandUQsh8x16:
            case Iop_QandSQRsh8x16:
            case Iop_QandUQRsh8x16:
               binaryNIxM = binary8Ix16;
               break;
            default:
               tl_assert(0);
         }
         tl_assert(binaryNIxM);
         // Pessimised shift result, shV[127:0]
         IRAtom* shV = binaryNIxM(mce, vatom1, vatom2);
         // Generates: Def--(127)--Def PCast-to-I1(shV)
         IRAtom* qV = mkPCastXXtoXXlsb(mce, shV, Ity_V128);
         // and assemble the result
         return assignNew('V', mce, Ity_V256,
                          binop(Iop_V128HLtoV256, qV, shV));
      }

      default:
         ppIROp(op);
         VG_(tool_panic)("memcheck:expr2vbits_Binop");
   }
}


static
IRExpr* expr2vbits_Unop ( MCEnv* mce, IROp op, IRAtom* atom )
{
   /* For the widening operations {8,16,32}{U,S}to{16,32,64}, the
      selection of shadow operation implicitly duplicates the logic in
      do_shadow_LoadG and should be kept in sync (in the very unlikely
      event that the interpretation of such widening ops changes in
      future).  See comment in do_shadow_LoadG. */
   IRAtom* vatom = expr2vbits( mce, atom );
   tl_assert(isOriginalAtom(mce,atom));
   switch (op) {

sewardjc46e6cc2014-03-10 10:42:36 +00004335 case Iop_Abs64Fx2:
4336 case Iop_Neg64Fx2:
sewardj14350762015-02-24 12:24:35 +00004337 case Iop_RSqrtEst64Fx2:
4338 case Iop_RecipEst64Fx2:
sewardj0b070592004-12-10 21:44:22 +00004339 return unary64Fx2(mce, vatom);
4340
4341 case Iop_Sqrt64F0x2:
4342 return unary64F0x2(mce, vatom);
4343
sewardj350e8f72012-06-25 07:52:15 +00004344 case Iop_Sqrt32Fx8:
sewardjee6bb772014-08-24 14:02:22 +00004345 case Iop_RSqrtEst32Fx8:
4346 case Iop_RecipEst32Fx8:
         return unary32Fx8(mce, vatom);

      case Iop_Sqrt64Fx4:
         return unary64Fx4(mce, vatom);

      case Iop_RecipEst32Fx4:
      case Iop_I32UtoFx4:
      case Iop_I32StoFx4:
      case Iop_QFtoI32Ux4_RZ:
      case Iop_QFtoI32Sx4_RZ:
      case Iop_RoundF32x4_RM:
      case Iop_RoundF32x4_RP:
      case Iop_RoundF32x4_RN:
      case Iop_RoundF32x4_RZ:
      case Iop_RecipEst32Ux4:
      case Iop_Abs32Fx4:
      case Iop_Neg32Fx4:
      case Iop_RSqrtEst32Fx4:
         return unary32Fx4(mce, vatom);

      case Iop_I32UtoFx2:
      case Iop_I32StoFx2:
      case Iop_RecipEst32Fx2:
      case Iop_RecipEst32Ux2:
      case Iop_Abs32Fx2:
      case Iop_Neg32Fx2:
      case Iop_RSqrtEst32Fx2:
         return unary32Fx2(mce, vatom);

      case Iop_Sqrt32F0x4:
      case Iop_RSqrtEst32F0x4:
      case Iop_RecipEst32F0x4:
         return unary32F0x4(mce, vatom);

      case Iop_32UtoV128:
      case Iop_64UtoV128:
      case Iop_Dup8x16:
      case Iop_Dup16x8:
      case Iop_Dup32x4:
      case Iop_Reverse1sIn8_x16:
      case Iop_Reverse8sIn16_x8:
      case Iop_Reverse8sIn32_x4:
      case Iop_Reverse16sIn32_x4:
      case Iop_Reverse8sIn64_x2:
      case Iop_Reverse16sIn64_x2:
      case Iop_Reverse32sIn64_x2:
      case Iop_V256toV128_1: case Iop_V256toV128_0:
      case Iop_ZeroHI64ofV128:
      case Iop_ZeroHI96ofV128:
      case Iop_ZeroHI112ofV128:
      case Iop_ZeroHI120ofV128:
         return assignNew('V', mce, Ity_V128, unop(op, vatom));

      case Iop_F128HItoF64:  /* F128 -> high half of F128 */
      case Iop_D128HItoD64:  /* D128 -> high half of D128 */
         return assignNew('V', mce, Ity_I64, unop(Iop_128HIto64, vatom));
      case Iop_F128LOtoF64:  /* F128 -> low half of F128 */
      case Iop_D128LOtoD64:  /* D128 -> low half of D128 */
         return assignNew('V', mce, Ity_I64, unop(Iop_128to64, vatom));

      case Iop_NegF128:
      case Iop_AbsF128:
      case Iop_RndF128:
      case Iop_TruncF128toI64S: /* F128 -> I64S */
      case Iop_TruncF128toI32S: /* F128 -> I32S (result stored in 64-bits) */
      case Iop_TruncF128toI64U: /* F128 -> I64U */
      case Iop_TruncF128toI32U: /* F128 -> I32U (result stored in 64-bits) */
         return mkPCastTo(mce, Ity_I128, vatom);

      case Iop_BCD128toI128S:
      case Iop_MulI128by10:
      case Iop_MulI128by10Carry:
      case Iop_F16toF64x2:
      case Iop_F64toF16x2:
         return vatom;

      case Iop_I32StoF128: /* signed I32 -> F128 */
      case Iop_I64StoF128: /* signed I64 -> F128 */
      case Iop_I32UtoF128: /* unsigned I32 -> F128 */
      case Iop_I64UtoF128: /* unsigned I64 -> F128 */
      case Iop_F32toF128:  /* F32 -> F128 */
      case Iop_F64toF128:  /* F64 -> F128 */
      case Iop_I32StoD128: /* signed I32 -> D128 */
      case Iop_I64StoD128: /* signed I64 -> D128 */
      case Iop_I32UtoD128: /* unsigned I32 -> D128 */
      case Iop_I64UtoD128: /* unsigned I64 -> D128 */
         return mkPCastTo(mce, Ity_I128, vatom);

      case Iop_F16toF64:
      case Iop_F32toF64:
      case Iop_I32StoF64:
      case Iop_I32UtoF64:
      case Iop_NegF64:
      case Iop_AbsF64:
      case Iop_RSqrtEst5GoodF64:
      case Iop_RoundF64toF64_NEAREST:
      case Iop_RoundF64toF64_NegINF:
      case Iop_RoundF64toF64_PosINF:
      case Iop_RoundF64toF64_ZERO:
      case Iop_Clz64:
      case Iop_D32toD64:
      case Iop_I32StoD64:
      case Iop_I32UtoD64:
      case Iop_ExtractExpD64:    /* D64  -> I64 */
      case Iop_ExtractExpD128:   /* D128 -> I64 */
      case Iop_ExtractSigD64:    /* D64  -> I64 */
      case Iop_ExtractSigD128:   /* D128 -> I64 */
      case Iop_DPBtoBCD:
      case Iop_BCDtoDPB:
         return mkPCastTo(mce, Ity_I64, vatom);

      case Iop_D64toD128:
         return mkPCastTo(mce, Ity_I128, vatom);

      case Iop_Clz32:
      case Iop_TruncF64asF32:
      case Iop_NegF32:
      case Iop_AbsF32:
      case Iop_F16toF32:
         return mkPCastTo(mce, Ity_I32, vatom);

      case Iop_Ctz32:
      case Iop_Ctz64:
         return expensiveCountTrailingZeroes(mce, op, atom, vatom);

      case Iop_1Uto64:
      case Iop_1Sto64:
      case Iop_8Uto64:
      case Iop_8Sto64:
      case Iop_16Uto64:
      case Iop_16Sto64:
      case Iop_32Sto64:
      case Iop_32Uto64:
      case Iop_V128to64:
      case Iop_V128HIto64:
      case Iop_128HIto64:
      case Iop_128to64:
      case Iop_Dup8x8:
      case Iop_Dup16x4:
      case Iop_Dup32x2:
      case Iop_Reverse8sIn16_x4:
      case Iop_Reverse8sIn32_x2:
      case Iop_Reverse16sIn32_x2:
      case Iop_Reverse8sIn64_x1:
      case Iop_Reverse16sIn64_x1:
      case Iop_Reverse32sIn64_x1:
      case Iop_V256to64_0: case Iop_V256to64_1:
      case Iop_V256to64_2: case Iop_V256to64_3:
         return assignNew('V', mce, Ity_I64, unop(op, vatom));

      case Iop_64to32:
      case Iop_64HIto32:
      case Iop_1Uto32:
      case Iop_1Sto32:
      case Iop_8Uto32:
      case Iop_16Uto32:
      case Iop_16Sto32:
      case Iop_8Sto32:
      case Iop_V128to32:
         return assignNew('V', mce, Ity_I32, unop(op, vatom));

      case Iop_8Sto16:
      case Iop_8Uto16:
      case Iop_32to16:
      case Iop_32HIto16:
      case Iop_64to16:
      case Iop_GetMSBs8x16:
         return assignNew('V', mce, Ity_I16, unop(op, vatom));

      case Iop_1Uto8:
      case Iop_1Sto8:
      case Iop_16to8:
      case Iop_16HIto8:
      case Iop_32to8:
      case Iop_64to8:
      case Iop_GetMSBs8x8:
         return assignNew('V', mce, Ity_I8, unop(op, vatom));

      case Iop_32to1:
         return assignNew('V', mce, Ity_I1, unop(Iop_32to1, vatom));

      case Iop_64to1:
         return assignNew('V', mce, Ity_I1, unop(Iop_64to1, vatom));

      case Iop_ReinterpF64asI64:
      case Iop_ReinterpI64asF64:
      case Iop_ReinterpI32asF32:
      case Iop_ReinterpF32asI32:
      case Iop_ReinterpI64asD64:
      case Iop_ReinterpD64asI64:
      case Iop_NotV256:
      case Iop_NotV128:
      case Iop_Not64:
      case Iop_Not32:
      case Iop_Not16:
      case Iop_Not8:
      case Iop_Not1:
         return vatom;

      case Iop_CmpNEZ8x8:
      case Iop_Cnt8x8:
      case Iop_Clz8x8:
      case Iop_Cls8x8:
      case Iop_Abs8x8:
         return mkPCast8x8(mce, vatom);

      case Iop_CmpNEZ8x16:
      case Iop_Cnt8x16:
      case Iop_Clz8x16:
      case Iop_Cls8x16:
      case Iop_Abs8x16:
      case Iop_Ctz8x16:
         return mkPCast8x16(mce, vatom);

      case Iop_CmpNEZ16x4:
      case Iop_Clz16x4:
      case Iop_Cls16x4:
      case Iop_Abs16x4:
         return mkPCast16x4(mce, vatom);

      case Iop_CmpNEZ16x8:
      case Iop_Clz16x8:
      case Iop_Cls16x8:
      case Iop_Abs16x8:
      case Iop_Ctz16x8:
         return mkPCast16x8(mce, vatom);

      case Iop_CmpNEZ32x2:
      case Iop_Clz32x2:
      case Iop_Cls32x2:
      case Iop_FtoI32Ux2_RZ:
      case Iop_FtoI32Sx2_RZ:
      case Iop_Abs32x2:
         return mkPCast32x2(mce, vatom);

      case Iop_CmpNEZ32x4:
      case Iop_Clz32x4:
      case Iop_Cls32x4:
      case Iop_FtoI32Ux4_RZ:
      case Iop_FtoI32Sx4_RZ:
      case Iop_Abs32x4:
      case Iop_RSqrtEst32Ux4:
      case Iop_Ctz32x4:
         return mkPCast32x4(mce, vatom);

      case Iop_CmpwNEZ32:
         return mkPCastTo(mce, Ity_I32, vatom);

      case Iop_CmpwNEZ64:
         return mkPCastTo(mce, Ity_I64, vatom);

      case Iop_CmpNEZ64x2:
      case Iop_CipherSV128:
      case Iop_Clz64x2:
      case Iop_Abs64x2:
      case Iop_Ctz64x2:
         return mkPCast64x2(mce, vatom);

      case Iop_PwBitMtxXpose64x2:
         return assignNew('V', mce, Ity_V128, unop(op, vatom));

      case Iop_NarrowUn16to8x8:
      case Iop_NarrowUn32to16x4:
      case Iop_NarrowUn64to32x2:
      case Iop_QNarrowUn16Sto8Sx8:
      case Iop_QNarrowUn16Sto8Ux8:
      case Iop_QNarrowUn16Uto8Ux8:
      case Iop_QNarrowUn32Sto16Sx4:
      case Iop_QNarrowUn32Sto16Ux4:
      case Iop_QNarrowUn32Uto16Ux4:
      case Iop_QNarrowUn64Sto32Sx2:
      case Iop_QNarrowUn64Sto32Ux2:
      case Iop_QNarrowUn64Uto32Ux2:
      case Iop_F32toF16x4:
         return vectorNarrowUnV128(mce, op, vatom);

      case Iop_Widen8Sto16x8:
      case Iop_Widen8Uto16x8:
      case Iop_Widen16Sto32x4:
      case Iop_Widen16Uto32x4:
      case Iop_Widen32Sto64x2:
      case Iop_Widen32Uto64x2:
      case Iop_F16toF32x4:
         return vectorWidenI64(mce, op, vatom);

      case Iop_PwAddL32Ux2:
      case Iop_PwAddL32Sx2:
         return mkPCastTo(mce, Ity_I64,
                   assignNew('V', mce, Ity_I64, unop(op, mkPCast32x2(mce, vatom))));

      case Iop_PwAddL16Ux4:
      case Iop_PwAddL16Sx4:
         return mkPCast32x2(mce,
                   assignNew('V', mce, Ity_I64, unop(op, mkPCast16x4(mce, vatom))));

      case Iop_PwAddL8Ux8:
      case Iop_PwAddL8Sx8:
         return mkPCast16x4(mce,
                   assignNew('V', mce, Ity_I64, unop(op, mkPCast8x8(mce, vatom))));

      case Iop_PwAddL32Ux4:
      case Iop_PwAddL32Sx4:
         return mkPCast64x2(mce,
                   assignNew('V', mce, Ity_V128, unop(op, mkPCast32x4(mce, vatom))));

      case Iop_PwAddL16Ux8:
      case Iop_PwAddL16Sx8:
         return mkPCast32x4(mce,
                   assignNew('V', mce, Ity_V128, unop(op, mkPCast16x8(mce, vatom))));

      case Iop_PwAddL8Ux16:
      case Iop_PwAddL8Sx16:
         return mkPCast16x8(mce,
                   assignNew('V', mce, Ity_V128, unop(op, mkPCast8x16(mce, vatom))));

      case Iop_I64UtoF32:
      default:
         ppIROp(op);
         VG_(tool_panic)("memcheck:expr2vbits_Unop");
   }
}


/* Worker function -- do not call directly.  See comments on
   expr2vbits_Load for the meaning of |guard|.

   Generates IR to (1) perform a definedness test of |addr|, (2)
   perform a validity test of |addr|, and (3) return the Vbits for the
   location indicated by |addr|.  All of this only happens when
   |guard| is NULL or |guard| evaluates to True at run time.

   If |guard| evaluates to False at run time, the returned value is
   the IR-mandated 0x55..55 value, and no checks nor shadow loads are
   performed.

   The definedness of |guard| itself is not checked.  That is assumed
   to have been done before this point, by the caller. */
static
IRAtom* expr2vbits_Load_WRK ( MCEnv* mce,
                              IREndness end, IRType ty,
                              IRAtom* addr, UInt bias, IRAtom* guard )
{
   tl_assert(isOriginalAtom(mce,addr));
   tl_assert(end == Iend_LE || end == Iend_BE);

   /* First, emit a definedness test for the address.  This also sets
      the address (shadow) to 'defined' following the test. */
   complainIfUndefined( mce, addr, guard );

   /* Now cook up a call to the relevant helper function, to read the
      data V bits from shadow memory. */
   ty = shadowTypeV(ty);

   void*        helper           = NULL;
   const HChar* hname            = NULL;
   Bool         ret_via_outparam = False;

   if (end == Iend_LE) {
      switch (ty) {
         case Ity_V256: helper = &MC_(helperc_LOADV256le);
                        hname = "MC_(helperc_LOADV256le)";
                        ret_via_outparam = True;
                        break;
         case Ity_V128: helper = &MC_(helperc_LOADV128le);
                        hname = "MC_(helperc_LOADV128le)";
                        ret_via_outparam = True;
                        break;
         case Ity_I64:  helper = &MC_(helperc_LOADV64le);
                        hname = "MC_(helperc_LOADV64le)";
                        break;
         case Ity_I32:  helper = &MC_(helperc_LOADV32le);
                        hname = "MC_(helperc_LOADV32le)";
                        break;
         case Ity_I16:  helper = &MC_(helperc_LOADV16le);
                        hname = "MC_(helperc_LOADV16le)";
                        break;
         case Ity_I8:   helper = &MC_(helperc_LOADV8);
                        hname = "MC_(helperc_LOADV8)";
                        break;
         default:       ppIRType(ty);
                        VG_(tool_panic)("memcheck:expr2vbits_Load_WRK(LE)");
      }
   } else {
      switch (ty) {
         case Ity_V256: helper = &MC_(helperc_LOADV256be);
                        hname = "MC_(helperc_LOADV256be)";
                        ret_via_outparam = True;
                        break;
         case Ity_V128: helper = &MC_(helperc_LOADV128be);
                        hname = "MC_(helperc_LOADV128be)";
                        ret_via_outparam = True;
                        break;
         case Ity_I64:  helper = &MC_(helperc_LOADV64be);
                        hname = "MC_(helperc_LOADV64be)";
                        break;
         case Ity_I32:  helper = &MC_(helperc_LOADV32be);
                        hname = "MC_(helperc_LOADV32be)";
                        break;
         case Ity_I16:  helper = &MC_(helperc_LOADV16be);
                        hname = "MC_(helperc_LOADV16be)";
                        break;
         case Ity_I8:   helper = &MC_(helperc_LOADV8);
                        hname = "MC_(helperc_LOADV8)";
                        break;
         default:       ppIRType(ty);
                        VG_(tool_panic)("memcheck:expr2vbits_Load_WRK(BE)");
      }
   }

   tl_assert(helper);
   tl_assert(hname);

   /* Generate the actual address into addrAct. */
   IRAtom* addrAct;
   if (bias == 0) {
      addrAct = addr;
   } else {
      IROp    mkAdd;
      IRAtom* eBias;
      IRType  tyAddr = mce->hWordTy;
      tl_assert( tyAddr == Ity_I32 || tyAddr == Ity_I64 );
      mkAdd   = tyAddr==Ity_I32 ? Iop_Add32 : Iop_Add64;
      eBias   = tyAddr==Ity_I32 ? mkU32(bias) : mkU64(bias);
      addrAct = assignNew('V', mce, tyAddr, binop(mkAdd, addr, eBias) );
   }

   /* We need to have a place to park the V bits we're just about to
      read. */
   IRTemp datavbits = newTemp(mce, ty, VSh);

   /* Here's the call. */
   IRDirty* di;
   if (ret_via_outparam) {
      di = unsafeIRDirty_1_N( datavbits,
                              2/*regparms*/,
                              hname, VG_(fnptr_to_fnentry)( helper ),
                              mkIRExprVec_2( IRExpr_VECRET(), addrAct ) );
   } else {
      di = unsafeIRDirty_1_N( datavbits,
                              1/*regparms*/,
                              hname, VG_(fnptr_to_fnentry)( helper ),
                              mkIRExprVec_1( addrAct ) );
   }

   setHelperAnns( mce, di );
   if (guard) {
      di->guard = guard;
      /* Ideally the didn't-happen return value here would be all-ones
         (all-undefined), so it'd be obvious if it got used
         inadvertently.  We can get by with the IR-mandated default
         value (0b01 repeating, 0x55 etc) as that'll still look pretty
         undefined if it ever leaks out. */
   }
   stmt( 'V', mce, IRStmt_Dirty(di) );

   return mkexpr(datavbits);
}


sewardjcafe5052013-01-17 14:24:35 +00004806/* Generate IR to do a shadow load. The helper is expected to check
4807 the validity of the address and return the V bits for that address.
4808 This can optionally be controlled by a guard, which is assumed to
4809 be True if NULL. In the case where the guard is False at runtime,
sewardjb9e6d242013-05-11 13:42:08 +00004810 the helper will return the didn't-do-the-call value of 0x55..55.
4811 Since that means "completely undefined result", the caller of
sewardjcafe5052013-01-17 14:24:35 +00004812 this function will need to fix up the result somehow in that
4813 case.
sewardjb9e6d242013-05-11 13:42:08 +00004814
4815 Caller of this function is also expected to have checked the
4816 definedness of |guard| before this point.
sewardjcafe5052013-01-17 14:24:35 +00004817*/
sewardj95448072004-11-22 20:19:51 +00004818static
sewardj67564542013-08-16 08:31:29 +00004819IRAtom* expr2vbits_Load ( MCEnv* mce,
4820 IREndness end, IRType ty,
sewardjcafe5052013-01-17 14:24:35 +00004821 IRAtom* addr, UInt bias,
4822 IRAtom* guard )
sewardj170ee212004-12-10 18:57:51 +00004823{
sewardj2e595852005-06-30 23:33:37 +00004824 tl_assert(end == Iend_LE || end == Iend_BE);
sewardj7cf4e6b2008-05-01 20:24:26 +00004825 switch (shadowTypeV(ty)) {
sewardj67564542013-08-16 08:31:29 +00004826 case Ity_I8:
4827 case Ity_I16:
4828 case Ity_I32:
sewardj170ee212004-12-10 18:57:51 +00004829 case Ity_I64:
sewardj21a5f8c2013-08-08 10:41:46 +00004830 case Ity_V128:
sewardj67564542013-08-16 08:31:29 +00004831 case Ity_V256:
sewardjcafe5052013-01-17 14:24:35 +00004832 return expr2vbits_Load_WRK(mce, end, ty, addr, bias, guard);
sewardj170ee212004-12-10 18:57:51 +00004833 default:
sewardj2e595852005-06-30 23:33:37 +00004834 VG_(tool_panic)("expr2vbits_Load");
sewardj170ee212004-12-10 18:57:51 +00004835 }
4836}
4837
4838
sewardjcafe5052013-01-17 14:24:35 +00004839/* The most general handler for guarded loads. Assumes the
sewardjb9e6d242013-05-11 13:42:08 +00004840 definedness of GUARD has already been checked by the caller. A
4841 GUARD of NULL is assumed to mean "always True". Generates code to
4842 check the definedness and validity of ADDR.
sewardjcafe5052013-01-17 14:24:35 +00004843
4844 Generate IR to do a shadow load from ADDR and return the V bits.
4845 The loaded type is TY. The loaded data is then (shadow) widened by
4846 using VWIDEN, which can be Iop_INVALID to denote a no-op. If GUARD
4847 evaluates to False at run time then the returned Vbits are simply
4848 VALT instead. Note therefore that the argument type of VWIDEN must
4849 be TY and the result type of VWIDEN must equal the type of VALT.
4850*/
florian434ffae2012-07-19 17:23:42 +00004851static
sewardjcafe5052013-01-17 14:24:35 +00004852IRAtom* expr2vbits_Load_guarded_General ( MCEnv* mce,
4853 IREndness end, IRType ty,
4854 IRAtom* addr, UInt bias,
4855 IRAtom* guard,
4856 IROp vwiden, IRAtom* valt )
florian434ffae2012-07-19 17:23:42 +00004857{
sewardjcafe5052013-01-17 14:24:35 +00004858 /* Sanity check the conversion operation, and also set TYWIDE. */
4859 IRType tyWide = Ity_INVALID;
4860 switch (vwiden) {
4861 case Iop_INVALID:
4862 tyWide = ty;
4863 break;
4864 case Iop_16Uto32: case Iop_16Sto32: case Iop_8Uto32: case Iop_8Sto32:
4865 tyWide = Ity_I32;
4866 break;
4867 default:
4868 VG_(tool_panic)("memcheck:expr2vbits_Load_guarded_General");
florian434ffae2012-07-19 17:23:42 +00004869 }
4870
sewardjcafe5052013-01-17 14:24:35 +00004871 /* If the guard evaluates to True, this will hold the loaded V bits
4872 at TY. If the guard evaluates to False, this will be all
4873 ones, meaning "all undefined", in which case we will have to
florian5686b2d2013-01-29 03:57:40 +00004874 replace it using an ITE below. */
sewardjcafe5052013-01-17 14:24:35 +00004875 IRAtom* iftrue1
4876 = assignNew('V', mce, ty,
4877 expr2vbits_Load(mce, end, ty, addr, bias, guard));
4878 /* Now (shadow-) widen the loaded V bits to the desired width. In
4879 the guard-is-False case, the allowable widening operators will
4880 in the worst case (unsigned widening) at least leave the
4881 pre-widened part as being marked all-undefined, and in the best
4882 case (signed widening) mark the whole widened result as
4883 undefined. Anyway, it doesn't matter really, since in this case
florian5686b2d2013-01-29 03:57:40 +00004884 we will replace said value with the default value |valt| using an
4885 ITE. */
sewardjcafe5052013-01-17 14:24:35 +00004886 IRAtom* iftrue2
4887 = vwiden == Iop_INVALID
4888 ? iftrue1
4889 : assignNew('V', mce, tyWide, unop(vwiden, iftrue1));
4890 /* These are the V bits we will return if the load doesn't take
4891 place. */
4892 IRAtom* iffalse
4893 = valt;
florian5686b2d2013-01-29 03:57:40 +00004894 /* Prepare the cond for the ITE. Convert a NULL cond into
sewardjcafe5052013-01-17 14:24:35 +00004895 something that iropt knows how to fold out later. */
4896 IRAtom* cond
sewardjcc961652013-01-26 11:49:15 +00004897 = guard == NULL ? mkU1(1) : guard;
sewardjcafe5052013-01-17 14:24:35 +00004898 /* And assemble the final result. */
florian5686b2d2013-01-29 03:57:40 +00004899 return assignNew('V', mce, tyWide, IRExpr_ITE(cond, iftrue2, iffalse));
sewardjcafe5052013-01-17 14:24:35 +00004900}
4901
4902
4903/* A simpler handler for guarded loads, in which there is no
4904 conversion operation, and the default V bit return (when the guard
4905 evaluates to False at runtime) is "all defined". If there is no
4906 guard expression or the guard is always TRUE this function behaves
sewardjb9e6d242013-05-11 13:42:08 +00004907 like expr2vbits_Load. It is assumed that definedness of GUARD has
4908 already been checked at the call site. */
sewardjcafe5052013-01-17 14:24:35 +00004909static
4910IRAtom* expr2vbits_Load_guarded_Simple ( MCEnv* mce,
4911 IREndness end, IRType ty,
4912 IRAtom* addr, UInt bias,
4913 IRAtom *guard )
4914{
4915 return expr2vbits_Load_guarded_General(
4916 mce, end, ty, addr, bias, guard, Iop_INVALID, definedOfType(ty)
4917 );
florian434ffae2012-07-19 17:23:42 +00004918}
4919
4920
sewardj170ee212004-12-10 18:57:51 +00004921static
florian5686b2d2013-01-29 03:57:40 +00004922IRAtom* expr2vbits_ITE ( MCEnv* mce,
4923 IRAtom* cond, IRAtom* iftrue, IRAtom* iffalse )
sewardj95448072004-11-22 20:19:51 +00004924{
florian5686b2d2013-01-29 03:57:40 +00004925 IRAtom *vbitsC, *vbits0, *vbits1;
sewardj95448072004-11-22 20:19:51 +00004926 IRType ty;
sewardj07bfda22013-01-29 21:11:55 +00004927 /* Given ITE(cond, iftrue, iffalse), generate
4928 ITE(cond, iftrue#, iffalse#) `UifU` PCast(cond#)
sewardj95448072004-11-22 20:19:51 +00004929 That is, steer the V bits like the originals, but trash the
4930 result if the steering value is undefined. This gives
4931 lazy propagation. */
4932 tl_assert(isOriginalAtom(mce, cond));
florian5686b2d2013-01-29 03:57:40 +00004933 tl_assert(isOriginalAtom(mce, iftrue));
4934 tl_assert(isOriginalAtom(mce, iffalse));
sewardj95448072004-11-22 20:19:51 +00004935
4936 vbitsC = expr2vbits(mce, cond);
florian5686b2d2013-01-29 03:57:40 +00004937 vbits1 = expr2vbits(mce, iftrue);
sewardj07bfda22013-01-29 21:11:55 +00004938 vbits0 = expr2vbits(mce, iffalse);
sewardj1c0ce7a2009-07-01 08:10:49 +00004939 ty = typeOfIRExpr(mce->sb->tyenv, vbits0);
sewardj95448072004-11-22 20:19:51 +00004940
4941 return
sewardj7cf4e6b2008-05-01 20:24:26 +00004942 mkUifU(mce, ty, assignNew('V', mce, ty,
florian5686b2d2013-01-29 03:57:40 +00004943 IRExpr_ITE(cond, vbits1, vbits0)),
sewardj95448072004-11-22 20:19:51 +00004944 mkPCastTo(mce, ty, vbitsC) );
4945}
4946
/* --------- This is the main expression-handling function. --------- */

static
IRExpr* expr2vbits ( MCEnv* mce, IRExpr* e )
{
   switch (e->tag) {

      case Iex_Get:
         return shadow_GET( mce, e->Iex.Get.offset, e->Iex.Get.ty );

      case Iex_GetI:
         return shadow_GETI( mce, e->Iex.GetI.descr,
                             e->Iex.GetI.ix, e->Iex.GetI.bias );

      case Iex_RdTmp:
         return IRExpr_RdTmp( findShadowTmpV(mce, e->Iex.RdTmp.tmp) );

      case Iex_Const:
         return definedOfType(shadowTypeV(typeOfIRExpr(mce->sb->tyenv, e)));

      case Iex_Qop:
         return expr2vbits_Qop(
                   mce,
                   e->Iex.Qop.details->op,
                   e->Iex.Qop.details->arg1, e->Iex.Qop.details->arg2,
                   e->Iex.Qop.details->arg3, e->Iex.Qop.details->arg4
                );

      case Iex_Triop:
         return expr2vbits_Triop(
                   mce,
                   e->Iex.Triop.details->op,
                   e->Iex.Triop.details->arg1, e->Iex.Triop.details->arg2,
                   e->Iex.Triop.details->arg3
                );

      case Iex_Binop:
         return expr2vbits_Binop(
                   mce,
                   e->Iex.Binop.op,
                   e->Iex.Binop.arg1, e->Iex.Binop.arg2
                );

      case Iex_Unop:
         return expr2vbits_Unop( mce, e->Iex.Unop.op, e->Iex.Unop.arg );

      case Iex_Load:
         return expr2vbits_Load( mce, e->Iex.Load.end,
                                 e->Iex.Load.ty,
                                 e->Iex.Load.addr, 0/*addr bias*/,
                                 NULL/* guard == "always True"*/ );

      case Iex_CCall:
         return mkLazyN( mce, e->Iex.CCall.args,
                         e->Iex.CCall.retty,
                         e->Iex.CCall.cee );

      case Iex_ITE:
         return expr2vbits_ITE( mce, e->Iex.ITE.cond, e->Iex.ITE.iftrue,
                                e->Iex.ITE.iffalse);

      default:
         VG_(printf)("\n");
         ppIRExpr(e);
         VG_(printf)("\n");
         VG_(tool_panic)("memcheck: expr2vbits");
   }
}

/*------------------------------------------------------------*/
/*--- Generate shadow stmts from all kinds of IRStmts.     ---*/
/*------------------------------------------------------------*/

/* Widen a value to the host word size. */

static
IRExpr* zwidenToHostWord ( MCEnv* mce, IRAtom* vatom )
{
   IRType ty, tyH;

   /* vatom is vbits-value and as such can only have a shadow type. */
   tl_assert(isShadowAtom(mce,vatom));

   ty  = typeOfIRExpr(mce->sb->tyenv, vatom);
   tyH = mce->hWordTy;

   if (tyH == Ity_I32) {
      switch (ty) {
         case Ity_I32:
            return vatom;
         case Ity_I16:
            return assignNew('V', mce, tyH, unop(Iop_16Uto32, vatom));
         case Ity_I8:
            return assignNew('V', mce, tyH, unop(Iop_8Uto32, vatom));
         default:
            goto unhandled;
      }
   } else
   if (tyH == Ity_I64) {
      switch (ty) {
         case Ity_I32:
            return assignNew('V', mce, tyH, unop(Iop_32Uto64, vatom));
         case Ity_I16:
            return assignNew('V', mce, tyH, unop(Iop_32Uto64,
                   assignNew('V', mce, Ity_I32, unop(Iop_16Uto32, vatom))));
         case Ity_I8:
            return assignNew('V', mce, tyH, unop(Iop_32Uto64,
                   assignNew('V', mce, Ity_I32, unop(Iop_8Uto32, vatom))));
         default:
            goto unhandled;
      }
   } else {
      goto unhandled;
   }
  unhandled:
   VG_(printf)("\nty = "); ppIRType(ty); VG_(printf)("\n");
   VG_(tool_panic)("zwidenToHostWord");
}

/* Generate a shadow store.  |addr| is always the original address
   atom.  You can pass in either originals or V-bits for the data
   atom, but obviously not both.  This function generates a check for
   the definedness and (indirectly) the validity of |addr|, but only
   when |guard| evaluates to True at run time (or is NULL).

   |guard| :: Ity_I1 controls whether the store really happens; NULL
   means it unconditionally does.  Note that |guard| itself is not
   checked for definedness; the caller of this function must do that
   if necessary.
*/
static
void do_shadow_Store ( MCEnv* mce,
                       IREndness end,
                       IRAtom* addr, UInt bias,
                       IRAtom* data, IRAtom* vdata,
                       IRAtom* guard )
{
   IROp     mkAdd;
   IRType   ty, tyAddr;
   void*    helper = NULL;
   const HChar* hname = NULL;
   IRConst* c;

   tyAddr = mce->hWordTy;
   mkAdd  = tyAddr==Ity_I32 ? Iop_Add32 : Iop_Add64;
   tl_assert( tyAddr == Ity_I32 || tyAddr == Ity_I64 );
   tl_assert( end == Iend_LE || end == Iend_BE );

   if (data) {
      tl_assert(!vdata);
      tl_assert(isOriginalAtom(mce, data));
      tl_assert(bias == 0);
      vdata = expr2vbits( mce, data );
   } else {
      tl_assert(vdata);
   }

   tl_assert(isOriginalAtom(mce,addr));
   tl_assert(isShadowAtom(mce,vdata));

   if (guard) {
      tl_assert(isOriginalAtom(mce, guard));
      tl_assert(typeOfIRExpr(mce->sb->tyenv, guard) == Ity_I1);
   }

   ty = typeOfIRExpr(mce->sb->tyenv, vdata);

   // If we're not doing undefined value checking, pretend that this value
   // is "all valid".  That lets Vex's optimiser remove some of the V bit
   // shadow computation ops that precede it.
   if (MC_(clo_mc_level) == 1) {
      switch (ty) {
         case Ity_V256: // V256 weirdness -- used four times
                        c = IRConst_V256(V_BITS32_DEFINED); break;
         case Ity_V128: // V128 weirdness -- used twice
                        c = IRConst_V128(V_BITS16_DEFINED); break;
         case Ity_I64:  c = IRConst_U64 (V_BITS64_DEFINED); break;
         case Ity_I32:  c = IRConst_U32 (V_BITS32_DEFINED); break;
         case Ity_I16:  c = IRConst_U16 (V_BITS16_DEFINED); break;
         case Ity_I8:   c = IRConst_U8  (V_BITS8_DEFINED);  break;
         default:       VG_(tool_panic)("memcheck:do_shadow_Store(LE)");
      }
      vdata = IRExpr_Const( c );
   }

   /* First, emit a definedness test for the address.  This also sets
      the address (shadow) to 'defined' following the test.  Both of
      those actions are gated on |guard|. */
   complainIfUndefined( mce, addr, guard );

   /* Now decide which helper function to call to write the data V
      bits into shadow memory. */
   if (end == Iend_LE) {
      switch (ty) {
         case Ity_V256: /* we'll use the helper four times */
         case Ity_V128: /* we'll use the helper twice */
         case Ity_I64: helper = &MC_(helperc_STOREV64le);
                       hname = "MC_(helperc_STOREV64le)";
                       break;
         case Ity_I32: helper = &MC_(helperc_STOREV32le);
                       hname = "MC_(helperc_STOREV32le)";
                       break;
         case Ity_I16: helper = &MC_(helperc_STOREV16le);
                       hname = "MC_(helperc_STOREV16le)";
                       break;
         case Ity_I8:  helper = &MC_(helperc_STOREV8);
                       hname = "MC_(helperc_STOREV8)";
                       break;
         default:      VG_(tool_panic)("memcheck:do_shadow_Store(LE)");
      }
   } else {
      switch (ty) {
         case Ity_V128: /* we'll use the helper twice */
         case Ity_I64: helper = &MC_(helperc_STOREV64be);
                       hname = "MC_(helperc_STOREV64be)";
                       break;
         case Ity_I32: helper = &MC_(helperc_STOREV32be);
                       hname = "MC_(helperc_STOREV32be)";
                       break;
         case Ity_I16: helper = &MC_(helperc_STOREV16be);
                       hname = "MC_(helperc_STOREV16be)";
                       break;
         case Ity_I8:  helper = &MC_(helperc_STOREV8);
                       hname = "MC_(helperc_STOREV8)";
                       break;
         /* Note, no V256 case here, because no big-endian target that
            we support has 256-bit vectors. */
         default:      VG_(tool_panic)("memcheck:do_shadow_Store(BE)");
      }
   }

   if (UNLIKELY(ty == Ity_V256)) {

      /* V256-bit case -- phrased in terms of 64 bit units (Qs), with
         Q3 being the most significant lane. */
      /* These are the offsets of the Qs in memory. */
      Int     offQ0, offQ1, offQ2, offQ3;

      /* Various bits for constructing the 4 lane helper calls */
      IRDirty *diQ0,    *diQ1,    *diQ2,    *diQ3;
      IRAtom  *addrQ0,  *addrQ1,  *addrQ2,  *addrQ3;
      IRAtom  *vdataQ0, *vdataQ1, *vdataQ2, *vdataQ3;
      IRAtom  *eBiasQ0, *eBiasQ1, *eBiasQ2, *eBiasQ3;

      if (end == Iend_LE) {
         offQ0 = 0; offQ1 = 8; offQ2 = 16; offQ3 = 24;
      } else {
         offQ3 = 0; offQ2 = 8; offQ1 = 16; offQ0 = 24;
      }

      eBiasQ0 = tyAddr==Ity_I32 ? mkU32(bias+offQ0) : mkU64(bias+offQ0);
      addrQ0  = assignNew('V', mce, tyAddr, binop(mkAdd, addr, eBiasQ0) );
      vdataQ0 = assignNew('V', mce, Ity_I64, unop(Iop_V256to64_0, vdata));
      diQ0    = unsafeIRDirty_0_N(
                   1/*regparms*/,
                   hname, VG_(fnptr_to_fnentry)( helper ),
                   mkIRExprVec_2( addrQ0, vdataQ0 )
                );

      eBiasQ1 = tyAddr==Ity_I32 ? mkU32(bias+offQ1) : mkU64(bias+offQ1);
      addrQ1  = assignNew('V', mce, tyAddr, binop(mkAdd, addr, eBiasQ1) );
      vdataQ1 = assignNew('V', mce, Ity_I64, unop(Iop_V256to64_1, vdata));
      diQ1    = unsafeIRDirty_0_N(
                   1/*regparms*/,
                   hname, VG_(fnptr_to_fnentry)( helper ),
                   mkIRExprVec_2( addrQ1, vdataQ1 )
                );

      eBiasQ2 = tyAddr==Ity_I32 ? mkU32(bias+offQ2) : mkU64(bias+offQ2);
      addrQ2  = assignNew('V', mce, tyAddr, binop(mkAdd, addr, eBiasQ2) );
      vdataQ2 = assignNew('V', mce, Ity_I64, unop(Iop_V256to64_2, vdata));
      diQ2    = unsafeIRDirty_0_N(
                   1/*regparms*/,
                   hname, VG_(fnptr_to_fnentry)( helper ),
                   mkIRExprVec_2( addrQ2, vdataQ2 )
                );

      eBiasQ3 = tyAddr==Ity_I32 ? mkU32(bias+offQ3) : mkU64(bias+offQ3);
      addrQ3  = assignNew('V', mce, tyAddr, binop(mkAdd, addr, eBiasQ3) );
      vdataQ3 = assignNew('V', mce, Ity_I64, unop(Iop_V256to64_3, vdata));
      diQ3    = unsafeIRDirty_0_N(
                   1/*regparms*/,
                   hname, VG_(fnptr_to_fnentry)( helper ),
                   mkIRExprVec_2( addrQ3, vdataQ3 )
                );

      if (guard)
         diQ0->guard = diQ1->guard = diQ2->guard = diQ3->guard = guard;

      setHelperAnns( mce, diQ0 );
      setHelperAnns( mce, diQ1 );
      setHelperAnns( mce, diQ2 );
      setHelperAnns( mce, diQ3 );
      stmt( 'V', mce, IRStmt_Dirty(diQ0) );
      stmt( 'V', mce, IRStmt_Dirty(diQ1) );
      stmt( 'V', mce, IRStmt_Dirty(diQ2) );
      stmt( 'V', mce, IRStmt_Dirty(diQ3) );

   }
   else if (UNLIKELY(ty == Ity_V128)) {

      /* V128-bit case */
      /* See comment in next clause re 64-bit regparms */
      /* also, need to be careful about endianness */

      Int     offLo64, offHi64;
      IRDirty *diLo64, *diHi64;
      IRAtom  *addrLo64, *addrHi64;
      IRAtom  *vdataLo64, *vdataHi64;
      IRAtom  *eBiasLo64, *eBiasHi64;

      if (end == Iend_LE) {
         offLo64 = 0;
         offHi64 = 8;
      } else {
         offLo64 = 8;
         offHi64 = 0;
      }

      eBiasLo64 = tyAddr==Ity_I32 ? mkU32(bias+offLo64) : mkU64(bias+offLo64);
      addrLo64  = assignNew('V', mce, tyAddr, binop(mkAdd, addr, eBiasLo64) );
      vdataLo64 = assignNew('V', mce, Ity_I64, unop(Iop_V128to64, vdata));
      diLo64    = unsafeIRDirty_0_N(
                     1/*regparms*/,
                     hname, VG_(fnptr_to_fnentry)( helper ),
                     mkIRExprVec_2( addrLo64, vdataLo64 )
                  );
      eBiasHi64 = tyAddr==Ity_I32 ? mkU32(bias+offHi64) : mkU64(bias+offHi64);
      addrHi64  = assignNew('V', mce, tyAddr, binop(mkAdd, addr, eBiasHi64) );
      vdataHi64 = assignNew('V', mce, Ity_I64, unop(Iop_V128HIto64, vdata));
      diHi64    = unsafeIRDirty_0_N(
                     1/*regparms*/,
                     hname, VG_(fnptr_to_fnentry)( helper ),
                     mkIRExprVec_2( addrHi64, vdataHi64 )
                  );
      if (guard) diLo64->guard = guard;
      if (guard) diHi64->guard = guard;
      setHelperAnns( mce, diLo64 );
      setHelperAnns( mce, diHi64 );
      stmt( 'V', mce, IRStmt_Dirty(diLo64) );
      stmt( 'V', mce, IRStmt_Dirty(diHi64) );

   } else {

      IRDirty *di;
      IRAtom  *addrAct;

      /* 8/16/32/64-bit cases */
      /* Generate the actual address into addrAct. */
      if (bias == 0) {
         addrAct = addr;
      } else {
         IRAtom* eBias = tyAddr==Ity_I32 ? mkU32(bias) : mkU64(bias);
         addrAct = assignNew('V', mce, tyAddr, binop(mkAdd, addr, eBias));
      }

      if (ty == Ity_I64) {
         /* We can't do this with regparm 2 on 32-bit platforms, since
            the back ends aren't clever enough to handle 64-bit
            regparm args.  Therefore be different. */
         di = unsafeIRDirty_0_N(
                 1/*regparms*/,
                 hname, VG_(fnptr_to_fnentry)( helper ),
                 mkIRExprVec_2( addrAct, vdata )
              );
      } else {
         di = unsafeIRDirty_0_N(
                 2/*regparms*/,
                 hname, VG_(fnptr_to_fnentry)( helper ),
                 mkIRExprVec_2( addrAct,
                                zwidenToHostWord( mce, vdata ))
              );
      }
      if (guard) di->guard = guard;
      setHelperAnns( mce, di );
      stmt( 'V', mce, IRStmt_Dirty(di) );
   }

}


/* Do lazy pessimistic propagation through a dirty helper call, by
   looking at the annotations on it.  This is the most complex part of
   Memcheck. */

static IRType szToITy ( Int n )
{
   switch (n) {
      case 1: return Ity_I8;
      case 2: return Ity_I16;
      case 4: return Ity_I32;
      case 8: return Ity_I64;
      default: VG_(tool_panic)("szToITy(memcheck)");
   }
}

static
void do_shadow_Dirty ( MCEnv* mce, IRDirty* d )
{
   Int       i, k, n, toDo, gSz, gOff;
   IRAtom    *src, *here, *curr;
   IRType    tySrc, tyDst;
   IRTemp    dst;
   IREndness end;

   /* What's the native endianness?  We need to know this. */
#  if defined(VG_BIGENDIAN)
   end = Iend_BE;
#  elif defined(VG_LITTLEENDIAN)
   end = Iend_LE;
#  else
#    error "Unknown endianness"
#  endif

   /* First check the guard. */
   complainIfUndefined(mce, d->guard, NULL);

   /* Now round up all inputs and PCast over them. */
   curr = definedOfType(Ity_I32);

   /* Inputs: unmasked args
      Note: arguments are evaluated REGARDLESS of the guard expression */
   for (i = 0; d->args[i]; i++) {
      IRAtom* arg = d->args[i];
      if ( (d->cee->mcx_mask & (1<<i))
           || UNLIKELY(is_IRExpr_VECRET_or_BBPTR(arg)) ) {
         /* ignore this arg */
      } else {
         here = mkPCastTo( mce, Ity_I32, expr2vbits(mce, arg) );
         curr = mkUifU32(mce, here, curr);
      }
   }

   /* Inputs: guest state that we read. */
   for (i = 0; i < d->nFxState; i++) {
      tl_assert(d->fxState[i].fx != Ifx_None);
      if (d->fxState[i].fx == Ifx_Write)
         continue;

      /* Enumerate the described state segments */
      for (k = 0; k < 1 + d->fxState[i].nRepeats; k++) {
         gOff = d->fxState[i].offset + k * d->fxState[i].repeatLen;
         gSz  = d->fxState[i].size;

         /* Ignore any sections marked as 'always defined'. */
         if (isAlwaysDefd(mce, gOff, gSz)) {
            if (0)
            VG_(printf)("memcheck: Dirty gst: ignored off %d, sz %d\n",
                        gOff, gSz);
            continue;
         }

         /* This state element is read or modified.  So we need to
            consider it.  If larger than 8 bytes, deal with it in
            8-byte chunks. */
         while (True) {
            tl_assert(gSz >= 0);
            if (gSz == 0) break;
            n = gSz <= 8 ? gSz : 8;
            /* update 'curr' with UifU of the state slice
               gOff .. gOff+n-1 */
            tySrc = szToITy( n );

            /* Observe the guard expression.  If it is false use an
               all-bits-defined bit pattern */
            IRAtom *cond, *iffalse, *iftrue;

            cond    = assignNew('V', mce, Ity_I1, d->guard);
            iftrue  = assignNew('V', mce, tySrc, shadow_GET(mce, gOff, tySrc));
            iffalse = assignNew('V', mce, tySrc, definedOfType(tySrc));
            src     = assignNew('V', mce, tySrc,
                                IRExpr_ITE(cond, iftrue, iffalse));

            here = mkPCastTo( mce, Ity_I32, src );
            curr = mkUifU32(mce, here, curr);
            gSz -= n;
            gOff += n;
         }
      }
   }

   /* Inputs: memory.  First set up some info needed regardless of
      whether we're doing reads or writes. */

   if (d->mFx != Ifx_None) {
      /* Because we may do multiple shadow loads/stores from the same
         base address, it's best to do a single test of its
         definedness right now.  Post-instrumentation optimisation
         should remove all but this test. */
      IRType tyAddr;
      tl_assert(d->mAddr);
      complainIfUndefined(mce, d->mAddr, d->guard);

      tyAddr = typeOfIRExpr(mce->sb->tyenv, d->mAddr);
      tl_assert(tyAddr == Ity_I32 || tyAddr == Ity_I64);
      tl_assert(tyAddr == mce->hWordTy); /* not really right */
   }

   /* Deal with memory inputs (reads or modifies) */
   if (d->mFx == Ifx_Read || d->mFx == Ifx_Modify) {
      toDo = d->mSize;
      /* chew off 32-bit chunks.  We don't care about the endianness
         since it's all going to be condensed down to a single bit,
         but nevertheless choose an endianness which is hopefully
         native to the platform. */
      while (toDo >= 4) {
         here = mkPCastTo(
                   mce, Ity_I32,
                   expr2vbits_Load_guarded_Simple(
                      mce, end, Ity_I32, d->mAddr, d->mSize - toDo, d->guard )
                );
         curr = mkUifU32(mce, here, curr);
         toDo -= 4;
      }
      /* chew off 16-bit chunks */
      while (toDo >= 2) {
         here = mkPCastTo(
                   mce, Ity_I32,
                   expr2vbits_Load_guarded_Simple(
                      mce, end, Ity_I16, d->mAddr, d->mSize - toDo, d->guard )
                );
         curr = mkUifU32(mce, here, curr);
         toDo -= 2;
      }
      /* chew off the remaining 8-bit chunk, if any */
      if (toDo == 1) {
         here = mkPCastTo(
                   mce, Ity_I32,
                   expr2vbits_Load_guarded_Simple(
                      mce, end, Ity_I8, d->mAddr, d->mSize - toDo, d->guard )
                );
         curr = mkUifU32(mce, here, curr);
         toDo -= 1;
      }
      tl_assert(toDo == 0);
   }

   /* Whew!  So curr is a 32-bit V-value summarising pessimistically
      all the inputs to the helper.  Now we need to re-distribute the
      results to all destinations. */

   /* Outputs: the destination temporary, if there is one. */
   if (d->tmp != IRTemp_INVALID) {
      dst   = findShadowTmpV(mce, d->tmp);
      tyDst = typeOfIRTemp(mce->sb->tyenv, d->tmp);
      assign( 'V', mce, dst, mkPCastTo( mce, tyDst, curr) );
   }

   /* Outputs: guest state that we write or modify. */
   for (i = 0; i < d->nFxState; i++) {
      tl_assert(d->fxState[i].fx != Ifx_None);
      if (d->fxState[i].fx == Ifx_Read)
         continue;

      /* Enumerate the described state segments */
      for (k = 0; k < 1 + d->fxState[i].nRepeats; k++) {
         gOff = d->fxState[i].offset + k * d->fxState[i].repeatLen;
         gSz  = d->fxState[i].size;

         /* Ignore any sections marked as 'always defined'. */
         if (isAlwaysDefd(mce, gOff, gSz))
            continue;

         /* This state element is written or modified.  So we need to
            consider it.  If larger than 8 bytes, deal with it in
            8-byte chunks. */
         while (True) {
            tl_assert(gSz >= 0);
            if (gSz == 0) break;
            n = gSz <= 8 ? gSz : 8;
            /* Write suitably-casted 'curr' to the state slice
               gOff .. gOff+n-1 */
            tyDst = szToITy( n );
            do_shadow_PUT( mce, gOff,
                           NULL, /* original atom */
                           mkPCastTo( mce, tyDst, curr ), d->guard );
            gSz -= n;
            gOff += n;
         }
      }
   }

   /* Outputs: memory that we write or modify.  Same comments about
      endianness as above apply. */
   if (d->mFx == Ifx_Write || d->mFx == Ifx_Modify) {
      toDo = d->mSize;
      /* chew off 32-bit chunks */
      while (toDo >= 4) {
         do_shadow_Store( mce, end, d->mAddr, d->mSize - toDo,
                          NULL, /* original data */
                          mkPCastTo( mce, Ity_I32, curr ),
                          d->guard );
         toDo -= 4;
      }
      /* chew off 16-bit chunks */
      while (toDo >= 2) {
         do_shadow_Store( mce, end, d->mAddr, d->mSize - toDo,
                          NULL, /* original data */
                          mkPCastTo( mce, Ity_I16, curr ),
                          d->guard );
         toDo -= 2;
      }
      /* chew off the remaining 8-bit chunk, if any */
      if (toDo == 1) {
         do_shadow_Store( mce, end, d->mAddr, d->mSize - toDo,
                          NULL, /* original data */
                          mkPCastTo( mce, Ity_I8, curr ),
                          d->guard );
         toDo -= 1;
      }
      tl_assert(toDo == 0);
   }

}


/* We have an ABI hint telling us that [base .. base+len-1] is to
   become undefined ("writable").  Generate code to call a helper to
   notify the A/V bit machinery of this fact.

   Depending on the checking level, we call one of
      void MC_(helperc_MAKE_STACK_UNINIT_w_o)  ( Addr base, UWord len,
                                                 Addr nia );
      void MC_(helperc_MAKE_STACK_UNINIT_no_o) ( Addr base, UWord len );
      void MC_(helperc_MAKE_STACK_UNINIT_128_no_o) ( Addr base );
*/
static
void do_AbiHint ( MCEnv* mce, IRExpr* base, Int len, IRExpr* nia )
{
   IRDirty* di;

   if (MC_(clo_mc_level) == 3) {
      di = unsafeIRDirty_0_N(
              3/*regparms*/,
              "MC_(helperc_MAKE_STACK_UNINIT_w_o)",
              VG_(fnptr_to_fnentry)( &MC_(helperc_MAKE_STACK_UNINIT_w_o) ),
              mkIRExprVec_3( base, mkIRExpr_HWord( (UInt)len), nia )
           );
   } else {
      /* We ignore the supplied nia, since it is irrelevant. */
      tl_assert(MC_(clo_mc_level) == 2 || MC_(clo_mc_level) == 1);
      /* Special-case the len==128 case, since that is for amd64-ELF,
         which is a very common target. */
      if (len == 128) {
         di = unsafeIRDirty_0_N(
                 1/*regparms*/,
                 "MC_(helperc_MAKE_STACK_UNINIT_128_no_o)",
                 VG_(fnptr_to_fnentry)( &MC_(helperc_MAKE_STACK_UNINIT_128_no_o)),
                 mkIRExprVec_1( base )
              );
      } else {
         di = unsafeIRDirty_0_N(
                 2/*regparms*/,
                 "MC_(helperc_MAKE_STACK_UNINIT_no_o)",
                 VG_(fnptr_to_fnentry)( &MC_(helperc_MAKE_STACK_UNINIT_no_o) ),
                 mkIRExprVec_2( base, mkIRExpr_HWord( (UInt)len) )
              );
      }
   }

   stmt( 'V', mce, IRStmt_Dirty(di) );
}

njn25e49d8e72002-09-23 09:36:25 +00005609
sewardj1c0ce7a2009-07-01 08:10:49 +00005610/* ------ Dealing with IRCAS (big and complex) ------ */
5611
5612/* FWDS */
5613static IRAtom* gen_load_b ( MCEnv* mce, Int szB,
5614 IRAtom* baseaddr, Int offset );
5615static IRAtom* gen_maxU32 ( MCEnv* mce, IRAtom* b1, IRAtom* b2 );
5616static void gen_store_b ( MCEnv* mce, Int szB,
5617 IRAtom* baseaddr, Int offset, IRAtom* dataB,
5618 IRAtom* guard );
5619
5620static void do_shadow_CAS_single ( MCEnv* mce, IRCAS* cas );
5621static void do_shadow_CAS_double ( MCEnv* mce, IRCAS* cas );
5622
5623
5624/* Either ORIG and SHADOW are both IRExpr.RdTmps, or they are both
5625 IRExpr.Consts, else this asserts. If they are both Consts, it
5626 doesn't do anything. So that just leaves the RdTmp case.
5627
5628 In which case: this assigns the shadow value SHADOW to the IR
5629 shadow temporary associated with ORIG. That is, ORIG, being an
5630 original temporary, will have a shadow temporary associated with
5631 it. However, in the case envisaged here, there will so far have
5632 been no IR emitted to actually write a shadow value into that
5633 temporary. What this routine does is to (emit IR to) copy the
5634 value in SHADOW into said temporary, so that after this call,
5635 IRExpr.RdTmps of ORIG's shadow temp will correctly pick up the
5636 value in SHADOW.
5637
5638 Point is to allow callers to compute "by hand" a shadow value for
5639 ORIG, and force it to be associated with ORIG.
5640
5641 How do we know that that shadow associated with ORIG has not so far
5642 been assigned to? Well, we don't per se know that, but supposing
5643 it had. Then this routine would create a second assignment to it,
5644 and later the IR sanity checker would barf. But that never
5645 happens. QED.
5646*/
5647static void bind_shadow_tmp_to_orig ( UChar how,
5648 MCEnv* mce,
5649 IRAtom* orig, IRAtom* shadow )
5650{
5651 tl_assert(isOriginalAtom(mce, orig));
5652 tl_assert(isShadowAtom(mce, shadow));
5653 switch (orig->tag) {
5654 case Iex_Const:
5655 tl_assert(shadow->tag == Iex_Const);
5656 break;
5657 case Iex_RdTmp:
5658 tl_assert(shadow->tag == Iex_RdTmp);
5659 if (how == 'V') {
5660 assign('V', mce, findShadowTmpV(mce,orig->Iex.RdTmp.tmp),
5661 shadow);
5662 } else {
5663 tl_assert(how == 'B');
5664 assign('B', mce, findShadowTmpB(mce,orig->Iex.RdTmp.tmp),
5665 shadow);
5666 }
5667 break;
5668 default:
5669 tl_assert(0);
5670 }
5671}
5672
5673
5674static
5675void do_shadow_CAS ( MCEnv* mce, IRCAS* cas )
5676{
5677 /* Scheme is (both single- and double- cases):
5678
5679 1. fetch data#,dataB (the proposed new value)
5680
5681 2. fetch expd#,expdB (what we expect to see at the address)
5682
5683 3. check definedness of address
5684
5685 4. load old#,oldB from shadow memory; this also checks
5686 addressibility of the address
5687
5688 5. the CAS itself
5689
sewardjafed4c52009-07-12 13:00:17 +00005690 6. compute "expected == old". See COMMENT_ON_CasCmpEQ below.
sewardj1c0ce7a2009-07-01 08:10:49 +00005691
sewardjafed4c52009-07-12 13:00:17 +00005692 7. if "expected == old" (as computed by (6))
sewardj1c0ce7a2009-07-01 08:10:49 +00005693 store data#,dataB to shadow memory
5694
5695 Note that 5 reads 'old' but 4 reads 'old#'. Similarly, 5 stores
5696 'data' but 7 stores 'data#'. Hence it is possible for the
5697 shadow data to be incorrectly checked and/or updated:
5698
sewardj1c0ce7a2009-07-01 08:10:49 +00005699 * 7 is at least gated correctly, since the 'expected == old'
5700 condition is derived from outputs of 5. However, the shadow
5701 write could happen too late: imagine after 5 we are
5702 descheduled, a different thread runs, writes a different
5703 (shadow) value at the address, and then we resume, hence
5704 overwriting the shadow value written by the other thread.
5705
5706 Because the original memory access is atomic, there's no way to
5707 make both the original and shadow accesses into a single atomic
5708 thing, hence this is unavoidable.
5709
5710 At least as Valgrind stands, I don't think it's a problem, since
5711 we're single threaded *and* we guarantee that there are no
5712 context switches during the execution of any specific superblock
5713 -- context switches can only happen at superblock boundaries.
5714
5715 If Valgrind ever becomes MT in the future, then it might be more
5716 of a problem. A possible kludge would be to artificially
5717 associate with the location, a lock, which we must acquire and
5718 release around the transaction as a whole. Hmm, that probably
5719 would't work properly since it only guards us against other
5720 threads doing CASs on the same location, not against other
5721 threads doing normal reads and writes.
sewardjafed4c52009-07-12 13:00:17 +00005722
5723 ------------------------------------------------------------
5724
5725 COMMENT_ON_CasCmpEQ:
5726
5727 Note two things. Firstly, in the sequence above, we compute
5728 "expected == old", but we don't check definedness of it. Why
5729 not? Also, the x86 and amd64 front ends use
sewardjb9e6d242013-05-11 13:42:08 +00005730 Iop_CasCmp{EQ,NE}{8,16,32,64} comparisons to make the equivalent
sewardjafed4c52009-07-12 13:00:17 +00005731 determination (expected == old ?) for themselves, and we also
5732 don't check definedness for those primops; we just say that the
5733 result is defined. Why? Details follow.
5734
5735 x86/amd64 contains various forms of locked insns:
5736 * lock prefix before all basic arithmetic insn;
5737 eg lock xorl %reg1,(%reg2)
5738 * atomic exchange reg-mem
5739 * compare-and-swaps
5740
      Rather than attempt to represent them all, which would be a
      royal PITA, I used a result from Maurice Herlihy
      (http://en.wikipedia.org/wiki/Maurice_Herlihy), in which he
      demonstrates that compare-and-swap is a primitive more general
      than the other two, and so can be used to represent all of them.
      So the translation scheme for (eg) lock incl (%reg) is as
      follows:

         again:
            old = * %reg
            new = old + 1
            atomically { if (* %reg == old) { * %reg = new } else { goto again } }

      The "atomically" is the CAS bit.  The scheme is always the same:
      get old value from memory, compute new value, atomically stuff
      new value back in memory iff the old value has not changed (iow,
      no other thread modified it in the meantime).  If it has changed
      then we've been out-raced and we have to start over.

      Now that's all very neat, but it has the bad side effect of
      introducing an explicit equality test into the translation.
      Consider the behaviour of said code on a memory location which
      is uninitialised.  We will wind up doing a comparison on
      uninitialised data, and mc duly complains.

      What's difficult about this is, the common case is that the
      location is uncontended, and so we're usually comparing the same
      value (* %reg) with itself.  So we shouldn't complain even if it
      is undefined.  But mc doesn't know that.

      My solution is to mark the == in the IR specially, so as to tell
      mc that it almost certainly compares a value with itself, and we
      should just regard the result as always defined.  Rather than
      add a bit to all IROps, I just cloned Iop_CmpEQ{8,16,32,64} into
      Iop_CasCmpEQ{8,16,32,64} so as not to disturb anything else.

      So there's always the question of, can this give a false
      negative?  eg, imagine that initially, * %reg is defined; and we
      read that; but then in the gap between the read and the CAS, a
      different thread writes an undefined (and different) value at
      the location.  Then the CAS in this thread will fail and we will
      go back to "again:", but without knowing that the trip back
      there was based on an undefined comparison.  No matter; at least
      the other thread won the race and the location is correctly
      marked as undefined.  What if it wrote an uninitialised version
      of the same value that was there originally, though?

      etc etc.  Seems like there's a small corner case in which we
      might lose the fact that something's defined -- we're out-raced
      in between the "old = * reg" and the "atomically {", _and_ the
      other thread is writing in an undefined version of what's
      already there.  Well, that seems pretty unlikely.

      ---

      If we ever need to reinstate it .. code which generates a
      definedness test for "expected == old" was removed at r10432 of
      this file.
   */
   if (cas->oldHi == IRTemp_INVALID) {
      do_shadow_CAS_single( mce, cas );
   } else {
      do_shadow_CAS_double( mce, cas );
   }
}


static void do_shadow_CAS_single ( MCEnv* mce, IRCAS* cas )
{
   IRAtom *vdataLo = NULL, *bdataLo = NULL;
   IRAtom *vexpdLo = NULL, *bexpdLo = NULL;
   IRAtom *voldLo  = NULL, *boldLo  = NULL;
   IRAtom *expd_eq_old = NULL;
   IROp   opCasCmpEQ;
   Int    elemSzB;
   IRType elemTy;
   Bool   otrak = MC_(clo_mc_level) >= 3; /* a shorthand */

   /* single CAS */
   tl_assert(cas->oldHi == IRTemp_INVALID);
   tl_assert(cas->expdHi == NULL);
   tl_assert(cas->dataHi == NULL);

   elemTy = typeOfIRExpr(mce->sb->tyenv, cas->expdLo);
   switch (elemTy) {
      case Ity_I8:  elemSzB = 1; opCasCmpEQ = Iop_CasCmpEQ8;  break;
      case Ity_I16: elemSzB = 2; opCasCmpEQ = Iop_CasCmpEQ16; break;
      case Ity_I32: elemSzB = 4; opCasCmpEQ = Iop_CasCmpEQ32; break;
      case Ity_I64: elemSzB = 8; opCasCmpEQ = Iop_CasCmpEQ64; break;
      default: tl_assert(0); /* IR defn disallows any other types */
   }

   /* 1. fetch data# (the proposed new value) */
   tl_assert(isOriginalAtom(mce, cas->dataLo));
   vdataLo
      = assignNew('V', mce, elemTy, expr2vbits(mce, cas->dataLo));
   tl_assert(isShadowAtom(mce, vdataLo));
   if (otrak) {
      bdataLo
         = assignNew('B', mce, Ity_I32, schemeE(mce, cas->dataLo));
      tl_assert(isShadowAtom(mce, bdataLo));
   }

   /* 2. fetch expected# (what we expect to see at the address) */
   tl_assert(isOriginalAtom(mce, cas->expdLo));
   vexpdLo
      = assignNew('V', mce, elemTy, expr2vbits(mce, cas->expdLo));
   tl_assert(isShadowAtom(mce, vexpdLo));
   if (otrak) {
      bexpdLo
         = assignNew('B', mce, Ity_I32, schemeE(mce, cas->expdLo));
      tl_assert(isShadowAtom(mce, bexpdLo));
   }

   /* 3. check definedness of address */
   /* 4. fetch old# from shadow memory; this also checks
         addressability of the address */
   voldLo
      = assignNew(
           'V', mce, elemTy,
           expr2vbits_Load(
              mce,
              cas->end, elemTy, cas->addr, 0/*Addr bias*/,
              NULL/*always happens*/
        ));
   bind_shadow_tmp_to_orig('V', mce, mkexpr(cas->oldLo), voldLo);
   if (otrak) {
      boldLo
         = assignNew('B', mce, Ity_I32,
                     gen_load_b(mce, elemSzB, cas->addr, 0/*addr bias*/));
      bind_shadow_tmp_to_orig('B', mce, mkexpr(cas->oldLo), boldLo);
   }

   /* 5. the CAS itself */
   stmt( 'C', mce, IRStmt_CAS(cas) );

   /* 6. compute "expected == old" */
   /* See COMMENT_ON_CasCmpEQ in this file for background/rationale. */
   /* Note that 'C' is kinda faking it; it is indeed a non-shadow
      tree, but it's not copied from the input block. */
   expd_eq_old
      = assignNew('C', mce, Ity_I1,
                  binop(opCasCmpEQ, cas->expdLo, mkexpr(cas->oldLo)));

   /* 7. if "expected == old"
            store data# to shadow memory */
   do_shadow_Store( mce, cas->end, cas->addr, 0/*bias*/,
                    NULL/*data*/, vdataLo/*vdata*/,
                    expd_eq_old/*guard for store*/ );
   if (otrak) {
      gen_store_b( mce, elemSzB, cas->addr, 0/*offset*/,
                   bdataLo/*bdata*/,
                   expd_eq_old/*guard for store*/ );
   }
}


static void do_shadow_CAS_double ( MCEnv* mce, IRCAS* cas )
{
   IRAtom *vdataHi = NULL, *bdataHi = NULL;
   IRAtom *vdataLo = NULL, *bdataLo = NULL;
   IRAtom *vexpdHi = NULL, *bexpdHi = NULL;
   IRAtom *vexpdLo = NULL, *bexpdLo = NULL;
   IRAtom *voldHi  = NULL, *boldHi  = NULL;
   IRAtom *voldLo  = NULL, *boldLo  = NULL;
   IRAtom *xHi = NULL, *xLo = NULL, *xHL = NULL;
   IRAtom *expd_eq_old = NULL, *zero = NULL;
   IROp   opCasCmpEQ, opOr, opXor;
   Int    elemSzB, memOffsLo, memOffsHi;
   IRType elemTy;
   Bool   otrak = MC_(clo_mc_level) >= 3; /* a shorthand */

   /* double CAS */
   tl_assert(cas->oldHi != IRTemp_INVALID);
   tl_assert(cas->expdHi != NULL);
   tl_assert(cas->dataHi != NULL);

   elemTy = typeOfIRExpr(mce->sb->tyenv, cas->expdLo);
   switch (elemTy) {
      case Ity_I8:
         opCasCmpEQ = Iop_CasCmpEQ8; opOr = Iop_Or8; opXor = Iop_Xor8;
         elemSzB = 1; zero = mkU8(0);
         break;
      case Ity_I16:
         opCasCmpEQ = Iop_CasCmpEQ16; opOr = Iop_Or16; opXor = Iop_Xor16;
         elemSzB = 2; zero = mkU16(0);
         break;
      case Ity_I32:
         opCasCmpEQ = Iop_CasCmpEQ32; opOr = Iop_Or32; opXor = Iop_Xor32;
         elemSzB = 4; zero = mkU32(0);
         break;
      case Ity_I64:
         opCasCmpEQ = Iop_CasCmpEQ64; opOr = Iop_Or64; opXor = Iop_Xor64;
         elemSzB = 8; zero = mkU64(0);
         break;
      default:
         tl_assert(0); /* IR defn disallows any other types */
   }

   /* 1. fetch data# (the proposed new value) */
   tl_assert(isOriginalAtom(mce, cas->dataHi));
   tl_assert(isOriginalAtom(mce, cas->dataLo));
   vdataHi
      = assignNew('V', mce, elemTy, expr2vbits(mce, cas->dataHi));
   vdataLo
      = assignNew('V', mce, elemTy, expr2vbits(mce, cas->dataLo));
   tl_assert(isShadowAtom(mce, vdataHi));
   tl_assert(isShadowAtom(mce, vdataLo));
   if (otrak) {
      bdataHi
         = assignNew('B', mce, Ity_I32, schemeE(mce, cas->dataHi));
      bdataLo
         = assignNew('B', mce, Ity_I32, schemeE(mce, cas->dataLo));
      tl_assert(isShadowAtom(mce, bdataHi));
      tl_assert(isShadowAtom(mce, bdataLo));
   }

   /* 2. fetch expected# (what we expect to see at the address) */
   tl_assert(isOriginalAtom(mce, cas->expdHi));
   tl_assert(isOriginalAtom(mce, cas->expdLo));
   vexpdHi
      = assignNew('V', mce, elemTy, expr2vbits(mce, cas->expdHi));
   vexpdLo
      = assignNew('V', mce, elemTy, expr2vbits(mce, cas->expdLo));
   tl_assert(isShadowAtom(mce, vexpdHi));
   tl_assert(isShadowAtom(mce, vexpdLo));
   if (otrak) {
      bexpdHi
         = assignNew('B', mce, Ity_I32, schemeE(mce, cas->expdHi));
      bexpdLo
         = assignNew('B', mce, Ity_I32, schemeE(mce, cas->expdLo));
      tl_assert(isShadowAtom(mce, bexpdHi));
      tl_assert(isShadowAtom(mce, bexpdLo));
   }

   /* 3. check definedness of address */
   /* 4. fetch old# from shadow memory; this also checks
         addressability of the address */
   if (cas->end == Iend_LE) {
      memOffsLo = 0;
      memOffsHi = elemSzB;
   } else {
      tl_assert(cas->end == Iend_BE);
      memOffsLo = elemSzB;
      memOffsHi = 0;
   }
   voldHi
      = assignNew(
           'V', mce, elemTy,
           expr2vbits_Load(
              mce,
              cas->end, elemTy, cas->addr, memOffsHi/*Addr bias*/,
              NULL/*always happens*/
        ));
   voldLo
      = assignNew(
           'V', mce, elemTy,
           expr2vbits_Load(
              mce,
              cas->end, elemTy, cas->addr, memOffsLo/*Addr bias*/,
              NULL/*always happens*/
        ));
   bind_shadow_tmp_to_orig('V', mce, mkexpr(cas->oldHi), voldHi);
   bind_shadow_tmp_to_orig('V', mce, mkexpr(cas->oldLo), voldLo);
   if (otrak) {
      boldHi
         = assignNew('B', mce, Ity_I32,
                     gen_load_b(mce, elemSzB, cas->addr,
                                memOffsHi/*addr bias*/));
      boldLo
         = assignNew('B', mce, Ity_I32,
                     gen_load_b(mce, elemSzB, cas->addr,
                                memOffsLo/*addr bias*/));
      bind_shadow_tmp_to_orig('B', mce, mkexpr(cas->oldHi), boldHi);
      bind_shadow_tmp_to_orig('B', mce, mkexpr(cas->oldLo), boldLo);
   }

   /* 5. the CAS itself */
   stmt( 'C', mce, IRStmt_CAS(cas) );

   /* 6. compute "expected == old" */
   /* See COMMENT_ON_CasCmpEQ in this file for background/rationale. */
   /* Note that 'C' is kinda faking it; it is indeed a non-shadow
      tree, but it's not copied from the input block. */
   /*
      xHi = oldHi ^ expdHi;
      xLo = oldLo ^ expdLo;
      xHL = xHi | xLo;
      expd_eq_old = xHL == 0;
   */
   xHi = assignNew('C', mce, elemTy,
                   binop(opXor, cas->expdHi, mkexpr(cas->oldHi)));
   xLo = assignNew('C', mce, elemTy,
                   binop(opXor, cas->expdLo, mkexpr(cas->oldLo)));
   xHL = assignNew('C', mce, elemTy,
                   binop(opOr, xHi, xLo));
   expd_eq_old
      = assignNew('C', mce, Ity_I1,
                  binop(opCasCmpEQ, xHL, zero));

   /* 7. if "expected == old"
            store data# to shadow memory */
   do_shadow_Store( mce, cas->end, cas->addr, memOffsHi/*bias*/,
                    NULL/*data*/, vdataHi/*vdata*/,
                    expd_eq_old/*guard for store*/ );
   do_shadow_Store( mce, cas->end, cas->addr, memOffsLo/*bias*/,
                    NULL/*data*/, vdataLo/*vdata*/,
                    expd_eq_old/*guard for store*/ );
   if (otrak) {
      gen_store_b( mce, elemSzB, cas->addr, memOffsHi/*offset*/,
                   bdataHi/*bdata*/,
                   expd_eq_old/*guard for store*/ );
      gen_store_b( mce, elemSzB, cas->addr, memOffsLo/*offset*/,
                   bdataLo/*bdata*/,
                   expd_eq_old/*guard for store*/ );
   }
}


/* ------ Dealing with LL/SC (not difficult) ------ */

static void do_shadow_LLSC ( MCEnv*    mce,
                             IREndness stEnd,
                             IRTemp    stResult,
                             IRExpr*   stAddr,
                             IRExpr*   stStoredata )
{
   /* In short: treat a load-linked like a normal load followed by an
      assignment of the loaded (shadow) data to the result temporary.
      Treat a store-conditional like a normal store, and mark the
      result temporary as defined. */
   IRType resTy  = typeOfIRTemp(mce->sb->tyenv, stResult);
   IRTemp resTmp = findShadowTmpV(mce, stResult);

   tl_assert(isIRAtom(stAddr));
   if (stStoredata)
      tl_assert(isIRAtom(stStoredata));

   if (stStoredata == NULL) {
      /* Load Linked */
      /* Just treat this as a normal load, followed by an assignment of
         the value to .result. */
      /* Stay sane */
      tl_assert(resTy == Ity_I64 || resTy == Ity_I32
                || resTy == Ity_I16 || resTy == Ity_I8);
      assign( 'V', mce, resTmp,
              expr2vbits_Load(
                 mce, stEnd, resTy, stAddr, 0/*addr bias*/,
                 NULL/*always happens*/) );
   } else {
      /* Store Conditional */
      /* Stay sane */
      IRType dataTy = typeOfIRExpr(mce->sb->tyenv,
                                   stStoredata);
      tl_assert(dataTy == Ity_I64 || dataTy == Ity_I32
                || dataTy == Ity_I16 || dataTy == Ity_I8);
      do_shadow_Store( mce, stEnd,
                       stAddr, 0/* addr bias */,
                       stStoredata,
                       NULL /* shadow data */,
                       NULL/*guard*/ );
      /* This is a store conditional, so it writes to .result a value
         indicating whether or not the store succeeded.  Just claim
         this value is always defined.  In the PowerPC interpretation
         of store-conditional, definedness of the success indication
         depends on whether the address of the store matches the
         reservation address.  But we can't tell that here (and
         anyway, we're not being PowerPC-specific).  At least we are
         guaranteed that the definedness of the store address, and its
         addressability, will be checked as per normal.  So it seems
         pretty safe to just say that the success indication is always
         defined.

         In schemeS, for origin tracking, we must correspondingly set
         a no-origin value for the origin shadow of .result.
      */
      tl_assert(resTy == Ity_I1);
      assign( 'V', mce, resTmp, definedOfType(resTy) );
   }
}


/* ---- Dealing with LoadG/StoreG (not entirely simple) ---- */

static void do_shadow_StoreG ( MCEnv* mce, IRStoreG* sg )
{
   complainIfUndefined(mce, sg->guard, NULL);
   /* do_shadow_Store will generate code to check the definedness and
      validity of sg->addr, in the case where sg->guard evaluates to
      True at run-time. */
   do_shadow_Store( mce, sg->end,
                    sg->addr, 0/* addr bias */,
                    sg->data,
                    NULL /* shadow data */,
                    sg->guard );
}

static void do_shadow_LoadG ( MCEnv* mce, IRLoadG* lg )
{
   complainIfUndefined(mce, lg->guard, NULL);
   /* expr2vbits_Load_guarded_General will generate code to check the
      definedness and validity of lg->addr, in the case where
      lg->guard evaluates to True at run-time. */

   /* Look at the LoadG's built-in conversion operation, to determine
      the source (actual loaded data) type, and the equivalent IROp.
      NOTE that implicitly we are taking a widening operation to be
      applied to original atoms and producing one that applies to V
      bits.  Since signed and unsigned widening are self-shadowing,
      this is a straight copy of the op (modulo swapping from the
      IRLoadGOp form to the IROp form).  Note also therefore that this
      implicitly duplicates the logic to do with said widening ops in
      expr2vbits_Unop.  See comment at the start of expr2vbits_Unop. */
   IROp   vwiden   = Iop_INVALID;
   IRType loadedTy = Ity_INVALID;
   switch (lg->cvt) {
      case ILGop_IdentV128: loadedTy = Ity_V128; vwiden = Iop_INVALID; break;
      case ILGop_Ident64:   loadedTy = Ity_I64;  vwiden = Iop_INVALID; break;
      case ILGop_Ident32:   loadedTy = Ity_I32;  vwiden = Iop_INVALID; break;
      case ILGop_16Uto32:   loadedTy = Ity_I16;  vwiden = Iop_16Uto32; break;
      case ILGop_16Sto32:   loadedTy = Ity_I16;  vwiden = Iop_16Sto32; break;
      case ILGop_8Uto32:    loadedTy = Ity_I8;   vwiden = Iop_8Uto32;  break;
      case ILGop_8Sto32:    loadedTy = Ity_I8;   vwiden = Iop_8Sto32;  break;
      default: VG_(tool_panic)("do_shadow_LoadG");
   }

   IRAtom* vbits_alt
      = expr2vbits( mce, lg->alt );
   IRAtom* vbits_final
      = expr2vbits_Load_guarded_General(mce, lg->end, loadedTy,
                                        lg->addr, 0/*addr bias*/,
                                        lg->guard, vwiden, vbits_alt );
   /* And finally, bind the V bits to the destination temporary. */
   assign( 'V', mce, findShadowTmpV(mce, lg->dst), vbits_final );
}


/*------------------------------------------------------------*/
/*--- Memcheck main                                        ---*/
/*------------------------------------------------------------*/

static void schemeS ( MCEnv* mce, IRStmt* st );

static Bool isBogusAtom ( IRAtom* at )
{
   ULong n = 0;
   IRConst* con;
   tl_assert(isIRAtom(at));
   if (at->tag == Iex_RdTmp)
      return False;
   tl_assert(at->tag == Iex_Const);
   con = at->Iex.Const.con;
   switch (con->tag) {
      case Ico_U1:   return False;
      case Ico_U8:   n = (ULong)con->Ico.U8; break;
      case Ico_U16:  n = (ULong)con->Ico.U16; break;
      case Ico_U32:  n = (ULong)con->Ico.U32; break;
      case Ico_U64:  n = (ULong)con->Ico.U64; break;
      case Ico_F32:  return False;
      case Ico_F64:  return False;
      case Ico_F32i: return False;
      case Ico_F64i: return False;
      case Ico_V128: return False;
      case Ico_V256: return False;
      default: ppIRExpr(at); tl_assert(0);
   }
   /* VG_(printf)("%llx\n", n); */
   return (/*32*/    n == 0xFEFEFEFFULL
           /*32*/ || n == 0x80808080ULL
           /*32*/ || n == 0x7F7F7F7FULL
           /*32*/ || n == 0x7EFEFEFFULL
           /*32*/ || n == 0x81010100ULL
           /*64*/ || n == 0xFFFFFFFFFEFEFEFFULL
           /*64*/ || n == 0xFEFEFEFEFEFEFEFFULL
           /*64*/ || n == 0x0000000000008080ULL
           /*64*/ || n == 0x8080808080808080ULL
           /*64*/ || n == 0x0101010101010101ULL
          );
}

static Bool checkForBogusLiterals ( /*FLAT*/ IRStmt* st )
{
   Int      i;
   IRExpr*  e;
   IRDirty* d;
   IRCAS*   cas;
   switch (st->tag) {
      case Ist_WrTmp:
         e = st->Ist.WrTmp.data;
         switch (e->tag) {
            case Iex_Get:
            case Iex_RdTmp:
               return False;
            case Iex_Const:
               return isBogusAtom(e);
            case Iex_Unop:
               return isBogusAtom(e->Iex.Unop.arg)
                      || e->Iex.Unop.op == Iop_GetMSBs8x16;
            case Iex_GetI:
               return isBogusAtom(e->Iex.GetI.ix);
            case Iex_Binop:
               return isBogusAtom(e->Iex.Binop.arg1)
                      || isBogusAtom(e->Iex.Binop.arg2);
            case Iex_Triop:
               return isBogusAtom(e->Iex.Triop.details->arg1)
                      || isBogusAtom(e->Iex.Triop.details->arg2)
                      || isBogusAtom(e->Iex.Triop.details->arg3);
            case Iex_Qop:
               return isBogusAtom(e->Iex.Qop.details->arg1)
                      || isBogusAtom(e->Iex.Qop.details->arg2)
                      || isBogusAtom(e->Iex.Qop.details->arg3)
                      || isBogusAtom(e->Iex.Qop.details->arg4);
            case Iex_ITE:
               return isBogusAtom(e->Iex.ITE.cond)
                      || isBogusAtom(e->Iex.ITE.iftrue)
                      || isBogusAtom(e->Iex.ITE.iffalse);
            case Iex_Load:
               return isBogusAtom(e->Iex.Load.addr);
            case Iex_CCall:
               for (i = 0; e->Iex.CCall.args[i]; i++)
                  if (isBogusAtom(e->Iex.CCall.args[i]))
                     return True;
               return False;
            default:
               goto unhandled;
         }
      case Ist_Dirty:
         d = st->Ist.Dirty.details;
         for (i = 0; d->args[i]; i++) {
            IRAtom* atom = d->args[i];
            if (LIKELY(!is_IRExpr_VECRET_or_BBPTR(atom))) {
               if (isBogusAtom(atom))
                  return True;
            }
         }
         if (isBogusAtom(d->guard))
            return True;
         if (d->mAddr && isBogusAtom(d->mAddr))
            return True;
         return False;
      case Ist_Put:
         return isBogusAtom(st->Ist.Put.data);
      case Ist_PutI:
         return isBogusAtom(st->Ist.PutI.details->ix)
                || isBogusAtom(st->Ist.PutI.details->data);
      case Ist_Store:
         return isBogusAtom(st->Ist.Store.addr)
                || isBogusAtom(st->Ist.Store.data);
      case Ist_StoreG: {
         IRStoreG* sg = st->Ist.StoreG.details;
         return isBogusAtom(sg->addr) || isBogusAtom(sg->data)
                || isBogusAtom(sg->guard);
      }
      case Ist_LoadG: {
         IRLoadG* lg = st->Ist.LoadG.details;
         return isBogusAtom(lg->addr) || isBogusAtom(lg->alt)
                || isBogusAtom(lg->guard);
      }
      case Ist_Exit:
         return isBogusAtom(st->Ist.Exit.guard);
      case Ist_AbiHint:
         return isBogusAtom(st->Ist.AbiHint.base)
                || isBogusAtom(st->Ist.AbiHint.nia);
      case Ist_NoOp:
      case Ist_IMark:
      case Ist_MBE:
         return False;
      case Ist_CAS:
         cas = st->Ist.CAS.details;
         return isBogusAtom(cas->addr)
                || (cas->expdHi ? isBogusAtom(cas->expdHi) : False)
                || isBogusAtom(cas->expdLo)
                || (cas->dataHi ? isBogusAtom(cas->dataHi) : False)
                || isBogusAtom(cas->dataLo);
      case Ist_LLSC:
         return isBogusAtom(st->Ist.LLSC.addr)
                || (st->Ist.LLSC.storedata
                       ? isBogusAtom(st->Ist.LLSC.storedata)
                       : False);
      default:
      unhandled:
         ppIRStmt(st);
         VG_(tool_panic)("checkForBogusLiterals");
   }
}


IRSB* MC_(instrument) ( VgCallbackClosure* closure,
                        IRSB* sb_in,
                        const VexGuestLayout* layout,
                        const VexGuestExtents* vge,
                        const VexArchInfo* archinfo_host,
                        IRType gWordTy, IRType hWordTy )
{
   Bool    verboze = 0||False;
   Int     i, j, first_stmt;
   IRStmt* st;
   MCEnv   mce;
   IRSB*   sb_out;

   if (gWordTy != hWordTy) {
      /* We don't currently support this case. */
      VG_(tool_panic)("host/guest word size mismatch");
   }

   /* Check we're not completely nuts */
   tl_assert(sizeof(UWord)  == sizeof(void*));
   tl_assert(sizeof(Word)   == sizeof(void*));
   tl_assert(sizeof(Addr)   == sizeof(void*));
   tl_assert(sizeof(ULong)  == 8);
   tl_assert(sizeof(Long)   == 8);
   tl_assert(sizeof(UInt)   == 4);
   tl_assert(sizeof(Int)    == 4);

   tl_assert(MC_(clo_mc_level) >= 1 && MC_(clo_mc_level) <= 3);

   /* Set up SB */
   sb_out = deepCopyIRSBExceptStmts(sb_in);

   /* Set up the running environment.  Both .sb and .tmpMap are
      modified as we go along.  Note that tmps are added to both
      .sb->tyenv and .tmpMap together, so the valid index-set for
      those two arrays should always be identical. */
   VG_(memset)(&mce, 0, sizeof(mce));
   mce.sb            = sb_out;
   mce.trace         = verboze;
   mce.layout        = layout;
   mce.hWordTy       = hWordTy;
   mce.bogusLiterals = False;

   /* Do expensive interpretation for Iop_Add32 and Iop_Add64 on
      Darwin.  10.7 is mostly built with LLVM, which uses these for
      bitfield inserts, and we get a lot of false errors if the cheap
      interpretation is used, alas.  Could solve this much better if
      we knew which of such adds came from x86/amd64 LEA instructions,
      since these are the only ones really needing the expensive
      interpretation, but that would require some way to tag them in
      the _toIR.c front ends, which is a lot of faffing around.  So
      for now just use the slow and blunt-instrument solution. */
   mce.useLLVMworkarounds = False;
#  if defined(VGO_darwin)
   mce.useLLVMworkarounds = True;
#  endif

   mce.tmpMap = VG_(newXA)( VG_(malloc), "mc.MC_(instrument).1", VG_(free),
                            sizeof(TempMapEnt));
   VG_(hintSizeXA) (mce.tmpMap, sb_in->tyenv->types_used);
   for (i = 0; i < sb_in->tyenv->types_used; i++) {
      TempMapEnt ent;
      ent.kind    = Orig;
      ent.shadowV = IRTemp_INVALID;
      ent.shadowB = IRTemp_INVALID;
      VG_(addToXA)( mce.tmpMap, &ent );
   }
   tl_assert( VG_(sizeXA)( mce.tmpMap ) == sb_in->tyenv->types_used );

florian9ee20eb2015-08-27 17:50:47 +00006398 /* For expensive definedness checking skip looking for bogus
6399 literals. */
6400 mce.bogusLiterals = True;
6401 } else {
6402 /* Make a preliminary inspection of the statements, to see if there
6403 are any dodgy-looking literals. If there are, we generate
6404 extra-detailed (hence extra-expensive) instrumentation in
6405 places. Scan the whole bb even if dodgyness is found earlier,
6406 so that the flatness assertion is applied to all stmts. */
6407 Bool bogus = False;
sewardj95448072004-11-22 20:19:51 +00006408
florian9ee20eb2015-08-27 17:50:47 +00006409 for (i = 0; i < sb_in->stmts_used; i++) {
6410 st = sb_in->stmts[i];
6411 tl_assert(st);
6412 tl_assert(isFlatIRStmt(st));
sewardj95448072004-11-22 20:19:51 +00006413
florian9ee20eb2015-08-27 17:50:47 +00006414 if (!bogus) {
6415 bogus = checkForBogusLiterals(st);
6416 if (0 && bogus) {
6417 VG_(printf)("bogus: ");
6418 ppIRStmt(st);
6419 VG_(printf)("\n");
6420 }
6421 if (bogus) break;
sewardj95448072004-11-22 20:19:51 +00006422 }
6423 }
florian9ee20eb2015-08-27 17:50:47 +00006424 mce.bogusLiterals = bogus;
sewardj151b90d2005-07-06 19:42:23 +00006425 }
sewardj151b90d2005-07-06 19:42:23 +00006426
sewardja0871482006-10-18 12:41:55 +00006427 /* Copy verbatim any IR preamble preceding the first IMark */
sewardj151b90d2005-07-06 19:42:23 +00006428
sewardj1c0ce7a2009-07-01 08:10:49 +00006429 tl_assert(mce.sb == sb_out);
6430 tl_assert(mce.sb != sb_in);
sewardjf1962d32006-10-19 13:22:16 +00006431
sewardja0871482006-10-18 12:41:55 +00006432 i = 0;
sewardj1c0ce7a2009-07-01 08:10:49 +00006433 while (i < sb_in->stmts_used && sb_in->stmts[i]->tag != Ist_IMark) {
sewardja0871482006-10-18 12:41:55 +00006434
sewardj1c0ce7a2009-07-01 08:10:49 +00006435 st = sb_in->stmts[i];
sewardja0871482006-10-18 12:41:55 +00006436 tl_assert(st);
6437 tl_assert(isFlatIRStmt(st));
6438
sewardj1c0ce7a2009-07-01 08:10:49 +00006439 stmt( 'C', &mce, sb_in->stmts[i] );
sewardja0871482006-10-18 12:41:55 +00006440 i++;
6441 }
6442
sewardjf1962d32006-10-19 13:22:16 +00006443 /* Nasty problem. IR optimisation of the pre-instrumented IR may
6444 cause the IR following the preamble to contain references to IR
6445 temporaries defined in the preamble. Because the preamble isn't
6446 instrumented, these temporaries don't have any shadows.
6447 Nevertheless uses of them following the preamble will cause
6448 memcheck to generate references to their shadows. End effect is
6449 to cause IR sanity check failures, due to references to
6450 non-existent shadows. This is only evident for the complex
6451 preambles used for function wrapping on TOC-afflicted platforms
sewardj6e9de462011-06-28 07:25:29 +00006452 (ppc64-linux).
sewardjf1962d32006-10-19 13:22:16 +00006453
6454 The following loop therefore scans the preamble looking for
6455 assignments to temporaries. For each one found it creates an
sewardjafa617b2008-07-22 09:59:48 +00006456 assignment to the corresponding (V) shadow temp, marking it as
sewardjf1962d32006-10-19 13:22:16 +00006457 'defined'. This is the same resulting IR as if the main
6458 instrumentation loop before had been applied to the statement
6459 'tmp = CONSTANT'.
sewardjafa617b2008-07-22 09:59:48 +00006460
6461 Similarly, if origin tracking is enabled, we must generate an
6462 assignment for the corresponding origin (B) shadow, claiming
6463 no-origin, as appropriate for a defined value.
sewardjf1962d32006-10-19 13:22:16 +00006464 */
   for (j = 0; j < i; j++) {
      if (sb_in->stmts[j]->tag == Ist_WrTmp) {
         /* findShadowTmpV checks its arg is an original tmp;
            no need to assert that here. */
         IRTemp tmp_o = sb_in->stmts[j]->Ist.WrTmp.tmp;
         IRTemp tmp_v = findShadowTmpV(&mce, tmp_o);
         IRType ty_v  = typeOfIRTemp(sb_out->tyenv, tmp_v);
         assign( 'V', &mce, tmp_v, definedOfType( ty_v ) );
         if (MC_(clo_mc_level) == 3) {
            IRTemp tmp_b = findShadowTmpB(&mce, tmp_o);
            tl_assert(typeOfIRTemp(sb_out->tyenv, tmp_b) == Ity_I32);
            assign( 'B', &mce, tmp_b, mkU32(0)/* UNKNOWN ORIGIN */);
         }
         if (0) {
            VG_(printf)("create shadow tmp(s) for preamble tmp [%d] ty ", j);
            ppIRType( ty_v );
            VG_(printf)("\n");
         }
      }
   }

   /* Iterate over the remaining stmts to generate instrumentation. */

   tl_assert(sb_in->stmts_used > 0);
   tl_assert(i >= 0);
   tl_assert(i < sb_in->stmts_used);
   tl_assert(sb_in->stmts[i]->tag == Ist_IMark);

   for (/* use current i */; i < sb_in->stmts_used; i++) {

      st = sb_in->stmts[i];
      first_stmt = sb_out->stmts_used;

      if (verboze) {
         VG_(printf)("\n");
         ppIRStmt(st);
         VG_(printf)("\n");
      }

      if (MC_(clo_mc_level) == 3) {
         /* See comments on case Ist_CAS below. */
         if (st->tag != Ist_CAS)
            schemeS( &mce, st );
      }

      /* Generate instrumentation code for each stmt ... */

      switch (st->tag) {

         case Ist_WrTmp:
            assign( 'V', &mce, findShadowTmpV(&mce, st->Ist.WrTmp.tmp),
                               expr2vbits( &mce, st->Ist.WrTmp.data) );
            break;

         case Ist_Put:
            do_shadow_PUT( &mce,
                           st->Ist.Put.offset,
                           st->Ist.Put.data,
                           NULL /* shadow atom */, NULL /* guard */ );
            break;

         case Ist_PutI:
            do_shadow_PUTI( &mce, st->Ist.PutI.details);
            break;

         case Ist_Store:
            do_shadow_Store( &mce, st->Ist.Store.end,
                                   st->Ist.Store.addr, 0/* addr bias */,
                                   st->Ist.Store.data,
                                   NULL /* shadow data */,
                                   NULL/*guard*/ );
            break;

         case Ist_StoreG:
            do_shadow_StoreG( &mce, st->Ist.StoreG.details );
            break;

         case Ist_LoadG:
            do_shadow_LoadG( &mce, st->Ist.LoadG.details );
            break;

         case Ist_Exit:
            complainIfUndefined( &mce, st->Ist.Exit.guard, NULL );
            break;

         case Ist_IMark:
            break;

         case Ist_NoOp:
         case Ist_MBE:
            break;

         case Ist_Dirty:
            do_shadow_Dirty( &mce, st->Ist.Dirty.details );
            break;

         case Ist_AbiHint:
            do_AbiHint( &mce, st->Ist.AbiHint.base,
                              st->Ist.AbiHint.len,
                              st->Ist.AbiHint.nia );
            break;

         case Ist_CAS:
            do_shadow_CAS( &mce, st->Ist.CAS.details );
            /* Note, do_shadow_CAS copies the CAS itself to the output
               block, because it needs to add instrumentation both
               before and after it.  Hence skip the copy below.  Also
               skip the origin-tracking stuff (call to schemeS) above,
               since that's all tangled up with it too; do_shadow_CAS
               does it all. */
            break;

         case Ist_LLSC:
            do_shadow_LLSC( &mce,
                            st->Ist.LLSC.end,
                            st->Ist.LLSC.result,
                            st->Ist.LLSC.addr,
                            st->Ist.LLSC.storedata );
            break;

         default:
            VG_(printf)("\n");
            ppIRStmt(st);
            VG_(printf)("\n");
            VG_(tool_panic)("memcheck: unhandled IRStmt");

      } /* switch (st->tag) */

      if (0 && verboze) {
         for (j = first_stmt; j < sb_out->stmts_used; j++) {
            VG_(printf)("   ");
            ppIRStmt(sb_out->stmts[j]);
            VG_(printf)("\n");
         }
         VG_(printf)("\n");
      }

      /* ... and finally copy the stmt itself to the output.  Except,
         skip the copy of IRCASs; see comments on case Ist_CAS
         above. */
      if (st->tag != Ist_CAS)
         stmt('C', &mce, st);
   }

   /* Now we need to complain if the jump target is undefined. */
   first_stmt = sb_out->stmts_used;

   if (verboze) {
      VG_(printf)("sb_in->next = ");
      ppIRExpr(sb_in->next);
      VG_(printf)("\n\n");
   }

   complainIfUndefined( &mce, sb_in->next, NULL );

   if (0 && verboze) {
      for (j = first_stmt; j < sb_out->stmts_used; j++) {
         VG_(printf)("   ");
         ppIRStmt(sb_out->stmts[j]);
         VG_(printf)("\n");
      }
      VG_(printf)("\n");
   }

   /* If this fails, there's been some serious snafu with tmp management,
      that should be investigated. */
   tl_assert( VG_(sizeXA)( mce.tmpMap ) == mce.sb->tyenv->types_used );
   VG_(deleteXA)( mce.tmpMap );

   tl_assert(mce.sb == sb_out);
   return sb_out;
}


/*------------------------------------------------------------*/
/*--- Post-tree-build final tidying                        ---*/
/*------------------------------------------------------------*/

/* This exploits the observation that Memcheck often produces
   repeated conditional calls of the form

   Dirty G MC_(helperc_value_check0/1/4/8_fail)(UInt otag)

   with the same guard expression G guarding the same helper call.
   The second and subsequent calls are redundant.  This usually
   results from instrumentation of guest code containing multiple
   memory references at different constant offsets from the same base
   register.  After optimisation of the instrumentation, you get a
   test for the definedness of the base register for each memory
   reference, which is kinda pointless.  MC_(final_tidy) therefore
   looks for such repeated calls and removes all but the first. */


/* With some testing on perf/bz2.c, on amd64 and x86, compiled with
   gcc-5.3.1 -O2, it appears that 16 entries in the array are enough to
   get almost all the benefits of this transformation whilst causing
   the slide-back case to trigger just often enough to be verifiably
   correct.  For posterity, the numbers are:

   bz2-32

   1    4,336 (112,212 -> 1,709,473; ratio 15.2)
   2    4,336 (112,194 -> 1,669,895; ratio 14.9)
   3    4,336 (112,194 -> 1,660,713; ratio 14.8)
   4    4,336 (112,194 -> 1,658,555; ratio 14.8)
   5    4,336 (112,194 -> 1,655,447; ratio 14.8)
   6    4,336 (112,194 -> 1,655,101; ratio 14.8)
   7    4,336 (112,194 -> 1,654,858; ratio 14.7)
   8    4,336 (112,194 -> 1,654,810; ratio 14.7)
   10   4,336 (112,194 -> 1,654,621; ratio 14.7)
   12   4,336 (112,194 -> 1,654,678; ratio 14.7)
   16   4,336 (112,194 -> 1,654,494; ratio 14.7)
   32   4,336 (112,194 -> 1,654,602; ratio 14.7)
   inf  4,336 (112,194 -> 1,654,602; ratio 14.7)

   bz2-64

   1    4,113 (107,329 -> 1,822,171; ratio 17.0)
   2    4,113 (107,329 -> 1,806,443; ratio 16.8)
   3    4,113 (107,329 -> 1,803,967; ratio 16.8)
   4    4,113 (107,329 -> 1,802,785; ratio 16.8)
   5    4,113 (107,329 -> 1,802,412; ratio 16.8)
   6    4,113 (107,329 -> 1,802,062; ratio 16.8)
   7    4,113 (107,329 -> 1,801,976; ratio 16.8)
   8    4,113 (107,329 -> 1,801,886; ratio 16.8)
   10   4,113 (107,329 -> 1,801,653; ratio 16.8)
   12   4,113 (107,329 -> 1,801,526; ratio 16.8)
   16   4,113 (107,329 -> 1,801,298; ratio 16.8)
   32   4,113 (107,329 -> 1,800,827; ratio 16.8)
   inf  4,113 (107,329 -> 1,800,827; ratio 16.8)
*/

/* Structs for recording which (helper, guard) pairs we have already
   seen. */

#define N_TIDYING_PAIRS 16

typedef
   struct { void* entry; IRExpr* guard; }
   Pair;

typedef
   struct {
      Pair pairs[N_TIDYING_PAIRS +1/*for bounds checking*/];
      UInt pairsUsed;
   }
   Pairs;

/* Return True if e1 and e2 definitely denote the same value (used to
   compare guards).  Return False if unknown; False is the safe
   answer.  Since guest registers and guest memory do not have the
   SSA property we must return False if any Gets or Loads appear in
   the expression.  This implicitly assumes that e1 and e2 have the
   same IR type, which is always true here -- the type is Ity_I1. */

static Bool sameIRValue ( IRExpr* e1, IRExpr* e2 )
{
   if (e1->tag != e2->tag)
      return False;
   switch (e1->tag) {
      case Iex_Const:
         return eqIRConst( e1->Iex.Const.con, e2->Iex.Const.con );
      case Iex_Binop:
         return e1->Iex.Binop.op == e2->Iex.Binop.op
                && sameIRValue(e1->Iex.Binop.arg1, e2->Iex.Binop.arg1)
                && sameIRValue(e1->Iex.Binop.arg2, e2->Iex.Binop.arg2);
      case Iex_Unop:
         return e1->Iex.Unop.op == e2->Iex.Unop.op
                && sameIRValue(e1->Iex.Unop.arg, e2->Iex.Unop.arg);
      case Iex_RdTmp:
         return e1->Iex.RdTmp.tmp == e2->Iex.RdTmp.tmp;
      case Iex_ITE:
         return sameIRValue( e1->Iex.ITE.cond, e2->Iex.ITE.cond )
                && sameIRValue( e1->Iex.ITE.iftrue, e2->Iex.ITE.iftrue )
                && sameIRValue( e1->Iex.ITE.iffalse, e2->Iex.ITE.iffalse );
      case Iex_Qop:
      case Iex_Triop:
      case Iex_CCall:
         /* be lazy.  Could define equality for these, but they never
            appear to be used. */
         return False;
      case Iex_Get:
      case Iex_GetI:
      case Iex_Load:
         /* be conservative - these may not give the same value each
            time */
         return False;
      case Iex_Binder:
         /* should never see this */
         /* fallthrough */
      default:
         VG_(printf)("mc_translate.c: sameIRValue: unhandled: ");
         ppIRExpr(e1);
         VG_(tool_panic)("memcheck:sameIRValue");
         return False;
   }
}

/* See if 'pairs' already has an entry for (entry, guard).  Return
   True if so.  If not, add an entry. */

static
Bool check_or_add ( Pairs* tidyingEnv, IRExpr* guard, void* entry )
{
   UInt i, n = tidyingEnv->pairsUsed;
   tl_assert(n <= N_TIDYING_PAIRS);
   for (i = 0; i < n; i++) {
      if (tidyingEnv->pairs[i].entry == entry
          && sameIRValue(tidyingEnv->pairs[i].guard, guard))
         return True;
   }
   /* (guard, entry) wasn't found in the array.  Add it at the end.
      If the array is already full, slide the entries one slot
      backwards.  This means we will lose the ability to detect
      duplicates from the pair in slot zero, but that happens so
      rarely that it's unlikely to have much effect on overall code
      quality.  Also, this strategy evicts the oldest tracked pair
      (memory reference, basically), which is (I'd guess) the one
      least likely to be re-used after this point. */
   tl_assert(i == n);
   if (n == N_TIDYING_PAIRS) {
      for (i = 1; i < N_TIDYING_PAIRS; i++) {
         tidyingEnv->pairs[i-1] = tidyingEnv->pairs[i];
      }
      tidyingEnv->pairs[N_TIDYING_PAIRS-1].entry = entry;
      tidyingEnv->pairs[N_TIDYING_PAIRS-1].guard = guard;
   } else {
      tl_assert(n < N_TIDYING_PAIRS);
      tidyingEnv->pairs[n].entry = entry;
      tidyingEnv->pairs[n].guard = guard;
      n++;
      tidyingEnv->pairsUsed = n;
   }
   return False;
}

static Bool is_helperc_value_checkN_fail ( const HChar* name )
{
   /* This is expensive because it happens a lot.  We are checking to
      see whether |name| is one of the following 8 strings:

         MC_(helperc_value_check8_fail_no_o)
         MC_(helperc_value_check4_fail_no_o)
         MC_(helperc_value_check0_fail_no_o)
         MC_(helperc_value_check1_fail_no_o)
         MC_(helperc_value_check8_fail_w_o)
         MC_(helperc_value_check0_fail_w_o)
         MC_(helperc_value_check1_fail_w_o)
         MC_(helperc_value_check4_fail_w_o)

      To speed it up, check the common prefix just once, rather than
      all 8 times.
   */
   const HChar* prefix = "MC_(helperc_value_check";

   HChar n, p;
   while (True) {
      n = *name;
      p = *prefix;
      if (p == 0) break; /* ran off the end of the prefix */
      /* We still have some prefix to use */
      if (n == 0) return False; /* have prefix, but name ran out */
      if (n != p) return False; /* have both pfx and name, but no match */
      name++;
      prefix++;
   }

   /* Check the part after the prefix. */
   tl_assert(*prefix == 0 && *name != 0);
   return    0==VG_(strcmp)(name, "8_fail_no_o)")
          || 0==VG_(strcmp)(name, "4_fail_no_o)")
          || 0==VG_(strcmp)(name, "0_fail_no_o)")
          || 0==VG_(strcmp)(name, "1_fail_no_o)")
          || 0==VG_(strcmp)(name, "8_fail_w_o)")
          || 0==VG_(strcmp)(name, "4_fail_w_o)")
          || 0==VG_(strcmp)(name, "0_fail_w_o)")
          || 0==VG_(strcmp)(name, "1_fail_w_o)");
}

IRSB* MC_(final_tidy) ( IRSB* sb_in )
{
   Int       i;
   IRStmt*   st;
   IRDirty*  di;
   IRExpr*   guard;
   IRCallee* cee;
   Bool      alreadyPresent;
   Pairs     pairs;

   pairs.pairsUsed = 0;

   pairs.pairs[N_TIDYING_PAIRS].entry = (void*)0x123;
   pairs.pairs[N_TIDYING_PAIRS].guard = (IRExpr*)0x456;

   /* Scan forwards through the statements.  Each time a call to one
      of the relevant helpers is seen, check if we have made a
      previous call to the same helper using the same guard
      expression, and if so, delete the call. */
   for (i = 0; i < sb_in->stmts_used; i++) {
      st = sb_in->stmts[i];
      tl_assert(st);
      if (st->tag != Ist_Dirty)
         continue;
      di = st->Ist.Dirty.details;
      guard = di->guard;
      tl_assert(guard);
      if (0) { ppIRExpr(guard); VG_(printf)("\n"); }
      cee = di->cee;
      if (!is_helperc_value_checkN_fail( cee->name ))
         continue;
      /* Ok, we have a call to helperc_value_check0/1/4/8_fail with
         guard 'guard'.  Check if we have already seen a call to this
         function with the same guard.  If so, delete it.  If not,
         add it to the set of calls we do know about. */
      alreadyPresent = check_or_add( &pairs, guard, cee->addr );
      if (alreadyPresent) {
         sb_in->stmts[i] = IRStmt_NoOp();
         if (0) VG_(printf)("XX\n");
      }
   }

   tl_assert(pairs.pairs[N_TIDYING_PAIRS].entry == (void*)0x123);
   tl_assert(pairs.pairs[N_TIDYING_PAIRS].guard == (IRExpr*)0x456);

   return sb_in;
}

#undef N_TIDYING_PAIRS


/*------------------------------------------------------------*/
/*--- Origin tracking stuff                                ---*/
/*------------------------------------------------------------*/

/* Almost identical to findShadowTmpV. */
static IRTemp findShadowTmpB ( MCEnv* mce, IRTemp orig )
{
   TempMapEnt* ent;
   /* VG_(indexXA) range-checks 'orig', hence no need to check
      here. */
   ent = (TempMapEnt*)VG_(indexXA)( mce->tmpMap, (Word)orig );
   tl_assert(ent->kind == Orig);
   if (ent->shadowB == IRTemp_INVALID) {
      IRTemp tmpB
         = newTemp( mce, Ity_I32, BSh );
      /* newTemp may cause mce->tmpMap to resize, hence previous results
         from VG_(indexXA) are invalid. */
      ent = (TempMapEnt*)VG_(indexXA)( mce->tmpMap, (Word)orig );
      tl_assert(ent->kind == Orig);
      tl_assert(ent->shadowB == IRTemp_INVALID);
      ent->shadowB = tmpB;
   }
   return ent->shadowB;
}

static IRAtom* gen_maxU32 ( MCEnv* mce, IRAtom* b1, IRAtom* b2 )
{
   return assignNew( 'B', mce, Ity_I32, binop(Iop_Max32U, b1, b2) );
}


/* Make a guarded origin load, with no special handling in the
   didn't-happen case.  A GUARD of NULL is assumed to mean "always
   True".

   Generate IR to do a shadow origins load from BASEADDR+OFFSET and
   return the otag.  The loaded size is SZB.  If GUARD evaluates to
   False at run time then the returned otag is zero.
*/
static IRAtom* gen_guarded_load_b ( MCEnv* mce, Int szB,
                                    IRAtom* baseaddr,
                                    Int offset, IRExpr* guard )
{
   void*        hFun;
   const HChar* hName;
   IRTemp       bTmp;
   IRDirty*     di;
   IRType       aTy   = typeOfIRExpr( mce->sb->tyenv, baseaddr );
   IROp         opAdd = aTy == Ity_I32 ? Iop_Add32 : Iop_Add64;
   IRAtom*      ea    = baseaddr;
   if (offset != 0) {
      IRAtom* off = aTy == Ity_I32 ? mkU32( offset )
                                   : mkU64( (Long)(Int)offset );
      ea = assignNew( 'B', mce, aTy, binop(opAdd, ea, off));
   }
   bTmp = newTemp(mce, mce->hWordTy, BSh);

   switch (szB) {
      case 1:  hFun  = (void*)&MC_(helperc_b_load1);
               hName = "MC_(helperc_b_load1)";
               break;
      case 2:  hFun  = (void*)&MC_(helperc_b_load2);
               hName = "MC_(helperc_b_load2)";
               break;
      case 4:  hFun  = (void*)&MC_(helperc_b_load4);
               hName = "MC_(helperc_b_load4)";
               break;
      case 8:  hFun  = (void*)&MC_(helperc_b_load8);
               hName = "MC_(helperc_b_load8)";
               break;
      case 16: hFun  = (void*)&MC_(helperc_b_load16);
               hName = "MC_(helperc_b_load16)";
               break;
      case 32: hFun  = (void*)&MC_(helperc_b_load32);
               hName = "MC_(helperc_b_load32)";
               break;
      default:
         VG_(printf)("mc_translate.c: gen_load_b: unhandled szB == %d\n", szB);
         tl_assert(0);
   }
   di = unsafeIRDirty_1_N(
           bTmp, 1/*regparms*/, hName, VG_(fnptr_to_fnentry)( hFun ),
           mkIRExprVec_1( ea )
        );
   if (guard) {
      di->guard = guard;
      /* Ideally the didn't-happen return value here would be
         all-zeroes (unknown-origin), so it'd be harmless if it got
         used inadvertently.  We slum it out with the IR-mandated
         default value (0b01 repeating, 0x55 etc) as that'll probably
         trump all legitimate otags via Max32, and it's pretty
         obviously bogus. */
   }
   /* no need to mess with any annotations.  This call accesses
      neither guest state nor guest memory. */
   stmt( 'B', mce, IRStmt_Dirty(di) );
   if (mce->hWordTy == Ity_I64) {
      /* 64-bit host */
      IRTemp bTmp32 = newTemp(mce, Ity_I32, BSh);
      assign( 'B', mce, bTmp32, unop(Iop_64to32, mkexpr(bTmp)) );
      return mkexpr(bTmp32);
   } else {
      /* 32-bit host */
      return mkexpr(bTmp);
   }
}


/* Generate IR to do a shadow origins load from BASEADDR+OFFSET.  The
   loaded size is SZB.  The load is regarded as unconditional (always
   happens).
*/
static IRAtom* gen_load_b ( MCEnv* mce, Int szB, IRAtom* baseaddr,
                            Int offset )
{
   return gen_guarded_load_b(mce, szB, baseaddr, offset, NULL/*guard*/);
}


/* The most general handler for guarded origin loads.  A GUARD of NULL
   is assumed to mean "always True".

   Generate IR to do a shadow origin load from ADDR+BIAS and return
   the B bits.  The loaded type is TY.  If GUARD evaluates to False at
   run time then the returned B bits are simply BALT instead.
*/
static
IRAtom* expr2ori_Load_guarded_General ( MCEnv* mce,
                                        IRType ty,
                                        IRAtom* addr, UInt bias,
                                        IRAtom* guard, IRAtom* balt )
{
   /* If the guard evaluates to True, this will hold the loaded
      origin.  If the guard evaluates to False, this will be zero,
      meaning "unknown origin", in which case we will have to replace
      it using an ITE below. */
   IRAtom* iftrue
      = assignNew('B', mce, Ity_I32,
                  gen_guarded_load_b(mce, sizeofIRType(ty),
                                     addr, bias, guard));
   /* These are the bits we will return if the load doesn't take
      place. */
   IRAtom* iffalse
      = balt;
   /* Prepare the cond for the ITE.  Convert a NULL cond into
      something that iropt knows how to fold out later. */
   IRAtom* cond
      = guard == NULL  ? mkU1(1)  : guard;
   /* And assemble the final result. */
   return assignNew('B', mce, Ity_I32, IRExpr_ITE(cond, iftrue, iffalse));
}


/* Generate a shadow origins store.  guard :: Ity_I1 controls whether
   the store really happens; NULL means it unconditionally does. */
static void gen_store_b ( MCEnv* mce, Int szB,
                          IRAtom* baseaddr, Int offset, IRAtom* dataB,
                          IRAtom* guard )
{
   void*        hFun;
   const HChar* hName;
   IRDirty*     di;
   IRType       aTy   = typeOfIRExpr( mce->sb->tyenv, baseaddr );
   IROp         opAdd = aTy == Ity_I32 ? Iop_Add32 : Iop_Add64;
   IRAtom*      ea    = baseaddr;
   if (guard) {
      tl_assert(isOriginalAtom(mce, guard));
      tl_assert(typeOfIRExpr(mce->sb->tyenv, guard) == Ity_I1);
   }
   if (offset != 0) {
      IRAtom* off = aTy == Ity_I32 ? mkU32( offset )
                                   : mkU64( (Long)(Int)offset );
      ea = assignNew( 'B', mce, aTy, binop(opAdd, ea, off));
   }
   if (mce->hWordTy == Ity_I64)
      dataB = assignNew( 'B', mce, Ity_I64, unop(Iop_32Uto64, dataB));

   switch (szB) {
      case 1:  hFun  = (void*)&MC_(helperc_b_store1);
               hName = "MC_(helperc_b_store1)";
               break;
      case 2:  hFun  = (void*)&MC_(helperc_b_store2);
               hName = "MC_(helperc_b_store2)";
               break;
      case 4:  hFun  = (void*)&MC_(helperc_b_store4);
               hName = "MC_(helperc_b_store4)";
               break;
      case 8:  hFun  = (void*)&MC_(helperc_b_store8);
               hName = "MC_(helperc_b_store8)";
               break;
      case 16: hFun  = (void*)&MC_(helperc_b_store16);
               hName = "MC_(helperc_b_store16)";
               break;
      case 32: hFun  = (void*)&MC_(helperc_b_store32);
               hName = "MC_(helperc_b_store32)";
               break;
      default:
         tl_assert(0);
   }
   di = unsafeIRDirty_0_N( 2/*regparms*/,
                           hName, VG_(fnptr_to_fnentry)( hFun ),
                           mkIRExprVec_2( ea, dataB )
                         );
   /* no need to mess with any annotations.  This call accesses
      neither guest state nor guest memory. */
   if (guard) di->guard = guard;
   stmt( 'B', mce, IRStmt_Dirty(di) );
}

static IRAtom* narrowTo32 ( MCEnv* mce, IRAtom* e ) {
   IRType eTy = typeOfIRExpr(mce->sb->tyenv, e);
   if (eTy == Ity_I64)
      return assignNew( 'B', mce, Ity_I32, unop(Iop_64to32, e) );
   if (eTy == Ity_I32)
      return e;
   tl_assert(0);
}

static IRAtom* zWidenFrom32 ( MCEnv* mce, IRType dstTy, IRAtom* e ) {
   IRType eTy = typeOfIRExpr(mce->sb->tyenv, e);
   tl_assert(eTy == Ity_I32);
   if (dstTy == Ity_I64)
      return assignNew( 'B', mce, Ity_I64, unop(Iop_32Uto64, e) );
   tl_assert(0);
}


static IRAtom* schemeE ( MCEnv* mce, IRExpr* e )
{
   tl_assert(MC_(clo_mc_level) == 3);

   switch (e->tag) {

      case Iex_GetI: {
         IRRegArray* descr_b;
         IRAtom *t1, *t2, *t3, *t4;
         IRRegArray* descr = e->Iex.GetI.descr;
         IRType equivIntTy
            = MC_(get_otrack_reg_array_equiv_int_type)(descr);
         /* If this array is unshadowable for whatever reason, use the
            usual approximation. */
         if (equivIntTy == Ity_INVALID)
            return mkU32(0);
         tl_assert(sizeofIRType(equivIntTy) >= 4);
         tl_assert(sizeofIRType(equivIntTy) == sizeofIRType(descr->elemTy));
         descr_b = mkIRRegArray( descr->base + 2*mce->layout->total_sizeB,
                                 equivIntTy, descr->nElems );
         /* Do a shadow indexed get of the same size, giving t1.  Take
            the bottom 32 bits of it, giving t2.  Compute into t3 the
            origin for the index (almost certainly zero, but there's
            no harm in being completely general here, since iropt will
            remove any useless code), and fold it in, giving a final
            value t4. */
         t1 = assignNew( 'B', mce, equivIntTy,
                         IRExpr_GetI( descr_b, e->Iex.GetI.ix,
                                      e->Iex.GetI.bias ));
         t2 = narrowTo32( mce, t1 );
         t3 = schemeE( mce, e->Iex.GetI.ix );
         t4 = gen_maxU32( mce, t2, t3 );
         return t4;
      }
      case Iex_CCall: {
         Int i;
         IRAtom*  here;
         IRExpr** args = e->Iex.CCall.args;
         IRAtom*  curr = mkU32(0);
         for (i = 0; args[i]; i++) {
            tl_assert(i < 32);
            tl_assert(isOriginalAtom(mce, args[i]));
            /* Only take notice of this arg if the callee's
               mc-exclusion mask does not say it is to be excluded. */
            if (e->Iex.CCall.cee->mcx_mask & (1<<i)) {
               /* the arg is to be excluded from definedness checking.
                  Do nothing. */
               if (0) VG_(printf)("excluding %s(%d)\n",
                                  e->Iex.CCall.cee->name, i);
            } else {
               /* calculate the arg's definedness, and pessimistically
                  merge it in. */
               here = schemeE( mce, args[i] );
               curr = gen_maxU32( mce, curr, here );
            }
         }
         return curr;
      }
      case Iex_Load: {
         Int dszB;
         dszB = sizeofIRType(e->Iex.Load.ty);
         /* assert that the B value for the address is already
            available (somewhere) */
         tl_assert(isIRAtom(e->Iex.Load.addr));
         tl_assert(mce->hWordTy == Ity_I32 || mce->hWordTy == Ity_I64);
         return gen_load_b( mce, dszB, e->Iex.Load.addr, 0 );
      }
      case Iex_ITE: {
         IRAtom* b1 = schemeE( mce, e->Iex.ITE.cond );
         IRAtom* b3 = schemeE( mce, e->Iex.ITE.iftrue );
         IRAtom* b2 = schemeE( mce, e->Iex.ITE.iffalse );
         return gen_maxU32( mce, b1, gen_maxU32( mce, b2, b3 ));
      }
      case Iex_Qop: {
         IRAtom* b1 = schemeE( mce, e->Iex.Qop.details->arg1 );
         IRAtom* b2 = schemeE( mce, e->Iex.Qop.details->arg2 );
         IRAtom* b3 = schemeE( mce, e->Iex.Qop.details->arg3 );
         IRAtom* b4 = schemeE( mce, e->Iex.Qop.details->arg4 );
         return gen_maxU32( mce, gen_maxU32( mce, b1, b2 ),
                                 gen_maxU32( mce, b3, b4 ) );
      }
      case Iex_Triop: {
         IRAtom* b1 = schemeE( mce, e->Iex.Triop.details->arg1 );
         IRAtom* b2 = schemeE( mce, e->Iex.Triop.details->arg2 );
         IRAtom* b3 = schemeE( mce, e->Iex.Triop.details->arg3 );
         return gen_maxU32( mce, b1, gen_maxU32( mce, b2, b3 ) );
      }
      case Iex_Binop: {
         switch (e->Iex.Binop.op) {
            case Iop_CasCmpEQ8:  case Iop_CasCmpNE8:
            case Iop_CasCmpEQ16: case Iop_CasCmpNE16:
            case Iop_CasCmpEQ32: case Iop_CasCmpNE32:
            case Iop_CasCmpEQ64: case Iop_CasCmpNE64:
               /* Just say these all produce a defined result,
                  regardless of their arguments.  See
                  COMMENT_ON_CasCmpEQ in this file. */
               return mkU32(0);
            default: {
               IRAtom* b1 = schemeE( mce, e->Iex.Binop.arg1 );
               IRAtom* b2 = schemeE( mce, e->Iex.Binop.arg2 );
               return gen_maxU32( mce, b1, b2 );
            }
         }
         tl_assert(0);
         /*NOTREACHED*/
      }
      case Iex_Unop: {
         IRAtom* b1 = schemeE( mce, e->Iex.Unop.arg );
         return b1;
      }
      case Iex_Const:
         return mkU32(0);
      case Iex_RdTmp:
         return mkexpr( findShadowTmpB( mce, e->Iex.RdTmp.tmp ));
      case Iex_Get: {
         Int b_offset = MC_(get_otrack_shadow_offset)(
                           e->Iex.Get.offset,
                           sizeofIRType(e->Iex.Get.ty)
                        );
         tl_assert(b_offset >= -1
                   && b_offset <= mce->layout->total_sizeB -4);
         if (b_offset >= 0) {
            /* FIXME: this isn't an atom! */
            return IRExpr_Get( b_offset + 2*mce->layout->total_sizeB,
                               Ity_I32 );
         }
         return mkU32(0);
      }
      default:
         VG_(printf)("mc_translate.c: schemeE: unhandled: ");
         ppIRExpr(e);
         VG_(tool_panic)("memcheck:schemeE");
   }
}


static void do_origins_Dirty ( MCEnv* mce, IRDirty* d )
{
   // This is a hacked version of do_shadow_Dirty
   Int       i, k, n, toDo, gSz, gOff;
   IRAtom    *here, *curr;
   IRTemp    dst;

   /* First check the guard. */
   curr = schemeE( mce, d->guard );

   /* Now round up all inputs and maxU32 over them. */

   /* Inputs: unmasked args
      Note: arguments are evaluated REGARDLESS of the guard expression */
   for (i = 0; d->args[i]; i++) {
      IRAtom* arg = d->args[i];
      if ( (d->cee->mcx_mask & (1<<i))
           || UNLIKELY(is_IRExpr_VECRET_or_BBPTR(arg)) ) {
         /* ignore this arg */
      } else {
         here = schemeE( mce, arg );
         curr = gen_maxU32( mce, curr, here );
      }
   }

   /* Inputs: guest state that we read. */
   for (i = 0; i < d->nFxState; i++) {
      tl_assert(d->fxState[i].fx != Ifx_None);
      if (d->fxState[i].fx == Ifx_Write)
         continue;

      /* Enumerate the described state segments */
      for (k = 0; k < 1 + d->fxState[i].nRepeats; k++) {
         gOff = d->fxState[i].offset + k * d->fxState[i].repeatLen;
         gSz  = d->fxState[i].size;

         /* Ignore any sections marked as 'always defined'. */
         if (isAlwaysDefd(mce, gOff, gSz)) {
            if (0)
               VG_(printf)("memcheck: Dirty gst: ignored off %d, sz %d\n",
                           gOff, gSz);
            continue;
         }

         /* This state element is read or modified.  So we need to
            consider it.  If larger than 4 bytes, deal with it in
            4-byte chunks. */
         while (True) {
            Int b_offset;
            tl_assert(gSz >= 0);
            if (gSz == 0) break;
            n = gSz <= 4 ? gSz : 4;
            /* update 'curr' with maxU32 of the state slice
               gOff .. gOff+n-1 */
            b_offset = MC_(get_otrack_shadow_offset)(gOff, 4);
            if (b_offset != -1) {
               /* Observe the guard expression.  If it is false use 0,
                  i.e. nothing is known about the origin. */
               IRAtom *cond, *iffalse, *iftrue;

               cond    = assignNew( 'B', mce, Ity_I1, d->guard);
               iffalse = mkU32(0);
               iftrue  = assignNew( 'B', mce, Ity_I32,
                                    IRExpr_Get(b_offset
                                                  + 2*mce->layout->total_sizeB,
                                               Ity_I32));
               here    = assignNew( 'B', mce, Ity_I32,
                                    IRExpr_ITE(cond, iftrue, iffalse));
               curr = gen_maxU32( mce, curr, here );
            }
            gSz -= n;
            gOff += n;
         }
      }
   }

   /* Inputs: memory */

   if (d->mFx != Ifx_None) {
      /* Because we may do multiple shadow loads/stores from the same
         base address, it's best to do a single test of its
         definedness right now.  Post-instrumentation optimisation
         should remove all but this test. */
      tl_assert(d->mAddr);
      here = schemeE( mce, d->mAddr );
      curr = gen_maxU32( mce, curr, here );
   }

   /* Deal with memory inputs (reads or modifies) */
   if (d->mFx == Ifx_Read || d->mFx == Ifx_Modify) {
      toDo   = d->mSize;
      /* chew off 32-bit chunks.  We don't care about the endianness
         since it's all going to be condensed down to a single bit,
         but nevertheless choose an endianness which is hopefully
         native to the platform. */
      while (toDo >= 4) {
         here = gen_guarded_load_b( mce, 4, d->mAddr, d->mSize - toDo,
                                    d->guard );
         curr = gen_maxU32( mce, curr, here );
         toDo -= 4;
      }
      /* handle possible 16-bit excess */
      while (toDo >= 2) {
         here = gen_guarded_load_b( mce, 2, d->mAddr, d->mSize - toDo,
                                    d->guard );
         curr = gen_maxU32( mce, curr, here );
         toDo -= 2;
      }
      /* chew off the remaining 8-bit chunk, if any */
      if (toDo == 1) {
         here = gen_guarded_load_b( mce, 1, d->mAddr, d->mSize - toDo,
                                    d->guard );
         curr = gen_maxU32( mce, curr, here );
         toDo -= 1;
      }
      tl_assert(toDo == 0);
   }

   /* Whew!  So curr is a 32-bit B-value which should give an origin
      of some use if any of the inputs to the helper are undefined.
      Now we need to re-distribute the results to all destinations. */

   /* Outputs: the destination temporary, if there is one. */
   if (d->tmp != IRTemp_INVALID) {
      dst = findShadowTmpB(mce, d->tmp);
      assign( 'V', mce, dst, curr );
   }

   /* Outputs: guest state that we write or modify. */
   for (i = 0; i < d->nFxState; i++) {
      tl_assert(d->fxState[i].fx != Ifx_None);
      if (d->fxState[i].fx == Ifx_Read)
         continue;

      /* Enumerate the described state segments */
      for (k = 0; k < 1 + d->fxState[i].nRepeats; k++) {
         gOff = d->fxState[i].offset + k * d->fxState[i].repeatLen;
         gSz  = d->fxState[i].size;

         /* Ignore any sections marked as 'always defined'. */
         if (isAlwaysDefd(mce, gOff, gSz))
            continue;

         /* This state element is written or modified.  So we need to
            consider it.  If larger than 4 bytes, deal with it in
            4-byte chunks. */
         while (True) {
            Int b_offset;
            tl_assert(gSz >= 0);
            if (gSz == 0) break;
            n = gSz <= 4 ? gSz : 4;
            /* Write 'curr' to the state slice gOff .. gOff+n-1 */
            b_offset = MC_(get_otrack_shadow_offset)(gOff, 4);
            if (b_offset != -1) {

               /* If the guard expression evaluates to false we simply Put
                  the value that is already stored in the guest state slot */
               IRAtom *cond, *iffalse;

               cond    = assignNew('B', mce, Ity_I1,
                                   d->guard);
               iffalse = assignNew('B', mce, Ity_I32,
                                   IRExpr_Get(b_offset +
                                              2*mce->layout->total_sizeB,
                                              Ity_I32));
               curr = assignNew('V', mce, Ity_I32,
                                IRExpr_ITE(cond, curr, iffalse));

               stmt( 'B', mce, IRStmt_Put(b_offset
                                             + 2*mce->layout->total_sizeB,
                                          curr ));
            }
            gSz -= n;
            gOff += n;
         }
      }
   }

   /* Outputs: memory that we write or modify.  Same comments about
      endianness as above apply. */
   if (d->mFx == Ifx_Write || d->mFx == Ifx_Modify) {
      toDo   = d->mSize;
      /* chew off 32-bit chunks */
      while (toDo >= 4) {
         gen_store_b( mce, 4, d->mAddr, d->mSize - toDo, curr,
                      d->guard );
         toDo -= 4;
      }
      /* handle possible 16-bit excess */
      while (toDo >= 2) {
         gen_store_b( mce, 2, d->mAddr, d->mSize - toDo, curr,
                      d->guard );
         toDo -= 2;
      }
      /* chew off the remaining 8-bit chunk, if any */
      if (toDo == 1) {
         gen_store_b( mce, 1, d->mAddr, d->mSize - toDo, curr,
                      d->guard );
         toDo -= 1;
      }
      tl_assert(toDo == 0);
   }
}


/* Generate IR for origin shadowing for a general guarded store. */
static void do_origins_Store_guarded ( MCEnv* mce,
                                       IREndness stEnd,
                                       IRExpr* stAddr,
                                       IRExpr* stData,
                                       IRExpr* guard )
{
   Int     dszB;
   IRAtom* dataB;
   /* assert that the B value for the address is already available
      (somewhere), since the call to schemeE will want to see it.
      XXXX how does this actually ensure that?? */
   tl_assert(isIRAtom(stAddr));
   tl_assert(isIRAtom(stData));
   dszB  = sizeofIRType( typeOfIRExpr(mce->sb->tyenv, stData ) );
   dataB = schemeE( mce, stData );
   gen_store_b( mce, dszB, stAddr, 0/*offset*/, dataB, guard );
}


/* Generate IR for origin shadowing for a plain store. */
static void do_origins_Store_plain ( MCEnv* mce,
                                     IREndness stEnd,
                                     IRExpr* stAddr,
                                     IRExpr* stData )
{
   do_origins_Store_guarded ( mce, stEnd, stAddr, stData,
                              NULL/*guard*/ );
}


/* ---- Dealing with LoadG/StoreG (not entirely simple) ---- */

static void do_origins_StoreG ( MCEnv* mce, IRStoreG* sg )
{
   do_origins_Store_guarded( mce, sg->end, sg->addr,
                             sg->data, sg->guard );
}

static void do_origins_LoadG ( MCEnv* mce, IRLoadG* lg )
{
   IRType loadedTy = Ity_INVALID;
   switch (lg->cvt) {
      case ILGop_IdentV128: loadedTy = Ity_V128; break;
      case ILGop_Ident64:   loadedTy = Ity_I64;  break;
      case ILGop_Ident32:   loadedTy = Ity_I32;  break;
      case ILGop_16Uto32:   loadedTy = Ity_I16;  break;
      case ILGop_16Sto32:   loadedTy = Ity_I16;  break;
      case ILGop_8Uto32:    loadedTy = Ity_I8;   break;
      case ILGop_8Sto32:    loadedTy = Ity_I8;   break;
      default: VG_(tool_panic)("schemeS.IRLoadG");
   }
   IRAtom* ori_alt
      = schemeE( mce, lg->alt );
   IRAtom* ori_final
      = expr2ori_Load_guarded_General(mce, loadedTy,
                                      lg->addr, 0/*addr bias*/,
                                      lg->guard, ori_alt );
   /* And finally, bind the origin to the destination temporary. */
   assign( 'B', mce, findShadowTmpB(mce, lg->dst), ori_final );
}


static void schemeS ( MCEnv* mce, IRStmt* st )
{
   tl_assert(MC_(clo_mc_level) == 3);

   switch (st->tag) {

      case Ist_AbiHint:
         /* The value-check instrumenter handles this - by arranging
            to pass the address of the next instruction to
            MC_(helperc_MAKE_STACK_UNINIT).  This is all that needs to
            happen for origin tracking w.r.t. AbiHints.  So there is
            nothing to do here. */
         break;

      case Ist_PutI: {
         IRPutI *puti = st->Ist.PutI.details;
         IRRegArray* descr_b;
         IRAtom      *t1, *t2, *t3, *t4;
         IRRegArray* descr = puti->descr;
         IRType equivIntTy
            = MC_(get_otrack_reg_array_equiv_int_type)(descr);
         /* If this array is unshadowable for whatever reason,
            generate no code. */
         if (equivIntTy == Ity_INVALID)
            break;
         tl_assert(sizeofIRType(equivIntTy) >= 4);
         tl_assert(sizeofIRType(equivIntTy) == sizeofIRType(descr->elemTy));
         descr_b
            = mkIRRegArray( descr->base + 2*mce->layout->total_sizeB,
                            equivIntTy, descr->nElems );
         /* Compute a value to Put - the conjoinment of the origin for
            the data to be Put-ted (obviously) and of the index value
            (not so obviously). */
         t1 = schemeE( mce, puti->data );
         t2 = schemeE( mce, puti->ix );
         t3 = gen_maxU32( mce, t1, t2 );
         t4 = zWidenFrom32( mce, equivIntTy, t3 );
         stmt( 'B', mce, IRStmt_PutI( mkIRPutI(descr_b, puti->ix,
                                               puti->bias, t4) ));
         break;
      }

      case Ist_Dirty:
         do_origins_Dirty( mce, st->Ist.Dirty.details );
         break;

      case Ist_Store:
         do_origins_Store_plain( mce, st->Ist.Store.end,
                                      st->Ist.Store.addr,
                                      st->Ist.Store.data );
         break;

      case Ist_StoreG:
         do_origins_StoreG( mce, st->Ist.StoreG.details );
         break;

      case Ist_LoadG:
         do_origins_LoadG( mce, st->Ist.LoadG.details );
         break;

      case Ist_LLSC: {
         /* In short: treat a load-linked like a normal load followed
            by an assignment of the loaded (shadow) data to the result
            temporary.  Treat a store-conditional like a normal store,
            and mark the result temporary as defined. */
         if (st->Ist.LLSC.storedata == NULL) {
            /* Load Linked */
            IRType resTy
               = typeOfIRTemp(mce->sb->tyenv, st->Ist.LLSC.result);
            IRExpr* vanillaLoad
               = IRExpr_Load(st->Ist.LLSC.end, resTy, st->Ist.LLSC.addr);
            tl_assert(resTy == Ity_I64 || resTy == Ity_I32
                      || resTy == Ity_I16 || resTy == Ity_I8);
            assign( 'B', mce, findShadowTmpB(mce, st->Ist.LLSC.result),
                              schemeE(mce, vanillaLoad));
         } else {
            /* Store conditional */
            do_origins_Store_plain( mce, st->Ist.LLSC.end,
                                    st->Ist.LLSC.addr,
                                    st->Ist.LLSC.storedata );
            /* For the rationale behind this, see comments at the
               place where the V-shadow for .result is constructed, in
               do_shadow_LLSC.  In short, we regard .result as
               always-defined. */
            assign( 'B', mce, findShadowTmpB(mce, st->Ist.LLSC.result),
                              mkU32(0) );
         }
         break;
      }

      case Ist_Put: {
         Int b_offset
            = MC_(get_otrack_shadow_offset)(
                 st->Ist.Put.offset,
                 sizeofIRType(typeOfIRExpr(mce->sb->tyenv, st->Ist.Put.data))
              );
         if (b_offset >= 0) {
            /* FIXME: this isn't an atom! */
            stmt( 'B', mce, IRStmt_Put(b_offset + 2*mce->layout->total_sizeB,
                                       schemeE( mce, st->Ist.Put.data )) );
         }
         break;
      }

      case Ist_WrTmp:
         assign( 'B', mce, findShadowTmpB(mce, st->Ist.WrTmp.tmp),
                           schemeE(mce, st->Ist.WrTmp.data) );
         break;

      case Ist_MBE:
      case Ist_NoOp:
      case Ist_Exit:
      case Ist_IMark:
         break;

      default:
         VG_(printf)("mc_translate.c: schemeS: unhandled: ");
         ppIRStmt(st);
         VG_(tool_panic)("memcheck:schemeS");
   }
}


/*------------------------------------------------------------*/
/*--- Startup assertion checking                           ---*/
/*------------------------------------------------------------*/

void MC_(do_instrumentation_startup_checks)( void )
{
   /* Make a best-effort check to see that is_helperc_value_checkN_fail
      is working as we expect. */

#  define CHECK(_expected, _string) \
      tl_assert((_expected) == is_helperc_value_checkN_fail(_string))

   /* It should identify these 8, and no others, as targets. */
   CHECK(True, "MC_(helperc_value_check8_fail_no_o)");
   CHECK(True, "MC_(helperc_value_check4_fail_no_o)");
   CHECK(True, "MC_(helperc_value_check0_fail_no_o)");
   CHECK(True, "MC_(helperc_value_check1_fail_no_o)");
   CHECK(True, "MC_(helperc_value_check8_fail_w_o)");
   CHECK(True, "MC_(helperc_value_check0_fail_w_o)");
   CHECK(True, "MC_(helperc_value_check1_fail_w_o)");
   CHECK(True, "MC_(helperc_value_check4_fail_w_o)");

   /* Ad-hoc selection of other strings gathered via a quick test. */
   CHECK(False, "amd64g_dirtyhelper_CPUID_avx2");
   CHECK(False, "amd64g_dirtyhelper_RDTSC");
   CHECK(False, "MC_(helperc_b_load1)");
   CHECK(False, "MC_(helperc_b_load2)");
   CHECK(False, "MC_(helperc_b_load4)");
   CHECK(False, "MC_(helperc_b_load8)");
   CHECK(False, "MC_(helperc_b_load16)");
   CHECK(False, "MC_(helperc_b_load32)");
   CHECK(False, "MC_(helperc_b_store1)");
   CHECK(False, "MC_(helperc_b_store2)");
   CHECK(False, "MC_(helperc_b_store4)");
   CHECK(False, "MC_(helperc_b_store8)");
   CHECK(False, "MC_(helperc_b_store16)");
   CHECK(False, "MC_(helperc_b_store32)");
   CHECK(False, "MC_(helperc_LOADV8)");
   CHECK(False, "MC_(helperc_LOADV16le)");
   CHECK(False, "MC_(helperc_LOADV32le)");
   CHECK(False, "MC_(helperc_LOADV64le)");
   CHECK(False, "MC_(helperc_LOADV128le)");
   CHECK(False, "MC_(helperc_LOADV256le)");
   CHECK(False, "MC_(helperc_STOREV16le)");
   CHECK(False, "MC_(helperc_STOREV32le)");
   CHECK(False, "MC_(helperc_STOREV64le)");
   CHECK(False, "MC_(helperc_STOREV8)");
   CHECK(False, "track_die_mem_stack_8");
   CHECK(False, "track_new_mem_stack_8_w_ECU");
   CHECK(False, "MC_(helperc_MAKE_STACK_UNINIT_w_o)");
   CHECK(False, "VG_(unknown_SP_update_w_ECU)");

#  undef CHECK
}


/*--------------------------------------------------------------------*/
/*--- end                                           mc_translate.c ---*/
/*--------------------------------------------------------------------*/