- 2a6411b Reduce code duplication on the TLS implementation. by Rafael Espindola · 15 years ago
- 236aa8a ADDS{D|S}rr_Int and MULS{D|S}rr_Int are not commutable. The users of these intrinsics expect the high bits will not be modified. by Evan Cheng · 15 years ago
- b9a47b8 Generate better code for v8i16 shuffles on SSE2 by Nate Begeman · 15 years ago
- 1d76864 Handle llvm.x86.sse2.maskmov.dqu in 64-bit. by Evan Cheng · 15 years ago
- b3379fb A few more isAsCheapAsAMove. by Evan Cheng · 16 years ago
- 1632782 The memory alignment requirement on some of the mov{h|l}p{d|s} patterns is 16 bytes. That is overly strict. These instructions read / write f64 memory locations without alignment requirement. by Evan Cheng · 16 years ago
- b134709 Whitespace and other minor adjustments to make SSE instructions have by Dan Gohman · 16 years ago
- af9b952 Fixed x86 code generation of multiply for v2i64. It was incorrect for SSE4.1. by Mon P Wang · 16 years ago
- 15511cf Rename isSimpleLoad to canFoldAsLoad, to better reflect its meaning. by Dan Gohman · 16 years ago
- 62c939d Mark x86's V_SET0 and V_SETALLONES with isSimpleLoad, and teach X86's by Dan Gohman · 16 years ago
- 4b299d4 Fix lfence and mfence encoding. These look like MRM5r and MRM6r instructions except they do not have any operands. The RegModRM byte is encoded with register number 0. by Evan Cheng · 16 years ago
- a7250dd Fix the predicate for memop64 to be a regular load, not just an unindexed load. by Dan Gohman · 16 years ago
- 3358629 Now that predicates can be composed, simplify several of by Dan Gohman · 16 years ago
- e397acc Fix SSE4.1 roundss, roundsd. While the instructions have by Dale Johannesen · 16 years ago
- ae436ce Certain patterns involving the "movss" instruction were marked as requiring SSE2, when in reality movss is an SSE1 instruction. by Anders Carlsson · 16 years ago
- 5e249b4 The original bug was a complaint that _mm_srli_si128 mis-compiled when passed by Bill Wendling · 16 years ago
- b7a75a5 Implement "punpckldq %xmm0, $xmm0" as "pshufd $0x50, %xmm0, %xmm" unless optimizing for code size. by Evan Cheng · 16 years ago
- c739489 unpckhps requires sse1, punpckhdq requires sse2. by Evan Cheng · 16 years ago
- 0b457f0 With sse3, and when the source is a load or has multiple uses, favor movddup over shufp*, pshufd, etc. Without sse3 or when the source is from a register, make use of movlhps by Evan Cheng · 16 years ago
- 89d4a28 pmovsxbq etc. requires sse4.1. by Evan Cheng · 16 years ago
- ca57f78 Fix patterns for SSE4.1 move and sign extend instructions. Also add instructions which fold VZEXT_MOVL and VZEXT_LOAD. by Evan Cheng · 16 years ago
- f5aeb1a Rename ConstantSDNode::getValue to getZExtValue, for consistency by Dan Gohman · 16 years ago
- d0c0fae Fix for PR2687: Add patterns to match sint_to_fp and fp_to_sint for <2 x by Eli Friedman · 16 years ago
- 66e1315 FsFLD0S{S|D} and V_SETALLONES are as cheap as moves. by Evan Cheng · 16 years ago
- 67ca6be Tablegen generated code already tests the opcode value, so it's not by Dan Gohman · 16 years ago
- d9ced09 Add an EXTRACTPSmr pattern to match the pattern that X86ISelLowering creates. by Dan Gohman · 16 years ago
- e9d5035 Fix PR2620: Fix X86cmppd selection code so it expects operands to be v2f64. by Evan Cheng · 16 years ago
- e99b255 Fix a typo in last commit by Nate Begeman · 16 years ago
- 30a0de9 SSE codegen for vsetcc nodes by Nate Begeman · 16 years ago
- 331e2bd Fix for PR2472. Use movss to set lower 32-bits of a zero XMM vector. by Evan Cheng · 16 years ago
- 4e44443 Horizontal-add instructions are not commutative. by Evan Cheng · 16 years ago
- 35b9a77 mpsadbw is commutable. by Evan Cheng · 16 years ago
- d4b9c17 Disable some DAG combiner optimizations that may be by Duncan Sands · 16 years ago
- f26ffe9 Implement vector shift up / down and insert zero with ps{rl}lq / ps{rl}ldq. by Evan Cheng · 16 years ago
- c2ecdc5 Fix the encoding for two more "rm" instructions that were using MRMSrcReg. by Dan Gohman · 16 years ago
- bfbbd4d Fixed X86 encoding error CVTPS2PD and CVTPD2PS when the source operand by Mon P Wang · 16 years ago
- a315939 Eliminate x86.sse2.punpckh.qdq and x86.sse2.punpckl.qdq. by Evan Cheng · 16 years ago
- e716bb1 Eliminate x86.sse2.movs.d, x86.sse2.shuf.pd, x86.sse2.unpckh.pd, and x86.sse2.unpckl.pd intrinsics. These will be lowered into shuffles. by Evan Cheng · 16 years ago
- 999dbe6 Remove x86.sse2.loadh.pd and x86.sse2.loadl.pd. These will be lowered into load and shuffle instructions. by Evan Cheng · 16 years ago
- cd0baf2 Use movlps / movhps to modify low / high half of 16-byte memory location. by Evan Cheng · 16 years ago
- 50f778d Fix a duplicated pattern. by Evan Cheng · 16 years ago
- 0b924dc Use PMULDQ for v2i64 multiplies when SSE4.1 is available. And add by Dan Gohman · 16 years ago
- b193826 Bug: rcpps can only fold a load if the address is 16-byte aligned. Fixed many 'ps' load folding patterns in X86InstrSSE.td which were missing the proper alignment checks. by Evan Cheng · 16 years ago
- c36c0ab Add missing patterns. by Evan Cheng · 16 years ago
- 8e8de68 movsd and movq do not require 16-byte alignment. This fixes vec_set-5.ll on Linux. by Evan Cheng · 16 years ago
- 32097bd Fix one more encoding bug. by Nate Begeman · 16 years ago
- c9bdb00 Fix an encoding error in the psrad xmm, imm8 instruction. by Nate Begeman · 16 years ago
- 0d1704b Teach Legalize how to scalarize VSETCC by Nate Begeman · 16 years ago
- c2616e4 Initial X86 codegen support for VSETCC. by Nate Begeman · 16 years ago
- b70ea0b Some clean up. by Evan Cheng · 16 years ago
- 23573e5 Add a pattern to move the low element of a v4f32 and zero extend the rest. by Evan Cheng · 16 years ago
- d880b97 Handle a few more cases of folding load i64 into xmm and zero top bits. by Evan Cheng · 16 years ago
- fd17f42 Use movq to move low half of XMM register and zero-extend the rest. by Evan Cheng · 16 years ago
- 7e2ff77 Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine. by Evan Cheng · 16 years ago
- 22b942a Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This allow us to simplify the horribly complicated matching code. by Evan Cheng · 16 years ago
- b609339 80 column violation. by Evan Cheng · 16 years ago
- bd381a7 A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2. by Chris Lattner · 16 years ago
- 171c11e Add support for the form of the SSE41 extractps instruction that by Dan Gohman · 16 years ago
- db66750 Fix the x86-64 side of PR2108 by adding a v2f64 version of by Chris Lattner · 16 years ago
- 0c0f83f Favors pshufd over shufps when shuffling elements from one vector. pshufd is faster than shufps. by Evan Cheng · 16 years ago
- 7aae876 Fix some SSE4.1 instruction encoding bugs. by Evan Cheng · 16 years ago
- 62a3f15 - SSE4.1 extractps extracts a f32 into a gr32 register. Very useful! Not. Fix the instruction specification and teach lowering code to use it only when the only use is a store instruction. by Evan Cheng · 16 years ago
- bc4efb8 Add a couple missing SSE4 instructions by Nate Begeman · 16 years ago
- da47e6e Replace all target specific implicit def instructions with a target independent one: TargetInstrInfo::IMPLICIT_DEF. by Evan Cheng · 16 years ago
- 029d9da Fix some 80 col violations. by Evan Cheng · 16 years ago
- 172b794 Fix a number of encoding bugs. SSE 4.1 instructions MPSADBWrri, PINSRDrr, etc. have 8-bits immediate field (ImmT == Imm8). by Evan Cheng · 16 years ago
- c8e3b14 Clean up my own mess. by Evan Cheng · 16 years ago
- 27b7db5 Implement x86 support for @llvm.prefetch. It corresponds to prefetcht{0|1|2} and prefetchnta instructions. by Evan Cheng · 16 years ago
- e9083d6 isTwoAddress = 1 -> Constraints. by Evan Cheng · 16 years ago
- e7b8a8b PSLLWri etc. are two-address instructions. by Evan Cheng · 16 years ago
- efec751 - When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type. by Evan Cheng · 16 years ago
- 22c5c1b llvm.memory.barrier, and impl for x86 and alpha by Andrew Lenharth · 16 years ago
- cdd1eec SSE4.1 64b integer insert/extract pattern support by Nate Begeman · 16 years ago
- 14d12ca Enable SSE4 codegen and pattern matching. Add some notes to the README. by Nate Begeman · 16 years ago
- ab5d56c xmm0 variable blends by Nate Begeman · 16 years ago
- fea2be5 memopv16i8 had wrong alignment requirement, would have broken pabsb by Nate Begeman · 16 years ago
- 1426d52 Skeleton of insert and extract matching, more to come by Nate Begeman · 17 years ago
- 204e84e The rest of the SSE4.1 intrinsic patterns that are obvious to me. Getting by Nate Begeman · 17 years ago
- 2f6f1c0 Some more SSE 4.1 intrinsic patterns. by Nate Begeman · 17 years ago
- 63ec90a SSE 4.1 Intrinsics and detection by Nate Begeman · 17 years ago
- d43d00c Significantly simplify and improve handling of FP function results on x86-32. by Chris Lattner · 17 years ago
- f77e037 add some missing flags. by Chris Lattner · 17 years ago
- ba7e756 Start inferring side effect information more aggressively, and fix many bugs in the by Chris Lattner · 17 years ago
- dd41527 remove explicit sets of 'neverHasSideEffects' that can now be by Chris Lattner · 17 years ago
- 834f1ce rename isLoad -> isSimpleLoad due to evan's desire to have such a predicate. by Chris Lattner · 17 years ago
- 4ee451d Remove attribution from file headers, per discussion on llvmdev. by Chris Lattner · 17 years ago
- 700a0fb Fix JIT encoding for CMPSD as well. by Evan Cheng · 17 years ago
- 627c00b Add "mayHaveSideEffects" and "neverHasSideEffects" flags to some instructions. I by Bill Wendling · 17 years ago
- d7610e1 Fix the JIT encoding of cmp*ss, which aborts with this assertion currently: by Chris Lattner · 17 years ago
- 7a831ce Make better use of instructions that clear high bits; fix various 2-wide shuffle bugs. by Evan Cheng · 17 years ago
- 6e141fd Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled. by Evan Cheng · 17 years ago
- 1076210 Remove a bogus optimization. It's not possible to do a move to low element to a <8 x i16> or <16 x i8> vector. by Evan Cheng · 17 years ago
- 8a59448 Fix a long standing deficiency in the X86 backend: we would by Chris Lattner · 17 years ago
- b348d18 Add support for vectors to int <-> float casts. by Nate Begeman · 17 years ago
- c784208 Add missing SSE builtins: CVTPD2PI, CVTPS2PI, by Dale Johannesen · 17 years ago
- 48abc5c Corrected many typing errors, and removed 'nest' parameter handling by Arnold Schwaighofer · 17 years ago
- 83e105c Add missing argument to PALIGNR by Dale Johannesen · 17 years ago
- c231e8c Added DAG xforms. e.g. by Evan Cheng · 17 years ago
- fef922a Typo. X86comi doesn't read / write chains. by Evan Cheng · 17 years ago
- e5f6204 Enabling new condition code modeling scheme. by Evan Cheng · 17 years ago