| |
| Status |
| ~~~~~~ |
| |
| As of Jan 2014 the trunk contains a port to AArch64 ARMv8 -- loosely, |
| the 64-bit ARM architecture. Currently it supports integer and FP |
| instructions and can run anything generated by gcc-4.8.2 -O3. The |
| port is under active development. |
| |
| Current limitations, as of mid-May 2014. |
| |
| * limited support for vector (SIMD) instructions. The initial target, |
| support for the instructions generated by gcc-4.8.2 -O3 (via |
| autovectorisation), is complete. |
| |
| * Integration with the built in GDB server: |
| - works ok (breakpoint, attach to a process blocked in a syscall, ...) |
| - still to do: |
| arm64 xml register description files (allowing shadow registers |
| to be looked at). |
| cpsr transfer to/from gdb to be looked at (see also arm equivalent code) |
| |
| * limited syscall support |
| |
| There has been extensive testing of the baseline simulation of integer |
| and FP instructions. Memcheck is also believed to work, at least for |
| small examples. Other tools appear to at least not crash when running |
| /bin/date. |
| |
| Enough syscalls and instructions are supported for substantial |
| programs to work. Firefox 26 is able to start up and quit. The noise |
| level from Memcheck is low enough to make it practical to use for real |
| debugging. |
| |
| |
| Building |
| ~~~~~~~~ |
| |
| You could probably build it directly on a target OS, using the normal |
| non-cross scheme: |
| |
| ./autogen.sh ; ./configure --prefix=.. ; make ; make install |
| |
| Development so far has, however, been done by cross compiling, viz: |
| |
| export CC=aarch64-linux-gnu-gcc |
| export LD=aarch64-linux-gnu-ld |
| export AR=aarch64-linux-gnu-ar |
| |
| ./autogen.sh |
| ./configure --prefix=`pwd`/Inst --host=aarch64-unknown-linux \ |
| --enable-only64bit |
| make -j4 |
| make -j4 install |
| |
| Doing this assumes that the install path (`pwd`/Inst) is valid on |
| both host and target, which isn't normally the case. To avoid |
| this limitation, do instead: |
| |
| ./configure --prefix=/install/path/on/target \ |
| --host=aarch64-unknown-linux \ |
| --enable-only64bit |
| make -j4 |
| make -j4 install DESTDIR=/a/temp/dir/on/host |
| # and then copy the contents of DESTDIR to the target. |
| |
| See README.android for more examples of cross-compile building. |
| |
| |
| Implementation tidying-up/TODO notes |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| UnwindStartRegs -- what should that contain? |
| |
| |
| vki-arm64-linux.h: vki_sigaction_base |
| I really don't think that __vki_sigrestore_t sa_restorer |
| should be present. Adding it surely puts sa_mask at a wrong |
| offset compared to (kernel) reality. But not having it causes |
| compilation of m_signals.c to fail in hard to understand ways, |
| so adding it temporarily. |
| |
| |
| m_trampoline.S: what's the unexecutable-insn value? 0xFFFFFFFF |
| is there at the moment, but 0x00000000 is probably what it should be. |
| Also, fix indentation/tab-vs-space stuff |
| |
| |
| ./include/vki/vki-arm64-linux.h: uses __uint128_t. Should change |
| it to __vki_uint128_t, but what's the defn of that? |
| |
| |
| m_debuginfo/priv_storage.h: need proper defn of DiCfSI |
| |
| |
| readdwarf.c: is this correct? |
| #elif defined(VGP_arm64_linux) |
| # define FP_REG 29 //??? |
| # define SP_REG 31 //??? |
| # define RA_REG_DEFAULT 30 //??? |
| |
| |
| vki-arm64-linux.h: |
| re linux-3.10.5/include/uapi/asm-generic/sembuf.h |
| I'd say the amd64 version has padding it shouldn't have. Check? |
| |
| |
| syswrap-linux.c run_a_thread_NORETURN assembly sections |
| it seems that tst->os_state.exitcode has word type, |
| in which case the ppc64_linux use of lwz to read it is wrong |
| |
| |
| syswrap-linux.c ML_(do_fork_clone) |
| assuming that VGP_arm64_linux is the same as VGP_arm_linux here |
| |
| |
| dispatch-arm64-linux.S: FIXME: set up FP control state before |
| entering generated code. Also fix screwy indentation. |
| |
| |
| dispatcher-ery general: what's a good (predictor-friendly) way to |
| branch to a register? |
| |
| |
| in vki-arm64-scnums.h |
| //#if __BITS_PER_LONG == 64 && !defined(__SYSCALL_COMPAT) |
| Probably want to reenable that and clean up accordingly |
| |
| |
| putIRegXXorZR: figure out a way that the computed value is actually |
| used, so as to keep any memory reads that might generate it, alive. |
| (else the simulation can lose exceptions). At least, for writes to |
| the zero register generated by loads .. or .. can any other |
| integer instructions that write to a register cause exceptions? |
| |
| |
| loads/stores: generate stack alignment checks as necessary |
| |
| |
| fix barrier insns: ISB, DMB |
| |
| |
| fix atomic loads/stores |
| |
| |
| FMADD/FMSUB/FNMADD/FNMSUB: generate and use the relevant fused |
| IROps so as to avoid double rounding |
| |
| |
| ARM64Instr_Call getRegUsage: re-check relative to what |
| getAllocableRegs_ARM64 makes available |
| |
| |
| Make dispatch-arm64-linux.S save any callee-saved Q regs |
| I think what is required is to save D8-D15 and nothing more than that. |
| |
| |
| wrapper for __NR3264_fstat -- correct? |
| |
| |
| PRE(sys_clone): get rid of references to vki_modify_ldt_t and the |
| definition of it in vki-arm64-linux.h. Ditto for 32 bit arm. |
| |
| |
| sigframe-arm64-linux.c: build_sigframe: references to nonexistent |
| siguc->uc_mcontext.trap_no, siguc->uc_mcontext.error_code have been |
| replaced by zero. Also in synth_ucontext. |
| |
| |
| m_debugger.c: |
| uregs.pstate = LibVEX_GuestARM64_get_nzcv(vex); /* is this correct? */ |
| Is that remotely correct? |
| |
| |
| host_arm64_defs.c: emit_ARM64Instr: |
| ARM64in_VDfromX and ARM64in_VQfromXX: use simple top-half zeroing |
| MOVs to vector registers instead of INS Vd.D[0], Xreg, to avoid false |
| dependencies on the top half of the register. (Or at least check |
| the semantics of INS Vd.D[0] to see if it zeroes out the top.) |
| |
| |
| preferredVectorSubTypeFromSize: review perf effects and decide |
| on a types-for-subparts policy |
| |
| |
| fold_IRExpr_Unop: add a reduction rule for this |
| 1Sto64(CmpNEZ64( Or64(GET:I64(1192),GET:I64(1184)) )) |
| viz 1Sto64(CmpNEZ64(x)) --> CmpwNEZ64(x) |
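| |
| As a sanity check on the proposed rule, here is a minimal C model of the |
| three primops involved. It is only a sketch: the function names are |
| hypothetical and not taken from VEX, and the semantics assumed are that |
| CmpNEZ64 yields a single bit, 1Sto64 sign-extends that bit to 64 bits, |
| and CmpwNEZ64 yields all zeroes for zero and all ones otherwise. |
| |
```c
#include <stdint.h>

/* Hypothetical model of CmpNEZ64: 1 if x is nonzero, else 0. */
static unsigned model_CmpNEZ64(uint64_t x)
{
    return x != 0;
}

/* Hypothetical model of 1Sto64: sign-extend a single bit to 64 bits. */
static uint64_t model_1Sto64(unsigned bit)
{
    return bit ? ~UINT64_C(0) : UINT64_C(0);
}

/* Hypothetical model of CmpwNEZ64: all zeroes if x == 0, else all ones. */
static uint64_t model_CmpwNEZ64(uint64_t x)
{
    return x ? ~UINT64_C(0) : UINT64_C(0);
}

/* The expression as it currently appears: 1Sto64(CmpNEZ64(x)). */
uint64_t unreduced(uint64_t x)
{
    return model_1Sto64(model_CmpNEZ64(x));
}

/* The expression after the proposed reduction: CmpwNEZ64(x). */
uint64_t reduced(uint64_t x)
{
    return model_CmpwNEZ64(x);
}
```
| |
| Under those assumed semantics, unreduced(x) and reduced(x) agree for |
| every x, since both depend only on whether x is zero. |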
| |
| |
| check insn selection for memcheck-only primops: |
| Left64 CmpwNEZ64 V128to64 V128HIto64 1Sto64 CmpNEZ64 CmpNEZ32 |
| widen_z_8_to_64 1Sto32 Left32 32HLto64 CmpwNEZ32 CmpNEZ8 |
| |
| |
| isel: get rid of various cases where zero is put into a register |
| and just use xzr instead. Especially for CmpNEZ64/32. And for |
| writing zeroes into the CC thunk fields. |
| |
| |
| /* Keep this list in sync with that in iselNext below */ |
| /* Keep this list in sync with that for Ist_Exit above */ |
| uh .. they are not in sync |
| |
| |
| very stupid: |
| imm64 x23, 0xFFFFFFFFFFFFFFA0 |
| 17 F4 9F D2 F7 FF BF F2 F7 FF DF F2 F7 FF FF F2 |
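| |
| The 16 bytes above are a MOVZ followed by three MOVKs, whereas a single |
| MOVN x23, #0x5F would produce the same value. The C sketch below shows |
| one way to count how many instructions a MOVZ/MOVN-plus-MOVK strategy |
| needs. The helper names are hypothetical, and the sketch deliberately |
| ignores other options such as logical (bitmask) immediates. |
| |
```c
#include <stdint.h>

/* Count the 16-bit chunks of imm that differ from fill (0x0000 or 0xFFFF). */
static int chunks_differing_from(uint64_t imm, uint16_t fill)
{
    int n = 0;
    for (int shift = 0; shift < 64; shift += 16) {
        if ((uint16_t)(imm >> shift) != fill)
            n++;
    }
    return n;
}

/* Hypothetical helper: number of instructions needed to materialise imm
   using either MOVZ (other chunks start as 0x0000) or MOVN (other chunks
   start as 0xFFFF), followed by one MOVK per remaining chunk. */
int imm64_insn_count(uint64_t imm)
{
    int via_movz = chunks_differing_from(imm, 0x0000);
    int via_movn = chunks_differing_from(imm, 0xFFFF);
    int best = via_movz < via_movn ? via_movz : via_movn;
    /* All-zeroes or all-ones still needs one MOVZ or MOVN. */
    return best == 0 ? 1 : best;
}
```
| |
| For 0xFFFFFFFFFFFFFFA0 only the lowest chunk differs from 0xFFFF, so the |
| MOVN route needs one instruction instead of the four currently emitted. |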
| |
| |
| valgrind.h: fix VALGRIND_ALIGN_STACK/VALGRIND_RESTORE_STACK, |
| also add CFI annotations |
| |
| |
| could possibly bring r29 into use, which would be useful as it is |
| callee saved |
| |
| |
| ubfm/sbfm etc: special-case the cases that are simple shifts, as iropt |
| can't always simplify the general-case IR to a shift in such cases. |
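| |
| A sketch of how the simple-shift cases could be recognised for the 64-bit |
| forms, based on the architectural aliases: LSR and ASR are UBFM and SBFM |
| with imms == 63, and LSL is UBFM with imms + 1 == immr. The type and |
| function names are hypothetical and not part of the existing front end. |
| |
```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { SHIFT_NONE, SHIFT_LSL, SHIFT_LSR, SHIFT_ASR } ShiftKind;

/* Hypothetical classifier for 64-bit UBFM/SBFM (is_signed selects SBFM).
   Returns the kind of plain shift the instruction is equivalent to, or
   SHIFT_NONE if it is a genuine bitfield move. On success the shift
   amount is written to *amount. */
ShiftKind classify_bfm64(bool is_signed, unsigned immr, unsigned imms,
                         unsigned *amount)
{
    if (immr > 63 || imms > 63)
        return SHIFT_NONE;              /* not a valid 64-bit encoding */

    if (imms == 63) {
        /* UBFM Xd, Xn, #sh, #63 is LSR #sh; SBFM likewise is ASR #sh. */
        *amount = immr;
        return is_signed ? SHIFT_ASR : SHIFT_LSR;
    }

    if (!is_signed && imms + 1 == immr) {
        /* UBFM Xd, Xn, #((64 - sh) % 64), #(63 - sh) is LSL #sh. */
        *amount = 63 - imms;
        return SHIFT_LSL;
    }

    return SHIFT_NONE;
}
```
| |
| For example, UBFM with immr=60, imms=59 classifies as LSL by 4, while |
| UBFM with immr=0, imms=7 (UXTB) is left to the general-case path. |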
| |
| |
| LDP,STP (immediate, simm7) (FP&VEC) |
| should zero out hi parts of dst registers in the LDP case |
| |
| |
| DUP insns: use Iop_Dup8x16, Iop_Dup16x8, Iop_Dup32x4 |
| rather than doing it "by hand" |
| |
| |
| Any place where ZeroHI64ofV128 is used in conjunction with |
| FP vector IROps: find a way to make sure that arithmetic on |
| the upper half of the values is "harmless." |
| |
| |
| math_MINMAXV: use real Iop_Cat{Odd,Even}Lanes ops rather than |
| inline scalar code |
| |
| |
| chainXDirect_ARM64: use direct jump forms when possible |