- f672851 Implement str (s register, post index) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 1738f11 Implement ldr (post-index) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- eb7256b Port F32 GEMM A75 1x8 microkernel to JIT and specialize for min/max, add tests and benchmarks by Zhi An Ng · 2 years, 8 months ago
- 4decc8e Implement mov (x registers) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 8ceeebe Implement stp (x registers) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 9e51ad6 Implement cmp (x registers) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 18f71e0 Support vld1r_32 with 1 or 2 register(s) in list by Zhi An Ng · 2 years, 8 months ago
- f9fc9ec Integrate JIT generated GEMM microkernels into create_convolution2d_nhwc by Zhi An Ng · 2 years, 8 months ago
- 15dd611 Check code_buffer capacity before attempting to release it by Zhi An Ng · 2 years, 8 months ago
- c607028 Remove wb from JIT aarch32 instructions, use mem operand and ++ instead by Zhi An Ng · 2 years, 8 months ago
- fc67a86 Fix encoding of prfm by Zhi An Ng · 2 years, 8 months ago
- 2269ac8 Add default cases for switch, GCC warns that control reaches the end of non-void function. by Zhi An Ng · 2 years, 8 months ago
- 773458c Change return type for assembler functions to void to simplify code, move emit32 into common assembler by Zhi An Ng · 2 years, 8 months ago
- 4a1c6a8 Implement ldp (d registers) offset and post index for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 048704d Implement stp (q registers) offset and post indexed for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 3cec451 Implement tst (immediate) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 8709ac9 Implement csel for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 35d8e68 Implement stp (d register) offset and pre-index for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 658a67d Implement add (x registers) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 80eac62 Implement cmp (immediate) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- c98f0d2 Fix patching of branch instructions immediate by Zhi An Ng · 2 years, 8 months ago
- 491e9e0 Implement ldr for s and d registers and str for d registers (post-indexed) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 2f24c3e Implement dup (vector) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- f761632 Implement str (q register, post-indexed) and str (s register, offset) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 5a5c9e1 Implement mov (VRegister) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 5e31395 Implement stp (post-indexed) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 4915509 Implement add with immediate for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 4ab1390 Rename kTbz enum to kTbxz and add comment to clarify its usage for both TBZ and TBNZ by Zhi An Ng · 2 years, 8 months ago
- b10677e Implement unconditional branch for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 56e8b91 Implement tbz for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- cdfff79 Implement ret for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 3176868 Implement sub (x register) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 3f34299 Implement st1 for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 544d73d Implement fmax and fmin (vector) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- ecfb1f0 Implement fadd (vector) for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 0981080 Implement tbnz for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 6a1151b Implement fmla for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 157b0f4 Implement ldr ldp for q registers in aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- f67f1be Implement labels and B.cond for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- e2dc2ec Implement subs for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 234d6b4 Implement prfm (only PLDL1KEEP) on aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 65ccb13 Implement movi for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 6e68f54 Implement ld1 for 1, 2, and 3 registers for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 5702efb Implement ld2r for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 04cdc41 Implement ldr for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 0ba29e7 Implement LDP for aarch64 assembler by Zhi An Ng · 2 years, 8 months ago
- 70ea0a2 Specialize F32 GEMM A53 JIT microkernel for min/max params by Zhi An Ng · 2 years, 8 months ago
- 109a5eb Initial aarch64 assembler structure by Zhi An Ng · 2 years, 8 months ago
- 7d45d90 Create a new jit-test for jit-related tests that are not architecture specific by Zhi An Ng · 2 years, 9 months ago
- 3b32963 Fix bug in not changing memory to be executable when we have unused capacity. by Zhi An Ng · 2 years, 9 months ago
- 8f2eeee Skip calling __builtin_clear_cache on iOS, iOS uses sys_cache_invalidate by Zhi An Ng · 2 years, 9 months ago
- 49979b6 Implement vldr for S registers by Zhi An Ng · 2 years, 9 months ago
- 29d9acd Implement vcvt vcvtn vmul_f32, these are used in qc8 microkernels. by Zhi An Ng · 2 years, 9 months ago
- 0db2e4c Support - (minus) operator for creating S/D register lists, this looks closer to native assembly. by Zhi An Ng · 2 years, 9 months ago
- f527d56 Avoid using C++14 features in AArch32 assembler test by Marat Dukhan · 2 years, 9 months ago
- 50b0bd9 Fix encoding and supported immediate values for vldr and vstr. by Zhi An Ng · 2 years, 9 months ago
- 1aac8e8 Implement vmrs (FPSCR) by Zhi An Ng · 2 years, 9 months ago
- 0a1b7b6 Implement ldrd (immediate) by Zhi An Ng · 2 years, 9 months ago
- 26e55ed Implement vstr instruction by Zhi An Ng · 2 years, 9 months ago
- 97f99fc Return error if fail to get page size by Zhi An Ng · 2 years, 9 months ago
- 932e823 Implement str (imm) by Zhi An Ng · 2 years, 9 months ago
- 4ebd680 Implement moveq, cmp (imm), sub (imm). by Zhi An Ng · 2 years, 9 months ago
- 2b74ddd Implement vld1_8 with offset register by Zhi An Ng · 2 years, 9 months ago
- fea422d Implement vld1_32 (single element to one lane). by Zhi An Ng · 2 years, 9 months ago
- 938ee9b Implement bic, vld1_8 and vld1_32 for QRegisterList, assert encodings don't error out in tests. by Zhi An Ng · 2 years, 9 months ago
- 9364bdc Implement vsdot_s8 instruction by Zhi An Ng · 2 years, 9 months ago
- a251f87 Implement vqmovn_s16, and_, adds. by Zhi An Ng · 2 years, 9 months ago
- 7c8090d Implement vcmpe_f32, vmovpl_f32, vmovmi_f32. by Zhi An Ng · 2 years, 9 months ago
- 2d8180c Implement 2-argument add, vmla_f32, vmov_f32, vmov_f64, vstm. by Zhi An Ng · 2 years, 9 months ago
- be4e6a5 Add align for aligning instructions (similar to .align in assembly) by Zhi An Ng · 2 years, 9 months ago
- ec17e99 Add license to files by Zhi An Ng · 2 years, 9 months ago
- 3bdbe9f Fix xnn_release_code_memory to unmap entire capacity of buffer by Zhi An Ng · 2 years, 10 months ago
- 6fac719 Implement vqmovn_s32 and vext_8 by Zhi An Ng · 2 years, 10 months ago
- 4a58583 Implement vdup_8, vdup_16, vdup_32 by Zhi An Ng · 2 years, 10 months ago
- 2649014 Implement vmax_s8, vmin_s8, vqadd_s16, vqdmulh_s32, vqshl_s32, vrshl_s32 by Zhi An Ng · 2 years, 10 months ago
- 4ef8d51 Implement vst1_16, add some more test cases by Zhi An Ng · 2 years, 10 months ago
- 00a929f Implement vst1_8 and fix vst1_32 encoding by Zhi An Ng · 2 years, 10 months ago
- 3c4bb1c Fix conditions for flushing icache (only on arm/arm64) by Zhi An Ng · 2 years, 10 months ago
- a38a161 Implement vld1_8, vmlal_s16, vmovl_s8 by Zhi An Ng · 2 years, 10 months ago
- 6883abb JIT memory allocation and integration into Assembler by Zhi An Ng · 2 years, 10 months ago
- f0cb91e Fix formatting of bx signature by Zhi An Ng · 2 years, 10 months ago
- 10475ec Implement bx instruction by Zhi An Ng · 2 years, 10 months ago
- 16f3548 Implement pop and vpop (for D registers) by Zhi An Ng · 2 years, 10 months ago
- fe4a750 Implement vst1_32 (multiple single elements) and vst1_32 (single element from one lane) by Zhi An Ng · 2 years, 10 months ago
- ea612bc Implement vmax_f32 and vmin_f32 by Zhi An Ng · 2 years, 10 months ago
- 2fce75b Implement tst with immediate by Zhi An Ng · 2 years, 10 months ago
- f73e55b Implement add with immediate (drive-by fix for missing return when error in push) by Zhi An Ng · 2 years, 10 months ago
- c9f70f7 Implement vmla.f32, add DRegisterLane for lane-indexed DRegister by Zhi An Ng · 2 years, 10 months ago
- dfe8929 Implement vld1 (multiple single element) and vld1r (single element to all lanes) by Zhi An Ng · 2 years, 10 months ago
- 737ad01 Add .clang-format and reformat jit related files by Zhi An Ng · 2 years, 10 months ago
- 477bdbb Implement vldr instruction by Zhi An Ng · 2 years, 10 months ago
- f4beaf1 Implement vmov (q to q, d to d, s to s, core to d) by Zhi An Ng · 2 years, 10 months ago
- 7eef0a9 Fix formatting for parameters (use lowercase) by Zhi An Ng · 2 years, 10 months ago
- 637becf Implement vldm instruction by Zhi An Ng · 2 years, 10 months ago
- 68c27d3 Implement vpush, add SIMD registers and register lists. by Zhi An Ng · 2 years, 10 months ago
- 65584bd Implement labels and branches by Zhi An Ng · 2 years, 10 months ago
- 591b917 Implement pld instruction. by Zhi An Ng · 2 years, 10 months ago
- 4ab7b93 Implement sub and subs instructions. by Zhi An Ng · 2 years, 10 months ago
- ff2e8b2 Implement mov instruction. by Zhi An Ng · 2 years, 10 months ago
- 984644f Remove unused header by Zhi An Ng · 2 years, 10 months ago