- 4c1fd6f Allow generate-gemm-test.py to accept multiple output files, and shard the generated tests across all specified output files. by Zhi An Ng · 2 years, 9 months ago
- d454545 F16C implementation of F16 VBINARY[C] microkernels by Marat Dukhan · 2 years, 9 months ago
- d90af6f Move gemm-microkernel-tester test code into separate cc file by Zhi An Ng · 2 years, 9 months ago
- 2780863 Scalar transpose microkernel by Alan Kelly · 2 years, 9 months ago
- 49979b6 Implement vldr for S registers by Zhi An Ng · 2 years, 9 months ago
- a248337 Split more of qs8-gemm-minmax-rndnu out into another file, for microkernels with "c4" by Zhi An Ng · 2 years, 9 months ago
- 4c738f0 Fix wrong WAsm SIMD parameter initialization in f32-spmm-minmax.yaml by Marat Dukhan · 2 years, 9 months ago
- d5a5333 Additional tile sizes for QU8 neon lane microkernel. by Frank Barchard · 2 years, 9 months ago
- 751f622 F16C implementation of F16 VHSWISH microkernels by Marat Dukhan · 2 years, 9 months ago
- 645af97 FMA3 implementation of F16 DWCONV/VCLAMP/VMULCADDC microkernels by Marat Dukhan · 2 years, 9 months ago
- 8459822 Split F32 SCALEMINMAX parameter initialization functions by ISA by Marat Dukhan · 2 years, 9 months ago
- ef5560d Use ISA-specific parameter initialization functions in F32 PAVGPOOL tests by Marat Dukhan · 2 years, 9 months ago
- 3c949a3 Split QS8/QU8 AVGPOOL parameter initialization functions by ISA by Marat Dukhan · 2 years, 9 months ago
- da382d1 Refactor parameter initialization for AVGPOOL/GAVGPOOL/PAVGPOOL microkernels by Marat Dukhan · 2 years, 9 months ago
- a7d74b1 Specify parameter initialization function in GAVGPOOL microkernel tests by Marat Dukhan · 2 years, 9 months ago
- 4a6dca9 Specify parameter initialization function in [P]AVGPOOL microkernel tests by Marat Dukhan · 2 years, 9 months ago
- 5d456ce Refactor naming of QS8/QU8 AVGPOOL parameters by Marat Dukhan · 2 years, 9 months ago
- cbe478a Generate QU8 GAVGPOOL tests from YAML specification by Marat Dukhan · 2 years, 9 months ago
- bf72b54 Split qc8-igemm-minmax-fp32.yaml into 2 files, all microkernels with c go into a separate file. by Zhi An Ng · 2 years, 9 months ago
- 49d94ca Split qc8-gemm-minmax-fp32.yaml into 2 files, all the microkernels with c go into a separate file. by Zhi An Ng · 2 years, 9 months ago
- 0e0f726 Split qs8-gemm-minmax-rndnu.yaml into 2 files, all the microkernels with c2 suffix go into a separate file. by Zhi An Ng · 2 years, 9 months ago
- c4302c2 AVX2 implementations of F16 GEMM/IGEMM microkernels by Marat Dukhan · 2 years, 9 months ago
- 0afdfab Fix incorrect JIT tests in QC8 GEMM FP32 by Zhi An Ng · 2 years, 9 months ago
- 842bea9 Remove F16 VRELU microkernels by Marat Dukhan · 2 years, 9 months ago
- 14dd8d0 Convert F16 parameter structures to unions by Marat Dukhan · 2 years, 9 months ago
- 16b734c Add more QC8 GEMM/IGEMM JIT microkernels. by Zhi An Ng · 2 years, 9 months ago
- 58b17ba Remove VSCALE microkernels by Marat Dukhan · 2 years, 9 months ago
- ed73fb6 Add qc8 gemm and igemm JIT microkernels by Zhi An Ng · 2 years, 9 months ago
- 29d9acd Implement vcvt vcvtn vmul_f32, these are used in qc8 microkernels. by Zhi An Ng · 2 years, 9 months ago
- 13b57dd Add more converted microkernels used in init.c. by Zhi An Ng · 2 years, 9 months ago
- 4a5c771 Refactor F32 RADDSTOREEXPMINUSMAX microkernels by Marat Dukhan · 2 years, 9 months ago
- 5999c92 Refactor naming of RADDSTOREEXPMINUSMAX microkernels by Marat Dukhan · 2 years, 9 months ago
- 5876744 Minor refactoring of RADDSTOREEXPMINUSMAX interface by Marat Dukhan · 2 years, 9 months ago
- ed90216 aarch64 transpose TBL microkernel by Alan Kelly · 2 years, 9 months ago
- f623740 QC8 NEON lane microkernels by Frank Barchard · 2 years, 9 months ago
- 7c1115f Reoptimize microkernel selection for WAsm 1.0 by Marat Dukhan · 2 years, 9 months ago
- 7873586 Rename PLD to PRFM for aarch32 microkernels. by Frank Barchard · 2 years, 9 months ago
- 272d4d9 FP32 IMAGIC variants of scalar QC8/QS8/QU8 GEMM/IGEMM/DWCONV microkernels by Marat Dukhan · 2 years, 9 months ago
- f721e37 LRINTF variants of scalar F32->QS8 and F32->QU8 VCVT microkernels by Marat Dukhan · 2 years, 9 months ago
- bdf1099 Refactor scalar F32->QS8 and F32->QU8 microkernels by Marat Dukhan · 2 years, 9 months ago
- 2ac722e Refactor requantization in scalar QS8/QC8/QU8 microkernels by Marat Dukhan · 2 years, 9 months ago
- 0e80137 Refactor parameters in F32 VRND microkernels by Marat Dukhan · 2 years, 9 months ago
- bbfc27d Refactor NEON/NEONFMA VSIGMOID microkernels by Marat Dukhan · 2 years, 9 months ago
- ce834ad Refactor parameters in F32 VSIGMOID microkernels by Marat Dukhan · 2 years, 9 months ago
- 05b6cb1 Transpose microkernel tester uses iota instead of rng so that it's easier to debug tests by Alan Kelly · 2 years, 9 months ago
- 4a79ff2 Refactor parameters in F32 VELU microkernels by Marat Dukhan · 2 years, 9 months ago
- e5efb16 Refactor VUNARY microkernel parameters by Marat Dukhan · 2 years, 9 months ago
- e72b282 Refactor parameters in F32 VSQRT microkernels by Marat Dukhan · 2 years, 9 months ago
- 98c5215 Move mask_table into VBINARY[C] AVX microkernel parameters by Marat Dukhan · 2 years, 9 months ago
- d57186a Refactor F32 VMULCADDC parameters by Marat Dukhan · 2 years, 9 months ago
- f600497 Refactor parameter initialization in Vector Binary Elementwise microkernels by Marat Dukhan · 2 years, 9 months ago
- c83ef3b Refactor F32 MINMAX parameters for WAsm SIMD by Marat Dukhan · 2 years, 9 months ago
- 2894e99 Refactor F32 VLRELU microkernels by Marat Dukhan · 2 years, 9 months ago
- b7c1b71 Refactor F32->F16 VCVT microkernels by Marat Dukhan · 2 years, 9 months ago
- 134f984 Refactor F16->F32 VCVT microkernels by Marat Dukhan · 2 years, 9 months ago
- 87fe410 QC8 quantization for all aarch32 GEMM/IGEMM microkernels by Frank Barchard · 2 years, 9 months ago
- 447aa7b #include allocator.h header to gemm tests. by Frank Barchard · 2 years, 9 months ago
- 1945f0b SSE transpose x16 microkernel (4x8) by Alan Kelly · 2 years, 9 months ago
- 0d10cc7 Split VHSWISH parameter initialization functions per ISA by Marat Dukhan · 2 years, 9 months ago
- b43b47a Add a script to convert existing assembly microkernels to JIT codegen. by Zhi An Ng · 2 years, 9 months ago
- e4d3f76 Mark aarch64 microkernels as assembly for tests by Frank Barchard · 2 years, 9 months ago
- 0db2e4c Support - (minus) operator for creating S/D register lists, this looks closer to native assembly. by Zhi An Ng · 2 years, 9 months ago
- 2493de9 WAsmSIMD transpose microkernel by Alan Kelly · 2 years, 9 months ago
- c80ffb0 Fix generation of gemm tests for ADJBLOCK and rerun scripts. by Zhi An Ng · 2 years, 9 months ago
- e31f29e Declare assembly for QS8 microkernels by Frank Barchard · 2 years, 9 months ago
- 4c61779 Minimally support WebAssembly Relaxed SIMD builds by Marat Dukhan · 2 years, 9 months ago
- 50b0bd9 Fix encoding and supported immediate values for vldr and vstr. by Zhi An Ng · 2 years, 9 months ago
- 1aac8e8 Implement vmrs (FPSCR) by Zhi An Ng · 2 years, 9 months ago
- 0a1b7b6 Implement ldrd (immediate) by Zhi An Ng · 2 years, 9 months ago
- 26e55ed Implement vstr instruction by Zhi An Ng · 2 years, 9 months ago
- 932e823 Implement str (imm) by Zhi An Ng · 2 years, 9 months ago
- 4ebd680 Implement moveq, cmp (imm), sub (imm). by Zhi An Ng · 2 years, 9 months ago
- 2b74ddd Implement vld1_8 with offset register by Zhi An Ng · 2 years, 9 months ago
- fea422d Implement vld1_32 (single element to one lane). by Zhi An Ng · 2 years, 9 months ago
- e48b5c1 QS8 4x8 Neon Lane LD64 IGEMM AArch32 microkernel by Frank Barchard · 2 years, 9 months ago
- 4841021 QS8 4x8 dot product LD64 IGEMM AArch32 microkernel by Frank Barchard · 2 years, 9 months ago
- 938ee9b Implement bic, vld1_8 and vld1_32 for QRegisterList, assert encodings don't error out in tests. by Zhi An Ng · 2 years, 9 months ago
- 9364bdc Implement vsdot_s8 instruction by Zhi An Ng · 2 years, 9 months ago
- a251f87 Implement vqmovn_s16, and_, adds. by Zhi An Ng · 2 years, 9 months ago
- 7c8090d Implement vcmpe_f32, vmovpl_f32, vmovmi_f32. by Zhi An Ng · 2 years, 9 months ago
- 2d8180c Implement 2-argument add, vmla_f32, vmov_f32, vmov_f64, vstm. by Zhi An Ng · 2 years, 9 months ago
- 9f3f420 QS8 4x8 LD64 dot product GEMM AArch32 microkernel by Frank Barchard · 2 years, 9 months ago
- b63e84c Implement b (unconditional branch) by Zhi An Ng · 2 years, 9 months ago
- be4e6a5 Add align for aligning instructions (similar to .align in assembly) by Zhi An Ng · 2 years, 9 months ago
- ec17e99 Add license to files by Zhi An Ng · 2 years, 9 months ago
- fda06cb SSE transpose microkernel by Alan Kelly · 2 years, 10 months ago
- 7b5f779 AVX2 QS8->F32 and QU8->F32 VCVT microkernels by Marat Dukhan · 2 years, 10 months ago
- cd4089f AVX QS8->F32 and QU8->F32 VCVT microkernels by Marat Dukhan · 2 years, 10 months ago
- 2edf863 AVX512 F32->QS8 and F32->QU8 VCVT microkernels by Marat Dukhan · 2 years, 10 months ago
- 0d399ca AVX2 F32->QS8 and F32->QU8 VCVT microkernels by Marat Dukhan · 2 years, 10 months ago
- 3bdbe9f Fix xnn_release_code_memory to unmap entire capacity of buffer by Zhi An Ng · 2 years, 10 months ago
- b91432c AVX F32->QS8 and F32->QU8 VCVT microkernels by Marat Dukhan · 2 years, 10 months ago
- 6fac719 Implement vqmovn_s32 and vext_8 by Zhi An Ng · 2 years, 10 months ago
- 4a58583 Implement vdup_8, vdup_16, vdup_32 by Zhi An Ng · 2 years, 10 months ago
- 2649014 Implement vmax_s8, vmin_s8, vqadd_s16, vqdmulh_s32, vqshl_s32, vrshl_s32 by Zhi An Ng · 2 years, 10 months ago
- 4ef8d51 Implement vst1_16, add some more test cases by Zhi An Ng · 2 years, 10 months ago
- 00a929f Implement vst1_8 and fix vst1_32 encoding by Zhi An Ng · 2 years, 10 months ago
- a38a161 Implement vld1_8, vmlal_s16, vmovl_s8 by Zhi An Ng · 2 years, 10 months ago
- 6883abb JIT memory allocation and integration into Assembler by Zhi An Ng · 2 years, 10 months ago
- da7b2e2 QS8 4x8 lane GEMM AArch32 microkernel by Frank Barchard · 2 years, 10 months ago