- 47a74db Add specific microkernel for 1D convolutions with 1x3 kernel size for Android backend by Artsiom Ablavatski · 3 years ago
- 494cd2b S4 variant of C2 Neon GEMM/IGEMM microkernel by Frank Barchard · 3 years ago
- 952cb51 S4 variant of C2 Neon GEMM/IGEMM mull microkernel by Frank Barchard · 3 years ago
- 1d41247 Neon C2 microkernels switch to rndnu from gemmlowp by Frank Barchard · 3 years, 1 month ago
- 582e184 Evaluation stubs and tests for FP16->FP32 conversion by Marat Dukhan · 3 years, 1 month ago
- ddb3d16 F16 Fully Connected operator by Marat Dukhan · 3 years, 1 month ago
- d77f77d F32->F16 VCVT microkernels for NEON-FP16, F16C, and AVX512 by Marat Dukhan · 3 years, 1 month ago
- af2ba00 F16->F32 Convert operator by Marat Dukhan · 3 years, 1 month ago
- c9f9d67 Add Channel Tile of 16 for float and 32 for half float. by Frank Barchard · 3 years, 1 month ago
- dbe781b Enable 8x4, 8x9, 8x25 f32 dwconv by Frank Barchard · 3 years, 1 month ago
- e2c0001 Scalar FP16->FP32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- 434352f Benchmarks for FP16->FP32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- 322ed6f NEON FP16->FP32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- 1227adb SSE2/SSE4.1/AVX FP16->FP32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- 60f903b NEON FP16->FP32 conversion evaluation stubs by Marat Dukhan · 3 years, 1 month ago
- 3ed866b Test evaluation stubs for F16->F32 conversion by Marat Dukhan · 3 years, 1 month ago
- 8ff372c NEON-FP16 implementation of F16->F32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- 354cbc6 QU8 MUL8 variant of DWCONV by Frank Barchard · 3 years, 2 months ago
- 79c76ab F16->F32 conversion microkernels in AVX512-SKX implementation by Marat Dukhan · 3 years, 2 months ago
- f1a6ed3 F16->F32 conversion microkernels in F16C implementation by Marat Dukhan · 3 years, 2 months ago
- 2aa2e2a q8 dwconv add channel tiles of 24 and 32 for mul16 rndnu microkernels by Frank Barchard · 3 years, 2 months ago
- e4118ef Polyfill vld1q_u8_x4 for older AArch64 gcc versions by Marat Dukhan · 3 years, 2 months ago
- 98e054b Enable vectorized X8 LUT microkernels by Marat Dukhan · 3 years, 2 months ago
- 2b3c410 AVX512BW implementations of X8 LUT microkernels by Marat Dukhan · 3 years, 2 months ago
- 7c478e3 SSSE3, AVX, and AVX2 X8 LUT microkernels by Marat Dukhan · 3 years, 2 months ago
- 5de7bc0 QS8/QU8 Tanh operator using LUT microkernels by Marat Dukhan · 3 years, 2 months ago
- f718232 X8 LUT NEON microkernels by Marat Dukhan · 3 years, 2 months ago
- 548542c Fix CMake build by Marat Dukhan · 3 years, 2 months ago
- f6c991e Implement generic LUT-based elementwise operator by Marat Dukhan · 3 years, 2 months ago
- 5407437 Benchmark for X8 LUT microkernels by Marat Dukhan · 3 years, 2 months ago
- d67539d Auto-generate X8 LUT microkernels and tests by Marat Dukhan · 3 years, 2 months ago
- cdf59a5 Add QU8 NR=32 microkernels by Frank Barchard · 3 years, 2 months ago
- df8e604 4x8 QU8 Neon Dotproduct microkernel rename from ld64 to ld128 by Frank Barchard · 3 years, 2 months ago
- a49e41f QU8 4x16C4 NEON Dot Product GEMM/IGEMM microkernels for Cortex A55r1 by Frank Barchard · 3 years, 2 months ago
- 0a3093c QU8 vadd neon use x32 instead of x8 by Frank Barchard · 3 years, 2 months ago
- 7da8b02 Q8 dwconv switch from 8x25 to 16x25 by Frank Barchard · 3 years, 2 months ago
- e252f92 End-to-end benchmarks on QC8 MobileNet v1/v2 models by Marat Dukhan · 3 years, 2 months ago
- 0d06573 dwconv Q8 switch from 8x9 to 16x9 tile. by Frank Barchard · 3 years, 2 months ago
- 8b69802 Enable QU8 C4 NEON Dot Product GEMM/IGEMM microkernels for Cortex A55r1 by Frank Barchard · 3 years, 3 months ago
- ca4c68e QU8 C4 NEON Dot Product GEMM/IGEMM microkernels for Cortex A55r1 by Frank Barchard · 3 years, 3 months ago
- 0c76422 QU8 NEON Assembly remove channel wise by Frank Barchard · 3 years, 3 months ago
- 4066898 QU8 4x16 C4 NEON Assembly Dot Product GEMM/IGEMM microkernels by Frank Barchard · 3 years, 3 months ago
- 0049e89 QU8 C4 NEON Assembly Dot Product GEMM/IGEMM microkernels by Frank Barchard · 3 years, 3 months ago
- de9c64a Enable 4x16 QU8 dot production microkernels by Frank Barchard · 3 years, 3 months ago
- 65692c7 Fix build for Clang on Windows by peter · 3 years, 3 months ago
- e79acb7 S8 VCLAMP microkernels by Marat Dukhan · 3 years, 3 months ago
- 2314753 S8 MAXPOOL microkernels for all architectures by Marat Dukhan · 3 years, 3 months ago
- 9098aba E2E for QU8 GEMM microkernels by Frank Barchard · 3 years, 3 months ago
- e033126 Generate more tile sizes for QU8 gemm/igemm by Frank Barchard · 3 years, 3 months ago
- 2025515 Enable dot production microkernels for QU8 on ARM by Frank Barchard · 3 years, 3 months ago
- 88e839c QU8 C4 NEON Dot Product GEMM/IGEMM microkernels by Frank Barchard · 3 years, 3 months ago
- 0461f2d Generalize PAD microkernels to all 8-/16-/32-bit data types by Marat Dukhan · 3 years, 3 months ago
- 933051b Generalize FILL microkernels to all 8-/16-/32-bit data types by Marat Dukhan · 3 years, 3 months ago
- 7c74aff Add F32 VLRELU benchmarks by Marat Dukhan · 3 years, 3 months ago
- 4486f87 Prune NEON-DOT QS8 GEMM/IGEMM microkernels with FP32 & GEMMLOWP requantization by Marat Dukhan · 3 years, 3 months ago
- e16bf7d Prune AVX2/AVX512 QS8 GEMM/IGEMM microkernels with GEMMLOWP requantization by Marat Dukhan · 3 years, 3 months ago
- 66a3ca1 Initialize QS8 microkernel pointers on pre-NEON ARM architecture by Marat Dukhan · 3 years, 3 months ago
- 0ff7989 Use FP32 requantization for extended-weights QS8 GEMM microkernels on x86 by Marat Dukhan · 3 years, 3 months ago
- ec47958 Prune redundant NEON GEMM/IGEMM microkernels with GEMMLOWP requantization by Marat Dukhan · 3 years, 3 months ago
- f879d9e Add qs8-requantization-test to CMake build by Marat Dukhan · 3 years, 3 months ago
- 599d3db Fix CMake build by Marat Dukhan · 3 years, 3 months ago
- 0853b8a QS8/QU8 Multiply ND operators by Marat Dukhan · 3 years, 3 months ago
- 8b024c9 QS8/QU8 VMULC microkernel benchmark by Marat Dukhan · 3 years, 3 months ago
- fb3a94f QU8 4x16 Neon assembly microkernel for Cortex A75 by Frank Barchard · 3 years, 3 months ago
- 795e5ab QS8/QU8 VMUL microkernel benchmarks by Marat Dukhan · 3 years, 3 months ago
- 4a7b70f QS8/QU8 VMUL[C] microkernels in NEON implementation by Marat Dukhan · 3 years, 3 months ago
- 7999341 QS8/QU8 VMUL[C] microkernels in scalar implementation by Marat Dukhan · 3 years, 3 months ago
- 59ed1da QU8 4x16 Neon assembly microkernel by Frank Barchard · 3 years, 3 months ago
- a212eac QS8/QU8 VMUL[C] microkernels in SSE2/SSE4.1/AVX implementation by Marat Dukhan · 3 years, 3 months ago
- eb3cff3 LD128 versions of QS8/QU8 VADD[C] NEON microkernels by Marat Dukhan · 3 years, 3 months ago
- 01debd9 Optimize QS8 VADD[C] microkernel selection on ARM/ARM64 by Marat Dukhan · 3 years, 4 months ago
- 1ef9de8 QU8 VADD/VADDC microkernel benchmarks by Marat Dukhan · 3 years, 4 months ago
- 83a8d2f QS8 VADD/VADDC microkernel benchmarks by Marat Dukhan · 3 years, 4 months ago
- 60bb7ec Accumulate in 16 bits once in AVX2 MUL16 VPUNPCK QS8/QC8 DWCONV before extending to 32 bits by Marat Dukhan · 3 years, 4 months ago
- 881ab02 AVX2 MUL16 QS8/QC8 DWCONV microkernels using VPUNPCK instructions to extend the product by Marat Dukhan · 3 years, 4 months ago
- 2848059 Optimize QC8 DWCONV microkernel selection on AVX and XOP by Marat Dukhan · 3 years, 4 months ago
- 195b72f Split microkernel lists in CMakeLists into production and non-production by Marat Dukhan · 3 years, 4 months ago
- db3b0a7 Refactor microkernel lists in BUILD and CMakeLists.txt by Marat Dukhan · 3 years, 4 months ago
- 6084fb8 E2E benchmark for QU8 DWCONV microkernels by Marat Dukhan · 3 years, 4 months ago
- 73a899a QU8 DWCONV NEON microkernels with RNDNU requantization by Marat Dukhan · 3 years, 4 months ago
- 173661d QU8 GEMM/IGEMM NEON microkernels with RNDNU requantization by Marat Dukhan · 3 years, 4 months ago
- 0744fa0 QS8 DWCONV microkernel benchmark by Marat Dukhan · 3 years, 4 months ago
- bbfc6d3 E2E benchmark for QS8 DWCONV microkernels by Marat Dukhan · 3 years, 4 months ago
- 510b8e0 Code generator for RNDNU quantization mode on neon-mull-addw-dup microkernel by Frank Barchard · 3 years, 4 months ago
- 0966856 Accumulate in 16 bits once in SSE2/SSE4/AVX/XOP MUL16 QS8/QC8 DWCONV before extending to 32 bits by Marat Dukhan · 3 years, 4 months ago
- 5f2939f QS8/QC8 DWCONV NEON MUL8/MLA8 microkernels using 128-bit loads by Marat Dukhan · 3 years, 4 months ago
- 476eb84 Fix CMake build by Marat Dukhan · 3 years, 4 months ago
- caccd8e Accumulate in 16 bits once in NEON QS8/QC8 DWCONV before extending to 32 bits by Marat Dukhan · 3 years, 4 months ago
- 1a2dbe1 RNDNU scalar GEMM/IGEMM microkernel by Frank Barchard · 3 years, 4 months ago
- e76049a AVX512 implementation of QS8/QU8 VADD[C] microkernels by Marat Dukhan · 3 years, 4 months ago
- 28c82b2 Fix CMake build by Marat Dukhan · 3 years, 4 months ago
- 3eac69c Optimized QU8 VADD[C] microkernels for SSE4/AVX/XOP/AVX2 by Marat Dukhan · 3 years, 4 months ago
- 036b2b1 Add QU8 MobileNet v2 model to end-to-end benchmark by Marat Dukhan · 3 years, 4 months ago
- 76e78c8 Generalize QS8 VADD[C] templates to cover QU8 VADD[C] microkernels by Marat Dukhan · 3 years, 4 months ago
- 22fbe77 RNDNU quantized 1x16 and 4x16 Neon lane GEMM/IGEMM microkernels. by Frank Barchard · 3 years, 4 months ago
- 13db60f RNDNU quantized Neon assembly GEMM/IGEMM microkernels. by Frank Barchard · 3 years, 4 months ago
- 60729d0 4x16c4 RNDNU quantized Neon assembly GEMM/IGEMM microkernel. by Frank Barchard · 3 years, 4 months ago
- 4ba70b7 QS8/QC8 NEON microkernels using 8x8->16-bit multiplication by Marat Dukhan · 3 years, 4 months ago
- 20c36d4 Fix CMake build by Marat Dukhan · 3 years, 4 months ago
- e903dff QS8 GEMM/IGEMM microkernels with RNDNU requantization by Marat Dukhan · 3 years, 4 months ago