- 758b979 Expose XNNPACK transpose convolution implementation as TRANSPOSE_CONV builtin op by Yury Kartynnik · 3 years ago
- 0214d86 Expose XNNPACK transpose convolution implementation as TRANSPOSE_CONV builtin op by XNNPACK Team · 3 years ago
- 47a74db Add specific microkernel for 1D convolutions with 1x3 kernel size for Android backend by Artsiom Ablavatski · 3 years ago
- dcdc2a2 Expose the optionality of bias in 2D deconvolution by Yury Kartynnik · 3 years ago
- 1f31f99 Expose XNNPACK transpose convolution implementation as TRANSPOSE_CONV builtin op by Yury Kartynnik · 3 years ago
- 494cd2b S4 variant of C2 Neon GEMM/IGEMM microkernel by Frank Barchard · 3 years ago
- 952cb51 S4 variant of C2 Neon GEMM/IGEMM mull microkernel by Frank Barchard · 3 years ago
- fa4daf0 Add ISA check to QU8 GEMM benchmark by Frank Barchard · 3 years ago
- ccbaedf C2 Neon microkernel remove duplicate DUP instructions from NR loop. by Frank Barchard · 3 years ago
- 1d41247 Neon C2 microkernels switch to rndnu from gemmlowp by Frank Barchard · 3 years ago
- 8e9a66f Parse shuffle after channels for test names by Frank Barchard · 3 years ago
- 582e184 Evaluation stubs and tests for FP16->FP32 conversion by Marat Dukhan · 3 years ago
- ddb3d16 F16 Fully Connected operator by Marat Dukhan · 3 years ago
- d77f77d F32->F16 VCVT microkernels for NEON-FP16, F16C, and AVX512 by Marat Dukhan · 3 years ago
- af2ba00 F16->F32 Convert operator by Marat Dukhan · 3 years ago
- ade893c Support unary elementwise ops on 0-dimensional tensors (scalars) by Marat Dukhan · 3 years ago
- c9f9d67 Add Channel Tile of 16 for float and 32 for half float. by Frank Barchard · 3 years, 1 month ago
- dbe781b Enable 8x4, 8x9, 8x25 f32 dwconv by Frank Barchard · 3 years, 1 month ago
- e2c0001 Scalar FP16->FP32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- 434352f Benchmarks for FP16->FP32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- f6507f8 WAsm SIMD FP16->FP32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- 322ed6f NEON FP16->FP32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- 1227adb SSE2/SSE4.1/AVX FP16->FP32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- 2dd18fd Parse ROW_TILE field in multipass DWCONV variant by Frank Barchard · 3 years, 1 month ago
- 3c6d6b4 Update performance data on Raspberry Pi by Marat Dukhan · 3 years, 1 month ago
- 1851410 f32 dwconv load params first by Frank Barchard · 3 years, 1 month ago
- e187b79 f32 dwconv remainder handler remove branch by Frank Barchard · 3 years, 1 month ago
- 758c7ca f32 dwconv remove vector push/pop by Frank Barchard · 3 years, 1 month ago
- e40ef6e f32 dwconv use STR instead of ST1 by Frank Barchard · 3 years, 1 month ago
- 60f903b NEON FP16->FP32 conversion evaluation stubs by Marat Dukhan · 3 years, 1 month ago
- a18926a WAsm SIMD FP16->FP32 conversion evaluation stubs by Marat Dukhan · 3 years, 1 month ago
- 3ed866b Test evaluation stubs for F16->F32 conversion by Marat Dukhan · 3 years, 1 month ago
- 8ff372c NEON-FP16 implementation of F16->F32 VCVT microkernels by Marat Dukhan · 3 years, 1 month ago
- 0630d29 Refactor creation and setup of Operators from Nodes by Marat Dukhan · 3 years, 1 month ago
- 354cbc6 QU8 MUL8 variant of DWCONV by Frank Barchard · 3 years, 1 month ago
- 1b1b032 Avoid backward references in Bazel targets by Marat Dukhan · 3 years, 1 month ago
- 79c76ab F16->F32 conversion microkernels in AVX512-SKX implementation by Marat Dukhan · 3 years, 1 month ago
- f1a6ed3 F16->F32 conversion microkernels in F16C implementation by Marat Dukhan · 3 years, 1 month ago
- 694d252 Fix incorrect initialization of QC8 GEMM/IGEMM parameters on AArch32+NEONDOT by Marat Dukhan · 3 years, 1 month ago
- 0bf8afa Leverage f32x4.pmin and f32x4.pmax WAsm SIMD instructions by Marat Dukhan · 3 years, 1 month ago
- a4ad988 X8 LUT microkernels for WAsm SIMD by Marat Dukhan · 3 years, 2 months ago
- aea2d55 Fix test failure in quantized Leaky ReLU NC test under ASan by Marat Dukhan · 3 years, 2 months ago
- 2aa2e2a q8 dwconv add channel tiles of 24 and 32 for mul16 rndnu microkernels by Frank Barchard · 3 years, 2 months ago
- 5cc31e3 Replace _mm512_(loadu/storeu)_epi8 with _mm512_(loadu/storeu)_si512 by Marat Dukhan · 3 years, 2 months ago
- 37c3077 Avoid _mm512_(loadu/storeu)_epi32 in _mm512_(loadu/storeu)_epi8 polyfills by Marat Dukhan · 3 years, 2 months ago
- 2ea9075 Script to sort file names in BUILD and CMakeLists.txt by Frank Barchard · 3 years, 2 months ago
- d0bf04c Fully qualify std::signbit in ELUOperatorTester by Marat Dukhan · 3 years, 2 months ago
- b54871d Polyfill _mm512_loadu_epi8 & _mm512_storeu_epi8 for pre GCC-11 by Marat Dukhan · 3 years, 2 months ago
- 67492b0 Expose quantized ELU operator in Subgraph API by Marat Dukhan · 3 years, 2 months ago
- eec0052 QS8 ELU operator by Marat Dukhan · 3 years, 2 months ago
- e4118ef Polyfill vld1q_u8_x4 for older AArch64 gcc versions by Marat Dukhan · 3 years, 2 months ago
- 55bad94 Change QS8 to QU8 in dwconv test by Frank Barchard · 3 years, 2 months ago
- 2366290 Add qu8_gemm_4x16__aarch64_neon_mlal_lane_cortex_a75 benchmark to E2E by Frank Barchard · 3 years, 2 months ago
- 98e054b Enable vectorized X8 LUT microkernels by Marat Dukhan · 3 years, 2 months ago
- 2b3c410 AVX512BW implementations of X8 LUT microkernels by Marat Dukhan · 3 years, 2 months ago
- 7c478e3 SSSE3, AVX, and AVX2 X8 LUT microkernels by Marat Dukhan · 3 years, 2 months ago
- 5de7bc0 QS8/QU8 Tanh operator using LUT microkernels by Marat Dukhan · 3 years, 2 months ago
- f718232 X8 LUT NEON microkernels by Marat Dukhan · 3 years, 2 months ago
- 548542c Fix CMake build by Marat Dukhan · 3 years, 2 months ago
- a4ba5d4 Expose quantized Sigmoid operator in Subgraph API by Marat Dukhan · 3 years, 2 months ago
- 71a9bb1 QS8 Sigmoid operator by Marat Dukhan · 3 years, 2 months ago
- f6c991e Implement generic LUT-based elementwise operator by Marat Dukhan · 3 years, 2 months ago
- 5407437 Benchmark for X8 LUT microkernels by Marat Dukhan · 3 years, 2 months ago
- d67539d Auto-generate X8 LUT microkernels and tests by Marat Dukhan · 3 years, 2 months ago
- 2df7542 Add qu8_4x8__neon_mlal_lane benchmark by Frank Barchard · 3 years, 2 months ago
- cdf59a5 Add QU8 NR=32 microkernels by Frank Barchard · 3 years, 2 months ago
- d460d0b Neon IGEMM do remainder with reversed MR for shifts by Frank Barchard · 3 years, 2 months ago
- dfe763f Expose quantized Subtract operator in Subgraph API by Marat Dukhan · 3 years, 2 months ago
- b8cbcb5 Fuse rounding term into bias in QS8 & QU8 VADD[C] microkernels by Marat Dukhan · 3 years, 2 months ago
- 031ff4b Template bug fix in stores for remainder of 8 in Neon QS8 microkernels by Frank Barchard · 3 years, 2 months ago
- 8e2fd20 QS8 and QU8 Subtract ND operators by Marat Dukhan · 3 years, 2 months ago
- ec5c129 Template bug fix in stores for remainder of 8. by Frank Barchard · 3 years, 2 months ago
- 6428725 Rename ADD quantization parameters to ADDSUB by Marat Dukhan · 3 years, 2 months ago
- 1f83cf9 Code generator scripts check if file changed and skip write if it is the same. by Frank Barchard · 3 years, 2 months ago
- 8ae1a53 Remove duplicate prototypes by Frank Barchard · 3 years, 2 months ago
- 4c49494 Fix crash on AArch32 in scalar quantized microkernels by Marat Dukhan · 3 years, 2 months ago
- 2fee611 Fix compilation warnings in QU8 GEMM/IGEMM NEONDOT microkernels by Marat Dukhan · 3 years, 2 months ago
- 1ce78ab Leverage Load-Zero WAsm SIMD instructions in Chrome M88 microkernels by Marat Dukhan · 3 years, 2 months ago
- 189c1d0 Support specifying the version of WAsm SIMD instructions by Marat Dukhan · 3 years, 2 months ago
- df8e604 4x8 QU8 Neon Dotproduct microkernel rename from ld64 to ld128 by Frank Barchard · 3 years, 2 months ago
- 90cd7df Fix rewind params for qs8 4x16c4 by Frank Barchard · 3 years, 2 months ago
- 33b4f75 VRND microkernels using native WAsm SIMD instructions by Marat Dukhan · 3 years, 2 months ago
- bd5b027 4x8 QU8 microkernel use 16 byte UDOT to save 4 UDOT by Frank Barchard · 3 years, 2 months ago
- b7a7c30 NEON GEMM/IGEMM microkernels change store/dup to 2 of each by Frank Barchard · 3 years, 2 months ago
- 132774e QU8 microkernels change stores to non-lane STR by Frank Barchard · 3 years, 2 months ago
- 42a17dd Switch scalar gemmlowp to rndnu for benchmarks by Frank Barchard · 3 years, 2 months ago
- 29833fd Change stores to non-lane STR by Frank Barchard · 3 years, 2 months ago
- c37b8da Enable little core microkernel on Big/Little CPUs by Frank Barchard · 3 years, 2 months ago
- efc3ccf Add 4x16c4 cortex_a55 microkernels to GEMM and E2E benchmarks by Frank Barchard · 3 years, 2 months ago
- 1c70764 4x16c4 cortex_a55 microkernel tuning by Frank Barchard · 3 years, 2 months ago
- e7e001f Fix bug in QC8/QS8/QU8 IGEMM DOT16x2 LD128 WAsm SIMD microkernels by Marat Dukhan · 3 years, 2 months ago
- a49e41f QU8 4x16C4 NEON Dot Product GEMM/IGEMM microkernels for Cortex A55r1 by Frank Barchard · 3 years, 2 months ago
- 8589ecd QS8 IGEMM use x11 for params, x10 for a3 and x0 for cn_stride by Frank Barchard · 3 years, 2 months ago
- 4810905 Leverage v128.const WAsm SIMD instruction by Marat Dukhan · 3 years, 2 months ago
- 8dc106e QC8/QS8/QU8 GEMM/IGEMM WAsm SIMD microkernels using i32x4.dot_i16x8_s instruction by Marat Dukhan · 3 years, 2 months ago
- feee77f Leverage f32x4.nearest, f32x4.floor, f32x4.ceil, f32x4.trunc WAsm SIMD instructions by Marat Dukhan · 3 years, 2 months ago
- 5d27a7b Leverage f32x4.nearest, f32x4.floor, f32x4.ceil, f32x4.trunc WAsm SIMD instructions by Marat Dukhan · 3 years, 2 months ago
- 0a3093c QU8 vadd neon use x32 instead of x8 by Frank Barchard · 3 years, 2 months ago
- 7da8b02 Q8 dwconv switch from 8x25 to 16x25 by Frank Barchard · 3 years, 2 months ago
- e252f92 End-to-end benchmarks on QC8 MobileNet v1/v2 models by Marat Dukhan · 3 years, 2 months ago