- 354cbc6 QU8 MUL8 variant of DWCONV by Frank Barchard · 3 years ago
- 79c76ab F16->F32 conversion microkernels in AVX512-SKX implementation by Marat Dukhan · 3 years ago
- f1a6ed3 F16->F32 conversion microkernels in F16C implementation by Marat Dukhan · 3 years ago
- 694d252 Fix incorrect initialization of QC8 GEMM/IGEMM parameters on AArch32+NEONDOT by Marat Dukhan · 3 years ago
- 0bf8afa Leverage f32x4.pmin and f32x4.pmax WAsm SIMD instructions by Marat Dukhan · 3 years ago
- a4ad988 X8 LUT microkernels for WAsm SIMD by Marat Dukhan · 3 years ago
- 2aa2e2a q8 dwconv add channel tiles of 24 and 32 for mul16 rndnu microkernels by Frank Barchard · 3 years ago
- 5cc31e3 Replace _mm512_(loadu/storeu)_epi8 with _mm512_(loadu/storeu)_si512 by Marat Dukhan · 3 years ago
- 37c3077 Avoid _mm512_(loadu/storeu)_epi32 in _mm512_(loadu/storeu)_epi8 polyfills by Marat Dukhan · 3 years, 1 month ago
- b54871d Polyfill _mm512_loadu_epi8 & _mm512_storeu_epi8 for pre GCC-11 by Marat Dukhan · 3 years, 1 month ago
- 67492b0 Expose quantized ELU operator in Subgraph API by Marat Dukhan · 3 years, 1 month ago
- eec0052 QS8 ELU operator by Marat Dukhan · 3 years, 1 month ago
- e4118ef Polyfill vld1q_u8_x4 for older AArch64 gcc versions by Marat Dukhan · 3 years, 1 month ago
- 98e054b Enable vectorized X8 LUT microkernels by Marat Dukhan · 3 years, 1 month ago
- 2b3c410 AVX512BW implementations of X8 LUT microkernels by Marat Dukhan · 3 years, 1 month ago
- 7c478e3 SSSE3, AVX, and AVX2 X8 LUT microkernels by Marat Dukhan · 3 years, 1 month ago
- 5de7bc0 QS8/QU8 Tanh operator using LUT microkernels by Marat Dukhan · 3 years, 1 month ago
- f718232 X8 LUT NEON microkernels by Marat Dukhan · 3 years, 1 month ago
- a4ba5d4 Expose quantized Sigmoid operator in Subgraph API by Marat Dukhan · 3 years, 1 month ago
- 71a9bb1 QS8 Sigmoid operator by Marat Dukhan · 3 years, 1 month ago
- f6c991e Implement generic LUT-based elementwise operator by Marat Dukhan · 3 years, 1 month ago
- d67539d Auto-generate X8 LUT microkernels and tests by Marat Dukhan · 3 years, 1 month ago
- cdf59a5 Add QU8 NR=32 microkernels by Frank Barchard · 3 years, 1 month ago
- d460d0b Neon IGEMM do remainder with reversed MR for shifts by Frank Barchard · 3 years, 1 month ago
- dfe763f Expose quantized Subtract operator in Subgraph API by Marat Dukhan · 3 years, 1 month ago
- b8cbcb5 Fuse rounding term into bias in QS8 & QU8 VADD[C] microkernels by Marat Dukhan · 3 years, 1 month ago
- 031ff4b Template bug fix in stores for remainder of 8 in Neon QS8 microkernels by Frank Barchard · 3 years, 1 month ago
- 8e2fd20 QS8 and QU8 Subtract ND operators by Marat Dukhan · 3 years, 1 month ago
- ec5c129 Template bug fix in stores for remainder of 8. by Frank Barchard · 3 years, 1 month ago
- 6428725 Rename ADD quantization parameters to ADDSUB by Marat Dukhan · 3 years, 1 month ago
- 8ae1a53 Remove duplicate prototypes by Frank Barchard · 3 years, 1 month ago
- 4c49494 Fix crash on AArch32 in scalar quantized microkernels by Marat Dukhan · 3 years, 1 month ago
- 2fee611 Fix compilation warnings in QU8 GEMM/IGEMM NEONDOT microkernels by Marat Dukhan · 3 years, 1 month ago
- 1ce78ab Leverage Load-Zero WAsm SIMD instructions in Chrome M88 microkernels by Marat Dukhan · 3 years, 1 month ago
- 189c1d0 Support specifying the version of WAsm SIMD instructions by Marat Dukhan · 3 years, 1 month ago
- df8e604 4x8 QU8 Neon Dotproduct microkernel rename from ld64 to ld128 by Frank Barchard · 3 years, 1 month ago
- 90cd7df Fix rewind params for qs8 4x16c4 by Frank Barchard · 3 years, 1 month ago
- 33b4f75 VRND microkernels using native WAsm SIMD instructions by Marat Dukhan · 3 years, 1 month ago
- bd5b027 4x8 QU8 microkernel use 16 byte UDOT to save 4 UDOT by Frank Barchard · 3 years, 1 month ago
- b7a7c30 NEON GEMM/IGEMM microkernels change store/dup to 2 of each by Frank Barchard · 3 years, 1 month ago
- 132774e QU8 microkernels change stores to non-lane STR by Frank Barchard · 3 years, 1 month ago
- 29833fd Change stores to non-lane STR by Frank Barchard · 3 years, 1 month ago
- c37b8da Enable little core microkernel on Big/Little CPUs by Frank Barchard · 3 years, 1 month ago
- 1c70764 4x16c4 cortex_a55 microkernel tuning by Frank Barchard · 3 years, 1 month ago
- e7e001f Fix bug in QC8/QS8/QU8 IGEMM DOT16x2 LD128 WAsm SIMD microkernels by Marat Dukhan · 3 years, 1 month ago
- a49e41f QU8 4x16C4 NEON Dot Product GEMM/IGEMM microkernels for Cortex A55r1 by Frank Barchard · 3 years, 1 month ago
- 8589ecd QS8 IGEMM use x11 for params, x10 for a3 and x0 for cn_stride by Frank Barchard · 3 years, 1 month ago
- 4810905 Leverage v128.const WAsm SIMD instruction by Marat Dukhan · 3 years, 1 month ago
- 8dc106e QC8/QS8/QU8 GEMM/IGEMM WAsm SIMD microkernels using i32x4.dot_i16x8_s instruction by Marat Dukhan · 3 years, 1 month ago
- feee77f Leverage f32x4.nearest, f32x4.floor, f32x4.ceil, f32x4.trunc WAsm SIMD instructions by Marat Dukhan · 3 years, 1 month ago
- 5d27a7b Leverage f32x4.nearest, f32x4.floor, f32x4.ceil, f32x4.trunc WAsm SIMD instructions by Marat Dukhan · 3 years, 1 month ago
- 0a3093c QU8 vadd neon use x32 instead of x8 by Frank Barchard · 3 years, 1 month ago
- 7da8b02 Q8 dwconv switch from 8x25 to 16x25 by Frank Barchard · 3 years, 1 month ago
- 0d06573 dwconv Q8 switch from 8x9 to 16x9 tile. by Frank Barchard · 3 years, 1 month ago
- 1215c9a QS8 NEON GEMM microkernels use rewind instead of reload by Frank Barchard · 3 years, 1 month ago
- 6b30b73 Remainder branch move before label. by Frank Barchard · 3 years, 1 month ago
- fec7363 QU8 C4 4x8 rename registers to avoid 3 push/pops. by Frank Barchard · 3 years, 1 month ago
- 8b69802 Enable QU8 C4 NEON Dot Product GEMM/IGEMM microkernels for Cortex A55r1 by Frank Barchard · 3 years, 1 month ago
- ca4c68e QU8 C4 NEON Dot Product GEMM/IGEMM microkernels for Cortex A55r1 by Frank Barchard · 3 years, 1 month ago
- 56f157c Relabel branches for quantized assembly ARM microkernels by Frank Barchard · 3 years, 1 month ago
- 0c76422 QU8 NEON Assembly remove channel wise by Frank Barchard · 3 years, 1 month ago
- 408f153 Enable QU8 4x16 C4 NEON Assembly Dot Product GEMM/IGEMM microkernels by Frank Barchard · 3 years, 1 month ago
- 4066898 QU8 4x16 C4 NEON Assembly Dot Product GEMM/IGEMM microkernels by Frank Barchard · 3 years, 1 month ago
- 3b9b4bc Fix VCLAMP parameter initialization functions on pre-NEON ARM by Marat Dukhan · 3 years, 1 month ago
- a38bf33 QU8 4x8c4 rewind params with SUB by Frank Barchard · 3 years, 1 month ago
- b48f367 QU8 4x8 C4 NEON reload params during subtract by Frank Barchard · 3 years, 1 month ago
- 073185e QU8 4x8 C4 NEON Assembly Dot Product use partial sums on zero point by Frank Barchard · 3 years, 1 month ago
- 0049e89 QU8 C4 NEON Assembly Dot Product GEMM/IGEMM microkernels by Frank Barchard · 3 years, 1 month ago
- 6fe565e QU8 neondot use C2 partial sum for zero point accumulators. by Frank Barchard · 3 years, 1 month ago
- 3f2074f QU8 neondot use uint32x2 for zero point and accumulators by Frank Barchard · 3 years, 1 month ago
- 7a8dd87 Work around generating v128.storeXX_lane for quantized WAsm SIMD microkernels by Marat Dukhan · 3 years, 1 month ago
- de9c64a Enable 4x16 QU8 dot product microkernels by Frank Barchard · 3 years, 1 month ago
- 9cedb59 Accumulate in 16 bits once in WAsm SIMD MUL16 QS8/QC8 DWCONV before extending to 32 bits by Marat Dukhan · 3 years, 1 month ago
- a74310a Remove UDOT by zero point along the N axis by Frank Barchard · 3 years, 1 month ago
- 0d00baa Support quantized Clamp and Max Pooling operators in Subgraph API by Marat Dukhan · 3 years, 2 months ago
- 61c0c9e Clamp NC operator for S8 data type by Marat Dukhan · 3 years, 2 months ago
- 9491279 Refactor parameter initialization for VCLAMP microkernels by Marat Dukhan · 3 years, 2 months ago
- e79acb7 S8 VCLAMP microkernels by Marat Dukhan · 3 years, 2 months ago
- 1f5b108 Refactor U8 CLAMP microkernels by Marat Dukhan · 3 years, 2 months ago
- 2ea50a0 Refactor U8 MAXPOOL microkernels similarly to S8 MAXPOOL by Marat Dukhan · 3 years, 2 months ago
- dc5c148 S8 Max Pooling operator by Marat Dukhan · 3 years, 2 months ago
- 2314753 S8 MAXPOOL microkernels for all architectures by Marat Dukhan · 3 years, 2 months ago
- f158942 WAsm SIMD implementation of U8 MAXPOOL microkernel by Marat Dukhan · 3 years, 2 months ago
- 91ae165 Refactor initialization of MAXPOOL microkernel parameters by Marat Dukhan · 3 years, 2 months ago
- e033126 Generate more tile sizes for QU8 gemm/igemm by Frank Barchard · 3 years, 2 months ago
- b1cd381 Enable dot product microkernels for QU8 on Cortex A55 by Frank Barchard · 3 years, 2 months ago
- 2025515 Enable dot product microkernels for QU8 on ARM by Frank Barchard · 3 years, 2 months ago
- 88e839c QU8 C4 NEON Dot Product GEMM/IGEMM microkernels by Frank Barchard · 3 years, 2 months ago
- 0c2a31e Improve unpacking in SSE4+ QC8/QS8/QU8 GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 2 months ago
- 36fe5aa Remove WAsm SIMD QS8 DWCONV microkernels with GEMMLOWP requantization by Marat Dukhan · 3 years, 2 months ago
- 88c2da6 Run template code generators by Frank Barchard · 3 years, 2 months ago
- b43c5ef Fix indent on C4 Neon Dot Product GEMM/IGEMM microkernels by Frank Barchard · 3 years, 2 months ago
- 8c96521 Fix Static Constant Padding + Convolution 2D fusion with quantization by Marat Dukhan · 3 years, 2 months ago
- e0a20d6 Expose quantized Static Constant Pad operator in Subgraph API by Marat Dukhan · 3 years, 2 months ago
- 139e961 X8 version of Constant Pad ND operator by Marat Dukhan · 3 years, 2 months ago
- 07706f6 Replace generic shuffle with narrow instructions in WAsm SIMD QS8/QU8/QC8 microkernels by Marat Dukhan · 3 years, 2 months ago
- dfc2db0 Add prefix to QC8/QS8/QU8 WAsm SIMD GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 2 months ago
- 637a038 Fix implicit pointer cast warning in FP16ARITH microkernels by Marat Dukhan · 3 years, 2 months ago
- 0461f2d Generalize PAD microkernels to all 8-/16-/32-bit data types by Marat Dukhan · 3 years, 2 months ago
- 3e9dc22 Remove WAsm SIMD GEMM/IGEMM microkernels with GEMMLOWP requantization by Marat Dukhan · 3 years, 2 months ago