  1. 4c3e5a9 GEMM benchmark assembly microkernels before intrinsics. by Frank Barchard · 3 years, 1 month ago
  2. e79acb7 S8 VCLAMP microkernels by Marat Dukhan · 3 years, 1 month ago
  3. 1f5b108 Refactor U8 CLAMP microkernels by Marat Dukhan · 3 years, 1 month ago
  4. 2ea50a0 Refactor U8 MAXPOOL microkernels similarly to S8 MAXPOOL by Marat Dukhan · 3 years, 1 month ago
  5. dc5c148 S8 Max Pooling operator by Marat Dukhan · 3 years, 1 month ago
  6. 2314753 S8 MAXPOOL microkernels for all architectures by Marat Dukhan · 3 years, 1 month ago
  7. f158942 WAsm SIMD implementation of U8 MAXPOOL microkernel by Marat Dukhan · 3 years, 2 months ago
  8. 91ae165 Refactor initialization of MAXPOOL microkernel parameters by Marat Dukhan · 3 years, 2 months ago
  9. ee69093 Enable shell and node environments for WAsm binaries by Marat Dukhan · 3 years, 2 months ago
  10. 9098aba E2E for QU8 GEMM microkernels by Frank Barchard · 3 years, 2 months ago
  11. e033126 Generate more tile sizes for QU8 gemm/igemm by Frank Barchard · 3 years, 2 months ago
  12. b1cd381 Enable dot production microkernels for QU8 on Cortex A55 by Frank Barchard · 3 years, 2 months ago
  13. 2025515 Enable dot production microkernels for QU8 on ARM by Frank Barchard · 3 years, 2 months ago
  14. 88e839c QU8 C4 NEON Dot Product GEMM/IGEMM microkernels by Frank Barchard · 3 years, 2 months ago
  15. cf557d4 Increase timeout for Multiply ND operator test by Marat Dukhan · 3 years, 2 months ago
  16. c67779e Fix flakiness in QS8/QC8 NHWC Deconvolution tests by Marat Dukhan · 3 years, 2 months ago
  17. e7991e7 Minor refactoring of RNG in Fully Connected tester by Marat Dukhan · 3 years, 2 months ago
  18. 57c7827 Fix flakiness in QS8/QC8 NHWC Convolution tests by Marat Dukhan · 3 years, 2 months ago
  19. b3faed3 Fix flakiness in QC8/QS8 GEMM/IGEMM/DWCONV microkernels by Marat Dukhan · 3 years, 2 months ago
  20. 0c2a31e Improve unpacking in SSE4+ QC8/QS8/QU8 GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 2 months ago
  21. d960231 Remove tests for WAsm SIMD QS8 DWCONV microkernels with GEMMLOWP requantization by Marat Dukhan · 3 years, 2 months ago
  22. 36fe5aa Remove WAsm SIMD QS8 DWCONV microkernels with GEMMLOWP requantization by Marat Dukhan · 3 years, 2 months ago
  23. 88c2da6 Run template code generators by Frank Barchard · 3 years, 2 months ago
  24. b43c5ef Fix indent on C4 Neon Dot Product GEMM/IGEMM microkernels by Frank Barchard · 3 years, 2 months ago
  25. 8c96521 Fix Static Constant Padding + Convolution 2D fusion with quantization by Marat Dukhan · 3 years, 2 months ago
  26. e0a20d6 Expose quantized Static Constant Pad operator in Subgraph API by Marat Dukhan · 3 years, 2 months ago
  27. 139e961 X8 version of Constant Pad ND operator by Marat Dukhan · 3 years, 2 months ago
  28. 07706f6 Replace generic shuffle with narrow instructions in WAsm SIMD QS8/QU8/QC8 microkernels by Marat Dukhan · 3 years, 2 months ago
  29. dfc2db0 Add prefix to QC8/QS8/QU8 WAsm SIMD GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 2 months ago
  30. 637a038 Fix implicit pointer cast warning in FP16ARITH microkernels by Marat Dukhan · 3 years, 2 months ago
  31. 0461f2d Generalize PAD microkernels to all 8-/16-/32-bit data types by Marat Dukhan · 3 years, 2 months ago
  32. 3e9dc22 Remove WAsm SIMD GEMM/IGEMM microkernels with GEMMLOWP requantization by Marat Dukhan · 3 years, 2 months ago
  33. 933051b Generalize FILL microkernels to all 8-/16-/32-bit data types by Marat Dukhan · 3 years, 2 months ago
  34. 7c74aff Add F32 VLRELU benchmarks by Marat Dukhan · 3 years, 2 months ago
  35. 4486f87 Prune NEON-DOT QS8 GEMM/IGEMM microkernels with FP32 & GEMMLOWP requantization by Marat Dukhan · 3 years, 2 months ago
  36. 400e7cb Prune WAsm SIMD QS8 GEMM/IGEMM microkernels with GEMMLOWP requantization by Marat Dukhan · 3 years, 2 months ago
  37. e16bf7d Prune AVX2/AVX512 QS8 GEMM/IGEMM microkernels with GEMMLOWP requantization by Marat Dukhan · 3 years, 2 months ago
  38. dc020ff Add 4x16c4 rndnu e2e benchmark for qs8. by Frank Barchard · 3 years, 2 months ago
  39. 8634f7e Refactor F32 VHSWISH benchmarks by Marat Dukhan · 3 years, 2 months ago
  40. 12e426c Refactor F32 VELU benchmarks by Marat Dukhan · 3 years, 2 months ago
  41. 9f8ea9b Refactor F32 VSIGMOID benchmarks by Marat Dukhan · 3 years, 2 months ago
  42. 5aeb32b Refactor F32 VSQRT benchmarks by Marat Dukhan · 3 years, 2 months ago
  43. 3b6c36e Refactor F32 VRELU benchmarks by Marat Dukhan · 3 years, 2 months ago
  44. 66a3ca1 Initialize QS8 microkernel pointers on pre-NEON ARM architecture by Marat Dukhan · 3 years, 2 months ago
  45. 8674629 Use QS8 GEMM WAsm SIMD microkernels with FP32 requantization in the benchmark by Marat Dukhan · 3 years, 2 months ago
  46. 0ff7989 Use FP32 requantization for extended-weights QS8 GEMM microkernels on x86 by Marat Dukhan · 3 years, 2 months ago
  47. 529d2c1 Remove x86 QS8 GEMM microkernels with GEMMLOWP requantization from benchmarks by Marat Dukhan · 3 years, 2 months ago
  48. ec47958 Prune redundant NEON GEMM/IGEMM microkernels with GEMMLOWP requantization by Marat Dukhan · 3 years, 2 months ago
  49. 348ed6d Add ISA checks in QS8/QU8 requantization tests by Marat Dukhan · 3 years, 2 months ago
  50. f879d9e Add qs8-requantization-test to CMake build by Marat Dukhan · 3 years, 2 months ago
  51. 3c5e662 Initialize QU8 VMUL[C] microkernels for pre-NEON ARM by Marat Dukhan · 3 years, 2 months ago
  52. 066a0cb Evaluate conversion-based WAsm SIMD implementations in the rounding benchmark by Marat Dukhan · 3 years, 2 months ago
  53. 2dac7bb Unify on wasm_f64x2_splat(0.0) to materialize zero SIMD vector in WAsm by Marat Dukhan · 3 years, 2 months ago
  54. d4db6af Replace wasm_i32x4_lt(vzero, vXX) with wasm_i32x4_shr(vXX, 31) by Marat Dukhan · 3 years, 2 months ago
  55. ebb6207 QU8 4x16 IGEMM remove push for X21 register by Frank Barchard · 3 years, 2 months ago
  56. 8a211a3 Check parameter initialization functions for non-NULL before calling by Marat Dukhan · 3 years, 2 months ago
  57. 085883b Remove references to Google-specific headers in BUILD.bazel by Marat Dukhan · 3 years, 2 months ago
  58. e145d56 Fix incompatibilities with AArch64 gcc in FP16 microkernels by Marat Dukhan · 3 years, 2 months ago
  59. eca1ea9 Fix typo in QU8 VMUL[C] NEON microkernels by Marat Dukhan · 3 years, 2 months ago
  60. 1e6fc21 Fix incompatible pointer type in QU8 DWCONV NEON microkernels by Marat Dukhan · 3 years, 2 months ago
  61. 1d90101 Fix GCC incompatibility in QS8/QU8 NEON microkernels by Marat Dukhan · 3 years, 2 months ago
  62. 8431a06 Include intrinsics polyfill on NEONV8 QS8/QU8 VMUL[C] microkernels by Marat Dukhan · 3 years, 2 months ago
  63. 91351ef Allocate additional XNN_EXTRA_BYTES for input in QS8/QU8 GEMM benchmarks by Marat Dukhan · 3 years, 2 months ago
  64. 2c6d196 Q8 4x16 and 1x16 Neon GEMM/IGEMM quantize using V0-V3 by Frank Barchard · 3 years, 2 months ago
  65. fbe0c6f Q8 4x16 Neon IGEMM quantize using V0-V3 by Frank Barchard · 3 years, 2 months ago
  66. 599d3db Fix CMake build by Marat Dukhan · 3 years, 2 months ago
  67. f479a1c Initialize QU8 4x16 Neon assembly microkernel for each ARM CPU. by Frank Barchard · 3 years, 2 months ago
  68. e961ecf QU8 benchmark remove quantization from Neon names for consistency by Frank Barchard · 3 years, 2 months ago
  69. 18f32f5 Expose quantized Multiply operator in Subgraph API by Marat Dukhan · 3 years, 2 months ago
  70. 0853b8a QS8/QU8 Multiply ND operators by Marat Dukhan · 3 years, 2 months ago
  71. 8b024c9 QS8/QU8 VMULC microkernel benchmark by Marat Dukhan · 3 years, 2 months ago
  72. fb3a94f QU8 4x16 Neon assembly microkernel for Cortex A75 by Frank Barchard · 3 years, 2 months ago
  73. 795e5ab QS8/QU8 VMUL microkernel benchmarks by Marat Dukhan · 3 years, 2 months ago
  74. 4a7b70f QS8/QU8 VMUL[C] microkernels in NEON implementation by Marat Dukhan · 3 years, 2 months ago
  75. 7999341 QS8/QU8 VMUL[C] microkernels in scalar implementation by Marat Dukhan · 3 years, 2 months ago
  76. a962f1e Enable QU8 4x16 Neon assembly microkernel by Frank Barchard · 3 years, 2 months ago
  77. 86a1618 QU8 Neon params replace pad with duplicated zero_point by Frank Barchard · 3 years, 2 months ago
  78. e2163bc Benchmark QU8 4x16 Neon assembly GEMM microkernel - Rename QS8 benchmarks to QU8 by Frank Barchard · 3 years, 2 months ago
  79. 87bd511 Fix small issues in binary elementwise microkernel test gen by Marat Dukhan · 3 years, 2 months ago
  80. 59ed1da QU8 4x16 Neon assembly microkernel by Frank Barchard · 3 years, 2 months ago
  81. 661ea6d QS8/QU8 VMUL[C] microkernels in WAsm SIMD implementation by Marat Dukhan · 3 years, 2 months ago
  82. a212eac QS8/QU8 VMUL[C] microkernels in SSE2/SSE4.1/AVX implementation by Marat Dukhan · 3 years, 2 months ago
  83. bea849a QS8 Deconvolution operator by Marat Dukhan · 3 years, 2 months ago
  84. 6967eb0 Add a rewind variable for params. - no impact on code, just simplified source by Frank Barchard · 3 years, 2 months ago
  85. eb3cff3 LD128 versions of QS8/QU8 VADD[C] NEON microkernels by Marat Dukhan · 3 years, 2 months ago
  86. 01debd9 Optimize QS8 VADD[C] microkernel selection on ARM/ARM64 by Marat Dukhan · 3 years, 2 months ago
  87. 1ef9de8 QU8 VADD/VADDC microkernel benchmarks by Marat Dukhan · 3 years, 2 months ago
  88. 83a8d2f QS8 VADD/VADDC microkernel benchmarks by Marat Dukhan · 3 years, 2 months ago
  89. bbe8824 Enable AVX2 MUL16 ADD16 microkernels in QS8 DWCONV benchmarks by Marat Dukhan · 3 years, 2 months ago
  90. 60bb7ec Accumulate in 16 bits once in AVX2 MUL16 VPUNPCK QS8/QC8 DWCONV before extending to 32 bits by Marat Dukhan · 3 years, 2 months ago
  91. 793c8da QS8 igemm comment for zero use int8_t* instead of float* by Frank Barchard · 3 years, 2 months ago
  92. 881ab02 AVX2 MUL16 QS8/QC8 DWCONV microkernels using VPUNPCK instructions to extend the product by Marat Dukhan · 3 years, 2 months ago
  93. 2848059 Optimize QC8 DWCONV microkernel selection on AVX and XOP by Marat Dukhan · 3 years, 2 months ago
  94. cc96770 Evaluate MUL32 XOP QS8 DWCONV microkernels in E2E benchmark by Marat Dukhan · 3 years, 2 months ago
  95. 195b72f Split microkernel lists in CMakeLists into production and non-production by Marat Dukhan · 3 years, 2 months ago
  96. 2c72495 Split microkernel lists in BUILD file into production and non-production by Marat Dukhan · 3 years, 2 months ago
  97. 02f06e3 Fix QS8 DWCONV microkernel selection for XOP processors by Marat Dukhan · 3 years, 2 months ago
  98. db3b0a7 Refactor microkernel lists in BUILD and CMakeLists.txt by Marat Dukhan · 3 years, 2 months ago
  99. caa7fc7 Optimize selection of QU8 DWCONV microkernel on AVX processors by Marat Dukhan · 3 years, 2 months ago
  100. 6084fb8 E2E benchmark for QU8 DWCONV microkernels by Marat Dukhan · 3 years, 2 months ago