  1. 3357d9d Minor optimizations in NEON QS8 GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 1 month ago
  2. e742d2a Re-generate QS8 GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 1 month ago
  3. 533410e QS8 A53 GEMM bug fix for X1 - re-enable E2E by Frank Barchard · 3 years, 1 month ago
  4. 16d79ed Polyfill vcvtnq_s32_f32 for AArch32 GCC by Marat Dukhan · 3 years, 1 month ago
  5. 0ae35f2 QS8 LD128 GEMM/IGEMM dot product 4x16 microkernel by Frank Barchard · 3 years, 1 month ago
  6. 8228689 Support QC8 DWCONV microkernels by Marat Dukhan · 3 years, 1 month ago
  7. 7c9f1f9 Replace // with # for lines that only contain a comment. by Frank Barchard · 3 years, 1 month ago
  8. fc188ed QC8 GEMM/IGEMM microkernels for SSE/AVX/XOP by Marat Dukhan · 3 years, 1 month ago
  9. c3e3f1c QC8 GEMM/IGEMM microkernels for AVX512 by Marat Dukhan · 3 years, 1 month ago
  10. e06c813 Support QC8 IGEMM microkernels by Marat Dukhan · 3 years, 1 month ago
  11. 18630de QS8 NEONDOT GEMM/IGEMM microkernels with FP32 requantization by Marat Dukhan · 3 years, 1 month ago
  12. 801d2c2 Fix QS8 IGEMM with FP32 requantization for SSE/AVX/XOP by Marat Dukhan · 3 years, 1 month ago
  13. e695791 4x16C4 QS8 IGEMM Cortex A55 microkernel reuse X10 to save push by Frank Barchard · 3 years, 1 month ago
  14. 4a2d255 Remove redundant SSSE3 microkernels with FP32 requantization by Marat Dukhan · 3 years, 1 month ago
  15. c46e671 FP32 requantization in QS8 GEMM/IGEMM microkernels for SSE/AVX/XOP by Marat Dukhan · 3 years, 1 month ago
  16. c08221f Apply text format to assembly for consistency by Frank Barchard · 3 years, 2 months ago
  17. 1c538cd Add templates for all QS8 IGEMM assembly microkernels. by Frank Barchard · 3 years, 2 months ago
  18. 71855ee Support FP32 requantization in AVX512 QS8 microkernels by Marat Dukhan · 3 years, 2 months ago
  19. d4c7d82 AVX512-specific parameters for QS8 microkernels by Marat Dukhan · 3 years, 2 months ago
  20. 9b474cf Support FP32 requantization in AVX2 QS8 microkernels by Marat Dukhan · 3 years, 2 months ago
  21. f86ee8b Refactor requantization helper functions by Marat Dukhan · 3 years, 2 months ago
  22. e3d17bf Rename microkernel-related types and structures by Marat Dukhan · 3 years, 2 months ago
  23. b07c26a Rename QS8 GEMM/IGEMM/DWCONV microkernels by Marat Dukhan · 3 years, 2 months ago
  24. d65d20e Rename QS8 GEMM/IGEMM microkernel filenames by Marat Dukhan · 3 years, 2 months ago
  25. 0b57154 4x16C4 QS8 IGEMM Cortex A75 microkernel reuse X8 to save push by Frank Barchard · 3 years, 2 months ago
  26. e091adb 4x16 QS8 GEMM/IGEMM Cortex A53 microkernels reduce to use 2 GPR for temp by Frank Barchard · 3 years, 2 months ago
  27. 748fd12 Use specialized layouts in SSE4/AVX2 QS8 [I]GEMM & DWCONV microkernels by Marat Dukhan · 3 years, 2 months ago
  28. 4bb82cc 4x16 QS8 IGEMM microkernels use x8 for temp by Frank Barchard · 3 years, 2 months ago
  29. 4be4bd7 4x16 QS8 IGEMM microkernels use x14 for A1 by Frank Barchard · 3 years, 2 months ago
  30. fb672aa 4x16 QS8 IGEMM microkernel for Cortex A53 avoid a push by Frank Barchard · 3 years, 2 months ago
  31. d4416d6 4x16 QS8 microkernel for Cortex A53 by Frank Barchard · 3 years, 2 months ago
  32. 76f43f0 Apply consistent formatting to assembly by Frank Barchard · 3 years, 2 months ago
  33. a24cc08 Small refactoring of scalar QS8 microkernels by Marat Dukhan · 3 years, 2 months ago
  34. a1a4e78 Scalar QS8 GEMM and IGEMM microkernels by Marat Dukhan · 3 years, 2 months ago
  35. 938ea81 Code generate 1x8C8 microkernel for Cortex A75 with and without prfm by Frank Barchard · 3 years, 2 months ago
  36. b639210 Add prefetch of A for quantized microkernels. by Frank Barchard · 3 years, 2 months ago
  37. e111861 1x8 C8 A53 microkernel defer adap by Frank Barchard · 3 years, 2 months ago
  38. 7c4c771 C8 A53 microkernels prefetch A by Frank Barchard · 3 years, 2 months ago
  39. 2a3169d C8 A53 microkernels move 2nd load after MLA by Frank Barchard · 3 years, 2 months ago
  40. dddb38f QS8 1x8C8 IGEMM microkernel for Cortex A53 by Frank Barchard · 3 years, 2 months ago
  41. 2de3bce A53 C8 microkernel load A with ldr/ldr/ins by Frank Barchard · 3 years, 2 months ago
  42. 5549735 4X8 and 4x16 mla lane microkernels for A53 by Frank Barchard · 3 years, 2 months ago
  43. d68e114 Cortex A53 tuned C8 gemm/igemm microkernels by Frank Barchard · 3 years, 3 months ago
  44. 1f51d38 Add prefetch to MLA lane microkernel by Frank Barchard · 3 years, 3 months ago
  45. 4c6640c Disable MSan in QS8 GEMM/IGEMM microkernels with KR>1 by Marat Dukhan · 3 years, 3 months ago
  46. 4a35204 PRFM variant of QS8 C8 Neon microkernel. by Frank Barchard · 3 years, 3 months ago
  47. 2e42787 2x4c2/3x4c2 microkernels for SSE2/SSSE3/SSE4.1/AVX/XOP by Marat Dukhan · 3 years, 3 months ago
  48. e696c3f QS8 move loads to end of loop, 1 every 2 neon instructions. by Frank Barchard · 3 years, 3 months ago
  49. a3c1633 AVX versions of QS8 GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 3 months ago
  50. b8ad46a Refactor code-generation templates for XOP microkernels by Marat Dukhan · 3 years, 3 months ago
  51. ae5082e QS8 C8 GEMM/IGEMM use load a/b last technique for Cortex A75 performance. by Frank Barchard · 3 years, 3 months ago
  52. c409471 Include XOP headers in clang-cl compatible way. Fix #1382. by Marat Dukhan · 3 years, 3 months ago
  53. 6e35de5 QS8 1X8C8 IGEMM microkernel by Frank Barchard · 3 years, 4 months ago
  54. 2c525e5 MOV 16b instead of 4s for GCC compatibility. Fix #1360 by Frank Barchard · 3 years, 4 months ago
  55. f5f9cec Miscellaneous tweaks to QS8 IGEMM microkernels by Frank Barchard · 3 years, 4 months ago
  56. cbb8e70 QS8 2x8c8-aarch64-neon-mlal-padal IGEMM microkernel by Frank Barchard · 3 years, 4 months ago
  57. 7ca54df QS8 2x8c16-aarch64-neondot-ld64 IGEMM microkernel by Frank Barchard · 3 years, 4 months ago
  58. 671d1b0 QS8 4x16c4-aarch64-neondot-ld64 IGEMM microkernel by Frank Barchard · 3 years, 4 months ago
  59. 89e12f8 QS8 IGEMM for Cortex A55 by Frank Barchard · 3 years, 4 months ago
  60. 62b4ff7 Remove 12x8 QS8 GEMM and IGEMM Neon dotproduct microkernels. by Frank Barchard · 3 years, 4 months ago
  61. da78da1 QS8 C8 Neon microkernels with MUL and MLA versions. by Frank Barchard · 3 years, 4 months ago
  62. 618d85d QS8 Neon dot product intrinsics GEMM and IGEMM microkernels reduced remainder code. by Frank Barchard · 3 years, 4 months ago
  63. 6d8ca7d Quantized GEMM/IGEMM microkernels bump kc to be a multiple of channels. by Frank Barchard · 3 years, 4 months ago
  64. 02121ca QS8 Neon IGEMM microkernels with 8 bit MUL using DUP by Frank Barchard · 3 years, 5 months ago
  65. 01c341b C8 MLA Neon GEMM/IGEMM microkernels count k down from kc. by Frank Barchard · 3 years, 5 months ago
  66. 36f95cf QS8 Neon IGEMM C16 microkernel with two 8 bit multiplies and vpadal to accumulate. by Frank Barchard · 3 years, 5 months ago
  67. a0fe11d QS8 C8 Neon remove remainder handling code and rewind the A pointers by kc by Frank Barchard · 3 years, 5 months ago
  68. 6fa8078 QS8 C2 Neon igemm by Frank Barchard · 3 years, 5 months ago
  69. d79391d QS8 C8 Neon igemm by Frank Barchard · 3 years, 5 months ago
  70. fe14b85 Add space after casting by Frank Barchard · 3 years, 5 months ago
  71. ec0bf14 QS8 GEMM and IGEMM 3x8 3x16 and IGEMM 4x8 and 4x16 by Frank Barchard · 3 years, 6 months ago
  72. 146e999 Replace QS8 4x8 with 2x8 neon microkernel. Improves performance for aarch32. by Frank Barchard · 3 years, 9 months ago
  73. 66ccf64 Rename QS8 generator templates by Marat Dukhan · 3 years, 10 months ago
  74. a48848f 4x8, 6x8 and 8x16 Neon dot product GEMM microkernels by Frank Barchard · 3 years, 10 months ago
  75. 2fa1745 6x16 QS8 GEMM for Neon dot product by Frank Barchard · 3 years, 10 months ago
  76. ef4ce31 Remove trailing whitespace by Marat Dukhan · 3 years, 10 months ago
  77. d4c8303 Enable NEON DOT QS8 [I]GEMM microkernels on ARM64 by Marat Dukhan · 3 years, 10 months ago
  78. 12c5777 Optimization: 2x partial unroll to load 8 contiguous bytes. by Benoit Jacob · 3 years, 11 months ago
  79. a05487f Add xnn_qs8_igemm_minmax_ukernel_${MR}x${NR}c4__neondot (ARMv8.2+dotprod). by Benoit Jacob · 4 years ago
  80. 0af63ab Include polyfills for intrinsics in QS8 AVX512 GEMM/IGEMM microkernels by Marat Dukhan · 4 years ago
  81. bb00b1d AVX512 variants of QS8 GEMM and IGEMM microkernels by Marat Dukhan · 4 years ago
  82. f124e88 Polyfill _mm_loadu_si32 and _mm_storeu_si32 intrinsics by Marat Dukhan · 4 years ago
  83. 27203da WAsm SIMD versions of QS8 GEMM and IGEMM microkernels by Marat Dukhan · 4 years ago
  84. 23848db Reoptimize x86 requantization by Marat Dukhan · 4 years ago
  85. 40bbafe NEON variants of QS8 GEMM & IGEMM microkernels by Marat Dukhan · 4 years ago
  86. e7edc80 Add 3x4c8 variants of SSE2/SSSE3/SSE4.1/XOP GEMM/IGEMM microkernels by Marat Dukhan · 4 years ago
  87. 1280952 AVX2 version of QS8 GEMM and IGEMM microkernels by Marat Dukhan · 4 years ago
  88. 1566fee XOP versions of QS8 GEMM/IGEMM microkernels by Marat Dukhan · 4 years ago
  89. 07bd252 QS8 IGEMM MRx4c8 SSE2/SSSE3/SSE4.1 microkernels by Marat Dukhan · 4 years ago
  90. dee732b LD128 versions of QS8 GEMM SSE2/SSSE3/SSE4.1 microkernels by Marat Dukhan · 4 years ago
  91. 14d3ce8 Add LD64 suffix in QS8 GEMM/IGEMM microkernels by Marat Dukhan · 4 years ago
  92. 733d0be QS8 GEMM MRx4c8 SSE2/SSSE3/SSE4.1 microkernels by Marat Dukhan · 4 years ago
  93. f948068 QS8 IGEMM microkernels and infrastructure by Marat Dukhan · 4 years ago