  1. b0da47a QS8 C8 neon microkernel load B at end of loop and PADAP at top of loop. by Frank Barchard · 3 years, 3 months ago
  2. 8e58994 2x8c8__aarch64_neon_mlal_padal GEMM microkernel load A0 last by Frank Barchard · 3 years, 3 months ago
  3. 7ca54df QS8 2x8c16-aarch64-neondot-ld64 IGEMM microkernel by Frank Barchard · 3 years, 4 months ago
  4. 7825897 C8 mul microkernel labels sorted and registers documented by Frank Barchard · 3 years, 4 months ago
  5. 2f06150 xnn_qs8_gemm_minmax_ukernel_2x8c8__aarch64_neon_mlal_padal GEMM microkernel by Frank Barchard · 3 years, 4 months ago
  6. 1dc9fef QS8 2x8c8-aarch64 GEMM microkernel by Frank Barchard · 3 years, 4 months ago
  7. baf46fc Tuned QS8 GEMM 2x8c16 MLAL PADAL assembly microkernel for AArch64 by Frank Barchard · 3 years, 4 months ago
  8. 5655cb7 QS8 GEMM 2x8c16 MLAL PADAL assembly microkernel for AArch64 by Frank Barchard · 3 years, 4 months ago
  9. b7941cb Round KC up for assembly microkernels. by Frank Barchard · 3 years, 4 months ago
  10. 62b4ff7 Remove 12x8 QS8 GEMM and IGEMM Neon dotproduct microkernels. by Frank Barchard · 3 years, 4 months ago
  11. da78da1 QS8 C8 Neon microkernels with MUL and MLA versions. by Frank Barchard · 3 years, 4 months ago
  12. 618d85d QS8 Neon dot product intrinsics GEMM and IGEMM microkernels reduced remainder code. by Frank Barchard · 3 years, 4 months ago
  13. d76a37b Re-label branch targets in c4-neondot assembly QS8 GEMM microkernels. by Frank Barchard · 3 years, 4 months ago
  14. 4a4be4e QS8 1x16c4 ld32 GEMM microkernel using NEON dot product by Frank Barchard · 3 years, 4 months ago
  15. 7aa4bfd QS8 Cortex A55 GEMM microkernel bump kc to be a multiple of channels. by Frank Barchard · 3 years, 4 months ago
  16. 6d8ca7d Quantized GEMM/IGEMM microkernels bump kc to be a multiple of channels. by Frank Barchard · 3 years, 4 months ago
  17. 8f6a1ed QS8 LD64 C4 dot product GEMM microkernel reduced remainder handling by Frank Barchard · 3 years, 4 months ago
  18. fd1dee7 QS8 C16 GEMM microkernel source renamed from mull to mlal by Frank Barchard · 3 years, 4 months ago
  19. a5e242c QS8 LD32 GEMM microkernel for big cores with dotproduct by Frank Barchard · 3 years, 4 months ago
  20. 01c341b C8 MLA Neon GEMM/IGEMM microkernels count k down from kc. by Frank Barchard · 3 years, 4 months ago
  21. 36f95cf QS8 Neon IGEMM C16 microkernel with two 8 bit multiplies and vpadal to accumulate. by Frank Barchard · 3 years, 4 months ago
  22. 71c4d1a QS8 Neon GEMM C16 microkernel with two 8 bit multiplies and vpadal to accumulate. by Frank Barchard · 3 years, 4 months ago
  23. 6d138db Remove scalar C4 QS8 and QU8 gemm microkernels. by Frank Barchard · 3 years, 4 months ago
  24. a0fe11d QS8 C8 Neon remove remainder handling code and rewind the A pointers by kc by Frank Barchard · 3 years, 4 months ago
  25. 32389c6 QS8 e2e benchmark for C2 neon microkernels by Frank Barchard · 3 years, 4 months ago
  26. aaafdc7 QS8 scalar gemm remove bias variables. by Frank Barchard · 3 years, 4 months ago
  27. fe14b85 Add space after casting by Frank Barchard · 3 years, 4 months ago
  28. 10f9f05 Remove 0 from ranges where not needed by Frank Barchard · 3 years, 4 months ago
  29. c8532ae Unroll KC loop to do MULL and then MLAL to 16 bit before lengthening to 32 bit. by Frank Barchard · 3 years, 5 months ago
  30. 7e1f371 QS8 GEMM for neon reorder with MR inner loop so mull and mlal to avoid dependency on destination. by Frank Barchard · 3 years, 5 months ago
  31. 8247e21 C2 QS8 microkernel using mull then mlal with KC loop of 16 by Frank Barchard · 3 years, 5 months ago
  32. 5899012 QS8 Neon GEMM C8 microkernel with 8 bit multiply and vpadal to accumulate. by Frank Barchard · 3 years, 5 months ago
  33. 2302ffd QS8 Neon GEMM microkernel with 8 bit multiply and vpadal to accumulate by Frank Barchard · 3 years, 5 months ago
  34. ec0bf14 QS8 GEMM and IGEMM 3x8 3x16 and IGEMM 4x8 and 4x16 by Frank Barchard · 3 years, 5 months ago
  35. 4ecae2e QS8 Neon GEMM microkernel with 8 bit multiply by Frank Barchard · 3 years, 5 months ago
  36. cfbc849 Add 4x8 and 4x16 qs8 gemm microkernels by Frank Barchard · 3 years, 5 months ago
  37. 146e999 Replace QS8 4x8 with 2x8 neon microkernel. Improves performance for aarch32. by Frank Barchard · 3 years, 9 months ago
  38. f2742c4 Cortex A55r1 QS8 GEMM microkernel by Frank Barchard · 3 years, 9 months ago
  39. 0797eb1 Rename QS8 assembly GEMM kernels to ld64 by Frank Barchard · 3 years, 9 months ago
  40. a463285 4x16 QS8 GEMM use 4 less registers, avoiding push/pop. by Frank Barchard · 3 years, 9 months ago
  41. 59df88b 4x16 QS8 GEMM defer params by Frank Barchard · 3 years, 9 months ago
  42. f1fd89e 1x16 QS8 GEMM AARCH64 assembly microkernel using dot product. by Frank Barchard · 3 years, 9 months ago
  43. a5237a5 Rename 4x16 GEMM dot product microkernel file name to allow for future variations. by Frank Barchard · 3 years, 9 months ago
  44. 31bb45b 4x16 QS8 GEMM AARCH64 assembly microkernel using dot product. by Frank Barchard · 3 years, 9 months ago
  45. 66ccf64 Rename QS8 generator templates by Marat Dukhan · 3 years, 9 months ago
  46. a48848f 4x8, 6x8 and 8x16 Neon dot product GEMM microkernels by Frank Barchard · 3 years, 9 months ago
  47. 2fa1745 6x16 QS8 GEMM for Neon dot product by Frank Barchard · 3 years, 9 months ago
  48. ef4ce31 Remove trailing whitespace by Marat Dukhan · 3 years, 10 months ago
  49. d4c8303 Enable NEON DOT QS8 [I]GEMM microkernels on ARM64 by Marat Dukhan · 3 years, 10 months ago
  50. 12c5777 Optimization: 2x partial unroll to load 8 contiguous bytes. by Benoit Jacob · 3 years, 10 months ago
  51. a964473 Add xnn_qs8_gemm_minmax_ukernel_${MR}x${NR}c4__neondot (ARMv8.2+dotprod). by Benoit Jacob · 3 years, 11 months ago
  52. 0af63ab Include polyfills for intrinsics in QS8 AVX512 GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 11 months ago
  53. bb00b1d AVX512 variants of QS8 GEMM and IGEMM microkernels by Marat Dukhan · 3 years, 11 months ago
  54. f124e88 Polyfill _mm_loadu_si32 and _mm_storeu_si32 intrinsics by Marat Dukhan · 3 years, 11 months ago
  55. 5b3af47 Re-generate QS8 and QU8 microkernels from templates by Marat Dukhan · 3 years, 11 months ago
  56. b33fc0e Add xnn_q{u,s}8_gemm_minmax_ukernel_MRxNRc4__scalar by Benoit Jacob · 3 years, 11 months ago
  57. 27203da WAsm SIMD versions of QS8 GEMM and IGEMM microkernels by Marat Dukhan · 3 years, 11 months ago
  58. 23848db Reoptimize x86 requantization by Marat Dukhan · 3 years, 11 months ago
  59. 40bbafe NEON variants of QS8 GEMM & IGEMM microkernels by Marat Dukhan · 3 years, 11 months ago
  60. 683fab3 XW (eXtended Weights) optimization for QS8 GEMM microkernel by Marat Dukhan · 3 years, 11 months ago
  61. e7edc80 Add 3x4c8 variants of SSE2/SSSE3/SSE4.1/XOP GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 11 months ago
  62. 1280952 AVX2 version of QS8 GEMM and IGEMM microkernels by Marat Dukhan · 3 years, 11 months ago
  63. 1566fee XOP versions of QS8 GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 11 months ago
  64. dee732b LD128 versions of QS8 GEMM SSE2/SSSE3/SSE4.1 microkernels by Marat Dukhan · 3 years, 11 months ago
  65. 14d3ce8 Add LD64 suffix in QS8 GEMM/IGEMM microkernels by Marat Dukhan · 3 years, 11 months ago
  66. 733d0be QS8 GEMM MRx4c8 SSE2/SSSE3/SSE4.1 microkernels by Marat Dukhan · 3 years, 11 months ago
  67. 595e170 QS8 GEMM microkernels and infrastructure by Marat Dukhan · 3 years, 11 months ago