1. 7be427a Disable MSan and TSan in most microkernels with Out-of-Bounds reads by Marat Dukhan · 3 years ago
  2. 0bf8afa Leverage f32x4.pmin and f32x4.pmax WAsm SIMD instructions by Marat Dukhan · 3 years, 2 months ago
  3. b7a7c30 NEON GEMM/IGEMM microkernels change store/dup to 2 of each by Frank Barchard · 3 years, 3 months ago
  4. 4810905 Leverage v128.const WAsm SIMD instruction by Marat Dukhan · 3 years, 3 months ago
  5. 2dac7bb Unify on wasm_f64x2_splat(0.0) to materialize zero SIMD vector in WAsm by Marat Dukhan · 3 years, 4 months ago
  6. ee029b2 Replace deprecated wasm_simd128.h intrinsics with new versions by Marat Dukhan · 3 years, 5 months ago
  7. a03020a Run generator scripts; Sort names in BUILD files by Frank Barchard · 3 years, 5 months ago
  8. 79cd5f9 FP32 LD128 IGEMM for Cortex X1 by Frank Barchard · 3 years, 5 months ago
  9. 167d667 Comment change x8 is a temporary params pointer by Frank Barchard · 3 years, 5 months ago
  10. 143a110 Rename GEMM/IGEMM microkernels from Cortex-A57/A75 to prfm_cortex_a75 by Frank Barchard · 3 years, 5 months ago
  11. e349124 fp32 IGEMM 4x8 and 6x8 ld64 microkernels by Frank Barchard · 3 years, 5 months ago
  12. 7c9f1f9 Replace // with # for lines that only contain a comment. by Frank Barchard · 3 years, 6 months ago
  13. c08221f Apply text format to assembly for consistency by Frank Barchard · 3 years, 6 months ago
  14. 104ae5e Use ISA-specific layouts in F32 [I]GEMM & DWCONV microkernels by Marat Dukhan · 3 years, 6 months ago
  15. 76f43f0 Apply consistent formatting to assembly by Frank Barchard · 3 years, 6 months ago
  16. cbfa338 text format white space of prefetch instruction on ARM microkernels by Frank Barchard · 3 years, 7 months ago
  17. 802fcae Additional SSE/SSE2 GEMM/IGEMM microkernels by Marat Dukhan · 4 years ago
  18. 0725b8d Rename WebAssembly SIMD source files and functions with x86 or arm suffix after wasmsimd by Frank Barchard · 4 years ago
  19. 3b26206 Renumber labels in assembly sequentially by Frank Barchard · 4 years, 2 months ago
  20. 115d3e2 Remove PSIMD variants of GEMM and IGEMM microkernels by Marat Dukhan · 4 years, 4 months ago
  21. 490febe Cortex A7 microkernel based on LD64 with PLD added. 3.2% faster in end to end mobilenet v2 by Frank Barchard · 4 years, 4 months ago
  22. 688f6d8 Unify x86 and ARM flavors of WAsm SIMD GEMM/IGEMM/DWCONV with RELU by Marat Dukhan · 4 years, 4 months ago
  23. e39e646 WAsm SIMD versions of [I]GEMM microkernels with NR=2 by Marat Dukhan · 4 years, 4 months ago
  24. efc1014 ld64 aarch32 GEMM 4x8 microkernel do all loads before MLA by Frank Barchard · 4 years, 4 months ago
  25. d6ca9d8 4x8-minmax-aarch32-neon-pld-cortex-a75 Fix prefetch offset to not skip a cache line by Frank Barchard · 4 years, 5 months ago
  26. 569561d Generate PLD variation of AARCH32 LD64 by Frank Barchard · 4 years, 5 months ago
  27. 802808c GEMM/IGEMM microkernels with alternative activations in WAsm SIMD by Marat Dukhan · 4 years, 5 months ago
  28. ac014d7 DWCONV microkernels in WAsm SIMD intrinsics by Marat Dukhan · 4 years, 5 months ago
  29. 1bbf96b GEMM/IGEMM implementations in WAsm SIMD intrinsics by Marat Dukhan · 4 years, 5 months ago
  30. 016e586 iOS use Cortex-A75 microkernel which avoids x18 register by Frank Barchard · 4 years, 5 months ago
  31. 6724218 Avoid x18 register by Frank Barchard · 4 years, 5 months ago
  32. 909564c Update comment for x18 register by Frank Barchard · 4 years, 5 months ago
  33. b2217dd Disable tsan for micro-kernels which read out-of-bounds by Marat Dukhan · 4 years, 6 months ago
  34. 467f636 Fused [I]GEMM+RELU micro-kernels by Marat Dukhan · 4 years, 6 months ago
  35. 737a3a1 IGEMM 4x8 a53/a55 - Replace STP x19, x19 with STR by Frank Barchard · 4 years, 6 months ago
  36. b339045 Comment change rename clamp params to params by Frank Barchard · 4 years, 6 months ago
  37. c4668ed Comment fix for mr <= 4 by Frank Barchard · 4 years, 7 months ago
  38. f196d01 Support CMake build with MSVC by Marat Dukhan · 4 years, 7 months ago
  39. 163a7e6 Scalar & WAsm GEMM/IGEMM/DWCONV micro-kernels without activation by Marat Dukhan · 4 years, 7 months ago
  40. de06f49 Add MINMAX suffix to GEMM/IGEMM/DWCONV/PPMM micro-kernel names by Marat Dukhan · 4 years, 7 months ago
  41. 1c58711 Add MINMAX suffix to filenames of GEMM/IGEMM/PPMM/DWCONV micro-kernels by Marat Dukhan · 4 years, 8 months ago
  42. eb09a6b Rename F32/U8 output params to minmax params by Marat Dukhan · 4 years, 8 months ago
  43. a51cf48 Unify layout of min/max parameters by Marat Dukhan · 4 years, 8 months ago
  44. 0d1052c iOS 6x8 microkernel based on Cortex-A75 but with X18 avoided. by Frank Barchard · 4 years, 8 months ago
  45. 6f8c966 Use x13 instead of x18. by Frank Barchard · 4 years, 8 months ago
  46. 8fb9055 4x8 GEMM and IGEMM microkernels for Cortex A55. 7.8% faster for e2e mobile net v2. by Frank Barchard · 4 years, 8 months ago
  47. 36053aa 4x8 AARCH32 GEMM/IGEMM avoid r2 push/pop. by Frank Barchard · 4 years, 8 months ago
  48. b7dd29e 4x8 GEMM and IGEMM microkernels for AARCH32 Cortex A55. 11.5% faster end to end: by Frank Barchard · 4 years, 8 months ago
  49. f32ae34 Unify the value of $ABC variable across all templates by Marat Dukhan · 4 years, 8 months ago
  50. 91e1999 6x8 GEMM and IGEMM microkernels for Cortex A55. 9% faster end to end: by Frank Barchard · 4 years, 9 months ago
  51. 52261a8 IGEMM for Cortex-A75 aarch32 pad stack so vector push is aligned. by Frank Barchard · 4 years, 9 months ago
  52. 16d7272 ks loop use B.HI instead of B.NE to avoid bugs causing infinite loop. by Frank Barchard · 4 years, 9 months ago
  53. c87a8fd Cortex A53 IGEMM 32 bit ARM by Frank Barchard · 4 years, 9 months ago
  54. c1a0697 Replace load with mov for ks in xnn_f32_igemm_ukernel_4x8__aarch32_neon_cortex_a75 by Frank Barchard · 4 years, 9 months ago
  55. 9b499d6 Load parameters in order of usage. by Frank Barchard · 4 years, 9 months ago
  56. 8155854 Direct branch to source remainder handler for GEMM/IGEMM. by Frank Barchard · 4 years, 9 months ago
  57. 79ade18 LD64 microkernels branch directly to remainder if less than 2 channels. by Frank Barchard · 4 years, 9 months ago
  58. fd262e1 IGEMM 4x8 LD64 pad stack with 4 bytes to align to 8 bytes by Frank Barchard · 4 years, 9 months ago
  59. 90ce789 Cortex A75 IGEMM 32 bit ARM. by Frank Barchard · 4 years, 9 months ago
  60. dc38f07 LD64 IGEMM 32 bit ARM by Frank Barchard · 4 years, 9 months ago
  61. 534375d A53 GEMM / IGEMM kernel prefetches adjust by 1 by Frank Barchard · 4 years, 10 months ago
  62. c03b2bd 4x12 A53 GEMM and IGEMM use X8 for temp GPR by Frank Barchard · 4 years, 10 months ago
  63. 7693acf 4x8 Cortex-A53 GEMM / IGEMM use 1 GPR instead of 2. by Frank Barchard · 4 years, 10 months ago
  64. f884a7b 6X8 Cortex-A53 GEMM use 1 GPR instead of 2. by Frank Barchard · 4 years, 10 months ago
  65. cbb35d0 4x12 IGEMM use prefetch on A and B by Frank Barchard · 4 years, 10 months ago
  66. 3216758 6X8 Cortex-A53 IGEMM use 1 GPR instead of 2. by Frank Barchard · 4 years, 10 months ago
  67. 387c2d1 Generate A57 micro-kernels from A75 source. by Frank Barchard · 5 years ago
  68. c659140 a73 kernel move SUBS before clamp and add NOP before branch by Frank Barchard · 5 years ago
  69. d94b856 Rename strided gemm and igemm fma3 broadcasts. by Ashkan Aliabadi · 5 years ago
  70. 2712132 FMA3 microkernels with 4-wide shuffle by Marat Dukhan · 5 years ago
  71. eccfd71 NR=16 GEMM and IGEMM micro-kernels in AVX and FMA3 implementations by Marat Dukhan · 5 years ago
  72. cfb3134 Polyfill missing _cvtu32_mask16 intrinsic on old gcc by Marat Dukhan · 5 years ago
  73. 6383f49 Assembly GEMM kernel NC loop use SUBS instead of CMP+SUBS by Frank Barchard · 5 years ago
  74. 436ebe6 Separate WAsm micro-kernels and scalar micro-kernels by Marat Dukhan · 5 years ago
  75. 0f349c4 AVX512F implementation of GEMM & IGEMM micro-kernels by Marat Dukhan · 5 years ago
  76. c72fa1e Use XNN_ARCH_* macros for architecture-specific parts in micro-kernels by Marat Dukhan · 5 years ago
  77. 69172d9 6x8 ld128 GEMM microkernels by Frank Barchard · 5 years ago
  78. 40a672f Move generated micro-kernels into a subdirectory by Marat Dukhan · 5 years ago
  79. 5243bb0 DUP Neon GEMM kernels for Exynos by Frank Barchard · 5 years ago
  80. 91317c5 Rename neon intrinsics to lane. by Frank Barchard · 5 years ago
  81. fda12b8 AVX and FMA3 microkernels for GEMM/GEMMINC/IGEMM by Marat Dukhan · 5 years ago
  82. 5480997 Replace IDLETTERS with ABC by Frank Barchard · 5 years ago
  83. df06d80 Neon shuffle GEMM and IGEMM kernels. by Frank Barchard · 5 years ago
  84. 7ccaab6 IGEMM kernels add asserts for a, c, and w pointers. by Frank Barchard · 5 years ago
  85. 80b537a 6x8 IGEMM for Cortex A53 pipelined. by Frank Barchard · 5 years ago
  86. 7c8e0c7 4x8 IGEMM for Cortex-A53 pipelined by Frank Barchard · 5 years ago
  87. 684bbb0 CMP 2 instructions earlier in A/C clamping. by Frank Barchard · 5 years ago
  88. 9efaed7 A53 GEMM and IGEMM pipelined kernels prefetch C in epilogue by Frank Barchard · 5 years ago
  89. 5abe43c ST1 post increment for Cortex A53 GEMM/IGEMM microkernels by Frank Barchard · 5 years ago
  90. bd41971 A57 branch a version of A53 kernel by Frank Barchard · 5 years ago
  91. 64a5bfe A53 6x8 IGEMM kernel prefetch by Frank Barchard · 5 years ago
  92. ae777b4 4x8 a53 eliminate pushes to stack by Frank Barchard · 5 years ago
  93. b3c6c6e 6x8 A53 remove pushes for NEON by Frank Barchard · 5 years ago
  94. 46fb807 4x8 A53 GEMM, and GEMMINC unpipelined microkernels. by Frank Barchard · 5 years ago
  95. a7fb855 6x8 A53 GEMM, GEMMINC and IGEMM unpipelined microkernels. by Frank Barchard · 5 years ago
  96. fcfdc0e Automated g4 rollback of changelist 274728310. by Frank Barchard · 5 years ago
  97. 8e3c551 1x8 a53 kernel refactor based on a57. by Frank Barchard · 5 years ago
  98. baa9ead Update assembly Copyright notice to // comment by Frank Barchard · 5 years ago
  99. 459c9fc 6x8 and a53 kernel comments. by Frank Barchard · 5 years ago
  100. a5ca10e Neon intrinsics clamping - Replace 2 LD1R with 1 LD2R by Frank Barchard · 5 years ago