Port F32 GEMM A75 1x8 microkernel to JIT, specialize for min/max, and add tests and benchmarks
Implement ld1r for aarch64 assembler
PiperOrigin-RevId: 426260122
diff --git a/CMakeLists.txt b/CMakeLists.txt
index b0b785e..2783a4d 100755
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -284,6 +284,7 @@
src/qs8-igemm/4x8c4-rndnu-aarch32-neondot-ld64.cc)
SET(JIT_AARCH64_SRCS
+ src/f32-gemm/1x8-aarch64-neonfma-cortex-a75.cc
src/f32-gemm/6x8-aarch64-neonfma-cortex-a75.cc
src/f32-igemm/6x8-aarch64-neonfma-cortex-a75.cc)