Code generator for PLD and non-PLD versions of aarch32 4x8 Cortex-A75 kernel

The PLD and non-PLD versions of the Cortex-A75 kernel differ only in the
prefetch (PLD) instructions. Several CPUs prefer a prefetch, but so far
that is the only difference between the two versions.
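
A minimal sketch of the idea, not the actual generator: since the two
versions differ only in prefetch lines, one template can produce both.
The `@PLD ` marker, the `render` function, and the three-line template
are hypothetical names made up for illustration; the real generator and
kernel source may look quite different.

```python
# Hypothetical sketch: emit PLD and non-PLD kernel variants from one template.
# The template text and the '@PLD ' marker are illustrative assumptions,
# not the actual XNNPACK generator syntax or kernel source.

TEMPLATE = """\
VLD1.32 {d0}, [r3]!
@PLD PLD [r3, 128]
VMLA.F32 q8, q4, d0[0]
"""


def render(template: str, prefetch: bool) -> str:
    """Keep lines tagged with '@PLD ' only when prefetch is requested."""
    out = []
    for line in template.splitlines():
        if line.startswith("@PLD "):
            if prefetch:
                # Strip the marker and keep the prefetch instruction.
                out.append(line[len("@PLD "):])
        else:
            out.append(line)
    return "\n".join(out) + "\n"


pld_variant = render(TEMPLATE, prefetch=True)     # includes the PLD line
plain_variant = render(TEMPLATE, prefetch=False)  # PLD line dropped
```

With this shape, the two generated files stay identical except for the
prefetch lines, which is the property the change relies on.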

PiperOrigin-RevId: 285493217
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 959f210..c7dde34 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -1083,8 +1083,8 @@
 SET(XNNPACK_AARCH32_ASM_MICROKERNEL_SRCS
   src/q8-dwconv/up8x9-aarch32-neon.S
   src/f32-gemm/4x8-aarch32-neon-cortex-a53.S
-  src/f32-gemm/4x8-aarch32-neon-cortex-a75.S
-  src/f32-gemm/4x8-aarch32-neon-pld-cortex-a75.S
+  src/f32-gemm/gen/4x8-aarch32-neon-cortex-a75.S
+  src/f32-gemm/gen/4x8-aarch32-neon-pld-cortex-a75.S
   src/f32-gemm/4x8-aarch32-neon-ld64.S)
 
 SET(XNNPACK_AARCH64_ASM_MICROKERNEL_SRCS