XNNPACK Team | b455b12 | 2019-09-27 18:10:33 -0700 | [diff] [blame] | 1 | #!/bin/sh |
| 2 | # Copyright 2019 Google LLC |
| 3 | # |
| 4 | # This source code is licensed under the BSD-style license found in the |
| 5 | # LICENSE file in the root directory of this source tree. |
| 6 | |
| 7 | #################################### Scalar ################################### |
| 8 | ### Microkernels without unrolling |
Marat Dukhan | 40a672f | 2019-11-25 03:08:22 -0800 | [diff] [blame] | 9 | tools/xngen src/f32-spmm/scalar.c.in -D MR=1 -D NR=1 -D UNROLL=1 -o src/f32-spmm/gen/1x1-scalar.c |
| 10 | tools/xngen src/f32-spmm/scalar.c.in -D MR=2 -D NR=1 -D UNROLL=1 -o src/f32-spmm/gen/2x1-scalar.c |
| 11 | tools/xngen src/f32-spmm/scalar.c.in -D MR=4 -D NR=1 -D UNROLL=1 -o src/f32-spmm/gen/4x1-scalar.c |
| 12 | tools/xngen src/f32-spmm/scalar.c.in -D MR=8 -D NR=1 -D UNROLL=1 -o src/f32-spmm/gen/8x1-scalar.c |
| 13 | tools/xngen src/f32-spmm/scalar.c.in -D MR=8 -D NR=2 -D UNROLL=1 -o src/f32-spmm/gen/8x2-scalar.c |
| 14 | tools/xngen src/f32-spmm/scalar.c.in -D MR=8 -D NR=4 -D UNROLL=1 -o src/f32-spmm/gen/8x4-scalar.c |
XNNPACK Team | b455b12 | 2019-09-27 18:10:33 -0700 | [diff] [blame] | 15 | ### Microkernels with software pipelining |
Marat Dukhan | 40a672f | 2019-11-25 03:08:22 -0800 | [diff] [blame] | 16 | tools/xngen src/f32-spmm/scalar-pipelined.c.in -D MR=1 -D NR=1 -o src/f32-spmm/gen/1x1-scalar-pipelined.c |
| 17 | tools/xngen src/f32-spmm/scalar-pipelined.c.in -D MR=2 -D NR=1 -o src/f32-spmm/gen/2x1-scalar-pipelined.c |
| 18 | tools/xngen src/f32-spmm/scalar-pipelined.c.in -D MR=4 -D NR=1 -o src/f32-spmm/gen/4x1-scalar-pipelined.c |
| 19 | tools/xngen src/f32-spmm/scalar-pipelined.c.in -D MR=8 -D NR=1 -o src/f32-spmm/gen/8x1-scalar-pipelined.c |
XNNPACK Team | b455b12 | 2019-09-27 18:10:33 -0700 | [diff] [blame] | 20 | |
| 21 | ################################### ARM NEON ################################## |
| 22 | ### Microkernels without unrolling |
Marat Dukhan | 40a672f | 2019-11-25 03:08:22 -0800 | [diff] [blame] | 23 | tools/xngen src/f32-spmm/neon.c.in -D MR=4 -D NR=1 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/4x1-neonfma.c |
| 24 | tools/xngen src/f32-spmm/neon.c.in -D MR=8 -D NR=1 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/8x1-neonfma.c |
| 25 | tools/xngen src/f32-spmm/neon.c.in -D MR=12 -D NR=1 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/12x1-neonfma.c |
| 26 | tools/xngen src/f32-spmm/neon.c.in -D MR=16 -D NR=1 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/16x1-neonfma.c |
XNNPACK Team | b455b12 | 2019-09-27 18:10:33 -0700 | [diff] [blame] | 27 | ### Microkernels with 2X unrolling |
Marat Dukhan | 40a672f | 2019-11-25 03:08:22 -0800 | [diff] [blame] | 28 | tools/xngen src/f32-spmm/neon.c.in -D MR=4 -D NR=1 -D UNROLL=2 -D FMA=1 -o src/f32-spmm/gen/4x1-neonfma-unroll2.c |
| 29 | tools/xngen src/f32-spmm/neon.c.in -D MR=8 -D NR=1 -D UNROLL=2 -D FMA=1 -o src/f32-spmm/gen/8x1-neonfma-unroll2.c |
| 30 | tools/xngen src/f32-spmm/neon.c.in -D MR=16 -D NR=1 -D UNROLL=2 -D FMA=1 -o src/f32-spmm/gen/16x1-neonfma-unroll2.c |
XNNPACK Team | b455b12 | 2019-09-27 18:10:33 -0700 | [diff] [blame] | 31 | ### Microkernels for blocks of several output channels |
Marat Dukhan | 40a672f | 2019-11-25 03:08:22 -0800 | [diff] [blame] | 32 | tools/xngen src/f32-spmm/neon-blocked.c.in -D MR=4 -D NR=2 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/4x2-neonfma.c |
| 33 | tools/xngen src/f32-spmm/neon-blocked.c.in -D MR=8 -D NR=2 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/8x2-neonfma.c |
| 34 | tools/xngen src/f32-spmm/neon-blocked.c.in -D MR=12 -D NR=2 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/12x2-neonfma.c |
| 35 | tools/xngen src/f32-spmm/neon-blocked.c.in -D MR=16 -D NR=2 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/16x2-neonfma.c |
| 36 | tools/xngen src/f32-spmm/neon-blocked.c.in -D MR=4 -D NR=4 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/4x4-neonfma.c |
| 37 | tools/xngen src/f32-spmm/neon-blocked.c.in -D MR=8 -D NR=4 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/8x4-neonfma.c |
| 38 | tools/xngen src/f32-spmm/neon-blocked.c.in -D MR=12 -D NR=4 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/12x4-neonfma.c |
| 39 | tools/xngen src/f32-spmm/neon-blocked.c.in -D MR=16 -D NR=4 -D UNROLL=1 -D FMA=1 -o src/f32-spmm/gen/16x4-neonfma.c |
XNNPACK Team | b455b12 | 2019-09-27 18:10:33 -0700 | [diff] [blame] | 40 | ### Microkernels with software pipelining |
Marat Dukhan | 40a672f | 2019-11-25 03:08:22 -0800 | [diff] [blame] | 41 | tools/xngen src/f32-spmm/neon-pipelined.c.in -D MR=4 -D NR=1 -D FMA=1 -o src/f32-spmm/gen/4x1-neonfma-pipelined.c |
| 42 | tools/xngen src/f32-spmm/neon-pipelined.c.in -D MR=8 -D NR=1 -D FMA=1 -o src/f32-spmm/gen/8x1-neonfma-pipelined.c |
| 43 | tools/xngen src/f32-spmm/neon-pipelined.c.in -D MR=16 -D NR=1 -D FMA=1 -o src/f32-spmm/gen/16x1-neonfma-pipelined.c |
XNNPACK Team | b455b12 | 2019-09-27 18:10:33 -0700 | [diff] [blame] | 44 | |
| 45 | ################################### x86 SSE ################################### |
| 46 | ### Microkernels without unrolling |
Marat Dukhan | 40a672f | 2019-11-25 03:08:22 -0800 | [diff] [blame] | 47 | tools/xngen src/f32-spmm/sse.c.in -D MR=4 -D NR=1 -D UNROLL=1 -o src/f32-spmm/gen/4x1-sse.c |
| 48 | tools/xngen src/f32-spmm/sse.c.in -D MR=8 -D NR=1 -D UNROLL=1 -o src/f32-spmm/gen/8x1-sse.c |
XNNPACK Team | b455b12 | 2019-09-27 18:10:33 -0700 | [diff] [blame] | 49 | |
| 50 | ################################## Unit tests ################################# |
| 51 | tools/generate-spmm-test.py --spec test/f32-spmm.yaml --output test/f32-spmm.cc |