Android NDK & ARM NEON instruction set extension support
--------------------------------------------------------

Introduction:
-------------

Android NDK r3 added support for the new 'armeabi-v7a' ARM-based ABI
that allows native code to use two useful instruction set extensions:

- Thumb-2, which provides performance comparable to 32-bit ARM
  instructions with similar compactness to Thumb-1.

- VFPv3, which provides hardware FPU registers and computations,
  to boost floating point performance significantly.

  More specifically, by default 'armeabi-v7a' only supports
  VFPv3-D16 which only uses/requires 16 hardware FPU 64-bit registers.

More information about this can be found in docs/CPU-ARCH-ABIS.TXT.

The ARMv7 Architecture Reference Manual also defines another optional
instruction set extension known as "ARM Advanced SIMD", nicknamed
"NEON". It provides:

- A set of interesting scalar/vector instructions and registers
  (the latter are mapped to the same chip area as the FPU ones),
  comparable to MMX/SSE/3DNow! in the x86 world.

- VFPv3-D32 as a requirement (i.e. 32 hardware FPU 64-bit registers,
  instead of the minimum of 16).

Not all ARMv7-based Android devices will support NEON, but those that
do may benefit in significant ways from the scalar/vector instructions.

The NDK supports the compilation of modules or even specific source
files with support for NEON. What this means is that a specific compiler
flag will be used to enable the use of GCC ARM NEON intrinsics and
VFPv3-D32 at the same time. The intrinsics are described here:

    http://gcc.gnu.org/onlinedocs/gcc/ARM-NEON-Intrinsics.html
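
As an illustration of what intrinsics-based code can look like, here is a
minimal sketch that adds two float arrays. The function name 'add_floats'
is hypothetical, and the sketch assumes the compiler defines __ARM_NEON__
(as GCC does) when NEON is enabled for the file. Without that macro only
the scalar loop is compiled, so the same source builds for any target.

```c
#include <stddef.h>

#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Adds two float arrays element by element: dst[i] = a[i] + b[i].
 * 'add_floats' is a hypothetical name chosen for this sketch. */
void add_floats(float *dst, const float *a, const float *b, size_t count)
{
    size_t i = 0;

#if defined(__ARM_NEON__) || defined(__ARM_NEON)
    /* NEON path: load, add and store four floats per iteration. */
    for (; i + 4 <= count; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif

    /* Scalar path: handles the remaining elements, or every element
     * when the file is built without NEON support. */
    for (; i < count; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

Note that the compile-time macro only says how the file was built. It does
not tell you whether the device running the code supports NEON.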


LOCAL_ARM_NEON:
---------------

Define LOCAL_ARM_NEON to 'true' in your module definition, and the NDK
will build all its source files with NEON support. This can be useful if
you want to build a static or shared library that specifically contains
NEON code paths.


Using the .neon suffix:
-----------------------

When listing source files in your LOCAL_SRC_FILES variable, you now have
the option of using the .neon suffix to indicate that you want the
corresponding source(s) to be built with NEON support. For example:

    LOCAL_SRC_FILES := foo.c.neon bar.c

will only build 'foo.c' with NEON support.

Note that the .neon suffix can be used with the .arm suffix too (used to
specify the 32-bit ARM instruction set for non-NEON instructions), but must
appear after it.

In other words, 'foo.c.arm.neon' works, but 'foo.c.neon.arm' does NOT.


Build Requirements:
-------------------

NEON support only works when targeting the 'armeabi-v7a' ABI; otherwise the
NDK build scripts will complain and abort. It is important to use checks like
the following in your Android.mk:

    # define a static library containing our NEON code
    ifeq ($(TARGET_ARCH_ABI),armeabi-v7a)
       include $(CLEAR_VARS)
       LOCAL_MODULE    := mylib-neon
       LOCAL_SRC_FILES := mylib-neon.c
       LOCAL_ARM_NEON  := true
       include $(BUILD_STATIC_LIBRARY)
    endif # TARGET_ARCH_ABI == armeabi-v7a


Runtime Detection:
------------------

As said previously, NOT ALL ARMv7-BASED ANDROID DEVICES WILL SUPPORT NEON!
It is thus crucial to perform runtime detection to know if the NEON-capable
machine code can be run on the target device.

To do that, use the 'cpufeatures' library that comes with this NDK. To learn
more about it, see docs/CPU-FEATURES.TXT.

You should explicitly check that android_getCpuFamily() returns
ANDROID_CPU_FAMILY_ARM, and that android_getCpuFeatures() returns a value
that has the ANDROID_CPU_ARM_FEATURE_NEON flag set, as in:

    #include <cpu-features.h>

    ...
    ...

    if (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
        (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0)
    {
        // use NEON-optimized routines
        ...
    }
    else
    {
        // use non-NEON fallback routines instead
        ...
    }

    ...
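
One possible way to organize this check is to select an implementation
once through a function pointer. The sketch below is only an illustration:
'dot_fn', 'dot_c', 'device_has_neon' and 'select_dot' are hypothetical
names, and the NEON routine is assumed to live in a separate source file
built with NEON support. When not compiled for Android, the sketch simply
reports that NEON is unavailable so that it can be built on a host machine.

```c
#include <stddef.h>

#if defined(__ANDROID__)
#include <cpu-features.h>   /* from the NDK 'cpufeatures' library */
#endif

/* Signature shared by both implementations (hypothetical). */
typedef float (*dot_fn)(const float *a, const float *b, size_t count);

/* Portable fallback routine, safe on every device. */
static float dot_c(const float *a, const float *b, size_t count)
{
    float sum = 0.0f;
    size_t i;
    for (i = 0; i < count; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Returns non-zero when the device reports NEON support.
 * Assumption: outside of Android builds this sketch returns 0. */
static int device_has_neon(void)
{
#if defined(__ANDROID__)
    return android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
           (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0;
#else
    return 0;
#endif
}

/* Picks the routine to call. 'neon_impl' is the NEON-optimized version
 * from a file built with NEON support, or NULL if there is none. */
dot_fn select_dot(dot_fn neon_impl)
{
    if (neon_impl != NULL && device_has_neon())
        return neon_impl;
    return dot_c;
}
```

A caller would typically run select_dot() once at startup, keep the
returned pointer, and call through it afterwards, so that NEON machine
code is never reached on a device that lacks it.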

Sample code:
------------

Look at the source code for the "hello-neon" sample in this NDK for an example
of how to use the 'cpufeatures' library and NEON intrinsics at the same time.

It implements a tiny benchmark for a FIR filter loop using a C version, and
a NEON-optimized one for devices that support it.
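
To give an idea of what such a pair of routines can look like, here is a
minimal sketch of a FIR-style filter written as a sliding dot product, in
plain C and with NEON intrinsics. This is not the code of the sample: the
names 'fir_c' and 'fir_neon', the use of float data, and the reliance on
the __ARM_NEON__ macro are all assumptions made for this illustration.

```c
#include <stddef.h>

#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Plain C version: out[n] = sum over k of kernel[k] * in[n + k].
 * 'in' must hold at least out_count + taps - 1 samples. */
void fir_c(float *out, const float *in, const float *kernel,
           size_t out_count, size_t taps)
{
    size_t n, k;
    for (n = 0; n < out_count; n++) {
        float sum = 0.0f;
        for (k = 0; k < taps; k++)
            sum += kernel[k] * in[n + k];
        out[n] = sum;
    }
}

#if defined(__ARM_NEON__) || defined(__ARM_NEON)
/* Same filter with NEON intrinsics; compiled only with NEON enabled. */
void fir_neon(float *out, const float *in, const float *kernel,
              size_t out_count, size_t taps)
{
    size_t n, k;
    for (n = 0; n < out_count; n++) {
        float32x4_t acc = vdupq_n_f32(0.0f);
        float32x2_t half;
        float sum;

        /* Multiply-accumulate four taps per iteration. */
        for (k = 0; k + 4 <= taps; k += 4)
            acc = vmlaq_f32(acc, vld1q_f32(kernel + k),
                            vld1q_f32(in + n + k));

        /* Add the four lanes of the accumulator together. */
        half = vadd_f32(vget_low_f32(acc), vget_high_f32(acc));
        half = vpadd_f32(half, half);
        sum = vget_lane_f32(half, 0);

        /* Remaining taps when 'taps' is not a multiple of four. */
        for (; k < taps; k++)
            sum += kernel[k] * in[n + k];
        out[n] = sum;
    }
}
#endif
```

Because the NEON version adds the products in a different order, its
floating point results may differ slightly from those of the C version.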