arm_compute v18.01
Change-Id: I9bfa178c2e38bfd5fc812e62aab6760d87748e05
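
Note on the `--filter` change in docs/02_tests.dox: the benchmark example now uses `^NEON/.*AlexNet` instead of `NEON/.*AlexNet`. The sketch below illustrates how the two patterns can select different tests. It is only an illustration: the test names are hypothetical, and it assumes the framework looks for the pattern anywhere in the test name (search semantics), which this change does not state explicitly.

```python
import re

# Hypothetical test names, for illustration only; real names come from the test suites.
names = [
    "NEON/AlexNet/ConvolutionLayer",
    "CL/AlexNet/ConvolutionLayer",
    "CL/Fallback/NEON/AlexNet",
]


def select(pattern, candidates):
    """Return the candidates in which the pattern is found anywhere (assumed search semantics)."""
    rx = re.compile(pattern)
    return [name for name in candidates if rx.search(name)]


# Unanchored: the pattern may match in the middle of a name.
unanchored = select(r"NEON/.*AlexNet", names)

# Anchored: only names that begin with "NEON/" can match, so a mismatch
# can be rejected at the first character.
anchored = select(r"^NEON/.*AlexNet", names)

print(unanchored)
print(anchored)
```

Under these assumptions the anchored pattern is also the stricter one, since it does not pick up names that merely contain `NEON/` further along.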
diff --git a/docs/00_introduction.dox b/docs/00_introduction.dox
index 4c6b8f3..fa6c227 100644
--- a/docs/00_introduction.dox
+++ b/docs/00_introduction.dox
@@ -189,6 +189,34 @@
@subsection S2_2_changelog Changelog
+v18.01 Public maintenance release
+ - Various bug fixes
+ - Added some of the missing validate() methods
+ - Added @ref arm_compute::CLDeconvolutionLayerUpsampleKernel / @ref arm_compute::CLDeconvolutionLayer / @ref arm_compute::CLDeconvolutionLayerUpsample
+ - Added @ref arm_compute::CLPermuteKernel / @ref arm_compute::CLPermute
+ - Added method to clean the programs cache in the CL Kernel library.
+ - Added @ref arm_compute::GCArithmeticAdditionKernel / @ref arm_compute::GCArithmeticAddition
+ - Added @ref arm_compute::GCDepthwiseConvolutionLayer3x3Kernel / @ref arm_compute::GCDepthwiseConvolutionLayer3x3
+ - Added @ref arm_compute::GCNormalizePlanarYUVLayerKernel / @ref arm_compute::GCNormalizePlanarYUVLayer
+ - Added @ref arm_compute::GCScaleKernel / @ref arm_compute::GCScale
+ - Added @ref arm_compute::GCWeightsReshapeKernel / @ref arm_compute::GCConvolutionLayer
+ - Added FP16 support to the following GLES compute kernels:
+    - @ref arm_compute::GCCol2ImKernel
+    - @ref arm_compute::GCGEMMInterleave4x4Kernel
+    - @ref arm_compute::GCGEMMTranspose1xWKernel
+    - @ref arm_compute::GCIm2ColKernel
+ - Refactored NEON Winograd (@ref arm_compute::NEWinogradLayerKernel)
+ - Added @ref arm_compute::NEDirectConvolutionLayerOutputStageKernel
+ - Added QASYMM8 support to the following NEON kernels:
+    - @ref arm_compute::NEDepthwiseConvolutionLayer3x3Kernel
+    - @ref arm_compute::NEFillBorderKernel
+    - @ref arm_compute::NEPoolingLayerKernel
+ - Added new examples:
+    - graph_cl_mobilenet_qasymm8.cpp
+    - graph_inception_v3.cpp
+    - gc_dc.cpp
+ - More tests added to both validation and benchmarking suites.
+
v17.12 Public major release
- Most machine learning functions on OpenCL support the new data type QASYMM8
- Introduced logging interface
@@ -444,8 +472,8 @@
actual: False
embed_kernels: Embed OpenCL kernels and OpenGL ES compute shader in library binary (yes|no)
- default: False
- actual: False
+ default: True
+ actual: True
set_soname: Set the library's soname and shlibversion (requires SCons 2.4 or above) (yes|no)
default: False
@@ -733,6 +761,7 @@
aarch64-linux-android-clang++ examples/cl_convolution.cpp utils/Utils.cpp -I. -Iinclude -std=c++11 -larm_compute-static -larm_compute_core-static -L. -o cl_convolution_aarch64 -static-libstdc++ -pie -lOpenCL -DARM_COMPUTE_CL
To cross compile a GLES example:
+
#32 bit:
arm-linux-androideabi-clang++ examples/gc_absdiff.cpp utils/Utils.cpp -I. -Iinclude -std=c++11 -larm_compute-static -larm_compute_core-static -L. -o gc_absdiff_arm -static-libstdc++ -pie -DARM_COMPUTE_GC
#64 bit:
diff --git a/docs/02_tests.dox b/docs/02_tests.dox
index 1accf00..188f938 100644
--- a/docs/02_tests.dox
+++ b/docs/02_tests.dox
@@ -18,6 +18,10 @@
tests/fixtures cannot be parameterized based on the data type if static type
information is needed within the test (e.g. to validate the results).
+@note Tests are not built by default. To enable them, add validation_tests=1 and/or benchmark_tests=1 to your SCons command line.
+
+@note Tests are not included in the pre-built binary archive; you have to build them from source.
+
@subsection tests_overview_structure Directory structure
.
@@ -311,7 +315,9 @@
If only a subset of the tests has to be executed the `--filter` option takes a
regular expression to select matching tests.
- ./arm_compute_benchmark --filter='NEON/.*AlexNet' ./data
+ ./arm_compute_benchmark --filter='^NEON/.*AlexNet' ./data
+
+@note Filtering is much faster if the regular expression is anchored to the start ("^") or end ("$") of the line.
Additionally each test has a test id which can be used as a filter, too.
However, the test id is not guaranteed to be stable when new tests are added.
@@ -348,12 +354,29 @@
`MALI` will try to collect Mali hardware performance counters. (You need to have a recent enough Mali driver)
-`WALL_CLOCK` will measure time using `gettimeofday`: this should work on all platforms.
+`WALL_CLOCK_TIMER` will measure time using `gettimeofday`: this should work on all platforms.
-You can pass a combinations of these instruments: `--instruments=PMU,MALI,WALL_CLOCK`
+You can pass a combination of these instruments: `--instruments=PMU,MALI,WALL_CLOCK_TIMER`
@note You need to make sure the instruments have been selected at compile time using the `pmu=1` or `mali=1` scons options.
+@subsubsection tests_running_examples Examples
+
+To run all the precommit validation tests:
+
+ LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit
+
+To run the OpenCL precommit validation tests:
+
+ LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit --filter="^CL.*"
+
+To run the NEON precommit benchmark tests with the PMU and wall clock timer (in milliseconds) instruments enabled:
+
+ LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^NEON.*" --instruments="pmu,wall_clock_timer_ms" --iterations=10
+
+To run the OpenCL precommit benchmark tests with the OpenCL kernel timers (in milliseconds) enabled:
+
+ LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^CL.*" --instruments="opencl_timer_ms" --iterations=10
*/
} // namespace test
} // namespace arm_compute
diff --git a/docs/Doxyfile b/docs/Doxyfile
index 796c10b..5a20d2e 100644
--- a/docs/Doxyfile
+++ b/docs/Doxyfile
@@ -38,7 +38,7 @@
# could be handy for archiving the generated documentation or if some version
# control system is used.
-PROJECT_NUMBER = 17.12
+PROJECT_NUMBER = 18.01
# Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer a
@@ -855,7 +855,8 @@
# Note that relative paths are relative to the directory from which doxygen is
# run.
-EXCLUDE =
+EXCLUDE = ./arm_compute/core/NEON/kernels/assembly/ \
+ ./arm_compute/core/NEON/kernels/winograd/
# The EXCLUDE_SYMLINKS tag can be used to select whether or not files or
# directories that are symbolic links (a Unix file system feature) are excluded