arm_compute v18.01

Change-Id: I9bfa178c2e38bfd5fc812e62aab6760d87748e05
diff --git a/docs/00_introduction.dox b/docs/00_introduction.dox
index 4c6b8f3..fa6c227 100644
--- a/docs/00_introduction.dox
+++ b/docs/00_introduction.dox
@@ -189,6 +189,34 @@
 
 @subsection S2_2_changelog Changelog
 
+v18.01 Public maintenance release
+ - Various bug fixes
+ - Added some of the missing validate() methods
+ - Added @ref arm_compute::CLDeconvolutionLayerUpsampleKernel / @ref arm_compute::CLDeconvolutionLayer / @ref arm_compute::CLDeconvolutionLayerUpsample
+ - Added @ref arm_compute::CLPermuteKernel / @ref arm_compute::CLPermute
+ - Added method to clean the programs cache in the CL Kernel library.
+ - Added @ref arm_compute::GCArithmeticAdditionKernel / @ref arm_compute::GCArithmeticAddition
+ - Added @ref arm_compute::GCDepthwiseConvolutionLayer3x3Kernel / @ref arm_compute::GCDepthwiseConvolutionLayer3x3
+ - Added @ref arm_compute::GCNormalizePlanarYUVLayerKernel / @ref arm_compute::GCNormalizePlanarYUVLayer
+ - Added @ref arm_compute::GCScaleKernel / @ref arm_compute::GCScale
+ - Added @ref arm_compute::GCWeightsReshapeKernel / @ref arm_compute::GCConvolutionLayer
+ - Added FP16 support to the following GLES compute kernels:
+    - @ref arm_compute::GCCol2ImKernel
+    - @ref arm_compute::GCGEMMInterleave4x4Kernel
+    - @ref arm_compute::GCGEMMTranspose1xWKernel
+    - @ref arm_compute::GCIm2ColKernel
+ - Refactored NEON Winograd (@ref arm_compute::NEWinogradLayerKernel)
+ - Added @ref arm_compute::NEDirectConvolutionLayerOutputStageKernel
+ - Added QASYMM8 support to the following NEON kernels:
+    - @ref arm_compute::NEDepthwiseConvolutionLayer3x3Kernel
+    - @ref arm_compute::NEFillBorderKernel
+    - @ref arm_compute::NEPoolingLayerKernel
+ - Added new examples:
+    - graph_cl_mobilenet_qasymm8.cpp
+    - graph_inception_v3.cpp
+    - gc_dc.cpp
+ - More tests added to both validation and benchmarking suites.
+
 v17.12 Public major release
  - Most machine learning functions on OpenCL support the new data type QASYMM8
  - Introduced logging interface
@@ -444,8 +472,8 @@
 		actual: False
 
 	embed_kernels: Embed OpenCL kernels and OpenGL ES compute shader in library binary (yes|no)
-		default: False
-		actual: False
+		default: True
+		actual: True
 
 	set_soname: Set the library's soname and shlibversion (requires SCons 2.4 or above) (yes|no)
 		default: False
@@ -733,6 +761,7 @@
 	aarch64-linux-android-clang++ examples/cl_convolution.cpp utils/Utils.cpp -I. -Iinclude -std=c++11 -larm_compute-static -larm_compute_core-static -L. -o cl_convolution_aarch64 -static-libstdc++ -pie -lOpenCL -DARM_COMPUTE_CL
 
 To cross compile a GLES example:
+
 	#32 bit:
 	arm-linux-androideabi-clang++ examples/gc_absdiff.cpp utils/Utils.cpp -I. -Iinclude -std=c++11 -larm_compute-static -larm_compute_core-static -L. -o gc_absdiff_arm -static-libstdc++ -pie -DARM_COMPUTE_GC
 	#64 bit:
diff --git a/docs/02_tests.dox b/docs/02_tests.dox
index 1accf00..188f938 100644
--- a/docs/02_tests.dox
+++ b/docs/02_tests.dox
@@ -18,6 +18,10 @@
 tests/fixtures cannot be parameterized based on the data type if static type
 information is needed within the test (e.g. to validate the results).
 
+@note By default, tests are not built. To enable them, add validation_tests=1 and/or benchmark_tests=1 to your SCons command line.
+
+@note Tests are not included in the pre-built binary archive; you have to build them from source.
+
 @subsection tests_overview_structure Directory structure
 
     .
@@ -311,7 +315,9 @@
 If only a subset of the tests has to be executed the `--filter` option takes a
 regular expression to select matching tests.
 
-    ./arm_compute_benchmark --filter='NEON/.*AlexNet' ./data
+    ./arm_compute_benchmark --filter='^NEON/.*AlexNet' ./data
+
+@note Filtering will be much faster if the regular expression is anchored at the start ("^") or end ("$") of the line.
 
 Additionally each test has a test id which can be used as a filter, too.
 However, the test id is not guaranteed to be stable when new tests are added.
@@ -348,12 +354,29 @@
 
 `MALI` will try to collect Mali hardware performance counters. (You need to have a recent enough Mali driver)
 
-`WALL_CLOCK` will measure time using `gettimeofday`: this should work on all platforms.
+`WALL_CLOCK_TIMER` will measure time using `gettimeofday`: this should work on all platforms.
 
-You can pass a combinations of these instruments: `--instruments=PMU,MALI,WALL_CLOCK`
+You can pass a combination of these instruments: `--instruments=PMU,MALI,WALL_CLOCK_TIMER`
 
 @note You need to make sure the instruments have been selected at compile time using the `pmu=1` or `mali=1` scons options.
 
+@subsubsection tests_running_examples Examples
+
+To run all the precommit validation tests:
+
+	LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit
+
+To run the OpenCL precommit validation tests:
+
+	LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit --filter="^CL.*"
+
+To run the NEON precommit benchmark tests with the PMU and wall clock timer (in milliseconds) instruments enabled:
+
+	LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^NEON.*" --instruments="pmu,wall_clock_timer_ms" --iterations=10
+
+To run the OpenCL precommit benchmark tests with the OpenCL kernel timer (in milliseconds) instrument enabled:
+
+	LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^CL.*" --instruments="opencl_timer_ms" --iterations=10
 */
 } // namespace test
 } // namespace arm_compute
diff --git a/docs/Doxyfile b/docs/Doxyfile
index 796c10b..5a20d2e 100644
--- a/docs/Doxyfile
+++ b/docs/Doxyfile
@@ -38,7 +38,7 @@
 # could be handy for archiving the generated documentation or if some version
 # control system is used.
 
-PROJECT_NUMBER         = 17.12
+PROJECT_NUMBER         = 18.01
 
 # Using the PROJECT_BRIEF tag one can provide an optional one line description
 # for a project that appears at the top of each page and should give viewer a
@@ -855,7 +855,8 @@
 # Note that relative paths are relative to the directory from which doxygen is
 # run.
 
-EXCLUDE                = 
+EXCLUDE                = ./arm_compute/core/NEON/kernels/assembly/ \
+                         ./arm_compute/core/NEON/kernels/winograd/
 
 # The EXCLUDE_SYMLINKS tag can be used to select whether or not files or
 # directories that are symbolic links (a Unix file system feature) are excluded