arm_compute v19.11
diff --git a/docs/00_introduction.dox b/docs/00_introduction.dox
index ca9e7e3..301e975 100644
--- a/docs/00_introduction.dox
+++ b/docs/00_introduction.dox
@@ -1,5 +1,5 @@
///
-/// Copyright (c) 2017-2018 ARM Limited.
+/// Copyright (c) 2017-2019 ARM Limited.
///
/// SPDX-License-Identifier: MIT
///
@@ -51,8 +51,8 @@
These binaries have been built using the following toolchains:
- Linux armv7a: gcc-linaro-4.9-2016.02-x86_64_arm-linux-gnueabihf
- Linux arm64-v8a: gcc-linaro-4.9-2016.02-x86_64_aarch64-linux-gnu
- - Android armv7a: clang++ / libc++ NDK r17b
- - Android am64-v8a: clang++ / libc++ NDK r17b
+ - Android armv7a: clang++ / libc++ NDK r17c
+ - Android arm64-v8a: clang++ / libc++ NDK r17c
@warning Make sure to use a compatible toolchain to build your application, or you will get std::bad_alloc errors at runtime.
@@ -236,6 +236,72 @@
@subsection S2_2_changelog Changelog
+v19.11 Public major release
+ - Various bug fixes.
+ - Various optimisations.
+ - Updated recommended NDK version to r17c.
+ - Deprecated OpenCL kernels / functions:
+ - CLDepthwiseConvolutionLayerReshapeWeightsGenericKernel
+ - CLDepthwiseIm2ColKernel
+ - CLDepthwiseSeparableConvolutionLayer
+ - CLDepthwiseVectorToTensorKernel
+ - CLDirectConvolutionLayerOutputStageKernel
+ - Deprecated NEON kernels / functions:
+ - NEDepthwiseWeightsReshapeKernel
+ - NEDepthwiseIm2ColKernel
+ - NEDepthwiseSeparableConvolutionLayer
+ - NEDepthwiseVectorToTensorKernel
+ - NEDepthwiseConvolutionLayer3x3
+ - New OpenCL kernels / functions:
+ - @ref CLInstanceNormalizationLayerKernel / @ref CLInstanceNormalizationLayer
+ - @ref CLDepthwiseConvolutionLayerNativeKernel to replace the old generic depthwise convolution (see Deprecated
+ OpenCL kernels / functions)
+ - @ref CLLogSoftmaxLayer
+ - New NEON kernels / functions:
+ - @ref NEBoundingBoxTransformKernel / @ref NEBoundingBoxTransform
+ - @ref NEComputeAllAnchorsKernel / @ref NEComputeAllAnchors
+ - @ref NEDetectionPostProcessLayer
+ - @ref NEGenerateProposalsLayer
+ - @ref NEInstanceNormalizationLayerKernel / @ref NEInstanceNormalizationLayer
+ - @ref NELogSoftmaxLayer
+ - @ref NEROIAlignLayerKernel / @ref NEROIAlignLayer
+ - Added QASYMM8 support for:
+ - @ref CLGenerateProposalsLayer
+ - @ref CLROIAlignLayer
+ - @ref CPPBoxWithNonMaximaSuppressionLimit
+ - Added QASYMM16 support for:
+ - @ref CLBoundingBoxTransform
+ - Added FP16 support for:
+ - @ref CLGEMMMatrixMultiplyReshapedKernel
+ - Added new data type QASYMM8_PER_CHANNEL support for:
+ - @ref CLDequantizationLayer
+ - @ref NEDequantizationLayer
+ - Added new data type QSYMM8_PER_CHANNEL support for:
+ - @ref CLConvolutionLayer
+ - @ref NEConvolutionLayer
+ - @ref CLDepthwiseConvolutionLayer
+ - @ref NEDepthwiseConvolutionLayer
+ - Added FP16 mixed-precision support for:
+ - @ref CLGEMMMatrixMultiplyReshapedKernel
+ - @ref CLPoolingLayerKernel
+ - Added FP32 and FP16 ELU activation (usage sketch below) for:
+ - @ref CLActivationLayer
+ - @ref NEActivationLayer
+ - Added asymmetric padding support for:
+ - @ref CLDirectDeconvolutionLayer
+ - @ref CLGEMMDeconvolutionLayer
+ - @ref NEDeconvolutionLayer
+ - Added SYMMETRIC and REFLECT modes for @ref CLPadLayerKernel / @ref CLPadLayer (usage sketch below).
+ - Replaced the calls to @ref NECopyKernel and @ref NEMemsetKernel with @ref NEPadLayer in @ref NEGenerateProposalsLayer.
+ - Replaced the calls to @ref CLCopyKernel and @ref CLMemsetKernel with @ref CLPadLayer in @ref CLGenerateProposalsLayer.
+ - Improved performance for CL Inception V3 - FP16.
+ - Improved accuracy for CL Inception V3 - FP16 by enabling FP32 accumulator (mixed-precision).
+ - Improved NEON performance by enabling the fusion of batch normalization with convolution and depthwise convolution layers.
+ - Improved NEON performance for MobileNet-SSD by optimising the output detection stage.
+ - Optimized @ref CLPadLayer.
+ - Optimized CL generic depthwise convolution layer by introducing @ref CLDepthwiseConvolutionLayerNativeKernel.
+ - Reduced memory consumption by implementing weight sharing.
+
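The ELU entry above adds a new ActivationLayerInfo::ActivationFunction value. As a minimal usage sketch (not taken from this release's examples), the snippet below assumes the standard CLTensor / @ref CLScheduler runtime flow and the three-argument CLActivationLayer::configure() overload; the tensor shape and alpha value are purely illustrative.

@code{.cpp}
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/CL/CLFunctions.h"
#include "arm_compute/runtime/CL/CLScheduler.h"
#include "arm_compute/runtime/CL/CLTensor.h"

using namespace arm_compute;

int main()
{
    // Initialise the default OpenCL context and queue used by the CL backend
    CLScheduler::get().default_init();

    // Illustrative FP16 input/output tensors (shape chosen arbitrarily)
    CLTensor src, dst;
    src.allocator()->init(TensorInfo(TensorShape(224U, 224U, 3U), 1, DataType::F16));
    dst.allocator()->init(TensorInfo(TensorShape(224U, 224U, 3U), 1, DataType::F16));

    // Select the ELU activation with alpha = 1.0
    CLActivationLayer act;
    act.configure(&src, &dst, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::ELU, 1.f));

    // Allocate backing memory, then run (input data would normally be filled in between)
    src.allocator()->allocate();
    dst.allocator()->allocate();
    act.run();

    // Wait for the queued OpenCL work to complete
    CLScheduler::get().sync();
    return 0;
}
@endcode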
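Similarly, the SYMMETRIC / REFLECT entry above only changes the padding-mode argument of @ref CLPadLayer. The sketch below is an assumption-based illustration of that argument, using the configure() overload that takes a PaddingList, a PixelValue and a PaddingMode; shapes and padding extents are illustrative only.

@code{.cpp}
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/CL/CLFunctions.h"
#include "arm_compute/runtime/CL/CLScheduler.h"
#include "arm_compute/runtime/CL/CLTensor.h"

using namespace arm_compute;

int main()
{
    CLScheduler::get().default_init();

    // Pad an 8x8 FP32 tensor by one element on each side of the first two dimensions (10x10 output)
    CLTensor src, dst;
    src.allocator()->init(TensorInfo(TensorShape(8U, 8U), 1, DataType::F32));
    dst.allocator()->init(TensorInfo(TensorShape(10U, 10U), 1, DataType::F32));

    // PaddingList holds (before, after) pairs per dimension.
    // REFLECT mirrors the interior values across the border instead of filling with a constant.
    CLPadLayer pad;
    pad.configure(&src, &dst, PaddingList{ { 1, 1 }, { 1, 1 } }, PixelValue(), PaddingMode::REFLECT);

    src.allocator()->allocate();
    dst.allocator()->allocate();
    pad.run();

    CLScheduler::get().sync();
    return 0;
}
@endcode

Swapping in PaddingMode::SYMMETRIC mirrors the border including the edge element itself, while the default CONSTANT mode fills the new elements with the supplied PixelValue.
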
v19.08 Public major release
- Various bug fixes.
- Various optimisations.
@@ -290,7 +356,8 @@
- Added an optimized depthwise convolution layer kernel for 5x5 filters (NEON only)
- Added support for enabling the OpenCL kernel cache. Added an example showing how to load prebuilt OpenCL kernels from a binary cache file
- Altered @ref QuantizationInfo interface to support per-channel quantization.
- - The @ref NEDepthwiseConvolutionLayer3x3 will be replaced by @ref NEDepthwiseConvolutionLayerOptimized to accommodate for future optimizations.
+ - The @ref CLDepthwiseConvolutionLayer3x3 will be included within @ref CLDepthwiseConvolutionLayer to accommodate future optimizations.
+ - The @ref NEDepthwiseConvolutionLayerOptimized will be included within @ref NEDepthwiseConvolutionLayer to accommodate future optimizations.
- Removed inner_border_right and inner_border_top parameters from @ref CLDeconvolutionLayer interface
- Removed inner_border_right and inner_border_top parameters from @ref NEDeconvolutionLayer interface
- Optimized the NEON assembly kernel for GEMMLowp. The new implementation fuses the output stage and quantization with the matrix multiplication kernel
@@ -624,7 +691,7 @@
- Added fused batched normalization and activation to @ref CLBatchNormalizationLayer and @ref NEBatchNormalizationLayer
- Added support for non-square pooling to @ref NEPoolingLayer and @ref CLPoolingLayer
- New OpenCL kernels / functions:
- - @ref CLDirectConvolutionLayerOutputStageKernel
+ - CLDirectConvolutionLayerOutputStageKernel
- New NEON kernels / functions
- Added name() method to all kernels.
- Added support for Winograd 5x5.
@@ -699,7 +766,7 @@
- New NEON kernels / functions
- arm_compute::NEGEMMLowpAArch64A53Kernel / arm_compute::NEGEMMLowpAArch64Kernel / arm_compute::NEGEMMLowpAArch64V8P4Kernel / arm_compute::NEGEMMInterleavedBlockedKernel / arm_compute::NEGEMMLowpAssemblyMatrixMultiplyCore
- arm_compute::NEHGEMMAArch64FP16Kernel
- - @ref NEDepthwiseConvolutionLayer3x3Kernel / @ref NEDepthwiseIm2ColKernel / @ref NEGEMMMatrixVectorMultiplyKernel / @ref NEDepthwiseVectorToTensorKernel / @ref NEDepthwiseConvolutionLayer
+ - @ref NEDepthwiseConvolutionLayer3x3Kernel / NEDepthwiseIm2ColKernel / @ref NEGEMMMatrixVectorMultiplyKernel / NEDepthwiseVectorToTensorKernel / @ref NEDepthwiseConvolutionLayer
- @ref NEGEMMLowpOffsetContributionKernel / @ref NEGEMMLowpMatrixAReductionKernel / @ref NEGEMMLowpMatrixBReductionKernel / @ref NEGEMMLowpMatrixMultiplyCore
- @ref NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / @ref NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint
- @ref NEGEMMLowpQuantizeDownInt32ToUint8ScaleKernel / @ref NEGEMMLowpQuantizeDownInt32ToUint8Scale
@@ -746,7 +813,7 @@
- @ref NEReshapeLayerKernel / @ref NEReshapeLayer
- New OpenCL kernels / functions:
- - @ref CLDepthwiseConvolutionLayer3x3NCHWKernel @ref CLDepthwiseConvolutionLayer3x3NHWCKernel @ref CLDepthwiseIm2ColKernel @ref CLDepthwiseVectorToTensorKernel CLDepthwiseWeightsReshapeKernel / @ref CLDepthwiseConvolutionLayer3x3 @ref CLDepthwiseConvolutionLayer @ref CLDepthwiseSeparableConvolutionLayer
+ - @ref CLDepthwiseConvolutionLayer3x3NCHWKernel @ref CLDepthwiseConvolutionLayer3x3NHWCKernel CLDepthwiseIm2ColKernel CLDepthwiseVectorToTensorKernel CLDepthwiseWeightsReshapeKernel / @ref CLDepthwiseConvolutionLayer3x3 @ref CLDepthwiseConvolutionLayer CLDepthwiseSeparableConvolutionLayer
- @ref CLDequantizationLayerKernel / @ref CLDequantizationLayer
- @ref CLDirectConvolutionLayerKernel / @ref CLDirectConvolutionLayer
- @ref CLFlattenLayer
@@ -829,7 +896,7 @@
v17.03 Sources preview
- New OpenCL kernels / functions:
- @ref CLGradientKernel, @ref CLEdgeNonMaxSuppressionKernel, @ref CLEdgeTraceKernel / @ref CLCannyEdge
- - GEMM refactoring + FP16 support: CLGEMMInterleave4x4Kernel, CLGEMMTranspose1xWKernel, @ref CLGEMMMatrixMultiplyKernel, @ref CLGEMMMatrixAdditionKernel / @ref CLGEMM
+ - GEMM refactoring + FP16 support: CLGEMMInterleave4x4Kernel, CLGEMMTranspose1xWKernel, @ref CLGEMMMatrixMultiplyKernel, CLGEMMMatrixAdditionKernel / @ref CLGEMM
- @ref CLGEMMMatrixAccumulateBiasesKernel / @ref CLFullyConnectedLayer
- @ref CLTransposeKernel / @ref CLTranspose
- @ref CLLKTrackerInitKernel, @ref CLLKTrackerStage0Kernel, @ref CLLKTrackerStage1Kernel, @ref CLLKTrackerFinalizeKernel / @ref CLOpticalFlow