arm_compute v18.11
diff --git a/documentation/index.xhtml b/documentation/index.xhtml
index 987abfe..1cb890b 100644
--- a/documentation/index.xhtml
+++ b/documentation/index.xhtml
@@ -4,7 +4,7 @@
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
-<meta name="generator" content="Doxygen 1.8.11"/>
+<meta name="generator" content="Doxygen 1.8.13"/>
<meta name="robots" content="NOINDEX, NOFOLLOW" /> <!-- Prevent indexing by search engines -->
<title>Compute Library: Introduction</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
@@ -16,14 +16,10 @@
<script type="text/javascript" src="navtree.js"></script>
<script type="text/javascript">
$(document).ready(initResizable);
- $(window).load(resizeHeight);
</script>
<link href="search/search.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="search/searchdata.js"></script>
<script type="text/javascript" src="search/search.js"></script>
-<script type="text/javascript">
- $(document).ready(function() { init_search(); });
-</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
extensions: ["tex2jax.js"],
@@ -40,7 +36,7 @@
<tr style="height: 56px;">
<td style="padding-left: 0.5em;">
<div id="projectname">Compute Library
-  <span id="projectnumber">18.08</span>
+  <span id="projectnumber">18.11</span>
</div>
</td>
</tr>
@@ -48,35 +44,19 @@
</table>
</div>
<!-- end header part -->
-<!-- Generated by Doxygen 1.8.11 -->
+<!-- Generated by Doxygen 1.8.13 -->
<script type="text/javascript">
var searchBox = new SearchBox("searchBox", "search",false,'Search');
</script>
- <div id="navrow1" class="tabs">
- <ul class="tablist">
- <li class="current"><a href="index.xhtml"><span>Main Page</span></a></li>
- <li><a href="pages.xhtml"><span>Related Pages</span></a></li>
- <li><a href="namespaces.xhtml"><span>Namespaces</span></a></li>
- <li><a href="annotated.xhtml"><span>Data Structures</span></a></li>
- <li><a href="files.xhtml"><span>Files</span></a></li>
- <li>
- <div id="MSearchBox" class="MSearchBoxInactive">
- <span class="left">
- <img id="MSearchSelect" src="search/mag_sel.png"
- onmouseover="return searchBox.OnSearchSelectShow()"
- onmouseout="return searchBox.OnSearchSelectHide()"
- alt=""/>
- <input type="text" id="MSearchField" value="Search" accesskey="S"
- onfocus="searchBox.OnSearchFieldFocus(true)"
- onblur="searchBox.OnSearchFieldFocus(false)"
- onkeyup="searchBox.OnSearchFieldChange(event)"/>
- </span><span class="right">
- <a id="MSearchClose" href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" border="0" src="search/close.png" alt=""/></a>
- </span>
- </div>
- </li>
- </ul>
- </div>
+<script type="text/javascript" src="menudata.js"></script>
+<script type="text/javascript" src="menu.js"></script>
+<script type="text/javascript">
+$(function() {
+ initMenu('',true,false,'search.php','Search');
+ $(document).ready(function() { init_search(); });
+});
+</script>
+<div id="main-nav"></div>
</div><!-- top -->
<div id="side-nav" class="ui-resizable side-nav-resizable">
<div id="nav-tree">
@@ -164,7 +144,7 @@
Pre-built binaries</h1>
<p>For each release we provide some pre-built binaries of the library <a href="https://github.com/ARM-software/ComputeLibrary/releases">here</a></p>
<p>These binaries have been built using the following toolchains:</p><ul>
-<li>Linux armv7a: gcc-linaro-arm-linux-gnueabihf-4.9-2014.07_linux</li>
+<li>Linux armv7a: gcc-linaro-4.9-2016.02-x86_64_arm-linux-gnueabihf</li>
<li>Linux arm64-v8a: gcc-linaro-4.9-2016.02-x86_64_aarch64-linux-gnu</li>
<li>Android armv7a: clang++ / libc++ NDK r17b</li>
<li>Android arm64-v8a: clang++ / libc++ NDK r17b</li>
@@ -173,7 +153,7 @@
<h1><a class="anchor" id="S1_file_organisation"></a>
File organisation</h1>
<p>This archive contains:</p><ul>
-<li>The <a class="el" href="namespacearm__compute.xhtml" title="This file contains all available output stages for GEMMLowp on OpenCL. ">arm_compute</a> header and source files</li>
+<li>The <a class="el" href="namespacearm__compute.xhtml" title="Copyright (c) 2017-2018 ARM Limited. ">arm_compute</a> header and source files</li>
<li>The latest Khronos OpenCL 1.2 C headers from the <a href="https://www.khronos.org/registry/cl/">Khronos OpenCL registry</a></li>
<li>The latest Khronos cl2.hpp from the <a href="https://www.khronos.org/registry/cl/">Khronos OpenCL registry</a> (API version 2.1 when this document was written)</li>
<li>The latest Khronos OpenGL ES 3.1 C headers from the <a href="https://www.khronos.org/registry/gles/">Khronos OpenGL ES registry</a></li>
@@ -344,6 +324,87 @@
</pre><dl class="section note"><dt>Note</dt><dd>We're aiming at releasing one major public release with new features per quarter. All releases in between will only contain bug fixes.</dd></dl>
<h2><a class="anchor" id="S2_2_changelog"></a>
Changelog</h2>
+<p>v18.11 Public major release</p><ul>
+<li>Various bug fixes.</li>
+<li>Various optimisations.</li>
+<li>New Neon kernels / functions:<ul>
+<li><a class="el" href="classarm__compute_1_1_n_e_channel_shuffle_layer.xhtml">NEChannelShuffleLayer</a> / <a class="el" href="classarm__compute_1_1_n_e_channel_shuffle_layer_kernel.xhtml">NEChannelShuffleLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_n_e_reduce_mean.xhtml">NEReduceMean</a></li>
+<li><a class="el" href="classarm__compute_1_1_n_e_reorg_layer.xhtml">NEReorgLayer</a> / <a class="el" href="classarm__compute_1_1_n_e_reorg_layer_kernel.xhtml">NEReorgLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_n_e_prior_box_layer.xhtml">NEPriorBoxLayer</a> / <a class="el" href="classarm__compute_1_1_n_e_prior_box_layer_kernel.xhtml">NEPriorBoxLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_n_e_upsample_layer.xhtml">NEUpsampleLayer</a> / <a class="el" href="classarm__compute_1_1_n_e_upsample_layer_kernel.xhtml">NEUpsampleLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_n_e_y_o_l_o_layer.xhtml">NEYOLOLayer</a> / <a class="el" href="classarm__compute_1_1_n_e_y_o_l_o_layer_kernel.xhtml">NEYOLOLayerKernel</a></li>
+</ul>
+</li>
+<li>New OpenCL kernels / functions:<ul>
+<li><a class="el" href="classarm__compute_1_1_c_l_batch_to_space_layer.xhtml">CLBatchToSpaceLayer</a> / <a class="el" href="classarm__compute_1_1_c_l_batch_to_space_layer_kernel.xhtml">CLBatchToSpaceLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_bounding_box_transform.xhtml">CLBoundingBoxTransform</a> / <a class="el" href="classarm__compute_1_1_c_l_bounding_box_transform_kernel.xhtml">CLBoundingBoxTransformKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_compute_all_anchors_kernel.xhtml">CLComputeAllAnchorsKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_generate_proposals_layer.xhtml">CLGenerateProposalsLayer</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_normalize_planar_y_u_v_layer.xhtml">CLNormalizePlanarYUVLayer</a> / <a class="el" href="classarm__compute_1_1_c_l_normalize_planar_y_u_v_layer_kernel.xhtml">CLNormalizePlanarYUVLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_reorg_layer.xhtml">CLReorgLayer</a> / <a class="el" href="classarm__compute_1_1_c_l_reorg_layer_kernel.xhtml">CLReorgLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_space_to_batch_layer.xhtml">CLSpaceToBatchLayer</a> / <a class="el" href="classarm__compute_1_1_c_l_space_to_batch_layer_kernel.xhtml">CLSpaceToBatchLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_pad_layer.xhtml">CLPadLayer</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_reduce_mean.xhtml">CLReduceMean</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_prior_box_layer.xhtml">CLPriorBoxLayer</a> / <a class="el" href="classarm__compute_1_1_c_l_prior_box_layer_kernel.xhtml">CLPriorBoxLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_r_o_i_align_layer.xhtml">CLROIAlignLayer</a> / <a class="el" href="classarm__compute_1_1_c_l_r_o_i_align_layer_kernel.xhtml">CLROIAlignLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_slice.xhtml">CLSlice</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_split.xhtml">CLSplit</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_strided_slice.xhtml">CLStridedSlice</a> / <a class="el" href="classarm__compute_1_1_c_l_strided_slice_kernel.xhtml">CLStridedSliceKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_upsample_layer.xhtml">CLUpsampleLayer</a> / <a class="el" href="classarm__compute_1_1_c_l_upsample_layer_kernel.xhtml">CLUpsampleLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_y_o_l_o_layer.xhtml">CLYOLOLayer</a> / <a class="el" href="classarm__compute_1_1_c_l_y_o_l_o_layer_kernel.xhtml">CLYOLOLayerKernel</a></li>
+</ul>
+</li>
+<li>New CPP kernels / functions:<ul>
+<li><a class="el" href="classarm__compute_1_1_c_p_p_box_with_non_maxima_suppression_limit.xhtml">CPPBoxWithNonMaximaSuppressionLimit</a> / <a class="el" href="classarm__compute_1_1_c_p_p_box_with_non_maxima_suppression_limit_kernel.xhtml">CPPBoxWithNonMaximaSuppressionLimitKernel</a></li>
+</ul>
+</li>
+<li>Added the validate method in:<ul>
+<li><a class="el" href="classarm__compute_1_1_n_e_depth_convert_layer.xhtml">NEDepthConvertLayer</a></li>
+<li><a class="el" href="classarm__compute_1_1_n_e_floor.xhtml">NEFloor</a> / <a class="el" href="classarm__compute_1_1_c_l_floor.xhtml">CLFloor</a></li>
+<li><a class="el" href="classarm__compute_1_1_n_e_g_e_m_m_matrix_addition_kernel.xhtml">NEGEMMMatrixAdditionKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_n_e_reshape_layer.xhtml">NEReshapeLayer</a> / <a class="el" href="classarm__compute_1_1_c_l_reshape_layer.xhtml">CLReshapeLayer</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_scale.xhtml">CLScale</a></li>
+</ul>
+</li>
+<li>Added new examples:<ul>
+<li><a class="el" href="graph__shufflenet_8cpp.xhtml">graph_shufflenet.cpp</a></li>
+<li><a class="el" href="graph__yolov3_8cpp.xhtml">graph_yolov3.cpp</a></li>
+</ul>
+</li>
+<li>Added documentation on how to add a new function or kernel.</li>
+<li>Improved the Doxygen documentation by adding a list of the existing functions.</li>
+<li>Added 4D tensor support to:<ul>
+<li><a class="el" href="classarm__compute_1_1_c_l_width_concatenate_layer.xhtml">CLWidthConcatenateLayer</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_flatten_layer.xhtml">CLFlattenLayer</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_softmax_layer.xhtml">CLSoftmaxLayer</a></li>
+</ul>
+</li>
+<li>Added dot product support for <a class="el" href="classarm__compute_1_1_c_l_depthwise_convolution_layer3x3_n_h_w_c_kernel.xhtml">CLDepthwiseConvolutionLayer3x3NHWCKernel</a> with non-unit stride</li>
+<li>Added SVE support</li>
+<li>Fused batch normalization into convolution layer weights in <a class="el" href="classarm__compute_1_1_c_l_fuse_batch_normalization.xhtml">CLFuseBatchNormalization</a></li>
+<li>Fused activation in <a class="el" href="classarm__compute_1_1_c_l_depthwise_convolution_layer3x3_n_c_h_w_kernel.xhtml">CLDepthwiseConvolutionLayer3x3NCHWKernel</a>, <a class="el" href="classarm__compute_1_1_c_l_depthwise_convolution_layer3x3_n_h_w_c_kernel.xhtml">CLDepthwiseConvolutionLayer3x3NHWCKernel</a> and <a class="el" href="classarm__compute_1_1_n_e_g_e_m_m_convolution_layer.xhtml">NEGEMMConvolutionLayer</a></li>
+<li>Added NHWC data layout support to:<ul>
+<li><a class="el" href="classarm__compute_1_1_c_l_channel_shuffle_layer.xhtml">CLChannelShuffleLayer</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_deconvolution_layer.xhtml">CLDeconvolutionLayer</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_l2_normalize_layer.xhtml">CLL2NormalizeLayer</a></li>
+</ul>
+</li>
+<li>Added QASYMM8 support to the following kernels:<ul>
+<li><a class="el" href="classarm__compute_1_1_c_l_scale_kernel.xhtml">CLScaleKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_n_e_depthwise_convolution_layer3x3_kernel.xhtml">NEDepthwiseConvolutionLayer3x3Kernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_pixel_wise_multiplication_kernel.xhtml">CLPixelWiseMultiplicationKernel</a></li>
+</ul>
+</li>
+<li>Added FP16 support to the following kernels:<ul>
+<li><a class="el" href="classarm__compute_1_1_c_l_depthwise_convolution_layer3x3_n_h_w_c_kernel.xhtml">CLDepthwiseConvolutionLayer3x3NHWCKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_n_e_depthwise_convolution_layer3x3_kernel.xhtml">NEDepthwiseConvolutionLayer3x3Kernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_normalize_planar_y_u_v_layer_kernel.xhtml">CLNormalizePlanarYUVLayerKernel</a></li>
+<li><a class="el" href="classarm__compute_1_1_c_l_winograd_convolution_layer.xhtml">CLWinogradConvolutionLayer</a> (5x5 kernel)</li>
+</ul>
+</li>
+<li>More tests added to both validation and benchmarking suites.</li>
+</ul>
<p>v18.08 Public major release</p><ul>
<li>Various bug fixes.</li>
<li>Various optimisations.</li>
@@ -501,7 +562,7 @@
</ul>
<p>v18.01 Public maintenance release</p><ul>
<li>Various bug fixes</li>
-<li>Added some of the missing <a class="el" href="namespacearm__compute_1_1test_1_1validation.xhtml#adace45051d8e72357da6c2b18ceaf25e">validate()</a> methods</li>
+<li>Added some of the missing <a class="el" href="namespacearm__compute_1_1test_1_1validation.xhtml#ae02c6fc90d9c60c634bfa258049eb46b">validate()</a> methods</li>
<li>Added <a class="el" href="classarm__compute_1_1_c_l_deconvolution_layer_upsample_kernel.xhtml">CLDeconvolutionLayerUpsampleKernel</a> / <a class="el" href="classarm__compute_1_1_c_l_deconvolution_layer.xhtml">CLDeconvolutionLayer</a> <a class="el" href="classarm__compute_1_1_c_l_deconvolution_layer_upsample.xhtml">CLDeconvolutionLayerUpsample</a></li>
<li>Added <a class="el" href="classarm__compute_1_1_c_l_permute_kernel.xhtml">CLPermuteKernel</a> / <a class="el" href="classarm__compute_1_1_c_l_permute.xhtml">CLPermute</a></li>
<li>Added method to clean the programs cache in the CL <a class="el" href="classarm__compute_1_1_kernel.xhtml" title="Kernel class. ">Kernel</a> library.</li>
@@ -686,7 +747,7 @@
<li><a class="el" href="classarm__compute_1_1_n_e_fill_array_kernel.xhtml">NEFillArrayKernel</a></li>
<li><a class="el" href="classarm__compute_1_1_n_e_gaussian_pyramid_hor_kernel.xhtml">NEGaussianPyramidHorKernel</a></li>
<li><a class="el" href="classarm__compute_1_1_n_e_gaussian_pyramid_vert_kernel.xhtml">NEGaussianPyramidVertKernel</a></li>
-<li><a class="el" href="namespacearm__compute.xhtml#a0b6679b5d5c7f7dc527258181b04cf35">NEHarrisScoreFP16Kernel</a></li>
+<li>NEHarrisScoreFP16Kernel</li>
<li><a class="el" href="classarm__compute_1_1_n_e_harris_score_kernel.xhtml">NEHarrisScoreKernel</a></li>
<li><a class="el" href="classarm__compute_1_1_n_e_h_o_g_detector_kernel.xhtml">NEHOGDetectorKernel</a></li>
<li><a class="el" href="classarm__compute_1_1_n_e_logits1_d_max_kernel.xhtml">NELogits1DMaxKernel</a></li>
@@ -696,7 +757,7 @@
<li><a class="el" href="classarm__compute_1_1_n_e_non_maxima_suppression3x3_kernel.xhtml">NENonMaximaSuppression3x3Kernel</a></li>
</ul>
<p>v17.03.1 First Major public release of the sources</p><ul>
-<li>Renamed the library to <a class="el" href="namespacearm__compute.xhtml" title="This file contains all available output stages for GEMMLowp on OpenCL. ">arm_compute</a></li>
+<li>Renamed the library to <a class="el" href="namespacearm__compute.xhtml" title="Copyright (c) 2017-2018 ARM Limited. ">arm_compute</a></li>
<li>New CPP target introduced for C++ kernels shared between NEON and CL functions.</li>
<li>New padding calculation interface introduced and ported most kernels / functions to use it.</li>
<li>New OpenCL kernels / functions:<ul>
@@ -879,7 +940,7 @@
<p><b>openmp</b> Build in the OpenMP scheduler for NEON.</p>
<dl class="section note"><dt>Note</dt><dd>Only works when building with g++ not clang++</dd></dl>
<p><b>cppthreads</b> Build in the C++11 scheduler for NEON.</p>
-<dl class="section see"><dt>See also</dt><dd><a class="el" href="classarm__compute_1_1_scheduler.xhtml#a12775a7fbfa126fa4f9f06f8e02d9a8e" title="Sets the user defined scheduler and makes it the active scheduler. ">Scheduler::set</a></dd></dl>
+<dl class="section see"><dt>See also</dt><dd><a class="el" href="classarm__compute_1_1_scheduler.xhtml#ad2fc671b2772dd9e28b81cf0e2514e85" title="Sets the user defined scheduler and makes it the active scheduler. ">Scheduler::set</a></dd></dl>
<h2><a class="anchor" id="S3_2_linux"></a>
Building for Linux</h2>
<h3><a class="anchor" id="S3_2_1_library"></a>
@@ -902,7 +963,7 @@
<h3><a class="anchor" id="S3_2_2_examples"></a>
How to manually build the examples ?</h3>
<p>The examples get automatically built by scons as part of the build process of the library described above. This section just describes how you can build and link your own application against our library.</p>
-<dl class="section note"><dt>Note</dt><dd>The following command lines assume the <a class="el" href="namespacearm__compute.xhtml" title="This file contains all available output stages for GEMMLowp on OpenCL. ">arm_compute</a> binaries are present in the current directory or in the system library path. If this is not the case you can specify the location of the pre-built library with the compiler option -L. When building the OpenCL example the commands below assume that the CL headers are located in the include folder where the command is executed.</dd></dl>
+<dl class="section note"><dt>Note</dt><dd>The following command lines assume the <a class="el" href="namespacearm__compute.xhtml" title="Copyright (c) 2017-2018 ARM Limited. ">arm_compute</a> binaries are present in the current directory or in the system library path. If this is not the case you can specify the location of the pre-built library with the compiler option -L. When building the OpenCL example the commands below assume that the CL headers are located in the include folder where the command is executed.</dd></dl>
<p>To cross compile a NEON example for Linux 32bit: </p><pre class="fragment">arm-linux-gnueabihf-g++ examples/neon_convolution.cpp utils/Utils.cpp -I. -Iinclude -std=c++11 -mfpu=neon -L. -larm_compute -larm_compute_core -o neon_convolution
</pre><p>To cross compile a NEON example for Linux 64bit: </p><pre class="fragment">aarch64-linux-gnu-g++ examples/neon_convolution.cpp utils/Utils.cpp -I. -Iinclude -std=c++11 -L. -larm_compute -larm_compute_core -o neon_convolution
</pre><p>(notice the only difference with the 32 bit command is that we don't need the -mfpu option and the compiler's name is different)</p>
@@ -916,7 +977,7 @@
<p>i.e. to cross compile the "graph_lenet" example for Linux 32bit: </p><pre class="fragment">arm-linux-gnueabihf-g++ examples/graph_lenet.cpp utils/Utils.cpp utils/GraphUtils.cpp utils/CommonGraphOptions.cpp -I. -Iinclude -std=c++11 -mfpu=neon -L. -larm_compute_graph -larm_compute -larm_compute_core -Wl,--allow-shlib-undefined -o graph_lenet
</pre><p>i.e. to cross compile the "graph_lenet" example for Linux 64bit: </p><pre class="fragment">aarch64-linux-gnu-g++ examples/graph_lenet.cpp utils/Utils.cpp utils/GraphUtils.cpp utils/CommonGraphOptions.cpp -I. -Iinclude -std=c++11 -L. -larm_compute_graph -larm_compute -larm_compute_core -Wl,--allow-shlib-undefined -o graph_lenet
</pre><p>(notice the only difference with the 32 bit command is that we don't need the -mfpu option and the compiler's name is different)</p>
-<dl class="section note"><dt>Note</dt><dd>If compiling using static libraries, this order must be followed when linking: arm_compute_graph_static, <a class="el" href="namespacearm__compute.xhtml" title="This file contains all available output stages for GEMMLowp on OpenCL. ">arm_compute</a>, arm_compute_core</dd></dl>
+<dl class="section note"><dt>Note</dt><dd>If compiling using static libraries, this order must be followed when linking: arm_compute_graph_static, <a class="el" href="namespacearm__compute.xhtml" title="Copyright (c) 2017-2018 ARM Limited. ">arm_compute</a>, arm_compute_core</dd></dl>
<p>To compile natively (i.e directly on an ARM device) for NEON for Linux 32bit: </p><pre class="fragment">g++ examples/neon_convolution.cpp utils/Utils.cpp -I. -Iinclude -std=c++11 -mfpu=neon -larm_compute -larm_compute_core -o neon_convolution
</pre><p>To compile natively (i.e directly on an ARM device) for NEON for Linux 64bit: </p><pre class="fragment">g++ examples/neon_convolution.cpp utils/Utils.cpp -I. -Iinclude -std=c++11 -larm_compute -larm_compute_core -o neon_convolution
</pre><p>(notice the only difference with the 32 bit command is that we don't need the -mfpu option)</p>
@@ -926,7 +987,7 @@
<p>i.e. to natively compile the "graph_lenet" example for Linux 32bit: </p><pre class="fragment">g++ examples/graph_lenet.cpp utils/Utils.cpp utils/GraphUtils.cpp utils/CommonGraphOptions.cpp -I. -Iinclude -std=c++11 -mfpu=neon -L. -larm_compute_graph -larm_compute -larm_compute_core -Wl,--allow-shlib-undefined -o graph_lenet
</pre><p>i.e. to natively compile the "graph_lenet" example for Linux 64bit: </p><pre class="fragment">g++ examples/graph_lenet.cpp utils/Utils.cpp utils/GraphUtils.cpp utils/CommonGraphOptions.cpp -I. -Iinclude -std=c++11 -L. -larm_compute_graph -larm_compute -larm_compute_core -Wl,--allow-shlib-undefined -o graph_lenet
</pre><p>(notice the only difference with the 32 bit command is that we don't need the -mfpu option)</p>
-<dl class="section note"><dt>Note</dt><dd>If compiling using static libraries, this order must be followed when linking: arm_compute_graph_static, <a class="el" href="namespacearm__compute.xhtml" title="This file contains all available output stages for GEMMLowp on OpenCL. ">arm_compute</a>, arm_compute_core</dd>
+<dl class="section note"><dt>Note</dt><dd>If compiling using static libraries, this order must be followed when linking: arm_compute_graph_static, <a class="el" href="namespacearm__compute.xhtml" title="Copyright (c) 2017-2018 ARM Limited. ">arm_compute</a>, arm_compute_core</dd>
<dd>
These two commands assume libarm_compute.so is available in your library path, if not add the path to it using -L</dd></dl>
<p>To run the built executable simply run: </p><pre class="fragment">LD_LIBRARY_PATH=build ./neon_convolution
@@ -959,7 +1020,7 @@
</pre><h3><a class="anchor" id="S3_3_2_examples"></a>
How to manually build the examples ?</h3>
<p>The examples get automatically built by scons as part of the build process of the library described above. This section just describes how you can build and link your own application against our library.</p>
-<dl class="section note"><dt>Note</dt><dd>The following command lines assume the <a class="el" href="namespacearm__compute.xhtml" title="This file contains all available output stages for GEMMLowp on OpenCL. ">arm_compute</a> binaries are present in the current directory or in the system library path. If this is not the case you can specify the location of the pre-built library with the compiler option -L. When building the OpenCL example the commands below assume that the CL headers are located in the include folder where the command is executed.</dd></dl>
+<dl class="section note"><dt>Note</dt><dd>The following command lines assume the <a class="el" href="namespacearm__compute.xhtml" title="Copyright (c) 2017-2018 ARM Limited. ">arm_compute</a> binaries are present in the current directory or in the system library path. If this is not the case you can specify the location of the pre-built library with the compiler option -L. When building the OpenCL example the commands below assume that the CL headers are located in the include folder where the command is executed.</dd></dl>
<p>Once you've got your Android standalone toolchain built and added to your path you can do the following:</p>
<p>To cross compile a NEON example: </p><pre class="fragment">#32 bit:
arm-linux-androideabi-clang++ examples/neon_convolution.cpp utils/Utils.cpp -I. -Iinclude -std=c++11 -larm_compute-static -larm_compute_core-static -L. -o neon_convolution_arm -static-libstdc++ -pie
@@ -977,7 +1038,7 @@
arm-linux-androideabi-clang++ examples/graph_lenet.cpp utils/Utils.cpp utils/GraphUtils.cpp utils/CommonGraphOptions.cpp -I. -Iinclude -std=c++11 -Wl,--whole-archive -larm_compute_graph-static -Wl,--no-whole-archive -larm_compute-static -larm_compute_core-static -L. -o graph_lenet_arm -static-libstdc++ -pie -DARM_COMPUTE_CL
#64 bit:
aarch64-linux-android-clang++ examples/graph_lenet.cpp utils/Utils.cpp utils/GraphUtils.cpp utils/CommonGraphOptions.cpp -I. -Iinclude -std=c++11 -Wl,--whole-archive -larm_compute_graph-static -Wl,--no-whole-archive -larm_compute-static -larm_compute_core-static -L. -o graph_lenet_aarch64 -static-libstdc++ -pie -DARM_COMPUTE_CL
-</pre><dl class="section note"><dt>Note</dt><dd>Due to some issues in older versions of the Mali OpenCL DDK (<= r13p0), we recommend to link <a class="el" href="namespacearm__compute.xhtml" title="This file contains all available output stages for GEMMLowp on OpenCL. ">arm_compute</a> statically on Android. </dd>
+</pre><dl class="section note"><dt>Note</dt><dd>Due to some issues in older versions of the Mali OpenCL DDK (<= r13p0), we recommend to link <a class="el" href="namespacearm__compute.xhtml" title="Copyright (c) 2017-2018 ARM Limited. ">arm_compute</a> statically on Android. </dd>
<dd>
When linked statically the arm_compute_graph library currently needs the --whole-archive linker flag in order to work properly</dd></dl>
<p>Then all you need to do is upload the executable and the shared library to the device using ADB: </p><pre class="fragment">adb push neon_convolution_arm /data/local/tmp/
@@ -1022,9 +1083,9 @@
<p>If the Windows subsystem for Linux is not available <a href="https://www.cygwin.com/">Cygwin</a> can be used to install and run <code>scons</code>. In addition to the default packages installed by Cygwin <code>scons</code> has to be selected in the installer. (<code>git</code> might also be useful but is not strictly required if you already have got the source code of the library.) Linaro provides pre-built versions of <a href="http://releases.linaro.org/components/toolchain/binaries/">GCC cross-compilers</a> that can be used from the Cygwin terminal. When building for Android the compiler is included in the Android standalone toolchain. After everything has been set up in the Cygwin terminal the general guide on building the library can be followed.</p>
<h2><a class="anchor" id="S3_6_cl_stub_library"></a>
The OpenCL stub library</h2>
-<p>In the opencl-1.2-stubs folder you will find the sources to build a stub OpenCL library which then can be used to link your application or <a class="el" href="namespacearm__compute.xhtml" title="This file contains all available output stages for GEMMLowp on OpenCL. ">arm_compute</a> against.</p>
+<p>In the opencl-1.2-stubs folder you will find the sources to build a stub OpenCL library which then can be used to link your application or <a class="el" href="namespacearm__compute.xhtml" title="Copyright (c) 2017-2018 ARM Limited. ">arm_compute</a> against.</p>
<p>If you prefer, you can retrieve the OpenCL library from your device and link against it, but this library often depends on a range of system libraries, forcing you to link your application against those too even though it does not use them.</p>
-<dl class="section warning"><dt>Warning</dt><dd>This OpenCL library provided is a stub and <em>not</em> a real implementation. You can use it to resolve OpenCL's symbols in <a class="el" href="namespacearm__compute.xhtml" title="This file contains all available output stages for GEMMLowp on OpenCL. ">arm_compute</a> while building the example but you must make sure the real libOpenCL.so is in your PATH when running the example or it will not work.</dd></dl>
+<dl class="section warning"><dt>Warning</dt><dd>The OpenCL library provided is a stub and <em>not</em> a real implementation. You can use it to resolve OpenCL's symbols in <a class="el" href="namespacearm__compute.xhtml" title="Copyright (c) 2017-2018 ARM Limited. ">arm_compute</a> while building the example, but you must make sure the real libOpenCL.so is in your library path when running the example or it will not work.</dd></dl>
<p>To cross-compile the stub OpenCL library simply run: </p><pre class="fragment"><target-prefix>-gcc -o libOpenCL.so -Iinclude opencl-1.2-stubs/opencl_stubs.c -fPIC -shared
</pre><p>For example: </p><pre class="fragment">#Linux 32bit
arm-linux-gnueabihf-gcc -o libOpenCL.so -Iinclude opencl-1.2-stubs/opencl_stubs.c -fPIC -shared
@@ -1036,7 +1097,7 @@
aarch64-linux-android-clang -o libOpenCL.so -Iinclude -shared opencl-1.2-stubs/opencl_stubs.c -fPIC -shared
</pre><h2><a class="anchor" id="S3_7_gles_stub_library"></a>
The Linux OpenGLES and EGL stub libraries</h2>
-<p>In the opengles-3.1-stubs folder you will find the sources to build stub EGL and OpenGLES libraries which then can be used to link your Linux application of <a class="el" href="namespacearm__compute.xhtml" title="This file contains all available output stages for GEMMLowp on OpenCL. ">arm_compute</a> against.</p>
+<p>In the opengles-3.1-stubs folder you will find the sources to build stub EGL and OpenGLES libraries which then can be used to link your Linux application or <a class="el" href="namespacearm__compute.xhtml" title="Copyright (c) 2017-2018 ARM Limited. ">arm_compute</a> against.</p>
<dl class="section note"><dt>Note</dt><dd>The stub libraries are only needed on Linux. For Android, the NDK toolchains already provide the meta-EGL and meta-GLES libraries.</dd></dl>
<p>To cross-compile the stub OpenGLES and EGL libraries simply run: </p><pre class="fragment"><target-prefix>-gcc -o libEGL.so -Iinclude/linux opengles-3.1-stubs/EGL.c -fPIC -shared
<target-prefix>-gcc -o libGLESv2.so -Iinclude/linux opengles-3.1-stubs/GLESv2.c -fPIC -shared
@@ -1081,7 +1142,7 @@
-# Keep the GPU frequency constant
-# Run the network multiple times (e.g. 10).
</pre><p>If you are not using the graph API or the benchmark infrastructure you will need to manually pass a <a class="el" href="classarm__compute_1_1_c_l_tuner.xhtml" title="Basic implementation of the OpenCL tuner interface. ">CLTuner</a> object to <a class="el" href="classarm__compute_1_1_c_l_scheduler.xhtml" title="Provides global access to a CL context and command queue. ">CLScheduler</a> before configuring any function.</p>
-<div class="fragment"><div class="line">CLTuner tuner;</div><div class="line"></div><div class="line"><span class="comment">// Setup Scheduler</span></div><div class="line"><a class="code" href="classarm__compute_1_1_c_l_scheduler.xhtml#a60f9a6836b628a7171914c4afe43b4a7">CLScheduler::get</a>().<a class="code" href="classarm__compute_1_1_c_l_scheduler.xhtml#a46ecf9ef0fe80ba2ed35acfc29856b7d">default_init</a>(&tuner);</div></div><!-- fragment --><p>After the first run, the <a class="el" href="classarm__compute_1_1_c_l_tuner.xhtml" title="Basic implementation of the OpenCL tuner interface. ">CLTuner</a>'s results can be exported to a file using the method "save_to_file()".</p><ul>
+<div class="fragment"><div class="line">CLTuner tuner;</div><div class="line"></div><div class="line"><span class="comment">// Setup Scheduler</span></div><div class="line"><a class="code" href="classarm__compute_1_1_c_l_scheduler.xhtml#a9b58d0eb9a2af8e6d7908695e1557d6c">CLScheduler::get</a>().<a class="code" href="classarm__compute_1_1_c_l_scheduler.xhtml#a46ecf9ef0fe80ba2ed35acfc29856b7d">default_init</a>(&tuner);</div></div><!-- fragment --><p>After the first run, the <a class="el" href="classarm__compute_1_1_c_l_tuner.xhtml" title="Basic implementation of the OpenCL tuner interface. ">CLTuner</a>'s results can be exported to a file using the method "save_to_file()".</p><ul>
<li>tuner.save_to_file("results.csv");</li>
</ul>
<p>This file can also be imported using the method "load_from_file("results.csv")".</p><ul>
@@ -1092,9 +1153,9 @@
<!-- start footer part -->
<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
<ul>
- <li class="footer">Generated on Wed Aug 29 2018 15:31:57 for Compute Library by
+ <li class="footer">Generated on Thu Nov 22 2018 11:57:52 for Compute Library by
<a href="http://www.doxygen.org/index.html">
- <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.11 </li>
+ <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.13 </li>
</ul>
</div>
</body>