array: add unchecked access via proxy object
This adds bounds-unchecked access to arrays through the `a.unchecked<Type,
Dimensions>()` method. (For `array_t<T>`, the `Type` template parameter
is omitted.) The mutable version, which requires that the array have the
`writeable` flag set, is available as `a.mutable_unchecked<...>()`.
Specifying the Dimensions as a template parameter allows the proxy to
store the strides and sizes in a std::array. Storing them that way (as
opposed to storing a copy of the array's strides/shape pointers) allows
the compiler to make significant optimizations of the shape() method
that it can't make with a pointer. Testing with nested loops of the form:
    for (size_t i0 = 0; i0 < r.shape(0); i0++)
        for (size_t i1 = 0; i1 < r.shape(1); i1++)
            ...
                r(i0, i1, ...) += 1;
over a 10-million-element array shows a speedup (versus using a pointer)
of around 25% for the 1D case and 33% for 2D, and runs more than twice
as fast with a 5D array.
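
As a rough illustration of the idea (a standalone sketch, not the actual
pybind11 implementation), the following shows a proxy that keeps its shape
and strides in a std::array sized by a template parameter. The names
`unchecked_view`, `make_view` and `increment_2d` are hypothetical, and the
sketch assumes a contiguous row-major buffer with strides counted in
elements rather than bytes:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the proxy described above (not pybind11's class).
// N is a compile-time constant, so shape and strides live inside the object.
template <typename T, std::size_t N>
class unchecked_view {
public:
    unchecked_view(T *data, const std::array<std::size_t, N> &shape)
        : data_(data), shape_(shape) {
        // Assumption: row-major (C-order) layout, strides in elements.
        std::size_t step = 1;
        for (std::size_t d = N; d-- > 0;) {
            strides_[d] = step;
            step *= shape_[d];
        }
    }

    std::size_t shape(std::size_t dim) const { return shape_[dim]; }

    // No dimension or bounds checks here: the caller guarantees valid indices.
    template <typename... Ix>
    T &operator()(Ix... index) const {
        static_assert(sizeof...(Ix) == N, "one index per dimension");
        const std::array<std::size_t, N> idx{{static_cast<std::size_t>(index)...}};
        std::size_t offset = 0;
        for (std::size_t d = 0; d < N; ++d)
            offset += idx[d] * strides_[d];
        return data_[offset];
    }

private:
    T *data_;
    std::array<std::size_t, N> shape_;
    std::array<std::size_t, N> strides_;
};

// All validation happens once, when the view is created.
template <std::size_t N>
unchecked_view<double, N> make_view(std::vector<double> &buf,
                                    const std::array<std::size_t, N> &shape) {
    std::size_t total = 1;
    for (std::size_t extent : shape)
        total *= extent;
    assert(total == buf.size());
    return unchecked_view<double, N>(buf.data(), shape);
}

// Same loop structure as the benchmark loops above, for the 2D case.
inline void increment_2d(unchecked_view<double, 2> r) {
    for (std::size_t i0 = 0; i0 < r.shape(0); i0++)
        for (std::size_t i1 = 0; i1 < r.shape(1); i1++)
            r(i0, i1) += 1;
}
```

Because N is known at compile time, the loop over dimensions in
operator() has a fixed trip count and the shape values are members of the
object itself, which can give the optimizer more to work with than values
read through a pointer.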
diff --git a/docs/advanced/pycpp/numpy.rst b/docs/advanced/pycpp/numpy.rst
index d89e4be..f9d6acb 100644
--- a/docs/advanced/pycpp/numpy.rst
+++ b/docs/advanced/pycpp/numpy.rst
@@ -305,3 +305,48 @@
The file :file:`tests/test_numpy_vectorize.cpp` contains a complete
example that demonstrates using :func:`vectorize` in more detail.
+
+Direct access
+=============
+
+For performance reasons, particularly when dealing with very large arrays, it
+is often desirable to access array elements directly, without the internal
+checking of dimensions and bounds on every access, when the indices are already
+known to be valid. To avoid such checks, the ``array`` class and the
+``array_t<T>`` class template offer an unchecked proxy object, obtained through
+the ``unchecked<N>`` and ``mutable_unchecked<N>`` methods, where ``N`` gives
+the required dimensionality of the array:
+
+.. code-block:: cpp
+
+    m.def("sum_3d", [](py::array_t<double> x) {
+        auto r = x.unchecked<3>(); // x must have ndim = 3; can be non-writeable
+        double sum = 0;
+        for (size_t i = 0; i < r.shape(0); i++)
+            for (size_t j = 0; j < r.shape(1); j++)
+                for (size_t k = 0; k < r.shape(2); k++)
+                    sum += r(i, j, k);
+        return sum;
+    });
+    m.def("increment_3d", [](py::array_t<double> x) {
+        auto r = x.mutable_unchecked<3>(); // Will throw if ndim != 3 or flags.writeable is false
+        for (size_t i = 0; i < r.shape(0); i++)
+            for (size_t j = 0; j < r.shape(1); j++)
+                for (size_t k = 0; k < r.shape(2); k++)
+                    r(i, j, k) += 1.0;
+    }, py::arg().noconvert());
+
+To obtain the proxy from an ``array`` object, you must specify both the data
+type and number of dimensions as template arguments, such as ``auto r =
+myarray.mutable_unchecked<float, 2>()``.
+
+Note that the returned proxy object directly references the array's data, and
+only reads its shape, strides, and writeable flag when constructed. You must
+take care to ensure that the referenced array is not destroyed or reshaped for
+the lifetime of the returned object, typically by limiting the scope of the
+returned instance.
+
+.. seealso::
+
+    The file :file:`tests/test_numpy_array.cpp` contains additional examples
+    demonstrating the use of this feature.