vectorize: pass-through of non-vectorizable args

This extends py::vectorize to automatically pass through
non-vectorizable arguments.  This removes the need for the documented
"explicitly exclude an argument" workaround.

Vectorization now applies to arithmetic, std::complex, and POD types,
passed as plain value or by const lvalue reference (previously only
pass-by-value types were supported).  Non-const lvalue references and
any other types are passed through as-is.

Functions with rvalue reference arguments (whether vectorizable or not)
are explicitly prohibited: an rvalue reference is inherently not
something that can be passed multiple times and is thus unsuitable to
being in a vectorized function.

The vectorize returned value is also now more sensitive to inputs:
previously it would return by value when all inputs are of size 1; this
is now amended to having all inputs of size 1 *and* 0 dimensions.  Thus
if you pass in, for example, [[1]], you get back a 1x1, 2D array, while
previously you got back just the resulting single value.

Vectorization of member function specializations is now also supported
via `py::vectorize(&Class::method)`; this required passthrough support
for the initial object pointer on the wrapping function pointer.
diff --git a/docs/advanced/pycpp/numpy.rst b/docs/advanced/pycpp/numpy.rst
index 57a52f6..fbcebc4 100644
--- a/docs/advanced/pycpp/numpy.rst
+++ b/docs/advanced/pycpp/numpy.rst
@@ -239,27 +239,13 @@
 The scalar argument ``z`` is transparently replicated 4 times.  The input
 arrays ``x`` and ``y`` are automatically converted into the right types (they
 are of type  ``numpy.dtype.int64`` but need to be ``numpy.dtype.int32`` and
-``numpy.dtype.float32``, respectively)
+``numpy.dtype.float32``, respectively).
 
-Sometimes we might want to explicitly exclude an argument from the vectorization
-because it makes little sense to wrap it in a NumPy array. For instance,
-suppose the function signature was
+.. note::
 
-.. code-block:: cpp
-
-    double my_func(int x, float y, my_custom_type *z);
-
-This can be done with a stateful Lambda closure:
-
-.. code-block:: cpp
-
-    // Vectorize a lambda function with a capture object (e.g. to exclude some arguments from the vectorization)
-    m.def("vectorized_func",
-        [](py::array_t<int> x, py::array_t<float> y, my_custom_type *z) {
-            auto stateful_closure = [z](int x, float y) { return my_func(x, y, z); };
-            return py::vectorize(stateful_closure)(x, y);
-        }
-    );
+    Only arithmetic, complex, and POD types passed by value or by ``const &``
+    reference are vectorized; all other arguments are passed through as-is.
+    Functions taking rvalue reference arguments cannot be vectorized.
 
 In cases where the computation is too complicated to be reduced to
 ``vectorize``, it will be necessary to create and access the buffer contents