Disable _PyStack_AsTuple() inlining

Issue #29234: Inlining _PyStack_AsTuple() into its callers increases their
stack consumption. Disable inlining to reduce it.

Add the _Py_NO_INLINE macro: it expands to __attribute__((noinline)) on GCC
and Clang, and to nothing on other compilers.

This reduces stack consumption (bytes per call, before => after):

test_python_call: 1040 => 976 (-64 B)
test_python_getitem: 976 => 912 (-64 B)
test_python_iterator: 1120 => 1056 (-64 B)

=> total: 3136 => 2944 (-192 B)
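
A minimal sketch of why keeping a helper out of line can shrink its callers'
stack frames. It is not taken from the CPython sources: DEMO_NO_INLINE,
demo_copy_len() and demo_recurse() are hypothetical names, and the macro only
mirrors the _Py_NO_INLINE definition added by this patch. If the helper were
inlined, its 256-byte buffer could become part of every demo_recurse() frame;
kept out of line, the buffer exists only while the helper itself runs. The
actual saving depends on the compiler and its optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for _Py_NO_INLINE, defined the same way. */
#if defined(__GNUC__) || defined(__clang__)
#  define DEMO_NO_INLINE __attribute__((noinline))
#else
#  define DEMO_NO_INLINE
#endif

/* Helper with a large temporary buffer. Because it is not inlined,
 * the buffer is allocated in this function's own frame only. */
static DEMO_NO_INLINE size_t
demo_copy_len(const char *s)
{
    char buf[256];

    assert(s != NULL);
    strncpy(buf, s, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';   /* strncpy() may not terminate */
    return strlen(buf);
}

/* Recursive caller: each level needs only a small frame as long as
 * the helper's buffer is not merged into it by inlining.
 * Returns the string length plus the recursion depth. */
static size_t
demo_recurse(const char *s, int depth)
{
    if (depth == 0) {
        return demo_copy_len(s);
    }
    return demo_recurse(s, depth - 1) + 1;
}
```

For instance, demo_recurse("abc", 4) walks four recursive levels before
calling the helper once at the bottom.
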
diff --git a/Include/pyport.h b/Include/pyport.h
index def2975..03c664f 100644
--- a/Include/pyport.h
+++ b/Include/pyport.h
@@ -507,7 +507,7 @@
  * locality.
  *
  * Usage:
- *    int _Py_HOT_FUNCTION x() { return 3; }
+ *    int _Py_HOT_FUNCTION x(void) { return 3; }
  *
  * Issue #28618: This attribute must not be abused, otherwise it can have a
  * negative effect on performance. Only the functions were Python spend most of
@@ -521,6 +521,19 @@
 #define _Py_HOT_FUNCTION
 #endif
 
+/* _Py_NO_INLINE
+ * Disable inlining on a function. For example, it helps to reduce the C stack
+ * consumption.
+ *
+ * Usage:
+ *    int _Py_NO_INLINE x(void) { return 3; }
+ */
+#if defined(__GNUC__) || defined(__clang__)
+#  define _Py_NO_INLINE __attribute__((noinline))
+#else
+#  define _Py_NO_INLINE
+#endif
+
 /**************************************************************************
 Prototypes that are missing from the standard include files on some systems
 (and possibly only some versions of such systems.)