bpo-25658: Implement PEP 539 for Thread Specific Storage (TSS) API (GH-1362)

See PEP 539 for details.

Highlights of changes:

- Add Thread Specific Storage (TSS) API
- Document the Thread Local Storage (TLS) API as deprecated
- Update code that used TLS API to use TSS API
diff --git a/Doc/c-api/init.rst b/Doc/c-api/init.rst
index c12e1c7..7792058 100644
--- a/Doc/c-api/init.rst
+++ b/Doc/c-api/init.rst
@@ -1192,3 +1192,160 @@
    Return the next thread state object after *tstate* from the list of all such
    objects belonging to the same :c:type:`PyInterpreterState` object.
 
+
+.. _thread-local-storage:
+
+Thread Local Storage Support
+============================
+
+.. sectionauthor:: Masayuki Yamamoto <ma3yuki.8mamo10@gmail.com>
+
+The Python interpreter provides low-level support for thread-local storage
+(TLS) which wraps the underlying native TLS implementation to support the
+Python-level thread local storage API (:class:`threading.local`).  The
+CPython C level APIs are similar to those offered by pthreads and Windows:
+use a thread key and functions to associate a :c:type:`void\*` value per
+thread.
+
+The GIL does *not* need to be held when calling these functions; they supply
+their own locking.
+
+Note that :file:`Python.h` does not include the declarations of the TLS APIs;
+you need to include :file:`pythread.h` to use thread-local storage.
+
+.. note::
+   None of these API functions handle memory management on behalf of the
+   :c:type:`void\*` values.  You need to allocate and deallocate them yourself.
+   If the :c:type:`void\*` values happen to be :c:type:`PyObject\*`, these
+   functions don't do refcount operations on them either.
+
+.. _thread-specific-storage-api:
+
+Thread Specific Storage (TSS) API
+---------------------------------
+
+The TSS API is introduced to supersede the use of the existing TLS API within
+the CPython interpreter.  This API uses a new type :c:type:`Py_tss_t` instead
+of :c:type:`int` to represent thread keys.
+
+.. versionadded:: 3.7
+
+.. seealso:: "A New C-API for Thread-Local Storage in CPython" (:pep:`539`)
+
+
+.. c:type:: Py_tss_t
+
+   This data structure represents the state of a thread key, the definition of
+   which may depend on the underlying TLS implementation, and it has an
+   internal field representing the key's initialization state.  There are no
+   public members in this structure.
+
+   When :ref:`Py_LIMITED_API <stable>` is not defined, static allocation of
+   this type, initialized with :c:macro:`Py_tss_NEEDS_INIT`, is allowed.
+
+
+.. c:macro:: Py_tss_NEEDS_INIT
+
+   This macro expands to the default value for :c:type:`Py_tss_t` variables.
+   Note that this macro won't be defined with :ref:`Py_LIMITED_API <stable>`.
+
+
+Dynamic Allocation
+~~~~~~~~~~~~~~~~~~
+
+Dynamic allocation of :c:type:`Py_tss_t` is required in extension modules
+built with :ref:`Py_LIMITED_API <stable>`, where static allocation of this type
+is not possible due to its implementation being opaque at build time.
+
+
+.. c:function:: Py_tss_t* PyThread_tss_alloc()
+
+   Return a newly allocated key in the same state as a value initialized with
+   :c:macro:`Py_tss_NEEDS_INIT`, or *NULL* in the case of dynamic allocation
+   failure.
+
+
+.. c:function:: void PyThread_tss_free(Py_tss_t *key)
+
+   Free the given *key* allocated by :c:func:`PyThread_tss_alloc`, after
+   first calling :c:func:`PyThread_tss_delete` to ensure any associated
+   thread locals have been unassigned.  This is a no-op if the *key*
+   argument is *NULL*.
+
+   .. note::
+      A freed key becomes a dangling pointer; you should reset the key to
+      *NULL*.
+
+
+Methods
+~~~~~~~
+
+The parameter *key* of these functions must not be *NULL*.  Moreover, the
+behaviors of :c:func:`PyThread_tss_set` and :c:func:`PyThread_tss_get` are
+undefined if the given :c:type:`Py_tss_t` has not been initialized by
+:c:func:`PyThread_tss_create`.
+
+
+.. c:function:: int PyThread_tss_is_created(Py_tss_t *key)
+
+   Return a non-zero value if the given :c:type:`Py_tss_t` has been initialized
+   by :c:func:`PyThread_tss_create`.
+
+
+.. c:function:: int PyThread_tss_create(Py_tss_t *key)
+
+   Return a zero value on successful initialization of a TSS key.  The behavior
+   is undefined if the value pointed to by the *key* argument is not
+   initialized by :c:macro:`Py_tss_NEEDS_INIT`.  This function can be called
+   repeatedly on the same key -- calling it on an already initialized key is a
+   no-op and immediately returns success.
+
+
+.. c:function:: void PyThread_tss_delete(Py_tss_t *key)
+
+   Destroy a TSS key to forget the values associated with the key across all
+   threads, and change the key's initialization state to uninitialized.  A
+   destroyed key can be initialized again by :c:func:`PyThread_tss_create`.
+   This function can be called repeatedly on the same key -- calling it on an
+   already destroyed key is a no-op.
+
+
+.. c:function:: int PyThread_tss_set(Py_tss_t *key, void *value)
+
+   Return a zero value to indicate successfully associating a :c:type:`void\*`
+   value with a TSS key in the current thread.  Each thread has a distinct
+   mapping of the key to a :c:type:`void\*` value.
+
+
+.. c:function:: void* PyThread_tss_get(Py_tss_t *key)
+
+   Return the :c:type:`void\*` value associated with a TSS key in the current
+   thread.  This returns *NULL* if no value is associated with the key in the
+   current thread.
+
+
+.. _thread-local-storage-api:
+
+Thread Local Storage (TLS) API
+------------------------------
+
+.. deprecated:: 3.7
+   This API is superseded by
+   :ref:`Thread Specific Storage (TSS) API <thread-specific-storage-api>`.
+
+.. note::
+   This version of the API does not support platforms where the native TLS key
+   is defined in a way that cannot be safely cast to ``int``.  On such platforms,
+   :c:func:`PyThread_create_key` will return immediately with a failure status,
+   and the other TLS functions will all be no-ops.
+
+Due to the compatibility problem noted above, this version of the API should not
+be used in new code.
+
+.. c:function:: int PyThread_create_key()
+.. c:function:: void PyThread_delete_key(int key)
+.. c:function:: int PyThread_set_key_value(int key, void *value)
+.. c:function:: void* PyThread_get_key_value(int key)
+.. c:function:: void PyThread_delete_key_value(int key)
+.. c:function:: void PyThread_ReInitTLS()
+
diff --git a/Doc/whatsnew/3.7.rst b/Doc/whatsnew/3.7.rst
index 3e8617e..ecdd2fe 100644
--- a/Doc/whatsnew/3.7.rst
+++ b/Doc/whatsnew/3.7.rst
@@ -127,6 +127,38 @@
       PEP written and implemented by Barry Warsaw
 
 
+.. _whatsnew37-pep539:
+
+PEP 539: A New C-API for Thread-Local Storage in CPython
+--------------------------------------------------------
+
+While Python provides a C API for thread-local storage support, the existing
+:ref:`Thread Local Storage (TLS) API <thread-local-storage-api>` has used
+:c:type:`int` to represent TLS keys across all platforms.  This has not
+generally been a problem for officially supported platforms, but it is neither
+POSIX-compliant nor portable in any practical sense.
+
+:pep:`539` changes this by providing a new :ref:`Thread Specific Storage (TSS)
+API <thread-specific-storage-api>` to CPython which supersedes use of the
+existing TLS API within the CPython interpreter, while deprecating the existing
+API.  The TSS API uses a new type :c:type:`Py_tss_t` instead of :c:type:`int`
+to represent TSS keys--an opaque type the definition of which may depend on
+the underlying TLS implementation.  Therefore, this makes it possible to build
+CPython on platforms where the native TLS key is defined in a way that cannot
+be safely cast to :c:type:`int`.
+
+Note that on platforms where the native TLS key is defined in a way that cannot
+be safely cast to :c:type:`int`, all functions of the existing TLS API will be
+no-ops and immediately return failure.  This indicates clearly that the old API
+is not supported on platforms where it cannot be used reliably, and that no
+effort will be made to add such support.
+
+.. seealso::
+
+    :pep:`539` -- A New C-API for Thread-Local Storage in CPython
+       PEP written by Erik M. Bray; implementation by Masayuki Yamamoto.
+
+
 Other Language Changes
 ======================