[3.9] bpo-40630: Add tracemalloc.reset_peak (GH-20102) (GH-20545)

* bpo-40630: Add tracemalloc.reset_peak (GH-20102, cherrypick 8b62644)

The reset_peak function sets the peak memory size to the current size,
resetting that metric. This allows recording the peak of specific
sections of code, ignoring other code that may have had a higher peak
(since the most recent `tracemalloc.start()` or
`tracemalloc.clear_traces()` call).
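
A minimal sketch of the intended usage, assuming Python 3.9 or later.
The `peak_during` helper is hypothetical (it is not part of
tracemalloc) and only wraps the reset/measure pattern described above.
Note that the value it returns is the absolute peak of traced memory
while the callable runs, not a delta from the starting size:

```python
import tracemalloc


def peak_during(func):
    """Return the peak traced size (in bytes) seen while func() runs.

    Hypothetical helper for illustration; not part of tracemalloc.
    """
    # Forget any earlier, possibly higher, peak.
    tracemalloc.reset_peak()
    func()
    _current, peak = tracemalloc.get_traced_memory()
    return peak


tracemalloc.start()

# The large temporary list dominates the first measurement.
big_peak = peak_during(lambda: sum(list(range(100000))))

# Without reset_peak() this would still report the earlier, larger peak.
small_peak = peak_during(lambda: sum(list(range(1000))))

tracemalloc.stop()

print(f"{big_peak=}, {small_peak=}")
```

The exact byte counts depend on the interpreter build and platform, so
only the relative ordering of the two peaks should be relied on.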

* Adjust docs to point to 3.9
diff --git a/Doc/library/tracemalloc.rst b/Doc/library/tracemalloc.rst
index 3eee945..20f668c 100644
--- a/Doc/library/tracemalloc.rst
+++ b/Doc/library/tracemalloc.rst
@@ -249,6 +249,47 @@
 
 See :meth:`Snapshot.statistics` for more options.
 
+Record the current and peak size of all traced memory blocks
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following code computes two sums like ``0 + 1 + 2 + ...`` inefficiently, by
+creating a list of those numbers. This list consumes a lot of memory
+temporarily. We can use :func:`get_traced_memory` and :func:`reset_peak` to
+observe the small memory usage after the sum is computed as well as the peak
+memory usage during the computations::
+
+  import tracemalloc
+
+  tracemalloc.start()
+
+  # Example code: compute a sum with a large temporary list
+  large_sum = sum(list(range(100000)))
+
+  first_size, first_peak = tracemalloc.get_traced_memory()
+
+  tracemalloc.reset_peak()
+
+  # Example code: compute a sum with a small temporary list
+  small_sum = sum(list(range(1000)))
+
+  second_size, second_peak = tracemalloc.get_traced_memory()
+
+  print(f"{first_size=}, {first_peak=}")
+  print(f"{second_size=}, {second_peak=}")
+
+Output::
+
+  first_size=664, first_peak=3592984
+  second_size=804, second_peak=29704
+
+Using :func:`reset_peak` ensured we could accurately record the peak during the
+computation of ``small_sum``, even though it is much smaller than the overall
+peak size of memory blocks since the :func:`start` call. Without the call to
+:func:`reset_peak`, ``second_peak`` would still be the peak from the
+computation of ``large_sum`` (that is, equal to ``first_peak``). In this case,
+both peaks are much higher than the final memory usage, which suggests we
+could optimise (by removing the unnecessary call to :class:`list`, and writing
+``sum(range(...))``).
 
 API
 ---
@@ -289,6 +330,24 @@
    :mod:`tracemalloc` module as a tuple: ``(current: int, peak: int)``.
 
 
+.. function:: reset_peak()
+
+   Set the peak size of memory blocks traced by the :mod:`tracemalloc` module
+   to the current size.
+
+   Do nothing if the :mod:`tracemalloc` module is not tracing memory
+   allocations.
+
+   This function only modifies the recorded peak size, and does not modify or
+   clear any traces, unlike :func:`clear_traces`. Snapshots taken with
+   :func:`take_snapshot` before a call to :func:`reset_peak` can be
+   meaningfully compared to snapshots taken after the call.
+
+   See also :func:`get_traced_memory`.
+
+   .. versionadded:: 3.9
+
+
 .. function:: get_tracemalloc_memory()
 
    Get the memory usage in bytes of the :mod:`tracemalloc` module used to store