Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args)

 * Formatting of str, int, float and complex values now uses the
   _PyUnicodeWriter API, which avoids a temporary buffer in most cases.
 * Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just
   keep a reference to the string if the output consists of a single string
 * Disable overallocation when formatting the last argument of str%args and
   str.format(args)
 * Overallocation allocates at least 100 characters: add min_length attribute
   to the _PyUnicodeWriter structure
 * Add new private functions: _PyUnicode_FastCopyCharacters(),
   _PyUnicode_FastFill() and _PyUnicode_FromASCII()

The speedup is around 20% on average.
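
To check the speedup locally, a micro-benchmark along the following lines
could be run on interpreter builds from before and after this change. It is
only a sketch: the format strings, the helper names (format_percent,
format_method, time_call) and the call counts are assumptions made for
illustration, not the benchmark used to obtain the figure above.

```python
import timeit


def format_percent(name, count, ratio):
    # Exercises the str % args code path (hypothetical format string).
    return "%s: %d items (%.2f)" % (name, count, ratio)


def format_method(name, count, ratio):
    # Exercises the str.format(args) code path (hypothetical format string).
    return "{}: {} items ({:.2f})".format(name, count, ratio)


def time_call(func, *args, number=100000):
    """Return the best of three total times, in seconds, for `number` calls."""
    return min(timeit.repeat(lambda: func(*args), number=number, repeat=3))


if __name__ == "__main__":
    sample = ("spam", 42, 0.5)
    for func in (format_percent, format_method):
        elapsed = time_call(func, *sample)
        print("%-15s %.4f s" % (func.__name__, elapsed))
```

Absolute timings depend on the machine and the build options, so only the
ratio between the two builds is meaningful.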
diff --git a/Misc/NEWS b/Misc/NEWS
index 0d36966..6a01d3e 100644
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -13,6 +13,9 @@
 - Issue #14835: Make plistlib output empty arrays & dicts like OS X.
   Patch by Sidney San Martín.
 
+- Issue #14744: Use the new _PyUnicodeWriter internal API to speed up
+  str%args and str.format(args).
+
 - Issue #14930: Make memoryview objects weakrefable.
 
 - Issue #14775: Fix a potential quadratic dict build-up due to the garbage