:mod:`string` --- Common string operations
==========================================

.. module:: string
   :synopsis: Common string operations.

**Source code:** :source:`Lib/string.py`

--------------

.. seealso::

   :ref:`textseq`

   :ref:`string-methods`

String constants
----------------

The constants defined in this module are:


.. data:: ascii_letters

   The concatenation of the :const:`ascii_lowercase` and :const:`ascii_uppercase`
   constants described below. This value is not locale-dependent.


.. data:: ascii_lowercase

   The lowercase letters ``'abcdefghijklmnopqrstuvwxyz'``. This value is not
   locale-dependent and will not change.


.. data:: ascii_uppercase

   The uppercase letters ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. This value is not
   locale-dependent and will not change.


.. data:: digits

   The string ``'0123456789'``.


.. data:: hexdigits

   The string ``'0123456789abcdefABCDEF'``.


.. data:: octdigits

   The string ``'01234567'``.


.. data:: punctuation

   String of ASCII characters which are considered punctuation characters
   in the ``C`` locale: ``!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~``.

.. data:: printable

   String of ASCII characters which are considered printable. This is a
   combination of :const:`digits`, :const:`ascii_letters`, :const:`punctuation`,
   and :const:`whitespace`.


.. data:: whitespace

   A string containing all ASCII characters that are considered whitespace.
   This includes the characters space, tab, linefeed, return, formfeed, and
   vertical tab.
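
The constants above are ordinary strings, so they can be used for membership
tests and translation tables. The following is a minimal sketch; the helper
names ``is_hex`` and ``strip_punctuation`` are illustrative and not part of
the module.

```python
import string

# ascii_letters is the lowercase letters followed by the uppercase letters.
letters_match = string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase


def is_hex(text):
    """Return True if text is non-empty and holds only hexadecimal digits."""
    return bool(text) and all(ch in string.hexdigits for ch in text)


def strip_punctuation(text):
    """Remove every ASCII punctuation character from text."""
    return text.translate(str.maketrans("", "", string.punctuation))


print(letters_match)                       # True
print(is_hex("DEADbeef"), is_hex("xyz"))   # True False
print(strip_punctuation("Hello, world!"))  # Hello world
```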


.. _string-formatting:

Custom String Formatting
------------------------

The built-in string class provides the ability to do complex variable
substitutions and value formatting via the :meth:`~str.format` method described in
:pep:`3101`. The :class:`Formatter` class in the :mod:`string` module allows
you to create and customize your own string formatting behaviors using the same
implementation as the built-in :meth:`~str.format` method.


.. class:: Formatter

   The :class:`Formatter` class has the following public methods:

   .. method:: format(format_string, *args, **kwargs)

      The primary API method. It takes a format string and
      an arbitrary set of positional and keyword arguments.
      It is just a wrapper that calls :meth:`vformat`.

      .. versionchanged:: 3.7
         A format string argument is now :ref:`positional-only
         <positional-only_parameter>`.

   .. method:: vformat(format_string, args, kwargs)

      This function does the actual work of formatting. It is exposed as a
      separate function for cases where you want to pass in a predefined
      dictionary of arguments, rather than unpacking and repacking the
      dictionary as individual arguments using the ``*args`` and ``**kwargs``
      syntax. :meth:`vformat` does the work of breaking up the format string
      into character data and replacement fields. It calls the various
      methods described below.

   In addition, the :class:`Formatter` defines a number of methods that are
   intended to be replaced by subclasses:

   .. method:: parse(format_string)

      Loop over *format_string* and return an iterable of tuples
      (*literal_text*, *field_name*, *format_spec*, *conversion*). This is used
      by :meth:`vformat` to break the string into either literal text, or
      replacement fields.

      The values in the tuple conceptually represent a span of literal text
      followed by a single replacement field. If there is no literal text
      (which can happen if two replacement fields occur consecutively), then
      *literal_text* will be a zero-length string. If there is no replacement
      field, then the values of *field_name*, *format_spec* and *conversion*
      will be ``None``.

   .. method:: get_field(field_name, args, kwargs)

      Given *field_name* as returned by :meth:`parse` (see above), convert it to
      an object to be formatted. Returns a tuple ``(obj, used_key)``. The default
      version takes strings of the form defined in :pep:`3101`, such as
      ``"0[name]"`` or ``"label.title"``. *args* and *kwargs* are as passed in to
      :meth:`vformat`. The return value *used_key* has the same meaning as the
      *key* parameter to :meth:`get_value`.

   .. method:: get_value(key, args, kwargs)

      Retrieve a given field value. The *key* argument will be either an
      integer or a string. If it is an integer, it represents the index of the
      positional argument in *args*; if it is a string, then it represents a
      named argument in *kwargs*.

      The *args* parameter is set to the list of positional arguments to
      :meth:`vformat`, and the *kwargs* parameter is set to the dictionary of
      keyword arguments.

      For compound field names, these functions are only called for the first
      component of the field name; subsequent components are handled through
      normal attribute and indexing operations.

      So for example, the field expression ``'0.name'`` would cause
      :meth:`get_value` to be called with a *key* argument of 0. The ``name``
      attribute will be looked up after :meth:`get_value` returns by calling the
      built-in :func:`getattr` function.

      If the index or keyword refers to an item that does not exist, then an
      :exc:`IndexError` or :exc:`KeyError` should be raised.

   .. method:: check_unused_args(used_args, args, kwargs)

      Implement checking for unused arguments if desired. The arguments to this
      function are the set of all argument keys that were actually referred to in
      the format string (integers for positional arguments, and strings for
      named arguments), and a reference to the *args* and *kwargs* that were
      passed to :meth:`vformat`. The set of unused args can be calculated from
      these parameters. :meth:`check_unused_args` is assumed to raise an
      exception if the check fails.

   .. method:: format_field(value, format_spec)

      :meth:`format_field` simply calls the global :func:`format` built-in. The
      method is provided so that subclasses can override it.

   .. method:: convert_field(value, conversion)

      Converts the value (returned by :meth:`get_field`) given a conversion type
      (as in the tuple returned by the :meth:`parse` method). The default
      version understands 's' (str), 'r' (repr) and 'a' (ascii) conversion
      types.
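
As an illustration of the hooks described above, the sketch below first shows
what :meth:`parse` yields for a small format string, then overrides
:meth:`get_value` so that a missing keyword argument formats as a placeholder
instead of raising :exc:`KeyError`. The class name ``DefaultingFormatter`` and
its ``placeholder`` parameter are hypothetical, chosen only for this example.

```python
import string

# parse() yields (literal_text, field_name, format_spec, conversion) tuples.
parsed = list(string.Formatter().parse("x={0!r:>5} end"))
print(parsed)  # [('x=', '0', '>5', 'r'), (' end', None, None, None)]


class DefaultingFormatter(string.Formatter):
    """Hypothetical subclass: missing keyword arguments become a placeholder."""

    def __init__(self, placeholder="<missing>"):
        super().__init__()
        self.placeholder = placeholder

    def get_value(self, key, args, kwargs):
        # String keys name keyword arguments; integer keys index into args.
        if isinstance(key, str):
            return kwargs.get(key, self.placeholder)
        return super().get_value(key, args, kwargs)


greeting = DefaultingFormatter().format("{greeting}, {name}!", greeting="Hello")
print(greeting)  # Hello, <missing>!
```

Only :meth:`get_value` is overridden here, so parsing, conversion and field
formatting still follow the default behavior.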


.. _formatstrings:

Format String Syntax
--------------------

The :meth:`str.format` method and the :class:`Formatter` class share the same
syntax for format strings (although in the case of :class:`Formatter`,
subclasses can define their own format string syntax). The syntax is
related to that of :ref:`formatted string literals <f-strings>`, but
there are differences.

.. index::
   single: {} (curly brackets); in string formatting
   single: . (dot); in string formatting
   single: [] (square brackets); in string formatting
   single: ! (exclamation); in string formatting
   single: : (colon); in string formatting

Format strings contain "replacement fields" surrounded by curly braces ``{}``.
Anything that is not contained in braces is considered literal text, which is
copied unchanged to the output. If you need to include a brace character in the
literal text, it can be escaped by doubling: ``{{`` and ``}}``.

The grammar for a replacement field is as follows:

.. productionlist:: sf
   replacement_field: "{" [`field_name`] ["!" `conversion`] [":" `format_spec`] "}"
   field_name: arg_name ("." `attribute_name` | "[" `element_index` "]")*
   arg_name: [`identifier` | `digit`+]
   attribute_name: `identifier`
   element_index: `digit`+ | `index_string`
   index_string: <any source character except "]"> +
   conversion: "r" | "s" | "a"
   format_spec: <described in the next section>

In less formal terms, the replacement field can start with a *field_name* that specifies
the object whose value is to be formatted and inserted
into the output instead of the replacement field.
The *field_name* is optionally followed by a *conversion* field, which is
preceded by an exclamation point ``'!'``, and a *format_spec*, which is preceded
by a colon ``':'``. These specify a non-default format for the replacement value.

See also the :ref:`formatspec` section.

The *field_name* itself begins with an *arg_name* that is either a number or a
keyword. If it's a number, it refers to a positional argument, and if it's a keyword,
it refers to a named keyword argument. If the numerical arg_names in a format string
are 0, 1, 2, ... in sequence, they can all be omitted (not just some)
and the numbers 0, 1, 2, ... will be automatically inserted in that order.
Because *arg_name* is not quote-delimited, it is not possible to specify arbitrary
dictionary keys (e.g., the strings ``'10'`` or ``':-]'``) within a format string.
The *arg_name* can be followed by any number of index or
attribute expressions. An expression of the form ``'.name'`` selects the named
attribute using :func:`getattr`, while an expression of the form ``'[index]'``
does an index lookup using :meth:`__getitem__`.

.. versionchanged:: 3.1
   The positional argument specifiers can be omitted for :meth:`str.format`,
   so ``'{} {}'.format(a, b)`` is equivalent to ``'{0} {1}'.format(a, b)``.

.. versionchanged:: 3.4
   The positional argument specifiers can be omitted for :class:`Formatter`.

Some simple format string examples::

   "First, thou shalt count to {0}"  # References first positional argument
   "Bring me a {}"                   # Implicitly references the first positional argument
   "From {} to {}"                   # Same as "From {0} to {1}"
   "My quest is {name}"              # References keyword argument 'name'
   "Weight in tons {0.weight}"       # 'weight' attribute of first positional arg
   "Units destroyed: {players[0]}"   # First element of keyword argument 'players'.
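
The same format strings can be exercised with :meth:`str.format`. This is a
small runnable sketch; the argument values and the ``swallow`` object are
made up for illustration.

```python
from types import SimpleNamespace

swallow = SimpleNamespace(weight=0.02)  # hypothetical object with a 'weight' attribute

results = [
    "First, thou shalt count to {0}".format(3),
    "Bring me a {}".format("shrubbery"),
    "From {} to {}".format("Camelot", "Swamp Castle"),
    "My quest is {name}".format(name="to seek the Holy Grail"),
    "Weight in tons {0.weight}".format(swallow),
    "Units destroyed: {players[0]}".format(players=[7, 3]),
]
for line in results:
    print(line)

# Automatic and explicit numbering cannot be mixed in one format string.
try:
    "{} {1}".format("a", "b")
    mixed_numbering_rejected = False
except ValueError:
    mixed_numbering_rejected = True
```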

The *conversion* field causes a type coercion before formatting. Normally, the
job of formatting a value is done by the :meth:`__format__` method of the value
itself. However, in some cases it is desirable to force a type to be formatted
as a string, overriding its own definition of formatting. By converting the
value to a string before calling :meth:`__format__`, the normal formatting logic
is bypassed.

Three conversion flags are currently supported: ``'!s'`` which calls :func:`str`
on the value, ``'!r'`` which calls :func:`repr` and ``'!a'`` which calls
:func:`ascii`.

Some examples::

   "Harold's a clever {0!s}"        # Calls str() on the argument first
   "Bring out the holy {name!r}"    # Calls repr() on the argument first
   "More {!a}"                      # Calls ascii() on the argument first
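
To make the effect of a conversion visible, the sketch below uses a
hypothetical ``Knight`` class whose :meth:`__format__`, :meth:`__str__` and
:meth:`__repr__` each return a different string. Without a conversion the
object's own :meth:`__format__` runs; with one, the converted string is
formatted instead.

```python
class Knight:
    """Hypothetical class with distinct __format__, __str__ and __repr__ results."""

    def __format__(self, format_spec):
        return "formatted"

    def __str__(self):
        return "Sir Robin"

    def __repr__(self):
        return "Knight('Robin')"


knight = Knight()
plain = "{}".format(knight)             # no conversion: Knight.__format__ is used
as_str = "{!s}".format(knight)          # str() is applied first
as_repr = "{!r}".format(knight)         # repr() is applied first
as_ascii = "{!a}".format("caf\u00e9")   # ascii() escapes the non-ASCII character

print(plain, as_str, as_repr, as_ascii, sep=" | ")
```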

The *format_spec* field contains a specification of how the value should be
presented, including such details as field width, alignment, padding, decimal
precision and so on. Each value type can define its own "formatting
mini-language" or interpretation of the *format_spec*.

Most built-in types support a common formatting mini-language, which is
described in the next section.

A *format_spec* field can also include nested replacement fields within it.
These nested replacement fields may contain a field name, conversion flag
and format specification, but deeper nesting is
not allowed. The replacement fields within the
*format_spec* are substituted before the *format_spec* string is interpreted.
This allows the formatting of a value to be dynamically specified.

See the :ref:`formatexamples` section for some examples.
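
As a brief sketch of nested replacement fields, the width, precision, fill
and alignment below are supplied as arguments rather than written into the
format string. The keyword names ``width``, ``prec``, ``fill`` and ``align``
are arbitrary choices for this example.

```python
value = 3.14159

# The nested fields are substituted first, giving the spec "10.2f".
dynamic = "{:{width}.{prec}f}".format(value, width=10, prec=2)

# Fill and alignment can be supplied the same way, giving the spec "*^16".
centered = "{0:{fill}{align}16}".format("text", fill="*", align="^")

print(repr(dynamic))   # '      3.14'
print(repr(centered))  # '******text******'
```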


.. _formatspec:

Format Specification Mini-Language
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

"Format specifications" are used within replacement fields contained within a
format string to define how individual values are presented (see
:ref:`formatstrings` and :ref:`f-strings`).
They can also be passed directly to the built-in
:func:`format` function. Each formattable type may define how the format
specification is to be interpreted.

Most built-in types implement the following options for format specifications,
although some of the formatting options are only supported by the numeric types.

A general convention is that an empty format string (``""``) produces
the same result as if you had called :func:`str` on the value. A
non-empty format string typically modifies the result.
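
The convention can be checked directly with the built-in :func:`format`
function. The sample values here are arbitrary.

```python
samples = [42, 3.5, "spam", True, None]

# An empty format specification gives the same text as str().
empty_spec_matches_str = all(format(value, "") == str(value) for value in samples)

# A non-empty specification changes the result, here by padding to six columns.
padded = format(42, ">6")

print(empty_spec_matches_str)  # True
print(repr(padded))            # '    42'
```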

The general form of a *standard format specifier* is:

.. productionlist:: sf
   format_spec: [[`fill`]`align`][`sign`][#][0][`width`][`grouping_option`][.`precision`][`type`]
   fill: <any character>
   align: "<" | ">" | "=" | "^"
   sign: "+" | "-" | " "
   width: `digit`+
   grouping_option: "_" | ","
   precision: `digit`+
   type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"

If a valid *align* value is specified, it can be preceded by a *fill*
character that can be any character and defaults to a space if omitted.
It is not possible to use a literal curly brace ("``{``" or "``}``") as
the *fill* character in a :ref:`formatted string literal
<f-strings>` or when using the :meth:`str.format`
method. However, it is possible to insert a curly brace
with a nested replacement field. This limitation doesn't
affect the :func:`format` function.
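
A short sketch of the curly-brace case: the built-in :func:`format` accepts a
brace as the fill character directly, while :meth:`str.format` needs it passed
in through a nested replacement field (the keyword name ``fill`` is arbitrary).

```python
# format() receives the specification as-is, so "{" can be the fill character.
via_builtin = format(42, "{>6")

# With str.format(), the brace is supplied through a nested replacement field.
via_nested = "{:{fill}>6}".format(42, fill="{")

# A closing brace works the same way, here with left alignment.
closing = format(42, "}<6")

print(via_builtin, via_nested, closing)  # {{{{42 {{{{42 42}}}}
```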

The meaning of the various alignment options is as follows:

.. index::
   single: < (less); in string formatting
   single: > (greater); in string formatting
   single: = (equals); in string formatting
   single: ^ (caret); in string formatting

+---------+----------------------------------------------------------+
| Option  | Meaning                                                  |
+=========+==========================================================+
| ``'<'`` | Forces the field to be left-aligned within the available |
|         | space (this is the default for most objects).            |
+---------+----------------------------------------------------------+
| ``'>'`` | Forces the field to be right-aligned within the          |
|         | available space (this is the default for numbers).       |
+---------+----------------------------------------------------------+
| ``'='`` | Forces the padding to be placed after the sign (if any)  |
|         | but before the digits. This is used for printing fields  |
|         | in the form '+000000120'. This alignment option is only  |
|         | valid for numeric types. It becomes the default when '0' |
|         | immediately precedes the field width.                    |
+---------+----------------------------------------------------------+
| ``'^'`` | Forces the field to be centered within the available     |
|         | space.                                                   |
+---------+----------------------------------------------------------+

Note that unless a minimum field width is defined, the field width will always
be the same size as the data to fill it, so that the alignment option has no
meaning in this case.
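
The alignment options in the table can be tried with the built-in
:func:`format` function. This is an illustrative sketch with arbitrary values.

```python
left = format("abc", "<7")         # 'abc    '
right = format("abc", ">7")        # '    abc'
center = format("abc", "^7")       # '  abc  '
starred = format("abc", "*^7")     # '**abc**'

default_text = format("abc", "7")  # strings default to left alignment
default_num = format(42, "7")      # numbers default to right alignment

padded = format(120, "0=+10")      # padding goes between the sign and the digits
no_width = format("abc", ">")      # without a width, alignment has no effect

for item in (left, right, center, starred, default_text, default_num, padded, no_width):
    print(repr(item))
```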

The *sign* option is only valid for number types, and can be one of the
following:

.. index::
   single: + (plus); in string formatting
   single: - (minus); in string formatting
   single: space; in string formatting

+---------+----------------------------------------------------------+
| Option  | Meaning                                                  |
+=========+==========================================================+
| ``'+'`` | indicates that a sign should be used for both            |
|         | positive and negative numbers.                           |
+---------+----------------------------------------------------------+
| ``'-'`` | indicates that a sign should be used only for negative   |
|         | numbers (this is the default behavior).                  |
+---------+----------------------------------------------------------+
| space   | indicates that a leading space should be used on         |
|         | positive numbers, and a minus sign on negative numbers.  |
+---------+----------------------------------------------------------+
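
The three sign options can be compared side by side. In this sketch each
option is applied to one positive and one negative float, and a final check
shows that a sign is rejected for a string value.

```python
rows = [(spec, format(3.14, spec), format(-3.14, spec)) for spec in ("+", "-", " ")]
for spec, positive, negative in rows:
    print(repr(spec), repr(positive), repr(negative))

# The sign option is only valid for numbers.
try:
    format("abc", "+")
    sign_rejected_for_str = False
except ValueError:
    sign_rejected_for_str = True
```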


.. index:: single: # (hash); in string formatting

The ``'#'`` option causes the "alternate form" to be used for the
conversion. The alternate form is defined differently for different
types. This option is only valid for integer, float, complex and
Decimal types. For integers, when binary, octal, or hexadecimal output
is used, this option adds the respective prefix ``'0b'``, ``'0o'``, or
``'0x'`` to the output value. For floats, complex and Decimal the
alternate form causes the result of the conversion to always contain a
decimal-point character, even if no digits follow it. Normally, a
decimal-point character appears in the result of these conversions
only if a digit follows it. In addition, for ``'g'`` and ``'G'``
conversions, trailing zeros are not removed from the result.
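
A few illustrative calls show the alternate form for integers and floats.
The values are arbitrary.

```python
# Integers: a base prefix is added for binary, octal and hexadecimal output.
prefixed = [format(255, "#b"), format(8, "#o"), format(255, "#x")]

# Floats: the decimal point is kept even when no digits follow it.
without_hash = format(1.0, ".0f")
with_hash = format(1.0, "#.0f")

# 'g' conversions: trailing zeros are kept in the alternate form.
general = format(1.5, "g")
general_alt = format(1.5, "#g")

print(prefixed)                 # ['0b11111111', '0o10', '0xff']
print(without_hash, with_hash)  # 1 1.
print(general, general_alt)     # 1.5 1.50000
```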

.. index:: single: , (comma); in string formatting

The ``','`` option signals the use of a comma for a thousands separator.
For a locale-aware separator, use the ``'n'`` integer presentation type
instead.

.. versionchanged:: 3.1
   Added the ``','`` option (see also :pep:`378`).

.. index:: single: _ (underscore); in string formatting

The ``'_'`` option signals the use of an underscore for a thousands
separator for floating point presentation types and for integer
presentation type ``'d'``. For integer presentation types ``'b'``,
``'o'``, ``'x'``, and ``'X'``, underscores will be inserted every 4
digits. For other presentation types, specifying this option is an
error.

.. versionchanged:: 3.6
   Added the ``'_'`` option (see also :pep:`515`).
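
Both grouping options can be sketched with a handful of calls. The last
check shows that combining ``','`` with the ``'x'`` presentation type is
rejected.

```python
comma_int = format(1234567, ",")            # '1,234,567'
comma_float = format(1234567.891, ",.2f")   # '1,234,567.89'

underscore_int = format(1234567, "_d")      # '1_234_567'
underscore_hex = format(0xFFFFFFFF, "_x")   # 'ffff_ffff', grouped every 4 digits
underscore_bin = format(255, "_b")          # '1111_1111'

try:
    format(1234567, ",x")
    comma_with_hex_rejected = False
except ValueError:
    comma_with_hex_rejected = True

print(comma_int, comma_float, underscore_int, underscore_hex, underscore_bin)
```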

*width* is a decimal integer defining the minimum field width. If not
specified, then the field width will be determined by the content.

When no explicit alignment is given, preceding the *width* field by a zero
(``'0'``) character enables
sign-aware zero-padding for numeric types. This is equivalent to a *fill*
character of ``'0'`` with an *alignment* type of ``'='``.

The *precision* is a decimal number indicating how many digits should be
displayed after the decimal point for a floating point value formatted with
``'f'`` and ``'F'``, or before and after the decimal point for a floating point
value formatted with ``'g'`` or ``'G'``. For non-number types the field
indicates the maximum field size, that is, how many characters will be
used from the field content. The *precision* is not allowed for integer values.
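
The following sketch exercises *width*, zero-padding and *precision* with the
built-in :func:`format` function, using arbitrary values.

```python
minimum_width = format(42, "6")        # '    42'
zero_padded = format(-42, "06")        # '-00042'
explicit_equiv = format(-42, "0=6")    # same result with an explicit fill and '='

fixed = format(3.14159, ".2f")         # two digits after the decimal point
significant = format(3.14159, ".3g")   # three significant digits in total
truncated = format("truncate me", ".8")  # at most eight characters of the string

# A precision is rejected for integer values.
try:
    format(42, ".2d")
    precision_rejected_for_int = False
except ValueError:
    precision_rejected_for_int = True

print(repr(minimum_width), zero_padded, fixed, significant, truncated)
```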
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 432 | |
| 433 | Finally, the *type* determines how the data should be presented. |
| 434 | |
Eric Smith | 05c0774 | 2010-02-25 14:18:57 +0000 | [diff] [blame] | 435 | The available string presentation types are: |
| 436 | |
| 437 | +---------+----------------------------------------------------------+ |
| 438 | | Type | Meaning | |
| 439 | +=========+==========================================================+ |
| 440 | | ``'s'`` | String format. This is the default type for strings and | |
| 441 | | | may be omitted. | |
| 442 | +---------+----------------------------------------------------------+ |
| 443 | | None | The same as ``'s'``. | |
| 444 | +---------+----------------------------------------------------------+ |
| 445 | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 446 | The available integer presentation types are: |
| 447 | |
| 448 | +---------+----------------------------------------------------------+ |
| 449 | | Type | Meaning | |
| 450 | +=========+==========================================================+ |
Eric Smith | d68af8f | 2008-07-16 00:15:35 +0000 | [diff] [blame] | 451 | | ``'b'`` | Binary format. Outputs the number in base 2. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 452 | +---------+----------------------------------------------------------+ |
| 453 | | ``'c'`` | Character. Converts the integer to the corresponding | |
| 454 | | | Unicode character before printing. | |
| 455 | +---------+----------------------------------------------------------+ |
| 456 | | ``'d'`` | Decimal Integer. Outputs the number in base 10. | |
| 457 | +---------+----------------------------------------------------------+ |
| 458 | | ``'o'`` | Octal format. Outputs the number in base 8. | |
| 459 | +---------+----------------------------------------------------------+ |
Serhiy Storchaka | 3f819ca | 2018-10-31 02:26:06 +0200 | [diff] [blame] | 460 | | ``'x'`` | Hex format. Outputs the number in base 16, using | |
| 461 | | | lower-case letters for the digits above 9. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 462 | +---------+----------------------------------------------------------+ |
Serhiy Storchaka | 3f819ca | 2018-10-31 02:26:06 +0200 | [diff] [blame] | 463 | | ``'X'`` | Hex format. Outputs the number in base 16, using | |
| 464 | | | upper-case letters for the digits above 9. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 465 | +---------+----------------------------------------------------------+ |
Eric Smith | 5e18a20 | 2008-05-12 10:01:24 +0000 | [diff] [blame] | 466 | | ``'n'`` | Number. This is the same as ``'d'``, except that it uses | |
| 467 | | | the current locale setting to insert the appropriate | |
| 468 | | | number separator characters. | |
| 469 | +---------+----------------------------------------------------------+ |
Georg Brandl | 3dbca81 | 2008-07-23 16:10:53 +0000 | [diff] [blame] | 470 | | None | The same as ``'d'``. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 471 | +---------+----------------------------------------------------------+ |
Georg Brandl | 48310cd | 2009-01-03 21:18:54 +0000 | [diff] [blame] | 472 | |
Eric Smith | 05c0774 | 2010-02-25 14:18:57 +0000 | [diff] [blame] | 473 | In addition to the above presentation types, integers can be formatted |
| 474 | with the floating point presentation types listed below (except |
Serhiy Storchaka | ecf41da | 2016-10-19 16:29:26 +0300 | [diff] [blame] | 475 | ``'n'`` and ``None``). When doing so, :func:`float` is used to convert the |
Eric Smith | 05c0774 | 2010-02-25 14:18:57 +0000 | [diff] [blame] | 476 | integer to a floating point number before formatting. |
| 477 | |
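A short sketch of the integer presentation types, including an integer formatted with floating point types. The sample values are arbitrary.

```python
value = 255

binary = format(value, 'b')
octal = format(value, 'o')
hex_lower = format(value, 'x')
hex_upper = format(value, 'X')
character = format(65, 'c')      # code point 65 is 'A'

# Floating point presentation types convert the integer with float() first.
as_fixed = format(value, '.1f')
as_exponent = format(value, 'e')

print(binary, octal, hex_lower, hex_upper, character)  # 11111111 377 ff FF A
print(as_fixed)     # 255.0
print(as_exponent)  # 2.550000e+02
```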
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 478 | The available presentation types for floating point and decimal values are: |
Georg Brandl | 48310cd | 2009-01-03 21:18:54 +0000 | [diff] [blame] | 479 | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 480 | +---------+----------------------------------------------------------+ |
| 481 | | Type | Meaning | |
| 482 | +=========+==========================================================+ |
| 483 | | ``'e'`` | Exponent notation. Prints the number in scientific | |
| 484 | | | notation using the letter 'e' to indicate the exponent. | |
Eric V. Smith | 45fe62d | 2013-04-15 09:51:54 -0400 | [diff] [blame] | 485 | | | The default precision is ``6``. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 486 | +---------+----------------------------------------------------------+ |
Eric Smith | 22b85b3 | 2008-07-17 19:18:29 +0000 | [diff] [blame] | 487 | | ``'E'`` | Exponent notation. Same as ``'e'`` except it uses an | |
| 488 | | | upper case 'E' as the separator character. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 489 | +---------+----------------------------------------------------------+ |
Terry Jan Reedy | 28c7f8c | 2018-08-06 08:41:17 -0400 | [diff] [blame] | 490 | | ``'f'`` | Fixed-point notation. Displays the number as a | |
| 491 | | | fixed-point number. The default precision is ``6``. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 492 | +---------+----------------------------------------------------------+ |
Terry Jan Reedy | 28c7f8c | 2018-08-06 08:41:17 -0400 | [diff] [blame] | 493 | | ``'F'`` | Fixed-point notation. Same as ``'f'``, but converts | |
| 494 | | | ``nan`` to ``NAN`` and ``inf`` to ``INF``. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 495 | +---------+----------------------------------------------------------+ |
Mark Dickinson | c70614f | 2009-10-08 20:05:48 +0000 | [diff] [blame] | 496 | | ``'g'`` | General format. For a given precision ``p >= 1``, | |
| 497 | | | this rounds the number to ``p`` significant digits and | |
| 498 | | | then formats the result in either fixed-point format | |
| 499 | | | or in scientific notation, depending on its magnitude. | |
| 500 | | | | |
| 501 | | | The precise rules are as follows: suppose that the | |
| 502 | | | result formatted with presentation type ``'e'`` and | |
| 503 | | | precision ``p-1`` would have exponent ``exp``. Then | |
| 504 | | | if ``-4 <= exp < p``, the number is formatted | |
| 505 | | | with presentation type ``'f'`` and precision | |
| 506 | | | ``p-1-exp``. Otherwise, the number is formatted | |
| 507 | | | with presentation type ``'e'`` and precision ``p-1``. | |
| 508 | | | In both cases insignificant trailing zeros are removed | |
| 509 | | | from the significand, and the decimal point is also | |
| 510 | | | removed if there are no remaining digits following it. | |
| 511 | | | | |
Benjamin Peterson | 73a3f2d | 2010-10-12 23:07:13 +0000 | [diff] [blame] | 512 | | | Positive and negative infinity, positive and negative | |
Mark Dickinson | c70614f | 2009-10-08 20:05:48 +0000 | [diff] [blame] | 513 | | | zero, and nans, are formatted as ``inf``, ``-inf``, | |
| 514 | | | ``0``, ``-0`` and ``nan`` respectively, regardless of | |
| 515 | | | the precision. | |
| 516 | | | | |
| 517 | | | A precision of ``0`` is treated as equivalent to a | |
Eric V. Smith | 45fe62d | 2013-04-15 09:51:54 -0400 | [diff] [blame] | 518 | | | precision of ``1``. The default precision is ``6``. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 519 | +---------+----------------------------------------------------------+ |
| 520 | | ``'G'`` | General format. Same as ``'g'`` except switches to | |
Mark Dickinson | c70614f | 2009-10-08 20:05:48 +0000 | [diff] [blame] | 521 | | | ``'E'`` if the number gets too large. The | |
| 522 | | | representations of infinity and NaN are uppercased, too. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 523 | +---------+----------------------------------------------------------+ |
| 524 | | ``'n'`` | Number. This is the same as ``'g'``, except that it uses | |
| 525 | | | the current locale setting to insert the appropriate | |
| 526 | | | number separator characters. | |
| 527 | +---------+----------------------------------------------------------+ |
| 528 | | ``'%'`` | Percentage. Multiplies the number by 100 and displays | |
| 529 | | | in fixed (``'f'``) format, followed by a percent sign. | |
| 530 | +---------+----------------------------------------------------------+ |
Terry Jan Reedy | c6ad576 | 2014-10-06 02:04:33 -0400 | [diff] [blame] | 531 | | None | Similar to ``'g'``, except that fixed-point notation, | |
| 532 | | | when used, has at least one digit past the decimal point.| |
| 533 | | | The default precision is as high as needed to represent | |
| 534 | | | the particular value. The overall effect is to match the | |
| 535 | | | output of :func:`str` as altered by the other format | |
| 536 | | | modifiers. | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 537 | +---------+----------------------------------------------------------+ |
| 538 | |
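The switch between fixed-point and scientific notation for ``'g'`` may be easier to follow with concrete values. The sketch below uses the default precision of ``6``, so fixed-point notation applies when ``-4 <= exp < 6``. The sample values are arbitrary.

```python
# exp = 5 (< 6): fixed-point, trailing zeros and the decimal point are removed.
g_fixed_large = format(123456.0, 'g')
# exp = 6 (not < 6): scientific notation with precision p-1 = 5.
g_sci_large = format(1234567.0, 'g')
# exp = -4 (>= -4): still fixed-point.
g_fixed_small = format(0.0001, 'g')
# exp = -5 (< -4): scientific notation.
g_sci_small = format(0.00001, 'g')

percent = format(0.5, '.1%')            # multiplied by 100, then 'f' format
upper_inf = format(float('inf'), 'F')   # 'F' uppercases inf and nan
no_type = format(1.0, '')               # keeps one digit after the point
with_g = format(1.0, 'g')               # 'g' drops the trailing '.0'

print(g_fixed_large, g_sci_large)  # 123456 1.23457e+06
print(g_fixed_small, g_sci_small)  # 0.0001 1e-05
print(percent, upper_inf)          # 50.0% INF
print(no_type, with_g)             # 1.0 1
```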
| 539 | |
Ezio Melotti | d2191e0 | 2010-07-02 23:18:51 +0000 | [diff] [blame] | 540 | .. _formatexamples: |
| 541 | |
| 542 | Format examples |
| 543 | ^^^^^^^^^^^^^^^ |
| 544 | |
Martin Panter | d5db147 | 2016-02-08 01:34:09 +0000 | [diff] [blame] | 545 | This section contains examples of the :meth:`str.format` syntax and |
| 546 | comparison with the old ``%``-formatting. |
Ezio Melotti | d2191e0 | 2010-07-02 23:18:51 +0000 | [diff] [blame] | 547 | |
| 548 | In most cases the syntax is similar to the old ``%``-formatting, with the |
| 549 | addition of the ``{}`` and with ``:`` used instead of ``%``. |
| 550 | For example, ``'%03.2f'`` can be translated to ``'{:03.2f}'``. |
| 551 | |
| 552 | The new format syntax also supports new and different options, shown in the |
Andrés Delfino | d649910 | 2018-11-07 14:24:56 -0300 | [diff] [blame] | 553 | following examples. |
Ezio Melotti | d2191e0 | 2010-07-02 23:18:51 +0000 | [diff] [blame] | 554 | |
| 555 | Accessing arguments by position:: |
| 556 | |
| 557 | >>> '{0}, {1}, {2}'.format('a', 'b', 'c') |
| 558 | 'a, b, c' |
| 559 | >>> '{}, {}, {}'.format('a', 'b', 'c') # 3.1+ only |
| 560 | 'a, b, c' |
| 561 | >>> '{2}, {1}, {0}'.format('a', 'b', 'c') |
| 562 | 'c, b, a' |
| 563 | >>> '{2}, {1}, {0}'.format(*'abc') # unpacking argument sequence |
| 564 | 'c, b, a' |
| 565 | >>> '{0}{1}{0}'.format('abra', 'cad') # arguments' indices can be repeated |
| 566 | 'abracadabra' |
| 567 | |
| 568 | Accessing arguments by name:: |
| 569 | |
| 570 | >>> 'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W') |
| 571 | 'Coordinates: 37.24N, -115.81W' |
| 572 | >>> coord = {'latitude': '37.24N', 'longitude': '-115.81W'} |
| 573 | >>> 'Coordinates: {latitude}, {longitude}'.format(**coord) |
| 574 | 'Coordinates: 37.24N, -115.81W' |
| 575 | |
| 576 | Accessing arguments' attributes:: |
| 577 | |
| 578 | >>> c = 3-5j |
| 579 | >>> ('The complex number {0} is formed from the real part {0.real} ' |
| 580 | ... 'and the imaginary part {0.imag}.').format(c) |
| 581 | 'The complex number (3-5j) is formed from the real part 3.0 and the imaginary part -5.0.' |
| 582 | >>> class Point: |
| 583 | ...     def __init__(self, x, y): |
| 584 | ...         self.x, self.y = x, y |
| 585 | ...     def __str__(self): |
| 586 | ...         return 'Point({self.x}, {self.y})'.format(self=self) |
| 587 | ... |
| 588 | >>> str(Point(4, 2)) |
| 589 | 'Point(4, 2)' |
| 590 | |
| 591 | Accessing arguments' items:: |
| 592 | |
| 593 | >>> coord = (3, 5) |
| 594 | >>> 'X: {0[0]}; Y: {0[1]}'.format(coord) |
| 595 | 'X: 3; Y: 5' |
| 596 | |
| 597 | Replacing ``%s`` and ``%r``:: |
| 598 | |
| 599 | >>> "repr() shows quotes: {!r}; str() doesn't: {!s}".format('test1', 'test2') |
| 600 | "repr() shows quotes: 'test1'; str() doesn't: test2" |
| 601 | |
| 602 | Aligning the text and specifying a width:: |
| 603 | |
| 604 | >>> '{:<30}'.format('left aligned') |
| 605 | 'left aligned                  ' |
| 606 | >>> '{:>30}'.format('right aligned') |
| 607 | '                 right aligned' |
| 608 | >>> '{:^30}'.format('centered') |
| 609 | '           centered           ' |
| 610 | >>> '{:*^30}'.format('centered') # use '*' as a fill char |
| 611 | '***********centered***********' |
| 612 | |
| 613 | Replacing ``%+f``, ``%-f``, and ``% f`` and specifying a sign:: |
| 614 | |
| 615 | >>> '{:+f}; {:+f}'.format(3.14, -3.14) # show it always |
| 616 | '+3.140000; -3.140000' |
| 617 | >>> '{: f}; {: f}'.format(3.14, -3.14) # show a space for positive numbers |
| 618 | ' 3.140000; -3.140000' |
| 619 | >>> '{:-f}; {:-f}'.format(3.14, -3.14) # show only the minus -- same as '{:f}; {:f}' |
| 620 | '3.140000; -3.140000' |
| 621 | |
| 622 | Replacing ``%x`` and ``%o`` and converting the value to different bases:: |
| 623 | |
| 624 | >>> # format also supports binary numbers |
| 625 | >>> "int: {0:d}; hex: {0:x}; oct: {0:o}; bin: {0:b}".format(42) |
| 626 | 'int: 42; hex: 2a; oct: 52; bin: 101010' |
| 627 | >>> # with 0x, 0o, or 0b as prefix: |
| 628 | >>> "int: {0:d}; hex: {0:#x}; oct: {0:#o}; bin: {0:#b}".format(42) |
| 629 | 'int: 42; hex: 0x2a; oct: 0o52; bin: 0b101010' |
| 630 | |
| 631 | Using the comma as a thousands separator:: |
| 632 | |
| 633 | >>> '{:,}'.format(1234567890) |
| 634 | '1,234,567,890' |
| 635 | |
| 636 | Expressing a percentage:: |
| 637 | |
| 638 | >>> points = 19 |
| 639 | >>> total = 22 |
Sandro Tosi | baf30da | 2011-12-24 15:53:35 +0100 | [diff] [blame] | 640 | >>> 'Correct answers: {:.2%}'.format(points/total) |
Ezio Melotti | d2191e0 | 2010-07-02 23:18:51 +0000 | [diff] [blame] | 641 | 'Correct answers: 86.36%' |
| 642 | |
| 643 | Using type-specific formatting:: |
| 644 | |
| 645 | >>> import datetime |
| 646 | >>> d = datetime.datetime(2010, 7, 4, 12, 15, 58) |
| 647 | >>> '{:%Y-%m-%d %H:%M:%S}'.format(d) |
| 648 | '2010-07-04 12:15:58' |
| 649 | |
| 650 | Nesting arguments and more complex examples:: |
| 651 | |
| 652 | >>> for align, text in zip('<^>', ['left', 'center', 'right']): |
Georg Brandl | a5770aa | 2011-02-07 12:10:46 +0000 | [diff] [blame] | 653 | ...     '{0:{fill}{align}16}'.format(text, fill=align, align=align) |
Ezio Melotti | d2191e0 | 2010-07-02 23:18:51 +0000 | [diff] [blame] | 654 | ... |
| 655 | 'left<<<<<<<<<<<<' |
| 656 | '^^^^^center^^^^^' |
| 657 | '>>>>>>>>>>>right' |
| 658 | >>> |
| 659 | >>> octets = [192, 168, 0, 1] |
| 660 | >>> '{:02X}{:02X}{:02X}{:02X}'.format(*octets) |
| 661 | 'C0A80001' |
| 662 | >>> int(_, 16) |
| 663 | 3232235521 |
| 664 | >>> |
| 665 | >>> width = 5 |
Ezio Melotti | 4050792 | 2013-01-11 09:09:07 +0200 | [diff] [blame] | 666 | >>> for num in range(5,12): #doctest: +NORMALIZE_WHITESPACE |
Ezio Melotti | d2191e0 | 2010-07-02 23:18:51 +0000 | [diff] [blame] | 667 | ...     for base in 'dXob': |
| 668 | ...         print('{0:{width}{base}}'.format(num, base=base, width=width), end=' ') |
| 669 | ...     print() |
| 670 | ... |
| 671 |     5     5     5   101 |
| 672 |     6     6     6   110 |
| 673 |     7     7     7   111 |
| 674 |     8     8    10  1000 |
| 675 |     9     9    11  1001 |
| 676 |    10     A    12  1010 |
| 677 |    11     B    13  1011 |
| 678 | |
| 679 | |
| 680 | |
Georg Brandl | 4b49131 | 2007-08-31 09:22:56 +0000 | [diff] [blame] | 681 | .. _template-strings: |
| 682 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 683 | Template strings |
| 684 | ---------------- |
| 685 | |
Barry Warsaw | 9f74deb | 2017-03-28 10:02:07 -0400 | [diff] [blame] | 686 | Template strings provide simpler string substitutions as described in |
| 687 | :pep:`292`. A primary use case for template strings is for |
| 688 | internationalization (i18n) since in that context, the simpler syntax and |
| 689 | functionality make it easier to translate than other built-in string |
| 690 | formatting facilities in Python. As an example of a library built on template |
| 691 | strings for i18n, see the |
| 692 | `flufl.i18n <http://flufli18n.readthedocs.io/en/latest/>`_ package. |
| 693 | |
Serhiy Storchaka | 913876d | 2018-10-28 13:41:26 +0200 | [diff] [blame] | 694 | .. index:: single: $ (dollar); in template strings |
Serhiy Storchaka | ddb961d | 2018-10-26 09:00:49 +0300 | [diff] [blame] | 695 | |
Barry Warsaw | 9f74deb | 2017-03-28 10:02:07 -0400 | [diff] [blame] | 696 | Template strings support ``$``-based substitutions, using the following rules: |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 697 | |
| 698 | * ``$$`` is an escape; it is replaced with a single ``$``. |
| 699 | |
| 700 | * ``$identifier`` names a substitution placeholder matching a mapping key of |
Barry Warsaw | 17d5f47 | 2015-06-09 14:20:31 -0400 | [diff] [blame] | 701 | ``"identifier"``. By default, ``"identifier"`` is restricted to any |
| 702 | case-insensitive ASCII alphanumeric string (including underscores) that |
| 703 | starts with an underscore or ASCII letter. The first non-identifier |
| 704 | character after the ``$`` character terminates this placeholder |
| 705 | specification. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 706 | |
Barry Warsaw | 17d5f47 | 2015-06-09 14:20:31 -0400 | [diff] [blame] | 707 | * ``${identifier}`` is equivalent to ``$identifier``. It is required when |
| 708 | valid identifier characters follow the placeholder but are not part of the |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 709 | placeholder, such as ``"${noun}ification"``. |
| 710 | |
| 711 | Any other appearance of ``$`` in the string will result in a :exc:`ValueError` |
| 712 | being raised. |
| 713 | |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 714 | The :mod:`string` module provides a :class:`Template` class that implements |
| 715 | these rules. The methods of :class:`Template` are: |
| 716 | |
| 717 | |
| 718 | .. class:: Template(template) |
| 719 | |
| 720 | The constructor takes a single argument which is the template string. |
| 721 | |
| 722 | |
Georg Brandl | 7f01a13 | 2009-09-16 15:58:14 +0000 | [diff] [blame] | 723 | .. method:: substitute(mapping, **kwds) |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 724 | |
Benjamin Peterson | e41251e | 2008-04-25 01:59:09 +0000 | [diff] [blame] | 725 | Performs the template substitution, returning a new string. *mapping* is |
| 726 | any dictionary-like object with keys that match the placeholders in the |
| 727 | template. Alternatively, you can provide keyword arguments, where the |
Georg Brandl | 7f01a13 | 2009-09-16 15:58:14 +0000 | [diff] [blame] | 728 | keywords are the placeholders. When both *mapping* and *kwds* are given |
| 729 | and there are duplicates, the placeholders from *kwds* take precedence. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 730 | |
| 731 | |
Georg Brandl | 7f01a13 | 2009-09-16 15:58:14 +0000 | [diff] [blame] | 732 | .. method:: safe_substitute(mapping, **kwds) |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 733 | |
Benjamin Peterson | e41251e | 2008-04-25 01:59:09 +0000 | [diff] [blame] | 734 | Like :meth:`substitute`, except that if placeholders are missing from |
Georg Brandl | 7f01a13 | 2009-09-16 15:58:14 +0000 | [diff] [blame] | 735 | *mapping* and *kwds*, instead of raising a :exc:`KeyError` exception, the |
Benjamin Peterson | e41251e | 2008-04-25 01:59:09 +0000 | [diff] [blame] | 736 | original placeholder will appear in the resulting string intact. Also, |
| 737 | unlike with :meth:`substitute`, any other appearances of the ``$`` will |
| 738 | simply return ``$`` instead of raising :exc:`ValueError`. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 739 | |
Benjamin Peterson | e41251e | 2008-04-25 01:59:09 +0000 | [diff] [blame] | 740 | While other exceptions may still occur, this method is called "safe" |
Andrés Delfino | d649910 | 2018-11-07 14:24:56 -0300 | [diff] [blame] | 741 | because it always tries to return a usable string instead of |
Benjamin Peterson | e41251e | 2008-04-25 01:59:09 +0000 | [diff] [blame] | 742 | raising an exception. In another sense, :meth:`safe_substitute` may be |
| 743 | anything other than safe, since it will silently ignore malformed |
| 744 | templates containing dangling delimiters, unmatched braces, or |
| 745 | placeholders that are not valid Python identifiers. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 746 | |
Benjamin Peterson | 2021100 | 2009-11-25 18:34:42 +0000 | [diff] [blame] | 747 | :class:`Template` instances also provide one public data attribute: |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 748 | |
Benjamin Peterson | 2021100 | 2009-11-25 18:34:42 +0000 | [diff] [blame] | 749 | .. attribute:: template |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 750 | |
Benjamin Peterson | 2021100 | 2009-11-25 18:34:42 +0000 | [diff] [blame] | 751 | This is the object passed to the constructor's *template* argument. In |
| 752 | general, you shouldn't change it, but read-only access is not enforced. |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 753 | |
Ezio Melotti | bcbc567 | 2013-02-21 12:30:32 +0200 | [diff] [blame] | 754 | Here is an example of how to use a Template:: |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 755 | |
| 756 | >>> from string import Template |
| 757 | >>> s = Template('$who likes $what') |
| 758 | >>> s.substitute(who='tim', what='kung pao') |
| 759 | 'tim likes kung pao' |
| 760 | >>> d = dict(who='tim') |
| 761 | >>> Template('Give $who $100').substitute(d) |
| 762 | Traceback (most recent call last): |
Ezio Melotti | bcbc567 | 2013-02-21 12:30:32 +0200 | [diff] [blame] | 763 | ... |
Ezio Melotti | 4050792 | 2013-01-11 09:09:07 +0200 | [diff] [blame] | 764 | ValueError: Invalid placeholder in string: line 1, col 11 |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 765 | >>> Template('$who likes $what').substitute(d) |
| 766 | Traceback (most recent call last): |
Ezio Melotti | bcbc567 | 2013-02-21 12:30:32 +0200 | [diff] [blame] | 767 | ... |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 768 | KeyError: 'what' |
| 769 | >>> Template('$who likes $what').safe_substitute(d) |
| 770 | 'tim likes $what' |
| 771 | |
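The substitution rules and the precedence of keyword arguments can also be shown together. This is only a sketch: the template text and the mapping values are made up for illustration.

```python
from string import Template

# '${noun}' needs braces because 'ification' follows it directly;
# '$$' is the escape for a literal '$', so '$$$price' renders as '$' + price.
t = Template('${noun}ification costs $$$price for $who')

# 'who' appears in both the mapping and the keywords; the keyword wins.
result = t.substitute({'noun': 'cert', 'who': 'tim', 'price': 5}, who='eve')
print(result)  # certification costs $5 for eve

# safe_substitute() leaves an invalid '$' in place instead of raising ValueError.
lenient = Template('Give $who $100').safe_substitute(who='tim')
print(lenient)  # Give tim $100
```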
Barry Warsaw | 9f74deb | 2017-03-28 10:02:07 -0400 | [diff] [blame] | 772 | Advanced usage: you can derive subclasses of :class:`Template` to customize |
| 773 | the placeholder syntax, delimiter character, or the entire regular expression |
| 774 | used to parse template strings. To do this, you can override these class |
| 775 | attributes: |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 776 | |
Barry Warsaw | 9f74deb | 2017-03-28 10:02:07 -0400 | [diff] [blame] | 777 | * *delimiter* -- This is the literal string describing a placeholder |
| 778 | introducing delimiter. The default value is ``$``. Note that this should |
| 779 | *not* be a regular expression, as the implementation will call |
| 780 | :meth:`re.escape` on this string as needed. Note further that you cannot |
| 781 | change the delimiter after class creation (i.e. a different delimiter must |
| 782 | be set in the subclass's class namespace). |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 783 | |
| 784 | * *idpattern* -- This is the regular expression describing the pattern for |
Barry Warsaw | ba42796 | 2017-09-04 16:32:10 -0400 | [diff] [blame] | 785 | non-braced placeholders. The default value is the regular expression |
Serhiy Storchaka | 87be28f | 2018-01-04 19:20:11 +0200 | [diff] [blame] | 786 | ``(?a:[_a-z][_a-z0-9]*)``. If this is given and *braceidpattern* is |
INADA Naoki | b22273e | 2017-10-13 16:02:23 +0900 | [diff] [blame] | 787 | ``None`` this pattern will also apply to braced placeholders. |
| 788 | |
| 789 | .. note:: |
| 790 | |
| 791 | Since the default *flags* is ``re.IGNORECASE``, the pattern ``[a-z]`` can match |
Barry Warsaw | e256b40 | 2017-11-21 10:28:13 -0500 | [diff] [blame] | 792 | some non-ASCII characters. That is why the default pattern uses the local ``a`` |
Serhiy Storchaka | 87be28f | 2018-01-04 19:20:11 +0200 | [diff] [blame] | 793 | flag. |
Barry Warsaw | ba42796 | 2017-09-04 16:32:10 -0400 | [diff] [blame] | 794 | |
| 795 | .. versionchanged:: 3.7 |
| 796 | *braceidpattern* can be used to define separate patterns used inside and |
| 797 | outside the braces. |
| 798 | |
| 799 | * *braceidpattern* -- This is like *idpattern* but describes the pattern for |
| 800 | braced placeholders. Defaults to ``None`` which means to fall back to |
| 801 | *idpattern* (i.e. the same pattern is used both inside and outside braces). |
| 802 | If given, this allows you to define different patterns for braced and |
| 803 | unbraced placeholders. |
| 804 | |
| 805 | .. versionadded:: 3.7 |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 806 | |
Georg Brandl | 056cb93 | 2010-07-29 17:16:10 +0000 | [diff] [blame] | 807 | * *flags* -- The regular expression flags that will be applied when compiling |
| 808 | the regular expression used for recognizing substitutions. The default value |
| 809 | is ``re.IGNORECASE``. Note that ``re.VERBOSE`` will always be added to the |
| 810 | flags, so custom *idpattern*\ s must follow conventions for verbose regular |
| 811 | expressions. |
| 812 | |
| 813 | .. versionadded:: 3.2 |
| 814 | |
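As a sketch of how these class attributes might be overridden, the two subclasses below change the delimiter and the identifier pattern. ``PercentTemplate`` and ``DottedTemplate`` are hypothetical names invented for this example, and the dotted *idpattern* is an assumption chosen for illustration rather than a recommended pattern.

```python
from string import Template


class PercentTemplate(Template):
    """Hypothetical subclass that uses '%' instead of '$' as the delimiter."""
    delimiter = '%'


class DottedTemplate(Template):
    """Hypothetical subclass whose placeholders may contain dots."""
    # No whitespace in the pattern, since re.VERBOSE is always added to the flags.
    idpattern = r'(?a:[_a-z][_a-z0-9]*(?:\.[_a-z][_a-z0-9]*)*)'


# With '%' as the delimiter, '$' is ordinary text and '%%' is the escape.
percent_result = PercentTemplate('%who owes $5 (100%%)').substitute(who='tim')
print(percent_result)  # tim owes $5 (100%)

# The whole dotted name is looked up as a single mapping key.
dotted_result = DottedTemplate('Hello $user.name!').substitute({'user.name': 'tim'})
print(dotted_result)  # Hello tim!
```

Both attributes are set in the class body, which matches the note above that the delimiter cannot be changed after class creation.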
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 815 | Alternatively, you can provide the entire regular expression pattern by |
| 816 | overriding the class attribute *pattern*. If you do this, the value must be a |
| 817 | regular expression object with four named capturing groups. The capturing |
| 818 | groups correspond to the rules given above, along with the invalid placeholder |
| 819 | rule: |
| 820 | |
| 821 | * *escaped* -- This group matches the escape sequence, e.g. ``$$``, in the |
| 822 | default pattern. |
| 823 | |
| 824 | * *named* -- This group matches the unbraced placeholder name; it should not |
| 825 | include the delimiter in the capturing group. |
| 826 | |
| 827 | * *braced* -- This group matches the brace enclosed placeholder name; it should |
| 828 | not include either the delimiter or braces in the capturing group. |
| 829 | |
| 830 | * *invalid* -- This group matches any other delimiter pattern (usually a single |
| 831 | delimiter), and it should appear last in the regular expression. |
| 832 | |
| 833 | |
Georg Brandl | abc3877 | 2009-04-12 15:51:51 +0000 | [diff] [blame] | 834 | Helper functions |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 835 | ---------------- |
| 836 | |
Georg Brandl | 10430ad | 2009-09-26 20:59:11 +0000 | [diff] [blame] | 837 | .. function:: capwords(s, sep=None) |
Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 838 | |
Ezio Melotti | a40bdda | 2009-09-26 12:33:22 +0000 | [diff] [blame] | 839 | Split the argument into words using :meth:`str.split`, capitalize each word |
| 840 | using :meth:`str.capitalize`, and join the capitalized words using |
| 841 | :meth:`str.join`. If the optional second argument *sep* is absent |
| 842 | or ``None``, runs of whitespace characters are replaced by a single space |
| 843 | and leading and trailing whitespace are removed, otherwise *sep* is used to |
| 844 | split and join the words. |
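
A brief sketch of both modes of :func:`capwords`, with arbitrary input strings:

```python
import string

# Without sep: runs of whitespace collapse to one space, ends are stripped.
default_sep = string.capwords('  hello   wide  world ')
# With sep: the string is split and re-joined on that separator only.
custom_sep = string.capwords('alpha-beta-GAMMA', sep='-')

print(default_sep)  # Hello Wide World
print(custom_sep)   # Alpha-Beta-Gamma
```

Because each word goes through :meth:`str.capitalize`, the remaining letters of a word are lowercased, as ``GAMMA`` shows.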