:mod:`string` --- Common string operations
==========================================

.. module:: string
   :synopsis: Common string operations.


.. index:: module: re

The :mod:`string` module contains a number of useful constants and
classes, as well as some deprecated legacy functions that are also
available as methods on strings. In addition, Python's built-in string
classes support the sequence type methods described in the
:ref:`typesseq` section, and also the string-specific methods described
in the :ref:`string-methods` section. To output formatted strings use
template strings or the ``%`` operator described in the
:ref:`string-formatting` section. Also, see the :mod:`re` module for
string functions based on regular expressions.


String constants
----------------

The constants defined in this module are:


.. data:: ascii_letters

   The concatenation of the :const:`ascii_lowercase` and :const:`ascii_uppercase`
   constants described below. This value is not locale-dependent.


.. data:: ascii_lowercase

   The lowercase letters ``'abcdefghijklmnopqrstuvwxyz'``. This value is not
   locale-dependent and will not change.


.. data:: ascii_uppercase

   The uppercase letters ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. This value is not
   locale-dependent and will not change.


.. data:: digits

   The string ``'0123456789'``.


.. data:: hexdigits

   The string ``'0123456789abcdefABCDEF'``.


.. data:: letters

   The concatenation of the strings :const:`lowercase` and :const:`uppercase`
   described below. The specific value is locale-dependent, and will be updated
   when :func:`locale.setlocale` is called.


.. data:: lowercase

   A string containing all the characters that are considered lowercase letters.
   On most systems this is the string ``'abcdefghijklmnopqrstuvwxyz'``. The
   specific value is locale-dependent, and will be updated when
   :func:`locale.setlocale` is called.


.. data:: octdigits

   The string ``'01234567'``.


.. data:: punctuation

   String of ASCII characters which are considered punctuation characters in the
   ``C`` locale.


.. data:: printable

   String of characters which are considered printable. This is a combination of
   :const:`digits`, :const:`letters`, :const:`punctuation`, and
   :const:`whitespace`.


.. data:: uppercase

   A string containing all the characters that are considered uppercase letters.
   On most systems this is the string ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. The
   specific value is locale-dependent, and will be updated when
   :func:`locale.setlocale` is called.


.. data:: whitespace

   A string containing all characters that are considered whitespace. On most
   systems this includes the characters space, tab, linefeed, return, formfeed, and
   vertical tab.
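
As a brief illustration of how these constants are typically used, the sketch
below tests characters for membership in the locale-independent constants. The
sample inputs (``"DEADbeef"`` and ``"beefy"``) are made up for the example::

   import string

   # ascii_letters is the two ASCII case constants joined together.
   same = string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase

   # Membership tests against a constant validate input character by character.
   all_hex = all(ch in string.hexdigits for ch in "DEADbeef")   # True
   not_hex = all(ch in string.hexdigits for ch in "beefy")      # False ('y')

   # Every octal digit is also a decimal digit.
   octal_subset = set(string.octdigits) <= set(string.digits)   # True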


.. _new-string-formatting:

String Formatting
-----------------

.. versionadded:: 2.6

The built-in :class:`str` and :class:`unicode` classes provide the ability
to do complex variable substitutions and value formatting via the
:meth:`str.format` method described in :pep:`3101`. The :class:`Formatter`
class in the :mod:`string` module allows you to create and customize your own
string formatting behaviors using the same implementation as the built-in
:meth:`format` method.

.. class:: Formatter

   The :class:`Formatter` class has the following public methods:

   .. method:: format(format_string, *args, **kwargs)

      :meth:`format` is the primary API method. It takes a format template
      string, and an arbitrary set of positional and keyword arguments.
      :meth:`format` is just a wrapper that calls :meth:`vformat`.

   .. method:: vformat(format_string, args, kwargs)

      This function does the actual work of formatting. It is exposed as a
      separate function for cases where you want to pass in a predefined
      dictionary of arguments, rather than unpacking and repacking the
      dictionary as individual arguments using the ``*args`` and ``**kwargs``
      syntax. :meth:`vformat` does the work of breaking up the format template
      string into character data and replacement fields. It calls the various
      methods described below.
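
      A small sketch of the relationship between the two methods. The
      template and the argument values are invented for illustration::

         import string

         formatter = string.Formatter()
         values = {"name": "Arthur", "quest": "the Grail"}

         # format() takes the arguments unpacked ...
         a = formatter.format("{name} seeks {quest}", **values)
         # ... while vformat() takes the tuple and the dictionary directly.
         b = formatter.vformat("{name} seeks {quest}", (), values)
         # Both calls produce 'Arthur seeks the Grail'.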

   In addition, the :class:`Formatter` defines a number of methods that are
   intended to be replaced by subclasses:

   .. method:: parse(format_string)

      Loop over the *format_string* and return an iterable of tuples
      (*literal_text*, *field_name*, *format_spec*, *conversion*). This is used
      by :meth:`vformat` to break the string into either literal text or
      replacement fields.

      The values in the tuple conceptually represent a span of literal text
      followed by a single replacement field. If there is no literal text
      (which can happen if two replacement fields occur consecutively), then
      *literal_text* will be a zero-length string. If there is no replacement
      field, then the values of *field_name*, *format_spec* and *conversion*
      will be ``None``.
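
      For instance, under the default syntax a template with two replacement
      fields and some trailing text might be split as follows (the template
      itself is an arbitrary example)::

         import string

         fields = list(string.Formatter().parse("a{0}b{name!r:>5}!"))
         # fields[0] == ('a', '0', '', None)
         # fields[1] == ('b', 'name', '>5', 'r')
         # fields[2] == ('!', None, None, None)   # literal text only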

   .. method:: get_field(field_name, args, kwargs)

      Given *field_name* as returned by :meth:`parse` (see above), convert it to
      an object to be formatted. Returns a tuple ``(obj, used_key)``. The default
      version takes strings of the form defined in :pep:`3101`, such as
      "0[name]" or "label.title". *args* and *kwargs* are as passed in to
      :meth:`vformat`. The return value *used_key* has the same meaning as the
      *key* parameter to :meth:`get_value`.

   .. method:: get_value(key, args, kwargs)

      Retrieve a given field value. The *key* argument will be either an
      integer or a string. If it is an integer, it represents the index of the
      positional argument in *args*; if it is a string, then it represents a
      named argument in *kwargs*.

      The *args* parameter is set to the list of positional arguments to
      :meth:`vformat`, and the *kwargs* parameter is set to the dictionary of
      keyword arguments.

      For compound field names, these functions are only called for the first
      component of the field name; subsequent components are handled through
      normal attribute and indexing operations.

      So for example, the field expression '0.name' would cause
      :meth:`get_value` to be called with a *key* argument of 0. The ``name``
      attribute will be looked up after :meth:`get_value` returns by calling the
      built-in :func:`getattr` function.

      If the index or keyword refers to an item that does not exist, then an
      :exc:`IndexError` or :exc:`KeyError` should be raised.

   .. method:: check_unused_args(used_args, args, kwargs)

      Implement checking for unused arguments if desired. The arguments to this
      function are the set of all argument keys that were actually referred to in
      the format string (integers for positional arguments, and strings for
      named arguments), and a reference to the *args* and *kwargs* that were
      passed to :meth:`vformat`. The set of unused args can be calculated from
      these parameters. :meth:`check_unused_args` is assumed to raise an
      exception if the check fails.

   .. method:: format_field(value, format_spec)

      :meth:`format_field` simply calls the global :func:`format` built-in. The
      method is provided so that subclasses can override it.

   .. method:: convert_field(value, conversion)

      Converts the value (returned by :meth:`get_field`) given a conversion type
      (as in the tuple returned by the :meth:`parse` method). The default
      version understands 'r' (repr) and 's' (str) conversion types.
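
As a rough sketch of how these hooks fit together, the hypothetical subclass
below overrides :meth:`get_value` so that a missing keyword argument yields a
marker string instead of raising :exc:`KeyError`. The class name and the
``missing`` attribute are invented for this example and are not part of the
module::

   import string

   class DefaultingFormatter(string.Formatter):
       """Hypothetical subclass: substitute a marker for missing keywords."""

       missing = "<missing>"

       def get_value(self, key, args, kwargs):
           if isinstance(key, int):
               # Integer keys index positional arguments; keep the default.
               return string.Formatter.get_value(self, key, args, kwargs)
           # Any other key names a keyword argument; fall back to the marker.
           return kwargs.get(key, self.missing)

   fmt = DefaultingFormatter()
   greeting = fmt.format("{salutation}, {name}!", salutation="Hello")
   # greeting == 'Hello, <missing>!'

Only the lookup of the first component of a field name is changed here.
Attribute and index lookups on the returned object still go through the
normal machinery.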


.. _formatstrings:

Format String Syntax
--------------------

The :meth:`str.format` method and the :class:`Formatter` class share the same
syntax for format strings (although in the case of :class:`Formatter`,
subclasses can define their own format string syntax).

Format strings contain "replacement fields" surrounded by curly braces ``{}``.
Anything that is not contained in braces is considered literal text, which is
copied unchanged to the output. If you need to include a brace character in the
literal text, it can be escaped by doubling: ``{{`` and ``}}``.

The grammar for a replacement field is as follows:

   .. productionlist:: sf
      replacement_field: "{" [`field_name`] ["!" `conversion`] [":" `format_spec`] "}"
      field_name: arg_name ("." `attribute_name` | "[" `element_index` "]")*
      arg_name: (`identifier` | `integer`)?
      attribute_name: `identifier`
      element_index: `integer` | `index_string`
      index_string: <any source character except "]"> +
      conversion: "r" | "s"
      format_spec: <described in the next section>

In less formal terms, the replacement field can start with a *field_name* that
specifies the object whose value is to be formatted and inserted into the output
instead of the replacement field. The *field_name* is optionally followed by a
*conversion* field, which is preceded by an exclamation point ``'!'``, and a
*format_spec*, which is preceded by a colon ``':'``. These specify a non-default
format for the replacement value.

The *field_name* itself begins with an *arg_name* that is either a number or a
keyword. If it's a number, it refers to a positional argument, and if it's a
keyword, it refers to a named keyword argument. If the numerical arg_names in a
format string are 0, 1, 2, ... in sequence, they can all be omitted (not just
some) and the numbers 0, 1, 2, ... will be automatically inserted in that order.
The *arg_name* can be followed by any number of index or attribute expressions.
An expression of the form ``'.name'`` selects the named attribute using
:func:`getattr`, while an expression of the form ``'[index]'`` does an index
lookup using :meth:`__getitem__`.

Some simple format string examples::

   "First, thou shalt count to {0}"  # References first positional argument
   "Bring me a {}"                   # Implicitly references the first positional argument
   "From {} to {}"                   # Same as "From {0} to {1}"
   "My quest is {name}"              # References keyword argument 'name'
   "Weight in tons {0.weight}"       # 'weight' attribute of first positional arg
   "Units destroyed: {players[0]}"   # First element of keyword argument 'players'.
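
The fragments above only show format strings. A runnable sketch follows; the
``Elephant`` class and all argument values are hypothetical stand-ins::

   class Elephant(object):
       weight = 6

   count = "First, thou shalt count to {0}".format(3)
   route = "From {} to {}".format("Camelot", "Swamp Castle")
   quest = "My quest is {name}".format(name="to seek the Grail")
   tons = "Weight in tons {0.weight}".format(Elephant())
   units = "Units destroyed: {players[0]}".format(players=[7, 2])
   # count == 'First, thou shalt count to 3'
   # route == 'From Camelot to Swamp Castle'
   # tons  == 'Weight in tons 6'
   # units == 'Units destroyed: 7'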

The *conversion* field causes a type coercion before formatting. Normally, the
job of formatting a value is done by the :meth:`__format__` method of the value
itself. However, in some cases it is desirable to force a type to be formatted
as a string, overriding its own definition of formatting. By converting the
value to a string before calling :meth:`__format__`, the normal formatting logic
is bypassed.

Two conversion flags are currently supported: ``'!s'`` which calls :func:`str`
on the value, and ``'!r'`` which calls :func:`repr`.

Some examples::

   "Harold's a clever {0!s}"        # Calls str() on the argument first
   "Bring out the holy {name!r}"    # Calls repr() on the argument first
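
A short sketch of the difference between the two flags, using an arbitrary
string argument::

   plain = "{0!s}".format("grail")       # 'grail'
   quoted = "{0!r}".format("grail")      # "'grail'"
   # A format_spec after the conversion applies to the converted string.
   padded = "{0!r:>9}".format("grail")   # "  'grail'"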

The *format_spec* field contains a specification of how the value should be
presented, including such details as field width, alignment, padding, decimal
precision and so on. Each value type can define its own "formatting
mini-language" or interpretation of the *format_spec*.

Most built-in types support a common formatting mini-language, which is
described in the next section.

A *format_spec* field can also include nested replacement fields within it.
These nested replacement fields can contain only a field name; conversion flags
and format specifications are not allowed. The replacement fields within the
*format_spec* are substituted before the *format_spec* string is interpreted.
This allows the formatting of a value to be dynamically specified.

For example, suppose you wanted to have a replacement field whose field width is
determined by another variable::

   "A man with two {0:{1}}".format("noses", 10)

This would first evaluate the inner replacement field, making the format string
effectively::

   "A man with two {0:10}"

Then the outer replacement field would be evaluated, producing::

   "noses     "

Which is substituted into the string, yielding::

   "A man with two noses     "

(The trailing spaces appear because we specified a field width of 10, and
because left alignment is the default for strings.)


.. _formatspec:

Format Specification Mini-Language
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

"Format specifications" are used within replacement fields contained within a
format string to define how individual values are presented (see
:ref:`formatstrings`). They can also be passed directly to the built-in
:func:`format` function. Each formattable type may define how the format
specification is to be interpreted.

Most built-in types implement the following options for format specifications,
although some of the formatting options are only supported by the numeric types.

A general convention is that an empty format string (``""``) produces
the same result as if you had called :func:`str` on the value. A
non-empty format string typically modifies the result.

The general form of a *standard format specifier* is:

   .. productionlist:: sf
      format_spec: [[`fill`]`align`][`sign`][#][0][`width`][,][.`precision`][`type`]
      fill: <a character other than '}'>
      align: "<" | ">" | "=" | "^"
      sign: "+" | "-" | " "
      width: `integer`
      precision: `integer`
      type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"

The *fill* character can be any character other than '}' (which signifies the
end of the field). The presence of a fill character is signaled by the *next*
character, which must be one of the alignment options. If the second character
of *format_spec* is not a valid alignment option, then it is assumed that both
the fill character and the alignment option are absent.

The meaning of the various alignment options is as follows:

   +---------+----------------------------------------------------------+
   | Option  | Meaning                                                  |
   +=========+==========================================================+
   | ``'<'`` | Forces the field to be left-aligned within the available |
   |         | space (this is the default for most objects).            |
   +---------+----------------------------------------------------------+
   | ``'>'`` | Forces the field to be right-aligned within the          |
   |         | available space (this is the default for numbers).       |
   +---------+----------------------------------------------------------+
   | ``'='`` | Forces the padding to be placed after the sign (if any)  |
   |         | but before the digits. This is used for printing fields  |
   |         | in the form '+000000120'. This alignment option is only  |
   |         | valid for numeric types.                                 |
   +---------+----------------------------------------------------------+
   | ``'^'`` | Forces the field to be centered within the available     |
   |         | space.                                                   |
   +---------+----------------------------------------------------------+

Note that unless a minimum field width is defined, the field width will always
be the same size as the data to fill it, so that the alignment option has no
meaning in this case.
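
The alignment options can be tried out directly with the built-in
:func:`format` function. The values below are arbitrary samples::

   left = format("abc", "<6")        # 'abc   '
   right = format("abc", ">6")       # '   abc'
   centered = format("abc", "*^9")   # '***abc***' ('*' is the fill character)
   padded = format(-42, "0=6")       # '-00042' (padding goes after the sign)
   unpadded = format("abc", "^")     # 'abc' (no width, so alignment is moot)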

The *sign* option is only valid for number types, and can be one of the
following:

   +---------+----------------------------------------------------------+
   | Option  | Meaning                                                  |
   +=========+==========================================================+
   | ``'+'`` | indicates that a sign should be used for both            |
   |         | positive as well as negative numbers.                    |
   +---------+----------------------------------------------------------+
   | ``'-'`` | indicates that a sign should be used only for negative   |
   |         | numbers (this is the default behavior).                  |
   +---------+----------------------------------------------------------+
   | space   | indicates that a leading space should be used on         |
   |         | positive numbers, and a minus sign on negative numbers.  |
   +---------+----------------------------------------------------------+
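
A quick sketch of the three sign options applied to sample integers::

   plus_pos = format(5, "+")     # '+5'
   plus_neg = format(-5, "+")    # '-5'
   minus_pos = format(5, "-")    # '5' (same as giving no sign option)
   space_pos = format(5, " ")    # ' 5'
   space_neg = format(-5, " ")   # '-5'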

The ``'#'`` option is only valid for integers, and only for binary, octal, or
hexadecimal output. If present, it specifies that the output will be prefixed
by ``'0b'``, ``'0o'``, or ``'0x'``, respectively.

The ``','`` option signals the use of a comma for a thousands separator.
For a locale-aware separator, use the ``'n'`` integer presentation type
instead.
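
For illustration, with arbitrary sample numbers::

   binary = format(5, "#b")                 # '0b101'
   octal = format(8, "#o")                  # '0o10'
   hexa = format(255, "#x")                 # '0xff'
   grouped = format(1234567, ",")           # '1,234,567'
   money = format(1234567.891, ",.2f")      # '1,234,567.89'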

*width* is a decimal integer defining the minimum field width. If not
specified, then the field width will be determined by the content.

If the *width* field is preceded by a zero (``'0'``) character, this enables
zero-padding. This is equivalent to an *alignment* type of ``'='`` and a *fill*
character of ``'0'``.

The *precision* is a decimal number indicating how many digits should be
displayed after the decimal point for a floating point value formatted with
``'f'`` and ``'F'``, or before and after the decimal point for a floating point
value formatted with ``'g'`` or ``'G'``. For non-number types the field
indicates the maximum field size, in other words, how many characters will be
used from the field content. The *precision* is not allowed for integer values.
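
The following sketch exercises *width* and *precision* on a few sample
values, including the integer case that is rejected::

   wide = format(42, "6")              # '    42'
   zero_padded = format(42, "06")      # '000042'
   fixed = format(3.14159, ".2f")      # '3.14' (digits after the point)
   general = format(3.14159, ".3g")    # '3.14' (significant digits)
   clipped = format("truncate", ".3")  # 'tru' (maximum field size)

   try:
       format(42, ".2")                # precision with an integer value
   except ValueError:
       precision_rejected = True
   else:
       precision_rejected = False
   # precision_rejected is True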

Finally, the *type* determines how the data should be presented.

The available string presentation types are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'s'`` | String format. This is the default type for strings and  |
   |         | may be omitted.                                          |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'s'``.                                     |
   +---------+----------------------------------------------------------+

The available integer presentation types are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'b'`` | Binary format. Outputs the number in base 2.             |
   +---------+----------------------------------------------------------+
   | ``'c'`` | Character. Converts the integer to the corresponding     |
   |         | Unicode character before printing.                       |
   +---------+----------------------------------------------------------+
   | ``'d'`` | Decimal Integer. Outputs the number in base 10.          |
   +---------+----------------------------------------------------------+
   | ``'o'`` | Octal format. Outputs the number in base 8.              |
   +---------+----------------------------------------------------------+
   | ``'x'`` | Hex format. Outputs the number in base 16, using         |
   |         | lowercase letters for the digits above 9.                |
   +---------+----------------------------------------------------------+
   | ``'X'`` | Hex format. Outputs the number in base 16, using         |
   |         | uppercase letters for the digits above 9.                |
   +---------+----------------------------------------------------------+
   | ``'n'`` | Number. This is the same as ``'d'``, except that it uses |
   |         | the current locale setting to insert the appropriate     |
   |         | number separator characters.                             |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'d'``.                                     |
   +---------+----------------------------------------------------------+

In addition to the above presentation types, integers can be formatted
with the floating point presentation types listed below (except
``'n'`` and None). When doing so, :func:`float` is used to convert the
integer to a floating point number before formatting.
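
A brief sketch of the integer presentation types, using small sample values::

   binary = format(10, "b")       # '1010'
   octal = format(10, "o")        # '12'
   lower_hex = format(255, "x")   # 'ff'
   upper_hex = format(255, "X")   # 'FF'
   decimal = format(10, "d")      # '10'
   char = format(65, "c")         # 'A'

   # A floating point type converts the integer with float() first.
   as_fixed = format(3, "f")      # '3.000000'
   as_exp = format(3, ".1e")      # '3.0e+00'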

The available presentation types for floating point and decimal values are:

   +---------+----------------------------------------------------------+
   | Type    | Meaning                                                  |
   +=========+==========================================================+
   | ``'e'`` | Exponent notation. Prints the number in scientific       |
   |         | notation using the letter 'e' to indicate the exponent.  |
   +---------+----------------------------------------------------------+
   | ``'E'`` | Exponent notation. Same as ``'e'`` except it uses an     |
   |         | upper case 'E' as the separator character.               |
   +---------+----------------------------------------------------------+
   | ``'f'`` | Fixed point. Displays the number as a fixed-point        |
   |         | number.                                                  |
   +---------+----------------------------------------------------------+
   | ``'F'`` | Fixed point. Same as ``'f'``.                            |
   +---------+----------------------------------------------------------+
   | ``'g'`` | General format. For a given precision ``p >= 1``,        |
   |         | this rounds the number to ``p`` significant digits and   |
   |         | then formats the result in either fixed-point format     |
   |         | or in scientific notation, depending on its magnitude.   |
   |         |                                                          |
   |         | The precise rules are as follows: suppose that the       |
   |         | result formatted with presentation type ``'e'`` and      |
   |         | precision ``p-1`` would have exponent ``exp``. Then      |
   |         | if ``-4 <= exp < p``, the number is formatted            |
   |         | with presentation type ``'f'`` and precision             |
   |         | ``p-1-exp``. Otherwise, the number is formatted          |
   |         | with presentation type ``'e'`` and precision ``p-1``.    |
   |         | In both cases insignificant trailing zeros are removed   |
   |         | from the significand, and the decimal point is also      |
   |         | removed if there are no remaining digits following it.   |
   |         |                                                          |
   |         | Positive and negative infinity, positive and negative    |
   |         | zero, and NaNs, are formatted as ``inf``, ``-inf``,      |
   |         | ``0``, ``-0`` and ``nan`` respectively, regardless of    |
   |         | the precision.                                           |
   |         |                                                          |
   |         | A precision of ``0`` is treated as equivalent to a       |
   |         | precision of ``1``.                                      |
   +---------+----------------------------------------------------------+
   | ``'G'`` | General format. Same as ``'g'`` except switches to       |
   |         | ``'E'`` if the number gets too large. The                |
   |         | representations of infinity and NaN are uppercased, too. |
   +---------+----------------------------------------------------------+
   | ``'n'`` | Number. This is the same as ``'g'``, except that it uses |
   |         | the current locale setting to insert the appropriate     |
   |         | number separator characters.                             |
   +---------+----------------------------------------------------------+
   | ``'%'`` | Percentage. Multiplies the number by 100 and displays    |
   |         | in fixed (``'f'``) format, followed by a percent sign.   |
   +---------+----------------------------------------------------------+
   | None    | The same as ``'g'``.                                     |
   +---------+----------------------------------------------------------+
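
The sketch below applies several of these types to sample values. The
``'g'`` lines show the switch between fixed-point and scientific notation
described in the table, assuming the default precision of 6 that applies
when none is given::

   sci = format(1234.5678, ".2e")      # '1.23e+03'
   sci_upper = format(1234.5678, ".2E")  # '1.23E+03'
   fixed = format(1234.5678, ".1f")    # '1234.6'
   percent = format(0.25, ".0%")       # '25%'

   # exp = -4 is still in range for fixed-point output ...
   small_fixed = format(0.0001, "g")   # '0.0001'
   # ... while exp = -5 switches to scientific notation.
   small_sci = format(0.00001, "g")    # '1e-05'
   # exp = 5 is below the precision of 6, exp = 6 is not.
   big_fixed = format(123456.0, "g")   # '123456'
   big_sci = format(1234567.0, "g")    # '1.23457e+06'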


Template strings
----------------

.. versionadded:: 2.4

Templates provide simpler string substitutions as described in :pep:`292`.
Instead of the normal ``%``\ -based substitutions, Templates support
``$``\ -based substitutions, using the following rules:

* ``$$`` is an escape; it is replaced with a single ``$``.

* ``$identifier`` names a substitution placeholder matching a mapping key of
  ``"identifier"``. By default, ``"identifier"`` must spell a Python
  identifier. The first non-identifier character after the ``$`` character
  terminates this placeholder specification.

* ``${identifier}`` is equivalent to ``$identifier``. It is required when valid
  identifier characters follow the placeholder but are not part of the
  placeholder, such as ``"${noun}ification"``.

Any other appearance of ``$`` in the string will result in a :exc:`ValueError`
being raised.
| 529 | |
| 530 | The :mod:`string` module provides a :class:`Template` class that implements |
| 531 | these rules. The methods of :class:`Template` are: |
| 532 | |
| 533 | |
| 534 | .. class:: Template(template) |
| 535 | |
| 536 | The constructor takes a single argument which is the template string. |
| 537 | |
| 538 | |
| 539 | .. method:: substitute(mapping[, **kws]) |
| 540 | |
| 541 | Performs the template substitution, returning a new string. *mapping* is |
| 542 | any dictionary-like object with keys that match the placeholders in the |
| 543 | template. Alternatively, you can provide keyword arguments, where the |
| 544 | keywords are the placeholders. When both *mapping* and *kws* are given |
| 545 | and there are duplicates, the placeholders from *kws* take precedence. |
| 546 | |
| 547 | |
| 548 | .. method:: safe_substitute(mapping[, **kws]) |
| 549 | |
| 550 | Like :meth:`substitute`, except that if placeholders are missing from |
| 551 | *mapping* and *kws*, instead of raising a :exc:`KeyError` exception, the |
| 552 | original placeholder will appear in the resulting string intact. Also, |
| 553 | unlike with :meth:`substitute`, any other appearance of ``$`` is simply |
| 554 | left as ``$`` instead of raising :exc:`ValueError`. |
| 555 | |
| 556 | While other exceptions may still occur, this method is called "safe" |
| 557 | because it always tries to return a usable string instead of |
| 558 | raising an exception. In another sense, :meth:`safe_substitute` may be |
| 559 | anything other than safe, since it will silently ignore malformed |
| 560 | templates containing dangling delimiters, unmatched braces, or |
| 561 | placeholders that are not valid Python identifiers. |
| 562 | |
| 563 | :class:`Template` instances also provide one public data attribute: |
| 564 | |
| 565 | .. attribute:: template |
| 566 | |
| 567 | This is the object passed to the constructor's *template* argument. In |
| 568 | general, you shouldn't change it, but read-only access is not enforced. |
| 569 | |
| 570 | Here is an example of how to use a Template: |
| 571 | |
| 572 | >>> from string import Template |
| 573 | >>> s = Template('$who likes $what') |
| 574 | >>> s.substitute(who='tim', what='kung pao') |
| 575 | 'tim likes kung pao' |
| 576 | >>> d = dict(who='tim') |
| 577 | >>> Template('Give $who $100').substitute(d) |
| 578 | Traceback (most recent call last): |
| 579 | [...] |
| 580 | ValueError: Invalid placeholder in string: line 1, col 10 |
| 581 | >>> Template('$who likes $what').substitute(d) |
| 582 | Traceback (most recent call last): |
| 583 | [...] |
| 584 | KeyError: 'what' |
| 585 | >>> Template('$who likes $what').safe_substitute(d) |
| 586 | 'tim likes $what' |
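A few further cases may help tie the substitution rules to the two methods. This is a minimal sketch; the template strings and variable names are made up for illustration.

```python
from string import Template

# '$$' is the escape for a literal dollar sign.
escaped = Template('Cost: $$5').substitute()

# Braces separate the placeholder from identifier characters that follow it.
braced = Template('${noun}ification').substitute(noun='pet')

# When a key appears in both the mapping and the keywords, the keyword wins.
precedence = Template('$who').substitute({'who': 'mapping'}, who='keyword')

# safe_substitute() leaves the dangling '$100' alone instead of raising.
lenient = Template('Give $who $100').safe_substitute(who='tim')
```

The results are ``'Cost: $5'``, ``'petification'``, ``'keyword'`` and ``'Give tim $100'`` respectively.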
| 587 | |
| 588 | Advanced usage: you can derive subclasses of :class:`Template` to customize the |
| 589 | placeholder syntax, delimiter character, or the entire regular expression used |
| 590 | to parse template strings. To do this, you can override these class attributes: |
| 591 | |
| 592 | * *delimiter* -- This is the literal string describing a placeholder-introducing |
| 593 | delimiter. The default value is ``$``. Note that this should *not* be a regular |
| 594 | expression, as the implementation will call :func:`re.escape` on this string as |
| 595 | needed. |
| 596 | |
| 597 | * *idpattern* -- This is the regular expression describing the pattern for |
| 598 | non-braced placeholders (the braces will be added automatically as |
| 599 | appropriate). The default value is the regular expression |
| 600 | ``[_a-z][_a-z0-9]*``. |
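Overriding these two class attributes might look like the following sketch. Both subclass names are hypothetical, and the dotted-name pattern is only one possible choice of *idpattern*.

```python
from string import Template

class PercentTemplate(Template):
    # Hypothetical subclass: '%' instead of '$' introduces placeholders.
    delimiter = '%'

class DottedTemplate(Template):
    # Hypothetical subclass: also allow dots inside placeholder names.
    idpattern = r'[_a-z][_a-z0-9.]*'

# With '%' as the delimiter, '$' is an ordinary character.
percent_result = PercentTemplate('%who owes $5').substitute(who='tim')

# The placeholder name 'user.name' is looked up as a single mapping key.
dotted_result = DottedTemplate('$user.name!').substitute({'user.name': 'tim'})
```

Note that with the dotted pattern a sentence-ending period directly after a placeholder would be read as part of the name, so such a pattern needs some care in practice.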
| 601 | |
| 602 | Alternatively, you can provide the entire regular expression pattern by |
| 603 | overriding the class attribute *pattern*. If you do this, the value must be a |
| 604 | regular expression object with four named capturing groups. The capturing |
| 605 | groups correspond to the rules given above, along with the invalid placeholder |
| 606 | rule: |
| 607 | |
| 608 | * *escaped* -- This group matches the escape sequence, e.g. ``$$``, in the |
| 609 | default pattern. |
| 610 | |
| 611 | * *named* -- This group matches the unbraced placeholder name; it should not |
| 612 | include the delimiter in the capturing group. |
| 613 | |
| 614 | * *braced* -- This group matches the brace-enclosed placeholder name; it should |
| 615 | not include either the delimiter or braces in the capturing group. |
| 616 | |
| 617 | * *invalid* -- This group matches any other delimiter pattern (usually a single |
| 618 | delimiter), and it should appear last in the regular expression. |
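A complete *pattern* override with the four named groups could be sketched as below. The class name is hypothetical. The sketch supplies the pattern as a string and relies on CPython compiling it in verbose, case-insensitive mode; that detail is an assumption about the implementation rather than something stated above.

```python
from string import Template

class AtTemplate(Template):
    # Hypothetical subclass using '@' as the delimiter.
    delimiter = '@'
    pattern = r"""
    (?P<escaped>@@)                    |  # escape sequence
    @(?P<named>[_a-z][_a-z0-9]*)       |  # unbraced placeholder
    @\{(?P<braced>[_a-z][_a-z0-9]*)\}  |  # braced placeholder
    (?P<invalid>@)                        # any other use of the delimiter
    """

at_result = AtTemplate('@who paid @@5 for @{what}s').substitute(
    who='tim', what='ticket')
```

*delimiter* is set as well because the escape rule replaces the escaped sequence with the delimiter string; ``at_result`` is ``'tim paid @5 for tickets'``.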
| 619 | |
| 620 | |
| 621 | String functions |
| 622 | ---------------- |
| 623 | |
| 624 | The following functions are available to operate on string and Unicode objects. |
| 625 | They are not available as string methods. |
| 626 | |
| 627 | |
| 628 | .. function:: capwords(s[, sep]) |
| 629 | |
| 630 | Split the argument into words using :meth:`str.split`, capitalize each word |
| 631 | using :meth:`str.capitalize`, and join the capitalized words using |
| 632 | :meth:`str.join`. If the optional second argument *sep* is absent |
| 633 | or ``None``, runs of whitespace characters are replaced by a single space |
| 634 | and leading and trailing whitespace are removed, otherwise *sep* is used to |
| 635 | split and join the words. |
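The two modes of :func:`capwords` can be sketched as follows, with arbitrary sample strings.

```python
import string

# No sep: runs of whitespace collapse and the ends are trimmed.
collapsed = string.capwords('  hello   wORLD ')

# Explicit sep: the same separator is used to split and to rejoin.
dashed = string.capwords('one-two-three', '-')
```

This gives ``'Hello World'`` and ``'One-Two-Three'``.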
| 636 | |
| 637 | |
| 638 | .. function:: maketrans(from, to) |
| 639 | |
| 640 | Return a translation table suitable for passing to :func:`translate`, that will |
| 641 | map each character in *from* into the character at the same position in *to*; |
| 642 | *from* and *to* must have the same length. |
| 643 | |
| 644 | .. note:: |
| 645 | |
| 646 | Don't use strings derived from :const:`lowercase` and :const:`uppercase` as |
| 647 | arguments; in some locales, these don't have the same length. For case |
| 648 | conversions, always use :meth:`str.lower` and :meth:`str.upper`. |
| 649 | |
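A small sketch of building and applying a translation table follows. The fallback to ``str.maketrans`` is an assumption added so the snippet also runs on interpreters where this module-level function is not available; it is not part of the API described here.

```python
try:
    from string import maketrans   # the function documented above
except ImportError:
    maketrans = str.maketrans      # assumed stand-in where it is absent

# 'a' -> 'x', 'b' -> 'y', 'c' -> 'z'; both arguments have the same length.
table = maketrans('abc', 'xyz')
translated = 'aabbcc'.translate(table)
```

In either case ``translated`` is ``'xxyyzz'``.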
| 650 | |
| 651 | Deprecated string functions |
| 652 | --------------------------- |
| 653 | |
| 654 | The following functions are also defined as methods of string and |
| 655 | Unicode objects; see section :ref:`string-methods` for more information on |
| 656 | those. You should consider these functions as deprecated, although they will |
| 657 | not be removed until Python 3.0. The functions defined in this module are: |
| 658 | |
| 659 | |
| 660 | .. function:: atof(s) |
| 661 | |
| 662 | .. deprecated:: 2.0 |
| 663 | Use the :func:`float` built-in function. |
| 664 | |
| 665 | .. index:: builtin: float |
| 666 | |
| 667 | Convert a string to a floating point number. The string must have the standard |
| 668 | syntax for a floating point literal in Python, optionally preceded by a sign |
| 669 | (``+`` or ``-``). Note that this behaves identically to the built-in function |
| 670 | :func:`float` when passed a string. |
| 671 | |
| 672 | .. note:: |
| 673 | |
| 674 | .. index:: |
| 675 | single: NaN |
| 676 | single: Infinity |
| 677 | |
| 678 | When passing in a string, values for NaN and Infinity may be returned, depending |
| 679 | on the underlying C library. The specific set of strings accepted which cause |
| 680 | these values to be returned depends entirely on the C library and is known to |
| 681 | vary. |
| 682 | |
| 683 | |
| 684 | .. function:: atoi(s[, base]) |
| 685 | |
| 686 | .. deprecated:: 2.0 |
| 687 | Use the :func:`int` built-in function. |
| 688 | |
| 689 | .. index:: builtin: eval |
| 690 | |
| 691 | Convert string *s* to an integer in the given *base*. The string must consist |
| 692 | of one or more digits, optionally preceded by a sign (``+`` or ``-``). The |
| 693 | *base* defaults to 10. If it is 0, a default base is chosen depending on the |
| 694 | leading characters of the string (after stripping the sign): ``0x`` or ``0X`` |
| 695 | means 16, ``0`` means 8, anything else means 10. If *base* is 16, a leading |
| 696 | ``0x`` or ``0X`` is always accepted, though not required. This behaves |
| 697 | identically to the built-in function :func:`int` when passed a string. (Also |
| 698 | note: for a more flexible interpretation of numeric literals, use the built-in |
| 699 | function :func:`eval`.) |
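Because the text describes :func:`atof` and :func:`atoi` as behaving like the built-ins when passed a string, the sketch below uses :func:`float` and :func:`int` as stand-ins. The sample strings are arbitrary.

```python
# Built-ins used as stand-ins for atof() and atoi(), per the text above.
as_float = float('+1.5e3')       # optional sign, floating point syntax
base_ten = int('-42')            # base defaults to 10
hex_plain = int('ff', 16)        # the prefix is not required for base 16
hex_prefixed = int('0xff', 16)   # but a leading 0x is accepted
guessed = int('0x1f', 0)         # base 0: the 0x prefix selects base 16
```

The values are ``1500.0``, ``-42``, ``255``, ``255`` and ``31``.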
| 700 | |
| 701 | |
| 702 | .. function:: atol(s[, base]) |
| 703 | |
| 704 | .. deprecated:: 2.0 |
| 705 | Use the :func:`long` built-in function. |
| 706 | |
| 707 | .. index:: builtin: long |
| 708 | |
| 709 | Convert string *s* to a long integer in the given *base*. The string must |
| 710 | consist of one or more digits, optionally preceded by a sign (``+`` or ``-``). |
| 711 | The *base* argument has the same meaning as for :func:`atoi`. A trailing ``l`` |
| 712 | or ``L`` is not allowed, except if the base is 0. Note that when invoked |
| 713 | without *base* or with *base* set to 10, this behaves identically to the built-in |
| 714 | function :func:`long` when passed a string. |
| 715 | |
| 716 | |
| 717 | .. function:: capitalize(word) |
| 718 | |
| 719 | Return a copy of *word* with only its first character capitalized. |
| 720 | |
| 721 | |
| 722 | .. function:: expandtabs(s[, tabsize]) |
| 723 | |
| 724 | Expand tabs in a string replacing them by one or more spaces, depending on the |
| 725 | current column and the given tab size. The column number is reset to zero after |
| 726 | each newline occurring in the string. This doesn't understand other non-printing |
| 727 | characters or escape sequences. The tab size defaults to 8. |
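The column arithmetic can be seen in a short sketch. It uses the equivalent string method, and the tab size of 4 is an arbitrary choice.

```python
# 'a' ends at column 1, so the first tab adds 3 spaces to reach column 4;
# 'bc' ends at column 6, so the second tab adds 2 spaces to reach column 8.
single_line = 'a\tbc\td'.expandtabs(4)

# The column counter restarts after the newline, so this tab adds 4 spaces.
two_lines = 'ab\n\tc'.expandtabs(4)
```

The results are ``'a   bc  d'`` and ``'ab\n    c'``.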
| 728 | |
| 729 | |
| 730 | .. function:: find(s, sub[, start[, end]]) |
| 731 | |
| 732 | Return the lowest index in *s* where the substring *sub* is found such that |
| 733 | *sub* is wholly contained in ``s[start:end]``. Return ``-1`` on failure. |
| 734 | Defaults for *start* and *end* and interpretation of negative values are the same |
| 735 | as for slices. |
| 736 | |
| 737 | |
| 738 | .. function:: rfind(s, sub[, start[, end]]) |
| 739 | |
| 740 | Like :func:`find` but find the highest index. |
| 741 | |
| 742 | |
| 743 | .. function:: index(s, sub[, start[, end]]) |
| 744 | |
| 745 | Like :func:`find` but raise :exc:`ValueError` when the substring is not found. |
| 746 | |
| 747 | |
| 748 | .. function:: rindex(s, sub[, start[, end]]) |
| 749 | |
| 750 | Like :func:`rfind` but raise :exc:`ValueError` when the substring is not found. |
| 751 | |
| 752 | |
| 753 | .. function:: count(s, sub[, start[, end]]) |
| 754 | |
| 755 | Return the number of (non-overlapping) occurrences of substring *sub* in string |
| 756 | ``s[start:end]``. Defaults for *start* and *end* and interpretation of negative |
| 757 | values are the same as for slices. |
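The search functions above differ mainly in direction and in how failure is reported. A minimal sketch, written with the equivalent string methods and an arbitrary sample string:

```python
s = 'banana'
first = s.find('an')              # lowest matching index
last = s.rfind('an')              # highest matching index
missing = s.find('xy')            # -1 signals failure
in_slice = s.find('an', 2)        # search restricted to s[2:]
occurrences = 'aaaa'.count('aa')  # non-overlapping matches only

# index() reports failure with an exception instead of -1.
try:
    s.index('xy')
    raised = False
except ValueError:
    raised = True
```

Here ``first`` is ``1``, ``last`` and ``in_slice`` are ``3``, ``missing`` is ``-1``, ``occurrences`` is ``2`` and ``raised`` is true.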
| 758 | |
| 759 | |
| 760 | .. function:: lower(s) |
| 761 | |
| 762 | Return a copy of *s*, but with upper case letters converted to lower case. |
| 763 | |
| 764 | |
| 765 | .. function:: split(s[, sep[, maxsplit]]) |
| 766 | |
| 767 | Return a list of the words of the string *s*. If the optional second argument |
| 768 | *sep* is absent or ``None``, the words are separated by arbitrary strings of |
| 769 | whitespace characters (space, tab, newline, return, formfeed). If the second |
| 770 | argument *sep* is present and not ``None``, it specifies a string to be used as |
| 771 | the word separator. The returned list will then have one more item than the |
| 772 | number of non-overlapping occurrences of the separator in the string. The |
| 773 | optional third argument *maxsplit* defaults to 0. If it is nonzero, at most |
| 774 | *maxsplit* number of splits occur, and the remainder of the string is returned |
| 775 | as the final element of the list (thus, the list will have at most |
| 776 | ``maxsplit+1`` elements). |
| 777 | |
| 778 | The behavior of split on an empty string depends on the value of *sep*. If *sep* |
| 779 | is not specified, or specified as ``None``, the result will be an empty list. |
| 780 | If *sep* is specified as any string, the result will be a list containing one |
| 781 | element which is an empty string. |
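The cases described above can be sketched with the equivalent string method; the sample strings are arbitrary and *maxsplit* is always passed explicitly.

```python
words = 'one  two\tthree'.split()     # any run of whitespace separates words
fields = 'a,b,,c'.split(',')          # explicit separator keeps empty fields
limited = 'a b c d'.split(None, 2)    # at most 2 splits, remainder kept whole
empty_default = ''.split()            # empty string, no separator
empty_with_sep = ''.split(',')        # empty string, explicit separator
```

These evaluate to ``['one', 'two', 'three']``, ``['a', 'b', '', 'c']``, ``['a', 'b', 'c d']``, ``[]`` and ``['']``.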
| 782 | |
| 783 | |
| 784 | .. function:: rsplit(s[, sep[, maxsplit]]) |
| 785 | |
| 786 | Return a list of the words of the string *s*, scanning *s* from the end. To all |
| 787 | intents and purposes, the resulting list of words is the same as returned by |
| 788 | :func:`split`, except when the optional third argument *maxsplit* is explicitly |
| 789 | specified and nonzero. When *maxsplit* is nonzero, at most *maxsplit* number of |
| 790 | splits -- the *rightmost* ones -- occur, and the remainder of the string is |
| 791 | returned as the first element of the list (thus, the list will have at most |
| 792 | ``maxsplit+1`` elements). |
| 793 | |
| 794 | .. versionadded:: 2.4 |
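The difference from :func:`split` only shows once *maxsplit* limits the number of splits, as in this sketch using the equivalent string methods:

```python
# Same input and limit; only the scanning direction differs.
from_left = 'a b c d'.split(None, 2)
from_right = 'a b c d'.rsplit(None, 2)
```

``from_left`` is ``['a', 'b', 'c d']``, while ``from_right`` is ``['a b', 'c', 'd']``.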
| 795 | |
| 796 | |
| 797 | .. function:: splitfields(s[, sep[, maxsplit]]) |
| 798 | |
| 799 | This function behaves identically to :func:`split`. (In the past, :func:`split` |
| 800 | was only used with one argument, while :func:`splitfields` was only used with |
| 801 | two arguments.) |
| 802 | |
| 803 | |
| 804 | .. function:: join(words[, sep]) |
| 805 | |
| 806 | Concatenate a list or tuple of words with intervening occurrences of *sep*. |
| 807 | The default value for *sep* is a single space character. It is always true that |
| 808 | ``string.join(string.split(s, sep), sep)`` equals *s*. |
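The round trip mentioned above can be checked with a small sketch. It uses the string methods as stand-ins and an explicit separator.

```python
# Joining with a single space, the default separator of the function above.
joined = ' '.join(['spam', 'eggs'])

# Splitting and rejoining on the same explicit separator restores the input.
s = 'a,b,,c'
round_trip = ','.join(s.split(','))
```

``joined`` is ``'spam eggs'`` and ``round_trip`` equals ``s``.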
| 809 | |
| 810 | |
| 811 | .. function:: joinfields(words[, sep]) |
| 812 | |
| 813 | This function behaves identically to :func:`join`. (In the past, :func:`join` |
| 814 | was only used with one argument, while :func:`joinfields` was only used with two |
| 815 | arguments.) Note that there is no :meth:`joinfields` method on string objects; |
| 816 | use the :meth:`join` method instead. |
| 817 | |
| 818 | |
| 819 | .. function:: lstrip(s[, chars]) |
| 820 | |
| 821 | Return a copy of the string with leading characters removed. If *chars* is |
| 822 | omitted or ``None``, whitespace characters are removed. If given and not |
| 823 | ``None``, *chars* must be a string; the characters in the string will be |
| 824 | stripped from the beginning of *s*. |
| 825 | |
| 826 | .. versionchanged:: 2.2.3 |
| 827 | The *chars* parameter was added. The *chars* parameter cannot be passed in |
| 828 | earlier 2.2 versions. |
| 829 | |
| 830 | |
| 831 | .. function:: rstrip(s[, chars]) |
| 832 | |
| 833 | Return a copy of the string with trailing characters removed. If *chars* is |
| 834 | omitted or ``None``, whitespace characters are removed. If given and not |
| 835 | ``None``, *chars* must be a string; the characters in the string will be |
| 836 | stripped from the end of *s*. |
| 837 | |
| 838 | .. versionchanged:: 2.2.3 |
| 839 | The *chars* parameter was added. The *chars* parameter cannot be passed in |
| 840 | earlier 2.2 versions. |
| 841 | |
| 842 | |
| 843 | .. function:: strip(s[, chars]) |
| 844 | |
| 845 | Return a copy of the string with leading and trailing characters removed. If |
| 846 | *chars* is omitted or ``None``, whitespace characters are removed. If given and |
| 847 | not ``None``, *chars* must be a string; the characters in the string will be |
| 848 | stripped from both ends of *s*. |
| 849 | |
| 850 | .. versionchanged:: 2.2.3 |
| 851 | The *chars* parameter was added. The *chars* parameter cannot be passed in |
| 852 | earlier 2.2 versions. |
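A short sketch of the three stripping functions, written with the equivalent string methods. It also shows that *chars* acts as a set of characters to remove rather than as a prefix or suffix.

```python
left = '  hi  '.lstrip()     # leading whitespace only
right = '  hi  '.rstrip()    # trailing whitespace only

# Any of 'w', '.', 'm', 'o', 'c' is removed from both ends until a
# character outside that set is reached.
both = 'www.example.com'.strip('w.moc')
```

The results are ``'hi  '``, ``'  hi'`` and ``'example'``.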
| 853 | |
| 854 | |
| 855 | .. function:: swapcase(s) |
| 856 | |
| 857 | Return a copy of *s*, but with lower case letters converted to upper case and |
| 858 | vice versa. |
| 859 | |
| 860 | |
| 861 | .. function:: translate(s, table[, deletechars]) |
| 862 | |
| 863 | Delete all characters from *s* that are in *deletechars* (if present), and then |
| 864 | translate the characters using *table*, which must be a 256-character string |
| 865 | giving the translation for each character value, indexed by its ordinal. If |
| 866 | *table* is ``None``, then only the character deletion step is performed. |
| 867 | |
| 868 | |
| 869 | .. function:: upper(s) |
| 870 | |
| 871 | Return a copy of *s*, but with lower case letters converted to upper case. |
| 872 | |
| 873 | |
| 874 | .. function:: ljust(s, width[, fillchar]) |
| 875 | rjust(s, width[, fillchar]) |
| 876 | center(s, width[, fillchar]) |
| 877 | |
| 878 | These functions respectively left-justify, right-justify and center a string in |
| 879 | a field of given width. They return a string that is at least *width* |
| 880 | characters wide, created by padding the string *s* with the character *fillchar* |
| 881 | (default is a space) until the given width on the right, left or both sides. |
| 882 | The string is never truncated. |
| 883 | |
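The padding behaviour might be sketched as follows, using the equivalent string methods and arbitrary fill characters.

```python
left = 'ab'.ljust(5, '.')        # padded on the right
right = 'ab'.rjust(5, '.')       # padded on the left
centered = 'ab'.center(6, '*')   # padded on both sides
untouched = 'abcdef'.ljust(3)    # already wider than 3: never truncated
```

These give ``'ab...'``, ``'...ab'``, ``'**ab**'`` and ``'abcdef'``.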
| 884 | |
| 885 | .. function:: zfill(s, width) |
| 886 | |
| 887 | Pad a numeric string on the left with zero digits until the given width is |
| 888 | reached. Strings starting with a sign are handled correctly. |
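For instance, using the equivalent string method, the sign stays in front of the padding:

```python
plain = '42'.zfill(5)     # zeros are added on the left
signed = '-42'.zfill(5)   # zeros go between the sign and the digits
```

``plain`` is ``'00042'`` and ``signed`` is ``'-0042'``.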
| 889 | |
| 890 | |
| 891 | .. function:: replace(str, old, new[, maxreplace]) |
| 892 | |
| 893 | Return a copy of string *str* with all occurrences of substring *old* replaced |
| 894 | by *new*. If the optional argument *maxreplace* is given, the first |
| 895 | *maxreplace* occurrences are replaced. |
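The effect of *maxreplace* can be sketched with the equivalent string method:

```python
all_replaced = 'a-b-c-d'.replace('-', '+')      # every occurrence
first_two = 'a-b-c-d'.replace('-', '+', 2)      # only the first two
```

This yields ``'a+b+c+d'`` and ``'a+b+c-d'``.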
| 896 | |