\section{\module{string} ---
         Common string operations}

\declaremodule{standard}{string}
\modulesynopsis{Common string operations.}

The \module{string} module contains a number of useful constants and classes,
as well as some deprecated legacy functions that are also available as methods
on strings. See the module \refmodule{re}\refstmodindex{re} for string
functions based on regular expressions.

\subsection{String constants}

The constants defined in this module are:

\begin{datadesc}{ascii_letters}
  The concatenation of the \constant{ascii_lowercase} and
  \constant{ascii_uppercase} constants described below. This value is
  not locale-dependent.
\end{datadesc}

\begin{datadesc}{ascii_lowercase}
  The lowercase letters \code{'abcdefghijklmnopqrstuvwxyz'}. This
  value is not locale-dependent and will not change.
\end{datadesc}

\begin{datadesc}{ascii_uppercase}
  The uppercase letters \code{'ABCDEFGHIJKLMNOPQRSTUVWXYZ'}. This
  value is not locale-dependent and will not change.
\end{datadesc}

\begin{datadesc}{digits}
  The string \code{'0123456789'}.
\end{datadesc}

\begin{datadesc}{hexdigits}
  The string \code{'0123456789abcdefABCDEF'}.
\end{datadesc}
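
As a rough illustration of how the fixed \ASCII{} constants above might be
used, the sketch below validates a hexadecimal string by membership testing.
The helper name \code{is_hex} is hypothetical and not part of the module.

```python
import string

# ascii_letters is documented as the two case-specific constants joined.
letters_match = (
    string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase
)

def is_hex(text):
    """Return True if text is non-empty and uses only hexdigits characters.

    Hypothetical helper for illustration; not provided by the string module.
    """
    return bool(text) and all(ch in string.hexdigits for ch in text)

print(letters_match)        # True
print(is_hex("DeadBeef"))   # True
print(is_hex("0x1f"))       # False: 'x' is not in hexdigits
```

Because these constants are not locale-dependent, a check written this way
should give the same answer regardless of the current locale.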

\begin{datadesc}{letters}
  The concatenation of the strings \constant{lowercase} and
  \constant{uppercase} described below. The specific value is
  locale-dependent, and will be updated when
  \function{locale.setlocale()} is called.
\end{datadesc}

\begin{datadesc}{lowercase}
  A string containing all the characters that are considered lowercase
  letters. On most systems this is the string
  \code{'abcdefghijklmnopqrstuvwxyz'}. Do not change its definition:
  the effect on the routines \function{upper()} and
  \function{swapcase()} is undefined. The specific value is
  locale-dependent, and will be updated when
  \function{locale.setlocale()} is called.
\end{datadesc}

\begin{datadesc}{octdigits}
  The string \code{'01234567'}.
\end{datadesc}

\begin{datadesc}{punctuation}
  String of \ASCII{} characters which are considered punctuation
  characters in the \samp{C} locale.
\end{datadesc}

\begin{datadesc}{printable}
  String of characters which are considered printable. This is a
  combination of \constant{digits}, \constant{letters},
  \constant{punctuation}, and \constant{whitespace}.
\end{datadesc}

\begin{datadesc}{uppercase}
  A string containing all the characters that are considered uppercase
  letters. On most systems this is the string
  \code{'ABCDEFGHIJKLMNOPQRSTUVWXYZ'}. Do not change its definition:
  the effect on the routines \function{lower()} and
  \function{swapcase()} is undefined. The specific value is
  locale-dependent, and will be updated when
  \function{locale.setlocale()} is called.
\end{datadesc}

\begin{datadesc}{whitespace}
  A string containing all characters that are considered whitespace.
  On most systems this includes the characters space, tab, linefeed,
  return, formfeed, and vertical tab. Do not change its definition:
  the effect on the routines \function{strip()} and \function{split()}
  is undefined.
\end{datadesc}

\subsection{Template strings}

Templates are Unicode strings that can be used to provide string substitutions
as described in \pep{292}. There is a \class{Template} class that is a
subclass of \class{unicode}, overriding the default \method{__mod__()} method.
Instead of the normal \samp{\%}-based substitutions, Template strings support
\samp{\$}-based substitutions, using the following rules:

\begin{itemize}
\item \samp{\$\$} is an escape; it is replaced with a single \samp{\$}.

\item \samp{\$identifier} names a substitution placeholder matching a mapping
      key of ``identifier''. By default, ``identifier'' must spell a Python
      identifier. The first non-identifier character after the \samp{\$}
      character terminates this placeholder specification.

\item \samp{\$\{identifier\}} is equivalent to \samp{\$identifier}. It is
      required when valid identifier characters follow the placeholder but are
      not part of the placeholder, e.g. \samp{\$\{noun\}ification}.
\end{itemize}

Any other appearance of \samp{\$} in the string will result in a
\exception{ValueError} being raised.

Template strings are used just like normal strings, in that the modulus
operator is used to interpolate a dictionary of values into a Template string,
e.g.:

\begin{verbatim}
>>> from string import Template
>>> s = Template('$who likes $what')
>>> print s % dict(who='tim', what='kung pao')
tim likes kung pao
>>> Template('Give $who $100') % dict(who='tim')
Traceback (most recent call last):
[...]
ValueError: Invalid placeholder at index 10
\end{verbatim}

There is also a \class{SafeTemplate} class, derived from \class{Template},
which acts the same as \class{Template}, except that if placeholders are
missing in the interpolation dictionary, no \exception{KeyError} will be
raised. Instead the original placeholder (with or without the braces, as
appropriate) will be used:

\begin{verbatim}
>>> from string import SafeTemplate
>>> s = SafeTemplate('$who likes $what for ${meal}')
>>> print s % dict(who='tim')
tim likes $what for ${meal}
\end{verbatim}

The values in the mapping will automatically be converted to Unicode strings,
using the built-in \function{unicode()} function, which will be called without
optional arguments \var{encoding} or \var{errors}.

Advanced usage: you can derive subclasses of \class{Template} or
\class{SafeTemplate} to use application-specific placeholder rules. To do
this, you override the class attribute \member{pattern}; the value must be a
compiled regular expression object with four named capturing groups. The
capturing groups correspond to the rules given above, along with the invalid
placeholder rule:

\begin{itemize}
\item \var{escaped} -- This group matches the escape sequence, i.e. \samp{\$\$}
      in the default pattern.
\item \var{named} -- This group matches the unbraced placeholder name; it
      should not include the \samp{\$} in the capturing group.
\item \var{braced} -- This group matches the brace-delimited placeholder name;
      it should not include either the \samp{\$} or braces in the capturing
      group.
\item \var{bogus} -- This group matches any other \samp{\$}. It usually just
      matches a single \samp{\$} and should appear last.
\end{itemize}
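
The four-group layout can be sketched with the \module{re} module alone. The
pattern below is only an illustrative guess at what a default-style pattern
could look like; it is not taken from the module source, and the names
\code{pattern} and \code{classify} are hypothetical.

```python
import re

# Hypothetical pattern with the four named groups described above.
# Alternatives are tried left to right, so 'bogus' is listed last.
pattern = re.compile(r"""
    (?P<escaped>\${2})                        |  # the escape sequence
    \$(?P<named>[_a-zA-Z][_a-zA-Z0-9]*)       |  # unbraced name, dollar excluded
    \$\{(?P<braced>[_a-zA-Z][_a-zA-Z0-9]*)\}  |  # braced name, dollar and braces excluded
    (?P<bogus>\$)                                # any other dollar sign
""", re.VERBOSE)

def classify(text):
    """Return (group name, captured text) for each match, in order."""
    return [(m.lastgroup, m.group(m.lastgroup)) for m in pattern.finditer(text)]

print(classify("$$ $who ${noun}ification $100"))
# [('escaped', '$$'), ('named', 'who'), ('braced', 'noun'), ('bogus', '$')]
```

Listing the catch-all group last matters because a regular expression
alternation takes the first alternative that matches at a given position.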

\subsection{String functions}

The following functions are available to operate on string and Unicode
objects. They are not available as string methods.

\begin{funcdesc}{capwords}{s}
  Split the argument into words using \function{split()}, capitalize
  each word using \function{capitalize()}, and join the capitalized
  words using \function{join()}. Note that this replaces runs of
  whitespace characters by a single space, and removes leading and
  trailing whitespace.
\end{funcdesc}
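
The three steps that \function{capwords()} performs can be written out by
hand, as in the minimal sketch below. The name \code{capwords_sketch} is
hypothetical; it mirrors the description rather than reproducing the module
source.

```python
import string

def capwords_sketch(s):
    """Split on whitespace, capitalize each word, join with single spaces."""
    return " ".join(word.capitalize() for word in s.split())

text = "  the   quick\tbrown fox "
result = string.capwords(text)

print(repr(result))                     # 'The Quick Brown Fox'
print(capwords_sketch(text) == result)  # True
```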

\begin{funcdesc}{maketrans}{from, to}
  Return a translation table suitable for passing to
  \function{translate()} or \function{regex.compile()}, that will map
  each character in \var{from} into the character at the same position
  in \var{to}; \var{from} and \var{to} must have the same length.

  \warning{Don't use strings derived from \constant{lowercase}
  and \constant{uppercase} as arguments; in some locales, these don't have
  the same length. For case conversions, always use
  \function{lower()} and \function{upper()}.}
\end{funcdesc}
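
The following sketch shows the shape of a translation table. It relies on an
assumption about the reader's interpreter: \code{bytes.maketrans()} is used as
a stand-in because newer Python versions no longer provide
\function{string.maketrans()}. The table layout (256 entries, one per
character value) is the same idea.

```python
# bytes.maketrans() is a stand-in for string.maketrans() (an assumption
# made so the example runs on newer interpreters).
table = bytes.maketrans(b"abc", b"xyz")
translated = b"aabbcc cab".translate(table)

print(len(table))    # 256: one entry per character value
print(translated)    # b'xxyyzz zxy'

# Arguments of different lengths are rejected.
try:
    bytes.maketrans(b"ab", b"x")
    length_checked = False
except ValueError:
    length_checked = True
print(length_checked)  # True
```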

\subsection{Deprecated string functions}

The following functions are also defined as methods of string and
Unicode objects; see ``String Methods'' (section
\ref{string-methods}) for more information on those. You should consider
these functions as deprecated, although they will not be removed until Python
3.0. The functions defined in this module are:

\begin{funcdesc}{atof}{s}
  \deprecated{2.0}{Use the \function{float()} built-in function.}
  Convert a string to a floating point number. The string must have
  the standard syntax for a floating point literal in Python,
  optionally preceded by a sign (\samp{+} or \samp{-}). Note that
  this behaves identically to the built-in function
  \function{float()}\bifuncindex{float} when passed a string.

  \note{When passing in a string, values for NaN\index{NaN}
  and Infinity\index{Infinity} may be returned, depending on the
  underlying C library. The specific set of strings accepted which
  cause these values to be returned depends entirely on the C library
  and is known to vary.}
\end{funcdesc}

\begin{funcdesc}{atoi}{s\optional{, base}}
  \deprecated{2.0}{Use the \function{int()} built-in function.}
  Convert string \var{s} to an integer in the given \var{base}. The
  string must consist of one or more digits, optionally preceded by a
  sign (\samp{+} or \samp{-}). The \var{base} defaults to 10. If it
  is 0, a default base is chosen depending on the leading characters
  of the string (after stripping the sign): \samp{0x} or \samp{0X}
  means 16, \samp{0} means 8, anything else means 10. If \var{base}
  is 16, a leading \samp{0x} or \samp{0X} is always accepted, though
  not required. This behaves identically to the built-in function
  \function{int()} when passed a string. (Also note: for a more
  flexible interpretation of numeric literals, use the built-in
  function \function{eval()}\bifuncindex{eval}.)
\end{funcdesc}
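
Since the recommended replacement is the built-in \function{int()}, the base
rules can be tried out with it directly. This is a small sketch; the wrapper
name \code{parse_int} is hypothetical. The octal case is left out on purpose,
because newer interpreters spell octal prefixes differently from the rule
given above.

```python
def parse_int(text, base=10):
    """Thin wrapper around the built-in int(); the name is hypothetical."""
    return int(text, base)

print(parse_int("-42"))        # -42: an optional sign is accepted
print(parse_int("0x1f", 0))    # 31: base 0 infers hexadecimal from the prefix
print(parse_int("0X1F", 16))   # 31: the prefix is accepted with base 16
print(parse_int("1f", 16))     # 31: the prefix is not required
```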

\begin{funcdesc}{atol}{s\optional{, base}}
  \deprecated{2.0}{Use the \function{long()} built-in function.}
  Convert string \var{s} to a long integer in the given \var{base}.
  The string must consist of one or more digits, optionally preceded
  by a sign (\samp{+} or \samp{-}). The \var{base} argument has the
  same meaning as for \function{atoi()}. A trailing \samp{l} or
  \samp{L} is not allowed, except if the base is 0. Note that when
  invoked without \var{base} or with \var{base} set to 10, this
  behaves identically to the built-in function
  \function{long()}\bifuncindex{long} when passed a string.
\end{funcdesc}

\begin{funcdesc}{capitalize}{word}
  Return a copy of \var{word} with only its first character capitalized.
\end{funcdesc}

\begin{funcdesc}{expandtabs}{s\optional{, tabsize}}
  Expand tabs in a string, i.e.\ replace them by one or more spaces,
  depending on the current column and the given tab size. The column
  number is reset to zero after each newline occurring in the string.
  This doesn't understand other non-printing characters or escape
  sequences. The tab size defaults to 8.
\end{funcdesc}
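
A short sketch of the column arithmetic, using the equivalent string method
(on the assumption that it behaves like the function, as this section states).
The variable names are illustrative only.

```python
# Default tab size of 8: 'a' sits in column 0, so the tab pads to column 8.
default_tabs = "a\tb".expandtabs()

# Tab size 4, with the column count restarting after the newline.
narrow_tabs = "ab\tc\nd\te".expandtabs(4)

print(repr(default_tabs))   # 'a       b'
print(repr(narrow_tabs))    # 'ab  c\nd   e'
```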

\begin{funcdesc}{find}{s, sub\optional{, start\optional{, end}}}
  Return the lowest index in \var{s} where the substring \var{sub} is
  found such that \var{sub} is wholly contained in
  \code{\var{s}[\var{start}:\var{end}]}. Return \code{-1} on failure.
  Defaults for \var{start} and \var{end} and interpretation of
  negative values are the same as for slices.
\end{funcdesc}

\begin{funcdesc}{rfind}{s, sub\optional{, start\optional{, end}}}
  Like \function{find()} but find the highest index.
\end{funcdesc}

\begin{funcdesc}{index}{s, sub\optional{, start\optional{, end}}}
  Like \function{find()} but raise \exception{ValueError} when the
  substring is not found.
\end{funcdesc}

\begin{funcdesc}{rindex}{s, sub\optional{, start\optional{, end}}}
  Like \function{rfind()} but raise \exception{ValueError} when the
  substring is not found.
\end{funcdesc}
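
The difference between the four search functions is easiest to see side by
side. The sketch below uses the equivalent string methods, assuming they
behave like the functions described here.

```python
s = "hello world"

first = s.find("o")       # lowest index of "o"
last = s.rfind("o")       # highest index of "o"
later = s.find("o", 5)    # search restricted to s[5:]
missing = s.find("z")     # failure is reported as -1

# index() and rindex() report failure with an exception instead.
try:
    s.index("z")
    raised = False
except ValueError:
    raised = True

print(first, last, later, missing, raised)   # 4 7 7 -1 True
```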

\begin{funcdesc}{count}{s, sub\optional{, start\optional{, end}}}
  Return the number of (non-overlapping) occurrences of substring
  \var{sub} in string \code{\var{s}[\var{start}:\var{end}]}.
  Defaults for \var{start} and \var{end} and interpretation of
  negative values are the same as for slices.
\end{funcdesc}
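
``Non-overlapping'' means that a character consumed by one match cannot start
another. A brief sketch with the equivalent string method:

```python
# "ana" occurs at index 1 and index 3 of "banana", but the two overlap,
# so only one occurrence is counted.
overlapping = "banana".count("ana")

# Two disjoint copies of "aa" fit in "aaaa".
disjoint = "aaaa".count("aa")

# Counting only within "banana"[2:], i.e. "nana".
sliced = "banana".count("a", 2)

print(overlapping, disjoint, sliced)   # 1 2 2
```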

\begin{funcdesc}{lower}{s}
  Return a copy of \var{s}, but with upper case letters converted to
  lower case.
\end{funcdesc}

\begin{funcdesc}{split}{s\optional{, sep\optional{, maxsplit}}}
  Return a list of the words of the string \var{s}. If the optional
  second argument \var{sep} is absent or \code{None}, the words are
  separated by arbitrary strings of whitespace characters (space, tab,
  newline, return, formfeed). If the second argument \var{sep} is
  present and not \code{None}, it specifies a string to be used as the
  word separator. The returned list will then have one more item
  than the number of non-overlapping occurrences of the separator in
  the string. The optional third argument \var{maxsplit} defaults to
  0. If it is nonzero, at most \var{maxsplit} splits occur,
  and the remainder of the string is returned as the final element of
  the list (thus, the list will have at most \code{\var{maxsplit}+1}
  elements).

  The behavior of split on an empty string depends on the value of \var{sep}.
  If \var{sep} is not specified, or specified as \code{None}, the result will
  be an empty list. If \var{sep} is specified as any string, the result will
  be a list containing one element which is an empty string.
\end{funcdesc}
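
The cases described for \function{split()} can be compared in a few lines.
This sketch uses the equivalent string method and only passes a positive
\var{maxsplit}, since the meaning of the default differs between interpreter
versions.

```python
# Whitespace splitting: runs of whitespace collapse, ends are ignored.
words = "  a  b\tc ".split()

# Explicit separator: every occurrence splits, so empty fields survive.
fields = "a,b,,c".split(",")

# Positive maxsplit: the remainder stays in the final element.
limited = "a b c d".split(None, 2)

# Empty input: the result depends on whether sep is given.
empty_default = "".split()
empty_sep = "".split(",")

print(words)          # ['a', 'b', 'c']
print(fields)         # ['a', 'b', '', 'c']
print(limited)        # ['a', 'b', 'c d']
print(empty_default)  # []
print(empty_sep)      # ['']
```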

\begin{funcdesc}{rsplit}{s\optional{, sep\optional{, maxsplit}}}
  Return a list of the words of the string \var{s}, scanning \var{s}
  from the end. To all intents and purposes, the resulting list of
  words is the same as returned by \function{split()}, except when the
  optional third argument \var{maxsplit} is explicitly specified and
  nonzero. When \var{maxsplit} is nonzero, at most \var{maxsplit}
  splits -- the \emph{rightmost} ones -- occur, and the remainder
  of the string is returned as the first element of the list (thus, the
  list will have at most \code{\var{maxsplit}+1} elements).
  \versionadded{2.4}
\end{funcdesc}
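
A minimal sketch of how the rightmost splits differ from the leftmost ones,
using the equivalent string methods:

```python
text = "a b c d"

from_left = text.split(None, 2)    # remainder ends up last
from_right = text.rsplit(None, 2)  # remainder ends up first

# A common use: separate the final path component from the rest.
head, tail = "/usr/local/lib".rsplit("/", 1)

print(from_left)    # ['a', 'b', 'c d']
print(from_right)   # ['a b', 'c', 'd']
print(head, tail)   # /usr/local lib
```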

\begin{funcdesc}{splitfields}{s\optional{, sep\optional{, maxsplit}}}
  This function behaves identically to \function{split()}. (In the
  past, \function{split()} was only used with one argument, while
  \function{splitfields()} was only used with two arguments.)
\end{funcdesc}

\begin{funcdesc}{join}{words\optional{, sep}}
  Concatenate a list or tuple of words with intervening occurrences of
  \var{sep}. The default value for \var{sep} is a single space
  character. It is always true that
  \samp{string.join(string.split(\var{s}, \var{sep}), \var{sep})}
  equals \var{s}.
\end{funcdesc}
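
The round-trip property holds when the same explicit separator is used for
both steps. The sketch below checks it with the equivalent string methods;
\code{round_trips} is a hypothetical helper name.

```python
def round_trips(s, sep):
    """Check that joining the split pieces with sep reproduces s."""
    return sep.join(s.split(sep)) == s

print(round_trips("a,b,,c", ","))          # True
print(round_trips(",lead,trail,", ","))    # True

# Whitespace splitting without a separator is not reversible in general,
# because runs of whitespace collapse.
collapsed = " ".join("a  b".split())
print(repr(collapsed))                     # 'a b'
```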

\begin{funcdesc}{joinfields}{words\optional{, sep}}
  This function behaves identically to \function{join()}. (In the past,
  \function{join()} was only used with one argument, while
  \function{joinfields()} was only used with two arguments.)
  Note that there is no \method{joinfields()} method on string
  objects; use the \method{join()} method instead.
\end{funcdesc}

\begin{funcdesc}{lstrip}{s\optional{, chars}}
  Return a copy of the string with leading characters removed. If
  \var{chars} is omitted or \code{None}, whitespace characters are
  removed. If given and not \code{None}, \var{chars} must be a string;
  the characters in the string will be stripped from the beginning of
  \var{s}.
  \versionchanged[The \var{chars} parameter was added. The \var{chars}
  parameter cannot be passed in earlier 2.2 versions]{2.2.3}
\end{funcdesc}

\begin{funcdesc}{rstrip}{s\optional{, chars}}
  Return a copy of the string with trailing characters removed. If
  \var{chars} is omitted or \code{None}, whitespace characters are
  removed. If given and not \code{None}, \var{chars} must be a string;
  the characters in the string will be stripped from the end of
  \var{s}.
  \versionchanged[The \var{chars} parameter was added. The \var{chars}
  parameter cannot be passed in earlier 2.2 versions]{2.2.3}
\end{funcdesc}

\begin{funcdesc}{strip}{s\optional{, chars}}
  Return a copy of the string with leading and trailing characters
  removed. If \var{chars} is omitted or \code{None}, whitespace
  characters are removed. If given and not \code{None}, \var{chars}
  must be a string; the characters in the string will be stripped from
  both ends of \var{s}.
  \versionchanged[The \var{chars} parameter was added. The \var{chars}
  parameter cannot be passed in earlier 2.2 versions]{2.2.3}
\end{funcdesc}
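
Note that \var{chars} is treated as a set of characters, not as a prefix or
suffix. A brief sketch with the equivalent string methods:

```python
plain = "  hi \n".strip()        # whitespace removed from both ends
left = "xxhixx".lstrip("x")      # only the leading x characters go
right = "xxhixx".rstrip("x")     # only the trailing x characters go

# Every character listed in chars is stripped, in any order, until a
# character outside the set is reached on each side.
host = "www.example.com".strip("cmowz.")

print(repr(plain), left, right, host)   # 'hi' hixx xxhi example
```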

\begin{funcdesc}{swapcase}{s}
  Return a copy of \var{s}, but with lower case letters
  converted to upper case and vice versa.
\end{funcdesc}

\begin{funcdesc}{translate}{s, table\optional{, deletechars}}
  Delete all characters from \var{s} that are in \var{deletechars} (if
  present), and then translate the characters using \var{table}, which
  must be a 256-character string giving the translation for each
  character value, indexed by its ordinal.
\end{funcdesc}
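
The order of the two steps (delete first, then translate) can be observed
directly. As an assumption made so the example runs on newer interpreters,
the sketch uses \code{bytes.translate()} and \code{bytes.maketrans()}, which
keep the 256-entry table and the delete argument.

```python
# Stand-in for string.translate(s, table, deletechars); see the lead-in.
table = bytes.maketrans(b"lo", b"01")

mapped = b"hello world".translate(table)          # translate only
trimmed = b"hello world".translate(table, b"h")   # delete "h", then translate

# Deleted characters never reach the table: "l" is removed before the
# l -> L mapping could apply.
upper_l = bytes.maketrans(b"l", b"L")
deleted_first = b"hello".translate(upper_l, b"l")

print(mapped)          # b'he001 w1r0d'
print(trimmed)         # b'e001 w1r0d'
print(deleted_first)   # b'heo'
```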

\begin{funcdesc}{upper}{s}
  Return a copy of \var{s}, but with lower case letters converted to
  upper case.
\end{funcdesc}

\begin{funcdesc}{ljust}{s, width}
\funcline{rjust}{s, width}
\funcline{center}{s, width}
  These functions respectively left-justify, right-justify and center
  a string in a field of given width. They return a string that is at
  least \var{width} characters wide, created by padding the string
  \var{s} with spaces until the given width on the right, left or both
  sides. The string is never truncated.
\end{funcdesc}
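
A small sketch of the three padding directions, using the equivalent string
methods:

```python
left = "ab".ljust(5)         # padding added on the right
right = "ab".rjust(5)        # padding added on the left
middle = "ab".center(6)      # padding split across both sides
untouched = "abcdef".rjust(3)  # longer than width: returned unchanged

print(repr(left))       # 'ab   '
print(repr(right))      # '   ab'
print(repr(middle))     # '  ab  '
print(repr(untouched))  # 'abcdef'
```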

\begin{funcdesc}{zfill}{s, width}
  Pad a numeric string on the left with zero digits until the given
  width is reached. Strings starting with a sign are handled
  correctly.
\end{funcdesc}
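
``Handled correctly'' means the zeros are placed after the sign rather than
before it. A minimal sketch with the equivalent string method:

```python
unsigned = "42".zfill(5)
negative = "-42".zfill(5)     # the sign stays in front of the zeros
positive = "+3.5".zfill(6)
too_long = "12345".zfill(3)   # already wider than width: unchanged

print(unsigned, negative, positive, too_long)   # 00042 -0042 +003.5 12345
```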

\begin{funcdesc}{replace}{str, old, new\optional{, maxreplace}}
  Return a copy of string \var{str} with all occurrences of substring
  \var{old} replaced by \var{new}. If the optional argument
  \var{maxreplace} is given, the first \var{maxreplace} occurrences are
  replaced.
\end{funcdesc}
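
A brief sketch of the effect of \var{maxreplace}, using the equivalent string
method:

```python
text = "a-b-c-d"

everything = text.replace("-", "+")     # all occurrences
first_two = text.replace("-", "+", 2)   # only the first two occurrences

print(everything)   # a+b+c+d
print(first_two)    # a+b+c-d
```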